An interview with Professor Roger Zimmermann

Roger at the start of his career.

Please describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

I have been interested in technology from early on, though my path to becoming an academic has not been very direct. In high school, I really enjoyed tinkering with electronics, taking radios apart, and learning about digital circuits. My goal was to work in this field, and after high school, I did an apprenticeship with Brown, Boveri & Cie. (BBC), which sometime later became Asea Brown Boveri (ABB). The apprentices were assigned to different company locations, and I was lucky enough to be sent to BBC’s Forschungszentrum (Research Center). The labs, the researchers, and the cutting-edge equipment and projects there left a deep impression on me. Beyond electronics, I really liked microprocessors, computers and how they could be flexibly programmed with software. I decided that I wanted to pursue further studies and I subsequently enrolled in the Höhere Technische Lehranstalt (HTL) Brugg-Windisch in their Informatik program (the HTL program has since changed and the building where I studied is now part of the campus Windisch of the Fachhochschule Nordwestschweiz). With my fresh HTL degree in hand, I started to work for an engineering company and over the next years, I got the chance to work on some fascinating projects. After five years, I got an itch to study for a Master’s degree and I ended up in California. One of the professors (who became my advisor) encouraged me to go for a Ph.D., and I took him up on his offer to support me. His group worked at the intersection of databases and multimedia. It really fascinated me and we ended up building one of the early streaming media servers. What I still find fascinating about multimedia today is how it brings together many fundamental computer science areas such as networking, graphics, operating system support, signal processing, etc. I also like that multimedia is used by people to express their creativity, humanity and artistic aspirations – it is not only about technology.

My personal lessons looking back are that sometimes you may not know where your journey will take you, but make sure you enjoy and learn from the path to get there.

Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

I currently work broadly in two areas, namely streaming media systems and data analytics. At this point, one of the main enjoyments I get is from working with my research group and colleagues from around the world. On the technical side, it is fun when somebody is actually using what we develop. On the human side of things, it is great to see my students and former students doing well in various parts of the globe.

Can you profile your current research, its challenges, opportunities, and implications?

In my research group, I have two main themes: media systems and multimedia data analytics. In the first cluster, we look at media streaming on the Internet. The main technology in use today is Dynamic Adaptive Streaming over HTTP, also called DASH. Some interesting challenges are in the area of enabling very low latency in live streaming, which is of interest to many large Internet companies. Going forward, I see 5G networks as an interesting challenge. Most people are excited about the very high bandwidth that 5G can offer (in the best case), but I believe one of the major challenges will be the very high variability of 5G networks when a device is moving. On the multimedia, and especially spatial, data analytics side, I am part of a new lab between NUS and the ridesharing company Grab. A tremendous amount of data is generated (e.g., GPS trajectories), which allows novel data-driven applications such as generating accurate road maps in regions where this information is not readily available, or inferring semantic attributes of roads (e.g., no right turn allowed). The fusion of multiple data types such as trajectories, images, maps, etc., will allow for some exciting new applications.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

One of the areas where my group made innovative contributions was georeferenced mobile video: combining videos with their geo-spatial properties led to a lot of interesting developments. We started on this at just about the time the first iPhone came out, and the idea of utilizing all the sensors in a phone in combination with its video was really novel. Nowadays, sensor fusion is common and is used in many machine-learning applications, and I am sure there will be even greater breakthroughs in the future. Another area where I have been working for decades is media streaming, and this whole industry has changed from proprietary networks to the Internet. There have been many people working in this area, but I believe that our own contributions have helped to transform this field.

Over your distinguished career, what are the top lessons you want to share with the audience?

My path to becoming an academic has not been as direct as for some other people. But one of the key things that I have enjoyed along the way was to work with many outstandingly talented and bright people from all around the world. I hope that humanity will keep working together based on facts and science to solve some of the big challenges that are coming our way.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

One issue that concerns me is the apparent trend to not trust facts anymore. So a possible question could be: Do you see a danger when people easily distribute and believe in “alternate facts”?

My answer would be, I definitely see this as a considerable concern in the future. While there may be some technical solutions to combat fake news, etc., it is also increasingly important that people are well educated and think critically, especially in a world where fake information may look very persuasive.

 

What is the best joke you know?

I like many of the weird, but strangely funny comments on life and baseball from Yogi Berra. He was born Lawrence Peter Berra and was a US baseball legend. Two examples:

“When you come to a fork in the road, take it.”

“You should always go to other people’s funerals. Otherwise, they won’t come to yours.”


A current image of Roger.

Short bio:

Roger Zimmermann is an Associate Professor at the School of Computing at the National University of Singapore (NUS). He is also Deputy Director with the Smart Systems Institute (SSI) at NUS. From 2010 to 2016 he co-directed the Centre of Social Media Innovations for Communities (COSMIC), a research institute funded by the National Research Foundation (NRF) of Singapore. Prior to joining NUS he held the positions of Research Area Director with the Integrated Media Systems Center (IMSC) and Research Assistant Professor at the University of Southern California (USC). He earned his M.S. and Ph.D. degrees from the Viterbi School of Engineering at the University of Southern California.

An interview with Géraldine Morin

Please describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

My journey into research was not such a linear path (or ’straight path’ as some French institutions put it, a criterion for them to hire)… I started out convinced that I wanted to be a high school math teacher. Since I was admitted to a Math and CS engineering school after a competitive exam, I agreed to study there, working in parallel towards a pure math degree.
The first year, I did manage to follow both curricula (taking two math exams in September), but it was quite a challenge, and in the second year I gave up on the math degree to keep following the engineering curriculum.
I finished with a Master’s degree in applied Math (back then fully included in the engineering curriculum) and really enjoyed working on the Master’s thesis (I did my internship in Kaiserslautern, Germany), so I decided to apply for a Ph.D. grant.
I made it into the Ph.D. program in Grenoble and liked my Ph.D. topic in geometric modelling, but had a hard time with my advisor there.
So after two years I decided to give up (and passed a motorcycle driving licence), and went on to teach Math in high school for a year (I also passed the teacher examination). Encouraged by my former German Master’s thesis advisor, I then applied for a Ph.D. program at Rice University in the US to work with Ron Goldman, a researcher whose work and papers I really liked. I got the position and really enjoyed doing research there.
After a wedding, a kid, and finishing the Ph.D. (in that order), I moved to Germany to live with my husband and found a Postdoc position in Berlin for one year. I then applied to Toulouse, where I have stayed since. In Toulouse, I was hired in a Computer Vision research group, where a subgroup of people were tackling problems in multimedia, and they offered me the chance to be the 3D person of their team 🙂

I learned that a career, or research path, is really shaped by the people you meet on your way, for good or bad. Perseverance for something you enjoy is certainly necessary, and not staying in a context that does not fit you is also important! I am glad I started again after giving up at first, but I do not regret my choice to give up either.

Research topics and research areas are important, and a good match with your close collaborators is also very relevant to me. I really enjoy the multimedia community for that matter. The people are open-minded and curious, and very encouraging… At multimedia conferences I always feel that my research is valued and relevant to the field (in the other communities, CG or CV, I sometimes get a remark like, ‘oh well, I guess you are not really doing C{G|V}’ …). Multimedia also has a good balance between theory and practice, and that’s fun!

Visit in Chicago during my Ph.D. in the US.

Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

I have just taken on the responsibility of a department, while we are changing the curricula. This is a lot of organisation and administrative work, but it also forces me to have a larger vision of how the field of computer science is evolving and what is important to teach. Interestingly, we prepare our students for jobs that do not exist yet! This new challenge also makes me realise how important it is to keep time for research, and for the open-mindedness I get from my research activity.

Can you profile your current research, its challenges, opportunities, and implications?

As I mentioned before, my current challenge is to be able to stay active in research. I follow two paths. The first is in geometric modeling, trying to bridge the gap between my current interest in skeleton-based models and two hot topics, 3D printing and machine learning.
The second is to continue working in multimedia, on distributing 3D content in a scalable way.
Concerning my involvement, I am also currently co-heading the French geometric modeling group, and I very much enjoy promoting our research community and helping to keep it active and recognised.

How would you describe the role of women especially in the field of multimedia?

I recently participated in my first Women in MM meeting at ACM, and very much appreciated it. I have to admit I was not really interested in women-targeted activities before I participated in my first women’s workshop (WiSH – Women in SHape) in 2013, which brought groups of women together to collaborate for one week. That was a great experience: it made me realise that, although I really enjoy working with my (almost all male) colleagues, it is also fun and very inspiring to work with groups of women. Moreover, having been asked by younger colleagues whether a woman can have both a family and a faculty job, I now think that my good experience as a faculty member and mother of three should be shared when needed.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

My first contributions were in a quite theoretical field: during my Ph.D. I proposed to use analytic functions in a geometric modeling context. That raised some convergence questions, which I managed to settle with proofs.
Later, I really enjoyed working with collaborators. Proposing a shared topic with my colleague Romulus, who worked on streaming, we started in 2006 to work on 3D streaming. That led us to collaborate with Wei Tsang Ooi from the National University of Singapore, and for more than 12 years now we have been advancing innovative solutions for the distribution of 3D content, with me working on adapted 3D models and them on system solutions, involving new colleagues along the way. We also won the best paper award for my Ph.D. student’s paper at ACM MM in 2008 (I am very proud of that, despite the fact that I could not attend the conference: I gave birth between submission and conference ;).

Over your distinguished career, what are your top lessons you want to share with the audience?

A very simple one: enjoy what you do, and work will be fun!
For me, I am amazed that thinking over new ideas always remains so exciting 🙂

What is the best joke you know? 🙂

Hard one!

Jogging in the morning to N Seoul Tower for sunrise, ACM-MM 2018.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

I have heard there are very detailed studies, especially in the US, about differences between male and female behaviour.
It seems that being aware of these helps. For example, women tend to judge themselves more harshly than men do…
(that’s not really a question and answer, more a remark :p )

Another try:
Q: What makes you feel confident and helps you get over challenges?
A: I think I lack self-confidence, and I always ask for a lot of feedback from colleagues (for example, for dry runs).
If I get good feedback, it boosts my confidence; if I get worse feedback, it helps me improve… I win both ways 🙂

 


Bios

Assoc. Prof. Géraldine Morin: 

I am an Associate Professor (Maître de conférences) at ENSEEIHT, one of the schools of the Institut National Polytechnique de Toulouse within the Université de Toulouse, and I carry out my research at IRIT (UMR CNRS 5505). Before settling in Toulouse, I lived in Grenoble, where I graduated from ENSIMAG (engineering degree) and from Université Joseph Fourier (D.E.A. in applied mathematics), and also earned a bachelor’s degree (licence) in pure mathematics, which I pursued in parallel with my first year of engineering school. I then completed a Ph.D. in geometric modeling in the United States at Rice University (“Analytic Functions for Computer Aided Geometric Design”) under the supervision of Ron Goldman. Afterwards, I did a one-year postdoc in computational geometry at the Freie Universität Berlin.

An interview with Assoc. Prof. Ragnhild Eg

Please describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

In high school, I really had no idea what I wanted to study in university. I liked writing, so I first tried out journalism. I soon discovered that I was too timid for this line of work, and the writing was less creative than I had imagined. So I returned to my favourite subject, psychology. I have always been fascinated by how the human mind works, how we can process all the information that surrounds us – and act on it. This fascination led me from a Bachelor in Australia, back to Norway where I started a Master in cognitive and biological psychology. One of my professors (whom I was lucky to have as a supervisor later) was working on a project on speech perception, and I still remember the first example she used to demonstrate how what we see can alter what we hear. I am delighted that I still encounter new examples of how multi-sensory processes can trick us. Most of all, I am interested in how these complex processes happen naturally, beyond our consciousness. And that is also what interests me in multimedia: how is it that we perceive information conveyed by digital systems in much the same way we perceive information from the physical world? And when we do not perceive it in the same way, what is causing the discrepancy?

My personal lessons are not to let a chosen path lead you in a direction you do not want to go. Moreover, not all of us are driven by a grand master plan. I am very much driven by impulses and curiosity, and this has led me to a line of work where curiosity is an asset.

Ragnhild Eg at the beginning of her research career in 2011.

Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

I currently work at a university college, where I have the opportunity to combine two passions: teaching and research. I wish to continue with both, so my vision relates to my research progression. My objective is pretty basic, I wish to broaden the scope of my research to include more perspectives on human perception. To do that, I want to start with new collaborations that can lead to long-term projects. As mentioned, I often let curiosity guide me, and I do not intend to stop doing just that.

Can you profile your current research, its challenges, opportunities, and implications?

In later years, my research scope has extended from perception of multimedia content to human-computer interactions, and further on to individual factors. Although we investigate perceptual processes in the context of computer systems’ limitations, our original approach was to generalise across a population. Yet, the question of how universal perceptual processes can differ so much between individuals has become more and more intriguing.

How would you describe the role of women especially in the field of multimedia?

I have a love-hate relationship when it comes to stereotypes. Not only are they unavoidable, they are essential for us to process information. Moreover, it can be quite amusing to apply characteristics to stereotypes. On the other hand, stereotypes contribute to preserving, and even strengthening, certain conceptions about individuals. On the topic of women in multimedia, I find it important because we are a minority and I believe any community benefits from diversity. However, I find it difficult to describe our role without falling back on stereotypical gender traits.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

The path that led me to multimedia research started with my studies in psychology, so I came into the field with a different outlook. I use my theoretical knowledge about human cognition and perception, and my experience with psychological research methods, to tackle multimedia challenges. For instance, I design behavioural studies with experimental controls and validity checks. Though perhaps not innovative, my first approach to studying the perception of multimedia quality was to avoid addressing quality directly, and rather control it as an experimental factor. Instead, I explored variations in perceptual integration across different quality levels. Interestingly, I see more and more knowledge introduced from psychology and neuroscience to multimedia research. I regard these cross-overs as an indication that multimedia research has come to be an established field with versatile research methods, and I look forward to seeing what insights come out of it.

Over your distinguished career, what are your top lessons you want to share with the audience?

When I started my PhD, I came into a research environment dominated by computer science. The transition went far smoother than I had imagined, mostly due to open-minded and welcoming colleagues. Yet, working with inter-disciplinary research will lead to encounters where you do not understand the contributions of others, and they may not understand yours. Have respect for the knowledge and expertise others bring with them, and expect the same respect for your own strengths. This type of collaboration can be demanding, but can also bring about the most interesting questions and results.

Another lesson I want to share is perhaps one that can only come through personal experience. I enjoy collaborating on research projects, but being a researcher also requires a great deal of autonomy. Only at the end of the first year did I realise that no one could tell me what the focus of my PhD should be, even though I was expected to contribute to a larger project. Research is not constrained by clear boundaries, and I believe a researcher must be able to apply their own curiosity even when external forces seem to enforce limits.

Ragnhild Eg in 2018.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

I would ask what is the best joke you know! And my answer would undoubtedly be a knock-knock joke. 
Editor’s note: Officially added to the standard questionnaire!

What is the best joke you know? 🙂

Knock knock

– Who’s there?

A little old lady

– A little old lady who?

Wow, I had no idea you could yodel! 


Bios

Assoc. Prof. Ragnhild Eg: 

Ragnhild Eg is an associate professor at Kristiania University College, where she combines her background and interests in psychology with research and education. She teaches psychology and ethics, and pursues research interests ranging from perception and the effects of technological constraints to the consequences of online media consumption.

Michael Alexander Riegler: 

Michael is a scientific researcher at Simula Research Laboratory. His research interests are medical multimedia data analysis and understanding, image processing, image retrieval, parallel processing, crowdsourcing, social computing and user intent. 

An interview with Miriam Redi

Miriam at the beginning of her research career.

Describe your journey into computing from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

I literally grew up with computers all around me. I was born in a little town raised around the headquarters of Olivetti, one of the biggest tech companies of the last century: becoming a computer geek, in that place, at that time, was easier than usual! I have always been fascinated by the power of visuals and music to convey ideas. I loved to learn about history and the world through songs and movies. How to merge my love for computers with my passion for the audiovisual arts? I enrolled in Media Engineering studies, where, aside from the traditional Computer Engineering knowledge, I had the chance to learn more about media history and design. The main message? Multidisciplinarity is key. We cannot design intelligent multimedia technologies without deeply understanding how a medium is created, perceived and distributed.

Talking about multidisciplinarity, what do you think is its current state in the multimedia community?

My impression is that, due to the inherent multimodality of our research, our community has developed a natural ability to blend techniques and theories from various domains. I believe we can push the boundaries of this multidisciplinarity even further. I am thinking, for example, of the MM subcommunity interested in mining subjective attributes from data, such as mood, sentiment, or beauty. I believe such research works could benefit enormously from collaboration between MM scientists and domain experts in psychology, cognitive science, visual perception, or visual arts.

Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

My dream is to make multimedia science even more useful for society and for collective growth. Multimedia data allows us to easily absorb and communicate knowledge, without language barriers. Producing and generating audiovisual content has never been easier: today, the potential of multimedia for learning and sharing human knowledge is unprecedented! Intelligent multimedia systems could be put in place to support editor communities in making free online encyclopedias like Wikipedia or collaborative knowledge bases like Wikidata more “visual” – and therefore less tied to individual languages. By doing so, we could increase the possibility for people around the world to freely access the sum of all knowledge.

I like your approach about making something useful for society. What do you think about the criticism that multimedia research is too applied?

For me, high-quality research means creative research, where ‘creative’ means ‘new and valuable’. The coexistence of breadth and depth in Multimedia allows us to create novel and useful applied research works, thus making these, to me, as interesting and inspiring as more theoretical research works.

Can you profile your current research, its challenges, opportunities, and implications?

I work on responsible multimedia algorithms. I love building machines that can classify audiovisual and textual data according to subjective properties – for example, the informativeness of an image with respect to a topic, its epistemic value, the beauty of a photo, the creative degree of a video. Given the inherently subjective nature of these algorithms, one of the main challenges of my research is to make such models responsible, namely:
1) Diversity-Aware i.e. reflecting the real subjective perception of people with different cultural backgrounds; this is key to empower specific cultures, designing AI to grow diversified content and fill the knowledge gaps in online knowledge repositories.
2) Interpretable and Unbiased, namely not only able to classify content, but also able to say why the content was classified in a certain way (so that we can detect algorithmic bias). Such powerful algorithms can be used to study the visual preferences of users of web and social media platforms, and retrieve interesting content accordingly.

Do you think that one day we will have algorithms that truly understand human perception of beauty and art? Or will it always be dependent on the data?

Philosophers have been trying for centuries to understand the true nature of aesthetic perception. In general, I do not believe in absolute truths. And I am not really confident that algorithms will be able to become great philosophers anytime soon.

How would you describe the role of women especially in the field of multimedia?

The role of women in multimedia is the role of any researcher in their scientific community: contribute to scientific development, push the boundaries of what is known, doubt the widely accepted notions, make this world a better place (no pressure!). Maintaining diversity (any kind of diversity – including gender, expertise, race, age) in the scientific discourse is crucial: as opposed to a single mono-culture, a diverse community gathers, elaborates and combines different perspectives, thus forcing a collective creative process of exchange and growth, which is essential to scientific development.

Do you think that female researchers are well represented in the multimedia community? For example, there was no female keynote speaker at ACM MM 2017.

I am not sure about the numbers, so I can’t say for sure the percentage of women and non-binary gender persons in the multimedia community. But I am sure that percentage is greater than 0. When filling positions of high visibility such as keynotes or committee members, we should always keep in mind that one of our tasks is to inspire younger generations. Generations of young, brilliant, beautifully diverse researchers.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and in the future?

Since my early days in multimedia, when we were retrieving video shots of airplanes, until today, when we classify creative videos or interesting pictures, I would say that the main contribution of my research has been to “break the boundaries”.
We broke the scientific field boundaries. We designed multimedia algorithms inspired by the visual arts and psychology; we collaborated with experts from philosophy, media history, sociology; and we could deliver creative, interdisciplinary research works which would contribute to the advancement of multimedia and all the fields involved.

We broke the social network boundaries: with models able to quantify the intrinsic quality of images in a photo sharing platform. Furthermore, we showed that popularity-driven mechanisms, typical of social networks, fail to promote high-quality content, and that only content-based quality assessment tools could restore meritocracy in online media platforms.

We broke the cultural boundaries: together with an amazing multi-cultural research team, we were able to design computer vision models that can adapt to different cultures and language communities. While the effectiveness of our approaches and the scientific growth are per se a main achievement, the publications resulting from this collaborative effort reached the top-level Computer Vision, Multimedia and Social Media conferences (with a best paper award at ICWSM and a multimodal best paper award at ICMR) and our work was featured by a number of tech journals and in a TEDx presentation. Together with other scientists, we also started a number of initiatives to gather people from different communities who are interested in this area: a special session at ICMR 2017, a workshop at MM 2017, one at CVPR 2018, and a special issue of ACM TOMM.

What are in your opinion the future topics in multimedia? Where is the community strong, and where could it improve or increase focus?

My feeling is that we should re-discover and empower the ‘multi-’ness of our research field.
I think the beauty of multimedia research is the ability to tell compelling multimodal stories from signals of very diverse nature, with a focus on the positive experience of the user. We are able to process multiple sources of information and use them, for example, to generate multi-sensorial artistic compositions, expose interesting findings about users and their behavior in multiple modalities, or provide tools to explore and align multimodal information, allowing easier knowledge absorption. We should not forget the diversity of modalities we are able to process (e.g. music or social signals, or traditional image data), the types of attributes we can draw from these modalities (e.g. sentiment or appeal, or more binary semantic labels), and the variety of application scenarios we can imagine for our research works (e.g. arts, photography, cooking, or more consolidated use cases, such as image search or retrieval). And we should encourage emerging topics and applications towards these ‘multi-nesses’.
Beyond multidisciplinarity and multiple modalities, I would also hope to see more multi-cultural research works: given the beautifully diverse world we are part of, I believe multimedia research works and applications should model and take into account the multiple points of view, diverse perceptual responses, as well as the cultural and language differences of users around the world.

Miriam nowadays.

Over your distinguished career, what are your top lessons you want to share with the audience?

I am not sure if this is a real lesson; it is more something I deeply believe in. Stereotypes kill ideas. Stereotyping others (colleagues, friends) might make communication, brainstorming, or collective problem solving much harder, because it somehow influences the importance given to other people’s ideas. Also, stereotyping oneself and one’s limits might constrain the possibilities and narrow one’s view of the shapes of possible future paths.

How was it to have a sister working in the same field of research? Is it motivation or pressure? Did you collaborate on some topics?

In one word: inspiring. We never officially collaborated in any research work. Unofficially, we’ve been ‘collaborating’ for 32 years 🙂 (Interview with Judith Redi)

An interview with Prof. Alan Smeaton

A young Alan Smeaton before the start of his career.

Please describe your journey into computing from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

I started a University course in Physics and Mathematics and in order to make up my credits I needed to add another subject so I chose Computer Science, which was then a brand new topic in the Science Faculty.  Maybe it was because the class sizes were small so the attention we got was great, or maybe I was drawn to the topic in some other way but I dropped the Physics and took the Computer Science modules instead and I never looked back.  I was fortunate in that my PhD supervisor was Keith van Rijsbergen who is one of the “fathers” of information retrieval and who had developed the probabilistic model of IR. Having him as my supervisor was the first lucky thing to have happened to me in my research. His approach was to let me make mistakes in my research, to go down cul-de-sacs and discover them myself, and as a result I emerged as a more rounded, streetwise researcher and I’ve tried to use the same philosophy with my own students.  

For many years after completing my PhD I was firmly in the information retrieval area. I hosted the ACM SIGIR Conference in Dublin in the mid 1990s and was Program Co-Chair in 2003, and workshops, tutorials, etc. chair in other years. My second lucky break in my research career happened in 1991 when Donna Harman of NIST asked me if I’d like to join the program committee of a new initiative she was forming called TREC, which was going to look at information retrieval on test collections of documents and queries but in a collaborative, shared framework. I jumped at the opportunity and got really involved in TREC in those early years through the 1990s. In 2001 Donna asked me if I’d chair a new TREC track that she wanted to see happen, doing content analysis and search on digital video, which was then emerging and in which our lab was establishing a reputation for novel research. Two years later that TREC activity had grown so big it was spun off as a separate activity and TRECVid was born, starting formally in 2003 and continuing each year since then. That’s my third lucky break.

Sometime in the early 2000s I went to my first ACM MULTIMEDIA conference because of my leading of TRECVid, and I loved it. The topics, the openness, the collaborations, the workshops, the intersection of disciplines all appealed to me and I don’t think I’ve missed an ACM MULTIMEDIA Conference since then.

Talking about ACM MULTIMEDIA, some criticism emerged this year that there was no female keynote speaker. What do you think about this and how do you see the role of women in research and especially in the field of multimedia?

The first I heard of this was when I saw it on the conference website and that is when I realised it and I don’t agree with it. I will be proposing several initiatives to the Executive Committee of SIGMM to improve the gender balance and diversity in our sponsored conferences, covering invited panel speakers, invited keynote speakers, raising the importance of the women’s lunch event at the ACM MULTIMEDIA conference starting with this year.  I will also propose including a role for a Diversity Chair in some of the SIGMM sponsored events.  I’ve learned a lot in a short period of time from colleagues in ACM SIGCHI whom I reached out to for advice, and I’ve looked at the practices and experiences of conferences like ACM CHI, ACM UIST, and others.  However these are just suggestions at the moment and need to be proposed and approved by the SIGMM Executive so I can’t say much more about them yet, but watch this space.

Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

I hold a variety of roles in my professional work. As a Professor and teacher I am responsible for delivering courses to first year first semester undergraduates, which I love doing because these are the fresh-faced students just arriving at University. I also teach at advanced Masters level and that’s something else I love, albeit with different challenges. As a Board member of the Irish Research Council I help oversee the policies and procedures for the Council’s funding of about 1,400 researchers from all disciplines in Ireland. I’m also on the inaugural Scientific Committee of COST, the EU funding agency which funds networking of researchers across more than 30 EU countries and further afield. Each year COST funds networking activities for over 40,000 researchers across all disciplines, which is a phenomenal number, and my role on the Scientific Committee is to oversee the policies and procedures and help select those areas (called Actions) that get funded.

Apart from my own research team and working with them as part of the Insight Centre for Data Analytics, and the work I do each year in TRECVid, the other major responsibility I have is as Chair of ACM SIGMM, a role I took up in July 2017, just 2 months ago. While I had a vision of what I believed should happen in SIGMM and I wrote some of this in my candidature statement (can be found at the bottom of the interview), since assuming the role and realising what SIGMM is like “from the inside” I am seeing that vision and those objectives evolve as I learn more. Certainly there are some fundamentals like welcoming and supporting early career researchers, broadening our reach to new communities both geographical and in terms of research topics, ensuring our conferences maintain their very high standards, and being open to new initiatives and ideas. These fundamentals will remain as important as ever. We expect to announce a new annual conference in multimedia for Asia shortly and that will be added to the 4 existing annual events we run. In addition I am realising that we need to increase our diversity, gender being one obvious instance of that but there are others. Finally, I think we need to constantly monitor what our identity is as a community of researchers linked by the bond of working in Multimedia. As the area of Multimedia itself evolves, we have to lead and drive that evolution, and change with it.

I know that may seem like a lot of aspiration without much detail but, as I said earlier, that’s because I’ve only been in the role a couple of months and the details of these need to be worked out and agreed with the SIGMM Executive Committee, not just by me alone, and that will happen over the next few months.

Prof. Alan Smeaton in 2017.

That multimedia evolves is an interesting statement. I have often heard people discussing the definition of multimedia research, and their definitions are quite diverse. What is your “current” definition of multimedia research?

The development of every technology has a similar pathway. Multimedia is not a single technology but a constellation of technologies, yet it has the same kind of pathway. It starts from a blue skies idea that somebody has, like “let’s put images and sound on computers”, and then it becomes theoretical research, perhaps involving modelling in some way. That then turns into basic research about the feasibility of the idea and gradually the research gets more and more practical. Somewhere along the way, not necessarily from the outset, applications of the technology are taken into consideration and that is important to sustain the research interest and funding. As applications for the technology start to roll out, this triggers a feedback loop with more and more interest directed back towards the theory and the modelling, improving the initial ideas and taking them further, pushing boundaries of the implementations and making the final applications more compelling, cheaper, faster, greater reach, more impact, etc. Eventually, the technology may get overtaken by some new blue skies idea leading to some new theories and some new feasibilities and practical applications. Technology for personal transport is one such example, with horse-drawn carriages leading to petrol-driven cars and, as we are witnessing, into other forms of autonomous, electric-powered vehicles.

Research into multimedia is in the mid-life stage of the cycle. We’re in that spiral where new foundational ideas, new theories, new models for those theories, new feasibility studies, new applications, and new impacts, are all valid areas to be working in, and so the long answer to your question about my definition of multimedia research is that it is all of the above.

At the conference, people often say that their research has been criticized for being too applied; hearing it from so many, this seems to be a general problem in multimedia. Based on your experience on national and international funding panels, it would be interesting to hear your opinion about this issue and how researchers in the multimedia community could tackle it.

I’ve been there too, so I understand what they are talking about.  Within our field of multimedia we cover a broad church of research topics, application areas, theories and techniques and to say a piece of work is too applied is an inappropriate criterion for it not to be appreciated.  

“Too applied” should not be confused with research impact as research impact is something completely different.  Research impact refers to when our research contributes or generates some benefit outside of academic or research circles and starts to influence the economy or society or culture. That’s something we should all aspire to as members of our society and when it happens it is great. Yet not all research ideas will develop into technologies or implementations that have impact.  Funding agencies right across the world now like to include impact as part of their evaluation and assessment and researchers are now expected to include impact assessment as part of funding proposals.

I do have concerns that for really blue skies research the eventual impact cannot really be estimated. This is what we call high risk / high return and while some funding agencies like the European Research Council actively promote such high risk exploratory work, other agencies tend to go for the safer bet. Happily, we’re seeing more and more of the blue skies funding like the Australian Research Council’s and the Irish Research Council’s Laureate schemes.

Can you profile your current research, its challenges, opportunities, and implications?

This is a difficult question for me to answer since the single most dominant characteristics of my research are that it is hugely varied and it is based on a large number of collaborations with researchers in diverse areas. I am not a solo researcher and while I respect and admire those who are, I am at the opposite end of that spectrum. I work with people.

For example, today, as I write this interview, has been a busy day for me in terms of research. I’ve done a bit of writing on a grant proposal I’m working on which proposes using data from a wearable electromyography sensor, coupled with other sensors, in determining the quality of a surgical procedure. I’ve reviewed a report from a project I’m part of which uses low-grade virtual reality in a care home for people with dementia. I’ve looked at some of the sample data we’ve just got where we’re applying our people-counting work to drone footage of crowds. I wrote a section of a paper describing our work on human-in-the-loop evaluation of video captioning and I met a Masters student who is doing work on propensity modelling for a large bank, and now at the end of the day I’m finishing this interview. That’s an atypical day for me but the range of topics is not unusual.

What are the challenges and opportunities in this … well, it is never difficult to get motivated because the variety of work makes it so interesting, so the challenge is in managing the projects so that they each get a decent slice of time and effort. Prioritisation of work tasks is a life skill which is best learned the hard way. It is something we can’t teach, and while to some people it comes naturally, for most of us it is something we need to be aware of. So if I have a takeaway message for the young researcher it is this … always try to make your work interesting and to explore interesting things because then it is not a chore, it becomes a joy.

This was a very inspiring answer, and I think it described perfectly how diverse and interesting multimedia research is. Thinking about the list of projects you describe, it seems that all of them address societally important challenges (health care, security, etc.). How important do you think it is to address problems that are helpful for society, and do you think that more researchers in the field of multimedia should follow this path?

I didn’t deliberately set out to address societal challenges in my work and I don’t advocate that everyone should do so in all their work. The samples of my work I mentioned earlier just happen to be like that but sometimes it is worth doing something just because it is interesting even though it may end up as a cul-de-sac. We can learn so much from going down such cul-de-sacs both for ourselves as researchers, for our own development, as well as contributing to knowledge that something does not work.

In your whole interview so far you did not mention A.I. or deep learning. Could you please share your view on this hot topic and its influence on the multimedia community (if possible positive and negative aspects)?

Imagine, a whole conversation on multimedia without mentioning deep learning, so far! Yes indeed it is a hot topic and there’s a mad scramble to use and try it for all kinds of applications because it is showing such improvement in many tasks, and yes indeed it has raised the bar in terms of the quality of some tasks in multimedia, like concept indexing from visual media. However those of us around long enough will remember the “AI Winter” from a few decades ago, and we can’t let this great breakthrough raise expectations that we and others may have about what we can do with multi-modal and multimedia information.

So that’s the word of caution about expectations, but when this all settles down a bit and we analyse the “why” behind the success of deep learning we will realise that the breakthrough is a result of closer modelling of our own neural processes. Early implementations of our own neural processing were in the form of multi-connected networks, and things like the Connection Machine were effectively unstructured networks. What deep learning is doing is applying structure to the network by adding layers. Going forward, I believe we will turn more and more to neuroscience to inform us about other more sophisticated network structures besides layers, which reflect how the brain works and, just as today’s layered neural networks replicate one element, we will use other neural structures for even more sophisticated (and deeper) learning.

ACM candidature statement:

I am honored to run for the position of Chair of SIGMM. I have been an active member of ACM since I hosted the SIGIR conference in Dublin in 1994 and have served in various roles for SIGMM events since the early 2000s.

I see two ways in which we can maintain and grow SIGMM’s relevance and importance. The first is to grow collaborations we have with other areas. Multimedia technologies are now a foundation stone in many application areas, from digital humanities to educational technologies, from gaming to healthcare. If elected chair I will seek to reach out to other areas collaboratively, whereby their multimedia problems become our challenges, and developments in our area become their solutions.

My second priority will be to support a deepening of collaborations within our field. Already we have shown leadership in collaborative research with our Grand Challenges, Videolympics, and the huge leverage we get from shared datasets, but I believe this could be even better.
Reaching out to others and deepening collaborations will improve SIGMM’s ability to attract and support new members while keeping existing members energised and rejuvenated, ensuring SIGMM is the leading special interest group on multimedia.


Bios

 

Prof. Alan Smeaton: 

Since 1997 Alan Smeaton has been a Professor of Computing at Dublin City University. He joined DCU (then NIHED) in 1987 having completed his PhD in UCD under the supervision of Prof. Keith van Rijsbergen. He also completed an M.Sc. and  B.Sc. at UCD.

In 1994 Alan was chair of the ACM SIGIR Conference which he hosted in Dublin, program co-chair of  SIGIR in Toronto in 2003 and general chair of the Conference on Image and Video Retrieval (CIVR) which he hosted in Dublin in 2004.  In 2005 he was program co-chair of the International Conference on Multimedia and Expo in Amsterdam, in 2009 he was program co-chair of ACM MultiMedia Modeling conference in Sophia Antipolis, France and in 2010 co-chair of the program for CLEF-2010 in Padova, Italy.

Alan has published over 600 book chapters, journal and refereed conference papers as well as dozens of other presentations, seminars and posters and he has a Google Scholar h-index of 58. He was an Associate Editor of the ACM Transactions on Information Systems for 8 years, and has been a member of the editorial board of four other journals. He is presently a member of the Editorial Board of Information Processing and Management.

Alan has graduated 50 research students since 1991, the vast majority at PhD level. He has acted as examiner for PhD theses in other Universities on more than 30 occasions, and has assisted the European Commission since 1990 in dozens of advisory and consultative roles, both as an evaluator or reviewer of project proposals and as a reviewer of ongoing projects. He has also carried out project proposal reviews for more than 20 different research councils and funding agencies in the last 10 years.

More recently Alan is a Founding Director of the Insight Centre for Data Analytics, Dublin City University (2013-2019), the largest single non-capital research award given by a research funding agency in Ireland. He is Chair of ACM SIGMM (Special Interest Group on Multimedia) (2017-) and a member of the Scientific Committee of COST (European Cooperation in Science and Technology), an EU funding program with a budget of €300m in Horizon 2020.

In 2001 he was joint (and founding) coordinator of TRECVid, the largest worldwide benchmarking evaluation on content-based analysis of multimedia (digital video), which has run annually since then, and way back in 1991 he was a member of the founding steering group of TREC, the annual Text REtrieval Conference carried out at the US National Institute of Standards and Technology, 1991-1996.

Alan was awarded the Royal Irish Academy Gold Medal for Engineering Sciences in 2015. Awarded once every 3 years, the RIA Gold Medals were established in 2005 “to acclaim Ireland’s foremost thinkers in the humanities, social sciences, physical & mathematical sciences, life sciences, engineering sciences and the environment & geosciences”.

He was jointly awarded the Niwa-Takayanagi Prize by the Institute of Image Information and Television Engineers, Japan, for outstanding achievements in the field of video information media and in promoting basic research in this field. He is a member of the Irish Research Council (2012-2015, 2015-2018), an appointment by the Irish Government, and winner of the Tony Kent Strix Award (2011) from the UK e-Information Society for “sustained contributions to the field of … indexing and retrieval of image, audio and video data”.

Alan is a member of the ACM, a Fellow of the IEEE and a Fellow of the Irish Computer Society.

Michael Alexander Riegler: 

Michael is a scientific researcher at Simula Research Laboratory. He received his Master’s degree from Klagenfurt University with distinction and finished his PhD at the University of Oslo in two and a half years. His PhD thesis topic was efficient processing of medical multimedia workloads.

His research interests are medical multimedia data analysis and understanding, image processing, image retrieval, parallel processing, crowdsourcing, social computing and user intent. Furthermore, he is involved in several initiatives like the MediaEval Benchmarking Initiative for Multimedia Evaluation, which this year runs the Medico task (automatic analysis of colonoscopy videos; see http://www.multimediaeval.org/mediaeval2017/medico/).

An interview with Prof. Ramesh Jain

Prof. Ramesh Jain in the year 2016.

Please describe your journey into computing from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

I am luckier than most people in that I have been able to experience really diverse situations in my life. Computing was just being introduced at Indian Universities when I was a student, so I never had a chance to learn computing in a classroom setting.  I took a few electronics courses as part of my undergraduate education, but nothing even close to computing.  I first used computers during my doctoral studies at the Indian Institute of Technology, Kharagpur, in 1970.  I was instantly fascinated and decided to use this emerging technology in the design of sophisticated control systems.  The information I picked up along the way was driven by my interests and passion.

I grew up in a traditional Indian Ashram, with no facilities for childhood education, so this was not the first time I faced a lack of formal instruction.  My father taught me basic reading, writing, and math skills and then I took a school placement exam.  I started school at the age of nine in fifth grade.

During my doctoral days, two areas fascinated me: computing and cybernetics. I decided to do my research in digital control systems because it gave me a chance to combine computing and control. At the time, the use of computing was very basic: digitizing control signals and understanding the effect of digitization. After my PhD, I became interested in artificial intelligence and entered AI through pattern recognition.

In my current research, I am applying cybernetics to health.  Computing has finally matured enough that it can be applied in real control systems that play a critical role in our lives.  And what is more important to our well-being than our health?

The main driver of my career has been realizing that ultimately I am responsible for my own learning. Teachers are important, but ultimately I learn what I find interesting.  The most important attribute in learning is a person’s curiosity and desire to solve problems.  

Something else significantly impacted my thinking in my early research days.  I found that it is fundamental to accept ignorance about a problem and then examine concepts and techniques from multiple perspectives.  One person’s or one research paper’s perspective is just that—an opinion.  By examining multiple perspectives and relating those to your experiences, you can better understand a problem and its solutions.

Another important lesson is that problems or concepts are often independent of the academic and other organisational walls that exist. Interesting problems always require perspectives, concepts, and technologies from different academic disciplines. Over time, it’s then necessary to create new disciplines, or, as Thomas Kuhn called them, new paradigms [Kuhn 62].

In the late 1980s, much of my research was addressing different aspects of computer vision.  I was frustrated by the slow progress in computer vision.  In fact, I coauthored a paper on this topic that became quite controversial [Jain 91].  It was clear that computer vision could be central to computing in the real world, such as in industry, medical imaging, and robotics, but it was unable to solve any real problems.  Progress was slow.  

While working on object recognition, it became increasingly obvious to me that images alone do not contain enough information to solve the vision problem.  Projection of real-world images to a photograph results in a loss of information that can only be recovered by combining information from many other sources, including knowledge in many different forms, metadata, and other signals.  I started thinking that our goal should be to understand the real world using sensors and other sources of knowledge, not just images.  I felt that we were addressing the wrong problem—understanding the physical world using only images.  The real problem is to understand the physical world.  The physical world can only be understood by capturing correlated information.  To me, this is multimedia: understand the physical world using multiple disparate sensors and other sources of information.

This is a very good definition of multimedia. In this context, what do you think is the future of multimedia research in general?

Different aspects of the physical world must be captured using different types of sensors. In the early days, multimedia concerned itself with the two most dominant human senses: vision and hearing. As the field advances, we must deal with every type of sensor that is developed to capture information in different applications. Multimedia must become the area that processes disparate data in context to convert it to information.

Taking into account that you have been working with AI for such a long time, what do you think about the current trend of deep learning and how it will develop?

Every field has its trends. Learning is definitely a very important step in AI and has attracted attention from the early days. However, it was known that reasoning and search play an equally important role in AI. Ultimately, problem solving depends on recognizing real-world objects and patterns, and here learning plays a key role. To design successful deep systems, learning needs to be combined with search and reasoning.

Prof. Ramesh Jain at an early stage of his career (1975).

Please tell us more about your vision and objectives behind your current roles. What do you hope to accomplish, and how will you bring this about?

One thing that is of great interest to every human is their health.  Ironically, technology utilization in healthcare is not as pervasive as in many other fields.  Another intriguing fact about technology and health is that almost all progress in health is due to advances in technology, but barriers to using technology are also the most overwhelming in health.  I experienced the terrifying state of healthcare first hand while going through treatment for gastro-esophageal cancer in 2004.  It became clear to me during my fight with cancer that technology could revolutionize most aspects of treatment—from diagnosis to guidance and operationalization of patient care and engagement—but it was not being used.  During that period, it became clear to me that multimodal data leading to information and knowledge is the key to success in this and many other fields.  That experience changed my thinking and research.

Ancient civilizations observed that health is not the absence of disease; disease is a perturbation of a healthy state. This wisdom was based on empirical observations and resulted in guidelines for healthy living that include diet, sleep, and whole-body exercise, such as yoga or tai chi. Now is the time to develop scientific guidelines based on the latest evolving knowledge and technology to maximize periods of overall health and minimize suffering during diseases in human lives. It seems possible to raise life expectancy to 100+ years for most people. I want to cross the 100-year threshold myself and live an active life until my last day. I am working toward making that happen.

Technology for healthcare is an increasingly popular topic. Data is at the center of healthcare, and new areas like precision health and wellness are becoming increasingly popular. At the University of California, Irvine (UCI), we’ve created a major effort to bring together researchers from Information and Computer Sciences, Health Sciences, Engineering, Public Health, Nursing, Biology, and other fields who are adopting a novel perspective in an effort to build technology that empowers people. From this perspective, we adopt a cybernetics approach to health. This work is being done at UCI’s Institute for Future Health, of which I am the founding director.

At the Institute for Future Health, currently we are building a community that will do academic research as well as work closely with industry, local communities, hospitals, and start-up companies. We will also collaborate with global researchers and practitioners interested in this approach.  There is significant interest from several institutions in several countries to collaborate and pursue this approach.

This is very interesting and relevant! Do you think that the multimedia community will be open to such a direction or, since it is so important and societally relevant, would it be good to build a new research community around this idea?

As you said, this is the most important research direction I have been involved in, and the most challenging. And this is an important direction in itself; this needs to happen using all tech and other resources.

Since I cannot wait for any community to be ready to address this, I started building a community to address Future Health. But I believe that this could be the most relevant application for multimedia technology, and that the techniques from multimedia are very relevant to this area.

It is an exciting problem because the time is right to address this area.

Do you think that the multimedia community has the right skills to address medical multimedia problems and how could the community be encouraged into that direction?

The multimedia community is better equipped than any other community to deal with diverse types of data. New tools will be required for new challenges, but we already have enough tools and techniques to address many current challenges. To do this, however, the community has to become an open, forward-looking community, going beyond visual information to consider all other modes that are currently ignored under ‘meta data’. All data is data and contributes to information.

Can you profile your current research and its challenges, opportunities, and implications?

I am involved in a research area that is one of the most challenging and that has implications for every human.

The most exciting aspect of health is that it is truly a multimodal data-intensive operation.  As discussed by Norbert Wiener in his book Cybernetics [Wiener 48] about 75 years ago, control and communication processes in machines and animals are similar and are based on information.  Until recently, these principles formed the basis for understanding health, but they can now be used to control health as well.  This is exciting for everybody, and it motivates me to work hard and make something happen. For others, but also for me.

We can discuss some fundamental components of this area from a cybernetics/information perspective:

Creating individual health model:  Each person is unique.  Our bodies and lives are determined by two major factors:  genetics and lifestyle.  Until recently, personal genome information was difficult to obtain, and personal lifestyle information was only anecdotally collected.  This century is different. Personal genomic, in fact all Omics, data is becoming easier to get and more precise and informative. And mobile phones, wearables, the Internet of Things (IoTs) around us, and social media are all coming together to quantitatively determine different aspects of our lifestyles as well as many bio-markers.

This requires combining multimodal data from different sources, which is a challenge. By collecting all such lifestyle data, we can start assembling a log of information—a kind of multimodal lifelog on turbo charge—that could be used to build a model of a person using event mining tools.  By combining genomic and lifestyle data, we can form a complete model of a person that contains all detailed health-related information.

Aggregating individual health models to population disease models:  Current disease models rely on limited data from real people.  Until recently, it was not possible to gather all such data. As discussed earlier, the situation is rapidly changing.  Once data is available for individual health models, it could be sliced and diced to formulate disease models for different populations and demographics.  This will be revolutionary.

Correlating health and related knowledge to actions for each individual and for society: Cybernetics underlies most complex real-time engineering systems. The concept of feedback, used to generate a corrective signal that is applied to a system to take it from the current state to a desired state, is essential in all real-time control systems. Even for the human body, homeostasis uses similar principles. Can we use this to guide people in their lifestyle choices and medical compliance?
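As a toy illustration of the feedback principle described above, the following minimal sketch moves a single numeric state toward a desired value by repeatedly applying a correction proportional to the error. The function names, the `gain` parameter, and the use of one scalar state are assumptions made for illustration only; they do not describe any system built at the Institute for Future Health.

```python
def feedback_step(current, desired, gain=0.5):
    """Apply one proportional correction and return the new state.

    `gain` is a hypothetical tuning parameter: the fraction of the
    remaining error that is corrected in a single step.
    """
    error = desired - current        # how far we are from the goal
    correction = gain * error        # feedback signal derived from the error
    return current + correction


def run_feedback(start, desired, gain=0.5, steps=10):
    """Iterate the feedback loop and return every visited state."""
    state = start
    history = [state]
    for _ in range(steps):
        state = feedback_step(state, desired, gain)
        history.append(state)
    return history


if __name__ == "__main__":
    # Each step halves the remaining distance to the desired state.
    for step, value in enumerate(run_feedback(start=0.0, desired=10.0)):
        print(step, round(value, 3))
```

With a gain between 0 and 1 the state approaches the desired value without overshooting; real control systems, and certainly human physiology, involve delays, noise, and many interacting variables that this sketch ignores.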

Navigation systems are a good example of how an old, tedious problem can become extremely easy to use.  Only 15 years ago, we needed maps and a lot of planning to visit new places.  Now, mobile navigation systems can anticipate upcoming actions and even help you correct your mistakes gracefully, in real time.  They can also identify traffic conditions and suggest the best routes.

If technology can do this for navigation in the physical world, can we develop technology to help us select appropriate lifestyle decisions and do so perpetually?  The answer is obviously yes.  By compiling all health and related knowledge, determining your current personal health situation and surrounding environmental situations, and using your past chronicle to log your preferences, it can provide you with suggestions that will make your life not only more healthy but also more enjoyable.

This is our dream at the Institute for Future Health.

Future Health: Perpetual enhancement of health by managing lifestyle and environment.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

I am lucky to have been active for more than four decades and to have had the opportunity to participate in research and entrepreneurial activities in multiple countries at the best organizations. This gave me a chance to interact with the brightest young people as well as seasoned creative visionaries and researchers.  Thus, it is difficult for me to decide what to list.  I will adopt a chronological approach to answer your question.

Working in H.H. Nagel’s research group in Hamburg, Germany, I got involved in developing an approach to motion detection and analysis in 1976. We wrote the first papers on video analysis that worked with traffic video sequences and detected and analyzed the motion of cars, pedestrians, and other objects. Our paper at IJCAI 1977 [Jain 77] was remarkable in showing these results at a time when digitizing a picture was a chore lasting minutes and the most powerful computer could not store a full video frame in its memory. Even today, the first step in many video analysis systems is differencing, as proposed in that work.
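For readers unfamiliar with differencing, here is a minimal sketch of the general idea: compare two consecutive grayscale frames pixel by pixel and flag the pixels whose intensity changed by more than a threshold. This is a generic illustration, not the method of [Jain 77]; the function names, the list-of-rows frame representation, and the threshold value are all assumptions.

```python
def difference_mask(prev_frame, curr_frame, threshold=25):
    """Return a boolean mask marking pixels that changed between frames.

    Frames are lists of rows of grayscale intensities (0-255).
    `threshold` is a hypothetical sensitivity setting.
    """
    return [
        [abs(curr - prev) > threshold for prev, curr in zip(prev_row, curr_row)]
        for prev_row, curr_row in zip(prev_frame, curr_frame)
    ]


def changed_pixel_count(mask):
    """Count how many pixels were flagged as changed."""
    return sum(cell for row in mask for cell in row)


if __name__ == "__main__":
    # A bright "object" (200) moves one pixel to the right on a dark background (10).
    previous = [
        [10, 10, 10, 10],
        [10, 200, 10, 10],
        [10, 10, 10, 10],
    ]
    current = [
        [10, 10, 10, 10],
        [10, 10, 200, 10],
        [10, 10, 10, 10],
    ]
    mask = difference_mask(previous, current)
    print(mask[1])
    print(changed_pixel_count(mask))
```

Both the pixel the object left and the pixel it entered are flagged, which is why differencing is typically only a first step before further analysis of what actually moved.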

Many bright people contributed powerful ideas in computer vision from my groups.  E. North Coleman was possibly the first person to propose Photometric Stereo in 1981 [Coleman].  Paul Besl’s work on segmentation using surface characteristics and 3D object recognition made a significant impact [Besl]. Tom Knoll did some exciting research on feature-indexed hypotheses for object recognition.  But Tom’s major contribution to current computer technology was his development of Photoshop when he was doing his PhD in my research group.  As we all know, Photoshop revolutionized how we view photos. Working with Kurt Skifstad at my first company Imageware, we demonstrated the first version of capturing a 3D shape of a person’s face and reproducing it using a machine in the next room at the Autofact Conference in 1994. I guess that was a primitive version of 3D printing.  At the time, we called it 3D fax.

The idea of designing a content-based organization to build a large database of images was considered crazy in 1990, but it bugged me so much that I started first a project and later a company, Virage, working with several people. In fact, Bradley Horowitz left his research at MIT to join me in building Virage and later he managed the project that brought Google Photos to its current form. That process of building video databases resulted in my realizing that photos and videos are a lot more than just intensity values. And that realization led me to champion the idea that information about the physical world can be recovered more effectively and efficiently by combining correlated, but incomplete, information from several sources, including metadata. This was the thinking that encouraged me to start building the multimedia community.

Since computing and camera technology had advanced enough by 1994, my research group at the University of California, San Diego (UCSD), particularly Koji Wakimoto [Jain 95] and then Arun Katkere and Saeed Moezzi [Moezzi 96], helped develop first Multiple Perspective Interactive Video and later Immersive Video to realize compelling telepresence. That research area in various forms attracted people from the movie industry as well as people interested in different art forms and collaborative spaces. By licensing our patents from UCSD, we started a company, Praja, to bring immersive video technology to sports. I left academia to be the CEO of Praja.

While developing technology for indexing sporting events, it became obvious that events are as important as objects, if not more, when indexing multimedia data. Information about events comes from separate sources, and events combine different dimensions that play a key role in our understanding of the world. This realization resulted in Westermann and me working on a general computational model for events. Later we realized that by aggregating events over space and time, we could detect situations. Vivek Singh and Mingyan Gao helped prototype an EventShop platform [Singh 2010], which was later converted to an open source platform under the leadership of Siripen Pongpaichet.

One of the most fundamental problems in society is connecting people’s needs to appropriate resources effectively, efficiently, and promptly in a given situation.  To understand people’s needs, it is essential to build objective models that could be used to recommend correct resources in given situations.  Laleh Jalali started building an event-mining framework that could be used to build an objective self model using the different types of data streams related to people that have now become easily available [Jalali 2015].  

All this work is leading to a framework that is behind my current thinking related to health intelligence. In health intelligence, our goal is to perpetually measure a person’s activities, lifestyle, environment, and bio-markers to understand his/her current state as well as continuously build his/her model. Using that model, current state, and medical knowledge, it is possible to provide perpetual guidance to help people take the right action in a given situation.

Over your distinguished career, what are the top lessons you want to share with the audience?

I have been lucky to get a chance to work on several fun projects. More importantly, I have worked closely on an equal number of successful and not so successful projects. I consider a project successful if it accomplishes its goal and the people working on the project enjoy it. Although each project is unique, I’ve noticed that some common themes make a project successful.

Passion for the Project: Time and again, I’ve seen that passion for the project makes a huge difference. When people are passionate, they don’t consider it work and will literally do whatever is required to make it successful. In my own case, I find that the ideas that I find compelling, both in terms of their goals and implications, are the ones that motivate me to do my best. I am focused, driven, and willing to work hard. I learned long ago to work only on problems that I find important and compelling. Some ideas are just not for me; in those cases, it is better for the project and for me if I dissociate from it at the first opportunity to do so.

Open Mind: Departmental or similar boundaries in both academia and industry severely restrict how a problem is addressed. Solving a problem should be the goal, not using the resources or technology of a specific department. In academia, I often hear things like “this is not a multimedia problem” or “this is a database problem.” Usually, the goal of a project is to solve a problem, so we should use the best technique or resource available to solve the problem.

Most of the boundaries for academic disciplines are artificial, and because they keep changing, the departments based on any specific factor will likely also change over time.  By addressing challenging problems using appropriate technology and resources, we push boundaries and either expand older boundaries or create new disciplines.

Another manifestation of an open mind is the ability to see the same problem from multiple perspectives.  This is not easy—we all have our biases.  The best thing to do is to form a group of researchers from diverse cultural and disciplinary backgrounds.  Diversity naturally results in diverse perspectives.

Persistence: Good research is usually the result of sustained efforts to understand and solve a challenge. Many intrinsic and extrinsic issues must be handled during a successful research journey. By definition, an important research challenge requires navigating uncharted territories. Many people get frustrated in an unmapped area and when there is no easy way to evaluate progress. In my experience, even some of my brightest students are comfortable only when they can say “I am better than X approach by N%.” In most novel problems, there is no X and no metrics to judge performance. Only a few people are comfortable in such situations where incremental progress may not be computable. We require both kinds of people: those who can improve given approaches and those who can pioneer new areas. The second group requires people that can be confident about their research directions without having concrete external evaluation measures. The ability to work confidently without external affirmation is essential in important deep challenges.

In the current culture, a researcher’s persistence is also tested by “publish or perish” oriented colleagues who determine the quality of research by acceptance rates at the so-called top conferences. When your papers are rejected, you are dejected and sometimes feel that you are doing the wrong research.  Not always true.  The best thing about these conferences is that they test your self-confidence.

We have all read the stories about the research that ultimately resulted in the WWW and the paper on PageRank that later became the foundation of Google search.  Both were initially rejected. Yet, the authors were confident in their work so they persevered.  When one of my papers gets rejected (which is more often the case than with my much inferior papers), much of the time the reviewers are looking for incremental work—the trendy topics—and don’t have time, openness, and energy to think beyond what they and their friends have been doing. I read and analyze reviewers’ comments to see whether they understood my work and then decide whether to take them seriously or ignore them.  In other words, you have to be confident of your own ideas and review the reviews to decide your next steps.

I noticed that one of your favourite quotes is “Imagination is more important than knowledge.” In this regard, do you think there is enough “imagination” in today’s research, or are researchers mainly driven/constrained by grants, metrics, and trends? 

The complete quote by Albert Einstein is “Imagination is more important than knowledge. For knowledge is limited, whereas imagination embraces the entire world, stimulating progress, giving birth to evolution.”  So knowledge begins with imagination. Imagination is the beginning of a hypothesis. When the hypothesis is validated, that results in knowledge.

People often seek short-term rewards.  It is easier to follow trends and established paradigms than to go against them or create new paradigms.  This is nothing new; it has always happened. At one time scientists, like Galileo Galilei, were persecuted for opposing the established beliefs. Today, I only have to worry about my papers and grant proposals getting rejected.  The most engaged researchers are driven by their passion and the long-term rewards that may (or may not) come with it.

Albert Einstein (Source: Planet Science)

References:

  1. T. S. Kuhn, The Structure of Scientific Revolutions. Chicago: University of Chicago Press, 1962. ISBN 0-226-45808-3.
  2. R. Jain and T. O. Binford, “Ignorance, Myopia, and Naiveté in Computer Vision Systems,” CVGIP, Image Understanding, 53(1), 112-117. 1991.
  3. Norbert Wiener, Cybernetics: Or Control and Communication in the Animal and the Machine. Paris (Hermann & Cie) & Camb. Mass. (MIT Press). ISBN 978-0-262-73009-9; 2nd revised ed. 1961.
  4. R. Jain, D. Militzer and H. Nagel, “Separating a Stationary Form from Nonstationary Scene Components in a Sequence of Real World TV Frames,” Proceedings of IJCAI 77, Cambridge, Massachusetts, 612-618. 1977.
  5. E. N. Coleman and R. Jain, “Shape from Shading for Surfaces with Texture and Specularity,” Proceedings of IJCAI. 1981.
  6. P. Besl and R. Jain, “Invariant Surface Characteristics for 3-D Object Recognition in Depth Maps,” Computer Vision, Graphics and Image Processing, 33, 33-80. 1986.
  7. R. Jain and K. Wakimoto, “Multiple Perspective Interactive Video,” Proceedings of IEEE Conference on Multimedia Systems. May 1995.
  8. S. Moezzi, Arun Katkere, D. Kuramura, and R. Jain, “Reality Modeling and Visualization from Multiple Video Sequences,” IEEE Computer Graphics and Applications, 58-63. November 1996.
  9. Vivek Singh, Mingyan Gao, and Ramesh Jain, “Social Pixels: Genesis and evaluation,” Proc. ACM Multimedia, 2010.
  10. Laleh Jalali and Ramesh Jain, “Bringing Deep Causality to Multimedia Data Streams,” ACM Multimedia 2015, 221-230.

Bios

 

About Prof. Ramesh Jain: 

Ramesh Jain is an entrepreneur, researcher, and educator. He is a Donald Bren Professor in Information & Computer Sciences at University of California, Irvine. Earlier, he was at Georgia Tech, University of California, San Diego, University of Michigan, and some other universities in many countries. He was educated at Nagpur University (B.E.) and Indian Institute of Technology, Kharagpur (Ph.D.) in India. His current research is in Social Life Networks, including EventShop and Objective Self, and Health Intelligence. He has been an active member of the professional community, serving in various positions, contributing more than 400 research papers, and coauthoring several books, including textbooks in Machine Vision and Multimedia Computing. He is a Fellow of AAAI, AAAS, ACM, IEEE, IAPR, and SPIE.

Ramesh co-founded several companies, managed them in their initial stages, and then turned them over to professional management. He also advised major companies in multimedia and search technology. He still enjoys the thrill of a start-up environment.

His research and entrepreneurial interests have been in computer vision, AI, multimedia, and social computing. He is the founding director of the Institute for Future Health at UCI.

Michael Alexander Riegler: 

Michael is a scientific researcher at Simula Research Laboratory. He received his Master’s degree from Klagenfurt University with distinction and finished his PhD at the University of Oslo in two and a half years. His PhD thesis topic was efficient processing of medical multimedia workloads.

His research interests are medical multimedia data analysis and understanding, image processing, image retrieval, parallel processing, gamification and serious games, crowdsourcing, social computing and user intentions. Furthermore, he is involved in several initiatives like the MediaEval Benchmarking initiative for Multimedia Evaluation, which this year runs the Medico task (automatic analysis of colonoscopy videos; http://www.multimediaeval.org/mediaeval2017/medico/).

Since 1997 Alan Smeaton has been a Professor of Computing at Dublin City University. He joined DCU (then NIHED) in 1987 having completed his PhD in UCD under the supervision of Prof. Keith van Rijsbergen. He also completed an M.Sc. and  B.Sc. at UCD.

In 1994 Alan was chair of the ACM SIGIR Conference which he hosted in Dublin, program co-chair of  SIGIR in Toronto in 2003 and general chair of the Conference on Image and Video Retrieval (CIVR) which he hosted in Dublin in 2004.  In 2005 he was program co-chair of the International Conference on Multimedia and Expo in Amsterdam, in 2009 he was program co-chair of ACM MultiMedia Modeling conference in Sophia Antipolis, France and in 2010 co-chair of the program for CLEF-2010 in Padova, Italy.

Alan has published over 600 book chapters, journal and refereed conference papers as well as dozens of other presentations, seminars and posters and he has a Google Scholar h-index of 58. He was an Associate Editor of the ACM Transactions on Information Systems for 8 years, and has been a member of the editorial board of four other journals. He is presently a member of the Editorial Board of Information Processing and Management.

Alan has graduated 50 research students since 1991, the vast majority at PhD level. He has acted as examiner for PhD theses in other Universities on more than 30 occasions, and has assisted the European Commission since 1990 in dozens of advisory and consultative roles, both as an evaluator or reviewer of project proposals and as a reviewer of ongoing projects. He has also carried out project proposal reviews for more than 20 different research councils and funding agencies in the last 10 years.

More recently Alan is a Founding Director of the Insight Centre for Data Analytics, Dublin City University (2013-2019), the largest single non-capital research award given by a research funding agency in Ireland. He is Chair of ACM SIGMM (Special Interest Group in Multimedia), (2017-) and a member of the Scientific Committee of COST (European Cooperation in Science and Technology), an EU funding program with a budget of €300m in Horizon 2020.

In 2001 he was joint (and founding) coordinator of TRECVid, the largest worldwide benchmarking evaluation on content-based analysis of multimedia (digital video), which has run annually since then, and way back in 1991 he was a member of the founding steering group of TREC, the annual Text Retrieval Evaluation Conference carried out at the US National Institute for Standards and Technology, 1991-1996.

Alan was awarded the Royal Irish Academy Gold Medal for Engineering Sciences in 2015. Awarded once every 3 years, the RIA Gold Medals were established in 2005 “to acclaim Ireland’s foremost thinkers in the humanities, social sciences, physical & mathematical sciences, life sciences, engineering sciences and the environment & geosciences”.

He was jointly awarded the Niwa-Takayanagi Prize by the Institute of Image Information and Television Engineers, Japan for outstanding achievements in the field of video information media and in promoting basic research in this field.  He is a member of the Irish Research Council (2012-2015, 2015-2018), an appointment by the Irish Government and winner of Tony Kent Strix award (2011) from the UK e-Information Society for “sustained contributions to the field of … indexing and retrieval of image, audio and video data”.

Alan is a member of the ACM, a Fellow of the IEEE and is a Fellow of the Irish Computer Society.

Michael Alexander Riegler:  

Michael is a scientific researcher at Simula Research Laboratory. He received his Master’s degree from Klagenfurt University with distinction and finished his PhD at the University of Oslo in two and a half years. His PhD thesis topic was efficient processing of medical multimedia workloads.

His research interests are medical multimedia data analysis and understanding, image processing, image retrieval, parallel processing, crowdsourcing, social computing and user intent. Furthermore, he is involved in several initiatives like the MediaEval Benchmarking initiative for Multimedia Evaluation, which this year runs the Medico task (automatic analysis of colonoscopy videos; http://www.multimediaeval.org/mediaeval2017/medico/).

An interview with David Ayman Shamma

 

Describe your journey into computing from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

I’ve always been curious about solving problems. Not so much the answer; I actually like to know how a problem can be broken down into parts, abstracted, and reasoned with, which often drives us to think about abstraction (is there a non-specific instance of this problem?), theory (is there some known literature from the mathematical or social sciences that will help us frame what’s happening?), and analogy (can we solve this because its structure is like another problem?). My education included classes in psychology, philosophy, math, and engineering; eventually I realized Computer Science and specifically Artificial Intelligence embodied everything I was looking for: understanding people, modeling problems, and building new systems.

Interestingly enough, as an undergrad I took a job in an art department at the local state college as a technician; my job was to keep their Macs running with Adobe products. While I was there, I was allowed to audit studio art classes. I began to see how artistic and creative processes were influenced by the tools we have, be it a 1:50 D-76 bath with fiber based paper in a darkroom or masking layers in Photoshop. This connection between creative and constructive processes carried into my work at NASA’s Center for Mars Exploration, where I worked on diagrammatic knowledge tools, and then into my Ph.D. on community-driven multimedia systems. It was around this time that I saw ACM Multimedia 2004 had a call for technical papers in the Interactive Arts. Since then I’ve been active in the community, mostly focused on the Arts track, but as my work began to include social computing in 2009 I started to think about hybrid social-visual systems. In 2013, I was the Technical Program Co-chair, and we started to look critically at the broad technical areas and the review process, and started some inclusion and diversity initiatives.

The main foundational lesson for me is to continue asking the right questions, even if you’re branching out of some smaller, under-represented area or track. In many cases, you’ll find new exciting research questions. That said, I found I need to couple this with a personal understanding of the outside domain; only then can a truly functional hybrid system work. It’s not enough to look at divergent sources as just a big bag of the same data. Pixels, tags, comments, and clicks all carry an explicit or tacit semantic implication; respect that.

Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

My Ph.D. dealt with social computing and community semantics: the objects in a photo carry a broader semantic conversation context of the online site sharing that photo. When I graduated, I joined an industry research lab. I spent 10 years there through a few organizational shifts. In my last 4 years there I founded the HCI Research group with a charter on investigating what our research meant to people.  My group’s research spanned across several domains: multimedia, computer vision, information visualisation, social computing, ethnography, and physical computing; this gave me deep perspective across many areas.  Personally, understanding how things are connected and what those connections meant became a focus of my research.  Data is created for a reason and structured link data can carry a tacit semantic that helps us understand people and tasks in the world. Lately, I’ve been thinking about physical spaces where people interact and create content. What sort of camera do you have on you? How does it change your practice of photography? What sensors might be in your clothes or in the world? These questions have been part of my current focus at Centrum Wiskunde & Informatica.  We’ve been working with a Dutch fashion designer in Amsterdam investigating how fashion and technology can be used in various situational tasks and environments through instrumenting clothing and creating structured data to understand people’s activity and flocking.  What’s exciting beyond the research is connecting goals of a fashion designer and computer science research; it’s an exciting bridge to create. Once all the fabric and sensors are accounted for, it becomes a social computing problem again…that’s where I like to live, creating bridges.

Can you profile your current research, its challenges, opportunities, and implications?

Now more than ever, we are a function of our own data. Data drives much of computing today, be it data science or machine learning driven. I like to emphasize how we collect and label data, as it has direct consequences on what we can analyze, predict, and create. For many, this means harvesting data for use. For me, it means understanding how people act, behave, and communicate through those signals. For example, at CSCW 2016 I published some work where we looked at the browsing behavior of millions of people on Flickr, which we matched to a relatively small set of editorial judgements to surface high quality geo-tagged weather photos. The alternate approach, which they did attempt at first, was to just train a neural net to find photos of storms or lightning or sunny days. While that’s recall optimistic, the editors were quick to point out everyone takes crummy photos of lightning, so conventional approaches didn’t work. My research took a different approach: instead of training generic aesthetics into the system, we modeled a community-centric approach. Using the tacit aesthetic judgments from the Flickr community, we coupled the structured link data with a CNN to surface high quality photos. It’s not a case of active learning; in fact, it’s a supervised model where that supervision comes from implicit community actions and explicit editorial judgements. We have some similar work to be published at CHI 2017 later this year where we were surfacing deviant/abuse images on Tumblr, a task that was even harder as the image may not be representative of such behavior, so the social-visual system was a necessity.

Taking your interest in AI and fashion into account, I am wondering what you generally think about the current hype around deep learning, particularly in the context of fashion research. Do you think AI-based systems will ever be able to understand context, which is an important factor in fashion?

You know, I remember when DeepBlue beat Kasparov back in the 90s and while it was great, I didn’t think much of it as an AI victory (nor did IBM if I recall). The recent win by AlphaGo is different and something amazing. I don’t think it’s hype, as things work and work well; however, we still face many of the same limitations. With regard to fashion, it’s a great time to be excited about AI. I mean we see solutions to many of the older research and fashion issues (like point your camera at someone and find the clothes they are wearing to buy online) but I think smart electronics, AI and fashion is the new sweet spot. There have been many advancements in textiles, like pixel-to-stitch knitting, and small electronics make for a fun new playground for AI, sensors, and IoT. We’re just now starting to explore how clothes and fashion can sense, detect, and respond to people and to the environment. I get what you’re saying by AI hype and that’s another discussion, but right now I’m excited to build the next generation of wearable tech.

How generalizable is data from sources like Flickr? For example, are your insights on Flickr also valid in non-western countries?

I certainly have had reviewers ask me how generalizable research is because it used Flickr data or Yelp data or Twitter data or whatever; I see it as the hallmark of a bad review.  On one hand, there is no sense to believe that any slice of a specific social media dataset should be generalizable. People act differently on Flickr than they do on Instagram or on Snapchat.  The application/website dictates an interaction, and really that’s what we are studying—as a research community we need to move beyond just studying naive pixels and examine what it’s doing.  Ok, if you’re just looking for indoor vs outdoor shots in Yelp photos, then maybe.  But have you ever tried to find a restaurant in Japan versus Italy versus America? Store fronts look completely different. Internationalization is rarely studied by multimedia researchers and I think multimedia mediated cultural communication is more important than website generalization. 

I think it would be very interesting if you could also say what you think the role or responsibility of multimedia researchers is in the context of the fake news/alternative news debate. Do you think we should focus on it?

In 2009, I began publishing work on multimedia summarization using aggregated Twitter feeds from the Obama-McCain debate. Back then, people really really wanted to tweet and it was a narrow interest community. A few years later, during the Egyptian revolution of 2011, I ran my methods against the Twitter firehose and saw some misinformation (like a bus on fire that was reported which was actually from another country years ago). Delayed information is a systemic problem, where something happened hours or days ago and it gets propagated as fresh information. I don’t believe we had widespread purposeful propagation of misinformation (at least not like what we see in today’s world). So today, we have misplaced information, delayed information, fake/alt information and the field of multimedia is ripe to handle this problem. For example, take a fake news story with a photo. Has the photo been altered to retell a story? Is the photo from a different news story? Are there clusters of other news sources that contradict? There’s a whole world of multimedia problems, many of which large companies are struggling to get a grip on, in finding fake news, but the hard problem will be the explanation. Identifying fake is half of the problem, explaining to people why it’s fake is the other. News, now more than ever, is highly visual (photos/video) and social; dealing with a plurality of signals is the core of multimedia research.

In this context, do you think that fake news is a problem of social network platforms, or should newspapers also be investigated?

Can you name a news source that does not rely on social network platforms? Conversely, have you seen Twitter deliver news? Their streaming video with tweet interfaces speaks to research we did 10 years back. I don’t think we can decouple the two, but we’ve seen how social media sites tend to amplify things by propagating clickable content. So for a news agency, it starts with the title and snippet of a story and its related photo. But then there are also the fake news agencies gaming the social sites. There’s been some great work from UW cracking the problem, but I think it’s time for multimedia research to step up here, as visual content always carries more engagement.

How would you describe the role of women especially in the field of multimedia?

Diversity of all types (gender, nationality, race) is critically important to the future of multimedia research. When I was on the TPC for Multimedia in 2013, I did some data analytics of the past several years of the conference series; the gender stats were abysmal. We worked hard to increase the gender diversity in the area chairs and in the conference. For the former, following some advice from Maria Klawe that I heard in a lecture maybe 10 years prior, we pushed on topic diversity for the conference. The idea here is that legacy areas can carry legacy diversity problems, so newer areas (social computing, affect, crowdsourcing, music, etc.) are more likely to have better gender leadership ratios. It was the correct approach and we doubled the number of women in leadership roles in the ACs, but there was still much room to grow. We coupled this with finding corporate support for a women’s & diversity lunch, a practice that I’m happy the conference has continued. Diversity brings an expanded set of ideas, methods, and approaches in research. We’ve come a ways since 2013 and I’m very happy to see the 2017 program similarly expand its diversity, but we have a very long way to go to catch up to some other SIGs.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

Impact happens where research connects to people. For me, it usually revolves around creative practice in multimedia. How do online broadcasters DJing house and hip-hop connect with their audience online, and how does it differ from when they are in a club? If you have an iPad and an iPhone and want to take a picture, when do you reach for the iPad to take the photo? If you’re posting a photo to Instagram, what filter will you use to enhance the photo? The most valuable research includes method, system, and people. Let’s take that last one as an example. One could build a prediction model to automatically apply filters based on a training set of what got likes and the types of transformation, but would that change people’s creative practice? We found people enjoyed the process of selection (despite usually picking the same filter over and over again). So the question becomes how we optimize the experience without hindering it.

In my time as Director of Research at Flickr, we enjoyed looking at the full stack: data, machine learning, engineering, visualization, and all the components that affect people and media experience. We knew there was an advantage to easily dive into 13 billion photos and 100 million people but felt, even inside a corporation, there should be more open data for all researchers. This led to the creation of the YFCC100M (http://cacm.acm.org/magazines/2016/2/197425-yfcc100m/fulltext): 100 million Creative Commons images in a single dataset for open research. Beyond the data itself, we found ourselves reviewing small technical Creative Commons details to ensure legal and privacy concerns were met while still opening the data for wide academic and corporate use. The impact has been incredible. Outside of the multimedia and computer vision communities, in the first year since release we’ve seen published work using our dataset from the HCI, Data Science, and Visualization communities, and we were even featured by the Library of Congress. All of this was driven by the idea to share data we felt was too locked up; fortunately Flickr, Creative Commons, and Yahoo Legal shared our vision, and we’ll look to see more impact to come.

Over your distinguished career, what are your top lessons you want to share with the audience?

Really nothing happens in a vacuum. Partnerships and collaborations make things interesting, as they make one malleable and push one to think full stack. This is shaped by my 10 years in an industry lab, where connecting with academia through hosting interns, collaborative work, and sponsorships really fueled my work. I’d say a good 70% of our work was still internally driven, but that 30% outreach was really valuable. Now at an academic lab, I’m doing the reverse. We partnered with a fashion designer to keep connected to their goals and their problems while we think about the wearable and social Internet of Things. It’s great to think without constraints, but really adapting to the real world and thinking end-to-end is a critical driver for me. At the end of the day, I want to use it. Build what you love and make it real. This was easier when I was at a corporation, but there are still plenty of ways to collaborate depending on scope. And really think full stack in system and evaluation. You’ll find yourself evaluating your work on multiple levels, from F-1 metrics to Likert scale surveys. What we do is develop new systems and methods, but work with real impact will affect applications and design. My favorite research (of mine or others) always critically engages with the bigger picture.

Since you are an active researcher in both the US and Europe, what do you think are the main differences? What is positive and what is negative? And what could we learn from each other?

I did a semester sabbatical at the Keio-NUS CUTE center in Singapore a few years back, so it’s not my first dive outside of industry. I’m reminded that in La Nausée Sartre wrote that anyplace you live feels the same after two weeks; the idea being that once you get back to job and life, it becomes the same again. I can’t say I quite agree in this case. The move from an industry lab in California to an academic one in the Netherlands was a bit of a culture and cadence shift. After almost a year, it’s clear to me that the difference is the pace, since we share a research culture. We tend to sprint constantly in industry, and the sprinting seems to come and go in academia. Each style has its pros and cons; there have been times I wanted everyone to be running and times I was happy I could dive into something because we weren’t running. I don’t think it’s something where you enumerate positive and negative points, just a different state of being. I’m not sure why I gave you an existential response either.


Bios

About David Ayman Shamma:

I am a Principal Investigator and Senior Scientist at Centrum Wiskunde & Informatica (CWI), where I lead a team looking at Social Computing, Internet of Things (IoT), and fashion. Formerly, I was Director of Research at Yahoo Labs, where I ran the HCI Research Group and was the scientific liaison to Flickr (where I co-founded the Data-science group). Broadly speaking, I design and prototype systems for multimedia-mediated communication, as well as develop targeted methods and metrics for understanding how people communicate online in small environments and at web scale. Additionally, I create media art installations that have been reviewed by The New York Times, International Herald Tribune, and Chicago Magazine and exhibited internationally, including Second City Chicago, the Berkeley Art Museum, SIGGRAPH ETECH, Chicago Improv Festival, and Wired NextFest/NextMusic.

I have a Ph.D. in Computer Science from the Intelligent Information Laboratory at Northwestern University and a B.S./M.S. from the Institute for Human and Machine Cognition at The University of West Florida. Before Yahoo!, I was an instructor at the Medill School of Journalism; I have also taught courses in Computer Science and Studio Art departments. Prior to receiving my Ph.D., I was a visiting research scientist for the Center for Mars Exploration at NASA Ames Research Center.

Michael Alexander Riegler: 

Michael is a scientific researcher at Simula Research Laboratory. He received his Master’s degree from Klagenfurt University with distinction and finished his PhD at the University of Oslo in two and a half years. His PhD thesis topic was efficient processing of medical multimedia workloads.

His research interests are medical multimedia data analysis and understanding, image processing, image retrieval, parallel processing, gamification and serious games, crowdsourcing, social computing and user intentions. Furthermore, he is involved in several initiatives like the MediaEval Benchmarking initiative for Multimedia Evaluation, which this year runs the Medico task (automatic analysis of colonoscopy videos, http://www.multimediaeval.org/mediaeval2017/medico/).

Interview Column – Introduction

The interviews in the SIGMM Records aim to provide the community with the insights, visions, and views of outstanding researchers in multimedia. With the interviews we particularly try to find out what makes these researchers outstanding and also, to a certain extent, what is going on in their minds: what their visions are and what they think about current topics. Examples from the last issues include interviews with Judith Redi, Klara Nahrstedt, and Wallapak Tavanapong.

The interviews are conducted via Skype or, even better, in person by meeting the researchers at conferences or other community events. We aim to publish three to four interviews a year. If you have suggestions for whom to interview, please feel free to contact one of the column editors, who are:

Michael Alexander Riegler is a scientific researcher at Simula Research Laboratory. He received his Master’s degree from Klagenfurt University with distinction and finished his PhD at the University of Oslo in two and a half years. His PhD thesis topic was efficient processing of medical multimedia workloads.
His research interests are medical multimedia data analysis and understanding, image processing, image retrieval, parallel processing, gamification and serious games, crowdsourcing, social computing and user intentions. Furthermore, he is involved in several initiatives like the MediaEval Benchmarking initiative for Multimedia Evaluation, which this year runs the Medico task (automatic analysis of colonoscopy videos, http://www.multimediaeval.org/mediaeval2017/medico/).


Herman Engelbrecht is one of the directors at the MIH Electronic Media Laboratory at Stellenbosch University. He is a lecturer in Signal Processing at the Department of Electrical and Electronic Engineering. His responsibilities in the Electronic Media Laboratory are the following: managing the immediate objectives and research activities of the Laboratory; regularly meeting with postgraduate researchers and their supervisors to assist in steering their research efforts towards the overall research goals of the Laboratory; ensuring that the Laboratory infrastructure is developed and maintained; managing interaction with external contractors and service providers; managing the capital expenditure of the Laboratory; and managing the University’s relationship with the postgraduate researchers.


Mathias Lux is an associate professor at the Institute for Information Technology (ITEC) at Klagenfurt University. He is working on user intentions in multimedia retrieval and production and emergent semantics in social multimedia computing. In his scientific career he has (co-)authored more than 80 scientific publications, has served on multiple program committees and as a reviewer for international conferences, journals and magazines, and has organized multiple scientific events. Mathias Lux is also well known for the development of the award-winning and popular open source tools Caliph & Emir and LIRe (http://www.semanticmetadata.net) for multimedia information retrieval. Dr. Mathias Lux received his M.S. in Mathematics in 2004 and his Ph.D. in Telematics in 2006 from Graz University of Technology, both with distinction, and his Habilitation (venia docendi) from Klagenfurt University in 2013.
