An interview with Benoit Huet

Benoit at the beginning of his research career.

Describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

This is an excellent question. Indeed, life is a journey, and every step is a lesson. I was originally attracted by electronics, but as I was studying it, I discovered computers. Remember, for those who were old enough in the 1980s, this was the start of personal computers. So I decided to learn more about them, and as I was studying computer science I found out about AI, yes, 1990s-style AI. I was interested, but this coincided with one of the AI winters and I was advised, or rather decided, to go in a different direction. The area that attracted me most was computer vision. The reason was that it seemed like a very hard problem which would clearly have a very broad impact. It turns out that vision alone is indeed very hard, and using additional information or signals could help obtain better results, hence reducing the time to impact for such a scientific approach/method. This was what attracted me to multimedia and kept me busy for many years at EURECOM. What did I learn along the way? Follow your instinct and your heart as you go along, as it is rare to know where to go from the very start. Your destination might not even exist at the time you start your journey!

Tell us more about your vision and objectives behind your current roles. What do you hope to accomplish and how will you bring this about?

Since July 2019 I have headed the Data Science team of MEDIAN Technologies. The objective is to bring recent advances from the fields of computer vision, neural networks, and multimedia to depart from the way medical imaging is currently performed, while providing solutions to detect cancer at the earliest possible stage of the disease and to help identify the best treatment for each patient. Concretely, we are currently working on the identification of biomarkers for Hepatocellular Carcinoma (HCC), which is the most common type of primary liver cancer; the liver is known to be a difficult organ for medical imaging solutions.

Can you profile your current research, its challenges, opportunities, and implications?

To answer this very broad question concisely, I will limit myself to one challenge, one opportunity, and one implication. For the challenge, I will mention one which I have encountered many times in many projects: interdisciplinary communication. In most projects, whether small or large, that involve multiple domains of expertise, the communication between people of different backgrounds is not as straightforward as one would assume. It is important to address this challenge proactively. For the opportunity, medical imaging is nowadays still mostly employing “traditional” machine learning on top of hand-crafted features (Grey-Level Co-occurrence Matrices, Gabor, etc.). The “end to end” paradigm shift brought in by the recent developments in the field of deep neural nets is still to take place in the medical domain at large. This is what we aim to achieve for medical imaging in oncology. The implication is a significant improvement in the detection of cancer, such as the early detection of tumors. Such early detection allows for therapy to take place at the earliest possible stage, hence drastically increasing the patient’s chance of survival. Saving lives, in short.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

There are a number of research works originating from my team that would be worth mentioning here: EventEnricher, for collecting media illustrating events; the Hyper Video Browser, an interactive tool for searching and hyperlinking in broadcast media; or the work resulting from the collaborative project NexGen-TV, providing real-time insight during political debates in a second-screen application… to name just a few.

But the one with the highest impact is the work performed while on sabbatical at the IBM T.J. Watson Research Center. As I onboarded, the research group received a request from 20th Century Fox regarding the possibility for some AI to help generate the trailer of a sci-fi horror movie that was about to be released. The project was both challenging and interesting, as I had previously addressed video summarization and multimedia emotion recognition as part of previous research projects. The challenge was the limited amount of time available to deliver the shots which, using state-of-the-art machine learning, were identified as the best suited to be part of the trailer. The team worked hard, and the hard work was rewarded multiple times. First, because the hard deadline was met, having the “AI Trailer” ready on time for the screening of the movie in the US. Second, because Fox sent a whole video crew to shoot the making of the trailer behind the scenes. The video was posted on YouTube and got about 2 million views in about a week. This was the level of impact this scientific research work had. And if that was not enough, the work got another reward at the ACM Multimedia 2017 conference for being the Best Brave New Ideas paper that year.

Over your distinguished career, what are your top lessons you want to share with the audience?

Over the years I have observed that, as a researcher, one needs to be curious while finding a good compromise between staying focused and exploring new or alternative options/approaches. I feel that it is easy for today’s young researchers to be overwhelmed by the pace at which high-quality publications are becoming available. Social media (e.g., Twitter) and online repositories (e.g., arXiv) are no strangers to this situation. There will always be a new paper reporting something potentially interesting with respect to your research, yet it doesn’t mean you should keep reading and reading at the cost of making slow or no progress on your own work! Reading and being aware of the state of the art is one thing; contributing and being innovative is another, and the latter is the key to a successful PhD. Life as a researcher, whether in academia or in industry, is made of choices, directions to follow, etc. While more senior people may sometimes rely on their experience, I believe it is important to listen to your inner self and follow what motivates you whenever possible. I have always believed, and often witnessed, that it is easier to work toward something of interest (to yourself), and in most cases the outcome exceeds expectations.

What is the best joke you know?

I have a very bad memory for jokes. Tell me one and I will laugh because I have a good sense of humor. But ask me to tell the story the next day and I will not be able to. So I looked for jokes on the internet and here is the first one that made me laugh (I did read quite a few before!!!):

Two men meet on opposite sides of a river.  One shouts to the other, “I need you to help me get to the other side!” The other guy replies, “You’re on the other side!”

Not the best, but it will do for now!

If you were conducting this interview, what questions would you ask, and then what would be your answers?

The COVID-19 pandemic is affecting people’s lives on an international scale. Do you think this will have an influence on research and in particular multimedia research?

Indeed, the situation is forcing us to change the way we collaborate and interact. As a researcher, one regularly travels for project meetings, conferences, PhD presentations, etc., in addition to local activities and events such as teaching, labs, group meetings, etc. With current travel restrictions and social distancing recommendations, remote work relying heavily on high-bandwidth internet has developed to an unprecedented level, exposing both its limitations and advantages. Similarly, scientific conferences, where a lot of interaction takes place, have been forced to adapt. At first, organizers postponed the events, hoping the situation would quickly return to normal. However, with the extended duration of the pandemic, the shift from physical to remote or virtual conferencing, using online tools and systems, had to be made. This clearly demonstrated the possibility of organizing such events online but also showed some limitations regarding interaction. This could be a great opportunity for the multimedia community to have an impact at large. Indeed, who would be better suited to contribute to the next generation of tools for effective interactive remote work and conferencing than the multimedia community? I believe we have a role to play and look forward to seeing and using such tools. I didn’t touch on the health aspect of this question, but that is also something multimedia researchers, usually well acquainted with state-of-the-art machine learning, can contribute to. On that note, if Medical Imaging is a topic that attracts you and that you are motivated by, do not hesitate to reach out.

Disclaimer: All views expressed in this interview are my own and do not represent the opinions of any entity with which I have been or am now affiliated.

A recent photo of Benoit.

Bio: Benoit Huet heads the data science team at MEDIAN Technologies. His research interests include computer vision, machine learning, and large-scale multimedia data mining and indexing.

An interview with Associate Professor Duc-Tien Dang-Nguyen

Tien at the beginning of his research career.

Describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

Looking back at the early days of my life, my love for science started quite young. I loved solving puzzles and small recreational mathematical problems. Actually, I still do this. It may also be because my mother “seeded” many stories about great scientists like Thomas Edison or Marie Curie every night. I admired them a lot and often dreamed of being like them. I also love to play video games. I played them a lot, and I think that I am quite good, especially in games like The Legend of Zelda and the Castlevania series. I also love to travel, and perhaps that is why I have had a nomadic journey over the last ten years, starting from Vietnam to Japan, Italy, Ireland, and now Norway. While living in Vietnam, I would often travel to the countryside on my motorbike. Solving puzzles, playing video games, collecting things, and travelling: these tiny things play an essential role in making me who I am today.

Now back to the story. I come from Vietnam, where it is very normal for my generation to grow up through endless competitions. My first challenge was a math competition when I was eight. I then became a math student and took part in many competitions similar to today’s MIT Mystery Hunt. When I was 12, a friend of my father gave me his old PC as a present. It was a 486 (we called it that since it had an Intel 486 processor), and it changed my life. I played with it endlessly. I learned Pascal by myself, and in the last year of my secondary school (K-9), I proudly won first rank in both Math and Informatics in the regional contests. Thanks to that, I entered one of the best high schools in Vietnam. I joined the Informatics class, and as you might already guess, we were dealing with programming challenges every day. We learned mainly algorithms and data structures, discrete mathematics, and computational complexity through solving challenging problems from the International Olympiad in Informatics. It is quite similar to Topcoder now. It was tough and very competitive, but it was exciting to me since it was like solving hard puzzles.

Moving to my bachelor’s, I took an honors program in Computer Science, which was one of the best Computer Science programs in Vietnam. In the third year of my bachelor’s, in an Image Processing course, I did a project about image annotation. It was a pure K-means for image segmentation based on pixel color values, followed by a k-NN on a pre-trained set of images. It sounds pretty basic now, but this was in 2001, and “I did it my way”, so it was a fantastic achievement! It was from this project that I became a multimedia researcher.

After my bachelor’s, I continued researching computer vision and image retrieval in my master’s. In my first year as a Ph.D. student, I was working on a multimedia retrieval project, but just three months before the qualifying exam (you need to present your research proposal to continue your Ph.D.), I changed my research topic to Image Forensics, thanks to the course of the same name. I found everything I love in this new research field. It is like solving a puzzle, collecting evidence, and playing a game simultaneously. So, I became an image forensics researcher.

Some people say, “Choose a job you love, and you will never have to work a day in your life”; perhaps they are missing the last part, “because no one will hire you”. Yes, it’s just a joke, but it can also be quite true in many circumstances. It was hard to find a job that needed image forensics when I finished my Ph.D. However, since I knew image processing, computer vision, and machine learning, it was not that hard for me to find a postdoc in those fields. I was then doing both multimedia forensics and multimedia retrieval. This “evolvement” introduced me to a new field, lifelogging, a research direction that tries to discover insights from personal data archives. At first, it was an “okay” field to me, but later, after digging more into it, I found many interesting challenges that need to be solved. And that was a very long story about how I reached the starting point of my research.

Can you profile your current research, its challenges, opportunities, and implications? Tell us more about your vision and objectives behind your current roles.

I am an associate professor at the University of Bergen, where I mainly focus on image forensics and lifelogging. Multimedia forensics is about discovering the history of modifications to multimedia content such as videos, images, audio, etc. Mainly, I work with images and have dabbled a bit in video forensics. Audio is nice too, but I mostly enjoy working with the visual side of multimedia. People tend to think about multimedia forensics as a tool to check if an image or a video is real or fake. However, we also try to look at the specifics of the media in question. Some potential questions for an image could be, for example: where was it first posted? What type of camera was it taken with? These are questions that help identify the reliability of the image in question and give more information than just fake or real. I also believe that we should take a further step by considering the context of use (how, where, and when) of the multimedia content. The expectation of truthfulness is radically different if the image is hanging in an art gallery than if it is being used as evidence in a court case.

As previously mentioned, I also work with lifelogging. This work is still in its early stages. We have not proposed any novel approaches yet. Instead, we are building a community by organizing research activities such as workshops and benchmarking initiatives. We believe that by holding such events, we are preparing a solid user base for the next phase, when people are more familiar with such technologies: the phase of personal data analytics. We have witnessed great applications of AI during the last decade. Since AI needs data, and people need more personalized solutions, I believe that very soon we will be doing lifelogging in our everyday life. Let’s wait and see if my prediction comes true or not.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

In multimedia forensics, I am quite happy that I was among the first to propose an approach for discriminating between computer graphics and natural human faces. People are now well aware of “Deepfake”, and many great people are working on this problem. However, when I presented my first study in 2011, many people, including computer graphics researchers, were laughing when I told them that they would soon not be able to distinguish computer-generated faces from real ones. In image forensics, we try to reveal all traces of the image acquisition history, and since digital images are based on pixels, these traces are susceptible to changes. For example, many traces of modifications become incredibly hard to find if the image is resized. Most of my approaches are thus physically or geometrically based, which makes them more robust against changes as well as more reliable in terms of decision explanation.

Over your distinguished career, what are your top lessons you want to share with the audience?

I believe that I am still at the start of my career, and perhaps the first and the most important lesson I have is about “causes and effects”, or what Steve Jobs described as “Connecting the dots”. There are dots in our lives whose connections are very hard to understand or predict, but eventually, when looking back, the connections reveal themselves. Just follow whatever you think is good for you and try very hard to make it a good “dot”. Everyone wants to work with something they love, but finding what we love in our current work is even more important.

What is the best joke you know?

Most of the jokes I love are in Vietnamese, and unless you are Vietnamese, you can’t get them. I am trying to think about some “Western” jokes that share some commonalities with Vietnamese humor and culture. That would be a political joke. I believe that you can find a similar version with the KGB or the Stasi. This one was very famous, and surprisingly, it is very well suited to my current research on lifelogging 🙂

“Why do Stasi officers make such good taxi drivers? — You get in the car and they already know your name and where you live.”

A recent photo of Tien.

Bio: Duc-Tien Dang-Nguyen is an associate professor at the University of Bergen. His main research interests are multimedia forensics, lifelogging, and machine learning.

An interview with Associate Professor Hugo L. Hammer

Hugo as a Ph.D. student, at the beginning of his research career.

Describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

From an early age, I had the ability to focus and work individually and loved to develop new systems for all sorts of things, which probably was quite annoying for those around me. It turns out that these abilities, to focus, to be curious, and to develop new systems, are what drive my research today. When I started as a student in mathematics and statistics at the Norwegian University of Science and Technology (NTNU), I didn’t think of research as an alternative and was determined to find a job in the industry. Throughout my studies, I realized how little mathematics and statistics I had actually learned, which is why I decided to become a Ph.D. student. I expected to find a job in the industry after the Ph.D. period but ended up loving research, and that is why I am where I am today.

As a statistician, I have worked a lot with spatial and spatio-temporal data, such as geophysical observations. Such observations have striking similarities to multimedia content, such as images and videos. I have become very interested in machine learning methods used to process and make decisions from multimedia content and the potential for applying such methods to other domains, such as geophysics. I also love working as a statistician within this field. A crucial part of my research is to try to combine methods from machine learning and statistics in new and exciting ways.

Tell us more about your vision and objectives behind your current roles. What do you hope to accomplish, and how will you bring this about?

In my current position as an associate professor, I do both teaching and research. Teaching and research challenge me in different ways. I continuously try to develop and improve my teaching. I especially focus on how to do high quality, yet resource-efficient, teaching. I have, for example, worked a lot on how to activate students and improve learning when being a single teacher for hundreds of students.

Can you profile your current research, its challenges, opportunities, and implications?

My current research can roughly be divided into three directions. The first direction is about methods for real-time information processing and decision making, for example, from sensory information or video streams. The second direction is about developing new machine learning models and methods, taking advantage, as mentioned above, of my background in statistics. The third direction is the more applied use of machine learning methods on real-life multimedia data, in particular, medical data. Directions two and three go hand in hand. Having a background in statistics and working more and more with multimedia data is more of an opportunity than a challenge.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

I am proud of the research we have done on real-time information processing and decision making. The methods we developed are simple but still achieve state-of-the-art performance. In 2020, we plan to develop software packages to make the methods readily available and hopefully useful for many. We saw the potential of using machine learning, and in particular deep learning, on geophysical data and problems quite early, and we are now able to operate at the forefront of this research. I’m also proud of our externally funded research projects and, for sure, our rejected research proposals.

Over your distinguished career, what are your top lessons you want to share with the audience?

Here is a lesson from my personal experience. I think it is easy to depend on, or have too much respect for, other researchers early in one’s career. Research is of course all about collaboration, but still, for me, it was useful early in my career to create a small research project where I did every step of the process myself (shaping ideas, collecting data, running simulations, writing, finding suitable publishing channels, revisions, etc.). It was hard work, but for sure, it made me a better and more independent researcher.

What is the best joke you know?

Daddy, what are clouds made of?

Linux servers, mostly.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

One suggestion: What do you like to do in your spare time?

Research, right? 🙂 Working every day at an office, I try to find time for physical activity in my spare time. I love to run, bike, or go skiing in Nordmarka (a forest near Oslo, Norway) or in the mountains on the weekends.

A recent photo of Hugo.

Bio: Hugo L. Hammer is an associate professor in statistics at Oslo Metropolitan University. His main research interests are computational statistics, probabilistic forecasting, real-time analytics, and machine learning.