Describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?
This is an excellent question. Indeed, life is a journey, and every step is a lesson. I was originally attracted by electronics but as I was studying it, I discovered computers. Remember, for those who were old enough in the 1980’s, this was the start of personal computers. So I decided to learn more about them, and as I was studying computer science I found out about AI, yes AI 1990’s style. I was interested, but this coincided with one of the AI winters and I was advised, or rather decided, to go in a different direction. The area that attracted me most was computer vision. The reason was that it seemed like a very hard problem which would clearly have a very broad impact. It turns out that vision alone is indeed very hard and using additional information or signals could help obtain better results, hence reducing time to impact for such a scientific approach/method. This was what attracted me to multimedia and kept me busy for many years at EURECOM. What did I learn along the way? Follow your instinct and your heart as you go along as it is rare to know where to go from the very start. Your destination might not even exist at the time you started your journey!
Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?
Since July 2019 I have headed the Data Science team of MEDIAN Technologies. The objective is to bring recent advances from the field of computer vision, neural networks, and also multimedia to part from the way medical imaging is currently performed while providing solutions to detect cancer at the earliest possible stage of the disease and help identify the best treatment for each patient. Concretely, we are currently working on the identification of biomarkers for Hepatocellular Carcinoma (HCC) which is the most common type of primary liver cancer and which is known to be a difficult organ for medical imaging solutions.
Can you profile your current research, its challenges, opportunities, and implications?
To answer this very broad question concisely, I will limit myself to one challenge, one opportunity, and one implication. For the challenge, I will mention one key challenge which I have encountered many times in many projects: Interdisciplinary communication. In most projects, whether small or large and involving multiple domains of expertise, the communication between people of different backgrounds is not as straightforward as one would assume. It is important to address this challenge proactively. For the opportunity, medical imaging is nowadays still mostly employing “traditional” machine learning on top of man-made features (Grey-Level Co-occurrence Matrices, Gabor, etc). The “end to end” paradigm shift brought in by the recent developments in the field of deep neural nets is still to take place in the medical domain at large. This is what we aim to achieve for medical imaging for Oncology. The implication, a significant improvement in the detection of cancer, such as the early detection of tumors. Such early detection allows for therapy to take place at the earliest possible stage hence drastically increasing the patient chance of survival. Saving lives in short.
How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?
There is a number of research work originating from my team that would be worth mentioning here; EventEnricher for collecting media illustrating event, the Hyper Video Browser, an interactive tool for searching and hyperlinking in broadcast media, or the work resulting from the collaborative project NexGen-TV: Providing real-time insight during political debates in a second screen application… to name just a few.
But the one with the highest impact is the work performed while on sabbatical at IBM T.J. Watson research center. As I onboarded, the research group received a request from the 20th Century Fox, regarding the possibility for some AI to help generate the trailer of a sci-fi horror movie that was about to be released. The project was both challenging and interesting as I had previously addressed video summarization and multimedia emotion recognition as part of previous research projects. The challenge was the limited amount of time available to deliver the shots which using state of the art machine learning were identified as the best suited to be part of the trailer. The team worked hard and hard work was rewarded multiple times. First because the hard deadline was met, having the “AI Trailer” on time for the screening of the movie in the US. Second because Fox sent a whole video crew to shoot the making of the trailer behind the scene. The video was posted on YouTube and got about 2 million views in about a week. This was the level of impact this scientific research work had. And if that was not enough, the work got another reward at the ACM Multimedia 2017 conference for being the Best Brave New Ideas paper that year.
Over your distinguished career, what are your top lessons you want to share with the audience?
Over the years I have observed that as a researcher, one needs to be curious while being able to find a good compromise between being focused and exploring new or alternative options/approaches. I feel that it is easy for today’s young researchers to be overwhelmed with the pace at which high-quality publications are becoming available. Social media (i.e. Twitter) and online repositories (i.e. Arxiv) are no stranger to this situation. There will always be a new paper reporting something potentially interesting with respect to your research, yet it doesn’t mean you should keep reading and reading at the cost of making slow or no progress on your own work! Reading and being aware of the state of the art is one thing, contributing and being innovative is another and the latter is the key to a successful PhD. Life as a researcher whether in academia or in the industry is made of choices, directions to follow, etc. While more senior people may sometimes rely on their experience, I believe it is important to listen to your inner self and follow what motivates you whenever possible. I have always believed and often witnessed that it is easier to work toward something of interest (to yourself) and in most cases the outcome exceeds expectations.
What is the best joke you know?
I have a very bad memory for jokes. Tell me one and I will laugh because I have a good sense of humor. But ask me to tell the story the next day and I will not be able to. So I looked jokes on the internet and here is the first one that made me laugh (I did read quite a few before!!!):
Two men meet on opposite sides of a river. One shouts to the other, “I need you to help me get to the other side!” The other guy replies, “You’re on the other side!”
Not the best, but it will do for now!
If you were conducting this interview, what questions would you ask, and then what would be your answers?
The COVID-19 pandemic is affecting people’s lives on an international scale, do you think this will have an influence on research and in particular multimedia research?
Indeed, the situation is forcing us to change the way we collaborate and interact. As a researcher, one regularly travels for project meetings, conferences, PhD presentations, etc. in addition to local activities and events such as teaching, labs, group meetings, etc. With current travel restrictions and social distancing recommendations, remote work relying heavily on high bandwidth internet has developed to an unprecedented level, exposing both its limitations and advantages. Similarly, scientific conferences where a lot of interaction takes place have been forced to adapt. At first, organizers postponed the events, hoping the situation will quickly return to normal. However, with the extended duration of the pandemic, the shift from physical to remote or virtual conferencing, using online tools and systems, had to be performed. This clearly demonstrated not only the possibility of organizing such events online but also showed some limitations regarding interaction. On this topic, this could be a great opportunity for the multimedia community to have an impact at large. Indeed, who would be better suited to contribute to the next generation of tools for effective interactive remote work and conferencing than the multimedia community. I believe we have a role to play and look forward to seeing and using such tools. I didn’t touch on the health aspect of this question but that is also something multimedia researchers, usually well acquainted with the state of the art machine learning, can contribute to. On that note, if Medical Imaging is a topic that attracts you and that you are motivated by, do not hesitate to reach out.
Disclaimer: All views expressed in this interview are my own and do not represent the opinions of any entity with which I have been or am now affiliated.
Bio: Benoit Huet heads the data science team at MEDIAN Technologies. His research interests include computer vision, machine learning, and large-scale multimedia data mining and indexing.