SIGMM Award for Outstanding Ph.D. Thesis in Multimedia Computing, Communications and Applications 2016

The ACM Special Interest Group on Multimedia (SIGMM) is pleased to present the 2016 SIGMM Outstanding Ph.D. Thesis Award to Dr. Christoph Kofler. The award committee considers Dr. Kofler’s dissertation, entitled “User Intent in Online Video Search,” worthy of the recognition: the thesis is the first to consider a user’s intent in multimedia search, yielding significantly improved results in satisfying the user’s information need. The work is highly original and is expected to have significant impact, especially in boosting search performance for multimedia data.

Dr. Kofler’s thesis systematically explores the video search intent behind a user’s information need in three steps: (1) analyzing a real-world transaction log produced by a large video search engine to understand why searches fail, (2) understanding the possible intents of users behind video search and uploads, and (3) designing an intent-aware video search result optimization approach that re-ranks initial video search results so as to yield the highest potential to satisfy the user’s search intent.
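As a rough illustration of step (3), an intent-aware re-ranker can blend each result’s original relevance score with an estimated intent-match score. The scoring function, weights, and data below are invented for illustration and are not Dr. Kofler’s actual model:

```python
# Illustrative sketch of intent-aware re-ranking. The linear blend and
# the alpha weight are hypothetical, not the model from the thesis.

def rerank(results, intent_scores, alpha=0.5):
    """Blend the original relevance score with an intent-match score.

    results: list of (video_id, relevance) pairs from the initial ranking.
    intent_scores: dict mapping video_id -> estimated probability that the
        video satisfies the inferred search intent (assumed given).
    alpha: weight of the intent component (illustrative choice).
    """
    def combined(item):
        video_id, relevance = item
        return alpha * intent_scores.get(video_id, 0.0) + (1 - alpha) * relevance
    return sorted(results, key=combined, reverse=True)

initial = [("v1", 0.9), ("v2", 0.8), ("v3", 0.7)]
intent = {"v1": 0.1, "v2": 0.9, "v3": 0.6}
print(rerank(initial, intent))  # "v2" now outranks the originally top "v1"
```

The point of the sketch is only that a result with mediocre textual relevance but a strong intent match can overtake the initial top hit.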

The effectiveness of the framework developed in the thesis has been justified by a thorough range of experiments. The topic is highly timely, and the framework makes groundbreaking contributions to our understanding and knowledge of users’ information seeking, user intent, user satisfaction, and multimedia search engine usability. The publications related to the thesis clearly demonstrate the impact of this work across several research disciplines, including multimedia, web, and information retrieval. Overall, the committee recognizes that the thesis has significant impact and makes considerable contributions to the multimedia community.

Bio of Awardee:

Dr. Christoph Kofler is a software engineer and data scientist at Bloomberg L.P., NY, USA. He holds a Ph.D. degree from Delft University of Technology, The Netherlands, and an M.Sc. and B.Sc. degree from Klagenfurt University, Austria – all in Computer Science. His research interests include the broad fields of multimedia and text-based information retrieval with focus on search intent inference and its applications for search results optimization throughout the entire search engine pipeline (indexing, ranking, query formulation). In addition to “what” a user is looking for using search, Dr. Kofler is particularly interested in the “why” component behind the search and in the related opportunities for improving the efficiency and effectiveness of information retrieval systems. Dr. Kofler has co-authored more than 20 scientific publications with predominant focus on venues such as ACM Multimedia, IEEE Transactions on Multimedia, and ACM Computing Surveys. He has been a task co-organizer of the MediaEval Benchmark initiative. He received the Grand Challenge Best Presentation Award at ACM Multimedia and the Best Paper nomination at the European Conference on Information Retrieval. Dr. Kofler is a recipient of the Google Doctoral Fellowship in Information Retrieval (Video Search). He has held positions at Microsoft Research, Beijing, China; Columbia University, NY, USA; and Google, NY, USA.

 

The award committee is pleased to present an honorable mention to Dr. Varun Singh for the thesis entitled “Protocols and Algorithms for Adaptive Multimedia Systems.” The thesis develops and presents congestion control algorithms and signaling protocols that are used in interactive multimedia communications. The committee is impressed by the thorough theoretical and experimental depth of the thesis. Also remarkable are Dr. Singh’s efforts to shepherd his work to real-world adoption, which have led him to author four RFCs and several standards-track documents in the IETF. This has resulted in the incorporation of his work in the production versions of the Chrome and Firefox web browsers, so his work has already achieved impact in the multimedia community.

Bio of Awardee:

Dr. Varun Singh received his Master’s degree in Electrical Engineering from Helsinki University of Technology, Finland, in 2009, and his Ph.D. degree from Aalto University, Finland, in 2015. His research has led him to make important contributions to several standardization organizations: 3GPP (2008–2010), IETF (since 2010), and W3C (since 2014). He is a co-author of the WebRTC Statistics API. Beyond this, his research led him to found, and become CEO of, callstats.io, a startup that analyzes and optimizes the quality of multimedia in real-time communication (currently, WebRTC).

 

ACM TOMM Special Issues and Special Sections

The ACM TOMM journal has launched a new two-year program of SPECIAL ISSUES and SPECIAL SECTIONS on strategic and emerging topics in multimedia research. Each Special Issue will also include an extended survey paper on the subject of the issue, prepared by the Guest Editors. This will help to highlight trends and research paths and will position the contributed papers appropriately.

In May, we received 11 proposals and selected 4 for Special Issues and 2 for Special Sections, based on the timeliness and relevance of the topics and the qualifications of the proponents:

SPECIAL ISSUES (8 papers each)

  • “Deep Learning for Mobile Multimedia”
    for publication in April ’17. Submission deadline: Oct. 15, 2016
  • “Delay-Sensitive Video Computing in the Cloud”
    for publication in July ’17. Submission deadline: Nov. 30, 2016
  • “Representation, Analysis and Recognition of 3D Humans”
    for publication in Nov. ’17. Submission deadline: Jan. 15, 2017
  • “QoE Management for Multimedia Services”
    for publication in April ’18. Submission deadline: May 15, 2017

SPECIAL SECTIONS (4 papers each)

  • “Multimedia Computing and Applications of Socio-Affective Behaviors in the Wild”
    for publication in May ’17. Submission deadline: Oct. 31, 2016
  • “Multimedia Understanding via Multimodal Analytics”
    for publication in May ’17. Submission deadline: Oct. 31, 2016

You can visit the news section of the ACM TOMM home page at http://tomm.acm.org for more detailed information. We look forward to your valuable contributions to this initiative.

 

ACM SIGMM Award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications

The 2016 winner of the prestigious ACM Special Interest Group on Multimedia (SIGMM) award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications is Prof. Dr. Alberto del Bimbo. The award is given in recognition of his outstanding, pioneering and continued research contributions in the areas of multimedia processing, multimedia content analysis, and multimedia applications, his leadership in multimedia education, and his outstanding and continued service to the community.

Prof. del Bimbo was among the very few who pioneered research in image and video content-based retrieval in the late 1980s. Since that time, for over 25 years, he has been among the most visionary and influential researchers in this field in Europe and worldwide. His research has influenced several generations of researchers who are now active in some of the most important research centers worldwide. Over the years, he has made significant innovative research contributions.

In the early days of the discipline he explored all the modalities for retrieval of images and video by visual similarity. In his early paper “Visual Image Retrieval by Elastic Matching of User Sketches,” published in IEEE Trans. on Pattern Analysis and Machine Intelligence in 1997, he presented one of the first and top-performing methods for image retrieval by shape similarity from user sketches. He also published, in IEEE Trans. on Pattern Analysis and Machine Intelligence and IEEE Trans. on Multimedia, his original research on representations of spatial relationships between image regions based on spatial logic. This ground-breaking research was accompanied by the definition of efficient index structures to permit retrieval from large datasets. He was one of the first to address this large-dataset aspect, which has now become very important for the research community.

Since the early 2000s, with the advancement of 3D imaging technologies and the availability of a new generation of acquisition devices capable of capturing the geometry of 3D objects in three-dimensional physical space, Prof. del Bimbo and his team initiated research in 3D content-based retrieval, which has since become increasingly popular in mainstream research. Again, he was among the very first researchers to initiate this research. In particular, he focused on 3D face recognition, extending the weighted walkthrough representation of spatial relationships between image regions to model the 3D relationships between facial stripes. His solution, 3D Face Recognition Using Iso-geodesic Stripes, scored the best performance at the SHREC Shape Retrieval Contest in 2008 and was published in IEEE Trans. on Pattern Analysis and Machine Intelligence in 2010. At CVPR’15 he presented a novel idea for representing 3D textured mesh manifolds using Local Binary Patterns, which is highly effective for 3D face retrieval. This was the first attempt to combine 3D geometry and photometric texture in a single unified representation. In 2016 he co-authored a forward-looking survey on content-based image retrieval in the context of social image platforms, which appeared in ACM Computing Surveys. It includes an extensive treatise of image tag assignment, refinement and tag-based retrieval, and explores the differences between traditional image retrieval and retrieval with socially generated images.

One very important aspect of his contribution to the community is Professor del Bimbo’s educational impact during his career. He authored the monograph Visual Information Retrieval, published by Morgan Kaufmann in 1999, which became one of the most cited and influential books from the early years of image and video content-based retrieval. Many young researchers have used this book as the main reference in their studies, and their careers have been shaped by the ideas discussed in it. Being the first and sole book on the subject in the early days of the discipline, it played a key role in developing content-based retrieval from a research niche into a largely populated field of research and in making it central to Multimedia research.

Professor del Bimbo has an extraordinary and long-lasting track record of service to the scientific community over the last 20 years. As General Chair he organized two of the most successful conferences in Multimedia, namely IEEE ICMCS’99, the Int’l Conf. on Multimedia Computing and Systems (now renamed IEEE ICME), and ACM MULTIMEDIA’10. The quality and success of these conferences were highly influential in attracting new young researchers to the field and forming the present research community. Since 2016, he has been the Editor-in-Chief of ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM).

Announcement of ACM SIGMM Rising Star Award 2016

The ACM Special Interest Group on Multimedia (SIGMM) is pleased to present this year’s Rising Star Award in multimedia computing, communications and applications to Dr. Bart Thomee for his significant contributions in the areas of geo-multimedia computing, media evaluation, and open research datasets. The ACM SIGMM Rising Star Award recognizes a young researcher who has made outstanding research contributions to the field of multimedia computing, communication and applications during the early part of his or her career.

Dr. Bart Thomee received his Ph.D. from Leiden University in 2010. In his thesis, he focused on multimedia search and exploration, specifically targeting artificial imagination and duplicate detection. On the topic of artificial imagination, he aimed to more rapidly understand the user’s search intent by generating imagery that resembles the ideal image the user is looking for. Using the synthesized images as queries instead of existing images from the database boosted the relevance of the image results by up to 23%. On the topic of duplicate detection, he designed descriptors to compactly represent web-scale image collections and to accurately detect transformed versions of the same image. This work led to an Outstanding Paper Citation at the ACM Conference on Multimedia Information Retrieval 2008.

In 2011, he joined Yahoo Labs, where his interests grew into geographic computing in multimedia. He began characterizing spatiotemporal regions from labeled (e.g., tagged) georeferenced media, for which he devised a technique based on scale-space theory that could process billions of georeferenced labels in a matter of hours. This work was published at WWW 2013 and became a reference example at Yahoo for how to disambiguate multi-language and multi-meaning labels from media with noisy annotations.

He also started to use an overlooked piece of information found in most camera-phone images: compass information. He developed a technique to accurately pinpoint the locations and surface areas of landmarks, based solely on the positions and orientations of photos taken of them, possibly from hundreds of yards to miles away.
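The geometric core of that idea is triangulation: each photo defines a bearing ray from the camera position, and the landmark lies near the intersection of rays from different photos. A toy planar sketch of that geometry (not Dr. Thomee’s actual method, which must cope with noisy bearings at scale):

```python
# Toy illustration: intersect two compass bearing rays on a flat x/y
# plane (x = east, y = north) to locate a photographed landmark.
import math

def locate_landmark(p1, bearing1_deg, p2, bearing2_deg):
    """Intersect bearing rays (compass degrees, clockwise from north)
    shot from camera positions p1 and p2; returns None if parallel."""
    t1, t2 = math.radians(bearing1_deg), math.radians(bearing2_deg)
    d1 = (math.sin(t1), math.cos(t1))  # unit direction of ray 1
    d2 = (math.sin(t2), math.cos(t2))  # unit direction of ray 2
    # Solve p1 + a*d1 == p2 + b*d2 for a via the 2x2 determinant rule.
    denom = d1[0] * d2[1] - d1[1] * d2[0]
    if abs(denom) < 1e-12:
        return None  # parallel bearings never intersect
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    a = (dx * d2[1] - dy * d2[0]) / denom
    return (p1[0] + a * d1[0], p1[1] + a * d1[1])

# One photographer due south and one due west of a landmark at (1, 1):
print(locate_landmark((1, 0), 0, (0, 1), 90))  # → approximately (1.0, 1.0)
```

In practice many such rays are combined, and measurement noise turns the single intersection point into an estimated region, which is what yields the landmark’s surface area.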

Dr. Thomee’s recent work on the YFCC100M dataset has had an important impact on the multimedia and SIGMM research community. The dataset is large and rich enough in size and structure to fuel and change the landscape of research in multimedia. What started as an initiative to release a geo-Flickr dataset quickly grew as Dr. Thomee saw the broader impact and worked rapidly to scale up its size. He had to push the limits of openness without violating licensing terms, copyright, or privacy. He worked closely with many lawyers to overturn the default, restrictive terms of use, making the dataset available to non-academics all over the world as well. He coordinated and led the efforts to share the data horizontally with ICSI, LLNL, and Amazon Open Data. The dataset was highlighted in the February 2016 issue of the Communications of the ACM (CACM), has been requested over 1200 times in just a few months, and has been cited many times since launch. Dr. Thomee has continued by releasing expansion packs to the YFCC100M. The dataset is expected to impact multimedia research significantly in the years to come.

Dr. Thomee has also been an exemplary community member of the Multimedia community. For example, he organized the ImageCLEF photo annotation task (2012-2013) and MediaEval placing task (2013-2016) as well as designed the ACM Grand Challenge on Event Summarization (2015) and on Tag & Caption Prediction (2016).

In summary, Dr. Bart Thomee receives the 2016 ACM SIGMM Rising Star Award for significant contributions in the areas of geo-multimedia computing, media evaluation, and open datasets for research.

MPEG Column: 115th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The 115th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects:

 

  • MPEG issues Genomic Information Compression and Storage joint Call for Proposals in conjunction with ISO/TC 276/WG 5
  • Plug-in free decoding of 3D objects within Web browsers
  • MPEG-H 3D Audio AMD 3 reaches FDAM status
  • Common Media Application Format for Dynamic Adaptive Streaming Applications
  • 4th edition of AVC/HEVC file format

In this blog post, however, I will cover topics specifically relevant for adaptive media streaming, namely:

  • Recent developments in MPEG-DASH
  • Common media application format (CMAF)
  • MPEG-VR (virtual reality)
  • The MPEG roadmap/vision for the future.

MPEG-DASH Server and Network assisted DASH (SAND): ISO/IEC 23009-5

Part 5 of MPEG-DASH, referred to as SAND – server and network-assisted DASH – has reached FDIS status. This work item started some time ago, at a public MPEG workshop during the 105th MPEG meeting in Vienna. The goal of this part of MPEG-DASH is to enhance the delivery of DASH content by introducing messages between DASH clients and network elements, or between various network elements, for the purpose of improving the efficiency of streaming sessions by providing information about real-time operational characteristics of networks, servers, proxies, caches, and CDNs, as well as DASH clients’ performance and status. In particular, it defines the following:

  1. The SAND architecture which identifies the SAND network elements and the nature of SAND messages exchanged among them.
  2. The semantics of SAND messages exchanged between the network elements present in the SAND architecture.
  3. An encoding scheme for the SAND messages.
  4. The minimum requirements to implement a SAND message delivery protocol.

The way that this information is to be utilized is deliberately not defined within the standard and left open for (industry) competition (or other standards developing organizations). In any case, there’s plenty of room for research activities around the topic of SAND, specifically:

  • A main issue is the evaluation of MPEG-DASH SAND in terms of qualitative and quantitative improvements with respect to QoS/QoE. Some papers are available already and have been published within ACM MMSys 2016.
  • Another topic of interest includes an analysis regarding scalability and possible overhead; in other words, I’m wondering whether it’s worth using SAND to improve DASH.
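To make the exchange concrete, a SAND-style metrics message from a client to a DASH-aware network element (DANE) could be serialized roughly as follows. The element and attribute names are invented for illustration only and do not follow the normative ISO/IEC 23009-5 schema:

```python
# Illustrative construction of a SAND-style metrics message from a DASH
# client to a DANE. Element/attribute names ("SANDMessage", "Metrics",
# "BufferLevel", "Throughput") are hypothetical; consult ISO/IEC 23009-5
# for the normative XML encoding of SAND messages.
import xml.etree.ElementTree as ET

def build_metrics_message(sender_id, buffer_level_ms, throughput_kbps):
    root = ET.Element("SANDMessage", senderId=sender_id)
    metrics = ET.SubElement(root, "Metrics")
    ET.SubElement(metrics, "BufferLevel",
                  value=str(buffer_level_ms), unit="ms")
    ET.SubElement(metrics, "Throughput",
                  value=str(throughput_kbps), unit="kbps")
    return ET.tostring(root, encoding="unicode")

print(build_metrics_message("client-42", 4500, 3200))
```

A DANE receiving such status information could, for instance, prioritize cache fills or advise the client on bitrate selection; as noted above, the standard deliberately leaves that behavior open.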

MPEG-DASH with Server Push and WebSockets: ISO/IEC 23009-6

Part 6 of MPEG-DASH has reached the DIS stage and deals with server push and WebSockets; i.e., it specifies the carriage of MPEG-DASH media presentations over full-duplex HTTP-compatible protocols, particularly HTTP/2 and WebSocket. The specification comes with a set of generic definitions for which bindings are defined, allowing its usage in various formats. Currently, the specification supports HTTP/2 and WebSocket.

For the former it is required to define the push policy as an HTTP header extension whereas the latter requires the definition of a DASH subprotocol. Luckily, these are the preferred extension mechanisms for both HTTP/2 and WebSocket and, thus, interoperability is provided. The question of whether or not the industry will adopt these extensions cannot be answered right now but I would recommend keeping an eye on this and there are certainly multiple research topics worth exploring in the future.
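As an illustration of the HTTP header extension mechanism, a client might advertise its desired push policy when requesting an MPD along these lines. The header name and URN below should be treated as assumptions based on draft material around ISO/IEC 23009-6; check the final specification for the exact syntax:

```python
# Sketch of a DASH client advertising a push policy via an HTTP header
# when requesting an MPD. "Accept-Push-Policy" and the URN value are
# assumptions drawn from draft material, not the final normative syntax.

def mpd_request_headers(push_type, segment_count=None):
    value = f'"{push_type}"'
    if segment_count is not None:
        value += f"; {segment_count}"  # e.g. push the next N segments
    return {
        "Accept": "application/dash+xml",
        "Accept-Push-Policy": value,
    }

headers = mpd_request_headers("urn:mpeg:dash:fdh:2016:push-next", 3)
print(headers["Accept-Push-Policy"])
```

The server would then echo the applied policy back in a corresponding response header, giving both sides an interoperable way to negotiate push behavior without changing HTTP/2 itself.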

An interesting aspect for the research community would be to quantify the utility of using push methods within dynamic adaptive environments in terms of QoE and start-up delay. Some papers provide preliminary answers but a comprehensive evaluation is missing.

To conclude the recent MPEG-DASH developments, the DASH-IF recently established the Excellence in DASH Award at ACM MMSys’16 and the winners are presented here (including some of the recent developments described in this blog post).

Common Media Application Format (CMAF): ISO/IEC 23000-19

The goal of CMAF is to enable application consortia to reference a single MPEG specification (i.e., a “common media format”) that would allow a single media encoding to be used across many applications and devices. Therefore, CMAF defines the encoding and packaging of segmented media objects for delivery and decoding on end-user devices in adaptive multimedia presentations. This sounds very familiar and reminds us a bit of what the DASH-IF is doing with its interoperability points. One of the goals of CMAF is to integrate HLS into MPEG-DASH, which is backed up by this WWDC video in which Apple announces support for fragmented MP4 in HLS. The streaming of this announcement is only available in Safari and through the WWDC app, but Bitmovin has shown that it also works on iOS 10 and above and, for PC users, in all recent browser versions including Edge, Firefox, Chrome, and (of course) Safari.

MPEG Virtual Reality

Virtual reality is becoming a hot topic across industry (and also academia), which also reaches standards developing organizations like MPEG. Therefore, MPEG established an ad-hoc group (with an email reflector) to develop a roadmap for MPEG-VR. Others have also started working on this, like DVB, DASH-IF, and QUALINET (and maybe many others: W3C, 3GPP). In any case, it shows that there’s massive interest in this topic, and Bitmovin has already shown what can be done in this area within today’s Web environments. Obviously, adaptive streaming is an important aspect of VR applications, including many research questions to be addressed in the (near) future. A first step towards a concrete solution is the Omnidirectional Media Application Format (OMAF), which is currently at the working draft stage (details to be provided in a future blog post).

The research aspects cover a wide range of activities including – but not limited to – content capturing, content representation, streaming/network optimization, consumption, and QoE.

MPEG roadmap/vision

At its 115th meeting, MPEG published a document that lays out its medium-term strategic standardization roadmap. The goal of this document is to collect feedback from anyone in professional and B2B industries dealing with media, specifically but not limited to broadcasting, content and service provision, media equipment manufacturing, and the telecommunication industry. The roadmap is depicted below and further described in the document available here. Please note that “360 AV” in the figure below also refers to VR, but unfortunately this is not (yet) reflected in the figure. In any case, it points out the aspects to be addressed by MPEG in the future that are relevant for both industry and academia.

MPEG-Roadmap

The next MPEG meeting will be held in Chengdu, October 17-21, 2016.

An interview with Judith Redi

Describe your journey into computing from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

Dr. Judith Redi


My path to multimedia was, let’s say, non-linear. I grew up in the Italian educational system, which, up until university, is somewhat biased towards social sciences and humanities. My family was not one of engineers/scientists either, and never really encouraged me to look at the technical side of things. Basically, I was on a science-free educational diet until university. On the other hand, my hometown used to host the headquarters of Olivetti (you may remember the fancy typewriters and early personal computers). This meant that at a very young age I had a PC at home and at school, and could use it (as a “user” on the other side of the systems we develop; I had no clue about programming).

When the time came to choose a major at university, I decided to turn the tables, a bit as a provocative action against my previous education/mind-set, and a bit because I was fascinated by the prospect of being able to design and build future technologies. So, I picked computer engineering, perhaps inspired by my hometown’s technological legacy. I immediately became fascinated by artificial intelligence and its potential to make machines more human-like (I still tell all my bachelor students that they should have a picture of Turing on their desk or above their bed). I specialized in machine learning and applied it to cryptanalysis in my master thesis. I won a scholarship to continue that research line in a PhD project at the University of Genoa. And then Philips came along, and multimedia with it.

At the time (2007), Philips was still manufacturing displays, and to stay ahead of the competition, they had to make sure their products would deliver the highest possible visual quality to users. They had algorithms to enhance image quality, but needed a system able to understand how much enhancement was needed, and of which type (sharpening? de-noising?), based on an analysis of the incoming video signal. They wanted to try a machine-learning approach to this issue, and turned to my group for collaboration. I picked up the project immediately: the goal was to model human vision (or at least the processes underlying visual quality perception), which implied not only developing new intelligent systems at the intersection of Signal Processing and Machine Learning, but also learning more about the users of these systems, their perception and cognition. It was the fact that it would allow me to adopt a user-centred approach, closing the loop back to my social science-oriented education, that made multimedia so attractive to me. So, I left cyber-security, embraced multimedia, and have never left since.

One Philips internship, a best PhD thesis award and a Postdoc later, I am still fascinated by this duality. Much has changed in multimedia delivery, with the shift from linear TV to on-demand content consumption, video streaming accounting for 70% of the internet traffic nowadays, and the advent of Ultra High Definition solutions. User expectations in terms of Quality of Experience (QoE) increase by the day, and they are not only affected by the amount of disruptions (due to encoding, untrustworthy transmissions, rendering inaccuracies) in the delivered video, but also relate to content semantics and popularity, user affective state, environment and social context. The role of these factors on QoE is yet to be understood, let alone modelled. This is what I am working on at TU Delft, and is a long term plan, so I guess I won’t be leaving multimedia any time soon.

I’d say it’s too early for me to draw “foundational lessons” worth sharing from my journey. I guess there are a few things, though, that I figured out along the years, and that may be worthwhile mentioning:

  1. Seemingly reckless choices may be the best decisions you have ever made. Change is scary, but can pay off big time.

  2. Luck exists, but hard work is a much safer bet.

  3. Keep having fun doing your research. If you’re not having fun anymore, see point (1).

Tell us more about the vision and objectives behind your current roles. What do you hope to accomplish, and how will you bring this about?

As a researcher, I have been devoting most of my efforts to understanding multimedia experiences and steering their optimization (or improvement) towards higher user satisfaction (with the delivery system). In the longer term, I want to broaden this scope to make an even bigger impact on people’s lives: I want to go beyond quality of experience and multimedia enjoyment, and target the optimization (or at least improvement) of users’ well-being.

For the past four years, I have been working with Philips Research on an Ambient Assisted Living system able to (1) sense the mood of a user in a room and (2) adapt the lighting in the room to alleviate negative moods (e.g., sadness or anxiety) when they are sensed. We were able to show that the system can successfully counter negative moods in elderly users (see our recent PLoS One publication if you are interested), without the need for human intervention. The thing is, negative affective states are experienced by the elderly (but by younger people too, according to recent findings) quite often, and most times a fellow human (relative, friend, caretaker) is not available to comfort the person. My vision is to build systems that, based on unobtrusive sensing of users’ affective states, can act upon the detection of negative states and relieve the user just as a human would do.

I want to design “empathic technology”, able to provide empathic care, whenever human care is not within reach. Challenges are multiple here. First, (long-term) affective states (such as mood, which is more constant and subtle than emotion) are to be sensed. (Wearable) sensors, cameras, or also interaction with mobile devices and social media can provide relevant information here. Empathic care can then be conveyed through ambient intelligence solutions, but also by creative industries products, ranging from gaming to intelligent clothing, to, of course, Multimedia technology (think about empathic recommender systems, or videotelephony systems that are optimized to maximize the affective charge of the communication). This type of work is highly multidisciplinary (involving multimedia systems, affective computing, embedded systems and sensors, HCI and certainly psychology), and the low-hanging fruits are not many. But I’d like this to be my contribution to make the world a better place, and I am ready to take up the challenge.

Can you profile your current research, its challenges, opportunities, and implications?

Internet-based video consumption has been a reality for a while, yet it is constantly growing. Cisco’s forecasts see video delivery accounting for 79% of overall internet consumer traffic by 2018 (equivalent to one million minutes of video crossing IP networks every second). As media consumption grows, so do user expectations in terms of Quality of Experience (see the recent Conviva reports!). And future multimedia will have to be optimized for multiple, more immersive (plenoptic, HDRi, ultra-high definition) devices, both fixed and mobile. Moore’s law and broadband speed alone won’t do the job. Resources and delivery mechanisms have to be optimized on a more application- and user-specific basis. To do so, it will be essential to be able to measure (unobtrusively) the extent to which the user deems the video experience to be of high quality.

In this context, my work aims to (1) understand the perceptual, cognitive and affective processes underlying user appreciation of multimedia experiences, and (2) model these processes in order to automatically assess the delivered QoE and, when applicable, enhance it. It is important to bear in mind that multimedia quality of experience cannot be considered to depend solely on the presence (or absence) of visual/auditory impairments introduced by technology limitations (e.g., packet loss errors or blocking artifacts from compression). Although that has been the most common approach to QoE assessment and optimization, it is not sufficient anymore. The appearance of social media and internet-based delivery has challenged the way media are consumed: we no longer deal with passive observers, but with users who select specific videos, to be delivered on specific devices, in any type of context. Elements such as semantics, user personality, preferences and intent, and the socio-cultural context of consumption come into play that have never been investigated (let alone modelled) for delivery optimization. My research focuses on integrating these elements into QoE estimation, to enable effective, personalized optimization.
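The integration idea can be caricatured as a weighted combination of a technical-quality score with user and context factors. The features, weights, and linear form below are purely illustrative and not an actual model from this research:

```python
# Toy illustration of extending a technical-quality score with user and
# context factors for personalized QoE estimation. The features, weights
# and the linear form are invented for illustration only.

def estimate_qoe(technical_quality, content_interest, context_suitability,
                 w_tech=0.6, w_interest=0.25, w_context=0.15):
    """All inputs in [0, 1]; returns a QoE estimate in [0, 1]."""
    return (w_tech * technical_quality
            + w_interest * content_interest
            + w_context * context_suitability)

# Identical encoding quality, different user/context -> different QoE:
print(estimate_qoe(0.8, 0.9, 0.9))  # engaged user, suitable context
print(estimate_qoe(0.8, 0.2, 0.4))  # uninterested user, poor context
```

Even this caricature shows the point of the research direction: two deliveries with the same technical quality can yield very different experiences once user and context factors enter the model.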

The challenges are countless: user and context characteristics have to be quantified and modelled, and then integrated with video content analysis to deliver a final quality assessment representing the experience as it would be perceived by that user, in that context, given that specific video. Before that, it must be determined which user and context factors impact QoE at all (to date, there is not even agreement on a taxonomy of these factors). Adaptive streaming protocols make it possible to implement user- and context-aware delivery strategies, the willingness of users to share personal data publicly can lead to more accurate user models, and crowdsourcing and crowdsensing in particular can support the systematic study of the influence that context and user factors have on overall QoE.

How would you describe the role of women especially in the field of multimedia?

Just like for their male colleagues (would you ask them to describe the role of men in multimedia?), the role of women in multimedia is:

  1. to push the boundaries of science, knowledge, and practice in the field, doing amazing research that will make the world a better place;
  2. to train new generations of brilliant engineers and scientists who will keep doing amazing research to make the world an even better place; and
  3. to serve the community as professionals and leaders, steering the future amazing research that will go on making the world better and better.

I’d say the first two points are covered. The third, instead, could be implemented a bit better in practice, as women are generally underrepresented at the leadership level. The reasons for this are countless. They range from the lack of incoming talent (traditionally, girls are not attracted to STEM subjects, perhaps for socio-cultural reasons), to the so-called leaky pipeline, which sees talented women leaving demanding yet rewarding careers too early, to an underlying presence of impostor syndrome, which sometimes prevents women from putting their names forward for given roles. The solution is not necessarily in quotas (although I understand the reasoning behind the need for them, I think they are actually making women’s lives more difficult – there is an underlying feeling that “women have it all easy these days” that makes work relationships more suspicious and ends up forcing women to work three times as hard to show that they actually deserve what they accomplished), but rather in coaching and dedicated sponsorship of talent from the early stages.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

The methods that I developed for subjective image quality assessment have been adopted within Philips Research, and their evolution to video quality assessment is now under evaluation by the Video Quality Experts Group to be recommended as an alternative methodology to the standard ACR and paired-comparison methods. The research that I carried out on the suitability of crowdsourcing for subjective QoE testing, and on adapting traditional lab-based experimental designs to crowdtesting, is now included in the Qualinet white paper on best practices for crowdsourced QoE, and has helped in better understanding the potential of this tool for QoE research (and the risks involved in its use). This research is also currently feeding into new ITU-T recommendations on the subject. The models that I developed for objective QoE estimation have been published in top journals and lay the basis for a more encompassing and personalized QoE optimization.

Over your distinguished career, what are your top lessons you want to share with the audience?

Again, I am not sure whether I am yet in a position to give advice and/or share lessons, but here are a couple of things:

  1. Be patient and far-sighted. Going for research that pays off in the short term is very appealing, especially when you are challenged with job insecurity (been there, done that). But it is not a sustainable strategy: you can’t make the world a better place with your research if you don’t have a long-term vision, where all the pieces fit together towards a final goal. And in the long run, it’s not fun either.

  2. Be generous. Science is supposed to move forward as a collaborative effort. That’s why we talk about a “scientific community”. Be generous in sharing your knowledge and work (open access, datasets, code). Be generous in providing feedback, both to your peers (be constructive in your reviews!) and to students. Be generous in helping out fellow scientists and early-stage researchers. True, it is horribly time-consuming. But it is rewarding, and it makes our community tighter and stronger.

For girls: watch Sheryl Sandberg’s TED talk, do participate in the Grace Hopper Celebration of Women in Computing, and don’t be afraid to come to the ACMMM women’s lunches – they are a lot of fun. Actually, these are good tips for boys too.

For the rest, just watch Randy Pausch’s last lecture, because he said it all already, and much better than I ever could.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

Q: Why should one attend the ACMMM women’s lunch?

A: If you are a junior female member of the community, do attend, because it will give you the opportunity to chat with senior women who have been around for a while and can tell you all about how they got where they are (most precious advice, trust me). If you are a senior female member of the community, do attend, because you could meet a young, talented researcher who needs some good tips from you, and you should not keep all your valuable advice to yourself :). If you are a male member of the community, you should attend because we really need to initiate a constructive dialogue on how to deal with the problem of low female representation in the community (because it is a problem; see the next question). Since this is a community problem (and not a problem of women only), we need all members of the community to discuss it.

Q: Why do we need more women in Multimedia?

A: Read this or this, or just check the Wikipedia page on women in STEM.