Can the Multimedia Research Community via Quality of Experience contribute to a better Quality of Life?

Can the multimedia community contribute to a better Quality of Life? Delivering a higher resolution and distortion-free media stream so you can enjoy the latest movie on Netflix or YouTube may provide instantaneous satisfaction, but does it make your life better in the long term? Whilst the QoMEX conference series has traditionally considered the former, in more recent years and with a view to QoMEX 2020, research works that consider the latter are also welcome. In this context, rather than looking at what we do, reflecting on how we do it could offer opportunities for sustained rather than instantaneous impact in fields such as health, inclusive of assistive technologies (AT) and digital heritage among many others.

In this article, we ask if the concepts from the Quality of Experience (QoE) [1] framework model can be applied, adapted and reimagined to inform and develop tools and systems that enhance our Quality of Life. The World Health Organisation (WHO) definition of health states that “[h]ealth is a state of complete physical, mental and social well-being and not merely the absence of disease or infirmity” [2]. This is a definition that is well-aligned with the familiar yet ill-defined term, Quality of Life (QoL). Whilst QoL requires further work towards a concrete definition, the definition of QoE has been developed through work by the QUALINET EU COST Network [3]. Using multimedia quality as a use case, a white paper [1] resulted from this effort that describes the human, context, service and system factors that influence the quality of experience for multimedia systems.

Fig. 1: (a) Quality of Experience and (b) Quality of Life. (reproduced from [2]).

The QoE formation process has been mapped to a conceptual model allowing systems and services to be evaluated and improved. Such a model has been developed and used in predicting QoE. Adapting and applying the methods to health-related QoL will allow predictive models for QoL to be developed.

In this context, the best paper award winner at QoMEX in 2017 [4] proposed such a mapping for QoL in stroke prevention, care and rehabilitation (Fig. 1) along with examining practical challenges for modeling and applications. The process of identifying and categorizing factors and features was illustrated using stroke patient treatment as an example use case and this work has continued through the European Union Horizon 2020 research project PRECISE4Q [5]. For medical practitioners, a QoL framework can assist in the development of decision support systems solutions, patient monitoring, and imaging systems.

At more of a “systems” level in e-health applications, the WHO defines assistive devices and technologies as “those whose primary purpose is to maintain or improve an individual’s functioning and independence to facilitate participation and to enhance overall well-being” [6]. A proposed application of immersive technologies as an assistive technology (AT) training solution applied QoE as a mechanism to evaluate the usability and utility of the system [7]. The assessment of immersive AT used several physiological signals: EEG, GSR/EDA, body surface temperature, accelerometer data, HR and BVP. These allow objective analysis while the individual is operating the wheelchair simulator. Performing such evaluations in an ecologically valid manner is a challenging task. However, the QoE framework provides a concrete mechanism to consider the human, context and system factors that influence the usability and utility of such a training simulator. In particular, the use of implicit and objective metrics can complement qualitative approaches to evaluations.

In the same vein, another work presented at QoMEX 2017 [8] employed Augmented Reality (AR) and Virtual Reality (VR) as a clinical aid for diagnosis of speech and language difficulties, specifically aphasia (see Fig. 2). It is estimated that speech or language difficulties affect more than 12% of people internationally [9]. Individuals who suffer from a stroke or traumatic brain injury (TBI) often experience symptoms of aphasia as a result of damage to the left frontal lobe. Anomic aphasia [10] is a mild form of aphasia in which patients experience word retrieval problems and semantic memory difficulties. Opportunities exist to digitalize well-accepted clinical approaches that can be augmented through QoE-based objective and implicit metrics. Understanding the user via advanced processing techniques is an area in dire need of further research, with significant opportunities to understand the user at cognitive, interaction and performance levels, moving far beyond the binary pass/fail of traditional approaches.

Fig. 2: Prototype System Framework (Reproduced from [8]). I. Physiological wearable sensors used to capture data. (a) Neurosky mindwave® device. (b) Empatica E4® wristband. II. Representation of user interaction with the wheelchair simulator. III. The compatibles displays. (a) Common screen. (b) Oculus Rift® HMD device. (c) HTC Vive® HMD device.

Moving beyond health, the QoE concept can also be extended to other areas such as digital heritage. Organizations such as broadcasters and national archives that collect media recordings are digitizing their material because the analog storage media degrade over time. Archivists, restoration experts, content creators, and consumers are all stakeholders but they have different perspectives when it comes to their expectations and needs. Hence their QoE for archive material can be very different, as discussed at QoMEX 2019 [11]. For people interested in media archives, viewing quality through a QoE lens aids in understanding the issues and priorities of the stakeholders. Applying the QoE framework to explore the different stakeholders and the influencing factors that affect their QoE perceptions over time allows different kinds of models for QoE to be developed and used across the stages of the archived material lifecycle from digitization through restoration and consumption.

The QoE framework’s simple yet comprehensive conceptual model for the quality formation process has had a major impact on multimedia quality. The examples presented here highlight how it can be used as a blueprint in other domains and to reconcile different perspectives and attitudes to quality. With an eye on the next and future editions of QoMEX, will we see other use cases and applications of QoE to domains and concepts beyond multimedia quality evaluations? The QoMEX conference series has evolved and adapted based on emerging application domains, industry engagement, and approaches to quality evaluations. It is clear that the scope of QoE research has broadened significantly over the last 11 years. Please take a look at [12] for details on the conference topics and special sessions that the organizing team for QoMEX 2020 in Athlone, Ireland, hopes will broaden the range of use cases that apply QoE towards QoL and other application domains in a spirit of inclusivity and diversity.

References:

[1] P. Le Callet, S. Möller, and A. Perkis, eds., “Qualinet White Paper on Definitions of Quality of Experience (2012),” European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003), Lausanne, Switzerland, Version 1.2, March 2013.

[2] World Health Organization, “Preamble to the Constitution of the World Health Organization,” 1946. [Online]. Available: http://apps.who.int/gb/bd/PDF/bd47/EN/constitution-en.pdf. [Accessed: 21-Jan-2020].

[3] QUALINET [Online], Available: https://www.qualinet.eu. [Accessed: 21-Jan-2020].

[4] A. Hines and J. D. Kelleher, “A framework for post-stroke quality of life prediction using structured prediction,” 9th International Conference on Quality of Multimedia Experience, QoMEX 2017, Erfurt, Germany, June 2017.

[5] European Union Horizon 2020 research project PRECISE4Q, https://precise4q.eu/. [Accessed: 21-Jan-2020].

[6] “WHO | Assistive devices and technologies,” WHO, 2017. [Online]. Available: http://www.who.int/disabilities/technology/en/. [Accessed: 21-Jan-2020].

[7] D. Pereira Salgado, F. Roque Martins, T. Braga Rodrigues, C. Keighrey, R. Flynn, E. L. Martins Naves, and N. Murray, “A QoE assessment method based on EDA, heart rate and EEG of a virtual reality assistive technology system”, In Proceedings of the 9th ACM Multimedia Systems Conference (Demo Paper), pp. 517-520, 2018.

[8] C. Keighrey, R. Flynn, S. Murray, and N. Murray, “A QoE Evaluation of Immersive Augmented and Virtual Reality Speech & Language Assessment Applications”, 9th International Conference on Quality of Multimedia Experience, QoMEX 2017, Erfurt, Germany, June 2017.

[9] “Scope of Practice in Speech-Language Pathology,” 2016. [Online]. Available: http://www.asha.org/uploadedFiles/SP2016-00343.pdf. [Accessed: 21-Jan-2020].

[10] J. Reilly, “Semantic Memory and Language Processing in Aphasia and Dementia,” Seminars in Speech and Language, vol. 29, no. 1, pp. 3-4, 2008.

[11] A. Ragano, E. Benetos, and A. Hines, “Adapting the Quality of Experience Framework for Audio Archive Evaluation,” Eleventh International Conference on Quality of Multimedia Experience (QoMEX), Berlin, Germany, 2019.

[12] QoMEX 2020, Athlone, Ireland. [Online]. Available: https://www.qomex2020.ie. [Accessed: 21-Jan-2020].

MPEG Column: 128th MPEG Meeting in Geneva, Switzerland

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The 128th MPEG meeting concluded on October 11, 2019 in Geneva, Switzerland with the following topics:

  • Low Complexity Enhancement Video Coding (LCEVC) Promoted to Committee Draft
  • 2nd Edition of Omnidirectional Media Format (OMAF) has reached the first milestone
  • Genomic Information Representation – Part 4 Reference Software and Part 5 Conformance Promoted to Draft International Standard

The corresponding press release of the 128th MPEG meeting can be found here: https://mpeg.chiariglione.org/meetings/128. In this report we will focus on video coding aspects (i.e., LCEVC) and immersive media applications (i.e., OMAF). At the end, we will provide an update related to adaptive streaming (i.e., DASH and CMAF).

Low Complexity Enhancement Video Coding

Low Complexity Enhancement Video Coding (LCEVC) has been promoted to committee draft (CD), which is the first milestone in the ISO/IEC standardization process. LCEVC is part two of MPEG-5, or ISO/IEC 23094-2 if you prefer the always easy-to-remember ISO codes. We introduced MPEG-5 in previous posts; LCEVC is a standardized video coding solution that leverages other video codecs in a manner that improves video compression efficiency while maintaining or lowering the overall encoding and decoding complexity.

The LCEVC standard uses a lightweight video codec to add up to two layers of encoded residuals. The aim of these layers is correcting artefacts produced by the base video codec and adding detail and sharpness for the final output video.
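
To make the layering idea concrete, here is a minimal Python sketch of a base layer plus two residual layers, applied to a one-dimensional signal. It is a toy illustration and not the LCEVC algorithm: the function names, the averaging downsampler, the sample-repeat upsampler, the quantisation steps, and the use of coarse quantisation as a stand-in for a real base codec are all assumptions made for this example.

```python
# Toy illustration of layered residual enhancement (NOT the LCEVC algorithm).
# All names and parameters below are hypothetical.

def downsample(signal):
    """Halve the resolution by averaging neighbouring sample pairs."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def upsample(signal):
    """Double the resolution by repeating each sample."""
    return [s for s in signal for _ in range(2)]


def quantise(values, step):
    """Uniform quantisation; the error per sample is at most step / 2."""
    return [round(v / step) * step for v in values]


def encode(source, base_step=16, step_l1=4, step_l2=2):
    """Return (base, res1, res2) for an even-length list of samples."""
    low = downsample(source)
    # Stand-in for a lossy base codec running at the lower resolution.
    base = quantise(low, base_step)
    # Layer 1: residuals that correct base-codec artefacts at low resolution.
    res1 = quantise([a - b for a, b in zip(low, base)], step_l1)
    corrected = [b + r for b, r in zip(base, res1)]
    # Layer 2: residuals that restore detail at full resolution.
    predicted = upsample(corrected)
    res2 = quantise([a - b for a, b in zip(source, predicted)], step_l2)
    return base, res1, res2


def decode(base, res1=None, res2=None):
    """Reconstruct at full resolution from the base and any available layers."""
    low = list(base)
    if res1 is not None:
        low = [b + r for b, r in zip(low, res1)]
    full = upsample(low)
    if res2 is not None:
        full = [p + r for p, r in zip(full, res2)]
    return full


def max_error(a, b):
    """Largest absolute per-sample difference between two signals."""
    return max(abs(x - y) for x, y in zip(a, b))


source = [10, 14, 200, 180, 90, 94, 33, 35]
base, res1, res2 = encode(source)
base_only = decode(base)             # what a base-only decoder would show
enhanced = decode(base, res1, res2)  # base plus both residual layers
```

In this sketch, a decoder that ignores the residual layers still produces a usable, coarser output, while a decoder that applies both layers gets a reconstruction whose error is bounded by the quantisation step of the second layer.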

The target of this standard comprises software or hardware codecs with extra processing capabilities, e.g., mobile devices, set top boxes (STBs), and personal computer based decoders. Additional benefits are the reduction in implementation complexity or a corresponding expansion in spatial resolution.

LCEVC is based on existing codecs which allows for backwards-compatibility with existing deployments. Supporting LCEVC enables “softwareized” video coding allowing for release and deployment options known from software-based solutions which are well understood by software companies and, thus, opens new opportunities in improving and optimizing video-based services and applications.

Research aspects: in video coding, research efforts are mainly related to coding efficiency and complexity (as usual). However, as MPEG-5 basically adds a software layer on top of what is typically implemented in hardware, all kinds of aspects related to software engineering could become an active area of research.

Omnidirectional Media Format

The scope of the Omnidirectional Media Format (OMAF) is about 360° video, images, audio and associated timed text and specifies (i) a coordinate system, (ii) projection and rectangular region-wise packing methods, (iii) storage of omnidirectional media and the associated metadata using ISOBMFF, (iv) encapsulation, signaling and streaming of omnidirectional media in DASH and MMT, and (v) media profiles and presentation profiles.
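
As an illustration of what a projection in item (ii) involves, the following Python sketch maps a viewing direction on the sphere to pixel coordinates in an equirectangular picture and back. The function names and the sign conventions (azimuth increasing towards the left of the picture, elevation increasing upwards, both in degrees) are assumptions chosen for this example; the normative definitions are those given in the OMAF specification itself.

```python
# Sketch of an equirectangular mapping under assumed conventions:
#   azimuth   in [-180, 180] degrees, 0 at the picture centre, positive to the left
#   elevation in [-90, 90] degrees, 0 at the picture centre, positive upwards
# Function names are hypothetical and not taken from any library.

def sphere_to_erp(azimuth_deg, elevation_deg, width, height):
    """Map a direction on the sphere to continuous picture coordinates (u, v)."""
    u = (0.5 - azimuth_deg / 360.0) * width
    v = (0.5 - elevation_deg / 180.0) * height
    return u, v


def erp_to_sphere(u, v, width, height):
    """Inverse mapping: picture coordinates (u, v) back to (azimuth, elevation)."""
    azimuth_deg = (0.5 - u / width) * 360.0
    elevation_deg = (0.5 - v / height) * 180.0
    return azimuth_deg, elevation_deg


# Looking straight ahead lands in the centre of a 3840x1920 picture.
centre = sphere_to_erp(0.0, 0.0, 3840, 1920)
# Looking 90 degrees to the left and 45 degrees up.
upper_left = sphere_to_erp(90.0, 45.0, 3840, 1920)
```

Because every row of the picture spans the full 360° of azimuth, regions near the poles are heavily oversampled, which is one motivation for the region-wise packing methods mentioned above.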

At this meeting, the second edition of OMAF (ISO/IEC 23090-2) has been promoted to committee draft (CD) which includes

  • support of improved overlay of graphics or textual data on top of video,
  • efficient signaling of videos structured in multiple sub parts,
  • enabling more than one viewpoint, and
  • new profiles supporting dynamic bitstream generation according to the viewport.

As for the first edition, OMAF includes encapsulation and signaling in ISOBMFF as well as streaming of omnidirectional media (DASH and MMT). It will reach its final milestone by the end of 2020.

360° video is certainly a vital use case towards a fully immersive media experience. Devices to capture and consume such content are becoming increasingly available and will probably contribute to the dissemination of this type of content. However, it is also understood that the complexity increases significantly, specifically with respect to large-scale, scalable deployments due to increased content volume/complexity, timing constraints (latency), and quality of experience issues.

Research aspects: understanding the increased complexity of 360° video or immersive media in general is certainly an important aspect to be addressed towards enabling applications and services in this domain. We may even start thinking that 360° video actually works (e.g., it’s possible to capture, upload to YouTube and consume it on many devices) but the devil is in the detail in order to handle this complexity in an efficient way to enable seamless and high quality of experience.

DASH and CMAF

The 4th edition of DASH (ISO/IEC 23009-1) will be published soon and MPEG is currently working towards a first amendment which will be about (i) CMAF support and (ii) event processing model. An overview of all DASH standards is depicted in the figure below, notably part one of MPEG-DASH referred to as media presentation description and segment formats.

Figure. MPEG-DASH standard status

The 2nd edition of the CMAF standard (ISO/IEC 23000-19) will become available very soon and MPEG is currently reviewing additional tools in the so-called technologies under considerations document as well as conducting various explorations. A working draft for additional media profiles is also under preparation.

Research aspects: with CMAF, low-latency support is added to DASH-like applications and services. However, the implementation specifics are actually not defined in the standard and subject to competition (e.g., here). Interestingly, the Bitmovin video developer reports from both 2018 and 2019 highlight the need for low-latency solutions in this domain.

At the ACM Multimedia Conference 2019 in Nice, France I gave a tutorial entitled “A Journey towards Fully Immersive Media Access” which includes updates related to DASH and CMAF. The slides are available here.

Outlook 2020

Finally, let me try giving an outlook for 2020, not so much content-wise but events planned for 2020 that are highly relevant for this column:

  • MPEG129, Jan 13-17, 2020, Brussels, Belgium
  • DCC 2020, Mar 24-27, 2020, Snowbird, UT, USA
  • MPEG130, Apr 20-24, 2020, Alpbach, Austria
  • NAB 2020, Apr 08-22, Las Vegas, NV, USA
  • ICASSP 2020, May 4-8, 2020, Barcelona, Spain
  • QoMEX 2020, May 26-28, 2020, Athlone, Ireland
  • MMSys 2020, Jun 8-11, 2020, Istanbul, Turkey
  • IMX 2020, June 17-19, 2020, Barcelona, Spain
  • MPEG131, Jun 29 – Jul 3, 2020, Geneva, Switzerland
  • NetSoft, QoE Mgmt Workshop, Jun 29 – Jul 3, 2020, Ghent, Belgium
  • ICME 2020, Jul 6-10, London, UK
  • ATHENA summer school, Jul 13-17, Klagenfurt, Austria
  • … and many more!

JPEG Column: 85th JPEG Meeting in San Jose, California, U.S.A.

The 85th JPEG meeting was held in San Jose, CA, USA.

The meeting was distinguished by the Prime Time Engineering Emmy Award from the Academy of Television Arts & Sciences (ATAS) for the longevity of the first JPEG standard. Furthermore, a very successful workshop on JPEG emerging technologies was held at Microsoft premises in Silicon Valley with broad participation from several companies working in imaging technologies. This workshop ended with the celebration of two JPEG committee experts, Thomas Richter and Ogawa Shigetaka, recognized by ISO outstanding contribution awards for the key roles they played in the development of the JPEG XT standard.

The 85th JPEG meeting continued laying the groundwork for the continuous development of JPEG standards and exploration studies. In particular, it advanced the new image coding standard JPEG XL and the low-latency, low-complexity standard JPEG XS, saw the release of the JPEG Systems interoperable 360 image standards, and continued the exploration studies on image compression using machine learning and on the use of blockchain and distributed ledger technologies for media applications.

The 85th JPEG meeting had the following highlights:

  • Prime Time Engineering Emmy award,
  • JPEG Emerging Technologies Workshop,
  • JPEG XL progresses towards a final specification,
  • JPEG AI evaluates machine learning based coding solutions,
  • JPEG exploration on Media Blockchain,
  • JPEG Systems interoperable 360 image standards released,
  • JPEG XS announces significant improvements of Bayer image sensor data compression.

Prime Time Engineering Emmy

The JPEG committee is honored to be the recipient of a prestigious Prime Time Engineering Award in 2019 by the US Academy of Television Arts & Sciences at the 71st Engineering Emmy Awards ceremony on the 23rd of October 2019 in Los Angeles, CA, USA. The first JPEG standard is known as a popular format in digital photography, used by hundreds of millions of users everywhere, in a wide range of applications including the world wide web, social media, photographic apparatus and smart cameras. The first part of the standard was published in 1992 and has grown to seven parts, with the latest, defining the reference software, published in 2019. This is a unique example of longevity in the fast moving information technologies and the Emmy award acknowledges this longevity and continuing influence over nearly three decades.

This is a well-deserved recognition not only for the Joint Photographic Experts Group committee members who started this standard under the auspices of ITU, ISO and IEC, but also for all experts in the JPEG committee who continued to extend and maintain it, hence guaranteeing such longevity.

JPEG convenor Touradj Ebrahimi during the Emmy acceptance speech.

According to Prof. Touradj Ebrahimi, Convenor of the JPEG standardization committee, the longevity of JPEG is based on three very important factors: “The credibility by being developed under the auspices of three important standardization bodies, namely ITU, ISO and IEC, development by explicitly taking into account end users, and the choice of being royalty free”. Furthermore, “JPEG defined not only a great technology but also it was a committee that first defined how standardization should take place in order to become successful”.

JPEG Emerging Technologies Workshop

At the 85th JPEG meeting in San Jose, CA, USA, JPEG organized the “JPEG Emerging Technologies Workshop” on the 5th of November 2019 to inform industry and academia active in the wider field of multimedia and in particular in imaging, about current JPEG Committee standardization activities and exploration studies. Leading JPEG experts shared highlights about some of the emerging JPEG technologies that could shape the future of imaging and multimedia, with the following program:

  • Welcome and Introduction (Touradj Ebrahimi);
  • JPEG XS – Lightweight compression; Transparent quality. (Antonin Descampe);
  • JPEG Pleno (Peter Schelkens);
  • JPEG XL – Next-generation Image Compression (Jan Wassenberg and Jon Sneyers);
  • High-Throughput JPEG 2000 – Big improvement to JPEG 2000 (Pierre-Anthony Lemieux);
  • JPEG Systems – The framework for future and legacy standards (Andy Kuzma);
  • JPEG Privacy and Security and Exploration on Media Blockchain Standardization Needs (Frederik Temmermans);
  • JPEG AI: Learning to Compress (João Ascenso)

This very successful workshop ended with a panel moderated by Fernando Pereira where different relevant media technology issues were discussed with a vibrant participation of the attendees.

Proceedings of the JPEG Emerging Technologies Workshop are available for download via the following link: https://jpeg.org/items/20191108_jpeg_emerging_technologies_workshop_proceedings.html

JPEG XL

The JPEG XL Image Coding System (ISO/IEC 18181) continues its progression towards a final specification. The Committee Draft of JPEG XL is being refined based on feedback received from experts from ISO/IEC national bodies. Experiments indicate that the two main JPEG XL modes compare favorably with specialized responsive and lossless modes, enabling a simpler specification.

The JPEG committee has approved open-sourcing the JPEG XL software. JPEG XL will advance to the Draft International Standard stage in January 2020.

JPEG AI

JPEG AI carried out rigorous subjective and objective evaluations of a number of promising learning-based image coding solutions from the state of the art, which show the potential of these codecs for different rate-quality tradeoffs in comparison to widely used anchors. Moreover, a wide set of objective metrics was evaluated for several types of image coding solutions.

JPEG exploration on Media Blockchain

Fake news, copyright violations, media forensics, privacy and security are emerging challenges in digital media. JPEG has determined that blockchain and distributed ledger technologies (DLT) have great potential as a technology component to address these challenges in transparent and trustable media transactions. However, blockchain and DLT need to be integrated closely with a widely adopted standard to ensure broad interoperability of protected images. Therefore, the JPEG committee has organized several workshops to engage with the industry and help to identify use cases and requirements that will drive the standardization process. During the San Jose meeting, the committee drafted a first version of the use cases and requirements document. On the 21st of January 2020, during its 86th JPEG Meeting to be held in Sydney, Australia, JPEG plans to organize an interactive discussion session with stakeholders. Practical and registration information is available on the JPEG website. To keep informed and to get involved in this activity, interested parties are invited to register for the ad hoc group’s mailing list (http://jpeg-blockchain-list.jpeg.org).

JPEG Systems interoperable 360 image standards released.

ISO/IEC 19566-5 (JUMBF) and ISO/IEC 19566-6 (JPEG 360) were published in July 2019. These two standards work together to define basics for interoperability and lay the groundwork for future capabilities for richer interactions with still images as we add functionality to JUMBF (Part 5), Privacy & Security (Part 4), JPEG 360 (Part 6), and JLINK (Part 7).

JPEG XS announces significant improvements of Bayer image sensor data compression.

JPEG XS aims at standardization of a visually lossless, low-latency and lightweight compression that can be used as a mezzanine codec in various markets. Work has been done in the last meeting to enable JPEG XS for use in Bayer image sensor compression. Among the targeted use cases for Bayer image sensor compression, one can cite video transport over professional video links, real-time video storage in and outside of cameras, and data compression onboard autonomous cars. The JPEG Committee also announces the final publication of JPEG XS Part 3 “Transport and Container Formats” as an International Standard. This part enables storage of JPEG XS images in various formats. In addition, an effort to specify an RTP payload for JPEG XS is nearing completion, which will enable transport of JPEG XS in the SMPTE ST 2110 framework.

“The 2019 Prime Time Engineering Award by the Academy is a well-deserved recognition for the Joint Photographic Experts Group members who initiated standardization of the first JPEG standard and to all experts of the JPEG committee who since then have extended and maintained it, guaranteeing its longevity. JPEG defined not only a great technology but also it was the first committee that defined how standardization should take place in order to become successful” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JPEG, JPEG 2000, JPEG XR, JPSearch, JPEG XT and more recently, the JPEG XS, JPEG Systems, JPEG Pleno and JPEG XL families of imaging standards.

The JPEG Committee nominally meets four times a year, in different world locations. The 84th JPEG Meeting was held on 13-19 July 2019, in Brussels, Belgium. The next (86th) JPEG Meeting will be held on 18-24 January 2020, in Sydney, Australia.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

Future JPEG meetings are planned as follows:

  • No 86, Sydney, Australia, January 18 to 24, 2020
  • No 87, Erlangen, Germany, April 25 to 30, 2020

Report from ACM SIG Heritage Workshop

“What does history mean to computer scientists?” – that was the first question that popped up in my mind when I was to attend the ACM Heritage Workshop in Minneapolis a few months back. And needless to say, the follow-up question was “what does history mean for a multimedia systems researcher?” As a young graduate student, I had the joy of my life when my first research paper on multimedia authoring (a hot topic in those days) was accepted for presentation at the first ACM Multimedia in 1993, and that conference was held alongside SIGGRAPH. Thinking about that, it gives multimedia systems researchers about 25 to 30 years of history. But what a flow of topics this area has seen: from authoring to streaming to content-based retrieval to social media and human-centered multimedia, the research area has been as hot as ever. So, is it the history of research topics or the researchers or both? Then, how about the venues hosting these conferences, the networking events, or the grueling TPC meetings that prepped the conference actions?

Figure 1. Picture from the venue

With only questions and no clear answers, I decided to attend the workshop with an open mind. Most SIGs (Special Interest Groups) in ACM had representation at this workshop. The workshop itself was organized by the ACM History Committee. I understood this committee, apart from the workshop, organizes several efforts to track, record, and preserve computing efforts across disciplines. This includes identifying distinguished persons (who are retired but made significant contributions to computing), coming up with a customized questionnaire for the persons, training the interviewer, recording the conversations, curating them, archiving, and providing them for public consumption. Efforts at most SIGs were mostly based on the website. They were talking about how they try to preserve conference materials such as paper proceedings (when only paper proceedings were published), meeting notes, pictures, and videos. For instance, some SIGs were talking about how they tracked and preserved ACM’s approval letter for the SIG! 

It was very interesting – and touching – to see some attendees (senior professors) coming to the workshop with boxes of materials – papers, reports, books, etc. They were either downsizing their offices or clearing out, and did not feel like throwing the material in recycling bins! These materials were given to ACM and the Babbage Institute (at University of Minnesota, Minneapolis) for possible curation and storage.

Figure 2. Galleries with collected material

ACM History Committee members talked about how they can fund (at a small level) projects that target specific activities for preserving and archiving computing events and materials. The ACM History Committee agreed that ACM should take more responsibility in providing technical support for web hosting, although I am not sure whether anything tangible will result.

Over the two days at the workshop, I was getting answers to my questions. History can mean pictures and videos taken at earlier MM conferences, TPC meetings, and SIGMM-sponsored events and retreats. It can also mean the earlier paper proceedings, which contain some information not found in the corresponding ACM Digital Library version, and interviews with the research leaders who built and promoted SIGMM.

It was clear that history meant different things to different SIGs, and as the SIGMM community, we would have to arrive at our own interpretation, then collect and preserve that. And that made me understand the most obvious and perhaps the most important thing: today’s events become tomorrow’s history! No-brainer, right? Preserving today’s SIGMM events will give us a richer, more colorful, and more complete SIGMM history for future generations!

For the curious ones:

ACM Heritage Workshop website is at: https://acmsigheritage.dash.umn.ed

Some of the workshop presentation materials are available at: https://acmsigheritage.dash.umn.edu/uncategorized/class-material-posted/

Reports from ACM Multimedia 2019

Introduction

The annual ACM Multimedia Conference was held in Nice, France, from October 21st to 25th, 2019. As the 27th in its series, it attracted approximately 800 participants from all over the world. Among them were the student volunteers who supported the smooth organization of the Conference. In this article, I would like to introduce the reports and comments provided by each of them.

Figure. Student volunteers at ACM Multimedia 2019

Reports from student volunteers

Hui Chen (Tsinghua University, China)

It was such an honor for me to be granted student travel funding. During my stay in Nice, as a Ph.D. researcher, I read a lot of nice academic works which inspired me a lot. And I had wonderful conversations with authors from all over the world. Meanwhile, as a session volunteer, I was glad to help speakers and the audience during sessions. Their nice works and warm smiles impressed me a lot. What I valued most was the friendship with other volunteers. We often discussed the attractive places and the delicious food in Nice, and cared for each other along the journey. I am deeply thankful for this wonderful experience in Nice. Some advice: (1) I think the beret was not necessary for the volunteers. The majority of us seemed to dislike it, because I did not see many volunteers wearing them. (2) Notifications about room changes for sessions should be made clear early. (3) The importance of being punctual could be emphasized in the ice-breaker meeting. (4) Reminders of volunteered sessions could be shown in the Whova app.

Shizhe Chen (Renmin University of China, China)

It was a great pleasure to attend ACM Multimedia this year. I have attended MM twice and the organization is getting better and better. One big change was the deployment of the Whova app, which really improved our experience at MM. On the one hand, it made connections among different attendees and organizations more convenient and efficient. On the other hand, it was nice to share photos of the conference in the app. The volunteers were very devoted to serving the conference and uploaded many good pictures. The conference banquet in Nice was also much improved. I really enjoyed the local food and the magic shows. Even though there were so many people that night, the organization was very orderly and left everyone satisfied. I also liked some of the modern multimedia art pieces exhibited at the conference, which were wonderful. The conference session I enjoyed most was the Multimedia Grand Challenge, which provided a great opportunity for us academics to get involved in real-life problems in industry. It would have been better if there had been more opportunities to communicate in person with industry people at the conference. In summary, thanks for all the efforts the organizers put into the conference. I am also proud to have been able to contribute a little as a volunteer this time.

Yang Chen (University of Science and Technology of China, China)

This was my first time attending an international conference, and I also needed to serve as a session volunteer during the conference. It was also my first time abroad, so I felt a little nervous before leaving for the conference. Fortunately, everything went smoothly in the end. The MM conference has been held for many years, so the organizers have rich experience, and its scale is also large. The MM conference provided a lot of conveniences for the participants. All conference schedules could be found at the venue, so attendees could easily find the sessions that they needed to participate in or were interested in. In addition, this year, the MM conference had many local characteristics of Nice, France. All attendees were given the famous local soap of Nice. The French food provided at the venue was also very delicious. All in all, it was a very impressive MM conference experience.

Amanda Duarte (Universitat Politècnica de Catalunya, Spain)

ACM Multimedia 2019 was a different and great experience for me. This was the first time that I attended this conference and it was very different from what I am used to finding at a big conference. For the past four years I have been going to conferences more focused on Computer Vision and Machine Learning, which nowadays have a large number of attendees, accepted papers, and parallel sessions, with all the stress of being in a large venue and needing to find the sessions that interest you across large rooms full of people.
ACM Multimedia, on the other hand, was held in a smaller venue with fewer attendees, yet still with a large number of high-quality researchers. Thus, I had the chance to talk more with great researchers in the areas that interest me, who were also interested in my work. In addition to my great experience at the conference in general, I had a great experience participating in the Doctoral Symposium. This event gave me the opportunity to present my work to great researchers who work on topics related to my doctoral thesis and who were able to give me great feedback and suggestions on how to improve my research.

Gelli Francesco (National University of Singapore, Singapore)

Although I am still a student, this edition of ACM Multimedia was my third. Similar to the previous times, I met with the now more familiar community and divided my time between attending sessions, walking around the posters, and rehearsing my presentation. My observation is that this year, there was a major focus on applications rather than on technical aspects. For example, the Best Paper session included works on zooming audio together with video, a multi-modal dialogue system, and privacy. The Brave New Ideas session, in which I presented, saw some more unusual and daring applications, such as the automatic creation of a sequence of images to match a short story. I had a great time presenting my paper on ranking images by subjective attributes, as I did my best to engage the audience with multiple questions. I learned from the senior organizers that their goal is to push the Multimedia community toward applications such as Wellness and Human-Machine interaction, which naturally involve multimedia data. It was also inspiring to see so many engaged volunteers all dressed in blue running around with that very traditional beret. I am definitely looking forward to attending the next edition.

Trung-Hiếu Hoàng (University of Science, Vietnam National University Ho Chi Minh City, Vietnam)

I am excited to share my experience at ACMMM 2019 as a person who received the student travel grant. Living in Vietnam, I could not believe that I had such a great opportunity to travel thousands of kilometers and attend one of the top conferences in the world. On the first day, I met a lot of friends who had received the same travel grant as me. We hung out together sharing different stories and experiences. All of us were enthusiastic and couldn’t wait to become part of the volunteer team and contribute to the success of this year’s conference. During the last two years, I have had a strong interest in medical image processing. In detail, my research focuses on abnormality detection in endoscopic images. Attending ACMMM 2019 gave me a wonderful chance to present my work and discuss it with experts in this field. I enjoyed the Healthcare Multimedia workshop, where I met the organizers of the BioMedia Grand Challenge track. I loved talking with them and discussing the future and their interests. In conclusion, I am so glad that the student grant brought me to Europe for the first time, opened up my mind, and showed me wonderful things that I had never seen before.

Chia-Wei Hsieh (National Chiao Tung University, Taiwan)

I attended ACM Multimedia 2019 in Nice, France, and listened to new AI approaches presented by experts and scholars from various countries. At this conference, I got the chance to learn about the latest research results from world-renowned universities and research institutions, and about the latest developments in industry. These advanced tools broadened my view and made me realize the limitations that can be addressed in our future research. Furthermore, I appreciated serving as a volunteer at the conference. This pushed me to interact with people, and I made many good friends from all over the world. Everything about attending MM’19 was great, but one fly in the ointment was that attendance on the last two days was pretty low. With some special benefits for people who stay, there could be more academic exchanges at the conference.

Michael Kerr (RMIT University, Australia)

I came to the conference this year hoping to learn about some very specific research that was being presented in my own field of employment, video surveillance. My expectations around these presentations were well met, but I also took away new insights into other areas that were previously not of great interest to me, mainly because I had not explored their application to my own field.
I particularly enjoyed the Tutorials on Multimedia Forensics and was interested to see the work done in areas that had been developed in recent years. I was very engaged by the application of CNN to solve forensic challenges and quickly found that the application of these systems was a major theme in the entire conference. So, whilst I enjoyed many of the practical applications such as the Tutorials, the System Demonstrations, and the Open Source Software Competition, I also learnt a great deal about the growth of CNN technologies within the multimedia discipline as a whole. This has had a positive effect by helping to develop my own research plans and in particular enabling the identification of new applications that may be of interest to those working in multimedia as well as my specific field of interest.

Saurabh Kumar (Indian Institute of Technology Bombay, India)

I had an enjoyable experience at ACM Multimedia and learned a lot, as this was my first big international conference. The papers covered diverse applications, and it was great talking to the speakers after the talks and at the posters. This allowed me to meet many amazing people from various backgrounds and talk about the exciting research they are doing. It was easy to approach anyone at the conference for casual or technical discussions. These days conference talks are recorded and proceedings are put up online, but that is just the tip of the iceberg. Attending a conference is a much broader experience, and I got an opportunity to experience this thanks to this travel grant. I made friends from many countries, thanks to the friendly atmosphere, and learned how my research fits in. I would like to highlight that being a volunteer was the primary reason all of this was possible. As a volunteer, it was so much easier to talk to people, and it was great helping them around. I would love to come and help out again anytime. The conference was just perfect, and I will remember my experience as a volunteer, which made it way more fun, and especially the people I interacted with. I am certainly submitting to the next MM and coming back with more exciting research and to meet this fantastic community again. Also, visiting Nice was a delight. It is a magnificent city, and the food was delicious.

Yadan Luo (University of Queensland, Australia)

It was a great experience attending ACM Multimedia 2019 in Nice this October, where I met many brilliant people working in the same field. The Invited Talks offered impressive ideas, inspiring visions of the future, and excellent coverage of many areas, like preserving audiovisual archives and data protection law. The most impressive part of the conference was the Art Exhibition, which showed the great power of installation art and interactive multimedia. Moreover, this great meeting brought me many precious opportunities to meet researchers working in other subfields like video streaming, domain adaptation, and image generation. Chatting with them helped me quickly pick up plenty of new knowledge and opened a door to other research directions. In conclusion, I would like to sincerely thank the people who prepared the conference. I benefited a lot from this fantastic event.

Kwanyong Park (Korea Advanced Institute of Science and Technology, Korea)

ACM Multimedia 2019 was especially meaningful to me in terms of my own improvement. Honestly speaking, my paper presented at ACM Multimedia 2019 is my first international research accomplishment, so I really lacked experience and skills in presenting my work and communicating with other researchers. But after ACM Multimedia 2019, I am confident that I can at least do better and better. The combination of Oral and Poster sessions was really impressive and an effective way to obtain a lot of information in a short time. Every paper had at least a 2-minute oral presentation, from which I could catch the core concept. Based on that, I could easily decide whether the paper was closely related to my interests or not. I agree that this kind of configuration is really efficient. Through the conference, I saw which topics the students, who mostly have an academic perspective, are focusing on. Although this was a great stimulus to me, I think a practical perspective from various companies is also important to broaden one's horizons. However, research from companies was relatively hard to find at ACM Multimedia 2019. I think that having some interactive booths from companies would be helpful.

K. R. Prajawal (International Institute of Information Technology, India)

ACM Multimedia was not only my first top-tier conference, but my first conference as well. I was pleased to see a lot of interesting and impactful papers from people from various backgrounds and universities. I particularly liked the conference venue as well, as it was spacious and comfortable to encourage a healthy discussion. I personally feel the food and meals could have been better curated. For example, I’m a vegetarian. I understand I have few items to eat, but the vegetarian items were not clearly labeled. This can be rectified in the future editions of the conference. I also believe that most of the presentation rooms were well prepared and organized for the presentation. During my oral presentation, however, I had an issue in playing a demo video. This issue had occurred because the conference organizers were not fully prepared to play a video during the presentation. That is rather odd, I felt, given this is a top-tier multimedia conference, which means it will have lots of audio and visual content. But, other than that, I had a very pleasant and fruitful time at the conference. I was able to connect and socialize with eminent researchers at ACM Multimedia and I hope to attend the next edition as well.

Estêvão Bissoli Saleme (Federal University of Espírito Santo, Brazil)

ACM Multimedia 2019 in Nice was such a unique experience. I volunteered for six sessions and attended a couple more, including the Best Paper session, which I liked the most. This was not only because it brought original ideas, but also because I had the opportunity to witness an innovative presentation of the paper “Multimodal Dialog System: Generating Responses via Adaptive Decoders,” in which the speakers kept up a dialog between them to give their talk. Besides that, I enjoyed the poster presentation hall, where we could mingle with other participants, get to know other people’s work better, and interact with them. One presentation that impressed me was entitled “Editing Text in the Wild.” In this work, the researchers proposed a method to replace any text in a picture while keeping the background intact. The outcome looked like a real picture. Just impressive! Technically, I was more interested in Quality of Experience and Interaction, but I thought the subjects of the papers in this session were spread out, which hindered the interaction with other presenters. It lacked a bit of work related to QoE itself. Finally, another aspect that deserves praise was the organization. Whova helped hugely, and we could post photos and interact with other people there. Moreover, Martha, Laurent, and Benoit were omnipresent and tireless. They were just on fire and worked very well to deliver such a great conference!

David Semedo (Universidade NOVA de Lisboa, Portugal)

My experience at ACM MM 2019 was very positive. I presented two full papers: one as a full oral and one as a short presentation. As such, the whole event was quite intense for me but also very personally enriching. I could do a lot of networking, with both students and senior researchers (the ConfLab contributed in this regard). As I am in my last Ph.D. year, I could talk with several researchers, from whom I got valuable advice on how to take the next steps towards pursuing a career in research. At the poster sessions, I had the opportunity to discuss my work in detail with several people, from whom I received constructive feedback. While I liked the fact that posters stayed up during the whole conference, some were hard to find or were a bit hidden (e.g. the ones facing the wall). The conference program covered a wide range of topics in Multimedia. This allowed me to understand which techniques are being used on different tasks, and to identify common technical aspects across these different tasks. It not only helped me stay up to date with state-of-the-art approaches, but also helped me define potential future research directions.

Junbo Wang (Institute of Automation, Chinese Academy of Sciences, China)

From 21 to 25 October 2019, I attended the ACM Multimedia 2019 Conference in Nice, France. This is a premier international conference in the area of multimedia within the field of computer science, and I am very proud to have attended this professional conference thanks to the ACM student travel grant. At this conference, I met many famous researchers in the area of multimedia, such as Tao Mei, Tat-Seng Chua, and Changsheng Xu. During the Poster and Oral sessions, I discussed many academic problems with these researchers, which really gave me new vision and insight. In addition to many academic talks, I also enjoyed a lot of French food, such as macarons and foie gras. As a session volunteer, I was also very happy to help the attendees in some session talks. The interesting and professional talks inspired me and guided my interest to many different research areas. Moreover, the conference was held at the NICE ACROPOLIS Convention Center in Nice, which is a beautiful and peaceful city. The fresh air and pleasant sea breeze put us in a good mood every day and gave us an unforgettable experience in this city. Overall, I think this conference was very successful in reaching its fundamental objective: free communication. However, I also found that there were far fewer sponsors this year than last year, which I hope will improve next year.

Xin Wang (Donghua University, China)

In my experience, MM’19 was very impressive and easy to follow. The arrangement of the conference was very reasonable, and the Whova app in particular helped me a lot whenever I wanted to figure out what was going on during the conference. The one exception I found was that, in the first two days, some workshops had different room numbers in the session volunteer schedule (a Google sheet) and in the app. That confused me for a while, but luckily Martha told us to use the app as the standard. I really loved the Demo session and I think there must be people who felt the same way. I met and talked with many researchers from all over the world, from institutions such as NUS, DCU, Nagoya University, Shandong University, and National Chiao Tung University. I still keep in contact with some of them and we exchange research ideas. Besides, the weather in Nice was very comfortable. The food during the conference was rich and delicious. All of these reasons make me look forward to next year’s MM conference.

Yitian Yuan (Tsinghua University, China)

It was very enjoyable to attend the ACM MM 2019 conference. As a volunteer, I could meet peers from other countries and schools and communicate with them, which is of great benefit to my research. I think the agenda of this ACM MM conference was compact and reasonably arranged, but there are still the following problems that I think need to be improved: (1) The entrance of the main conference hall was dimly lit and the signs were not obvious, so volunteers needed to guide participants, who would otherwise have had difficulty finding the place. (2) I wish the stage at the Banquet had had a bigger screen, so that everyone could see the names of the winners and the prize information. Finally, I wish ACM MM continued improvement and growing international influence.

Zhengyu Zhao (Radboud University, The Netherlands)

This was my second time attending ACM Multimedia, after the first time in Korea in 2018. Overall, I felt the conference this year was a very successful edition, as reflected by the perfect location, delicious food, well-designed program, and especially the efforts of the volunteers. But still, I have some suggestions for further improvement. Specifically, from the experience of the poster presentation of my reproducibility paper, I realized that most people actually know nothing about this new reproducibility track. As a result, I spent most of my time explaining the general background of the track and had less time for my own research. I was happy to explain and get more people involved in this track, but it would be better if the organization team could give the track more exposure beforehand. From my experience serving as one of the poster session chairs, I found that many people do not use the official communication app Whova, so the instructions and important announcements could not reach all the participants in a timely manner. In my opinion, more offline solutions (e.g., a big screen on the spot) would help.

Summary

In general, the student volunteers seem to have enjoyed the event to the fullest, but some of them have offered constructive suggestions that organizers and participants of future editions of the conference could take into account to provide better experiences!

All in all, I think we can see from the submitted reports that giving young researchers, who may one day become leaders in our community, the chance to experience top-level research and to mix with researchers of all levels at a top-level conference will surely benefit us in the future.

An interview with Associate Professor Hugo L. Hammer

Hugo as a Ph.D. student, at the beginning of his research career.

Describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

From an early age, I had the ability to focus and work individually and loved to develop new systems for all sorts of things, which probably was quite annoying for those around me. It turns out that these abilities to focus, be curious, and develop new systems are what drive my research today. When I started as a student in mathematics and statistics at the Norwegian University of Science and Technology (NTNU), I didn’t think of research as an alternative and was determined to find a job in industry. Throughout my studies, I learned how little mathematics and statistics I had actually learned, which is why I decided to become a Ph.D. student. I expected to find a job in industry after the Ph.D. period but ended up loving research, and that is why I am where I am today.

As a statistician, I have worked a lot with spatial and spatio-temporal data, such as geophysical observations. Such observations have striking similarities to multimedia content, such as images and videos. I have become very interested in machine learning methods used to process and make decisions from multimedia content and the potential for applying such methods to other domains, such as geophysics. I also love working as a statistician within this field. A crucial part of my research is to try to combine methods from machine learning and statistics in new and exciting ways.

Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish, and how will you bring this about? 

In my current position as an associate professor, I do both teaching and research. Teaching and research challenge me in different ways. I continuously try to develop and improve my teaching. I especially focus on how to do high quality, yet resource-efficient, teaching. I have, for example, worked a lot on how to activate students and improve learning when being a single teacher for hundreds of students.

Can you profile your current research, its challenges, opportunities, and implications?

My current research can roughly be divided into three directions. The first direction is about methods for real-time information processing and decision making, for example, from sensory information or video streams. The second direction is developing new machine learning models and methods, taking advantage of my background in statistics, as mentioned above. The third direction is more applied use of machine learning methods on real-life multimedia data, in particular, medical data. Directions two and three go hand in hand. Having a background in statistics and working more and more with multimedia data is more of an opportunity than a challenge.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

I am proud of the research we have done on real-time information processing and decision making. The methods we developed are simple but still demonstrate state-of-the-art performance. In 2020, we plan to develop software packages to make the methods readily available and hopefully useful for many. We saw the potential of using machine learning, and in particular deep learning, on geophysical data and problems quite early, and we are now able to operate at the forefront of this research. I’m also proud of our externally funded research projects and, for sure, our rejected research proposals.

Over your distinguished career, what are your top lessons you want to share with the audience?

Here is a lesson from my personal experience. I think it is easy to depend on or have too much respect for other researchers early in the career. Research is of course all about collaboration, but still, for me, it was useful early in the career to create a small research project where I did every step of the process myself (shaping ideas, collecting data, running simulation, writing, finding suitable publishing channels, revisions, etc.). It was hard work, but for sure, it made me a better and more independent researcher.

What is the best joke you know?

Daddy, what are clouds made of?

Linux servers, mostly.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

One suggestion: What do you like to do in your spare time?

Research, right? 🙂 Working every day at an office, I try to find time for physical activity in my spare time. I love to run, bike, or go skiing in Nordmarka (a forest near Oslo, Norway) or in the mountains on the weekends.

A recent photo of Hugo.

Bio: Hugo L. Hammer is an associate professor in statistics at Oslo Metropolitan University. His main research interests are computational statistics, probabilistic forecasting, real-time analytics, and machine learning.

An interview with Professor Roger Zimmermann

Roger at the start of his career.

Please describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

I had an interest in technology early on, though my path to becoming an academic has not been very direct. In high school, I really enjoyed tinkering with electronics, taking radios apart, and learning about digital circuits. My goal was to work in this field, and after high school, I did an apprenticeship with Brown, Boveri & Cie. (BBC), which sometime later became Asea Brown Boveri (ABB). The apprentices were assigned to different company locations, and I was lucky enough to be sent to BBC’s Forschungszentrum (Research Center). The labs, the researchers, and the cutting-edge equipment and projects there left a deep impression on me. Beyond electronics, I really liked microprocessors, computers and how they could be flexibly programmed with software. I decided that I wanted to pursue further studies and I subsequently enrolled in the Höhere Technische Lehranstalt (HTL) Brugg-Windisch in their Informatik program (the HTL program has since changed and the building where I studied is now part of the campus Windisch of the Fachhochschule Nordwestschweiz). With my fresh HTL degree in hand, I started to work for an engineering company and over the next years, I got the chance to work on some fascinating projects. After five years, I got an itch to study for a Master’s degree and I ended up in California. One of the professors (who became my advisor) encouraged me to go for a Ph.D., and I took him up on his offer to support me. His group worked at the intersection of databases and multimedia. It really fascinated me and we ended up building one of the early streaming media servers. What I still find fascinating about multimedia today is how it brings together many fundamental computer science areas such as networking, graphics, operating system support, signal processing, etc. I also like that multimedia is used by people to express their creativity, humanity and artistic aspirations – it is not only about technology.

My personal lessons looking back are that sometimes you may not know where your journey will take you, but make sure you enjoy and learn from the path to get there.

Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

I currently work broadly in two areas, namely streaming media systems and data analytics. At this point, one of my main sources of enjoyment is working with my research group and international colleagues from around the world. On the technical side, it is fun if somebody is actually using what we develop. On the human side of things, it is great to see when my students and former students are doing well in various parts of the globe.

Can you profile your current research, its challenges, opportunities, and implications?

In my research group, I have two main themes: media systems and multimedia data analytics. In the first cluster, we look at media streaming on the Internet. The main technology in use today is Dynamic Adaptive Streaming over HTTP, also called DASH. Some interesting challenges are in the area of enabling very low latency in live streaming, which is of interest to many large Internet companies. Going forward, I see 5G networks as an interesting challenge. Most people are excited about the very high bandwidth that 5G can offer (in the best case), but I believe one of the major challenges will be the very high variability of 5G networks when a device is moving. On the multimedia, and especially spatial, data analytics side, I am part of a new lab between NUS and the ridesharing company Grab. There is a tremendous amount of data generated (e.g., GPS trajectories) that allows novel data-driven applications such as generating accurate road maps in regions where this information is not readily available or inferring semantic attributes of roads (e.g., no right turn allowed). The fusion of multiple data types such as trajectories, images, maps, etc., will allow for some exciting new applications.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

One of the areas where my group made innovative contributions was georeferenced mobile video: combining videos with their geo-spatial properties led to a lot of interesting developments. We started with this at just about the same time that the first iPhone came out, and the idea of utilizing all the sensors in a phone in combination with its video was really novel. Nowadays, sensor fusion is common and is used in many machine-learning applications, and I am sure there will be even greater breakthroughs in the future. Another area where I have been working for decades is media streaming, and this whole industry has changed from proprietary networks to the Internet. There have been many people working in this area, but I believe that our own contributions have helped to transform this field.

Over your distinguished career, what are the top lessons you want to share with the audience?

My path to becoming an academic has not been as direct as for some other people. But one of the key things that I have enjoyed along the way was to work with many outstandingly talented and bright people from all around the world. I hope that humanity will keep working together based on facts and science to solve some of the big challenges that are coming our way.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

One issue that concerns me is the apparent trend to not trust facts anymore. So a possible question could be: Do you see a danger when people easily distribute and believe in “alternate facts”?

My answer would be, I definitely see this as a considerable concern in the future. While there may be some technical solutions to combat fake news, etc., it is also increasingly important that people are well educated and think critically, especially in a world where fake information may look very persuasive.

 

What is the best joke you know?

I like many of the weird, but strangely funny comments on life and baseball from Yogi Berra. He was born Lawrence Peter Berra and was a US baseball legend. Two examples:

“When you come to a fork in the road, take it.”

“You should always go to other people’s funerals. Otherwise, they won’t come to yours.”


A current image of Roger.

Short bio:

Roger Zimmermann is an Associate Professor at the School of Computing at the National University of Singapore (NUS). He is also Deputy Director with the Smart Systems Institute (SSI) at NUS. From 2010 to 2016 he co-directed the Centre of Social Media Innovations for Communities (COSMIC), a research institute funded by the National Research Foundation (NRF) of Singapore. Prior to joining NUS he held the positions of Research Area Director with the Integrated Media Systems Center (IMSC) and Research Assistant Professor at the University of Southern California (USC). He earned his M.S. and Ph.D. degrees from the Viterbi School of Engineering at the University of Southern California.

Multidisciplinary Column: Conferences as Career and Community Catalysts

A little over 10 years ago, I chose to pursue a PhD. This meant I chose a professional life in which research publications and their uptake would be seen as major evidence of achievement. For those working in computer science, the major dissemination platforms for such publications are conferences.

Given my dual background in music and computer science, it was logical that my main interests were in topics that connected both of these worlds. As a consequence, I hoped to become part of the Music Information Retrieval community. The International Society for Music Information Retrieval (ISMIR) therefore seemed the professional community to target, and the annual ISMIR conference the most logical place to present my work.

In terms of its education and research, my department at TU Delft had track records and agendas in visual and social multimedia content analysis, but not particularly in music. Considering methodology and philosophy, I did think a lot of the work at the department was compatible with what I tried to do in music. Furthermore, as I was still in training in a selective major at the conservatoire, I was not in a good position to move geographically to any other institute that would have a more established Music Information Retrieval track record. So I inquired whether I could stay in Delft to pursue my PhD.

The answer was somewhat complicated. There was no funding for a PhD position in Music Information Retrieval, and there were no strategic plans to change that. At the same time, the people who had supervised me as a student (in particular, my thesis supervisor Alan Hanjalic) saw promise in me, and wanted to keep working with me. Ultimately, I got a one-year contract in which my main task was to try to acquire funding and international community backing to pursue a Music Information Retrieval PhD in a multimedia group.

At the start of that year, I got to attend my first ISMIR conference, where I presented a paper based on my master’s thesis. In a previous column for the SIGMM Records, I already discussed my experiences at that moment: how debuting alone at a conference was intimidating, but how I was lucky that senior members of the community proactively made sure I was introduced to other attendees. Frans Wiering, the senior member who looked after me in particular at that moment, was general chair of the upcoming ISMIR, which would take place in Utrecht, in my home country. Frans was quick to invite me to serve as a student volunteer, which was very good news for me. As my year would be filled with grant-writing, I did not yet have a sufficiently stable infrastructure around me to be able to truly do research, so submitting to the next ISMIR was out of reach. But this way, I could still attend the conference, and would even have an excuse to keep mingling with all the attendees, as we volunteers would be the first people to answer any participant questions regarding logistics.

Getting funding turned out to be a true challenge. In 2009, digital music consumption was not yet as large as it is today, and many potential data-providing partners were reluctant to collaborate. Of course, it also did not help my cause that I was still a complete nobody. Finally, when working on music, one faces an interesting paradox. On the one hand, many people, regardless of their backgrounds, identify with music, up to the point that they personally deeply care about it. As such, working on music makes for a good conversation starter, in which people are always happy to share their personal experiences. On the other hand, this makes music a commonplace topic, which risks it being shoved aside as ‘less serious’. Even though technically, the problems we are working on are framed in much the same way as in neighboring domains such as vision (and the research challenges are at least as hard, if not harder, due to subjective human factors being an integral part of the problem), common criticisms we receive are that music is fun but does not save lives, and does not deal with areas of major economic impact, nor easily measurable societal impact. So while we never have any problems legitimizing our work in public outreach, in grant-writing, we always need to put extra effort into justifying why our work is more than a fun hobby, and sufficiently relevant to warrant serious funding.

After several collaboration rejections, and with the one proposal I did manage to set up being rejected despite good review scores, I was very lucky that at the very end of my grant-writing year, I managed to secure PhD funding through a Google Doctoral Fellowship (now PhD Fellowship). For this, I needed to get a research mentor, although my Google contacts weren’t so sure who would be appropriate for this role, as they were not aware of anyone working in music in the company at that stage.

Several weeks later, I was volunteering at ISMIR in Utrecht. That was where I found out that Douglas Eck had just moved from academia to industry, to work on music research at Google. And that was how I got my research mentor, with several extremely useful interning experiences at the company as a consequence.

When Emilia Gómez, the 2018-2019 president of the ISMIR society, invited me to become general co-chair of ISMIR’s 20th anniversary edition with her, and to host the event in Delft, this was my chance to give back. Now I had general chair powers, and as the society was quite open to discussing any innovations, I could try to realize the conference of my dreams.

As described in my previous column, the inclusive spirit of ISMIR has always been quite elaborate, including mentoring programs spearheaded by our Women in MIR movement, an explicit focus on multidisciplinarity over exclusivity, and on being medium-sized but single-track. For the past two years, all our accepted papers have been presented through both a 4-minute presentation and a poster, such that all the works get equal visibility. This year, we chose not to do themed sessions but to randomize the paper order, such that authors on related topics would not be presenting their posters at the same time. As a side effect, this would also nudge attendees towards learning about everything that got accepted, beyond the topics of their specializations. This is something I have always seen the ISMIR community being enthusiastic about, while I had very different experiences at larger (and more prestigious) conferences. In many cases, their larger size led to many parallel tracks with fragmented audiences, while any plenary program elements were so massive that it was hard to engage with anyone you did not happen to know already, or incidentally happened to stand or sit next to.

We made sure we offered more than paper presentations. For the keynotes, we invited speakers from neighboring fields and disciplines, and encouraged them to give some critical perspectives on our field. We engaged with a local school in an outreach program. Before the conference, we held workshops, including the Women in MIR prototyping workshop, so people would already get to know one another; we had a dedicated Newcomer Initiatives chair to make sure no one felt lost, and the socials were set up such that people could really mingle. With many people in music also happening to be active music players, we offered both formal and informal options to jam together, so that week, several cafes in Delft faced more live music than we would normally see.

But while I was preparing for this conference, one of my strongest experiences was that I kept being haunted by memories of the past: that being able to join this community (and to have an academic career at all) had been a really close call, one that was catalyzed by my having been able to join the conferences and having met supportive seniors while I was still an early-stage student without a full research embedding.

So one of the ISMIR 2019 achievements I am most proud of was that we extended our financial support programs, enabled by the ISMIR board and sponsorship funds. Beyond the existing grants for student authors and female participants, we added a third ‘community grant’ category, meant for individuals who would like to attend ISMIR, but who had not been in a position to actively participate in the conference at this stage. Reading through the motivation letters for this grant made me realize that my experiences were not as much of a freak case as I had thought, and that colleagues have been facing similar challenges.

I am deeply grateful that these grants enabled us to bring more people to ISMIR: young professionals in between positions; students in other disciplines seeking to collaborate more closely on music topics; students who found themselves the only people in their labs working on music, as the labs faced other strategic priorities; but also seniors who used to be members of our field, but who had gradually been drifting out after entering a vicious circle of not getting music projects funded, then having to do more teaching in other topics, and then taking hits on their research output and profile. It was a wonderful experience seeing all of them actively mingling with the community, and hearing how being at ISMIR had indeed been personally impactful for them.

For my student volunteers, I especially targeted local and national students who were not yet at the PhD level, such that they could experience our academic atmosphere. Here as well, I saw the positive impact of the ISMIR spirit; several of these students (of whom I am not even the thesis supervisor…) made friends with international colleagues, and are even trying to collaborate on music information research with them in their free time today.

Hopefully, this story can help inspire colleagues who are seeking to make their conference cultures more inclusive and impactful. With this, I do want to add a warning that endeavors like this will not come for free, but demand considerable extra work and advocacy. Many of our proposed innovations initially faced pushback in some form, as these were not how things were normally done, and they required financial and human resources that would not normally be accounted for. But I am very grateful that we followed through, and extremely proud of what we achieved in the end. My great thanks go to the ISMIR society, my fellow ISMIR 2019 organizers and our sponsors for their trust and support.

All ISMIR 2019 presentations have been recorded, and are available through this link. The accepted (open access) papers with supplementary material are available via this page. Photos of the socials are available here.


About the Column

The Multidisciplinary Column is edited by Cynthia C. S. Liem and Jochen Huber. Every other edition, we will feature an interview with a researcher performing multidisciplinary work, or a column of our own hand. For this edition, we feature a column by Cynthia C. S. Liem.

Dr. Cynthia C. S. Liem is an Assistant Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. Her research interests consider search and recommendation for music and multimedia, with special interest in making people discover new interests, as well as questions of interpretability and validity. She initiated, co-coordinated and participated in various (inter)national collaborative research projects on the accessibility of content which would not trivially be retrieved, both in the music/cultural heritage world, as well as in social sciences applications, e.g. collaborating with organizational psychologists. Beyond her academic activities, Cynthia gained industrial experience at Bell Labs Netherlands, Philips Research and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships, the Google European Doctoral Fellowship 2010 in Multimedia, and a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach. In 2018, she was Researcher-in-Residence at the National Library of The Netherlands, and in 2019, she served as general co-chair of the ISMIR conference.

Dr. Jochen Huber is a Senior User Experience Researcher at Synaptics. Previously, he was an SUTD-MIT postdoctoral fellow in the Fluid Interfaces Group at MIT Media Lab and the Augmented Human Lab at Singapore University of Technology and Design. He holds a Ph.D. in Computer Science and degrees in both Mathematics (Dipl.-Math.) and Computer Science (Dipl.-Inform.), all from Technische Universität Darmstadt, Germany. Jochen’s work is situated at the intersection of Human-Computer Interaction and Human Augmentation. He designs, implements and studies novel input technology in the areas of mobile, tangible & non-visual interaction, automotive UX and assistive augmentation. He has co-authored over 60 academic publications and regularly serves as program committee member in premier HCI and multimedia conferences. He was program co-chair of ACM TVX 2016 and Augmented Human 2015 and chaired tracks of ACM Multimedia, ACM Creativity and Cognition and ACM International Conference on Interface Surfaces and Spaces, as well as numerous workshops at ACM CHI and IUI. Further information can be found on his personal homepage: http://jochenhuber.com