On System QoE: Merging the system and the QoE perspectives

With Quality of Experience (QoE) research having made significant advances over the years, increasing attention is being paid to exploiting this knowledge from a service/network provider perspective in the context of the user-centric evaluation of systems. Current research investigates how system/service mechanisms, their implementation, and their configuration affect service performance and, in turn, the QoE of users. Prominent examples address adaptive video streaming services, as well as enabling technologies for QoE-aware service management and monitoring, such as SDN/NFV and machine learning. This is also reflected in the latest edition of conferences such as the ACM Multimedia Systems Conference (MMSys ‘19), as illustrated by the following selection of papers:

  • “ERUDITE: a Deep Neural Network for Optimal Tuning of Adaptive Video Streaming Controllers” by De Cicco, L., Cilli, G., & Mascolo, S.
  • “An SDN-Based Device-Aware Live Video Service For Inter-Domain Adaptive Bitrate Streaming” by Khalid, A., Zahran, H., & Sreenan, C.J.
  • “Quality-aware Strategies for Optimizing ABR Video Streaming QoE and Reducing Data Usage” by Qin, Y., Hao, S., Pattipati, K., Qian, F., Sen, S., Wang, B., & Yue, C.
  • “Evaluation of Shared Resource Allocation using SAND for Adaptive Bitrate Streaming” by Pham, S., Heeren, P., Silhavy, D., & Arbanowski, S.
  • “Requet: Real-Time QoE Detection for Encrypted YouTube Traffic” by Gutterman, C., Guo, K., Arora, S., Wang, X., Wu, L., Katz-Bassett, E., & Zussman, G.

For the evaluation of systems, proper QoE models are of utmost importance, as they provide a mapping of various parameters to QoE. One of the main research challenges faced by the QoE community is deriving QoE models for various applications and services, whereby ratings collected in subjective user studies are used to model the relationship between tested influence factors and QoE. Below is a selection of papers dealing with this topic from QoMEX 2019, the main scientific venue of the QoE community.

  • “Subjective Assessment of Adaptive Media Playout for Video Streaming” by Pérez, P., García, N., & Villegas, A.
  • “Assessing Texture Dimensions and Video Quality in Motion Pictures using Sensory Evaluation Techniques” by Keller, D., Seybold, T., Skowronek, J., & Raake, A.
  • “Tile-based Streaming of 8K Omnidirectional Video: Subjective and Objective QoE Evaluation” by Schatz, R., Zabrovskiy, A., & Timmerer, C.
  • “SUR-Net: Predicting the Satisfied User Ratio Curve for Image Compression with Deep Learning” by Fan, C., Lin, H., Hosu, V., Zhang, Y., Jiang, Q., Hamzaoui, R., & Saupe, D.
  • “Analysis and Prediction of Video QoE in Wireless Cellular Networks using Machine Learning” by Minovski, D., Åhlund, C., Mitra, K., & Johansson, P.

System-centric QoE

When considering the whole service, the question arises of how to properly evaluate QoE in a systems context, i.e., how to quantify system-centric QoE. The paper [1] provides fundamental relationships for deriving system-centric QoE, which are the basis for this article.

In the QoE community, subjective user studies are conducted to derive relationships between influence factors and QoE. Typically, the results of these studies are reported in terms of Mean Opinion Scores (MOS). However, MOS results mask user diversity, which manifests itself in specific distributions of user scores for particular test conditions. In a systems context, QoE is therefore better represented as a random variable Q|t for a fixed test condition t. Such models are commonly exploited by service/network providers to derive various QoE metrics [2] in their system, such as the expected QoE or the percentage of users rating above a certain threshold (the Good-or-Better ratio, GoB).
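As a small illustration, the following sketch shows how MOS and GoB follow directly from the rating distribution of Q|t for a single test condition. The rating histogram is purely invented for illustration and is not taken from any of the cited studies.

    import numpy as np

    # Hypothetical ACR ratings (1..5) for one fixed test condition t -- invented numbers.
    scores = np.array([1, 2, 3, 4, 5])
    counts = np.array([5, 10, 25, 40, 20])      # how many users gave each rating
    p = counts / counts.sum()                   # empirical distribution of Q|t

    mos = np.sum(scores * p)                    # E[Q|t], the MOS for condition t
    gob = p[scores >= 4].sum()                  # P(Q|t >= 4), the GoB ratio for t

    print(f"MOS(t) = {mos:.2f}, GoB(t) = {gob:.2f}")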

Across the whole service, users will experience different performance levels, measured by, e.g., response times or throughput, which depend on the system’s (and services’) configuration and implementation. In turn, this leads to users experiencing different quality levels. As an example, we consider the response time of a system offering a certain web service, such as access to a static web site. In this case, the system’s performance can be represented by a random variable R for the response time. In the systems community, research aims at deriving such performance distributions, i.e., the distribution of R.

The user-centric evaluation of the system combines the system’s perspective and the QoE perspective, as illustrated in the figure below. We consider service/network providers interested in deriving various QoE metrics in their system, given (a) the system’s performance and (b) QoE models available from user studies. The main questions to be answered are: how can (a) user rating distributions obtained from subjective studies be combined with (b) system performance condition distributions to obtain the QoE distribution actually observed in the system? And how can various QoE metrics of interest in the system be derived from it?


System centric QoE – Merging the system and the QoE perspectives

Model of System-centric QoE

A service provider is interested in the QoE distribution Q in the system, which includes the following stochastic components: 1) the system performance condition t (i.e., the response time in our example), and 2) the user diversity Q|t. This system-centric QoE distribution allows us to derive various QoE metrics, such as the expected QoE or the expected GoB in the system.

Some basic mathematical transformations allow us to derive the expected system-centric QoE E[Q], as shown below. As a result, we find that the expected system QoE is equal to the expected Mean Opinion Score (MOS) in the system. Hence, for deriving the system QoE, it suffices to measure the response time distribution R and to have a proper QoS-to-MOS mapping function f(t) obtained from subjective studies. From the subjective studies, we obtain the MOS mapping function for a response time t, f(t)=E[Q|t]. The system QoE then follows as E[Q] = E[f(R)] = E[M], with M = f(R) denoting the MOS in the system. Note: the distribution of the MOS M in the system only allows deriving the expected MOS, i.e., the expected system-centric QoE.
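For completeness, this step can be written compactly using the law of total expectation (notation as introduced in the text; this is simply a restatement of the relationship reported in [1]):

    \begin{align*}
      E[Q] &= E\big[\, E[Q \mid R] \,\big]   && \text{(law of total expectation)} \\
           &= E\big[ f(R) \big]              && \text{(since } f(t) = E[Q \mid t] \text{)} \\
           &= E[M],                          && \text{with } M = f(R).
    \end{align*}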


Expected system QoE E[Q] in the system is equal to the expected MOS

Let us consider another system-centric QoE metric, the GoB ratio. On a typical 5-point Absolute Category Rating (ACR) scale (1: bad quality, 5: excellent quality), the system-centric GoB is defined as GoB[Q]=P(Q>=4). We find that it is not possible to use the MOS mapping function f and the MOS distribution M=f(R) to derive GoB[Q] in the system. Instead, it is necessary to use the corresponding QoS-to-GoB mapping function g. This mapping function g can be derived from the same subjective studies as the MOS mapping function; it maps the response time t (tested in the subjective experiment) to the ratio of users rating “good or better” QoE, i.e., g(t)=P(Q|t >= 4). We may thus derive, in a similar way, GoB[Q]=E[g(R)]: in the system, the GoB ratio is the expected value of the response times R mapped through g. Similar observations lead to analogous results for other QoE metrics, such as quantiles or variances (see [1]).
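To make the difference tangible, here is a minimal Monte Carlo sketch. The exponential response-time distribution and the mapping functions f and g are purely illustrative assumptions (not taken from [1]); the point is only that thresholding the MOS distribution does not reproduce GoB[Q].

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical system performance: exponentially distributed response times (seconds).
    R = rng.exponential(scale=2.0, size=100_000)

    # Illustrative QoS-to-MOS mapping f(t) on a 5-point scale (assumption, not from [1]).
    def f(t):
        return np.clip(1.0 + 4.0 * np.exp(-t / 3.0), 1.0, 5.0)

    # Illustrative QoS-to-GoB mapping g(t) = P(Q|t >= 4), also an assumption.
    def g(t):
        return np.exp(-t / 2.0)

    M = f(R)                                    # MOS distribution in the system

    expected_qoe = M.mean()                     # E[Q] = E[f(R)] = E[M]
    gob_correct  = g(R).mean()                  # GoB[Q] = E[g(R)]
    gob_from_mos = (M >= 4.0).mean()            # naive thresholding of MOS -- generally wrong

    print(f"E[Q]              = {expected_qoe:.3f}")
    print(f"GoB via g(R)      = {gob_correct:.3f}")
    print(f"GoB via MOS >= 4  = {gob_from_mos:.3f}  (not the actual GoB)")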

Conclusions

The reported fundamental relationships provide an important link between the QoE community and the systems community. If researchers conducting subjective user studies provide dedicated QoS-to-QoE mapping functions for the QoE metrics of interest (e.g., MOS or GoB), this is enough to derive the corresponding QoE metrics from a system’s perspective. This holds for any QoS (e.g., response time) distribution in the system, as long as the corresponding QoS values are covered by the reported QoE models. We therefore encourage QoE researchers to report not only MOS mappings, but the entire rating distributions from the conducted subjective studies. As an alternative, researchers may report QoE metrics and corresponding mapping functions beyond those relying on MOS.

We draw the attention of the systems community to the fact that the actual QoE distribution in a system is not (necessarily) equal to the MOS distribution in the system (see [1] for numerical examples). Simply applying MOS mapping functions and then using the observed MOS distribution to derive other QoE metrics like GoB is not adequate. The current systems literature, however, indicates a clear lack of common understanding of the implications of using MOS distributions rather than actual QoE distributions.

References

[1] Hoßfeld, T., Heegaard, P.E., Skorin-Kapov, L., & Varela, M. (2019). Fundamental Relationships for Deriving QoE in Systems. 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX). IEEE.

[2] Hoßfeld, T., Heegaard, P. E., Varela, M., & Möller, S. (2016). QoE beyond the MOS: an in-depth look at QoE via better metrics and their relation to MOS. Quality and User Experience, 1(1), 2.

Authors

  • Tobias Hoßfeld (University of Würzburg, Germany) is heading the chair of communication networks.
  • Poul E. Heegaard (NTNU – Norwegian University of Science and Technology) is heading the Networking Research Group.
  • Lea Skorin-Kapov (University of Zagreb, Faculty of Electrical Engineering and Computing, Croatia) is heading the Multimedia Quality of Experience Research Lab.
  • Martin Varela is working in the analytics team at callstats.io focusing on understanding and monitoring QoE for WebRTC services.

Multidisciplinary Column: An Interview with Max Mühlhäuser

 


Could you tell us a bit about your background, and what the road to your current position was?

Well, this road is marked by wonderful people who inspired me and sparked my interest in the research fields I pursued. In addition, it is marked by two of my major deficiencies: I cannot stop investigating the role of my research in the larger context of systems and disciplines, and I have a strong desire to see “inventions” by researchers make their way into practice, i.e., turn into “innovations”. The first of these deficiencies led to the unusually broad research interests of my lab and myself, and the second one made me spend a substantial part of my career conceptualizing and leading technology transfer organizations, for the most part industry-funded ones.

More precisely, I started to cooperate with Digital Equipment Corp. (DEC) already during my Diploma thesis. DEC was then the second-largest computer manufacturer and the spearhead of the effort to build affordable “computers for every engineering group”. My boss, the late Professor Krüger, gave me a lot of freedom, so I was able to turn the research cooperation into DEC’s first funded European research project and later into their first research center in Europe, conceived as a campus-based organization that worked very closely with academia. I am proud to say that I was allowed to conceptualize this academia-industry cooperation and that it was later copied, often with my help and consultancy, many times across the globe by several companies and governments. I acted as the founding director of the first such center, but at that time I was already determined to follow the academic career path. At the age of 32, I was appointed professor at the University of Kaiserslautern. Over the years, I was offered positions at prestigious universities in Canada, France, and the Netherlands, and I accepted positions in Austria and Germany (Karlsruhe, Darmstadt). My sabbaticals led me to Australia, France and Canada, and for the most part to California (San Diego and four times Palo Alto). In retrospect, it was exciting to start at a new academic position every couple of years in the beginning, but it was also exciting to “finally settle” in Darmstadt and to build the strengths and connections there that were necessary to drive even larger cooperative projects than before.

The Telecooperation Lab embraces many different disciplines. Celebrating its 20th birthday next year, how did these disciplines evolve over the years?

It started with my excitement for distributed systems, based on solid knowledge about computer networks. At the time (the early 1980s), little more than point-to-point communication between file transfer or e-mail agents existed, and neither client-server nor multi-party systems were common. My early interest in this field concerned software engineering for distributed systems, ranging from design and specification support via programming and simulation to debugging and testing. Soon, multimedia became feasible due to advancements in computer hardware and in peripherals: think of the late laser disc, a clumsy predecessor of today’s DVDs and BDs. Multimedia grabbed my immediate attention since numerous problems arose from the interest in enabling it in a distributed manner. Almost at the same time, e-learning became my favorite application field since I saw the great potential of distributed multimedia for this domain, given the challenges of global education and of the knowledge society. I believe that technology has come a long way with respect to e-learning, but we are still far from mastering the challenges of technology-supported education and knowledge work.

Soon came the time when computers left the desk and became ubiquitous. From my experience in multimedia and e-learning, it was obvious to me that human-computer interaction would be a key to the success of ubiquitous computing. Simply extrapolating the keyboard-mouse-monitor interaction paradigm to a future where tens, hundreds, or thousands of computers would surround an individual – what a nightmare! This threat of a dystopia made us work on implicit and tangible interaction, hybrid cyber-physical knowledge work, novel mobile and workspace interaction, augmented and virtual reality, and custom 3D-printed interaction – HCI became our “new multimedia”.

Regarding application domains, our research in supporting the knowledge society evolved towards supporting ‘smart environments and spaces’, a natural consequence of the evolution of our core research towards networked ubiquitous computers. My continued interest in turning inventions into innovations made us work on urgent problems of industry – mainly revolving around business processes – and on computers that expect the unexpected: emergencies and disasters. Both these domains were a nice fit since they could benefit from appropriate smart spaces. Looking at smart spaces of ever larger scale, we naturally hit the challenge of supporting smart cities and critical infrastructures.

Finally, a bit more than ten years ago, our ubiquitous computing research made us encounter and realize the “ubiquity” of related cybersecurity threats, in particular threats to privacy, the need for appropriate trustworthiness estimation, and the detection of networked attacks. These cybersecurity research activities were, like those in HCI, natural consequences of my aforementioned deficiency: my desire to take a holistic look at systems – in my case, ubiquitous computing systems.

Lastly, the fact that we adapt, apply and sometimes advance machine learning concepts in our research is nothing but a natural consequence of the utility of those concepts for our purposes.

How would you describe the interrelationship between those disciplines? Do these benefit from cross-fertilization effects and if so, how?

In my answer to your last question, I unwittingly used the word “natural” several times. This already shows that research on ubiquitous computing and smart spaces with a holistic slant almost inevitably leads you to look at the different aspects we investigate. These aspects just happen to concern different research disciplines within computer science. The starting point is the fact that ubiquitous computing devices are much less general-purpose computers than dedicated components. Networking and distributed systems support are therefore a prerequisite for orchestrating these dedicated skills, forming what can be called a truly smart space. Such spaces are usually meant to assist humans, so that multimedia – conveying “humane” information representations – and HCI – for interacting with many cooperating dedicated components – are indispensable. Next, how can a smart space assist a human if it is subject to cyber-vulnerabilities? Instead, it has to enforce its users’ concerns with respect to privacy, trust, and intended behavior. Finally, true smartness is by nature bound to adopting and adapting best-of-breed AI techniques.

You also asked about cross-fertilizing effects. Let me share just three of the many examples in this respect. (i) Our AI-related work cross-fertilized our cyberattack defense. (ii) In the other direction, the AI work introduced new challenges in distributed and networked systems, driving our research on edge computing forward. (iii) New requirements are added to this edge computing research by HCI, since we want to support collaborative AR applications at large, i.e., city-wide, scale.

Moreover, cross-fertilization goes beyond the research fields of computer science that we integrate in my own lab. As you know, I was and am heading highly interdisciplinary doctoral schools, formerly on e-learning and now on privacy and trust for mobile users. When you work with researchers from sociology, law, economics, and psychology on topics like privacy-protecting smartphones, you first consider these topics as pertaining to computer science. Soon, you realize that the other disciplines dealt with issues like privacy and trust long before computers existed. Not only can you learn a lot from the deep and concise findings brought forth by these disciplines over decades or centuries, but you can also quickly establish a very fruitful cooperation with researchers from these disciplines who address the new challenges of mobile and ubiquitous computing from their perspective. I am convinced that the unique role of Xerox PARC in the history of computer science, with so many of the most fundamental innovations originating there, is mainly a consequence of their highly interdisciplinary approaches, combining the “science of computers” with the “sciences concerned with humans”.

Please tell us about the main challenges you faced when uniting such diverse topics under the Telecooperation Lab’s multidisciplinary umbrella.

The major challenge lies in a balancing act for each PhD thesis and researcher. On the one hand, the work must be strictly anchored in a narrow academic field; as a young researcher, you are lucky if you can make a bit of a name for yourself in a single narrow community, which is a prerequisite for any further academic career steps for many reasons. Trying to get rooted in more than one community during a PhD would be what I call academic suicide. The other side of the balancing act, for us, is the challenge of keeping that narrow and focused PhD well connected to the multi-area context of my lab – and, for the members of the doctoral schools, even connected to the respective multi-disciplinary context. While this second side is not a prerequisite for a PhD, it is an inexhaustible source of both new challenges for, and new approaches to, the respective narrow PhD fields. In fact, reaching out to other fields while mastering your own field costs some additional time; in my experience, however, this additional time is easily recouped in the search for the original scientific contributions that will earn you a PhD. The reason is that the cross-fertilization from a multi-area or even multi-disciplinary setting will lead you to original contributions much faster, due to a fresh look at both challenges and approaches.

When it comes to postdoctoral researchers, things are a bit different, since they are already rooted in a field. This means that they can reach out a bit further to other areas and disciplines, thereby creating a unique little research domain in which they can make a name for themselves for their further career. My aim for my postdocs is to help them attain a status where, when I mention their name in a pertinent academic circle, my colleagues would say “oh, I know, that’s the guy who is working on XYZ”, with XYZ being a concise subdomain of research which that postdoc was instrumental in shaping.

The Telecooperation Lab is part of CRISP, the National Research Center for Applied Cybersecurity in Germany, which embraces many disciplines as well. Can you give us some insights into multidisciplinarity in such an environment?

Let me start by explaining that we started the first large cybersecurity research center in Darmstadt more than ten years ago; CRISP in its current form as a national center has only recently come into existence. By the way, CRISP will have to be renamed again for legal reasons (sigh!). Therefore, let me address our cybersecurity research in general. This research involves a very broad spectrum of disciplines, from physicists who address quantum-related aspects to psychologists who investigate usable security and mental models. The most fruitful cooperations always concern areas that establish a “mutual benefits and challenges” relationship with the computer science side of cybersecurity. Two examples that come to mind are law and economics. Computer science solutions to security and privacy always have limits. For instance, cryptographic solutions are always linked to trust at their boundaries (cf. trusted certificate authorities, trusted implementations of theoretically “proven-secure” protocols, trust in the absence of insider threats, etc.). At such boundaries, law must punish what technology cannot guarantee, otherwise the systems remain insecure. In the reverse direction, new technical possibilities and solutions must be reflected in law. A prominent example is the power of AI: privacy law, such as the European Union’s GDPR, holds data-processing organizations liable if they process personally identifiable information, PII for short. If data is not considered to be PII, it can be released. Now what if, three years later, a novel AI algorithm can link that data to some background data and infer PII from it? Privacy law needs a considerable update due to these new technical possibilities. I could talk about these mutual benefits and challenges on and on, but let me just quickly mention one more example from economics: if technology comes up with new privacy-preserving schemes, then these schemes may open up new opportunities for privacy-respecting services. In order for such services to succeed in the market, we need to learn about possible corresponding business models. This kind of economics research may in turn lead to new challenges for technical approaches, and so on. Such “cycles of innovation” across different disciplines are among the most exciting facets of interdisciplinary research.

Could you name a grand challenge of multidisciplinary research in the Multimedia community?

Oh, I think I have quite a decided opinion on this one! We clearly live in the era of the fusion of bits and atoms – and this metaphor is of course just one way to characterize what is going on. Firstly, in the cyber-physical society that we are currently creating, the digital components are becoming the “brains” of complex real-world systems such as the transport system, energy grids, industrial production, etc. This development already creates significant challenges for our future society, but beyond this trend, and directly related to multimedia, there is an even more striking development: we increasingly feed the human senses by means of digitally created or processed signals – and hence, basically, by means of multimedia. TV and telephone, social media and Web-based information, Skype conversations and meetings, you name it: our perception of objects, spaces, and of our conversation partners – in other words, of the physical world – is conveyed, augmented, altered, and filtered by means of computers and computer networks. Now, you will ask what I consider the challenge in this development, which has been going on for decades. Consider that this field is currently jumping forward due to AI and other advancements: it is the challenge for interdisciplinary multimedia research to properly preserve the distinction between “real” and “imaginary” in all cases where we would or should preserve it. To cite a field that is only marginally concerned here, let me mention games: in games, it is – mostly – desired to blur the distinction between the real and the virtual. However, if you think of fake news or of highly persuasive social media campaigns in governmental elections, you get an idea of what I mean. The challenge here is highly multidisciplinary: for instance, many computer science areas already have to come together in order to check where in the media processing chain we can intervene in order to keep a handle on the real-versus-virtual distinction. Way beyond that, we need many disciplines to work hand in hand in order to figure out what we want and how we can achieve it. We have to recognize that many long-existing trends are on the verge of jumping forward to an unprecedented level of perfection. We must figure out what society needs and wants. It is reckless to leave this development to economic or even malicious forces or to tech nerds who invent their own ethics. The examples are endless; let me cite a few in addition to those mentioned above, which highlighted fake news and manipulative election campaigns.

Machine learning experts may call me paranoid, pointing out that detecting manipulated photos or deep-fake videos is still a much simpler machine learning task than creating them. While this is true, I fear that it may change in the future. Moreover, alluding to the multidisciplinary challenges mentioned, let me remind you that we currently don’t have processes in place that would sufficiently check content for authenticity in a systematic way.

As another example, humans are told they are “valued customers”, but they have long been considered consumers at best. More recently, they are being downgraded to mass objects in which purchase desires are first created and then directed – by sophisticated algorithms and with ever more convincing multimedia content. Meanwhile, in the background, price discrimination is rising to new levels of sophistication. In a different field, questionable political powers are increasingly capable of destabilizing democracies from a safe seat across the Internet, using curated and increasingly machine-created influential media.

As the next big wave, we are witnessing a giants’ race among global IT players for the crown in the augmented and virtual reality markets. What is still a niche area may become widespread technology tomorrow – consider that the first successful smartphone was introduced only a little more than a decade ago and that meanwhile the majority of the world’s population uses smartphones to access the Internet. A similar success story may lie ahead for AR/VR: at the latest when a generation grows up wearing AR contact lenses, noise-cancelling earplugs and haptics-augmented clothes, reality will no longer be threatened by fake information; rather, digitally created, imaginary content will be reality, rendering the question “what is real?” obsolete. Of course, the list of technologies and application domains mentioned here is by far non-exhaustive.

The problem is that all these trends appear to be evolutionary, although they are in fact disruptive. Marketing influenced customers centuries ago, fake news has existed even longer, and the movie industry has always had a leading role in imaginary technology, from chroma keying to the most advanced animation techniques. Therefore, the new and upcoming AI-powered multimedia technology is not (yet) recognized as disruptive and hence as a considerable threat to the fundamental rules of our society. This is a key reason why I consider this field a grand interdisciplinary research challenge. We definitely need far more than technology solutions. At the outset, we need to come to grips with appropriate ethical and socio-political norms. To what extent do we want to keep and protect the governing rules of society and humankind? Which changes do we want, and which ones not? What does all that mean in terms of governing rules for AI-powered multimedia, for the merging of the real and the virtual? Apart from basic research, we need a participatory approach that involves society in general and the rising generations in particular. Since we cannot expect these fundamental societal processes to lead to a final decision, we have to advance the other research challenges in parallel. For instance, we need a better understanding of the social implications and psychological factors related to the merging of the real and the virtual. Technology-related research must be intertwined with these efforts; as to the technology fields concerned, multimedia research must go hand in hand with others like AI, cybersecurity, privacy, etc. – the selection depends on the particular questions addressed. This research must be further intertwined with human-related fields such as law: laws must again regulate what technology can’t solve, and reflect what technology can achieve for good or evil. In all this, I have not yet mentioned further related issues such as biometric access control: as we try to make access control more user-friendly, we rely on biometric data, most of which are variants of multimedia, namely speech, face or iris photos, gait and others. The difference between real and virtual remains important here, and we can expect enormous malicious efforts to blur it. You see, there really is a multidisciplinary grand challenge for multimedia.

How and in what form do you feel we as academics can be most impactful?

During the first half of my career, computer science was still in that wonderful gold diggers’ era: if you had a good idea and just decent skills to convey it to your academic peers, you could count on that idea being heard, valued, and – if it was socially and economically viable – realized. Since then, we have moved to a state in which good research results are not even half the story. Many seemingly marginal factors drive innovation today. No wonder we have reached a point at which many industry players think that innovation should be driven by the company’s product groups in a closed loop with customers, or by startups that can be acquired if successful, or – for the small part that requires long-term research – by a few top research institutions. I am confident that this opinion will be replaced by a new craze among CEOs in a few years. Meanwhile, academics should do their homework in three ways. (a) They should look for the true kernel in the current anti-academic trend and improve academic research accordingly. (b) They should orient their research towards the unique strengths of academia, like the possibility to carry out true interdisciplinary research at universities. (c) They should tune their role, their words and deeds to the much-increased societal responsibilities highlighted above.

Academics from computer science trigger confusion in, and the reshaping of, our society to an ever greater extent; it is time for them to live up to their responsibility.


Bios

Prof. Dr. Max Mühlhäuser is head of the Telecooperation Lab at Technische Universität Darmstadt, Informatics Dept. His Lab conducts research on smart ubiquitous computing environments for the ‘pervasive Future Internet’ in three research fields: middleware and large network infrastructures, novel multimodal interaction concepts, and human protection in ubiquitous computing (privacy, trust, & civil security). He heads or co-supervises various multilateral projects, e.g., on the Internet-of-Services, smart products, ad-hoc and sensor networks, and civil security; these projects are funded by the National Funding Agency DFG, the EU, German ministries, and industry. Max is heading the doctoral school Privacy and Trust for Mobile Users and serves as deputy speaker of the collaborative research center MAKI on the Future Internet. Max has also led several university wide programs that fostered E-Learning research and application. In his career, Max put a particular emphasis on technology transfer, e.g., as the founder and mentor of several campus-based industrial research centers.

Max has over 30 years of experience in research and teaching in areas related to Ubiquitous Computing (UC), Networks, Distributed Multimedia Systems, E-Learning, and Privacy&Trust. He held permanent or visiting professorships at the Universities of Kaiserslautern, Karlsruhe, Linz, Darmstadt, Montréal, Sophia Antipolis (Eurecom), and San Diego (UCSD). In 1993, he founded the TeCO institute (www.teco.edu) in Karlsruhe, Germany, which became one of the pace-makers for Ubiquitous Computing research in Europe. Max regularly publishes in Ubiquitous and Distributed Computing, HCI, Multimedia, E-Learning, and Privacy&Trust conferences and journals and authored or co-authored more than 400 publications. He was and is active in numerous conference program committees, as organizer of several annual conferences, and as member of editorial boards or guest editor for journals like Pervasive Computing, ACM Multimedia, Pervasive and Mobile Computing, Web Engineering, and Distance Learning Technology.

Editor Biographies

Dr. Cynthia C. S. Liem is an Assistant Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. She initiated and co-coordinated the European research project PHENICX (2013-2016), focusing on technological enrichment of symphonic concert recordings with partners such as the Royal Concertgebouw Orchestra. Her research interests consider music and multimedia search and recommendation, and increasingly shift towards making people discover new interests and content which would not trivially be retrieved. Beyond her academic activities, Cynthia gained industrial experience at Bell Labs Netherlands, Philips Research and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships, the Google European Doctoral Fellowship 2010 in Multimedia, and a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach.

 

 

 

Dr. Jochen Huber is a Senior User Experience Researcher at Synaptics. Previously, he was an SUTD-MIT postdoctoral fellow in the Fluid Interfaces Group at MIT Media Lab and the Augmented Human Lab at Singapore University of Technology and Design. He holds a Ph.D. in Computer Science and degrees in both Mathematics (Dipl.-Math.) and Computer Science (Dipl.-Inform.), all from Technische Universität Darmstadt, Germany. Jochen’s work is situated at the intersection of Human-Computer Interaction and Human Augmentation. He designs, implements and studies novel input technology in the areas of mobile, tangible & non-visual interaction, automotive UX and assistive augmentation. He has co-authored over 60 academic publications and regularly serves as program committee member in premier HCI and multimedia conferences. He was program co-chair of ACM TVX 2016 and Augmented Human 2015 and chaired tracks of ACM Multimedia, ACM Creativity and Cognition and the ACM International Conference on Interactive Surfaces and Spaces, as well as numerous workshops at ACM CHI and IUI. Further information can be found on his personal homepage: http://jochenhuber.com

An interview with Professor Pål Halvorsen

Describe your journey into research from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

I remember when I was about 14 years old and had an 8th-grade project where we were to identify what we wanted to do in the future and the road to get there. I had just recently discovered the world of computers, and so I reported several ways to become a computer scientist. After following the identified path to the University of Oslo and graduating with a Bachelor in computer science, my way into research was more by chance, or maybe even by accident. At that time, I spent a lot of time on sports and was not sure what to do for my master thesis. However, I was lucky: I found an interesting topic in the area of system support for multimedia, mainly video. I guess my supervisors liked the work, because they later offered me a PhD position (thanks!) where they brought me deeper into the world of multimedia systems research.

My supervisors then helped me to get an associate professor position at the university (thanks again!). I got to know more colleagues, all inspiring me to continue research in the area of multimedia. After a couple of years performing research as a continuation of my PhD and teaching system related courses, I got an opportunity to join Simula Research Laboratory together with Carsten Griwodz. A bit later, we started our own small research group at Simula, and it is still a great place to be.

I think it is safe to say my path has to a large degree been influenced by some of the great people that I have met. You cannot do everything yourself, and I have been blessed with a lot of very good colleagues and friends. As a PhD student, I was told that after a year I should know more about my topic than my supervisors. It sounded impossible, but after having supervised a number of students myself, I believe it is true! Another friend and colleague also said that he had learned everything he knew from his students. Again, very correct – my students (and colleagues) have taught me a lot (thanks!). Thus, my main take-home message is to find an area that interests you and nice people to work with! You can accomplish a lot as a good team!

Regarding my research interests, I was initially interested in how efficient a computer system could be. I became fascinated by the delivery of continuous media early on, and “system support for multimedia” quickly became my area. After years of reporting an X% improvement of component Y, an interest in the complete end-to-end system arose. I have had a wish to build complete systems. So today, our research group does not only aim to improve individual components but also the entire pipeline in a holistic system – especially in the areas of sports and medicine – where we can see the effects of the systems we deploy.


Pål Halvorsen at the beginning of his career as a computer scientist

Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

Currently, I have several roles. My main position is with SimulaMet, a research center established by Simula Research Laboratory and Oslo Metropolitan University (OsloMet). I also recently moved my main university affiliation to OsloMet, while still having a small adjunct professor position at the University of Oslo. Both my research and teaching activities are related to my previously stated interests, and the combination of universities and a research center is a perfect match for me, enabling a good mix of students and seniors.

I hope to be able to deliver results back into real systems, so that our results are not only published and then forgotten in a dark drawer somewhere. In this respect, we are in contact with several real-life “problem owners”, mainly in sports and medicine. To bring our results beyond research prototypes, we have also spun off both a sports company and a medical company, realizing the vision of having real impact. The fact that we now run our systems for the top soccer leagues in both Norway and Sweden is an example of our aims being fulfilled. Hopefully, we can soon say similar things in the medical scenario – that medical experts are assisted by our research-based systems!

Can you profile your current research, its challenges, opportunities, and implications?

Having the end-to-end view, it is hard to give a short answer. We are trying to optimize both single components and the entire pipeline of components. Thus, we are doing a lot of different things. Our challenges are not only related to a specific requirement or component, but also to its integration into the system as a whole. We also address a number of real-world applications. As you can see, the variety in our research is large.

However, there are also large opportunities in that the systems are researched and developed with real requirements and wishes in mind. Thus, if we succeed, there is a chance that we might actually have some impact. For example, in sports, we have three deployed systems in use.

How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

Together with colleagues at Simula, the University of Oslo and the University of Tromsø, we have been lucky to find some interesting and usable solutions. For example, at the system level, we have solutions (code) included in the Linux kernel, and at the application level, as efficient complete systems providing functionality beyond existing systems, we have running (prototype) systems in both the areas of sports and medicine.


Pål Halvorsen in his office in 2019

Over your distinguished career, what are the top lessons you want to share with the audience?

Well, first, I do not think you can call it “distinguished”. This is your description.

The most important thing for me is to have some fun. You must like what you do, and you must find people you enjoy working with. There are a lot of interesting challenges out there. You must just find yours.

What is the best joke you know?

Hehe, I am so bad at jokes. Every ten years, I might have a catchy comment, but I hardly ever tell jokes.

If you were conducting this interview, what questions would you ask, and what would be your answers?

Haha, I am not a man of many words, so I would probably just stick to the set of questions I was given and hope it would soon be finished 😉

So, maybe this one last question:

Q: Anything to add?

A: No. (Both kept short, since I have to do both the Q and the A.)


Bios

Professor Pål Halvorsen: 

Pål Halvorsen is a chief research scientist at SimulaMet, a professor at Oslo Metropolitan University (OsloMet), an adjunct professor at the University of Oslo, Norway, and the CEO of ForzaSys AS. He received his doctoral degree (Dr.Scient.) in 2001. His research focuses mainly on complete distributed multimedia systems, including operating systems, processing, storage and retrieval, communication and distribution, from a performance and efficiency point of view. He is a member of the IEEE and ACM. More information can be found at http://home.ifi.uio.no/paalh

Pia Helén Smedsrud: 

Pia Helén Smedsrud is a PhD student at Simula Research Laboratory in Oslo, Norway. She has a medical degree from UiO (University of Oslo) and worked as a medical doctor before starting as a research trainee in the field of computer science at Simula. She also has a background in journalism. Her research interests include medical multimedia, clinical implementation and machine learning. Currently, she is doing her PhD at the intersection of informatics and medicine, on machine learning in endoscopy.

Opinion Column: Evolution of Topics in the Multimedia Community

For this edition of the SIGMM Opinion Column, we asked members of the Multimedia community to share their impressions about the shift of scientific topics in the community over the years, namely the evolution of “traditional” and “emerging” Multimedia topics. 

This subject has come up in several conversations over the two years of history of the SIGMM Opinion Column, and we report here a summary of recent and older discussions, held over different channels – our Facebook group, the SIGMM LinkedIn group, and in-person conversations between the column editors and MM researchers – with hopes, fears and opinions around this topic. We want to thank all participants for their precious contribution to these discussions.

Historical Perspective of Topics in ACM MM

This year, ACM Multimedia turns 27. Today, MM is a large premium conference with hundreds of paper submissions every year, organized into 12 different thematic areas that span the wide spectrum of multimedia topics. But back at the beginning of MM’s history, the scale of the topic range was very different.

In the first editions of the conference, a general call for papers encouraged submissions about “technology, tools and techniques for the construction and delivery of high quality, innovative multimedia systems and interfaces”. Already in its 3rd edition, MM featured an Arts and Multimedia program. Starting from 2004, the conference offered three tracks for paper submissions: Content (multimedia analysis, processing, and retrieval), Systems (multimedia networking and system support), and Applications (multimedia tools, end-systems, and applications), plus a “Brave New Topics” track for work-in-progress submissions. Later on, a Human-Centered Multimedia track was added to the program. In 2011, after a conference review, the ACM MM program went beyond the notion of “tracks”, and the concept of areas was introduced to allow the community to “solicit papers from a wide range of timely multimedia-related topics” (see the ACM MM 2011 website). By 2014, the number of areas had grown to 14, including, among others, Music, Speech and Audio Processing in Multimedia, and Social Media and Collective Online Presence. Following a retreat in 2014, starting from 2015 the areas have been grouped into larger “Themes”, the core thematic areas of ACM Multimedia. Since that retreat, no major changes have been introduced in the thematic structure of the conference.

Dynamics of Evolution of Emerging Topics

Emerging topics and less mature works are generally welcome at conferences’ workshops. In our discussions, most members of the community agree that “you’ll see great work there, and very fruitful discussions due to the common focus on the workshop theme”. When emerging topics become more popular, they can be promoted to conference areas, as it happened for the “music, speech and audio” theme. 

It was observed in our community conversations that, while this upgrade to the main conference is great for visibility, being a separate, relatively novel area could lead to isolation: the workload for reviewers specialized in emerging topics could become too high, given that they are also assigned to works in other areas; and the flat acceptance rate across all conference themes could mean that even accepting two submissions from an emerging-topic area would give an “unreasonably” high acceptance rate, thus leading to many good papers (even with three accepts) having to be rejected. Participants in our forums noticed that these dynamics somehow “counteract the ‘Multimedia’ and multidisciplinary nature of the field”, prevent conferences from growing, and eventually hurt emerging topics. One solution proposed to balance this effect is to “maintain a solid specialized reviewer pool (where needed managed by someone from the field), which however would be distributed over relevant MM areas”, rather than forming a new area.

It was also noted that some emerging topics in their early stage would most likely not have an appropriate workshop. Therefore, it is important for the main conference to have places to accept such early works, which makes tracks such as the short paper track or the Brave New Ideas track absolutely crucial for the development of novel topics.

The Near-Future of Multimedia

In multiple occasions, MM community members shared their thoughts about how they would like to see the Multimedia community evolve around new topics.

There are a few topics that emerged in the past and that the community wishes would continue growing; these include interactive Multimedia applications, as well as music-related Multimedia technology, Multimedia in cooking spaces, and arts and Multimedia. It was also pointed out that, although very important for Multimedia applications, topics around compression technology are often given low weight in Multimedia venues, and that MM should encourage submissions in the domain of machine learning concepts applied to compression.

There are also a few areas that are emerging across different sub-communities in computer science and that, according to our community members, we should encourage to grow within the Multimedia field as well. These include work in digital health exploring the power of Multimedia for health care and monitoring, research around applications of Multimedia for good, understanding how the technologies we develop can have a real impact on society, and discussions around the ethics and responsibility of Multimedia technologies, encouraging fair, transparent, inclusive and accountable Multimedia tools.

The Future of Multimedia

The future of MM, according to the participants in the discussion, goes beyond the forms we know today, as new technologies could significantly broaden and shake up the current application paradigm of Multimedia.

The upcoming 5G technology will enable a plethora of applications that are currently severely limited by the lack of bandwidth, from mobile virtual reality to interconnection with objects and, of course, smart cities. To extract meaningful information to be presented to the user, various and highly diverse data streams will need to be treated consistently, and Multimedia researchers will develop the methods, applications, systems and models needed to understand how to properly shape and impact this field. Likewise, this technology will push the limits of what is currently possible in terms of content demand and interaction with connected objects. We will see technologies for hyper-personalization, dynamic user interaction and real-time video personalization. These technologies will be enabled by the study of how film grammar and storytelling work for novel content types like AR, VR, panoramic and 360° video, by research around novel immersive media experiences, and by the design of new media formats with novel consumption paradigms.

Multimedia has a bright future, with new, exciting emerging topics to be discussed and encouraged. Perhaps time for a new retreat or for a conference review?

First Combined ACM SIGMM Strategic Workshop and Summer School in Stellenbosch, South Africa

The first combined ACM SIGMM Strategic Workshop and Summer School will be held in Stellenbosch, South Africa, at the beginning of July 2020.

Rooiplein

First ACM Multimedia Strategic Workshop

The first Multimedia Strategic Workshop follows the successful series of workshops in areas such as information retrieval. The field of multimedia has continued to evolve and develop: collections of images, sounds and videos have become larger, computers have become more powerful, broadband and mobile Internet are widely supported, complex interactive searches can be done on personal computers or mobile devices, and so on. In addition, as large business enterprises find new ways to leverage the data they collect from users, the gap between the types of research conducted in industry and academia has widened, creating tensions over “repeatability” and “public data” in publications. These changes in environment and attitude mean that the time has come for the field to reassess its assumptions, goals, objectives and methodologies. The goal is to bring together researchers in the field to discuss long-term challenges and opportunities within the field.

The participants of the Multimedia Strategic Workshop will be active researchers in the field of Multimedia. The strategic workshop will give these researchers the opportunity to explore long-term issues in the multimedia field, to recognise the challenges on the horizon, to reach consensus on key issues, and to describe them in a resulting report that will be made available to the multimedia research community. The report will stimulate debate, provide research directions to both researchers and graduate students, and also provide funding agencies with data that can be used to coordinate the support for research.

The workshop will be held at the Wallenberg Research Centre at the Stellenbosch Institute for Advanced Study (STIAS). STIAS provides venues and state-of-the-art equipment for up to 300 conference guests at a time, as well as breakaway rooms.

The First ACM Multimedia Summer School on Multimedia

The motivation for the summer school is to build on the success of the Deep Learning Indaba, but to focus on the application of machine learning to the field of Multimedia. We want delegates to be exposed to current research challenges in Multimedia. A secondary goal is to establish and grow the community of African researchers in the field of Multimedia, and to stimulate scientific research and collaboration between African researchers and the international community. The exact topics covered during the summer school will be decided later together with the instructors, but will reflect the current research trends in Multimedia.

The Strategic Workshop will be followed by the Summer School on Multimedia. Having the first summer school co-located with the Strategic Workshop will help to recruit the best possible instructors for the summer school. 

The Summer School on Multimedia will be held at the Faculty of Engineering at Stellenbosch University, which is one of South Africa’s major producers of top-quality engineers. The faculty was established in 1944 and is housed in a large complex of buildings with modern facilities, including lecture halls and electronic classrooms.

Stellenbosch is a university town in South Africa’s Western Cape province. It’s surrounded by the vineyards of the Cape Winelands and the mountainous nature reserves of Jonkershoek and Simonsberg. The town’s oak-shaded streets are lined with cafes, boutiques and art galleries. Cape Dutch architecture gives a sense of South Africa’s Dutch colonial history, as do the Village Museum’s period houses and gardens.

For more information about both events, please refer to the events’ web site (africanmultimedia.acm.org) or contact the organizers.

MPEG Column: 125th MPEG Meeting in Marrakesh, Morocco

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The 125th MPEG meeting concluded on January 18, 2019 in Marrakesh, Morocco with the following topics:

  • Network-Based Media Processing (NBMP) – MPEG promotes NBMP to Committee Draft stage
  • 3DoF+ Visual – MPEG issues Call for Proposals on Immersive 3DoF+ Video Coding Technology
  • MPEG-5 Essential Video Coding (EVC) – MPEG starts work on MPEG-5 Essential Video Coding
  • ISOBMFF – MPEG issues Final Draft International Standard of Conformance and Reference software for formats based on the ISO Base Media File Format (ISOBMFF)
  • MPEG-21 User Description – MPEG finalizes 2nd edition of the MPEG-21 User Description

The corresponding press release of the 125th MPEG meeting can be found here. In this blog post, I’d like to focus on those topics potentially relevant for over-the-top (OTT) streaming, namely NBMP, EVC, and ISOBMFF.

Network-Based Media Processing (NBMP)

The NBMP standard addresses the increasing complexity and sophistication of media services, specifically where the required media processing is offloaded to the cloud/network to keep receiver hardware simple and power consumption low. To this end, the NBMP standard provides a standardized framework that allows content and service providers to describe, deploy, and control media processing for their content in the cloud. It comes with two main functions: (i) an abstraction layer to be deployed on top of existing cloud platforms (plus support for 5G core and edge computing) and (ii) a workflow manager to enable the composition of multiple media processing tasks (i.e., processing incoming media and metadata from a media source and producing processed media streams and metadata that are ready for distribution to a media sink). The NBMP standard has now reached the Committee Draft (CD) stage, and the final milestone is targeted for early 2020.

In particular, a standard like NBMP might become handy in the context of 5G in combination with mobile edge computing (MEC) which allows offloading certain tasks to a cloud environment in close proximity to the end user. For OTT, this could enable lower latency and more content being personalized towards the user’s context conditions and needs, hopefully leading to a better quality and user experience.

For further research aspects, please see one of my previous posts.

MPEG-5 Essential Video Coding (EVC)

MPEG-5 EVC clearly targets the high demand for efficient and cost-effective video coding technologies. Therefore, MPEG commenced work on a new video coding standard that will have two profiles: (i) a royalty-free baseline profile and (ii) a main profile, which adds a small number of additional tools, each of which can, on an individual basis, be either cleanly switched off or switched over to the corresponding baseline tool. Timely publication of licensing terms (if any) is obviously very important for the success of such a standard.

The target coding efficiency for responses to the call for proposals was to be at least as efficient as HEVC. This target was exceeded by approximately 24%, and the development of the MPEG-5 EVC standard is expected to be completed in 2020.

As of today, there is a need to support AVC, HEVC, VP9, and AV1; soon VVC will become important as well. In other words, we already have a multi-codec environment to support, and one might argue that one more codec is probably not a big issue. The main benefit of EVC will be its royalty-free baseline profile, but with AV1 such a codec is already available, and it will be interesting to see how the royalty-free baseline profile of EVC compares to AV1.

For a new video coding format we will witness a plethora of evaluations and comparisons with existing formats (i.e., AVC, HEVC, VP9, AV1, VVC). These evaluations will be mainly based on objective metrics such as PSNR, SSIM, and VMAF. It will also be interesting to see subjective evaluations, specifically targeting OTT use cases (e.g., live and on demand).
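
As a small illustration of what such objective comparisons involve, the sketch below computes PSNR between a reference and a distorted frame using NumPy; SSIM and VMAF require dedicated tooling (e.g., scikit-image or Netflix's VMAF library) and are omitted here. The frames are synthetic dummy data, used only to make the example self-contained.

```python
import numpy as np

def psnr(reference: np.ndarray, distorted: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio (in dB) between two frames of identical size."""
    mse = np.mean((reference.astype(np.float64) - distorted.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * np.log10((max_value ** 2) / mse)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ref = rng.integers(0, 256, size=(720, 1280), dtype=np.uint8)           # dummy luma frame
    noise = rng.integers(-3, 4, size=ref.shape)                            # mild distortion
    dist = np.clip(ref.astype(np.int16) + noise, 0, 255).astype(np.uint8)
    print(f"PSNR: {psnr(ref, dist):.2f} dB")
```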

ISO Base Media File Format (ISOBMFF)

The ISOBMFF (ISO/IEC 14496-12) is used as the basis for many file formats (e.g., MP4) and streaming formats (e.g., DASH, CMAF) and as such has received widespread adoption in both industry and academia. An overview of ISOBMFF is available here. The reference software is now available on GitHub and a plethora of conformance files are available here. In this context, the open source project GPAC is probably the most interesting aspect from a research point of view.
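
To give a flavour of why ISOBMFF is so easy to build tooling around, the sketch below walks the top-level box structure of a file: each box starts with a 32-bit size and a 4-character type, a size of 1 indicates a 64-bit largesize following the type, and a size of 0 means the box extends to the end of the file. This is our own minimal illustration, not code taken from the reference software or GPAC, and the file name is a placeholder.

```python
import struct

def list_top_level_boxes(path: str):
    """Yield (box_type, size) for each top-level box of an ISOBMFF file."""
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            size, box_type = struct.unpack(">I4s", header)
            header_len = 8
            if size == 1:            # 64-bit largesize follows the type field
                size = struct.unpack(">Q", f.read(8))[0]
                header_len = 16
            elif size == 0:          # box extends to the end of the file
                yield box_type.decode("ascii", "replace"), None
                break
            yield box_type.decode("ascii", "replace"), size
            f.seek(size - header_len, 1)   # skip the box payload

if __name__ == "__main__":
    # For a typical MP4 file this prints boxes such as ftyp, moov, and mdat.
    for box, size in list_top_level_boxes("example.mp4"):
        print(box, size)
```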

JPEG Column: 82nd JPEG Meeting in Lisbon, Portugal

The 82nd JPEG meeting was held in Lisbon, Portugal. Highlights of the meeting are progress on JPEG XL, JPEG XS, HTJ2K, JPEG Pleno, JPEG Systems and JPEG reference software.

JPEG has been the most common representation format of digital images for more than 25 years. Other image representation formats, such as JPEG 2000 and more recently JPEG XS, have been standardised by the JPEG committee. Furthermore, JPEG has been extended with new functionalities, like HDR or alpha plane coding with the JPEG XT standard, and more recently with a reference software. Other solutions have also been proposed by different players, with limited success. The JPEG committee decided it was time to create a new working item, named JPEG XL, that aims to develop an image coding standard offering increased quality and flexibility combined with better compression efficiency. The evaluation of the responses to the call for proposals has already confirmed industry interest, and the development of core experiments has now begun. Several functionalities will be considered, such as support for lossless transcoding of images represented with the JPEG standard.

A 2nd workshop on media blockchain technologies was held in Lisbon, collocated with the JPEG meeting. Touradj Ebrahimi and Frederik Temmermans opened the workshop with presentations on relevant JPEG activities such as JPEG Privacy and Security. Thereafter, Zekeriya Erkin made a presentation on blockchain, distributed trust and privacy, and Carlos Serrão presented an overview of the ISO/TC 307 standardization work on blockchain and distributed ledger technologies. The workshop concluded with a panel discussion chaired by Fernando Pereira where the interoperability of blockchain and media technologies was discussed. A 3rd workshop is planned during the 83rd meeting to be held in Geneva, Switzerland on March 20th, 2019.

The 82nd JPEG meeting had the following highlights:

  • The new working item JPEG XL
  • JPEG Pleno
  • JPEG XS
  • HTJ2K
  • JPEG Systems – JUMBF & JPEG 360
  • JPEG reference software

 

The following summarizes various highlights during JPEG’s Lisbon meeting. As always, JPEG welcomes participation from industry and academia in all its standards activities.

JPEG XL

The JPEG Committee launched JPEG XL with the aim of developing a standard for image coding that offers substantially better compression efficiency than existing image formats, along with features desirable for web distribution and efficient compression of high-quality images. Subjective tests conducted by two independent research laboratories were presented at the 82nd meeting in Lisbon and indicate promising results that compare favorably with state-of-the-art codecs.

Development software for the JPEG XL verification model is currently being implemented. A series of experiments has also been defined for improving this model; the experiments address new functionalities such as lossless coding and progressive decoding.

JPEG Pleno

The JPEG Committee has three activities in JPEG Pleno: Light Field, Point Cloud, and Holographic image coding.

At the Lisbon meeting, Part 2 of JPEG Pleno Light Field was refined and a Committee Draft (CD) text was prepared. A new round of core experiments targets improved subaperture image prediction quality and scalability functionality.

JPEG Pleno Holography will be hosting a workshop on March 19th, 2019 during the 83rd JPEG meeting in Geneva. The purpose of this workshop is to provide insights into the status of holographic applications such as holographic microscopy and tomography, displays and printing, and to assess their impact on the planned standardization specification. This workshop invites participation from both industry and academia experts. Information on the workshop can be found at https://jpeg.org/items/20190228_pleno_holography_workshop_geneva_announcement.html

JPEG XS

The JPEG Committee is pleased to announce a new milestone of the JPEG XS project, with the Profiles and Buffer Models (JPEG XS ISO/IEC 21122 Part 2) submitted to ISO for immediate publication as International Standard.

This project aims at the standardization of a visually lossless, low-latency and lightweight compression scheme that can be used as a mezzanine codec within any AV market. Among the targeted use cases are video transport over professional video links (SDI, IP, Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and sensor compression (for example in cameras and in the automotive industry). The Core Coding System allows for visually lossless quality at moderate compression rates, scalable end-to-end latency ranging from less than a line to a few lines of the image, and low-complexity real-time implementations in ASIC, FPGA, CPU and GPU. The new part “Profiles and Buffer Models” defines different subsets of coding tools addressing specific application fields and use cases. For more information, interested parties are invited to read the JPEG White Paper on JPEG XS that has been recently published on the JPEG website (https://jpeg.org).

HTJ2K

The JPEG Committee continues its work on ISO/IEC 15444-15 High-Throughput JPEG 2000 (HTJ2K) with the development of conformance codestreams and reference software, improving interoperability and reducing obstacles to implementation.

The HTJ2K block coding algorithm has demonstrated an average tenfold increase in encoding and decoding throughput compared to the block coding algorithm currently defined by JPEG 2000 Part 1. This increase in throughput results in an average coding efficiency loss of 10% or less in comparison to the most efficient modes of the block coding algorithm in JPEG 2000 Part 1, and enables mathematically lossless transcoding to-and-from JPEG 2000 Part 1 codestreams.
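
As a back-of-the-envelope illustration of this trade-off, the short sketch below uses assumed (not measured) numbers for JPEG 2000 Part 1 throughput and bitrate to show what a tenfold throughput gain at up to 10% coding-efficiency loss would mean in practice.

```python
# Assumed, illustrative numbers only; the reported facts are the ~10x throughput
# gain and the <=10% coding-efficiency loss relative to JPEG 2000 Part 1.
j2k_throughput_mpix_per_s = 50.0      # assumed Part 1 block-coding throughput
j2k_bitrate_mbps = 20.0               # assumed bitrate at a given target quality

htj2k_throughput_mpix_per_s = 10 * j2k_throughput_mpix_per_s
htj2k_bitrate_mbps = 1.10 * j2k_bitrate_mbps   # upper bound on the 10% loss

print(f"HTJ2K: ~{htj2k_throughput_mpix_per_s:.0f} Mpix/s at up to {htj2k_bitrate_mbps:.1f} Mbps "
      f"(vs. {j2k_throughput_mpix_per_s:.0f} Mpix/s at {j2k_bitrate_mbps:.1f} Mbps for Part 1)")
```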

JPEG Systems – JUMBF & JPEG 360

At the 82nd JPEG meeting, the DIS ballots were completed, comments were reviewed, and the standard progressed towards FDIS text for the upcoming ballots on “JPEG Universal Metadata Box Format (JUMBF)” as ISO/IEC 19566-5 and “JPEG 360” as ISO/IEC 19566-6. Investigations continued on generalizing the framework to other applications relying on JPEG (ISO/IEC 10918 | ITU-T.81) and JPEG Pleno Light Field.

JPEG reference software

With the JPEG Reference Software reaching FDIS stage, the JPEG Committee reaches an important milestone by extending its specifications with a new part containing reference software. With its FDIS release, two implementations become official references for the most successful standard of the JPEG Committee: the fast and widely deployed libjpeg-turbo code, along with a complete implementation of JPEG from the Committee itself that also covers coding modes previously known only to a few experts.

 

Final Quote

“One of the strengths of the JPEG Committee has been in its ability to identify important trends in imaging technologies and their impact on products and services. I am delighted to see that this effort still continues and the Committee remains attentive to the future,” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission (ISO/IEC JTC 1/SC 29/WG 1), and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JPEG, JPEG 2000, JPEG XR, JPSearch and, more recently, the JPEG XT, JPEG XS, JPEG Systems and JPEG Pleno families of imaging standards.

The JPEG Committee nominally meets four times a year, in different world locations. The 82nd JPEG Meeting was held on 19-25 January 2019 in Lisbon, Portugal. The next, 83rd JPEG Meeting will be held on 16-22 March 2019 in Geneva, Switzerland.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

Future JPEG meetings are planned as follows:

  • No 83, Geneva, Switzerland, March 16 to 22, 2019
  • No 84, Brussels, Belgium, July 13 to 19, 2019

 

Solving Complex Issues through Immersive Narratives — Does QoE Play a Role?

Introduction

Solving complex issues requires a transdisciplinary dialogue and innovative research, including technical and artistic research as well as the digital humanities. We need to support and produce creative practices, and to engage in critical reflection about the social and ethical dimensions of current technology developments. At the core is an understanding that no single discipline, technology, or field can produce knowledge capable of addressing the complexities and crises of the contemporary world. Moreover, we see the arts and humanities as critical tools for understanding this hyper-complex, mediated, and fragmented global reality. As a use case, we consider extreme weather events, natural disasters and the failure of climate change mitigation and adaptation, which are the risks with the highest likelihood of occurrence and largest global impact (World Economic Forum, 2017). Through our project, World of Wild Waters (WoWW), we use immersive narratives and gamification to build a simpler, holistic understanding of the causes and effects of natural hazards, creating immersive user experiences based on real data, realistic scenarios and simulations. The objective is to increase societal preparedness for a multitude of stakeholders. Quality of Experience (QoE) modeling and assessment of immersive media experiences, where we expect active participation, engagement and change to play a key role, are at the heart of the expected impact of these narratives [1].

Here, we present our views on immersion and presence in light of Quality of Experience (QoE). We discuss the technical and creative considerations needed for QoE modeling and assessment of immersive media experiences. Finally, we provide some reflections on QoE as an important building block in immersive narratives in general, and in particular when considering Extended Realities (XR) as an instantiation of digital storytelling.

But what is Immersion and an Immersive Media Experience?

Immersion and immersive media experiences are commonly used terms in industry and academia today to describe new digital media. However, there is a gap between how the two worlds define the term, which can lead to confusion. This gap needs to be filled for XR to become a success and finally reach the masses, rather than simply vanish as it has done so many times since the invention of VR in 1962 by Morton Heilig (The Sensorama, or «Experience Theatre»). Immersion, thus far, can be plainly put as submersion in a medium (representational, fictional or simulated). It refers to a sense of belief, or the suspension of disbelief, while describing the experience/event of being surrounded by an environment (artificial, mental, etc.). This view is contrasted by a data-oriented view, often used by technophiles, which regards immersion as a technological feat that ensures multimodal sensory input to the user [2]. This is the objective description, which views immersion as a quantifiable property afforded or offered by the system (computer and head-mounted display (HMD), in this case).

Developing immersion along these lines risks favoring the typology of spatial immersion while alienating the rest (phenomenological, narrative, tactical, pleasure, etc.). This can be seen in recent VR applications that push high-fidelity, low-latency, precision-tracking products aiming to reproduce the exactitude of the sensory information (visual, auditory, haptic) available in the real world, to make the experience as ‘real’ as possible – a sense of realness that is not necessarily immersive [3].

Another closely related phenomenon is that of presence, shortened from its original 1980s form of telepresence [3]. It is a core phenomenon for immersive technologies, describing an engagement via technology in which one feels like oneself even though physically removed. This definition was later appropriated for simulated/virtual environments, where it was described as a “feeling of being transported” into the synthetic/artificial space of a simulated environment. It is for this reason that presence, a subjective sensation, is most often associated with spatial immersion. A renewed interest in presence research has invited fresh insights into conceptualizing presence.

Based on the technical or system approach towards immersion, we can refer to immersive media experiences through the definitions given in Figure 1.

Figure 1. Definitions of current immersive media experiences

Much of the media considered today still consists of audio and visual presentations, but now enriched by new functionality such as 360° viewing, 3D and interactivity. The ultimate goal is to create immersive media experiences by digitally creating real-world presence, using available media technology and optimizing the experience as perceived by the participant [4].

Immersive Narratives for Solving Complex Issues

The optimized immersive experience can be used in various domains to help solve complex issues by narration or gamification. Through World of Wild Waters (WoWW), we focus on immersive narration and gamification of natural hazards. The project focuses on the implications of immersive storytelling for disaster management by depicting extreme weather events and natural disasters. Immersive media experiences can provide XR solutions for natural hazards by simulating real-time data and giving people a hands-on experience of how it feels to face an unexpected disaster. Immersive narratives can thus allow people to be better prepared by experiencing the effects of different emergency scenarios from a safe environment. However, QoE modeling and assessment for serious immersive narratives is a challenge, and one needs to carefully combine immersion, media technology and end-user experience to solve such complex issues.

Does QoE Play a Role?

The current state of the art (SOTA) in immersive narratives, from a technology point of view, is the implementation of virtual experiences through Virtual Reality (VR), Augmented Reality (AR) and Mixed Reality (MR), commonly referred to as eXtended Realities (XR). Discussing the SOTA of XR is challenging, as it is spread across a large number of companies and sectors in the form of fragmented, domain-specific products and services, and is changing from quarter to quarter. The definitions of immersion and presence differ; however, it is important to raise awareness of their generic building blocks to start a discussion on the way forward. The most important building blocks are the use of digital storytelling in the creation of the experience and the quality of the final experience as perceived by the participants.

XR relies heavily on immersive narratives, stories in which the experience surrounds you, providing a sense of realness as well as a sense of being there. Following Mel Slater’s platform for VR [5], immersion consists of three parts:

  1. the concrete technical system for production,
  2. the illusions we are addressing and
  3. the resulting experience as interpreted by the participant.

The illusions part of XR plays on providing a sense of being in a different place, which through high-quality media makes us perceive that what happens there is really happening (plausibility). Providing a high-quality experience eventually makes us feel like participants in the story (agency). Finally, by feeling that we are really participating in the experience, we gain body ownership in this place. To achieve these high-quality future media technology experiences, we need new work processes and workflows for immersive experiences, requiring a vibrant connection between artists, innovators and technologists utilizing creative narratives and interactivity. To validate their quality and usefulness, and ultimately their business success, we need to focus on research and innovation in quality modeling and assessment, making it possible for creators to iteratively improve the performance of their XR experiences.

A transdisciplinary approach to immersive media experiences amplifies the relevance of content. Current QoE models predominantly treat content as a system influence factor, which allows for evaluations limited to its format, i.e., its nature (e.g., image, sound, motion, speech) and type (e.g., analog or digital). Such a definition seems insufficient given how important content is to the overall perceptual quality of such media. With these technologies becoming mainstream, there is a global push for engaging content. Successful XR applications require strong content to generate, and retain, interest. One-time adventures, such as rollercoaster rides, no longer suffice. As users mature along with the technologies, the novelty factor of such media diminishes and so does the initial preoccupation with interactivity and simulations. Immersive experiences must rely on content for a lasting impression.

However, the social impact of this media-saturated reality is yet to be completely understood. QoE modeling and assessment, as well as business models, are evolving as we see more and more experiences being used commercially. However, there is still a lot of work to be done in the legal, ethical, political, health and cultural domains.

Conclusion

Immersive media experiences make a significant impact on the use and experience of new digital media through new and innovative approaches. These services are capable of establishing advanced, transferable and sustainable best practices, specifically in art and technology, for playful and liveable human-centered experiences that solve complex problems. Further, the ubiquity of such media is changing our understanding of mediums, as they form liveable environments that envelop our lives as a whole. The effects of these experiences challenge our traditional concepts of liveability, which is why it is imperative to approach them as a paradigmatic shift in the civilizational project. The path taken should merge work on the technical aspects (systems) with the creative considerations (content).

Reference and Bibliography Entries

[1] Le Callet, P., Möller, S. and Perkis, A., 2013. Qualinet White Paper on Definitions of Quality of Experience (2012). European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003). Version 1.2. Mar-2013. [URL]

[2] Perrin, A.F.N.M., Xu, H., Kroupi, E., Řeřábek, M. and Ebrahimi, T., 2015, October. Multimodal dataset for assessment of quality of experience in immersive multimedia. In Proceedings of the 23rd ACM international conference on Multimedia (pp. 1007-1010). ACM. [URL]

[3] Normand, V., Babski, C., Benford, S., Bullock, A., Carion, S., Chrysanthou, Y., Farcet, N., Frécon, E., Harvey, J., Kuijpers, N. and Magnenat-Thalmann, N., 1999. The COVEN project: Exploring applicative, technical, and usage dimensions of collaborative virtual environments. Presence: Teleoperators & Virtual Environments, 8(2), pp.218-236. [URL]

[4] Perkis, A. and Hameed, A., 2018. Immersive media experiences – what do we need to move forward? SMPTE 2018, Westin Bonaventure Hotel & Suites, Los Angeles, California, pp. 1-12. doi: 10.5594/M001846

[5] Slater, M. and Sanchez-Vives, M.V., 2016. Enhancing Our Lives with Immersive Virtual Reality. Frontiers in Robotics and AI. [URL]

Note from the Editors:

Quality of Experience (QoE) in the context of immersive media applications and services is gaining momentum as such apps/services become available. It requires a deep, integrated understanding of all involved aspects and corresponding scientific evaluations along the various dimensions (including, but not limited to, reproducibility). The interested reader is therefore referred to QUALINET and QoMEX, specifically QoMEX 2019, which play a key role in this exciting application domain.

Report from ACM ICMR 2018 – by Cathal Gurrin

 

Multimedia computing, indexing, and retrieval continue to be among the most exciting and fastest-growing research areas in the field of multimedia technology. ACM ICMR is the premier international conference that brings together experts and practitioners in the field for an annual conference. The eighth ACM International Conference on Multimedia Retrieval (ACM ICMR 2018) took place from June 11th to 14th, 2018 in Yokohama, Japan’s second most populous city. ACM ICMR 2018 featured a diverse range of activities including keynote talks, demonstrations, special sessions and related workshops, a panel, a doctoral symposium, industrial talks and tutorials, alongside regular conference papers in oral and poster sessions. The full ICMR 2018 schedule can be found on the ICMR 2018 website <http://www.icmr2018.org/>.

The organisers of ACM ICMR 2018 placed a large emphasis on generating a high-quality programme; in 2018, ICMR received 179 submissions to the main conference, with 21 accepted for oral presentation and 23 for poster presentation. A number of key themes emerged from the published papers at the conference: deep neural networks for content annotation; multimodal event detection and summarisation; novel multimedia applications; multimodal indexing and retrieval; and video retrieval from regular & social media sources. In addition, a strong emphasis on the user (in terms of end-user applications and user-predictive models) was noticeable throughout the ICMR 2018 programme. Indeed, the user theme was central to many components of the conference, from the panel discussion to the keynotes, workshops and special sessions.

One of the most memorable elements of ICMR 2018 was a panel discussion on the ‘Top Five Problems in Multimedia Retrieval’ http://www.icmr2018.org/program_panel.html. The panel was composed of leading figures in the multimedia retrieval space: Tat-Seng Chua (National University of Singapore), Michael Houle (National Institute of Informatics), Ramesh Jain (University of California, Irvine), Nicu Sebe (University of Trento) and Rainer Lienhart (University of Augsburg). An engaging panel discussion was facilitated by Chong-Wah Ngo (City University of Hong Kong) and Vincent Oria (New Jersey Institute of Technology). The common theme was that multimedia retrieval is a hard challenge and that there are a number of fundamental topics in which we need to make progress, including bridging the semantic and user gaps, improving approaches to multimodal content fusion, neural network learning, addressing the challenge of processing at scale, and the so-called “curse of dimensionality”.

ICMR 2018 included two excellent keynote talks <http://www.icmr2018.org/program_keynote.html>. First, Kohji Mitani, Deputy Director of the Science & Technology Research Laboratories of NHK (Japan Broadcasting Corporation), spoke about the ongoing evolution of broadcast technology and the efforts underway to create new (connected) broadcast services that can provide viewing experiences never before imagined and user experiences more attuned to daily life. The second keynote, from Shunji Yamanaka of The University of Tokyo, discussed his experience of prototyping new user technologies and highlighted the importance of prototyping as a process that bridges an ever increasing gap between advanced technological solutions and societal users. During this entertaining and inspiring talk, many prototypes developed in Yamanaka’s lab were introduced and the related vision explained to an eager audience.
Three workshops were accepted for ACM ICMR 2018, covering the fields of lifelogging, art and real-estate technologies. Interestingly, all three workshops focused on domain-specific applications in three emerging fields for multimedia analytics, all related to users and the user experience.

The “LSC2018 – Lifelog Search Challenge” workshop <http://lsc.dcu.ie/2018/> was a novel and highly entertaining workshop modelled on the successful Video Browser Showdown series of participation workshops at the annual MMM conference. LSC was a participation workshop, which means that the participants wrote a paper describing a prototype interactive retrieval system for multimodal lifelog data, which was then evaluated during a live interactive search challenge during the workshop. Six prototype systems took part in the search challenge in front of an audience that reached fifty conference attendees. This was a popular and exciting workshop and could become a regular feature at future ICMR conferences.

The second workshop was the MM-Art & ACM workshop <http://www.attractiveness-computing.org/mmart_acm2018/index.html>, a joint workshop that merged two existing workshops, the International Workshop on Multimedia Artworks Analysis (MMArt) and the International Workshop on Attractiveness Computing in Multimedia (ACM). The aim of the joint workshop was to enlarge the scope of the discussed issues and inspire more work in related fields. The papers at the workshop focused on the creation, editing and retrieval of art-related multimedia content.

The third workshop was RETech 2018 <https://sites.google.com/view/multimedia-for-retech/>, the first international workshop on multimedia for real estate tech. In recent years there has been a huge uptake of multimedia processing and retrieval technologies in this domain, but many challenges remain, such as quality, cost, sensitivity, diversity, and the attractiveness of content to users.

In addition, ICMR 2018 included three tutorials <http://www.icmr2018.org/program_tutorial.html> on topical areas for the multimedia retrieval communities. The first was ‘Objects, Relationships and Context in Visual Data’ by Hanwang Zhang and Qianru Sun. The second was ‘Recommendation Technologies for Multimedia Content’ by Xiangnan He, Hanwang Zhang and Tat-Seng Chua, and the final tutorial was ‘Multimedia Content Understanding by Learning from Very Few Examples’ by Guo-Jun Qi. All tutorials were well received and the feedback was very good.

Other aspects of note from ICMR 2018 were a doctoral symposium that attracted five authors and a dedicated industrial session with four industrial talks highlighting the multimedia retrieval challenges faced by industry. It was interesting to hear from the industrial talks how the analytics and retrieval technologies developed over the years and presented at venues such as ICMR are actually being deployed in real-world user applications by large organisations such as NEC and Hitachi. It is always a good idea to listen to the real-world applications of the research carried out by our community.

The best paper session at ICMR 2018 had four top-ranked works covering multimodal, audio and text retrieval. The best paper award went to ‘Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval’ by Niluthpol Mithun, Juncheng Li, Florian Metze and Amit Roy-Chowdhury.
The best Multi-Modal Paper Award winner was ‘Cross-Modal Retrieval Using Deep De-correlated Subspace Ranking Hashing’ by Kevin Joslyn, Kai Li and Kien Hua. In addition, there were awards for best poster ‘PatternNet: Visual Pattern Mining with Deep Neural Network’ by Hongzhi Li, Joseph Ellis, Lei Zhang and Shih-Fu Chang, and best demo ‘Dynamic construction and manipulation of hierarchical quartic image graphs’ by Nico Hezel and Kai Uwe Barthel. Finally, although often overlooked, there were six reviewers commended for their outstanding reviews; Liqiang Nie, John Kender, Yasushi Makihara, Pascal Mettes, Jianquan Liu, and Yusuke Matsui. As with some other ACM sponsored conferences, ACM ICMR 2018 included an award for the most active social media commentator, which is how I ended up writing this report. There were a number of active social media commentators at ICMR 2018 each of which provided a valuable commentary on the proceedings and added to the historical archive.

Of course, the social side of a conference can be as important as the science. ICMR 2018 included two main social events, a welcome reception and the conference banquet. The welcome reception took place at the Fisherman’s Market, an Asian and ethnic dining experience with a wide selection of Japanese food available. The Conference Banquet took place in the Hotel New Grand, which was built in 1927 and has a long history of attracting famous guests. The venue is famed for the quality of the food and the spectacular panoramic views of the port of Yokohama. As with the rest of the conference, the banquet food was top-class with more than one of the attendees commenting that the Japanese beef on offer was the best they had ever tasted.

ICMR 2018 was an exciting and excellently organised conference, and it is important to acknowledge the efforts of the general co-chairs: Kiyoharu Aizawa (The Univ. of Tokyo), Michael Lew (Leiden Univ.) and Shin’ichi Satoh (National Inst. of Informatics). They were ably assisted by the TPC co-chairs, Benoit Huet (Eurecom), Qi Tian (Univ. of Texas at San Antonio) and Keiji Yanai (The Univ. of Electro-Comm), who coordinated the reviews from a 111-person program committee in a double-blind manner, with an average of 3.8 reviews prepared for every paper. ICMR 2019 will take place in Ottawa, Canada in June 2019 and ICMR 2020 will take place in Dublin, Ireland in June 2020. I hope to see you all there and continuing the tradition of excellent ICMR conferences.

The Lifelog Search Challenge Workshop attracted six teams for a real-time public interactive search competition.

Shunji Yamanaka about to begin his keynote talk on Prototyping

Kiyoharu Aizawa and Shin’ichi Satoh, two of the ICMR 2018 General co-Chairs welcoming attendees to the ICMR 2018 Banquet at the historical Hotel New Grand.

ACM Multimedia 2019 and Reproducibility in Multimedia Research

In the first months of the new calendar year, multimedia researchers are traditionally hard at work on their ACM Multimedia submissions. (This year the submission deadline is 1 April.) Questions of reproducibility, including those of data set availability and release, are at the forefront of everyone’s mind. In this edition of SIGMM Records, the editors of the “Data Sets and Benchmarks” column have teamed up with two intersecting groups, the Reproducibility Chairs and the General Chairs of ACM Multimedia 2019, to bring you a column about reproducibility in multimedia research and the connection between reproducible research and publicly available data sets. The column highlights the activities of SIGMM towards implementing ACM paper badging. ACM MMSys has pushed our community forward on reproducibility and pioneered the use of ACM badging [1]. We are proud that in 2019 the newly established Reproducibility track will introduce badging at ACM Multimedia.

Complete information on Reproducibility at ACM Multimedia is available at:  https://project.inria.fr/acmmmreproducibility/

The importance of reproducibility

Researchers intuitively understand the importance of reproducibility. Too often, however, it is explained superficially, with statements such as, “If you don’t pay attention to reproducibility, your paper will be rejected”. The essence of the matter lies deeper: reproducibility is important because of its role in making scientific progress possible.

What is this role exactly? The reason that we do research is to contribute to the totality of knowledge at the disposal of humankind. If we think of this knowledge as a building, i.e. a sort of edifice, the role of reproducibility is to provide the strength and stability that makes it possible to build continually upwards. Without reproducibility, there would simply be no way of creating new knowledge.

ACM provides a helpful characterization of reproducibility: “An experimental result is not fully established unless it can be independently reproduced” [2]. In short, a result that is obtainable only once is not actually a result.

Reproducibility and scientific rigor are often mentioned in the same breath. Rigorous research provides systematic and sufficient evidence for its contributions. For example, in an experimental paper, the experiments must be properly designed and the conclusions of the paper must be directly supported by the experimental findings. Rigor involves careful analysis, interpretation, and reporting of the research results. Attention to reproducibility can be considered a part of rigor.

When we commit ourselves to reproducible research, we also commit ourselves to making sure that the research community has what it needs to reproduce our work. This means releasing the data that we use, and also releasing implementations of our algorithms. Devoting time and effort to reproducible research is an important way in which we support Open Science, the movement to make research resources and research results openly accessible to society.

Repeatability vs. Replicability vs. Reproducibility

We frequently use the word “reproducibility” in an informal way that includes three individual concepts, which actually have distinct formal uses: “repeatability”, “replicability” and “reproducibility”. Again, we can turn to ACM for definitions [2]. All three concepts express the idea that research results must be invariant with respect to changes in the conditions under which they were obtained.

Specifically, “repeatability” means that the same research team can achieve the same result using the same setup and resources. “Replicability” means that the original team can pass the setup and resources to a different research team, and that this second team can also achieve the same result. “Reproducibility” (here, used in the formal sense) means that a different team can achieve the same result using a different setup and different resources. Note the connection to scientific rigor: obtaining the same result multiple times via a process that lacks rigor is meaningless.

When we write a research paper paying attention to reproducibility, it means that we are confident we would obtain the same results again within our own research team, that the paper includes a detailed description of how we achieved the result (and is accompanied by code or other resources), and that we are convinced that other researchers would reach the same conclusions using a comparable, but not identical, setup and resources.

Reproducibility at ACM Multimedia 2019

ACM Multimedia 2019 promotes reproducibility in two ways: First, as usual, reproducibility is one of the review criteria considered by the reviewers (https://www.acmmm.org/2019/reviewer-guidelines/). It is critical that authors describe their approach clearly and completely, and do not omit any details of their implementation or evaluation. Authors should release their data and also provide experimental results on publicly available data. Finally, increasingly, we are seeing authors who include a link to their code or other resources associated with the paper. Releasing resources should be considered a best practice.

The second way that ACM Multimedia 2019 promotes reproducibility is the new Reproducibility Track. Full information is available on the ACM Multimedia Reproducibility website [3]. The purpose of the track is to ensure that authors receive recognition for the effort they have dedicated to making their research reproducible, and also to assign ACM badges to their papers. Next, we summarize the concept of ACM badges, then we will return to discuss the Reproducibility Track in more detail.

ACM Paper badging

Here, we provide a short summary of the information on badging available on the ACM website at [2]. ACM introduced a system of badges in order to help push forward the processes by which papers are reviewed. The goal is to move the attention given to reproducibility to a new level, beyond the level achieved during traditional reviews. Badges seek to motivate authors to use practices leading to better replicability, with the idea that replicability will in turn lead to reproducibility.

In order to understand the badge system, it is helpful to know that ACM badges are divided into two categories: “Artifacts Evaluated” and “Results Evaluated”. ACM defines artifacts as digital objects that are created for the purpose of, or as a result of, carrying out research. Artifacts include implementation code as well as scripts used to run experiments, analyze results, or generate plots. Critically, they also include the data sets that were used in the experiment. The different “Artifacts Evaluated” badges reflect the level of care that authors put into making the artifacts available, including how far they go beyond the minimal functionality necessary and how well the artifacts are documented.

There are two “Results Evaluated” badges: the “Results Replicated” badge, which results from a replicability review, and the “Results Reproduced” badge, which results from a full reproducibility review, in which the referees succeed in reproducing the results of the paper using only the descriptions of the authors, and without any of the authors’ artifacts. ACM Multimedia adopts the ACM idea that replicability leads to full reproducibility, and for this reason chooses to focus in its first year on the “Results Replicated” badge. Next, we turn to a discussion of the ACM Multimedia 2019 Reproducibility Track and how it implements the “Results Replicated” badge.

Badging ACM MM 2019

Authors of main-conference papers appearing at ACM Multimedia 2018 or 2017 are eligible to make a submission to the Reproducibility Track of ACM Multimedia 2019. The submission has two components: An archive containing the resources needed to replicate the paper, and a short companion paper that contains a description of the experiments that were carried out in the original paper and implemented in the archive. The submissions undergo a formal reproducibility review, and submissions that pass receive a “Results Replicated” badge, which  is added to the original paper in the ACM Digital Library. The companion paper appears in the proceedings of ACM Multimedia 2019 (also with a badge) and is presented at the conference as a poster.

ACM defines the badges, but the choice of which badges to award, and how to implement the review process that leads to the badge, is left to the individual conferences. The consequence is that the design and implementation of the ACM Multimedia Reproducibility Track requires a number of important decisions as well as careful implementation.

A key consideration when designing the ACM Multimedia Reproducibility Track was the work of the reproducibility reviewers. These reviewers carry out tasks that go beyond those of main-conference reviewers, since they must use the authors’ artifacts to replicate their results. The track is designed such that the reproducibility reviewers are deeply involved in the process. Because the companion paper is submitted a year after the original paper, reproducibility reviewers have plenty of time to dive into the code and work together with the authors. During this intensive process, the reviewers extend the originally submitted companion paper with a description of the review process and become authors on the final version of the companion paper.

The ACM Multimedia Reproducibility Track is expected to run similarly in years beyond 2019. The experience gained in 2019 will allow future years to tweak the process in small ways if it proves necessary, and also to expand to other ACM badges.

The visibility of badged papers is important for ACM Multimedia. Visibility incentivizes the authors who submit work to the conference to apply best practices in reproducibility. Practically, the visibility of badges also allows researchers to quickly identify work that they can build on. If a paper presenting new research results has a badge, researchers can immediately understand that this paper would be straightforward to use as a baseline, or that they can build confidently on the paper results without encountering ambiguities, technical issues, or other time-consuming frustrations.

The link between reproducibility and multimedia data sets

The link between Reproducibility and Multimedia Data Sets has been pointed out before, for example, in the theme chosen by the ACM Multimedia 2016 MMCommons workshop, “Datasets, Evaluation, and Reproducibility” [4]. One of the goals of this workshop was to discuss how data challenges and benchmarking tasks can catalyze the reproducibility of algorithms and methods.

Researchers who dedicate time and effort to creating and publishing data sets are making a valuable contribution to research. In order to compare the effectiveness of two algorithms, all other aspects of the evaluation must be controlled, including the data set that is used. Making data sets publicly available supports the systematic comparison of algorithms that is necessary to demonstrate that new algorithms are capable of outperforming the state of the art.

Considering the definitions of “replicability” and “reproducibility” introduced above, additional observations can be made about the importance of multimedia data sets. Creating and publishing data sets supports replicability. In order to replicate a research result, the same resources as used in the original experiments, including the data set, must be available to research teams beyond the one who originally carried out the research.

Creating and publishing data sets also supports reproducibility (in the formal sense of the word defined above). In order to reproduce research results, however, it is necessary that there is more than one data set available that is suitable for carrying out evaluation of a particular approach or algorithm. Critically, the definition of reproducibility involves using different resources than were used in the original work. As the multimedia community continues to move from replication to reproduction, it is essential that a large number of data sets are created and published, in order to ensure that multiple data sets are available to assess the reproducibility of research results.

Acknowledgements

Thank you to the people whose hard work is making reproducibility at ACM Multimedia happen: this includes the 2019 TPC Chairs, main-conference ACs and reviewers, as well as the Reproducibility reviewers. If you would like to volunteer to be a reproducibility committee member in this or future years, please contact the Reproducibility Chairs at MM19-Repro@sigmm.org.

[1] Simon, Gwendal. Reproducibility in ACM MMSys Conference. Blogpost, 9 May 2017 http://peerdal.blogspot.com/2017/05/reproducibility-in-acm-mmsys-conference.html Accessed 9 March 2019.

[2] ACM, Artifact Review and Badging, Reviewed April 2018,  https://www.acm.org/publications/policies/artifact-review-badging Accessed 9 March 2019.

[3] ACM MM Reproducibility: Information on Reproducibility at ACM Multimedia https://project.inria.fr/acmmmreproducibility/ Accessed 9 March 2019.

[4] Bart Thomee, Damian Borth, and Julia Bernd. 2016. Multimedia COMMONS Workshop 2016 (MMCommons’16): Datasets, Evaluation, and Reproducibility. In Proceedings of the 24th ACM international conference on Multimedia (MM ’16). ACM, New York, NY, USA, 1485-1486.