Hello Multidisciplinary Column!

There is ‘multi’ in multimedia. Every day, an increasing amount of extremely diverse multimedia content has meaning and purpose for an increasing number of extremely diverse human users, under extremely diverse use cases. As multimedia professionals, we work in an extremely diverse set of focus areas to enable this, ranging from systems aspects to user factors, each of which has its own methodologies and related communities outside of the multimedia field.

In our multimedia publication venues, we see all this work coming together. However, are we already sufficiently aware of the multidisciplinary potential in our field? Do we make sufficient effort to consider our daily challenges from the perspectives and with the methodologies of disciplines radically different from our own? Do we make sufficient use of existing experience with problems related to our own, but studied in neighboring communities? And how can an increased multidisciplinary awareness help and inspire us to take the field further?

Feeling the need for a stage for multi- and interdisciplinary dialogue within the multimedia community, and beyond its borders, we are excited to serve as editors of this newly established multidisciplinary column of the SIGMM Records. The column will be published as part of the Records, in four issues per year. Content-wise, we foresee a mix of opinion articles on multidisciplinary aspects of multimedia and interviews with peers whose work sits at the intersection of disciplines.

Call for contributions
We can only truly highlight the multidisciplinary merit of our field if the extreme diversity of our community is properly reflected in the contributions to this column. Therefore, in addition to invited articles, we are continuously looking for contributions from the community. Do you work at the junction of multimedia and another discipline? Did you gain important professional insights by interacting with neighboring communities? Do you want to share experiences in bridging towards other communities, or towards user audiences who are initially unfamiliar with our common areas of interest? Can you contribute meta-perspectives on common case studies and challenges in our field? Do you know someone who should be interviewed or featured in this column? Then please do not hesitate to reach out to us!

We see this column as a great opportunity to shape the multimedia community and to raise awareness of multidisciplinary work, as well as of neighboring communities. Looking forward to your input!

Cynthia and Jochen


Editor Biographies

Dr. Cynthia C. S. Liem is an Assistant Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. She initiated and co-coordinated the European research project PHENICX (2013-2016), focusing on technological enrichment of symphonic concert recordings with partners such as the Royal Concertgebouw Orchestra. Her research interests concern music and multimedia search and recommendation, and increasingly shift towards making people discover new interests and content which would not trivially be retrieved. Beyond her academic activities, Cynthia gained industrial experience at Bell Labs Netherlands, Philips Research and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships, the Google European Doctoral Fellowship 2010 in Multimedia, and a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach.


Dr. Jochen Huber is a Senior User Experience Researcher at Synaptics. Previously, he was an SUTD-MIT postdoctoral fellow in the Fluid Interfaces Group at MIT Media Lab and the Augmented Human Lab at Singapore University of Technology and Design. He holds a Ph.D. in Computer Science and degrees in both Mathematics (Dipl.-Math.) and Computer Science (Dipl.-Inform.), all from Technische Universität Darmstadt, Germany. Jochen’s work is situated at the intersection of Human-Computer Interaction and Human Augmentation. He designs, implements and studies novel input technology in the areas of mobile, tangible & non-visual interaction, automotive UX and assistive augmentation. He has co-authored over 60 academic publications and regularly serves as a program committee member of premier HCI and multimedia conferences. He was program co-chair of ACM TVX 2016 and Augmented Human 2015 and chaired tracks of ACM Multimedia, ACM Creativity and Cognition and the ACM International Conference on Interactive Surfaces and Spaces, as well as numerous workshops at ACM CHI and IUI. Further information can be found on his personal homepage: http://jochenhuber.com

Datasets and Benchmarks Column: Introduction

Datasets are critical for research and development as, rather obviously, data is required for performing experiments, validating hypotheses, analyzing designs, and building applications. Over the years a multitude of multimedia collections have been put together, ranging from one-off instances created exclusively to support the work presented in a single paper or demo, to collections created with multiple related or separate endeavors in mind. Unfortunately, the collected data is often not made publicly available. In some cases it may not be possible to make a public release due to the proprietary or sensitive nature of the data, but other forces are also at work. For example, one might be reluctant to share data freely, as it carries value stemming from the often substantial amount of time, effort, and money that was invested in collecting it.

Once a dataset has been made public, though, it becomes possible to validate results reported in the literature and to compare methods using the same source of truth, although matters are complicated when the source code of the methods is not published or the ground truth labels are not made available. Benchmarks offer a useful compromise by defining a particular task to solve, along with the data that one is allowed to use and the evaluation metrics that dictate what is considered success and failure. While benchmarks may not offer the cutting edge of research challenges for which utilizing the freshest data is an absolute requirement, they are a useful sanity check to ensure that methods that appear to work on paper also work in practice and are indeed as good as claimed.
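To make the benchmark idea more concrete, here is a minimal sketch, in plain Python, of what the evaluation step behind such a benchmark could look like. The file names (ground_truth.csv, team_run1.csv), the label format, and the choice of accuracy and per-class recall as metrics are all hypothetical and purely illustrative; they are not taken from any particular benchmark.

```python
# Illustrative sketch of a benchmark evaluation step (hypothetical files and labels):
# participants receive the task data, submit their predictions, and are scored with
# fixed metrics against ground truth kept by the organizers.

import csv
from collections import defaultdict

def load_labels(path):
    """Read 'item_id,label' rows into a dict, e.g. ground_truth.csv or team_run1.csv."""
    labels = {}
    with open(path, newline="") as f:
        for item_id, label in csv.reader(f):
            labels[item_id] = label
    return labels

def evaluate(ground_truth, predictions):
    """Compute overall accuracy plus per-class recall, the (assumed) agreed-upon metrics."""
    correct = sum(1 for i, y in ground_truth.items() if predictions.get(i) == y)
    accuracy = correct / len(ground_truth)

    per_class = defaultdict(lambda: [0, 0])  # label -> [correct, total]
    for item_id, y in ground_truth.items():
        per_class[y][1] += 1
        if predictions.get(item_id) == y:
            per_class[y][0] += 1
    recall = {label: c / t for label, (c, t) in per_class.items()}
    return accuracy, recall

if __name__ == "__main__":
    gt = load_labels("ground_truth.csv")   # withheld by the benchmark organizers
    run = load_labels("team_run1.csv")     # submitted by a participating team
    acc, recall = evaluate(gt, run)
    print(f"accuracy = {acc:.3f}")
    for label, r in sorted(recall.items()):
        print(f"recall[{label}] = {r:.3f}")
```

In a real benchmark the organizers fix the data splits and the metrics in the task description, and participants only see the scores computed on the withheld ground truth, which is exactly what makes the comparison between methods fair.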

Several efforts are underway to stimulate the sharing of datasets and code, as well as to promote the reproducibility of experiments. These efforts provide encouragement to overcome the reluctance to share data by underlining the ways in which data becomes more valuable with community-wide use. They also offer insights into how researchers can put publicly available datasets to the best possible use. We provide here a couple of key examples of ongoing efforts. The MMSys conference series has a special track for papers on datasets, and Qualinet maintains an index of known multimedia collections. The ACM Artifact Review and Badging policy proposal recommends that journals and conferences adopt a reviewing procedure in which submitted papers can be granted special badges to indicate to what extent the performed experiments are repeatable, replicable, and reproducible. For example, the “Artifacts Evaluated – Reusable” badge would indicate that artifacts associated with the research are documented, consistent, complete, exercisable, and include appropriate evidence of verification and validation to the extent that reuse and repurposing is facilitated.

In future posts appearing in this column, we will highlight new public datasets and upcoming benchmarks through a series of invited guest posts, as well as provide insights and updates on the latest developments in this area. The columns are edited by Bart Thomee and Martha Larson (see our bios at the end of this post).

To establish a baseline of popular multimedia datasets and benchmarks that have been used over the years by the research community, refer to the table below to see what the state of the art was as of 2015, when the data was compiled by Bart for his paper on the YFCC100M dataset. We can see the sizes of the datasets steadily increasing over the years, the licenses becoming less restrictive, and it is now the norm to also release additional metadata, precomputed features, and/or ground truth annotations together with the dataset. The last three entries in the table are benchmarks that include tasks such as video surveillance and object localization (TRECVID), diverse image search and music genre recognition (MediaEval), and life-logging event search and medical image analysis (ImageCLEF), to name just a few. The table is most certainly not exhaustive, although it is reflective of the evolution of datasets over the last two decades. We will use this table to provide context for the datasets and benchmarks that we will cover in our upcoming columns, so stay tuned for our next post!

[Table: popular multimedia datasets and benchmarks, as compiled in 2015]

Bart Thomee is a Software Engineer at Google/YouTube in San Bruno, CA, USA, where he focuses on web-scale real-time streaming and batch techniques to fight abuse, spam, and fraud. He was previously a Senior Research Scientist at Yahoo Labs and Flickr, where his research centered on the visual and spatiotemporal dimensions of media, in order to better understand how people experience and explore the world, and how to better assist them with doing so. He led the development of the YFCC100M dataset released in 2014, and was previously part of the efforts leading to the creation of both MIRFLICKR datasets. He has furthermore been part of the organization of the ImageCLEF photo annotation tasks 2012–2013, the MediaEval placing tasks 2013–2016, and the ACM MM Yahoo-Flickr Grand Challenges 2015–2016. In addition, he has served on the program committees of, amongst others, ACM MM, ICMR, SIGIR, ICWSM and ECIR. He was part of the Steering Committee of the Multimedia COMMONS 2015 workshop at ACM MM and co-chaired the workshop in 2016; he also co-organized the TAIA workshop at SIGIR 2015.

Martha Larson is professor in the area of multimedia information technology at Radboud University in Nijmegen, Netherlands. Previously, she researched and lectured in the area of audio-visual retrieval at Fraunhofer IAIS, Germany, and at the University of Amsterdam, Netherlands. Larson is co-founder of the MediaEval international benchmarking initiative for Multimedia Evaluation. She has contributed to the organization of various other challenges, including CLEF NewsREEL 2015-2017, ACM RecSys Challenge 2016, and TRECVid Video Hyperlinking 2016. She has served on the program committees of numerous conferences in the areas of information retrieval, multimedia, recommender systems, and speech technology. Other forms of service have included: Area Chair at ACM Multimedia 2013, 2014, and 2017, and TPC Chair at ACM ICMR 2017. Currently, she is an Associate Editor for IEEE Transactions on Multimedia. She is a founding member of the ISCA Special Interest Group on Speech and Language in Multimedia and serves on the IAPR Technical Committee 12 on Multimedia and Visual Information Systems. Together with Hayley Hung she developed and currently teaches an undergraduate course in Multimedia Analysis at Delft University of Technology, where she maintains a part-time membership in the Multimedia Computing Group.

Standards Column: JPEG and MPEG

Introduction

The area of work of ISO/IEC JTC 1/SC 29 comprises the standardization of coded representation of audio, picture, multimedia and hypermedia information, and of sets of compression and control functions for use with such information. SC29 basically hosts two working groups responsible for the development of international standards for the compression, decompression, processing, and coded representation of media content, in order to satisfy a wide variety of applications: WG1, targeting “digital still pictures”, also known as JPEG, and WG11, targeting “moving pictures, audio, and their combination”, also known as MPEG. The earlier SC29 standards, namely JPEG, MPEG-1 and MPEG-2, received the Technology & Engineering Emmy Award in 1995-96.

The standards columns within the ACM SIGMM Records provide timely updates about the most recent developments within JPEG and MPEG, respectively. The JPEG column is edited by Antonio Pinheiro and the MPEG column is edited by Christian Timmerer. This article introduces the editors and highlights recent JPEG and MPEG achievements as well as future plans.

Antonio Pinheiro received the BSc (Licenciatura) from I.S.T., Lisbon, in 1988 and the PhD in Electronic Systems Engineering from the University of Essex in 2002. He has been a lecturer at U.B.I. (Universidade da Beira Interior), Covilhã, Portugal, since 1988 and a researcher at I.T. (Instituto de Telecomunicações), Portugal. Currently, his research interests are in Image Processing, namely Multimedia Quality Evaluation and Medical Image Analysis. He was a Portuguese representative of the European Union COST Actions IC1003 – QUALINET, IC1206 – DE-ID and 292, and is currently a representative of COST BM1304 – MYO-MRI. He is currently involved in the project EmergIMG, funded by the Portuguese funding agency and H2020, and he is a Portuguese delegate to JPEG, where he is currently the Communication Subgroup chair and involved with the JPEG Pleno project.


Christian Timmerer received his M.Sc. (Dipl.-Ing.) in January 2003 and his Ph.D. (Dr.techn.) in June 2006 (for research on the adaptation of scalable multimedia content in streaming and constrained environments), both from the Alpen-Adria-Universität (AAU) Klagenfurt. He joined the AAU in 1999 (as a system administrator) and is currently an Associate Professor at the Institute of Information Technology (ITEC) within the Multimedia Communication Group. His research interests include immersive multimedia communications, streaming, adaptation, Quality of Experience, and Sensory Experience. He was the general chair of WIAMIS 2008, QoMEX 2013, and MMSys 2016 and has participated in several EC-funded projects, notably DANAE, ENTHRONE, P2P-Next, ALICANTE, SocialSensor, COST IC1003 QUALINET, and ICoSOLE. He also participated in ISO/MPEG work for several years, notably in the areas of MPEG-21, MPEG-M, MPEG-V, and MPEG-DASH, where he also served as standard editor. In 2012 he co-founded Bitmovin (http://www.bitmovin.com/) to provide professional services around MPEG-DASH, where he holds the position of Chief Innovation Officer (CIO).


Major JPEG and MPEG Achievements

In this section we would like to highlight major JPEG and MPEG achievements without claiming to be exhaustive.

JPEG developed the well-known digital picture coding standard, known simply as the JPEG image format, almost 25 years ago. Due to the recent increase in social network usage, the number of JPEG-encoded images shared online grew to an impressive 1.8 billion per day in 2014. JPEG 2000 is another successful JPEG standard that also received the 2015 Technology and Engineering Emmy Award. This standard uses state-of-the-art compression technology, providing higher compression and a wider range of applications. It is widely used at the professional level, namely in movie production and medical imaging. JPEG also developed the JBIG2, JPEG-LS, JPSearch and JPEG-XR standards. More recently, JPEG launched JPEG-AIC, JPEG Systems and JPEG-XT. JPEG-XT defines backward-compatible extensions of JPEG, adding support for HDR, lossless/near-lossless, and alpha coding. An overview of the JPEG family of standards is shown in the figure below.

[Figure: overview of the JPEG family of standards]
An overview of existing MPEG standards and achievements is shown in the figure below (taken from here).

[Figure: overview of existing MPEG standards and achievements]

A first major milestone and success was the development of MP3, which revolutionized digital audio and resulted in a lasting change of the digital media ecosystem. The same holds for MPEG-2 video & systems, where the latter, i.e., the MPEG-2 Transport Stream, received the Technology & Engineering Emmy Award. The mobile era within MPEG was introduced with the MPEG-4 standard, resulting in the development of AVC (which received yet another Emmy Award), AAC, and also the MP4 file format, all of which have been widely deployed. Finally, streaming over the open internet is addressed by DASH, and new forms of digital television, including ultra-high-definition and immersive services, are targeted by MPEG-H, which comprises MMT, HEVC, and 3D Audio.

Roadmap for Future JPEG and MPEG Standards

In this section we would like to highlight a roadmap for future JPEG and MPEG standards.

A roadmap for future JPEG standards is represented in the figure below. The main efforts are towards the JPEG Pleno project, which aims to standardize new immersive technologies like light fields, point clouds or digital holography. Moreover, JPEG is launching JPEG-XS for low-latency and lightweight coding, while JPEG Systems is also developing a new part to add privacy and security protection to its standards. Furthermore, JPEG is continuously seeking new technological developments and is committed to providing new standardized image coding solutions.

[Figure: roadmap for future JPEG standards]

The future roadmap of MPEG standards is shown in the Figure below (taken from here).

[Figure: roadmap for future MPEG standards]

MPEG’s roadmap for future standards comprises a variety of tools, ranging from traditional audio-video coding to new forms of compression technology like genome compression and light field coding. The systems aspects will cover application domains which require media orchestration, as well as focus on becoming the enabler for immersive media experiences.

Conclusion

In this article we briefly highlighted the achievements and future plans of JPEG and MPEG, but the future is not yet defined and requires participation from both industry and academia. We hope that our JPEG and MPEG columns will stimulate research and development within the multimedia domain, and we are open to any kind of feedback. Contact Antonio Pinheiro (pinheiro@ubi.pt) or Christian Timmerer (christian.timmerer@itec.uni-klu.ac.at) for any further questions or comments.

@sigmm Records: serving the community

The SIGMM Records are being renewed, with the continued ambition of being a useful resource for the multimedia community. We want to provide a forum for (open) discussion, but also to become the primary source of information for our community.

Firstly, I would like to thank Carsten, who has run, single-handedly, the whole Records for many, many years. We all agree that he has done an amazing job, and that his service deserves our gratitude, and possibly some beers, when you meet him at conferences and meetings.

As you are probably aware, a number of changes to the Records are underway. We want your opinions and suggestions to make this resource the best it can be. Hence, we need your help to make this a success, so please drop us a line if you want to join the team.

The two main visible changes are:

We have an amazing new team to lead the Records in the coming years. I am so glad to have their help: http://sigmm.hosting.acm.org/impressum/

We have reorganized the Records and their structure into three main clusters:

More changes to come. Stay tuned!

Pablo (Editor in Chief) + Carsten and Mario (Information Directors)

Dr. Pablo Cesar leads the Distributed and Interactive Systems group at Centrum Wiskunde & Informatica (CWI) in the Netherlands. Pablo’s research focuses on modeling and controlling complex collections of media objects (including real-time media and sensor data) that are distributed in time and space. His fundamental interest is in understanding how different customizations of such collections affect the user experience. Pablo is the PI of Public Private Partnership projects with Xinhuanet and ByBorre, and of very successful EU-funded projects like 2-IMMERSE, REVERIE and Vconect. He has (co-)authored over 100 articles. He is a member of the editorial board of, among others, ACM Transactions on Multimedia (TOMM). Pablo has given tutorials about multimedia systems at prestigious conferences such as ACM Multimedia, CHI, and the WWW conference. He acted as an invited expert at the European Commission’s Future Media Internet Architecture Think Tank and participates in standardisation activities at MPEG (point-cloud compression) and ITU (QoE for multi-party tele-meetings). Webpage: http://homepages.cwi.nl/~garcia/


Dr. Carsten Griwodz is Chief Research Scientist at the Media Department of the Norwegian research company Simula Research Laboratory AS, Norway, and professor at the University of Oslo. He is also co-founder of ForzaSys AS, a social media startup for sports. He is a steering committee member of ACM MMSys and ACM/IEEE NetGames. He is associate editor of the IEEE MMTC R-Letter and was previously editor-in-chief of the ACM SIGMM Records and editor of ACM TOMM.


Dr. Mario Montagud (@mario_montagud) was born in Montitxelvo (Spain). He received a BSc in Telecommunications Engineering in 2011, an MSc degree in “Telecommunication Technologies, Systems and Networks” in 2012 and a PhD degree in Telecommunications (Cum Laude Distinction) in 2015, all of them at the Polytechnic University of Valencia (UPV). During his PhD and after completing it, he did three research stays (accumulating 18 months) at CWI (the National Research Institute for Mathematics and Computer Science in the Netherlands). He also has experience as a postdoc researcher at UPV. His topics of interest include Computer Networks, Interactive and Immersive Media, Synchronization, and QoE (Quality of Experience). Mario is (co-)author of over 50 scientific and teaching publications, and has contributed to standardization within the IETF (Internet Engineering Task Force). He is a member of the Technical Committee of several international conferences (e.g., ACM MM, MMSys and TVX), co-organizer of the international MediaSync Workshop series, and a member of the Editorial Board of international journals. He is also lead editor of “MediaSync: Handbook on Multimedia Synchronization” (Springer, 2017) and Communication Ambassador of ACM SIGCHI (Special Interest Group on Computer-Human Interaction). Webpage: https://sites.google.com/site/mamontor/