Interview Column – Introduction

The interviews in the SIGMM Records aim to provide the community with insights into, and the visions and views of, outstanding researchers in multimedia. With these interviews we particularly try to find out what makes these researchers outstanding, and also, to a certain extent, what is going on in their minds, what their visions are, and what they think about current topics. Examples from past issues include interviews with Judith Redi, Klara Nahrstedt, and Wallapak Tavanapong.

The interviews are conducted via Skype or, even better, in person at conferences or other community events. We aim to publish three to four interviews a year. If you have suggestions for whom to interview, please feel free to contact one of the column editors, who are:

Michael Alexander Riegler is a scientific researcher at Simula Research Laboratory. He received his Master’s degree from Klagenfurt University with distinction and finished his PhD at the University of Oslo in two and a half years. His PhD thesis topic was efficient processing of medical multimedia workloads.
His research interests are medical multimedia data analysis and understanding, image processing, image retrieval, parallel processing, gamification and serious games, crowdsourcing, social computing, and user intentions. Furthermore, he is involved in several initiatives such as the MediaEval Benchmarking Initiative for Multimedia Evaluation, which this year runs the Medico task (automatic analysis of colonoscopy videos, http://www.multimediaeval.org/mediaeval2017/medico/).


Herman Engelbrecht is one of the directors of the MIH Electronic Media Laboratory at Stellenbosch University, where he is a lecturer in Signal Processing at the Department of Electrical and Electronic Engineering. His responsibilities in the Electronic Media Laboratory include managing the immediate objectives and research activities of the Laboratory; regularly meeting with postgraduate researchers and their supervisors to help steer their research efforts towards the overall research goals of the Laboratory; ensuring that the Laboratory infrastructure is developed and maintained; managing interaction with external contractors and service providers; managing the capital expenditure of the Laboratory; and managing the University’s relationship with the postgraduate researchers. See more at: http://ml.sun.ac.za/people/dr-ha-engelbrecht/


Mathias Lux is an associate professor at the Institute for Information Technology (ITEC) at Klagenfurt University. He works on user intentions in multimedia retrieval and production, and on emergent semantics in social multimedia computing. In his scientific career he has (co-)authored more than 80 scientific publications, has served on multiple program committees and as a reviewer for international conferences, journals, and magazines, and has organized multiple scientific events. Mathias Lux is also well known for the development of the award-winning and popular open source tools Caliph & Emir and LIRe (http://www.semanticmetadata.net) for multimedia information retrieval. Dr. Mathias Lux received his M.S. in Mathematics in 2004 and his Ph.D. in Telematics in 2006, both from Graz University of Technology and both with distinction, and his Habilitation (venia docendi) from Klagenfurt University in 2013.


Introduction to the Opinion Column

Welcome to the SIGMM Community Discussion Column! In this first edition we would like to introduce the column to the community: its objectives and how it will operate.

Given the exponentially growing amount of multimedia data shared online and offline every day, research in multimedia is of unprecedented importance. We may now be facing a new era of our research field, and we would like the whole community to be involved in the improvement and evolution of our domain.

The column has two main goals. First, we will promote dialogue on topics of interest to the MM community by providing tools for continuous discussion among its members. Every quarter, we will discuss (usually) one topic via online tools. Topics will include “What is multimedia, and what is the role of the multimedia community in science?”; “Diversity and minorities in the community”; “The ACM Code of Ethics”; etc.

Second, we will monitor and summarize ongoing discussions and spread their results within and outside the community. Every edition of this column will summarize the discussion, highlighting popular and unpopular opinions, agreed action points, and future work.

To foster the discussion, we have set up an online forum in which all members of the multimedia community, across all areas of expertise and levels of seniority, can participate: the Facebook MM Community Discussion group (https://www.facebook.com/groups/132278853988735/). For every edition of the column, we will choose an initial set of topics of high relevance for the community. These will include, for example, topics that have been previously discussed at ACM meetings (e.g., the Code of Ethics), in related events (e.g., diversity at the MM Women’s lunch), or in popular offline discussions among MM researchers (e.g., review processes, the vision of the scientific community). In the first 15 days of the quarter, the members of the community will choose one topic from this short-list via an online poll shared through the MM Facebook group. We will then select the topic that received the highest number of votes as the subject of the quarterly discussion.

Volunteers or selected members of the MM group will start the discussion via Facebook posts on the group page. The discussion will then be open for a period of one month. All members of the community can participate by replying to posts or by posting directly on the group page, describing their point of view on the subject concisely and clearly. During this period, we will monitor and, when needed, moderate the discussion. At the end of the month, we will summarize the discussion by describing its evolution, exposing majority and minority opinions, and outlining highlights and lowlights. A final text with the summary and some relevant discussion extracts will be prepared and will appear in the SIGMM Records and on the Facebook MM Community page: https://www.facebook.com/MM-Community-217668705388738/.

Hopefully, the community will benefit from this initiative either by reaching some consensus or by pointing out important topics that are not yet mature and require further exploration. In the long term, we hope this process will help the community evolve through broad consensus and bottom-up discussions.

Let’s contribute and foster research around topics of high interest for the community!

Xavi and Miriam

Dr. Xavier Alameda-Pineda (Xavi) is a research scientist at INRIA. Xavi’s interdisciplinary background (MSc degrees in Mathematics, Telecommunications, and Computer Science) led him to pursue a PhD in Mathematics and Computer Science, followed by a postdoc at the University of Trento. His research interests are signal processing, computer vision, and machine learning for scene and behavior understanding using multimodal data. He won the best paper award at ACM MM 2015, the best student paper award at IEEE WASPAA 2015, and the best scientific paper award at IAPR ICPR 2016.

Dr. Miriam Redi is a research scientist in the Social Dynamics team at Bell Labs Cambridge. Her research focuses on content-based social multimedia understanding and culture analytics. In particular, Miriam explores ways to automatically assess visual aesthetics, sentiment, and creativity, and to exploit the power of computer vision in the context of web, social media, and online communities. Previously, she was a postdoc in the Social Media group at Yahoo Labs Barcelona and a research scientist at Yahoo London. Miriam holds a PhD from the Multimedia group at EURECOM, Sophia Antipolis.

Open Source Column – Introduction

“Open source software is software that can be freely accessed, used, changed, and shared (in modified or unmodified form) by anyone” (cf. https://opensource.org/osd). Open source software (OSS) is thus something that one or more people can work on: improve it, refine it, change it, adapt it, and share or use it. Why would anyone support such a model? Examples from industry show that this is a valid approach for many software products. Prominent open source projects are in use worldwide on an everyday basis, including the Apache Web Server, the Linux kernel, the GNU Compiler Collection, Samba, OpenSSL, and MySQL. For industry this means not only re-using components and libraries, but also being able to fix them, adapt them to their needs, and hire people who are already familiar with the tools. Business models based on open source software focus more on services than products and ensure the longevity of the software: even if companies vanish, the open source software is here to stay.

In academia, open source provides a way to employ well-known methods as a baseline or a starting point without having to re-invent the wheel by programming algorithms and methods all over again. This is especially popular in multimedia research, which would not be as agile and forward-looking if it weren’t for OpenCV, FFmpeg, Caffe, SciPy, and NumPy, just to name a few. In research, the need to publish source code and data along with the scientific publication to ensure reproducibility has recently been recognized (cf. ACM Artifact Review and Badging, https://www.acm.org/publications/policies/artifact-review-badging). This of course includes stronger support for releasing software and data artifacts under open licenses.
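As a toy illustration of this kind of reuse, consider how a simple baseline image feature can be computed in a few lines on top of an existing library rather than hand-rolled. This is a hedged sketch using only NumPy, not code from any of the projects named above:

```python
import numpy as np

def grayscale_histogram(image, bins=8):
    """Return a normalized intensity histogram for an 8-bit grayscale image."""
    hist, _ = np.histogram(image, bins=bins, range=(0, 256))
    return hist / hist.sum()

# Toy 4x4 "image" with values spread evenly over the 8-bit range.
img = np.array([[0, 32, 64, 96],
                [128, 160, 192, 224],
                [0, 32, 64, 96],
                [128, 160, 192, 224]], dtype=np.uint8)

# 16 pixels, two falling into each of the 8 bins, so each bin equals 0.125.
print(grayscale_histogram(img))
```

The point is not the histogram itself, but that the binning, edge cases, and numerics are inherited, already tested, from the open source library underneath.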

The SIGMM community has been very active in this regard: the ACM International Conference on Multimedia has hosted the Open Source Software Competition since 2004. This competition has in recent years attracted an increasing number of submissions and, according to Google Scholar, two of the three currently most-cited papers from the last five years of the conference were submitted to it. This year, the ACM International Conference on Multimedia Retrieval has also introduced an OSS track.

Our aim for the SIGMM Records is to point out recent developments, announce interesting releases, share insights from the community, and actively support knowledge transfer from research to industry based on open source software and open data, four times a year. If you are interested in writing for the open source column, or there is something you would like to know more about in this area, please do not hesitate to contact the editors. Examples are articles on open source frameworks or projects like the Menpo project, the SIVA Suite, or the Yael library.

The SIGMM Records editors responsible for the open source column are dedicated to the cause and have quite some history with open source in academia and industry.

Marco Bertini (https://github.com/mbertini) is an associate professor at the University of Florence and a long-term open source supporter, especially through serving as chair and co-chair of the open source software competition at the ACM International Conference on Multimedia.

Mathias Lux (https://github.com/dermotte) has participated in the very same competition with several open source projects. He is an associate professor at Klagenfurt University, dedicated to open source in research and teaching, and a main contributor to several open source projects.

Hello Multidisciplinary Column!

There is ‘multi’ in multimedia. Every day, an increasing amount of extremely diverse multimedia content has meaning and purpose to an increasing number of extremely diverse human users, under extremely diverse use cases. As multimedia professionals, we work in an extremely diverse set of focus areas to enable this, ranging from systems aspects to user factors, each of which has its own methodologies and related communities outside of the multimedia field.

In our multimedia publication venues, we see all this work coming together. However, are we already sufficiently aware of the multidisciplinary potential in our field? Do we make sufficient effort to consider our daily challenges from the perspectives and methodologies of radically different disciplines than our own? Do we make sufficient use of existing experience with problems related to our own but studied in neighboring communities? And how can an increased multidisciplinary awareness help and inspire us to take the field further?

Feeling the need for a stage for multi- and interdisciplinary dialogue within the multimedia community, and beyond its borders, we are excited to serve as editors of this newly established multidisciplinary column of the SIGMM Records. The column will be published as part of the Records, in four issues per year. Content-wise, we foresee a mix of opinion-based articles on multidisciplinary aspects of multimedia and interviews with peers whose work sits at the intersection of disciplines.

Call for contributions
We can only truly highlight the multidisciplinary merit of our field if the extreme diversity of our community is properly reflected in the contributions to this column. Therefore, in addition to invited articles, we are continuously looking for contributions from the community. Do you work at the junction of multimedia and another discipline? Did you gain important professional insights by interacting with neighboring communities? Do you want to share experiences in bridging towards other communities, or towards user audiences who are initially unfamiliar with our common interest areas? Can you contribute meta-perspectives on common case studies and challenges in our field? Do you know someone who should be interviewed or featured in this column? Then please do not hesitate to reach out to us!

We see this column as a great opportunity to shape the multimedia community and raise awareness for multidisciplinary work, as well as neighboring communities. Looking forward to your input!

Cynthia and Jochen

Editor Biographies

Dr. Cynthia C. S. Liem is an Assistant Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. She initiated and co-coordinated the European research project PHENICX (2013–2016), focusing on technological enrichment of symphonic concert recordings with partners such as the Royal Concertgebouw Orchestra. Her research interests include music and multimedia search and recommendation, and are increasingly shifting towards making people discover new interests and content that would not trivially be retrieved. Beyond her academic activities, Cynthia gained industrial experience at Bell Labs Netherlands, Philips Research, and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships and the Google European Doctoral Fellowship 2010 in Multimedia, and a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach.

Dr. Jochen Huber is a Senior User Experience Researcher at Synaptics. Previously, he was an SUTD-MIT postdoctoral fellow in the Fluid Interfaces Group at the MIT Media Lab and the Augmented Human Lab at the Singapore University of Technology and Design. He holds a Ph.D. in Computer Science and degrees in both Mathematics (Dipl.-Math.) and Computer Science (Dipl.-Inform.), all from Technische Universität Darmstadt, Germany. Jochen’s work is situated at the intersection of Human-Computer Interaction and Human Augmentation. He designs, implements, and studies novel input technology in the areas of mobile, tangible & non-visual interaction, automotive UX, and assistive augmentation. He has co-authored over 60 academic publications and regularly serves as a program committee member at premier HCI and multimedia conferences. He was program co-chair of ACM TVX 2016 and Augmented Human 2015, and chaired tracks of ACM Multimedia, ACM Creativity and Cognition, and the ACM International Conference on Interactive Surfaces and Spaces, as well as numerous workshops at ACM CHI and IUI. Further information can be found on his personal homepage: http://jochenhuber.com

Datasets and Benchmarks Column: Introduction

Datasets are critical for research and development as, rather obviously, data is required for performing experiments, validating hypotheses, analyzing designs, and building applications. Over the years a multitude of multimedia collections have been put together, ranging from one-off instances created exclusively to support the work presented in a single paper or demo to collections created with multiple related or separate endeavors in mind. Unfortunately, the collected data is often not made publicly available. In some cases, a public release may not be possible due to the proprietary or sensitive nature of the data, but other forces are also at work. For example, one might be reluctant to share data freely, as it derives value from the often substantial amount of time, effort, and money invested in collecting it.

Once a dataset has been made public though, it becomes possible to perform validations of results reported in the literature and to make comparisons between methods using the same source of truth, although matters are complicated when the source code of the methods is not published or the ground truth labels are not made available. Benchmarks offer a useful compromise by offering a particular task to solve along with the data that one is allowed to use and the evaluation metrics that dictate what is considered success and failure. While benchmarks may not offer the cutting edge of research challenges for which utilizing the freshest data is an absolute requirement, they are a useful sanity check to ensure that methods that appear to work on paper also work in practice and are indeed as good as claimed.
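To make concrete how a benchmark’s fixed evaluation metrics dictate what counts as success and failure, here is a minimal hedged sketch of scoring a retrieval run against ground-truth labels. The function and data are illustrative only, not taken from any particular benchmark:

```python
def precision_recall(retrieved, relevant):
    """Score a retrieved result set against the benchmark's ground-truth relevant items."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall

# A system retrieves 4 items; 3 of the 5 ground-truth relevant items are among them.
p, r = precision_recall(["a", "b", "c", "d"], ["a", "b", "c", "e", "f"])
print(p, r)  # precision = 3/4 = 0.75, recall = 3/5 = 0.6
```

Because every participant is scored with the same metric on the same held-out ground truth, reported numbers become directly comparable, which is exactly the sanity check that benchmarks provide.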

Several efforts are underway to stimulate the sharing of datasets and code, as well as to promote the reproducibility of experiments. These efforts encourage overcoming the reluctance to share data by underlining the ways in which data becomes more valuable with community-wide use. They also offer insights into how researchers can put publicly available datasets to the best possible use. We provide here a couple of key examples of ongoing efforts. At the MMSys conference series, there is a special track for papers on datasets, and Qualinet maintains an index of known multimedia collections. The ACM Artifact Review and Badging policy proposal recommends that journals and conferences adopt a reviewing procedure in which submitted papers can be granted special badges indicating to what extent the performed experiments are repeatable, replicable, and reproducible. For example, the “Artifacts Evaluated – Reusable” badge would indicate that artifacts associated with the research are documented, consistent, complete, and exercisable, and include appropriate evidence of verification and validation, to the extent that reuse and repurposing is facilitated.

In future posts appearing in this column, we will highlight new public datasets and upcoming benchmarks through a series of invited guest posts, as well as provide insights and updates on the latest developments in this area. The columns are edited by Bart Thomee and Martha Larson (see our bios at the end of this post).

To establish a baseline of popular multimedia datasets and benchmarks that have been used over the years by the research community, refer to the table below, which reflects the state of the art as of 2015, when the data was compiled by Bart for his paper on the YFCC100M dataset. We can see the sizes of the datasets steadily increasing over the years and the licenses becoming less restrictive, and it is now the norm to also release additional metadata, precomputed features, and/or ground truth annotations together with the dataset. The last three entries in the table are benchmarks that include tasks such as video surveillance and object localization (TRECVID), diverse image search and music genre recognition (MediaEval), and life-logging event search and medical image analysis (ImageCLEF), to name just a few. The table is most certainly not exhaustive, but it is reflective of the evolution of datasets over the last two decades. We will use this table to provide context for the datasets and benchmarks that we will cover in our upcoming columns, so stay tuned for our next post!

[Table: overview of popular multimedia dataset collections and benchmarks, compiled in 2015]

Bart Thomee is a Software Engineer at Google/YouTube in San Bruno, CA, USA, where he focuses on web-scale real-time streaming and batch techniques to fight abuse, spam, and fraud. He was previously a Senior Research Scientist at Yahoo Labs and Flickr, where his research centered on the visual and spatiotemporal dimensions of media, in order to better understand how people experience and explore the world, and how to better assist them with doing so. He led the development of the YFCC100M dataset released in 2014, and previously was part of the efforts leading to the creation of both MIRFLICKR datasets. He has furthermore been part of the organization of the ImageCLEF photo annotation tasks 2012–2013, the MediaEval placing tasks 2013–2016, and the ACM MM Yahoo-Flickr Grand Challenges 2015–2016. In addition, he has served on the program committees of, amongst others, ACM MM, ICMR, SIGIR, ICWSM and ECIR. He was part of the Steering Committee of the Multimedia COMMONS 2015 workshop at ACM MM and co-chaired the workshop in 2016; he also co-organized the TAIA workshop at SIGIR 2015.

Martha Larson is professor in the area of multimedia information technology at Radboud University in Nijmegen, Netherlands. Previously, she researched and lectured in the area of audio-visual retrieval at Fraunhofer IAIS, Germany, and at the University of Amsterdam, Netherlands. Larson is co-founder of the MediaEval international benchmarking initiative for Multimedia Evaluation. She has contributed to the organization of various other challenges, including CLEF NewsREEL 2015–2017, the ACM RecSys Challenge 2016, and TRECVid Video Hyperlinking 2016. She has served on the program committees of numerous conferences in the areas of information retrieval, multimedia, recommender systems, and speech technology. Other forms of service have included: Area Chair at ACM Multimedia 2013, 2014, and 2017, and TPC Chair at ACM ICMR 2017. Currently, she is an Associate Editor for IEEE Transactions on Multimedia. She is a founding member of the ISCA Special Interest Group on Speech and Language in Multimedia and serves on the IAPR Technical Committee 12 Multimedia and Visual Information Systems. Together with Hayley Hung she developed and currently teaches an undergraduate course in Multimedia Analysis at Delft University of Technology, where she maintains a part-time membership in the Multimedia Computing Group.

Standards Column: JPEG and MPEG

Introduction

The ISO/IEC JTC 1/SC 29 area of work comprises “the standardization of coded representation of audio, picture, multimedia and hypermedia information and sets of compression and control functions for use with such information”. SC 29 basically hosts two working groups responsible for the development of international standards for the compression, decompression, processing, and coded representation of media content, in order to satisfy a wide variety of applications: WG 1 targeting “digital still pictures”, also known as JPEG, and WG 11 targeting “moving pictures, audio, and their combination”, also known as MPEG. The earlier SC 29 standards, namely JPEG, MPEG-1, and MPEG-2, received the Technology & Engineering Emmy Award in 1995–96.

The standards columns within ACM SIGMM Records provide timely updates about the most recent developments within JPEG and MPEG respectively. The JPEG column is edited by Antonio Pinheiro and the MPEG column is edited by Christian Timmerer. The editors and an overview of recent JPEG and MPEG achievements as well as future plans are highlighted in this article.

Antonio Pinheiro received his BSc (Licenciatura) from I.S.T., Lisbon, in 1988 and his PhD in Electronic Systems Engineering from the University of Essex in 2002. He has been a lecturer at U.B.I. (Universidade da Beira Interior), Covilhã, Portugal, since 1988 and is a researcher at I.T. (Instituto de Telecomunicações), Portugal. Currently, his research interests are in image processing, namely multimedia quality evaluation and medical image analysis. He was a Portuguese representative of the European Union actions COST IC1003 – QUALINET, COST IC1206 – DE-ID, and COST 292, and currently of COST BM1304 – MYO-MRI. He is involved in the project EmergIMG, funded by the Portuguese funding agency and H2020, and he is a Portuguese delegate to JPEG, where he currently chairs the Communication Subgroup and is involved with the JPEG Pleno project.

Christian Timmerer received his M.Sc. (Dipl.-Ing.) in January 2003 and his Ph.D. (Dr.techn.) in June 2006 (for research on the adaptation of scalable multimedia content in streaming and constrained environments), both from the Alpen-Adria-Universität (AAU) Klagenfurt. He joined AAU in 1999 (as a system administrator) and is currently an Associate Professor at the Institute of Information Technology (ITEC) within the Multimedia Communication Group. His research interests include immersive multimedia communications, streaming, adaptation, Quality of Experience, and Sensory Experience. He was general chair of WIAMIS 2008, QoMEX 2013, and MMSys 2016, and has participated in several EC-funded projects, notably DANAE, ENTHRONE, P2P-Next, ALICANTE, SocialSensor, COST IC1003 QUALINET, and ICoSOLE. He also participated in ISO/MPEG work for several years, notably in the areas of MPEG-21, MPEG-M, MPEG-V, and MPEG-DASH, where he also served as standard editor. In 2012 he co-founded Bitmovin (http://www.bitmovin.com/) to provide professional services around MPEG-DASH, where he holds the position of Chief Innovation Officer (CIO).

Major JPEG and MPEG Achievements

In this section we would like to highlight major JPEG and MPEG achievements without claiming to be exhaustive.

JPEG developed the well-known digital picture coding standard, the JPEG image format, almost 25 years ago. Due to the recent increase in social network usage, the number of JPEG-encoded images shared online grew to an impressive 1.8 billion per day in 2014. JPEG 2000 is another successful JPEG standard, which also received the 2015 Technology and Engineering Emmy Award. This standard uses state-of-the-art compression technology, providing higher compression and a wider application domain. It is widely used at the professional level, namely in movie production and medical imaging. JPEG also developed the JBIG2, JPEG-LS, JPSearch, and JPEG-XR standards. More recently JPEG launched JPEG-AIC, JPEG Systems, and JPEG-XT. JPEG-XT defines backward-compatible extensions of JPEG, adding support for HDR, lossless/near-lossless coding, and alpha coding. An overview of the JPEG family of standards is shown in the figure below.

[Figure: overview of the JPEG family of standards]
An overview of existing MPEG standards and achievements is shown in the figure below (taken from here).

[Figure: overview of existing MPEG standards and achievements]

A first major milestone and success was the development of MP3, which revolutionized digital audio content, resulting in a sustainable change of the digital media ecosystem. The same holds for MPEG-2 Video & Systems, where the latter, i.e., the MPEG-2 Transport Stream, received a Technology & Engineering Emmy Award. The mobile era within MPEG was introduced with the MPEG-4 standard, resulting in the development of AVC (which received yet another Emmy Award), AAC, and also the MP4 file format, all of which have been deployed widely. Finally, streaming over the open internet is addressed by DASH, and new forms of digital television, including ultra-high-definition and immersive services, are targeted by MPEG-H, comprising MMT, HEVC, and 3D Audio.

Roadmap for Future JPEG and MPEG Standards

In this section we would like to highlight a roadmap for future JPEG and MPEG standards.

A roadmap for future JPEG standards is represented in the figure below. The main efforts are towards the JPEG Pleno project, which aims to standardize new immersive technologies like light fields, point clouds, and digital holography. Moreover, JPEG is launching JPEG XS for low-latency and lightweight coding, while JPEG Systems is also developing a new part to add privacy and security protection to its standards. Furthermore, JPEG continuously seeks new technological developments and is committed to providing new standardized image coding solutions.

[Figure: roadmap for future JPEG standards]

The future roadmap of MPEG standards is shown in the Figure below (taken from here).

[Figure: roadmap for future MPEG standards]

MPEG’s roadmap for future standards comprises a variety of tools, ranging from traditional audio-visual coding to new forms of compression technologies like genome compression and light fields. The systems aspects will cover application domains that require media orchestration, as well as focus on becoming the enabler for immersive media experiences.

Conclusion

In this article we briefly highlighted achievements and future plans of JPEG and MPEG, but the future is not yet defined and requires participation from both industry and academia. We hope that our JPEG and MPEG columns will stimulate research and development within the multimedia domain, and we are open to any kind of feedback. Contact Antonio Pinheiro (pinheiro@ubi.pt) or Christian Timmerer (christian.timmerer@itec.uni-klu.ac.at) with any further questions or comments.

JPEG Column: 74th JPEG Meeting

The 74th JPEG meeting was held at the ITU headquarters in Geneva, Switzerland, from 15 to 20 January, featuring the following highlights:

  • A final Call for Proposals on JPEG Pleno was issued, focusing on light field coding;
  • A test model for the upcoming JPEG XS standard was created;
  • A draft Call for Proposals for JPEG Privacy & Security was issued;
  • The JPEG AIC technical report on guidelines for image coding system evaluation was finalized;
  • An AHG was created to investigate evidence for high-throughput JPEG 2000;
  • An AHG on a next-generation image compression standard was initiated to explore a future image coding format with superior compression efficiency.

JPEG Pleno kicks off its activities towards standardization of light field coding

At the 74th JPEG meeting in Geneva, Switzerland the final Call for Proposals (CfP) on JPEG Pleno was issued particularly focusing on light field coding. The CfP is available here.

The call encompasses coding technologies for lenslet light field cameras and for content produced by high-density arrays of cameras. In addition, system-level solutions associated with light field coding and processing technologies that have a normative impact are called for. At a later stage, calls for other modalities such as point cloud, holographic, and omnidirectional data will be issued, encompassing image representations and new and rich forms of visual data beyond traditional planar image representations.

JPEG Pleno intends to provide a standard framework to facilitate the capture, representation, and exchange of these omnidirectional, depth-enhanced, point cloud, light field, and holographic imaging modalities. It aims to define new tools for improved compression while providing advanced functionality at the system level. Moreover, it aims to support data and metadata manipulation, editing, random access and interaction, protection of privacy and ownership rights, as well as other security mechanisms.

JPEG XS aims at the standardization of a visually lossless, low-latency, lightweight compression scheme that can be used for a wide range of applications, including as a mezzanine codec for the broadcast industry and Pro-AV markets. Targeted use cases are professional video links, IP transport, Ethernet transport, real-time video storage, video memory buffers, and omnidirectional video capture and rendering. After a Call for Proposals issued on March 11, 2016, and the assessment of the submitted technologies, a test model for the upcoming JPEG XS standard was created during the 73rd JPEG meeting in Chengdu, and the results of a first set of core experiments were reviewed during the 74th JPEG meeting in Geneva. More core experiments are on their way before the standard is finalized: the JPEG committee therefore invites interested parties, in particular coding experts, codec providers, system integrators, and potential users of the foreseen solutions, to contribute to the further specification process.


JPEG Privacy & Security aims at developing a standard for secure image information sharing that is capable of ensuring privacy, maintaining data integrity, and protecting intellectual property rights (IPR). JPEG Privacy & Security will explore how to design and implement the necessary features without significantly impacting coding performance, while ensuring scalability, interoperability, and forward and backward compatibility with current JPEG standard frameworks.

A draft Call for Proposals for JPEG Privacy & Security has been issued, and the JPEG committee invites interested parties to contribute to this standardisation activity in JPEG Systems. The draft CfP is available here.

The call addresses protection mechanisms and technologies such as handling hierarchical levels of access and multiple protection levels for metadata and image protection, checking the integrity of image data and embedded metadata, and supporting backward and forward compatibility with JPEG coding technologies. Interested parties are encouraged to subscribe to the JPEG Privacy & Security email reflector for more information. A final version of the JPEG Privacy & Security Call for Proposals is expected at the 75th JPEG meeting in Sydney, Australia.


JPEG AIC provides guidance and standard procedures for advanced image coding evaluation. At this meeting JPEG completed a technical report, TR 29170-1 Guidelines for image coding system evaluation. This report is a compendium of JPEG's best practices in evaluation that draws on several international standards and recommendations. It discusses objective tools, subjective procedures, and computational analysis techniques, and when to use each. Some of the techniques are tried-and-true tools familiar to image compression experts and vision scientists, while others address fields where few tools have been available, such as the evaluation of coding systems for high dynamic range content.


High throughput JPEG 2000

The JPEG committee started a new activity on high throughput JPEG 2000, and an Ad Hoc Group (AHG) was created to investigate the evidence for such a standard. Experts are invited to participate in this expert group and to join the mailing list.


Final Quote

“JPEG continues to offer standards that redefine imaging products and services contributing to a better society without borders.” said Prof. Touradj Ebrahimi, the Convener of the JPEG committee.


About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission (ISO/IEC JTC 1/SC 29/WG 1), and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JBIG, JPEG, JPEG 2000, JPEG XR, JPSearch and, more recently, the JPEG XT, JPEG XS, JPEG Systems and JPEG PLENO families of imaging standards.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro and Tim Bruylants of the JPEG Communication Subgroup at pr@jpeg.org.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on https://listserv.uni-stuttgart.de/mailman/listinfo/jpeg-news. Moreover, you can follow the JPEG Twitter account at http://twitter.com/WG1JPEG.


Future JPEG meetings are planned as follows:

  • No. 75, Sydney, AU, 26 – 31 March, 2017
  • No. 76, Torino, IT, 17 – 21 July, 2017
  • No. 77, Macau, CN, 23 – 27 October 2017


Call for Grand Challenge Problem Proposals

Original page: http://www.acmmm.org/2017/contribute/call-for-multimedia-grand-challenge-proposals/


The Multimedia Grand Challenge was first presented as part of ACM Multimedia 2009 and has established itself as a prestigious competition in the multimedia community.  The purpose of the Multimedia Grand Challenge is to engage with the multimedia research community by establishing well-defined and objectively judged challenge problems intended to exercise state-of-the-art techniques and methods and inspire future research directions.

Industry leaders and academic institutions are invited to submit proposals for specific Multimedia Grand Challenges to be included in this year’s program.

A Grand Challenge proposal should include:

  • A brief description motivating why the challenge problem is important and relevant for the multimedia research community, industry, and/or society today and going forward for the next 3-5 years.
  • A description of a specific set of tasks or goals to be accomplished by challenge problem submissions.
  • Links to relevant datasets to be used for experimentation, training, and evaluation as necessary. Full appropriate documentation on any datasets should be provided or made accessible.
  • A description of rigorously defined objective criteria and/or procedures for how submissions will be judged.
  • Contact information of at least two organizers who will be responsible for accepting and judging submissions as described in the proposal.

Grand Challenge proposals will be considered until March 1st and will be evaluated on an on-going basis as they are received. Grand Challenge proposals that are accepted to be part of the ACM Multimedia 2017 program will be posted on the conference website and included in subsequent calls for participation. All material, datasets, and procedures for a Grand Challenge problem should be ready for dissemination no later than March 14th.

While each Grand Challenge is allowed to define an independent timeline for solution evaluation and may allow iterative resubmission and possible feedback (e.g., a publicly posted leaderboard), challenge submissions must be complete and a paper describing the solution and results should be submitted to the conference program committee by July 14, 2017.

Grand Challenge proposals should be sent via email to the Grand Challenge chair, Ketan Mayer-Patel.

Those interested in submitting a Grand Challenge proposal are encouraged to review the problem descriptions from ACM Multimedia 2016 as examples. These are available here: http://www.acmmm.org/2016/?page_id=353

MPEG Column: 117th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The 117th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects:

  • MPEG issues Committee Draft of the Omnidirectional Media Application Format (OMAF)
  • MPEG-H 3D Audio Verification Test Report
  • MPEG Workshop on 5-Year Roadmap Successfully Held in Geneva
  • Call for Proposals (CfP) for Point Cloud Compression (PCC)
  • Preliminary Call for Evidence on video compression with capability beyond HEVC
  • MPEG issues Committee Draft of the Media Orchestration (MORE) Standard
  • Technical Report on HDR/WCG Video Coding

In this article, I’d like to focus on the topics related to multimedia communication starting with OMAF.

Omnidirectional Media Application Format (OMAF)

Real-time entertainment services deployed over the open, unmanaged Internet – streaming audio and video – now account for more than 70% of the evening traffic in North American fixed access networks, and it is assumed that this figure will reach 80% by 2020. More and more such bandwidth-hungry applications and services are pushing onto the market, including immersive media services such as virtual reality and, specifically, 360-degree video. However, the lack of appropriate standards and, consequently, reduced interoperability is becoming an issue. Thus, MPEG has started a project referred to as the Omnidirectional Media Application Format (OMAF). The first milestone of this standard has been reached and the committee draft (CD) was approved at the 117th MPEG meeting. Such application formats “are essentially superformats that combine selected technology components from MPEG (and other) standards to provide greater application interoperability, which helps satisfy users’ growing need for better-integrated multimedia solutions” [MPEG-A]. In the context of OMAF, the following aspects are defined:

  • Equirectangular projection format (note: others might be added in the future)
  • Metadata for interoperable rendering of 360-degree monoscopic and stereoscopic audio-visual data
  • Storage format: ISO base media file format (ISOBMFF)
  • Codecs: High Efficiency Video Coding (HEVC) and MPEG-H 3D audio

OMAF is the first specification defined as part of a bigger project currently referred to as ISO/IEC 23090 — Immersive Media (Coded Representation of Immersive Media). It currently has the acronym MPEG-I; we previously used MPEG-VR, which has now been replaced by MPEG-I (and that still might change in the future). It is expected that the standard will become a Final Draft International Standard (FDIS) by Q4 of 2017. Interestingly, it does not include AVC and AAC, probably the most obvious candidates for video and audio codecs, which have been massively deployed in the last decade and will probably remain a major dominator (and also denominator) in upcoming years. On the other hand, the equirectangular projection format is currently the only one defined, as it is already broadly used in off-the-shelf hardware/software solutions for the creation of omnidirectional/360-degree videos. Finally, the metadata formats enabling the rendering of 360-degree monoscopic and stereoscopic video are highly appreciated. A solution for MPEG-DASH based on AVC/AAC utilizing the equirectangular projection format for both monoscopic and stereoscopic video is shown as part of Bitmovin’s solution for VR and 360-degree video.
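Since the equirectangular projection is the only format currently defined in OMAF, it is worth recalling how simple the underlying mapping is. The following sketch (an illustrative assumption about axis and orientation conventions, not code taken from the OMAF specification) maps a pixel of an equirectangular image to a viewing direction on the unit sphere:

```python
import math

def equirect_to_sphere(u, v, width, height):
    """Map pixel (u, v) of a width x height equirectangular image to a
    unit direction vector. Convention (an assumption for illustration):
    longitude spans [-pi, pi) left to right, latitude [pi/2, -pi/2]
    top to bottom; (lon=0, lat=0) looks down the +x axis."""
    lon = (u / width) * 2.0 * math.pi - math.pi      # yaw
    lat = math.pi / 2.0 - (v / height) * math.pi     # pitch
    x = math.cos(lat) * math.cos(lon)
    y = math.cos(lat) * math.sin(lon)
    z = math.sin(lat)
    return (x, y, z)
```

Under this convention, the center pixel of the image maps to the forward direction (1, 0, 0); a renderer inverts this mapping per screen pixel to sample the 360-degree texture.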

Research aspects related to OMAF can be summarized as follows:

  • HEVC supports tiles, which allow for efficient streaming of omnidirectional video, but HEVC is not as widely deployed as AVC. Thus, it would be interesting to investigate how to mimic such a tile-based streaming approach utilizing AVC.
  • How to efficiently encode and package HEVC tile-based video is an open issue that calls for a trade-off between tile flexibility and coding efficiency.
  • When combined with MPEG-DASH (or similar), there is a need to update the adaptation logic, as tiles add yet another dimension that needs to be considered in order to provide a good Quality of Experience (QoE).
  • QoE is a big issue here and not well covered in the literature. Various aspects are worth investigating, including a comprehensive dataset to enable reproducibility of research results in this domain. Finally, as omnidirectional video allows for interactivity, the user experience also becomes an issue that the research community needs to cover.
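To make the extra adaptation dimension concrete, here is a toy heuristic (entirely hypothetical, not taken from any standard or existing player) that distributes a bitrate budget across tiles, preferring the tiles inside the current viewport:

```python
def select_tile_qualities(tiles, viewport, bitrate_budget):
    """Toy viewport-aware adaptation heuristic: every tile starts at
    its lowest bitrate, then tiles in the viewport are greedily
    upgraded one quality step at a time while the budget allows.
    tiles: dict tile_id -> list of available bitrates (ascending).
    viewport: set of tile_ids currently visible to the user."""
    choice = {t: rates[0] for t, rates in tiles.items()}  # lowest first
    budget = bitrate_budget - sum(choice.values())
    upgraded = True
    while upgraded:
        upgraded = False
        for t in viewport:
            rates = tiles[t]
            i = rates.index(choice[t])
            if i + 1 < len(rates) and rates[i + 1] - rates[i] <= budget:
                budget -= rates[i + 1] - rates[i]
                choice[t] = rates[i + 1]
                upgraded = True
    return choice
```

A real player would additionally have to predict head motion, since by the time a segment arrives the viewport may have changed; that prediction problem is exactly one of the open QoE questions mentioned above.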

A second topic I’d like to highlight in this blog post is related to the preliminary call for evidence on video compression with capability beyond HEVC. 

Preliminary Call for Evidence on video compression with capability beyond HEVC

A call for evidence is issued to see whether sufficient technological potential exists to start a more rigorous phase of standardization. Currently, MPEG together with VCEG has developed a Joint Exploration Model (JEM) algorithm that is already known to provide bit rate reductions in the range of 20-30% for relevant test cases, as well as subjective quality benefits. The goal of this new standard — with a preliminary target date for completion around late 2020 — is to develop technology providing better compression capability than the existing standard, not only for conventional video material but also for other domains such as HDR/WCG or VR/360-degree video. An important aspect in this area is certainly over-the-top video delivery (as with MPEG-DASH), which includes features such as scalability and Quality of Experience (QoE). Scalable video coding has been added to video coding standards since MPEG-2 but never reached widespread adoption. That might change if it becomes a prime-time feature of a new video codec, as scalable video coding clearly shows benefits when doing dynamic adaptive streaming over HTTP. QoE has already found its way into video coding, at least when it comes to evaluating the results, where subjective tests are now an integral part of every new video codec developed by MPEG (in addition to the usual PSNR measurements). Therefore, the most interesting research topic from a multimedia communication point of view would be to optimize the DASH-like delivery of such new codecs with respect to scalability and QoE. Note that if you don’t like scalable video coding, feel free to propose something else, as long as it reduces storage and networking costs significantly.


MPEG Workshop “Global Media Technology Standards for an Immersive Age”

On January 18, 2017, MPEG successfully held a public workshop on “Global Media Technology Standards for an Immersive Age”, hosting a series of keynotes from Bitmovin, DVB, Orange, Sky Italia, and Technicolor. Stefan Lederer, CEO of Bitmovin, discussed today’s and future challenges with new forms of content like 360°, AR and VR. All slides are available here, and MPEG took the feedback into consideration in an update of its 5-year standardization roadmap. David Wood (EBU) reported on the DVB VR study mission and Ralf Schaefer (Technicolor) presented a snapshot of VR services. Gilles Teniou (Orange) discussed video formats for VR, pointing out a new opportunity to increase the content value but also raising the question of what is missing today. Finally, Massimo Bertolotti (Sky Italia) introduced his view on the immersive media experience age.

Overall, the workshop was well attended and, as mentioned above, MPEG is currently working on a new standards project related to immersive media. Currently, this project comprises five parts. The first part is a technical report describing the scope (including the kind of system architecture), use cases, and applications. The second part is OMAF (see above), and the third and fourth parts are related to immersive video and audio, respectively. Part five is about point cloud compression.

For those interested, please check out the slides from industry representatives in this field and draw your own conclusions about what could be interesting for your own research. I’m happy to see any reactions, hints, etc. in the comments.

Finally, let’s have a look what happened related to MPEG-DASH, a topic with a long history on this blog.

MPEG-DASH and CMAF: Friend or Foe?

For MPEG-DASH and CMAF it was a meeting “in between” official standardization stages. MPEG-DASH experts are still working on the third edition, which will be a consolidated version of the second edition and various amendments and corrigenda. In the meantime, MPEG issued a white paper on the new features of MPEG-DASH, which I would like to highlight here.

  • Spatial Relationship Description (SRD): allows describing tiles and regions of interest for partial delivery of media presentations. This is highly related to OMAF and VR/360-degree video streaming.
  • External MPD linking: this feature allows describing the relationship between a single program/channel and a preview mosaic channel that has all channels at once within the MPD.
  • Period continuity: a simple signaling mechanism to indicate whether one period is a continuation of the previous one, which is relevant for ad insertion or live programs.
  • MPD chaining: allows chaining two or more MPDs to each other, e.g., a pre-roll ad when joining a live program.
  • Flexible segment format for broadcast TV: separates the signaling of switching points and random access points in each stream, so the content can be encoded with good compression efficiency, yet allowing a higher number of random access points with a lower frequency of switching points.
  • Server and network-assisted DASH (SAND): enables asynchronous network-to-client and network-to-network communication of quality-related assisting information.
  • DASH with server push and WebSockets: basically addresses issues related to the HTTP/2 push feature and WebSocket.
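To illustrate the SRD feature, an SRD-annotated MPD fragment could look roughly as follows (a hedged sketch: identifiers, bitrates and file names are made up for illustration, and the `value` string follows the pattern source_id, object_x, object_y, object_width, object_height, total_width, total_height):

```xml
<!-- Illustrative only: one adaptation set describing a tile that covers
     the top-left quarter of a 3840x1920 omnidirectional video frame. -->
<AdaptationSet id="1" mimeType="video/mp4">
  <SupplementalProperty schemeIdUri="urn:mpeg:dash:srd:2014"
                        value="1,0,0,1920,960,3840,1920"/>
  <Representation id="tile1" bandwidth="2000000" width="1920" height="960">
    <BaseURL>tile1.mp4</BaseURL>
  </Representation>
</AdaptationSet>
```

A client that understands SRD can fetch only the tiles overlapping the current viewport, which is exactly the connection to OMAF and tile-based 360-degree streaming mentioned above.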

CMAF issued a study document which captures the current progress, and all national bodies are encouraged to take this into account when commenting on the Committee Draft (CD). To answer the question in the headline above, it looks more and more like DASH and CMAF will become friends — let’s hope that the friendship lasts for a long time.

What else happened at the MPEG meeting?

  • Committee Draft MORE (note: type ‘man more’ on any unix/linux/mac terminal and you’ll get ‘less – opposite of more’ ;)): MORE stands for “Media Orchestration” and provides a specification that enables the automated combination of multiple media sources (cameras, microphones) into a coherent multimedia experience. Additionally, it targets use cases where a multimedia experience is rendered on multiple devices simultaneously, again giving a consistent and coherent experience.
  • Technical Report on HDR/WCG Video Coding: this technical report comprises conversion and coding practices for High Dynamic Range (HDR) and Wide Colour Gamut (WCG) video coding (ISO/IEC 23008-14). The purpose of this document is to provide a set of publicly referenceable recommended guidelines for the operation of AVC or HEVC systems adapted for compressing HDR/WCG video for consumer distribution applications.
  • CfP Point Cloud Compression (PCC): This call solicits technologies for the coding of 3D point clouds with associated attributes such as color and material properties. It will be part of the immersive media project introduced above.
  • MPEG-H 3D Audio verification test report: This report presents results of four subjective listening tests that assessed the performance of the Low Complexity Profile of MPEG-H 3D Audio. The tests covered a range of bit rates and a range of “immersive audio” use cases (i.e., from 22.2 down to 2.0 channel presentations). Seven test sites participated in the tests with a total of 288 listeners.

The next MPEG meeting will be held in Hobart, April 3-7, 2017. Feel free to contact us for any questions or comments.

ACM TVX — Call for Volunteer Associate Chairs

CALL FOR VOLUNTEER ASSOCIATE CHAIRS – Applications for Technical Program Committee

ACM TVX 2017 – International Conference on Interactive Experiences for Television and Online Video
June 14-16, 2017, Hilversum, The Netherlands
www.tvx2017.com


We are welcoming applications to become part of the TVX 2017 Technical Program Committee (TPC), as Associate Chair (AC). This involves playing a key role in the submission and review process, including attendance at the TPC meeting (please note that this is not a call for reviewers, but a call for Associate Chairs). We are opening applications to all members of the community, from both industry and academia, who feel they can contribute to this team.

  • This call is open to new Associate Chairs and to those who have been Associate Chairs in previous years and want to be an Associate Chair again for TVX 2017
  • Application form: https://goo.gl/forms/c9gNPHYZbh2m6VhJ3
  • The application deadline is December 12, 2016

Following the success of previous years’ invitations for open applications to join our Technical Program Committee, we again invite applications for Associate Chairs. Successful applicants will be responsible for arranging and coordinating reviews for around 3 or 4 submissions in the main Full and Short Papers track of ACM TVX 2017, and for attending the Technical Program Committee meeting in Delft, The Netherlands, in mid-March 2017 (participation in person is strongly recommended). Our aim is to broaden participation, ensuring a diverse Technical Program Committee, and to help widen the ACM TVX community to include a full range of perspectives.

We welcome applications from academics, industrial practitioners and (where appropriate) senior PhD students, who have expertise in Human Computer Interaction or related fields, and who have an interest in topics related to interactive experiences for television or online video. We would expect all applicants to have ‘top-tier’ publications related to this area. Applicants should have an expertise or interest in at least one or more topics in our call for papers: https://tvx.acm.org/2017/participation/full-and-short-paper-submissions/

After the application deadline, the volunteers will be considered and selected as ACs, and the TPC Chairs will also be free to invite previous ACs or other researchers from the community to join the team. The ultimate goal is to reach a balanced, diverse and inclusive TPC team in terms of fields of expertise, experience and perspectives, from both academia and industry.

To submit, just fill in the application form above!

CONTACT INFORMATION

For up to date information and further details please visit: www.tvx2017.com or get in touch with the Inclusion Chairs:

Teresa Chambel, University of Lisbon, PT; Rob Koenen, TNO, NL
at: inclusion@tvx2017.com

In collaboration with the Program Chairs: Wendy van den Broeck, Vrije Universiteit Brussel, BE; Mike Darnell, Samsung, USA; Roger Zimmermann, NUS, Singapore