About Christian Timmerer

Christian Timmerer is a researcher, entrepreneur, and teacher on immersive multimedia communication, streaming, adaptation, and Quality of Experience. He is an Assistant Professor at Alpen-Adria-Universität Klagenfurt, Austria. Follow him on Twitter at http://twitter.com/timse7 and subscribe to his blog at http://blog.timmerer.com.

MPEG Column: 120th MPEG Meeting in Macau, China

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

MPEG Plenary Meeting

The MPEG press release comprises the following topics:

  • Point Cloud Compression – MPEG evaluates responses to call for proposal and kicks off its technical work
  • The omnidirectional media format (OMAF) has reached its final milestone
  • MPEG-G standards reach Committee Draft for compression and transport technologies of genomic data
  • Beyond HEVC – The MPEG & VCEG call to set the next standard in video compression
  • MPEG adds better support for mobile environment to MMT
  • New standard completed for Internet Video Coding
  • Evidence of new video transcoding technology using side streams

Point Cloud Compression

At its 120th meeting, MPEG analysed the technologies submitted by nine industry leaders as responses to the Call for Proposals (CfP) for Point Cloud Compression (PCC). These technologies address the lossless or lossy coding of 3D point clouds with associated attributes such as colour and material properties. Point clouds are referred to as unordered sets of points in a 3D space and typically captured using various setups of multiple cameras, depth sensors, LiDAR scanners, etc., but can also be generated synthetically and are in use in several industries. They have recently emerged as representations of the real world enabling immersive forms of interaction, navigation, and communication. Point clouds are typically represented by extremely large amounts of data providing a significant barrier for mass market applications. Thus, MPEG has issued a Call for Proposal seeking technologies that allow reduction of point cloud data for its intended applications. After a formal objective and subjective evaluation campaign, MPEG selected three technologies as starting points for the test models for static, animated, and dynamically acquired point clouds. A key conclusion of the evaluation was that state-of-the-art point cloud compression can be significantly improved by leveraging decades of 2D video coding tools and combining 2D and 3D compression technologies. Such an approach provides synergies with existing hardware and software infrastructures for rapid deployment of new immersive experiences.
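To give a feel for why uncompressed point clouds are a barrier for mass-market applications, here is a minimal sketch of a point with colour attributes and a raw data-rate estimate. The storage layout (three 32-bit floats plus three 8-bit colour channels) and the figures of one million points per frame at 30 frames per second are illustrative assumptions, not values from the CfP.

```python
from dataclasses import dataclass


@dataclass
class Point:
    """One point of a cloud: a 3D position plus a colour attribute."""
    x: float
    y: float
    z: float
    r: int
    g: int
    b: int


# Assumed raw layout: three 32-bit floats for position, three bytes for colour.
BYTES_PER_POINT = 3 * 4 + 3


def raw_bitrate_mbps(points_per_frame, frames_per_second):
    """Uncompressed data rate in Mbit/s for a dynamic point cloud."""
    bytes_per_second = points_per_frame * BYTES_PER_POINT * frames_per_second
    return bytes_per_second * 8 / 1_000_000


# Hypothetical capture: one million points per frame at 30 frames per second.
print(raw_bitrate_mbps(1_000_000, 30))
```

Under these assumptions the raw stream is in the range of several gigabits per second, which is far beyond typical consumer access networks.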

Although the initial selection of technologies for point cloud compression was concluded at the 120th MPEG meeting, it can also be seen as a kick-off for its scientific evaluation and for various further developments, including the optimization thereof. It is expected that various scientific conferences will focus on point cloud compression and may open calls for grand challenges, as for example at IEEE ICME 2018.

Omnidirectional Media Format (OMAF)

The understanding of the virtual reality (VR) potential is growing but market fragmentation caused by the lack of interoperable formats for the storage and delivery of such content stifles VR’s market potential. MPEG’s recently started project referred to as Omnidirectional Media Format (OMAF) has reached Final Draft of International Standard (FDIS) at its 120th meeting. It includes

  • equirectangular projection and cubemap projection as projection formats;
  • signalling of metadata required for interoperable rendering of 360-degree monoscopic and stereoscopic audio-visual data; and
  • a selection of audio-visual codecs for this application.

It also includes technologies to arrange video pixel data in numerous ways to improve compression efficiency and reduce the size of video, a major bottleneck for VR applications and services. The standard also includes technologies for the delivery of OMAF content with MPEG-DASH and MMT.
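As an illustration of the equirectangular projection mentioned above, the following sketch maps a viewing direction to a pixel position in an equirectangular picture. The axis convention (yaw increasing to the right, pitch increasing upwards, picture centre at yaw 0 and pitch 0) and the function name are assumptions made for this example; the normative conventions are defined in the OMAF specification and may differ.

```python
def equirect_pixel(yaw_deg, pitch_deg, width, height):
    """Map a direction to (column, row) in an equirectangular picture.

    Assumed convention: yaw in [-180, 180) grows to the right,
    pitch in [-90, 90] grows upwards, (0, 0) is the picture centre.
    """
    u = (yaw_deg + 180.0) / 360.0   # 0..1 from left to right
    v = (90.0 - pitch_deg) / 180.0  # 0..1 from top to bottom
    # Clamp so that the extreme angles still land inside the picture.
    col = min(int(u * width), width - 1)
    row = min(int(v * height), height - 1)
    return col, row


# Looking straight ahead in a 3840x1920 picture hits the centre pixel.
print(equirect_pixel(0.0, 0.0, 3840, 1920))
```

Because every row covers the full 360 degrees of yaw, rows near the poles are heavily oversampled, which is one reason alternative pixel arrangements are of interest for compression efficiency.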

MPEG has defined a format comprising a minimal set of tools to enable interoperability among implementers of the standard. Various aspects are deliberately excluded from the normative parts to foster innovation leading to novel products and services. This enables us, researchers and practitioners, to experiment with these new formats in various ways and to focus on informative aspects where competition can typically be found. For example, efficient means for encoding and packaging of omnidirectional/360-degree media content and its adaptive streaming, including support for (ultra-)low latency, will become a big issue in the near future.

MPEG-G: Compression and Transport Technologies of Genomic Data

The availability of high throughput DNA sequencing technologies opens new perspectives in the treatment of several diseases making possible the introduction of new global approaches in public health known as “precision medicine”. While routine DNA sequencing in the doctor’s office is still not current practice, medical centers have begun to use sequencing to identify cancer and other diseases and to find effective treatments. As DNA sequencing technologies produce extremely large amounts of data and related information, the ICT costs of storage, transmission, and processing are also very high. The MPEG-G standard addresses and solves the problem of efficient and economical handling of genomic data by providing new

  • compression technologies (ISO/IEC 23092-2) and
  • transport technologies (ISO/IEC 23092-1),

which reached Committee Draft level at its 120th meeting.

Additionally, the Committee Drafts for

  • metadata and APIs (ISO/IEC 23092-3) and
  • reference software (ISO/IEC 23092-4)

are scheduled for the next MPEG meeting and the goal is to publish Draft International Standards (DIS) at the end of 2018.

This new type of (media) content, which requires compression and transport technologies, is emerging within the multimedia community at large and, thus, input is welcome.

Beyond HEVC – The MPEG & VCEG Call to set the Next Standard in Video Compression

The 120th MPEG meeting marked the first major step toward the next-generation video coding standard in the form of a joint Call for Proposals (CfP) with ITU-T SG16’s VCEG. After two years of collaborative informal exploration studies and a gathering of evidence that successfully concluded at the 118th MPEG meeting, MPEG and ITU-T SG16 agreed to issue the CfP for future video coding technology with compression capabilities that significantly exceed those of the HEVC standard and its current extensions. They also formalized an agreement on formation of a joint collaborative team called the “Joint Video Experts Team” (JVET) to work on development of the new planned standard, pending the outcome of the CfP that will be evaluated at the 122nd MPEG meeting in April 2018. To evaluate the proposed compression technologies, formal subjective tests will be performed using video material submitted by proponents in February 2018. The CfP includes the testing of technology for 360° omnidirectional video coding and the coding of content with high-dynamic range and wide colour gamut in addition to conventional standard-dynamic-range camera content. Anticipating a strong response to the call, a “test model” draft design is expected to be selected in 2018, with development of a potential new standard in late 2020.

The major goal of a new video coding standard is to be better than its predecessor (HEVC). Typically this “better” is quantified as 50%, which means that it should be possible to encode the video at the same quality with half of the bitrate, or at a significantly higher quality with the same bitrate. However, at this time the “Joint Video Experts Team” (JVET) from MPEG and ITU-T SG16 faces competition from the Alliance for Open Media, which is working on AV1. In any case, we are looking forward to an exciting time frame from now until this new codec is ratified and to seeing how it will perform compared to AV1. Multimedia systems and applications will also benefit from new codecs, which will gain traction as soon as first implementations of this new codec become available (note that AV1 is already available as open source and is continuously being developed further).
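The 50% target can be made concrete with a small calculation. The sketch below assumes a hypothetical HEVC stream at 8000 kbit/s and shows what a given bitrate saving at equal quality would mean for the bitrate and for the data volume per hour; the function names and the numbers are illustrative only.

```python
def bitrate_for_same_quality(reference_kbps, saving):
    """Bitrate needed for equal quality given a relative saving (0.5 = 50%)."""
    if not 0.0 <= saving < 1.0:
        raise ValueError("saving must be in [0, 1)")
    return reference_kbps * (1.0 - saving)


def gigabytes_per_hour(kbps):
    """Data volume of one hour of video at the given bitrate, in GB (10^9 bytes)."""
    return kbps * 1000 * 3600 / 8 / 1e9


hevc_kbps = 8000  # hypothetical HEVC bitrate for some target quality
new_kbps = bitrate_for_same_quality(hevc_kbps, 0.5)

print(new_kbps)                       # bitrate of the new codec at equal quality
print(gigabytes_per_hour(hevc_kbps))  # data volume per hour with HEVC
print(gigabytes_per_hour(new_kbps))   # data volume per hour with the new codec
```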

MPEG adds Better Support for Mobile Environment to MPEG Media Transport (MMT)

MPEG has approved the Final Draft Amendment (FDAM) to MPEG Media Transport (MMT; ISO/IEC 23008-1:2017), which is referred to as “MMT enhancements for mobile environments”. In order to reflect industry needs on MMT, which has been well adopted by broadcast standards such as ATSC 3.0 and Super Hi-Vision, it addresses several important issues on the efficient use of MMT in mobile environments. For example, it adds a distributed resource identification message to facilitate multipath delivery and a transition request message to change the delivery path of an active session. This amendment also introduces the concept of an MMT-aware network entity (MANE), which might be placed between the original server and the client, and provides a detailed description of how to use it for both improving efficiency and reducing delay of delivery. Additionally, this amendment provides a method to use WebSockets to set up and control an MMT session/presentation.

New Standard Completed for Internet Video Coding

A new standard for video coding suitable for the internet as well as other video applications was completed at the 120th MPEG meeting. The Internet Video Coding (IVC) standard was developed with the intention of providing the industry with an “Option 1” video coding standard. In ISO/IEC language, this refers to a standard for which patent holders have declared a willingness to grant licenses free of charge to an unrestricted number of applicants for all necessary patents on a worldwide, non-discriminatory basis and under other reasonable terms and conditions, to enable others to make, use, and sell implementations of the standard. At the time of completion of the IVC standard, the specification contained no identified necessary patent rights except those available under Option 1 licensing terms. During the development of IVC, MPEG removed from the draft standard any necessary patent rights that it was informed were not available under such Option 1 terms, and MPEG is optimistic about the outlook for the new standard. MPEG encourages interested parties to provide information about any other similar cases. The IVC standard has a compression capability roughly similar to that of the earlier AVC standard, which has become the most widely deployed video coding technology in the world. Tests have been conducted to verify IVC’s strong technical capability, and the new standard has also been shown to have relatively modest implementation complexity requirements.

Evidence of new Video Transcoding Technology using Side Streams

Following a “Call for Evidence” (CfE) issued by MPEG in July 2017, evidence was evaluated at the 120th MPEG meeting to investigate whether video transcoding technology has been developed for transcoding assisted by side data streams that is capable of significantly reducing the computational complexity without reducing compression efficiency. The evaluations of the four responses received included comparisons of the technology against adaptive bit-rate streaming using simulcast as well as against traditional transcoding using full video re-encoding. The responses span the compression efficiency space between simulcast and full transcoding, with trade-offs between the bit rate required for distribution within the network and the bit rate required for delivery to the user. All four responses provided a substantial computational complexity reduction compared to transcoding using full re-encoding. MPEG plans to further investigate transcoding technology and is soliciting expressions of interest from industry on the need for standardization of such assisted transcoding using side data streams.

MPEG currently works on two related topics, which are referred to as network-distributed video coding (NDVC) and network-based media processing (NBMP). Both activities involve the network, which is more and more evolving into a highly distributed compute and delivery platform, as opposed to a bit pipe that is supposed to deliver data as fast as possible from A to B. This phenomenon could also be interesting when looking at developments around 5G, which is actually much more than just a radio access technology. These activities are certainly worth monitoring, as they contribute to making networked media resources accessible or even programmable. In this context, I would like to refer the interested reader to the December’17 theme of the IEEE Computer Society Computing Now, which is about Advancing Multimedia Content Distribution.


Publicly available documents from the 120th MPEG meeting can be found here (scroll down to the end of the page). The next MPEG meeting will be held in Gwangju, Korea, January 22-26, 2018. Feel free to contact Christian Timmerer for any questions or comments.


Some of the activities reported above are considered within the Call for Papers of the 23rd Packet Video Workshop (PV 2018), co-located with ACM MMSys 2018 in Amsterdam, The Netherlands. Topics of interest include (but are not limited to):

  • Adaptive media streaming, and content storage, distribution and delivery
  • Network-distributed video coding and network-based media processing
  • Next-generation/future video coding, point cloud compression
  • Audiovisual communication, surveillance and healthcare systems
  • Wireless, mobile, IoT, and embedded systems for multimedia applications
  • Future media internetworking: information-centric networking and 5G
  • Immersive media: virtual reality (VR), augmented reality (AR), 360° video and multi-sensory systems, and its streaming
  • Machine learning in media coding and streaming systems
  • Standardization: DASH, MMT, CMAF, OMAF, MiAF, WebRTC, MSE, EME, WebVR, Hybrid Media, WAVE, etc.
  • Applications: social media, game streaming, personal broadcast, healthcare, industry 4.0, education, transportation, etc.

Important dates

  • Submission deadline: March 1, 2018
  • Acceptance notification: April 9, 2018
  • Camera-ready deadline: April 19, 2018

Report from ACM MMSys 2017

–A report from Christian Timmerer, AAU/Bitmovin Austria

The ACM Multimedia Systems Conference (MMSys) provides a forum for researchers to present and share their latest research findings in multimedia systems. It is a unique event targeting “multimedia systems” from various angles and views across all domains instead of focusing on a specific aspect or data type. ACM MMSys’17 was held in Taipei, Taiwan, on June 20-23, 2017.

MMSys is a single-track conference which also hosts a series of workshops, namely NOSSDAV, MMVE, and NetGames. Since 2016, it has kicked off with overview talks, and in 2017 we saw the following talks: “Geometric representations of 3D scenes” by Geraldine Morin; “Towards Understanding Truly Immersive Multimedia Experiences” by Niall Murray; “Rate Control In The Age Of Vision” by Ketan Mayer-Patel; “Humans, computers, delays and the joys of interaction” by Ragnhild Eg; “Context-aware, perception-guided workload characterization and resource scheduling on mobile phones for interactive applications” by Chung-Ta King and Chun-Han Lin.

Additionally, industry talks have been introduced: “Virtual Reality – The New Era of Future World” by WeiGing Ngang; “The innovation and challenge of Interactive streaming technology” by Wesley Kuo; “What challenges are we facing after Netflix revolutionized TV watching?” by Shuen-Huei Guan; “The overview of app streaming technology” by Sam Ding; “Semantic Awareness in 360 Streaming” by Shannon Chen; “On the frontiers of Video SaaS” by Sega Cheng.

An interesting set of keynotes presented different aspects related to multimedia systems and its co-located workshops:

  • Henry Fuchs, The AR/VR Renaissance: opportunities, pitfalls, and remaining problems
  • Julien Lai, Towards Large-scale Deployment of Intelligent Video Analytics Systems
  • Dah Ming Chiu, Smart Streaming of Panoramic Video
  • Bo Li, When Computation Meets Communication: The Case for Scheduling Resources in the Cloud
  • Polly Huang, Measuring Subjective QoE for Interactive System Design in the Mobile Era – Lessons Learned Studying Skype Calls

The program included a diverse set of topics such as immersive experiences in AR and VR, network optimization and delivery, multisensory experiences, processing, rendering, interaction, cloud-based multimedia, IoT connectivity, infrastructure, media streaming, and security. A vital aspect of MMSys is dedicated sessions for showcasing latest developments in the area of multimedia systems and presenting datasets, which is important towards enabling reproducibility and sustainability in multimedia systems research.

The social events were a perfect venue for networking and in-depth discussions on how to advance the state of the art. A welcome reception was held at “LE BLE D’OR (Miramar)”, the conference banquet took place at the Taipei World Trade Center Club, and finally a tour to the Shilin Night Market was organized.

ACM MMSys 2017 issued the following awards:

  • The Best Paper Award goes to “A Scalable and Privacy-Aware IoT Service for Live Video Analytics” by Junjue Wang (Carnegie Mellon University), Brandon Amos (Carnegie Mellon University), Anupam Das (Carnegie Mellon University), Padmanabhan Pillai (Intel Labs), Norman Sadeh (Carnegie Mellon University), and Mahadev Satyanarayanan (Carnegie Mellon University).
  • The Best Student Paper Award goes to “A Measurement Study of Oculus 360 Degree Video Streaming” by Chao Zhou (SUNY Binghamton), Zhenhua Li (Tsinghua University), and Yao Liu (SUNY Binghamton).
  • The NOSSDAV’17 Best Paper Award goes to “A Comparative Case Study of HTTP Adaptive Streaming Algorithms in Mobile Networks” by Theodoros Karagkioules (Huawei Technologies France/Telecom ParisTech), Cyril Concolato (Telecom ParisTech), Dimitrios Tsilimantos (Huawei Technologies France), Stefan Valentin (Huawei Technologies France).

Excellence in DASH award sponsored by the DASH-IF 

  • 1st place: “SAP: Stall-Aware Pacing for Improved DASH Video Experience in Cellular Networks” by Ahmed Zahran (University College Cork), Jason J. Quinlan (University College Cork), K. K. Ramakrishnan (University of California, Riverside), and Cormac J. Sreenan (University College Cork)
  • 2nd place: “Improving Video Quality in Crowded Networks Using a DANE” by Jan Willem Kleinrouweler, Britta Meixner and Pablo Cesar (Centrum Wiskunde & Informatica)
  • 3rd place: “Towards Bandwidth Efficient Adaptive Streaming of Omnidirectional Video over HTTP” by Mario Graf (Bitmovin Inc.), Christian Timmerer (Alpen-Adria-Universität Klagenfurt / Bitmovin Inc.), and Christopher Mueller (Bitmovin Inc.)

Finally, student travel grant awards have been sponsored by SIGMM. All details including nice pictures can be found here.


ACM MMSys 2018 will be held in Amsterdam, The Netherlands, June 12 – 15, 2018 and includes the following tracks:

  • Research track: Submission deadline on November 30, 2017
  • Demo track: Submission deadline on February 25, 2018
  • Open Dataset & Software Track: Submission deadline on February 25, 2018

MMSys’18 co-locates the following workshops (with submission deadline on March 1, 2018):

  • MMVE2018: 10th International Workshop on Immersive Mixed and Virtual Environment Systems,
  • NetGames2018: 16th Annual Workshop on Network and Systems Support for Games,
  • NOSSDAV2018: 28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video,
  • PV2018: 23rd Packet Video Workshop

MMSys’18 includes the following special sessions (submission deadline on December 15, 2017):

MPEG Column: 119th MPEG Meeting in Turin, Italy

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The MPEG press release comprises the following topics:

  • Evidence of New Developments in Video Compression Coding
  • Call for Evidence on Transcoding for Network Distributed Video Coding
  • 2nd Edition of Storage of Sample Variants reaches Committee Draft
  • New Technical Report on Signalling, Backward Compatibility and Display Adaptation for HDR/WCG Video Coding
  • Draft Requirements for Hybrid Natural/Synthetic Scene Data Container

Evidence of New Developments in Video Compression Coding

At the 119th MPEG meeting, responses to the previously issued call for evidence have been evaluated and they have all successfully demonstrated evidence. The call requested responses for use cases of video coding technology in three categories:

  • standard dynamic range (SDR) — two responses;
  • high dynamic range (HDR) — two responses; and
  • 360° omnidirectional video — four responses.

The evaluation of the responses included subjective testing and an assessment of the performance of the “Joint Exploration Model” (JEM). The results indicate significant gains over HEVC for a considerable number of test cases with comparable subjective quality at 40-50% less bit rate compared to HEVC for the SDR and HDR test cases with some positive outliers (i.e., higher bit rate savings). Thus, the MPEG-VCEG Joint Video Exploration Team (JVET) concluded that evidence exists of compression technology that may significantly outperform HEVC after further development to establish a new standard. As a next step, the plan is to issue a call for proposals at the 120th MPEG meeting (October 2017), with responses expected to be evaluated at the 122nd MPEG meeting (April 2018).

We already witness an increase in research articles addressing video coding technologies with capabilities beyond HEVC, and their number will further increase in the future. The main driving force is over-the-top (OTT) delivery, which calls for more efficient bandwidth utilization. However, competition is also increasing with the emergence of AV1 of AOMedia, and we may also observe an increasing number of articles in that direction, including evaluations thereof. Another interesting aspect is that the number of use cases is increasing as well (e.g., see the different categories above), which adds further challenges to the “complex video problem”.

Call for Evidence on Transcoding for Network Distributed Video Coding

The call for evidence on transcoding for network distributed video coding targets interested parties possessing technology providing transcoding of video at lower computational complexity than transcoding done using a full re-encode. The primary application is adaptive bitrate streaming where a highest bitrate stream is transcoded into lower bitrate streams. It is expected that responses may use “side streams” (or side information, some may call it metadata) accompanying the highest bitrate stream to assist in the transcoding process. MPEG expects submissions for the 120th MPEG meeting where compression efficiency and computational complexity will be assessed.

Transcoding has been discussed already for a long time and I can certainly recommend this article from 2005 published in the Proceedings of the IEEE. The question is, what is different now, 12 years later, and what metadata (or side streams/information) is required for interoperability among different vendors (if any)?

A Brief Overview of Remaining Topics…

  • The 2nd edition of storage of sample variants reaches Committee Draft and expands its usage to MPEG-2 transport stream whereas the first edition primarily focused on ISO base media file format.
  • The new technical report for high dynamic range (HDR) and wide colour gamut (WCG) video coding comprises a survey of various signaling mechanisms including backward compatibility and display adaptation.
  • MPEG issues draft requirements for a scene representation media container enabling the interchange of content for authoring and rendering rich immersive experiences which is currently referred to as hybrid natural/synthetic scene (HNSS) data container.

Other MPEG (Systems) Activities at the 119th Meeting

DASH is in full maintenance mode as only minor enhancements/corrections have been discussed, including contributions to conformance and reference software. The omnidirectional media format (OMAF) is certainly the hottest topic within MPEG systems. It is actually between two stages (i.e., between DIS and FDIS) and, thus, a study of DIS has been approved and national bodies are kindly requested to take this into account when casting their votes (incl. comments). The study of DIS comprises format definitions with respect to coding and storage of omnidirectional media including audio and video (aka 360°). The common media application format (CMAF) was ratified at the last meeting and awaits publication by ISO. In the meantime, CMAF is focusing on conformance and reference software as well as amendments regarding various media profiles. Finally, requirements for a multi-image application format (MiAF) have been available since the last meeting, and at the 119th MPEG meeting a working draft has been approved. MiAF will be based on HEIF and the goal is to define additional constraints to simplify its file format options.

We have successfully demonstrated live 360 adaptive streaming as described here, but we expect various improvements from standards that are available or under development within MPEG. Research aspects in these areas are certainly interesting with respect to performance gains and evaluations of bandwidth efficiency in open networks, as well as how these standardization efforts could be used to enable new use cases.

Publicly available documents from the 119th MPEG meeting can be found here (scroll down to the end of the page). The next MPEG meeting will be held in Macau, China, October 23-27, 2017. Feel free to contact me for any questions or comments.

MPEG Column: 118th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The entire MPEG press release can be found here comprising the following topics:

  • Coded Representation of Immersive Media (MPEG-I): new work item approved and call for test data issued
  • Common Media Application Format (CMAF): FDIS approved
  • Beyond High Efficiency Video Coding (HEVC): call for evidence for “beyond HEVC” and verification tests for screen content coding extensions of HEVC

Coded Representation of Immersive Media (MPEG-I)

MPEG started to work on the new work item referred to as ISO/IEC 23090 with the “nickname” MPEG-I targeting future immersive applications. The goal of this new standard is to enable various forms of audio-visual immersion including panoramic video with 2D and 3D audio with various degrees of true 3D visual perception. It currently comprises five parts: (pt. 1) a technical report describing the scope of this new standard and a set of use cases and applications; (pt. 2) an application format for omnidirectional media (aka OMAF) to address the urgent need of the industry for a standard in this area; (pt. 3) immersive video which is a kind of placeholder for the successor of HEVC (if at all); (pt. 4) immersive audio as a placeholder for the successor of 3D audio (if at all); and (pt. 5) point cloud compression. The point cloud compression standard targets lossy compression for point clouds in real-time communication, six Degrees of Freedom (6 DoF) virtual reality, and the dynamic mapping for autonomous driving, cultural heritage applications, etc. Part 2 is related to OMAF which I’ve discussed in my previous blog post.

MPEG also established an Ad-hoc Group (AhG) on immersive media quality evaluation with the following mandates: 1. Produce a document on VR QoE requirements; 2. Collect test material with immersive video and audio signals; 3. Study existing methods to assess human perception and reaction to VR stimuli; 4. Develop test methodology for immersive media, including simultaneous video and audio; 5. Study VR experience metrics and their measurability in VR services and devices. AhGs are open to everybody and discussions mostly take place on mailing lists (join here https://lists.aau.at/mailman/listinfo/immersive-quality). Interestingly, a Joint Qualinet-VQEG team on Immersive Media (JQVIM) has recently been established with similar goals, and the VR Industry Forum (VRIF) has also issued a call for VR360 content. It seems there’s a strong need for a dataset similar to the one we created for MPEG-DASH a long time ago.

The JQVIM has been created as part of the QUALINET task force on “Immersive Media Experiences (IMEx)” which aims at providing end users the sensation of being part of the particular media which shall result in a worthwhile, informative user and quality of experience. The main goals are providing datasets and tools (hardware/software), subjective quality evaluations, field studies, and cross-validation, including a strong theoretical foundation alongside the empirical databases and tools, which hopefully results in a framework, methodology, and best practices for immersive media experiences.

Common Media Application Format (CMAF)

The Final Draft International Standard (FDIS) has been issued at the 118th MPEG meeting, which concludes the formal technical development process of the standard. At this point in time national bodies can only vote Yes|No, and only editorial changes are allowed (if any) before the International Standard (IS) becomes available. The goal of CMAF is to define a single format for the transport and storage of segmented media including audio/video formats, subtitles, and encryption; it is derived from the ISO Base Media File Format (ISOBMFF). As it’s a combination of various MPEG standards, it’s referred to as an Application Format (AF), which mainly takes existing formats/standards and glues them together for a specific target application. The CMAF standard clearly targets dynamic adaptive streaming (over, but not limited to, HTTP) but focuses on the media format only and excludes the manifest format. Thus, the CMAF standard shall be compatible with other formats such as MPEG-DASH and HLS. In fact, HLS was already extended some time ago to support ‘fragmented MP4’, which we have also demonstrated, and this has been interpreted as a first step towards the harmonization of MPEG-DASH and HLS, at least on the segment format. The delivery of CMAF contents with DASH will be described in part 7 of MPEG-DASH, which basically comprises a mapping of CMAF concepts to DASH terms.
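Since CMAF is derived from ISOBMFF, a first way to explore segmented media files is to list their top-level boxes. The following is a minimal sketch based on the generic ISOBMFF box header (a 32-bit big-endian size followed by a four-character type, with size 1 signalling a 64-bit size and size 0 meaning "until the end of the data"). It does not check any CMAF-specific constraints, and the function name is made up for this example.

```python
import struct


def list_boxes(data):
    """Return (type, size) for each top-level ISOBMFF box in a byte string."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the box type.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header:
            break  # malformed box, stop instead of looping forever
        boxes.append((box_type.decode("ascii", errors="replace"), size))
        offset += size
    return boxes


# Synthetic example: a 16-byte 'ftyp' box followed by an empty 'moof' box.
sample = struct.pack(">I4s", 16, b"ftyp") + b"\x00" * 8 + struct.pack(">I4s", 8, b"moof")
print(list_boxes(sample))
```

Applied to a fragmented MP4 file read with `open(path, "rb").read()`, this typically shows boxes such as `ftyp`, `moov`, `moof`, and `mdat`.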

From a research perspective, it would be interesting to explore how certain CMAF concepts are able to address current industry needs, specifically in the context of low-latency streaming which has been demonstrated recently.

Beyond HEVC…

The preliminary call for evidence (CfE) on video compression with capability beyond HEVC has been issued and is addressed to interested parties that have technology providing better compression capability than the existing standard, either for conventional video material, or for other domains such as HDR/WCG or 360-degree (“VR”) video. Test cases are defined for SDR, HDR, and 360-degree content. This call has been made jointly by ISO/IEC MPEG and ITU-T SG16/Q6 (VCEG). The evaluation of the responses is scheduled for July 2017 and depending on the outcome of the CfE, the parent bodies of the Joint Video Exploration Team (JVET) of MPEG and VCEG collaboration intend to issue a Draft Call for Proposals by the end of the July meeting.

Finally, verification tests have been conducted for the Screen Content Coding (SCC) extensions to HEVC showing exceptional performance. Screen content is video containing a significant proportion of rendered (moving or static) graphics, text, or animation rather than, or in addition to, camera-captured video scenes. For scenes containing a substantial amount of text and graphics, the tests showed a major benefit in compression capability for the new extensions over both the Advanced Video Coding standard and the previous version of the newer HEVC standard without the new SCC features.

The question of whether and how new codecs like (beyond) HEVC compete with AV1 is subject to research and development. It has also been discussed in the scientific literature, but a vendor-neutral comparison is lacking; such a comparison is difficult to achieve without comparing apples with oranges (due to the high number of different coding tools and parameters). An important aspect which always needs to be considered is that one typically compares specific implementations of a coding format and not the standard itself, as the encoding is usually not defined, only the bitstream syntax that implicitly defines the decoder.

Publicly available documents from the 118th MPEG meeting can be found here (scroll down to the end of the page). The next MPEG meeting will be held in Torino, Italy, July 17-21, 2017. Feel free to contact us for any questions or comments.

Standards Column: JPEG and MPEG

Introduction

The area of work of ISO/IEC JTC 1/SC 29 comprises the standardization of coded representation of audio, picture, multimedia and hypermedia information and sets of compression and control functions for use with such information. SC29 basically hosts two working groups responsible for the development of international standards for the compression, decompression, processing, and coded representation of media content, in order to satisfy a wide variety of applications, specifically WG1 targeting “digital still pictures” (also known as JPEG) and WG11 targeting “moving pictures, audio, and their combination” (also known as MPEG). The earlier SC29 standards, namely JPEG, MPEG-1 and MPEG-2, received the technology & engineering Emmy award in 1995-96.

The standards columns within ACM SIGMM Records provide timely updates about the most recent developments within JPEG and MPEG respectively. The JPEG column is edited by Antonio Pinheiro and the MPEG column is edited by Christian Timmerer. The editors and an overview of recent JPEG and MPEG achievements as well as future plans are highlighted in this article.

Antonio Pinheiro received the BSc (Licenciatura) from I.S.T., Lisbon in 1988 and the PhD in Electronic Systems Engineering from University of Essex in 2002. He has been a lecturer at U.B.I. (Universidade da Beira Interior), Covilha, Portugal since 1988 and is a researcher at I.T. (Instituto de Telecomunicações), Portugal. Currently, his research interests are in Image Processing, namely Multimedia Quality Evaluation and Medical Image Analysis. He was a Portuguese representative of the European Union Actions COST IC1003 – QUALINET, COST IC1206 – DE-ID, COST 292 and currently of COST BM1304 – MYO-MRI. He is currently involved in the project EmergIMG funded by the Portuguese Funding agency and H2020, and he is a Portuguese delegate to JPEG, where he is currently the Communication Subgroup chair and involved with the JPEG Pleno project.

Christian Timmerer received his M.Sc. (Dipl.-Ing.) in January 2003 and his Ph.D. (Dr.techn.) in June 2006 (for research on the adaptation of scalable multimedia content in streaming and constrained environments) both from the Alpen-Adria-Universität (AAU) Klagenfurt. He joined the AAU in 1999 (as a system administrator) and is currently an Associate Professor at the Institute of Information Technology (ITEC) within the Multimedia Communication Group. His research interests include immersive multimedia communications, streaming, adaptation, Quality of Experience, and Sensory Experience. He was the general chair of WIAMIS 2008, QoMEX 2013, and MMSys 2016 and has participated in several EC-funded projects, notably DANAE, ENTHRONE, P2P-Next, ALICANTE, SocialSensor, COST IC1003 QUALINET, and ICoSOLE. He also participated in ISO/MPEG work for several years, notably in the area of MPEG-21, MPEG-M, MPEG-V, and MPEG-DASH where he also served as standard editor. In 2012 he cofounded Bitmovin (http://www.bitmovin.com/) to provide professional services around MPEG-DASH where he holds the position of the Chief Innovation Officer (CIO).

Major JPEG and MPEG Achievements

In this section we would like to highlight major JPEG and MPEG achievements without claiming to be exhaustive.

JPEG developed the well-known digital picture coding standard, known as the JPEG image format, almost 25 years ago. Due to the recent increase in social network usage, the number of JPEG encoded images shared online grew to an impressive 1.8 billion per day in 2014. JPEG 2000 is another successful JPEG standard, which also received the 2015 Technology and Engineering Emmy award. This standard uses state-of-the-art compression technology providing higher compression and a wider application domain. It is widely used at the professional level, namely in movie production and medical imaging. JPEG also developed the JBIG2, JPEG-LS, JPSearch and JPEG-XR standards. More recently JPEG launched JPEG-AIC, JPEG Systems and JPEG-XT. JPEG-XT defines backward compatible extensions of JPEG, adding support for HDR, lossless/near lossless, and alpha coding. An overview of the JPEG family of standards is shown in the figure below.

JPEGstandards

An overview of existing MPEG standards and achievements is shown in the figure below (taken from here).

MPEGStandards

A first major milestone and success was the development of MP3 which revolutionized digital audio content resulting in a sustainable change of the digital media ecosystem. The same holds for MPEG-2 video & systems where the latter, i.e., MPEG-2 Transport Stream, received the technology & engineering Emmy award. The mobile era within MPEG has been introduced with the MPEG-4 standard resulting in the development of AVC (received yet another Emmy award), AAC, and also the MP4 file format which have been deployed widely. Finally, streaming over the open internet is addressed by DASH and new forms of digital television including ultra high-definition & immersive services are targeted by MPEG-H comprising MMT, HEVC, and 3D audio.

Roadmap for Future JPEG and MPEG Standards

In this section we would like to highlight a roadmap for future JPEG and MPEG standards.

A roadmap for future JPEG standards is represented in the figure below. The main efforts are towards the JPEG Pleno project that aims to standardize new immersive technologies like light fields, point clouds or digital holography. Moreover, JPEG is launching JPEG-XS for low-latency and lightweight coding, while JPEG Systems is also developing a new part to add privacy and security protection to their standards. Furthermore, JPEG is continuously seeking new technological developments and is committed to providing new standardized image coding solutions.

JPEGroadmap

The future roadmap of MPEG standards is shown in the Figure below (taken from here).

MPEGRoadmap

MPEG’s roadmap for future standards comprises a variety of tools ranging from traditional audio-video coding to new forms of compression technologies like genome compression and light fields. The systems aspects will cover application domains which require media orchestration and will focus on becoming the enabler for immersive media experiences.

Conclusion

In this article we briefly highlighted achievements and future plans of JPEG and MPEG but the future is not defined and requires participation from both industry and academia. We hope that our JPEG and MPEG columns will stimulate research and development within the multimedia domain and we are open for any kind of feedback. Contact Antonio Pinheiro (pinheiro@ubi.pt) or Christian Timmerer (christian.timmerer@itec.uni-klu.ac.at) for any further questions or comments.

MPEG Column: 117th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The 117th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects:

  • MPEG issues Committee Draft of the Omnidirectional Media Application Format (OMAF)
  • MPEG-H 3D Audio Verification Test Report
  • MPEG Workshop on 5-Year Roadmap Successfully Held in Geneva
  • Call for Proposals (CfP) for Point Cloud Compression (PCC)
  • Preliminary Call for Evidence on video compression with capability beyond HEVC
  • MPEG issues Committee Draft of the Media Orchestration (MORE) Standard
  • Technical Report on HDR/WCG Video Coding

In this article, I’d like to focus on the topics related to multimedia communication starting with OMAF.

Omnidirectional Media Application Format (OMAF)

Real-time entertainment services deployed over the open, unmanaged Internet (streaming audio and video) now account for more than 70% of the evening traffic in North American fixed access networks and it is assumed that this figure will reach 80% by 2020. More and more such bandwidth-hungry applications and services are pushing onto the market, including immersive media services such as virtual reality and, specifically, 360-degree videos. However, the lack of appropriate standards and, consequently, reduced interoperability is becoming an issue. Thus, MPEG has started a project referred to as Omnidirectional Media Application Format (OMAF). The first milestone of this standard has been reached and the committee draft (CD) has been approved at the 117th MPEG meeting. Such application formats “are essentially superformats that combine selected technology components from MPEG (and other) standards to provide greater application interoperability, which helps satisfy users’ growing need for better-integrated multimedia solutions” [MPEG-A]. In the context of OMAF, the following aspects are defined:

  • Equirectangular projection format (note: others might be added in the future)
  • Metadata for interoperable rendering of 360-degree monoscopic and stereoscopic audio-visual data
  • Storage format: ISO base media file format (ISOBMFF)
  • Codecs: High Efficiency Video Coding (HEVC) and MPEG-H 3D audio

OMAF is the first specification which is defined as part of a bigger project currently referred to as ISO/IEC 23090, Immersive Media (Coded Representation of Immersive Media). It currently has the acronym MPEG-I; we have previously used MPEG-VR, which is now replaced by MPEG-I (and that still might change in the future). It is expected that the standard will become Final Draft International Standard (FDIS) by Q4 of 2017. Interestingly, it does not include AVC and AAC, probably the most obvious candidates for video and audio codecs which have been massively deployed in the last decade and probably still will be a major dominator (and also denominator) in upcoming years. On the other hand, the equirectangular projection format is currently the only one defined as it is already broadly used in off-the-shelf hardware/software solutions for the creation of omnidirectional/360-degree videos. Finally, the metadata formats enabling the rendering of 360-degree monoscopic and stereoscopic video are highly appreciated. A solution for MPEG-DASH based on AVC/AAC utilizing the equirectangular projection format for both monoscopic and stereoscopic video is shown as part of Bitmovin’s solution for VR and 360-degree video.
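For readers less familiar with the equirectangular projection, the following Python sketch shows the basic idea: longitude and latitude of a viewing direction map linearly to the horizontal and vertical pixel axes. The axis conventions (yaw 0 at the image centre, pitch +90 degrees at the top row) and the function name sphere_to_equirect are assumptions for illustration and are not taken from the OMAF text, which defines its own coordinate system.

```python
def sphere_to_equirect(yaw_deg, pitch_deg, width, height):
    """Map a viewing direction to pixel coordinates in an equirectangular frame.

    Assumed convention: yaw in [-180, 180) with 0 at the image centre,
    pitch in [-90, 90] with +90 (straight up) at the top row.
    """
    u = (yaw_deg / 360.0 + 0.5) * width
    v = (0.5 - pitch_deg / 180.0) * height
    return u, v


# Looking straight ahead lands in the middle of a 3840x1920 frame.
centre = sphere_to_equirect(0.0, 0.0, 3840, 1920)
# The far left at the north pole lands in the top-left corner.
top_left = sphere_to_equirect(-180.0, 90.0, 3840, 1920)
print(centre, top_left)  # (1920.0, 960.0) (0.0, 0.0)
```

Because every row has the same number of pixels regardless of latitude, the regions near the poles are heavily oversampled, which is one reason why other projection formats are being discussed.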

Research aspects related to OMAF can be summarized as follows:

  • HEVC supports tiles which allow for efficient streaming of omnidirectional video but HEVC is not as widely deployed as AVC. Thus, it would be interesting to investigate how to mimic such a tile-based streaming approach utilizing AVC.
  • The question of how to efficiently encode and package HEVC tile-based video is an open issue and calls for a tradeoff between tile flexibility and coding efficiency.
  • When combined with MPEG-DASH (or similar), there’s a need to update the adaptation logic, as with tiles yet another dimension is added that needs to be considered in order to provide a good Quality of Experience (QoE).
  • QoE is a big issue here and not well covered in the literature. Various aspects are worth investigating, including a comprehensive dataset to enable reproducibility of research results in this domain. Finally, as omnidirectional video allows for interactivity, the user experience is also becoming an issue which needs to be covered within the research community.
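To illustrate what "yet another dimension" for the adaptation logic could mean, here is a deliberately simple Python sketch of a tile-aware rate allocation: every tile starts at the lowest bitrate and only the tiles in the current viewport are upgraded, round-robin, while a throughput budget allows. The function name allocate_tile_bitrates, the bitrate ladder, and the greedy policy are all assumptions for illustration; this is not an algorithm taken from any MPEG specification or from a particular player.

```python
def allocate_tile_bitrates(num_tiles, viewport_tiles, ladder_kbps, budget_kbps):
    """Greedy sketch of viewport-aware tile rate allocation.

    All tiles start at the lowest rung of the ladder; viewport tiles are
    upgraded one rung at a time while the total stays within the budget.
    If even the lowest rung exceeds the budget, the base allocation is
    returned unchanged.
    """
    ladder = sorted(ladder_kbps)
    level = [0] * num_tiles
    spent = ladder[0] * num_tiles
    upgraded = True
    while upgraded:
        upgraded = False
        for tile in sorted(viewport_tiles):
            if level[tile] + 1 < len(ladder):
                cost = ladder[level[tile] + 1] - ladder[level[tile]]
                if spent + cost <= budget_kbps:
                    level[tile] += 1
                    spent += cost
                    upgraded = True
    return [ladder[rung] for rung in level], spent


rates, total = allocate_tile_bitrates(
    num_tiles=8, viewport_tiles={2, 3}, ladder_kbps=[200, 500, 1000], budget_kbps=3000
)
print(rates, total)  # [200, 200, 1000, 500, 200, 200, 200, 200] 2700
```

A real adaptation logic would additionally have to deal with viewport prediction, buffer levels, and the visual impact of quality differences between neighbouring tiles, which is exactly where the open QoE questions come in.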

A second topic I’d like to highlight in this blog post is related to the preliminary call for evidence on video compression with capability beyond HEVC. 

Preliminary Call for Evidence on video compression with capability beyond HEVC

A call for evidence is issued to see whether sufficient technological potential exists to start a more rigorous phase of standardization. Currently, MPEG together with VCEG have developed a Joint Exploration Model (JEM) algorithm that is already known to provide bit rate reductions in the range of 20-30% for relevant test cases, as well as subjective quality benefits. The goal of this new standard (with a preliminary target date for completion around late 2020) is to develop technology providing better compression capability than the existing standard, not only for conventional video material but also for other domains such as HDR/WCG or VR/360-degree video. An important aspect in this area is certainly over-the-top video delivery (like with MPEG-DASH) which includes features such as scalability and Quality of Experience (QoE). Scalable video coding has been added to video coding standards since MPEG-2 but never reached widespread adoption. That might change in case it becomes a prime-time feature of a new video codec, as scalable video coding clearly shows benefits when doing dynamic adaptive streaming over HTTP. QoE has already found its way into video coding, at least when it comes to evaluating the results, where subjective tests are now an integral part of every new video codec developed by MPEG (in addition to the usual PSNR measurements). Therefore, the most interesting research topic from a multimedia communication point of view would be to optimize the DASH-like delivery of such new codecs with respect to scalability and QoE. Note that if you don’t like scalable video coding, feel free to propose something else as long as it reduces storage and networking costs significantly.
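As a reminder of what the "usual PSNR measurements" compute, here is a small Python sketch of the textbook definition, PSNR = 10 · log10(MAX² / MSE), applied to two short lists of 8-bit sample values. The sample values are made up for illustration, and real evaluations operate on full frames (typically per colour component) rather than on four samples.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equally long sample lists."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(max_value ** 2 / mse)


ref = [52, 55, 61, 66]  # made-up 8-bit samples of the original
dec = [54, 55, 60, 65]  # made-up samples after encoding and decoding
quality_db = psnr(ref, dec)
print(round(quality_db, 2))
```

With a mean squared error of 1.5, this toy example yields roughly 46.4 dB. The well-known weakness of PSNR is that it does not necessarily correlate with perceived quality, which is why subjective tests have become an integral part of codec evaluation.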

 

MPEG Workshop “Global Media Technology Standards for an Immersive Age”

On January 18, 2017 MPEG successfully held a public workshop on “Global Media Technology Standards for an Immersive Age” hosting a series of keynotes from Bitmovin, DVB, Orange, Sky Italia, and Technicolor. Stefan Lederer, CEO of Bitmovin discussed today’s and future challenges with new forms of content like 360°, AR and VR. All slides are available here and MPEG took their feedback into consideration in an update of its 5-year standardization roadmap. David Wood (EBU) reported on the DVB VR study mission and Ralf Schaefer (Technicolor) presented a snapshot on VR services. Gilles Teniou (Orange) discussed video formats for VR pointing out a new opportunity to increase the content value but also raising a question what is missing today. Finally, Massimo Bertolotti (Sky Italia) introduced his view on the immersive media experience age.

Overall, the workshop was well attended and, as mentioned above, MPEG is currently working on a new standards project related to immersive media. Currently, this project comprises five parts. The first part comprises a technical report describing the scope (incl. kind of system architecture), use cases, and applications. The second part is OMAF (see above) and the third/fourth parts are related to immersive video and audio respectively. Part five is about point cloud compression.

For those interested, please check out the slides from industry representatives in this field and draw your own conclusions about what could be interesting for your own research. I’m happy to see any reactions, hints, etc. in the comments.

Finally, let’s have a look at what happened related to MPEG-DASH, a topic with a long history on this blog.

MPEG-DASH and CMAF: Friend or Foe?

For MPEG-DASH and CMAF it was a meeting “in between” official standardization stages. MPEG-DASH experts are still working on the third edition which will be a consolidated version of the 2nd edition and various amendments and corrigenda. In the meantime, MPEG issued a white paper on the new features of MPEG-DASH which I would like to highlight here.

  • Spatial Relationship Description (SRD): allows describing tiles and regions of interest for partial delivery of media presentations. This is highly related to OMAF and VR/360-degree video streaming.
  • External MPD linking: allows describing the relationship between a single program/channel and a preview mosaic channel having all channels at once within the MPD.
  • Period continuity: a simple signaling mechanism to indicate whether one period is a continuation of the previous one, which is relevant for ad insertion or live programs.
  • MPD chaining: allows for chaining two or more MPDs to each other, e.g., a pre-roll ad when joining a live program.
  • Flexible segment format for broadcast TV: separates the signaling of the switching points and random access points in each stream; thus, the content can be encoded with good compression efficiency while allowing a higher number of random access points but a lower frequency of switching points.
  • Server and network-assisted DASH (SAND): enables asynchronous network-to-client and network-to-network communication of quality-related assisting information.
  • DASH with server push and WebSockets: basically addresses issues related to the HTTP/2 push feature and WebSocket.
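To give an idea of how a client might use SRD for partial delivery, here is a Python sketch that parses SRD-style value strings and selects the tiles overlapping a region of interest. The field order (source_id, object_x, object_y, object_width, object_height, followed by optional total width/height) reflects my understanding of the descriptor and should be checked against the specification; the tile names, the 2x2 grid, and the helper functions are assumptions for illustration.

```python
from typing import NamedTuple


class Srd(NamedTuple):
    source_id: int
    x: int
    y: int
    width: int
    height: int


def parse_srd(value: str) -> Srd:
    """Parse the first five comma-separated integers of an SRD-style value."""
    fields = [int(field) for field in value.split(",")]
    if len(fields) < 5:
        raise ValueError("expected at least five comma-separated integers")
    return Srd(*fields[:5])


def overlaps(tile: Srd, x: int, y: int, width: int, height: int) -> bool:
    """True if the tile rectangle intersects the given region of interest."""
    return (tile.x < x + width and x < tile.x + tile.width
            and tile.y < y + height and y < tile.y + tile.height)


# Hypothetical 2x2 tiling of a 1920x1080 source.
tiles = {
    "tile_0": parse_srd("0,0,0,960,540,1920,1080"),
    "tile_1": parse_srd("0,960,0,960,540,1920,1080"),
    "tile_2": parse_srd("0,0,540,960,540,1920,1080"),
    "tile_3": parse_srd("0,960,540,960,540,1920,1080"),
}
roi = (800, 100, 400, 300)  # x, y, width, height of the region of interest
selected = [name for name, tile in tiles.items() if overlaps(tile, *roi)]
print(selected)  # ['tile_0', 'tile_1']
```

Only the two upper tiles need to be requested (or requested in high quality) for this region of interest, which is the essence of tile-based streaming for 360-degree video.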

CMAF issued a study document which captures the current progress and all national bodies are encouraged to take this into account when commenting on the Committee Draft (CD). To answer the question in the headline above, it looks more and more like DASH and CMAF will become friends; let’s hope that the friendship lasts for a long time.

What else happened at the MPEG meeting?

  • Committee Draft MORE (note: type in ‘man more’ on any unix/linux/mac terminal and you’ll get ‘less – opposite of more’;): MORE stands for “Media Orchestration” and provides a specification that enables the automated combination of multiple media sources (cameras, microphones) into a coherent multimedia experience. Additionally, it targets use cases where a multimedia experience is rendered on multiple devices simultaneously, again giving a consistent and coherent experience.
  • Technical Report on HDR/WCG Video Coding: This technical report comprises conversion and coding practices for High Dynamic Range (HDR) and Wide Colour Gamut (WCG) video coding (ISO/IEC 23008-14). The purpose of this document is to provide a set of publicly referenceable recommended guidelines for the operation of AVC or HEVC systems adapted for compressing HDR/WCG video for consumer distribution applications.
  • CfP Point Cloud Compression (PCC): This call solicits technologies for the coding of 3D point clouds with associated attributes such as color and material properties. It will be part of the immersive media project introduced above.
  • MPEG-H 3D Audio verification test report: This report presents results of four subjective listening tests that assessed the performance of the Low Complexity Profile of MPEG-H 3D Audio. The tests covered a range of bit rates and a range of “immersive audio” use cases (i.e., from 22.2 down to 2.0 channel presentations). Seven test sites participated in the tests with a total of 288 listeners.

The next MPEG meeting will be held in Hobart, April 3-7, 2017. Feel free to contact us for any questions or comments.

MPEG Column: 115th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The 115th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects:

 

  • MPEG issues Genomic Information Compression and Storage joint Call for Proposals in conjunction with ISO/TC 276/WG 5
  • Plug-in free decoding of 3D objects within Web browsers
  • MPEG-H 3D Audio AMD 3 reaches FDAM status
  • Common Media Application Format for Dynamic Adaptive Streaming Applications
  • 4th edition of AVC/HEVC file format

In this blog post, however, I will cover topics specifically relevant for adaptive media streaming, namely:

  • Recent developments in MPEG-DASH
  • Common media application format (CMAF)
  • MPEG-VR (virtual reality)
  • The MPEG roadmap/vision for the future.

MPEG-DASH Server and Network assisted DASH (SAND): ISO/IEC 23009-5

Part 5 of MPEG-DASH, referred to as SAND – server and network-assisted DASH – has reached FDIS. This work item started some time ago at a public MPEG workshop during the 105th MPEG meeting in Vienna. The goal of this part of MPEG-DASH is to enhance the delivery of DASH content by introducing messages between DASH clients and network elements, or between various network elements. These messages are meant to improve the efficiency of streaming sessions by providing information about real-time operational characteristics of networks, servers, proxies, caches, and CDNs as well as the DASH client’s performance and status. In particular, it defines the following:

  1. The SAND architecture which identifies the SAND network elements and the nature of SAND messages exchanged among them.
  2. The semantics of SAND messages exchanged between the network elements present in the SAND architecture.
  3. An encoding scheme for the SAND messages.
  4. The minimum to implement a SAND message delivery protocol.

The way that this information is to be utilized is deliberately not defined within the standard and left open for (industry) competition (or other standards developing organizations). In any case, there’s plenty of room for research activities around the topic of SAND, specifically:

  • A main issue is the evaluation of MPEG-DASH SAND in terms of qualitative and quantitative improvements with respect to QoS/QoE. Some papers are available already and have been published within ACM MMSys 2016.
  • Another topic of interest includes an analysis regarding scalability and possible overhead; in other words, I’m wondering whether it’s worth using SAND to improve DASH.

MPEG-DASH with Server Push and WebSockets: ISO/IEC 23009-6

Part 6 of MPEG-DASH reached DIS stage and deals with server push and WebSockets, i.e., it specifies the carriage of MPEG-DASH media presentations over full duplex HTTP-compatible protocols, particularly HTTP/2 and WebSocket. The specification comes with a set of generic definitions for which bindings are defined, allowing its usage in various formats. Currently, the specification supports HTTP/2 and WebSocket.

For the former it is required to define the push policy as an HTTP header extension whereas the latter requires the definition of a DASH subprotocol. Luckily, these are the preferred extension mechanisms for both HTTP/2 and WebSocket and, thus, interoperability is provided. The question of whether or not the industry will adopt these extensions cannot be answered right now but I would recommend keeping an eye on this and there are certainly multiple research topics worth exploring in the future.

An interesting aspect for the research community would be to quantify the utility of using push methods within dynamic adaptive environments in terms of QoE and start-up delay. Some papers provide preliminary answers but a comprehensive evaluation is missing.
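As a starting point for such a quantification, here is a back-of-the-envelope Python model of start-up delay with and without push. It assumes that a pulling client requests the initial segments strictly one after another (one round trip each), that a pushing server delivers them back-to-back after a single request, and that throughput is constant; it ignores TCP slow start, MPD and initialization segment fetches, and request pipelining. The function names and all numbers are assumptions for illustration, not measurements.

```python
def startup_delay_pull(num_segments, rtt_s, segment_bits, bandwidth_bps):
    """Sequential requests: every segment costs one round trip plus transfer."""
    return num_segments * (rtt_s + segment_bits / bandwidth_bps)


def startup_delay_push(num_segments, rtt_s, segment_bits, bandwidth_bps):
    """One request, after which the server pushes all segments back-to-back."""
    return rtt_s + num_segments * segment_bits / bandwidth_bps


# Hypothetical scenario: buffer 3 segments of 1 Mbit each before playback
# starts, over a 5 Mbit/s link with a 100 ms round-trip time.
pull = startup_delay_pull(3, 0.1, 1_000_000, 5_000_000)
push = startup_delay_push(3, 0.1, 1_000_000, 5_000_000)
print(f"pull: {pull:.2f} s, push: {push:.2f} s, saved: {pull - push:.2f} s")
```

In this toy model push saves (n - 1) round trips, i.e., about 0.2 s out of roughly 0.9 s here, so the benefit grows with the round-trip time and with the number of (short) segments fetched before playback. Whether this holds under realistic network conditions is exactly the kind of question a comprehensive evaluation would need to answer.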

To conclude the recent MPEG-DASH developments, the DASH-IF recently established the Excellence in DASH Award at ACM MMSys’16 and the winners are presented here (including some of the recent developments described in this blog post).

Common Media Application Format (CMAF): ISO/IEC 23000-19

The goal of CMAF is to enable application consortia to reference a single MPEG specification (i.e., a “common media format”) that would allow a single media encoding to be used across many applications and devices. Therefore, CMAF defines the encoding and packaging of segmented media objects for delivery and decoding on end user devices in adaptive multimedia presentations. This sounds very familiar and reminds us a bit of what the DASH-IF is doing with their interoperability points. One of the goals of CMAF is to integrate HLS in MPEG-DASH, which is backed up by this WWDC video where Apple announces the support of fragmented MP4 in HLS. The streaming of this announcement is only available in Safari and through the WWDC app but Bitmovin has shown that it also works on Mac iOS 10 and above, and for PC users all recent browser versions including Edge, FireFox, Chrome, and (of course) Safari.

MPEG Virtual Reality

Virtual reality is becoming a hot topic across the industry (and also academia), which also reaches standards developing organizations like MPEG. Therefore, MPEG established an ad-hoc group (with an email reflector) to develop a roadmap required for MPEG-VR. Others have also started working on this, like DVB, DASH-IF, and QUALINET (and maybe many others: W3C, 3GPP). In any case, it shows that there’s a massive interest in this topic and Bitmovin has shown already what can be done in this area within today’s Web environments. Obviously, adaptive streaming is an important aspect for VR applications, including many research questions to be addressed in the (near) future. A first step towards a concrete solution is the Omnidirectional Media Application Format (OMAF) which is currently at working draft stage (details to be provided in a future blog post).

The research aspects cover a wide range of activities including – but not limited to – content capturing, content representation, streaming/network optimization, consumption, and QoE.

MPEG roadmap/vision

At its 115th meeting, MPEG published a document that lays out its medium-term strategic standardization roadmap. The goal of this document is to collect feedback from anyone in professional and B2B industries dealing with media, specifically but not limited to broadcasting, content and service provision, media equipment manufacturing, and the telecommunication industry. The roadmap is depicted below and further described in the document available here. Please note that “360 AV” in the figure below also refers to VR but unfortunately it’s not (yet) reflected in the figure. However, it points out the aspects to be addressed by MPEG in the future which would be relevant for both industry and academia.

MPEG-Roadmap

The next MPEG meeting will be held in Chengdu, October 17-21, 2016.

MPEG Column: 112th MPEG Meeting

This blog post is also available at the bitmovin tech blog and blog.timmerer.com.

The 112th MPEG meeting in Warsaw, Poland was a special meeting for me. It was my 50th MPEG meeting which roughly accumulates to one year of MPEG meetings (i.e., one year of my life I’ve spent in MPEG meetings incl. traveling – scary, isn’t it? … more on this in another blog post). But what happened at this 112th MPEG meeting (my 50th meeting)…

  • Requirements: CDVA, Future of Video Coding Standardization (no acronym yet), Genome compression
  • Systems: M2TS (ISO/IEC 13818-1:2015), DASH 3rd edition, Media Orchestration (no acronym yet), TRUFFLE
  • Video/JCT-VC/JCT-3D: MPEG-4 AVC, Future Video Coding, HDR, SCC
  • Audio: 3D audio
  • 3DG: PCC, MIoT, Wearable

MPEG Friday Plenary. Photo (c) Christian Timmerer.

As usual, the official press release and other publicly available documents can be found here. Let’s dig into the different subgroups:

Requirements

In requirements experts were working on the Call for Proposals (CfP) for Compact Descriptors for Video Analysis (CDVA) including an evaluation framework. The evaluation framework includes 800-1000 objects (large objects like building facades, landmarks, etc.; small(er) objects like paintings, books, statues, etc.; scenes like interior scenes, natural scenes, multi-camera shots) and the evaluation of the responses should be conducted for the 114th meeting in San Diego.

The future of video coding standardization is currently happening in MPEG and shaping the way for the successor of the HEVC standard. The current goal is providing (native) support for scalability (more than two spatial resolutions) and 30% compression gain for some applications (requiring a limited increase in decoder complexity), but what is actually preferred is a 50% compression gain (at a significant increase of the encoder complexity). MPEG will hold a workshop at the next meeting in Geneva discussing specific compression techniques, objective (HDR) video quality metrics, and compression technologies for specific applications (e.g., multiple-stream representations, energy-saving encoders/decoders, games, drones). The current goal is having the International Standard for this new video coding standard around 2020.

MPEG has recently started a new project referred to as Genome Compression which is, of course, about the compression of genome information. A big dataset has been collected and experts are working on the Call for Evidence (CfE). The plan is to hold a workshop at the next MPEG meeting in Geneva regarding the prospects of Genome Compression and Storage Standardization, targeting users, manufacturers, service providers, technologists, etc.

Summer in Warsaw. Photo (c) Christian Timmerer.

Systems

The 5th edition of the MPEG-2 Systems standard has been published as ISO/IEC 13818-1:2015 on the 1st of July 2015 and is a consolidation of the 4th edition + Amendments 1-5.

In terms of MPEG-DASH, the draft text of ISO/IEC 23009-1 3rd edition comprising 2nd edition + COR 1 + AMD 1 + AMD 2 + AMD 3 + COR 2 is available for committee internal review. The expected publication date is scheduled for, most likely, 2016. Currently, MPEG-DASH includes a lot of activity in the following areas: spatial relationship description, generalized URL parameters, authentication, access control, multiple MPDs, full duplex protocols (aka HTTP/2 etc.), advanced and generalized HTTP feedback information, and various core experiments:

  • SAND (Server and Network Assisted DASH)
  • FDH (Full Duplex DASH)
  • SAP-Independent Segment Signaling (SISSI)
  • URI Signing for DASH
  • Content Aggregation and Playback COntrol (CAPCO)

In particular, the core experiment process is very open as most work is conducted during the Ad hoc Group (AhG) period which is discussed on the publicly available MPEG-DASH reflector.

MPEG systems recently started an activity that is related to media orchestration which applies to capture as well as consumption and concerns scenarios with multiple sensors as well as multiple rendering devices, including one-to-many and many-to-one scenarios resulting in a worthwhile, customized experience.

Finally, the systems subgroup started an exploration activity regarding real-time streaming of files (a.k.a. TRUFFLE) which should perform a gap analysis leading to extensions of the MPEG Media Transport (MMT) standard. However, some experts within MPEG concluded that most/all use cases identified within this activity could actually be solved with existing technology such as DASH. Thus, this activity may still need some discussions…

Video/JCT-VC/JCT-3D

The MPEG video subgroup is working towards a new amendment for the MPEG-4 AVC standard covering resolutions up to 8K and higher frame rates for lower resolutions. Interestingly, although MPEG is ahead of industry most of the time, 8K and high frame rate are already supported in browser environments (e.g., using bitdash 8K, HFR) and modern encoding platforms like bitcodin. However, it’s good that we finally have means for an interoperable signaling of this profile.

In terms of future video coding standardization, the video subgroup released a call for test material. Two sets of test sequences are already available and will be investigated regarding compression until next meeting.

After a successful call for evidence for High Dynamic Range (HDR), the technical work starts in the video subgroup with the goal to develop an architecture (“H2M”) as well as three core experiments (optimization without HEVC specification change, alternative reconstruction approaches, objective metrics).

The main topic of the JCT-VC was screen content coding (SCC), which came up with new coding tools that better compress content that is (fully or partially) computer generated. This leads to a significant improvement in compression: a rate reduction of approximately 50% or more for specific screen content.
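To make the "rate reduction" figure concrete, here is a minimal Python sketch of how such a percentage is computed at equal quality. The function name and the example bitrates are hypothetical and chosen purely for illustration; they are not measurements from the JCT-VC experiments.

```python
def rate_reduction_percent(anchor_kbps: float, test_kbps: float) -> float:
    """Bitrate saving of a test codec relative to an anchor codec,
    assuming both deliver the same quality."""
    if anchor_kbps <= 0:
        raise ValueError("anchor bitrate must be positive")
    return 100.0 * (anchor_kbps - test_kbps) / anchor_kbps


# Hypothetical numbers: a screen recording that needs 4000 kb/s with the
# anchor and 2000 kb/s with the new tools at the same quality.
print(rate_reduction_percent(4000, 2000))  # 50.0
```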

Audio

The audio subgroup is mainly concentrating on 3D audio where they identified the need for intermediate bitrates between 3D audio phase 1 and 2. Currently, phase 1 identified 256, 512, 1200 kb/s whereas phase 2 focuses on 128, 96, 64, 48 kb/s. The broadcasting industry needs intermediate bitrates and, thus, phase 2 is extended to bitrates between 128 and 256 kb/s.
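The gap between the two phases can be read directly off the bitrates quoted above. A small Python sketch, where the variable and function names are illustrative only and not taken from any MPEG document:

```python
# Operating points quoted in the text, in kb/s.
PHASE1_KBPS = [256, 512, 1200]
PHASE2_KBPS = [128, 96, 64, 48]


def bitrate_gap(phase1, phase2):
    """Return (low, high): the highest phase 2 bitrate and the lowest
    phase 1 bitrate, i.e. the range neither phase covered originally."""
    return max(phase2), min(phase1)


low, high = bitrate_gap(PHASE1_KBPS, PHASE2_KBPS)
print(f"Uncovered range: {low}-{high} kb/s")  # Uncovered range: 128-256 kb/s
```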

3DG

MPEG 3DG is working on point cloud compression (PCC) for which open source software has been identified. Additionally, there are new activities in the area of Media Internet of Things (MIoT) and wearable computing (like glasses and watches) that could lead to new standards developed within MPEG. Therefore, stay tuned on these topics as they may shape your future.

The week after the MPEG meeting I met the MPEG convenor and the JPEG convenor again during ICME2015 in Torino but that’s another story…

L. Chiariglione, H. Hellwagner, T. Ebrahimi, C. Timmerer (from left to right) during ICME2015. Photo (c) T. Ebrahimi.

MPEG Column: 111th MPEG Meeting

Original posts here by the Multimedia Communication blog, Christian Timmerer, AAU/bitmovin

The 111th MPEG meeting (note: link includes press release and all publicly available output documents) was held in Geneva, Switzerland, and brought up some interesting aspects which I’d like to highlight here. Undoubtedly, it was the shortest meeting I’ve ever attended (and my first meeting was #61), as the final plenary concluded at 2015/02/20T18:18!

MPEG111 opening plenary

In terms of the requirements (subgroup), it’s worth mentioning the call for evidence (CfE) for high-dynamic range (HDR) and wide color gamut (WCG) video coding, which comprises a first milestone towards a new video coding format. The purpose of this CfE is to explore whether or not (a) the coding efficiency and/or (b) the functionality of the HEVC Main 10 and Scalable Main 10 profiles can be significantly improved for HDR and WCG content. In addition, the requirements subgroup issued a draft call for evidence on free viewpoint TV. Both documents are publicly available here.

The video subgroup continued discussions related to the future of video coding standardisation and issued a public document requesting contributions on “future video compression technology”. Interesting application requirements come from over-the-top (OTT) streaming use cases which request HDR and WCG as well as video over cellular networks. Well, at least the former is something to be covered by the CfE mentioned above. Furthermore, features like scalability and perceptual quality are something that should be considered from the ground up and not (only) as an extension. Yes, scalability is something that really helps a lot in OTT streaming: it eases content management, enables cache-efficient delivery, and allows for more aggressive buffer modelling and, thus, adaptation logic within the client, enabling better Quality of Experience (QoE) for the end user. It seems like complexity (at the encoder) is not so much of a concern as long as it scales with cloud deployments such as http://www.bitcodin.com/ (e.g., the bitdash demo area shows some neat 4K/8K/HFR DASH demos which have been encoded with bitcodin). Closely related to 8K, there’s a new AVC amendment coming up covering 8K. One can do it already today (see before), but it’s good to have standards support for this. For HEVC, the JCT-3D/VC issued the FDAM4 for 3D Video Extensions and started with PDAM5 for Screen Content Coding Extensions (both documents being publicly available after an editing period of about a month).

And what about audio? The audio subgroup has decided that ISO/IEC DIS 23008-3 3D Audio shall be promoted directly to IS, which means that the DIS was already in such a good state that only editorial comments are applied, which actually saves a balloting cycle. We have to congratulate the audio subgroup on this remarkable milestone.

Finally, I’d like to discuss a few topics related to DASH which is progressing towards its 3rd edition which will incorporate amendment 2 (Spatial Relationship Description, Generalized URL parameters and other extensions), amendment 3 (Authentication, Access Control and multiple MPDs), and everything else that will be incorporated within this year, like some aspects documented in the technologies under consideration or currently being discussed within the core experiments (CE). Currently, MPEG-DASH conducts 5 core experiments:

  • Server and Network Assisted DASH (SAND)
  • DASH over Full Duplex HTTP-based Protocols (FDH)
  • URI Signing for DASH (CE-USD)
  • SAP-Independent Segment SIgnaling (SISSI)
  • Content aggregation and playback control (CAPCO)

The description of core experiments is publicly available and, compared to the previous meeting, we have a new CE which is about content aggregation and playback control (CAPCO) which “explores solutions for aggregation of DASH content from multiple live and on-demand origin servers, addressing applications such as creating customized on-demand and live programs/channels from multiple origin servers per client, targeted preroll ad insertion in live programs and also limiting playback by client such as no-skip or no fast forward.” This process is quite open and anybody can join by subscribing to the email reflector.

The CE for DASH over Full Duplex HTTP-based Protocols (FDH) is gaining importance and basically defines the usage of DASH with the push features of WebSockets and HTTP/2. At this meeting, MPEG issued a working draft, and the CE on Server and Network Assisted DASH (SAND) got its own Part 5, where it goes to CD, but the documents are not publicly available. However, I’m pretty sure I can report more on this next time, so stay tuned or feel free to comment here.

MPEG Column: 110th MPEG Meeting

Original posts here by the Multimedia Communication blog, Christian Timmerer, AAU/bitmovin

The 110th MPEG meeting was held at the Strasbourg Convention and Conference Centre featuring the following highlights:

  • The future of video coding standardization
  • Workshop on media synchronization
  • Standards at FDIS: Green Metadata and CDVS
  • What’s happening in MPEG-DASH?

Additional details about MPEG’s 110th meeting can also be found here, including the official press release and all publicly available documents.

The Future of Video Coding Standardization

MPEG110 hosted a panel discussion about the future of video coding standardization. The panel was organized jointly by MPEG and ITU-T SG 16’s VCEG featuring Roger Bolton (Ericsson), Harald Alvestrand (Google), Zhong Luo (Huawei), Anne Aaron (Netflix), Stéphane Pateux (Orange), Paul Torres (Qualcomm), and JeongHoon Park (Samsung).

As expected, “maximizing compression efficiency remains a fundamental need” and as usual, MPEG will study “future application requirements, and the availability of technology developments to fulfill these requirements”. Therefore, two Ad-hoc Groups (AhGs) have been established which are open to the public:

The presentations of the brainstorming session on the future of video coding standardization can be found here.

Workshop on Media Synchronization

MPEG110 also hosted a workshop on media synchronization for hybrid delivery (broadband-broadcast) featuring six presentations “to better understand the current state-of-the-art for media synchronization and identify further needs of the industry”.

  • An overview of MPEG systems technologies providing advanced media synchronization, Youngkwon Lim, Samsung
  • Hybrid Broadcast – Overview of DVB TM-Companion Screens and Streams specification, Oskar van Deventer, TNO
  • Hybrid Broadcast-Broadband distribution for new video services: a use cases perspective, Raoul Monnier, Thomson Video Networks
  • HEVC and Layered HEVC for UHD deployments, Ye Kui Wang, Qualcomm
  • A fingerprinting-based audio synchronization technology, Masayuki Nishiguchi, Sony Corporation
  • Media Orchestration from Capture to Consumption, Rob Koenen, TNO

The presentation material is available here. Additionally, MPEG established an AhG on timeline alignment (that’s what the project is called internally) to study use cases and solicit contributions on gap analysis as well as technical contributions [email][subscription].

Standards at FDIS: Green Metadata and CDVS

My first report on MPEG Compact Descriptors for Visual Search (CDVS) dates back to July 2011 which provides details about the call for proposals. Now, finally, the FDIS has been approved during the 110th MPEG meeting. CDVS defines a compact image description that facilitates the comparison and search of pictures that include similar content, e.g. when showing the same objects in different scenes from different viewpoints. The compression of key point descriptors not only increases compactness, but also significantly speeds up, when compared to a raw representation of the same underlying features, the search and classification of images within large image databases. Application of CDVS for real-time object identification, e.g. in computer vision and other applications, is envisaged as well.

Another standard, entitled Green Metadata (first reported in August 2012), also reached FDIS status. This standard specifies the format of metadata that can be used to reduce energy consumption from the encoding, decoding, and presentation of media content, while simultaneously controlling or avoiding degradation in the Quality of Experience (QoE). Moreover, the metadata specified in this standard can facilitate a trade-off between energy consumption and QoE. MPEG is also working on amendments to the ubiquitous MPEG-2 TS ISO/IEC 13818-1 and ISOBMFF ISO/IEC 14496-12 so that green metadata can be delivered by these formats.

What’s happening in MPEG-DASH?

MPEG-DASH is in a kind of maintenance mode but is still receiving new proposals in the area of SAND parameters, and some core experiments are ongoing. Also, the DASH-IF is working towards new interoperability points and test vectors in preparation for actual deployments. Speaking of deployments, they are happening, e.g., a 40h live stream right before Christmas (by bitmovin, a top-100 company that matters most in online video). Additionally, VideoNext was co-located with CoNEXT’14, targeting scientific presentations about the design, quality and deployment of adaptive video streaming. Webex recordings of the talks are available here. In terms of standardization, MPEG-DASH is progressing towards the 2nd amendment including spatial relationship description (SRD), generalized URL parameters and other extensions. In particular, SRD will enable new use cases which can only be addressed using MPEG-DASH, and the FDIS is scheduled for the next meeting, which will be in Geneva, Feb 16-20, 2015. I’ll report on this within my next blog post, so stay tuned.