About Christian Timmerer

Christian Timmerer is a researcher, entrepreneur, and teacher on immersive multimedia communication, streaming, adaptation, and Quality of Experience. He is an Assistant Professor at Alpen-Adria-Universität Klagenfurt, Austria. Follow him on Twitter at http://twitter.com/timse7 and subscribe to his blog at http://blog.timmerer.com.

MPEG Column: 118th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The entire MPEG press release can be found here; it comprises the following topics:

  • Coded Representation of Immersive Media (MPEG-I): new work item approved and call for test data issued
  • Common Media Application Format (CMAF): FDIS approved
  • Beyond High Efficiency Video Coding (HEVC): call for evidence for “beyond HEVC” and verification tests for screen content coding extensions of HEVC

Coded Representation of Immersive Media (MPEG-I)

MPEG started work on a new work item, ISO/IEC 23090, nicknamed MPEG-I, targeting future immersive applications. The goal of this new standard is to enable various forms of audio-visual immersion, including panoramic video with 2D and 3D audio and various degrees of true 3D visual perception. It currently comprises five parts: (pt. 1) a technical report describing the scope of this new standard and a set of use cases and applications; (pt. 2) an application format for omnidirectional media (aka OMAF) to address the urgent need of the industry for a standard in this area; (pt. 3) immersive video, which is a kind of placeholder for the successor of HEVC (if at all); (pt. 4) immersive audio as a placeholder for the successor of 3D audio (if at all); and (pt. 5) point cloud compression. The point cloud compression standard targets lossy compression for point clouds in real-time communication, six Degrees of Freedom (6 DoF) virtual reality, dynamic mapping for autonomous driving, cultural heritage applications, etc. Part 2 is related to OMAF which I’ve discussed in my previous blog post.

MPEG also established an Ad-hoc Group (AhG) on immersive media quality evaluation with the following mandates: 1. Produce a document on VR QoE requirements; 2. Collect test material with immersive video and audio signals; 3. Study existing methods to assess human perception and reaction to VR stimuli; 4. Develop test methodology for immersive media, including simultaneous video and audio; 5. Study VR experience metrics and their measurability in VR services and devices. AhGs are open to everybody and discussions mostly take place on mailing lists (join here https://lists.aau.at/mailman/listinfo/immersive-quality). Interestingly, a Joint Qualinet-VQEG team on Immersive Media (JQVIM) has recently been established with similar goals, and the VR Industry Forum (VRIF) has also issued a call for VR360 content. It seems there’s a strong need for a dataset similar to the one we created for MPEG-DASH a long time ago.

The JQVIM has been created as part of the QUALINET task force on “Immersive Media Experiences (IMEx)”, which aims at providing end users with the sensation of being part of the particular media, which shall result in a worthwhile, informative user and quality of experience. The main goals are providing datasets and tools (hardware/software), subjective quality evaluations, field studies, and cross-validation, including a strong theoretical foundation alongside the empirical databases and tools, which hopefully results in a framework, methodology, and best practices for immersive media experiences.

Common Media Application Format (CMAF)

The Final Draft International Standard (FDIS) has been issued at the 118th MPEG meeting, which concludes the formal technical development process of the standard. At this point in time, national bodies can only vote Yes or No, and only editorial changes are allowed (if any) before the International Standard (IS) becomes available. The goal of CMAF is to define a single format for the transport and storage of segmented media including audio/video formats, subtitles, and encryption; it is derived from the ISO Base Media File Format (ISOBMFF). As it’s a combination of various MPEG standards, it’s referred to as an Application Format (AF), which mainly takes existing formats/standards and glues them together for a specific target application. The CMAF standard clearly targets dynamic adaptive streaming (over, but not limited to, HTTP) but focuses on the media format only and excludes the manifest format. Thus, the CMAF standard shall be compatible with other formats such as MPEG-DASH and HLS. In fact, HLS was extended some time ago to support ‘fragmented MP4’, which we have also demonstrated, and this has been interpreted as a first step towards the harmonization of MPEG-DASH and HLS, at least on the segment format. The delivery of CMAF content with DASH will be described in part 7 of MPEG-DASH, which basically comprises a mapping of CMAF concepts to DASH terms.
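Since CMAF is derived from ISOBMFF, a segment is at heart a sequence of boxes, each starting with a 32-bit big-endian size and a four-character type. The following Python sketch walks the top-level boxes of a byte string to make that structure tangible. It is a minimal illustration, not a CMAF parser: the helper names (iter_boxes, make_box) and the toy segment with dummy payloads are assumptions made up for this example, and no CMAF-specific constraints are checked.

```python
import struct
from typing import Iterator, Tuple


def iter_boxes(data: bytes) -> Iterator[Tuple[str, int, int]]:
    """Yield (type, offset, size) for each top-level ISOBMFF box in data."""
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit "largesize" follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box extends to the end of the data.
            size = len(data) - offset
        if size < header or offset + size > len(data):
            break  # truncated or malformed box: stop rather than guess
        yield box_type.decode("ascii", errors="replace"), offset, size
        offset += size


def make_box(box_type: bytes, payload: bytes = b"") -> bytes:
    """Build a minimal box for demonstration (helper for this sketch only)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Toy "segment" with dummy payloads; this is not decodable media.
segment = make_box(b"styp", b"demo") + make_box(b"moof") + make_box(b"mdat", b"\x00" * 4)
print([box_type for box_type, _, _ in iter_boxes(segment)])  # ['styp', 'moof', 'mdat']
```

Because the media format is kept separate from the manifest, the same box sequence could in principle be referenced from either a DASH MPD or an HLS playlist; the sketch only inspects the media side.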

From a research perspective, it would be interesting to explore how certain CMAF concepts are able to address current industry needs, specifically in the context of low-latency streaming which has been demonstrated recently.

Beyond HEVC…

The preliminary call for evidence (CfE) on video compression with capability beyond HEVC has been issued and is addressed to interested parties that have technology providing better compression capability than the existing standard, either for conventional video material, or for other domains such as HDR/WCG or 360-degree (“VR”) video. Test cases are defined for SDR, HDR, and 360-degree content. This call has been made jointly by ISO/IEC MPEG and ITU-T SG16/Q6 (VCEG). The evaluation of the responses is scheduled for July 2017 and, depending on the outcome of the CfE, the parent bodies of the Joint Video Exploration Team (JVET), a collaboration of MPEG and VCEG, intend to issue a Draft Call for Proposals by the end of the July meeting.

Finally, verification tests have been conducted for the Screen Content Coding (SCC) extensions to HEVC showing exceptional performance. Screen content is video containing a significant proportion of rendered (moving or static) graphics, text, or animation rather than, or in addition to, camera-captured video scenes. For scenes containing a substantial amount of text and graphics, the tests showed a major benefit in compression capability for the new extensions over both the Advanced Video Coding standard and the previous version of the newer HEVC standard without the new SCC features.

The question of whether and how new codecs like (beyond) HEVC compete with AV1 is subject to research and development. It has also been discussed in the scientific literature, but a vendor-neutral comparison is lacking; such a comparison is difficult to achieve without comparing apples with oranges (due to the high number of different coding tools and parameters). An important aspect that always needs to be considered is that one typically compares specific implementations of a coding format and not the standard itself, as the encoding is usually not defined, only the bitstream syntax, which implicitly defines the decoder.

Publicly available documents from the 118th MPEG meeting can be found here (scroll down to the end of the page). The next MPEG meeting will be held in Torino, Italy, July 17-21, 2017. Feel free to contact us for any questions or comments.

Standards Column: JPEG and MPEG

Introduction

The area of work of ISO/IEC JTC 1/SC 29 comprises the standardization of coded representation of audio, picture, multimedia and hypermedia information and sets of compression and control functions for use with such information. SC29 basically hosts two working groups responsible for the development of international standards for the compression, decompression, processing, and coded representation of media content in order to satisfy a wide variety of applications, specifically WG1 targeting “digital still pictures” (also known as JPEG) and WG11 targeting “moving pictures, audio, and their combination” (also known as MPEG). The earlier SC29 standards, namely JPEG, MPEG-1 and MPEG-2, received the technology & engineering Emmy award in 1995-96.

The standards columns within ACM SIGMM Records provide timely updates about the most recent developments within JPEG and MPEG respectively. The JPEG column is edited by Antonio Pinheiro and the MPEG column is edited by Christian Timmerer. The editors and an overview of recent JPEG and MPEG achievements as well as future plans are highlighted in this article.

Antonio Pinheiro received the BSc (Licenciatura) from I.S.T., Lisbon in 1988 and the PhD in Electronic Systems Engineering from University of Essex in 2002. He has been a lecturer at U.B.I. (Universidade da Beira Interior), Covilha, Portugal since 1988 and is a researcher at I.T. (Instituto de Telecomunicações), Portugal. Currently, his research interests are in Image Processing, namely Multimedia Quality Evaluation and Medical Image Analysis. He was a Portuguese representative of the European Union Actions COST IC1003 – QUALINET, COST IC1206 – DE-ID, COST 292 and currently of COST BM1304 – MYO-MRI. He is currently involved in the project EmergIMG funded by the Portuguese Funding agency and H2020, and he is a Portuguese delegate to JPEG, where he is currently the Communication Subgroup chair and involved with the JPEG Pleno project.

Christian Timmerer received his M.Sc. (Dipl.-Ing.) in January 2003 and his Ph.D. (Dr.techn.) in June 2006 (for research on the adaptation of scalable multimedia content in streaming and constrained environments) both from the Alpen-Adria-Universität (AAU) Klagenfurt. He joined the AAU in 1999 (as a system administrator) and is currently an Associate Professor at the Institute of Information Technology (ITEC) within the Multimedia Communication Group. His research interests include immersive multimedia communications, streaming, adaptation, Quality of Experience, and Sensory Experience. He was the general chair of WIAMIS 2008, QoMEX 2013, and MMSys 2016 and has participated in several EC-funded projects, notably DANAE, ENTHRONE, P2P-Next, ALICANTE, SocialSensor, COST IC1003 QUALINET, and ICoSOLE. He also participated in ISO/MPEG work for several years, notably in the area of MPEG-21, MPEG-M, MPEG-V, and MPEG-DASH where he also served as standard editor. In 2012 he cofounded Bitmovin (http://www.bitmovin.com/) to provide professional services around MPEG-DASH where he holds the position of the Chief Innovation Officer (CIO).

Major JPEG and MPEG Achievements

In this section we would like to highlight major JPEG and MPEG achievements without claiming to be exhaustive.

JPEG developed the well-known digital picture coding standard, known as the JPEG image format, almost 25 years ago. Due to the recent increase in social network usage, the number of JPEG encoded images shared online grew to an impressive 1.8 billion per day in 2014. JPEG 2000 is another successful JPEG standard, which received the 2015 Technology and Engineering Emmy award. This standard uses state-of-the-art compression technology, providing higher compression and a wider application domain. It is widely used at a professional level, namely in movie production and medical imaging. JPEG also developed the JBIG2, JPEG-LS, JPSearch and JPEG-XR standards. More recently JPEG launched JPEG-AIC, JPEG Systems and JPEG-XT. JPEG-XT defines backward compatible extensions of JPEG, adding support for HDR, lossless/near lossless, and alpha coding. An overview of the JPEG family of standards is shown in the figure below.

[Figure: overview of the JPEG family of standards]

An overview of existing MPEG standards and achievements is shown in the figure below (taken from here).

[Figure: overview of existing MPEG standards and achievements]

A first major milestone and success was the development of MP3 which revolutionized digital audio content resulting in a sustainable change of the digital media ecosystem. The same holds for MPEG-2 video & systems where the latter, i.e., MPEG-2 Transport Stream, received the technology & engineering Emmy award. The mobile era within MPEG has been introduced with the MPEG-4 standard resulting in the development of AVC (received yet another Emmy award), AAC, and also the MP4 file format which have been deployed widely. Finally, streaming over the open internet is addressed by DASH and new forms of digital television including ultra high-definition & immersive services are targeted by MPEG-H comprising MMT, HEVC, and 3D audio.

Roadmap for Future JPEG and MPEG Standards

In this section we would like to highlight a roadmap for future JPEG and MPEG standards.

A roadmap for future JPEG standards is represented in the figure below. The main efforts are directed towards the JPEG Pleno project, which aims to standardize new immersive technologies like light fields, point clouds or digital holography. Moreover, JPEG is launching JPEG-XS for low-latency and lightweight coding, while JPEG Systems is also developing a new part to add privacy and security protection to its standards. Furthermore, JPEG is continuously seeking new technological developments and is committed to providing new standardized image coding solutions.

[Figure: roadmap for future JPEG standards]

The future roadmap of MPEG standards is shown in the figure below (taken from here).

[Figure: roadmap for future MPEG standards]

MPEG’s roadmap for future standards comprises a variety of tools ranging from traditional audio-video coding to new forms of compression technologies like genome compression and light fields. The systems aspects will cover application domains which require media orchestration and will focus on becoming the enabler for immersive media experiences.

Conclusion

In this article we briefly highlighted achievements and future plans of JPEG and MPEG but the future is not defined and requires participation from both industry and academia. We hope that our JPEG and MPEG columns will stimulate research and development within the multimedia domain and we are open for any kind of feedback. Contact Antonio Pinheiro (pinheiro@ubi.pt) or Christian Timmerer (christian.timmerer@itec.uni-klu.ac.at) for any further questions or comments.

MPEG Column: 117th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The 117th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects:

  • MPEG issues Committee Draft of the Omnidirectional Media Application Format (OMAF)
  • MPEG-H 3D Audio Verification Test Report
  • MPEG Workshop on 5-Year Roadmap Successfully Held in Geneva
  • Call for Proposals (CfP) for Point Cloud Compression (PCC)
  • Preliminary Call for Evidence on video compression with capability beyond HEVC
  • MPEG issues Committee Draft of the Media Orchestration (MORE) Standard
  • Technical Report on HDR/WCG Video Coding

In this article, I’d like to focus on the topics related to multimedia communication starting with OMAF.

Omnidirectional Media Application Format (OMAF)

Real-time entertainment services deployed over the open, unmanaged Internet (streaming audio and video) now account for more than 70% of the evening traffic in North American fixed access networks, and it is assumed that this figure will reach 80% by 2020. More and more such bandwidth-hungry applications and services are pushing onto the market, including immersive media services such as virtual reality and, specifically, 360-degree videos. However, the lack of appropriate standards and, consequently, reduced interoperability is becoming an issue. Thus, MPEG has started a project referred to as Omnidirectional Media Application Format (OMAF). The first milestone of this standard has been reached and the committee draft (CD) has been approved at the 117th MPEG meeting. Such application formats “are essentially superformats that combine selected technology components from MPEG (and other) standards to provide greater application interoperability, which helps satisfy users’ growing need for better-integrated multimedia solutions” [MPEG-A]. In the context of OMAF, the following aspects are defined:

  • Equirectangular projection format (note: others might be added in the future)
  • Metadata for interoperable rendering of 360-degree monoscopic and stereoscopic audio-visual data
  • Storage format: ISO base media file format (ISOBMFF)
  • Codecs: High Efficiency Video Coding (HEVC) and MPEG-H 3D audio
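To illustrate what the equirectangular projection format implies for a player, the following Python sketch maps a viewing direction to a pixel position in an equirectangular frame. It is a sketch under stated assumptions rather than the mapping normatively defined by OMAF: the function name, the angle conventions (yaw 0 and pitch 0 at the frame center, positive pitch pointing up) and the frame size are chosen here for illustration only.

```python
from typing import Tuple


def direction_to_pixel(yaw_deg: float, pitch_deg: float,
                       width: int, height: int) -> Tuple[int, int]:
    """Map a viewing direction to (x, y) in an equirectangular frame.

    Assumed convention (not taken from the OMAF text): yaw spans the full
    frame width over 360 degrees, pitch spans the height over 180 degrees.
    """
    # Wrap yaw into [-180, 180) and clamp pitch to [-90, 90].
    yaw = (yaw_deg + 180.0) % 360.0 - 180.0
    pitch = max(-90.0, min(90.0, pitch_deg))
    # Longitude maps linearly to columns, latitude linearly to rows.
    u = (yaw / 360.0 + 0.5) * width
    v = (0.5 - pitch / 180.0) * height
    # Keep the result inside the frame at the bottom and right borders.
    return min(int(u), width - 1), min(int(v), height - 1)


print(direction_to_pixel(0.0, 0.0, 3840, 1920))   # (1920, 960), the frame center
print(direction_to_pixel(90.0, 0.0, 3840, 1920))  # (2880, 960)
```

The linear mapping is what makes the format easy to produce and consume, at the cost of oversampling the regions near the poles.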

OMAF is the first specification defined as part of a bigger project currently referred to as ISO/IEC 23090, Immersive Media (Coded Representation of Immersive Media). It currently has the acronym MPEG-I; we previously used MPEG-VR, which is now replaced by MPEG-I (and that still might change in the future). It is expected that the standard will become Final Draft International Standard (FDIS) by Q4 of 2017. Interestingly, it does not include AVC and AAC, probably the most obvious candidates for video and audio codecs, which have been massively deployed in the last decade and will probably still be a major dominator (and also denominator) in upcoming years. On the other hand, the equirectangular projection format is currently the only one defined, as it is already broadly used in off-the-shelf hardware/software solutions for the creation of omnidirectional/360-degree videos. Finally, the metadata formats enabling the rendering of 360-degree monoscopic and stereoscopic video are highly appreciated. A solution for MPEG-DASH based on AVC/AAC utilizing equirectangular projection format for both monoscopic and stereoscopic video is shown as part of Bitmovin’s solution for VR and 360-degree video.

Research aspects related to OMAF can be summarized as follows:

  • HEVC supports tiles, which allow for efficient streaming of omnidirectional video, but HEVC is not as widely deployed as AVC. Thus, it would be interesting to investigate how to mimic such a tile-based streaming approach utilizing AVC.
  • The question of how to efficiently encode and package HEVC tile-based video is an open issue and calls for a tradeoff between tile flexibility and coding efficiency.
  • When combined with MPEG-DASH (or similar), there’s a need to update the adaptation logic, as tiles add yet another dimension that needs to be considered in order to provide a good Quality of Experience (QoE).
  • QoE is a big issue here and not well covered in the literature. Various aspects are worth investigating, including a comprehensive dataset to enable reproducibility of research results in this domain. Finally, as omnidirectional video allows for interactivity, the user experience is also becoming an issue which needs to be covered within the research community.
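The extra dimension that tiles add to the adaptation logic can be made concrete with a small sketch. The Python function below is a deliberately naive greedy heuristic, not an algorithm from MPEG-DASH, OMAF, or the literature: every tile first gets the lowest bitrate, then tiles inside the viewport are upgraded before tiles outside it, as long as a throughput budget allows. The function name, tile identifiers, bitrate ladder, and budget are all hypothetical.

```python
from typing import Dict, List, Set


def allocate_tile_bitrates(tiles: List[str], viewport: Set[str],
                           ladder_kbps: List[int], budget_kbps: int) -> Dict[str, int]:
    """Greedy sketch: base quality everywhere, then upgrade viewport tiles first."""
    ladder = sorted(ladder_kbps)
    level = {t: 0 for t in tiles}      # index into the ladder per tile
    spent = ladder[0] * len(tiles)     # note: may already exceed a tiny budget
    in_view = [t for t in tiles if t in viewport]
    out_of_view = [t for t in tiles if t not in viewport]
    # Phase 1 upgrades viewport tiles round-robin; phase 2 spends what is left.
    for group in (in_view, out_of_view):
        upgraded = True
        while upgraded:
            upgraded = False
            for t in group:
                if level[t] + 1 < len(ladder):
                    step = ladder[level[t] + 1] - ladder[level[t]]
                    if spent + step <= budget_kbps:
                        level[t] += 1
                        spent += step
                        upgraded = True
    return {t: ladder[level[t]] for t in tiles}


tiles = ["t0", "t1", "t2", "t3"]
plan = allocate_tile_bitrates(tiles, {"t1", "t2"}, [500, 1500, 3000], 6000)
print(plan)  # {'t0': 500, 't1': 3000, 't2': 1500, 't3': 500}
```

A real client would additionally have to predict head movement and account for quality switches between neighboring tiles, which is exactly where the QoE questions raised above come in.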

A second topic I’d like to highlight in this blog post is related to the preliminary call for evidence on video compression with capability beyond HEVC. 

Preliminary Call for Evidence on video compression with capability beyond HEVC

A call for evidence is issued to see whether sufficient technological potential exists to start a more rigid phase of standardization. Currently, MPEG together with VCEG have developed a Joint Exploration Model (JEM) algorithm that is already known to provide bit rate reductions in the range of 20-30% for relevant test cases, as well as subjective quality benefits. The goal of this new standard (with a preliminary target date for completion around late 2020) is to develop technology providing better compression capability than the existing standard, not only for conventional video material but also for other domains such as HDR/WCG or VR/360-degree video. An important aspect in this area is certainly over-the-top video delivery (like with MPEG-DASH), which includes features such as scalability and Quality of Experience (QoE). Scalable video coding has been part of video coding standards since MPEG-2 but never reached widespread adoption. That might change in case it becomes a prime-time feature of a new video codec, as scalable video coding clearly shows benefits when doing dynamic adaptive streaming over HTTP. QoE has already found its way into video coding, at least when it comes to evaluating the results, where subjective tests are now an integral part of every new video codec developed by MPEG (in addition to the usual PSNR measurements). Therefore, the most interesting research topics from a multimedia communication point of view would be to optimize the DASH-like delivery of such new codecs with respect to scalability and QoE. Note that if you don’t like scalable video coding, feel free to propose something else as long as it reduces storage and networking costs significantly.

MPEG Workshop “Global Media Technology Standards for an Immersive Age”

On January 18, 2017 MPEG successfully held a public workshop on “Global Media Technology Standards for an Immersive Age” hosting a series of keynotes from Bitmovin, DVB, Orange, Sky Italia, and Technicolor. Stefan Lederer, CEO of Bitmovin, discussed today’s and future challenges with new forms of content like 360°, AR and VR. All slides are available here and MPEG took their feedback into consideration in an update of its 5-year standardization roadmap. David Wood (EBU) reported on the DVB VR study mission and Ralf Schaefer (Technicolor) presented a snapshot on VR services. Gilles Teniou (Orange) discussed video formats for VR, pointing out a new opportunity to increase the content value but also raising the question of what is missing today. Finally, Massimo Bertolotti (Sky Italia) introduced his view on the immersive media experience age.

Overall, the workshop was well attended and, as mentioned above, MPEG is currently working on a new standards project related to immersive media. Currently, this project comprises five parts. The first part comprises a technical report describing the scope (incl. kind of system architecture), use cases, and applications. The second part is OMAF (see above) and the third/fourth parts are related to immersive video and audio respectively. Part five is about point cloud compression.

For those interested, please check out the slides from industry representatives in this field and draw your own conclusions about what could be interesting for your own research. I’m happy to see any reactions, hints, etc. in the comments.

Finally, let’s have a look at what happened with MPEG-DASH, a topic with a long history on this blog.

MPEG-DASH and CMAF: Friend or Foe?

For MPEG-DASH and CMAF it was a meeting “in between” official standardization stages. MPEG-DASH experts are still working on the third edition, which will be a consolidated version of the 2nd edition and various amendments and corrigenda. In the meantime, MPEG issued a white paper on the new features of MPEG-DASH which I would like to highlight here.

  • Spatial Relationship Description (SRD): allows describing tiles and regions of interest for partial delivery of media presentations. This is highly related to OMAF and VR/360-degree video streaming.
  • External MPD linking: this feature allows describing the relationship between a single program/channel and a preview mosaic channel having all channels at once within the MPD.
  • Period continuity: a simple signaling mechanism to indicate whether one period is a continuation of the previous one, which is relevant for ad-insertion or live programs.
  • MPD chaining: allows for chaining two or more MPDs to each other, e.g., a pre-roll ad when joining a live program.
  • Flexible segment format for broadcast TV: separates the signaling of the switching points and random access points in each stream and, thus, the content can be encoded with good compression efficiency, yet allowing a higher number of random access points, but with a lower frequency of switching points.
  • Server and network-assisted DASH (SAND): enables asynchronous network-to-client and network-to-network communication of quality-related assisting information.
  • DASH with server push and WebSockets: basically addresses issues related to the HTTP/2 push feature and WebSocket.
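To give an idea of how a client might use SRD for partial delivery, the Python sketch below parses SRD-style values and selects the tiles that overlap a region of interest. It rests on assumptions that should be checked against the specification: the value is taken to be a comma-separated list in the order source_id, object_x, object_y, object_width, object_height, followed by optional total_width, total_height, and spatial_set_id, and the class and function names as well as the 2x2 grid are invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Srd:
    source_id: int
    x: int
    y: int
    w: int
    h: int
    total_w: Optional[int] = None
    total_h: Optional[int] = None
    spatial_set_id: Optional[int] = None


def parse_srd(value: str) -> Srd:
    """Parse an SRD-style value string (field order is an assumption, see text)."""
    fields = [int(f) for f in value.split(",")]
    if not 5 <= len(fields) <= 8:
        raise ValueError(f"unexpected number of SRD fields: {len(fields)}")
    return Srd(*fields)


def overlaps(tile: Srd, x: int, y: int, w: int, h: int) -> bool:
    """Axis-aligned rectangle intersection test between a tile and a region."""
    return (tile.x < x + w and x < tile.x + tile.w and
            tile.y < y + h and y < tile.y + tile.h)


def tiles_for_region(values: List[str], x: int, y: int, w: int, h: int) -> List[Srd]:
    """Return the tiles whose area intersects the requested region."""
    return [t for t in map(parse_srd, values) if overlaps(t, x, y, w, h)]


# Hypothetical 2x2 tile grid in units of one tile.
grid = ["0,0,0,1,1,2,2", "0,1,0,1,1,2,2", "0,0,1,1,1,2,2", "0,1,1,1,1,2,2"]
selected = tiles_for_region(grid, x=1, y=0, w=1, h=2)
print([(t.x, t.y) for t in selected])  # [(1, 0), (1, 1)]
```

Only the right-hand column of tiles would be requested in this example, which is the kind of saving that makes SRD attractive for VR/360-degree streaming.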

CMAF issued a study document which captures the current progress, and all national bodies are encouraged to take this into account when commenting on the Committee Draft (CD). To answer the question in the headline above, it looks more and more as if DASH and CMAF will become friends. Let’s hope that the friendship lasts for a long time.

What else happened at the MPEG meeting?

  • Committee Draft MORE (note: type in ‘man more’ on any unix/linux/mac terminal and you’ll get ‘less – opposite of more’;): MORE stands for “Media Orchestration” and provides a specification that enables the automated combination of multiple media sources (cameras, microphones) into a coherent multimedia experience. Additionally, it targets use cases where a multimedia experience is rendered on multiple devices simultaneously, again giving a consistent and coherent experience.
  • Technical Report on HDR/WCG Video Coding: This technical report comprises conversion and coding practices for High Dynamic Range (HDR) and Wide Colour Gamut (WCG) video coding (ISO/IEC 23008-14). The purpose of this document is to provide a set of publicly referenceable recommended guidelines for the operation of AVC or HEVC systems adapted for compressing HDR/WCG video for consumer distribution applications.
  • CfP Point Cloud Compression (PCC): This call solicits technologies for the coding of 3D point clouds with associated attributes such as color and material properties. It will be part of the immersive media project introduced above.
  • MPEG-H 3D Audio verification test report: This report presents results of four subjective listening tests that assessed the performance of the Low Complexity Profile of MPEG-H 3D Audio. The tests covered a range of bit rates and a range of “immersive audio” use cases (i.e., from 22.2 down to 2.0 channel presentations). Seven test sites participated in the tests with a total of 288 listeners.

The next MPEG meeting will be held in Hobart, April 3-7, 2017. Feel free to contact us for any questions or comments.

MPEG Column: 115th MPEG Meeting

The original blog post can be found at the Bitmovin Techblog and has been updated here to focus on and highlight research aspects.

The 115th MPEG meeting was held in Geneva, Switzerland and its press release highlights the following aspects:

  • MPEG issues Genomic Information Compression and Storage joint Call for Proposals in conjunction with ISO/TC 276/WG 5
  • Plug-in free decoding of 3D objects within Web browsers
  • MPEG-H 3D Audio AMD 3 reaches FDAM status
  • Common Media Application Format for Dynamic Adaptive Streaming Applications
  • 4th edition of AVC/HEVC file format

In this blog post, however, I will cover topics specifically relevant for adaptive media streaming, namely:

  • Recent developments in MPEG-DASH
  • Common media application format (CMAF)
  • MPEG-VR (virtual reality)
  • The MPEG roadmap/vision for the future.

MPEG-DASH Server and Network assisted DASH (SAND): ISO/IEC 23009-5

Part 5 of MPEG-DASH, referred to as SAND – server and network-assisted DASH – has reached FDIS. This work item started some time ago at a public MPEG workshop during the 105th MPEG meeting in Vienna. The goal of this part of MPEG-DASH is to enhance the delivery of DASH content by introducing messages between DASH clients and network elements or between various network elements. These messages aim to improve the efficiency of streaming sessions by providing information about real-time operational characteristics of networks, servers, proxies, caches, and CDNs as well as about the DASH client’s performance and status. In particular, it defines the following:

  1. The SAND architecture which identifies the SAND network elements and the nature of SAND messages exchanged among them.
  2. The semantics of SAND messages exchanged between the network elements present in the SAND architecture.
  3. An encoding scheme for the SAND messages.
  4. The minimum to implement a SAND message delivery protocol.

The way that this information is to be utilized is deliberately not defined within the standard and left open for (industry) competition (or other standards developing organizations). In any case, there’s plenty of room for research activities around the topic of SAND, specifically:

  • A main issue is the evaluation of MPEG-DASH SAND in terms of qualitative and quantitative improvements with respect to QoS/QoE. Some papers are available already and have been published within ACM MMSys 2016.
  • Another topic of interest includes an analysis regarding scalability and possible overhead; in other words, I’m wondering whether it’s worth using SAND to improve DASH.

MPEG-DASH with Server Push and WebSockets: ISO/IEC 23009-6

Part 6 of MPEG-DASH reached DIS stage and deals with server push and WebSockets, i.e., it specifies the carriage of MPEG-DASH media presentations over full duplex HTTP-compatible protocols, particularly HTTP/2 and WebSocket. The specification comes with a set of generic definitions for which bindings are defined, allowing its usage in various formats. Currently, the specification supports HTTP/2 and WebSocket.

For the former it is required to define the push policy as an HTTP header extension whereas the latter requires the definition of a DASH subprotocol. Luckily, these are the preferred extension mechanisms for both HTTP/2 and WebSocket and, thus, interoperability is provided. The question of whether or not the industry will adopt these extensions cannot be answered right now but I would recommend keeping an eye on this and there are certainly multiple research topics worth exploring in the future.

An interesting aspect for the research community would be to quantify the utility of using push methods within dynamic adaptive environments in terms of QoE and start-up delay. Some papers provide preliminary answers but a comprehensive evaluation is missing.

To conclude the recent MPEG-DASH developments, the DASH-IF recently established the Excellence in DASH Award at ACM MMSys’16 and the winners are presented here (including some of the recent developments described in this blog post).

Common Media Application Format (CMAF): ISO/IEC 23000-19

The goal of CMAF is to enable application consortia to reference a single MPEG specification (i.e., a “common media format”) that would allow a single media encoding to be used across many applications and devices. Therefore, CMAF defines the encoding and packaging of segmented media objects for delivery and decoding on end user devices in adaptive multimedia presentations. This sounds very familiar and reminds us a bit of what the DASH-IF is doing with their interoperability points. One of the goals of CMAF is to integrate HLS in MPEG-DASH, which is backed up by this WWDC video where Apple announces the support of fragmented MP4 in HLS. The streaming of this announcement is only available in Safari and through the WWDC app, but Bitmovin has shown that it also works on Mac iOS 10 and above, and for PC users on all recent browser versions including Edge, Firefox, Chrome, and (of course) Safari.

MPEG Virtual Reality

Virtual reality is becoming a hot topic across the industry (and also academia), which also reaches standards developing organizations like MPEG. Therefore, MPEG established an ad-hoc group (with an email reflector) to develop a roadmap required for MPEG-VR. Others have also started working on this, like DVB, DASH-IF, and QUALINET (and maybe many others: W3C, 3GPP). In any case, it shows that there’s a massive interest in this topic, and Bitmovin has already shown what can be done in this area within today’s Web environments. Obviously, adaptive streaming is an important aspect for VR applications, including many research questions to be addressed in the (near) future. A first step towards a concrete solution is the Omnidirectional Media Application Format (OMAF), which is currently at working draft stage (details to be provided in a future blog post).

The research aspects covers a wide range activity including – but not limited to – content capturing, content representation, streaming/network optimization, consumption, and QoE.

MPEG roadmap/vision

At its 115th meeting, MPEG published a document that lays out its medium-term strategic standardization roadmap. The goal of this document is to collect feedback from anyone in professional and B2B industries dealing with media, specifically but not limited to broadcasting, content and service provision, media equipment manufacturing, and the telecommunication industry. The roadmap is depicted below and further described in the document available here. Please note that “360 AV” in the figure below also refers to VR but unfortunately it’s not (yet) reflected in the figure. However, it points out the aspects to be addressed by MPEG in the future which would be relevant for both industry and academia.

MPEG-Roadmap

The next MPEG meeting will be held in Chengdu, October 17-21, 2016.

MPEG Column: 112th MPEG Meeting

This blog post is also available at the bitmovin tech blog and blog.timmerer.com.

The 112th MPEG meeting in Warsaw, Poland was a special meeting for me. It was my 50th MPEG meeting which roughly accumulates to one year of MPEG meetings (i.e., one year of my life I’ve spent in MPEG meetings incl. traveling – scary, isn’t it? … more on this in another blog post). But what happened at this 112th MPEG meeting (my 50th meeting)…

  • Requirements: CDVA, Future of Video Coding Standardization (no acronym yet), Genome compression
  • Systems: M2TS (ISO/IEC 13818-1:2015), DASH 3rd edition, Media Orchestration (no acronym yet), TRUFFLE
  • Video/JCT-VC/JCT-3D: MPEG-4 AVC, Future Video Coding, HDR, SCC
  • Audio: 3D audio
  • 3DG: PCC, MIoT, Wearable

MPEG Friday Plenary. Photo (c) Christian Timmerer.

As usual, the official press release and other publicly available documents can be found here. Let’s dig into the different subgroups:

Requirements

In requirements experts were working on the Call for Proposals (CfP) for Compact Descriptors for Video Analysis (CDVA) including an evaluation framework. The evaluation framework includes 800-1000 objects (large objects like building facades, landmarks, etc.; small(er) objects like paintings, books, statues, etc.; scenes like interior scenes, natural scenes, multi-camera shots) and the evaluation of the responses should be conducted for the 114th meeting in San Diego.

The future of video coding standardization is currently happening in MPEG and shaping the way for the successor of the HEVC standard. The current goal is providing (native) support for scalability (more than two spatial resolutions) and 30% compression gain for some applications (requiring a limited increase in decoder complexity) but actually preferred is 50% compression gain (at a significant increase of the encoder complexity). MPEG will hold a workshop at the next meeting in Geneva discussing specific compression techniques, objective (HDR) video quality metrics, and compression technologies for specific applications (e.g., multiple-stream representations, energy-saving encoders/decoders, games, drones). The current goal is having the International Standard for this new video coding standard around 2020.

MPEG has recently started a new project referred to as Genome Compression which is, of course, about the compression of genome information. A big dataset has been collected and experts are working on the Call for Evidence (CfE). The plan is to hold a workshop at the next MPEG meeting in Geneva regarding the prospects of Genome Compression and Storage Standardization targeting users, manufacturers, service providers, technologists, etc.

Summer in Warsaw. Photo (c) Christian Timmerer.

Systems

The 5th edition of the MPEG-2 Systems standard has been published as ISO/IEC 13818-1:2015 on the 1st of July 2015 and is a consolidation of the 4th edition + Amendments 1-5.

In terms of MPEG-DASH, the draft text of ISO/IEC 23009-1 3rd edition comprising 2nd edition + COR 1 + AMD 1 + AMD 2 + AMD 3 + COR 2 is available for committee internal review. The expected publication date is scheduled for, most likely, 2016. Currently, MPEG-DASH includes a lot of activity in the following areas: spatial relationship description, generalized URL parameters, authentication, access control, multiple MPDs, full duplex protocols (aka HTTP/2 etc.), advanced and generalized HTTP feedback information, and various core experiments:

  • SAND (Server and Network Assisted DASH)
  • FDH (Full Duplex DASH)
  • SAP-Independent Segment Signaling (SISSI)
  • URI Signing for DASH
  • Content Aggregation and Playback Control (CAPCO)

In particular, the core experiment process is very open as most work is conducted during the Ad hoc Group (AhG) period which is discussed on the publicly available MPEG-DASH reflector.

MPEG systems recently started an activity that is related to media orchestration which applies to capture as well as consumption and concerns scenarios with multiple sensors as well as multiple rendering devices, including one-to-many and many-to-one scenarios resulting in a worthwhile, customized experience.

Finally, the systems subgroup started an exploration activity regarding real-time streaming of files (a.k.a. TRUFFLE) which should perform a gap analysis leading to extensions of the MPEG Media Transport (MMT) standard. However, some experts within MPEG concluded that most/all use cases identified within this activity could actually be solved with existing technology such as DASH. Thus, this activity may still need some discussions…

Video/JCT-VC/JCT-3D

The MPEG video subgroup is working towards a new amendment for the MPEG-4 AVC standard covering resolutions up to 8K and higher frame rates for lower resolutions. Interestingly, although MPEG most of the time is ahead of industry, 8K and high frame rate are already supported in browser environments (e.g., using bitdash 8K, HFR) and modern encoding platforms like bitcodin. However, it’s good that we finally have means for an interoperable signaling of this profile.

In terms of future video coding standardization, the video subgroup released a call for test material. Two sets of test sequences are already available and will be investigated regarding compression until next meeting.

After a successful call for evidence for High Dynamic Range (HDR), the technical work starts in the video subgroup with the goal to develop an architecture (“H2M”) as well as three core experiments (optimization without HEVC specification change, alternative reconstruction approaches, objective metrics).

The main topic of the JCT-VC was screen content coding (SCC), which came up with new coding tools that better compress content that is (fully or partially) computer generated, leading to a significant improvement in compression: approximately 50% or more rate reduction for specific screen content.

Audio

The audio subgroup is mainly concentrating on 3D audio where they identified the need for intermediate bitrates between 3D audio phase 1 and 2. Currently, phase 1 identified 256, 512, 1200 kb/s whereas phase 2 focuses on 128, 96, 64, 48 kb/s. The broadcasting industry needs intermediate bitrates and, thus, phase 2 is extended to bitrates between 128 and 256 kb/s.

3DG

MPEG 3DG is working on point cloud compression (PCC) for which open source software has been identified. Additionally, there are new activities in the area of Media Internet of Things (MIoT) and wearable computing (like glasses and watches) that could lead to new standards developed within MPEG. Therefore, stay tuned on these topics as they may shape your future.

The week after the MPEG meeting I met the MPEG convenor and the JPEG convenor again during ICME2015 in Torino but that’s another story…

L. Chiariglione, H. Hellwagner, T. Ebrahimi, C. Timmerer (from left to right) during ICME2015. Photo (c) T. Ebrahimi.

MPEG Column: 111th MPEG Meeting

Original posts here by Multimedia Communication blog. Christian Timmerer, AAU/bitmovin

The 111th MPEG meeting (note: link includes press release and all publicly available output documents) was held in Geneva, Switzerland showing up some interesting aspects which I’d like to highlight here. Undoubtedly, it was the shortest meeting I’ve ever attended (and my first meeting was #61) as the final plenary concluded at 2015/02/20T18:18!

MPEG111 opening plenary

In terms of the requirements (subgroup) it’s worth mentioning the call for evidence (CfE) for high-dynamic range (HDR) and wide color gamut (WCG) video coding which comprises a first milestone towards a new video coding format. The purpose of this CfE is to explore whether or not (a) the coding efficiency and/or (b) the functionality of the HEVC Main 10 and Scalable Main 10 profiles can be significantly improved for HDR and WCG content. In addition, the requirements subgroup issued a draft call for evidence on free viewpoint TV. Both documents are publicly available here.

The video subgroup continued discussions related to the future of video coding standardisation and issued a public document requesting contributions on “future video compression technology”. Interesting application requirements come from over-the-top streaming use cases which request HDR and WCG as well as video over cellular networks. Well, at least the former is something to be covered by the CfE mentioned above. Furthermore, features like scalability and perceptual quality are something that should be considered from the ground up and not (only) as an extension. Yes, scalability is something that really helps a lot in OTT streaming: it eases content management, enables cache-efficient delivery, and allows for more aggressive buffer modelling and, thus, adaptation logic within the client, enabling better Quality of Experience (QoE) for the end user. It seems like complexity (at the encoder) is not so much of a concern as long as it scales with cloud deployments such as http://www.bitcodin.com/ (e.g., the bitdash demo area shows some neat 4K/8K/HFR DASH demos which have been encoded with bitcodin). Closely related to 8K, there’s a new AVC amendment coming up covering 8K although one can do it already today (see before) but it’s good to have standards support for this. For HEVC, the JCT-3D/VC issued the FDAM4 for 3D Video Extensions and started with PDAM5 for Screen Content Coding Extensions (both documents being publicly available after an editing period of about a month).

And what about audio? The audio subgroup has decided that ISO/IEC DIS 23008-3 3D Audio shall be promoted directly to IS, which means that the DIS was already in such a good state that only editorial comments are applied, which actually saves a balloting cycle. We have to congratulate the audio subgroup for this remarkable milestone.

Finally, I’d like to discuss a few topics related to DASH which is progressing towards its 3rd edition which will incorporate amendment 2 (Spatial Relationship Description, Generalized URL parameters and other extensions), amendment 3 (Authentication, Access Control and multiple MPDs), and everything else that will be incorporated within this year, like some aspects documented in the technologies under consideration or currently being discussed within the core experiments (CE). Currently, MPEG-DASH conducts 5 core experiments:

  • Server and Network Assisted DASH (SAND)
  • DASH over Full Duplex HTTP-based Protocols (FDH)
  • URI Signing for DASH (CE-USD)
  • SAP-Independent Segment Signaling (SISSI)
  • Content aggregation and playback control (CAPCO)

The description of core experiments is publicly available and, compared to the previous meeting, we have a new CE which is about content aggregation and playback control (CAPCO) which “explores solutions for aggregation of DASH content from multiple live and on-demand origin servers, addressing applications such as creating customized on-demand and live programs/channels from multiple origin servers per client, targeted preroll ad insertion in live programs and also limiting playback by client such as no-skip or no fast forward.” This process is quite open and anybody can join by subscribing to the email reflector.

The CE for DASH over Full Duplex HTTP-based Protocols (FDH) is becoming a major activity and basically defines the usage of DASH with the push features of WebSockets and HTTP/2. At this meeting, MPEG issued a working draft, and the CE on Server and Network Assisted DASH (SAND) got its own part 5 where it goes to CD, but the documents are not publicly available. However, I’m pretty sure I can report more on this next time, so stay tuned or feel free to comment here.

MPEG Column: 110th MPEG Meeting

Original posts here by Multimedia Communication blog. Christian Timmerer, AAU/bitmovin

The 110th MPEG meeting was held at the Strasbourg Convention and Conference Centre featuring the following highlights:

  • The future of video coding standardization
  • Workshop on media synchronization
  • Standards at FDIS: Green Metadata and CDVS
  • What’s happening in MPEG-DASH?

Additional details about MPEG’s 110th meeting can be also found here including the official press release and all publicly available documents.

The Future of Video Coding Standardization

MPEG110 hosted a panel discussion about the future of video coding standardization. The panel was organized jointly by MPEG and ITU-T SG 16’s VCEG featuring Roger Bolton (Ericsson), Harald Alvestrand (Google), Zhong Luo (Huawei), Anne Aaron (Netflix), Stéphane Pateux (Orange), Paul Torres (Qualcomm), and JeongHoon Park (Samsung).

As expected, “maximizing compression efficiency remains a fundamental need” and as usual, MPEG will study “future application requirements, and the availability of technology developments to fulfill these requirements”. Therefore, two Ad-hoc Groups (AhGs) have been established which are open to the public:

The presentations of the brainstorming session on the future of video coding standardization can be found here.

Workshop on Media Synchronization

MPEG110 also hosted a workshop on media synchronization for hybrid delivery (broadband-broadcast) featuring six presentations “to better understand the current state-of-the-art for media synchronization and identify further needs of the industry”.

  • An overview of MPEG systems technologies providing advanced media synchronization, Youngkwon Lim, Samsung
  • Hybrid Broadcast – Overview of DVB TM-Companion Screens and Streams specification, Oskar van Deventer, TNO
  • Hybrid Broadcast-Broadband distribution for new video services: a use cases perspective, Raoul Monnier, Thomson Video Networks
  • HEVC and Layered HEVC for UHD deployments, Ye Kui Wang, Qualcomm
  • A fingerprinting-based audio synchronization technology, Masayuki Nishiguchi, Sony Corporation
  • Media Orchestration from Capture to Consumption, Rob Koenen, TNO

The presentation material is available here. Additionally, MPEG established an AhG on timeline alignment (that’s how the project is internally called) to study use cases and solicit contributions on gap analysis and also technical contributions [email][subscription].

Standards at FDIS: Green Metadata and CDVS

My first report on MPEG Compact Descriptors for Visual Search (CDVS) dates back to July 2011 which provides details about the call for proposals. Now, finally, the FDIS has been approved during the 110th MPEG meeting. CDVS defines a compact image description that facilitates the comparison and search of pictures that include similar content, e.g. when showing the same objects in different scenes from different viewpoints. The compression of key point descriptors not only increases compactness, but also significantly speeds up, when compared to a raw representation of the same underlying features, the search and classification of images within large image databases. Application of CDVS for real-time object identification, e.g. in computer vision and other applications, is envisaged as well.

Another standard, entitled Green Metadata (first reported in August 2012), reached FDIS status. This standard specifies the format of metadata that can be used to reduce energy consumption from the encoding, decoding, and presentation of media content, while simultaneously controlling or avoiding degradation in the Quality of Experience (QoE). Moreover, the metadata specified in this standard can facilitate a trade-off between energy consumption and QoE. MPEG is also working on amendments to the ubiquitous MPEG-2 TS ISO/IEC 13818-1 and ISOBMFF ISO/IEC 14496-12 so that green metadata can be delivered by these formats.

What’s happening in MPEG-DASH?

MPEG-DASH is in a kind of maintenance mode but is still receiving new proposals in the area of SAND parameters, and some core experiments are going on. Also, the DASH-IF is working towards new interoperability points and test vectors in preparation of actual deployments. When speaking about deployments, they are happening, e.g., a 40h live stream right before Christmas (by bitmovin, a top-100 company that matters most in online video). Additionally, VideoNext was co-located with CoNEXT’14 targeting scientific presentations about the design, quality and deployment of adaptive video streaming. Webex recordings of the talks are available here. In terms of standardization, MPEG-DASH is progressing towards the 2nd amendment including spatial relationship description (SRD), generalized URL parameters and other extensions. In particular, SRD will enable new use cases which can be only addressed using MPEG-DASH and the FDIS is scheduled for the next meeting which will be in Geneva, Feb 16-20, 2015. I’ll report on this within my next blog post, stay tuned.
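To make the SRD idea a bit more tangible, here is a minimal sketch of how a client might read SRD values and pick the tiles that cover a region of interest. It assumes the comma-separated value layout of the urn:mpeg:dash:srd:2014 scheme as I understand it (source_id, object_x, object_y, object_width, object_height, followed by the optional total_width, total_height, and spatial_set_id); the names SRD, parse_srd, and overlaps are made up for this illustration and do not come from the specification or any player API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SRD:
    """One SRD descriptor value (field order is an assumption, see text)."""
    source_id: int
    x: int
    y: int
    width: int
    height: int
    total_width: Optional[int] = None
    total_height: Optional[int] = None
    spatial_set_id: Optional[int] = None


def parse_srd(value: str) -> SRD:
    """Parse the comma-separated 'value' attribute of an SRD property."""
    parts = [int(p) for p in value.split(",")]
    if not 5 <= len(parts) <= 8:
        raise ValueError(f"expected 5 to 8 integers, got {len(parts)}")
    return SRD(*parts)


def overlaps(tile: SRD, x: int, y: int, w: int, h: int) -> bool:
    """True if the tile rectangle intersects the region (x, y, w, h)."""
    return (tile.x < x + w and x < tile.x + tile.width
            and tile.y < y + h and y < tile.y + tile.height)


# Hypothetical 2x2 tiling of a 1920x1080 source.
tiles = [parse_srd(v) for v in (
    "0,0,0,960,540,1920,1080",
    "0,960,0,960,540,1920,1080",
    "0,0,540,960,540,1920,1080",
    "0,960,540,960,540,1920,1080",
)]

# Region of interest straddling the two upper tiles.
visible = [t for t in tiles if overlaps(t, 800, 100, 400, 200)]
```

A client could then request only the visible tiles in high quality, which is the kind of use case SRD is meant to enable.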

MPEG Column: 108th MPEG Meeting

Original posts here and here by Multimedia Communication blog and bitmovin techblog. Christian Timmerer, AAU/bitmovin

The 108th MPEG meeting was held at the Palacio de Congresos de Valencia in Spain featuring the following highlights (no worries about the acronyms, this is on purpose and they will be further explained below):

  • Requirements: PSAF, SCC, CDVA
  • Systems: M2TS, MPAF, Green Metadata
  • Video: CDVS, WVC, VCB
  • JCT-VC: SHVC, SCC
  • JCT-3D: MV/3D-HEVC, 3D-AVC
  • Audio: 3D audio

Opening Plenary of the 108th MPEG meeting in Valencia, Spain.

The official MPEG press release can be downloaded from the MPEG Web site. Some of the above highlighted topics will be detailed in the following and, of course, there’s an update on DASH-related matters at the end.

As indicated above, MPEG is full of (new) acronyms and in order to become familiar with those, I’ve put them deliberately in the overview but I will explain them further below.

PSAF – Publish/Subscribe Application Format

Publish/subscribe corresponds to a new network paradigm related to content-centric networking (or information-centric networking) where the content is addressed by its name rather than location. An application format within MPEG typically defines a combination of existing MPEG tools jointly addressing the needs for a given application domain, in this case, the publish/subscribe paradigm. The current requirements and a preliminary working draft are publicly available.
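The paradigm itself can be sketched in a few lines: subscribers register interest in a content name, and publishers push data under that name without knowing who or where the receivers are. This is only a toy, in-process illustration of name-based addressing; NameBroker and the example names are hypothetical and have nothing to do with the actual PSAF working draft.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[str, bytes], None]


class NameBroker:
    """Toy broker: content is addressed by name, not by location."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, name: str, callback: Callback) -> None:
        """Register interest in everything published under 'name'."""
        self._subscribers[name].append(callback)

    def publish(self, name: str, payload: bytes) -> int:
        """Deliver the payload to all subscribers of 'name'; return their count."""
        callbacks = self._subscribers.get(name, [])
        for callback in callbacks:
            callback(name, payload)
        return len(callbacks)


received = []
broker = NameBroker()
broker.subscribe("/mpeg/108/press-release", lambda n, p: received.append((n, p)))
delivered = broker.publish("/mpeg/108/press-release", b"PSAF working draft")
ignored = broker.publish("/mpeg/108/unknown", b"nobody listens")
```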

SCC – Screen Content Coding

I’ve introduced this topic in my previous report and at this meeting the responses to the CfP have been evaluated. In total, seven responses have been received which meet all requirements and, thus, the actual standardization work is transferred to JCT-VC. Interestingly, the results of the CfP are publicly available. Within JCT-VC, a first test model has been defined and core experiments have been established. I will report more on this as an output of the next meetings…

CDVA – Compact Descriptors for Video Analysis

This project has been renamed from compact descriptors for video search to compact descriptors for video analysis and comprises a publicly available vision statement. That is, interested parties are welcome to join this new activity within MPEG.

M2TS – MPEG-2 Transport Stream

At this meeting, various extensions to M2TS have been defined such as transport of multi-view video coding depth information and extensions to HEVC, delivery of timeline for external data as well as carriage of layered HEVC, green metadata, and 3D audio. Hence, M2TS is still very active and multiple amendments are developed in parallel.

MPAF – Multimedia Preservation Application Format

The committee draft for MPAF has been approved and, in this context, MPEG-7 is extended with additional description schemes.

Green Metadata

Well, this standard does not have its own acronym; it’s simply referred to as MPEG-GREEN. The draft international standard has been approved and national bodies will vote on it at the JTC 1 level. It basically defines metadata to allow clients to operate in an energy-efficient way. It comes along with amendments to M2TS and ISOBMFF that enable the carriage and storage of this metadata.

CDVS – Compact Descriptors for Visual Search

CDVS is at DIS stage and provides improvements on global descriptors as well as non-normative improvements of key-point detection and matching in terms of speedup and memory consumption. As with all standards at DIS stage, national bodies will vote on it at the JTC 1 level.

What’s new in the video/audio-coding domain?

  • WVC – Web Video Coding: This project reached final draft international standard with the goal to provide a video-coding standard for Web applications. It basically defines a profile of the MPEG-AVC standard including those tools not encumbered by patents.
  • VCB – Video Coding for Browsers: The committee draft for part 31 of MPEG-4 defines video coding for browsers and basically defines VP8 as an international standard. This also explains the difference to WVC.
  • SHVC – Scalable HEVC extensions: As for SVC, SHVC will be defined as an amendment to HEVC providing the same functionality as SVC, i.e., scalable video coding.
  • MV/3D-HEVC, 3D-AVC: These are multi-view and 3D extensions for the HEVC and AVC standards respectively.
  • 3D Audio: Also, no acronym for this standard although I would prefer 3DA. However, CD has been approved at this meeting and the plan is to have DIS at the next meeting. At the same time, the carriage and storage of 3DA is being defined in M2TS and ISOBMFF respectively.

Finally, what’s new in the media transport area, specifically DASH and MMT?

As interested readers know from my previous reports, DASH 2nd edition has been approved some time ago. In the meantime, a first amendment to the 2nd edition is at draft amendment state including additional profiles (mainly adding xlink support) and time synchronization. A second amendment goes to the first ballot stage referred to as proposed draft amendment and defines spatial relationship description, generalized URL parameters, and other extensions. Eventually, these two amendments will be integrated in the 2nd edition which will become the MPEG-DASH 3rd edition. Also, a corrigendum on the 2nd edition is currently under ballot and new contributions are still coming in, i.e., there is still a lot of interest in DASH. For your information – there will be two DASH-related sessions at Streaming Forum 2014.

On the other hand, MMT’s amendment 1 is currently under ballot and amendment 2 defines header compression and cross-layer interface. The latter has been progressed to a study document which will be further discussed at the next meeting. Interestingly, there will be an MMT developer’s day at the 109th MPEG meeting as in Japan, 4K/8K UHDTV services will be launched based on MMT specifications and in Korea and China, implementation of MMT is now under way. The developer’s day will be on July 5th (Saturday), 2014, 10:00 – 17:00 at the Sapporo Convention Center. Therefore, if you don’t know anything about MMT, the developer’s day is certainly a place to be.

Contact:

Dr. Christian Timmerer
CIO bitmovin GmbH | christian.timmerer@bitmovin.net
Alpen-Adria-Universität Klagenfurt | christian.timmerer@aau.at

What else? That is, some publicly available MPEG output documents… (Dates indicate availability and end of editing period, if applicable, using the following format YY/MM/DD):

  • Text of ISO/IEC 13818-1:2013 PDAM 7 Carriage of Layered HEVC (14/05/02)
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of Green Metadata (14/04/04)
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of 3D Audio (14/04/04)
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of additional audio profiles & levels (14/04/04)
  • Text of ISO/IEC 14496-12:2012 PDAM 4 Enhanced audio support (14/04/04)
  • TuC on sample variants, signatures and other improvements for the ISOBMFF (14/04/04)
  • Text of ISO/IEC CD 14496-22 3rd edition (14/04/04)
  • Text of ISO/IEC CD 14496-31 Video Coding for Browsers (14/04/11)
  • Text of ISO/IEC 15938-5:2005 PDAM 5 Multiple text encodings, extended classification metadata (14/04/04)
  • WD 2 of ISO/IEC 15938-6:201X (2nd edition) (14/05/09)
  • Text of ISO/IEC DIS 15938-13 Compact Descriptors for Visual Search (14/04/18)
  • Test Model 10: Compact Descriptors for Visual Search (14/05/02)
  • WD of ARAF 2nd Edition (14/04/18)
  • Use cases for ARAF 2nd Edition (14/04/18)
  • WD 5.0 MAR Reference Model (14/04/18)
  • Logistic information for the 5th JAhG MAR meeting (14/04/04)
  • Text of ISO/IEC CD 23000-15 Multimedia Preservation Application Format (14/04/18)
  • WD of Implementation Guideline of MP-AF (14/04/04)
  • Requirements for Publish/Subscribe Application Format (PSAF) (14/04/04)
  • Preliminary WD of Publish/Subscribe Application Format (14/04/04)
  • WD2 of ISO/IEC 23001-4:201X/Amd.1 Parser Instantiation from BSD (14/04/11)
  • Text of ISO/IEC 23001-8:2013/DCOR1 (14/04/18)
  • Text of ISO/IEC DIS 23001-11 Green Metadata (14/04/25)
  • Study Text of ISO/IEC 23002-4:201x/DAM2 FU and FN descriptions for HEVC (14/04/04)
  • Text of ISO/IEC 23003-4 CD, Dynamic Range Control (14/04/11)
  • MMT Developers’ Day in 109th MPEG meeting (14/04/04)
  • Results of CfP on Screen Content Coding Tools for HEVC (14/04/30)
  • Study Text of ISO/IEC 23008-2:2013/DAM3 HEVC Scalable Extensions (14/06/06)
  • HEVC RExt Test Model 7 (14/06/06)
  • Scalable HEVC (SHVC) Test Model 6 (SHM 6) (14/06/06)
  • Report on HEVC compression performance verification testing (14/04/25)
  • HEVC Screen Content Coding Test Model 1 (SCM 1) (14/04/25)
  • Study Text of ISO/IEC 23008-2:2013/PDAM4 3D Video Extensions (14/05/15)
  • Test Model 8 of 3D-HEVC and MV-HEVC (14/05/15)
  • Text of ISO/IEC 23008-3/CD, 3D audio (14/04/11)
  • Listening Test Logistics for 3D Audio Phase 2 (14/04/04)
  • Active Downmix Control (14/04/04)
  • Text of ISO/IEC PDTR 23008-13 Implementation Guidelines for MPEG Media Transport (14/05/02)
  • Text of ISO/IEC 23009-1 2nd edition DAM 1 Extended Profiles and availability time synchronization (14/04/18)
  • Text of ISO/IEC 23009-1 2nd edition PDAM 2 Spatial Relationship Description, Generalized URL parameters and other extensions (14/04/18)
  • Text of ISO/IEC PDTR 23009-3 2nd edition DASH Implementation Guidelines (14/04/18)
  • MPEG vision for Compact Descriptors for Video Analysis (CDVA) (14/04/04)
  • Plan of FTV Seminar at 109th MPEG Meeting (14/04/04)
  • Draft Requirements and Explorations for HDR /WCG Content Distribution and Storage (14/04/04)
  • Working Draft 2 of Internet Video Coding (IVC) (14/04/18)
  • Internet Video Coding Test Model (ITM) v 9.0 (14/04/18)
  • Uniform Timeline Alignment (14/04/18)
  • Plan of Seminar on Hybrid Delivery at the 110th MPEG Meeting (14/04/04)
  • WD 2 of MPEG User Description (14/04/04)

MPEG Column: 107th MPEG Meeting

Original posts here and here by Multimedia Communication blog and bitmovin techblog. Christian Timmerer, AAU/bitmovin

The MPEG-2 Transport Stream (M2TS; formally known as Rec. ITU-T H.222.0 | ISO/IEC 13818-1) has been awarded with the Technology & Engineering Emmy® Award by the National Academy of Television Arts & Sciences. It is the fourth time MPEG received an Emmy award. The M2TS is widely deployed across a broad range of application domains such as broadcast, cable TV, Internet TV (IPTV and OTT), and Blu-ray Discs. The Emmy was received during this year’s CES2014 in Las Vegas.

Plenary during the 107th MPEG Meeting.

Other topics of the 107th MPEG meeting in San Jose include the following highlights:

  • Requirements: Call for Proposals on Screen Content jointly with ITU-T’s Video Coding Experts Group (VCEG)
  • Systems: Committee Draft for Green Metadata
  • Video: Study Text Committee Draft for Compact Descriptors for Visual Search (CDVS)
  • JCT-VC: Draft Amendment for HEVC Scalable Extensions (SHVC)
  • JCT-3D: Proposed Draft Amendment for HEVC 3D Extensions (3D-HEVC)
  • Audio: 3D audio plans to progress to CD at 108th meeting
  • 3D Graphics: Working Draft 4.0 of Augmented Reality Application Format (ARAF) 2nd Edition

The official MPEG press release can be downloaded from the MPEG Web site. Some of the above highlighted topics will be detailed in the following and, of course, there’s an update on DASH-related matters at the end.

Call for Proposals on Screen Content

Screen content refers to content coming not from cameras but from screen/desktop sharing and collaboration, cloud computing and gaming, wirelessly connected displays, control rooms with high resolution display walls, virtual desktop infrastructures, tablets as secondary displays, PC over IP, ultra-thin client technology, etc. Also mixed-content is within the scope of this work item and may contain a mixture of camera-captured video and images with rendered computer-generated graphics, text, animation, etc.

Although this type of content was considered during the course of the HEVC standardization, recent studies in MPEG have led to the conclusion that significant further improvements in coding efficiency can be obtained by exploiting the characteristics of screen content and, thus, a Call for Proposals (CfP) is being issued for developing possible future extensions of the HEVC standard.

Companies and organizations are invited to submit proposals in response to this call, issued jointly by MPEG with ITU-T VCEG. Responses are expected to be submitted by early March, and will be evaluated during the 108th MPEG meeting. The timeline is as follows:

  • 2014/01/17: Final Call for Proposals
  • 2014/01/22: Availability of anchors and end of editing period for Final CfP
  • 2014/02/10: Mandatory registration deadline
    One of the contact persons (see Section 10) must be notified, and an invoice for the testing fee will be sent after registration. Additional logistic information will also be sent to proponents by this date.
  • 2014/03/05: Coded test material shall be available at the test site. By this date, the payment of the testing fee is expected to be finalized.
  • 2014/03/17: Submission of all documents and requested data associated with the proposal.
  • 2014/03/27-04/04: Evaluation of proposals at standardization meeting.
  • 2015: Final draft standard expected.

It will be interesting to see the coding efficiency of the submitted proposals compared to a pure HEVC or even AVC approach.

DEC PDP-8 at Computer History Museum during MPEG Social Event.

Committee Draft for Green Metadata

Green Metadata, formerly known as Green MPEG, shall enable energy-efficient media consumption and reached Committee Draft (CD) status at the 107th MPEG meeting. The representation formats defined within Green Metadata help reduce decoder power consumption and display power consumption. Clients may utilize such information for the adaptive selection of operating voltage or clock frequencies within their chipsets. Additionally, it may be used to set the brightness of the display backlight to save power.

Green Metadata also provides metadata for the signaling and selection of DASH representations to enable the reduction of power consumption for their encoding.

The main challenge in terms of adoption of this kind of technology is how to exploit these representation formats to actually achieve energy-efficient media consumption, and how much energy can be saved in practice.

What’s new on the DASH frontier?

The text of ISO/IEC 23009-1 2nd edition PDAM1 has been approved, which may be referred to as MPEG-DASH v3 (once finalized and integrated into the second edition, possibly with further amendments and corrigenda, if applicable). This first amendment to MPEG-DASH v2 comprises accurate time synchronization between server and client for live services as well as a new profile, i.e., the ISOBMFF High Profile, which basically combines the ISOBMFF Live and ISOBMFF On-demand profiles and adds the Xlink feature.
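Why does time synchronization matter for live services? A client derives which segment is the newest one available from the wall-clock time, so a drifting client clock leads to requests for segments that do not exist yet (or to unnecessary latency). The following is a minimal sketch under simplifying assumptions: fixed-duration segments, a known availability start time, and a clock offset that has already been estimated somehow. The function and parameter names are hypothetical and not part of the amendment.

```python
from datetime import datetime, timedelta, timezone


def latest_available_segment(availability_start, segment_duration_s,
                             client_now, clock_offset_s=0.0):
    """Return the 0-based index of the newest complete segment, or None."""
    # Correct the local clock by the estimated offset to the server clock.
    server_now = client_now + timedelta(seconds=clock_offset_s)
    elapsed = (server_now - availability_start).total_seconds()
    # A segment is assumed to be available only once it is fully produced.
    complete = int(elapsed // segment_duration_s)
    return complete - 1 if complete >= 1 else None


start = datetime(2014, 1, 17, 12, 0, 0, tzinfo=timezone.utc)
now = start + timedelta(seconds=21)
print(latest_available_segment(start, 4.0, now))        # 4: local clock trusted
print(latest_available_segment(start, 4.0, now, -2.0))  # 3: local clock is 2 s fast
```

In this toy example a client whose clock runs two seconds fast would request segment 4 before the server has finished it, which is exactly the kind of error a common time reference avoids.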

Additionally, a second amendment to MPEG-DASH v2 has been started featuring Spatial Relationship Description (SRD) and DASH Client Authentication and Content Access Authorization (DAA).

Other DASH-related aspects include the following:

  • The common encryption for ISOBMFF has been extended with a simple pattern-based encryption mode, i.e., a new method which should simplify content encryption.
  • The CD has been approved for the carriage of timed metadata metrics of media in ISOBMFF. This allows for the signaling of quality metrics within the segments enabling QoE-aware DASH clients.
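To illustrate what a pattern-based mode could look like, here is a rough sketch that computes which byte ranges of a payload would be encrypted when a repeating pattern of "encrypt some blocks, skip some blocks" is applied. The 16-byte block size, the pattern values, and the choice to leave a trailing partial block in the clear are assumptions for illustration only; they are not taken from the PDAM text, and no actual cipher is applied.

```python
def protected_ranges(payload_len, crypt_blocks, skip_blocks, block_size=16):
    """Return merged (start, end) byte ranges covered by a repeating pattern.

    Hypothetical helper: within each period, the first `crypt_blocks` blocks
    are marked for encryption and the next `skip_blocks` blocks are left clear.
    """
    if crypt_blocks <= 0 or skip_blocks < 0 or block_size <= 0:
        raise ValueError("invalid pattern")
    period = crypt_blocks + skip_blocks
    ranges = []
    # Only whole blocks are considered; a trailing partial block stays clear.
    for block in range(payload_len // block_size):
        if block % period >= crypt_blocks:
            continue
        start = block * block_size
        end = start + block_size
        if ranges and ranges[-1][1] == start:
            ranges[-1] = (ranges[-1][0], end)  # extend the previous range
        else:
            ranges.append((start, end))
    return ranges


# Encrypt 1 block, skip 2 blocks, over a 100-byte payload.
print(protected_ranges(100, 1, 2))  # [(0, 16), (48, 64)]
```

The appeal of such a scheme is that only a fraction of the data has to pass through the cipher while the content remains unusable without the key.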

What else? That is, some publicly available MPEG output documents… (Dates indicate availability and end of editing period, if applicable, using the following format YY/MM/DD):

  • Report of 3D-AVC Subjective Quality Assessment (14/02/28)
  • Working Draft 3 of Video Coding for Browsers (14/01/31)
  • Common Test Conditions for Proposals on VCB Enhancements (14/01/17)
  • Study Text of ISO/IEC CD 15938-13 Compact Descriptors for Visual Search (14/02/14)
  • WD 4.0 of ARAF 2nd Edition (14/02/07)
  • Text of ISO/IEC 23001-7 PDAM 1 Simple pattern-based encryption mode (14/01/31)
  • Text of ISO/IEC CD 23001-10 Carriage of Timed Metadata Metrics of Media in the ISO Base Media File Format (14/01/31)
  • Text of ISO/IEC CD 23001-11 Green Metadata (14/01/24)
  • Preliminary Draft of ISO/IEC 23008-2:2013/FDAM1 HEVC Range Extensions (14/02/28)
  • Text of ISO/IEC 23008-2:2013/DAM3 HEVC Scalable Extensions (14/01/31)
  • Preliminary Draft of ISO/IEC 23008-2:2013/FDAM2 HEVC Multiview Extensions (14/02/28)
  • Text of ISO/IEC 23008-2:2013/PDAM4 3D Extensions (14/03/14)
  • Text of ISO/IEC CD 23008-12 Image File Format (14/01/17)
  • Text of ISO/IEC 23009-1:201x DCOR 1 (14/01/24)
  • Text of ISO/IEC 23009-1:201x PDAM 1 High Profile and Availability Time Synchronization (14/01/24)
  • WD of ISO/IEC 23009-1 AMD 2 (14/01/31)
  • Requirements for an extension of HEVC for coding of screen content (14/01/17)
  • Joint Call for Proposals for coding of screen content (14/01/22)
  • Draft requirements for Higher Dynamic Range (HDR) and Wide Color Gamut (WCG) video coding for Broadcasting, OTT, and Storage Media (14/01/17)
  • Working Draft 1 of Internet Video Coding (IVC) (14/01/31)

MPEG Column: 106th MPEG Meeting

Original posts here and here by Multimedia Communication blog and bitmovin techblog. Christian Timmerer, AAU/bitmovin

National Day Present by Austrian Airlines on my way to Geneva.

November, 2013, Geneva, Switzerland. Here comes a news report from the 106th MPEG meeting in Geneva, Switzerland, which actually took place during the Austrian national day, but Austrian Airlines had a nice present (see picture) for their guests.

The official press release can be found here.

At this meeting, ISO/IEC 23008-1 (i.e., MPEG-H Part 1) MPEG Media Transport (MMT) reached Final Draft International Standard (FDIS). Looking back at when this project was started with the aim to supersede the widely adopted MPEG-2 Transport Stream (M2TS), which receives the Technology & Engineering Emmy® Award in Jan’14, and comparing it with what we have now, the following features are supported within MMT:

  • Self-contained multiplexing structure
  • Strict timing model
  • Reference buffer model
  • Flexible splicing of content
  • Name based access of data
  • AL-FEC (application layer forward error correction)
  • Multiple Qualities of Service within one packet flow

ITU-T Tower Building, Geneva.

Interestingly, MMT supports the carriage of MPEG-DASH segments and MPD for uni-directional environments such as broadcasting.

MPEG-H now comprises three major technologies: part 1 is about transport (MMT; at FDIS stage), part 2 deals with video coding (HEVC; at FDIS stage), and part 3 will be about audio coding, specifically 3D audio coding (which is still in its infancy; technical responses have been evaluated only recently). Other parts of MPEG-H are currently related to these three parts.

In terms of research, it is important to determine the efficiency, overhead, and — in general — the use cases enabled by MMT. From a business point of view, it will be interesting to see whether MMT will actually supersede M2TS and how it will evolve compared or in relation to DASH.

On another topic, MPEG-7 visual reached an important milestone at this meeting. The Committee Draft (CD) for Part 13 (ISO/IEC 15938-13) has been approved and is entitled Compact Descriptors for Visual Search (CDVS). This image description enables comparing and finding pictures that include similar content, e.g., when showing the same object from different viewpoints. CDVS mainly deals with images, but MPEG also started work on compact descriptors for video search.

The CDVS standard truly helps to reduce the semantic gap. However, research in this domain is already well developed and it is unclear whether the research community will adopt CDVS, specifically because the interest in MPEG-7 descriptors has decreased lately. On the other hand, such a standard will enable interoperability among vendors and services (e.g., Google Goggles), reducing the number of proprietary formats and, hopefully, APIs. Ultimately, the most important question is whether CDVS will be adopted by the industry (and research).

Finally, what about MPEG-DASH?

The 2nd edition of part 1 (MPD and segment formats) and the 1st edition of part 2 (conformance and reference software) have been finalized at the 105th MPEG meeting (FDIS). Additionally, we had a public/open workshop at that meeting which was about session management and control for DASH. This and other new topics are further developed within so-called core experiments for which I’d like to give a brief overview:

  • Server and Network assisted DASH Operation (SAND), which is the immediate result of the workshop at the 105th MPEG meeting and introduces a DASH-Aware Media Element (DANE) as depicted in the figure below. Parameters from this element (as well as others) may support the DASH client within its operations, i.e., downloading the “best” segments for its context. SAND parameters typically come from the network itself whereas Parameters for enhancing delivery by DANE (PED) come from the content author.

Baseline Architecture for Server and Network assisted DASH.

  • Spatial Relationship Description is about delivering (tiled) ultra-high-resolution content towards heterogeneous clients while at the same time providing interactivity (e.g., zooming). Thus, not only the temporal but also spatial relationship of representations needs to be described.
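To give an idea of why the spatial relationship needs to be described, consider a client that zooms into a region of a tiled video and only wants to download the tiles it actually needs. The sketch below assumes that each tile is a separate representation with a known position and size on a common pixel grid; the `Tile` class and its field names are hypothetical and do not reflect the syntax under discussion in the core experiment.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Hypothetical description of one tiled representation."""
    rep_id: str
    x: int
    y: int
    width: int
    height: int


def tiles_for_region(tiles, rx, ry, rw, rh):
    """Return the ids of all tiles that overlap the region of interest."""
    selected = []
    for t in tiles:
        overlaps_x = t.x < rx + rw and rx < t.x + t.width
        overlaps_y = t.y < ry + rh and ry < t.y + t.height
        if overlaps_x and overlaps_y:
            selected.append(t.rep_id)
    return selected


# A 2x2 tiling of a 3840x2160 picture.
grid = [Tile(f"tile-{r}-{c}", c * 1920, r * 1080, 1920, 1080)
        for r in range(2) for c in range(2)]
# The region straddles the two upper tiles.
print(tiles_for_region(grid, 1800, 200, 400, 300))  # ['tile-0-0', 'tile-0-1']
```

Without the positions in the description, the client would have no way to know which representations to request for a given zoom region.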

Other CEs are related to signaling intended source and display characteristics, controlling the DASH client behavior, and DASH client authentication and content access authorization.

The outcome of these CEs is potentially interesting for future amendments. One CE, which was about including quality information within DASH, e.g., as part of an additional track within ISOBMFF and an additional representation within the MPD, was closed at this meeting. Clients may access this quality information in advance to assist the adaptation logic in order to make informed decisions about which segment to download next.
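How could an adaptation logic use such quality information? One simple option is to stop choosing the highest affordable bitrate and instead choose the cheapest representation that still meets a quality target for the next segment. The sketch below assumes the client already knows a per-segment quality value (e.g., something PSNR-like) for each representation; the tuple layout, the numbers, and the policy itself are my assumptions for illustration, not what the CE specifies.

```python
def pick_representation(representations, throughput_bps, target_quality):
    """Choose a representation id for the next segment.

    `representations` is a list of (rep_id, bitrate_bps, quality) tuples,
    where `quality` is a hypothetical per-segment quality value.
    """
    affordable = [r for r in representations if r[1] <= throughput_bps]
    if not affordable:
        # Nothing fits the measured throughput: take the lowest bitrate.
        return min(representations, key=lambda r: r[1])[0]
    good_enough = [r for r in affordable if r[2] >= target_quality]
    if good_enough:
        # Cheapest representation that still meets the quality target.
        return min(good_enough, key=lambda r: r[1])[0]
    # Target not reachable: take the best quality we can afford.
    return max(affordable, key=lambda r: r[2])[0]


reps = [("low", 1_000_000, 32.0), ("mid", 2_500_000, 38.5), ("high", 5_000_000, 41.0)]
print(pick_representation(reps, 6_000_000, 38.0))  # mid
print(pick_representation(reps, 2_000_000, 38.0))  # low
```

In the first call a purely throughput-driven client would have picked "high", whereas the quality-aware one saves bandwidth because "mid" already meets the target.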

Interested people may join the MPEG-DASH Ad-hoc Group (AhG; http://lists.uni-klu.ac.at/mailman/listinfo/dash) where these topics (and others) are discussed.

Finally, additional information/outcome from the last meeting is accessible via http://mpeg.chiariglione.org/meetings/106 including documents publicly available (some may have an editing period).