MPEG Column: 111th MPEG Meeting

Christian Timmerer

The 111th MPEG meeting (note: link includes press release and all publicly available output documents) was held in Geneva, Switzerland, and brought up some interesting aspects which I’d like to highlight here. Undoubtedly, it was the shortest meeting I’ve ever attended (and my first meeting was #61), as the final plenary concluded at 2015/02/20T18:18!

MPEG111 opening plenary

In terms of the requirements (subgroup), it’s worth mentioning the call for evidence (CfE) for high-dynamic range (HDR) and wide color gamut (WCG) video coding, which comprises a first milestone towards a new video coding format. The purpose of this CfE is to explore whether or not (a) the coding efficiency and/or (b) the functionality of the HEVC Main 10 and Scalable Main 10 profiles can be significantly improved for HDR and WCG content. In addition, the requirements subgroup issued a draft call for evidence on free viewpoint TV. Both documents are publicly available here.

The video subgroup continued discussions related to the future of video coding standardisation and issued a public document requesting contributions on “future video compression technology”. Interesting application requirements come from over-the-top (OTT) streaming use cases, which request HDR and WCG, as well as video over cellular networks. Well, at least the former is something to be covered by the CfE mentioned above. Furthermore, features like scalability and perceptual quality are something that should be considered from the ground up and not (only) as an extension. Yes, scalability is something that really helps a lot in OTT streaming: it eases content management, enables cache-efficient delivery, and allows for more aggressive buffer modelling and, thus, adaptation logic within the client, enabling better Quality of Experience (QoE) for the end user. It seems like complexity (at the encoder) is not so much of a concern as long as it scales with cloud deployments (e.g., the bitdash demo area shows some neat 4K/8K/HFR DASH demos which have been encoded with bitcodin). Closely related to 8K, there’s a new AVC amendment coming up covering 8K; one can do it already today (see before), but it’s good to have standards support for this. For HEVC, the JCT-3D/VC issued the FDAM4 for 3D Video Extensions and started with PDAM5 for Screen Content Coding Extensions (both documents being publicly available after an editing period of about a month).
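To make the point about scalability and client-side adaptation a bit more concrete, here is a minimal sketch of what layer selection could look like in a client. It is not taken from any MPEG document: the `Layer` type, the `select_layers` function, and the 5-second low-buffer threshold are all hypothetical and only illustrate the idea that the base layer is always fetched and enhancement layers are added when throughput and buffer allow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Layer:
    """One scalable layer (hypothetical model, not a standardized structure)."""
    name: str
    bitrate_kbps: int  # bitrate this layer adds on top of the layers below it


def select_layers(layers, throughput_kbps, buffer_s, low_buffer_s=5.0):
    """Pick the layers to request for the next segment.

    layers[0] is the base layer and is always requested. Enhancement layers
    are added in order while the cumulative bitrate fits the measured
    throughput. If the buffer is below low_buffer_s (an assumed threshold),
    only the base layer is requested to refill the buffer quickly.
    """
    if not layers:
        raise ValueError("at least a base layer is required")
    chosen = [layers[0]]
    if buffer_s < low_buffer_s:
        return chosen
    total = layers[0].bitrate_kbps
    for layer in layers[1:]:
        if total + layer.bitrate_kbps > throughput_kbps:
            break
        total += layer.bitrate_kbps
        chosen.append(layer)
    return chosen


if __name__ == "__main__":
    ladder = [Layer("base", 1000), Layer("enh1", 2000), Layer("enh2", 4000)]
    picked = select_layers(ladder, throughput_kbps=3500, buffer_s=10.0)
    print([layer.name for layer in picked])
```

Because the base layer is shared by all quality levels, a wrong guess only costs the enhancement data, which is one reason a client can afford to be more aggressive than with independent single-layer representations.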

And what about audio? The audio subgroup has decided that ISO/IEC DIS 23008-3 3D Audio shall be promoted directly to IS, which means that the DIS was already in such a good state that only editorial comments are applied, which actually saves a balloting cycle. We have to congratulate the audio subgroup for this remarkable milestone.

Finally, I’d like to discuss a few topics related to DASH which is progressing towards its 3rd edition which will incorporate amendment 2 (Spatial Relationship Description, Generalized URL parameters and other extensions), amendment 3 (Authentication, Access Control and multiple MPDs), and everything else that will be incorporated within this year, like some aspects documented in the technologies under consideration or currently being discussed within the core experiments (CE). Currently, MPEG-DASH conducts 5 core experiments:

  • Server and Network Assisted DASH (SAND)
  • DASH over Full Duplex HTTP-based Protocols (FDH)
  • URI Signing for DASH (CE-USD)
  • SAP-Independent Segment Signaling (SISSI)
  • Content aggregation and playback control (CAPCO)

The description of core experiments is publicly available and, compared to the previous meeting, we have a new CE which is about content aggregation and playback control (CAPCO) which “explores solutions for aggregation of DASH content from multiple live and on-demand origin servers, addressing applications such as creating customized on-demand and live programs/channels from multiple origin servers per client, targeted preroll ad insertion in live programs and also limiting playback by client such as no-skip or no fast forward.” This process is quite open and anybody can join by subscribing to the email reflector.

The CE for DASH over Full Duplex HTTP-based Protocols (FDH) is gaining importance and basically defines the usage of DASH for push-features of WebSockets and HTTP/2. At this meeting, MPEG issued a working draft for it. The CE on Server and Network Assisted DASH (SAND) also got its own part 5, where it goes to CD, but the documents are not publicly available. However, I’m pretty sure I can report more on this next time, so stay tuned or feel free to comment here.

MPEG Column: 110th MPEG Meeting

Christian Timmerer

The 110th MPEG meeting was held at the Strasbourg Convention and Conference Centre featuring the following highlights:

  • The future of video coding standardization
  • Workshop on media synchronization
  • Standards at FDIS: Green Metadata and CDVS
  • What’s happening in MPEG-DASH?

Additional details about MPEG’s 110th meeting can also be found here, including the official press release and all publicly available documents.

The Future of Video Coding Standardization

MPEG110 hosted a panel discussion about the future of video coding standardization. The panel was organized jointly by MPEG and ITU-T SG 16’s VCEG featuring Roger Bolton (Ericsson), Harald Alvestrand (Google), Zhong Luo (Huawei), Anne Aaron (Netflix), Stéphane Pateux (Orange), Paul Torres (Qualcomm), and JeongHoon Park (Samsung).

As expected, “maximizing compression efficiency remains a fundamental need” and as usual, MPEG will study “future application requirements, and the availability of technology developments to fulfill these requirements”. Therefore, two Ad-hoc Groups (AhGs) have been established which are open to the public:

The presentations of the brainstorming session on the future of video coding standardization can be found here.

Workshop on Media Synchronization

MPEG110 also hosted a workshop on media synchronization for hybrid delivery (broadband-broadcast) featuring six presentations “to better understand the current state-of-the-art for media synchronization and identify further needs of the industry”.

  • An overview of MPEG systems technologies providing advanced media synchronization, Youngkwon Lim, Samsung
  • Hybrid Broadcast – Overview of DVB TM-Companion Screens and Streams specification, Oskar van Deventer, TNO
  • Hybrid Broadcast-Broadband distribution for new video services: a use cases perspective, Raoul Monnier, Thomson Video Networks
  • HEVC and Layered HEVC for UHD deployments, Ye Kui Wang, Qualcomm
  • A fingerprinting-based audio synchronization technology, Masayuki Nishiguchi, Sony Corporation
  • Media Orchestration from Capture to Consumption, Rob Koenen, TNO

The presentation material is available here. Additionally, MPEG established an AhG on timeline alignment (that’s what the project is called internally) to study use cases and solicit contributions on gap analysis as well as technical contributions [email][subscription].

Standards at FDIS: Green Metadata and CDVS

My first report on MPEG Compact Descriptors for Visual Search (CDVS) dates back to July 2011 and provides details about the call for proposals. Now, finally, the FDIS has been approved during the 110th MPEG meeting. CDVS defines a compact image description that facilitates the comparison and search of pictures that include similar content, e.g. when showing the same objects in different scenes from different viewpoints. Compared to a raw representation of the same underlying features, the compression of key point descriptors not only increases compactness but also significantly speeds up the search and classification of images within large image databases. Application of CDVS for real-time object identification, e.g. in computer vision and other applications, is envisaged as well.

Another standard, entitled Green Metadata (first reported in August 2012), reached FDIS status. This standard specifies the format of metadata that can be used to reduce energy consumption from the encoding, decoding, and presentation of media content, while simultaneously controlling or avoiding degradation in the Quality of Experience (QoE). Moreover, the metadata specified in this standard can facilitate a trade-off between energy consumption and QoE. MPEG is also working on amendments to the ubiquitous MPEG-2 TS ISO/IEC 13818-1 and ISOBMFF ISO/IEC 14496-12 so that green metadata can be delivered by these formats.

What’s happening in MPEG-DASH?

MPEG-DASH is in a kind of maintenance mode but is still receiving new proposals in the area of SAND parameters, and some core experiments are going on. Also, the DASH-IF is working towards new interoperability points and test vectors in preparation for actual deployments. Speaking of deployments, they are happening, e.g., a 40h live stream right before Christmas (by bitmovin, a top-100 company that matters most in online video). Additionally, VideoNext was co-located with CoNEXT’14, targeting scientific presentations about the design, quality and deployment of adaptive video streaming. Webex recordings of the talks are available here. In terms of standardization, MPEG-DASH is progressing towards the 2nd amendment including spatial relationship description (SRD), generalized URL parameters and other extensions. In particular, SRD will enable new use cases which can only be addressed using MPEG-DASH, and the FDIS is scheduled for the next meeting, which will be in Geneva, Feb 16-20, 2015. I’ll report on this within my next blog post, stay tuned.

MPEG Column: Press release for the 109th MPEG meeting

MPEG collaborates with SC24 experts to develop committee draft of MAR reference model

SC 29/WG 11 (MPEG) is pleased to announce that the Mixed and Augmented Reality Reference Model (MAR RM), developed jointly and in close collaboration with SC 24/WG 9, has reached Committee Draft status at the 109th WG 11 meeting. The MAR RM defines not only the main concepts and terms of MAR, but also its application domain and an overall system architecture that can be applied to all MAR systems, regardless of the particular algorithms, implementation methods, computational platforms, display systems, and sensors/devices used. The MAR RM can therefore be used as a consultation source to aid in the development of MAR applications or services, business models, or new (or extensions to existing) standards. It identifies representative system classes and use cases with respect to the defined architecture, but does not specify technologies for the encoding of MAR information, or interchange formats.

2nd edition of HEVC includes scalable and multi-view video coding

At the 109th MPEG meeting, the standard development work was completed for two important extensions to the High Efficiency Video Coding standard (ISO/IEC 23008-2, also standardized by ITU-T as Rec. H.265).
The first of these are the scalability extensions of HEVC, known as SHVC, adding support for embedded bitstream scalability in which different levels of encoding quality are efficiently supported by adding or removing layered subsets of encoded data. The other are the multiview extensions of HEVC, known as MV-HEVC providing efficient representation of video content with multiple camera views and optional depth map information, such as for 3D stereoscopic and autostereoscopic video applications. MV-HEVC is the 3D video extension of HEVC, and further work for more efficient coding of 3D video is ongoing.
SHVC and MV-HEVC will be combined with the original content of the HEVC standard and also the recently-completed format range extensions (known as RExt), so that a new edition of the standard will be published that contains all extensions approved up to this time.

In addition, the finalization of reference software and a conformance test set for HEVC was completed at the 109th meeting, as ISO/IEC 23008-5 and ISO/IEC 23008-8, respectively. These important standards will greatly help industry achieve effective interoperability between products using HEVC and provide valuable information to ease the development of such products.
In consideration of the recent dramatic developments in video coding technology, including the completion of the development of the HEVC standard and several major extensions, MPEG plans to host a brainstorming event during its 110th meeting which will be open to the public. The event will be co-hosted by MPEG’s frequent collaboration partner in video coding standardization work, the Video Coding Experts Group (VCEG) of ITU-T Study Group 16. More information on how to register for the event will be available at

MPEG-H 3D Audio extended to lower bit rates

At its 109th meeting, MPEG selected technology for Version II of the MPEG-H 3D Audio standard (ISO/IEC 23008-3) based on responses submitted to the Call for Proposals issued in January 2013. This follows from the selection of Version I technology, which was chosen at the 105th meeting, in August 2013. While Version I technology was evaluated for bitrates between 1.2 Mb/s and 256 kb/s, Version II technology is focused on bitrates between 128 kb/s and 48 kb/s.
The selected technology supports content in multiple formats: channel-based, channels and objects (C+O), and scene-based Higher Order Ambisonics (HOA). A total of six submissions were reviewed: three for coding C+O content and three for coding HOA content.
The selected technologies for Version II were shown to be within the framework of the unified Version I technology.
The submissions were evaluated using a comprehensive set of subjective listening tests in which the resulting statistical analysis guided the selection process. At the highest bitrate of 128 kb/s for the coding of a signal supporting a 22.2 loudspeaker configuration, both of the selected technologies had performance of “Good” on the MUSHRA subjective quality scale. It is expected that the C+O and HOA Version II technologies will be merged into a unified architecture.
This MPEG-H 3D Audio Version II is expected to reach Draft International Standard by June 2015.

The 109th meeting also saw the technical completion of Version I of the MPEG-H 3D Audio standard, which is expected to become an International Standard by February 2015.

Public seminar for media synchronization planned for 110th MPEG meeting in October

A public seminar on Media Synchronization for Hybrid Delivery will be held on the 22nd of October 2014 during the 110th MPEG meeting in Strasbourg. The purpose of this seminar is to introduce MPEG’s activity on media stream synchronization for heterogeneous delivery environments, including hybrid environments employing both broadcast and broadband networks, with existing MPEG systems technologies such as MPEG-2 TS, DASH, and MMT. The seminar will also strive to ensure alignment of MPEG’s present and future projects with user and industry use-case needs. Main topics covered by the seminar presentations include:

  • Hybrid Broadcast – Broadband distribution for UHD deployments and 2nd screen content
  • Inter Destination Media Synchronization
  • MPEG Standardization efforts on Time Line Alignment of media contents
  • Audio Fingerprint based Synchronization

You are invited to join the seminar to learn more about MPEG activities in this area and to work with us to further develop technologies and standards supporting new applications of rich and heterogeneous media delivery.
The seminar is open to the public and registration is free of charge.

First MMT Developers’ Day held at MPEG 109, second planned for MPEG 110

Following the recent finalization of the MPEG Media Transport standard (ISO/IEC 23008-1), MPEG hosted an MMT Developers’ Day to better understand the rate of MMT adoption and to provide a channel for MPEG to receive comments from industry about the standard. During the event, four oral presentations were given: “Multimedia transportation technology and status in China”, “MMT delivery considering bandwidth utilization”, “Fast channel change/ Targeted Advertisement insertion over hybrid media delivery”, and “MPU Generator.” In addition, seven demonstrations were presented: Reliable 4K HEVC Realtime Transmission by using MMT-FEC; MMT Analyzer; Applications of MMT content through Broadcast, Storage, and Network Delivery; Media Delivery Optimization with the MMT Cache Middle Box; MMT-based Transport Technology for Advanced Services in Super Hi-Vision; target ad insertion and multi-view content composition in broadcasting system with MMT; and QoS management for Media Delivery. MPEG is planning to host a 2nd MMT Developers’ Day during the 110th meeting on Wednesday, Oct 22nd.

Seminar at MPEG 109 introduces MPEG’s activity for Free Viewpoint Television

A seminar for FTV (Free Viewpoint Television) was held during the 109th MPEG meeting in Sapporo. FTV is an emerging visual media technology that will revolutionize the viewing of 3D scenes to facilitate a more immersive experience by allowing users to freely navigate the view of a 3D scene as if they were actually there. The purpose of the seminar was to introduce MPEG’s activity on FTV to interested parties and to align future MPEG standardization of FTV technologies with user and industry needs.

Digging Deeper – How to Contact MPEG

Communicating the large and sometimes complex array of technology that the MPEG Committee has developed is not a simple task. Experts, past and present, have contributed a series of tutorials and vision documents that explain each of these standards individually. The repository is growing with each meeting, so if something you are interested in is not yet there, it may appear shortly – but you should also not hesitate to request it. You can start your MPEG adventure at

Further Information

Future MPEG meetings are planned as follows:

  • No. 110, Strasbourg, FR, 20 – 24 October 2014
  • No. 111, Geneva, CH, 16 – 20 February 2015
  • No. 112, Warsaw, PL, 22 – 26 June 2015

The MPEG homepage also has links to other MPEG pages that are maintained by the MPEG subgroups. It also contains links to public documents that are freely available for download by those who are not MPEG members. Journalists that wish to receive MPEG Press Releases by email s

MPEG Column: 108th MPEG Meeting

Christian Timmerer

The 108th MPEG meeting was held at the Palacio de Congresos de Valencia in Spain featuring the following highlights (no worries about the acronyms, this is on purpose and they will be further explained below):

  • Requirements: PSAF, SCC, CDVA
  • Systems: M2TS, MPAF, Green Metadata
  • Video: CDVS, WVC, VCB
  • JCT-3D: MV/3D-HEVC, 3D-AVC
  • Audio: 3D audio

Opening Plenary of the 108th MPEG meeting in Valencia, Spain.

The official MPEG press release can be downloaded from the MPEG Web site. Some of the above highlighted topics will be detailed in the following and, of course, there’s an update on DASH-related matters at the end.

As indicated above, MPEG is full of (new) acronyms and in order to become familiar with those, I’ve put them deliberately in the overview but I will explain them further below.

PSAF – Publish/Subscribe Application Format

Publish/subscribe corresponds to a new network paradigm related to content-centric networking (or information-centric networking) where the content is addressed by its name rather than location. An application format within MPEG typically defines a combination of existing MPEG tools jointly addressing the needs for a given application domain, in this case, the publish/subscribe paradigm. The current requirements and a preliminary working draft are publicly available.
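As a rough illustration of the paradigm (and not of PSAF itself, whose working draft is not reproduced here), the following sketch shows content being addressed by name rather than by location. The `Broker` class and the content name `/mpeg/108/press-release` are made up for this example; subscribers only know the name, never which host holds the data.

```python
class Broker:
    """A toy in-memory publish/subscribe broker (hypothetical, for illustration)."""

    def __init__(self):
        # Maps a content name to the callbacks interested in that name.
        self._subscribers = {}

    def subscribe(self, name, callback):
        """Register interest in a content name; no server address is involved."""
        self._subscribers.setdefault(name, []).append(callback)

    def publish(self, name, content):
        """Deliver content to everyone subscribed to the name.

        Returns the number of subscribers that were notified.
        """
        callbacks = list(self._subscribers.get(name, []))
        for callback in callbacks:
            callback(name, content)
        return len(callbacks)


if __name__ == "__main__":
    broker = Broker()
    broker.subscribe("/mpeg/108/press-release",
                     lambda name, content: print(f"{name}: {content}"))
    broker.publish("/mpeg/108/press-release", "PSAF working draft available")
```

The publisher and the subscriber never exchange addresses; the name is the only thing they share, which is the property content-centric networking builds on.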

SCC – Screen Content Coding

I introduced this topic in my previous report, and at this meeting the responses to the CfP were evaluated. In total, seven responses were received which meet all requirements and, thus, the actual standardization work is transferred to JCT-VC. Interestingly, the results of the CfP are publicly available. Within JCT-VC, a first test model has been defined and core experiments have been established. I will report more on this as an output of the next meetings…

CDVA – Compact Descriptors for Video Analysis

This project has been renamed from compact descriptors for video search to compact descriptors for video analysis and comprises a publicly available vision statement. That is, interested parties are welcome to join this new activity within MPEG.

M2TS – MPEG-2 Transport Stream

At this meeting, various extensions to M2TS have been defined such as transport of multi-view video coding depth information and extensions to HEVC, delivery of timeline for external data as well as carriage of layered HEVC, green metadata, and 3D audio. Hence, M2TS is still very active and multiple amendments are developed in parallel.

MPAF – Multimedia Preservation Application Format

The committee draft for MPAF has been approved and, in this context, MPEG-7 is extended with additional description schemes.

Green Metadata

Well, this standard does not have its own acronym; it’s simply referred to as MPEG-GREEN. The draft international standard has been approved and national bodies will vote on it at the JTC 1 level. It basically defines metadata that allows clients to operate in an energy-efficient way. It comes along with amendments to M2TS and ISOBMFF that enable the carriage and storage of this metadata.

CDVS – Compact Descriptors for Visual Search

CDVS is at DIS stage and provides improvements on global descriptors as well as non-normative improvements of key-point detection and matching in terms of speedup and memory consumption. As with all standards at DIS stage, national bodies will vote on it at the JTC 1 level.

What’s new in the video/audio-coding domain?

  • WVC – Web Video Coding: This project reached final draft international standard with the goal to provide a video-coding standard for Web applications. It basically defines a profile of the MPEG-AVC standard including those tools not encumbered by patents.
  • VCB – Video Coding for Browsers: The committee draft for part 31 of MPEG-4 defines video coding for browsers and basically defines VP8 as an international standard. This also explains the difference to WVC.
  • SHVC – Scalable HEVC extensions: As with SVC for AVC, SHVC will be defined as an amendment to HEVC, providing the same scalable video coding functionality as SVC.
  • MV/3D-HEVC, 3D-AVC: These are multi-view and 3D extensions for the HEVC and AVC standards respectively.
  • 3D Audio: Also, no acronym for this standard although I would prefer 3DA. However, CD has been approved at this meeting and the plan is to have DIS at the next meeting. At the same time, the carriage and storage of 3DA is being defined in M2TS and ISOBMFF respectively.

Finally, what’s new in the media transport area, specifically DASH and MMT?

As interested readers know from my previous reports, DASH 2nd edition was approved some time ago. In the meantime, a first amendment to the 2nd edition is at draft amendment state, including additional profiles (mainly adding xlink support) and time synchronization. A second amendment goes to the first ballot stage, referred to as proposed draft amendment, and defines spatial relationship description, generalized URL parameters, and other extensions. Eventually, these two amendments will be integrated in the 2nd edition, which will become the MPEG-DASH 3rd edition. Also, a corrigendum on the 2nd edition is currently under ballot and new contributions are still coming in, i.e., there is still a lot of interest in DASH. For your information – there will be two DASH-related sessions at Streaming Forum 2014.

On the other hand, MMT’s amendment 1 is currently under ballot and amendment 2 defines header compression and cross-layer interface. The latter has been progressed to a study document which will be further discussed at the next meeting. Interestingly, there will be an MMT developer’s day at the 109th MPEG meeting since, in Japan, 4K/8K UHDTV services will be launched based on MMT specifications and, in Korea and China, implementation of MMT is now under way. The developer’s day will be on July 5th (Saturday), 2014, 10:00 – 17:00 at the Sapporo Convention Center. Therefore, if you don’t know anything about MMT, the developer’s day is certainly a place to be.


Dr. Christian Timmerer
CIO bitmovin GmbH
Alpen-Adria-Universität Klagenfurt

MPEG Column: 107th MPEG Meeting

Christian Timmerer

The MPEG-2 Transport Stream (M2TS; formally known as Rec. ITU-T H.222.0 | ISO/IEC 13818-1) has been awarded the Technology & Engineering Emmy® Award by the National Academy of Television Arts & Sciences. It is the fourth time MPEG has received an Emmy award. The M2TS is widely deployed across a broad range of application domains such as broadcast, cable TV, Internet TV (IPTV and OTT), and Blu-ray Discs. The Emmy was received during this year’s CES2014 in Las Vegas.

Plenary during the 107th MPEG Meeting.

Other topics of the 107th MPEG meeting in San Jose include the following highlights:

  • Requirements: Call for Proposals on Screen Content jointly with ITU-T’s Video Coding Experts Group (VCEG)
  • Systems: Committee Draft for Green Metadata
  • Video: Study Text Committee Draft for Compact Descriptors for Visual Search (CDVS)
  • JCT-VC: Draft Amendment for HEVC Scalable Extensions (SHVC)
  • JCT-3D: Proposed Draft Amendment for HEVC 3D Extensions (3D-HEVC)
  • Audio: 3D audio plans to progress to CD at 108th meeting
  • 3D Graphics: Working Draft 4.0 of Augmented Reality Application Format (ARAF) 2nd Edition

The official MPEG press release can be downloaded from the MPEG Web site. Some of the above highlighted topics will be detailed in the following and, of course, there’s an update on DASH-related matters at the end.

Call for Proposals on Screen Content

Screen content refers to content coming not from cameras but from screen/desktop sharing and collaboration, cloud computing and gaming, wirelessly connected displays, control rooms with high resolution display walls, virtual desktop infrastructures, tablets as secondary displays, PC over IP, ultra-thin client technology, etc. Also mixed-content is within the scope of this work item and may contain a mixture of camera-captured video and images with rendered computer-generated graphics, text, animation, etc.

Although this type of content was considered during the course of the HEVC standardization, recent studies in MPEG have led to the conclusion that significant further improvements in coding efficiency can be obtained by exploiting the characteristics of screen content and, thus, a Call for Proposals (CfP) is being issued for developing possible future extensions of the HEVC standard.

Companies and organizations are invited to submit proposals in response to this call, which is issued jointly by MPEG and ITU-T VCEG. Responses are expected to be submitted by early March, and will be evaluated during the 108th MPEG meeting. The timeline is as follows:

  • 2014/01/17: Final Call for Proposals
  • 2014/01/22: Availability of anchors and end of editing period for Final CfP
  • 2014/02/10: Mandatory registration deadline
    One of the contact persons (see Section 10) must be notified, and an invoice for the testing fee will be sent after registration. Additional logistic information will also be sent to proponents by this date.
  • 2014/03/05: Coded test material shall be available at the test site. By this date, the payment of the testing fee is expected to be finalized.
  • 2014/03/17: Submission of all documents and requested data associated with the proposal.
  • 2014/03/27-04/04: Evaluation of proposals at standardization meeting.
  • 2015: Final draft standard expected.

It will be interesting to see the coding efficiency of the submitted proposals compared to a pure HEVC or even AVC approach.

DEC PDP-8 at Computer History Museum during MPEG Social Event.

Committee Draft for Green Metadata

Green Metadata, formerly known as Green MPEG, shall enable energy-efficient media consumption and reached Committee Draft (CD) status at the 107th MPEG meeting. The representation formats defined within Green Metadata help reduce decoder power consumption and display power consumption. Clients may utilize such information for the adaptive selection of operating voltage or clock frequencies within their chipsets. Additionally, it may be used to set the brightness of the display backlight to reduce power consumption.
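To give an idea of how a client might use such information for clock frequency selection, here is a small sketch. It does not use the actual Green Metadata syntax: the complexity hint (`required_mcycles_per_s`), the list of available frequencies, and the 10% headroom are all assumptions made for illustration only.

```python
def pick_clock_mhz(required_mcycles_per_s, available_mhz, headroom=1.1):
    """Return the lowest clock frequency that can decode the upcoming content.

    required_mcycles_per_s is a hypothetical complexity hint (millions of
    cycles per second) signalled alongside the media. headroom is an assumed
    safety factor. If no frequency is high enough, the highest one is used.
    """
    if not available_mhz:
        raise ValueError("no clock frequencies available")
    needed = required_mcycles_per_s * headroom
    for freq in sorted(available_mhz):
        if freq >= needed:
            return freq
    return max(available_mhz)


if __name__ == "__main__":
    # A low-complexity segment lets the chipset run at a lower clock.
    print(pick_clock_mhz(200, [300, 600, 1200]))
    print(pick_clock_mhz(900, [300, 600, 1200]))
```

Without the hint, a decoder has to guess the upcoming workload and typically runs faster than necessary; with it, the lowest sufficient operating point can be chosen ahead of time.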

Green Metadata also provides metadata for the signaling and selection of DASH representations to enable the reduction of power consumption for their encoding.

The main challenge in terms of adoption of this kind of technology is how to exploit these representation formats to actually achieve energy-efficient media consumption, and how much energy can be saved!

What’s new on the DASH frontier?

The text of ISO/IEC 23009-1 2nd edition PDAM1 has been approved which may be referred to as MPEG-DASH v3 (once finalized and integrated into the second edition, possibly with further amendments and corrigenda, if applicable). This first amendment to MPEG-DASH v2 comprises accurate time synchronization between server and client for live services as well as a new profile, i.e., ISOBMFF High Profile which basically combines the ISOBMFF Live and ISOBMFF On-demand profiles and adds the Xlink feature.

Additionally, a second amendment to MPEG-DASH v2 has been started featuring Spatial Relationship Description (SRD) and DASH Client Authentication and Content Access Authorization (DAA).

Other DASH-related aspects include the following:

  • The common encryption for ISOBMFF has been extended with a simple pattern-based encryption mode, i.e., a new method which should simplify content encryption.
  • The CD has been approved for the carriage of timed metadata metrics of media in ISOBMFF. This allows for the signaling of quality metrics within the segments enabling QoE-aware DASH clients.

MPEG Column: 106th MPEG Meeting

Christian Timmerer

National Day Present by Austrian Airlines on my way to Geneva.

November, 2013, Geneva, Switzerland. Here comes a news report from the 106th MPEG meeting in Geneva, Switzerland, which actually took place during the Austrian national day, but Austrian Airlines had a nice present (see picture) for their guests.

The official press release can be found here.

In this meeting, ISO/IEC 23008-1 (i.e., MPEG-H Part 1) MPEG Media Transport (MMT) reached Final Draft International Standard (FDIS). Looking back when this project was started with the aim to supersede the widely adopted MPEG-2 Transport Stream (M2TS) — which receives the Technology & Engineering Emmy®Award in Jan’14 — and what we have now, the following features are supported within MMT:

  • Self-contained multiplexing structure
  • Strict timing model
  • Reference buffer model
  • Flexible splicing of content
  • Name based access of data
  • AL-FEC (application layer forward error correction)
  • Multiple Qualities of Service within one packet flow

ITU-T Tower Building, Geneva.

Interestingly, MMT supports the carriage of MPEG-DASH segments and MPD for uni-directional environments such as broadcasting.

MPEG-H now comprises three major technologies: part 1 is about transport (MMT; at FDIS stage), part 2 deals with video coding (HEVC; at FDIS stage), and part 3 will be about audio coding, specifically 3D audio coding (but it’s still in its infancy; technical responses have been evaluated only recently). Other parts of MPEG-H are currently related to these three parts.

In terms of research, it is important to determine the efficiency, overhead, and — in general — the use cases enabled by MMT. From a business point of view, it will be interesting to see whether MMT will actually supersede M2TS and how it will evolve compared or in relation to DASH.

On another topic, MPEG-7 visual reached an important milestone at this meeting. The Committee Draft (CD) for Part 13 (ISO/IEC 15938-13), entitled Compact Descriptors for Visual Search (CDVS), has been approved. This image description enables comparing and finding pictures that include similar content, e.g., when showing the same object from different viewpoints. CDVS mainly deals with images but MPEG also started work on compact descriptors for video search.

The CDVS standard truly helps to reduce the semantic gap. However, research in this domain is already well developed and it is unclear whether the research community will adopt CDVS, specifically because the interest in MPEG-7 descriptors has decreased lately. On the other hand, such a standard will enable interoperability among vendors and services (e.g., Google Goggles) reducing the number of proprietary formats and, hopefully, APIs. However, the most important question is whether CDVS will be adopted by the industry (and research).

Finally, what about MPEG-DASH?

The 2nd edition of part 1 (MPD and segment formats) and the 1st edition of part 2 (conformance and reference software) were finalized at the 105th MPEG meeting (FDIS). Additionally, we had a public/open workshop at that meeting which was about session management and control for DASH. This and other new topics are further developed within so-called core experiments (CEs), of which I’d like to give a brief overview:

  • Server and Network assisted DASH Operation (SAND) which is the immediate result of the workshop at the 105th MPEG meeting and introduces a DASH-Aware Media Element (DANE) as depicted in the Figure below. Parameters from this element — as well as others — may support the DASH client within its operations, i.e., downloading the “best” segments for its context. SAND parameters are typically coming from the network itself whereas Parameters for enhancing delivery by DANE (PED) are coming from the content author.

Baseline Architecture for Server and Network assisted DASH.

  • Spatial Relationship Description is about delivering (tiled) ultra-high-resolution content towards heterogeneous clients while at the same time providing interactivity (e.g., zooming). Thus, not only the temporal but also spatial relationship of representations needs to be described.
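The tile selection that such a spatial description enables can be sketched as follows. This is only an illustration under stated assumptions: the `Tile` record, the `make_grid` and `tiles_for_region` helpers, the representation identifiers, and the 4x4 grid over a 7680x4320 picture are all hypothetical and not taken from the core experiment or the standard. The idea is simply that once each representation carries its position and size within the full picture, a client that zooms into a region only needs to request the tiles overlapping that region.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tile:
    """Hypothetical spatial description of one tiled representation."""
    rep_id: str
    x: int  # left edge within the full picture, in pixels
    y: int  # top edge within the full picture, in pixels
    w: int  # tile width in pixels
    h: int  # tile height in pixels


def make_grid(total_w, total_h, cols, rows):
    """Split the full picture into a regular cols x rows grid of tiles."""
    tile_w, tile_h = total_w // cols, total_h // rows
    return [
        Tile(f"tile_{r}_{c}", c * tile_w, r * tile_h, tile_w, tile_h)
        for r in range(rows)
        for c in range(cols)
    ]


def tiles_for_region(tiles, rx, ry, rw, rh):
    """Return the tiles whose rectangle overlaps the requested region."""
    return [
        t for t in tiles
        if t.x < rx + rw and rx < t.x + t.w
        and t.y < ry + rh and ry < t.y + t.h
    ]


# An assumed 8K picture split into 4x4 tiles of 1920x1080 each.
grid = make_grid(7680, 4320, cols=4, rows=4)

# The client zooms into a 1920x1080 window that straddles four tiles.
needed = tiles_for_region(grid, rx=1800, ry=1000, rw=1920, rh=1080)
print([t.rep_id for t in needed])
```

In this example the client would request 4 of the 16 tiles instead of the full picture, which is the kind of saving that makes tiled delivery attractive for heterogeneous clients.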

Other CEs are related to signaling intended source and display characteristics, controlling the DASH client behavior, and DASH client authentication and content access authorization.

The outcome of these CEs is potentially interesting for future amendments. One CE closed at this meeting which was about including quality information within DASH, e.g., as part of an additional track within ISOBMFF and an additional representation within the MPD. Clients may access this quality information in advance to assist the adaptation logic in order to make informed decisions about which segment to download next.
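How quality information known in advance might assist the adaptation logic can be sketched as follows. This is a minimal illustration, not the mechanism defined by the core experiment: the function name `pick_representation`, the per-segment quality scores and sizes, and the "cheapest representation that reaches a target quality" policy are all assumptions made for the example.

```python
def pick_representation(qualities, sizes_bits, throughput_bps,
                        segment_s, target_quality):
    """Choose a representation for the NEXT segment.

    qualities and sizes_bits map a representation id to the quality score
    and the size (in bits) of the next segment, as a client could read them
    in advance from a quality metadata track (hypothetical values here).
    """
    # Bits that can be downloaded within one segment duration.
    budget_bits = throughput_bps * segment_s
    affordable = [r for r in sizes_bits if sizes_bits[r] <= budget_bits]
    if not affordable:
        # Nothing fits the budget: fall back to the smallest segment.
        return min(sizes_bits, key=sizes_bits.get)
    good_enough = [r for r in affordable if qualities[r] >= target_quality]
    if good_enough:
        # Cheapest representation that already reaches the target quality.
        return min(good_enough, key=sizes_bits.get)
    # Target not reachable: take the best quality that is affordable.
    return max(affordable, key=qualities.get)


# Hypothetical next segment: "high" costs twice as many bits as "mid"
# but is only marginally better in quality.
sizes = {"low": 2_000_000, "mid": 4_000_000, "high": 8_000_000}
quality = {"low": 36.0, "mid": 41.5, "high": 42.0}

choice = pick_representation(quality, sizes, throughput_bps=5_000_000,
                             segment_s=2, target_quality=40.0)
print(choice)
```

A purely throughput-driven client would pick "high" here because it fits the budget, whereas the quality-aware choice is "mid", which saves half of the bits for a marginal quality difference.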

Interested people may join the MPEG-DASH Ad-hoc Group (AhG), where these topics (and others) are discussed.

Finally, additional information/outcome from the last meeting is accessible via including documents publicly available (some may have an editing period).

MPEG Column: 105th MPEG Meeting

Christian Timmerer


Opening plenary, 105th MPEG meeting, Vienna, Klagenfurt

At the 105th MPEG meeting in Vienna, Austria, a lot of interesting things happened. First, this was not only the 105th MPEG meeting but also the 48th VCEG meeting, 14th JCT-VC meeting, 5th JCT-3V meeting, and 26th SC29 meeting bringing together more than 400 experts from more than 20 countries to discuss technical issues in the domain of coding of audio, [picture (SC29 only),] multimedia and hypermedia information. Second, it was the 3rd meeting hosted in Austria after the 62nd in July 2002 and 77th in July 2006. In 2002, “the new video coding standard being developed jointly with the ITU-T VCEG organization was promoted to Final Committee Draft (FCD)” and in 2006 “MPEG Surround completed its technical work and has been submitted for final FDIS balloting” as well as “MPEG has issued a Final Call for Proposals on MPEG-7 Query Format (MP7QF)”.

The official press release of the 105th meeting can be found here but I’d like to highlight a couple of interesting topics including research aspects covered or enabled by them. Research efforts may lead to standardization activities, but standardization also enables research, as you may see below.

MPEG selects technology for the upcoming MPEG-H 3D audio standard

Based on the responses submitted to the Call for Proposals (CfP) on MPEG-H 3D audio, MPEG selected technology supporting content based on multiple formats, i.e., channels and objects (CO) and higher order ambisonics (HOA). All submissions have been evaluated by comprehensive and standardized subjective listening tests followed by statistical analysis of the results. Interestingly, when taking the highest bitrate of 1.2 Mb/s with a 22.2 channel configuration, both of the selected technologies have achieved excellent quality and are very close to true transparency. That is, listeners cannot differentiate between the encoded and the uncompressed signal. A first version of the MPEG-H 3D audio standard, covering the higher bitrates from around 1.2 Mb/s down to 256 kb/s, should be available by March 2014 (Committee Draft – CD), July 2014 (Draft International Standard – DIS), and January 2015 (Final Draft International Standard – FDIS), respectively.

Research topics: Although the technologies have been selected, it’s still a long way until the standard gets ratified by MPEG and published by ISO/IEC. Thus, there’s a lot of space for researching efficient encoding tools including the subjective quality evaluations thereof. Additionally, it may impact the way 3D Audio bitstreams are transferred from one entity to another including file-based, streaming, on demand, and live services. Finally, within the application domain it may enable new use cases which are interesting to explore from a research point of view.

Augmented Reality Application Format reaches FDIS status

The MPEG Augmented Reality Application Format (ARAF, ISO/IEC 23000-13) enables the augmentation of the real world with synthetic media objects by combining multiple, existing standards within a single specific application format addressing certain industry needs. In particular, it combines standards providing representation formats for scene description (i.e., subset of BIFS), sensor/actuator descriptors (MPEG-V), and media formats such as audio/video coding formats. There are multiple target applications which may benefit from the MPEG ARAF standard, e.g., geolocation-based services, image-based object detection and tracking, mixed and augmented reality games and real-virtual interactive scenarios.

Research topics: Please note that MPEG ARAF only specifies the format to enable interoperability in order to support use cases enabled by this format. Hence, there are many research topics which could be associated with the application domains identified above.

What’s new in Dynamic Adaptive Streaming over HTTP?

The DASH outcome of the 105th MPEG meeting comes with a couple of highlights. First, a public workshop was held on session management and control (#DASHsmc) which will be used to derive additional requirements for DASH. All position papers and presentations are publicly available here. Second, the first amendment (Amd.1) to part 1 of MPEG-DASH (ISO/IEC 23009-1:2012) has reached the final stage of standardization and together with the first corrigendum (Cor.1) and the existing part 1, the FDIS of the second edition of ISO/IEC 23009-1:201x has been approved. This includes support for event messages (e.g., to be used for live streaming and dynamic ad insertion) and a media presentation anchor which enables session mobility among others. Third and finally, the FDIS of conformance and reference software (ISO/IEC 23009-2) has been approved providing means for media presentation conformance, test vectors, a DASH access engine reference software, and various sample software tools.

Research topics: The MPEG-DASH conformance and reference software provides the ideal playground for researchers as it can be used both to generate and to consume bitstreams compliant to the standard. This playground could be used together with other open source tools from the DASH-IF, GPAC, and DASH@ITEC. Additionally, see also Open Source Column: Dynamic Adaptive Streaming over HTTP Toolset.

HEVC support in MPEG-2 Transport Stream and ISO Base Media File Format

After the completion of High Efficiency Video Coding (HEVC) – ITU-T H.265 | MPEG HEVC at the 103rd MPEG meeting in Geneva, HEVC bitstreams can now be delivered using the MPEG-2 Transport Stream (M2TS) and files based on the ISO Base Media File Format (ISOBMFF). For the latter, the scope of the Advanced Video Coding (AVC) file format has been extended to also support HEVC and this part of MPEG-4 has been renamed to Network Abstraction Layer (NAL) file format. This file format now covers AVC and its family (Scalable Video Coding – SVC and Multiview Video Coding – MVC) but also HEVC.

Research topics: Research in the area of delivering audio-visual material is manifold and very well reflected in conference/workshops like ACM MMSys and Packet Video and associated journals and magazines. For these two particular standards, it would be interesting to see the efficiency of the carriage of HEVC with respect to the overhead.

MPEG Column: Press release for the 104th MPEG meeting

Multimedia ecosystem event focuses on a broader scope of MPEG standards

The 104th MPEG meeting was held in Incheon, Korea, from 22 to 26 April 2013.

MPEG hosts Multimedia Ecosystem 2013 Event

During its 104th meeting, MPEG has hosted the MPEG Multimedia Ecosystem event to raise awareness of MPEG’s activities in areas not directly related to compression. In addition to world class standards for compression technologies, MPEG has developed media-related standards that enrich the use of multimedia such as MPEG-M for Multimedia Service Platform Technologies, MPEG-U for Rich Media User Interfaces, and MPEG-V for interfaces between real and virtual worlds. Also, new activities such as MPEG Augmented Reality Application Format, Compact Descriptors for Visual Search, Green MPEG for energy efficient media coding, and MPEG User Description are currently in progress. The event was organized with two sessions including a workshop and demonstrations. The workshop session introduced the seven standards described above while the demonstration session showed 17 products based on these standards.

MPEG issues CfP for Energy-Efficient Media Consumption (Green MPEG)

At the 104th MPEG meeting, MPEG has issued a Call for Proposals (CfP) on energy-efficient media consumption (Green MPEG) which is available in the public documents section. Green MPEG is envisaged to provide interoperable solutions for energy-efficient media decoding and presentation as well as energy-efficient media encoding based on encoder resources or receiver feedback. The CfP solicits responses that use compact signaling to facilitate reduced consumption from the encoding, decoding and presentation of media content without any degradation in the Quality of Experience (QoE). When power levels are critically low, consumers may prefer to sacrifice their QoE for reduced energy consumption. Green MPEG will provide this capability by allowing energy consumption to be traded off with the QoE. Responses to the call are due at the 105th MPEG meeting in July 2013.

APIs enable access to other MPEG technologies via MXM

The MPEG eXtensible Middleware (MXM) API technology specifications (ISO/IEC 23006-2) have reached the status of International Standard at the 104th MPEG meeting. MXM specifies the means to access individual MPEG tools through standardized APIs and is expected to help the creation of a global market of MXM applications that can run on devices supporting MXM APIs in addition to the other MPEG technologies. The MXM standard should also help the deployment of innovative business models because it will enable the easy design and implementation of media-handling value chains. The standard also provides reference software as open source with a business-friendly license. The introductory part of the MXM family of specifications, 23006-1 MXM architecture and technologies, will soon also be freely available on the ISO web site.

MPEG introduces MPEG 101 with multimedia

MPEG has taken a further step toward communicating information about its standards in an easy and user-friendly manner, i.e., MPEG 101 with multimedia. MPEG 101 with multimedia will provide video clips containing overviews of individual standards along with explanations of the benefits that can be achieved by each standard, and will be available from the MPEG web site. During this 104th MPEG meeting, the first video clip on the Unified Speech and Audio Coding (USAC) standard has been prepared. USAC is the newest MPEG Audio standard, which was issued in 2012. It provides performance as good as or better than state-of-the-art codecs that are designed specifically for a single class of content, such as just speech or just music, and it does so for any content type, such as speech, music or a mix of speech and music. Over its target operating bit rate, 12 kb/s for mono signals through 32 kb/s for stereo signals, USAC provides significantly better performance than the benchmark codecs, and continues to provide better performance as the bitrate is increased to higher rates. MPEG will apply the MPEG 101 with multimedia communication tool to other MPEG standards in the near future.

MPEG Column: 103rd MPEG Meeting

Christian Timmerer


The 103rd MPEG Meeting

The 103rd MPEG meeting was held in Geneva, Switzerland, January 21-25, 2013. The official press release can be found here (doc only) and I’d like to introduce the new MPEG-H standard (ISO/IEC 23008) referred to as high efficiency coding and media delivery in heterogeneous environments:

  • Part 1: MPEG Media Transport (MMT) – status: 2nd committee draft (CD)
  • Part 2: High Efficiency Video Coding (HEVC) – status: final draft international standard (FDIS)
  • Part 3: 3D Audio – status: call for proposals (CfP)

MPEG Media Transport (MMT)

The MMT project was started in order to address the needs of modern media transport applications going beyond the capabilities offered by existing means of transportation such as formats defined by MPEG-2 transport stream (M2TS) or ISO base media file format (ISOBMFF) group of standards. The committee draft was approved during the 101st MPEG meeting. As a response to the CD ballot, MPEG received more than 200 comments from national bodies and, thus, decided to issue the 2nd committee draft which will be publicly available by February 7, 2013.

High Efficiency Video Coding (HEVC) – ITU-T H.265 | MPEG HEVC

HEVC is the next generation video coding standard jointly developed by ISO/IEC JTC1/SC29/WG11 (MPEG) and the Video Coding Experts Group (VCEG) of ITU-T WP 3/16. Please note that both ITU-T and ISO/IEC MPEG use the term “high efficiency video coding” in the title of the standard but one can expect – as with its predecessor – that the former will use ITU-T H.265 and the latter will use MPEG-H HEVC for promoting its standards. If you don’t want to participate in this debate, simply use high efficiency video coding.

The MPEG press release says that the “HEVC standard reduces by half the bit rate needed to deliver high-quality video for a broad variety of applications” (note: compared to its predecessor AVC). The editing period for the FDIS goes until March 3, 2013 and then, with the final preparations and a 2-month balloting period (yes|no vote only), one can expect the International Standard (IS) to be available in early summer 2013. Please note that there are no technical differences between FDIS and IS.

The ITU-T press release describes HEVC as a standard that “will provide a flexible, reliable and robust solution, future-proofed to support the next decade of video. The new standard is designed to take account of advancing screen resolutions and is expected to be phased in as high-end products and services outgrow the limits of current network and display technology.”

HEVC currently defines three profiles:

  • Main Profile for the “Mass-market consumer video products that historically require only 8 bits of precision”.
  • Main 10 Profile “will support up to 10 bits of processing precision for applications with higher quality demands”.
  • Main Still Picture Profile to support still image applications, hence, “HEVC also advances the state-of-the-art for still picture coding”

3D Audio

The 3D audio standard shall complement MMT and HEVC assuming that in a “home theater” system a large number of loudspeakers will be deployed. Therefore, MPEG has issued a Call for Proposals (CfP) with the selection of the reference model v0 due in July 2013. The CfP says that these loudspeakers “might be surrounding the user and be situated at high, mid and low vertical positions relative to the user’s ears. The desired sense of audio envelopment includes both immersive 3D audio, in the sense of being able to virtualize sound sources at any position in space, and accurate audio localization, in terms of both direction and distance.”

“In addition to a “home theater” audio-visual system, there may be a “personal” system having a tablet-sized visual display with speakers built into the device, e.g. around the perimeter of the display. Alternatively, the personal device may be a hand-held smart phone. Headphones with appropriate spatialization would also be a means to deliver an immersive audio experience for all systems.”

Complementary to the CfP, MPEG also provided the encoder input format for MPEG-H 3D audio and a draft MPEG audio core experiment methodology for 3D audio work.

MPEG Column: 102nd MPEG Meeting

Christian Timmerer

The 102nd MPEG meeting was held in Shanghai, China, October 15-19, 2012. The official press release can be found here (not yet available) and I would like to highlight the following topics:

  • Augmented Reality Application Format (ARAF) goes DIS
  • MPEG-4 has now 30 parts: Let’s welcome timed text and other visual overlays
  • Draft call for proposals for 3D audio
  • Green MPEG is progressing
  • MPEG starts a new publicity campaign by making more working documents publicly available for free

Augmented Reality Application Format (ARAF) goes DIS

MPEG’s application format dealing with augmented reality reached DIS status and is only one step away from becoming an international standard. In a nutshell, the MPEG ARAF enables the augmentation of 2D/3D regions of a scene by combining multiple existing standards within a specific application format addressing certain industry needs. In particular, ARAF comprises three components referred to as scene, sensor/actuator, and media. The scene component is represented using a subset of MPEG-4 Part 11 (BIFS), the sensor/actuator component is defined within MPEG-V, and the media component may comprise various types of compressed (multi)media assets using different sorts of modalities and codecs.

A tutorial from Marius Preda, MPEG 3DG chair, at the Web3D conference in August 2012 is provided below.

MPEG-4 has now 30 parts

Let’s welcome timed text and other visual overlays in the family of MPEG-4 standards. Part 30 of MPEG-4 – in combination with an amendment to the ISO base media file format (ISOBMFF) – addresses the carriage of W3C TTML including its derivative SMPTE Timed Text, as well as WebVTT. The types of overlays include subtitles, captions, and other timed text and graphics. The text-based overlays include basic text and XML-based text. Additionally, the standard provides support for bitmaps, fonts, and other graphics formats such as scalable vector graphics.

Draft call for proposals for 3D audio

MPEG 3D audio is concerned with various test items ranging from 9.1 over 12.1 up to 22.1 channel configurations. A public draft call for proposals has been issued at this meeting with the goal to finalize the call and the evaluation guidelines at the next meeting. The evaluation will be conducted in two phases. Phase one for higher bitrates (1.5 Mbps to 265 kbps) is foreseen to conclude in July 2013 with the evaluation of the answers to the call and the selection of the “Reference Model 0 (RM0)” technology which will serve as a basis for the development of a 3D audio standard. The second phase targets lower bitrates (96 kbps to 48 kbps) and builds on RM0 technology after this has been documented using text and code.

Green MPEG is progressing

The idea behind Green MPEG is to define signaling means that enable energy-efficient encoding, delivery, decoding, and/or presentation of MPEG formats (and possibly others) without the loss of Quality of Experience. Green MPEG will address this issue from an end-to-end point of view with the focus – as usual – on the decoder. However, a codec-centric design is not desirable as energy efficiency should not be achieved at the expense of the other components of the media ecosystem. At the moment, first requirements have been defined and everyone is free to join the discussions on the email reflector within the Ad-hoc Group.

MPEG starts a new publicity campaign by making more working documents publicly available for free

