About Christian Timmerer

Christian Timmerer is a researcher, entrepreneur, and teacher on immersive multimedia communication, streaming, adaptation, and Quality of Experience. He is an Assistant Professor at Alpen-Adria-Universität Klagenfurt, Austria. Follow him on Twitter at http://twitter.com/timse7 and subscribe to his blog at http://blog.timmerer.com.

MPEG Column: 108th MPEG Meeting

— original posts here and here by Multimedia Communication blog and bitmovin techblogChristian TimmererAAU/bitmovin

The 108th MPEG meeting was held at the Palacio de Congresos de Valencia in Spain featuring the following highlights (no worries about the acronyms, this is on purpose and they will be further explained below):

  • Requirements: PSAF, SCC, CDVA
  • Systems: M2TS, MPAF, Green Metadata
  • Video: CDVS, WVC, VCB
  • JCT-VC: SHVC, SCC
  • JCT-3D: MV/3D-HEVC, 3D-AVC
  • Audio: 3D audio

Opening Plenary of the 108th MPEG meeting in Valencia, Spain.

The official MPEG press release can be downloaded from the MPEG Web site. Some of the above highlighted topics will be detailed in the following and, of course, there’s an update on DASH-related matters at the end.

As indicated above, MPEG is full of (new) acronyms and in order to become familiar with those, I’ve put them deliberately in the overview but I will explain them further below.

PSAF – Publish/Subscribe Application Format

Publish/subscribe corresponds to a new network paradigm related to content-centric networking (or information-centric networking) where the content is addressed by its name rather than location. An application format within MPEG typically defines a combination of existing MPEG tools jointly addressing the needs for a given application domain, in this case, the publish/subscribe paradigm. The current requirements and a preliminary working draft are publicly available.

SCC – Screen Content Coding

I’ve introduced this topic in my previous report and this meeting the responses to the CfP have been evaluated. In total, seven responses have been received which meet all requirements and, thus, the actual standardization work is transferred to JCT-VC. Interestingly, the results of the CfP are publicly available. Within JCT-VC, a first test model has been defined and core experiments have been established. I will report more on this as an output of the next meetings…

CDVA – Compact Descriptors for Video Analysis

This project has been renamed from compact descriptors for video search to compact descriptors for video analysis and comprises a publicly available vision statement. That is, interested parties are welcome to join this new activity within MPEG.

M2TS – MPEG-2 Transport Stream

At this meeting, various extensions to M2TS have been defined such as transport of multi-view video coding depth information and extensions to HEVC, delivery of timeline for external data as well as carriage of layered HEVC, green metadata, and 3D audio. Hence, M2TS is still very active and multiple amendments are developed in parallel.

MPAF – Multimedia Preservation Application Format

The committee draft for MPAF has been approved and, in this context, MPEG-7 is extended with additional description schemes.

Green Metadata

Well, this standard does not have its own acronym; it’s simply referred to as MPEG-GREEN. The draft international standard has been approved and national bodies will vote on it at the JTC 1 level. It basically defines metadata to allow clients operating in an energy-efficient way. It comes along with amendments to M2TS and ISOBMFF that enable the carriage and storage of this metadata.

CDVS – Compact Descriptors for Visual Search

CDVS is at DIS stage and provide improvements on global descriptors as well as non-normative improvements of key-point detection and matching in terms of speedup and memory consumption. As all standards at DIS stage, national bodies will vote on it at the JTC 1 level.

What’s new in the video/audio-coding domain?

  • WVC – Web Video Coding: This project reached final draft international standard with the goal to provide a video-coding standard for Web applications. It basically defines a profile of the MPEG-AVC standard including those tools not encumbered by patents.
  • VCB – Video Coding for Browsers: The committee draft for part 31 of MPEG-4 defines video coding for browsers and basically defines VP8 as an international standard. This is explains also the difference to WVC.
  • SHVC – Scalable HEVC extensions: As for SVC, SHVC will be defined as an amendment to HEVC providing the same functionality as SVC, scalable video coding functionality.
  • MV/3D-HEVC, 3D-AVC: These are multi-view and 3D extensions for the HEVC and AVC standards respectively.
  • 3D Audio: Also, no acronym for this standard although I would prefer 3DA. However, CD has been approved at this meeting and the plan is to have DIS at the next meeting. At the same time, the carriage and storage of 3DA is being defined in M2TS and ISOBMFF respectively.

Finally, what’s new in the media transport area, specifically DASH and MMT?

As interested readers know from my previous reports, DASH 2nd edition has been approved has been approved some time ago. In the meantime, a first amendment to the 2nd edition is at draft amendment state including additional profiles (mainly adding xlink support) and time synchronization. A second amendment goes to the first ballot stage referred to as proposed draft amendment and defines spatial relationship description, generalized URL parameters, and other extensions. Eventually, these two amendments will be integrated in the 2nd edition which will become the MPEG-DASH 3rd edition. Also a corrigenda on the 2nd edition is currently under ballot and new contributions are still coming in, i.e., there is still a lot of interest in DASH. For your information – there will be two DASH-related sessions at Streaming Forum 2014.

On the other hand, MMT’s amendment 1 is currently under ballot and amendment 2 defines header compression and cross-layer interface. The latter has been progressed to a study document which will be further discussed at the next meeting. Interestingly, there will be a MMT developer’s day at the 109th MPEG meeting as in Japan, 4K/8K UHDTV services will be launched based on MMT specifications and in Korea and China, implementation of MMT is now under way. The developer’s day will be on July 5th (Saturday), 2014, 10:00 – 17:00 at the Sapporo Convention Center. Therefore, if you don’t know anything about MMT, the developer’s day is certainly a place to be.

Contact:

Dr. Christian Timmerer
CIO bitmovin GmbH | christian.timmerer@bitmovin.net
Alpen-Adria-Universität Klagenfurt | christian.timmerer@aau.at

What else? That is, some publicly available MPEG output documents… (Dates indicate availability and end of editing period, if applicable, using the following format YY/MM/DD):

  • Text of ISO/IEC 13818-1:2013 PDAM 7 Carriage of Layered HEVC (14/05/02)
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of Green Metadata (14/04/04)
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of 3D Audio (14/04/04)
  • WD of ISO/IEC 13818-1:2013 AMD Carriage of additional audio profiles & levels (14/04/04)
  • Text of ISO/IEC 14496-12:2012 PDAM 4 Enhanced audio support (14/04/04)
  • TuC on sample variants, signatures and other improvements for the ISOBMFF (14/04/04)
  • Text of ISO/IEC CD 14496-22 3rd edition (14/04/04)
  • Text of ISO/IEC CD 14496-31 Video Coding for Browsers (14/04/11)
  • Text of ISO/IEC 15938-5:2005 PDAM 5 Multiple text encodings, extended classification metadata (14/04/04)
  • WD 2 of ISO/IEC 15938-6:201X (2nd edition) (14/05/09)
  • Text of ISO/IEC DIS 15938-13 Compact Descriptors for Visual Search (14/04/18)
  • Test Model 10: Compact Descriptors for Visual Search (14/05/02)
  • WD of ARAF 2nd Edition (14/04/18)
  • Use cases for ARAF 2nd Edition (14/04/18)
  • WD 5.0 MAR Reference Model (14/04/18)
  • Logistic information for the 5th JAhG MAR meeting (14/04/04)
  • Text of ISO/IEC CD 23000-15 Multimedia Preservation Application Format (14/04/18)
  • WD of Implementation Guideline of MP-AF (14/04/04)
  • Requirements for Publish/Subscribe Application Format (PSAF) (14/04/04)
  • Preliminary WD of Publish/Subscribe Application Format (14/04/04)
  • WD2 of ISO/IEC 23001-4:201X/Amd.1 Parser Instantiation from BSD (14/04/11)
  • Text of ISO/IEC 23001-8:2013/DCOR1 (14/04/18)
  • Text of ISO/IEC DIS 23001-11 Green Metadata (14/04/25)
  • Study Text of ISO/IEC 23002-4:201x/DAM2 FU and FN descriptions for HEVC (14/04/04)
  • Text of ISO/IEC 23003-4 CD, Dynamic Range Control (14/04/11)
  • MMT Developers’ Day in 109th MPEG meeting (14/04/04)
  • Results of CfP on Screen Content Coding Tools for HEVC (14/04/30)
  • Study Text of ISO/IEC 23008-2:2013/DAM3 HEVC Scalable Extensions (14/06/06)
  • HEVC RExt Test Model 7 (14/06/06)
  • Scalable HEVC (SHVC) Test Model 6 (SHM 6) (14/06/06)
  • Report on HEVC compression performance verification testing (14/04/25)
  • HEVC Screen Content Coding Test Model 1 (SCM 1) (14/04/25)
  • Study Text of ISO/IEC 23008-2:2013/PDAM4 3D Video Extensions (14/05/15)
  • Test Model 8 of 3D-HEVC and MV-HEVC (14/05/15)
  • Text of ISO/IEC 23008-3/CD, 3D audio (14/04/11)
  • Listening Test Logistics for 3D Audio Phase 2 (14/04/04)
  • Active Downmix Control (14/04/04)
  • Text of ISO/IEC PDTR 23008-13 Implementation Guidelines for MPEG Media Transport (14/05/02)
  • Text of ISO/IEC 23009-1 2nd edition DAM 1 Extended Profiles and availability time synchronization (14/04/18)
  • Text of ISO/IEC 23009-1 2nd edition PDAM 2 Spatial Relationship Description, Generalized URL parameters and other extensions (14/04/18)
  • Text of ISO/IEC PDTR 23009-3 2nd edition DASH Implementation Guidelines (14/04/18)
  • MPEG vision for Compact Descriptors for Video Analysis (CDVA) (14/04/04)
  • Plan of FTV Seminar at 109th MPEG Meeting (14/04/04)
  • Draft Requirements and Explorations for HDR /WCG Content Distribution and Storage (14/04/04)
  • Working Draft 2 of Internet Video Coding (IVC) (14/04/18)
  • Internet Video Coding Test Model (ITM) v 9.0 (14/04/18)
  • Uniform Timeline Alignment (14/04/18)
  • Plan of Seminar on Hybrid Delivery at the 110th MPEG Meeting (14/04/04)
  • WD 2 of MPEG User Description (14/04/04)

MPEG Column: 107th MPEG Meeting

— original posts here and here by Multimedia Communication blog and bitmovin techblogChristian TimmererAAU/bitmovin

The MPEG-2 Transport Stream (M2TS; formally known as Rec. ITU-T H.222.0 | ISO/IEC 13818-1) has been awarded with the Technology & Engineering Emmy® Award by the National Academy of Television Arts & Sciences. It is the fourth time MPEG received an Emmy award. The M2TS is widely deployed across a broad range of application domain such as broadcast, cable TV, Internet TV (IPTV and OTT), and Blu-ray Disks. The Emmy was received during this year’s CES2014 in Las Vegas.

Plenary during the 107th MPEG Meeting.

Other topics of the 107th MPEG meeting in San Jose include the following highlights:

  • Requirements: Call for Proposals on Screen Content jointly with ITU-T’s Video Coding Experts Group (VCEG)
  • Systems: Committee Draft for Green Metadata
  • Video: Study Text Committee Draft for Compact Descriptors for Visual Search (CDVS)
  • JCT-VC: Draft Amendment for HEVC Scalable Extensions (SHVC)
  • JCT-3D: Proposed Draft Amendment for HEVC 3D Extensions (3D-HEVC)
  • Audio: 3D audio plans to progress to CD at 108th meeting
  • 3D Graphics: Working Draft 4.0 of Augmented Reality Application Format (ARAF) 2nd Edition

The official MPEG press release can be downloaded from the MPEG Web site. Some of the above highlighted topics will be detailed in the following and, of course, there’s an update on DASH-related matters at the end.

Call for Proposals on Screen Content

Screen content refers to content coming not from cameras but from screen/desktop sharing and collaboration, cloud computing and gaming, wirelessly connected displays, control rooms with high resolution display walls, virtual desktop infrastructures, tablets as secondary displays, PC over IP, ultra-thin client technology, etc. Also mixed-content is within the scope of this work item and may contain a mixture of camera-captured video and images with rendered computer-generated graphics, text, animation, etc.

Although this type of content was considered during the course of the HEVC standardization, recent studies in MPEG have led to the conclusion that significant further improvements in coding efficiency can be obtained by exploiting the characteristics of screen content and, thus, a Call for Proposals (CfP) is being issued for developing possible future extensions of the HEVC standard.

Companies and organizations are invited to submit proposals in response to this call –issued jointly by MPEG with ITU-T VCEG. Responses are expected to be submitted by early March, and will be evaluated during the 108th MPEG meeting. The timeline is as follows:

  • 2014/01/17: Final Call for Proposals
  • 2014/01/22: Availability of anchors and end of editing period for Final CfP
  • 2014/02/10: Mandatory registration deadline
    One of the contact persons (see Section 10) must be notified, and an invoice for the testing fee will be sent after registration. Additional logistic information will also be sent to proponents by this date.
  • 2014/03/05: Coded test material shall be available at the test site. By this date, the payment of the testing fee is expected to be finalized.
  • 2014/03/17: Submission of all documents and requested data associated with the proposal.
  • 2014/03/27-04/04: Evaluation of proposals at standardization meeting.
  • 2015: Final draft standard expected.

It will be interesting to see the coding efficiency of the submitted proposals compared to a pure HEVC or even AVC approach.

DEC PDP-8 at Computer History Museum during MPEG Social Event.

Committee Draft for Green Metadata

Green Metadata, formerly known as Green MPEG, shall enable energy-efficient media consumption and reached Committee Draft (CD) status at the 107th MPEG meeting. The representation formats defined within Green Metadata help reducing decoder power consumption and display power consumption. Clients may utilize such information for the adaptive selection of operating voltage or clock frequencies within their chipsets. Additional, it may be used to set the brightness of the backlights for the display to save power consumption.

Green Metadata also provides metadata for the signaling and selection of DASH representations to enable the reduction of power consumption for their encoding.

The main challenge in terms of adoption of this kind of technology is how to exploit these representation formats to actually achieve energy-efficient media consumption and how much!

What’s new on the DASH frontier?

The text of ISO/IEC 23009-1 2nd edition PDAM1 has been approved which may be referred to as MPEG-DASH v3 (once finalized and integrated into the second edition, possibly with further amendments and corrigenda, if applicable). This first amendment to MPEG-DASH v2 comprises accurate time synchronization between server and client for live services as well as a new profile, i.e., ISOBMFF High Profile which basically combines the ISOBMFF Live and ISOBMFF On-demand profiles and adds the Xlink feature.

Additionally, a second amendment to MPEG-DASH v2 has been started featuring Spatial Relationship Description (SRD) and DASH Client Authentication and Content Access Authorization (DAA).

Other DASH-related aspects include the following:

  • The common encryption for ISOBMFF has been extended with a simple pattern-based encryption mode, i.e., a new method which should simply content encryption.
  • The CD has been approved for the carriage of timed metadata metrics of media in ISOBMFF. This allows for the signaling of quality metrics within the segments enabling QoE-aware DASH clients.

What else? That is, some publicly available MPEG output documents… (Dates indicate availability and end of editing period, if applicable, using the following format YY/MM/DD):

  • Report of 3D-AVC Subjective Quality Assessment (14/02/28)
  • Working Draft 3 of Video Coding for Browsers (14/01/31)
  • Common Test Conditions for Proposals on VCB Enhancements (14/01/17)
  • Study Text of ISO/IEC CD 15938-13 Compact Descriptors for Visual Search (14/02/14)
  • WD 4.0 of ARAF 2nd Edition (14/02/07)
  • Text of ISO/IEC 23001-7 PDAM 1 Simple pattern-based encryption mode (14/01/31)
  • Text of ISO/IEC CD 23001-10 Carriage of Timed Metadata Metrics of Media in the ISO Base Media File Format (14/01/31)
  • Text of ISO/IEC CD 23001-11 Green Metadata (14/01/24)
  • Preliminary Draft of ISO/IEC 23008-2:2013/FDAM1 HEVC Range Extensions (14/02/28)
  • Text of ISO/IEC 23008-2:2013/DAM3 HEVC Scalable Extensions (14/01/31)
  • Preliminary Draft of ISO/IEC 23008-2:2013/FDAM2 HEVC Multiview Extensions (14/02/28)
  • Text of ISO/IEC 23008-2:2013/PDAM4 3D Extensions (14/03/14)
  • Text of ISO/IEC CD 23008-12 Image File Format (14/01/17)
  • Text of ISO/IEC 23009-1:201x DCOR 1 (14/01/24)
  • Text of ISO/IEC 23009-1:201x PDAM 1 High Profile and Availability Time Synchronization (14/01/24)
  • WD of ISO/IEC 23009-1 AMD 2 (14/01/31)
  • Requirements for an extension of HEVC for coding of screen content (14/01/17)
  • Joint Call for Proposals for coding of screen content (14/01/22)
  • Draft requirements for Higher Dynamic Range (HDR) and Wide Color Gamut (WCG) video coding for Broadcasting, OTT, and Storage Media (14/01/17)
  • Working Draft 1 of Internet Video Coding (IVC) (14/01/31)

MPEG Column: 106th MPEG Meeting

— original posts here and here by Multimedia Communication blog and bitmovin techblogChristian TimmererAAU/bitmovin

National Day Present by Austrian Airlines on my way to Geneva.

November, 2013, Geneva, Switzerland. Here comes a news report from the 106th MPEG in Geneva, Switzerland which was actually during the Austrian national day but Austrian Airlines had a nice present (see picture) for their guests.

The official press release can be found here.

In this meeting, ISO/IEC 23008-1 (i.e., MPEG-H Part 1) MPEG Media Transport (MMT) reached Final Draft International Standard (FDIS). Looking back when this project was started with the aim to supersede the widely adopted MPEG-2 Transport Stream (M2TS) — which receives the Technology & Engineering Emmy®Award in Jan’14 — and what we have now, the following features are supported within MMT:

  • Self-contained multiplexing structure
  • Strict timing model
  • Reference buffer model
  • Flexible splicing of content
  • Name based access of data
  • AL-FEC (application layer forward error correction)
  • Multiple Qualities of Service within one packet flow

ITU-T Tower Building, Geneva.

Interestingly, MMT supports the carriage of MPEG-DASH segments and MPD for uni-directional environments such as broadcasting.

MPEG-H now comprises three major technologies, part 1 is about transport (MMT; at FDIS stage), part 2 deals with video coding (HEVC; at FDIS stage), and part 3 will be about audio coding, specifically 3D audio coding (but it’s still in its infancy for which technical responses have been evaluated only recently). Other parts of MPEG-H are currently related to these three parts.

In terms of research, it is important to determine the efficiency, overhead, and — in general — the use cases enabled by MMT. From a business point of view, it will be interesting to see whether MMT will actually supersede M2TS and how it will evolve compared or in relation to DASH.

On another topic, MPEG-7 visual reached an important milestone at this meeting. The Committee Draft (CD) for Part 13 (ISO/IEC 15938-13) has been approved and is entitled Compact Descriptors for Visual Search (CDVS). This image description enables comparing and finding pictures that include similar content, e.g., when showing the same object from different viewpoints. CDVS mainly deals with images but MPEG also started work for compact descriptors for video search.

The CDVS standard truly helps to reduce the semantic gap. However, research in this domain is already well developed and it is unclear whether the research community will adopt CDVS, specifically because the interest in MPEG-7 descriptors has decreased lately. On the other hand, such a standard will enable interoperability among vendors and services (e.g., Google Goggles) reducing the number of proprietary formats and, hopefully, APIs. However, the most important question is whether CDVS will be adopted by the industry (and research).

Finally, what about MPEG-DASH?

The 2nd edition of part 1 (MPD and segment formats) and the 1st edition of part 2 (conformance and reference software) have been finalized at the 105th MPEG meeting (FDIS). Additionally, we had a public/open workshop at that meeting which was about session management and control for DASH. This and other new topics are further developed within so-called core experiments for which I’d like to give a brief overview:

  • Server and Network assisted DASH Operation (SAND) which is the immediate result of the workshop at the 105th MPEG meeting and introduces a DASH-Aware Media Element (DANE) as depicted in the Figure below. Parameters from this element — as well as others — may support the DASH client within its operations, i.e., downloading the “best” segments for its context. SAND parameters are typically coming from the network itself whereas Parameters for enhancing delivery by DANE (PED) are coming from the content author.

Baseline Architecture for Server and Network assisted DASH.

  • Spatial Relationship Description is about delivering (tiled) ultra-high-resolution content towards heterogeneous clients while at the same time providing interactivity (e.g., zooming). Thus, not only the temporal but also spatial relationship of representations needs to be described.

Other CEs are related to signaling intended source and display characteristicscontrolling the DASH client behavior, and DASH client authentication and content access authorization.

The outcome of these CEs is potentially interesting for future amendments. One CE closed at this meeting which was about including quality information within DASH, e.g., as part of an additional track within ISOBMFF and an additional representation within the MPD. Clients may access this quality information in advance to assist the adaptation logic in order to make informed decisions about which segment to download next.

Interested people may join the MPEG-DASH Ad-hoc Group (AhG; http://lists.uni-klu.ac.at/mailman/listinfo/dash) where these topics (and others) are discussed.

Finally, additional information/outcome from the last meeting is accessible via http://mpeg.chiariglione.org/meetings/106 including documents publicly available (some may have an editing period).

MPEG Column: 105th MPEG Meeting

— original post by Multimedia Communication blogChristian TimmererAAU

 

Opening plenary, 105th MPEG meeting, Vienna, Klagenfurt

At the 105th MPEG meeting in Vienna, Austria, a lot of interesting things happened. First, this was not only the 105th MPEG meeting but also the 48th VCEG meeting, 14th JCT-VC meeting, 5th JCT-3V meeting, and 26th SC29 meeting bringing together more than 400 experts from more than 20 countries to discuss technical issues in the domain of coding of audio, [picture (SC29 only),] multimedia and hypermedia information. Second, it was the 3rd meeting hosted in Austria after the 62nd in July 2002 and 77th in July 2006. In 2002, “the new video coding standard being developed jointly with the ITU-T VCEG organization was promoted to Final Committee Draft (FCD)” and in 2006 “MPEG Surround completed its technical work and has been submitted for final FDIS balloting” as well as “MPEG has issued a Final Call for Proposals on MPEG-7 Query Format (MP7QF)”.

The official press release of the 105th meeting can be found here but I’d like to highlight a couple of interesting topics including research aspects covered or enabled by them. Although research efforts may lead to the standardization activities but also enables research as you may see below.

MPEG selects technology for the upcoming MPEG-H 3D audio standard

Based on the responses submitted to the Call for Proposals (CfP) on MPEG-H 3D audio, MPEG selected technology supporting content based on multiple formats, i.e., channels and objects (CO) and higher order ambisonics (HOA). All submissions have been evaluated by comprehensive and standardized subjective listening tests followed by statistical analysis of the results. Interestingly, when taking the highest bitrate of 1.2 Mb/s with a 22.2 channel configuration, both of the selected technologies have achieved excellent quality and are very close to true transparency. That is, listeners cannot differentiate between the encoded and uncompressed bitstream. A first version of the MPEG-H 3D audio standard with higher bitrates of around 1.2 Mb/s to 256 kb/s should be available by March 2014 (Committee Draft – CD), July 2014 (Draft International Standard – DIS), and January 2015 (Final Draft International Standards – FDIS), respectively.

Research topics: Although the technologies have been selected, it’s still a long way until the standard gets ratified by MPEG and published by ISO/IEC. Thus, there’s a lot of space for researching efficient encoding tools including the subjective quality evaluations thereof. Additionally, it may impact the way 3D Audio bitstreams are transferred from one entity to the another including file-based, streaming, on demand, and live services. Finally, within the application domain it may enable new use cases which are interesting to explore from a research point of view.

Augmented Reality Application Format reaches FDIS status

The MPEG Augmented Reality Application Format (ARAF, ISO/IEC 23000-13) enables the augmentation of the real world with synthetic media objects by combining multiple, existing standards within a single specific application format addressing certain industry needs. In particular, it combines standards providing representation formats for scene description (i.e., subset of BIFS), sensor/actuator descriptors (MPEG-V), and media formats such as audio/video coding formats. There are multiple target applications which may benefit from the MPEG ARAF standard, e.g., geolocation-based services, image-based object detection and tracking, mixed and augmented reality games and real-virtual interactive scenarios.

Research topics: Please note that MPEG ARAF only specifies the format to enable interoperability in order to support use cases enabled by this format. Hence, there are many research topics which could be associated to the application domains identified above.

What’s new in Dynamic Adaptive Streaming over HTTP?

The DASH outcome of the 105th MPEG meeting comes with a couple of highlights. First, a public workshop was held on session management and control (#DASHsmc) which will be used to derive additional requirements for DASH. All position papers and presentations are publicly available here. Second, the first amendment (Amd.1) to part 1 of MPEG-DASH (ISO/IEC 23009-1:2012) has reached the final stage of standardization and together with the first corrigendum (Cor.1) and the existing part 1, the FDIS of the second edition of ISO/IEC 23009-1:201x has been approved. This includes support for event messages (e.g., to be used for live streaming and dynamic ad insertion) and a media presentation anchor which enables session mobility among others. Third and finally, the FDIS of conformance and reference software (ISO/IEC 23009-2) has been approved providing means for media presentation conformance, test vectors, a DASH access engine reference software, and various sample software tools.

Research topics: The MPEG-DASH conformance and reference software provides the ideal playground for researchers as it can be used both to generate and to consume bitstreams compliant to the standard. This playground could be used together with other open source tools from the DASH-IFGPAC, and DASH@ITEC. Additionally, see also Open Source Column: Dynamic Adaptive Streaming over HTTP Toolset.

HEVC support in MPEG-2 Transport Stream and ISO Base Media File Format

After the completion of High Efficiency Video Coding (HEVC) – ITU-T H.265 | MPEG HEVC at the 103rd MPEG meeting in Geneva, HEVC bitstreams can be now delivered using the MPEG-2 Transport Stream (M2TS) and files based on the ISO Base Media File Format (ISOBMFF). For the latter, the scope of the Advanced Video Coding (AVC) file format has been extended to support also HEVC and this part of MPEG-4 has been renamed to Network Abstract Layer (NAL) file format. This file format now covers AVC and its family (Scalable Video Coding – SVC and Multiview Video Coding – MVC) but also HEVC.

Research topics: Research in the area of delivering audio-visual material is manifold and very well reflected in conference/workshops like ACM MMSys and Packet Video and associated journals and magazines. For these two particular standards, it would be interesting to see the efficiency of the carriage of HEVC with respect to the overhead.

Publicly available MPEG output documents

The following documents shall be come available at http://mpeg.chiariglione.org/ (availability in brackets – YY/MM/DD). If you have difficulties to access one of these documents, please feel free to contact me.

  • Requirements for HEVC image sequences (13/08/02)
  • Requirements for still image coding using HEVC (13/08/02)
  • Text of ISO/IEC 14496-16/PDAM4 Pattern based 3D mesh compression (13/08/02)
  • WD of ISO/IEC 14496-22 3rd edition (13/08/02)
  • Study text of DTR of ISO/IEC 23000-14, Augmented reality reference model (13/08/02)
  • Draft Test conditions for HEVC still picture coding performance evaluation (13/08/02)
  • List of stereo and 3D sequences considered (13/08/02)
  • Timeline and Requirements for MPEG-H Audio (13/08/02)
  • Working Draft 1 of Video Coding for browsers (13/08/31)
  • Test Model 1 of Video Coding for browsers (13/08/31)
  • Draft Requirements for Full Gamut Content Distribution (13/08/02)
  • Internet Video Coding Test Model (ITM) v 6.0 (13/08/23)
  • WD 2.0 MAR Reference Model (13/08/13)
  • Call for Proposals on MPEG User Description (MPEG-UD) (13/08/02)
  • Use Cases for MPEG User Description (13/08/02)
  • Requirements on MPEG User Description (13/08/02)
  • Text of white paper on MPEG Query Format (13/07/02)
  • Text of white paper on MPEG-7 AudioVisual Description Profile (AVDP) (13/07/02)

Open Source Column: Dynamic Adaptive Streaming over HTTP Toolset

Introduction

Multimedia content is nowadays omnipresent thanks to technological advancements in the last decades. A major driver of today’s networks are content providers like Netflix and YouTube, which do not deploy their own streaming architecture but provide their service over-the-top (OTT). Interestingly, this streaming approach performs well and adopts the Hypertext Transfer Protocol (HTTP), which has been initially designed for best-effort file transfer and not for real-time multimedia streaming. The assumption of former video streaming research that streaming on top of HTTP/TCP will not work smoothly due to its retransmission delay and throughput variations, has apparently be overcome as supported by [1]. Streaming on top of HTTP, which is currently mainly deployed in the form of progressive download, has several other advantages. The infrastructure deployed for traditional HTTP-based services (e.g., Web sites) can be exploited also for real-time multimedia streaming. Typical problems of real-time multimedia streaming like NAT or firewall traversal do not apply for HTTP streaming. Nevertheless, there are certain disadvantages, such as fluctuating bandwidth conditions, that can not be handled with the progressive download approach, which is a major drawback especially for mobile networks where the bandwidth variations are tremendous. One of the first solutions to overcome the problem of varying bandwidth conditions has been specified within 3GPP as Adaptive HTTP Streaming (AHS) [2]. The basic idea is to encode the media file/stream into different versions (e.g., bitrate, resolution) and chop each version into segments of the same length (e.g., two seconds). The segments are provided on an ordinary Web server and can be downloaded through HTTP GET requests. The adaptation to the bitrate or resolution is done on the client-side for each segment, e.g., the client can switch to a higher bitrate – if bandwidth permits – on a per segment basis. This has several advantages because the client knows best its capabilities, received throughput, and the context of the user. In order to describe the temporal and structural relationships between segments, AHS introduced the so-called Media Presentation Description (MPD). The MPD is a XML document that associates an uniform resource locators (URL) to the different qualities of the media content and the individual segments of each quality. This structure provides the binding of the segments to the bitrate (resolution, etc.) among others (e.g., start time, duration of segments). As a consequence each client will first request the MPD that contains the temporal and structural information for the media content and based on that information it will request the individual segments that fit best for its requirements. Additionally, the industry has deployed several proprietary solutions, e.g., Microsoft Smooth Streaming [3], Apple HTTP Live Streaming [4] and Adobe Dynamic HTTP Streaming [5], which more or less adopt the same approach.

Figure 1: Concept of Dynamic Adaptive Streaming over HTTP.

Recently, ISO/IEC MPEG has ratified Dynamic Adaptive Streaming over HTTP (DASH) [6] an international standard that should enable interoperability among proprietary solutions. The concept of DASH is depicted in Figure 1. The Institute of Information Technology (ITEC) and, in particular, the Multimedia Communication Research Group of the Alpen-Adria-Universität Klagenfurt has participated and contributed from the beginning to this standard. During the standardization process a lot of research tools have been developed for evaluation purposes and scientific contributions including several publications. These tools are provided as open source for the community and are available at [7].

Open Source Tools Suite

Our open source tool suite consists of several components. On the client-side we provide libdash [8] and the DASH plugin for the VLC media player (also available on Android). Additionally, our suite also includes a JavaScript-based client that utilizes the HTML5 media source extensions of the Google Chrome browser to enable DASH playback. Furthermore, we provide several server-side tools such as our DASH dataset, consisting of different movie sequences available in different segment lengths as well as bitrates and resolutions. Additionally, we provide a distributed dataset mirrored at different locations across Europe. Our datasets have been encoded using our DASHEncoder, which is a wrapper tool for x264 and MP4Box. Finally, a DASH online MPD validation service and a DASH implementation over CCN completes our open source tool suite.

libdash

Figure 2: Client-Server DASH Architecture with libdash.

The general architecture of DASH is depicted in Figure 2, where orange represents standardized parts. libdash comprises the MPD parsing and HTTP part. The library provides interfaces for the DASH Streaming Control and the Media Player to access MPDs and downloadable media segments. The download order of such media segments will not be handled by the library. This is left to the DASH Streaming Control, which is an own component in this architecture but it could also be included in the Media Player. In a typical deployment, a DASH server provides segments in several bitrates and resolutions. The client initially receives the MPD through libdash which provides a convenient object-oriented interface to that MPD. Based on that information the client can download individual media segments through libdash at any point in time. Varying bandwidth conditions can be handled by switching to the corresponding quality level at segment boundaries in order to provide a smooth streaming experience. This adaptation is not part of libdash and the DASH standard and will be left to the application which is using libdash.

DASH-JS

Figure 3: Screenshot of DASH-JS.

DASH-JS seamlessly integrates DASH into the Web using the HTML5 video element. A screenshot is shown in Figure 3. It is based on JavaScript and uses the Media Source API of Google’s Chrome browser to present a flexible and potentially browser independent DASH player. DASH-JS is currently using WebM-based media segments and segments based on the ISO Base Media File Format.

DASHEncoder

DASHEncoder is a content generation tool – on top of the open source encoding tool x264 and GPAC’s MP4Box – for DASH video-on-demand content. Using DASHEncoder, the user does not need to encode and multiplex separately each quality level of the final DASH content. Figure 4 depicts the workflow of the DASHEncoder. It generates the desired representations (quality/bitrate levels), fragmented MP4 files, and MPD file based on a given configuration file or by command line parameters.

Figure 4: High-level structure of DASHEncoder.

The set of configuration parameters comprises a wide range of possibilities. For example, DASHEncoder supports different segment sizes, bitrates, resolutions, encoding settings, URLs, etc. The modular implementation of DASHEncoder enables the batch processing of multiple encodings which are finally reassembled within a predefined directory structure represented by single MPD. DASHEncoder is available open source on our Web site as well as on Github, with the aim that other developers will join this project. The content generated with DASHEncoder is compatible with our playback tools.

Datasets

Figure 5: DASH Dataset.

Our DASH dataset comprises multiple full movie length sequences from different genres – animation, sport and movie (c.f. Figure 5) – and is located at our Web site. The DASH dataset is encoded and multiplexed using different segment sizes inspired by commercial products ranging from 2 seconds (i.e., Microsoft Smooth Streaming) to 10 seconds per fragment (i.e., Apple HTTP Streaming) and beyond. In particular, each sequence of the dataset is provided with segments sizes of 1, 2, 4, 6, 10, and 15 seconds. Additionally, we also offer a non-segmented version of the videos and the corresponding MPD for the movies of the animation genre, which allows for byte-range requests. The provided MPDs of the dataset are compatible with the current implementation of the DASH VLC Plugin, libdash, and DASH-JS. Furthermore, we provide a distributed DASH (D-DASH) dataset which is, at the time of writing, replicated on five sites within Europe, i.e., Klagenfurt, Paris, Prague, Torino, and Crete. This allows for a real-world evaluation of DASH clients that perform bitstream switching between multiple sites, e.g., this could be useful as a simulation of the switching between multiple Content Distribution Networks (CDNs).

DASH Online MPD Validation Service

The DASH online MPD validation service implements the conformance software of MPEG-DASH and enables a Web-based validation of MPDs based on a file, URI, and text. As the MPD is based on XML schema, it is also possible to use an external XML schema file for the validation.

DASH over CCN

Finally, the Dynamic Adaptive Streaming over Content Centric Networks (DASC áka DASH over CCN) implements DASH utilizing a CCN naming scheme to identify content segments in a CCN network. Therefore, the CCN concept from Jacobson et al. and the CCNx implementation (www.ccnx.org) of PARC is used. In particular, video segments formatted according to MPEG-DASH are available in different quality levels but instead of HTTP, CCN is used for referencing and delivery.

Conclusion

Our open source tool suite is available to the community with the aim to provide a common ground for research efforts in the area of adaptive media streaming in order to make results comparable with each other. Everyone is invited to join this activity – get involved in and excited about DASH.

Acknowledgments

This work was supported in part by the EC in the context of the ALICANTE (FP7-ICT-248652) and SocialSensor (FP7-ICT-287975) projects and partly performed in the Lakeside Labs research cluster at AAU.

References

[1] Sandvine, “Global Internet Phenomena Report 2H 2012”, Sandvine Intelligent Broadband Networks, 2012. [2] 3GPP TS 26.234, “Transparent end-to-end packet switched streaming service (PSS)”, Protocols and codecs, 2010. [3] A. Zambelli, “IIS Smooth Streaming Technical Overview,” Technical Report, Microsoft Corporation, March 2009. [4] R. Pantos, W. May, “HTTP Live Streaming”, IETF draft, http://tools.ietf.org/html/draft-pantos-http-live-streaming-07 (last access: Feb 2013). [5] Adobe HTTP Dynamic Streaming, http://www.adobe.com/products/httpdynamicstreaming/ (last access: Feb 2013). [6] ISO/IEC 23009-1:2012, Information technology – Dynamic adaptive streaming over HTTP (DASH) – Part 1: Media presentation description and segment formats. Available here [7] ITEC DASH, http://dash.itec.aau.at [8] libdash open git repository, https://github.com/bitmovin/libdash  

MPEG Column: 103rd MPEG Meeting

— original post by Multimedia Communication blogChristian TimmererAAU

 

The 103rd MPEG Meeting

The 103rd MPEG meeting was held in Geneva, Switzerland, January 21-15, 2013. The official press release can be found here (doc only) and I’d like to introduce the new MPEG-H standard (ISO/IEC 23008) referred to as high efficiency coding and media delivery in heterogeneous environments:

  • Part 1: MPEG Media Transport (MMT) – status: 2nd committee draft (CD)
  • Part 2: High Efficiency Video Coding (HEVC) – status: final draft international standard (FDIS)
  • Part 3: 3D Audio – status: call for proposals (CfP)

MPEG Media Transport (MMT)

The MMT project was started in order to address the needs of modern media transport applications going beyond the capabilities offered by existing means of transportation such as formats defined by MPEG-2 transport stream (M2TS) or ISO base media file format (ISOBMFF) group of standards. The committee draft was approved during the 101st MPEG meeting. As a response to the CD ballot, MPEG received more than 200 comments from national bodies and, thus, decided to issue the 2nd committee draft which will be publicly available by February 7, 2013.

High Efficiency Video Coding (HEVC) – ITU-T H.265 | MPEG HEVC

HEVC is the next generation video coding standard jointly developed by ISO/IEC JTC1/SC29/WG11 (MPEG) and the Video Coding Experts Group (VCEG) of ITU-T WP 3/16. Please note that both ITU-T and ISO/IEC MPEG use the term “high efficiency video coding” in the the title of the standard but one can expect – as with its predecessor – that the former will use ITU-T H.265 and the latter will use MPEG-H HEVC for promoting its standards. If you don’t want to participate in this debate, simply use high efficiency video coding.

The MPEG press release says that the “HEVC standard reduces by half the bit rate needed to deliver high-quality video for a broad variety of applications” (note: compared to its predecessor AVC). The editing period for the FDIS goes until March 3, 2013 and then with the final preparations and a 2 month balloting period (yes|no vote only) once can expect the International Standard (IS) to be available early summer 2013. Please note that there are no technical differences between FDIS and IS.

The ITU-T press release describes HEVC as a standard that “will provide a flexible, reliable and robust solution, future-proofed to support the next decade of video. The new standard is designed to take account of advancing screen resolutions and is expected to be phased in as high-end products and services outgrow the limits of current network and display technology.”

HEVC currently defines three profiles:

  • Main Profile for the “Mass-market consumer video products that historically require only 8 bits of precision”.
  • Main 10 Profile “will support up to 10 bits of processing precision for applications with higher quality demands”.
  • Main Still Picture Profile to support still image applications, hence, “HEVC also advances the state-of-the-art for still picture coding”

3D Audio

The 3D audio standard shall complement MMT and HEVC assuming that in a “home theater” system a large number of loudspeakers will be deployed. Therefore, MPEG has issued a Call for Proposals (CfP) with the selection of the reference model v0 due in July 2013. The CfP says that MPEG-H 3D Audio “might be surrounding the user and be situated at high, mid and low vertical positions relative to the user’s ears. The desired sense of audio envelopment includes both immersive 3D audio, in the sense of being able to virtualize sound sources at any position in space, and accurate audio localization, in terms of both direction and distance.”

“In addition to a “home theater” audio-visual system, there may be a “personal” system having a tablet-sized visual display with speakers built into the device, e.g. around the perimeter of the display. Alternatively, the personal device may be a hand-held smart phone. Headphones with appropriate spatialization would also be a means to deliver an immersive audio experience for all systems.”

Complementary to the CfP, MPEG also provided the encoder input format for MPEG-H 3D audio and a draft MPEG audio core experiment methodology for 3D audio work.

Publicly available MPEG output documents

The following documents shall be come available at http://mpeg.chiariglione.org/ (note: some may have an editing period – YY/MM/DD). If you have difficulties to access one of these documents, please feel free to contact me.

  • Study text of DIS of ISO/IEC 23000-13, Augmented Reality Application Format (13/01/25)
  • Study text of DTR of ISO/IEC 23000-14, Augmented reality reference model (13/02/25)
  • Text of ISO/IEC FDIS 23005-1 2nd edition Architecture (13/01/25)
  • Text of ISO/IEC 2nd CD 23008-1 MPEG Media Transport (13/02/07)
  • Text of ISO/IEC 23008-2:201x/PDAM1 Range Extensions (13/03/22)
  • Text of ISO/IEC 23008-2:201x/PDAM2 Multiview Extensions (13/03/22)
  • Call for Proposals on 3D Audio (13/01/25)
  • Encoder Input Format for MPEG-H 3D Audio (13/02/08)
  • Draft MPEG Audio CE methodology for 3D Audio work (13/01/25)
  • Draft Requirements on MPEG User Descriptions (13/02/08)
  • Draft Call for Proposals on MPEG User Descriptions (13/01/25)
  • Draft Call for Proposals on Green MPEG (13/01/25)
  • Context, Objectives, Use Cases and Requirements for Green MPEG (13/01/25)
  • White Paper on State of the Art in compression and transmission of 3D Video (13/01/28)
  • MPEG Awareness Event Flyer at 104th MPEG meeting in Incheon (13/02/28)

MPEG Column: 102nd MPEG Meeting

original post by Multimedia Communication blog, Christian Timmerer, AAU

The 102nd MPEG meeting was held in Shanghai, China, October 15-19, 2012. The official press release can be found here (not yet available) and I would like to highlight the following topics:

  • Augmented Reality Application Format (ARAF) goes DIS
  • MPEG-4 has now 30 parts: Let’s welcome timed text and other visual overlays
  • Draft call for proposals for 3D audio
  • Green MPEG is progressing
  • MPEG starts a new publicity campaign by making more working documents publicly available for free

Augmented Reality Application Format (ARAF) goes DIS

MPEG’s application format dealing with augmented reality reached DIS status and is only one step away from becoming in international standard. In a nutshell, the MPEG ARAF enables to augment 2D/3D regions of scene by combining multiple/existing standards within a specific application format addressing certain industry needs. In particular, ARAF comprises three components referred to as scene, sensor/actuator, and media. The scene component is represented using a subset of MPEG-4 Part 11 (BIFS), the sensor/actuator component is defined within MPEG-V, and the media component may comprise various type of compressed (multi)media assets using different sorts of modalities and codecs.

A tutorial from Marius Preda, MPEG 3DG chair, at the Web3D conference in August 2012 is provided below.

MPEG-4 has now 30 parts

Let’s welcome timed text and other visual overlays in the family of MPEG-4 standards. Part 30 of MPEG-4 – in combination with an amendment to the ISO base media file format (ISOBMFF) –  addresses the carriage of W3C TTML including its derivative SMPTE Timed Text, as well as WebVTT. The types of overlays include subtitles, captions, and other timed text and graphics. The text-based overlays include basic text and XML-based text. Additionally, the standards provides support for bitmaps, fonts, and other graphics formats such as scalable vector graphics.

Draft call for proposals for 3D audio

MPEG 3D audio is concerned about various test items ranging from 9.1 over 12.1 up to 22.1 channel configurations. A public draft call for proposals has been issued at this meeting with the goal to finalize the call and the evaluation guidelines at the next meeting. The evaluation will be conducted in two phases. Phase one for higher bitrates (1.5 Mbps to 265 kbps) is foreseen to conclude in July 2013 with the evaluation of the answers to the call and the selection of the “Reference Model 0 (RM0)” technology which will serve as a basis for the development of an 3D audio standard. The second phase targets lower bitrates (96 kbps to 48 kbps) and builds on RM0 technology after this has been documented using text and code.

Green MPEG is progressing

The idea between green MPEG is to define signaling means that enable energy efficient encoding, delivery, decoding, and/or presentation of MPEG formats (and possibly others) without the loss of Quality of Experience. Green MPEG will address this issue from an end-to-end point of view with the focus – as usual – on the decoder. However, a codec-centric design is not desirable as the energy efficiency should not be affected at the expenses of the other components of the media ecosystem. At the moment, first requirements have been defined and everyone is free to join the discussions on the email reflector within the Ad-hoc Group.

MPEG starts a new publicity campaign by making more working documents publicly available for free

As a response to national bodies comments, MPEG is starting from now on to make more documents publicly available for free. Here’s a selection of these documents which are publicly available here. Note that some may have an editing period and, thus, are not available at the of writing this blog post.

  • Text of ISO/IEC 14496-15:2010/DAM 2 Carriage of HEVC (2012/11/02)
  • Text of ISO/IEC CD 14496-30 Timed Text and Other Visual Overlays in ISO Base Media File Format (2012/11/02)
  • DIS of ISO/IEC 23000-13, Augmented Reality Application Format (2012/11/07)
  • DTR of ISO/IEC 23000-14, Augmented reality reference model (2012/11/21)
  • Study of ISO/IEC CD 23008-1 MPEG Media Transport (2012/11/12)
  • High Efficiency Video Coding (HEVC) Test Model 9 (HM 9) Encoder Description (2012/11/30)
  • Study Text of ISO/IEC DIS 23008-2 High Efficiency Video Coding (2012/11/30)
  • Working Draft of HEVC Full Range Extensions (2012/11/02)
  • Working Draft of HEVC Conformance (2012/11/02)
  • Report of Results of the Joint Call for Proposals on Scalable High Efficiency Video Coding (SHVC) (2012/11/09)
  • Draft Call for Proposals on 3D Audio (2012/10/19)
  • Text of ISO/IEC 23009-1:2012 DAM 1 Support for Event Messages and Extended Audio Channel Configuration (2012/10/31)
  • Internet Video Coding Test Model (ITM) v 3.0 (2012/11/02)
  • Draft Requirements on MPEG User Descriptions (2012/10/19)
  • Draft Use Cases for MPEG User Description (Ver. 4.0) (2012/10/19)
  • Requirements on Green MPEG (2012/10/19)
  • White Paper on State of the Art in compression and transmission of 3D Video (Draft) (2012/10/19)
  • White Paper on Compact Descriptors for Visual Search (2012/11/09)