VQEG Column: Recent contributions to ITU recommendations

Welcome to the second column on the ACM SIGMM Records from the Video Quality Experts Group (VQEG).
VQEG plays a major role in research and the development of standards on video quality and this column presents examples of recent contributions to International Telecommunication Union (ITU) recommendations, as well as ongoing contributions to recommendations to come in the near future. In addition, the formation of a new group within VQEG addressing Quality Assessment for Health Applications (QAH) has been announced.  

VQEG website: www.vqeg.org
Authors: 
Jesús Gutiérrez (jesus.gutierrez@upm.es), Universidad Politécnica de Madrid (Spain)
Kjell Brunnström (kjell.brunnstrom@ri.se), RISE (Sweden) 
Thanks to Lucjan Janowski (AGH University of Science and Technology), Alexander Raake (TU Ilmenau) and Shahid Satti (Opticom) for their help and contributions.

Introduction

VQEG is an international and independent organisation that provides a forum for technical experts in perceptual video quality assessment from industry, academia, and standardization organisations. Although VQEG does not develop or publish standards, several activities carried out by VQEG groups (e.g., validation tests, multi-lab test campaigns, and the development of objective quality models) have been instrumental in the development of international recommendations and standards. VQEG contributions have mainly been submitted to relevant ITU Study Groups (e.g., ITU-T SG9, ITU-T SG12, ITU-R WP6C), but also to other standardization bodies, such as MPEG, ITU-R SG6, ATIS, IEEE P.3333 and P.1858, DVB, and ETSI.

In our first column on the ACM SIGMM Records we provided a table summarizing the VQEG studies that have resulted in ITU Recommendations. In this new column, we describe in more detail the latest contributions to recent ITU standards, and we provide insight into the ongoing contributions that may result in ITU recommendations in the near future.

ITU Recommendations with recent inputs from VQEG

ITU-T Rec. P.1204 standard series

A campaign within the ITU-T Study Group (SG) 12 (Question 14) in collaboration with the VQEG AVHD group resulted in the development of three new video quality model standards for the assessment of sequences of up to UHD/4K resolution. This campaign was carried out over more than two years under the project “AVHD-AS / P.NATS Phase 2”. While “P.NATS Phase 1” (finalized in 2016 and resulting in the standards series ITU-T Rec. P.1203, P.1203.1, P.1203.2 and P.1203.3) addressed the development of improved bitstream-based models for the prediction of the overall quality of long (1-5 minutes) video streaming sessions, the second phase addressed the development of short-term video quality models covering a wider scope with bitstream-based, pixel-based and hybrid models. The P.NATS Phase 2 project was executed as a competition between nine participating institutions in different tracks, resulting in the aforementioned three types of video quality models.

For the competition, a total of 26 databases were created, 13 used for training and 13 for validation and selection of the winning models. In order to establish the ground truth, subjective video quality tests were performed on four different display devices (PC-monitors, 55-75” TVs, mobile, and tablet) with at least 24 subjects each and using the 5-point Absolute Category Rating (ACR) scale. In total, about 5000 test sequences with a duration of around 8 seconds were evaluated, containing a variety of resolutions, encoding configurations, bitrates, and framerates using the codecs H.264/AVC, H.265/HEVC and VP9.   
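As an illustration of how ACR ratings of this kind are commonly summarised, the following minimal Python sketch computes a Mean Opinion Score (MOS) and an approximate 95% confidence interval for a single test sequence. The function name, the example ratings and the normal-approximation factor 1.96 are assumptions made for this illustration; they are not taken from the P.NATS Phase 2 test plan.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval).

    `ratings` holds one 5-point ACR score (1 = bad ... 5 = excellent) per subject.
    The factor z = 1.96 is a normal approximation (an assumption of this sketch).
    """
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ACR ratings must be integers from 1 to 5")
    mos = statistics.mean(ratings)
    sample_sd = statistics.stdev(ratings)  # sample standard deviation (n - 1)
    return mos, z * sample_sd / math.sqrt(len(ratings))


# Hypothetical panel of 24 subjects rating one sequence.
example_ratings = [4, 4, 3, 5, 4, 3, 4, 4, 5, 3, 4, 4,
                   3, 4, 5, 4, 4, 3, 4, 4, 4, 5, 3, 4]
mos, half_width = mos_with_ci(example_ratings)
print(f"MOS = {mos:.2f} +/- {half_width:.2f}")
```

With panels of about 24 subjects, a Student's t factor would give a slightly wider interval than the normal approximation used here.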

More details about the whole workflow and results of the competition can be found in [1]. As a result of this competition, the new standard series ITU-T Rec. P.1204 [2] has recently been published, including a bitstream-based model (ITU-T Rec. P.1204.3 [3]), a pixel-based model (ITU-T Rec. P.1204.4 [4]) and a hybrid model (ITU-T Rec. P.1204.5 [5]).

ITU-T Rec. P.1401

ITU-T Rec. P.1401 [6] provides guidelines for the statistical analysis, evaluation and reporting of quality measurements, and it was revised in January 2020. Based on the article by Brunnström and Barkowsky [7], VQEG pointed out that this very useful Recommendation lacked a section on multiple comparisons and their potential impact on the performance evaluation of objective quality methods. In the latest revision, Section 7.6.5 covers this topic.
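To see why multiple comparisons matter, the following minimal Python sketch shows how the chance of at least one false positive grows with the number of independent tests, and how a Bonferroni correction tightens the per-test threshold. Bonferroni is used here only as one common, conservative example; it is not claimed to be the procedure specified in Section 7.6.5, and the function names and p-values are hypothetical.

```python
def familywise_error_rate(num_tests, alpha=0.05):
    """Probability of at least one false positive among independent tests."""
    return 1 - (1 - alpha) ** num_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value as significant using the threshold alpha / number of tests."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Comparing, say, 10 pairs of objective models at alpha = 0.05 each:
print(f"{familywise_error_rate(10):.2f}")

# Hypothetical p-values from four pairwise model comparisons.
print(bonferroni_significant([0.001, 0.02, 0.04, 0.3]))
```

In this example only the first comparison remains significant after correction, although three of the four p-values are below the uncorrected 0.05 level.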

Ongoing VQEG Inputs to ITU Recommendations

ITU-T Rec. P.919

ITU has been working on a recommendation for subjective test methodologies for 360° video on Head-Mounted Displays (HMDs), under SG12 Question 13 (Q13). The Immersive Media Group (IMG) of VQEG has collaborated in this effort by carrying out Phase 1 of the Test Plan for Quality Assessment of 360-degree Video. Phase 1 of this test plan addresses the assessment of short sequences (less than 30 seconds), in the spirit of ITU-R BT.500 [8] and ITU-T P.910 [9], and considers the evaluation of audiovisual quality and simulator sickness. Phase 2 of the test plan (envisioned for the near future) covers the assessment of other factors that can be more influential with longer sequences (several minutes), such as immersiveness and presence.

Within Phase 1, the IMG designed and executed a cross-lab test with the participation of ten international laboratories: AGH University of Science and Technology (Poland), Centrum Wiskunde & Informatica (The Netherlands), Ghent University (Belgium), Nokia Bell-Labs (Spain), Roma TRE University (Italy), RISE Acreo (Sweden), TU Ilmenau (Germany), Universidad Politécnica de Madrid (Spain), University of Surrey (England), and Wuhan University (China).

This test was aimed at assessing and validating subjective evaluation methodologies for 360° video. Thus, the single-stimulus Absolute Category Rating (ACR) methodology and the double-stimulus Degradation Category Rating (DCR) methodology were considered to evaluate the audiovisual quality of 360° videos distorted with uniform and non-uniform degradations. In particular, different configurations of uniform and tile-based coding were applied to eight video sources with different spatial, temporal and exploration properties. Other influence factors were also studied, such as the sequence duration (from 10 to 30 s) and the test setup (considering different HMDs, different methods to collect the observers’ ratings, the use of audio or not, etc.). Finally, in addition to the evaluation of audiovisual quality, the assessment of simulator sickness symptoms was addressed by studying the use of different questionnaires. As a result of this work, the IMG of VQEG presented two contributions to the recommendation ITU-T Rec. P.919 (ex P.360-VR), which was consented at the last SG12 meeting (7-11 September 2020) and is expected to be published soon. In addition, the results and the annotated dataset from the cross-lab test will be published soon.

ITU-T Rec. P.913

Another upcoming contribution is being prepared by the Statistical Analysis Methods (SAM) group. The main goal of the proposal is to increase the precision of the analysis of subjective experiments by describing a subjective answer as a random variable. The random variable is described by three key influencing factors: the sequence quality, the subject bias, and the subject precision. It is a further development of the ITU-T P.913 [10] recommendation, where subject bias was introduced. Adding subject precision brings two benefits: better handling of unreliable subjects and an easier estimation procedure.

Current standards describe ways to remove an unreliable subject. The problem is that the methods proposed in BT.500 [8] and P.913 [10] are different and point to different subjects. Also, both methods have some arbitrary parameters (e.g., thresholds) that decide when a subject should be removed. This means that two subjects can be similarly imprecise, yet one is over the threshold, and we accept all of his answers as correct, while the other is under the threshold, and we remove all of her answers. The proposed method weights the impact of each subject's answers according to the subject's precision. As a consequence, each subject is, to some extent, both removed and kept. The balance between how much information we keep and how much we remove depends on the subject's precision.

The estimation procedure for the proposed model described in the literature is Maximum Likelihood Estimation (MLE). Such estimation is computationally costly and needs a careful setup to obtain a reliable solution. Therefore, we proposed an Alternating Projection (AP) solver, which is less general than MLE but works as well as MLE for estimating the subject model. This solver is called “alternating projection” because, in a loop, we alternate between projecting (or averaging) the opinion scores along the subject dimension and the stimulus dimension. Step by step, it increases the precision of the obtained model parameters by giving more weight to the information coming from the more precise subjects. More details can be found in the white paper [11].
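The alternating loop described above can be sketched in a few lines of Python. This is a minimal illustration in the spirit of [11], not the reference implementation: the function name, the use of the residual standard deviation as the (inverse) measure of subject precision, the inverse-variance weights, the stopping rule and the `min_spread` floor are all assumptions of this sketch, and a complete score matrix (every subject rates every stimulus) is assumed.

```python
import statistics


def alternating_projection(scores, max_iter=100, tol=1e-8, min_spread=1e-6):
    """Estimate stimulus quality, subject bias and subject spread.

    scores[i][j] is the opinion score given by subject i to stimulus j.
    A small spread means a precise subject. Returns (quality, bias, spread).
    """
    n_subj, n_stim = len(scores), len(scores[0])
    # Start from the plain mean opinion score of each stimulus.
    quality = [statistics.mean([scores[i][j] for i in range(n_subj)])
               for j in range(n_stim)]
    bias = [0.0] * n_subj
    spread = [1.0] * n_subj
    for _ in range(max_iter):
        # Average along the stimulus dimension: per-subject bias and spread.
        for i in range(n_subj):
            resid = [scores[i][j] - quality[j] for j in range(n_stim)]
            bias[i] = statistics.mean(resid)
            # The floor avoids a division by zero for perfectly consistent subjects.
            spread[i] = max(statistics.pstdev(resid), min_spread)
        # Average along the subject dimension, weighting precise subjects more.
        weights = [1.0 / s ** 2 for s in spread]
        total = sum(weights)
        new_quality = [
            sum(weights[i] * (scores[i][j] - bias[i]) for i in range(n_subj)) / total
            for j in range(n_stim)
        ]
        change = max(abs(n - q) for n, q in zip(new_quality, quality))
        quality = new_quality
        if change < tol:
            break
    return quality, bias, spread


# Hypothetical scores: three subjects (rows) rating four stimuli (columns).
example_scores = [
    [1, 2, 4, 5],
    [2, 3, 4, 5],
    [4, 1, 5, 2],
]
quality, bias, spread = alternating_projection(example_scores)
print([round(q, 2) for q in quality])
```

No subject is discarded in this sketch: a subject with a large spread simply contributes less to the quality estimate, which is the soft alternative to threshold-based subject removal discussed above.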

Other updates 

A new VQEG group on Quality Assessment for Health Applications (QAH) has recently been established, with the motivation to study visual quality requirements for medical imaging and telemedicine. The main goals of this new group are:

  • Assemble all the existing publicly accessible databases on medical quality.
  • Develop databases with new diagnostic tasks and new objective quality assessment models.
  • Provide methodologies, recommendations and guidelines for subjective tests of medical image quality.
  • Study the quality requirements and Quality of Experience in the context of telemedicine and other telehealth services.

For any further questions or expressions of interest to join this group, please contact QAH Chair Lu Zhang (lu.ge@insa-rennes.fr), Vice Chair Meriem Outtas (Meriem.Outtas@insa-rennes.fr), and Vice Chair Hantao Liu (hantao.liu@cs.cardiff.ac.uk).

References

[1] A. Raake, S. Borer, S. Satti, J. Gustafsson, R.R.R. Rao, S. Medagli, P. List, S. Göring, D. Lindero, W. Robitza, G. Heikkilä, S. Broom, C. Schmidmer, B. Feiten, U. Wüstenhagen, T. Wittmann, M. Obermann, R. Bitto, “Multi-model standard for bitstream-, pixel-based and hybrid video quality assessment of UHD/4K: ITU-T P.1204”, IEEE Access, 2020 (Available online soon).
[2] ITU-T Rec. P.1204. Video quality assessment of streaming services over reliable transport for resolutions up to 4K. Geneva, Switzerland: ITU, 2020.
[3] ITU-T Rec. P.1204.3. Video quality assessment of streaming services over reliable transport for resolutions up to 4K with access to full bitstream information. Geneva, Switzerland: ITU, 2020.
[4] ITU-T Rec. P.1204.4. Video quality assessment of streaming services over reliable transport for resolutions up to 4K with access to full and reduced reference pixel information. Geneva, Switzerland: ITU, 2020.
[5] ITU-T Rec. P.1204.5. Video quality assessment of streaming services over reliable transport for resolutions up to 4K with access to transport and received pixel information. Geneva, Switzerland: ITU, 2020.
[6] ITU-T Rec. P.1401. Methods, metrics and procedures for statistical evaluation, qualification and comparison of objective quality prediction models. Geneva, Switzerland: ITU, 2020.
[7] K. Brunnström and M. Barkowsky, “Statistical quality of experience analysis: on planning the sample size and statistical significance testing”, Journal of Electronic Imaging, vol. 27, no. 5, p. 11, Sep. 2018 (DOI: 10.1117/1.JEI.27.5.053013).
[8] ITU-R Rec. BT.500-14. Methodology for the subjective assessment of the quality of television pictures. Geneva, Switzerland: ITU, 2019.
[9] ITU-T Rec. P.910. Subjective video quality assessment methods for multimedia applications. Geneva, Switzerland: ITU, 2008.
[10] ITU-T Rec. P.913. Methods for the subjective assessment of video quality, audio quality and audiovisual quality of Internet video and distribution quality television in any environment. Geneva, Switzerland: ITU, 2016.
[11] Z. Li, C. G. Bampis, L. Janowski, I. Katsavounidis, “A simple model for subject behavior in subjective experiments”, arXiv:2004.02067, Apr. 2020.

JPEG Column: 88th JPEG Meeting

The 88th JPEG meeting, initially planned to be held in Geneva, Switzerland, was held online because of the Covid-19 outbreak.

JPEG experts organised a large number of sessions spread over day and night to allow remote participation from multiple time zones. This very intense activity has resulted in multiple outputs and initiatives. In particular, two new exploration activities were initiated. The first explores possible standardisation needs to address the growing emergence of fake media by introducing appropriate security features to prevent the misuse of media content. The second considers the use of DNA for media content archival.

Furthermore, JPEG has started work on the new Part 8 of the JPEG Systems standard, called JPEG Snack, for interoperable rich image experiences, and it is holding two Calls for Evidence, on JPEG AI and on JPEG Pleno Point Cloud coding.

Despite travel restrictions, the JPEG Committee has managed to keep up with the majority of its plans, defined prior to the COVID-19 outbreak. An overview of the different activities is represented in Fig. 1.

The 88th JPEG meeting had the following highlights:

  • JPEG explores standardization needs to address fake media
  • JPEG Pleno Point Cloud call for evidence
  • JPEG DNA – based archival of media content using DNA
  • JPEG AI call for evidence
  • JPEG XL standard evolves to a final specification
  • JPEG Systems Part 8, named JPEG Snack, progresses
  • JPEG XS ballot on raw-Bayer image sensor data compression.

Fig. 1: JPEG ongoing activities timeline.

JPEG explores standardization needs to address fake media

Recent advances in media manipulation, particularly deep learning-based approaches, can produce near-realistic media content that is almost indistinguishable from authentic content to the human eye. These developments open opportunities for the production of new types of media content that are useful for the entertainment industry and other business usage, e.g., creation of special effects or artificial natural scene production with actors in the studio. However, this also leads to issues relating to fake media generation undermining the integrity of the media (e.g., deepfakes), copyright infringements and defamation, to mention a few examples. Misuse of manipulated media can cause social unrest, spread rumours for political gain or encourage hate crimes. In this context, the term ‘fake’ is used to refer to any manipulated media, independently of its ‘good’ or ‘bad’ intention.

In many application domains, fake media producers may want or may be required to declare the type of manipulations performed, in contrast to other situations where the intention is to ‘hide’ the mere existence of such manipulations. This is already leading various governmental organizations to plan new legislation, and companies (especially social media platforms or news outlets) to develop mechanisms that would clearly detect and annotate manipulated media content when it is shared. While growing efforts are noticeable in developing technologies, there is a need for a standard for the media/metadata format, e.g., a JPEG standard that facilitates a secure and reliable annotation of fake media, in both good-faith and malicious usage scenarios. To better understand the fake media ecosystem and its needs in terms of standardization, the JPEG Committee has initiated an in-depth analysis of fake media use cases, independently of the “intentions” behind them.

More information on the initiative is available on the JPEG website. Interested parties are invited to join the above AHG through the following URL: http://listregistration.jpeg.org.

JPEG Pleno Point Cloud

JPEG Pleno is working towards the integration of various modalities of plenoptic content under a single and seamless framework. Efficient and powerful point cloud representation is a key feature within this vision. Point cloud data supports a wide range of applications including computer-aided manufacturing, entertainment, cultural heritage preservation, scientific research and advanced sensing and analysis. During the 88th JPEG meeting, the JPEG Committee released a Final Call for Evidence on JPEG Pleno Point Cloud Coding that focuses specifically on point cloud coding solutions supporting scalability and random access of decoded point clouds. Between the 88th and 89th meetings, the JPEG Committee will be actively promoting this activity and collecting registrations to participate in the Call for Evidence.

JPEG DNA

In digital media information, notably images, the relevant representation symbols, e.g. quantized DCT coefficients, are expressed in bits (i.e., binary units), but they could be expressed in any other units, for example DNA units, which follow a 4-ary representation basis. This would mean that DNA molecules may be created with a specific configuration of DNA units which stores some media representation symbols, e.g. the symbols of a JPEG image, thus leading to DNA-based media storage as a form of molecular data storage. JPEG standards have been used in the storage and archival of digital pictures as well as moving images. While the legacy JPEG format is widely used for photo storage on SD cards, as well as for archival of pictures by consumers, JPEG 2000 as described in ISO/IEC 15444 is used in many archival applications, notably for the preservation of cultural heritage in the form of visual data such as pictures and video in digital format. This puts the JPEG Committee in a unique position to address the challenges in DNA-based storage by creating a standard image representation and coding for such applications. To explore the latter, an AHG has been established. Interested parties are invited to join the above AHG through the following URL: http://listregistration.jpeg.org.
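The idea of a 4-ary representation can be illustrated with a naive Python sketch that maps every two bits of a byte stream to one of the four nucleotides A, C, G and T. The mapping (00 to A, 01 to C, 10 to G, 11 to T) and the function names are assumptions chosen for illustration only; this is not the JPEG DNA coding scheme, and practical DNA storage codes also have to respect biochemical constraints that this sketch ignores.

```python
NUCLEOTIDES = "ACGT"  # assumed mapping: 00 -> A, 01 -> C, 10 -> G, 11 -> T


def bytes_to_dna(data: bytes) -> str:
    """Encode each byte as four nucleotides, most significant bit pair first."""
    return "".join(
        NUCLEOTIDES[(byte >> shift) & 0b11]
        for byte in data
        for shift in (6, 4, 2, 0)
    )


def dna_to_bytes(strand: str) -> bytes:
    """Invert bytes_to_dna; the strand length must be a multiple of four."""
    if len(strand) % 4 != 0:
        raise ValueError("strand length must be a multiple of 4")
    out = bytearray()
    for start in range(0, len(strand), 4):
        byte = 0
        for base in strand[start:start + 4]:
            byte = (byte << 2) | NUCLEOTIDES.index(base)
        out.append(byte)
    return bytes(out)


# The two bytes 0xFF 0xD8 are the start-of-image marker of a JPEG file.
strand = bytes_to_dna(b"\xff\xd8")
print(strand)  # prints "TTTTTCGA"
assert dna_to_bytes(strand) == b"\xff\xd8"
```

Each nucleotide carries two bits, so a file of N bytes maps to 4N nucleotides under this naive scheme.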

JPEG AI

At the 88th meeting, the submissions to the Call for Evidence were reported and analysed. Six submissions were received in response to the Call for Evidence made in coordination with the IEEE MMSP 2020 Challenge. The submissions along with the anchors were already evaluated using objective quality metrics. Following this initial process, subjective experiments have been designed to compare the performance of all submissions. Thus, during this meeting, the main focus of JPEG AI was on the presentation and discussion of the objective performance evaluation of all submissions as well as the definition of the methodology for the subjective evaluation that will be made next.

JPEG XL

The standardization of the JPEG XL image coding system is nearing completion. Final technical comments by national bodies have been received for the codestream (Part 1); the DIS has been approved and an FDIS text is under preparation. The container file format (Part 2) is progressing to the DIS stage. A white paper summarizing key features of JPEG XL is available at http://ds.jpeg.org/whitepapers/jpeg-xl-whitepaper.pdf.

JPEG Systems

ISO/IEC has approved the JPEG Snack initiative to deliver interoperable rich image experiences. As a result, the JPEG Systems Part 8 (ISO/IEC 19566-8) has been created to define the file format construction and the metadata signalling and descriptions which enable animation with transition effects. A Call for Participation (CfP) and updated use cases and requirements have been issued. The use cases and requirements document and the CfP are available at http://ds.jpeg.org/documents/wg1n87035-REQ-JPEG_Snack_Use_Cases_and_Requirements_v2_2.pdf and http://ds.jpeg.org/documents/wg1n88032-SI-CfP_JPEG_Snack.pdf respectively.

An updated working draft for the JLINK initiative was completed. Interested parties are encouraged to review the JLINK Working Draft 3.0 available at http://ds.jpeg.org/documents/wg1n88031-SI-JLINK_WD_3_0.pdf

JPEG XS

The JPEG committee is pleased to announce a significant step in the standardization of an efficient Bayer image compression scheme, with the first ballot of the 2nd Edition of JPEG XS Part-1.

The new edition of this visually lossless, low-latency and lightweight compression scheme now includes image sensor coding tools allowing efficient compression of Color Filter Array (CFA) data. This compression enables better quality and lower complexity than the corresponding compression in the RGB domain. It can be used as a mezzanine codec in various markets such as real-time video storage in and outside of cameras, and data compression onboard autonomous cars.

Final Quote

“Fake Media has become a challenge with the wide-spread manipulated contents in the news. JPEG is determined to mitigate this problem by providing standards that can securely identify manipulated contents.” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

Future JPEG meetings are planned as follows:

  • No. 89 will be held online from October 5 to 9, 2020.

MPEG Column: 131st MPEG Meeting (virtual/online)

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The 131st MPEG meeting concluded on July 3, 2020, once again online, with a press release comprising an impressive list of news items led by “MPEG Announces VVC – the Versatile Video Coding Standard”. In the middle of the restructuring of SC 29 (i.e., MPEG’s parent body within ISO), MPEG successfully ratified, jointly with ITU-T’s VCEG within JVET, its next-generation video codec, among other interesting results from the 131st MPEG meeting:

Standards progressing to final approval ballot (FDIS)

  • MPEG Announces VVC – the Versatile Video Coding Standard
  • Point Cloud Compression – MPEG promotes a Video-based Point Cloud Compression Technology to the FDIS stage
  • MPEG-H 3D Audio – MPEG promotes Baseline Profile for 3D Audio to the final stage

Call for Proposals

  • Call for Proposals on Technologies for MPEG-21 Contracts to Smart Contracts Conversion
  • MPEG issues a Call for Proposals on extension and improvements to ISO/IEC 23092 standard series

Standards progressing to the first milestone of the ISO standard development process

  • Widening support for storage and delivery of MPEG-5 EVC
  • Multi-Image Application Format adds support of HDR
  • Carriage of Geometry-based Point Cloud Data progresses to Committee Draft
  • MPEG Immersive Video (MIV) progresses to Committee Draft
  • Neural Network Compression for Multimedia Applications – MPEG progresses to Committee Draft
  • MPEG issues Committee Draft of Conformance and Reference Software for Essential Video Coding (EVC)

The corresponding press release of the 131st MPEG meeting can be found here: https://mpeg-standards.com/meetings/mpeg-131/. This report focuses on video coding, featuring VVC, as well as PCC and systems aspects (i.e., file format, DASH).

MPEG Announces VVC – the Versatile Video Coding Standard

MPEG is pleased to announce the completion of the new Versatile Video Coding (VVC) standard at its 131st meeting. The document has been progressed to its final approval ballot as ISO/IEC 23090-3 and will also be known as H.266 in the ITU-T.

VVC Architecture (from IEEE ICME 2020 tutorial of Mathias Wien and Benjamin Bross)

VVC is the latest in a series of very successful standards for video coding that have been jointly developed with ITU-T, and it is the direct successor to the well-known and widely used High Efficiency Video Coding (HEVC) and Advanced Video Coding (AVC) standards (see architecture in the figure above). VVC provides a major benefit in compression over HEVC. Plans are underway to conduct a verification test with formal subjective testing to confirm that VVC achieves an estimated 50% bit rate reduction versus HEVC for equal subjective video quality. Test results have already demonstrated that VVC typically provides about a 40% bit rate reduction for 4K/UHD video sequences in tests using objective metrics (i.e., PSNR, VMAF, MS-SSIM). Application areas especially targeted for the use of VVC include:

  • ultra-high definition 4K and 8K video,
  • video with a high dynamic range and wide colour gamut, and
  • video for immersive media applications such as 360° omnidirectional video.

Furthermore, VVC is designed for a wide variety of types of video such as camera-captured, computer-generated, and mixed content for screen sharing, adaptive streaming, game streaming, video with scrolling text, etc. Conventional standard-definition and high-definition video content are also supported with similar gains in compression. In addition to improving coding efficiency, VVC also provides highly flexible syntax supporting such use cases as (i) subpicture bitstream extraction, (ii) bitstream merging, (iii) temporal sub-layering, and (iv) layered coding scalability.

The current performance of VVC compared to HEVC-HM is shown in the figure below, which confirms the statement above but also highlights the increased complexity. Please note that VTM9 is not optimized for speed but for functionality (i.e., compression efficiency).

Performance of VVC, VTM9 vs. HM (taken from https://bit.ly/mpeg131).

MPEG also announces completion of ISO/IEC 23002-7 “Versatile supplemental enhancement information for coded video bitstreams” (VSEI), developed jointly with ITU-T as Rec. ITU-T H.274. The new VSEI standard specifies the syntax and semantics of video usability information (VUI) parameters and supplemental enhancement information (SEI) messages for use with coded video bitstreams. VSEI is especially intended for use with VVC, although it is drafted to be generic and flexible so that it may also be used with other types of coded video bitstreams. Once specified in VSEI, different video coding standards and systems-environment specifications can re-use the same SEI messages without the need for defining special-purpose data customized to the specific usage context.

At the same time, the Media Coding Industry Forum (MC-IF) announces a VVC patent pool fostering effort, with an initial meeting on September 1, 2020. The aim of this meeting is to identify tasks and to propose a schedule for VVC pool fostering, with the goal of selecting a pool facilitator/administrator by the end of 2020. MC-IF itself is not facilitating or administering a patent pool.

At the time of writing this blog post, it is probably too early to make an assessment of whether VVC will share the fate of HEVC or AVC (w.r.t. patent pooling). AVC is still the most widely used video codec but with AVC, HEVC, EVC, VVC, LCEVC, AV1, (AV2), and probably also AVS3 — did I miss anything? — the competition and pressure are certainly increasing.

Research aspects: from a research perspective, reduction of time-complexity (for a variety of use cases) while maintaining quality and bitrate at acceptable levels is probably the most relevant aspect. Improvements in individual building blocks of VVC by using artificial neural networks (ANNs) are another area of interest but also end-to-end aspects of video coding using ANNs will probably pave the roads towards the/a next generation of video codec(s). Utilizing VVC and its features for HTTP adaptive streaming (HAS) is probably most interesting for me but maybe also for others…

MPEG promotes a Video-based Point Cloud Compression Technology to the FDIS stage

At its 131st meeting, MPEG promoted its Video-based Point Cloud Compression (V-PCC) standard to the Final Draft International Standard (FDIS) stage. V-PCC addresses lossless and lossy coding of 3D point clouds with associated attributes such as colors and reflectance. Point clouds are typically represented by extremely large amounts of data, which is a significant barrier for mass-market applications. However, the relative ease to capture and render spatial information as point clouds compared to other volumetric video representations makes point clouds increasingly popular to present immersive volumetric data. With the current V-PCC encoder implementation providing compression in the range of 100:1 to 300:1, a dynamic point cloud of one million points could be encoded at 8 Mbit/s with good perceptual quality. Real-time decoding and rendering of V-PCC bitstreams have also been demonstrated on current mobile hardware. The V-PCC standard leverages video compression technologies and the video ecosystem in general (hardware acceleration, transmission services, and infrastructure) while enabling new kinds of applications. The V-PCC standard contains several profiles that leverage existing AVC and HEVC implementations, which may make them suitable to run on existing and emerging platforms. The standard is also extensible to upcoming video specifications such as Versatile Video Coding (VVC) and Essential Video Coding (EVC).
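As a rough plausibility check of these figures, the following Python sketch estimates the uncompressed bitrate of a dynamic point cloud and compares it with the 8 Mbit/s quoted above. The frame rate (30 frames per second) and the per-point precision (10 bits per coordinate, 8 bits per colour channel) are assumptions made for this illustration; they are not stated in the text and are not taken from the V-PCC specification.

```python
def raw_bitrate_bps(points_per_frame, bits_per_point, frames_per_second):
    """Uncompressed bitrate of a dynamic point cloud in bit/s."""
    return points_per_frame * bits_per_point * frames_per_second


# Assumed figures (hypothetical, for illustration only):
BITS_PER_POINT = 3 * 10 + 3 * 8  # 10-bit x, y, z plus 8-bit R, G, B
FRAMES_PER_SECOND = 30

raw = raw_bitrate_bps(1_000_000, BITS_PER_POINT, FRAMES_PER_SECOND)
ratio = raw / 8_000_000  # compared with the 8 Mbit/s quoted above
print(f"raw: {raw / 1e6:.0f} Mbit/s, ratio: {ratio:.1f}:1")
# prints "raw: 1620 Mbit/s, ratio: 202.5:1"
```

Under these assumptions the implied ratio of roughly 200:1 falls inside the 100:1 to 300:1 range reported for the current encoder implementation; different frame rates or attribute precisions would shift the result accordingly.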

The V-PCC standard is based on Visual Volumetric Video-based Coding (V3C), which is expected to be re-used by other MPEG-I volumetric codecs under development. MPEG is also developing a standard for the carriage of V-PCC and V3C data (ISO/IEC 23090-10), which was promoted to DIS status at the 130th MPEG meeting.

By providing high-level immersiveness at currently available bandwidths, the V-PCC standard is expected to enable several types of applications and services such as six Degrees of Freedom (6 DoF) immersive media, virtual reality (VR) / augmented reality (AR), immersive real-time communication and cultural heritage.

Research aspects: as V-PCC is video-based, we can probably state similar research aspects as for video codecs such as improving efficiency both for encoding and rendering as well as reduction of time complexity. During the development of V-PCC mainly HEVC (and AVC) has/have been used but it is definitely interesting to use also VVC for PCC. Finally, the dynamic adaptive streaming of V-PCC data is still in its infancy despite some articles published here and there.

MPEG Systems related News

Finally, I’d like to share news related to MPEG systems and the carriage of video data as depicted in the figure below. In particular, the carriage of VVC (and also EVC) has now been enabled in MPEG-2 Systems (specifically within the transport stream) and in the various file formats (specifically within the NAL file format). The latter is also used in CMAF and DASH, which makes VVC (and also EVC) ready for HTTP adaptive streaming (HAS).

Carriage of Video in MPEG Systems Standards (taken from https://bit.ly/mpeg131).

What about DASH and CMAF?

CMAF maintains a so-called “technologies under consideration” document which contains — among other things — a proposed VVC CMAF profile. Additionally, there are two exploration activities related to CMAF, i.e., (i) multi-stream support and (ii) storage, archiving, and content management for CMAF files.

DASH works on potential improvements for the first amendment to ISO/IEC 23009-1 4th edition related to CMAF support, events processing model, and other extensions. Additionally, there’s a working draft for a second amendment to ISO/IEC 23009-1 4th edition enabling a bandwidth change signalling track and other enhancements. Furthermore, ISO/IEC 23009-8 (Session-based DASH operations) has been advanced to Draft International Standard (see also my last report).

An overview of the current status of MPEG-DASH can be found in the figure below.

The next meeting will again be an online meeting, in October 2020.

Finally, MPEG organized a Webinar presenting results from the 131st MPEG meeting. The slides and video recordings are available here: https://bit.ly/mpeg131.


Standards Column: VQEG

Welcome to the first column on the ACM SIGMM Records from the Video Quality Experts Group (VQEG).
VQEG is an international and independent organisation of technical experts in perceptual video quality assessment from industry, academia, and government organisations.
This column briefly introduces the mission and main activities of VQEG, establishing a starting point of a series of columns that will provide regular updates of the advances within the current ongoing projects, as well as reports of the VQEG meetings. 
The editors of these columns are Jesús Gutiérrez (jesus.gutierrez@upm.es), co-chair of the Immersive Media Group of VQEG, and Kjell Brunnström (kjell.brunnstrom@ri.se), general co-chair of VQEG. Feel free to contact them with any questions or comments, or for further information, and to check the VQEG website: www.vqeg.org.

Introduction

The Video Quality Experts Group (VQEG) was born from a need to bring together experts in subjective video quality assessment and objective quality measurement. The first VQEG meeting, held in Turin in 1997, was attended by a small group of experts drawn from ITU-T and ITU-R Study Groups. VQEG was first grounded in basic subjective methodology and objective tool development/verification for video quality assessment, so that the industry could move forward with standardization and implementation. At the beginning, it focused on measuring the perceived video quality, since the distribution paths for video and audio were limited and known.

Over the 20 years since the formation of VQEG, the ecosystem has changed dramatically, and so must the work. Multimedia is now pervasive on all devices and methods of distribution, from broadcast to cellular data networks. This shift has required the expertise within VQEG to move from the visual (no-audio) quality of video to Quality of Experience (QoE).

The march forward of technologies means that VQEG needs to react and be on the leading edge of developing, defining and deploying methods and tools that help address these new technologies and move the industry forward. This also means that we need to embrace both qualitative and quantitative ways of defining these new spaces and terms. Taking a holistic approach to QoE will enable VQEG to drive forward faster, with unprecedented collaboration and execution.

VQEG is open to all interested from industry, academia, government organizations and Standard-Developing Organizations (SDOs). There are no fees involved, no membership applications and no invitations are needed to participate in VQEG activities. Subscription to the main VQEG email list (ituvidq@its.bldrdoc.gov) constitutes membership in VQEG.

VQEG conducts work via discussions over email reflectors, regularly scheduled conference calls and, in general, two face-to-face meetings per year. There are currently more than 500 people registered across 11 email reflectors, including a main reflector for general announcements relevant to the entire group, and different project reflectors dedicated to technical discussions of specific projects. A LinkedIn group exists as well.

Objectives

The main objectives of VQEG are: 

  • To provide a forum, via email lists and face-to-face meetings, for video quality assessment experts to exchange information and work together on common goals. 
  • To formulate test plans that clearly and specifically define the procedures for performing subjective assessment tests and objective model validations.
  • To produce open source databases of multimedia material and test results, as well as software tools. 
  • To conduct subjective studies of multimedia and immersive technologies and provide a place for collaborative model development to take place.

Projects

Currently, several working groups are active within VQEG, classified under four main topics:

  1. Subjective Methods: Based on collaborative efforts to improve subjective video quality test methods.
    • Audiovisual HD (AVHD), project “Advanced Subjective Methods” (AVHD-SUB): This group investigates improved audiovisual subjective quality testing methods. This effort may lead to a revision of ITU-T Rec. P.911. As examples of its activities, the group has investigated alternative experiment designs for subjective tests, to validate subjective testing of long video sequences that are only viewed once by each subject. In addition, it conducted a joint investigation into the impact of the environment on mean opinion scores (MOS).
    • Psycho-Physiological Quality Assessment (PsyPhyQA): The aim of this project is to establish novel psychophysiology-based techniques and methodologies for video quality assessment and real-time interaction of humans with advanced video communication environments. Specifically, some of the aspects that the project is looking at include: video quality assessment based on human psychophysiology (including eye gaze, EEG, EKG, EMG, GSR, etc.), computational video quality models based on psychophysiological measurements, signal processing and machine learning techniques for psychophysiology-based video quality assessment, experimental design and methodologies for psychophysiological assessment, and correlates of psychophysics and psychophysiology. PsyPhyQA has published a dataset and test plan for a common framework for the evaluation of psychophysiological visual quality assessment.
    • Statistical Analysis Methods (SAM): This group addresses problems related to how to better analyze and improve the quality of data coming from subjective experiments and how to consider uncertainty in the development of objective media quality predictors/models. Its main goals are: to improve methods used to draw conclusions from subjective experiments, to understand the process of expressing opinion in a subjective experiment, to improve subjective experiment design to facilitate analysis and applications, to improve the analysis of objective model performance, and to revisit standardised methods for assessing the performance of objective models. 
  2. Objective Metrics: Working towards developing and validating objective video quality metrics.
    • Audiovisual HD (AVHD), project “AVHD-AS / P.NATS phase 2”: This is a joint project of VQEG and ITU-T Study Group 12 Question 14. The main goal is to develop a multitude of objective models, varying in terms of complexity, type of input, and use cases, for the assessment of video quality in HTTP/TCP-based adaptive bitrate streaming services (e.g., YouTube, Vimeo, Amazon Video, Netflix, etc.). For these services, the quality experienced by the end user is affected by video coding degradations and by delivery degradations due to initial buffering, re-buffering, and media adaptations caused by changes in bitrate, resolution, and frame rate.
    • Computer Generated Imagery (CGI): This group focuses on computer generated content for both image and video material. The main goals are: creating a large database of computer generated content, analyzing the content (feature extraction before and after rendering), analyzing the performance of objective quality metrics, evaluating existing and developing new quality metrics/models for CGI material, and studying rendering adaptation techniques (depending on the network constraints). This activity is in line with the ITU-T work item P.BBQCG (Parametric Bitstream-based Quality Assessment of Cloud Gaming Services). 
    • No Reference Metrics (NORM): This group is an open collaboration for developing no-reference metrics and methods for monitoring use-case-specific visual service quality. The NORM group is a complementary, industry-driven alternative to QoE for automatically measuring visual quality by using perceived indicators. Its main activities are to maintain a list of real-world use cases for visual quality monitoring, a list of potential algorithms and methods for no-reference MOS and/or key indicators (visual artifact detection) for each use case, a list of methods (including datasets) to train and validate the algorithms for each use case, and a list of methods to provide root cause indication for each use case. In addition, the group encourages open discussions and knowledge sharing on all aspects related to no-reference metric research and development. 
    • Joint Effort Group (JEG) – Hybrid: This group is an open collaboration working together to develop a robust Hybrid Perceptual/Bit-Stream model. It has developed and made available routines to create and capture bit-stream data and parse bit-streams into HMIX files. Efforts are underway into developing subjectively rated video quality datasets with bit-stream data that can be used by all JEG researchers. The goal is to produce one model that combines metrics developed separately by a variety of researchers. 
    • Quality Assessment for Computer Vision Applications (QACoViA): The goal of this group is to study the visual quality requirements for computer vision methods, focusing especially on: testing methodologies and frameworks to identify the limits of computer vision methods with respect to the visual quality of the ingest; the minimum quality requirements and objective visual quality measures to estimate whether visual content is in the operating region of computer vision; and delivering implementable algorithms as a proof of concept of the newly proposed objective video quality assessment methods for recognition tasks.
  3. Industry and Applications: Focused on seeking improved understanding of new video technologies and applications.
    • 5G Key Performance Indicators (5GKPI): Studies the relationship between the Key Performance Indicators (KPIs) of new communication networks (namely 5G, but extensible to others) and the QoE of the video services on top of them. With this aim, this group addresses: the definition of relevant use cases (e.g., video for industrial applications, or mobility scenarios), the study of global QoE aspects for video in mobility and industrial scenarios, the identification of the relevant network KPIs (e.g., bitrate, latency, etc.) and application-level video KPIs (e.g., picture quality, A/V sync, etc.), and the generation of open datasets for algorithm testing and training.
    • IMG (Immersive Media Group): This group researches the quality assessment of immersive media, with the main goals of generating datasets of immersive media content, validating subjective test methods, and establishing baseline quality assessment of immersive systems, providing guidelines for QoE evaluation. The technologies covered by this group include: 360-degree content, virtual/augmented/mixed reality, stereoscopic 3D content, Free Viewpoint Video, multiview technologies, light field content, etc.
  4. Support and Outreach: Responsible for the support for VQEG’s activities.
    • eLetter: The goal of the VQEG eLetter is to provide up-to-date technical advances on video quality related topics. Each issue of the VQEG eLetter features a collection of papers authored by well-known researchers. These papers are contributed by invited authors or authors responding to a call for papers, and they can be: technical papers, summaries/reviews of other publications, best practice anthologies, reprints of difficult-to-obtain articles, and responses to other articles. VQEG wants the eLetter to be interactive in nature.
    • Human Factors for Visual Experiences (HFVE): The objective of this group is to uphold the liaison relation between VQEG and the IEEE standardization group P3333.1. Some examples of the activities going on within this group are the standard for the (deep learning-based) assessment based on human factors of visual experiences with virtual/augmented/mixed reality, and the standards on human factors for the quality assessment of light field imaging (IEEE P3333.1.4) and on quality assessment of high dynamic range technologies. 
    • Independent Lab Group (ILG): The ILG act as independent arbitrators, whose generous contributions make possible the VQEG validation tests. Their goal is to ensure that all VQEG validation testing is unbiased and done to high quality standards. 
    • Joint Effort Group (JEG): is an activity within VQEG that promotes collaborative efforts addressed to: validate metrics through both subjective dataset completion and metric design, extend subjective datasets in order to better identify the limitations of quality metrics, improve subjective methodologies to address new scenarios and use cases that involve QoE issues, and increase the knowledge about both subjective and objective video quality assessment.
    • Joint Qualinet-VQEG team on Immersive Media: The objectives of this joint team from Qualinet and VQEG are: to uphold the liaison relation between both bodies, to inform both QUALINET and VQEG on the activities in respective organizations (especially on the topic of immersive media), to promote collaborations on other topics (i.e., form new joint teams), and to uphold the liaison relation with ITU-T SG12, in particular on topics around interactive, augmented and virtual reality QoE.
    • Tools and Subjective Labs Setup: The objective of this project is to provide the video quality research community with a wide variety of software tools and guidance in order to facilitate research. Tools are available in the following categories: quality analysis (software to run quality analyses), encoding (video encoding tools), streaming (streaming and extracting information from video streams), subjective test software (tools for running and analyzing subjective tests), and helper tools (miscellaneous helper tools).
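
Several of the groups above work with mean opinion scores (MOS) collected in subjective tests. As a rough illustration only, the following sketch computes the MOS of one stimulus together with a 95% confidence interval based on the common normal approximation (1.96 times the standard error). The function name and the ratings are hypothetical and are not taken from any VQEG test plan or tool.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, CI half-width) for the ratings given to one stimulus.

    Assumption: a normal approximation is acceptable, so the half-width
    is z * sample standard deviation / sqrt(number of ratings).
    """
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    mos = statistics.mean(ratings)
    half_width = z * statistics.stdev(ratings) / math.sqrt(n)
    return mos, half_width


# Hypothetical 5-point ACR ratings from eight subjects for one sequence.
ratings = [4, 5, 3, 4, 4, 5, 3, 4]
mos, ci = mos_with_ci(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

With few subjects, a Student's t quantile would be a more cautious choice than the fixed factor 1.96 used in this sketch.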

In addition, the Intersector Rapporteur Group on Audiovisual Quality Assessment (IRG-AVQA) studies topics related to video and audiovisual quality assessment (both subjective and objective) among ITU-R Study Group 6 and ITU-T Study Group 12. VQEG colocates meetings with the IRG-AVQA to encourage a wider range of experts to contribute to Recommendations. 

For more details and previous closed projects please check: https://www.its.bldrdoc.gov/vqeg/projects-home.aspx

Major achievements

VQEG activities are documented in reports and submitted to relevant ITU Study Groups (e.g., ITU-T SG9, ITU-T SG12, ITU-R WP6C), and other SDOs as appropriate. Several VQEG studies have resulted in ITU Recommendations.

VQEG Project | Description | ITU Recommendations
Full Reference Television (FRTV) Phase I | Examined the performance of FR and NR models on standard definition video. The test materials used in this test plan and the subjective tests data are freely available to researchers. | ITU-T J.143 (2000), ITU-T J.144 (2001), ITU-T J.149 (2004)
Full Reference Television (FRTV) Phase II | Examined the performance of FR and NR models on standard definition video, using the DSCQS methodology. | ITU-T J.144 (2004), ITU-R BT.1683 (2004)
Multimedia (MM) Phase I | Examined the performance of FR, RR and NR models for VGA, CIF and QCIF video (no audio). | ITU-T J.148 (2003), ITU-T P.910 (2008), ITU-T J.246 (2008), ITU-T J.247 (2008), ITU-T J.340 (2010), ITU-R BT.1683 (2004)
Reduced Reference / No Reference Television (RRNR-TV) | Examined the performance of RR and NR models on standard definition video. | ITU-T J.244 (2008), ITU-T J.249 (2010), ITU-R BT.1885 (2011)
High Definition Television (HDTV) | Examined the performance of FR, RR and NR models for HDTV. Some of the video sequences used in this test are publicly available in the Consumer Digital Video Library. | ITU-T J.341 (2011), ITU-T J.342 (2011)
QART | Studied the subjective quality evaluation of video used for recognition tasks and task-based multimedia applications. | ITU-T P.912 (2008)
Hybrid Perceptual Bitstream | Examined the performance of Hybrid models for VGA/WVGA and HDTV. | ITU-T J.343 (2014), ITU-T J.343.1-6 (2014)
3DTV | Investigated how to assess 3DTV subjective video quality, covering methodologies, display requirements and evaluation of visual discomfort and fatigue. | ITU-T P.914 (2016), ITU-T P.915 (2016), ITU-T P.916 (2016)
Audiovisual HD (AVHD) | On one side, addressed the subjective evaluation of audio-video quality metrics. On the other side, developed model standards for video quality assessment of streaming services over reliable transport for resolutions up to 4K/UHD, in collaboration with ITU-T SG12. | ITU-T P.913 (2014), ITU-T P.1204 (2020), ITU-T P.1204.3 (2020), ITU-T P.1204.4 (2020), ITU-T P.1204.5 (2020)

The contribution to current ITU standardization efforts is still ongoing. For example, updated texts have been contributed by VQEG on statistical analysis in ITU-T Rec. P.1401, and on subjective quality assessment of 360-degree video in ITU-T P.360-VR. 
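
The statistical analysis of objective models typically compares model predictions against subjective MOS values. As a hedged sketch of two commonly used measures, the code below computes the Pearson correlation coefficient and a plain root-mean-square error (dividing by the number of samples). The data and function names are hypothetical; the exact definitions used in a given Recommendation, such as a degrees-of-freedom correction or a prior mapping of predictions to the MOS scale, may differ.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    dev_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    dev_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (dev_x * dev_y)


def rmse(x, y):
    """Plain root-mean-square error (assumption: denominator is N)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


# Hypothetical subjective MOS values and model predictions for five sequences.
mos = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.2, 1.9, 3.4, 3.8, 4.9]
print(f"PCC = {pearson(mos, predicted):.3f}, RMSE = {rmse(mos, predicted):.3f}")
```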

Apart from this, VQEG is supporting research on QoE by providing tools and datasets to the research community. For instance, it is worth noting the wide variety of software tools and guidance provided by VQEG Tools and Subjective Labs Setup via GitHub in order to facilitate research. Another example is the VQEG Image Quality Evaluation Tool (VIQET), which is an objective no-reference photo quality evaluation tool. Finally, several datasets have been published, which can be found on the websites of the corresponding projects, in the Consumer Digital Video Library or in other repositories.

For the interested reader, general articles about the work of VQEG, especially covering its previous work, can be found in [1, 2].

References

[1] Q. Huynh-Thu, A. Webster, K. Brunnström, and M. Pinson, “VQEG: Shaping Standards on Video Quality”, in 1st International Conference on Advanced Imaging, Tokyo, Japan, 2015.
[2] K. Brunnström, D. Hands, F. Speranza, and A. Webster, “VQEG Validation and ITU Standardisation of Objective Perceptual Video Quality Metrics”, IEEE Signal Processing Magazine, vol. 26, no. 3, pp. 96-101, May 2009.

JPEG Column: 87th JPEG Meeting

The 87th JPEG meeting, initially planned to be held in Erlangen, Germany, was held online from 25 to 30 April 2020 because of the Covid-19 outbreak. JPEG experts participated in a number of online meetings, attempting to make them as effective as possible while considering participation from different time zones, ranging from Australia to California, U.S.A.

JPEG decided to proceed with a Second Call for Evidence on JPEG Pleno Point Cloud Coding and continued work to prepare for contributions to the previous Call for Evidence on Learning-based Image Coding Technologies (JPEG AI).

The 87th JPEG meeting had the following highlights:

  • JPEG Pleno Point Cloud Coding issues a Call for Evidence on coding solutions supporting scalability and random access of decoded point clouds.
  • JPEG AI defines evaluation methodologies of the Call for Evidence on machine learning based image coding solutions.
  • JPEG XL defines the file format compatible with existing formats. 
  • JPEG exploration on Media Blockchain releases use cases and requirements.
  • JPEG Systems releases a first version of JPEG Snack use cases and requirements.
  • JPEG XS announces significant improvement of the quality of raw-Bayer image sensor data compression.

JPEG Pleno Point Cloud

JPEG Pleno is working towards the integration of various modalities of plenoptic content under a single and seamless framework. Efficient and powerful point cloud representation is a key feature within this vision. Point cloud data supports a wide range of applications including computer-aided manufacturing, entertainment, cultural heritage preservation, scientific research and advanced sensing and analysis. During the 87th JPEG meeting, the JPEG Committee released a Second Call for Evidence on JPEG Pleno Point Cloud Coding that focuses specifically on point cloud coding solutions supporting scalability and random access of decoded point clouds. The Second Call for Evidence on JPEG Pleno Point Cloud Coding has a revised timeline reflecting changes in the activity due to the 2020 COVID-19 Pandemic. A Final Call for Evidence on JPEG Pleno Point Cloud Coding is planned to be released in July 2020.

JPEG AI

The main focus of JPEG AI was on the promotion and definition of the submission and evaluation methodologies of the Call for Evidence (in coordination with the IEEE MMSP 2020 Challenge) that was issued as an outcome of the 86th JPEG meeting, Sydney, Australia.

JPEG XL

The File Format has been defined for JPEG XL (ISO/IEC 18181-1) codestream, metadata and extensions. The file format enables compatibility with ISOBMFF, JUMBF, XMP, Exif and other existing standards. Standardization has now reached the Committee Draft stage and the DIS ballot is ongoing. A white paper about JPEG XL’s features and tools was approved at this meeting and is available on the jpeg.org website.

JPEG exploration on Media Blockchain – Call for feedback on use cases and requirements

JPEG has determined that blockchain and distributed ledger technologies (DLT) have great potential as a technology component to address many privacy and security related challenges in digital media applications. These include digital rights management, privacy and security, integrity verification, and authenticity, which impact society in several ways, including the loss of income in the creative sector due to piracy, the spread of fake news, or evidence tampering for fraud purposes.

JPEG is exploring standardization needs related to media blockchain to ensure seamless interoperability and integration of blockchain technology with widely accepted media standards. In this context, the JPEG Committee announces a call for feedback from interested stakeholders on the first public release of the use cases and requirements document.

JPEG Systems initiates standardisation of JPEG Snack

Media “snacking”, the consumption of multimedia in short bursts (less than 15 minutes), has become globally popular. JPEG recognizes the need for standardizing how snack images are constructed to ensure interoperability. A first version of JPEG Snack use cases and requirements is now complete and publicly available on the JPEG website, inviting feedback from stakeholders.

JPEG made progress on a fundamental capability of the JPEG file structure with enhancements to JPEG Universal Metadata Box Format (JUMBF) to support embedding common file types; the DIS text for JUMBF Amendment 1 is ready for ballot. Likewise JPEG 360 Amendment 1 DIS text is ready for ballot; this amendment supports stereoscopic 360 degree images, accelerated rendering for regions-of-interest, and removes the XMP signature block from the metadata description.

JPEG XS – The JPEG committee is pleased to announce significant improvement of the quality of its upcoming Bayer compression.

Over the past year, an improvement of around 2 dB has been observed for the new coding tools currently being developed for image sensor compression within JPEG XS. This visually lossless, low-latency and lightweight compression scheme can be used as a mezzanine codec in various markets, such as real-time video storage inside and outside of cameras and data compression onboard autonomous cars. Mathematically lossless capability is also being investigated, and encapsulation within MXF or SMPTE ST2110-22 is currently being finalized.

Final Quote

“JPEG is committed to the development of new standards that provide state of the art imaging solutions to the largest spectrum of stakeholders. During the 87th meeting, held online because of the Covid-19 pandemic, JPEG progressed well with its current and even launched new activities. Although some timelines had to be revisited, overall, no disruptions of the workplan have occurred.” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JPEG, JPEG 2000, JPEG XR, JPSearch, JPEG XT and more recently, the JPEG XS, JPEG Systems, JPEG Pleno and JPEG XL families of imaging standards.

More information about JPEG and its work is available at jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

Future JPEG meetings are planned as follows:

  • No 88, initially planned in Geneva, Switzerland, July 4 to 10, 2020, will be held online from July 7 to 10, 2020.

MPEG Column: 130th MPEG Meeting (virtual/online)

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The 130th MPEG meeting concluded on April 24, 2020, in Alpbach, Austria … well, not exactly, unfortunately. The 130th MPEG meeting concluded on April 24, 2020, but not in Alpbach, Austria.

I attended the 130th MPEG meeting remotely.

Because of the Covid-19 pandemic, the 130th MPEG meeting was converted from a physical meeting to a fully online meeting, the first in MPEG’s 30+ years of history. Approximately 600 experts attending from 19 time zones worked in tens of Zoom meeting sessions supported by an online calendar and by collaborative tools that involved MPEG experts in both online and offline sessions. For example, input contributions had to be registered and uploaded ahead of the meeting to allow for efficient scheduling of two-hour meeting slots, which were distributed from early morning to late night in order to accommodate experts working in different time zones. These input contributions were then mapped to GitLab issues for offline discussions, and the actual meeting slots were primarily used for organizing the meeting, resolving conflicts, and making decisions, including approving output documents. Although the productivity of the online meeting could not reach the level of regular face-to-face meetings, the results posted in the press release show that MPEG experts managed the challenge quite well, specifically:

  • MPEG ratifies MPEG-5 Essential Video Coding (EVC) standard;
  • MPEG issues the Final Draft International Standards for parts 1, 2, 4, and 5 of MPEG-G 2nd edition;
  • MPEG expands the coverage of ISO Base Media File Format (ISOBMFF) family of standards;
  • A new standard for large scale client-specific streaming with MPEG-DASH.

Other important activities at the 130th MPEG meeting: (i) the carriage of visual volumetric video-based coding data, (ii) Network-Based Media Processing (NBMP) function templates, (iii) the conversion from MPEG-21 contracts to smart contracts, (iv) deep neural network-based video coding, (v) Low Complexity Enhancement Video Coding (LCEVC) reaching DIS stage, and (vi) a new level of the MPEG-4 Audio ALS Simple Profile for high-resolution audio, among others.

The corresponding press release of the 130th MPEG meeting can be found here: https://mpeg.chiariglione.org/meetings/130. This report focuses on video coding (EVC) and systems aspects (file format, DASH).

MPEG ratifies MPEG-5 Essential Video Coding Standard

At its 130th meeting, MPEG announced the completion of the new ISO/IEC 23094-1 standard, which is referred to as MPEG-5 Essential Video Coding (EVC) and has been promoted to Final Draft International Standard (FDIS) status. There is a constant demand for more efficient video coding technologies (e.g., due to the increased usage of video on the internet), but coding efficiency is not the only factor determining the industry’s choice of video coding technology for products and services. The EVC standard offers improved compression efficiency compared to existing video coding standards and is based on the statements of all contributors to the standard, who have committed to announcing their license terms for the MPEG-5 EVC standard no later than two years after the FDIS publication date.

The MPEG-5 EVC standard defines two profiles, the “Baseline Profile” and the “Main Profile”. The “Baseline Profile” contains only technologies that are older than 20 years or otherwise freely available for use in the standard. The “Main Profile” adds a small number of additional tools, each of which can be either cleanly disabled or switched to the corresponding baseline tool on an individual basis.

It will be interesting to see how the EVC profiles (baseline and main) will find their way into products and services given the number of codecs already in use (e.g., AVC, HEVC, VP9, AV1) and those still under development but close to ratification (e.g., VVC, LCEVC). That is, in total, we may end up with about seven video coding formats that probably need to be considered for future video products and services. In other words, the multi-codec scenario I envisioned some time ago is becoming reality, raising some interesting challenges to be addressed in the future.

Research aspects: as for all video coding standards, the most important research aspect is certainly coding efficiency. For EVC, it might also be interesting to research the usability of its built-in tool switching mechanism within a practical setup. Furthermore, regarding the multi-codec issue, the ratification of EVC adds another facet to the video coding standards already in use and/or under development.

MPEG expands the Coverage of ISO Base Media File Format (ISOBMFF) Family of Standards

At the 130th WG11 (MPEG) meeting, the ISOBMFF family of standards has been significantly amended with new tools and functionalities. The standards in question are as follows:

  • ISO/IEC 14496-12: ISO Base Media File Format;
  • ISO/IEC 14496-15: Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format;
  • ISO/IEC 23008-12: Image File Format; and
  • ISO/IEC 23001-16: Derived visual tracks in the ISO base media file format.

In particular, three new amendments to the ISOBMFF family have reached their final milestone, i.e., Final Draft Amendment (FDAM):

  1. Amendment 4 to ISO/IEC 14496-12 (ISO Base Media File Format) allows the use of a more compact version of metadata for movie fragments;
  2. Amendment 1 to ISO/IEC 14496-15 (Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format) adds support of HEVC slice segment data track and additional extractor types for HEVC such as track reference and track groups; and
  3. Amendment 2 to ISO/IEC 23008-12 (Image File Format) adds support for more advanced features related to the storage of short image sequences such as burst and bracketing shots.
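
For readers less familiar with the container, ISOBMFF files are sequences of boxes, each starting with a 32-bit big-endian size and a four-character type (a size of 1 signals a following 64-bit size, and a size of 0 means the box runs to the end of the file). Movie fragments, for instance, appear as `moof` boxes followed by `mdat` boxes. The sketch below only walks the top-level boxes of a synthetic byte stream; the helper names and the dummy payloads are assumptions for illustration, and it does not parse the compact movie fragment metadata introduced by the amendment.

```python
import io
import struct


def iter_boxes(stream):
    """Yield (box_type, payload_size) for each top-level box in a binary stream."""
    while True:
        header = stream.read(8)
        if len(header) < 8:
            return  # end of stream (or truncated header)
        size, box_type = struct.unpack(">I4s", header)
        name = box_type.decode("latin-1")
        header_len = 8
        if size == 1:
            # A 64-bit "largesize" field follows the compact header.
            size = struct.unpack(">Q", stream.read(8))[0]
            header_len = 16
        elif size == 0:
            # Box extends to the end of the stream.
            yield name, len(stream.read())
            return
        payload_size = size - header_len
        if payload_size < 0:
            raise ValueError(f"invalid size for box {name!r}")
        stream.seek(payload_size, io.SEEK_CUR)  # skip the payload
        yield name, payload_size


def make_box(box_type, payload):
    """Build a box with a compact 32-bit size (hypothetical test helper)."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


# Synthetic segment with dummy payloads, not a playable file.
data = make_box(b"styp", b"msdh") + make_box(b"moof", bytes(16)) + make_box(b"mdat", bytes(32))
for name, size in iter_boxes(io.BytesIO(data)):
    print(name, size)
```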

At the same time, new amendments have reached their first milestone, i.e., Committee Draft Amendment (CDAM):

  1. Amendment 2 to ISO/IEC 14496-15 (Carriage of network abstraction layer (NAL) unit structured video in the ISO base media file format) extends its scope to newly developed video coding standards such as Essential Video Coding (EVC) and Versatile Video Coding (VVC); and
  2. the first edition of ISO/IEC 23001-16 (Derived visual tracks in the ISO base media file format) allows a new type of visual track whose content can be dynamically generated at the time of presentation by applying some operations to the content in other tracks, such as crossfading over two tracks.

Both are expected to reach their final milestone in mid-2021.
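
To make the idea of a derived visual track more concrete, the following sketch shows what "generating content at presentation time" could mean for a crossfade between two tracks: the output frame is a weighted mix of two input frames. This is purely illustrative; the function, the flat list-of-samples frame representation, and the linear blend are assumptions and do not reflect the syntax or operations specified in ISO/IEC 23001-16.

```python
def crossfade(frame_a, frame_b, alpha):
    """Blend two frames sample by sample.

    alpha = 0.0 returns frame_a, alpha = 1.0 returns frame_b.
    Frames are flat lists of 8-bit sample values (hypothetical representation).
    """
    if len(frame_a) != len(frame_b):
        raise ValueError("frames must have the same number of samples")
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    return [round((1.0 - alpha) * a + alpha * b) for a, b in zip(frame_a, frame_b)]


# A player would compute the derived frame on demand instead of storing it.
track_a_frame = [0, 100, 200]
track_b_frame = [200, 100, 0]
print(crossfade(track_a_frame, track_b_frame, 0.25))
```

The point of the design is that only the source tracks and the description of the operation need to be stored, while the blended samples are produced when the content is presented.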

Finally, the final text for the ISO/IEC 14496-12 6th edition Final Draft International Standard (FDIS) is now ready for the ballot after converting MP4RA to the Maintenance Agency. WG11 (MPEG) notes that Apple Inc. has been appointed as the Maintenance Agency, and MPEG appreciates its valuable efforts over the many years in which it has already acted as the official registration authority for the ISOBMFF family of standards, i.e., MP4RA (https://mp4ra.org/). The 6th edition of ISO/IEC 14496-12 is expected to be published by ISO by the end of this year.

Research aspects: the ISOBMFF family of standards basically offers certain tools and functionalities to satisfy the given use case requirements. The task of the multimedia systems research community could be to scientifically validate these tools and functionalities with respect to the use cases and maybe even beyond, e.g., try to adopt these tools and functionalities for novel applications and services.

A New Standard for Large Scale Client-specific Streaming with DASH

Historically, in ISO/IEC 23009 (Dynamic Adaptive Streaming over HTTP; DASH), every client has used the same Media Presentation Description (MPD) as it best serves the scalability of the service (e.g., for efficient caching in content delivery networks). However, there have been increasing requests from the industry to enable customized manifests for more personalized services. Consequently, MPEG has studied a solution to this problem without sacrificing scalability, and it has reached the first milestone of its standardization at the 130th MPEG meeting.

ISO/IEC 23009-8 adds a mechanism to the Media Presentation Description (MPD) to refer to another document, called Session-based Description (SBD), which allows per-session information. The DASH client can use this information (i.e., variables and their values) provided in the SBD to derive the URLs for HTTP GET requests. This standard is expected to reach its final milestone in mid-2021.
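The idea of deriving request URLs from per-session values can be illustrated with a minimal Python sketch. It does not reproduce the SBD syntax of ISO/IEC 23009-8; the function name `personalize_url`, the example keys, and the example URL are hypothetical, and the sketch simply assumes that the client has already read a set of key/value pairs from its SBD and appends them to a segment URL taken from the shared MPD.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def personalize_url(segment_url: str, session_values: dict) -> str:
    """Append per-session key/value pairs to a shared segment URL.

    Hypothetical helper: the real SBD processing model is defined in
    ISO/IEC 23009-8 and is not reproduced here.
    """
    parts = urlsplit(segment_url)
    # Keep any query parameters that are already part of the shared URL.
    query = parse_qsl(parts.query, keep_blank_values=True)
    query.extend(session_values.items())
    return urlunsplit(parts._replace(query=urlencode(query)))


# Assumed example values; in practice they would come from the SBD document.
session_values = {"sessionId": "abc123", "region": "eu"}
shared_url = "https://cdn.example.com/video/seg-42.m4s?v=1"
print(personalize_url(shared_url, session_values))
```

In this sketch the MPD and the URL path stay identical for every client, so they remain cacheable, while only the query string carries the session-specific part.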

Research aspects: SBD’s goal is to enable personalization while maintaining scalability, which calls for a tradeoff, i.e., which kind of information to put into the MPD and what should be conveyed within the SBD. This tradeoff per se could already be considered a research question that will hopefully be addressed in the near future.

An overview of the current status of MPEG-DASH can be found in the figure below.

The next MPEG meeting will be from June 29th to July 3rd and will be again an online meeting. I am looking forward to a productive AhG period and an online meeting later this year. I am sure that MPEG will further improve its online meeting capabilities and can certainly become a role model for other groups within ISO/IEC and probably also beyond.

MPEG Column: 129th MPEG Meeting in Brussels, Belgium

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The 129th MPEG meeting concluded on January 17, 2020 in Brussels, Belgium with the following topics:

  • Coded representation of immersive media – WG11 promotes Network-Based Media Processing (NBMP) to the final stage
  • Coded representation of immersive media – Publication of the Technical Report on Architectures for Immersive Media
  • Genomic information representation – WG11 receives answers to the joint call for proposals on genomic annotations in conjunction with ISO TC 276/WG 5
  • Open font format – WG11 promotes Amendment of Open Font Format to the final stage
  • High efficiency coding and media delivery in heterogeneous environments – WG11 progresses Baseline Profile for MPEG-H 3D Audio
  • Multimedia content description interface – Conformance and Reference Software for Compact Descriptors for Video Analysis promoted to the final stage

Additional Important Activities at the 129th WG 11 (MPEG) meeting

The 129th WG 11 (MPEG) meeting was attended by more than 500 experts from 25 countries working on important activities including (i) a scene description for MPEG media, (ii) the integration of Video-based Point Cloud Compression (V-PCC) and Immersive Video (MIV), (iii) Video Coding for Machines (VCM), and (iv) a draft call for proposals for MPEG-I Audio among others.

The corresponding press release of the 129th MPEG meeting can be found here: https://mpeg.chiariglione.org/meetings/129. This report focused on network-based media processing (NBMP), architectures of immersive media, compact descriptors for video analysis (CDVA), and an update about adaptive streaming formats (i.e., DASH and CMAF).

MPEG picture at Friday plenary; © Rob Koenen (Tiledmedia).

Coded representation of immersive media – WG11 promotes Network-Based Media Processing (NBMP) to the final stage

At its 129th meeting, MPEG promoted ISO/IEC 23090-8, Network-Based Media Processing (NBMP), to Final Draft International Standard (FDIS). The FDIS stage is the final vote before a document is officially adopted as an International Standard (IS). During the FDIS vote, publications and national bodies are only allowed to place a Yes/No vote and are no longer able to make any technical changes. However, project editors are able to fix typos and make other necessary editorial improvements.

What is NBMP? The NBMP standard defines a framework that allows content and service providers to describe, deploy, and control media processing for their content in the cloud by using libraries of pre-built 3rd party functions. The framework includes an abstraction layer to be deployed on top of existing commercial cloud platforms and is designed to be able to be integrated with 5G core and edge computing. The NBMP workflow manager is another essential part of the framework enabling the composition of multiple media processing tasks to process incoming media and metadata from a media source and to produce processed media streams and metadata that are ready for distribution to media sinks.
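The composition of media processing tasks into a workflow can be sketched in a few lines of Python. This is only an illustration of the concept and not the NBMP API: the names `build_workflow`, `downscale`, and `tag_language`, as well as the dictionary used as a stand-in for media plus metadata, are assumptions made for this example.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for one unit of media together with its metadata.
Frame = dict
Task = Callable[[Frame], Frame]


def build_workflow(tasks: List[Task]) -> Callable[[Iterable[Frame]], List[Frame]]:
    """Compose processing tasks into one pipeline from source to sink."""

    def run(source: Iterable[Frame]) -> List[Frame]:
        processed = []
        for frame in source:
            # Apply every task in order, feeding each output to the next task.
            for task in tasks:
                frame = task(frame)
            processed.append(frame)
        return processed

    return run


def downscale(frame: Frame) -> Frame:
    # Example media task: halve the vertical resolution.
    return {**frame, "height": frame["height"] // 2}


def tag_language(frame: Frame) -> Frame:
    # Example metadata task: attach a language tag.
    return {**frame, "lang": "en"}


workflow = build_workflow([downscale, tag_language])
result = workflow([{"id": 1, "height": 1080}, {"id": 2, "height": 720}])
print(result)
```

In an actual deployment the tasks would be pre-built functions running in the cloud and the workflow manager would place and connect them; the sketch only shows the chaining of tasks between a media source and a media sink.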

Why NBMP? With the increasing complexity and sophistication of media services and the incurred media processing, offloading complex media processing operations to the cloud/network is becoming critically important in order to keep receiver hardware simple and power consumption low.

Research aspects: NBMP reminds me a bit of what has been done in the past in MPEG-21, specifically Digital Item Adaptation (DIA) and Digital Item Processing (DIP). The main difference is that MPEG now targets APIs rather than pure metadata formats, which is a step in the right direction as APIs can be implemented and used right away. NBMP will be particularly interesting in the context of new networking approaches including, but not limited to, software-defined networking (SDN), information-centric networking (ICN), mobile edge computing (MEC), fog computing, and related aspects in the context of 5G.

Coded representation of immersive media – Publication of the Technical Report on Architectures for Immersive Media

At its 129th meeting, WG11 (MPEG) published an updated version of its technical report on architectures for immersive media. This technical report, which is the first part of the ISO/IEC 23090 (MPEG-I) suite of standards, introduces the different phases of MPEG-I standardization and gives an overview of the parts of the MPEG-I suite. It also documents use cases and defines architectural views on the compression and coded representation of elements of immersive experiences. Furthermore, it describes the coded representation of immersive media and the delivery of a full, individualized immersive media experience. MPEG-I enables scalable and efficient individual delivery as well as mass distribution while adjusting to the rendering capabilities of consumption devices. Finally, this technical report breaks down the elements that contribute to a fully immersive media experience and assigns quality requirements as well as quality and design objectives for those elements.

Research aspects: This technical report provides a kind of reference architecture for immersive media, which may help identify research areas and research questions to be addressed in this context.

Multimedia content description interface – Conformance and Reference Software for Compact Descriptors for Video Analysis promoted to the final stage

Managing and organizing the quickly increasing volume of video content is a challenge for many industry sectors, such as media and entertainment or surveillance. One example task is scalable instance search, i.e., finding content containing a specific object instance or location in a very large video database. This requires video descriptors that can be efficiently extracted, stored, and matched. Standardization enables extracting interoperable descriptors on different devices and using software from different providers so that only the compact descriptors instead of the much larger source videos can be exchanged for matching or querying. ISO/IEC 15938-15:2019 – the MPEG Compact Descriptors for Video Analysis (CDVA) standard – defines such descriptors. CDVA includes highly efficient descriptor components using features resulting from a Deep Neural Network (DNN) and uses predictive coding over video segments. The standard is being adopted by the industry. At its 129th meeting, WG11 (MPEG) has finalized the conformance guidelines and reference software. The software provides the functionality to extract, match, and index CDVA descriptors. For easy deployment, the reference software is also provided as Docker containers.

Research aspects: The availability of reference software helps to conduct reproducible research (i.e., reference software is typically publicly available for free) and the Docker container even further contributes to this aspect.

DASH and CMAF

The 4th edition of DASH has already been published and is available as ISO/IEC 23009-1:2019. Similar to previous iterations, MPEG’s goal was to make the newest edition of DASH publicly available for free, with the aim of industry-wide adoption and adaptation. During the most recent MPEG meeting, we worked towards implementing the first amendment which will include additional (i) CMAF support and (ii) event processing models with minor updates; these amendments are currently in draft and will be finalized at the 130th MPEG meeting in Alpbach, Austria. An overview of all DASH standards and updates is depicted in the figure below:

ISO/IEC 23009-8 or “session-based DASH operations” is the newest variation of MPEG-DASH. The goal of this part of DASH is to allow customization during certain times of a DASH session while maintaining the underlying media presentation description (MPD) for all other sessions. Thus, MPDs should be cacheable within content distribution networks (CDNs) while additional information should be customizable on a per session basis within a newly added session-based description (SBD). It is understood that the SBD should have an efficient representation to avoid file size issues and it should not duplicate information typically found in the MPD.

The 2nd edition of the CMAF standard (ISO/IEC 23000-19) will be available soon (currently under FDIS ballot) and MPEG is currently reviewing additional tools in the so-called ‘technologies under consideration’ document. Therefore, amendments were drafted for additional HEVC media profiles and exploration activities on the storage and archiving of CMAF contents.

The next meeting will bring MPEG back to Austria (for the 4th time) and will be hosted in Alpbach, Tyrol. For more information about the upcoming 130th MPEG meeting click here.


JPEG Column: 86th JPEG Meeting in Sydney, Australia

The 86th JPEG meeting was held in Sydney, Australia.

Among the different activities that took place, the JPEG Committee issued a Call for Evidence on learning-based image coding solutions. This call results from the success of the exploration studies recently carried out by the JPEG Committee, and honours the pioneering work of JPEG issuing the first image coding standard more than 25 years ago.

In addition, a First Call for Evidence on Point Cloud Coding was issued in the framework of JPEG Pleno. Furthermore, an updated version of the JPEG Pleno reference software and a JPEG XL open source implementation have been released, while JPEG XS continues the development of raw-Bayer image sensor compression.

JPEG Plenary at the 86th meeting.

The 86th JPEG meeting had the following highlights:

  • JPEG AI issues a call for evidence on machine learning based image coding solutions
  • JPEG Pleno issues call for evidence on Point Cloud coding
  • JPEG XL verification tests reveal competitive performance with commonly used image coding solutions
  • JPEG Systems submitted final texts for Privacy & Security
  • JPEG XS announces new coding tools optimised for compression of raw-Bayer image sensor data

JPEG AI

The JPEG Committee launched a learning-based image coding activity more than a year ago, also referred to as JPEG AI. This activity aims to find evidence for image coding technologies that offer substantially better compression efficiency than conventional approaches while relying on models that exploit a large image database.

A Call for Evidence (CfE) has been issued as an outcome of the 86th JPEG meeting in Sydney, Australia, as a first formal step to consider standardisation of such approaches in image compression. The CfE is organised in coordination with the IEEE MMSP 2020 Grand Challenge on Learning-based Image Coding Challenge and will use the same content, evaluation methodologies and deadlines.

JPEG Pleno

JPEG Pleno is working toward the integration of various modalities of plenoptic content under a single framework and in a seamless manner. Efficient and powerful point cloud representation is a key feature within this vision.  Point cloud data supports a wide range of applications including computer-aided manufacturing, entertainment, cultural heritage preservation, scientific research and advanced sensing and analysis. During the 86th JPEG Meeting, the JPEG Committee released a First Call for Evidence on JPEG Pleno Point Cloud Coding to be integrated in the JPEG Pleno framework.  This Call for Evidence focuses specifically on point cloud coding solutions that support scalability and random access of decoded point clouds.

Furthermore, a Reference Software implementation of the JPEG Pleno file format (Part 1) and light field coding technology (Part 2) is made publicly available as open source on the JPEG Gitlab repository (https://gitlab.com/wg1). The JPEG Pleno Reference Software is planned to become an International Standard as Part 4 of JPEG Pleno by the end of 2020.

JPEG XL

The JPEG XL Image Coding System (ISO/IEC 18181) has produced an open source reference implementation available on the JPEG Gitlab repository (https://gitlab.com/wg1/jpeg-xl). The software is available under Apache 2, which includes a royalty-free patent grant. Speed tests indicate that the multithreaded encoder and decoder outperform libjpeg-turbo.

Independent subjective and objective evaluation experiments have indicated competitive performance with commonly used image coding solutions while offering new functionalities such as lossless transcoding from legacy JPEG format to JPEG XL. The standardisation process has reached the Draft International Standard stage.

JPEG exploration into Media Blockchain

Fake news, copyright violations, media forensics, privacy and security are emerging challenges in digital media. JPEG has determined that blockchain and distributed ledger technologies (DLT) have great potential as a technology component to address these challenges in transparent and trustable media transactions. However, blockchain and DLT need to be integrated efficiently with a widely adopted standard to ensure broad interoperability of protected images. Therefore, the JPEG committee has organised several workshops to engage with the industry and help to identify use cases and requirements that will drive the standardisation process.

During its Sydney meeting, the committee organised an Open Discussion Session on Media Blockchain and invited local stakeholders to take part in an interactive discussion. The discussion focused on media blockchain and related application areas including media and document provenance, smart contracts, governance, legal understanding and privacy. The presentations of this session are available on the JPEG website. To keep informed and to get involved in this activity, interested parties are invited to register to the ad hoc group’s mailing list.

JPEG Systems

JPEG Systems & Integration submitted final texts for ISO/IEC 19566-4 (Privacy & Security), ISO/IEC 24800-2 (JPSearch), and ISO/IEC 15444-16 2nd edition (JPEG 2000-in-HEIF) for publication.  Amendments to add new capabilities for JUMBF and JPEG 360 reached Committee Draft stage and will be reviewed and balloted by national bodies.

The JPEG Privacy & Security release is timely as consumers are increasingly aware of and concerned about the need to protect privacy in imaging applications. The JPEG 2000-in-HEIF enables embedding JPEG 2000 images in the HEIF file format. The updated JUMBF provides a more generic means to embed images and other media within JPEG files to enable richer image experiences. The updated JPEG 360 adds stereoscopic 360 images, and a method to accelerate the rendering of a region-of-interest within an image in order to reduce the latency experienced by users. The JPEG Systems & Integration JLINK activity, which elaborates the relationships of the embedded media within the file, created updated use cases to refine the requirements, and continued technical discussions on implementation.

JPEG XS

The JPEG committee is pleased to announce the specification of new coding tools optimised for compression of raw-Bayer image sensor data. The JPEG XS project aims at the standardisation of a visually lossless, low-latency and lightweight compression scheme that can be used as a mezzanine codec in various markets. Video transport over professional video links, real-time video storage in and outside of cameras, and data compression onboard of autonomous cars are among the targeted use cases for raw-Bayer image sensor compression. An amendment of the Core Coding System, together with new profiles targeting raw-Bayer image applications, is ongoing and expected to be published by the end of 2020.

Final Quote

“The efforts to find new and improved solutions in image compression have led JPEG to explore new opportunities relying on machine learning for coding. After rigorous analysis in form of explorations during the last 12 months, JPEG believes that it is time to formally initiate a standardisation process, and consequently, has issued a call for evidence for image compression based on machine learning.” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

86th JPEG meeting social event in Sydney, Australia.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JPEG, JPEG 2000, JPEG XR, JPSearch, JPEG XT and more recently, the JPEG XS, JPEG Systems, JPEG Pleno and JPEG XL families of imaging standards.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup. If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

Future JPEG meetings are planned as follows:

  • No 87, Erlangen, Germany, April 25 to 30, 2020 (Cancelled because of Covid-19 outbreak; Replaced by online meetings.)
  • No 88, Geneva, Switzerland, July 4 to 10, 2020

MPEG Column: 128th MPEG Meeting in Geneva, Switzerland

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The 128th MPEG meeting concluded on October 11, 2019 in Geneva, Switzerland with the following topics:

  • Low Complexity Enhancement Video Coding (LCEVC) Promoted to Committee Draft
  • 2nd Edition of Omnidirectional Media Format (OMAF) has reached the first milestone
  • Genomic Information Representation – Part 4 Reference Software and Part 5 Conformance Promoted to Draft International Standard

The corresponding press release of the 128th MPEG meeting can be found here: https://mpeg.chiariglione.org/meetings/128. In this report we will focus on video coding aspects (i.e., LCEVC) and immersive media applications (i.e., OMAF). At the end, we will provide an update related to adaptive streaming (i.e., DASH and CMAF).

Low Complexity Enhancement Video Coding

Low Complexity Enhancement Video Coding (LCEVC) has been promoted to committee draft (CD), which is the first milestone in the ISO/IEC standardization process. LCEVC is part two of MPEG-5, or ISO/IEC 23094-2 if you prefer the always easy-to-remember ISO codes. We already introduced MPEG-5 in previous posts; LCEVC is a standardized video coding solution that leverages other video codecs in a manner that improves video compression efficiency while maintaining or lowering the overall encoding and decoding complexity.

The LCEVC standard uses a lightweight video codec to add up to two layers of encoded residuals. The aim of these layers is correcting artefacts produced by the base video codec and adding detail and sharpness for the final output video.
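The principle of a base layer plus two residual layers can be illustrated with a toy Python sketch on a one-dimensional signal. It is not the LCEVC algorithm: the coarse quantizer standing in for the base codec, the pairwise averaging used for downsampling, and the lossless residuals are all assumptions made for this example (in a real codec the residuals would themselves be transformed, quantized, and entropy coded).

```python
def downsample(signal):
    # Average neighbouring samples to halve the resolution (even length assumed).
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]


def upsample(signal):
    # Repeat every sample twice to restore the original resolution.
    return [sample for sample in signal for _ in range(2)]


def lossy_base_codec(signal, step=8):
    # Stand-in for the base video codec: coarse quantization introduces errors.
    return [round(sample / step) * step for sample in signal]


def encode(signal):
    low_res = downsample(signal)
    base = lossy_base_codec(low_res)
    # Layer 1: corrects the artefacts of the base codec at the base resolution.
    residual_1 = [l - b for l, b in zip(low_res, base)]
    corrected = [b + r for b, r in zip(base, residual_1)]
    # Layer 2: adds the detail lost by working at the lower resolution.
    residual_2 = [s - u for s, u in zip(signal, upsample(corrected))]
    return base, residual_1, residual_2


def decode(base, residual_1, residual_2):
    corrected = [b + r for b, r in zip(base, residual_1)]
    return [u + r for u, r in zip(upsample(corrected), residual_2)]


signal = [10, 12, 50, 54, 200, 190, 3, 5]
base, residual_1, residual_2 = encode(signal)
reconstructed = decode(base, residual_1, residual_2)
print(base, reconstructed)
```

A decoder that only understands the base codec could still use `base` on its own, which is the backwards-compatibility argument; the two residual layers are an optional enhancement on top of it.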

The target of this standard comprises software or hardware codecs with extra processing capabilities, e.g., mobile devices, set top boxes (STBs), and personal computer based decoders. Additional benefits are the reduction in implementation complexity or a corresponding expansion in spatial resolution.

LCEVC is based on existing codecs which allows for backwards-compatibility with existing deployments. Supporting LCEVC enables “softwareized” video coding allowing for release and deployment options known from software-based solutions which are well understood by software companies and, thus, opens new opportunities in improving and optimizing video-based services and applications.

Research aspects: in video coding, research efforts are mainly related to coding efficiency and complexity (as usual). However, as MPEG-5 basically adds a software layer on top of what is typically implemented in hardware, all kinds of aspects related to software engineering could become an active area of research.

Omnidirectional Media Format

The scope of the Omnidirectional Media Format (OMAF) covers 360° video, images, audio and associated timed text. It specifies (i) a coordinate system, (ii) projection and rectangular region-wise packing methods, (iii) storage of omnidirectional media and the associated metadata using ISOBMFF, (iv) encapsulation, signaling and streaming of omnidirectional media in DASH and MMT, and (v) media profiles and presentation profiles.
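To make the notions of a coordinate system and a projection more concrete, the following Python sketch maps a viewing direction on the sphere to a position in an equirectangular picture. It shows the generic equirectangular mapping only; the function name `sphere_to_equirect`, the value ranges, and the orientation conventions (azimuth increasing to the left, elevation increasing upwards) are assumptions of this example and should not be read as the normative OMAF equations.

```python
def sphere_to_equirect(azimuth_deg, elevation_deg, width, height):
    """Map a direction on the sphere to equirectangular picture coordinates.

    Assumed ranges: azimuth in [-180, 180], elevation in [-90, 90] degrees.
    Returns (u, v) with the origin at the top-left corner of the picture.
    """
    u = (0.5 - azimuth_deg / 360.0) * width
    v = (0.5 - elevation_deg / 180.0) * height
    return u, v


# The viewing direction straight ahead lands in the centre of the picture.
print(sphere_to_equirect(0, 0, 3840, 1920))
```

Region-wise packing would then rearrange or rescale rectangular regions of such a projected picture before encoding, which is not covered by this sketch.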

At this meeting, the second edition of OMAF (ISO/IEC 23090-2) has been promoted to committee draft (CD) which includes

  • support of improved overlay of graphics or textual data on top of video,
  • efficient signaling of videos structured in multiple sub parts,
  • enabling more than one viewpoint, and
  • new profiles supporting dynamic bitstream generation according to the viewport.

As for the first edition, OMAF includes encapsulation and signaling in ISOBMFF as well as streaming of omnidirectional media (DASH and MMT). It will reach its final milestone by the end of 2020.

360° video is certainly a vital use case towards a fully immersive media experience. Devices to capture and consume such content are becoming increasingly available and will probably contribute to the dissemination of this type of content. However, it is also understood that the complexity increases significantly, specifically with respect to large-scale, scalable deployments due to increased content volume/complexity, timing constraints (latency), and quality of experience issues.

Research aspects: understanding the increased complexity of 360° video or immersive media in general is certainly an important aspect to be addressed towards enabling applications and services in this domain. We may even start thinking that 360° video actually works (e.g., it’s possible to capture, upload to YouTube and consume it on many devices) but the devil is in the detail in order to handle this complexity in an efficient way to enable seamless and high quality of experience.

DASH and CMAF

The 4th edition of DASH (ISO/IEC 23009-1) will be published soon and MPEG is currently working towards a first amendment which will be about (i) CMAF support and (ii) event processing model. An overview of all DASH standards is depicted in the figure below, notably part one of MPEG-DASH referred to as media presentation description and segment formats.

MPEG-DASH standard status.

The 2nd edition of the CMAF standard (ISO/IEC 23000-19) will become available very soon and MPEG is currently reviewing additional tools in the so-called technologies under consideration document as well as conducting various explorations. A working draft for additional media profiles is also under preparation.

Research aspects: with CMAF, low-latency support is added to DASH-like applications and services. However, the implementation specifics are actually not defined in the standard and are subject to competition. Interestingly, the Bitmovin video developer reports from both 2018 and 2019 highlight the need for low-latency solutions in this domain.

At the ACM Multimedia Conference 2019 in Nice, France I gave a tutorial entitled “A Journey towards Fully Immersive Media Access” which includes updates related to DASH and CMAF. The slides are available here.

Outlook 2020

Finally, let me try giving an outlook for 2020, not so much content-wise but events planned for 2020 that are highly relevant for this column:

  • MPEG129, Jan 13-17, 2020, Brussels, Belgium
  • DCC 2020, Mar 24-27, 2020, Snowbird, UT, USA
  • MPEG130, Apr 20-24, 2020, Alpbach, Austria
  • NAB 2020, Apr 18-22, 2020, Las Vegas, NV, USA
  • ICASSP 2020, May 4-8, 2020, Barcelona, Spain
  • QoMEX 2020, May 26-28, 2020, Athlone, Ireland
  • MMSys 2020, Jun 8-11, 2020, Istanbul, Turkey
  • IMX 2020, June 17-19, 2020, Barcelona, Spain
  • MPEG131, Jun 29 – Jul 3, 2020, Geneva, Switzerland
  • NetSoft, QoE Mgmt Workshop, Jun 29 – Jul 3, 2020, Ghent, Belgium
  • ICME 2020, Jul 6-10, London, UK
  • ATHENA summer school, Jul 13-17, Klagenfurt, Austria
  • … and many more!

JPEG Column: 85th JPEG Meeting in San Jose, California, U.S.A.

The 85th JPEG meeting was held in San Jose, CA, USA.

The meeting was distinguished by the Prime Time Engineering Emmy Award from the Academy of Television Arts & Sciences (ATAS) for the longevity of the first JPEG standard. Furthermore, a very successful workshop on JPEG emerging technologies was held at Microsoft premises in Silicon Valley with broad participation from several companies working in imaging technologies. This workshop ended with the celebration of two JPEG committee experts, Thomas Richter and Ogawa Shigetaka, recognized with ISO outstanding contribution awards for the key roles they played in the development of the JPEG XT standard.

The 85th JPEG meeting continued laying the groundwork for the continuous development of JPEG standards and exploration studies. In particular, the meeting covered the developments on the new image coding standard JPEG XL, the low latency and complexity standard JPEG XS, and the release of the JPEG Systems interoperable 360 image standard, together with the exploration studies on image compression using machine learning and on the use of blockchain and distributed ledger technologies for media applications.

The 85th JPEG meeting had the following highlights:

  • Prime Time Engineering Emmy award,
  • JPEG Emerging Technologies Workshop,
  • JPEG XL progresses towards a final specification,
  • JPEG AI evaluates machine learning based coding solutions,
  • JPEG exploration on Media Blockchain,
  • JPEG Systems interoperable 360 image standards released,
  • JPEG XS announces significant improvements of Bayer image sensor data compression.

JPEG Emerging Technologies Workshop.

Prime Time Engineering Emmy

The JPEG committee is honored to be the recipient of a prestigious Prime Time Engineering Award in 2019 by the US Academy of Television Arts & Sciences at the 71st Engineering Emmy Awards ceremony on the 23rd of October 2019 in Los Angeles, CA, USA. The first JPEG standard is known as a popular format in digital photography, used by hundreds of millions of users everywhere, in a wide range of applications including the world wide web, social media, photographic apparatus and smart cameras. The first part of the standard was published in 1992 and has grown to seven parts, with the latest, defining the reference software, published in 2019. This is a unique example of longevity in the fast moving information technologies and the Emmy award acknowledges this longevity and continuing influence over nearly three decades.

This is a well-deserved recognition not only for the Joint Photographic Experts Group committee members who started this standard under the auspices of ITU, ISO and IEC, but also for all experts in the JPEG committee who continued to extend and maintain it, hence guaranteeing such longevity.

JPEG convenor Touradj Ebrahimi during the Emmy acceptance speech.

According to Prof. Touradj Ebrahimi, Convenor of the JPEG standardization committee, the longevity of JPEG is based on three very important factors: “The credibility by being developed under the auspices of three important standardization bodies, namely ITU, ISO and IEC, development by explicitly taking into account end users, and the choice of being royalty free”. Furthermore, “JPEG defined not only a great technology but also it was a committee that first defined how standardization should take place in order to become successful”.

JPEG Emerging Technologies Workshop

At the 85th JPEG meeting in San Jose, CA, USA, JPEG organized the “JPEG Emerging Technologies Workshop” on the 5th of November 2019 to inform industry and academia active in the wider field of multimedia and in particular in imaging, about current JPEG Committee standardization activities and exploration studies. Leading JPEG experts shared highlights about some of the emerging JPEG technologies that could shape the future of imaging and multimedia, with the following program:

  • Welcome and Introduction (Touradj Ebrahimi);
  • JPEG XS – Lightweight compression; Transparent quality. (Antonin Descampe);
  • JPEG Pleno (Peter Schelkens);
  • JPEG XL – Next-generation Image Compression (Jan Wassenberg and Jon Sneyers);
  • High-Throughput JPEG 2000 – Big improvement to JPEG 2000 (Pierre-Anthony Lemieux);
  • JPEG Systems – The framework for future and legacy standards (Andy Kuzma);
  • JPEG Privacy and Security and Exploration on Media Blockchain Standardization Needs (Frederik Temmermans);
  • JPEG AI: Learning to Compress (João Ascenso).

This very successful workshop ended with a panel moderated by Fernando Pereira where different relevant media technology issues were discussed with a vibrant participation of the attendees.

Proceedings of the JPEG Emerging Technologies Workshop are available for download via the following link: https://jpeg.org/items/20191108_jpeg_emerging_technologies_workshop_proceedings.html

JPEG XL

The JPEG XL Image Coding System (ISO/IEC 18181) continues its progression towards a final specification. The Committee Draft of JPEG XL is being refined based on feedback received from experts from ISO/IEC national bodies. Experiments indicate that the two main JPEG XL modes compare favorably with specialized responsive and lossless modes, enabling a simpler specification.

The JPEG committee has approved open-sourcing the JPEG XL software. JPEG XL will advance to the Draft International Standard stage in January 2020.

JPEG AI

JPEG AI carried out rigorous subjective and objective evaluations of a number of promising learning-based image coding solutions from the state of the art, which show the potential of these codecs for different rate-quality tradeoffs in comparison to widely used anchors. Moreover, a wide set of objective metrics was evaluated for several types of image coding solutions.

JPEG exploration on Media Blockchain

Fake news, copyright violations, media forensics, privacy and security are emerging challenges in digital media. JPEG has determined that blockchain and distributed ledger technologies (DLT) have great potential as a technology component to address these challenges in transparent and trustable media transactions. However, blockchain and DLT need to be integrated closely with a widely adopted standard to ensure broad interoperability of protected images. Therefore, the JPEG committee has organized several workshops to engage with the industry and help to identify use cases and requirements that will drive the standardization process. During the San Jose meeting, the committee drafted a first version of the use cases and requirements document. On the 21st of January 2020, during its 86th JPEG Meeting to be held in Sydney, Australia, JPEG plans to organize an interactive discussion session with stakeholders. Practical and registration information is available on the JPEG website. To keep informed and to get involved in this activity, interested parties are invited to register to the ad hoc group’s mailing list (http://jpeg-blockchain-list.jpeg.org).

JPEG Systems interoperable 360 image standards released.

The ISO/IEC 19566-5 JUMBF and ISO/IEC 19566-6 JPEG 360 standards were published in July 2019. These two standards work together to define the basics of interoperability and lay the groundwork for future capabilities for richer interactions with still images as we add functionality to JUMBF (Part 5), Privacy & Security (Part 4), JPEG 360 (Part 6), and JLINK (Part 7).

JPEG XS announces significant improvements of Bayer image sensor data compression.

JPEG XS aims at the standardization of a visually lossless, low-latency and lightweight compression scheme that can be used as a mezzanine codec in various markets. Work was done at the last meeting to enable the use of JPEG XS for Bayer image sensor compression. The targeted use cases for Bayer image sensor compression include video transport over professional video links, real-time video storage inside and outside of cameras, and data compression on board autonomous cars. The JPEG Committee also announces the final publication of JPEG XS Part 3 “Transport and Container Formats” as an International Standard. This part enables storage of JPEG XS images in various formats. In addition, an effort to specify an RTP payload for JPEG XS is in its final stages; it will enable transport of JPEG XS in the SMPTE ST2110 framework.

“The 2019 Prime Time Engineering Award by the Academy is a well-deserved recognition for the Joint Photographic Experts Group members who initiated standardization of the first JPEG standard and to all experts of the JPEG committee who since then have extended and maintained it, guaranteeing its longevity. JPEG defined not only a great technology but also it was the first committee that defined how standardization should take place in order to become successful” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JPEG, JPEG 2000, JPEG XR, JPSearch, JPEG XT and more recently, the JPEG XS, JPEG Systems, JPEG Pleno and JPEG XL families of imaging standards.

The JPEG Committee nominally meets four times a year, in different world locations. The 84th JPEG Meeting was held on 13-19 July 2019, in Brussels, Belgium. The 86th JPEG Meeting will be held on 18-24 January 2020, in Sydney, Australia.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list at http://jpeg-news-list.jpeg.org.

Future JPEG meetings are planned as follows:

  • No 86, Sydney, Australia, January 18 to 24, 2020
  • No 87, Erlangen, Germany, April 25 to 30, 2020