JPEG Column: 103rd JPEG Meeting

JPEG AI reaches Draft International Standard stage

The 103rd JPEG meeting was held online from April 8 to 12, 2024. During this meeting, the first learning-based standard, JPEG AI, reached the Draft International Standard (DIS) stage and was sent for balloting after a very successful development stage that led to performance improvements of more than 25% over its best-performing anchor, VVC. This high performance, combined with implementations already running on current mobile phones and the possibility of reusing the latent representation in image processing applications, opens new opportunities and will certainly launch a new era of compression technology.

The following are the main highlights of the 103rd JPEG meeting:

  • JPEG AI reaches Draft International Standard;
  • JPEG Trust integrates JPEG NFT;
  • JPEG Pleno Learning-based Point Cloud coding releases a Draft International Standard;
  • JPEG Pleno Light Field works on a new compression model;
  • JPEG AIC analyses different subjective evaluation methodologies for near-visually-lossless quality evaluation;
  • JPEG XE prepares a Call for Proposals on event-based coding;
  • JPEG DNA proceeds with the development of a standard for image compression using nucleotide sequences for supporting DNA storage;
  • JPEG XS 3rd edition core parts are ready for publication;
  • JPEG XL analyses HDR coding.

The following sections summarise the main highlights of the 103rd JPEG meeting.

JPEG AI reaches Draft International Standard

At its 103rd meeting, the JPEG Committee produced the Draft International Standard (DIS) of JPEG AI Part 1, Core Coding Engine, which is expected to be published as an International Standard in October 2024. JPEG AI offers a coding solution for standard reconstruction with significant improvements in compression efficiency over previous image coding standards at equivalent subjective quality. The JPEG AI coding design allows for hardware/software implementation-friendly encoding and decoding in terms of memory and computational complexity, efficient coding of images with text and graphics, support for 8- and 10-bit depth, region-of-interest coding, and progressive coding. To cover multiple encoder and decoder complexity-efficiency tradeoffs, JPEG AI supports a multi-branch coding architecture with two encoders and three decoders (six compatible combinations) that have been jointly trained. Compression efficiency (BD-rate) gains of 12.5% to 27.9% over the VVC Intra coding anchor, for relevant encoder and decoder configurations, can be achieved with a wide range of complexity tradeoffs (7 to 216 kMAC/px at the decoder side).
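
As background for the BD-rate figures quoted above: the Bjøntegaard delta rate summarizes the average bitrate difference between two rate-distortion curves over their overlapping quality range. The sketch below is a minimal illustration of that computation, using made-up rate/PSNR points rather than actual JPEG AI or VVC measurements.

```python
# Minimal Bjøntegaard delta-rate (BD-rate) sketch. The rate/PSNR points
# below are made-up placeholders, not JPEG AI or VVC measurements.
import numpy as np
from scipy.interpolate import PchipInterpolator

def bd_rate(rates_anchor, psnr_anchor, rates_test, psnr_test):
    """Average bitrate difference (%) of test vs. anchor over the
    overlapping quality range; negative values mean bitrate savings."""
    # Interpolate log-rate as a function of quality for both curves.
    f_anchor = PchipInterpolator(psnr_anchor, np.log(rates_anchor))
    f_test = PchipInterpolator(psnr_test, np.log(rates_test))
    lo = max(min(psnr_anchor), min(psnr_test))
    hi = min(max(psnr_anchor), max(psnr_test))
    q = np.linspace(lo, hi, 1000)
    # Average log-rate difference over the common quality interval.
    avg_log_diff = np.mean(f_test(q) - f_anchor(q))
    return (np.exp(avg_log_diff) - 1.0) * 100.0

# Hypothetical four-point RD curves: (bitrates in kbps, PSNR in dB).
anchor_rates, anchor_psnr = [100, 200, 400, 800], [32.0, 34.5, 37.0, 39.5]
test_rates, test_psnr = [80, 160, 330, 700], [32.2, 34.8, 37.3, 39.8]
print(f"BD-rate: {bd_rate(anchor_rates, anchor_psnr, test_rates, test_psnr):.1f}%")
```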

The work regarding JPEG AI profiles and levels (Part 2), reference software (Part 3), and conformance (Part 4) has started, and a request for sub-division was issued at this meeting to establish a new part on the file format (Part 5). At this meeting, most of the work focused on the JPEG AI high-level syntax and on the improvement of several normative and non-normative tools, such as hyper-decoder activations, the training dataset, progressive decoding, the training methodology, and enhancement filters. There are now two smartphone implementations of JPEG AI available. At this meeting, a JPEG AI demo was shown running on a Huawei Mate50 Pro with a Qualcomm Snapdragon 8+ Gen1, featuring high-resolution (4K) image decoding, tiling, full base operating point support, and arbitrary image resolution decoding.

JPEG Trust

At the 103rd meeting, the JPEG Committee produced an updated version of the Use Cases and Requirements for JPEG Trust (v2.0). This document integrates the use cases and requirements of the JPEG NFT exploration with the use cases and requirements of JPEG Trust. In addition, a new document with Terms and Definitions for JPEG Trust (v1.0) was published which incorporates all terms and concepts as they are used in the context of the JPEG Trust activities. Finally, an updated version of the JPEG Trust White Paper v1.1 has been released. These documents are publicly available on the JPEG Trust/Documentation page.

JPEG Pleno Learning-based Point Cloud coding

The JPEG Committee continued its activity on Learning-based Point Cloud Coding under the JPEG Pleno family of standards. During the 103rd JPEG meeting, comments on the Committee Draft of ISO/IEC 21794 Part 6, “Learning-based point cloud coding”, were received, and the activity is on track for the release of a Draft International Standard for balloting at the 104th JPEG meeting in Sapporo, Japan, in July 2024. A new version of the Verification Model (Version 4.1) was released during the 103rd JPEG meeting, containing an updated entropy coding module. In addition, version 2.1 of the Common Training and Test Conditions was released as a public document.

JPEG Pleno Light Field

The JPEG Pleno Light Field activity progressed at this meeting with a number of technical submissions for improvements to the JPEG Pleno Model (JPLM). The JPLM provides reference implementations for the standardized technologies within the JPEG Pleno framework. The activity also has ongoing standardization work on a novel light field coding architecture that delivers a single coding mode to efficiently code all types of light fields. This novel coding mode does not need any depth information, resulting in a significant improvement in compression efficiency.

The JPEG Pleno Light Field activity is also preparing standardization activities in the domains of objective and subjective quality assessment for light fields, aiming to address other plenoptic modalities in the future. During the meeting, important decisions were made regarding the execution of multiple collaborative subjective experiments exploring various aspects of subjective light field quality assessment. Additionally, a specialized tool for subjective quality evaluation has been developed to support these experiments. The outcomes of these experiments will guide decisions during the subjective quality assessment standardization process. They will also be used to evaluate proposals in the upcoming objective quality assessment standardization activities.

JPEG AIC

During the 103rd JPEG meeting, the work on visual image quality assessment continued with a focus on JPEG AIC-3, targeting a standard for a subjective quality assessment methodology for images in the range from high to nearly visually lossless quality. The activity is currently investigating three kinds of subjective image quality assessment methodologies, notably Boosted Triplet Comparison (BTC), the In-place Double Stimulus Quality Scale (IDSQS), and the In-place Plain Triplet Comparison (IPTC), as well as a unified framework capable of merging the results of two of them.

The JPEG Committee also worked on the preparation of Part 4 of the standard (JPEG AIC-4) by initiating work on the Draft Call for Proposals on Objective Image Quality Assessment. The Final Call for Proposals on Objective Image Quality Assessment is planned for release in January 2025, with the submission of proposals planned for April 2025.

JPEG XE

The JPEG Committee continued its activity on JPEG XE and event-based vision. This activity revolves around a new and emerging image modality created by event-based visual sensors. JPEG XE concerns the creation and development of a standard to represent events in an efficient way, allowing interoperability between sensing, storage, and processing, targeting machine vision and other relevant applications. The JPEG Committee completed the Common Test Conditions v1.0 document, which provides the means to evaluate candidate technologies for efficient coding of event sequences. The Common Test Conditions define a canonical raw event format, a reference dataset, a set of key performance metrics, and an evaluation methodology. In addition, the JPEG Committee finalized the Draft Call for Proposals on lossless coding for event-based data; the final call will be issued at the next JPEG meeting in July 2024. Both the Common Test Conditions v1.0 and the Draft Call for Proposals are publicly available on jpeg.org. Standardization will start with lossless coding of event sequences, as this has the most imminent application urgency in industry. However, the JPEG Committee acknowledges that lossy coding of event sequences is also a valuable feature, which will be addressed at a later stage. The Ad-hoc Group on Event-based Vision was re-established to continue the work towards the 104th JPEG meeting. To stay informed about the activities, please join the event-based imaging Ad-hoc Group mailing list.
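
Event sensors typically output streams of (x, y, timestamp, polarity) tuples with monotonically non-decreasing timestamps, which makes simple predictive transforms effective for lossless coding. The sketch below is only a hedged illustration of that general idea on synthetic data; it is not one of the candidate technologies sought by the call.

```python
# Toy lossless transform for event-camera data: delta-encode the
# non-decreasing timestamps so a downstream entropy coder sees mostly
# small values. Synthetic data; not a JPEG XE candidate technology.
import zlib
import numpy as np

rng = np.random.default_rng(0)
n = 10_000
events = np.zeros(n, dtype=[("x", "u2"), ("y", "u2"), ("t", "u8"), ("p", "u1")])
events["x"] = rng.integers(0, 1280, n)           # pixel coordinates
events["y"] = rng.integers(0, 720, n)
events["t"] = np.cumsum(rng.integers(1, 50, n))  # increasing timestamps (us)
events["p"] = rng.integers(0, 2, n)              # polarity: ON/OFF

deltas = np.diff(events["t"], prepend=events["t"][:1])

# Compare a generic entropy coder (DEFLATE) on raw vs. delta timestamps.
print(len(zlib.compress(events["t"].tobytes())), "bytes (raw timestamps)")
print(len(zlib.compress(deltas.tobytes())), "bytes (delta timestamps)")
```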

JPEG DNA

JPEG DNA is an exploration aiming at developing a standard that provides technical solutions capable of representing bi-level, continuous-tone grey-scale, continuous-tone colour, or multichannel digital samples in a format of nucleotide sequences for supporting DNA storage. A Call for Proposals was published at the 99th JPEG meeting, and based on performance assessment and a descriptive analysis of the submitted solutions, the JPEG DNA Verification Model was created during the 102nd JPEG meeting. A number of core experiments were conducted to validate the Verification Model, and notably, the first Working Draft of JPEG DNA was produced during the 103rd JPEG meeting. Work towards the creation of the specification will start with newly defined core experiments to improve the rate-distortion performance of the Verification Model and its robustness to insertion, deletion, and substitution errors. In parallel, efforts are underway to improve the noise simulator produced at the 102nd JPEG meeting, allowing the resilience of the Verification Model to noise to be assessed under more realistic conditions, and to explore learning-based coding solutions.
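
At its simplest, transcoding binary data into nucleotide sequences maps two bits to each of the four bases; practical DNA coding additionally constrains the output (for example, limiting homopolymer runs) to withstand synthesis and sequencing errors. The sketch below shows only the naive 2-bit mapping and is not the transcoding used by the JPEG DNA Verification Model.

```python
# Naive 2-bits-per-nucleotide transcoding sketch; real DNA storage codes
# add constraints (e.g., limiting homopolymer runs like "AAAA") to be
# robust to synthesis/sequencing errors. Not the JPEG DNA codec.
BASES = "ACGT"  # 00 -> A, 01 -> C, 10 -> G, 11 -> T

def bytes_to_dna(data: bytes) -> str:
    out = []
    for b in data:
        for shift in (6, 4, 2, 0):          # four 2-bit symbols per byte
            out.append(BASES[(b >> shift) & 0b11])
    return "".join(out)

def dna_to_bytes(seq: str) -> bytes:
    vals = [BASES.index(c) for c in seq]
    return bytes(
        (vals[i] << 6) | (vals[i + 1] << 4) | (vals[i + 2] << 2) | vals[i + 3]
        for i in range(0, len(vals), 4)
    )

payload = b"JPEG"
strand = bytes_to_dna(payload)
print(strand)                    # "CAGG...": four bases per input byte
assert dna_to_bytes(strand) == payload
```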

JPEG XS

The JPEG Committee is happy to announce that the core parts of JPEG XS 3rd edition are ready for publication as International Standards. The Final Draft International Standard for Part 1 of the standard – Core coding tools – is ready, and Part 2 – Profiles and buffer models – and Part 3 – Transport and container formats – are both being prepared by ISO for immediate publication. At this meeting, the JPEG Committee continued the work on Part 4 – Conformance testing, to provide the necessary test streams and test protocols to implementers of the 3rd edition. Consultation of the Committee Draft for Part 4 took place and a DIS version was issued. The development of the reference software, contained in Part 5, continued and the reference decoder is now feature-complete and fully compliant with the 3rd edition. A Committee Draft for Part 5 was issued at this meeting. Development of a fully compliant reference encoder is scheduled to be completed by July.

Finally, new experimental results were presented on how to use JPEG XS over 5G mobile networks for the wireless transmission of low-latency, high-quality 4K/8K 360-degree views with mobile devices and VR headsets. More experiments will be conducted, but the first results show that JPEG XS is capable of providing an immersive and excellent quality of experience in VR use cases, mainly thanks to its native low-latency and low-complexity properties.

JPEG XL

The performance of JPEG XL on HDR images was investigated, and the experiments will continue. Work on a hardware implementation continues, and further improvements are being made to the libjxl reference software. The second editions of Parts 1 and 2 are in the final stages of the ISO process and will be published soon.

Final Quote

“The JPEG AI Draft International Standard is yet another important milestone in an age where AI is rapidly replacing previous technologies. With this achievement, the JPEG Committee has demonstrated its ability to reinvent itself and adapt to new technological paradigms, offering standardized solutions based on the latest state-of-the-art technologies,” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

MPEG Column: 147th MPEG Meeting in Sapporo, Japan


The 147th MPEG meeting was held in Sapporo, Japan from 15-19 July 2024, and the official press release can be found here. It comprises the following highlights:

  • ISO Base Media File Format*: The 8th edition was promoted to Final Draft International Standard, supporting seamless media presentation for DASH and CMAF.
  • Syntactic Description Language: Finalized as an independent standard for MPEG-4 syntax.
  • Low-Overhead Image File Format*: First milestone achieved for small image handling improvements.
  • Neural Network Compression*: Second edition for conformance and reference software promoted.
  • Internet of Media Things (IoMT): Progress made on reference software for distributed media tasks.

* … covered in this column and expanded with possible research aspects.

8th edition of ISO Base Media File Format

The ever-growing expansion of the ISO/IEC 14496-12 ISO base media file format (ISOBMFF) application area has continuously brought new technologies to the standard. Over the last couple of years, MPEG Systems (WG 3) has received new technologies on ISOBMFF for more seamless support of ISO/IEC 23009 Dynamic Adaptive Streaming over HTTP (DASH) and ISO/IEC 23000-19 Common Media Application Format (CMAF), leading to the development of the 8th edition of ISO/IEC 14496-12.

The new edition of the standard includes technologies to explicitly indicate the set of tracks representing different versions of a single media presentation, enabling seamless switching and continuous presentation. Such technologies will enable more efficient processing of ISOBMFF-formatted files for DASH manifests or CMAF Fragments.
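
For readers unfamiliar with the ISOBMFF box structure the new edition builds on: a file is a sequence of boxes, each prefixed by a 32-bit big-endian size and a four-character type, with size 1 signalling a 64-bit extended size and size 0 a box that runs to the end of the file. The following minimal walker is a generic sketch of that layout only; it does not parse the new 8th-edition track-set signalling.

```python
# Minimal top-level ISOBMFF box walker. Each box starts with a 32-bit
# big-endian size and a 4-byte type; size == 1 means a 64-bit size
# follows, size == 0 means the box extends to the end of the file.
# Generic sketch only; 8th-edition track-set signalling is not parsed.
import struct
import sys

def walk_boxes(path: str):
    with open(path, "rb") as f:
        while True:
            header = f.read(8)
            if len(header) < 8:
                break
            size, box_type = struct.unpack(">I4s", header)
            header_len = 8
            if size == 1:                    # 64-bit "largesize" follows
                size = struct.unpack(">Q", f.read(8))[0]
                header_len = 16
            print(box_type.decode("ascii", "replace"), size)
            if size == 0:                    # box runs to end of file
                break
            f.seek(size - header_len, 1)     # skip the box payload

if __name__ == "__main__":
    walk_boxes(sys.argv[1])  # e.g. python boxes.py movie.mp4
```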

Research aspects: The central research aspect of the 8th edition of ISOBMFF, which “will enable more efficient processing,” will undoubtedly be its evaluation compared to the state-of-the-art. Standards typically define a format, but how to use it is left open to implementers. Therefore, the implementation is a crucial aspect and will allow for a comparison of performance. One such implementation of ISOBMFF is GPAC, which most likely will be among the first to implement these new features.

Low-Overhead Image File Format

The ISO/IEC 23008-12 image format specification defines generic structures for storing image items and sequences based on the ISO/IEC 14496-12 ISO base media file format (ISOBMFF). As it allows the use of various high-performance video compression standards for a single image or a series of images, it was quickly adopted by the market. However, it has been challenging to use for very small images such as icons or emojis: while the initial design of the standard was versatile and useful for a wide range of applications, the size of the headers becomes an overhead for applications with tiny images. Amendment 3 of ISO/IEC 23008-12, low-overhead image file format, aims to address this use case by adding a new compact box for storing metadata instead of the ‘meta’ box, lowering the overhead.
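
The overhead problem is easy to see with rough numbers: a few hundred bytes of ‘meta’-box structures are negligible for a megabyte photo but can rival the payload of an icon or emoji. The byte counts in the sketch below are illustrative assumptions, not measurements from the amendment.

```python
# Back-of-the-envelope header overhead for HEIF-style files.
# All byte counts are illustrative assumptions, not measurements.
header = 400              # assumed 'meta'-box and structural overhead (bytes)
for name, payload in [("photo", 1_500_000), ("icon", 1_200), ("emoji", 600)]:
    total = header + payload
    print(f"{name:6s}: {header / total:6.1%} of the file is header overhead")
```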

Research aspects: The issue regarding header sizes of ISOBMFF for small files or low bitrate (in the case of video streaming) was known for some time. Therefore, amendments in these directions are appreciated while further performance evaluations are needed to confirm design choices made at this initial step of standardization.

Neural Network Compression

An increasing number of artificial intelligence applications based on artificial neural networks, such as edge-based multimedia content processing, content-adaptive video post-processing filters, or federated training, need to exchange updates of neural networks (e.g., after training on additional data or fine-tuning to specific content). For this purpose, MPEG developed a second edition of the standard for coding of neural networks for multimedia content description and analysis (NNC, ISO/IEC 15938-17, published in 2024), adding syntax for differential coding of neural network parameters as well as new coding tools. Trained models can be compressed to 10-20% of their original size for several architectures, and even to below 3%, without performance loss. Higher compression rates are possible at moderate performance degradation. In a distributed training scenario, a model update after a training iteration can be represented at 1% or less of the base model size on average without sacrificing the classification performance of the neural network.
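
The ratios above stem from quantizing and entropy-coding network parameters, together with tools such as sparsification and the new differential coding. The toy sketch below illustrates only the generic quantize-then-entropy-code idea on synthetic weights; it does not follow the NNC (ISO/IEC 15938-17) bitstream syntax or tool chain.

```python
# Toy illustration of neural-network parameter compression:
# uniform quantization followed by a generic entropy coder.
# Not the NNC (ISO/IEC 15938-17) bitstream syntax or tool chain.
import zlib
import numpy as np

rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.02, size=1_000_000).astype(np.float32)

step = 0.002                                   # quantization step size
q = np.round(weights / step).astype(np.int8)   # coarse uniform quantization
compressed = zlib.compress(q.tobytes(), level=9)

ratio = len(compressed) / weights.nbytes
err = np.abs(q.astype(np.float32) * step - weights).max()
print(f"compressed to {ratio:.1%} of original size, max error {err:.4f}")
```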

In order to facilitate the implementation of the standard, the accompanying standard ISO/IEC 15938-18 has been updated to cover the second edition of ISO/IEC 15938-17. This standard provides reference software for encoding and decoding NNC bitstreams, as well as a set of conformance guidelines and reference bitstreams for testing decoder implementations. The software covers the functionalities of both editions of the standard and can be configured to test different combinations of the coding tools specified by the standard.

Research aspects: The reference software for NNC, together with the reference software for audio/video codecs, is a vital tool for building complex multimedia systems and for their (baseline) evaluation with respect to compression efficiency only (not speed). This is because reference software is usually designed for functionality (i.e., compression in this case) and not for performance.

The 148th MPEG meeting will be held in Kemer, Türkiye, from November 04-08, 2024. Click here for more information about MPEG meetings and their developments.

The 2nd Edition of the Spring School on Social XR organized by CWI

ACM SIGMM co-sponsored the second edition of the Spring School on Social XR, organized by the Distributed and Interactive Systems group (DIS) at CWI in Amsterdam. The event took place on March 4th-8th, 2024, and attracted 30 students from different disciplines (technology, social sciences, and humanities). The program included 22 lectures, 6 of them open, by 23 instructors. The event was organized by Irene Viola, Silvia Rossi, Thomas Röggla, and Pablo Cesar from CWI, and Omar Niamut from TNO. It was co-sponsored by the ACM Special Interest Group on Multimedia (ACM SIGMM), which made student grants available and supported international speakers from under-represented countries, and by The Netherlands Institute for Sound and Vision (https://www.beeldengeluid.nl/en).

Students and organisers of the Spring School on Social XR (March 4th – 8th 2024, Amsterdam)

“The future of media communication is immersive, and will empower sectors such as cultural heritage, education, and manufacturing, and provide a climate-neutral alternative to travelling in the European Green Deal.” With this vision in mind, the organizing committee continued for a second edition with a holistic program around the research topic of Social XR. The program included keynotes and workshops, where prominent scientists in the field shared their knowledge with students and triggered meaningful conversations and exchanges.

A poster session at the CWI DIS Spring School 2024.

The program included topics such as the capturing and modelling of realistic avatars and their behavior, coding and transmission techniques for volumetric video content, ethics for the design and development of responsible social XR experiences, novel rendering and interaction paradigms, and human factors and evaluation of experiences. Together, they provided a holistic perspective, helping participants to better understand the area and to initiate a network of collaboration to overcome the limitations of current real-time conferencing systems.

The spring school is part of the semester program organized by the DIS group of CWI. It was initiated in May 2022 with the Symposium on human-centered multimedia systems: a workshop and seminar to celebrate the inaugural lecture, “Human-Centered Multimedia: Making Remote Togetherness Possible” of Prof. Pablo Cesar. Then, it was continued in 2023 with the 1st Spring School on Social XR.

The list of talks was as follows:

  • “Volumetric Content Creation for Immersive XR Experiences” by Aljosa Smolic
  • “Social Signal Processing as a Method for Modelling Behaviour in SocialXR” by Julie Williamson
  • “Towards a Virtual Reality” by Elmar Eisemann
  • “Meeting Yourself and Others in Virtual Reality” by Mel Slater
  • “Social Presence in VR – A Media Psychology Perspective” by Tilo Hartmann
  • “Ubiquitous Mixed Reality: Designing Mixed Reality Technology to Fit into the Fabric of our Daily Lives” by Jan Gugenheimer
  • “Building Military Family Cohesion through Social XR: A 8-Week Field Study” by Sun Joo (Grace) Ahn
  • “Navigating the Ethical Landscape of XR: Building a Necessary Framework” by Eleni Mangina
  • “360° Multi-Sensory Experience Authoring” by Debora Christina Muchaluat Saade
  • “QoE Assessment of XR” by Patrick le Callet
  • “Bringing Soul to Digital” by Natasja Paulssen
  • “Evaluating QoE for Social XR – Audio, Visual, Audiovisual and Communication Aspects” by Alexander Raake
  • “Immersive Technologies Through the Lens of Public Values” by Mariëtte van Huijstee
  • “Designing Innovative Future XR Meeting Spaces” by Katherine Isbister
  • “Evaluation Methods for Social XR Experiences” by Mark Billinghurst
  • “Recent Advances in 3D Videocommunication” by Oliver Schreer
  • “Virtual Humans in Social XR” by Zerrin Yumak
  • “The Power of Graphs Learning in Immersive Communications” by Laura Toni
  • “Boundless Creativity: Bridging Sectors for Social Impact” by Benjamin de Wit
  • “Social XR in 5G and Beyond: Use Cases, Requirements, and Standardization Activities” by Lea Skorin-Kapov
  • “An Overview on Standardization for Social XR” by Pablo Perez and Jesús Gutiérrez
  • “Funding: The Path to Research Independence” by Sergio Cabrero

SIGMM Strike Teams Activity Report (April, 2024)

On April 10th, 2024, during the SIGMM Advisory Board meeting, the Strike Team Leaders, Touradj Ebrahimi, Arnold Smeulders, Miriam Redi, and Xavier Alameda Pineda (represented by Marco Bertini), reported the results of their activity. These results are summarized below in the form of recommendations, which should be read as guidelines and behavioral advice for our ongoing and future activity. SIGMM members in charge of SIGMM activities, SIGMM conference leaders, and particularly the organizers of the next ACMMM editions are invited to adhere to these recommendations, implement the items marked as mandatory, and report to the SIGMM Advisory Board after the event.

All the SIGMM Strike Teams will remain in charge for two years, starting January 1st, 2024, for reviews and updates.

The world is changing rapidly, and technology is driving these changes at an unprecedented pace. In this scenario, multimedia has become ubiquitous, providing new services to users, advanced modalities for information transmission, processing, and management, as well as innovative solutions for digital content understanding and production. The progress of Artificial Intelligence has fueled new opportunities and vitality in the field. New media formats, such as 3D, event data, and other sensory inputs, have become popular. Cutting-edge applications are constantly being developed and introduced.

SIGMM Strike Team on Industry Engagement

Team members: Touradj Ebrahimi (EPFL), Ali Begen (Ozyegin Univ.), Balu Adsumilli (Google), Yong Rui (Lenovo), and ChangSheng Xu (Chinese Academy of Sciences)
Coordinator: Touradj Ebrahimi

The team provided recommendations for both the ACMMM organizers and the SIGMM Advisory Board. The recommendations addressed improving the presence of industry at ACMMM and other SIGMM conferences/workshops, launching new in-cooperation initiatives, and establishing stable bi-directional links.

  1. Organization of industry-focused events
    • Suggested / Mandatory for ACMMM Organizers and SIGMM AB: Create industry-focused promotional materials like pamphlets/brochures for industry participation (sponsorship, exhibit, etc.) in the style of ICASSP 2024 and ICIP 2024
    • Suggested for ACMMM Organizers: invite keynote speakers from industry, possibly with financial support from SIGMM. Keynote talks should be similar to plenary talks but centered on specific application challenges.
    • Suggested for ACMMM Organizers: organize Special Sessions and Workshops around specific applications of interest to companies and startups. Sessions should be coordinated by industry, with possible support from an experienced and established scholar.
    • Suggested for ACMMM Organizers: organize Hands-on Sessions led by industry to receive feedback on future products and services.
    • Suggested for ACMMM Organizers: organize Panel Sessions led by industry and standardization committees on timely topics relevant to industry, e.g., how companies cope with AI.
    • Suggested for ACMMM Organizers: organize Tutorial sessions given by qualified people from industry and standardization committees at SIGMM-sponsored conferences/workshops
    • Suggested for ACMMM Organizers: promote contributions mainly from industry in the form of Industry Sessions to present companies and their products and services.
    • Suggested for ACMMM Organizers and SIGMM AB: promote joint SIGMM/standardization workshops on the latest standards, e.g., JPEG meets SIGMM, MPEG meets SIGMM, AOM meets SIGMM.
    • Suggested for ACMMM Organizers: organize Job Fairs, like job-interview speed dating, during ACMMM.
  2. Initiatives for linkage
    • Mandatory for SIGMM Organizers and SIGMM AB: Create and maintain a mailing list of industrial targets, taking care of GDPR (include a question in the registration form of SIGMM-sponsored conferences).
    • Suggested for SIGMM AB: organize monthly talks by industry leaders, from large established companies, SMEs, or startups, sharing the technical/scientific challenges they face and their solutions.
  3. Initiatives around reproducible results and benchmarking
    • Suggested for ACMMM Organizers and SIGMM AB: support the release of databases and studies on performance assessment procedures and metrics, possibly focused on specific applications.
    • Suggested for ACMMM Organizers: organize Grand Challenges initiated and sponsored by industry.

Strike Team on ACMMM Format

Team Members: Arnold Smeulders (Univ. of Amsterdam), Alan Smeaton (Dublin City University), Tat Seng Chua (National University of Singapore), Ralf Steinmetz (Univ. Darmstadt), Changwen Chen (Hong Kong Polytechnic Univ.), Nicu Sebe (Univ. of Trento), Marcel Worring (Univ. of Amsterdam), Jianfei Cai (Monash Univ.), Cathal Gurrin (Dublin City Univ.).
Coordinator: Arnold Smeulders

The team provided recommendations for both ACMMM organizers and SIGMM Advisory Board. The recommendations addressed distinct items related to Conference identity, Conference budget and Conference memory.

1. Intended audience. It is generally felt that ACMMM is under pressure from neighboring conferences that are growing very big. There is consensus that growing big should not be the purpose of ACMMM: a size of 750-1500 participants was thought to be ideal, including being attractive to industry. Growth should come naturally.

  • Suggested for ACMMM Organizers and SIGMM AB: Promote distant travel by lowering fees for those who travel far.
  • Suggested for ACMMM Organizers: Include (a personalized) visa invitation in the call for papers.

2. Community feel, differentiation and interdisciplinarity. Identity is not an actionable concern, but one of the shared common goods is T-shaped individuals interested in neighboring disciplines, making an interdisciplinary or multidisciplinary connection. It is desirable to differentiate submitted papers from those of major close conferences like CVPR. This point is already implemented in the call for papers of ACMMM 2024.

  • Mandatory for ACMMM Organizers: Ask in the submission how the paper fits in the multimedia community and its scientific tradition as illustrated by citations. Consider this information in the explicit review criteria.
  • Recommended for ACMMM Organizers: Support the physical presence of participants by rebalancing fees.
  • Suggested for ACMMM Organizers and SIGMM AB: Organize a session around the SIGMM test of time award, make selection early, funded by SIGMM.
  • Suggested for ACMMM Organizers: Organize moderated discussion sessions for papers on the same theme.

3. Brave New Ideas. The Brave New Ideas track fits very well with the intended audience. It is essential that we are able to draw out brave and new ideas from our community for long-term growth and vibrancy. The emphasis in reviewing Brave New Ideas should be on novelty, even if the paper is not perfect. Rotate over a pool of reviewers to prevent lock-in.

  • Suggested / Mandatory for ACMMM Organizers: Include in the submission a 3-minute pitch video to archive in the ACM digital library.
  • Suggested / Mandatory for ACMMM Organizers: Select reviewers from a pool of senior people to review novelty.
  • Suggested for ACMMM Organizers: Start with one session of 4 papers; if successful, add another session later.

4. Application. There should be no support for one specific application area exclusively in the main conference. Yet, application areas should be the focus of special sessions or workshops.

  • Suggested for ACMMM Organizers: Focus on application-related workshops or special sessions with their own reviewing.

5. Presentation. When the core business of ACM MM is inter- and multi-disciplinarity, it is natural to make presentation for a broader audience part of the selection. ACM should make the short videos accessible as a service to science or the general public. TED-like videos for a paper fit naturally with ACMMM and with the trend on YouTube of communicating your paper. If reviewing the videos is too much work, the SIGMM AB should support it financially.

  • Mandatory for ACMMM Organizers: Include a TED-like 3-minute pitch video as part of the submission, archived by the ACM Digital Library as part of the conference proceedings. It is to be submitted a week after the paper deadline for review, so there is time to prepare it after the regular paper submission.

6. Promote open access. For a data-driven and fair comparison, promote open access to data so it can be used in the next conference for comparison.

  • Suggested for SIGMM AB: Encourage open access for data.

7. Keynotes. For the intended audience and interdisciplinarity, it is felt essential to have keynotes on the key topics of the moment. Keynotes should not focus on one topic but maintain the diversity of topics in the conference and over the years, so as to be sure new ideas are inserted in the community.

  • Suggested to SIGMM AB: directly fund a big-name, expensive, marquee keynote speaker, sponsored by SIGMM, for one of the societally urgent key topics as evident from the news.

8. Diversity over subdisciplines, etc. Make an extra effort for Arts, GenAI use models, security, HCI, and demos. We need to ensure that if the submitted papers are of sufficiently high quality, there is at least a session on that sub-topic in the conference. We also need to ensure that the conference is not overwhelmed by a popular topic with easy review criteria and generally much higher review scores.

  • Suggested for ACMMM Organizers: Promote diversity of all relevant topics in the call for papers and by action in subcommunities through an ambassador. SIGMM will supervise the diversity.

9. Living report. To enhance institutional memory, maintain a living document passed on from organizer to organizer, with suggestions. The owner of the document is the SIGMM commissioner for conferences.

  • Mandatory for ACMMM Organizers and SIGMM AB: A short report to the SIGMM commissioner for conferences from the ACMMM chair, including a few recommendations for the next time; handed over to the next conference after the end of the current conference.

SIGMM Strike Team on Harmonization and Spread

Team members: Miriam Redi (Wikimedia Foundation), Silvia Rossi (CWI), Irene Viola (CWI), Mylene Farias (Texas State Univ. and Univ. of Brasilia), Ichiro Ide (Nagoya Univ.), Pablo Cesar (CWI and TU Delft).
Coordinator: Miriam Redi

The team provided recommendations for both the ACMMM organizers and the SIGMM Advisory Board. The recommendations addressed distinct items related to giving the SIGMM Records and social media a more central role in SIGMM, and to integrating the SIGMM Records and social media into the whole process of ACMMM organization from its initial planning.

1. SIGMM Website. The SIGMM website is outdated and needs a serious overhaul.

  • Mandatory for SIGMM AB: restart the website from scratch, taking inspiration from other SIGs, e.g., reaching out to people at CHI to understand what can be done. The budget should be provided by SIGMM.

2. SIGMM Social Media Channels. The SIGMM social media accounts (Twitter and LinkedIn) are managed by the Social Media Team at the SIGMM Records.

  • Suggested for SIGMM AB: continue this organization, expanding the responsibilities of the team to include conferences and other events.

3. Conference Social Media. The social media presence of conferences is managed by the individual conferences. It is not uniform and is disconnected from the SIGMM social media and the Records. The social media presence of the ACMMM flagship conference is weak and needs help. Creating continuity in terms of strategy and processes across conference editions is key.

  • Mandatory for ACMMM Organizers and SIGMM AB: create a Handbook of conference communications: a set of guidelines about how to create continuity across conference editions in terms of communications, and how to connect the SIGMM Records to the rest of the community.
  • Suggested for ACMMM Organizers and SIGMM AB: one member of the Social Media team at the SIGMM Records is systematically invited to join the OC of major conferences as publicity co-chair. The steering committee chair of each conference should commit to keeping the organizers of each conference edition informed about this policy, and monitor its implementation throughout the years.

SIGMM Strike Team on Open Review

Team members: Xavier Alameda Pineda (Univ. Grenoble-Alpes), Marco Bertini (Univ. Firenze).
Coordinator: Xavier Alameda Pineda

The team continued its support to ACMMM conference organizers for the use of Open Review in the ACMMM reviewing process, helping to implement new functions or improve existing ones and supporting a smooth transfer of best practices. The recommendations addressed distinct items to complete the migration and stabilize the use of Open Review in future ACMMM editions.

1. Technical development and support

  • Mandatory for the Team: update and publish the scripts; complete the Open Review configuration.
  • Mandatory for SIGMM AB and ACMMM organizers: create a Committee led by the TPC chairs of the current ACMMM edition on a rotating basis.

2. Communication

  • Mandatory for the Team: write a small manual for use and include it in the future ACMMM Handbook.

Alberto Del Bimbo
SIGMM Chair

VQEG Column: VQEG Meeting December 2023

Introduction

The last plenary meeting of the Video Quality Experts Group (VQEG) was hosted online by the University of Konstanz (Germany) from December 18th to 21st, 2023. It offered more than 100 registered participants from 19 different countries worldwide the possibility to attend the numerous presentations and discussions about topics related to the ongoing projects within VQEG. All the related information, minutes, and files from the meeting are available online on the VQEG meeting website, and video recordings of the meeting will soon be available on YouTube.

All the topics mentioned below can be of interest for the SIGMM community working on quality assessment, but special attention can be devoted to the current activities on improvements of the statistical analysis of subjective experiments and objective metrics and on the development of a test plan to evaluate the QoE of immersive interactive communication systems in collaboration with ITU.

Readers of these columns interested in the ongoing projects of VQEG are encouraged to subscribe to VQEG’s email reflectors to follow the ongoing activities and to get involved with them.

As already announced on the VQEG website, the next VQEG plenary meeting will be hosted by Universität Klagenfurt in Austria from July 1st to 5th, 2024.

Group picture of the online meeting

Overview of VQEG Projects

Audiovisual HD (AVHD)

The AVHD group works on developing and validating subjective and objective methods to analyze commonly available video systems. During the meeting, there were various sessions in which presentations related to these topics were discussed.

Firstly, Ali Ak (Nantes Université, France), provided an analysis of the relation between acceptance/annoyance and visual quality in a recently collected dataset of several User Generated Content (UGC) videos. Then, Syed Uddin (AGH University of Krakow, Poland) presented a video quality assessment method based on the quantization parameter of MPEG encoders (MPEG-4, MPEG-AVC, and MPEG-HEVC) leveraging VMAF. In addition, Sang Heon Le (LG Electronics, Korea) presented a technique for pre-enhancement for video compression and applicable subjective quality metrics. Another talk was given by Alexander Raake (TU Ilmenau, Germany), who presented AVQBits, a versatile no-reference bitstream-based video quality model (based on the standardized ITU-T P.1204.3 model) that can be applied in several contexts such as video service monitoring, evaluation of video encoding quality, of gaming video QoE, and even of omnidirectional video quality. Also, Jingwen Zhu (Nantes Université, France) and Hadi Amirpour (University of Klagenfurt, Austria) described a study on the evaluation of the effectiveness of different video quality metrics in predicting the Satisfied User Ratio (SUR) in order to enhance the VMAF proxy to better capture content-specific characteristics. Andreas Pastor (Nantes Université, France) presented a method to predict the distortion perceived locally by human eyes in AV1-encoded videos using deep features, which can be easily integrated into video codecs as a pre-processing step before starting encoding.

In relation to standardization efforts, Mathias Wien (RWTH Aachen University, Germany) gave an overview of recent expert viewing tests that have been conducted within MPEG AG5 at the 143rd and 144th MPEG meetings. Also, Kamil Koniuch (AGH University of Krakow, Poland) presented a proposal to update the Survival Game task defined in ITU-T Recommendation P.1301 on subjective quality evaluation of audio and audiovisual multiparty telemeetings, in order to improve its implementation and its application to recent efforts such as the evaluation of immersive communication systems within ITU-T P.IXC (see the paragraph on the Immersive Media Group).

Quality Assessment for Health applications (QAH)

The QAH group is focused on the quality assessment of health applications. It addresses subjective evaluation, generation of datasets, development of objective metrics, and task-based approaches. Recently, the group has been working towards an ITU-T recommendation for the assessment of medical contents. On this topic, Meriem Outtas (INSA Rennes, France) led a discussion on the editing of a draft of this recommendation. In addition, Lumi Xia (INSA Rennes, France) presented a study of task-based medical image quality assessment focusing on a use case of adrenal lesions.

Statistical Analysis Methods (SAM)

The SAM group investigates analysis methods both for the results of subjective experiments and for objective quality models and metrics. This was one of the most active groups in this meeting, with several presentations on related topics.

On this topic, Krzysztof Rusek (AGH University of Krakow, Poland) presented a Python package to estimate Generalized Score Distribution (GSD) parameters and showed how to use it to test the results obtained in subjective experiments. Andreas Pastor (Nantes Université, France) presented a comparison between two subjective studies using Absolute Category Rating with Hidden Reference (ACR-HR) and Degradation Category Rating (DCR), conducted in a controlled laboratory environment on SDR HD, UHD, and HDR UHD contents using naive observers. The goal of these tests is to estimate rate-distortion savings between two modern video codecs and to compare the precision and accuracy of both subjective methods. He also presented another study comparing conditions for omnidirectional video with spatial audio, in terms of subjective quality and impact on the resolving power of objective metrics.

In addition, Lukas Krasula (Netflix, USA) introduced e2nest, a web-based platform to conduct media-centric (video, audio, and images) subjective tests. Also, Dietmar Saupe (University of Konstanz, Germany) and Simon Del Pin (NTNU, Norway) showed the results of a study analyzing national differences in image quality assessment, showing significant differences in various areas. Alexander Raake (TU Ilmenau, Germany) presented a study on the remote testing of high-resolution images and videos using AVrate Voyager, which is a publicly accessible framework for online tests. Finally, Dominik Keller (TU Ilmenau, Germany) presented a recent study exploring the impact of 8K (UHD-2) resolution on HDR video quality, considering different viewing distances. The results showed that the enhanced video quality of 8K HDR over 4K HDR diminishes with increasing viewing distance.

No Reference Metrics (NORM)

The NORM group runs a collaborative effort to develop no-reference metrics for monitoring visual service quality. At this meeting, Ioannis Katsavounidis (Meta, USA) led a discussion on the current efforts to improve complexity metrics for images and videos. In addition, Krishna Srikar Durbha (University of Texas at Austin, USA) presented a technique to tackle the problem of bitrate ladder construction based on multiple Visual Information Fidelity (VIF) feature sets extracted from different scales and subbands of a video.

Emerging Technologies Group (ETG)

The ETG group focuses on various aspects of multimedia that, although they are not necessarily directly related to “video quality”, can indirectly impact the work carried out within VQEG and are not addressed by any of the existing VQEG groups. In particular, this group aims to provide a common platform for people to gather together and discuss new emerging topics, possible collaborations in the form of joint survey papers, funding proposals, etc.

In this meeting, Nabajeet Barman and Saman Zadtootaghaj (Sony Interactive Entertainment, Germany) suggested a new topic to be discussed within VQEG: Quality Assessment of AI-Generated/Modified Content. The goal is to hold subsequent discussions on this topic within the group and to write a position paper or white paper.

Joint Effort Group (JEG) – Hybrid

The JEG group addresses several areas of Video Quality Assessment (VQA), such as the creation of large datasets for training quality models using full-reference metrics instead of subjective scores. In addition, the group includes the VQEG project Implementer’s Guide for Video Quality Metrics (IGVQM). At the meeting, Enrico Masala (Politecnico di Torino, Italy) provided updates on the activities of the group and on IGVQM.

Apart from this, three presentations addressed related topics in this meeting, delivered by Lohic Fotio Tiotsop (Politecnico di Torino, Italy). The first focused on quality estimation in subjective experiments and the identification of peculiar subject behaviors, introducing a robust approach for estimating subjective quality from noisy ratings and a novel subject scoring model that enables highlighting several peculiar behaviors. He also introduced a non-parametric perspective to address the media quality recovery problem, without making any a priori assumption on the subjects’ scoring behavior. Finally, he presented an approach called the “human-in-the-loop training process” that uses multiple cycles of a human voting, DNN training, and inference procedure.

Immersive Media Group (IMG)

The IMG group is performing research on the quality assessment of immersive media technologies. Currently, the main joint activity of the group is the development of a test plan to evaluate the QoE of immersive interactive communication systems, which is carried out in collaboration with ITU-T through the work item P.IXC. In this meeting, Pablo Pérez (Nokia XR Lab, Spain), Jesús Gutiérrez (Universidad Politécnica de Madrid, Spain), Kamil Koniuch (AGH University of Krakow, Poland), Ashutosh Singla (CWI, The Netherlands), and other researchers involved in the test plan provided an update on its status, focusing on the description of the four interactive tasks to be performed in the test, the considered measures, and the 13 different experiments that will be carried out in the labs involved in the test plan. Also, in relation to this test plan, Felix Immohr (TU Ilmenau, Germany) presented a study on the impact of spatial audio on social presence and user behavior in multi-modal VR communications.

Diagram of the methodology of the joint IMG test plan

Quality Assessment for Computer Vision Applications (QACoViA)

The QACoViA group addresses the study of the visual quality requirements for computer vision methods, where the final user is an algorithm. In this meeting, Mikołaj Leszczuk (AGH University of Krakow, Poland) and Jingwen Zhu (Nantes Université, France) presented a specialized dataset developed for enhancing Automatic License Plate Recognition (ALPR) systems. In addition, Hanene Brachemi (IETR-INSA Rennes, France) presented a study on evaluating the vulnerability of deep learning-based image quality assessment methods to adversarial attacks. Finally, Alban Marie (IETR-INSA Rennes, France) delivered a talk on the exploration of the lossy image coding trade-off between rate, machine perception, and quality.

5G Key Performance Indicators (5GKPI)

The 5GKPI group studies the relationship between key performance indicators of new 5G networks and the QoE of video services running on top of them. At the meeting, Pablo Pérez (Nokia XR Lab, Spain) led an open discussion on the future activities of the group towards 6G, including a brief presentation of QoS/QoE management in 3GPP and of potential opportunities to influence QoE in 6G.

MPEG Column: 146th MPEG Meeting in Rennes, France

The 146th MPEG meeting was held in Rennes, France from 22-26 April 2024, and the official press release can be found here. It comprises the following highlights:

  • AI-based Point Cloud Coding*: Call for proposals focusing on AI-driven point cloud encoding for applications such as immersive experiences and autonomous driving.
  • Object Wave Compression*: Call for interest in object wave compression for enhancing computer holography transmission.
  • Open Font Format: Committee Draft of the fifth edition, overcoming previous limitations like the 64K glyph encoding constraint.
  • Scene Description: Ratified second edition, integrating immersive media objects and extending support for various data types.
  • MPEG Immersive Video (MIV): New features in the second edition, enhancing the compression of immersive video content.
  • Video Coding Standards: New editions of AVC, HEVC, and Video CICP, incorporating additional SEI messages and extended multiview profiles.
  • Machine-Optimized Video Compression*: Advancement in optimizing video encoders for machine analysis.
  • MPEG-I Immersive Audio*: Reached Committee Draft stage, supporting high-quality, real-time interactive audio rendering for VR/AR/MR.
  • Video-based Dynamic Mesh Coding (V-DMC)*: Committee Draft status for efficiently storing and transmitting dynamic 3D content.
  • LiDAR Coding*: Enhanced efficiency and responsiveness in LiDAR data processing with the new standard reaching Committee Draft status.

* … covered in this column.

AI-based Point Cloud Coding

MPEG issued a Call for Proposals (CfP) on AI-based point cloud coding technologies as a result of ongoing explorations regarding use cases, requirements, and the capabilities of AI-driven point cloud encoding, particularly for dynamic point clouds.

With recent significant progress in AI-based point cloud compression technologies, MPEG is keen on studying and adopting AI methodologies. MPEG is specifically looking for learning-based codecs capable of handling a broad spectrum of dynamic point clouds, which are crucial for applications ranging from immersive experiences to autonomous driving and navigation. As the field evolves rapidly, MPEG expects to receive multiple innovative proposals. These may include a unified codec, capable of addressing multiple types of point clouds, or specialized codecs tailored to meet specific requirements, contingent upon demonstrating clear advantages. MPEG has therefore publicly called for submissions of AI-based point cloud codecs, aimed at deepening the understanding of the various options available and their respective impacts. Submissions that meet the requirements outlined in the call will be invited to provide source code for further analysis, potentially laying the groundwork for a new standard in AI-based point cloud coding. MPEG welcomes all relevant contributions and looks forward to evaluating the responses.

Research aspects: In-depth analysis of algorithms, techniques, and methodologies, including a comparative study of various AI-driven point cloud compression techniques to identify the most effective approaches. Other aspects include creating or improving learning-based codecs that can handle dynamic point clouds as well as metrics for evaluating the performance of these codecs in terms of compression efficiency, reconstruction quality, computational complexity, and scalability. Finally, the assessment of how improved point cloud compression can enhance user experiences would be worthwhile to consider here also.

Object Wave Compression

A Call for Interest (CfI) in object wave compression has been issued by MPEG. Computer holography, a 3D display technology, utilizes a digital fringe pattern called a computer-generated hologram (CGH) to reconstruct 3D images from input 3D models. Holographic near-eye displays (HNEDs) reduce the need for extensive pixel counts due to their wearable design, positioning the display near the eye. This positions HNEDs as frontrunners for the early commercialization of computer holography, with significant research underway for product development. Innovative approaches facilitate the transmission of object wave data, crucial for CGH calculations, over networks. Object wave transmission offers several advantages, including independent treatment from playback device optics, lower computational complexity, and compatibility with video coding technology. These advancements open doors for diverse applications, ranging from entertainment experiences to real-time two-way spatial transmissions, revolutionizing fields such as remote surgery and virtual collaboration. As MPEG explores object wave compression for computer holography transmission, the Call for Interest seeks contributions to address market needs in this field.

Research aspects: Apart from compression efficiency, lower computation complexity, and compatibility with video coding technology, there is a range of research aspects, including the design, implementation, and evaluation of coding algorithms within the scope of this CfI. The QoE of computer-generated holograms (CGHs) together with holographic near-eye displays (HNEDs) is yet another dimension to be explored.

Machine-Optimized Video Compression

MPEG started working on a technical report regarding the “Optimization of Encoders and Receiving Systems for Machine Analysis of Coded Video Content”. In recent years, the efficacy of machine learning-based algorithms in video content analysis has steadily improved. However, an encoder designed for human consumption does not always produce compressed video conducive to effective machine analysis. This challenge lies not in the compression standard but in optimizing the encoder or receiving system. The forthcoming technical report addresses this gap by showcasing technologies and methods that optimize encoders or receiving systems to enhance machine analysis performance.

Research aspects: Video (and audio) coding for machines has recently been addressed by the MPEG Video and Audio working groups, respectively. The MPEG Joint Video Experts Team with ITU-T SG16, also known as JVET, joined this space with a technical report, but the research aspects remain unchanged, i.e., coding efficiency, metrics, and quality aspects for machine analysis of compressed/coded video content.

MPEG-I Immersive Audio

MPEG Audio Coding is entering the “immersive space” with MPEG-I immersive audio and its corresponding reference software. The MPEG-I immersive audio standard sets a new benchmark for compact and lifelike audio representation in virtual and physical spaces, catering to Virtual, Augmented, and Mixed Reality (VR/AR/MR) applications. By enabling high-quality, real-time interactive rendering of audio content with six degrees of freedom (6DoF), users can experience immersion, freely exploring 3D environments while enjoying dynamic audio. Designed in accordance with MPEG’s rigorous standards, MPEG-I immersive audio ensures efficient distribution across bandwidth-constrained networks without compromising on quality. Unlike proprietary frameworks, this standard prioritizes interoperability, stability, and versatility, supporting both streaming and downloadable content while seamlessly integrating with MPEG-H 3D audio compression. MPEG-I’s comprehensive modeling of real-world acoustic effects, including sound source properties and environmental characteristics, guarantees an authentic auditory experience. Moreover, its efficient rendering algorithms balance computational complexity with accuracy, empowering users to finely tune scene characteristics for desired outcomes.

Research aspects: Evaluating QoE of MPEG-I immersive audio-enabled environments as well as the efficient audio distribution across bandwidth-constrained networks without compromising on audio quality are two important research aspects to be addressed by the research community.

Video-based Dynamic Mesh Coding (V-DMC)

Video-based Dynamic Mesh Compression (V-DMC) represents a significant advancement in 3D content compression, catering to the ever-increasing complexity of dynamic meshes used across various applications, including real-time communications, storage, free-viewpoint video, augmented reality (AR), and virtual reality (VR). The standard addresses the challenges associated with dynamic meshes that exhibit time-varying connectivity and attribute maps, which were not sufficiently supported by previous standards. Video-based Dynamic Mesh Compression promises to revolutionize how dynamic 3D content is stored and transmitted, allowing more efficient and realistic interactions with 3D content globally.

Research aspects: V-DMC aims to allow “more efficient and realistic interactions with 3D content”, which are subject to research, i.e., compression efficiency vs. QoE in constrained networked environments.

Low Latency, Low Complexity LiDAR Coding

Low Latency, Low Complexity LiDAR Coding underscores MPEG’s commitment to advancing coding technologies required by modern LiDAR applications across diverse sectors. The new standard addresses critical needs in the processing and compression of LiDAR-acquired point clouds, which are integral to applications ranging from automated driving to smart city management. It provides an optimized solution for scenarios requiring high efficiency in both compression and real-time delivery, responding to the increasingly complex demands of LiDAR data handling. LiDAR technology has become essential for various applications that require detailed environmental scanning, from autonomous vehicles navigating roads to robots mapping indoor spaces. The Low Latency, Low Complexity LiDAR Coding standard will facilitate a new level of efficiency and responsiveness in LiDAR data processing, which is critical for the real-time decision-making capabilities needed in these applications. This standard builds on comprehensive analysis and industry feedback to address specific challenges such as noise reduction, temporal data redundancy, and the need for region-based quality of compression. The standard also emphasizes the importance of low latency coding to support real-time applications, essential for operational safety and efficiency in dynamic environments.

Research aspects: This standard tackles the challenge of balancing high compression efficiency with real-time capability, two often conflicting goals. Researchers can make meaningful contributions by exploring this trade-off.

The 147th MPEG meeting will be held in Sapporo, Japan, from July 15-19, 2024. Click here for more information about MPEG meetings and their developments.

JPEG Column: 102nd JPEG Meeting in San Francisco, U.S.A.

JPEG Trust reaches Draft International Standard stage

The 102nd JPEG meeting was held in San Francisco, California, USA, from 22 to 26 January 2024. At this meeting, JPEG Trust became a Draft International Standard. Moreover, the responses to the Call for Proposals of JPEG NFT were received and analysed. As a consequence, relevant steps were taken towards the definition of standardized tools for certification of the provenance and authenticity of media content, at a time when tools for effective media manipulation are widely available to the general public. The 102nd JPEG meeting was finalised with the JPEG Emerging Technologies Workshop, held at Tencent, Palo Alto, on 27 January.

JPEG Emerging Technologies Workshop, organised on 27 January at Tencent, Palo Alto

The following sections summarize the main highlights of the 102nd JPEG meeting:

  • JPEG Trust reaches Draft International Standard stage;
  • JPEG AI improves the Verification Model;
  • JPEG Pleno Learning-based Point Cloud coding releases the Committee Draft;
  • JPEG Pleno Light Field continues development of Quality assessment tools;
  • AIC starts working on Objective Quality Assessment models for Near Visually Lossless coding;
  • JPEG XE prepares Common Test Conditions;
  • JPEG DNA evaluates its Verification Model;
  • JPEG XS 3rd edition parts are ready for publication as International standards;
  • JPEG XL investigates HDR compression performance.

JPEG Trust

At its 102nd meeting the JPEG Committee produced the DIS (Draft International Standard) of JPEG Trust Part 1 “Core Foundation” (21617-1). It is expected that the standard will be published as an International Standard during the Summer of 2024. This rapid standardization schedule has been necessary because of the speed at which fake media and misinformation are proliferating especially with respect to Generative AI.

The JPEG Trust Core Foundation specifies a comprehensive framework for individuals, organizations, and governing institutions interested in establishing an environment of trust for the media that they use, and for supporting trust in the media they share online. This framework addresses aspects of provenance, authenticity, integrity, copyright, and identification of assets and stakeholders. To complement Part 1, a proposed new Part 2 “Trust Profiles Catalogue” has been established. This new Part will specify a catalogue of Trust Profiles, targeting common usage scenarios.

During the meeting, the committee also evaluated responses received to the JPEG NFT Final Call for Proposals (CfP). Certain portions of the submissions will be incorporated in the JPEG Trust suite of standards to improve interoperability with respect to media tokenization. As a first step, the committee will focus on standardization of declarations of authorship and ownership.

Finally, the Use Cases and Requirements document for JPEG Trust was updated to incorporate additional requirements in respect of composited media. This document is publicly available on the JPEG website.

A white paper describing the JPEG Trust framework is also publicly available on the JPEG website.

JPEG AI

At the 102nd JPEG meeting, the JPEG AI Verification Model was improved by integrating nearly all the contributions adopted at the 101st JPEG meeting. The major change is a multi-branch JPEG AI decoding architecture with two encoders and three decoders (6 possible compatible combinations) that have been jointly trained, which allows the coverage of encoder and decoder complexity-efficiency tradeoffs. The entropy decoding and latent prediction portion is common for all possible combinations and thus differences reside at the analysis/synthesis networks. Moreover, the number of models has been reduced to 4, both 4:4:4 and 4:2:0 coding is supported, and JPEG AI can now achieve better rate-distortion performance in some relevant use cases. A new training dataset has also been adopted with difficult/high-contrast/versatile images to reduce the number of artifacts and to achieve better generalization and color reproducibility for a wide range of situations. Other enhancements have also been adopted, namely feature clipping for decoding artifacts reduction, improved variable bit-rate training strategy and post-synthesis transform filtering speedups.

The resulting performance and complexity characterization shows compression efficiency (BD-rate) gains of 12.5% to 27.9% over the VVC Intra anchor, for relevant encoder and decoder configurations with a wide range of complexity-efficiency tradeoffs (7 to 216 kMAC/px at the decoder side). On the CPU platform, decoder complexity is 1.6x/3.1x that of VVC Intra (reference implementation) for the simplest/base operating point. At the 102nd meeting, 12 core experiments were established to continue work on different topics, most notably the JPEG AI high-level syntax, progressive decoding, the training dataset, hierarchical dependent tiling, and spatial random access. Finally, two demonstrations were shown in which JPEG AI decoder implementations ran on two smartphone devices, a Huawei Mate50 Pro and an iPhone14 Pro.
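
For readers who want to reproduce such figures, BD-rate is the Bjøntegaard delta measure commonly used in codec evaluations. The sketch below is a generic polynomial-fit variant, not a normative JPEG tool, and the rate-distortion points in the usage example are invented:

    import numpy as np

    def bd_rate(rate_anchor, psnr_anchor, rate_test, psnr_test):
        """Average bitrate difference (%) between two rate-distortion curves
        over their overlapping quality range; negative means bitrate savings."""
        p_a = np.polyfit(psnr_anchor, np.log(rate_anchor), 3)  # log-rate vs PSNR
        p_t = np.polyfit(psnr_test, np.log(rate_test), 3)
        lo = max(min(psnr_anchor), min(psnr_test))  # overlapping PSNR interval
        hi = min(max(psnr_anchor), max(psnr_test))
        int_a = np.polyval(np.polyint(p_a), hi) - np.polyval(np.polyint(p_a), lo)
        int_t = np.polyval(np.polyint(p_t), hi) - np.polyval(np.polyint(p_t), lo)
        avg_log_diff = (int_t - int_a) / (hi - lo)
        return (np.exp(avg_log_diff) - 1.0) * 100.0

    # Invented rate (bpp) and PSNR (dB) points, four per codec:
    print(bd_rate([0.25, 0.5, 1.0, 2.0], [32.0, 35.0, 38.0, 41.0],
                  [0.20, 0.42, 0.85, 1.70], [32.2, 35.1, 38.2, 41.1]))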

JPEG Pleno Learning-based Point Cloud coding

The 102nd JPEG meeting marked an important milestone for JPEG Pleno Point Cloud with the release of its Committee Draft (CD) for ISO/IEC 21794 Part 6 “Learning-based point cloud coding” (21794-6). Part 6 of the JPEG Pleno framework brings an innovative learning-based point cloud coding technology, adding value to the existing Parts focused on light field and holography coding. It is expected that a Draft International Standard (DIS) of Part 6 will be approved at the 104th JPEG meeting in July 2024 and that the International Standard will be published during 2025. The 102nd meeting also marked the release of version 4 of the JPEG Pleno Point Cloud Verification Model, updated to be robust to different hardware and software operating environments.

JPEG Pleno Light Field

The JPEG Committee has recently published a light field coding standard, and JPEG Pleno is constantly exploring novel light field coding architectures. The JPEG Committee is also preparing standardization activities – among others – in the domains of objective and subjective quality assessment for light fields, improved light field coding modes, and learning-based light field coding.

As the JPEG Committee seeks continuous improvement of its use case and requirements specifications, it organized a Light Field Industry Workshop. The presentations and video recording of the workshop that took place on November 22nd, 2023 are available on the JPEG website.

JPEG AIC

During the 102nd JPEG meeting, work on Image Quality Assessment continued with a focus on JPEG AIC-3, which targets the standardization of a subjective visual quality assessment methodology for images in the range from high to nearly visually lossless quality. The activity is currently investigating three different subjective image quality assessment methodologies.

The JPEG Committee also launched the activities on Part 4 of the standard (AIC-4), by initiating work on the Draft Call for Proposals on Objective Image Quality Assessment. The Final Call for Proposals on Objective Image Quality Assessment is planned to be released in July 2024, while the submission of the proposals is planned for October 2024.

JPEG XE

The JPEG Committee continued its activity on JPEG XE and event-based vision. This activity revolves around a new and emerging image modality created by event-based visual sensors. JPEG XE concerns the creation and development of a standard to represent events in an efficient way allowing interoperability between sensing, storage, and processing, targeting machine vision and other relevant applications. The JPEG Committee is preparing a Common Test Conditions document that provides the means to perform an evaluation of candidate technology for the efficient coding of event sequences. The Common Test Conditions provide a definition of a reference format, a dataset, a set of key performance metrics, and an evaluation methodology. In addition, the committee is preparing a Draft Call for Proposals on lossless coding, with the intent to make it public in April 2024. Standardization will first start with lossless coding of event sequences, as this seems to have the highest application urgency in industry. However, the committee acknowledges that lossy coding of event sequences is also a valuable feature, which will be addressed at a later stage. The public Ad-hoc Group on Event-based Vision was re-established to continue the work towards the 103rd JPEG meeting in April 2024. To stay informed about the activities, please join the Event-based Vision Ad-hoc Group mailing list.
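
For intuition about why dedicated event coding pays off, the toy sketch below losslessly delta-encodes the timestamps of an (x, y, t, polarity) event stream, a common raw representation for event cameras. It is purely illustrative and is not the JPEG XE reference format:

    def delta_encode(events):
        """Store timestamp deltas instead of absolute timestamps; the small
        deltas of dense event streams are cheaper to entropy-code."""
        out, prev_t = [], 0
        for x, y, t, p in events:
            out.append((x, y, t - prev_t, p))
            prev_t = t
        return out

    def delta_decode(deltas):
        events, t = [], 0
        for x, y, dt, p in deltas:
            t += dt
            events.append((x, y, t, p))
        return events

    evs = [(10, 20, 1000, 1), (11, 20, 1015, 0), (10, 21, 1030, 1)]
    assert delta_decode(delta_encode(evs)) == evs  # lossless round trip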

JPEG DNA

During the 102nd JPEG meeting, the JPEG DNA Verification Model description and software were approved, along with continued efforts to evaluate its rate-distortion characteristics. Notably, during the 102nd meeting, a subjective quality assessment was carried out by expert viewing using a new approach under development in the framework of AIC-3. The robustness of the Verification Model to errors generated in a biochemical process was also analysed using a simple noise simulator. After meticulous analysis of the results, it was decided to create a number of core experiments to improve the Verification Model's rate-distortion performance and to improve its robustness by adding an error correction technique. In parallel, efforts are underway to improve the rate-distortion performance of the JPEG DNA Verification Model by exploring learning-based coding solutions. In addition, further efforts are defined to improve the noise simulator so as to allow assessment of the Verification Model's resilience to noise in more realistic conditions, laying the groundwork for a JPEG DNA standard robust to insertion, deletion, and substitution errors.
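
For intuition on image coding with nucleotide sequences, the toy sketch below packs two bits per nucleotide and measures the homopolymer run length, one of the biochemical constraints real DNA codecs must respect. It is a didactic illustration, not the JPEG DNA Verification Model:

    BITS_TO_NT = {"00": "A", "01": "C", "10": "G", "11": "T"}
    NT_TO_BITS = {v: k for k, v in BITS_TO_NT.items()}

    def encode(data: bytes) -> str:
        bits = "".join(f"{b:08b}" for b in data)
        return "".join(BITS_TO_NT[bits[i:i + 2]] for i in range(0, len(bits), 2))

    def decode(seq: str) -> bytes:
        bits = "".join(NT_TO_BITS[nt] for nt in seq)
        return bytes(int(bits[i:i + 8], 2) for i in range(0, len(bits), 8))

    def max_homopolymer(seq: str) -> int:
        run = best = 1
        for a, b in zip(seq, seq[1:]):
            run = run + 1 if a == b else 1
            best = max(best, run)
        return best

    seq = encode(b"JPEG")
    assert decode(seq) == b"JPEG"
    print(seq, max_homopolymer(seq))  # real codecs bound this run length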

JPEG XS

The JPEG Committee is happy to announce that the core parts of JPEG XS 3rd edition are ready for publication as International standards. The Final Draft International Standard for Part 1 of the standard – Core coding tools – was created at the last meeting in November 2023, and is scheduled for publication. DIS ballot results for Part 2 – Profiles and buffer models – and Part 3 – Transport and container formats – of the standard came back, allowing the JPEG Committee to produce and deliver the proposed IS texts to ISO. This means that Part 2 and Part 3 3rd edition are also scheduled for publication.

At this meeting, the JPEG Committee continued the work on Part 4 – Conformance testing, to provide the necessary test streams of the 3rd edition for potential implementors. A Committee Draft for Part 4 was issued. With Parts 1, 2, and 3 now ready, and Part 4 ongoing, the JPEG Committee initiated the 3rd edition of Part 5 – Reference software. A first Working Draft was prepared and work on the reference software will start.

Finally, experimental results were presented on how to use JPEG XS over 5G mobile networks for the transmission of low-latency and high quality 4K/8K 360 degree views with mobile devices. This use case was added at the previous JPEG meeting. It is expected that the new use case can already be covered by the 3rd edition, meaning that no further updates to the standard would be necessary. However, investigations and experimentation on this subject continue.

JPEG XL

The second edition of JPEG XL Part 3 (Conformance testing) has proceeded to the DIS stage. Work on a hardware implementation continues. Experiments are planned to investigate HDR compression performance of JPEG XL.

“In its efforts to provide standardized solutions to ascertain the authenticity and provenance of visual information, the JPEG Committee has released the Draft International Standard of JPEG Trust. JPEG Trust will bring trustworthiness back to imaging, with specifications under the governance of the entire international community and stakeholders, as opposed to a small number of companies or countries.” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

First edition of the Social Robotics, Artificial Intelligence and Multimedia (SoRAIM) School


The first edition of the Social Robotics, Artificial Intelligence and Multimedia (SoRAIM) Winter School was held in February 2024. With the support of SIGMM, it attracted more than 50 students and young researchers to learn, discuss, and experiment first-hand with topics related to social robotics. The event's success calls for further editions in upcoming years.

Rationale for SoRAIM

SPRING, a collaborative research project funded by the European Commission under Horizon 2020, is coming to an end in May 2024. Its scientific and technological objectives, to test a versatile social robotic platform within a hospital and have it perform social activities in a multi-person, dynamic setup, have for the most part been achieved. To empower the next generation of young researchers with the concepts and tools to answer tomorrow's challenges in the field of social robotics, the issue of knowledge and know-how transmission must be tackled. We therefore chose to offer a winter school, free of charge to the participants (thanks to the additional support of SIGMM), so that as many students and young researchers from various horizons (not only technical fields) as possible could attend.

Contents of the Winter School

The Social Robotics, Artificial Intelligence and Multimedia (SoRAIM) Winter School took place from 19 to 23 February 2024 in Grenoble, France. The first day opened with an introduction to the contents of the school and to the context provided by the SPRING project, followed by a demonstration combining social navigation and dialogue interaction. This triggered the curiosity of the participants and led to a spontaneous Q&A session with contributions, questions, and comments from the school's participants.

The school spanned the entire week, with 17 talks: 8 by speakers from the H2020 SPRING project and 9 by invited speakers external to the project. The school also included a panel discussion on the topic “Are social robots already out there? Immediate challenges in real-world deployment”, a poster session with 15 contributions, and two hands-on sessions where the participants could choose among the following topics: Robot navigation with Reinforcement Learning, ROS4HRI: How to represent and reason about humans with ROS, Building a conversational system with LLMs using prompt engineering, Robot self-localisation based on camera images, and Speaker extraction from microphone recordings. A social activity (a visit of Grenoble's downtown and the Bastille) was organised on Thursday afternoon, allowing participants to mingle with speakers and to discover the host town's history.

One of the highlights of SoRAIM was its panel session on the topic “Are social robots already out there? Immediate challenges in real-world deployment”. Although no definitive answers were found, the session stressed that the challenges to deploying actual social robots in our everyday lives (at work, at home) remain numerous. On the technical side, robotic platforms are subject to hardware and software constraints: sensors and actuators are restricted in size, power, and performance, since physical space and battery capacity are limited, and large models can only be used when substantial computing resources are permanently available, which is not always the case, since they must be shared among the various computing modules. On the regulatory and legal side, the rise of AI use is fast and needs to be balanced with ethical views that address our society's needs, but the construction of proper laws and norms, and their acknowledgement and understanding by stakeholders, is slow. The panellists surveyed all aspects of the problems at hand and provided an overview of the challenges that future scientists will need to solve in order to take social robots out of the labs and into the world.

Attendance & future perspectives

SoRAIM attracted 57 participants over the whole week. The attendees were diverse, as initially intended, with a breakdown of 50% PhD students, 20% young researchers (public sector), 10% engineers and young researchers (private sector), and 20% MSc students. Notably, the ratio of women attendees was close to 40%, double the usual figure in this field. Finally, in terms of geographic spread, attendees came from 17 countries, with just below 50% coming from France and the majority of the remainder from other European countries. Following the school, a satisfaction survey was sent to the attendees to better grasp which elements were most appreciated, in view of the longer-term objective of holding this winter school as a serial event. Given the diverse backgrounds of attendees, opinions on contents such as the hands-on sessions varied, but overall satisfaction was very high, which shows the interest of the next generation of researchers in more opportunities to learn in this field. We are currently reviewing options to hold similar events every year or every two years, depending on available funding.

More information about the SoRAIM winter school is available on the webpage: https://spring-h2020.eu

Sponsors

SoRAIM was sponsored by the H2020 SPRING project, Inria, the University Grenoble Alpes, the Multidisciplinary Institute of Artificial Intelligence and by ACM’s Special Interest Group on Multimedia (SIGMM). Through ACM SIGMM, we received significant funding which allowed us to invite 14 students and young researchers, members of SIGMM, from abroad.

Full list of contributions

All the talks are available in replay on our YouTube channel: https://www.youtube.com/watch?v=ckJv0eKOgzY&list=PLwdkYSztYsLfWXWai6mppYBwLVjK0VA6y
The complete list of talks and posters presented at the SoRAIM Winter School 2024 can be found here: https://spring-h2020.eu/soraim/

JPEG Column: 101st JPEG Meeting

JPEG Trust reaches Committee Draft stage at the 101st JPEG meeting

The 101st JPEG meeting was held online, from the 30th of October to the 3rd of November 2023. At this meeting, JPEG Trust became a Committee Draft. In addition, JPEG analyzed the responses to its Calls for Proposals for JPEG DNA.

The 101st JPEG meeting had the following highlights:

  • JPEG Trust reaches Committee Draft;
  • JPEG AI requests its re-establishment;
  • JPEG Pleno Learning-based Point Cloud coding establishes a new Verification Model;
  • JPEG Pleno organizes a Light Field Industry Workshop;
  • JPEG AIC-3 continues the evaluation of contributions;
  • JPEG XE produces a first draft of the Common Test Conditions;
  • JPEG DNA analyses the responses to the Call for Proposals;
  • JPEG XS proceeds with the development of the 3rd edition;
  • JPEG XL proceeds with the development of the 2nd edition.

The following sections summarize the main highlights of the 101st JPEG meeting.

JPEG Trust

The 101st meeting marked an important milestone for the JPEG Trust project, with its Committee Draft (CD) for Part 1 “Core Foundation” (21617-1) of the standard approved for consultation. It is expected that a Draft International Standard (DIS) of the Core Foundation will be approved at the 102nd JPEG meeting in January 2024, which will be another important milestone. This rapid schedule is necessitated by the speed at which fake media and misinformation are proliferating, especially in respect of generative AI.

Aligned with JPEG Trust, the JPEG NFT Call for Proposals (CfP) has yielded two expressions of interest to date, and the submission of proposals remains open until the 15th of January 2024.

Additionally, the Use Cases and Requirements document for JPEG Fake Media (the JPEG Fake Media exploration preceded the initiation of the JPEG Trust international standard) was updated to reflect the change to JPEG Trust as well as incorporate additional use cases that have arisen since the previous JPEG meeting, namely in respect of composited images. This document is publicly available on the JPEG website.

JPEG AI

At the 101st meeting, the JPEG Committee issued a request for re-establishing the JPEG AI (6048-1) project, along with a Committee Draft (CD) of its version 1. A new JPEG AI timeline has also been approved and is now publicly available, in which a Draft International Standard (DIS) of the JPEG AI version 1 Core Coding Engine is foreseen at the 103rd JPEG meeting (April 2024), a rather important milestone for JPEG AI. The JPEG Committee also established that JPEG AI version 2 will address requirements not yet fulfilled (especially regarding machine consumption tasks) as well as significant improvements on requirements already addressed in version 1, e.g. compression efficiency. The final Call for Proposals for JPEG AI version 2 will be issued in January 2025, and the presentation and evaluation of version 2 proposals will occur in July 2025. During 2023, the JPEG AI Verification Model (VM) has evolved from a complex system (800 kMAC/pxl) to two acceptable complexity-efficiency operating points, providing 11% compression efficiency gains at 20 kMAC/pxl and 25% compression efficiency gains at 200 kMAC/pxl. The decoder for the lower-end operating point has now been implemented on mobile devices and was demonstrated during the 100th and 101st JPEG meetings. A presentation of the JPEG AI architecture, networks, and tools is now publicly available. To avoid project delays, the promising input contributions from the 101st meeting will be combined in JPEG AI Core Experiment 6.1 (CE6.1) to study their interaction and resolve potential issues during the next meeting cycle. After this integration, a model will be trained and cross-checked for approval and release (JPEG AI VM5 release candidate) along with the study DIS text. Among the promising technologies included in CE6.1 are high-quality and variable-rate improvements with a smaller number of models (from 5 down to 4), a multi-branch decoder that allows up to three reconstructions at different quality levels from the same latent representation using synthesis transform networks of different complexity, and several post-filter and arithmetic coder simplifications.
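
The kMAC/pxl figures quoted throughout count multiply-accumulate operations per decoded pixel. As a minimal sketch of how such a number is estimated for a single convolutional layer (the dimensions below are illustrative, not actual JPEG AI network parameters):

    def conv_kmac_per_pixel(c_in, c_out, kernel, out_h, out_w, img_h, img_w):
        """MACs of one conv layer, normalised per pixel of the decoded image."""
        macs = kernel * kernel * c_in * c_out * out_h * out_w
        return macs / (img_h * img_w) / 1000.0  # kMAC per pixel

    # A 5x5 convolution between 192-channel tensors at 1/16 resolution
    # of a 1024x1024 image contributes about 3.6 kMAC/pxl on its own.
    print(conv_kmac_per_pixel(192, 192, 5, 64, 64, 1024, 1024))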

JPEG Pleno Learning-based Point Cloud coding

The JPEG Pleno Learning-based Point Cloud coding activity progressed at the 101st meeting with a major investigation into point cloud quality metrics. The JPEG Committee decided to continue this investigation into point cloud quality metrics as well as explore possible advancements to the VM in the areas of parameter tuning and support for residual lossless coding. The JPEG Committee is targeting a release of the Committee Draft of Part 6 of the JPEG Pleno standard relating to Learning-based point cloud coding at the 102nd JPEG meeting in San Francisco, USA in January 2024.

JPEG Pleno Light Field

The JPEG Committee has been creating several standards to provision the dynamic demands of the market, with its royalty-free patent licensing commitments. A light field coding standard has recently been developed, and JPEG Pleno is constantly exploring novel light field coding architectures.

The JPEG Committee is also preparing standardization activities – among others – in the domains of objective and subjective quality assessment for light fields, improved light field coding modes, and learning-based light field coding.

A Light Field Industry Workshop was scheduled to take place on November 22nd, 2023, aiming to provide a forum for industrial actors to exchange information on their needs and expectations with respect to standardization activities in this domain.

JPEG AIC

During the 101st JPEG meeting, the AIC activity continued its efforts on the evaluation of the contributions received in April 2023 in response to the Call for Contributions on Subjective Image Quality Assessment. Notably, the activity is currently investigating three different subjective image quality assessment methodologies. The results of the newly established Core Experiments will be considered during the design of the AIC-3 standard, which has been carried out in a collaborative way since its beginning.

The AIC activity also initiated the discussion on Part 4 of the standard on Objective Image Quality Metrics (AIC-4) by refining the Use Cases and Requirements document. During the 102nd JPEG meeting in January 2024, the activity plans to work on the Draft Call for Proposals on Objective Image Quality Assessment.

JPEG XE

The JPEG Committee continued its activity on Event-based Vision. This activity revolves around a new and emerging image modality created by event-based visual sensors. JPEG XE aims at the creation and development of a standard to represent events in an efficient way allowing interoperability between sensing, storage, and processing, targeting machine vision and other relevant applications. For better dissemination and to raise external interest, a workshop on Event-based Vision was organized and took place on October 24th, 2023. The workshop attracted the attention of various stakeholders in the field of Event-based Vision, who will start contributing to JPEG XE. The workshop proceedings will be made available on jpeg.org. In addition, the JPEG Committee produced a minor revision of the Use Cases and Requirements document (v1.0), adding an extra use case on scientific and engineering measurements. Finally, a first draft of the Common Test Conditions for JPEG XE was produced, along with the first Exploration Experiments to start practical experiments in the coming three-month period until the next JPEG meeting. The public Ad-hoc Group on Event-based Vision was re-established to continue the work towards the 102nd JPEG meeting in January 2024. To stay informed about the activities, please join the Event-based Vision Ad-hoc Group mailing list.

JPEG DNA

In response to the Call for Proposals issued by the JPEG Committee for contributions to the JPEG DNA standard, five proposals, based on three distinct codecs, were submitted by three organizations. Two codecs were submitted to both the coding and transcoding categories, and one was submitted to the coding category only. All proposals showed improved compression efficiency when compared to the three anchors selected by the JPEG Committee. After a rigorous analysis of the proposals and their cross-checking by independent parties, it was decided to create a first Verification Model (VM) based on V-DNA, the best-performing proposal. In addition, a number of core experiments were designed to improve the JPEG DNA VM with elements from the other proposals by quantifying their added value when integrated in the VM.

JPEG XS

The JPEG Committee continued its work on JPEG XS 3rd edition. The primary goal of the 3rd edition is to deliver the same image quality as the 2nd edition, but with half of the required bandwidth. The Final Draft International Standard for Part 1 of the standard — Core coding tools — was produced at this meeting. With this FDIS version, all technical features are now fixed and complete. Part 2 — Profiles and buffer models — and Part 3 — Transport and container formats — of the standard are still in DIS ballot, and ballot results will only be known by the end of January 2024. The JPEG Committee is now working on Part 4 — Conformance testing — to provide the necessary test streams of the 3rd edition for potential implementers. A first Working Draft for Part 4 was issued. Completion of the JPEG XS 3rd edition is scheduled for April 2024 (Parts 1, 2, and 3), and Parts 4 and 5 will follow shortly after that. Finally, a new Use Cases and Requirements for JPEG XS document was created, containing a new use case on the transport of 4K/8K video over 5G mobile networks. It is expected that this new use case can already be covered by the 3rd edition, meaning that no further updates to the standard would be needed. However, more investigations and experiments will be conducted on this subject.

JPEG XL

The second editions of JPEG XL Part 1 (Core coding system) and Part 2 (File format) have proceeded to the FDIS stage, and the second edition of JPEG XL Part 3 (Conformance testing) has proceeded to the CD stage. These second editions provide clarifications, corrections and editorial improvements that will facilitate independent implementations. At the same time, the development of hardware implementation solutions continues.

Final Quote

“The release of the first Committee Draft of JPEG Trust is a strong signal that the JPEG Committee is reacting with a timely response to demands for solutions that inform users when digital media assets are created or modified, in particular through Generative AI, hence contributing to bringing back trust into media-centric ecosystems.” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

MPEG Column: 145th MPEG Meeting (Virtual/Online)

The 145th MPEG meeting was held online from 22-26 January 2024, and the official press release can be found here. It comprises the following highlights:

  • Latest Edition of the High Efficiency Image Format Standard Unveils Cutting-Edge Features for Enhanced Image Decoding and Annotation
  • MPEG Systems finalizes Standards supporting Interoperability Testing
  • MPEG finalizes the Third Edition of MPEG-D Dynamic Range Control
  • MPEG finalizes the Second Edition of MPEG-4 Audio Conformance
  • MPEG Genomic Coding extended to support Transport and File Format for Genomic Annotations
  • MPEG White Paper: Neural Network Coding (NNC) – Efficient Storage and Inference of Neural Networks for Multimedia Applications

This column will focus on the High Efficiency Image Format (HEIF) and interoperability testing. As usual, a brief update on MPEG-DASH et al. will be provided.

High Efficiency Image Format (HEIF)

The High Efficiency Image Format (HEIF) is a widely adopted standard in the imaging industry that continues to grow in popularity. At the 145th MPEG meeting, MPEG Systems (WG 3) ratified its third edition, which introduces exciting new features, such as progressive decoding capabilities that enhance image quality through a sequential, single-decoder instance process. With this enhancement, users can decode bitstreams in successive steps, with each phase delivering perceptible improvements in image quality compared to the preceding step. Additionally, the new edition introduces a sophisticated data structure that describes the spatial configuration of the camera and outlines the unique characteristics responsible for generating the image content. The update also includes innovative tools for annotating specific areas in diverse shapes, adding a layer of creativity and customization to image content manipulation. These annotation features cater to the diverse needs of users across various industries.
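
As a toy analogue of that progressive, single-decoder refinement, the sketch below "decodes" an image one bit plane at a time, so a single reconstruction improves at every step. It only illustrates the principle and has nothing to do with actual HEIF syntax:

    import numpy as np

    rng = np.random.default_rng(0)
    image = rng.integers(0, 256, size=(4, 4), dtype=np.uint8)  # stand-in picture

    recon = np.zeros_like(image)
    for plane in range(7, -1, -1):             # most significant bit first
        recon |= image & np.uint8(1 << plane)  # add one more bit of precision
        err = np.abs(image.astype(int) - recon.astype(int)).mean()
        print(f"after bit plane {plane}: mean abs error = {err:.2f}")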

Research aspects: Progressive coding has been a part of modern image coding formats for some time now. However, the inclusion of supplementary metadata provides an opportunity to explore new use cases that can benefit both user experience (UX) and quality of experience (QoE) in academic settings.

Interoperability Testing

MPEG standards typically comprise format definitions (or specifications) to enable interoperability among products and services from different vendors. Interestingly, MPEG goes beyond these format specifications and provides reference software and conformance bitstreams, allowing conformance testing.

At the 145th MPEG meeting, MPEG Systems (WG 3) finalized two standards comprising conformance and reference software by promoting them to the Final Draft International Standard (FDIS), the final stage of standards development. The finalized standards, ISO/IEC 23090-24 and ISO/IEC 23090-25, represent the state of the art in conformance and reference software for scene description and visual volumetric video-based coding data, respectively.

ISO/IEC 23090-24 focuses on conformance and reference software for scene description, providing a comprehensive reference implementation and bitstream tailored for conformance testing related to ISO/IEC 23090-14, scene description. This standard opens new avenues for advancements in scene depiction technologies, setting a new standard for conformance and software reference in this domain.

Similarly, ISO/IEC 23090-25 targets conformance and reference software for the carriage of visual volumetric video-based coding data. With a dedicated reference implementation and bitstream, this standard is poised to elevate the conformance testing standards for ISO/IEC 23090-10, the carriage of visual volumetric video-based coding data. The introduction of this standard is expected to have a transformative impact on the visualization of volumetric video data.

At the same 145th MPEG meeting, MPEG Audio Coding (WG6) celebrated the completion of the second edition of ISO/IEC 14496-26, audio conformance, elevating it to the Final Draft International Standard (FDIS) stage. This significant update incorporates seven corrigenda and five amendments into the initial edition, originally published in 2010.

ISO/IEC 14496-26 serves as a pivotal standard, providing a framework for designing tests to ensure the compliance of compressed data and decoders with the requirements outlined in ISO/IEC 14496-3 (MPEG-4 Audio). The second edition reflects an evolution of the original, addressing key updates and enhancements through diligent amendments and corrigenda. This latest edition, now at the FDIS stage, marks a notable stride in MPEG Audio Coding’s commitment to refining audio conformance standards and ensuring the seamless integration of compressed data within the MPEG-4 Audio framework.

These standards will be made freely accessible for download on the official ISO website, ensuring widespread availability for industry professionals, researchers, and enthusiasts alike.

Research aspects: Reference software and conformance bitstreams often serve as the basis for further research (and development) activities and, thus, are highly appreciated. For example, the reference software of video coding formats (e.g., HM for HEVC, VTM for VVC) can be used as a baseline when improving coding efficiency or other aspects of a coding format.

MPEG-DASH Updates

The current status of MPEG-DASH is shown in the figure below.

MPEG-DASH Status, January 2024.

The following most notable aspects have been discussed at the 145th MPEG meeting and adopted into ISO/IEC 23009-1, which will eventually become the 6th edition of the MPEG-DASH standard:

  • It is now possible to pass CMCD parameters sid and cid via the MPD URL (see the sketch after this list).
  • Segment duration patterns can be signaled using SegmentTimeline.
  • Definition of a background mode of operation, which allows a DASH player to receive MPD updates and listen to events without necessarily decrypting or rendering any media.
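
To make the first two items concrete, here is a minimal sketch; the URL, identifiers, and segment durations are invented, the CMCD query-argument convention follows CTA-5004, and the XML fragment shows a standard SegmentTimeline duration pattern:

    from urllib.parse import urlencode

    # Hypothetical session (sid) and content (cid) identifiers attached to
    # the MPD URL via the single 'CMCD' query argument defined by CTA-5004.
    cmcd = 'cid="movie-42",sid="6e2fb550-c457-11e9-bb97-0800200c9a66"'
    print("https://cdn.example.com/movie/manifest.mpd?" + urlencode({"CMCD": cmcd}))

    # A SegmentTimeline duration pattern: S@d is the segment duration in
    # timescale units, and S@r repeats it for compact signaling of regular runs.
    segment_timeline = """
    <SegmentTemplate timescale="1000" media="seg_$Number$.m4s" startNumber="1">
      <SegmentTimeline>
        <S t="0" d="4000" r="9"/>  <!-- ten 4-second segments -->
        <S d="2000"/>              <!-- one trailing 2-second segment -->
      </SegmentTimeline>
    </SegmentTemplate>
    """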

Additionally, the technologies under consideration (TuC) document has been updated with means to signal the maximum segment rate, extend copyright license signaling, and improve haptics signaling in DASH. Finally, REAP is progressing towards FDIS but is not there yet; most details will be discussed in the upcoming Ad-hoc Group (AhG) period.

The 146th MPEG meeting will be held in Rennes, France, from April 22-26, 2024. Click here for more information about MPEG meetings and their developments.