Standards – ACM SIGMM Records

VQEG Column: VQEG Meeting December 2025

By Jesús Gutiérrez | June 9, 2026 - 10:06 |June 9, 2026 0226, Event Report, Feature, Standards

Introduction

The Video Quality Experts Group (VQEG) fall plenary meeting took place online from December 15 to 19, 2025. More than 130 participants from industry and academic institutions worldwide registered for the meeting.

The meeting was dedicated to presenting updates and discussing topics related to the ongoing projects within VQEG. All the related information, minutes, and files from the meeting are available online on the VQEG meeting website, and video recordings of the meeting are available on YouTube.

Several of the addressed topics can be of interest for the SIGMM community working on quality assessment, such as the shift toward Artificial Intelligence (AI) and neural-based media processing, requiring updated evaluation methodologies; the increasing emphasis on immersive media and XR systems, including the recently published recommendation ITU-T P.1321; the growing interest in user-centric Quality of Experience (QoE) modeling, including individualized quality prediction; and the continued development of datasets, tools, and statistical frameworks for reproducible research.

Readers of these columns interested in the ongoing projects of VQEG are encouraged to subscribe to the corresponding reflectors to follow the ongoing activities and get involved in them.

Overview of VQEG Projects

Immersive Media Group (IMG)

The IMG group continues its focus on quality assessment for immersive technologies. In this meeting, Marta Orduna (Nokia XR Lab, Spain) and Jesús Gutiérrez (Universidad Politécnica de Madrid, Spain) reported updates on the multi-laboratory test plan and next steps for immersive communication assessment. A major achievement of the group was the publication of Recommendation ITU-T P.1321, “Interactive test methods for subjective assessment of extended reality communications”, approved in October 2025. In addition, the following presentations related to IMG topics were delivered:

Felix Immohr (RWTH Aachen University, Germany) introduced the ICS-MR dataset, providing standardized conversational scenarios for evaluating mixed reality communication systems.
Jakob Hartbrich and William Menz (RWTH Aachen University and TU Ilmenau, Germany) analyzed how encoding parameters, such as texture resolution and polygon count, influence the perceived quality of volumetric avatars, showing that texture resolution and quality level are dominant factors.
Daniel Zielasko (Technical University of Denmark, Denmark) presented a cross-laboratory initiative to establish standardized evaluation protocols for cybersickness, aiming to improve reproducibility and enable large-scale datasets.
Another contribution focused on datasets and evaluation tools. Abhinav Bhattacharya (RWTH Aachen University, Germany) introduced AMIS, a multimodal dataset for eXtended Reality (XR) research including multiple representation formats, such as talking-head videos, full-body videos, volumetric avatars, and personalized animated avatars.
A further study explored teleoperation and user experience: Shirin Rafiei (RISE, Sweden) showed that latency and field of view significantly impact user performance, while video quality mainly affects subjective confidence.

Statistical Analysis Methods (SAM)

The SAM group presented several contributions on its main topics: data analysis, subjective evaluation, and statistical modeling.

Ryan Lei (Meta, USA) provided an update on AV2 common test conditions and coding gains, including both subjective and objective evaluation methodologies.
Lucjan Janowski (AGH University of Krakow, Poland) introduced a simulation framework for evaluating subject screening methods, demonstrating that correlation-based approaches outperform traditional kurtosis-based techniques.
Dietmar Saupe (University of Konstanz, Germany) proposed the use of G-test statistical methods for validating models in subjective quality datasets, improving reliability in quality scale reconstruction.
A discussion led by Robert Grosso (NTIA/ITS, USA) addressed open challenges in crowdsourcing-based subjective quality evaluation, including whether such approaches can replace standardized lab-based methodologies.
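
To make the idea of correlation-based subject screening mentioned above more concrete, the following is a minimal sketch, not the method presented at the meeting. It assumes a leave-one-out scheme in which each subject's ratings are correlated with the mean ratings of all other subjects; the function names, the toy ratings, and the 0.75 threshold are hypothetical and chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson linear correlation between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant rater carries no ranking information
    return cov / sqrt(var_x * var_y)


def screen_subjects(scores, threshold=0.75):
    """Return indices of subjects whose ratings correlate poorly with the
    mean ratings of all other subjects (leave-one-out).

    scores: one list of ratings per subject, one rating per stimulus.
    threshold: hypothetical cut-off, used here only for illustration.
    """
    rejected = []
    n_stimuli = len(scores[0])
    for i, own in enumerate(scores):
        others = [s for j, s in enumerate(scores) if j != i]
        loo_mean = [sum(s[k] for s in others) / len(others)
                    for k in range(n_stimuli)]
        if pearson(own, loo_mean) < threshold:
            rejected.append(i)
    return rejected


# Toy data: four subjects rating five stimuli on a 5-point scale.
ratings = [
    [5, 4, 3, 2, 1],
    [5, 5, 3, 2, 1],
    [4, 4, 3, 1, 1],
    [1, 2, 3, 4, 5],  # rates in the opposite order to everyone else
]
rejected = screen_subjects(ratings)
print(rejected)  # [3]
```

A simulation framework of the kind described above would generate many such rating matrices with known unreliable subjects and measure how often each screening rule flags the right ones.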

Joint Effort Group (JEG) – Hybrid

The JEG group addresses several areas of Video Quality Assessment (VQA), and in this meeting it provided contributions focused on advanced modeling of perceptual quality and subjective evaluation methods. In particular, Lohic Fotio Tiotsop (Politecnico di Torino, Italy) introduced a novel attention-based model for predicting individual perceived image coding quality, moving beyond Mean Opinion Score (MOS) toward personalized QoE modeling. In a second contribution, he also presented ongoing work on the evaluation of high-resolution image quality, highlighting variability in no-reference metrics and the need for subjective validation.

Emerging Technologies Group (ETG)

The ETG group continued to highlight forward-looking topics in multimedia. In this meeting:

Abhijay Ghildyal, Saman Zadtootaghaj, and Nabajeet Barman (Sony Interactive Entertainment, Germany and USA) proposed a non-aligned reference image quality assessment framework for novel view synthesis, addressing the challenge of quality evaluation when pixel-perfect references are unavailable.
Mathias Wien (RWTH Aachen University, Germany) presented updates on JVET activities, including results from a Call for Evidence (CfE) on next-generation video compression beyond Versatile Video Coding (VVC) and the outlook for future standardization.

Subjective and objective assessment of GenAI content (SOGAI)

The SOGAI group researches subjective testing methodologies and objective metrics for assessing the quality of GenAI-generated content. In this meeting, the following topics were presented and discussed:

Ryan Lei (Meta, USA) presented an evaluation of AI-based image codecs, comparing neural compression approaches with traditional codecs and discussing challenges in measuring compression efficiency.
Benjamin Herb (TU Ilmenau, Germany) presented large-scale subjective testing of neural codecs, highlighting that metric reliability is comparable between neural and traditional compression methods.

5G Key Performance Indicators (5GKPI)

The 5GKPI group focuses on networked multimedia systems and QoE modeling. In this meeting, the following contributions were presented:

Henrique Rossi and Karan Mitra (Lulea University of Technology, Sweden) proposed a Bayesian network framework for analyzing interactivity in cloud gaming, showing that latency is the dominant factor influencing QoE.
Pablo Pérez (Nokia XR Lab, Spain) presented proposed updates to the QoE definition in ITU-T SG12 recommendation P.10/G.100, raising awareness within VQEG.
Martín Varela (Metosin, Finland / University of Malaga, Spain) discussed broader concepts of QoE at the system level, including fairness and group-level QoE metrics.
Finally, François Blouin (Meta, USA) and Pablo Pérez (Nokia XR Lab, Spain) reported on the status and future directions of the VQEG White Paper on QoE management.

Multimedia Experience and Human Factors (MEHF)

The MEHF group's contributions address human perception and subjective evaluation challenges. In this meeting, the following presentations were delivered:

Kamran Javidi and Maria Martini (Kingston University London) presented a subjective study of the KULF-TT53 dataset on a commercial light field display, showing that AV1 encoding outperforms HEVC in perceived quality for light field content.
Jingwen Zhu investigated resolution cross-over in adaptive streaming of live sports, demonstrating that pairwise comparison (PC) methodologies outperform traditional Absolute Category Rating (ACR) methods for identifying optimal encoding decisions.

Quality Assessment for Computer Vision Applications (QACoViA)

In this meeting, the QACoViA group explored deep learning approaches for image enhancement:

Mehrunnisa (AGH University of Krakow, Poland) presented a MOS-guided deep learning framework for underwater image enhancement, combining perceptual loss functions with subjective score optimization.
Doğukan Öztürk (AGH University of Krakow, Poland) further examined deep learning-based enhancement models, comparing multiple approaches for visual quality improvement.

Other updates

Additional presentations covered diverse topics related to quality assessment of audiovisual technologies:

Kjell Brunnström (RISE, Sweden) presented a study on digital rear-view mirrors highlighting the importance of augmented visual cues for depth perception in automotive systems.
Werner Robitza (AVEQ, Germany) introduced Videoparser-ng, a fast open-source bitstream parser for model development.
Margaret Pinson (NTIA/ITS, USA) presented a new camera dataset (JNR-ITScam), filmed on modern cameras and accompanied by information from the videographers.
Syed Uddin (AGH University of Krakow, Poland) presented a segment-level QoE assessment dataset for adaptive bitrate video streaming.
Henrique Rossi and Karan Mitra (Lulea University of Technology, Sweden) demonstrated ALTRUIST, a multi-platform tool for efficiently conducting subjective QoE tests in immersive systems.
In terms of standardization efforts, MPEG representatives provided an overview of AG5 activities, including updates on datasets and evaluation methodologies, and there was an ITU-T Q19 interim meeting to discuss progress on the recommendations of no-reference metrics (J.Noref).

Finally, as announced on the VQEG website, the next face-to-face VQEG plenary meeting was planned for May 2026.

MPEG Column: 154th MPEG Meeting

By Christian Timmerer | May 14, 2026 - 17:05 |May 15, 2026 0226, Event Report, Feature, Standards

The 154th MPEG meeting took place in Santa Eulària, Spain, from April 27 to May 1, 2026. The official MPEG press release can be found here. This report highlights key outcomes from the meeting, with a focus on research directions relevant to the ACM SIGMM community:

Exploration on MPEG Gaussian Splat Coding (GSC)
Draft Joint Call for Proposals: Video Compression Beyond VVC
Energy-aware Streaming in MPEG-DASH
MPEG-AI: Vision and Scenarios for Artificial Intelligence in Multimedia
MPEG Roadmap

Exploration on MPEG Gaussian Splat Coding (GSC)

The MPEG WG 2 Technical Requirements group — jointly with WG 4 (Video Coding), WG 5 (JVET: Joint Video Coding Team(s) with ITU-T SG 16), and WG 7 (Coding of 3D Graphics and Haptics) — made progress toward standardizing Gaussian Splat Coding (GSC) regarding draft requirements and use cases subject to change. Gaussian splatting, first introduced in a landmark 2023 ACM SIGGRAPH paper by Kerbl et al. [Kerbl2023], represents 3D scenes as collections of anisotropic Gaussian primitives carrying geometry (x, y, z positions) and appearance attributes (opacity, scale, rotation, and spherical harmonics coefficients for view-dependent color), enabling photorealistic novel-view synthesis with real-time rendering. Because raw Gaussian splat data can be extremely large and the ecosystem of proprietary formats (.ply, .splat, .spz, etc.) is fragmented, MPEG has identified a clear need for interoperable, efficient compression standards. Two exploration tracks are currently being pursued: I-3DGS, which operates on Gaussian splats in the well-established “INRIA” format as a symmetric encode/decode pipeline, and A-3DGS, which allows alternative learned representations and training-integrated approaches.
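
To give a rough sense of why raw Gaussian splat data grows so quickly, the sketch below counts the attributes listed above for one primitive and multiplies by the number of primitives. It is a back-of-the-envelope estimate under stated assumptions, not a description of any particular file format: spherical harmonics of degree 3, rotation stored as a 4-component quaternion, and every value stored as a 32-bit float are all assumptions, and the class name `SplatLayout` is hypothetical.

```python
from dataclasses import dataclass

FLOAT32_BYTES = 4  # assumption: every attribute is stored as a 32-bit float


@dataclass(frozen=True)
class SplatLayout:
    """Per-primitive attribute counts for an uncompressed Gaussian splat."""
    sh_degree: int = 3  # assumed spherical-harmonics degree
    position: int = 3   # x, y, z
    opacity: int = 1
    scale: int = 3      # one scale factor per axis
    rotation: int = 4   # assumption: unit quaternion

    @property
    def sh_coefficients(self) -> int:
        # (degree + 1)^2 basis functions for each of the 3 colour channels
        return 3 * (self.sh_degree + 1) ** 2

    @property
    def floats_per_splat(self) -> int:
        return (self.position + self.opacity + self.scale
                + self.rotation + self.sh_coefficients)

    def raw_bytes(self, num_splats: int) -> int:
        return num_splats * self.floats_per_splat * FLOAT32_BYTES


layout = SplatLayout()
print(layout.floats_per_splat)                  # 59
print(layout.raw_bytes(1_000_000) / 1e6, "MB")  # 236.0 MB
```

Under these assumptions, a static scene with one million primitives already occupies 236 MB before any compression, and the view-dependent color coefficients account for 48 of the 59 values per primitive.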

The draft requirements, still evolving, currently cover representation, coding, and system aspects across both tracks, with an additional lightweight profile targeting resource-constrained devices such as mobile phones (Snapdragon 8 Gen 3/Elite) and HMDs (Snapdragon XR2 Gen 2, e.g., Meta Quest 3). Among the coding requirements under consideration are lossy and lossless compression with variable bitrate, spatial and temporal random access, progressive and scalable decoding (quality, Level of Detail (LoD), attribute subsets), and error resilience. Notably, the lightweight profile currently proposes hard complexity constraints (i.e., real-time encode/decode on 2024/2025 mobile hardware, a 2 GB runtime memory cap, and at most four concurrent video decoder sessions), reflecting MPEG’s intent to enable a fast-deployment path for interoperable interchange and storage of static Gaussian splat assets. Alongside the requirements, a draft set of 27 use cases has been identified, spanning consumer XR (telepresence, gaming, social media, retail), professional media (movie production, sports broadcasting, immersive journalism), industrial applications (digital twins, Building Information Modeling (BIM), structure inspection, disaster assessment), and emerging hybrid representations such as Gaussian splats attached to deformable meshes for avatar animation and rigging. Several of these use cases are motivating draft requirements around primitive ordering preservation and stable identifier signaling for external metadata associations, though the details of these provisions may still change.

Research aspects: Even at this early draft stage, the direction of MPEG’s GSC work opens a rich set of research opportunities. On the compression side, the dual-track structure raises open questions around rate-distortion-complexity optimization for both geometry-based and video-codec-based pipelines, including temporally coherent coding of dynamic (tracked and non-tracked) Gaussian sequences and attribute-group-aware progressive coding. The QoE angle is equally pressing: no widely accepted perceptual quality metric yet exists for 6DoF Gaussian splat rendering, and the community can contribute splat-artifact-aware metrics, view-consistency measures, and subjective evaluation methodologies. The envisioned lightweight profile points to a need for co-design of decoders and real-time renderers targeting mobile GPU architectures, offering opportunities in GPU-friendly bitstream layouts and LoD-driven streaming. From a systems and networking perspective, the spatial and temporal random-access provisions, combined with the breadth of use cases demanding adaptive streaming to diverse devices (HMDs, phones, TVs, browsers), map naturally onto adaptive bitrate research, ROI- and view-dependent segment delivery, and loss-resilient transmission of splat parameters. Finally, the emerging use cases around hybrid mesh-Gaussian avatars, scene editing, and semantic metadata associations introduce new multimedia content management and interactive media challenges that go well beyond traditional video streaming and are squarely within the scope of ACM SIGMM’s research community.

Draft Joint Call for Proposals: Video Compression Beyond VVC

MPEG’s Joint Video Experts Team (JVET) — operating jointly under ITU-T SG21 and ISO/IEC JTC 1/SC 29 — advanced a draft Joint Call for Proposals (CfP) for a new generation of video compression technology with capabilities that would substantially exceed those of the current Versatile Video Coding (VVC) standard (Rec. ITU-T H.266 | ISO/IEC 23090-3). The final CfP is planned for July 2026, with proposal submissions evaluated at a JVET meeting in January 2027 and a tentative target of a completed standard by October 2029. The overarching goal is to solicit compression technology that significantly improves upon VVC’s Main 10 Profile in terms of rate-distortion performance, encoder/decoder implementability, applicability to diverse content types, and additional features such as low latency, error robustness, and scalability, while explicitly recognizing that practical fast encoding is increasingly important across a growing range of applications.

The draft CfP defines four test cases. The primary test case targets improved compression without runtime constraints, spanning several content categories: SDR random-access at UHD/4K and HD resolutions, SDR low-delay HD (targeting conversational and gaming applications), HDR content under both PQ and HLG transfer functions at UHD, gaming low-delay HD, and user-generated content. Three additional test cases impose encoder runtime constraints relative to the VVC Test Model (VTM) reference encoder, enabling JVET to characterize the compression-versus-speed trade-off across submissions. Formal subjective evaluation will follow the degradation category rating (DCR) methodology per ITU-R BT.500. Importantly, the CfP explicitly addresses neural and learned components: proponents must disclose what training data was used and are prohibited from using any test sequence as training material, and source code (incl. training scripts or parameter derivation procedures) must be made available for accepted technologies entering the core experiments process. The draft notes that specific test sequences and target bitrates may still change before the final CfP is issued.

Research aspects: The runtime-constrained test cases create a natural framework for studying the compression-complexity Pareto frontier for both classical and learned codecs. The inclusion of user-generated content and gaming video as distinct categories invites research into content-adaptive coding tools and perceptual quality metrics tailored to these sources, as does the HDR coverage with its use of weighted PSNR alongside MS-SSIM. The explicit allowance for neural and learned components, with mandatory training data disclosure and source code requirements, signals that JVET anticipates hybrid and end-to-end learned codecs as serious contenders, making codec-agnostic adaptive streaming, QoE modeling for learned video codecs, and large-scale perceptual quality benchmarking timely topics for the ACM SIGMM community.
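
As an illustration of what studying the compression-complexity Pareto frontier can mean in practice, the sketch below filters a set of codec results down to those that no other result beats on both encoder runtime and bitrate savings. It is only a sketch: the submissions and their numbers are invented, and expressing compression gain as a BD-rate percentage relative to an anchor is an assumption made here, not something taken from the draft CfP.

```python
from typing import NamedTuple


class Submission(NamedTuple):
    name: str
    runtime_ratio: float  # encoder runtime relative to the anchor (lower is faster)
    bd_rate_pct: float    # bitrate change at equal quality, in percent (more negative is better)


def pareto_front(items):
    """Keep the submissions that no other submission beats on both axes."""
    front = []
    best_bd_rate = float("inf")
    # Sort by runtime (ties broken by BD-rate), then sweep once: a point is
    # kept only if it saves more bitrate than every faster point seen so far.
    for s in sorted(items, key=lambda s: (s.runtime_ratio, s.bd_rate_pct)):
        if s.bd_rate_pct < best_bd_rate:
            front.append(s)
            best_bd_rate = s.bd_rate_pct
    return front


# Hypothetical results, for illustration only.
results = [
    Submission("A", 0.5, -5.0),
    Submission("B", 1.0, -20.0),
    Submission("C", 1.0, -12.0),  # same runtime as B, smaller gain
    Submission("D", 3.0, -18.0),  # slower than B, smaller gain
    Submission("E", 8.0, -35.0),
]
front_names = [s.name for s in pareto_front(results)]
print(front_names)  # ['A', 'B', 'E']
```

Each runtime-constrained test case can be read as fixing a point on the runtime axis and asking which technology reaches furthest along the compression axis at that point.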

Energy-aware Streaming in MPEG-DASH

MPEG’s WG 3 (Systems/DASH) is developing a framework for integrating energy-related information into adaptive streaming workflows, currently documented as a Technology under Consideration (TuC) in the DASH specification. The proposed framework treats energy as a first-class design metric alongside QoE, latency, and throughput, and defines an end-to-end approach for assigning, aggregating, and propagating energy consumption data across the entire media delivery chain — from production and encoding through CDN distribution to the client. A key design principle is extensibility: rather than hardcoding specific metrics, the framework proposes a common registry of energy-related metrics (such as energy indices or carbon indices) identified via URNs or 4CC codes, inspired by existing registries like MP4RA and DASH-IF. Energy information may be carried through a variety of existing DASH mechanisms, including MPD descriptors at multiple granularity levels (Adaptation Set, Representation, Segment, Service Location), CMCD/CMSD extensions, metadata tracks, SAND messages, and event streams. A dedicated Energy descriptor in the MPD is proposed, analogous to existing Accessibility descriptors, to expose energy information to clients and applications for representation selection, user exposure, and reporting to back-end servers.

The April 2026 update reported significant progress on two related fronts. A 5G-MAG workshop co-organized with 3GPP SA4 and Greening of Streaming (March 2026) highlighted growing industry consensus around practical energy measurement, surfacing findings such as the dominant role of device eco-mode settings and content brightness over codec or resolution choices in determining end-device energy consumption, and the challenge of reproducible cloud-based energy measurement. In parallel, 3GPP’s Rel-20 study on media energy consumption exposure (FS_Energy_Ph2_MED) reached 80% completion and is expected to conclude in June 2026, with normative work to follow. Notably, 3GPP’s current draft conclusions focus on generic architectural enablers, specifically a new Energy Information Application Function, while explicitly deferring media-layer and client-driven energy optimization to external bodies such as MPEG, SVTA, and DVB. This positions MPEG-DASH’s manifest-based energy signaling work as the natural venue for maturing the streaming-level mechanisms that 3GPP may later reference.

Research aspects: This work opens several timely directions. Energy-aware ABR algorithm design, i.e., jointly optimizing QoE and energy across representation selection, CDN choice, and client device settings, is a natural extension of the existing adaptive streaming research agenda. The proposed metrics registry and MPD-level signaling create opportunities for dataset construction and benchmarking, building on emerging open datasets such as COCONUT [Tashtarian2024] and VEED [Linder2024]. The finding that device-side factors (eco-mode, display brightness) dominate energy consumption over codec and bitrate choices challenges some common assumptions and calls for more holistic QoE-energy modeling. Finally, the cross-SDO coordination between MPEG, 3GPP, IETF (GREEN working group), and Greening of Streaming presents opportunities for the ACM SIGMM community to contribute to the design of interoperable, standardized energy reporting APIs for streaming services.
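
One simple way to picture energy-aware ABR is as a representation selection rule that trades a quality score against a signalled energy cost. The sketch below is a minimal illustration under stated assumptions, not an algorithm from the DASH work: the `Representation` fields, the `energy_index` values, the quality scores, and the linear utility with an `energy_weight` are all hypothetical. As noted above, device-side factors may dominate real energy consumption, so per-representation costs like these would be only one input.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Representation:
    rep_id: str
    bitrate_kbps: int
    quality: float       # hypothetical quality score on a 0-100 scale
    energy_index: float  # hypothetical relative energy cost per representation


def select_representation(reps, throughput_kbps, energy_weight):
    """Pick the sustainable representation with the best quality-minus-energy utility."""
    feasible = [r for r in reps if r.bitrate_kbps <= throughput_kbps]
    if not feasible:
        # Nothing fits the measured throughput: fall back to the lowest bitrate.
        return min(reps, key=lambda r: r.bitrate_kbps)
    return max(feasible, key=lambda r: r.quality - energy_weight * r.energy_index)


# Hypothetical bitrate ladder, for illustration only.
ladder = [
    Representation("360p", 800, 60.0, 1.0),
    Representation("720p", 2500, 80.0, 1.6),
    Representation("1080p", 5000, 90.0, 2.4),
    Representation("2160p", 12000, 95.0, 4.0),
]

quality_only = select_representation(ladder, throughput_kbps=6000, energy_weight=0.0)
energy_aware = select_representation(ladder, throughput_kbps=6000, energy_weight=20.0)
print(quality_only.rep_id, energy_aware.rep_id)  # 1080p 720p
```

With the weight set to zero the rule reduces to ordinary throughput-based selection; raising it moves the choice down the ladder once the extra quality no longer justifies the extra energy.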

MPEG-AI: Vision and Scenarios for Artificial Intelligence in Multimedia

The first edition of ISO/IEC TR 23888-1 serves as the foundational vision document for the MPEG-AI series (ISO/IEC 23888). The document maps out how AI and neural network technologies interact with multimedia standardization along two complementary axes: (i) AI as a multimedia coding tool (e.g., AI-based video compression, 3D point cloud coding) and (ii) multimedia as input for AI consumption (e.g., video coding optimized for machine vision tasks). Under this umbrella, the document surveys six technical areas. In AI-based video coding, neural network components are explored as hybrid additions to VVC-style codecs, covering in-loop filters, intra prediction, super-resolution via reference picture resampling, and content-adaptive postfilters transmitted via SEI messages using the Neural Network Coding standard (NNC, ISO/IEC 15938-17). In AI-based 3D graphics coding, the focus is on dynamic point clouds for immersive (XR, gaming) and machine-oriented (autonomous navigation, BIM) applications, where sparsity and geometric irregularity pose unique challenges beyond those faced by image/video AI codecs. AI model compression (NNC) addresses the bandwidth-efficient deployment and incremental updating of neural network weights to devices, with use cases ranging from adaptive streaming ABR models to federated learning and postfilter delivery. Video coding for machines (VCM) targets compression optimized for downstream AI tasks such as object detection, tracking, and content moderation, with applications in surveillance, intelligent transportation, smart cities, and industrial inspection. Feature coding for machines (FCM) extends this to split-inference architectures where intermediate feature maps — rather than reconstructed video — are compressed and transmitted between edge devices and servers. 
Finally, distributed AI media description addresses the interoperable representation and API-level exchange of AI inference results (e.g., bounding boxes, segmentation masks) between networked media analyzers, as specified in the MPEG-IoMT suite.

ISO/IEC TR 23888-1: AI as a multimedia coding tool and multimedia as input for AI consumption.

Research aspects: The hybrid codec paradigm raises open questions around joint optimization of traditional and learned tools and complexity-aware training for mobile targets. The VCM and FCM tracks call for new task-oriented quality metrics capturing machine-task performance as a function of bitrate, an area where the multimedia and computer vision communities can collaborate. The split-inference and feature coding scenarios introduce latency-constrained compression problems for edge-to-cloud pipelines, which naturally connect to adaptive streaming and IoT research. Finally, the reproducibility and bit-exactness challenges highlighted in the document — hardware-dependent inference, non-deterministic training, and the absence of standardized evaluation environments — present an opportunity for the community to develop shared benchmarking infrastructure for learned multimedia codecs.

MPEG Roadmap

MPEG released an updated roadmap at its 154th meeting, reflecting the current status and near-term trajectory of its standardization activities across three broad pillars. Under Media Coding, work nearing completion includes MPEG Immersive Video v.2, Feature Coding for Machines, Solid Point Cloud Coding, and Dynamic Mesh Compression, while longer-horizon efforts cover AI Graphics Compression, Video Coding for Machines, Lenslet video coding, and — directly relevant to this report — both Video-based and Geometry-based Gaussian Splat Coding tracks. Under Systems and Tools, near-term deliverables include DASH v.7, Green metadata v.4, and Carriage of Haptics Data, with CMAF v.4 and File Format (ISOBMFF) v.10 on a slightly longer timeline. The Beyond Media pillar continues to advance genomic data search and biomedical waveform coding (BWC), alongside media authenticity and provenance indication — underscoring MPEG’s expanding scope well beyond traditional audiovisual applications.

Research aspects: The roadmap highlights several intersecting research opportunities. The convergence of volumetric and neural representations (point clouds, dynamic meshes, Gaussian splats, and lenslet video, all progressing in parallel) raises open questions around unified rate-distortion frameworks and cross-format QoE evaluation for 6DoF experiences. The simultaneous progression of Video Coding for Machines and Feature Coding for Machines alongside traditional human-centric codecs calls for research into adaptive pipelines that can serve both human and machine consumers from a shared bitstream. The Green metadata track connects directly to the energy-aware streaming work discussed above, underscoring the need for end-to-end energy modeling that spans codec choice, packaging, delivery, and consumption. Finally, the Beyond Media thread (particularly genomic data and biomedical waveforms) signals an expanding definition of “multimedia” that the ACM SIGMM community may wish to engage with as compression, retrieval, and QoE methods developed for audiovisual content find applicability in life sciences.

Concluding Remarks

The 154th MPEG meeting in Santa Eulària reflects a standards body in active transition, broadening its scope from traditional audiovisual compression toward a richer landscape that encompasses neural scene representations, AI-native codecs, energy-aware delivery, and even biomedical data. The Gaussian Splat Coding exploration, the next-generation video compression Call for Proposals, the MPEG-AI vision document, and the energy-aware streaming framework each address distinct but interconnected challenges: how to represent, compress, deliver, and consume increasingly complex and diverse media efficiently and sustainably. For the ACM SIGMM community, this meeting offers both a map of where industry standardization is heading and a set of open research problems (spanning perceptual quality assessment, learned compression, edge inference, green streaming, and immersive media delivery) where academic contributions can meaningfully shape the next generation of multimedia standards.

The 155th MPEG meeting will be held in Geneva, Switzerland, from July 13 to 17, 2026. Click here for more information about MPEG meetings and ongoing developments.

References

[Kerbl2023] Bernhard Kerbl, Georgios Kopanas, Thomas Leimkuehler, and George Drettakis. 2023. 3D Gaussian Splatting for Real-Time Radiance Field Rendering. ACM Trans. Graph. 42, 4, Article 139 (August 2023), 14 pages. https://doi.org/10.1145/3592433
[Tashtarian2024] Farzad Tashtarian, Daniele Lorenzi, Hadi Amirpour, Samira Afzal, and Christian Timmerer. 2024. COCONUT: Content Consumption Energy Measurement Dataset for Adaptive Video Streaming. In Proceedings of the 15th ACM Multimedia Systems Conference (MMSys ’24). Association for Computing Machinery, New York, NY, USA, 346–352. https://doi.org/10.1145/3625468.3652179
[Linder2024] Sandro Linder, Samira Afzal, Christian Bauer, Hadi Amirpour, Radu Prodan, and Christian Timmerer. 2024. VEED: Video Encoding Energy and CO2 Emissions Dataset for AWS EC2 instances. In Proceedings of the 15th ACM Multimedia Systems Conference (MMSys ’24). Association for Computing Machinery, New York, NY, USA, 332–338. https://doi.org/10.1145/3625468.3652178

JPEG Column: 110th JPEG Meeting in Sydney, Australia

By Antonio Pinheiro | May 5, 2026 - 09:28 |May 14, 2026 0226, Event Report, Feature, Standards

JPEG Trust Media Asset Watermarking reaches Committee Draft stage at the 110th JPEG meeting

The 110th JPEG meeting was held in Sydney, Australia, from 11 to 16 January 2026.

This meeting was marked by several major achievements. JPEG Trust Part 3, Media Asset Watermarking, reached the Committee Draft stage; it will extend the JPEG Trust Core Foundation with signalling capabilities for content authenticity, provenance, integrity, intellectual property rights, and labelling using watermarking. Furthermore, the first event-based codec, JPEG XE, reached the Draft International Standard stage.

In addition, the JPEG Committee celebrated the 25th birthday of the successful JPEG 2000 standard with a social event where members who had served the Committee shared their experience during the development of this important family of standards.

The following sections summarise the main highlights of the 110th JPEG meeting:

JPEG Trust Part 3: Media Asset Watermarking to provide watermarking support for media asset authenticity.
JPEG XE Part 1: core coding system is under DIS ballot.
JPEG AIC prepares large-scale subjective experiment.
JPEG 2000 defines a set of hardware-focused profiles for professional video streaming.
JPEG XS Part 2 new amendment defines additional levels and sublevels, and a new frame buffer level.
JPEG RF activity approves new Use Cases and Requirements.
JPEG AI focuses on implementation aspects and on extending its applicability across devices and use cases.
JPEG DNA completes wet-lab experiments, including DNA synthesis/sequencing.
JPEG Pleno Light Field Quality Assessment examines the performance of the proposed metrics.
JPEG 2000 25th Anniversary Celebrations.

The former convenor of the JPEG Committee, Daniel Lee, addressing JPEG 2000 development during the JPEG 2000 25th Anniversary Celebration.

JPEG Trust

Current technologies, especially the rise of generative AI, make synthetic creation and modification of media assets easy for general users. Media artefacts such as synthetic images and video increase the risks of online piracy, cyber security fraud, copyright breach, advertising misrepresentation and the spread of mis- and disinformation.

The JPEG Trust International Standard (ISO/IEC 21617-1) provides a framework for establishing trust in media assets, and has now been extended to include Part 3: Media Asset Watermarking (ISO/IEC 21617-3), to provide watermarking support for media asset authenticity.

This new part of the JPEG Trust framework provides a mechanism to empower businesses, governments and institutions to support critical use cases from labelling AI-generated media assets to Digital Rights Management and source tracing. This is in addition to its many applications in helping secure media asset authenticity.

In a major milestone achieved during the 110th JPEG meeting in Sydney, Part 3: Media Asset Watermarking reached the Committee Draft stage. It is expected that this standard will have a significant positive impact globally, as it directly responds to the urgent calls for watermarking functionality by governments around the world in response to the proliferation of AI-generated content online.

JPEG XE

JPEG XE is a joint effort between ITU-T SG21 and ISO/IEC JTC1/SC29/WG1 and will become the first internationally endorsed specification by major standardization bodies ITU-T, ISO, and IEC, for coding of events. It aims to establish a robust and interoperable format for efficient representation and coding of events in the context of machine vision and related applications. To expand the reach of JPEG XE, the JPEG Committee has closely coordinated its activities with the MIPI Alliance with the intention of developing a cross-compatible coding mode, allowing MIPI ESP signals to be decoded effectively by JPEG XE decoders.

Currently, JPEG XE Part 1, which defines the core coding system, is under DIS ballot and the JPEG Committee is awaiting the results. In the meantime, work started on Parts 2 and 3, which will define the Profiles and levels, and the Reference software, respectively. For both parts, a Committee Draft (CD) was created and their consultation was requested. The Profiles and levels in Part 2 will provide strict definitions to allow safe and correct interoperability between vendor specific implementations of the standard. The software for Part 3 will serve as a proof of concept implementation of an encoder and decoder of JPEG XE. The plan is to make the software free and open source to allow the community easy access to the JPEG XE technology.

Finally, work on Part 4 was also initiated to provide official and well-defined conformance tests. This will help vendors to verify interoperability and conformance to the standard.

The JPEG Committee remains committed to the development of a comprehensive and industry-aligned standard that meets the growing demand for event-based vision technologies. The collaborative approach between multiple standardisation organisations underscores a shared vision for a unified, international standard to accelerate innovation and interoperability in this emerging field. The JPEG XE public and joint AHG (ITU-T SG21 and ISO/IEC JTC1 SC29 WG1) was reestablished to continue the work. If you are interested, please consider joining the joint AHG.

JPEG AIC

The JPEG AIC-3 standard, which specifies a methodology for fine-grained subjective image quality assessment in the range from good quality up to mathematically lossless, is ready to be published as International Standard ISO/IEC 29170-3 in February this year. An implementation of the corresponding data analysis has been provided in MATLAB and will be ported to Python. For the current JPEG AIC-4 effort and evaluation of the responses to the call for Objective Image Quality Assessment, an image dataset for the large-scale subjective experiment was finalized, consisting of 18,000 compressed images for 70 source images and 17 codecs, including several learning-based methods. The crowdsourcing experiment is expected to take several weeks.

JPEG 2000

The JPEG Committee has initiated the development of a new standard to collect the growing number of profiles for its flexible JPEG 2000 image codec. As part of the activity, which is expected to be completed within the next 18 months, an initial set of hardware-focused profiles for professional video streaming is being codified. These profiles use the unique capabilities of the High-Throughput JPEG 2000 block coder, specified in Rec. ITU-T T.814 | ISO/IEC 15444-15, to shrink the hardware resources needed to tackle modern high-frame-rate and high-resolution images.

JPEG XS

JPEG XS, the image and video compression format for transmitting visually lossless, high-quality pictures with minimal latency and low resource consumption, is a fundamental game-changer for real-time video transmission in live, professional, and broadcast applications. In this context, the JPEG Committee created an AMD1 for JPEG XS Part 2 to define some additional levels and sublevels, as well as a new frame buffer level. These additions each address specific requirements that came from the respective industry sectors that rely on JPEG XS. This new AMD1 for Part 2 was issued for DIS balloting. In the meantime, the ballot results for AMD1 for JPEG XS Part 1 were processed, and an FDIS ballot was initiated. Both AMDs are expected to be published before the end of this year.

JPEG RF

At the 110th JPEG meeting, JPEG RF made significant progress against its mandates, formally approving the Use Cases and Requirements for JPEG Radiance Fields v1.0 and requesting its public release on the JPEG website. Substantial technical discussions advanced the evaluation and assessment pipeline for radiance fields, covering both coding-only and joint instantiation and coding approaches. The Working Group also approved Exploration Study 7, including the study on pair-wise comparison assessment methodologies for radiance fields. In addition, next steps were agreed for outreach activities to engage additional stakeholders.

JPEG AI

During the 110th JPEG meeting, JPEG AI was focused on implementation aspects and on extending its applicability across devices and use cases. First, the Use Cases and Requirements document was updated, introducing a new video streaming and storage use case that positions JPEG AI as a deterministic still-image coding engine that can be integrated into video coding pipelines.

A new core experiment addresses the bit-exact reference frame reconstruction requirement. Moreover, other core experiments were defined to analyze power consumption on heterogeneous CPU–GPU/FPGA platforms and to retrain JPEG AI in the RGB domain for fair comparison with other codecs. Looking ahead, JPEG AI plans to develop mobile-ready encoder and decoder implementations, investigate error-resilience properties, and continue benchmarking JPEG AI against state-of-the-art learnt image codecs using solid and robust test conditions.

JPEG DNA

The wet-lab experiments, including DNA synthesis/sequencing, designed at the 109th JPEG meeting were completed, and the synthesized results have been delivered to the JPEG Committee as DNA molecules. As a next step, independent parties are carrying out sequencing separately, and the sequenced results are expected to be available by the next JPEG meeting, when JPEG DNA, also known as ISO/IEC 25508-1, will reach the DIS stage.

JPEG Pleno

During the 110th JPEG meeting, the JPEG Committee reviewed the outcomes of the subjective quality assessment conducted on the evaluation dataset with the aim to examine the performance of the proposals submitted in response to the Call for Proposals on objective metrics for JPEG Pleno Light Field Quality Assessment. The performance of submitted metrics was analysed across scenes with diverse spatial and angular resolutions and for both coding-only and joint coding and view-synthesis artefacts, highlighting differences in behaviour across distortion categories. Learning-based proposals were recognized as a promising direction, particularly when cross-validated on the evaluation dataset, while also raising considerations related to training, data dependency, and reproducibility. The evaluation phase was formally closed, with agreement to retain a set of well-established full-reference metrics as reference anchors and to pursue a combined technical direction integrating end-to-end and hybrid learning-based approaches. Finally, responsibilities across task forces were consolidated, and next steps were defined to continue the objective quality assessment work towards a first version of a working draft.

Highlights of JPEG 2000 25th Anniversary Celebrations, Sydney, 14 January 2026

The 110th JPEG meeting in Sydney offered a fitting occasion to mark the 25th anniversary of JPEG 2000 standardization. Opening the celebration, Prof. Touradj Ebrahimi, JPEG convenor, noted that it was in Sydney during the 12th JPEG meeting in 1997 that JPEG 2000 proposals were evaluated, culminating in the publication of the standard in December 2000.

The program featured a video message from Prof. Michael Marcellin, a key contributor to several core technologies adopted by JPEG 2000 and chair of the subsequent software verification model effort. He highlighted the successful deployment of JPEG 2000 for digital distribution of motion pictures and the essential standards work involved in defining the digital cinema profiles that enabled this adoption.

Prof. David Taubman, whose long-standing leadership and technical contributions continue to shape JPEG 2000 development, delivered a presentation highlighting the coding tools that underpin the format’s highly scalable and accessible codestreams. He also outlined recent progress in High Throughput JPEG 2000 (HTJ2K), including high-performance implementations, full-float lossless compression for OpenEXR, and FPGA-based realizations delivering high-speed, low-latency coding.

Messages from Prof. Majid Rabbani and Dr. Daniel Lee—both instrumental in guiding the JPEG 2000 standardisation process—paid tribute to the dedication, expertise, and collaborative spirit of the many JPEG members who contributed to the standard’s success. Daniel, who served as JPEG convenor during the JPEG 2000 standardisation period, further underscored JPEG’s essential role as a collaborative international forum for developing standards with global reach.

The celebration concluded with an address by Dr. Pierre-Anthony Lemieux, co-chair of the JPEG 2000 activity, who highlighted the format’s enduring flexibility as a key factor in its longevity. He noted that this flexibility allows end users to expand the capabilities of their workflows without the burden of switching to a different codec. Dr. Lemieux also emphasised the importance of ongoing maintenance activities, which allow JPEG 2000 to evolve to meet the shifting needs of its users, including current work on defining HTJ2K profiles and levels. He finished by stressing the importance of open-source tools and libraries in driving adoption.

A sustained commitment to meeting industry needs and continued maintenance of the standard remains central to the ongoing and future success of JPEG 2000.

Final Quote

“Reaching Committee Draft for JPEG Trust Part 3: Media Asset Watermarking is a pivotal step toward restoring confidence in digital media at a moment when generative AI makes convincing manipulation accessible to anyone. This milestone equips industries and public institutions with interoperable, standards-based watermarking to support authenticity, provenance, integrity, rights signalling, and clear labelling, helping to curb mis- and disinformation, strengthen digital rights management, and enable reliable source tracing at a global scale,” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

MPEG Column: 153rd MPEG Meeting

By Christian Timmerer | February 2, 2026 - 11:36 |February 2, 2026 0126, Event Report, Feature, Standards

The 153rd MPEG meeting took place online from January 19 to 23, 2026. The official MPEG press release can be found here. This report highlights key outcomes from the meeting, with a focus on research directions relevant to the ACM SIGMM community:

MPEG Roadmap
Exploration on MPEG Gaussian Splat Coding (GSC)
MPEG Immersive Video 2nd edition (new white paper)

MPEG Roadmap

MPEG released an updated roadmap showing continued convergence of immersive and “beyond video” media with deployment-ready systems work. Near-term priorities include 6DoF experiences (MPEG Immersive Video v2 and 6DoF audio), volumetric representations (dynamic meshes, solid point clouds, LiDAR, and emerging Gaussian splat coding), and “coding for machines,” which treats visual and audio signals as inputs to downstream analytics rather than only for human consumption.

Research aspects: The most promising research opportunities sit at the intersections: renderer and device-aware rate-distortion-complexity optimization for volumetric content; adaptive streaming and packaging evolution (e.g., MPEG-DASH / CMAF) for interactive 6DoF services under tight latency constraints; and cross-cutting themes such as media authenticity and provenance, green and energy metadata, and exploration threads on neural-network-based compression and compression of neural networks that foreshadow AI-native multimedia pipelines.

MPEG Gaussian Splat Coding (GSC)

Gaussian Splat Coding (GSC) is MPEG’s effort to standardize how 3D Gaussian Splatting (3DGS) content is encoded, decoded, and evaluated so that it can be exchanged and rendered consistently across platforms. Such content represents scenes as sparse “Gaussian splats” with geometry plus rich attributes (scale and rotation, opacity, and spherical-harmonics appearance for view-dependent rendering). The main motivation is interoperability for immersive media pipelines: enabling reproducible results, shared benchmarks, and comparable rate-distortion-complexity trade-offs for use cases ranging from telepresence and immersive replay to mobile XR and digital twins, while retaining the visual strengths that made 3DGS attractive compared to heavier neural scene representations.
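
To give a feel for the amount of data a codec has to handle, the splat attributes listed above can be pictured as a simple record. The following Python sketch is illustrative only: the field names, the quaternion rotation, the configurable spherical-harmonics degree, and the 32-bit float storage are assumptions borrowed from common 3DGS practice, not from any MPEG draft.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class GaussianSplat:
    """One splat. Field names are hypothetical, not taken from an MPEG document."""
    position: Tuple[float, float, float]         # geometry: centre of the Gaussian
    scale: Tuple[float, float, float]            # per-axis extent
    rotation: Tuple[float, float, float, float]  # orientation, assumed to be a quaternion
    opacity: float
    sh_coeffs: List[float]                       # spherical-harmonics appearance, all colour channels


def sh_coeff_count(degree: int, channels: int = 3) -> int:
    """Number of SH coefficients for bands 0..degree across all colour channels."""
    return (degree + 1) ** 2 * channels


def raw_floats_per_splat(sh_degree: int) -> int:
    """Floats per splat: 3 position + 3 scale + 4 rotation + 1 opacity + SH."""
    return 3 + 3 + 4 + 1 + sh_coeff_count(sh_degree)


def raw_bytes(num_splats: int, sh_degree: int, bytes_per_float: int = 4) -> int:
    """Uncompressed size of a scene, assuming every attribute is one float."""
    return num_splats * raw_floats_per_splat(sh_degree) * bytes_per_float
```

Under these assumptions, a degree-3 spherical-harmonics model needs 59 floats per splat, so `raw_bytes(1_000_000, 3)` gives 236,000,000 bytes for a one-million-splat frame before any compression, which illustrates why dedicated coding tools are being explored.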

The work remains in an exploration phase, coordinated across ISO/IEC JTC 1/SC 29 groups WG 4 (MPEG Video Coding) and WG 7 (MPEG Coding for 3D Graphics and Haptics) through Joint Exploration Experiments covering datasets and anchors, new coding tools, software (renderer and metrics), and Common Test Conditions (CTC). A notable systems thread is “lightweight GSC” for resource-constrained devices (single-frame, low-latency tracks using geometry-based and video-based pipelines with explicit time and memory targets), alongside an “early deployment” path via amendments to existing MPEG point-cloud codecs to more natively carry Gaussian-splat parameters. In parallel, MPEG is testing whether splat-specific tools can outperform straightforward mappings in quality, bitrate, and compute for real-time and streaming-centric scenarios.

Research aspects: Relevant SIGMM directions include splat-aware compression tools and rate-distortion-complexity optimization (including tracked vs. non-tracked temporal prediction); QoE evaluation for 6DoF navigation (metrics for view and temporal consistency and splat-specific artifacts); decoder and renderer co-design for real-time and mobile lightweight profiles (progressive and LOD-friendly layouts, GPU-friendly decode); and networked delivery problems such as adaptive streaming, ROI and view-dependent transmission, and loss resilience for splat parameters. Additional opportunities include interoperability work on reproducible benchmarking, conformance testing, and practical packaging and signaling for deployment.

MPEG Immersive Video 2nd edition (white paper)

The second edition of MPEG Immersive Video defines an interoperable bitstream and decoding process for efficient 6DoF immersive scene playback, supporting translational and rotational movement with motion parallax to reduce discomfort often associated with pure 3DoF viewing. The second edition primarily extends functionality (without changing the high-level bitstream structure), adding capabilities such as capture-device information, additional projection types, and support for Simple Multi-Plane Image (MPI), alongside tools that better support geometry and attribute handling and depth-related processing.

Architecturally, MIV ingests multiple (unordered) camera views with geometry (depth and occupancy) and attributes (e.g., texture), then reduces inter-view redundancy by extracting patches and packing them into 2D “atlases” that are compressed using conventional video codecs. MIV-specific metadata signals how to reconstruct views from the atlases. The standard is built as an extension of the common Visual Volumetric Video-based Coding (V3C) bitstream framework shared with V-PCC, with profiles that preserve backward compatibility while introducing a new profile for added second-edition functionality and a tailored profile for full-plane MPI delivery.
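
The patch-to-atlas step can be illustrated with a toy packer. The sketch below is not the MIV reference encoder: it uses a simple shelf-packing heuristic chosen for brevity, and the `Patch`, `Placement`, and `pack_patches` names are hypothetical. The returned placements play a role loosely analogous to the metadata that tells a decoder where each patch sits in the atlas and which view it came from.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Patch:
    view_id: int  # source camera view the patch was extracted from
    width: int
    height: int


@dataclass
class Placement:
    patch: Patch
    x: int  # top-left corner of the patch inside the atlas
    y: int


def pack_patches(patches: List[Patch], atlas_w: int, atlas_h: int
                 ) -> Tuple[List[Placement], List[Patch]]:
    """Shelf packing: fill rows left to right, tallest patches first.

    Returns (placements, patches that did not fit in this atlas).
    """
    placed: List[Placement] = []
    rejected: List[Patch] = []
    x = y = shelf_h = 0
    for p in sorted(patches, key=lambda q: q.height, reverse=True):
        if p.width > atlas_w:
            rejected.append(p)  # wider than the atlas, can never fit
            continue
        if x + p.width > atlas_w:
            # Current row is full: open a new shelf below it.
            y += shelf_h
            x = 0
            shelf_h = 0
        if y + p.height > atlas_h:
            rejected.append(p)  # no vertical room left for this patch
            continue
        placed.append(Placement(p, x, y))
        x += p.width
        shelf_h = max(shelf_h, p.height)
    return placed, rejected
```

For example, packing patches of 64x32, 64x32, and 32x16 pixels into a 128x48 atlas places the first two side by side on the top shelf and the third on a second shelf at `y = 32`. A real encoder would also need to handle rejected patches (for instance by opening a further atlas), which this sketch leaves to the caller.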

Research aspects: Key SIGMM topics include systems-efficient 6DoF delivery (better view and patch selection and atlas packing under latency and bandwidth constraints); rate-distortion-complexity-QoE optimization that accounts for decode and render cost (especially on HMD and mobile) and motion-parallax comfort; adaptive delivery strategies (representation ladders, viewport and pose-driven bit allocation, robust packetization and error resilience for atlas video plus metadata); renderer-aware metrics and subjective protocols for multi-view temporal consistency; and deployment-oriented work such as profile and level tuning, codec-group choices (HEVC / VVC), conformance testing, and exploiting second-edition features (capture device info, depth tools, Simple MPI) for more reliable reconstruction and improved user experience.

Concluding Remarks

The meeting outcomes highlight a clear shift toward immersive and AI-enabled media systems where compression, rendering, delivery, and evaluation must be co-designed. These developments offer timely opportunities for the ACM SIGMM community to contribute reproducible benchmarks, perceptual metrics, and end-to-end streaming and systems research that can directly influence emerging standards and deployments.

The 154th MPEG meeting will be held in Santa Eulària, Spain, from April 27 to May 1, 2026. Click here for more information about MPEG meetings and ongoing developments.

JPEG Column: 109th JPEG Meeting in Nuremberg, Germany

By Antonio Pinheiro | February 1, 2026 - 23:54 |February 2, 2026 0126, Event Report, Feature, Standards

JPEG XS developers awarded the Engineering, Science and Technology Emmy®.

The 109th JPEG meeting was held in Nuremberg, Germany, from 12 to 17 October 2025.

This JPEG meeting began with the excellent news that JPEG XS developers Fraunhofer IIS and intoPIX were awarded the Engineering, Science and Technology Emmy® for their contributions to the development of the JPEG XS standard.

Furthermore, the 109th JPEG meeting was marked by several major achievements: JPEG Trust Part 2 on Trust Profiles and Reports, complementing Part 1 with several profiles for various usage scenarios, reached Committee Draft; JPEG AIC Part 3 was produced for final publication by ISO; JPEG XE reached Committee Draft stage; and the calls for proposals on objective evaluation for JPEG AIC-4 and for JPEG Pleno Quality Assessment of Light Field received several responses.

The following sections summarise the main highlights of the 109th JPEG meeting:

Fraunhofer IIS and intoPIX representatives with the awarded Engineering, Science and Technology Emmy®.

JPEG Trust Part 2 on Trust Profiles and Reports reaches Committee Draft stage.
JPEG AIC-4 receives responses to the Call for Proposals on Objective Image Quality Assessment.
JPEG XE Part 1, the core coding system, reaches DIS stage.
JPEG XS Part 1 AMD 1 reaches DIS stage.
JPEG AI Part 2 (Profiling), Part 3 (Reference Software), and Part 5 (File Format) approved as International Standards.
JPEG DNA designed the wet-lab experiments, including DNA synthesis/sequencing.
JPEG Pleno receives responses to the Call for Proposals on Objective Metrics for Light Field Quality Assessment.
JPEG RF establishes frameworks for coding and quality assessment of radiance fields.
JPEG XL initiates embedding of JPEG XL in ISOBMFF/HEIF.