VQEG website: www.vqeg.org
Authors:
Nabajeet Barman (nbarman@brightcove.com), Brightcove (London, UK)
Saman Zadtootaghaj (saman.zadtootaghaj@sony.com), Sony (Berlin, Germany)
Editors:
Jesús Gutiérrez (jesus.gutierrez@upm.es), Universidad Politécnica de Madrid (Spain)
Kjell Brunnström (kjell.brunnstrom@ri.se), RISE (Sweden)
Introduction
This column provides an overview of a new group within the Video Quality Experts Group (VQEG) called the Emerging Technologies Group (ETG), which was created during the last VQEG plenary meeting in December 2022. For an introduction to VQEG, please check the VQEG homepage or this presentation.
The work addressed by this new group can be of interest to the SIGMM community, since it relates to AI-based technologies for image and video processing, greening of streaming, blockchain in media and entertainment, and ongoing related standardization activities.
About ETG
The main objective of this group is to address various aspects of multimedia that do not fall under the scope of any of the existing VQEG groups. Through its activities, the group aims to provide a common platform for people to gather, discuss new and emerging topics and ideas, and explore possible collaborations in the form of joint survey papers/whitepapers, funding proposals, etc. The topics addressed are not necessarily directly related to “video quality” but rather cover any ongoing work in the field of multimedia that can indirectly impact the work addressed as part of VQEG.
Scope
During the creation of the group, the following topics were tentatively identified to be of possible interest to the members of this group and VQEG in general:
- AI-based technologies:
  - Super Resolution
  - Learning-based video compression
  - Video coding for machines, etc.
  - Enhancement, denoising, and other pre- and post-filter techniques
- Greening of streaming and related trends
  - For example, the trade-off between HDR and SDR to save energy and its impact on visual quality
- Ongoing standards activities (which might impact the QoE of end users and hence will be relevant for VQEG)
  - 3GPP, SVTA, CTA WAVE, UHDF, etc.
  - MPEG/JVET
- Blockchain in Media and Entertainment
Since the creation of the group, four talks on various topics have been organized; an overview of these is provided in the next section.
Overview of the Presentations
We briefly summarize the talks that have been organized by the group since its inception.
On the work of the MPEG Systems Smart Contracts for Media Subgroup
The first presentation covered the recent work by MPEG Systems on Smart Contracts for Media [1] and was delivered by Dr Panos Kudumakis, Head of the UK Delegation to ISO/IEC JTC1/SC29 and Chair of the British Standards Institution (BSI) IST/37. In this talk, Dr Kudumakis highlighted MPEG's efforts over the last few years towards developing several standardized ontologies catering to the needs of the media industry with respect to the codification of Intellectual Property Rights (IPR) information for the fair trade of media. However, since the inference and reasoning capabilities normally associated with ontologies cannot naturally be performed in distributed ledger technology (DLT) environments, there is huge potential to unlock the Semantic Web and, in turn, the creative economy by bridging this interoperability gap [2]. In that direction, the ISO/IEC 21000-23 Smart Contracts for Media standard specifies the means (e.g., APIs) for converting MPEG IPR ontologies to smart contracts that can be executed on existing DLT environments [3]. The talk covered the recent work done as part of this effort, as well as the ongoing efforts towards the design of a full-fledged ISO/IEC 23000-23 Decentralized Media Rights Application Format standard based on MPEG technologies (e.g., audio-visual codecs, file formats, streaming protocols, and smart contracts) and non-MPEG technologies (e.g., DLTs, content and creator IDs).
The recording of the presentation is available here, and the slides can be accessed here.
Introduction to NTIRE Workshop on Quality Assessment for Video Enhancement
The second presentation was given by Xiaohong Liu and Yuxuan Gao from Shanghai Jiao Tong University, China, about one of the CVPR challenge workshops, the NTIRE 2023 Quality Assessment of Video Enhancement Challenge. The presentation described the motivation for starting this challenge and its relevance to the video community in general. The presenters then described the dataset, including the dataset creation process, the subjective tests conducted to obtain ratings, and the reasoning behind the split of the dataset into training, validation, and test sets. The results of this challenge are scheduled to be presented at the upcoming spring meeting at the end of June 2023. The presentation recording is available here.
Perception: The Next Milestone in Learned Image Compression
Johannes Ballé from Google was the third presenter, on the topic of “Perception: The Next Milestone in Learned Image Compression.” In the first part, Johannes discussed learned compression and described nonlinear transforms [4] and how they can achieve higher image compression performance than linear transforms. Next, they emphasized the importance of perceptual metrics in comparison to distortion metrics by introducing the difference between perceptual quality and reconstruction quality [5]. They then presented an example of generative image compression, HiFiC [6], in which a distortion criterion and a perceptual criterion (referred to as a realism criterion) are combined. Finally, the talk concluded with an introduction to perceptual spaces and an example of a perceptual metric, PIM [7]. The presentation slides can be found here.
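To make the distortion-versus-realism discussion concrete, the following is a minimal, hedged sketch (not the HiFiC implementation; the loss weights, function names, and the stand-in discriminator are assumptions made purely for illustration) of how a rate term, a distortion term, and a “realism” term can be combined into a single objective for a learned codec, in the spirit of [5, 6], assuming PyTorch:

```python
# Illustrative sketch (not the HiFiC code) of a rate-distortion-"realism"
# objective for a learned image codec; the loss weights and the dummy
# discriminator below are assumptions made purely for demonstration.
import torch
import torch.nn as nn

def generator_objective(x, x_hat, bits_per_pixel, discriminator,
                        lambda_rate=1.0, lambda_dist=1.0, lambda_realism=0.1):
    distortion = nn.functional.mse_loss(x_hat, x)               # reconstruction quality
    realism = -torch.log(discriminator(x_hat) + 1e-6).mean()    # perceptual quality proxy
    return (lambda_rate * bits_per_pixel
            + lambda_dist * distortion
            + lambda_realism * realism)

# Toy usage with random tensors and a stand-in discriminator:
x = torch.rand(1, 3, 64, 64)                    # "original" image batch
x_hat = x + 0.05 * torch.randn_like(x)          # "reconstruction"
dummy_disc = lambda img: torch.sigmoid(img.mean(dim=[1, 2, 3]))  # values in (0, 1)
loss = generator_objective(x, x_hat, torch.tensor(0.3), dummy_disc)
```

Setting the realism weight to zero recovers a purely distortion-optimised codec, which is exactly the trade-off between reconstruction quality and perceptual quality discussed in [5].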
Compression with Neural Fields
Emilien Dupont (DeepMind) was the fourth presenter. He started the talk with a short introduction to the emerging field of neural compression, in which a signal, e.g., an image or a video, is fitted with a neural network. He then discussed two recent works on neural compression that he was involved in, COIN [8] and COIN++ [9], and gave a short overview of other implicit neural representation works in the video domain, such as NeRV [10] and NIRVANA [11]. The slides for the presentation can be found here.
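As a rough illustration of the idea behind COIN-style compression, the following is a minimal sketch assuming PyTorch, a plain MLP with GELU activations, and fixed hyperparameters (COIN itself uses sinusoidal activations and quantises the network weights); it is illustrative only, not the authors' code:

```python
# Minimal sketch of COIN-style neural-field compression: an MLP is overfitted
# to map pixel coordinates -> RGB values, and its (quantised) weights then act
# as the compressed representation. Assumptions: PyTorch, GELU activations,
# arbitrary hyperparameters; COIN itself uses sinusoidal activations.
import torch
import torch.nn as nn

class CoordinateMLP(nn.Module):
    def __init__(self, hidden=64, layers=3):
        super().__init__()
        dims = [2] + [hidden] * layers + [3]            # (x, y) -> (R, G, B)
        blocks = []
        for d_in, d_out in zip(dims[:-1], dims[1:]):
            blocks += [nn.Linear(d_in, d_out), nn.GELU()]
        self.net = nn.Sequential(*blocks[:-1])          # no activation on the output

    def forward(self, xy):
        return self.net(xy)

def fit_image(image, steps=1000):                       # image: (H, W, 3) in [0, 1]
    h, w, _ = image.shape
    ys, xs = torch.meshgrid(torch.linspace(-1, 1, h),
                            torch.linspace(-1, 1, w), indexing="ij")
    coords = torch.stack([xs, ys], dim=-1).reshape(-1, 2)
    targets = image.reshape(-1, 3)
    model = CoordinateMLP()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    for _ in range(steps):                              # overfit to this single image
        opt.zero_grad()
        loss = nn.functional.mse_loss(model(coords), targets)
        loss.backward()
        opt.step()
    return model                                        # the weights are the "bitstream"

# Toy usage: model = fit_image(torch.rand(32, 32, 3))
```

In this family of methods, the bitrate is controlled by the size and precision of the stored network weights rather than by a transform-coded bitstream.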
Upcoming Presentations
As part of the ongoing efforts of the group, the following talks/presentations are scheduled in the next two months. For an updated schedule and list of presentations, please check the ETG homepage here.
Sustainable/Green Video Streaming
Given the increasing carbon footprint of streaming services and the climate crisis, many new collaborative efforts have started recently, such as the Greening of Streaming alliance, the Ultra HD Sustainability Forum, etc. In addition, research has recently started focussing on how to make video streaming greener/more sustainable. A talk providing an overview of recent works and progress in this direction is tentatively scheduled for around mid-May 2023.
Panel discussion at VQEG Spring Meeting (June 26-30, 2023), Sony Interactive Entertainment HQ, San Mateo, US
During the next face-to-face VQEG meeting in San Mateo, there will be an interesting panel discussion on the topic of “Deep Learning in Video Quality and Compression.” The goal is to invite machine learning experts to VQEG and bring the two communities closer. ETG will organize the panel discussion, and the following four panellists have currently been invited to join this event: Zhi Li (Netflix), Ioannis Katsavounidis (Meta), Richard Zhang (Adobe), and Mathias Wien (RWTH Aachen). Before this panel discussion, two talks are tentatively scheduled: the first on video super-resolution and the second focussing on learned image compression.
The meeting will take place in hybrid mode, allowing for both in-person and online participation. For further information about the meeting, please check the details here and, if interested, register for the meeting.
Joining and Other Logistics
While participation in the talks is open to everyone, to get notified about upcoming talks and participate in the discussions, please consider subscribing to the etg@vqeg.org email reflector and joining the Slack channel using this link. The meeting minutes are available here. We are always looking for new ideas to improve. If you have suggestions on topics we should focus on or recommendations for presenters, please reach out to the chairs (Nabajeet and Saman).
References
[1] White paper on MPEG Smart Contracts for Media.
[2] DLT-based Standards for IPR Management in the Media Industry.
[3] DLT-agnostic Media Smart Contracts (ISO/IEC 21000-23).
[4] [2007.03034] Nonlinear Transform Coding.
[5] [1711.06077] The Perception-Distortion Tradeoff.
[6] [2006.09965] High-Fidelity Generative Image Compression.
[7] [2006.06752] An Unsupervised Information-Theoretic Perceptual Quality Metric.
[8] COIN: Compression with Implicit Neural Representations.
[9] COIN++: Neural Compression Across Modalities.
[10] NeRV: Neural Representations for Videos.
[11] NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-wise Modeling.