VQEG Column: VQEG Meeting December 2023

Introduction

The last plenary meeting of the Video Quality Experts Group (VQEG) was hosted online by the University of Konstanz (Germany) from December 18th to 21st, 2023. It gave more than 100 registered participants from 19 different countries the opportunity to attend the numerous presentations and discussions on topics related to the ongoing projects within VQEG. All the related information, minutes, and files from the meeting are available online on the VQEG meeting website, and video recordings of the meeting will soon be available on YouTube.

All the topics mentioned below can be of interest for the SIGMM community working on quality assessment, but special attention can be devoted to the current activities on improvements of the statistical analysis of subjective experiments and objective metrics and on the development of a test plan to evaluate the QoE of immersive interactive communication systems in collaboration with ITU.

Readers of these columns interested in the ongoing projects of VQEG are encouraged to subscribe to VQEG’s email reflectors to follow the ongoing activities and get involved in them.

As already announced on the VQEG website, the next VQEG plenary meeting will be hosted by Universität Klagenfurt in Austria from July 1st to 5th, 2024.

Group picture of the online meeting

Overview of VQEG Projects

Audiovisual HD (AVHD)

The AVHD group works on developing and validating subjective and objective methods to analyze commonly available video systems. During the meeting, there were various sessions in which presentations related to these topics were discussed.

Firstly, Ali Ak (Nantes Université, France) provided an analysis of the relation between acceptance/annoyance and visual quality in a recently collected dataset of several User Generated Content (UGC) videos. Then, Syed Uddin (AGH University of Krakow, Poland) presented a video quality assessment method based on the quantization parameter of MPEG encoders (MPEG-4, MPEG-AVC, and MPEG-HEVC) leveraging VMAF. In addition, Sang Heon Le (LG Electronics, Korea) presented a technique for pre-enhancement for video compression and applicable subjective quality metrics. Another talk was given by Alexander Raake (TU Ilmenau, Germany), who presented AVQBits, a versatile no-reference bitstream-based video quality model (based on the standardized ITU-T P.1204.3 model) that can be applied in several contexts, such as video service monitoring and the evaluation of video encoding quality, gaming video QoE, and even omnidirectional video quality. Also, Jingwen Zhu (Nantes Université, France) and Hadi Amirpour (University of Klagenfurt, Austria) described a study evaluating the effectiveness of different video quality metrics in predicting the Satisfied User Ratio (SUR), in order to enhance the VMAF proxy to better capture content-specific characteristics. Andreas Pastor (Nantes Université, France) presented a method to predict the distortion perceived locally by human eyes in AV1-encoded videos using deep features, which can be easily integrated into video codecs as a pre-processing step before starting encoding.

In relation to standardization efforts, Mathias Wien (RWTH Aachen University, Germany) gave an overview of recent expert viewing tests that have been conducted within MPEG AG5 at the 143rd and 144th MPEG meetings. Also, Kamil Koniuch (AGH University of Krakow, Poland) presented a proposal to update the Survival Game task defined in the ITU-T Recommendation P.1301 on subjective quality evaluation of audio and audiovisual multiparty telemeetings, in order to improve its implementation and application to recent efforts such as the evaluation of immersive communication systems within the ITU-T P.IXC (see the paragraph related to the Immersive Media Group).

Quality Assessment for Health applications (QAH)

The QAH group is focused on the quality assessment of health applications. It addresses subjective evaluation, generation of datasets, development of objective metrics, and task-based approaches. Recently, the group has been working towards an ITU-T recommendation for the assessment of medical contents. On this topic, Meriem Outtas (INSA Rennes, France) led a discussion on the editing of a draft of this recommendation. In addition, Lumi Xia (INSA Rennes, France) presented a study of task-based medical image quality assessment focusing on a use case of adrenal lesions.

Statistical Analysis Methods (SAM)

The SAM group investigates analysis methods both for the results of subjective experiments and for objective quality models and metrics. This was one of the most active groups in this meeting, with several presentations on related topics.

On this topic, Krzysztof Rusek (AGH University of Krakow, Poland) presented a Python package to estimate Generalized Score Distribution (GSD) parameters and showed how to use it to test the results obtained in subjective experiments. Andreas Pastor (Nantes Université, France) presented a comparison between two subjective studies using Absolute Category Rating with Hidden Reference (ACR-HR) and Degradation Category Rating (DCR), conducted in a controlled laboratory environment on SDR HD, UHD, and HDR UHD contents using naive observers. The goal of these tests is to estimate rate-distortion savings between two modern video codecs and to compare the precision and accuracy of both subjective methods. He also presented another study comparing conditions for omnidirectional video with spatial audio in terms of subjective quality and their impact on the resolving power of objective metrics.
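
As a rough illustration of how ACR-HR ratings are typically turned into differential scores, the sketch below follows the differential viewer score described in ITU-T P.910, DV = V(PVS) - V(REF) + 5, computed per subject and then averaged. The function name and the example ratings are hypothetical and are not taken from the study presented at the meeting.

```python
from statistics import mean


def acr_hr_dmos(pvs_scores, ref_scores):
    """Average differential score for one processed video sequence (PVS).

    pvs_scores[i] and ref_scores[i] are the ACR ratings (1 to 5) that
    subject i gave to the PVS and to its hidden reference, respectively.
    """
    if len(pvs_scores) != len(ref_scores):
        raise ValueError("each subject needs a PVS rating and a reference rating")
    # Per-subject differential score: DV = V(PVS) - V(REF) + 5
    diffs = [p - r + 5 for p, r in zip(pvs_scores, ref_scores)]
    return mean(diffs)


# Hypothetical ratings from five subjects (illustrative values only).
pvs = [3, 4, 3, 2, 4]
ref = [5, 5, 4, 4, 5]
dmos = acr_hr_dmos(pvs, ref)
print(f"DMOS = {dmos:.2f}")
```

A score of 5 means the processed sequence was rated as good as its hidden reference, and lower values indicate a larger perceived degradation.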

In addition, Lukas Krasula (Netflix, USA) introduced e2nest, a web-based platform to conduct media-centric (video, audio, and images) subjective tests. Also, Dietmar Saupe (University of Konstanz, Germany) and Simon Del Pin (NTNU, Norway) showed the results of a study analyzing national differences in image quality assessment, showing significant differences in various areas. Alexander Raake (TU Ilmenau, Germany) presented a study on the remote testing of high resolution images and videos using AVrate Voyager, which is a publicly accessible framework for online tests. Finally, Dominik Keller (TU Ilmenau, Germany) presented a recent study exploring the impact of 8K (UHD-2) resolution on HDR video quality, considering different viewing distances. The results showed that the enhanced video quality of 8K HDR over 4K HDR diminishes with increasing viewing distance.

No Reference Metrics (NORM)

The NORM group is a collaborative effort to develop no-reference metrics for monitoring visual service quality. At this meeting, Ioannis Katsavounidis (Meta, USA) led a discussion on the current efforts to improve image and video complexity metrics. In addition, Krishna Srikar Durbha (University of Texas at Austin, USA) presented a technique to tackle the problem of bitrate ladder construction based on multiple Visual Information Fidelity (VIF) feature sets extracted from different scales and subbands of a video.

Emerging Technologies Group (ETG)

The ETG group focuses on various aspects of multimedia that, although they are not necessarily directly related to “video quality”, can indirectly impact the work carried out within VQEG and are not addressed by any of the existing VQEG groups. In particular, this group aims to provide a common platform for people to gather together and discuss new emerging topics, possible collaborations in the form of joint survey papers, funding proposals, etc.

In this meeting, Nabajeet Barman and Saman Zadtootaghaj (Sony Interactive Entertainment, Germany) proposed a new topic for discussion within VQEG: Quality Assessment of AI Generated/Modified Content. The goal is to have subsequent discussions on this topic within the group and to write a position paper or whitepaper.

Joint Effort Group (JEG) – Hybrid

The JEG group addresses several areas of Video Quality Assessment (VQA), such as the creation of a large dataset for training such models using full-reference metrics instead of subjective metrics. In addition, the group includes the VQEG project Implementer’s Guide for Video Quality Metrics (IGVQM). At the meeting, Enrico Masala (Politecnico di Torino, Italy) provided updates on the activities of the group and on IGVQM.

Apart from this, there were three presentations addressing related topics in this meeting, delivered by Lohic Fotio Tiotsop (Politecnico di Torino, Italy). The first presentation focused on quality estimation in subjective experiments and the identification of peculiar subject behaviors, introducing a robust approach for estimating subjective quality from noisy ratings and a novel subject scoring model that highlights several peculiar behaviors. Also, he introduced a non-parametric perspective to address the media quality recovery problem, without making any a priori assumption on the subjects’ scoring behavior. Finally, he presented an approach called “human-in-the-loop training process” that uses multiple cycles of human voting, DNN training, and inference.

Immersive Media Group (IMG)

The IMG group is performing research on the quality assessment of immersive media technologies. Currently, the main joint activity of the group is the development of a test plan to evaluate the QoE of immersive interactive communication systems, which is carried out in collaboration with ITU-T through the work item P.IXC. In this meeting, Pablo Pérez (Nokia XR Lab, Spain), Jesús Gutiérrez (Universidad Politécnica de Madrid, Spain), Kamil Koniuch (AGH University of Krakow, Poland), Ashutosh Singla (CWI, The Netherlands) and other researchers involved in the test plan provided an update on its status, focusing on the description of four interactive tasks to be performed in the test, the considered measures, and the 13 different experiments that will be carried out in the labs involved in the test plan. Also, in relation to this test plan, Felix Immohr (TU Ilmenau, Germany) presented a study on the impact of spatial audio on social presence and user behavior in multi-modal VR communications.

Diagram of the methodology of the joint IMG test plan

Quality Assessment for Computer Vision Applications (QACoViA)

The QACoViA group studies the visual quality requirements for computer vision methods, where the final user is an algorithm. In this meeting, Mikołaj Leszczuk (AGH University of Krakow, Poland) and Jingwen Zhu (Nantes Université, France) presented a specialized dataset developed for enhancing Automatic License Plate Recognition (ALPR) systems. In addition, Hanene Brachemi (IETR-INSA Rennes, France) presented a study on evaluating the vulnerability of deep learning-based image quality assessment methods to adversarial attacks. Finally, Alban Marie (IETR-INSA Rennes, France) delivered a talk on the exploration of the lossy image coding trade-off between rate, machine perception and quality.

5G Key Performance Indicators (5GKPI)

The 5GKPI group studies the relationship between key performance indicators of new 5G networks and the QoE of video services running on top of them. At the meeting, Pablo Pérez (Nokia XR Lab, Spain) led an open discussion on the future activities of the group towards 6G, including a brief presentation of QoS/QoE management in 3GPP and of potential opportunities to influence QoE in 6G.

VQEG Column: VQEG Meeting June 2023

Introduction

This column provides a report on the last Video Quality Experts Group (VQEG) plenary meeting, which took place from 26 to 30 June 2023 in San Mateo (USA), hosted by Sony Interactive Entertainment. More than 90 participants worldwide registered for the hybrid meeting, with more than 40 people attending in person. This meeting was co-located with the ITU-T SG12 meeting, which took place in the first two days of the week. In addition, more than 50 presentations related to the ongoing projects within VQEG were provided, leading to interesting discussions among the researchers attending the meeting. All the related information, minutes, and files from the meeting are available online on the VQEG meeting website, and video recordings of the meeting are available on YouTube.

In this meeting, there were several aspects that can be relevant for the SIGMM community working on quality assessment. For instance, there are interesting new work items and efforts on updating existing recommendations discussed in the ITU-T SG12 co-located meeting (see the section about the Intersector Rapporteur Group on Audiovisual Quality Assessment). In addition, there was an interesting panel related to deep learning for video coding and video quality with experts from different companies (e.g., Netflix, Adobe, Meta, and Google) (see the Emerging Technologies Group section). Also, a special session on Quality of Experience (QoE) for gaming was organized, involving researchers from several international institutions. Apart from this, readers may be interested in the presentation about MPEG activities on quality assessment and the different developments from industry and academia on tools, algorithms and methods for video quality assessment.

We encourage readers interested in any of the activities going on in the working groups to check their websites and subscribe to the corresponding reflectors, to follow them and get involved.

Group picture of the VQEG Meeting 26-30 June 2023 hosted by Sony Interactive Entertainment (San Mateo, USA).

Overview of VQEG Projects

Audiovisual HD (AVHD)

The AVHD group investigates improved subjective and objective methods for analyzing commonly available video systems. In this meeting, there were several presentations related to topics covered by this group, which were distributed in different sessions during the meeting.

Nabajeet Barman (Kingston University, UK) presented a datasheet for subjective and objective quality assessment datasets. Ali Ak (Nantes Université, France) delivered a presentation on the acceptability and annoyance of video quality in context. Mikołaj Leszczuk (AGH University, Poland) presented a crowdsourcing pixel quality study using non-neutral photos. Kamil Koniuch (AGH University, Poland) discussed the role of theoretical models in ecologically valid studies, covering the example of a video quality of experience model. Jingwen Zhu (Nantes Université, France) presented her work on evaluating the streaming experience of viewers with Just Noticeable Difference (JND)-based encoding. Also, Lucjan Janowski (AGH University, Poland) proposed a more ecologically valid experiment protocol using the YouTube platform.

In addition, there were four presentations by researchers from the industry sector. Hojat Yeganeh (SSIMWAVE/IMAX, USA) talked about how more accurate video quality assessment metrics would lead to more savings. Lukas Krasula (Netflix, USA) delivered a presentation on subjective video quality for 4K HDR-WCG content using a browser-based approach for at-home testing. Also, Christos Bampis (Netflix, USA) presented the work done by Netflix on improving video quality with neural networks. Finally, Pranav Sodhani (Apple, USA) talked about how to evaluate videos with the Advanced Video Quality Tool (AVQT).

Quality Assessment for Health applications (QAH)

The QAH group works on the quality assessment of health applications, considering both subjective evaluation and the development of datasets, objective metrics, and task-based approaches. The group is currently working towards an ITU-T recommendation for the assessment of medical contents. In this sense, Meriem Outtas (INSA Rennes, France) led an editing session of a draft of this recommendation.

Statistical Analysis Methods (SAM)

The SAM group works on improving analysis methods both for the results of subjective experiments and for objective quality models and metrics. The group is currently working on updating and merging the ITU-T recommendations P.913, P.911, and P.910.

Apart from this, several researchers presented their work on related topics. For instance, Pablo Pérez (Nokia XR Lab, Spain) presented (not so) new findings about transmission rating scale and subjective scores. Also, Jingwen Zhu (Nantes Université, France) presented ZREC, an approach for mean and percentile opinion scores recovery. In addition, Andreas Pastor (Nantes Université, France) presented three works: 1) on the accuracy of open video quality metrics for local decision in AV1 video codec, 2) on recovering quality scores in noisy pairwise subjective experiments using negative log-likelihood, and 3) on guidelines for subjective haptic quality assessment, considering a case study on quality assessment of compressed haptic signals. Lucjan Janowski (AGH University, Poland) discussed experiment precision, proposing experiment precision measures and methods for comparing experiments. Finally, there were three presentations from members of the University of Konstanz (Germany). Dietmar Saupe presented the JPEG AIC-3 activity on fine-grained assessment of subjective quality of compressed images, Mohsen Jenadeleh talked about how relaxed forced choice improves performance of visual quality assessment methods, and Mirko Dulfer presented his work on quantization for Mean Opinion Score (MOS) recovery in Absolute Category Rating (ACR) experiments.

Computer Generated Imagery (CGI)

The CGI group is devoted to analyzing and evaluating computer-generated content, with a focus on gaming in particular. In this meeting, Saman Zadtootaghaj (Sony Interactive Entertainment, Germany) and Nabajeet Barman (Kingston University, UK) organized a special gaming session, in which researchers from several international institutions presented their work on this topic. Among them, Yu-Chih Chen (UT Austin LIVE Lab, USA) presented GAMIVAL, a video quality prediction model for mobile cloud gaming content. Also, Urvashi Pal (Akamai, USA) delivered a presentation on web streaming quality assessment via computer vision applications over cloud. Mathias Wien (RWTH Aachen University, Germany) provided updates on the ITU-T P.BBQCG work item, dataset and model development. Avinab Saha (UT Austin LIVE Lab, USA) presented a study of subjective and objective quality assessment of mobile cloud gaming videos. Finally, Irina Cotanis (Infovista, Sweden) and Karan Mitra (Luleå University of Technology, Sweden) presented their work towards QoE models for mobile cloud and virtual reality games.

No Reference Metrics (NORM)

The NORM group is an open collaborative project for developing no-reference metrics for monitoring visual service quality. In this meeting, Margaret Pinson (NTIA, USA) and Ioannis Katsavounidis (Meta, USA), two of the chairs of the group, provided a summary of NORM successes and discussed current efforts towards an improved complexity metric. In addition, there were six presentations dealing with related topics. C.-C. Jay Kuo (University of Southern California, USA) talked about blind visual quality assessment for mobile/edge computing. Vignesh V. Menon (University of Klagenfurt, Austria) presented the updates of the Video Quality Analyzer (VQA). Yilin Wang (Google/YouTube, USA) gave a talk on the recent updates on the Universal Video Quality (UVQ). Farhad Pakdaman (Tampere University, Finland) and Li Yu (Nanjing University, China) presented a low complexity no-reference image quality assessment based on multi-scale attention mechanism with natural scene statistics. Finally, Mikołaj Leszczuk (AGH University, Poland) presented his work on visual quality indicators adapted to resolution changes and on considering in-the-wild video content as a special case of user generated content and a system for its recognition.

Emerging Technologies Group (ETG)

The main objective of the ETG group is to address various aspects of multimedia that do not fall under the scope of any of the existing VQEG groups. The topics addressed are not necessarily directly related to “video quality” but can indirectly impact the work addressed as part of VQEG. This group aims to provide a common platform for people to gather together and discuss new emerging topics, discuss possible collaborations in the form of joint survey papers/whitepapers, funding proposals, etc.

One of the topics addressed by this group is the application of artificial-intelligence technologies to different domains, such as compression, super-resolution, and video quality assessment. On this topic, Saman Zadtootaghaj (Sony Interactive Entertainment, Germany) organized a panel session with experts from different companies (e.g., Netflix, Adobe, Meta, and Google) on deep learning in the video coding and video quality domains. In addition, Marcos Conde (Sony Interactive Entertainment, Germany) and David Minnen (Google, USA) gave a talk on generative compression and the challenges for quality assessment.

Another topic covered by this group is the greening of streaming and related trends. On this topic, Vignesh V. Menon and Samira Afzal (University of Klagenfurt, Austria) presented their work on green variable framerate encoding for adaptive live streaming. Also, Prajit T. Rajendran (Université Paris Saclay, France) and Vignesh V. Menon (University of Klagenfurt, Austria) delivered a presentation on energy efficient live per-title encoding for adaptive streaming. Finally, Berivan Isik (Stanford University, USA) talked about sandwiched video compression, which efficiently extends the reach of standard codecs with neural wrappers.

Joint Effort Group (JEG) – Hybrid

The JEG group was originally focused on joint work to develop hybrid perceptual/bitstream metrics and has gradually evolved over time to include several areas of Video Quality Assessment (VQA), such as the creation of a large dataset for training such models using full-reference metrics instead of subjective metrics. In addition, the group will include under its activities the VQEG project Implementer’s Guide for Video Quality Metrics (IGVQM).

Apart from this, there were three presentations addressing related topics in this meeting. Nabajeet Barman (Kingston University, UK) presented a subjective dataset for multi-screen video streaming applications. Also, Lohic Fotio (Politecnico di Torino, Italy) presented his works entitled “Human-in-the-loop” training procedure of the artificial-intelligence-based observer (AIO) of a real subject and advances on the “template” on how to report DNN-based video quality metrics.

The website of the group includes a list of activities of interest, freely available publications, and other resources.

Immersive Media Group (IMG)

The IMG group is focused on the research on quality assessment of immersive media. The main joint activity going on within the group is the development of a test plan to evaluate the QoE of immersive interactive communication systems, which is carried out in collaboration with ITU-T through the work item P.IXC. In this meeting, Pablo Pérez (Nokia XR Lab, Spain) and Jesús Gutiérrez (Universidad Politécnica de Madrid, Spain) provided a report on the status of the test plan, including the test proposals from 13 different groups that have joined the activity, which will be launched in September.

In addition to this, Shirin Rafiei (RISE, Sweden) delivered a presentation on her work on human interaction in industrial tele-operated driving through a laboratory investigation.

Quality Assessment for Computer Vision Applications (QACoViA)

The goal of the QACoViA group is to study the visual quality requirements for computer vision methods, where the “final observer” is an algorithm. In this meeting, Avrajyoti Dutta (AGH University, Poland) delivered a presentation dealing with the subjective quality assessment of video summarization algorithms through a crowdsourcing approach.

Intersector Rapporteur Group on Audiovisual Quality Assessment (IRG-AVQA)

This VQEG meeting was co-located with the rapporteur group meeting of ITU-T Study Group 12 – Question 19, coordinated by Chulhee Lee (Yonsei University, Korea). During the first two days of the week, the experts from ITU-T and VQEG worked together on various topics. For instance, there was an editing session to work together on the VQEG proposal to merge the ITU-T Recommendations P.910, P.911, and P.913, including updates with new methods. Another topic addressed during this meeting was the working item “P.obj-recog”, related to the development of an object-recognition-rate-estimation model in surveillance video of autonomous driving. In this sense, a liaison statement was also discussed with the VQEG AVHD group. Also in relation to this group, another liaison statement was discussed on the new work item “P.SMAR” on subjective tests for evaluating the user experience for mobile Augmented Reality (AR) applications.

Other updates

One interesting presentation was given by Mathias Wien (RWTH Aachen University, Germany) on the quality evaluation activities carried out within the MPEG Visual Quality Assessment group, including the expert viewing tests. This presentation and the follow-up discussions will help to strengthen the collaboration between VQEG and MPEG on video quality evaluation activities.

The next VQEG plenary meeting will take place in autumn 2023 and will be announced soon on the VQEG website.

VQEG Column: Emerging Technologies Group (ETG)

Introduction

This column provides an overview of the new Video Quality Experts Group (VQEG) group called the Emerging Technologies Group (ETG), which was created during the last VQEG plenary meeting in December 2022. For an introduction to VQEG, please check the VQEG homepage or this presentation.

The works addressed by this new group can be of interest for the SIGMM community since they are related to AI-based technologies for image and video processing, greening of streaming, blockchain in media and entertainment, and ongoing related standardization activities.

About ETG

The main objective of this group is to address various aspects of multimedia that do not fall under the scope of any of the existing VQEG groups. The group, through its activities, aims to provide a common platform for people to gather together and discuss new emerging topics and ideas, discuss possible collaborations in the form of joint survey papers/whitepapers, funding proposals, etc. The topics addressed are not necessarily directly related to “video quality” but rather focus on any ongoing work in the field of multimedia which can indirectly impact the work addressed as part of VQEG. 

Scope

During the creation of the group, the following topics were tentatively identified to be of possible interest to the members of this group and VQEG in general: 

  • AI-based technologies:
    • Super Resolution
    • Learning-based video compression
    • Video coding for machines, etc.
    • Enhancement, Denoising and other pre- and post-filter techniques
  • Greening of streaming and related trends
    • For example, trade-off between HDR and SDR to save energy and its impact on visual quality
  • Ongoing Standards Activities (which might impact the QoE of end users and hence will be relevant for VQEG)
    • 3GPP, SVTA, CTA WAVE, UHDF, etc.
    • MPEG/JVET
  • Blockchain in Media and Entertainment

Since the creation of the group, four talks on various topics have been organized, an overview of which is summarized next.

Overview of the Presentations

We briefly provide a summary of various talks that have been organized by the group since its inception.

On the work by MPEG Systems Smart Contracts for Media Subgroup

The first presentation was on the topic of the recent work by MPEG Systems on Smart Contracts for Media [1], which was delivered by Dr Panos Kudumakis, who is the Head of UK Delegation, ISO/IEC JTC1/SC29 & Chair of British Standards Institute (BSI) IST/37. In this talk, Dr Panos highlighted the efforts made by MPEG in the last few years towards developing several standardized ontologies catering to the needs of the media industry with respect to the codification of Intellectual Property Rights (IPR) information toward the fair trade of media. However, since the inference and reasoning capabilities normally associated with ontology use cannot naturally be provided in Distributed Ledger Technology (DLT) environments, there is a huge potential to unlock the Semantic Web and, in turn, the creative economy by bridging this interoperability gap [2]. In that direction, the ISO/IEC 21000-23 Smart Contracts for Media standard specifies the means (e.g., APIs) for converting MPEG IPR ontologies to smart contracts that can be executed on existing DLT environments [3]. The talk discussed the recent work that has been done as part of this effort and also the ongoing efforts towards the design of a full-fledged ISO/IEC 23000-23 Decentralized Media Rights Application Format standard based on MPEG technologies (e.g., audio-visual codecs, file formats, streaming protocols, and smart contracts) and non-MPEG technologies (e.g., DLTs, content, and creator IDs).
The recording of the presentation is available here, and the slides can be accessed here.

Introduction to NTIRE Workshop on Quality Assessment for Video Enhancement

The second presentation was given by Xiaohong Liu and Yuxuan Gao from Shanghai Jiao Tong University, China, about one of the CVPR challenge workshops, the NTIRE 2023 Quality Assessment of Video Enhancement Challenge. The presentation described the motivation for starting this challenge and why it is of great relevance to the video community in general. Then the presenters described the dataset, including its creation process, the subjective tests used to obtain ratings, and the reasoning behind the split of the dataset into training, validation, and test sets. The results of this challenge are scheduled to be presented at the upcoming spring meeting at the end of June 2023. The presentation recording is available here.

Perception: The Next Milestone in Learned Image Compression

Johannes Balle from Google was the third presenter, on the topic of “Perception: The Next Milestone in Learned Image Compression.” In the first part, Johannes discussed learned compression and described nonlinear transforms [4] and how they can achieve a higher image compression rate than linear transforms. Next, he emphasized the importance of perceptual metrics in comparison to distortion metrics by introducing the difference between perceptual quality and reconstruction quality [5]. He then presented an example of generative image compression, HiFiC [6], in which the two criteria of distortion metric and perceptual metric (referred to as the realism criterion) are combined. Finally, the talk concluded with an introduction to perceptual spaces and an example of a perceptual metric, PIM [7]. The presentation slides can be found here.

Compression with Neural Fields

Emilien Dupont (DeepMind) was the fourth presenter. He started the talk with a short introduction on the emergence of neural compression that fits a signal, e.g., an image or video, to a neural network. He then discussed two recent works on neural compression that he was involved in, named COIN [8] and COIN++ [9]. He then gave a short overview of other implicit neural representations in the domain of video, such as NeRV [10] and NIRVANA [11]. The slides for the presentation can be found here.

Upcoming Presentations

As part of the ongoing efforts of the group, the following talks/presentations are scheduled in the next two months. For an updated schedule and list of presentations, please check the ETG homepage here.

Sustainable/Green Video Streaming

Given the increasing carbon footprint of streaming services and the climate crisis, many new collaborative efforts have started recently, such as the Greening of the Streaming alliance, Ultra HD Sustainability forum, etc. In addition, recent research works have started focusing on how to make video streaming greener and more sustainable. A talk providing an overview of the recent works and progress in this direction is tentatively scheduled around mid-May, 2023.

Panel discussion at VQEG Spring Meeting (June 26-30, 2023), Sony Interactive Entertainment HQ, San Mateo, US

During the next face-to-face VQEG meeting in San Mateo there will be a panel discussion on the topic of “Deep Learning in Video Quality and Compression.” The goal is to invite machine learning experts to VQEG and bring the two groups closer. ETG will organize the panel discussion, and the following four panellists are currently invited to join this event: Zhi Li (Netflix), Ioannis Katsavounidis (Meta), Richard Zhang (Adobe), and Mathias Wien (RWTH Aachen). Before this panel discussion, two talks are tentatively scheduled, the first one on video super-resolution and the second one focussing on learned image compression.
The meeting will take place in hybrid mode, allowing for participation both in person and online. For further information about the meeting, please check the details here and, if interested, register for the meeting.

Joining and Other Logistics

While participation in the talks is open to everyone, to get notified about upcoming talks and participate in the discussion, please consider subscribing to the etg@vqeg.org email reflector and joining the Slack channel using this link. The meeting minutes are available here. We are always looking for new ideas to improve. If you have suggestions on topics we should focus on or recommendations for presenters, please reach out to the chairs (Nabajeet and Saman).

References

[1] White paper on MPEG Smart Contracts for Media.
[2] DLT-based Standards for IPR Management in the Media Industry.
[3] DLT-agnostic Media Smart Contracts (ISO/IEC 21000-23).
[4] [2007.03034] Nonlinear Transform Coding.
[5] [1711.06077] The Perception-Distortion Tradeoff.
[6] [2006.09965] High-Fidelity Generative Image Compression.
[7] [2006.06752] An Unsupervised Information-Theoretic Perceptual Quality Metric.
[8] COIN: Compression with Implicit Neural Representations.
[9] COIN++: Neural Compression Across Modalities.
[10] NeRV: Neural Representations for Videos.
[11] NIRVANA: Neural Implicit Representations of Videos with Adaptive Networks and Autoregressive Patch-wise Modeling.

VQEG Column: VQEG Meeting December 2022

Introduction

This column provides an overview of the last Video Quality Experts Group (VQEG) plenary meeting, which took place from 12 to 16 December 2022. Around 100 participants from 21 different countries around the world registered for the meeting, which was organized online by Brightcove (United Kingdom). During the five days, there were more than 40 presentations and discussions among researchers working on topics related to the projects ongoing within VQEG. All the related information, minutes, and files from the meeting are available online on the VQEG meeting website, and video recordings of the meeting are available on YouTube.

Many of the works presented in this meeting can be relevant for the SIGMM community working on quality assessment. Particularly interesting can be the proposals to update and merge ITU-T recommendations P.913, P.911, and P.910, the kick-off of the test plan to evaluate the QoE of immersive interactive communication systems, and the creation of a new group on emerging technologies that will start working on AI-based technologies and greening of streaming and related trends.

We encourage readers interested in any of the activities going on in the working groups to check their websites and subscribe to the corresponding reflectors, to follow them and get involved.

Group picture of the VQEG Meeting 12-16 December 2022 (online).

Overview of VQEG Projects

Audiovisual HD (AVHD)

The AVHD group investigates improved subjective and objective methods for analysing commonly available video systems. Currently, there are two projects ongoing under this group: Quality of Experience (QoE) Metrics for Live Video Streaming Applications (Live QoE) and Advanced Subjective Methods (AVHD-SUB).

In this meeting, there were three presentations related to topics covered by this group. In the first one, Maria Martini (Kingston University, UK) presented her work on converting video quality assessment metrics. In particular, the work addressed the relationship between SSIM and PSNR for DCT-based compressed images and video, exploiting the content-related factor [1]. The second presentation was given by Urvashi Pal (Akamai, Australia) and dealt with video codec profiling with video quality assessment complexities and resolutions. Finally, Jingwen Zhu (Nantes Université, France) presented her work on the benefit of parameter-driven approaches for the modelling and the prediction of a Satisfied User Ratio for compressed videos [2].

Quality Assessment for Health applications (QAH)

The QAH group works on the quality assessment of health applications, considering both subjective evaluation and the development of datasets, objective metrics, and task-based approaches. Currently there is an open discussion on new topics to address within the group, such as the application of visual attention models and studies to health applications. Also, an opportunity to conduct medical perception research was announced, which was proposed by Elizabeth Krupinski and will take place at the European Congress of Radiology (Vienna, Austria, Mar. 2023).

In addition, four research works were presented at the meeting. Firstly, Julie Fournier (INSA Rennes, France) presented new insights on affinity therapy for people with ASD, based on an eye-tracking study on images. The second presentation was delivered by Lumi Xia (INSA Rennes, France) and dealt with the evaluation of the usability of deep learning-based denoising models for low-dose CT simulation. Also, Mohamed Amine Kerkouri (University of Orleans, France) presented his work on deep-based quality assessment of medical images through domain adaptation. Finally, Jorge Caviedes (ASU, USA) delivered a talk on cognition-inspired diagnostic image quality models, emphasising the need to distinguish among interpretability (e.g., the medical professional is confident in making a diagnosis), adequacy (e.g., the capture technique shows the right area for assessment), and visual quality (e.g., MOS) in the quality assessment of medical contents.

Statistical Analysis Methods (SAM)

The SAM group works on improving analysis methods both for the results of subjective experiments and for objective quality models and metrics. The group is currently working on updating and merging the ITU-T recommendations P.913, P.911, and P.910. The suggestion is to make P.910 and P.911 obsolete and make P.913 the only recommendation from ITU-T on subjective video quality assessments. The group worked on the liaison and document to be sent to ITU-T SG12, which will be available in the meeting files.

In addition, Mohsen Jenadeleh (University of Konstanz, Germany) presented his work on collective just noticeable difference assessment for compressed video with Flicker Test and QUEST+.

Computer Generated Imagery (CGI)

The CGI group is devoted to analysing and evaluating computer-generated content, with a focus on gaming in particular. The group is currently working in collaboration with ITU-T SG12 on the work item P.BBQCG on Parametric bitstream-based Quality Assessment of Cloud Gaming Services. In this sense, Saman Zadtootaghaj (Sony Interactive Entertainment, Germany) provided an update on the ongoing activities. In addition, they are working on two new work items: G.OMMOG on Opinion Model for Mobile Online Gaming applications and P.CROWDG on Subjective Evaluation of Gaming Quality with a Crowdsourcing Approach. Also, the group is working on identifying other topics and interests in CGI beyond gaming content.

No Reference Metrics (NORM)

The NORM group is an open collaborative project for developing no-reference metrics for monitoring visual service quality. Currently, the group is working on three topics: the development of no-reference metrics, the clarification of the computation of the Spatial and Temporal Indexes (SI and TI, defined in the ITU-T Recommendation P.910), and the development of a standard for video quality metadata. 

In relation to the first topic, Margaret Pinson (NTIA/ITS, US) talked about why no-reference metrics for image and video quality lack accuracy and reproducibility [3] and presented new datasets containing camera noise and compression artifacts for the development of no-reference metrics by the group. In addition, Oliver Wiedemann (University of Konstanz, Germany) presented his work on cross-resolution image quality assessment.

Regarding the computation of complexity indices, Maria Martini (Kingston University, UK) presented a study comparing 12 metrics (and possible combinations) for assessing video content complexity. Vignesh V. Menon (University of Klagenfurt, Austria) presented a summary of live per-title encoding approaches using video complexity features. Ioannis Katsavounidis and Cosmin Stejerean (Meta, US) presented their work on using motion search to order videos by coding complexity, and made the software available as open source. In addition, they led a discussion on supplementing classic SI and TI with improved complexity metrics (VCA, motion search, etc.).
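As a point of reference for the SI/TI discussion, the classic indices can be sketched in a few lines. SI is commonly described as the maximum over frames of the standard deviation of the Sobel-filtered luma frame, and TI as the maximum over time of the standard deviation of successive frame differences. The function names, the list-of-rows frame representation, the skipping of border pixels, and the use of the population standard deviation are assumptions of this sketch; it is not the clarified procedure worked on by the group, and value-range normalisation is deliberately left out.

```python
import math
from statistics import pstdev


def sobel_magnitude(frame):
    """Sobel gradient magnitudes for the interior pixels of a 2-D luma frame.

    `frame` is a list of rows of numbers; border pixels are skipped
    (an assumption of this sketch).
    """
    height, width = len(frame), len(frame[0])
    magnitudes = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            magnitudes.append(math.hypot(gx, gy))
    return magnitudes


def spatial_information(frames):
    """SI: maximum over frames of the std of the Sobel-filtered frame."""
    return max(pstdev(sobel_magnitude(frame)) for frame in frames)


def temporal_information(frames):
    """TI: maximum over time of the std of pixel-wise frame differences."""
    per_frame_std = []
    for previous, current in zip(frames, frames[1:]):
        difference = [c - p
                      for prev_row, cur_row in zip(previous, current)
                      for p, c in zip(prev_row, cur_row)]
        per_frame_std.append(pstdev(difference))
    # A single frame has no temporal difference; return 0.0 in that case.
    return max(per_frame_std) if per_frame_std else 0.0


if __name__ == "__main__":
    # Two tiny synthetic luma frames: a flat frame and one with a vertical edge.
    flat = [[0] * 6 for _ in range(4)]
    edge = [[0, 0, 0, 100, 100, 100] for _ in range(4)]
    print("SI:", spatial_information([flat, edge]))
    print("TI:", temporal_information([flat, edge]))
```

A flat sequence gives SI and TI of zero, while sharp edges raise SI and abrupt changes between frames raise TI, which is why the two indices are used as simple proxies for spatial and temporal complexity.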

Finally, related to the third topic, Ioannis Katsavounidis (Meta, US) provided an update on the status of the project. Given that the idea is already mature enough, a contribution will be made to MPEG to consider the insertion of metadata of video metrics into the encoded video streams. In addition, a liaison with AOMedia will be established that may go beyond this particular topic and include best practices on subjective testing, IMG topics, etc.

Joint Effort Group (JEG) – Hybrid

The JEG group was focused on a joint work to develop hybrid perceptual/bitstream metrics and gradually evolved over time to include several areas of Video Quality Assessment (VQA), such as the creation of a large dataset for training such models using full-reference metrics instead of subjective metrics. Currently, the group is working on research problems rather than algorithms and models with immediate applicability. In addition, the group has launched a new website, which includes a list of activities of interest, freely available publications, and other resources. 

Two examples of research problems addressed by the group were shown by the two presentations given by Lohic Fotio Tiotsop (Politecnico di Torino, Italy). The topic of the first presentation was related to the training of artificial intelligence observers for a wide range of applications, while the second presentation provided guidelines to train, validate, and publish DNN-based objective measures.

5G Key Performance Indicators (5GKPI)

The 5GKPI group studies the relationship between key performance indicators of new 5G networks and QoE of video services on top of them. In this meeting, Pablo Pérez (Nokia XR Lab, Spain) presented an overview of activities related to QoE and XR within 3GPP.

Immersive Media Group (IMG)

The IMG group is focused on the research on quality assessment of immersive media. The main joint activity going on within the group is the development of a test plan to evaluate the QoE of immersive interactive communication systems. After the discussions that took place in previous meetings and audio calls, a tentative schedule has been proposed to start the execution of the test plan in the following months. In this sense, a new work item will be proposed in the next ITU-T SG12 meeting to establish a collaboration between VQEG-IMG and ITU on this topic.

In addition to this, a variety of topics related to immersive media technologies were covered in the works presented during the meeting. For example, Yaosi Hu (Wuhan University, China) presented her work on video quality assessment based on quality aggregation networks. In relation to light field imaging, Maria Martini (Kingston University, UK) outlined the main problems that light field quality assessment datasets currently face and presented a new dataset. Also, there were three talks by researchers from CWI (Netherlands) dealing with point cloud QoE assessment: Silvia Rossi presented a behavioral analysis in a 6-DoF VR system, taking into account the influence of content, quality and user disposition [4]; Shishir Subramanyam presented his work related to the subjective QoE evaluation of user-centered adaptive streaming of dynamic point clouds [5]; and Irene Viola presented a point cloud objective quality assessment using PCA-based descriptors (PointPCA). Another presentation related to point cloud quality assessment was delivered by Marouane Tliba (Université d’Orleans, France), who presented an efficient deep-based graph objective metric.

In addition, Shirin Rafiei (RISE, Sweden) gave a talk on UX and QoE aspects of remote control operations using a laboratory platform. Marta Orduna (Universidad Politécnica de Madrid, Spain) presented her work on comparing ACR, SSDQE, and SSCQE in long duration 360-degree videos, whose results will be used to submit a proposal to extend ITU-T Rec. P.919 for long sequences. Finally, Ali Ak (Nantes Université, France) presented his work on just noticeable differences for HDR/SDR image/video quality.

Quality Assessment for Computer Vision Applications (QACoViA)

The goal of the QACoViA group is to study the visual quality requirements for computer vision methods, where the “final observer” is an algorithm. Four presentations were delivered in this meeting addressing diverse related topics. In the first one, Mikołaj Leszczuk (AGH University, Poland) presented a method for assessing objective video quality for automatic license plate recognition tasks [6]. Also, Femi Adeyemi-Ejeye (University of Surrey, UK) presented his work related to the assessment of rail 8K-UHD CCTV facing video for the investigation of collisions. The third presentation dealt with the application of facial expression recognition and was delivered by Lucie Lévêque (Nantes Université, France), who compared the robustness of humans and deep neural networks on this task [7]. Finally, Alban Marie (INSA Rennes, France) presented a study on video coding for machines through a large-scale evaluation of the robustness of DNNs to compression artefacts for semantic segmentation [8].

Other updates

In relation to the Human Factors for Visual Experiences (HFVE) group, Maria Martini (Kingston University, UK) provided a summary of the status of IEEE recommended practice for the quality assessment of light field imaging. Also, Kjell Brunnström (RISE, Sweden) presented a study related to the perceptual quality of video on simulated low temperatures in LCD vehicle displays.

In addition, a new group was created in this meeting called Emerging Technologies Group (ETG), whose main objective is to address various aspects of multimedia that do not fall under the scope of any of the existing VQEG groups. The topics addressed are not necessarily directly related to “video quality” but can indirectly impact the work addressed as part of VQEG. In particular, two major topics of interest have been identified so far: AI-based technologies and greening of streaming and related trends. Nevertheless, the group aims to provide a common platform for people to gather and discuss new emerging topics and possible collaborations in the form of joint survey papers/whitepapers, funding proposals, etc.

Moreover, it was agreed during the meeting to make the Psycho-Physiological Quality Assessment (PsyPhyQA) group dormant until interest resumes in this effort. Also, it was proposed to move the Implementer’s Guide for Video Quality Metrics (IGVQM) project into the JEG-Hybrid, since their activities are currently closely related. This will be discussed in future group meetings and the final decisions will be announced. Finally, as a reminder, the VQEG GitHub with tools and subjective labs setup is still online and kept updated.

The next VQEG plenary meeting will take place in May 2023 and the location will be announced soon on the VQEG website.

References

[1] Maria G. Martini, “On the relationship between SSIM and PSNR for DCT-based compressed images and video: SSIM as content-aware PSNR”, TechRxiv. Preprint. https://doi.org/10.36227/techrxiv.21725390.v1, 2022.
[2] J. Zhu, P. Le Callet, A. Perrin, S. Sethuraman, K. Rahul, “On The Benefit of Parameter-Driven Approaches for the Modeling and the Prediction of Satisfied User Ratio for Compressed Video”, IEEE International Conference on Image Processing (ICIP), Oct. 2022.
[3] Margaret H. Pinson, “Why No Reference Metrics for Image and Video Quality Lack Accuracy and Reproducibility”, Frontiers in Signal Processing, Jul. 2022.
[4] S. Rossi, I. Viola, P. Cesar, “Behavioural Analysis in a 6-DoF VR System: Influence of Content, Quality and User Disposition”, Proceedings of the 1st Workshop on Interactive eXtended Reality, Oct. 2022.
[5] S. Subramanyam, I. Viola, J. Jansen, E. Alexiou, A. Hanjalic, P. Cesar, “Subjective QoE Evaluation of User-Centered Adaptive Streaming of Dynamic Point Clouds”, International Conference on Quality of Multimedia Experience (QoMEX), Sep. 2022.
[6] M. Leszczuk, L. Janowski, J. Nawała, and A. Boev, “Method for Assessing Objective Video Quality for Automatic License Plate Recognition Tasks”, Communications in Computer and Information Science, Oct. 2022.
[7] L. Lévêque, F. Villoteau, E. V. B. Sampaio, M. Perreira Da Silva, and P. Le Callet, “Comparing the Robustness of Humans and Deep Neural Networks on Facial Expression Recognition”, Electronics, 11(23), Dec. 2022.
[8] A. Marie, K. Desnos, L. Morin, and Lu Zhang, “Video Coding for Machines: Large-Scale Evaluation of Deep Neural Networks Robustness to Compression Artifacts for Semantic Segmentation”, IEEE International Workshop on Multimedia Signal Processing (MMSP), Sep. 2022.

VQEG Column: VQEG Meeting May 2022

Introduction

Welcome to this new column on the ACM SIGMM Records from the Video Quality Experts Group (VQEG), which will provide an overview of the last VQEG plenary meeting that took place from 9 to 13 May 2022. It was organized by INSA Rennes (France), and it was the first face-to-face meeting after the series of online meetings due to the COVID-19 pandemic. Remote attendance was also offered, which made it possible for around 100 participants from 17 different countries to attend the meeting (more than 30 of them in person). During the meeting, more than 40 presentations were given, and interesting discussions took place. All the related information, minutes, and files from the meeting are available online on the VQEG meeting website, and video recordings of the meeting are available on YouTube.

Many of the works presented at this meeting can be relevant for the SIGMM community working on quality assessment. Particularly interesting can be the proposals to update the ITU-T Recommendations P.910 and P.913, as well as the presented publicly available datasets. We encourage those readers interested in any of the activities going on in the working groups to check their websites and subscribe to the corresponding reflectors, to follow them and get involved.

Group picture of the VQEG Meeting 9-13 May 2022 in Rennes (France).

Overview of VQEG Projects

Audiovisual HD (AVHD)

The AVHD group investigates improved subjective and objective methods for analyzing commonly available video systems. In this sense, the group continues working on extensions of the ITU-T Recommendation P.1204 to cover other encoders (e.g., AV1) apart from H.264, HEVC, and VP9. In addition, the projects Quality of Experience (QoE) Metrics for Live Video Streaming Applications (Live QoE) and Advanced Subjective Methods (AVHD-SUB) are still ongoing.

In this meeting, several AVHD-related topics were discussed, supported by six different presentations. In the first one, Mikołaj Leszczuk (AGH University, Poland) presented an analysis of how experiment conditions, such as video sequence order, variation, and repeatability, influence the subjective assessment of video transmission quality, since they can entail a “learning” process of the participants during the test. In the second presentation, Lucjan Janowski (AGH University, Poland) presented two proposals towards more ecologically valid experiment designs: the first one using the Absolute Category Rating [1] without scale but in a “think aloud” manner, and the second one called “Your Youtube, our lab”, in which the user selects the content that he or she prefers and a quality question appears during the viewing experience through a specifically designed interface. Also dealing with the study of testing methodologies, Babak Naderi (TU-Berlin, Germany) presented work on subjective evaluation of video quality with a crowdsourcing approach, while Pierre David (Capacités, France) presented a three-lab experiment, involving Capacités (France), RISE (Sweden) and AGH University (Poland), on quality evaluation of social media videos. Kjell Brunnström (RISE, Sweden) continued by giving an overview of video quality assessment of Video Assistant Refereeing (VAR) systems, and lastly, Olof Lindman (SVT, Sweden) presented another effort to reduce the lack of open datasets with the Swedish Television (SVT) Open Content.

Quality Assessment for Health applications (QAH)

The QAH group works on the quality assessment of health applications, considering both subjective evaluation and the development of datasets, objective metrics, and task-based approaches. In this meeting, Lucie Lévêque (Nantes Université, France) provided an overview of the recent activities of the group, including a submitted review paper on objective quality assessment for medical images, a special session accepted for the IEEE International Conference on Image Processing (ICIP) that will take place in October in Bordeaux (France), and a paper submitted to IEEE ICIP on quality assessment through a detection task of COVID-19 pneumonia. The work described in this paper was also presented by Meriem Outtas (INSA Rennes, France).

In addition, there were two more presentations related to the quality assessment of medical images. Firstly, Yuhao Sun (University of Edinburgh, UK) presented their research on a no-reference image quality metric for visual distortions on Computed Tomography (CT) scans [2]. Finally, Marouane Tliba (Université d’Orleans, France) presented his studies on quality assessment of medical images through deep-learning techniques using domain adaptation.

Statistical Analysis Methods (SAM)

The SAM group works on improving analysis methods both for the results of subjective experiments and for objective quality models and metrics. The group is currently working on a proposal to update the ITU-T Recommendation P.913, including new testing methods for subjective quality assessment and statistical analysis of the results. Margaret Pinson presented this work during the meeting.   

In addition, five presentations were delivered addressing topics related to the group activities. Jakub Nawała (AGH University, Poland) presented the Generalised Score Distribution to accurately describe responses from subjective quality experiments. Three presentations were provided by members of Nantes Université (France): Ali Ak presented his work on spammer detection in pairwise comparison experiments, Andreas Pastor talked about how to improve the maximum likelihood difference scaling method in order to measure the inter-content scale, and Chama El Majeny presented the functionalities of a subjective test analysis tool, whose code will be publicly available. Finally, Dietmar Saupe (University of Konstanz, Germany) delivered a presentation on subjective image quality assessment with boosted triplet comparisons.

Computer Generated Imagery (CGI)

The CGI group is devoted to analyzing and evaluating computer-generated content, with a focus on gaming in particular. Currently, the group is working on the ITU-T Work Item P.BBQCG on Parametric bitstream-based Quality Assessment of Cloud Gaming Services. Apart from this, Jerry (Xiangxu) Yu (University of Texas at Austin, US) presented work on subjective and objective quality assessment of user-generated gaming videos, and Nasim Jamshidi (TUB, Germany) presented a deep-learning bitstream-based video quality model for CG content.

No Reference Metrics (NORM)

The NORM group is an open collaborative project for developing no-reference metrics for monitoring visual service quality. Currently, the group is working on three topics: the development of no-reference metrics, the clarification of the computation of the Spatial and Temporal Indexes (SI and TI, defined in the ITU-T Recommendation P.910), and the development of a standard for video quality metadata.

At this meeting, this was one of the most active groups and the corresponding sessions included several presentations and discussions. Firstly, Yiannis Andreopoulos (iSIZE, UK) presented their work on domain-specific fusion of multiple objective quality metrics. Then, Werner Robitza (AVEQ GmbH/TU Ilmenau, Germany) presented the updates on the SI/TI clarification activities, which are leading to an update of the ITU-T Recommendation P.910. In addition, Lukas Krasula (Netflix, US) presented their investigations on the relation between banding annoyance and the overall quality perceived by the viewers. Hadi Amirpour (University of Klagenfurt, Austria) delivered two presentations related to their Video Complexity Analyzer and their Video Complexity Dataset, which are both publicly available. Finally, Mikołaj Leszczuk (AGH University, Poland) gave two talks on their research related to User-Generated Content (UGC) (a.k.a. in-the-wild video content) recognition and on advanced video quality indicators to characterise video content.

Joint Effort Group (JEG) – Hybrid

The JEG group was focused on joint work to develop hybrid perceptual/bitstream metrics and gradually evolved over time to include several areas of Video Quality Assessment (VQA), such as the creation of a large dataset for training such models using full-reference metrics instead of subjective metrics. A report on the ongoing activities of the group was presented by Enrico Masala (Politecnico di Torino, Italy), which included the release of a new website to reflect the evolution that happened in the last few years within the group. Although currently the group is not directly seeking the development of new metrics or tools readily available for VQA, it is still working on related topics such as the studies by Lohic Fotio Tiotsop (Politecnico di Torino, Italy) on the sensitivity of artificial intelligence-based observers to input signal modification.

5G Key Performance Indicators (5GKPI)

The 5GKPI group studies the relationship between key performance indicators of new 5G networks and QoE of video services on top of them. In this meeting, Pablo Pérez (Nokia, Spain) presented an extended report on the group activities, from which it is worth noting the joint work on a contribution to the ITU-T Work Item G.QoE-5G.

Immersive Media Group (IMG)

The IMG group is focused on research on the quality assessment of immersive media. Currently, the main joint activity of the group is the development of a test plan for evaluating the QoE of immersive interactive communication systems. In this sense, Pablo Pérez (Nokia, Spain) and Jesús Gutiérrez (Universidad Politécnica de Madrid, Spain) presented a follow-up on this test plan, including an overview of the state of the art on related works and a taxonomy classifying the existing systems [3]. This test plan is closely related to the work carried out by the ITU-T on QoE Assessment of eXtended Reality Meetings, so Gunilla Berndtsson (Ericsson, Sweden) presented the latest advances on the development of the P.QXM.

Apart from this, there were four presentations related to the quality assessment of immersive media. Shirin Rafiei (RISE, Sweden) presented a study on QoE assessment of an augmented remote operating system for scaling in smart mining applications. Zhengyu Zhang (INSA Rennes, France) gave a talk on a no-reference quality metric for light field images based on deep-learning and exploiting angular and spatial information. Ali Ak (Nantes Université, France) presented a study on the effect of temporal sub-sampling on the accuracy of the quality assessment of volumetric video. Finally, Waqas Ellahi (Nantes Université, France) showed their research on a machine-learning framework to predict Tone-Mapping Operator (TMO) preference based on image and visual attention features [4].

Quality Assessment for Computer Vision Applications (QACoViA)

The goal of the QACoViA group is to study the visual quality requirements for computer vision methods. In this meeting, there were three presentations related to this topic. Mikołaj Leszczuk (AGH University, Poland) presented an objective video quality assessment method for face recognition tasks. Also, Alban Marie (INSA Rennes, France) showed an analysis of the correlation of quality metrics with artificial intelligence accuracy. Finally, Lucie Lévêque (Nantes Université, France) gave an overview of a study on the reliability of existing algorithms for facial expression recognition [5].

Intersector Rapporteur Group on Audiovisual Quality Assessment (IRG-AVQA)

The IRG-AVQA group studies topics related to video and audiovisual quality assessment (both subjective and objective) among ITU-R Study Group 6 and ITU-T Study Group 12. In this sense, Chulhee Lee (Yonsei University, South Korea) and Alexander Raake (TU Ilmenau, Germany) provided an overview on ongoing activities related to quality assessment within ITU-R and ITU-T.

Other updates

In addition, the Human Factors for Visual Experiences (HFVE) group, whose objective is to uphold the liaison relation between VQEG and the IEEE standardization group P3333.1, presented their advances in relation to two standards: IEEE P3333.1.3 – Deep-Learning-based assessment of VE based on HF, which has been approved and published, and IEEE P3333.1.4 on Light field imaging, which has been submitted and is in the process of being approved. Also, although there were not many activities in this meeting within the Implementer’s Guide for Video Quality Metrics (IGVQM) and the Psycho-Physiological Quality Assessment (PsyPhyQA) projects, they are still active. Finally, as a reminder, the VQEG GitHub with tools and subjective labs setup is still online and kept updated.

The next VQEG plenary meeting will take place online in December 2022. Please see the VQEG Meeting information page for more information.

References

[1] ITU, “Subjective video quality assessment methods for multimedia applications”, ITU-T Recommendation P.910, Jul. 2022.
[2] Y. Sun, G. Mogos, “Impact of Visual Distortion on Medical Images”, IAENG International Journal of Computer Science, 1:49, Mar. 2022.
[3] P. Pérez, E. González-Sosa, J. Gutiérrez, N. García, “Emerging Immersive Communication Systems: Overview, Taxonomy, and Good Practices for QoE Assessment”, Frontiers in Signal Processing, Jul. 2022.
[4] W. Ellahi, T. Vigier, P. Le Callet, “A machine-learning framework to predict TMO preference based on image and visual attention features”, International Workshop on Multimedia Signal Processing, Oct. 2021.
[5] E. M. Barbosa Sampaio, L. Lévêque, P. Le Callet, M. Perreira Da Silva, “Are facial expression recognition algorithms reliable in the context of interactive media? A new metric to analyse their performance”, ACM International Conference on Interactive Media Experiences, Jun. 2022.

VQEG Column: VQEG Meeting Dec. 2021 (virtual/online)

Introduction

Welcome to a new column on the ACM SIGMM Records from the Video Quality Experts Group (VQEG).
The last VQEG plenary meeting took place from 13 to 17 December 2021, and it was organized online by the University of Surrey, UK. Over five days, more than 100 participants (from more than 20 different countries in America, Asia, Africa, and Europe) could remotely attend the multiple sessions related to the active VQEG projects, which included more than 35 presentations and interesting discussions. This column provides an overview of this VQEG plenary meeting, while all the information, minutes and files (including the presented slides) from the meeting are available online on the VQEG meeting website.

Group picture of the VQEG Meeting 13-17 December 2021

Many of the works presented in this meeting can be relevant for the SIGMM community working on quality assessment. Particularly interesting can be the new analyses and methodologies discussed within the Statistical Analyses Methods group, the new metrics and datasets presented within the No-Reference Metrics group, and the progress on the plans of the 5G Key Performance Indicators group and the Immersive Media group. We encourage those readers interested in any of the activities going on in the working groups to check their websites and subscribe to the corresponding reflectors, to follow them and get involved.

Overview of VQEG Projects

Audiovisual HD (AVHD)

The AVHD group investigates improved subjective and objective methods for analyzing commonly available video systems. In this sense, it has recently completed a joint project between VQEG and ITU SG12 in which 35 candidate objective quality models were submitted and evaluated through extensive validation tests. The result was the ITU-T Recommendation P.1204, which includes three standardized models: a bit-stream model, a reduced reference model, and a hybrid no-reference model. The group is currently considering extensions of this standard, which originally covered H.264, HEVC, and VP9, to include other encoders, such as AV1. Apart from this, two other projects are active under the scope of AVHD: QoE Metrics for Live Video Streaming Applications (Live QoE) and Advanced Subjective Methods (AVHD-SUB).

During the meeting, three presentations related to AVHD activities were provided. In the first one, Mikołaj Leszczuk (AGH University) presented their work on secure and reliable delivery of professional live transmissions with low latency, which highlighted the constant need for video datasets, such as the VideoSet. In addition, Andy Quested (ITU-R Working Party 6C) led a discussion on how to assess video quality for very high resolution (e.g., 8K, 16K, 32K, etc.) monitors with interactive applications, which raised the key possibility of zooming in to absorb the details of the images without pixelation. Finally, Abhinau Kumar (UT Austin) and Cosmin Stejerean (Meta) presented their work on exploring the reduction of the complexity of VMAF by using features in the wavelet domain [1].

Quality Assessment for Health applications (QAH)

The QAH group works on the quality assessment of health applications, considering both subjective evaluation and the development of datasets, objective metrics, and task-based approaches. This group was recently launched and, for the moment, its members have been working on a topical review paper on objective quality assessment of medical images and videos, which was submitted in December to Medical Image Analysis [2]. Rafael Rodrigues (Universidade da Beira Interior) and Lucie Lévêque (Nantes Université) presented the main details of this work in a presentation scheduled during the QAH session. The presentation also included information about the review paper published by some members of the group on methodologies for subjective quality assessment of medical images [3] and the efforts in gathering datasets to be listed on the VQEG datasets website. In addition, Lu Zhang (IETR – INSA Rennes) presented her work on model observers for the objective quality assessment of medical images from task-based approaches, considering three tasks: detection, localization, and characterization [4]. It is also worth noting that members of this group are organizing a special session on “Quality Assessment for Medical Imaging” at the IEEE International Conference on Image Processing (ICIP) that will take place in Bordeaux (France) from 16 to 19 October 2022.

Statistical Analysis Methods (SAM)

The SAM group works on improving analysis methods both for the results of subjective experiments and for objective quality models and metrics. Currently, they are working on statistical analysis methods for subjective tests, which are discussed in their monthly meetings.

In this meeting, there were four presentations related to SAM activities. In the first one, Zhi Li and Lukáš Krasula (Netflix) presented the lessons they learned from the subjective assessment test carried out during the development of their metric Contrast Aware Multiscale Banding Index (CAMBI) [5]. In particular, they found that some subjective tests can have perceptually unbalanced stimuli, which can cause systematic and random errors in the results. In this sense, they explained the statistical data analyses they used to mitigate these errors, such as the techniques in ITU-T Recommendation P.913 (section 12.6), which can reduce the effects of the random error. The second presentation described the work by Pablo Pérez (Nokia Bell Labs), Lucjan Janowski (AGH University), Narciso García (Universidad Politécnica de Madrid), and Margaret H. Pinson (NTIA/ITS) on a novel subjective assessment methodology with few observers with repetitions (FOWR) [6]. Apart from the description of the methodology, the dataset generated from the experiments is available on the Consumer Digital Video Library (CDVL). Also, they launched a call for other labs to repeat their experiments, which will help in discovering the viability, scope, and limitations of the FOWR method and, if appropriate, in including this method in ITU-T Recommendation P.913 for quasi-experimental assessments when it is not possible to have 16 to 24 subjects (e.g., pre-tests, expert assessment, and resource limitations). For example, performing the experiment with 4 subjects 4 times each on different days would be similar to a test with 15 subjects. In the third presentation, Irene Viola (CWI) and Lucjan Janowski (AGH University) presented their analyses of the standardized methods for subject removal in subjective tests.
In particular, the methods proposed in the recommendations ITU-R BT.500 and ITU-T P.913 were considered. The analysis concluded that the first one (described in Annex 1 of Part 1) is not recommended for Absolute Category Rating (ACR) tests, while the one described in the second recommendation provides good performance, although further investigation of the correlation threshold used to discard subjects is required. Finally, the last presentation led the discussion on the future activities of the SAM group, where different possibilities were proposed, such as the analysis of confidence intervals for subjective tests, new methods for comparing subjective tests from more than two labs, how to extend these results to better understand the precision of objective metrics, and research on crowdsourcing experiments in order to make them more reliable and improve cost-effectiveness. These new activities are discussed in the monthly meetings of the group.
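
To make the idea of correlation-based subject removal more concrete, the following is a minimal Python sketch, not the exact procedure of ITU-T P.913 or of the presented analysis. It assumes that each subject rated every stimulus once, correlates each subject's scores with the per-stimulus mean over all subjects, and discards subjects below a threshold. The function names, the toy ratings, and the default threshold of 0.75 are illustrative assumptions; as noted above, the appropriate threshold is still under investigation.

```python
from statistics import mean


def pearson(x, y):
    """Pearson linear correlation between two equally long score lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant scores carry no ordering information
    return cov / (var_x * var_y) ** 0.5


def screen_subjects(ratings, threshold=0.75):
    """Keep subjects whose scores correlate with the mean opinion scores.

    ratings: dict mapping subject id -> list of scores, one per stimulus.
    threshold: hypothetical cut-off, chosen here only for illustration.
    """
    n_stimuli = len(next(iter(ratings.values())))
    # Mean opinion score per stimulus, computed over all subjects.
    mos = [mean(scores[i] for scores in ratings.values())
           for i in range(n_stimuli)]
    correlations = {s: pearson(scores, mos) for s, scores in ratings.items()}
    kept = [s for s, r in correlations.items() if r >= threshold]
    return kept, correlations


# Toy ACR ratings (1 to 5) for five stimuli; subject s4 rates in reverse.
ratings = {
    "s1": [5, 4, 3, 2, 1],
    "s2": [5, 5, 3, 2, 2],
    "s3": [4, 4, 3, 1, 1],
    "s4": [1, 2, 3, 4, 5],
}
kept, correlations = screen_subjects(ratings)
```

In this toy example, the subject who rates in the opposite direction to the rest of the panel obtains a negative correlation and is discarded, while the other three are kept. A real analysis would also need to handle missing ratings and decide whether the screened subject is included in the reference mean.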

Computer Generated Imagery (CGI)

The CGI group focuses on quality analysis of computer-generated imagery, particularly in gaming. Currently, the group is working on topics related to ITU work items, such as ITU-T Recommendation P.809 with the development of a questionnaire for interactive cloud gaming quality assessment, ITU-T Recommendation P.CROWDG related to quality assessment of gaming through crowdsourcing, ITU-T Recommendation P.BBQCG with a bit-stream based quality assessment of cloud gaming services, and a codec comparison for computer-generated content. In addition, a presentation was delivered during the meeting by Nabajeet Barman (Kingston University/Brightcove), who presented the subjective results related to the work presented at the last VQEG meeting on the use of LCEVC for Gaming Video Streaming Applications [7]. For more information on the related activities, do not hesitate to contact the chairs of the group.

No Reference Metrics (NORM)

The NORM group is an open collaborative project for developing no-reference metrics for monitoring visual service quality. Currently, two main topics are being addressed by the group, which are discussed in regular online meetings. The first one is related to the improvement of SI/TI metrics to solve ambiguities that have appeared over time, with the objective of providing reference software and updating the ITU-T Recommendation P.910. The second item is related to the addition of standard metadata of video quality assessment-related information in the encoded video streams. 
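
As an illustration of the SI/TI metrics mentioned above, the following Python sketch follows the usual ITU-T P.910 definitions: SI is the maximum over time of the spatial standard deviation of the Sobel-filtered luminance frame, and TI is the maximum over time of the spatial standard deviation of the difference between consecutive frames. It is not the reference software under development by the group. The function names and toy frames are hypothetical, and several choices made here (skipping border pixels, using the population standard deviation, applying no bit-depth or range normalization) are assumptions of exactly the kind that the clarification activity aims to settle.

```python
from math import hypot
from statistics import pstdev


def sobel_magnitude(frame):
    """Sobel gradient magnitude for the interior pixels of a luminance frame.

    frame: list of rows, each a list of luminance values.
    Border pixels are skipped (an assumption of this sketch).
    """
    h, w = len(frame), len(frame[0])
    out = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            out.append(hypot(gx, gy))
    return out


def spatial_information(frames):
    """SI: maximum over frames of the std. dev. of the Sobel magnitude."""
    return max(pstdev(sobel_magnitude(f)) for f in frames)


def temporal_information(frames):
    """TI: maximum over time of the std. dev. of the frame difference.

    Requires at least two frames.
    """
    values = []
    for prev, cur in zip(frames, frames[1:]):
        diff = [c - p
                for row_p, row_c in zip(prev, cur)
                for p, c in zip(row_p, row_c)]
        values.append(pstdev(diff))
    return max(values)


# Toy 5x5 frames: a flat grey frame followed by a vertical step edge.
flat = [[10] * 5 for _ in range(5)]
edge = [[0, 0, 0, 100, 100] for _ in range(5)]
si = spatial_information([flat, edge])
ti = temporal_information([flat, edge])
```

A sequence of identical flat frames yields SI and TI equal to zero, whereas the step edge and the scene change in the toy sequence make both values positive.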

In this meeting, this group was one of the most active in terms of presentations on related topics, with 11 presentations. Firstly, Lukáš Krasula (Netflix) presented their Contrast Aware Multiscale Banding Index (CAMBI) [5], an objective quality metric that addresses banding degradations that are not detected by other metrics, such as VMAF and PSNR (code is available on GitHub). Mikołaj Leszczuk (AGH University) presented their work on the automatic detection of User-Generated Content (UGC) in the wild. Also, Vignesh Menon & Hadi Amirpour (AAU Klagenfurt) presented their open-source project related to the analysis and online prediction of video complexity for streaming applications. Jing Li (Alibaba) presented their work related to the perceptual quality assessment of internet videos [8], proposing a new objective metric (STDAM, for the moment, used internally) validated in the Youku-V1K dataset. The next presentation was delivered by Margaret Pinson (NTIA/ITS), dealing with a comprehensive analysis of why no-reference metrics fail, which emphasized the need to train these metrics on several datasets and test them on larger ones. The discussion also pointed out the recommendation for researchers to publish their metrics in open source in order to make it easier to validate and improve them. Moreover, Balu Adsumilli and Yilin Wang (YouTube) presented a new no-reference metric for UGC, called YouVQ, based on a transfer-learning approach with a pre-train on non-UGC data and a re-train on UGC. This metric will be released in open-source shortly, and a dataset with videos and subjective scores has also been published.
Also, Margaret Pinson (NTIA/ITS), Mikołaj Leszczuk (AGH University), Lukáš Krasula (Netflix), Nabajeet Barman (Kingston University/Brightcove), Maria Martini (Kingston University), and Jing Li (Alibaba) presented a collection of datasets for no-reference metric research, while Shahid Satti (Opticom GmbH) presented their work on encoding complexity for short video sequences. For his part, Franz Götz-Hahn (Universität Konstanz/Universität Kassel) presented their work on the creation of the KonVid-150k video quality assessment dataset [9], which can be very valuable for training no-reference metrics, and the development of objective video quality metrics. Finally, regarding the aforementioned two active topics within the NORM group, Ioannis Katsavounidis (Meta) provided a presentation on the advances in the activity related to the inclusion of standard video quality metadata, while Lukáš Krasula (Netflix), Cosmin Stejerean (Meta), and Werner Robitza (AVEQ/TU Ilmenau) presented the updates on the improvement of SI/TI metrics for modern video systems.

Joint Effort Group (JEG) – Hybrid

The JEG group was focused on joint work to develop hybrid perceptual/bitstream metrics and on the creation of a large dataset for training such models using full-reference metrics instead of subjective metrics. In this sense, a project in collaboration with Sky was finished and presented in the last VQEG meeting.

Related activities were presented in this meeting. In particular, Enrico Masala and Lohic Fotio Tiotsop (Politecnico di Torino) presented the updates on the recent activities carried out by the group, and their work on artificial-intelligence observers for video quality evaluation [10].

Implementer’s Guide for Video Quality Metrics (IGVQM)

The IGVQM group, whose activity started in the VQEG meeting in December 2020, works on creating an implementer’s guide for video quality metrics. In this sense, the current goal is to create a report on the accuracy of video quality metrics following a test plan based on collecting datasets, collecting metrics and methods for assessment, and carrying out statistical analyses. An update on the advances was provided by Ioannis Katsavounidis (Meta) and a call for the community is open to contribute to this activity with datasets and metrics.

5G Key Performance Indicators (5GKPI)

The 5GKPI group studies the relationship between key performance indicators of new communications networks (especially 5G) and the QoE of video services on top of them. Currently, the group is working on the definition of relevant use cases, which are discussed in monthly audio calls.

In relation to these activities, there were four presentations during this meeting. Werner Robitza (AVEQ/TU Ilmenau) presented a proposal for a KPI message format for gaming QoE over 5G networks. Also, Pablo Pérez (Nokia Bell Labs) presented their work on a parametric quality model for teleoperated driving [11] and an update of the ITU-T GSTR-5GQoE topic, related to the QoE requirements for real-time multimedia services over 5G networks. Finally, Margaret Pinson (NTIA/ITS) presented an overall description of 5G technology, including how differences in spectrum allocation per country impact the propagation, responsiveness, and throughput of 5G devices.

Immersive Media Group (IMG)

The IMG group researches the quality assessment of immersive media. The group recently finished the test plan for quality assessment of short 360-degree video sequences, which resulted in the support for the development of the ITU-T Recommendation P.919. Currently, the group is working on further analyses of the data gathered from the subjective tests carried out for that test plan and on the analysis of data for the quality assessment of long 360-degree videos. In addition, members of the group are contributing to the ITU-T SG12 on the topic G.CMVTQS on computational models for QoE/QoS monitoring to assess video telephony services. Finally, the group is also working on the preparation of a test plan for evaluating the QoE with immersive and interactive communication systems, which was presented by Pablo Pérez (Nokia Bell Labs) and Jesús Gutiérrez (Universidad Politécnica de Madrid). Readers interested in this topic should not hesitate to contact them to join the effort.

During the meeting, there were also four presentations covering topics related to the IMG activities. Firstly, Alexander Raake (TU Ilmenau) provided an overview of the projects within the AVT group dealing with the QoE assessment of immersive media. Also, Ashutosh Singla (TU Ilmenau) presented a 360-degree video database with higher-order ambisonics spatial audio. Maria Martini (Kingston University) presented an update on the IEEE standardization activities on Human Factors for Visual Experiences (HFVE), such as the recently submitted draft standard on deep-learning-based quality assessment and the draft standard to be submitted shortly on quality assessment of light field content. Finally, Kjell Brunnström (RISE) presented their work on legibility in virtual reality, also addressing the perception of speech-to-text by Deaf and hard of hearing people.

Intersector Rapporteur Group on Audiovisual Quality Assessment (IRG-AVQA) and Q19 Interim Meeting

Although in this case there was no official IRG-AVQA meeting, there were various presentations related to ITU activities addressing QoE evaluation topics. In this sense, Chulhee Lee (Yonsei University) presented an overview of ITU-R activities, with a special focus on quality assessment of HDR content, and together with Alexander Raake (TU Ilmenau) presented an update on ongoing ITU-T activities.

Other updates

All the sessions of this meeting and, thus, the presentations, were recorded and have been uploaded to YouTube. It is also worth noting that the anonymous FTP will be closed soon; until then, files and presentations can be accessed from old browsers or via an FTP app. All the files, including those corresponding to the VQEG meetings, will be embedded into the VQEG website over the next months. In addition, the GitHub with tools and subjective labs setup is still online and kept updated. Moreover, during this meeting, it was decided to close the Joint Effort Group (JEG) and the Independent Lab Group (ILG), which can be re-established when needed. Finally, although there were not many activities in this meeting within the Quality Assessment for Computer Vision Applications (QACoViA) and the Psycho-Physiological Quality Assessment (PsyPhyQA) groups, they are still active.

The next VQEG plenary meeting will take place in Rennes (France) from 9 to 13 May 2022, and it will again be face-to-face after four online meetings.

References

[1] A. K. Venkataramanan, C. Stejerean, A. C. Bovik, “FUNQUE: Fusion of Unified Quality Evaluators”, arXiv:2202.11241, submitted to the IEEE International Conference on Image Processing (ICIP), 2022.
[2] R. Rodrigues, L. Lévêque, J. Gutiérrez, H. Jebbari, M. Outtas, L. Zhang, A. Chetouani, S. Al-Juboori, M. G. Martini, A. M. G. Pinheiro, “Objective Quality Assessment of Medical Images and Videos: Review and Challenges”, submitted to Medical Image Analysis, 2022.
[3] L. Lévêque, M. Outtas, L. Zhang, H. Liu, “Comparative study of the methodologies used for subjective medical image quality assessment”, Physics in Medicine & Biology, vol. 66, no. 15, Jul. 2021.
[4] L. Zhang, C. Cavaro-Ménard, P. Le Callet, “An overview of model observers”, Innovation and Research in Biomedical Engineering, vol. 35, no. 4, pp. 214-224, Sep. 2014.
[5] P. Tandon, M. Afonso, J. Sole, L. Krasula, “CAMBI: Contrast-aware Multiscale Banding Index”, Picture Coding Symposium (PCS), Jul. 2021.
[6] P. Pérez, L. Janowski, N. García, M. Pinson, “Subjective Assessment Experiments That Recruit Few Observers With Repetitions (FOWR)”, IEEE Transactions on Multimedia (Early Access), Jul. 2021.
[7] N. Barman, S. Schmidt, S. Zadtootaghaj, M. G. Martini, “Evaluation of MPEG-5 part 2 (LCEVC) for live gaming video streaming applications”, Proceedings of the Mile-High Video Conference, Mar. 2022.
[8] J. Xu, J. Li, X. Zhou, W. Zhou, B. Wang, Z. Chen, “Perceptual Quality Assessment of Internet Videos”, Proceedings of the ACM International Conference on Multimedia, Oct. 2021.
[9] F. Götz-Hahn, V. Hosu, H. Lin, D. Saupe, “KonVid-150k: A Dataset for No-Reference Video Quality Assessment of Videos in-the-Wild”, IEEE Access, vol. 9, pp. 72139-72160, May 2021.
[10] L. F. Tiotsop, T. Mizdos, M. Barkowsky, P. Pocta, A. Servetti, E. Masala, “Mimicking Individual Media Quality Perception with Neural Network based Artificial Observers”, ACM Transactions on Multimedia Computing, Communications, and Applications, vol. 18, no. 1, Jan. 2022.
[11] P. Pérez, J. Ruiz, I. Benito, R. López, “A parametric quality model to evaluate the performance of tele-operated driving services over 5G networks”, Multimedia Tools and Applications, Jul. 2021.

VQEG Column: VQEG Meeting Jun. 2021 (virtual/online)

Introduction

Welcome to the fifth column on the ACM SIGMM Records from the Video Quality Experts Group (VQEG).
The last VQEG plenary meeting took place online from 7 to 11 June 2021. Like the previous meeting, held in December 2020, it was organized online (this time by Kingston University) with multiple sessions spread over five days, allowing remote participation of people from 22 different countries of America, Asia, and Europe. More than 100 participants registered for the meeting and could attend the 40 presentations and several discussions that took place in all working groups.
This column provides an overview of the recently completed VQEG plenary meeting, while all the information, minutes and files (including the presented slides) from the meeting are available online on the VQEG meeting website.

Group picture of the VQEG Meeting 7-11 June 2021.

Several presentations of state-of-the-art works can be of interest to the SIGMM community, in addition to the contributions to several working items of ITU from various VQEG groups. Also noteworthy are the progress on the new activities launched in the last VQEG plenary meeting (in relation to Live QoE assessment, SI/TI clarification, implementers guide for video quality metrics for coding applications, and the inclusion of video quality metrics as metadata in compressed streams) and the proposal for a new joint work on evaluation of immersive communication systems from a task-based or interactive perspective within the Immersive Media Group.

We encourage those readers interested in any of the activities going on in the working groups to check their websites and subscribe to the corresponding reflectors, to follow them and get involved.

Overview of VQEG Projects

Audiovisual HD (AVHD)

The AVHD group works on improved subjective and objective methods for video-only and audiovisual quality of commonly available systems. Currently, after the project AVHD/P.NATS2 (a joint collaboration between VQEG and ITU SG12) finished in 2020 [1], two projects are ongoing within the AVHD group: QoE Metrics for Live Video Streaming Applications (Live QoE), which was launched in the last plenary meeting, and Advanced Subjective Methods (AVHD-SUB).
The main discussion during the AVHD sessions was related to the Live QoE project, which was led by Shahid Satti (Opticom) and Rohit Puri (Twitch). In addition to the presentation of the project proposal, the main decisions reached until now were presented (e.g., use of videos of 20-30 seconds with resolution 1080p and framerates up to 60fps, use of ACR as subjective test methodology, generation of test conditions, etc.), and open questions were brought up for discussion, especially in relation to how to acquire premium content and network traces.
In addition to this discussion, Steve Göring (TU Ilmenau) presented an open-source platform (AVrate Voyager) for crowdsourcing/online subjective tests [2], and Shahid Satti (Opticom) presented the performance results of the Opticom models on the project AVHD/P.NATS Phase 2. Finally, Ioannis Katsavounidis (Facebook) presented the subjective testing validation of the AV1 performance from the Alliance for Open Media (AOM) to gather feedback on the test plan and possible interested testing labs from VQEG. It is also worth noting that this session was recorded to be used as raw multimedia data for the Live QoE project.

Quality Assessment for Health applications (QAH)

The session related to the QAH group included three presentations apart from the project summary provided by Lucie Lévêque (Polytech Nantes). In particular, Meriem Outtas (INSA Rennes) provided a review on objective quality assessment of medical images and videos. This is one of the topics jointly addressed by the group, which is working on an overview paper in line with the recent review on subjective medical image quality assessment [3]. Moreover, Zohaib Amjad Khan (Université Sorbonne Paris Nord) presented a work on video quality assessment of laparoscopic videos, while Aditja Raj and Maria Martini (Kingston University) presented their work on a multivariate regression-based convolutional neural network model for fundus image quality assessment.

Statistical Analysis Methods (SAM)

The SAM session consisted of three presentations followed by discussions on the topics. One of these was related to the description of subjective experiment consistency by p-value p-p plot [4], which was presented by Jakub Nawała (AGH University of Science and Technology). In addition, Zhi Li (Netflix) and Rafał Figlus (AGH University of Science and Technology) presented the progress on the contribution from SAM to the ITU-T to modify the recommendation P.913 to include the MLE model for subject behavior in subjective experiments [5] and the recently available implementation of this model in Excel. Finally, Pablo Pérez (Nokia Bell Labs) and Lucjan Janowski (AGH University of Science and Technology) presented their work on the possibility of performing subjective experiments with four subjects [6].

Computer Generated Imagery (CGI)

Nabajeet Barman (Kingston University) presented a report on the current activities of the CGI group. The main current working topics are related to gaming quality assessment methodologies and quality prediction, and codec comparison for CG content. This group is closely collaborating with the ITU-T SG12, as reflected by its support on the completion of the 3 work items: ITU-T Rec. G.1032 on influence factors on gaming quality of experience, ITU-T Rec. P.809 on subjective evaluation methods for gaming quality, and ITU-T Rec. G.1072 on opinion model for gaming applications. Furthermore, CGI is contributing to 3 new work items: ITU-T work item P.BBQCG on parametric bitstream-based quality assessment of cloud gaming services, ITU-T work item G.OMMOG on opinion models for mobile online gaming applications, and ITU-T work item P.CROWDG on subjective evaluation of gaming quality with a crowdsourcing approach. 
In addition, four presentations were scheduled during the CGI slots. The first one was delivered by Joel Jung (Tencent Media Lab) and David Lindero (Ericsson), who presented the details of the ITU-T work item P.BBQCG. Another one was related to the evaluation of MPEG-5 Part 2 (LCEVC) for gaming video streaming applications, which was presented by Nabajeet Barman (Kingston University) and Saman Zadtootaghaj (Dolby Laboratories). Also, Nabajeet Barman, together with Maria Martini (Kingston University), presented a dataset, codec comparison and challenges related to user generated HDR gaming video streaming [7]. Finally, JP Tauscher (Technische Universität Braunschweig) presented his work on EEG-based detection of deep fake images.

No Reference Metrics (NORM)

The session for NORM group included a presentation on the impact of Spatial and Temporal Information (SI and TI) on video quality and compressibility [8], delivered by Werner Robitza (AVEQ GmbH), which was followed by a fruitful discussion on the compression complexity and on the activity related to SI/TI clarification launched in the last VQEG plenary meeting. In addition, there was another presentation from Mikołaj Leszczuk (AGH University of Science and Technology) on content type indicators for technologies supporting video sequence summarization. Finally, Ioannis Katsavounidis (Facebook) led a discussion on the inclusion of video quality metrics as metadata in compressed streams, with a report on the progress on this activity that was started in the last meeting. 

Joint Effort Group (JEG) – Hybrid

The JEG-Hybrid group is currently working on the development of a generally applicable no-reference hybrid perceptual/bitstream model. In this sense, Enrico Masala and Lohic Fotio Tiotsop (Politecnico di Torino) presented the progress on designing a neural-network approach to model single observers using existing subjectively-annotated image and video datasets [9] (the design of subjective tests tailored for the training of this approach is envisioned for future work). In addition to this activity, the group is working in collaboration with the Sky Group on the “Hodor Project”, which aims to develop a measure for automatically identifying video sequences for which quality metrics are likely to deliver inaccurate Mean Opinion Score (MOS) estimation.
Apart from these joint activities, Dr. Yendo Hu (Carnation Communications Inc. and Jimei University) delivered a presentation proposing to work on a benchmarking standard to bring quality, bandwidth, and latency into a common measurement domain.

Quality Assessment for Computer Vision Applications (QACoViA)

In addition to a progress report, the QACoViA group scheduled two presentations: one on enhancing artificial intelligence resilience to image coding artifacts through expert training (by Alban Marie from INSA Rennes) and one on providing datasets to train no-reference metrics for computer vision applications (by Carolina Whitaker from NTIA/ITS).

5G Key Performance Indicators (5GKPI)

The 5GKPI session consisted of a presentation by Pablo Pérez (Nokia Bell Labs) of the progress achieved by the group since the last plenary meeting in the following efforts: 1) the contribution to ITU-T Study Group 12 Question 13 through the Technical Report about QoE in 5G video services (GSTR-5GQoE), which addresses QoE requirements and factors for some use cases like Tele-operated Driving (ToD), wireless content production, mixed reality offloading and first responder networks; 2) the contribution to the 5G Automotive Association (5GAA) through a high-level contribution on general QoE requirements for remote driving, considering for the near future the execution of subjective tests for ToD video quality; and 3) the long-term plan of working on a methodology to create simple opinion models to estimate average QoE for a network and use case.

Immersive Media Group (IMG)

Several presentations were delivered during the IMG session that were divided into two blocks: one covering technologies and studies related to the evaluation of immersive communication systems from a task-based or interactive perspective, and another one covering other topics related to the assessment of QoE of immersive media. 
The first set of presentations is related to a new proposal for a joint work within IMG related to the ITU-T work item P.QXM on QoE assessment of eXtended Reality meetings. Thus, Irene Viola (CWI) presented an overview of this work item. In addition, Carlos Cortés (Universidad Politécnica de Madrid) presented his work on evaluating the impact of delay on QoE in immersive interactive environments, Irene Viola (CWI) presented a dataset of point cloud dynamic humans for immersive telecommunications, Pablo César (CWI) presented their pipeline for social virtual reality [10], and Narciso García (Universidad Politécnica de Madrid) presented their real-time free-viewpoint video system (FVVLive) [11]. After these presentations, Jesús Gutiérrez (Universidad Politécnica de Madrid) led the discussion on joint next steps with IMG, which, in addition to identifying parties interested in joining the effort to study the evaluation of immersive communication systems, also covered the further analyses to be done from the subjective tests carried out with short 360-degree videos [12] and the studies carried out to assess quality and other factors (e.g., presence) with long omnidirectional sequences. In this sense, Marta Orduna (Universidad Politécnica de Madrid) presented her subjective study to validate a methodology to assess quality, presence, empathy, attitude, and attention in Social VR [13]. Future progress on these joint activities will be discussed in the group audio-calls.
Within the other block of presentations related to immersive media topics, Maria Martini (Kingston University), Chulhee Lee (Yonsei University), and Patrick Le Callet (Université de Nantes) presented the status of IEEE standardization on QoE for immersive experiences (IEEE P3333.1.4 – Light Field, and IEEE P3333.1.3, deep learning-based quality assessment), Kjell Brunnström (RISE) presented their work on legibility and readability in augmented reality [14], Abdallah El Ali (CWI) presented his work on investigating the relationship between momentary emotion self-reports and head and eye movements in HMD-based 360° videos [15], Elijs Dima (Mid Sweden University) presented his study on quality of experience in augmented telepresence considering the effects of viewing positions and depth-aiding augmentation [16], Silvia Rossi (UCL) presented her work towards behavioural analysis of 6-DoF users when consuming immersive media [17], and Yana Nehme (INSA Lyon) presented a study on exploring crowdsourcing for subjective quality assessment of 3D Graphics.

Intersector Rapporteur Group on Audiovisual Quality Assessment (IRG-AVQA) and Q19 Interim Meeting

During the IRG-AVQA session, an overview of the progress and recent work within ITU-R SG6 and ITU-T SG12 was provided. In particular, Chulhee Lee (Yonsei University), in collaboration with other ITU rapporteurs, presented the progress of ITU-R WP6C on recommendations for HDR content, together with the work items within ITU-T SG12: Question 9 on audio-related work items, Question 13 on gaming and immersive technologies (e.g., augmented/extended reality), among others, Question 14 on recommendations and work items related to the development of video quality models, and Question 19 on work items related to television and multimedia. In addition, the progress of the group “Implementers Guide for Video Quality Metrics (IGVQM)”, launched in the last plenary meeting by Ioannis Katsavounidis (Facebook), was discussed, addressing specific points to push the collection of video quality models and datasets to be used to develop an implementer’s guide for objective video quality metrics for coding applications.

Other updates

The next VQEG plenary meeting will take place online in December 2021.

In addition, VQEG is investigating the possibility of disseminating the videos of all the talks from these plenary meetings via platforms such as YouTube and Facebook.

Finally, given that some modifications are being made to the public FTP of VQEG, if the links to the presentations included in this column are not opened by the browser, the reader can download all the presentations in one compressed file.

References

[1] A. Raake, S. Borer, S. Satti, J. Gustafsson, R.R.R. Rao, S. Medagli, P. List, S. Göring, D. Lindero, W. Robitza, G. Heikkilä, S. Broom, C. Schmidmer, B. Feiten, U. Wüstenhagen, T. Wittmann, M. Obermann, and R. Bitto, “Multi-model standard for bitstream-, pixel-based and hybrid video quality assessment of UHD/4K: ITU-T P.1204”, IEEE Access, vol. 8, pp. 193020-193049, Oct. 2020.
[2] R.R.R. Rao, S. Göring, and A. Raake, “Towards High Resolution Video Quality Assessment in the Crowd”, IEEE Int. Conference on Quality of Multimedia Experience (QoMEX), Jun. 2021.
[3] L. Lévêque, M. Outtas, H. Liu, and L. Zhang, “Comparative study of the methodologies used for subjective medical image quality assessment”, Physics in Medicine & Biology, Jul. 2021 (Accepted).
[4] J. Nawala, L. Janowski, B. Cmiel, and K. Rusek, “Describing Subjective Experiment Consistency by p-Value P–P Plot”, ACM International Conference on Multimedia (ACM MM), Oct. 2020.
[5] Z. Li, C. G. Bampis, L. Krasula, L. Janowski, and I. Katsavounidis, “A Simple Model for Subject Behavior in Subjective Experiments”, arXiv:2004.02067v3, May 2021.
[6] P. Perez, L. Janowski, N. Garcia, M. Pinson, “Subjective Assessment Experiments That Recruit Few Observers With Repetitions (FOWR)”, arXiv:2104.02618, Apr. 2021.
[7] N. Barman, and M. G. Martini, “User Generated HDR Gaming Video Streaming: Dataset, Codec Comparison and Challenges”, IEEE Transactions on Circuits and Systems for Video Technology, May 2021.
[8] W. Robitza, R.R.R. Rao, S. Göring, and A. Raake, “Impact of Spatial and Temporal Information on Video Quality and Compressibility”, IEEE Int. Conference on Quality of Multimedia Experience (QoMEX), Jun. 2021.
[9] L. Fotio Tiotsop, T. Mizdos, M. Uhrina, M. Barkowsky, P. Pocta, and E. Masala, “Modeling and estimating the subjects’ diversity of opinions in video quality assessment: a neural network based approach”, Multimedia Tools and Applications, vol. 80, pp. 3469–3487, Sep. 2020.
[10] J. Jansen, S. Subramanyam, R. Bouqueau, G. Cernigliaro, M. Martos Cabré, F. Pérez, and P. Cesar, “A Pipeline for Multiparty Volumetric Video Conferencing: Transmission of Point Clouds over Low Latency DASH”, ACM Multimedia Systems Conference (MMSys), May 2020.
[11] P. Carballeira, C. Carmona, C. Díaz, D. Berjón, D. Corregidor, J. Cabrera, F. Morán, C. Doblado, S. Arnaldo, M.M. Martín, and N. García, “FVV Live: A real-time free-viewpoint video system with consumer electronics hardware”, IEEE Transactions on Multimedia, May 2021.
[12] J. Gutiérrez, P. Pérez, M. Orduna, A. Singla, C. Cortés, P. Mazumdar, I. Viola, K. Brunnström, F. Battisti, N. Cieplińska, D. Juszka, L. Janowski, M. Leszczuk, A. Adeyemi-Ejeye, Y. Hu, Z. Chen, G. Van Wallendael, P. Lambert, C. Díaz, J. Hedlund, O. Hamsis, S. Fremerey, F. Hofmeyer, A. Raake, P. César, M. Carli, N. García, “Subjective evaluation of visual quality and simulator sickness of short 360° videos: ITU-T Rec. P.919”, IEEE Transactions on Multimedia, Jul. 2021 (Early Access).
[13] M. Orduna, P. Pérez, J. Gutiérrez, and N. García, “Methodology to Assess Quality, Presence, Empathy, Attitude, and Attention in Social VR: International Experiences Use Case”, arXiv:2103.02550, 2021.
[14] J. Falk, S. Eksvärd, B. Schenkman, B. Andrén, and K. Brunnström, “Legibility and readability in Augmented Reality”, IEEE Int. Conference on Quality of Multimedia Experience (QoMEX), Jun. 2021.
[15] T. Xue, A. El Ali, G. Ding, and P. Cesar, “Investigating the Relationship between Momentary Emotion Self-reports and Head and Eye Movements in HMD-based 360° VR Video Watching”, Extended Abstracts of the CHI Conference on Human Factors in Computing Systems, May 2021.
[16] E. Dima, K. Brunnström, M. Sjöström, M. Andersson, J. Edlund, M. Johanson, and T. Qureshi, “Joint effects of depth-aiding augmentations and viewing positions on the quality of experience in augmented telepresence”, Quality and User Experience, vol. 5, Feb. 2020.
[17] S. Rossi, I. Viola, J. Jansen, S. Subramanyam, L. Toni, and P. Cesar, “Influence of Narrative Elements on User Behaviour in Photorealistic Social VR”, International Workshop on Immersive Mixed and Virtual Environment Systems (MMVE), Sep. 28, 2021.

VQEG Column: New topics

Introduction

Welcome to the fourth column on the ACM SIGMM Records from the Video Quality Experts Group (VQEG).
During the last VQEG plenary meeting (14-18 Dec. 2020) various interesting discussions arose regarding new topics not addressed up to then by VQEG groups, which led to launching three new sub-projects and a new project related to: 1) clarifying the computation of spatial and temporal information (SI and TI), 2) including video quality metrics as metadata in compressed bitstreams, 3) Quality of Experience (QoE) metrics for live video streaming applications, and 4) providing guidelines on implementing objective video quality metrics to the video compression community.
The following sections provide more details about these new activities and try to encourage interested readers to follow and get involved in any of them by subscribing to the corresponding reflectors.

SI and TI Clarification

The VQEG No-Reference Metrics (NORM) group has recently focused on the topic of spatio-temporal complexity, revisiting the Spatial Information and Temporal Information (SI/TI) indicators, which are described in ITU-T Rec. P.910 [1]. They were originally developed for the T1A1 dataset in 1994 [2]. The metrics have found good use over the last 25 years – mostly employed for checking the complexity of video sources in datasets. However, SI/TI definitions contain ambiguities, so the goal of this sub-project is to provide revised definitions eliminating implementation inconsistencies.

Three main topics are discussed by VQEG in a series of online meetings:

  • Comparison of existing publicly available implementations for SI/TI: a comparison was made between several public open-source implementations for SI/TI, based on initial feedback from members of Facebook. Bugs and inconsistencies were identified with the handling of video frame borders, treatment of limited vs. full range content, as well as the reporting of TI values for the first frame. Also, the lack of standardized test vectors was brought up as an issue. As a consequence, a new reference library was developed in Python by members of TU Ilmenau, incorporating all bug fixes that were previously identified, and introducing a new test suite, to which the public is invited to contribute material. VQEG is now actively looking for specific test sequences that will be useful for both validating existing SI/TI implementations, but also extending the scope of the metrics, which is related to the next issue described below.
  • Study on how to apply SI/TI on different content formats: the description of SI/TI was found to be not suitable for extended applications such as video with a higher bit depth (> 8 Bit), HDR content, or spherical/3D video. Also, the question was raised on how to deal with the presence of scene changes in content. The community concluded that for content with higher bit depth, SI/TI functions should be calculated as specified, but that the output values could be mapped back to the original 8-Bit range to simplify comparisons. As for HDR, no conclusion was reached, given the inherent complexity of the subject. It was also preliminarily concluded that the treatment of scene changes should not be part of an SI/TI recommendation, instead focusing on calculating SI/TI for short sequences without scene changes, since the way scene changes would be dealt with may depend on the final application of the metrics.
  • Discussion on other relevant uses of SI/TI: it has been widely used for checking video datasets in terms of diversity and classifying content. Also, SI/TI have been used in some no-reference metrics as content features. The question was raised whether SI/TI could be used for predicting how well content could be encoded. The group noted that different encoders would deal with sources differently, e.g. related to noise in the video. It was stated that it would be nice to be able to find a metric that was purely related to content and not affected by encoding or representation.

As a first step, this revision of the topic of SI/TI has resulted in a harmonized implementation and in the identification of future application areas. Discussions on these topics will continue in the next months through audio-calls that are open to interested readers.
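To make the points above concrete, the following is a minimal sketch of a per-frame SI/TI computation in plain Python. It is not the reference library mentioned above. Several choices are assumptions made for illustration: frames are full-range luma arrays given as lists of rows, border pixels are skipped in the Sobel filter, the population standard deviation is used, no TI value is reported for the first frame, and higher bit depths are mapped back to the 8-bit range by linearly scaling the output. The function names `sobel_magnitude` and `si_ti` are hypothetical.

```python
import math


def _std(values):
    """Population standard deviation of a flat list of numbers."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))


def sobel_magnitude(frame):
    """Sobel gradient magnitude at every interior pixel of a 2-D luma frame.

    Border pixels are skipped (an assumption; border handling is one of the
    ambiguities discussed in the text).
    """
    height, width = len(frame), len(frame[0])
    magnitudes = []
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = (frame[y - 1][x + 1] + 2 * frame[y][x + 1] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y][x - 1] - frame[y + 1][x - 1])
            gy = (frame[y + 1][x - 1] + 2 * frame[y + 1][x] + frame[y + 1][x + 1]
                  - frame[y - 1][x - 1] - 2 * frame[y - 1][x] - frame[y - 1][x + 1])
            magnitudes.append(math.hypot(gx, gy))
    return magnitudes


def si_ti(frames, bit_depth=8):
    """Return (si_per_frame, ti_per_frame) for a list of luma frames.

    SI is the spatial standard deviation of the Sobel-filtered frame; TI is
    the spatial standard deviation of the difference to the previous frame.
    The TI list has one entry fewer than the SI list because the first frame
    has no predecessor. Output values are scaled to the 8-bit range
    (assumed linear mapping).
    """
    scale = 255.0 / (2 ** bit_depth - 1)
    si = [_std(sobel_magnitude(frame)) * scale for frame in frames]
    ti = []
    for prev, cur in zip(frames, frames[1:]):
        diff = [c - p for row_p, row_c in zip(prev, cur)
                for p, c in zip(row_p, row_c)]
        ti.append(_std(diff) * scale)
    return si, ti
```

A sequence-level value in the style of ITU-T Rec. P.910 can then be taken as the maximum over time, for example `max(si)` and `max(ti)`. In line with the preliminary conclusion above, this would be applied to short sequences without scene changes.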

Video Quality Metadata Standard

Also within the NORM group, another topic was launched related to the inclusion of video quality metadata in compressed streams [3].

Almost all modern transcoding pipelines use full-reference video quality metrics to decide on the most appropriate encoding settings. The computation of these quality metrics is demanding in terms of time and computational resources. In addition, estimation errors propagate and accumulate when quality metrics are recomputed several times along the transcoding pipeline. Thus, retaining the results of these metrics with the video can alleviate these constraints, requiring very little space and providing a “greener” way of estimating video quality. With this goal, the new sub-project has started working towards the definition of a standard format to include video quality metrics metadata both at video bitstream level and system layer [4].

In this sense, the experts involved in the new sub-project are working on the following items:

  • Identification of existing proposals and working groups within other standardisation bodies and organisations that address similar topics, and proposal of amendments including new requirements. For example, MPEG has already worked on adding video quality metrics (e.g., PSNR, SSIM, MS-SSIM, VQM, PEVQ, MOS, FSIG) as metadata at system level (e.g., in MPEG2 streams [5], HTTP [6], etc. [7]).
  • Identification of quality metrics to be considered in the standard. In principle, validated and standardized metrics are of interest, although other metrics can also be considered after a validation process on a standard set of subjective data (e.g., using existing datasets). Metrics newer than those used in previous approaches (e.g., VMAF [8], FB-MOS [9]) are of special interest.
  • Consideration of the computation of multiple generations of full-reference metrics at different steps of the transcoding chain, of the use of metrics at different resolutions, different spatio-temporal aggregation methods, etc.
  • Definition of a standard video quality metadata payload, including relevant fields such as metric name (e.g., “SSIM”), version (e.g., “v0.6.1”), raw score (e.g., “0.9256”), mapped-to-MOS score (e.g., “3.89”), scaling method (e.g., “Lanczos-5”), temporal reference (e.g., “0-3” frames), aggregation method (e.g., “arithmetic mean”), etc. [4].
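As an illustration of the last item, the sketch below models such a payload as a small Python record that can be serialized to JSON. This is not the format being standardized: the class name `QualityMetadata`, the field names, and the choice of JSON are assumptions made for this example, and only the example values are taken from the list above.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional, Tuple


@dataclass
class QualityMetadata:
    """Hypothetical quality metadata record for one video segment."""
    metric_name: str                    # e.g., "SSIM"
    version: str                        # e.g., "v0.6.1"
    raw_score: float                    # score on the metric's native scale
    mos_score: Optional[float]          # mapped-to-MOS score, if available
    scaling_method: Optional[str]       # e.g., "Lanczos-5"
    temporal_reference: Tuple[int, int]  # (first_frame, last_frame)
    aggregation_method: str             # e.g., "arithmetic mean"

    def to_json(self) -> str:
        """Serialize the record to a JSON string with stable key order."""
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text: str) -> "QualityMetadata":
        """Rebuild a record from the JSON produced by to_json()."""
        data = json.loads(text)
        # JSON has no tuple type, so the frame range comes back as a list.
        data["temporal_reference"] = tuple(data["temporal_reference"])
        return cls(**data)


entry = QualityMetadata(
    metric_name="SSIM",
    version="v0.6.1",
    raw_score=0.9256,
    mos_score=3.89,
    scaling_method="Lanczos-5",
    temporal_reference=(0, 3),
    aggregation_method="arithmetic mean",
)
restored = QualityMetadata.from_json(entry.to_json())
```

Carrying the metric version and the aggregation method next to the score is what would let a later stage of the transcoding chain decide whether a stored value can be reused or has to be recomputed.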

More details and information on how to join this activity can be found on the NORM webpage.

QoE metrics for live video streaming applications

The VQEG Audiovisual HD Quality (AVHD) group launched a new sub-project on QoE metrics for live media streaming applications (Live QoE) at the last VQEG meeting [10].

The success of a live multimedia streaming session is defined by the experience of a participating audience. Both the content communicated by the media and the quality at which it is delivered matter – for the same content, the quality delivered to the viewer is a differentiating factor. Live media streaming systems undertake a lot of investment and operate under very tight service availability and latency constraints to support multimedia sessions for their audience. Both to measure the return on investment and to make sound investment decisions, it is paramount that we be able to measure the media quality offered by these systems. In this sense, given the large scale and complexity of media streaming systems, objective metrics are needed to measure QoE.

Therefore, the following topics have been identified and are being studied [11]:

  • Creation of a high-quality dataset, including media clips and subjective scores, which will be used to tune, train and develop objective QoE metrics. This dataset should represent the conditions that take place in typical live media streaming situations, therefore conditions and impairments comprising audio and video tracks (independently and jointly) will be considered. In addition, this dataset should cover a diverse set of content categories, including premium content (e.g., sports, movies, concerts, etc.) and user generated content (e.g., music, gaming, real life content, etc.).
  • Development of QoE objective metrics, especially focusing on no-reference or near-no-reference metrics, given the lack of access to the original video at various points in the live media streaming chain. Different types of models will be considered including signal-based (operate on the decoded signal), metadata-based (operate on available metadata, e.g. codecs, resolution, framerate, bitrate, etc.), bitstream-based (operate on the parsed bitstream), and hybrid models (combining signal and metadata) [12]. Also, machine-learning based models will be explored.

Certain challenges are expected when dealing with these two topics, such as separating “content” from “quality” (taking into account that content plays a big role in engagement and acceptability), spectrum expectations, the role of network impairments, and the collection of enough data to develop robust models [11]. Readers interested in joining this effort are encouraged to visit the AVHD webpage for more details.

Implementer’s Guide to Video Quality Metrics

In the last meeting, a new dedicated group on Implementer’s Guide to Video Quality Metrics (IGVQM) was set up to introduce objective video quality metrics to the video compression community and to provide guidelines on implementing them.

During the development of new video coding standards, peak-signal-to-noise-ratio (PSNR) has been traditionally used as the main objective metric to determine which new coding tools are adopted. It has furthermore been used to establish the bitrate savings that a new coding standard offers over its predecessor through the so-called “BD-rate” metric [13], which still relies on PSNR for measuring quality.
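For readers less familiar with the metric, PSNR is derived from the mean squared error (MSE) between reference and distorted samples as PSNR = 10·log10(MAX² / MSE), where MAX is the peak sample value. The sketch below is a minimal illustration on flat lists of samples, assuming 8-bit content by default. The function name `psnr` and its signature are chosen for this example and do not come from any codec reference software.

```python
import math


def psnr(reference, distorted, max_value=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists.

    max_value is the peak sample value (255 for 8-bit content, an assumed
    default). Identical inputs yield math.inf because the MSE is zero.
    """
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((r - d) ** 2 for r, d in zip(reference, distorted)) / len(reference)
    if mse == 0:
        return math.inf
    return 10 * math.log10(max_value ** 2 / mse)
```

Because the computation is a purely pixel-wise error measure, it does not model how visible an error is to a viewer.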

Although this choice was fully justified for the first image/video coding standards – JPEG (1992), MPEG1 (1994), MPEG2 (1996), JPEG2000 and even H.264/AVC (2004) – since there was simply no other alternative at that time, its continuing use for the development of H.265/HEVC (2013), VP9 (2013), AV1 (2018) and most recently EVC and VVC (2020) is questionable, given the rapid and continuous evolution of more perceptual image/video objective quality metrics, such as SSIM (2004) [14], MS-SSIM (2004) [15], and VMAF (2015) [8].

This project attempts to offer some guidance to the video coding community, including standards setting organisations, on how to better utilise existing objective video quality metrics to better capture the improvements offered by video coding tools. For this, the following goals have been envisioned:

  • Address video compression and scaling impairments only.
  • Explore and use “state-of-the-art” full-reference (pixel) objective metrics, examine applicability of no-reference objective metrics, and obtain reference implementations of them.
  • Offer temporal aggregation methods of image quality metrics into video quality metrics.
  • Present statistical analysis of existing subjective datasets, constraining them to compression and scaling artifacts.
  • Highlight differences among objective metrics and use-cases. For example, in case of very small differences, which metric is more sensitive? Which quality range is better served by what metric?
  • Offer standard logistic mappings of objective metrics to a normalised linear scale.
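Two of the goals above, temporal aggregation and logistic mapping, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the arithmetic and harmonic means are two common pooling choices rather than the methods selected by the group, and the `midpoint` and `slope` parameters of `logistic_map` are hypothetical values that would normally be fitted on subjective data.

```python
import math


def arithmetic_mean(frame_scores):
    """Pool per-frame scores by their arithmetic mean."""
    return sum(frame_scores) / len(frame_scores)


def harmonic_mean(frame_scores):
    """Pool per-frame scores by their harmonic mean.

    Requires strictly positive scores; low-quality frames weigh more
    heavily than in the arithmetic mean.
    """
    return len(frame_scores) / sum(1.0 / s for s in frame_scores)


def logistic_map(score, midpoint, slope, low=0.0, high=100.0):
    """Map a raw metric score onto a [low, high] scale with a logistic curve.

    midpoint is the raw score mapped to the centre of the scale and slope
    controls the steepness; both are assumed, illustrative parameters.
    """
    return low + (high - low) / (1.0 + math.exp(-slope * (score - midpoint)))


# Per-frame scores of a clip with one badly degraded frame.
scores = [0.9, 0.9, 0.3]
pooled_arithmetic = arithmetic_mean(scores)
pooled_harmonic = harmonic_mean(scores)
mapped = logistic_map(pooled_arithmetic, midpoint=0.5, slope=10.0)
```

In this example the harmonic mean is lower than the arithmetic mean, which shows how the choice of pooling method changes the sequence-level score when quality fluctuates over time.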

More details can be found in the working document that has been set up to launch the project [16] and on the VQEG website.

References

[1] ITU-T Rec. P.910. Subjective video quality assessment methods for multimedia applications, 2008.
[2] M. H. Pinson and A. Webster, “T1A1 Validation Test Database,” VQEG eLetter, vol. 1, no. 2, 2015.
[3] I. Katsavounidis, “Video quality metadata in compressed bitstreams”, Presentation in VQEG Meeting, Dec. 2020.
[4] I. Katsavounidis et al., “A case for embedding video quality metrics as metadata in compressed bitstreams”, working document, 2019.
[5] ISO/IEC 13818-1:2015/AMD 6:2016 Carriage of Quality Metadata in MPEG2 Streams.
[6] ISO/IEC 23009 Dynamic Adaptive Streaming over HTTP (DASH).
[7] ISO/IEC 23001-10, MPEG Systems Technologies – Part 10: Carriage of timed metadata metrics of media in ISO base media file format.
[8] Toward a practical perceptual video quality metric, Tech blog with VMAF’s open sourcing on Github, Jun. 6, 2016.
[9] S.L. Regunathan, H. Wang, Y. Zhang, Y. R. Liu, D. Wolstencroft, S. Reddy, C. Stejerean, S. Gandhi, M. Chen, P. Sethi, A. Puntambekar, M. Coward, I. Katsavounidis, “Efficient measurement of quality at scale in Facebook video ecosystem”, in Applications of Digital Image Processing XLIII, vol. 11510, p. 115100J, Aug. 2020.
[10] R. Puri, “On a QoE metric for live media streaming applications”, Presentation in VQEG Meeting, Dec. 2020.
[11] R. Puri and S. Satti, “On a QoE metric for live media streaming applications”, working document, Jan. 2021.
[12] A. Raake, S. Borer, S. Satti, J. Gustafsson, R.R.R. Rao, S. Medagli, P. List, S. Göring, D. Lindero, W. Robitza, G. Heikkilä, S. Broom, C. Schmidmer, B. Feiten, U. Wüstenhagen, T. Wittmann, M. Obermann, R. Bitto, “Multi-model standard for bitstream-, pixel-based and hybrid video quality assessment of UHD/4K: ITU-T P.1204”, IEEE Access, vol. 8, Oct. 2020.
[13] G. Bjøntegaard, “Calculation of Average PSNR Differences Between RD-Curves”, Document VCEG-M33, ITU-T SG 16/Q6, 13th VCEG Meeting, Austin, TX, USA, Apr. 2001.
[14] Z. Wang, A. C. Bovik, H. R. Sheikh and E. P. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” in IEEE Transactions on Image Processing, vol. 13, no. 4, pp. 600-612, April 2004.
[15] Z. Wang, E. P. Simoncelli and A. C. Bovik, “Multiscale structural similarity for image quality assessment,” The Thirty-Seventh Asilomar Conference on Signals, Systems & Computers, 2003, Pacific Grove, CA, USA, 2003.
[16] I. Katsavounidis, “VQEG’s Implementer’s Guide to Video Quality Metrics (IGVQM) project”, working document, 2021.

VQEG Column: VQEG Meeting Dec. 2020 (virtual/online)

Introduction

Welcome to the third column on the ACM SIGMM Records from the Video Quality Experts Group (VQEG).
The last VQEG plenary meeting took place online from 14 to 18 December. Given the current circumstances, it was held entirely online for the second time, with multiple sessions distributed over five to six hours each day, allowing remote participation of people from different time zones. About 130 participants from 24 different countries registered for the meeting and could attend the presentations and discussions that took place in all working groups.
This column provides an overview of this meeting, while all the information, minutes, files (including the presented slides), and video recordings from the meeting are available online in the VQEG meeting website. As highlights of interest for the SIGMM community, apart from several interesting presentations of state-of-the-art works, relevant contributions to ITU recommendations related to multimedia quality assessment were reported from various groups (e.g., on adaptive bitrate streaming services, on subjective quality assessment of 360-degree videos, on statistical analysis of quality assessments, on gaming applications, etc.), the new group on quality assessment for health applications was launched, and an interesting session on 5G use cases took place, as well as a workshop dedicated to user testing during Covid-19. In addition, new efforts have been launched related to the research on quality metrics for live media streaming applications, and to provide guidelines on implementing objective video quality metrics (ahead of PSNR) to the video compression community.
We encourage those readers interested in any of the activities going on in the working groups to check their websites and subscribe to the corresponding reflectors, to follow them and get involved.

Overview of VQEG Projects

Audiovisual HD (AVHD)

The AVHD/P.NATS2 project was a joint collaboration between VQEG and ITU SG12, whose goal was to develop a multitude of objective models, varying in terms of complexity, type of input, and use cases, for the assessment of video quality in adaptive bitrate streaming services over reliable transport up to 4K. The report of this project, which finished in January 2020, was approved in this meeting. In summary, it resulted in 10 model categories with models trained and validated on 26 subjective datasets. This activity resulted in 4 ITU standards (ITU-T Rec. P.1204 [1], P.1204.3 [2], P.1204.4 [3], and P.1204.5 [4]), a dataset created during this effort, and a journal publication reporting details on the validation tests [5]. In this sense, one presentation by Alexander Raake (TU Ilmenau) provided details on the P.NATS Phase 2 project and the resulting ITU recommendations, while details of the processing chain used in the project were presented by Werner Robitza (AVEQ GmbH) and David Lindero (Ericsson).
In addition to this activity, there were various presentations covering topics related to this group. For instance, Cindy Chen, Deepa Palamadai Sundar, and Visala Vaduganathan (Facebook) presented their work on hardware acceleration of video quality metrics. Also from Facebook, Haixiong Wang presented their work on efficient measurement of quality at scale in their video ecosystem [6]. Lucjan Janowski (AGH University) proposed a discussion on more ecologically valid subjective experiments, Alan Bovik (University of Texas at Austin) presented a hitchhiker’s guide to SSIM, and Ali Ak (Université de Nantes) presented a comprehensive analysis of crowdsourcing for subjective evaluation of tone mapping operators. Finally, Rohit Puri (Twitch) opened a discussion on the research on QoE metrics for live media streaming applications, which led to the agreement to start a new sub-project within AVHD group on this topic.

Psycho-Physiological Quality Assessment (PsyPhyQA)

The chairs of the PsyPhyQA group provided an update on the activities carried out. In this sense, a test plan for psychophysiological video quality assessment was established, and the group is currently aiming to develop ideas for running quality assessment tests with psychophysiological measures in times of a pandemic and to collect and discuss ideas about possible joint works. In addition, the project is trying to learn about physiological correlates of simulator sickness, and in this sense, a presentation was delivered by J.P. Tauscher (Technische Universität Braunschweig) on exploring neural and peripheral physiological correlates of simulator sickness. Finally, Waqas Ellahi (Université de Nantes) gave a presentation on visual fidelity of tone mapping operators from gaze data using HMM [7].

Quality Assessment for Health applications (QAH)

This was the first meeting for this new QAH group. The chairs reported on the first audio call, which took place in November to launch the project and to find out how many people are interested in it, what each member has already done on medical images, what each member wants to do in this joint project, etc.
The plenary meeting served to collect ideas about possible joint works and to share experiences on related studies. In this sense, Lucie Lévêque (Université Gustave Eiffel) presented a review on subjective assessment of the perceived quality of medical images and videos, Maria Martini (Kingston University London) talked about the suitability of VMAF for quality assessment of medical videos (ultrasound & wireless capsule endoscopy), and Jorge Caviedes (ASU) delivered a presentation on cognition inspired diagnostic image quality models.

Statistical Analysis Methods (SAM)

The update report from SAM group presented the ongoing progress on new methods for data analysis, including the discussion with ITU-T (P.913 [8]) and ITU-R (BT.500 [9]) about including a new one in the recommendations.
Several interesting presentations related to the ongoing work within SAM were delivered. For instance, Jakub Nawala (AGH University) presented “su-JSON”, a uniform JSON-based subjective data format, as well as his work on describing subjective experiment consistency by p-value P–P plots. An interesting discussion was raised by Lucjan Janowski (AGH University) on how to define the quality of a single sequence, analyzing different perspectives (e.g., crowd, experts, psychology, etc.). Also, Babak Naderi (TU Berlin) presented an analysis of the relation between Mean Opinion Score (MOS) and rank-based statistics. Recent advances on the Netflix quality metric VMAF were presented by Zhi Li (Netflix), especially on the properties of VMAF in the presence of image enhancement. Finally, two more presentations addressed the progress on statistical analyses of quality assessment data, one by Margaret Pinson (NTIA/ITS) on the computation of confidence intervals, and one by Suiyi Ling (Université de Nantes) on a probabilistic model to recover the ground truth and annotators’ behavior.

Computer Generated Imagery (CGI)

The report from the chairs of the CGI group covered the progress on the research on assessment methodologies for quality assessment of gaming services (e.g., ITU-T P.809 [10]), on crowdsourcing quality assessment for gaming application (P.808 [11]), on quality prediction and opinion models for cloud gaming (e.g., ITU-T G.1072 [12]), and on models (signal-, bitstream-, and parametric-based models) for video quality assessment of CGI content (e.g., nofu, NDNetGaming, GamingPara, DEMI, NR-GVQM, etc.).
In terms of planned activities, the group is targeting the generation of new gaming datasets and tools for metrics to assess gaming QoE, and is also aiming at identifying other topics of interest in CGI other than gaming content.
In addition, there was a presentation on updates on gaming standardization activities and deep learning models for gaming quality prediction by Saman Zadtootaghaj (TU Berlin), another one on subjective assessment of multi-dimensional aesthetic assessment for mobile game images by Suiyi Ling (Université de Nantes), and one addressing quality assessment of gaming videos compressed via AV1 by Maria Martini (Kingston University London), leading to interesting discussions on those topics.

No Reference Metrics (NORM)

The session for NORM group included a presentation on the differences among existing implementations of spatial and temporal perceptual information indices (SI and TI as defined in ITU-T P.910 [13]) by Cosmin Stejerean (Facebook), which led to an open discussion and to the agreement on launching an effort to clarify the ambiguous details that have led to different implementations (and different results), to generate test vectors for reference and validation of the implementations and to address the computation of these indicators for HDR content. In addition, Margaret Pinson (NTIA/ITS) presented the paradigm of no-reference metric research analyzing design problems and presenting a framework for collaborative development of no-reference metrics for image and video quality. Finally, Ioannis Katsavounidis (Facebook) delivered a talk on addressing the addition of video quality metadata in compressed bitstreams. Further discussions on these topics are planned in the next month within the group.

Joint Effort Group (JEG) – Hybrid

The JEG-Hybrid group is currently working in collaboration with Sky Group on determining when video quality metrics are likely to inaccurately predict the MOS and on modelling single observers’ quality perception based on artificial intelligence techniques. In this sense, Lohic Fotio (Politecnico di Torino) presented his work on artificial intelligence-based observers for media quality assessment. Also, together with Florence Agboma (Sky UK), he presented their work on comparing commercial and open source video quality metrics for HD constant bitrate videos. Finally, Dariusz Grabowski (AGH University) presented his work on comparing full-reference video quality metrics using cluster analysis.

Quality Assessment for Computer Vision Applications (QACoViA)

The QACoViA group announced Lu Zhang (INSA Rennes) as new third co-chair, who will also work in the near future in a project related to image compression for optimized recognition by distributed neural networks. In addition, Mikołaj Leszczuk (AGH University) presented a report on a recently finished project related to objective video quality assessment method for recognition tasks, in collaboration with Huawei through its Innovation Research Programme.

5G Key Performance Indicators (5GKPI)

The 5GKPI session was oriented to identify possible interested partners and joint works (e.g., contribution to ITU-T SG12 recommendation G.QoE-5G [14], generation of open/reference datasets, etc.). In this sense, it included four presentations of use cases of interest: tele-operated driving by Yungpeng Zang (5G Automotive Association), content production related to the European project 5G-Records by Paola Sunna (EBU), Augmented/Virtual Reality by Bill Krogfoss (Bell Labs Consulting), and QoE for remote controlled use cases by Kjell Brunnström (RISE).

Immersive Media Group (IMG)

A report on the updates within the IMG group was initially presented, especially covering the current joint work investigating the subjective quality assessment of 360-degree video. In particular, a cross-lab test involving 10 different labs was carried out at the beginning of 2020, resulting in relevant outcomes including various contributions to ITU SG12/Q13 and MPEG AhG on Quality of Immersive Media. It is worth noting that the new ITU-T recommendation P.919 [15], related to subjective quality assessment of 360-degree videos (in line with ITU-R BT.500 [8] or ITU-T P.910 [13]), was approved in mid-October, and was supported by the results of these cross-lab tests.
Furthermore, since these tests have already finished, there was a presentation by Pablo Pérez (Nokia Bell-Labs) on possible future joint activities within IMG, which led to an open discussion that will continue in future audio calls.
In addition, a total of four talks covered topics related to immersive media technologies, including an update from the Audiovisual Technology Group of the TU Ilmenau on immersive media topics, and a presentation of a no-reference quality metric for light field content based on a structural representation of the epipolar plane image by Ali Ak and Patrick Le Callet (Université de Nantes) [16]. Also, there were two presentations related to 3D graphical contents, one addressing the perceptual characterization of 3D graphical contents based on visual attention patterns by Mona Abid (Université de Nantes), and another one comparing subjective methods for quality assessment of 3D graphics in virtual reality by Yana Nehmé (INSA Lyon). 

Intersector Rapporteur Group on Audiovisual Quality Assessment (IRG-AVQA) and Q19 Interim Meeting

Chulhee Lee (Yonsei University) chaired the IRG-AVQA session, providing an overview of the progress and recent work within ITU-R WP6C on HDR-related topics and within ITU-T SG12 Questions 9, 13, 14, 19 (e.g., P.NATS Phase 2 and follow-ups, subjective assessment of 360-degree video, QoE factors for AR applications, etc.). In addition, a new work item was announced within ITU-T SG9: End-to-end network characteristics requirements for video services (J.pcnp-char [17]).
From the discussions raised during this session, a new dedicated group was set up to introduce objective video quality metrics beyond PSNR to the video compression community and to provide guidelines on implementing them. The group was named “Implementers Guide for Video Quality Metrics (IGVQM)” and will be chaired by Ioannis Katsavounidis (Facebook), with the involvement of several people from VQEG.
After the IRG-AVQA session, the Q19 interim meeting took place with a report by Chulhee Lee and an update by Zhi Li (Netflix) on improvements to the subjective experiment data analysis process.

Other updates

Apart from the aforementioned groups, the Human Factors for Visual Experience (HFVE) group is still active, coordinating VQEG activities in liaison with the IEEE Standards Association Working Groups on HFVE, especially on perceptual quality assessment of 3D, UHD and HD contents, quality of experience assessment for VR and MR, quality assessment of light-field imaging contents, and deep-learning-based assessment of visual experience based on human factors. In this sense, there are ongoing contributions from VQEG members to IEEE Standards.
In addition, there was a workshop dedicated to user testing during Covid-19, which included a presentation on precautions for lab experiments by Kjell Brunnström (RISE), another presentation by Babak Naderi (TU Berlin) on subjective tests during the pandemic, and a break-out session for discussions on the topic.

Finally, the next VQEG plenary meeting will take place in spring 2021 (exact dates still to be agreed), probably online again.

References

[1] ITU-T Rec. P.1204. Video quality assessment of streaming services over reliable transport for resolutions up to 4K, 2020.
[2] ITU-T Rec. P.1204.3. Video quality assessment of streaming services over reliable transport for resolutions up to 4K with access to full bitstream information, 2020.
[3] ITU-T Rec. P.1204.4. Video quality assessment of streaming services over reliable transport for resolutions up to 4K with access to full and reduced reference pixel information, 2020.
[4] ITU-T Rec. P.1204.5. Video quality assessment of streaming services over reliable transport for resolutions up to 4K with access to transport and received pixel information, 2020.
[5] A. Raake, S. Borer, S. Satti, J. Gustafsson, R.R.R. Rao, S. Medagli, P. List, S. Göring, D. Lindero, W. Robitza, G. Heikkilä, S. Broom, C. Schmidmer, B. Feiten, U. Wüstenhagen, T. Wittmann, M. Obermann, R. Bitto, “Multi-model standard for bitstream-, pixel-based and hybrid video quality assessment of UHD/4K: ITU-T P.1204”, IEEE Access, vol. 8, pp. 193020-193049, Oct. 2020.
[6] S.L. Regunathan, H. Wang, Y. Zhang, Y. R. Liu, D. Wolstencroft, S. Reddy, C. Stejerean, S. Gandhi, M. Chen, P. Sethi, A. Puntambekar, M. Coward, I. Katsavounidis, “Efficient measurement of quality at scale in Facebook video ecosystem”, in Applications of Digital Image Processing XLIII, vol. 11510, p. 115100J, Aug. 2020.
[7] W. Ellahi, T. Vigier and P. Le Callet, “HMM-Based Framework to Measure the Visual Fidelity of Tone Mapping Operators”, IEEE International Conference on Multimedia & Expo Workshops (ICMEW), London, United Kingdom, Jul. 2020.
[8] ITU-R Rec. BT.500-14. Methodology for the subjective assessment of the quality of television pictures, 2019.
[9] ITU-T Rec. P.913. Methods for the subjective assessment of video quality, audio quality and audiovisual quality of Internet video and distribution quality television in any environment, 2016.
[10] ITU-T Rec. P.809. Subjective evaluation methods for gaming quality, 2018.
[11] ITU-T Rec. P.808. Subjective evaluation of speech quality with a crowdsourcing approach, 2018.
[12] ITU-T Rec. G.1072. Opinion model predicting gaming quality of experience for cloud gaming services, 2020.
[13] ITU-T Rec. P.910. Subjective video quality assessment methods for multimedia applications, 2008.
[14] ITU-T Rec. G.QoE-5G. QoE factors for new services in 5G networks, 2020 (under study).
[15] ITU-T Rec. P.919. Subjective test methodologies for 360º video on head-mounted displays, 2020.
[16] A. Ak, S. Ling and P. Le Callet, “No-Reference Quality Evaluation of Light Field Content Based on Structural Representation of The Epipolar Plane Image”, IEEE International Conference on Multimedia & Expo Workshops (ICMEW), London, United Kingdom, Jul. 2020.
[17] ITU-T Rec. J.pcnp-char. E2E Network Characteristics Requirement for Video Services, 2020 (under study).

VQEG Column: Recent contributions to ITU recommendations

Welcome to the second column on the ACM SIGMM Records from the Video Quality Experts Group (VQEG).
VQEG plays a major role in research and the development of standards on video quality and this column presents examples of recent contributions to International Telecommunication Union (ITU) recommendations, as well as ongoing contributions to recommendations to come in the near future. In addition, the formation of a new group within VQEG addressing Quality Assessment for Health Applications (QAH) has been announced.  

VQEG website: www.vqeg.org
Authors: 
Jesús Gutiérrez (jesus.gutierrez@upm.es), Universidad Politécnica de Madrid (Spain)
Kjell Brunnström (kjell.brunnstrom@ri.se), RISE (Sweden) 
Thanks to Lucjan Janowski (AGH University of Science and Technology), Alexander Raake (TU Ilmenau) and Shahid Satti (Opticom) for their help and contributions.

Introduction

VQEG is an international and independent organisation that provides a forum for technical experts in perceptual video quality assessment from industry, academia, and standardization organisations. Although VQEG does not develop or publish standards, several activities (e.g., validation tests, multi-lab test campaigns, development of objective quality models, etc.) carried out by VQEG groups have been instrumental in the development of international recommendations and standards. VQEG contributions have been mainly submitted to relevant ITU Study Groups (e.g., ITU-T SG9, ITU-T SG12, ITU-R WP6C), but also to other standardization bodies, such as MPEG, ITU-R SG6, ATIS, IEEE P.3333 and P.1858, DVB, and ETSI.

In our first column on the ACM SIGMM Records we provided a table summarizing the VQEG studies that have resulted in ITU Recommendations. In this new column, we describe in more detail the latest contributions to recent ITU standards, and we provide insight into the ongoing contributions that may result in ITU recommendations in the near future.

ITU Recommendations with recent inputs from VQEG

ITU-T Rec. P.1204 standard series

A campaign within the ITU-T Study Group (SG) 12 (Question 14) in collaboration with the VQEG AVHD group resulted in the development of three new video quality model standards for the assessment of sequences of up to UHD/4K resolution. This campaign ran for more than two years under the project “AVHD-AS / P.NATS Phase 2”. While “P.NATS Phase 1” (finalized in 2016 and resulting in the standards series ITU-T Rec. P.1203, P.1203.1, P.1203.2 and P.1203.3) addressed the development of improved bitstream-based models for the prediction of the overall quality of long (1-5 minutes) video streaming sessions, the second phase addressed the development of short-term video quality models covering a wider scope with bitstream-based, pixel-based and hybrid models. The P.NATS Phase 2 project was executed as a competition between nine participating institutions in different tracks, resulting in the aforementioned three types of video quality models.

For the competition, a total of 26 databases were created, 13 used for training and 13 for validation and selection of the winning models. In order to establish the ground truth, subjective video quality tests were performed on four different display devices (PC-monitors, 55-75” TVs, mobile, and tablet) with at least 24 subjects each and using the 5-point Absolute Category Rating (ACR) scale. In total, about 5000 test sequences with a duration of around 8 seconds were evaluated, containing a variety of resolutions, encoding configurations, bitrates, and framerates using the codecs H.264/AVC, H.265/HEVC and VP9.   
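As a hedged illustration of how ground truth is typically derived from such ACR ratings, the following minimal sketch computes a Mean Opinion Score (MOS) and an approximate 95% confidence interval for one test sequence. The function name mos_with_ci and the use of the normal quantile 1.96 (instead of a Student's t quantile) are assumptions made for this example; it is not the analysis procedure used in the P.NATS Phase 2 competition.

```python
import math
import statistics


def mos_with_ci(ratings, z=1.96):
    """Return (MOS, CI half-width) for one sequence rated on the 5-point ACR scale.

    `ratings` holds one integer score (1..5) per subject. `z` is the normal
    quantile for a 95% interval (an assumption; a t quantile is slightly wider).
    """
    if len(ratings) < 2:
        raise ValueError("at least two ratings are needed for a confidence interval")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ACR ratings must be integers from 1 to 5")
    mos = statistics.mean(ratings)
    # Standard error of the mean, using the sample standard deviation.
    half_width = z * statistics.stdev(ratings) / math.sqrt(len(ratings))
    return mos, half_width


# Hypothetical ratings from 24 subjects for a single sequence.
example_ratings = [4] * 12 + [5] * 12
example_mos, example_ci = mos_with_ci(example_ratings)
```

With 24 subjects per test, as in the setup described above, the interval half-width shrinks with the square root of the number of ratings.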

More details about the whole workflow and results of the competition can be found in [1]. As a result of this competition, the new standard series ITU-T Rec. P.1204 [2] has been recently published, including a bitstream-based model  (ITU-T Rec. P.1204.3 [3]), a pixel-based model (ITU-T Rec. P.1204.4 [4]) and a hybrid model (ITU-T Rec. P.1204.5 [5]).

ITU-T Rec. P.1401

ITU-T Rec. P.1401 [6] covers statistical analysis, evaluation and reporting guidelines for quality measurements and was revised in January 2020. Based on the article by Brunnström and Barkowsky [7], VQEG pointed out that this Recommendation, although very useful, lacked a section on multiple comparisons and their potential impact on the performance evaluation of objective quality methods. In the latest revision, Section 7.6.5 covers this topic.
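To illustrate why multiple comparisons matter, the sketch below shows how the family-wise error rate grows with the number of independent tests and applies two well-known corrections (Bonferroni and Holm). This is a generic illustration and does not reproduce the text of Section 7.6.5; the function names and example p-values are assumptions.

```python
def familywise_error_rate(alpha, n_comparisons):
    """Probability of at least one false positive over independent tests."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


def bonferroni_reject(p_values, alpha=0.05):
    """Reject H0 only where p <= alpha / m (m = number of comparisons)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_reject(p_values, alpha=0.05):
    """Holm step-down procedure: less conservative than Bonferroni."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        # The k-th smallest p-value is compared against alpha / (m - k).
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # all larger p-values are retained as well
    return reject


# Hypothetical p-values from pairwise comparisons of objective metrics.
example_p = [0.001, 0.015, 0.04, 0.3]
uncorrected = [p <= 0.05 for p in example_p]
bonferroni = bonferroni_reject(example_p)
holm = holm_reject(example_p)
```

In this example three comparisons look significant without correction, while Bonferroni keeps one and Holm keeps two. With ten independent comparisons at a significance level of 0.05, the chance of at least one spurious finding is roughly 40%.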

Ongoing VQEG Inputs to ITU Recommendations

ITU-T Rec. P.919

ITU has been working on a recommendation for subjective test methodologies for 360º video on Head-Mounted Displays (HMDs), under the SG12 Question 13 (Q13). The Immersive Media Group (IMG) of the VQEG has collaborated in this effort by completing Phase 1 of the Test Plan for Quality Assessment of 360-degree Video. In particular, Phase 1 of this test plan addresses the assessment of short sequences (less than 30 seconds), in the spirit of ITU-R BT.500 [8] and ITU-T P.910 [9]. In this sense, the evaluation of audiovisual quality and simulator sickness was considered. On the other hand, Phase 2 of the test plan (envisioned for the near future) covers the assessment of other factors that can be more influential with longer sequences (several minutes), such as immersiveness and presence.

Therefore, within Phase 1 the IMG designed and executed a cross-lab test with the participation of ten international laboratories, from AGH University of Science and Technology (Poland), Centrum Wiskunde & Informatica (The Netherlands), Ghent University (Belgium), Nokia Bell-Labs (Spain), Roma TRE University (Italy), RISE Acreo (Sweden), TU Ilmenau (Germany), Universidad Politécnica de Madrid (Spain), University of Surrey (England), and Wuhan University (China).

This test was aimed at assessing and validating subjective evaluation methodologies for 360º video. Thus, the single-stimulus methodology Absolute Category Rating (ACR) and the double-stimulus Degradation Category Rating (DCR) were considered to evaluate audiovisual quality of 360º videos distorted with uniform and non-uniform degradations.  In particular, different configurations of uniform and tile-based coding were applied to eight video sources with different spatial, temporal and exploration properties. Other influence factors were also studied, such as the influence of the sequence duration (from 10 to 30s) and the test setup (considering different HMDs and methods to collect the observers’ ratings, using audio or not, etc.).  Finally, in addition to the evaluation of audiovisual quality, the assessment of simulator sickness symptoms was addressed studying the use of different questionnaires. As a result of this work, the IMG of VQEG presented two contributions to the recommendation ITU-T Rec. P.919 (ex P.360-VR), which has been consented in the last SG12 meeting (7-11 September 2020) and is envisioned to be published soon. In addition, the results and the annotated dataset coming from the cross-lab test will be published soon.

ITU-T Rec. P.913

Another upcoming contribution is being prepared by the Statistical Analysis Methods (SAM) group. The main goal of the proposal is to increase the precision of the subjective experiment analysis by describing a subjective answer as a random variable. The random variable is described by three key influencing factors: the sequence quality, a subject bias, and a subject precision. It is a further development of the ITU-T P.913 [10] recommendation, where subject bias was introduced. Adding subject precision brings two benefits: better handling of unreliable subjects and an easier estimation procedure.

Current standards describe a way to remove an unreliable subject. The problem is that the methods proposed in BT.500 [8] and P.913 [10] are different and point to different subjects. Also, both methods have some arbitrary parameters (e.g., thresholds) deciding when a subject should be removed. This means that two subjects can be similarly imprecise, yet one falls on the accepting side of the threshold, so all of that subject's answers are accepted as correct, while the other falls on the rejecting side, so all of that subject's answers are removed. The proposed method weights the impact of each subject's answers depending on the subject precision. As a consequence, each subject is to some extent both removed and kept. The balance between how much information we keep and how much we remove depends on the subject precision.

The estimation procedure of the proposed model, as described in the literature, is MLE (Maximum Likelihood Estimation). Such estimation is computationally costly and needs a careful setup to obtain a reliable solution. Therefore, we proposed an Alternating Projection (AP) solver, which is less general than MLE but works as well as MLE for the subject model estimation. This solver is called “alternating projection” because, in a loop, we alternate between projecting (or averaging) the opinion scores along the subject dimension and the stimulus dimension. It increases the precision of the obtained model parameters step by step, giving more weight to the information coming from the more precise subjects. More details can be found in the white paper in [11].
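The alternating idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the reference implementation: it assumes a complete score matrix (every subject rated every stimulus), models subject precision through a per-subject standard deviation, and uses inverse-variance weights. The function name alternating_projection, the parameters max_iter and tol, and the small floor on the standard deviation are choices made for this example; the white paper in [11] remains the authoritative description.

```python
import numpy as np


def alternating_projection(scores, max_iter=100, tol=1e-8):
    """Estimate stimulus quality, subject bias and subject inconsistency.

    `scores` has shape (n_stimuli, n_subjects) with no missing entries
    (an assumption of this sketch).
    """
    u = np.asarray(scores, dtype=float)
    quality = u.mean(axis=1)  # start from the plain MOS
    for _ in range(max_iter):
        # Project along the stimulus dimension: per-subject statistics.
        residual = u - quality[:, None]
        bias = residual.mean(axis=0)
        inconsistency = np.maximum(residual.std(axis=0), 1e-6)  # avoid 1/0
        # Project along the subject dimension: precision-weighted average
        # of the bias-corrected scores.
        weights = 1.0 / inconsistency**2
        new_quality = ((u - bias[None, :]) * weights[None, :]).sum(axis=1) / weights.sum()
        converged = np.max(np.abs(new_quality - quality)) < tol
        quality = new_quality
        if converged:
            break
    # bias and inconsistency come from the last per-subject projection.
    return quality, bias, inconsistency


# Synthetic example: four precise subjects and one very noisy subject.
rng = np.random.default_rng(0)
true_quality = rng.uniform(1.0, 5.0, size=50)
noise_std = np.array([0.1, 0.1, 0.1, 0.1, 2.0])
synthetic = true_quality[:, None] + rng.normal(size=(50, 5)) * noise_std
est_quality, est_bias, est_inconsistency = alternating_projection(synthetic)
```

In this synthetic case the noisy subject receives the largest estimated inconsistency and therefore the smallest weight, so that subject's answers are neither fully kept nor fully discarded.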

Other updates 

A new VQEG group has been recently established related to Quality Assessment for Health Applications (QAH), with the motivation to study visual quality requirements for medical imaging and telemedicine. The main goals of this new group are:

  • Assemble all the existing publicly accessible databases on medical quality.
  • Develop databases with new diagnostic tasks and new objective quality assessment models.
  • Provide methodologies, recommendations and guidelines for subjective test of medical image quality assessment.
  • Study the quality requirements and Quality of Experience in the context of telemedicine and other telehealth services.

For any further questions or expressions of interest to join this group, please contact QAH Chair Lu Zhang (lu.ge@insa-rennes.fr), Vice Chair Meriem Outtas (Meriem.Outtas@insa-rennes.fr), and Vice Chair Hantao Liu (hantao.liu@cs.cardiff.ac.uk).

References

[1] A. Raake, S. Borer, S. Satti, J. Gustafsson, R.R.R. Rao, S. Medagli, P. List, S. Göring, D. Lindero, W. Robitza, G. Heikkilä, S. Broom, C. Schmidmer, B. Feiten, U. Wüstenhagen, T. Wittmann, M. Obermann, R. Bitto, “Multi-model standard for bitstream-, pixel-based and hybrid video quality assessment of UHD/4K: ITU-T P.1204”, IEEE Access, 2020 (Available online soon).
[2] ITU-T Rec. P.1204. Video quality assessment of streaming services over reliable transport for resolutions up to 4K. Geneva, Switzerland: ITU, 2020.
[3] ITU-T Rec. P.1204.3. Video quality assessment of streaming services over reliable transport for resolutions up to 4K with access to full bitstream information. Geneva, Switzerland: ITU, 2020.
[4] ITU-T Rec. P.1204.4. Video quality assessment of streaming services over reliable transport for resolutions up to 4K with access to full and reduced reference pixel information. Geneva, Switzerland: ITU, 2020.
[5] ITU-T Rec. P.1204.5. Video quality assessment of streaming services over reliable transport for resolutions up to 4K with access to transport and received pixel information. Geneva, Switzerland: ITU, 2020.
[6] ITU-T Rec. P.1401. Methods, metrics and procedures for statistical evaluation, qualification and comparison of objective quality prediction models. Geneva, Switzerland: ITU, 2020.
[7] K. Brunnström and M. Barkowsky, “Statistical quality of experience analysis: on planning the sample size and statistical significance testing”, Journal of Electronic Imaging, vol. 27, no. 5,  p. 11, Sep. 2018 (DOI: 10.1117/1.JEI.27.5.053013).
[8] ITU-R Rec. BT.500-14. Methodology for the subjective assessment of the quality of television pictures. Geneva, Switzerland: ITU, 2019.
[9]  ITU-T Rec. P.910. Subjective video quality assessment methods for multimedia applications. Geneva, Switzerland: ITU, 2008.
[10] ITU-T Rec. P.913. Methods for the subjective assessment of video quality, audio quality and audiovisual quality of Internet video and distribution quality television in any environment. Geneva, Switzerland: ITU, 2016.
[11] Z. Li, C. G. Bampis, L. Janowski, I. Katsavounidis, “A simple model for subject behavior in subjective experiments”, arXiv:2004.02067, Apr. 2020.