Editors: Maria Torres Vega (KU Leuven, Belgium), Karel Fliegel (Czech Technical University in Prague, Czech Republic), Mihai Gabriel Constantin (University Politehnica of Bucharest, Romania)
In this and the following Dataset Columns, we present a review of some of the notable events related to open datasets and benchmarking competitions in the field of multimedia in the years 2023 and 2024. This selection highlights the wide range of topics and datasets currently of interest to the community. Some of the events covered in this review include special sessions on open datasets and competitions featuring multimedia data. This year’s review follows similar efforts from the previous year (https://records.sigmm.org/records-issues/acm-sigmm-records-issue-1-2023/), highlighting the ongoing importance of open datasets and benchmarking competitions in advancing research and development in multimedia. This first column focuses on the last two editions of QoMEX, i.e., 2023 and 2024:
- 15th International Conference on Quality of Multimedia Experience (QoMEX 2023 – https://qomex2023.itec.aau.at/).
- 16th International Conference on Quality of Multimedia Experience (QoMEX 2024 – https://qomex2024.itec.aau.at/).
QoMEX 2023
Four dataset full papers were presented at the 15th International Conference on Quality of Multimedia Experience (QoMEX 2023), held in Ghent, Belgium, on June 19–21, 2023 (https://qomex2023.itec.aau.at/). The complete QoMEX ’23 proceedings are available in the IEEE Xplore Digital Library (https://ieeexplore.ieee.org/xpl/conhome/10178424/proceeding).
These datasets were presented within the Datasets session, chaired by Professor Lea Skorin-Kapov. Given the scope of the conference (i.e., Quality of Multimedia Experience), the four papers examine how adaptive 2D video streaming, holographic video codecs, omnidirectional video/audio environments, and multi-screen video affect user perception.
PNATS-UHD-1-Long: An Open Video Quality Dataset for Long Sequences for HTTP-based Adaptive Streaming QoE Assessment
Ramachandra Rao, R. R., Borer, S., Lindero, D., Göring, S. and Raake, A.
Paper available at: https://ieeexplore.ieee.org/document/10178493
Dataset available at: https://github.com/Telecommunication-Telemedia-Assessment/PNATS-UHD-1-Long
A collaboration between Technische Universität Ilmenau (Germany), Ericsson Research (Sweden), and Rohde & Schwarz (Switzerland).
The presented dataset consists of three subjective databases targeting overall quality assessment of typical HTTP-based Adaptive Streaming (HAS) sessions, covering degradations such as quality switching, initial loading delay, and stalling events, with audiovisual contents between 2 and 5 minutes long. In addition, subject bias and consistency in the quality assessment of such longer-duration audiovisual contents with multiple degradations are investigated using a subject behaviour model. As part of the paper, the overall test design, subjective test results, sources, encoded audiovisual contents, and a set of analysis plots are made publicly available for further research.
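As a rough illustration of how one such session can be represented for analysis, the sketch below models a session as an initial loading delay, a per-segment quality track, and a list of stalling events. The field names and numbers are illustrative assumptions, not the dataset’s actual schema.

```python
from dataclasses import dataclass, field

@dataclass
class HASSession:
    """One adaptive-streaming session (illustrative fields, not the dataset schema)."""
    initial_delay_s: float                               # initial loading delay
    segment_quality: list = field(default_factory=list)  # quality level per segment
    stalls: list = field(default_factory=list)           # (start_s, duration_s) pairs

    def n_quality_switches(self) -> int:
        # a switch is any change of quality level between consecutive segments
        return sum(a != b for a, b in zip(self.segment_quality, self.segment_quality[1:]))

    def total_stall_s(self) -> float:
        return sum(duration for _, duration in self.stalls)

session = HASSession(initial_delay_s=2.5,
                     segment_quality=[3, 3, 2, 2, 4],
                     stalls=[(12.0, 1.8)])
print(session.n_quality_switches(), session.total_stall_s())  # -> 2 1.8
```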
Open access dataset of holographic videos for codec analysis and machine learning applications
Gilles, A., Gioia, P., Madali, N., El Rhammad, A., Morin, L.
Paper available at: https://ieeexplore.ieee.org/document/10178637
Dataset available at: https://hologram-repository.labs.b-com.com/#/holographic-videos
A collaboration between IRT and INSA in Rennes, France.
This is reported as the first large-scale dataset of its kind, containing 18 holographic videos computed at three different resolutions and pixel pitches. By providing the color and depth images corresponding to each hologram frame, the dataset can also be used in additional applications such as validating 3D scene geometry retrieval or deep-learning-based hologram synthesis methods. Altogether, it comprises 5,400 pairs of RGB-D images and holograms, totaling more than 550 GB of data.
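Since each hologram frame ships with its corresponding color and depth images, a learning pipeline would typically pair the three per frame. The sketch below enumerates such triplets; the directory layout and file extensions are hypothetical, so consult the repository for the actual structure.

```python
from pathlib import Path

# Hypothetical local layout: holograms/, color/, depth/ with matching file stems.
root = Path("holographic_videos")

def frame_triplets(root: Path):
    """Yield (color, depth, hologram) paths for frames present in all three folders."""
    for holo in sorted((root / "holograms").glob("*.npy")):
        color = root / "color" / (holo.stem + ".png")
        depth = root / "depth" / (holo.stem + ".png")
        if color.exists() and depth.exists():
            yield color, depth, holo

for color, depth, holo in frame_triplets(root):
    pass  # load the RGB-D pair as model input and the hologram as the target
```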
Saliency of Omnidirectional Videos with Different Audio Presentations: Analyses and Dataset
Singla, A., Robotham, T., Bhattacharya, A., Menz, W., Habets, E. and Raake, A.
Paper available at: https://ieeexplore.ieee.org/abstract/document/10178588
Dataset available at: https://qoevave.github.io/database/docs/Saliency
A collaboration between the Technische Universität Ilmenau and the International Audio Laboratories of Erlangen, both in Germany.
This dataset uses a between-subjects test design to collect users’ exploration data for 360-degree videos in a free-form viewing scenario using the Varjo XR-3 Head-Mounted Display, with no audio, mono audio, or 4th-order Ambisonics audio. Saliency information was captured as head saliency, i.e., the center of the viewport sampled at 50 Hz. For each item, subjects were asked to describe the scene in a short free-verbalization task. Moreover, cybersickness was assessed with the Simulator Sickness Questionnaire at the beginning and end of the test. The data is intended to enable the training of visual and audiovisual saliency prediction models for interactive experiences.
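As a minimal sketch of how such head-saliency samples turn into a saliency map, the snippet below bins viewport-center directions (yaw/pitch, as recorded at 50 Hz) onto an equirectangular grid. The grid resolution and the absence of smoothing are simplifying assumptions, not the authors’ processing pipeline.

```python
import numpy as np

W, H = 512, 256                # equirectangular grid (assumed resolution)
saliency = np.zeros((H, W))

def add_sample(yaw_deg: float, pitch_deg: float) -> None:
    # yaw in [-180, 180) maps to the x axis, pitch in [-90, 90] to the y axis
    x = int((yaw_deg + 180.0) / 360.0 * W) % W
    y = min(int((90.0 - pitch_deg) / 180.0 * H), H - 1)
    saliency[y, x] += 1.0

# invented head-tracking samples; the real ones arrive at 50 Hz per subject
for yaw, pitch in [(0.0, 0.0), (45.0, 10.0), (44.0, 9.0)]:
    add_sample(yaw, pitch)

saliency /= saliency.sum()     # normalise to a probability map
```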
A Subjective Dataset for Multi-Screen Video Streaming Applications
Barman, N., Reznik, Y. and Martini, M. G.
Paper available at: https://ieeexplore.ieee.org/document/10178645
Dataset available at: https://github.com/NabajeetBarman/Multiscreen-Dataset
A collaboration between Brightcove (London, UK and Seattle, USA) and Kingston University London, UK.
This paper presents a new, open-source dataset of subjective ratings for video sequences encoded at different resolutions and bitrates (quality levels) and viewed on three devices of varying screen size: TV, tablet, and mobile. Along with the subjective scores, an evaluation of some of the most widely used open-source objective quality metrics is presented. The performance of the metrics is observed to vary considerably across device types, with the recently standardized ITU-T P.1204.3 model, on average, outperforming its full-reference counterparts.
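Such evaluations typically boil down to correlating each metric’s scores with the per-device MOS. The snippet below shows that standard step with invented numbers; the real scores live in the dataset repository.

```python
import numpy as np
from scipy.stats import pearsonr, spearmanr

mos    = np.array([1.8, 2.9, 3.6, 4.2, 4.6])       # subjective MOS (invented)
metric = np.array([30.1, 34.8, 37.2, 40.5, 42.0])  # one objective metric's scores (invented)

pcc, _   = pearsonr(metric, mos)    # linear correlation with MOS
srocc, _ = spearmanr(metric, mos)   # rank-order correlation (monotonicity)
print(f"PCC={pcc:.3f}  SROCC={srocc:.3f}")
```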
QoMEX 2024
Five dataset full papers were presented at the 16th International Conference on Quality of Multimedia Experience (QoMEX 2024), held in Karlshamn, Sweden, on June 18–20, 2024 (https://qomex2024.itec.aau.at/). The complete QoMEX ’24 proceedings are available in the IEEE Xplore Digital Library (https://ieeexplore.ieee.org/xpl/conhome/10597667/proceeding).
These datasets were presented within the Datasets session, chaired by Dr. Mohsen Jenadeleh. Given the scope of the conference (i.e., Quality of Multimedia Experience), the five papers examine how HDR videos (UHD-1, 8K, and AV1), immersive 360° video, and light fields affect user perception. The light field contribution received the conference’s Best Paper Award.
AVT-VQDB-UHD-1-HDR: An Open Video Quality Dataset for Quality Assessment of UHD-1 HDR Videos
Ramachandra Rao, R. R., Herb, B., Helmi-Aurora, T., Ahmed, M. T., Raake, A.
Paper available at: https://ieeexplore.ieee.org/document/10598284
Dataset available at: https://github.com/Telecommunication-Telemedia-Assessment/AVT-VQDB-UHD-1-HDR
A work from Technische Universität Ilmenau, Germany.
This dataset deals with the assessment of the perceived quality of HDR videos. A subjective test with 4K/UHD-1 HDR videos was conducted using the ACR-HR (Absolute Category Rating with Hidden Reference) method. The test comprised a total of 195 encoded videos derived from 5 source videos, all with a frame rate of 60 fps. The 4K/UHD-1 HDR stimuli were encoded at four resolutions, namely 720p, 1080p, 1440p, and 2160p, at bitrates ranging between 0.5 Mbps and 40 Mbps. The results of the subjective test were analyzed to assess the impact of factors such as resolution, bitrate, video codec, and content on the perceived video quality.
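For readers unfamiliar with ACR-HR: each subject also rates the hidden (unimpaired) reference, and a differential score DV = V(PVS) − V(REF) + 5 removes the subject’s baseline before averaging into a DMOS, following ITU-T P.910. A minimal sketch with invented ratings:

```python
import numpy as np

ratings_pvs = np.array([3, 4, 3, 2, 4])  # per-subject ratings of a processed video (invented)
ratings_ref = np.array([5, 5, 4, 4, 5])  # the same subjects' hidden-reference ratings (invented)

dv = ratings_pvs - ratings_ref + 5       # differential viewer scores (ACR-HR, ITU-T P.910)
print("DMOS:", dv.mean())
```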
AVT-VQDB-UHD-2-HDR: An open 8K HDR source dataset for video quality research
Keller, D., Goebel, T., Sievenkees, V., Prenzel, J., Raake, A.
Paper available at: https://ieeexplore.ieee.org/document/10598268
Dataset available at: https://github.com/Telecommunication-Telemedia-Assessment/AVT-VQDB-UHD-2-HDR
A work from Technische Universität Ilmenau, Germany.
The AVT-VQDB-UHD-2-HDR dataset consists of 31 8K HDR source videos of 15 s each, created with the goal of accurately representing real-life footage while accounting for video coding and video quality testing challenges.
The effect of viewing distance and display peak luminance – HDR AV1 video streaming quality dataset
Hammou, D., Krasula, L., Bampis, C., Li, Z., Mantiuk, R.
Paper available at: https://ieeexplore.ieee.org/document/10598289
Dataset available at: https://doi.org/10.17863/CAM.107964
A collaboration between University of Cambridge (UK) and Netflix Inc. (USA).
The HDR-VDC dataset captures the quality degradation of HDR content due to AV1 coding artifacts and resolution reduction. The quality drop was measured at two viewing distances, corresponding to 60 and 120 pixels per visual degree, and at two display mean luminance levels, 51 and 5.6 nits. The study employs a highly sensitive pairwise comparison protocol with active sampling, including comparisons across viewing distances, to obtain quality measurements that are as accurate as possible. It is also the first publicly available dataset that measures the effect of display peak luminance and includes HDR videos encoded with AV1.
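The pixels-per-degree figures follow directly from viewing geometry: a pixel of pitch p seen from distance d subtends 2·atan(p / 2d), and ppd is the reciprocal of that angle in degrees. The sketch below works this out for an assumed 65-inch UHD panel (pitch ≈ 0.372 mm); the distances and panel are illustrative, not the study’s setup.

```python
import math

def pixels_per_degree(distance_m: float, pixel_pitch_m: float) -> float:
    # angle subtended by one pixel, converted to degrees, then inverted
    deg_per_pixel = 2.0 * math.degrees(math.atan(pixel_pitch_m / (2.0 * distance_m)))
    return 1.0 / deg_per_pixel

pitch = 0.372e-3                 # assumed 65" UHD panel: ~1.43 m wide / 3840 px
for d in (1.25, 2.5):            # doubling the distance doubles the ppd
    print(f"{d:.2f} m -> {pixels_per_degree(d, pitch):.0f} ppd")
# ~59 and ~117 ppd, close to the 60/120 ppd conditions reported above
```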
A Spherical Light Field Database for Immersive Telecommunication and Telepresence Applications (Best Paper Award)
Zerman, E., Gond, M., Takhtardeshir, S., Olsson, R., Sjöström, M.
Paper available at: https://ieeexplore.ieee.org/document/10598264
Dataset available at: https://zenodo.org/records/13342006
A work from Mid Sweden University, Sundsvall, Sweden.
The Spherical Light Field Database (SLFDB) consists of light fields of 60 views each, captured with an omnidirectional camera across 20 scenes. To show the usefulness of the database, the authors provide two use cases: compression and viewpoint estimation. The initial results validate that the publicly available SLFDB will benefit the scientific community.
AVT-ECoClass-VR: An open-source audiovisual 360° video and immersive CGI multi-talker dataset to evaluate cognitive performance
Fremerey, S., Breuer, C., Leist, L., Klatte, M., Fels, J., Raake, A.
Paper available at: https://ieeexplore.ieee.org/document/10598262
Dataset available at: https://github.com/Telecommunication-Telemedia-Assessment/AVT-ECoClass-VR
A collaboration between Technische Universität Ilmenau, RWTH Aachen University, and RPTU Kaiserslautern (Germany).
This dataset includes two audiovisual scenarios (360° video and computer-generated imagery) and two implementations for dataset playback. The 360° video part of the dataset features 200 video and single-channel audio recordings of 20 speakers reading ten stories, and 20 videos of speakers in silence, resulting in a total of 220 video and 200 audio recordings. The dataset also includes one 360° background image of a real primary school classroom scene, targeting young school children for subsequent subjective tests. The second part of the dataset comprises 20 different 3D models of the speakers and a computer-generated classroom scene, along with an immersive audiovisual virtual environment implementation that can be interacted with using an HTC Vive controller.