Quality of Multimedia Experience Meets Machine Intelligence

By Tobias Hossfeld | February 2, 2026 - 11:16 | 0126, Feature, QoE Column

1. Why QoE meets Machine Intelligence Now

Multimedia systems are evolving towards AI-driven, adaptive services, leading to a natural convergence of QoE and machine intelligence. In this context, machine intelligence can empower QoE through learning-based, context-aware, and semantic-driven modelling and optimization. At the same time, QoE can guide machine intelligence by providing a human-centred objective for AI system design and evaluation; see also [11]. Looking beyond human perception, toward agent-centric and hybrid QoE, future multimedia systems increasingly require unified experience objectives that support human-AI co-experience. QoMEX’26 in Cardiff stands as a major milestone highlighting the convergence of Quality of Multimedia Experience with Machine Intelligence. This column reflects on this evolution and outlines the key challenges ahead.

Multimedia systems have shifted from “best-effort delivery” toward intelligent, adaptive services that operate under highly diverse network conditions, device capabilities, and user contexts. In this landscape, Quality of Experience (QoE) has become a central concept, focusing on user satisfaction rather than purely signal-level fidelity [1, 2, 3].

QoE has traditionally been human-centric, reflecting perceived quality, enjoyment, comfort, and acceptance of multimedia services [2]. Meanwhile, machine intelligence, from deep learning and reinforcement learning to multimodal foundation models, has rapidly become the dominant paradigm for perception, generation, and decision-making. The intersection of these trends is timely and inevitable: QoE provides the human-centred goal, while machine intelligence provides scalable tools to model and optimize experience in complex real-world environments. Figure 1 summarizes this bidirectional relationship between QoE and machine intelligence, from multimodal inputs to human-centric, agent-centric, and hybrid QoE objectives.

Figure 1. A conceptual framework where machine intelligence enables QoE prediction and QoE-aware optimization, while QoE evolves from a human-centric notion toward agent-centric and hybrid objectives in intelligent multimedia systems.

2. How machine intelligence can empower QoE

(1) Learning QoE models beyond handcrafted rules

Classic QoE models often rely on handcrafted features and simplified assumptions linking system parameters (bitrate, delay, resolution) to perceived quality. Machine learning offers a flexible alternative: it can learn complex nonlinear mappings from content, network conditions, and user interaction signals to QoE outcomes. Deep models further enable learning from high-dimensional inputs such as raw video frames, audio signals, and multimodal logs, supporting richer QoE prediction in streaming, immersive media, short-form video, gaming, and interactive communication. In this context, advances in perceptual quality assessment (e.g., full-reference and no-reference IQA/VQA) also provide useful foundations for QoE-related modelling [5, 8, 9].

(2) QoE-aware control and optimization

Machine intelligence is not only about prediction; it can also enable QoE-driven decision-making. Instead of optimizing network metrics alone, systems can adapt encoding, bitrate selection, buffering strategies, or rendering policies to maximize predicted QoE. This direction has been extensively studied in adaptive streaming, where QoE-driven strategies are used to balance bitrate quality and playback stability [4]. Reinforcement learning is particularly promising here: QoE can serve as a reward signal, and agents can learn robust policies under uncertainty (e.g., bandwidth fluctuations, user engagement changes) [6, 7].
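As a minimal sketch of how QoE can serve as a reward signal, the following Python function scores a streaming session with a linear form that is common in the adaptive streaming literature (in the spirit of [6, 7]): per-chunk bitrate utility minus penalties for rebuffering and for bitrate switches. The function name, the penalty weights, and the use of bitrate in Mbit/s as the quality term are illustrative assumptions, not values taken from this column.

```python
from typing import Sequence


def session_qoe(
    bitrates_mbps: Sequence[float],
    rebuffer_s: Sequence[float],
    rebuffer_penalty: float = 4.0,  # assumed weight per second of stalling
    switch_penalty: float = 1.0,    # assumed weight per Mbit/s of bitrate change
) -> float:
    """Linear QoE score for one session: quality - stalling - instability."""
    if len(bitrates_mbps) != len(rebuffer_s):
        raise ValueError("need one rebuffering time per chunk")
    quality = sum(bitrates_mbps)
    stalling = rebuffer_penalty * sum(rebuffer_s)
    # Sum of absolute bitrate changes between consecutive chunks.
    switches = switch_penalty * sum(
        abs(nxt - cur) for cur, nxt in zip(bitrates_mbps, bitrates_mbps[1:])
    )
    return quality - stalling - switches


# A steady medium bitrate versus an aggressive policy that stalls and oscillates.
smooth = session_qoe([2.0, 2.0, 2.0, 2.0], [0.0, 0.0, 0.0, 0.0])
greedy = session_qoe([4.0, 1.0, 4.0, 1.0], [0.0, 1.5, 0.0, 1.5])
print(f"smooth={smooth:.2f} greedy={greedy:.2f}")
```

Under these assumed weights the steady policy scores higher even though its average bitrate is lower, which is the kind of trade-off between bitrate quality and playback stability that a learning agent would be rewarded for discovering.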

(3) Personalization and context-awareness

QoE is inherently subjective and context-dependent. Machine intelligence can support personalization by incorporating user preferences and context signals such as device type, mobility, ambient environment, and usage patterns. For example, some users are more sensitive to rebuffering events, while others prioritize sharpness and resolution. Context-aware learning enables systems to move beyond “one-size-fits-all” adaptation.

(4) Semantic Intelligence

Machine intelligence can empower QoE by shifting quality assessment from perceptual fidelity toward semantic quality, that is, how well the meaning and task-relevant information of multimodal content is preserved for both machines and humans. As multimedia data is increasingly consumed by AI systems in applications like autonomous systems and AI-generated content pipelines, traditional perceptual metrics fail to reflect performance and experience because they ignore semantic consistency. Semantic-aware evaluation may enable both task-oriented and task-agnostic assessment. By integrating semantic quality assessment, AI can guide compression, transmission, and system design in ways that better align technical performance with downstream task success and user experience.

3. How QoE can guide machine intelligence

The relationship between QoE and machine intelligence is bidirectional: QoE can also shape how multimedia AI systems are designed, trained, and evaluated.

(1) QoE as a human-centric objective function

Many multimedia AI pipelines optimize proxy metrics such as accuracy, PSNR/SSIM, or task performance. However, these do not always align with perceived quality or user satisfaction. QoE provides a principled framework to define what “better” means from the user’s perspective and encourages evaluation beyond technical fidelity [2, 10].

(2) Aligning generative intelligence with user satisfaction

With the rise of generative AI for multimedia enhancement and creation, QoE becomes even more critical. High-quality generation is not only about realism but also about temporal consistency, comfort, trust, and acceptance in real usage conditions. Integrating QoE considerations can help steer generative models toward outcomes that users actually prefer.

Emerging Challenge “QoE of interactive AI systems”

AI evaluation is shifting from pure model accuracy toward experience-based assessment of how humans interact with AI, aligned with frameworks like the EU AI Act. Quality of Experience (QoE) and UX research provide established methods to measure subjective aspects such as trust, transparency, human oversight of AI systems, robustness, and satisfaction. Applying QoE methodologies can translate high-level AI principles into measurable experiential dimensions, which requires new metrics that reflect how users actually understand, trust and operate AI systems in practice. For more details, see [11].

4. Beyond human-centric QoE: toward agent-centric and hybrid QoE

While QoE has historically focused on human perception, emerging multimedia systems increasingly serve autonomous agents such as robots, drones, and intelligent vehicles. In these scenarios, multimedia is not only consumed by humans but also by machines. This motivates an extended view of QoE, agent-centric QoE, where “experience” can be interpreted as the utility of multimedia inputs for decision-making and task execution.

Agent-centric QoE can be characterized through indicators such as perception reliability, uncertainty reduction, latency sensitivity, safety margins, energy efficiency, and task success rate. Importantly, many future applications involve human–AI co-experience, for example, in teleoperation, remote driving, robot-assisted inspection, and collaborative XR. In such systems, overall quality depends on both human satisfaction and machine performance. As shown in Figure 1, this motivates unified QoE objectives that jointly optimize human-centric and agent-centric requirements, that is, human satisfaction and agent utility.

5. Key challenges

Despite its promise, QoE-meets-AI research faces several open challenges:

Subjective data cost and scarcity: QoE ground truth often requires user studies and careful experimental design [2, 3].
Generalization: QoE models may struggle across unseen content types, devices, or cultural contexts.
Bias and fairness: QoE datasets may underrepresent certain user groups or contexts, leading to skewed optimization.
Explainability and trust: Black-box QoE predictions can be difficult to interpret and validate in engineering pipelines.
Privacy: Personalization requires user data, raising responsible data usage concerns.
Ethical aspects: Beyond established research ethics procedures, QoE research must increasingly address the broader ethical implications of AI-driven experience optimization, such as fairness, transparency, wellbeing, privacy, and environmental impact, which are essential for truly human-centred technology.

6. Outlook and takeaways

The convergence of Quality of Experience and machine intelligence represents a major opportunity for the multimedia community. Machine intelligence offers scalable tools to predict and optimize QoE in complex environments, while QoE provides a human-centred lens to guide AI system design toward real user value. Looking forward, QoE may evolve from a purely human-centric notion to a hybrid experience shared by humans and intelligent agents, enabling multimedia systems that are not only technically advanced, but also aligned with what humans and autonomous agents truly need.

Looking ahead to the continued evolution of the QoMEX conference series, QoMEX’26 in Cardiff represents a key milestone where Quality of Multimedia Experience directly converges with Machine Intelligence. As AI increasingly shapes how multimedia is created, transmitted, and consumed, the conference invites the community to rethink both the goals and methods of QoE research – using AI to enhance user experience, while drawing on QoE insights to build more human-aware, trustworthy, and adaptive intelligent systems. This vision is reflected in special sessions:
“SS1: Semantic Quality Assessment for Multi-Modal Intelligent Systems” extends quality evaluation beyond perceptual fidelity toward meaning and task relevance. The session aims to lay the foundations of multimodal semantic quality assessment, enable semantic-driven compression and transmission, and connect semantic quality evaluation with AI understanding.
“SS2: Beyond Quality: Integrating Ethical Dimensions in QoE Research” integrates ethical dimensions into QoE research, emphasizing fairness, transparency, wellbeing, privacy, and environmental impact, which are essential for truly human-centred technology. The session calls for ethically reflexive, value-sensitive QoE frameworks that incorporate social impact, collective QoE, and inclusive research practices alongside traditional UX measures.

Together, these themes signal a continued broadening of the QoE scope, reaffirming QoMEX as a forum that evolves with emerging technologies while advancing inclusive, responsible, and future-oriented quality research. The 18th International Conference on Quality of Multimedia Experience (QoMEX’26) will take place in Cardiff, United Kingdom, from June 29 to July 3, 2026. Please find more information on the website of QoMEX’26: https://qomex2026.itec.aau.at/

References

[1] ITU-T Rec. P.10/G.100 (2006), Vocabulary for performance and quality of service.

[2] Möller, S., & Raake, A. (2014), Quality of Experience: Advanced Concepts, Applications and Methods. Springer.

[3] De Moor, K., et al. (2010), Proposed framework for evaluating quality of experience in a mobile, testbed-oriented living lab setting. Mobile Networks and Applications.

[4] Seufert, M., Egger, S., Slanina, M., Zinner, T., Hoßfeld, T., & Tran-Gia, P. (2015), A survey on quality of experience of HTTP adaptive streaming. IEEE Communications Surveys & Tutorials.

[5] Bampis, C. G., Li, Z., Moorthy, A. K., Katsavounidis, I., Aaron, A., & Bovik, A. C. (2018), Study of temporal effects on subjective video quality of experience. IEEE Transactions on Image Processing.

[6] Yin, X., Jindal, A., Sekar, V., & Sinopoli, B. (2015), A control-theoretic approach for dynamic adaptive video streaming over HTTP. ACM SIGCOMM.

[7] Mao, H., Netravali, R., & Alizadeh, M. (2017), Neural adaptive video streaming with Pensieve. ACM SIGCOMM.

[8] Wang, Z., & Bovik, A. C. (2006), Modern image quality assessment. Springer.

[9] Mittal, A., Moorthy, A. K., & Bovik, A. C. (2013), No-reference image quality assessment in the spatial domain. IEEE Transactions on Image Processing.

[10] Hoßfeld, T., Schatz, R., & Egger, S. (2011), SOS: The MOS is not enough! QoMEX.

[11] Hupont, I., De Moor, K., Skorin-Kapov, L., Varela, M., & Hoßfeld, T. (2025), Rethinking QoE in the Age of AI: From Algorithms to Experience-Based Evaluation. ACM SIGMultimedia Records.

Rethinking QoE in the Age of AI: From Algorithms to Experience-Based Evaluation

By Tobias Hossfeld | December 17, 2025 - 08:02 | 0425, Feature, QoE Column, September 2013

AI evaluation is undergoing a paradigm shift from focusing solely on algorithmic accuracy of AI models to emphasizing experience-based assessment of human interactions with AI systems. Under frameworks like the EU AI Act, evaluation now considers intended purpose, risk, transparency, human oversight, and real-world robustness alongside accuracy. Quality of Experience (QoE) methodologies may offer a structured approach to evaluate how users perceive and experience AI systems in terms of transparency, trust, control and overall satisfaction. This column gives inspiration and shared insights for both communities to advance experience-based AI system evaluation together.

1. From algorithms to systems: AI as user experience

Artificial Intelligence (AI) algorithms—mathematical models implemented as lines of code and trained on data to predict, recommend or generate outputs—were, until recently, tools reserved for programmers and researchers. Only those with technical expertise could access, run or adapt them. For decades, progress in AI was equated with improvements in algorithmic performance: higher accuracy, better precision or new benchmark records—often achieved under narrow, controlled conditions that did not reflect the full spectrum of real-world operational environments. These advances, though scientifically impressive, remained largely invisible to society at large.

The turning point came when AI stopped being just code and became an experience accessible to everyone, regardless of their technical background. Once algorithms were embedded into interactive systems (chatbots, voice assistants, recommendation platforms, image generators), AI became ubiquitous, integrated into people’s daily lives. Interfaces transformed technical capability into human experience, making AI no longer a purely algorithmic or research-oriented field but also a social, experiential and increasingly public phenomenon [Mlynář et al., 2025].

This shift fundamentally changed what it means to evaluate AI [Bach et al., 2024]. Accuracy-based metrics (such as precision, recall, specificity or F1-score) no longer suffice for systems that mediate human experiences, influence decision-making and shape trust. Evaluation must now extend beyond the model’s internal performance to assess the interaction, context and experience that emerge when humans engage with AI systems in realistic conditions. We must therefore move from evaluating algorithms in isolation to genuinely human-centered approaches to AI and the experiences it enables [see e.g., https://hai.stanford.edu/], evaluating AI systems holistically: considering not only their technical performance but also their experiential, contextual, and social impact [Shneiderman, 2022]. The European Union’s Artificial Intelligence Act [AI Act, 2024] provides a clear illustration of this shift. As the first comprehensive regulatory framework for AI, it recognizes that while algorithmic quality remains essential, what is ultimately regulated is the AI system: its design, use, and intended purpose. Obligations under the Act are tied to that intended purpose, which determines both the risk level and the compliance requirements (see figure below). For instance, the same object detection model can be considered low risk when used to organize personal photo libraries, but high risk when deployed in an autonomous vehicle’s collision-avoidance system.

Figure 1. The European Union’s Artificial Intelligence Act [AI Act, 2024]: risk and obligations depend on an AI system’s intended purpose—permitting low-risk uses while restricting or prohibiting high-risk applications. Examples in the figure are illustrative, not exhaustive. Some uses require prior authorisation under the EU AI Act.

This illustrates a fundamental change: evaluating AI systems today requires understanding how, where and by whom a system is used, not merely how accurate its underlying AI model is. Moreover, evaluation must consider how systems behave and degrade under operational conditions (e.g., adverse weather in traffic monitoring or biased performance across demographic groups in facial analysis), how humans interact with, interpret and rely on them, and what mechanisms of human oversight or intervention exist in practice to ensure accountability and control [Panigutti et al., 2023].

2. Towards a paradigm shift in AI evaluation

The European AI Act marks the first comprehensive attempt to regulate the design, deployment and use of AI systems. Yet its underlying philosophy resonates broadly with the principles endorsed by other high-level international institutions and initiatives, such as the OECD [OECD, 2024], the World Economic Forum [WEF, 2025] and, more recently, the Paris AI Action Summit [CSIS, 2025], where over sixty countries signed a joint commitment to promote responsible, trustworthy and human-centric AI.

Among the many obligations set out in the AI Act for high-risk AI systems, three provisions stand out as emblematic of this paradigm shift: they focus not on algorithmic precision, but on how AI systems are experienced, supervised and operated in the real world.

Article 13 – Transparency. AI systems must be designed and developed in a way that is sufficiently transparent to enable users to interpret their output and use it appropriately. Transparency therefore extends beyond disclosure or documentation: it encompasses interaction design and interpretability, ensuring that users—especially non-experts—can meaningfully understand and act upon what the system produces, based on which input and how.
Article 14 – Human oversight. High-risk AI systems must allow for effective human supervision so that they can be used as intended and to prevent or minimise risks to health, safety or fundamental rights (e.g., respect for human dignity, privacy, equality and non-discrimination). Oversight involves not only control features or override mechanisms, but also interface designs that help operators recognise when human intervention is necessary, addressing known challenges such as automation bias and over-trust in AI systems [Gaudeul et al., 2024].
Article 15 – Accuracy, robustness and cybersecurity. This provision broadens the traditional notion of accuracy, demanding that systems perform reliably under real-world operational conditions and remain secure and resilient to errors, adversarial manipulation or context change. It also calls for mechanisms that support graceful degradation and error recovery, ensuring sustained trust and dependable performance over time.

These provisions, aligned with both the AI Act and the broader international discourse on responsible AI, express a clear transformation in how AI systems should be evaluated. They call for a move beyond in-lab algorithmic performance metrics to include criteria grounded in human experience, operational reliability and social trust. To make these requirements actionable, the European Commission issued a Standardisation Request on Artificial Intelligence (initially published as M/593, 2024 [European Commission, 2024] and subsequently updated following the adoption of the AI Act), mandating the development of harmonised standards to support conformity with the regulation. Yet analyses of existing AI standardisation frameworks suggest that they remain primarily focused on technical robustness and risk management, while offering limited methodological guidance for assessing transparency, human oversight and perceived reliability [Soler et al., 2023].

This gap underscores the need for contributions from the Quality of Experience (QoE) community, whose expertise in assessing perceived quality, usability and trust, as well as the pragmatic, hedonic and increasingly also eudaimonic aspects of users’ experiences, could inform both standardisation efforts and AI system design in practice. For example, [Hammer et al., 2018] introduced the “HEP cube”, a 3D model that maps hedonic (H), eudaimonic (E), and pragmatic (P) aspects of QoE and user experience. Utility (P), joy-of-use (H), and meaningfulness (E), for instance, are integrated into a multidimensional HEP construct [Egger-Lampl et al., 2019]. In professional contexts, long-term experiential quality depends increasingly on eudaimonic factors such as meaning and the personal growth of the user’s capabilities. Taking augmented reality (AR) for the informational phase of procedure assistance as an example, [Hynes et al., 2023] consider pragmatic aspects, such as clear, accurately aligned AR instructions that reduce cognitive load and support efficient task execution, alongside hedonic and eudaimonic aspects, which involve engaging, intuitive interactions that not only make the experience pleasant but also foster confidence, competence, and meaningful professional growth. The study confirmed that AR better fulfills users’ pragmatic needs compared to paper-based instructions. However, the hypothesis that AR surpasses paper-based instructions in meeting hedonic needs was rejected. [Oppermann et al., 2024] evaluated a VR-based forestry safety training and found improved experiential quality and real-world skill transfer compared to traditional instruction. In addition to hedonic and pragmatic UX, eudaimonic experience was assessed by asking participants whether the training would “make me a better forestry worker” and “develop my personal potential”.

3. From benchmark performance to operational reality: the case of facial recognition

The example of remote facial recognition (RFR) for public security clearly illustrates how traditional accuracy-based evaluation fails to capture the real challenges of proportionality, operational viability and public trust that define the true quality of experience of AI in use. Under the EU AI Act, the use of real-time remote biometric identification systems in publicly accessible spaces for law enforcement is prohibited, except in narrowly defined circumstances—such as the prevention of terrorist threats, the search for missing persons or the prosecution of crimes—and always subject to prior authorisation by a competent authority. In these cases, the authority must assess whether the deployment of such a system is necessary and proportionate to the intended purpose.

Both the AI Act and the World Economic Forum emphasise this principle of “proportionality” for face recognition systems [AI Act, 2024], [Louradour & Madzou, 2021], yet without providing clear guidance on what “proportionate use” actually means. Deciding whether to deploy RFR therefore requires balancing multiple dimensions (technical performance, societal impact and human oversight) beyond mere accuracy scores [Negri et al., 2024]. Consider, for instance, a competent authority evaluating whether to deploy an RFR system in airports screening 200 million passengers annually, where the estimated prevalence of genuine threats is roughly one in fifty million, i.e., about four genuine threats per year. Even with a true positive rate (TPR) and true negative rate (TNR) of 99% (equivalent to 99% sensitivity and specificity), the outcome is paradoxical: nearly all real threats would be detected (≈ 4 per year), but the 1% false positive rate means that around two million innocent passengers would face unnecessary police interventions. Algorithmically, a 99% performance looks excellent. Operationally, it is unmanageable and counterproductive. Handling millions of false alarms would overwhelm security forces, delay operations, and, most importantly, erode public trust, as citizens repeatedly experience unjustified scrutiny and lose confidence in authorities.
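The arithmetic behind this example can be reproduced in a few lines. The sketch below uses only the figures stated above (200 million passengers, a prevalence of one in fifty million, 99% TPR and TNR); the function and variable names are illustrative assumptions.

```python
def screening_outcomes(population: int, prevalence: float, tpr: float, tnr: float) -> dict:
    """Expected yearly confusion-matrix counts for a screening system."""
    threats = population * prevalence
    innocents = population - threats
    true_pos = tpr * threats               # genuine threats that are flagged
    false_pos = (1.0 - tnr) * innocents    # innocent passengers that are flagged
    return {
        "threats": threats,
        "true_positives": true_pos,
        "false_positives": false_pos,
        # Precision: probability that an alarm concerns a genuine threat.
        "precision": true_pos / (true_pos + false_pos),
    }


result = screening_outcomes(
    population=200_000_000,     # passengers per year (from the text)
    prevalence=1 / 50_000_000,  # one genuine threat in fifty million (from the text)
    tpr=0.99,
    tnr=0.99,
)
for name, value in result.items():
    print(f"{name}: {value:,.6f}")
```

With these inputs, only about two alarms in a million concern a genuine threat, which is why a 99% benchmark figure says little about operational viability at such a low base rate.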

Beyond accuracy, competent authorities must evaluate trade-offs between different operational, social and economic dimensions that holistically define the proportionality and viability of an AI system:

Operational feasibility: number of human interventions needed, false alarms to handle and system downtime.
Social impact: perceived fairness, legitimacy and transparency of interventions.
Economic cost: cost of system deployment, resources spent managing false positives versus genuine detections.
Human trust and cognitive load: how repeated interactions with the system affect operator confidence, vigilance and the balance between over-trust and alert fatigue.
Consequences of error: the cost of a missed detection versus that of an unjustified intervention.

Hence, accuracy alone cannot guarantee reliability or trustworthiness. Evaluating AI systems requires contextual and human-aware metrics that capture operational trade-offs and social implications. The goal is not only to predict well, but to perform well in the real world. This example reveals a broader truth: trustworthy AI demands evaluation methods that connect technical performance with lived experience—and this is precisely where the QoE community can make a distinctive contribution.

4. Where AI and QoE should meet: new metrics for a new era

The limitations of accuracy-based evaluation, as illustrated by the facial recognition case, point to a broader need for metrics that capture how AI systems perform in real-world, human-centred contexts [Virvou, 2023], [Park et al., 2023].

Over the past decades, the scientific communities focusing on QoE and user experience (UX) research have developed a rigorous toolbox for quantifying subjective experience: how users perceive quality, usability, reliability, control and satisfaction, as well as the pragmatic, hedonic and increasingly also eudaimonic aspects of their experience, when interacting with complex technological systems. Originally rooted in multimedia, communication networks and human–computer interaction, these methodologies offer a mature foundation for assessing experienced quality in AI systems. QoE-based approaches can help transform general principles such as transparency, human oversight and robustness into measurable experiential dimensions that reflect how users actually understand, trust and operate AI systems in practice.

The following table presents a set of illustrative examples of QoE-inspired metrics—adapted from long-standing practices in the field—that could be further adapted, developed and validated for the evaluation of trustworthy AI.

General AI principles	QoE-inspired metrics
Transparency and comprehensibility	Perceived transparency score: % of users reporting understanding of system capabilities/limitations, potentially with a way to quantify the gap between reported understanding and actual understanding.
Transparency and comprehensibility	Explanation clarity MOS: Mean Opinion Score (MOS) on clarity and interpretability of explanations. While traditional QoE assessment results are often reported as a MOS, additional statistical measures related to the distribution of scores in the target population are of interest, such as user diversity, uncertainty of user rating distributions, ratio of dissatisfied users, etc. [Hoßfeld et al., 2016].
Transparency and comprehensibility	Time to comprehension: average time for a non-expert to understand the meaning of a given output produced by the system.
Transparency and comprehensibility	Experienced interpretability: extent to which users feel that explanations meaningfully enhance their understanding of the system’s reasoning and limitations [Wehner et al., 2025].
Human oversight	Perceived controllability: MOS on ease of intervening or correcting system behavior.
Human oversight	Intervention success rate: % of interventions improving outcomes.
Human oversight	Trust calibration index: alignment between user confidence and actual system reliability.
Robustness and resilience to errors	Perceived reliability over time: longitudinal QoE measure of stability (for example, inspired by work on the longitudinal development of QoE, such as [Guse, D., 2016], [Cieplinska, 2023]).
Robustness and resilience to errors	Graceful degradation MOS: subjective quality under stress (e.g., noise, adversarial input).
Robustness and resilience to errors	Error recovery satisfaction: % of users satisfied with post-failure recovery.
Experience quality (holistic)	Overall satisfaction MOS: overall perceived quality of interaction with the AI system and factors influencing that experience quality (human, system, context, as discussed in [Reiter et al., 2014]).
Experience quality (holistic)	Smoothness of use: perceived fluidity, continuity, absence of frustration.
Experience quality (holistic)	Perceived usefulness and usability: e.g., adapted from widely-used SUS/UMUX-Lite scales [Lewis et al., 2013].
Experience quality (holistic)	Perceived response alignment: extent to which the system response aligns semantically and contextually with the prompt intent (particularly relevant for generative AI systems).
Experience quality (holistic)	Cognitive load: mental effort perceived during operation (e.g., adapted NASA-TLX [Hart & Staveland, 1988]).
Experience quality (holistic)	Perceived productivity impact: how users perceive the effect of AI system assistance on task efficiency and cognitive effort, reflecting findings from recent large-scale developer studies [Early-2025 AI, AI hampers Productivity].
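To make some of the metrics in the table concrete, the following sketch computes a MOS together with two distribution-aware measures, and one possible reading of a trust calibration index. It assumes ratings on a 1 to 5 scale; the threshold for "dissatisfied" (ratings of 2 or lower), the gap-based definition of trust calibration, and all function names are hypothetical choices for illustration, not definitions from the cited works.

```python
from statistics import mean, pstdev
from typing import Sequence


def opinion_score_summary(ratings: Sequence[int], dissatisfied_max: int = 2) -> dict:
    """Summarise 1-5 opinion scores beyond the mean."""
    if not ratings:
        raise ValueError("ratings must not be empty")
    if any(r < 1 or r > 5 for r in ratings):
        raise ValueError("ratings must be on a 1-5 scale")
    return {
        "mos": mean(ratings),
        # Population standard deviation of opinion scores (user diversity).
        "sos": pstdev(ratings),
        # Share of users at or below the assumed dissatisfaction threshold.
        "dissatisfied_ratio": sum(r <= dissatisfied_max for r in ratings) / len(ratings),
    }


def trust_calibration_gap(user_confidence: Sequence[float], system_correct: Sequence[bool]) -> float:
    """Mean stated user confidence (0-1) minus observed system accuracy.

    Positive values suggest over-trust, negative values under-trust (assumed reading).
    """
    if not user_confidence or len(user_confidence) != len(system_correct):
        raise ValueError("need one confidence value per observed outcome")
    accuracy = sum(system_correct) / len(system_correct)
    return mean(user_confidence) - accuracy


uniform = opinion_score_summary([3, 3, 3, 3, 3, 3])
polarised = opinion_score_summary([5, 5, 5, 1, 1, 1])
gap = trust_calibration_gap([0.9, 0.9, 0.8, 0.8], [True, False, True, False])
print(uniform, polarised, round(gap, 2))
```

The two rating sets share the same MOS but differ sharply in spread and in the share of dissatisfied users, which is the reason the table asks for measures beyond the mean.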

These examples illustrate how the QoE perspective can complement traditional performance indicators such as accuracy or robustness. They extend evaluation beyond technical correctness to include how people experience, trust and manage AI systems in operational environments. A further direction of interest is to explore and model the complex relationships between the identified QoE dimensions and the underlying system, context and human influence factors.

To better illustrate such complex relationships, it is useful to consider how technical and experiential dimensions interact dynamically in use. One particularly relevant example concerns how AI systems communicate confidence or uncertainty, and how this shapes users’ perceived trustworthiness, engagement and overall Quality of Experience.

Figure 2. Positive and negative feedback loop between confidence and QoE of AI systems.

While this is only one example among many possible human–AI interaction dynamics, it illustrates the kind of interrelation that still requires deeper understanding. As depicted in Figure 2, AI confidence calibration (based on the AI model) and the way this confidence or uncertainty is conveyed to users influence the users’ perceived trustworthiness of the AI system. This in turn affects the users’ own confidence, that is, the degree to which they trust their ability to understand, interpret, and effectively interact with the AI system. Poor calibration can trigger a negative feedback loop of mistrust and disengagement, while well-calibrated, transparent AI fosters a positive feedback loop that enhances trust, confidence, and effective human-AI collaboration. In the negative loop, overconfidence leads to low perceived trustworthiness and a strong QoE decline, while underconfidence results in moderate perceived trustworthiness and medium QoE, ultimately lowering user engagement. In contrast, the positive loop emerges when confidence is well-calibrated and aligns with accuracy, or when uncertainty is expressed transparently, leading to high trust, higher QoE, and stronger user engagement. User engagement and QoE are closely interrelated [Reichl et al., 2015], as higher engagement often reflects and reinforces a more positive overall experience.
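On the model side, one widely used way to quantify whether stated confidence aligns with accuracy is the expected calibration error (ECE). The sketch below is a plain-Python version with equal-width confidence bins; it is offered as one possible measurement under stated assumptions (binary correctness labels, ten bins by default), not as the method used by the authors, and it says nothing about how confidence is communicated to users.

```python
from typing import Sequence


def expected_calibration_error(
    confidences: Sequence[float],
    correct: Sequence[bool],
    n_bins: int = 10,
) -> float:
    """Weighted mean gap between stated confidence and accuracy per bin."""
    if not confidences or len(confidences) != len(correct):
        raise ValueError("need one correctness label per confidence value")
    bins = [[] for _ in range(n_bins)]
    for conf, ok in zip(confidences, correct):
        if not 0.0 <= conf <= 1.0:
            raise ValueError("confidences must lie in [0, 1]")
        # Equal-width bins; a confidence of exactly 1.0 goes into the last bin.
        index = min(int(conf * n_bins), n_bins - 1)
        bins[index].append((conf, ok))
    total = len(confidences)
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(1 for _, ok in members if ok) / len(members)
        ece += len(members) / total * abs(avg_conf - accuracy)
    return ece


# An overconfident system (95% stated, 25% correct) versus a calibrated one.
overconfident = expected_calibration_error([0.95] * 4, [True, False, False, False])
calibrated = expected_calibration_error([0.75] * 4, [True, True, True, False])
print(f"overconfident={overconfident:.2f} calibrated={calibrated:.2f}")
```

A low ECE covers only the model half of the loop in Figure 2; the perceived trustworthiness and QoE effects described above still have to be measured with users.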

Following this and similar examples, the bridge that now needs to be built is between the AI community’s focus on algorithmic performance and the QoE community’s expertise in human experience, bringing together two perspectives that have evolved largely in isolation, but are inherently complementary.

5. Conclusions: QoE as part of the missing link between AI systems and real-world experiences

Bridging the gap between how AI systems perform and how they are experienced is now one of the most pressing challenges in the field. The AI community has achieved extraordinary advances in model accuracy, scalability and efficiency, yet these metrics alone do not fully capture how systems behave in context—how they interact with people, support oversight or sustain trust under real operating conditions. The field of QoE, with its long tradition of measuring perceived quality, different experiential dimensions and usability, offers the conceptual and methodological tools needed to evaluate AI systems as experienced technologies, not merely as computational artefacts.

In this context, QoE of AI systems can be adapted from the original definition of QoE as proposed in [Qualinet, 2013] to read as: “The degree of delight or annoyance of a user resulting from interacting with an AI system. It results from how well the AI system fulfills the user’s expectations regarding usefulness, transparency, trustworthiness, comprehensibility, controllability, and reliability, considering the user’s goals, context, and cognitive state.”

Collaborative research between these domains can foster new interdisciplinary methodologies, shared benchmarks and evidence-based guidelines for assessing AI systems as they are used in the real world—not just as they perform in the lab or within classical accuracy-centred benchmarks. Building this shared evaluation culture is essential to advance trustworthy, human-centric AI, ensuring that future systems are not only intelligent but also understandable, reliable and aligned with human values.

This need is becoming increasingly urgent as, in many regions such as the EU, the principles of trustworthy AI are evolving from ethical aspirations into formal regulatory requirements, reinforcing the importance of robust, experience-based evaluation frameworks.

References

[AI Act, 2024] European Parliament & Council of the European Union. (2024). Regulation (EU) 2024/1689 of the European Parliament and of the Council of 13 June 2024 laying down harmonised rules on artificial intelligence and amending Regulations (EC) No 300/2008, (EU) No 167/2013, (EU) No 168/2013, (EU) 2018/858, (EU) 2018/1139 and (EU) 2019/1020 and Directives (EU) 2015/1535 and 2017/745 (Artificial Intelligence Act). Official Journal of the European Union, L 2024/1689. https://eur-lex.europa.eu/eli/reg/2024/1689/oj
[AI hampers Productivity] Experienced software developers assumed AI would save them a chunk of time. But in one experiment, their tasks took 20% longer. Available at: https://fortune.com/2025/07/20/ai-hampers-productivity-software-developers-productivity-study/
[Bach et al., 2024] Bach, T. A., Khan, A., Hallock, H., Beltrão, G., & Sousa, S. (2024). A systematic literature review of user trust in AI-enabled systems: An HCI perspective. International Journal of Human–Computer Interaction, 40(5), 1251-1266.
[Cieplinska, 2023] Cieplińska, N., Janowski, L., De Moor, K., & Wierzchoń, M. (2023). Long-Term Video QoE Assessment Studies: A Systematic Review. IEEE Access.
[CSIS, 2025] Center for Strategic and International Studies. (2025). France’s AI Action Summit. Available at: https://www.csis.org/analysis/frances-ai-action-summit
[Early-2025 AI] Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity. Available at: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/
[Egger-Lampl et al., 2019] Egger-Lampl, S., Hammer, F., & Möller, S. (2019). Towards an integrated view on QoE and UX: adding the Eudaimonic Dimension. ACM SIGMultimedia Records, 10(4), 5-5.
[European Commission, 2024] European Commission. (2023). C(2023)3215 – Standardisation request M/593 to the European Committee for Standardisation and the European Committee for Electrotechnical Standardisation in support of Union policy on artificial intelligence. Available at: https://ec.europa.eu/growth/tools-databases/enorm/mandate/593_en
[Gaudeul et al., 2024] Gaudeul, A., Arrigoni, O., Charisi, V., Escobar-Planas, M., & Hupont, I. (2024, October). Understanding the Impact of Human Oversight on Discriminatory Outcomes in AI-Supported Decision-Making. In 27th European Conference on Artificial Intelligence (pp. 19-24).
[Guse, D., 2016] Guse, D. (2017). Multi-episodic perceived quality of telecommunication services. PhD thesis, TU Berlin.
[Hammer et al., 2018] Hammer, F., Egger-Lampl, S., & Möller, S. (2018). Quality-of-user-experience: a position paper. Quality and User Experience, 3(1), 9.
[Hart & Staveland, 1988] Hart, S. G., & Staveland, L. E. (1988). Development of NASA-TLX (Task Load Index): Results of empirical and theoretical research. In Advances in psychology (Vol. 52, pp. 139-183). North-Holland.
[Hoßfeld et al., 2016] Hoßfeld, T., Heegaard, P. E., Varela, M., & Möller, S. (2016). QoE beyond the MOS: an in-depth look at QoE via better metrics and their relation to MOS. Quality and User Experience, 1(1), 2.
[Hynes et al., 2023] Hynes, E., Flynn, R., Lee, B., & Murray, N. (2023). A QoE evaluation of augmented reality for the informational phase of procedure assistance. Quality and User Experience, 8(1), 1.
[Lewis et al., 2013] Lewis, J. R., Utesch, B. S., & Maher, D. E. (2013, April). UMUX-LITE: when there’s no time for the SUS. In Proceedings of the SIGCHI conference on human factors in computing systems (pp. 2099-2102).
[Louradour & Madzou, 2021] Louradour, S. & Madzou, L. (2021). A policy framework for responsible limits on facial recognition, use case: Law enforcement investigations. In World Economic Forum, 2021.
[Mlynář et al., 2025] Mlynář, J., De Rijk, L., Liesenfeld, A., Stommel, W., & Albert, S. (2025). AI in situated action: a scoping review of ethnomethodological and conversation analytic studies. AI & society, 40(3), 1497-1527.
[Negri et al., 2024] Negri, P., Hupont, I., & Gomez, E. (2024, May). A framework for assessing proportionate intervention with face recognition systems in real-life scenarios. In 2024 IEEE 18th International Conference on Automatic Face and Gesture Recognition.
[OECD, 2024] OECD Legal Instruments. (2024). Recommendation of the Council on Artificial Intelligence.
[Oppermann et al., 2024] Oppermann, M., Schatz, R., Sackl, A., & Egger-Lampl, S. (2024, June). Virtual Forests, Real Skills: Assessing the QoE of VR-based Occupational Training and its Impact on Experience and Learning Outcomes. In 2024 16th International Conference on Quality of Multimedia Experience (QoMEX) (pp. 250-253). IEEE.
[Panigutti et al., 2023] Panigutti, C., Hamon, R., Hupont, I., Fernandez, D., Fano, D., Junklewitz, H., … & Gomez, E. (2023, June). The role of explainable AI in the context of the AI Act. In Proceedings of the 2023 ACM conference on fairness, accountability, and transparency (pp. 1139-1150).
[Park et al., 2023] Park, S., Kim, H. K., Park, J., & Lee, Y. (2023). Designing and evaluating user experience of an AI-based defense system. IEEE Access, 11, 122045-122056.
[Shneiderman, 2022] Shneiderman, B. (2022). Human-centered AI. Oxford University Press. Online ISBN: 9780191937583
[Qualinet, 2013] Qualinet White Paper on Definitions of Quality of Experience (2012). European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003), Patrick Le Callet, Sebastian Möller and Andrew Perkis, eds., Lausanne, Switzerland, Version 1.2, March 2013.
[Reichl et al., 2015] Reichl, P. et al. (2015). Towards a comprehensive framework for QoE and user behavior modelling. In 2015 seventh international workshop on quality of multimedia experience (QoMEX) (pp. 1-6). IEEE.
[Reiter et al., 2014] Reiter, U. et al. (2014). Factors Influencing Quality of Experience. In: Möller, S., Raake, A. (eds) Quality of Experience. T-Labs Series in Telecommunication Services. Springer, Cham.
[Soler et al., 2023] Soler, J., Tolan, S., Hupont, I., Fernandez, D., Charisi, V., Gomez, E., Junklewitz, H., Hamon, R., Fano, D. and Panigutti, C., AI Watch: Artificial Intelligence Standardisation Landscape Update, EUR 31343 EN, Publications Office of the European Union, Luxembourg, 2023, ISBN 978-92-76-60450-1, doi:10.2760/131984, JRC131155.
[Virvou, 2023] Virvou, M. (2023). Artificial Intelligence and User Experience in reciprocity: Contributions and state of the art. Intelligent Decision Technologies, 17(1), 73-125.
[WEF, 2025] World Economic Forum. (2025). AI Governance Alliance.
[Wehner et al., 2025] Wehner, N., Seufert, A., Hoßfeld, T. and Seufert, M. (2025). A Tutorial on Data-Driven Quality of Experience Modeling With Explainable Artificial Intelligence. IEEE Communications Surveys & Tutorials, doi: 10.1109/COMST.2025.3583227.

It is All About the Experience… My Highlights of QoMEX 2025 in Madrid

By Tobias Hossfeld | December 2, 2025 - 09:10 |December 2, 2025 0425, 0425, Event Report, Feature, QoE Column

Since my first QoMEX (international conference on Quality of Multimedia Experience) in 2015 (Costa Navarino, Greece), I have considered it my conference and the attendees, my research family. It has thus become my special yearly event to connect with familiar faces and meet the next generation of researchers in the field. This edition of QoMEX has brought together an outstanding program with very interesting keynotes, technical papers and demos (see https://qomex2025.itec.aau.at/ to check the full program). Moreover, it has been especially important for me both on a professional and a personal level. I would like to summarize my subjective Experience in 4 highlights:

Figure 1. Me explaining the working principles of eating 12 grapes at midnight for New Year, while walking through Puerta del Sol.

“Introducing my home city to my research family”

Madrid is my home “town”. A couple of times during the conference, one attendee or another asked me where I was from in Spain, and I proudly answered, “I am from here”. I spent the first 23 years of my life in Madrid before moving abroad for my professional career. Thus, while I am not literally a local, I can be considered one. During the conference, I had the opportunity to share my view and love of Madrid with my work family, which for me meant introducing my research family to my early life in Madrid.

“Paying tribute to Narciso García”

Figure 2. Narciso García posing with his (former) PhD students Marta Orduna, Pablo Pérez, Jesús Gutierrez and Carlos Cortés.

QoMEX 2025 also provided the opportunity to pay well-deserved tribute to one of the two general chairs, Narciso García, on his retirement. Narciso has had an incredible impact on the Quality of Experience community and beyond. Plenty of researchers (including myself), in the community and outside it, consider him a mentor and even their “spiritual guide”. When I talked with Pablo Pérez (the other general co-chair) during the conference, he described Narciso as having a solution for every issue, regardless of its size, complexity, or topic. Thank you, Narciso, for the insightful research discussions, the resourcefulness, the (history) chats, and for always being there, available for all of us.

“Mentoring the next generation of researchers”

In the final session of the conference, something very unexpected (and in my opinion very unusual) occurred. Attending the awards session is always exciting. On the one hand, you are 99.9% sure that you will not get any award. On the other hand, you always wonder “what if?”. This was definitely a “what if?” year for me. First, the Best Student Paper Award went to our work with my recently started PhD student Gijs Fiten, a very interesting work on locomotion in Virtual Reality. This was also his first conference, which made it even more special (both for him and for me). Before we had recovered from this first surprise, the Best Paper Award was announced. It went to Sam Van Damme, my former (first) PhD student, for a collaborative work with CWI (Centrum Wiskunde & Informatica) in Amsterdam about shared mental models. Details of both papers can be found in the appendix.

Seeing students that I mentored (and supervised) grow and achieve important goals in their research careers was more gratifying than winning any award myself.

“QoE researchers can easily walk in others’ shoes”

Figure 3. Reflecting our thoughts and feelings on the decoration of the bag.

As the cherry on the cake that QoMEX 2025 was, I had the wonderful gift of organizing, together with Marta Orduna (Nokia, Spain) and María Nava (Fundación Juan XXIII, Spain), a diversity and inclusion workshop at the Fundación Juan XXIII (https://qomex2025.itec.aau.at/workshop/ws-walking-in-their-shoes/). It took place on Friday, the 3rd of October. The Fundación (https://www.fundacionjuanxxiii.org/) is an organization that has been working for more than 55 years to promote the social and labor inclusion of people in situations of psychosocial vulnerability. With the help of their workers and users, we set up a workshop in which our researchers had to switch roles: they became the participants of a “hands-on” experience guided by people with different abilities. The activity consisted of manufacturing paper bags with the help and guidance of the experts of the Paper Lovers project (https://www.fundacionjuanxxiii.org/nuestros-proyectos).

Figure 4. Santi (on the left) is teaching Matteo (on the right) to manufacture a paper bag.

There was some initial insecurity and fear of the language barrier with our Spanish teachers. However, this passed quickly, and our QoE researchers adapted to the role of students and started manufacturing bags as if they had been doing it for the last five years. After the experience, our experts rated the quality of the bags with the typical paper review grading (accept, major revision, minor revision and reject). Finally, after lunch, with the expert guidance of Elena Marquez Segura (Universidad Carlos III), we reflected on the morning session and decorated our bags to express what we had learned about researching from an inclusive perspective. All in all, it was an experience session outside the usual constraints that our research imposes and a very fitting ending to a wonderful week.

Special thanks to Gijs Fiten (KU Leuven, Belgium), Sam Van Damme (Ghent University, Belgium), Marta Orduna (Nokia XR Lab, Spain), Martín Varela (Metosin, Finland), Karan Mitra (Luleå University of Technology, Sweden), Markus Fiedler (BTH, Sweden) and of course the organizing committee of QoMEX’25 led by Pablo Pérez (Nokia XR Lab, Spain) and Narciso García (ETSIT-UPM, Spain).

Appendix. Details of the Best Paper Awards at QoMEX 2025

Best Student Paper Award

Redirected Walking for Multi-User eXtended Reality Experiences with Confined Physical Spaces
G. Fiten, J. Chatterjee, K. Vanhaeren, M. Martens and M. Torres Vega
17th International Conference on Quality of Multimedia Experience (QoMEX), Madrid, Spain, 2025.

EXtended Reality (XR) applications allow the user to explore nearly infinite virtual worlds in a truly immersive way. However, wandering around through these Virtual Environments (VE)s while physically walking in reality is heavily constrained by the size of the Physical Environment (PE). Therefore, in the last years different techniques have been devised to improve locomotion in XR. One of these is Redirected Walking (RDW), which aims to find a balance between immersion and PE requirements by steering users away from the boundaries of the PE while allowing for arbitrary motion in the VE. However, current RDW methods still require large PEs, as to avoid obstacles and other users. Moreover, they introduce unnatural alterations in the natural path of the user, which can trigger perception anomalies, such as cybersickness or break of presence. These circumstances limit their usage in real life scenarios. This paper introduces a novel RDW algorithm, with the focus on allowing multiple users to explore an infinite VE in a confined space (6×6 m²). To evaluate it, we designed a multi-user Virtual Reality (VR) maze game, and benchmarked it against the state-of-the-art. A subjective study (20 participants) was conducted, where objective metrics, e.g., the path and the speed of the user, were combined with subjective perception analysis in terms of their cybersickness levels. Our results show that our method reduces the appearance of cybersickness in 80% of participants compared to the state-of-the-art. These findings show the applicability of RDW to multi-user VR with constrained environments.

Best Paper Award

From Individual QoE to Shared Mental Models: A Novel Evaluation Paradigm for Collaborative XR
S. Van Damme, J. Jansen, S. Rossi and P. Cesar
17th International Conference on Quality of Multimedia Experience (QoMEX), Madrid, Spain, 2025.

Extended Reality (XR) systems are rapidly shifting from isolated, single-user applications towards collaborative and social multi-user experiences. To evaluate the quality and effectiveness of such interactions, it is therefore required to move beyond traditional individual metrics such as Quality-of-Experience (QoE) or Sense of Presence (SoP). Instead, group-level dynamics such as effective communication, coordination etc. need to be encompassed to assess the shared understanding of goals and procedures. In psychology, this is referred to as a Shared Mental Model (SMM). The strength and congruence of such an SMM are known to be key for effective team collaboration and performance. In an immersive XR setting, though, novel Influence Factors (IFs) emerge that are not considered in a setting of physical co-location. Evaluations on the impact of these novel factors on SMM formation in XR, however, are close to non-existent. Therefore, this work proposes SMMs as a novel evaluation tool for collaborative and social XR experiences. To better understand how to explore this construct, we ran a prototypical experiment based on ITU recommendations in which the influence of asymmetric end-to-end latency is evaluated through a collaborative, two-user block building task. The results show how also in an XR context strong SMM formation can take place even when collaborators have fundamentally different responsibilities and behavior. Moreover, the study confirms previous findings by showing in an XR context that a team’s SMM strength is positively associated with its performance.

O QoE, Where Art Thou?

By Tobias Hossfeld | May 2, 2025 - 12:35 |June 4, 2025 0225, Feature, QoE Column

Once upon a time, when engineers measured networks in latency and packet loss, the idea of Quality of Experience (QoE) emerged — a myth whispered among researchers who dared to ask not what the system delivers, but what the user perceives. Decades later, QoE has evolved into a sprawling epic, spanning disciplines and domains, from humble MOS scores to immersive virtual realities. But as media experiences become ever more complex — adaptive, interactive, personalized — the question lingers: O QoE, where art thou?

1. Introduction

In this column, we revisit the notion of QoE and its evolution over time. We begin by reviewing early work from the 1990s to 2000s on the definitions of QoE (Section 2), where researchers first recognized the importance of user perception and the relevant QoE influence factors, as well as QoE modeling efforts. As a summary of this literature survey, QoE evolved from abstract notions of perception and satisfaction to a measurable, standardized concept encompassing the emotional, cognitive, and contextual responses of users to a service or application. The trends across time are:

1990s: Early focus on perception and interaction design.
Early 2000s: Growing focus on subjectivity, emotion, and context in user experience. QoE separated from QoS, emphasizing emotion, context, and expectation. Seen as key to commercial and user success.
Mid-2000s: Integration of technical and perceptual layers; need for metrics and quantification. Push for measurable models combining technical and user perspectives. Recognition of multiple definitions across domains.
Late 2000s–2010s: Standardization, recognition of multi-dimensionality, and development of cross-disciplinary definitions. QoE defined around subjective perception and system-wide impact.
2010s: Unified, multidisciplinary understanding established through initiatives like QUALINET; QoE as “delight or annoyance”.

This initial insight laid the foundation for larger initiatives like QUALINET, which helped to shape the field by providing widely accepted QoE definitions. We then examine how these developments have been formalized through standardization activities (Section 3), particularly within the ITU and the QUALINET whitepapers on the definition of QoE and immersive QoE.

The diverse and often conflicting definitions of QoE emerging in the 2000s highlighted the need for coordinated efforts and shared understanding across disciplines. This led to joint initiatives like QUALINET, which aimed to formalize and unify QoE research within a dedicated network. One of the results is the updated QoE definition, which has since been adopted in standardization.

2016: ITU-T Recommendation P.10/G.100 (2006) Amendment 5 (07/16), New Definitions for Inclusion in Recommendation ITU-T P.10/G.100, International Telecommunication Union, July 2016. “Quality of experience (QoE) is the degree of delight or annoyance of the user of an application or service”.

Figure 1. Timeline on the notion and definitions of QoE in literature and standardization.

A timeline of the literature survey and the early definitions of QoE as well as the standardization activities is visualized in Figure 1. Finally, we discuss selected open issues in QoE research (Section 4) that continue to challenge both academia and industry.

2. Early Definitions of QoE: 1990s to 2000s

The term Quality of Experience (QoE) emerged in the late 1990s to early 2000s as a response to the limitations of traditional network-centric approaches. Although Quality of Service (QoS) had already been formally defined in ITU-T Recommendation E.800 (1994) [ITU-T E.800] for telephony and established a basis for assessing service quality from both technical and user viewpoints, QoS primarily addresses performance at the network level. QoS is commonly applied within communication networks to describe a system’s ability to meet predefined performance targets, ensuring consistent data transmission through metrics such as bandwidth, latency, jitter, and packet loss [Varela2014].

In contrast, researchers and industry practitioners began to recognize the importance of how users actually perceive the quality of a service in the late 1990s to early 2000s. In this context, a variety of alternative terms were used prior to the standardization and definition of QoE, including User-Perceived Quality, Perceived Quality, End-User Quality, User-Experience Quality, Multimedia Experience Quality, Subjective Quality of Service, and user-level QoS. These early terms reflected a growing awareness of the need to evaluate digital services from the user’s point of view, ultimately leading to the coining and adoption of QoE as a distinct and essential concept in the field of communication systems and multimedia applications.

The term QoE brought attention to the user’s subjective perception, marking a shift in the mid-2000s toward evaluating service quality from the end-user’s perspective. In the following, a brief overview of the first documents on “Quality of Experience” or “QoE” is provided to sketch how the term was defined. In particular, research articles found in the ACM Digital Library and IEEE Xplore by searching for “Quality of Experience” or “QoE” are collected.

Focus on user perception and interaction design

1990: Harman, G. “The intrinsic quality of experience.” claims we’re not directly aware of our experiences’ intrinsic properties, but of those of the external objects they represent—like color, shape, texture, motion, and spatial relations.
1996: Austin Henderson. “What’s next?” explains the idea behind the ACM Award about QoE in interaction. “We really want to know what users experience! In short we are interested in the quality of a person’s experience in the interaction. […] factors contribute to the effective experience of interacting with the device.” However, no QoE definition is proposed.
1996: Lauralee Alben. “Quality of experience: defining the criteria for effective interaction design” is also related to the ACM interactions design award. “By ‘experience’ we mean all the aspects of how people use an interactive product: the way it feels in their hands, how well they understand how it works, how they feel about it while they’re using it, how well it serves their purposes, and how well it fits into the entire context in which they are using it. If these experiences are successful and engaging, then they are valuable to users and noteworthy to the interaction design awards jury. We call this ‘quality of experience’.” This early definition of QoE encompasses all aspects of a user’s interaction with a product, including its physical feel, usability, emotional impact, and the overall satisfaction derived from its use.
2000: Alan Turner and Lucy T. Nowell. “Beyond the desktop: diversity and artistry” relate QoE to the need for engaging, media-rich interactions across diverse devices, emphasizing the role of artistry in delivering compelling user experiences. A remarkable statement: “We also believe that the quality of experience will become the key metric of success for software, both commercially and socially.”

Focus on subjectivity, emotion, and context

2000: Marion Buchenau and Jane Fulton Suri. “Experience prototyping.” introduce a prototyping approach that immerses users in simulated interactions to explore and refine QoE, including sensory, emotional, and contextual dimensions beyond usability or function. QoE goes beyond usability or functionality, encompassing emotional and contextual factors.
2000: Anna Bouch, Allan Kuchinsky, and Nina Bhatti. “Quality is in the eye of the beholder: meeting users’ requirements for Internet quality of service.” They show that in Internet commerce, QoE depends on both technical QoS as well as user expectations and context. “Only through such integration of users’ requirements into systems design will it be possible to achieve the customer satisfaction that leads to the success of any commercial system.”
2001: Public slide set by Touradj Ebrahimi (2012) “Quality of Experience Past, Present and Future Trends”, presented 23 Nov 2012, refers to a definition of QoE as follows. “The degree of fulfillment of an intended experience on a given user – as defined by Touradj Ebrahimi, 2001”.
2002: Heddaya, A. S. “An economically scalable Internet” uses the term “QoE rather than quality of service because QoS is not necessary for QoE, and QoE is sufficient for successful service.”

Focus on measurable models combining technical and user perspectives

1994: Nahrstedt, K., & Smith, J., Ralf Steinmetz. “Mapping User Level QoS from a Single Parameter” aims at quantifying QoE. “The ‘satisfaction’ concept has been introduced to quantify the QoS provided by the system. The transformations required to both map the cost into satisfaction and then configure the system are then developed.”
2003: Siller, M., & Woods, J. C. “QoS arbitration for improving the QoE in multimedia transmission.” propose a QoE-aware framework that adapts QoS to real-time user perception for multimedia networks. They define QoE as “the user’s perceived experience of what is being presented by the Application Layer, where the application layer acts as a user interface front-end that presents the overall result of the individual Quality of Services”.
They also review related work of that time, taken from white papers that are no longer accessible:
- “A metric used for measuring the performance of this perceptual layer is Quality of Experience (QoE).”
- “QoE is referred to as; what a customer experiences and values to complete his tasks quickly and with confidence.”
- “QoE is considered as all the perception elements of the network and performance relative to expectations of the users/subscribers.”
- The QoE is defined as “the totality of the Quality of Service mechanisms, provided to ensure smooth transmission of audio and video over IP networks”.
2004: R. Jain. “Quality of Experience” asks the following questions. “But how do we quantitatively define the quality of experience? Can we extend QoS to QoE? What factors should we consider in developing measures for QoE?” He concludes with a remarkable statement. “In a sense, the challenges of QoE are nothing new. People in social sciences and marketing have always developed techniques to quantify people’s preferences and choices. That situation is similar to what goes into QoE.”
2004: Euro-NGI D.JRA.6.1.1 “State-of-the-art with regards to user-perceived Quality of Service and quality feedback” with Fiedler as lead for this deliverable reviews QoS from the user’s perspective. The notion of QoE is “The degree of satisfaction, i.e. the subjective quality, is influenced by the technical, objective quality stemming from the application and the interconnecting network(s). For this reason, subjective quality as perceived by the network has to be linked to objective, measurable quality, which is expressed in application and network performance parameters.”
2007: Hoßfeld, Tobias, Phuoc Tran-Gia, and Markus Fiedler. “Quantification of quality of experience for edge-based applications” provide a quantitative link between technical metrics and QoE. “Quality of Experience (QoE), a subjective measure from the user perspective of the overall value of the provided service or application”.

Diversity of definitions and interdisciplinarity

2007: Soldani, D., Li, M., & Cuny, R. “QoS and QoE management in UMTS cellular systems” define: “QoE is the term used to describe the perception of end-users on how usable the services are. […] The term ‘QoE’ refers to the perception of the user about the quality of a particular service or networks.” Notably, they already mentioned that “Browsing through the literature, one may find many different definitions for quality of end-user experience (QoE) and quality of service (QoS).”
2009: International Conference on Quality of Multimedia Experience (QoMEX) includes in the call for papers: “perceived user experience is psychological in nature and changes in different environmental conditions and with different multimedia devices.”

3. Definitions of QoE in Standardization

In standardization, the following definitions were introduced.

2007: ITU-T Rec. G.100/P.10 Amendment 1 (2007) New Appendix I – Definition of Quality of Experience (QoE). “The overall acceptability of an application or service, as perceived subjectively by the end user. NOTE 1: Quality of experience includes the complete end-to-end system effects (client, terminal, network, services infrastructure, etc.). NOTE 2: Overall acceptability may be influenced by user expectations and context.”
This definition has been superseded by the Qualinet Definition of QoE in 2016. It should be mentioned that acceptance and QoE are different concepts: acceptability refers more narrowly to whether a service or system is deemed “good enough” or usable under certain conditions. Approaches to link QoE and acceptance have been discussed in literature [Schatz2011, Hossfeld2016].
2008: ITU-T Recommendation E.800. “Definitions of terms related to quality of service” defines in as follows: “quality of service experienced/perceived by customer/user (QoSE): a statement expressing the level of quality that customers/users believe they have experienced. NOTE 1: The level of QoS experienced and/or perceived by the customer/user may be expressed by an opinion rating.”
2009: ETSI TR 102 643 V1.0.1 (2009-12) “Human Factors (HF); Quality of Experience (QoE) requirements for real-time communication services” defines QoE as “measure of user performance based on both objective and subjective psychological measures of using an ICT service or product”. It includes two notes on QoE: (1) Considers technical QoS, context, and measures both communication process and outcomes (e.g. effectiveness, satisfaction). (2) Uses objective (e.g. task time, errors) and subjective (e.g. perceived quality, satisfaction) psychological measures, depending on context.

2016: ITU-T Recommendation P.10/G.100 (2006) Amendment 5 (07/16), New Definitions for Inclusion in Recommendation ITU-T P.10/G.100, International Telecommunication Union, July 2016: “Quality of experience (QoE) is the degree of delight or annoyance of the user of an application or service”.

QUALINET White Paper on Definitions of Quality of Experience

QUALINET is the European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003 from 2010 to 2014, later a network that meets regularly at QoMEX) with the aim “to establish a strong network on Quality of Experience (QoE) with participation from both academia and industry” (https://www.cost.eu/actions/IC1003/). QUALINET was the driving force to further advance research in the context of QoE, producing three major, well-cited assets (among others), namely (1) QUALINET White Paper on Definitions of Quality of Experience, (2) QUALINET databases [QUALINET2019], and (3) QUALINET White Paper on Definitions of Immersive Media Experience (IMEx).

The white paper on definitions of QoE resulted from a consultation and collaborative writing process within the COST Action IC 1003, involving 38 authors, contributors, and editors from 18 countries. A first draft was discussed and improved at the 2012 QoE Dagstuhl Seminar [Fiedler2012]. The final definition of QoE reads:

“Quality of Experience (QoE) is the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and / or enjoyment of the application or service in the light of the user’s personality and current state.”
[QUALINET2013]

The white paper also defines influence factors (human, system, context) and features of QoE (level of direct perception, level of interaction, level of the usage situation, level of service) as well as the relationship between QoS and QoE, plus application areas, which allow “to provide specializations of a generally agreed definition of QoE pertaining to the respective application domain taking into account its requirements formulated by means of influence factors and features of QoE”.

QUALINET White Paper on Definitions of Immersive Media Experience (IMEx)

A follow-up white paper defines the QoE for immersive media as

“the degree of delight or annoyance of the user of an application or service which involves an immersive media experience. It results from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user’s personality and current state.”
[QUALINET2020]

IMEx is defined as

“a high-fidelity simulation provided and communicated to the user through multiple sensory and semiotic modalities. Users are emplaced in a technology-driven environment with the possibility to actively partake and participate in the information and experiences dispensed by the generated world.”
[QUALINET2020]

Consequently, this white paper provides a “toolbox for definitions of IMEx including its Quality of Experience, application areas, influencing factors, and assessment methods.” [QUALINET2020].

4. Open Issues in QoE Research

We would like to conclude with some open issues regarding Quality of Experience. The upcoming 6G standard presents significant opportunities, such as QoE-aware orchestration of edge computing, cloud rendering, and network slicing [Tondwalkar2024] and native AI in 6G [Ziegler2020], while the trade-off between QoE and CO2 emissions also needs to be considered [Hossfeld2023]. As AI-generated content continues to rise, the evaluation of its quality remains in its early stages. The same applies to learning-based codecs, where existing quality assessment methods, both objective and subjective, are reaching their limits, particularly concerning media authenticity, which is becoming a critical issue. In this context, ethics and privacy are paramount, as user data plays a central role in QoE modeling. Future research must focus on privacy-preserving methods for QoE measurement and personalization. Finally, new modalities such as point clouds, light fields, and holograms necessitate the adaptation of existing techniques or the development of new methods. Moreover, multimodal or multisensory QoE, particularly concerning audio-visual-haptic or olfactory integration (previously referred to as Mulsemedia), is emerging as an important area that requires tailored QoE assessment methods and metrics. This is also reflected by the upcoming 17th Int. Conf. on Quality of Multimedia Experience (QoMEX’25) under the theme “Thinking of a QoE (r)evolution”. In particular, the call for papers requests: “On the edge of QoMEX ‘coming of age’, it is time to rethink the purpose and methods of QoE research: cross-fertilizing with adjacent fields, reaching more diverse populations, or exploring novel techniques and paradigms.” This covers innovative approaches and novel paradigms in QoE research, technological innovations in the era of big data and AI, and user-centricity in 6G.
Interdisciplinary links in QoE include diversity, ethics, and accessibility, but also novel interaction techniques and multimedia experiences. Specific applications such as gaming, healthcare, education, and immersive technologies, as well as multisensory perception, are in scope.

And so, like any true odyssey, the search for Quality of Experience continues — not as a destination, but as a path we shape with every interaction, every pixel tuned, every user understood. QoE is no longer a myth, but neither is it fully found. It lives at the intersection of perception and precision, where engineers meet psychologists, and systems learn to listen. In a world of immersive media and intelligent networks, perhaps the better question is no longer “O QoE, where art thou?” but rather — “Are we ready to meet it where it truly resides?”

References

[Alben1996] Lauralee Alben. 1996. Quality of experience: defining the criteria for effective interaction design. interactions 3, 3 (May/June 1996), 11–15. https://doi.org/10.1145/235008.235010
[Bouch2000] Anna Bouch, Allan Kuchinsky, and Nina Bhatti. 2000. Quality is in the eye of the beholder: meeting users’ requirements for Internet quality of service. In Proceedings of the SIGCHI conference on Human Factors in Computing Systems (CHI ’00). Association for Computing Machinery, New York, NY, USA, 297–304. https://doi.org/10.1145/332040.332447
[Buchenau2000] Marion Buchenau and Jane Fulton Suri. 2000. Experience prototyping. In Proceedings of the 3rd conference on Designing interactive systems: processes, practices, methods, and techniques (DIS ’00). Association for Computing Machinery, New York, NY, USA, 424–433. https://doi.org/10.1145/347642.347802
[Ebrahimi2001] Public slide set by Touradj Ebrahimi (2012) “Quality of Experience Past, Present and Future Trends”, presented at Alpen-Adria-Universität Klagenfurt, 23 Nov 2012
ETSI TR 102 643 V1.0.1 (2009-12) “Human Factors (HF); Quality of Experience (QoE) requirements for real-time communication services”
[EuroNGI2004] Euro-NGI D.JRA.6.1.1: State-of-the-art with regards to user-perceived Quality of Service and quality feedback, Deliverable version No: 1.0, Sending date: 31/05-2004, Lead: Markus Fiedler, BTH Karlskrona. https://www.diva-portal.org/smash/get/diva2:837296/FULLTEXT01.pdf Last accessed: 2025/04/22
[Fiedler2012] Markus Fiedler, Sebastian Möller, and Peter Reichl. Quality of Experience: From User Perception to Instrumental Metrics (Dagstuhl Seminar 12181). In Dagstuhl Reports, Volume 2, Issue 5, pp. 1-25, Schloss Dagstuhl – Leibniz-Zentrum für Informatik (2012) https://doi.org/10.4230/DagRep.2.5.1
[Harman1990] Harman, G. (1990). The intrinsic quality of experience. Philosophical perspectives, 4, 31-52. https://doi.org/10.2307/2214186
[Heddaya2002] Heddaya, A. S. (2002). An economically scalable Internet. Computer, 35(9), 93-95. https://doi.org/10.1109/MC.2002.1033035
[Henderson1996] Austin Henderson. 1996. What’s next?—growing the notion of quality. Interactions 3, 3 (May/June 1996), 56–59. https://doi.org/10.1145/235008.235019
[Hestnes2009] Hestnes, B., Brooks, P., Heiestad, S. (2009). “QoE (Quality of Experience) – measuring QoE for improving the usage of telecommunication services”, Telenor R&I R 21/2009.
[Hossfeld2007] Hoßfeld, Tobias, Phuoc Tran-Gia, and Markus Fiedler. “Quantification of quality of experience for edge-based applications.” International Teletraffic Congress. Berlin, Heidelberg: Springer Berlin Heidelberg, 2007. https://doi.org/10.1007/978-3-540-72990-7_34
[Hossfeld2016] Hoßfeld, T., Heegaard, P. E., Varela, M., & Möller, S. (2016). QoE beyond the MOS: an in-depth look at QoE via better metrics and their relation to MOS. Quality and User Experience, 1, 1-23. https://doi.org/10.1007/s41233-016-0002-1
[Hossfeld2023] Hoßfeld, T., Varela, M., Skorin-Kapov, L., & Heegaard, P. E. (2023). A Greener Experience: Trade-Offs between QoE and CO 2 Emissions in Today’s and 6G Networks. IEEE communications magazine, 61(9), 178-184. https://doi.org/10.1109/MCOM.006.2200490
[ITU-T E.800] E.800: “Terms and definitions related to quality of service and network performance including dependability”. ITU-T Recommendation. August 1994. Updated September 2008 as “Definitions of terms related to quality of service”. Last access: 2025/04/22
[ITU-T G.100/P.10 2007] ITU-T Rec. G.100/P.10 Amendment 1 (2007) New Appendix I—Definition of Quality of Experience (QoE). International Telecommunication Union, Geneva.
[Nahrstedt1994] Nahrstedt, K., & Smith, J., Ralf Steinmetz (Ed), 1994, “Service Kernel for Multimedia Endpoints”, Multimedia: Advanced Teleservices and High-speed Communication Architectures, Lecture Notes in Computer Science LNCS868, chapter I, pp. 8-22, Springer Verlag. https://doi.org/10.1007/3-540-58494-3_2
[QUALINET2013] Patrick Le Callet, Sebastian Möller and Andrew Perkis, eds., Qualinet White Paper on Definitions of Quality of Experience (2012). European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003) Lausanne, Switzerland, Version 1.2, March 2013. Last access: 2025/04/22
[QUALINET2019] Karel Fliegel, Lukáš Krasula, and Werner Robitza. 2022. Qualinet databases: central resource for QoE research – history, current status, and plans. SIGMultimedia Rec. 11, 3, Article 5 (September 2019), 1 pages. https://doi.org/10.1145/3524460.3524465
[QUALINET2020] Perkis, A., Timmerer, C., et al., “QUALINET White Paper on Definitions of Immersive Media Experience (IMEx)”, European Network on Quality of Experience in Multimedia Systems and Services, 14th QUALINET meeting (online), May 25, 2020. https://arxiv.org/abs/2007.07032
[Richards1998] Richards, A., Rogers, G., Witana, V., & Antoniades, M., 1998, “Mapping User Level QoS from a Single Parameter”, In Proceedings of the International Conference on Multimedia Networks and Services (MMNS ’98).
[Schatz2011] Schatz, R., Egger, S., & Platzer, A. (2011, June). Poor, good enough or even better? bridging the gap between acceptability and qoe of mobile broadband data services. In 2011 IEEE International Conference on Communications (ICC) (pp. 1-6). IEEE. https://doi.org/10.1109/icc.2011.5963220
[Siller2003] Siller, M., & Woods, J. C. (2003, July). QoS arbitration for improving the QoE in multimedia transmission. In International Conference on Visual Information Engineering (VIE 2003). Ideas, Applications, Experience (pp. 238-241). London UK: IEE. https://doi.org/10.1049/cp:20030531
[Soldani2006] Soldani, D., Li, M., & Cuny, R. (Eds.). (2007). QoS and QoE management in UMTS cellular systems. John Wiley & Sons. https://doi.org/10.1002/9780470034057
[Tondwalkar2024] Tondwalkar, A., Andres-Maldonado, P., Chandramouli, D., Liebhart, R., Moya, F. S., Kolding, T., & Perez, P. (2024). Provisioning Quality of Experience in 6G Networks. IEEE Access. https://doi.org/10.1109/ACCESS.2024.3455938
[Turner2000] Alan Turner and Lucy T. Nowell. 2000. Beyond the desktop: diversity and artistry. In CHI ’00 Extended Abstracts on Human Factors in Computing Systems (CHI EA ’00). Association for Computing Machinery, New York, NY, USA, 35–36. https://doi.org/10.1145/633292.633317
[Varela2014] Varela, M., Skorin-Kapov, L., & Ebrahimi, T. (2014). Quality of service versus quality of experience. In Quality of Experience: Advanced Concepts, Applications and Methods (pp. 85-96). Cham: Springer International Publishing. https://doi.org/10.1007/978-3-319-02681-7_6
[Ziegler2020] Ziegler, V., Viswanathan, H., Flinck, H., Hoffmann, M., Räisänen, V., & Hätönen, K. (2020). 6G architecture to connect the worlds. IEEE Access, 8, 173508-173520. https://doi.org/10.1109/ACCESS.2020.3025032

Challenges in Experiencing Realistic Immersive Telepresence

By Tobias Hossfeld | March 17, 2025 - 14:45 |April 23, 2025 0125, 0125, Feature, QoE Column

Immersive imaging technologies offer a transformative way to change how we experience interacting with remote environments, i.e., telepresence. By leveraging advancements in light field imaging, omnidirectional cameras, and head-mounted displays, these systems enable realistic, real-time visual experiences that can revolutionize how we interact with the remote scene in fields such as healthcare, education, remote collaboration, and entertainment. However, the field faces significant technical and experiential challenges, including efficient data capture and compression, real-time rendering, and quality of experience (QoE) assessment. Expanding on the findings of the authors’ recent publication and situating them within a broader theoretical framework, this article provides an integrated overview of immersive telepresence technologies, focusing on their technological foundations, applications, and the challenges that must be addressed to advance this field.

1. Redefining Telepresence Through Immersive Imaging

Telepresence is defined as the “sense of being physically present at a remote location through interaction with the system’s human interface” [Minsky1980]. Such virtual presence is made possible by digital imaging systems and real-time communication of visuals and interaction signals. Immersive imaging systems such as light fields and omnidirectional imaging enhance the visual sense of presence, i.e., “being there” [IJsselsteijn2000], with photorealistic recreation of the remote scene. This emerging field has seen rapid growth, both in research and development [Valenzise2022], due to advancements in imaging and display technologies, combined with increasing demand for interactive and immersive experiences. Figure 1 contrasts a telepresence system that uses traditional cameras and controls with an immersive telepresence system.

Figure 1 – A side-by-side visualization of a traditional telepresence system (left) and an immersive telepresence system (right).

The experience of “presence” consists of three components according to Schubert et al. [Schubert2001], which are renamed in this article to take into account other definitions:

Realness – “Realness” [Schubert2001] or “realism” [Takatalo2008] of the environment (i.e., in this case, the remote scene) relates to the “believability, the fidelity and validity of sensory features within the generated environments, e.g., photorealism.” [Perkis2020].
Immersion – User’s level of “involvement” [Schubert2001] and “concentration to the virtual environment instead of real world, loss of time” [Takatalo2008]. “The combination of sensory cues with symbolic cues essential for user emplacement and engagement” [Perkis2020].
Spatiality – An attribute of the environment that helps in “transporting” the user to induce spatial awareness [Schubert2001], which allows “spatial presence” [Takatalo2008] and “the possibility for users to move freely and discover the world offered” [Perkis2020].

Immersion can happen without realness or spatiality, for example, while reading a novel. Telepresence using traditional imaging systems might not be immersive if the display is relatively small and other distractors are present in the visual field. Realistic immersive telepresence necessitates higher degrees of freedom (e.g., 3DoF+ or 6DoF) compared to a telepresence application with a traditional display. In this context, new view synthesis methods and spherical light field representations (cf. Section 3) will be crucial in giving correct depth cues and depth perception, which will increase realness and spatiality tremendously.

The rapid progress of immersive imaging technologies and their adoption can largely be attributed to advancements in processing and display systems, including light field displays and extended reality (XR) headsets. These XR headsets are becoming increasingly affordable while delivering excellent user experiences [Jackson2023], paving the way for the widespread adoption of immersive communication and telepresence applications in the near future. To further accelerate this transition, extensive efforts are being undertaken in both academia as well as industry.

The visual realism (i.e., realness) in realistic immersive telepresence relies on acquired photos rather than computer-generated imagery (CGI). In healthcare, it enables realistic remote consultations and surgical collaborations [Wisotzky2025]. In education and training, it facilitates immersive, location-independent learning environments [Kachach2021]. Similarly, visual realism can enhance remote collaboration by creating lifelike meeting spaces, while in media and entertainment, it can provide unprecedented realism for live events and performances, offering users a closer connection and a feeling of being present at remote sites.

This article provides a brief overview of the technological foundations, applications, and challenges in immersive telepresence. Its novel contribution is a theoretical framework for realistic immersive telepresence, informed by prior literature, within which the findings of the authors’ recent publication [Zerman2024] are positioned. It explores how foundational technologies like light field imaging and real-time rendering drive the field forward, while also identifying critical obstacles, such as dataset availability, compression efficiency, and QoE evaluation.

2. Technological Foundations for Immersive Telepresence

A realistic immersive telepresence can be made possible by enabling its main defining factors of realness (e.g., photorealism), immersion, and spatiality. Although these factors can be satisfied with other modalities (e.g., spatial audio), this article focuses on the visual modality and visual recreation of the remote scene.

2.1 Immersive Imaging Modalities

Immersive imaging technologies encompass a wide range of methods aimed at capturing and recreating realistic visual and spatial experiences. These include light fields, omnidirectional images, volumetric videos using either point clouds or 3D meshes, holography, multi-view stereo imaging, neural radiance fields, Gaussian splats, and other extended reality (XR) applications — all of which contribute to recreating highly realistic and interactive representations of scenes and environments.

Light fields (LF) are vector fields of all the light rays passing through a given region in space, describing the intensity and direction of light at every point. This is fully described through the plenoptic function [Adelson1991] as follows: P(x,y,z,θ,ϕ,λ,t), where x, y, and z describe the 3D position of sampling, θ and ϕ are the angular direction, λ is the wavelength of the light ray, and t is time. Traditionally, LFs are represented using the two-plane parametrization [Levoy1996] with 2 spatial dimensions and 2 angular dimensions; however, this parametrization limits the use case of LFs to processing planar visual stimuli. The plenoptic function can be leveraged beyond the two-plane parameterization for a highly detailed view reconstruction or view synthesis. Newer capture scenarios and representations enable increased immersion with LFs [Overbeck2018],[Broxton2020], which can be further advanced in the future.
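As a rough illustration of the two-plane parametrization, the sketch below maps a ray, given by an origin and a direction, to four coordinates (u, v, s, t) by intersecting the line through it with two parallel planes. The plane positions (z = 0 and z = 1) and the function name are assumptions made for this example and are not taken from [Levoy1996]. Rays that travel parallel to the planes have no such coordinates, which is one way to see why this parametrization suits planar setups.

```python
from typing import Optional, Tuple

Vec3 = Tuple[float, float, float]


def ray_to_two_plane(origin: Vec3, direction: Vec3,
                     z_uv: float = 0.0, z_st: float = 1.0
                     ) -> Optional[Tuple[float, float, float, float]]:
    """Intersect the line through a ray with the planes z = z_uv and z = z_st.

    Returns (u, v, s, t), or None when the ray is parallel to the planes
    and therefore cannot be represented in this parametrization.
    """
    ox, oy, oz = origin
    dx, dy, dz = direction
    if abs(dz) < 1e-12:
        return None
    # Line parameter at which each plane is reached.
    k_uv = (z_uv - oz) / dz
    k_st = (z_st - oz) / dz
    u, v = ox + k_uv * dx, oy + k_uv * dy
    s, t = ox + k_st * dx, oy + k_st * dy
    return (u, v, s, t)


# A ray along the z-axis hits both planes at the same (x, y) position.
print(ray_to_two_plane((0.2, 0.3, -1.0), (0.0, 0.0, 1.0)))
# A sideways ray never crosses the planes.
print(ray_to_two_plane((0.0, 0.0, 0.5), (1.0, 0.0, 0.0)))
```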

Omnidirectional image (or video) representation can provide an all-encompassing 360-degree view of a scene from a point in space for immersive visualization [Yagi1999], [Maugey2023]. This is made possible by stitching multiple views together. The created spherical image can be stored using traditional image formats (i.e., 2D planar formats) by projecting the sphere to planar format (e.g., equirectangular projection, cubemap projection, and others); however, processing these special representations without proper consideration for their spherical nature results in errors or biases.
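To make the point about the spherical nature concrete, the following minimal sketch assumes an equirectangular image stored as a list of rows. It maps pixel centres to longitude and latitude and weights each row by the cosine of its latitude, which approximates the sphere area a pixel covers. The function names and the pixel-centre convention are assumptions for illustration only. A plain average treats the heavily oversampled rows near the poles like rows at the equator, whereas the weighted average does not.

```python
import math
from typing import List, Tuple


def pixel_to_lon_lat(col: int, row: int, width: int, height: int
                     ) -> Tuple[float, float]:
    """Map an equirectangular pixel centre to (longitude, latitude) in radians."""
    lon = ((col + 0.5) / width - 0.5) * 2.0 * math.pi  # -pi .. pi
    lat = (0.5 - (row + 0.5) / height) * math.pi       # +pi/2 (top) .. -pi/2
    return lon, lat


def row_weights(height: int) -> List[float]:
    """Relative sphere area covered by a pixel in each row (cosine of latitude)."""
    return [math.cos((0.5 - (r + 0.5) / height) * math.pi) for r in range(height)]


def sphere_weighted_mean(image: List[List[float]]) -> float:
    """Mean of per-pixel values, weighted by the sphere area each pixel covers."""
    weights = row_weights(len(image))
    num = sum(w * sum(row) for w, row in zip(weights, image))
    den = sum(w * len(row) for w, row in zip(weights, image))
    return num / den


# Per-pixel error map with errors only in the rows closest to the poles.
error_map = [[1.0, 1.0], [0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
plain_mean = sum(map(sum, error_map)) / 8
print(plain_mean, sphere_weighted_mean(error_map))
```

The weighted mean comes out lower than the plain mean here, because the erroneous pixels sit in rows that cover little area on the sphere.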

2.2 Processing Requirements for Realistic Immersive Telepresence

Immersive telepresence relies on capturing, transmitting, and rendering realistic representations of remote environments. “Capturing” can be considered an inherent part of the imaging modalities discussed in the previous section. For transmitting and rendering, there are different requirements to take into account.

Compression is an important step for telepresence, which relies heavily on real-time transmission of visual data from the remote scene. Its importance increases further for immersive telepresence applications, as immersive imaging modalities capture (and represent) more information and therefore need more compression than telepresence using traditional 2D imaging systems. Compression of LFs [Stepanov2023], omnidirectional images and video [Croci2020], and other forms of immersive video such as MPEG Immersive Video [Boyce2021], as well as volumetric 3D representations based on point clouds [Graziosi2020] and textured 3D meshes [Marvie2022], has been an active research topic over the last decade, which has led to the standardization of compression methods for some immersive imaging modalities.

Rendering [Eisert2023], [Maugey2023] is another important aspect, especially for LFs [Overbeck2018]. The LF data needs to be rendered correctly for the position of the viewer (i.e., by rendering interpolated or extrapolated views) to provide a realistic and immersive experience to the user. Without such view rendering, the displayed visuals will appear jittery, which makes it harder to sustain the “suspension of disbelief” necessary for an immersive experience. Furthermore, this rendering has to run in real time, as this is a requirement for telepresence. Although technologies such as GPU acceleration and advanced compression algorithms help to keep interaction seamless and latency low, the quality and the realness of the rendered remote scene remain open problems.

Immersive telepresence systems rely on specialized hardware, including omnidirectional cameras, head-mounted displays, and motion tracking systems. These components must work in harmony to deliver high-quality, immersive experiences. Falling prices and increasing availability of such specialized devices make them easier to deploy in industrial settings [Jackson2023] regardless of business size and enable the democratization of immersive imaging applications in a broader sense.

3. Efforts in Creating a Realistic Immersive Telepresence Experience

Creating an immersive telepresence system has been the topic of many scholarly studies. These include frameworks for group-to-group telepresence [Beck2013], capture and delivery frameworks for volumetric 3D models [Fechteler2013], and various other social XR applications [Cortés2024]. Google’s Project Starline can also be mentioned here, as it delivers visuals with realness and immersion and thereby creates an immersive experience [Lawrence2024], [Starline2025], although its main functionality is interpersonal video communication. In supporting realness, LFs [Broxton2020] and other types of neural representations [Suhail2022] can create views that reproduce reflections and similar non-Lambertian light-material interactions occurring in the remote scene, whereas the usual assumption when texturing reconstructed 3D objects is that materials are Lambertian [Zhi2020].

Light field reconstruction [Gond2023] and new view synthesis from a single view [Lin2023] or sparse views [Chibane2021] can be a valid way to approach creating realistic immersive telepresence experiences. Various representations can be used to recreate the views that support movement of the user and the spatial awareness factor of presence in the remote scene. These representations include Multi-Planar Image (MPI) [Srinivasan2019], Multi-Cylinder Image (MCI) [Waidhofer2022], layered mesh representation [Broxton2020], and neural representations [Chibane2021], [Lin2023], [Gond2023], all of which rely on structured or unstructured 2D image captures of the remote scene.
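Layered representations such as MPIs are commonly rendered by blending semi-transparent layers from back to front. The sketch below shows this "over" compositing for a single pixel with grayscale colour. It is a simplified illustration and not the implementation of any of the cited methods. The warping of each layer to the target viewpoint, which has to happen before blending, is omitted, and the names are chosen for this example.

```python
from typing import List, Tuple

Layer = Tuple[float, float]  # (colour, alpha) of one layer at a given pixel


def composite_back_to_front(layers: List[Layer]) -> float:
    """Blend layers ordered from farthest to nearest with the 'over' operator."""
    out = 0.0
    for colour, alpha in layers:
        out = alpha * colour + (1.0 - alpha) * out
    return out


# An opaque white background behind a half-transparent black layer.
print(composite_back_to_front([(1.0, 1.0), (0.0, 0.5)]))
```

Because each layer contributes according to its opacity, alpha values that are slightly wrong across layers can show up as semi-transparent copies of an object.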

Another way of creating a realistic immersive experience can be by combining the different imaging modalities – i.e., omnidirectional content and light fields – in the form of spherical light fields (SLFs). SLFs then enable rendering and view synthesis that can generate more realistic and immersive content. There have been various attempts to create SLFs by collecting linear captures vertically [Krolla2014], capturing omnidirectional content from the scene with multiple cameras [Maugey2019], and moving a single camera in a circular trajectory and utilizing deep neural networks to generate an image grid [Lo2023]. Nevertheless, these works either did not yield publicly available datasets or did not have precise localizations of the cameras. To address this, the Spherical Light Field Database (SLFDB) was introduced in previous work [Zerman2024], which provides a foundational dataset for testing and developing applications for realistic immersive telepresence applications.

4. Challenges and Limitations

Studies in creating realistic immersive telepresence environments showed that there are still certain challenges and limitations that need to be addressed to improve QoE and IMEx for these systems. These challenges include dataset availability, compression of the structured and unstructured LFs, new view synthesis and rendering, and QoE estimation. Most of these challenges are also discussed in our recent study [Zerman2024].

Figure 2 – A set of captures highlighting the effects of dynamically changing scene: lighting change and its effect on white balance (top) and dynamic capture environment, where people appear and disappear (bottom).

Datasets relevant to realistic immersive telepresence tasks, such as the SLFDB [Zerman2024], are crucial for developing and validating immersive telepresence technologies. However, the creation and use of such datasets with precise spatial and angular resolution and very precise positioning of the camera face significant hurdles. Traditional camera grid setups are ineffective for capturing spherical light fields due to occlusions. This challenge necessitates having static scenes and meticulous camera positioning for a consistent capture of the scene. A dynamic scene brings a risk of non-consistent views within the same light field, as shown in Figure 2, which is non-ideal. These challenges highlight the critical need for innovative approaches to spherical light field dataset generation and sharing, ensuring future advancements in the field. Additionally, variations in lighting present significant challenges when capturing spherical light fields, as they impact the scene’s dynamic range, white balance, and color grading, which creates yet another challenge in database creation. Brightness and color variations, such as sunlight’s yellow tint compared to cloudy daylight, are not easy to correct and often require advanced algorithms for adjustment. Capturing static outdoor scenes remains a challenge for future work, as they still encounter lighting-related issues despite lacking movement.

LF compression is another challenge that requires attention once imaging modalities are combined. The JPEG Pleno compression algorithm [ISO2021] is adapted for 2-dimensional grid-like structured LFs (e.g., LFs captured by microlens array or structured camera grids) and does not work for linear or unstructured captures. The situation is the same for many other compression methods, as most of them require some form of structured representation. Considering how well scene regression and other new view synthesis algorithms can adapt to unstructured inputs, one can also see the importance of advancing the compression field for unstructured LFs (e.g., the volume of light captured by cameras in various positions or in-the-wild user captures). Furthermore, such an LF compression method needs to run in real time to support immersive telepresence applications while delivering a visual QoE that does not impede realism.

Figure 3 – Strong artifacts created at the extremes of view synthesis with a large baseline (i.e. 30cm), where either the scene is warped (left – 360ViewSynth), or strong ghosting artifacts occur (right – PanoSynthVR).

Current new view synthesis methods are primarily designed to handle small baselines, typically just a few centimeters, and face significant challenges when applied to larger baselines required in telepresence applications. Challenges such as ghosting artifacts and unrealistic distortions (e.g., nonlinear distortions, stretching) occur when interpolating views, particularly for larger baselines, as shown in Figure 3. A recent comparative evaluation of PanoSynthVR and 360ViewSynth [Zerman2024] reveals that while 360ViewSynth marginally outperforms PanoSynthVR on average quality metrics, the scores for both methods remain suboptimal. PanoSynthVR struggles with large baselines, exhibiting prominent layer-like ghosting artifacts due to limitations in its MCI structure. Although 360ViewSynth produces visually better results, closer inspection shows that it distorts object perspectives by stretching them rather than accurately rendering the scene, leading to an unnatural user experience. These findings underscore the limitations of current state-of-the-art view synthesis methods for SLFs and highlight the complexity of addressing larger baselines effectively in view synthesis.
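A back-of-the-envelope calculation indicates why larger baselines are harder. The sketch assumes a scene point straight ahead of the camera at a given depth and a camera that moves sideways by the baseline; the point's viewing direction then changes by atan(baseline / depth). The depth of 1 m and the image width of 4096 pixels are illustrative assumptions, not values from [Zerman2024].

```python
import math


def angular_parallax_deg(baseline_m: float, depth_m: float) -> float:
    """Change in viewing direction (degrees) of a point after a sideways camera shift."""
    return math.degrees(math.atan2(baseline_m, depth_m))


def equirect_shift_px(baseline_m: float, depth_m: float, width_px: int) -> float:
    """Approximate horizontal shift along the equator of an equirectangular image."""
    return angular_parallax_deg(baseline_m, depth_m) / 360.0 * width_px


for baseline in (0.03, 0.30):  # a few centimetres vs. the 30 cm case
    angle = angular_parallax_deg(baseline, depth_m=1.0)
    shift = equirect_shift_px(baseline, depth_m=1.0, width_px=4096)
    print(f"baseline {baseline:.2f} m: {angle:.1f} deg, about {shift:.0f} px")
```

Under these assumptions, nearby content moves by close to ten times as many pixels for the 30 cm baseline as for a 3 cm baseline, so a synthesis method has to account for much larger displacements and disocclusions.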

Assessing user satisfaction and immersion in telepresence systems is a multidimensional challenge, requiring assessments in three different strands as described in the IMEx white paper: subjective assessment, behavioral assessment, and assessment via psycho-physiological methods [Perkis2020]. Quantitative metrics can be used for interaction latency and task performance in a user study, and individual preferences and experiences can be collected qualitatively. Certain aspects of user experience, such as visual quality and user engagement, can also be collected as quantitative data during user studies through user self-reporting. Additionally, behavioral assessment (e.g., user movement, interaction patterns) can be used to identify different use patterns. Here, the limiting factor is mainly the time and experience needed to run such user studies. Therefore, the challenge is to prepare a framework that can model the user experience for realistic immersive telepresence scenarios, which can speed up the assessment.
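As a small example of how self-reported quantitative data can be summarized, the sketch below computes a mean opinion score (MOS) with an approximate 95% confidence interval based on the normal approximation. The ratings are illustrative values on a 5-point scale and not data from any study, and the function name is hypothetical. For small panels, a Student-t quantile would be the more careful choice than z = 1.96.

```python
import math
import statistics
from typing import Sequence, Tuple


def mos_with_ci(ratings: Sequence[float], z: float = 1.96) -> Tuple[float, float]:
    """Return (mean opinion score, half-width of the approximate 95% CI)."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    mos = statistics.mean(ratings)
    half_width = z * statistics.stdev(ratings) / math.sqrt(n)
    return mos, half_width


# Illustrative ratings on a 5-point scale, not data from any study.
ratings = [4, 5, 3, 4, 4, 2, 5, 4]
mos, ci = mos_with_ci(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```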

Other limitations and aspects to consider include accessibility, privacy issues, and ethics. Regarding accessibility, it is important to ensure that immersive telepresence technologies are affordable and usable by diverse populations. The situation is improving as the cameras and headsets are getting cheaper and easier to use (e.g., faster and stronger on-device processing, removal of headset connection cables, increased ease of use with hand gestures, etc.). Nevertheless, hardware costs, connectivity requirements, and usability barriers must be further addressed to make these systems widely accessible. Regarding privacy and ethics, the realistic nature of immersive telepresence may raise ethical and privacy concerns. Capturing and transmitting live environments may involve sensitive data, necessitating robust privacy safeguards and ethical guidelines to prevent misuse. Also, privacy concerns regarding the headsets that rely on visual cameras for localization and mapping must be addressed.

5. Conclusions and Future Directions

Realistic immersive telepresence systems represent a transformative shift in how people interact with remote environments. By combining advanced imaging, rendering, and interaction technologies, these systems promise to revolutionize industries ranging from healthcare to entertainment. However, significant challenges remain, including data availability, compression, rendering, and QoE assessment. Addressing these obstacles will require collaboration across disciplines and industries.

To address these challenges, future research should focus on creating relevant datasets for spherical LFs with accurate camera positioning that cover challenges such as dynamic lighting conditions and occlusions. Developing real-time, robust compression methods for unstructured LFs, which maintain visual quality and support immersive applications, is another critical area. Developing advanced view synthesis algorithms capable of handling large baselines without introducing artifacts or distortions, and creating frameworks and methodologies for user experience and QoE assessment, remain open research questions.

Further into the future, the remaining challenges can be addressed in three ways: using learning-based algorithms for the challenges related to realness and spatiality factors as well as QoE estimation; increasing the level of interactivity and the feeling of immersion by integrating different senses into the existing systems (e.g., spatial audio, haptics, natural interfaces); and increasing standardization to create common frameworks that can manage interoperability across different systems. Long-term goals include the integration of realistic immersive displays, such as LF displays or improved holographic displays, and the convergence of telepresence systems with emerging technologies like 5G or 6G networks and edge computing, on which efforts are already underway [Mahmoud2023].

References

[Adelson1991] Adelson, E. H., & Bergen, J. R. (1991). The plenoptic function and the elements of early vision (Vol. 2). Cambridge, MA, USA: Vision and Modeling Group, Media Laboratory, Massachusetts Institute of Technology.
[Beck2013] Beck, S., Kunert, A., Kulik, A., & Froehlich, B. (2013). Immersive group-to-group telepresence. IEEE transactions on visualization and computer graphics, 19(4), 616-625.
[Boyce2021] Boyce, J. M., Doré, R., Dziembowski, A., Fleureau, J., Jung, J., Kroon, B., … & Yu, L. (2021). MPEG immersive video coding standard. Proceedings of the IEEE, 109(9), 1521-1536.
[Broxton2020] Broxton, M., Flynn, J., Overbeck, R., Erickson, D., Hedman, P., Duvall, M., … & Debevec, P. (2020). Immersive light field video with a layered mesh representation. ACM Transactions on Graphics (TOG), 39(4), 86-1.
[Chibane2021] Chibane, J., Bansal, A., Lazova, V., & Pons-Moll, G. (2021). Stereo radiance fields (SRF): Learning view synthesis for sparse views of novel scenes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 7911-7920).
[Cortés2024] Cortés, C., Pérez, P., & García, N. (2023). Understanding latency and QoE in social XR. IEEE Consumer Electronics Magazine.
[Croci2020] Croci, S., Ozcinar, C., Zerman, E., Knorr, S., Cabrera, J., & Smolic, A. (2020). Visual attention-aware quality estimation framework for omnidirectional video using spherical Voronoi diagram. Quality and User Experience, 5, 1-17.
[Eisert2023] Eisert, P., Schreer, O., Feldmann, I., Hellge, C., & Hilsmann, A. (2023). Volumetric video – acquisition, interaction, streaming and rendering. In Immersive Video Technologies (pp. 289-326). Academic Press.
[Fechteler2013] Fechteler, P., Hilsmann, A., Eisert, P., Broeck, S. V., Stevens, C., Wall, J., … & Zahariadis, T. (2013, June). A framework for realistic 3D tele-immersion. In Proceedings of the 6th International Conference on Computer Vision/Computer Graphics Collaboration Techniques and Applications.
[Gond2023] Gond, M., Zerman, E., Knorr, S., & Sjöström, M. (2023, November). LFSphereNet: Real Time Spherical Light Field Reconstruction from a Single Omnidirectional Image. In Proceedings of the 20th ACM SIGGRAPH European Conference on Visual Media Production (pp. 1-10).
[Graziosi2020] Graziosi, D., Nakagami, O., Kuma, S., Zaghetto, A., Suzuki, T., & Tabatabai, A. (2020). An overview of ongoing point cloud compression standardization activities: Video-based (V-PCC) and geometry-based (G-PCC). APSIPA Transactions on Signal and Information Processing, 9, e13.
[IJsselsteijn2000] IJsselsteijn, W. A., De Ridder, H., Freeman, J., & Avons, S. E. (2000, June). Presence: concept, determinants, and measurement. In Human Vision and Electronic Imaging V (Vol. 3959, pp. 520-529). SPIE.
[ISO2021] ISO/IEC 21794-2:2021 (2021) Information technology – Plenoptic image coding system (JPEG Pleno) — Part 2: Light field coding.
[Jackson2023] Jackson, A. (2023, September) Meta Quest 3: Can businesses use VR day-to-day?, Technology Magazine. https://technologymagazine.com/digital-transformation/meta-quest-3-can-businesses-use-vr-day-to-day, Accessed: 2024-02-05.
[Kachach2021] Kachach, R., Orduna, M., Rodríguez, J., Pérez, P., Villegas, Á., Cabrera, J., & García, N. (2021, July). Immersive telepresence in remote education. In Proceedings of the International Workshop on Immersive Mixed and Virtual Environment Systems (MMVE’21) (pp. 21-24).
[Krolla2014] Krolla, B., Diebold, M., Goldlücke, B., & Stricker, D. (2014, September). Spherical Light Fields. In BMVC (No. 67.1–67.12).
[Lawrence2024] Lawrence, J., Overbeck, R., Prives, T., Fortes, T., Roth, N., & Newman, B. (2024). Project starline: A high-fidelity telepresence system. In ACM SIGGRAPH 2024 Emerging Technologies (pp. 1-2).
[Levoy1996] Levoy, M. & Hanrahan, P. (1996) Light field rendering, in Proceedings of the 23rd Annual Conference on Computer Graphics and Interactive Techniques (pp. 31-42), New York, NY, USA, Association for Computing Machinery.
[Lin2023] Lin, K. E., Lin, Y. C., Lai, W. S., Lin, T. Y., Shih, Y. C., & Ramamoorthi, R. (2023). Vision transformer for nerf-based view synthesis from a single input image. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (pp. 806-815).
[Lo2023] Lo, I. C., & Chen, H. H. (2023). Acquiring 360° Light Field by a Moving Dual-Fisheye Camera. IEEE Transactions on Image Processing.
[Mahmoud2023] Mahmood, A., Abedin, S. F., O’Nils, M., Bergman, M., & Gidlund, M. (2023). Remote-timber: an outlook for teleoperated forestry with first 5g measurements. IEEE Industrial Electronics Magazine, 17(3), 42-53.
[Marvie2022] Marvie, J. E., Krivokuća, M., Guede, C., Ricard, J., Mocquard, O., & Tariolle, F. L. (2022, September). Compression of time-varying textured meshes using patch tiling and image-based tracking. In 2022 10th European Workshop on Visual Information Processing (EUVIP) (pp. 1-6). IEEE.
[Maugey2019] Maugey, T., Guillo, L., & Cam, C. L. (2019, June). FTV360: A multiview 360° video dataset with calibration parameters. In Proceedings of the 10th ACM Multimedia Systems Conference (pp. 291-295).
[Maugey2023] Maugey, T. (2023). Acquisition, representation, and rendering of omnidirectional videos. In Immersive Video Technologies (pp. 27-48). Academic Press.
[Minsky1980] Minsky, M. (1980). Telepresence. Omni, pp. 45-51.
[Overbeck2018] Overbeck, R. S., Erickson, D., Evangelakos, D., Pharr, M., & Debevec, P. (2018). A system for acquiring, processing, and rendering panoramic light field stills for virtual reality. ACM Transactions on Graphics (TOG), 37(6), 1-15.
[Perkis2020] Perkis, A., Timmerer, C., et al. (2020, May) “QUALINET White Paper on Definitions of Immersive Media Experience (IMEx)”, European Network on Quality of Experience in Multimedia Systems and Services, 14th QUALINET meeting (online), Online: https://arxiv.org/abs/2007.07032
[Schubert2001] Schubert, T., Friedmann, F., & Regenbrecht, H. (2001). The experience of presence: Factor analytic insights. Presence: Teleoperators & Virtual Environments, 10(3), 266-281.
[Srinivasan2019] Srinivasan, P. P., Tucker, R., Barron, J. T., Ramamoorthi, R., Ng, R., & Snavely, N. (2019). Pushing the boundaries of view extrapolation with multiplane images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 175-184).
[Starline2025] Project Starline: Be there from anywhere with our breakthrough communication technology. (n.d.). Online: https://starline.google/. Accessed: 2025-01-14
[Stepanov2023] Stepanov, M., Valenzise, G., & Dufaux, F. (2023). Compression of light fields. In Immersive Video Technologies (pp. 201-226). Academic Press.
[Suhail2022] Suhail, M., Esteves, C., Sigal, L., & Makadia, A. (2022). Light field neural rendering. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (pp. 8269-8279).
[Takatalo2008] Takatalo, J., Nyman, G., & Laaksonen, L. (2008). Components of human experience in virtual environments. Computers in Human Behavior, 24(1), 1-15.
[Valenzise2022] Valenzise, G., Alain, M., Zerman, E., & Ozcinar, C. (Eds.). (2022). Immersive Video Technologies. Academic Press.
[Waidhofer2022] Waidhofer, J., Gadgil, R., Dickson, A., Zollmann, S., & Ventura, J. (2022, October). PanoSynthVR: Toward light-weight 360-degree view synthesis from a single panoramic input. In 2022 IEEE International Symposium on Mixed and Augmented Reality (ISMAR) (pp. 584-592). IEEE.
[Wisotzky2025] Wisotzky, E. L., Rosenthal, J. C., Meij, S., van den Dobblesteen, J., Arens, P., Hilsmann, A., … & Schneider, A. (2025). Telepresence for surgical assistance and training using eXtended reality during and after pandemic periods. Journal of telemedicine and telecare, 31(1), 14-28.
[Yagi1999] Yagi, Y. (1999). Omnidirectional sensing and its applications. IEICE transactions on information and systems, 82(3), 568-579.
[Zerman2024] Zerman, E., Gond, M., Takhtardeshir, S., Olsson, R., & Sjöström, M. (2024, June). A Spherical Light Field Database for Immersive Telecommunication and Telepresence Applications. In 2024 16th International Conference on Quality of Multimedia Experience (QoMEX) (pp. 200-206). IEEE.
[Zhi2020] Zhi, T., Lassner, C., Tung, T., Stoll, C., Narasimhan, S. G., & Vo, M. (2020). TexMesh: Reconstructing detailed human texture and geometry from RGB-D video. In Computer Vision–ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part X 16 (pp. 492-509). Springer International Publishing.

From Theory to Practice: System QoE Assessment by Providers

By Tobias Hossfeld | September 7, 2024 - 09:38 |October 15, 2024 0324, Feature, QoE Column

Service and network providers actively evaluate and derive Quality of Experience (QoE) metrics within their systems, which necessitates suitable monitoring strategies. Objective QoE monitoring involves mapping Quality of Service (QoS) parameters into QoE scores, such as calculating Mean Opinion Scores (MOS) or Good-or-Better (GoB) ratios, by using appropriate mapping functions. Alternatively, individual QoE monitoring directly assesses user experience based on self-reported feedback. We discuss the strengths, weaknesses, opportunities, and threats of both approaches. Based on the collected data from individual or objective QoE monitoring, providers can calculate the QoE metrics across all users in the system, who are subjected to a range of varying QoS conditions. The aggregated QoE across all users in the system for a dedicated time frame is referred to as system QoE. Based on a comprehensive simulation study, the expected system QoE, the system GoB ratio, as well as QoE fairness across all users are computed. Our numerical results explore whether objective and individual QoE monitoring lead to similar conclusions. In our previous work [Hoss2024], we provided a theoretical framework and the mathematical derivation of the corresponding relationships between QoS and system QoE for both monitoring approaches. Here, the focus is on illustrating the key differences of individual and objective QoE monitoring and the consequences in practice.

System QoE: Assessment of QoE of Users in a System

The term “System QoE” refers to the assessment of user experience from a provider’s perspective, focusing on the perceived quality of the users of a particular service. Thereby, providers may be different stakeholders along the service delivery chain, for example, network service provider and, in particular, Internet service provider, or application service provider. QoE monitoring delivers the necessary information to evaluate the system QoE, which is the basis for appropriate actions to ensure high-quality services and high QoE, e.g., through resource and network management.

Typically, QoE monitoring and management involves evaluating how well the network and services perform by analyzing objective metrics like Quality of Service (QoS) parameters (e.g., latency, jitter, packet loss) and mapping them to QoE metrics, such as Mean Opinion Scores (MOS). To do so, providers need to follow a series of steps: 1) identify the relevant QoE metrics of interest, like MOS or GoB ratio; 2) deploy a monitoring framework to collect and analyze data. Both steps are discussed in the following.

The scope of system QoE metrics is to quantify the QoE across all users consuming the service for a dedicated time frame, e.g., one day, one week, or one month. Thereby, the expected QoE of an arbitrary user in the system, the ratio of all users experiencing Good-or-Better (GoB) quality or Poor-or-Worse (PoW) quality, as well as the QoE fairness across all users are of interest. The users in the system may achieve different QoS at the network level, e.g., different latency, jitter, or throughput, since resources are shared among the users. The same is true at the application level with varying application-specific QoS parameters, for instance, video resolution, buffering time, or startup delays for video streaming. The varying QoS conditions then manifest in the system QoE. Fundamental relationships between the system QoE and QoS metrics were derived in [Hoss2020].

Expected system QoE: The expected system QoE is the average QoE rating of an arbitrary user in the system. The fundamental relationship in [Hoss2020] shows that the expected system QoE may be derived by mapping the QoS as experienced by a user to the corresponding MOS value and computing the average MOS over the varying QoS conditions. Thus, a MOS mapping function is required to map the QoS parameters to MOS values.

System GoB and System PoW: The Mean Opinion Score provides an average score but fails to account for the variability among users and the diversity of their ratings. Users experiencing the same QoS conditions may rate them differently. Metrics like the percentage of users rating the experience as Good or Better or as Poor or Worse provide more granular insights. Such metrics help service providers understand not just the average quality, but how quality is distributed across the user base. The fundamental relationship in [Hoss2020] shows that the system GoB and PoW may be derived by mapping the QoS as experienced by a user to the corresponding GoB or PoW value and computing the average over the varying QoS conditions, respectively. Thus, a GoB or PoW mapping function is required.
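As a minimal sketch of the two fundamental relationships above, the snippet below averages a per-condition mapping function over the QoS values seen by the users. The exponential MOS mapping `mos_of_plt`, its parameters, and the sample page load times are illustrative assumptions, not the models from [Hoss2020]. The rating distribution per condition is assumed to be a shifted binomial on a 5-point scale, with GoB taken as a rating of at least 4 and PoW as a rating of at most 2.

```python
import math
from statistics import fmean

def mos_of_plt(plt_s):
    """Hypothetical MOS mapping: exponential decay from 5 towards 1 over page load time."""
    return 1.0 + 4.0 * math.exp(-0.5 * plt_s)

def rating_pmf(mos):
    """Assumed rating distribution for one QoS condition: 1 + Binomial(4, p), mean = mos."""
    p = (mos - 1.0) / 4.0
    return {1 + k: math.comb(4, k) * p**k * (1 - p) ** (4 - k) for k in range(5)}

def gob_of_plt(plt_s):
    """GoB mapping: probability of a rating of 4 or 5 at this page load time."""
    pmf = rating_pmf(mos_of_plt(plt_s))
    return pmf[4] + pmf[5]

def pow_of_plt(plt_s):
    """PoW mapping: probability of a rating of 1 or 2 at this page load time."""
    pmf = rating_pmf(mos_of_plt(plt_s))
    return pmf[1] + pmf[2]

def system_metric(mapping, qos_values):
    """Average a per-condition mapping function over the QoS seen by all users."""
    return fmean(mapping(q) for q in qos_values)

plts = [0.5, 1.0, 2.0, 4.0, 6.0]  # hypothetical page load times in seconds, one per user
expected_qoe = system_metric(mos_of_plt, plts)
system_gob = system_metric(gob_of_plt, plts)
system_pow = system_metric(pow_of_plt, plts)
print(f"expected QoE={expected_qoe:.2f}, GoB={system_gob:.2f}, PoW={system_pow:.2f}")
```

The same `system_metric` helper serves all three metrics; only the mapping function changes, which mirrors the structure of the relationships described above.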

QoE Fairness: Operators must not only ensure that users are sufficiently satisfied, but also that this is achieved in a fair manner. However, what is considered fair in the QoS domain does not necessarily translate to fairness in the QoE domain, which makes it necessary to apply a QoE fairness index. [Hoss2018] defines the QoE fairness index as a linear transformation of the standard deviation of MOS values to the range [0;1]. The observed standard deviation is normalized by the maximal standard deviation that is theoretically possible for MOS values in a finite range, typically between 1 (poor quality) and 5 (excellent quality). The difference between 1 (indicating perfect fairness) and the normalized standard deviation of MOS values (indicating the degree of unfairness) yields the fairness index.
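The definition above can be sketched in a few lines. For values bounded between L and H, the largest possible standard deviation is (H - L)/2, reached when half of the users sit at each end of the scale, so the index becomes F = 1 - 2σ/(H - L). The function name and the use of the population standard deviation are choices made for this sketch.

```python
from statistics import pstdev

def qoe_fairness_index(mos_values, low=1.0, high=5.0):
    """F = 1 - sigma / sigma_max, with sigma_max = (high - low) / 2."""
    sigma = pstdev(mos_values)        # population standard deviation of the MOS values
    sigma_max = (high - low) / 2.0    # reached when half the users are at each bound
    return 1.0 - sigma / sigma_max

print(qoe_fairness_index([3.2, 3.2, 3.2]))       # identical MOS for all users
print(qoe_fairness_index([1.0, 5.0, 1.0, 5.0]))  # users split between both extremes
print(qoe_fairness_index([2.5, 3.0, 3.5, 4.0]))  # a moderately spread system
```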

The fundamental relationships allow different implementations of QoE monitoring in practice, which are visualized in Figure 1 and discussed in the following. We differentiate between individual QoE monitoring and objective QoE monitoring and provide a qualitative strengths-weaknesses-opportunities-threats (SWOT) analysis.

Figure 1. QoE monitoring approaches to assess system QoE: individual and objective QoE monitoring.

Individual QoE Monitoring

Individual QoE monitoring refers to the assessment of system QoE by collecting individual ratings, e.g., on a 5-point rating scale, from users through their personal feedback. This approach captures the unique and individual nature of user experiences, accounting for factors like personal preferences and context. It allows optimizing services in a personalized manner, which is regarded as a challenging future research objective, see [Schmitt2017, Zhu2018, Gao2020, Yamazaki2021, Skorin-Kapov2018].

The term “individual QoE” was nicely described in [Zhu2018]: “QoE, by definition, is supposed to be subjective and individual. However, we use the term ‘individual QoE’, since the majority of the literature on QoE has not treated it as such. […] The challenge is that the set of individual factors upon which an individual’s QoE depends is not fixed; rather this (sub)set varies from one context to another, and it is this what justifies even more emphatically the individuality and uniqueness of a user’s experience – hence the term ‘individual QoE’.”

Strengths: Individual QoE monitoring provides valuable insights into how users personally experience a service, capturing the variability and uniqueness of individual perceptions that objective metrics often miss. A key strength is that it gathers direct feedback from a provider’s own users, ensuring a representative sample rather than relying on external or unrepresentative populations. Additionally, it does not require a predefined QoE model, allowing for flexibility in assessing user satisfaction. This approach enables service providers to directly derive various system QoE metrics.

Weaknesses: Individual QoE monitoring is mainly feasible for application service providers and requires additional monitoring efforts beyond the typical QoS tools already in place. Privacy concerns are significant, as collecting sensitive user data can raise issues with data protection and regulatory compliance, such as with GDPR. Additionally, users may use the system primarily as a complaint tool, focusing on reporting negative experiences, which could skew results. Feedback fatigue is another challenge, where users may become less willing to provide ongoing input over time, limiting the validity and reliability of the data collected.

Opportunities: Data from individual QoE monitoring can be utilized to enhance individual user QoE through better resource and service management. From a business perspective, offering a personalized QoE can set providers apart in competitive markets and the data collected has monetization potential, supporting personalized marketing. Data from individual QoE monitoring enables deriving objective metrics like MOS or GoB, to update existing QoE models or to develop new QoE models for novel services by correlating it with QoS parameters. Those insights can drive innovation, leading to new features or services that meet evolving customer needs.

Threats: Individual QoE monitoring accounts for factors outside the provider’s control, such as environmental context (e.g., noisy surroundings [Reichl2015, Jiménez2020]), which may affect user feedback but not reflect actual service performance. Additionally, as mentioned, it may be used as a complaint tool, with users disproportionately reporting negative experiences. There is also the risk of over-engineering solutions by focusing too much on minor individual issues, potentially diverting resources from addressing more significant, system-wide challenges that could have a broader impact on overall service quality.

Objective QoE Monitoring

Objective QoE monitoring involves assessing user experience by translating measurable QoS parameters on network level, such as latency, jitter, and packet loss, and on application level, such as video resolution or stalling duration for video streaming, into QoE metrics using predefined models and mapping functions. Unlike individual QoE monitoring, it does not require direct user feedback and instead relies on technically measurable parameters to estimate user satisfaction and various QoE metrics [Hoss2016]. Thereby, the fundamental relationships between system QoE and QoS [Hoss2020] are utilized. For computing the expected system QoE, a MOS mapping function is required, which maps a dedicated QoS value to a MOS value. For computing the system GoB, a GoB mapping function between QoS and GoB is required. Note that the QoS may be a vector of various QoS parameters, which are the input values for the mapping function.

Recent works [Hoss2022] indicated that industrial user experience index values, as obtained by the Threshold-Based Quality (TBQ) model for QoE monitoring, may be accurate enough to derive system QoE metrics. The TBQ model is a framework that defines application-specific thresholds for QoS parameters to assess and classify the user experience, which may be derived with simple and interpretable machine learning models like decision trees.
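To illustrate the idea of threshold-based classification, the sketch below assigns each measured QoS value to a class by comparing it against fixed thresholds and then reports the share of sessions per class. The thresholds, the class labels, and the choice of page load time as the single QoS parameter are hypothetical and are not taken from [Hoss2022].

```python
from collections import Counter

# Hypothetical thresholds on page load time in seconds; not taken from [Hoss2022].
PLT_THRESHOLDS_S = (1.0, 3.0)
LABELS = ("good", "fair", "poor")

def classify(plt_s, thresholds=PLT_THRESHOLDS_S, labels=LABELS):
    """Return the label of the first threshold that plt_s does not exceed."""
    for limit, label in zip(thresholds, labels):
        if plt_s <= limit:
            return label
    return labels[-1]  # value is above every threshold

measured_plts = [0.4, 0.9, 1.7, 2.5, 3.2, 5.0]  # hypothetical measurements
counts = Counter(classify(x) for x in measured_plts)
shares = {label: counts[label] / len(measured_plts) for label in LABELS}
print(shares)
```

A depth-limited decision tree trained on labelled sessions would produce thresholds of exactly this form, which is what makes such a model easy to interpret.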

Strengths: Objective QoE monitoring relies solely on QoS monitoring, making it applicable for network providers, even for encrypted data streams, as long as appropriate QoE models are available, see for example [Juluri2015, Orsolic2020, Casas2022]. It can be easily integrated into existing QoS monitoring tools already deployed, reducing the need for additional resources or infrastructure. Moreover, it offers an objective assessment of user experience, ensuring that the same QoS conditions for different users are consistently mapped to the same QoE scores, as required for QoE fairness.

Weaknesses: Objective QoE monitoring requires specific QoE models and mapping functions for each desired QoE metric, which can be complex and resource-intensive to develop. Additionally, it has limited visibility into the full user experience, as it primarily relies on network-level metrics like bandwidth, latency, and jitter, which may not capture all factors influencing user satisfaction. Its effectiveness is also dependent on the accuracy of the monitored QoS metrics; inaccurate or incomplete data, such as from encrypted packets, can lead to misguided decisions and misrepresentation of the actual user experience.

Opportunities: Objective QoE monitoring enables user-centric resource and network management for application and network service providers by tracking QoS metrics, allowing for dynamic adjustments to optimize resource utilization and improve service delivery. The integration of AI and automation with QoS monitoring can increase the efficiency and accuracy of network management from a user-centric perspective. The objective QoE monitoring data can also enhance Service Level Agreements (SLAs) towards Experience Level Agreements (ELAs) as discussed in [Varela2015].

Threats: One risk of objective QoE monitoring is the potential for incorrect traffic flow characterization, where data flows may be misattributed to the wrong applications, leading to inaccurate QoE assessments. Additionally, rapid technological changes can quickly make existing QoS monitoring tools and QoE models outdated, necessitating constant upgrades and investment to keep pace with new technologies. These challenges can undermine the accuracy and effectiveness of objective QoE monitoring, potentially leading to misinformed decisions and increased operational costs.

Numerical Results: Visualizing the Differences

In this section, we explore and visualize the obtained system QoE metrics, which are based on collected data either through i) individual QoE monitoring or ii) objective QoE monitoring. The question arises if the two monitoring approaches lead to the same results and conclusions for the provider. The obvious approach for computing the system QoE metrics is to use i) the individual ratings collected directly from the users and ii) the MOS scores obtained through mapping the objectively collected QoS parameters. While the discrepancies are derived mathematically in [Hoss2024], this article presents a visual representation of the differences between individual and objective QoE monitoring through a comprehensive simulation study. This simulation approach allows us to quantify the expected system QoE, the system GoB ratio, and the QoE fairness for a multitude of potential system configurations, which we manipulate in the simulation with varying QoS distributions. Furthermore, we demonstrate methods for utilizing data obtained through either individual QoE monitoring or objective QoE monitoring to accurately calculate the system QoE metrics as intended for a provider.

For the numerical results, the web QoE use case in [Hoss2024] is employed. We conduct a comprehensive simulation study, in which the QoS settings are varied. To be more precise, the page load times (PLTs) are varied, such that the users in the system experience a range of different loading times. For each simulation run, the average PLT and the standard deviation of the PLT across all users in the system are fixed. Each user is then assigned a random PLT sampled from a beta distribution in the range between 0s and 8s, parameterized to match the specified average and standard deviation.
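One way to parameterize such a beta distribution is the method of moments on the rescaled range, sketched below. The helper names, the seed, and the example values are assumptions made for illustration; the simulation code of the study itself is not shown in the article.

```python
import random
from statistics import fmean, pstdev

PLT_MAX_S = 8.0  # upper end of the PLT range stated above

def beta_params(mean, std, upper=PLT_MAX_S):
    """Shape parameters of a beta distribution on [0, upper] with the given mean and std."""
    m = mean / upper                # mean rescaled to [0, 1]
    v = (std / upper) ** 2          # variance rescaled to [0, 1]
    if not 0.0 < m < 1.0 or v >= m * (1.0 - m):
        raise ValueError("mean and std are not attainable by a beta distribution on this range")
    nu = m * (1.0 - m) / v - 1.0
    return m * nu, (1.0 - m) * nu

def sample_plts(mean, std, n_users, rng, upper=PLT_MAX_S):
    """Draw one page load time per user from the parameterized beta distribution."""
    a, b = beta_params(mean, std, upper)
    return [upper * rng.betavariate(a, b) for _ in range(n_users)]

rng = random.Random(1)
plts = sample_plts(mean=4.0, std=1.0, n_users=20000, rng=rng)
print(f"sample mean={fmean(plts):.2f}s, sample std={pstdev(plts):.2f}s")
```

The validity check matters in practice: on a bounded range, not every combination of mean and standard deviation is reachable, so a sweep over simulation settings has to skip the infeasible pairs.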

For a concrete PLT, the corresponding user rating distribution is available and follows in our case a shifted binomial distribution, where the mean of the binomial distribution reflects the MOS value for that condition. To be clear, this binomial distribution is a conditional random variable with discrete values on a 5-point scale: the user ratings are conditioned on the actual QoS value. For the individual QoE monitoring, the user ratings are sampled from that conditional random variable, while the QoS values are sampled from the beta distribution. For objective QoE monitoring, only the QoS values are used, but in addition, the MOS mapping function provided in [Hoss2024] is used. Thus, each QoS value is mapped to a continuous MOS value within the range of 1 to 5.
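A shifted binomial rating on a 5-point scale can be sketched as 1 plus a binomial variable with four trials, where the success probability is chosen so that the mean equals the MOS of the condition. The number of trials and the shift of 1 are the natural choice for a scale from 1 to 5, but they are an assumption here, as is the example MOS value.

```python
import random
from statistics import fmean

def sample_rating(mos, rng):
    """Draw one rating in {1, ..., 5} from 1 + Binomial(4, p), whose mean equals mos."""
    p = (mos - 1.0) / 4.0
    return 1 + sum(rng.random() < p for _ in range(4))

rng = random.Random(7)
ratings = [sample_rating(3.4, rng) for _ in range(20000)]  # 3.4 is a hypothetical MOS
print(f"mean rating={fmean(ratings):.2f}")
```

Individual QoE monitoring corresponds to observing such sampled ratings, whereas objective QoE monitoring only ever sees the MOS value passed in as `mos`.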

Figure 2 shows the expected system QoE using individual QoE monitoring as well as objective QoE monitoring depending on the average QoS as well as the standard deviation of the QoS, which is indicated by the color. Each point in the figure represents a single simulation run with a fixed average QoS and fixed standard deviation. It can be seen that both QoE monitoring approaches lead to the same results, which was also formally proven in [Hoss2024]. Note that higher QoS variances also result in a higher expected system QoE: for the same average QoS, there may be some users with larger QoS values, but also some users with lower QoS values. Due to the non-linear mapping between QoS and QoE, this results in higher QoE scores.
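Both observations can be written compactly. The symbols are introduced here for illustration: X is the QoS of a random user, Y is that user's rating, and f(x) = E[Y | X = x] is the MOS mapping function. The first relation is the law of total expectation and explains why both monitoring approaches yield the same expected system QoE. The second is Jensen's inequality, which compares a system with varying QoS against a deterministic system with the same average QoS. It applies under the assumption that f is convex over the observed QoS range, as is the case for a MOS that decays exponentially with the page load time.

```latex
\mathbb{E}[Y] \;=\; \mathbb{E}\big[\,\mathbb{E}[Y \mid X]\,\big] \;=\; \mathbb{E}\big[f(X)\big],
\qquad
\mathbb{E}\big[f(X)\big] \;\ge\; f\big(\mathbb{E}[X]\big) \quad \text{for convex } f .
```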

Figure 3 shows the system GoB ratio, which can be simply computed with individual QoE monitoring. However, in the case of objective QoE monitoring, we assume that only a MOS mapping function is available. It is tempting to derive the GoB ratio by computing the ratio of MOS values which are good or better. However, this leads to wrong results, see [Hoss2020]. Nevertheless, the GoB mapping function can be approximated from an existing MOS mapping function, see [Hoss2022, Hoss2017, Perez2023]. The same conclusions are then derived through objective QoE monitoring as for individual QoE monitoring.
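The gap between the two computations can be made concrete with a small sketch. It assumes a hypothetical exponential MOS mapping and derives the GoB value per condition from a shifted binomial rating distribution with that MOS as its mean. This is one possible approximation chosen for illustration and not necessarily the one used in [Hoss2022, Hoss2017, Perez2023].

```python
import math
from statistics import fmean

def mos_of_plt(plt_s):
    """Hypothetical MOS mapping function (exponential decay over page load time)."""
    return 1.0 + 4.0 * math.exp(-0.5 * plt_s)

def gob_from_mos(mos):
    """P(rating >= 4) if ratings follow 1 + Binomial(4, p) with mean equal to mos."""
    p = (mos - 1.0) / 4.0
    return 4 * p**3 * (1 - p) + p**4

plts = [0.2, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]  # hypothetical PLTs in seconds
mos_values = [mos_of_plt(x) for x in plts]

# Tempting but wrong: share of users whose MOS value is at least 4.
naive_gob = fmean(1.0 if m >= 4.0 else 0.0 for m in mos_values)

# Adjusted: map each MOS value to a GoB probability first, then average.
system_gob = fmean(gob_from_mos(m) for m in mos_values)

print(f"naive GoB={naive_gob:.2f}, adjusted GoB={system_gob:.2f}")
```

The naive variant treats every user at a given MOS as either fully satisfied or not at all, which discards the rating diversity that the GoB ratio is meant to capture.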

Figure 4 now considers QoE fairness for both monitoring approaches. It is tempting to use the user rating values from individual QoE monitoring and apply the QoE fairness index directly. In that case, however, the fairness index reflects the variance of the system QoS and, additionally, the variance due to user rating diversity, as shown in [Hoss2024]. This is not the intended application of the QoE fairness index, which aims to evaluate fairness objectively from a user-centric perspective, such that resource management can be adjusted to provide users with high and fairly distributed quality. Therefore, the QoE fairness index uses MOS values, such that users with the same QoS are assigned the same MOS value. In a system with deterministic QoS conditions, i.e., when the standard deviation of the QoS vanishes, the QoE fairness index is 100%, see the results for the objective QoE monitoring. Nevertheless, individual QoE monitoring also allows computing the MOS values for similar QoS values and then applying the QoE fairness index. Comparable results are then obtained as for objective QoE monitoring.
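The adjustment for individual QoE monitoring can be sketched as follows: ratings are grouped by similar QoS, averaged per group to obtain a MOS, and the fairness index is then applied to the per-user MOS instead of the raw ratings. The MOS mapping, the uniform PLT range, the bin width of 0.25 s, and the seed are assumptions made for this illustration.

```python
import math
import random
from collections import defaultdict
from statistics import fmean, pstdev

def fairness(values, low=1.0, high=5.0):
    """QoE fairness index: 1 minus the standard deviation normalized by its maximum."""
    return 1.0 - 2.0 * pstdev(values) / (high - low)

def mos_of_plt(plt_s):
    """Hypothetical MOS mapping function."""
    return 1.0 + 4.0 * math.exp(-0.5 * plt_s)

def sample_rating(mos, rng):
    """Assumed shifted binomial rating on a 5-point scale with mean equal to mos."""
    p = (mos - 1.0) / 4.0
    return 1 + sum(rng.random() < p for _ in range(4))

rng = random.Random(3)
plts = [rng.uniform(1.0, 3.0) for _ in range(5000)]
ratings = [sample_rating(mos_of_plt(x), rng) for x in plts]

# Group ratings by similar QoS (PLT bins of 0.25 s) and average them per bin.
BIN_WIDTH_S = 0.25
bins = defaultdict(list)
for x, r in zip(plts, ratings):
    bins[int(x / BIN_WIDTH_S)].append(r)
bin_mos = {b: fmean(rs) for b, rs in bins.items()}
user_mos = [bin_mos[int(x / BIN_WIDTH_S)] for x in plts]

f_ratings = fairness(ratings)                            # includes user rating diversity
f_binned = fairness(user_mos)                            # individual monitoring, adjusted
f_objective = fairness([mos_of_plt(x) for x in plts])    # objective monitoring
print(f"ratings={f_ratings:.2f}, binned={f_binned:.2f}, objective={f_objective:.2f}")
```

The bin width is a trade-off: narrow bins follow the QoS more closely but leave fewer ratings per bin, so the per-bin MOS becomes noisier.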

Figure 2. Expected system QoE when using individual and objective QoE monitoring. Both approaches lead to the same expected system QoE.

Figure 3. System GoB ratio: Deriving the ratio of MOS values which are good or better does not work for objective QoE monitoring. But an adjusted GoB computation, by approximating GoB through MOS, leads to the same conclusions as individual QoE monitoring, which simply measures the system GoB.

Figure 4. QoE Fairness: Using the user rating values obtained through individual QoE monitoring additionally includes the user rating diversity, which is not desired in network or resource management. However, individual QoE monitoring also allows computing the MOS values for similar QoS values and then to apply the QoE fairness index, which leads to comparable insights as objective QoE monitoring.

Conclusions

Individual QoE monitoring and objective QoE monitoring are fundamentally distinct approaches for assessing system QoE from a provider’s perspective. Individual QoE monitoring relies on direct user feedback to capture personalized experiences, while objective QoE monitoring uses QoS metrics and QoE models to estimate QoE metrics. Both methods have strengths and weaknesses, offering opportunities for service optimization and innovation while facing challenges such as over-engineering and the risk of models becoming outdated due to technological advancements, as summarized in our SWOT analysis. However, as the numerical results have shown, both approaches can be used with appropriate modifications and adjustments to derive various system QoE metrics like expected system QoE, system GoB and PoW ratio, as well as QoE fairness. A promising direction for future research is the development of hybrid approaches that combine both methods, allowing providers to benefit from objective monitoring while integrating the personalization of individual feedback. This could also be interesting to integrate in existing approaches like the QoS/QoE Monitoring Engine proposal [Siokis2023] or for upcoming 6G networks, which may allow the radio access network (RAN) to autonomously adjust QoS metrics in collaboration with the application to enhance the overall QoE [Bertenyi2024].

References

[Bertenyi2024] Bertenyi, B., Kunzmann, G., Nielsen, S., Pedersen, K., & Andres, P. (2024). Transforming the 6G vision to action. Nokia Whitepaper, 28 June 2024. URL: https://www.bell-labs.com/institute/white-papers/transforming-the-6g-vision-to-action/.

[Casas2022] Casas, P., Seufert, M., Wassermann, S., Gardlo, B., Wehner, N., & Schatz, R. (2022). DeepCrypt: Deep learning for QoE monitoring and fingerprinting of user actions in adaptive video streaming. In 2022 IEEE 8th International Conference on Network Softwarization (NetSoft) (pp. TBD). IEEE.

[Gao2020] Gao, Y., Wei, X., & Zhou, L. (2020). Personalized QoE improvement for networking video service. IEEE Journal on Selected Areas in Communications, 38(10), 2311-2323.

[Hoss2016] Hoßfeld, T., Schatz, R., Egger, S., & Fiedler, M. (2016). QoE beyond the MOS: An in-depth look at QoE via better metrics and their relation to MOS. Quality and User Experience, 1, 1-23.

[Hoss2017] Hoßfeld, T., Fiedler, M., & Gustafsson, J. (2017, May). Betas: Deriving quantiles from MOS-QoS relations of IQX models for QoE management. In 2017 IFIP/IEEE Symposium on Integrated Network and Service Management (IM) (pp. 1011-1016). IEEE.

[Hoss2018] Hoßfeld, T., Skorin-Kapov, L., Heegaard, P. E., & Varela, M. (2018). A new QoE fairness index for QoE management. Quality and User Experience, 3, 1-23.

[Hoss2020] Hoßfeld, T., Heegaard, P. E., Skorin-Kapov, L., & Varela, M. (2020). Deriving QoE in systems: from fundamental relationships to a QoE-based Service-level Quality Index. Quality and User Experience, 5(1), 7.

[Hoss2022] Hoßfeld, T., Schatz, R., Egger, S., & Fiedler, M. (2022). Industrial user experience index vs. quality of experience models. IEEE Communications Magazine, 61(1), 98-104.

[Hoss2024] Hoßfeld, T., & Pérez, P. (2024). A theoretical framework for provider’s QoE assessment using individual and objective QoE monitoring. In 2024 16th International Conference on Quality of Multimedia Experience (QoMEX) (pp. TBD). IEEE.

[Jiménez2020] Jiménez, R. Z., Naderi, B., & Möller, S. (2020, May). Effect of environmental noise in speech quality assessment studies using crowdsourcing. In 2020 Twelfth International Conference on Quality of Multimedia Experience (QoMEX) (pp. 1-6). IEEE.

[Juluri2015] Juluri, P., Tamarapalli, V., & Medhi, D. (2015). Measurement of quality of experience of video-on-demand services: A survey. IEEE Communications Surveys & Tutorials, 18(1), 401-418.

[Orsolic2020] Orsolic, I., & Skorin-Kapov, L. (2020). A framework for in-network QoE monitoring of encrypted video streaming. IEEE Access, 8, 74691-74706.

[Perez2023] Pérez, P. (2023). The Transmission Rating Scale and its Relation to Subjective Scores. In 2023 15th International Conference on Quality of Multimedia Experience (QoMEX) (pp. 31-36). IEEE.

[Reichl2015] Reichl, P., et al. (2015, May). Towards a comprehensive framework for QoE and user behavior modelling. In 2015 Seventh International Workshop on Quality of Multimedia Experience (QoMEX) (pp. 1-6). IEEE.

[Schmitt2017] Schmitt, M., Redi, J., Bulterman, D., & César, P. (2017). Towards individual QoE for multiparty videoconferencing. IEEE Transactions on Multimedia, 20(7), 1781-1795.

[Siokis2023] Siokis, A., Ramantas, K., Margetis, G., Stamou, S., McCloskey, R., Tolan, M., & Verikoukis, C. V. (2023). 5GMediaHUB QoS/QoE monitoring engine. In 2023 IEEE 28th International Workshop on Computer Aided Modeling and Design of Communication Links and Networks (CAMAD) (pp. TBD). IEEE.

[Skorin-Kapov2018] Skorin-Kapov, L., Varela, M., Hoßfeld, T., & Chen, K. T. (2018). A survey of emerging concepts and challenges for QoE management of multimedia services. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 14(2s), 1-29.

[Varela2015] Varela, M., Zwickl, P., Reichl, P., Xie, M., & Schulzrinne, H. (2015, June). From service level agreements (SLA) to experience level agreements (ELA): The challenges of selling QoE to the user. In 2015 IEEE International Conference on Communication Workshop (ICCW) (pp. 1741-1746). IEEE.

[Yamazaki2021] Yamazaki, T. (2021). Quality of experience (QoE) studies: Present state and future prospect. IEICE Transactions on Communications, 104(7), 716-724.

[Zhu2018] Zhu, Y., Guntuku, S. C., Lin, W., Ghinea, G., & Redi, J. A. (2018). Measuring individual video QoE: A survey, and proposal for future directions using social media. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM), 14(2s), 1-24.

Towards Immersive Digiphysical Experiences

By Tobias Hossfeld | February 3, 2024 - 12:40 |March 12, 2024 0124, Event Report, Feature, QoE Column

Immersive experiences have the potential of redefining traditional forms of media engagement by intricately combining reality with imagination. Motivated by necessities, current developments and emerging technologies, this column sets out to bridge immersive experiences in both digital and physical realities. Fitting under the umbrella term of eXtended Reality (XR), the first section describes various realizations of blending digital and physical elements to design what we refer to as immersive digiphysical experiences. We further highlight industry and research initiatives related to driving the design and development of such experiences, considered to be key building-blocks of the futuristic ‘metaverse’. The second section outlines challenges related to assessing, modeling, and managing the Quality of Experience (QoE) of immersive digiphysical experiences and reflects upon ongoing work in the area. While potential use cases span a wide range of application domains, the third section elaborates on the specific case of conference organization, which has over the past few years spanned from fully physical, to fully virtual, and finally to attempts at hybrid organization. We believe this use case provides valuable insights into needs and promising approaches, to be demonstrated and experienced at the upcoming 16th edition of the International Conference on Quality of Multimedia Experience (QoMEX 2024) in Karlshamn, Sweden in June 2024.

Multiple users engaged in a co-located mixed reality experience

Bridging The Digital And Physical Worlds

According to [IMeX WP, 2020], immersive media have been described as involving “multi-modal human-computer interaction where either a user is immersed inside a digital/virtual space or digital/virtual artifacts become a part of the physical world”. Spanning the so-called virtuality continuum [Milgram, 1995], immersive media experiences may involve various realizations of bridging the digital and physical worlds, such as the seamless integration of digital content with the real world (via Augmented or Mixed Reality, AR/MR), and vice versa by incorporating real objects into a virtual environment (Augmented Virtuality, AV). More recently, the term eXtended Reality (XR) (also sometimes referred to as xReality) has been used as an umbrella term for a wide range of levels of “realities”, with [Rauschnabel, 2022] proposing a distinction between AR/MR and Virtual Reality (VR) based on whether the physical environment is, at least visually, part of the user’s experience.

By seamlessly merging digital and physical elements and supporting real-time user engagement with both digital and physical components, immersive digiphysical (i.e., both digitally and physically accessible [Westerlund, 2020]) experiences have the potential of providing compelling experiences blurring the distinction between the real and virtual worlds. A key aspect is that of digital elements responding to user input or the physical environment, and the physical environment responding to interactions with digital objects. Going beyond only visual or auditory stimuli, the incorporation of additional senses, for example via haptic feedback or olfactory elements, can contribute to multisensory engagement [Gibbs, 2022].

The rapid development of XR technologies has been recognized as a key contributor to realizing a wide range of applications built on the fusion of the digital and physical worlds [NEM WP, 2022]. In its contribution to the European XR Coalition (launched by the European Commission), the New European Media Initiative (NEM), Europe’s Technology Platform of Horizon 2020 dedicated to driving the future of digital experiences, calls for needed actions from both industry and research perspectives addressing challenges related to social and human centered XR as well as XR communication aspects [NEM XR, 2022]. One such initiative is the Horizon 2020 TRANSMIXR project [TRANSMIXR], aimed at developing a distributed XR creation environment that supports remote collaboration practices, as well as an XR media experience environment for the delivery and consumption of social immersive media experiences. The NEM initiative further identifies the need for scalable solutions to obtain plausible and convincing virtual copies of physical objects and environments, as well as solutions supporting seamless and convincing interaction between the physical and the virtual world. Among key technologies and infrastructures needed to overcome outlined challenges, the following are identified [NEM XR, 2022]: high bandwidth and low-latency energy-efficient networks; remote computing for processing and rendering deployed on cloud and edge infrastructures; tools for the creation and updating of digital twins (DT) to strengthen the link between the real and virtual worlds, integrating Internet of Things (IoT) platforms; hardware in the form of advanced displays; and various content creation tools relying on interoperable formats.

Looking towards the future, immersive digiphysical experiences set the stage for visions of the metaverse [Wang, 2023], described as representing the evolution of the Internet towards a platform enabling immersive, persistent, and interconnected virtual environments blending digital and physical [Lee, 2021]. [Wang, 2022] see the metaverse as “created by the convergence of physically persistent virtual space and virtually enhanced physical reality”. The metaverse is further seen as a platform offering the potential to host real-time multisensory social interactions (e.g., involving sight, hearing, touch) between people communicating with each other in real time via avatars [Hennig-Thurau, 2023]. Since 2022, the Metaverse Standards Forum has been providing a venue for industry coordination fostering the development of interoperability standards for an open and inclusive metaverse [Metaverse, 2023]. Relevant existing standards include: ISO/IEC 23005 (MPEG-V) (standardization of interfaces between the real world and the virtual world, and among virtual worlds) [ISO/IEC 23055], IEEE 2888 (definition of standardized interfaces for synchronization of cyber and physical worlds) [IEEE 2888], and MPEG-I (standards to digitally represent immersive media) [ISO/IEC 23090].

Research Challenges For The QoE Community

Achieving wide-spread adoption of XR-based services providing digiphysical experiences across a broad range of application domains (e.g., education, industry & manufacturing, healthcare, engineering, etc.) inherently requires ensuring intuitive, comfortable, and positive user experiences. While research efforts in meeting such requirements are well under way, a number of open challenges remain.

Quality of Experience (QoE) for immersive media has been defined as [IMeX WP, 2020] “the degree of delight or annoyance of the user of an application or service which involves an immersive media experience. It results from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user’s personality and current state.” Furthermore, a bridge between QoE and UX has been established through the concept of Quality of User Experience (QUX), combining hedonic, eudaimonic and pragmatic aspects of QoE and UX [Egger-Lampl, 2019]. In the context of immersive communication and collaboration services, significant efforts are being invested towards understanding and optimizing the end user experience [Perez, 2022].

The White Paper [IMeX WP, 2020] ties immersion to the digital media world (“The more the system blocks out stimuli from the physical world, the more the system is considered to be immersive.”). Nevertheless, immersion as such exists in physical contexts as well, e.g., when reading a captivating book. MR, XR and AV scenarios are digiphysical in their nature. These considerations pose several challenges:

1. Achieving intuitive and natural interactive experiences [Hennig-Thurau, 2023] when mixing realities.
2. Developing a common understanding of MR-, XR- and AV-related challenges in digiphysical multi-modal multi-party settings.
3. Advancing VR, AR, MR, XR and AV technologies to allow for truly digiphysical experiences.
4. Measuring and modeling QoE, UX and QUX for immersive digiphysical services, covering overall methodology, measurement instruments, modeling approaches, test environments and application domains.
5. Management of the networked infrastructure to support immersive digiphysical experiences with appropriate QoE, UX and QUX.
6. Sustainability considerations in terms of environmental footprint, accessibility, equality of opportunities in various parts of the world, and cost/benefit ratio.

Challenges 1 and 2 call for an experience-based bottom-up approach that focuses on the most important aspects. Examples include designing and evaluating different user representations [Aseeri, 2021][Viola, 2023], natural interaction techniques [Spittle, 2023] and the use of different environments by participants (AR/MR/VR) [Moslavac, 2023]. The latter has proven beneficial for challenges 3 (cf. the emergence of MR-/XR-/AV-supporting head-mounted devices such as the Microsoft Hololens and recent pass-through versions of the Meta Quest) and 4. Finally, challenges 5 and 6 need to be carefully addressed to allow for long-term adoption and feasibility.

Challenges 1 to 4 have been addressed in standardization. For instance, ITU-T Recommendation P.1320 specifies QoE assessment procedures and metrics for the evaluation of XR telemeetings, outlining various categories of QoE influence factors and use cases [ITU-T Rec. P.1320, 2022] (adopted from the 3GPP technical report TR 26.928 on XR technology in 5G). The corresponding ITU-T Study Group 12 (Question 10) developed a taxonomy of telemeetings [ITU-T Rec. G.1092, 2023], providing a systematic classification of telemeeting systems. Ongoing joint efforts between the VQEG Immersive Media Group and ITU-T Study Group 12 are targeted towards specifying interactive test methods for subjective assessment of XR communications [ITU-T P.IXC, 2022].

The complexity of the aforementioned challenges calls for a combination of fundamental work, use cases, implementations, demonstrations, and testing. One specific use case that has proven urgent in recent years for combining digital and physical realities is that of hybrid conference organization, touching in particular on the challenge of achieving intuitive and natural interactions between remote and physically present participants. We consider this use case in detail in the following section, referring to the organization of the International Conference on Quality of Multimedia Experience (QoMEX) as an example.

Immersive Communication And Collaboration: The Case Of Conference Organization

What seemed impossible and was undesirable in the past became a necessity overnight during the CoVid-19 pandemic: running conferences as fully virtual events. Many research communities succeeded in adapting ongoing conference organizations such that communities could meet, present, demonstrate and socialize online. The conference QoMEX 2020 is one such example, whose organizers introduced a set of innovative instruments to mutually interact and enjoy, such as virtual Mozilla Hubs spaces for poster presentations and a music session with prerecorded contributions mixed to form a joint performance to be enjoyed virtually together. A previously unseen inventiveness was observed in making the best of the heavily travel-restricted situation. Furthermore, the technical approaches varied from off-the-shelf systems (such as Zoom or Teams) to custom-built applications. However, the majority of meetings during CoVid times, regardless of scale and nature, were run in unnatural 2D on-screen settings. The frequently reported phenomenon of videoconference (VC) fatigue can be attributed to a set of personal, organizational, technical and environmental factors [Döring, 2022]. Indeed, talking to one’s computer with many faces staring back, limited possibilities to move freely, technostress [Brod, 1984] and organizational mishaps made many people tired of VC technology that was designed for a better purpose, but could not get close enough to a natural real-life experience.

As CoVid retreated, conferences again became physical events and communities enjoyed meeting again, e.g., at QoMEX 2022. However, voices were raised that asked for remote participation for various reasons, such as time or budget restrictions, environmental sustainability considerations, or simply the comfort of being able to work from home. With remote participation came the challenge of bridging between in-person and remote participants, i.e., turning conferences into hybrid events [Bajpai, 2022]. However, experiences with hybrid conferences are mixed, for both onsite and online participants: (1) The onsite participants suffer from interruptions of the session flow needed to fix problems with the online participation tool. Their readiness to devote effort, time, and money to participate in a future hybrid event in person might suffer from such issues, which in turn would weaken the corresponding communities; (2) The online participants suffer from similar issues, where sound irregularities (echo, excessive sound volumes, etc.) are felt to be particularly disturbing, along with feelings of not being properly included, e.g., in Q&A sessions and personal interactions. At both ends, clear signs of technostress and “us-and-them” feelings can be observed. Consequently, and despite good intentions and advice [Bajpai, 2022], any hybrid conference might miss its main purpose of bringing researchers together to present, discuss and socialize. To avoid the above-listed issues, the post-CoVid QoMEX conferences (since 2022) avoided hybrid operations, with few exceptions.

A conference is a typical case that reveals difficulties in bringing the physical and digital worlds together [Westerlund, 2020], at least when relying upon state-of-the-art telemeeting approaches that have not explicitly been designed for hybrid and digiphysical operations. At the recent 26th ACM Conference on Computer-Supported Cooperative Work And Social Computing in Minneapolis, USA (CSCW 2023), one of the panel sessions focused on “Realizing Values in Hybrid Environments”. Panelists and audience shared experiences about successes and failures with hybrid events. The main take-aways were as follows: (1) there is a general lack of know-how, no matter how much funding is allocated, and (2) there is a significant demand for research activities in the area.

Yet, there is hope, as increasingly many VR, MR, XR and AV-supporting devices and applications keep emerging, enabling new kinds and representations of immersive experiences. In a conference context, the latter implies the feeling of “being there”, i.e., being integrated in the conference community, no matter where the participant is located. This calls for new ways of interacting amongst others through various realities (VR/MR/XR), which need to be invented, tried and evaluated in order to offer new and meaningful experiences in telemeeting scenarios [Viola, 2023]. Indeed, CSCW 2023 hosted a specific workshop titled “Emerging Telepresence Technologies for Hybrid Meetings: an Interactive Workshop”, during which visions, experiences, and solutions were shared and could be experienced locally and remotely. About half of the participants were online, successfully interacting with participants onsite via various techniques.

With these challenges and opportunities in mind, the motto of QoMEX 2024 has been set as “Towards immersive digiphysical experiences.” While the conference is organized as an in-person event, a set of carefully selected hybrid activities will be offered to interested remote participants, such as (1) 360° stereoscopic streaming of the keynote speeches and demo sessions, and (2) the option to take part in so-called hybrid experience demos. The 360° stereoscopic streaming has so far been tested successfully in local, national and transatlantic sessions (during the above-mentioned CSCW workshop) with various settings, and further fine-tuning will be done and tested before the conference. With respect to the demo session, and in addition to traditional onsite demos, the conference will this year in particular solicit hybrid experience demos that enable both onsite and remote participants to test the demo in an immersive environment. Facilities will also be provided for onsite participants to test demos from the perspective of both a local and a remote user, enabling them to experience different roles. The organizers of QoMEX 2024 hope that these hybrid activities will trigger more research interest in these areas along and beyond the classical lines of QoE research (performing quantitative subjective studies of QoE features and correlating them with QoE factors).

QoMEX 2024: Towards Immersive Digiphysical Experiences

Concluding Remarks

As immersive experiences extend into both digital and physical worlds and realities, there is a great space to conquer for QoE, UX, and QUX-related research. While the recent CoVid pandemic has forced many users to replace physical with digital meetings and sustainability considerations have reduced many people’s and organizations’ readiness to (support) travel, hybrid digiphysical meetings have, due to their shortcomings, failed to persuade their participants of their superiority over pure online or on-site meetings. Indeed, one promising path towards a successful integration of physical and digital worlds consists of trying out, experiencing, reflecting, and deriving important research questions for and beyond the QoE research community. The upcoming conference QoMEX 2024 will be a stop along this road with carefully selected hybrid experiences aimed at boosting research and best practice in the QoE domain towards immersive digiphysical experiences.

References

[Aseeri, 2021] Aseeri, S., & Interrante, V. (2021). The Influence of Avatar Representation on Interpersonal Communication in Virtual Social Environments. IEEE Transactions on Visualization and Computer Graphics, 27(5), 2608-2617.
[Bajpai, 2022] Bajpai, V., et al. (2022). Recommendations for designing hybrid conferences. ACM SIGCOMM Computer Communication Review, 52(2), 63-69.
[Brod, 1984] Brod, C. (1984). Technostress: The Human Cost of the Computer Revolution. Basic Books; New York, NY, USA: 1984.
[Döring, 2022] Döring, N., Moor, K. D., Fiedler, M., Schoenenberg, K., & Raake, A. (2022). Videoconference Fatigue: A Conceptual Analysis. International Journal of Environmental Research and Public Health, 19(4), 2061.
[Egger-Lampl, 2019] Egger-Lampl, S., Hammer, F., & Möller, S. (2019). Towards an integrated view on QoE and UX: adding the Eudaimonic Dimension, ACM SIGMultimedia Records, 10(4):5.
[Gibbs, 2022] Gibbs, J. K., Gillies, M., & Pan, X. (2022). A comparison of the effects of haptic and visual feedback on presence in virtual reality. International Journal of Human-Computer Studies, 157, 102717.
[Hennig-Thurau, 2023] Hennig-Thurau, T., Aliman, D. N., Herting, A. M., Cziehso, G. P., Linder, M., & Kübler, R. V. (2023). Social Interactions in the Metaverse: Framework, Initial Evidence, and Research Roadmap. Journal of the Academy of Marketing Science, 51(4), 889-913.
[IMeX WP, 2020] Perkis, A., Timmerer, C., et al., “QUALINET White Paper on Definitions of Immersive Media Experience (IMEx)”, European Network on Quality of Experience in Multimedia Systems and Services, 14th QUALINET meeting (online), May 25, 2020. Online: https://arxiv.org/abs/2007.07032
[ISO/IEC 23055] ISO/IEC 23005 (MPEG-V) standards, Media Context and Control, https://mpeg.chiariglione.org/standards/mpeg-v, accessed January 21, 2024.
[ISO/IEC 23090] ISO/IEC 23090 (MPEG-I) standards, Coded representation of Immersive Media, https://mpeg.chiariglione.org/standards/mpeg-i, accessed January 21, 2024.
[IEEE 2888] IEEE 2888 standards, https://sagroups.ieee.org/2888/, accessed January 21, 2024.
[ITU-T Rec. G.1092, 2023] ITU-T Recommendation G.1092 – Taxonomy of telemeetings from a quality of experience perspective, Oct. 2023.
[ITU-T Rec. P.1320, 2022] ITU-T Recommendation P.1320 – QoE assessment of extended reality (XR) meetings, 2022.
[ITU-T P.IXC, 2022] ITU-T Work Item: Interactive test methods for subjective assessment of extended reality communications, under study, 2022.
[Lee, 2021] Lee, L. H. et al. (2021). All One Needs to Know about Metaverse: A Complete Survey on Technological Singularity, Virtual Ecosystem, and Research Agenda. arXiv preprint arXiv:2110.05352.
[Metaverse, 2023] Metaverse Standards Forum, https://metaverse-standards.org/
[Milgram, 1995] Milgram, P., Takemura, H., Utsumi, A., & Kishino, F. (1995, December). Augmented reality: A class of displays on the reality-virtuality continuum. In Telemanipulator and telepresence technologies (Vol. 2351, pp. 282-292). International Society for Optics and Photonics.
[Moslavac, 2023] Moslavac, M., Brzica, L., Drozd, L., Kušurin, N., Vlahović, S., & Skorin-Kapov, L. (2023, July). Assessment of Varied User Representations and XR Environments in Consumer-Grade XR Telemeetings. In 2023 17th International Conference on Telecommunications (ConTEL) (pp. 1-8). IEEE.
[Rauschnabel, 2022] Rauschnabel, P. A., Felix, R., Hinsch, C., Shahab, H., & Alt, F. (2022). What is XR? Towards a Framework for Augmented and Virtual Reality. Computers in human behavior, 133, 107289.
[NEM WP, 2022] New European Media (NEM), NEM: List of topics for the Work Program 2023-2024.
[NEM XR, 2022] New European Media (NEM), NEM contribution to the XR coalition, June 2022.
[Perez, 2022] Pérez, P., Gonzalez-Sosa, E., Gutiérrez, J., & García, N. (2022). Emerging Immersive Communication Systems: Overview, Taxonomy, and Good Practices for QoE Assessment. Frontiers in Signal Processing, 2, 917684.
[Spittle, 2023] Spittle, B., Frutos-Pascual, M., Creed, C., & Williams, I. (2023). A Review of Interaction Techniques for Immersive Environments. IEEE Transactions on Visualization and Computer Graphics, 29(9), Sept. 2023.
[TRANSMIXR] EU HORIZON 2020 TRANSMIXR project, Ignite the Immersive Media Sector by Enabling New Narrative Visions, https://transmixr.eu/
[Viola, 2023] Viola, I., Jansen, J., Subramanyam, S., Reimat, I., & Cesar, P. (2023). VR2Gather: A Collaborative Social VR System for Adaptive Multi-Party Real-Time Communication. IEEE MultiMedia, 30(2).
[Wang, 2023] Wang, H. et al. (2023). A Survey on the Metaverse: The State-of-the-Art, Technologies, Applications, and Challenges. IEEE Internet of Things Journal, 10(16).
[Wang, 2022] Wang, Y. et al. (2022). A Survey on Metaverse: Fundamentals, Security, and Privacy. IEEE Communications Surveys & Tutorials, 25(1).
[Westerlund, 2020] Westerlund, T. & Marklund, B. (2020). Community pharmacy and primary health care in Sweden – at a crossroads. Pharm Pract (Granada), 18(2): 1927.

Explainable Artificial Intelligence for Quality of Experience Modelling

By Tobias Hossfeld | July 24, 2023 - 12:07 |September 26, 2023 0323, Feature, QoE Column

Data-driven Quality of Experience (QoE) modelling using Machine Learning (ML) arose as a promising alternative to the cumbersome and potentially biased manual QoE modelling. However, the reasoning of a majority of ML models is not explainable due to their black-box characteristics, which prevents us from gaining insights into how the model actually relates QoE influence factors to QoE. Yet these fundamental relationships are highly relevant for QoE researchers as well as service and network providers.

With the emerging field of eXplainable Artificial Intelligence (XAI) and its recent technological advances, these issues can now be resolved. As a consequence, XAI enables data-driven QoE modelling to obtain generalizable QoE models and provides us simultaneously with the model’s reasoning on which QoE factors are relevant and how they affect the QoE score. In this work, we showcase the feasibility of explainable data-driven QoE modelling for video streaming and web browsing, before we discuss the opportunities and challenges of deploying XAI for QoE modelling.

Introduction

In order to enhance services and networks and prevent users from switching to competitors, researchers and service providers need a deep understanding of the factors that influence the Quality of Experience (QoE) [1]. However, developing an effective QoE model is a complex and costly endeavour. Typically, it requires dedicated and extensive studies, which can only cover a limited portion of the parameter space and may be influenced by the study design. These studies often generate a relatively small sample of QoE ratings from a comparatively small population, making them vulnerable to poor performance when applied to unseen data. Moreover, the process of collecting and processing data for QoE modelling is not only arduous and time-consuming, but it can also introduce biases and self-fulfilling prophecies, such as perceiving an exponential relationship when one is expected.

To overcome these challenges, data-driven QoE modelling utilizing machine learning (ML) has emerged as a promising alternative, especially in scenarios where there is a wealth of data available or where data streams can be continuously obtained. A notable example is the ITU-T standard P.1203 [2], which estimates video streaming QoE by combining manual modelling – accounting for 75% of the Mean Opinion Score (MOS) estimation – and ML-based Random Forest modelling – accounting for the remaining 25%. The inclusion of the ML component in P.1203 indicates its ability to enhance performance. However, the inner workings of P.1203’s Random Forest model, specifically how it calculates the output score, are not obvious. Also, the survey in [3] shows that ML-based QoE modelling in multimedia systems is already widely used, including Virtual Reality, 360-degree video, and gaming. However, the QoE models are based on shallow learning methods, e.g., Support Vector Machines (SVM), or on deep learning methods, which lack explainability. Thus, it is difficult to understand what QoE factors are relevant and how they affect the QoE score [13], resulting in a lack of trust in data-driven QoE models and impeding their widespread adoption by researchers and providers [14].
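
The weighting mentioned for P.1203 can be pictured as a simple weighted combination of two estimates. The sketch below is only an illustration of the 75%/25% split stated above; the function name and the example scores are hypothetical, and it is not the implementation or the API of the standard.

```python
def combine_scores(parametric_mos, rf_mos, w_parametric=0.75):
    """Weighted combination of a manually modelled and an ML-based MOS estimate.

    w_parametric = 0.75 mirrors the 75%/25% split described in the text;
    everything else in this function is an illustrative assumption.
    """
    return w_parametric * parametric_mos + (1.0 - w_parametric) * rf_mos

# Hypothetical estimates for one streaming session.
final_mos = combine_scores(parametric_mos=4.0, rf_mos=3.0)
print(final_mos)
```

Even in this simple form, the contribution of the ML part to the final score is a single opaque number, which is the explainability gap the column addresses.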

Fortunately, recent advancements in the field of eXplainable Artificial Intelligence (XAI) [6] have paved the way for interpretable ML-based QoE models, thereby fostering trust between stakeholders and the QoE model. These advancements encompass a diverse range of XAI techniques that can be applied to existing black-box models, as well as novel and sophisticated ML models designed with interpretability in mind. Considering the use case of modelling video streaming QoE from real subjective ratings, the work in [4] evaluates the feasibility of explainable, data-driven QoE modelling and discusses the deployment of XAI for QoE research.

The utilization of XAI for QoE modelling brings several benefits. Not only does it speed up the modelling process, but it also enables the identification of the most influential QoE factors and their fundamental relationships with the Mean Opinion Score (MOS). Furthermore, it helps eliminate biases and preferences from different research teams and datasets that could inadvertently influence the model. All that is required is a selective dataset with descriptive features and corresponding QoE ratings (labels), which covers the most important QoE influence factors and, in particular, also rare events, e.g., many stalling events in a session. Generating such complete datasets, however, is an open research question and calls for data-centric AI [15]. By merging datasets from various studies, more robust and generalizable QoE models can theoretically be created, provided that these studies share a common ground. Another benefit is that the models can be automatically refined over time as new QoE studies are conducted and additional data becomes available.

XAI: eXplainable Artificial Intelligence

A general overview of eXplainable Artificial Intelligence (XAI) can be found in [5], while a thorough survey and taxonomy of XAI methods is available in [6].

XAI methods can be categorized into two main types: local and global explainability techniques. Local explainability aims to provide explanations for individual stimuli in terms of QoE factors and QoE ratings. On the other hand, global explainability focuses on offering general reasoning for how a model derives the QoE rating from the underlying QoE factors. Furthermore, XAI methods can be classified into post-hoc explainers and interpretable models.

Post-hoc explainers [6] are commonly used to explain various black-box models, such as neural networks or ensemble techniques, after they have been trained. One widely utilized post-hoc explainer is SHAP values [7], which originates from game theory. SHAP values quantify the contribution of each feature to the model’s prediction by considering all possible feature subsets and learning a model for each subset. Other post-hoc explainers include LIME and Anchors, the latter being designed for classification tasks.
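
To make the Shapley idea concrete, the following is a minimal sketch of an exact Shapley-value computation for a toy model. The model `toy_qoe_model`, its coefficients, the feature order, and the baseline session are hypothetical and are not taken from [4] or [7]. Instead of retraining a model for each feature subset, the sketch replaces "absent" features with baseline values, which is a common simplification and not the implementation of the SHAP library.

```python
from itertools import combinations
from math import factorial

def toy_qoe_model(x):
    # Hypothetical QoE model: MOS drops with stalling and quality switches,
    # rises with bitrate, and has one stalling/switch interaction term.
    stall, switches, bitrate = x
    return 4.5 - 0.3 * stall - 0.1 * switches + 0.2 * bitrate - 0.05 * stall * switches

def shapley_values(model, x, baseline):
    """Exact Shapley values of model(x) relative to model(baseline)."""
    n = len(x)

    def value(subset):
        # Features in `subset` keep their actual value, all others use the baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in range(n)]
        return model(mixed)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi

x = [4.0, 3.0, 2.0]         # stalling (s), quality switches, bitrate (Mbps)
baseline = [0.0, 0.0, 1.0]  # hypothetical reference session
phi = shapley_values(toy_qoe_model, x, baseline)
print(dict(zip(["stalling", "switches", "bitrate"], phi)))
```

The contributions sum to the difference between the prediction for the session and the prediction for the baseline, which is the property that makes Shapley values a local explanation of a single QoE prediction. The exact computation enumerates all subsets and therefore only scales to a handful of features.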

Interpretable models, by design, provide explanations for how the model arrives at its output. Well-known interpretable models include linear models and decision trees. Additionally, generalized additive models (GAM) are gaining recognition as interpretable models.

A GAM is a generalized linear model in which the model output is computed by summing up each of the arbitrarily transformed input features along with a bias [8]. The form of a GAM enables a direct interpretation of the model by analyzing the learned functions and the transformed inputs, which makes it possible to estimate the influence of each feature. Two state-of-the-art ML-based GAM models are Explainable Boosting Machine (EBM) [9] and Neural Additive Model (NAM) [8]. While EBM uses decision trees to learn the functions and gradient boosting to improve training, NAM utilizes arbitrary neural networks to learn the functions, resulting in a neural network architecture with one sub-network per feature. EBM extends GAM by also considering additional pairwise feature interaction terms while maintaining explainability.
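
The additive structure can be illustrated with a few lines of code. In this minimal sketch, the shape functions, the bias, and the feature names are hypothetical and hand-written; in EBM or NAM they would be learned from data (by boosted trees or per-feature sub-networks, respectively).

```python
import math

# Hypothetical shape functions f_i(x_i), one per feature.
shape_functions = {
    "avg_bitrate_mbps": lambda v: 0.8 * math.log(v),
    "stalling_s": lambda v: -1.5 * (1.0 - math.exp(-0.2 * v)),
    "quality_switches": lambda v: -0.05 * v,
}
bias = 3.5  # hypothetical intercept, e.g. the average MOS of a dataset

def gam_contributions(sample):
    """Per-feature contribution f_i(x_i) to the prediction."""
    return {name: f(sample[name]) for name, f in shape_functions.items()}

def gam_predict(sample):
    """GAM output: bias plus the sum of all transformed features."""
    return bias + sum(gam_contributions(sample).values())

sample = {"avg_bitrate_mbps": 2.0, "stalling_s": 5.0, "quality_switches": 4}
for name, contribution in gam_contributions(sample).items():
    print(f"{name:>18}: {contribution:+.3f}")
print(f"predicted MOS: {gam_predict(sample):.3f}")
```

Because every feature enters the prediction through its own function, plotting each function over the feature range yields exactly the kind of shape plot that is interpreted in the GAM literature.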

Exemplary XAI-based QoE Modelling using GAMs

We demonstrate the learned predictor functions for both EBM (red) and NAM (blue) on a video QoE dataset in Figure 1. All technical details about the dataset and the methodology can be found in [4]. We observe that both models provide smooth shape functions, which are easy to interpret. EBM and NAM differ only marginally and mostly in areas where the data density is low. Here, EBM outperforms NAM by overfitting on single data points using the feature interaction terms. We can see this, for example, for a high total stalling duration and a high number of quality switches, where at some point EBM stops the negative trend and strongly contrasts its previous trend to improve predictions for extreme outliers.

Figure 1: EBM and NAM for video QoE modelling

Using the smooth predictor functions, it is easy to apply curve fitting. In the bottom right plot of Figure 1, we fit the average bitrate predictor function of NAM, which was shifted by the average MOS of the dataset to obtain the original MOS scale on the y-axis, on an inverted x-axis using exponential (IQX), logarithmic (WQL), and linear functions (LIN). Note that this constitutes a univariate mapping of average bitrate to MOS, neglecting the other influencing factors. We observe that our predictor function follows the WQL hypothesis [10] (red) with a high R²=0.967. This is in line with the mechanics of P.1203, where the authors of [11] showed the same logarithmic behavior for the bitrate in mode 0.
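
As an illustration of such a curve fit, the sketch below fits a logarithmic function MOS = a · ln(x) + c by ordinary least squares and reports R². The bitrate and MOS values are hypothetical and are not the data from [4]; the helper name `fit_log` is likewise only illustrative.

```python
import math

def fit_log(xs, ys):
    """Least-squares fit of y = a * ln(x) + c; returns (a, c, r_squared)."""
    logs = [math.log(x) for x in xs]
    n = len(xs)
    mean_l, mean_y = sum(logs) / n, sum(ys) / n
    sxx = sum((l - mean_l) ** 2 for l in logs)
    sxy = sum((l - mean_l) * (y - mean_y) for l, y in zip(logs, ys))
    a = sxy / sxx
    c = mean_y - a * mean_l
    ss_res = sum((y - (a * l + c)) ** 2 for l, y in zip(logs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return a, c, 1.0 - ss_res / ss_tot

# Hypothetical points sampled from a predictor function (bitrate in Mbps, MOS)
bitrates = [0.25, 0.5, 1.0, 2.0, 4.0, 8.0]
mos = [1.9, 2.5, 3.1, 3.6, 4.3, 4.8]

a, c, r2 = fit_log(bitrates, mos)
print(f"MOS = {a:.3f} * ln(bitrate) + {c:.3f}, R^2 = {r2:.3f}")
```

The same routine can be reused for the exponential and linear hypotheses by transforming the inputs accordingly and comparing the resulting R² values.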

Figure 2: EBM and NAM for web QoE modelling

As the presented XAI methods are universally applicable to any QoE dataset, Figure 2 shows a similar GAM-based QoE modelling for a web QoE dataset obtained from [12]. We can see that the loading behavior in terms of ByteIndex-Page Load Time (BI-PLT) and time to last byte (TTLB) has the strongest impact on web QoE. Moreover, we see that different URLs/webpages have a different effect on the MOS, which shows that web QoE is content dependent. In summary, GAMs yield valuable, easy-to-interpret functions that explain fundamental relationships between QoE factors and MOS. Nevertheless, further XAI methods can be utilized, as detailed in [4,5,6].

Discussion

In addition to expediting the modelling process and mitigating modelling biases, data-driven QoE modelling offers significant advantages in terms of improved accuracy and generalizability compared to manual QoE models. ML-based models are not constrained to specific classes of continuous functions typically used in manual modelling, allowing them to capture more complex relationships present in the data. However, a challenge with ML-based models is the risk of overfitting, where the model becomes overly sensitive to noise and fails to capture the underlying relationships. Overfitting can be avoided through techniques like model regularization or by collecting sufficiently large or complete datasets.

Successful implementation of data-driven QoE modelling relies on purposeful data collection. It is crucial to ensure that all (or at least the most important) QoE factors are included in the dataset, covering their full parameter range with an adequate number of samples. Controlled lab or crowdsourcing studies can define feature values easily, but budget constraints (time and cost) often limit data collection to a small set of selected feature values. Conversely, field studies can encompass a broader range of feature values observed in real-world scenarios, but they may only gather limited data samples for rare events, such as video sessions with numerous stalling events. To prevent data bias, it is essential to balance feature values, which may require purposefully generating rare events in the field. Additionally, thorough data cleaning is necessary. While it is possible to impute missing features resulting from measurement errors, doing so increases the risk of introducing bias. Hence, it is preferable to filter out missing or unusual feature values.

Moreover, adding new data and retraining an ML model is a natural and straightforward process in data-driven modelling, offering long-term advantages. Eventually, data-driven QoE models would be capable of handling concept drift, which refers to changes in the importance of influencing factors over time, such as altered user expectations. However, QoE studies are conducted only rarely, as temporal and population-based snapshots, which limits frequent model updates. Ideally, a pipeline could be established to provide a continuous stream of features and QoE ratings, enabling online learning and ensuring the QoE models remain up to date. Although challenging for research endeavors, service providers could incorporate such QoE feedback streams into their applications.

Comparing black-box and interpretable ML models, there is a slight trade-off between performance and explainability. However, as shown in [4], it should be negligible in the context of QoE modelling. Instead, XAI makes it possible to fully understand the model decisions by identifying relevant QoE factors and their relationships to the QoE score. Nevertheless, it has to be considered that explaining models becomes inherently more difficult when the number of input features increases. Highly correlated features and interactions may further lead to misinterpretations when using XAI, since the influence of a feature may also depend on other features. To obtain reliable and trustworthy explainable models, it is, therefore, crucial to exclude highly correlated features.
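
One simple way to exclude highly correlated features is a greedy filter on the pairwise Pearson correlation, sketched below. The feature names, the values, and the threshold of 0.9 are hypothetical; which feature of a correlated pair is kept (here simply the one listed first) is a modelling decision that the sketch does not settle.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def drop_correlated(features, threshold=0.9):
    """Keep a feature only if |r| < threshold against all features kept so far."""
    kept = []
    for name, values in features.items():
        if all(abs(pearson(values, features[k])) < threshold for k in kept):
            kept.append(name)
    return kept

# Hypothetical per-session features
features = {
    "stalling_count": [0, 1, 2, 3, 5, 8],
    "total_stalling_s": [0.0, 2.0, 4.5, 6.0, 10.0, 17.0],  # nearly redundant
    "avg_bitrate_mbps": [3.0, 1.0, 4.0, 2.0, 5.0, 2.5],
}
print(drop_correlated(features))
```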

Finally, although we demonstrated XAI-based QoE modelling only for video streaming and web browsing, from a research perspective, it is important to understand that the whole process is easily applicable in other domains like speech or gaming. Apart from that, it can also be highly beneficial for providers of services and networks to use XAI when implementing continuous QoE monitoring. They could integrate visualizations of trends like Figure 1 or Figure 2 into dashboards, making it easy to obtain a deeper understanding of the QoE in their system.

Conclusion

In conclusion, the progress in technology has made data-driven explainable QoE modeling suitable for implementation. As a result, it is crucial for researchers and service providers to consider adopting XAI-based QoE modeling to gain a comprehensive and broader understanding of the factors influencing QoE and their connection to users’ subjective experiences. By doing so, they can enhance services and networks in terms of QoE, effectively preventing user churn and minimizing revenue losses.

References

[1] K. Brunnström, S. A. Beker, K. De Moor, A. Dooms, S. Egger, M.-N. Garcia, T. Hossfeld, S. Jumisko-Pyykkö, C. Keimel, M.-C. Larabi et al., “Qualinet White Paper on Definitions of Quality of Experience,” 2013.

[2] W. Robitza, S. Göring, A. Raake, D. Lindegren, G. Heikkilä, J. Gustafsson, P. List, B. Feiten, U. Wüstenhagen, M.-N. Garcia et al., “HTTP Adaptive Streaming QoE Estimation with ITU-T Rec. P.1203: Open Databases and Software,” in ACM MMSys, 2018.

[3] G. Kougioumtzidis, V. Poulkov, Z. D. Zaharis, and P. I. Lazaridis, “A Survey on Multimedia Services QoE Assessment and Machine Learning-Based Prediction,” IEEE Access, 2022.

[4] N. Wehner, A. Seufert, T. Hoßfeld, and M. Seufert, “Explainable Data-Driven QoE Modelling with XAI,” QoMEX, 2023.

[5] C. Molnar, Interpretable Machine Learning, 2nd ed., 2022. Available: https://christophm.github.io/interpretable-ml-book

[6] A. B. Arrieta, N. Díaz-Rodríguez et al., “Explainable Artificial Intelligence (XAI): Concepts, Taxonomies, Opportunities and Challenges Toward Responsible AI,” Information Fusion, 2020.

[7] S. M. Lundberg and S.-I. Lee, “A Unified Approach to Interpreting Model Predictions,” NIPS, 2017.

[8] R. Agarwal, L. Melnick, N. Frosst, X. Zhang, B. Lengerich, R. Caruana, and G. E. Hinton, “Neural Additive Models: Interpretable Machine Learning with Neural Nets,” NeurIPS, 2021.

[9] H. Nori, S. Jenkins, P. Koch, and R. Caruana, “InterpretML: A Unified Framework for Machine Learning Interpretability,” arXiv preprint arXiv:1909.09223, 2019.

[10] T. Hoßfeld, R. Schatz, E. Biersack, and L. Plissonneau, “Internet Video Delivery in YouTube: From Traffic Measurements to Quality of Experience,” in Data Traffic Monitoring and Analysis, 2013.

[11] M. Seufert, N. Wehner, and P. Casas, “Studying the Impact of HAS QoE Factors on the Standardized QoE Model P.1203,” in ICDCS, 2018.

[12] D. N. da Hora, A. S. Asrese, V. Christophides, R. Teixeira, D. Rossi, “Narrowing the gap between QoS metrics and Web QoE using Above-the-fold metrics,” PAM, 2018

[13] A. Seufert, F. Wamser, D. Yarish, H. Macdonald, and T. Hoßfeld, “QoE Models in the Wild: Comparing Video QoE Models Using a Crowdsourced Data Set”, in QoMEX, 2021

[14] D. Shin, “The effects of explainability and causability on perception, trust, and acceptance: Implications for explainable AI”, in International Journal of Human-Computer Studies, 2021.

[15] D. Zha, Z. P. Bhat, K. H. Lai, F. Yang, and X. Hu, “Data-centric AI: Perspectives and Challenges,” in SIAM International Conference on Data Mining, 2023.

Sustainability vs. Quality of Experience: Striking the Right Balance for Video Streaming

By Tobias Hossfeld | April 11, 2023 - 14:14 |June 26, 2023 0223, Feature, QoE Column

The exponential growth in internet data traffic, driven by the widespread use of video streaming applications, has resulted in increased energy consumption and carbon emissions. This outcome is primarily due to content with higher resolutions or higher frame rates and the ability to watch videos on various end-devices. However, efforts to reduce energy consumption in video streaming services may have unintended consequences for users’ Quality of Experience (QoE). This column delves into the intricate relationship between QoE and energy consumption, considering the impact of different bitrates on end-devices. We also consider other factors to provide a more comprehensive understanding of whether these end-devices have a significant environmental impact. It is essential to carefully weigh the trade-offs between QoE and energy consumption to make informed decisions and develop sustainable practices in video streaming services.

Energy Consumption for Video Streaming

In the past few years, we have seen a remarkable expansion in how online content is delivered. According to Sandvine’s 2023 Global Internet Phenomena Report [1], video usage on the Internet increased by 24% in 2022 and now accounts for 65% of all Internet traffic. This surge in video usage is mainly due to the growing popularity of streaming video services. Videos have become an increasingly popular form of online content, capturing a significant portion of internet users’ attention and shaping how we consume information and entertainment online. Therefore, the rising quality expectations of end-users have necessitated research and implementation of video streaming management approaches that consider the Quality of Experience (QoE) [2]. The idea is to develop applications that can work within the energy and resource limits of end-devices, while still delivering the Quality of Service (QoS) needed for smooth video viewing and an improved user experience (QoE). Even though video streaming services are advancing quickly, energy consumption remains a significant issue, raising concerns about its impact and underlining the urgent need to boost energy efficiency [14].

The literature identifies four main elements for analysing the energy consumption of video streaming: data centres, data transmission networks, end-devices, and consumer behaviour [3]. In this regard, the authors of [4] present a comprehensive review of existing literature on the energy consumption of online video streaming services. They then outline the potential actions that both service providers and consumers can take to promote sustainable video streaming, drawing on the studies discussed. Their summary of possible actions for sustainable video streaming, from both the provider’s and the consumer’s perspective, is given below for each element, together with some possible solutions:

Data centre: Use a CDN (Content Delivery Network) to offload contents/applications to the edge (provider’s side); choose providers that prioritize sustainability (consumer’s side).
Data transmission network: Use data compression/encoding algorithms (provider’s side); disable autoplay (consumer’s side).
End-device: Produce energy-efficient devices (provider’s side); prefer small-size (mobile) devices (consumer’s side).
Consumer behaviour: Reduce the number of subscribers (provider’s side); prefer watching videos with other people rather than alone (consumer’s side).

Finally, they noted that the end-device and consumer behaviour are the primary contributors to energy costs in the video streaming process. This finding motivates actions such as reducing video resolution and using smaller devices. However, such actions have a potential downside, as they can negatively impact the QoE due to their effect on video quality. Along these lines, the authors of [5] found that by giving up the maximum QoE and aiming for good quality instead (e.g., a MOS of 4 = Good instead of 5 = Excellent), significant energy savings can be achieved in video-conferencing services. According to their logarithmic QoE model, this is possible because lower video bitrates suffice for good quality, whereas higher bitrates result in higher energy consumption. Building on this research, the authors of [4] propose identifying an acceptable level of QoE, rather than striving for maximum QoE, as a potential solution to reduce energy consumption while still meeting consumer satisfaction. They conducted a crowdsourcing survey to gather real consumer opinions on their willingness to save energy while streaming online videos. Then, they analysed the survey results to understand how willing people are to lower video streaming quality in order to achieve energy savings.

Green Video Streaming: The Trade-Off Between QoE and Energy Consumption

To examine the trade-off between QoE and energy consumption, we look at the connection between the video bitrate at standard resolution, electricity usage, and perceived QoE for a video streaming service on four different devices (smartphone, tablet, laptop/PC, and smart TV), as taken from [4].

The energy consumption of streaming on a device is calculated as in [6]: Q_i = t_i · (P_i + R_i · ρ). In this equation, Q_i represents the electricity consumption (in kWh) of the i-th device, t_i denotes the streaming duration (in hours per week) for the i-th device, P_i represents the power load (in kW) of the i-th device, R_i signifies the data traffic (in GB/h) for a specific bitrate, and ρ = 0.1 kWh/GB represents the electricity intensity of data traffic.
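
The energy equation can be sketched in a few lines. The function name, the conversion of bitrate (kbps) to data traffic (GB/h, assuming 1 GB = 10^9 bytes), and the example device power of 0.1 kW are assumptions made for illustration and are not values from [4] or [6]; only ρ = 0.1 kWh/GB is taken from the text.

```python
def weekly_energy_kwh(hours_per_week, device_power_kw, bitrate_kbps,
                      rho_kwh_per_gb=0.1):
    """Q_i = t_i * (P_i + R_i * rho) for one device."""
    # R_i: data traffic in GB/h for the given bitrate
    # kbit/s -> bit/s -> bit/h -> byte/h -> GB/h (assuming 1 GB = 1e9 bytes)
    traffic_gb_per_h = bitrate_kbps * 1000 * 3600 / 8 / 1e9
    return hours_per_week * (device_power_kw + traffic_gb_per_h * rho_kwh_per_gb)

# Hypothetical example: 10 h per week on a 0.1 kW device at 5000 kbps
print(f"{weekly_energy_kwh(10, 0.1, 5000):.2f} kWh per week")
```

For a fixed device and viewing time, the result grows linearly with the bitrate, since only the traffic term depends on it.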

Then, to estimate the perceived QoE based on the video bitrate, the authors employed the QoE model from [7]: QoE = a · br^b + c, where “br” represents the bitrate, and “a”, “b”, and “c” are the regression coefficients calculated for specific resolutions.

Taking this into account, we can establish a link between energy consumption and the perceived QoE associated with the video bitrate across various end-devices. To this end, we apply the green QoE model from [8] to capture the trade-off between the perceived QoE and the energy consumption calculated above: f_γ(x) = 4 / (log(x’_5) - log(x_1)) · log(x) + (log(x’_5) - 5 · log(x_1)) / (log(x’_5) - log(x_1)). This equation is the mapping function between video bitrate x and Mean Opinion Score (MOS), based on the minimum bitrate x_1 corresponding to MOS 1 and the maximum bitrate x_5 corresponding to MOS 5. The factor γ represents the greenness of a user and scales the maximum bitrate to x’_5 = x_5/γ, which is the bitrate at which a green user already gives a MOS of 5.
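
A minimal sketch of this mapping function is given below. The anchor bitrates x_1 = 300 kbps and x_5 = 8000 kbps are hypothetical, and clipping the result to the MOS range [1, 5] is an added assumption that is not part of the equation above.

```python
import math

def green_mos(x, x1, x5, gamma=1.0):
    """Green QoE model f_gamma(x): MOS for bitrate x, with x5' = x5 / gamma."""
    x5_green = x5 / gamma
    denom = math.log(x5_green) - math.log(x1)
    mos = (4.0 / denom * math.log(x)
           + (math.log(x5_green) - 5.0 * math.log(x1)) / denom)
    # Assumption: keep the result on the 5-point MOS scale
    return min(5.0, max(1.0, mos))

X1, X5 = 300.0, 8000.0  # hypothetical bitrates (kbps) for MOS 1 and MOS 5
for bitrate in (300, 1000, 2000, 4000, 8000):
    hq = green_mos(bitrate, X1, X5, gamma=1.0)
    green = green_mos(bitrate, X1, X5, gamma=2.0)
    print(f"{bitrate:>5} kbps: HQ user MOS {hq:.2f}, green user MOS {green:.2f}")
```

With γ = 1 the function reduces to the plain logarithmic mapping between x_1 and x_5; with γ > 1 the same bitrate yields a higher MOS, so the top rating is reached at a lower bitrate.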

The model focuses on the concept of a “green user,” who considers the energy consumption aspect in their overall QoE evaluations. Thus, a green user might rate their QoE slightly lower in order to reduce their carbon footprint compared to a high-quality (HQ) user (or “non-green” user) who prioritizes QoE without considering energy consumption.

The numerical results for the energy consumption (in kWh) and the MOS scores depending on the video bitrate can be simplified with linear and logarithmic regressions, respectively. In Figure 1, the graph depicts a linear regression analysis conducted to examine the relationship between energy consumption (kWh) and bitrate (kbps). The y-axis represents energy consumption while the x-axis represents bitrate (kbps). The graph displays a straight-line trend that starts at 1.6 kWh and extends up to 3.5 kWh as the bitrate increases. The linear fitting function used for the analysis is formulated as: kWh = f(bitrate) = a * bitrate + c, where ‘a’ represents the slope and ‘c’ represents the y-intercept of the line.

Figure 1 illustrates how energy consumption tends to increase with higher bitrates, as indicated by the positive slope of the linear regression line. As video bitrates increase, the electricity consumption of end-devices also tends to increase. This can be attributed to the larger amount of data traffic generated by higher-resolution video content, which requires higher bitrates for transmission. Consequently, smart TVs are likely to consume more energy than other devices. This finding is consistent with the results obtained from the linear regression model described in [4], further supporting the relationship between bitrate and energy consumption.

Figure 1: The devices’ energy consumption (kWh) vs bitrate (kbps).

As illustrated in Figure 2, the relationship between MOS and video bitrate (kbps) follows a logarithmic pattern. Therefore, we can use a straightforward QoE model to estimate the MOS if the video bitrate is known. This can be achieved with a logarithmic regression model MOS(x), where MOS = f(x) = a * log(x) + c, with x representing the video bitrate in Mbps, and a and c being coefficients, as provided in [9]. Next, the MOS and video bitrate (kbps) values from [4] are applied to the green QoE model equation above, which is an extension of the logarithmic regression model [8]. This makes it possible to determine the green user QoE model; as an example, we show the green user QoE model for the smart TV (using γ=2 in f_γ(x)).

Figure 2: The MOS vs bitrate (kbps) for all the devices. Additionally, the green user QoE model for smart TV is depicted with a dashed line.

Figure 2 distinguishes two groups of users: those who prioritize high-quality (HQ) video regardless of energy consumption, and green users who prioritize energy efficiency while still being satisfied with slightly lower video quality. It can be observed that the MOS changes faster with the bitrate on smart TVs than on other end-devices, as is evident from the steeper smart TV curve. In contrast, the curve for tablets shows that changes in bitrate have a milder impact on MOS values. This suggests that quality changes are less perceptible on smaller screens, such as tablets or laptops. Considering that small-screen devices consume less energy than larger-screen devices, it may be preferable to use lower-resolution videos on them instead of high-resolution ones. Comparing laptops and tablets, low-resolution video streaming on laptops resulted in lower MOS scores than on tablets. From this result, it can be inferred that the choice of end-device and user behaviour play a significant role in energy savings. Figure 2 also indicates that the MOS values for the green user of a smart TV are comparable to the MOS values of an HQ user using a laptop.

Concerning this outcome, the authors of [10] presented the results of a subjective assessment aimed at investigating how different factors, such as video resolution, luminance, and end-devices (TV, laptop, and smartphone), impact the QoE and energy consumption of video streaming services. The study found that, in certain conditions (such as dark or bright environments, low device backlight luminance, or small-screen devices), users may need to strike a balance between acceptable QoE and sustainable (green) choices, as consuming more energy (e.g., by streaming higher-quality videos) may not significantly enhance the QoE.

Figure 3: The trade-off between energy consumption (kWh) and MOS. Green users are indicated with dashed lines, while HQ users are represented with solid lines.

Figure 3 plots the trade-off between energy consumption (kWh) and MOS for the end-devices (smart TV, laptop, and tablet). We differentiate between the HQ user and the green user, which reveals some interesting results. First, a MOS score of 4 leads to comparable energy consumption for green and HQ users; the relative differences are rather small. However, aiming for the best quality (MOS 5) leads to significant differences. Furthermore, the device type has a significant impact on energy consumption. Even for green users, who rate lower bitrates with higher MOS scores than HQ users, the energy consumption of the smart TV is much higher than that of laptop and tablet users at any quality (i.e., bitrate). Thus, device type and user behaviour are essential to strike the right balance between QoE and energy consumption.
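
The trade-off can be explored numerically by inverting the logarithmic green QoE model to obtain the bitrate needed for a target MOS and feeding that bitrate into the energy equation. The sketch below does this for a single device; the anchor bitrates, the device power, the weekly viewing time, and the kbps-to-GB/h conversion (1 GB = 10^9 bytes) are hypothetical assumptions, so the printed numbers do not reproduce Figure 3.

```python
def bitrate_for_mos(target_mos, x1, x5, gamma=1.0):
    """Invert the green QoE model: bitrate (kbps) that yields target_mos."""
    x5_green = x5 / gamma
    return x1 * (x5_green / x1) ** ((target_mos - 1.0) / 4.0)

def weekly_energy_kwh(hours, power_kw, bitrate_kbps, rho=0.1):
    """Q = t * (P + R * rho), with R in GB/h derived from the bitrate."""
    traffic_gb_per_h = bitrate_kbps * 1000 * 3600 / 8 / 1e9
    return hours * (power_kw + traffic_gb_per_h * rho)

X1, X5 = 300.0, 8000.0         # hypothetical bitrates (kbps) for MOS 1 and MOS 5
HOURS, TV_POWER_KW = 10.0, 0.1  # hypothetical viewing time and device power

def energy_for(target_mos, gamma):
    bitrate = bitrate_for_mos(target_mos, X1, X5, gamma)
    return weekly_energy_kwh(HOURS, TV_POWER_KW, bitrate)

for target in (4, 5):
    hq, green = energy_for(target, 1.0), energy_for(target, 2.0)
    print(f"MOS {target}: HQ {hq:.2f} kWh, green {green:.2f} kWh, "
          f"gap {hq - green:.2f} kWh")
```

Under these assumptions, the energy gap between the HQ user and the green user is smaller at MOS 4 than at MOS 5, which matches the qualitative observation above.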

Discussions and Future Research

Meeting the QoE expectations of end-users is essential to fulfilling the requirements of video streaming services. As users are the primary viewers of streaming videos in most real-world scenarios, subjective QoE assessment [11] provides a direct and dependable means to evaluate the perceptual quality of video streaming. Furthermore, there is a growing need for objective QoE assessment models such as those provided in [12, 13]. However, many studies have focused on investigating the QoE obtained through subjective and objective models and have overlooked energy consumption in video streaming.

Therefore, in the previous section, we have discussed how the different elements within the video streaming ecosystem play a role in consuming energy and emitting CO2. The findings pave the way for an objective response to determining an appropriate optimal video bitrate for viewing, considering both QoE and sustainability considerations, which can be further explored in future research.

It is evident that addressing energy consumption and emissions is crucial for the future of video streaming systems, while ensuring that end-users’ QoE is not compromised poses a significant and ongoing challenge. Thus, potential solutions to limit the increase in energy consumption while still satisfying the user include streaming videos on smaller-screen devices and watching lower-resolution videos that offer sufficient quality instead of the highest-resolution ones. This highlights the importance of user behavior in reducing energy consumption. Additionally, trade-off models can be developed using the green QoE model (especially for smart TVs) by identifying bitrate values that balance energy savings and user satisfaction.

Delving deeper into the dynamics of the video streaming ecosystem, it becomes increasingly clear that energy consumption and emissions are critical concerns that must be addressed for the sustainable future of video streaming systems. The environmental impact of video streaming, particularly in terms of carbon emissions, cannot be overstated. With the growing awareness of the urgent need to combat climate change, mitigating the environmental footprint of video streaming has become a pressing priority.

As video streaming technologies evolve, optimizing energy-efficient approaches without compromising users’ QoE is a complex task. End-users, who expect seamless and high-quality video streaming experiences, should not be deprived of their QoE while addressing the energy and emissions concerns. The outcome opens a novel door for an objective answer to the question of what constitutes an appropriate optimal video bitrate for viewing that takes into account both QoE and sustainability concerns.

Future research in this area is crucial to explore innovative techniques and strategies that can effectively reduce the energy consumption and carbon emissions of video streaming systems without sacrificing the QoE. Additionally, collaborative efforts among stakeholders, including researchers, industry practitioners, policymakers, and end-users, are essential in devising sustainable video streaming solutions that consider both environmental and user experience factors [14].

In conclusion, the discussions on the relationship between energy consumption, emissions, and QoE in video streaming systems emphasize the need for continued research and innovation to achieve a sustainable balance between environmental sustainability and user satisfaction.

References

[1] Sandvine. The Global Internet Phenomena Report. January 2023. Retrieved April 24, 2023
[2] M. Seufert, S. Egger, M. Slanina, T. Zinner, T. Hoßfeld and P. Tran-Gia, “A Survey on Quality of Experience of HTTP Adaptive Streaming,” in IEEE Communications Surveys & Tutorials, vol. 17, no. 1, pp. 469-492, Firstquarter 2015, doi: 10.1109/COMST.2014.2360940.
[3] R. Madlener, S. Sheykhha, W. Briglauer, “The electricity- and CO2-saving potentials offered by regulation of European video-streaming services,” Energy Policy, vol. 161, p. 112716, 2022.
[4] G. Bingöl, S. Porcu, A. Floris and L. Atzori, “An Analysis of the Trade-off between Sustainability,” in IEEE ICC Workshop-GreenNet, Rome, 2023.
[5] T. Hoßfeld, M. Varela, L. Skorin-Kapov, P. E. Heegaard, “What is the trade-off between CO2 emission and video-conferencing QoE?,” ACM SIGMM Records, 2022.
[6] P. Suski, J. Pohl, and V. Frick, “All you can stream: Investigating the role of user behavior for greenhouse gas intensity of video streaming,” in Proc. of the 7th Int. Conf. on ICT for Sustainability, 2020, pp. 128–138.
[7] M. Mu, M. Broadbent, A. Farshad, N. Hart, D. Hutchison, Q. Ni, and N. Race, “A Scalable User Fairness Model for Adaptive Video Streaming Over SDN-Assisted Future Networks,” IEEE Journal on Selected Areas in Communications, vol. 34, no. 8, p. 2168–2184, 2016.
[8] T. Hossfeld, M. Varela, L. Skorin-Kapov and P. E. Heegaard, “A Greener Experience: Trade-offs between QoE and CO2 Emissions in Today’s and 6G Networks,” IEEE Communications Magazine, pp. 1-7, 2023.
[9] J. P. López, D. Martín, D. Jiménez and J. M. Menéndez, “Prediction and Modeling for No-Reference Video Quality Assessment Based on Machine Learning,” in 14th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), IEEE, pp. 56-63, Las Palmas de Gran Canaria, Spain, 2018.
[10] G. Bingöl, A. Floris, S. Porcu, C. Timmerer and L. Atzori, “Are Quality and Sustainability Reconcilable? A Subjective Study on Video QoE, Luminance and Resolution,” in 15th International Conference on Quality of Multimedia Experience (QoMEX), Gent, Belgium, 2023.
[11] G. Bingol, L. Serreli, S. Porcu, A. Floris, L. Atzori, “The Impact of Network Impairments on the QoE of WebRTC applications: A Subjective study,” in 14th International Conference on Quality of Multimedia Experience (QoMEX), Lippstadt, Germany, 2022.
[12] D. Z. Rodríguez, R. L. Rosa, E. C. Alfaia, J. I. Abrahão and G. Bressan, “Video quality metric for streaming service using DASH standard,” IEEE Trans. Broadcasting, vol. 62, no. 3, pp. 628-639, Sep. 2016.
[13] T. Hoßfeld, M. Seufert, C. Sieber and T. Zinner, “Assessing effect sizes of influence factors towards a QoE model for HTTP adaptive streaming,” in 6th Int. Workshop Qual. Multimedia Exper. (QoMEX), Sep. 2014.
[14] S. Afzal, R. Prodan, C. Timmerer, “Green Video Streaming: Challenges and Opportunities.” ACM SIGMultimedia Records, Jan. 2023.

Towards the design and evaluation of more sustainable multimedia experiences: which role can QoE research play?

By Tobias Hossfeld | September 30, 2022 - 08:39 |October 23, 2022 0322, Feature, QoE Column

In this column, we reflect on the environmental impact and broader sustainability implications of resource-demanding digital applications and services such as video streaming, VR/AR/XR and videoconferencing. We put emphasis not only on the experiences and use cases they enable but also on the “cost” of always striving for high Quality of Experience (QoE) and better user experiences. Starting by sketching the broader context, our aim is to raise awareness about the role that QoE research can play in the context of various of the United Nations’ Sustainable Development Goals (SDGs), either directly (e.g., SDG 13 “climate action”) or more indirectly (e.g., SDG 3 “good health and well-being” and SDG 12 “responsible consumption and production”).

The UN’s Sustainable Development Goals (Figure taken from https://www.un.org/en/sustainable-development-goals)

The ambivalent role of digital technology

One of the latest reports from the Intergovernmental Panel on Climate Change (IPCC) confirmed the urgency of drastically reducing emissions of carbon dioxide and other human-induced greenhouse gas (GHG) emissions in the years to come (IPCC, 2021). This report, directly relevant in the context of SDG 13 “climate action”, confirmed the undeniable and negative human influence on global warming and the need for collective action. While the potential of digital technology (and ICT more broadly) for sustainable development has been on the agenda for some time, the context of the COVID-19 pandemic has made it possible to better understand a set of related opportunities and challenges.

First of all, it has been observed that long-lasting lockdowns and restrictions due to the COVID-19 pandemic and its aftermath have triggered a drastic increase in internet traffic (see e.g., Feldmann, 2020). This holds particularly for the use of videoconferencing and video streaming services for various purposes (e.g., work meetings, conferences, remote education, and social gatherings, just to name a few). At the same time, the associated drastic reduction of global air traffic and other types of traffic (e.g., road traffic), with their known environmental footprint, has had undeniable positive effects on the environment (e.g., reduced air pollution and better water quality; see e.g., Khan et al., 2020). Despite this potential, the environmental gains enabled by digital technology and recent advances in energy efficiency are threatened by digital rebound effects due to increased energy consumption and energy demands related to ICT (Coroamă et al., 2019; Lange et al., 2020). In the context of ever-increasing consumption, there has for instance been a growing focus in the literature on the negative environmental impact of unsustainable use and viewing practices such as binge-watching, multi-watching and media-multitasking, which have become more common in recent years (see e.g., Widdicks, 2019). While it is important to recognize that the overall emission factor will vary depending on the mix of energy generation technologies used and the region of the world (Preist et al., 2014), the above observation also fits with other recent reports and articles, which expect the energy demands linked to digital infrastructure, digital services and their use to expand further, and which expect the greenhouse gas emissions of ICT relative to the overall worldwide footprint to increase significantly (see e.g., Belkhir et al., 2018, Morley et al., 2018, Obringer et al., 2021).
Hence, these and other recent forecasts show a growing and even unsustainably high carbon footprint of ICT in the medium-term future, due to, among other factors, the increasing energy demand of data centres (including, e.g., the energy needed for cooling) and the associated traffic (Preist et al., 2016).

Another set of challenges that became more apparent relates to the human mental resources and health involved, in addition to the environmental effects. Here, there is a link to the abovementioned Sustainable Development Goals 3 (good health and well-being) and 12 (responsible consumption and production). For instance, the transition to “more sustainable” digital meetings, online conferences, and online education has also pointed to a range of challenges from a user point of view. “Zoom fatigue”, a prominent example, illustrates the need to strike the right balance between the more sustainable character of experiences provided by and enabled through technology and how these are actually experienced and perceived from a user point of view (Döring et al., 2022; Raake et al., 2022). Another example is binge-watching behavior, which can in certain cases have a positive effect on an individual’s well-being, but has also been shown to have a negative effect through e.g., feelings of guilt and goal conflicts (Granow et al., 2018) or through problematic involvement resulting in e.g., chronic sleep issues (Flayelle, 2020).

From the “production” perspective, recent work has looked at the growing environmental impact of commonly used cloud-based services such as video streaming (see e.g., Chen et al., 2020, Suski et al., 2020, The Shift Project, 2021) and the underlying infrastructure consisting of data centers, transport network and end devices (Preist et al., 2016, Suski et al., 2020, Preist et al., 2014). As a result, the combination of technological advancements and user-centered approaches aiming to always improve the experience may have undesired environmental consequences. These include stimulating increased user expectations (e.g., higher video quality, increased connectivity and availability, almost zero-latency, …) and triggering increased use and unsustainable use practices, resulting in potential rebound effects due to increased data traffic and electricity demand.

These observations have started to culminate in a plea for a shift towards a more sustainable and humanity-centered paradigm, which considers to a much larger extent how digital consumption and increased data demand impact individuals, society and our planet (Widdicks et al., 2019, Preist et al., 2016, Hazas & Nathan, 2018). Here, it is obvious that experience, consumption behavior and energy consumption are tightly intertwined.

How does QoE research fit into this picture?

This leads to the question of where research on Quality of Experience and its underlying goals fits into this broader picture, to what extent related topics have gained attention so far, and how future research can potentially have an even larger impact.

As the COVID-19 related examples above already indicated, QoE research, through its focus on improving the experience for users in e.g., various videoconferencing-based scenarios or immersive technology-related use cases, already plays and will continue to play a key role in enabling more sustainable practices in various domains (e.g., remote education, online conferences, digital meetings, and thus reducing unnecessary travel, …) and linking up to various SDGs. A key challenge here is to enable experiences that become so natural and attractive that they may even become preferred in the future. While this is a huge and important topic, we refrain from discussing it further in this contribution, as it already is a key focus within the QoE community. Instead, in the following, we first reflect on the extent to which environmental implications of multimedia services have explicitly been on the agenda of the QoE community in the past, what the focus is in more recent work, and what is currently not yet sufficiently addressed. Secondly, we consider a broader set of areas and concrete topics in which QoE research can be related to environmental and broader sustainability-related concerns.

Traditionally, QoE research has predominantly focused on gathering insights that can guide the optimization of technical parameters and allocation of resources at different layers, while still ensuring a high QoE from a user point of view. A main underlying driver in this respect has traditionally been the related business perspective: optimizing QoE as a way to increase profitability and users/customers’ willingness to pay for better quality (Wechsung, 2014). While better video compression techniques or adaptive video streaming may allow the saving of resources, which overall may lead to environmental gains, the latter has traditionally not been a main or explicit motivation.

There are, however, some exceptions in earlier work, where the focus was more explicitly on the link between energy consumption-related aspects, energy efficiency and QoE. The study of Ickin (2012), for instance, aimed to investigate QoE influence factors of mobile applications and revealed the key role of the battery in successful QoE provisioning. In this work, it was observed that energy modelling and saving efforts are typically geared towards the immediate benefits of end users, while less attention was paid to the digital infrastructure (Popescu, 2018). Efforts were also made in the past to describe, analyze and model the trade-off between QoE and energy consumption (QoE perceived per user per Joule, QoEJ) (Popescu, 2018) or power consumption (QoE perceived per user per Watt, QoEW) (Zhang et al., 2013), as well as to optimize resource consumption so as to avoid sources of annoyance (see e.g., Fiedler et al., 2016). While these early efforts did not yet result in a generic end-to-end QoE-energy-model that can be used as a basis for optimizations, they provide a useful basis to build upon.
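
The ratio metrics mentioned above can be illustrated with a minimal sketch. It assumes, purely for illustration, that QoE is expressed as a mean opinion score (MOS) on a 1 to 5 scale and that each metric is a plain ratio of QoE to energy (joules) or to average power (watts); the cited works may define and normalize these quantities differently. The function names and the example numbers are hypothetical.

```python
def qoe_per_joule(mos: float, energy_joules: float) -> float:
    """Illustrative ratio of QoE to energy spent (not the cited definition)."""
    if energy_joules <= 0:
        raise ValueError("energy_joules must be positive")
    return mos / energy_joules


def qoe_per_watt(mos: float, power_watts: float) -> float:
    """Illustrative ratio of QoE to average power draw (not the cited definition)."""
    if power_watts <= 0:
        raise ValueError("power_watts must be positive")
    return mos / power_watts


# Hypothetical session: MOS of 4.0, device drawing 5 W on average for 600 s.
mos = 4.0
power_w = 5.0
duration_s = 600.0
energy_j = power_w * duration_s  # 1 W equals 1 J/s, so energy = power * time

session_qoe_j = qoe_per_joule(mos, energy_j)
session_qoe_w = qoe_per_watt(mos, power_w)
```

Under these assumptions, two sessions with the same MOS can be compared directly: the one with the higher ratio delivers the same experience for less energy or power.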

A more recent example (Hossfeld et al., 2022) in the context of video streaming services looked into possible trade-offs between varying levels of QoE and the resulting energy consumption, which is further mapped to CO₂ emissions (taking the EU emission parameter as input, as this – as mentioned – depends on the overall energy mix of green and non-renewable energy sources). Their visualization model further considers parameters such as the type of device and type of network and while it is a simplification of the multitude of possible scenarios and factors, it illustrates that it is possible to identify areas where energy consumption can be reduced while ensuring an acceptable QoE.
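
The mapping from energy to emissions, and the search for energy savings under a QoE constraint, can be sketched as follows. All numbers (power draw, MOS values, the emission factor of 300 g CO₂ per kWh, the acceptability threshold) are placeholders chosen for illustration and are not taken from Hossfeld et al. (2022); the function and variable names are likewise hypothetical.

```python
from typing import NamedTuple, Optional, Sequence


class Config(NamedTuple):
    name: str
    power_watts: float  # assumed average power draw of the whole chain
    mos: float          # assumed QoE on a 1-5 mean opinion score scale


def co2_grams(power_watts: float, hours: float, grams_per_kwh: float) -> float:
    """Convert power and viewing time to energy (kWh), then to grams of CO2."""
    energy_kwh = power_watts * hours / 1000.0
    return energy_kwh * grams_per_kwh


def greenest_acceptable(configs: Sequence[Config], min_mos: float) -> Optional[Config]:
    """Return the lowest-power configuration whose MOS meets the threshold."""
    acceptable = [c for c in configs if c.mos >= min_mos]
    if not acceptable:
        return None
    return min(acceptable, key=lambda c: c.power_watts)


# Placeholder scenarios; real values depend on device, network and codec.
CONFIGS = [
    Config("SD on smartphone", 10.0, 3.4),
    Config("HD on laptop", 40.0, 4.1),
    Config("UHD on TV", 120.0, 4.5),
]
EMISSION_FACTOR = 300.0  # g CO2 per kWh, placeholder for a regional energy mix

choice = greenest_acceptable(CONFIGS, min_mos=4.0)
emissions = co2_grams(choice.power_watts, hours=2.0, grams_per_kwh=EMISSION_FACTOR)
```

With these placeholder values, the sketch selects the HD-on-laptop entry and estimates about 24 g of CO₂ for two hours of viewing. Changing the emission factor rescales the result linearly, which is why the regional energy mix matters so much for the final figure.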

Another recent work (Herglotz et al., 2022) jointly analyzed end-user power efficiency and QoE related to video streaming, based on actual real-world data (i.e., YouTube streaming events). More specifically, power consumption was modelled and user-perceived QoE was estimated in order to model where optimization is possible. They found that optimization is possible and pointed to the importance of the choice of video codec, video resolution, frame rate and bitrate in this respect.

These examples point to the potential to optimize at the “production” side. However, the focus has more recently also been extended to the actual use, user expectations and “consumption” side (Jiang et al., 2021, Lange et al., 2020, Suski et al., 2020, Elgaaied-Gambier et al., 2020). Various topics are explored in this respect, e.g., digital carbon footprint calculation at the individual level (Schien et al., 2013, Preist et al., 2014), consumer awareness and pro-environmental digital habits (Elgaaied-Gambier et al., 2020; Gnanasekaran et al., 2021), or impact of user behavior (Suski et al., 2020). While we cannot discuss all of these in detail here, they all are based on the observation that there is a growing need to involve consumers and users in the collective challenge of reducing the impact of digital applications and services on the environment (Elgaaied-Gambier et al., 2020; Preist et al., 2016).

QoE research can play an important role here, extending the understanding of carbon footprint vs. QoE trade-offs to making users more aware of the actual “cost” of high QoE. A recent interview study with digital natives conducted by some of the co-authors of this column (Gnanasekaran et al., 2021) illustrated that many users are not aware of the environmental impact of their behavior and expectations and that, even with such insights, drastic changes in behavior cannot be expected. The lack of technological understanding, public information and social awareness about the topic were identified as important factors. It is therefore of utmost importance to trigger more awareness and help users see and understand their carbon footprint related to e.g., the use of video streaming services (Gnanasekaran et al., 2021). This perspective is currently missing in the field of QoE and we argue here that QoE research could – in collaboration with other disciplines and by integrating insights from other fields – play an important role here.

In terms of the motivation for adopting pro-environmental digital habits, Gnanasekaran et al. (2021) found that several factors indirectly contribute to this goal, including the striving for personal well-being. Finally, the results indicate some willingness to change and make compromises (e.g., accepting a lower video quality), albeit not an unconditional one: the alignment with other goals (e.g., personal well-being) and the nature of the perceived sacrifice and its impact play a key role. A key challenge for future work is therefore to identify and understand concrete mechanisms that could trigger more awareness amongst users about the environmental and well-being impact of their use of digital applications and services, and those that can further motivate positive behavioral change (e.g., opting for use practices that limit one’s digital carbon footprint, mindful digital consumption). By investigating the impact of various more environmentally-friendly viewing practices on QoE (e.g., actively promoting standard definition video quality instead of HD, nudging users to switch to audio-only when a service like YouTube is used as background noise or stimulating users to switch to the least data demanding viewing configuration depending on the context and purpose), QoE research could help to bridge the gap towards actual behavioral change.

Final reflections and challenges for future research

We have argued that research on users’ Quality of Experience and overall User Experience can be highly relevant to gain insights that may further drive the adoption of new, more sustainable usage patterns and that can trigger more awareness of implications of user expectations, preferences and actual use of digital services. However, the focus on continuously improving users’ Quality of Experience may also trigger unwanted rebound effects, leading to an overall higher environmental footprint due to the increased use of digital applications and services. Further, it may have a negative impact on users’ long-term well-being as well.

We therefore need to join efforts with other communities to challenge the current design paradigm from a more critical stance, partly as “it’s difficult to see the ecological impact of IT when its benefits are so blindingly bright” (Borning et al., 2020). Richer and better experiences may lead to increased, unnecessary or even excessive consumption, further increasing individuals’ environmental impact and potentially impeding long-term well-being. Open questions are, therefore: Which fields and disciplines should join forces to mitigate the above risks? And how can QoE research – directly or indirectly – contribute to the triggering of sustainable consumption patterns and the fostering of well-being?

Further, a key question is how energy efficiency can be improved for digital services such as video streaming, videoconferencing, online gaming, etc., while still ensuring an acceptable QoE. This also points to the question of which compromises can be made in trading QoE against its environmental impact (from “willingness to pay” to “willingness to sacrifice”), under which circumstances and how these compromises can be meaningfully and realistically assessed. In this respect, future work should extend the current modelling efforts to link QoE and carbon footprint, go beyond exploring what users are willing to (more passively) endure, and also investigate how users can be more actively motivated to adjust and lower their expectations and even change their behavior.

These and related topics will be on the agenda of the Dagstuhl seminar 23042 “Quality of Sustainable Experience” and the conference QoMEX 2023 “Towards sustainable and inclusive multimedia experiences”.