Towards Interactive QoE Assessment of Robotic Telepresence

Telepresence robots (TPRs) are remote-controlled, wheeled devices with an internet connection. A TPR can “teleport” you to a remote location and let you drive around and interact with people. A TPR user can feel present in the remote location by being able to control the robot’s position, movements, actions, voice and video. A TPR facilitates human-to-human interaction, wherever you want and whenever you want. The human user sends commands to the TPR using a keyboard, mouse, or joystick.

A Robotic Telepresence Environment

In recent years, people from different environments and backgrounds have started to adopt TPRs for private and business purposes such as attending a class, roaming around the office and visiting patients. Due to the COVID-19 pandemic, adoption in healthcare has increased in order to facilitate social distancing and staff safety [Ackerman 2020, Tavakoli et al. 2020].

Robotic Telepresence Sample Use Cases

Despite this increase in adoption, a research gap remains from a QoE perspective, as TPRs offer interaction beyond the well-understood QoE issues in traditional static audio-visual conferencing. TPRs, as remote-controlled vehicles, provide users with some form of physical presence at the remote location. Furthermore, for those people interacting with the TPR at the remote location, the robot is a physical representation or proxy agent of its remote operator. The operator can physically interact with the remote location by driving over an object or pushing an object forward. These aspects of teleoperation and navigation represent an additional dimension in terms of functionality, complexity and experience.

Navigating a TPR may pose challenges to end-users and influence their perceived quality of the system. For instance, when a TPR operator is driving the robot, he/she expects an instantaneous reaction from the robot. An increased delay in sending commands to the robot may thus negatively impact robot mobility and the user’s satisfaction, even if the audio-visual communication functionality itself is not affected.

In a recent paper published at QoMEX 2020 [Jahromi et al. 2020], we addressed this gap in research by means of a subjective QoE experiment that focused on the QoE aspects of live TPR teleoperation over the internet. We were interested in understanding how network QoS-related factors influence the operator’s QoE when using a TPR in an office context.

TPR QoE User Study and Experimental Findings

In our study, we investigated the QoE of TPR navigation along three research questions: 1) impact of network factors including bandwidth, delay and packet loss on the TPR navigation QoE, 2) discrimination between navigation QoE and video QoE, 3) impact of task on TPR QoE sensitivity.

The QoE study participants were situated in a laboratory setting in Dublin, Ireland, where they navigated a Beam Plus TPR via keyboard input on a desktop computer. The TPR was placed in a real office setting of California Telecom in California, USA. Bandwidth, delay and packet loss rate were manipulated on the operator’s PC.

A User Participating in the Robotic Telepresence QoE Study

A total of 23 subjects participated in our QoE lab study: 8 subjects were female and 15 male, and the average test duration was 30 minutes per participant. We followed ITU-R Recommendation BT.500 and identified three participants as outliers, who were excluded from subsequent analysis. In a post-test survey, none of the participants reported task boredom as a factor. In fact, many reported that they enjoyed the experience!

The influence of network factors on Navigation QoE

All three network influence factors exhibited a significant impact on navigation QoE but in different ways. Above a threshold of 0.9 Mbps, bandwidth showed no influence on navigation QoE, while 1% packet loss already showed a noticeable impact on the navigation QoE.  A mixed-model ANOVA confirms that the impact of the different network factors on navigation quality ratings is statistically significant (see [Jahromi et al. 2020] for details).  From the figure below, one can see that the levels of navigation QoE MOS, as well as their sensitivity to network impairment level, depend on the actual impairment type.

The bar plots illustrate the influence of network QoS factors on the navigation quality (left) and the video quality (right).
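As a reminder of how the MOS values shown in such bar plots are typically obtained, the following is a minimal sketch that averages the ratings for one test condition and adds an approximate 95% confidence interval. The ratings are hypothetical and not taken from the study, and the normal-approximation factor z = 1.96 is an assumption; for small panels a Student-t quantile is often used instead.

```python
import math
import statistics

def mos_with_ci(ratings, z=1.96):
    """Return (MOS, half-width of an approximate 95% confidence interval)."""
    n = len(ratings)
    if n < 2:
        raise ValueError("need at least two ratings")
    mos = statistics.mean(ratings)
    sd = statistics.stdev(ratings)  # sample standard deviation (n - 1)
    return mos, z * sd / math.sqrt(n)

# Hypothetical 5-point ratings for a single impairment condition
ratings = [4, 5, 3, 4, 4, 5, 3, 4]
mos, ci = mos_with_ci(ratings)
print(f"MOS = {mos:.2f} +/- {ci:.2f}")
```

Running the same function once per impairment level and per quality dimension (navigation, video) would give the bar heights and error bars of a plot like the one described above.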

Discrimination between navigation QoE and video QoE

Our study results show that the subjects were capable of discriminating between video quality and navigation quality, as they treated them as separate concepts when it comes to experience assessment. Based on ANOVA analysis [Jahromi et al. 2020], we see that the impact of bandwidth and packet loss on TPR video quality ratings was statistically significant. However, for the delay, this was not the case (in contrast to navigation quality). A comparison of navigation quality and video quality subplots shows that changes in MOS across different impairment levels diverge between the two in terms of amplitude. To quantify this divergence, we performed a Spearman Rank-Order Correlation Coefficient (SROCC) analysis, revealing only a weak correlation between video and navigation quality (SROCC = 0.47).
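For readers unfamiliar with SROCC, the sketch below shows one common way to compute it: rank both series (ties receive the mean of their ranks) and take the Pearson correlation of the ranks. The function names and the MOS values are illustrative assumptions, not the study's data or analysis code.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # sorted positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def srocc(x, y):
    """Spearman rank-order correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical per-condition MOS values (not from the study)
video_mos = [4.2, 3.9, 3.1, 2.5, 4.0, 3.6]
navigation_mos = [4.0, 2.8, 3.3, 2.1, 3.9, 3.7]
print(round(srocc(video_mos, navigation_mos), 2))
```

Because SROCC only looks at rank order, it captures whether conditions that degrade video quality also degrade navigation quality, independently of how large the MOS changes are.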

Impact of task on TPR QoE sensitivity

Our study showed that the type of TPR task had more impact on navigation QoE than streaming video QoE. Statistical analysis reveals that the actual task at hand significantly affects QoE impairment sensitivity, depending on the network impairment type. For example, the interaction between bandwidth and task is statistically significant for navigation QoE, which means that changes in bandwidth were rated differently depending on the task type. On the other hand, this was not the case for delay and packet loss. Regarding video quality, we do not see a significant impact of task on QoE sensitivity to network impairments, except for the borderline case for packet loss rate.

Conclusion: Towards a TPR QoE Research Agenda

There were three key findings from this study. First, we understand that users can differentiate between visual and navigation aspects of TPR operation. Secondly, all three network factors have a significant impact on TPR navigation QoE. Thirdly,  visual and navigation QoE sensitivity to specific impairments strongly depends on the actual task at hand. We also found the initial training phase to be essential in order to ensure familiarity of participants with the system and to avoid bias caused by novelty effects. We observed that participants were highly engaged when navigating the TPR, as was also reflected in the positive feedback received during the debriefing interviews. We believe that our study methodology and design, including task types, worked very well and can serve as a solid basis for future TPR QoE studies. 

We also see the necessity of developing a more generic, empirically validated, TPR experience framework that allows for systematic assessment and modelling of QoE and UX in the context of TPR usage. Beyond integrating concepts and constructs that have been already developed in other related domains such as (multi-party) telepresence, XR, gaming, embodiment and human-robot interaction, the development of such a framework must take into account the unique properties that distinguish the TPR experience from other technologies:

  • Asymmetric conditions
    The factors influencing QoE for TPR users are not only bidirectional, they are also different on both sides of the TPR, i.e., the experience is asymmetric. Considering the differences between the local and the remote location, a TPR setup features a noticeable number of asymmetric conditions as regards the number of users, content, context, and even stimuli: while the robot is typically controlled by a single operator, the remote location may host a number of users (asymmetry in the number of users). An asymmetry also exists in the number of stimuli. For instance, the remote users perceive the physical movement and presence of the operator by the actual movement of the TPR. The experience of encountering a TPR rolling into an office is a hybrid kind of intrusion, somewhere between a robot and a physical person. However, from the operator’s perspective, the experience is a rather virtual one, as he/she becomes conscious of physical impact at the remote location only by means of technically mediated feedback.
  • Social Dimensions
    According to [Haans et al. 2012], the experience of telepresence is defined as “a consequence of the way in which we are embodied, and that the capability to feel as if one is actually there in a technologically mediated or simulated environment is a natural consequence of the same ability that allows us to adjust to, for example, a slippery surface or the weight of a hammer”.
    The experience of being present in a TPR-mediated context goes beyond AR and VR. It is a blended physical reality. The sense of ownership of a wheeled TPR, gained through mobility and remote navigation of a “physical” object, allows users to feel as if they are physically present in the remote environment (e.g., as a physical avatar). This allows TPR users to get involved in social activities, such as accompanying people and participating in discussions while navigating, sharing the same visual scenes, visiting a place and getting involved in social discussions, parties and celebrations. In healthcare, a doctor can use a TPR for visiting patients as well as dispensing and administering medication remotely.
  • TPR Mobility and Physical Environment
    Mobility is a key dimension of telepresence frameworks [Rae et al. 2015]. TPR mobility and navigation features introduce new interactions between the operators and the physical environment.  The environmental aspect becomes an integral part of the interaction experience [Hammer et al. 2018].
    During TPR usage, the navigation path and the number of obstacles that a remote user may face can influence the user’s experience. The ease or complexity of navigation can change the operator’s focus and attention from one influence factor to another (e.g., video quality to navigation quality). Paloski et al. found that cognitive impairment as a result of fatigue can influence user performance concerning robot operation [Paloski et al. 2008]. This raises the question of how driving and interacting through a TPR impacts the user’s cognitive load and results in fatigue compared to physical presence.
    The mobility aspects of TPRs can also influence the perception of spatial configurations of the physical environment. This allows the TPR user to manipulate and interact with the environment from a spatial configuration aspect [Narbutt et al. 2017]. For example,  the ambient noise of the environment can be perceived at different levels. The TPR operator can move the robot closer to the source of the noise or keep a distance from it. This can enhance his/her feelings of being present [Rae et al. 2015].

The above distinctive characteristics of a TPR-mediated context illustrate the complexity and the broad range of aspects that potentially have a significant influence on the TPR quality of user experience. Consideration of these features and factors provides a useful basis for the development of a comprehensive TPR experience framework.

References

  • [Tavakoli et al. 2020] Tavakoli, Mahdi, Carriere, Jay and Torabi, Ali. (2020). Robotics For COVID-19: How Can Robots Help Health Care in the Fight Against Coronavirus.
  • [Ackerman 2020] E. Ackerman (2020). Telepresence Robots Are Helping Take Pressure Off Hospital Staff, IEEE Spectrum, Apr 2020
  • [Jahromi et al. 2020] H. Z. Jahromi, I. Bartolec, E. Gamboa, A. Hines, and R. Schatz, “You Drive Me Crazy! Interactive QoE Assessment for Telepresence Robot Control,” in 12th International Conference on Quality of Multimedia Experience (QoMEX 2020), Athlone, Ireland, 2020.
  • [Hammer et al. 2018] F. Hammer, S. Egger-Lampl, and S. Möller, “Quality-of-user-experience: a position paper,” Quality and User Experience, vol. 3, no. 1, Dec. 2018, doi: 10.1007/s41233-018-0022-0.
  • [Haans et al. 2012] A. Haans & W. A. Ijsselsteijn (2012). Embodiment and telepresence: Toward a comprehensive theoretical framework. Interacting with Computers, 24(4), 211-218.
  • [Rae et al. 2015] I. Rae, G. Venolia, JC. Tang, D. Molnar  (2015, February). A framework for understanding and designing telepresence. In Proceedings of the 18th ACM conference on computer supported cooperative work & social computing (pp. 1552-1566).
  • [Narbutt et al. 2017] M. Narbutt, S. O’Leary, A. Allen, J. Skoglund, & A. Hines,  (2017, October). Streaming VR for immersion: Quality aspects of compressed spatial audio. In 2017 23rd International Conference on Virtual System & Multimedia (VSMM) (pp. 1-6). IEEE.
  • [Paloski et al. 2008] W. H. Paloski, C. M. Oman, J. J. Bloomberg, M. F. Reschke, S. J. Wood, D. L. Harm, … & L. S. Stone (2008). Risk of sensory-motor performance failures affecting vehicle control during space missions: a review of the evidence. Journal of Gravitational Physiology, 15(2), 1-29.

Definitions of Crowdsourced Network and QoE Measurements

1 Introduction and Definitions

Crowdsourcing is a well-established concept in the scientific community. The term was used, for instance, by Jeff Howe and Mark Robinson in 2005 to describe how businesses were using the Internet to outsource work to the crowd [2], but the practice can be dated back as far as 1849 (weather prediction in the US). Crowdsourcing has enabled a huge number of new engineering rules and commercial applications. To better define crowdsourcing in the context of network measurements, a seminar on the topic “Crowdsourced Network and QoE Measurements” was held in Würzburg, Germany, on 25-26 September 2019. It notably showed the need for releasing a white paper, with the goal of providing a scientific discussion of the terms “crowdsourced network measurements” and “crowdsourced QoE measurements”. The white paper also describes relevant use cases for such crowdsourced data and its underlying challenges.

The outcome of the seminar is the white paper [1], which is – to our knowledge – the first document covering the topic of crowdsourced network and QoE measurements. This document serves as a basis for differentiation and a consistent view from different perspectives on crowdsourced network measurements, with the goal of providing a commonly accepted definition in the community. The scope is focused on the context of mobile and fixed network operators, but also on measurements of different layers (network, application, user layer). In addition, the white paper shows the value of crowdsourcing for selected use cases, e.g., to improve QoE, or address regulatory issues. Finally, the major challenges and issues for researchers and practitioners are highlighted.

This article now summarizes the current state of the art in crowdsourcing research and lays down the foundation for the definition of crowdsourcing in the context of network and QoE measurements as provided in [1]. One important effort is first to properly define the various elements of crowdsourcing.

1.1 Crowdsourcing

The word crowdsourcing itself is a mix of the crowd and the traditional outsourcing work-commissioning model. Since the publication of [2], the research community has been struggling to find a definition of the term crowdsourcing [3,4,5] that fits the wide variety of its applications and new developments. For example, in ITU-T P.912, crowdsourcing has been defined as:

Crowdsourcing consists of obtaining the needed service by a large group of people, most probably an on-line community.

The above definition has been written with the main purpose of collecting subjective feedback from users. For the purpose of this white paper focused on network measurements, it is required to clarify this definition. In the following, the term crowdsourcing will be defined as follows:

Crowdsourcing is an action by an initiator who outsources tasks to a crowd of participants to achieve a certain goal.

The following terms are further defined to clarify the above definition:

A crowdsourcing action is part of a campaign that includes processes such as campaign design and methodology definition, data capturing and storage, and data analysis.

The initiator of a crowdsourcing action can be a company, an agency (e.g., a regulator), a research institute or an individual.

Crowdsourcing participants (also “workers” or “users”) work on the tasks set up by the initiator. They are third parties with respect to the initiator, and they must be human.

The goal of a crowdsourcing action is its main purpose from the initiator’s perspective.

The goals of a crowdsourcing action can be manifold and may include, for example:

  • Gathering subjective feedback from users about an application (e.g., ranks expressing the experience of users when using an application)
  • Leveraging existing capacities (e.g., storage, computing, etc.)  offered by companies or individual users to perform some tasks
  • Leveraging cognitive efforts of humans for problem-solving in a scientific context.

In general, an initiator adopts a crowdsourcing approach to remedy a lack of resources (e.g., running a large-scale computation by using the resources of a large number of users to overcome its own limitations) or to broaden a test basis much further than classical opinion polls. Crowdsourcing thus covers a wide range of actions with various degrees of involvement by the participants.

In crowdsourcing, there are various methods of identifying, selecting, receiving, and remunerating users contributing to a crowdsourcing initiative and related services. Individuals or organizations obtain goods and/or services in many different ways from a large, relatively open and often rapidly-evolving group of crowdsourcing participants (also called users). The use of goods or information obtained by crowdsourcing to achieve a cumulative result can also depend on the type of task, the collected goods or information and the final goal of the crowdsourcing task.

1.2 Roles and Actors

Given the above definitions, the actors involved in a crowdsourcing action are the initiator and the participants. The role of the initiator is to design and initiate the crowdsourcing action, distribute the required resources to the participants (e.g., a piece of software or the task instructions), assign tasks to the participants or start an open call to a larger group, and finally to collect, process and evaluate the results of the crowdsourcing action.

The role of participants depends on their degree of contribution or involvement. In general, their role is described as follows. At a minimum, they offer their resources to the initiator, e.g., time, ideas, or computation resources. At higher levels of contribution, participants might run or perform the tasks assigned by the initiator, and (optionally) report the results to the initiator.

Finally, the relationships between the initiator and the participants are governed by policies specifying the contextual aspects of the crowdsourcing action such as security and confidentiality, and any interest or business aspects specifying how the participants are remunerated, rewarded or incentivized for their participation in the crowdsourcing action.

2 Crowdsourcing in the Context of Network Measurements

The above model considers crowdsourcing at large. In this section, we analyse crowdsourcing for network measurements, which creates crowd data. This exemplifies the broader definitions introduced above: the scope is more restricted, but it comes with strong contextual aspects such as security and confidentiality rules.

2.1 Definition: Crowdsourced Network Measurements

Crowdsourcing enables a distributed and scalable approach to perform network measurements. It can reach a large number of end-users all over the world. This clearly surpasses the traditional measurement campaigns launched by network operators or regulatory agencies able to reach only a limited sample of users. Primarily, crowd data may be used for the purpose of evaluating QoS, that is, network performance measurements. Crowdsourcing may however also be relevant for evaluating QoE, as it may involve asking users for their experience – depending on the type of campaign.

Taking into account the special aspects of network measurements, crowdsourced network measurements and crowd data are defined as follows, based on the general definition of crowdsourcing introduced above:

Crowdsourced network measurements are actions by an initiator who outsources tasks to a crowd of participants to achieve the goal of gathering network measurement-related data.

Crowd data is the data that is generated in the context of crowdsourced network measurement actions.

The format of the crowd data is specified by the initiator and depends on the type of crowdsourcing action. For instance, crowd data can be the results of large scale computation experiments, analytics, measurement data, etc. In addition, the semantic interpretation of crowd data is under the responsibility of the initiator. The participants cannot interpret the crowd data, which must be thoroughly processed by the initiator to reach the objective of the crowdsourcing action.

We consider in this paper the contribution of human participants only. Distributed measurement actions solely made by robots, IoT devices or automated probes are excluded. Additionally, we require that participants consent to contribute to the crowdsourcing action. This consent might, however, vary from actively fulfilling dedicated task instructions provided by the initiator to merely accepting terms of services that include the option of analysing usage artefacts generated while interacting with a service.

It follows that in the present document, it is assumed that measurements via crowdsourcing (namely, crowd data) are performed by human participants aware of the fact that they are participating in a crowdsourcing campaign. With this clearly stated, more details need to be provided about the slightly adapted roles of the actors and their relationships in a crowdsourcing initiative in the context of network measurements.

2.2 Active and Passive Measurements

For a better classification of crowdsourced network measurements, it is important to differentiate between active and passive measurements. Similar to the current working definition within the ITU-T Study Group 12 work item “E.CrowdESFB” (Crowdsourcing Approach for the assessment of end-to-end QoS in Fixed Broadband and Mobile Networks), the following definitions are made:

Active measurements create artificial traffic to generate crowd data.

Passive measurements do not create artificial traffic, but measure crowd data that is generated by the participant.

For example, a typical case of an active measurement is a speed test that generates artificial traffic against a test server in order to estimate bandwidth or QoS. A passive measurement instead may be realized by fetching cellular information from a mobile device, which has been collected without additional data generation.
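The arithmetic behind the speed-test example can be sketched as follows: the artificial traffic volume and the transfer time yield an average throughput estimate. The function name and the numbers are hypothetical, and a real speed test would additionally deal with ramp-up phases, parallel connections and server selection.

```python
def throughput_mbps(bytes_transferred, duration_s):
    """Average throughput in megabits per second (1 Mbit = 10**6 bits)."""
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    return bytes_transferred * 8 / duration_s / 1e6

# Hypothetical active test: 25 MB of artificial traffic transferred in 4 s
print(throughput_mbps(25_000_000, 4.0))  # 50.0
```

A passive measurement, by contrast, would not trigger such a transfer at all; it would only read values the device already exposes, such as cellular signal information.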

2.3 Roles of the Actors

Participants have to commit to participation in the crowdsourcing measurements. The level of contribution can vary depending on the corresponding effort or level of engagement. The simplest action is to subscribe to or install a specific application, which collects data through measurements as part of its functioning, often in the background and not as part of the core functionality provided to the user. A more complex task-driven engagement requires a greater cognitive effort, such as providing subjective feedback on the performance or quality of certain Internet services. Hence, one must differentiate between participant-initiated measurements and automated measurements:

Participant-initiated measurements require the participant to initiate the measurement. The measurement data are typically provided to the participant.

Automated measurements can be performed without the need for the participant to initiate them. They are typically performed in the background.

A participant can thus be a user or a worker. The distinction depends on the main focus of the person doing the contribution and his/her engagement:

A crowdsourcing user is providing crowd data as the side effect of another activity, in the context of passive, automated measurements.

A crowdsourcing worker is providing crowd data as a consequence of his/her engagement when performing specific tasks, in the context of active, participant-initiated measurements.

The term “users” should, therefore, be used when the crowdsourced activity is not the main focus of engagement, but comes as a side effect of another activity – for example, when using a web browsing application which collects measurements in the background, which is a passive, automated measurement.

“Workers” are involved when the crowdsourced activity is the main driver of engagement, for example, when the worker is paid to perform specific tasks and is performing an active, participant-initiated measurement. Note that in some cases, workers can also be incentivized to provide passive measurement data (e.g. with applications collecting data in the background if not actively used).

In general, workers are paid on the basis of clear guidelines for their specific crowdsourcing activity, whereas users provide their contribution on the basis of a more ambiguous, indirect engagement, such as via the utilization of a particular service provided by the beneficiary of the crowdsourcing results, or a third-party crowd provider. Regardless of the participants’ level of engagement, the data resulting from the crowdsourcing measurement action is reported back to the initiator.
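The terminology above can be summarized as a small data model. This is only an illustrative sketch: the class and field names are assumptions, and the role is derived from whether the crowdsourced activity is the participant's main focus, since the text notes that workers may also contribute passive data.

```python
from dataclasses import dataclass
from enum import Enum

class Traffic(Enum):
    ACTIVE = "active"    # creates artificial traffic
    PASSIVE = "passive"  # observes data the participant generates anyway

class Trigger(Enum):
    PARTICIPANT_INITIATED = "participant-initiated"
    AUTOMATED = "automated"  # typically runs in the background

@dataclass(frozen=True)
class Measurement:
    traffic: Traffic
    trigger: Trigger
    main_focus: bool  # True if the crowdsourced task is the main activity

def participant_role(m: Measurement) -> str:
    """Return 'worker' if the task is the main focus of engagement, else 'user'."""
    return "worker" if m.main_focus else "user"

# Typical combinations described in the text (examples are hypothetical)
speed_test = Measurement(Traffic.ACTIVE, Trigger.PARTICIPANT_INITIATED, True)
browser_probe = Measurement(Traffic.PASSIVE, Trigger.AUTOMATED, False)
print(participant_role(speed_test), participant_role(browser_probe))
```

Keeping traffic type, trigger and focus as separate fields reflects that they are distinct dimensions, even though certain combinations are far more common than others.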

The initiator of the crowdsourcing measurement action often has to design a crowdsourcing measurement campaign, recruit the participants (selectively or openly), provide them with the necessary means to run their action (e.g., client software and the required backend infrastructure), collect, process and analyse the information, and possibly publish the results.

2.4 Dimensions of Crowdsourced Network Measurements

In light of the previous section, there are multiple dimensions to consider for crowdsourcing in the context of network measurements. A preliminary list of dimensions includes:

  • Level of subjectivity (subjective vs. objective measurements) in the crowd data
  • Level of engagement of the participant (participant-initiated or background) or their cognitive effort, and awareness (consciousness) of the measurement
  • Level of traffic generation (active vs. passive)
  • Type and level of incentives (attractiveness/appeal, paid or unpaid)

Besides these key dimensions, there are other features which are relevant in characterizing a crowdsourced network measurement activity. These include scale, cost, and value; the type of data collected; the goal or the intention, i.e. the intention of the user (based on incentives) versus the intention of the crowdsourcing initiator of the resulting output.

Figure 1: Dimensions for network measurements crowdsourcing definition, and relevant characterization features (examples with two types of measurement actions)

In Figure 1, we have illustrated some dimensions of network measurements based on crowdsourcing. Only the subjectivity, engagement and incentives dimensions are displayed, on an arbitrary scale. The objective of this figure is to show that an initiator has a wide range of combinations for a crowdsourcing action. The success of a measurement action with regard to an objective (number of participants, relevance of the results, etc.) is multifactorial. As an example, action 1 may indicate QoE measurements from a limited number of participants, while action 2 visualizes the dimensions for network measurements involving a large number of participants.

3 Summary

The attendees of the Würzburg seminar on “Crowdsourced Network and QoE Measurements” have produced a white paper, which defines terms in the context of crowdsourcing for network and QoE measurements, lists relevant use cases from the perspective of different stakeholders, and discusses the challenges associated with designing crowdsourcing campaigns and with analyzing and interpreting the data. The goal of the white paper is to provide definitions to be commonly accepted by the community and to summarize the most important use cases and challenges from industrial and academic perspectives.

References

[1] White Paper on Crowdsourced Network and QoE Measurements – Definitions, Use Cases and Challenges (2020). Tobias Hoßfeld and Stefan Wunderer, eds., Würzburg, Germany, March 2020. doi: 10.25972/OPUS-20232.

[2] Howe, J. (2006). The rise of crowdsourcing. Wired magazine, 14(6), 1-4.

[3] Estellés-Arolas, E., & González-Ladrón-De-Guevara, F. (2012). Towards an integrated crowdsourcing definition. Journal of Information science, 38(2), 189-200.

[4] Kietzmann, J. H. (2017). Crowdsourcing: A revised definition and introduction to new research. Business Horizons, 60(2), 151-153.

[5] ITU-T P.912, “Subjective video quality assessment methods for recognition tasks”, 08/2016

[6] ITU-T P.808 (ex P.CROWD), “Subjective evaluation of speech quality with a crowdsourcing approach”, 06/2018

Collaborative QoE Management using SDN

The Software-Defined Networking (SDN) paradigm offers flexibility and programmability in the deployment and management of network services by separating the Control plane from the Data plane. Being based on network abstractions and virtualization techniques, SDN simplifies the implementation of traffic engineering techniques as well as the communication among different service providers, including Internet Service Providers (ISPs) and Over The Top (OTT) providers. For these reasons, SDN architectures have been widely used in recent years for the QoE-aware management of multimedia services.

The paper [1] presents Timber, an open source SDN-based emulation platform that provides the research community with a tool for experimenting with new QoE management approaches and algorithms, which may also rely on information exchange between ISP and OTT [2]. We believe that the exchange of information between the OTT and the ISP is extremely important because:

  1. QoE models depend on different influence factors, i.e., network, application, system and context factors [3];
  2. ISP and OTT have different information in their hands, i.e., network state and application Key Quality Indicators (KQIs), respectively;
  3. End-to-end encryption of the OTT services makes it difficult for ISP to have access to application KQIs to perform QoE-aware network management.

In the following we briefly describe Timber and the impact of collaborative QoE management.

Timber architecture

Figure 1 represents the reference architecture, which is composed of four planes. The Service Management Plane is a cloud space owned by the OTT provider, which includes: a QoE Monitoring module to estimate the user’s QoE on the basis of service parameters acquired at the client side; a DB where QoE measurements are stored and can be shared with third parties; a Content Distribution service to deliver multimedia contents. Through the RESTful APIs, the OTTs give access to part of the information stored in the DB to the ISP, on the basis of appropriate agreements.

The Network Data Plane, the Network Control Plane, and the Network Management Plane are those in the hands of the ISP. The Network Data Plane includes all the SDN-enabled data forwarding network devices; the Network Control Plane consists of the SDN controller, which manages the network devices through Southbound APIs; and the Network Management Plane is the application layer of the SDN architecture, controlled by the ISP to perform network-wide control operations, which communicates with the OTT via RESTful APIs. The SDN application includes a QoS Monitoring module to monitor the performance of the network, a Management Policy module to take into account Service Level Agreements (SLA), and a Control Actions module that decides on the network control actions to be implemented by the SDN controller to optimize the network resources and improve the service’s quality.

Timber implements this architecture on top of the Mininet SDN emulator and the Ryu SDN controller, which provides the major functionalities of the traffic engineering abstractions. According to the depicted scenario, the OTT has the potential to monitor the level of QoE for the provided services as it has access to the needed application and network level KQIs. On the other hand, the ISP has the potential to control the network level quality by changing the allocated resources. This scenario is implemented in Timber, which allows for setting up the emulated network and application configuration needed to test QoE-aware service management algorithms.

Specifically, the OTT performs QoE monitoring of the delivered service by acquiring service information from the client side based on passive measurements of service-related KQIs obtained through probes installed in the user’s devices. Based on these measurements, specific QoE models can be used to predict the user experience. The QoE measurements of active clients’ sessions are also stored in the OTT DB, which can be accessed by the ISP through the mentioned RESTful APIs. The ISP’s SDN application periodically checks the OTT-reported QoE and, in case of observed QoE degradations, implements network-wide policies by communicating with the SDN controller through the Northbound APIs. Accordingly, the SDN controller performs network management operations, such as link aggregation, addition of new flows, and network slicing, by controlling the network devices through Southbound APIs.
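The monitoring loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Timber's actual code: the function name check_sessions, the MOS threshold of 3.5, and the plain list standing in for the values read from the OTT DB are all hypothetical.

```python
from typing import Callable, Iterable, List

# Hypothetical threshold below which the ISP reacts; not taken from Timber.
MOS_THRESHOLD = 3.5


def check_sessions(
    reported_mos: Iterable[float],
    apply_control_action: Callable[[float], None],
    threshold: float = MOS_THRESHOLD,
) -> int:
    """Run one polling round over OTT-reported QoE values.

    reported_mos stands in for the values the ISP would read from the
    OTT DB through the RESTful APIs. apply_control_action stands in for
    the request sent to the SDN controller through the Northbound APIs.
    Returns the number of sessions for which an action was triggered.
    """
    triggered = 0
    for mos in reported_mos:
        if mos < threshold:
            apply_control_action(mos)
            triggered += 1
    return triggered


# One polling round over three made-up session scores.
actions: List[float] = []
count = check_sessions([4.3, 3.1, 2.8], actions.append)
print(count, actions)
```

In a real deployment the polling round would be repeated once per sampling interval, which is the parameter studied in the use case below.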

QoE management based on information exchange: video service use-case

The previously described scenario, which is implemented by Timber, portrays a collaboration between the ISP and the OTT, where the latter provides QoE-related data and the former takes care of controlling the resources allocated to the deployed services. Ahmad et al. [4] make use of Timber to conduct experiments aimed at investigating how the frequency of information exchange between an OTT providing a video streaming service and the ISP affects the end-user QoE.

Figure 2 shows the experiment topology. Mininet in Timber is used to create the network topology, which in this case concerns the streaming of video sequences from the media server to User1 (U1) while web traffic is also transmitted on the same network towards User2 (U2). U1 and U2 are two virtual hosts that share the same access network and act as the clients. U1 runs the client-side video player, and the Apache server provides both web and HAS (HTTP Adaptive Streaming) video services.

In the considered collaboration scenario, QoE-related KQIs are extracted from the client side and sent to the MongoDB database (managed by the OTT), as depicted by the red dashed arrows. This information is then retrieved by the SDN controller of the ISP at frequency f (see green dashed arrow). The aim is to provide different network level resources to video streaming and normal web traffic when QoE degradation is observed for the video service. These control actions on the network are needed because TCP-based web traffic sessions of 4 Mbps start randomly towards U2 during the HD video streaming sessions, causing time-varying bottlenecks on the S1−S2 link. In these cases, the SDN controller implements virtual network slicing at the S1 and S2 OVS switches, which provides a minimum guaranteed throughput of 2.5 Mbps and 1 Mbps to video streaming and web traffic, respectively. The SDN controller application utilizes flow matching criteria to assign flows to the virtual slices. The objective of these emulations is to show the impact of f on the resulting QoE.
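The slice assignment can be pictured with a small, self-contained sketch. The guaranteed rates are the ones given above; the match rule (classifying a flow by its destination host) and all names in the code are assumptions made for illustration, and the sketch does not use the Ryu or Open vSwitch APIs.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Slice:
    name: str
    min_rate_mbps: float  # minimum guaranteed throughput of the slice


# Guaranteed rates from the experiment described above.
SLICES: Dict[str, Slice] = {
    "video": Slice("video", 2.5),
    "web": Slice("web", 1.0),
}

# Hypothetical match rule: classify a flow by its destination host.
DESTINATION_TO_SLICE = {"U1": "video", "U2": "web"}


def assign_slice(destination: str) -> Optional[Slice]:
    """Return the slice for a flow, or None if no rule matches."""
    slice_name = DESTINATION_TO_SLICE.get(destination)
    return SLICES.get(slice_name) if slice_name else None


print(assign_slice("U1"))
print(assign_slice("U3"))
```

Flows that match no rule are left unassigned here; how unmatched traffic is treated in the actual experiments is not described in the text.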

The 60-second Big Buck Bunny video sequence at 1280 × 720 resolution was streamed from the server to U1 by considering 5 different sampling intervals T for information exchange between OTT and ISP, i.e., 2s, 4s, 8s, 16s, and 32s. The information exchanged in this case consisted of the average stalling duration and the number of stalling events measured by the probe at the client video player. Accordingly, the QoE for the video streaming service was measured in terms of predicted MOS using the QoE model defined in [5] for HTTP video streaming, as follows:
MOSp = α exp( -β(L)N ) + γ
where L and N are the average stalling duration and the number of stalling events, respectively, whereas α=3.5, γ=1.5, and β(L)=0.15L+0.19.
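As a quick illustration, the model can be evaluated directly. The sketch below uses the constants given above; the function names and the sample inputs (an average stalling duration of 2 seconds and 0, 1 or 3 stalling events) are made up for the example and are not measurements from the experiments.

```python
import math

ALPHA = 3.5
GAMMA = 1.5


def beta(avg_stall_s: float) -> float:
    """beta(L) = 0.15 L + 0.19, with L the average stalling duration in seconds."""
    return 0.15 * avg_stall_s + 0.19


def predicted_mos(avg_stall_s: float, num_stalls: int) -> float:
    """MOSp = alpha * exp(-beta(L) * N) + gamma."""
    return ALPHA * math.exp(-beta(avg_stall_s) * num_stalls) + GAMMA


# Made-up inputs: L = 2 s with 0, 1 and 3 stalling events.
for n in (0, 1, 3):
    print(n, round(predicted_mos(2.0, n), 2))
```

With no stalling events the exponential term equals 1, so the model yields α+γ=5; as the number of stalling events grows, the predicted MOS approaches γ=1.5.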

Figure 3.a shows the average predicted MOS when information is exchanged at different sampling intervals (the inverse of f). The greatest MOSp is 4.34, obtained for both T=2s and T=4s. An exponential decay in MOSp is observed as the frequency of information exchange decreases. The lowest MOSp is 3.07, obtained for T=32s. This result shows that a greater frequency of information exchange leads to lower latency in the controller response to QoE degradation. The reason is that, for higher T, the buffer at the client player keeps starving for longer, which results in longer stalling durations until the SDN controller is triggered to provide the guaranteed network resources that support the video streaming service.

Figure 3.b Initial loading time, average stalling duration and latency in controller response to quality degradation for different sampling intervals.

Figure 3.b shows the video initial loading time, average stalling duration and latency in controller response to quality degradation w.r.t. different sampling intervals. The latency in controller response to QoE degradation increases linearly as the frequency of information exchange decreases, while the stalling duration grows exponentially as the frequency decreases. The initial loading time does not seem to be significantly affected by the sampling interval.
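The linear growth of the response latency is what a simple polling model would suggest. As a back-of-the-envelope sketch, assume (this is not stated in [4]) that a degradation is equally likely to start at any moment within a sampling interval and that the controller can only react at the next poll. The waiting time then averages T/2 and never exceeds T, so both grow linearly with T. The function name and the sampling of onset times are choices made for this example, and processing and network delays are ignored.

```python
def mean_detection_delay(interval_s: float, samples: int = 1000) -> float:
    """Average wait until the next poll for onsets spread evenly over one interval."""
    step = interval_s / samples
    # Onset times sit at the midpoint of each of the `samples` sub-intervals.
    delays = [interval_s - (k + 0.5) * step for k in range(samples)]
    return sum(delays) / samples


# Sampling intervals used in the experiments above.
for t in (2, 4, 8, 16, 32):
    print(t, round(mean_detection_delay(t), 3))
```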

Conclusions

Experiments were conducted on an SDN emulation environment to investigate the impact of the frequency of information exchange between OTT and ISP when a collaborative network management approach is considered. The QoE for a video streaming service was measured by considering 5 different sampling intervals for information exchange between OTT and ISP, i.e., 2s, 4s, 8s, 16s, and 32s. The information exchanged consists of the average video stalling duration and the number of stalling events.

The experiment results showed that a higher frequency of information exchange results in greater delivered QoE, but a sampling interval lower than 4s (frequency > ¼ Hz) may not further improve the delivered QoE. Clearly, this threshold depends on the variability of the network conditions. Further studies are needed to understand how frequently the ISP and OTT should share data to obtain observable QoE benefits under varying network conditions and for different deployed services.

References

[1] A. Ahmad, A. Floris and L. Atzori, “Timber: An SDN based emulation platform for QoE Management Experimental Research,” 2018 Tenth International Conference on Quality of Multimedia Experience (QoMEX), Cagliari, 2018, pp. 1-6.

[2] https://github.com/arslan-ahmad/Timber-DASH

[3] P. Le Callet, S. Möller, A. Perkis et al., “Qualinet White Paper on Definitions of Quality of Experience (2012),” in European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003), Lausanne, Switzerland, Version 1.2, March 2013.

[4] A. Ahmad, A. Floris and L. Atzori, “Towards Information-centric Collaborative QoE Management using SDN,” 2019 IEEE Wireless Communications and Networking Conference (WCNC), Marrakesh, Morocco, 2019, pp. 1-6.

[5] T. Hoßfeld, C. Moldovan, and C. Schwartz, “To each according to his needs: Dimensioning video buffer for specific user profiles and behavior,” in IFIP/IEEE Int. Symposium on Integrated Network Management (IM), 2015. IEEE, 2015, pp. 1249–1254.

Report on QoMEX 2019: QoE and User Experience in Times of Machine Learning, 5G and Immersive Technologies

QoMEX 2019 was held from 5 to 7 June 2019 in Berlin, with Sebastian Möller (TU Berlin and DFKI) and Sebastian Egger-Lampl (AIT Vienna) as general chairs. The annual conference celebrated its 10th birthday in Berlin since the first edition in 2009 in San Diego. That first edition focused on classic multimedia services such as voice, audio and video. Among the fundamental questions back then were how to measure and how to quantify quality from the user’s point of view in order to improve such services. Answers to these questions were also presented and discussed at QoMEX 2019, where technical developments and innovations in terms of video and voice quality were considered. The scope has however broadened significantly over the last decade: interactive applications, games and immersive technologies, which require new methods for the subjective assessment of perceived quality of service and QoE, were addressed. With a focus on 5G and its implications for QoE, the influence of communication networks and network conditions on the transmission of data and the provisioning of services was also examined. In this sense, QoMEX 2019 looked at both classic multimedia applications such as voice, audio and video as well as interactive and immersive services: gaming QoE, virtual realities such as VR exergames, and augmented realities such as smart shopping, 360° video, Point Clouds, Web QoE, text QoE, perception of medical ultrasound videos for radiologists, QoE of visually impaired users with appropriately adapted videos, QoE in smart home environments, etc.

In addition to this application-oriented perspective, methodological approaches and fundamental models of QoE were also discussed during QoMEX 2019. While suitable methods for carrying out user studies and assessing quality remain core topics of QoMEX, advanced statistical methods and machine learning (ML) techniques emerged as another focus topic at this year’s QoMEX. The applicability, performance and accuracy of e.g. neural networks or deep learning approaches have been studied for a wide variety of QoE models and in several domains: video quality in games, content of image quality and compression methods, quality metrics for high-dynamic-range (HDR) images, instantaneous QoE for adaptive video streaming over the Internet and in wireless networks, speech quality metrics, and ML-based voice quality improvement. Research questions addressed at QoMEX 2019 include the impact of crowdsourcing study design on the outcomes, or the reliability of crowdsourcing, for example, in assessing voice quality. In addition to such data-driven approaches, fundamental theoretical work on QoE and its quantification in systems as well as fundamental relationships and model approaches were presented.

The TPC Chairs were Lynne Baillie (HWU Edinburgh), Tobias Hoßfeld (Univ. Würzburg), Katrien De Moor (NTNU Trondheim), Raimund Schatz (AIT Vienna). In total, the program included 11 sessions on the above topics. From those 11 sessions, 6 sessions on dedicated topics were organized by various Special Session organizers in an open call. A total of 82 full paper contributions were submitted, out of which 35 contributions were accepted (acceptance rate: 43%). Out of the 77 short papers submitted, 33 were accepted and presented in two dedicated poster sessions. The QoMEX 2019 Best Paper Award went to Dominik Keller, Tamara Seybold, Janto Skowronek and Alexander Raake for “Assessing Texture Dimensions and Video Quality in Motion Pictures using Sensory Evaluation Techniques”. The Best Student Paper Award went to Alexandre De Masi and Katarzyna Wac for “Predicting Quality of Experience of Popular Mobile Applications in a Living Lab Study”.

The keynote speakers addressed several timely topics. Irina Cotanis gave an inspiring talk on QoE in 5G. She addressed both the emerging challenges and services in 5G and the question of how to measure quality and QoE in these networks. Katrien De Moor highlighted the similarities and differences between QoE and User Experience (UX), considering the evolution of the two terms in the past and their current status. She discussed an integrated view of QoE and UX and how the two concepts may develop in the future. In particular, she posed the question of how the two communities could empower each other and what would be needed to bring them together in the future. The final day of QoMEX 2019 began with the keynote of artist Martina Menegon, who presented some of her art projects based on VR technology.

Additional activities and events within QoMEX 2019 comprised the following. (1) In the Speed PhD mentoring organized by Sebastian Möller and Saman Zadtootaghaj, the participating doctoral students could apply for a short mentoring session (10 minutes per mentor) with various researchers from industry and academia in order to ask technical or general questions. (2) In a session organized by Sebastian Egger-Lampl, the best works of the last 5 years of the simultaneous TVX Conference and QoMEX were presented to show the similarities and differences between the QoE and the UX communities. This was followed by a panel discussion. (3) There was a 3-minute madness session organized by Raimund Schatz and Tobias Hoßfeld, which featured short presentations of “crazy” new ideas in a stimulating atmosphere. The intention of this second session is to playfully encourage the QoMEX community to generate new unconventional ideas and approaches and to provide a forum for mutual creative inspiration.

The next edition, QoMEX 2020, will be held May 26th to 28th 2020 in Athlone, Ireland. More information: http://qomex2020.ie/

Qualinet Databases: Central Resource for QoE Research – History, Current Status, and Plans

Introduction

Datasets are an enabling tool for successful technological development and innovation in numerous fields. Large-scale databases of multimedia content play a crucial role in the development and performance evaluation of multimedia technologies. Among those are most importantly audiovisual signal processing, for example coding, transmission, subjective/objective quality assessment, and QoE (Quality of Experience) [1]. Publicly available and widely accepted datasets are necessary for a fair comparison and validation of systems under test; they are crucial for reproducible research. In the public domain, large amounts of relevant multimedia contents are available, for example, ACM SIGMM Records Dataset Column (http://sigmm.hosting.acm.org/category/datasets-column/), MediaEval Benchmark (http://www.multimediaeval.org/), MMSys Datasets (http://www.sigmm.org/archive/MMsys/mmsys14/index.php/mmsys-datasets.html), etc. However, the description of these datasets is usually scattered – for example in technical reports, research papers, online resources – and it is a cumbersome task for one to find the most appropriate dataset for the particular needs.

The Qualinet Multimedia Databases Online platform is one of many efforts to provide an overview and comparison of multimedia content datasets – especially for QoE-related research, all in one place. The platform was introduced in the frame of ICT COST Action IC1003 European Network on Quality of Experience in Multimedia Systems and Services – Qualinet (http://www.qualinet.eu). The platform, abbreviated “Qualinet Databases” (http://dbq.multimediatech.cz/), is used to share information on databases with the community [3], [4]. Qualinet was supported as a COST Action between November 8, 2010, and November 7, 2014. It has continued as an independent entity with a new structure, activities, and management since 2015. The Qualinet Databases platform fulfills the initial goal to provide a rich and internationally recognized database and has been running since 2010. It is widely considered one of Qualinet’s most notable achievements.

The following paragraphs summarize Qualinet Databases, including its history, current status, and plans.

Background

A commonly recognized database for multimedia content is a crucial resource required not only for QoE-related research. Among the first published efforts in this field are the image and video quality resources website by Stefan Winkler (https://stefan.winklerbros.net/resources.html) and related publications providing in-depth analysis of multimedia content databases [2]. Since 2010, one of the main interests of Qualinet and its Working Group 4 (WG4) entitled Databases and Validation (Leader: Christian Timmerer, Deputy Leaders: Karel Fliegel, Shelley Buchinger, Marcus Barkowsky) was to create an even broader database with extended functionality and take the necessary steps to make it accessible to all researchers.

Qualinet first decided to list and summarize available multimedia databases based on a literature search and feedback from the project members. As the number of databases in the list was rapidly increasing, the handling of the necessary updates became inefficient. Based on these findings, WG4 started the implementation of the Qualinet Databases online platform in 2011. Since then, the website has been used as Qualinet’s central resource for sharing the datasets among Qualinet members and the scientific community. To the best of our knowledge, there is no other publicly available resource for QoE research that offers similar functionality. The Qualinet Databases platform is intended to provide more features than other known similar solutions such as the Consumer Digital Video Library (http://www.cdvl.org). The main difference lies in the fact that Qualinet Databases acts as a hub to various scattered resources of multimedia content, especially those with available data, such as MOS (Mean Opinion Score), raw data from subjective experiments, eye-tracking data, and detailed descriptions of the datasets including scientific references.

In the development of Qualinet DBs within the frame of COST Action IC1003, there are several milestones, which are listed in the timeline below:

  • March 2011 (1st Qualinet General Assembly (GA), Lisbon, Portugal), an initial list of multimedia databases collected and published internally for Qualinet members, creation of Web-based portal proposed,
  • September 2011 (2nd Qualinet GA, Brussels, Belgium), Qualinet DBs prototype portal introduced, development of publicly available resource initiated,
  • February 2012 (3rd Qualinet GA, Prague, Czech Republic), hosting of the Qualinet DBs platform under development at the Czech Technical University in Prague (http://dbq.multimediatech.cz/), Qualinet DBs Wiki page (http://dbq-wiki.multimediatech.cz/) introduced,
  • October 2012 (4th Qualinet GA, Zagreb, Croatia), White paper on Qualinet DBs published [3], Qualinet DBs v1.0 online platform released to the public,
  • March 2013 (5th Qualinet GA, Novi Sad, Serbia), Qualinet DBs v1.5 online platform published with extended functionality,
  • September 2013 (6th Qualinet GA, Novi Sad, Serbia), Qualinet DBs Information leaflet published, Task Force (TF) on Standardization and Dissemination established, QoMEX 2013 Dataset Track organized,
  • March 2014 (7th Qualinet GA, Berlin, Germany), ACM MMSys 2014 Dataset Track organized, liaison with Ecma International (https://www.ecma-international.org/) on possible standardization of Qualinet DBs subset established,
  • October 2014 (8th Final Qualinet GA and Workshop, Delft, The Netherlands), final development stage v3.00 of Qualinet DBs platform reached, code freeze.

Qualinet Databases became Qualinet’s primary resource for sharing datasets with Qualinet members and, after registration, also with the broad scientific community. At the final Qualinet General Assembly under the COST Action IC1003 umbrella (October 2014, Delft, The Netherlands) it was concluded – also based on numerous testimonials – that Qualinet DBs is one of the major assets created throughout the project. Thus it was decided that the sustainability of this resource must be ensured for the years to come. Since 2015 the Qualinet DBs platform has been kept running through the effort of a newly established Task Force, TF4 Qualinet Databases (Leader: Karel Fliegel, Deputy Leaders: Lukáš Krasula, Werner Robitza). The status and achievements are discussed regularly at Qualinet’s Annual Meetings collocated with QoMEX (International Conference on Quality of Multimedia Experience), i.e., 7th QoMEX 2015 (Costa Navarino, Greece), 8th QoMEX 2016 (Lisbon, Portugal), 9th QoMEX 2017 (Erfurt, Germany), 10th QoMEX 2018 (Sardinia, Italy), and 11th QoMEX 2019 (Berlin, Germany).

Current Status

The basic functionality of the Qualinet Databases online platform, see Figure 1, is based on the idea that registered users (Qualinet members and other interested users from the scientific community) have access through an easy-to-use Web portal providing a list of multimedia databases. Based on their user rights, they are allowed to browse information about the particular database and eventually download the actual multimedia content from the link provided by the database owner.

Figure 1. Qualinet Databases online platform and its current interface.

Selected users – Database Owners in particular – have rights to upload or edit their records in the list of databases. Most of the multimedia databases have a flag of “Publicly Available” and are accessible to the registered users outside Qualinet. Only Administrators (Task Force leader and deputy leaders) have the right to delete records in the database. Qualinet DBs does not contain the actual multimedia content but only the access information with provided links to the dataset files saved at the server of the Database Owner.

The Qualinet DBs is accessible to all registered users after entering valid login data. Depending on the level of the rights assigned to the particular account, the user can browse the list of the databases with description (all registered users) and has access to the actual multimedia content via a link entered by the Database Owner. It provides the user with a powerful tool to find the multimedia database that best suits his/her needs.

In the User Settings, the user can select which fields are visible in the list of databases, namely:

  • Database name, Institution, Qualinet Partner (Yes/No),
  • Link, Description (abstract), Access limitations, Publicly available (Yes/No), Copyright Agreement signed (Yes/No),
  • Citation, References, Copyright notice, Database usage tracking,
  • Content type, MOS (Yes/No), Other (Eye tracking, Sensory, …),
  • Total number of contents, SRC, HRC,
  • Subjective evaluation method (DSCQS, …), Number of ratings.

Fulltext search within the selected visible fields is available. In the current version of the Qualinet DBs, users can sort databases alphabetically based on the visible fields or use the search field as described above.
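The search and sort behavior described above can be pictured with a short sketch. The platform's implementation is not described in the text, so the following is only an assumed model: the two records are made up, the field names follow the list above, and the functions search and sort_by are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, str]

# Made-up records; field names follow the list of visible fields above.
RECORDS: List[Record] = [
    {"Database name": "Example Video Set", "Institution": "Univ. A", "Content type": "video"},
    {"Database name": "Sample HDR Images", "Institution": "Univ. B", "Content type": "image"},
]


def search(records: List[Record], query: str, visible_fields: List[str]) -> List[Record]:
    """Case-insensitive substring search restricted to the visible fields."""
    needle = query.lower()
    return [
        r for r in records
        if any(needle in r.get(f, "").lower() for f in visible_fields)
    ]


def sort_by(records: List[Record], field: str) -> List[Record]:
    """Alphabetical sort on one visible field."""
    return sorted(records, key=lambda r: r.get(field, "").lower())


hits = search(RECORDS, "hdr", ["Database name"])
print([r["Database name"] for r in hits])
print([r["Database name"] for r in sort_by(RECORDS, "Institution")])
```

Restricting the search to the visible fields means that a query such as "video" finds nothing when only the Institution column is shown, even though a record matches in a hidden field.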

The list of databases allows:

  • Opening a card with details on particular database record (accessible to all users),
  • Editing database record (accessible to the database owners and administrators),
  • Deleting database record (accessible only to administrators),
  • Requesting deletion of a database record (accessible to the database owners),
  • Requesting assignment as the database owner (accessible to all users).

As for the records available in Qualinet DBs, the listed multimedia databases are a crucial resource for various tasks in multimedia signal processing. Qualinet DBs focuses primarily on content related to QoE research [1], where the design of objective quality assessment algorithms requires (1) Verification of the model during development, (2) Validation of the model after development, and (3) Benchmarking of various models.

Annotated multimedia databases contain essential ground truth, that is, test material from the subjective experiment annotated with subjective ratings. Qualinet DBs also lists other material without subjective ratings for other kinds of experiments. Qualinet DBs covers mostly image and video datasets, including special contents (e.g., 3D, HDR) and data from subjective experiments, such as subjective quality ratings or visual attention data.

A timeline with statistics on the number of records and users registered in Qualinet DBs throughout the years can be seen in Figure 2. Throughout Qualinet COST Action IC1003 the number of registered datasets grew from 64 in March 2011 to 201 in October 2014. The number of datasets created by the Qualinet partner institutions grew from 30 in September 2011 to 83 in October 2014. The number of registered users increased from 37 in March 2013 to 222 in October 2014. After the end of COST Action IC1003 in November 2014 the number of datasets increased to 246 and the number of registered users to 491. The average yearly increase of registered users is approximately 56 users, which illustrates continuous interest and value of Qualinet DBs for the community.

Figure 2. Qualinet Databases statistics on the number of records and users.

Besides the Qualinet DBs online platform (http://dbq.multimediatech.cz/), there are also additional resources available for download via the Wiki page (http://dbq-wiki.multimediatech.cz) and Qualinet website (http://www.qualinet.eu/). Two documents are available: (1) “QUALINET Multimedia Databases v6.5” (May 28, 2017) with a detailed description of registered datasets, and (2) “List of QUALINET Multimedia Databases v6.5” in a searchable spreadsheet with records as of May 28, 2017.

Plans

There are indicators – especially the number of registered users – showing that Qualinet DBs is a valuable resource for the community. However, the current platform as described above has not been updated since 2014, and there are several issues to be solved, such as the burden on one institution to host and maintain the system, possible instability and an obsolete interface, issues with the Wiki page and the lack of a file repository. Moreover, the current system requires user registration. This is a very useful feature for usage tracking and for ensuring database privacy, but at the same time it can put some people off from using the platform and adding new datasets, and it requires handling of personal data. There are also numerous obsolete links in Qualinet DBs; these are useful for the record, but the respective databases should be archived.

A proposal for a new platform for Qualinet DBs was presented at the 13th Qualinet General Meeting in June 2019 (Berlin, Germany) and was subsequently supported by the assembly. The new platform is planned to be based on a Git repository so that the system will be open-source and text-based, and no database will be needed. The user-friendly interface is to be provided by a static website generator; the website itself will be hosted on GitHub. A similar approach has been successfully implemented for the VQEG Software & Tools (https://vqeg.github.io/software-tools/) web portal. Among the main advantages of the new platform are (1) easier access (i.e., fast performance with a simple interface, no hosting fees and thus long-term sustainability, no registration necessary and thus no entry barrier), (2) lower maintenance burden (i.e., minimal technical maintenance effort needed, easy code editing), and (3) future-proofness (i.e., databases are just text files with easy format conversion, and hosting can be done on any server).
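To make the text-based idea concrete, the sketch below turns a couple of text records into a static HTML list using only the Python standard library. The record format (one JSON object per file), the field names and the example entries are assumptions made for illustration; the proposal as described here does not fix the file format or name the generator to be used.

```python
import html
import json
from typing import Dict, List

# Two made-up records as they might be stored in text files in the repository.
RECORD_FILES = [
    '{"name": "Example Video Set", "link": "https://example.org/video", "public": true}',
    '{"name": "Sample <HDR> Images", "link": "https://example.org/hdr", "public": true}',
]


def render_index(raw_records: List[str]) -> str:
    """Turn JSON text records into a single static HTML list."""
    records: List[Dict] = [json.loads(raw) for raw in raw_records]
    items = [
        '<li><a href="{}">{}</a></li>'.format(
            html.escape(rec["link"], quote=True), html.escape(rec["name"])
        )
        for rec in sorted(records, key=lambda rec: rec["name"].lower())
    ]
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


page = render_index(RECORD_FILES)
print(page)
```

Because the output is a plain file, it can be served by any static host, and converting the records to another text format only requires changing the parsing step.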

On the other hand, the new platform will not support user registration and login, which is beneficial in order to prevent data privacy issues. Tracking of registered users will no longer be available, but database usage tracking is planned to be provided via, for example, Google Analytics. There are three levels of dataset availability in the current platform: (1) Publicly available dataset, (2) Information about dataset but data not available/available upon request, and (3) Not publicly available (e.g., Qualinet members only, not supported in the new platform). The migration of Qualinet DBs to the new platform is to be completed by mid-2020. Current data are to be checked and sanitized, and obsolete records moved to the archive.

Conclusions

Broad audiovisual content with diverse characteristics, annotated with data from subjective experiments, is an enabling resource for research in multimedia signal processing, especially when QoE is considered. The availability of training and testing data becomes even more important nowadays, with the ever-increasing utilization of machine learning approaches. Qualinet Databases helps to facilitate reproducible research in the field and has become a valuable resource for the community.

References

  • [1] Le Callet, P., Möller, S., Perkis, A. Qualinet White Paper on Definitions of Quality of Experience, European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003), Lausanne, Switzerland, Version 1.2, March 2013. (http://www.qualinet.eu/images/stories/QoE_whitepaper_v1.2.pdf)
  • [2] Winkler, S. Analysis of public image and video databases for quality assessment, IEEE Journal of Selected Topics in Signal Processing, 6(6):616-625, 2012. (https://doi.org/10.1109/JSTSP.2012.2215007)
  • [3] Fliegel, K., Timmerer, C. (eds.) WG4 Databases White Paper v1.5: QUALINET Multimedia Database enabling QoE Evaluations and Benchmarking, Prague/Klagenfurt, Czech Republic/Austria, Version 1.5, March 2013.
  • [4] Fliegel, K., Battisti, F., Carli, M., Gelautz, M., Krasula, L., Le Callet, P., Zlokolica, V. 3D Visual Content Datasets. In: Assunção P., Gotchev A. (eds) 3D Visual Content Creation, Coding and Delivery. Signals and Communication Technology, Springer, Cham, 2019. (https://doi.org/10.1007/978-3-319-77842-6_11)

Note: Readers interested in actively contributing to the continued success of Qualinet Databases are referred to Qualinet (http://www.qualinet.eu/) and invited to join its Task Force on Qualinet Databases via email reflector. To subscribe, please send an email to (dbq.wg4.qualinet-subscribe@listes.epfl.ch). This work was partially supported by the project No. GA17-05840S “Multicriteria optimization of shift-variant imaging system models” of the Czech Science Foundation.

Report from QoE-Management 2019

The 3rd International Workshop on Quality of Experience Management (QoE-Management 2019) was a successful full-day event held on February 18, 2019 in Paris, France, where it was co-located with the 22nd Conference on Innovation in Clouds, Internet and Networks (ICIN). After the success of the previous QoE-Management workshops, the third edition of the workshop was also endorsed by the QoE and Networking Initiative (http://qoe.community). It was organized by workshop co-chairs Michael Seufert (AIT, Austrian Institute of Technology, Austria, who is now at University of Würzburg, Germany), Lea Skorin-Kapov (University of Zagreb, Croatia) and Luigi Atzori (University of Cagliari, Italy). The workshop attracted 24 full paper and 3 short paper submissions. The Technical Program Committee consisted of 33 experts in the field of QoE Management, who provided at least three reviews per submitted paper. Eventually, 12 full papers and 1 short paper were accepted for publication, which gave an acceptance rate of 48%.

On the day of the workshop, the co-chairs welcomed 30 participants. The workshop started with a keynote given by Martín Varela (callstats.io, Finland) who elaborated on “Some things we might have missed along the way”. He presented open technical and business-related research challenges for the QoE Management community, which he supported with examples from his current research on the QoE monitoring of WebRTC video conferencing. Afterwards, the first two technical sessions focused on video streaming. Susanna Schwarzmann (TU Berlin, Germany) presented a discrete time analysis approach to compute QoE-relevant metrics for adaptive video streaming. Michael Seufert (AIT Austrian Institute of Technology, Austria) reported the results of an empirical comparison, which did not find any differences in the QoE between QUIC- and TCP-based video streaming for naïve end users. Anika Schwind (University of Würzburg, Germany) discussed the impact of virtualization on video streaming behavior in measurement studies. Maria Torres Vega (Ghent University, Belgium) presented a probabilistic approach for QoE assessment based on user’s gaze in 360° video streams with head mounted displays. Finally, Tatsuya Otoshi (Osaka University, Japan) outlined how quantum decision making-based recommendation methods for adaptive video streaming could be implemented.

The next session was centered around machine learning-based quality prediction. Pedro Casas (AIT Austrian Institute of Technology) presented a stream-based machine learning approach for detecting stalling in real-time from encrypted video traffic. Simone Porcu (University of Cagliari, Italy) reported on the results of a study investigating the potential of predicting QoE from facial expressions and gaze direction for video streaming services. Belmoukadam Othmane (Cote D’Azur University & INRIA Sophia Antipolis, France) introduced ACQUA, which is a lightweight platform for network monitoring and QoE forecasting from mobile devices. After the lunch break, Dario Rossi (Huawei, France) gave the second keynote, entitled “Human in the QoE loop (aka the Wolf in Sheep’s clothing)”. He used the main leitmotiv of Web browsing and showed relevant practical examples to discuss the challenges towards QoE-driven network management and data-driven QoE models based on machine learning.

The following technical session was focused on resource allocation. Tobias Hoßfeld (University of Würzburg, Germany) elaborated on the interplay between QoE, user behavior and system blocking in QoE management. Lea Skorin-Kapov (University of Zagreb, Croatia) presented studies on QoE-aware resource allocation for multiple cloud gaming users sharing a bottleneck link. Quality monitoring was the topic of the last technical session. Tomas Boros (Slovak University of Technology, Slovakia) reported how video streaming QoE could be improved by 5G network orchestration. Alessandro Floris (University of Cagliari, Italy) talked about the value of influence factors data for QoE-aware management. Finally, Antoine Saverimoutou (Orange, France) presented WebView, a measurement platform for web browsing QoE. The workshop co-chairs closed the day with a short recap and thanked all speakers and participants, who joined in the fruitful discussions. To summarize, the third edition of the QoE Management workshop proved to be very successful, as it brought together researchers from both academia and industry to discuss emerging concepts and challenges related to managing QoE for network services. As the workshop has proven to foster active collaborations in the research community over the past years, a fourth edition is planned in 2020.

We would like to thank all the authors, reviewers, and attendees for their precious contributions towards the successful organization of the workshop!

Michael Seufert, Lea Skorin-Kapov, Luigi Atzori
QoE-Management 2019 Workshop Co-Chairs

On System QoE: Merging the system and the QoE perspectives

With Quality of Experience (QoE) research having made significant advances over the years, increased attention is being put on exploiting this knowledge from a service/network provider perspective in the context of the user-centric evaluation of systems. Current research investigates the impact of system/service mechanisms, their implementation or configurations on the service performance and how it affects the corresponding QoE of its users. Prominent examples address adaptive video streaming services, as well as enabling technologies for QoE-aware service management and monitoring, such as SDN/NFV and machine learning. This is also reflected in the latest edition of conferences such as the ACM Multimedia Systems Conference (MMSys ’19); some selected exemplary papers are listed below.

  • “ERUDITE: a Deep Neural Network for Optimal Tuning of Adaptive Video Streaming Controllers” by De Cicco, L., Cilli, G., & Mascolo, S.
  • “An SDN-Based Device-Aware Live Video Service For Inter-Domain Adaptive Bitrate Streaming” by Khalid, A., Zahran, H. & Sreenan C.J.
  • “Quality-aware Strategies for Optimizing ABR Video Streaming QoE and Reducing Data Usage” by Qin, Y., Hao, S., Pattipati, K., Qian, F., Sen, S., Wang, B., & Yue, C.
  • “Evaluation of Shared Resource Allocation using SAND for Adaptive Bitrate Streaming” by Pham, S., Heeren, P., Silhavy, D., Arbanowski, S.
  • “Requet: Real-Time QoE Detection for Encrypted YouTube Traffic” by Gutterman, C., Guo, K., Arora, S., Wang, X., Wu, L., Katz-Bassett, E., & Zussman, G.

For the evaluation of systems, proper QoE models are of utmost importance, as they provide a mapping of various parameters to QoE. One of the main research challenges faced by the QoE community is deriving QoE models for various applications and services, whereby ratings collected from subjective user studies are used to model the relationship between tested influence factors and QoE. Below is a selection of papers dealing with this topic from QoMEX 2019, the main scientific venue for the QoE community.

  • “Subjective Assessment of Adaptive Media Playout for Video Streaming” by Pérez, P., García, N., & Villegas, A.
  • “Assessing Texture Dimensions and Video Quality in Motion Pictures using Sensory Evaluation Techniques” by Keller, D., Seybold, T., Skowronek, J., & Raake, A.
  • “Tile-based Streaming of 8K Omnidirectional Video: Subjective and Objective QoE Evaluation” by Schatz, R., Zabrovskiy, A., & Timmerer, C.
  • “SUR-Net: Predicting the Satisfied User Ratio Curve for Image Compression with Deep Learning” by Fan, C., Lin, H., Hosu, V., Zhang, Y., Jiang, Q., Hamzaoui, R., & Saupe, D.
  • “Analysis and Prediction of Video QoE in Wireless Cellular Networks using Machine Learning” by Minovski, D., Åhlund, C., Mitra, K., & Johansson, P.

System-centric QoE

When considering the whole service, the question arises of how to properly evaluate QoE in a systems context, i.e., how to quantify system-centric QoE. The paper [1] provides fundamental relationships for deriving system-centric QoE, which are the basis for this article.

In the QoE community, subjective user studies are conducted to derive relationships between influence factors and QoE. Typically, the results of these studies are presented in terms of Mean Opinion Scores (MOS). However, these MOS results mask user diversity, which leads to specific distributions of user scores for particular test conditions. In a systems context, QoE can be better represented as a random variable Q|t for a fixed test condition. Such models are commonly exploited by service/network providers to derive various QoE metrics [2] in their system, such as expected QoE, or the percentage of users rating above a certain threshold (Good-or-Better ratio GoB).
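
To make the difference between the MOS and the GoB ratio concrete, the following minimal sketch computes both from a single rating distribution Q|t. The distribution values and the helper names `mos` and `gob` are hypothetical and chosen for illustration only; they are not taken from [1] or [2].

```python
# Hypothetical rating distribution of Q|t for one fixed test condition t:
# keys are ACR scores (1..5), values are the share of users giving that score.
ratings_t = {1: 0.05, 2: 0.10, 3: 0.25, 4: 0.40, 5: 0.20}


def mos(rating_probs):
    """Mean Opinion Score: expected value E[Q|t] of the rating distribution."""
    return sum(score * p for score, p in rating_probs.items())


def gob(rating_probs, threshold=4):
    """Good-or-Better ratio: share of ratings at or above the threshold."""
    return sum(p for score, p in rating_probs.items() if score >= threshold)


print(f"MOS = {mos(ratings_t):.2f}")
print(f"GoB = {gob(ratings_t):.2f}")
```

With these assumed values the MOS is about 3.6 while 60% of the users rate "good or better", which illustrates that a single mean value hides how the ratings are spread across users.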

Across the whole service, users will experience different performance, measured by e.g., response times, throughput, etc., which depends on the system’s (and services’) configuration and implementation. In turn, this leads to users experiencing different quality levels. As an example, we consider the response time of a system, which offers a certain web service, such as access to a static web site. In such a case, the system’s performance can be represented by a random variable R for the response time. In the system community, research aims at deriving such distributions of the performance, R.

The user-centric evaluation of the system combines the system’s perspective and the QoE perspective, as illustrated in the figure below. We consider service/network providers interested in deriving various QoE metrics in their system, given (a) the system’s performance, and (b) QoE models available from user studies. The main question we need to answer is how to combine (a) user rating distributions obtained from subjective studies, and (b) system performance condition distributions, so as to obtain the actual observed QoE distribution in the system. Moreover, how can various QoE metrics of interest in the system be derived?

System centric QoE - Merging the system and the QoE perspectives


Model of System-centric QoE

A service provider is interested in the QoE distribution Q in the system, which includes the following stochastic components: 1) system performance condition, t (i.e., response time in our example), and 2) user diversity, Q|t. This system-centric QoE distribution allows us to derive various QoE metrics, such as expected QoE or expected GoB in the system.

Some basic mathematical transformations allow us to derive the expected system-centric QoE E[Q], as shown below. As a result, we show that the expected system QoE is equal to the expected Mean Opinion Score (MOS) in the system! Hence, for deriving system QoE, it is necessary to measure the response time distribution R and to have a proper QoS-to-MOS mapping function f(t) obtained from subjective studies. From the subjective studies, we obtain the MOS mapping function for a response time t, f(t)=E[Q|t]. The system QoE then follows as E[Q] = E[f(R)] = E[M], where M = f(R) denotes the MOS observed in the system. Note: the distribution of the MOS M in the system only allows the expected MOS, i.e., the expected system-centric QoE, to be derived.
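
The transformation can be sketched with the law of total expectation. The notation F_R for the cumulative distribution function of the response time R is an assumption introduced here for illustration; the symbols Q, R, f and M are those used in the text.

```latex
\begin{align}
E[Q] &= \int E[Q \mid R = t] \, dF_R(t) \\
     &= \int f(t) \, dF_R(t) \\
     &= E[f(R)] = E[M], \qquad \text{with } M = f(R).
\end{align}
```

The first step conditions on the response time experienced by a user, and the second step replaces the conditional expectation by the MOS mapping function f(t) = E[Q|t] obtained from the subjective study.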

Expected system QoE E[Q] in the system is equal to the expected MOS


Let us consider another system-centric QoE metric, such as the GoB ratio. On a typical 5-point Absolute Category Rating (ACR) scale (1: bad quality, 5: excellent quality), the system-centric GoB is defined as GoB[Q]=P(Q>=4). We find that it is not possible to use a MOS mapping function f and the MOS distribution M=f(R) to derive GoB[Q] in the system! Instead, it is necessary to use the corresponding QoS-to-GoB mapping function g. This mapping function g can also be derived from the same subjective studies as the MOS mapping function, and maps the response time (tested in the subjective experiment) to the ratio of users rating “good or better” QoE, i.e., g(t)=P(Q|t >= 4). We may thus derive in a similar way: GoB[Q]=E[g(R)]. In the system, the GoB ratio is the expected value of the response times R mapped to g(R). Similar observations lead to analogous results for other QoE metrics, such as quantiles or variances (see [1]).
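
The following minimal sketch illustrates why the GoB ratio cannot be obtained from the MOS distribution. The response time distribution, the rating distributions and all names are hypothetical values chosen for illustration; they are not the numerical examples from [1].

```python
# Hypothetical response time distribution R in the system:
# response time in seconds -> probability that a user experiences it.
response_time_probs = {0.5: 0.5, 2.0: 0.3, 8.0: 0.2}

# Hypothetical rating distributions Q|t from a subjective study,
# one per tested response time (ACR score -> share of users).
ratings_by_time = {
    0.5: {3: 0.1, 4: 0.4, 5: 0.5},
    2.0: {2: 0.1, 3: 0.3, 4: 0.4, 5: 0.2},
    8.0: {1: 0.3, 2: 0.4, 3: 0.2, 4: 0.1},
}


def f(t):
    """QoS-to-MOS mapping: f(t) = E[Q|t]."""
    return sum(score * p for score, p in ratings_by_time[t].items())


def g(t):
    """QoS-to-GoB mapping: g(t) = P(Q|t >= 4)."""
    return sum(p for score, p in ratings_by_time[t].items() if score >= 4)


def expectation(mapping):
    """Expected value of mapping(R) over the response time distribution."""
    return sum(p * mapping(t) for t, p in response_time_probs.items())


expected_qoe = expectation(f)  # E[Q] = E[f(R)] = E[M]
system_gob = expectation(g)    # GoB[Q] = E[g(R)]

# Inadequate shortcut: apply the threshold to the MOS distribution M = f(R).
naive_gob = sum(p for t, p in response_time_probs.items() if f(t) >= 4)

print(f"E[Q] = {expected_qoe:.2f}")
print(f"GoB[Q] = {system_gob:.2f}")
print(f"P(M >= 4) = {naive_gob:.2f}")
```

Under these assumptions the shortcut based on the MOS distribution reports a different GoB ratio than E[g(R)], because it ignores the users who rate 4 or higher under conditions whose MOS lies below 4, and vice versa.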

Conclusions

The reported fundamental relationships provide an important link between the QoE community and the systems community. If researchers conducting subjective user studies provide different QoS-to-QoE mapping functions for QoE metrics of interest (e.g., MOS or GoB), this is enough to derive corresponding QoE metrics from a system’s perspective. This holds for any QoS (e.g., response time) distribution in the system, as long as the corresponding QoS values are captured in the reported QoE models. As a result, we encourage QoE researchers to report not only MOS mappings, but the entire rating distributions from conducted subjective studies. As an alternative, researchers may report QoE metrics and corresponding mapping functions beyond just those relying on MOS!

We draw the attention of the systems community to the fact that the actual QoE distribution in a system is not (necessarily) equal to the MOS distribution in the system (see [1] for numerical examples). Just applying MOS mapping functions and then using the observed MOS distribution to derive other QoE metrics like GoB is not adequate. The current systems literature, however, indicates a clear lack of common understanding of the implications of using MOS distributions rather than actual QoE distributions.

References

[1] Hoßfeld, T., Heegaard, P. E., Skorin-Kapov, L., & Varela, M. (2019). Fundamental Relationships for Deriving QoE in Systems. 2019 Eleventh International Conference on Quality of Multimedia Experience (QoMEX). IEEE.

[2] Hoßfeld, T., Heegaard, P. E., Varela, M., & Möller, S. (2016). QoE beyond the MOS: an in-depth look at QoE via better metrics and their relation to MOS. Quality and User Experience, 1(1), 2.

Authors

  • Tobias Hoßfeld (University of Würzburg, Germany) is heading the chair of communication networks.
  • Poul E. Heegaard (NTNU – Norwegian University of Science and Technology) is heading the Networking Research Group.
  • Lea Skorin-Kapov (University of Zagreb, Faculty of Electrical Engineering and Computing, Croatia) is heading the Multimedia Quality of Experience Research Lab.
  • Martin Varela is working in the analytics team at callstats.io focusing on understanding and monitoring QoE for WebRTC services.

Towards an Integrated View on QoE and UX: Adding the Eudaimonic Dimension

In the past, research on Quality of Experience (QoE) has frequently been limited to networked multimedia applications, such as the transmission of speech, audio and video signals. In parallel, usability and User Experience (UX) research addressed human-machine interaction systems which either focus on a functional (pragmatic) or aesthetic (hedonic) aspect of the experience of the user. In both the QoE and UX domains, the context (mental, social, physical, societal etc.) of use has mostly been considered as a control factor, in order to guarantee the functionality of the service or the ecological validity of the evaluation. This situation changes when systems are considered which explicitly integrate the usage environment and context they are used in, such as Cyber-Physical Systems (CPS), used e.g. in smart home or smart workplace scenarios. Such systems are equipped with sensors and actuators which are able to sample and manipulate the environment they are integrated into, and thus the interaction with them is somehow moderated through the environment; e.g. the environment can react to a user entering a room. In addition, such systems are used for applications which differ from standard multimedia communication in the sense that they are frequently used over a long or repeating period(s) of time, and/or in a professional use scenario. In such application scenarios the motivation of system usage can be divided between the actual system user and a third party (e.g. the employer), resulting in differing factors affecting related experiences (in comparison to services which are used on the user’s own account). However, the impact of this duality of usage motivation on the resulting QoE or UX has rarely been addressed in existing research in either scientific community.

In the context of QoE research, the European Network on Quality of Experience in Multimedia Systems and Services, Qualinet (COST Action IC 1003) as well as a number of Dagstuhl seminars [see note from the editors], started a scientific discussion about the definition of the term QoE and related concepts around 2011. This discussion resulted in a White Paper which defines QoE as “the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user’s personality and current state.” [White Paper 2012]. Besides this definition, the white paper describes a number of factors that influence a user’s QoE perception, e.g. human-, system- and contextual factors. Although this discussion lists a large set of influencing factors quite thoroughly, it still focuses on rather short-term (or episodic) and media related hedonic experiences. A first step towards integrating an additional (quality) dimension (to the hedonic one) has been described in [Hammer et al. 2018], where the authors introduced the eudaimonic perspective as being the user’s overall well-being as a result of system usage. The term “eudaimonic” stems from Aristotle and is commonly used to designate a deeper degree of well-being, as a result of self-fulfillment by developing one’s own strengths.

UX research, on the other hand, has historically evolved from usability research (which for a long time focused on enhancing the efficiency and effectiveness of the system), and was initially concerned with the prevention of negative emotions related to technology use. Usability research identified the pragmatic aspects of the analyzed ICT systems as an important contributor to this prevention. However, the shift towards a modern understanding of UX focuses on the understanding of human-machine interaction as a specific emotional experience (e.g., pleasure) and considers pragmatic aspects only as enablers of positive experiences but not as contributors to positive experiences. In line with this understanding, the concept of Positive or Hedonic Psychology, as introduced by [Kahnemann 1999], has been embedded and adopted in HCI and UX research. As a result, the related research community has mainly focused on the hedonic aspects of experiences as described in [Diefenbach 2014] and as critically outlined by [Mekler 2016], in which the authors argue that this concentration on hedonic aspects has overshadowed the importance of eudaimonic aspects of well-being as described in positive psychology. With respect to the measurement of user experiences, the devotion towards hedonic psychology also comes with the need for measuring emotional responses (or experiential qualities). In contrast to the majority of QoE research, where the focus is on measuring the (single) experienced (media) quality of a multimedia system, the measurement of experiential qualities in UX calls for the measurement of a range of qualities (e.g. [Bargas-Avila 2011] lists affect, emotion, fun, aesthetics, hedonic and flow as qualities that are assessed in the context of UX). Hence, this measurement approach considers a considerably broader range of quantified qualities. However, the development of the UX domain towards a design-based UX research that steers away from quantitatively measurable qualities and focuses more on a qualitative research approach (that does not generate measurable numbers) has marginalized this measurement or model-based UX research camp in recent UX developments, as noted by [Law 2014].

While existing work in QoE mainly focuses on hedonic aspects (and in UX, also on pragmatic ones), eudaimonic aspects such as the development of one’s own strengths have not been considered extensively so far in the context of both research areas. Especially in the usage context of professional applications, the meaningfulness of system usage (which is strongly related to eudaimonic aspects) and the growth of the user’s capabilities will certainly influence the resulting experiential quality(ies). In particular, professional applications must be designed such that the user continues to use the system in the long run without frustration, i.e. provide long-term acceptance for applications which the user is required to use by the employer. In order to consider these aspects, the so-called “HEP cube” has been introduced in [Hammer et al. 2018]. It opens a 3-dimensional space of hedonic (H), eudaimonic (E) and pragmatic (P) aspects of QoE and UX, which are integrated towards a Quality of User Experience (QUX) concept.

Whereas a simple definition of QUX has not yet been set up in this context, a number of QUX-related aspects, e.g. utility (P), joy-of-use (H), meaningfulness (E), have been integrated into a multidimensional HEP construct. This construct is displayed in Figure 1. In addition to the well-known hedonic and pragmatic aspects of UX, it incorporates the eudaimonic dimension. Thereby, it shows the assumed relationships between aforementioned aspects of User Experience and QoE, and in addition usefulness and motivation (which is strongly related to the eudaimonic dimension). These aspects are triggered by user needs (first layer) and moderated by the respective dimension aspects joy-of-use (for hedonic), ease-of-use (pragmatic), and purpose-of-use (eudaimonic). The authors expect that a consideration of the additional needs and QUX aspects, and an incorporation of these aspects into application design, will not only lead to higher acceptance rates, but also to deep-grounded well-being of users. Furthermore, incorporation of these aspects into QoE and / or QUX modelling will improve their respective prediction performance and ecological validity.


Figure 1: QUX as a multidimensional construct involving HEP attributes, existing QoE/UX, need fulfillment and motivation. Picture taken from Hammer, F., Egger-Lampl, S., Möller, S.: Quality-of-User-Experience: A Position Paper, Quality and User Experience, Springer (2018).

References

  • [White Paper 2012] Qualinet White Paper on Definitions of Quality of Experience (2012). European Network on Quality of Experience in Multimedia Systems and Services (COST Action IC 1003), Patrick Le Callet, Sebastian Möller and Andrew Perkis, eds., Lausanne, Switzerland, Version 1.2, March 2013.
  • [Kahnemann 1999] Kahneman, D.: Well-being: Foundations of Hedonic Psychology, chap. Objective Happiness, pp. 3–25. Russell Sage Foundation Press, New York (1999)
  • [Diefenbach 2014] Diefenbach, S., Kolb, N., Hassenzahl, M.: The ‘hedonic’ in human-computer interaction: History, contributions, and future research directions. In: Proceedings of the 2014 conference on Designing interactive systems, pp. 305–314. ACM (2014)
  • [Mekler 2016] Mekler, E.D., Hornbaek, K.: Momentary pleasure or lasting meaning?: Distinguishing eudaimonic and hedonic user experiences. In: Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 4509–4520. ACM (2016)
  • [Bargas-Avila 2011] Bargas-Avila, J.A., Hornbaek, K.: Old wine in new bottles or novel challenges: A critical analysis of empirical studies of user experience. In: Proceedings of the SIGCHI Conference on Human Factors in Computing Systems, pp. 2689–2698. ACM (2011)
  • [Law 2014] Law, E.L.C., van Schaik, P., Roto, V.: Attitudes towards user experience (UX) measurement. International Journal of Human-Computer Studies 72(6), 526–541 (2014)
  • [Hammer et al. 2018] Hammer, F., Egger-Lampl, S., Möller, S.: Quality-of-User-Experience: A Position Paper, Quality and User Experience, Springer (2018).

Note from the editors:

More details on the integrated view of QoE and UX can be found in Hammer, F., Egger-Lampl, S. & Möller, S., “Quality-of-user-experience: a position paper”. Springer Quality and User Experience (2018) 3: 9. https://doi.org/10.1007/s41233-018-0022-0

The Dagstuhl seminars mentioned by the authors started a scientific discussion about the definition of the term QoE in 2009. Three Dagstuhl Seminars were related to QoE: 09192 “From Quality of Service to Quality of Experience” (2009), 12181 “Quality of Experience: From User Perception to Instrumental Metrics” (2012), and 15022 “Quality of Experience: From Assessment to Application” (2015). A Dagstuhl Perspectives Workshop 16472 “QoE Vadis?” followed in 2016, which set out to jointly and critically reflect on future perspectives and directions of QoE research. During the Dagstuhl Perspectives Workshop, the “QoE-UX wedding proposal” came up, which suggested marrying the areas of QoE and UX. The reports from the Dagstuhl seminars as well as the Manifesto from the Perspectives Workshop are available online and listed below.

One step towards an integrated view of QoE and UX is reflected by QoMEX 2019. The 11th International Conference on Quality of Multimedia Experience will be held from June 5th to 7th, 2019 in Berlin, Germany. It will bring together leading experts from academia and industry to present and discuss current and future research on multimedia quality, quality of experience (QoE) and user experience (UX). This way, it will contribute towards an integrated view on QoE and UX, and foster the exchange between the so-far distinct communities. More details: https://www.qomex2019.de/