Title: A Conceptual Architecture and a Framework for Dealing with Variability in Mulsemedia Systems
The increasing interest in digital immersive experiences has drawn the attention of researchers toward understanding human perception whilst adding sensory effects to multimedia systems such as VR (Virtual Reality) and AR (Augmented Reality) applications, multimedia players, and games. These so-called mulsemedia (multiple sensorial media) systems are capable of delivering wind, smell, and vibration, among others, along with audiovisual ...
Title: A Three Layer System for Audio-visual Quality Assessment
The development of models for quality prediction of both audio and video signals is a fairly mature field. However, although several multimodal models have been proposed, audiovisual quality prediction is still an emerging area. In fact, despite the reasonable performance obtained by combination and parametric metrics, currently there is no reliable pixel-based ...
Title: Quality of experience characterization and provisioning in mobile cellular networks
Traditionally, previous generations of mobile cellular networks have been designed with Quality of Service (QoS) criteria in mind, so that they manage to meet specific service requirements. Quality of Experience (QoE) has, however, recently emerged as a concept, disrupting the design of future network generations by giving clear emphasis on the actually achieved user experience. ...
Title: Quality of Experience and Access Network Traffic Management of HTTP Adaptive Video Streaming
The thesis focuses on Quality of Experience (QoE) of HTTP adaptive video streaming (HAS) and traffic management in access networks to improve the QoE of HAS. First, the QoE impact of adaptation parameters and time on layer was investigated with subjective crowdsourcing studies. The results were used to compute a QoE-optimal adaptation strategy for given ...
Title: Objects for spatio-temporal activity recognition in videos
This thesis investigates the role of objects for the spatio-temporal recognition of activities in videos. We investigate what, when, and where specific activities occur in visual content by examining object representations, centered around the main question: what do objects tell about the extent of activities in visual space and time? The thesis presents six works ...
Title: An action recognition framework for uncontrolled video capture based on a spatio-temporal video graph
The task of automatic categorization and localization of human action in video sequences is valuable for a variety of applications such as detecting relevant activities in surveillance video, summarizing and indexing video sequences, or organizing a digital video library according to the relevant actions. However, it remains a challenging problem for computers to robustly recognize ...
Title: Deep image representations for instance search
We address the problem of visual instance search, which consists of retrieving all the images within a dataset that contain a particular visual example provided to the system. The traditional approach of processing the image content for this task relied on extracting local low-level information within images that was “manually engineered” to be invariant to ...
Title: Improving instance search performance in video collections
This thesis presents methods to improve instance search and enhance user performance while browsing unstructured video collections. Through the use of computer vision and information retrieval techniques, we propose novel solutions to analyse visual content and build a search algorithm to address the challenges of visual instance search, while considering the constraints for practical applications. ...
Title: An Investigation Into Machine Learning Solutions Involving Time Series Across Different Problem Domains
In this thesis we examine architectures and models for machine learning in three problem domains, each of which is based around the use of time series data in time series applications. We set out to examine whether the architecture and model solutions in different problem domains will converge when optimised towards a similar solution ...
Title: Behavioural biometric identification based on human computer interaction
As we become increasingly dependent on information systems, personal identification and profiling systems have received increasing interest, either for reasons of personalisation or security. Biometric profiling is one means of identification which can be achieved by analysing something the user is or does (e.g., a fingerprint, signature, face, voice). This Ph.D. research focuses on ...
Title: Investigating multi-modal features for continuous affect recognition using visual sensing
Emotion plays an essential role in human cognition, perception and rational decision-making. In the information age, people spend more time than ever before interacting with computers; however, current technologies such as Artificial Intelligence (AI) and Human-Computer Interaction (HCI) have largely ignored the implicit information of a user’s emotional state, leading to an often frustrating and ...
Sang-hyo Park
In video coding, motion estimation (ME), which predicts a block among temporally correlated frames, has had a crucial impact on not only the compression efficiency but also the computational complexity. In particular, fast ME algorithms have been pivotal in much research that attempts to reduce the complexity of the video encoder while preserving the compression efficiency ...
Shrinivas D Desai
Medical imaging has advanced tremendously since its inception. Among many modalities, X-ray Computed Tomography (CT) is recognized as an imperative medical imaging modality to reveal the interior details of the human body for effective diagnosis, treatment, operation and complication management of various clinical cases. CT operates on the principle of reconstruction of an image from projection based ...
Sucheta Ghosh
Parsing discourse is a challenging natural language processing task. In this research work, we first take a data-driven approach to identify arguments of explicit discourse connectives. In contrast to previous work, we do not make any assumptions on the span of arguments and consider parsing as a token-level sequence labeling task. We design the ...
Rufael Mekuria
The Internet is used for distributed shared experiences such as video conferencing, voice calls (possibly in a group), chatting, photo sharing, online gaming and virtual reality. These technologies are changing our daily lives and the way we interact with each other. The current rapid advances in 3D depth sensing and 3D cameras are enabling acquisition ...
Svetlana Kordumova
This thesis contributes to teaching machines what is in an image while avoiding direct manual annotation as training data. We either rely on tagged data from social media platforms to recognize concepts, or on object semantics and layout to recognize scenes. We focus our effort on image search. We first demonstrate that concept detectors can be ...
Amirhossein Habibian
This thesis studies the fundamental question: what vocabulary of concepts is suited for machines to describe video content? The answer to this question involves two annotation steps: first, to specify a list of concepts by which videos are described; second, to label a set of videos per concept as its examples or counterexamples. Subsequently, ...
Masoud Mazloom
In this thesis we aim to represent an event in a video using semantic features. We start from a bank of concept detectors for representing events in video. First, we consider the relevance of concepts to the event inside the video representation. We address the problem of video event classification using a bank of ...
Chien-nan Chen
3D Tele-immersion (3DTI) technology allows full-body, multimodal interaction among geographically dispersed users, which opens a variety of possibilities in cyber collaborative applications such as art performance, exergaming, and physical rehabilitation. However, with its great potential, the resource and quality demands of 3DTI rise inevitably, especially when some advanced applications target resource-limited computing environments with stringent ...
Pengpeng Ni
Modern video coding techniques provide multidimensional adaptation options for adaptive video streaming over networks. For instance, a video server can adjust the frame-rate, frame-size or signal-to-noise ratio (SNR) of the video being requested to cope with the available bandwidth. However, these adaptation operations give rise to distinct visual artefacts, so it follows that they are ...