The transmission and processing of sensor-rich videos in mobile environments
Supervisor(s) and Committee member(s): Roger Zimmermann (supervisor), Ye Wang (advisor), Wei Tsang Ooi (advisor), Mun Choon Chan (rapporteur)
The astounding volume of camera sensors produced for and embedded in cellular phones has led to rapid advances in their quality and to the wide availability and popularity of capturing, uploading, and sharing videos (also referred to as user-generated content, or UGC). Furthermore, GPS-enabled smartphones have become an essential contributor to location-based services. A large number of geo-tagged photos and videos are continuously accumulating on the web, posing a challenging problem for mining this type of media data. Existing solutions attempt to examine the signal content of the videos and recognize objects and events. This is typically time-consuming and computationally expensive, and the results can be uneven in quality; these methods therefore face challenges when applied to large video repositories. Furthermore, the acquisition and transmission of large amounts of video data on mobile devices face fundamental constraints such as limited power and wireless bandwidth. To support diverse mobile video applications, it is critical to overcome these challenges.
Recent technological trends have opened another avenue that fuses much more accurate, relevant data with videos: the concurrent collection of sensor-generated geospatial contextual data. The aggregation of multi-sourced geospatial data into a standalone meta-data tag allows video content to be identified by a number of precise, objective geospatial characteristics. These so-called sensor-rich videos can conveniently be captured with smartphones. In this thesis we investigate the transmission and processing of sensor-rich videos in mobile environments. Our work focuses on the following key issues for sensor-rich videos:
1) Energy-efficient video acquisition and upload. We design a system to support energy-efficient sensor-rich video delivery. The core of our approach is to transmit the small amount of text-based geospatial meta-data separately from the large binary video content.
2) Point of Interest (POI) detection and visual distance estimation. We propose a technique that detects interesting regions and objects, and estimates their distances from the camera positions, in a fully automated way.
3) Presentation of user-generated videos. We present a system that provides an integrated solution to present videos based on keyframe extraction and interactive, map-based browsing.
4) Geo-predictive video streaming. We present a method to predict bandwidth changes for HTTP streaming. The method makes use of geo-location information to build bandwidth maps that facilitate bandwidth prediction and efficient quality adaptation. We also propose two quality adaptation algorithms for adaptive HTTP streaming.
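To make the bandwidth-map idea in item 4 concrete, the following sketch quantizes GPS fixes into grid cells, averages the throughput historically observed in each cell, and lets a client look up a predicted bandwidth to select a sustainable representation. The grid size, bitrate ladder, class and function names, and all numeric values are illustrative assumptions, not the thesis's actual design:

```python
from collections import defaultdict

class BandwidthMap:
    """Grid-based bandwidth map: averages historical throughput per cell."""

    def __init__(self, cell_deg=0.001):  # ~100 m cells near the equator
        self.cell_deg = cell_deg
        self.sums = defaultdict(float)
        self.counts = defaultdict(int)

    def _cell(self, lat, lon):
        """Quantize a GPS fix to a grid-cell key."""
        return (round(lat / self.cell_deg), round(lon / self.cell_deg))

    def record(self, lat, lon, kbps):
        """Log one throughput measurement taken at this location."""
        c = self._cell(lat, lon)
        self.sums[c] += kbps
        self.counts[c] += 1

    def predict(self, lat, lon, default=500.0):
        """Predict bandwidth at a location as the cell's historical mean."""
        c = self._cell(lat, lon)
        return self.sums[c] / self.counts[c] if self.counts[c] else default

def pick_bitrate(predicted_kbps, ladder=(250, 500, 1000, 2000, 4000)):
    """Choose the highest representation not exceeding the prediction."""
    feasible = [b for b in ladder if b <= predicted_kbps]
    return feasible[-1] if feasible else ladder[0]

bwmap = BandwidthMap()
bwmap.record(1.30000, 103.77000, 1800.0)  # two measurements in the same cell
bwmap.record(1.30001, 103.77001, 2200.0)
pred = bwmap.predict(1.30000, 103.77001)
print(pred, pick_bitrate(pred))  # 2000.0 2000
```

Because the map is keyed by location rather than by recent samples alone, a moving client can pre-fetch predictions for cells along its route and adapt quality before the bandwidth actually changes.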
Our study shows that by using location and viewing-direction information coupled with timestamps, efficient video delivery systems can be developed, more interesting information can be mined from video repositories, and user-generated videos can be presented more naturally.
Media Management Research Lab
Our group's research focuses on Streaming Media and Geo-Referenced Video Management (GeoVid).
The GeoVid project explores the concept of sensor-rich video tagging. Specifically, recorded videos are tagged with a continuous stream of extended geographic properties that relate to the camera scenes. This meta-data is then utilized for storing, indexing and searching large collections of community-generated videos. By considering video-related meta-information, more relevant search results can be returned and advanced searches, such as directional and surround queries, can be executed.
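The viewable-scene model behind such a directional query can be sketched in a few lines of Python: each meta-data sample describes a circular-sector camera scene, and a query checks whether a point of interest falls inside it. The field names, the equirectangular distance approximation, and all numeric values below are illustrative assumptions, not the project's actual data model:

```python
import math
from dataclasses import dataclass

@dataclass
class GeoMeta:
    """One geospatial meta-data sample captured alongside a video frame."""
    timestamp: float  # capture time (seconds since epoch)
    lat: float        # GPS latitude of the camera position (degrees)
    lon: float        # GPS longitude of the camera position (degrees)
    heading: float    # compass viewing direction (degrees clockwise from north)
    alpha: float      # horizontal viewable angle of the lens (degrees)
    r: float          # maximum visible distance (meters)

def in_view(m: GeoMeta, poi_lat: float, poi_lon: float) -> bool:
    """Directional query: does the POI fall inside the camera's
    circular-sector viewable scene? Uses an equirectangular
    approximation, which is adequate over short distances."""
    dy = (poi_lat - m.lat) * 111_320.0  # meters per degree of latitude (approx.)
    dx = (poi_lon - m.lon) * 111_320.0 * math.cos(math.radians(m.lat))
    if math.hypot(dx, dy) > m.r:
        return False  # beyond the maximum visible distance
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0  # clockwise from north
    diff = abs((bearing - m.heading + 180.0) % 360.0 - 180.0)
    return diff <= m.alpha / 2.0

# A camera facing east with a 60-degree lens; one POI roughly 100 m
# due east (inside the sector), one due north (outside it).
cam = GeoMeta(timestamp=0.0, lat=1.3000, lon=103.7700,
              heading=90.0, alpha=60.0, r=200.0)
print(in_view(cam, 1.3000, 103.7709))  # True
print(in_view(cam, 1.3009, 103.7700))  # False
```

A surround query follows the same pattern with only the distance test, retaining every sample whose camera position lies within a given radius of the query point.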