SIGMM Education Column

The SIGMM Education Column of this issue highlights a new book, titled “Visual Information Retrieval using Java and LIRE,” which introduces the fields of information retrieval and visual information retrieval and presents selected methods, as well as their use and implementation in Java and, more specifically, LIRE, a Java content-based image retrieval (CBIR) library. The book is authored by Dr. Mathias Lux of Klagenfurt University, Austria, and Prof. Oge Marques of Florida Atlantic University, and is published in the Synthesis Lectures on Information Concepts, Retrieval, and Services series by Morgan & Claypool.


The basic motivation for writing this book was the need for a fundamental course book containing just the knowledge necessary to get students started with content-based image retrieval. The book is based on lectures given by the authors over recent years and has been designed to fulfill that need. It will also give developers of content-based image retrieval solutions a head start by explaining the most relevant concepts and practical requirements.


The book begins with a short introduction, followed by explanations of information retrieval and retrieval evaluation. Visual features are then explained, and practical problems and common solutions are outlined. Indexing strategies for visual features, including linear search, nearest-neighbor search, hashing, and bags of visual words, are discussed next, and their use with LIRE is shown. Finally, LIRE is described in detail, so that the library can be employed in various contexts and its functionality extended.
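
To give a flavor of the kind of code the book builds up to, the sketch below indexes a folder of images with a global feature and runs a linear similarity search, in the style of LIRE’s documented “simple application” examples. Package names and method signatures are assumptions tied to the LIRE 0.9.x era and its bundled Lucene 4.x, so expect to adapt them to the release you use.

    // Minimal CBIR sketch with LIRE: index images, then search by example.
    // Assumes LIRE 0.9.x-style APIs and a matching Lucene 4.x; names may
    // differ in other releases.
    import java.awt.image.BufferedImage;
    import java.io.File;
    import javax.imageio.ImageIO;

    import net.semanticmetadata.lire.DocumentBuilder;
    import net.semanticmetadata.lire.DocumentBuilderFactory;
    import net.semanticmetadata.lire.ImageSearchHits;
    import net.semanticmetadata.lire.ImageSearcher;
    import net.semanticmetadata.lire.ImageSearcherFactory;
    import net.semanticmetadata.lire.utils.LuceneUtils;

    import org.apache.lucene.index.DirectoryReader;
    import org.apache.lucene.index.IndexReader;
    import org.apache.lucene.index.IndexWriter;
    import org.apache.lucene.store.FSDirectory;

    public class SimpleCbir {
        public static void main(String[] args) throws Exception {
            // 1) Indexing: extract the CEDD global feature per image and
            //    store it as a field in a Lucene document.
            DocumentBuilder builder = DocumentBuilderFactory.getCEDDDocumentBuilder();
            IndexWriter iw = LuceneUtils.createIndexWriter("index", true);
            for (File f : new File("images").listFiles()) {
                BufferedImage img = ImageIO.read(f);
                if (img != null) iw.addDocument(builder.createDocument(img, f.getAbsolutePath()));
            }
            iw.close();

            // 2) Retrieval: linear scan over the stored features, returning
            //    the ten images most similar to the query image.
            BufferedImage query = ImageIO.read(new File("query.jpg"));
            IndexReader ir = DirectoryReader.open(FSDirectory.open(new File("index")));
            ImageSearcher searcher = ImageSearcherFactory.createCEDDImageSearcher(10);
            ImageSearchHits hits = searcher.search(query, ir);
            for (int i = 0; i < hits.length(); i++) {
                String id = hits.doc(i).getValues(DocumentBuilder.FIELD_NAME_IDENTIFIER)[0];
                System.out.println(hits.score(i) + ": " + id);
            }
        }
    }

Swapping the feature (e.g., color layout instead of CEDD) or the indexing strategy (hashing, bag of visual words) amounts to choosing a different builder/searcher pair, which is exactly the kind of variation the book walks through.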


There is also a companion website for the book (http://www.lire-project.net), which gives pointers to additional resources and will be updated with slides, figures, teaching materials and code samples.


Interview with ACM Fellow and SIGMM Chair Prof. Klara Nahrstedt

Prof. Dr. Klara Nahrstedt, SIGMM Chair

SIGMM Editor: “Why do societies such as ACM offer Fellows status to some of its members?”

Prof Klara Nahrstedt: Through its ACM Fellows Program, the ACM celebrates the exceptional contributions of leading members in the computing field. These individuals have helped to enlighten researchers, developers, practitioners and end-users of computing and information technology throughout the world. The new ACM Fellows join a distinguished list of colleagues to whom ACM and its members look for guidance and leadership in computing and information technology.

SIGMM Editor: “What is the significance for you as an individual researcher in becoming an ACM Fellow?”

Prof Klara Nahrstedt: Receiving the ACM Fellow status represents a great honor for me due to the high distinction of this award in the computing community. The ACM Fellow award recognizes my own research in the area of “Quality of Service (QoS) management for distributed multimedia systems”, as well as the joint work in this area with my students and colleagues at my home institution, the University of Illinois at Urbana-Champaign, and at the other institutions, research labs, and companies with whom I have collaborated over the years. Furthermore, becoming an ACM Fellow allows me to continue to push new ideas of QoS in distributed multimedia systems in three societal domains: trustworthy cyber-physical infrastructure for smart grid environments, collaborative immersive spaces in tele-health-care, and robust mobile multimedia systems in the airline-airplane maintenance ecosystem.

SIGMM Editor: “How is this recognition perceived by your research students, department, and University? “

Prof Klara Nahrstedt: My research students, department, and university are delighted that I have received the ACM Fellow status, since this type of award very much reflects the high quality of the students who are admitted to our department and with whom I work, of the colleagues I interact with, and of the resources provided to me by the department and university.

SIGMM Editor: “You have been one of the important torch bearers of the SIGMM community. What does this recognition imply for the SIGMM Community?”

Prof Klara Nahrstedt: The SIGMM community is a relatively young community, having only recently celebrated 20 years of existence. However, as the multimedia community matures, it is important for our community to promote its outstanding researchers and assist them towards the ACM Fellow status. Furthermore, multimedia technology is becoming ubiquitous in all facets of our lives; hence it is of great importance that SIGMM leaders, especially its ACM Fellows, are at the table with other computing researchers to guide and drive future directions in computing and information technologies.

SIGMM Editor: “How will this recognition influence the SIGMM community?”

Prof Klara Nahrstedt: I hope that my ACM Fellow recognition will influence the SIGMM community in at least three ways: (1) it will motivate young researchers in academia and industry to work towards high-impact research accomplishments in the multimedia area that can lead to ACM Fellow status at a later stage of their careers, (2) it will encourage female researchers to strive towards recognition of their work through the ACM Fellow status, and (3) it will enlarge the distinguished group of ACM Fellows within SIGMM, which in turn will be able to promote the next generation of multimedia researchers to join the ACM Fellows ranks.


MPEG Column: 103rd MPEG Meeting

— original post by Christian Timmerer, AAU, on the Multimedia Communication blog


The 103rd MPEG Meeting

The 103rd MPEG meeting was held in Geneva, Switzerland, January 21-25, 2013. The official press release can be found here (doc only) and I’d like to introduce the new MPEG-H standard (ISO/IEC 23008), referred to as high efficiency coding and media delivery in heterogeneous environments:

  • Part 1: MPEG Media Transport (MMT) – status: 2nd committee draft (CD)
  • Part 2: High Efficiency Video Coding (HEVC) – status: final draft international standard (FDIS)
  • Part 3: 3D Audio – status: call for proposals (CfP)

MPEG Media Transport (MMT)

The MMT project was started to address the needs of modern media transport applications that go beyond the capabilities offered by existing transport mechanisms such as the formats defined by the MPEG-2 transport stream (M2TS) or the ISO base media file format (ISOBMFF) group of standards. The committee draft was approved during the 101st MPEG meeting. As a response to the CD ballot, MPEG received more than 200 comments from national bodies and, thus, decided to issue a 2nd committee draft, which will be publicly available by February 7, 2013.

High Efficiency Video Coding (HEVC) – ITU-T H.265 | MPEG HEVC

HEVC is the next-generation video coding standard jointly developed by ISO/IEC JTC1/SC29/WG11 (MPEG) and the Video Coding Experts Group (VCEG) of ITU-T WP 3/16. Please note that both ITU-T and ISO/IEC MPEG use the term “high efficiency video coding” in the title of the standard, but one can expect – as with its predecessor – that the former will use ITU-T H.265 and the latter MPEG-H HEVC when promoting the standard. If you don’t want to take sides in this debate, simply use high efficiency video coding.

The MPEG press release says that the “HEVC standard reduces by half the bit rate needed to deliver high-quality video for a broad variety of applications” (note: compared to its predecessor AVC). The editing period for the FDIS runs until March 3, 2013; after the final preparations and a two-month balloting period (yes/no vote only), one can expect the International Standard (IS) to be available in early summer 2013. Please note that there are no technical differences between the FDIS and the IS.

The ITU-T press release describes HEVC as a standard that “will provide a flexible, reliable and robust solution, future-proofed to support the next decade of video. The new standard is designed to take account of advancing screen resolutions and is expected to be phased in as high-end products and services outgrow the limits of current network and display technology.”

HEVC currently defines three profiles:

  • Main Profile for “Mass-market consumer video products that historically require only 8 bits of precision”.
  • Main 10 Profile, which “will support up to 10 bits of processing precision for applications with higher quality demands”.
  • Main Still Picture Profile to support still image applications; hence, “HEVC also advances the state-of-the-art for still picture coding”.

3D Audio

The 3D audio standard shall complement MMT and HEVC, assuming that a large number of loudspeakers will be deployed in a “home theater” system. Therefore, MPEG has issued a Call for Proposals (CfP), with the selection of the reference model v0 due in July 2013. The CfP says that MPEG-H 3D Audio “might be surrounding the user and be situated at high, mid and low vertical positions relative to the user’s ears. The desired sense of audio envelopment includes both immersive 3D audio, in the sense of being able to virtualize sound sources at any position in space, and accurate audio localization, in terms of both direction and distance.”

“In addition to a “home theater” audio-visual system, there may be a “personal” system having a tablet-sized visual display with speakers built into the device, e.g. around the perimeter of the display. Alternatively, the personal device may be a hand-held smart phone. Headphones with appropriate spatialization would also be a means to deliver an immersive audio experience for all systems.”

Complementary to the CfP, MPEG also provided the encoder input format for MPEG-H 3D audio and a draft MPEG audio core experiment methodology for 3D audio work.

Publicly available MPEG output documents

The following documents shall become available at http://mpeg.chiariglione.org/ (note: some may have an editing period – YY/MM/DD). If you have difficulties accessing one of these documents, please feel free to contact me.

  • Study text of DIS of ISO/IEC 23000-13, Augmented Reality Application Format (13/01/25)
  • Study text of DTR of ISO/IEC 23000-14, Augmented reality reference model (13/02/25)
  • Text of ISO/IEC FDIS 23005-1 2nd edition Architecture (13/01/25)
  • Text of ISO/IEC 2nd CD 23008-1 MPEG Media Transport (13/02/07)
  • Text of ISO/IEC 23008-2:201x/PDAM1 Range Extensions (13/03/22)
  • Text of ISO/IEC 23008-2:201x/PDAM2 Multiview Extensions (13/03/22)
  • Call for Proposals on 3D Audio (13/01/25)
  • Encoder Input Format for MPEG-H 3D Audio (13/02/08)
  • Draft MPEG Audio CE methodology for 3D Audio work (13/01/25)
  • Draft Requirements on MPEG User Descriptions (13/02/08)
  • Draft Call for Proposals on MPEG User Descriptions (13/01/25)
  • Draft Call for Proposals on Green MPEG (13/01/25)
  • Context, Objectives, Use Cases and Requirements for Green MPEG (13/01/25)
  • White Paper on State of the Art in compression and transmission of 3D Video (13/01/28)
  • MPEG Awareness Event Flyer at 104th MPEG meeting in Incheon (13/02/28)

Open Source Column: GPAC

GPAC, Toolbox for Interactive Multimedia Packaging, Delivery and Playback

Introduction

GPAC was born 10 years ago from the need for a lighter and more robust implementation of the MPEG-4 Systems standard [1], compared to the official reference software. It has since evolved into a much wider project, covering many tools required when exploring new research topics in multimedia, while keeping a strong focus on international standards coming from organizations such as W3C, ISO, ETSI, or IETF. The goal of the project is to provide the tools needed to set up test beds and experiments for interactive multimedia applications in any of the various environments used to deliver content in modern systems: broadcast, multicast, unreliable unicast streaming, HTTP-based streaming, and file-based delivery.

MPEG Column: 102nd MPEG Meeting

original post by Multimedia Communication blog, Christian Timmerer, AAU

The 102nd MPEG meeting was held in Shanghai, China, October 15-19, 2012. The official press release can be found here (not yet available) and I would like to highlight the following topics:

  • Augmented Reality Application Format (ARAF) goes DIS
  • MPEG-4 now has 30 parts: let’s welcome timed text and other visual overlays
  • Draft call for proposals for 3D audio
  • Green MPEG is progressing
  • MPEG starts a new publicity campaign by making more working documents publicly available for free

Augmented Reality Application Format (ARAF) goes DIS

MPEG’s application format dealing with augmented reality has reached DIS status and is only one step away from becoming an international standard. In a nutshell, the MPEG ARAF makes it possible to augment 2D/3D regions of a scene by combining multiple existing standards within a specific application format addressing certain industry needs. In particular, ARAF comprises three components referred to as scene, sensor/actuator, and media. The scene component is represented using a subset of MPEG-4 Part 11 (BIFS), the sensor/actuator component is defined within MPEG-V, and the media component may comprise various types of compressed (multi)media assets using different modalities and codecs.

A tutorial from Marius Preda, MPEG 3DG chair, at the Web3D conference in August 2012 is provided below.

MPEG-4 now has 30 parts

Let’s welcome timed text and other visual overlays into the family of MPEG-4 standards. Part 30 of MPEG-4 – in combination with an amendment to the ISO base media file format (ISOBMFF) – addresses the carriage of W3C TTML, including its derivative SMPTE Timed Text, as well as WebVTT. The types of overlays include subtitles, captions, and other timed text and graphics. The text-based overlays include basic text and XML-based text. Additionally, the standard provides support for bitmaps, fonts, and other graphics formats such as scalable vector graphics.

Draft call for proposals for 3D audio

MPEG 3D audio considers various test items ranging from 9.1 via 12.1 up to 22.1 channel configurations. A public draft call for proposals has been issued at this meeting with the goal of finalizing the call and the evaluation guidelines at the next meeting. The evaluation will be conducted in two phases. Phase one, for higher bitrates (1.5 Mbps to 265 kbps), is foreseen to conclude in July 2013 with the evaluation of the answers to the call and the selection of the “Reference Model 0 (RM0)” technology, which will serve as the basis for the development of a 3D audio standard. The second phase targets lower bitrates (96 kbps to 48 kbps) and builds on the RM0 technology after it has been documented in text and code.

Green MPEG is progressing

The idea behind Green MPEG is to define signaling means that enable energy-efficient encoding, delivery, decoding, and/or presentation of MPEG formats (and possibly others) without loss of Quality of Experience. Green MPEG will address this issue from an end-to-end point of view with the focus – as usual – on the decoder. However, a purely codec-centric design is not desirable, as energy efficiency should not come at the expense of the other components of the media ecosystem. At the moment, initial requirements have been defined, and everyone is free to join the discussions on the email reflector of the Ad-hoc Group.

MPEG starts a new publicity campaign by making more working documents publicly available for free

As a response to national body comments, MPEG will from now on make more documents publicly available for free. Here’s a selection of these documents, which are publicly available here. Note that some may have an editing period and, thus, were not yet available at the time of writing this blog post.

  • Text of ISO/IEC 14496-15:2010/DAM 2 Carriage of HEVC (2012/11/02)
  • Text of ISO/IEC CD 14496-30 Timed Text and Other Visual Overlays in ISO Base Media File Format (2012/11/02)
  • DIS of ISO/IEC 23000-13, Augmented Reality Application Format (2012/11/07)
  • DTR of ISO/IEC 23000-14, Augmented reality reference model (2012/11/21)
  • Study of ISO/IEC CD 23008-1 MPEG Media Transport (2012/11/12)
  • High Efficiency Video Coding (HEVC) Test Model 9 (HM 9) Encoder Description (2012/11/30)
  • Study Text of ISO/IEC DIS 23008-2 High Efficiency Video Coding (2012/11/30)
  • Working Draft of HEVC Full Range Extensions (2012/11/02)
  • Working Draft of HEVC Conformance (2012/11/02)
  • Report of Results of the Joint Call for Proposals on Scalable High Efficiency Video Coding (SHVC) (2012/11/09)
  • Draft Call for Proposals on 3D Audio (2012/10/19)
  • Text of ISO/IEC 23009-1:2012 DAM 1 Support for Event Messages and Extended Audio Channel Configuration (2012/10/31)
  • Internet Video Coding Test Model (ITM) v 3.0 (2012/11/02)
  • Draft Requirements on MPEG User Descriptions (2012/10/19)
  • Draft Use Cases for MPEG User Description (Ver. 4.0) (2012/10/19)
  • Requirements on Green MPEG (2012/10/19)
  • White Paper on State of the Art in compression and transmission of 3D Video (Draft) (2012/10/19)
  • White Paper on Compact Descriptors for Visual Search (2012/11/09)

MPEG Column: 101st MPEG Meeting

MPEG news: a report from the 101st meeting, Stockholm, Sweden

The 101st MPEG meeting in Sweden

The 101st MPEG meeting was held in Stockholm, Sweden, July 16-20, 2012. The official press release can be found here and I would like to highlight the following topics:

  • MPEG Media Transport (MMT) reaches Committee Draft (CD)
  • High-Efficiency Video Coding (HEVC) reaches Draft International Standard (DIS)
  • MPEG and ITU-T establish JCT-3V
  • Call for Proposals: HEVC scalability extensions
  • 3D audio workshop
  • Green MPEG

MMT goes CD

The Committee Draft (CD) of MPEG-H part 1 referred to as MPEG Media Transport (MMT) has been approved and will be publicly available after an editing period which will end Sep 17th. MMT comprises the following features:

  • Delivery of coded media by concurrently using more than one delivery medium (e.g., as is the case in heterogeneous networks).
  • Logical packaging structure and composition information to support multimedia mash-ups (e.g., multiscreen presentation).
  • Seamless and easy conversion between storage and delivery formats.
  • Cross layer interface to facilitate communication between the application layers and underlying delivery layers.
  • Signaling of messages to manage the presentation and optimized delivery of media.

This list of ‘features’ may sound very high-level, but as the CD usually comprises stable technology and is publicly available, the research community is more than welcome to evaluate MPEG’s new way of media transport. Having said this, I would like to refer to the Call for Papers for JSAC’s special issue on adaptive media streaming, which mainly focuses on DASH, but investigating its relationship to MMT is definitely within scope.

HEVC’s next step towards completion: DIS

The approval of the Draft International Standard (DIS) brought the HEVC standard one step closer to completion. As reported previously, HEVC shows superior performance compared to its predecessor, and real-time software decoding on the iPad 3 (720p, 30Hz, 1.5 Mbps) was demonstrated during the Friday plenary. It is expected that the Final Draft International Standard (FDIS) will be approved at the 103rd MPEG meeting, January 21-25, 2013. If the market need for HEVC is anything like it was when AVC was finally approved, I wonder whether one can expect first products by mid/end 2013. From a research point of view we know – and history is our witness – that improvements are still possible even after a standard has been approved. For example, the AVC standard is now available in its 7th edition as a consolidation of various amendments and corrigenda.

JCT-3V

Following the Joint Video Team (JVT), which successfully developed standards such as AVC, SVC, and MVC, and the Joint Collaborative Team on Video Coding (JCT-VC), MPEG and ITU-T have established the Joint Collaborative Team on 3D Video coding extension development (JCT-3V). That is, MPEG and ITU-T now also join forces in developing 3D video coding extensions for existing codecs as well as those under development (i.e., AVC, HEVC). The current standardization plan includes the development of AVC multi-view extensions with depth, to be completed this year, and I assume HEVC will be extended with 3D capabilities once the 2D version is available.

In this context it is interesting that a call for proposals for MPEG Frame Compatible (MFC) has been issued to address current deployment issues of stereoscopic videos. The requirements are available here.

Call for Proposals: SVC for HEVC

In order to address the need for higher resolutions – Ultra HDTV – and subsets thereof, JCT-VC issued a call for proposals for HEVC scalability extensions. Similar to AVC/SVC, the requirements include that the base layer should be compatible with HEVC and enhancement layers may include temporal, spatial, and fidelity scalability. The actual call, the use cases, and the requirements shall become available on the MPEG Web site.

MPEG hosts 3D Audio Workshop

Part 3 of MPEG-H will be dedicated to audio, specifically 3D audio. The call for proposals will be issued at the 102nd MPEG meeting in October 2012, and submissions will be due at the 104th meeting in April 2013. At this meeting, MPEG hosted a 2nd workshop on 3D audio with the following speakers:

  • Frank Melchior, BBC R&D: “3D Audio? – Be inspired by the Audience!”
  • Kaoru Watanabe, NHK and ITU: “Advanced multichannel audio activity and requirements”
  • Bert Van Daele, Auro Technologies: “3D audio content production, post production and distribution and release”
  • Michael Kelly, DTS: “3D audio, objects and interactivity in games”

The report of this workshop, including the presentations, will be publicly available by the end of August at the MPEG Web site.

What’s new: Green MPEG


Finally, MPEG is starting to explore a new area, currently referred to as Green MPEG, addressing technologies to enable energy-efficient use of MPEG standards. To this end, an Ad-hoc Group (AhG) was established with the following mandates:

  1. Study the requirements and use-cases for energy efficient use of MPEG technology.
  2. Solicit further evidence for the energy savings.
  3. Develop reference software for Green MPEG experimentation and upload any such software to the SVN.
  4. Survey possible solutions for energy-efficient video processing and presentation.
  5. Explore the relationship between metadata types and coding technologies.
  6. Identify new metadata that will enable additional power savings.
  7. Study system-wide interactions and implications of energy-efficient processing on mobile devices.

AhGs are usually open to the public and all discussions take place via email. To subscribe please feel free to join the email reflector.

ACM TOMCCAP Nicolas D. Georganas Best Paper Award

In its inaugural year, the ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP) Nicolas D. Georganas Best Paper Award goes to the paper “Video Quality for Face Detection, Recognition and Tracking” (TOMCCAP vol. 7, no. 3) by Pavel Korshunov and Wei Tsang Ooi.

The winning paper is pioneering: it is the very first study that tries to determine an objective quality threshold for videos used in automated video processing (AVP). The paper demonstrates that if a video’s quality is below a certain threshold (and it gives actual values for this threshold based on video context), the video cannot be used in AVP systems. Further, it shows that AVP systems still work with reasonable accuracy even when the video quality is low from a human’s perspective. This is an important finding because it means the quality and bit rate of a video can be reduced without sacrificing accuracy, leading to reduced costs, greater scalability, and faster processing. What is unique about the paper is that it distinguishes between quality as perceived by humans and quality as perceived by AVP systems. In essence, the paper proposes that for AVP systems we should design machine-consumable video coding standards, not human-consumable ones.

The purpose of the award is to recognize the most significant work published in ACM TOMCCAP in a given calendar year. The whole readership of ACM TOMCCAP was invited to nominate articles published in Volume 7 (2011). Based on the nominations, the winner was chosen by the TOMCCAP Editorial Board. The main assessment criteria were quality, novelty, timeliness, and clarity of presentation, in addition to relevance to multimedia computing, communications, and applications.

The award honors the founding Editor-in-Chief of TOMCCAP, Nicolas D. Georganas, for his contributions to the field of multimedia computing and his significant contributions to ACM. He profoundly influenced the research and multimedia communities.

The Editor-in-Chief Prof. Dr.-Ing. Ralf Steinmetz and the Editorial Board of ACM TOMCCAP cordially congratulate the winners. The award will be presented to the authors on November 1, 2012 at ACM Multimedia 2012 in Nara, Japan, and includes travel expenses for the winning authors.

Ambulant – a multimedia playback platform

Distributed multimedia is a field that depends on many technologies, including networking, coding and decoding, scheduling, rendering, and user interaction. Often this leads to multimedia researchers in one of those fields expending a lot of implementation effort to build a complete media environment when they actually only want to demonstrate an advance within their own field. In 2004 the authors, having gone through this process more than once themselves, decided to design an open source, extensible and embeddable multimedia platform that could serve as a central research resource. The NLNet Foundation, www.nlnet.nl, graciously provided initial funding for the resulting Ambulant project.

Ambulant was designed from the outset to be usable for experimentation in a wide range of fields, not only in a laboratory setting but also as a deployed player for end users. However, it was not intended to compete with general end-user playback systems such as the (then popular) RealPlayer, QuickTime or Windows Media Player. Our goal was to build a glue environment where various research groups could plug in new approaches to media scheduling, rendering and distribution. While some effort was spent on things like ease of installation, multi-platform compatibility and user interface issues, Ambulant has never aspired to supplant commercial media players. The user interface on three different platforms can be seen in the figure below.

The first deployment of the platform was during the W3C standardization of SMIL 2.1 and 3.0 [2, 3], when Ambulant was used to test the specification and create an open reference implementation. The fact that Ambulant supports SMIL out of the box means that it is not only useful to “low-level” multimedia researchers who want to experiment with replacing system components, but also to people interested in semantics or server-side document generation: by using SMIL as their output format, they can use Ambulant to render their documents on any platform, including inside a web browser.

Design and Implementation

Ambulant is designed so that all key components are replaceable and extensible. This follows from the requirement that it be usable as an experimentation vehicle: if someone wants to replace the scheduler with one of their own design, this should be possible with little or no impact on the rest of the system. To ensure wide deployability it was decided to create a portable platform. However, runtime efficiency is also an issue in multimedia playback, especially for audio and video decoding and rendering, so we decided to implement the core engine in C++. This allowed us to use platform-native decoding and rendering toolkits such as QuickTime and DirectShow, and gave us the added benefit of being able to use the native GUI toolkit on each platform, which makes life easier for end users and integrators. Using the native GUI toolkits took a bit of extra effort up front to find the right spot to separate platform-independent and platform-dependent code, but by now porting to a new GUI toolkit takes about three man-months. About 8 GUI toolkits have been supported over time (or 11 if you count browser plugin APIs as GUI toolkits). The current version of Ambulant runs natively on MacOSX, Linux, Windows and iOS, and a browser plugin is available for all major browsers on all desktop platforms (including Internet Explorer on Windows). Various old platforms (WM5, Maemo) were supported in the past and, while no longer maintained, the code is still available.

The design of Ambulant is shown in the figure above. On the document level there is a parser which reads external documents and converts them into a representation that the scheduler and layout engine handle during document playout. On the intermediate level there are datasources that read documents and media streams and hand them to the playout components. On the lower level there are the machine-dependent implementations of those stream readers and renderers. For each of these components there are multiple implementations, and these can easily be replaced or extended. The design largely uses factory functions and abstract interfaces, and the implementation uses a plugin architecture to allow easy replacement of components at runtime without having to rebuild the complete application.

To make life even simpler, the API to the core Ambulant engine is available not only in C++ but also in Python. The Python bridge is complete and bidirectional: all classes that are accessible from C++ are just as accessible from Python and vice versa, and sending an object back and forth through the Python-C++ bridge yields the original object, not a new double-wrapped one. Moreover, not only can C++ classes be subclassed in Python, but also the reverse. This means that both extending Ambulant through a plugin and embedding Ambulant can be done in pure Python, without having to write any C/C++ code and without having to rebuild Ambulant.
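
The factory-and-interface plugin style described above is the architectural heart of Ambulant. As a rough illustration only (in Java, with hypothetical names; Ambulant itself implements this in C++ with a Python bridge), the pattern looks like this:

    // Schematic sketch of a factory-based plugin registry, as described in
    // the text. All names here are hypothetical, not Ambulant's actual API.
    import java.util.ArrayList;
    import java.util.List;

    interface Renderer {                       // abstract interface: the core
        void render(byte[] mediaData);         // engine only sees this contract
    }

    interface RendererFactory {                // factory function per plugin;
        Renderer create(String mimeType);      // returns null if unsupported
    }

    class RendererRegistry {                   // core-engine side
        private final List<RendererFactory> factories = new ArrayList<>();

        // Plugins register a factory at load time instead of being linked in,
        // so components can be replaced at runtime without rebuilding the app.
        void register(RendererFactory f) { factories.add(f); }

        Renderer rendererFor(String mimeType) {
            for (RendererFactory f : factories) {
                Renderer r = f.create(mimeType);  // first factory that accepts
                if (r != null) return r;          // the media type wins
            }
            throw new IllegalArgumentException("no renderer for " + mimeType);
        }
    }

Because the core engine only ever talks to the abstract interfaces, a scheduler, parser or renderer of your own design can be dropped in without touching the rest of the system, which is precisely the experimentation scenario Ambulant targets.
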
Applications

Over the years, Ambulant has been used extensively for experimentation, both within our group and externally. In this section we highlight some of these applications. The overview is not complete, but it shows the breadth of applications of Ambulant.

One of the interests of the authors is maintaining the temporal scheduling integrity of dynamically modified multimedia presentations. In the Ambulant Annotator [4], we experimented with using secondary screens during playback, allowing user interaction on those secondary screens to modify existing shared presentations on the main screen. The modification and sharing interface was implemented as a plugin in Ambulant, which also drives the main screen. In Ta2 MyVideos [5] we looked at a different form of live modification: a personalized video mashup that is created while the user is viewing it.

Integration of live video conferencing and multimedia documents is another area in which we work. For the Ta2 Family Game project [6] we augmented Ambulant with renderers for low-delay live video rendering and digitizing, and with a Flash engine. The resulting platform was used to play a cooperative action game in multiple locations. We are also using Ambulant to investigate protocols for synchronizing media playback at remote locations.

In a wholly different application area, the Daisy Consortium has used Ambulant as the basis of AMIS, www.daisy.org/projects/amis. AMIS is software that reads Daisy Books, the international standard for digital talking books for the visually impaired. For this project Ambulant was only a small part of the solution: the main program allows the end user, who may be blind or dyslexic, to select books and navigate them. Timed playback is then handled by Ambulant, with added functionality to highlight paragraphs on-screen as the content is read out. At a higher level, an instrumented version of Ambulant has also been deployed to indirectly evaluate social media systems. In 2004, Ambulant was submitted to the first ACM Multimedia Open Source Software Competition [1].

Obtaining and Using Ambulant

Ambulant is available via www.ambulantplayer.org in three different forms: as a stable distribution (source and installers), as a nightly build (source and installers), and through Mercurial. Unfortunately, the stable distribution is currently lagging quite a bit behind, due to restricted manpower. We also maintain full API documentation, sample documents, and community mailing lists. Ambulant is distributed under the LGPL2 license. This allows the platform to be used with commercial plugins developed by industry partners who provide proprietary software intended for limited distribution. We are considering a switch to dual licensing (GPL/BSD), but a concrete need has yet to arise.

The Bottom Line

Ambulant is a full open source media rendering pipeline. It provides an open plugin environment in which researchers from a wide variety of (sub)disciplines can test new algorithms and media sharing approaches without having to write mountains of less-relevant framework code. It can serve as an open environment for experimentation, validation and distribution. You are welcome to give it a try and to contribute to its growth.

References

[1] Bulterman, D. et al. 2004. Ambulant: a fast, multi-platform open source SMIL player. In Proceedings of the 12th Annual ACM International Conference on Multimedia (MULTIMEDIA ’04). ACM, New York, NY, USA, 492-495. DOI=10.1145/1027527.1027646
[2] Bulterman, D. et al. 2008. Synchronized Multimedia Integration Language (SMIL 3.0). W3C. URL=http://www.w3.org/TR/SMIL/
[3] Bulterman, D. and Rutledge, L. 2008. Interactive Multimedia for the Web, Mobile Devices and Daisy Talking Books. Springer-Verlag, Heidelberg, Germany. ISBN: 3-540-20234-X
[4] Cesar, P. et al. 2009. Fragment, tag, enrich, and send: Enhancing social sharing of video. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMCCAP), vol. 5, no. 3. DOI=10.1145/1556134.1556136
[5] Jansen, J. et al. 2012. Just-in-time personalized video presentations. In Proceedings of the 2012 ACM Symposium on Document Engineering (DocEng ’12). ACM, New York, NY, USA, 59-68. DOI=10.1145/2361354.2361368
[6] Jansen, J. et al. 2011. Enabling Composition-Based Video-Conferencing for the Home. IEEE Transactions on Multimedia, vol. 13, no. 5, pp. 869-881. DOI=10.1109/TMM.2011.2159369

Outstanding PhD Thesis in Multimedia Computing, Communications and Applications

Dr. Wanmin Wu

The SIGMM Ph.D. Award Committee is pleased to recommend this year’s award for the outstanding Ph.D. thesis in multimedia computing, communications and applications to Dr. Wanmin Wu.

Wu’s dissertation documents fundamental work in the area of unifying systems- and user-centric approaches to managing information flows in support of 3D tele-immersive environments. She has developed a theoretical framework for modeling and measuring Quality of Experience (QoE), and for correlating QoE with Quality of Service (QoS) in distributed multi-modal interactive environments. This work has been significant in that it introduced the importance of the user-centric approach to modeling and managing complex three-dimensional data exchanges in time-constrained systems.

The committee considered the main innovations of this work to be:

  1. Identifying and incorporating human psycho-physical factors along with traditional QoS to improve the user experience;
  2. Proposing new methods and theory for QoS in interactive multi-camera environments that have served as a catalyst for enabling work in distributed education, medicine, and conferencing;
  3. Developing new methods for video coding that incorporate users’ psycho-physical perception of color and depth.

These new methods have significantly reduced the overhead of sharing tele-immersive information and are likely to have a longer-term benefit similar to that of selective audio encoding.

The committee has considered this contribution as worthy of the award as it tackles a new problem, proposes new theory and practice as a solution to this problem area, and opens the way for further research into effective distributed three-dimensional immersive systems.