SIVA Suite: An Open-Source Framework for Hypervideos

Overview

The SIVA Suite is an open source framework for the creation, playback, and administration of hypervideos. Allowing the definition of complex navigational structures, our hypervideos are well suited for different scenarios. Compared to traditional linear videos, they especially excel in e-learning and training situations (see [1] and [2]), where fitting the teaching material to the needs of the viewer can be crucial. Other fields of application include virtual tours through buildings or cities, sports events, and interactive video stories. The SIVA Suite consists of an authoring tool (SIVA Producer), an HTML5 hypervideo player (SIVA Player), and a Web server (SIVA Server) for user and video management. It has been evaluated in various scenarios with several usability tests and has been improved step-by-step since 2008.

Introduction

The viewer of a traditional video takes a mostly passive role. Traditional videos are linear and cannot provide additional information about objects or scenes. Hypervideos, in contrast, are more than a sequence of video scenes. Their essence lies in alternative storylines, user choices, and additional materials that can be viewed in parallel with the main content, together with a navigational structure that supports these features. Playback therefore requires special players with extended controls and areas for presenting the additional information alongside the original content. Viewers make their choices by selecting the follow-up scene on a button panel, by using a table of contents, or through a keyword search.

One of the most advanced tools in this area is Hyper-Hitchcock [3], which can be used for the creation of detail-on-demand hypervideos with one main storyline and entry points for more detailed video explanations. However, an open source version of the software is not available. With new technologies like HTML5, CSS 3, and JavaScript, web-based tools like Klynt [4] emerged. Klynt allows the creation of hypervideos with a focus on different media types and provides many useful features, but it cannot be extended or customized due to its proprietary licensing. With the SIVA Suite we now offer the first customizable open source framework for the creation of hypervideos.

To simplify the creation process, our work focuses on videos as the main content. In the SIVA Producer, video scenes and navigational elements are arranged in a graph, called the scene graph, to define the navigational structure of a hypervideo. Annotations offering additional information can be added to single scenes as well as to the whole video. For this purpose, images, texts, PDFs, audio files, and even videos may be used. As a supplement to the video structure defined by the scene graph, further navigational elements like a table of contents and a keyword search enable the viewer to easily jump to points of interest.

Hypervideos are created in the SIVA Producer and then uploaded to the SIVA Server. Registered users can then download the video from the server or watch it online. If logging is enabled, user interactions during playback are logged by the player and sent to the database on the SIVA Server. Video administrators can access the logging data and watch different diagrams or export the data for further analysis in a statistics tool. An overview of the system is shown in Figure 1.


Figure 1. SIVA Suite – Overview.


SIVA Producer

Requirements (recommended): Windows 7 or higher
Installation: executable setup file
License: Eclipse Public License (EPL)

Installation files of the SIVA Producer can be found at https://github.com/SIVAteam/SIVA-Suite/tree/master/producer; an installer of the latest release can be found at https://github.com/SIVAteam/SIVA-Suite/releases.

The SIVA Producer is used for the creation of hypervideos in which main video scenes are linked with each other in a scene graph. Each of the scenes may have one or more multimedia annotations. Further navigational structures are a table of contents and keywords, which can then be searched for in the player. The GUI has been implemented and improved step by step since 2008 [5].

First Steps

  1. Create a new project: A new project is created with a wizard. The author can set the appearance of the player as well as functions the player will provide. It is, for example, possible to select a primary and a secondary color, to determine the width of the annotation panel, etc.
  2. Add media files to the project: Media files are imported into the media repository. These may be videos, audios, images, or html files. The Producer uses each media file in its original format during the creation process and only transforms it during the export.
  3. Create scenes: From videos in the media repository, scenes can be extracted. Those will be added to the scene repository from where they can be dragged to the scene graph to create the hypervideo structure.
  4. Create a scene graph: A scene graph (see Figure 2) consists of a defined start and an end, as well as several scenes and connection/branching elements allowing advanced navigation options during playback of the video. Scenes and navigation elements are added to the scene graph via drag and drop. These elements are linked with the connection tool from the scene graph tool bar. In order to produce a valid and exportable scene graph, two conditions have to be met. First, only one start scene is allowed. Second, every scene has to be connected by some path to the start and to the end of the video. The validity of the scene graph can be checked with a validation function.
    Figure 2. Scene graph of the SIVA Producer.


  5. Add annotations to scenes: Each scene in the scene graph may have one or more multimedia annotations. To add an annotation, a media file can either be dragged from the media repository and dropped on a scene, or an annotation editor (see Figure 3) can be used to customize its timing and appearance. Additionally, a hotspot can be added to the scene which invokes the display of the annotation only after a viewer clicks the marked area.
    Figure 3. Annotation editor of the SIVA Producer.


  6. Export video project: In a last step, finished hypervideo projects with valid scene graphs are exported for the player. The structure of the hypervideo with all possible actions is converted into a JSON file. The media files are transformed and transcoded for the desired target platform.
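
The two validity conditions from step 4 can be illustrated with a short reachability check. This is only a sketch under assumptions, not the validation function of the SIVA Producer: the node names, the `kinds` and `edges` dictionaries, and the reading of "only one start scene" as "the start node has exactly one successor" are all hypothetical choices made for this example.

```python
from collections import deque


def reachable(origin, edges):
    """Return every node that can be reached from origin along directed edges."""
    seen = {origin}
    queue = deque([origin])
    while queue:
        node = queue.popleft()
        for successor in edges.get(node, ()):
            if successor not in seen:
                seen.add(successor)
                queue.append(successor)
    return seen


def validate_scene_graph(kinds, edges, start="start", end="end"):
    """Return a list of problems; an empty list means the graph passed both checks.

    kinds maps a node id to its kind ("start", "end", "scene", "selection", ...).
    edges maps a node id to the list of node ids it links to.
    Both structures are hypothetical stand-ins for the Producer's project data.
    """
    problems = []

    # Condition 1 (our interpretation): exactly one element follows the start node.
    if len(edges.get(start, ())) != 1:
        problems.append("exactly one start scene is required")

    # Condition 2: every scene lies on some path from the start to the end.
    from_start = reachable(start, edges)
    reversed_edges = {}
    for source, targets in edges.items():
        for target in targets:
            reversed_edges.setdefault(target, []).append(source)
    to_end = reachable(end, reversed_edges)

    for node, kind in kinds.items():
        if kind != "scene":
            continue
        if node not in from_start:
            problems.append(f"{node} is not reachable from the start")
        if node not in to_end:
            problems.append(f"{node} has no path to the end")
    return problems


kinds = {"start": "start", "intro": "scene", "choice": "selection",
         "a": "scene", "b": "scene", "end": "end"}
edges = {"start": ["intro"], "intro": ["choice"], "choice": ["a", "b"],
         "a": ["end"], "b": ["end"]}
print(validate_scene_graph(kinds, edges))
```

Searching forward from the start and backward from the end keeps the check to two graph traversals, regardless of how many scenes the project contains.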

Further Features

  • Global annotations: Besides annotations which are displayed with scenes, global annotations which are displayed during the whole hypervideo (and do not have timing information as a consequence) can be added with a separate editor. The editor is opened from the main menu or the quick access toolbar.
  • Keywords: Keywords can be added to scenes and annotations in the respective editors. They are entered as whitespace-separated lists in the lower left part of the editors. Currently, only keywords added by the author are exported to the player and can be found with the search function; no automated analysis of the media files is performed.
  • Table of contents: The table of contents editor (see Figure 4) is used to create a tree structure of entries with meaningful headlines. A scene from the scene graph can be linked with one of the entries in the table of contents. A scene is added to an entry in the table of contents via drag and drop. The editor is opened from the main menu or the quick access toolbar.
    Figure 4. Table of contents editor of the SIVA Producer.


  • Advanced navigation: Besides a standard selection element where the user may select one of the attached paths to continue playback in the player, more advanced elements are available as well:
    • Forward button: A single button with only one label. It can be used to interrupt a linear sequence of scenes.
    • Random selection: One of the attached paths will be selected at random without user interaction.
    • Conditional selection: For attached paths, conditions can be defined which have to be fulfilled before the path is unlocked for playback.
  • Project handover: The SIVA Producer provides a function for handing over a project to another computer. Using this function, all media files as well as the project file are copied into a given file structure where they can easily be copied from.
  • Help: Help for the SIVA Producer can be found in the menu under “Help -> Help Contents”.
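
The navigation elements listed above differ mainly in how the set of playable follow-up paths is determined. The following sketch shows one way this could be modeled. The dictionary layout, the `kind` names, and the `requires` field are hypothetical, and the only condition type shown (a set of scenes that must already have been watched) is an assumption, since the kinds of conditions supported by the SIVA Producer are not specified here.

```python
import random


def available_paths(element, watched, rng=random):
    """Return the follow-up paths a player could offer for a navigation element.

    element: {"kind": ..., "paths": [{"target": ..., "requires": [...]}, ...]}
    watched: set of scene ids the viewer has already seen.
    All field names are hypothetical and used for illustration only.
    """
    kind = element["kind"]
    paths = element["paths"]

    if kind == "random":
        # Random selection: one attached path is chosen without user interaction.
        return [rng.choice(paths)]

    if kind == "conditional":
        # Conditional selection: a path is unlocked once its condition holds.
        return [p for p in paths if set(p.get("requires", ())) <= watched]

    # Standard selection offers every attached path; a forward button has one.
    return list(paths)


quiz = {
    "kind": "conditional",
    "paths": [
        {"target": "basics"},
        {"target": "advanced", "requires": ["basics"]},
    ],
}
print([p["target"] for p in available_paths(quiz, watched=set())])
print([p["target"] for p in available_paths(quiz, watched={"basics"})])
```

Passing the random number generator as a parameter keeps the random selection reproducible in tests.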

 


SIVA Player

Requirements (recommended): Firefox 42.0, Chrome 46.0, Opera 33, Internet Explorer 11, Safari 10.10
Installation: use the HTML export profile in the SIVA Producer, then integrate the player into a website by copying the body part of the exported HTML file and adapting the paths, or use it as a local stand-alone player
License: GPLv3

Installation files are contained in the SIVA Producer at https://github.com/SIVAteam/SIVA-Suite/tree/master/producer/org.iviPro.ui/libs-native/HTML5player.

The SIVA Player is used to play the hypervideos created in the SIVA Producer. The structure and media elements of a hypervideo are described in a JSON file which conforms to the XML structure described in [6]. A previous version of the player is described in [7].

Figure 5. SIVA Player with video view and annotation area.


The playback of the described videos requires special players which are capable of providing navigational elements like selection panels for follow-up scenes, a table of contents, or a search function. Furthermore, areas for displaying additional information are necessary. Figure 5 shows a user interface of the player (with contents of a medical training scenario) with the following elements:

  • (1) standard controls like pause/play
  • (2) a progress bar (for the current video)
  • (3) a settings button
  • (4) a volume control
  • (5) entry point to the table of contents
  • (6) a button to jump to the previous scene
  • (7) title of the currently displayed scene
  • (8) a button to jump to the next scene (or to a selection panel)
  • (9) a search button (performs a live search and refines the search results with every keystroke)
  • (10) a button for the full-screen mode
  • (11) a foldout panel on the right shows additional information (here, an additional video (12) and two image galleries (13); the additional video provides standard controls and can be displayed in full-screen mode (14))
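
The live search behind button (9) can be pictured as a lookup that is repeated after every keystroke. The sketch below is an illustration under assumptions rather than the player's actual code: it assumes keywords arrive as the whitespace-separated lists entered in the Producer, and it assumes prefix matching, which is one plausible way to refine results as the viewer types. The item ids are invented.

```python
def build_index(items):
    """Map each lower-cased keyword to the ids of the items that carry it.

    items: dict of item id -> whitespace-separated keyword string (hypothetical layout).
    """
    index = {}
    for item_id, keywords in items.items():
        for keyword in keywords.lower().split():
            index.setdefault(keyword, set()).add(item_id)
    return index


def live_search(index, typed):
    """Return the sorted ids of all items with a keyword starting with the typed text."""
    typed = typed.strip().lower()
    if not typed:
        return []
    hits = set()
    for keyword, item_ids in index.items():
        if keyword.startswith(typed):
            hits |= item_ids
    return sorted(hits)


items = {
    "scene_knee": "knee exercise",
    "scene_back": "back exercise",
    "pdf_kit": "kit",
}
index = build_index(items)
for typed in ("k", "kn", "ex"):
    print(typed, live_search(index, typed))
```

Each additional character can only narrow the set of matching keywords, which is what makes the result list shrink while typing.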

A click on one of the annotations opens its contents in full-screen mode for additional interactions (like browsing an image gallery or watching a video), while the player pauses the main video in the background. If a fork is reached in the scene graph, a button panel is provided at the left side of the main video area where the viewer has to select the next scene. The player also supports multiple languages if the author provides translations for all text and media elements (note: this functionality is not yet implemented in the SIVA Producer; the translations have to be made manually in the JSON file). Besides clicking or tapping the buttons, the basic functions of the player can also be controlled with the keyboard, namely with the space bar, the ESC key, and the left and right arrow keys.

All actions of the user can be recorded if logging is enabled by the author. The player transmits the actions to the server every 60 seconds as well as when the player starts or the video ends, if it is used in online mode. If the player is used in offline mode, logging data is collected and transmitted to the server when a connection can be established.
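
The transmission behavior described above (periodic sending, plus buffering while no connection is available) can be sketched as a small buffer class. This is an illustrative model only: the class, the `send` callback, and the action names are hypothetical and do not reflect the player's real implementation or the server's interface.

```python
class LogBuffer:
    """Collect logged actions and hand them to a sender in batches.

    send is a hypothetical callback that receives a list of entries and
    returns True if the server accepted them, False if it was unreachable.
    """

    def __init__(self, send, interval=60.0):
        self.send = send
        self.interval = interval      # seconds between regular transmissions
        self.pending = []             # entries not yet accepted by the server
        self.last_flush = None        # time of the last successful transmission

    def record(self, action, now):
        """Store one user action together with its timestamp."""
        self.pending.append({"action": action, "time": now})

    def tick(self, now, force=False):
        """Try to transmit if forced (player start, video end) or if the interval elapsed."""
        due = (force or self.last_flush is None
               or now - self.last_flush >= self.interval)
        if not due or not self.pending:
            return False
        if self.send(list(self.pending)):
            self.pending.clear()
            self.last_flush = now
            return True
        return False                  # offline: keep the entries for a later attempt


received = []
online = {"value": False}


def fake_send(batch):
    # Stand-in for the network call; succeeds only while "online".
    if online["value"]:
        received.append(batch)
        return True
    return False


buffer = LogBuffer(fake_send)
buffer.record("play", now=0.0)
buffer.tick(now=0.0, force=True)      # offline, nothing is lost
buffer.record("pause", now=10.0)
online["value"] = True
buffer.tick(now=61.0)                 # connection is back, both entries are sent
print(received)
```

Entries are removed from the buffer only after the sender reports success, so actions recorded offline survive until a connection is available.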

Configurations in HTML are possible. Thanks to its responsive design, the player can be used not only on desktop PCs with varying screen sizes but also on mobile devices in landscape and portrait mode. The player can be used online over the internet or in offline mode when all files are stored on the end user device.

 


SIVA Server

Main server application:

Requirements (recommended): Apache Tomcat 7, PostgreSQL 9.1 or newer, credentials to an SMTP account
Installation: deploy WAR file into the Tomcat’s webapp folder, open URL in browser, finish installation by filling all fields
License: GPLv3

Player Stats:

Requirements (recommended): Apache 2 webserver, PHP in version 5.4, enabled Apache module mod_rewrite
Installation: put back-end files into virtual host’s folder, open in browser, complete the installation
License: GPLv3

Installation files for the server application and the player stats can be found at https://github.com/SIVAteam/SIVA-Suite/tree/master/server. Additionally, a WAR file for the main server application can be found at https://github.com/SIVAteam/SIVA-Suite/releases.

The SIVA Server provides a platform for hypervideos and evaluations based on logging data. Furthermore, it provides user and rights management for copyright protected videos.

Videos exported by the producer are uploaded in the Web interface, extracted by the server, and can then be viewed on the server. It is furthermore possible to provide a link to a video, for example when the video is also available as a Chrome App, or to offer a ZIP file for download. The latter can be extracted locally on the end user device and watched without an internet connection.

Users may have different roles (like user, administrator, etc.) and rights according to their roles. Furthermore, each user may be a member of one or more groups. Access to videos can be assigned at group level. This ensures that videos are only visible to the viewers intended by the author or permitted by copyright limitations. Help for the SIVA Server can be found on its start page.

Figure 6. SIVA Server - player stats with usage view.


The server furthermore provides the SIVA Player Stats, the back end for the logging functionality of the player. This part of the application facilitates analyzing and evaluating the logged usage data. The data can be viewed, searched, exported, or visualized on a per-video basis. One of the currently available diagram views is the Sunburst diagram (see Figure 6), which shows how often viewers took certain paths through a video. Another diagram is a Treemap, which shows the different scenes of the video and the events in these scenes; the size of each box represents how frequently a single event occurred. This part of the application is only accessible to administrators registered in the front end.
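
A Sunburst view of path frequencies needs a nested count of how many viewers followed each prefix of a path. The sketch below shows one way to derive such a structure from logged sessions. The input format (one list of scene ids per viewing session) and the scene names are assumptions made for this example, not the data model of the SIVA Player Stats.

```python
def count_paths(sessions):
    """Build a nested tree of counts from scene sequences.

    sessions: list of scene id lists, one per viewing session (assumed format).
    Each tree node stores how many sessions passed through it, which
    corresponds to the angular size of a segment in a sunburst diagram.
    """
    root = {"count": 0, "children": {}}
    for path in sessions:
        root["count"] += 1
        node = root
        for scene in path:
            node = node["children"].setdefault(
                scene, {"count": 0, "children": {}})
            node["count"] += 1
    return root


sessions = [
    ["intro", "exercise_a", "summary"],
    ["intro", "exercise_b"],
    ["intro", "exercise_a"],
]
tree = count_paths(sessions)
intro = tree["children"]["intro"]
print(intro["count"], {k: v["count"] for k, v in intro["children"].items()})
```

Each ring of the diagram corresponds to one depth level of this tree, and the counts of a node's children never exceed the count of the node itself.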

 


Implementation

For information about implementation details please refer to the documentation on GitHub https://github.com/SIVAteam/SIVA-Suite or [8].


Conclusion and Future Work

In this column we present the SIVA Suite, an open source framework for the creation, playback, and administration of hypervideos. The authoring tool, the SIVA Producer, provides several editors, such as the scene graph and annotation editors, as well as an export function. The hypervideo player, the SIVA Player, has extended controls and display areas as well as an intuitive design. The Web server, the SIVA Server, provides functions for user, group, and video management. This framework, especially the authoring tool and the player, was successfully used for the creation and playback of several hypervideos, most noteworthy a medical hypervideo training (see [1] and [2]). Both were evaluated in several usability tests and have been improved step by step since 2008.

While the framework already provides all necessary functions for the creation, playback, and management of hypervideos, several additional functions might be desirable. For now, video conversion is done in the producer during the export of a hypervideo. Especially when several video versions (regarding resolution, quality, or video format) are needed, this task can block the production site for a long period. To improve productivity, video conversion could be moved to the server component. Furthermore, a player preview in the producer would be preferable, as it would remove the need to export the hypervideo in order to watch it. While a created hypervideo can currently only be translated by manually copying its structure to a new project, input forms for multilingualism in the producer would make this task easier. Pushing the interaction part to a new level, viewers could benefit from a collaborative editing function in the player, allowing them to add comments or additional materials to a video. Additionally, splitting the contents of the player across a second screen could allow for easier interaction with and perception of hypervideos, especially in sports or medical training scenarios. Implementing download and cache management in the player, as described in [9] and [10], may help to reduce waiting time at scene changes.

 

References

[1] Katrin Tonndorf, Christian Handschigl, Julian Windscheid, Harald Kosch & Michael Granitzer. The effect of non-linear structures on the usage of hypervideo for physical training. In: 2015 IEEE International Conference on Multimedia and Expo (ICME), pp.1-6, 2015.

[2] Britta Meixner, Katrin Tonndorf, Stefan John, Christian Handschigl, Kai Hofmann, Michael Granitzer, Michael Langbauer & Harald Kosch. A Multimedia Help System for a Medical Scenario in a Rehabilitation Clinic. In: Proceedings of I-Know, 14th International Conference on Knowledge Management and Knowledge Technologies (i-KNOW ’14). ACM, New York, NY, USA, 25:1-25:8, 2014.

[3] Frank Shipman, Andreas Girgensohn & Lynn Wilcox. Authoring, Viewing, and Generating Hypervideo: An Overview of Hyper-Hitchcock. In: ACM Trans. Multimedia Comput. Commun. Appl., ACM, 5, 15:1-15:19, 2008.

[4] Honkytonk Films. Klynt. http://www.klynt.net/, Website (accessed May 18, 2015), 2015.

[5] Britta Meixner, Katarzyna Matusik, Christoph Grill & Harald Kosch. Towards an easy to use authoring tool for interactive non-linear video. In: Multimedia Tools and Applications, Volume 70, Number 2, Springer Netherlands, pp. 1251-1276, ISSN 1380-7501, 2014.

[6] Britta Meixner & Harald Kosch. Interactive non-linear video: definition and XML structure. In: Proceedings of the 2012 ACM symposium on Document engineering (DocEng ’12). ACM, New York, NY, USA, 49-58, 2012.

[7] Britta Meixner, Beate Siegel, Peter Schultes, Franz Lehner & Harald Kosch. An HTML5 Player for Interactive Non-linear Video Time-based Collaborative Annotations. In: Proceedings of the 10th International Conference on Advances in Mobile Computing & Multimedia, MoMM ’13, ACM, New York, NY, USA, pp. 490-499, 2013.

[8] Britta Meixner, Stefan John & Christian Handschigl. SIVA Suite: Framework for Hypervideo Creation, Playback and Management. In: Proceedings of the 23rd Annual ACM Conference on Multimedia Conference (MM ’15). ACM, New York, NY, USA, 713-716, 2015.

[9] Britta Meixner & Jürgen Hoffmann. Intelligent Download and Cache Management for Interactive Non-Linear Video. In: Multimedia Tools and Applications, Volume 70, Number 2, Springer Netherlands, pp. 905-948, ISSN 1380-7501, 2014.

[10] Britta Meixner. Annotated Interactive Non-linear Video – Software Suite, Download and Cache Management. Doctoral Thesis, University of Passau, 2014.

An interview with Klara Nahrstedt

Michael Riegler (MR): Describe your journey into computing from your youth up to the present. What foundational lessons did you learn from this journey? Why were you initially attracted to multimedia?

Prof. Klara Nahrstedt


Klara Nahrstedt (KN): From my youth I have been attracted to and interested in mathematics, physics and other sciences. However, since most of my family were electrical and computer engineers, I was surrounded by engineering gadgets and devices, and one of them was a very early computer, able to answer various quiz questions about the world. I liked this new device with its great potential. Therefore, my interests and my family’s influence guided me towards an educational journey between science and engineering. I did my undergraduate studies in Mathematics and my Diploma work in Numerical Analysis at the Humboldt University zu Berlin in East Germany. And after the Berlin Wall came down in 1989, my educational journey led me to the Computer and Information Science Department at the University of Pennsylvania in Philadelphia, where I did my PhD degree and studied multimedia systems and networking.

My interest in multimedia came during my time at the Institute for Informatik, where I worked as a research programmer. This was the time after my Diploma Degree and after my System Administrator job at the Computer Center of the Ministry of Agriculture in East Berlin. This was the time when Europe, in contrast to the USA, invested heavily in the new ISO-defined X.25-based digital networking technology, and with it in the new X.400 email system and its applications. One of the very interesting discussions at the time was to transport via email not only text messages, but also digital audio and images as messages. I wanted to be part of the discussion, since I believed that a picture (image) is worth 1000 words and that auditory interfaces would make it easier for users to enter messages than typing text. I wanted to help develop solutions that would enable transport of these multi-modal media, and my long journey into multimedia systems and networks started. After I joined the University of Pennsylvania, as part of my PhD work, I was exposed to the research in the GRASP laboratory, where researchers studied computer vision algorithms and cameras mounted on robots. As a researcher interested in networking and multimedia, it was very natural for me to explore the integrated multimedia networking problems for tele-robotic applications and enable video and control information to be transported from remote robots to operators and to visualize what the remote robot was doing. Since my PhD the journey into deep understanding of multimedia systems and networks continues as new knowledge, technologies, applications, and users emerge.

The foundational lessons that I learned from this journey are: (1) acquire very strong fundamental knowledge in science and humanities very early, independent of what future opportunities, jobs, interests, and circumstances guide you towards; (2) work hard and believe in yourself; and (3) keep continuously learning.

MR: Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

KN: During my professional life, I had three different roles: researcher, educator and provider of professional services in different functions.

  • As a researcher, my vision and objective are to provide theoretical and practical cyber-solutions that enable people to communicate seamlessly and in a trustworthy way with each other and with their physical environments.
  • As an educator, my vision and objective were and are to educate as best I can the next generation of undergraduate and graduate students who are very well prepared to tackle the numerous new challenges in the fast changing human-cyber-physical environments.
  • In the space of professional services, I served in various roles as a member of numerous program committees, as an organizing member and/or chair, co-chair, and editor of IEEE and ACM professional venues, as the chair of the ACM Special Interest Group on Multimedia (SIGMM), as a member of various departmental and college committees, and now as the Director of the Interdisciplinary Research Unit, the Coordinated Science Laboratory (CSL) in the College of Engineering at the University of Illinois at Urbana-Champaign. In each of the administrative and service roles, my vision and objective are to provide high quality service to the community, whether it is a high quality technical program at a conference or journal, fair and balanced allocation of resources that would advance the mission of SIGMM, or broad support of interdisciplinary work in CSL.

I hope to achieve the vision and objectives of my research, educational and professional service activities via hard work, continuous learning, willingness to listen to others, and a very strong collaboration with others, especially my students, colleagues and staff members that I interact with.

MR: Can you profile your current research, its challenges, opportunities, and implications?

KN: My current research moves in three different directions which have some commonalities, but also differences. The major commonality of my research is in aiming to solve the underlying joint performance and trust issues in resource management of multi-modal systems and networking that we find in the current human-cyber-physical systems. The three different directions of my research are: (a) 3D teleimmersive systems for tele-health, (b) trustworthy cyber-physical systems such as power-grid, oil and gas, and (c) trustworthy and timely cloud-based cyber-infrastructures for scientific instruments such as distributed microscopes.

In all of these directions, the challenges are in providing real-time acquisition, distribution, analysis and retrieval of multi-modal data in conjunction with providing security, reliability and safety.

The opportunities in the areas of human-cyber-physical systems in health, and critical infrastructures are enormous as people are aging, physical infrastructures are being fully stressed, and multimedia devices are challenging every societal cyber-infrastructure by generating Big Data in terms of their volume, velocity and variety.

We are living in truly exciting times as the digital systems are getting more and more complex. The implications are that we have a lot of work to do and many challenges to solve as a multimedia system and networking community in collaboration with many other communities. It is very clear that a single computing community is not able to solve the many problems that are coming upon us in the space of multi-modal human-cyber-physical systems. Inter- and cross-disciplinary research is the call of the day.

MR: How would you describe the role of women especially in the field of multimedia?

KN: “Difficult” comes to my mind. The number of women in multimedia computing is small and in multimedia systems and networks even smaller. I wish that the role and visibility of women in the multimedia technology field would be greater when it comes to IEEE and ACM awards, conference leadership roles, editorial board memberships, participation in SIGMM technical challenges, and other visible events and roles. Multimedia technology has become a ubiquitous base for numerous application fields including education, training, entertainment, health care, and social work, which have very strong representations of women in general. Hence, I believe that women in multimedia should play even more of a crucial role in the future than today, especially in innovation, leadership, and interconnection of multimedia computing technologies with the above mentioned application fields.

MR: How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

KN: My top innovative achievements range from bringing a much better understanding into the field of Quality of Service (QoS) Management and Quality of Service Routing for multimedia systems and networks, to developing novel real-time and trusted resource management architectures and protocols for complex multi-modal applications, systems and networks such as 3D teleimmersion, energy-efficient mobile multimedia, and the trustworthy smart grid, to name a few. My QoS research impact can be seen in current wide area wired and wireless networks and systems. The impact of the resource management algorithms, architectures and systems that I and my research group have developed can be seen throughout the Microsoft, Google, HP, and IBM solutions where my graduate and undergraduate students took up employment and brought with them research results and knowledge that then made their way into multimedia applications, systems and network products.

MR: Over your distinguished career, what are your top lessons you want to share with the audience?

KN: The top lessons that I would like to share are: be patient, honest, open-minded, and fair; don’t give up; be humble but don’t be shy to “toot your own horn” when appropriate; listen to what others have to say; and be respectful to others since everybody has something to contribute to the community and society in his/her own way.

 

MPEG Column: Press release for the 113th MPEG meeting

MPEG explores new frontiers for coding technologies with Genome Compression

Geneva, CH − The 113th MPEG meeting was held in Geneva, CH, from 19 to 23 October 2015.

MPEG issues Call for Evidence (CfE) for Genome Compression and Storage

At its 113th meeting, MPEG took its first formal step toward leveraging its compression expertise to code an entirely new kind of essential information, i.e. the single recipe that describes each one of us as an individual: the human genome. A genome comprises the DNA sequences, which may contain up to 300 billion DNA pairs, that make up the genetic information within each human cell. It is fundamentally the complete set of our hereditary information.

To aid in the representation and storage of this unique information, MPEG has issued a Call for Evidence (CfE) on Genome Compression and Storage with the goal of assessing the performance of new technologies for the efficient compression of genomic information when compared to currently used file formats. This is vitally important because the amount of genomic and related information from a sequence can be as high as several Tbytes (trillion bytes).

Additional purposes of the call are to:

  • become aware of which additional functionalities (e.g. non-sequential access, lossy compression efficiency, etc.) are provided by these new technologies
  • collect information that may be used in drafting a future Call for Proposals

Responses to the CfE will be evaluated during the 114th MPEG meeting in February 2016.

Detailed information, including how to respond to the CfE, will soon be available as documents N15740 and N15739 at the 113th meeting website.

Future Video Coding workshop explores requirements and technologies for the next video codec

A workshop on Future Video Coding Applications and Technologies was held on October 21st, 2015 during the 113th MPEG meeting in Geneva. The workshop was organized to acquire relevant information about the context in which video coding will be operating in the future, and to review the status of existing technologies with merits beyond the capabilities of HEVC, with the goal of guiding future codec standardization activity.

The event featured speakers from the MPEG community and invited outside experts from industry and academia, and covered several topics related to video coding. Prof. Patrick Le Callet from the University of Nantes presented recent results in the field of objective and subjective video quality evaluation. Various applications of video compression were introduced by Prof. Doug Young Suh from Kyung Hee University, Stephan Wenger from Vidyo, Jonatan Samuelsson from Ericsson, and Don Wu from HiSilicon. Dr. Stefano Andriani presented the Digital Cinema Workflow from ARRI. Finally, Debargha Mukherjee from Google and Tim Terriberry from Mozilla gave an overview of recent algorithmic improvements in the development of the VP10 and Daala codecs, and discussed the motivation for the royalty-free video compression technologies developed by their companies.

The workshop took place at a very timely moment when MPEG and VCEG (ITU-T SG16 Q9) experts decided to join efforts in developing extensions of the HEVC standard for HDR.

MPEG‑V 3rd Edition reaches FDIS status for communication between actors in virtual and physical worlds

Parts 1‑6 of the 3rd Edition of MPEG‑V, to be published as ISO/IEC 23005‑[1‑6]:2016, have reached FDIS status — the final stage in the development of a standard prior to formal publication by ISO/IEC — at the 113th MPEG meeting. MPEG‑V specifies the architecture and associated representations to enable interaction between digital content and virtual worlds with the physical one, as well as information exchange between virtual worlds. Features of MPEG‑V enable the specification of multi-sensorial content associated with audio/video data, and control of multimedia applications via advanced interaction devices. In this 3rd Edition, MPEG‑V also includes technology for environmental and camera-related sensors, and 4D-theater effects.

Configurable decoder framework extended with new bitstream parser

The Reconfigurable Media Coding framework, MPEG’s toolkit that enables the expression of the functions of a decoder in terms of functional units and data models, has been extended by a new building block called Parser Instantiation from BSD. Parser Instantiation from BSD can interpret information about a bitstream that is described via the Bitstream Syntax Description Language (defined in ISO/IEC 23001-5), and automatically instantiate a functional unit that is able to correctly parse the bitstream. This will, for example, enable on-the-fly decoding of bitstreams that have been reconfigured for dedicated purposes. The specification of the new technology will be included in a new edition of ISO/IEC 23001-4, which has also been issued by MPEG at its 113th meeting.
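
The general idea of deriving a parser from a declarative syntax description can be illustrated with a minimal sketch. It is not the MPEG specification or the actual Bitstream Syntax Description Language (which is XML Schema based); the field list format, the function name instantiate_parser, and the example header layout are all assumptions made purely for illustration.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical, simplified syntax description: (field name, width in bits).
# This only mimics the idea of a declarative bitstream description; it is
# not the BSDL format defined in ISO/IEC 23001-5.
SyntaxDescription = List[Tuple[str, int]]


def instantiate_parser(description: SyntaxDescription) -> Callable[[bytes], Dict[str, int]]:
    """Build a parser function from a declarative field layout."""
    total_bits = sum(width for _, width in description)

    def parse(data: bytes) -> Dict[str, int]:
        if len(data) * 8 < total_bits:
            raise ValueError("bitstream shorter than described syntax")
        # Treat the input as one big-endian integer and slice fields off the top.
        value = int.from_bytes(data, "big")
        remaining = len(data) * 8
        fields: Dict[str, int] = {}
        for name, width in description:
            remaining -= width
            fields[name] = (value >> remaining) & ((1 << width) - 1)
        return fields

    return parse


# Assumed example layout: 4-bit version, 4-bit flags, 16-bit payload length.
header_syntax = [("version", 4), ("flags", 4), ("payload_length", 16)]
parse_header = instantiate_parser(header_syntax)
header = parse_header(bytes([0x12, 0x00, 0x2A]))
```

Changing the description (for example, widening a field) yields a different parser without touching the parsing code, which is the property the reconfigurable approach relies on.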

Multimedia Preservation Application Format (MP-AF) is finalized

At the 113th MPEG meeting, the Multimedia Preservation Application Format (MP-AF, ISO/IEC 23000-15) reached the Final Draft International Standard (FDIS) stage. This new standard provides standardized description information to enable users to plan, execute, and evaluate preservation operations (e.g., checking the integrity of preserved content, migrating preserved content from one system to another, or replicating part or all of the preserved content) in order to achieve the objectives of digital preservation. The standard also provides the industry with a coherent and consistent approach to managing multimedia preservation so that it can be implemented in a variety of scenarios. This includes applications, systems, and methods, as well as different hardware and software in varying administrative domains, independent of technological changes.

Implementation guidelines and reference software for MP-AF are under development and have reached Draft Amendment stage at the 113th MPEG meeting. This latest amendment contains examples of applying MP-AF to use cases from the media industry. The final International Standard of MP-AF is expected to be issued at the 114th meeting (February 2016, San Diego).

Seminar for Genome Compression Standardization planned for the 114th MPEG meeting

After its successful seminar on “Prospects on Genome Compression Standardization” held in Geneva during the 113th meeting, MPEG plans to hold another open seminar at its next meeting in San Diego, California on 23rd February 2016 to collect further input and perspectives on genome data standardization from parties interested in the acquisition and processing of genome data.

The main topics covered by the planned seminar presentations are:

  • New approaches, tools and algorithms to compress genome sequence data
  • Genome compression and genomic medicine applications
  • Objectives and issues of quality scores compression and impact on downstream analysis applications

All interested parties are invited to join the seminar to learn more about genome data processing challenges and planned MPEG standardization activities in this area, share opinions, and work together towards the definition of standard technologies supporting improved storage, transport and new functionality for the processing of genomic information.

The seminar is open to the public and registration is free of charge.

Further logistical information on the 114th MPEG meeting is available online, together with the detailed program of the seminar.

How to contact MPEG, learn more, and find other MPEG facts

To learn about MPEG basics, discover how to participate in the committee, or find out more about the array of technologies developed or currently under development by MPEG, visit MPEG’s home page at http://mpeg.chiariglione.org. There you will find information publicly available from MPEG experts past and present including tutorials, white papers, vision documents, and requirements under consideration for new standards efforts.

Tutorials available on the MPEG homepage include High Efficiency Video Coding, Advanced Audio Coding, Universal Speech and Audio Coding, and DASH, to name a few. A rich and growing repository of white papers can also be found there. These papers and tutorials are freely available for many of MPEG’s standards. Press releases from previous MPEG meetings are also available. Journalists who wish to receive MPEG Press Releases by email should contact Dr. Arianne T. Hinds at a.hinds@cablelabs.com.

Further Information

Future MPEG meetings are planned as follows:

No. 114, San Diego, CA, USA, 22 – 26 February 2016
No. 115, Geneva, CH, 30 May – 03 June 2016
No. 116, Chengdu, CN, 17 – 21 October 2016
No. 117, Geneva, CH, 16 – 20 January 2017

 

 

An interview with Wallapak Tavanapong

MR: Describe your journey into computing from your youth up to the present.

Wallapak Tavanapong

Pak: I started learning about computing quite late. I did not know what a computer was until I joined a B.S. degree program in Computer Science at Thammasat University, Thailand, and learned the foundations there. After finishing the degree, I joined the M.S. program in Computer Science at the University of Central Florida (UCF), Orlando, Florida, USA. UCF was a great learning place for me. I had a wonderful advisor, Prof. Kien A. Hua, good classes, and great friends. My research at the time was on video-on-demand, which was a hot topic then. After my Ph.D., I joined the Department of Computer Science at Iowa State University in 1999 as an Assistant Professor and was recently promoted to Full Professor.

Iowa State University is a great place for my career. In the beginning, I continued with the research in video-on-demand and multimedia caching. In 2003, my colleagues, Profs. JungHwan Oh, Piet C. de Groen, Johnny Wong, and I began investigating automated content analysis of endoscopic video for improving the quality of the procedure. At the time, few works existed, and most were on automated detection of polyp appearance in images. Our approach is to automatically analyze an entire procedure, calculate detailed objective metrics that reflect the quality of inspection for the entire procedure, and provide real-time feedback to help the endoscopist improve the quality. We co-founded EndoMetric Corporation to transfer the technology into practice. I am glad that this research area receives much more attention now, both in academia and industry, and that our work has had some influence on later work. In 2013, I began new interdisciplinary research and education initiatives in political informatics and computational communication and advertising.

MR: What foundational lessons did you learn from this journey?

Pak: First, never give up when facing difficulty. Second, there are several paths toward good research. I am more attracted to research problems in a different discipline. I like to create a new computing research problem out of vague problem descriptions in other disciplines. I love interdisciplinary research.

MR: Why were you initially attracted to multimedia?

Pak: My initial interest was in database research. As data began to come in different media types, extension to multimedia was natural.

MR: Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

Pak: First, I’d like to see my research help to prevent or reduce suffering from cancer for many. To achieve this goal, I need to do more to push my technology into practice. Second, I’d like to see computational thinking integrated into science and math curricula in elementary schools in the US and other countries soon. Over the past five years, I have been engaged in our departmental K-12 outreach activities, coaching K-12 kids and interested K-12 teachers in computational thinking. I’d like to see more women in computer science and computing fields. In our K-12 outreach program, we found that young girls started losing their interest in science as early as the fifth grade. So, I hope to get them interested in computing early, in the third grade. Last, I’d like to see my interdisciplinary work with political scientists and communication scholars lead to a national social multimedia repository that is useful for social scientists and the public to learn about decision making in public policies that affect many lives.

MR: Can you profile your current research, its challenges, opportunities, and implications?

Pak: My top two projects are

  • Reconstruction of a virtual colon from 2D colonoscopic images:

    The human colon is a complex tubular structure with multiple twists and turns. A good colon exam increases early detection of colorectal cancer. I’d like to provide a 3D colon inspection map during the procedure so that the endoscopist knows which areas inside the colon they might have missed. There are many challenges. The most critical one is that commonly used endoscopes are not equipped with 3D camera positioning technology. I am working to add low-cost hardware equipment that provides some position information. I will utilize the position together with content analysis of endoscopic images to reconstruct the virtual colon. The work has the potential to increase the polyp detection rate during colonoscopy, preventing deaths and reducing pain and suffering.

  • Multimedia information system for political science and communication:

    This system would help answer research questions in political science and communication that could not have been answered before because of the sheer volume, variety, and velocity of data. Specifically, my team is working on understanding how states learn about policies from one another, how news reporters carry information from state legislatures to the public, how a public policy is influenced, etc. This is an application domain that lends itself to multimedia research, ranging from the underlying data management technology and automated content analysis of multiple media types and sources (web and video online ads, TV ads, state bills and laws, and tweets by political figures) to visualization of the resulting knowledge from the analysis.

MR: How would you describe the role of women especially in the field of multimedia?

Pak: I think the role of women in multimedia is the same as that of men. But our number is much lower. We need to increase the number of women in the field. I believe that we need to get young girls interested in computing as early as elementary school.

MR: How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

Pak: I would say that my top achievement so far is in the idea and the realization of real-time computer-aided analysis and feedback to improve the quality of colonoscopy. We were the first to investigate this problem. There are several challenges: for instance, defining what to analyze that reflects quality as seen by the domain experts, coming up with effective algorithms to compute the quality measurements, showing that the automated measurement indeed improves quality, making the automated analysis real-time, effective, and low-cost enough to be used in practice, and deploying the technology for daily use in hospitals and clinics.

My technology has already saved a couple of lives and I would like it to do more in the future. I have seen more researchers in academia and industry get into this research area, which is great. We need more researchers and developers in multimedia and healthcare to help medical professionals improve the quality of care via automation.

MR: Over your distinguished career, what are your top lessons you want to share with the audience?

Pak: Never give up. Find good mentors who care about you, believe in you, and give you different perspectives. A peer mentor is great. I learn a lot from my colleagues. Find a research problem you are passionate about. Last, when realizing that there is a problem, do not complain, look for a good solution, and fix it.

 

Call for Nominations for Editor-In-Chief of ACM TiiS

Call for Nominations
Editor-In-Chief
ACM Transactions on Interactive Intelligent Systems (TiiS)

The term of the current Editors-in-Chief (EiC) of the ACM Transactions on Interactive Intelligent Systems (TiiS) is coming to an end, and the ACM Publications Board has set up a nominating committee to assist the Board in selecting the next Editor(s)-in-Chief.

Nominations, including self nominations, are invited for a three-year term as TiiS EiC, beginning on January 1, 2016. The EiC appointment may be renewed at most one time. This is an entirely voluntary position, but ACM will provide appropriate administrative support.

The EiC is responsible for maintaining the highest editorial quality, for setting technical direction of the papers published in TiiS, and for maintaining a reasonable pipeline of articles for publication. He/she has final say on acceptance of papers, size of the Editorial Board, and appointment of Associate Editors. The EiC is expected to adhere to the commitments expressed in the policy on Rights and Responsibilities in ACM Publishing. For more information about the role of the EiC, see ACM’s Evaluation Criteria for Editors-in-Chief.

Nominations should include a vita along with a brief statement of why the nominee should be considered. Self-nominations are encouraged, and should include a statement of the candidate’s vision for the future development of TiiS. The deadline for submitting nominations is November 30, 2015, although nominations will continue to be accepted until the position is filled.

Please send all nominations to the nominating committee chair, Matthew Turk (mturk@cs.ucsb.edu).

The search committee members are:

  • Matthew Turk (UC Santa Barbara), Chair
  • Elisabeth André (University of Augsburg)
  • Henry Lieberman (MIT)
  • Desney Tan (Microsoft Research)

Report from ICACNI 2015

Report from the 3rd International Conference on Advanced Computing, Networking, and Informatics

Inauguration of 3rd ICACNI 2015

The 3rd International Conference on Advanced Computing, Networking and Informatics (ICACNI-2015), organized by the School of Computer Engineering, KIIT University, Odisha, India, was held on 23-25 June 2015.

Prof. Nikhil R. Pal during his keynote

The conference commenced with a keynote by Prof. Nikhil R. Pal (Fellow IEEE, Indian Statistical Institute, Kolkata, India) on ‘A Fuzzy Rule-Based Approach to Single Frame Super Resolution’.

Authors listening to technical presentations

Apart from three regular tracks on advanced computing, networking, and informatics, the conference hosted three invited special sessions. Of the more than 550 articles received across the different tracks of the conference, 132 were selected for presentation and for publication in Springer's Smart Innovation, Systems and Technologies series as Volumes 43 and 44.

Prof. Nabendu Chaki during his technical talk

Extended versions of a few outstanding articles will be published in special issues of the Egyptian Informatics Journal and Innovations in Systems and Software Engineering (A NASA Journal). The conference also featured a technical talk by Prof. Nabendu Chaki (Senior Member IEEE, Calcutta University, India) on ‘Evolution from Web-based Applications to Cloud Services: A Case Study with Remote Healthcare’.

A snapshot from the award ceremony

The conference recognized some excellent work and gave eight awards in different categories. It succeeded in bringing together academic scientists, professors, research scholars, and students to share and disseminate knowledge and scientific research related to the conference topics. The 4th ICACNI (2016) is scheduled to be held at the National Institute of Technology Rourkela, Odisha, India.

A quick interview with Open Source Software Competition’s organizers

The program of ACM Multimedia is very diverse: apart from oral and poster presentations, panels, and keynotes, there are challenges and competitions. One that is particularly interesting is the Open Source Software Competition, which is largely specific to this conference and was started at ACM Multimedia 2004. The full list of participants and winners (along with links to all the projects) can be found on the SIGMM web site: http://sigmm.org/Resources/software/ossc. This list shows that over the years this session has drawn growing (and well-deserved) attention from the community. We asked the chairs of the ACM MM 2013 (Andrea Vedaldi & Ioannis Patras, answering as Org2013) and 2015 (Xian-Sheng Hua, Marco Bertini & Tao Mei, answering as Org2015) competitions about their experience with and opinions of the competition.

Q1: How hard was it to get submissions to OSSC? Did you have to ask authors of software you knew, or are they aware of this part of the ACM MM programme? Overall, how many submissions did you receive?

Org2013: We did not have to ask any author directly. We only circulated an advertisement to three mailing lists, including a CV and an ML one. The competition seems to be sufficiently well known that it is able to attract submissions with little effort.

Org2015: It was not that hard. We contacted some authors asking for submissions, but in the end the majority of submissions came from people who already knew the competition or from the call for papers we disseminated. We received 15 submissions, of which 9 were accepted. Decisions were taken considering the quality of the presentation and of the software itself, as well as the importance and utility for the multimedia community.

Q2: What’s your evaluation of the quality of submissions? Have you ever used software from past submissions?

Org2013: Half of the submissions were of very high quality, both in scope and maturity of the projects. A few were very poor, at the level of a master's project at most. (Note: Andrea won the ACM MM’10 competition with his VLFeat library).

Org2015: Quality is quite high. We accepted works that were interesting and useful for the community and that were also mature enough to be used by members of the multimedia community. Marco Bertini: I already use some software from this year's submissions, and I also use software from past editions.

Q3: What’s your evaluation of OSSC per se? Do you think other conferences should have something similar?

Org2013: It is a very good competition as it gives a chance to the authors of the software to obtain a publication and significant publicity (especially in the case of victory). It is also a great way to let the public know about solid OS projects. Having multiple competitions is tempting as contributions tend to be quite orthogonal (e.g. audio vs database vs networking vs imaging). At the same time, the number of contributions does not seem to warrant splitting the effort up.

Org2015: It is an interesting and useful track for ACM Multimedia. It has both scientific and technical value: it eases the development of new algorithms and methods, and makes it easier to re-implement the methods proposed by other researchers. The effort of the authors of such software deserves to be recognized by the scientific community. Other major conferences in different fields of CS should probably introduce this type of track.