MPEG Column: 107th MPEG Meeting

Original posts here and here on the Multimedia Communication blog and the bitmovin techblog, by Christian Timmerer (AAU/bitmovin)

The MPEG-2 Transport Stream (M2TS; formally known as Rec. ITU-T H.222.0 | ISO/IEC 13818-1) has been awarded the Technology & Engineering Emmy® Award by the National Academy of Television Arts & Sciences. This is the fourth time MPEG has received an Emmy award. The M2TS is widely deployed across a broad range of application domains such as broadcast, cable TV, Internet TV (IPTV and OTT), and Blu-ray Discs. The Emmy was received during CES 2014 in Las Vegas.

Plenary during the 107th MPEG Meeting.

Other topics of the 107th MPEG meeting in San Jose include the following highlights:

  • Requirements: Call for Proposals on Screen Content jointly with ITU-T’s Video Coding Experts Group (VCEG)
  • Systems: Committee Draft for Green Metadata
  • Video: Study Text Committee Draft for Compact Descriptors for Visual Search (CDVS)
  • JCT-VC: Draft Amendment for HEVC Scalable Extensions (SHVC)
  • JCT-3D: Proposed Draft Amendment for HEVC 3D Extensions (3D-HEVC)
  • Audio: 3D audio plans to progress to CD at 108th meeting
  • 3D Graphics: Working Draft 4.0 of Augmented Reality Application Format (ARAF) 2nd Edition

The official MPEG press release can be downloaded from the MPEG Web site. Some of the above highlighted topics will be detailed in the following and, of course, there’s an update on DASH-related matters at the end.

Call for Proposals on Screen Content

Screen content refers to content coming not from cameras but from screen/desktop sharing and collaboration, cloud computing and gaming, wirelessly connected displays, control rooms with high resolution display walls, virtual desktop infrastructures, tablets as secondary displays, PC over IP, ultra-thin client technology, etc. Mixed content is also within the scope of this work item; it may contain a mixture of camera-captured video and images with rendered computer-generated graphics, text, animation, etc.

Although this type of content was considered during the course of the HEVC standardization, recent studies in MPEG have led to the conclusion that significant further improvements in coding efficiency can be obtained by exploiting the characteristics of screen content and, thus, a Call for Proposals (CfP) is being issued for developing possible future extensions of the HEVC standard.

Companies and organizations are invited to submit proposals in response to this call, which is issued jointly by MPEG and ITU-T VCEG. Responses are expected to be submitted by early March and will be evaluated during the 108th MPEG meeting. The timeline is as follows:

  • 2014/01/17: Final Call for Proposals
  • 2014/01/22: Availability of anchors and end of editing period for Final CfP
  • 2014/02/10: Mandatory registration deadline
    One of the contact persons (see Section 10) must be notified, and an invoice for the testing fee will be sent after registration. Additional logistic information will also be sent to proponents by this date.
  • 2014/03/05: Coded test material shall be available at the test site. By this date, the payment of the testing fee is expected to be finalized.
  • 2014/03/17: Submission of all documents and requested data associated with the proposal.
  • 2014/03/27-04/04: Evaluation of proposals at standardization meeting.
  • 2015: Final draft standard expected.

It will be interesting to see the coding efficiency of the submitted proposals compared to a pure HEVC or even AVC approach.

DEC PDP-8 at Computer History Museum during MPEG Social Event.

Committee Draft for Green Metadata

Green Metadata, formerly known as Green MPEG, shall enable energy-efficient media consumption and reached Committee Draft (CD) status at the 107th MPEG meeting. The representation formats defined within Green Metadata help reduce decoder power consumption and display power consumption. Clients may utilize such information for the adaptive selection of operating voltage or clock frequencies within their chipsets. Additionally, it may be used to set the brightness of the display backlight to save power.

Green Metadata also provides metadata for the signaling and selection of DASH representations to enable the reduction of power consumption for their encoding.

The main challenge in terms of adoption of this kind of technology is how to exploit these representation formats to actually achieve energy-efficient media consumption, and how much energy can be saved in practice.

What’s new on the DASH frontier?

The text of ISO/IEC 23009-1 2nd edition PDAM1 has been approved, which may be referred to as MPEG-DASH v3 (once finalized and integrated into the second edition, possibly with further amendments and corrigenda, if applicable). This first amendment to MPEG-DASH v2 comprises accurate time synchronization between server and client for live services as well as a new profile, i.e., the ISOBMFF High Profile, which basically combines the ISOBMFF Live and ISOBMFF On-demand profiles and adds the Xlink feature.

Additionally, a second amendment to MPEG-DASH v2 has been started featuring Spatial Relationship Description (SRD) and DASH Client Authentication and Content Access Authorization (DAA).

Other DASH-related aspects include the following:

  • The common encryption for ISOBMFF has been extended with a simple pattern-based encryption mode, i.e., a new method which should simplify content encryption.
  • The CD has been approved for the carriage of timed metadata metrics of media in ISOBMFF. This allows for the signaling of quality metrics within the segments enabling QoE-aware DASH clients.
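The pattern-based encryption mode mentioned above can be pictured with a minimal sketch, assuming a repeating pattern in which a fixed number of 16-byte blocks is encrypted and a fixed number is then left in the clear. The function name and the 1:9 pattern in the note below are illustrative assumptions and are not taken from the amendment text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: marks which full 16-byte blocks of a sample would be
// encrypted under a repeating pattern of `cryptBlocks` encrypted blocks
// followed by `skipBlocks` clear blocks. A trailing partial block is not
// part of the mask, i.e. this sketch treats it as clear.
std::vector<bool> encryptedBlockMask(std::size_t sampleBytes,
                                     std::size_t cryptBlocks,
                                     std::size_t skipBlocks) {
    const std::size_t kBlockSize = 16;  // AES block size in bytes
    const std::size_t fullBlocks = sampleBytes / kBlockSize;
    const std::size_t period = cryptBlocks + skipBlocks;
    std::vector<bool> mask(fullBlocks, false);
    if (period == 0) return mask;  // degenerate pattern: nothing encrypted
    for (std::size_t i = 0; i < fullBlocks; ++i) {
        mask[i] = (i % period) < cryptBlocks;
    }
    return mask;
}
```

With an assumed 1:9 pattern, `encryptedBlockMask(n, 1, 9)` marks only every tenth block, which illustrates how such a mode could cut the amount of data that has to pass through the cipher.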

What else? That is, some publicly available MPEG output documents… (Dates indicate availability and end of editing period, if applicable, using the following format YY/MM/DD):

  • Report of 3D-AVC Subjective Quality Assessment (14/02/28)
  • Working Draft 3 of Video Coding for Browsers (14/01/31)
  • Common Test Conditions for Proposals on VCB Enhancements (14/01/17)
  • Study Text of ISO/IEC CD 15938-13 Compact Descriptors for Visual Search (14/02/14)
  • WD 4.0 of ARAF 2nd Edition (14/02/07)
  • Text of ISO/IEC 23001-7 PDAM 1 Simple pattern-based encryption mode (14/01/31)
  • Text of ISO/IEC CD 23001-10 Carriage of Timed Metadata Metrics of Media in the ISO Base Media File Format (14/01/31)
  • Text of ISO/IEC CD 23001-11 Green Metadata (14/01/24)
  • Preliminary Draft of ISO/IEC 23008-2:2013/FDAM1 HEVC Range Extensions (14/02/28)
  • Text of ISO/IEC 23008-2:2013/DAM3 HEVC Scalable Extensions (14/01/31)
  • Preliminary Draft of ISO/IEC 23008-2:2013/FDAM2 HEVC Multiview Extensions (14/02/28)
  • Text of ISO/IEC 23008-2:2013/PDAM4 3D Extensions (14/03/14)
  • Text of ISO/IEC CD 23008-12 Image File Format (14/01/17)
  • Text of ISO/IEC 23009-1:201x DCOR 1 (14/01/24)
  • Text of ISO/IEC 23009-1:201x PDAM 1 High Profile and Availability Time Synchronization (14/01/24)
  • WD of ISO/IEC 23009-1 AMD 2 (14/01/31)
  • Requirements for an extension of HEVC for coding of screen content (14/01/17)
  • Joint Call for Proposals for coding of screen content (14/01/22)
  • Draft requirements for Higher Dynamic Range (HDR) and Wide Color Gamut (WCG) video coding for Broadcasting, OTT, and Storage Media (14/01/17)
  • Working Draft 1 of Internet Video Coding (IVC) (14/01/31)

ACM TOMM (TOMCCAP) Call for Special Issue Proposals

ACM TOMM is one of the world’s leading journals on multimedia. As in previous years, we are planning to publish a special issue in 2015. Proposals are accepted until May 1st, 2014. Each special issue is the responsibility of its guest editors. If you wish to guest edit a special issue, you should prepare a proposal as outlined below, then send it via e-mail to the Senior Associate Editor (SAE) for Special Issue Management of TOMM, Shervin Shirmohammadi (shervin@discover.uottawa.ca).

Call for Proposals – Special Issue
Deadline for Proposal Submission: May 1st, 2014
Notification: June 1st, 2014
http://tomccap.acm.org/
Proposals should:

  • Cover a current or emerging topic in the area of multimedia computing, communications and applications;
  • Set out the importance of the special issue’s topic in that area;
  • Give a strategy for the recruitment of high quality papers;
  • Indicate a draft timeline in which the special issue could be produced (paper writing, reviewing, and submission of final copies to TOMM), assuming the proposal is accepted;
  • Include the list of the proposed guest editors, their short bios, and their experience as related to the Special Issue’s topic.

As in previous years, the special issue will be published as an online-only issue in the ACM Digital Library. This gives the guest editors more flexibility in the review process and in the number of papers to be accepted, while still ensuring timely publication.

The proposals will be reviewed by the SAE together with the EiC. The final decision will be made by the EiC. A notification of acceptance for the proposals will be given by June 1st, 2014. Once a proposal is accepted we will contact you to discuss the further process.

For questions please contact:

  • Shervin Shirmohammadi – Senior Associate Editor for Special Issue Management ( shervin@discover.uottawa.ca )
  • Ralf Steinmetz – Editor in Chief (EiC) ( steinmetz.eic@kom.tu-darmstadt.de )
  • Sebastian Schmidt – Information Director ( TOMCCAP@kom.tu-darmstadt.de )

VIREO-VH: Libraries and Tools for Threading and Visualizing a Large Video Collection

Introduction

“Video Hyperlinking” refers to the creation of links connecting videos that share near-duplicate segments. Like hyperlinks in HTML documents, the video links help users navigate videos of similar content, and facilitate the mining of iconic clips (or visual memes) spread among videos. Figure 1 shows some examples of iconic clips, which can be leveraged for linking videos; the results are potentially useful for multimedia tasks such as video search, mining and analytics.

VIREO-VH [1] is open source software developed by the VIREO research team. The software provides end-to-end support for the creation of hyperlinks, including libraries and tools for threading and visualizing videos in a large collection. The major software components are: near-duplicate keyframe retrieval, partial near-duplicate localization with time alignment, and galaxy visualization. These functionalities are mostly implemented based on state-of-the-art technologies, and each of them is developed as an independent tool with flexibility in mind, such that users can substitute any of the components with their own implementation. The earlier versions of the software are LIP-VIREO and SOTU, which have been downloaded more than 3,500 times. VIREO-VH has been used internally by VIREO since 2007, and has evolved over the years based on the experience of developing various multimedia applications, such as news event evolution analysis, novelty reranking, multimedia-based question-answering [2], cross media hyperlinking [3], and social video monitoring.

Figure 1: Examples of iconic clips.

Functionality

The software components include video pre-processing, bag-of-words based inverted file indexing for scalable near-duplicate keyframe search, localization of partial near-duplicate segments [4], and galaxy visualization of a video collection, as shown in Figure 2. The open source includes over 400 methods with 22,000 lines of code.

The workflow of the open source is as follows. Given a collection of videos, the visual content will be indexed based on a bag-of-words (BoW) representation. Near-duplicate keyframes will be retrieved and then temporally aligned in a pairwise manner among videos. Segments of a video which are near-duplicates of other videos in the collection will then be hyperlinked, with the start and end times of the segments being explicitly logged. The end product is a galaxy browser, where the videos are visualized as a galaxy of clusters in a Web browser, with each cluster being a group of videos that are hyperlinked directly or indirectly through transitivity propagation. User-friendly interaction is provided such that end users can zoom in and out, so they can glance at or closely inspect the relationships among videos.
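The grouping of videos that are hyperlinked "directly or indirectly through transitivity propagation" can be sketched with a standard union-find (disjoint-set) structure. This is a generic illustration under our own assumptions, not the VIREO-VH implementation; the names `VideoClusters` and `clusterVideos` are hypothetical, and videos are simply identified by integer indices.

```cpp
#include <cassert>
#include <numeric>
#include <utility>
#include <vector>

// Minimal union-find (disjoint-set) structure over video indices 0..n-1.
class VideoClusters {
public:
    explicit VideoClusters(int n) : parent_(n) {
        std::iota(parent_.begin(), parent_.end(), 0);  // each video starts alone
    }
    int find(int v) {
        while (parent_[v] != v) {
            parent_[v] = parent_[parent_[v]];  // path halving
            v = parent_[v];
        }
        return v;
    }
    void link(int a, int b) { parent_[find(a)] = find(b); }
    bool sameCluster(int a, int b) { return find(a) == find(b); }

private:
    std::vector<int> parent_;
};

// Builds clusters from pairwise hyperlinks, given as pairs of video indices.
VideoClusters clusterVideos(int numVideos,
                            const std::vector<std::pair<int, int>>& links) {
    VideoClusters clusters(numVideos);
    for (const auto& l : links) clusters.link(l.first, l.second);
    return clusters;
}
```

If video 0 shares a segment with video 1, and video 1 shares one with video 2, then videos 0 and 2 end up in the same cluster even though no direct link between them was found.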

Figure 2: Overview of VIREO-VH software architecture.

Interface

VIREO-VH can be used either as an end-to-end system that takes a video collection as input and outputs visual hyperlinks, or as a set of independent functions for the development of different applications.

For content owners interested in the content-wise analysis of a video collection, VIREO-VH can be used as an end-to-end system by simply inputting the location of a video collection and the output paths (Figure 3). The resulting output can then be viewed with the provided interactive interface, which gives a quick overview of the video relationships in the collection.

Figure 3: Interface for end-to-end processing of video collection.

VIREO-VH also provides libraries to grant researchers programmatic access. The libraries consist of various classes (e.g., Vocab, HE, Index, SearchEngine and CNetwork), providing different functions for vocabulary and Hamming signature training [5], keyframe indexing, near-duplicate keyframe searching and video alignment. Users can refer to the manual for details. Furthermore, the components of VIREO-VH are independently developed to provide flexibility, so users can substitute any of the components with their own implementation. This capability is particularly useful for benchmarking the users’ own choice of algorithms. For instance, users can choose their own visual vocabulary and Hamming median, but use the open source for building the index and retrieving near-duplicate keyframes. The following few lines of code implement a typical image retrieval system:

    #include "Vocab_Gen.h"
    #include "Index.h"
    #include "HE.h"
    #include "SearchEngine.h"
    …
    // train visual vocabulary using descriptors in folder "dir_desc"
    // here we choose to train a hierarchical vocabulary with 1M leaf nodes (3 layers, 100 nodes / layer)
    Vocab_Gen::genVoc("dir_desc", 100, 3);

    // load pre-trained vocabulary from disk
    Vocab* voc = new Vocab(100, 3, 128);
    voc->loadFromDisk("vk_words/");

    // Hamming Embedding training for the vocabulary
    HE* he = new HE(32, 128, p_mat, 1000000, 12);
    he->train(voc, "matrix", 8);

    // index the descriptors with inverted file
    Index::indexFiles(voc, he, "dir_desc/", ".feat", "out_dir/", 8);

    // load index and conduct online search for images in "query_desc"
    SearchEngine* engine = new SearchEngine(voc, he);
    engine->loadIndexes("out_dir/");
    engine->search_dir("query_desc", "result_file", 100);
    …
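For readers unfamiliar with Hamming Embedding [5], the core idea is that two descriptors quantized to the same visual word are only accepted as a match if their short binary signatures are close in Hamming distance. The sketch below illustrates that test in isolation; it is not part of the VIREO-VH API, and the 32-bit signature width, the function names and the threshold are assumptions made for illustration.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// Number of differing bits between two 32-bit binary signatures.
inline int hammingDistance(std::uint32_t a, std::uint32_t b) {
    return static_cast<int>(std::bitset<32>(a ^ b).count());
}

// Hypothetical filter: two descriptors that fall into the same visual word
// are kept as a match only if their signatures are close enough.
inline bool isMatch(std::uint32_t a, std::uint32_t b, int threshold) {
    return hammingDistance(a, b) <= threshold;
}
```

A suitable threshold depends on the signature length and the data, so it would normally be tuned rather than fixed in code.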

Example

We use a video collection consisting of 220 videos (around 31 hours) as an example. The collection was crawled from YouTube using the keyword “economic collapse”. Using our open source and default parameter settings, a total of 35 partial near-duplicate (ND) segments are located, resulting in 10 visual clusters (or snippets). Figure 4 shows two examples of the snippets. Based on our experiments, the precision of ND localization is as high as 0.95 and the recall is 0.66. Table 1 lists the running time for each step. The experiment was conducted on a PC with a dual-core 3.16 GHz CPU and 3 GB of RAM. In total, creating a galaxy view for 31.2 hours of videos (more than 4,000 keyframes) could be completed within 2.5 hours using our open source. More details can be found in [6].

Step                       Running time
Pre-processing             75 minutes
ND Retrieval               59 minutes
Partial ND localization    8 minutes
Galaxy Visualization       55 seconds

Table 1: The running time for processing 31.2 hours of videos.
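As a quick sanity check on the "within 2.5 hours" figure, the per-step times in Table 1 can simply be added up. The sketch below does that arithmetic; the constant and function names are ours, and the only inputs are the numbers from Table 1 and the 31.2 hours of video.

```cpp
#include <cassert>

// Per-step running times from Table 1, in seconds.
constexpr double kPreprocessingSec = 75 * 60;
constexpr double kNdRetrievalSec = 59 * 60;
constexpr double kLocalizationSec = 8 * 60;
constexpr double kVisualizationSec = 55;

// Total pipeline time in hours.
constexpr double totalHours() {
    return (kPreprocessingSec + kNdRetrievalSec + kLocalizationSec +
            kVisualizationSec) / 3600.0;
}

// Hours of processing per hour of video (31.2 hours in the collection).
constexpr double processingPerVideoHour() {
    return totalHours() / 31.2;
}
```

The steps sum to 8,575 seconds, i.e. about 2.4 hours, which works out to under five minutes of processing per hour of video on the machine described above.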

Figure 4: Examples of visual snippets mined from a collection of 220 videos. For ease of visualization, each cluster is tagged with a timeline description from Wikipedia using the techniques developed in [3].

Acknowledgements

The open source software described in this article was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU 119610).

References

[1] http://vireo.cs.cityu.edu.hk/VIREO-VH/

[2] W. Zhang, L. Pang and C. W. Ngo. Snap-and-Ask: Answering Multimodal Question by Naming Visual Instance. ACM Multimedia, Nara, Japan, October 2012 (demo).

[3] S. Tan, C. W. Ngo, H. K. Tan and L. Pang. Cross Media Hyperlinking for Search Topic Browsing. ACM Multimedia, Arizona, USA, November 2011 (demo).

[4] H. K. Tan, C. W. Ngo, R. Hong and T. S. Chua. Scalable Detection of Partial Near-Duplicate Videos by Visual-Temporal Consistency. ACM Multimedia, pages 145-154, 2009.

[5] H. Jegou, M. Douze and C. Schmid. Improving bag-of-features for large scale image search. IJCV, 87(3):192-212, May 2010.

[6] L. Pang, W. Zhang and C. W. Ngo. Video Hyperlinking: Libraries and Tools for Threading and Visualizing a Large Video Collection. ACM Multimedia, Nara, Japan, Oct 2012.

MPEG Column: 106th MPEG Meeting

Original posts here and here on the Multimedia Communication blog and the bitmovin techblog, by Christian Timmerer (AAU/bitmovin)

National Day Present by Austrian Airlines on my way to Geneva.

November 2013, Geneva, Switzerland. Here comes a news report from the 106th MPEG meeting in Geneva, Switzerland, which actually took place during the Austrian national day, but Austrian Airlines had a nice present (see picture) for their guests.

The official press release can be found here.

In this meeting, ISO/IEC 23008-1 (i.e., MPEG-H Part 1) MPEG Media Transport (MMT) reached Final Draft International Standard (FDIS). Looking back at when this project was started, with the aim of superseding the widely adopted MPEG-2 Transport Stream (M2TS), which receives the Technology & Engineering Emmy® Award in Jan’14, and at what we have now, the following features are supported within MMT:

  • Self-contained multiplexing structure
  • Strict timing model
  • Reference buffer model
  • Flexible splicing of content
  • Name based access of data
  • AL-FEC (application layer forward error correction)
  • Multiple Qualities of Service within one packet flow

ITU-T Tower Building, Geneva.

Interestingly, MMT supports the carriage of MPEG-DASH segments and MPD for uni-directional environments such as broadcasting.

MPEG-H now comprises three major technologies: part 1 is about transport (MMT; at FDIS stage), part 2 deals with video coding (HEVC; at FDIS stage), and part 3 will be about audio coding, specifically 3D audio coding (which is still in its infancy; technical responses have been evaluated only recently). Other parts of MPEG-H are currently related to these three parts.

In terms of research, it is important to determine the efficiency, overhead, and — in general — the use cases enabled by MMT. From a business point of view, it will be interesting to see whether MMT will actually supersede M2TS and how it will evolve compared or in relation to DASH.

On another topic, MPEG-7 visual reached an important milestone at this meeting. The Committee Draft (CD) for Part 13 (ISO/IEC 15938-13) has been approved and is entitled Compact Descriptors for Visual Search (CDVS). This image description enables comparing and finding pictures that include similar content, e.g., when showing the same object from different viewpoints. CDVS mainly deals with images, but MPEG has also started work on compact descriptors for video search.

The CDVS standard truly helps to reduce the semantic gap. However, research in this domain is already well developed and it is unclear whether the research community will adopt CDVS, specifically because the interest in MPEG-7 descriptors has decreased lately. On the other hand, such a standard will enable interoperability among vendors and services (e.g., Google Goggles) reducing the number of proprietary formats and, hopefully, APIs. However, the most important question is whether CDVS will be adopted by the industry (and research).

Finally, what about MPEG-DASH?

The 2nd edition of part 1 (MPD and segment formats) and the 1st edition of part 2 (conformance and reference software) were finalized at the 105th MPEG meeting (FDIS). Additionally, we had a public/open workshop at that meeting on session management and control for DASH. This and other new topics are being further developed within so-called core experiments (CEs), of which I’d like to give a brief overview:

  • Server and Network assisted DASH Operation (SAND) which is the immediate result of the workshop at the 105th MPEG meeting and introduces a DASH-Aware Media Element (DANE) as depicted in the Figure below. Parameters from this element — as well as others — may support the DASH client within its operations, i.e., downloading the “best” segments for its context. SAND parameters are typically coming from the network itself whereas Parameters for enhancing delivery by DANE (PED) are coming from the content author.

Baseline Architecture for Server and Network assisted DASH.

  • Spatial Relationship Description is about delivering (tiled) ultra-high-resolution content towards heterogeneous clients while at the same time providing interactivity (e.g., zooming). Thus, not only the temporal but also spatial relationship of representations needs to be described.
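To illustrate what describing the spatial relationship of representations enables, here is a minimal sketch of a client deciding which tiles it needs for a zoomed-in viewport. It assumes each tile is described by a rectangle in a common pixel coordinate system; the `Rect` type and the function names are hypothetical and are not part of the SRD design under discussion.

```cpp
#include <cassert>
#include <vector>

struct Rect { int x, y, w, h; };  // top-left corner plus size, in pixels

// True if the two rectangles share at least one pixel.
inline bool overlaps(const Rect& a, const Rect& b) {
    return a.x < b.x + b.w && b.x < a.x + a.w &&
           a.y < b.y + b.h && b.y < a.y + a.h;
}

// Returns the indices of the tiles a client would need for the viewport.
std::vector<int> tilesForViewport(const std::vector<Rect>& tiles,
                                  const Rect& viewport) {
    std::vector<int> needed;
    for (int i = 0; i < static_cast<int>(tiles.size()); ++i) {
        if (overlaps(tiles[i], viewport)) needed.push_back(i);
    }
    return needed;
}
```

A client zooming into part of the picture would then request only the representations of the returned tiles instead of the full ultra-high-resolution video.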

Other CEs are related to signaling intended source and display characteristics, controlling the DASH client behavior, and DASH client authentication and content access authorization.

The outcome of these CEs is potentially interesting for future amendments. One CE was closed at this meeting; it was about including quality information within DASH, e.g., as part of an additional track within ISOBMFF and an additional representation within the MPD. Clients may access this quality information in advance to assist the adaptation logic in making informed decisions about which segment to download next.
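How an adaptation logic might use such quality information can be sketched as follows. This is only one possible policy under stated assumptions: each candidate carries a bitrate and a signaled quality value (higher is better, on a scale we do not specify), and the client prefers the cheapest candidate that reaches a target quality. The `Candidate` type and `pickRepresentation` are hypothetical names, not part of MPEG-DASH.

```cpp
#include <cassert>
#include <vector>

struct Candidate {
    int bitrateKbps;  // bitrate of the next segment in this representation
    double quality;   // signaled quality value (higher is better); scale assumed
};

// Picks the cheapest candidate that reaches targetQuality and fits within
// throughputKbps. Falls back to the highest-quality candidate that fits,
// or returns -1 if none fits.
int pickRepresentation(const std::vector<Candidate>& cands,
                       int throughputKbps, double targetQuality) {
    int best = -1;
    bool bestMeetsTarget = false;
    for (int i = 0; i < static_cast<int>(cands.size()); ++i) {
        const Candidate& c = cands[i];
        if (c.bitrateKbps > throughputKbps) continue;  // cannot be sustained
        const bool meets = c.quality >= targetQuality;
        if (best < 0) { best = i; bestMeetsTarget = meets; continue; }
        const Candidate& b = cands[best];
        if (meets && bestMeetsTarget) {
            if (c.bitrateKbps < b.bitrateKbps) best = i;  // cheaper, still good enough
        } else if (meets && !bestMeetsTarget) {
            best = i;
            bestMeetsTarget = true;
        } else if (!meets && !bestMeetsTarget) {
            if (c.quality > b.quality) best = i;  // closest to the target
        }
    }
    return best;
}
```

Compared to a purely throughput-driven choice, a policy like this can skip a higher bitrate when the signaled quality gain for the next segment is small.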

Interested people may join the MPEG-DASH Ad-hoc Group (AhG; http://lists.uni-klu.ac.at/mailman/listinfo/dash) where these topics (and others) are discussed.

Finally, additional information/outcome from the last meeting is accessible via http://mpeg.chiariglione.org/meetings/106 including documents publicly available (some may have an editing period).

An Interview with Cynthia Liem: The PHENICX Project

The PHENICX project is supported by the European Commission, FP7 (Seventh Framework Programme, STREP project, ICT-2011.8.2 ICT for access to cultural resources, grant agreement No 601166). The project has been running for a year now and Cynthia Liem has been involved since the initial planning and proposal writing. Currently, she is a work package leader in the project, and part of the overall project coordination team in the role of dissemination coordinator.

Partners in the project are Universitat Pompeu Fabra, Barcelona, ES; Delft University of Technology, NL; Johannes Kepler University Linz, AT; Austrian Research Institute for Artificial Intelligence, Vienna, AT; Video Dock BV, Amsterdam, NL; Royal Concertgebouw Orchestra, Amsterdam, NL; and Escola Superior de Música de Catalunya, Barcelona, ES. More information on the project can be found at http://phenicx.upf.edu

Q: What is the goal and scope of the PHENICX project?

PHENICX is about music and concert experiences. We want to use multimedia technologies to enhance the experience of a concert and make it more interesting and accessible for broad audiences. In this, we mainly focus on classical music.

Basically, the project has two sides. First of all, there is a content analysis side, in which we analyze concert performance data in a broad sense. We do not only look at an audio stream, but also e.g. at videos, gesture information, and social commenting information from people who attended concerts. Besides multiple modalities, we also try to take into account multiple perspectives: think of multiple cameras and microphones registering an orchestra, but also of multiple types of people (a conductor, orchestra musicians, or just your personal friends) speaking about a concert. Finally, a concert really is a multilayered phenomenon, with lots of things going on at the same time in which one could be potentially interested. The particular notes being played from a score are part of a larger structural whole; and while 130 individuals may be playing at the same time in a symphony orchestra, they form sub-groups which all have a particular role in the musical narrative and instrumental mix.

On the other side, it’s about the experience, about getting and keeping users from different consumer groups engaged. This is not just targeted at live attendance scenarios in the concert hall, but also for scenarios in which people attend concerts off-site through a live stream, or want to relive a concert on-demand after its performance. While for the content analysis part, we mostly focus on signal-oriented research topics, for this experience part we strongly look into topics such as recommendation, visualization and interaction. For example, how can you make the whole multilayered aspect of music more tangible? This can for example be done with automated score-following, through more simplified visualizations, but also by contrasting a particular performance against other existing performances of the same piece.

Our mission to broaden audiences for the classical music genre can be seen as a way of cultural heritage preservation using ICT. In the end, we really hope to see digital technology affecting culture consumption in a positive way. As a concrete example, our partners Video Dock and the Royal Concertgebouw Orchestra already are working on a commercial tablet app called RCO Editions. The technologies we work on in PHENICX can really help in making the production of the app more scalable, expanding its feature set, and optimizing its user experience.

Q: Are there special organizational challenges?

In the project there are seven partners, four of them being academic partners. The three non-academic partners are major players in different parts of the music stakeholder spectrum, but have less experience with academic projects – especially the Royal Concertgebouw Orchestra, which really is involved for the first time in a large academic technology project. So in communicating and working with each other, there is always some translation needed between partners with different background and project experience levels. This is a very interesting organizational challenge in which we always try to find an optimal balance between different stakeholders.

Another potential challenge is language. Especially in the first year, we have been running a lot of focus groups to validate use cases. But while we have grown completely accustomed to using English in our daily academic work, as soon as you wish to interact with realistic local potential users of your technology in all project partner countries, you can’t take for granted these users have full expressive command of English (the younger generation typically does, but you don’t want to only reach them). And music is a very attractive topic for general public dissemination, since it’s a concrete part in many people’s lives; but once again, to make full use of this opportunity, you may have to look beyond English. So we’re having some dedicated organizational activities on that, working to also hold some studies and get some publicity material available in local languages.

Q: What is your personal relation to the project?

Well, I wrote a significant part of the proposal, so in that sense have a considerable relation to the project … but, at least as importantly, my musician background creates a strong personal link to this project. Having degrees in computer science and classical piano performance, I’m really interested in the interface between these two: working with music and digital data, using data technologies to improve on what you can learn and do with music – and PHENICX definitely is about this. So I’m very actively trying to use this double background for the project. It is especially useful for communication and dissemination: I can talk to people at the more musical side, many of whom do not have extensive technical backgrounds, but also to those at the more technical side, who do not always have an extensive music background.

Funnily enough, the project also affected views I had from my own musicianship. The Royal Concertgebouw Orchestra is one of the most famous orchestras in the world. If you’re a music student in Holland, you can be backstage and engage with people from many national orchestras, but only the lucky few will manage to get even in the neighborhood of this particular orchestra. Now that I have this connecting role in the project between academics and music stakeholders, and the orchestra has become a project partner, I suddenly find myself in their office quite often. I would never have expected that!

Besides that, with our work on user requirements and focus groups, I really managed to be in contact with actual audiences. In our focus groups, we asked people why they liked going to concert performances, and we frequently heard people responding that they valued feeling isolated from external influences in the concert hall, letting themselves be swept away by the music. Probably because a concert hall is a bit of a working space for me, I had totally forgotten this escapism aspect of concert attendance. So here, the project really made me aware of my own professional biases and ‘put me back on the ground’.

Q: Would you ever write an EU project proposal again?

Well, yes, I would, definitely with a consortium and project as inspiring as PHENICX. But I hope that next time I’ll have a bit more time than the three weeks in which we raced to complete the PHENICX proposal. 😉

Curriculum Vitae:

Cynthia Liem obtained her BSc and MSc degrees in Media and Knowledge Engineering (Computer Science) with honors at Delft University of Technology, The Netherlands, and currently is a PhD student at the Multimedia Information Retrieval Lab of the same university, working under the supervision of Prof. Alan Hanjalic. Besides, she holds Bachelor and Master of Music degrees in classical piano performance from the Royal Conservatoire in The Hague. Her research interests are strongly motivated by her background in both engineering and music and concentrate around multimedia content analysis for the music information retrieval domain.

From this background, she has been very active in getting music on the multimedia research agenda, particularly at the ACM Multimedia Conference, where she first initiated and served as the main organizer of the ACM MIRUM workshop (2011, 2012). This led to her becoming a co-chair of a dedicated ‘Music & Audio’ area at ACM MM 2013, and currently the more broadened ‘Music, Speech, and Audio Processing in Multimedia’ area for ACM MM 2014. She also was a main initiator of the EU FP7 PHENICX project (2013 – 2016), in which she now serves as work package leader and dissemination coordinator.

She is the recipient of several international scholarships and awards, including the Lucent Global Science Scholarship in 2005, the Google Anita Borg Scholarship in 2008, the Google European Doctoral Fellowship in Multimedia in 2010 (which partially supports her PhD research work), and the UfD Best PhD Candidate Award at Delft University of Technology in 2012. Besides her ongoing academic and musical activities, Cynthia has interned at Bell Labs Europe Netherlands, Philips Research, Google UK and Google Research, Mountain View, USA.

The interviewer, Mathias Lux, is an Associate Professor at the Institute for Information Technology (ITEC) at Klagenfurt University, where he has been since 2006. He received his M.S. in Mathematics in 2004 and his Ph.D. in Telematics in 2006 from Graz University of Technology. Before joining Klagenfurt University, he worked in industry on web-based applications, as a junior researcher at a research center for knowledge-based applications, and as research and teaching assistant at the Knowledge Management Institute (KMI) of Graz University of Technology. In research, he is working on user intentions in multimedia retrieval and production, visual information retrieval, and serious games. In his scientific career he has (co-)authored more than 60 scientific publications, has served on multiple program committees and as a reviewer for international conferences, journals, and magazines, and has organized several scientific events. He is also well known for managing the development of the award-winning and popular open source tools Caliph & Emir and LIRE for visual information retrieval.

A report from the First International Competition on Game-Based Learning Applications

The European Conference on Game-Based Learning is an academic conference that has been held annually at various European universities since 2006. For the first time this year, the Programme Committee, together with Segan (Serious Games Network, https://www.facebook.com/groups/segan), decided to launch a competition at the conference for the best educational game. The aims of the competition were:

  • To provide an opportunity for educational game designers and creators to participate in the conference and demonstrate their game design and development skills in an international competition;
  • To provide an opportunity for GBL creators to peer-assess and peer-evaluate their games;
  • To provide ECGBL attendees with engaging and best-practice games that showcase exemplary applications of GBL.

In the first instance, prospective participants were asked to submit a 1000-word extended abstract giving an overview of the game itself, how it is positioned in terms of related work, and what its unique educational contribution is. We received 56 applications, and these were reduced to 22 finalists who were invited to come to the conference to present their games. Four judges, in two teams, assessed the games based on a comprehensive set of criteria including sections on learning outcomes, usability and socio-cultural aspects. A shortlist of 6 games was then revisited by all the judges during an open demonstration session in which conference participants were also welcome to participate. First, Second and Third place awards were given and two Highly Commended certificates were presented. The top three games were quite different in terms of the target audience and the format.

Third place

In third place was an app-based early learning game called Lipa Eggs developed by Ian Hook and Roman Hodek from Lipa Learning in the Czech Republic. This game was designed to help pre-school children with colour mixing and recognition and was delivered via a tablet. The gameplay takes the form of a graduated learning system which first allows children to develop the skills to play the game and then develops the learning process to encourage players to find new solutions. More information about the game can be found at http://www.lipalearning.com/game/lipa-eggs

Second place

In second place was a non-digital game called ChemNerd developed by Jakob Thomas Holm from Sterskov Efterskole (a secondary school in Denmark specializing in game-based learning). This game was designed to help teach the periodic table to secondary school students and was presented as a multi-level card game. The game utilizes competition and face-to-face interaction between students to teach them complicated chemical theory over six phases, beginning with a memory challenge and ending with a practical experiment. A video illustrating the game can be seen at http://youtu.be/XD6BPrJyxlc

Winners

The winner was a computer game called Mystery of Taiga River developed by Sasha Barab and Anna Arici from Arizona State University in the USA. This game was designed to teach ecological studies to secondary school students and was presented as a game-based immersive world in which students become investigative reporters who have to investigate, learn and apply scientific concepts to solve applied problems in a virtual park and restore the health of the dying fish. A video of the game can be seen at http://gamesandimpact.org/taiga_river

Both competitors and conference participants said that they had enjoyed the opportunity to see applied educational game development from around the world, and the intention is to make this an annual competition associated with the European Conference on Game-Based Learning (ECGBL). The conference in 2014 will be held in Berlin on 30-31 October and the call for games is now open. Details can be found here: http://academic-conferences.org/ecgbl/ecgbl2014/ecgbl14-call-papers.htm

ACM TOMCCAP Special on 20th Anniversary of ACM Multimedia

ACM Transactions on Multimedia Computing, Communications and Applications

Special Issue: 20th Anniversary of ACM International Conference on Multimedia

A journey ‘Back to the Future’

The ACM Special Interest Group on Multimedia (SIGMM) celebrated the 20th anniversary of the establishment of its premier conference, the ACM International Conference on Multimedia (ACM Multimedia) in 2012. To commemorate this milestone, leading researchers organized and extensively contributed to the 20th anniversary celebration.

from left to right: Malcolm Slaney, Ramesh Jain, Dick Bulterman, Klara Nahrstedt, Larry Rowe and Ralf Steinmetz

The celebratory events started at ACM Multimedia 2012 in Nara, Japan, with the “Coulda, Woulda, Shoulda: 20 Years of Multimedia Opportunities” panel, organized by Klara Nahrstedt (center) and Malcolm Slaney (far left). At this panel, pioneers of the field, Ramesh Jain, Dick Bulterman, Larry Rowe and Ralf Steinmetz (shown from left to right in the image), reflected on innovations, and on successful and missed opportunities, in the multimedia research area.

This special issue of the ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP) is the final event to celebrate achievements and opportunities in a variety of multimedia areas. Through peer-reviewed long articles and invited short contributions, readers will get a sense of the past, present and future of multimedia research. The coverage ranges from traditional topics such as video streaming, multimedia synchronization, multimedia authoring, content analysis, and multimedia retrieval to newer topics including music retrieval, geo-tagging context in a worldwide community of photos, multi-modal human-computer interactions and experiential media systems.

Recent years have seen an explosion of research and technologies in multimedia, beyond individual algorithms, protocols and small-scale systems. The scale of multimedia innovations and deployment has grown with unimaginable speed. Hence, as the multimedia area is growing fast and penetrating every facet of our society, this special issue fills an important need to look back at the multimedia research achievements of the past 20 years, celebrate the exciting potential, and explore new goals of the multimedia research community.

Visit dl.acm.org/tomccap to view in the DL.

TOMCCAP Nicolas D. Georganas Best Paper Award 2013

ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP) Nicolas D. Georganas Best Paper Award

The 2013 ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP) Nicolas D. Georganas Best Paper Award is awarded to the paper “Exploring interest correlation for peer-to-peer socialized video sharing” (TOMCCAP vol. 8, Issue 1) by Xu Cheng and Jiangchuan Liu.

The purpose of the named award is to recognize the most significant work in ACM TOMCCAP in a given calendar year. The whole readership of ACM TOMCCAP was invited to nominate articles which were published in Volume 8 (2012). Based on the nominations, the winner was chosen by the TOMCCAP Editorial Board. The main assessment criteria were quality, novelty, timeliness, and clarity of presentation, in addition to relevance to multimedia computing, communications, and applications.

In this paper the authors examine architectures for large-scale video streaming systems exploiting social relations. To achieve this objective, a large study of YouTube traffic was conducted and a cluster analysis performed on the resulting data. Based on the observations made, a new approach for video pre-fetching based on social relations has been developed. This important work bridges the gap between social media and multimedia streaming and hence combines two extremely relevant research topics.

The award honors the founding Editor-in-Chief of TOMCCAP, Nicolas D. Georganas, for his outstanding contributions to the field of multimedia computing and his significant contributions to ACM. He greatly influenced the research and the whole multimedia community.

The Editor-in-Chief Prof. Dr.-Ing. Ralf Steinmetz and the Editorial Board of ACM TOMCCAP cordially congratulate the winner. The award will be presented to the authors on October 24th 2013 at the ACM Multimedia 2013 in Barcelona, Spain and includes travel expenses for the winning authors.

Bio of Awardees:

Xu Cheng is currently a research engineer at BroadbandTV, Vancouver, Canada. He received the Bachelor of Science from Peking University, China, in 2006, Master of Science from Simon Fraser University, Canada, in 2008, and PhD from Simon Fraser University, Canada, in 2012. His research interests include multimedia networks, social networks and overlay networks.

Jiangchuan Liu is an Associate Professor in the School of Computing Science, Simon Fraser University, British Columbia, Canada. He received his BEng (cum laude) from Tsinghua University in 1999 and his PhD from HKUST in 2003, both in computer science. He is a co-recipient of the ACM Multimedia’2012 Best Paper Award, the IEEE Globecom’2011 Best Paper Award, the IEEE Communications Society Best Paper Award on Multimedia Communications 2009, as well as the IEEE IWQoS’08 and IEEE/ACM IWQoS’2012 Best Student Paper Awards. His research interests are in networking and multimedia. He has served on the editorial boards of IEEE Transactions on Multimedia, IEEE Communications Surveys and Tutorials, and IEEE Internet of Things Journal. He will be TPC co-chair for IEEE/ACM IWQoS’2014 in Hong Kong.

TOMCCAP Best Associate Editor Award 2013

ACM Transactions on Multimedia Computing, Communications and Applications Best Associate Editor Award

Annually, the Editor-in-Chief of the ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP) honors one member of the Editorial Board with the TOMCCAP Associate Editor of the Year Award. The purpose of the award is to recognize excellent work for ACM TOMCCAP, and hence also for the whole multimedia community, in the previous year. Criteria for the award are (1.) the number of submissions processed in time, (2.) the performance during the reviewing process and (3.) the accurate interaction with the reviewers in order to broaden the awareness for the journal.

Based on the criteria mentioned above, the ACM Transactions on Multimedia Computing, Communications and Applications Associate Editor of the Year Award 2013 goes to Mohan S. Kankanhalli from the National University of Singapore. The Editor-in-Chief Prof. Dr.-Ing. Ralf Steinmetz cordially congratulates Mohan.

Bio of Awardee:

Mohan Kankanhalli is a Professor at the Department of Computer Science of the National University of Singapore. He is also the Associate Provost for Graduate Education at NUS. Before that, he was the Vice-Dean for Academic Affairs and Graduate Studies at the NUS School of Computing during 2008-2010 and Vice-Dean for Research during 2001-2007. Mohan obtained his BTech from IIT Kharagpur and MS & PhD from the Rensselaer Polytechnic Institute.

His current research interests are in Multimedia Systems (content processing, retrieval) and Multimedia Security (surveillance and privacy). He has been awarded an S$10M grant by Singapore’s National Research Foundation to set up the Centre for “Sensor-enhanced Social Media” (sesame.comp.nus.edu.sg).

Mohan has been actively involved in the organization of many major conferences in the area of Multimedia. He was the Director of Conferences for ACM SIG Multimedia from 2009 to 2013. He is on the editorial boards of several journals including the ACM Transactions on Multimedia Computing, Communications, and Applications, Springer Multimedia Systems Journal, Pattern Recognition Journal and Multimedia Tools & Applications Journal.

SIGMM Outstanding Technical Contributions Award 2013

SIGMM Award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications

The 2013 winner of the prestigious ACM Special Interest Group on Multimedia (SIGMM) award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications is Prof. Dr. Dick Bulterman. He is currently the Research Group Head of Distributed and Interactive Systems at Centrum Wiskunde & Informatica (CWI) in Amsterdam, The Netherlands. He is also a Full Professor of Computer Science at Vrije Universiteit, Amsterdam. His research interests are multimedia authoring and document processing. His recent research concerns socially-aware multimedia, interactive television, and media analysis.

The ACM SIGMM Technical Achievement award is given in recognition of outstanding contributions over a researcher’s career. Prof. Dick Bulterman has been selected for his outstanding technical contributions in multimedia authoring, media annotation, and social sharing from research through standardization to entrepreneurship, and in particular for promoting international Web standards for multimedia authoring and presentation (SMIL) in the W3C Synchronized Multimedia Working Group as well as his dedicated involvement in the SIGMM research community for many years. The SIGMM award will be presented at the ACM International Conference on Multimedia 2013 that will be held Oct 21–25 2013 in Barcelona, Spain.

Dick Bulterman has been a long-time intellectual leader in the area of temporal modeling and support for complex multimedia systems. His research has led to the development of several widely used multimedia authoring systems and players. He developed the Amsterdam Hypermedia Model, the CMIF document structure, the CMIFed authoring environment, the GRiNS editor and player, and a host of multimedia demonstrator applications. In 1999, he started the CWI spinoff company called Oratrix Development BV, and he worked as CEO to widely deliver this software.

Dick has a strong international reputation for the development of the domain-specific temporal language for multimedia (SMIL). Much of this software has been incorporated into the widely used Ambulant Open Source SMIL Player, which has served to encourage development and use of time-based multimedia content. His conference publications and book on SMIL have helped to promote SMIL and its acceptance as a W3C standard.

Dick’s recent work on social sharing of video will likely prove influential in upcoming Interactive TV products. This work has already been recognized in the academic community, earning the ACM SIGMM best paper award at ACM MM 2008 and also at the EUROITV conference.

In summary, Prof. Bulterman’s accomplishments include pioneering and extraordinary contributions in multimedia authoring, media annotation, and social sharing and outstanding service to the computing community.