SIGMM PhD Thesis Award 2013

SIGMM Award for Outstanding PhD Thesis in Multimedia Computing, Communications and Applications 2013

The SIGMM Ph.D. Thesis Award Committee is pleased to recommend this year’s award for the outstanding Ph.D. thesis in multimedia computing, communications and applications to Dr. Xirong Li.

The committee considered Dr. Li’s dissertation, titled “Content-based visual search learned from social media”, worthy of the award because it substantially extends the boundaries of content-based multimedia indexing and retrieval. In particular, it provides fresh insights into how image retrieval solutions can be realized in the presence of the vast information that can be drawn from social media.

The committee considered the main innovation of Dr. Li’s work to be in the development of the theory and algorithms providing answers to the following challenging research questions:

  1. What determines the relevance of a social tag with respect to an image?
  2. How can tag relevance estimators be fused?
  3. Which social images are informative negative examples for concept learning?
  4. How can socially tagged images be exploited for visual search?
  5. How can automatic image tagging be personalized with respect to a user’s preferences?

The significance of the developed theory and algorithms lies in their power to deploy, effectively and efficiently, information collected from social media to enhance the datasets used to learn automatic image indexing mechanisms (visual concept detection) and to personalize this learning for the user.

Bio of Awardee:

Dr. Xirong Li received the B.Sc. and M.Sc. degrees from Tsinghua University, China, in 2005 and 2007, respectively, and the Ph.D. degree from the University of Amsterdam, The Netherlands, in 2012, all in computer science. The title of his thesis is “Content-based visual search learned from social media”. He is currently an Assistant Professor in the Key Lab of Data Engineering and Knowledge Engineering, Renmin University of China. His research interests are image search and multimedia content analysis. Dr. Li received the IEEE Transactions on Multimedia Prize Paper Award 2012, a Best Paper nomination at the ACM International Conference on Multimedia Retrieval 2012, the Chinese Government Award for Outstanding Self-Financed Students Abroad 2011, and the Best Paper Award of the ACM International Conference on Image and Video Retrieval 2010. He served as publicity co-chair for ICMR 2013.

ACM SIGMM/TOMCCAP 2013 Award Announcements

The ACM Special Interest Group in Multimedia (SIGMM) and ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP) are pleased to announce the following awards for 2013 recognizing outstanding achievements and services made in the multimedia community.

SIGMM Technical Achievement Award:
Dr. Dick Bulterman

SIGMM Best Ph.D. Thesis Award:
Dr. Xirong Li

TOMCCAP Nicolas D. Georganas Best Paper Award:
“Exploring interest correlation for peer-to-peer socialized video sharing” by Xu Cheng and Jiangchuan Liu, published in TOMCCAP vol. 8, Issue 1, 2012.

TOMCCAP Best Associate Editor Award:
Dr. Mohan S. Kankanhalli

Additional information about each award and recipient will be released in separate announcements. The awards will be presented at the annual SIGMM event, the ACM Multimedia Conference, held in Barcelona, Catalunya, Spain, October 23-25, 2013.

ACM is the professional society of computer scientists, and SIGMM is the special interest group on multimedia. TOMCCAP is the flagship journal publication of SIGMM.

Opencast Matterhorn – Open source lecture capture and video management

Since its formation in 2007, Opencast has become a global community around academic video and related domains. It is a rich source of ideas and a large, active community for the creation and use of educational multimedia content. Matterhorn is a community-driven collaborative project to develop an end-to-end, open source solution that supports the scheduling, capture, management, encoding, and delivery of educational audio and video content, and the engagement of users with that content. Matterhorn 1.4 was released recently (July 2013) after almost a year of feature planning, development and bug fixing by the community. The latest version, along with documentation, can be downloaded from the project website at Opencast Matterhorn.

Opencast Matterhorn Welcomes Climbers

The first screenshot shows a successful system installation and start in a web browser.

Opencast: A community for higher education

The Opencast community is a collaborative effort in which individuals, higher education institutions and organizations work together to explore, develop, define and document best practices and technologies for the management of audiovisual content in academia. As such, it aims to stimulate discussion and collaboration between institutions to develop and enhance the use of academic video. The community shares experiences with existing technologies and identifies future approaches and requirements. It seeks broad participation in this important and dynamic domain, allowing community members to share expertise and experience and to collaborate in related projects. It was initiated by the founding members of the community [2] to address the need, identified by academic institutions, for an affordable, flexible and enterprise-ready video management system. A list of current system adopters, along with detailed descriptions, can be found on the adopters page at List of Matterhorn adopters.

Matterhorn: Underlying technology

Matterhorn offers an open source reference implementation of an end-to-end enterprise lecture capture suite and a comprehensive set of flexible rich media services. It supports the typical lecture capture and video management phases: preparation, scheduling, capture, media processing, content distribution and usage. The picture below depicts the supported phases. These phases are also major differentiators in the system architecture. Additional information is available in [2].

Opencast Matterhorn phases of lecture recording

The core components are built upon a service-based architecture leveraging Java OSGi technology, which provides a standardized, component-oriented environment for service coupling and cooperation [1], [2]. System administrators, lecturers and students do not need to handle Java objects, interfaces or service endpoints directly; instead, they can create and interact with system components using fully customizable workflows (XML descriptions) for media recording, encoding, handling and/or content distribution. Matterhorn comes with administrator tools for planning and scheduling upcoming lectures as well as for monitoring processes across distributed Matterhorn nodes. Feeds (Atom/RSS) as well as a browsable media gallery enable users to adapt content created with the system to local needs. Finally, content player components (engage applications) are provided that can synchronize different streams (e.g., talking-head and screen-capture video or audience cameras), give direct access to content based on media search queries, and use the media analysis capabilities for navigation purposes (e.g., slide changes).
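To give a flavor of what such a workflow description looks like, the following is a hypothetical sketch only: the element names, operation identifiers and configuration keys are illustrative and should be checked against the Matterhorn documentation for the actual workflow definition schema.

```xml
<!-- Hypothetical workflow definition: encode a recording and publish it.
     Element and operation names are illustrative, not taken verbatim
     from the Matterhorn schema. -->
<workflow-definition>
  <id>encode-and-publish</id>
  <description>Encode a lecture recording and distribute it</description>
  <operations>
    <!-- Transcode the captured tracks into delivery formats -->
    <operation id="encode">
      <configurations>
        <configuration key="encoding-profile">delivery.http</configuration>
      </configurations>
    </operation>
    <!-- Make the encoded media available to the engage player -->
    <operation id="distribute">
      <configurations>
        <configuration key="channel">engage-player</configuration>
      </configurations>
    </operation>
  </operations>
</workflow-definition>
```

The point of this design is that adding or reordering processing steps (e.g., inserting a slide-detection analysis operation) is a matter of editing such an XML description, not writing Java code against OSGi services.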

Opencast Matterhorn welcome page

The project website provides a guided feature and demo tour, cookbook and installation sections about how to use and operate the system on a daily basis, as well as links to the community issue/feature tracker system Opencast issues.

Matterhorn2GO: Mobile connection to online learning repositories

Matterhorn2GO is a cross-platform open source mobile front-end for recorded video material produced with the Opencast Matterhorn system [3]. It can be used on Android or iOS smartphones and tablets; the core technology used is Apache Flex. It has been released in the Google Play Store as well as in Apple’s iTunes Store. Further information is available on the project website: Download / Install Matterhorn2GO. It brings lecture recordings and additional material created with Opencast Matterhorn to mobile learners worldwide. Matterhorn2GO comes with powerful in-content search capabilities based on Matterhorn’s core multimedia analysis features and can synchronize different content streams in a single view to follow the activity in the classroom fully. This allows users, for example, to jump directly to a particular aspect within numerous recorded series and/or single episodes. A user can choose between three video view options: a parallel view (professor and the corresponding synchronized slides or screen recording), just the professor, or only the lecture slides. Since most Opencast Matterhorn service endpoints offer a streaming option, a user can navigate directly to any time position in a video without waiting for it to be fully downloaded. The browse media page lists recordings from available Matterhorn installations. Students can follow their own eLectures but also see what else is being taught or presented at their local university or at other learning institutes.

Stay informed and join the discussion

As an international open source community, Opencast has established several mechanisms for individuals to communicate, support each other and collaborate.

More information about communication within the community can be found at www.opencast.org/communications.

Conclusion

Matterhorn and the Opencast Community can offer research initiatives a prolific environment with a multitude of partners and a technology developed to be adapted, amended or supplemented by new features, be that voice recognition, face detection, support for mobile devices, semantic connections in learning objects or (big) data mining. The final objective is to ensure that research initiatives will consider Matterhorn a focal point for their activities. The governance model of the Opencast Community and the Opencast Matterhorn project can be found online at www.opencast.org/opencast-governance.

Acknowledgments and License

The authors would like to thank the Opencast Community and the Opencast Matterhorn developers for their support and creativity as well as the continuous efforts to create tools that can be used across campuses and learning institutes worldwide. Matterhorn is published under the Educational Community License (ECL) 2.0.

References

[1] Christopher A. Brooks, Markus Ketterl, Adam Hochman, Josh Holtzman, Judy Stern, Tobias Wunden, Kristofor Amundson, Greg Logan, Kenneth Lui, Adam McKenzie, Denis Meyer, Markus Moormann, Matjaz Rihtar, Ruediger Rolf, Nejc Skofic, Micah Sutton, Ruben Perez Vazquez, und Benjamin Wulff. OpenCast Matterhorn 1.1: reaching new heights. ACM Multimedia, pages 703-706. ACM, (2011)

[2] Ketterl, M, Schulte, O. A., Hochman, A. Opencast Matterhorn: A Community-Driven Open Source Software Project for Producing, Managing, and Distributing Academic Video. International Journal of Interactive Technology and Smart Education, Emerald Group Publishing Limited, Vol. 7 Iss: 3, pp.168 – 180, 2010.

[3] Markus Ketterl, Leonid Oldenburger, Oliver Vornberger. Opencast 2 Go: Mobile Connections to Multimedia Learning Repositories. In proceeding of: IADIS International Conference Mobile Learning, pages 181-188, Berlin, Germany, 2012

Report from SLAM 2013

Intl. Workshop on Speech, Language and Audio in Multimedia

The International Workshop on Speech, Language and Audio in Multimedia (SLAM) is a yearly workshop series that brings together researchers working in the broad field of speech, language and audio processing applied to the analysis, indexing and use of any type of multimedia data (e.g., broadcast, social media, audiovisual archives, online courses, music), with the goal of sharing recent research results and discussing ongoing and future projects as well as benchmarking initiatives and applications.

The very first edition of SLAM was held in Marseille, Aug. 22-23, 2013, as a satellite event of Interspeech 2013. Jointly patronized by the ISCA SIG on Speech and Language in Multimedia and the IEEE SIG on Audio and Speech Processing for Multimedia, the workshop was locally organized by the Laboratoire d’Informatique Fondamentale (LIF) of Aix-Marseille University in a gorgeous location, the Parc du Pharo. SLAM received financial support from local institutions, from national and international associations, and from national projects in the field of multimedia. This support, combined with low-cost organization within a university setting, made it possible to keep registration fees very low, in particular for students.

SLAM 2013 gathered 56 participants from around the world over a day and a half (see the group picture of the SLAM 2013 attendees). The workshop was held in a very friendly atmosphere, with plenty of time for discussions on the side and a warm-hearted social event, yet featuring high-profile scientific communications on a number of topics and a keynote speech by Sam Davies on the BBC World Archives. Contributions were organized in five sessions, namely

  • audio & video event detection and segmentation
  • ASR in multimedia documents
  • multimedia person recognition
  • speaker & speaker roles recognition
  • multimedia applications and corpora

covering most of the topics targeted by the workshop. Major results from a large number of projects were presented, generating fundamental discussions on the future of speech, language and audio in the multimedia sphere. We hope, however, to attract more contributions on non-speech audio processing in the future. Proceedings are available online in open access at http://ceur-ws.org/Vol-1012.

The SLAM workshop intends to establish itself as a yearly event at the frontier between the audio processing, speech communication and multimedia communities. The second edition will be held in Penang, Malaysia, Sep. 11-12, 2014, as a satellite event of Interspeech 2014. See http://language.cs.usm.my/SLAM2014/index.html. We hope to see SLAM organized as a satellite event of multimedia conferences in the near future and welcome bids for 2015.

An interview with Aljosa Smolic

Dr. Aljosa Smolic joined Disney Research Zürich, Switzerland in 2009 as Senior Research Scientist and Head of the “Advanced Video Technology” group. Before that, he was Scientific Project Manager at the Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut (HHI), Berlin, where he also headed a research group. He has been involved in several national and international research projects, in which he conducted research in various fields of video processing, video coding, computer vision and computer graphics, and he has published more than 100 refereed papers in these fields. In current projects he is responsible for research in 2D video, 3D video and free viewpoint video processing and coding.

Q: What is your main area of research?

A: I’m working on video processing in a general sense and visual computing. I’m interested in everything related to pixel processing like camera systems, processing visual information, perception and computational systems that are creating high quality output for the user.

Q: What got you interested in this area in the first place?

A: In my studies in electrical engineering I was focusing on audio processing, in the sense that if I didn’t become a rock star, I could still be an audio engineer. Then I got the opportunity to work at Fraunhofer HHI on image processing, where I turned my signal processing interests from audio to images, and that’s how I ended up here.

Q: Does your research & work influence your private life a lot, like owning a stereoscopic TV, taking a lot of videos and photos, etc.?

A: Yes, in the sense that I’m very critical of any type of visual information. I’m also very picky watching television and I notice all the small imperfections. I have an expert view on cinema, any type of multimedia presentation, and audio. On the other hand, I don’t create too much content myself. I don’t have a special camera and I don’t do much filming. And I don’t have much fancy 3D equipment at home.

Q: Speaking of 3D equipment at home … Obviously 3D TV home equipment didn’t start off too well. Do you think 3D TV will rise again in say 10-15 years, or will we skip towards the “holodeck”?

A: The holodeck … I formulated that as my long term research question, so I’m still working on it and it’s still a long way off. We are not yet there, and stereo or 3D TV at home didn’t reach the broad adoption that many people expected two or three years ago. I believe TV is a more difficult case than, for instance, home cinema on Blu-ray. I think business and technology based on 3D Blu-ray discs work well. You can buy content which is very well produced to be consumed in a situation very similar to watching a movie in a cinema. But I think it’s more difficult to adopt stereoscopic technology for the classic TV watching experience, which should be more social. The quality of the content needs to be better, and the need to wear glasses is not that accepted for watching TV.

Q: What are possible technological advances between now and the holodeck? Will something like Illumiroom (a project from Microsoft Research that projects peripheral content around a screen) or higher resolutions like 4K have an impact?

A: Things like Illumiroom and Philips Ambilight are all steps towards the holodeck, as much as stereoscopic TV was. I believe a lot more steps in different directions are necessary in order to get a 3D immersive experience. Regarding higher resolutions, I’m not so enthusiastic about 4K. From what I’ve seen so far, the difference between HD and 4K is very subtle. Only under very specific conditions and at very specific distances are you able to perceive any difference. So I don’t think it matters that much, and I don’t see 4K having that much of an impact over HD.

I rather look forward to HDR. I’ve seen a few demos which offered an impressive level of experience.

Those displays are starting to become available in professional and consumer markets.

Q: If you would re-start your PhD right now, would you end up in the same field or do you think there is another research direction that is more interesting to you right now?

A: I don’t know … I could always do theoretical physics and go to CERN to try to create black holes, which is always an option. The other option would be to work more on the rock star career. Well, but I’m pretty happy where I ended up right now.

Curriculum Vitae:

Dr. Aljoša Smolić joined Disney Research Zurich, Switzerland in 2009 as Senior Research Scientist and Head of the “Advanced Video Technology” group. Before that, he was Scientific Project Manager at the Fraunhofer Institute for Telecommunications, Heinrich-Hertz-Institut (HHI), Berlin, where he also headed a research group. He has been involved in several national and international research projects, in which he conducted research in various fields of video processing, video coding, computer vision and computer graphics, and he has published more than 100 refereed papers in these fields. In current projects he is responsible for research in 2D video, 3D video and free viewpoint video processing and coding. He received the Dipl.-Ing. degree in Electrical Engineering from the Technical University of Berlin, Germany in 1996, and the Dr.-Ing. degree in Electrical Engineering and Information Technology from Aachen University of Technology (RWTH), Germany, in 2001. Dr. Smolic received the Rudolf Urtel Award of the German Society for Technology in TV and Cinema (FKTG) for his dissertation in 2002. He is Area Editor for Signal Processing: Image Communication and has served as Guest Editor for the Proceedings of the IEEE, IEEE Transactions on CSVT, IEEE Signal Processing Magazine, and other scientific journals. He chaired the MPEG ad hoc group on 3DAV, pioneering standards for 3D video. In this context he also served as one of the editors of the Multi-view Video Coding (MVC) standard. For many years he has been teaching full lecture courses on Multimedia Communications and other topics, now at ETH Zurich.

Dr. Mathias Lux is a Senior Assistant Professor at the Institute for Information Technology (ITEC) at Klagenfurt University, where he has been since 2006. He received his M.S. in Mathematics in 2004 and his Ph.D. in Telematics in 2006 from Graz University of Technology. Before joining Klagenfurt University, he worked in industry on web-based applications, as a junior researcher at a research center for knowledge-based applications, and as research and teaching assistant at the Knowledge Management Institute (KMI) of Graz University of Technology. In research, he is working on user intentions in multimedia retrieval and production, visual information retrieval, and serious games. In his scientific career he has (co-) authored more than 60 scientific publications, has served in multiple program committees and as reviewer of international conferences, journals, and magazines, and has organized several scientific events. He is also well known for managing the development of the award-winning and popular open source tools Caliph & Emir and LIRE for visual information retrieval.

SIGMM Conferences and Journals Ranked High by CCF

SIGMM Conferences and Journals Ranked High by Chinese Computing Federation (CCF)

The Chinese Computing Federation (CCF) Ranking List provides a ranking of peer-reviewed conferences and journals in the broad area of computer science. This list is typically consulted by most academic institutions in China as a quality metric for PhD promotions and tenure track jobs.

The CCF ranking released in 2013 for “Multimedia and Graphics” is at the following link (one may use Google Translate to view the web page in English or other desired language):

http://www.ccf.org.cn/sites/ccf/biaodan.jsp?contentId=2567814757424

The CCF 2013 ranking for “Multimedia and Graphics” conferences and journals is split into sections A, B, and C, and the conferences are ranked numerically within each section. We are very pleased and excited to share the news that the conferences and journals sponsored by SIGMM are ranked high in the CCF list!

For Multimedia and Graphics conferences:

ACM Multimedia was ranked the highest in the A section, and ACM Conference on Multimedia Retrieval (ICMR) was ranked the highest in the B section.

For Multimedia and Graphics Journals

ACM Transactions on Multimedia Computing, Communications and Applications (TOMCCAP) was ranked top in the B section.

This recognition of SIGMM-sponsored publishing venues is due entirely to the tireless and significant efforts of the organizers, steering committees, SIGMM officers, and, most importantly, the multimedia research community.

MPEG Column: 105th MPEG Meeting

— original post by Christian Timmerer, AAU, on the Multimedia Communication blog

 

Opening plenary, 105th MPEG meeting, Vienna, Austria

At the 105th MPEG meeting in Vienna, Austria, a lot of interesting things happened. First, this was not only the 105th MPEG meeting but also the 48th VCEG meeting, 14th JCT-VC meeting, 5th JCT-3V meeting, and 26th SC29 meeting bringing together more than 400 experts from more than 20 countries to discuss technical issues in the domain of coding of audio, [picture (SC29 only),] multimedia and hypermedia information. Second, it was the 3rd meeting hosted in Austria after the 62nd in July 2002 and 77th in July 2006. In 2002, “the new video coding standard being developed jointly with the ITU-T VCEG organization was promoted to Final Committee Draft (FCD)” and in 2006 “MPEG Surround completed its technical work and has been submitted for final FDIS balloting” as well as “MPEG has issued a Final Call for Proposals on MPEG-7 Query Format (MP7QF)”.

The official press release of the 105th meeting can be found here, but I’d like to highlight a couple of interesting topics, including research aspects covered or enabled by them. Research efforts may lead to standardization activities, but standards also enable research, as you may see below.

MPEG selects technology for the upcoming MPEG-H 3D audio standard

Based on the responses submitted to the Call for Proposals (CfP) on MPEG-H 3D Audio, MPEG selected technology supporting content based on multiple formats, i.e., channels and objects (CO) and higher order ambisonics (HOA). All submissions were evaluated by comprehensive and standardized subjective listening tests followed by statistical analysis of the results. Interestingly, at the highest bitrate of 1.2 Mb/s with a 22.2 channel configuration, both of the selected technologies achieved excellent quality and are very close to true transparency; that is, listeners cannot differentiate between the encoded and the uncompressed signal. A first version of the MPEG-H 3D Audio standard, with bitrates ranging from around 1.2 Mb/s down to 256 kb/s, should be available by March 2014 (Committee Draft – CD), July 2014 (Draft International Standard – DIS), and January 2015 (Final Draft International Standard – FDIS), respectively.

Research topics: Although the technologies have been selected, it’s still a long way until the standard gets ratified by MPEG and published by ISO/IEC. Thus, there’s a lot of room for research on efficient encoding tools, including the subjective quality evaluation thereof. Additionally, it may impact the way 3D Audio bitstreams are transferred from one entity to another, including file-based, streaming, on-demand, and live services. Finally, within the application domain it may enable new use cases which are interesting to explore from a research point of view.

Augmented Reality Application Format reaches FDIS status

The MPEG Augmented Reality Application Format (ARAF, ISO/IEC 23000-13) enables the augmentation of the real world with synthetic media objects by combining multiple, existing standards within a single specific application format addressing certain industry needs. In particular, it combines standards providing representation formats for scene description (i.e., subset of BIFS), sensor/actuator descriptors (MPEG-V), and media formats such as audio/video coding formats. There are multiple target applications which may benefit from the MPEG ARAF standard, e.g., geolocation-based services, image-based object detection and tracking, mixed and augmented reality games and real-virtual interactive scenarios.

Research topics: Please note that MPEG ARAF only specifies the format to enable interoperability in order to support use cases enabled by this format. Hence, there are many research topics which could be associated to the application domains identified above.

What’s new in Dynamic Adaptive Streaming over HTTP?

The DASH outcome of the 105th MPEG meeting comes with a couple of highlights. First, a public workshop was held on session management and control (#DASHsmc), which will be used to derive additional requirements for DASH. All position papers and presentations are publicly available here. Second, the first amendment (Amd.1) to part 1 of MPEG-DASH (ISO/IEC 23009-1:2012) has reached the final stage of standardization and, together with the first corrigendum (Cor.1) and the existing part 1, the FDIS of the second edition of ISO/IEC 23009-1:201x has been approved. This includes support for event messages (e.g., to be used for live streaming and dynamic ad insertion) and a media presentation anchor, which enables session mobility, among other features. Third and finally, the FDIS of conformance and reference software (ISO/IEC 23009-2) has been approved, providing means for media presentation conformance, test vectors, a DASH access engine reference software, and various sample software tools.
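For readers less familiar with MPEG-DASH, the standard revolves around an XML manifest, the Media Presentation Description (MPD). The following is a minimal illustrative sketch, not a normative example from the specification; the segment naming, durations and codec strings are hypothetical placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Minimal static (on-demand) MPD sketch; URLs, durations and codec
     strings are hypothetical placeholders, not normative values. -->
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011"
     type="static"
     mediaPresentationDuration="PT60S"
     minBufferTime="PT2S"
     profiles="urn:mpeg:dash:profile:isoff-main:2011">
  <Period>
    <!-- One adaptation set with two video representations to switch between -->
    <AdaptationSet mimeType="video/mp4" segmentAlignment="true">
      <SegmentTemplate media="video_$RepresentationID$_$Number$.m4s"
                       initialization="video_$RepresentationID$_init.mp4"
                       duration="4" startNumber="1"/>
      <Representation id="low" bandwidth="500000"
                      width="640" height="360" codecs="avc1.42c01e"/>
      <Representation id="high" bandwidth="2000000"
                      width="1280" height="720" codecs="avc1.64001f"/>
    </AdaptationSet>
  </Period>
</MPD>
```

A client parses this manifest, picks a Representation based on its measured throughput, and fetches the corresponding segments over plain HTTP, switching representations between segments as conditions change.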

Research topics: The MPEG-DASH conformance and reference software provides an ideal playground for researchers, as it can be used both to generate and to consume bitstreams compliant with the standard. This playground could be used together with other open source tools from the DASH-IF, GPAC, and DASH@ITEC. Additionally, see also Open Source Column: Dynamic Adaptive Streaming over HTTP Toolset.

HEVC support in MPEG-2 Transport Stream and ISO Base Media File Format

After the completion of High Efficiency Video Coding (HEVC) – ITU-T H.265 | MPEG HEVC – at the 103rd MPEG meeting in Geneva, HEVC bitstreams can now be delivered using the MPEG-2 Transport Stream (M2TS) and files based on the ISO Base Media File Format (ISOBMFF). For the latter, the scope of the Advanced Video Coding (AVC) file format has been extended to also support HEVC, and this part of MPEG-4 has been renamed the Network Abstraction Layer (NAL) file format. This file format now covers AVC and its family (Scalable Video Coding – SVC and Multiview Video Coding – MVC) as well as HEVC.

Research topics: Research in the area of delivering audio-visual material is manifold and very well reflected in conference/workshops like ACM MMSys and Packet Video and associated journals and magazines. For these two particular standards, it would be interesting to see the efficiency of the carriage of HEVC with respect to the overhead.

Publicly available MPEG output documents

The following documents shall become available at http://mpeg.chiariglione.org/ (availability in brackets – YY/MM/DD). If you have difficulties accessing one of these documents, please feel free to contact me.

  • Requirements for HEVC image sequences (13/08/02)
  • Requirements for still image coding using HEVC (13/08/02)
  • Text of ISO/IEC 14496-16/PDAM4 Pattern based 3D mesh compression (13/08/02)
  • WD of ISO/IEC 14496-22 3rd edition (13/08/02)
  • Study text of DTR of ISO/IEC 23000-14, Augmented reality reference model (13/08/02)
  • Draft Test conditions for HEVC still picture coding performance evaluation (13/08/02)
  • List of stereo and 3D sequences considered (13/08/02)
  • Timeline and Requirements for MPEG-H Audio (13/08/02)
  • Working Draft 1 of Video Coding for browsers (13/08/31)
  • Test Model 1 of Video Coding for browsers (13/08/31)
  • Draft Requirements for Full Gamut Content Distribution (13/08/02)
  • Internet Video Coding Test Model (ITM) v 6.0 (13/08/23)
  • WD 2.0 MAR Reference Model (13/08/13)
  • Call for Proposals on MPEG User Description (MPEG-UD) (13/08/02)
  • Use Cases for MPEG User Description (13/08/02)
  • Requirements on MPEG User Description (13/08/02)
  • Text of white paper on MPEG Query Format (13/07/02)
  • Text of white paper on MPEG-7 AudioVisual Description Profile (AVDP) (13/07/02)

Editorial

Dear Member of the SIGMM Community, welcome to the second issue of the SIGMM Records in 2013.

SIGMM has elected a new board to guide the SIG through the next couple of years and develop it further. The new board, under the chairmanship of Professor Shih-Fu Chang, introduces itself in this issue of the Records.

Among the first acts of the new board was the call for bids for ACM Multimedia 2016, announced also in this issue.

Of course, we also have several other contributions: the Open Source column introduces OpenIMAJ, while the MPEG column brings the press release for the 104th MPEG meeting. We can also reveal a change of leadership at FXPal, a research company with many SIGMM members in its ranks and a former SIGMM chair as departing president.

We also put a spotlight on the ongoing season of MediaEval, the multimedia benchmarking initiative, and we include four PhD thesis summaries in this issue.

Of course, we also include a variety of calls for contribution. Please pay attention to two in particular: TOMCCAP has chosen its special issue topic for 2014 and includes a call for papers in this issue of the Records, and MTAP has also issued a special issue call for papers.

Last but most certainly not least, you find pointers to the latest issues of TOMCCAP and MMSJ, and several job announcements.

We hope that you enjoy this issue of the Records.

The Editors
Stephan Kopf, Viktor Wendel, Lei Zhang, Pradeep Atrey, Christian Timmerer, Pablo Cesar, Mathias Lux, Carsten Griwodz

MPEG Column: Press release for the 104th MPEG meeting

Multimedia ecosystem event focuses on a broader scope of MPEG standards

The 104th MPEG meeting was held in Incheon, Korea, from 22 to 26 April 2013.

MPEG hosts Multimedia Ecosystem 2013 Event

During its 104th meeting, MPEG hosted the MPEG Multimedia Ecosystem event to raise awareness of MPEG’s activities in areas not directly related to compression. In addition to world-class standards for compression technologies, MPEG has developed media-related standards that enrich the use of multimedia, such as MPEG-M for Multimedia Service Platform Technologies, MPEG-U for Rich Media User Interfaces, and MPEG-V for interfaces between real and virtual worlds. New activities such as the MPEG Augmented Reality Application Format, Compact Descriptors for Visual Search, Green MPEG for energy-efficient media coding, and MPEG User Description are also currently in progress. The event was organized in two sessions: a workshop and demonstrations. The workshop session introduced the seven standards described above, while the demonstration session showed 17 products based on these standards.

MPEG issues CfP for Energy-Efficient Media Consumption (Green MPEG)

At the 104th MPEG meeting, MPEG issued a Call for Proposals (CfP) on energy-efficient media consumption (Green MPEG), which is available in the public documents section at http://mpeg.chiariglione.org/. Green MPEG is envisaged to provide interoperable solutions for energy-efficient media decoding and presentation, as well as energy-efficient media encoding based on encoder resources or receiver feedback. The CfP solicits responses that use compact signaling to facilitate reduced energy consumption from the encoding, decoding and presentation of media content without any degradation in the Quality of Experience (QoE). When power levels are critically low, consumers may prefer to sacrifice QoE for reduced energy consumption. Green MPEG will provide this capability by allowing energy consumption to be traded off against QoE. Responses to the call are due at the 105th MPEG meeting in July 2013.

APIs enable access to other MPEG technologies via MXM

The MPEG eXtensible Middleware (MXM) API technology specifications (ISO/IEC 23006-2) have reached the status of International Standard at the 104th MPEG meeting. MXM specifies the means to access individual MPEG tools through standardized APIs and is expected to help the creation of a global market of MXM applications that can run on devices supporting MXM APIs in addition to the other MPEG technologies. The MXM standard should also help the deployment of innovative business models because it will enable the easy design and implementation of media-handling value chains. The standard also provides reference software as open source with a business friendly license. The introductory part of the MXM family of specifications, 23006-1 MXM architecture and technologies, will soon be also freely available on the ISO web site.

MPEG introduces MPEG 101 with multimedia

MPEG has taken a further step toward communicating information about its standards in an easy and user-friendly manner: MPEG 101 with multimedia. MPEG 101 with multimedia will provide video clips containing overviews of individual standards along with explanations of the benefits that can be achieved by each standard, and will be available from the MPEG web site (http://mpeg.chiariglione.org/). During the 104th MPEG meeting, the first video clip, on the Unified Speech and Audio Coding (USAC) standard, was prepared. USAC is the newest MPEG Audio standard, issued in 2012. It provides performance as good as or better than state-of-the-art codecs that are designed specifically for a single class of content, such as just speech or just music, and it does so for any content type, such as speech, music or a mix of the two. Over its target operating bit-rate range, 12 kb/s for mono signals through 32 kb/s for stereo signals, USAC provides significantly better performance than the benchmark codecs, and continues to provide better performance as the bit rate is increased to higher rates. MPEG will apply the MPEG 101 with multimedia communication tool to other MPEG standards in the near future.

Digging Deeper – How to Contact MPEG

Communicating the large and sometimes complex array of technology that the MPEG Committee has developed is not a simple task. Experts, past and present, have contributed a series of tutorials and vision documents that explain each of these standards individually. The repository is growing with each meeting, so if something you are interested in is not yet there, it may appear shortly – but you should also not hesitate to request it. You can start your MPEG adventure at http://mpeg.chiariglione.org/

Further Information

Future MPEG meetings are planned as follows:

  • No. 105, Vienna, AT, 29 July – 2 August 2013
  • No. 106, Geneva, CH, 28 October – 1 November 2013
  • No. 107, San Jose, CA, USA, 13 – 17 January 2014
  • No. 108, Valencia, ES, 31 March – 04 April 2014

For further information about MPEG, please contact:
Dr. Leonardo Chiariglione (Convenor of MPEG, Italy)
Via Borgionera, 103
10040 Villar Dora (TO), Italy
Tel: +39 011 935 04 61
leonardo@chiariglione.org

or

Dr. Arianne T. Hinds
Cable Television Laboratories
858 Coal Creek Circle
Louisville, Colorado 80027, USA
Tel: +1 303 661 3419
a.hinds@cablelabs.com

The MPEG homepage also has links to other MPEG pages that are maintained by the MPEG subgroups. It also contains links to public documents that are freely available for download by those who are not MPEG members. Journalists who wish to receive MPEG Press Releases by email should contact Dr. Arianne T. Hinds at a.hinds@cablelabs.com.

MediaEval Multimedia Benchmark: Highlights from the Ongoing 2013 Season

MediaEval is an international multimedia benchmarking initiative that offers tasks to the multimedia community that are related to the human and social aspects of multimedia. The focus is on addressing new challenges in the area of multimedia search and indexing that allow researchers to make full use of multimedia techniques that simultaneously exploit multiple modalities. A series of interesting tasks is currently underway in MediaEval 2013. As every year, the selection of tasks is made using a community-wide survey that gauges what multimedia researchers would find most interesting and useful. A new task to watch closely this year is Search and Hyperlinking of Television Content, which follows on the heels of a very successful pilot last year. The other main tasks running this year are:

The tagline of the MediaEval Multimedia Benchmark is: “The ‘multi’ in multimedia: speech, audio, visual content, tags, users, context”. This tagline explains the inspiration behind the choice of the Brave New Tasks, which are running for the first time this year. Here, we would like to highlight Question Answering for the Spoken Web, which builds on the Spoken Web Search tasks mentioned above. This task is a joint effort between MediaEval and the Forum for Information Retrieval Evaluation, an India-based information retrieval benchmark. MediaEval believes strongly in collaboration and complementarity between benchmarks, and we hope that this task will help us to better understand how joint tasks are best designed and coordinated. The other Brave New Tasks at MediaEval this year are:

The MediaEval 2013 season culminates with the MediaEval 2013 workshop, which will take place in Barcelona, Catalunya, Spain, on Friday-Saturday 18-19 October 2013. Note that this is just before ACM Multimedia 2013, which will be held Monday-Friday 21-25 October 2013, also in Barcelona. We are currently finalizing the registration site for the workshop; it will open very soon and will be announced on the MediaEval website. In order to further foster our understanding and appreciation of user-generated multimedia, each year we designate a MediaEval filmmaker to make a YouTube video about the workshop. The MediaEval 2012 workshop video was made by John N.A. Brown and has recently appeared online. John decided to focus on the sense of community and the laughter that he observed at the workshop. Interestingly, his focus recalls the work done at MediaEval 2010 on the role of laughter in social video, see:

http://www.youtube.com/watch?v=z1bjXwxkgBs&feature=youtu.be&t=1m29s

We hope that this video inspires you to join us in Barcelona.