SIGMM Award for Outstanding PhD Thesis in Multimedia Computing, Communications and Applications

Award Description

This award will be presented at most once per year to a researcher whose PhD thesis has the potential for very high impact in multimedia computing, communications and applications, or gives direct evidence of such impact. A selection committee will evaluate contributions towards advances in multimedia, including multimedia processing, multimedia systems, multimedia network services, multimedia applications and interfaces. The award recognizes members of the SIGMM community for the research contributions in their PhD theses and for the potential impact of those theses on the multimedia area. The selection committee will focus on candidates’ contributions as judged by the innovative ideas and potential impact resulting from their PhD work.

The award includes a US$500 honorarium, an award certificate of recognition, and an invitation for the recipient to receive the award at that year’s SIGMM-sponsored ACM International Conference on Multimedia (ACM Multimedia). A public citation for the award will be placed on the SIGMM website, in the SIGMM Records e-newsletter and in the ACM e-newsletter.

Funding

The award honorarium, the award plaque of recognition and travel expenses to the ACM International Conference on Multimedia will be fully sponsored by the SIGMM budget.

Nomination Applications

Nominations will be accepted until 31 May 2014, with an award decision to be made by 30 August. This timing allows the recipient to prepare for the award presentation at ACM Multimedia that fall (October/November).

The initial nomination for a PhD thesis must relate to a dissertation deposited at the nominee’s academic institution between January and December of the year preceding the nomination. As discussed below, some dissertations may be held for up to three years by the selection committee for reconsideration. If the original thesis is not in English, a full English translation must be provided with the submission. Nominations for the award must include:

  1. The PhD thesis (upload at: https://cmt.research.microsoft.com/SIGMMA2014/ );
  2. A statement summarizing the candidate’s PhD thesis contributions and potential impact, and justification of the nomination (two pages maximum);
  3. The curriculum vitae of the nominee;
  4. Three endorsement letters supporting the nomination, including the significant PhD thesis contributions of the candidate. Each endorsement should be no longer than 500 words, with a clear specification of the nominee’s PhD thesis contributions and potential impact on the multimedia field;
  5. A concise statement (one sentence) of the PhD thesis contribution for which the award is being given. This statement will appear on the award certificate and on the website.

The nomination rules are:

  1. The nominee can be any member of the scientific community.
  2. The nominator must be a SIGMM member.
  3. No self-nomination is allowed.

If a particular thesis is considered to be of exceptional merit but is not selected for the award in a given year, the selection committee (at its sole discretion) may elect to retain the submission for consideration in at most the two following years. The candidate will be invited to resubmit his/her work in those years.

A thesis is considered to be outstanding if:

  1. Its theoretical contributions are significant, and their application to multimedia is demonstrated.
  2. Its applications to multimedia are outstanding, and its techniques are backed by solid theory, with a clear demonstration that the algorithms can be applied in new domains – e.g., algorithms must be demonstrably scalable in application in terms of robustness, convergence and complexity.

The submission of nominations will be preceded by a call for nominations. The call for nominations will be widely publicized by the SIGMM awards committee and the SIGMM Executive Board at the different SIGMM venues, such as during SIGMM’s premier ACM Multimedia conference (at the SIGMM Business Meeting), on the SIGMM web site, via the SIGMM mailing list, and via the SIGMM e-newsletter, between September and December of the previous year.

Submission Process

  • Register an account at https://cmt.research.microsoft.com/SIGMMA2014/ and upload one copy of the nominated PhD thesis. The nominee will receive a Paper ID after the submission.
  • The nominator must then collate other materials detailed in the previous section and upload them as supplementary materials, except the endorsement letters, which must be emailed separately as detailed below.
  • Contact your referees and ask them to send their endorsement letters to sigmmaward@gmail.com with the subject: “PhD Thesis Award Endorsement Letter for [YourName]”. The web administrator will acknowledge receipt, and the CMT submission website will reflect the status of uploaded documents and endorsement letters.

It is the responsibility of the nominator to follow the process and make sure the documentation is complete. Theses with incomplete documentation will be considered invalid.

Selection Committee

The 2014 award selection committee consists of:

  • Prof. Kiyoharu Aizawa (aizawa@hal.t.u-tokyo.ac.jp) from University of Tokyo, Japan
  • Prof. Baochun Li (bli@eecg.toronto.edu) from University of Toronto, Canada
  • Prof. K. Selcuk Candan (candan@asu.edu) from Arizona State University, USA
  • Prof. Shin’ichi Satoh (satoh@nii.ac.jp) from National Institute of Informatics, Japan
  • Dr. Daniel Gatica-Perez (gatica@idiap.ch) from Idiap-EPFL, Switzerland

ESSENTIA: an open source library for audio analysis

Over the last decade, audio analysis has become a field of active research in the academic and engineering worlds. It refers to the extraction of information and meaning from audio signals for analysis, classification, storage, retrieval, and synthesis, among other tasks. Related research challenges the understanding and modeling of sound and music, and develops methods and technologies that can be used to process audio in order to extract acoustically and musically relevant data and make use of this information. Audio analysis techniques are instrumental in the development of new audio-related products and services, because they allow novel ways of interacting with sound and music.

Essentia is an open-source C++ library for audio analysis and audio-based music information retrieval released under the Affero GPLv3 license (also available under a proprietary license upon request). It contains an extensive collection of reusable algorithms which implement audio input/output functionality, standard digital signal processing blocks, statistical characterization of data, and a large set of spectral, temporal, tonal and high-level music descriptors that can be computed from audio. In addition, Essentia can be complemented with Gaia, a C++ library with Python bindings which allows searching in a descriptor space using different similarity measures and classifying the results of audio analysis (the same license terms apply). Gaia can be used to generate classification models that Essentia can use to compute high-level descriptions of music.

Essentia is not a framework, but rather a collection of algorithms wrapped in a library. It does not enforce common high-level logic for descriptor computation (so you are not locked into a certain way of doing things). It focuses instead on the robustness, performance and optimality of the provided algorithms, as well as ease of use.
The flow of the analysis is decided and implemented by the user, while Essentia takes care of the implementation details of the algorithms being used. A number of examples are provided with the library; however, they should not be considered the only correct way of doing things. The library includes Python bindings as well as a number of predefined executable extractors for the available music descriptors, which facilitates its use for fast prototyping and allows research experiments to be set up very rapidly. The extractors cover a number of common use cases for researchers: for example, computing all available music descriptors for an audio track; extracting only spectral, rhythmic, or tonal descriptors; computing predominant melody and beat positions; and returning the results in YAML/JSON data formats. Furthermore, the library includes a Vamp plugin for visualization of music descriptors using hosts such as Sonic Visualiser. The library is cross-platform and supports Linux, Mac OS X and Windows. Essentia is designed with a focus on the robustness of the provided music descriptors and is optimized in terms of the computational cost of the algorithms. The provided functionality, specifically the out-of-the-box music descriptors and signal processing algorithms, is easily expandable and allows for both research experiments and the development of large-scale industrial applications. Essentia has been in development for more than 7 years, incorporating the work of more than 20 researchers and developers throughout its history. Version 2.0 marked the first release to be publicly available as free software under the AGPLv3.

Algorithms

Essentia currently features the following algorithms (among others):

  • Audio file input/output: ability to read and write nearly all audio file formats (wav, mp3, ogg, flac, etc.)
  • Standard signal processing blocks: FFT, DCT, frame cutter, windowing, envelope, smoothing
  • Filters (FIR & IIR): low/high/band pass, band reject, DC removal, equal loudness
  • Statistical descriptors: median, mean, variance, power means, raw and central moments, spread, kurtosis, skewness, flatness
  • Time-domain descriptors: duration, loudness, LARM, Leq, Vickers’ loudness, zero-crossing-rate, log attack time and other signal envelope descriptors
  • Spectral descriptors: Bark/Mel/ERB bands, MFCC, GFCC, LPC, spectral peaks, complexity, rolloff, contrast, HFC, inharmonicity and dissonance
  • Tonal descriptors: Pitch salience function, predominant melody and pitch, HPCP (chroma) related features, chords, key and scale, tuning frequency
  • Rhythm descriptors: beat detection, BPM, onset detection, rhythm transform, beat loudness
  • Other high-level descriptors: danceability, dynamic complexity, audio segmentation, semantic annotations based on SVM classifiers

The complete list of algorithms is available online in the official documentation.
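To give a flavor of what the simplest of these descriptors measure, the zero-crossing rate (one of the time-domain descriptors listed above) can be sketched in a few lines of plain Python. This is an illustrative approximation only, not Essentia’s optimized C++ implementation:

```python
# Illustrative sketch of the zero-crossing rate (ZCR), a time-domain
# descriptor: the fraction of adjacent sample pairs whose signs differ.
# Essentia's own implementation is in optimized C++.

def zero_crossing_rate(signal):
    """Return the fraction of adjacent sample pairs with differing signs."""
    if len(signal) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(signal, signal[1:]) if (a < 0) != (b < 0)
    )
    return crossings / (len(signal) - 1)

# A signal that alternates in sign crosses zero at every step:
print(zero_crossing_rate([1.0, -1.0, 1.0, -1.0]))  # 1.0
# A constant-sign signal never crosses zero:
print(zero_crossing_rate([0.5, 0.7, 0.9]))  # 0.0
```

Percussive and noisy signals tend to have a high ZCR, while tonal sounds have a low one, which is why this descriptor is commonly combined with spectral features for classification.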

Architecture

The main purpose of Essentia is to serve as a library of signal-processing blocks. As such, it is intended to provide as many algorithms as possible, while being as unintrusive as possible. Each processing block is called an Algorithm, and it has three different types of attributes: inputs, outputs and parameters. Algorithms can be combined into more complex ones, which are also instances of the base Algorithm class and behave in the same way. An example of such a composite algorithm is presented in the figure below. It shows a composite tonal key/scale extractor, which combines the algorithms for frame cutting, windowing, spectrum computation, spectral peaks detection, chroma features (HPCP) computation and, finally, the algorithm for key/scale estimation from the HPCP (itself a composite algorithm).

The algorithms can be used in two different modes: standard and streaming. The standard mode is imperative, while the streaming mode is declarative. The standard mode requires specifying the inputs and outputs for each algorithm and calling its processing function explicitly. If the user wants to run a network of connected algorithms, he/she needs to run each algorithm manually. The advantage of this mode is that it allows very rapid prototyping (especially when the Python bindings are coupled with a scientific Python environment such as ipython, numpy, and matplotlib).

The streaming mode, on the other hand, allows the user to define a network of connected algorithms; an internal scheduler then takes care of passing data between the algorithms’ inputs and outputs and of calling the algorithms in the appropriate order. The scheduler available in Essentia is optimized for analysis tasks and does not take into account the latency of the network. For real-time applications, one could easily replace this scheduler with another one that favors latency over throughput. The advantage of this mode is that it results in simpler and safer code (the user only needs to create algorithms and connect them, so there is no room for mistakes in the execution order of the algorithms), and generally in lower memory consumption, as the data is streamed through the network instead of being loaded entirely into memory (which is usually the case when working with the standard mode). Even though most of the algorithms are available in both the standard and streaming modes, the code that implements them is not duplicated: either the streaming version of an algorithm is derived/wrapped from its standard implementation, or vice versa.
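The connect-then-run pattern behind the streaming mode can be sketched in a few lines of Python. This is a toy illustration of the idea only; the class and function names here are hypothetical and do not correspond to Essentia’s actual scheduler or API:

```python
# Toy sketch of a declarative processing network: blocks are connected
# first, then a trivial "scheduler" pushes data through them one frame
# at a time. (Hypothetical names; not Essentia's API.)

class Block:
    """A processing block with one input, one output, and no parameters."""
    def __init__(self, fn):
        self.fn = fn          # the per-frame processing function
        self.sink = None      # the downstream block, if any
        self.collected = []   # results accumulate at the end of a chain

    def connect(self, other):
        # analogous in spirit to Essentia's '>>' connection operator
        self.sink = other
        return other

    def push(self, data):
        out = self.fn(data)
        if self.sink is not None:
            self.sink.push(out)
        else:
            self.collected.append(out)

def run(source, frames):
    # the "scheduler": feed each frame into the head of the network
    for frame in frames:
        source.push(frame)

# Build a two-block network (scale, then offset) and run it:
scale = Block(lambda x: x * 2)
offset = Block(lambda x: x + 1)
scale.connect(offset)

run(scale, [1, 2, 3])
print(offset.collected)  # [3, 5, 7]
```

A real scheduler additionally handles fan-out, buffering and back-pressure between blocks, which is exactly the bookkeeping the streaming mode takes off the user’s hands.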

Applications

Essentia has served in a large number of research activities conducted at Music Technology Group since 2006. It has been used for music classification, semantic autotagging, music similarity and recommendation, visualization and interaction with music, sound indexing, musical instruments detection, cover detection, beat detection, and acoustic analysis of stimuli for neuroimaging studies. Essentia and Gaia have been used extensively in a number of research projects and industrial applications. As an example, both libraries are employed for large-scale indexing and content-based search of sound recordings within Freesound, a popular repository of Creative Commons licensed audio samples. In particular, Freesound uses audio based similarity to recommend sounds similar to user queries. Dunya is a web-based software application using Essentia that lets users interact with an audio music collection through the use of musical concepts that are derived from a specific musical culture, in this case Carnatic music.

Examples

Essentia can be easily used via its Python bindings. Below is a quick illustration of Essentia’s possibilities, for example detecting the beat positions of a music track and its predominant melody in a few lines of Python code using the standard mode:


    from essentia.standard import *

    audio = MonoLoader(filename = 'audio.mp3')()
    beats, bconfidence = BeatTrackerMultiFeature()(audio)
    print beats

    audio = EqualLoudness()(audio)
    melody, mconfidence = PredominantMelody(guessUnvoiced=True, frameSize=2048, hopSize=128)(audio)
    print melody

Another Python example, computing MFCC features using the streaming mode:

    import essentia
    from essentia.streaming import *

    loader = MonoLoader(filename = 'audio.mp3')
    frameCutter = FrameCutter(frameSize = 1024, hopSize = 512)
    w = Windowing(type = 'hann')
    spectrum = Spectrum()
    mfcc = MFCC()
    pool = essentia.Pool()

    # connect all algorithms into a network
    loader.audio >> frameCutter.signal
    frameCutter.frame >> w.frame >> spectrum.frame
    spectrum.spectrum >> mfcc.spectrum
    mfcc.mfcc >> (pool, 'mfcc')
    mfcc.bands >> (pool, 'mfcc_bands')

    # compute the network
    essentia.run(loader)

    print pool['mfcc']
    print pool['mfcc_bands']

The Vamp plugin provided with Essentia allows many of its algorithms to be used via the graphical interface of Sonic Visualiser. In this example, the positions of onsets are computed for a music piece (marked in red). The interested reader is referred to the online documentation for more example applications built on top of Essentia.

Getting Essentia

Detailed information about Essentia is available on the official web page: http://essentia.upf.edu. It contains the complete documentation for the project, compilation instructions for Debian/Ubuntu, Mac OS X and Windows, as well as precompiled packages. The source code is available at the official GitHub repository: http://github.com/MTG/essentia. Our current work focuses on expanding the library and its community of users, and all active Essentia users are encouraged to contribute to the library.


Most cited papers before the era of ICMR

In the early 2000s, the field of multimedia retrieval was composed of special sessions at conferences and small workshops; there were no dedicated multimedia retrieval conferences. One of the leading workshops, founded by B. Kerherve, V. Oria and S. Satoh, was the ACM SIGMM Workshop on Multimedia Information Retrieval (MIR), which was held with the ACM MM conference.

To give the scientific community a central meeting, the International Conference on Image and Video Retrieval (CIVR) was founded in 2002 by J. Eakins, P. Enser, M. Graham, M.S. Lew, P. Lewis and A. Smeaton. Both meetings evolved over the next decade: CIVR and MIR became ACM SIGMM sponsored conferences and established reputations for high-quality work.

In 2010, the steering committees of both CIVR and MIR voted to combine the two conferences toward unifying the communities and establishing the ACM flagship meeting for multimedia retrieval, the ACM International Conference on Multimedia Retrieval (ICMR).  In 2013, ICMR was ranked by the Chinese Computing Federation as the #1 meeting in multimedia retrieval and the #4 meeting in the wide domain of Multimedia and Graphics.

For archival purposes, this is a summary of the papers with the most citations from ACM CIVR and ACM MIR (2008-2010), based on Google Scholar data collected February 17-18, 2014.

Google Scholar citations were used because they have wide coverage (ACM, IEEE, Springer, Elsevier, etc.), are publicly accessible, and are increasingly accepted by researchers both for paper citation estimates and for computing the h-index.

The information below is given in the format of
Rank | Citations | Article-Information

CIVR 2008

  1. 173 – World-scale mining of objects and events from community photo collections
    Till Quack, Bastian Leibe, Luc Van Gool
    http://dl.acm.org/citation.cfm?id=1386363
  2. 81 – Analyzing Flickr groups
    Radu Andrei Negoescu, Daniel Gatica-Perez
    http://dl.acm.org/citation.cfm?id=1386406
  3. 70 – A comparison of color features for visual concept classification
    Koen E.A. van de Sande, Theo Gevers, Cees G.M. Snoek
    http://dl.acm.org/citation.cfm?id=1386376
  4. 68 – Language modeling for bag-of-visual words image categorization
    Pierre Tirilly, Vincent Claveau, Patrick Gros
    http://dl.acm.org/citation.cfm?id=1386388
  5. 46 – Multiple feature fusion by subspace learning
    Yun Fu, Liangliang Cao, Guodong Guo, Thomas S. Huang
    http://dl.acm.org/citation.cfm?id=1386373

CIVR 2009

  1. 379 – NUS-WIDE: a real-world web image database from National University of Singapore
    Tat-Seng Chua, Jinhui Tang, Richang Hong, Haojie Li, Zhiping Luo, Yantao Zheng
    http://dl.acm.org/citation.cfm?id=1646452
  2. 124 – Evaluation of GIST descriptors for web-scale image search
    Matthijs Douze, Hervé Jégou, Harsimrat Sandhawalia, Laurent Amsaleg, Cordelia Schmid
    http://dl.acm.org/citation.cfm?id=1646421
  3. 81 – Real-time bag of words, approximately
    J. R. R. Uijlings, A. W. M. Smeulders, R. J. H. Scha
    http://dl.acm.org/citation.cfm?id=1646405
  4. 57 – Dense sampling and fast encoding for 3D model retrieval using bag-of-visual features
    Takahiko Furuya, Ryutarou Ohbuchi
    http://dl.acm.org/citation.cfm?id=1646430
  5. 46 – Multilayer pLSA for multimodal image retrieval
    Rainer Lienhart, Stefan Romberg, Eva Hörster
    http://dl.acm.org/citation.cfm?id=1646408

CIVR 2010

  1. 43 – Signature Quadratic Form Distance
    Christian Beecks, Merih Seran Uysal, Thomas Seidl
    http://dl.acm.org/citation.cfm?id=1816105
  2. 41 – Feature detector and descriptor evaluation in human action recognition
    Ling Shao, Riccardo Mattivi
    http://dl.acm.org/citation.cfm?id=1816111
  3. 38 – Unsupervised multi-feature tag relevance learning for social image retrieval
    Xirong Li, Cees G. M. Snoek, Marcel Worring
    http://dl.acm.org/citation.cfm?id=1816044
  4. 29 – Co-reranking by mutual reinforcement for image search
    Ting Yao, Tao Mei, Chong-Wah Ngo
    http://dl.acm.org/citation.cfm?id=1816048
  5. Two papers were tied for 5th place in citations.

MIR 2008

  1. 285 – The MIR flickr retrieval evaluation
    Mark J. Huiskes, Michael S. Lew
    http://dl.acm.org/citation.cfm?id=1460104
  2. 203 – Outdoors augmented reality on mobile phone using loxel-based visual feature organization
    Gabriel Takacs, Vijay Chandrasekhar, Natasha Gelfand, Yingen Xiong, Wei-Chao Chen, Thanos Bismpigiannis, Radek Grzeszczuk, Kari Pulli, Bernd Girod
    http://dl.acm.org/citation.cfm?id=1460165
  3. 119 – Learning tag relevance by neighbor voting for social image retrieval
    Xirong Li, Cees G.M. Snoek, Marcel Worring
    http://dl.acm.org/citation.cfm?id=1460126
  4. 58 – Spirittagger: a geo-aware tag suggestion tool mined from flickr
    Emily Moxley, Jim Kleban, B. S. Manjunath
    http://dl.acm.org/citation.cfm?id=1460102
  5. 42 – Content-based mood classification for photos and music: a generic multi-modal classification framework and evaluation approach
    Peter Dunker, Stefanie Nowak, André Begau, Cornelia Lanz
    http://dl.acm.org/citation.cfm?id=1460114

MIR 2010

  1. 82 – New trends and ideas in visual concept detection: the MIR flickr retrieval evaluation initiative
    Mark J. Huiskes, Bart Thomee, Michael S. Lew
    http://dl.acm.org/citation.cfm?id=1743475
  2. 78 – How reliable are annotations via crowdsourcing: a study about inter-annotator agreement for multi-label image annotation
    Stefanie Nowak, Stefan Rüger
    http://dl.acm.org/citation.cfm?id=1743478
  3. 45 – Exploring automatic music annotation with “acoustically-objective” tags
    Derek Tingle, Youngmoo E. Kim, Douglas Turnbull
    http://dl.acm.org/citation.cfm?id=1743400
  4. 39 – Feature selection for content-based, time-varying musical emotion regression
    Erik M. Schmidt, Douglas Turnbull, Youngmoo E. Kim
    http://dl.acm.org/citation.cfm?id=1743431
  5. 34 – ACQUINE: aesthetic quality inference engine – real-time automatic rating of photo aesthetics
    Ritendra Datta, James Z. Wang
    http://dl.acm.org/citation.cfm?id=1743457

Report from ACM Multimedia 2013

Conference/Workshop Program Highlights

ACM Multimedia 2013 was held at the CCIB (Centre de Convencions Internacional de Barcelona) from October 21st to October 25th, 2013 in Barcelona. The Art Exhibition was held for the entire duration of the conference at the FAD (Foment de les Arts i del Disseny) in the center of the city, while the workshops were held in the Universitat Pompeu Fabra – Balmes building during the first two days of the conference (Oct. 21-Oct. 22). It was the first time the conference was held in Spain, and it offered a high-quality program and a few notable innovations. Dr. Nozha Boujemaa from INRIA, France, Dr. Alejandro Jaimes from Yahoo! Labs, Spain and Prof. Nicu Sebe from the University of Trento, Italy were the general co-chairs of the conference. Dr. Daniel Gatica-Perez from IDIAP & EPFL, Switzerland, Dr. David A. Shamma from Yahoo! Labs, USA, Prof. Marcel Worring from the University of Amsterdam, The Netherlands, and Prof. Roger Zimmermann from the National University of Singapore, Singapore were the program co-chairs. The entire organization committee is listed in Appendix A.

The number of participants was 544. The main conference was attended by 476 participants, of which 425 paid and 51 were special cases (sponsors, student volunteers, etc.); 68 participants attended workshops only. The tutorials, which were free of charge, had 312 advance registrations. The multimedia art exhibition was open to the public from Oct. 21 to Oct. 28 and received more than 2,000 visitors. The total revenue of the conference was $318,151, and the surplus was $25,430.

The venue (CCIB)

Below is the list of the program components of Multimedia 2013.

  • Technical Papers: Full and Short papers
  • Keynote Talks
  • SIGMM Achievement Award Talk, Ph.D Thesis Award Talk
  • Panel
  • Brave New Ideas
  • Multimedia Grand Challenge Solutions
  • Technical Demos
  • Open Source Software Competition
  • Doctoral Symposium
  • Art Exhibition and Reception
  • Tutorials
  • Workshops
  • Awards and Banquet

Innovations made for Multimedia 2013:

In an attempt to continuously improve ACM Multimedia and ensure its vibrant role for the multimedia community, we made a number of enhancements for this year’s conference:

  • The Technical Program Committee defined twelve Technical Areas as the major foci for this year’s conference, including new Technical Areas for Music & Audio and Crowdsourcing to reflect their growing interest and promise. We also changed the names of some traditional Technical Areas and provided an extensive description of each area to help authors choose the most appropriate Technical Area for their manuscripts.
  • We introduced a new role in the organization of the conference: the author’s advocate. The advocate’s explicit role was to listen to the authors and to help them if reviews were clearly below average quality. Authors could request the mediation of the author’s advocate after the reviews had been sent to them, and they had to clearly justify why such mediation was needed (i.e., that the reviews or the meta-review were below average quality). The task of the advocate was to investigate the matter carefully and to request an additional review or a reexamination of the decision on the particular manuscript. This year, the author’s advocate was Pablo Cesar from CWI, The Netherlands.
  • We kept a couple of plenary sessions to bring singular focus to conference activities: keynotes, the Multimedia Grand Challenge competition, the Best Paper session, and the Technical Achievement Award and Best PhD Award sessions. The other technical sessions were held in parallel to allow pursuit of more specialized interests at the conference. We limited the number of parallel sessions to no more than 3 to minimize the risk of overlapping interests.
  • We used video spotlights to advertise the works to be presented. These were meant to offer all attendees an opportunity to become aware of the content of each paper, and thus to be attracted to the corresponding poster or talk.
  • Workshops and Tutorials were held on separate days from the main conference in order to reduce conflicts with the regular Technical Program.
  • The Multimedia Art Exhibition featured both invited and selected artists. It was open for the duration of the conference in a satellite venue located in the center of the city.
  • Following the last two years’ precedent, Tutorials were made free for all participants.
  • Recognizing that students are the lifeblood of our next generation of multimedia thinkers, this year’s Student Travel Grant program was greatly expanded. A total of $26,000, received from SIGMM ($16,000) and NSF ($10,000), supported 35 students.
  • Finally, we decided to provide open access for the community to the proceedings in the ACM Digital Library. As such, no USB proceedings were handed out to the participants, encouraging everyone to use the online access.

Technical Program

Following the guidelines of the ACM Multimedia Review Committee, the conference was structured into 12 Areas, with a two-tier TPC, a double-blind review process, and a target acceptance rate of 20% for long papers and 27.7% for short papers. Based on the experience from ACM Multimedia 2012 and the responses to our “Call for Areas” that we issued to the community, we selected the following Areas.

  1. Art, Entertainment, and Culture
  2. Authoring and Collaboration
  3. Crowdsourcing
  4. Media Transport and Delivery
  5. Mobile & Multi-device
  6. Multimedia Analysis
  7. Multimedia HCI
  8. Music & Audio
  9. Search, Browsing, and Discovery
  10. Security and Forensics
  11. Social Media & Presence
  12. Systems and Middleware

The Technical Program Committee was first created by appointing Area Chairs (ACs). A total of 29 colleagues agreed to serve in this role. Each Area was represented by two ACs, with the exception of two Areas (Multimedia Analysis and Search, Browsing, and Discovery) whose scope has traditionally attracted the largest proportion of papers and so required further coordination. The added topic diversity brought an increase in the gender diversity of the ACs, from approximately 12% in previous years to 22% for 2013. We also made a conscious effort to bring new talent and excellence into the community and to better represent emerging trends in the field. To this end we appointed many young and well-recognized ACs who served in this role for the first time. For each junior AC, we co-appointed a senior researcher as their co-AC to aid in their shepherding.

In a second step, the Area Chairs were responsible for appointing the TPC members (reviewers) for their areas. This was a large effort to grow the TPC base for the conference as well as to ensure that proper expertise was represented in each area. We coupled this with a hard goal of limiting the number of submissions assigned to each TPC member for review. For example, two years ago the average number of papers assigned to a reviewer was 9, with over 38% of the approximately 225 TPC members receiving 10 or more papers to review. With our design, we had a total of 398 reviewers receiving an average of 4.13 papers per reviewer. While we were unable to keep a hard ceiling, only 2.51% of the TPC received 10 or more papers to review – all of them TPC members who had agreed to serve in more than one area. The Area Chairs were in charge of assigning all papers for review, and each submission was reviewed double-blind by three TPC members.
Reviews and reviewer assignments of papers co-authored by Area Chairs, Program Chairs, and General Chairs were handled by Program Chairs who had no conflicts of interest in each specific case. Another novelty in the reviewing process was setting the paper submission deadline significantly earlier than in previous years, in order to allocate more time for reviews, rebuttals, discussions, and final decisions. Despite the reduced time given to authors, the response to the Call for Papers was enthusiastic, with a total of 235 long papers and 278 short papers going through review. The authors of long papers were asked to write a rebuttal after receiving the reviews. A new element in the reviewing process was the introduction of the Author’s Advocate, created to provide authors with an independent channel to express concerns about the quality of the reviews for their papers and to raise a flag about those reviews. All cases were brought to the attention of the corresponding Area Chair. After evaluating each case reported to him (16 reviews out of 761 long paper reviews), the Author’s Advocate recommended in 5 cases that new reviews be generated and added to the discussion. The reviewers had a period for online discussion of reviews and rebuttals, after which the Area Chairs drafted a meta-review for each paper. Decisions on long and short papers were made at the TPC meeting held at the University of Amsterdam on June 11, 2013. The meeting was physically attended by one of the General Chairs, three of the Program Chairs, the Author’s Advocate, and 86% of the ACs. Many of the ACs who were unable to attend were tele-present online for the discussions. On the first half day of the TPC meeting, the Area Chairs worked in breakout sessions to discuss the papers that were weak accepts and weak rejects, with the exception of conflict-of-interest papers, which were handled out of band as previously mentioned.
In the second half of the first day, the ACs met in a plenary session where they reviewed the clear accepts and defended the decisions on the borderline papers based on the papers themselves, reviews, meta-reviews, online discussions, and authors’ rebuttal comments. In many cases, an emergency reviewer was added if there was clear intersection with a related submission area. If a paper discussed during the plenary session had a conflict of interest with an Area, Program, or General Chair, that chair was excused from the room. On June 12, 2013, the Program Chairs finalized the process and conference program in a separate meeting, arranging the sessions by thematic narratives rather than by submission area to promote cross-area conversations during the conference itself. The review process resulted in an overall acceptance rate of 20.0% for long papers and 27.7% for short papers (the distribution of submissions and the acceptance rate for each of the 12 areas is shown in the graph below). All accepted long papers were shepherded by the Area Chairs themselves or by qualified TPC members, who were in charge of verifying that the revised papers adequately addressed concerns raised by the reviewers and the changes promised by authors in their rebuttals. This step ensured that all of the accepted papers were of the highest quality possible. In addition, four papers with high review scores were nominated at the TPC meeting as candidates for the Best Paper Award. Each nominated paper had to be successfully championed and defended by the ACs from its area. The winner was announced at the Conference Banquet.
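The acceptance counts implied by these figures can be recovered with a short computation; note that the accepted-paper counts (47 long, 77 short) are derived from the reported rates, not stated explicitly in the report:

```python
# Acceptance counts implied by the reported submission counts and rates.
long_submitted, long_rate = 235, 0.200
short_submitted, short_rate = 278, 0.277

long_accepted = round(long_submitted * long_rate)     # derived: 47 long papers
short_accepted = round(short_submitted * short_rate)  # derived: 77 short papers

print(long_accepted, short_accepted)  # 47 77
```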

ACM Multimedia 2013 Program at a Glance

The entire program of ACM Multimedia 2013 is shown below.



Workshop session

Conference venue

Opening ceremony

Keynote presentation

Poster/Demo session

SIGMM Achievement Award Talk


Keynote Talks

Multimedia Framed
Dr. Elizabeth F. Churchill (eBay Research Labs)
Wednesday, Oct. 23, 2013

Abstract: Multimedia is the combination of several media forms. Information designers, educationalists and artists are concerned with questions such as: Is text, or audio or video, or a combination of all three, the best format for the message? Should another modality (e.g., haptics/touch, olfaction) be invoked instead to make the message more effective and/or the experience more engaging? How does the setting affect perception/reception? How does framing affect people’s experience of multimedia? How is the artifact changed through interaction with audience members? In this presentation, I will talk about people’s experience of multimedia artifacts like videos. I will discuss the ways in which framing affects how we experience multimedia. Framing can be intentional: scripted creations produced with clear intent by technologists, designers, media producers, media artists, film-makers, archivists, documentarians and architects. Framing can also be unintentional. Everyday acts of interest and consumption turn us, the viewers, into co-producers of the experiences of the multimedia artifacts we have viewed. We download, annotate, comment and share multimedia artifacts online. Our actions are reflected in view counts, displayed comments and content ranking. Our actions therefore change how multimedia artifacts are interpreted and understood by others. Drawing on examples from the history of film and of performance art, from current social media research and from research conducted with collaborators over the past 16 years, I will illustrate how content understanding is modulated by context, by the “framing” of the content.
I will consider three areas of research that are addressing the issue of framing, and that have implications for our understanding of ‘multimedia’ consumption, now and in the future: (1) the psychology and psychophysiology of multimedia as multimodal experience; (2) emerging practices with contemporary social media capture and sharing from personal devices; and (3) innovations in social media and audience analytics focused on more deeply understanding media consumption. I will conclude with some technical excitements, design/development challenges and experiential possibilities that lie ahead.

Dr. Elizabeth Churchill is Director of Human Computer Interaction at eBay Research Labs (ERL) in San Jose, California. Formerly a Principal Research Scientist at Yahoo! Research, she founded, staffed and managed the Internet Experiences Group. Until September 2006, she worked at the Palo Alto Research Center (PARC), California, in the Computing Science Lab (CSL). Prior to that she formed and led the Social Computing Group at FX Palo Alto Laboratory, Fuji Xerox’s research lab in Palo Alto. Originally a psychologist by training, throughout her career Elizabeth has focused on understanding people’s social and collaborative interactions in their everyday digital and physical contexts. With over 100 peer-reviewed publications and 5 edited books, she has written about topics including implicit learning, human-agent systems, mixed-initiative dialogue systems, social aspects of information seeking, digital archives and memory, and the development of emplaced media spaces. She has been a regular columnist for ACM Interactions since 2008. Elizabeth has a BSc in Experimental Psychology and an MSc in Knowledge Based Systems, both from the University of Sussex, and a PhD in Cognitive Science from the University of Cambridge. In 2010, she was recognised as a Distinguished Scientist by the Association for Computing Machinery (ACM).
Elizabeth is the current Executive Vice President of ACM SIGCHI (the Human-Computer Interaction Special Interest Group). She is a Distinguished Visiting Scholar at Stanford University’s Media X, the industry affiliate program to Stanford’s H-STAR Institute.

The Space between the Images
Leonidas J. Guibas (Stanford University)
Thursday, Oct. 24, 2013

Abstract: Multimedia content has become a ubiquitous presence on all our computing devices, spanning the gamut from live content captured by device sensors such as smartphone cameras to immense databases of images, audio and video stored in the cloud. As we try to maximize the utility and value of all these petabytes of content, we often do so by analyzing each piece of data individually and foregoing a deeper analysis of the relationships between the media. Yet with more and more data, there will be more and more connections and correlations, because the data captured comes from the same or similar objects, or because of particular repetitions, symmetries or other relations and self-relations that the data sources satisfy. This is particularly true for media of a geometric character, such as GPS traces, images, videos, 3D scans, 3D models, etc. In this talk we focus on the “space between the images”, that is, on expressing the relationships between different multimedia data items. We aim to make such relationships explicit, tangible, first-class objects that themselves can be analyzed, stored, and queried, irrespective of the media they originate from. We discuss mathematical and algorithmic issues on how to represent and compute relationships or mappings between media data sets at multiple levels of detail. We also show how to analyze and leverage networks of maps and relationships, small and large, between inter-related data. The network can act as a regularizer, allowing us to benefit from the “wisdom of the collection” in performing operations on individual data sets or in map inference between them.
We will illustrate these ideas using examples from the realm of 2D images and 3D scans/shapes, but these notions are more generally applicable to the analysis of videos, graphs, acoustic data, biological data such as microarrays, homeworks in MOOCs, etc. This is an overview of joint work with multiple collaborators, as will be discussed in the talk.

Prof. Leonidas Guibas obtained his Ph.D. from Stanford under the supervision of Donald Knuth. His main subsequent employers were Xerox PARC, DEC/SRC, MIT, and Stanford. He is currently the Paul Pigott Professor of Computer Science (and by courtesy, Electrical Engineering) at Stanford University. He heads the Geometric Computation group and is part of the Graphics Laboratory, the AI Laboratory, the Bio-X Program, and the Institute for Computational and Mathematical Engineering. Professor Guibas’ interests span geometric data analysis, computational geometry, geometric modeling, computer graphics, computer vision, robotics, ad hoc communication and sensor networks, and discrete algorithms. Some well-known past accomplishments include the analysis of double hashing, red-black trees, the quad-edge data structure, Voronoi-Delaunay algorithms, the Earth Mover’s Distance, Kinetic Data Structures (KDS), Metropolis light transport, and the Heat-Kernel Signature. Professor Guibas is an ACM Fellow, an IEEE Fellow, and a winner of the ACM/AAAI Allen Newell Award.

SIGMM Talks

SIGMM Achievement Award Talk
Dick Bulterman, CWI, The Netherlands
Friday, Oct. 25, 2013

The 2013 winner of the SIGMM Award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications is Prof. Dr. Dick Bulterman. The ACM SIGMM Technical Achievement Award is given in recognition of outstanding contributions over a researcher’s career. Prof. Bulterman was selected for his outstanding technical contributions in multimedia authoring, media annotation, and social sharing, spanning research, standardization, and entrepreneurship, and in particular for promoting international Web standards for multimedia authoring and presentation (SMIL) in the W3C Synchronized Multimedia Working Group, as well as for his dedicated involvement in the SIGMM research community over many years. Dr. Bulterman has been a long-time intellectual leader in the area of temporal modeling and support for complex multimedia systems. His research has led to the development of several widely used multimedia authoring systems and players. He developed the Amsterdam Hypermedia Model, the CMIF document structure, the CMIFed authoring environment, the GRiNS editor and player, and a host of multimedia demonstrator applications. In 1999, he started the CWI spin-off company Oratrix Development BV and worked as its CEO to deliver this software widely. He is currently head of the Distributed and Interactive Systems research group at Centrum Wiskunde & Informatica (CWI) in Amsterdam, The Netherlands, and a Full Professor of Computer Science at Vrije Universiteit, Amsterdam. His research interests are multimedia authoring and document processing. Dick has a strong international reputation for the development of the domain-specific temporal language for multimedia (SMIL). Much of this software has been incorporated into the widely used Ambulant open-source SMIL player, which has served to encourage the development and use of time-based multimedia content.
His conference publications and book on SMIL have helped to promote SMIL and its acceptance as a W3C standard. Dick’s recent work on social sharing of video will likely prove influential in upcoming interactive TV products. This work has already been recognized in the academic community, earning best paper awards at ACM Multimedia 2008 and at the EuroITV conference.

SIGMM Ph.D. Thesis Award Talk
Xirong Li, Renmin University, China
Friday, Oct. 25, 2013

The SIGMM Ph.D. Thesis Award Committee recommended this year’s award for the outstanding Ph.D. thesis in multimedia computing, communications and applications to Dr. Xirong Li. The committee considered Dr. Li’s dissertation, titled “Content-based visual search learned from social media”, worthy of the award as it substantially extends the boundaries for developing content-based multimedia indexing and retrieval solutions. In particular, it provides fresh new insights into the possibilities for realizing image retrieval solutions in the presence of the vast information that can be drawn from social media. The committee considered the main innovation of Dr. Li’s work to be the development of theory and algorithms providing answers to the following challenging research questions: (a) what determines the relevance of a social tag with respect to an image, (b) how to fuse tag relevance estimators, (c) which social images are informative negative examples for concept learning, (d) how to exploit socially tagged images for visual search, and (e) how to personalize automatic image tagging with respect to a user’s preferences. The significance of the developed theory and algorithms lies in their power to enable effective and efficient deployment of the information collected from social media to enhance the datasets that can be used to learn automatic image indexing mechanisms (visual concept detection) and to make this learning more personalized for the user. Dr. Xirong Li received the B.Sc. and M.Sc. degrees from Tsinghua University, China, in 2005 and 2007, respectively, and the Ph.D. degree from the University of Amsterdam, The Netherlands, in 2012, all in computer science. He is currently an Assistant Professor in the Key Lab of Data Engineering and Knowledge Engineering, Renmin University of China. His research interests are image search and multimedia content analysis. Dr. Li received the IEEE Transactions on Multimedia Prize Paper Award 2012, a Best Paper nomination at the ACM International Conference on Multimedia Retrieval 2012, the Chinese Government Award for Outstanding Self-Financed Students Abroad 2011, and the Best Paper Award of the ACM International Conference on Image and Video Retrieval 2010. He served as publicity co-chair for ICMR 2013.

Panel: Cross-Media Analysis and Mining
Wednesday, Oct. 23, 2013
Panelists: Mark Zhang, Alberto del Bimbo, Selcuk Candan, Alexander Hauptmann, Ramesh Jain, Alexis Joly, Yueting Zhuang

Motivation: Today there are lots of heterogeneous and homogeneous media data from multiple sources, such as news media websites, microblogs, mobile phones, social networking websites, and photo/video sharing websites. Integrated together, these media data represent different aspects of the real world and help document its evolution. Consequently, it is impossible to correctly conceive and appropriately understand the world without exploiting the data available from these different sources of rich multimedia content simultaneously and synergistically. Cross-media analysis and mining is a research area in the general field of multimedia content analysis which focuses on exploiting data with different modalities from multiple sources simultaneously and synergistically to discover knowledge and understand the world.
Specifically, we emphasize two essential elements in the study of cross-media analysis that help differentiate it from the rest of the research in multimedia content analysis or machine learning. The first is the simultaneous co-existence of data from two or more different data sources. This element captures the concept of “cross”, e.g., cross-modality, cross-source, and cross-space (from cyberspace to reality). Cross-modality means that heterogeneous features are obtained from data in different modalities; cross-source means that the data may be obtained across multiple sources (domains or collections); cross-space means that the virtual world (i.e., cyberspace) and the real world (i.e., reality) complement each other. The second is the leverage of different types of data across multiple sources to strengthen knowledge discovery, for example, discovering the (latent) correlation or synergy between data with different modalities across multiple sources, transferring the knowledge learned in one domain (e.g., a modality or a space) to generate knowledge in another related domain, and generating a summary from the data of multiple sources. These two essential elements help promote cross-media analysis and mining as a new, emerging, and important research area in today’s multimedia research. With its emphasis on knowledge discovery, cross-media analysis differs from traditional research areas such as cross-lingual translation. On the other hand, with its general scenario of leveraging different types of data across multiple sources to strengthen knowledge discovery, cross-media analysis and mining addresses a broader series of problems than traditional research areas such as transfer learning. Overall, cross-media analysis and mining is beneficial for many applications in data mining, causal inference, machine learning, multimedia, and public security.
Like other emerging hot topics in multimedia research, cross-media analysis and mining also has a number of fundamental and controversial issues that must be addressed in order to have a full and complete understanding of research in this topic. These issues include, but are not limited to: whether there exists a unified representation or modeling for the same semantic concept across different media, and if so, what that unified representation or modeling is; whether there exists any “law” that governs topic evolution and development over time in different media, and if so, what that “law” is and how it is formulated; and whether there exists a mapping for a conceptual or semantic activity between cyberspace and the real world, and if so, what that mapping is and how it is developed and formulated.

Brave New Idea Program

The Brave New Ideas program addressed long-term research challenges, pointed to new research directions, or provided new insights or brave perspectives that pave the way to innovation. The selection process was different from that for regular papers. First, a 2-page abstract was requested. A first selection was then performed; full papers were required for the selected abstracts and were reviewed in a second stage. We received 38 submissions in the first stage, and 14 were invited to submit full papers for the second reviewing stage. Finally, 6 papers were accepted, forming two sessions of oral presentations.

Multimedia Grand Challenge Solutions

We received the six challenges shown below for the Multimedia Grand Challenge Solutions program.

  1. NHK – Where is beauty? Grand Challenge
  2. Technicolor – Rich Multimedia Retrieval from Input Videos Grand Challenge
  3. Yahoo! – Large-scale Flickr-tag Image Classification Grand Challenge
  4. Huawei/3DLife – 3D human reconstruction and action recognition Grand Challenge
  5. MediaMixer/VideoLectures.NET – Temporal Segmentation and Annotation Grand Challenge
  6. Microsoft: MSR – Bing Image Retrieval Grand Challenge

We received 34 proposals for this program, and 14 of them were accepted for presentation. To promote submissions, all presentations in this program were recognized as Multimedia Grand Challenge Finalists. The best prize and two second-best prizes were chosen and awarded. At Technicolor’s request, a Grand Challenge Multimodal Prize was also chosen and awarded.

Technical Demonstrations

We received 80 excellent technical demonstration proposals, a number in line with the demonstrations received the previous year. Three reviewers were assigned to each demo proposal, and 40 proposals were finally chosen. The best demo prize was awarded.

Open Source Software Competition

This year was the 6th edition of the Open Source Software Competition as part of the ACM Multimedia program. The goal of this competition is to recognize the invaluable contribution of researchers and software developers who advance the field by providing the community with implementations of codecs, middleware, frameworks, toolkits, libraries, applications, and other multimedia software. This year we received 16 submissions; after assigning three reviewers to each, we selected 11 for the competition. The best open source software was awarded.

Doctoral Symposium

The Doctoral Symposium was meant as a forum for mentoring graduate students. It was held in the afternoon of Oct. 25 in both oral and poster formats. We received 19 proposals and accepted 13 presentations (6 oral + poster and 7 additional posters). Additionally, a Doctoral Symposium lunch was organized at which the students had the opportunity to talk to their assigned mentors. Finally, the best doctoral symposium paper was awarded.

Multimedia Art Exhibition and Reception

ACM Multimedia provided a rich Multimedia Art Exhibition to stimulate artists and researchers alike to meet and discover the frontiers of multimedia artistic communication.
The Art Exhibition attracted significant work from a variety of digital artists collaborating with research institutions. We endeavored to select exhibits that achieved an interesting balance between technology and artistic intent. The techniques underpinning these artworks are relevant to several technical tracks of the conference, in particular those dealing with human-centered and interactive media. We had a satellite venue for the art exhibition, FAD (Foment de les Arts i del Disseny), located in the center of the city with very good public access. The exhibition was open from Oct. 21 to Oct. 28 and was visited by more than 2,000 people. The reception event was held with the artists on Oct. 23. We selected 10 artworks for the exhibition:

  1. Emotion Forecast, Maurice Benayoun (City University of Hong Kong)
  2. Critical, Anabela Costa (France)
  3. Smile-Wall, Shen-Chi Chen, He-Lin Luo, Kuan-Wen Chen, Yu-Shan Lin, Hsiao-Lun Wang, Che-Yao Chan, Kai-Chih Huang, Yi-Ping Hung (National Taiwan University)
  4. SOMA, Guillaume Faure (France)
  5. A Feast of Shadow Puppetry, Zhenzhen Hu, Min Lin, Si Liu, Jiangguo Jiang, Meng Wang, Richang Hong, Shuicheng Yan (Hefei University of Technology and NUS)
  6. Tele Echo Tube, Hill Hiroki Kobayashi, Kaoru Saito, Akio Fujiwara (University of Tokyo)
  7. 3D-Stroboscopy, Sujin Lee (Sogang University, South Korea)
  8. The Qi of Calligraphy, He-Lin Luo, Yi-Ping Hung (National Taiwan University), I-Chun Chen (Tainan National University of the Arts)
  9. Gestural Pen Animation, Sheng-Ying Pao and Kent Larson (MIT Media Lab, USA)
  10. MixPerceptions, Jose San Pedro (Telefonica Research, Spain), Aurelio San Pedro (Escola Massana, Barcelona), Juan Pablo Carrascal (UPF, Barcelona), Matylda Szmukier (Telefonica Research, Spain)

Attending the Art Exhibition

San Pedro’s Mix Perceptions


Tutorials

We received 14 tutorial proposals and selected 8 tutorials for the main program. All tutorials were half day and were held on Oct. 21 and 22, in parallel with the workshops, in the Universitat Pompeu Fabra – Balmes building. Tutorials were made free for all participants, and we received 312 pre-registrations.

Tutorial 1 Foundations and Applications of Semantic Technologies for Multimedia Content
Ansgar Scherp (Uni Mannheim, Germany)
Tutorial 2 Towards Next-Generation Multimedia Recommendation Systems
Jialie Shen (SMU, Singapore)
Shuicheng Yan (NUS)
Xian-Sheng Hua (Microsoft)
Tutorial 3 Crowdsourcing for Multimedia Research
Mohammad Soleymani (Imperial College London)
Martha Larson (TU Delft)
Tutorial 4 Massive-Scale Multimedia Semantic Modeling
John R. Smith (IBM Research)
Liangliang Cao (IBM Research)
Tutorial 5 Social Interactions over Geographic-Aware Multimedia Systems
Roger Zimmermann (NUS)
Yi Yu (NUS)
Tutorial 6 Multimedia Information Retrieval: Music and Audio
Markus Schedl (JKU Linz)
Emilia Gomez (UPF)
Masataka Goto (AIST)
Tutorial 7 Blending the Physical and the Virtual in Musical Technology: From interface design to multimodal signal processing
George Tzanetakis (U Victoria, Canada)
Sidney Fels (UBC)
Michael Lyons (Ritsumeikan U, JP)
Tutorial 8 Privacy Concerns of Sharing Multimedia in Social Networks
Gerald Friedland (ICSI)

Workshops

Workshops have always been an important part of the conference. Below is the list of workshops held in conjunction with ACM Multimedia 2013. We had 9 full-day workshops and 4 half-day workshops, held on Oct. 21-22 in parallel with the tutorials. Following last year’s rule, two complimentary workshop-only registrations were provided for the invited talks of each workshop, to encourage the participation of notable speakers.

Full Day Workshops (9)

  1. 2nd International Workshop on Socially-Aware Multimedia (SAM 2013) Organizers: Pablo Cesar (CWI, NL), Matthew Cooper (FXPAL), David A. Shamma (Yahoo!), Doug Williams (BT)
  2. 4th ACM/IEEE ARTEMIS 2013 International Workshop on Analysis and Retrieval of Tracked Events and Motion in Imagery Streams Organizers: Marco Bertini (University of Florence, Italy), Anastasios Doulamis (TU Crete, Greece), Nikolaos Doulamis (Cyprus University of Technology, Cyprus), Jordi Gonzàlez (Universitat Autònoma de Barcelona, Spain), Thomas Moeslund (University of Aalborg, Denmark)
  3. 5th International Workshop on Multimedia for Cooking and Eating Activities (CEA2013) Organizer: Kiyoharu Aizawa (Univ. of Tokyo, JP)
  4. 4th International Workshop on Human Behavior Understanding (HBU 2013) Organizers: Albert Ali Salah (Boğaziçi Univ., Turkey), Hayley Hung (Delft Univ. of Technology, The Netherlands), Oya Aran (Idiap Research Institute, Switzerland), Hatice Gunes (Queen Mary Univ. of London (QMUL), UK)
  5. International ACM Workshop on Crowdsourcing for Multimedia 2013 (CrowdMM 2013) Organizers: Wei-Ta Chu (National Chung Cheng University, TW), Martha Larson (Delft University of Technology, NL), Kuan-Ta Chen (Academia Sinica, TW)
  6. First ACM MM Workshop on Multimedia Indexing and Information Retrieval for Healthcare (ACM MM MIIRH) Organizers: Jenny Benois-Pineau (University of Bordeaux 1, France), Alexia Briasouli (CERTH-ITI), Alex Hauptmann (Carnegie Mellon University, USA)
  7. Workshop on Personal Data Meets Distributed Multimedia Organizers: Vivek Singh (MIT, USA), Tat-Seng Chua (NUS), Ramesh Jain (University of California, Irvine, USA), Alex (Sandy) Pentland (MIT, USA)
  8. Workshop on Immersive Media Experiences Organizers: Teresa Chambel (University of Lisbon, Portugal), V. Michael Bove (MIT Media Lab, USA), Sharon Strover (University of Texas at Austin, USA), Paula Viana (Polytechnic of Porto and INESC TEC, Portugal), Graham Thomas (BBC, UK)
  9. Workshop on Event-based Media Integration and Processing Organizers: Fausto Giunchiglia (University of Trento, Italy), Sang “Peter” Chin (Johns Hopkins University, US), Giulia Boato (University of Trento, Italy), Bogdan Ionescu (University Politehnica of Bucharest, Romania), Yiannis Kompatsiaris (Centre for Research and Technology Hellas, Greece)

Half Day Workshops (4)

  1. ACM Multimedia Workshop on Geotagging and Its Applications Organizers: Liangliang Cao (IBM T. J. Watson Research Center, USA), Gerald Friedland (International Computer Science Institute, USA), Pascal Kelm (Technische Universität Berlin, Germany)
  2. Data-driven challenge-based workshop at ACM MM 2013 (AVEC 2013) Organizers: Björn Schuller (TUM, Germany), Michel Valstar (University of Nottingham, UK), Roddy Cowie (Queen’s University Belfast, UK), Maja Pantic (Imperial College London, UK), Jarek Krajewski (University of Wuppertal, Germany)
  3. 2nd ACM International Workshop on Multimedia Analysis for Ecological Data (MAED 2013) Organizers: Concetto Spampinato (University of Catania, Italy), Vasileios Mezaris (CERTH, Greece), Jacco van Ossenbruggen (CWI, The Netherlands)
  4. 3rd International Workshop on Interactive Multimedia on Mobile and Portable Devices (IMMPD’13) Organizers: Jiebo Luo (University of Rochester, USA), Caifeng Shan (Philips Research, The Netherlands), Ling Shao (The University of Sheffield, UK), Minoru Etoh (NTT DOCOMO, Japan)

Awards

Awards were given during the banquet, organized at the conference venue, in almost all the programs except for short papers. The following awards were given:

Best Paper Award: Luoqi Liu, Hui Xu, Junliang Xing, Si Liu, Xi Zhou and Shuicheng Yan, National University of Singapore (NUS), “Wow! You Are So Beautiful Today!”

Best Student Paper Award: Hanwang Zhang, Zheng-Jun Zha, Yang Yang, Shuicheng Yan, Yue Gao and Tat-Seng Chua, National University of Singapore (NUS), “Attributes-augmented Semantic Hierarchy for Image Retrieval”

Grand Challenge 1st Place Award [Sponsored by Technicolor]: Brendan Jou, Hongzhi Li, Joseph G. Ellis, Daniel Morozoff-Abegauz and Shih-Fu Chang, Digital Video & Multimedia (DVMM) Lab, Columbia University, “Structured Exploration of Who, What, When, and Where in Heterogeneous Multimedia News Sources”

Grand Challenge 2nd Place Award [Sponsored by Technicolor]: Subhabrata Bhattacharya, Behnaz Nojavanasghari, Tao Chen, Dong Liu, Shih-Fu Chang, Mubarak Shah, University of Central Florida and Columbia University, “Towards a Comprehensive Computational Model for Aesthetic Assessment of Videos”

Grand Challenge 3rd Place Award [Sponsored by Technicolor]: Shannon Chen, Penye Xia, and Klara Nahrstedt, UIUC, “Activity-Aware Adaptive Compression: A Morphing-Based Frame Synthesis Application in 3DTI”


Program chairs during the banquet

Award ceremony

Banquet venue

Social program


Grand Challenge Multimodal Award [Sponsored by Technicolor]: Chun-Che Wu, Kuan-Yu Chu, Yin-Hsi Kuo, Yan-Ying Chen, Wen-Yu Lee, Winston H. Hsu, National Taiwan University, Taiwan, “Search-Based Relevance Association with Auxiliary Contextual Cues”

Best Demo Award: Duong-Trung-Dung Nguyen, Mukesh Saini, Vu-Thanh Nguyen, Wei Tsang Ooi, National University of Singapore (NUS), “Jiku director: An online mobile video mashup system”

Best Doctoral Symposium Paper: Jules Francoise, Institut de Recherche et Coordination Acoustique/Musique (IRCAM), “Gesture-Sound Mapping by Demonstration in Interactive Music Systems”

Best Open Source Software Award: Dmitry Bogdanov, Nicolas Wack, Emilia Gómez, Sankalp Gulati, Perfecto Herrera, Oscar Mayor, Gerard Roma, Justin Salamon, Jose Zapata, Xavier Serra (UPF), “ESSENTIA: An Audio Analysis Library for Music Information Retrieval”

Prize amounts:

Best Paper Award 500 euro
Best Student Paper Award 250 euro
Grand Challenge 1st Prize 750 euro
Grand Challenge 2nd Prize 500 euro
Grand Challenge 3rd Prize 200 euro
Grand Challenge Multimodal Prize 500 euro
Best Technical Demo Award 250 euro
Best Doctoral Symposium Paper 250 euro
Best Open Source Software Award 250 euro
Student Travel Grant (35 students) $26,000 ($10,000 NSF, $16,000 SIGMM)

Sponsors

We had incredible support from industry and funding organizations (38.5k euro). All the sponsors and the institutional supporters are listed in Appendix B. The sponsorship amount for each individual sponsor is as follows:

Sponsor Amount
FXPAL 5000 euro
Google 5000 euro
Huawei 5000 euro
Yahoo!Labs 5000 euro
Technicolor 4000 euro
Media Mixer 3500 euro
INRIA 3000 euro
Facebook 2000 euro
IBM 2000 euro
Telefonica 2000 euro
Microsoft 2000 euro
Total 38500 euro
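As a consistency check (not part of the original report), the individual sponsorships listed above do sum to the reported total:

```python
# Verify that the per-sponsor amounts sum to the reported 38,500 euro total.
sponsorships = {
    "FXPAL": 5000, "Google": 5000, "Huawei": 5000, "Yahoo!Labs": 5000,
    "Technicolor": 4000, "Media Mixer": 3500, "INRIA": 3000,
    "Facebook": 2000, "IBM": 2000, "Telefonica": 2000, "Microsoft": 2000,
}
total = sum(sponsorships.values())
print(total)  # 38500
```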

The benefits for the sponsors were complimentary registrations and publicity: the company logo was published on the conference website, in the proceedings, and in the booklet. On top of these amounts, we received $16,000 from SIGMM and $10,000 from NSF for student travel grants.

Geographical distribution of the participants

We had 544 participants at the main conference and workshops. The main conference was attended by 476 participants, of which 425 paid and 51 were special cases (sponsors, student volunteers, etc.); 68 participants attended only the workshops. The tutorials, which were free of charge, had 312 advance registrations. The country-wise distribution is shown below; it is wide, meaning that we managed to attract participants from a large number of countries.

Total # of participants: 544
USA 75 Switzerland 20
Singapore 48 Germany 20
China 45 Portugal 20
Japan 40 Taiwan 18
UK 35 Korea 15
Italy 29 Australia 15
France 28 Greece 14
Netherlands 26 Turkey 14
Spain 26 25 other countries 56

Survey

In order to gather opinions from the participants at ACM Multimedia 2013, we performed a post-conference survey; the results are summarized in Appendix C. Here we summarize the 10 most important issues compiled after analyzing the answers received. This effort is the first of its kind at ACM Multimedia, and we hope the tradition will be continued in the future. In our opinion, the results of the survey represent a very good source of information for future organizers.

  1. Poster space too small
  2. Many people still want USB proceedings!!
  3. Oral topics in the same time slot overlapped too much. Need to diversify.
  4. Need to attract more multimedia niche topics. Should not become a second-rate CV conference
  5. First day location hard to find. Workshop/tutorial better to be co-located with main conference
  6. Senior members of MM community should participate in paper sessions more
  7. Need to update web site program content and make it available earlier
  8. Consider offering short spotlight talks for poster papers
  9. Keep 15 mins for oral, but have them presented again in poster session for more discussion
  10. SIGMM business meeting too long. Not enough time for QA.

Conclusion

ACM Multimedia 2013 was a great success, with a great number of submissions, an excellent technical program, attractive program components, and stimulating events. As a result, we welcomed a large number of participants, in line with our initial expectations. There were a few problems (see above), but this is only natural. We greatly acknowledge those who have contributed to the success of ACM Multimedia 2013. We thank the organizers of ACM Multimedia 2012 for their useful suggestions and comments, which helped us to improve the organization of the 2013 edition, and for giving us the template for the conference booklet. We thank the many paper authors and proposal contributors for the various technical and program components. We thank the large number of volunteers, including the Organizing Committee members and Technical Program Committee members, who worked very hard to create this year’s outstanding conference. Every aspect of the conference was also aided by the local committee members and by the hard work of Grupo Pacifico, to whom we are very grateful. We also thank the ACM staff and Sheridan Printing Company for their constant support. This success was clearly due to the integration of their efforts.

Appendix A: ACM MULTIMEDIA 2013 CONFERENCE ORGANIZATION

General Co-Chairs
Alejandro (Alex) Jaimes (Yahoo Labs, Spain)
Nicu Sebe (University of Trento, Italy)
Nozha Boujemaa (INRIA, France)

Technical Program Co-Chairs
Daniel Gatica-Perez (IDIAP & EPFL, Switzerland)
David A. Shamma (Yahoo Labs, USA)
Marcel Worring (University of Amsterdam, The Netherlands)
Roger Zimmermann (National University of Singapore, Singapore)

Author’s Advocate
Pablo Cesar (CWI, The Netherlands)

Multimedia Grand Challenge Co-Chairs
Yiannis Kompatsiaris (CERTH, Greece)
Neil O’Hare (Yahoo Labs, Spain)

Interactive Arts Co-Chairs
Antonio Camurri (University of Genova, Italy)
Marc Cavazza (Teesside University, UK)

Local Arrangement Chair
Mari-Carmen Marcos (Pompeu Fabra University, Spain)

Sponsorship Chairs
Ricardo Baeza-Yates (Yahoo Labs, Spain)
Bernard Merialdo (Eurecom, France)

Panel Co-Chairs
Yong Rui (Microsoft, China)
Winston Hsu (National Taiwan University, Taiwan)
Michael Lew (University of Leiden, The Netherlands)

Video Program Chairs
Alexis Joly (INRIA, France)
Giovanni Maria Farinella (University of Catania, Italy)
Julien Champ (INRIA/LIRMM, France)

Brave New Ideas Co-Chairs
Jiebo Luo (University of Rochester, USA)
Shuicheng Yan (National University of Singapore, Singapore)

Doctoral Symposium Chairs
Hayley Hung (Technical University of Delft, The Netherlands)
Marco Cristani (University of Verona, Italy)

Open Source Competition Chairs
Ioannis (Yiannis) Patras (Queen Mary University, UK)
Andrea Vedaldi (Oxford University, UK)

Tutorial Co-Chairs
Kiyoharu Aizawa (University of Tokyo, Japan)
Lexing Xie (Australian National University, Australia)

Workshop Co-Chairs
Maja Pantic (Imperial College, UK)
Vladimir Pavlovic (Rutgers University, USA)

Student Travel Grants Co-Chairs
Ramanathan Subramanian (ADSC, Singapore)
Jasper Uijlings (University of Trento, Italy)

Publicity Co-Chairs
Marco Bertini (University of Florence, Italy)
Ichiro Ide (Nagoya University, Japan)

Technical Demo Co-Chairs
Yi Yang (Carnegie Mellon University, USA)
Xavier Anguera (Telefonica Research, Spain)

Proceedings Co-Chairs
Bogdan Ionescu (University Politehnica of Bucharest, Romania)
Qi Tian (University of Texas San Antonio, USA)

Web Chair
Michele Trevisol (Web Research Group UPF & Yahoo Labs, Spain)

Appendix B. ACM MM 2013 Sponsors & Supporters

Editorial

Dear Member of the SIGMM Community, welcome to the last issue of the SIGMM Records in 2013.

The editors of the Records have taken to a classical reporting approach, and you can read here the first of a series of interviews. In this issue, Cynthia Liem is interviewed by Mathias Lux and talks about the Phenicx project.

We have received a report from the first international competition on game-based learning applications, as well as our regular column reporting from the 106th MPEG meeting, which was held in Geneva. In this issue, our open source column presents libraries and tools for threading and visualizing a large video collection, a set of tools that will be useful for many in the community. Beyond that, you can also read about two PhD theses.

Among the announcements are several open positions and a long list of calls for papers. The long list of calls is the result of a policy change in SIGMM. After several years in which our two public mailing lists, sigmm@pi4.informatik.tu-mannheim.de and mm-interest@acm.org, were flooded by calls for papers, the board and online services editors have decided to change the posting policy. Both lists are now closed for public submissions of calls for papers and participation. Instead, calls must be submitted through the SIGMM Records web page and will be distributed on the mailing lists in a weekly digest. We hope that the members of the SIG appreciate this service, and that those of us who have filtered emails for years feel that this is a more appropriate policy.

With this news, we invite you to read on in this issue of the Records.

The Editors
Stephan Kopf, Viktor Wendel, Lei Zhang, Pradeep Atrey, Christian Timmerer, Pablo Cesar, Mathias Lux, Carsten Griwodz

ACM TOMM (TOMCCAP) Call for Special Issue Proposals

ACM TOMM is one of the world’s leading journals on multimedia. As in previous years, we are planning to publish a special issue in 2015. Proposals are accepted until May 1st, 2014. Each special issue is the responsibility of its guest editors. If you wish to guest-edit a special issue, you should prepare a proposal as outlined below and send it via e-mail to the Senior Associate Editor (SAE) for Special Issue Management of TOMM, Shervin Shirmohammadi (shervin@discover.uottawa.ca).

Call for Proposals – Special Issue
Deadline for Proposal Submission: May 1st, 2014
Notification: June 1st, 2014
http://tomccap.acm.org/
Proposals should:

  • Cover a current or emerging topic in the area of multimedia computing, communications and applications;
  • Set out the importance of the special issue’s topic in that area;
  • Give a strategy for the recruitment of high quality papers;
  • Indicate a draft timeline in which the special issue could be produced (paper writing, reviewing, and submission of final copies to TOMM), assuming the proposal is accepted.
  • Include the list of the proposed guest editors, their short bios, and their experience as related to the Special Issue’s topic

As in previous years, the special issue will be published as an online-only issue in the ACM Digital Library. This gives the guest editors greater flexibility in the review process and in the number of papers to be accepted, while still ensuring timely publication.

The proposals will be reviewed by the SAE together with the Editor-in-Chief (EiC). The final decision will be made by the EiC. Notification of acceptance will be given by June 1st, 2014. Once a proposal is accepted, we will contact you to discuss the further process.

For questions please contact:

  • Shervin Shirmohammadi – Senior Associate Editor for Special Issue Management ( shervin@discover.uottawa.ca )
  • Ralf Steinmetz – Editor in Chief (EiC) ( steinmetz.eic@kom.tu-darmstadt.de )
  • Sebastian Schmidt – Information Director ( TOMCCAP@kom.tu-darmstadt.de )

VIREO-VH: Libraries and Tools for Threading and Visualizing a Large Video Collection

Introduction

“Video Hyperlinking” refers to the creation of links connecting videos that share near-duplicate segments. Like hyperlinks in HTML documents, video links help users navigate videos of similar content and facilitate the mining of iconic clips (or visual memes) spread among videos. Figure 1 shows some examples of iconic clips, which can be leveraged for linking videos; the results are potentially useful for multimedia tasks such as video search, mining, and analytics.

VIREO-VH [1] is open source software developed by the VIREO research team. The software provides end-to-end support for the creation of hyperlinks, including libraries and tools for threading and visualizing videos in a large collection. The major software components are near-duplicate keyframe retrieval, partial near-duplicate localization with time alignment, and galaxy visualization. These functionalities are mostly implemented based on state-of-the-art technologies, and each is developed as an independent tool with flexibility in mind, so that users can substitute any of the components with their own implementation. Earlier versions of the software are LIP-VIREO and SOTU, which have been downloaded more than 3,500 times. VIREO-VH has been used internally by VIREO since 2007 and has evolved over the years based on the experience of developing various multimedia applications, such as news event evolution analysis, novelty reranking, multimedia-based question answering [2], cross-media hyperlinking [3], and social video monitoring.

Figure 1: Examples of iconic clips.

Functionality

The software components include video pre-processing, bag-of-words based inverted file indexing for scalable near-duplicate keyframe search, localization of partial near-duplicate segments [4], and galaxy visualization of a video collection, as shown in Figure 2. The open source release includes over 400 methods with 22,000 lines of code.

The workflow of the open source software is as follows. Given a collection of videos, the visual content is indexed based on a bag-of-words (BoW) representation. Near-duplicate keyframes are retrieved and then temporally aligned in a pairwise manner among videos. Segments of a video that are near-duplicates of segments in other videos in the collection are then hyperlinked, with the start and end times of the segments explicitly logged. The end product is a galaxy browser, where the videos are visualized as a galaxy of clusters in a Web browser, with each cluster being a group of videos that are hyperlinked directly or indirectly through transitivity propagation. User-friendly interaction is provided so that end users can zoom in and out, taking either a quick glance at or a close inspection of the video relationships.

Figure 2: Overview of VIREO-VH software architecture.

Interface

VIREO-VH can be used either as an end-to-end system that takes a video collection as input and outputs visual hyperlinks, or as independent functions for the development of different applications.

For content owners interested in content-wise analysis of a video collection, VIREO-VH can be used as an end-to-end system by simply specifying the location of the video collection and the output paths (Figure 3). The resulting output can then be viewed with the provided interactive interface, which gives a glimpse of the video relationships in the collection.

Figure 3: Interface for end-to-end processing of video collection.

VIREO-VH also provides libraries that grant researchers programmatic access. The libraries consist of various classes (e.g., Vocab, HE, Index, SearchEngine and CNetwork) providing functions for vocabulary and Hamming signature training [5], keyframe indexing, near-duplicate keyframe searching, and video alignment; users can refer to the manual for details. Furthermore, the components of VIREO-VH are developed independently for flexibility, so users can substitute any of the components with their own implementation. This capability is particularly useful for benchmarking a user’s own choice of algorithms. As an example, users can choose their own visual vocabulary and Hamming median, but use the open source software for building the index and retrieving near-duplicate keyframes. The following few lines of code implement a typical image retrieval system:

#include "Vocab_Gen.h"
#include "Index.h"
#include "HE.h"
#include "SearchEngine.h"
...
// train visual vocabulary using descriptors in folder "dir_desc";
// here we choose to train a hierarchical vocabulary with 1M leaf nodes
// (3 layers, 100 nodes / layer)
Vocab_Gen::genVoc("dir_desc", 100, 3);

// load pre-trained vocabulary from disk
Vocab* voc = new Vocab(100, 3, 128);
voc->loadFromDisk("vk_words/");

// Hamming Embedding training for the vocabulary
HE* he = new HE(32, 128, p_mat, 1000000, 12);
he->train(voc, "matrix", 8);

// index the descriptors with an inverted file
Index::indexFiles(voc, he, "dir_desc/", ".feat", "out_dir/", 8);

// load index and conduct online search for images in "query_desc"
SearchEngine* engine = new SearchEngine(voc, he);
engine->loadIndexes("out_dir/");
engine->search_dir("query_desc", "result_file", 100);
...

Example

We use a video collection consisting of 220 videos (around 31 hours) as an example. The collection was crawled from YouTube using the keyword “economic collapse”. Using our open source software and default parameter settings, a total of 35 partial near-duplicate (ND) segments were located, resulting in 10 visual clusters (or snippets). Figure 4 shows two examples of the snippets. Based on our experiments, the precision of ND localization is as high as 0.95 and the recall is 0.66. Table 1 lists the running time for each step. The experiment was conducted on a PC with a dual-core 3.16 GHz CPU and 3 GB of RAM. In total, creating a galaxy view for 31.2 hours of video (more than 4,000 keyframes) could be completed within 2.5 hours using our open source software. More details can be found in [6].

Pre-processing 75 minutes
ND Retrieval 59 minutes
Partial ND localization 8 minutes
Galaxy Visualization 55 seconds

Table 1: The running time for processing 31.2 hours of videos.

Figure 4: Examples of visual snippets mined from a collection of 220 videos. For ease of visualization, each cluster is tagged with a timeline description from Wikipedia using the techniques developed in [3].

Acknowledgements

The open source software described in this article was fully supported by a grant from the Research Grants Council of the Hong Kong Special Administrative Region, China (CityU 119610).

References

[1] http://vireo.cs.cityu.edu.hk/VIREO-VH/

[2] W. Zhang, L. Pang and C. W. Ngo. Snap-and-Ask: Answering Multimodal Question by Naming Visual Instance. ACM Multimedia, Nara, Japan, October 2012. Demo

[3] S. Tan, C. W. Ngo, H. K. Tan and L. Pang. Cross Media Hyperlinking for Search Topic Browsing. ACM Multimedia, Arizona, USA, November 2011. Demo

[4] H. K. Tan, C. W. Ngo, R. Hong and T. S. Chua. Scalable Detection of Partial Near-Duplicate Videos by Visual-Temporal Consistency. In ACM Multimedia, pages 145-154, 2009.

[5] H. Jegou, M. Douze, and C. Schmid. Improving bag-of-features for large scale image search. IJCV,87(3):192-212, May 2010.

[6] L. Pang, W. Zhang and C. W. Ngo. Video Hyperlinking: Libraries and Tools for Threading and Visualizing a Large Video Collection. ACM Multimedia, Nara, Japan, Oct 2012.

A report from the First International Competition on Game-Based Learning Applications

The European Conference on Game Based Learning is an academic conference that has been held annually at various European universities since 2006. For the first time this year, the Programme Committee, together with Segan (Serious Games Network, https://www.facebook.com/groups/segan), decided to launch a competition at the conference for the best educational game. The aims of the competition were:

  • To provide an opportunity for educational game designers and creators to participate in the conference and demonstrate their game design and development skills in an international competition;
  • To provide an opportunity for GBL creators to peer-assess and peer-evaluate their games;
  • To provide ECGBL attendees with engaging, best-practice games that showcase exemplary applications of GBL.

In the first instance, prospective participants were asked to submit a 1000-word extended abstract giving an overview of the game itself, how it is positioned in terms of related work, and what its unique educational contribution is. We received 56 applications, which were reduced to 22 finalists who were invited to the conference to present their games. Four judges, in two teams, assessed the games based on a comprehensive set of criteria, including sections on learning outcomes, usability, and socio-cultural aspects. A shortlist of 6 games was then revisited by all the judges during an open demonstration session in which conference participants were also welcome to take part. First, Second and Third place awards were given, and two Highly Commended certificates were presented. The top three games were quite different in terms of target audience and format.

Third place

In third place was an app-based early learning game called Lipa Eggs developed by Ian Hook and Roman Hodek from Lipa Learning in the Czech Republic. This game was designed to help pre-school children with colour mixing and recognition and was delivered via a tablet. The gameplay takes the form of a graduated learning system which first allows children to develop the skills to play the game and then develops the learning process to encourage players to find new solutions. More information about the game can be found at http://www.lipalearning.com/game/lipa-eggs

Second place

In second place was a non-digital game called ChemNerd developed by Jakob Thomas Holm from Sterskov Efterskole (a secondary school in Denmark specializing in game-based learning). This game was designed to help teach the periodic table to secondary school students and was presented as a multi-level card game. The game utilizes competition and face to face interaction between students to teach them complicated chemical theory over six phases beginning with a memory challenge and ending with a practical experiment. A video illustrating the game can been seen at http://youtu.be/XD6BPrJyxlc

Winners

The winner was a computer game called Mystery of Taiga River, developed by Sasha Barab and Anna Arici from Arizona State University in the USA. The game aimed to teach ecological studies to secondary school students and was presented as a game-based immersive world where students become investigative reporters who must investigate, learn, and apply scientific concepts to solve applied problems in a virtual park and restore the health of the dying fish. A video of the game can be seen at http://gamesandimpact.org/taiga_river

Both competitors and conference participants said that they had enjoyed the opportunity of seeing applied educational game development from around the world and the intention is to make this an annual competition associated with the European Conference on Game-Based Learning (ECGBL). The conference in 2014 will be held in Berlin on 30-31 October and the call for games is now open. Details can be found here: http://academic-conferences.org/ecgbl/ecgbl2014/ecgbl14-call-papers.htm

ACM MM 2013 awards

Best Paper Award
—————-
Luoqi Liu, Hui Xu, Junliang Xing, Si Liu, Xi Zhou and Shuicheng Yan
Wow! You Are So Beautiful Today!

http://dl.acm.org/ft_gateway.cfm?id=2502126&ftid=1406045&dwn=1&CFID=257526528&CFTOKEN=87000185

Best Student Paper Award
————————
Hanwang Zhang, Zheng-Jun Zha, Yang Yang, Shuicheng Yan, Yue Gao and
Tat-Seng Chua
Attribute-augmented Semantic Hierarchy: Towards Bridging Semantic Gap and
Intention Gap in Image Retrieval

http://dl.acm.org/ft_gateway.cfm?id=2502093&ftid=1406048&dwn=1&CFID=257526528&CFTOKEN=87000185

Grand Challenge 1st Place Award
——————————-
Brendan Jou, Hongzhi Li, Joseph G. Ellis, Daniel Morozoff-Abegauz and
Shih-Fu Chang
Structured Exploration of Who, What, When, and Where in Heterogenous
Multimedia News Sources

http://dl.acm.org/ft_gateway.cfm?id=2508118&ftid=1406083&dwn=1&CFID=257526528&CFTOKEN=87000185

Grand Challenge 2nd Place Award
——————————-
Subhabrata Bhattacharya, Behnaz Nojavanasghari, Tao Chen, Dong Liu,
Shih-Fu Chang, Mubarak Shah
Towards a Comprehensive Computational Model for Aesthetic Assessment of
Videos

http://dl.acm.org/ft_gateway.cfm?id=2508119&ftid=1406084&dwn=1&CFID=257526528&CFTOKEN=87000185

Grand Challenge 3rd Place Award
——————————-
Shannon Chen, Penye Xia, and Klara Nahrstedt
Activity-Aware Adaptive Compression: A Morphing-Based Frame Synthesis
Application in 3DTI

http://dl.acm.org/ft_gateway.cfm?id=2508116&ftid=1406081&dwn=1&CFID=257526528&CFTOKEN=87000185

Grand Challenge Multimodal Award
——————————–
Chun-Che Wu, Kuan-Yu Chu, Yin-Hsi Kuo, Yan-Ying Chen, Wen-Yu Lee, Winston
H. Hsu
Search-Based Relevance Association with Auxiliary Contextual Cues

http://dl.acm.org/ft_gateway.cfm?id=2508127&ftid=1406092&dwn=1&CFID=257526528&CFTOKEN=87000185

Best Demo Award
—————
Duong-Trung-Dung Nguyen, Mukesh Saini, Vu-Thanh Nguyen, Wei Tsang Ooi
Jiku director: An online mobile video mashup system

http://dl.acm.org/ft_gateway.cfm?id=2502277&ftid=1406132&dwn=1&CFID=257526528&CFTOKEN=87000185

Best Doctoral Symposium Paper
—————————–
Jules Francoise
Gesture-Sound Mapping by demonstration in Interactive Music Systems

http://dl.acm.org/ft_gateway.cfm?id=2502214&ftid=1406247&dwn=1&CFID=257526528&CFTOKEN=87000185

Best Open Source Software Award
——————————-
Dmitry Bogdanov, Nicolas Wack, Emilia Gómez, Sankalp Gulati, Perfecto Herrera,
Oscar Mayor, Gerard Roma, Justin Salamon, Jose Zapata, Xavier Serra
ESSENTIA: An Audio Analysis Library for Music Information Retrieval

http://dl.acm.org/ft_gateway.cfm?id=2502229&ftid=1406222&dwn=1&CFID=257526528&CFTOKEN=87000185

Editorial

Dear Member of the SIGMM Community, welcome to the third issue of the SIGMM Records in 2013.

On the verge of ACM Multimedia 2013, we can already present the recipients of SIGMM’s yearly awards: the SIGMM Technical Achievement Award, the SIGMM Best Ph.D. Thesis Award, the TOMCCAP Nicolas D. Georganas Best Paper Award, and the TOMCCAP Best Associate Editor Award.

The TOMCCAP Special Issue on the 20th anniversary of ACM Multimedia is out in October, and you can read both the announcement, and find each of the contributions directly through the TOMCCAP Issue 9(1S) table of contents.

That SIGMM has established a strong foothold in the scientific community can also be seen in the China Computer Federation’s rankings of SIGMM’s venues. Read the article to get even more motivation for submitting your papers to SIGMM’s conferences and journal.

We are also reporting from SLAM, the international workshop on Speech, Language and Audio in Multimedia. Not a SIGMM event, but certainly of interest to many SIGMMers who care about audio technology.

You will also find two PhD thesis summaries, and last but most certainly not least, pointers to the latest issues of TOMCCAP and MMSJ, and several job announcements.

We hope that you enjoy this issue of the Records.

The Editors
Stephan Kopf, Viktor Wendel, Lei Zhang, Pradeep Atrey, Christian Timmerer, Pablo Cesar, Mathias Lux, Carsten Griwodz