Report from ACM ICMR 2017

ACM ICMR 2017 in “Little Paris”

ACM ICMR is the premier International Conference on Multimedia Retrieval, and since 2011 it has aimed to “illuminate the state of the arts in multimedia retrieval”. This year, ICMR was held in a wonderful location: Bucharest, Romania, also known as “Little Paris”. Every year at ICMR I learn something new. And here is what I learnt this year.

Final Conference Shot at UP Bucharest

UNDERSTANDING THE TANGIBLE: objects, scenes, semantic categories – everything we can see.

1) Objects (and Yoda) can be easily tracked in videos.

Arnold Smeulders delivered a brilliant keynote on “things” retrieval: given an object in an image, can we find (and retrieve) it in other images, videos, and beyond? He showed a very interesting technique for tracking objects (e.g. Yoda) in videos, based on similarity learnt through Siamese networks.
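To make the trick concrete, here is a minimal sketch (in PyTorch, my choice; the keynote did not specify an implementation) of how fully-convolutional Siamese trackers of this kind work: embed the template crop (e.g. Yoda in the first frame) and the search region of a later frame with the same shared-weight network, then cross-correlate the two embeddings; the peak of the response map gives the new object location. The architecture and crop sizes below are purely illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SiameseEmbed(nn.Module):
    """Shared embedding branch, applied to both the template and the search region."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 32, 7, stride=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2), nn.ReLU(),
            nn.Conv2d(64, 128, 3), nn.ReLU(),
        )

    def forward(self, x):
        return self.features(x)

def similarity_map(embed, template, search):
    z = embed(template)  # (1, C, h, w): the exemplar embedding
    x = embed(search)    # (1, C, H, W): the search-region embedding
    # Cross-correlation: use the template embedding as a convolution kernel;
    # the peak of the resulting response map is the predicted object location.
    return F.conv2d(x, z)

embed = SiameseEmbed()
template = torch.randn(1, 3, 127, 127)  # object crop from the first frame
search = torch.randn(1, 3, 255, 255)    # larger crop around the last known position
print(similarity_map(embed, template, search).shape)  # response map over positions
```

In a real tracker the embedding is trained so that matching locations score high, and the search window is re-centred on the response peak frame after frame.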

Tracking Yoda with Siamese Networks

2) Wearables + computer vision help explore cultural heritage sites.

As shown in his keynote, Alberto del Bimbo and his amazing team at MICC, University of Florence, have designed smart audio guides for indoor and outdoor spaces. The system detects, recognises, and describes landmarks and artworks from wearable camera inputs (and GPS coordinates, in the case of outdoor spaces).

3) We can finally quantify how much images provide complementary semantics compared to text [BEST MULTIMODAL PAPER AWARD].

For ages, the community has asked how relevant different modalities are for multimedia analysis: this paper (http://dl.acm.org/citation.cfm?id=3078991) finally proposes a solution for quantifying the information gap between different modalities.

4) Exploring news corpora is now very easy: news graphs are easy to navigate and aware of the types of relations between articles.

Remi Bois and his colleagues presented this framework (http://dl.acm.org/citation.cfm?id=3079023), made for professional journalists and the general public, for seamlessly browsing through large-scale news corpora. They built a graph whose nodes are the articles in a news corpus. The most relevant items for each article are chosen (and linked) based on an adaptive nearest-neighbour technique, and each link is then characterised according to the type of relation between the two linked nodes.
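As a rough illustration of the pipeline (an assumption-laden sketch, not the authors’ implementation: the toy corpus, cosine distances over TF-IDF vectors, and the adaptive threshold rule below are all stand-ins):

```python
# Build a small news graph: nodes are articles, edges link each article
# to its nearest neighbours that pass an adaptive distance threshold.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors
import networkx as nx

corpus = [
    "Election results announced in the capital",
    "Opposition reacts to the election results",
    "Storm warning issued for the coast",
    "Coastal towns prepare for the heavy storm",
]

X = TfidfVectorizer().fit_transform(corpus)
nn = NearestNeighbors(n_neighbors=3, metric="cosine").fit(X)
dist, idx = nn.kneighbors(X)  # row i: distances/indices of article i's neighbours

G = nx.DiGraph()
for i, title in enumerate(corpus):
    G.add_node(i, title=title)
    closest = dist[i][1]  # dist[i][0] is the article itself
    for d, j in zip(dist[i][1:], idx[i][1:]):
        # Adaptive rule (illustrative): keep a neighbour only if it is
        # not much farther away than the closest neighbour.
        if d <= 1.5 * closest + 1e-9:
            G.add_edge(i, int(j), similarity=1 - d)

print(G.edges(data=True))
```

The real framework additionally labels each link with the type of relation between the two articles, which is what makes the graph pleasant to navigate.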

5) Outdoor panorama images are much easier to localise.

In his beautiful work (https://t.co/3PHCZIrA4N), Ahmet Iscen from Inria developed an algorithm for location prediction from Street View images, outperforming the state of the art thanks to an intelligent stitching pre-processing step: predicting locations from panoramas (stitched individual views) instead of individual street images improves performance dramatically!
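The stitching pre-processing step can be prototyped with off-the-shelf tools. Here is a tiny sketch using OpenCV’s generic stitcher as a stand-in (an assumption: the paper’s actual stitching pipeline is not described here, and the file names are placeholders):

```python
import cv2

# Individual views taken at (roughly) the same location, e.g. street-level tiles.
views = [cv2.imread(p) for p in ["view_0.jpg", "view_1.jpg", "view_2.jpg"]]

stitcher = cv2.Stitcher_create()
status, panorama = stitcher.stitch(views)
if status == cv2.Stitcher_OK:
    # The panorama, not the individual views, is what gets fed
    # to the location-prediction model.
    cv2.imwrite("panorama.jpg", panorama)
else:
    print("Stitching failed with status", status)
```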

UNDERSTANDING THE INTANGIBLE: artistic aspects, beauty, intent – everything we can perceive.

1) Image search intent can be predicted by the way we look.

In his best-paper-candidate work (http://dl.acm.org/citation.cfm?id=3078995), Mohammad Soleymani showed that image search intent (seeking information, finding content, or re-finding content) can be predicted from physiological responses (eye gaze) and implicit user interaction (mouse movements).
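The general recipe is easy to picture: extract features from both signals, concatenate them, and train a classifier over the three intent classes. The sketch below is purely illustrative (the feature sets are invented and the data is random, so accuracy sits at chance level; this is not Soleymani’s model):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 300
gaze = rng.normal(size=(n, 4))   # e.g. fixation count/duration, saccade statistics
mouse = rng.normal(size=(n, 3))  # e.g. cursor speed, path length, hover time
X = np.hstack([gaze, mouse])     # fuse the two modalities by concatenation
y = rng.integers(0, 3, size=n)   # 0 = seek info, 1 = find content, 2 = re-find

clf = RandomForestClassifier(n_estimators=100, random_state=0)
print(cross_val_score(clf, X, y, cv=5).mean())  # ~0.33 here: the data is random
```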

2) Real-time detection of fake tweets is now possible using user and textual cues.

Another best paper candidate (http://dl.acm.org/citation.cfm?id=3078979), this time from CERTH. The team collected a large dataset of fake/real sample tweets spanning 17 events and built an effective model for misleading-content detection based on tweet content and user characteristics. A live demo is available here: http://reveal-mklab.iti.gr/reveal/fake/.
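A minimal sketch of the general approach (not CERTH’s model: the tweets, user cues, and labels below are fabricated) shows how textual and user-level features can be combined in one classifier:

```python
import numpy as np
from scipy.sparse import hstack, csr_matrix
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

tweets = [
    "BREAKING!!! shark swims down flooded highway",
    "City officials confirm road closures after the storm",
    "You won't BELIEVE what this photo shows",
    "Rainfall totals released by the weather service",
]
# Assumed user cues: account age (days), follower count, verified flag.
users = np.array([[12, 50, 0], [2400, 30000, 1], [30, 80, 0], [1800, 10000, 1]])
labels = np.array([1, 0, 1, 0])  # 1 = fake, 0 = real

X_text = TfidfVectorizer().fit_transform(tweets)        # textual cues
X = hstack([X_text, csr_matrix(users.astype(float))])   # + user cues
clf = LogisticRegression(max_iter=1000).fit(X, labels)
print(clf.predict(X))
```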

3) Music tracks have different functions in our daily lives.

Researchers from TU Delft have developed an algorithm (http://dl.acm.org/citation.cfm?id=3078997) which classifies music tracks according to their purpose in our daily activities: relaxing, studying, and working out.

4) By transferring image style we can make images more memorable!

The team at the University of Trento built an automatic framework (https://arxiv.org/abs/1704.01745) to improve image memorability. A selector finds the style seeds (e.g. abstract paintings) most likely to increase the memorability of a given image; after style transfer with those seeds, the image becomes more memorable!

5) Neural networks can help retrieve and discover children’s book illustrations.

In this (https://arxiv.org/pdf/1704.03057.pdf) amazing work, motivated by real children’s experiences, Pinar and her team from Hacettepe University collected a large dataset of children’s book illustrations and found that neural networks can predict and transfer style, making it possible to give many other illustrations a “Winnie the Witch” look.

Winnie the Witch

6) Locals perceive their neighborhood as less interesting, more dangerous and dirtier compared to non-locals. 

In this (http://www.idiap.ch/~gatica/publications/SantaniRuizGatica-icmr17.pdf) wonderful work presented by Darshan Santani from IDIAP, researchers asked locals and non-local crowd-workers to assess pictures from various neighborhoods in Guanajuato, and compared how the two groups perceived the same places.

THE FUTURE: What’s Next?

1) We will be able to anonymize images of outdoor spaces thanks to Instagram filters, as proposed by this (http://dl.acm.org/citation.cfm?id=3080543) work in the Brave New Idea session.

When an image of an outdoor space is manipulated with appropriate Instagram filters, the location of the image can be masked from vision-based geolocation classifiers.

2) Soon we will be able to embed watermarks in our Deep Neural Network models in order to protect our intellectual property [BEST PAPER AWARD].

This is a disruptive, novel idea, and that is why this work from KDDI Research and Japan National Institute of Informatics won the best paper award. Congratulations!

3) Given an image view of an object, we will predict the other side of things (from Smeulders’ keynote). In the pic: predicting the other side of chairs. Beautiful.

Predicting the other side of things

 

THANKS: To the organisers, to the volunteers, and to all the authors for their beautiful work 🙂

Report from MMM 2017

MMM 2017 — 23rd International Conference on MultiMedia Modeling

MMM is a leading international conference for researchers and industry practitioners to share new ideas, original research results and practical development experiences from all MMM-related areas. The 23rd edition of MMM took place on January 4–6, 2017, on the modern campus of Reykjavik University. In this short report, we outline the major aspects of the conference, including: technical program; best paper session; video browser showdown; demonstrations; keynotes; special sessions; and social events. We end by acknowledging the contributions of the many excellent colleagues who helped us organize the conference. For more details, please refer to the MMM 2017 web site.

Technical Program

The MMM conference calls for research papers reporting original investigation results and demonstrations in all areas related to multimedia modeling technologies and applications. Special sessions were also held that focused on addressing new challenges for the multimedia community.

This year, 149 regular full paper submissions were received, of which 36 were accepted for oral presentation and 33 for poster presentation, for a 46% acceptance rate. Overall, MMM received 198 submissions across all tracks and accepted 107 for oral and poster presentation, for an overall acceptance rate of 54%. For more details, please refer to the table below.

MMM2017 Submissions and Acceptance Rates

Best Paper Session

Four best paper candidates were selected for the best paper session, which was a plenary session at the start of the conference.

The best paper, by unanimous decision, was “On the Exploration of Convolutional Fusion Networks for Visual Recognition” by Yu Liu, Yanming Guo, and Michael S. Lew. In this paper, the authors propose an efficient multi-scale fusion architecture, called convolutional fusion networks (CFN), which generates side branches from multi-scale intermediate layers while adding few parameters.
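A rough sketch of the side-branch idea as described (illustrative layer sizes, not the actual CFN architecture): each intermediate block feeds a cheap 1×1-convolution branch, and the branch outputs are fused before the classifier.

```python
import torch
import torch.nn as nn

class TinyFusionNet(nn.Module):
    """Toy multi-scale fusion: side branches from each block, fused by averaging."""
    def __init__(self, n_classes=10):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
            nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
            nn.Sequential(nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2)),
        ])
        # Side branches: 1x1 convolutions, so they add very few parameters.
        self.side = nn.ModuleList([nn.Conv2d(c, 64, 1) for c in (16, 32, 64)])
        self.pool = nn.AdaptiveAvgPool2d(1)
        self.fc = nn.Linear(64, n_classes)

    def forward(self, x):
        feats = []
        for block, side in zip(self.blocks, self.side):
            x = block(x)
            feats.append(self.pool(side(x)).flatten(1))  # one vector per scale
        fused = torch.stack(feats).mean(0)  # fuse the multi-scale side outputs
        return self.fc(fused)

print(TinyFusionNet()(torch.randn(2, 3, 32, 32)).shape)  # torch.Size([2, 10])
```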

Phoebe Chen, Laurent Amsaleg and Shin’ichi Satoh (left) present the Best Paper Award to Yu Liu and Yanming Guo (right).

The best student paper, partially chosen due to the excellent presentation of the work, was “Cross-modal Recipe Retrieval: How to Cook This Dish?” by Jingjing Chen, Lei Pang, and Chong-Wah Ngo. In this work, the problem of sharing food pictures from the viewpoint of cross-modality analysis was explored. Given a large number of image and recipe pairs acquired from the Internet, a joint space is learnt to locally capture the ingredient correspondence from images and recipes.

Phoebe Chen, Shin’ichi Satoh and Laurent Amsaleg (left) present the Best Student Paper Award to Jingjing Chen and Chong-Wah Ngo (right).

The two runners-up were “Spatio-temporal VLAD Encoding for Human Action Recognition in Videos” by Ionut Cosmin Duta, Bogdan Ionescu, Kiyoharu Aizawa, and Nicu Sebe, and “A Framework of Privacy-Preserving Image Recognition for Image-Based Information Services” by Kojiro Fujii, Kazuaki Nakamura, Naoko Nitta, and Noboru Babaguchi.

Video Browser Showdown

The Video Browser Showdown (VBS) is an annual live video search competition, which has been organized as a special session at MMM conferences since 2012. In VBS, researchers evaluate and demonstrate the efficiency of their exploratory video retrieval tools on a shared data set in front of the audience. The participating teams start with a short presentation of their system and then perform several video retrieval tasks with a moderately large video collection (about 600 hours of video content). This year, seven teams registered for VBS, although one team could not compete for personal and technical reasons. For the first time in 2017, live judging was included, in which a panel of expert judges made decisions in real-time about the accuracy of the submissions for ⅓ of the tasks.

Teams and spectators in the Video Browser Showdown.

On the social side, three changes were also made from previous conferences. First, VBS was held in a plenary session, to avoid conflicts with other schedule items. Second, the conference reception was held at VBS, which meant that attendees had extra incentives to attend VBS, namely food and drink. And third, Alan Smeaton served as “color commentator” during the competition, interviewing the organizers and participants, and helping explain to the audience what was going on. All of these changes worked well, and contributed to a very well attended VBS session.

The winners of VBS 2017, after a very even and exciting competition, were Luca Rossetto, Ivan Giangreco, Claudiu Tanase, Heiko Schuldt, Stephane Dupont and Omar Seddati, with their IMOTION system.

Demonstrations

Five demonstrations were presented at MMM. As in previous years, the best demonstration was selected using both a popular vote and a selection committee. And, as in previous years, both methods produced the same winner, which was: “DeepStyleCam: A Real-time Style Transfer App on iOS” by Ryosuke Tanno, Shin Matsuo, Wataru Shimoda, and Keiji Yanai.

The winners of the Best Demonstration competition hard at work presenting their system.

Keynotes

The first keynote, held in the first session of the conference, was “Multimedia Analytics: From Data to Insight” by Marcel Worring, University of Amsterdam, Netherlands. He reported on a novel multimedia analytics model based on an extensive survey of over eight hundred papers. In the analytics model, the need for semantic navigation of the collection is emphasized and multimedia analytics tasks are placed on an exploration-search axis. Categorization is then proposed as a suitable umbrella task for realizing the exploration-search axis in the model. In the end, he considered the scalability of the model to collections of 100 million images, moving towards methods which truly support interactive insight gain in huge collections.

Björn Þór Jónsson introduces the first keynote speaker, Marcel Worring (right).

The second keynote, held in the last session of the conference, was “Creating Future Values in Information Access Research through NTCIR” by Noriko Kando, National Institute of Informatics, Japan. She reported on NTCIR (NII Testbeds and Community for Information access Research), which is a series of evaluation workshops designed to enhance the research in information access technologies, such as information retrieval, question answering, and summarization using East-Asian languages, by providing infrastructures for research and evaluation. Prof Kando provided motivations for the participation in such benchmarking activities and she highlighted the range of scientific tasks and challenges that have been explored at NTCIR over the past twenty years. She ended with ideas for the future direction of NTCIR.

Noriko Kando presents the second MMM keynote.

Special Sessions

During the conference, four special sessions were held. Special sessions are mini-venues, each focusing on one state-of-the-art research direction within the multimedia field. The sessions are proposed and chaired by international researchers, who also manage the review process, in coordination with the Program Committee Chairs. This year’s sessions were:
– “Social Media Retrieval and Recommendation” organized by Liqiang Nie, Yan Yan, and Benoit Huet;
– “Modeling Multimedia Behaviors” organized by Peng Wang, Frank Hopfgartner, and Liang Bai;
– “Multimedia Computing for Intelligent Life” organized by Zhineng Chen, Wei Zhang, Ting Yao, Kai-Lung Hua, and Wen-Huang Cheng; and
– “Multimedia and Multimodal Interaction for Health and Basic Care Applications” organized by Stefanos Vrochidis, Leo Wanner, Elisabeth André, and Klaus Schoeffmann.

Social Events

This year, there were two main social events at MMM 2017: a welcome reception at the Video Browser Showdown, as discussed above, and the conference banquet. Optional tours then allowed participants to further enjoy their stay on the unique and beautiful island.

The conference banquet was held in two parts. First, we visited the exotic Blue Lagoon, which is widely recognised as one of the modern wonders of the world and one of the most popular tourist destinations in Iceland. MMM participants had the option of bathing for two hours in this extraordinary spa, and applying the healing silica mud to their skin, before heading back for the banquet in Reykjavík.

The banquet itself was then held at the Harpa Reykjavik Concert Hall and Conference Centre in downtown Reykjavík. Harpa is one of Reykjavik’s most recent, yet greatest and most distinguished landmarks. It is a cultural and social centre in the heart of the city and features stunning views of the surrounding mountains and the North Atlantic Ocean.

Harpa, the venue of the conference banquet.

During the banquet, Steering Committee Chair Phoebe Chen gave a historical overview of the MMM conferences and announced the venues for MMM 2018 (Bangkok, Thailand) and MMM 2019 (Thessaloniki, Greece), before awards for the best contributions were presented. Finally, participants were entertained by a small choir, and were even asked to participate in singing a traditional Icelandic folk song.

MMM 2018 will be held at Chulalongkorn University in Bangkok, Thailand. See http://mmm2018.chula.ac.th/.

Acknowledgements

There are many people who deserve appreciation for their invaluable contributions to MMM 2017. First and foremost, we would like to thank our Program Committee Chairs, Laurent Amsaleg and Shin’ichi Satoh, who did excellent work in organizing the review process and helping us with the organization of the conference; indeed they are still hard at work with an MTAP special issue for selected papers from the conference. The Proceedings Chair, Gylfi Þór Guðmundsson, and Local Organization Chair, Marta Kristín Lárusdóttir, were also tirelessly involved in the conference organization and deserve much gratitude.

Other conference officers contributed to the organization and deserve thanks: Frank Hopfgartner and Esra Acar (demonstration chairs); Klaus Schöffmann, Werner Bailer and Jakub Lokoč (VBS Chairs); Yantao Zhang and Tao Mei (Sponsorship Chairs); all the Special Session Chairs listed above; the 150 strong Program Committee, who did an excellent job with the reviews; and the MMM Steering Committee, for entrusting us with the organization of MMM 2017.

Finally, we would like to thank our student volunteers (Atli Freyr Einarsson, Bjarni Kristján Leifsson, Björgvin Birkir Björgvinsson, Caroline Butschek, Freysteinn Alfreðsson, Hanna Ragnarsdóttir, Harpa Guðjónsdóttir), our hosts at Reykjavík University (in particular Arnar Egilsson, Aðalsteinn Hjálmarsson, Jón Ingi Hjálmarsson and Þórunn Hilda Jónasdóttir), the CP Reykjavik conference service, and all others who helped make the conference a success.

Call for Grand Challenge Problem Proposals

Original page: http://www.acmmm.org/2017/contribute/call-for-multimedia-grand-challenge-proposals/

 

The Multimedia Grand Challenge was first presented as part of ACM Multimedia 2009 and has established itself as a prestigious competition in the multimedia community.  The purpose of the Multimedia Grand Challenge is to engage with the multimedia research community by establishing well-defined and objectively judged challenge problems intended to exercise state-of-the-art techniques and methods and inspire future research directions.

Industry leaders and academic institutions are invited to submit proposals for specific Multimedia Grand Challenges to be included in this year’s program.

A Grand Challenge proposal should include:

  • A brief description motivating why the challenge problem is important and relevant for the multimedia research community, industry, and/or society today and going forward for the next 3-5 years.
  • A description of a specific set of tasks or goals to be accomplished by challenge problem submissions.
  • Links to relevant datasets to be used for experimentation, training, and evaluation as necessary. Full appropriate documentation on any datasets should be provided or made accessible.
  • A description of rigorously defined objective criteria and/or procedures for how submissions will be judged.
  • Contact information of at least two organizers who will be responsible for accepting and judging submissions as described in the proposal.

Grand Challenge proposals will be considered until March 1st and will be evaluated on an on-going basis as they are received. Grand Challenge proposals that are accepted to be part of the ACM Multimedia 2017 program will be posted on the conference website and included in subsequent calls for participation. All material, datasets, and procedures for a Grand Challenge problem should be ready for dissemination no later than March 14th.

While each Grand Challenge is allowed to define an independent timeline for solution evaluation and may allow iterative resubmission and possible feedback (e.g., a publicly posted leaderboard), challenge submissions must be complete and a paper describing the solution and results should be submitted to the conference program committee by July 14, 2017.

Grand Challenge proposals should be sent via email to the Grand Challenge chair, Ketan Mayer-Patel.

Those interested in submitting a Grand Challenge proposal are encouraged to review the problem descriptions from ACM Multimedia 2016 as examples. These are available here: http://www.acmmm.org/2016/?page_id=353

ACM TVX — Call for Volunteer Associate Chairs

CALL FOR VOLUNTEER ASSOCIATE CHAIRS – Applications for Technical Program Committee

ACM TVX 2017: International Conference on Interactive Experiences for Television and Online Video
June 14-16, 2017, Hilversum, The Netherlands
www.tvx2017.com


We are welcoming applications to become part of the TVX 2017 Technical Program Committee (TPC), as Associate Chair (AC). This involves playing a key role in the submission and review process, including attendance at the TPC meeting (please note that this is not a call for reviewers, but a call for Associate Chairs). We are opening applications to all members of the community, from both industry and academia, who feel they can contribute to this team.

  • This call is open to new Associate Chairs and to those who have been Associate Chairs in previous years and want to be an Associate Chair again for TVX 2017
  • Application form: https://goo.gl/forms/c9gNPHYZbh2m6VhJ3
  • The application deadline is December 12, 2016

Following the success of previous years’ invitations for open applications to join our Technical Program Committee, we again invite applications for Associate Chairs. Successful applicants would be responsible for arranging and coordinating reviews for around 3 or 4 submissions in the main Full and Short Papers track of ACM TVX 2017, and for attending the Technical Program Committee meeting in Delft, The Netherlands, in mid-March 2017 (participation in person is strongly recommended). Our aim is to broaden participation, ensuring a diverse Technical Program Committee, and to help widen the ACM TVX community to include a full range of perspectives.

We welcome applications from academics, industrial practitioners and (where appropriate) senior PhD students, who have expertise in Human Computer Interaction or related fields, and who have an interest in topics related to interactive experiences for television or online video. We would expect all applicants to have ‘top-tier’ publications related to this area. Applicants should have an expertise or interest in at least one or more topics in our call for papers: https://tvx.acm.org/2017/participation/full-and-short-paper-submissions/

After the application deadline, the volunteers will be considered and selected as ACs, and the TPC Chairs will also be free to invite previous ACs or other researchers from the community to join the team. The ultimate goal is to reach a balanced, diverse and inclusive TPC team in terms of fields of expertise, experience and perspectives, from both academia and industry.

To submit, just fill in the application form above!

CONTACT INFORMATION

For up-to-date information and further details please visit: www.tvx2017.com or get in touch with the Inclusion Chairs:

Teresa Chambel, University of Lisbon, PT; Rob Koenen, TNO, NL
at: inclusion@tvx2017.com

In collaboration with the Program Chairs: Wendy van den Broeck, Vrije Universiteit Brussel, BE; Mike Darnell, Samsung, USA; Roger Zimmermann, NUS, Singapore

MPEG Column: 116th MPEG Meeting

Chengdu, China – The 116th MPEG meeting was held in Chengdu, China, from 17–21 October 2016.

MPEG Workshop on 5-Year Roadmap Successfully Held in Chengdu

At its 116th meeting, MPEG successfully organised a workshop on its 5-year standardisation roadmap. Various industry representatives presented their views and reflected on the need for standards for new services and applications, specifically in the area of immersive media. The results of the workshop (roadmap, presentations) and the planned phases for the standardisation of “immersive media” are available at http://mpeg.chiariglione.org/. A follow-up workshop will be held on 18 January 2017 in Geneva, co-located with the 117th MPEG meeting. The workshop is open to all interested parties and free of charge. Details on the program and registration will be available at http://mpeg.chiariglione.org/.

Summary of the “Survey on Virtual Reality”

At its 115th meeting, MPEG established an ad-hoc group on virtual reality which conducted a survey on virtual reality with relevant stakeholders in this domain. The feedback from this survey has been provided as input for the 116th MPEG meeting where the results have been evaluated. Based on these results, MPEG aligned its standardisation timeline with the expected deployment timelines for 360-degree video and virtual reality services. An initial specification for 360-degree video and virtual reality services will be ready by the end of 2017 and is referred to as the Omnidirectional Media Application Format (OMAF; MPEG-A Part 20, ISO/IEC 23000-20). A standard addressing audio and video coding for 6 degrees of freedom where users can freely move around is on MPEG’s 5-year roadmap. The summary of the survey on virtual reality is available at http://mpeg.chiariglione.org/.

MPEG and ISO/TC 276/WG 5 have collected and evaluated the answers to the Genomic Information Compression and Storage joint Call for Proposals

At its 115th meeting, MPEG issued a Call for Proposals (CfP) for Genomic Information Compression and Storage in conjunction with the working group for standardisation of data processing and integration of the ISO Technical Committee for biotechnology standards (ISO/TC 276/WG 5). The call sought submissions of technologies that can provide efficient compression of genomic data and metadata for storage and processing applications. During the 116th MPEG meeting, responses to this CfP were collected and evaluated by a joint ad-hoc group of both working groups; twelve distinct technologies were submitted. An initial assessment of the performance of the best eleven solutions reported compression factors ranging from 8 to 58 for the different classes of data.

The twelve submitted technologies show consistent improvements over the results assessed in response to the Call for Evidence in February 2016. Further improvements of the technologies under consideration are expected from the first phase of core experiments defined at the 116th MPEG meeting. The open core experiment process planned for the next 12 months will comprise multiple, independent, directly comparable, rigorous experiments performed by independent entities to determine the specific merit of each technology and their mutual integration into a single solution for standardisation. The core experiment process will consider the submitted technologies as well as new solutions in the scope of each specific core experiment. The final inclusion of submitted technologies into the standard will be based on the experimental comparison of performance, as well as on the validation of requirements and the inclusion of essential metadata describing the context of the sequence data, and will be reached by consensus within and across both committees.

Call for Proposals: Internet of Media Things and Wearables (IoMT&W)

At its 116th meeting, MPEG issued a Call for Proposals (CfP) for Internet of Media Things and Wearables (see http://mpeg.chiariglione.org/), motivated by the understanding that more than half of major new business processes and systems will incorporate some element of the Internet of Things (IoT) by 2020. Therefore, the CfP seeks submissions of protocols and data representation enabling dynamic discovery of media things and media wearables. A standard in this space will facilitate the large-scale deployment of complex media systems that can exchange data in an interoperable way between media things and media wearables.

MPEG-DASH Amendment with Media Presentation Description Chaining and Pre-Selection of Adaptation Sets

At its 116th MPEG meeting, a new amendment for MPEG-DASH reached the final stage of Final Draft Amendment (ISO/IEC 23009-1:2014 FDAM 4). This amendment includes several technologies useful for industry practices of adaptive media presentation delivery. For example, the media presentation description (MPD) can be daisy-chained to simplify the implementation of pre-roll ads in cases of targeted dynamic advertising for live linear services. Additionally, this amendment enables support for pre-selection, in order to signal suitable combinations of audio elements that are offered in different adaptation sets. As several amendments and corrigenda have been produced, this amendment will be published as part of the 3rd edition of ISO/IEC 23009-1, together with the amendments and corrigenda approved after the 2nd edition.

How to contact MPEG, learn more, and find other MPEG facts

To learn about MPEG basics, discover how to participate in the committee, or find out more about the array of technologies developed or currently under development by MPEG, visit MPEG’s home page at http://mpeg.chiariglione.org. There you will find information publicly available from MPEG experts past and present including tutorials, white papers, vision documents, and requirements under consideration for new standards efforts. You can also find useful information in many public documents by using the search window.

Examples of tutorials that can be found on the MPEG homepage include tutorials for High Efficiency Video Coding, Advanced Audio Coding, Universal Speech and Audio Coding, and DASH, to name a few. A rich repository of white papers can also be found and continues to grow. You can find these papers and tutorials for many of MPEG’s standards freely available. Press releases from previous MPEG meetings are also available. Journalists who wish to receive MPEG Press Releases by email should contact Dr. Christian Timmerer at christian.timmerer@itec.uni-klu.ac.at or christian.timmerer@bitmovin.com.

Further Information

Future MPEG meetings are planned as follows:
No. 117, Geneva, CH, 16 – 20 January, 2017
No. 118, Hobart, AU, 03 – 07 April, 2017
No. 119, Torino, IT, 17 – 21 July, 2017
No. 120, Macau, CN, 23 – 27 October 2017

For further information about MPEG, please contact:
Dr. Leonardo Chiariglione (Convenor of MPEG, Italy)
Via Borgionera, 103
10040 Villar Dora (TO), Italy
Tel: +39 011 935 04 61
leonardo@chiariglione.org

or

Priv.-Doz. Dr. Christian Timmerer
Alpen-Adria-Universität Klagenfurt | Bitmovin Inc.
9020 Klagenfurt am Wörthersee, Austria, Europe
Tel: +43 463 2700 3621
Email: christian.timmerer@itec.aau.at | christian.timmerer@bitmovin.com

Call for Task Proposals: Multimedia Evaluation 2017

MediaEval 2017 Multimedia Evaluation Benchmark

Call for Task Proposals

Proposal Deadline: 3 December 2016

MediaEval is a benchmarking initiative dedicated to developing and evaluating new algorithms and technologies for multimedia retrieval, access and exploration. It offers tasks to the research community that are related to human and social aspects of multimedia. MediaEval emphasizes the ‘multi’ in multimedia and seeks tasks involving multiple modalities, e.g., audio, visual, textual, and/or contextual.

MediaEval is now calling for proposals for tasks to run in the 2017 benchmarking season. The proposal consists of a description of the motivation for the task and challenges that task participants must address. It provides information on the data and evaluation methodology to be used. The proposal also includes a statement of how the task is related to MediaEval (i.e., its human or social component), and how it extends the state of the art in an area related to multimedia indexing, search or other technologies that support users in accessing multimedia collections.

For more detailed information about the content of the task proposal, please see:
http://www.multimediaeval.org/files/mediaeval2017_taskproposals.html

Task proposal deadline: 3 December 2016

Task proposals are chosen on the basis of their feasibility, their match with the topical focus of MediaEval, and also according to the outcome of a survey circulated to the wider multimedia research community.

The MediaEval 2017 Workshop will be held 13-15 September 2017 in Dublin, Ireland, co-located with CLEF 2017 (http://clef2017.clef-initiative.eu)

For more information about MediaEval see http://multimediaeval.org or contact Martha Larson m.a.larson@tudelft.nl

 

SIGMM Award for Outstanding Ph.D. Thesis in Multimedia Computing, Communications and Applications 2016

ACM Special Interest Group on Multimedia (SIGMM) is pleased to present the 2016 SIGMM Outstanding Ph.D. Thesis Award to Dr. Christoph Kofler. The award committee considers Dr. Kofler’s dissertation entitled “User Intent in Online Video Search” worthy of the recognition as the thesis is the first to innovatively consider a user’s intent in multimedia search yielding significantly improved results in satisfying the information need of the user. The work has high originality and is expected to have significant impact, especially in boosting the search performance for multimedia data.

Dr. Kofler’s thesis systematically explores the video search intent behind a user’s information need in three steps: (1) analyzing a real-world transaction log produced by a large video search engine to understand why searches fail, (2) understanding the possible intents of users behind video search and uploads, and (3) designing an intent-aware video search result optimization approach that re-ranks initial video search results so as to yield the highest potential to satisfy the users’ search intent.

The effectiveness of the framework developed in the thesis has been thoroughly validated by a wide range of experiments. The thesis topic is highly timely, and the framework makes groundbreaking contributions to our understanding and knowledge in the areas of users’ information seeking, user intent, user satisfaction, and multimedia search engine usability. The publications related to the thesis clearly demonstrate the impact of this work across several research disciplines, including multimedia, the web, and information retrieval. Overall, the committee recognizes that the thesis has significant impact and makes considerable contributions to the multimedia community.

Bio of Awardee:

Dr. Christoph Kofler is a software engineer and data scientist at Bloomberg L.P., NY, USA. He holds a Ph.D. degree from Delft University of Technology, The Netherlands, and M.Sc. and B.Sc. degrees from Klagenfurt University, Austria – all in Computer Science. His research interests include the broad fields of multimedia and text-based information retrieval, with a focus on search intent inference and its applications for search result optimization throughout the entire search engine pipeline (indexing, ranking, query formulation). In addition to “what” a user is looking for, Dr. Kofler is particularly interested in the “why” component behind the search and in the related opportunities for improving the efficiency and effectiveness of information retrieval systems. Dr. Kofler has co-authored more than 20 scientific publications, predominantly in venues such as ACM Multimedia, IEEE Transactions on Multimedia, and ACM Computing Surveys. He has been a task co-organizer of the MediaEval Benchmark initiative. He received the Grand Challenge Best Presentation Award at ACM Multimedia and a Best Paper nomination at the European Conference on Information Retrieval. Dr. Kofler is a recipient of the Google Doctoral Fellowship in Information Retrieval (Video Search). He has held positions at Microsoft Research, Beijing, China; Columbia University, NY, USA; and Google, NY, USA.

 

The award committee is pleased to present an honorable mention to Dr. Varun Singh for the thesis entitled “Protocols and Algorithms for Adaptive Multimedia Systems.” The thesis develops and presents congestion control algorithms and signaling protocols that are used in interactive multimedia communications. The committee is impressed by the thorough theoretical and experimental depth of the thesis. Also remarkable are Dr. Singh’s efforts to shepherd his work to real-world adoption, which have led him to author four RFCs and several standards-track documents in the IETF and have resulted in the incorporation of his work into the production versions of the Chrome and Firefox web browsers. His work has thus already achieved impact in the multimedia community.

Bio of Awardee:

Dr. Varun Singh received his Master’s degree in Electrical Engineering from Helsinki University of Technology, Finland, in 2009, and his Ph.D. degree from Aalto University, Finland, in 2015. His research has led him to make important contributions to several standardization organizations: 3GPP (2008 – 2010), IETF (since 2010), and W3C (since 2014). He is the co-author of the WebRTC Statistics API. Beyond this, his research work led him to found and become CEO of callstats.io, a startup which analyses and optimizes the quality of multimedia in real-time communication (currently, WebRTC).

 

ACM TOMM Special Issues and Special Sections

The ACM TOMM journal has launched a new two-year program of SPECIAL ISSUES and SPECIAL SECTIONS on strategic and emerging topics in Multimedia research. Each Special Issue will also include an extended survey paper on the subject of the issue, prepared by the Guest Editors, which will help to highlight trends and research paths and position the contributed papers appropriately.

In May, we received 11 proposals and selected 4 for Special Issues and 2 for Special Sections, based on the timeliness and relevance of the topics and the qualifications of the proponents:

SPECIAL ISSUES (8 papers each)

  • “Deep Learning for Mobile Multimedia”
    for publication on April’17. Submission deadline Oct 15, 2016
  • “Delay-Sensitive Video Computing in the Cloud”
    for publication on July’17. Submission deadline Nov. 30, 2016
  • “Representation, Analysis and Recognition of 3D Human”
    for publication on Nov’17. Submission deadline Jan. 15, 2017
  • “QoE Management for Multimedia Services”
    for publication on April’18. Submission deadline May 15, 2017

SPECIAL SECTIONS (4 papers each)

  • “Multimedia Computing and Applications of Socio-Affective Behaviors in the Wild”
    for publication on May ’17. Submission deadline Oct 31, 2016
  • “Multimedia Understanding via Multimodal Analytics”
    for publication on May ’17. Submission deadline Oct 31, 2016

You can visit the ACM TOMM home page at http://tomm.acm.org (news section) for more detailed information. We will definitely be happy to receive your valuable contributions to this initiative.

 

ACM SIGMM Award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications

The 2016 winner of the prestigious ACM Special Interest Group on Multimedia (SIGMM) award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications is Prof. Dr. Alberto del Bimbo. The award is given in recognition of his outstanding, pioneering and continued research contributions in the areas of multimedia processing, multimedia content analysis, and multimedia applications, his leadership in multimedia education, and his outstanding and continued service to the community.

Prof. del Bimbo was among the very few who pioneered the research in image and video content-based retrieval in the late 1980s. Since that time, for over 25 years, he has been among the most visionary and influential researchers in Europe and world-wide in this field. His research has influenced several generations of researchers that are now active in some of the most important research centers world-wide. Over the years, he has made significant innovative research contributions.

In the early days of the discipline he explored all the modalities for retrieval by visual similarity of images and video. In his early paper “Visual Image Retrieval by Elastic Matching of User Sketches”, published in IEEE Trans. on Pattern Analysis and Machine Intelligence in 1997, he presented one of the first and top-performing methods for image retrieval by shape similarity from users’ sketches. He also published in IEEE Trans. on Pattern Analysis and Machine Intelligence and IEEE Trans. on Multimedia his original research on representations of spatial relationships between image regions based on spatial logic. This ground-breaking research was accompanied by the definition of efficient index structures to permit retrieval from large datasets. He was one of the first to address this large-dataset aspect, which has now become very important for the research community.

Since the early 2000s, with the advancement of 3D imaging technologies and the availability of a new generation of acquisition devices capable of capturing the geometry of 3D objects in three-dimensional physical space, Prof. del Bimbo and his team initiated research in 3D content-based retrieval, which has now become increasingly popular in mainstream research. Again, he was among the very first researchers to initiate this line of research. In particular, he focused on 3D face recognition, extending the weighted walkthrough representation of spatial relationships between image regions to model the 3D relationships between facial stripes. His solution, “3D Face Recognition Using Iso-geodesic Stripes”, scored the best performance at the SHREC Shape Retrieval Contest in 2008, and was published in IEEE Trans. on Pattern Analysis and Machine Intelligence in 2010. At CVPR’15 he presented a novel idea for representing 3D textured mesh manifolds using Local Binary Patterns, which is highly effective for 3D face retrieval. This was the first attempt to combine 3D geometry and photometric texture in a single unified representation. In 2016 he co-authored a forward-looking survey on content-based image retrieval in the context of social image platforms, which appeared in ACM Computing Surveys. It includes an extensive treatise of image tag assignment, refinement, and tag-based retrieval, and explores the differences between traditional image retrieval and retrieval with socially generated images.

One very important aspect of Professor del Bimbo’s contribution to the community is his educational impact. He authored the monograph Visual Information Retrieval, published by Morgan Kaufmann in 1999, which became one of the most cited and influential books from the early years of image and video content-based retrieval. Many young researchers have used this book as the main reference in their studies, and their careers have been shaped by the ideas discussed in it. Being the first and sole book on the subject in the early days of the discipline, it played a key role in developing content-based retrieval from a research niche into a largely populated field of research and in making it central to Multimedia research.

Professor del Bimbo has an extraordinary and long-lasting track record of service to the scientific community through the last 20 years. As General Chair he organized two of the most successful conferences in Multimedia, namely IEEE ICMCS’99, the Int’l Conf. on Multimedia Computing and Systems (now renamed IEEE ICME), and ACM MULTIMEDIA’10. The quality and success of these conferences were highly influential in attracting new young researchers to the field and forming the present research community. Since 2016, he has been the Editor-in-Chief of ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM).

Announcement of ACM SIGMM Rising Star Award 2016

The ACM Special Interest Group on Multimedia (SIGMM) is pleased to present this year’s Rising Star Award in multimedia computing, communications and applications to Dr. Bart Thomee for his significant contributions in the areas of geo-multimedia computing, media evaluation, and open research datasets. The ACM SIGMM Rising Star Award recognizes a young researcher who has made outstanding research contributions to the field of multimedia computing, communication and applications during the early part of his or her career.

Dr. Bart Thomee received his Ph.D. from Leiden University in 2010. In his thesis, he focused on multimedia search and exploration, specifically targeting artificial imagination and duplicate detection. On the topic of artificial imagination, he aimed to understand the user’s search intent more rapidly by generating imagery resembling the ideal image the user is looking for. Using the synthesized images as queries, instead of existing images from the database, boosted the relevance of the image results by up to 23%. On the topic of duplicate detection, he designed descriptors to compactly represent web-scale image collections and to accurately detect transformed versions of the same image. This work led to an Outstanding Paper Citation at the ACM Conference on Multimedia Information Retrieval 2008.

In 2011, he joined Yahoo Labs, where Dr. Thomee’s interests grew into geographic computing in Multimedia. He began characterizing spatiotemporal regions from labeled (e.g. tagged) georeferenced media, for which he devised a technique based on scale-space theory that could process billions of georeferenced labels in a matter of hours. This work was published at WWW 2013 and became a reference example at Yahoo for how to disambiguate multi-language and multi-meaning labels from media with noisy annotations.

He also started to use an overlooked piece of information found in most camera-phone images: compass information. He developed a technique to accurately pinpoint the locations and surface areas of landmarks, solely based on the positions and orientations of the photos taken of them, which may have been captured hundreds of yards or even miles away.

Dr. Thomee’s recent work on the YFCC100M dataset has had an important impact on the multimedia and SIGMM research community. This new dataset was realistic in both size and structure, fueling and changing the landscape of research in Multimedia. What started as an initiative to release a geo-Flickr dataset grew quickly once Dr. Thomee saw the broader impact and worked rapidly to scale up its size. He had to push the limits of openness without violating licensing terms, copyright, or privacy, and he worked closely with many lawyers to overturn the default, restrictive terms of use, making the dataset also available to non-academics all over the world. He coordinated and led the efforts to share the data and effort horizontally with ICSI, LLNL, and Amazon Open Data. The dataset was highlighted in the February 2016 issue of the Communications of the ACM (CACM), has been requested over 1200 times in just a few months, and has been cited many times since launch. Dr. Thomee has continued by releasing expansion packs to the YFCC100M. This dataset is expected to significantly impact Multimedia research over the coming years.

Dr. Thomee has also been an exemplary member of the Multimedia community. For example, he organized the ImageCLEF photo annotation task (2012-2013) and the MediaEval placing task (2013-2016), and designed the ACM Grand Challenge on Event Summarization (2015) and on Tag & Caption Prediction (2016).

In summary, Dr. Bart Thomee receives the 2016 ACM SIGMM Rising Star Award for significant contributions in the areas of geo-multimedia computing, media evaluation, and open datasets for research.