SIGMM Annual Report (2018)

 

Dear Readers,

Each year SIGMM, like all ACM SIGs, produces an annual report summarising our activities, including our sponsored and in-cooperation conferences and the initiatives we are undertaking to support our community and broaden participation. The report also covers our significant papers, the awards we have given, and the major issues facing us going forward. Below is the main text of the SIGMM report for 2017-2018, augmented by further details on our conferences provided by the ACM Office. We hope you enjoy reading this and learning about what SIGMM does.


SIGMM Annual Report (2018)
Prepared by SIGMM Chair (Alan Smeaton),
Vice Chair (Nicu Sebe), and Conference Director (Gerald Friedland)
August 6th, 2018

Mission: SIGMM provides an international interdisciplinary forum for researchers, engineers, and practitioners in all aspects of multimedia computing, communication, storage and application.

1. Awards:
SIGMM presents three awards each year; this year's recipients were as follows:

  • SIGMM Technical Achievement Award for lasting contributions to multimedia computing, communications and applications was presented to Arnold W.M. Smeulders, University of Amsterdam, the Netherlands. The award was given in recognition of his outstanding and pioneering contributions to defining and bridging the semantic gap in content-based image retrieval.
  • SIGMM 2016 Rising Star Award was given to Dr Liangliang Cao of HelloVera.AI for his significant contributions in large-scale multimedia recognition and social media mining.
  • SIGMM Outstanding PhD Thesis in Multimedia Computing Award was given to Chien-Nan (Shannon) Chen for a thesis entitled Semantics-Aware Content Delivery Framework For 3D Tele-Immersion at the University of Illinois at Urbana-Champaign, US.

2. Significant Papers:

The SIGMM flagship conference, ACM Multimedia 2017, was held in Mountain View, California, and presented the following awards, in addition to awards for Best Grand Challenge Video Captioning Paper, Best Grand Challenge Social Media Prediction Paper, and Best Brave New Idea Paper:

  • Best paper award to “Adversarial Cross-Modal Retrieval”, by Bokun Wang, Yang Yang, Xing Xu, Alan Hanjalic, Heng Tao Shen
  • Best student paper award to “H-TIME: Haptic-enabled Tele-Immersive Musculoskeletal Examination”, by Yuan Tian, Suraj Raghuraman, Thiru Annaswamy, Aleksander Borresen, Klara Nahrstedt, Balakrishnan Prabhakaran
  • Best demo award to “NexGenTV: Providing Real-Time Insight during Political Debates in a Second Screen Application” by Olfa Ben Ahmed, Gabriel Sargent, Florian Garnier, Benoit Huet, Vincent Claveau, Laurence Couturier, Raphaël Troncy, Guillaume Gravier, Philémon Bouzy  and Fabrice Leménorel.
  • Best Open source software award to “TensorLayer: A Versatile Library for Efficient Deep Learning Development” by Hao Dong, Akara Supratak, Luo Mai, Fangde Liu, Axel Oehmichen, Simiao Yu, Yike Guo.

The 9th ACM International Conference on Multimedia Systems (MMSys 2018) was held in Amsterdam, the Netherlands, and presented a range of awards including:

  • Best paper award to “Dynamic Adaptive Streaming for Multi-Viewpoint Omnidirectional Videos” by Xavier Corbillon, Francesca De Simone, Gwendal Simon and Pascal Frossard.
  • Best student-paper award to “Want to Play DASH? A Game Theoretic Approach for Adaptive Streaming over HTTP” by Abdelhak Bentaleb, Ali C. Begen, Saad Harous and Roger Zimmermann.

The ACM International Conference on Multimedia Retrieval (ICMR) 2018 was held in Yokohama, Japan, and presented a range of awards including:

  • Best paper award to “Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval” by Niluthpol Mithun, Juncheng Li, Florian Metze and Amit Roy-Chowdhury.

The best paper and best student paper from each of these three conferences were then reviewed by a specially constituted committee, which selected one paper to nominate for the Communications of the ACM Research Highlights; that nomination is presently under consideration.

In addition to the above, SIGMM presented the 2017 ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) Nicolas D. Georganas Best Paper Award to the paper “Automatic Generation of Visual-Textual Presentation Layout” (TOMM vol. 12, Issue 2) by Xuyong Yang, Tao Mei, Ying-Qing Xu, Yong Rui, and Shipeng Li.

3. Significant Programs that Provide a Springboard for Further Technical Efforts

  • SIGMM provided support for student travel through grants, at all of our SIGMM-sponsored conferences.
  • Apart from the specific sessions dedicated to open source and datasets, the ACM Multimedia Systems Conference (MMSys) has started to provide official ACM badging for articles that make artifacts available. This year, the second year of the programme, set a record, with 45% of the articles published at the conference earning a reproducibility badge.

4. Innovative Programs Providing Service to Some Part of Our Technical Community

  • A large part of our research area in SIGMM is driven by the availability of large datasets, usually used for training purposes. Recent years have seen strong growth in openly available datasets, coupled with grand challenge events at our conferences and workshops. These are mostly driven by our corporate researchers, but they give all of our researchers the opportunity to carry out their work at scale, which is a great opportunity for our community.
  • Following the lead of SIGARCH, we have commissioned a study of gender distribution among the SIGMM conferences, conference organization and awards. This report will be completed and presented at our flagship conference in October. We have also commissioned a study of the conferences and journals which most influence, and are most influenced by, our own SIGMM conferences, as an opportunity for some self-reflection on our origins and our future. Both of these follow an open call for new initiatives to be supported by SIGMM.
  • SIGMM Conference Director Gerald Friedland worked with several volunteers from SIGMM to improve the content and organization of ACM Multimedia and connected conferences. Volunteer David Ayman Shamma used data science methods to analyze several ACM MM conferences from the past five years with the goal of identifying biases and patterns of irregularities; some results were presented at the ACM MM TPC meeting. Volunteers Hayley Hung and Martha Larson gave an account of their expectations of and experiences with ACM Multimedia, and Dr. Friedland himself volunteered as a reviewer for conferences of similar size and importance, including NIPS and CSCW, and approached their chairs to get external feedback on what can be improved in the review process. Furthermore, in September Dr. Friedland will travel to Berlin to visit Lutz Prechelt, who invented a review quality management system. The results of this work will be incorporated into a conference handbook that will set down recommendations of best practices for future organizers of SIGMM conferences. We expect the handbook to be finished by the end of 2018.
  • Last year SIGMM made a decision to try to co-locate conferences and other events as much as possible, and the ACM Multimedia conference was co-located with the European Conference on Computer Vision (ECCV) in 2016, with joint workshops and tutorials. This year the ACM Multimedia Systems (MMSys) conference was co-located with the 10th International Workshop on Immersive Mixed and Virtual Environment Systems (MMVE2018), the 16th Annual Workshop on Network and Systems Support for Games (NetGames2018), the 28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV2018) and the 23rd Packet Video Workshop (PV2018). In addition, the Technical Program Committee meeting for the Multimedia Conference was co-located with the ICMR conference.

5. Events or Programs that Broaden Participation

  • SIGMM has approved the launch of a new conference series called Multimedia Asia, which will commence in 2019. It will be run by the SIGMM China Chapter and consolidates two existing multimedia-focused conferences in Asia under the sponsorship and governance of SIGMM. This follows a very detailed review, and the location chosen for the inaugural 2019 conference will be announced at our flagship conference in October 2018.
  • The Women / Diversity in Multimedia Lunch at ACM MULTIMEDIA 2017 (previously the Women’s Lunch) continued this year with an enlarged program of featured speakers and discussion which led to the call for the gender study in Multimedia mentioned earlier.
  • SIGMM continues to pursue an active approach to nurturing the careers of our early stage researchers. The “Emerging Leaders” event (formerly known as Rising Stars) skipped a year in 2017 but will be happening again in 2018 at the Multimedia Conference.  Giving these early career researchers the opportunity to showcase their vision helps to raise their visibility and helps SIGMM to enlarge the pool of future volunteers.
  • The expansion of our social media communication team has proven to be a shrewd move, with large growth in our website traffic and a raised profile on social media. We also invite conference attendees to post on Twitter and/or Facebook about the papers, demos and talks that they find most thought-provoking and forward-looking, and the most active of these are rewarded with free registration at a future SIGMM-sponsored conference.

6. Issues for SIGMM in the next 2-3 years

  • Like other SIGs, we realize that improving the diversity of the community we serve is essential to continuing our growth and maintaining our importance and relevance. This includes diversity in gender, in geographical location, and in many other facets. We have started to address these through some of the initiatives mentioned earlier, and at our flagship conference in 2017 we ran a workshop emphasizing research from South Africa and the African continent in general.
  • Leadership, and supporting young researchers in the early stages of their careers, is also important, and we highlight this through two of our regular awards (Rising Stars and Best Thesis) and through the “Emerging Leaders” event mentioned above.
  • We wish to reach out to other SIGs with whom we could have productive engagement, because we see multimedia as a technology enabler as well as an application in itself. To this end we will continue to hold joint panels or workshops at our conferences.
  • Our research area is marked by the growth and availability of open datasets and by grand challenge competitions held at our conferences and workshops. These datasets are often provided by the corporate sector; this is an opportunity for us to do research on datasets otherwise unavailable to us, but also a threat to the balance between corporate influence and independence.
  • In a previous annual report we highlighted the difficulties caused by a significant portion of our conference proceedings not being indexed by Thomson Web of Science. In a similar vein, we find our conference proceedings are not used as input to CSRankings, a metrics-based ranking of Computer Science institutions worldwide. Publishing at venues counted by CSRankings is important to much of our community, and while we are in the process of trying to redress this, the support of ACM in making this case would be welcome.

Report from ACM Multimedia 2017 – by Benoit Huet

 

Best #SIGMM Social Media Reporter Award! Me? Really??

This was my reaction after being informed by the SIGMM Social Media Editors that I was one of the two recipients following ACM Multimedia 2017! #ACMMM What a wonderful idea this is to encourage our community to communicate, both internally and to other related communities, about our events, our key research results and all the wonderful things the multimedia community stands for!  I have always been surprised by how limited social media engagement is within the multimedia community. Your initiative has all my support! Let’s disseminate our research interest and activities on social media! @SIGMM #Motivated


The SIGMM flagship conference took place on October 23-27 at the Computer History Museum in Mountain View, California, USA. For its 25th edition, the organizing committee had prepared an attractive program, cleverly mixing expected classics (e.g. the Best Paper session, Grand Challenges, the Open Source Software Competition) with brand new sessions (such as Fast Forward and Thematic Workshops, Business Idea Venture, and the Novel Topics Track). For this edition, the conference adopted a single paper length, removing the boundary between long and short papers. The TPC Co-Chairs and Area Chairs had the responsibility of directing accepted papers to either an oral session or a thematic workshop.

Thematic workshops took the form of poster presentations. Presenters were asked to provide a short video briefly motivating their work, with the intention of making the videos available online for reference after the conference (possibly with a link to the full paper and the poster!). However, this did not come through, as publication permissions were not cleared in time; the idea is interesting, though, and should be considered for future editions. Fast forward sessions (or thematic workshop pitches) are short, targeted presentations aimed at attracting the audience to the thematic workshop where the papers are presented (as posters in this case). While such short presentations allow conference attendees to efficiently identify which posters are relevant to them, it is crucial for presenters to be well prepared and to concentrate on highlighting one key research idea, as time is very limited. It also gives posters more exposure. I would be in favor of keeping such sessions for future ACM Multimedia editions.

The 25th edition of ACM MM wasn't short of keynotes: no fewer than six industry keynotes punctuated the conference's half days. The first keynote, by Achin Bhowmik from Starkey, focused on audio as a means of “Enhancing and Augmenting Human Perception with Artificial Intelligence”. Bill Dally from NVidia presented “Efficient Methods and Hardware for Deep Learning”, in short why we all need GPUs! “Building Multi-Modal Interfaces for Smartphones” was the topic presented by Injong Rhee (Samsung Electronics); Scott Silver (YouTube) discussed the difficulties in “Bringing a Billion Hours to Life” (referring to the vast quantities of video uploaded and viewed on the sharing platform, and the long tail). Edward Chang from HTC presented “DeepQ: Advancing Healthcare Through AI and VR” and demonstrated how healthcare is benefiting, and will benefit, from AR, VR and AI. Danny Lange from Unity Technologies highlighted how important machine learning and deep learning are in the game industry in “Bringing Gaming, VR, and AR to Life with Deep Learning”. Personally, I would have preferred a mix of industry and academic keynotes, as I found some of the keynotes not targeted at an audience of computer scientists.

Arnold W. M. Smeulders received the SIGMM Technical Achievement Award for his outstanding and pioneering contribution to defining and bridging the semantic gap in content-based image retrieval (his lecture is here: https://youtu.be/n8kLxKNjQ0A). His talk was sharp, enlightening and very well received by the audience.

The @sigmm rising star award went to Dr Liangliang Cao for his contribution to large-scale multimedia recognition and social media mining.

The conference was noticeably flavored with trendy topics such as AI, human-augmentation technologies, virtual and augmented reality, and machine (deep) learning, as can be seen from the various works rewarded.

The Best Paper award was given to Bokun Wang, Yang Yang, Xing Xu, Alan Hanjalic, Heng Tao Shen for their work on “Adversarial Cross-Modal Retrieval“.

Yuan Tian, Suraj Raghuraman, Thiru Annaswamy, Aleksander Borresen, Klara Nahrstedt, Balakrishnan Prabhakaran received the Best Student Paper award for the paper “H-TIME: Haptic-enabled Tele-Immersive Musculoskeletal Examination“.

The Best demo award went to “NexGenTV: Providing Real-Time Insight during Political Debates in a Second Screen Application” by Olfa Ben Ahmed, Gabriel Sargent, Florian Garnier, Benoit Huet, Vincent Claveau, Laurence Couturier, Raphaël Troncy, Guillaume Gravier, Philémon Bouzy and Fabrice Leménorel.

The Best Open source software award was received by Hao Dong, Akara Supratak, Luo Mai, Fangde Liu, Axel Oehmichen, Simiao Yu, Yike Guo for “TensorLayer: A Versatile Library for Efficient Deep Learning Development“.

The Best Grand Challenge Video Captioning Paper award went to “Knowing Yourself: Improving Video Caption via In-depth Recap“, by Qin Jin, Shizhe Chen, Jia Chen, Alexander Hauptmann.

The Best Grand Challenge Social Media Prediction Paper award went to Chih-Chung Hsu, Ying-Chin Lee, Ping-En Lu, Shian-Shin Lu, Hsiao-Ting Lai, Chihg-Chu Huang, Chun Wang, Yang-Jiun Lin, Weng-Tai Su for “Social Media Prediction Based on Residual Learning and Random Forest“.

Finally, the Best Brave New Idea Paper award was conferred to John R Smith, Dhiraj Joshi, Benoit Huet, Winston Hsu and Zef Cota for the paper “Harnessing A.I. for Augmenting Creativity: Application to Movie Trailer Creation“.

A few years back, the multimedia community was concerned with the lack of truly multimedia publications. In my opinion, those days are behind us. The technical program has evolved into a richer and broader one, let’s keep the momentum!

The location was a wonderful opportunity for many of the attendees to take a stroll down memory lane and see computers and devices (VT100, PC, etc.) from the past, thanks to the complimentary entrance to the museum exhibitions. The “isolated” location of the conference venue meant going out for lunch was out of the question given the duration of the lunch break; as a solution, the organizers catered buffet lunches. This resulted in the majority of the attendees interacting and mixing over the lunch break while eating, which could be an effective way to better integrate new participants and strengthen the community. Both the welcome reception and the banquet were held successfully within the Computer History Museum. Both events offered yet another opportunity for new connections to be made and for further interaction between attendees. Indeed, the atmosphere on both occasions was relaxed, lively and joyful.

All in all, ACM MM 2017 was another successful edition of our flagship conference, many thanks to the entire organizing team and see you all in Seoul for ACM MM 2018 http://www.acmmm.org/2018/ and follow @sigmm on Twitter!

Report from ACM Multimedia 2017 – by Conor Keighrey


My name is Conor Keighrey; I'm a PhD candidate at the Athlone Institute of Technology in Athlone, Co. Westmeath, Ireland. The focus of my research is understanding the key influencing factors that affect Quality of Experience (QoE) in emerging immersive multimedia experiences, with a specific focus on applications in the speech and language therapy domain. This research is funded by the Irish Research Council Government of Ireland Postgraduate Scholarship Programme. I'm delighted to have been asked to present this report to the SIGMM community as a result of my social media activity at the ACM Multimedia Conference.

Launched in 1993, the ACM Multimedia (ACMMM) Conference held its 25th anniversary event in Mountain View, California. The conference was located in the heart of Silicon Valley, at the inspirational Computer History Museum.

Under five focal themes, the conference called for papers on topics relating to multimedia: Experience, Systems and Applications, Understanding, Novel Topics, and Engagement.

Keynote addresses were delivered by high-profile, industry-leading experts from the field of multimedia, providing insight into active developments at their companies. The speakers were:

  • Achin Bhowmik (CTO & EVP, Starkey, USA)
  • Bill Dally (Senior Vice President and Chief Scientist, NVidia, USA)
  • Injong Rhee (CTO & EVP, Samsung Electronics, Korea)
  • Edward Y. Chang (President, HTC, Taiwan)
  • Scott Silver (Vice President, Google, USA)
  • Danny Lange (Vice President, Unity Technologies, USA)

Keynote highlights included Bill Dally's talk on “Efficient Methods and Hardware for Deep Learning”. Bill provided insight into the work NVidia is doing with neural networks, the hardware which drives them, and the techniques the company is using to make them more efficient. He also highlighted that AI should be thought of not as a mechanism which replaces humans but as one which empowers them, allowing us to explore more intellectual activities.

Danny Lange of Unity Technologies discussed the application of the Unity game engine to create scenarios in which machine learning models can be trained. His presentation, entitled “Bringing Gaming, VR, and AR to Life with Deep Learning”, described the capture of data for self-driving cars to prepare for unexpected occurrences in the real world (e.g. pedestrian activity or other cars behaving in unpredicted ways).

A number of the Keynotes were captured by FXPAL (an ACMMM Platinum Sponsor) and are available here.

With an acceptance rate of 27.63% (684 reviewed, 189 accepted), the main track at ACMMM showcased a diverse collection of research from academic institutes around the globe. An abundance of work was presented in the ever-expanding areas of deep/machine learning and virtual/augmented/mixed reality, as well as the traditional multimedia fields.
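The acceptance rates quoted in this report (27.63% for the main track, and 12.93% for the thematic workshops described later) are easy to check; a quick Python sketch using only the figures reported here confirms the arithmetic:

```python
# Acceptance-rate check using the figures reported for ACMMM 2017.
def acceptance_rate(accepted, reviewed):
    """Return the acceptance rate as a percentage, rounded to 2 decimals."""
    return round(accepted / reviewed * 100, 2)

print(acceptance_rate(189, 684))  # 27.63 (main track: 189 of 684 reviewed)
print(acceptance_rate(64, 495))   # 12.93 (thematic workshops: 64 of 495 reviewed)
```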


The importance of gender equality and diversity with respect to advancing the careers of women in STEM has never been greater. Sponsored by SIGMM, the Women/Diversity in MM lunch took place on the first day of ACMMM. Speakers such as Prof. Noel O'Connor discussed the significance of initiatives such as Athena SWAN (Scientific Women's Academic Network) within Dublin City University (DCU). Katherine Breeden (pictured left), an Assistant Professor in the Department of Computer Science at Harvey Mudd College (HMC), presented a fantastic talk on gender balance at HMC. Katherine's discussion highlighted the key changes which have resulted in more women than men graduating with a degree in computer science at the college.

Other highlights from day 1 include a paper presented in the Experience 2 (Perceptual, Affect, and Interaction) session, chaired by Susanne Boll (University of Oldenburg). Researchers from the National University of Singapore presented the results of a multisensory virtual cocktail (Vocktail) experience, which was well received.

 

Through the stimulation of three sensory modalities, Vocktails aim to create virtual flavor and augment taste experiences through a customizable, interactive drinking utensil. Controlled by a mobile device, participants of the study experienced augmented taste (electrical stimulation of the tongue), smell (micro air-pumps), and visual (RGB light projected onto the liquid) stimuli as they used the system. For more information, check out the paper entitled “Vocktail: A Virtual Cocktail for Pairing Digital Taste, Smell, and Color Sensations” on the ACM Digital Library.

Day 3 of the conference included a session entitled Brave New Ideas. The session presented a fantastic variety of work focused on the use of multimedia technologies to enhance or create intelligent systems. Demonstrating AI as an assistive tool and winning the Best Brave New Idea Paper award, the paper entitled “Harnessing A.I. for Augmenting Creativity: Application to Movie Trailer Creation” (ACM Digital Library) describes the first-ever human-machine collaboration for creating a real movie trailer. Through multi-modal semantic extraction, inclusive of audio-visual and scene analysis and a statistical approach, key moments which characterize horror films were defined. On this basis, the AI selected 10 scenes from a feature-length film, which were then developed alongside a professional filmmaker into an exciting movie trailer. Officially released by 20th Century Fox, the complete AI trailer for the horror movie “Morgan” can be viewed here.

A new addition to this year's edition of ACMMM was the inclusion of thematic workshops. Four individual workshops (as outlined below) provided an opportunity for papers which could not be accommodated within the main track to be presented to the multimedia research community. A total of 495 papers were reviewed, from which 64 were accepted (12.93%). Authors of accepted papers presented their work via on-stage thematic workshop pitches, followed by poster presentations on Monday the 23rd and Friday the 27th. The workshop themes were as follows:

  • Experience (Organised by Wanmin Wu)
  • Systems and Applications (Organised by Roger Zimmermann & He Ma)
  • Engagement (Organised by Jianchao Yang)
  • Understanding (Organised by Qi Tian)

Presented as part of the thematic workshop pitches, one of the most fascinating demos at the conference was a body of work carried out by Audrey Ziwei Hu (University of Toronto). Her paper, entitled “Liquid Jets as Logic-Computing Fluid-User-Interfaces”, describes a fluid (water) user interface presented as a logic-computing device: water jets form a medium for tactile interaction and control, creating a musical instrument known as a hydraulophone.

Steve Mann (pictured left) from Stanford University, who is regarded as “The Father of Wearable Computing”, provided a fantastic live demonstration of the device. The full paper can be found on the ACM Digital Library, and a live demo can be seen here.

At large-scale events such as ACMMM, the importance of social media reporting and interaction has never been greater. More than 250 social media interactions (tweets, retweets, and likes) were monitored using the #SIGMM and #ACMMM hashtags, as outlined by the SIGMM Records prior to the event. Descriptive (and multimedia-enhanced) social media reports give those who encounter an unavoidable schedule overlap an opportunity to gain some insight into the other work presented at the conference.

From my own perspective (as a PhD student), the most important aspect of social media interaction is that reports often serve as a conversational piece. Developing a social presence throughout the many coffee breaks and social events during the conference is key to building a network of contacts within any community. As a newcomer this can often be a daunting task; recognition by other social media reporters offers the perfect ice-breaker, providing an opportunity to discuss and inform each other of the ongoing work within the multimedia community. As a result of my own online reporting, I was recognized numerous times throughout the conference. Staying active on social media often leads to the development of a research audience and a social media presence among peers. Engaging such an audience is key to the success of those who wish to follow a path in academia or research.

Building on my own experience, continued attendance at SIGMM conferences (irrespective of paper submission) has many advantages. While the predominant role of a conference is to disseminate work, the informative aspect of attending such events is often overlooked. The area of multimedia research is moving at a fast pace, and having the opportunity to engage directly with researchers in your field of expertise is of the utmost importance. Attendance at ACMMM and other SIGMM conferences, such as ACM Multimedia Systems, has inspired me to explore alternative methodologies within my own research. Without a doubt, continued attendance will inspire my research as I move forward.

ACM Multimedia '18 (October 22nd – 26th): Seoul, South Korea, with its diverse landscape of modern skyscrapers mixed with traditional Buddhist temples and palaces, will host the 26th annual ACMMM. The 2018 event will without a doubt present a variety of work from the multimedia research community. Regular paper abstracts are due on the 30th of March (full manuscripts are due on the 8th of April). For more information on next year's ACM Multimedia conference, check out the following link: http://www.acmmm.org/2018

The Deep Learning Indaba Report

Abstract

Given the focus on deep learning and machine learning, there is a need to address the problem of low participation of Africans in data science and artificial intelligence. The Deep Learning Indaba was thus born to stimulate the participation of Africans within the research and innovation landscape surrounding deep learning and machine learning. This column reports on the Deep Learning Indaba event, a 5-day series of introductory lectures on deep learning held from 10-15 September 2017, coupled with tutorial sessions in which participants gained practical experience with deep learning software packages. The column also includes interviews with some of the organisers about the origin and future plans of the Deep Learning Indaba.

Introduction

Africans have low participation in the areas of science called deep learning and machine learning: at the 2016 Neural Information Processing Systems (NIPS'16) conference, not a single accepted paper had an author from a research institution in Africa (http://www.deeplearningindaba.com/blog/missing-continents-a-study-using-accepted-nips-papers).

Given the increasing focus on deep learning, and the more general area of machine learning, there is a need to address this problem of low participation of Africans in the technology that underlies the recent advances in data science and artificial intelligence that is set to transform the way the world works. The Deep Learning Indaba was thus born, aiming to be a series of master classes on deep learning and machine learning for African researchers and technologists. The purpose of the Deep Learning Indaba was to stimulate the participation of Africans, within the research and innovation landscape surrounding deep learning and machine learning.

What is an ‘indaba’?

According to the organisers ‘indaba’ is a Zulu word that simply means gathering or meeting. There are several words for such meetings (that are held throughout southern Africa) including an imbizo (in Xhosa), an intlanganiso, and a lekgotla (in Sesotho), a baraza (in Kiswahili) in Kenya and Tanzania, and padare (in Shona) in Zimbabwe. Indabas have several functions: to listen and share news of members of the community, to discuss common interests and issues facing the community, and to give advice and coach others. Using the word ‘indaba’ for the Deep Learning event connects it to other community gatherings that are similarly held by cultures throughout the world. The Deep Learning Indaba is about the spirit of coming together, of sharing and learning and is one of the core values of the event.

The Deep Learning Indaba

After a couple of months of furious activity by the organisers, roughly 300 students, researchers and machine learning practitioners from all over Africa gathered for the first Deep Learning Indaba, from 10-15 September 2017, at the University of the Witwatersrand, Johannesburg, South Africa. More than 30 African countries were represented for an intense week of immersion in deep learning.

The Deep Learning Indaba consisted of a 5-day series of introductory lectures on Deep Learning, coupled with tutorial sessions where participants gained practical experience with deep learning software packages such as TensorFlow. The format of the Deep Learning Indaba was based on the intense summer school experience of NIPS. Presenters at the Indaba included prominent figures in the machine learning community such as Nando de Freitas, Ulrich Paquet and Yann Dauphin. The lecture sessions were all recorded and all the practical tutorials are also available online: Lectures and Tutorials.
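As a flavour of what such hands-on tutorial sessions typically cover, here is a minimal, self-contained sketch of one gradient-descent step for a linear model fitted to toy data. This is illustrative only: it uses plain Python rather than TensorFlow, and it is not code from the Indaba tutorials.

```python
import random

# Illustrative sketch (not the Indaba's material): one gradient-descent
# step for a linear model y = w*x + b on toy data generated from a known line.
random.seed(0)
xs = [random.uniform(-1, 1) for _ in range(100)]
ys = [3.0 * x + 0.5 for x in xs]        # toy targets

w, b, lr = 0.0, 0.0, 0.1                # initial parameters, learning rate

def mse(w, b):
    """Mean squared error of the current model over the toy dataset."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

loss_before = mse(w, b)

# Gradients of the mean-squared-error loss with respect to w and b
grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / len(xs)
grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / len(xs)
w -= lr * grad_w
b -= lr * grad_b

print(mse(w, b) < loss_before)          # True: one gradient step reduces the loss
```

Frameworks such as TensorFlow automate exactly this loop: they compute the gradients automatically and apply the update across many parameters at once.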

After organising the first successful Deep Learning Indaba in Africa (a report on its outcomes can be found online), the organisers have already started planning the next two Deep Learning Indabas, which will take place in 2018 and 2019. More information can be found at the Deep Learning Indaba website http://www.deeplearningindaba.com.

Having been privileged to attend this first Deep Learning Indaba, we interviewed a number of the organisers to learn more about the origin and future plans of the event. The interviewed organisers were Ulrich Paquet and Stephan Gouws.

Question 1: What was the origin of the Deep Learning Indaba?

Ulrich Paquet: We’d have to dig into history a bit here, as the dream of taking ICML (International Conference on Machine Learning) to South Africa has been around for a while. The topic was again raised at the end of 2016, when Shakir and I sat at NIPS (Conference on Neural Information Processing Systems), and said “let’s find a way to make something happen in 2017.” We were waiting for the right opportunity. Stephan has been thinking along these lines, and so has George Konidaris. I met Benjamin Rosman in January or February over e-mail, and within a day we were already strategizing what to do.

We didn’t want to take a big conference to South Africa, as people parachute in and out, without properly investing in education. How can we make the best possible investment in South African machine learning? We thought a summer school would be the best vehicle, but more than that, we wanted a summer school that would replicate the intense NIPS experience in South Africa: networking, parties, high-octane teaching, poster sessions, debates and workshops…

Shakir asked Demis Hassabis for funding in February this year, and Demis was incredibly supportive. And that got the ball rolling…

Stephan Gouws: It began with the question that was whispered among many South Africans in the machine learning industry: “how can we bring ICML to South Africa?” Early in 2017, Ulrich Paquet and Shakir Mohamed (both from Google DeepMind) began discussing how a summer-school-like event could be held in South Africa. A summer-school-like event was chosen as it typically has a bigger impact after the event than a typical conference does. Benjamin Rosman (from the South African Council for Scientific and Industrial Research) and Nando de Freitas (also from Google DeepMind) joined the discussion in February. A fantastic group of researchers from South Africa was gathered who shared the vision of making the event a reality. I suggested the name “Deep Learning Indaba”, we registered a domain, and from there we got the ball rolling!

Question 2: What did the organisers want to achieve with the Indaba?

Ulrich Paquet: Strengthening African Machine Learning

“a shared space to learn, to share, and to debate the state-of-the-art in machine learning and artificial intelligence”

  • Teaching and mentoring
  • Building a strong research community
  • Overcoming isolation

We also wanted to work towards inclusion, build a community, build confidence, and influence government policy.

Stephan Gouws: Our vision is to strengthen machine learning in Africa. Machine learning experts, workshops and conferences are mostly concentrated in North America and Western Europe. Africans do not easily get the opportunity to be exposed to such events, as they are far away, expensive to attend, and so on. Furthermore, at a conference a group of experts fly in, discuss the state of the art of the field, and then fly away again. A conference does not easily allow for a transfer of expertise, and therefore the local community does not gain much from it. With the Indaba, we hoped to facilitate knowledge transfer (for which a summer-school-like event is better suited), and also to create networking opportunities for students, industry, academics and the international presenters.

Question 3: Why was the Indaba held in South Africa?

Ulrich Paquet: All of the (original) organizers are South African, and really care about the development of their own country. We want to reach beyond South Africa, though, and tried to include as many institutions as possible (more than 20 African countries were represented).

But, one has to remember that the first Indaba was essentially an experiment. We had to start somewhere! We benefit by having like-minded local organizers 🙂

Stephan Gouws: All the organisers are originally from South Africa and want to support and strengthen the machine learning field in South Africa (and eventually in the rest of Africa).

Question 4: What were the expectations beforehand for the Indaba? (For example, how many people did the organisers expect would attend?)

Ulrich Paquet: Well, we originally wanted to run a series of master classes for 40 students. We had ABSOLUTELY NO idea how many students would apply, or if any would even apply. We were very surprised when we hit more than 700 applications by our deadline, and by then, the whole game changed. We couldn’t take 40 out of 700, and decided to go for the largest lecture hall we could possibly find (for 300 people).

There are then other logistics of scale that come into play: feeding everyone, transporting everyone, running practical sessions, etc. And it has to be within budget!! The cap at 300 seemed to work well.

Question 5: Are there any plans for the future of the Indaba? Are you planning on making it an annual event?

Ulrich Paquet: Yes, definitely.

Stephan Gouws: Nothing official yet, but the plan from the beginning was to make it an annual event.

[Editor]: The Deep Learning Indaba 2018 has since been announced and more information can be found at the following link: http://www.deeplearningindaba.com/indaba-2018.html. The organisers have also announced locally organised, one-day IndabaX events, to be held from 26 March to 6 April 2018 with the aim of strengthening the African machine learning community. Details on obtaining support for organising an IndabaX event can be found at the main site: http://www.deeplearningindaba.com/indabax

Question 6: How can students, researchers and people from industry still get and stay involved after the Indaba?

Ulrich Paquet: There are many things that could be changed with enough critical mass. One, we hope, is to ensure that the climate for research in sub-Saharan Africa is as fertile as possible. This will only happen through lots of collaboration and cross-pollination. Some things stand in the way of this kind of collaboration. One is the government KPIs (key performance indicators) that reward research: for AI, they do not properly reward collaboration, and do not properly reward publication in the top-tier venues, which are all conferences (NIPS, ICML). Therefore, they do not reward playing in, and contributing to, the most competitive playing field. These are all things that the AI community in SA should seek to creatively address and change.

We have seen organic South African papers published at UAI and ICML for the first time this year, and the next platforms should be JMLR and NIPS, and then Nature. There’s never been any organic Africa AI or machine learning papers in any of the latter venues. Students should be encouraged to collaborate and submit to them! The nature of the game is that the barrier to entry for these venues is so high, that one has to collaborate… This of course brings me to my point about why research grants (in SA) should be revisited to reflect these outcomes.

Stephan Gouws: In short, yes. All the practicals, lectures and videos are made publicly available. There are also Facebook and WhatsApp groups, and we hope that the discussion and networking will not stop after the 15th of September. As a side note: I am working on ideas (aimed more at postgraduate students) to eventually put a mentor system in place, as well as other types of support for postgraduate students after the Indaba. But it is still early days, and only time will tell.

Biographies of Interviewed Organisers

Ulrich Paquet (Research Scientist, DeepMind, London):

Ulrich Paquet

Dr. Ulrich Paquet is a Research Scientist at DeepMind, London. He really wanted to be an artist before stumbling onto machine learning during a third-year course at the University of Pretoria (South Africa), where he eventually obtained a Master’s degree in Computer Science. In April 2007 Ulrich obtained his PhD from the University of Cambridge with the dissertation “Bayesian Inference for Latent Variable Models.” After his PhD he worked with a start-up called Imense, focusing on face recognition and image similarity search. He then joined Microsoft’s FUSE Labs, based at Microsoft Research Cambridge, where he eventually worked on the Xbox One launch as part of the Xbox Recommendations team. In 2015 he joined another Cambridge start-up, VocalIQ, which was subsequently acquired by Apple, before joining DeepMind in April 2016.

Stephan Gouws (Research Scientist, Google Brain Team):

Stephan Gouws

Dr. Stephan Gouws is a Research Scientist at Google and part of the Google Brain Team that developed TensorFlow and Google’s Neural Machine Translation System. His undergraduate studies were a double major in Electronic Engineering and Computer Science at Stellenbosch University (South Africa). His postgraduate studies in Electronic Engineering were also completed at the MIH Media Lab at Stellenbosch University. He obtained his Master’s degree cum laude in 2010 and his PhD in 2015 with the dissertation “Training Neural Word Embeddings for Transfer Learning and Translation.” During his PhD he spent one year at the Information Sciences Institute (ISI) at the University of Southern California in Los Angeles, and one year at the Montreal Institute for Learning Algorithms, where he worked closely with Yoshua Bengio. He also worked as a Research Intern at both Microsoft Research and Google Brain during this period.

 
The Deep Learning Indaba Organisers:

Shakir Mohamed (Research Scientist, DeepMind, London)
​Nyalleng Moorosi (Researcher, Council for Scientific and Industrial Research, South Africa)
Ulrich Paquet (Research Scientist, DeepMind, London)
​Stephan Gouws (Research Scientist, Google, Brain Team, London)
Vukosi Marivate (Researcher, Council for Scientific and Industrial Research, South Africa)
Willie Brink (Senior Lecturer, Stellenbosch University, South Africa)
Benjamin Rosman (Researcher, Council for Scientific and Industrial Research, South Africa)
Richard Klein (Associate Lecturer, University of the Witwatersrand, South Africa)

Advisory Committee:

Nando De Freitas (Research Scientist, DeepMind, London)
Ben Herbst (Professor, Stellenbosch University)
Bonolo Mathibela (Research Scientist, IBM Research South Africa)
​George Konidaris (Assistant Professor, Brown University)​
​Bubacarr Bah (Research Chair, African Institute for Mathematical Sciences, South Africa)

Report from ACM MMSys 2017

–A report from Christian Timmerer, AAU/Bitmovin Austria

The ACM Multimedia Systems Conference (MMSys) provides a forum for researchers to present and share their latest research findings in multimedia systems. It is a unique event targeting “multimedia systems” from various angles and views across all domains, instead of focusing on a specific aspect or data type. ACM MMSys’17 was held in Taipei, Taiwan, from June 20-23, 2017.

MMSys is a single-track conference which also hosts a series of workshops, namely NOSSDAV, MMVE, and NetGames. Since 2016, the conference has kicked off with overview talks; in 2017 these were: “Geometric representations of 3D scenes” by Geraldine Morin; “Towards Understanding Truly Immersive Multimedia Experiences” by Niall Murray; “Rate Control In The Age Of Vision” by Ketan Mayer-Patel; “Humans, computers, delays and the joys of interaction” by Ragnhild Eg; “Context-aware, perception-guided workload characterization and resource scheduling on mobile phones for interactive applications” by Chung-Ta King and Chun-Han Lin.

Additionally, industry talks have been introduced: “Virtual Reality – The New Era of Future World” by WeiGing Ngang; “The innovation and challenge of Interactive streaming technology” by Wesley Kuo; “What challenges are we facing after Netflix revolutionized TV watching?” by Shuen-Huei Guan; “The overview of app streaming technology” by Sam Ding; “Semantic Awareness in 360 Streaming” by Shannon Chen; “On the frontiers of Video SaaS” by Sega Cheng.

An interesting set of keynotes presented different aspects related to multimedia systems, both at the main conference and at its co-located workshops:

  • Henry Fuchs, The AR/VR Renaissance: opportunities, pitfalls, and remaining problems
  • Julien Lai, Towards Large-scale Deployment of Intelligent Video Analytics Systems
  • Dah Ming Chiu, Smart Streaming of Panoramic Video
  • Bo Li, When Computation Meets Communication: The Case for Scheduling Resources in the Cloud
  • Polly Huang, Measuring Subjective QoE for Interactive System Design in the Mobile Era – Lessons Learned Studying Skype Calls

The program included a diverse set of topics such as immersive experiences in AR and VR, network optimization and delivery, multisensory experiences, processing, rendering, interaction, cloud-based multimedia, IoT connectivity, infrastructure, media streaming, and security. A vital aspect of MMSys is its dedicated sessions for showcasing the latest developments in the area of multimedia systems and for presenting datasets, which are important for enabling reproducibility and sustainability in multimedia systems research.

The social events were a perfect venue for networking and for in-depth discussion of how to advance the state of the art. A welcome reception was held at “LE BLE D’OR (Miramar)”, the conference banquet at the Taipei World Trade Center Club, and finally a tour of the Shilin Night Market was organized.

ACM MMSys 2017 issued the following awards:

  • The Best Paper Award  goes to “A Scalable and Privacy-Aware IoT Service for Live Video Analytics” by Junjue Wang (Carnegie Mellon University), Brandon Amos (Carnegie Mellon University), Anupam Das (Carnegie Mellon University), Padmanabhan Pillai (Intel Labs), Norman Sadeh (Carnegie Mellon University), and Mahadev Satyanarayanan (Carnegie Mellon University).
  • The Best Student Paper Award goes to “A Measurement Study of Oculus 360 Degree Video Streaming” by Chao Zhou (SUNY Binghamton), Zhenhua Li (Tsinghua University), and Yao Liu (SUNY Binghamton).
  • The NOSSDAV’17 Best Paper Award goes to “A Comparative Case Study of HTTP Adaptive Streaming Algorithms in Mobile Networks” by Theodoros Karagkioules (Huawei Technologies France/Telecom ParisTech), Cyril Concolato (Telecom ParisTech), Dimitrios Tsilimantos (Huawei Technologies France), Stefan Valentin (Huawei Technologies France).

Excellence in DASH award sponsored by the DASH-IF 

  • 1st place: “SAP: Stall-Aware Pacing for Improved DASH Video Experience in Cellular Networks” by Ahmed Zahran (University College Cork), Jason J. Quinlan (University College Cork), K. K. Ramakrishnan (University of California, Riverside), and Cormac J. Sreenan (University College Cork)
  • 2nd place: “Improving Video Quality in Crowded Networks Using a DANE” by Jan Willem Kleinrouweler, Britta Meixner and Pablo Cesar (Centrum Wiskunde & Informatica)
  • 3rd place: “Towards Bandwidth Efficient Adaptive Streaming of Omnidirectional Video over HTTP” by Mario Graf (Bitmovin Inc.), Christian Timmerer (Alpen-Adria-Universität Klagenfurt / Bitmovin Inc.), and Christopher Mueller (Bitmovin Inc.)

Finally, student travel grants were sponsored by SIGMM. All details, including pictures, can be found here.


ACM MMSys 2018 will be held in Amsterdam, The Netherlands, June 12 – 15, 2018 and includes the following tracks:

  • Research track: Submission deadline on November 30, 2017
  • Demo track: Submission deadline on February 25, 2018
  • Open Dataset & Software Track: Submission deadline on February 25, 2018

MMSys’18 co-locates the following workshops (with submission deadline on March 1, 2018):

  • MMVE2018: 10th International Workshop on Immersive Mixed and Virtual Environment Systems,
  • NetGames2018: 16th Annual Workshop on Network and Systems Support for Games,
  • NOSSDAV2018: 28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video,
  • PV2018: 23rd Packet Video Workshop

MMSys’18 will also include special sessions (submission deadline on December 15, 2017); details are available on the conference website.

Report from ICMR 2017

ACM International Conference on Multimedia Retrieval (ICMR) 2017

ACM ICMR 2017 in “Little Paris”

ACM ICMR is the premier International Conference on Multimedia Retrieval, and since 2011 it has “illuminated the state of the art in multimedia retrieval”. This year, ICMR was held in a wonderful location: Bucharest, Romania, also known as “Little Paris”. Every year at ICMR I learn something new, and here is what I learnt this year.

Final Conference Shot at UP Bucharest

UNDERSTANDING THE TANGIBLE: object, scenes, semantic categories – everything we can see.

1) Objects (and YODA) can be easily tracked in videos.

Arnold Smeulders delivered a brilliant keynote on “things” retrieval: given an object in an image, can we find (and retrieve) it in other images, videos, and beyond? He presented a very interesting technique for tracking objects (e.g. Yoda) in videos, based on similarity learnt through siamese networks.

Tracking Yoda with Siamese Networks
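The core idea of siamese-style tracking can be sketched in a few lines: the template (the object in the first frame) and each candidate window in a later frame pass through the same embedding function, and the most similar candidate wins. This is an illustrative sketch only; the `embed()` function below is a hypothetical stand-in for a learned convolutional network, not the method from the keynote:

```python
import numpy as np

def embed(patch: np.ndarray) -> np.ndarray:
    """Stand-in embedding: flatten and L2-normalise the patch.
    A real siamese tracker would use a learned convolutional network
    applied identically to template and candidates."""
    v = patch.ravel().astype(float)
    return v / (np.linalg.norm(v) + 1e-9)

def track(template: np.ndarray, candidates: list[np.ndarray]) -> int:
    """Return the index of the candidate most similar to the template."""
    t = embed(template)
    scores = [t @ embed(c) for c in candidates]  # cosine similarity
    return int(np.argmax(scores))

# Toy example: the second candidate is a noisy copy of the template,
# so it should be selected as the tracked object.
rng = np.random.default_rng(1)
template = rng.random((8, 8))
candidates = [rng.random((8, 8)),
              template + rng.normal(0, 0.05, (8, 8)),
              rng.random((8, 8))]
print(track(template, candidates))
```

The “siamese” part is precisely that the same embedding is used on both sides of the comparison, so similarity in embedding space is meaningful.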

2) Wearables + computer vision help explore cultural heritage sites.

As shown in his keynote, at MICC, University of Florence, Alberto del Bimbo and his amazing team have designed smart audio guides for indoor and outdoor spaces. The system detects, recognises, and describes landmarks and artworks from wearable camera inputs (and GPS coordinates, in the case of outdoor spaces).

3) We can finally quantify how much images provide complementary semantics compared to text [BEST MULTIMODAL PAPER AWARD].

For ages, the community has asked how relevant different modalities are for multimedia analysis: this paper (http://dl.acm.org/citation.cfm?id=3078991) finally proposes a solution to quantify information gaps between different modalities.

4) Exploring news corpuses is now very easy: news graphs are easy to navigate and aware of the type of relations between articles.

Remi Bois and his colleagues presented this framework (http://dl.acm.org/citation.cfm?id=3079023), made for professional journalists and the general public, for seamlessly browsing through large-scale news corpora. They built a graph whose nodes are articles in a news corpus. The most relevant items for each article are chosen (and linked) based on an adaptive nearest-neighbour technique. Each link is then characterised according to the type of relation between the two linked nodes.
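To make the construction concrete, here is a small sketch of the general idea (not the authors' actual algorithm): each article becomes a node, linked to its most similar neighbours, with an adaptive cutoff so that articles lacking genuinely close matches receive fewer links. The embedding vectors and the 90%-of-best cutoff rule are invented for illustration:

```python
import numpy as np

def build_news_graph(embeddings: np.ndarray, max_links: int = 3) -> dict[int, list[int]]:
    """Link each article to its nearest neighbours, with an adaptive
    cutoff relative to the best match (illustrative rule only)."""
    n = len(embeddings)
    # Cosine similarity matrix between all articles.
    norms = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    sim = norms @ norms.T
    np.fill_diagonal(sim, -np.inf)  # never link an article to itself
    graph = {}
    for i in range(n):
        order = np.argsort(sim[i])[::-1][:max_links]
        # Adaptive cutoff: keep only neighbours at least 90% as similar
        # as this article's best match.
        threshold = 0.9 * sim[i, order[0]]
        graph[i] = [int(j) for j in order if sim[i, j] >= threshold]
    return graph

# Toy corpus: articles 0 and 1 are near-duplicates; article 2 is unrelated,
# so it gets only a single weak link while 0 and 1 link to each other.
emb = np.array([[1.0, 0.0, 0.1],
                [0.9, 0.1, 0.1],
                [0.0, 1.0, 0.0]])
print(build_news_graph(emb))
```

A link-typing step, as in the paper, would then label each edge (e.g. near-duplicate vs. background), but that classification is beyond this sketch.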

5) Panorama outdoor images are much easier to localise.

In his beautiful work (https://t.co/3PHCZIrA4N), Ahmet Iscen from Inria developed an algorithm for location prediction from StreetView images, outperforming the state of the art thanks to an intelligent stitching pre-processing step: predicting locations from panoramas (stitched individual views) instead of individual street images improves performance dramatically!

UNDERSTANDING THE INTANGIBLE: artistic aspects, beauty, intent: everything we can perceive

1) Image search intent can be predicted by the way we look.

In his best paper candidate research work (http://dl.acm.org/citation.cfm?id=3078995), Mohammad Soleymani showed that image search intent (seeking information, finding content, or re-finding content) can be predicted from physiological responses (eye gaze) and implicit user interaction (mouse movements).

2) Real-time detection of fake tweets is now possible using user and textual cues.

Another best paper candidate (http://dl.acm.org/citation.cfm?id=3078979), this time from CERTH. The team collected a large dataset of fake/real sample tweets spanning 17 events and built an effective model for misleading content detection based on tweet content and user characteristics. A live demo is available here: http://reveal-mklab.iti.gr/reveal/fake/

3) Music tracks have different functions in our daily lives.

Researchers from TU Delft have developed an algorithm (http://dl.acm.org/citation.cfm?id=3078997) which classifies music tracks according to their purpose in our daily activities: relax, study and workout.

4) By transferring image style we can make images more memorable!

The team at University of Trento built an automatic framework (https://arxiv.org/abs/1704.01745) to improve image memorability. A selector finds the style seeds (i.e. abstract paintings) which are likely to increase memorability of a given image, and after style transfer, the image will be more memorable!

5) Neural networks can help retrieve and discover children's book illustrations.

In this amazing work (https://arxiv.org/pdf/1704.03057.pdf), motivated by real children's experiences, Pinar and her team from Hacettepe University collected a large dataset of children's book illustrations and found that neural networks can predict and transfer style, making it possible to render many other illustrations in the style of “Winnie the Witch”.

Winnie the Witch

6) Locals perceive their neighborhood as less interesting, more dangerous and dirtier compared to non-locals.

In this wonderful work (http://www.idiap.ch/~gatica/publications/SantaniRuizGatica-icmr17.pdf), presented by Darshan Santani from IDIAP, researchers asked locals and crowd-workers to look at pictures from various neighborhoods in Guanajuato and rate them according to interestingness, cleanliness, and safety.

THE FUTURE: What’s Next?

1) We will be able to anonymize images of outdoor spaces thanks to Instagram filters, as proposed by this work (http://dl.acm.org/citation.cfm?id=3080543) in the Brave New Idea session.  When an image of an outdoor space is manipulated with appropriate Instagram filters, the location of the image can be masked from vision-based geolocation classifiers.

2) Soon we will be able to embed watermarks in our Deep Neural Network models in order to protect our intellectual property [BEST PAPER AWARD]. This is a disruptive, novel idea, and that is why this work from KDDI Research and Japan National Institute of Informatics won the best paper award. Congratulations!

3) Given an image view of an object, we will predict the other side of things (from Smeulders’ keynote). In the pic: predicting the other side of chairs. Beautiful.

Predicting the other side of things

THANKS: To the organisers, to the volunteers, and to all the authors for their beautiful work 🙂

EDITORIAL NOTE: A more extensive report from ICMR 2017 by Miriam is available on Medium

Report from MMM 2017

MMM 2017 — 23rd International Conference on MultiMedia Modeling

MMM is a leading international conference for researchers and industry practitioners for sharing new ideas, original research results and practical development experiences from all MMM related areas. The 23rd edition of MMM took place on January 4-6 of 2017, on the modern campus of Reykjavik University. In this short report, we outline the major aspects of the conference, including: technical program; best paper session; video browser showdown; demonstrations; keynotes; special sessions; and social events. We end by acknowledging the contributions of the many excellent colleagues who helped us organize the conference. For more details, please refer to the MMM 2017 web site.

Technical Program

The MMM conference calls for research papers reporting original investigation results and demonstrations in all areas related to multimedia modeling technologies and applications. Special sessions were also held that focused on addressing new challenges for the multimedia community.

This year, 149 regular full paper submissions were received, of which 36 were accepted for oral presentation and 33 for poster presentation, for a 46% acceptance ratio. Overall, MMM received 198 submissions for all tracks, and accepted 107 for oral and poster presentation, for a total of 54% acceptance rate. For more details, please refer to the table below.

MMM2017 Submissions and Acceptance Rates

Best Paper Session

Four best paper candidates were selected for the best paper session, which was a plenary session at the start of the conference.

The best paper, by unanimous decision, was “On the Exploration of Convolutional Fusion Networks for Visual Recognition” by Yu Liu, Yanming Guo, and Michael S. Lew. In this paper, the authors propose an efficient multi-scale fusion architecture, called convolutional fusion networks (CFN), which can generate the side branches from multi-scale intermediate layers while consuming few parameters.

Phoebe Chen, Laurent Amsaleg and Shin’ichi Satoh (left) present the Best Paper Award to Yu Liu and Yanming Guo (right).

The best student paper, partially chosen due to the excellent presentation of the work, was “Cross-modal Recipe Retrieval: How to Cook This Dish?” by Jingjing Chen, Lei Pang, and Chong-Wah Ngo. In this work, the problem of sharing food pictures from the viewpoint of cross-modality analysis was explored. Given a large number of image and recipe pairs acquired from the Internet, a joint space is learnt to locally capture the ingredient correspondence from images and recipes.

Phoebe Chen, Laurent Amsaleg and Shin’ichi Satoh (left) present the Best Student Paper Award to Jingjing Chen and Chong-Wah Ngo (right).

The two runners-up were “Spatio-temporal VLAD Encoding for Human Action Recognition in Videos” by Ionut Cosmin Duta, Bogdan Ionescu, Kiyoharu Aizawa, and Nicu Sebe, and “A Framework of Privacy-Preserving Image Recognition for Image-Based Information Services” by Kojiro Fujii, Kazuaki Nakamura, Naoko Nitta, and Noboru Babaguchi.

Video Browser Showdown

The Video Browser Showdown (VBS) is an annual live video search competition, which has been organized as a special session at MMM conferences since 2012. In VBS, researchers evaluate and demonstrate the efficiency of their exploratory video retrieval tools on a shared data set in front of the audience. The participating teams start with a short presentation of their system and then perform several video retrieval tasks with a moderately large video collection (about 600 hours of video content). This year, seven teams registered for VBS, although one team could not compete for personal and technical reasons. For the first time in 2017, live judging was included, in which a panel of expert judges made real-time decisions about the accuracy of the submissions for one-third of the tasks.

Teams and spectators in the Video Browser Showdown.

On the social side, three changes were also made from previous conferences. First, VBS was held in a plenary session, to avoid conflicts with other schedule items. Second, the conference reception was held at VBS, which meant that attendees had extra incentives to attend VBS, namely food and drink. And third, Alan Smeaton served as “color commentator” during the competition, interviewing the organizers and participants, and helping explain to the audience what was going on. All of these changes worked well, and contributed to a very well attended VBS session.

The winners of VBS 2017, after a very even and exciting competition, were Luca Rossetto, Ivan Giangreco, Claudiu Tanase, Heiko Schuldt, Stephane Dupont and Omar Seddati, with their IMOTION system.

Demonstrations

Five demonstrations were presented at MMM. As in previous years, the best demonstration was selected using both a popular vote and a selection committee. And, as in previous years, both methods produced the same winner, which was: “DeepStyleCam: A Real-time Style Transfer App on iOS” by Ryosuke Tanno, Shin Matsuo, Wataru Shimoda, and Keiji Yanai.

The winners of the Best Demonstration competition hard at work presenting their system.

Keynotes

The first keynote, held in the first session of the conference, was “Multimedia Analytics: From Data to Insight” by Marcel Worring, University of Amsterdam, Netherlands. He reported on a novel multimedia analytics model based on an extensive survey of over eight hundred papers. In the analytics model, the need for semantic navigation of the collection is emphasized and multimedia analytics tasks are placed on an exploration-search axis. Categorization is then proposed as a suitable umbrella task for realizing the exploration-search axis in the model. In the end, he considered the scalability of the model to collections of 100 million images, moving towards methods which truly support interactive insight gain in huge collections.

Björn Þór Jónsson introduces the first keynote speaker, Marcel Worring (right).

The second keynote, held in the last session of the conference, was “Creating Future Values in Information Access Research through NTCIR” by Noriko Kando, National Institute of Informatics, Japan. She reported on NTCIR (NII Testbeds and Community for Information access Research), which is a series of evaluation workshops designed to enhance the research in information access technologies, such as information retrieval, question answering, and summarization using East-Asian languages, by providing infrastructures for research and evaluation. Prof Kando provided motivations for the participation in such benchmarking activities and she highlighted the range of scientific tasks and challenges that have been explored at NTCIR over the past twenty years. She ended with ideas for the future direction of NTCIR.

Noriko Kando presents the second MMM keynote.

Special Sessions

During the conference, four special sessions were held. Special sessions are mini-venues, each focusing on one state-of-the-art research direction within the multimedia field. The sessions are proposed and chaired by international researchers, who also manage the review process, in coordination with the Program Committee Chairs. This year’s sessions were:
– “Social Media Retrieval and Recommendation” organized by Liqiang Nie, Yan Yan, and Benoit Huet;
– “Modeling Multimedia Behaviors” organized by Peng Wang, Frank Hopfgartner, and Liang Bai;
– “Multimedia Computing for Intelligent Life” organized by Zhineng Chen, Wei Zhang, Ting Yao, Kai-Lung Hua, and Wen-Huang Cheng; and
– “Multimedia and Multimodal Interaction for Health and Basic Care Applications” organized by Stefanos Vrochidis, Leo Wanner, Elisabeth André, and Klaus Schoeffmann.

Social Events

This year, there were two main social events at MMM 2017: a welcome reception at the Video Browser Showdown, as discussed above, and the conference banquet. Optional tours then allowed participants to further enjoy their stay on the unique and beautiful island.

The conference banquet was held in two parts. First, we visited the exotic Blue Lagoon, which is widely recognised as one of the modern wonders of the world and one of the most popular tourist destinations in Iceland. MMM participants had the option of bathing for two hours in this extraordinary spa, and applying the healing silica mud to their skin, before heading back for the banquet in Reykjavík.

The banquet itself was then held at the Harpa Reykjavik Concert Hall and Conference Centre in downtown Reykjavík. Harpa is one of Reykjavik‘s most recent, yet greatest and most distinguished landmarks. It is a cultural and social centre in the heart of the city and features stunning views of the surrounding mountains and the North Atlantic Ocean.

Harpa, the venue of the conference banquet.

During the banquet, Steering Committee Chair Phoebe Chen gave a historical overview of the MMM conferences and announced the venues for MMM 2018 (Bangkok, Thailand) and MMM 2019 (Thessaloniki, Greece), before awards for the best contributions were presented. Finally, participants were entertained by a small choir, and were even asked to participate in singing a traditional Icelandic folk song.

MMM 2018 will be held at Chulalongkorn University in Bangkok, Thailand.  See http://mmm2018.chula.ac.th/.

Acknowledgements

There are many people who deserve appreciation for their invaluable contributions to MMM 2017. First and foremost, we would like to thank our Program Committee Chairs, Laurent Amsaleg and Shin’ichi Satoh, who did excellent work in organizing the review process and helping us with the organization of the conference; indeed they are still hard at work with an MTAP special issue for selected papers from the conference. The Proceedings Chair, Gylfi Þór Guðmundsson, and Local Organization Chair, Marta Kristín Lárusdóttir, were also tirelessly involved in the conference organization and deserve much gratitude.

Other conference officers contributed to the organization and deserve thanks: Frank Hopfgartner and Esra Acar (Demonstration Chairs); Klaus Schöffmann, Werner Bailer and Jakub Lokoč (VBS Chairs); Yantao Zhang and Tao Mei (Sponsorship Chairs); all the Special Session Chairs listed above; the 150-strong Program Committee, who did an excellent job with the reviews; and the MMM Steering Committee, for entrusting us with the organization of MMM 2017.

Finally, we would like to thank our student volunteers (Atli Freyr Einarsson, Bjarni Kristján Leifsson, Björgvin Birkir Björgvinsson, Caroline Butschek, Freysteinn Alfreðsson, Hanna Ragnarsdóttir, Harpa Guðjónsdóttir), our hosts at Reykjavík University (in particular Arnar Egilsson, Aðalsteinn Hjálmarsson, Jón Ingi Hjálmarsson and Þórunn Hilda Jónasdóttir), the CP Reykjavik conference service, and all others who helped make the conference a success.

Report from ICACNI 2015

Report from the 3rd International Conference on Advanced Computing, Networking, and Informatics

Inauguration of 3rd ICACNI 2015

The 3rd International Conference on Advanced Computing, Networking and Informatics (ICACNI 2015), organized by the School of Computer Engineering, KIIT University, Odisha, India, was held during 23–25 June 2015.

Prof. Nikhil R. Pal during his keynote

The conference commenced with a keynote by Prof. Nikhil R. Pal (Fellow IEEE, Indian Statistical Institute, Kolkata, India) on ‘A Fuzzy Rule-Based Approach to Single Frame Super Resolution’.

Authors listening to technical presentations

Apart from three regular tracks on advanced computing, networking, and informatics, the conference hosted three invited special sessions. More than 550 articles were received across the different tracks of the conference, of which 132 were finally selected for presentation and for publication in Springer's Smart Innovation, Systems and Technologies series as Volumes 43 and 44.

Prof. Nabendu Chaki during his technical talk

Extended versions of a few outstanding articles will be published in special issues of the Egyptian Informatics Journal and Innovations in Systems and Software Engineering (a NASA journal). The conference also featured a technical talk by Prof. Nabendu Chaki (Senior Member IEEE, Calcutta University, India) on ‘Evolution from Web-based Applications to Cloud Services: A Case Study with Remote Healthcare’.

A snapshot from the award ceremony

The conference recognised several excellent works and gave away eight awards in different categories. It successfully brought together academic scientists, professors, research scholars and students to share and disseminate knowledge and scientific research related to the conference themes. The 4th ICACNI 2016 is scheduled to be held at the National Institute of Technology Rourkela, Odisha, India.

Summary of the 5th BAMMF

Bay Area Multimedia Forum (BAMMF)

BAMMF is a Bay Area Multimedia Forum series. Experts from both academia and industry are invited to exchange ideas and information through talks, tutorials, posters, panel discussions and networking sessions. Topics of the forum include emerging areas in vision, audio, touch, speech, text, various sensors, human-computer interaction, natural language processing, machine learning, media-related signal processing, communication, and cross-media analysis. Talks at the event may cover advances in algorithms and development, demonstrations of new inventions, product innovation, business opportunities, and more. If you are interested in giving a presentation at the forum, please contact us.

The 5th BAMMF

The 5th BAMMF was held in the George E. Pake Auditorium in Palo Alto, CA, USA on November 20, 2014. The slides and videos of the speakers at the forum have been made available on the BAMMF web page, and we provide here an overview of their talks. For speakers’ bios, the slides and videos, please visit the web page.

Industrial Impact of Deep Learning – From Speech Recognition to Language and Multimodal Processing

Li Deng (Deep Learning Technology Center, Microsoft Research, Redmond, USA)

Since 2010, deep neural networks have made a real impact on the speech recognition industry, building upon earlier work on (shallow) neural nets and (deep) graphical models developed by both the speech and machine learning communities. This keynote will first reflect on the historical path to this transformative success. The role of well-timed academic-industrial collaboration will be highlighted, as will advances in big data, big compute, and the seamless integration of application-domain knowledge of speech with general principles of deep learning. An overview will then be given of the sweeping achievements of deep learning in speech recognition since its initial success in 2010 (as well as in image recognition since 2012), achievements that have resulted in across-the-board, industry-wide deployment of deep learning. The final part of the talk will focus on applications of deep learning to large-scale language/text and multimodal processing, a more challenging area where potentially much greater industrial impact than in speech and image recognition is emerging.

Brewing a Deeper Understanding of Images

Yangqing Jia (Google)

In this talk I will introduce recent developments in the image recognition field from two perspectives: as a researcher and as an engineer. For the first part, I will describe our recent entry “GoogLeNet”, which won the ImageNet 2014 challenge, including the motivation of the model and the knowledge learned since its inception. For the second part, I will dive into the practical details of Caffe, an open-source deep learning library I created at UC Berkeley, and show how one can use the toolkit for a quick start in deep learning as well as for integration and deployment in real-world applications.

Applied Deep Learning

Ronan Collobert (Facebook)

I am interested in machine learning algorithms which can be applied in real-life applications and which can be trained on “raw data”. Specifically, I prefer to trade simple “shallow” algorithms with task-specific handcrafted features for more complex (“deeper”) algorithms trained on raw features. In that respect, I will present several general deep learning architectures which excel in performance on various Natural Language, Speech and Image Processing tasks. I will look into specific issues related to each application domain, and will attempt to propose general solutions for each use case.

Compositional Language and Visual Understanding

Richard Socher (Stanford)

In this talk, I will describe deep learning algorithms that learn representations for language that are useful for solving a variety of complex language tasks. I will focus on 3 projects:

  • Contextual sentiment analysis (e.g. an algorithm that actually learns what’s positive in the sentence: “The Android phone is better than the iPhone”)
  • Question answering to win trivia competitions (like IBM Watson’s Jeopardy system but with one neural network)
  • Multimodal sentence-image embeddings to find images that visualize sentences and vice versa (with a fun demo!)

All three tasks are solved with a similar type of recursive neural network algorithm.
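The recursive idea common to these tasks, a single composition function applied repeatedly over a binary parse tree, can be sketched with a toy example. All dimensions, weights, and the tiny vocabulary below are illustrative assumptions, not details from the talk:

```python
import numpy as np

# Toy recursive neural network in the spirit of the talk: every tree node's
# vector is computed from its children's vectors with the SAME weight matrix,
# so one network handles phrases of any length.
rng = np.random.default_rng(0)
D = 4                                       # word/phrase vector dimension (illustrative)
W = rng.standard_normal((D, 2 * D)) * 0.1   # shared composition weights
b = np.zeros(D)

# Tiny made-up vocabulary of word vectors.
vocab = {w: rng.standard_normal(D) * 0.1
         for w in ["the", "android", "phone", "is", "better"]}

def compose(left, right):
    """Merge two child vectors into one parent vector."""
    return np.tanh(W @ np.concatenate([left, right]) + b)

def encode(tree):
    """Recursively encode a binary parse tree given as nested tuples."""
    if isinstance(tree, str):
        return vocab[tree]
    left, right = tree
    return compose(encode(left), encode(right))

# A binary parse of "the android phone is better".
tree = (("the", ("android", "phone")), ("is", "better"))
v = encode(tree)
print(v.shape)  # a fixed-size vector for the whole phrase
```

Because the same weights are reused at every node, the network produces a fixed-size vector for a phrase of any length, which is what makes the same machinery applicable to sentiment analysis, question answering, and sentence-image embedding alike.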

 

Report from SLAM 2014

ISCA/IEEE Workshop on Speech, Language and Audio in Multimedia

Following SLAM 2013 in Marseille, France, SLAM 2014 was the second edition of the workshop, held in Malaysia as a satellite of Interspeech 2014. The workshop was organized over two days, one for science and one for socializing and community building. With about 15 papers and 30 attendees, this second edition, still a risky undertaking, showed the will to build a strong scientific community at the frontier of speech and audio processing, natural language processing and multimedia content processing.

The first day featured talks covering various topics related to speech, language and audio processing applied to multimedia data. Two keynotes from Shri Narayanan (University of Southern California) and Min-Yen Kan (National University of Singapore) nicely completed the program.
The second day took us on a tour of Penang, followed by a visit to the campus of Universiti Sains Malaysia, where the local organizers are based. The tour offered plenty of opportunities to strengthen the links between participants and build a stronger community, as intended. Most participants later went on to Singapore to attend Interspeech, the main conference in the domain of speech communication, where further discussions took place.

We hope to co-locate the next edition of SLAM with a multimedia conference such as ACM Multimedia in 2015. Stay posted!