SISAP 2018: 11th International Conference on Similarity Search and Applications

The International Conference on Similarity Search and Applications (SISAP) is an annual forum for researchers and application developers in the area of similarity data management. It aims at the technological problems shared by numerous application domains, such as data mining, information retrieval, multimedia, computer vision, pattern recognition, computational biology, geography, biometrics, machine learning, and many others that make use of similarity search as a necessary supporting service.

From its roots as a regional workshop in metric indexing, SISAP has expanded to become the only international conference entirely devoted to the issues surrounding the theory, design, analysis, practice, and application of content-based and feature-based similarity search. The SISAP initiative has also created a repository serving the similarity search community, for the exchange of examples of real-world applications, the source code for similarity indexes, and experimental testbeds and benchmark data sets. The proceedings of SISAP are published by Springer as a volume in the Lecture Notes in Computer Science (LNCS) series.

The 2018 edition of SISAP was held at the Universidad de Ingeniería y Tecnología (UTEC) in one of the oldest neighborhoods of Lima, in a modern, recently inaugurated building. The conference was held back-to-back, with a shared session, with the International Symposium on String Processing and Information Retrieval (SPIRE), an independent symposium with some intersection with SISAP. The organization was smooth, and a strong technical program was assembled by two co-chairs and sixty program committee members. Each paper was reviewed by at least three referees. The program was completed with three invited speakers of high caliber.

During this 11th edition of SISAP, the first invited speaker was Hanan Samet from the University of Maryland, a pioneer in the similarity search field, with several books published on the subject. Professor Samet presented a state-of-the-art system for news search that uses the geographical location of the user to return more accurate results. The second invited speaker was Alistair Moffat from the University of Melbourne, who delivered a talk about a novel technique for building compressed indexes using Asymmetric Numeral Systems (ANS). ANS is a curious case of a scientific breakthrough not published in a peer-reviewed journal. Although it is available only as an arXiv technical report, it has been widely adopted in industry, including by Google, Facebook, and Amazon. The third keynote talk was delivered in the shared session with SPIRE by Moshe Vardi of Rice University, a most celebrated editor of Communications of the ACM. Professor Vardi’s talk was an eye-opening discussion of jobs conquered by machines and of perspectives on accepting technological change in everyday life. In the same shared session, a keynote presentation of SPIRE was given by Nataša Przulj of University College London, concerning molecular networks and the challenges researchers face in developing a better understanding of them. It is worth noting that roughly 10% of the SPIRE participants were inspired to attend the SISAP technical program.

As is usually the case, SISAP 2018 included a program with papers exploring various similarity-aware data analysis and processing problems from multiple perspectives. The papers presented at the conference in 2018 studied the role of similarity processing in the context of metric search, visual search, nearest neighbor queries, clustering, outlier detection, and graph analysis. Some of the papers had a theoretical emphasis, while others had a systems perspective, presenting experimental evaluations comparing against state-of-the-art methods. An interesting event at the 2018 conference, as at the two previous editions, was a poster session that included all accepted papers. This component of the conference generated many lively interactions between presenters and attendees, giving them the chance not only to learn more about the presented techniques but also to identify potential topics for future collaboration.

A shortlist for the Best Paper Award was created from those conference papers nominated by at least one of their 3 reviewers. An award committee of 3 researchers ranked the shortlisted papers, from which a final ranking was decided using Borda count. The Best Paper Award was presented during the Conference Dinner. In a tradition that began with the 2009 conference in Prague, extended versions of the top-ranked papers were invited for a Special Issue of the Information Systems journal.
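The Borda count mentioned above can be sketched in a few lines. The report does not give the committee's exact scoring or tie-breaking rules, so this is only an illustration under assumptions: each committee member ranks every shortlisted paper, first place among n papers earns n - 1 points, and ties are broken alphabetically. The paper names and the function name `borda_count` are hypothetical.

```python
from collections import defaultdict


def borda_count(rankings):
    """Aggregate ranked ballots with a standard Borda count.

    Each ballot lists candidates from most to least preferred. With n
    candidates on a ballot, first place earns n - 1 points, second place
    n - 2 points, and so on down to 0 for last place.
    """
    scores = defaultdict(int)
    for ballot in rankings:
        n = len(ballot)
        for position, candidate in enumerate(ballot):
            scores[candidate] += n - 1 - position
    # Highest score first; alphabetical tie-break is an assumption.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical ballots from a three-member award committee.
ballots = [
    ["paper_A", "paper_B", "paper_C"],
    ["paper_B", "paper_A", "paper_C"],
    ["paper_A", "paper_C", "paper_B"],
]
final_ranking = borda_count(ballots)
print(final_ranking)  # [('paper_A', 5), ('paper_B', 3), ('paper_C', 1)]
```

Because every position contributes points, a paper that is consistently ranked second can outscore one that splits the committee between first and last place.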

The venue and the location of SISAP 2018 deserve a special mention. In addition to the excellent conference facilities at UTEC, we had many student volunteers who were ready to help ensure that the logistical aspects of the conference ran smoothly. Lima was a superb location for the conference. Our conference dinner was held at the Huaca Pucllana Restaurant, located on the site of amazing archaeological remains within the city itself. We also had many opportunities to enjoy excellently prepared traditional Peruvian food and drink. Before and after the conference, many participants chose to visit Machu Picchu, voted one of the New Seven Wonders of the World.

SISAP 2018 demonstrated that the SISAP community has a strong, stable kernel of researchers who are active in the field of similarity search and committed to fostering the growth of the community. Organizing SISAP is a smooth experience thanks to the support of the Steering Committee and dedicated participants.

SISAP 2019 will be organized in Newark (NJ, USA) by Professor Vincent Oria (NJIT). This attractive location in the New York City metropolitan area will allow for easy and convenient travel to and from the conference. One of the major challenges of the SISAP conference series is to continue to raise its profile in the landscape of scientific events related to information indexing, database and search systems.

Figure 1. The conference dinner at Pachacamac ruins

Figure 2. After the very interesting technical sessions, we ended the conference with an excursion to Lima downtown

Figure 3. Keynote by Vardi

Gender Diversity in SIGMM: We’ll Just Leave This Here As Well


1. Introduction and Background

SIGMM is the Association for Computing Machinery’s (ACM) Special Interest Group (SIG) in Multimedia, one of 36 SIGs in the ACM family.  ACM itself was founded in 1947 and is the world’s largest educational and scientific society for computing, uniting computing educators, researchers and professionals. With almost 100,000 members worldwide, ACM is a strong force in the computing world and is dedicated to advancing the art, science, engineering, and application of information technology.

SIGMM has been operating for nearly 30 years and sponsors 5, soon to be 6, major international conferences each year as well as dozens of workshops and an ACM Transactions journal. SIGMM sponsors several Excellence and Achievement Awards each year, including awards for Technical Achievement, Rising Star, Outstanding PhD Thesis, TOMM Best Paper, and Best TOMM Associate Editor. SIGMM funds student travel scholarships to almost all our conferences, with nearly 50 such student travel grants at the flagship MULTIMEDIA conference in Seoul, Korea, in 2018. SIGMM has two active chapters, one in the Bay Area of San Francisco and one in China. It has a very active online presence, with social media reporters at our conferences, a regular SIGMM Records newsletter, and a weekly news digest. At our flagship conference, SIGMM sponsors women and diversity lunches, Doctoral Symposiums, and a newcomers’ welcome breakfast. SIGMM also funds special initiatives based on suggestions/proposals from the community, as well as a newly-launched conference ambassador program to reach out to other ACM SIGs for collaborations across our conferences.

It is generally accepted that SIGMM has a diversity and inclusion problem which exists at all levels, but we have now realized this and have started to take action.  In September 2017 ACM SIGARCH produced the first of a series of articles on gender diversity in the field of Computer Architecture. SIGARCH members looked at their numbers of representation of women in SIGARCH conferences over the previous 2 years and produced the first of a set of reports entitled “Gender Diversity in Computer Architecture: We’re Just Going to Leave This Here”.


This report generated much online debate and commentary, including at the ACM SIG Governing Board (SGB) meetings in 2017 and in 2018.

At a SIGMM Executive Committee meeting in Mountain View, California in October 2017, SIGMM agreed to replicate the SIGARCH study to examine and measure the (lack of) gender diversity at SIGMM-sponsored conferences. We issued a call offering funding support to do this, but there were no takers, so I did this myself, from within my own research lab.

2. Baselines for Performance Comparison

Before jumping into the numbers it is worth establishing a baseline to measure against. As an industry-wide figure, 17-24% of Computer Science undergrads at US R1 institutions are female as are 17% of those with technical roles at large high-tech companies that report diversity. I also looked at the female representation within some of the other ACM SIGs. While we must accept that inclusiveness and diversity is not just about gender but also about race, ethnicity, nationality, even about institution, we don’t have data on these other aspects so I focus just on gender diversity.

So how does SIGMM compare to other SIGs? Let’s look at SIG memberships using data provided by ACM.

The best (most balanced or least imbalanced) SIGs are CSE (Computer Science Education) with 25% female, Computer Human Interaction (CHI) also with 25% female from among those declaring a gender, though CHI is probably better because it has a greater percentage of undeclared gender, thus a lower proportion of males. The worst SIGs (most imbalanced or least balanced) are PLAN (Programming Languages) with 4% female, and OPS (operating systems) with 5% female.


The figures for SIGMM show 9% female membership with 17% unknown or not declaring which means that among the declared members it is just below 11%. Among the other SIGs this makes us closest to AI (Artificial Intelligence) and to IR (Information Retrieval), though SIGIR has a larger number of members with gender undeclared.
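The adjustment from 9% to just below 11% comes from excluding the undeclared members from the denominator. A minimal sketch of that arithmetic, using only the two SIGMM percentages quoted above (the function name `share_among_declared` is purely illustrative):

```python
def share_among_declared(female_pct, undeclared_pct):
    """Female share as a percentage of members who declared a gender."""
    declared_pct = 100.0 - undeclared_pct
    return 100.0 * female_pct / declared_pct


# SIGMM membership: 9% female, 17% unknown or not declaring.
sigmm_share = share_among_declared(female_pct=9, undeclared_pct=17)
print(f"{sigmm_share:.1f}%")  # 10.8%
```

That is, 9 / 83 is roughly 10.8%, which is the "just below 11%" figure.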


Measuring this against overall ACM memberships we find that ACM members are 68% male, 12% female and 20% undeclared. This makes SIGMM quite mid-table compared to other SIGs, but we’re all doing badly and we all have an imbalance. Interestingly, the MULTIMEDIA Conference in 2018 in Seoul, Korea had 81% male, 18% female and 1% other/undeclared attendees, slightly better than our membership ratio but still not good.

3. Gender Balance at SIGMM Conferences

We [1] carried out a desk study for the 3 major SIGMM conferences, namely MULTIMEDIA with an average attendance of almost 800, the International Conference on Multimedia Retrieval (ICMR) with 230 attendees at the last conference, and Multimedia Systems (MMSys) with about 130 attendees. For each of the last 5 years we trawled through the conference websites, extracting the names/affiliations of the organizing committees, the technical program committees and the invited keynote speakers. We did likewise for the SIGMM award winners. This required us to determine gender for over 2,700 people, although there were duplicates, as the same people can recur on the program committees over multiple years and multiple conferences. Some of these were easy, like “John” and “Susanne”, but these were few, so for the others we searched for them on the web. If we were still searching after 5 minutes, we gave up. [2]

[1] This work was carried out by Agata Wolski, a Summer intern student, and me, during Summer 2018.

[2] The data gathered from this activity is available on request from

The figures for each of these annual conferences for a 5-year period for MULTIMEDIA, for a 4-year period for ICMR and for a 3-year period for MMSys, are shown in the following sequence of charts, first showing the percentages and then the raw numbers, for each conference.







So what do the figures mean in comparison to each other and to our baseline?

The results tell us the following:

  • Almost all the percentages for female participation in the organisation of SIGMM conferences are above the SIGMM membership figure of 9%, which is really closer to 11% when discounting those SIGMM members with gender unassigned. Yet we know that the proportion of female SIGMM members is already much smaller than the 17% female in technology companies and the almost 18% female ACM members when discounting unassigned genders.
  • Even if we were to use the 17% to 18% figures as our baseline, our female participation in SIGMM conference organisation is less than that baseline, meaning our female SIGMM members are not appearing in organisational and committee roles as often as our pro rata membership figures would indicate they should.
  • While each of our conferences falls below these pro rata figures, none of the three conferences is particularly worse than the others.

4. Initiatives Elsewhere to Redress Gender Imbalance

I then examined some of the actions that are carried out elsewhere and that SIGMM could implement, and started by looking at other ACM SIGs.  There I found that some of the other SIGs do some of the following:

  • Women and diversity events at conferences (breakfasts or lunches, like SIGMM does)
  • Women-only networking pre-conference meals at conferences
  • Women-only technical programme events like N2Women
  • Formation of a mentoring group (using Slack) for informal mentoring
  • Highlighting the roles and achievements of women on social media and in newsletters
  • Childcare and companion travel grants for conference attendance

I then looked more broadly at other initiatives and found the following:

  • gender quotas
  • accelerator programs like Athena Swan
  • female-only events like workshops
  • reports like this which act as spotlights

When we put these all together there are three recurring themes which appear across various initiatives:

  1. Networking .. encouraging us to be part of a smaller group within a larger group. This reflects a natural human trait of being tribal: we like to belong to groups, starting with our family but also including the people we have lunch with, go to yoga classes with, or go on holidays with. We each have multiple, sometimes non-overlapping, groups or tribes that we like to be part of. One such group is the network of minorities/women that gets formed as a result of some of these activities.
  2. Peer-to-peer buddying .. again there is a natural human trait whereby older siblings (sisters) tend to help younger ones, from when we are very young and right throughout life. The buddying activity reflects this and gives a form of satisfaction to the older or senior buddy, as well as practical benefit to the younger or more junior buddy.
  3. Role models .. there are several initiatives which try to promote role models as the kinds of people that we ourselves can aspire to be. More often than not, it is the very successful people and the high flyers who are put into these positions of role models, whereas in practice not everyone actually wants to be a high flyer. For many people success in their lives means something different, something less lofty and aspirational, and when we see high-flying successful people promoted as role models our reaction can be the opposite. We can reject them because we don’t want to be in their league, and as a result we can feel depressed and regard ourselves as under-achievers, thus defeating the purpose of having role models in the first place.

5. SIGMM Women’s / Diversity Lunch at MULTIMEDIA 2018

At the ACM MULTIMEDIA Conference in Seoul, Korea in October 2018 SIGMM once again organised a women’s / diversity lunch and about 60 people attended, mostly women.


At the event I gave a high-level overview of the statistics presented earlier in this report, and then, in order to gather feedback from the audience, we held a moderated discussion using Padlet. Padlet is an online bulletin board used to display information (text, images or links) which can be contributed anonymously by an audience. Attendees at the lunch scanned a QR code on their smartphones which opened a browser and allowed them to post comments on the big screen in response to a topic being discussed during the meeting.

The first topic discussed was “What brings you to the MULTIMEDIA Conference?”

  • The answers (anonymous comments) posted included that many are here because they are presenting papers or posters, many want to do networking and to share ideas, to help build the community of like-minded researchers, some are attending in order to meet old friends .. and these are the usual reasons for attending a conference.

For the second topic we asked “What excites you about multimedia as a topic, how did you get into the area?”

  • The answers included the interaction between computer vision and language, the novel applications around multimodality, the multidisciplinary nature and the practical nature of the subject, and the diversity of topics and the people attending.

The third topic was “What is more/less important for you … networking, role models or peer buddies?”

  • From the answers to this, networking was almost universally identified as the most important, and as a follow-on from that, interacting with peers.

Finally we asked “Do you know of an initiative that works, or that you would like to see at SIGMM event(s)?”

  • A variety of suggestions were put forward including holding hackathons, funding undergraduate students from local schools to attend the conference, an ACM award for women only, ring-fenced funding for supporting women only, training for reviewing, and a lot of people wanted mentoring and mentor matching.

6. SIGMM Initiatives

So what will we do in SIGMM?

  • We will continue to encourage networking at SIGMM sponsored conferences. We will fund lunches like the ones at the MULTIMEDIA Conference. We also started a newcomers breakfast at the MULTIMEDIA Conference in 2018 and we will continue with this.
  • We will ensure that all our conference delegates can attend all conference events at all SIGMM conferences without extra fees. This was a SIGMM policy identified in a review of SIGMM conferences some years ago but it has slipped.
  • We will not force but we will facilitate peer-to-peer buddying through the networking events at our conferences and through this we will indirectly help you identify your own role models.
  • We will appoint a diversity coordinator to oversee the women / diversity activities across our SIGMM events and this appointee will be a full member of the SIGMM Executive Committee.
  • We will offer an opportunity for all members of our SIGMM community attending our sponsored conferences, as part of their conference registration, to indicate their availability and interest in taking on an organisational role in SIGMM activities, including conference organisation and/or reviewing. This will give us a pool of people on whose expertise and services we can draw, and we can do so in a way which promotes diversity.

These may appear to be small-scale and relatively minor because we are not getting to the roots of what causes the bias and we are not inducing change to counter the causes of the bias. However these are positive steps, steps in the right direction, and we will now have the gender and other bias issues permanently on our radars.

Report from the SIGMM Emerging Leaders Symposium 2018

The idea of a symposium to bring together the bright new talent within the SIGMM community and to hear their views on some topics within the area and on the future of Multimedia was first mooted in 2014 by Shih-Fu Chang, then SIGMM Chair. That led to the “Rising Stars Symposium” at the MULTIMEDIA Conference in 2015, where 12 invited speakers made presentations on their work as a satellite event to the main conference. After each presentation a respondent, typically an experienced member of the SIGMM community, gave a response or personal interpretation of the presentation. The format worked well and was very thought-provoking, though some people felt that a shorter event, more integrated into the conference, might work better.

For the next year, 2016, the event was run a second time with 6 invited speakers and was indeed more integrated into the main conference. The event skipped a year in 2017, but was brought back for the MULTIMEDIA Conference in 2018 and this time, rather than invite speakers we decided to have an open call with nominations, to make selection for the symposium a competitive process. We also decided to rename the event from Rising Stars Symposium, and call it the “SIGMM Emerging Leaders Symposium”, to avoid confusion with the “SIGMM Rising Star Award”, which is completely different and is awarded annually.

In July 2018 we issued a call for applications to the “Third SIGMM Emerging Leaders Symposium, 2018” which was to be held at the annual MULTIMEDIA Conference in Seoul, Korea, in October 2018. Applications were received and were evaluated by a panel consisting of the following people, and we thank them for volunteering and for their support in doing this.

Werner Bailer, Joanneum Research
Guillaume Gravier, IRISA
Frank Hopfgartner, Sheffield University
Hayley Hung, Delft University (a previous awardee)
Marta Mrak, BBC

Based on the assessment panel recommendations, 4 speakers were included in the Symposium, namely:

Hanwang Zhang, Nanyang Technological University, Singapore
Michael Riegler, Simula, Norway
Jia Jia, Tsinghua University, China
Liqiang Nie, Shandong University, China

The Symposium took place on the last day of the main conference and was chaired by Gerald Friedland, SIGMM Conference Director.


Towards X Visual Reasoning

By Hanwang Zhang (Nanyang Technological University, Singapore)

For decades, we have been interested in detecting objects and classifying them into a fixed vocabulary. With the maturity of these “low-level” vision solutions, we hunger for a “higher-level” representation of visual data, so as to extract visual knowledge rather than merely bags of visual entities, allowing machines to reason about human-level decision-making. In particular, we wish for an “X” reasoning, where X means eXplainable and eXplicit. In this talk, I first reviewed a brief history of symbolism and connectionism, which have alternately driven the development of AI in the past decades. In particular, though deep neural networks, the prevailing incarnation of connectionism, have shown impressive super-human performance in various tasks, they still lag behind us in high-level reasoning. Therefore, I propose a marriage between symbolism and connectionism to exploit their complementary advantages, that is, the proposed X visual reasoning. Second, I introduced the two building blocks of X visual reasoning: visual knowledge acquisition by scene graph detection, and X neural modules applied to that knowledge for reasoning. For scene graph detection, I introduced our recent progress on reinforcement learning of the scene dynamics, which can help to generate coherent scene graphs that respect visual context. For X neural modules, I discussed our most recent work on module design, algorithms, and applications in various visual reasoning tasks such as visual Q&A, natural language grounding, and image captioning. Finally, I outlined some future directions towards X visual reasoning, such as using meta-learning and deep reinforcement learning for more dynamic and efficient X neural module compositions.

Professor Ramesh Jain mentioned that a truly X reasoning should consider the potential human-computer interaction that may change or divert a current reasoning path. This is crucial because human intelligence can respond reasonably to interruptions and incoming evidence.

We can position X visual reasoning in the recent trend of neural-symbolic unification, which is gradually becoming the consensus route towards general AI. The “neural” is good at representation learning and model training, and the “symbolic” is good at knowledge reasoning and model explanation. One should bear in mind that future multimedia systems should exploit the complementary advantages of the “neural-symbolic”.

BioMedia – The Important Role of Multimedia Research for Healthcare

By Michael Riegler (SimulaMet & University of Oslo, Norway)

With the recent rise of machine learning, analysis of medical data has become a hot topic. Nevertheless, the analysis is still often restricted to particular types of images coming from radiology or CT scans. However, vast amounts of multimedia data are continuously collected both within healthcare systems and by users through devices such as cameras, sensors and mobile phones.

In this talk I focused on the potential of multimedia data and applications to improve healthcare systems. First, I looked at the various kinds of data. Information about a person’s health is spread across many data sources such as images, videos, text and sensors. Medical data can also be divided into data with hard and soft ground truth. Hard ground truth means that there are procedures that verify certain labels of the given data (for example a biopsy report for a cancerous tissue sample). Soft ground truth is data that was labeled by medical experts without a verification of the outcome. Different data types also come with different levels of security. For example, activity data from sensors is unlikely to help identify the patient, whereas speech, social media and GPS data come with a higher chance of identification. Finally, it is important to take context into account, and results should be explainable and reproducible. This was followed by a discussion about the importance of multimodal data fusion and context-aware analysis, supported by three example use cases: mental health, artificial reproduction and colonoscopy.

I also discussed the importance of involving medical experts and patients as users. Medical experts and patients are two different user groups, with different needs and requirements. One common requirement for both groups is the need for an explanation of how decisions were taken. In addition, medical experts are mainly interested in support during their daily tasks, but are not very interested in, for example, huge amounts of sensor data from patients, because of the increased amount of work. They prefer interacting with patients rather than with data. Patients, on the other hand, usually prefer to collect a lot of data and be informed about their current status, but are more concerned about their privacy. They also usually want medical experts to take as much data into account as possible when making their assessments.

Professor Susanne Boll mentioned that it is important to find out what is needed to make automatic analysis accepted by hospitals and who is taking the responsibility for decisions made by automatic systems. Understandability and reproducibility of methods were mentioned as an important first step.

The main messages of the talk are that the multimedia community has the diverse skills needed to address several challenges related to medicine, and that it is important to focus on explainable and reproducible methods.

Mental Health Computing via Harvesting Social Media Data

By Jia Jia, Tsinghua University, China

Nowadays, with the rapid pace of life, mental health is receiving widespread attention. Common symptoms like stress, or clinical disorders like depression, are quite harmful, and thus it is of vital significance to detect mental health problems before they lead to severe consequences. Professional mental health criteria like the International Classification of Diseases (ICD-10 [1]) and the Diagnostic and Statistical Manual of Mental Disorders (DSM [2]) have defined distinguishing behaviors in daily life that help in diagnosing disorders. However, traditional interventions based on face-to-face interviews or self-report questionnaires are expensive and lag behind the onset of problems. The potential antipathy towards consulting psychiatrists exacerbates these problems.

Social media platforms, like Twitter and Weibo, have become increasingly prevalent for users to express themselves and interact with friends. The user-generated content (UGC) shared on such platforms may help to better understand the real-life state and emotion of users in a timely manner, making the analysis of users’ mental wellness feasible. Building on these observations, research efforts have also been devoted to the early detection of mental problems.

In this talk, I concentrated on the timely detection of mental wellness issues, focusing on two typical mental problems: stress and depression. Starting with binary user-level detection, I expanded the research by considering the trigger and the severity of the mental problems, involving different social media platforms that are popular in different cultures. I presented my recent progress from three perspectives:

  1. Through self-reported sentence pattern matching, I constructed a series of large-scale well-labeled datasets in the field of online mental health analysis;
  2. Based on previous psychological research, I extracted multiple groups of discriminating features for detection and presented several multi-modal models targeting different contexts. I conducted extensive experiments with my models, demonstrating significantly better performance as compared to the state-of-the-art methods; and
  3. I investigated in detail the contribution per feature, of online behaviors and even cultural differences in different contexts. I managed to reveal behaviors not covered in traditional psychological criteria, and provided new perspectives and insights for current and future research.

At the end of the talk, I also demonstrated the mental health care applications I have developed.

Dr. B. Prabhakaran indicated that understanding mental health is a difficult problem, even for trained doctors, and that we will need to work with psychiatrists sooner rather than later. Thanks to his valuable comments regarding possible future directions, I envisage the use of augmented / mixed reality to create different immersive “controlled” scenarios where human behavior can be studied. For example, I am considering creating stressful situations (such as exams, missing a flight, etc.) to better understand depression. For depression in particular, I plan to incorporate EEG sensor data in my studies.



Towards Micro-Video Understanding

By Liqiang Nie, Shandong University, China

We are living in an era of ever-dwindling attention spans. To feed our hunger for quick content, bite-sized videos embracing the philosophy of “shorter-is-better” are becoming popular with the rise of micro-video sharing services. Typical services include Vine, Snapchat, Viddy, and Kwai. Micro-videos are spreading like wildfire and taking over the content and social media marketing space, by virtue of their brevity, authenticity, communicability, and low cost. Micro-videos can benefit many commercial applications, such as brand building. Despite their value, the analysis and modeling of micro-videos is non-trivial for the following reasons:

  1. micro-videos are short in length and of low quality;
  2. they can be described by multiple heterogeneous channels, spanning from social, visual, and acoustic to textual modalities;
  3. they are organized into a hierarchical ontology in terms of semantic venues; and
  4. there is no available benchmark dataset on micro-videos.

In my talk, I introduced some shallow and deep learning models for micro-video understanding that are worth studying and have proven effective:

  1. Popularity Prediction. Among the large volume of micro-videos, only a small portion will be widely viewed by users, while most will gain little attention. If we can identify the hot and popular micro-videos in advance, it will benefit many applications, such as online marketing and network reservation;
  2. Venue Category Estimation. In a random sample of over 2 million Vine videos, I found that only 1.22% of the videos are associated with venue information. Location information about the videos can benefit many applications, such as footprint recording, personalized applications, and other location-based services; it is thus highly desirable to infer the missing geographic cues;
  3. Low quality sound. As the quality of the acoustic signal is usually relatively low, simply integrating acoustic features with visual and textual features often leads to suboptimal results, or even adversely degrades the overall quality.

In the future, I may try other meaningful tasks such as micro-video captioning or tagging and the detection of unsuitable content. Many micro-videos are annotated with erroneous words, that is, the topic tags or descriptions are not well correlated with the content, and this negatively influences other applications, such as textual query search. It is also common for users to upload violent and erotic videos. At present, detection and alert tasks mainly rely on labor-intensive inspection. I plan to create systems that automatically detect erotic and violent content.

During the presentation, the audience asked about the datasets used in my work. In my previous work, all the videos came from Vine, but this service has been closed. The audience wondered how I will build datasets in the future. As there are many other micro-video sites, such as Kwai and Instagram, I can obtain sufficient data from them to support my further research.

Opinion Column: Survey on ACM Multimedia

For this edition of the Opinion Column, happening in correspondence with ACM Multimedia 2018, we launched a short community survey regarding their perception of the conference. We prepared the survey together with senior members of the community, as well as the organizers of ACM Multimedia 2019. You can find the full survey here.


Overall, we collected 52 responses. The participant sample was slightly skewed towards more senior members of the community: around 70% described themselves as full, associate or assistant professors. Almost 20% were research scientists from industry. Half of the participants were long-term contributors to the conference, having attended more than 6 editions of ACM MM; however, only around a quarter of the participants had attended the last edition of MM in Seoul, Korea.

First, we asked participants to describe what ACM Multimedia means for them, using 3 words. We aggregated the responses in the word cloud below. Bigger words correspond to words with higher frequency. Most participants associated MM with prestigious and high quality content, and with high diversity of topics and modalities. While recognizing its prestige, some respondents showed their interest in a modernization of the MM focus.


Next, we asked respondents “What brings you to ACM Multimedia?”, and provided a set of pre-defined options including “presenting my research”, “networking”, “community building”, “ACM MM is at the core of my scientific interests” and “other” (free text). One in five participants selected all options as relevant to their motivation for attending Multimedia. The large majority of participants (65%) said they attend ACM Multimedia to present research and to network. By inspecting the free-text answers in the “other” option, we found that some people were interested in specific tracks, and that others see MM as a good opportunity to showcase research to their graduate students.

The next question was about paper submission. We wanted to characterize what pushes researchers to submit to ACM Multimedia. We prepared 3 different statements capturing different dimensions of analysis, and asked participants to rate them on a 5-point scale, from “Strongly disagree” (1) to “Strongly agree” (5).

The distribution of agreement for each question is shown in the plot below. Participants tend to neither disagree nor agree about Multimedia as the only possible venue for their papers (average agreement score 2.9); they generally disagreed with the statement “I consider ACM Multimedia mostly to resubmit papers rejected from other venues” (average score 2.0), and strongly agreed on the idea of MM as a premier conference (average score 4.2).


One of the goals of this survey was to help the future Program Chairs of MM 2019 understand the extent to which participants agree with the reviewers’ guidelines that will be introduced in the next edition of the conference. To this end, we invited respondents to express their agreement with a fundamental point of these guidelines: “Remember that the problem [..] is expected to involve more than a single modality, or [..] how people interpret and use multimedia. Papers that address a single modality only and also fail to contribute new knowledge on human use of multimedia must be rejected as out of scope for the conference”.  Around 60% agreed or strongly agreed with this statement, while slightly more than 25% disagreed or strongly disagreed. The remaining 15% had no opinion about the statement.

We also asked participants to share with us any further comments regarding this last question or ACM MM in general. People generally approved of the introduction of these reviewing guidelines, and of the emphasis on multiple modalities, human perception, and applications of multimedia. Some suggested that, given the re-focusing implied by these new reviewing guidelines, the instructions should be made more specific, i.e. chairs should clarify the definition of “involve”: how multimodal should the paper be?

Others encouraged the chairs to clarify even further the broader scope of ACM Multimedia, defining its position with respect to other multimedia conferences (MMSys, MMM), but also with respect to computer vision conferences such as CVPR/ECCV (and to avoid overlapping conference dates).

Some comments proposed rating papers based on their impact on the community and on their level of innovation, even within a single modality, as forcing multiple modalities could “alienate” community members.

Beyond reviewing guidelines, a major theme emerging from the free-text comments was diversity in ACM Multimedia. Several participants called for more geographic diversity in participants and paper authors. Some also noted that more turnover in the organizing committees should be encouraged. Finally, most participants brought up the need for more balance in MM topics: while most accepted papers fall under the general umbrella of “Multimedia Content Understanding”, MM should in the future encourage more papers about systems, arts, and other emerging topics.

With this bottom-up survey analysis, we aimed to give voice to the major themes that the multimedia community cares about, and we hope to continue doing so in future editions of this column. We would like to thank all researchers and community members who contributed by shaping and filling in this survey, and who allowed us to get a broader picture of the community's perception of ACM MM!

SIGMM Records: News, Statistics, and Call for Contributions & Suggestions


A new editorial team has committed to lead the ACM SIGMM Records since the issue of January 2017. The goal is to consolidate the Records as a primary source of information and a communication vehicle for the multimedia community. With these objectives in mind, the Records were re-organized around three main categories (Open Science, Information, and Opinion), for which specific sections and columns were created (more details in

Since then, all sections and columns have provided relevant and high-quality contributions, with a higher impact than anticipated. Since the new epoch of the Records, apart from new columns, two additional initiatives have been incorporated:

  • Best social media reporter: It was decided to award the SIGMM members with the most intense and valuable posts on Social Media during the SIGMM conferences. The selected Best Social Media Reporters are asked to provide a post-conference report to be published in the Records, and get a free registration to one of the upcoming SIGMM conferences. Up to now, the awardees have been: Miriam Redi (ICMR 2017), Christian Timmerer (MMSYS 2017), Benoit Huet and Conor Keighrey (MM 2017), Cathal Gurrin (ICMR 2018) and Gwendal Simon (MMSYS 2018). The criteria for the awards are specified here:
  • Section on QoE: Starting in the third issue of 2018 (September 2018), the Records include a new section on QoE, edited by Tobias Hoßfeld and Christian Timmerer. You can find here the introduction column: 

Apart from the recurrent sections, the community has also contributed relevant feature articles. Some examples include the article about the flow of ideas around SIGMM conferences by Lexing Xie, the article about ACM Fellows in SIGMM by Alan Smeaton, the SIGMM Annual Report (2018) by the Chairs, and an article about data-driven statistics and trends in SIGMM conferences by David Ayman Shamma.

Finally, the editorial team is also working on infrastructural aspects together with ACM. First, an effective communication protocol with the ACM Digital Library has been established, enabling the publication of the issues and individual contributions in HTML format. SIGMM has indeed been pioneering in adopting the HTML format in the publication of articles. Second, the process for migrating the Records website to an ACM server and domain has started, and should be completed before the end of the year.

Pablo Cesar, the editor-in-chief, presented the new team, structure and impact at ACM MM 2017 and will update the community during ACM MM 2018.


Reach of the SIGMM Records

Since August 2018, we have been collecting statistics about visitors and visits to the Records website, and making use of Social Media for disseminating the contributions and news. In these 13 months, the daily number of visitors has ranged approximately between 100 and 400, with this variation strongly influenced by the publication of Social Media posts promoting published content. In these last 13 months, more than 80000 visitors and nearly 500000 visits (i.e. clicks) have been registered.

The top 3 countries with highest number of visitors are US (>19000), China (>10000) and Germany (nearly 7000), and the top 10 all surpass 2000 visitors. Likewise, the top 3 posts with highest impact, in terms of number of visits are listed in Table 1.

Table 1. Top 3 posts on the Records website with highest impact

Post | Publication Date | Number of Visits
Impact of the New @sigmm Records | September 2017 | 3051
Standards Column: JPEG and MPEG | May 2017 | 1374
Practical Guide to Using the YFCC100M and MMCOMMONS on a Budget | October 2017 | 786

Finally, the top 3 referring sites (i.e., external websites from which visitors have clicked a URL to access the Records website) are Facebook (around 2500 references), Google (around 2500 references) and Twitter (>700 references). This suggests that the social media strategy implemented by the editorial team is having a positive impact on the Records.

Regarding Social Media, two @sigmm channels are being used: a Facebook page and a Twitter account (@sigmm). The number of followers is still not high on Facebook (47), but it has significantly increased on Twitter (247) compared to the previous report. However, the impact of the posts on these platforms, in terms of reach, likes and shares, is noteworthy. On Facebook, there are posts that have reached more than 1000 users, and on Twitter there are many tweets with tens of re-tweets and likes.


Our mission is to keep improving and consolidating the Records, and we are very open to getting extra help and feedback. So, if you would like to become a member of our team, or simply have suggestions or ideas, please drop us a line!

We hope you are enjoying every new edition of the Records.

The Editorial team

Opinion Column: Review Process of ACM Multimedia


This quarter, our Community column is dedicated to the review process of ACM Multimedia (MM). We summarize the discussions that arose at various points in time after the first round of reviews was returned to authors.

The core part of the discussion focused on how to improve review quality for ACM MM. Some participants pointed out that there have been complaints about the level and usefulness of some reviews in recent editions of ACM Multimedia. The members of our discussion forums (Facebook and LinkedIn) proposed some solutions.

A semi-automated paper assignment. Participants debated the best way of assigning papers to reviewers. Some suggested that automated assignment, e.g. using TPMS, helps reduce biases at scale: this year MM followed the review model of CVPR, which handled 1,000+ submissions and peer reviews. Other participants observed that automated assignment systems often fail to match papers with the right reviewers. This is mainly due to the diversity of the Multimedia field: even within a single area, there is a lot of diversity in expertise and methodologies. Some participants advocated a two-step solution: (1) a bidding period where reviewers choose their favorite papers based on their areas of expertise, or, alternatively, an automated assignment step; (2) an “expert assignment” period, where, based on the previous choices, Area Chairs select the right people for a paper: a reviewer pool with relevant, complementary expertise.
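To make the two-step proposal more concrete, here is a minimal sketch of how such a pipeline could look. It is purely illustrative: the function names, the reviewer names, and the topic sets are hypothetical, and the greedy coverage rule in step 2 is only a stand-in for the human judgment of an Area Chair, not a description of how MM or TPMS actually work.

```python
def shortlist(paper_topics, expertise, bids, pool_size=4):
    """Step 1: rank reviewers, bidders first, then by topic overlap."""
    def score(name):
        overlap = len(paper_topics & expertise[name])
        return (name in bids, overlap)
    ranked = sorted(expertise, key=score, reverse=True)
    return ranked[:pool_size]


def pick_complementary(paper_topics, expertise, candidates, k=3):
    """Step 2: greedily pick reviewers who cover topics not yet covered.

    This greedy rule only approximates an Area Chair's choice.
    """
    chosen, covered = [], set()
    remaining = list(candidates)
    while remaining and len(chosen) < k:
        best = max(
            remaining,
            key=lambda r: len((paper_topics & expertise[r]) - covered),
        )
        chosen.append(best)
        covered |= paper_topics & expertise[best]
        remaining.remove(best)
    return chosen


# Hypothetical data, for illustration only.
paper = {"video", "audio", "qoe"}
expertise = {
    "ana": {"video", "vision"},
    "ben": {"audio", "speech"},
    "chen": {"qoe", "networking"},
    "dee": {"video", "audio"},
    "eli": {"graphics"},
}
bids = {"ana", "dee"}

pool = shortlist(paper, expertise, bids)
reviewers = pick_complementary(paper, expertise, pool)
print(pool, reviewers)
```

The point of the second step is that the final pool is chosen for complementary coverage of the paper's topics rather than for individual match scores alone, which is the concern raised about fully automated assignment.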

The authors’ advocate. Most participants agreed that the figure of the author’s advocate is crucial for a fair reviewing process, especially for a diverse community such as the Multimedia community. Most participants agreed that the author’s advocate should be provided in all tracks.

Non-anonymity among reviewers. It was observed that revealing the identity of reviewers to the other members of the program committee (e.g. Area Chairs and other reviewers) could encourage responsiveness and commitment during the review and discussion periods.

Quality over quantity. It was pointed out that increasing the number of reviews per paper is not always the right solution. This adds workload on the reviewers, thus potentially decreasing the quality of their reviews.

Less frequent changes in review process. A few participants discussed the frequency of changes in the review process of ACM MM. In recent years, the conference organizers have tried different review formats, often inspired by other communities. It was observed that this lack of continuity in the review process might not leave enough time to evaluate the success of a format, or to measure the quality of the conference overall. Moreover, changes should be communicated to the authors and the reviewers well before they are implemented (and repeatedly, because people tend to overlook them).

This debate led to a higher-level discussion about the identity of the MM community. Some participants interpreted these frequent changes in the review process as some kind of identity crisis. It was proposed to use empirical evidence (e.g. a community survey) to analyse exactly what the MM community actually is and how it should evaluate itself. The risk of becoming a second-tier conference to CVPR was brought up: not only do authors submit papers rejected from CVPR to MM, but, at times, reviewers also assume that MM papers have to be reviewed as CVPR papers, so that the conference potentially loses a lot of interesting papers.

We would like to thank all participants for their time and precious thoughts. As a next step for this column, we might consider running short surveys about specific topics, including the ones discussed in this issue of the SIGMM Records opinion column.

We hope this column will foster fruitful discussions during the conference, which will be held in Seoul, Korea, on 22-26 October 2018.

Report from ACM MMSYS 2018 – by Gwendal Simon

While I was attending the MMSys conference (last June in Amsterdam), I tweeted about my personal highlights of the conference, in the hope of sharing them with those who did not have the opportunity to attend. I was fortunate to be chosen as “Best Social Media Reporter” of the conference, a new award given by ACM SIGMM to promote sharing among researchers on social networks. To celebrate this award, here is a more complete report on the conference!

When I first heard that this year’s edition of MMSys would be attended by around 200 people, I was a bit concerned about whether the event would maintain its signature atmosphere. It was not long before I realized that, fortunately, it would. The core group of researchers who were instrumental in the take-off of the conference in the early 2010s is still present, and these scientists are still sincerely happy to meet new researchers, to chat about the latest trends in the fast-evolving world of online multimedia, and to make sure everybody feels comfortable talking with each other.


I attended my first MMSys in 2012 in North Carolina. Although I had not even submitted a paper to MMSys’12, I decided to attend because the short welcoming text on the website was astonishingly aligned with my own feelings about the academic research world. I rarely read conference welcoming texts, which are usually boring and dispassionate, but the day I took the time to read this particular MMSys text changed my research career. Before 2012, I felt like one lost researcher among thousands of other researchers, whose only motivation is to publish more papers, whatever is at stake. I used to publish sometimes in networking venues, sometimes in system venues, sometimes in multimedia venues… My production was then quite inconsistent, and my experiences attending conferences were not especially exciting.

The MMSys community matches my expectations for several reasons:

  • The size of a typical MMSys conference is human: when you meet someone the first day, you’ll surely meet this fellow again the next day.
  • Informal chat groups are diverse. I’ve the feeling that anybody can feel comfortable enough to chat with any other attendee regardless of gender, nationality, and seniority.
  • A responsible vision of what should be an academic event. The community is not into show-off in luxury resorts, but rather promotes decently cheap conferences in standard places while maximizing fun and interactions. It comes sometimes with the cost of organizing the conference in the facilities of the university (which necessarily means much more work for organizers and volunteers), but social events have never been neglected.
  • People share a set of “values” in their research activities.

This last point is of course the most significant aspect of MMSys. The main idea behind this conference is that multimedia services are not only multimedia but also networks, systems, and experiences. This commitment to a holistic vision of multimedia systems has at least two consequences. First, the typical contributions discussed at this conference have both theoretical and experimental parts, and, to be accepted, papers have to find the right balance between the two sides of the problem. It is definitely challenging, but it brings passionate researchers to the conference. Second, the line between industry and academia is very porous. As a matter of fact, many core researchers of MMSys are either (past or current) employees of company research centers or involved in standards groups and industrial forums. The presence of people involved in the design of products nurtures the academic debates.

As MMSys grows significantly, year after year, I was curious to see whether these “values” remain. Fortunately, they do. The growing reputation has not changed the spirit.


The 2018 edition of the MMSys conference was held on the campus of CWI, near downtown Amsterdam. Thanks to the impressive efforts of all volunteers and local organizers, the event went smoothly in the modern facilities near the University of Amsterdam. As can be expected from a conference in the Netherlands, especially in June, biking was obviously the best way to commute to the conference every morning from anywhere in Amsterdam.

The program contained a fairly high number of inspiring talks, which altogether reflected the “style” of MMSys. We got a mix of entertaining, technological, industry-oriented talks discussing the state of the art and beyond. The two main conference keynotes were given by stellar researchers (who unsurprisingly have bright careers in both academia and industry) on the two hottest topics of the conference. First, Philip Chou (8i Labs) introduced holograms. Phil kind of lives in the future, somewhere five years later than now. And from there, Phil was kind enough to give us a glimpse of the anticipatory technologies that will be developed between our now and his. Undoubtedly everybody will remember his flash-forwarding talk. Then Nuria Oliver (Vodafone) discussed the opportunities to combine IoT and multimedia in a talk that was powerful and energizing. The conference also featured so-called overview talks. The main idea is that expert researchers present the state of the art in areas that have been especially in the spotlight in the past months. The topics this year were 360-degree videos, 5G networks, and per-title video encoding. The experts were from Tiledmedia, Netflix, Huawei and University of Illinois. With such a program, MMSys attendees had the opportunity to catch up on everything they may have missed during the past couple of years.


The MMSys conference also has a long history of commitment to open source and demonstration. This year’s conference was a peak, with an astonishing 45% of papers awarded a reproducibility badge, which means that the authors of these papers agreed to share their dataset and their code, and to make sure that their work can be reproduced by other researchers. I am not aware of any other conference reaching such a ratio of reproducible papers. MMSys is all about sharing, and this reproducibility ratio demonstrates that MMSys researchers see their peers as cooperating researchers rather than competitors.


My personal highlights go to two papers. The first one is a work from researchers from UT Dallas and Mobiweb. It shows a novel, efficient approach to generating human models (skeletal poses) with a regular Kinect. This paper is a sign that Augmented Reality and Virtual Reality will soon be populated by user-generated content, not only synthesized 3D models but also digital captures of real humans. The road toward easy integration of avatars in multimedia scenes is paved, and this work is a good example of it. The second work I would like to highlight in this column comes from researchers at Université Côte d’Azur. The paper deals with head movement in 360-degree videos, but instead of trying to predict movements, the authors propose to edit the content to guide user attention so that head movements are reduced. The approach, which is validated by a real prototype and source code sharing, comes from a multi-disciplinary collaboration with designers, engineers, and human interaction experts. Such multi-disciplinary work is also strongly encouraged at MMSys conferences.


Finally, MMSys is also a full event with several associated workshops. This year, Packet Video (PV) was held with MMSys for the very first time, and it was successful in terms of the number of people who attended. Fortunately, PV did not interfere with NOSSDAV, which is still the main venue for high-quality, innovative and provocative studies. In comparison, both MMVE and NetGames were less crowded, but the discussion in these events was intense and lively, as can be expected when so many experts sit in the same room. That is the purpose of workshops, isn’t it?


A very last word on the social events. The social events of the 2018 edition lived up to the reputation of MMSys: original and friendly. But I won’t say more about them: what happens in MMSys social events stays at MMSys.

The 2019 edition of MMSys will be held on the East Coast of the US, hosted by University of Massachusetts-Amherst. The multimedia community is in a very exciting time of its history. The attention of researchers is shifting from video delivery to immersion, experience, and attention. More than ever, multimedia systems should be studied from multiple interplaying perspectives (network, computation, interfaces). MMSys is thus a perfect place to discuss research challenges and to present breakthrough proposals.

[1] This means that I also had my bunch of rejected papers at MMSys and affiliated workshops. Reviewer #3, whoever you are, you ruined my life (for a couple of hours)

Opinion Column: Privacy and Multimedia


The discussion: multimedia data is affected by new forms of privacy threats, let’s learn, protect, and engage our users.

For this edition of the SIGMM Opinion Column, we carefully selected the discussion’s main topic, looking for an appealing and urgent problem arising for our community. Given the recent Cambridge Analytica scandal, and the upcoming enforcement of the General Data Protection Regulation in EU countries, we thought we should have a collective reflection on ‘privacy and multimedia’.

Users share their data often unintentionally. One could indeed observe a diffuse sense of surprise and anger following the data leaks from Cambridge Analytica. As mentioned in a recent blog post from one of the participants, so far, large-scale data leaks have mainly affected private textual and social data of social media users. However, images and videos also contain private user information. There was a general consensus that it is time for our community to start thinking about how to protect private visual and multimedia data.

It was noted that computer vision technologies are now able to infer sensitive information from images (see, for example, a recent work on sexual orientation detection from social media profile pictures). However, few technologies exist that defend users against automatic inference of private information from their visual data. We will need to design techniques that protect users’ privacy in images as well, beyond simple face de-identification. We might also want users to engage and have fun with image privacy preserving tools, and this is the aim of the Pixel Privacy project.

But in multimedia, we go beyond image analysis. By nature, as multimedia researchers, we combine different sources of information to design better media retrieval or content serving technologies, or to ‘get more than the sum of parts’. While this is what makes our research so special, in the discussion participants noted that multimodal approaches might also generate new forms of privacy threats. Each individual source of data comes with its own privacy dimension, and we should be careful about the multiple privacy breaches we generate by analyzing each modality. At the same time, by combining different media and their privacy dimensions, and performing massive inference on the global multimodal knowledge, we might also be generating new forms of threats to user privacy that individual streams don’t have.

Finally, we should also inform users about these new potential threats:  as experts who are doing ‘awesome cutting-edge work’, we also have a responsibility to make sure people know what the potential consequences are.

A note on the new format, the response rate, and a call for suggestions!

This quarter, we experimented with a new, slimmer format, hoping to reach out to more members of the community, beyond Facebook subscribers.

We extended the outreach beyond Facebook: we used the SIGMM LinkedIn group for our discussion, and we directly contacted senior community members. To engage community members with limited time for long debates, we also lightened the format, asking anyone interested in giving us their opinion on the topic to send us, or share with the group, a one-liner reflecting their view on privacy and multimedia.

Despite the new format, we received a limited number of replies. We will keep trying new formats. Our aim is to generate fruitful  discussions, and gather opinions on crucial problems in a bottom-up fashion. We hope, edition after edition, to get better at giving voice to more and more members of the Multimedia Community.

We are happy to hear your thoughts on how to improve, so please reach out to us!

Sharing and Reproducibility in ACM SIGMM


This column discusses the efforts of ACM SIGMM towards sharing and reproducibility. Apart from the specific sessions dedicated to open source and datasets, ACM Multimedia Systems has, since last year, provided official ACM badges for articles that make artifacts available. This year it set a record, with 45% of the articles earning such a badge.

Without data it is impossible to put theories to the test. Moreover, without running code it is tedious at best to (re)produce and evaluate any results. Yet collecting data and writing code can be a road full of pitfalls, ranging from datasets containing copyrighted materials to algorithms containing bugs. The ideal datasets and software packages are those that are open and transparent for the world to look at, inspect, and use without or with limited restrictions. Such “artifacts” make it possible to establish public consensus on their correctness or otherwise to start a dialogue on how to fix any identified problems.

In our interconnected world, storing and sharing information has never been easier. Despite the temptation for researchers to keep datasets and software to themselves, a growing number are willing to share their resources with others. To further promote this sharing behavior, conferences, workshops, publishers, non-profit and even for-profit companies are increasingly recognizing and supporting these efforts. For example, the ACM Multimedia conference has hosted an open source software competition since 2004, and the ACM Multimedia Systems conference has included an open datasets and software track since 2011. The ACM Digital Library now also hands out badges to public artifacts that have been made available and optionally reviewed and verified by members of the community. At the same time, organizations such as Zenodo and Amazon host open datasets for free. Sharing ultimately pays off: the citation statistics for ACM Multimedia Systems conferences over the past five years, for example, show that half of the 20 most cited papers shared data and code, although such papers have so far represented only a small fraction of the published papers.

Good practices are increasingly adopted. In this year’s edition of the ACM Multimedia Systems conference, 69 works (papers, demos, datasets, software) were accepted, out of which 31 (45%) were awarded an ACM badge. This is a large increase compared to last year, when out of 42 works only a total of 13 (31%) received one. This greatly advances one of the core objectives of both the conference and SIGMM towards open science. At this moment, the ACM Digital Library does not separately index which papers received a badge, making it challenging to find all papers that have one. It further appears that not many other ACM conferences are aware of the badges yet; for example, while ACM Multimedia accepted 16 open source papers in 2016 and 6 papers in 2017, none applied for a badge. This year at ACM Multimedia Systems only “artifacts available” badges have been awarded. For next year our intention is to ensure all dataset and software submissions receive the “artifacts evaluated” badge. This would require several committed community members to spend time working with the authors to get the artifacts running on all major platforms with corresponding detailed documentation.
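As a quick sanity check, the badge ratios quoted above can be recomputed from the raw counts. The snippet below is only a small illustrative sketch; the helper name `badge_ratio` is ours and is not part of any ACM tooling.

```python
def badge_ratio(badged: int, accepted: int) -> float:
    """Return the share of accepted works that received a badge, in percent."""
    if accepted <= 0:
        raise ValueError("accepted must be positive")
    return 100.0 * badged / accepted


# Counts taken from the paragraph above.
ratio_2018 = badge_ratio(31, 69)  # this year's edition
ratio_2017 = badge_ratio(13, 42)  # last year's edition

print(f"2018: {ratio_2018:.0f}%  2017: {ratio_2017:.0f}%")
```

Rounded to whole percentages, the two ratios match the 45% and 31% reported in the text.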

The accepted artifacts this year are diverse in nature: several submissions focus on releasing artifacts related to quality of experience of (mobile/wireless) streaming video, while others center on making datasets and tools related to images, videos, speech, sensors, and events available; in addition, there are a number of contributions in the medical domain. It is great to see such a range of interests in our community!

SIGMM Annual Report (2018)


Dear Readers,

Each year SIGMM, like all ACM SIGs, produces an annual report summarising our activities, which include our sponsored and in-cooperation conferences and also the initiatives we are undertaking to support our community and broaden participation. The report also includes our significant papers, the awards we have given and the major issues that face us going forward. Below is the main text of the SIGMM report 2017-2018, which is augmented by further details on our conferences provided by the ACM Office. We hope you enjoy reading this and learning about what SIGMM does.

SIGMM Annual Report (2018)
Prepared by SIGMM Chair (Alan Smeaton),
Vice Chair (Nicu Sebe), and Conference Director (Gerald Friedland)
August 6th, 2018

Mission: SIGMM provides an international interdisciplinary forum for researchers, engineers, and practitioners in all aspects of multimedia computing, communication, storage and application.

1. Awards:
SIGMM gives out three awards each year and these were as follows:

  • SIGMM Technical Achievement Award for lasting contributions to multimedia computing, communications and applications was presented to Arnold W.M. Smeulders, University of Amsterdam, the Netherlands. The award was given in recognition of his outstanding and pioneering contributions to defining and bridging the semantic gap in content-based image retrieval.
  • SIGMM 2016 Rising Star Award was given to Dr Liangliang Cao of HelloVera.AI for his significant contributions in large-scale multimedia recognition and social media mining.
  • SIGMM Outstanding PhD Thesis in Multimedia Computing Award was given to Chien-Nan (Shannon) Chen for a thesis entitled Semantics-Aware Content Delivery Framework For 3D Tele-Immersion at the University of Illinois at Urbana-Champaign, US.

2. Significant Papers:

The SIGMM flagship conference, ACM Multimedia 2017, was held in Mountain View, Calif., and presented the following awards, in addition to awards for Best Grand Challenge Video Captioning Paper, Best Grand Challenge Social Media Prediction Paper, and Best Brave New Idea Paper:

  • Best paper award to “Adversarial Cross-Modal Retrieval”, by Bokun Wang, Yang Yang, Xing Xu, Alan Hanjalic, Heng Tao Shen
  • Best student paper award to “H-TIME: Haptic-enabled Tele-Immersive Musculoskeletal Examination”, by Yuan Tian, Suraj Raghuraman, Thiru Annaswamy, Aleksander Borresen, Klara Nahrstedt, Balakrishnan Prabhakaran
  • Best demo award to “NexGenTV: Providing Real-Time Insight during Political Debates in a Second Screen Application” by Olfa Ben Ahmed, Gabriel Sargent, Florian Garnier, Benoit Huet, Vincent Claveau, Laurence Couturier, Raphaël Troncy, Guillaume Gravier, Philémon Bouzy and Fabrice Leménorel.
  • Best Open source software award to “TensorLayer: A Versatile Library for Efficient Deep Learning Development” by Hao Dong, Akara Supratak, Luo Mai, Fangde Liu, Axel Oehmichen, Simiao Yu, Yike Guo.

The 9th ACM International Conference on Multimedia Systems (MMSys 2018) was held in Amsterdam, the Netherlands, and presented a range of awards including:

  • Best paper award to “Dynamic Adaptive Streaming for Multi-Viewpoint Omnidirectional Videos” by Xavier Corbillon, Francesca De Simone, Gwendal Simon and Pascal Frossard.
  • Best student-paper award to “Want to Play DASH? A Game Theoretic Approach for Adaptive Streaming over HTTP” by Abdelhak Bentaleb, Ali C. Begen, Saad Harous and Roger Zimmermann.

The International Conference on Multimedia Retrieval (ICMR) 2018 was held in Yokohama, Japan, and presented a range of awards including:

  • Best paper award to “Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval” by Niluthpol Mithun, Juncheng Li, Florian Metze and Amit Roy-Chowdhury.

The best paper and best student paper from each of these three conferences were then reviewed by a specially set up committee to select one paper which has been nominated for Communications of the ACM Research Highlights and that is presently under consideration.

In addition to the above, SIGMM presented the 2017 ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) Nicolas D. Georganas Best Paper Award to the paper “Automatic Generation of Visual-Textual Presentation Layout” (TOMM vol. 12, Issue 2) by Xuyong Yang, Tao Mei, Ying-Qing Xu, Yong Rui, and Shipeng Li.

3. Significant Programs that Provide a Springboard for Further Technical Efforts

  • SIGMM provided support for student travel through grants, at all of our SIGMM-sponsored conferences.
  • Apart from the specific sessions dedicated to open source and datasets, the ACM Multimedia Systems Conference (MMSys) has started to provide official ACM badging for articles that make artifacts available. This year, our second year for doing this, has marked a record with 45% of the articles published at the conference acquiring such a reproducibility badge.

4. Innovative Programs Providing Service to Some Part of Our Technical Community

  • A large part of our research area in SIGMM is driven by the availability of large datasets, usually used for training purposes.  Recent years have shown a large growth in the emergence of openly available datasets coupled with grand challenge events at our conferences and workshops. Mostly these are driven by our corporate researchers but this allows all of our researchers the opportunity to carry out their research at scale.  This provides great opportunities for our community.
  • Following the lead of SIGARCH we have commissioned a study of gender distribution among the SIGMM conferences, conference organization and awards. This report will be completed and presented at our flagship conference in October.  We have also commissioned a study of the conferences and journals which mostly influence, and are influenced by, our own SIGMM conferences as an opportunity for some self-reflection on our origins, and our future.  Both these follow an open call for new initiatives to be supported by SIGMM. 
  • SIGMM Conference Director Gerald Friedland worked with several volunteers from SIGMM to improve the content and organization of ACM Multimedia and connected conferences. Volunteer David Ayman Shamma used data science methods to analyze several ACM MM conferences from the past five years with the goal of identifying biases and patterns of irregularities. Some results were presented at the ACM MM TPC meeting. Volunteers Hayley Hung and Martha Larson gave an account of their expectations and experiences with ACM Multimedia, and Dr. Friedland himself volunteered as a reviewer for conferences of similar size and importance, including NIPS and CSCW, and approached the chairs to get external feedback on what can be improved in the review process. Furthermore, in September, Dr. Friedland will travel to Berlin to visit Lutz Prechelt, who invented a review quality management system. The results of this work will be included in a conference handbook that will set down standard recommendations of best practices for future organizers of SIGMM conferences. We expect the book to be finished by the end of 2018.
  • Last year SIGMM made a decision to try to co-locate conferences and other events as much as possible, and the ACM Multimedia conference was co-located with the European Conference on Computer Vision (ECCV) in 2016 with joint workshops and tutorials. This year the ACM Multimedia Systems (MMSys) conference was co-located with the 10th International Workshop on Immersive Mixed and Virtual Environment Systems (MMVE2018), the 16th Annual Workshop on Network and Systems Support for Games (NetGames2018), the 28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV2018) and the 23rd Packet Video Workshop (PV2018). In addition, the Technical Program Committee meeting for the Multimedia Conference was co-located with the ICMR conference.

5. Events or Programs that Broaden Participation

  • SIGMM has approved the launch of a new conference series called Multimedia Asia which will commence in 2019. This will be run by the SIGMM China Chapter and consolidates two existing multimedia-focused conferences in Asia under the sponsorship and governance of SIGMM. This follows a very detailed review and the successful location for the inaugural conference in 2019 will be announced at our flagship conference in October 2018.
  • The Women / Diversity in Multimedia Lunch at ACM MULTIMEDIA 2017 (previously the Women’s Lunch) continued this year with an enlarged program of featured speakers and discussion which led to the call for the gender study in Multimedia mentioned earlier.
  • SIGMM continues to pursue an active approach to nurturing the careers of our early stage researchers. The “Emerging Leaders” event (formerly known as Rising Stars) skipped a year in 2017 but will be happening again in 2018 at the Multimedia Conference.  Giving these early career researchers the opportunity to showcase their vision helps to raise their visibility and helps SIGMM to enlarge the pool of future volunteers.
  • The expansion we put in place in our social media communication team has proven to be a shrewd move, with a large growth in our website traffic and a raised profile on social media. We also invite conference attendees to post on Twitter and/or Facebook about the papers, demos, and talks that they think are most thought-provoking and forward-looking, and the most active of these attendees are rewarded with a free registration at a future SIGMM-sponsored conference.

6. Issues for SIGMM in the next 2-3 years

  • Like other SIGs, we realize that improving the diversity of the community we serve is essential to continuing our growth and maintaining our importance and relevance. This includes diversity in gender, in geographical location, and in many other facets.  We have started to address these through some of the initiatives mentioned earlier, and at our flagship conference in 2017 we ran a Workshop emphasizing contributions focusing on research from South Africa and the African continent in general.
  • Leadership and supporting young researchers in the early stages of their careers is also important and we highlight this through 2 of our regular awards (Rising Stars and Best Thesis). The “Emerging Leaders” event (formerly known as Rising Stars) skipped a year in 2017 but will be happening again in 2018 at the Multimedia Conference.
  • We wish to reach out to other SIGs with whom we could have productive engagement because we see multimedia as a technology enabler as well as an application unto itself. To this end we will continue to try to hold joint panels or workshops at our conferences.
  • Our research area is marked by the growth and availability of open datasets and grand challenge competitions held at our conferences and workshops. These datasets are often provided from the corporate sector and this is both an opportunity for us to do research on datasets otherwise unavailable to us, as well as being a threat to the balance between corporate influence and independence.
  • In a previous annual report we highlighted the difficulties caused by a significant portion of our conference proceedings not being indexed by Thomson Web of Science. In a similar vein we find our conference proceedings are not used as input into CSRankings, a metrics-based ranking of Computer Science institutions worldwide. Publishing at venues which are considered in CSRankings’ operation is important to much of our community, and while we are in the process of trying to redress this, support from ACM in making this case would be welcome.