Multidisciplinary Column: An Interview with Andrew Quitmeyer

 

Picture of Dr. Andrew Quitmeyer

Could you tell us a bit about your background, and what the road to your current position was?

In general, a lot of my background has been motivated by personal independence and trying to find ways to sustain myself. I was a first-generation grad student (which may explain a lot of my skepticism and confusion about academia in general). I moved out of the house at 15 to go to this cool, free, experimental public boarding high school in Illinois. I went to the University of Illinois because it was the nicest school I could go to for free (despite horrible college counselors telling all the students they should take on hundreds of thousands of dollars of debt to go to the “school of their dreams”).  They didn’t have a film-making program, so I created my own degree for it. Thinking I could actually have a film career seemed risky, and I wanted something that would protect my ability to get a job, so I got an engineering degree too.

I was a bit disappointed in the engineering side though, because I felt we never actually got to build anything on our own. I think a lot of people know me as some kind of “hacker” or “maker”, but I didn’t start doing many physical projects until much later in grad school when I met my friend Hannah Perner-Wilson. She helped pioneer a lot of DIY e-textiles (kobakant.at), and what struck me was how beautifully you could document and play with physical projects. Physical computing seemed an attractive combination of my abilities in documentary filmmaking and engineering.

The other big revelation for me roped in the naturalist side of what I do. I have always loved adventuring in nature, and studying wild creatures, but growing up in the midwest USA, this was never presented as a viable career opportunity. In the same way that it was basically taken for granted in midwestern US culture that studying art was a sort of frivolous hobby for richer kids, a career in biology that didn’t feed into engineering work in some specific industry (agriculture, biotech, etc…) was treated as equally flippant. I tried taking as many science electives as I could in undergrad, but it was because they were fun. Again, it was not until grad school when I had a cool job doing computer vision programming with an ant-lab robot-lab collaboration that I realized the potential error of my ways. Some of the ant biologists invited me to go ant hunting out in the desert after our meeting, and it was so fun and interesting I had a sort of existential meltdown. “Oh no! I screwed up, I could have been a field biologist all this time? Like that’s a job?”

So I worked to sculpt my PhD around these revelations. I wanted to join field biologists in exploring the natural world while using and developing novel technology to help us probe and document these creatures in new ways.

I plowed through my PhD as fast as I could because going to school in the US is expensive. You either have to join a lab to help work on someone else’s (already funded) project or take time away from your research to TA classes. After I got out of there, I did some other projects, and eventually got a job as an assistant professor at the National University of Singapore in the communications and new media department. Unfortunately it seems like I came at a pretty chaotic time (80% of my fellow professors are leaving my division, not to be replaced), and so I will actually be leaving at the end of this semester to figure out a new place to continue doing research and teaching others.

How does multidisciplinary work play a role in your research on “Digital Naturalism”?

The work is basically anti-disciplinary. Instead of relying on a specific field of practice, the work simply sets out towards some basic goals and happily uses any means necessary to get there. Currently this includes a blend of naturalistic experimentation, performance art, filmmaking, interaction design, software and hardware engineering, industrial design, ergonomics, illustration, and storytelling.

The more this work spreads into other disciplines the more robust and interesting I think it will become. For instance, I would love to see more video games developed about interacting with wild animals.

Could you name a grand research challenge in your current field of work?

Let’s talk to animals.

I am a big follower of Janet Murray’s work in Digital Media. She sees the grand challenge of computers as forming this amazing new medium that we all have to collaboratively experiment with to figure out how to truly make use of the new affordances it provides us. For me, the coolest new ability of computers is their behavioral nature. Never before have we had a medium that shares the same unique qualities of living creatures: being able to sense stimuli from the world, process them, and create new stimuli in response. Putting together these senses and actions lets computers give us the first truly “behavioral medium.” Intertwining the behaviors of computers with living creatures opens up a new world of dynamic experimentation.
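
To make that sense-process-respond loop concrete, here is a minimal sketch in Python. Everything in it is a hypothetical stand-in: read_sensor and actuate would wrap real hardware (say, a light sensor and a vibration motor on a microcontroller), and the threshold and timing are arbitrary.

```python
import random
import time

def read_sensor() -> float:
    """Stand-in for a real stimulus sensor (light, vibration, antenna contact)."""
    return random.random()

def actuate(intensity: float) -> None:
    """Stand-in for a real stimulus generator (LED, buzzer, robotic antenna)."""
    print(f"emitting stimulus at intensity {intensity:.2f}")

THRESHOLD = 0.8  # tuning this is where the real field experimentation happens

while True:
    stimulus = read_sensor()   # sense stimuli from the world
    if stimulus > THRESHOLD:   # process them
        actuate(stimulus)      # create a new stimulus in response
    time.sleep(0.1)
```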

When most people think of talking to animals, they imagine some kind of sci-fi, Dr. Dolittle auto-translator. A bird chirps a song, and we get a text message that says “I am looking for more bird food.” This is quite speciesist of us, and upholds that ingrained assumption that all living creatures strive to somehow become more like us.

Instead, I think this digital, behavioral medium holds more value and potential in bringing us into their world and modes of communication. You can learn what the ants are saying by building your own robot ant that taps antennae with the workers around her. You might learn more about a bird’s communication by capturing the bodily movements and physical interactions that give context to its calls than by trying to brute-force decrypt the sounds it makes.

I find anything we can learn about animals and their environments useful, and the behaviors that computers can enact are a key to bringing us into their world. There is a really long road ahead though. Facilitating rich, behavioral interactions with other creatures requires advances, experimentation, and refinement in our ability to sense non-human stimuli and provide realistic stimuli back. Meanwhile, I can barely create a sensor that can detect just the presence or absence of an ant on a tree in the wild. We need a lot more development and experimentation, but I imagine future digital naturalists using technology to turn themselves into goat-men like Thomas Thwaites rather than into Star Trek commanders using some kind of universal translator.

You have been starring in a ‘Hacking the Wild’ television series on the Discovery Channel. How did the idea for this series come about? Do you aim to reach out to particular audiences with the show?

Yeah, that was an interesting experience! Some producers had seen work I had been documenting and producing from expeditions I led during my PhD, and contacted me about turning it into a show. A problem in the entertainment industry is that nobody seems to understand why you would ever not want to be in entertainment. They treat it as a given that that’s what everyone is striving for in their lives. This seems to give them license to treat people poorly and say whatever it takes to get people to do what they want (even if some of these things turn out to be false). So, for instance, I was first told that my show would be about me working with scientists building technology in the jungle, but then it devolved into a survival-genre TV show with just me. The plot became nonsensical (which could have been fun!), but pressure from the industry forced us to keep up the grizzled stereotypes of the genre (“if I don’t find food in the next couple hours…I might not make it out”).

It gave me an interesting chance to insert myself and some of my own ideals into this space though.  One thing that irks me about the survival genre in general is its rhetoric of “conquering nature.” They kept trying to feed me lines about how I would use this device, or this hack to “defeat nature” which is the exact opposite of what I want to do in my work. So I tried to stand my ground and assert that nature is beautiful and fun, and maybe we can use things we build to understand it even better. Many traditional survival audiences didn’t seem to care for it, but I have gotten lots of fan mail from around the world from people who seem to get the real idea of it a bit more – make things outside and use them to play in nature. I remember one nice email from a young kid who would prototype contraptions in their back yard with what they called “electric sticks,” and that was really nice.

You recently organized the first Digital Naturalism Conference (Dinacon), which was quite unlike the types of conferences we would normally encounter. Could you tell us a bit more about Dinacon’s setup, and the reasons why you initiated a conference like this?

Dinacon was half a long-term dream and half a reaction to problems in current academic publishing and conferences.

The basic idea of the Digital Naturalism Conference was to gather all the neat people in my network, spanning many different fields and practices, and get them to hang out together in a beautiful, interesting place. For me, this was a direct continuation of my Digital Naturalism work to re-imagine the sites of scientific exploration. In previous events I had tried to explore combining hackathons with biological field expeditions. These “hiking hacks” looked to design the future of how scientific trips might function in tandem with the design of scientific tools. The conference looked to take this to the next stage and re-imagine what the biological field station of the future might look like.

The more specific design of this conference was built as a reaction to a lot of the problems I see in current academic traditions.  The academic conferences I have taken part in generally had these problems:

  • Exploitative – Powered by unpaid laborers (organizing, reviewing, formatting, advertising) who then have to pay to attend themselves
  • Expensive – only rich folks get to attend (generally with money from their institution)
  • Exclusive – generally you have to already be “vetted” with your papers to attend (not knocking Peer review! Just vetted networking)
  • Steering money in not-great directions – e.g. lining the pockets of fancy hotels and publishing companies
  • Restricted Time – Most conferences leave just enough time to get bored waiting for others’ unenthusiastic presentations to finish, and maybe grab a drink before heading back to all the duties one has. I think for good work to be done, and proper connections to be made in research, people need time to live and work together in a relaxing, exciting environment.

[I go into more details about all this in the post about our conference’s philosophy: https://www.dinacon.org/2017/11/01/philosophy/ ]

Based on these problems, I wanted to experiment with alternative methods for gathering people, sharing information, and reviewing the information they create. I wanted to show that these problems were illnesses within the current system and traditions we perpetuate, and that many alternatives not only exist, but are feasible even on a severely reduced budget. (We started with an initial self-funded budget of $7,000 USD to rent the venue; after the conference was announced, we crowdfunded an additional $11,000 to provide extra amenities and stipends.)

Thus, when creating this conference, we sought to attack each of these challenges. First, we made it free to attend and provided free or subsidized housing. We also made it open to absolutely anyone from any discipline or background. Then we tried to direct what money we did have to spend towards community improvements. For instance, we rented out the Diva Andaman for the duration of the conference. This was a tourism ship that was interested in also helping the biology community by serving as a mobile marine science lab. In return for letting us use the facilities and rooms on the ship, we helped develop ideas and tools for its new laboratories. Finally, and perhaps most importantly, we worked to provide time for the participants. They were allowed to stay for anywhere from 3 days to 3 weeks and encouraged to take time to explore, adjust to the place, and interact with each other.

We also tried to streamline the responsibilities of the participants by having just 3 official “rules”:

  1. You must complete something. Aim big, aim small, just figure out a task for yourself that you can commit yourself to that you can accomplish during your time at the conference. It can be any format you want: sculpture, a movie, a poem, a fingerpainting, a journal article – you just have to finish it!
  2. Document and share it. Everything will be made open-source and publicly accessible!
  3. Provide feedback on two (2) other people’s projects.

The goal of these rules was that, just like at a traditional conference, everyone would leave with a completed work that had been reviewed by their peers. Also, like the reality of any conference, not all of these rules were 100% met. Everyone created something, most documented it, and people gave plenty of feedback to each other, but there wasn’t yet enough infrastructure in place for giving this feedback and documentation more formally. These rules functioned with great success, however, as goal posts leading people towards working on interesting things while also collaborating and sharing their work.

Do you feel Dinacon was successful in promoting inclusivity? What further actions can the community undertake towards this as a whole?

I do, and I was quite happy with the results, but I am excited to build on this aspect even more. We worked hard at reaching out to many communities around the world, especially groups or demographics that may otherwise be overlooked. This was a big factor in where we decided to locate the conference as well. Thailand was great because many folks from around Southeast Asia could easily come, while people from generally richer nations in the West could also make it. I think this is a super important feature for any international conference: make it easier for the less privileged and more difficult for the more privileged.

I genuinely do not understand why giant expensive conferences just keep being held where the rich people already live. Anytime I am at some expensive conference hotel in Singapore, Japan, or the USA, I think about how all that money could go so much further and have a bigger impact on a community elsewhere. For instance, there has NEVER been a CHI conference held anywhere in Africa, South America, or Southeast Asia. These places also have large hotels where you can hook up computers and show a PowerPoint, so it’s not like they are missing the key infrastructure of these types of conferences.

One of the biggest hurdles is money and logistics. We had folks accepted from every continent except Antarctica, but our friends from Ghana couldn’t make it due to the arduous visa process. We had a couple of small micro-travel grants (300-600 bucks of my own money) to help get people over who might not have been able to come otherwise, but I wish we could have made our conference entirely free and covered transportation (instead of just free registration, free food, and free camping housing).

That’s a limitation of a self-funded project: you just try to help as much as you can until you are tapped out. The benefit of it, though, is proving that many people with a middle-class job can actually do this too. Before I got my job, I pledged to put 10% of my earnings towards creating fun, interesting experiences for others. It’s funny that when people spend this amount of money on more established things like cars or religious tithing, people accept it, but when I tell people I am spending $7,000 USD of my own money putting on a free conference about my research, they balk and act like I am nuts. I couldn’t think of anything better to spend 10% of your money on than something that brings you and others joy.

Next year’s conference will likely have sliding-scale registration, though, to help promote greater inclusivity beyond what we could provide out of our own pockets. Having people who can afford to pay a couple hundred for registration help subsidize those who would otherwise have been prevented from coming seems like an equitable solution.

How and in what form do you feel we as academics can be most impactful?

Fighting competitiveness. I think the greatest threat put onto academics is the idea that we are competing with each other. Unfortunately, many institutional policies actually codify this competition into truth. As an academic, your loyalty should be first and foremost to unlocking new ideas about our world that you can share with others. This quality is rarely directly rewarded by any large organization, though. This means that standing up for academic integrity will almost undoubtedly come at a cost. It may cost you your bonus, your grant application, or even your job. In terms of your life and your career, however, I think these will only be short-term expenses, and in fact investments in deeper, more impactful research and experiences.

Academics like to complain about the destructiveness of policies based on pointless metrics and academic cliques, but nothing will change unless you simply stand up against it. Not everyone can afford to stand up against the authorities. Maybe you cannot quit your job because you need the health care, but there are ways for all of us to call out exploitation that we see in institutional or community structures. You need to assess the privileges you do have, and do what you can to help share knowledge and lift up those around you. 

For instance, in my reflection after going to a more traditional conference (during my own conference), I pledged to 

  • no longer help recruit “reviewers” for papers if they are not compensated in some way.
  • avoid reviewing papers for exploitative systems

and

  • transfer my reviewing time to help conferences and journals with open policies.

(More info here: https://www.dinacon.org/2018/06/19/natural-reflection-andy-quitmeyer/.) For now, this pledge excludes me from some of the major conferences in my field, which in turn makes me publish my work in other venues, which many institutions look down on, and this inhibits my hireability. I think it’s worth it, though, to help stop perpetuating these problems onto future generations.

In your opinion, what would make an academic community a healthy community?

I think a healthy academic community would be one where the people are happy, help each other, and help make space for people outside their community to join and share. The only metric I would want to judge the quality of an institution on is how happy its community feels. I don’t care what their output is, especially in baseless numbers of publications or grant money; developing healthy communities is the only way to lead to any kind of long-term sustainable research. You need humans of different abilities and generations watching out for each other, helping each other learn new things, and protecting each other.

Some people try to push the idea that competition is necessary to make people work hard and be productive, or else they will be lazy and greedy. In fact, it’s this competition that creates these side effects. When people are cared for, they are curious, constructive, and helpful.

So keep your eyes open for ways in which your peers or students are being exploited and stand up against it. Reach out to learn about the challenges people around you face, and work on developing opportunities outside the scope of the traditions in your field. I think doing this will help build healthy and productive communities.

 


Bios

Dr. Andrew Quitmeyer is a hacker / adventurer studying intersections between wild animals and computational devices. His academic research in “Digital Naturalism” at the National University of Singapore blends biological fieldwork and DIY digital crafting. This work has taken him through international wildernesses where he’s run workshops with diverse groups of scientists, artists, designers, and engineers.  He runs “Hiking Hacks” around the world where participants build technology entirely in the wild for interacting with nature. His research also inspired a ridiculous spin-off television series he hosted for Discovery Networks called “Hacking the Wild.” He is currently working to establish his own art-science field station fab lab.

Editor Biographies

Dr. Cynthia C. S. Liem is an Assistant Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. She initiated and co-coordinated the European research project PHENICX (2013-2016), focusing on technological enrichment of symphonic concert recordings with partners such as the Royal Concertgebouw Orchestra. Her research interests consider music and multimedia search and recommendation, and increasingly shift towards making people discover new interests and content which would not trivially be retrieved. Beyond her academic activities, Cynthia gained industrial experience at Bell Labs Netherlands, Philips Research and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships, the Google European Doctoral Fellowship 2010 in Multimedia, and a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach.

 

Dr. Jochen Huber is a Senior User Experience Researcher at Synaptics. Previously, he was an SUTD-MIT postdoctoral fellow in the Fluid Interfaces Group at MIT Media Lab and the Augmented Human Lab at Singapore University of Technology and Design. He holds a Ph.D. in Computer Science and degrees in both Mathematics (Dipl.-Math.) and Computer Science (Dipl.-Inform.), all from Technische Universität Darmstadt, Germany. Jochen’s work is situated at the intersection of Human-Computer Interaction and Human Augmentation. He designs, implements and studies novel input technology in the areas of mobile, tangible & non-visual interaction, automotive UX and assistive augmentation. He has co-authored over 60 academic publications and regularly serves as program committee member in premier HCI and multimedia conferences. He was program co-chair of ACM TVX 2016 and Augmented Human 2015 and chaired tracks of ACM Multimedia, ACM Creativity and Cognition and the ACM International Conference on Interactive Surfaces and Spaces, as well as numerous workshops at ACM CHI and IUI. Further information can be found on his personal homepage: http://jochenhuber.com

Interview with Dr. Magda El Zarki and Dr. De-Yu Chen: winners of the Best MMSys’18 Workshop paper award

Abstract

The ACM Multimedia Systems conference (MMSys’18) was recently held in Amsterdam from 12-15 June 2018. The conference brings together researchers in multimedia systems. Four workshops were co-located with MMSys, namely PV’18, NOSSDAV’18, MMVE’18, and NetGames’18. In this column we interview Magda El Zarki and De-Yu Chen, the authors of the best workshop paper, entitled “Improving the Quality of 3D Immersive Interactive Cloud-Based Services Over Unreliable Network”, which was presented at MMVE’18.

Introduction

The ACM Multimedia Systems Conference (MMSys) (mmsys2018.org) was held from 12-15 June in Amsterdam, The Netherlands. The MMSys conference provides a forum for researchers to present and share their latest research findings in multimedia systems. MMSys is a venue for researchers who explore complete multimedia systems that provide a new kind of multimedia experience or whose overall performance improves the state-of-the-art. This touches aspects of many hot topics including but not limited to: adaptive streaming, games, virtual reality, augmented reality, mixed reality, 3D video, Ultra-HD, HDR, immersive systems, plenoptics, 360° video, multimedia IoT, multi- and many-core, GPGPUs, mobile multimedia and 5G, wearable multimedia, P2P, cloud-based multimedia, cyber-physical systems, multi-sensory experiences, smart cities, and QoE.

Four workshops were co-located with MMSys in Amsterdam in June 2018. The paper titled “Improving the Quality of 3D Immersive Interactive Cloud-Based Services Over Unreliable Network” by De-Yu Chen and Magda El Zarki from the University of California, Irvine was awarded the Comcast Best Workshop Paper Award for MMSys 2018, chosen from among papers from the following workshops:

  • MMVE’18 (10th International Workshop on Immersive Mixed and Virtual Environment Systems)
  • NetGames’18 (16th Annual Workshop on Network and Systems Support for Games)
  • NOSSDAV’18 (28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video)
  • PV’18 (23rd Packet Video Workshop)

We approached the authors of the best workshop paper to learn about the research leading up to their paper. 

Could you please give a short summary of the paper that won the MMSys 2018 best workshop paper award?

In this paper we discussed our approach to an adaptive 3D cloud gaming framework. We utilized a collaborative rendering technique to generate partial content on the client, so that the network bandwidth required for streaming the content can be reduced. We also made use of progressive meshes so the system can dynamically adapt to changing performance requirements and resource availability, including network bandwidth and computing capacity. We conducted experiments focused on the system performance under unreliable network connections, e.g., when packets can be lost. Our experimental results show that the proposed framework is more resilient under such conditions, which indicates that the approach has a potential advantage especially for mobile applications.
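
As a toy illustration of this kind of adaptation (a sketch of the general idea, not the authors’ actual system): with progressive meshes, a server can pick how much mesh detail to stream based on the current bandwidth and on how much rendering the client can take over. All names and numbers below are invented.

```python
def choose_detail_level(bandwidth_kbps: float, client_gflops: float,
                        levels: list[dict]) -> dict:
    """Pick the highest-quality level whose streaming and rendering costs
    fit the current bandwidth and client compute budgets."""
    feasible = [lv for lv in levels
                if lv["stream_kbps"] <= bandwidth_kbps
                and lv["render_gflops"] <= client_gflops]
    if feasible:
        return max(feasible, key=lambda lv: lv["quality"])
    # fall back to the cheapest-to-stream level if nothing fits
    return min(levels, key=lambda lv: lv["stream_kbps"])

levels = [
    {"quality": 1, "stream_kbps": 500,  "render_gflops": 5.0},  # coarse mesh, heavy client rendering
    {"quality": 2, "stream_kbps": 2000, "render_gflops": 2.0},
    {"quality": 3, "stream_kbps": 8000, "render_gflops": 0.5},  # fine mesh, mostly streamed
]

print(choose_detail_level(bandwidth_kbps=2500, client_gflops=3.0, levels=levels))
```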

Does the work presented in the paper form part of some bigger research question / research project? If so, could you perhaps give some detail about the broader research that is being conducted?

A more complete discussion of the proposed framework can be found in our technical report, Improving the Quality and Efficiency of 3D Immersive Interactive Cloud Based Services by Providing an Adaptive Application Framework for Better Service Provisioning, where we discussed the performance trade-off between video quality, network bandwidth, and local computation on the client. In this report, we also tried to tackle network latency issues by utilizing the 3D image warping technique. In another paper, Impact of information buffering on a flexible cloud gaming system, we further explored the potential performance improvement of our latency reduction approach when more information can be cached and processed.

We received many valuable suggestions and identified a few important future directions. Unfortunately, De-Yu graduated and decided to pursue a career in industry, so he will likely not be able to continue working on this project in the near future.

Where do you see the impact of your research? What do you hope to accomplish?

Cloud gaming is an up-and-coming area. Major players like Microsoft and NVIDIA have already launched their own projects. However, it seems to me that there is not yet a solution good enough to be widely accepted by users. By providing an alternative approach, we wanted to demonstrate that there are still many unsolved issues and research opportunities, and hopefully inspire further work in this area.

Describe your journey into multimedia research. Why were you initially attracted to multimedia?

De-Yu: My research interest in cloud gaming systems dates back to 2013, when I worked as a research assistant at Academia Sinica, Taiwan. When I first joined Dr. Kuan-Ta Chen’s lab, my background was in parallel and distributed computing. I joined the lab for a project aimed at providing a tool that helps developers do load balancing in massively multiplayer online video games. Later on, I had the opportunity to participate in the lab’s other project, GamingAnywhere, which aimed to build the world’s first open-source cloud gaming system. Being an enthusiastic gamer myself, having the opportunity to work on such a project was really an enjoyable and valuable experience. That experience came to be the main reason for continuing to work in this area.

Magda El Zarki: I have worked in multimedia research since the 1980s, when my PhD project involved the transmission of data, voice and video over a LAN. It was named MAGNET and was one of the first integrated LANs developed for multimedia transmission. My work continued in that direction with the transmission of video over IP. In conjunction with several PhD students over the past 20-30 years, I have developed several tools for the study of video transmission over IP (MPEGTool) and hold several patents related to video over wireless networks. All the work focused on improving the quality of the video via pre- and post-processing of the signal.

Can you profile your current research, its challenges, opportunities, and implications?

There are quite a few challenges in our research. First of all, our approach is an intrusive method: we need to modify the source code of the interactive applications, e.g. games, to apply our method. We found it very hard to find a suitable open source game whose source code is neat and clean and easy to modify. Developing our own fully functioning game is not a reasonable approach, alas, due to the complexity involved. We ended up building a 3D virtual environment walkthrough application to demonstrate our idea. Most reviewers have expressed concerns about synchronization issues in a real interactive game, where there may be AI-controlled objects, non-deterministic processes, or even objects controlled by other players. We agree with the reviewers that this is a very important issue, but currently it is very hard for us to address it with our limited resources.

Most of the other research work in this area faces a similar problem to ours – the lack of a viable open source game for researchers to modify. As a result, researchers are forced to build their own prototype applications for performance evaluation purposes. This brings about another challenge: it is very hard to fairly compare the performance of different approaches given that we all use a different application for testing. However, these difficulties can also be seen as opportunities. There are still many unsolved problems. Some of them may require a lot of time, effort, and resources, but even a little progress can mean a lot, since cloud gaming is an area that is gaining more and more attention from industry as a way to distribute games across many platforms.

“3D immersive and interactive services” seems to encompass both massive multi-user online games as well as augmented and virtual reality. What do you see as important problems for these fields? How can multimedia researchers help to address these problems?

When it comes to gaming or similar interactive applications, it all comes down to the user experience. In the case of cloud gaming, there are many performance metrics that can affect user experience. Identifying what matters most to the users is one of the important problems. In my opinion, interactive latency is the most difficult problem to solve among all performance metrics. There is no trivial way to reduce network latency unless you are willing to pay the cost of large bandwidth pipes. Edge computing may effectively reduce network latency, but it comes with a high deployment cost.

As large companies start developing their own systems, it is getting harder and harder for independent researchers with limited funding and resources to make major contributions in this area. Still, we believe there are a couple of ways independent researchers can make a difference. First, we can limit the scope of the research by simplifying the system, focusing on just one or a few features or components. Unlike corporations, independent researchers usually do not have the resources to build a fully functional system, but we also do not have the obligation to deliver one. That actually enables us to try out some interesting but not so realistic ideas. Second, be open to collaboration. Unlike corporations, which need to keep their projects confidential, we have more freedom to share what we are doing, and potentially get more feedback from others. To sum up, I believe that in an area that has already attracted a lot of interest from industry, researchers should try to find something that companies cannot or are not willing to do, instead of trying to compete with them.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

The real question is: is cloud gaming viable? It seems to make economic sense to offer it as companies try to reach a broader and more remote audience. However, computing costs are cheaper than bandwidth costs, so maybe throwing computing power at the problem makes more sense – make more powerful end devices that can handle the computing load of a complex game and only use the network for player interactivity.

Biographies of MMSys’18 Best Workshop Paper Authors

Prof Magda El Zarki (Professor, University of California, Irvine):

Magda El Zarki

Prof. El Zarki’s lab focuses on multimedia transmission over the Internet. The work consists of both theoretical studies and practical implementations to test the algorithms and new mechanisms to improve quality of service on the user device. Both wireline and wireless networks and all types of video and audio media are considered. Recent work has shifted to networked games and massively multi-user virtual environments (MMUVE). The focus is mostly on studying the quality of experience of players in applications where precision and time constraints are a major concern for game playability. A new effort also focuses on the development of games and virtual experiences in the arena of education and digital heritage.

De-Yu Chen (PhD candidate, University of California, Irvine):

De-Yu Chen

De-Yu Chen is a PhD candidate at UC Irvine. He received his M.S. in Computer Science from National Taiwan University in 2009, and his B.B.A. in Business Administration from National Taiwan University in 2006. His research interests include multimedia systems, computer graphics, big data analytics and visualization, parallel and distributed computing, and cloud computing. His current research project is focused on improving the quality and flexibility of cloud gaming systems.

Report from ACM MMSys 2018 – by Gwendal Simon

While I was attending the MMSys conference (last June in Amsterdam), I tweeted about my personal highlights of the conference, in the hope of sharing them with those who did not have the opportunity to attend. Fortunately, I was chosen as “Best Social Media Reporter” of the conference, a new award given by the ACM SIGMM chapter to promote sharing among researchers on social networks. To celebrate this award, here is a more complete report on the conference!

When I first heard that this year’s edition of MMSys would be attended by around 200 people, I was a bit concerned whether the event would maintain its signature atmosphere. It was not long before I realized that, fortunately, it would. The core group of researchers who were instrumental in the take-off of the conference in the early 2010s is still present, and these scientists keep on being sincerely happy to meet new researchers, to chat about the latest trends in the fast-evolving world of online multimedia, and to make sure everybody feels comfortable talking with each other.


I attended my first MMSys in 2012 in North Carolina. Although I did not even submit any paper to MMSys’12, I decided to attend because the short welcoming text on the website was astonishingly aligned with my own feelings about the academic research world. I rarely read the usually boring and unpassionate conference welcoming texts, but the day I took time to read this particular MMSys text changed my research career. Before 2012, I felt like one lost researcher among thousands of others, whose only motivation is to publish more papers, whatever the stakes. I used to publish sometimes in networking venues, sometimes in system venues, sometimes in multimedia venues… My output was quite inconsistent, and my experiences attending conferences were not especially exciting.

The MMSys community matches my expectations for several reasons:

  • The size of a typical MMSys conference is human-scale: when you meet someone the first day, you’ll surely meet this fellow again the next day.
  • Informal chat groups are diverse. I’ve the feeling that anybody can feel comfortable enough to chat with any other attendee regardless of gender, nationality, and seniority.
  • A responsible vision of what an academic event should be. The community is not into showing off in luxury resorts, but rather promotes decently cheap conferences in standard places while maximizing fun and interactions. This sometimes comes at the cost of organizing the conference in the facilities of the university (which necessarily means much more work for organizers and volunteers), but social events have never been neglected.
  • People share a set of “values” into their research activities.

This last point is of course the most significant aspect of MMSys. The main idea behind this conference is that multimedia services are not only multimedia but also networks, systems, and experiences. This commitment to a holistic vision of multimedia systems has at least two consequences. First, the typical contributions discussed at this conference have both theoretical and experimental parts, and, to be accepted, papers have to find the right balance between both sides of the problem. It is definitely challenging, but it brings passionate researchers to the conference. Second, the line between industry and academia is very porous. As a matter of fact, many core researchers of MMSys are either (past or current) employees of corporate research centers or involved in standards groups and industrial forums. The presence of people involved in the design of products nurtures the academic debates.

While MMSys grows significantly year after year, I was curious to see if these “values” would remain. Fortunately, they do. The growing reputation has not changed the spirit.


The 2018 edition of the MMSys conference was held on the campus of CWI, near downtown Amsterdam. Thanks to the impressive efforts of all volunteers and local organizers, the event went smoothly in the modern facilities near the University of Amsterdam. As can be expected of a conference in the Netherlands, especially in June, biking to the conference was obviously the best way to commute every morning from anywhere in Amsterdam.

The program contained a fairly high number of inspiring talks, which altogether reflected the “style” of MMSys. We got a mix of entertaining, technological, industry-oriented talks discussing the state-of-the-art and beyond. The two main conference keynotes were given by stellar researchers (who unsurprisingly have bright careers in both academia and industry) on the two hottest topics of the conference. First, Philip Chou (8i Labs) introduced holograms. Phil kind of lives in the future, somewhere five years later than now. And from there, Phil was kind enough to give us a glimpse of the anticipatory technologies that will be developed between our now and his. Undoubtedly everybody will remember his flash-forwarding talk. Then Nuria Oliver (Vodafone) discussed the opportunities to combine IoT and multimedia in a talk that was powerful and energizing. The conference also featured so-called overview talks, the idea being that expert researchers present the state-of-the-art in areas that have been especially under the spotlight in the past months. The topics this year were 360-degree videos, 5G networks, and per-title video encoding. The experts were from Tiledmedia, Netflix, Huawei and the University of Illinois. With such a program, MMSys attendees had the opportunity to catch up on everything they may have missed during the past couple of years.


The MMSys conference also has a long history of commitment to open-source software and demonstrations. This year’s conference was a peak, with an astonishing 45% of papers awarded a reproducibility badge, which means that the authors of these papers have agreed to share their datasets and code, and to make sure that their work can be reproduced by other researchers. I am not aware of any other conference reaching such a ratio of reproducible papers. MMSys is all about sharing, and this reproducibility ratio demonstrates that MMSys researchers see their peers as cooperating researchers rather than competitors.

 

My personal highlights go to two papers. The first is a work by researchers from UT Dallas and Mobiweb. It shows a novel, efficient approach to generating human models (skeletal poses) with a regular Kinect. This paper is a sign that Augmented Reality and Virtual Reality will soon be populated by user-generated content, not only synthesized 3D models but also digital captures of real humans. The road toward easy integration of avatars in multimedia scenes is paved, and this work is a good example of it. The second work I would like to highlight in this column is by researchers from Université Côte d’Azur. The paper deals with head movement in 360-degree videos, but instead of trying to predict movements, the authors propose to edit the content to guide user attention so that head movements are reduced. The approach, which is validated by a real prototype and source code sharing, comes from a multi-disciplinary collaboration with designers, engineers, and human interaction experts. Such multi-disciplinary work is also strongly encouraged at MMSys conferences.


Finally, MMSys is also a full event with several associated workshops. This year, Packet Video (PV) was held with MMSys for the very first time, and it was successful with regard to the number of people who attended. Fortunately, PV did not interfere with NOSSDAV, which is still the main venue for high-quality innovative and provocative studies. In comparison, both MMVE and NetGames were less crowded, but the discussion in these events was intense and lively, as can be expected when so many experts sit in the same room. That is the purpose of workshops, isn’t it?


A very last word on the social events. The social events of the 2018 edition lived up to the reputation of MMSys: original and friendly. But I won’t say more about them: what happens at MMSys social events stays at MMSys.

The 2019 edition of MMSys will be held on the East Coast of the US, hosted by the University of Massachusetts Amherst. The multimedia community is at a very exciting time in its history. The attention of researchers is shifting from video delivery to immersion, experience, and attention. More than ever, multimedia systems should be studied from multiple interplaying perspectives (network, computation, interfaces). MMSys is thus a perfect place to discuss research challenges and to present breakthrough proposals.

[1] This means that I also had my bunch of rejected papers at MMSys and affiliated workshops. Reviewer #3, whoever you are, you ruined my life (for a couple of hours).

JPEG Column: 79th JPEG Meeting in La Jolla, California, U.S.A.

The JPEG Committee had its 79th meeting in La Jolla, California, U.S.A., from 9 to 15 April 2018.

During this meeting, JPEG held a final celebration of the 25th anniversary of its first JPEG standard, usually known as JPEG-1. This celebration coincided with two interesting facts. The first was the approval of reference software for JPEG-1, “only” 25 years later. At the time the first JPEG standard was approved, reference software was not considered, as is now common for recent image standards. However, the JPEG committee decided that it was still important to provide reference software, as current applications and standards can largely benefit from this specification. The second coincidence was the launch of a call for proposals for a next-generation image coding standard, JPEG XL. This standard will define a new representation format for photographic information that incorporates current technological developments, and can become an alternative to the 25-year-old JPEG standard.

An informative two-hour JPEG Technologies Workshop marked the 25th anniversary celebration on Friday, April 13, 2018. The workshop featured presentations by several committee members on current and future JPEG committee activities, with the following program:


Touradj Ebrahimi, convenor of JPEG, presenting an overview of JPEG technologies.

  • Overview of JPEG activities, by Touradj Ebrahimi
  • JPEG XS by Antonin Descampe and Thomas Richter
  • HTJ2K by Pierre-Anthony Lemieux
  • JPEG Pleno – Light Field, Point Cloud, Holography by Ioan Tabus, Antonio Pinheiro, Peter Schelkens
  • JPEG Systems – Privacy and Security, 360 by Siegfried Foessel, Frederik Temmermans, Andy Kuzma
  • JPEG XL by Fernando Pereira, Jan De Cock

After the workshop, a social event was organized at which a past JPEG committee Convenor, Eric Hamilton, was recognized for key contributions to JPEG standardization.

The main highlights of the La Jolla JPEG meeting were:

  • Call for proposals of a next generation image coding standard, JPEG XL
  • JPEG XS profiles and levels definition
  • JPEG Systems defines a 360 degree format
  • HTJ2K
  • JPEG Pleno
  • JPEG XT
  • Approval of the JPEG Reference Software

The following summarizes various activities during JPEG’s La Jolla meeting.

JPEG XL

Billions of images are captured, stored and shared on a daily basis, demonstrating the self-evident need for efficient image compression. Applications, websites and user interfaces increasingly rely on images to share experiences, stories, visual information and appealing designs.

User interfaces can target devices with stringent constraints on network connection and/or power consumption in bandwidth-constrained environments. Even though network capacities are improving globally, bandwidth is constrained to levels that inhibit application responsiveness in many situations. User interfaces that utilize images with larger resolutions, higher dynamic ranges, wider color gamuts and higher bit depths further contribute to larger volumes of data in higher bandwidth environments.

The JPEG Committee has launched a Next Generation Image Coding activity, referred to as JPEG XL. This activity aims to develop a standard for image coding that offers substantially better compression efficiency than existing image formats (e.g. more than 60% improvement when compared to the widely used legacy JPEG format), along with features desirable for web distribution and efficient compression of high-quality images.

To this end, the JPEG Committee has issued a Call for Proposals following its 79th meeting in April 2018, with the objective of seeking technologies that fulfill the objectives and scope of a Next Generation Image Coding. The Call for Proposals (CfP), with all related info, can be found at jpeg.org. The deadline for expression of interest and registration is August 15, 2018, and submissions to the Call are due September 1, 2018. To stay posted on the action plan for JPEG XL, please regularly consult our website at jpeg.org and/or subscribe to our e-mail reflector.

 

JPEG XS

This project aims at the standardization of a visually lossless, low-latency, lightweight compression scheme that can be used as a mezzanine codec for the broadcast industry, Pro-AV and other markets such as VR/AR/MR applications and autonomous cars. Important use cases identified include video transport over professional video links (SDI, IP, Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and sensor compression in the automotive industry. During the La Jolla meeting, profiles and levels were defined to help implementers accurately size their designs for their specific use cases. Transport of JPEG XS over IP networks or SDI infrastructures is also being specified and will be finalized during the next JPEG meeting in Berlin (July 9-13, 2018). The JPEG committee therefore invites interested parties, in particular coding experts, codec providers, system integrators and potential users of the foreseen solutions, to contribute to the specification process. Publication of the core coding system as an International Standard is expected in Q4 2018.

 

JPEG Systems – JPEG 360

The JPEG Committee continues to make progress towards its goals to define a common framework and definitions for metadata which will improve the ability to share 360-degree images and provide the basis for new user interaction with images. At the 79th JPEG meeting in La Jolla, the JPEG committee received responses to a call for proposals it had issued for JPEG 360 metadata. As a result, JPEG Systems is readying a committee draft of “JPEG Universal Metadata Box Format (JUMBF)” as ISO/IEC 19566-5, and “JPEG 360” as ISO/IEC 19566-6. The box structure defined by JUMBF allows JPEG 360 to define a flexible metadata schema and the ability to link JPEG code streams embedded in the file. It also allows keeping unstitched image elements for omnidirectional captures together with the main image and descriptive metadata in a single file. Furthermore, JUMBF lays the groundwork for a uniform approach to integrate tools satisfying the emerging requirements for privacy and security metadata.
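
As a rough illustration of the box pattern JUMBF builds on (a length-and-type header wrapping a payload, with boxes nestable inside a superbox), here is a toy sketch in Python. The payload contents are invented, and this is not the normative syntax of ISO/IEC 19566-5.

```python
import struct

def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Serialize one box: 4-byte big-endian length (including the 8-byte
    header), 4-byte type, then the payload."""
    assert len(box_type) == 4
    return struct.pack(">I", 8 + len(payload)) + box_type + payload

# Nested boxes: a superbox carrying a description box plus raw metadata.
# The 'jumb'/'jumd' type codes follow the JUMBF draft; the payloads are fake.
inner = make_box(b"jumd", b"360-image schema id") + make_box(b"json", b'{"yaw":0}')
outer = make_box(b"jumb", inner)
print(outer[:8])  # length + type of the superbox
```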

To stay posted on JPEG 360, please regularly consult our website at jpeg.org and/or subscribe to the JPEG 360 e-mail reflector. 

 

HTJ2K

High Throughput JPEG 2000 (HTJ2K) aims to develop an alternate block-coding algorithm that can be used in place of the existing block coding algorithm specified in ISO/IEC 15444-1 (JPEG 2000 Part 1). The objective is to significantly increase the throughput of JPEG 2000, at the expense of a small reduction in coding efficiency, while allowing mathematically lossless transcoding to and from codestreams using the existing block coding algorithm.

As a result of a Call for Proposals issued at its 76th meeting, the JPEG Committee has selected a block-coding algorithm as the basis for Part 15 of the JPEG 2000 suite of standards, known as High Throughput JPEG 2000 (HTJ2K). The algorithm has demonstrated an average tenfold increase in encoding and decoding throughput compared to the algorithm specified in JPEG 2000 Part 1. This increase in throughput results in less than 15% average loss in coding efficiency, and allows mathematically lossless transcoding to and from JPEG 2000 Part 1 codestreams.

A Working Draft of Part 15 to the JPEG 2000 suite of standards is now under development.

 

JPEG Pleno

The JPEG Committee is currently pursuing three activities in the framework of the JPEG Pleno Standardization: Light Field, Point Cloud and Holographic content coding.

JPEG Pleno Light Field finished a third round of core experiments for assessing the impact of individual coding modules and started work on creating software for a verification model. Moreover, additional test data has been studied and approved for use in future core experiments. Working Draft documents for JPEG Pleno specifications Part 1 and Part 2 were updated. A JPEG Pleno Light Field AhG was established with mandates to create a common test conditions document; perform exploration studies on new datasets, quality metrics, and random-access performance indicators; and to update the working draft documents for Part 1 and Part 2.

Furthermore, use cases were studied and are under consideration for JPEG Pleno Point Cloud. A current draft list is under discussion for the next period and will be updated and mapped to the JPEG Pleno requirements. A final document on use cases and requirements for JPEG Pleno Point Cloud is expected at the next meeting.

JPEG Pleno Holography has reviewed the draft of a holography overview document. Moreover, the current databases were classified according to use cases, and plans to analyze numerical reconstruction tools were established.

 

JPEG XT

The JPEG Committee released two corrigenda to JPEG XT Part 1 (core coding system) and JPEG XT Part 8 (lossless extension of JPEG-1). These corrigenda clarify the upsampling procedure for chroma-subsampled images by adopting the centered upsampling in use by JFIF.
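
For readers unfamiliar with the term, centered upsampling places the reconstructed full-resolution samples halfway between the chroma samples rather than co-siting them, which in one dimension amounts to a 3:1 triangle filter. The sketch below is an illustrative reconstruction of that idea (with simple edge replication and rounding), not code taken from the corrigenda.

```python
def upsample_centered_2x(chroma: list[int]) -> list[int]:
    """2x centered upsampling of a 1-D chroma row: each output sample is a
    3:1 weighted mix of the nearest and next-nearest chroma samples."""
    out = []
    n = len(chroma)
    for i, c in enumerate(chroma):
        left = chroma[i - 1] if i > 0 else c       # replicate at the edges
        right = chroma[i + 1] if i < n - 1 else c
        out.append((3 * c + left + 2) // 4)        # sample centered left of c
        out.append((3 * c + right + 2) // 4)       # sample centered right of c
    return out

print(upsample_centered_2x([100, 120, 140]))  # -> [100, 105, 115, 125, 135, 140]
```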

 

JPEG Reference Software

The JPEG Committee is pleased to announce that the CD ballot for Reference Software has been issued for the original JPEG-1 standard. This initiative closes a long-standing gap in the legacy JPEG standard by providing two reference implementations for this widely used and popular image coding format.

Final Quote

“The JPEG Committee is hopeful that its recently launched Next Generation Image Coding activity, JPEG XL, will result in a format that becomes as important for imaging products and services as its predecessor: the widely used and popular legacy JPEG format, which has been in service for a quarter of a century,” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JBIG, JPEG, JPEG 2000, JPEG XR, JPSearch and more recently, the JPEG XT, JPEG XS, JPEG Systems and JPEG Pleno families of imaging standards.

The JPEG Committee nominally meets four times a year, in different locations around the world. The 79th JPEG Meeting was held on 9-15 April 2018 in La Jolla, California, USA. The next, 80th JPEG Meeting will be held on 7-13 July 2018 in Berlin, Germany.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

 

Future JPEG meetings are planned as follows:

  • No 80, Berlin, Germany, July 7 to 13, 2018
  • No 81, Vancouver, Canada, October 13 to 19, 2018
  • No 82, Lisbon, Portugal, January 19 to 25, 2019

MPEG Column: 122nd MPEG Meeting in San Diego, CA, USA

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

MPEG122 Plenary, San Diego, CA, USA.

The MPEG press release comprises the following topics:

  • Versatile Video Coding (VVC) project starts strongly in the Joint Video Experts Team
  • MPEG issues Call for Proposals on Network-based Media Processing
  • MPEG finalizes 7th edition of MPEG-2 Systems Standard
  • MPEG enhances ISO Base Media File Format (ISOBMFF) with two new features
  • MPEG-G standards reach Draft International Standard for transport and compression technologies

Versatile Video Coding (VVC) – MPEG’s & VCEG’s new video coding project starts strong

The Joint Video Experts Team (JVET), a collaborative team formed by MPEG and ITU-T Study Group 16’s VCEG, commenced work on a new video coding standard referred to as Versatile Video Coding (VVC). The goal of VVC is to provide significant improvements in compression performance over the existing HEVC standard (i.e., typically twice as much as before), with completion targeted for 2020. The main target applications and services include — but are not limited to — 360-degree and high-dynamic-range (HDR) videos. In total, JVET evaluated responses from 32 organizations using formal subjective tests conducted by independent test labs. Interestingly, some proposals demonstrated compression efficiency gains of typically 40% or more when compared to HEVC. Particular effectiveness was shown on ultra-high definition (UHD) video test material. Thus, we may expect compression efficiency gains well beyond the targeted 50% for the final standard.

Research aspects: Compression tools and everything around it including its objective and subjective assessment. The main application area is clearly 360-degree and HDR. Watch out conferences like PCS and ICIP (later this year), which will be full of papers making references to VVC. Interestingly, VVC comes with a first draft, a test model for simulation experiments, and a technology benchmark set which is useful and important for any developments for both inside and outside MPEG as it allows for reproducibility.

MPEG issues Call for Proposals on Network-based Media Processing

This Call for Proposals (CfP) addresses advanced media processing technologies such as network stitching for VR services, super-resolution for enhanced visual quality, transcoding, and viewport extraction for 360-degree video within the network environment, allowing service providers and end users to describe media processing operations that are to be performed by the network. The aim of network-based media processing (NBMP) is thus to allow end-user devices to offload certain kinds of processing to the network. To this end, NBMP describes the composition of network-based media processing services based on a set of media processing functions and makes them accessible through Application Programming Interfaces (APIs). Responses to the NBMP CfP will be evaluated on the weekend prior to the 123rd MPEG meeting in July 2018.
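Since NBMP was only at the CfP stage at the time of writing, no API existed yet; purely as a sketch of the idea, an end-user device might describe a processing pipeline and hand it to the network along these invented lines:

```python
# Hypothetical sketch of an NBMP-style workflow description. All field
# and function names are invented for illustration; the actual standard
# was still at the Call-for-Proposals stage when this was written.
import json

workflow = {
    "inputs": [{"id": "cam-feeds", "type": "video/mp4"}],
    "functions": [  # media processing functions composed in the network
        {"name": "stitch-360", "params": {"projection": "equirectangular"}},
        {"name": "transcode", "params": {"codec": "hevc", "bitrate_kbps": 8000}},
        {"name": "viewport-extract", "params": {"yaw_deg": 30, "pitch_deg": 0}},
    ],
    "outputs": [{"id": "viewport-stream", "type": "video/mp4"}],
}

# A device would submit this description to a network-side API endpoint
# instead of running the pipeline locally.
print(json.dumps(workflow, indent=2))
```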

Research aspects: This project reminds me a lot of what has been done in the past in MPEG-21, specifically Digital Item Adaptation (DIA) and Digital Item Processing (DIP). The main difference is that MPEG now targets APIs rather than pure metadata formats, which is a step in the right direction as APIs can be implemented and used right away. NBMP will be particularly interesting in the context of new networking approaches including, but not limited to, software-defined networking (SDN), information-centric networking (ICN), mobile edge computing (MEC), fog computing, and related aspects in the context of 5G.

7th edition of MPEG-2 Systems Standard and ISO Base Media File Format (ISOBMFF) with two new features

More than 20 years after its inception, development of MPEG-2 Systems technology (i.e., transport/program stream) continues. New features include support for: (i) JPEG 2000 video with 4K resolution and ultra-low latency, (ii) media-orchestration-related metadata, (iii) sample variants, and (iv) HEVC tiles.

The partial file format enables the description of an ISOBMFF file partially received over lossy communication channels. It provides tools to describe the received data and to document transmission information, such as received or lost byte ranges and whether the corrupted/lost bytes are present in the file, as well as repair information, such as the location of the source file, possible byte offsets in that source, and the byte stream position at which a parser can try processing a corrupted file. Depending on the communication channel, this information may be set up by the receiver or through out-of-band means.
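As a rough mental model (not the normative ISOBMFF box syntax), the information carried by the partial file format can be pictured like this:

```python
# Illustrative model of partial-file information: which byte ranges
# arrived intact, plus the repair hints described above. This mirrors
# the concepts, not the actual box structures of the standard.
from dataclasses import dataclass, field
from typing import List

@dataclass
class ByteRange:
    offset: int
    length: int
    received: bool  # False => lost or corrupted in transmission

@dataclass
class PartialFileInfo:
    source_url: str    # repair info: location of the source file
    resume_offset: int # position where a parser can retry parsing
    ranges: List[ByteRange] = field(default_factory=list)

    def bytes_to_repair(self) -> int:
        return sum(r.length for r in self.ranges if not r.received)

info = PartialFileInfo("http://example.com/movie.mp4", resume_offset=5120)
info.ranges = [ByteRange(0, 4096, True), ByteRange(4096, 1024, False)]
print(f"{info.bytes_to_repair()} bytes to fetch again from {info.source_url}")
```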

The second edition of ISOBMFF also adds sample variants, which are typically used to provide forensic information in the rendered sample data that can, for example, identify the specific Digital Rights Management (DRM) client which has decrypted the content. This variant framework is intended to be fully compatible with MPEG’s Common Encryption (CENC) and agnostic to the particular forensic marking system used.
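To see how variants can carry forensic information, consider this toy scheme, invented purely for illustration (real forensic marking systems and the CENC integration are more involved): if each DRM client deterministically ends up with a different variant of selected samples, the sequence of variants in a leaked copy identifies the client.

```python
# Toy forensic-marking scheme: derive a per-client variant choice for
# selected samples. Purely illustrative; not the ISOBMFF/CENC mechanism.
import hashlib

def pick_variant(sample_id: int, drm_client_id: str, n_variants: int = 2) -> int:
    digest = hashlib.sha256(f"{sample_id}:{drm_client_id}".encode()).digest()
    return digest[0] % n_variants  # which variant this client renders

for client in ("client-A", "client-B"):
    pattern = [pick_variant(s, client) for s in range(8)]
    print(client, pattern)  # distinct patterns identify the leaking client
```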

Research aspects: MPEG systems standards are mainly relevant for multimedia systems research with all its characteristics. The partial file format is specifically interesting as it targets scenarios with lossy communication channels.

MPEG-G standards reach Draft International Standard for transport and compression technologies

MPEG-G provides a set of standards enabling interoperability for applications and services dealing with high-throughput deoxyribonucleic acid (DNA) sequencing. At its 122nd meeting, MPEG promoted its core set of MPEG-G specifications, i.e., transport and compression technologies, to Draft International Standard (DIS) stage. These parts of the standard provide new transport technologies (ISO/IEC 23092-1) and compression technologies (ISO/IEC 23092-2), supporting rich functionality for access to and transport of genomic data, including streaming, by interoperable applications. Reference software (ISO/IEC 23092-4) and conformance (ISO/IEC 23092-5) will reach this stage within the next 12 months.
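For intuition on why dedicated genomic compression pays off, here is a toy 2-bit packing of the four nucleotides; MPEG-G’s actual coding tools are far more sophisticated (quality scores, alignment data, entropy coding), so this only conveys the baseline idea:

```python
# Toy illustration: pack DNA bases into 2 bits each, a 4x reduction
# over one-byte-per-base ASCII. Not MPEG-G's actual compression.
CODE = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}

def pack(seq: str) -> bytes:
    out = bytearray()
    for i in range(0, len(seq), 4):  # four 2-bit symbols per byte
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | CODE[base]
        out.append(byte)
    return bytes(out)

read = "ACGTACGTTTGA"
print(f"{len(read)} bases -> {len(pack(read))} bytes "
      f"(vs {len(read)} bytes as ASCII)")
```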

Research aspects: The main focus of this work item is compression, while transport is still in its infancy. Therefore, research on the actual delivery of compressed DNA information, as well as its processing, is solicited.

What else happened at MPEG122?

  • The Requirements subgroup is exploring new video coding tools dealing with low-complexity and process enhancements.
  • The activity around the coded representation of neural networks has defined a set of vital use cases and is now soliciting test data until the next meeting.
  • The MP4 registration authority (MP4RA) has a new awesome website: http://mp4ra.org/.
  • MPEG-DASH is finally approving and working on the 3rd edition, comprising a consolidated version of recent amendments and corrigenda.
  • CMAF started an exploration of multi-stream support, which could be relevant for tiled streaming and multi-channel audio.
  • OMAF kicked off its activity towards a 2nd edition enabling support for 3DoF+ and social VR, with the plan of going to committee draft (CD) in Oct’18. Additionally, a test framework has been proposed, which allows one to assess the performance of various CMAF tools. Its main focus is on video, but MPEG’s audio subgroup has a similar framework to enable subjective testing. It would be interesting to see these two frameworks combined in one way or the other.
  • MPEG-I architectures (yes, plural) are maturing, and I think this technical report will become available very soon. In terms of video, MPEG-I looks more closely at 3DoF+, defining common test conditions and a call for proposals (CfP) planned for MPEG123 in Ljubljana, Slovenia. Additionally, explorations of 6DoF and of the compression of dense representations of light fields are ongoing and have just been started, respectively.
  • Finally, point cloud compression (PCC) is in its hot phase of core experiments for various coding tools, resulting in updated versions of the test model and working draft.

Research aspects: In this section I would like to focus on DASH, CMAF, and OMAF. Multi-stream support, as mentioned above, is relevant for tiled streaming and multi-channel audio, which have recently been studied in the literature and are also highly relevant for industry. The efficient storage and streaming of such content within the file format is an important aspect that is often underrepresented in both research and standardization. The goal here is to keep the overhead low while maximizing the utility of the format to enable certain functionalities. OMAF now targets the social VR use case, which has been discussed in the research literature for a while and finally makes its way into standardization. An important aspect here is both user experience and quality of experience, which requires intensive subjective testing.

Finally, on May 10 MPEG will celebrate 30 years, as its first meeting dates back to 1988 in Ottawa, Canada, with around 30 attendees. The 122nd meeting had more than 500 attendees, and MPEG has around 20 active work items. In total, more than 170 standards have been produced (approximately six standards per year), and some standards have up to nine editions, like the HEVC standard. Overall, MPEG is responsible for more than 23% of all JTC 1 standards, some of which show extraordinary longevity regarding extensions, e.g., MPEG-2 Systems (24 years), the MPEG-4 file format (19 years), and AVC (15 years). MPEG standards serve billions of users (e.g., MPEG-1 video, MP2, MP3, AAC, MPEG-2, AVC, ISOBMFF, DASH). Five standards have received Emmy Awards in the past (MPEG-1, MPEG-2, AVC (twice), and HEVC).

Thus, happy birthday, MPEG! In today’s society, turning 30 marks the start of the high-performance era, basically the time of “compression”: we apply everything we have learnt and live it out to the fullest. A truly optimistic perspective for our Generation X (millennial) standards body!

Opinion Column: Privacy and Multimedia

 

The discussion: multimedia data is affected by new forms of privacy threats, let’s learn, protect, and engage our users.

For this edition of the SIGMM Opinion Column, we carefully selected the discussion’s main topic, looking for an appealing and urgent problem arising for our community. Given the recent Cambridge Analytica scandal and the upcoming enforcement of the General Data Protection Regulation (GDPR) in EU countries, we thought we should have a collective reflection on ‘privacy and multimedia’.


Users often share their data unintentionally. One could indeed observe a widespread sense of surprise and anger following the data leaks from Cambridge Analytica. As mentioned in a recent blog post by one of the participants, so far, large-scale data leaks have mainly affected the private textual and social data of social media users. However, images and videos also contain private user information. There was a general consensus that it is time for our community to start thinking about how to protect private visual and multimedia data.

It was noted that computer vision technologies are now able to infer sensitive information from images (see, for example, recent work on sexual orientation detection from social media profile pictures). However, few technologies exist that defend users against the automatic inference of private information from their visual data. We will need to design protection techniques that ensure users’ privacy for images as well, beyond simple face de-identification. We might also want users to engage and have fun with image privacy-preserving tools, which is the aim of the Pixel Privacy project.

But in multimedia, we go beyond image analysis. By nature, as multimedia researchers, we combine different sources of information to design better media retrieval or content-serving technologies, or to ‘get more than the sum of the parts’. While this is what makes our research so special, participants in the discussion noted that multimodal approaches might also generate new forms of privacy threats. Each individual source of data comes with its own privacy dimension, and we should be careful about the multiple privacy breaches we generate by analyzing each modality. At the same time, by combining different media and their privacy dimensions, and performing massive inference on the global multimodal knowledge, we might also be generating new forms of threats to user privacy that the individual streams don’t carry.

Finally, we should also inform users about these new potential threats: as experts who are doing ‘awesome cutting-edge work’, we also have a responsibility to make sure people know what the potential consequences are.

A note on the new format, the response rate, and a call for suggestions!

This quarter, we experimented with a new, slimmer format, hoping to reach out to more members of the community, beyond Facebook subscribers.

We extended the outreach beyond Facebook: we used the SIGMM LinkedIn group for our discussion, and we directly contacted senior community members. To engage community members with limited time for long debates, we also lightened the format, asking anyone interested in giving their opinion on the topic to send us, or share with the group, a one-liner reflecting their view on privacy in multimedia.

Despite the new format, we received a limited number of replies. We will keep trying new formats. Our aim is to generate fruitful discussions and gather opinions on crucial problems in a bottom-up fashion. We hope, edition after edition, to get better at giving voice to more and more members of the multimedia community.

We are happy to hear your thoughts on how to improve, so please reach out to us!

Multidisciplinary Community Spotlight: Assistive Augmentation

 

Emphasizing the importance of neighboring communities for our work in the field of multimedia was one of the primary objectives we set out with when we started this column about a year ago. In past issues, we gave related communities a voice through interviews and personal accounts. For instance, in the third issue of 2017, Cynthia shared personal insights from the International Society of Music Information Retrieval [4]. This issue continues the spotlight series.

Since its inception, I have been involved with the Assistive Augmentation community, a multidisciplinary field that sits at the intersection of accessibility, assistive technologies, and human augmentation. In this issue, I briefly reflect on my personal experiences and research work within the community.

First, let me provide a high-level view of Assistive Augmentation. Its general idea is that of cross-domain assistive technology: instead of putting sensorial capability into individual silos, the approach places it on a continuum of usability for a specific technology. As an example, a reading aid for people with visual impairments enables access to printed text. At the same time, the reading aid can also be used by those with an unimpaired visual sense for other applications like language learning. In essence, the field is concerned with the design, development, and study of technology that substitutes, recovers, empowers or augments physical, sensorial or cognitive capabilities, depending on specific user needs (see Figure 1).


Figure 1.  Assistive Augmentation Continuum

Now let us take a step back. I joined the MIT Media Lab as a postdoctoral fellow in 2013, pursuing research on multi-sensory cueing for mobile interaction. With my background in user research and human-computer interaction, I was immediately attracted by an ongoing project at the lab, led by Roy Shilkrot, Suranga Nanayakkara and Pattie Maes, that involved studying how members of the MIT visually impaired and blind user group (VIBUG) use assistive technology. People in that group are particularly tech-savvy. I came to know products like the OrCam MyEye, which is priced at about 2,500-4,500 USD and aims at recognizing text, objects and so forth. Back in 2013 it had a large footprint and made its users really stand out. To briefly summarize our general observations: many of the tools we came to know during regular VIBUG meetings were highly specialized for this very target group. The latter is, of course, a good thing, since it focuses directly on the actual end user. However, we also concluded that it locks the products into silos of usability defined by their end users’ sensorial capabilities.

These anecdotal observations bring me back to the general idea of Assistive Augmentation. To explore this idea further, we proposed to hold a workshop at a conference, jointly with colleagues in neighboring communities. With ACM CHI attracting folks from different fields of research, we felt like it would be a good fit to test the waters and see whether we could get enough interest from different communities. Our proposal was successful: the workshop was held in 2014 and set the stage for thinking about, discussing and sketching out facets of Assistive Augmentation. As intended, our workshop attracted a very diverse crowd from different fields. Being able to discuss opportunities and the potential of Assistive Augmentation with such a group was immensely helpful and contributed significantly to our ongoing efforts to define the field. A practice I would encourage everyone at a similar stage to follow.

As a tangible outcome of this very workshop, our community decided to pursue a jointly edited volume which Springer published earlier this year [3]. The book illustrates two main areas of Assistive Augmentation by example: (i) sensory enhancement and substitution and (ii) design for Assistive Augmentation. Peers contributed comprehensive reports on case studies which serve as lighthouse projects to exemplify Assistive Augmentation research practice. Besides, the book features field-defining articles that introduce each of the two main areas.

Many relevant areas have yet to be touched upon, for instance, ethical issues, the quality of augmentations, and their appropriations. Augmenting human perception, another important research thrust, has recently been discussed in both the SIGCHI and SIGMM communities. Last year, a workshop on “Amplification and Augmentation of Human Perception” was held by Albrecht Schmidt, Stefan Schneegass, Kai Kunze, Jun Rekimoto and Woontack Woo at ACM CHI [5]. Also, one of last year’s keynotes at ACM Multimedia, by Achin Bhowmik, focused on “Enhancing and Augmenting Human Perception with Artificial Intelligence” [1]. These ongoing discussions in academic communities underline the importance of investigating, shaping and defining the intersection of assistive technologies and human augmentation. Academic research is one avenue that must be pursued, with work being disseminated at dedicated conference series such as Augmented Human [6]. Other avenues that highlight and demonstrate the potential of Assistive Augmentation technology include, for instance, sports, as discussed within the Superhuman Sports Society [7]. Most recently, the Cybathlon was held for the very first time in 2016: athletes with “disabilities or physical weakness use advanced assistive devices […] to compete against each other” [8].

Looking back at how the community came about, I conclude that organizing a workshop at a large academic venue like CHI was an excellent first step for establishing the community. In fact, the workshop created fantastic momentum within the community. However, focusing entirely on a jointly edited volume as the main tangible outcome of the workshop had several drawbacks. In retrospect, the publication timeline was far too long, rendering it impossible to capture the dynamics of an emerging field. But indeed, this cannot be the objective of a book publication—this should have been the objective of follow-up workshops in neighboring communities (e.g., at ACM Multimedia) or special issues in a journal with a much shorter turn-around. With our book project now concluded, we aim to pick up on past momentum with a forthcoming special issue on Assistive Augmentation in MDPI’s Multimodal Technologies and Interaction journal. I am eagerly looking forward to what is next and to our communities’ joint work across disciplines towards pushing our physical, sensorial and cognitive abilities.

References

[1]       Achin Bhowmik. 2017. Enhancing and Augmenting Human Perception with Artificial Intelligence Technologies. In Proceedings of the 2017 ACM on Multimedia Conference (MM ’17), 136–136.

[2]       Ellen Yi-Luen Do. 2018. Design for Assistive Augmentation—Mind, Might and Magic. In Assistive Augmentation. Springer, 99–116.

[3]       Jochen Huber, Roy Shilkrot, Pattie Maes, and Suranga Nanayakkara (Eds.). 2018. Assistive Augmentation. Springer Singapore.

[4]       Cynthia Liem. 2018. Multidisciplinary column: inclusion at conferences, my ISMIR experiences. ACM SIGMultimedia Records 9, 3 (2018), 6.

[5]       Albrecht Schmidt, Stefan Schneegass, Kai Kunze, Jun Rekimoto, and Woontack Woo. 2017. Workshop on Amplification and Augmentation of Human Perception. In Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems, 668–673.

[6]       Augmented Human Conference Series. Retrieved June 1, 2018 from http://www.augmented-human.com/

[7]       Superhuman Sports Society. Retrieved June 1, 2018 from http://superhuman-sports.org/

[8]       Cybathlon. Cybathlon – moving people and technology. Retrieved June 1, 2018 from http://www.cybathlon.ethz.ch/

 


About the Column

The Multidisciplinary Column is edited by Cynthia C. S. Liem and Jochen Huber. Every other edition, we will feature an interview with a researcher performing multidisciplinary work, or a column of our own hand. For this edition, we feature a column by Jochen Huber.

Editor Biographies

Dr. Cynthia C. S. Liem is an Assistant Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. She initiated and co-coordinated the European research project PHENICX (2013-2016), focusing on technological enrichment of symphonic concert recordings with partners such as the Royal Concertgebouw Orchestra. Her research interests consider music and multimedia search and recommendation, and increasingly shift towards making people discover new interests and content which would not trivially be retrieved. Beyond her academic activities, Cynthia gained industrial experience at Bell Labs Netherlands, Philips Research and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships, the Google European Doctoral Fellowship 2010 in Multimedia, and a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach.

 

Dr. Jochen Huber is a Senior User Experience Researcher at Synaptics. Previously, he was an SUTD-MIT postdoctoral fellow in the Fluid Interfaces Group at MIT Media Lab and the Augmented Human Lab at Singapore University of Technology and Design. He holds a Ph.D. in Computer Science and degrees in both Mathematics (Dipl.-Math.) and Computer Science (Dipl.-Inform.), all from Technische Universität Darmstadt, Germany. Jochen’s work is situated at the intersection of Human-Computer Interaction and Human Augmentation. He designs, implements and studies novel input technology in the areas of mobile, tangible & non-visual interaction, automotive UX and assistive augmentation. He has co-authored over 60 academic publications and regularly serves as program committee member in premier HCI and multimedia conferences. He was program co-chair of ACM TVX 2016 and Augmented Human 2015 and chaired tracks of ACM Multimedia, ACM Creativity and Cognition and ACM International Conference on Interface Surfaces and Spaces, as well as numerous workshops at ACM CHI and IUI. Further information can be found on his personal homepage: http://jochenhuber.com

 

Sharing and Reproducibility in ACM SIGMM

 

This column discusses the efforts of ACM SIGMM towards sharing and reproducibility. Apart from the specific sessions dedicated to open source and datasets, ACM Multimedia Systems has, since last year, provided official ACM badges for articles that make artifacts available. This year marked a record, with 45% of the articles acquiring such a badge.


Without data it is impossible to put theories to the test. Moreover, without running code it is tedious at best to (re)produce and evaluate any results. Yet collecting data and writing code can be a road full of pitfalls, ranging from datasets containing copyrighted materials to algorithms containing bugs. The ideal datasets and software packages are those that are open and transparent for the world to look at, inspect, and use without or with limited restrictions. Such “artifacts” make it possible to establish public consensus on their correctness or otherwise to start a dialogue on how to fix any identified problems.

In our interconnected world, storing and sharing information has never been easier. Despite the temptation for researchers to keep datasets and software to themselves, a growing number are willing to share their resources with others. To further promote this sharing behavior, conferences, workshops, publishers, non-profit and even for-profit companies are increasingly recognizing and supporting these efforts. For example, the ACM Multimedia conference has hosted an open source software competition since 2004, and the ACM Multimedia Systems conference has included an open datasets and software track since 2011. The ACM Digital Library now also hands out badges to public artifacts that have been made available and optionally reviewed and verified by members of the community. At the same time, organizations such as Zenodo and Amazon host open datasets for free. Sharing ultimately pays off: the citation statistics for the ACM Multimedia Systems conference over the past five years, for example, show that half of the 20 most cited papers shared data and code, although such papers have represented only a small fraction of the published papers so far.


Good practices are increasingly adopted. In this year’s edition of the ACM Multimedia Systems conference, 69 works (papers, demos, datasets, software) were accepted, out of which 31 (45%) were awarded an ACM badge. This is a large increase compared to last year, when only 13 out of 42 works (31%) received one. This greatly advances one of the core objectives of both the conference and SIGMM towards open science. At this moment, the ACM Digital Library does not separately index which papers received a badge, making it challenging to find all papers that have one. It further appears that not many other ACM conferences are aware of the badges yet; for example, while ACM Multimedia accepted 16 open source papers in 2016 and 6 papers in 2017, none applied for a badge. This year at ACM Multimedia Systems only “artifacts available” badges have been awarded. For next year, our intention is to ensure all dataset and software submissions receive the “artifacts evaluated” badge. This would require several committed community members to spend time working with the authors to get the artifacts running on all major platforms with corresponding detailed documentation.

The accepted artifacts this year are diverse in nature: several submissions focus on releasing artifacts related to quality of experience of (mobile/wireless) streaming video, while others center on making datasets and tools related to images, videos, speech, sensors, and events available; in addition, there are a number of contributions in the medical domain. It is great to see such a range of interests in our community!

SIGMM Annual Report (2018)

 

Dear Readers,

Each year SIGMM, like all ACM SIGs, produces an annual report summarising our activities, which includes our sponsored and in-cooperation conferences and also the initiatives we are undertaking to support our community and broaden participation. The report also includes our significant papers, the awards we have given, and the major issues that face us going forward. Below is the main text of the SIGMM report 2017-2018, augmented by further details on our conferences provided by the ACM Office. We hope you enjoy reading this and learning about what SIGMM does.


SIGMM Annual Report (2018)
Prepared by SIGMM Chair (Alan Smeaton),
Vice Chair (Nicu Sebe), and Conference Director (Gerald Friedland)
August 6th, 2018

Mission: SIGMM provides an international interdisciplinary forum for researchers, engineers, and practitioners in all aspects of multimedia computing, communication, storage and application.

1. Awards:
SIGMM gives out three awards each year and these were as follows:

  • SIGMM Technical Achievement Award for lasting contributions to multimedia computing, communications and applications was presented to Arnold W.M. Smeulders, University of Amsterdam, the Netherlands. The award was given in recognition of his outstanding and pioneering contributions to defining and bridging the semantic gap in content-based image retrieval.
  • SIGMM 2016 Rising Star Award was given to Dr Liangliang Cao of HelloVera.AI for his significant contributions in large-scale multimedia recognition and social media mining.
  • SIGMM Outstanding PhD Thesis in Multimedia Computing Award was given to Chien-Nan (Shannon) Chen for a thesis entitled Semantics-Aware Content Delivery Framework For 3D Tele-Immersion at the University of Illinois at Urbana-Champaign, US.

2. Significant Papers:

The SIGMM flagship conference, ACM Multimedia 2017, was held in Mountain View, California, and presented the following awards, plus further awards for Best Grand Challenge Video Captioning Paper, Best Grand Challenge Social Media Prediction Paper, and Best Brave New Idea Paper:

  • Best paper award to “Adversarial Cross-Modal Retrieval”, by Bokun Wang, Yang Yang, Xing Xu, Alan Hanjalic, Heng Tao Shen
  • Best student paper award to “H-TIME: Haptic-enabled Tele-Immersive Musculoskeletal Examination”, by Yuan Tian, Suraj Raghuraman, Thiru Annaswamy, Aleksander Borresen, Klara Nahrstedt, Balakrishnan Prabhakaran
  • Best demo award to “NexGenTV: Providing Real-Time Insight during Political Debates in a Second Screen Application” by Olfa Ben Ahmed, Gabriel Sargent, Florian Garnier, Benoit Huet, Vincent Claveau, Laurence Couturier, Raphaël Troncy, Guillaume Gravier, Philémon Bouzy  and Fabrice Leménorel.
  • Best Open source software award to “TensorLayer: A Versatile Library for Efficient Deep Learning Development” by Hao Dong, Akara Supratak, Luo Mai, Fangde Liu, Axel Oehmichen, Simiao Yu, Yike Guo.

The 9th ACM International Conference on Multimedia Systems (MMSys 2018) was held in Amsterdam, the Netherlands, and presented a range of awards including:

  • Best paper award to “Dynamic Adaptive Streaming for Multi-Viewpoint Omnidirectional Videos” by Xavier Corbillon, Francesca De Simone, Gwendal Simon and Pascal Frossard.
  • Best student-paper award to “Want to Play DASH? A Game Theoretic Approach for Adaptive Streaming over HTTP” by Abdelhak Bentaleb, Ali C. Begen, Saad Harous and Roger Zimmermann.

The International Conference on Multimedia Retrieval (ICMR) 2018 was held in Yokohama, Japan, and presented a range of awards including:

  • Best paper award to “Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval” by Niluthpol Mithun, Juncheng Li, Florian Metze and Amit Roy-Chowdhury.

The best paper and best student paper from each of these three conferences were then reviewed by a specially convened committee to select one paper to be nominated for the Communications of the ACM Research Highlights; that nomination is presently under consideration.

In addition to the above, SIGMM presented the 2017 ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) Nicolas D. Georganas Best Paper Award to the paper “Automatic Generation of Visual-Textual Presentation Layout” (TOMM vol. 12, Issue 2) by Xuyong Yang, Tao Mei, Ying-Qing Xu, Yong Rui, and Shipeng Li.

3. Significant Programs that Provide a Springboard for Further Technical Efforts

  • SIGMM provided support for student travel through grants, at all of our SIGMM-sponsored conferences.
  • Apart from the specific sessions dedicated to open source and datasets, the ACM Multimedia Systems Conference (MMSys) has started to provide official ACM badging for articles that make artifacts available. This year, our second year of doing this, marked a record with 45% of the articles published at the conference acquiring such a reproducibility badge.

4. Innovative Programs Providing Service to Some Part of Our Technical Community

  • A large part of our research area in SIGMM is driven by the availability of large datasets, usually used for training purposes.  Recent years have shown a large growth in the emergence of openly available datasets coupled with grand challenge events at our conferences and workshops. Mostly these are driven by our corporate researchers but this allows all of our researchers the opportunity to carry out their research at scale.  This provides great opportunities for our community.
  • Following the lead of SIGARCH we have commissioned a study of gender distribution among the SIGMM conferences, conference organization and awards. This report will be completed and presented at our flagship conference in October.  We have also commissioned a study of the conferences and journals which mostly influence, and are influenced by, our own SIGMM conferences as an opportunity for some self-reflection on our origins, and our future.  Both these follow an open call for new initiatives to be supported by SIGMM. 
  • SIGMM Conference Director Gerald Friedland worked with several volunteers from SIGMM to improve the content and organization of ACM Multimedia and connected conferences. Volunteer David Ayman Shamma used data science methods to analyze several ACM MM conferences in the past five years with the goal of identifying biases and patterns of irregularities. Some results were presented at the ACM MM TPC meeting. Volunteers Hayley Hung and Martha Larson gave an account of their expectations and experiences with ACM Multimedia, and Dr. Friedland himself volunteered as a reviewer for conferences of similar size and importance, including NIPS and CSCW, and approached the chairs to get external feedback on what can be improved in the review process. Furthermore, in September, Dr. Friedland will travel to Berlin to visit Lutz Prechelt, who invented a review quality management system. The results of this work will be included in a conference handbook that will set down standard recommendations of best practices for future organizers of SIGMM conferences. We expect the book to be finished by the end of 2018.
  • Last year SIGMM made a decision to try to co-locate conferences and other events as much as possible, and the ACM Multimedia conference was co-located with the European Conference on Computer Vision (ECCV) in 2016 with joint workshops and tutorials. This year the ACM Multimedia Systems (MMSys) conference was co-located with the 10th International Workshop on Immersive Mixed and Virtual Environment Systems (MMVE2018), the 16th Annual Workshop on Network and Systems Support for Games (NetGames2018), the 28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video (NOSSDAV2018) and the 23rd Packet Video Workshop (PV2018). In addition, the Technical Program Committee meeting for the Multimedia Conference was co-located with the ICMR conference.

5. Events or Programs that Broaden Participation

  • SIGMM has approved the launch of a new conference series called Multimedia Asia which will commence in 2019. This will be run by the SIGMM China Chapter and consolidates two existing multimedia-focused conferences in Asia under the sponsorship and governance of SIGMM. This follows a very detailed review and the successful location for the inaugural conference in 2019 will be announced at our flagship conference in October 2018.
  • The Women / Diversity in Multimedia Lunch at ACM MULTIMEDIA 2017 (previously the Women’s Lunch) continued this year with an enlarged program of featured speakers and discussion which led to the call for the gender study in Multimedia mentioned earlier.
  • SIGMM continues to pursue an active approach to nurturing the careers of our early stage researchers. The “Emerging Leaders” event (formerly known as Rising Stars) skipped a year in 2017 but will be happening again in 2018 at the Multimedia Conference.  Giving these early career researchers the opportunity to showcase their vision helps to raise their visibility and helps SIGMM to enlarge the pool of future volunteers.
  • The expansion we put in place in our social media communication team has proven to be a shrewd move, with large growth in our website traffic and a raised profile on social media. We also invite conference attendees to post on Twitter and/or Facebook about the papers, demos, and talks that they think are most thought-provoking and forward-looking, and the most active of these are rewarded with free registration at a future SIGMM-sponsored conference.

6. Issues for SIGMM in the next 2-3 years

  • Like other SIGs, we realize that improving the diversity of the community we serve is essential to continuing our growth and maintaining our importance and relevance. This includes diversity in gender, in geographical location, and in many other facets.  We have started to address these through some of the initiatives mentioned earlier, and at our flagship conference in 2017 we ran a Workshop emphasizing contributions focusing on research from South Africa and the African continent in general.
  • Leadership and supporting young researchers in the early stages of their careers is also important and we highlight this through 2 of our regular awards (Rising Stars and Best Thesis). The “Emerging Leaders” event (formerly known as Rising Stars) skipped a year in 2017 but will be happening again in 2018 at the Multimedia Conference.
  • We wish to reach out to other SIGs with whom we could have productive engagement, because we see multimedia as a technology enabler as well as an application unto itself. To this end we will continue to try to hold joint panels or workshops at our conferences.
  • Our research area is marked by the growth and availability of open datasets and grand challenge competitions held at our conferences and workshops. These datasets are often provided from the corporate sector and this is both an opportunity for us to do research on datasets otherwise unavailable to us, as well as being a threat to the balance between corporate influence and independence.
  • In a previous annual report we highlighted the difficulties caused by a significant portion of our conference proceedings not being indexed by Thomson Web of Science. In a similar vein, we find our conference proceedings are not used as input to CSRankings, a metrics-based ranking of Computer Science institutions worldwide. Publishing at venues which are considered in CSRankings’ operation is important to much of our community, and while we are in the process of trying to redress this, the support of ACM in making this case would be welcome.

Towards Data Driven Conferences Within SIGMM

There is no doubt that research in our field has become more data driven. And while the term data science might suggest there is some science without data, it is data that feeds our systems, trains our networks, and tests our results. Recently we have seen a few conferences experimenting with their review process to gain new insights. For those of us in the Multimedia (MM) community, this is not an entirely new thing. In 2013, I led an effort along with my TPC Co-Chairs to look at the past several years of conferences and examine the reviewing system, the process, the scores, and the reviewer load. Additionally, we ran surveys of the authors (accepted and rejected), the reviewers, and the ACs to gauge how we did. This was presented at the business meeting in Barcelona along with the suggestion that this practice continue. While it was met with great praise, it was never repeated.

Fast forward to 2017: I found myself asking the same questions about the MM review process, which went through several changes (such as the late addition of the “Thematic Workshops” as well as an explicit COI track for papers from the Chairs, something we stated in 2013 could have adverse effects). And, just like before, I requested data from the Director of Conferences and the SIGMM Chair so I could run an analysis. There are a few things to note about the 2017 data.

  • Some reviews were contained in attachments which were unavailable.
  • Rebuttals were not present (some chairs allowed them, some did not).
  • The conference was divided into a “Regular” set of tracks and a “COI” track for anyone who was on the PC and submitted a paper.
  • The Call for Papers was a mixture of “Papers and Posters”. 

The conference reports:

Finally, the Program Committee accepted 189 out of 684 submissions, yielding an acceptance rate of 27.63 percent. Among the 189 accepted full papers, 49 are selected to give oral presentations on the conference, while the rest are arranged to present to conference attendees in a poster format. 

Track               Reviewed   Accepted   Rate
Main Paper          684        189        27.63%
Thematic Workshop   495        64         12.93%

In 2017, in a departure from previous years, the chairs decided to invite roughly 9% of the accepted papers for an oral presentation, with the remaining accepts delegated to a larger poster session. During the review process, to be inclusive, a decision was made to invite some of the rejected papers to a non-archival Thematic Workshop, where the work could be presented as a poster in a non-archival format such that the article could still be published elsewhere at a future date. The published rate for these Thematic Workshops was 64/495, or roughly 13% of the rejected papers. To dive in further, we first compute the accepted orals and posters against the total submitted. Second, amongst the rejected papers, we compute the percentage of rejects that were invited to a Thematic Workshop. Note that in the dataset there were 113 papers invited to Thematic Workshops; 49 of these did not make it into the program because the authors declined the automatic enrollment invitation.

Decision    Normal   COI   Normal Rate   COI Rate
Oral        41       8     7.03%         7.92%
Poster      123      17    21.1%         16.83%
Workshop    79       34    18.85%        44.74%
Reject      339      42    58.15%        41.58%

Comparing the Regular and COI tracks, we find the scores to be correlated (p&lt;0.003) if the workshops are treated as rejects. Including the workshops in the calculation shows no correlation (p&lt;0.093). To examine this further, we plotted the percent decision by area and track.

Decision Percent Rates by Track and Area

While one must remember that the numbers by volume are smaller in the COI track, some inflation can be seen here. Again, by percentage, you can see that Novel Topics – Privacy and Experience – Novel Interactions have a higher oral accept rate, while Understanding – Vision & Deep Learning and Experience – Perceptual pulled in higher Thematic Workshop rates.

No real change was seen in the score distributions across the tracks and areas (as seen in the following jitter plots).

Jitter plots of score distributions by track and area.

For the review lengths, the average size by character count was 1452 with an IQR of 1231. Some reviews skewed longer in the Regular track, but they are outliers for the most part. The load averaged around 4 papers per reviewer, with some normal exceptions. The people depicted with more than 10 papers were TPC members or ACs.

Review lengths and reviewer load by track.

Overall, there is some difference but still a correlation between the COI and Regular tracks, and the average number of papers per reviewer was kept to a manageable number. The score distributions seem roughly similar, with the exception of the IQR, but this is likely more a product of the COI track being smaller. For the Thematic Workshops there’s an inflation in the accept rate for the COI track: 18.85% for the regular submissions but 44.74% for the COI. This was dampened by authors declining the Thematic Workshop invitation. Of the 79 Regular Workshop invitations and 34 COI invitations, only 50 regular and 14 COI were in the final program. So the final accept rates for what was actually presented at the conference became 11.93% for Regular Thematic Workshop submissions and 18.42% for COI.
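As a sanity check, the percentages above can be reproduced from the raw counts in the tables; track totals of 583 Regular and 101 COI submissions are implied by the 684 overall, and the workshop rates are computed over the rejected pool:

```python
# Reproduce the decision rates reported above. Oral/poster/reject rates
# are computed over all submissions in a track; workshop rates over the
# rejected pool (total minus accepted orals and posters).
tracks = {
    # track: (total submissions, orals, posters, invited, in final program)
    "Regular": (583, 41, 123, 79, 50),
    "COI":     (101, 8, 17, 34, 14),
}

for name, (total, oral, poster, invited, in_program) in tracks.items():
    rejected = total - oral - poster  # pool eligible for a workshop invite
    print(f"{name}: oral {oral/total:.2%}, poster {poster/total:.2%}, "
          f"workshop invited {invited/rejected:.2%}, "
          f"workshop in program {in_program/rejected:.2%}")

# Regular: oral 7.03%, poster 21.10%, invited 18.85%, in program 11.93%
# COI:     oral 7.92%, poster 16.83%, invited 44.74%, in program 18.42%
```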

So where do we go from here?

Removal of the COI track. A COI track comes and goes in ACM MM, and it seems its removal is at the top of the list. Modern conference management software (EasyChair, PCS, CMT, etc.) already handles conflicts extremely well.

The TPC and ACs must monitor reviews. Next, while quantity is not related to quality, a short review length might be an early indicator of poor quality. TPC Chairs and ACs should monitor these reviews, because a review of 959 characters is not particularly informative whether it is positive or negative (in fact, this paragraph is almost as long as the average review). While some might believe trapping that error is the job of the authors and the Author Advocate (and hence the authors who need to invoke the Advocate), it is the job of the ACs and the TPC to ensure review quality and make sure the Advocate never gets invoked (as we presented the role back when we invented it in 2014).

CMS systems need to support us. There is no shortage of Conference Management Systems (CMS); none of them are data-driven. Why do I have to export a set of CSVs from a conference system and then write R scripts to see that there are short reviews? Yelp and TripAdvisor give me guidance on how long my review should be; how is it that a review for ACM MM can be two short sentences?
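As an example of the kind of check a data-driven CMS could run natively, here is a minimal sketch over a hypothetical CSV export; the file name and column names are invented placeholders, to be adapted to whatever a given system actually exports:

```python
# Flag suspiciously short reviews in a CSV export from a conference
# management system. The columns ("paper_id", "reviewer", "review_text")
# are hypothetical; adapt them to your system's export format.
import csv

MIN_CHARS = 1000  # anything well below the ~1452-character average is suspect

with open("reviews.csv", newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f):
        n = len(row["review_text"])
        if n < MIN_CHARS:
            print(f"paper {row['paper_id']}: review by {row['reviewer']} "
                  f"is only {n} characters")
```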

Provide upfront submission information. The Thematic Workshops were introduced late in the submission process and came as a surprise to many authors. While some innovation in the technical program is a good idea, the decline rate showed this change was undesirable. Some a priori communication with the community might give insights into which experiments we should try and which to avoid. This leads to the final point.

We need a new role. While the SIGMM EC has committed to looking back at past conferences, we should continue this practice routinely. Conferences, or ideally the SIGMM EC, should create a “Data Health and Metrics” role (or assign this to the Director of Conferences) to oversee the TPC as well as issue post-review and post-conference surveys, so that we learn how we did at each step and ensure we can move forward and grow our community. If done right, it will be considerable work and should likely be its own role.

To get started, the SIGMM Executive Committee is working on obtaining past MM conference datasets to further track the history of the conference in a data-forward manner. Hopefully you’ll hear more at the ACM MM Business Meeting and Town Hall in Seoul; SIGMM is looking to hear more from the community.