Report from ACM MMSys 2020 by Conor Keighrey

Conor Keighrey (@ConorKeighrey) recently completed his PhD at the Athlone Institute of Technology, where his research aimed to capture and understand the quality of experience (QoE) within a novel immersive multimedia speech and language assessment. He is currently interested in exploring the application of immersive multimedia technologies within health, education and training.


With a warm welcome from Istanbul, Ali C. Begen (Ozyegin University and Networked Media, Turkey) opened MMSys 2020. In light of the global pandemic, the conference took a new format, being delivered online for the first time. This, however, was not the only first for MMSys: Laura Toni (University College London, UK) was introduced as the first-ever female co-chair of the conference. This year, the organising committee presented a gender- and culturally-diverse line-up of researchers from around the globe. In parallel, two grand challenges were introduced for the first time at MMSys, on the topics of “Improving Open-Source HEVC Encoding” and “Low-Latency Live Streaming”.

The conference attracted paper submissions on a range of multimedia topics including, but not limited to, streaming technologies, networking, machine learning, volumetric media, and fake media detection tools. These core areas were complemented by in-depth keynotes delivered by academic and industry experts.

One example was Ryan Overbeck’s (Google, USA) keynote on “Light Fields – Building the Core Immersive Photo and Video Format for VR and AR”, presented on the first day. Light fields provide the opportunity to capture full six-degrees-of-freedom (6DOF) movement and photo-realism in virtual reality. In his talk, Ryan provided key insights into the camera rigs and results from Google’s recent efforts to perfect the capture of virtual representations of real-world spaces.

Later during the conference, Roderick Hodgson from Amber Video presented an interesting keynote on “Preserving Video Truth: an Anti-Deepfakes Narrative”. Roderick delivered a fantastic overview of the emerging area of deepfakes, and of the platforms being developed to detect what will, without a doubt, become highly influential media streams in the future. The discussion closed with Stefano Petrangeli asking how the concept of deepfakes could be applied within the context of AR filters. Although AR is in its infancy from a visual quality perspective, the future may rapidly change how we perceive faces through immersive multimedia experiences utilizing AR filters. The concept is interesting, and it leads to the question of what challenges these emerging technologies will bring.

Although not the main focus of the MMSys conference, the co-located workshops have always stood out for me. I have attended MMSys for the last three years, and the warm welcome expressed by all members of the research community has been fantastic. The workshops, however, have always shone through, as they provide the opportunity to meet those who are working in focused areas of multimedia research. This year’s MMSys was no different as it hosted three workshops:

  • NOSSDAV – The International workshop on Network and Operating System Support for Digital Audio and Video
  • PV – The International Packet Video Workshop
  • MMVE – The International Workshop on Immersive Mixed and Virtual Environment Systems

With a focus on novel immersive media experiences, the MMVE workshop was highly successful, with five key presentations exploring the topics of game mechanics, cloud computing, head-mounted display field-of-view prediction, navigation, and delay. Highlights include the work presented by Na Wang et al. (George Mason University), which explored field-of-view prediction within augmented reality experiences on mobile platforms. With the emergence of new and proposed areas of research in the augmented reality cloud, field-of-view prediction will alleviate some of the challenges associated with optimizing network communication for novel immersive multimedia experiences in the future.

Unlike in previous years, the conference organisers faced the challenge of creating social events which were completely online. A trivia night hosted on Zoom brought over 40 members of the MMSys community together virtually to test their general knowledge. Utilizing the online platform “Kahoot”, attendees were challenged with a series of 47 questions. With great interaction from the audience, the event provided a great opportunity to socialise in a relaxed manner, much like its real-world counterpart!

The leaderboard was close towards the end, with Wei Tsang Ooi taking first place thanks to a last-minute bonus question! Jean Botev and Roderick Hodgson took second and third place respectively. Events like this have always been a highlight of the MMSys community; we hope to see it take place this coming year in person over some quiet beers and snacks!

Mea Wang opened the N2Women Meeting on the 10th of June. The event openly discussed core topics such as the separation of work and life within the research community, with the primary objective of assisting new researchers in maintaining a healthy work-life balance. Overall, the event was a success; the topic of work-life balance is important for those at all stages of their research careers. Reflecting on my own experiences during my PhD, it can be a struggle to determine when to “clock out” and when to spend a few extra hours engaged with research. Key members of the community shared their personal experiences and discussed other topics such as the importance of mentoring, as academic supervisors can often become mentors for life. Ozgu Alay discussed the importance of developing connections at research-oriented events. Those new to the community should not be afraid to spark a conversation with experts in the field; often the ideal approach is to take an interest in their work and begin the discussion from there.

Lastly, Mea Wang mentioned that the initiative had initially acquired funding for travel support and childcare for those attending the conference. Due to the online nature of this year’s event, these supports have been set aside for next year. Such funding provides a fantastic opportunity to offset the cost of attending an international conference and engage with the multimedia community!

Ali C. Begen closed the conference with the announcement of the awards. The Best Paper Award was presented by Özgü Alay and Christian Timmerer, who announced Nan Jiang et al. as the winners for their paper “QuRate: Power-Efficient Mobile Immersive Video Streaming”. The paper is available for download on the ACM Digital Library. The conference closed with the announcement of key celebrations for next year, as the NOSSDAV workshop celebrates its 30th anniversary and the Packet Video workshop celebrates its 25th!

Overall, the expertise in multimedia shone through in this year’s ACM MMSys, with fantastic keynotes, presentations, and demonstrations from researchers around the globe. Although there are many benefits to attending a virtual conference, after numerous experiences this year I can’t help but feel there is something missing. Over the past three years, I have attended ACM MMSys in person as a PhD candidate, and one of the major benefits of in-person events is the social encounters. Although this year’s iteration of ACM MMSys did a phenomenal job of presenting these events in the new and unexpected virtual format, I believe it is the social events which shine through, as they provide the opportunity to meet, discuss, and develop professional and social links throughout the multimedia research community in a more relaxed setting.

As a result, I look forward to what Özgü Alay, Cheng-Hsin Hsu, and Ali C. Begen have in store for us at ACM Multimedia Systems 2021, located in the beautiful city of Istanbul, Turkey.

ACM IMX 2020: What does “going virtual” mean?

I work in the Research & Development department at the BBC, based in London. My interests include Interactive and Immersive Media, Interaction Design, Evaluative Methods, Virtual Reality, Augmented Reality, Synchronised Experiences & Connected Homes.
In the interest of full disclosure, I serve on the steering board of ACM Interactive Media Experiences (IMX) as Vice President for Conferences. It was an honour to be invited to the organising committee as one of IMX’s first Diversity Co-Chairs and as a Doctoral Consortium Co-Chair. I will also be the General Co-Chair for ACM IMX 2021.
I hope you join us at IMX 2021 but if you need convincing, please read on about my experiences with IMX 2020!
I am quite active on Twitter (@What2DoNext), so I don’t think it came as a massive surprise to the IMX community that I won the award of the Best Social Media Reporter for ACM IMX 2020. Here are some of the award-winning tweets describing a workshop, a creative challenge, the opening keynote, my co-author presenting our paper (which incidentally won an honourable mention), the closing keynote and announcing the venue for ACM IMX 2021. This report is a summary of my experiences with IMX 2020.

Before the conference

Summary of activities at IMX 2020.

For the first time in its history, IMX was going entirely virtual. As if that wasn’t enough, IMX 2020 was also the conference that got rebranded: in 2019, it was called TVX – Interactive Experiences for Television and Online Video! The steering committee unanimously voted to rename and rebrand it to reflect the fact that the conference had outgrown its original remit. The new name – Interactive Media Experiences (IMX) – was succinct and all-encompassing of the conference’s current scope. With the rebrand came a revival of principles and ethos. For the first time in the history of IMX, the organising committee worked with the steering committee to include Diversity co-chairs.

The tech industry has suffered from a lack of diverse representation, and 2020 was the year we decided to try to improve the situation in the IMX community. So, in addition to holding the position of Doctoral Consortium co-chair, a relatively well-defined role, I was invited to be one of two Diversity chairs. The conference was going to take place in Barcelona, Spain – a city I have been lucky to visit multiple times. I love the people, the culture, the food (and wine) and the city, especially in the summer. The organisation was on track when, due to the unprecedented global pandemic, we called an emergency meeting to immediately transfer conference activities to various online platforms. Unfortunately, we lost one keynote, a panel and three workshops, but we managed to transform the rest into a live virtual event over a combination of platforms: Zoom, Mozilla Hubs, Miro, Slack & Sli.do.

The organising committee came together to reach out to the IMX community to ask for their help in converting their paper, poster and demo presentations to a format suitable for a virtual conference. We were quite amazed at how the community came together to make the virtual conference possible. Quite a few of us spent a lot of late nights getting everything ready!

We set about creating an accessible program and proceedings with links to the various online spaces scheduled to host track sessions and links to papers for better access using the SIGCHI progressive web app and the ACM Publishing System. It didn’t hurt that one of our Technical Program chairs, David A. Shamma, is the current SIGCHI VP of Operations. It was also helpful to have access to the ACM’s guide for virtual conferences and the experience gained by folks like Blair McIntyre (general co-chair of IEEE VR 2020 & Professor at Georgia Institute of Technology). We also got lots of support from Liv Erickson (Emerging Tech Product Manager at Mozilla).

About a week before the conference, Mario Montagud (General Co-Chair) sent an email to all registered attendees to inform them about how to join. Honestly, there were moments when I thought it might be touch and go. I had issues with my network, last-minute committee jobs kept popping up, and social distancing was becoming problematic.

During the conference…

Traditionally, IMX brings together international researchers and practitioners from a wide range of disciplines to attend workshops and challenges on the first day followed by two days of keynotes, panels, paper presentations, posters and demos. The activities are interspersed with lunches, networking with colleagues, copious coffee and a social event. 

The advantage of a virtual event is that I had no jet lag and woke up in my own bed at home on the day of the conference. However, I had to provide my own coffee and lunches in the 2020 instantiation, while (very briefly) considering the option of attending an international conference in my pyjamas. The other early difference is that I didn’t get a name badge in a conference-branded registration packet; however, due to my committee roles at IMX 2020, the communications team made us Zoom background ‘badges’ – which I loved!

Virtual Backgrounds for use in Zoom.

My first day was exciting and diverse! I had a three-hour workshop in the morning (starting 10 AM BST) titled “Toys & the TV: Serious Play”, which I had organised with my colleagues Suzanne Clark and Barbara Zambrini from BBC R&D, Christoph Ziegler from IRT and Rainer Kirchknopf from ZDF. We had a healthy interest in the workshop and enthusiastic contributions. A few of the attendees contributed idea/position papers, while the other attendees were asked to support their favourite amongst the presented ideas. Groups were then sent to breakout rooms to work on a concept and produce a newspaper-style summary page of an exemplar manifestation of the idea. We all worked over Zoom and a collaborative whiteboard on Miro. It was the virtual version of an interactive “post-it on a wall” type workshop.

Then it was time for lunch and a cup of tea while managing home learning activities for my kids. Usually, I would have been hunting for a quiet place in the conference venue (depending on the time difference) to facetime with my kids. None of that in 2020! I could chat with my fellow organising committee members to make sure things were running smoothly and offer aid if needed. Most of the day’s activities were being efficiently coordinated by Mario, who was based at the i2Cat offices in Barcelona during the conference.

Around 4 PM (BST), I had a nearly four-hour creative challenge meet-up. Before that, however, I dropped into the IMX in Latin America workshop, which was organised by colleagues in (you guessed it) Latin America as a way to introduce the work they do to IMX. Things were going well in that workshop, so after a quick hello to the organisers, I rushed over to take part in the creative challenge!

The creative challenge, titled “Snap Creative Challenge: Reimagine the Future of Storytelling with Augmented Reality (AR)”, was an invited event. It was sponsored by Snap (Andrés Monroy-Hernández) and co-organised by Microsoft Research (Mar González-Franco) and BBC Research & Development (myself). Over the preceding six months, eleven academic teams from eight countries had created AR projects to demonstrate their vision of what storytelling would look like in a world where AR is more prevalent. We mentored the teams with the help of Anthony Steed (University College London), Nonny de La Peña (Emblematic Group), Rajan Vaish (Snap), Vanessa Pope (Queen Mary, University of London), and some colleagues who generously donated their time and expertise. We started with a welcome to the event (hosted on Zoom) given by Andrés Monroy-Hernández, and then it was straight into presentations of the projects. Snap created a summary video of the ideas presented on the day.

Each project was distinct, unique and had the potential for so much more development and expansion. The creative challenge was closed by one of the co-founders of Snap (Bobby Murphy). After closing, some teams had office hours where we could go and have an extended chat about the various projects. Everyone was super enthusiastic and keen to share ideas.

It was 8.20 PM, so I had to end the day with a glass of wine with my other half, but I had a brilliant day and couldn’t get over how many interesting people I got to chat to – and it was just the first day of the conference! On the second day, Christian Timmerer (Alpen-Adria-Universität Klagenfurt & Bitmovin) and I had an hour-long doctoral consortium to host, bright and early at 9 AM (BST). Three doctoral students presented a variety of topics. Each student was assigned two mentors who were experts in the student’s field. This year, the organising committee was keen to ensure diverse participation through all streams of the conference, so Christian and I kept this in mind in choosing mentors for the doctoral students. We were also able to invite mentors regardless of whether they could travel to a venue, since everyone was attending online. In a way, it gave us more freedom to be diverse in our choices and thinking. It turns out one hour merely whetted everyone’s appetite, but the conference had other activities scheduled in the day, so I quite liked having a short break before my next session at noon! Time for another cup of coffee and a piece of chocolate!

The general chairs (Pablo Cesar – CWI; Mario Montagud and Sergi Fernandez – i2Cat) welcomed everyone to the conference at noon (BST). Pablo gave a summary of the number of participants we had at IMX. This is one of the most unfortunate things about a virtual conference: it’s difficult to get a sense of ‘being together’ with the other attendees, but we got some idea from Pablo. Asreen Rostami (RISE) and I gave a summary of the diversity and inclusion activities we put in place through the organisation of the conference to begin the process of improving the representation of under-represented groups within the IMX community. Unfortunately, a lot of the plans were not implemented once IMX 2020 went virtual, but some of the guidance to inject diverse thinking into all parts of the conference was still carried out – ensuring that the make-up of the ACs was diverse, encouraging workshop organisers to include a diverse set of participants and use inclusive language, casting a wider net in our search for keynotes and mentors, and selecting a time period to run the conference that was best suited to the majority of our attendees. The Technical Program Co-Chair (Lucia D’Acunto, TNO) gave a summary of how the tracks were populated with respect to papers. To round off the opening welcome for IMX 2020, Mario gave an overview of the communication channels, the tools used and the conference program.

The wonderful thing about being at a virtual conference is that you can easily screenshot presentations, so you have a good record of what happened. In pre-pandemic times, I would have photographed the slides on a screen on stage from my seat in the auditorium hall. So unfashionable in 2020 – you will agree. Getting a visual reminder of talks is useful if you want to remember key points! It is also exceedingly good for illustrations as part of a report you might write about the conference three months later.

Sergi Fernandez introduced the opening keynote speaker: Mel Slater (University of Barcelona), who talked about using Virtual Reality to Change Attitudes and Behaviour. Mel was my doctoral supervisor between 2001 and 2006, when I did my PhD at UCL. He was the reason I decided to focus my postgraduate studies on building expressive virtual characters. It was fantastic to “go to a conference with him” again, even if he got the seat with the better weather. His opening keynote was engaging, entertaining and gave a lot of food for thought. He also had a new video of his virtual self being a rock star. To this day, I believe this is the main reason he got into VR in the first place! And why ever not?

Immediately after Mel’s talk and Q&A session, it was time to inform attendees about the demos and posters available for viewing as part of the conference. The demos and posters were displayed in a series of Mozilla Hubs rooms (domes) created by Jesús Gutierrez (Universidad Politecnica de Madrid, Demo co-chair) and me, based on models given to us by Liv (Mozilla). We were able to personalise the virtual spaces and give them a Spanish twist using a couple of panorama images David A. Shamma (FXPAL & Technical Program co-chair for IMX 2020) found on Flickr. Ayman and Julie Williamson (Univ. of Glasgow) also enabled the infrastructure behind the IMX Hubs spaces. Jesús and I gave a short ‘how-to’ presentation to let attendees know what to expect in the IMX Hubs spaces. After our presentation, Mario played a video of pitches giving us quick lightning summaries of the demos, work-in-progress poster presentations and doctoral consortium poster displays.

Thirty minutes later, it was time for the first paper session of the day (and the conference)! Ayman chaired the first four papers in a session titled ‘Augmented TV’. The first paper presented was one I co-authored with Radu-Daniel Vatavu (Univ. Stefan cel Mare of Suceava), Pejman Saeghe (Univ. of Manchester), Teresa Chambel (Univ. of Lisbon), and Marian F Ursu (Univ. of York). The paper (‘Conceptualising Augmented Reality Television for the Living Room’) examined the characteristics of Augmented Reality Television (ARTV) by analysing commonly accepted views on augmented and mixed reality systems, by reviewing previous work and tangential fields (ambient media, interactive TV, 3D TV etc.), and by proposing a conceptual framework for ARTV – the “Augmented Reality Television Continuum”. The presentation is on ACM SIGCHI’s YouTube channel if you feel like watching Pejman talk about the paper instead of reading it, or maybe in addition to reading it!

Ayman and Pejman talking about our paper ‘Conceptualising Augmented Reality Television for the Living Room’.

I did not present the paper, but I was still relieved that it was done! I have noticed that once a paper I was involved with is done, I tend to have enough headspace to engage and ask questions of other authors. So that’s what I was able to do for the rest of the conference. In that same first paper session, Simon von der Au (IRT) et al. presented ‘The SpaceStation App: Design and Evaluation of an AR Application for Educational Television’ in which they got to work with models and videos of the International Space Station! Now, I love natural history documentaries so when I need to work with content, I don’t think I can go wrong if I choose David Attenborough narrated content – think Blue Planet. However, the ISS is a close second! They also cited two of my co-authored papers – Ziegler et al. 2018 and Saeghe et al. 2019 – which is always lovely to see.

After the first session, we had a 30-minute break before making our way to the Hubs domes to look at demos and posters. Our outstanding student volunteers were deployed to guide IMX attendees to the various domes. It was very satisfying to see all our Hubs spaces populated with demos and posters, with snippets of conversation drifting past as I moved through the domes to see how folks were faring in the space. The whole experience resulted in a lot of selfies and images!

There were moments of delight throughout the event. I thought I’d rebel against my mom and get pink hair! Pablo got purple hair and IRL he does not have hair that colour (or that uniformly distributed). Ayman and I tried getting some virtual drinks – I got myself a pina colada while Ayman stayed sober. I also visited all the posters and demos which seldom happens when I attend conferences IRL. In Hubs, it was an excellent way to ‘bump into’ folks. I have been in the IMX community for a while, so I was able to recognise many people by reading their floating name labels. Most of their avatars looked nothing like the people I knew! Christian and Omar Niamut (TNO) had more photorealistic avatars but even those were only recognisable if I squinted! I was also very jealous of Omar’s (and Julie’s) virtual hands which they got because they visited the domes using their VR headsets. It was loads of fun seeing how people represented themselves through their virtual clothes, hair and body choice. 

All of the demos and posters were well presented, but ‘Watching Together but Apart’ caught my eye because I knew my colleagues Rajiv Ramdhany, Libby Miller, and Kristian Hentschel built ‘BBC Together’ – an experimental BBC R&D prototype that enables people to watch and listen to BBC programmes together while they are physically apart. It was a response to the situation brought to a lot of our doorsteps by the pandemic! It was amazing to see that another research group responded in the same way by building a similar application. It was great fun talking to Jannik Munk Bryld about their project and comparing notes.

Once the paper session was over, there was a 45-minute break to stretch our legs and rest our eyes. Longer in-between-session breaks are a necessity in virtual conferences. At 2:30 PM (BST), it was time to listen to two industry talks chaired by Steve Schirra (YouTube) and Mikel Zorrilla (Vicomtech). Mike Darnell (Samsung Electronics America) talked about conclusions he drew from a survey of hundreds of participants, which focused on user behaviour when choosing what to watch on TV. The main take-home message was that people generally know in advance exactly what they want to watch.

Natàlia Herèdia (Media UX Design) talked about her pop-up media lab, which focuses on designing an OTT service for a local public channel. She spoke of the process she used and gave a summary of her work on reaching new audiences.

After the industry talks, it was time for a half-hour break. The organising committee and student volunteers went out to the demo domes in Hubs to get a group selfie! We realised that Ayman has serious ambitions when it comes to cinematography. After we got our shots, we attended another paper session, chaired by Aisling Kelliher (Virginia Tech), titled ‘Live Production and Audience’. Other people might have mosquitos or mice as a pest problem. In this paper session, I learnt that there are people like Aisling whose pest problems are a little bigger – bear-sized, in fact! So many revelations in such a short time!

The first paper of the last session, titled ‘DAX: Data-Driven Audience Experiences in Esports’, was presented by Athanasios Vasileios Kokkinakis (Univ. of York). He gave a fascinating insight into how companion-screen applications might allow audiences to consume interesting data-driven insights during and around Esports broadcasts. It was great to see this sort of work, since I have some history of working on companion-screen applications, with sports being one of the genres that could benefit from multi-device applications. The paper won the best paper award! Yvette Wohn (New Jersey Institute of Technology) presented a paper, titled ‘Audience Management Practices of Live Streamers on Twitch’, in which she interviewed Twitch streamers to understand how streamers discover audience composition and use appropriate mechanisms to interact with them. The last paper of the conference was presented by Marian – ‘Authoring Interactive Fictional Stories in Object-Based Media (OBM)’. The paper referred to quite a few BBC R&D OBM projects. Again, it was quite lovely to see some reaffirmation of ideas, with similar thought processes flowing through the screen.

At 6 PM (BST), I had the honour of chairing the closing keynote by Nonny. Nonny had a lot of unique immersive journalism pieces to show us! She also gave us a live demo of her XR creation, remixing and sharing platform – REACH.love. She imported a virtual character inspired by the Futurama animated character – Bender. Incidentally, my very first virtual character was also created in Bender’s image. I had to remove the antenna from his head because Anthony Steed, who was my project lead at the time, wasn’t as appreciative of my character design – tragic times.

Alas, we had come to the end of the conference, which meant it was time for Mario to give a summary of how many attendees participated in IMX 2020 – spoiler: it was the highest attendance yet. He also handed out various awards. It turns out that our co-authored paper on ‘Conceptualising Augmented Reality Television for the Living Room’ got an honourable mention! More importantly, I was awarded Best Social Media Reporter, which is of course why you are reading this report! I guess this is an encouragement to keep on tweeting about IMX!

Frank Bentley (Verizon Media, IMX Steering Committee president) gave a short presentation in which he acknowledged that it was June the 19th – Juneteenth (Freedom Day) in the US. He gave a couple of poignant suggestions on how we might consider marking the day. He also talked about the rebranding exercise that resulted in the conference going from TVX to IMX.

Frank also announced that we are looking for host bids for IMX 2022! As VP of Conferences, I would be very excited to hear from you! Please do email me if you are looking for information about hosting an IMX conference in 2022 or beyond. You can also drop me a tweet @What2DoNext!

He then handed over the floor to Yvette and me to announce the proposed venue of IMX 2021 – New York! A few of the organising committee positions are still up for grabs. Do consider joining our exciting and diverse organising committee if you feel like you could contribute to making IMX 2021 a success! In the meantime, I managed to persuade my lovely colleague at BBC R&D (Vicky Barlow) to make a teaser video to introduce IMX 2021.

That brought us, sadly, to the end of IMX 2020. The stragglers of the IMX community lingered for a bit of a chat over Zoom, which was lovely.

After the conference…

You would think that once the conference was over, that was it – but no, not so. In years past, all that was left to do was to stalk people you met at the conference on LinkedIn to make sure the ‘virtual business cards’ were saved. Of course, I did a bit of that this year as well. However, this year was a much more involved experience. I had the chance to define the role of Diversity chairs with Asreen. I had the chance to work with Ayman, Julie, Jesús, Liv and Blair to bring demos and posters to Hubs as part of the IMX 2020 virtual experience. It was a blast! You might have thought that I would be taking a rest. You would be wrong!

I am joining forces with Yvette and a whole new committee to start organising IMX 2021 – New York, in a format that continues the success of IMX 2020 and strives to improve on it. Finally, let’s not forget Frank’s reminder that we are looking for colleagues out there (maybe you?) to host IMX 2022 and beyond!

The story continues… Do get in touch!

JPEG Column: 88th JPEG Meeting

The 88th JPEG meeting, initially planned to be held in Geneva, Switzerland, was held online because of the Covid-19 outbreak.

JPEG experts organised a large number of sessions spread over day and night to allow remote participation from multiple time zones. This very intense activity resulted in multiple outputs and initiatives. In particular, two new exploration activities were initiated. The first explores possible standardisation needs to address the growing emergence of fake media by introducing appropriate security features to prevent the misuse of media content. The second considers the use of DNA for media content archival.

Furthermore, JPEG has started work on the new Part 8 of the JPEG Systems standard, called JPEG Snack, for interoperable rich image experiences, and it is holding two Calls for Evidence: JPEG AI and JPEG Pleno Point Cloud Coding.

Despite travel restrictions, the JPEG Committee has managed to keep up with the majority of its plans, defined prior to the COVID-19 outbreak. An overview of the different activities is represented in Fig. 1.

The 88th JPEG meeting had the following highlights:

  • JPEG explores standardization needs to address fake media
  • JPEG Pleno Point Cloud call for evidence
  • JPEG DNA – archival of media content using DNA
  • JPEG AI call for evidence
  • JPEG XL standard evolves to a final specification
  • JPEG Systems Part 8, named JPEG Snack, progresses
  • JPEG XS ballots raw Bayer image sensor data compression

Fig. 1: JPEG ongoing activities timeline.

JPEG explores standardization needs to address fake media

Recent advances in media manipulation, particularly deep-learning-based approaches, can produce near-realistic media content that is almost indistinguishable from authentic content to the human eye. These developments open opportunities for the production of new types of media content that are useful for the entertainment industry and other businesses, e.g., the creation of special effects or artificial natural-scene production with actors in the studio. However, they also lead to issues of fake media generation undermining the integrity of the media (e.g., deepfakes), copyright infringement and defamation, to mention a few examples. Misuse of manipulated media can cause social unrest, spread rumours for political gain or encourage hate crimes. In this context, the term ‘fake’ is used here to refer to any manipulated media, independently of its ‘good’ or ‘bad’ intention.

In many application domains, fake media producers may want, or may be required, to declare the type of manipulations performed, in opposition to other situations where the intention is to ‘hide’ the mere existence of such manipulations. This is already leading various governmental organizations to plan new legislation, and companies (especially social media platforms and news outlets) to develop mechanisms that can clearly detect and annotate manipulated media content when it is shared. While growing efforts are noticeable in developing technologies, there is a need for a standard media/metadata format, e.g., a JPEG standard, that facilitates secure and reliable annotation of fake media, in both good-faith and malicious usage scenarios. To better understand the fake media ecosystem and the needs in terms of standardization, the JPEG Committee has initiated an in-depth analysis of fake media use cases, naturally independently of the “intentions”.
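To make the annotation idea a little more concrete, the sketch below shows the kind of integrity primitive such a scheme could build on: a cryptographic fingerprint of the media payload that any verifier can recompute and compare. This is a generic illustration, not a JPEG proposal, and the helper name is ours.

```python
# Hedged sketch of an integrity primitive such annotation could build on
# (assumption: the helper below is hypothetical, not a JPEG proposal). A real
# standard would pair such a digest with signed, structured metadata; here we
# only show that any change to the payload changes the fingerprint.
import hashlib

def content_fingerprint(media_bytes: bytes) -> str:
    """Return the SHA-256 hex digest of a media payload."""
    return hashlib.sha256(media_bytes).hexdigest()

original = b"...jpeg codestream bytes..."   # placeholder payload
fingerprint = content_fingerprint(original)

tampered = original + b"\x00"               # any manipulation alters the digest
assert content_fingerprint(tampered) != fingerprint
```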

More information on the initiative is available on the JPEG website. Interested parties are invited to join the corresponding ad hoc group (AHG) through the following URL: http://listregistration.jpeg.org.

JPEG Pleno Point Cloud

JPEG Pleno is working towards the integration of various modalities of plenoptic content under a single and seamless framework. Efficient and powerful point cloud representation is a key feature within this vision. Point cloud data supports a wide range of applications including computer-aided manufacturing, entertainment, cultural heritage preservation, scientific research and advanced sensing and analysis. During the 88th JPEG meeting, the JPEG Committee released a Final Call for Evidence on JPEG Pleno Point Cloud Coding that focuses specifically on point cloud coding solutions supporting scalability and random access of decoded point clouds. Between the 88th and 89th meetings, the JPEG Committee will be actively promoting this activity and collecting registrations to participate in the Call for Evidence.

JPEG DNA

In digital media, notably images, the relevant representation symbols, e.g., quantized DCT coefficients, are expressed in bits (i.e., binary units), but they could be expressed in any other units, for example DNA units, which follow a 4-ary representation basis. This would mean that DNA molecules may be created with a specific configuration of DNA units which stores media representation symbols, e.g., the symbols of a JPEG image, thus leading to DNA-based media storage as a form of molecular data storage. JPEG standards have long been used for the storage and archival of digital pictures as well as moving images. While the legacy JPEG format is widely used for photo storage on SD cards and for the archival of pictures by consumers, JPEG 2000, as described in ISO/IEC 15444, is used in many archival applications, notably for the preservation of cultural heritage in the form of pictures and video in digital format. This puts the JPEG Committee in a unique position to address the challenges of DNA-based storage by creating a standard image representation and coding for such applications. To explore this, an AHG has been established. Interested parties are invited to join the AHG through the following URL: http://listregistration.jpeg.org.
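As a toy illustration of the 4-ary idea, the sketch below maps each pair of bits to one of the four nucleotides (A, C, G, T) and back. This naive mapping is ours, not part of any JPEG specification; practical DNA storage codes add biochemical constraints and error correction on top of such a mapping.

```python
# Toy sketch of a 2-bits-per-nucleotide mapping (assumption: this naive code
# is ours, not from any JPEG specification). Practical DNA storage codes add
# constraints (e.g., avoiding long homopolymer runs) and error correction.

BITS_TO_BASE = {0b00: "A", 0b01: "C", 0b10: "G", 0b11: "T"}
BASE_TO_BITS = {base: bits for bits, base in BITS_TO_BASE.items()}

def bytes_to_dna(data: bytes) -> str:
    """Encode a byte stream as a DNA sequence, four nucleotides per byte."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):            # walk the byte two bits at a time
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    return "".join(bases)

def dna_to_bytes(seq: str) -> bytes:
    """Decode a sequence produced by bytes_to_dna back into bytes."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        byte = 0
        for base in seq[i:i + 4]:
            byte = (byte << 2) | BASE_TO_BITS[base]
        out.append(byte)
    return bytes(out)

payload = b"\xff\xd8\xff"                     # the first bytes of a JPEG file
encoded = bytes_to_dna(payload)
print(encoded)                                # TTTTTCGATTTT
assert dna_to_bytes(encoded) == payload
```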

JPEG AI

At the 88th meeting, the submissions to the Call for Evidence were reported and analysed. Six submissions were received in response to the Call for Evidence, made in coordination with the IEEE MMSP 2020 Challenge. The submissions, along with the anchors, were evaluated using objective quality metrics. Following this initial process, subjective experiments have been designed to compare the performance of all submissions. Thus, during this meeting, the main focus of JPEG AI was on the presentation and discussion of the objective performance evaluation of all submissions, as well as the definition of the methodology for the subjective evaluation that will follow.
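For readers unfamiliar with objective quality evaluation, the sketch below computes PSNR, one of the simplest metrics of the kind used to compare decoded images against references; the actual JPEG AI evaluation used a broader suite of metrics and committee-defined anchors, so this is only a generic illustration.

```python
# Minimal sketch of PSNR (assumption: illustrative only; the JPEG AI
# evaluation used a suite of metrics, of which PSNR is the simplest example).
import numpy as np

def psnr(reference: np.ndarray, decoded: np.ndarray, max_value: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    mse = np.mean((reference.astype(np.float64) - decoded.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")                       # identical images
    return 10.0 * np.log10((max_value ** 2) / mse)

reference = np.full((8, 8), 128, dtype=np.uint8)  # toy flat-grey "image"
decoded = reference.copy()
decoded[0, 0] = 133                               # one pixel off by 5
print(round(psnr(reference, decoded), 2))         # ~52.21 dB
```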

JPEG XL

The standardization of the JPEG XL image coding system is nearing completion. Final technical comments by national bodies have been received for the codestream (Part 1); the DIS has been approved and an FDIS text is under preparation. The container file format (Part 2) is progressing to the DIS stage. A white paper summarizing key features of JPEG XL is available at http://ds.jpeg.org/whitepapers/jpeg-xl-whitepaper.pdf.

JPEG Systems

ISO/IEC has approved the JPEG Snack initiative to deliver interoperable rich image experiences. As a result, JPEG Systems Part 8 (ISO/IEC 19566-8) has been created to define the file format construction and the metadata signalling and descriptions which enable animation with transition effects. A Call for Participation and updated use cases and requirements have been issued. The use cases and requirements document and the CfP are available at http://ds.jpeg.org/documents/wg1n87035-REQ-JPEG_Snack_Use_Cases_and_Requirements_v2_2.pdf and http://ds.jpeg.org/documents/wg1n88032-SI-CfP_JPEG_Snack.pdf respectively.

An updated working draft for the JLINK initiative was completed. Interested parties are encouraged to review the JLINK Working Draft 3.0, available at http://ds.jpeg.org/documents/wg1n88031-SI-JLINK_WD_3_0.pdf.

JPEG XS

The JPEG Committee is pleased to announce a significant step in the standardization of an efficient Bayer image compression scheme, with the first ballot of the 2nd Edition of JPEG XS Part-1.

The new edition of this visually lossless, low-latency and lightweight compression scheme now includes image sensor coding tools allowing efficient compression of Color Filter Array (CFA) data. This compression enables better quality and lower complexity than the corresponding compression in the RGB domain. It can be used as a mezzanine codec in various markets, such as real-time video storage in and outside of cameras, and data compression onboard autonomous cars.
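To illustrate what “CFA data” means here, the sketch below splits a raw RGGB Bayer mosaic into its four colour planes; coding the mosaic directly avoids the demosaicing step an RGB-domain codec would require. This is a generic illustration of CFA structure, not JPEG XS code.

```python
# Generic illustration of CFA structure (assumption: not JPEG XS code).
# A CFA-aware codec can compress these sensor planes directly, without
# first demosaicing the frame into an RGB image.
import numpy as np

def split_rggb(raw: np.ndarray):
    """Return the R, G1, G2, B sub-images of an RGGB-patterned sensor frame."""
    r  = raw[0::2, 0::2]   # red samples: even rows, even columns
    g1 = raw[0::2, 1::2]   # green samples sharing rows with red
    g2 = raw[1::2, 0::2]   # green samples sharing rows with blue
    b  = raw[1::2, 1::2]   # blue samples: odd rows, odd columns
    return r, g1, g2, b

raw = np.arange(16, dtype=np.uint16).reshape(4, 4)  # toy 4x4 sensor frame
planes = split_rggb(raw)
print([p.shape for p in planes])                    # [(2, 2), (2, 2), (2, 2), (2, 2)]
```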

Final Quote

“Fake Media has become a challenge with the wide-spread manipulated contents in the news. JPEG is determined to mitigate this problem by providing standards that can securely identify manipulated contents.” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

Future JPEG meetings are planned as follows:

  • No 89, will be held online from October 5 to 9, 2020.

Report from the MMM 2020 Special Session on Multimedia Datasets for Repeatable Experimentation (MDRE 2020)

Introduction

Information retrieval and multimedia content access have a long history of comparative evaluation, and many of the advances in the area over the past decade can be attributed to the availability of open datasets that support comparative and repeatable experimentation. Hence, sharing data and code to allow other researchers to replicate research results is needed in the multimedia modeling field, as it helps to improve the performance of systems and the reproducibility of published papers.

This report summarizes the special session on Multimedia Datasets for Repeatable Experimentation (MDRE 2020), which was organized at the 26th International Conference on MultiMedia Modeling (MMM 2020), held in January 2020 in Daejeon, South Korea.

The intent of these special sessions is to serve as a venue for releasing datasets to the multimedia community and discussing dataset-related issues. The presentation mode in 2020 consisted of short presentations (approximately 8 minutes), followed by a panel discussion moderated by Aaron Duane. In the following, we summarize the special session, including its talks, questions, and discussions.

Presentations

GLENDA: Gynecologic Laparoscopy Endometriosis Dataset

The session began with a presentation on ‘GLENDA: Gynecologic Laparoscopy Endometriosis Dataset’ [1], given by Andreas Leibetseder from the University of Klagenfurt. The researchers worked with experts on gynecologic laparoscopy, a type of minimally invasive surgery (MIS) that is performed via a live feed of a patient’s abdomen to survey the insertion and handling of various instruments for conducting medical treatments. Adopting this kind of surgical intervention not only facilitates a great variety of treatments; the possibility of recording such video streams is also essential for numerous post-surgical activities, such as treatment planning, case documentation and education. The process of manually analyzing these surgical recordings, as carried out in current practice, usually proves tediously time-consuming. In order to improve upon this situation, more sophisticated computer vision and machine learning approaches are actively being developed. Since most of these approaches rely heavily on sample data that, especially in the medical field, is only sparsely available, the researchers published the Gynecologic Laparoscopy ENdometriosis DAtaset (GLENDA) – an image dataset containing region-based annotations of a common medical condition called endometriosis.

Endometriosis is a disorder involving the dislocation of uterine-like tissue. Andreas explained that this dataset is the first of its kind and was created in collaboration with leading medical experts in the field. GLENDA contains over 25K images, about half of which are pathological, i.e., showing endometriosis, and the other half non-pathological, i.e., containing no visible endometriosis. The accompanying paper thoroughly describes the data collection process and the dataset’s properties and structure, while also discussing its limitations. The authors plan on continuously extending GLENDA, including the addition of other relevant categories and, ultimately, lesion severities. Furthermore, they are in the process of collecting “endometriosis suspicion” class annotations in all categories, capturing a common situation where it proves difficult, even for endometriosis specialists, to classify an anomaly without further inspection. The difficulty in classification may be due to several reasons, such as visible video artifacts. Including such challenging examples in the dataset may greatly improve the quality of endometriosis classifiers.

Kvasir-SEG: A Segmented Polyp Dataset

The second presentation was given by Debesh Jha from the Simula Research Laboratory, who introduced the work entitled ‘Kvasir-SEG: A Segmented Polyp Dataset’ [2]. Debesh explained that pixel-wise image segmentation is a highly demanding task in medical image analysis. As with the aforementioned GLENDA dataset, it is difficult in practice to find annotated medical images with corresponding segmentation masks. The Kvasir-SEG dataset is an open-access corpus of gastrointestinal polyp images and corresponding segmentation masks, manually annotated and verified by an experienced gastroenterologist. The researchers demonstrated the use of their dataset with both a traditional segmentation approach and a modern deep-learning-based CNN approach. In addition to presenting the Kvasir-SEG dataset, Debesh also discussed the FCM clustering algorithm and the ResUNet-based approach for automatic polyp segmentation presented in their paper. The results show that the ResUNet model was superior to FCM clustering.
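As a hint of how such segmentation approaches are typically compared, the sketch below computes the Dice similarity coefficient between a predicted binary mask and the ground truth; this is a standard generic metric, not the paper’s exact evaluation code.

```python
# Generic sketch of the Dice similarity coefficient, a standard metric for
# comparing a predicted polyp mask against the ground-truth mask (assumption:
# illustrative only, not the evaluation code used in the paper).
import numpy as np

def dice_coefficient(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-7) -> float:
    """Dice = 2|A ∩ B| / (|A| + |B|) for two binary masks of identical shape."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    intersection = np.logical_and(pred, truth).sum()
    return (2.0 * intersection + eps) / (pred.sum() + truth.sum() + eps)

# Toy example: a 4-pixel ground-truth region and a 6-pixel prediction overlapping it.
truth = np.zeros((4, 4), dtype=bool)
truth[1:3, 1:3] = True
pred = np.zeros((4, 4), dtype=bool)
pred[1:3, 1:4] = True
print(round(dice_coefficient(pred, truth), 3))  # 0.8
```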

The researchers released Kvasir-SEG as an open-source dataset to the multimedia and medical research communities, in the hope that it can help evaluate and compare existing and future computer vision methods. By adding segmentation masks to the Kvasir dataset, which previously consisted only of frame-wise annotations, the authors have enabled multimedia and computer vision researchers to contribute to the field of polyp segmentation and automatic analysis of colonoscopy videos. This could boost the performance of other computer vision methods and may be an important step towards building clinically acceptable CAI methods for improved patient care.

Rethinking the Test Collection Methodology for Personal Self-Tracking Data

The third presentation was given by Cathal Gurrin from Dublin City University and was titled ‘Rethinking the Test Collection Methodology for Personal Self-Tracking Data’ [3]. Cathal argued that, although vast volumes of personal data are being gathered daily by individuals, the MMM community has not really been tackling the challenge of developing novel retrieval algorithms for this data, due to the challenges of getting access to the data in the first place. While initial efforts have taken place on a small scale, it is their conjecture that a new evaluation paradigm is required in order to make progress in analysing, modeling and retrieving from personal data archives. In their position paper, the researchers proposed a new model of Evaluation-as-a-Service that re-imagines the test collection methodology for personal multimedia data in order to address the many challenges of releasing test collections of personal multimedia data. 

After providing a detailed overview of prior research on the creation and use of self-tracking data for research, the authors identified issues that emerge when creating test collections of self-tracking data as commonly used by shared evaluation campaigns. This includes in particular the challenge of finding self-trackers willing to share their data, legal constraints that require expensive data preparation and cleaning before a potential release to the public, as well as ethical considerations. The Evaluation-as-a-Service model is a novel evaluation paradigm meant to address these challenges by enabling collaborative research on personal self-tracking data. The model relies on the idea of a central data infrastructure that guarantees full protection of the data, while at the same time allowing algorithms to operate on this protected data. Cathal highlighted the importance of data banks in this scenario. Finally, he briefly outlined technical aspects that would allow setting up a shared evaluation campaign on self-tracking data.

Experiences and Insights from the Collection of a Novel Multimedia EEG Dataset

The final presentation of the session was also given by Cathal Gurrin from Dublin City University, in which he introduced the topic ‘Experiences and Insights from the Collection of a Novel Multimedia EEG Dataset’ [4]. This work described the growing interest in utilising novel signal sources such as EEG (electroencephalography) in multimedia research. When using such signals, subtle limitations are often not readily apparent without significant domain expertise. Multimedia research outputs incorporating EEG signals can fail to be replicated when only minor modifications have been made to an experiment, or when seemingly unimportant (or unstated) details are changed. Cathal claimed that this can lead to over-optimistic or over-pessimistic viewpoints on the potential real-world utility of these signals in multimedia research activities.

In their paper, the researchers described the EEG/MM dataset and presented a summary of distilled experiences and knowledge gained during the preparation (and utilisation) of the dataset that supported a collaborative neural-image labelling benchmarking task. They stated that the goal of this task was to collaboratively identify machine learning approaches that would support the use of EEG signals in areas such as image labelling and multimedia modeling or retrieval. The researchers stressed that this research is relevant for the multimedia community as it suggests a template experimental paradigm (along with datasets and a baseline system) upon which researchers can explore multimedia image labelling using a brain-computer interface. In addition, the paper provided insights and experience of commonly encountered issues (and useful signals) when conducting research that utilises EEG in multimedia contexts. Finally, this work provided insight on how an EEG dataset can be used to support a collaborative neural-image labelling benchmarking task.

Discussion

After the presentations, Aaron Duane moderated a panel discussion in which all presenters participated, as well as Björn Þór Jónsson who joined the panel as one of the special session chairs.

The panel began with a question about how the research community should address data anonymity in large multimedia datasets and how, even if the dataset is isolated and anonymised, data analysis techniques can be utilised to reverse this process either partially or completely. The panel agreed this was an important question and acknowledged that there is no simple answer. Cathal Gurrin stated that there is less of a restrictive onus on the datasets used for such research because the owners of the dataset often provide it with full knowledge of how it will be used.

As a follow-up, the questioner asked the panel about GDPR compliance in this context and the fact that uploaders could potentially change their minds about allowing their datasets to be used in research several years after release. The panel acknowledged that this remains an open concern and raised an additional one: the malicious uploading of data without the consent of the owner. One solution provided by the panel was the introduction of an additional layer of security in the form of a human curator, who could review the security and privacy concerns of a dataset during its generation, as is the case with some datasets of personal data currently being released to the community.

The discussion continued with much interest continuing to be directed toward effective privacy in datasets, especially when dealing with personal data, such as those generated by lifeloggers. One audience member recalled a story where a personal dataset was publicly released and individuals were able to garner personal information about individuals who were not the original uploader of the dataset and who did not consent to their face or personal information being publicly released. Cathal and Björn acknowledged that this remains an issue but drew attention to advanced censoring techniques such as automatic face blurring which is rapidly maturing in the domain. Furthermore, they claimed that the proposed model of Evaluation-as-a-Service discussed in Cathal’s earlier presentation could help to further alleviate some of these concerns.

Steering the conversation away from data privacy concerns, Aaron directed a question at Debesh and Andreas regarding the challenges and limitations associated with working directly with medical professionals to generate their datasets on medical disorders. Debesh stated that there were numerous challenges, such as the medical professionals being unfamiliar with the tools used in the generation of this work, and that in many cases multiple medical professionals and their opinions were required, as they would often disagree. This generated significant technical and administrative overhead for the researchers, which resulted in a slow pace of progress. Andreas stated that such issues were identical for him and his colleagues, and highlighted the importance of effective communication between the medical experts and the technical researchers.

Towards the end of the discussion, the panel discussed how to encourage the release of more large-scale multimedia datasets for experimentation, and what challenges are currently associated with that. The panel responded that the process remains difficult, but that special sessions such as this one are very helpful. The recognition of papers associated with multimedia datasets is becoming increasingly apparent, with many exceptional papers earning hundreds of citations within the community. The panel also stated that we should be mindful of the nature of each dataset, as releasing the same type of dataset again and again is not beneficial and has the potential to do more harm than good.

Conclusions

The MDRE special session, in its second incarnation at MMM 2020, was organised to facilitate the publication of high-quality datasets, and for community discussions on the methodology of dataset creation. The creation of reliable and shareable research artifacts, such as datasets with reliable ground truths, usually represents tremendous effort; effort that is rarely valued by publication venues, funding agencies or research institutions. In turn, this leads many researchers to focus on short-term research goals, with an emphasis on improving results on existing and often outdated datasets by small margins, rather than boldly venturing where no researchers have gone before. Overall, we believe that more emphasis on reliable and reproducible results would serve our community well, and the MDRE special session is a small effort towards that goal.

Acknowledgements

The session was organized by the authors of the report, in collaboration with Duc-Tien Dang-Nguyen (Dublin City University), who could not attend MMM. The panel format of the special session made the discussions much more engaging than that of a traditional special session. We would like to thank the presenters, and their co-authors for their excellent contributions, as well as the members of the audience who contributed greatly to the session.

References

  • [1] Leibetseder A., Kletz S., Schoeffmann K., Keckstein S., and Keckstein J. “GLENDA: Gynecologic Laparoscopy Endometriosis Dataset.” In: Cheng WH. et al. (eds) MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol. 11962, 2020. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_36.
  • [2] Jha D., Smedsrud P.H., Riegler M.A., Halvorsen P., De Lange T., Johansen D., and Johansen H.D. “Kvasir-SEG: A Segmented Polyp Dataset.” In: Cheng WH. et al. (eds) MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol. 11962, 2020. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_37.
  • [3] Hopfgartner F., Gurrin C., and Joho H. “Rethinking the Test Collection Methodology for Personal Self-tracking Data.” In: Cheng WH. et al. (eds) MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol. 11962, 2020. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_38.
  • [4] Healy G., Wang Z., Ward T., Smeaton A., and Gurrin C. “Experiences and Insights from the Collection of a Novel Multimedia EEG Dataset.” In: Cheng WH. et al. (eds) MultiMedia Modeling. MMM 2020. Lecture Notes in Computer Science, vol. 11962, 2020. Springer, Cham. https://doi.org/10.1007/978-3-030-37734-2_39.

JPEG Column: 87th JPEG Meeting

The 87th JPEG meeting, initially planned to be held in Erlangen, Germany, was held online from 25 to 30 April 2020 because of the Covid-19 outbreak. JPEG experts participated in a number of online meetings, attempting to make them as effective as possible while considering participation from different time zones, ranging from Australia to California, U.S.A.

JPEG decided to proceed with a Second Call for Evidence on JPEG Pleno Point Cloud Coding and continued work to prepare for contributions to the previous Call for Evidence on Learning-based Image Coding Technologies (JPEG AI).

The 87th JPEG meeting had the following highlights:

  • JPEG Pleno Point Cloud Coding issues a Call for Evidence on coding solutions supporting scalability and random access of decoded point clouds.
  • JPEG AI defines evaluation methodologies of the Call for Evidence on machine learning based image coding solutions.
  • JPEG XL defines a file format compatible with existing formats. 
  • JPEG exploration on Media Blockchain releases use cases and requirements.
  • JPEG Systems releases a first version of JPEG Snack use cases and requirements.
  • JPEG XS announces significant improvement of the quality of raw-Bayer image sensor data compression.

JPEG Pleno Point Cloud

JPEG Pleno is working towards the integration of various modalities of plenoptic content under a single and seamless framework. Efficient and powerful point cloud representation is a key feature within this vision. Point cloud data supports a wide range of applications including computer-aided manufacturing, entertainment, cultural heritage preservation, scientific research and advanced sensing and analysis. During the 87th JPEG meeting, the JPEG Committee released a Second Call for Evidence on JPEG Pleno Point Cloud Coding that focuses specifically on point cloud coding solutions supporting scalability and random access of decoded point clouds. The Second Call for Evidence on JPEG Pleno Point Cloud Coding has a revised timeline reflecting changes in the activity due to the 2020 COVID-19 pandemic. A Final Call for Evidence on JPEG Pleno Point Cloud Coding is planned for release in July 2020.

JPEG AI

The main focus of JPEG AI was the promotion and definition of the submission and evaluation methodologies of the Call for Evidence (in coordination with the IEEE MMSP 2020 Challenge) that was issued as an outcome of the 86th JPEG meeting in Sydney, Australia.

JPEG XL

The file format for the JPEG XL (ISO/IEC 18181-1) codestream, metadata and extensions has been defined. The file format enables compatibility with ISOBMFF, JUMBF, XMP, Exif and other existing standards. Standardisation has now reached the Committee Draft stage, and the DIS ballot is ongoing. A white paper about JPEG XL’s features and tools was approved at this meeting and is available on the jpeg.org website.

JPEG exploration on Media Blockchain – Call for feedback on use cases and requirements

JPEG has determined that blockchain and distributed ledger technologies (DLT) have great potential as technology components to address many privacy- and security-related challenges in digital media applications, including digital rights management, privacy and security, integrity verification, and authenticity. These challenges impact society in several ways, including the loss of income in the creative sector due to piracy, the spread of fake news, and evidence tampering for fraud purposes.

JPEG is exploring standardization needs related to media blockchain to ensure seamless interoperability and integration of blockchain technology with widely accepted media standards. In this context, the JPEG Committee announces a call for feedback from interested stakeholders on the first public release of the use cases and requirements document.

JPEG Systems initiates standardisation of JPEG Snack

Media “snacking”, the consumption of multimedia in short bursts (less than 15 minutes), has become globally popular. JPEG recognises the need to standardise how snack images are constructed to ensure interoperability. A first version of the JPEG Snack use cases and requirements is now complete and publicly available on the JPEG website, and feedback from stakeholders is invited.

JPEG also made progress on a fundamental capability of the JPEG file structure, with enhancements to the JPEG Universal Metadata Box Format (JUMBF) to support embedding common file types; the DIS text for JUMBF Amendment 1 is ready for ballot. Likewise, the JPEG 360 Amendment 1 DIS text is ready for ballot; this amendment supports stereoscopic 360-degree images and accelerated rendering for regions of interest, and removes the XMP signature block from the metadata description.

JPEG XS – The JPEG Committee is pleased to announce a significant improvement in the quality of its upcoming raw-Bayer compression.

Over the past year, an improvement of around 2 dB has been observed for the new coding tools currently being developed for image sensor compression within JPEG XS. This visually lossless, low-latency and lightweight compression scheme can be used as a mezzanine codec in various markets, such as real-time video storage inside and outside of cameras and data compression onboard autonomous cars. A mathematically lossless capability is also being investigated, and encapsulation within MXF or SMPTE ST 2110-22 is currently being finalised.

Final Quote

“JPEG is committed to the development of new standards that provide state-of-the-art imaging solutions to the largest spectrum of stakeholders. During the 87th meeting, held online because of the Covid-19 pandemic, JPEG progressed well with its current activities and even launched new ones. Although some timelines had to be revisited, overall, no disruptions of the workplan have occurred,” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JPEG, JPEG 2000, JPEG XR, JPSearch, JPEG XT and more recently, the JPEG XS, JPEG Systems, JPEG Pleno and JPEG XL families of imaging standards.

More information about JPEG and its work is available at jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

Future JPEG meetings are planned as follows:

  • No 88, initially planned in Geneva, Switzerland, July 4 to 10, 2020, will be held online from July 7 to 10, 2020.

MPEG Column: 129th MPEG Meeting in Brussels, Belgium

The original blog post can be found at the Bitmovin Techblog and has been modified/updated here to focus on and highlight research aspects.

The 129th MPEG meeting concluded on January 17, 2020 in Brussels, Belgium with the following topics:

  • Coded representation of immersive media – WG11 promotes Network-Based Media Processing (NBMP) to the final stage
  • Coded representation of immersive media – Publication of the Technical Report on Architectures for Immersive Media
  • Genomic information representation – WG11 receives answers to the joint call for proposals on genomic annotations in conjunction with ISO TC 276/WG 5
  • Open font format – WG11 promotes Amendment of Open Font Format to the final stage
  • High efficiency coding and media delivery in heterogeneous environments – WG11 progresses Baseline Profile for MPEG-H 3D Audio
  • Multimedia content description interface – Conformance and Reference Software for Compact Descriptors for Video Analysis promoted to the final stage

Additional Important Activities at the 129th WG 11 (MPEG) meeting

The 129th WG 11 (MPEG) meeting was attended by more than 500 experts from 25 countries working on important activities including (i) a scene description for MPEG media, (ii) the integration of Video-based Point Cloud Compression (V-PCC) and Immersive Video (MIV), (iii) Video Coding for Machines (VCM), and (iv) a draft call for proposals for MPEG-I Audio among others.

The corresponding press release of the 129th MPEG meeting can be found here: https://mpeg.chiariglione.org/meetings/129. This report focuses on network-based media processing (NBMP), architectures for immersive media, compact descriptors for video analysis (CDVA), and an update on adaptive streaming formats (i.e., DASH and CMAF).

MPEG picture at Friday plenary; © Rob Koenen (Tiledmedia).

Coded representation of immersive media – WG11 promotes Network-Based Media Processing (NBMP) to the final stage

At its 129th meeting, MPEG promoted ISO/IEC 23090-8, Network-Based Media Processing (NBMP), to Final Draft International Standard (FDIS). The FDIS stage is the final vote before a document is officially adopted as an International Standard (IS). During the FDIS vote, national bodies are only allowed to cast a Yes/No vote and are no longer able to make any technical changes. However, project editors are able to fix typos and make other necessary editorial improvements.

What is NBMP? The NBMP standard defines a framework that allows content and service providers to describe, deploy, and control media processing for their content in the cloud, using libraries of pre-built third-party functions. The framework includes an abstraction layer to be deployed on top of existing commercial cloud platforms and is designed to integrate with 5G core and edge computing. The NBMP workflow manager is another essential part of the framework, enabling the composition of multiple media processing tasks that process incoming media and metadata from a media source and produce processed media streams and metadata ready for distribution to media sinks.
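
To make the workflow idea more concrete, the following sketch shows how a content provider might hand a processing pipeline to a workflow manager. It is a minimal illustration only: all field names, function URNs and URLs are hypothetical, and the normative NBMP workflow description (ISO/IEC 23090-8) defines its own JSON schema and APIs that this toy example does not reproduce.

```python
# Hypothetical, simplified sketch of an NBMP-style workflow description.
# All field names, URNs and URLs are illustrative; the normative NBMP
# (ISO/IEC 23090-8) JSON schema differs in structure and detail.
import json

workflow = {
    "workflow-id": "live-abr-pipeline-01",
    "media-source": {"protocol": "rtmp", "url": "rtmp://ingest.example.com/live"},
    "tasks": [
        # Each task references a pre-built third-party function from a repository.
        {"id": "deinterlace", "function": "urn:example:function:deinterlace",
         "connect-to": "transcode"},
        {"id": "transcode", "function": "urn:example:function:transcode",
         "parameters": {"ladder": ["1080p@6Mbps", "720p@3Mbps", "480p@1.5Mbps"]},
         "connect-to": "package"},
        {"id": "package", "function": "urn:example:function:package",
         "parameters": {"format": "cmaf"}},
    ],
    "media-sink": {"protocol": "https", "url": "https://cdn.example.com/out/"},
}

# A workflow manager would receive such a document through the NBMP workflow
# API, allocate cloud resources, instantiate each task, and wire the media and
# metadata flow from the source through the tasks to the sink.
print(json.dumps(workflow, indent=2))
```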

Why NBMP? With the increasing complexity and sophistication of media services and the incurred media processing, offloading complex media processing operations to the cloud/network is becoming critically important in order to keep receiver hardware simple and power consumption low.

Research aspects: NBMP reminds me a bit of what has been done in the past in MPEG-21, specifically Digital Item Adaptation (DIA) and Digital Item Processing (DIP). The main difference is that MPEG now targets APIs rather than pure metadata formats, which is a step in the right direction, as APIs can be implemented and used right away. NBMP will be particularly interesting in the context of new networking approaches including, but not limited to, software-defined networking (SDN), information-centric networking (ICN), mobile edge computing (MEC), fog computing, and related aspects in the context of 5G.

Coded representation of immersive media – Publication of the Technical Report on Architectures for Immersive Media

At its 129th meeting, WG11 (MPEG) published an updated version of its technical report on architectures for immersive media. This technical report, which is the first part of the ISO/IEC 23090 (MPEG-I) suite of standards, introduces the different phases of MPEG-I standardization and gives an overview of the parts of the MPEG-I suite. It also documents use cases and defines architectural views on the compression and coded representation of elements of immersive experiences. Furthermore, it describes the coded representation of immersive media and the delivery of a full, individualized immersive media experience. MPEG-I enables scalable and efficient individual delivery as well as mass distribution while adjusting to the rendering capabilities of consumption devices. Finally, this technical report breaks down the elements that contribute to a fully immersive media experience and assigns quality requirements as well as quality and design objectives for those elements.

Research aspects: This technical report provides a kind of reference architecture for immersive media, which may help identify research areas and research questions to be addressed in this context.

Multimedia content description interface – Conformance and Reference Software for Compact Descriptors for Video Analysis promoted to the final stage

Managing and organizing the quickly increasing volume of video content is a challenge for many industry sectors, such as media and entertainment or surveillance. One example task is scalable instance search, i.e., finding content containing a specific object instance or location in a very large video database. This requires video descriptors that can be efficiently extracted, stored, and matched. Standardization enables extracting interoperable descriptors on different devices and using software from different providers so that only the compact descriptors instead of the much larger source videos can be exchanged for matching or querying. ISO/IEC 15938-15:2019 – the MPEG Compact Descriptors for Video Analysis (CDVA) standard – defines such descriptors. CDVA includes highly efficient descriptor components using features resulting from a Deep Neural Network (DNN) and uses predictive coding over video segments. The standard is being adopted by the industry. At its 129th meeting, WG11 (MPEG) has finalized the conformance guidelines and reference software. The software provides the functionality to extract, match, and index CDVA descriptors. For easy deployment, the reference software is also provided as Docker containers.
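
To illustrate why compact descriptors make large-scale instance search practical, here is a minimal sketch that matches fixed-length global descriptors by cosine similarity. This is not the CDVA reference software: the standardized descriptors combine handcrafted and DNN-derived components with predictive coding over video segments, all of which are replaced here by random mock vectors.

```python
# Toy illustration of descriptor-based matching (not the CDVA reference
# software): each database video is summarized by one compact global vector,
# and a query descriptor is matched against all of them by cosine similarity.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(0)
# 1000 mock 512-dimensional descriptors standing in for extracted features.
database = {f"video_{i}": rng.standard_normal(512) for i in range(1000)}

# A query descriptor: a noisy view of one instance already in the database.
query = database["video_42"] + 0.1 * rng.standard_normal(512)

scores = {vid: cosine_similarity(query, desc) for vid, desc in database.items()}
best = max(scores, key=scores.get)
print(best, round(scores[best], 3))  # expected: video_42 with the highest score
```

Note that only the compact vectors, not the much larger source videos, need to be exchanged for such a query, which is exactly the interoperability point the standard addresses.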

Research aspects: The availability of reference software helps to conduct reproducible research (i.e., reference software is typically publicly available for free), and the Docker containers contribute even further to this goal.

DASH and CMAF

The 4th edition of DASH has already been published and is available as ISO/IEC 23009-1:2019. As with previous editions, MPEG’s goal is to make the newest edition of DASH publicly available for free, aiming at industry-wide adoption and adaptation. During the most recent MPEG meeting, we worked towards the first amendment, which will include (i) additional CMAF support and (ii) event processing models, along with minor updates; these amendments are currently in draft form and will be finalized at the 130th MPEG meeting in Alpbach, Austria. An overview of all DASH standards and updates is depicted in the figure below:

ISO/IEC 23009-8, or “session-based DASH operations”, is the newest part of MPEG-DASH. The goal of this part of DASH is to allow customization during certain times of a DASH session while maintaining the underlying media presentation description (MPD) for all other sessions. Thus, MPDs remain cacheable within content distribution networks (CDNs), while additional information can be customized on a per-session basis within a newly added session-based description (SBD). It is understood that the SBD should have an efficient representation to avoid file size issues, and it should not duplicate information typically found in the MPD.
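
The intended split can be pictured as one cacheable MPD template shared by all clients plus a small per-session document that supplies session-specific values. The sketch below is a hypothetical illustration of that idea only; the actual SBD syntax and processing model are defined in ISO/IEC 23009-8 and differ from this toy key/value substitution.

```python
# Hypothetical sketch of the session-based DASH idea: one cacheable MPD
# template for all clients, plus a tiny per-session description (SBD) that
# fills in session-specific values. The real SBD format in ISO/IEC 23009-8
# differs from this toy substitution scheme.
MPD_SEGMENT_TEMPLATE = "https://cdn.example.com/chan1/$token$/seg-$Number$.m4s"

def resolve_url(template: str, sbd: dict, segment_number: int) -> str:
    url = template.replace("$Number$", str(segment_number))
    for key, value in sbd.items():  # apply per-session overrides
        url = url.replace(f"${key}$", value)
    return url

session_sbd = {"token": "user-abc123"}  # fetched per session, never cached
print(resolve_url(MPD_SEGMENT_TEMPLATE, session_sbd, 7))
# -> https://cdn.example.com/chan1/user-abc123/seg-7.m4s
```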

The 2nd edition of the CMAF standard (ISO/IEC 23000-19) will be available soon (it is currently under FDIS ballot), and MPEG is reviewing additional tools in the so-called ‘technologies under consideration’ document. Furthermore, amendments were drafted for additional HEVC media profiles, and exploration activities on the storage and archiving of CMAF content are ongoing.

The next meeting will bring MPEG back to Austria (for the 4th time) and will be hosted in Alpbach, Tyrol. For more information about the upcoming 130th MPEG meeting click here.

Click here for more information about MPEG meetings and their developments

JPEG Column: 86th JPEG Meeting in Sydney, Australia

The 86th JPEG meeting was held in Sydney, Australia.

Among the different activities that took place, the JPEG Committee issued a Call for Evidence on learning-based image coding solutions. This call results from the success of the exploration studies recently carried out by the JPEG Committee, and honours JPEG’s pioneering work in issuing the first image coding standard more than 25 years ago.

In addition, a First Call for Evidence on Point Cloud Coding was issued in the framework of JPEG Pleno. Furthermore, an updated version of the JPEG Pleno reference software and a JPEG XL open source implementation have been released, while JPEG XS continues the development of raw-Bayer image sensor compression.

JPEG Plenary at the 86th meeting.

The 86th JPEG meeting had the following highlights:

  • JPEG AI issues a call for evidence on machine learning based image coding solutions
  • JPEG Pleno issues call for evidence on Point Cloud coding
  • JPEG XL verification tests reveal competitive performance with commonly used image coding solutions 
  • JPEG Systems submitted final texts for Privacy & Security
  • JPEG XS announces new coding tools optimised for compression of raw-Bayer image sensor data

JPEG AI

The JPEG Committee launched a learning-based image coding activity, also referred to as JPEG AI, more than a year ago. This activity aims to find evidence for image coding technologies that offer substantially better compression efficiency than conventional approaches, relying on models that exploit a large image database.

A Call for Evidence (CfE) was issued as an outcome of the 86th JPEG meeting in Sydney, Australia, as a first formal step towards the standardisation of such approaches to image compression. The CfE is organised in coordination with the IEEE MMSP 2020 Grand Challenge on Learning-based Image Coding and will use the same content, evaluation methodologies and deadlines.

JPEG Pleno

JPEG Pleno is working toward the integration of various modalities of plenoptic content under a single and seamless framework. Efficient and powerful point cloud representation is a key feature within this vision. Point cloud data supports a wide range of applications including computer-aided manufacturing, entertainment, cultural heritage preservation, scientific research and advanced sensing and analysis. During the 86th JPEG meeting, the JPEG Committee released a First Call for Evidence on JPEG Pleno Point Cloud Coding to be integrated in the JPEG Pleno framework. This Call for Evidence focuses specifically on point cloud coding solutions that support scalability and random access of decoded point clouds.

Furthermore, a reference software implementation of the JPEG Pleno file format (Part 1) and light field coding technology (Part 2) has been made publicly available as open source on the JPEG GitLab repository (https://gitlab.com/wg1). The JPEG Pleno Reference Software is planned to become an International Standard as Part 4 of JPEG Pleno by the end of 2020.

JPEG XL

The JPEG XL Image Coding System (ISO/IEC 18181) has produced an open source reference implementation, available on the JPEG GitLab repository (https://gitlab.com/wg1/jpeg-xl). The software is available under an Apache 2 licence, which includes a royalty-free patent grant. Speed tests indicate that the multithreaded encoder and decoder outperform libjpeg-turbo. 

Independent subjective and objective evaluation experiments have indicated competitive performance with commonly used image coding solutions while offering new functionalities such as lossless transcoding from legacy JPEG format to JPEG XL. The standardisation process has reached the Draft International Standard stage.

JPEG exploration into Media Blockchain

Fake news, copyright violations, media forensics, privacy and security are emerging challenges in digital media. JPEG has determined that blockchain and distributed ledger technologies (DLT) have great potential as a technology component to address these challenges in transparent and trustable media transactions. However, blockchain and DLT need to be integrated efficiently with a widely adopted standard to ensure broad interoperability of protected images. Therefore, the JPEG committee has organised several workshops to engage with the industry and help to identify use cases and requirements that will drive the standardisation process.

During its Sydney meeting, the committee organised an Open Discussion Session on Media Blockchain and invited local stakeholders to take part in an interactive discussion. The discussion focused on media blockchain and related application areas including, media and document provenance, smart contracts, governance, legal understanding and privacy. The presentations of this session are available on the JPEG website. To keep informed and to get involved in this activity, interested parties are invited to register to the ad hoc group’s mailing list.

JPEG Systems

JPEG Systems & Integration submitted final texts for ISO/IEC 19566-4 (Privacy & Security), ISO/IEC 24800-2 (JPSearch), and ISO/IEC 15444-16 2nd edition (JPEG 2000-in-HEIF) for publication.  Amendments to add new capabilities for JUMBF and JPEG 360 reached Committee Draft stage and will be reviewed and balloted by national bodies.

The JPEG Privacy & Security release is timely, as consumers are increasingly aware of and concerned about the need to protect privacy in imaging applications. JPEG 2000-in-HEIF enables embedding JPEG 2000 images in the HEIF file format. The updated JUMBF provides a more generic means to embed images and other media within JPEG files to enable richer image experiences. The updated JPEG 360 adds stereoscopic 360 images and a method to accelerate the rendering of a region of interest within an image in order to reduce the latency experienced by users. For JLINK, which elaborates the relationships among the media embedded within a file, JPEG Systems & Integration updated the use cases to refine the requirements and continued technical discussions on implementation.

JPEG XS

The JPEG Committee is pleased to announce the specification of new coding tools optimised for the compression of raw-Bayer image sensor data. The JPEG XS project aims at the standardisation of a visually lossless, low-latency and lightweight compression scheme that can be used as a mezzanine codec in various markets. Video transport over professional video links, real-time video storage inside and outside of cameras, and data compression onboard autonomous cars are among the targeted use cases for raw-Bayer image sensor compression. An amendment of the Core Coding System, together with new profiles targeting raw-Bayer image applications, is ongoing and expected to be published by the end of 2020.

Final Quote

“The efforts to find new and improved solutions in image compression have led JPEG to explore new opportunities relying on machine learning for coding. After rigorous analysis in form of explorations during the last 12 months, JPEG believes that it is time to formally initiate a standardisation process, and consequently, has issued a call for evidence for image compression based on machine learning.” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

86th JPEG meeting social event in Sydney, Australia.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JPEG, JPEG 2000, JPEG XR, JPSearch, JPEG XT and more recently, the JPEG XS, JPEG Systems, JPEG Pleno and JPEG XL families of imaging standards.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup. If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  

Future JPEG meetings are planned as follows:

  • No 87, Erlangen, Germany, April 25 to 30, 2020 (Cancelled because of Covid-19 outbreak; Replaced by online meetings.)
  • No 88, Geneva, Switzerland, July 4 to 10, 2020

Report from ACM SIG Heritage Workshop

“What does history mean to computer scientists?” – that was the first question that popped up in my mind when I was to attend the ACM Heritage Workshop in Minneapolis a few months back. And needless to say, the follow-up question was “what does history mean for a multimedia systems researcher?” As a young graduate student, I had the joy of my life when my first research paper on multimedia authoring (a hot topic in those days) was accepted for presentation at the first ACM Multimedia in 1993, and that conference was held alongside SIGGRAPH. Thinking about that, it gives multimedia systems researchers about 25 to 30 years of history. But what a flow of topics this area has seen: from authoring to streaming to content-based retrieval to social media and human-centered multimedia, the research area has been as hot as ever. So, is it the history of research topics, or of the researchers, or both? Then, how about the venues hosting these conferences, the networking events, or the grueling TPC meetings that prepped the conference actions?

Figure 1. Picture from the venue

With only questions and no clear answers, I decided to attend the workshop with an open mind. Most SIGs (Special Interest Groups) in ACM had representation at the workshop, which was organized by the ACM History Committee. I understood that this committee, apart from running the workshop, organizes several efforts to track, record, and preserve computing history across disciplines. This includes identifying distinguished persons (who are retired but made significant contributions to computing), coming up with a customized questionnaire for each person, training the interviewer, recording the conversations, and curating, archiving, and providing them for public consumption. Most SIGs’ efforts were based around their websites. They talked about how they try to preserve conference materials such as paper proceedings (from the days when only paper proceedings were published), meeting notes, pictures, and videos. For instance, some SIGs described how they tracked and preserved ACM’s approval letter for the SIG! 

It was very interesting – and touching – to see some attendees (senior professors) coming to the workshop with boxes of materials – papers, reports, books, etc. They were downsizing or clearing out their offices, and did not feel like throwing the material in recycling bins! These materials were given to ACM and the Babbage Institute (at the University of Minnesota, Minneapolis) for possible curation and storage.

Figure 2. Galleries with collected material

ACM History Committee members talked about how they can fund (at a small level) projects that target specific activities for preserving and archiving computing events and materials. The committee agreed that ACM should take more responsibility in providing technical support for web hosting – though, obviously, it is not certain whether anything tangible will result.

Over the two days of the workshop, I started getting answers to my questions: history can mean pictures and videos taken at earlier MM conferences, TPC meetings, and SIGMM-sponsored events and retreats. Perhaps the earlier paper proceedings that contain information beyond what is found in the corresponding ACM Digital Library versions. Interviews with the different research leaders who built and promoted SIGMM.

It was clear that history meant different things to different SIGs, and, as the SIGMM community, we would have to arrive at our own interpretation, and collect and preserve accordingly. And that made me understand the most obvious and perhaps most important thing: today’s events become tomorrow’s history! No-brainer, right? Preserving today’s SIGMM events will give future generations a richer, more colorful, and more complete SIGMM history!

For the curious ones:

The ACM Heritage Workshop website is at: https://acmsigheritage.dash.umn.edu

Some of the workshop presentation materials are available at: https://acmsigheritage.dash.umn.edu/uncategorized/class-material-posted/

Interview with Dr. Magda Ek Zarki and Dr. De-Yu Chen: winners of the Best MMsys’18 Workshop paper award

Abstract

The ACM Multimedia Systems conference (MMSys’18) was recently held in Amsterdam from 12 to 15 June 2018. The conference brings together researchers in multimedia systems. Four workshops were co-located with MMSys, namely PV’18, NOSSDAV’18, MMVE’18, and NetGames’18. In this column we interview Magda El Zarki and De-Yu Chen, the authors of the best workshop paper, entitled “Improving the Quality of 3D Immersive Interactive Cloud-Based Services Over Unreliable Network”, which was presented at MMVE’18.

Introduction

The ACM Multimedia Systems Conference (MMSys) (mmsys2018.org) was held from 12 to 15 June in Amsterdam, The Netherlands. The MMSys conference provides a forum for researchers to present and share their latest research findings in multimedia systems. MMSys is a venue for researchers who explore complete multimedia systems that provide a new kind of multimedia experience, or whose overall performance improves on the state of the art. This touches on aspects of many hot topics including, but not limited to: adaptive streaming, games, virtual reality, augmented reality, mixed reality, 3D video, Ultra-HD, HDR, immersive systems, plenoptics, 360° video, multimedia IoT, multi- and many-core, GPGPUs, mobile multimedia and 5G, wearable multimedia, P2P, cloud-based multimedia, cyber-physical systems, multi-sensory experiences, smart cities, and QoE.

Four workshops were co-located with MMSys in Amsterdam in June 2018. The paper titled “Improving the Quality of 3D Immersive Interactive Cloud-Based Services Over Unreliable Network” by De-Yu Chen and Magda El Zarki from the University of California, Irvine, was awarded the Comcast Best Workshop Paper Award for MMSys 2018, chosen from among the papers of the following workshops: 

  • MMVE’18 (10th International Workshop on Immersive Mixed and Virtual Environment Systems)
  • NetGames’18 (16th Annual Workshop on Network and Systems Support for Games)
  • NOSSDAV’18 (28th ACM SIGMM Workshop on Network and Operating Systems Support for Digital Audio and Video)
  • PV’18 (23rd Packet Video Workshop)

We approached the authors of the best workshop paper to learn about the research leading up to their paper. 

Could you please give a short summary of the paper that won the MMSys 2018 best workshop paper award?

In this paper we discussed our approach to an adaptive 3D cloud gaming framework. We utilized a collaborative rendering technique to generate partial content on the client, so that the network bandwidth required for streaming the content can be reduced. We also made use of progressive meshes, so that the system can dynamically adapt to changing performance requirements and resource availability, including network bandwidth and computing capacity. We conducted experiments focused on system performance under unreliable network conditions, e.g., when packets can be lost. Our experimental results show that the proposed framework is more resilient under such conditions, which indicates that the approach has a potential advantage especially for mobile applications.
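
As a simplified illustration of the adaptation idea described above (not the authors’ implementation), the sketch below selects a progressive-mesh level of detail so that client-side rendering stays within the device’s compute budget while the remaining streamed content fits the available bandwidth; all numbers are invented.

```python
# Simplified sketch of the adaptation idea behind collaborative rendering with
# progressive meshes (illustrative only, not the paper's implementation): pick
# the highest level of detail (LOD) whose client-side rendering cost and
# residual streaming bitrate both fit the current budgets.
from dataclasses import dataclass

@dataclass
class LodOption:
    vertices: int        # mesh complexity rendered on the client
    render_cost: float   # estimated client GPU cost (arbitrary units)
    stream_kbps: float   # bitrate needed to stream the remaining content

LOD_LADDER = [  # invented numbers: finer meshes cost more compute, less bandwidth
    LodOption(vertices=1_000, render_cost=1.0, stream_kbps=6000),
    LodOption(vertices=10_000, render_cost=3.0, stream_kbps=3500),
    LodOption(vertices=50_000, render_cost=7.0, stream_kbps=1500),
]

def pick_lod(gpu_budget: float, bandwidth_kbps: float) -> LodOption:
    feasible = [o for o in LOD_LADDER
                if o.render_cost <= gpu_budget and o.stream_kbps <= bandwidth_kbps]
    if not feasible:  # fall back to the option cheapest for the client
        return LOD_LADDER[0]
    return max(feasible, key=lambda o: o.vertices)  # render as much locally as possible

print(pick_lod(gpu_budget=5.0, bandwidth_kbps=4000))  # -> the 10,000-vertex option
```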

Does the work presented in the paper form part of some bigger research question / research project? If so, could you perhaps give some detail about the broader research that is being conducted?

A more complete discussion of the proposed framework can be found in our technical report, Improving the Quality and Efficiency of 3D Immersive Interactive Cloud Based Services by Providing an Adaptive Application Framework for Better Service Provisioning, where we discussed the performance trade-offs between video quality, network bandwidth, and local computation on the client. In that report, we also tried to tackle network latency issues by utilizing the 3D image warping technique. In another paper, Impact of information buffering on a flexible cloud gaming system, we further explored the potential performance improvement of our latency reduction approach when more information can be cached and processed.

We received many valuable suggestions and identified a few important future directions. Unfortunately, De-Yu graduated and decided to pursue a career in industry, so he is unlikely to be able to continue working on this project in the near future.

Where do you see the impact of your research? What do you hope to accomplish?

Cloud gaming is an up-and-coming area. Major players like Microsoft and NVIDIA have already launched their own projects. However, it seems to me that there is not yet a solution that is good enough to be widely accepted by users. By providing an alternative approach, we wanted to demonstrate that there are still many unsolved issues and research opportunities, and hopefully inspire further work in this area.

Describe your journey into the multimedia research. Why were you initially attracted to multimedia?

De-Yu: My research interest in cloud gaming systems dates back to 2013, when I worked as a research assistant in Academia Sinica, Taiwan. When I first joined Dr. Kuan-Ta Chen’s lab, my background was in parallel and distributed computing. I joined the lab for a project that aimed to provide a tool to help developers do load balancing in massively multiplayer online video games. Later on, I had the opportunity to participate in the lab’s other project, GamingAnywhere, which aimed to build the world’s first open-source cloud gaming system. Being an enthusiastic gamer myself, having the opportunity to work on such a project was really an enjoyable and valuable experience. That experience became the main reason for continuing to work in this area. 

Magda El Zarki: I have worked in multimedia research since the 1980s, when my PhD work involved the transmission of data, voice and video over a LAN. That network was named MAGNET and was one of the first integrated LANs developed for multimedia transmission. My work continued in that direction with the transmission of video over IP. In conjunction with several PhD students over the past 20 to 30 years, I have developed several tools for the study of video transmission over IP (MPEGTool) and hold several patents related to video over wireless networks. All the work focused on improving the quality of the video via pre- and post-processing of the signal.

Can you profile your current research, its challenges, opportunities, and implications?

There are quite a few challenges in our research. First of all, our approach is an intrusive method: we need to modify the source code of the interactive application, e.g. a game, to apply it. We found it very hard to find a suitable open source game whose source code is neat, clean, and easy to modify. Developing our own fully functioning game is not a reasonable approach, alas, due to the complexity involved. We ended up building a 3D virtual environment walkthrough application to demonstrate our idea. Most reviewers have expressed concerns about synchronization issues in a real interactive game, where there may be AI-controlled objects, non-deterministic processes, or even objects controlled by other players. We agree with the reviewers that this is a very important issue, but currently it is very hard for us to address it with our limited resources. Most of the other research work in this area faces a similar problem to ours – the lack of a viable open source game for researchers to modify. As a result, researchers are forced to build their own prototype applications for performance evaluation purposes. This brings about another challenge: it is very hard to fairly compare the performance of different approaches, given that we all use different applications for testing. However, these difficulties can also be seen as opportunities. There are still many unsolved problems. Some of them may require a lot of time, effort, and resources, but even a little progress can mean a lot, since cloud gaming is an area that is gaining more and more attention as the industry tries to distribute games across many platforms.

“3D immersive and interactive services” seems to encompass both massive multi-user online games as well as augmented and virtual reality. What do you see as important problems for these fields? How can multimedia researchers help to address these problems?

When it comes to gaming or similar interactive applications, it all comes down to the user experience. In the case of cloud gaming, there are many performance metrics that can affect user experience, and identifying what matters most to users is one of the important problems. In my opinion, interactive latency is the most difficult problem to solve among all the performance metrics. There is no trivial way to reduce network latency unless you are willing to pay the cost of large-bandwidth pipes. Edge computing may effectively reduce network latency, but it comes with a high deployment cost.

As large companies start developing their own systems, it is getting harder and harder for independent researchers with limited funding and resources to make major contributions in this area. Still, we believe there are a couple of ways in which independent researchers can make a difference. First, we can limit the scope of the research by simplifying the system, focusing on just one or a few features or components. Unlike corporations, independent researchers usually do not have the resources to build a fully functional system, but we also do not have the obligation to deliver one. That actually enables us to try out some interesting but not so realistic ideas. Second, be open to collaboration. Unlike corporations, which need to keep their projects confidential, we have more freedom to share what we are doing and potentially get more feedback from others. To sum up, I believe that in an area that has already attracted a lot of interest from industry, researchers should try to find something that companies cannot or are not willing to do, instead of trying to compete with them.

If you were conducting this interview, what questions would you ask, and then what would be your answers?

The real question is: is cloud gaming viable? It seems to make economic sense as companies try to reach a broader and more remote audience. However, computing costs are cheaper than bandwidth costs, so maybe throwing computing power at the problem makes more sense – build more powerful end devices that can handle the computing load of a complex game, and only use the network for player interactivity.

Biographies of MMSys’18 Best Workshop Paper Authors

Prof Magda El Zarki (Professor, University of California, Irvine):

Magda El Zarki

Prof. El Zarki’s lab focuses on multimedia transmission over the Internet. The work consists of both theoretical studies and practical implementations to test algorithms and new mechanisms for improving the quality of service on the user device. Both wireline and wireless networks and all types of video and audio media are considered. Recent work has shifted to networked games and massively multi-user virtual environments (MMUVE). The focus is mostly on studying the quality of experience of players in applications where precision and time constraints are a major concern for game playability. A new effort also focuses on the development of games and virtual experiences in the arenas of education and digital heritage.

De-Yu Chen (PhD candidate, University of California, Irvine):

De-Yu Chen

De-Yu Chen is a PhD candidate at UC Irvine. He received his M.S. in Computer Science from National Taiwan University in 2009, and his B.B.A. in Business Administration from National Taiwan University in 2006. His research interests include multimedia systems, computer graphics, big data analytics and visualization, parallel and distributed computing, and cloud computing. His most recent research project focused on improving the quality and flexibility of cloud gaming systems.

JPEG Column: 79th JPEG Meeting in La Jolla, California, U.S.A.

The JPEG Committee had its 79th meeting in La Jolla, California, U.S.A., from 9 to 15 April 2018.

During this meeting, JPEG had a final celebration of the 25th anniversary of its first JPEG standard, usually known as JPEG-1. This celebration coincided with two interesting facts. The first was the approval of a reference software for JPEG-1, “only” 25 years later. At the time the first JPEG standard was approved, reference software was not considered, as has since become common for image coding standards. However, the JPEG Committee decided that it was still important to provide one, as current applications and standards can benefit greatly from such a specification. The second coincidence was the launch of a call for proposals for a next generation image coding standard, JPEG XL. This standard will define a new representation format for photographic information that incorporates current technological developments, and can become an alternative to the 25-year-old JPEG standard.

An informative two-hour JPEG Technologies Workshop marked the 25th anniversary celebration on Friday, April 13, 2018. The workshop featured presentations by several committee members on current and future JPEG Committee activities, with the following program:


Touradj Ebrahimi, convenor of JPEG, presenting an overview of JPEG technologies.

  • Overview of JPEG activities, by Touradj Ebrahimi
  • JPEG XS by Antonin Descampe and Thomas Richter
  • HTJ2K by Pierre-Anthony Lemieux
  • JPEG Pleno – Light Field, Point Cloud, Holography by Ioan Tabus, Antonio Pinheiro, Peter Schelkens
  • JPEG Systems – Privacy and Security, 360 by Siegfried Foessel, Frederik Temmermans, Andy Kuzma
  • JPEG XL by Fernando Pereira, Jan De Cock

After the workshop, a social event was organized at which a past JPEG Convenor, Eric Hamilton, was recognized for key contributions to JPEG standardization.

The La Jolla JPEG meeting had the following highlights:

  • Call for proposals of a next generation image coding standard, JPEG XL
  • JPEG XS profiles and levels definition
  • JPEG Systems defines a 360 degree format
  • HTJ2K
  • JPEG Pleno
  • JPEG XT
  • Approval of the JPEG Reference Software

The following summarizes various activities during JPEG’s La Jolla meeting.

JPEG XL

Billions of images are captured, stored and shared on a daily basis demonstrating the self-evident need for efficient image compression. Applications, websites and user interfaces are increasingly relying on images to share experiences, stories, visual information and appealing designs.

User interfaces can target devices with stringent constraints on network connection and/or power consumption in bandwidth-constrained environments. Even though network capacities are improving globally, bandwidth is often constrained to levels that inhibit application responsiveness. User interfaces that utilize images with larger resolutions, higher dynamic ranges, wider color gamuts and higher bit depths further contribute to larger volumes of data, even in higher-bandwidth environments.

The JPEG Committee has launched a Next Generation Image Coding activity, referred to as JPEG XL. This activity aims to develop a standard for image coding that offers substantially better compression efficiency than existing image formats (e.g. more than 60% improvement when compared to the widely used legacy JPEG format), along with features desirable for web distribution and efficient compression of high-quality images.

To this end, the JPEG Committee has issued a Call for Proposals following its 79th meeting in April 2018, with the objective of seeking technologies that fulfill the objectives and scope of a Next Generation Image Coding. The Call for Proposals (CfP), with all related info, can be found at jpeg.org. The deadline for expression of interest and registration is August 15, 2018, and submissions to the Call are due September 1, 2018. To stay posted on the action plan for JPEG XL, please regularly consult our website at jpeg.org and/or subscribe to our e-mail reflector.


JPEG XS

This project aims at the standardization of a visually lossless, low-latency, lightweight compression scheme that can be used as a mezzanine codec for the broadcast industry, Pro-AV and other markets such as VR/AR/MR applications and autonomous cars. Important use cases identified include video transport over professional video links (SDI, IP, Ethernet), real-time video storage, memory buffers, omnidirectional video capture and rendering, and sensor compression in the automotive industry. During the La Jolla meeting, profiles and levels were defined to help implementers accurately size their designs for their specific use cases. Transport of JPEG XS over IP networks or SDI infrastructures is also being specified and will be finalized during the next JPEG meeting in Berlin (July 9-13, 2018). The JPEG Committee therefore invites interested parties, in particular coding experts, codec providers, system integrators and potential users of the foreseen solutions, to contribute to the specification process. Publication of the core coding system as an International Standard is expected in Q4 2018.


JPEG Systems – JPEG 360

The JPEG Committee continues to make progress towards its goal of defining a common framework and definitions for metadata which will improve the ability to share 360 images and provide the basis for new forms of user interaction with images. At the 79th JPEG meeting in La Jolla, the committee received responses to the call for proposals it had issued for JPEG 360 metadata. As a result, JPEG Systems is readying a committee draft of “JPEG Universal Metadata Box Format (JUMBF)” as ISO/IEC 19566-5, and “JPEG 360” as ISO/IEC 19566-6. The box structure defined by JUMBF allows JPEG 360 to define a flexible metadata schema and to link JPEG code streams embedded in the file. It also allows keeping unstitched image elements for omnidirectional captures together with the main image and descriptive metadata in a single file. Furthermore, JUMBF lays the groundwork for a uniform approach to integrating tools that satisfy the emerging requirements for privacy and security metadata.
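
Because JUMBF builds on the familiar box syntax of ISO base media file format files (a length, a four-character type, then the payload), walking a box sequence takes only a few lines of code. The sketch below is a generic, minimal box walker for illustration; it is not a conformant JUMBF parser and ignores details such as 64-bit box sizes, superbox nesting and the JUMBF description box semantics.

```python
# Minimal, generic walker for ISO BMFF-style boxes (4-byte big-endian length,
# 4-character type, then payload). Illustrative only: a real JUMBF parser must
# also handle 64-bit sizes, nested superboxes and the description box.
import struct

def iter_boxes(data: bytes, offset: int = 0):
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        if size < 8:  # malformed, or size == 1 signalling a 64-bit size we skip
            break
        yield box_type.decode("ascii", "replace"), data[offset + 8:offset + size]
        offset += size

# Toy example: two hand-built boxes, the second carrying text metadata.
blob = (struct.pack(">I4s", 8 + 4, b"ftyp") + b"toy!"
        + struct.pack(">I4s", 8 + 5, b"meta") + b"hello")
for box_type, payload in iter_boxes(blob):
    print(box_type, payload)
```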

To stay posted on JPEG 360, please regularly consult our website at jpeg.org and/or subscribe to the JPEG 360 e-mail reflector. 


HTJ2K

High Throughput JPEG 2000 (HTJ2K) aims to develop an alternate block-coding algorithm that can be used in place of the existing block coding algorithm specified in ISO/IEC 15444-1 (JPEG 2000 Part 1). The objective is to significantly increase the throughput of JPEG 2000, at the expense of a small reduction in coding efficiency, while allowing mathematically lossless transcoding to and from codestreams using the existing block coding algorithm.

As a result of a Call for Proposals issued at its 76th meeting, the JPEG Committee has selected a block-coding algorithm as the basis for Part 15 of the JPEG 2000 suite of standards, known as High Throughput JPEG 2000 (HTJ2K). The algorithm has demonstrated an average tenfold increase in encoding and decoding throughput compared to the algorithms based on JPEG 2000 Part 1. This increase in throughput comes at the cost of less than 15% average loss in coding efficiency, and allows mathematically lossless transcoding to and from JPEG 2000 Part 1 codestreams.

A Working Draft of Part 15 to the JPEG 2000 suite of standards is now under development.


JPEG Pleno

The JPEG Committee is currently pursuing three activities in the framework of the JPEG Pleno Standardization: Light Field, Point Cloud and Holographic content coding.

JPEG Pleno Light Field finished a third round of core experiments for assessing the impact of individual coding modules and started work on creating software for a verification model. Moreover, additional test data has been studied and approved for use in future core experiments. Working Draft documents for JPEG Pleno specifications Part 1 and Part 2 were updated. A JPEG Pleno Light Field AhG was established with mandates to create a common test conditions document; perform exploration studies on new datasets, quality metrics, and random-access performance indicators; and to update the working draft documents for Part 1 and Part 2.

Furthermore, use cases were studied and are under consideration for JPEG Pleno Point Cloud. A current draft list is under discussion for the next period and will be updated and mapped to the JPEG Pleno requirements. A final document on use cases and requirements for JPEG Pleno Point Cloud is expected at the next meeting.

JPEG Pleno Holography has reviewed the draft of a holography overview document. Moreover, the current databases were classified according to use cases, and plans to analyze numerical reconstruction tools were established.


JPEG XT

The JPEG Committee released two corrigenda to JPEG XT Part 1 (core coding system) and JPEG XT Part 8 (lossless extension JPEG-1). These corrigenda clarify the upsampling procedure for chroma-subsampled images by adopting the centered upsampling in use by JFIF.


JPEG Reference Software

The JPEG Committee is pleased to announce that the CD ballot for Reference Software has been issued for the original JPEG-1 standard. This initiative closes a long-standing gap in the legacy JPEG standard by providing two reference implementations for this widely used and popular image coding format.

Final Quote

“The JPEG Committee is hopeful that its recently launched Next Generation Image Coding activity, JPEG XL, will result in a format that becomes as important for imaging products and services as its predecessor was: the widely used and popular legacy JPEG format, which has been in service for a quarter of a century,” said Prof. Touradj Ebrahimi, the Convenor of the JPEG Committee.

About JPEG

The Joint Photographic Experts Group (JPEG) is a Working Group of ISO/IEC, the International Organisation for Standardization / International Electrotechnical Commission, (ISO/IEC JTC 1/SC 29/WG 1) and of the International Telecommunication Union (ITU-T SG16), responsible for the popular JBIG, JPEG, JPEG 2000, JPEG XR, JPSearch and more recently, the JPEG XT, JPEG XS, JPEG Systems and JPEG Pleno families of imaging standards.

The JPEG Committee nominally meets four times a year, in different world locations. The 79th JPEG Meeting was held on 9-15 April 2018, in La Jolla, California, USA. The next 80th JPEG Meeting will be held on 7-13, July 2018, in Berlin, Germany.

More information about JPEG and its work is available at www.jpeg.org or by contacting Antonio Pinheiro or Frederik Temmermans (pr@jpeg.org) of the JPEG Communication Subgroup.

If you would like to stay posted on JPEG activities, please subscribe to the jpeg-news mailing list on http://jpeg-news-list.jpeg.org.  


Future JPEG meetings are planned as follows:

  • No 80, Berlin, Germany, July 7 to 13, 2018
  • No 81, Vancouver, Canada, October 13 to 19, 2018
  • No 82, Lisbon, Portugal, January 19 to 25, 2019