Opinion: Multidisciplinary Column

Goodbye Multidisciplinary Column!

By Jochen Huber | July 21, 2023 - 08:54 |September 26, 2023 0323, Feature, Opinion: Multidisciplinary Column

In June 2017, we were invited to serve as editors of the newly established column on multidisciplinary aspects of Multimedia. Our major goal back then was to look beyond ‘the big pond’, that is, beyond multimedia research itself.

We set out to establish a multidisciplinary dialogue within the multimedia community, and to raise awareness of and underline the mutual benefits between neighbouring disciplines. To this end, we chose various formats: interviews with peers whose work sits at the intersection of disciplines, and opinion-based articles on multidisciplinary aspects of multimedia, including community and conference spotlights.

We look back at a rich volume of articles. Over the past 5 years, we gave voice to 6 peers who work at the intersection of multimedia and other disciplines, e.g. accessibility, musical interfaces, digital naturalism or security and privacy. One recurring theme amongst those interviewed is that they draw upon a variety of disciplines in their daily work. Or, as Andy Quitmeyer put it, his “work is anti-disciplinary. Instead of relying on [a] specific field of practice, the work simply sets out towards some basic goals and happily uses any means necessary to get there. Currently, this includes a blend of naturalistic experimentation, performance art, filmmaking, interaction design, software and hardware engineering, industrial design, ergonomics illustration, and storytelling.” We also discussed grand research challenges for our community. In retrospect, these were again mostly interdisciplinary, e.g. ‘universal design’, ‘generative everything’, the blending of real and virtual worlds through AI-powered multimedia and ‘reproducibility, openness and accessibility of research and communities’. In this context, Odette Scharenborg also emphasized the importance of being a visible role model and of ensuring that a diverse user audience is accounted for, as exemplified in her work on “making speech technology available for everyone, irrespective of how one speaks and what language one speaks”.

As for our opinion-based articles, we both highlighted communities that actively fostered interdisciplinarity (assistive augmentation and music information retrieval). Next to this, we shared further examples of ways to inclusively teach and design, as well as establish communities and reach more diverse audiences. Here, we often gave examples in which more established infrastructures in the academic community (such as conferences and workshops) could be combined with lesser-trodden paths of outreach.

Making connections between disciplines takes energy and commitment, which often need to be invested alongside other duties and services. Lately, we realized that neither of us has the necessary capacity anymore to continue this series of columns. While this last piece marks the end of this column, for now, we are positive that it gave a stage to multidisciplinary dialogues within our community and served as an inspiration to it. Given the grand research challenges our fields of research face, we speculate that inter- and multidisciplinary work will remain key to addressing those challenges: there is ‘multi’ in multimedia.

About the Column

The Multidisciplinary Column is edited by Cynthia C. S. Liem and Jochen Huber.

Editor Biographies

Dr. Cynthia C. S. Liem is an Assistant Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. She initiated and co-coordinated the European research project PHENICX (2013-2016), focusing on technological enrichment of symphonic concert recordings with partners such as the Royal Concertgebouw Orchestra. Her research interests consider music and multimedia search and recommendation, and increasingly shift towards making people discover new interests and content which would not trivially be retrieved. Beyond her academic activities, Cynthia gained industrial experience at Bell Labs Netherlands, Philips Research and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships, the Google European Doctoral Fellowship 2010 in Multimedia, and a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach.

Dr. Jochen Huber is Professor of Computer Science at Furtwangen University, Germany. Previously, he was a Senior User Experience Researcher with Synaptics and an SUTD-MIT postdoctoral fellow in the Fluid Interfaces Group at MIT Media Lab and the Augmented Human Lab at Singapore University of Technology and Design. He holds a Ph.D. in Computer Science and degrees in both Mathematics (Dipl.-Math.) and Computer Science (Dipl.-Inform.), all from Technische Universität Darmstadt, Germany. Jochen’s work is situated at the intersection of Human-Computer Interaction and Human Augmentation. He designs, implements and studies novel input technology in the areas of mobile, tangible & non-visual interaction, automotive UX and assistive augmentation. He has co-authored over 60 academic publications and regularly serves as program committee member in premier HCI and multimedia conferences. He was program co-chair of ACM TVX 2016 and Augmented Human 2015 and chaired tracks of ACM Multimedia, ACM Creativity and Cognition and ACM International Conference on Interface Surfaces and Spaces, as well as numerous workshops at ACM CHI and IUI. Further information can be found on his personal homepage: http://jochenhuber.com

Multidisciplinary Column: Lessons Learned from a Multidisciplinary Hands-on Course on Interfaces for Inclusive Music Making

By Jochen Huber | July 25, 2022 - 11:49 |September 1, 2023 0222, Feature, Opinion: Multidisciplinary Column

This short article reports on lessons learned from a multidisciplinary hands-on course that I co-taught in the academic winter term 2021/2022. Over the course of the term, I co-advised a group of 4 students who explored designing interfaces for Musiklusion [1], a project focused on inclusive music making using digital tools. Inclusive participation in music making processes is a topic at home in the Multimedia community, as well as in many neighbouring disciplines (see e.g. [2,3]). In the following, I briefly detail the curriculum, describe project Musiklusion, outline challenges and report on the course outcome. I conclude by summarizing a set of personal, albeit anecdotal, observations from the course that could be helpful for fellow teachers who wish to design a hands-on course with inclusive design sessions.

When I rejoined academia in 2020, I got the unique opportunity to take part in teaching activities pertaining to, among other things, human-centered multimedia within a master’s curriculum on Human Factors at Furtwangen University. Within this 2-year master’s programme, one of the major mandatory courses is a 4-month hands-on course on Human Factors Design. I co-teach this course jointly with 3 other colleagues from my department. We expose students to multi-disciplinary research questions which they must investigate empirically in groups of 4-6. They have to come up with tangible results, e.g. a prototype or qualitative and quantitative data as empirical evidence.

Last term, each of us instructors advised one group of students. Each group was also assigned an external partner to help ground the work and embed it into a real-world use case. The group of students I had the pleasure to work with partnered with Musiklusion’s project team. Musiklusion is an inclusive project focused on accessible music making with digital tools for people with so-called disabilities. They work and make music alongside people without any disabilities. These disabilities include, for example, cognitive disabilities and impairments of motor skills, with conditions that continue to progress. Movements, gestures and, eventually, tasks that can be performed today (e.g. being able to move one’s upper body) cannot be taken for granted in the future. Thus, as an overarching research agenda for the course project, the group of students explored the design and implementation of digital interfaces that enable people with cognitive and/or motor impairments to actively participate in music making processes and possibly sustain their participation in the long run depending on their physical abilities.

Figure 1. Current line-up of instruments of Project Musiklusion (source: Musiklusion feature with Tabea Booz & Sharon)

Project Musiklusion is spearheaded by musician and designer Andreas Brand [4], partnering with Lebenshilfe Tuttlingen [5]. The German Lebenshilfe is a nation-wide charitable association for people with so-called disabilities. Musiklusion’s project team makes two salient contributions: (i) orchestrating off-the-shelf instruments such that they are “programmable” and (ii) designing, developing and implementing digital interfaces that enable people with so-called disabilities to make music using said instruments. The project’s current line-up of instruments (cf. Figure 1) comprises a Disklavier with a MIDI port and an enhanced drum set with drivers and mechanical actuators [6]. Both instruments can be controlled using Max/MSP through OSC. Hence, tools like TouchOSC [7] can be leveraged to design 2D widget-based graphical user interfaces to control each instrument. While a musician with impaired motor skills in the upper body might not be able to play individual notes using a touch interface or the actual Disklavier, for instance, digital interfaces and widgets can be used to vary e.g. the pitch or pace of themes.
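To make the OSC-based control path more concrete, the following is a minimal sketch of how a widget value could be turned into an OSC message. It is not the project's actual implementation: the address /disklavier/tempo, the tempo range, and the host and port are hypothetical placeholders, and the encoding follows the OSC 1.0 rules for a message with a single float argument.

```python
import socket
import struct


def osc_pad(data: bytes) -> bytes:
    """Null-terminate a byte string and pad it to a multiple of 4 bytes (OSC 1.0 string rule)."""
    data += b"\x00"
    return data + b"\x00" * (-len(data) % 4)


def osc_message(address: str, value: float) -> bytes:
    """Encode an OSC message carrying a single 32-bit float argument."""
    return (
        osc_pad(address.encode("ascii"))  # address pattern
        + osc_pad(b",f")                  # type tag string: one float
        + struct.pack(">f", value)        # big-endian float32 payload
    )


def slider_to_tempo(position: float, low: float = 0.5, high: float = 2.0) -> float:
    """Map a slider position in [0, 1] to a tempo factor in [low, high] (hypothetical range)."""
    position = min(max(position, 0.0), 1.0)  # clamp out-of-range widget values
    return low + position * (high - low)


def send_osc(packet: bytes, host: str = "127.0.0.1", port: int = 8000) -> None:
    """Send the packet over UDP; host and port are placeholders, not the project's setup."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(packet, (host, port))


# Hypothetical address; a real Max/MSP patch defines its own address space.
packet = osc_message("/disklavier/tempo", slider_to_tempo(0.5))
```

Calling send_osc(packet) would deliver the message to whatever OSC receiver listens on the chosen port. Keeping the mapping from widget position to musical parameter in its own function means the same message format could be fed by a touch slider or by any other input modality.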

With sustainable use of the above instruments in mind, the group of students aimed to explore alternative input modalities that could be used redundantly depending on a musician’s motor skills. They conducted weekly sessions with project members of Musiklusion over the course of about 2.5 months. Most of the project members use a motorized wheelchair and have limited upper body movement. Each session ran from 1 to 3 hours, depending on the availability of project members, and typically 2-5 members were present. The sessions took place at Lebenshilfe Tuttlingen, where the instruments were based and used on a daily basis. Based on in-situ observations and conversations, the group of students derived requirements and user needs to inform interface designs. They also led weekly co-design sessions where they prototyped both interfaces and interactions and tried them out with project members. Reporting on the actual iterative design sessions, the employed methodology (cf. [8,9]) and the data gathered is beyond the scope of this short article and should be presented at a proper venue focusing on human-centred multimedia. Yet, to provide a glimpse of the results: the group of students came up with a set of 4 different interfaces that cater to individual abilities and can be used redundantly with both the Disklavier and the drum kit. They designed (a) body-based interactions that can be employed while sitting in a motorized wheelchair, (b) motion-based interactions that leverage accelerometer and gyroscope data of e.g. a mobile phone held in hand or strapped to an upper arm, (c) an interface that leverages facial expressions, relying on face tracking, and (d) an eye-tracking interface that leverages eye movement for interaction. At the end of the course, and amidst the COVID-19 pandemic, these interfaces were used to enable the Musiklusion project members to team up with artists and singers Tabea Booz and Sharon to produce a music video remotely.
The music video is available at https://www.youtube.com/watch?v=RYaTEYiaSDo and showcases the interfaces in actual productive use.

In the following, I enumerate personal lessons learned as an advisor and course instructor. Although these observations only stem from a single term and a single group of students, I still find them worthwhile to share with the community.

Grounding of the course topic is key. Teaming up with an external partner who provides a real-world use case had a tremendous impact on how the project went. The course could also have taken place without involving Musiklusion’s project members and actual instruments, but designs and implementations would then have suffered from low external validity. Furthermore, this would have made conducting co-design sessions impossible.
Project work must be meaningful and possibly impactful. The real-world grounding of the project work, and therefore also the pressure to deliver progress to Musiklusion’s project members, kept students extrinsically motivated. Beyond that, I observed students being highly engaged and going above and beyond to deliver constantly improved prototypes. From conversations I had, I felt that both the meaningfulness of their work and the impact they had motivated them intrinsically.
Course specifications should be tailored to the skills that course members are interested in acquiring. It might seem obvious (cf. [10]), but this course made me realize again how important it is to cater to students’ interest in acquiring new skills and to match that interest to the course specifications. The outcome of this project would have been entirely different if students had not been interested in learning how to build, deliver and test-drive prototypes iteratively at a high pace. This certainly also served as an additional intrinsic motivation.

In conclusion, teaching this course was a unique experience for me, as well as for the students involved in the course work. It was certainly not the first hands-on course I had taught, and hands-on course work is at home in many HCI curricula across the globe. But I hope that this anecdotal report further inspires fellow teachers to partner with (charitable) organizations to co-teach modules and have them sponsor real-world use cases that motivate students both extrinsically and intrinsically.

Acknowledgements

I want to extend special thanks to participating students Selina Layer, Laura Moosmann, Marvin Shopp and Tobias Wirth, as well as Andreas Brand, Musiklusion project members and Lebenshilfe Tuttlingen.

References

[1] Musiklusion Project Webpage. https://www.musiklusion.de. Last accessed: June 28, 2022.

[2] Hornof A, Sato L. (2004). EyeMusic: making music with the eyes. In: Proceedings of the 2004 conference on New interfaces for musical expression, pp 185–188.

[3] Petry, B., Illandara, T., & Nanayakkara, S. (2016, November). MuSS-bits: sensor-display blocks for deaf people to explore musical sounds. In Proceedings of the 28th Australian Conference on Computer-Human Interaction (pp. 72-80).

[4] Personal webpage of Andreas Brand. https://andybrand.de. Last accessed: June 28, 2022.

[5] Lebenshilfe Tuttlingen. https://lebenshilfe-tuttlingen.de. Last accessed: June 28, 2022.

[6] Musiklusion Drum Set. https://www.musiklusion.de/musiklusion-schlagzeug/. Last accessed: June 28, 2022.

[7] TouchOSC. https://hexler.net/touchosc. Last accessed: June 28, 2022.

[8] Veytizou J, Magnier C, Villeneuve F, Thomann G. (2012). Integrating the human factors characterization of disabled users in a design method. Application to an interface for playing acoustic music. Association for the Advancement of Modelling and Simulation Techniques in Enterprises 73:173.

[9] Gehlhaar R, Rodrigues PM, Girão LM, Penha R. (2014). Instruments for everyone: Designing new means of musical expression for disabled creators. In: Technologies of inclusive well-being. Springer, pp 167–196.

[10] Eng, N. (2017). Teaching college: The ultimate guide to lecturing, presenting, and engaging students.

About the Column

The Multidisciplinary Column is edited by Cynthia C. S. Liem and Jochen Huber. Every other edition, we will feature an interview with a researcher performing multidisciplinary work, or a column of our own hand. For this edition, we feature a column by Jochen Huber.

Multidisciplinary Column: An Interview with Odette Scharenborg

By Cynthia Liem | January 25, 2022 - 21:27 |February 1, 2022 0421, Feature, Opinion: Multidisciplinary Column

Odette, could you tell us a bit about your background, and what the road to your current position was?

Dr Odette Scharenborg, Associate professor and Delft Technology Fellow, SpeechLab/Multimedia Computing Group, Delft University of Technology

In high school, I enjoyed both languages and science topics such as physics, chemistry and biology. When researching what I wanted to study, I came across “Language, Speech, and Computer Science” at Radboud University, Nijmegen, the Netherlands, which sounded like, and indeed was, an interesting combination of both languages and science topics. Probably inspired by one of my favourite TV series when I was younger, Knight Rider, which included a car with which you could communicate through speech, I focused on speech technology from early on.

After obtaining my university degree in 2000, I was offered a PhD position in the same department where I had pursued my studies, on another interdisciplinary topic: computational modelling of human speech processing. My PhD project (2001-2005) combined theories about human speech processing (psycholinguistics) and tools and approaches from automatic speech recognition (which itself is more or less at the cross-roads of electrical engineering and computer science) in order to learn more about how humans process speech and improve automatic speech recognition (i.e., the conversion of speech into text).

After obtaining my PhD (in 2005), I went to the Speech and Hearing group in the Department of Computer Science at the University of Sheffield, UK, for a visiting post-doc position (funded by a Dutch Science Foundation (NWO) Talent Scholarship). I then returned to Radboud University for a 3-year post-doc position (funded by an NWO Veni personal fellowship) on a new project on computational modelling of human speech processing. After this project, I felt that, having read so much about the theories of how humans process speech, I really wanted to know how researchers actually arrived at these theories. So, in the next few years, my research focused on human speech processing: first at the Max Planck Institute for Psycholinguistics, where I was trained as a psycholinguist, and subsequently, funded by an NWO Vidi personal grant, again at Radboud University, where I became Associate Professor.

Towards the end of my Vidi-project (in 2016), I started to miss the computer science component of my earlier research and decided to try to move back into automatic speech recognition. I had an idea, met two amazing speech researchers who loved my idea, and we decided to collaborate. This collaboration (still ongoing) has allowed me to move back into the field of automatic speech recognition that at that time was rapidly changing due to the rise of deep learning.

In 2018, my Vidi project and contract at Radboud University ended, and I became unemployed. I was then headhunted by a company on automatic speech recognition for health applications. However, I felt that I wanted to stay in academia. Luckily for me, shortly after joining the company, Delft University of Technology offered me a Delft Technology Fellowship, and I joined TU Delft in June 2018, where I’ve since then worked as an Associate Professor of Speech Technology.

How important is interdisciplinarity in your research on speech?

As is probably clear from my road so far, I am an interdisciplinary researcher. The field of automatic speech recognition is already interdisciplinary in that it combines electrical engineering and computer science. However, in my research, I use my knowledge about sounds and sound structures (i.e., phonetics, a subfield of linguistics) and am inspired by and use knowledge about how humans process speech (i.e., psycholinguistics). The speech signal is a signal that can be researched and viewed from different angles: from the perspective of frequencies (physics), the perspective of the individual sounds (phonetics), meaning (semantics), as a means to convey a message or intent, etc. It also contains different types of information: the words of the message, information about the speaker’s identity, age, gender, height, health status, emotional status, native language, to name only a few.

The focus of my research is on automatic speech recognition. Automatic speech recognisers typically work well for “standard” speakers of a small number of languages. In fact, for only about 2% of all the languages in the world, there is enough annotated speech data to build automatic speech recognisers. Moreover, a large portion of society does not speak in a “standard” way: “standard” speakers are native speakers of a language, without a speech impediment, without a strong regional accent, typically highly educated, and between the ages of 18 and 60 years. As you can tell, this excludes a large portion of our society: children, elderly, people with speaking or voice disorders, deaf people, immigrants, etc. In my work, I focus on making speech technology, and particularly automatic speech recognition, available for everyone, irrespective of how one speaks and the language one speaks. In order to do so, I look at how humans process speech as they are the best speech recognisers that exist; moreover, they can quickly adapt to idiosyncrasies in a speaker’s speech or voice. Moreover, I use knowledge about how sounds and the voice sound differently depending on, for instance, the speaker’s age or health status. So, in my research towards inclusive speech technology, I combine computer science with linguistics and psycholinguistics. Interdisciplinarity is thus at the core of my research.

What disciplines do you combine yourself in your own work?

As explained above, in my research I combine multiple research fields, most notably: computer science, different subfields of psycholinguistics (first and second language learning, native and non-native speech processing; the processing of emotions) and linguistics (primarily phonetics and a bit of conversational analysis).

Could you name a grand research challenge in your current field of work?

There are several grand research challenges in my field:

I already named one: making speech technology available for everyone, irrespective of how one speaks and what language one speaks. One of the grand challenges for this is to build speech technology for speech that is not only highly variable but for which also only a small amount of data is available (i.e., low resource scenarios).
A second grand challenge: when people speak, they often use words or phrases from another language; this is called code-switching. Automatic speech recognisers are typically built for one language; it is very hard for them to deal with code-switched speech.
A third grand challenge: speech is often produced with background noise or background speech present. This deteriorates recognition performance tremendously. Dealing with all the different types of background noise and speech is another grand challenge.

You have been an active champion for diversity and inclusion. Could you tell us a bit more about your activities on these topics?

When I was growing academically, I did not really have a female role model, and especially not female role models who had children. When I was in my late twenties/early thirties, I found this hard because I was afraid that having children would negatively impact my chances for the next academic job and my academic career in general. Also, being not only a first-generation PhD but also a first-generation academic, it took me a really long time to realise that there were unwritten rules, and to learn what these were and how to deal with them (not sure I now know all 😉). Then, when I became Associate Professor at Radboud University, I found that several students, male and female, regularly came to talk to me about personal and academic issues, that they found my advice useful, and that I found it interesting and motivating to talk to them. I wanted to do more regarding gender equality but didn’t know how.

Then in 2016, a group of senior female speech researchers together organised the Young Female Researchers in Speech Science and Technology Workshop, in conjunction with Interspeech, the flagship conference of the International Speech Communication Association (ISCA), in order to attract more female students into a speech PhD program. I was invited as a mentor. This workshop was highly successful and is now a yearly workshop in conjunction with Interspeech. I joined the organisation of this workshop for 3 years. Then in 2019, having advocated gender equality in the ISCA board, of which I’ve been a member since 2017, I was asked to form a new committee: the committee for gender equality. Very quickly this committee started to focus on more than gender and to look at other types of diversity, such as sexual orientation, research areas (ISCA encompasses several speech sub-areas, including phonetics, psycholinguistics, health, automatic speech recognition, speech generation, etc.), and geographical regions. Naturally, we not only wanted to attract people from diverse backgrounds but also wanted to retain them, so we also started to look into inclusion. The first thing our committee did was to create a website where female speech researchers who hold a PhD can list themselves. This website is used to help workshop/conference organisers to find female researchers for the organising committee, as panellists and keynote/invited speakers, etc. We then went on to organise diversity and inclusion meetings at Interspeech, and for 2 years we have organised a separate ISCA-queer meeting. We have held a workshop in Africa (remotely due to the pandemic) in order to reach local speech researchers there and see where we can collaborate and where we can help them with our resources and expertise. We wrote a code of conduct for session chairs at workshops/conferences in order for them to know how to balance questions from people from minority groups and non-minority groups. These are but a few of our activities.

In 2020 I came up with the idea for a mentoring programme within the IEEE Signal Processing Society (SPS), for students from minority groups, which was well received and was funded with $50K annually. This programme, loosely based on the YFRSW-format, provides students with a mentor from our society who will supervise them for a period of 9 months, and who will mentor them and help them build a network. Each student receives $4K to visit one of the IEEE (SPS) conferences/workshops. In the first round, we awarded 9 students from all over the world.

In addition to these activities, I’ve also been on the board of the Delft Women in Science (DEWIS) at my university and the chair of the Diversity and Inclusion Committee (EDIT) at my faculty at TU Delft. Additionally, I am regularly asked to appear as a female role model in STEM for young girls and in Dutch media.

In getting to your current position, you experienced some personal hardships. In serving as a public role model, you have been open about these. How can we learn from these experiences to make academia a better place?

My CV shows the (many) consecutive positions I’ve had and how almost all are financed by personal grants that I obtained. These personal grants especially tend to attract a lot of praise. What my CV doesn’t show is the story behind it. It doesn’t show the many job applications I sent out, which never led to a position. It doesn’t show that for a period of more than 2 years I did not have a contract, meaning that I did not have any social security, while I was working on a post-doc position. It does not show how I was bullied at my previous university and the damage that did to my self-esteem, something I still struggle with. It does not show that I had to leave behind my 10-month-old daughter for a month and again for 2 weeks because I was expected to be in Germany for a post-doc position, nor does it show the two bouts of (mild) depression I suffered (one directly related to the bullying). I never talked about all of this because as a temporary (and young and female) researcher, you feel extremely vulnerable because you are so dependent on (the goodwill of) other, more senior researchers. If you don’t want to or cannot do a task, if you complain, they will simply find someone else and you are without a job again. On top of that, you often simply are not believed.

When I became a mentor for students and young researchers, I decided to share some of my struggles so that they knew that they were not the only ones who struggled and that I knew what they were going through. I began to receive feedback from these students that they appreciated my honesty and openness, which gave me the courage to be more open about my own issues. However, only after I received my permanent position at TU Delft (in 2019), and after becoming active in diversity and inclusion, did I very slowly dare to speak more openly to my colleagues and senior people about what had happened.

In late 2019, I was asked to talk about what it is like to be a female researcher in speech technology at the IEEE Workshop on Automatic Speech Recognition and Understanding. I thought about the story I wanted to tell, and eventually, I decided to tell my colleagues, including many of my close friends, my story: I started by showing them my CV, which received a lot of appreciative nods. I then told the story of my life, the story that is not shown by my CV, including the hardships. This resulted in many of my male colleagues and friends crying. Of course, this was never my intention. I don’t think that my story is that much different from that of the average person from a minority group, and probably there are quite a few men whose stories are worse than mine. What I wanted to say was: CVs might look great, or they might not. It is important not to take CVs or facts or numbers at face value; you don’t know what people go through or have done to get where they are. Everyone has a story to tell; but it is, unfortunately, the case that the bad stories far more often happen to women and other people from minority groups.

A third message or piece of advice is that if you go through a hard time, know that you are not alone. In life in general, and in academia particularly, we celebrate successes, but failures and hardships are ignored and are often considered a weakness. I strongly believe that by being open about your hardships, you will feel better yourself, and will help others deal with theirs.

Finally, we need to see fellow academics as people and treat them as people. We need to be supportive of one another, especially of our younger colleagues and of those from minority groups. We should be mentors and role models. We should listen to what they are saying and believe what they are saying. Not question what they say, but believe them when they describe something bad that has happened to them and help, because daring to speak up takes an enormous amount of strength and courage. If one dares to speak up, believe that it is true and tell them that you know how courageous they have to be to speak up.

How and in what form do you feel we as academics can be most impactful?

As academics we have many responsibilities: we teach the younger generation, we investigate and develop new technology and theories. Some of our research has a direct impact on society, some research does not yet, some research will maybe never have a direct impact on society. I don’t believe that all research needs to have an impact. I do believe that we as academics can be impactful, and that is by explaining science to the general audience. What is science? Why don’t scientists have answers to all questions? Why is what you do important? By explaining one’s research in layman’s terms, science and scientific output will become easier to understand for non-scientists. It will help shape public debate. It will lead to scientific results not being as easily dismissed as nowadays often happens. At the same time, and at least as important: by talking to people from the general public, you as an academic will see the world through their eyes, look at the impact of your work in a different way, and I am convinced it will also often lead to the explanation of why a certain development or technology is not adopted by society at large or by a particular group in society. In short, academics can be most impactful by communicating with the general public, and communication is and thus should be a two-directional process.

Bios

Dr Odette Scharenborg is an Associate Professor and Delft Technology Fellow at the Multimedia Computing Group at the Delft University of Technology, the Netherlands, and the Vice-President of the International Speech Communication Association (ISCA). Her research focuses on human speech-processing inspired automatic speech processing with the aim to develop inclusive speech technology, i.e., speech technology that works for everyone irrespective of how they speak or the language they speak.

Since 2017, Odette has been on the Board of ISCA, where she is also the chair of the Diversity committee (since 2019) and was co-chair of the Interspeech Conferences committee and of the Technical Committee (2017-2019). From 2018-2021, Odette was a member of the IEEE Speech and Language Processing Technical Committee (subarea Speech Production and Perception). From 2019-2021, she was an Associate Editor of IEEE Signal Processing Letters, where she now is a Senior Associate Editor.

Editor Biographies

Dr Cynthia C. S. Liem is an Associate Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. Her research interests focus on making people discover new interests and content which would not trivially be retrieved in music and multimedia collections, assessing questions of validation and validity in data science, and fostering trustworthy and responsible AI applications when human-interpreted data is involved. She initiated and co-coordinated the European research projects PHENICX (2013-2016) and TROMPA (2018-2021), focusing on technological enrichment of digital musical heritage, and participated as technical partner in an ERASMUS+ education innovation project on Big Data for Psychological Assessment. She gained industrial experience at Bell Labs Netherlands, Philips Research and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships, the Google European Doctoral Fellowship 2010 in Multimedia, a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach, Researcher-in-Residence 2018 at the National Library of The Netherlands, general chair of the ISMIR 2019 conference, and keynote speaker at the RecSys 2021 conference. Presently, she co-leads the Future Libraries Lab with the National Library of The Netherlands, is track leader of the Trustworthy AI track in the AI for Fintech lab with the ING bank, holds a TU Delft Education Fellowship on Responsible AI teaching, and is a member of the Dutch Young Academy.

Dr Jochen Huber is Professor of Computer Science at Furtwangen University, Germany. Previously, he was a Senior User Experience Researcher with Synaptics and an SUTD-MIT postdoctoral fellow in the Fluid Interfaces Group at MIT Media Lab and the Augmented Human Lab at Singapore University of Technology and Design. He holds a Ph.D. in Computer Science and degrees in both Mathematics (Dipl.-Math.) and Computer Science (Dipl.-Inform.), all from Technische Universität Darmstadt, Germany. Jochen’s work is situated at the intersection of Human-Computer Interaction and Human Augmentation. He designs, implements and studies novel input technology in the areas of mobile, tangible & non-visual interaction, automotive UX and assistive augmentation. He has co-authored over 60 academic publications and regularly serves as program committee member in premier HCI and multimedia conferences. He was program co-chair of ACM TVX 2016 and Augmented Human 2015 and chaired tracks of ACM Multimedia, ACM Creativity and Cognition and ACM International Conference on Interface Surfaces and Spaces, as well as numerous workshops at ACM CHI and IUI. Further information can be found on his personal homepage: http://jochenhuber.com

Multidisciplinary column: the importance of talking to a 12-year-old

By Cynthia Liem | August 17, 2021 - 09:00 |October 18, 2021 0321, Feature, Opinion: Multidisciplinary Column

In 2018, while on a research visit to Bordeaux, I felt it would be good to connect more closely to the local community. As a consequence, colleagues convinced me to join the Femmes & Sciences movement, in which women researchers in STEM proactively did local outreach.

My French was conversational, though not stellar. But I thought it hopefully should be good enough to converse with young teenagers. Furthermore, as for ‘local community’, it would be a nice idea to both get to know colleagues and the culture of the local schools. So there I went, speaking at a countryside school in one of the many wine regions, and at a secondary school in Bordeaux where students would not trivially think of STEM university careers.

It was an amazing and enlightening experience. As soon as I started to talk about search engines, recommender systems, music and video services, a spark really ignited in the students. They knew these, and they used them daily!

But it was only because I mentioned it that they started realizing there was computer science technology behind all these services. Before, they had no clue.

And I think this is a real problem, one that we as a community severely underestimate.

In my own family, my father (electrical engineering), sister (civil engineering & geomatics) and I (computer science) studied to become engineers. For the rest of my family, this meant we were ‘the technical people’, getting called in when computers were slow, cell phones were updated and printers started malfunctioning. This especially happened to my father and me, as we ‘were good with computers, since that was our profession’.

But I had not studied to fix printers. And, as I joked during university open days to prospective students, my sister never got asked to go fix the kitchen sink, even though she had been taught about water management.

It always has been striking to me how malfunctioning hardware and software were the first associations that laypeople outside of our field seemed to have with our work. Today, this is broadening to fears of hacking, and on the less negative side, (overblown?) hopes in AI and cryptocurrency. In all these cases, the technology is something alien, something that ‘normal’ humans do not understand and grasp well.

Yet at the same time, the technologies we build affect everyone’s lives, increasingly so. Frequently, they silently work in the back, and we indeed only visibly notice them if something goes wrong. But then, rather strange associations and dialogues emerge.

Recently, I became a member of the national Young Academy, a body of earlier-career faculty across disciplines in The Netherlands, playing a public opinion-making role on academic culture, the image of academia and its findings, and associated policy-making. Through this role, and with my background in search and recommendation, I am increasingly being invited into committees, workshops and other forms of public appearances, that involve policy-makers and laypeople concerned with the impact of AI technologies (especially: possible exclusion of humans, as a consequence of the use of AI technologies).

In these activities, it has again been striking to me how little common vocabulary is present, and how questions thus get formulated awkwardly. More than once, I have been asked ‘what the algorithm exactly is doing’, when my discussion partners actually refer to broader decision-making processes, where problems may occur across the pipeline, even before any algorithm is deployed.

When I try to explain that many of the applications of interest focus on prioritization with a cutoff within a larger collection, and I ask how my discussion partners would prioritize, I get blank stares if I keep this story at the general, abstract level that would come naturally to me as a computer scientist. If I’m unlucky, I may even get an answer back that my discussion partners don’t want to take a stance themselves, as it is a ‘difficult and subjective’ matter, but ‘surely AI can do this better than we humans?’. Now that becomes a problem if we frame the task in a supervised learning setup, without a sense of solid ground truth or criteria to optimize for.

However, going through simple, concrete examples ‘close to home’ does seem to help. Here, I really benefited from the experience I had gained in Bordeaux and beyond, especially in setups where I had to work with children.

Try to explain concepts of information retrieval and data modelling in a non-native tongue to a 12-year-old, and you are forced to ask simple questions that will give insight into these children’s own world views and contexts. It will give them building blocks they recognize and can build on.

Working in music and multimedia has greatly helped me here; as said before, everyone is a heavy daily user of music and multimedia services, and thus (without explicitly knowing) actually has some world view ready on preferences, priorities and ways to navigate larger information collections. This will greatly help as a discussion starter, with the discussion elements remaining tangible for everyone.

I would argue that working on a better public understanding of our work is among the most societally impactful roles that we, as researchers in the field, can play. Our discussion partners are stakeholders who don’t realize they are stakeholders. And of course, in the case of children, they may at the same time be the future technologists who will build on our work.

It takes serious time investment and a lot of practice to get this right. I have always been puzzled that this typically means it is considered too much of a time sink, and not our prime responsibility as academics. But who else would take this up?

And if I think of how much time I have been encouraged to sink into endlessly rewriting grant proposals or papers at the micro-level, just to hopefully please reviewers, something does not feel right. Any acceptances following this have arguably been good for my career. But I am not quite convinced this has been a more meaningful use of the public money my contract is funded from.

Or, in a more positive interpretation: in our community, we actually care about communicating well, and are clearly willing to invest in it. But so far, we really have been focusing our attention inward, while there is a lot to gain when we’d rather look outward.

So for those who would be interested in engaging more with those outside of our field: please do. Outreach is much more than cute PR. And with the applications that we work on being so close to people’s daily lives, we in music/multimedia hold some very important keys, and really should learn the perspectives of our end users.

So let’s use those keys, and finally, get some doors opened that have remained shut for too long.

Editor Biographies

Dr. Cynthia C. S. Liem is an Associate Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. Her research interests focus on making people discover new interests and content which would not trivially be retrieved, and assessing questions of validation and validity, especially in the context of music and multimedia search and recommendation. She initiated and co-coordinated the European research projects PHENICX (2013-2016) and TROMPA (2018-2021), focusing on technological enrichment of digital musical heritage, and gained industrial experience at Bell Labs Netherlands, Philips Research and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships, the Google European Doctoral Fellowship 2010 in Multimedia, a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach, and is a member of the Dutch national Young Academy.

Multidisciplinary Column: An Interview with Alex Thayer

By Jochen Huber | March 22, 2021 - 13:36 |April 15, 2021 0121, Feature, Opinion: Multidisciplinary Column

Alex, could you tell us a bit about your background, and what the road to your current position was?

Alex Thayer, PhD. Head of Research, Amazon (Search); Affiliate Assistant Professor, University of Washington

Sure! I began my career in the tech industry in 1998, when I interned at the IBM Silicon Valley Lab in San Jose, California. Back then it was called the Santa Teresa Lab, and I completed a year-long internship because I wanted to get a richer professional experience than a single school quarter would provide. I also wanted to find an internship at a company that future employers would recognize when they saw my resume.

At the time, I thought about my career as a narrative that would span decades: What story would I want to tell about my employment history 20 or 30 years later? In a sense, each job would become a “chapter” in that story. As I have learned over the years, this metaphor holds up and each chapter has a slightly different theme: from drama to comedy to Greek tragedy. After about 13 different tech industry jobs, I think I’ve got a lot of genres covered.

After the year at IBM, I returned to Seattle and spent another year completing my degrees in Technical Communication (College of Engineering) and Art History (College of Art). After graduation, I focused on building my career as a technical writer. I worked at a voice recognition startup, then at a consulting firm, and I wound up doing a lot of “UX work” that was not quite codified into specific roles yet. For example, in a typical week I might work on the design of a UI component, rewrite the JavaScript for a website, change the physical layout of a printed user manual, and write copy for a tutorial. I went back to the University of Washington in 2002 to get a Master of Science degree in the Technical Communication program, and to try teaching courses at the college level.

Eventually I began working full-time at Microsoft in 2006. It was during my time there when I realized technical writing was not my passion. I decided to “adjust my career narrative” and shift toward UX design and research. I was able to make that happen partly because I worked on a cross-disciplinary team at Microsoft: We had interaction design, industrial design, user research, and content publishing included in the same team. I worked on software and hardware projects in a variety of capacities. For one project, I helped design the physical product packaging; on another project, I collaborated with my teammates on the vision for an adaptive keyboard.

Eventually I hit the limits of what I could do professionally without returning to school and advancing my knowledge about people and their practices. I returned to the University of Washington and spent 4 years working on my PhD in Human Centered Design & Engineering. I moved with my family to the Bay Area in California near the conclusion of my PhD work, and I looked for a role with a focus on emerging technology and interfaces. I found that role at Intel, where I stayed for a year and a half before shifting to a very different research role at VMware. When an opportunity to work at HP Labs arose, I decided to make another career move after a year and a half. It was never my intention to work for different companies so quickly, but I thought about the career narrative perspective and the story I wanted to tell. That perspective helped me make my decision to change roles and work at HP.

What is the professional role of interdisciplinarity in your experience?

Because I have an interdisciplinary skill set, I have discovered that it can be tricky to find a job! As a “T-shaped” person, it’s not always easy to know how to bring my full set of skills to a specific role or organization. In my experience, companies are looking for experts who can go deep in a particular area, but who can also span a variety of topics and skills as needed. In practice, this means collaborating with colleagues who have an assortment of technical backgrounds and methodologies. In a typical week at my current role, I engage with product managers, designers, design technologists, business leaders, engineers, economists, and scientists. All of these roles have different requirements and dialects, which means I am constantly surrounded by “interdisciplinarity,” if that makes sense!

Also, because of my academic research focus on how people collaborate, it’s hard for me to imagine a world without “interdisciplinarity.” That’s how I think about the “role” of interdisciplinarity: It’s more of a fabric or texture that underpins the teams on which I work. And as a leader, I need to consider how different members of a team or organization come together and bring their unique skills and backgrounds to bear on the tasks at hand.

As a tangible example, we had a terrific undergraduate intern at HP who was working on Computer Science and Humanities degrees at Stanford. His approach to his education resonated with me since I had taken a similar Engineering/Arts path in my own undergrad education. It was fun to watch him apply his thought processes and knowledge on a team of senior engineers, designers, and researchers. I believe he was successful in his intern role because he could reframe problems or goals in creative ways.

In 2012, you successfully defended your dissertation on “Understanding University Students’ Use of Tools and Artifacts in Support of Collaborative Project Work”. Almost a decade later: what are your thoughts on today’s use of (multimedia) tools and devices at a university level?

This is a great segue from the question about interdisciplinarity and collaboration!

As a social scientist, I am excited to see how new tools and processes “come with” students as they graduate and enter the workforce. The space of design prototyping is evolving rapidly, for example, as recent grads expect to use the same tools on the job that they learned how to use while in school. My role at HP included people management, and I had a number of conversations about how to get access to the specific software and hardware tools that employees needed to achieve their vision. Some of these discussions were easy: one of my colleagues asked if he could buy an iron and an ironing board, for example. I said yes. Other discussions required more planning, like when our team wanted to purchase a laser cutter. So perhaps I am taking this question in an unexpected direction, but I do see an opportunity to bridge a gap between the tools and devices in use at the university level and the availability of those same tools and devices in industry.

To be honest, I have a lot to learn about how students are doing their work today. It’s been several years since I finished my PhD. I spent an entire academic quarter observing a class of advanced design students. When I think about how they were doing their project work nearly a decade ago, and when I think about how I saw students working at Yale a couple of years ago, it’s easy for me to see the advances in technology. Or when we took a trip to Wellesley a few years ago, I watched my young daughter play with the VR headsets and try her hand at archaeology. And yet we still love whiteboards and paper! Once university students are able to safely return to in-person learning, I’m sure we will keep using whiteboards and paper as two of our main tools for learning and collaboration.

Looking at your impressive set of published patents: your inventions draw from and actually span many different disciplines.

Thanks! All of those patents represent the work of teams: I have been lucky to have worked with amazing people who, quite frankly, did the hard work to make those patents happen. So, returning to that topic of interdisciplinarity, I can only point to these published patents because of the amazing work of my colleagues.

One anecdote stands out for me now, as I think back about my experience at HP Labs in particular. I was meeting with one of my teammates, an amazing colleague named Ian Robinson, and we were having our weekly one-on-one meeting. We were talking about tracking digital pen devices in Virtual Reality (VR) spaces. At one point we began riffing on the idea of a “low-cost” VR controller, and then we had a realization: rather than putting a lot of expensive technology inside a single pen, what if you designed a pair of objects that relied on a different VR tracking method? We could conceivably eliminate the need for some of the guts of the single object if we had two objects moving in virtual space. We stopped our meeting and walked over to our desks, hoping to catch some of our teammates. We described the essential concept to a few of our peers and that was the genesis of the “VR Grabbers” idea. Jackie Yang was a Stanford grad student who was working as an intern in our lab at the time, and he did an incredible amount of work on the project from that point on. His effort culminated in our UIST 2018 paper on which Jackie was the first author!

How do you work across disciplines?

Continuing that “VR Grabbers” story, I was lucky enough to have a stimulating conversation with a really smart person in a place that enabled us to pursue the idea. Ian and I came from different professional backgrounds. We happened to find ourselves working together and, on that project, we made the most of our different skills. My role after that initial conversation was to evangelize the project inside the organization rather than develop the prototype, for example. So, while it was great to help a team come together around an idea, my involvement on the project was quite different than it would have been if I were earlier in my career.

I said a bit about collaboration earlier, but I’d like to go a bit deeper on this topic. In my dissertation I spent a lot of time in the literature review section exploring the different types of collaboration. I am a big believer in “contested collaboration,” which occurs when a team of people come from different backgrounds and bring their specific perspectives and experiences to bear on a project. It is certainly more challenging to lead a team that engages in contested collaboration: It would be a lot easier if everyone agreed all the time! I’m not saying anything new here, of course.

Could you name a grand research challenge in your current field of work?

I recently saw the 2021 AI Index Report from Stanford (https://aiindex.stanford.edu/report/) and I thought each topic raised in the summary of that report could represent a “grand research challenge.” On the topic of “generative everything”, I am particularly curious about the future of ideas. In 2019 I delivered one of the keynote presentations at the IEEE Games, Entertainment, and Media (IEEE GEM) conference at Yale University in New Haven, Connecticut. In part of my presentation, I raised the question about attribution of ideas and intellectual property when we “partner” with AI. I can imagine a future where it seems less clear “who” came up with an idea: the person or the AI agent? Thinking about the “VR Grabbers” story I told earlier, I wonder how that same story will play out 20 years from now. In my capacity as an affiliate assistant professor at the University of Washington, I’m excited to continue thinking about this topic!

How and in what form do you feel we as academics can be most impactful?

I think academics need to keep doing what they’re doing. Perhaps that’s a trite answer, but as a society we need to preserve and protect the ability of academics to do their work, to ask very basic questions and be surprised by what they find. I’m not just talking about the need for basic R&D so we can find the next penicillin. I’m also talking about how companies incentivize the effort to identify and use academic work.

I also think others know a lot more about this topic, though! I’d suggest reviewing the 2017 DIS paper, Translational Resources: Reducing the Gap Between Academic Research and HCI Practice, as a useful starting point. Lucas Colusso recently completed his PhD in Human Centered Design & Engineering at the University of Washington, and he was the first author on that paper. Thanks to Professor Gary Hsieh in that department, I became aware of Lucas’ work and now I reference it with my team members when we talk about how to pursue research topics that will have lasting impact. I believe academics are the experts at generating knowledge, and in industry we can apply similar approaches on our projects.

Bios

Alex Thayer, PhD is the Head of Research for Amazon (Search) in Palo Alto. He completed his PhD in Human Centered Design & Engineering at the University of Washington, where he is currently an Affiliate Assistant Professor. Prior to joining Amazon, Alex was the Chief Experience Architect for HP Labs. He has also worked at VMware, Intel, Microsoft, YouTube, and a voice recognition startup that was partly funded by James Doohan (Scotty from Star Trek). Alex’s professional work focuses on explorations of the social-technical gap and how we make sense of people’s habits, practices, and messy lives. His academic work spans topics from AR/VR to professional collaboration to digital gaming. He has published 12 patents on medical testing, haptic feedback systems, 3D and 4D printing, immersive displays, and wearable technology. He also co-leads his daughter’s Girl Scout troop.

Multidisciplinary Column: Conferences as Career and Community Catalysts

By Cynthia Liem | October 26, 2019 - 11:43 |January 26, 2020 0419, Feature, Opinion: Multidisciplinary Column

A little over 10 years ago, I chose to pursue a PhD. This meant I chose a professional life in which research publications and their uptake would be seen as major evidence of achievement. For those working in computer science, the major dissemination platforms for such publications are conferences.

Given my dual background in music and computer science, it was logical that my main interests were in topics that connected both these worlds. As a consequence, I hoped to become part of the Music Information Retrieval community. The International Society for Music Information Retrieval (ISMIR) therefore seemed the professional community to target, and the annual ISMIR conference the most logical place to present my work.

In terms of its education and research, my department at TU Delft had track records and agendas in visual and social multimedia content analysis, but not particularly in music. Considering methodology and philosophy, I did think a lot of the work at the department was compatible with what I tried to do in music. Furthermore, as I still was in training in a selective major at the conservatoire, I was not in a good position to geographically move to any other institute that would have a more established Music Information Retrieval track record. So I inquired whether I could stay in Delft for pursuing my PhD.

The answer was somewhat complicated. There was no funding for a PhD position in Music Information Retrieval, and there were no strategic plans to change that. At the same time, the people who had supervised me as a student (in particular, my thesis supervisor Alan Hanjalic) saw promise in me, and would like to keep working with me. Ultimately, I got a one-year contract in which my main task was to try acquiring funding and international community backing to pursue a Music Information Retrieval PhD in a multimedia group.

At the start of that year, I got to attend my first ISMIR conference, where I presented a paper based on my master’s thesis. In a previous column for the SIGMM Records, I already discussed my experiences at that moment: how debuting alone at a conference was intimidating, but how I was lucky that senior members of the community proactively made sure I got introduced to other attendees. Frans Wiering, the senior member who looked after me in particular at that moment, was general chair of the upcoming ISMIR, which would take place in Utrecht, so in my home country. Frans was quick to invite me to serve as a student volunteer, which was very good news for me. As my year would be filled with grant-writing, I did not yet have a sufficiently stable infrastructure around me to be able to truly do research, so submitting to the next ISMIR was out of reach. But this way, I could still attend the conference, and would even have an excuse to keep mingling with all the attendees, as we as volunteers would be the first people to answer any participant questions regarding logistics.

Getting funding turned out to be a true challenge. In 2009, digital music consumption was not as large yet as it is today, and many potential data-providing partners were reluctant to collaborate. Of course, it also did not help my cause that I still was a complete nobody. Finally, when working on music, one faces an interesting paradox. On the one hand, many people, regardless of their backgrounds, identify with music, up to the point that they personally deeply care about it. As such, working on music makes for a good conversation starter, in which people are always happy to share their personal experiences. On the other hand, this makes music a commonplace topic, which risks it being shoved aside as ‘less serious’. Even though technically, the problems we are working on are framed in very similar ways as they may be in neighboring domains such as vision (and the research challenges are at least as hard, if not harder, due to subjective human factors being an integral part of the problem), common criticisms we receive are that music is fun but does not save lives, and does not deal with areas of major economic impact, nor easily measurable societal impact. So while we never have any problems legitimizing our work in public outreach, in grant-writing, we always need to put extra effort into justifying why our work is more than a fun hobby, and sufficiently relevant to justify serious funding.

After several collaboration rejections, and the one proposal I did manage to set up being rejected despite good review scores, I was very lucky that at the very end of my grant-writing year, I managed to secure PhD funding through a Google Doctoral Fellowship (now PhD Fellowship). For this, I needed to get a research mentor, although my Google contacts weren’t so sure who would be appropriate for this role, as they were not aware of anyone working in music in the company at that stage.

Several weeks later, I was volunteering at ISMIR in Utrecht. That was where I found out that Douglas Eck had just moved from academia to industry, to work on music research at Google. And that was how I got my research mentor, with several extremely useful interning experiences at the company as a consequence.

When Emilia Gómez, the 2018-2019 president of the ISMIR society, invited me to become general co-chair of ISMIR’s 20th anniversary edition with her, and host the event in Delft, this was my chance to give back. Now I had general chair powers, and as the society was quite open to discussing any innovations, I could try to realize the conference of my dreams.

As described in my previous column, the inclusive spirit of ISMIR has always been quite elaborate, including mentoring programs spearheaded by our Women in MIR movement, an explicit focus on multidisciplinarity over exclusivity, and on being medium-sized but single-track. For the past two years, all our accepted papers have been presented in a 4-minute presentation and a poster, such that all the works get equal visibility. This year, we chose not to do themed sessions but to randomize the paper order, such that authors on related topics would not be presenting their posters at the same time. As a side-effect, this also would nudge attendees towards learning about everything that got accepted, beyond the topics of their specializations. This is something I have seen the ISMIR community always being enthusiastic about, while I had very different experiences at (more prestigious) larger-sized conferences. In many cases, their larger size led to many parallel tracks with fragmented audiences, while any plenary program elements were so massive that it was hard to engage with anyone you did not happen to know already, or incidentally happened to stand or sit next to.

We made sure we offered more than paper presentations. For the keynotes, we invited speakers from neighboring fields and disciplines, and encouraged them to give some critical perspectives on our field. We engaged with a local school in an outreach program. Before the conference, we held workshops, including the Women in MIR prototyping workshop, so people would already get to know one another; we had a dedicated Newcomer Initiatives chair to make sure no one felt lost, and the socials were set up such that people could really mingle. With many people in music also happening to be active music players, we offered both formal and informal options to jam together, so that week, several cafes in Delft faced more live music than we would normally see.

But while I was preparing for this conference, one of my strongest experiences was that I kept being haunted by these memories of the past: that being able to join this community (and an academic career at all) had been a really close call, that really was catalyzed by me having been able to join the conferences, and having met supportive seniors, while I was still an early-stage student without a full research embedding.

So one of the ISMIR 2019 achievements I am most proud of was that we extended our financial support programs, enabled by the ISMIR board and sponsorship funds. Beyond the existing grants for student authors and female participants, we added a third ‘community grant’ category, meant for individuals who would like to attend ISMIR, but who had not been in a position to actively participate in the conference at this stage. Reading through the motivation letters for this grant made me realize that my experiences were not as much of a freak case, and that colleagues have been facing similar challenges.

I am deeply grateful that these grants enabled us to bring more people over to ISMIR: young professionals in between positions; students in other disciplines seeking to collaborate more closely on music topics; students that have found themselves as sole people in their labs working on music, as the labs faced other strategic priorities; but also, seniors who used to be members of our field, but who had gradually been drifting out, when entering a vicious circle of not getting music projects funded, then having to do more teaching in other topics, and then taking hits on their research output and profile. It was a wonderful experience seeing all of them actively mingling with the community, and hearing how being at ISMIR indeed had been personally impactful for them.

For my student volunteers, I especially targeted local and national students who were not yet at the PhD level, such that they could experience our academic atmosphere. Here as well, I saw the positive impact of the ISMIR spirit; several of these students (of whom I am not even the thesis supervisor…) made friends with international colleagues, and are even trying to collaborate on music information research with them in their free time today.

Hopefully, this story can help inspire colleagues who are seeking to make their conference cultures more inclusive and impactful. With this, I do want to add a warning that endeavors like this will not come for free, but demand considerable extra work and advocacy. Many of our proposed innovations initially faced pushback in some form, as these were not how things normally were done, and they required financial and human resources that would not normally be accounted for. But I am very grateful that we followed through, and extremely proud of what we achieved in the end. My great thanks go to the ISMIR society, my fellow ISMIR 2019 organizers and our sponsors for their trust and support.

All ISMIR 2019 presentations have been recorded, and are available through this link. The accepted (open access) papers with supplementary material are available via this page. Photos of the socials are available here.

About the Column

Dr. Cynthia C. S. Liem is an Assistant Professor in the Multimedia Computing Group of Delft University of Technology, The Netherlands, and pianist of the Magma Duo. Her research interests consider search and recommendation for music and multimedia, with special interest in making people discover new interests, as well as questions of interpretability and validity. She initiated, co-coordinated and participated in various (inter)national collaborative research projects on the accessibility of content which would not trivially be retrieved, both in the music/cultural heritage world, as well as in social sciences applications, e.g. collaborating with organizational psychologists. Beyond her academic activities, Cynthia gained industrial experience at Bell Labs Netherlands, Philips Research and Google. She was a recipient of the Lucent Global Science and Google Anita Borg Europe Memorial scholarships, the Google European Doctoral Fellowship 2010 in Multimedia, and a finalist of the New Scientist Science Talent Award 2016 for young scientists committed to public outreach. In 2018, she was Researcher-in-Residence at the National Library of The Netherlands, and in 2019, she served as general co-chair of the ISMIR conference.

Dr. Jochen Huber is a Senior User Experience Researcher at Synaptics. Previously, he was an SUTD-MIT postdoctoral fellow in the Fluid Interfaces Group at MIT Media Lab and the Augmented Human Lab at Singapore University of Technology and Design. He holds a Ph.D. in Computer Science and degrees in both Mathematics (Dipl.-Math.) and Computer Science (Dipl.-Inform.), all from Technische Universität Darmstadt, Germany. Jochen’s work is situated at the intersection of Human-Computer Interaction and Human Augmentation. He designs, implements and studies novel input technology in the areas of mobile, tangible & non-visual interaction, automotive UX and assistive augmentation. He has co-authored over 60 academic publications and regularly serves as program committee member in premier HCI and multimedia conferences. He was program co-chair of ACM TVX 2016 and Augmented Human 2015 and chaired tracks of ACM Multimedia, ACM Creativity and Cognition and ACM International Conference on Interface Surfaces and Spaces, as well as numerous workshops at ACM CHI and IUI. Further information can be found on his personal homepage: http://jochenhuber.com

Multidisciplinary Column: An Interview with Max Mühlhäuser

By Jochen Huber | May 11, 2019 - 09:11 |July 8, 2019 0219, Feature, Opinion: Multidisciplinary Column

Could you tell us a bit about your background, and what the road to your current position was?

Well, this road is marked by wonderful people who inspired me and sparked my interest in the research fields I pursued. In addition, it is marked by two of my major deficiencies: I cannot stop investigating the role of my research in the larger context of systems and disciplines, and I have the strong desire to see “inventions” by researchers make their way into practice, i.e. turn into “innovations”. The first of these deficiencies led to the unusually broad research interests of my lab and myself, and the second one made me spend a substantial part of my career conceptualizing and leading technology transfer organizations, for the most part industry-funded ones.

More precisely, I started to cooperate with Digital Equipment Corp. (DEC) already during the time of my Diploma thesis. DEC was then the second largest computer manufacturer and spearhead of the efforts to build affordable “computers for every engineering group”. My boss, the late Professor Krüger, gave me a lot of freedom, so I was able to turn the research cooperation into the first funded European research project of DEC and later into their first research center in Europe, conceived as a campus-based organization that worked very closely with academia. I am proud to say that I was allowed to conceptualize this academia-industry cooperation and that it was later on copied – often with my help and consultancy – many times across the globe, by several companies and governments. I acted as the founding director of the first such center, but at that time I was already determined to follow the academic career path. At the age of 32, I was appointed professor at the University of Kaiserslautern. Over the years, I was offered positions at prestigious universities in Canada, France, and the Netherlands, and I accepted positions in Austria and Germany (Karlsruhe, Darmstadt). My sabbaticals led me to Australia, France and Canada, and for the most part to California (San Diego and four times Palo Alto). In retrospect it was exciting to start at a new academic position every couple of years in the beginning, but it was also exciting to “finally settle” in Darmstadt and to build the strengths and connections there that were necessary to drive even larger cooperative projects than before.

The Telecooperation Lab embraces many different disciplines. Celebrating its 20th birthday next year, how did these disciplines evolve over the years?

It started with my excitement for distributed systems, based on solid knowledge about computer networks. At the time (the early 1980s), little more than point-to-point communication between file transfer or e-mail agents existed, and neither client-server nor multi-party systems were common. My early interest in this field concerned software engineering for distributed systems, ranging from design and specification support via programming and simulation to debugging and testing. Soon, multimedia became feasible due to advancements in computer hardware – and in peripherals: think of the late laser disk, a clumsy predecessor of today’s DVDs and BDs. Multimedia grabbed my immediate attention since numerous problems arose from the interest to enable it in a distributed manner. Almost at the same time, e-learning became my favorite application field since I saw the great potential of distributed multimedia for this domain, given the challenges of global education and of the knowledge society. I believe that technology has come a long way with respect to e-learning, but we are still far from mastering the challenges of technology-supported education and knowledge work.

Soon came the time when computers left the desk and became ubiquitous. From my experience in multimedia and e-learning, it was obvious to me that human computer interaction would be a key to the success of ubiquitous computing. Simply extrapolating the keyboard-mouse-monitor based interaction paradigm to a future where tens, hundreds, or thousands of computers would surround an individual – what a nightmare! This threat of a dystopia made us work on implicit and tangible interaction, hybrid cyber-physical knowledge work, novel mobile and workspace interaction, augmented and virtual reality, and custom 3D printed interaction – HCI became our “new multimedia”.

Regarding applications domains, our research in supporting the knowledge society evolved towards supporting ‘smart environments and spaces’, a natural consequence of the evolution of our core research towards networked ubiquitous computers. My continued interest in turning inventions into innovations made us work on urgent problems of industry – mainly revolving around business processes – and on computers that expect the unexpected: emergencies and disasters. Both these domains were a nice fit since they could benefit from appropriate smart spaces. Looking at smart spaces of ever larger scale, we naturally hit the challenge of supporting smart cities and critical infrastructures.

Finally, a bit more than ten years ago, our ubiquitous computing research made us encounter and realize the “ubiquity” of related cybersecurity threats at large, in particular threats to privacy, along with the challenges of appropriate trustworthiness estimation and of detecting networked attacks. These cybersecurity research activities were, like those in HCI, natural consequences of my afore-mentioned deficiency: my desire to take a holistic look at systems – in my case, ubiquitous computing systems.

Finally, the fact that we adapt, apply and sometimes further machine learning concepts in our research is nothing but a natural consequence of the utility of those concepts for our purposes.

How would you describe the interrelationship between those disciplines? Do these benefit from cross-fertilization effects and if so, how?

In my answer to your last question, I unwillingly used the word “natural” several times. This shows already that research on ubiquitous computing and smart spaces with a holistic slant almost inevitably leads you to looking at the different aspects we investigate. These aspects just happen to concern different research disciplines in computer science. The starting point is the fact that ubiquitous computing devices are much less general-purpose computers than dedicated components. Networking and distributed systems support are therefore a prerequisite for orchestrating these dedicated skills, forming what can be called a truly smart space. Such spaces are usually meant to assist humans, so that multimedia – conveying “humane” information representations – and HCI – for interacting with many cooperating dedicated components – are indispensable. Next, how can a smart space assist a human if it is subject to cyber-vulnerabilities? Instead, it has to enforce its users’ concerns with respect to privacy, trust, and intended behavior. Finally, true smartness is by nature bound to adopting and adapting best-of-breed AI techniques.

You also asked for cross-fertilizing effects. Let me share just three of the many examples in this respect. (i) Our AI-related work cross-fertilized our cyberattack defense. (ii) On the other hand, the AI work introduced new challenges in distributed and networked systems, driving our research on edge computing forward. (iii) New requirements are added to this edge computing research by HCI since we want to support collaborative AR applications at large, i.e. city-wide, scale.

Moreover, cross-fertilizing goes beyond the research fields of computer science that we integrate in my own lab. As you know, I was and am heading highly interdisciplinary doctoral schools, formerly on e-learning, and now on privacy and trust for mobile users. When you work with researchers from sociology, law, economics, and psychology on topics like privacy-protecting smartphones, you first consider these topics as pertaining to computer science. Soon, you realize that the other disciplines dealt with issues like privacy and trust long before computers existed. Not only can you learn a lot from the deep and concise findings brought forth by these disciplines for decades or centuries, you can also quickly establish a very fruitful cooperation with researchers from these disciplines who address the new challenges of mobile and ubiquitous computing from their perspective. I am convinced that the unique role of Xerox PARC in the history of computer science, with so many of the most fundamental innovations originating there, is mainly a consequence of their highly interdisciplinary approaches, combining the “science of computers” with the “sciences concerned with humans”.

Please tell us about the main challenges you faced when uniting such diverse topics under the Telecooperation Lab’s multi-disciplinary umbrella?

The major challenge lies in a balancing act for each PhD thesis and researcher. On one hand, the work must be strictly anchored in a narrow academic field; as a young researcher, you are lucky if you can make yourself a bit of a name in a single narrow community – which is a prerequisite for any further academic career steps for many reasons. Trying to get rooted in more than one community during a PhD would be what I call academic suicide. The second side of the balancing act, for us, is the challenge to keep that narrow and focused PhD well connected to the multi-area context of my lab – and for the members of the doctoral schools, even connected to the respective multi-disciplinary context. While this second side is not a prerequisite for a PhD, it is an inexhaustible source of both new challenges for, and new approaches to, the respective narrow PhD fields. In fact, reaching out to other fields while mastering your own field costs some additional time; in my experience, however, this additional time can easily be saved in the search for original scientific contributions that will earn you a PhD. The reason is that the cross-fertilizing from a multi-area or even multi-disciplinary setting will lead you to original contributions much faster, due to a fresh look at both challenges and approaches.

When it comes to Postdoctoral researchers, things are a bit different since they are already rooted in a field, which means that they can reach out a bit further to other areas and disciplines, thereby creating a unique little research domain in which they can make themselves a name for their further career. My aim for my postdocs is to help them attain a status where, when I mention their name in a pertinent academic circle, my colleagues would say “oh, I know, that’s the guy who is working on XYZ”, with XYZ being a concise subdomain of research which that postdoc was instrumental in shaping.

The Telecooperation Lab is part of CRISP, the National Research Center for Applied Cybersecurity in Germany, which embraces many disciplines as well. Can you give us some insights into multidisciplinarity in such an environment?

Let me start by explaining that we started the first large cybersecurity research center in Darmstadt more than ten years ago; CRISP in its current form as a national center has only just started to exist. By the way, CRISP will have to be renamed again for legal reasons (sigh!). Therefore, let me address our cybersecurity research in general. This research involved a very broad spectrum of disciplines, from physicists that address quantum-related aspects to psychologists that investigate usable security and mental models. The most fruitful cooperations always concern areas that establish a “mutual benefits and challenges” relationship with the computer science side of cybersecurity. Two examples that come to my mind are law and economics. Computer science solutions to security and privacy always have limits. For instance, cryptographic solutions are always linked to trust at their boundaries (cf. trusted certificate authorities, trusted implementations of theoretically “proven-secure” protocols, trust in the absence of insider threats etc.). At such boundaries, law must punish what technology cannot guarantee, otherwise the systems remain insecure. In the reverse direction, new technical possibilities and solutions must be reflected in law. A prominent example is the power of AI: privacy law, such as the European Union’s GDPR, holds data processing organizations liable if they process personally identifiable information, PII for short. If data is not considered to be PII, it can be released. Now what if, three years later, a novel AI algorithm can link that data to some background data and infer PII from it? Privacy law needs a considerable update due to these new technical possibilities. I could talk about these mutual benefits and challenges on and on, but let me just quickly mention one more example from economics: if technology comes up with new privacy-preserving schemes then these schemes may open up new opportunities for privacy-respecting services.
In order for such services to succeed in the market, we need to learn about possible corresponding business models. This kind of economics research may lead to new challenges for technical approaches, and so on. Such “cycles of innovation” across different disciplines are among the most exciting facets of interdisciplinary research.

Could you name a grand challenge of multidisciplinary research in the Multimedia community?

Oh, I think I have quite a definite opinion on this one! We clearly live in the era of the fusion of bits and atoms – and this metaphor is of course just one way to characterize what is going on. Firstly, in the cyber-physical society that we are currently creating, the digital components are becoming the “brains” of complex real-world systems such as the transport system, energy grids, industrial production etc. This development already creates significant challenges concerning our future society, but beyond this trend and directly related to multimedia, there is an even more striking development: we increasingly feed the human senses by means of digitally created or processed signals – and hence, basically by means of multimedia. TV and telephone, social media and Web-based information, Skype conversations and meetings, you-name-it: our perception of objects, spaces, and of our conversation partners – in other words: of the physical world – is conveyed, augmented, altered, and filtered by means of computers and computer networks. Now, you will ask what I consider the challenge in this development that has been going on for decades. Consider that this field “jumps forward” in our days due to AI and other advancements: it is the challenge for interdisciplinary multimedia research to properly conserve the distinction between “real” and “imaginary” in all cases where we would or should conserve it. To cite a field that is only marginally concerned here, let me mention games: in games, it is – mostly – desired to blur the distinction between the real and the virtual. However, if you think of fake news or of highly persuasive social media governmental election campaigns, you get an idea of what I mean. The challenge here is highly multidisciplinary: for instance, many computer science areas have to come together already in order to check where in the media processing chain we can intervene in order to keep a handle on the real-versus-virtual distinction.
Way beyond that, we need many disciplines to work hand-in-hand in order to figure out what we want and how we can achieve it. We have to recognize that many long-existing trends are on the verge of jumping forward to an unprecedented level of perfection. We must figure out what society needs and wants. It is reckless to leave this development to economic or even malicious forces or to tech nerds who invent their own ethics. The examples are endless; let me cite a few in addition to those mentioned above, highlighting fake news and manipulative election campaigns.

Machine learning experts may call me paranoid, hinting at the fact that the detection of manipulated photos or deep fake videos is still a much simpler machine learning task than creating them. While this is true, I fear that it may change in the future. Moreover, alluding to the multidisciplinary challenges mentioned, let me remind you that we currently don’t have processes in place that would sufficiently check content for authenticity in a systematic way.

As another example, humans are told they are “valued customers”, but they have long been considered consumers at best. More recently, they have been downgraded to mass objects in which purchase desires are first created then directed – by sophisticated algorithms and with ever more convincing multimedia content. Meanwhile in the background, pricing discrimination is rising to new levels of sophistication. In a different field, questionable political powers are more and more capable of destabilizing democracies from a safe seat across the Internet, using curated and increasingly machine-created influential media.

As a next big wave, we are witnessing a giants’ race among global IT players for the crown in the augmented and virtual reality markets. What is still a niche area may become widespread technology tomorrow – reckon that the first successful smartphone was introduced only little more than a decade ago and that meanwhile the majority of the world’s population use smartphones to access the Internet. A similar success story may lie ahead for AR/VR: at the latest when a generation grows up wearing AR contact lenses, noise-cancelling earplugs and haptics-augmented clothes, reality will not be threatened by fake information any more but digitally created, imaginary content will be reality, rendering the question “what is real?” obsolete. Of course, the list of technologies and application domains mentioned here is far from exhaustive.

The problem is that all these trends appear to be evolutionary, not disruptive, as they actually are. Marketing already influenced customers centuries ago, fake news existed even longer, and the movie industry has always had a leading role in imaginary technology, from chroma keying to the most advanced animation techniques. Therefore, the new and upcoming AI-powered multimedia technology is not (yet) recognized as disruptive and hence as a considerable threat to the fundamental rules of our society. This is a key reason why I consider this field a grand interdisciplinary research challenge. We definitely need far more than technology solutions. At the outset, we need to come to grips with appropriate ethical and socio-political norms. To what extent do we want to keep and protect the governing rules of society and humankind? Which changes do we want, which ones not? What does all that mean in terms of governing rules for AI-powered multimedia, for the merging of the real and the virtual? Apart from basic research, we need a participatory approach that involves society in general and the rising generations in particular. Since we cannot expect these fundamental societal processes to lead to a final decision, we have to advance the other research challenges in parallel. For instance, we need a better understanding of social implications and of psychological factors related to the merge of the real and the virtual. Technology-related research must be intertwined with these efforts; as to technology fields concerned, multimedia research must go hand-in-hand with others like AI, cybersecurity, privacy, etc. – the selection depends on the particular questions addressed. This research must be further intertwined with human-related fields such as law: laws must again regulate what technology can’t solve, and reflect what technology can achieve for good or for evil.
In all this, I have not yet mentioned further related issues such as biometric access control: as we try to make access control more user friendly, we rely on biometric data, most of which are variants of multimedia, namely speech, face or iris photos, gait and others. The difference between real and virtual remains important here and we can expect enormous malicious efforts to blur it. You see, there is really a multidisciplinary grand challenge for multimedia.

How and in what form do you feel we as academics can be most impactful?

During the first half of my career, computer science was still in that wonderful gold diggers’ era: if you had a good idea and just decent skills to convey it to your academic peers, you could count on that idea to be heard, valued, and – if it was socially and economically viable – realized. Since then, we have moved to a state in which good research results are not even half the story. Many seemingly marginal factors drive innovation today. No wonder we have reached a point at which many industry players think that innovation should be driven by the company’s product groups in a close loop with customers, or by startups that can be acquired if successful, or – for the small part that requires long-term research – by a few top research institutions. I am confident that this opinion will be replaced by a new craze among CEOs in a few years. Meanwhile, academics should do their homework in three ways. (a) They should look for the true kernel in the current anti-academic trend and improve academic research accordingly. (b) They should orient their research towards the unique strength of academia, like the possibility to carry out true interdisciplinary research at universities. (c) They should tune their role, their words and deeds to those much-increased societal responsibilities highlighted above.

Academics from computer science trigger confusion and reshaping of our society to a bigger and bigger extent; it is time for them to live up to their responsibility.

Bios

Prof. Dr. Max Mühlhäuser is head of the Telecooperation Lab at Technische Universität Darmstadt, Informatics Dept. His Lab conducts research on smart ubiquitous computing environments for the ‘pervasive Future Internet’ in three research fields: middleware and large network infrastructures, novel multimodal interaction concepts, and human protection in ubiquitous computing (privacy, trust, & civil security). He heads or co-supervises various multilateral projects, e.g., on the Internet-of-Services, smart products, ad-hoc and sensor networks, and civil security; these projects are funded by the National Funding Agency DFG, the EU, German ministries, and industry. Max is heading the doctoral school Privacy and Trust for Mobile Users and serves as deputy speaker of the collaborative research center MAKI on the Future Internet. Max has also led several university wide programs that fostered E-Learning research and application. In his career, Max put a particular emphasis on technology transfer, e.g., as the founder and mentor of several campus-based industrial research centers.

Max has over 30 years of experience in research and teaching in areas related to Ubiquitous Computing (UC), Networks, Distributed Multimedia Systems, E-Learning, and Privacy&Trust. He held permanent or visiting professorships at the Universities of Kaiserslautern, Karlsruhe, Linz, Darmstadt, Montréal, Sophia Antipolis (Eurecom), and San Diego (UCSD). In 1993, he founded the TeCO institute (www.teco.edu) in Karlsruhe, Germany, which became one of the pace-makers for Ubiquitous Computing research in Europe. Max regularly publishes in Ubiquitous and Distributed Computing, HCI, Multimedia, E-Learning, and Privacy&Trust conferences and journals and authored or co-authored more than 400 publications. He was and is active in numerous conference program committees, as organizer of several annual conferences, and as member of editorial boards or guest editor for journals like Pervasive Computing, ACM Multimedia, Pervasive and Mobile Computing, Web Engineering, and Distance Learning Technology.

Multidisciplinary Column: An Interview with Andrew Quitmeyer

By Jochen Huber | September 13, 2018 - 10:32 |October 13, 2018 0318, Feature, Opinion: Multidisciplinary Column

Could you tell us a bit about your background, and what the road to your current position was?

In general, a lot of my background has been motivated by personal independence and trying to find ways to sustain myself. I was a first-generation grad student (which may explain a lot of my skepticism and confusion about academia in general). I moved out of the house at 15 to go to this cool, free, experimental public boarding high school in Illinois. I went to the University of Illinois because it was the nicest school I could go to for free (despite horrible college counselors telling all the students they should take on hundreds of thousands of dollars of debt to go to the “school of their dreams”). They didn’t have a film-making program, so I created my own degree for it. Thinking I could actually have a film career seemed risky, and I wanted something that would protect my ability to get a job, so I got an engineering degree too.

I was a bit disappointed in the engineering side though, because I felt we never actually got to build anything on our own. I think a lot of people know me as some kind of “hacker” or “maker”, but I didn’t start doing many physical projects until much later in grad school when I met my friend Hannah Perner-Wilson. She helped pioneer a lot of DIY e-textiles (kobakant.at), and what struck me was how beautifully you could document and play with physical projects. Physical computing seemed an attractive combination of my abilities in documentary filmmaking and engineering.

The other big revelation for me roped in the naturalist side of what I do. I have always loved adventuring in nature, and studying wild creatures, but growing up in the midwest USA, this was never presented as a viable career opportunity. In the same way that it was basically taken for granted in midwestern US culture that studying art was a sort of frivolous hobby for richer kids, a career in biology that didn’t feed into engineering work in some specific industry (agriculture, biotech, etc…) was treated as equally flippant. I tried taking as many science electives as I could in undergrad, but it was because they were fun. Again, it was not until grad school when I had a cool job doing computer vision programming with an ant-lab robot-lab collaboration that I realized the potential error of my ways. Some of the ant biologists invited me to go ant hunting out in the desert after our meeting, and it was so fun and interesting I had a sort of existential meltdown. “Oh no! I screwed up, I could have been a field biologist all this time? Like that’s a job?”

So I worked to sculpt my PhD around this revelation. I wanted to join field biologists in exploring the natural world while using and developing novel technology to help us probe and document these creatures in new ways.

I plowed through my PhD as fast as I could because going to school in the US is expensive. You either have to join a lab to help work on someone else’s (already funded) project or take time away from your research to TA classes. After I got out of there, I did some other projects, and eventually got a job as an assistant professor at the National University of Singapore in the communications and new media department. Unfortunately it seems like I came at a pretty chaotic time (80% of my fellow professors are leaving my division, not to be replaced), and so I will actually be leaving at the end of this semester to figure out a new place to continue doing research and teaching others.

How does multidisciplinary work play a role in your research on “Digital Naturalism”?

The work is basically anti-disciplinary. Instead of relying on a specific field of practice, the work simply sets out towards some basic goals and happily uses any means necessary to get there. Currently this includes a blend of naturalistic experimentation, performance art, film making, interaction design, software and hardware engineering, industrial design, ergonomics, illustration, and storytelling.

The more this work spreads into other disciplines the more robust and interesting I think it will become. For instance, I would love to see more video games developed about interacting with wild animals.

Could you name a grand research challenge in your current field of work?

Let’s talk to animals.

I am a big follower of Janet Murray’s work in Digital Media. She sees the grand challenge of computers as forming this amazing new medium that we all have to collaboratively experiment with to figure out how to truly make use of the new affordances it provides us. For me, the coolest new ability of computers is their behavioral nature. Never before have we had a medium that shares the same unique qualities of living creatures in being able to sense stimuli from the world, process this and be able to create new stimuli in response. Putting together these senses and actions lets computers give us the first truly “behavioral medium.” Intertwining the behaviors of computers with living creatures opens up a new world of dynamic experimentation.

When most people think of talking to animals, they imagine some kind of sci-fi, Dr. Dolittle auto-translator. A bird chirps a song, and we get a text message that says “I am looking for more bird food.” This is quite speciesist of us, and upholds that ingrained assumption that all living creatures strive to somehow become more like us.

Instead, I think this digital, behavioral medium holds more value and potential in bringing us into their world and modes of communication. You can learn what the ants are saying by building your own robot ant that taps antennae with the workers around her. You might learn more about a bird’s communication by capturing its bodily movements and physical interactions giving context to its thoughts than trying to brute-force decrypt the sounds it makes.

I find anything we can learn about animals and their environments useful, and the behaviors that computers can enact as a key to bringing us into their world. There is a really long road ahead though. To facilitate rich, behavioral interactions with other creatures requires advances, experimentation, and refinement of our ability to sense non-human stimuli and provide realistic stimuli back. Meanwhile though, I can barely create a sensor that can detect just the presence or absence of an ant on a tree in the wild. Thus we need a lot more development and experimentation, but I imagine future digital naturalists using technology to turn themselves into Goat-men like Thomas Thwaites rather than as Star Trek commanders using some kind of universal translator.

You have been starring in a ‘Hacking the Wild’ television series on the Discovery Channel. How did the idea for this series come about? Do you aim to reach out to particular audiences with the show?

Yeah that was an interesting experience! Some producers had seen some of my work I had been documenting and producing from expeditions I led during my PhD, and contacted me about turning it into a show. A problem in the entertainment industry is that nobody seems to understand why you would ever not want to be in entertainment. They treat it as a given that that’s what everyone is striving for in their lives. This seems to give them license to not treat people great and say whatever it takes to get people to do what they want (even if some of these things turn out to be false). So, for instance, I was first told that my show would be about me working with scientists building technology in the jungle, but then it devolved into a survival genre TV show with just me. The plot became non-sensical (which could have been fun!), but pressure from the industry forced us to keep up the grizzled stereotypes of the genre (“if I don’t find food in the next couple hours…I might not make it out”).

It gave me an interesting chance to insert myself and some of my own ideals into this space though. One thing that irks me about the survival genre in general is its rhetoric of “conquering nature.” They kept trying to feed me lines about how I would use this device, or this hack to “defeat nature” which is the exact opposite of what I want to do in my work. So I tried to stand my ground and assert that nature is beautiful and fun, and maybe we can use things we build to understand it even better. Many traditional survival audiences didn’t seem to care for it, but I have gotten lots of fan mail from around the world from people who seem to get the real idea of it a bit more – make things outside and use them to play in nature. I remember one nice email from a young kid who would prototype contraptions in their back yard with what they called “electric sticks,” and that was really nice.

You recently organized the first Digital Naturalism Conference (Dinacon), which was quite unlike the types of conferences we would normally encounter. Could you tell us a bit more about Dinacon’s setup, and the reasons why you initiated a conference like this?

Dinacon was half a long-term dream and half a reaction to problems in current academic publishing and conferences.

The basic idea of the Digital Naturalism Conference was to gather all the neat people in my network spanning many different fields and practices, and get them to hang out together in a beautiful, interesting place. For me, this was a direct continuation of my Digital Naturalism work to re-imagine the sites of scientific exploration. In previous events I had tried to explore combining hackathons with biological field expeditions. These “hiking hacks” looked to design the future of how scientific trips might function in tandem with the design of scientific tools. The conference looked to take this to the next stage and re-imagine what the biological field station of the future might look like.

The more specific design of this conference was built as a reaction to a lot of the problems I see in current academic traditions. The academic conferences I have taken part in generally had these problems:

Exploitative – Powered by unpaid laborers (organizing, reviewing, formatting, advertising) who then have to pay to attend themselves
Expensive – only rich folks get to attend (generally with money from their institution)
Exclusive – generally you have to already be “vetted” with your papers to attend (not knocking peer review! Just vetted networking)
Steer Money in not great directions – e.g. lining the pockets of fancy hotels and publishing companies
Restricted Time – Most conferences leave just enough time to get bored waiting for others’ unenthusiastic presentations to finish, and maybe grab a drink before heading back to all the duties one has. I think for good work to be done, and proper connections to be made in research, people need time to live and work together in a relaxing, exciting environment.

[I go into more details about all this in the post about our conference’s philosophy: https://www.dinacon.org/2017/11/01/philosophy/ ]

Based on these problems, I wanted to experiment with alternative methods for gathering people, sharing information, and reviewing the information they create. I wanted to show that these problems were illnesses within the current system and traditions we perpetuate, and that many alternatives not only exist, but are feasible even on a severely reduced budget. (We started on an initial budget, self-funding the rental of the place with $7000 USD; we then crowdfunded an additional $11,000 after the conference was announced to provide more amenities and stipends.)

Thus, when creating this conference, we sought to attack each of these challenges. First we made it free to attend and provided free or subsidized housing. We also made it open to absolutely anyone from any discipline or background. Then we tried to direct what money we did have to spend towards community improvements. For instance, we rented out the Diva Andaman for the duration of the conference. This was a tourism ship that was interested in also helping the biology community by serving as a mobile marine science lab. In return for letting us use the facilities and rooms on the ship, we helped develop ideas and tools for its new laboratories. Finally, and perhaps most importantly, we worked to provide time for the participants. They were allowed to stay for 3 days to 3 weeks and encouraged to take time to explore, adjust to the place, and interact with each other.

We also tried to streamline the responsibilities of the participants by having just 3 official “rules”:

You must complete something. Aim big, aim small, just figure out a task for yourself that you can commit yourself to that you can accomplish during your time at the conference. It can be any format you want: sculpture, a movie, a poem, a fingerpainting, a journal article – you just have to finish it!

Document and Share it. Everything will be made open-source and publicly accessible!

Provide feedback on two (2) other people’s projects.

The goal of these rules would be that, just like at a traditional conference, everyone would leave with a completed work that’s been reviewed by their peers. Also like the reality of any conference, not all of these rules were 100% met. Everyone created something, most documented it, and gave plenty of feedback to each other, but there wasn’t yet as much of an infrastructure in place for them to give this feedback and documentation a bit more formally. These rules functioned with great success, however, as goal posts leading people towards working on interesting things while also collaborating and sharing their work.

Do you feel Dinacon was successful in promoting inclusivity? What further actions can the community undertake towards this as a whole?

I do, and was quite happy with the results, but am excited to build on this aspect even more. We worked hard at reaching out to many communities around the world, especially within groups or demographics that may be overlooked otherwise. This was a big factor in where we decided to locate the conference as well. Thailand was great because many folks from around Southeast Asia could easily come, while people from generally richer nations in the west could also make it. I think this is a super important feature for any international conference: make it easier for the less privileged and more difficult for the more privileged.

I genuinely do not understand why giant expensive conferences just keep being held where the rich people already live. Anytime I am at some expensive conference hotel in Singapore, Japan, or the USA, I think about how all that money could go so much further and have a bigger impact on a community elsewhere. For instance, there has NEVER been a CHI conference held anywhere in Africa, South America, or Southeast Asia. These places also have large hotels where you can hook up computers and show a PowerPoint, so it’s not like they are missing the key infrastructure of these types of conferences.

One of the biggest hurdles is money and logistics. We had folks accepted from every continent except Antarctica, but our friends from Ghana couldn’t make it due to the arduous visa process. We had a couple small micro-travel grants (300-600 bucks of my own money) to help get people over who might not have been able to otherwise, but I wished we could have made our conference entirely free and could cover transportation (instead of just free registration, free food, and free camping housing).

That’s a limitation of a self-funded project: you just try to help as much as you can until you are tapped out. The benefit of it, though, is proving that many people with a middle class job can actually do this too. Before I got my job, I pledged to put 10% of my earnings towards creating fun, interesting experiences for others. It’s funny that when people spend this amount of money on more established things like cars or religious tithing, people accept it, but when I tell people I am spending $7000 USD of my own money putting on a free conference about my research they balk and act like I am nuts. I couldn’t think of anything better to spend 10% of your money on than something that brings you and others joy.

Next year’s conference will likely have a sliding scale registration though to help promote greater inclusivity overall than what we could provide out of our own pockets. Having people who can afford to pay a couple hundred for registration help subsidize those who would have been prevented from coming seems like an equitable solution.

How and in what form do you feel we as academics can be most impactful?

Fighting competitiveness. I think the greatest threat put onto academics is the idea that we are competing with each other. Unfortunately, many institutional policies actually codify this competition into truth. As an academic your loyalty should be first and foremost to unlocking new ideas about our world that you can share with others. This quality is rarely directly rewarded by any large organization, though. This means that standing up for academic integrity will almost undoubtedly come at a cost. It may cost you your bonus, your grant application, or even your job. In terms of your life and your career, however, I think these will only be short term expenses, and in fact be investments into deeper, more impactful research and experiences.

Academics like to complain about the destructiveness of policies based on pointless metrics and academic cliques, but nothing will change unless you simply stand up against it. Not everyone can afford to stand up against the authorities. Maybe you cannot quit your job because you need the health care, but there are ways for all of us to call out exploitation that we see in institutional or community structures. You need to assess the privileges you do have, and do what you can to help share knowledge and lift up those around you.

For instance, in my reflection after going to a more traditional conference (during my own conference), I pledged to

no longer help recruit “reviewers” for papers if they are not compensated in some way,
avoid reviewing papers for exploitative systems, and
transfer my reviewing time to help conferences and journals with open policies.

(more info here: https://www.dinacon.org/2018/06/19/natural-reflection-andy-quitmeyer/). For now, this pledge excludes me from some of the major conferences in my field, which in turn makes me publish my work in other venues, which many institutions look down on, and this inhibits my hire-ability. I think it’s worth it though to help stop perpetuating these problems onto future generations.

In your opinion, what would make an academic community a healthy community?

I think a healthy academic community would be one where the people are happy, help each other, and help make space for people outside their community to join and share. The only metric I think I would want to judge the quality of an institution on would be about how happy they feel their community is. I don’t care what their output is, especially in baseless numbers of publications or grant money; developing healthy communities is the only way to lead to any kind of long-term sustainable research. You need humans of different abilities and generations watching out for each other, helping each other learn new things, and protecting each other.

Some people try to push the idea that competition is necessary to make people work hard and be productive, or else they will be lazy and greedy. In fact, it’s this competition that creates these side effects. When cared for, people are curious, constructive, and helpful.

So keep your eyes open for ways in which your peers or students are being exploited and stand up against it. Reach out to find out challenges people around you face, and work on developing opportunities outside the scope of the traditions in your field. I think doing this will help build healthy and productive communities.

Bios

Dr. Andrew Quitmeyer is a hacker / adventurer studying intersections between wild animals and computational devices. His academic research in “Digital Naturalism” at the National University of Singapore blends biological fieldwork and DIY digital crafting. This work has taken him through international wildernesses where he’s run workshops with diverse groups of scientists, artists, designers, and engineers. He runs “Hiking Hacks” around the world where participants build technology entirely in the wild for interacting with nature. His research also inspired a ridiculous spin-off television series he hosted for Discovery Networks called “Hacking the Wild.” He is currently working to establish his own art-science field station fab lab.


Multidisciplinary Community Spotlight: Assistive Augmentation

By Jochen Huber | June 21, 2018 - 16:44 |June 21, 2018 0218, Feature, Opinion: Multidisciplinary Column


Emphasizing the importance of neighboring communities for our work in the field of multimedia was one of the primary objectives we set out with when we started this column about a year ago. In past issues, we gave related communities a voice through interviews and personal accounts. For instance, in the third issue of 2017, Cynthia shared personal insights from the International Society of Music Information Retrieval [4]. This issue continues the spotlight series.

Since its inception, I have been involved with the Assistive Augmentation community, a multidisciplinary field that sits at the intersection of accessibility, assistive technologies, and human augmentation. In this issue, I briefly reflect on my personal experiences and research work within the community.

First, let me provide a high-level view on Assistive Augmentation and its general idea which is that of cross-domain assistive technology. Instead of putting sensorial capability in individual silos, the approach puts it on a continuum of usability for a specific technology. As an example, a reading aid for people with visual impairments enables access to printed text. At the same time, the reading aid can also be used by those with an unimpaired visual sense for other applications like language learning. In essence, the field is concerned with the design, development, and study of technology that substitutes, recovers, empowers or augments physical, sensorial or cognitive capabilities, depending on specific user needs (see Figure 1).

Figure 1. Assistive Augmentation Continuum

Now let us take a step back. I joined the MIT Media Lab as a postdoctoral fellow in 2013 pursuing research on multi-sensory cueing for mobile interaction. With my background in user research and human-computer interaction, I was immediately attracted by an ongoing project at the lab led by Roy Shilkrot, Suranga Nanayakkara and Pattie Maes, that involved studying how the MIT visually impaired and blind user group (VIBUG) uses assistive technology. People in that group are particularly tech-savvy. I came to know products like the ORCAM MyEye. It is priced at about 2500-4500 USD and aims at recognizing text, objects and so forth. Back in 2013 it had a large footprint and made its users really stand out. Our general observations were, to briefly summarize, that many tools we got to know during regular VIBUG meetings were highly specialized for this very target group. The latter is, of course, a good thing since it focuses directly on the actual end user. However, we also concluded that it locks the products in silos of usability defined by their end users’ sensorial capabilities.

These anecdotal observations bring me back to the general idea of Assistive Augmentation. To explore this idea further, we proposed to hold a workshop at a conference, jointly with colleagues in neighboring communities. With ACM CHI attracting folks from different fields of research, we felt like it would be a good fit to test the waters and see whether we could get enough interest from different communities. Our proposal was successful: the workshop was held in 2014 and set the stage for thinking about, discussing and sketching out facets of Assistive Augmentation. As intended, our workshop attracted a very diverse crowd from different fields. Being able to discuss opportunities and the potential of Assistive Augmentation with such a group was immensely helpful and contributed significantly to our ongoing efforts to define the field. It is a practice I would encourage everyone at a similar stage to follow.

As a tangible outcome of this very workshop, our community decided to pursue a jointly edited volume which Springer published earlier this year [3]. The book illustrates two main areas of Assistive Augmentation by example: (i) sensory enhancement and substitution and (ii) design for Assistive Augmentation. Peers contributed comprehensive reports on case studies which serve as lighthouse projects to exemplify Assistive Augmentation research practice. Besides, the book features field-defining articles that introduce each of the two main areas.

Many relevant areas have yet to be touched upon, for instance, ethical issues, quality of augmentations and their appropriations. Augmenting human perception, another important research thrust, has recently been discussed in both SIGCHI and SIGMM communities. Last year, a workshop on “Amplification and Augmentation of Human Perception” was held by Albrecht Schmidt, Stefan Schneegass, Kai Kunze, Jun Rekimoto and Woontack Woo at ACM CHI [5]. Also, one of last year’s keynotes at ACM Multimedia focused on “Enhancing and Augmenting Human Perception with Artificial Intelligence” by Achin Bhowmik [1]. These ongoing discussions in academic communities underline the importance of investigating, shaping and defining the intersection of assistive technologies and human augmentations. Academic research is one avenue that must be pursued, with work being disseminated at dedicated conference series such as Augmented Human [6]. Other avenues that highlight and demonstrate the potential of Assistive Augmentation technology include for instance sports, as discussed within the Superhuman Sports Society [7]. Most recently, the Cybathlon was held for the very first time in 2016. Athletes with “disabilities or physical weakness use advanced assistive devices […] to compete against each other” [8].

Looking back at how the community came about, I conclude that organizing a workshop at a large academic venue like CHI was an excellent first step for establishing the community. In fact, the workshop created a fantastic momentum within the community. However, focusing entirely on a jointly edited volume as the main tangible outcome of the workshop had several drawbacks. In retrospect, the publication timeline was far too long, rendering it impossible to capture the dynamics of an emerging field. But indeed, this cannot be the objective of a book publication; it should have been the objective of follow-up workshops in neighboring communities (e.g., at ACM Multimedia) or special issues in a journal with a much shorter turn-around. With our book project now being concluded, we aim to pick up on past momentum with a forthcoming special issue on Assistive Augmentation in MDPI’s Multimodal Technologies and Interaction journal. I am eagerly looking forward to what is next and to our communities’ joint work across disciplines towards pushing our physical, sensorial and cognitive abilities.

References

[1] Achin Bhowmik. 2017. Enhancing and Augmenting Human Perception with Artificial Intelligence Technologies. In Proceedings of the 2017 ACM on Multimedia Conference (MM ’17), 136–136.

[2] Ellen Yi-Luen Do. 2018. Design for Assistive Augmentation—Mind, Might and Magic. In Assistive Augmentation. Springer, 99–116.

[3] Jochen Huber, Roy Shilkrot, Pattie Maes, and Suranga Nanayakkara (Eds.). 2018. Assistive Augmentation. Springer Singapore.

[4] Cynthia Liem. 2018. Multidisciplinary column: inclusion at conferences, my ISMIR experiences. ACM SIGMultimedia Records 9, 3 (2018), 6.

[5] Albrecht Schmidt, Stefan Schneegass, Kai Kunze, Jun Rekimoto, and Woontack Woo. 2017. Workshop on Amplification and Augmentation of Human Perception. In Proceedings of the 2017 CHI Conference Extended Abstracts on Human Factors in Computing Systems, 668–673.

[6] Augmented Human Conference Series. Retrieved June 1, 2018 from http://www.augmented-human.com/

[7] Superhuman Sports Society. Retrieved June 1, 2018 from http://superhuman-sports.org/

[8] Cybathlon. Cybathlon – moving people and technology. Retrieved June 1, 2018 from http://www.cybathlon.ethz.ch/

About the Column


Multidisciplinary Column: An Interview with Emilia Gómez

By Cynthia Liem | February 1, 2018 - 16:25 |March 24, 2018 0118, Feature, Opinion: Multidisciplinary Column


Could you tell us a bit about your background, and what the road to your current position was?

I have a technical background in engineering (telecommunication engineer specialized in signal processing, PhD in Computer Science), but I also followed formal musical studies at the conservatory since I was a child. So I think I have an interdisciplinary background.

Could you tell us a bit more about how you have encountered multidisciplinarity and interdisciplinarity both in your work on music information retrieval and your current project on human behavior and machine intelligence?

Music Information Retrieval (MIR) is itself a multidisciplinary research area intended to help humans better make sense of music data. MIR draws from a diverse set of disciplines, including, but by no means limited to, music theory, computer science, psychology, neuroscience, library science, electrical engineering, and machine learning.

In my current project HUMAINT at the Joint Research Centre of the European Commission, we try to understand the impact that algorithms will have on humans, including our decision making and cognitive capabilities. This challenging topic can only be addressed in a holistic way and by incorporating insights from different disciplines. At our kick-off workshop, we gathered researchers working in distant fields, e.g. from computer science to philosophy, including law, neuroscience and psychology, and we realised the need to engage in scientific discussions from different views and perspectives to address human challenges in a holistic way.

What have, in your personal experience, been the main advantages of multidisciplinarity and interdisciplinarity? Have you also encountered any disadvantages or obstacles?

The main advantage I see is the fact that we can combine distinct methodologies to generate new insights. For researchers, stepping out of a discipline’s comfort zone makes us more creative and innovative.

One disadvantage is the fact that when you work in a multidisciplinary field you seem not to fit into traditional academic standards. In my case, I am perceived as a musician by engineers and as an engineer by musicians.

Beyond the academic community, your work also closely connects to interests by diverse types of stakeholders (e.g. industry, policy-makers). In your opinion, what are the most challenging aspects for an academic to operate in such a diverse stakeholder environment?

The most challenging part of diverse teams is communication, e.g. being able to speak the same language (we might need to create interdisciplinary glossaries!) and explain our research in an accessible way so that it is understood by people with diverse backgrounds and expertise.

Regarding your work on music, you often have been speaking about making all music accessible to everyone. What do you consider the grand research challenges regarding this mission?

Many MIR researchers desire that technology can be used to make all music accessible to everyone, i.e. that our algorithms can help people discover new music, develop a varied musical taste and make them open to new music and, at the same time, to new ideas and cultures. We often talk of our desire that MIR algorithms help people discover music in the so-called ‘long tail’, i.e. music that is not so popular or present in the mainstream scenario. I believe the variety of music styles reflects the variety of human beings, e.g. in terms of culture, personalities and ideas. Through music we can then enrich our culture and understanding.

As the newly elected president of the ISMIR society, are there any specific missions regarding the community you would like to emphasize?

I have had the chance to work with an amazing ISMIR board over the last years, an incredible group of people willing to contribute to our community with their talent and time. This team is very easy to work with!

This year, ISMIR is organizing its 19th edition (yes, we are getting old)! There are many challenges at ISMIR that we as a community should address, but at the moment I would like to emphasize some relevant aspects that are now somehow a priority for the board.

The first one is to maintain and expand its scientific excellence, as ISMIR should continue to provide key scientific advancements in our field. In this respect, we have recently launched our open access journal Transactions of ISMIR to foster the publication of deeper and more mature research works in our area.

The second one is to promote variety in our community, e.g. in terms of discipline, gender or geographical location, also related to music culture and repertoire. In this respect, and thanks to our members, we have promoted ISMIR taking place at different locations, including editions in Asia (e.g. 2014 in Taipei, Taiwan, and 2017 in Suzhou, China).

Other aspects we value are reproducibility, openness and accessibility. In this sense, our priority is to maintain affordable registration rates, taking advantage of sponsorships from our industrial members, and to devote our membership fees to travel funds for students or other members who need support to attend ISMIR.

How and in what form do you feel we as academics can be most impactful?

The academic environment gives you a lot of flexibility and freedom to define research roadmaps, although there are always some dependencies on funding. In addition, academia provides time to reflect and go deep into problems that are not directly related to a product in the short term. In the technological field, academia has the potential to advance technologies by focusing on a deeper understanding of why these technologies work well or not, e.g. through theoretical analysis or comprehensive evaluation.

You have also been very engaged in missions surrounding Women in STEM, for example through the Women in MIR initiatives. In discussions on fostering diversity, the importance of role models is frequently mentioned. How can we be good role models?

Yes, I have become more and more concerned about the lack of opportunities that women have in our field compared to their male colleagues. In this sense, Women in MIR is playing a major role in promoting the role and opportunities of women in our field, including a mentoring program, funding for women to attend ISMIR, and the creation of a public repository of female researchers to make them more visible and present.

I think women are already great role models in their different profiles, but they lack visibility compared to their male colleagues.

Bios

Dr. Emilia Gómez graduated as a Telecommunication Engineer at Universidad de Sevilla and studied piano performance at the Seville Conservatoire of Music, Spain. She then received a DEA in Acoustics, Signal Processing and Computer Science applied to Music at IRCAM, Paris and a PhD in Computer Science at Universitat Pompeu Fabra in Barcelona (2006). She has been a visiting researcher at the Royal Institute of Technology, Stockholm (Marie Curie Fellow, 2003), McGill University, Montreal (AGAUR competitive fellowship, 2010), and Queen Mary University of London (José de Castillejos competitive fellowship, 2015). After her PhD, she was first a lecturer in Sonology at the Higher School of Music of Catalonia and then joined the Music Technology Group, Department of Information and Communication Technologies, Universitat Pompeu Fabra in Barcelona, Spain, first as an assistant professor and then as an associate professor (2011) and ICREA Academia fellow (2015). In 2017, she became the first female president of the International Society for Music Information Retrieval, and in January 2018, she joined the Joint Research Centre of the European Commission as Lead Scientist of the HUMAINT project, studying the impact of machine intelligence on human behavior.

Editor Biographies