An interview with Wallapak Tavanapong

MR: Describe your journey into computing from your youth up to the present.

Wallapak Tavanapong

Pak: I started learning about computing quite late. I did not know what a computer was until I joined the B.S. degree program in Computer Science at Thammasat University, Thailand, and learned the foundations there. After finishing the degree, I joined the M.S. program in Computer Science at the University of Central Florida (UCF), Orlando, Florida, USA. UCF was a great learning place for me. I had a wonderful advisor, Prof. Kien A. Hua, good classes, and great friends. My research at the time was on video-on-demand, which was a hot topic then. After my Ph.D., I joined the Department of Computer Science at Iowa State University in 1999 as an Assistant Professor and was recently promoted to Full Professor.

Iowa State University is a great place for my career. In the beginning, I continued the research in video-on-demand and multimedia caching. In 2003, my colleagues, Profs. JungHwan Oh, Piet C. de Groen, Johnny Wong, and I began investigating automated content analysis of endoscopic video for improving the quality of the procedure. At the time, few works existed, and most were on automated detection of polyp appearance in images. Our approach is to automatically analyze an entire procedure, calculate detailed objective metrics that reflect the quality of inspection for the entire procedure, and provide real-time feedback to help the endoscopist improve the quality. We co-founded EndoMetric Corporation to transfer the technology into practice. I am glad that this research area now receives much more attention in both academia and industry, and that our work has had some influence on later work. In 2013, I began new interdisciplinary research and education initiatives in political informatics and computational communication and advertising.

MR: What foundational lessons did you learn from this journey?

Pak:

First, never give up when facing difficulty. Second, there are several paths toward good research. I am more attracted to research problems in a different discipline. I like to create a new computing research problem out of vague problem descriptions in other disciplines. I love interdisciplinary research.

MR: Why were you initially attracted to multimedia?

Pak:

My initial interest was in database research. As data began to come in different media types, extension to multimedia was natural.

MR: Tell us more about your vision and objectives behind your current roles? What do you hope to accomplish and how will you bring this about?

Pak:

First, I’d like to see my research help to prevent or reduce suffering from cancer for many. To achieve this goal, I need to do more to push my technology into practice. Second, I’d like to see computational thinking integrated into the science and math curriculum in elementary schools in the US and other countries soon. Over the past five years, I have been engaged in our departmental K-12 outreach activities, coaching K-12 kids and interested K-12 teachers in computational thinking. I’d also like to see more women in computer science and computing fields. In our K-12 outreach program, we found that young girls started losing their interest in science as early as the fifth grade. So, I hope to get them interested in computing early, in the third grade. Last, I’d like to see my interdisciplinary work with political scientists and communication scholars lead to a national social multimedia repository that is useful for social scientists and the public to learn about decision making in public policies that affect many lives.

MR: Can you profile your current research, its challenges, opportunities, and implications?

Pak: My top two projects are

  • Reconstruction of a virtual colon from 2D colonoscopic images:

    The human colon is a complex tubular structure with multiple twists and turns. A good colon exam increases early detection of colorectal cancer. I’d like to provide a 3D colon inspection map during the procedure so that the endoscopist knows which areas inside the colon they might have missed. There are many challenges. The most critical one is that commonly used endoscopes are not equipped with 3D camera positioning technology. I am working to add low-cost hardware equipment that provides some position information. I will use the position together with content analysis of endoscopic images to reconstruct the virtual colon. The work has the potential to increase the polyp detection rate during colonoscopy, preventing deaths and reducing pain and suffering.

  • Multimedia information system for political science and communication:

    This system would help answer research questions in political science and communication that could not have been answered before because of the sheer volume, variety, and velocity of data. Specifically, my team is working on understanding how states learn about policies from one another, how news reporters carry information from state legislatures to the public, how a public policy is influenced, etc. This is an application domain that lends itself to multimedia research, ranging from the underlying data management technology and automated content analysis of multiple media types and sources (web and video online ads, TV ads, state bills and laws, and tweets by political figures) to visualization of the resulting knowledge from the analysis.

MR: How would you describe the role of women especially in the field of multimedia?

Pak: I think the role of women in multimedia is the same as that of men, but our number is much lower. We need to increase the number of women in the field. I believe that we need to get young girls interested in computing as early as elementary school.

MR: How would you describe your top innovative achievements in terms of the problems you were trying to solve, your solutions, and the impact it has today and into the future?

Pak: I would say that my top achievement so far is in the idea and the realization of real-time computer-aided analysis and feedback to improve the quality of colonoscopy. We were the first to investigate this problem. There are several challenges, for instance: defining what to analyze that reflects quality as seen by the domain experts; coming up with effective algorithms to compute the quality measurements; showing that the automated measurement indeed improves quality; making the automated analysis real-time, effective, and low cost to be used in practice; and deploying the technology for daily use in hospitals and clinics.

My technology has already saved a couple of lives and I would like it to do more in the future. I have seen more researchers in academia and industry get into this research area, which is great. We need more researchers and developers in multimedia and healthcare to help medical professionals improve the quality of care via automation.

MR: Over your distinguished career, what are your top lessons you want to share with the audience?

Pak: Never give up. Find good mentors who care about you, believe in you, and give you different perspectives. A peer mentor is great. I learn a lot from my colleagues. Find a research problem you are passionate about. Last, when you realize that there is a problem, do not complain; look for a good solution and fix it.

 

Call for Nominations for Editor-In-Chief of ACM TiiS

Call for Nominations
Editor-In-Chief
ACM Transactions on Interactive Intelligent Systems (TiiS)

The term of the current Editors-in-Chief (EiC) of the ACM Transactions on Interactive Intelligent Systems (TiiS) is coming to an end, and the ACM Publications Board has set up a nominating committee to assist the Board in selecting the next Editor(s)-in-Chief.

Nominations, including self nominations, are invited for a three-year term as TiiS EiC, beginning on January 1, 2016. The EiC appointment may be renewed at most one time. This is an entirely voluntary position, but ACM will provide appropriate administrative support.

The EiC is responsible for maintaining the highest editorial quality, for setting technical direction of the papers published in TiiS, and for maintaining a reasonable pipeline of articles for publication. He/she has final say on acceptance of papers, size of the Editorial Board, and appointment of Associate Editors. The EiC is expected to adhere to the commitments expressed in the policy on Rights and Responsibilities in ACM Publishing. For more information about the role of the EiC, see ACM’s Evaluation Criteria for Editors-in-Chief.

Nominations should include a vita along with a brief statement of why the nominee should be considered. Self-nominations are encouraged, and should include a statement of the candidate’s vision for the future development of TiiS. The deadline for submitting nominations is November 30, 2015, although nominations will continue to be accepted until the position is filled.

Please send all nominations to the nominating committee chair, Matthew Turk (mturk@cs.ucsb.edu).

The search committee members are:

  • Matthew Turk (UC Santa Barbara), Chair
  • Elisabeth André (University of Augsburg)
  • Henry Lieberman (MIT)
  • Desney Tan (Microsoft Research)

Report from ICACNI 2015

Report from the 3rd International Conference on Advanced Computing, Networking, and Informatics

Inauguration of 3rd ICACNI 2015

The 3rd International Conference on Advanced Computing, Networking and Informatics (ICACNI-2015), organized by the School of Computer Engineering, KIIT University, Odisha, India, was held on 23-25 June 2015.

Prof. Nikhil R. Pal during his keynote

The conference commenced with a keynote by Prof. Nikhil R. Pal (Fellow IEEE, Indian Statistical Institute, Kolkata, India) on ‘A Fuzzy Rule-Based Approach to Single Frame Super Resolution’.

Authors listening to technical presentations

Apart from three regular tracks on advanced computing, networking, and informatics, the conference hosted three invited special sessions. While more than 550 articles were received across the different tracks of the conference, 132 articles were finally selected for presentation and publication in the Smart Innovation, Systems and Technologies series of Springer as Volumes 43 and 44.

Prof. Nabendu Chaki during his technical talk

Extended versions of a few extraordinary articles from these will be published in special issues of the Egyptian Informatics Journal and Innovations in Systems and Software Engineering (A NASA Journal). The conference showcased a technical talk by Prof. Nabendu Chaki (Senior Member IEEE, Calcutta University, India) on ‘Evolution from Web-based Applications to Cloud Services: A Case Study with Remote Healthcare’.

A click from award giving ceremony

The conference identified some wonderful works and gave away eight awards in different categories. The conference succeeded in bringing together academic scientists, professors, research scholars and students to share and disseminate information on knowledge and scientific research works related to the conference. The 4th ICACNI 2016 is scheduled to be held at the National Institute of Technology Rourkela, Odisha, India.

A quick interview with Open Source Software Competition’s organizers

The program of ACM Multimedia is very diverse: apart from oral and poster presentations, panels and keynotes, there are challenges and competitions. One that is particularly interesting is the Open Source Software Competition, which is largely specific to this conference and was started at ACM Multimedia 2004. The full list of participants and winners (along with links to all the projects) can be found on the SIGMM web site: http://sigmm.org/Resources/software/ossc. This list shows that over the years this session has drawn growing (and well deserved) attention from the community. We asked the chairs of the ACM MM 2013 (Andrea Vedaldi & Ioannis Patras, answering as Org2013) and 2015 (Xian-Sheng Hua, Marco Bertini & Tao Mei, answering as Org2015) competitions about their experience with and opinions of the competition.

Q1: How hard was it to get submissions to OSSC? Did you have to ask authors of software you knew, or are they aware of this part of the ACM MM programme? Overall, how many submissions did you receive?

Org2013: We did not have to ask any author directly. We only circulated an advertisement to three mailing lists, including a CV and an ML one. The competition seems to be sufficiently well known that it can attract submissions with little effort.

Org2015: It was not that hard. We contacted some authors asking for submissions, but in the end the majority of submissions came from people who already knew the competition or from the call for papers we disseminated. We received 15 submissions, of which 9 were accepted. Decisions were made considering the quality of the presentation and of the software itself, as well as the importance and utility for the multimedia community.

Q2: What’s your evaluation of the quality of submissions? Have you ever used software from past submissions?

Org2013: Half of the submissions were of very high quality, both in scope and maturity of the projects. A few were very poor, at the level of a master’s project at most. (Note: Andrea won the ACM MM’10 competition with his VLFeat library).

Org2015: Quality is quite high. We accepted works that were interesting and useful for the community and that were also mature enough to be used by members of the multimedia community. Marco Bertini: I already use some software from this year’s submissions, and I am also using software from past editions.

Q3: What’s your evaluation of OSSC per-se? Do you think other conferences should have something similar?

Org2013: It is a very good competition as it gives a chance to the authors of the software to obtain a publication and significant publicity (especially in the case of victory). It is also a great way to let the public know about solid OS projects. Having multiple competitions is tempting as contributions tend to be quite orthogonal (e.g. audio vs database vs networking vs imaging). At the same time, the number of contributions does not seem to warrant splitting the effort up.

Org2015: It is an interesting and useful track for ACM Multimedia. It has both scientific and technical value: it eases the development of new algorithms and methods, and makes it easier to re-implement the methods proposed by other researchers. The effort of the authors of such software deserves to be recognized by the scientific community. Other major conferences in different fields of CS should probably introduce this type of track.

Image indexing and retrieval with Yael

Introduction

Yael is a library implementing computationally intensive functions used in large scale image retrieval, such as neighbor search, clustering and inverted files. The library offers interfaces for C, Python and Matlab.

The motivation of Yael is twofold. We aim at providing: 

  • core and optimized instructions and methods commonly used for large-scale multimedia retrieval systems 
  • more sophisticated functions associated with state-of-the-art methods, such as the Fisher vector, VLAD, Hamming Embedding or more generally methods based on inverted file systems, such as selective match kernels.

Yael is intended as an API and does not implement a retrieval system in an integrated manner: only a few test programs are available for key tasks such as k-means. Yet this can be done on top of it with a few dozen lines of Matlab or Python code.

Yael started as an open-source spin-off of INRIA LEAR’s proprietary library Bigimbaz. The objective was to isolate performance-critical primitives that could be re-used in other projects. Yael’s design choices were: implementation in C for simplicity, but with an object-oriented design (structs with constructors/destructors), and an interface with Python as a high-level language to facilitate administrative tasks.

Yael is designed to handle dense data in float, as it is primarily used for signal processing tasks where the quality of the representation is determined by the number of dimensions rather than the precision of the components. This corresponds to single matrices in the Matlab interface and float32 arrays in Python. Yael was designed initially to manipulate matrices in C. It was interfaced for Python using SWIG, which gives low-level access to the full library. An additional Numpy layer (ynumpy) is provided for high-level functions. The most important functions of Yael are wrapped in Mex to be callable from Matlab.

Performance is very important. Yael has computed k-means with hundreds of thousands of centroids and routinely manipulates matrices that occupy more than half of the machine’s RAM. This means that it has to be lightweight and 64-bit clean. The design choices of Yael are governed by efficiency concerns more than by portability. As a result, the library may work only with severely degraded performance if certain instructions are not provided by the processor. In particular, Yael relies on SSE instructions such as the SSE 4.2 popcnt instruction. The library is maintained for Linux and MacOS. Yael relies on as few external libraries as possible. The only mandatory ones are BLAS/Lapack (for performance). Other libraries (Python’s C interface, Matlab’s mex, Arpack, OpenMP) are optional.

Yael and related packages are downloaded around 600 times per month. 

This article addresses the recognition of images of the same scene or object, and how Yael can perform this kind of operation. Here is an example of two images of the same scene that we would like to match:

[Figure: two Holidays images of the same scene, 127300 and 127301]

 

We will explain how to compute descriptors (aka signatures) for the images, and how to find descriptors that are similar between images.

We are going to work on the first 100 query images of the Holidays dataset, and their associated database examples. The images and associated SIFT descriptors can be downloaded from here: Images and SIFT descriptors.

Image indexing

Imagine a user who has a large image collection of photos of buildings, each with the GPS location of the building as associated metadata. Given a new photo of a building, taken with a mobile phone, the user wants to find the location where the photo was taken. This is where image indexing comes into play.

Image indexing means constructing an index referencing the images from a collection. This index has a search function that can be used to retrieve the images that are most similar to a query image. 

At build time and search time, the index is stored in RAM. This is orders of magnitude faster than disk-based implementations, such as those used in SQL database engines. However, for large datasets, this requires either a lot of RAM or a very compact representation per image. Yael provides this compact representation, so that you do not need to buy the RAM.

In combination with efficient matrix manipulation environments like Matlab and Numpy, Yael makes the process of building an index and searching in it very simple. 

Extracting image descriptors

Local image descriptors are vectors computed each on an area of the image. The areas are selected to contain strong contrast changes, with a 2D signal processing filter. Then the descriptor vector is computed from the gradient or frequency content in the area.

Local descriptors are typically designed to be invariant to some classes of transformations: translations, illumination changes, rotations, etc. At the same time, they should be discriminant enough to distinguish relevant differences between the patches, e.g. different patterns on the facade of a building. There is a long line of research on designing local image features with appropriate tradeoffs in terms of invariance / discriminance / computational cost; see for example this comparison of affine covariant features.

In the images above, local descriptors extracted on the skyline ought to be very similar. Therefore, these images should be easy to match.

Local descriptors can be extracted using any local description algorithm, as long as they can be compared with L2 distances, i.e. descriptors that are far apart in L2 space are also considered different in image content. For example, OpenCV provides an implementation of the SURF descriptor, and VLFeat contains a SIFT implementation. 

For this example, we will use the SIFT implementation provided along with the Holidays dataset. In the “Descriptor extraction” section of http://lear.inrialpes.fr/~jegou/data.php, download the executable (there is a Mac OS X version and a Linux version). 

The pre-processing applied to images before analyzing them to extract signatures can have a dramatic effect on the retrieval performance. Ideally, images should be equalized so that their luminance is similar and resized to dimensions that are not too different. This can be performed in a number of ways, e.g. with Imagemagick. In our case, we’ll just use a few command-line utilities from netpbm.

In total, the steps that extract the descriptors from a single image are:

infile=xxxx.jpg
tmpfile="${infile%.jpg}.pgm"
outfile="${infile%.jpg}.siftgeo"

# Rescaling and intensity normalization
djpeg "$infile" | ppmtopgm | pnmnorm -bpercent=0.01 -wpercent=0.01 -maxexpand=400 | pamscale -pixels $((1024*768)) > "$tmpfile"

# Compute descriptors
compute_descriptors -i "$tmpfile" -o4 "$outfile" -hesaff -sift

This should be applied to all the images that are to be indexed, and the ones that will be queried. 

The remainder of this article presents the main functions used in Yael to do image retrieval. They are implemented in the two languages supported by Yael: Python and Matlab. 

Image indexing in Python with Fisher vectors

A global image descriptor is a vector that characterizes the whole image. The Euclidean distance between the descriptors of two images should be higher for different images than for similar images. There are many popular types of global descriptors, like color histograms or GIST descriptors.

Here, we use a statistical tool derived from the Fisher kernel to aggregate the local SIFT descriptors of an image into a global image descriptor: the Fisher vector (FV). See Aggregating local image descriptors into compact codes for more details. You may also be interested in INRIA’s Fisher vector implementation which is a Matlab version of this example, on the complete Holidays dataset.

The most important functions of Yael are available in Python via the ynumpy module. They all manipulate c-compact float32 or int32 matrices. 

The FV computation relies on a training where a Gaussian Mixture Model (GMM) is fitted to a set of representative local descriptors. For simplicity, we are going to use the descriptors of the database we index. To load the database descriptors, use the ynumpy.siftgeo_read function:

from yael import ynumpy

image_descs = []  # one descriptor matrix per image
for imname in image_names:
    desc, meta = ynumpy.siftgeo_read(imname)
    image_descs.append(desc)

The meta component contains the SIFT descriptor’s meta-information (location and size of the area, orientation, etc.). We do not use this information to compute the FV.

Next, we sample the descriptors, reduce their dimensionality by PCA, and compute a GMM. This involves some standard numpy code and the ynumpy.gmm_learn function. For a GMM of size k (let’s set it to 64), we need about 1000*k training descriptors:

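import numpy as np

# stack the descriptors of all images into a single matrix
all_desc = np.vstack(image_descs)

k = 64
n_sample = k * 1000

# choose n_sample descriptors at random
sample_indices = np.random.choice(all_desc.shape[0], n_sample)
sample = all_desc[sample_indices].astype('float32')

# train GMM
gmm = ynumpy.gmm_learn(sample, k)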
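
The listing above does not show the PCA step mentioned in the text. A minimal numpy sketch of such a step is given below, assuming descriptors are stored one per row. The helper names fit_pca and apply_pca, the target dimension of 64, and the random stand-in data are illustrative assumptions, not part of Yael.

```python
import numpy as np

def fit_pca(sample, n_components):
    """Return (mean, projection matrix) for a PCA fitted on the rows of sample."""
    mean = sample.mean(axis=0)
    centered = sample - mean
    # Covariance matrix of the centered descriptors (d x d)
    cov = centered.T @ centered / sample.shape[0]
    # eigh returns eigenvalues in ascending order, so sort them in decreasing
    # order and keep the eigenvectors of the n_components largest ones
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1][:n_components]
    return mean, eigvecs[:, order].T.astype('float32')

def apply_pca(desc, mean, pca_transform):
    """Project descriptors (one per row) onto the retained principal axes."""
    return ((desc - mean) @ pca_transform.T).astype('float32')

# Stand-in for real SIFT descriptors: random 128-D vectors (hypothetical data)
rng = np.random.default_rng(0)
sample = rng.random((2000, 128)).astype('float32')
mean, pca_transform = fit_pca(sample, 64)
reduced = apply_pca(sample, mean, pca_transform)
print(reduced.shape)  # (2000, 64)
```

With this approach, the same mean and projection have to be applied to the descriptors of every database and query image before they are encoded, so that all vectors live in the same reduced space.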

The GMM is a tuple containing the a-priori weights per mixture component, the mixture centres and the diagonal of the component covariance matrices (the model assumes a diagonal matrix, otherwise the descriptor would be way too long).

The training is finished. The next stage is to encode the SIFTs into one vector per image: 

image_fvs = []
for image_desc in image_descs:
    # compute the Fisher vector, using only the derivative w.r.t mu
    fv = ynumpy.fisher(gmm, image_desc, include='mu')
    image_fvs.append(fv)
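
To make the encoding step less of a black box, here is a plain numpy sketch of a Fisher vector restricted to the derivative with respect to the means, following the standard formulation for a diagonal-covariance GMM. It is not Yael's implementation, and any normalization Yael applies may differ. The function name fisher_mu and the toy GMM are assumptions made for illustration.

```python
import numpy as np

def fisher_mu(gmm, desc):
    """Fisher vector of desc (n x d) w.r.t. the GMM means only.

    gmm is a (weights, means, variances) tuple with shapes (k,), (k, d), (k, d).
    """
    w, mu, sigma2 = gmm
    n = desc.shape[0]
    # Difference between every descriptor and every mixture centre: (n, k, d)
    diff = desc[:, None, :] - mu[None, :, :]
    # Log-density of each descriptor under each diagonal Gaussian: (n, k)
    log_p = -0.5 * (np.sum(diff ** 2 / sigma2, axis=2)
                    + np.sum(np.log(2 * np.pi * sigma2), axis=1))
    log_p = log_p + np.log(w)
    # Posterior probabilities (soft assignments), computed in a stable way
    log_p = log_p - log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # Gradient w.r.t. each mean, scaled by 1 / (n * sqrt(weight))
    grad = np.einsum('nk,nkd->kd', gamma, diff / np.sqrt(sigma2))
    grad /= n * np.sqrt(w)[:, None]
    return grad.ravel().astype('float32')

# Toy GMM with k components in d dimensions (hypothetical values)
rng = np.random.default_rng(0)
k, d = 4, 8
gmm = (np.full(k, 1.0 / k), rng.normal(size=(k, d)), np.ones((k, d)))
desc = rng.normal(size=(50, d))
fv = fisher_mu(gmm, desc)
print(fv.shape)  # (32,)
```

The resulting vector has k*d components whatever the number of local descriptors in the image, which is what allows images to be compared with a single Euclidean distance.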

All the database descriptors are stacked as rows of a single matrix image_fvs, and all query image descriptors in another matrix query_fvs. Then the Euclidean nearest neighbors of each query (and hence the most similar images) can be retrieved with:

# get the 8 NNs for all query images in the image_fvs array
results, distances = ynumpy.knn(query_fvs, image_fvs, nnn = 8)
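
For readers who want to check results on a small dataset without Yael, an exhaustive nearest-neighbor search can be sketched in numpy as follows. The function knn_bruteforce is a hypothetical stand-in, not Yael's code. It returns squared distances, which give the same ranking as plain Euclidean distances.

```python
import numpy as np

def knn_bruteforce(queries, database, nnn):
    """Return (indices, squared distances) of the nnn nearest database rows per query."""
    # ||q - b||^2 = ||q||^2 + ||b||^2 - 2 q.b, evaluated for all pairs at once
    d2 = (np.sum(queries ** 2, axis=1)[:, None]
          + np.sum(database ** 2, axis=1)[None, :]
          - 2.0 * queries @ database.T)
    d2 = np.maximum(d2, 0.0)  # clip small negatives caused by rounding
    idx = np.argsort(d2, axis=1)[:, :nnn]
    return idx, np.take_along_axis(d2, idx, axis=1)

# Toy data: the queries are the first three database vectors
rng = np.random.default_rng(0)
database = rng.random((20, 5))
queries = database[:3]
idx, dist2 = knn_bruteforce(queries, database, nnn=4)
print(idx[:, 0])  # each query finds itself first: [0 1 2]
```

The full sort makes this sketch quadratic in memory and slow for large collections; it is only meant to illustrate what the search computes.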

Now we display the search results for a few query images. There is one line per query image, which shows the image, and a row of retrieval results. The correct results have a green rectangle around them, negative ones a red rectangle. 

[Figure: search results for a few query images]

Note that the query image always appears as the first retrieval result, because it is included in the dataset.

Image indexing based on global descriptors like the Fisher Vector is very efficient and easy to implement using Yael. For larger datasets (more than a few tens of thousands of images), it is useful to use vector quantization or hashing techniques to perform the nearest-neighbor search faster. 

Image indexing in Matlab with inverted files

In this section, we directly index all the local SIFT descriptors of the database images into an indexing structure in RAM called the inverted file. Each SIFT descriptor is assigned an index in [1,k] using a quantization function. The inverted file contains k lists, one per possible index. When a SIFT from an image is assigned to an index 1 ≤ i ≤ k, the id of this image is added to the list i.
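
The inverted file structure described above can be sketched in a few lines of Python. The class ToyInvertedFile and its methods are hypothetical and far simpler than Yael's structure: there are no binary signatures, and indices run from 0 to k-1 rather than from 1 to k. The quantization function is assumed to be nearest-centroid assignment.

```python
import numpy as np
from collections import defaultdict

class ToyInvertedFile:
    def __init__(self, centroids):
        self.centroids = centroids        # (k, d) visual vocabulary
        self.lists = defaultdict(list)    # visual word -> list of image ids

    def quantize(self, desc):
        """Assign each descriptor (one per row) to its nearest centroid."""
        d2 = ((desc[:, None, :] - self.centroids[None, :, :]) ** 2).sum(axis=2)
        return d2.argmin(axis=1)

    def add(self, image_id, desc):
        """Append image_id to the list of every visual word found in desc."""
        for word in self.quantize(desc):
            self.lists[int(word)].append(image_id)

    def query(self, desc):
        """Count, per database image, the descriptors sharing a visual word."""
        votes = defaultdict(int)
        for word in self.quantize(desc):
            for image_id in self.lists.get(int(word), ()):
                votes[image_id] += 1
        return dict(votes)

# Toy vocabulary of k=2 words in 2-D and two database images (hypothetical data)
ivf = ToyInvertedFile(np.array([[0.0, 0.0], [10.0, 10.0]]))
ivf.add(7, np.array([[0.1, 0.0], [0.0, 0.2]]))
ivf.add(9, np.array([[9.9, 10.0], [0.2, 0.1]]))
votes = ivf.query(np.array([[0.0, 0.1]]))
print(votes)  # {7: 2, 9: 1}
```

Only the lists of the visual words present in the query are visited, which is why the structure scales better than comparing the query with every database descriptor.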

In the example below, we show how to use an inverted file of Yael from Matlab. More specifically, the inverted file we consider supports binary signatures, as proposed in the Hamming Embedding approach described in this paper.

Before launching the code, please ensure that

  • You have a working and compiled version of Yael’s matlab interface
  • The corresponding directory ('YAELDIR/matlab') is in your Matlab path. If not, use addpath('YAELDIR/matlab') to add it.

To start with, we define the parameters of the indexing method. Here, we choose a vocabulary of size k=1024. We also set some parameters specific to Hamming embedding.

k = 1024;                            % Vocabulary size
dir_data = './holidays_100/';        % data directory

% Parameters For Hamming Embedding
nbits = 128;                         % Typical values are 32, 64 or 128 bits
ht = floor(nbits*24/64);             % Hamming Embedding threshold
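
As an illustration of what the threshold ht means, the following Python sketch compares binary signatures by Hamming distance. Signatures are stored as Python integers here, and the inclusive comparison in is_match is an assumption; Yael's own code works on packed bit vectors and relies on the popcnt instruction mentioned earlier.

```python
def hamming_distance(a, b):
    """Number of differing bits between two signatures stored as Python ints."""
    return bin(a ^ b).count("1")

def is_match(a, b, ht):
    """Accept a pair of descriptors when their signatures differ in at most ht bits."""
    return hamming_distance(a, b) <= ht

nbits = 128
ht = nbits * 24 // 64  # same value as floor(nbits*24/64) in the Matlab listing
print(ht)  # 48

# Hypothetical 128-bit signatures
sig = 0x0123456789ABCDEF0123456789ABCDEF
close = sig ^ 0x3FF            # 10 bits flipped
far = sig ^ ((1 << 60) - 1)    # 60 bits flipped
print(is_match(sig, close, ht), is_match(sig, far, ht))  # True False
```

Two descriptors that fall in the same visual word are kept as a match only if their signatures are close enough, which filters out many of the false matches that quantization alone would accept.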

Hereafter, we show how we typically load a set of images and descriptors stored in separate files. We use the standard Matlab functions arrayfun and cellfun to perform operations in batch. The descriptors are assumed to be stored in the siftgeo format, so we read them with the Yael function siftgeo_read.

sifts = cell(1, numel(img_list));  % one cell per image

for i = 1:numel(img_list)
  [sifts_i, meta] = siftgeo_read(img_list{i}); 
  sifts{i} = sifts_i; 
end

Now, we are going to learn the visual vocabulary with k-means and subsequently construct the inverted file structure for Hamming Embedding. We learn it on Holidays itself to avoid requiring another dataset. But note that this should be avoided for a true system, and a proper evaluation should employ an external dataset for dictionary learning.

vtrain = [sifts{:}];                 % concatenate all descriptors
vtrain = vtrain (:, 1:2:end);        % keep every other descriptor for training

C = yael_kmeans (vtrain, k, 'niter', 10);

% We provide the codebook and the function that performs the assignment,
% here it is the exact nearest neighbor function yael_nn

ivfhe = yael_ivf_he (k, nbits, vtrain, @yael_nn, C);

We can now add the descriptors of all the database images to the inverted file. Here, each local descriptor receives an identifier. This is not a requirement: another possible choice would be to use the id of the image directly. But in that case we could not use this output for spatial verification. In our case, the descriptor id will be used to display the matches.

descid_to_imgid = zeros (totsifts, 1);  % desc to image conversion
imgid_to_descid = zeros (nimg, 1);      % for finding desc id
lastid = 0;

for i = 1:nimg
  ndes = nsifts(i);  % number of descriptors

  % Add the descriptors to the inverted file.
  % The function returns the visual words (and binary signatures).
  [vw,bits] = ivfhe.add (ivfhe, lastid+(1:ndes), sifts{i});
  imnorms(i) = norm(hist(vw,1:k));

  descid_to_imgid(lastid+(1:ndes)) = i;
  imgid_to_descid(i) = lastid;
  lastid = lastid + ndes;
end

Finally, we make some queries. We compute the number of matches n_immatches between query and database images. The listing below counts the matches per image with hist; the standard Matlab function accumarray can be used instead to compute, in essence, a histogram weighted by the match weights.

Queries = [1 13 23 42 63 83];
for q = 1:numel(Queries)
  qimg = Queries(q);

  matches = ivfhe.query (ivfhe, int32(1:nsifts(qimg)), sifts{qimg}, ht);

  % Translate to image identifiers and count number of matches per image
  m_imids = descid_to_imgid(matches(2,:));
  n_immatches = hist (m_imids, 1:nimg);

  % Images are ordered by decreasing score
  [~, idx] = sort (n_immatches, 'descend');

  % Display results
  ...
end
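
The counting and ranking step of the loop above can also be written with numpy, which may help readers who follow the Python examples. This is a sketch with 0-based image ids. The function rank_images and the toy match list are assumptions; the optional weights argument plays the role of a weighted histogram.

```python
import numpy as np

def rank_images(match_img_ids, nimg, weights=None):
    """Count (optionally weighted) matches per image and rank images by score."""
    scores = np.bincount(match_img_ids, weights=weights, minlength=nimg)
    # Stable sort on the negated scores: decreasing score, ties in id order
    order = np.argsort(-scores, kind="stable")
    return order, scores

# Image id of each matched database descriptor (hypothetical values)
match_img_ids = np.array([2, 0, 2, 2, 1, 0])
order, scores = rank_images(match_img_ids, nimg=4)
print(scores)  # [2 1 3 0]
print(order)   # [2 0 1 3]

# Same matches, now weighted (e.g. by a per-match similarity)
match_weights = np.array([0.5, 1.0, 0.5, 0.5, 4.0, 1.0])
w_order, w_scores = rank_images(match_img_ids, nimg=4, weights=match_weights)
```

In the weighted call, one strong match for image 1 outweighs three weak matches for image 2, so the ranking changes.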

The output looks as follows. The query is the top-left image, followed by the retrieved results. The title gives the number of matches and the normalized score used to rank the images. The matches are displayed in yellow (and the non-matching descriptors in red).

[Figure: search results produced by the Matlab example]

Conclusion

Yael is a small library that contains many primitives that are useful for image indexing, nearest-neighbor search, sorting, etc. It is at the base of several state-of-the-art implementations of image indexing packages. Reference [1] describes the implementation tradeoffs of some of Yael’s main functions, and provides more references to research papers whose results were obtained with Yael.

In the code above, only the main function calls were shown. See the Yael tutorial for a fully functional version of the code, and the main Yael website for the complete documentation. 

 

TOMM Associate Editor of the Year Award 2015

Annually, the Editor-in-Chief of the ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) honors one member of the Editorial Board with the TOMM Associate Editor of the Year Award. The purpose of the award is to distinguish excellent work for ACM TOMM, and hence also for the whole multimedia community, in the previous year. Criteria for the award are (1) the number of submissions processed in time, (2) the performance during the reviewing process, and (3) the accurate interaction with the reviewers in order to broaden the awareness of the journal. Based on these criteria, the ACM Transactions on Multimedia Computing, Communications and Applications Associate Editor of the Year Award 2015 goes to Pradeep Atrey from State University of New York, Albany, USA.

Pradeep K. Atrey is an Assistant Professor at the State University of New York, Albany, NY, USA. He is also an (on-leave) Associate Professor at the University of Winnipeg, Canada and an Adjunct Professor at University of Ottawa, Canada. He received his Ph.D. in Computer Science from the National University of Singapore, M.S. in Software Systems and B.Tech. in Computer Science and Engineering from India. He was a Postdoctoral Researcher at the Multimedia Communications Research Laboratory, University of Ottawa, Canada. His current research interests are in the area of Security and Privacy with a focus on multimedia surveillance and privacy, multimedia security, secure-domain cloud-based large-scale multimedia analytics, and social media. He has authored/co-authored over 100 research articles at reputed ACM, IEEE, and Springer journals and conferences. His research has been funded by Canadian Govt. agencies NSERC and DFAIT, and by Govt. of Saudi Arabia.

Dr. Atrey is on the editorial board of several journals including ACM Trans. on Multimedia Computing, Communications and Applications, ETRI Journal and IEEE Communications Society Review Letters. He was also guest editor for Springer Multimedia Systems and Multimedia Tools and Applications journals. He has been associated with over 40 international conferences/workshops in various roles such as Organizing Chair, Program Chair, Publicity Chair, Web Chair, Area Chair, Demo Chair and TPC Member. Dr. Atrey was a recipient of the Erica and Arnold Rogers Award for Excellence in Research and Scholarship (2014), ETRI Journal Best Editor Award (2012), ETRI Journal Best Reviewer Award (2009) and three University of Winnipeg Merit Awards for Exceptional Performance (2010, 2012 and 2013). He was also recognized as “ICME 2011 – Quality Reviewer” and is invited as a Rising Star Speaker at the SIGMM Inaugural Multimedia Frontier Workshop (2015).

The Editor-in-Chief Prof. Dr.-Ing. Ralf Steinmetz cordially congratulates Pradeep.

2015 ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) Nicolas D. Georganas Best Paper Award

The 2015 ACM Transactions on Multimedia Computing, Communications and Applications (TOMM) Nicolas D. Georganas Best Paper Award goes to the paper “A Quality of Experience Model for Haptic Virtual Environments” (TOMM Vol. 10, Issue 3) by Abdelwahab Hamam, Abdulmotaleb El Saddik and Jihad Alja’am.

The purpose of the named award is to recognize the most significant work in ACM TOMM (formerly TOMCCAP) in a given calendar year. The whole readership of ACM TOMM was invited to nominate articles published in Volume 10 (2014). Based on the nominations, the winner was chosen by the TOMM Editorial Board. The main assessment criteria were quality, novelty, timeliness, and clarity of presentation, in addition to relevance to multimedia computing, communications, and applications.

The winning paper is grounded on the observation that so far there is only limited research on Quality of Experience (QoE) for haptic-based virtual reality applications. To address this gap, the authors propose a human-centric taxonomy for the evaluation of QoE for haptic virtual environments. The QoE evaluation is applied through a fuzzy logic inference model. The taxonomy also gives guidelines for the evaluation of other multi-modal multimedia systems. This multi-modality was one of the main reasons for the selection of this article, and TOMM members expect that it will have an impact on future QoE studies in various sub-fields of multimedia research.

The award honors the founding Editor-in-Chief of TOMM, Nicolas D. Georganas, for his outstanding contributions to the field of multimedia computing and his significant contributions to ACM. He profoundly influenced multimedia research and the whole multimedia community.

The Editor-in-Chief Prof. Dr.-Ing. Ralf Steinmetz and the Editorial Board of ACM TOMM cordially congratulate the winners. The award will be presented to the authors at ACM Multimedia 2015 in Brisbane, Australia, and includes travel expenses for the winning authors.

Abdelwahab Hamam received his PhD in Electrical and Computer Engineering from the University of Ottawa, Canada, in 2013. He is currently a postdoctoral research scientist at Immersion in Montreal, Canada, focusing on research and development of novel haptic technologies and interactions. He was previously a teaching and research assistant at the University of Ottawa from Jan 2005 to May 2013. He has more than 35 academic papers and pending patent applications. He is the recipient of the best paper award at the 2015 QoMEX workshop. He is the technical co-chair of the 2015 Haptic Audio-Visual Environments and Games (HAVE) Workshop and the co-organizer of the 2015 QoMEX workshop special session on quality of experience in haptics. His research interests include haptic applications, medical simulations, and quality of experience for multimedia haptics.

Abdulmotaleb El Saddik is Distinguished University Professor and University Research Chair in the School of Electrical Engineering and Computer Science at the University of Ottawa. He is an internationally recognized scholar who has made strong contributions to the knowledge and understanding of multimedia computing, communications and applications. He has authored and co-authored four books and more than 450 publications. He has chaired more than 50 conferences and workshops and has received research grants and contracts totaling more than $18 million. He has supervised more than 100 researchers. He has received several international awards, including ACM Distinguished Scientist, Fellow of the Engineering Institute of Canada, Fellow of the Canadian Academy of Engineers, Fellow of IEEE, and the IEEE Canada Computer Medal.

Jihad Mohamed Alja’am received the Ph.D., M.S. and B.Sc. degrees in computing from Southern University (The National Council for Scientific Research, CNRS), France. He was with IBM-Paris as Project Manager and with RTS-France as IT Consultant for several years. He is currently with the Department of Computer Science and Engineering at Qatar University. His current research interests include multimedia, assistive technology, learning systems, human–computer interaction, stochastic algorithms, artificial intelligence, information retrieval, and natural language processing. Dr. Alja’am is a member of the editorial boards of the Journal of Soft Computing, American Journal of Applied Sciences, Journal of Computing and Information Sciences, Journal of Computing and Information Technology, and Journal of Emerging Technologies in Web Intelligence. He has served as a scientific committee member of several international conferences (ACIT, SETIT, ICTTA, ACTEA, ICLAN, ICCCE, MESM, ICENCO, GMAG, CGIV, ICICS, and ICOST). He is a regular reviewer for ACM Computing Reviews and the Journal of Supercomputing. He has collaborated with researchers in Canada, France, Malaysia, and the USA. He has so far published 138 papers and 8 book chapters in computing and information technology, appearing in conference proceedings, scientific books, and international journals. He is leading a research team in multimedia and assistive technology and collaborating in the Financial Watch and Intelligent Document Management System for Automatic Writer Identification projects.

ACM SIGMM Award for Outstanding PhD Thesis in Multimedia Computing, Communications and Applications 2015

Awardee

ACM Special Interest Group on Multimedia (SIGMM) is pleased to present the 2015 SIGMM Outstanding Ph.D. Thesis Award to Dr. Ting Yao and Honorable Mention recognition to Dr. Britta Meixner.

The award committee considers Dr. Yao’s dissertation entitled “Multimedia Search by Self, External, and Crowdsourcing Knowledge” worthy of the recognition as the thesis proposes an innovative knowledge transfer framework for multimedia search which is expected to have significant impact, especially in boosting the search performance for big multimedia data.

Dr. Yao’s thesis proposes the knowledge transfer methodology in three multimedia search scenarios:

  1. Seeking consensus among multiple modalities in the context of search re-ranking,
  2. Leveraging external knowledge as a prior to be transferred to a problem that belongs to a domain different from the external knowledge, and
  3. Exploring the large user click-through data as crowdsourced human intelligence for annotation and search.

The effectiveness of the proposed framework has been validated by thorough experiments. The framework makes substantial contributions to the principled integration of multimodal data, which is indispensable in multimedia search. The publications related to the thesis clearly demonstrate the major impact of this work in many research disciplines including multimedia, web, and information retrieval. The fact that parts of the proposed techniques have been and are being transferred to the commercial search service Bing further attests to the practical contributions of this thesis. Overall, the committee recognizes the significant impact and contributions presented in the thesis to the multimedia community.

Bio of Awardee

Dr. Ting Yao is an associate researcher in the Multimedia Search and Mining group at Microsoft Research, Beijing, China. His research interests are in multimedia search and computing. He completed a Ph.D. in Computer Science at City University of Hong Kong in 2014. He received the B.Sc. degree in theoretical and applied mechanics (2004), the B.Eng. double degree in electronic information engineering (2004), and the M.Eng. degree in signal and information processing (2008), all from the University of Science and Technology of China, Hefei, China. The system he designed achieved second place in the THUMOS action recognition challenge at CVPR 2015. He was also the principal designer of the image retrieval systems that ranked third and fifth in the MSR-Bing image retrieval challenge at ACM MM 2014 and 2013, respectively. He received the Best Paper Award of ACM ICIMCS (2013).

Honorable Mention

The award committee is pleased to present the Honorable Mention to Dr. Britta Meixner for the thesis entitled: “Annotated Interactive Non-linear Video – Software Suite, Download and Cache Management.”

The thesis presents a fully functional software suite for authoring non-linear interactive videos with downloading and cache management mechanisms for effective video playback. The committee is greatly impressed by the thorough study presented in the thesis, with its extensive analysis of the properties of the software suite. The implementation, which has been made available as open source software along with the thesis, undoubtedly has very high potential impact on the multimedia community.

Bio of Awardee

Dr. Britta Meixner received her Master’s degree (German Diplom) in Computer Science from the University of Passau, Germany, in 2008. Furthermore, she received the First State Examination for Lectureship at Secondary Schools for the subjects Computer Science and Mathematics from the Bavarian State Ministry for Education and Culture in 2008. She received her Ph.D. degree from the University of Passau, Germany, in 2014. The title of her thesis is “Annotated Interactive Non-linear Video – Software Suite, Download and Cache Management.” She is currently a postdoctoral research fellow with the University of Passau, Germany, and will be a postdoctoral research fellow at FXPAL, Palo Alto, CA, USA, starting October 2015. Her research interest is mainly in hypermedia. She is a winner of the 2015 “Women + Media Technology” award, granted by Germany’s public broadcasters ARD and ZDF (ARD/ZDF Förderpreis “Frauen + Medientechnologie” 2015). She was a Reviewer for the Springer Multimedia Tools and Applications (MTAP) Journal, an Organizer of the “International Workshop on Interactive Content Consumption (WSICC)” at ACM TVX in 2014 and 2015, and Associate Chair at ACM TVX 2015.

Announcement of ACM SIGMM Rising Star Award 2015

ACM Special Interest Group on Multimedia (SIGMM) is pleased to present this year’s Rising Star Award in multimedia computing, communications and applications to Dr. Yu-Gang Jiang. The ACM SIGMM Rising Star Award recognizes a young researcher who has made outstanding research contributions to the field of multimedia computing, communications and applications during the early part of his or her career.

Dr. Yu-Gang Jiang has made fundamental contributions in the area of video analysis and retrieval, especially with innovative approaches to large-scale video concept detection. He has been an active leader in exploring the bag-of-visual-words (BoW) representation for concept detection, providing influential insights on the critical representation design. He proposed the important idea of “soft-weighting” in his CIVR 2007 paper, which significantly advanced the performance of visual concept detection.

Dr. Jiang has proposed several important techniques for video and image search. In 2009, he proposed a novel domain adaptive concept selection method for concept-based video search. His method selects the most relevant concepts for a given query, considering not only the semantic concept-to-query relatedness but also the data distribution in the target domain. Recently he proposed a method that generates query-adaptive hash codes for improved visual search, with which a finer-grained ranking of search results can be achieved compared to traditional hashing-based methods. His most recent work is in the emerging field of video content recognition by deep learning, where he proposed a comprehensive deep learning framework to model static, short-term motion and long-term temporal information in videos. Very promising results were obtained on the widely used UCF101 dataset.

As a postdoctoral researcher at Columbia University and later as a faculty member at Fudan University, Dr. Jiang has devoted significant efforts to video event recognition, a problem that is receiving increasing attention in the multimedia community. His extensive contributions in this area include not only innovative algorithm design, but also large benchmark construction, system development, and survey tutorials. He devised a comprehensive system in 2010 using multimodal features, contextual concepts and temporal clues, which won the multimedia event detection (MED) task in NIST TRECVID 2010. He constructed the Columbia Consumer Video (CCV) benchmark in 2011, which has been widely used. More recently, he has continued to lead major efforts in creating and sharing large-scale video datasets in critical areas (including 200+ event categories and 100,000 partial-copy videos) as community resources.

The high impact of his work is reflected in its high number of citations. His recent paper on video search result organization received the Best Poster Paper Award at ACM MM 2014. His shared benchmark datasets and source code have been used worldwide. In addition, he has made extensive contributions to the professional communities by serving as a conference program chair, invited speaker, and tutorial expert. In summary, Dr. Yu-Gang Jiang receives the 2015 ACM SIGMM Rising Star Award for his significant contributions in the areas of video content recognition and search.

ACM SIGMM Award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications

The 2015 winner of the prestigious ACM Special Interest Group on Multimedia (SIGMM) award for Outstanding Technical Contributions to Multimedia Computing, Communications and Applications is Prof. Dr. Tat-Seng Chua. The award is given in recognition of his pioneering contributions to multimedia, text and social media processing. Tat-Seng Chua is a leading researcher in multimedia, text and social media analysis and retrieval. He is one of the few researchers who have made substantial contributions in the fields of multimedia, information retrieval and social media.

Dr. Chua’s contributions in multimedia date back to the early 1990s, when he was among the first to work on image retrieval with relevance feedback (1991), video retrieval and sequencing by exploring metadata and cinematic rules (1995), and fine-grained image retrieval at segment level (1995). These works helped shape the development of the field for many years. Given the limitations of visual content analysis, his research advocates the integration of text, metadata and visual contents coupled with domain knowledge for large-scale media analysis. He developed a multi-source, multi-modal and multi-resolution framework, together with the involvement of humans in the loop, for such analysis and retrieval tasks. This has helped his group not only publish papers in top conferences and journals, but also achieve top positions in large-scale video evaluations when his group participated in TRECVID in 2000-2006 and VideOlympics in 2007-09, as well as win the highly competitive Star (Multimedia) Challenge in 2008. Leveraging this experience, he developed a large-scale multi-label image test set named NUS-WIDE, which has been widely used, with over 600 citations. He recently started a company named ViSenze Pte Ltd (www.visenze.com) to commercialize his research in mobile visual fashion search.

In his more recent research work in multimedia question-answering (MMQA), he developed a joint text-visual model to exploit correlation between text queries, text-based answers, and visual concepts in images and videos to return both relevant text and video answers. The early work was carried out in the domain of news video (2003), which has motivated several follow-on works in image QA. His recent works tackled the more complicated “how-to” type QA in product domains (2010-13). His recent works (2013-14) exploited SemanticNet to perform attribute-based image retrieval and used various types of domain knowledge. His current work aims to build a live, continuous-learning system to support the dynamic annotation and retrieval of images and micro videos in social media streams.

In information retrieval and social media research, Dr. Chua focused on the key problems of organizing large-scale unstructured text contents to support question-answering (QA). His works point towards the use of linguistics and domain knowledge for effective large-scale information analysis, organization and retrieval. Given his strong interest in both multimedia and text processing, it is natural for him to venture into social media research that involves the analysis of text, multimedia, and social network contents. His group developed a live social observatory system to carry out research in building descriptive, predictive and prescriptive analytics of multiple live social media streams. The system has been well recognized by peers. His recent work on “multi-screen social TV” won the 2015 IEEE Multimedia Best Paper Award.

Dr. Chua has been involved in most key conferences in these areas by serving as general chair, technical program chair, or invited keynote speaker, as well as by leading innovative research and winning many best paper or best student paper awards in recent years. He is the Steering Committee Chair of two international multimedia conference series: ACM ICMR (International Conference on Multimedia Retrieval) and MMM (MultiMedia Modeling). In summary, he is an extraordinarily accomplished and outstanding researcher in multimedia, text and social media processing, truly exemplifying the characteristics of the ACM SIGMM Award for Outstanding Technical Contributions.