Deep Learning in the Browser: TensorFlow.js

Authors: Matthieu Pizenberg, Axel Carlier, Emmanuel Faure, Vincent Charvillat, Marco Bertini, Mathias Lux;
Affiliation: University of Toulouse, CNRS - IRIT, University of Klagenfurt, University of Florence;
Editors: Mathias Lux and Marco Bertini

URL: https://js.tensorflow.org/

Having already discussed MatConvNet and Keras, let us continue with an open source deep learning framework that takes a new and interesting approach. TensorFlow.js not only provides deep learning for JavaScript developers, it also makes deep learning applications available in WebGL-enabled web browsers, more specifically Chrome, Chromium-based browsers, Safari and Firefox. Recently, Node.js support has been added, so TensorFlow.js can also be used to control TensorFlow directly, without a browser.

TensorFlow.js is easy to install: as soon as a browser is installed, one is ready to go. Browser-based, cross-platform applications, e.g. those running on Electron, can also use TensorFlow.js without an additional install. Performance, however, depends on the client's browser as well as on the memory and GPU of the client device. One cannot, for instance, expect to analyze 4K video on a mobile phone in real time.
While TensorFlow.js is easy to install and easy to develop with, there are drawbacks: (i) developers have less control over where the machine learning actually takes place (e.g. on the CPU or the GPU), and the code runs in the same sandbox as every other web page in the browser, and (ii) the current release still has rough edges and is not considered stable enough for production use.
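
Although control over the execution device is limited, it is at least possible to ask TensorFlow.js which backend it ended up using. The following is a minimal sketch, assuming that the global tf object created by the TensorFlow.js script tag is available and that tf.ready() and tf.getBackend() behave as in recent releases. The helper names backendMessage and describeBackend are hypothetical and chosen only for illustration.

```javascript
// Sketch: report which TensorFlow.js backend is active.
// Assumption: `tf` is the global created by the TensorFlow.js script tag.
// It is passed in as a parameter so the helper can be tried with a stand-in object.

// Turn a backend name into a human-readable message (hypothetical helper).
function backendMessage(backend) {
  return backend === 'webgl'
      ? 'Running on the GPU through WebGL'
      : `Running on the "${backend}" backend`;
}

// Wait until a backend has been initialized, then describe it.
async function describeBackend(tf) {
  await tf.ready();
  return backendMessage(tf.getBackend());
}

// Only runs where TensorFlow.js has actually been loaded.
if (typeof tf !== 'undefined') {
  describeBackend(tf).then((message) => console.log(message));
}
```

Logging this message on a few target devices is a cheap way to find out whether a page silently fell back to the CPU.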

Introduction

TensorFlow.js is the successor of deeplearn.js, a Google project to support WebGL-accelerated machine learning in the browser. However, while deeplearn.js focused on making machine learning in the browser possible, TensorFlow.js brings TensorFlow to the browser. The result is the same, machine learning in the browser, but TensorFlow.js provides a Keras-like high-level API as well as a low-level, tensor-based API, both of which are instantly familiar to developers used to TensorFlow.

Generally speaking there are three things you can do with TensorFlow.js:

Import a pre-trained model for inference. TensorFlow based models can be converted and used within TensorFlow.js.
Re-train an imported model. Existing models can be adapted to specific needs with transfer learning in the browser.
Author models directly in the browser. Using the API, models can also be created from scratch and trained directly in TensorFlow.js.
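
To make the third option more concrete, here is a minimal sketch of authoring and training a model with the Keras-like layers API. It assumes the global tf object from the TensorFlow.js script tag. The function name trainLineFit and the toy data are made up for illustration and do not come from the TensorFlow.js documentation.

```javascript
// Sketch: author and train a tiny model directly in the browser.
// Assumption: `tf` is the global provided by the TensorFlow.js script tag.
async function trainLineFit(tf) {
  // A single dense unit can learn a line y = a * x + b.
  const model = tf.sequential();
  model.add(tf.layers.dense({units: 1, inputShape: [1]}));
  model.compile({optimizer: 'sgd', loss: 'meanSquaredError'});

  // Toy data following y = 2x - 1 (made up for illustration).
  const xs = tf.tensor2d([-1, 0, 1, 2, 3, 4], [6, 1]);
  const ys = tf.tensor2d([-3, -1, 1, 3, 5, 7], [6, 1]);
  await model.fit(xs, ys, {epochs: 200});

  // Predict for x = 10 and read the result back as a plain number.
  const prediction = model.predict(tf.tensor2d([10], [1, 1]));
  const [value] = await prediction.data();
  return value;
}

// Only runs where TensorFlow.js has actually been loaded.
if (typeof tf !== 'undefined') {
  trainLineFit(tf).then((y) => console.log('Prediction for x = 10:', y));
}
```

Training happens entirely on the client, so the number of epochs and the model size should be chosen with the weakest target device in mind.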

Getting Started With a Pre-Trained Model

A simple example shows the utility of TensorFlow.js. We use a pre-trained model for speech commands [3] to recognize spoken commands and change slides in an HTML and JavaScript presentation based on Reveal.js [4]. The first step is to create an HTML file for the presentation. In the head, we import Reveal.js as well as TensorFlow.js and the speech commands model:

<!-- Reveal.js -->
<script src="https://cdnjs.cloudflare.com/ajax/libs/reveal.js/3.7.0/js/reveal.js"></script>
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/reveal.js/3.7.0/css/reveal.min.css" />
<link rel="stylesheet" href="https://cdnjs.cloudflare.com/ajax/libs/reveal.js/3.7.0/css/theme/white.min.css" />
<!-- TensorFlow.js -->
<script src="https://unpkg.com/@tensorflow/tfjs"></script>
<script src="https://unpkg.com/@tensorflow-models/speech-commands"></script>

In the body, we define a simple presentation and point to the JavaScript source code:

<!-- slides -->
<div class="reveal">
 <div class="slides">
 <section>Slide 1</section>
 <section>Slide 2</section>
 <section>Slide 3</section>
 <section>Slide 4</section>
 </div>
</div>
<!-- JavaScript code -->
<script src="index.js"></script>

The main program resides in the JavaScript file. There, the function app() initializes the presentation and loads the model. The model is then used to listen continuously and classify speech commands. Slides are changed when the commands “left” and “right” are recognized. The probabilityThreshold option controls how confident the model must be before it fires a recognition event:

let recognizer;
function predictWord() {
  // Array of words that the recognizer is trained to recognize.
  const words = recognizer.wordLabels();
  recognizer.listen(({scores}) => {
    // Turn scores into a list of (score, word) pairs.
    scores = Array.from(scores).map((s, i) => ({score: s, word: words[i]}));
    // Find the most probable word.
    scores.sort((s1, s2) => s2.score - s1.score);
    console.log(scores[0].word);
    if (scores[0].word === 'right') Reveal.navigateNext();
    else if (scores[0].word === 'left') Reveal.navigatePrev();
  }, {probabilityThreshold: 0.75});
}

async function app() {
  Reveal.initialize(); // init Reveal.js
  recognizer = speechCommands.create('BROWSER_FFT');
  await recognizer.ensureModelLoaded();
  predictWord();
}

app();

Note that the same code works on mobile phones as well as on desktop computers. As described in [2], the model can be re-trained to react to different sounds, such as claps, whistles or finger snaps.
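
The decision step inside the listen callback can also be isolated into small functions that do not depend on the browser, which makes it easier to swap in other labels after re-training. The following is one possible sketch, not the approach of the tutorial in [2]. The names ACTIONS, topWord, actionFor and listenForCommands are hypothetical, and the recognizer and Reveal objects are passed in as parameters.

```javascript
// Hypothetical mapping from recognized labels to slide actions.
const ACTIONS = {right: 'next', left: 'prev'};

// Return the label with the highest score, together with that score.
// Works for plain arrays and typed arrays such as Float32Array.
function topWord(scores, words) {
  let best = 0;
  for (let i = 1; i < scores.length; i++) {
    if (scores[i] > scores[best]) best = i;
  }
  return {word: words[best], score: scores[best]};
}

// Map the most probable label to an action, or null if it has none.
function actionFor(scores, words, actions = ACTIONS) {
  const {word} = topWord(scores, words);
  return Object.prototype.hasOwnProperty.call(actions, word) ? actions[word] : null;
}

// Wire the decision step to a recognizer and a Reveal.js instance.
function listenForCommands(recognizer, reveal, actions = ACTIONS) {
  const words = recognizer.wordLabels();
  return recognizer.listen(async ({scores}) => {
    const action = actionFor(scores, words, actions);
    if (action === 'next') reveal.navigateNext();
    else if (action === 'prev') reveal.navigatePrev();
  }, {probabilityThreshold: 0.75});
}
```

In the example above, predictWord() could then be replaced by a call such as listenForCommands(recognizer, Reveal). After re-training on other sounds, only the mapping needs to change, for instance to a hypothetical {clap: 'next', whistle: 'prev'}.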

Conclusions

TensorFlow.js is one of the most interesting developments of the last year. While TensorFlow has become a popular framework for machine learning research, TensorFlow.js seems to be a nice add-on for demonstrating deep learning models in web browsers, e.g. on mobile phones. In the field of image analysis, MobileNet [6] and PoseNet [7] are demos that can be integrated smoothly into a web application and use the webcam to showcase interactively how they work. Using these pre-trained models, which have already been ported to TensorFlow.js [3], is as easy as shown above. However, converting custom-trained models from TensorFlow to TensorFlow.js is currently not well supported. Although TensorFlow provides a converter [8] for porting TensorFlow models to a valid TensorFlow.js format, not all TensorFlow operations and layers are supported yet.

However, TensorFlow.js is under active development, and new releases are issued on a nearly monthly basis. Since the beginning of 2019, the range of supported operations and layers has increased considerably, and more often than not the community offers useful workarounds for features that are not yet integrated. On top of that, there are even frameworks building on TensorFlow.js. If the main use case is to employ pre-existing models, there is the option to use ml5.js [5], a wrapper that focuses on making TensorFlow.js even more accessible.

We think this framework has high potential to become important in the next few years and recommend following its development closely over the next few months, since there is an active community and, with enough perseverance, current limitations can usually be worked around.

References

[1] TensorFlow.js, https://js.tensorflow.org/, accessed 2019-02-01
[2] Tutorial: Build an audio recognition model using TensorFlow.js, https://codelabs.developers.google.com/codelabs/tensorflowjs-audio-codelab, accessed 2019-02-01
[3] Pre-trained models for TensorFlow.js, https://github.com/tensorflow/tfjs-models, accessed 2019-02-01
[4] Reveal.js, https://github.com/hakimel/reveal.js, accessed 2019-02-01
[5] Friendly Machine Learning for the Web, https://ml5js.org/, accessed 2019-02-01
[6] MobileNet, https://github.com/tensorflow/tfjs-models/tree/master/mobilenet, accessed 2019-02-05
[7] PoseNet, https://github.com/tensorflow/tfjs-models/tree/master/posenet, accessed 2019-02-05
[8] TensorFlow.js Converter, https://github.com/tensorflow/tfjs-converter, accessed 2019-02-05