Authors / Contributors: Mathias Lux, Marco Bertini;
Affiliation: University of Klagenfurt, University of Florence;
Editors: Mathias Lux and Marco Bertini
URL: https://keras.io
Following the last column on MatConvNet, let us continue to look at open source frameworks for deep learning. In this column we look at Keras, a Python API that can run on top of several backends, such as TensorFlow and CNTK. It also supports Theano, although the original developers halted development of that framework in 2017.
This makes Keras very useful, since the same code can be executed on different backends without dealing with backend-specific details. TensorFlow has its own implementation of the Keras API in tf.keras, and in the forthcoming 2.0 version it is going to be even more tightly integrated as the high-level API. Amazon has developed a version that also supports MXNet, the backend of choice on AWS, e.g. in the Deep Learning AMI. The latest version of this port is on par with the current official version of Keras, 2.2.4. Other backends support Keras as well, such as PlaidML, which supports GPUs other than NVIDIA ones.
Good reasons to use Keras are: an incredibly terse and well-designed API that simplifies development and makes it easy to tinker; independence from the backend; and compatibility with both Python 2.7 and 3.6, which allows using it in older projects as well as developing more future-proof code.
Keras is a great tool for both research and teaching: it provides many pre-trained networks and datasets ready for download, and it can be used, for example, in a Google Colab environment with GPUs or TPUs for the TensorFlow backend.
Introduction
Provided that a backend is installed on your machine, installing Keras is just a matter of using pip or Conda. On Google Colab, start a notebook, either Python 2 or 3, and optionally select GPU acceleration for the TensorFlow backend. Keras now comes preinstalled there, so there is no need for a !pip install keras in Colab's code cells.
The basic building block of Keras is the model, which represents the structure of the network. The simplest type of model is a stack of layers, created with the Sequential model by adding one layer after the other, e.g. dense (fully connected) layers. The model is then compiled, trained and used through fit/predict methods similar to those of Scikit-learn. Of course, it is possible to develop more modern network architectures than a VGG16 net, using a functional API that allows building graphs of layers.
Getting started
To get the gist of the API, let's see how to download a dataset and then create and train a simple network. The following code can be pasted into Google Colab to immediately see how it works:
# load dataset (download it the first time that it is requested)
from keras.datasets import mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# visualize an image using the Matplotlib library
import matplotlib.pyplot as plt
example_digit = train_images[4]  # example at index 4 (the fifth one) of the training set
plt.imshow(example_digit, cmap=plt.cm.binary)
plt.show()
Keras provides other ready-made datasets that are useful while learning the API, in lectures, or while testing code on a small scale, like CIFAR10, CIFAR100 or Fashion-MNIST. To import the latter use:
from keras.datasets import fashion_mnist
(x_train, y_train), (x_test, y_test) = fashion_mnist.load_data()
Then, create a simple network and prepare it for training:
from keras import models
from keras import layers

network = models.Sequential()
network.add(layers.Dense(512, activation='relu', input_shape=(28 * 28,)))
network.add(layers.Dense(10, activation='softmax'))
network.compile(optimizer='rmsprop',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
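To get a feeling for the size of this network, the short sketch below counts its parameters by hand. It does not use Keras at all; `dense_params` is a hypothetical helper introduced only for this illustration, and it assumes the usual dense layer with one weight per input/unit pair plus one bias per unit.

```python
# Hypothetical helper (not part of Keras): number of parameters of a dense layer.
def dense_params(n_inputs, n_units):
    # one weight per (input, unit) pair, plus one bias per unit
    return n_inputs * n_units + n_units

hidden = dense_params(28 * 28, 512)  # first Dense layer, fed with flattened 28x28 images
output = dense_params(512, 10)       # softmax layer with one unit per digit class
total = hidden + output

print(hidden, output, total)
```

The totals should agree with what network.summary() reports for the model defined above.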
Before training, we need to reshape the input into the format expected by the network and rescale its values to the [0, 1] range; we also need to encode the labels of the dataset as categorical (one-hot) vectors:
from keras.utils import to_categorical

train_images = train_images.reshape((60000, 28 * 28))
train_images = train_images.astype('float32') / 255
test_images = test_images.reshape((10000, 28 * 28))
test_images = test_images.astype('float32') / 255

train_labels = to_categorical(train_labels)
test_labels = to_categorical(test_labels)
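The two preprocessing steps can be illustrated without Keras or NumPy. The sketch below is only a plain-Python approximation of what the reshape/rescale and to_categorical calls do; the function names `flatten_and_rescale` and `one_hot` and the tiny 2x2 "image" are made up for this example.

```python
def flatten_and_rescale(image):
    """Flatten a 2D grid of 0-255 pixel values into one list of floats in [0, 1]."""
    return [pixel / 255 for row in image for pixel in row]

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows (the idea behind to_categorical)."""
    rows = []
    for label in labels:
        row = [0.0] * num_classes
        row[label] = 1.0
        rows.append(row)
    return rows

tiny_image = [[0, 255], [51, 102]]  # made-up 2x2 "image"
flat = flatten_and_rescale(tiny_image)
encoded = one_hot([2, 0], num_classes=3)

print(flat)
print(encoded)
```

For MNIST the same idea turns each 28x28 image into a vector of 784 values and each label into a vector of 10 values.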
Now we are ready to train the network (for 5 epochs) and then evaluate its accuracy on the test set:
network.fit(train_images, train_labels, epochs=5, batch_size=128)
test_loss, test_acc = network.evaluate(test_images, test_labels)
print('test_acc:', test_acc)
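To make the two numbers returned by evaluate less of a black box, here is a minimal plain-Python sketch of a softmax output, the categorical cross-entropy loss and the accuracy metric. It is an illustration under simplifying assumptions (a tiny made-up batch, a loss averaged over samples), not the code that Keras runs; all function names are ours.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)                              # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_crossentropy(one_hot_target, probs):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(p) for t, p in zip(one_hot_target, probs) if t > 0)

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(one_hot_targets, batch_probs):
    """Fraction of samples whose most probable class is the true one."""
    hits = sum(argmax(t) == argmax(p) for t, p in zip(one_hot_targets, batch_probs))
    return hits / len(one_hot_targets)

# Made-up batch: two samples, three classes
targets = [[1, 0, 0], [0, 1, 0]]
scores = [[2.0, 1.0, 0.1], [3.0, 0.5, 0.2]]
probs = [softmax(s) for s in scores]

mean_loss = sum(categorical_crossentropy(t, p) for t, p in zip(targets, probs)) / len(targets)
acc = accuracy(targets, probs)
print(mean_loss, acc)
```

In this toy batch the second sample is misclassified, so the accuracy is 0.5 while the loss also reflects how confident each prediction was.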
A network with the same structure, here with a smaller hidden layer of 32 units, could be written using the more advanced functional API as:
# Functional API
input_tensor = layers.Input(shape=(784,))
x = layers.Dense(32, activation='relu')(input_tensor)
output_tensor = layers.Dense(10, activation='softmax')(x)
model = models.Model(inputs=input_tensor, outputs=output_tensor)
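The key idea of the functional API is that a layer is a callable object: calling it on a tensor returns a new tensor, and chaining such calls builds a graph. The toy sketch below mimics only this calling pattern in plain Python; `SymbolicTensor` and `ToyDense` are hypothetical stand-ins invented for this illustration and have nothing to do with the real Keras classes.

```python
class SymbolicTensor:
    """Hypothetical stand-in for a Keras tensor: it only remembers shape and origin."""
    def __init__(self, shape, history=()):
        self.shape = shape
        self.history = history

class ToyDense:
    """Hypothetical stand-in for layers.Dense: calling it returns a new tensor."""
    def __init__(self, units):
        self.units = units

    def __call__(self, tensor):
        # a dense layer maps any input vector to a vector with `units` entries
        step = "Dense({})".format(self.units)
        return SymbolicTensor((self.units,), tensor.history + (step,))

input_tensor = SymbolicTensor((784,), ("Input(784)",))
x = ToyDense(32)(input_tensor)
output_tensor = ToyDense(10)(x)

print(" -> ".join(output_tensor.history))
print(output_tensor.shape)
```

Because every intermediate tensor is an ordinary variable, it can be fed to more than one layer or merged with other tensors, which is how non-sequential architectures are described.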
Regardless of the chosen API, once the model has been defined we need to specify the loss, optimizer and metrics using the compile method seen before.
As an example of using a pre-trained network, let's start with VGG16:
from keras.applications.vgg16 import VGG16
model = VGG16(weights='imagenet')
Again, the network weights are downloaded on first use and stored in a .keras directory in the user's home directory.
Then, get an image of one of the classes used in ImageNet, e.g. an elephant from Wikipedia. When using Google Colab it is enough to prepend a ! to the wget command used to download the image:
!wget https://upload.wikimedia.org/wikipedia/commons/thumb/3/37/African_Bush_Elephant.jpg/1280px-African_Bush_Elephant.jpg
For pre-trained networks Keras provides functions that preprocess the input; in the case of VGG, this is a channel-wise color normalization.
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input, decode_predictions
import numpy as np

# The local path to our target image
img_path = './1280px-African_Bush_Elephant.jpg'

# `img` is a PIL image of size 224x224
img = image.load_img(img_path, target_size=(224, 224))

# `x` is a float32 Numpy array of shape (224, 224, 3)
x = image.img_to_array(img)

# We add a dimension to transform our array into a "batch"
# of size (1, 224, 224, 3)
x = np.expand_dims(x, axis=0)

# Finally we preprocess the batch
# (this does channel-wise color normalization)
x = preprocess_input(x)
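To see what this normalization amounts to, the sketch below applies it to a single pixel in plain Python. It assumes the Caffe-style convention that, to our knowledge, the VGG16 preprocess_input follows: channels are reordered from RGB to BGR and the ImageNet mean of each channel is subtracted, with no further scaling. The function name `preprocess_pixel` is ours.

```python
# ImageNet channel means in BGR order (assumed Caffe-style preprocessing)
IMAGENET_BGR_MEANS = (103.939, 116.779, 123.68)

def preprocess_pixel(rgb):
    """Reorder one RGB pixel to BGR and subtract the per-channel means."""
    r, g, b = rgb
    bgr = (b, g, r)
    return tuple(value - mean for value, mean in zip(bgr, IMAGENET_BGR_MEANS))

result = preprocess_pixel((255, 128, 0))  # a made-up orange pixel
print(result)
```

The real function does the same for every pixel of every image in the batch, so the values fed to the network are centered around zero rather than lying in the 0-255 range.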
Then we feed the processed image to the network for inference:
preds = model.predict(x)
print('Predicted:', decode_predictions(preds, top=3)[0])
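The decoding step essentially ranks the 1000 class probabilities and attaches human-readable names to the best ones. A minimal plain-Python sketch of this idea follows; the function `top_k`, the four class names and the probabilities are made up for illustration, and the real decode_predictions also returns the ImageNet class identifier for each entry.

```python
def top_k(probabilities, class_names, k=3):
    """Return the k (name, probability) pairs with the highest probability."""
    ranked = sorted(zip(class_names, probabilities),
                    key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Illustrative subset of classes with made-up probabilities
names = ["tusker", "African_elephant", "Indian_elephant", "tabby"]
probs = [0.20, 0.65, 0.10, 0.05]

best = top_k(probs, names, k=3)
print(best)
```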
If, instead, we want to use VGG16 to extract features, or to fine-tune it for another domain, we can use:
from keras.applications import VGG16

conv_base = VGG16(weights='imagenet',
                  include_top=False,
                  input_shape=(150, 150, 3))
conv_base.summary()
The summary() method shows the design of the network. In this case the VGG16 dense layers are not included, so that we can use the first layers of the network to extract features, and add our own dense layers to train them to solve other classification tasks:
from keras import models
from keras import layers

model = models.Sequential()
model.add(conv_base)
model.add(layers.Flatten())
model.add(layers.Dense(256, activation='relu'))
model.add(layers.Dense(10, activation='softmax'))
model.summary()
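It can help to work out by hand what the Flatten layer receives. The sketch below assumes the standard VGG16 layout: five convolutional blocks, each ending in a 2x2 max pooling that halves the spatial size (rounding down), with 512 channels in the last block. The function name is ours and no Keras call is involved.

```python
def vgg16_conv_output_shape(height, width):
    """Shape of the feature map produced by the VGG16 convolutional base."""
    # the 3x3 convolutions keep the spatial size; only the five poolings shrink it
    for _ in range(5):
        height //= 2
        width //= 2
    return (height, width, 512)

shape = vgg16_conv_output_shape(150, 150)
flattened = shape[0] * shape[1] * shape[2]

print(shape, flattened)
print(vgg16_conv_output_shape(224, 224))  # the default ImageNet input size
```

The flattened size is the number of inputs of the first new dense layer, which should match the shapes listed by model.summary().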
If we don't want to fine-tune VGG, we freeze the conv_base layers:
print('This is the number of trainable weight tensors (kernels + biases) '
      'before freezing the conv base:', len(model.trainable_weights))

conv_base.trainable = False

print('This is the number of trainable weight tensors (kernels + biases) '
      'after freezing the conv base:', len(model.trainable_weights))
Calling model.summary() again shows that the number of trainable parameters has been greatly reduced by this operation.
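The size of this reduction can be estimated with a few lines of plain Python. The sketch below assumes the standard VGG16 configuration of thirteen 3x3 convolutions and the 150x150 input used above, which yields a 4x4x512 feature map; the helper names are ours and the figures are computed by hand rather than read from Keras.

```python
# Output channels of the thirteen 3x3 convolutions of VGG16 (assumed standard layout)
VGG16_CONV_FILTERS = [64, 64, 128, 128, 256, 256, 256, 512, 512, 512, 512, 512, 512]

def conv_base_params(filters, in_channels=3, kernel=3):
    """Total parameters of a chain of kernel x kernel convolutions with biases."""
    total = 0
    for out_channels in filters:
        total += kernel * kernel * in_channels * out_channels + out_channels
        in_channels = out_channels
    return total

def dense_params(n_inputs, n_units):
    return n_inputs * n_units + n_units

conv_params = conv_base_params(VGG16_CONV_FILTERS)
head_params = dense_params(4 * 4 * 512, 256) + dense_params(256, 10)

trainable_before = conv_params + head_params  # everything can be updated
trainable_after = head_params                 # only the new dense layers remain

# each convolution and each dense layer contributes a kernel and a bias tensor
tensors_before = 2 * len(VGG16_CONV_FILTERS) + 4
tensors_after = 4

print(trainable_before, trainable_after)
print(tensors_before, tensors_after)
```

Under these assumptions, freezing leaves roughly one eighth of the parameters trainable, and the two tensor counts correspond to the values printed by the len(model.trainable_weights) calls.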
Conclusions
Apart from these simple examples that show the API, Keras provides all the facilities a researcher needs, such as easy extension with new layers, cost functions, etc.
The pre-trained networks include ResNet, ResNeXt, and MobileNet variants, and the library also has facilities for data augmentation. Additional features are available in a dedicated GitHub repo, keras-contrib, which contains material that is kept separate from the main library so as to preserve its ease of use.
Many recent works have a Keras version available on GitHub, like VGGFace, RetinaNet, YOLOv3, GANs, etc. The popularity of the library and its increasing integration with TensorFlow make it a good candidate as the tool of choice for the development of deep learning systems.