Google recently announced the availability of GPUs on Google Compute Engine instances. For my deep learning experiments I often need beefier GPUs than the puny GTX 750Ti in my desktop workstation, so this was good news. To make the GCE offering even more attractive, GPU instances are also available in the EU datacenters, which is a big plus in terms of latency for me here on the southern tip of the African continent.

Last night I had some time to try this out, and in this post I would like to share with you all the steps I took to:

  1. Get a GCE instance with GPU up and running with miniconda, TensorFlow and Keras
  2. Create a reusable disk image with all software pre-installed so that I could bring up new instances ready-to-roll at the drop of a hat.
  3. Apply the pre-trained ResNet50 deep neural network to images from the web, as a demonstration that the above works. Thanks to Keras, this step is fun and fantastically straightforward.

Pre-requisites

I started by creating a project for this work. On the Compute Engine console, check that this project is active at the top.

Before I was able to allocate GPUs to my instance, I had to fill in the “request quota increase” form available from the Compute Engine quotas page. My request for two GPUs in the EU region was approved within minutes.

I installed my client workstation’s id_rsa.pub public SSH key as a project-wide SSH key via the metadata screen.
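The same project-wide SSH key setup can also be done from the command line with gcloud; the file name below is an assumption, and each line in it has the form username:public-key:

```shell
# ssh-keys.txt contains e.g.:  cpbotha:ssh-rsa AAAA... cpbotha@workstation
gcloud compute project-info add-metadata --metadata-from-file ssh-keys=ssh-keys.txt
```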

Start an instance for the first time

I configured my GPU instance as shown in the following screenshot:

gce-create-instance.png

  - Under Machine type, switch to Customize to be able to select a GPU.
  - I selected an Ubuntu 16.04 image, and changed the persistent disk to SSD.
  - I selected the europe-west1-b zone. Choose whatever is closest for you. The interface will warn you if the selected zone does NOT support GPUs.

After this, click on the Create button and wait for your instance to become ready.

Once it's up and running, you'll be able to ssh to the displayed public IP. I used the ssh client on my workstation, but of course you could opt for the Google-supplied web-based version.

Install NVIDIA drivers and CUDA

I used the following handy script (to be run as root) from the relevant GCE documentation:

#!/bin/bash
echo "Checking for CUDA and installing."
# Check for CUDA and try to install.
if ! dpkg-query -W cuda; then
  # The 16.04 installer works with 16.10.
  curl -O http://developer.download.nvidia.com/compute/cuda/repos/ubuntu1604/x86_64/cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  dpkg -i ./cuda-repo-ubuntu1604_8.0.61-1_amd64.deb
  apt-get update
  apt-get install cuda -y
fi
After this, download the cuDNN debs from the NVIDIA download site at https://developer.nvidia.com/cudnn using your developer account. Install the two debs using dpkg -i.

To confirm that the drivers have been installed, run the nvidia-smi command:

gce-nvidia-smi.png

Install miniconda, TensorFlow and Keras

I usually download the 64-bit Linux miniconda installer from conda.io and then install it into ~/miniconda3 by running the downloaded .sh script.

After this, I installed TensorFlow 1.0.1 and Keras 2.0.1 into a new conda environment by doing:

conda create -n ml python=3.6
source ~/miniconda3/bin/activate ml
conda install jupyter pandas numpy scipy scikit-image
pip install --ignore-installed --upgrade https://storage.googleapis.com/tensorflow/linux/gpu/tensorflow_gpu-1.0.1-cp36-cp36m-linux_x86_64.whl
pip install keras h5py
The keras package also installed Theano, which I then uninstalled using pip uninstall theano in the active ml environment.

To test, run ipython and then type import keras. It should look like this:

gce-import-keras.png

Note that it's picking up the TensorFlow backend, and successfully loading all of the CUDA libraries, including cuDNN.

Save your disk as an image for later

You will get billed for each minute that the instance is running. You also get billed for persistent disks that are still around, even if they are not used by any instance.

Creating a reusable disk image will enable you to delete instances and disks, and later to restart an instance with all of your software already installed.

To do this, follow the steps in the documentation, which I paraphrase and extend here:

  1. Stop the instance.
  2. In the instance list, click on the instance name itself; this will take you to the edit screen.
  3. Click the edit button, and then uncheck Delete boot disk when instance is deleted.
  4. Click the save button.
  5. Delete the instance, but double-check that delete boot disk is unchecked in the confirmation dialog.
  6. Now go to the Images screen and select Create Image with the boot disk as source.

Next time, go to the Images screen, select your image and then select Create Instance. That instance will come with all of your goodies ready to go!
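If you prefer the command line, the same stop / keep-disk / delete / image steps can be sketched with gcloud; the instance and image names below are assumptions, so substitute your own:

```shell
# stop the instance (instance-1 and the zone are assumed names)
gcloud compute instances stop instance-1 --zone europe-west1-b

# make sure the boot disk survives instance deletion
gcloud compute instances set-disk-auto-delete instance-1 \
    --disk instance-1 --no-auto-delete --zone europe-west1-b

# delete the instance, but keep the boot disk
gcloud compute instances delete instance-1 --keep-disks boot --zone europe-west1-b

# create a reusable image from the orphaned boot disk
gcloud compute images create dl-image \
    --source-disk instance-1 --source-disk-zone europe-west1-b

# later: bring up a fresh instance from the image, goodies included
gcloud compute instances create instance-2 --image dl-image --zone europe-west1-b
```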

Apply the ResNet50 neural network on images from the interwebs

After connecting to the instance with an SSH port redirect:

ssh -L 8889:localhost:8888 cpbotha@EXTERNAL_IP
… and then starting a jupyter notebook on the GCE instance:

cpbotha@instance-1:~$ source ~/miniconda3/bin/activate ml

(ml) cpbotha@instance-1:~$ jupyter notebook
I can now connect to the notebook via localhost:8889 on my workstation. In a cell, I enter and execute the following code (adapted from the Keras documentation):

from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from keras.applications.resnet50 import preprocess_input, decode_predictions
import numpy as np
from PIL import Image

model = ResNet50(weights='imagenet')

# adapted from https://github.com/fchollet/deep-learning-models
# to accept also a PIL image
def load_and_predict_image(img_or_path):
    target_size = (224, 224)
    if type(img_or_path) is str:
        img = image.load_img(img_or_path, target_size=target_size)
    else:
        img = img_or_path.resize(target_size)

    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = preprocess_input(x)

    preds = model.predict(x)
    # decode the results into a list of tuples (class, description, probability)
    # (one such list for each sample in the batch)
    print('Predicted:', decode_predictions(preds, top=3)[0])

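As an aside, the array juggling before prediction can be illustrated with plain NumPy. The BGR flip and ImageNet channel means below are a stand-in sketch of what Keras's preprocess_input did for ResNet50 at the time, not the real function:

```python
import numpy as np

# a dummy 224x224 RGB "image", standing in for image.img_to_array() output
x = np.random.uniform(0, 255, size=(224, 224, 3)).astype("float32")

# the model expects a batch dimension: (1, 224, 224, 3)
x = np.expand_dims(x, axis=0)

def preprocess_sketch(x):
    """Sketch of ResNet50's preprocess_input: RGB -> BGR, subtract channel means."""
    x = x[..., ::-1]  # flip the last axis: RGB -> BGR
    mean = np.array([103.939, 116.779, 123.68], dtype="float32")
    return x - mean   # broadcasts over the channel axis

x = preprocess_sketch(x)
print(x.shape)  # (1, 224, 224, 3)
```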
In the next cell, I do:

from PIL import Image
import urllib.request

url1 = "https://upload.wikimedia.org/wikipedia/commons/thumb/c/c8/KuduKr%C3%BCger.jpg/1920px-KuduKr%C3%BCger.jpg"
url2 = "https://upload.wikimedia.org/wikipedia/commons/thumb/9/9d/Struthio_camelus_-_Etosha_2014_%283%29.jpg/800px-Struthio_camelus_-_Etosha_2014_%283%29.jpg"
im = Image.open(urllib.request.urlopen(url2))

load_and_predict_image(im)
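For intuition on the last step: decode_predictions does little more than take the top-k entries of the 1000-way output vector and map them to class names. A hypothetical miniature over a made-up five-class output (the labels and scores are invented for illustration):

```python
import numpy as np

# hypothetical class labels, standing in for the 1000 ImageNet classes
labels = ["kudu", "ostrich", "zebra", "lion", "springbok"]

def decode_top_k(preds, top=3):
    """Return (label, probability) tuples for the top-k scores, best first."""
    order = np.argsort(preds)[::-1][:top]  # indices sorted by descending score
    return [(labels[i], float(preds[i])) for i in order]

# a made-up "softmax output" for one image
preds = np.array([0.05, 0.7, 0.15, 0.02, 0.08])
print(decode_top_k(preds))  # ostrich first, then zebra, then springbok
```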