September 19, 2020.
There are Docker images for FastAI now! I was really interested in this, because I have been an aspirational FastAI student up to now. My biggest problem has been the Python installs. I talk about this more at the end, in a motivation section.
This post describes how to use the new FastAI Docker images to run the notebooks for the FastAI course, as well as your own Python code, from your local disk instead of from inside the container, so you do not have to commit a new image after every save. This distinction may not matter at all to you at the moment. That’s okay!
The tutorial is presented from the Ubuntu viewpoint. I hope that it translates to other operating systems, though.
I’ll explain a few options along the way too. The post does not assume you know much, if anything, about Docker. All the Docker commands you need will be given. The links are introduced in the text, and are also listed below for easy reference.
First, install Docker if you have not already. Here’s the Download page from Docker.
Next, you need to install the NVIDIA drivers. Here’s the current information from NVIDIA. This process is a lot faster than it used to be; it took me about 10-15 minutes, including a mistake I made.
I learned from Twitter that one can also enable NVIDIA GPUs from the base Docker package, but following the instructions looks an awful lot like installing NVIDIA-Docker. In any event, neither option takes much time on Ubuntu / Debian.
Then, for this post I created a very small GitHub repository, amy-tabb/fastai-docker-example. It has a Dockerfile, a folder with a python script in it for testing, a directory
build-from-pytorch-image, a README, license, and a
Go to wherever you want this project to live, and clone the repository. Since I use the terminal and am on Ubuntu, for me this is
git clone https://github.com/amy-tabb/fastai-docker-example
I’m going to change directory into the new
First, let’s take a look at that Dockerfile. For people new to Docker, there are two ways to create Docker images: the Dockerfile-and-build, and the command-line-and-commit route. We will use the Dockerfile method in this post, and only mention the second a few times.
FROM fastdotai/fastai:2020-10-02
RUN useradd fastai-user
RUN apt-get update
RUN apt-get -y install nano \
    graphviz \
    libwebp-dev
RUN pip uninstall -y pillow
RUN pip install pillow
RUN pip install kaggle \
    dtreeviz \
    treeinterpreter
RUN pip install waterfallcharts
WORKDIR /home/
RUN echo '#!/bin/bash\njupyter notebook --ip=0.0.0.0 --no-browser' >> run_jupyter.sh
WORKDIR /home/fastai-user/
USER fastai-user
ENV HOME "/home/fastai-user"
Explanations (Understanding this section is not key to getting everything running):
FROM fastdotai/fastai:2020-10-02: uses the fastdotai/fastai:2020-10-02 image on Docker Hub as the base image. Layperson’s terms: everything that is in the fastdotai/fastai:2020-10-02 image will be in our image after the FROM statement. See Details 6 for what is included in this image.
RUN ...: runs instructions as we would from a bash shell, in this case:
useradd fastai-user: adds a new user, as in Docker we only have a root/superuser by default.
apt-get -y install nano graphviz libwebp-dev: adds the command-line text editor nano while we are still the root/superuser. This is not necessary, but handy. We also add graphviz, because we need the executable when gv is called in Chapter 1 of the lessons. The graphviz python package is installed through fastdotai/fastai (Details 6). In Chapter 2, we need webp support for pillow, which requires a reinstall of pillow to enable.
pip uninstall -y pillow and pip install pillow: enables webp support for pillow after the installation of
pip install kaggle dtreeviz treeinterpreter: installs python packages needed for Chapter 9.
pip install waterfallcharts: installs a python package needed for Chapter 9. Sure, I could have included many of these
pip installs on one line, but I like smaller layers.
WORKDIR /home/: I learned the hard way that cd is not your friend in Docker (see Docker: What’s the deal with WORKDIR?). This just switches to a new directory.
RUN echo '#!/bin/bash\njupyter notebook --ip=0.0.0.0 --no-browser' >> run_jupyter.sh: creates a helper script for launching a Jupyter notebook if the person launching the container has chosen the -p option for networking.
WORKDIR /home/fastai-user/: switches to the new user’s home directory. You can also do this with a flag, -w, but for a few reasons, I use WORKDIR here. (Details: the
/workspace, Details 7.)
USER fastai-user: switch user to the new user, avoiding running everything as root (see Details 3 for more information).
ENV: set environment variables, which is needed in this case to have downloaded files and models saved in the correct places (see Details 2 for more information).
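To make the ENV step a little more concrete, here is a minimal Python sketch of how the HOME variable steers where "home-relative" files end up. It assumes a POSIX system (where Python expands ~ from HOME); the helper name home_relative and the example folder .fastai are only illustrations, so check your own fastai version for where it actually keeps downloads.

```python
import os
from pathlib import Path


def home_relative(subdir, home):
    """Return where ~/subdir would resolve if HOME were set to `home`.

    Hypothetical helper for illustration only. It temporarily sets HOME
    and restores the previous value afterwards.
    """
    previous = os.environ.get("HOME")
    os.environ["HOME"] = home
    try:
        # On POSIX systems, expanduser() consults HOME first.
        return Path(os.path.expanduser("~")) / subdir
    finally:
        if previous is None:
            del os.environ["HOME"]
        else:
            os.environ["HOME"] = previous


if __name__ == "__main__":
    # With HOME as set in the Dockerfile, the folder sits under /home/fastai-user.
    print(home_relative(".fastai", "/home/fastai-user"))
    # A different HOME gives a different location for the same relative folder.
    print(home_relative(".fastai", "/tmp"))
```

Since /home/fastai-user is also the target of the bind mount used later in this post, anything written under HOME ends up on the local disk rather than inside the container.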
Ok, so we’re going to build this Docker image locally. I will select an image name
fastai-local, but really this is up to you.
From within the
docker build -t fastai-local .
(Note: I always run docker with
sudo prefixed. You may not need to, depending on your configuration.)
-t: tag for the image we are building, in other words, its image name,
.: the build context, here the current directory, where Docker looks for the default filename,
Dockerfile. We could also go somewhere else and run something like
docker build fastai-docker-example -t temptag, assuming the Dockerfile is in path
fastai-docker-example and the tag is
Time to do some essential testing before going further. Let’s run this thing.
First off, a very basic test, assuming a tag of
docker run -it fastai-local bash
Ok, all this does is launch your Docker image, making it a container, in interactive mode (
-it). You should have a bash terminal. You can look around at the file structure, or
exit out of it.
Okay, now I’m going to throw all the needed items in there and I will explain them.
docker run -it -p 8888:8888 --gpus all --ipc=host -v $PWD:/home/fastai-user fastai-local bash
-it: interactive mode, but you knew that.
-p 8888:8888: exposing a port for the Jupyter notebook and to download data. Other options include
--network host (docs), which allows the Docker container to use the host’s network.
--gpus all: give docker access to all GPU resources. NVIDIA-Docker has some resources on different options; I will use one of these other options later.
--ipc=host: InterProcess Communication (IPC). This flag lets the container share the host’s IPC resources, in particular shared memory, which PyTorch data loaders use to pass data between worker processes (docs). I describe other options on this flag later.
-v: a bind mount. The arguments are: host file system path, colon (:), container file system path. In this example, the host file system path is our current location ($PWD), and the container file system path is /home/fastai-user. The pedantically slow explanation here is because I had a hard time figuring this out from the documentation. I also have a section about bind mounting in my tutorial.
fastai-local is the image tag / name.
bash is the executable we want to run when we launch this thing. In other words, we get a terminal!
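If it helps to see the flags as data, the sketch below assembles the same command from its parts. The function name docker_run_command and its parameters are made up for this illustration. It only builds and prints the argument list and does not call Docker. Note that $PWD is left as literal text here, so it is only expanded if you paste the printed line into a shell.

```python
def docker_run_command(image, command="bash", port=8888, gpus="all",
                       host_dir="$PWD", container_dir="/home/fastai-user"):
    """Assemble the docker run invocation explained above as a list of arguments.

    Hypothetical helper: it only builds the list, it does not run Docker.
    """
    return [
        "docker", "run",
        "-it",                                # interactive mode
        "-p", f"{port}:{port}",               # host port : container port
        "--gpus", gpus,                       # which GPUs the container may use
        "--ipc=host",                         # use the host's IPC resources
        "-v", f"{host_dir}:{container_dir}",  # bind mount, host path first
        image,                                # image tag / name
        command,                              # executable to launch
    ]


if __name__ == "__main__":
    print(" ".join(docker_run_command("fastai-local")))
```

Changing one argument, for example port=8889, shows which part of the command line moves with it.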
Once in the terminal, you can type
nvidia-smi. Is it giving you expected output?
Figure 1. Output of
nvidia-smi for two GPUs. In this setup, I only use GPU 1 in the docker container.
Since you bind-mounted, when you take a look at the directory contents, they are the contents of the local disk at
$PWD. So from the container, you should be seeing something like:
fastai-user@ea40d26a227b:~$ ls -a
. .. .git .gitignore Dockerfile LICENSE README.md testing
If you go up in the directory structure (
cd ..), you are in the container’s directory structure. Go down, and you are outside of the image’s directory structure (as it was created in the build step) and in your local machine’s directory structure. Is this weird? Yes.
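One way to keep the two directory structures straight is to think of the bind mount as a path translation. The sketch below is only a mental model, not something Docker provides: container_to_host is a hypothetical helper, and /data/fastai-docker-example stands in for wherever you cloned the repository.

```python
from pathlib import PurePosixPath


def container_to_host(container_path, host_dir, mount_point="/home/fastai-user"):
    """Translate a path seen inside the container to the host path behind it.

    Returns None when the path lies outside the bind mount, meaning it
    belongs to the container's own file system.
    """
    path = PurePosixPath(container_path)
    mount = PurePosixPath(mount_point)
    try:
        relative = path.relative_to(mount)
    except ValueError:
        # Not under the mount point: this is the container's directory structure.
        return None
    return str(PurePosixPath(host_dir) / relative)


if __name__ == "__main__":
    host = "/data/fastai-docker-example"  # assumed location of the clone
    print(container_to_host("/home/fastai-user/testing/test0.py", host))
    print(container_to_host("/home/run_jupyter.sh", host))
```

The first path maps to a file on the local disk, while the second returns None because /home/run_jupyter.sh was created in the image during the build step.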
Because of all this, you can run scripts from the local disk like this one:
import torch
from fastai.vision import *
from fastai.metrics import error_rate
print("Is cuda available?", torch.cuda.is_available())
print("Is cuDNN version:", torch.backends.cudnn.version())
print("cuDNN enabled? ", torch.backends.cudnn.enabled)
x = torch.rand(5, 3)
print(x)
This is a little test that tries to import some libraries and check if cuda is available, and outputs the cuDNN version. You can run it from the
Obviously, you can write more involved and interesting things all on your local machine, all while the container is up and running.
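As one small step in that direction, here is a sketch of the same checks wrapped in a function that can be called from other scripts or notebooks. The name cuda_report is made up for this example. It also returns early when torch is not importable, which makes it safe to run on the host by mistake.

```python
import importlib.util


def cuda_report():
    """Collect the facts the test script prints, as a dictionary.

    Hypothetical helper: returns early if torch is not installed.
    """
    if importlib.util.find_spec("torch") is None:
        return {"torch_installed": False}

    import torch

    return {
        "torch_installed": True,
        "cuda_available": torch.cuda.is_available(),
        "cudnn_version": torch.backends.cudnn.version(),
        "cudnn_enabled": torch.backends.cudnn.enabled,
        "device_count": torch.cuda.device_count(),
    }


if __name__ == "__main__":
    for key, value in cuda_report().items():
        print(f"{key}: {value}")
```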
While here, you can also launch a Jupyter notebook (assuming only exposing a port for networking…):
jupyter notebook --ip=0.0.0.0
There will be some warnings about there being no browser. That is ok; you can also do
jupyter notebook --ip=0.0.0.0 --no-browser
The default port for Jupyter notebooks is 8888. If you’ve chosen a different port for the Docker container, you need to use the flag
--port=8889 or whatever port number you have when you launch the Jupyter notebook.
or use the convenience script,
if you used the
--network host option,
The net result is the same: you’re offered several links. In my terminal, I typically right-click one of the URLs and ‘open link’ to see a directory structure with ‘Jupyter’ on the left top corner.
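The relationship between the Docker port and the Jupyter port can be sketched as follows. Both helper names are hypothetical. The sketch assumes you publish the same port number on the host and in the container, as in the commands in this post, and it only builds the argument lists without running anything.

```python
def publish_flag(port=8888):
    """The -p argument for docker run, using the same port on host and container."""
    return ["-p", f"{port}:{port}"]


def jupyter_command(port=8888, no_browser=True):
    """The matching Jupyter launch command to type inside the container."""
    args = ["jupyter", "notebook", "--ip=0.0.0.0", f"--port={port}"]
    if no_browser:
        args.append("--no-browser")
    return args


if __name__ == "__main__":
    port = 8889  # example of a non-default port
    print("docker run flag:", " ".join(publish_flag(port)))
    print("inside container:", " ".join(jupyter_command(port)))
```

The point is that the number after --port has to agree with the container side of -p; otherwise the links Jupyter prints will not be reachable from the host browser.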
Ok, from the terminal, press Ctrl-C to kill the notebook, and type
exit to kill the container.
I will not go into this in detail since this tutorial uses Docker in interactive mode (the
-it flag). However, you can directly run programs without multiple steps as follows:
docker run -p 8888:8888 --gpus '"device=1"' --ipc=host -v $PWD:/home/fastai-user fastai-local python3 testing/test0.py
Is cuda available? True
Is cuDNN version: 7603
cuDNN enabled? True
tensor([[0.0248, 0.7735, 0.0954],
        [0.9509, 0.0247, 0.6945],
        [0.0116, 0.5573, 0.1870],
        [0.1349, 0.6840, 0.8420],
        [0.6088, 0.8092, 0.7516]])
fastai/docker-containers has more examples of this.
On one of my machines, I have an old GPU that I am using for display, and then a new one that PyTorch supports for the deep learning work. So I only want Docker to use the newer GPU, which is GPU number 1 when I run
nvidia-smi outside of the Docker container.
docker run -it -p 8888:8888 --gpus '"device=1"' --ipc=host -v $PWD:/home/fastai-user fastai-local
is the command I run. Again, the NVIDIA-Docker site has many other variations if you need to specify your GPUs differently.
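The nested quotes in '"device=1"' are easy to get wrong, so here is a small sketch of what is going on. The helper names are made up. The idea is that Docker should receive the double quotes as part of the value, and the outer single quotes are only there to get them past the shell; Python's shlex.quote produces that outer layer.

```python
import shlex


def gpus_value(device_ids=None):
    """Value for --gpus: all GPUs, or a double-quoted device list."""
    if not device_ids:
        return "all"
    ids = ",".join(str(i) for i in device_ids)
    # The double quotes are part of the value that Docker receives.
    return f'"device={ids}"'


def gpus_shell_flag(device_ids=None):
    """The flag as it would be typed in a shell, with shell quoting added."""
    return "--gpus " + shlex.quote(gpus_value(device_ids))


if __name__ == "__main__":
    print(gpus_shell_flag())        # all GPUs
    print(gpus_shell_flag([1]))     # only GPU 1, as in my setup
    print(gpus_shell_flag([0, 1]))  # two specific GPUs
```

If you launch Docker from Python with an argument list instead of a shell, you would pass the result of gpus_value directly, without the extra single quotes.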
So if you don’t want to use the
--ipc flag for whatever reason, you can also use the shared memory size flag
An example using the shared memory size flag (see the Docker docs; you have to search for the string):
docker run -it -p 8888:8888 --gpus '"device=1"' --shm-size="10000m" -v $PWD:/home/fastai-user fastai-local bash
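To check from inside the container how much shared memory it actually received, a sketch like the following can help. It assumes the shared memory is mounted at /dev/shm, which is the usual location on Linux; the function name shared_memory_mib is made up for this example.

```python
import shutil


def shared_memory_mib(path="/dev/shm"):
    """Report total and free space, in MiB, of the file system at `path`."""
    usage = shutil.disk_usage(path)
    return {
        "total_mib": usage.total // 2**20,
        "free_mib": usage.free // 2**20,
    }


if __name__ == "__main__":
    try:
        print(shared_memory_mib())
    except FileNotFoundError:
        print("/dev/shm not found; this check assumes a Linux container")
```

Without --ipc=host or --shm-size, Docker gives a container 64 MB of shared memory by default, which is why one of the two flags is worth setting for data loading.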
A very helpful troubleshooting tool is to use some of the options from
nvidia-smi dmon allows you to monitor the GPU while processes are running, which helps to determine what the potential problems are.
Figure 2. Output of
nvidia-smi dmon --id 1 in a two GPU setting, but only monitoring the GPU with ID = 1. Every second, the power use, temperature, SM (streaming multiprocessor) utilization percentage, and memory utilization percentage are output.
The way this tutorial is set up is that you will clone the FastAI course into the
fastai-dir folder. Then, you can launch the Docker image and manipulate the materials any way you like, make notes and copies, etc., and the course materials’ last state will be on your local disk and not in the container.
Clone! (Again, assuming we are within the
git clone https://github.com/fastai/fastbook.git fastai-dir
Once you have cloned, then run your image however it works for you. For example:
docker run -it -p 8888:8888 --gpus all --ipc=host -v $PWD:/home/fastai-user fastai-local
Then, launch a Jupyter notebook, navigate to the course notebooks, and try to run them. Happy learning with the 2020 version of the course!
Figure 3. Jupyter notebook structure within
fastai-dir. The first chapter of the book is
01_intro.ipynb. Start there! For versions of the notebooks without the book text, look at the
In the current FastAI course (2020), the suggestion is to run everything in the cloud. And this is generally a good idea and to the best of my knowledge it works well. But I wanted to work locally, and use Docker images.
As mentioned above, I have been fighting Python installs and the virtual environment business. By now, I’ve been programming in C++, off and on (but mostly on), for 20 years, and while I understood perfectly well the mechanics of programming in a new language (Python), I didn’t understand how to get the Python library installs to work, how to fix them when they went wonky, or how to test them. Besides, all of those
conda runs took FOREVER.
In comparison, I have some experience with Docker, which started with getting code to someone who works with medical data and cannot move it, which led to this two-year-old tutorial. So I knew if I could get things set up in a Docker image, I would have a better chance of getting a reproducible environment I could move from computer to computer, without mysteries about why it is working on one and not another.
I’ll give a few other reasons why someone would want to do so:
And finally, the easiest use case for using Docker is not having to alter installations for different operating systems.
What about hardware? I used an NVIDIA GeForce RTX 2080 Ti (an up-to-the-task card) in one computer, and then tested with an NVIDIA Quadro P1000 in another (baby card). I was running Ubuntu 18.04 on both computers.
For a deep dive into selecting GPUs, Tim Dettmers’ Blog has had posts that people have been referring to for three years, and he has a fresh post that is not yet a month old (linked).
Comments or feedback? Please open an issue on GitHub or catch up with me on Twitter.
© Amy Tabb 2018-2021. All rights reserved. The contents of this site reflect my personal perspectives and not those of any other entity.