Optimize your Deep Learning Workflow with Docker

Abhishri Medewar
4 min read · Mar 19, 2023


Deep learning is a powerful field of artificial intelligence that requires a lot of computational power to train models. Running deep learning experiments on local machines can be time-consuming and expensive, as it requires specialized hardware and software configurations.

Docker, a containerization platform, can help streamline the process of running deep learning experiments by providing a consistent environment that can be easily deployed across different machines.

Content

In this article, we’ll explore how to use Docker in deep learning, including the following:

  1. Installing Docker
  2. Setting up a Docker container using the NVIDIA PyTorch container
  3. Running Deep Learning Experiments
  4. Additional Docker commands

Docker installation on Ubuntu using the apt repository

Docker is available for Windows, macOS, and Linux, and can be downloaded from the Docker website. To install Docker on Ubuntu, follow the steps below:

1. Open a terminal on Ubuntu.

2. Uninstall previous versions of Docker.

# Remove older Docker packages: docker, docker-engine, docker.io, containerd, runc
$ sudo apt-get remove docker docker-engine docker.io containerd runc

# Optional and destructive: delete existing images, containers, and volumes stored in /var/lib/docker and /var/lib/containerd
$ sudo rm -rf /var/lib/docker
$ sudo rm -rf /var/lib/containerd

3. Update the apt repository and install dependencies.

$ sudo apt-get update
$ sudo apt-get install ca-certificates curl gnupg lsb-release

4. Add Docker’s official GPG key.

$ sudo mkdir -m 0755 -p /etc/apt/keyrings
$ curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg

5. Add the Docker apt repository to the system's package sources.

$ echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null

6. Update the apt package index and install Docker Engine, containerd, and the Buildx and Compose plugins.

$ sudo apt-get update
$ sudo apt-get install docker-ce docker-ce-cli containerd.io \
docker-buildx-plugin docker-compose-plugin

7. Verify your Docker installation.

$ sudo docker run hello-world

This command downloads the hello-world test image from a Docker registry and runs it in a container. The container prints a confirmation message to the console and then exits.
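The same check can be scripted, which is handy when setting up several machines. The following is a minimal sketch, not part of Docker itself: `have_cmd` and `docker_status` are hypothetical helper names. It relies on `docker info`, which fails when the daemon cannot be reached. The later commands in this article call `docker` without `sudo`, which works only if your user is allowed to talk to the daemon (for example, after being added to the `docker` group), so an `unreachable` result can also point to a permissions problem.

```shell
#!/bin/sh
# Sketch: report whether the Docker CLI and daemon are usable.
# have_cmd and docker_status are illustrative names, not Docker commands.

# Succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Prints one word: "missing", "unreachable", or "ok".
docker_status() {
    if ! have_cmd docker; then
        echo "missing"        # the docker CLI is not installed
    elif ! docker info >/dev/null 2>&1; then
        echo "unreachable"    # daemon not running, or no permission to use it
    else
        echo "ok"
    fi
}

echo "docker status: $(docker_status)"
```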

Setting up a Docker container using the NVIDIA PyTorch container

Once you’ve installed Docker, you can describe your environment in a Dockerfile. A Dockerfile is a text file that contains instructions for building a Docker image. An image is a read-only template from which containers are created.

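NVIDIA provides a set of PyTorch containers that are optimized for deep learning workflows on NVIDIA GPUs. These containers come pre-installed with CUDA, the supporting libraries, and the other dependencies needed to run PyTorch applications on NVIDIA GPUs. The GPU driver itself is not part of the image: it must be installed on the host, together with the NVIDIA Container Toolkit, for the --gpus option used later in this article to work.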

The NVIDIA PyTorch containers come in several versions, each tagged by release (year.month) and corresponding to a specific version of PyTorch and the CUDA toolkit. For example, the nvcr.io/nvidia/pytorch:22.04-py3 container corresponds to PyTorch version 1.12 and CUDA toolkit version 11.6.2.

Here is an example of a Dockerfile that builds on the NVIDIA PyTorch container:

# Base image: NVIDIA PyTorch container
FROM nvcr.io/nvidia/pytorch:22.04-py3
# Install the extra Python libraries in a single layer
RUN pip install --no-cache-dir numpy scipy opencv-python

This Dockerfile starts from the nvcr.io/nvidia/pytorch:22.04-py3 base image and installs numpy, scipy, and opencv-python. You can customize this Dockerfile to include any other libraries or tools needed for your deep learning experiments.
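If you maintain several such environments, the Dockerfile can be generated from a package list. This is only a sketch under stated assumptions: `make_dockerfile` is a hypothetical helper, `pandas` is an arbitrary extra package chosen for illustration, and the script prints the Dockerfile to the terminal instead of writing it.

```shell
#!/bin/sh
# Hypothetical helper: print a Dockerfile that installs the given pip
# packages on top of the NVIDIA PyTorch base image used in this article.
make_dockerfile() {
    printf '%s\n' "FROM nvcr.io/nvidia/pytorch:22.04-py3"
    # A single RUN line keeps all packages in one image layer.
    printf 'RUN pip install --no-cache-dir %s\n' "$*"
}

# Example: the article's packages plus one illustrative addition.
make_dockerfile numpy scipy opencv-python pandas
```

Redirecting the output, for example `make_dockerfile numpy scipy > Dockerfile`, writes the file that `docker build` reads.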

To build the Docker image, save the Dockerfile to a directory and run the following command in the terminal:

docker build -t deep-learning .

This command builds the Docker image and tags it with the name “deep-learning”. The “.” at the end of the command sets the build context to the current directory, which is also where Docker looks for the Dockerfile by default.

Once the Docker image is built, you can create a Docker container by running the following command:

docker run --gpus all -it -v /home/username:/home/username deep-learning

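This command starts a Docker container based on the “deep-learning” image and opens an interactive terminal session inside it (-it). The --gpus all option makes all host GPUs available to the container, and -v /home/username:/home/username mounts your home directory at the same path inside the container; replace username with your own user name.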
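Before starting real work, it can help to confirm that the container actually sees a GPU. The sketch below assumes the image is tagged `deep-learning` as above and that the host has the NVIDIA driver and NVIDIA Container Toolkit installed. The `run` helper and the `DRY_RUN` variable are illustrative names, not Docker features: by default the script only prints the command, and setting `DRY_RUN=0` executes it.

```shell
#!/bin/sh
# Sketch: one-off GPU visibility check inside the container.
IMAGE="${IMAGE:-deep-learning}"   # image tag used in this article

# Hypothetical helper: print the command when DRY_RUN is 1 (the default),
# otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# --rm removes the container once the check has finished.
# torch.cuda.is_available() reports whether PyTorch can use a GPU.
run docker run --rm --gpus all "$IMAGE" \
    python -c 'import torch; print(torch.cuda.is_available())'
```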

Running deep learning experiments

Use the docker run command to mount a directory from your local machine into the Docker container and use it to store data and code for your experiments.

For example, suppose you have a directory on your local machine called “experiments” that contains the code and data for your experiments. You can run the following command to start a Docker container and mount the “experiments” directory as a volume:

docker run --gpus all -it -v /path/to/experiments:/experiments deep-learning
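One pitfall with -v is that a host path that does not exist is created as an empty directory instead of producing an error, so a typo in the path can leave you with an empty /experiments inside the container. The following sketch guards against that. `mount_arg` and `EXPERIMENTS_DIR` are hypothetical names, and the script prints the resulting command instead of running it.

```shell
#!/bin/sh
# Hypothetical helper: build the argument for -v, refusing host
# directories that do not exist.
mount_arg() {
    host_dir=$1
    container_dir=$2
    if [ ! -d "$host_dir" ]; then
        echo "host directory not found: $host_dir" >&2
        return 1
    fi
    # Use an absolute path: a bare name would be read as a named volume.
    abs_dir=$(cd "$host_dir" && pwd)
    printf '%s:%s\n' "$abs_dir" "$container_dir"
}

# Assumption: the experiments live in the current directory unless
# EXPERIMENTS_DIR says otherwise.
EXPERIMENTS_DIR="${EXPERIMENTS_DIR:-$PWD}"
if vol=$(mount_arg "$EXPERIMENTS_DIR" /experiments); then
    # -w starts the session in the mounted directory.
    echo "docker run --gpus all -it -v $vol -w /experiments deep-learning"
fi
```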

You can build several Docker images, each with the packages a particular experiment needs, and switch between them without changing anything on the host system.

Additional Docker Commands

1. Get the list of Docker images on the system.

docker images

2. Get the list of running containers (add -a to include stopped ones).

docker container ls

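3. Increase the shared memory size when starting a container. Docker's default of 64 MB is often too small for PyTorch data loaders that use multiple worker processes.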

docker run --gpus all --shm-size=2g --ulimit memlock=-1 \
--ulimit stack=67108864 -it -v /experiments:/experiments deep-learning

4. Attach to a running container.

docker attach <container_id>

5. Save a Docker image to a tar file.

docker save --output <tar_name>.tar <docker_image_name>

6. Load a Docker image from a tar file.

docker load --input <tar_name>.tar
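The save and load commands are typically used together to move an image to a machine that cannot pull it from a registry. Image references can contain “/” and “:”, which are awkward in file names, so the sketch below derives a safe archive name first. `archive_name` is a hypothetical helper, the default image reference is an assumption, and the two commands are printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper: turn an image reference into a file-system-safe
# archive name by replacing "/" and ":" with "_".
archive_name() {
    printf '%s.tar\n' "$(printf '%s' "$1" | tr '/:' '__')"
}

IMAGE="${IMAGE:-deep-learning:latest}"
TAR="$(archive_name "$IMAGE")"

echo "docker save --output $TAR $IMAGE"   # run on the source machine
echo "docker load --input $TAR"           # run on the destination machine
```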
