This article is part of the Learning Docker series.
- 1 - A Beginner’s Guide to Docker
- 2 - Understanding Docker Layers (This Article)
Docker is an incredibly versatile tool for developers to build, package, ship, and run applications. In this tutorial, we look at how to write efficient Dockerfiles for faster build times and what to look out for along the way.
Building a Docker Image
Let’s take the example of a Flask application that returns a simple string.
Sample app.py file:
from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello_world():
    return 'Hello world!'


if __name__ == "__main__":
    app.run(debug=True)
A sample requirements.txt file for the same would look like:
flask==2.3.1
We can write an initial Dockerfile as follows:
FROM python:3.10-slim-buster
WORKDIR /usr/src/app
COPY . .
RUN pip3 install -r requirements.txt
CMD "python3 -m flask run --host=0.0.0.0"
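Let’s look at the sequence of what’s happening in the Dockerfile:
FROM python:3.10-slim-buster
- Uses the Python 3.10 slim-buster image as the base.
WORKDIR /usr/src/app
- Sets /usr/src/app as the working directory for the instructions that follow, creating it if it does not exist.
COPY . .
- Copies the contents of the folder we’re currently in (the build context) to the working directory in the image.
RUN pip3 install -r requirements.txt
- Installs the dependencies listed in the requirements file.
CMD "python3 -m flask run --host=0.0.0.0"
- Sets the default command for containers started from the image: running the app with Flask. Because the command is written in shell form, Docker wraps it in /bin/sh -c, and the surrounding quotes make the shell treat the whole string as a single command name. The exec form, CMD ["python3", "-m", "flask", "run", "--host=0.0.0.0"], avoids this and is the safer way to write it.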
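To see why the quoting in the CMD line matters, here is a rough Python sketch of how a CMD value turns into the argument list a container starts with. The function cmd_to_argv is a hypothetical helper written for this illustration, not part of Docker; it only mimics the two documented forms (a JSON array is used as-is, anything else is passed to /bin/sh -c).

```python
import json
import shlex


def cmd_to_argv(cmd_text: str) -> list[str]:
    """Toy model of how a Dockerfile CMD value becomes the container's argv."""
    text = cmd_text.strip()
    if text.startswith("["):
        # Exec form: a JSON array that is used as argv unchanged.
        return json.loads(text)
    # Shell form: the whole string is handed to /bin/sh -c.
    return ["/bin/sh", "-c", text]


shell_form = cmd_to_argv('"python3 -m flask run --host=0.0.0.0"')
exec_form = cmd_to_argv('["python3", "-m", "flask", "run", "--host=0.0.0.0"]')

# Approximate the shell's word splitting: the double quotes glue the
# whole string into one word, which the shell looks up as a command name.
words_seen_by_shell = shlex.split(shell_form[2])

print(shell_form)
print(words_seen_by_shell)
print(exec_form)
```

In the shell form the shell ends up with a single word rather than python3 followed by its arguments, while the exec form keeps each argument separate.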
Let’s build the application:
➜ test_python docker build . -t flask-demo:first
[+] Building 15.4s (9/9) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 187B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.10-slim-buster 5.3s
=> [1/4] FROM docker.io/library/python:3.10-slim-buster@sha256:1b501f9aa621df27078adcd19ba769c09cb1c4f2e797bfaba0c66553db16923b 5.3s
=> => resolve docker.io/library/python:3.10-slim-buster@sha256:1b501f9aa621df27078adcd19ba769c09cb1c4f2e797bfaba0c66553db16923b 0.0s
=> => sha256:1b501f9aa621df27078adcd19ba769c09cb1c4f2e797bfaba0c66553db16923b 988B / 988B 0.0s
=> => sha256:7857e9a198fc4b06818b0e064c13b21485b72c7fdb1f51d3b13c9854ca2fcfa5 1.37kB / 1.37kB 0.0s
=> => sha256:6f74f1480ab79fde95503d17f5305a4115c1b1125b39e73fa08312b7d9bd6ec2 6.90kB / 6.90kB 0.0s
=> => sha256:9fbefa3370776b7ec7633cf07efc14cc24e0c0cd53893ad0e7e3f44ffdc1bedb 27.14MB / 27.14MB 2.1s
=> => sha256:a25702e0699eca20ab682bbfa60f6bb7775e4fb18ef65c038ffda342fdab9e3a 2.78MB / 2.78MB 1.0s
=> => sha256:adf6e8027509be41b79122b1745365bd5c0c43586ce2976493dd69943d2d179e 11.50MB / 11.50MB 2.1s
=> => sha256:a68430a46d9dab42676a2ca1d8e9ae1ee8c6c19082a6bcece3e8e935231138a2 243B / 243B 1.7s
=> => sha256:433875ea4139d286599691131e06d480b4b5c742a7558320f567f9789c097820 3.37MB / 3.37MB 3.4s
=> => extracting sha256:9fbefa3370776b7ec7633cf07efc14cc24e0c0cd53893ad0e7e3f44ffdc1bedb 1.8s
=> => extracting sha256:a25702e0699eca20ab682bbfa60f6bb7775e4fb18ef65c038ffda342fdab9e3a 0.2s
=> => extracting sha256:adf6e8027509be41b79122b1745365bd5c0c43586ce2976493dd69943d2d179e 0.5s
=> => extracting sha256:a68430a46d9dab42676a2ca1d8e9ae1ee8c6c19082a6bcece3e8e935231138a2 0.0s
=> => extracting sha256:433875ea4139d286599691131e06d480b4b5c742a7558320f567f9789c097820 0.2s
=> [internal] load build context 0.0s
=> => transferring context: 436B 0.0s
=> [2/4] WORKDIR /usr/src/app 0.1s
=> [3/4] COPY . . 0.0s
=> [4/4] RUN pip3 install -r requirements.txt 4.3s
=> exporting to image 0.2s
=> => exporting layers 0.2s
=> => writing image sha256:aa133f3be3e32746f498fd5f885855f69abbddd3691a4cd1a35c426fd744dc6e 0.0s
=> => naming to docker.io/library/flask-demo:first
➜ test_python
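We can see that the FROM, WORKDIR, COPY, and RUN instructions each show up as a numbered step, from [1/4] to [4/4]. CMD only sets image metadata, so it does not get a build step of its own.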
Let’s inspect the contents of the image:
➜ test_python docker history flask-demo:first
IMAGE CREATED CREATED BY SIZE COMMENT
aa133f3be3e3 9 minutes ago CMD ["/bin/sh" "-c" "\"python3 -m flask run … 0B buildkit.dockerfile.v0
<missing> 9 minutes ago RUN /bin/sh -c pip3 install -r requirements.… 11.7MB buildkit.dockerfile.v0
<missing> 9 minutes ago COPY . . # buildkit 315B buildkit.dockerfile.v0
<missing> 9 minutes ago WORKDIR /usr/src/app 0B buildkit.dockerfile.v0
<missing> 2 weeks ago CMD ["python3"] 0B buildkit.dockerfile.v0
<missing> 2 weeks ago RUN /bin/sh -c set -eux; savedAptMark="$(a… 12.2MB buildkit.dockerfile.v0
<missing> 2 weeks ago ENV PYTHON_GET_PIP_SHA256=394be00f13fa1b9aaa… 0B buildkit.dockerfile.v0
<missing> 2 weeks ago ENV PYTHON_GET_PIP_URL=https://github.com/py… 0B buildkit.dockerfile.v0
<missing> 2 weeks ago ENV PYTHON_SETUPTOOLS_VERSION=65.5.1 0B buildkit.dockerfile.v0
<missing> 2 weeks ago ENV PYTHON_PIP_VERSION=23.0.1 0B buildkit.dockerfile.v0
<missing> 2 weeks ago RUN /bin/sh -c set -eux; for src in idle3 p… 32B buildkit.dockerfile.v0
<missing> 2 weeks ago RUN /bin/sh -c set -eux; savedAptMark="$(a… 29.4MB buildkit.dockerfile.v0
<missing> 2 weeks ago ENV PYTHON_VERSION=3.10.11 0B buildkit.dockerfile.v0
<missing> 2 weeks ago ENV GPG_KEY=A035C8C19219BA821ECEA86B64E628F8… 0B buildkit.dockerfile.v0
<missing> 2 weeks ago RUN /bin/sh -c set -eux; apt-get update; a… 7.09MB buildkit.dockerfile.v0
<missing> 2 weeks ago ENV LANG=C.UTF-8 0B buildkit.dockerfile.v0
<missing> 2 weeks ago ENV PATH=/usr/local/bin:/usr/local/sbin:/usr… 0B buildkit.dockerfile.v0
<missing> 2 weeks ago /bin/sh -c #(nop) CMD ["bash"] 0B
<missing> 2 weeks ago /bin/sh -c #(nop) ADD file:e614539607055bdbd… 69.3MB
The first four lines in the history correspond to the last four lines of our Dockerfile, the ones after FROM, in reverse order. The remaining lines come from the base image.
This shows us that the final image is a stack of layers, with our changes on top and the base image’s layers underneath.
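One way to picture this stacking is a toy model in Python where each layer records only the files it adds or changes, and the image is what we get by merging the layers from bottom to top. The paths and contents below are illustrative placeholders, and real layers are filesystem diffs handled by Docker’s storage driver rather than dictionaries.

```python
# Toy model: each layer records only the files it adds or changes.
# Paths and contents are illustrative placeholders, not real image data.
layers = [
    ("FROM python:3.10-slim-buster", {"/usr/local/bin/python3": "<interpreter>"}),
    ("WORKDIR /usr/src/app", {}),
    ("COPY . .", {
        "/usr/src/app/app.py": "<our code>",
        "/usr/src/app/requirements.txt": "flask==2.3.1",
    }),
    ("RUN pip3 install -r requirements.txt", {
        "/usr/local/lib/python3.10/site-packages/flask": "<installed package>",
    }),
]


def flatten(layers):
    """Merge layers bottom to top; a later layer wins on a path clash."""
    view = {}
    for _instruction, changes in layers:
        view.update(changes)
    return view


image_view = flatten(layers)
for path in sorted(image_view):
    print(path)
```

The merged view is what a container sees, while each entry in the list is what docker history reports as a separate line.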
Docker image cache
Any reasonable person would expect these layers to be cached so that we don’t have to rebuild each of them every time we build the application.
Let’s run the build a second time and tag the image with a different name:
➜ test_python docker build . -t flask-demo:second
[+] Building 3.2s (9/9) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 38B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.10-slim-buster 3.0s
=> [1/4] FROM docker.io/library/python:3.10-slim-buster@sha256:1b501f9aa621df27078adcd19ba769c09cb1c4f2e797bfaba0c66553db16923b 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 93B 0.0s
=> CACHED [2/4] WORKDIR /usr/src/app 0.0s
=> CACHED [3/4] COPY . . 0.0s
=> CACHED [4/4] RUN pip3 install -r requirements.txt 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:aa133f3be3e32746f498fd5f885855f69abbddd3691a4cd1a35c426fd744dc6e 0.0s
=> => naming to docker.io/library/flask-demo:second 0.0s
➜ test_python
Looks like the layers are cached as expected.
In fact, since none of our files changed, the new tag points to the same underlying image, as the identical image IDs show:
➜ test_python docker image ls
REPOSITORY TAG IMAGE ID CREATED SIZE
flask-demo first aa133f3be3e3 29 minutes ago 130MB
flask-demo second aa133f3be3e3 29 minutes ago 130MB
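The idea that a tag is just a name for an image can be sketched in a few lines of Python. This is a simplification under stated assumptions: image_id is a hypothetical helper that hashes a list of made-up layer names, whereas Docker derives the real ID from the image’s configuration.

```python
import hashlib


def image_id(layer_ids):
    """Derive a short ID from the ordered layer IDs (a simplification)."""
    joined = "\n".join(layer_ids)
    return hashlib.sha256(joined.encode()).hexdigest()[:12]


# Hypothetical layer names standing in for the real layer digests.
layers = ["base", "workdir", "copy-all", "pip-install"]

tags = {
    "flask-demo:first": image_id(layers),
    "flask-demo:second": image_id(layers),  # rebuilt from identical inputs
}

print(tags["flask-demo:first"] == tags["flask-demo:second"])
```

Identical inputs give an identical ID, so both tags end up as two names for one image rather than two copies of it.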
Cache busting
Let’s try to see how we can bust the cache and rebuild our application by changing our code.
Let’s change our app’s code slightly:
from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello_world():
    return 'Hello universe!'


if __name__ == "__main__":
    app.run(debug=True)
Let’s build our image again:
➜ test_python docker build . -t flask-demo:third
[+] Building 6.3s (9/9) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 38B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.10-slim-buster 1.5s
=> [1/4] FROM docker.io/library/python:3.10-slim-buster@sha256:1b501f9aa621df27078adcd19ba769c09cb1c4f2e797bfaba0c66553db16923b 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 124B 0.0s
=> CACHED [2/4] WORKDIR /usr/src/app 0.0s
=> [3/4] COPY . . 0.0s
=> [4/4] RUN pip3 install -r requirements.txt 4.5s
=> exporting to image 0.2s
=> => exporting layers 0.2s
=> => writing image sha256:ed803fb17ec8037403cf936a28f5300d8abd94d5d1001c874d7dd6a822c51632 0.0s
=> => naming to docker.io/library/flask-demo:third 0.0s
➜ test_python
Looking above, both the COPY step and the dependency installation step were re-run, even though we haven’t changed any of our requirements! The code change causes the layer in step 3 to change, and once a layer changes, every step after it has to be re-run and produces a new layer. We should probably cache our build dependencies!
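The chain reaction can be modelled with a small Python sketch. It assumes, as a simplification of what the builder does, that each step’s cache key is a hash of its parent’s key, the instruction text, and the contents of any files it copies. The helper names (digest, cache_keys, rebuilt) and the file contents are made up for this illustration.

```python
import hashlib


def digest(*parts: str) -> str:
    """Hash several strings into one short key."""
    h = hashlib.sha256()
    for part in parts:
        h.update(part.encode())
        h.update(b"\0")
    return h.hexdigest()[:12]


def cache_keys(steps, files):
    """Return one cache key per step; each key depends on the previous one."""
    keys = []
    parent = "base-image"
    for instruction, copied in steps:
        content = "".join(f"{name}={files[name]};" for name in sorted(copied))
        parent = digest(parent, instruction, content)
        keys.append(parent)
    return keys


def rebuilt(steps, old_files, new_files):
    """List the instructions whose cache key differs between two builds."""
    old = cache_keys(steps, old_files)
    new = cache_keys(steps, new_files)
    return [instr for (instr, _), a, b in zip(steps, old, new) if a != b]


# Our first Dockerfile: (instruction, files it copies from the build context).
steps = [
    ("WORKDIR /usr/src/app", []),
    ("COPY . .", ["app.py", "requirements.txt"]),
    ("RUN pip3 install -r requirements.txt", []),
]
before = {"app.py": "Hello world!", "requirements.txt": "flask==2.3.1"}
after = {"app.py": "Hello universe!", "requirements.txt": "flask==2.3.1"}

print(rebuilt(steps, before, after))
```

Because the RUN step’s key includes the key of the COPY step before it, a change to app.py alone is enough to give the pip install step a new key, even though requirements.txt is untouched.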
Caching build dependencies
As seen above, we should not re-install dependencies every time we build the image; they should stay cached unless they change.
Let’s make a tiny change to our Dockerfile:
FROM python:3.10-slim-buster
WORKDIR /usr/src/app
COPY requirements.txt .
RUN pip3 install -r requirements.txt
COPY . .
CMD "python3 -m flask run --host=0.0.0.0"
We now copy the requirements file and install the dependencies first, and only then copy the remaining files in the folder.
Let’s build and see what happens now:
➜ test_python docker build . -t flask-demo:fourth
[+] Building 7.7s (10/10) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 212B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.10-slim-buster 3.0s
=> [internal] load build context 0.0s
=> => transferring context: 302B 0.0s
=> [1/5] FROM docker.io/library/python:3.10-slim-buster@sha256:1b501f9aa621df27078adcd19ba769c09cb1c4f2e797bfaba0c66553db16923b 0.0s
=> CACHED [2/5] WORKDIR /usr/src/app 0.0s
=> [3/5] COPY requirements.txt . 0.0s
=> [4/5] RUN pip3 install -r requirements.txt 4.3s
=> [5/5] COPY . . 0.0s
=> exporting to image 0.2s
=> => exporting layers 0.2s
=> => writing image sha256:d5378b21b1b14370c4b2d2c8591afc2919b568534e5ae28a83a7d95cdd0f7ccc 0.0s
=> => naming to docker.io/library/flask-demo:fourth
Let’s make some changes in our code and see if that invalidates the build dependency cache layer.
Let’s revert our app.py to our original version:
from flask import Flask

app = Flask(__name__)


@app.route('/')
def hello_world():
    return 'Hello world!'


if __name__ == "__main__":
    app.run(debug=True)
Let’s build and see:
➜ test_python docker build . -t flask-demo:fifth
[+] Building 3.3s (10/10) FINISHED
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 38B 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/library/python:3.10-slim-buster 3.1s
=> [1/5] FROM docker.io/library/python:3.10-slim-buster@sha256:1b501f9aa621df27078adcd19ba769c09cb1c4f2e797bfaba0c66553db16923b 0.0s
=> [internal] load build context 0.0s
=> => transferring context: 265B 0.0s
=> CACHED [2/5] WORKDIR /usr/src/app 0.0s
=> CACHED [3/5] COPY requirements.txt . 0.0s
=> CACHED [4/5] RUN pip3 install -r requirements.txt 0.0s
=> [5/5] COPY . . 0.0s
=> exporting to image 0.0s
=> => exporting layers 0.0s
=> => writing image sha256:5376451429f671f4c5ee477f51447941d85df2db964249215066fdfd65e7b65e 0.0s
=> => naming to docker.io/library/flask-demo:fifth 0.0s
➜ test_python
Looks like the build dependencies are no longer re-installed!
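To compare the two Dockerfile orderings side by side, here is a small Python sketch. It assumes a simplified rule: the first step that copies a changed file is invalidated, and so is every step after it. The function steps_rerun is a hypothetical helper for this illustration, not something Docker provides.

```python
def steps_rerun(steps, changed_files):
    """Return the instructions that must run again after a file change."""
    rerun = []
    invalidated = False
    for instruction, copied in steps:
        # A step is invalidated if it copies a changed file...
        if not invalidated and changed_files & set(copied):
            invalidated = True
        # ...and every step after an invalidated one runs again too.
        if invalidated:
            rerun.append(instruction)
    return rerun


# (instruction, files it copies from the build context)
original = [
    ("WORKDIR /usr/src/app", []),
    ("COPY . .", ["app.py", "requirements.txt"]),
    ("RUN pip3 install -r requirements.txt", []),
]
reordered = [
    ("WORKDIR /usr/src/app", []),
    ("COPY requirements.txt .", ["requirements.txt"]),
    ("RUN pip3 install -r requirements.txt", []),
    ("COPY . .", ["app.py", "requirements.txt"]),
]

print(steps_rerun(original, {"app.py"}))
print(steps_rerun(reordered, {"app.py"}))
print(steps_rerun(reordered, {"requirements.txt"}))
```

With the reordered Dockerfile, a change to app.py only repeats the final COPY, while a change to requirements.txt still triggers a fresh install, which is the behaviour we want.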
And that concludes our tutorial!
What we have accomplished in this tutorial:
- Wrote a REST application.
- Wrote the dependencies for the application.
- Wrote the Dockerfile spec for the application.
- Built the application using the Dockerfile and tagged the image with a name.
- Understood the underlying layers which constitute an image.
- Learnt how to optimize builds so that build dependencies don’t get re-installed each time.
- Moved dependency installation to a separate layer to ensure it is cached until changed.
Conclusion
Understanding layering in Docker images is important for developers who want to work with Docker. By ordering Dockerfile instructions so that rarely changing layers come first, you can keep builds fast and manage Docker images efficiently.