At litl we use Docker images to package and deploy our Room for More services, using our Galaxy deployment platform. This week I spent some time looking into how we might reduce the size of our images and speed up container deployments.
Most of our services are in Go, and thanks to the fact that compiled Go binaries are mostly-statically linked by default, it’s possible to create containers with very few files within. It’s surely possible to use these techniques to create tighter containers for other languages that need more runtime support, but for this post I’m only focusing on Go apps.
The old way
We built images in a very traditional way, using a base image built on top of Ubuntu with Go 1.4.2 installed. For my examples I’ll use something similar.
Here’s a Dockerfile:
FROM golang:1.4.2
EXPOSE 1717
RUN go get github.com/joeshaw/qotd
# Don't run network servers as root in Docker
USER nobody
CMD qotd
The golang:1.4.2 base image is built on top of Debian Jessie. Let’s build this bad boy and see how big it is.
$ docker build -t qotd .
...
Successfully built ae761b93e656
$ docker images qotd
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
qotd latest ae761b93e656 3 minutes ago 520.3 MB
Yikes. Half a gigabyte. Ok, what leads us to a container this size?
$ docker history qotd
IMAGE CREATED BY SIZE
ae761b93e656 /bin/sh -c #(nop) CMD ["/bin/sh" "-c" "qotd"] 0 B
b77d0ca3c501 /bin/sh -c #(nop) USER [nobody] 0 B
a4b2a01d3e42 /bin/sh -c go get github.com/joeshaw/qotd 3.021 MB
c24802660bfa /bin/sh -c #(nop) EXPOSE 1717/tcp 0 B
124e2127157f /bin/sh -c #(nop) COPY file:56695ddefe9b0bd83 2.481 kB
69c177f0c117 /bin/sh -c #(nop) WORKDIR /go 0 B
141b650c3281 /bin/sh -c #(nop) ENV PATH=/go/bin:/usr/src/g 0 B
8fb45e60e014 /bin/sh -c #(nop) ENV GOPATH=/go 0 B
63e9d2557cd7 /bin/sh -c mkdir -p /go/src /go/bin && chmod 0 B
b279b4aae826 /bin/sh -c #(nop) ENV PATH=/usr/src/go/bin:/u 0 B
d86979befb72 /bin/sh -c cd /usr/src/go/src && ./make.bash 97.4 MB
8ddc08289e1a /bin/sh -c curl -sSL https://golang.org/dl/go 39.69 MB
8d38711ccc0d /bin/sh -c #(nop) ENV GOLANG_VERSION=1.4.2 0 B
0f5121dd42a6 /bin/sh -c apt-get update && apt-get install 88.32 MB
607e965985c1 /bin/sh -c apt-get update && apt-get install 122.3 MB
1ff9f26f09fb /bin/sh -c apt-get update && apt-get install 44.36 MB
9a61b6b1315e /bin/sh -c #(nop) CMD ["/bin/bash"] 0 B
902b87aaaec9 /bin/sh -c #(nop) ADD file:e1dd18493a216ecd0c 125.2 MB
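To see where the half gigabyte goes, the non-empty layers in the listing above can be grouped and summed. The following is a rough sketch: the sizes are copied from the `docker history` output, but the group labels are my own reading of the truncated commands (for instance, treating the bottom ADD layer as the base Debian filesystem), so take them as assumptions.

```go
package main

import "fmt"

// layer is one non-empty row of the `docker history qotd` listing.
type layer struct {
	createdBy string
	sizeMB    float64
}

// group collects layers under a label. The labels are this sketch's
// interpretation of the truncated commands, not Docker output.
type group struct {
	label  string
	layers []layer
}

var groups = []group{
	{"qotd (go get)", []layer{
		{"go get github.com/joeshaw/qotd", 3.021},
	}},
	{"Go toolchain and setup", []layer{
		{"cd /usr/src/go/src && ./make.bash", 97.4},
		{"curl -sSL https://golang.org/dl/go", 39.69},
		{"COPY file:56695ddefe9b0bd83", 0.002481},
	}},
	{"apt-get install layers", []layer{
		{"apt-get update && apt-get install", 88.32},
		{"apt-get update && apt-get install", 122.3},
		{"apt-get update && apt-get install", 44.36},
	}},
	{"base filesystem (ADD)", []layer{
		{"ADD file:e1dd18493a216ecd0c", 125.2},
	}},
}

// totalMB sums the sizes of the given layers.
func totalMB(layers []layer) float64 {
	var sum float64
	for _, l := range layers {
		sum += l.sizeMB
	}
	return sum
}

func main() {
	var total float64
	for _, g := range groups {
		mb := totalMB(g.layers)
		total += mb
		fmt.Printf("%-24s %6.1f MB\n", g.label, mb)
	}
	fmt.Printf("%-24s %6.1f MB\n", "total", total)
}
```

Grouped this way, the application itself accounts for about 3 MB of the image. The rest is the base system, the packages installed on top of it, and the Go toolchain.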
This is not a very lean container, and it has a lot of intermediate layers. To reduce the size of our containers, we took two additional steps:
(1) Every repo has a clean.sh script that is run inside the container after it is initially built. Here’s part of a script for one of our Ubuntu-based Go images:
apt-get purge -y software-properties-common byobu curl git htop man unzip vim \
python-dev python-pip python-virtualenv python-dev python-pip python-virtualenv \
python2.7 python2.7 libpython2.7-stdlib:amd64 libpython2.7-minimal:amd64 \
libgcc-4.8-dev:amd64 cpp-4.8 libruby1.9.1 perl-modules vim-runtime \
vim-common vim-tiny libpython3.4-stdlib:amd64 python3.4-minimal xkb-data \
xml-core libx11-data fonts-dejavu-core groff-base eject python3 locales \
python-software-properties supervisor git-core make wget cmake gcc bzr mercurial \
libglib2.0-0:amd64 libxml2:amd64
apt-get clean autoclean
apt-get autoremove -y
rm -rf /usr/local/go
rm -rf /usr/local/go1.*.linux-amd64.tar.gz
rm -rf /var/lib/{apt,dpkg,cache,log}/
rm -rf /var/{cache,log}
(2) We run Jason Wilder’s excellent docker-squash tool. It is especially helpful when combined with the clean.sh script above.
These steps are time intensive. Cleaning and squashing take minutes and dominate the overall build and deploy time.
In the end, we have built a mostly-statically linked Go binary sitting alongside an entire Debian or Ubuntu operating system. We can do better.
Separating containers for building and running
There have been a handful of good blog posts about how to do this in the past, including one by Atlassian this week. Here’s another one from Xebia, and another from Codeship.
However, all these posts focus on building a completely static Go binary. This means you eschew cgo, and the benefits that go along with it, by setting CGO_ENABLED=0. On OS X, you lose access to the system’s SSL root CA certificates. On Linux, user.Current() from the os/user package no longer works. And in both cases you must use the Go DNS resolver rather than the one provided by the operating system. If you are not testing your application with CGO_ENABLED=0 prior to building a Docker container with it, then you are not testing the code you ship.
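One way to see these differences for yourself is to build a small probe program twice, once with the default settings and once with CGO_ENABLED=0, and compare what each binary prints. The sketch below is only an illustration: the function names are made up for this example, and the results depend on your Go version and platform (newer Go releases may behave differently from 1.4.2 when cgo is disabled).

```go
package main

import (
	"fmt"
	"net"
	"os/user"
)

// describeUser reports what user.Current returns in this build.
func describeUser() string {
	u, err := user.Current()
	if err != nil {
		return "user.Current failed: " + err.Error()
	}
	return "user.Current ok: " + u.Username
}

// describeLookup resolves host with whichever DNS resolver this build
// uses and reports the outcome.
func describeLookup(host string) string {
	addrs, err := net.LookupHost(host)
	if err != nil {
		return "lookup of " + host + " failed: " + err.Error()
	}
	return fmt.Sprintf("lookup of %s returned %d address(es)", host, len(addrs))
}

func main() {
	fmt.Println(describeUser())
	fmt.Println(describeLookup("localhost"))
}
```

Building it with `go build -o probe-cgo .` and again with `CGO_ENABLED=0 go build -o probe-nocgo .` gives two binaries to run side by side in the environment you plan to ship.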
We can use a few purpose-built base Docker images and the tricks from Jamie McCrindle’s Dockerception to build two separate Docker containers: one larger container to build our software and another smaller one to run it.
The builder
We create a Dockerfile.build, which is responsible for initializing the build environment and building the software:
FROM golang:1.4.2
RUN go get github.com/joeshaw/qotd
COPY Dockerfile.run /
# This command outputs a tarball which can be piped into
# `docker build -f Dockerfile.run -`
CMD tar -cf - -C / Dockerfile.run -C $GOPATH/bin qotd
This container, when run, will output a tarball to standard output, containing only our qotd binary and Dockerfile.run, used to build the runner.
Dynamically linked binary
Notice that we did not set CGO_ENABLED=0 here, so our binary is still dynamically linked against GNU libc:
$ ldd $GOPATH/bin/qotd
linux-vdso.so.1 (0x00007ffea6b8a000)
libpthread.so.0 => /lib/x86_64-linux-gnu/libpthread.so.0 (0x00007f6e76e50000)
libc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f6e76aa7000)
/lib64/ld-linux-x86-64.so.2 (0x00007f6e7706d000)
We need to run this binary in an environment that has glibc available to us. That means we cannot use stock BusyBox (which uses uClibc) or Alpine (which uses musl). However, the BusyBox distribution that ships with Ubuntu is linked against glibc, and that’ll be the foundation for our running container.
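If ldd is not available where you want to check a binary, the same question (which shared libraries does it need?) can be answered from Go with the standard debug/elf package. This is a sketch, not part of our build: the helper name importedLibraries is made up, and unlike ldd it only reports the libraries the binary declares directly, without resolving them to paths.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// importedLibraries returns the shared libraries an ELF binary declares
// as dependencies. An empty result suggests a statically linked binary.
func importedLibraries(path string) ([]string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return nil, err
	}
	defer f.Close()
	return f.ImportedLibraries()
}

func main() {
	// With no argument, inspect this program's own executable.
	path := os.Args[0]
	if len(os.Args) > 1 {
		path = os.Args[1]
	}
	libs, err := importedLibraries(path)
	if err != nil {
		fmt.Fprintln(os.Stderr, "inspecting binary:", err)
		os.Exit(1)
	}
	if len(libs) == 0 {
		fmt.Println(path, "declares no shared library dependencies")
		return
	}
	for _, lib := range libs {
		fmt.Println(lib)
	}
}
```

Pointed at the qotd binary from the builder, I would expect it to name the libpthread and libc entries shown in the ldd output, which is the cue that the runner image needs glibc.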
The busybox:ubuntu-14.04 image only has a root user, but you should never run network-facing servers as root, even in a container. Use my joeshaw/busybox-nonroot image, which adds a nobody user with UID 1, instead.
The runner
Now we create a Dockerfile.run, which is responsible for creating the environment in which to run our app:
FROM joeshaw/busybox-nonroot
EXPOSE 1717
COPY qotd /bin/qotd
USER nobody
CMD qotd
Putting them together
First, create the builder image:
docker build -t qotd-builder -f Dockerfile.build .
Next, run the builder container, piping its output into the creation of the runner container:
docker run --rm qotd-builder | docker build -t qotd -f Dockerfile.run -
Now we have a qotd container which has the basic BusyBox environment, plus our qotd binary. The size?
$ docker images qotd
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
qotd latest 92e7def8f105 3 minutes ago 8.611 MB
Under 9 MB. Much improved. Better still, it doesn’t require squashing, which saves us a lot of time.
Conclusion
In this example, we were able to go from a 520 MB image built from golang:1.4.2 and containing a whole Debian installation down to a 9 MB image of just BusyBox and our binary. That’s a 98% reduction in size.
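For anyone who wants to check the arithmetic, here is a small sketch using the two sizes reported by `docker images qotd` earlier in the post. The helper name reductionPercent is made up for this example.

```go
package main

import "fmt"

// reductionPercent returns how much smaller after is than before,
// expressed as a percentage of before.
func reductionPercent(before, after float64) float64 {
	return (before - after) / before * 100
}

func main() {
	// Image sizes in MB, from the two `docker images qotd` listings.
	fmt.Printf("size reduction: %.1f%%\n", reductionPercent(520.3, 8.611))
}
```

The result comes out a little over 98%, which matches the figure above.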
For one of our real services at litl, we reduced the image size from 300 MB (squashed) to 25 MB and the time to build and deploy the container from 8 minutes to 2. That time is now dominated by building the container and software, and not by cleaning and squashing the resulting image. We didn’t have to give up on using cgo and glibc, as some of their features are essential to us. If you’re using Docker to deploy services written in Go, this approach can save you a lot of time and disk space. Good luck!