One of the golden promises of pipeline as code is that your pipeline definitions (how to build, test, and perhaps deploy your application) travel with the source code, test data and even deployment scripts through the commit history of your version control repository. You can then take a snapshot of a particular point in time and know with a high degree of certainty that everything you need to recreate a particular build is kept together. This aids new starters on a project to understand how things are put together, and builds repeatability and confidence in the build/test/deploy cycle.
There are many, many fine CI/CD toolchains which support pipeline as code natively; but if you’re stuck with one that doesn’t support this, or if you want to try it out without committing yourself to a particular one, you can still create a repeatable build/test pipeline and deliver a deployable artifact using a consistent process if you’re using Docker already.
To get started, you need to have Docker 17.05 or later; that’s the only requirement. We’ll be using the multistage build feature available in that version and later. If you’ve never used Docker before, you can find an excellent introduction at the Docker site.
SETTING THE SCENE
Let’s create a build and test pipeline for a simple service, written in Go. The actual language doesn’t matter, because all the dependencies for build and test will be encapsulated inside the Docker images as the pipeline runs; I just like Go. Because this is a demonstration, we’ll build a simple service which takes a date in an unknown format, and returns an ISO8601 date in UTC.
Our service will use some external packages that we’d like to vendor so that we know which versions we’ll be using. The build will have to take into account dependency management and unit tests, plus building a final minimal deployable artifact. We’re assuming that the method of image publication and deployment won’t be part of this pipeline.
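Since the service’s whole job is date normalisation, its core logic fits in a few lines of Go. The helper below is an illustrative sketch of the approach, not the actual code from the repository – the function name toUTC and the set of accepted layouts are my assumptions:

```go
package main

import (
	"fmt"
	"time"
)

// toUTC tries a few common layouts and, on a match, returns the
// timestamp rendered as ISO8601 (RFC3339) in UTC.
func toUTC(in string) (string, error) {
	layouts := []string{
		time.RFC3339,
		time.RFC1123Z,
		"2006-01-02 15:04:05 -0700",
	}
	for _, l := range layouts {
		if t, err := time.Parse(l, in); err == nil {
			return t.UTC().Format(time.RFC3339), nil
		}
	}
	return "", fmt.Errorf("unrecognised date format: %q", in)
}

func main() {
	out, err := toUTC("Mon, 02 Jan 2006 15:04:05 +1100")
	if err != nil {
		panic(err)
	}
	fmt.Println(out) // 2006-01-02T04:04:05Z
}
```

The real service would sit behind an HTTP handler, but the parse-then-reformat core is the part worth unit testing.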
BUILDING THE BASIC PIPELINE
Since we’re using the multistage features of Docker, our pipeline definition will actually be a Dockerfile. If you’ve used Docker before, but haven’t ever used the multistage capability, it boils down to:
- Build an image using the normal Dockerfile instructions
- In the same Dockerfile, start building a second image, copying bits as needed from the first image
- Keep doing that, if you need to, copying bits from earlier images as required
- Build your final, minimal, deployable image
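In skeleton form, those steps look like the fragment below. The stage name and paths are placeholders, and the full working version appears later in this post:

```dockerfile
# Stage 1: a full-featured image with all the build tools
FROM golang:1.10 AS builder
# ... add source, run tests, compile ...

# Stage 2: a minimal runtime image
FROM scratch
# Pull only the compiled artifact across from the first stage
COPY --from=builder /path/to/binary /
ENTRYPOINT ["/binary"]
```

Only the image produced by the final FROM stage is what gets tagged and shipped; everything in earlier stages is discarded from the deliverable.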
The advantages here are enormous:
- You don’t end up shipping all your build-time dependencies or intermediate layers (of adding and removing files) to production
- You can use different images for different stages of the build pipeline, as needs demand (eg separate build, unit-test, security-scan, fuzz-test stages all from different base images)
- Your final deployable image is as small as can be, while including everything required to run your code.
- All the layers that can be cached will be, resulting in reduced build times (after the first build)
- The final deployable image can be built from a single source code repository
OUR INITIAL DOCKERFILE
Let’s start with a Dockerfile that assumes we’ve already built and tested our application code. This will look very familiar to people who already use this method for generating production Docker images.

In this example, our runtime is a statically-compiled Go binary, called utcservice. We’ll assume that it’s been compiled and tested, and is sitting in the same directory as the Dockerfile:

```dockerfile
FROM scratch
ADD utcservice /
EXPOSE 8080
ENTRYPOINT ["/utcservice"]
```
We build the container with:

```shell
localhost$ docker build -t utcservice:final .
```
which gives us a nice, tiny image to deploy:

```
localhost$ docker images utcservice
REPOSITORY    TAG     IMAGE ID       CREATED         SIZE
utcservice    final   ecbd26fc5c1a   3 minutes ago   10.1MB
```
10MB for a production deployable is pretty nice, but this is the end goal – we’ve skipped over so much! Let’s fill in the blanks.
This Dockerfile starts from the end – a tested, compiled binary. If we want unit tests (and every codebase should have unit tests, as per the test pyramid), we can make use of Go’s lovely inbuilt support for tests. Of course, this assumes that you have Go installed, and that the version is OK, and, and, and… so instead, let’s run the build and the tests inside a Docker container.
While Docker builds its images, it creates an ephemeral container for each layer. We can use those ephemeral containers to run our tests. If we start off with a base image which includes all the required Go tools, we can just add our application code, run the tests, build the binary, and away we go (pun intended):
A Dockerfile which does the test/build step looks like:

```dockerfile
# Start from an upstream, maintained build image for Go
FROM golang:1.10

# Install dep (for managing versioned Go package dependencies)
RUN curl https://raw.githubusercontent.com/golang/dep/master/install.sh | sh

# Create the location for our code, and go there
RUN mkdir -p /go/src/github.com/cevoaustralia/utcservice
WORKDIR /go/src/github.com/cevoaustralia/utcservice

# Copy our code into the work directory
ADD *.go Gopkg* /go/src/github.com/cevoaustralia/utcservice/

# Make sure we have the correct versioned dependencies installed
RUN dep ensure

# Actually run the tests, and then build the binary if the tests passed
RUN go test && go build
```
We build the image with the command:

```shell
localhost$ docker build -t utcservice:test -f Dockerfile.test-build .
```
Great! We have a single Dockerfile which installs all the versioned build-time dependencies, runs the tests and creates our run-time binary. In the days before multistage builds, we’d have a couple of choices at this point. We could:

- copy the binary out of the resulting image into the host filesystem, then add it to a new clean base image (meaning we now have to have wrapper scripts, and a host filesystem that we can access, and so on); or
- add more RUN steps to remove the build-time dependencies, resulting in the most minimal runtime image possible, at the cost of shipping all the intermediate layers of added-and-removed changes every time.
For the sake of comparison, the build-time image before any cleanup clocks in at:

```
localhost$ docker images utcservice:test
REPOSITORY    TAG    IMAGE ID       CREATED          SIZE
utcservice    test   e47540c4077c   13 minutes ago   787MB
```
787MB! That’s quite a difference, and it’s only going to get bigger if we remove build-time components, as each additional layer just adds to the deltas. In addition, if we leave it with all the build-time dependencies in it, we’re shipping all sorts of tools and things to production that probably shouldn’t be there.
Instead, let’s combine the two stages we have above – our test-and-build stage, and our final deployable stage – into a single Dockerfile. It’s easy:

```dockerfile
# Start from an upstream, maintained build image for Go.
# This time, we're giving it a friendly name for use later in the pipeline
FROM golang:1.10 AS build

# Install dep (for managing versioned Go package dependencies)
RUN curl https://raw.githubusercontent.com/golang/dep/master/install.sh | sh

# Create the location for our code, and go there
RUN mkdir -p /go/src/github.com/cevoaustralia/utcservice
WORKDIR /go/src/github.com/cevoaustralia/utcservice

# Copy our code into the work directory
ADD *.go Gopkg* /go/src/github.com/cevoaustralia/utcservice/

# Make sure we have the correct versioned dependencies installed
RUN dep ensure

# Actually run the tests, and then build a statically-linked binary
# if the tests passed
RUN go test && \
    CGO_ENABLED=0 GOOS=linux go build -a -ldflags '-extldflags "-static"'

# --------------------------
# This is the second stage -- starting from the most minimal possible image
FROM scratch

# We copy in the built binary from the previous stage
COPY --from=build /go/src/github.com/cevoaustralia/utcservice/utcservice /

# and away we go
EXPOSE 8080
ENTRYPOINT ["/utcservice"]
```
We build the final image with the command:

```shell
localhost$ docker build -t utcservice:final -f Dockerfile.test-build-runtime .
```
This is a simple 2-step build pipeline! We’ve encapsulated all our build and test dependencies, separated them from the deliverable runtime, and ended up with the same 10MB container to deploy to production.
GOING FURTHER: MULTIPLE SOURCES
Imagine we now want to combine some kind of static content with our webservice. Someone else in our organisation has already created a Docker image with the content installed, so we just have to add it. A tiny addition to the existing Dockerfile is all it takes:

```dockerfile
# … existing Dockerfile up there ^^

# Add the clock logo from the pre-built static assets image
COPY --from=dockerrepo.example.com/common/static-assets:latest /clock.png /
```
You can reference Docker images by name, and Docker will pull them down if need be. Cool, huh?
Hopefully you’ve learned a bit about Docker’s multistage feature, and one way of delivering robust, minimal, production-ready artifacts in a repeatable way. The potential value is significant: increased confidence to build and release, and a shorter path to realising a return on your investment.
If you’d like to know more, get in touch!
If you’d like to see the complete set of Dockerfiles, Go source code and so forth, you can get it at https://github.com/cevoaustralia/blog-docker-pipeline-as-code