My current org uses Docker containers heavily in our development environment. For the most part, back end engineers rarely configure the containerized environment. We have other groups that do that for us. There’s a development infrastructure group, which overlaps somewhat with the larger infrastructure group.
I get what containers are good for — they get us standardized, repeatable, isolable environments. They make it much easier to keep our development environment in sync with our production infrastructure. And they are a step up, in many ways, from the way I used to do this. The old way was just “Install the development environment and all its dependencies on my workstation,” which gets old fast, and scales poorly.
Anyway, this week I wanted to set up a brand new demo environment, so I decided to learn Docker from scratch.
It took about 6 hours start to finish, including learning how to write a FastCGI process in Ruby. Basically I built a demo project with one NGINX web server container and two back-end application server containers (one running Puma, one running a FastCGI process). Then I used it for some performance testing I wanted to do.
So these are just some notes on getting started with Docker and Docker Compose.
How do you learn your way around Docker?
For what it’s worth, this was pretty much my approach:
- Google “how to create a Docker container.”
- Figure out which of the existing docs were actually worth reading (reliable, comprehensive, readable, current).
- Set out to create the most basic possible Docker environment: an NGINX container that displayed the default homepage.
- Create a project folder on a Linux dev box that already had Docker tooling installed.
- Make a basic `docker-compose.yml` file with one service defined. (A sketch of the compose file, as it eventually ended up, appears after this list.)
- Browse around in our existing work repos to find a suitable base image for the container.
- Try a command like `docker-compose up` in my project folder. Watch it build.
- Log into the container using `docker-compose run` or `docker exec -it [container] sh`.
- Install `bash` inside the container, because `sh` was mediocre.
- Install `vim` to be able to edit the NGINX configuration interactively.
- Figure out how to generate a custom Docker image by writing a `Dockerfile`, which added custom packages and configuration to a given base image. (Learned that docker-compose is for orchestrating containers at runtime, while a Dockerfile governs image building.)
- Fiddle around with the NGINX configuration inside the container to ensure that it listened nicely on HTTP/port 80.
- Learn that you can use `nginx -s reload` to live-reload the running NGINX settings without restarting the container.
- Read the Docker docs to figure out how to expose a container (on a certain port) to the host. Use port mapping.
- Restart the containerized environment and check that you see the NGINX default homepage at `http://localhost:8088` (let’s say `8088` was the port on the host that pointed to port 80 in the container).
- Put my custom NGINX configuration in a file on the parent host. Use the `Dockerfile` to copy it into the container.
- Rebuild the container a few times to make sure it works.
- Make a second `app` service in `docker-compose.yml`, using a Ruby 2.7.6 image we had lying around.
- Stumble over the question of how to do containerized development in a more exploratory way. (Containers need to have a process running at start time, but when you’re doing new development, you might not know how to start your process just yet. I put `tail -f /dev/null` as the initial container process, after a handy Stack Overflow tip.)
- Set up a Ruby project inside the app container with a `Gemfile`.
- Realize that most Ruby web server libraries will need C development tools to build. Install them from inside the container (`gcc`, `build-essential`, etc.).
- Pick through some verbose `make` output to detect other missing dependencies. Install them too.
- After `bundle install` worked manually inside the container, I moved all the dependency setup and the actual `bundle install` command into the `Dockerfile` for my app service. (See the Dockerfile sketch after this list.)
- Set up the `Dockerfile` to copy my Ruby project onto the container during the build process (`COPY ...`).
- Google how to set up file synchronization between a container and the host file system. I used bind mounts, which are discouraged now that you’re supposed to use named volumes, but a bind mount worked just fine for my case. It’s configured in `docker-compose.yml`, as it’s a container “runtime” feature rather than a container “build” feature.
- Spend some time poking around at how Docker does virtualized networking, to try to figure out how to communicate from one container to the next (since NGINX needs to be able to reach the upstream service).
- Try using container IP addresses to communicate (172.16.x.x), but they changed sometimes when I restarted the docker-compose environment. I couldn’t readily provision them at container build time, and it seemed hacky to pass them down to NGINX at container runtime, if that is even possible.
- Look in `/etc/hosts` on the container. Didn’t help me.
- Google some questions about Docker networking.
- Realize that I’m doing it the suboptimal (basic) way, with bridge networking mode instead of something fancier. No worries there, doesn’t matter in this case.
- Read something on Stack Overflow and learn that you can just use the other container’s name as a hostname. It Just Works™ because of some custom DNS setup in Docker.
- Update the NGINX config, rebuild the environment. (See the NGINX snippet after this list.)
- OK then why does NGINX still not connect to the upstream?
- Oh right, the upstream web process needs to be listening on a public network interface instead of on localhost.
- Add the correct incantation to the Dockerfile for my `app` container, rebuild the environment.
- It works! Now it’s time to add a second app container for FastCGI (the first one used Puma) and point NGINX at both of them…
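For the record, here is roughly what the compose file looked like by the end. This is a from-memory sketch rather than the real file: the service names, build paths, and mount paths are stand-ins, and the app ports are assumptions.

```yaml
# docker-compose.yml — sketch of the three-service setup (names and paths are placeholders)
version: "3"
services:
  web:
    build: ./nginx              # custom image whose Dockerfile copies in my NGINX config
    ports:
      - "8088:80"               # host port 8088 maps to port 80 inside the container
  app:
    build: ./app                # Ruby 2.7.6 base image plus build tools and gems
    # command: tail -f /dev/null   # kept the container alive during exploratory development
    volumes:
      - ./app:/usr/src/app      # bind mount: sync the project between host and container
  app_fcgi:
    build: ./app_fcgi
    volumes:
      - ./app_fcgi:/usr/src/app_fcgi
```

With this in place, `docker-compose up --build` rebuilds the images and starts all three services, and each service can reach the others by service name (`app`, `app_fcgi`) thanks to the DNS that docker-compose sets up on its default network.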
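The app-service Dockerfile ended up along these lines. Again a reconstruction with illustrative paths, and the official `ruby:2.7.6` image standing in for the base image we actually used; the important parts are installing the C toolchain, running `bundle install` at build time, and binding the server to 0.0.0.0 rather than localhost so NGINX can reach it.

```dockerfile
# Dockerfile for the Puma app service (a sketch; paths and the port are illustrative)
FROM ruby:2.7.6

# C toolchain needed to compile native gem extensions
RUN apt-get update && apt-get install -y build-essential

WORKDIR /usr/src/app

# Install gems first so this layer is cached unless the Gemfile changes
COPY Gemfile Gemfile.lock ./
RUN bundle install

# Copy the rest of the project into the image
COPY . .

# Bind to 0.0.0.0, not 127.0.0.1, so other containers (i.e. NGINX) can connect
CMD ["bundle", "exec", "puma", "-b", "tcp://0.0.0.0:9292"]
```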
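And the interesting part of the NGINX configuration is simply that the upstream hosts are the compose service names; Docker’s internal DNS resolves them to the right containers. The ports and locations here are hypothetical, and the FastCGI process is assumed to listen on a TCP port.

```nginx
# NGINX config fragment (sketch): compose service names act as hostnames
server {
    listen 80;

    location / {
        proxy_pass http://app:9292;      # Puma container, assumed port 9292
    }

    location /fcgi/ {
        include fastcgi_params;
        fastcgi_pass app_fcgi:9000;      # FastCGI container, assumed port 9000
    }
}
```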
In the end I had a containerized environment with an NGINX container plus two upstream containers (Puma and FastCGI).
Then I was able to finish my little demo project, doing some basic performance testing for different Ruby web server processes. (In particular, I was curious about comparative memory usage for Puma, WEBrick, Unicorn and FastCGI-based back end servers. TLDR: FastCGI uses much less runtime memory than any of the alternatives.)
How I didn’t learn my way around Docker
Note that I didn’t do any of these other possible strategies:
- Run `man docker`.
- Read a technical book about Docker. (I’m sure there are good ones.)
- Watch videos about Docker. (I’m kind of a text-based person.)
- Ask a colleague for assistance. (I have lots of highly experienced colleagues in this area, but they’re all busy and it’s fun to teach myself.)
- Use an existing containerized environment as a point of departure, and then customize it. (I built from scratch instead).
- Have a completely clear plan about how the environment needed to work (e.g. networking, volume mounting). (I was OK with not knowing exactly what I was going to do, as long as it worked in general.)
- Use best practices for production-ready containers. (Some of the best practices are too heavy-duty for a basic use case.)
To be clear, any of these approaches would have been valid! I just didn’t use them.
I was happy with my very hands-on, iterative, solo approach.
Reflections on Docker
I dislike the way that Docker can become a black box in my organization, maintained by specialists even though we all use it all day. I courteously object to that arrangement, because what Docker does is really just the basics of Linux-based systems administration, organized in a particular way around a particular core abstraction. I think developers should know their way around those things, even if we don’t know every detail of a complex dev environment.
Anyway, once I dug into it, it wasn’t that hard to understand Docker because I already knew some basic Linux systems administration things, e.g. about networking, file systems, package management, and OS virtualization. So I just applied what I already knew to the Docker environment, trying to figure out “How do I do that here?” Once I thought of it that way, it was all relatively easy.
(It helps that documentation was so easy to find, since Docker is common, well-documented technology.)
I didn’t love some of the inconsistencies between Docker and Docker Compose. I guess they are technically two separate tools, but I wanted them to feel more like an integrated system, instead of having one DSL for one of them and another for the other.
But I did appreciate how Docker pushes you into an ephemeral, fully declarative environment setup.* With a long-running virtual Linux system, even if you use something like Ansible for initial setup, it can be tempting to make custom tweaks to a running environment, ignoring your own configuration management. It’s very hard to do this with a containerized environment; you find yourself rebuilding the containers pretty frequently. This causes you to put all the setup in the relevant `Dockerfile`, with no cheating.
(Fortunately, it’s still possible to log into a container and interactively configure it. If you look at my notes above, I frequently started out with “How do I do XYZ from a shell inside the container,” and only subsequently moved the incantation into the `Dockerfile`. This speeds up the dev feedback loop.)
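In practice that loop looked something like this (the service name is whatever you called it in `docker-compose.yml`; `vim` is just the example package from my notes above):

```sh
# On the host: open a shell inside the running app container...
docker-compose exec app bash

# ...inside the container, work out the incantation by hand:
apt-get update && apt-get install -y vim

# ...then, once it works, bake it into the Dockerfile so every rebuild has it:
# RUN apt-get update && apt-get install -y vim
```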
It was a fun afternoon of digging into this stuff, honestly. It’s not every day I learn new things.
* To be precise, a Dockerfile is an imperative build script, but docker-compose wraps it in a declarative configuration system.