
This is the first part of a series in which I will present a pattern for integration testing of Kafka consumers using Burrow and docker-compose. In this post we will cover how to build a common docker image that we will then use to run both Kafka and Zookeeper in a local docker-compose cluster. In the rest of the series we will continue by creating a simple smoke test to confirm that the consumer is successfully committing its offsets back to the Kafka cluster, giving us assurance that processing progress will not be lost.
The project associated with this series is available on GitHub here. In this post we will cover the contents of the first commit.
This is a cross-post from my personal blog, which you can follow here.
Motivations
Unit tests are invaluable tools for verifying that the internals of an application are working as expected, but when it comes to our application's interactions with other components over a network we need to take a different approach.
Tools like docker-compose allow us to easily set up a production-like environment that enables us to test at the system level, verifying the behaviour of the application over the network boundary where bugs often occur. Being able to do this on a local machine (as well as in a CI pipeline) shortens the development feedback loop, meaning that we catch bugs earlier, which in turn improves developer productivity.
Common Dockerfile
We define a multistage build to create our common image. In the build stage we download Kafka from the Apache archive, verify its checksum, and extract it. In the following stage we build the common image that will be the base from which we build our Zookeeper and Kafka images, by copying over the verified install of Kafka from the builder image and installing the dependencies needed to run it in the docker-compose environment:
- openjdk-11-jre-headless - the Java runtime environment, needed to run Kafka and Zookeeper.
- wait-for-it - used to configure docker-compose health checks.
- ncat - used to open network ports on our containers to signal to docker-compose that they are healthy.
FROM ubuntu:latest as builder
RUN apt-get update && apt-get -y dist-upgrade
RUN apt-get -y --no-install-recommends install \
    curl \
    ca-certificates
WORKDIR /tmp
COPY SHA512SUMS .
RUN curl -fsSL -o kafka_2.13-2.5.1.tgz https://archive.apache.org/dist/kafka/2.5.1/kafka_2.13-2.5.1.tgz
RUN sha512sum --check SHA512SUMS
RUN tar -C /opt -zxf kafka_2.13-2.5.1.tgz
FROM ubuntu:latest
RUN apt-get update && apt-get -y dist-upgrade
RUN apt-get -y --no-install-recommends install \
    openjdk-11-jre-headless \
    wait-for-it \
    ncat && \
    apt-get clean all
COPY --from=builder /opt/kafka_2.13-2.5.1 /opt/kafka
WORKDIR /opt/kafka
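# keep the container alive (while remaining responsive to SIGTERM/SIGINT) when the
# common image is run directly, e.g. to copy default config files out of it later in this post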
CMD trap : TERM INT; sleep infinity & wait
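The SHA512SUMS file copied into the build stage is not shown above. One way to produce it (a sketch, not part of the repository) is to download the archive once on your machine, record its checksum in the format that sha512sum --check expects, and compare it against the checksum published on the Apache archive before committing it:
# from within the directory with the common Dockerfile
$ curl -fsSL -O https://archive.apache.org/dist/kafka/2.5.1/kafka_2.13-2.5.1.tgz
$ sha512sum kafka_2.13-2.5.1.tgz > SHA512SUMS
# manually compare the recorded hash against
# https://archive.apache.org/dist/kafka/2.5.1/kafka_2.13-2.5.1.tgz.sha512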
Zookeeper
Zookeeper provides a centralised service to manage synchronisation and configuration for a Kafka cluster. It is responsible for keeping track of the status of broker nodes, ACLs, and topic configuration, to name a few. For a more in-depth discussion of Zookeeper's role in Kafka clusters I defer to this article.
Zookeeper Dockerfile
Our Zookeeper Dockerfile is now very simple: all it needs to do is build from the common image and override the CMD to run the Zookeeper start script that comes bundled with the Kafka download we installed, using the default configuration settings defined in config/zookeeper.properties.
FROM common
CMD ["bin/zookeeper-server-start.sh", "config/zookeeper.properties"]
Kafka Dockerfile
Our Kafka Dockerfile is almost as simple as Zookeeper's: we similarly build from the common image and start the Kafka server using the bundled script. However, we also copy over a config directory containing server.properties, since we need to make a small change to the default configuration to tell the server where Zookeeper is running.
FROM common
COPY config/ ./config/
CMD ["bin/kafka-server-start.sh", "./config/server.properties"]
The version of Kafka that we installed in the common image contains a copy of server.properties populated with default values, so we can make a local copy that we can edit by building our common image, running a container from it, and copying the file out with docker cp:
# from within the directory with the common Dockerfile
$ docker build -t common .
$ docker run --rm -it --name common common bash
# from another shell outside of the common container
$ mkdir -p config
$ docker cp common:/opt/kafka/config/server.properties ./config/server.properties
The only change we need to make in this file is to update the value of zookeeper.connect to zookeeper:2181, since we will set the hostname of the Zookeeper container to zookeeper in docker-compose.
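After the edit, the relevant line in our local config/server.properties is simply:
# the bundled default is localhost:2181, which would point the broker at its own container rather than at Zookeeper
zookeeper.connect=zookeeper:2181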
Configuring docker-compose
We define common, kafka, and zookeeper services in our docker-compose.yml file. Although we do not need a running instance of the common container for our tests, it is still specified here so that docker-compose build builds the image, as it is the shared base image of the other two services.
Kafka depends on Zookeeper for orchestration of broker nodes in the distributed system. Even though we have only one broker server in our example, the dependency still exists, so we make it explicit to docker-compose using depends_on in conjunction with service_healthy and healthcheck. Note that service_healthy is not available in version 3.x of the compose file format, so make sure you are using 2.x. It is also worth noting that it is possible to configure docker-compose to run multiple broker servers by adjusting the scale parameter; however, we will not do so in this example since that requires additional per-broker configuration in server.properties, which goes beyond the scope of this series.
The health checks use the wait-for-it package that we installed in the common Dockerfile, and check whether a port is open on localhost. In practice this means that once the zookeeper service is running, docker-compose will consider it healthy as long as port 2181 is open, after a startup grace period of 10 seconds. Once zookeeper passes its first health check, docker-compose will then start the kafka service.
version: "2.4"
services:
common:
image: common
build:
context: common/
kafka:
hostname: kafka
build: broker/
healthcheck:
test: ["CMD", "wait-for-it", "--timeout=2", "--host=localhost", "--port=9092"]
timeout: 2s
retries: 12
interval: 5s
start_period: 10s
depends_on:
zookeeper:
condition: service_healthy
zookeeper:
hostname: zookeeper
build: zookeeper/
healthcheck:
test: ["CMD", "wait-for-it", "--timeout=2", "--host=localhost", "--port=2181"]
timeout: 2s
retries: 12
interval: 5s
start_period: 10s
Running docker-compose
We are now ready to run our local Kafka setup with docker-compose up --build! You should see from the logs that both services start up without issue. When you are done, run docker-compose down to clean up the containers.
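If you want to poke around before tearing things down, here is a rough sketch of a manual sanity check using the scripts bundled with Kafka (the smoke-test topic name is just an example):
# list the broker ids registered in Zookeeper - a single broker shows up as [0] with the default broker.id
$ docker-compose exec zookeeper bin/zookeeper-shell.sh localhost:2181 ls /brokers/ids
# create a topic through the broker and list it back
$ docker-compose exec kafka bin/kafka-topics.sh --bootstrap-server localhost:9092 --create --topic smoke-test --partitions 1 --replication-factor 1
$ docker-compose exec kafka bin/kafka-topics.sh --bootstrap-server localhost:9092 --list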
If we had not overridden the default configuration for the Kafka server to specify where Zookeeper is running, we would have observed connectivity issues when the services started. Go ahead and change the value of zookeeper.connect to some other value and rebuild the images. When you run docker-compose up again after the rebuild, you should see Kafka complain in the logs that it cannot connect to Zookeeper before failing and exiting with error code 1.
Still to Come
In the next part of this series we will introduce Burrow and use it to run a very simple test. We will configure another service in docker-compose whose responsibility it will be to create a topic, produce a known quantity of messages to the topic, and consume a known quantity of messages from the topic. Burrow will be used to verify that production and consumption both occurred as expected.
Later in the series we will look at how to create a simple consumer using Scala and fs2-kafka, and how to test it with the docker-compose pattern. It is worth noting, however, that the choice of language and framework is not really important: as long as your producers and consumers run inside docker containers, you can use the pattern presented here.