OVO Tech Blog

Reduce docker images size with Alpine Linux

Introduction

Guido Maresca


docker sbt technology

Posted by Guido Maresca on .

At Boost we develop most of our backend services in Scala, build them with sbt, and deploy them as Docker containers on Kubernetes.
The following is something we found out while creating a new service.

Openjdk docker images: 8-jre vs 8-jre-alpine

So your code finally seems to work, huh?
You have unit-integration-end2end tests passing and you can't wait to deploy your new shiny masterpiece to the cloud?
Let's just package it locally and docker-run it to be safe.

sbt docker:publishLocal

What have we got here?
[Image: docker images output listing the new service image at 537MB and openjdk:8-jre at 443MB]
Ahhh, nicely packaged and ready to go...wait a second, 537MB?! What's going on with my simple service?!
I can see an openjdk:8-jre of 443MB as well, maybe that has something to do with this. It's probably time to take a look at the code.

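// Imports assumed for sbt-native-packager 1.x (not shown in the original snippet).
// TypesafeDocker is presumably the plugin's Docker configuration, renamed to avoid
// clashing with this object.
import sbt._
import sbt.Keys._
import com.typesafe.sbt.packager.Keys._
import com.typesafe.sbt.packager.archetypes.JavaAppPackaging
import com.typesafe.sbt.packager.docker.{DockerAlias, DockerPlugin}
import com.typesafe.sbt.packager.docker.DockerPlugin.autoImport.{Docker => TypesafeDocker}

object Docker extends AutoPlugin {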

  private val default =
    Seq(
      aggregate in TypesafeDocker := false,
      mainClass in Compile := Some("uk.co.boostpower.****"),
      dockerBaseImage := "openjdk:8-jre", // <== THIS LINE
      dockerUpdateLatest := true,
      dockerBuildOptions := "--rm=false" +: dockerBuildOptions.value.tail,
      packageName in TypesafeDocker := moduleName.value,
      packageSummary in TypesafeDocker := name.value,
      packageDescription in TypesafeDocker := description.value,
      maintainer in TypesafeDocker := "ovopayg",
      dockerAlias := {
        DockerAlias(
          registryHost = sys.env.get("GCR_PREFIX") orElse Some("gcr.io"),
          username = sys.env.get("NONPROD_PROJECT_ID") orElse Some("boost-****"),
          name = sys.env.getOrElse("CONTAINER_NAME", moduleName.value),
          tag = Some(version.value)
        )
      })

  object autoImport {
    implicit final class DockerSettings(val project: Project) extends AnyVal {
      def withDocker: Project =
        project
          .enablePlugins(JavaAppPackaging, DockerPlugin)
          .settings(default)
    }
  }

}
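
For context, here is a minimal sketch of how a build.sbt could pick up these settings. It assumes the Docker object above lives under project/, so that sbt discovers it and auto-imports withDocker; the project name and directory are hypothetical.

```scala
// build.sbt (sketch): `myService` and the "my-service" directory are hypothetical names.
// `withDocker` comes from the autoImport object of the Docker AutoPlugin defined above.
lazy val myService = (project in file("my-service"))
  .withDocker
```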

As specified in the sbt-native-packager documentation, we are using the openjdk:8-jre image, which is based on debian:jessie.
A complete Debian image is probably overkill for most (micro)services, but worry not! There are a few alternatives.
In this article we are going to use openjdk:8-jre-alpine, which is based on Alpine Linux. It is still a fully fledged Linux distribution, but a really lightweight one! Let's see what the difference is in practical terms.
First, we are going to change the base image in our Docker configuration.

object Docker extends AutoPlugin {

  private val default =
    Seq(
      aggregate in TypesafeDocker := false,
      mainClass in Compile := Some("uk.co.boostpower.****"),
      dockerBaseImage := "openjdk:8-jre-alpine", // <== THIS LINE
      dockerUpdateLatest := true,
      dockerBuildOptions := "--rm=false" +: dockerBuildOptions.value.tail,
      packageName in TypesafeDocker := moduleName.value,
      packageSummary in TypesafeDocker := name.value,
      packageDescription in TypesafeDocker := description.value,
      maintainer in TypesafeDocker := "ovopayg",
      dockerAlias := {
        DockerAlias(
          registryHost = sys.env.get("GCR_PREFIX") orElse Some("gcr.io"),
          username = sys.env.get("NONPROD_PROJECT_ID") orElse Some("boost-*****"),
          name = sys.env.getOrElse("CONTAINER_NAME", moduleName.value),
          tag = Some(version.value)
        )
      })

  object autoImport {
    implicit final class DockerSettings(val project: Project) extends AnyVal {
      def withDocker: Project =
        project
          .enablePlugins(JavaAppPackaging, DockerPlugin)
          .settings(default)
    }
  }

}

Time to publish again!

sbt docker:publishLocal

Let's see the results.
[Image: docker images output listing the service image at 177MB and openjdk:8-jre-alpine at 81.9MB]
Wow! Just by adding "-alpine" to our base image, the size went from 537MB to 177MB! Isn't that awesome?!
We can also see why! The openjdk docker image based on Alpine Linux is just 81.9MB, as opposed to the 443MB of the debian:jessie one.
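
To make the comparison concrete, the arithmetic can be sketched in a few lines of plain Scala. The sizes are the ones quoted in this article; the object and value names are only illustrative.

```scala
object ImageSavings {
  // Sizes in MB, taken from the figures quoted in this article.
  val debianService = 537.0
  val alpineService = 177.0
  val debianBase    = 443.0
  val alpineBase    = 81.9

  // Total saving on the service image, absolute and relative.
  val saved        = debianService - alpineService
  val savedPercent = saved / debianService * 100

  // Difference between the two base images.
  val baseDelta = debianBase - alpineBase

  // What the service itself adds on top of each base image.
  val appLayersDebian = debianService - debianBase
  val appLayersAlpine = alpineService - alpineBase

  def main(args: Array[String]): Unit = {
    println(f"Saved on the service image: $saved%.1f MB ($savedPercent%.1f%%)")
    println(f"Base image difference:      $baseDelta%.1f MB")
    println(f"Application layers:         $appLayersDebian%.1f MB vs $appLayersAlpine%.1f MB")
  }
}
```

The 360MB saved on the service image (roughly two thirds of the original size) is almost entirely explained by the 361.1MB difference between the base images; the layers added by the service itself stay at around 94-95MB in both cases.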
Finally, it's time to run our beloved service!

docker run 1590cfe324bc

[Image: docker run output showing an error about a missing bash]
Hey, what does that mean?! Who stole my bash?!
A quick google search will point us to the answer: Almquist shell.

Embedded Linux

Ash is also fairly popular in embedded Linux systems; its code was incorporated into the BusyBox catch-all executable often employed in this area, and is used in distributions like DSLinux, Alpine Linux, Tiny Core Linux and Linux-based router firmware such as OpenWrt, Tomato and DD-WRT.

So, our Docker image doesn't have a bash shell, only ash... How are we supposed to run our Java-packaged service without bash?
Fortunately, the sbt-native-packager documentation tells us how to package an application so that it runs in an ash shell.

object Docker extends AutoPlugin {

  private val default =
    Seq(
      aggregate in TypesafeDocker := false,
      mainClass in Compile := Some("uk.co.boostpower.****"),
      dockerBaseImage := "openjdk:8-jre-alpine",
      dockerUpdateLatest := true,
      dockerBuildOptions := "--rm=false" +: dockerBuildOptions.value.tail,
      packageName in TypesafeDocker := moduleName.value,
      packageSummary in TypesafeDocker := name.value,
      packageDescription in TypesafeDocker := description.value,
      maintainer in TypesafeDocker := "ovopayg",
      dockerAlias := {
        DockerAlias(
          registryHost = sys.env.get("GCR_PREFIX") orElse Some("gcr.io"),
          username = sys.env.get("NONPROD_PROJECT_ID") orElse Some("boost-*****"),
          name = sys.env.getOrElse("CONTAINER_NAME", moduleName.value),
          tag = Some(version.value)
        )
      })

  object autoImport {
    implicit final class DockerSettings(val project: Project) extends AnyVal {
      def withDocker: Project =
        project
          .enablePlugins(AshScriptPlugin, DockerPlugin) // <== THIS LINE
          .settings(default)
    }
  }

}
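
If you are not using a custom AutoPlugin like ours, the same idea can be sketched directly in build.sbt. This is a minimal sketch for a single-project build, assuming sbt-native-packager is already on the plugin classpath; it is not our actual configuration.

```scala
// build.sbt (sketch): single-project build, sbt-native-packager assumed to be installed.
// AshScriptPlugin generates a start script that does not require bash.
enablePlugins(JavaAppPackaging, AshScriptPlugin, DockerPlugin)

// Alpine-based base image, as used throughout this article.
dockerBaseImage := "openjdk:8-jre-alpine"
```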

Did that work? Let's see...
[Image: docker run output showing the service starting up]
Hooray! Our service starts normally and we saved a whopping 360MB in size!!
Wait a minute... we also use Docker to run integration and end-to-end tests! Shall we take a look at the size of those images as well?

The images we use in our tests are influxdb, postgres and landoop/fast-data-dev (Kafka). Another quick search in the Docker repositories tells us that there are alpine images for both influxdb and postgres.
Let's see the difference when we change that too!
[Image: docker images output comparing the standard and alpine variants of postgres and influxdb]
Awesome! Here as well, just by adding the "-alpine" to our postgres and influxdb docker images, we managed to save 196MB and 124MB respectively!
Regarding the landoop/fast-data-dev (Kafka), unfortunately there doesn't seem to be an alpine version available (yet).
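
The "-alpine" naming convention mentioned above can be sketched as a small helper. The function is hypothetical and only builds the image name; it does not check that such a tag is actually published, which, as the Kafka image shows, is not always the case. The version tags in the usage comments are just examples.

```scala
object AlpineTags {
  // Hypothetical helper: derives the conventional alpine tag for an image reference.
  // An untagged image gets ":alpine", a tagged image gets "-alpine" appended to its tag.
  def alpineVariant(image: String): String = {
    // Only look for a tag separator after the last '/', so a registry port is not mistaken for a tag.
    val lastSlash = image.lastIndexOf('/')
    val colon     = image.indexOf(':', lastSlash + 1)
    if (colon < 0) s"$image:alpine" else s"$image-alpine"
  }

  def main(args: Array[String]): Unit = {
    println(alpineVariant("postgres"))     // untagged image
    println(alpineVariant("postgres:9.6")) // tagged image (example version)
    println(alpineVariant("influxdb:1.4")) // tagged image (example version)
  }
}
```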

Adding packages to the docker image

What if you need to install packages, such as curl, openssl or even bash (because you love it so much that you can't live without it)?!
Alpine Linux comes with its own package manager, apk, and a well-stocked repository, so you just need to add the following lines to your configuration:

object Docker extends AutoPlugin {

  private val default =
    Seq(
      aggregate in TypesafeDocker := false,
      mainClass in Compile := Some("uk.co.boostpower.****"),
      dockerBaseImage := "openjdk:8-jre-alpine",
      dockerUpdateLatest := true,
      dockerBuildOptions := "--rm=false" +: dockerBuildOptions.value.tail,
      packageName in TypesafeDocker := moduleName.value,
      packageSummary in TypesafeDocker := name.value,
      packageDescription in TypesafeDocker := description.value,
      maintainer in TypesafeDocker := "ovopayg",
      dockerAlias := {
        DockerAlias(
          registryHost = sys.env.get("GCR_PREFIX") orElse Some("gcr.io"),
          username = sys.env.get("NONPROD_PROJECT_ID") orElse Some("boost-****"),
          name = sys.env.getOrElse("CONTAINER_NAME", moduleName.value),
          tag = Some(version.value)
        )
      },
      // THESE LINES
      dockerCommands := {
        val extraDockerCommands = Seq(Cmd("RUN", "apk --update add bash openssl curl"))
        dockerCommands.value.head +: extraDockerCommands ++: dockerCommands.value.tail
      // <---------->  
      })

  object autoImport {
    implicit final class DockerSettings(val project: Project) extends AnyVal {
      def withDocker: Project =
        project
          .enablePlugins(AshScriptPlugin, DockerPlugin)
          .settings(default)
    }
  }

}

This way, you can add any package you need to your docker image, before publishing it!
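
The dockerCommands expression above is compact, so here is a sketch of what the splice does, using plain strings as stand-ins for the generated commands. The three "generated" entries are illustrative only; the real list holds Cmd values and its contents depend on the sbt-native-packager version, so it is worth checking the generated Dockerfile.

```scala
object DockerCommandSplice {
  // Stand-ins for the generated Dockerfile commands (illustrative, not the real list).
  val generated = Seq("FROM openjdk:8-jre-alpine", "WORKDIR /opt/docker", "ADD opt /opt")

  // The extra command we want to run right after the first generated command.
  val extra = Seq("RUN apk --update add bash openssl curl")

  // +: and ++: both end in a colon, so they are right-associative:
  // this parses as generated.head +: (extra ++: generated.tail).
  val spliced = generated.head +: extra ++: generated.tail

  def main(args: Array[String]): Unit = spliced.foreach(println)
}
```

The extra RUN command ends up immediately after the first generated command (the FROM line in this sketch), so the packages are installed before the application files are added. Since the goal here is a small image, a common variant is apk add --no-cache bash openssl curl, which avoids leaving the package index in the layer.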

Conclusion

While reducing Docker image size might be of little importance when you have access to fast and reliable networks, it's always good to know what the other options are and what benefits they bring (not limited to disk space).
For example, when using third-party CI/CD services, the pipelines might need to pull some images at least once, if not on every run.
Smaller images can shorten the execution times of those pipelines, leading to quicker integration/deployment.
Also, as explained in many articles such as this one, Alpine Linux brings many advantages in terms of security, performance and so on.
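
To put the CI point in rough numbers, here is a back-of-the-envelope sketch. The throughput figure is a made-up assumption, not a measurement, and the sizes are the uncompressed ones reported by docker images; registries transfer compressed layers, so real pulls move less data.

```scala
object PullTimeEstimate {
  // Assumption: a hypothetical sustained throughput of 10 MB/s between the CI runner and the registry.
  val assumedThroughputMBps = 10.0

  // Rough estimate: seconds = size in MB / throughput in MB per second.
  def pullSeconds(sizeMB: Double): Double = sizeMB / assumedThroughputMBps

  def main(args: Array[String]): Unit = {
    println(f"Debian-based service image: ${pullSeconds(537.0)}%.1f s")
    println(f"Alpine-based service image: ${pullSeconds(177.0)}%.1f s")
  }
}
```

Under that assumption the pull drops from just under a minute to well under twenty seconds, and the saving repeats on every run that cannot reuse cached layers.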
