OVO Tech Blog

Continuous Deployment of Versioned AWS Lambdas

Introduction

Tom Dane

Tom is a Software Engineer at Kaluza


So you’ve decided to go serverless! You want to focus on writing code and let AWS Lambda handle all that painful provisioning, maintenance, and scaling of backend servers. Being up to date with DevOps best practices, you’re looking to manage your AWS Lambdas with Infrastructure-as-Code (IaC). Lambdas are easy to get started with, but have their idiosyncrasies when it comes to building a smooth deployment pipeline. In this article I am going to cover how we use the version and alias features of AWS Lambda along with terraform as an IaC tool to manage the continuous integration and deployment (CI/CD) of serverless applications.

What exactly is Lambda? At a high level, it’s the AWS event-driven compute service: event-driven in that you can trigger Lambdas from a range of other AWS services, such as S3 file uploads and SQS/SNS messages, or use them as a backend to an API Gateway. Lambda has managed concurrency, allowing extreme scalability, and loads of great built-in logging and monitoring features. Fundamentally, a Lambda function is defined by the code that is executed when it is invoked, as well as the runtime instance configuration, including things like environment variables, memory, and timeout.

Building and Deploying Lambda Functions

We could simply provision a Lambda, then update the configuration or code in place. Any new instances will then make use of the updated configuration. Working in this way, however, you will quickly run into two common pitfalls. Firstly, the changes may take some time to be adopted; “warm” Lambdas will continue to use the configuration they were instantiated with, leading to uncontrolled and difficult-to-predict deployments. Secondly, it can pose problems during development: multiple engineers editing the live configuration is a recipe for disaster.

Enter Lambda versions, which allow you to publish numbered copies of your function with the code and configuration “frozen” in an immutable version. We can solve both of our previous problems by pointing our production service at a specific version of the Lambda. Engineers can publish new versions during development, and when we’re happy we can re-point our production service to that new version. Versions also enable controlled deployment, faster rollback, and cool stuff like canary deployments and A/B testing (by splitting traffic between versions).

Lambda versions can also be aliased, assigning them a friendly name. Instead of our application pointing at an explicitly numbered version, we can instead reference the named alias (e.g. main). Using aliases, releasing to production is simply a case of moving the alias main from the old version to the new. We can also use aliases on Lambdas built from development branches to make it easier to identify and test specific versions.
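The traffic splitting used for canary deployments is configured on the alias itself. As a rough sketch (the function name api-backend, the version numbers, the 10% weight and the helper routing_config are all illustrative assumptions rather than part of our pipeline), the AWS CLI update-alias command accepts a --routing-config argument that sends a fraction of invocations to a second version:

```shell
#!/bin/bash
# Sketch: build the --routing-config value for a weighted (canary) alias.
# routing_config is a hypothetical helper; the version and weight are examples.
routing_config() {
  local canary_version="$1" weight="$2"
  printf 'AdditionalVersionWeights={"%s"=%s}\n' "$canary_version" "$weight"
}

# Keep the alias on version 2, but send 10% of invocations to version 5:
#   aws lambda update-alias \
#     --function-name api-backend \
#     --name main \
#     --function-version 2 \
#     --routing-config "$(routing_config 5 0.1)"
#
# Stop the canary by clearing the additional weights:
#   aws lambda update-alias --function-name api-backend --name main \
#     --function-version 2 --routing-config 'AdditionalVersionWeights={}'
```

Once the canary looks healthy, moving the alias fully to the new version is a plain update-alias with the new --function-version.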

Let’s take a more detailed look at an application lifecycle as represented above. We have a production application which is an API Gateway invoking the Lambda version aliased with main, currently version 2. Engineer Sandy is implementing a new feature which will provide additional logging on the endpoint. Sandy branched off the main branch. When she commits to the remote repository a new version of the Lambda will be published. This new version will have an alias based on the branch name (feature-123). When she is satisfied the code works, she merges her code into the main branch. This time a new version is published and the alias main is shifted from version 2 to version 5, effectively deploying the backend referenced by the production application.
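To make the idea of "pointing" at an alias concrete: callers address an alias either through a qualified ARN, which ends in the alias name, or through the --qualifier option of the AWS CLI. The sketch below assumes a placeholder region and account ID, and qualified_arn is a hypothetical helper used only for illustration:

```shell
#!/bin/bash
# Sketch: build the qualified ARN that pins a caller to one alias.
# qualified_arn is a hypothetical helper; region and account ID are placeholders.
qualified_arn() {
  local region="$1" account_id="$2" function_name="$3" alias_name="$4"
  printf 'arn:aws:lambda:%s:%s:function:%s:%s\n' \
    "$region" "$account_id" "$function_name" "$alias_name"
}

# The production integration would target the main alias:
#   qualified_arn eu-west-1 123456789012 api-backend main
#
# Sandy can exercise her branch build directly, without touching production:
#   aws lambda invoke --function-name api-backend \
#     --qualifier feature-123 response.json
```

Because the production caller only ever knows the alias, moving main to a new version needs no change on the caller's side.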

Implementing the Workflow

There is one unintuitive feature of AWS Lambdas: the function code must be provided to the resource at the point of creation. This is unlike traditional server-based applications, where you would typically provision the infrastructure then separately deploy the application code to the server. For this reason, we find it easier to treat the entire Lambda function (code, config and all) as a single entity. We manage everything about our Lambdas with terraform as our IaC tool, except the alias. Why? Because aliases are quite difficult to manage in a declarative language like terraform, and we like to have more fine-grained control over when aliases are created or updated.

To implement this workflow we will need three steps that a CI/CD pipeline runs:

  • Build the function code package.
  • Publish a new version of the Lambda with terraform using the built function code.
  • Create (or update) the alias for this version using the branch name.

In the following sections I’m going to walk through how we implement this deployment workflow. Let’s assume we want to build a Lambda called api-backend, which is written in Node.js. That source code is in a repository of the same name that looks like this:

api-backend/
  src/
    index.js
    implementation.js
    main.js
  package.json
  package-lock.json

Building the Lambda Code Package

The code executed by a Lambda function can be a single file, but sooner or later you will need to start structuring the code into modules and bundling third-party dependencies. At this point you have to compress your source code directory into a zip file. Building our api-backend function zip file boils down to installing the dependencies, building the distribution and then zipping the files into lambda.zip. For convenience, we can put those steps in a bash script in our repo under bin/build_lambda_package.sh so that it can be used in our CI pipeline:

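#!/bin/bash
set -euo pipefail  # abort on the first failing command
# Start clean: zip would otherwise update a stale archive in place
rm -rf ./dist lambda.zip
# Transpile src/ into dist/ using a locally installed Babel CLI
npm install --no-save @babel/cli
npx babel src -d dist --copy-files
# Reinstall production dependencies only, then package everything
npm ci --production
zip -r -q lambda.zip dist node_modules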
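Bundling node_modules can make the archive grow quickly, and Lambda rejects zipped packages larger than 50 MB when they are uploaded directly. A small guard at the end of the build can catch that before terraform does. This is only a sketch: check_package_size is a hypothetical helper, not something from our actual build script.

```shell
#!/bin/bash
# Sketch: fail the build early if the package exceeds the direct-upload limit.
# check_package_size is a hypothetical helper, not part of the original script.
MAX_ZIP_BYTES=$((50 * 1024 * 1024))  # 50 MB limit for zipped direct uploads

check_package_size() {
  local zip_file="$1" limit="${2:-$MAX_ZIP_BYTES}" size
  [ -f "$zip_file" ] || { echo "missing package: $zip_file" >&2; return 1; }
  # wc -c pads its output on some platforms, so strip the spaces.
  size=$(wc -c < "$zip_file" | tr -d ' ')
  if [ "$size" -gt "$limit" ]; then
    echo "$zip_file is $size bytes, over the $limit byte limit" >&2
    return 1
  fi
  echo "$zip_file is $size bytes"
}

# Usage at the end of bin/build_lambda_package.sh:
#   check_package_size lambda.zip
```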

Terraforming the Lambda Function

Now that we have our function code package, we’re going to provision a Lambda function with terraform. I’m not going to go into the basics of terraform here and assume a certain level of familiarity. The official docs provide a great introduction and there’s a wealth of tutorials online. For now, we’re going to create a terraform directory with a lambda.tf file containing the code needed to provision and deploy a Lambda:

resource "aws_iam_role" "lambda_role" {
  name_prefix = "api-backend"

  assume_role_policy = jsonencode({
    "Version" : "2012-10-17",
    "Statement" : [
      {
        "Action" : "sts:AssumeRole",
        "Principal" : {
          "Service" : "lambda.amazonaws.com"
        },
        "Effect" : "Allow"
      }
    ]
  })
}
 
 
resource "aws_lambda_function" "lambda" {
  filename         = "../lambda.zip"
  function_name    = "api-backend"
  source_code_hash = filebase64sha256("../lambda.zip")
  role             = aws_iam_role.lambda_role.arn
  handler          = "dist/main.handler"

  # nodejs12.x has since been deprecated by AWS; substitute a currently supported Node.js runtime
  runtime     = "nodejs12.x"
  memory_size = 128
  timeout     = 120

  publish = true
}
 
 
output "lambda_version" {
  value = aws_lambda_function.lambda.version
}
 
 
output "lambda_name" {
  value = aws_lambda_function.lambda.function_name
}

There are a few points to note on this example:

  • The configuration will create two AWS resources: an IAM role and the Lambda function itself. The IAM role here is a very minimal example.
  • We specify the function code using the filename argument pointing to the locally built zip file described earlier. This zip file must exist on disk to deploy the terraform.
  • We are also specifying a source code hash. Whenever this terraform code is run, it will calculate a hash based on the zip file, compare it to the hash from the last time it was deployed and determine if the function code has changed.
  • We have set the publish argument to true. This tells terraform to publish a new version rather than simply update the Lambda configuration in place.
  • We are declaring two output variables: the lambda_name and lambda_version of the function, which we’re going to use later on.
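If you ever need to check by hand whether a local package matches what is deployed, the hash that terraform computes with filebase64sha256 can be reproduced on the command line: it is the base64 encoding of the raw SHA-256 digest, the same value Lambda reports as CodeSha256. The sketch below assumes openssl is installed, and package_hash is a hypothetical helper name:

```shell
#!/bin/bash
# Sketch: reproduce terraform's filebase64sha256() for a file.
# package_hash is a hypothetical helper; it assumes openssl is available.
package_hash() {
  # Raw (binary) SHA-256 digest, then base64-encode it.
  openssl dgst -sha256 -binary "$1" | openssl base64
}

# Compare the local package against the deployed function:
#   package_hash lambda.zip
#   aws lambda get-function-configuration \
#     --function-name api-backend --query CodeSha256 --output text
```

If the two values match, a terraform apply should not publish a new version on account of the code.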

Whether locally or on a CI/CD server, we can apply any changes and publish a new version using the commands:

% cd terraform
% terraform init
% terraform apply

So that we can use those output variables (name and version) later on, let’s write them to a JSON file using the following command:

% terraform output -json > output.json

Creating the Alias

At this point we have our Lambda function deployed and we’re now going to use the AWS CLI to create and update aliases. The command takes as input the function name and version number (output by terraform), and the alias name (the current git branch). To get those values we can read the terraform output file using jq. For the alias name, we have an environment variable GIT_BRANCH. The following script reads the output file and first attempts to update the alias. If the alias doesn’t exist already it will fail, so we then instead create a new alias:

#!/bin/bash
set -euo pipefail

# Read the function name and published version from the terraform outputs.
VERSION=$(jq -r '.lambda_version.value' terraform/output.json)
FUNCTION_NAME=$(jq -r '.lambda_name.value' terraform/output.json)

# Try to move an existing alias first; if that fails, create the alias instead.
aws lambda update-alias \
  --function-name "$FUNCTION_NAME" \
  --name "$GIT_BRANCH" \
  --function-version "$VERSION" \
|| \
aws lambda create-alias \
  --function-name "$FUNCTION_NAME" \
  --name "$GIT_BRANCH" \
  --function-version "$VERSION" \
  --description "The latest build in the $GIT_BRANCH branch"
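One thing to watch for with branch-based aliases: Lambda alias names may only contain letters, digits, hyphens and underscores, and cannot be purely numeric, so a branch such as feature/123 is not a valid alias as it stands. A possible way to handle this is to normalise the branch name first. This is a sketch only; alias_name_from_branch and the branch- prefix are assumptions of mine, not part of the script above:

```shell
#!/bin/bash
# Sketch: turn a git branch name into a valid Lambda alias name.
# Alias names may only contain letters, digits, hyphens and underscores,
# cannot be purely numeric, and are limited to 128 characters.
# alias_name_from_branch is a hypothetical helper.
alias_name_from_branch() {
  local name
  # Replace every disallowed character with a hyphen.
  name=$(printf '%s' "$1" | tr -c 'a-zA-Z0-9_-' '-')
  # Prefix purely numeric names so they cannot be mistaken for a version.
  case "$name" in
    ''|*[!0-9]*) ;;
    *) name="branch-$name" ;;
  esac
  # Truncate to the maximum alias length.
  printf '%s\n' "${name:0:128}"
}

# Usage in bin/create_update_alias.sh:
#   ALIAS_NAME=$(alias_name_from_branch "$GIT_BRANCH")
#   aws lambda update-alias --function-name "$FUNCTION_NAME" \
#     --name "$ALIAS_NAME" --function-version "$VERSION"
```

Note that two different branch names can collapse to the same alias (feature/123 and feature-123, for example), so this works best alongside a consistent branch naming convention.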

Putting it all Together

Let’s have a look at what our repo now looks like:

api-backend/
  .circleci/
    config.yml
  bin/
    build_lambda_package.sh
    create_update_alias.sh
  src/
    index.js
    main.js
    implementation.js
  terraform/
    lambda.tf
  package.json
  package-lock.json

We have a bin directory containing the convenience scripts for building the package and updating the alias, our source code in src and the configuration in terraform. The last piece of the puzzle is to define the complete end-to-end workflow in a CI/CD pipeline. Using a yaml specification for a platform like CircleCI, it would look something like this:

jobs:
  build-and-deploy:
    steps:
      - checkout:
          path: ~/project
      - run:
          name: Build the lambda.zip package
          command: bin/build_lambda_package.sh
      - run:
          name: Run terraform
          command: |
            cd terraform
            terraform init
            terraform apply -auto-approve
            terraform output -json > output.json
      - run:
          name: Update (or create) Lambda alias
          command: bin/create_update_alias.sh

This CI/CD job will check out the repository and build the function zip file. The terraform will then be applied to publish the updated Lambda function and output the variables to a file, before the alias is created or updated using the output of the terraform run.

Do not write a CI/CD script exactly like this! This will deploy your Lambda function and update the production main alias as soon as a commit lands on the main branch without any form of testing. Of course, you already have lots of unit and integration tests, as well as all manner of lovely software engineering tools like linting, vulnerability scanning, static code analysis etc. Be sure to bake that into your workflow before reaching production deployment. It may also be worth introducing a manual approval gate in your pipeline before the alias is updated in case you need to test or coordinate your production deployments.

And that’s it! Now we have a complete CI/CD pipeline deploying versioned AWS Lambda functions with branch-based aliasing.

Limitations and Warnings

  • One potentially dangerous side-effect relates to the IAM role defining the permissions for the Lambda function. We have defined a single IAM role here (named with the api-backend prefix), and as written all versions of the Lambda reference that same role. If you modify the role's permissions for a development branch, the change will affect every other version that uses the role, including production.
  • The example I have run through here only covers deploying to a single AWS environment. It is best practice from a security perspective to have separate AWS accounts for your development and production environments. There are a number of ways to manage this with terraform.
  • When you publish a Lambda version, AWS stores that deployment package in an internal S3 bucket. There is a finite limit to the total amount of Lambda function code that can exist in your AWS account, which by default is 75 GB. With thousands of historic Lambda versions in our account, we unexpectedly ran into this limit and found we could not deploy anything until we had cleaned up our Lambdas! The solution here is either to pay to increase your account limit, or to run a cron job to remove older versions that are no longer needed.
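The clean-up job mentioned in the last point could look something like the following. Treat it as a sketch rather than the script we actually run: versions_to_prune is a hypothetical helper, and the retention count of 10 is an arbitrary example. It selects versions that are neither the primary version of an alias nor among the most recently published.

```shell
#!/bin/bash
# Sketch: pick Lambda versions that are safe to delete.
# versions_to_prune is a hypothetical helper. It prints every numbered version
# that is neither the primary version of an alias nor among the $1 newest.
#   $1 = how many of the newest versions to keep
#   $2 = whitespace-separated list of all versions ($LATEST is ignored)
#   $3 = whitespace-separated list of versions referenced by aliases
versions_to_prune() {
  local keep="$1" aliased v
  aliased=" $(echo $3) "   # collapse tabs and newlines into single spaces
  for v in $(printf '%s\n' $2 | grep -E '^[0-9]+$' | sort -rn | tail -n +"$((keep + 1))"); do
    case "$aliased" in
      *" $v "*) ;;          # still referenced by an alias: keep it
      *) echo "$v" ;;
    esac
  done
}

# Usage (for example from a scheduled job), keeping the 10 newest versions:
#   fn=api-backend
#   all=$(aws lambda list-versions-by-function --function-name "$fn" \
#           --query 'Versions[].Version' --output text)
#   aliased=$(aws lambda list-aliases --function-name "$fn" \
#           --query 'Aliases[].FunctionVersion' --output text)
#   for v in $(versions_to_prune 10 "$all" "$aliased"); do
#     aws lambda delete-function --function-name "$fn" --qualifier "$v"
#   done
```

Passing --qualifier to delete-function removes only that version; omitting it would delete the whole function, so it is worth dry-running the selection before wiring in the delete.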
