Here at OVO we make use of managed services that allow us to run less software ourselves and that play pivotal roles in achieving CI/CD; these services are firmly placed on our Paved Road.
Two of the most prevalent managed services we use are GitHub and CircleCI. To roll out changes to production quickly and securely, we want to be able to store our config in our Git repos and deploy changes automatically via a pipeline, even if the configs contain sensitive data.
Ideally, configs containing sensitive data, or secrets, would be encrypted for their whole journey from the Git repo, through the deployment pipeline, to the application host. We'd also like a single source of truth for our secrets, to minimise management overhead.
Modus Operandi
Initially the team I was working with stored all their secrets in a 1Password Vault, with no consistent structure of configs within the vault. After being modified in 1Password, secrets were copied to local workstations and usually SCP’d to the host that needed the update. While this approach wasn’t inherently insecure, there was a lot of toil involved, potential for config drift and substantial risk of plain text config being left indefinitely on workstations.
For the team to switch to a suitable new secret-encryption tool, it needed to be super easy to install and get started with, preferably with no extra tooling required; both humans and machines would need to use it. Various operating systems were in use, so Linux, macOS and Windows compatibility was a must. The tool would have to be performant and use a strong encryption technique. Finally, flexible access control, logging and automatic key rotation would be very beneficial.
Research
After some initial research into the tools already available, we discovered there's a lot of choice, with different tools operating in slightly different spaces: from digital vaults such as 1Password and HashiCorp Vault, to CLI tools like GPG and Git wrappers.
Making secrets as secure as possible, whilst keeping them as easy as possible for authorised users and machines to decrypt in different places, could be challenging. This is the classic trade-off of security vs. convenience, two sides that have traditionally been diametrically opposed within the security industry, the theory being that making access hard is the best way to keep information safe. The major flaw in that theory? Humans. Once the procedure for accessing accounts or secrets exceeds a certain complexity, humans will bypass protocol to make it easier for themselves (think post-it notes). We wanted a tool that achieved both security and convenience, so that humans would find it easier to comply and machines didn't object too much either.
Whilst a Cloud KMS service wasn’t essential, we felt using one could be beneficial to provide good access control, logging and auto-rotation of keys. We didn’t encounter evidence to suggest there’d been many KMS integrations in existing CLI tools.
Prior Art
1Password
1Password is a digital vault that has an API, so secrets could potentially be pulled programmatically, and a new vault could be created for programmatic access. Building a deployment pipeline from this could prove difficult given 1Password doesn’t provide webhooks for changes to secrets.
HashiCorp Vault
The installation process for Vault, another digital vault, feels like it's become much easier in the year since I first encountered it. Terraform (also from HashiCorp) helps greatly with that. I like the idea of requiring a quorum of key-holders to unseal a sealed vault, though my team was too small for that to be useful, and I wasn't sure about the sealing of an entire vault (multiple vaults would be nice?).
It was the operational commitment the team would have to make in order to really understand how Vault works that put me off. I'd read of people running into difficulties due to tech debt, at times enabling features without realising. Adopting HashiCorp Vault would mean the team running a complex third-party tool requiring specialised domain knowledge, with the risk of introducing vulnerabilities or being unable to decrypt. This may have been a more viable option for a much larger team, or a team providing Vault as a service to others.
GPG
GPG is a command line tool for asymmetric encryption. It’s used by other teams in the organisation, who seem happy with it. I felt that although GPG comes with a good reputation and the option for building a web of trust, there’s a key management overhead, and recovering from a key leak usually means re-encrypting secrets. Rotating the keys’ passwords could have been used for that purpose, though I didn’t want to deal with any additional password management.
Git Wrappers
There’s some pretty neat Git wrapper tools that allow you define which files are secret in a Git repo, and using (usually) GPG it encrypts those files when you push, and decrypts when you pull, so it’s all seamless. You can also encrypt an entire repo if you don’t want to spend time specifying the secrets.
The seamless nature of this approach is appealing. This type of tool is simply setting up git hooks to do the additional encrypt/decrypt on commit/checkout, so we needn’t necessarily use GPG. The problem with only using a Git wrapper tool / Git hooks is we’re still left with our managed CI/CD service, e.g. CircleCI, potentially having access to not only the unencrypted text we’re rolling out changes to, but all other secrets stored in the same Git repo.
Rolling Our Own
Without any of the above solutions fulfilling all of our requirements, we looked into the possibility of writing our own CLI tool. The way in which GPG was being used by some of our teams was attractive, but we believed we could create something better for ourselves.
I started looking at Golang, as I knew it would make life easy for us both in creating a cross-platform CLI binary that's easy to install, and in connecting to remote Cloud KMS services. It also has AES and GCM packages in its standard library, both of which Google recommend for use alongside Envelope Encryption.
Envelope Encryption appears to have been adopted by most, if not all, of the major cloud providers for (at least) some of their storage. The general flow for encryption is:
- Generate Key A
- Encrypt your plaintext with Key A to get the ciphertext
- Encrypt Key A using a Cloud KMS (Key B) to get the encrypted key
- Store ciphertext + encrypted key together (e.g. in your Git repo)
Decryption works in reverse order.
This has some benefits: primarily, you never give the cloud provider your plaintext, yet you still require that provider for decryption. You can make full use of the advantages a Cloud KMS gives you, such as IAM/ACLs, auditing, encryption and key rotation, all in a managed service. KMS providers usually have a pretty strict cap on payload size, too; because only Key A is ever sent to the KMS service to be encrypted, you can encrypt huge files this way without worrying about such caps.
AES (256-bit) with GCM is recommended by Google for the encryption process, both known for being strong and performant - I won’t go into specifics here, but it’s worth taking a look at this awesome AES stick-man-comic (15min read), and this GCM video (16 mins).
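To make that flow concrete, here's a minimal sketch of the encrypt side in Go, using the standard library's AES-GCM support and the Cloud KMS Go client. The function, variable names and example key resource id are illustrative; this shows the general technique rather than Mantle's actual implementation.

```go
package main

import (
	"context"
	"crypto/aes"
	"crypto/cipher"
	"crypto/rand"
	"fmt"

	kms "cloud.google.com/go/kms/apiv1"
	"cloud.google.com/go/kms/apiv1/kmspb"
)

// envelopeEncrypt encrypts plaintext locally with a freshly generated data key
// (Key A), then asks Cloud KMS to encrypt that data key under the KMS key (Key B).
// The ciphertext and the encrypted data key are returned for storage together.
func envelopeEncrypt(ctx context.Context, kmsKeyName string, plaintext []byte) (ciphertext, encryptedKey []byte, err error) {
	// 1. Generate Key A: a random 256-bit data key.
	dataKey := make([]byte, 32)
	if _, err := rand.Read(dataKey); err != nil {
		return nil, nil, err
	}

	// 2. Encrypt the plaintext with Key A using AES-256 in GCM mode.
	block, err := aes.NewCipher(dataKey)
	if err != nil {
		return nil, nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, nil, err
	}
	nonce := make([]byte, gcm.NonceSize())
	if _, err := rand.Read(nonce); err != nil {
		return nil, nil, err
	}
	ciphertext = gcm.Seal(nonce, nonce, plaintext, nil) // nonce is prepended to the ciphertext

	// 3. Encrypt Key A using the Cloud KMS key (Key B): only the small data key
	//    is ever sent to the KMS service, never the plaintext.
	client, err := kms.NewKeyManagementClient(ctx)
	if err != nil {
		return nil, nil, err
	}
	defer client.Close()
	resp, err := client.Encrypt(ctx, &kmspb.EncryptRequest{
		Name:      kmsKeyName, // projects/<project>/locations/<location>/keyRings/<keyring>/cryptoKeys/<key>
		Plaintext: dataKey,
	})
	if err != nil {
		return nil, nil, err
	}

	// 4. Store ciphertext + encrypted key together (e.g. in your Git repo).
	return ciphertext, resp.Ciphertext, nil
}

func main() {
	ctx := context.Background()
	ct, ek, err := envelopeEncrypt(ctx, "projects/my-project/locations/global/keyRings/my-ring/cryptoKeys/my-key", []byte("helloworld"))
	if err != nil {
		panic(err)
	}
	fmt.Printf("ciphertext: %x\nencrypted data key: %x\n", ct, ek)
}
```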
Enter Mantle
With the aforementioned requirements in mind, we developed a Go client that encapsulated AES (256) encryption in GCM mode, using Google KMS for the second half of the Envelope Encryption process. Google was chosen as the initial KMS provider as the team had already established the majority of their infrastructure in GCP.
Convenience, as discussed previously, is the core of what Mantle brings to the table. It cross-compiles into Linux, Darwin and Windows binaries (we get GoReleaser to do that for us in CircleCI every time a new Git tag is pushed to the repository). Installing is simply a matter of downloading the correct binary from the releases page, and ensuring it’s executable.
Once installed, the tool is incredibly easy to use. Users need to authenticate against their chosen project, either with the gcloud command-line tool or by providing a GCP service-account key.json file. They can then obtain the resourceId of the KMS key they want to use (via the Google Cloud SDK or the console) and provide it to Mantle in order to encrypt or decrypt. If the user is already authenticated against the project via gcloud (which seems likely) and has enough permissions, this may seem like magic, as it'll just work without any additional auth.
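Incidentally, that "just works" behaviour mirrors how Google's Go client libraries resolve credentials: with no explicit options they fall back to Application Default Credentials, and a service-account key file can be supplied instead. Here's a minimal sketch of the two options (illustrative only, not Mantle's actual code; key.json is a placeholder):

```go
package main

import (
	"context"
	"log"

	kms "cloud.google.com/go/kms/apiv1"
	"google.golang.org/api/option"
)

func main() {
	ctx := context.Background()

	// With no options, the client uses Application Default Credentials,
	// e.g. those created by `gcloud auth application-default login`.
	client, err := kms.NewKeyManagementClient(ctx)
	if err != nil {
		log.Fatal(err)
	}
	defer client.Close()

	// Alternatively, authenticate with an explicit service-account key file.
	clientWithKey, err := kms.NewKeyManagementClient(ctx,
		option.WithCredentialsFile("key.json"))
	if err != nil {
		log.Fatal(err)
	}
	defer clientWithKey.Close()
}
```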
We’ve created an Alpine Docker image with Mantle installed, and because command-line use is so straightforward, it can be used in Kubernetes to run an init container that decrypts before the application container starts. This makes adoption much easier for teams using Kubernetes, as it doesn’t require any changes to their application code. The emptyDir storage volume can be used to mount a tmpfs (RAM-backed filesystem) volume, ensuring the decrypted config is never written to disk while it’s shared between the init and app containers, and is deleted once the pod dies. We’ve added an example of Mantle in a Kubernetes init container to the Git repo.
Object Hierarchy
For me, part of the beauty of Mantle lies in the seamless use of Cloud KMS. Using what Google call object hierarchy, we can restrict access to specific keys or keyrings (organisational groups of keys) down to specific users or groups.
Within the Google KMS service, a key is an encryption key that the KMS provider has created and holds strictly for its own use. Each KMS key has an identifier, and when you specify that identifier while asking the KMS service to encrypt something, the service retrieves the key, performs the encryption (provided access is granted) and returns the output to the user. A keyring is simply a named group of keys. Keyrings allow us to control access on a group of keys, rather than on each key individually, which could get burdensome.
If required, we can give users access to all keys in a project using project level IAM permissions. Using the object hierarchy, we have the ability to define virtual vaults. For example, we could use a key for each application, and keyrings for any further organisation of applications. We’d then be able to disable or enable decryption capabilities on a user, application, sub-team or whole team level, at the flick of a switch. There’s a lot of multi-tiered flexibility with Google KMS, for a range of team sizes, or even managing a group of teams.
For each key in Google KMS there can be multiple versions active at any one time, but only one of those versions is the primary. When decrypting, Google KMS automatically uses the key version that was originally used to encrypt; when encrypting, it only ever uses the version currently flagged as primary. Google provide seamless auto-rotation of keys (essentially the creation of a new version, which becomes the primary), which can be set as frequently as daily, restricting the blast radius of a compromised key. With daily auto-rotation, for example, if we encrypt configs on day one and the day-one key version is compromised, an attacker can only decrypt the day-one configs (provided they also have all the other pieces required to decrypt). The reason this works so seamlessly is that, as mentioned previously, when asking KMS to encrypt or decrypt we simply supply a key id; we don’t care which version is primary, as the KMS service handles that for us.
Being able to delete older key versions is important, as it renders any encrypted config lingering on people’s machines completely useless; it becomes impossible to decrypt. This may or may not be viable for all users, but a frequent, automated process of re-encrypting all configs, coupled with auto key rotation, would make it possible. Mantle has a dedicated re-encrypt command to help with this.
Mantle has some other tricks up its sleeve, such as zero-filling (a form of data erasure) and deleting plain-text files after encryption, and a validation command that errors on decryption failure, which can be useful in a CI/CD pipeline.
Freedom Of Expression
Because encryption is performed using algorithms available in other languages’ core libraries, there’s no dependence on Mantle for decryption. For example, my team has developed a library for their Scala-based applications that decrypts at startup, so no plain text is stored outside of the application memory and there’s no need for an init container. The added bonus here is that developers never need to handle the decrypted config directly in order to run locally; the application will do the decryption for them.
The previous paragraph illustrates another important point: users are free to develop their own tools to decrypt. Mantle doesn’t tie you into using it, or into storing your encrypted configs in any specific vault; it’s a tool that wraps a set of existing, well-established algorithms, mechanisms and access-control models to make strong security convenient for users.
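To show just how little is needed to decrypt without Mantle, here's a minimal sketch of the reverse flow in Go (the same idea the Scala library implements). It assumes the ciphertext and encrypted data key were produced by the envelope-encryption sketch shown earlier; the package and function names are again illustrative.

```go
package envelope

import (
	"context"
	"crypto/aes"
	"crypto/cipher"
	"errors"

	kms "cloud.google.com/go/kms/apiv1"
	"cloud.google.com/go/kms/apiv1/kmspb"
)

// EnvelopeDecrypt reverses the flow: Cloud KMS decrypts the data key (Key A),
// which is then used locally to open the AES-256-GCM ciphertext. Only the
// encrypted data key is ever sent to the KMS service.
func EnvelopeDecrypt(ctx context.Context, kmsKeyName string, ciphertext, encryptedKey []byte) ([]byte, error) {
	client, err := kms.NewKeyManagementClient(ctx)
	if err != nil {
		return nil, err
	}
	defer client.Close()

	// 1. Ask Cloud KMS to decrypt the data key. Note that only the key's
	//    resource id is supplied; KMS works out which key version was used.
	resp, err := client.Decrypt(ctx, &kmspb.DecryptRequest{
		Name:       kmsKeyName,
		Ciphertext: encryptedKey,
	})
	if err != nil {
		return nil, err
	}

	// 2. Use the recovered data key to decrypt the ciphertext locally.
	block, err := aes.NewCipher(resp.Plaintext)
	if err != nil {
		return nil, err
	}
	gcm, err := cipher.NewGCM(block)
	if err != nil {
		return nil, err
	}
	if len(ciphertext) < gcm.NonceSize() {
		return nil, errors.New("ciphertext too short")
	}
	nonce, sealed := ciphertext[:gcm.NonceSize()], ciphertext[gcm.NonceSize():]
	return gcm.Open(nil, nonce, sealed, nil)
}
```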
Mantle 101
Install Mantle by downloading the correct binary to your local machine, and putting it on your path. Binaries are available on the GitHub releases page.
Google is currently Mantle’s sole KMS provider, though we’re looking to add more in the near future. This means you’ll need a GCP account, with a project that has billing enabled and the Cloud KMS API enabled. Google have a decent Quickstart doc that’ll walk you through creating a keyring and key.
Ensure you’re authorised against the project on your local machine. You’ll need the Cloud KMS CryptoKey Encrypter and Cloud KMS CryptoKey Decrypter roles (you can either set these on the project using IAM, or add them as ACLs on the keyring or key).
Obtain the resourceId
Get the name of the keyring:
$ gcloud kms keyrings list --location <location>
Get the name/resourceId of the key:
$ gcloud kms keys list --location <location> --keyring <keyring_name>
The resourceId will be in the format:
projects/<project_name>/locations/<location>/keyRings/<keyring_name>/cryptoKeys/<key_name>
Encrypt a file
Create the file:
$ echo "helloworld" > plain.txt
Issue the encrypt command. The binary should print the encrypted string to the command line, remove the plain.txt file, and create a cipher.txt file:
$ mantle encrypt -n <key_resource_id>
Take a look at the cipher.txt:
$ cat cipher.txt
You may want to check at this point that Mantle has removed the plain.txt file.
Decrypt the file
Issue the decrypt command. You should be left with a new plain.txt:
$ mantle decrypt -n <key_resource_id>
Take a peek at the new decrypted file:
$ cat plain.txt
Full getting started instructions are available in the Mantle GitHub repo.
The most challenging part of getting started is most likely the GCP and KMS setup (though even that is straightforward), which, I feel, is indicative of how simple Mantle is to set up.
What’s next?
Mantle has just been open sourced. It’d be great to see others outside of OVO adopt it as their chosen encryption tool, and to see the ideas and contributions that follow. Perhaps there’s a new feature, or a use for it, that we’ve not yet considered.
We’re looking to integrate other Cloud KMS services, starting with AWS. Ideally the KMS service being used would be abstracted away from the user (though users would obviously still need to provide key resource ids). Other features such as a vim wrapper, recursive text search, signing, and bulk encrypt/decrypt are being considered and could be coming soon.
Summary
This blog post has covered our journey to developing a cross-platform encryption tool, Mantle. By taking advantage of Golang for its command-line tooling, Google Cloud KMS for its flexible, multi-tiered access control and Kubernetes for its init containers, we can perform fast, secure rollouts using managed CI/CD services.
The key takeaway here is that achieving both security and convenience is difficult, but possible, by building on existing strong encryption algorithms and Cloud KMS services. There are more cogs in the machine than with other tools such as GPG, but each cog serves a very specific purpose, and ultimately minimises the toil humans spend on creating and managing key files or digital vaults.