By Chongyang Shi, Alex Kaskasoli, Ignacio Dominguez, and Emily Young
Following our culture of Technical Autonomy, teams building the Kaluza Energy Platform at OVO choose the cloud platforms and services best suited for them. Most components of the Kaluza platform are deployed as microservices across a number of AWS and GCP environments, which are primarily self-managed by the development teams, with support from our production engineers.
Kubernetes clusters are the main platform of choice for running these microservices, and across Kaluza, teams operate a mix of managed clusters using either AWS Elastic Kubernetes Service (EKS) or Google Kubernetes Engine (GKE). While development teams are ultimately responsible for remediating security issues within their managed Kubernetes clusters, the Kaluza Security Engineering team has the central responsibility of proactively detecting potential security issues across the platform, as well as providing guidance and support for their remediation by owning teams.
In this blog post, we talk about the challenges in providing effective security monitoring for Kubernetes microservice workloads across the Kaluza platform, and how we designed and built a monitoring system uniquely suited for our platform architecture: the Kaluza Kubernetes Monitor.
Challenges of multi-cluster security monitoring
The Center for Internet Security Kubernetes Benchmark (“CIS Benchmark”) provides a useful (though somewhat generalised) set of criteria for evaluating how securely a Kubernetes cluster is configured. Much of the CIS Benchmark deals with the security of the Kubernetes control plane and worker nodes themselves.
Because we only operate managed Kubernetes clusters using EKS or GKE at Kaluza, security configurations for the control plane and the managed worker node groups are either locked into a secure state, or easily configurable through the EKS and GKE APIs.
Prior to this project, our security audit tooling already had good visibility into security parameters in the latter category. If any of these parameters were configured insecurely – for example, if authentication logging was disabled for a cluster – the team that owns the cluster would be notified quickly and could remediate the issue:
However, monitoring security parameters through the AWS and GCP APIs only solves half of the problem: Kubernetes has its own control plane API for managing workloads running within the cluster, represented as resource manifests containing object specifications usually in YAML or JSON formats. These include security-relevant configurations of workloads covered by the last section of the CIS Benchmark, which are just as important to the security of the Kaluza platform as those addressed in the earlier sections, for example:
- Pod specifications in Deployments: determining what containers can be executed by worker nodes with what commands and arguments, and with any security-sensitive privileges;
- Network policies: regulating traffic within, entering, or leaving the cluster;
- RBAC: authentication and authorization policies within the cluster;
- Services and Ingresses: capable of exposing cluster services to the internet through load balancer integrations;
- Credentials stored in the wrong location: for example in Kubernetes ConfigMaps instead of Secrets.
While at the time of writing, both AWS and GCP provide some visibility of Kubernetes resource manifests for each managed cluster through their respective web console UIs, we could not use these to monitor for potential security issues automatically. This was because for both AWS and GCP, we could not find any documented APIs providing the same resource manifest data as those available via the web consoles.
We therefore needed to talk to each Kaluza cluster’s control plane directly and “pull” the resource manifests we are interested in. However, under the shared network design of our platform, which spans multiple VPCs, this proved difficult to implement:
Both AWS EKS and GCP GKE run their managed cluster control planes in a separate virtual private cloud (VPC) network. While it is possible for workloads within the “main” VPC of each AWS or GCP environment to access the control plane through private networking, this private control plane interface is only routable within the main VPC, and hence not accessible from the VPC where our security tooling will need to run.
The other possibility is to set up connections from the security tooling VPC to the cluster control planes over the public internet; but we rejected this option, as relying on a persistent connection to the Kubernetes cluster control plane over the internet is generally considered a bad security practice, even if source IP allow-listing is used.
We also considered policy agents such as Gatekeeper and Kyverno, which validate resource configurations locally in each cluster. Unfortunately, we had no convenient way to distribute policies to the running agents in all clusters or to gather policy audit reports from them; more importantly, we lacked the ability to customise rule exemptions for each individual cluster’s workloads and operating conditions.
As all of these alternative approaches had drawbacks or limitations, we decided to design and build our own solution, which neither relies on AWS or GCP APIs nor degrades the existing security configurations of the Kaluza clusters.
Design of the Kubernetes Monitor
The difficulty of accessing the control plane through private networking across VPCs means that instead of the security tooling trying to “pull” resource manifests from each Kubernetes cluster, it is much easier for something in each cluster to “push” the manifests to a server in the security tooling VPC. The destination server of this “push” traffic is fixed and addressable within our shared network, and the presence of internal NAT gateways ensures that such traffic remains routable throughout.
However, a “push” system for resource manifests is more complex than a “pull” system, with considerations being:
- A client must be deployed and run continuously within each cluster, to periodically gather resource manifests and upload them to a server managed by the security team;
- The server must be backed by a database in order to detect whether the client in any cluster has stopped sending resource manifests to be scanned for potential security issues;
- The server must remain continuously available to receive resource manifests from each client, at the time of the client’s choosing;
- The server must be able to identify each client with a cryptographic identity to ensure the authenticity of resources reported by each client, which in turn requires an internal authentication system to be set up between the server and the clients.
Naming it the Kubernetes Monitor (k8s-monitor), we set out to build such a system with the following goals:
- It meets each of the requirements above in a secure and reliable way;
- It supports multiple API versions for each resource type, since our clusters run a range of control plane versions that remain supported by the Kubernetes maintainers;
- Its client can be easily deployed into a new or existing cluster within Kaluza, and requires no manual maintenance, to minimise overhead for development teams;
- The overall system is sufficiently scalable to record and monitor resource manifests for at least a few dozen medium-sized Kubernetes clusters, giving us plenty of performance headroom;
- In addition to scanning manifests for potential security issues, data held by the server can be manually retrieved and used for other operational purposes.
We settled on a design as shown in the diagram below:
We run the containerised server-side components of the Kubernetes Monitor in an EKS cluster managed by the security team, alongside other existing security tools. This provides the server with a high-availability setup with minimal work. There is also a client deployment running in the same security cluster, as the system monitors its own cluster's security.
All components are written in Go, with gRPC as the API protocol to take advantage of its integrated toolchain with Protocol Buffers. Components are described in more detail in the following sections.
Monitor client
The monitor client runs as a single-replica Kubernetes Deployment in its own namespace within each Kaluza cluster, with a very small resource allocation. Installation templates have been adapted to suit a range of deployment methods, such as Helm charts or the Terraform Kubernetes Provider.
The client pod is bound to an RBAC ClusterRole with read-only permissions for most types of resources from all namespaces in the local cluster. It cannot read Kubernetes Secrets at a cluster level, as reading service account tokens in each namespace potentially allows privilege escalation, in the event the monitor client itself is compromised.
Using an internal timer, the monitor client will periodically perform the following actions:
- Frequently, it sends telemetry pings to the monitor server, confirming its liveness and reporting the most recent error it has encountered locally; this lets the security team troubleshoot clients and emit metrics for client health monitoring from a single location.
- Regularly, it calls an update-rules endpoint on the monitor server, which supplies the latest list of Kubernetes resource types whose manifests should be collected; this allows the security team to change what resources are monitored without redeploying the client in each cluster.
- Every few hours, it calls the local cluster’s control plane api-server with its service account credentials, takes a snapshot of all collected resource types in the cluster, and sends them to the server.
During initial deployment, each client is assigned a unique ID (corresponding to its local cluster) and a private signing key, which is persisted in a Kubernetes Secret in the client namespace. The public key is registered against the client ID on the server side. Using a gRPC interceptor, the client signs all requests with its private key, which allows the server to validate the authenticity and identity of each request.
Monitor server
The monitor server provides the main interface for different parts of the system to exchange information and instructions:
- Resource manifest and telemetry API interfaces for receiving, validating, and storing information from monitor clients in all clusters.
- Cron APIs for triggering scanning and clean-up operations from CronJobs, which are discussed in the following sections.
- Administrative APIs, which can only be called with requests signed by private keys registered to security engineers via a command line client, and which allow security engineers to maintain the system. These will be discussed in a later section.
Scan rules
We implemented a rules framework in which all scan rules are written in Go, giving us very fine-grained control over how rules operate on each resource type and API version, and over how exemptions can be applied globally or per cluster for specific resources. Two types of rules are implemented:
- Resource rules, which scan individual resource manifests of the given resource types and API versions. For example, detecting whether a Deployment or DaemonSet (apps/v1) instructs its pods to run in privileged mode;
- Aggregation rules, which ingest all resources of a given type and API version in a cluster and decide whether they collectively leave the cluster in a safe state. For example, detecting whether a default NetworkPolicy is present in each namespace, blocking ingress traffic to all pods in the namespace unless allowed by a more fine-grained NetworkPolicy.
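The two rule types above can be illustrated with the following sketch. The struct shapes and function names are hypothetical stand-ins for the Kubernetes API types and internal rule interfaces the real framework uses:

```go
package main

import "fmt"

// Minimal stand-ins for the relevant fields of apps/v1 workloads.
type Container struct {
	Name       string
	Privileged bool
}

type Deployment struct {
	Name       string
	Namespace  string
	Containers []Container
}

// privilegedRule is a resource rule: it inspects one manifest at a time
// and reports any container requesting privileged mode.
func privilegedRule(d Deployment) []string {
	var violations []string
	for _, c := range d.Containers {
		if c.Privileged {
			violations = append(violations,
				fmt.Sprintf("%s/%s: container %q runs privileged", d.Namespace, d.Name, c.Name))
		}
	}
	return violations
}

type NetworkPolicy struct {
	Namespace   string
	DefaultDeny bool
}

// defaultDenyRule is an aggregation rule: it considers all NetworkPolicies
// in a cluster together, and flags namespaces that lack a default
// ingress-deny policy.
func defaultDenyRule(namespaces []string, policies []NetworkPolicy) []string {
	covered := map[string]bool{}
	for _, p := range policies {
		if p.DefaultDeny {
			covered[p.Namespace] = true
		}
	}
	var violations []string
	for _, ns := range namespaces {
		if !covered[ns] {
			violations = append(violations, ns+": no default ingress-deny NetworkPolicy")
		}
	}
	return violations
}

func main() {
	d := Deployment{Name: "etl", Namespace: "data",
		Containers: []Container{{Name: "main", Privileged: true}}}
	fmt.Println(privilegedRule(d))
	fmt.Println(defaultDenyRule([]string{"data", "web"},
		[]NetworkPolicy{{Namespace: "web", DefaultDeny: true}}))
}
```

The distinction matters for exemptions: a resource rule can exempt a single named manifest, while an aggregation rule's verdict only makes sense for the cluster (or namespace) as a whole.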
At a high level, the rules currently cover the following aspects of cluster security:
- Violations of baseline and restricted Pod Security Standards for Kubernetes;
- Interactions with Docker and other container runtime sockets on the host node;
- Over-privileged RBAC bindings;
- Privilege escalation opportunities (e.g. access to service account tokens);
- Network access escalation opportunities (e.g. ability to expose additional endpoints by creating or updating arbitrary Services and Ingresses);
- Uses of untrusted container images and unsafe container commands;
- Lack of default network policies governing ingress, egress, and access to instance metadata endpoints, or network policies that are overly open;
- The lack of recent telemetry pings or collected resource manifests from the client in a registered cluster, which could indicate an issue within that cluster.
We actively monitor the ways Kubernetes workloads are configured across Kaluza; and thanks to the fully-customisable rules framework, we can write and deploy rules for newly emerging security issues at a moment’s notice.
Scan trigger
The scan trigger CronJob calls the monitor server on its endpoint once every few hours, causing the server to perform a batched scan of the most recent resource manifests collected from each cluster. The resulting report is indexed first by cluster ID, and then by references to scan rules.
For each scan rule, any violations found among cluster resources are listed individually by the resource’s type, name, and namespace (if applicable). This provides accurate information to the development team responsible for the cluster as to what resource violated the rule.
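A report with this two-level index can be modelled as a nested map; the type names below are hypothetical, but they capture the cluster → rule → violations shape described above:

```go
package main

import "fmt"

// Violation identifies one offending resource by type, namespace, and name.
type Violation struct {
	Kind      string
	Namespace string // empty for cluster-scoped resources
	Name      string
}

// Report maps cluster ID -> scan rule reference -> violating resources.
type Report map[string]map[string][]Violation

// add records a violation, creating the per-cluster index on first use.
func (r Report) add(cluster, rule string, v Violation) {
	if r[cluster] == nil {
		r[cluster] = map[string][]Violation{}
	}
	r[cluster][rule] = append(r[cluster][rule], v)
}

func main() {
	r := Report{}
	r.add("prod-eu-1", "no-privileged-pods", Violation{"Deployment", "data", "etl"})
	r.add("prod-eu-1", "no-privileged-pods", Violation{"DaemonSet", "data", "agent"})
	fmt.Println(len(r["prod-eu-1"]["no-privileged-pods"])) // violations under one rule
}
```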
The scan trigger then reformats the report into our centralised security audit reporting format with additional metadata, and submits it into our centralised alerting pipeline. It will then inform the team responsible for the cluster directly, with the full details of each violation available through a dashboard.
Expired records deletion trigger
The expired records deletion trigger calls the monitor server on its endpoint once a day, causing the monitor server to discard telemetry ping and resource manifest records older than a threshold number of days from the PostgreSQL database. This ensures that we only store as much history of cluster telemetry and resources as we could plausibly need to look back through, without incurring significant storage and cost overhead.
Command line utilities
Occasionally, security engineers need to interact with the monitor server to perform administrative tasks, for example to register or update monitor client credentials for a new cluster in the Kaluza platform, or simply to get a list of the latest telemetry details from all clusters in order to investigate a suspected wide-spreading client issue.
For this purpose, we have built administrative gRPC endpoints which can only be called with requests signed by security engineers, and a corresponding gRPC command line client. Thanks to the shared Protocol Buffers definitions, building the client was largely a templating exercise.
We have added additional administrative functionalities into the API client to take full advantage of the telemetry and resource manifest data we hold for each cluster. For example, if a security engineer wants to retrieve a list of Docker images currently running across all Kaluza clusters, it is as simple as issuing a search request with the right JSON path to the monitor server, as shown below:
We have deployed the Kubernetes Monitor client in all production and non-production managed Kubernetes clusters across the AWS and GCP environments of Kaluza. At the time of writing, the monitor server has tracked more than 11 million telemetry pings and scanned more than 29 million versions of resource manifests, with hundreds of security findings remediated.
The presence of the Kubernetes Monitor has significantly improved our visibility into the security posture of the Kubernetes clusters behind the Kaluza platform. But our work does not stop here: the Kubernetes Monitor gives us the ability to centrally collect, process, and store control plane data from all our clusters. As discussed in the previous section, we already use this functionality to retrieve resource attributes across clusters for security purposes, and we plan to build further automations leveraging these data capabilities.
Potential future work
We have plans for a range of features that will help our Security Engineers and Site-Reliability Engineers make decisions for the platform.
For example, the ability to detect Deployments and DaemonSets without suitable pod resource reservation, scheduling priorities, or autoscaling configurations will allow our Site-Reliability Engineers to offer tailored advice to Kaluza teams and improve our overall platform health. Security Engineers will also be able to use pod template data to detect outdated and vulnerable image and application versions directly from their runtime, which supplements our detection capabilities at the source code level.
Finally, while not every organisation will require a system like this in their platform architecture, the Kubernetes Monitor is something we wish to be able to open-source and share with the DevSecOps community in the near future.