In our last blog post, we discussed our use of GitOps. In this blog, we will dive into how we secure private data.
Private data comes in many forms at Tidepool. We collect personal health data from our users. We must protect that data from unauthorized access or disclosure. We call these user secrets.
Another form of private data is the secrets used to encrypt data or to verify identity. We call these system secrets. This blog post deals with protecting system secrets.
Exposure or loss of such data could compromise the integrity of the Tidepool system. Protecting system secrets requires three things:
- a way to persist the secrets;
- a way to protect that data from disclosure; and,
- a way to retrieve that data by an authorized client.
Handling System Secrets
Kubernetes has a mechanism called Secrets that is meant for storing secret data. Secrets persist in etcd, base64-encoded; encrypting them at rest requires explicitly enabling etcd encryption. Kubernetes containers can be given access to Secrets as environment variables or as memory-mapped files. Kubernetes Role-Based Access Control (RBAC) ensures that only authorized services running in a Kubernetes cluster may access those secrets.
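As a concrete illustration, here is what a Secret and a pod consuming it as environment variables might look like (names and values are purely illustrative):

```yaml
# A hypothetical Secret; note the value is base64-encoded, not encrypted.
apiVersion: v1
kind: Secret
metadata:
  name: example-api-keys
type: Opaque
data:
  API_KEY: c3VwZXJzZWNyZXQ=   # base64("supersecret")
---
# A pod that receives the Secret's keys as environment variables.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
spec:
  containers:
    - name: app
      image: example/app:latest
      envFrom:
        - secretRef:
            name: example-api-keys
```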
However, Kubernetes does not meet all of our needs. How does one populate Kubernetes Secrets in the first place? And how secure is the etcd store?
What about storing secrets in Git? Our legacy infrastructure did just that: it stored the secrets in cleartext in a private GitHub repo, and copied them onto the file system before a process started.
Storing our secrets in cleartext in a private GitHub repo relies on GitHub to protect our data. Instead, what if we stored our secrets in encrypted form? We could even publish the encrypted secrets in a public GitHub repo. Public or private, we could continue to use GitHub to version our secrets as we version our code.
We considered using Bitnami Sealed Secrets (BSS) for this purpose. BSS leverages the existing Kubernetes Secrets mechanism to provide secrets to pods as environment variables or memory-mapped files. For each Kubernetes Secret, you create exactly one BSS custom resource (a SealedSecret) that contains an encrypted form of your secret. The BSS controller uses the Kubernetes watch API to watch for SealedSecret resources. When it sees one, it decrypts the resource's data and creates the corresponding Kubernetes Secret. The Kubernetes controller that deploys pods watches for the Secrets that it needs; until all of them are available, it can't launch the containers that depend on them.
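A SealedSecret committed to Git might look roughly like this (the name is hypothetical and the ciphertext, normally produced by the kubeseal CLI, is abbreviated):

```yaml
# A hypothetical SealedSecret; only the in-cluster BSS controller
# holds the private key needed to decrypt encryptedData.
apiVersion: bitnami.com/v1alpha1
kind: SealedSecret
metadata:
  name: example-api-keys
  namespace: default
spec:
  encryptedData:
    API_KEY: AgB3...   # ciphertext, safe to commit to a public repo
```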
BSS, however, has a bootstrapping problem: how do you provide the initial key that the running BSS controller uses to decrypt the secrets? An elegant solution is provided by Mozilla sops. Sops encrypts each value with a randomly generated data key, then encrypts that data key with one or more master keys, so a single file can be decrypted using any one of them. Sops can retrieve master keys from cloud-based key management services (KMS) such as Amazon KMS, Google Cloud KMS, and Azure Key Vault. Sops can also use master keys stored on a local file system using PGP.
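For example, a .sops.yaml policy file tells sops which master keys to use for which files; the KMS ARN and PGP fingerprint below are placeholders, not real keys:

```yaml
# .sops.yaml -- illustrative only; ARN and fingerprint are placeholders.
creation_rules:
  # Encrypt production secrets with an AWS KMS master key.
  - path_regex: environments/production/.*\.yaml
    kms: arn:aws:kms:us-west-2:111122223333:key/00000000-0000-0000-0000-000000000000
  # Encrypt local/test secrets with a PGP master key instead.
  - path_regex: environments/local/.*\.yaml
    pgp: FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF
```

With a policy like this in place, `sops --encrypt --in-place <file>` encrypts the values in a file while leaving its keys readable, so Git diffs remain meaningful.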
When running within a cloud, one can combine that provider's KMS with its identity and access management (IAM) system to eliminate any manual bootstrapping step: access to the master key is granted to authenticated IAM principals.
Another approach is to store the secrets in a third-party vault such as HashiCorp Vault or Amazon Secrets Manager. Vault secures, stores, and controls access to secrets. One may provide secrets to a Kubernetes pod as memory-mapped files, much like Kubernetes Secrets, by injecting a sidecar into each pod that needs access to secrets. The sidecar talks securely to a Vault server, which authenticates and authorizes the access and writes the shared secret to a memory-mapped file.
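With Vault's agent injector, the sidecar is requested via pod annotations along these lines (the pod name and secret path are hypothetical):

```yaml
# Hypothetical pod using the Vault Agent sidecar injector; the injected
# agent fetches the secret and writes it under /vault/secrets/.
apiVersion: v1
kind: Pod
metadata:
  name: example-app
  annotations:
    vault.hashicorp.com/agent-inject: "true"
    vault.hashicorp.com/agent-inject-secret-api-keys: "secret/data/example/api-keys"
spec:
  containers:
    - name: app
      image: example/app:latest
```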
To access secrets stored in Amazon Secrets Manager (ASM), one may use a service such as GoDaddy External Secrets (GES), which retrieves secrets on demand. Moreover, if you are running the GES in the Amazon cloud, you can use the identity of the service to authenticate with ASM, and avoid the bootstrapping issue.
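A GES manifest is a small custom resource mapping an ASM secret to a Kubernetes Secret; a sketch might look like this (names are illustrative, and the exact schema depends on the GES version):

```yaml
# A hypothetical ExternalSecret for GoDaddy External Secrets; the GES
# controller fetches the named ASM secret and materializes a
# corresponding Kubernetes Secret in the cluster.
apiVersion: kubernetes-client.io/v1
kind: ExternalSecret
metadata:
  name: example-api-keys
spec:
  backendType: secretsManager
  data:
    - key: production/example-api-keys   # secret name in AWS Secrets Manager
      name: API_KEY                      # key in the resulting Kubernetes Secret
```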
Vault is best-in-class, but we concluded that a hosted Vault service is too expensive for our needs and a self-hosted Vault is too complicated. We initially selected GES.
However, having two ways of versioning data -- in Git and in Amazon Secrets Manager -- would mean that Git alone no longer reflects the state of the cluster. Additionally, GES would need to continuously poll ASM for updates.
Sops runs as a simple application that encrypts and decrypts files; we need the decryption to happen when manifests are installed into Kubernetes. Recently, sops has been integrated into Flux, our GitOps controller: when Flux loads manifests from a Git config repo, it can use sops to decrypt any encrypted secrets. Running in the cloud, Flux can use its own identity to retrieve the master key from the cloud key management service. Secrets can also be encrypted using a key generated from a local master key, allowing sops to run without any cloud integration -- which also makes it a good solution for local test environments.
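To give a sense of what ends up in the config repo, a sops-encrypted Kubernetes Secret looks roughly like this: each value is replaced by ENC[...] ciphertext, and a trailing sops metadata block records the encrypted data key (all identifiers and ciphertexts below are abbreviated placeholders):

```yaml
apiVersion: v1
kind: Secret
metadata:
  name: example-api-keys
type: Opaque
stringData:
  API_KEY: ENC[AES256_GCM,data:...,iv:...,tag:...,type:str]
sops:
  kms:
    - arn: arn:aws:kms:us-west-2:111122223333:key/00000000-...
      enc: AQICAHh...   # the data key, encrypted with the KMS master key
  lastmodified: "2020-01-01T00:00:00Z"
  version: 3.5.0
```

Only the values are encrypted, so the file can still be reviewed, diffed, and versioned like any other manifest.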
With all the parts in place, we are moving to sops!
In our next blog post, we will discuss how we use a service mesh to secure intra-service traffic.