Unexpected item in bagging area ⚠️
Within the Core Platform team at Curve we spend a lot of our time building the tools that allow our engineers to be more efficient.
One of the most common requests to our team was the provisioning of infrastructure for a service; that might be a database, or setting up credentials for something like RabbitMQ. An Engineer would have to submit a ticket, a Platform Engineer would then update Terraform with the requested change, wait for the Merge Request to be approved, apply the change, and repeat for each environment. This process could take anything from an hour to a couple of days, depending on what we were working on at the time.
It was clear that this wouldn't scale as Curve continued to grow. 📈
Bring on Self Service! 🚀
We wanted Engineers to be able to manage their service components without needing a Platform Engineer. Engineers would no longer be blocked waiting for us to create a database, and the Core Platform team would be free to focus on other work, rather than creating databases.
All services at Curve are deployed to our Kubernetes clusters via a Helm Release (using the FluxCD Helm Operator), with our own internal Helm Chart. We felt it was sensible to keep all service infrastructure components within the helm release: this allows anyone to easily see which components a service has (Postgres, RabbitMQ, etc.), and it also creates a clear lifecycle for the service (delete the helm release, and its components are removed too 🚮).
We first set up Terraform Operator, a Kubernetes CRD and controller that handles Terraform operations. This allows us to run Terraform within our Kubernetes cluster as part of the Helm install/upgrade process.
We then modified our Helm Chart with a handful of templates for specific infrastructure components, which create the Terraform resources. These templates use conditional blocks, so that when the new infrastructure value is set, the appropriate template is rendered.
Let's use RabbitMQ as an example:
We have a Terraform module that is used to create credentials within a defined RabbitMQ cluster, and then store those credentials within Vault for a given service.
Within our Helm Chart, we create a template that wraps this module in a Terraform Operator resource.
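A minimal sketch of what such a template could look like, using the open-source Terraform Operator's Terraform CRD; the API version, field names, module source URL, and values keys below are illustrative assumptions, not Curve's actual configuration:

```yaml
# templates/rabbitmq.yaml (illustrative)
{{- if .Values.infrastructure.rabbitmq.enabled }}
apiVersion: tf.isaaguilar.com/v1alpha1
kind: Terraform
metadata:
  name: {{ .Release.Name }}-rabbitmq
spec:
  # Hypothetical module that creates RabbitMQ credentials
  # and writes them to Vault for the service
  terraformModule:
    source: git::https://github.com/example/terraform-rabbitmq-credentials.git
  env:
    # Terraform reads TF_VAR_service as its "service" input variable
    - name: TF_VAR_service
      value: {{ .Release.Name | quote }}
{{- end }}
```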
If the conditional is met, the Terraform Operator will pull the module and run a terraform apply with the service variable set. In this example, the module will create a RabbitMQ user named after the service.
The Engineer can then create the RabbitMQ credentials at the point of deploying their service, by updating their Helm Release to use this new Chart and enabling the component in their values.
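For illustration, opting into the component from a service's Helm Release values might look like this (the key names are assumptions, not Curve's real values schema):

```yaml
# values.yaml for the service's Helm Release (illustrative keys)
infrastructure:
  rabbitmq:
    enabled: true
```

On the next Helm install or upgrade, the chart renders the Terraform resource and the operator takes it from there; no ticket, no Platform Engineer in the loop.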
The combination of these means that when a service is deployed for the first time, the following happens:
Helm install will run
Terraform Operator will run a job with the RabbitMQ resource (terraform apply); the created credentials will be written to Vault as part of the module.
When the pods come online, the credentials will be pulled from Vault (via an init container), and made available to the service.
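One common shape for that init container, sketched with the Vault CLI; the image, secret path, volume name, and mount are assumptions rather than Curve's actual setup:

```yaml
# Illustrative pod spec fragment: an init container fetches the
# RabbitMQ credentials from Vault into a volume shared with the service.
initContainers:
  - name: fetch-credentials
    image: hashicorp/vault:1.13.3
    command: ["sh", "-c"]
    args:
      # Hypothetical Vault path written by the Terraform module
      - vault kv get -field=password secret/my-service/rabbitmq > /secrets/rabbitmq-password
    volumeMounts:
      - name: credentials
        mountPath: /secrets
```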
Currently we have self-service support for Postgres 💾, MongoDB 💿, RabbitMQ 🐰, Kafka 📬, and PagerDuty 📟 (for service alert routing).
About the Core Platform Team 🦜
The Platform team maintains Curve's services, both for customers and for our engineering team. Our goal is to build highly reliable and highly scalable infrastructure, running across AWS and GCP on Kubernetes (EKS), using technologies such as Istio, Ambassador, Terraform, Atlantis, Kafka, and Vault, to name a few.
We're hiring within the Core Platform Team too! - Careers | Curve