Go faster with Golang and LocalStack
NBA Finals are kicking off tomorrow, and there’s no better time for a new post than now.
In previous posts I talked extensively about unit testing, mocking, coverage and how important they are for a reliable and efficient development process.
That said, unit tests have their limitations: their scope is narrow by design, since they test individual units such as functions and methods.
Real systems are far more complex than that: today’s systems are distributed and event-driven, and consist of dozens of microservices and managed resources, some of which cannot run, or run well, on a developer’s laptop.
As a result, integration/end-to-end testing and local development have become more difficult, or at least trickier, than ever before.
The Challenges
At Axiom Security, we rely heavily on AWS and managed services.
We run several complex event-driven procedures that involve queues, streams, notifications and more.
We also use API Gateway, Lambdas, Secrets Manager, S3, EC2, EKS, Route53 and many more.
Early on, we experienced difficulties and were slowed down by several issues, mainly these two:
1. How to efficiently develop a product that relies heavily on the cloud?
2. How to write end-to-end or integration tests for a product that relies heavily on the cloud?
While there is an abundance of cloud-native technologies that have gained tremendous popularity, such as Kubernetes, Prometheus and Grafana, your product doesn’t always fit neatly into them.
At Axiom, we run our Kubernetes workload with EKS, but can develop locally with Minikube, Docker Compose or simply by building and running the application binaries.
Using an RDS instance or a local database doesn’t make much of a difference either.
But what happens if you want to integrate a Lambda function that is triggered by a queue, or by an API Gateway?
What about DynamoDB? Yes, it’s a key-value database, but it has its own constraints that make it difficult to replace or emulate. You get the point.
How we used to do it
Initially, we started by using the same resources as our development environment. This naive approach caused us several issues:
1. Automated tests, developers and the cloud deployment of the product were using the same resources, resulting in race conditions and frequent errors that took us hours to debug.
2. Some event-driven flows, such as Kinesis streams or Lambda functions triggered by SQS, were impossible to simulate as part of the product.
3. Our DevOps engineer was busy supporting developer requests for provisioning/accessing/debugging the cloud resources, eventually handing over access to everyone.
4. Our infrastructure was polluted with too many resources, configuration drifts and limitations.
Cloud costs increased dramatically as well, as we started creating resources per developer.
A different approach?
Writing in Go almost exclusively, we use abstractions and interfaces extensively. This is how we interact with our queues, buckets, secrets, etc.
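For example, a minimal sketch of what such an abstraction might look like (the interface and its method set here are illustrative, not our actual packages):

```go
package queue

import "context"

// Message is a single item received from a queue.
type Message struct {
	ID   string
	Body []byte
}

// Queue is what our services depend on; the concrete implementation
// (SQS in the cloud, a fake or LocalStack-backed client elsewhere) is
// injected at startup.
type Queue interface {
	Send(ctx context.Context, body []byte) error
	Receive(ctx context.Context) (*Message, error)
	Delete(ctx context.Context, id string) error
}
```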
We wrote several packages to replace S3 with local file storage, and did the same for secrets management, but it never felt right: sharing data with other developers is not trivial when the data lives on your own disk, and ultimately we were not working with or testing the actual implementations.
As for testing — we had nothing to show. We ran manual/automated tests on our development account and crossed our fingers.
OK, now what?
Things were not looking great: we were spending too much time on bugs, regressions and configuration issues rather than building new features. We needed a different approach.
After having tried LocalStack a few times in the past, I decided it was time to give it a real chance.
What is LocalStack?
LocalStack provides an easy-to-use test/mocking framework for developing Cloud applications. It spins up a testing environment on your local machine that provides the same functionality and APIs as the real AWS cloud environment.
This is exactly what we needed.
It’s as simple as that: LocalStack spins up a faithful emulation of the AWS cloud environment on your laptop. It currently supports around 30 popular AWS services, and counting.
Integrating LocalStack was fairly easy: you either install it as a pip package or run a Docker container that performs all this magic.
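For reference, both options boil down to something like this (4566 is the default edge port in recent LocalStack versions):

```bash
# Option 1: install and start LocalStack via pip
pip install localstack
localstack start

# Option 2: run the Docker image directly
docker run --rm -p 4566:4566 localstack/localstack
```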
On the application side, the only thing we had to do was initialize the AWS SDK clients with LocalStack as the endpoint URL.
Let’s jump into an example repository: a simple application that continuously polls a queue, receives a single message at a time, and writes it to S3.
Whenever we want to work with LocalStack (usually locally and when testing), we set the LOCALSTACK_ENDPOINT environment variable, and when the AWS SDK client is initialized, it connects to the LocalStack endpoint rather than the actual AWS backend.
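Here is a condensed sketch of that wiring, using the AWS SDK for Go (v1); the QUEUE_URL and BUCKET_NAME environment variables are placeholders, not necessarily the names used in the repository:

```go
package main

import (
	"bytes"
	"fmt"
	"log"
	"os"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/aws/aws-sdk-go/service/sqs"
)

// newSession points the AWS SDK at LocalStack when LOCALSTACK_ENDPOINT is
// set, and at the real AWS backend otherwise.
func newSession() (*session.Session, error) {
	cfg := aws.NewConfig()
	if endpoint := os.Getenv("LOCALSTACK_ENDPOINT"); endpoint != "" {
		// LocalStack accepts any region and any credentials.
		cfg = cfg.WithEndpoint(endpoint).
			WithRegion("us-east-1").
			WithCredentials(credentials.NewStaticCredentials("test", "test", "")).
			WithS3ForcePathStyle(true) // S3 is served from a single local host
	}
	return session.NewSession(cfg)
}

func main() {
	sess, err := newSession()
	if err != nil {
		log.Fatal(err)
	}
	sqsClient, s3Client := sqs.New(sess), s3.New(sess)

	queueURL := os.Getenv("QUEUE_URL")  // placeholder
	bucket := os.Getenv("BUCKET_NAME")  // placeholder

	for {
		// Poll the queue, one message at a time.
		out, err := sqsClient.ReceiveMessage(&sqs.ReceiveMessageInput{
			QueueUrl:            aws.String(queueURL),
			MaxNumberOfMessages: aws.Int64(1),
			WaitTimeSeconds:     aws.Int64(10),
		})
		if err != nil {
			log.Printf("receive: %v", err)
			time.Sleep(time.Second)
			continue
		}

		for _, msg := range out.Messages {
			// Write the message body to S3, keyed by the message ID.
			if _, err := s3Client.PutObject(&s3.PutObjectInput{
				Bucket: aws.String(bucket),
				Key:    aws.String(fmt.Sprintf("messages/%s", *msg.MessageId)),
				Body:   bytes.NewReader([]byte(*msg.Body)),
			}); err != nil {
				log.Printf("put object: %v", err)
				continue
			}

			// Remove the message from the queue once it has been persisted.
			if _, err := sqsClient.DeleteMessage(&sqs.DeleteMessageInput{
				QueueUrl:      aws.String(queueURL),
				ReceiptHandle: msg.ReceiptHandle,
			}); err != nil {
				log.Printf("delete: %v", err)
			}
		}
	}
}
```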
Try it out yourself!
Impact
Now, our developers run the entire cloud infrastructure locally and can experiment, test functionality and do whatever they want in full isolation.
We no longer spend time debugging race conditions or trying to figure out who deleted an item or resource, or what IAM permissions a certain service or resource needs in order to access another resource.
Our cloud costs have dropped significantly, and the load on our DevOps engineer has decreased dramatically.
Before LocalStack, we were unable to simulate major flows in our product, especially the event-driven ones, because of their dependency on queues, streams, buckets, etc. Now, with native support for SQS, SNS, Kinesis and, most importantly, Lambda functions, we can run the exact same flows on our laptops.
However, the most dramatic improvement was in the testing domain.
LocalStack allows us to run the entire environment on a single EC2 instance!
On each pull request, we start an ephemeral EC2 spot instance, start a private, isolated AWS environment and provision all the resources that are used by the product.
We then clone our repository, start all services, and run a series of tests against it.
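In outline, the per-PR job boils down to a few steps. The commands below are an illustrative sketch, not our actual pipeline; the queue/bucket names and Make targets are made up:

```bash
# Start a private, isolated AWS environment on the spot instance
docker run -d --rm -p 4566:4566 --name localstack localstack/localstack

# Provision the resources the product expects (names are placeholders)
aws --endpoint-url=http://localhost:4566 sqs create-queue --queue-name events
aws --endpoint-url=http://localhost:4566 s3 mb s3://artifacts

# Start the services and run the tests against them
make run-services
LOCALSTACK_ENDPOINT=http://localhost:4566 go test ./... -tags=integration
```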
Testing Gates
Our testing gates provide a very high degree of certainty and confidence:
We first run our linter and fail on every error.
Then we run our unit tests, failing here too if any test fails.
We continue with our integration and end-to-end tests, which cover most of our major flows.
And then, after deploying, we run these same integration and end-to-end tests on our cloud development environment.
This procedure provides us certainty and confidence that our code, configurations and permissions are all intact.
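To give a feel for those tests: because the services talk to LocalStack exactly like they talk to AWS, an integration test can exercise a full flow end to end. A minimal sketch (the environment variable names, message body and timeout are assumptions, not taken from our actual suite):

```go
package app_test

import (
	"os"
	"testing"
	"time"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/credentials"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/s3"
	"github.com/aws/aws-sdk-go/service/sqs"
)

// TestMessageLandsInS3 pushes a message onto the queue and expects the
// running service to persist it to the bucket. It assumes LOCALSTACK_ENDPOINT,
// QUEUE_URL and BUCKET_NAME point at a provisioned LocalStack environment.
func TestMessageLandsInS3(t *testing.T) {
	sess := session.Must(session.NewSession(aws.NewConfig().
		WithEndpoint(os.Getenv("LOCALSTACK_ENDPOINT")).
		WithRegion("us-east-1").
		WithCredentials(credentials.NewStaticCredentials("test", "test", "")).
		WithS3ForcePathStyle(true)))
	sqsClient, s3Client := sqs.New(sess), s3.New(sess)

	if _, err := sqsClient.SendMessage(&sqs.SendMessageInput{
		QueueUrl:    aws.String(os.Getenv("QUEUE_URL")),
		MessageBody: aws.String("hello from the test suite"),
	}); err != nil {
		t.Fatalf("send message: %v", err)
	}

	// Poll the bucket until the service writes the object, or give up.
	deadline := time.Now().Add(30 * time.Second)
	for time.Now().Before(deadline) {
		out, err := s3Client.ListObjectsV2(&s3.ListObjectsV2Input{
			Bucket: aws.String(os.Getenv("BUCKET_NAME")),
		})
		if err == nil && len(out.Contents) > 0 {
			return // the message made it to S3
		}
		time.Sleep(time.Second)
	}
	t.Fatal("message was not written to S3 within the timeout")
}
```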
As mentioned above, the example repository is available here.