App scaling with operational excellence
This post continues the app scaling series, which covers motivations, practical approaches, trade-offs, and pitfalls for a cloud strategy. The previous post gives an overview of why to prepare and how to start. It can be found here.
As your application scales and every day brings more changes to adapt it to new scenarios, automation becomes not just handy but necessary.
Benefits of automating infrastructure
- The fastest way to deploy a new environment. It saves time when you deploy multiple environments such as PROD, DEV, and QA.
- Ensures PROD, QA, and DEV are exactly the same. This helps your engineers narrow down problems and solve issues faster.
- Immutable infrastructure: the dark old days when nobody knew how a server was still working are gone. With immutable infrastructure, you stop relying on manual intervention to fix things and reserve it for hot-fixes.
- Define your workflows as code. Code is more reliable than anyone’s memory.
- Easily track changes over time (you also achieve more coverage for auditing with this step).
- Your infrastructure-as-code doubles as documentation that you can review and share when asking for support.
Several tools automate infrastructure creation, and each cloud provider has its own: CloudFormation on AWS, Azure Resource Manager (ARM) templates on Azure, and Cloud Deployment Manager on Google Cloud. But your organization may want as little lock-in as possible because of past experience. In that case HashiCorp’s stack, and specifically Terraform here, comes in handy.
Use cases + Tactics
- Saving costs with environments: shut down your DEV and QA environments at the end of every day to cut cloud costs.
- Watching infrastructure: since your infrastructure may change while it runs (scaling up, scaling down, termination, etc.), you can have a job that watches the parts of your app that must stay in a certain configuration. Example: for scaling systems, you can use Terraform to always keep one instance pre-warmed and ready to attach to an auto-scaling (AS) group when the application needs it, instead of waiting for every new instance to warm up. Once your app needs to scale, that instance is added to the AS group, and shortly afterwards Terraform proactively provisions a new instance for the next load spike.
- Configuration management: applying the immutable infrastructure concept here, you have a single source of truth for your environment. Example: you had to apply a hot-fix in production to prevent an error from happening. Right after the hot-fix, update your infrastructure-as-code to include that fix so you won’t forget to replicate it in new environments.
- Orchestration: let’s say your infrastructure is primarily on AWS but you use Google for ML. Terraform will orchestrate the creation of all of it for you. It saves you the time of going to each cloud and running CloudFormation, Cloud Deployment Manager, and so on.
- Security and Compliance: keeping your infrastructure as code makes it easier for your team to verify that they follow all security and compliance requirements. The code can also be versioned, which supports auditing.
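The watching and configuration-management tactics above both need a way to notice when the running infrastructure no longer matches the code. A minimal sketch of such a job in Python follows, assuming the Terraform CLI is installed and the working directory is already initialized. The function name `check_drift` and the injectable `run` parameter are illustrative and not part of any existing tool. It relies on `terraform plan -detailed-exitcode`, which exits with 0 when there are no changes, 1 on error, and 2 when changes are pending. Exit code 2 also covers code changes that have not been applied yet, not only manual changes such as a hot-fix.

```python
import subprocess
from typing import Callable

# Exit codes documented for `terraform plan -detailed-exitcode`.
NO_CHANGES = 0
PLAN_ERROR = 1
CHANGES_PRESENT = 2


def check_drift(
    workdir: str,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> bool:
    """Return True when `terraform plan` reports pending changes.

    `workdir` is a directory holding an initialized Terraform configuration.
    `run` is injectable (hypothetical design choice) so the logic can be
    exercised without a real Terraform installation.
    """
    cmd = ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"]
    result = run(cmd, cwd=workdir, capture_output=True, text=True)
    if result.returncode == NO_CHANGES:
        return False
    if result.returncode == CHANGES_PRESENT:
        return True
    # Any other code (normally PLAN_ERROR) means the plan itself failed.
    raise RuntimeError(f"terraform plan failed: {result.stderr}")
```

A scheduled job could call `check_drift("path/to/infra")` and alert the team, or trigger an apply, whenever it returns True. The path here is a placeholder.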
Example with Terraform
The code found here deploys the infrastructure above in a few minutes. It is an example of using Terraform to provision infrastructure that follows AWS best practices when EC2 instances are still your compute option.
Do not forget to add CloudFront and Route 53 to your stack if you are going to use it in a real environment.
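To apply one Terraform definition to DEV, QA, and PROD, as described in the benefits earlier, a small wrapper can build the same command sequence for each environment. The sketch below is not taken from the linked example. It assumes one Terraform workspace per environment (already created) and a variables file at `envs/<env>.tfvars`. Both are hypothetical conventions, as are the names `ENVIRONMENTS`, `deploy_commands`, and `deploy`.

```python
import subprocess
from typing import List

# Hypothetical workspace names, one per environment.
ENVIRONMENTS = ("dev", "qa", "prod")


def deploy_commands(env: str) -> List[List[str]]:
    """Build the Terraform CLI calls that deploy one environment."""
    if env not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {env}")
    return [
        ["terraform", "init", "-input=false"],
        # The workspace is assumed to exist already.
        ["terraform", "workspace", "select", env],
        [
            "terraform", "apply", "-input=false", "-auto-approve",
            f"-var-file=envs/{env}.tfvars",  # hypothetical file layout
        ],
    ]


def deploy(env: str, workdir: str = ".") -> None:
    """Run each command in order and stop at the first failure."""
    for cmd in deploy_commands(env):
        subprocess.run(cmd, cwd=workdir, check=True)
```

Because every environment goes through the same code path and differs only in its variables file, the environments stay identical except for the values you choose to vary.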