9 benefits of blue/green deployment strategy
TL;DR: 9 benefits and 2 tradeoffs to keep in mind about blue/green deployment strategy.
Inspired by “The DevOps Handbook” by Gene Kim (the book with the yellow cover), in the middle of 2019 I decided to start a new project implementing one of the most amazing concepts it covers: blue/green deployment.
How we used to deploy our code
ACloud.guru’s course for the DevOps Engineer certification on AWS shows some of the well-known deployment strategies we can choose from for our projects:
1. Single target (deploys to a single instance)
This is as basic as it is simple. We used to simply log in to our app servers and change the files that were running. In some cases we restarted the app server, and that was it. The worst part was that if something went wrong, you had to figure out the solution and bundle the hot fix, and during all that time your users were offline. In other words, you had downtime.
2. When we start to automate…
Complexity starts coming in. It is still very simple, but you already get rid of a big chunk of human error. Once you have your steps scripted, you can start adding more quality to the deployment. Automating unit and integration tests is a must. You can also add SonarQube or one of several other tools for checking your code. Here you still have downtime, and you’re still locked to one single instance of your environment. Your strategies can be:
- All-at-once (deploys to all instances at once)
- Minimum in-service (keeps a minimum number of instances in service while deploying to, testing, and validating the rest)
- Rolling (deploys little by little to a pre-defined number of instances at a time, let’s say 2 out of 5 in total, until it reaches all instances)
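To make the rolling strategy above a bit more concrete, here is a minimal sketch in Python of how a fleet could be split into batches. The function name and the instance names are made up for illustration, and a real pipeline would deploy to and validate each batch before moving on to the next.

```python
from typing import List


def rolling_batches(instances: List[str], batch_size: int) -> List[List[str]]:
    """Split the fleet into consecutive batches of at most batch_size instances."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [instances[i:i + batch_size] for i in range(0, len(instances), batch_size)]


# Hypothetical instance names, matching the "2 out of 5" example above.
fleet = ["app-1", "app-2", "app-3", "app-4", "app-5"]

for batch in rolling_batches(fleet, batch_size=2):
    # Placeholder for the real work: deploy to the batch, then test it.
    print("deploying to", batch)
```

With this framing, all-at-once is just the special case where the batch size equals the size of the fleet.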
3. Canary releases:
It’s an art. It is also nearly impossible to operate with a small team and just raw tools (open source languages). Creating all the controls to stop a release when something goes wrong is a lot of work. With canary you already have the freedom to implement new layers of tests, but you’re still affecting one single environment and competing with your users. So if something goes wrong, a small part of them will notice it.
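To give an idea of the kind of control a canary release needs, here is a rough sketch in Python of a gate that stops the release as soon as the error rate gets too high. The traffic steps, the threshold, and the observe_error_rate callback are all assumptions for illustration. In a real setup the callback would query your monitoring system, and that integration is where most of the work goes.

```python
from typing import Callable, Sequence


def run_canary(
    steps: Sequence[int],
    observe_error_rate: Callable[[int], float],
    max_error_rate: float,
) -> int:
    """Walk through the traffic percentages in steps.

    Returns the percentage of traffic the new version ends up serving.
    A return value of 0 means the release was stopped and rolled back.
    """
    reached = 0
    for percent in steps:
        # Hypothetical callback: error rate seen while serving `percent` of traffic.
        if observe_error_rate(percent) > max_error_rate:
            return 0  # stop the release and send all traffic back to the old version
        reached = percent
    return reached


# Simulated release that starts failing once it gets half of the traffic.
def flaky(percent: int) -> float:
    return 0.05 if percent >= 50 else 0.001


print(run_canary([5, 25, 50, 100], flaky, max_error_rate=0.01))
```

Note that in the simulated run some users already hit the faulty version at 5% and 25% before the gate trips, which is exactly the drawback described above.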
4. And then… Blue/Green
Here’s what happens during a blue/green deployment process:
Step 1: At first, your app domain (app.domain) points to the regular set of instances where your app is running (blue).
Step 2: Then you spin up a whole new environment (green) just like the one you already have, deploy the app into it, and test it all.
Step 3: Once all of your tests and verifications on the new environment tell you that the newly deployed code is healthy, you switch the DNS to the new environment (turning green into blue) and destroy the old blue.
The key thing is: at every deployment you create a brand new environment with your whole code. After spinning up the new (green) environment, you are free to test it however you want without your users even noticing. If something goes wrong, you trace the issue, request a hot fix, and destroy the new environment. Completely trauma-free for the users. Once all your testing layers are done, you go to the DNS or load balancer (whichever is the front door of your app), switch the environments, and destroy the old one whenever you want. With all this low-impact work you can implement all the layers of tests you need to make sure your app didn’t break: unit, integration, end-to-end, stress tests against an endpoint, penetration, etc.
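The three steps above can be sketched as a small simulation in Python. Everything here (Environment, FrontDoor, blue_green_deploy, the checks) is a hypothetical stand-in and not a real cloud API. The FrontDoor class plays the role of the DNS record or load balancer, and each check plays the role of one testing layer.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Environment:
    name: str
    version: str


class FrontDoor:
    """Stands in for the DNS record or load balancer in front of the app."""

    def __init__(self, live: Environment) -> None:
        self.live = live

    def switch_to(self, env: Environment) -> Environment:
        """Point traffic at env and return the environment that was live before."""
        previous, self.live = self.live, env
        return previous


def blue_green_deploy(
    front_door: FrontDoor,
    new_version: str,
    checks: Iterable[Callable[[Environment], bool]],
) -> bool:
    # Step 2: spin up a brand new environment and deploy the new code into it.
    green = Environment(name="green", version=new_version)

    # Run every testing layer against green while users are still on blue.
    if not all(check(green) for check in checks):
        print("checks failed, destroying", green.name)
        return False  # users never saw the broken version

    # Step 3: switch the front door, so green becomes the new blue.
    old_blue = front_door.switch_to(green)
    green.name = "blue"
    print("retiring old environment running", old_blue.version)
    return True


front_door = FrontDoor(Environment(name="blue", version="1.0"))
blue_green_deploy(front_door, "1.1", checks=[lambda env: env.version != ""])
print("live version:", front_door.live.version)
```

The point of the sketch is the ordering: the switch is the very last thing that happens, and a failed check returns before the front door is ever touched.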
Finally! The 9 benefits:
- Zero impact on users. If a deployment fails, your users won’t even notice that something ever happened.
- More stability to your app. You’ll have freedom to implement complex pipelines with all layers of automated tests (see pictures below).
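- Zero downtime for users if you’re using the load balancer approach. Virtually zero downtime if you’re using DNS, since some clients may keep resolving the old address until their cached record expires.
- Your new DR plan is having no separate DR plan. You get fast feedback on your infrastructure automation, because every deployment rebuilds the environment from scratch. You’re constantly testing your Disaster Recovery plan.
- It doesn’t require a separate rollback strategy. A failed deployment never receives traffic, so you just destroy the new environment.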
- No more ghost infrastructure. You won’t have the issues of those old machines that are up and nobody knows why and how they’re still there. Everything is scripted.
- Save money on cloud. It makes it very easy to automate the startup and shutdown of your non-prod environments.
- Easier to maintain than canary.
- Reduce the truck factor. Since all your infrastructure and app layers are scripted, the scripts become your environment documentation. You lose little knowledge when somebody leaves the company.
Important trade-offs to keep in mind:
- You can’t simply destroy an environment. What about the live connections in there? You have to gracefully shut down the old environment.
- Your deployment pipeline is gonna take longer. Remember that you’re building the whole app layer from scratch at every single deployment.
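For the first trade-off, a graceful shutdown usually means: stop sending new connections to the old environment, then wait for the open ones to finish (up to a timeout) before destroying it. Here is a minimal sketch of that idea in Python. The DrainingEnvironment class, the drain function, and the on_poll hook are made up for this example. A managed load balancer typically offers this as a setting, so you may not need to write it yourself.

```python
import time
from typing import Callable, Optional


class DrainingEnvironment:
    """Toy model of the old environment and its open connections."""

    def __init__(self, active_connections: int) -> None:
        self.active_connections = active_connections
        self.accepting = True

    def stop_accepting(self) -> None:
        self.accepting = False

    def finish_one(self) -> None:
        """Simulate one client finishing its request."""
        if self.active_connections > 0:
            self.active_connections -= 1


def drain(
    env: DrainingEnvironment,
    timeout_s: float,
    poll_s: float = 0.01,
    on_poll: Optional[Callable[[], None]] = None,
) -> bool:
    """Return True if all connections finished before the timeout."""
    env.stop_accepting()
    deadline = time.monotonic() + timeout_s
    while env.active_connections > 0:
        if time.monotonic() >= deadline:
            return False  # timed out: the caller decides whether to force the shutdown
        if on_poll is not None:
            on_poll()  # simulation hook only, real connections close on their own
        time.sleep(poll_s)
    return True


old_blue = DrainingEnvironment(active_connections=3)
if drain(old_blue, timeout_s=1.0, on_poll=old_blue.finish_one):
    print("old environment drained, safe to destroy")
```

The timeout matters because a single stuck connection should not keep the old environment (and its bill) alive forever.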
Credits of the heading image: https://dev.to/mostlyjason/intro-to-deployment-strategies-blue-green-canary-and-more-3a3