Red Badger on immutable infrastructure as code

This is a guest post for the Computer Weekly Developer Network by Stuart Harris in his role as CIO and founder of Red Badger.

Harris: pumped about code, likes badgers too

Red Badger describes itself as an experience-led software development studio focused on helping big corporations feel like start-ups (Ed: we think that means kind of Agile and nimble and ‘not clunky’).

Harris dives into code-centric complexities; his words follow below…

Immutable infrastructure as code

These days, what is inside the container is the developer’s responsibility and everything outside the container is the job of an infrastructure operator. We got here through an evolution from physical hardware, manually set up and managed by large teams, through virtual machines (with tools for manual management), to automation with tools like Puppet, Chef and Ansible that help eliminate human error and provide consistency.

… and so DevOps was born.

These tools allowed us to modify the infrastructure components more predictably. This is “infrastructure as code”. But the mutation is very hard to do safely – there are too many edge cases and what-if scenarios that make the tools complex and fragile.

Now, with Linux Containers and Docker, we finally have a way to embrace immutability. Instead of using tools to mutate our application’s environments we can replace them completely with new environments when something needs to change. DevOps died.

Linux Containers are as revolutionary as their real-world counterparts in the shipping industry. They’ve taught us that instead of mutating a server configuration, it’s safer to treat servers as immutable and replace them with brand new instances.

This is “immutable infrastructure as code”.

You can divide infrastructure into components that are either stateful or stateless. In modern scalable web application architectures, state lives at the very front edge (the browser) and the very back edge (the data stores). Everything in between needs to be stateless so that it can scale horizontally. These stateless components should be immutable. You never change them directly. Indeed the only thing we should do is to create better ones to replace the old ones.
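The "replace, never change" rule can be illustrated with a minimal Python sketch. The `Component` class, the `roll_forward` helper and the image tags are hypothetical names invented for this illustration; they are not part of any tool mentioned here.

```python
from dataclasses import FrozenInstanceError, dataclass, replace


@dataclass(frozen=True)
class Component:
    """A stateless component, described by the image version it runs (hypothetical model)."""
    name: str
    image: str


def roll_forward(old: Component, new_image: str) -> Component:
    """Build a replacement component; the old one is left untouched."""
    return replace(old, image=new_image)


web_v1 = Component(name="web", image="web:1.0.0")
web_v2 = roll_forward(web_v1, "web:1.1.0")

try:
    web_v1.image = "web:1.1.0"  # in-place mutation is rejected by the frozen dataclass
    mutated = True
except FrozenInstanceError:
    mutated = False
```

The old component stays exactly as it was, so rolling back simply means pointing traffic at `web_v1` again.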

Infrastructure as code – NAT Gateways described in Terraform

This means that if the component is a container, or a VM, you can remove SSH access and create security rules that forbid traffic on port 22. If there is a problem, we do root cause analysis through runtime logging and monitoring. Instrument everything and install APM (New Relic is awesome). Gather logs and store them for analysis. When you find a problem, make a change to the source and redeploy.
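A rule such as "no traffic on port 22" can be checked automatically before anything is deployed. The following is a rough sketch that assumes ingress rules are modelled as simple port ranges; the `IngressRule` type and `allows_port` function are made up for this example and do not correspond to any cloud provider's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IngressRule:
    """One inbound rule: a protocol and an inclusive port range (hypothetical model)."""
    protocol: str
    from_port: int
    to_port: int


def allows_port(rules, port, protocol="tcp"):
    """Return True if any rule lets traffic through on the given port."""
    return any(
        r.protocol == protocol and r.from_port <= port <= r.to_port
        for r in rules
    )


# Only HTTP and HTTPS are open, so SSH (port 22) is unreachable.
rules = [IngressRule("tcp", 80, 80), IngressRule("tcp", 443, 443)]
ssh_open = allows_port(rules, 22)
```

A check like this could run in the delivery pipeline and fail the build whenever `ssh_open` is true.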

You can store the application source code, the container configuration and the infrastructure scripts in a single GitHub repository (a mono-repo) that describes your whole application, including the environment and the infrastructure. Transactional (atomic) commits can be made across all three areas (application source, environment configuration and infrastructure), which keeps them in sync easily and gives you an audit trail of changes for free.

Terraform is great – you just describe what your infrastructure should look like in code. You can adjust this declaration and ask Terraform to show you a plan. If you’re happy, you apply the change.

Terraform decides what it can re-create and what it needs to mutate. Servers, for example, are recreated. Terraform can stand up full datacentres in minutes and destroy them equally fast. So we can now afford to create a whole new infrastructure every time we need to make a “change”. Except for the data stores, the whole infrastructure can become immutable.
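The idea of comparing a declared state with the current one and producing a plan can be sketched as a toy model in Python. This is not how Terraform is implemented: the `plan` function and the resource dictionaries are assumptions made for illustration, and this sketch always replaces a changed resource rather than deciding between replacing and mutating.

```python
def plan(current: dict, desired: dict) -> dict:
    """Return the actions needed to move from the current state to the desired one."""
    shared = current.keys() & desired.keys()
    return {
        # Declared but not yet running.
        "create": sorted(desired.keys() - current.keys()),
        # Running with a different definition: swap for a fresh instance.
        "replace": sorted(k for k in shared if current[k] != desired[k]),
        # Running but no longer declared.
        "destroy": sorted(current.keys() - desired.keys()),
    }


current = {"web": {"image": "web:1.0.0"}, "worker": {"image": "worker:2.3.0"}}
desired = {"web": {"image": "web:1.1.0"}, "cache": {"image": "cache:0.9.0"}}
changes = plan(current, desired)
```

Here `changes` lists `cache` to create, `web` to replace and `worker` to destroy, which is the kind of summary you would review before applying anything.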

Because it’s so cheap to stand up completely new infrastructure, we can afford to create permanent environments and ephemeral environments (e.g. for load, performance and penetration testing) knowing they are absolutely identical to each other. And we can evolve these environments, in an immutable way, by creating new ones. No configuration drift, no outdated operating systems or environmental software, no server rot, no surprises, no human error. Instead, you can trickle changes into the infrastructure using continuous delivery.
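One way to convince yourself that two environments really are identical is to fingerprint the definition each was built from. The sketch below assumes an environment can be described as a plain dictionary; the `fingerprint` function and the sample definitions are hypothetical and only illustrate the idea.

```python
import hashlib
import json


def fingerprint(definition: dict) -> str:
    """Hash a canonical JSON form of the definition, so key order does not matter."""
    canonical = json.dumps(definition, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Two environments built from the same definition share a fingerprint.
staging = fingerprint({"web": {"image": "web:1.1.0", "count": 3}})
load_test = fingerprint({"web": {"count": 3, "image": "web:1.1.0"}})

# A manual tweak (configuration drift) would show up as a different fingerprint.
drifted = fingerprint({"web": {"image": "web:1.1.0", "count": 3, "hotfix": True}})
```

With immutable environments the fingerprints only ever differ because the source definition changed, never because someone edited a running server.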