Infrastructure as Code on AWS - An Introduction

Mastering AWS (or any cloud provider) doesn't only mean knowing the different services and their configurations. It also means working on AWS in a professional setting with professional tools.

You cannot build scalable and reproducible applications in the cloud without learning Infrastructure as Code tools. Cloud architectures can be quite complex, and building them by hand in the AWS Console does not scale. As soon as you need to deploy the same infrastructure in a second account or in another region, you have to repeat every step manually, and mistakes creep in.

That is why this blog post focuses on why we need Infrastructure as Code (IaC) and what it actually is. To understand it even better, we also look at the history of IaC and at the tools that are out there.

Application Code vs. Infrastructure Code

One thing to note right at the beginning is that the line between infrastructure code and application code has become really blurry. A few years back, IaC simply referred to the infrastructure your application runs on.

Infrastructure and Application Separated

For example, you had a Spring Boot server developed in Java in one repository, and your infrastructure code only handled the deployment of this repository on EC2 or ECS.

Infrastructure and Application Code Line Blurry

In a cloud-native or serverless environment, your application code and your infrastructure code are often one. For example, if you use SQS and Lambda to process incoming user requests asynchronously, the queue and the function are application code and infrastructure code in one.

Always keep that in mind. There is no pure infrastructure code or pure business logic code anymore in a serverless environment. You shouldn't treat your code, processes, or teams as if there were.

What is Infrastructure as Code?

Infrastructure as Code describes the practice of provisioning your infrastructure resources with some sort of source code. The alternative to Infrastructure as Code is manually provisioning the resources, for example with the AWS Management Console.

IaC lets you simply define your infrastructure with either source code or configuration files.

Why Infrastructure as Code?

The same environment can be provisioned in another account or region

Every time you want to create a new environment, such as a staging or QA environment, you can simply deploy it automatically from your code. The same applies to regions. If you want to switch from eu-central-1 to us-east-1, you change the target region and deploy again. With manual provisioning, you would have to repeat every step by hand.
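As a minimal sketch of that idea, assume the infrastructure is described by a plain TypeScript function that returns a CloudFormation-style template object. The names `EnvironmentConfig`, `EnvTemplate`, `buildTemplate`, and the bucket name are hypothetical, and the template is only built in memory; nothing is deployed:

```typescript
// Hypothetical description of one target environment.
interface EnvironmentConfig {
  stage: string;  // e.g. "staging" or "qa"
  region: string; // e.g. "eu-central-1"
}

// Minimal CloudFormation-style template shape, just enough for this sketch.
interface EnvTemplate {
  AWSTemplateFormatVersion: string;
  Resources: {
    [logicalId: string]: { Type: string; Properties: { [key: string]: string } };
  };
}

// Builds the same infrastructure definition for any environment.
function buildTemplate(env: EnvironmentConfig): EnvTemplate {
  return {
    AWSTemplateFormatVersion: "2010-09-09",
    Resources: {
      UploadBucket: {
        Type: "AWS::S3::Bucket",
        Properties: {
          // Bucket names are globally unique, so stage and region go into the name.
          BucketName: `awsfundamentals-uploads-${env.stage}-${env.region}`,
        },
      },
    },
  };
}

// One definition, two environments in two regions.
const stagingTemplate = buildTemplate({ stage: "staging", region: "eu-central-1" });
const qaTemplate = buildTemplate({ stage: "qa", region: "us-east-1" });
```

Both templates come from the same function, so the environments can only differ in the values you pass in.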

History of Changes

With IaC you probably use version control like Git. With that, you automatically have a history of all your changes, which helps with governance or simply with understanding how your architecture evolved over time.

DevOps Best Practices

By having source code you can follow DevOps best practices like:

  • Having a deployment pipeline

  • Automated test suites

  • Continuous Deployment
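To illustrate the automated test suite point, here is a rough sketch of a check that a pipeline could run against a template before deploying it. The rule itself (every S3 bucket must enable versioning) is only an assumed team convention, and `TemplateResource`, `StackTemplate`, and `findBucketsWithoutVersioning` are hypothetical names:

```typescript
// Minimal CloudFormation-style shapes, just enough for this sketch.
interface TemplateResource {
  Type: string;
  Properties?: { [key: string]: unknown };
}
interface StackTemplate {
  Resources: { [logicalId: string]: TemplateResource };
}

// Returns the logical IDs of all S3 buckets that do not enable versioning.
function findBucketsWithoutVersioning(template: StackTemplate): string[] {
  const offenders: string[] = [];
  for (const logicalId of Object.keys(template.Resources)) {
    const resource = template.Resources[logicalId];
    if (!resource || resource.Type !== "AWS::S3::Bucket") continue;
    const props = resource.Properties || {};
    const versioning = props["VersioningConfiguration"] as { Status?: string } | undefined;
    if (!versioning || versioning.Status !== "Enabled") {
      offenders.push(logicalId);
    }
  }
  return offenders;
}

const sampleTemplate: StackTemplate = {
  Resources: {
    LogsBucket: {
      Type: "AWS::S3::Bucket",
      Properties: { VersioningConfiguration: { Status: "Enabled" } },
    },
    TempBucket: { Type: "AWS::S3::Bucket" },
    JobsQueue: { Type: "AWS::SQS::Queue" },
  },
};

// Only "TempBucket" violates the rule; a pipeline could fail on a non-empty list.
const violations = findBucketsWithoutVersioning(sampleTemplate);
```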

Reproducible Setup

You don’t have to remember any steps or where you need to click. You have everything documented in the code.

There are many more benefits out there but let's just say you won't ever professionally develop on AWS without Infrastructure as Code.

History of IaC

The History of Infrastructure as Code

Let's have a look at the history of infrastructure as code.

Manual

Creating an S3 bucket manually

Manual provisioning was the start. Back when AWS only had three different services and not too many configuration options it was fairly easy to simply log in to the console and create your SQS queue and your bucket.

You didn’t have any code, any history, or any way to do governance on your architecture. Working with each other meant creating checklists with screenshots on how to launch a specific service.

It was especially painful to launch your whole application in a new environment or region. There were folders full of checklists just to document which settings to activate where.

You mostly launched more and more services without thinking about your existing ones, and your architecture quickly became a mess.

After AWS launched many more services and configuration options for them, this approach was no longer feasible.

Pros:

  • No tool to learn: simply launch your service

Cons:

  • No history

  • Not reproducible

  • Really prone to errors

  • It takes a lot of time to launch another region or environment

Scripted

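The second provisioning method was creating scripts using the AWS CLI. The CLI is one of the main interfaces to the AWS API. For example, you can create an S3 bucket with the CLI like this (outside us-east-1 you also need to pass --create-bucket-configuration LocationConstraint=<region>):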

aws s3api create-bucket --bucket awsfundamentalstestbucket

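People used the CLI in bash scripts to provision new environments. While this seems like a good programmatic approach at first, you will see many downsides very quickly. A script only replays commands: it does not know which resources already exist or what to do when a step fails halfway through.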

Pros:

  • History

  • Programmatic approach

Cons:

  • No reliable way to update existing resources

  • No built-in rollbacks

  • Error handling is almost impossible

Declarative

CloudFormation and Terraform as Declarative IaC

The next step was using a declarative method. A declarative approach defines what the final state looks like. We don't care how it gets there; we only define the final state.

For example, we can define that we need an S3 bucket. How the tool takes care of providing us with that bucket is not of interest.
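To make the idea more concrete, here is a toy sketch of what a declarative tool does conceptually: it compares the desired state with the current state and derives the required actions. This is a simplification for illustration only, not how any real tool is implemented, and `ResourceState`, `PlannedAction`, and `plan` are hypothetical names:

```typescript
// Maps a resource name to its type, e.g. { uploads: "AWS::S3::Bucket" }.
type ResourceState = { [name: string]: string };

interface PlannedAction {
  kind: "create" | "delete" | "replace";
  name: string;
}

// Derives the actions needed to move from the current to the desired state.
function plan(current: ResourceState, desired: ResourceState): PlannedAction[] {
  const actions: PlannedAction[] = [];
  for (const name of Object.keys(desired)) {
    if (!(name in current)) {
      actions.push({ kind: "create", name: name });
    } else if (current[name] !== desired[name]) {
      actions.push({ kind: "replace", name: name });
    }
  }
  for (const name of Object.keys(current)) {
    if (!(name in desired)) {
      actions.push({ kind: "delete", name: name });
    }
  }
  return actions;
}

const currentState: ResourceState = {
  uploads: "AWS::S3::Bucket",
  jobs: "AWS::SQS::Queue",
};
const desiredState: ResourceState = {
  uploads: "AWS::S3::Bucket",
  events: "AWS::SNS::Topic",
};

// Only the difference is acted on: "events" is created and "jobs" is deleted.
const plannedActions = plan(currentState, desiredState);
```

The bucket appears in both states, so nothing happens to it. You only ever edit the desired state; working out the steps is the tool's job.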

AWS introduced the service CloudFormation. CloudFormation allows you to provision resources, handle errors, and roll back states.

A CloudFormation template is a simple configuration file in YAML or JSON format. For example, a file could look like this:

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Description": "Sample CloudFormation Template",
  "Resources": {
    "s3Bucket": {
      "Type": "AWS::S3::Bucket",
      "Properties": {
        "BucketName": "awsfundamentalstestbucket-xyz"
      }
    }
  },
  "Outputs": {
    "BucketName": {
      "Value": {
        "Ref": "s3Bucket"
      }
    }
  }
}

CloudFormation keeps track of the resources it has already provisioned in a stack and works out which ones it needs to create, update, or delete.

CloudFormation is still used a lot today, especially in large corporations that adopted it right at the beginning.

Another very popular declarative IaC tool is Terraform. It doesn't use CloudFormation in the background; it talks to the AWS API directly and keeps track of the current state itself.

Pros:

  • State management

  • Rollback & Error Handling

  • Reproducible

  • History

Cons:

  • You need to work on large JSON or YAML files all day

  • Really hard to debug

  • New syntax to learn

  • You cannot really use proper programming constructs

Componentized

CDK and Pulumi as Componentized IaC

The final stage, where we are today, is called componentized. Componentized here refers to building reusable abstractions that developers can easily use, which is part of a developer's daily work anyway.

The main difference with tools in this stage is that you use a proper programming language like Python, TypeScript, or Java. Under the hood, CloudFormation is often still used.

Popular frameworks in that space are the Cloud Development Kit (CDK) and Pulumi.

With the CDK, you still use the service CloudFormation underneath: your code is synthesized into a CloudFormation template. That means things like drift detection, state updates, and rollbacks keep working. Pulumi does not use CloudFormation; it has its own engine and manages state itself.

Using a proper programming language is one of the major benefits here. Developers are used to working with their IDE and all the capabilities it offers. Debugging, refactoring, and building reusable components are all possible, and they are things a developer does every day.

Creating a bucket in CDK looks like this:

new s3.Bucket(this, 'MyFirstBucket', {});
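A single bucket does not show much of the benefit yet. The following framework-free sketch illustrates what reusable components and ordinary loops buy you. It deliberately does not use the CDK or Pulumi API; it only builds CloudFormation-style resource objects in memory, and `ResourceDefinition`, `ResourceMap`, and `queueWithDeadLetter` are hypothetical names:

```typescript
// Minimal CloudFormation-style resource shape, just enough for this sketch.
interface ResourceDefinition {
  Type: string;
  Properties: { [key: string]: unknown };
}
type ResourceMap = { [logicalId: string]: ResourceDefinition };

// Reusable component: an SQS queue together with its dead-letter queue.
function queueWithDeadLetter(id: string, maxReceiveCount: number): ResourceMap {
  const resources: ResourceMap = {};
  const deadLetterId = `${id}DeadLetterQueue`;
  resources[deadLetterId] = { Type: "AWS::SQS::Queue", Properties: {} };
  resources[`${id}Queue`] = {
    Type: "AWS::SQS::Queue",
    Properties: {
      // Messages that fail maxReceiveCount times move to the dead-letter queue.
      RedrivePolicy: {
        deadLetterTargetArn: { "Fn::GetAtt": [deadLetterId, "Arn"] },
        maxReceiveCount: maxReceiveCount,
      },
    },
  };
  return resources;
}

// An ordinary loop stamps out the component once per use case.
const allResources: ResourceMap = {};
for (const id of ["Orders", "Payments", "Emails"]) {
  const component = queueWithDeadLetter(id, 3);
  for (const logicalId of Object.keys(component)) {
    allResources[logicalId] = component[logicalId];
  }
}
```

In a plain JSON or YAML template you would have to write out and maintain all six queue definitions by hand. Here the wiring between a queue and its dead-letter queue lives in one place.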

Pros:

  • Use your IDE

  • No new syntax to learn

  • You have all programming constructs at hand (loops, conditions, etc.)

  • You can build abstractions

  • Rollback, history, and error handling are all there

Cons:

  • Too many abstractions can lead to a bad design

  • You can build systems that rely on external inputs (e.g. external states)

  • CloudFormation knowledge is often still needed

This should give you an initial understanding of what Infrastructure as Code is and why we need it.