Infrastructure as Code: Why You Need It

July 24, 2020 | 15 min read

In this article

Meet Infrastructure as Code
IaC Tools
Technical Differences of IaC Tools
About Terraform
"Terraform, Power On!"
No One Likes Meaningless Duplication
Divide and Conquer
"By the Power of Workspaces!"
"Infrastructure, Assemble!"

In the early days of server computing, developers worked only in bare-metal environments. Each server was a separate physical unit. Those were simpler times: developers connected the server to the network and a power source, logged in via Telnet or SSH, installed all the necessary software, set up cron jobs for alerts, and were done.

Then came virtualization. Starting in the early 2000s, the IT industry dove into this amazing world without realizing just how far it would extend.

In the beginning, the same approach that worked in a bare-metal environment still worked in this environment. The only difference was that there were virtual servers instead of physical ones. Gradually, more servers were added, and old methods couldn't keep up with the volume. Eventually, provisioners were created to simplify and accelerate the process of setting up servers and installing software.

Now, we live in the era of cloud computing. Today's engineers may need dozens or hundreds of servers to accomplish their business goals. The need for a new approach has become critical.

Meet Infrastructure as Code

As Andreas and Michael Wittig define it in their book Amazon Web Services in Action, Infrastructure as Code (IaC) is "the process of managing and provisioning computer data centers through machine-readable definition files, rather than physical hardware configuration or interactive configuration tools."

Managing infrastructure as code provides several important benefits, including the ability to:

Version your infrastructure
Cover it with tests
Quickly scale the number of environments

By implementing IaC, your business can:

Reduce costs since your business can now utilize its computing resources efficiently.
Increase speed, as engineers can spend more time on improvements and development instead of on routine tasks.
Decrease risks by replacing manual operations with automation, reducing the chance of human error.

IaC Tools

With all of these benefits in mind, you may decide to implement IaC. If that's the case, you need a proper tool to match your project's requirements.

Cloud vendor tools short list:

Google Cloud Deployment Manager
Azure Resource Manager
AWS CloudFormation

Third-party tools short list:

Chef
Terraform

Third-party tools are a great option for users who need to manage several cloud environments.

Technical Differences of IaC Tools

From the technical perspective, IaC tool implementation has several variations:

Mutable Infrastructure vs. Immutable Infrastructure
Procedural vs. Declarative
Master vs. Masterless
Agent vs. Agentless

Each of these options has its strengths and weaknesses. Among the available tools, Terraform is widely regarded as the mainstream choice today. Choose carefully up front, however: once your IaC manages more than a few dozen resources, migrating to another tool becomes much more difficult.

About Terraform

Modules

A module is a container for multiple resources that are used together. Every Terraform configuration has at least one module, known as its root module, which consists of the resources defined in the .tf files in the main working directory.

A module can call other modules, and can easily include the child module's resources in the configuration. Modules can also be called multiple times, either within the same configuration or in separate configurations, allowing resource configurations to be packaged and re-used.
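As a minimal sketch of calling one module several times, the configuration below instantiates the same child module twice with different inputs. The local path ./modules/network, its input names, and its vpc_id output are hypothetical and are used only for illustration.

```hcl
# Hypothetical local module, called twice with different inputs.
module "network_dev" {
  source      = "./modules/network" # assumed path, not from the article
  environment = "dev"
  vpc_cidr    = "10.10.0.0/16"
}

module "network_prod" {
  source      = "./modules/network"
  environment = "prod"
  vpc_cidr    = "10.20.0.0/16"
}

# A child module's outputs are read as module.<NAME>.<OUTPUT>.
# This assumes the hypothetical module declares an output named vpc_id.
output "dev_vpc_id" {
  value = module.network_dev.vpc_id
}
```

Each module block produces its own set of resources, so the two calls do not interfere with each other.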

Below is a code example for creating basic network infrastructure in AWS:

module "core" {
  source = "github.com/lean-delivery/tf-module-aws-core.git?ref=1.0.0"

  project            = "amazing"
  environment        = "production"
  availability_zones = ["us-east-1a", "us-east-1b"]

  # AWS VPC CIDR blocks must be between /16 and /28
  vpc_cidr         = "10.0.0.0/16"
  private_subnets  = ["10.0.1.0/24", "10.0.2.0/24"]
  public_subnets   = ["10.0.3.0/24", "10.0.4.0/24"]
  database_subnets = var.database_subnets # declared as a variable in the root module

  create_database_subnet_group = true
  enable_nat_gateway           = true
}

More useful Terraform modules can be found in the Lean Delivery project on GitHub.

Workspaces

Each Terraform configuration has an associated backend that defines how operations are executed and where persistent data, such as the Terraform state, is stored. The persistent data stored in the backend belongs to a workspace. Initially, the backend only has one workspace called "default," and thus, there is only one Terraform state associated with this configuration.

Certain backends support multiple named workspaces, allowing multiple states to be associated with a single configuration. The configuration still has only one backend, but multiple distinct instances of that configuration can be deployed without users having to configure a new backend or change authentication credentials.

Multiple workspaces are currently supported by the following backends:

AzureRM
HashiCorp Consul
Google Cloud Storage (GCS)
Local filesystem
Manta
Postgres
Terraform Remote
AWS S3

Terraservices

Terraservices break components up into logical modules so that we can manage them separately. With Terraservices, there is one state file per component rather than one per environment. Typically, if you have not done so already, this is also the point at which you move to a distributed (remote) state setup.

"Terraform, Power On!"

After almost two years of using Terraform, we have finally identified our best practices, which we will share in the examples below.

Let's use AWS as the cloud provider in our example. First, we should prepare infrastructure for a new service, which includes:

Several Amazon Elastic Compute Cloud (EC2) instances for backend and frontend
Some of these instances should be balanced with the Application Load Balancer (ALB)
Relational Database Service (RDS)
A Virtual Private Cloud (VPC) that contains everything above, along with subnets, routing tables, etc.

In this example, we use AWS S3 as the storage for Terraform state files.
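For orientation, here is a minimal sketch of an S3 backend block. The bucket name, state key, and region are hypothetical placeholders, not values from our setup.

```hcl
terraform {
  backend "s3" {
    bucket  = "amazing-tfstate"          # hypothetical bucket name
    key     = "1_core/terraform.tfstate" # hypothetical path of the state object
    region  = "us-east-1"                # region of the bucket (assumed)
    encrypt = true                       # enable server-side encryption of the state
  }
}
```

With this backend, the state of the default workspace is stored at the given key. States of other workspaces are stored under a workspace-specific prefix (by default env:/<workspace name>/) in the same bucket.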

No One Likes Meaningless Duplication

In our approach, one Terraservice inherits data from another through the terraform_remote_state data source. Through this data source, we can read any outputs of Terraservices that have already been applied. As a result, in every new Terraservice, we only need to define a few specific variables.
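A minimal sketch of this inheritance might look as follows. The bucket name, state key, and the vpc_id output are assumptions made for illustration; the upstream Terraservice has to declare that output in its root module for it to be readable here.

```hcl
# Read the state of an already-applied Terraservice (for example, 1_core).
data "terraform_remote_state" "core" {
  backend   = "s3"
  workspace = terraform.workspace # read the state of the matching workspace

  config = {
    bucket = "amazing-tfstate"          # hypothetical
    key    = "1_core/terraform.tfstate" # hypothetical
    region = "us-east-1"                # hypothetical
  }
}

# Use an inherited value instead of redefining it as an input variable.
resource "aws_security_group" "backend" {
  name   = "backend-${terraform.workspace}"
  vpc_id = data.terraform_remote_state.core.outputs.vpc_id # assumed output name
}
```

Only root-module outputs of the upstream configuration are exposed this way, so values from nested modules need to be re-exported as root outputs.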

Divide and Conquer

Following the Terraservices concept, we divide our Terraform code into several groups:

Terraform state storage infrastructure
Core infra: VPC, Subnets, routing tables, etc.
Common resources:
1. Bastion instance (if needed)
2. RDS
3. Network connectivity (if needed)
Infrastructure for our new service

The last point could contain several separate Terraservices, depending on your target infrastructure:

Shared resources
Service's frontend
Service's backend

Note: If you want to separate production and non-production environments by placing them in different accounts, move the Terraform backend configuration from the .tf files to separate .hcl files. This lets you choose the required backend at the terraform init step:

[user@host ~] $ terraform init -backend-config=/path/to/your/tf_backend_config.hcl
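This relies on Terraform's partial backend configuration. As a rough sketch, the .tf file keeps only the settings shared by all environments, and the .hcl file supplies the rest. The file name dev.hcl matches the convention used in this article; the bucket name, key, and region are hypothetical.

```hcl
# --- backend.tf (inside a Terraservice directory) ---
# Only the environment-independent part stays in the .tf file.
terraform {
  backend "s3" {
    key = "1_core/terraform.tfstate" # hypothetical state key
  }
}

# --- dev.hcl (passed via -backend-config) ---
# A backend config file contains plain key/value pairs, not a backend block.
# In a real repository this is a separate file; it is shown here for brevity.
bucket = "amazing-dev-tfstate" # hypothetical
region = "us-east-1"           # hypothetical
```

Terraform merges the two sources during terraform init, so the same code can point at a different bucket (and account) per environment.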

The directory tree in your repository will look like this:

[Figure: repository directory tree]

Some may ask why we store the tfstate files for 0_terraform_infra in our Git repository. The code in the 0_terraform_infra step creates the S3 bucket for our Terraform backend. Until that bucket exists, there is nowhere else to store tfstate files. These files don't contain any sensitive data, so we don't break Git best practices.

In addition, 0_terraform_infra creates a Terraform backend config file (prod.hcl, dev.hcl), which is used by all subsequent Terraservices. The file name is generated from the workspace name.
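One way this step could be implemented is sketched below. It is an illustration under assumptions, not our exact code: the bucket naming scheme, the region variable, and the rule that the environment is the first segment of the workspace name are all hypothetical. The local_file resource comes from the hashicorp/local provider.

```hcl
variable "region" {
  type        = string
  description = "AWS region of the state bucket (hypothetical input)"
}

locals {
  # Assumption: the workspace is named "dev", "prod", or "<env>-<region>";
  # the first dash-separated segment is taken as the environment.
  environment = split("-", terraform.workspace)[0]
}

# Bucket that will hold the state of all later Terraservices.
resource "aws_s3_bucket" "tf_state" {
  bucket = "amazing-${local.environment}-tfstate" # hypothetical naming scheme
}

# Write <environment>.hcl next to the Terraservice directories so that it can
# be passed to terraform init -backend-config=../<environment>.hcl later.
resource "local_file" "backend_config" {
  filename = "${path.module}/../${local.environment}.hcl"
  content  = <<-EOT
    bucket = "${aws_s3_bucket.tf_state.bucket}"
    region = "${var.region}"
  EOT
}
```

Because this configuration uses the local backend, its own tfstate file ends up in the working directory, which is why it is committed to Git.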

"By the Power of Workspaces!"

Now that we have Terraform code for our infrastructure, we need it to manage several environments, including production and development. Terraform workspaces are designed exactly for this. First, let's agree on the naming convention.

Assumption: The workspace name will contain the environment name and the AWS region name, e.g. prod-eu-west-1 and dev-us-east-1.
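With such a convention, the environment and the region can be derived from the workspace name inside the configuration. The snippet below is a sketch that assumes every workspace follows the <environment>-<region> pattern; the local names are hypothetical.

```hcl
locals {
  # "dev-us-east-1" is split into ["dev", "us", "east", "1"]
  workspace_parts = split("-", terraform.workspace)

  # First segment is the environment: "dev"
  environment = local.workspace_parts[0]

  # Remaining segments are joined back into the region: "us-east-1"
  region = join("-", slice(local.workspace_parts, 1, length(local.workspace_parts)))
}

provider "aws" {
  region = local.region
}
```

Note that the built-in default workspace does not match the pattern, so local.region would be empty there. This approach only works if the default workspace is never used for applying.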

For production and development environments, we should use different input values – that's why each environment should have a separate *.tfvars file. Let's name them according to the workspace name to avoid confusion: prod-eu-west-1.tfvars and dev-us-east-1.tfvars.
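For illustration, a variables file for the development workspace might look like the sketch below. The variable names mirror the inputs of the core module shown earlier; the values are hypothetical, and each variable must be declared in the Terraservice that consumes the file.

```hcl
# tfvars/dev-us-east-1.tfvars (hypothetical values)
project            = "amazing"
environment        = "dev"
availability_zones = ["us-east-1a", "us-east-1b"]
vpc_cidr           = "10.10.0.0/16"
private_subnets    = ["10.10.1.0/24", "10.10.2.0/24"]
public_subnets     = ["10.10.3.0/24", "10.10.4.0/24"]
```

The production file, prod-eu-west-1.tfvars, would define the same variables with production values, such as availability zones in eu-west-1.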

The setup sequence example for 1_core would look like:

[user@host 1_core] $ terraform init -backend-config=../dev.hcl # Initialize backend for dev environment
[user@host 1_core] $ terraform workspace new dev-us-east-1 # Create new workspace for dev environment
[user@host 1_core] $ terraform apply -var-file=tfvars/dev-us-east-1.tfvars # Create dev infrastructure by applying Terraform code
[user@host 1_core] $ rm -rf .terraform # Remove backend configuration for dev env
[user@host 1_core] $ terraform init -backend-config=../prod.hcl # Initialize backend for production environment
[user@host 1_core] $ terraform workspace new prod-eu-west-1 # Create new workspace for production environment
[user@host 1_core] $ terraform apply -var-file=tfvars/prod-eu-west-1.tfvars # Create prod infrastructure by applying Terraform code

"Infrastructure, Assemble!"

With these tips, you'll have flexible control over each level of your environments. A sensible separation of your infrastructure code will allow you to update any part of the infrastructure safely, with minimal risk and minimal effect on other parts of the service.
