If you haven't read the Day 5 post, the short version is: state is how Terraform remembers what it created, and remote state in S3 with DynamoDB locking is the baseline for team use. This post assumes that's already in place.
Why This Matters in the Industry
Once teams get comfortable with Terraform, a new problem emerges: how do you manage state across multiple environments, multiple teams, and multiple regions without everything becoming one tangled mess?
This is where a lot of Terraform setups start breaking down. You end up with one giant state file containing prod, staging, and dev all mixed together — or worse, engineers sharing one backend config and accidentally running terraform apply against production when they meant staging. Neither is a fun situation to be in.
The patterns in this post are what experienced teams use to avoid that. State isolation, workspace boundaries, and knowing how to inspect and fix state without guessing are the difference between infrastructure that's easy to maintain and infrastructure that's a source of constant anxiety.
Inspecting State
Before getting into managing state across teams, it helped to understand what's actually inside state and how to interact with it through the CLI.
List all resources in state:
terraform state list
Output from my Day 4/5 setup looked like this:
data.aws_ami.amazon_linux
data.aws_availability_zones.available
data.aws_subnets.default
data.aws_vpc.default
aws_autoscaling_group.web
aws_lb.web
aws_lb_listener.http
aws_lb_target_group.web
aws_security_group.alb
aws_security_group.instance
aws_launch_template.web
Inspect a specific resource:
terraform state show aws_lb.web
This prints every attribute Terraform knows about that resource — the ARN, DNS name, subnets, security groups, all of it. Useful for debugging when something doesn't match what you expect.
Move a resource to a different name:
terraform state mv aws_instance.old_name aws_instance.new_name
If I rename a resource in my .tf files, Terraform sees it as a delete + create — which would destroy and recreate the actual infrastructure. Using state mv first tells Terraform the resource just moved, so the real resource stays untouched.
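The same rename can also be recorded in the config itself. This is a minimal sketch, assuming Terraform 1.1 or later (where the moved block is available) and reusing the two addresses from the command above:

```hcl
# Records the rename declaratively: on the next plan, Terraform treats
# aws_instance.old_name in state as aws_instance.new_name instead of
# planning a destroy + create.
# Assumes the resource block itself has already been renamed to "new_name".
moved {
  from = aws_instance.old_name
  to   = aws_instance.new_name
}
```

The practical difference is that the block lives in version control, so everyone who applies the config gets the same move without running a state command by hand.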
Remove a resource from state without destroying it:
terraform state rm aws_instance.example
This removes the resource from Terraform's tracking but leaves the actual AWS resource running. Useful when you want to take something out of Terraform management without deleting it.
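For a declarative version of the same idea, here is a sketch that assumes Terraform 1.7 or later, where the removed block exists. It replaces the resource block for aws_instance.example, which has to be deleted from the config:

```hcl
# Tells Terraform to forget aws_instance.example on the next apply.
# destroy = false is what keeps the real instance running; without it,
# Terraform would plan to destroy the instance.
removed {
  from = aws_instance.example

  lifecycle {
    destroy = false
  }
}
```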
Import an existing resource into state:
terraform import aws_instance.example i-0abc123def456789
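If infrastructure exists in AWS but not in state (created manually or by another tool), import pulls it in so Terraform can manage it going forward. The command only writes to state, so a matching resource block (here aws_instance.example) has to exist in the config first.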
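Imports can also be written as config. This sketch assumes Terraform 1.5 or later, where import blocks are supported, and reuses the address and instance ID from the command above:

```hcl
# On the next plan/apply, Terraform adopts the existing instance into state
# at the address aws_instance.example.
# A resource "aws_instance" "example" block still needs to exist in the config.
import {
  to = aws_instance.example
  id = "i-0abc123def456789"
}
```

If the resource block hasn't been written yet, running terraform plan with -generate-config-out=generated.tf asks Terraform to draft one from the imported resource (generated.tf is just a file name I picked).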
The Problem with One State File for Everything
Once I had remote state working, I started thinking about what happens when the project grows. If everything lives in one state file — all environments, all services — a few bad things can happen:
A mistake in one environment affects all of them. Running a bad terraform apply that deletes the wrong resource in a single state file could take down prod and staging at the same time.
terraform plan gets slow. By default, every plan refreshes each resource in state against the provider's API, so a state file with hundreds of resources makes every plan slower.
Access control is hard. If the whole team has access to the same backend, everyone can read everything — including resources in environments they shouldn't touch.
The solution is to split state into smaller, isolated pieces.
Isolation Strategy 1: File Layout
The simplest approach is a separate folder per environment, each with its own backend config and state file:
infrastructure/
├── prod/
│   ├── main.tf
│   ├── variables.tf
│   └── backend.tf   # points to prod/terraform.tfstate in S3
├── staging/
│   ├── main.tf
│   ├── variables.tf
│   └── backend.tf   # points to staging/terraform.tfstate in S3
└── dev/
    ├── main.tf
    ├── variables.tf
    └── backend.tf   # points to dev/terraform.tfstate in S3
Each folder is a completely independent Terraform config. Running terraform apply in staging/ has zero effect on prod/. They don't share state, so they can't interfere with each other.
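As a rough sketch, prod/backend.tf could look like the block below. The bucket and lock table names are the ones used elsewhere in this post; the exact key path and encrypt = true are my assumptions, so adjust them to your own layout:

```hcl
# prod/backend.tf
# Only the key differs between environments; staging/ and dev/ would use
# their own key so each folder reads and writes a separate state file.
terraform {
  backend "s3" {
    bucket         = "my-terraform-state-bucket"
    key            = "prod/web-app/terraform.tfstate" # assumed path
    region         = "us-east-1"
    dynamodb_table = "terraform-state-locks"
    encrypt        = true # assumption: encrypt state at rest
  }
}
```

Backend blocks can't use variables, which is part of why each environment folder carries its own copy of this file.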
The tradeoff is code duplication — prod/main.tf and staging/main.tf are usually very similar. Modules (coming in a later day) are how you solve that without copy-pasting.
Isolation Strategy 2: Workspaces
Terraform Workspaces let you have multiple state files within a single configuration. Each workspace has its own state:
# Create and switch to a staging workspace
terraform workspace new staging
# Switch back to default (prod)
terraform workspace select default
# List all workspaces
terraform workspace list
Inside my config, I can reference the current workspace:
locals {
  environment = terraform.workspace # "default", "staging", "dev"
  name_prefix = "web-app-${local.environment}"
}
This way the same config deploys to different environments just by switching workspaces — the resource names and state are kept separate automatically.
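The workspace name can drive more than naming. Here is a small sketch of per-environment sizing, where the map, the local names, and the instance types are all hypothetical values chosen for illustration:

```hcl
locals {
  # Hypothetical sizing per workspace; "default" stands in for prod here.
  instance_types = {
    default = "t3.medium"
    staging = "t3.small"
    dev     = "t3.micro"
  }

  # Fall back to the smallest size for any workspace not listed above,
  # such as a short-lived feature workspace.
  instance_type = lookup(local.instance_types, terraform.workspace, "t3.micro")
}
```

A launch template or instance resource would then reference local.instance_type instead of a hardcoded value.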
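The state files in S3 end up organized like this. Non-default workspaces are stored under the env:/ prefix, while the default workspace's state sits directly at the configured key:
my-terraform-state-bucket/
├── web-app/terraform.tfstate        # default workspace
└── env:/
    ├── staging/
    │   └── web-app/terraform.tfstate
    └── dev/
        └── web-app/terraform.tfstate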
Workspace Limitations
Workspaces feel like the perfect solution until you hit their limits. The main one: all workspaces share the same backend configuration, which in practice means the same bucket and the same credentials. There's no built-in way to let someone deploy to staging while keeping them out of prod, because whoever can run the config in one workspace can run it in any of them.
For real environment isolation where prod and staging have genuinely different access controls, file layout with separate AWS accounts is the safer pattern. Workspaces are better suited for short-lived feature environments than for long-lived prod/staging/dev separation.
Backend Authentication
Have you thought about how Terraform authenticates to S3 to read and write state?
It uses the standard AWS credential chain on the machine: environment variables, the shared credentials file you set up with aws configure, or an attached IAM role. When using remote state in a CI/CD pipeline, this usually means an IAM role attached to the build agent.
The minimum IAM permissions needed for the S3 backend:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:PutObject",
        "s3:DeleteObject"
      ],
      "Resource": "arn:aws:s3:::my-terraform-state-bucket/*"
    },
    {
      "Effect": "Allow",
      "Action": [
        "s3:ListBucket"
      ],
      "Resource": "arn:aws:s3:::my-terraform-state-bucket"
    },
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:GetItem",
        "dynamodb:PutItem",
        "dynamodb:DeleteItem"
      ],
      "Resource": "arn:aws:dynamodb:*:*:table/terraform-state-locks"
    }
  ]
}
Scoping these permissions tightly matters — especially for the prod state bucket. You don't want every developer to have write access to production state.
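One way to scope this down is to limit object access to a single environment's key prefix. The sketch below expresses a staging-only version with the AWS provider's aws_iam_policy_document data source; the staging/ prefix, the resource names, and the policy name are my assumptions, not something the backend requires:

```hcl
# Staging-only variant of the policy above.
data "aws_iam_policy_document" "staging_state" {
  # Object access is limited to keys under staging/ (assumed prefix).
  statement {
    effect    = "Allow"
    actions   = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"]
    resources = ["arn:aws:s3:::my-terraform-state-bucket/staging/*"]
  }

  # Listing stays bucket-wide; it reveals key names but not state contents.
  statement {
    effect    = "Allow"
    actions   = ["s3:ListBucket"]
    resources = ["arn:aws:s3:::my-terraform-state-bucket"]
  }

  # Lock table access is unchanged from the policy above.
  statement {
    effect    = "Allow"
    actions   = ["dynamodb:GetItem", "dynamodb:PutItem", "dynamodb:DeleteItem"]
    resources = ["arn:aws:dynamodb:*:*:table/terraform-state-locks"]
  }
}

resource "aws_iam_policy" "staging_state" {
  name   = "terraform-state-staging" # hypothetical name
  policy = data.aws_iam_policy_document.staging_state.json
}
```

A role carrying only this policy can read and write staging state but gets access denied on anything under prod/.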
Reading State from Another Config
One thing that comes up when splitting state into separate configs: how does one config reference a value from another? For example, if networking lives in its own state file, how does the web app config get the VPC ID?
The answer is terraform_remote_state:
# In the web-app config, read outputs from the networking config
data "terraform_remote_state" "networking" {
  backend = "s3"
  config = {
    bucket = "my-terraform-state-bucket"
    key    = "prod/networking/terraform.tfstate"
    region = "us-east-1"
  }
}

# Now use the subnet IDs from the networking config
resource "aws_lb" "web" {
  subnets = data.terraform_remote_state.networking.outputs.subnet_ids
  # ... (remaining arguments omitted)
}
This is cleaner than hardcoding IDs or duplicating infrastructure between configs. The networking config exposes values through output blocks, and any other config can read them through terraform_remote_state.
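On the networking side, the matching outputs might look roughly like this. The resource names aws_vpc.main and aws_subnet.public are hypothetical, and the sketch assumes the subnets are created with count so the splat expression returns a list:

```hcl
# In the networking config: only values declared as outputs are visible
# to other configs through terraform_remote_state.
output "vpc_id" {
  description = "ID of the shared VPC"
  value       = aws_vpc.main.id # hypothetical resource name
}

output "subnet_ids" {
  description = "IDs of the subnets the load balancer attaches to"
  value       = aws_subnet.public[*].id # hypothetical resource name
}
```

Anything that isn't declared as an output stays invisible to the consuming config, even though reading remote state still requires read access to the whole state file.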
What I've Got Now
After Day 5 and Day 6, my state setup looks like this:
S3 Bucket: my-terraform-state-bucket
├── dev/web-app/terraform.tfstate
├── staging/web-app/terraform.tfstate
└── prod/web-app/terraform.tfstate
DynamoDB Table: terraform-state-locks
(used by all three environments for locking)
Each environment is fully isolated. A terraform apply against dev/ can't touch prod/ state. Backend authentication is scoped per IAM role, so CI/CD pipelines only have access to the environment they're deploying to.
This took two days to properly understand because state isn't the flashiest topic — there's nothing to watch appear in the AWS console. But it's the kind of thing that bites you hard if you ignore it, and it's much easier to set up correctly from the start than to fix later when state is a mess.
The terraform state commands were the most immediately useful part. Being able to inspect, move, and selectively remove resources without destroying real infrastructure makes debugging a lot less scary.
This post is part of a 30-day Terraform learning journey.