Setup and context — What Claude Code Brings to IaC Development
Managing cloud infrastructure as Infrastructure as Code (IaC) has become standard practice for modern engineering organizations. Yet the learning curve of Terraform, the security risks from misconfigurations, the challenge of managing ballooning cloud costs, and handling configuration drift between your code and production remain persistent pain points for most teams.
Claude Code can change how teams approach these problems. You describe your infrastructure requirements in plain language, Claude Code generates the Terraform code, and that code can be checked for security issues and paired with rough cost estimates in the same workflow.
This guide walks through concrete implementation patterns for integrating Claude Code into each phase of Terraform development, complete with working code examples. We won't cover the basics. We'll go deep into advanced techniques: CLAUDE.md design, automation with hooks, and parallel processing using sub-agents.
The target audience is engineers who already understand Terraform fundamentals and want to integrate Claude Code into production DevOps workflows.
Prerequisites and Environment Setup
To get the most from this guide, you'll need the following:
Environment
- Claude Code (latest version)
- Terraform v1.7 or higher
- AWS CLI v2 (configured)
- tfsec or Checkov (security scanning)
- Node.js 18 or higher (for hooks scripts)
Required Knowledge
- Terraform basics (provider, resource, variable, output)
- Core AWS services (VPC, EC2, RDS, IAM, etc.)
- Git fundamentals
- Conceptual understanding of Claude Code hooks
Start by setting up your working directory and project structure:
mkdir -p ~/projects/infra-with-claude
cd ~/projects/infra-with-claude
# Initialize Terraform project structure
mkdir -p {modules/{vpc,ec2,rds,iam},environments/{dev,staging,prod},.claude}
touch CLAUDE.md
Designing CLAUDE.md: Giving Claude Context
The quality of Claude Code's output depends directly on how precisely you communicate your project's context. For Terraform projects specifically, it's critical to convey your provider versions, naming conventions, and security policies upfront.
# Infra Project — CLAUDE.md
## Project Overview
Multi-environment AWS infrastructure managed with Terraform.
Target environments: dev / staging / prod
## Terraform Conventions
- Provider: aws ~> 5.0, terraform ~> 1.7
- Backend: S3 + DynamoDB (state lock)
- Naming convention: {env}-{service}-{resource} e.g., prod-api-sg
- Required tags: Environment, Project, ManagedBy=terraform, Owner
## Security Policies
- All S3 buckets must be encrypted (SSE-S3 or stronger)
- Security groups must NOT allow 0.0.0.0/0 inbound
- IAM roles must follow the principle of least privilege
- RDS must have public access disabled; Multi-AZ required in prod
## Module Design
- Reusable modules live under modules/
- Each module separates variables.tf / outputs.tf / main.tf
- Common tag logic centralized in locals.tf
## Cost Awareness
- dev uses t3.micro / db.t3.micro as baseline
- Prioritize Spot Instances for prod workloads where possible
- Regular reviews for unused resources
With this CLAUDE.md in place, Claude Code automatically considers policies like "no 0.0.0.0/0 allowed" and "tags are mandatory" when generating code, so you don't have to repeat them in every prompt.
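CLAUDE.md is guidance rather than enforcement, so it can help to check the simpler conventions mechanically as well. The following is a minimal sketch, not part of Claude Code: the function name `checkConventions` and the regular expression are assumptions made for illustration, based only on the naming convention and required tags listed in the CLAUDE.md above.

```javascript
// Hypothetical helper: checks a resource name and tag map against the
// conventions from CLAUDE.md. Names and regex are assumptions for this sketch.
const REQUIRED_TAGS = ['Environment', 'Project', 'ManagedBy', 'Owner'];
const ENVIRONMENTS = ['dev', 'staging', 'prod'];

// {env}-{service}-{resource}, e.g. prod-api-sg
const NAME_PATTERN = new RegExp(`^(${ENVIRONMENTS.join('|')})-[a-z0-9]+-[a-z0-9-]+$`);

function checkConventions(name, tags) {
  const problems = [];
  if (!NAME_PATTERN.test(name)) {
    problems.push(`name "${name}" does not match {env}-{service}-{resource}`);
  }
  for (const key of REQUIRED_TAGS) {
    if (!(key in tags)) problems.push(`missing required tag: ${key}`);
  }
  // ManagedBy has a fixed value in the CLAUDE.md conventions
  if ('ManagedBy' in tags && tags.ManagedBy !== 'terraform') {
    problems.push('ManagedBy must be "terraform"');
  }
  return problems;
}

// A compliant resource produces an empty list
console.log(checkConventions('prod-api-sg', {
  Environment: 'prod', Project: 'infra', ManagedBy: 'terraform', Owner: 'platform',
}));
// A non-compliant one lists every violation
console.log(checkConventions('api_sg', { Environment: 'prod' }));
```

A check like this could run in CI or in a hook, leaving CLAUDE.md to carry the policies that are harder to express as rules.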
Practical Terraform Code Generation Patterns
Pattern 1: Generate Modules from Natural Language
Let's start with the most fundamental use case.
# Example prompt to Claude Code
claude "Following the policies in CLAUDE.md, generate a VPC module with these requirements:
- Multi-AZ (ap-northeast-1a/1c)
- 2 public and 2 private subnets each
- NAT Gateway only for prod (not needed in dev)
- VPC Flow Logs output to CloudWatch
Split into modules/vpc/main.tf, variables.tf, and outputs.tf"
Claude Code analyzes these requirements and produces code like this:
# modules/vpc/main.tf (Claude Code generated example)
locals {
common_tags = merge(var.tags, {
Module = "vpc"
})
}
resource "aws_vpc" "main" {
cidr_block = var.vpc_cidr
enable_dns_hostnames = true
enable_dns_support = true
tags = merge(local.common_tags, {
Name = "${var.env}-vpc"
})
}
# Public subnets (Multi-AZ)
resource "aws_subnet" "public" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index)
availability_zone = var.availability_zones[count.index]
# Do not auto-assign public IPs; attach them explicitly where needed
map_public_ip_on_launch = false
tags = merge(local.common_tags, {
Name = "${var.env}-public-${var.availability_zones[count.index]}"
Tier = "public"
})
}
# Private subnets (Multi-AZ)
resource "aws_subnet" "private" {
count = length(var.availability_zones)
vpc_id = aws_vpc.main.id
cidr_block = cidrsubnet(var.vpc_cidr, 4, count.index + length(var.availability_zones))
availability_zone = var.availability_zones[count.index]
tags = merge(local.common_tags, {
Name = "${var.env}-private-${var.availability_zones[count.index]}"
Tier = "private"
})
}
# NAT Gateway (prod only)
resource "aws_eip" "nat" {
count = var.enable_nat_gateway ? length(var.availability_zones) : 0
domain = "vpc"
tags = merge(local.common_tags, {
Name = "${var.env}-nat-eip-${count.index}"
})
}
resource "aws_nat_gateway" "main" {
count = var.enable_nat_gateway ? length(var.availability_zones) : 0
allocation_id = aws_eip.nat[count.index].id
subnet_id = aws_subnet.public[count.index].id
tags = merge(local.common_tags, {
Name = "${var.env}-nat-${count.index}"
})
# aws_internet_gateway.main is defined elsewhere in this module (omitted from this excerpt)
depends_on = [aws_internet_gateway.main]
}
# VPC Flow Logs (the IAM role and CloudWatch log group referenced below are omitted from this excerpt)
resource "aws_flow_log" "main" {
vpc_id = aws_vpc.main.id
traffic_type = "ALL"
iam_role_arn = aws_iam_role.flow_log.arn
log_destination = aws_cloudwatch_log_group.flow_log.arn
tags = local.common_tags
}
Pattern 2: Refactoring Existing Infrastructure
When your existing Terraform code has issues, you can ask Claude Code to diagnose and refactor it:
claude "Read modules/ec2/main.tf and refactor it with these objectives:
1. Replace hardcoded AMI IDs with data sources
2. Extract security group rules (from inline to aws_security_group_rule)
3. Fix any violations of the tag policy in CLAUDE.md
4. Add inline comments explaining the changes"
Automated Security Scanning with Hooks
Using Claude Code's PostToolUse hooks, you can build a system that automatically runs a security scan each time Claude Code writes or edits a Terraform file.
// .claude/settings.json
{
"hooks": {
"PostToolUse": [
{
"matcher": "Write|Edit",
"hooks": [
{
"type": "command",
"command": "node .claude/hooks/terraform-security-scan.js"
}
]
}
]
}
}
// .claude/hooks/terraform-security-scan.js
const { spawnSync } = require('child_process');
const path = require('path');

// Read the hook event (JSON) from stdin
let inputData = '';
process.stdin.on('data', chunk => { inputData += chunk; });
process.stdin.on('end', () => {
  let event;
  try {
    event = JSON.parse(inputData);
  } catch {
    process.exit(0); // Not an event we can parse; do nothing
  }
  const filePath = event.tool_input?.file_path || event.tool_input?.path || '';

  // Only process Terraform files
  if (!filePath.endsWith('.tf') && !filePath.endsWith('.tfvars')) {
    process.exit(0);
  }

  const dir = path.dirname(filePath);
  console.error(`🔍 Terraform security scan: ${filePath}`);

  // Pass arguments as an array so the path is never interpreted by a shell
  const scan = spawnSync('tfsec', [dir, '--format', 'json', '--no-color'], { encoding: 'utf8' });

  // Skip gracefully if tfsec is not installed
  if (scan.error?.code === 'ENOENT') {
    console.error('⚠️ tfsec not installed. Install it first (for example: brew install tfsec)');
    process.exit(0);
  }

  // tfsec exits non-zero when it finds problems, so parse stdout
  // regardless of the exit status instead of treating it as a failure
  let report;
  try {
    report = JSON.parse(scan.stdout);
  } catch {
    console.error('⚠️ Could not parse tfsec output; skipping scan');
    process.exit(0);
  }

  const criticalIssues = (report?.results || []).filter(r =>
    r.severity === 'CRITICAL' || r.severity === 'HIGH'
  );

  if (criticalIssues.length > 0) {
    const issues = criticalIssues.map(issue =>
      `[${issue.severity}] ${issue.rule_id}: ${issue.description} (${issue.location?.filename}:${issue.location?.start_line})`
    ).join('\n');
    // Log the findings to stderr for debugging
    console.error(`⚠️ Security issues detected:\n${issues}`);
    // The JSON written to stdout is what feeds the findings back to Claude
    process.stdout.write(JSON.stringify({
      decision: 'block',
      reason: `Terraform security issues detected. Fixes required:\n${issues}`
    }));
    process.exit(0);
  }

  console.error('✅ Security scan: No issues found');
  process.exit(0);
});
With this hook in place, whenever Claude Code writes Terraform that tfsec flags as HIGH or CRITICAL, the findings are returned to Claude, which can then propose a fix.
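The filtering step inside the hook is easier to test when it is pulled out as a pure function. The sketch below does that under a few assumptions: the helper names `summarizeFindings` and `buildHookResponse` are hypothetical, the report fields are the same ones the hook script reads (`results`, `severity`, `rule_id`, `description`, `location`), and the sample findings are made up for illustration.

```javascript
// Severities that should be sent back to Claude as blocking feedback
const BLOCKING = new Set(['CRITICAL', 'HIGH']);

// Turns a parsed tfsec report into one line per blocking finding.
// Tolerates a missing or null `results` field.
function summarizeFindings(report) {
  const results = Array.isArray(report?.results) ? report.results : [];
  return results
    .filter(r => BLOCKING.has(r.severity))
    .map(r => {
      const where = `${r.location?.filename ?? 'unknown'}:${r.location?.start_line ?? '?'}`;
      return `[${r.severity}] ${r.rule_id}: ${r.description} (${where})`;
    });
}

// Builds the PostToolUse response object; null means nothing to report.
function buildHookResponse(report) {
  const lines = summarizeFindings(report);
  if (lines.length === 0) return null;
  return {
    decision: 'block',
    reason: `Terraform security issues detected:\n${lines.join('\n')}`,
  };
}

// Sample report with made-up rule IDs, for illustration only
const sample = {
  results: [
    { severity: 'HIGH', rule_id: 'EXAMPLE-001',
      description: 'Security group allows ingress from 0.0.0.0/0',
      location: { filename: 'modules/ec2/main.tf', start_line: 12 } },
    { severity: 'LOW', rule_id: 'EXAMPLE-002',
      description: 'Resource has no description',
      location: { filename: 'modules/ec2/main.tf', start_line: 30 } },
  ],
};

console.log(buildHookResponse(sample));
console.log(buildHookResponse({ results: null }));
```

Keeping the parsing separate from process handling means the severity threshold can be changed and tested without running tfsec or Claude Code.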
Combining with Checkov
Adding Checkov alongside tfsec enables more comprehensive IaC scanning:
# Checkov scan (called from Claude Code)
claude "Analyze the following Checkov scan output and fix all HIGH or above findings:
$(checkov -d . --framework terraform --compact --output json 2>/dev/null)"
AI-Powered terraform plan Analysis
The output of terraform plan can be overwhelming to parse manually. Claude Code can rapidly analyze these diffs and surface risks clearly.
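For large plans, it can help to reduce the plan to the changes that matter before handing it to Claude. The sketch below assumes the JSON plan representation produced by `terraform show -json`, specifically its top-level `resource_changes` array whose entries carry an `address` and a `change.actions` list. The function names and the sample plan are hypothetical.

```javascript
// A replacement shows up as a two-element actions array, in either order.
function classify(actions) {
  const joined = actions.join(',');
  if (joined === 'delete,create' || joined === 'create,delete') return 'replace';
  return joined; // 'create', 'update', 'delete', 'no-op', 'read'
}

// Groups resource addresses by the kind of change planned for them.
// 'no-op' and 'read' entries are ignored.
function summarizePlan(plan) {
  const summary = { create: [], update: [], replace: [], delete: [] };
  for (const rc of plan.resource_changes ?? []) {
    const kind = classify(rc.change?.actions ?? []);
    if (Object.hasOwn(summary, kind)) summary[kind].push(rc.address);
  }
  return summary;
}

// Made-up plan fragment for illustration
const plan = {
  resource_changes: [
    { address: 'aws_db_instance.main', change: { actions: ['delete', 'create'] } },
    { address: 'aws_nat_gateway.main[0]', change: { actions: ['create'] } },
    { address: 'aws_instance.bastion', change: { actions: ['delete'] } },
    { address: 'aws_vpc.main', change: { actions: ['no-op'] } },
  ],
};

console.log(summarizePlan(plan));
```

In practice the input would come from something like `JSON.parse(fs.readFileSync('tfplan.json', 'utf8'))`, and the `replace` and `delete` lists are the ones worth calling out explicitly in the prompt.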
# Run terraform plan and have Claude Code analyze it
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json
claude "Read tfplan.json and analyze it from these angles:
1. Risk assessment for destructive changes (destroy/replace)
2. Security impact of security group changes
3. Downtime risk from database changes
4. Cost delta estimate (added/removed resources)
5. A checklist of items to verify before applying to production
Format the output as a Markdown report"
Claude Code generates reports like this:
## terraform plan Analysis Report
### 🔴 Destructive Changes (Attention Required)
- `aws_db_instance.main`: **REPLACE** — Change to parameter_group_name
requires RDS instance restart. Expected downtime: ~2–5 minutes
### 🟡 Security Changes
- `aws_security_group_rule.api_ingress`: Port 443 ingress source changed
from `10.0.0.0/8` → `10.1.0.0/16` (narrower scope = improved security)
### 💰 Cost Estimate (Approximate)
- Added: `aws_nat_gateway` × 2 → +~$64/month
- Removed: `aws_instance.bastion` → -~$8/month
- Net change: +~$56/month
### ✅ Pre-Apply Checklist
- [ ] RDS snapshot taken
- [ ] Apply during maintenance window
- [ ] Rollback procedure confirmed
Automating Cost Optimization
Here's a workflow for automatically generating cloud cost optimization recommendations using Claude Code.
# Export AWS Cost Explorer data
aws ce get-cost-and-usage \
--time-period Start=2026-03-01,End=2026-04-01 \
--granularity MONTHLY \
--metrics BlendedCost \
--group-by Type=DIMENSION,Key=SERVICE \
--output json > cost_report.json
# Analyze with Claude Code
claude "Cross-reference cost_report.json with the current Terraform code in modules/ and environments/ and provide:
1. Optimization recommendations for the top 5 most expensive services
2. Identification of low-utilization resources (inferred from tag data)
3. Reserved Instance / Savings Plan candidates
4. Estimated savings from nightly shutdown of dev/staging environments
Propose Terraform changes ready for a PR"
Automatic Spot Instance Migration Proposals
# Spot Instance optimization example (Claude Code proposal)
# modules/ec2/main.tf (optimized)
resource "aws_launch_template" "app" {
name_prefix = "${var.env}-app-"
image_id = data.aws_ami.amazon_linux.id
instance_type = var.instance_type
# Spot instance configuration, enabled per environment through var.use_spot
dynamic "instance_market_options" {
for_each = var.use_spot ? [1] : []
content {
market_type = "spot"
spot_options {
# Settings for interruption-tolerant workloads
instance_interruption_behavior = "terminate"
max_price = var.spot_max_price
}
}
}
lifecycle {
create_before_destroy = true
}
}
Drift Detection and Automated Remediation
In production environments, infrastructure drift from manual changes is a serious risk. Here's a Claude Code-powered workflow for detecting and resolving drift.
# Drift detection script
mkdir -p .claude/scripts
cat > .claude/scripts/detect-drift.sh << 'EOF'
#!/bin/bash
echo "🔍 Starting Terraform drift detection..."
# terraform plan refreshes state in memory before diffing, so the
# deprecated `terraform refresh` command is not needed here
terraform plan -detailed-exitcode -out=drift.tfplan 2>&1
EXIT_CODE=$?
if [ $EXIT_CODE -eq 2 ]; then
  echo "⚠️ Drift detected"
  terraform show -json drift.tfplan > drift.json
  # -p runs Claude Code non-interactively and prints the response
  claude -p "Analyze drift.json and:
1. Identify resources likely changed manually
2. Infer the intent of each change (emergency fix vs. mistake)
3. Recommend how to reconcile (revert with terraform apply vs. update the code to match)
4. Propose AWS Config rules to prevent recurrence"
elif [ $EXIT_CODE -eq 0 ]; then
  echo "✅ No drift: infrastructure matches the Terraform configuration"
else
  # Exit code 1 means the plan itself failed
  echo "❌ terraform plan failed (exit code $EXIT_CODE)"
fi
rm -f drift.tfplan drift.json
EOF
chmod +x .claude/scripts/detect-drift.sh
GitHub Actions Integration
# .github/workflows/terraform-drift.yml
name: Terraform Drift Detection
on:
  schedule:
    - cron: '0 0 * * 1-5' # Weekdays at 9:00 JST (UTC 00:00)
  workflow_dispatch:
permissions:
  id-token: write # OIDC token used by role-to-assume
  contents: read
  issues: write # Needed to open the drift issue
jobs:
  drift-check:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.x
          # Call the terraform binary directly so the step sees plan's raw exit code
          terraform_wrapper: false
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DRIFT_CHECK_ROLE }}
          aws-region: ap-northeast-1
      - name: Terraform Init
        run: terraform init
        working-directory: environments/prod
      - name: Detect Drift
        id: drift
        run: |
          # Capture the real exit code; appending `|| true` would always leave $? at 0
          code=0
          terraform plan -detailed-exitcode -out=drift.tfplan || code=$?
          echo "exit_code=$code" >> "$GITHUB_OUTPUT"
          # Exit code 1 is a genuine plan failure, so fail the job
          if [ "$code" -eq 1 ]; then exit 1; fi
        working-directory: environments/prod
      - name: Notify if Drift Detected
        if: steps.drift.outputs.exit_code == '2'
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `[Alert] Terraform Drift Detected - ${new Date().toISOString().split('T')[0]}`,
              body: 'Terraform drift detected in production. Investigation and remediation required.',
              labels: ['infrastructure', 'urgent']
            });
Parallel Infrastructure Generation with Sub-Agents
For large-scale infrastructure changes, Claude Code's sub-agent feature lets independent modules be generated in parallel, which can cut generation time considerably.
claude "Use separate sub-agents to generate the following infrastructure components in parallel:
[Agent 1] modules/networking/:
- VPC, subnets, route tables, IGW, NAT Gateway
[Agent 2] modules/security/:
- WAF, Security Hub, GuardDuty config, Security Group templates
[Agent 3] modules/compute/:
- ECS Fargate cluster, ALB, Auto Scaling
[Agent 4] modules/data/:
- RDS Aurora PostgreSQL, ElastiCache Redis, S3 bucket suite
Once all agents complete, run a consistency check (VPC ID references, cross-module security group references)"
Because these four modules are largely independent of one another, parallel sub-agent execution can compress infrastructure design work that might otherwise take 2–3 hours into roughly 20–30 minutes.
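The consistency check requested at the end of that prompt can also be approximated mechanically, as a safety net after the agents finish. The sketch below is a deliberately simple, regex-based check, not a real HCL parser. The function names, the sample `outputs.tf` text, and the sample root configuration are all hypothetical. It flags `module.<name>.<output>` references that point at outputs a module does not declare.

```javascript
// Collects the output names declared in an outputs.tf file's text.
// Matches lines of the form: output "vpc_id" {
function declaredOutputs(outputsTf) {
  const matches = outputsTf.matchAll(/^\s*output\s+"([^"]+)"\s*\{/gm);
  return new Set([...matches].map(m => m[1]));
}

// Returns every module.<name>.<output> reference in sourceText whose
// module is unknown or whose output is not declared by that module.
function findDanglingReferences(sourceText, outputsByModule) {
  const dangling = [];
  const pattern = /\bmodule\.([A-Za-z_][\w-]*)\.([A-Za-z_][\w-]*)/g;
  for (const [ref, moduleName, outputName] of sourceText.matchAll(pattern)) {
    const outputs = outputsByModule[moduleName];
    if (!outputs || !outputs.has(outputName)) dangling.push(ref);
  }
  return dangling;
}

// Made-up module outputs and root configuration for illustration
const networkingOutputs = `
output "vpc_id" {
  value = aws_vpc.main.id
}
output "private_subnet_ids" {
  value = aws_subnet.private[*].id
}
`;

const rootConfig = `
module "compute" {
  source     = "../../modules/compute"
  vpc_id     = module.networking.vpc_id
  subnet_ids = module.networking.public_subnet_ids
}
`;

console.log(findDanglingReferences(rootConfig, {
  networking: declaredOutputs(networkingOutputs),
}));
```

`terraform validate` remains the authoritative check. A script like this is only a quick way to spot mismatched names between modules written by different agents before running it.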
Common Errors and How to Handle Them
Error 1: Provider Version Mismatch
Error: Invalid provider configuration
Ask Claude Code:
claude "Fix this error:
$(terraform init 2>&1)
Review the provider constraints in versions.tf and unify provider versions across all modules"
Error 2: State Lock Contention
Error: Error locking state: ConditionalCheckFailedException
claude "A Terraform state lock is stuck.
Walk me through:
1. How to check for currently running terraform processes
2. How to inspect the DynamoDB lock record
3. Safe steps to force-unlock
Step by step please"
Error 3: Circular References
Error: Cycle: module.a.resource, module.b.resource
claude "Resolve this Terraform circular reference:
$(terraform graph 2>&1 | head -50)
Analyze the dependency graph, explain which module should sit higher in the hierarchy, and apply the fix"
Summary
This guide walked through practical patterns for integrating Claude Code across all phases of Terraform development.
Here are the key takeaways.
CLAUDE.md design is the foundation of everything. Pre-defining security policies, naming conventions, and cost guidelines dramatically improves the quality of generated code.
Automated security scanning via hooks complements human review. Integration with tfsec and Checkov substantially reduces the risk of vulnerable configurations entering your codebase. To go deeper on Claude Code's hooks capabilities, Claude Code Hooks Automation Techniques — Practical Guide to Automating Your Development Workflow is a valuable companion read.
AI-powered terraform plan analysis speeds up risk assessment and cost estimation, and reduces the chance of missed checks before production applies.
Drift detection × GitHub Actions continuously maintains production reliability. For large-scale infrastructure changes where parallel sub-agents are needed, Claude Code Parallel Agent Development Mastery Guide — 10x Faster Development with Worktrees, Sub-Agents, and Task Management provides the implementation blueprint.