Claude Code × Terraform: Complete AWS Infrastructure Automation Guide — Design, Generation, Security Auditing, and Cost Optimization

What Claude Code Brings to IaC Development

Managing cloud infrastructure as Infrastructure as Code (IaC) has become standard practice for modern engineering organizations. Yet most teams still struggle with Terraform's learning curve, security risks from misconfigurations, ballooning cloud costs, and configuration drift between code and production.

Claude Code can take over much of this work. You describe your infrastructure requirements in plain language, Claude Code drafts the Terraform code, and the same session can review that code for security issues and produce a rough cost estimate. The output still needs human review, but the workflow is usable today.

This guide walks through concrete implementation patterns for integrating Claude Code into each phase of Terraform development, complete with working code examples. We won't cover the basics. We'll go deep into advanced techniques: CLAUDE.md design, automation with hooks, and parallel processing using sub-agents.

The target audience is engineers who already understand Terraform fundamentals and want to integrate Claude Code into production DevOps workflows.

Prerequisites and Environment Setup

To get the most from this guide, you'll need the following:

Environment

Claude Code (latest version)
Terraform v1.7 or higher
AWS CLI v2 (configured)
tfsec or Checkov (security scanning)
Node.js 18 or higher (for hooks scripts)

Required Knowledge

Terraform basics (provider, resource, variable, output)
Core AWS services (VPC, EC2, RDS, IAM, etc.)
Git fundamentals
Conceptual understanding of Claude Code hooks

Start by setting up your working directory and project structure:

mkdir -p ~/projects/infra-with-claude
cd ~/projects/infra-with-claude
 
# Initialize Terraform project structure
mkdir -p {modules/{vpc,ec2,rds,iam},environments/{dev,staging,prod},.claude}
touch CLAUDE.md

Designing CLAUDE.md — Giving Claude Context

The quality of Claude Code's output depends directly on how precisely you communicate your project's context. For Terraform projects specifically, it's critical to convey your provider versions, naming conventions, and security policies upfront.

# Infra Project — CLAUDE.md
 
## Project Overview
Multi-environment AWS infrastructure managed with Terraform.
Target environments: dev / staging / prod
 
## Terraform Conventions
- Provider: aws ~> 5.0, terraform ~> 1.7
- Backend: S3 + DynamoDB (state lock)
- Naming convention: {env}-{service}-{resource} e.g., prod-api-sg
- Required tags: Environment, Project, ManagedBy=terraform, Owner
 
## Security Policies
- All S3 buckets must be encrypted (SSE-S3 or stronger)
- Security groups must NOT allow 0.0.0.0/0 inbound
- IAM roles must follow the principle of least privilege
- RDS must have public access disabled; Multi-AZ required in prod
 
## Module Design
- Reusable modules live under modules/
- Each module separates variables.tf / outputs.tf / main.tf
- Common tag logic centralized in locals.tf
 
## Cost Awareness
- dev uses t3.micro / db.t3.micro as baseline
- Prioritize Spot Instances for interruption-tolerant workloads (dev/staging first)
- Regular reviews for unused resources

With this CLAUDE.md in place, Claude Code automatically considers policies like "no 0.0.0.0/0 allowed" and "tags are mandatory" when generating code — without you having to repeat them every time.
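
The same conventions can also be checked mechanically, for example in a hook or a CI step. The following is a minimal sketch rather than part of the original setup: the helper names `isValidName` and `tagProblems` are hypothetical, and reading `{env}-{service}-{resource}` as lowercase hyphen-separated segments is an assumption.

```javascript
// Hypothetical helpers mirroring the CLAUDE.md conventions above.
const ENVIRONMENTS = ['dev', 'staging', 'prod'];
const REQUIRED_TAGS = ['Environment', 'Project', 'ManagedBy', 'Owner'];

// {env}-{service}-{resource}: lowercase segments separated by hyphens.
// Assumption: the resource part may itself contain further hyphens.
function isValidName(name) {
  const pattern = new RegExp(
    `^(${ENVIRONMENTS.join('|')})-[a-z0-9]+-[a-z0-9]+(-[a-z0-9]+)*$`
  );
  return pattern.test(name);
}

// Returns the required tag keys that are absent, plus a note if
// ManagedBy is present but not set to "terraform".
function tagProblems(tags) {
  const problems = REQUIRED_TAGS.filter(key => !(key in tags));
  if ('ManagedBy' in tags && tags.ManagedBy !== 'terraform') {
    problems.push('ManagedBy must be "terraform"');
  }
  return problems;
}

console.log(isValidName('prod-api-sg'));
console.log(tagProblems({ Environment: 'prod', Project: 'infra' }));
```

A check like this does not replace the CLAUDE.md instructions. It simply catches generated code that drifts away from them.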

Practical Terraform Code Generation Patterns

Pattern 1: Generate Modules from Natural Language

Let's start with the most fundamental use case.

# Example prompt to Claude Code
claude "Following the policies in CLAUDE.md, generate a VPC module with these requirements:
- Multi-AZ (ap-northeast-1a/1c)
- 2 public and 2 private subnets each
- NAT Gateway only for prod (not needed in dev)
- VPC Flow Logs output to CloudWatch
Split into modules/vpc/main.tf, variables.tf, and outputs.tf"

Claude Code analyzes these requirements and produces code like this:

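# modules/vpc/main.tf (Claude Code generated example, excerpt)
# aws_internet_gateway.main, aws_iam_role.flow_log, and
# aws_cloudwatch_log_group.flow_log are referenced below but omitted from this excerpt.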
 
locals {
  common_tags = merge(var.tags, {
    Module = "vpc"
  })
}
 
resource "aws_vpc" "main" {
  cidr_block           = var.vpc_cidr
  enable_dns_hostnames = true
  enable_dns_support   = true
 
  tags = merge(local.common_tags, {
    Name = "${var.env}-vpc"
  })
}
 
# Public subnets (Multi-AZ)
resource "aws_subnet" "public" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 4, count.index)
  availability_zone = var.availability_zones[count.index]
 
  # Keep public IP auto-assignment off; attach public IPs explicitly where needed
  map_public_ip_on_launch = false
 
  tags = merge(local.common_tags, {
    Name = "${var.env}-public-${var.availability_zones[count.index]}"
    Tier = "public"
  })
}
 
# Private subnets (Multi-AZ)
resource "aws_subnet" "private" {
  count             = length(var.availability_zones)
  vpc_id            = aws_vpc.main.id
  cidr_block        = cidrsubnet(var.vpc_cidr, 4, count.index + length(var.availability_zones))
  availability_zone = var.availability_zones[count.index]
 
  tags = merge(local.common_tags, {
    Name = "${var.env}-private-${var.availability_zones[count.index]}"
    Tier = "private"
  })
}
 
# NAT Gateway (prod only)
resource "aws_eip" "nat" {
  count  = var.enable_nat_gateway ? length(var.availability_zones) : 0
  domain = "vpc"
 
  tags = merge(local.common_tags, {
    Name = "${var.env}-nat-eip-${count.index}"
  })
}
 
resource "aws_nat_gateway" "main" {
  count         = var.enable_nat_gateway ? length(var.availability_zones) : 0
  allocation_id = aws_eip.nat[count.index].id
  subnet_id     = aws_subnet.public[count.index].id
 
  tags = merge(local.common_tags, {
    Name = "${var.env}-nat-${count.index}"
  })
 
  depends_on = [aws_internet_gateway.main]
}
 
# VPC Flow Logs
resource "aws_flow_log" "main" {
  vpc_id          = aws_vpc.main.id
  traffic_type    = "ALL"
  iam_role_arn    = aws_iam_role.flow_log.arn
  log_destination = aws_cloudwatch_log_group.flow_log.arn
 
  tags = local.common_tags
}
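
To see which address ranges the two `cidrsubnet()` calls produce, the arithmetic can be reproduced outside Terraform. This is a sketch for IPv4 only. It assumes a VPC CIDR of 10.0.0.0/16 (the module leaves `var.vpc_cidr` to the caller) and a base address whose host bits are zero. The JavaScript `cidrsubnet` function is a hypothetical re-implementation, not Terraform's own code.

```javascript
// Minimal IPv4 re-implementation of Terraform's cidrsubnet(prefix, newbits, netnum).
function cidrsubnet(prefix, newbits, netnum) {
  const [addr, lenStr] = prefix.split('/');
  const newLen = Number(lenStr) + newbits;
  if (newLen > 32) throw new RangeError('resulting prefix is longer than /32');
  if (netnum >= 2 ** newbits) throw new RangeError('netnum does not fit in newbits');

  // Use plain arithmetic instead of bitwise operators to avoid 32-bit sign issues.
  const base = addr.split('.').reduce((acc, octet) => acc * 256 + Number(octet), 0);
  const subnet = base + netnum * 2 ** (32 - newLen);
  const octets = [24, 16, 8, 0].map(shift => Math.floor(subnet / 2 ** shift) % 256);
  return `${octets.join('.')}/${newLen}`;
}

// Same indexing as the module: public subnets first, private subnets offset by the AZ count.
function subnetLayout(vpcCidr, azs) {
  return {
    public: azs.map((az, i) => ({ az, cidr: cidrsubnet(vpcCidr, 4, i) })),
    private: azs.map((az, i) => ({ az, cidr: cidrsubnet(vpcCidr, 4, i + azs.length) })),
  };
}

console.log(subnetLayout('10.0.0.0/16', ['ap-northeast-1a', 'ap-northeast-1c']));
```

With two availability zones the public subnets take network numbers 0 and 1 and the private subnets take 2 and 3, so the ranges never overlap as long as the AZ list keeps its length.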

Pattern 2: Refactoring Existing Infrastructure

When your existing Terraform code has issues, you can ask Claude Code to diagnose and refactor it:

claude "Read modules/ec2/main.tf and refactor it with these objectives:
1. Replace hardcoded AMI IDs with data sources
2. Extract security group rules (from inline to aws_security_group_rule)
3. Fix any violations of the tag policy in CLAUDE.md
4. Add inline comments explaining the changes"

Automated Security Scanning with Hooks

Using Claude Code's PostToolUse hooks, you can build a system that automatically runs a security scan every time Claude Code writes or edits a Terraform file. Hooks fire on Claude Code's own tool calls, not on saves made in your editor.

// .claude/settings.json
{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [
          {
            "type": "command",
            "command": "node .claude/hooks/terraform-security-scan.js"
          }
        ]
      }
    ]
  }
}

// .claude/hooks/terraform-security-scan.js
const { execSync } = require('child_process');
const fs = require('fs');
const path = require('path');
 
// Receive hook event from stdin
let inputData = '';
process.stdin.on('data', chunk => { inputData += chunk; });
 
process.stdin.on('end', () => {
  const event = JSON.parse(inputData);
  const filePath = event.tool_input?.file_path || event.tool_input?.path || '';
 
  // Only process Terraform files
  if (!filePath.endsWith('.tf') && !filePath.endsWith('.tfvars')) {
    process.exit(0);
  }
 
  const dir = path.dirname(filePath);
 
  console.error(`🔍 Terraform security scan: ${filePath}`);
 
  let output;
  try {
    // Scan with tfsec; stderr is captured through the stdio option
    output = execSync(
      `tfsec "${dir}" --format json --no-color`,
      { encoding: 'utf8', stdio: ['ignore', 'pipe', 'pipe'] }
    );
  } catch (err) {
    // 127 is the shell's "command not found" status: skip gracefully
    if (err.status === 127) {
      console.error('⚠️  tfsec not installed. See the tfsec README for install options (e.g. brew install tfsec)');
      process.exit(0);
    }
    // tfsec exits non-zero when it finds problems; the JSON report is still on stdout
    output = err.stdout || '';
  }

  let report;
  try {
    report = JSON.parse(output);
  } catch {
    console.error('⚠️  Could not parse tfsec output; skipping scan');
    process.exit(0);
  }

  const criticalIssues = (report.results || []).filter(r =>
    r.severity === 'CRITICAL' || r.severity === 'HIGH'
  );

  if (criticalIssues.length > 0) {
    const issues = criticalIssues.map(issue =>
      `[${issue.severity}] ${issue.rule_id}: ${issue.description} (${issue.location?.filename}:${issue.location?.start_line})`
    ).join('\n');

    console.error(`⚠️  Security issues detected:\n${issues}`);

    // Feed the findings back so Claude is prompted to fix them
    process.stdout.write(JSON.stringify({
      decision: 'block',
      reason: `Terraform security issues detected. Fixes required:\n${issues}`
    }));
    process.exit(0);
  }

  console.error('✅ Security scan: No issues found');
 
  process.exit(0);
});

With this hook in place, whenever Claude Code writes or edits a Terraform file that produces a HIGH or CRITICAL tfsec finding, the findings are fed back to Claude Code, which is then prompted to propose a fix.
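
Hooks are easier to maintain when the decision logic is a pure function that can be exercised without stdin. The sketch below isolates the "is this a Terraform edit?" check. `terraformTarget` is a hypothetical helper, and the sample payload is an assumption apart from the `tool_input.file_path` and `tool_input.path` fields that the script above reads.

```javascript
const path = require('node:path');

// Hypothetical helper: returns the file and directory to scan,
// or null when the event did not touch a Terraform file.
function terraformTarget(event) {
  const filePath = event?.tool_input?.file_path || event?.tool_input?.path || '';
  const isTerraform = filePath.endsWith('.tf') || filePath.endsWith('.tfvars');
  return isTerraform ? { filePath, dir: path.dirname(filePath) } : null;
}

// Sample event for local testing (shape assumed for illustration).
const sample = {
  tool_name: 'Write',
  tool_input: { file_path: 'modules/vpc/main.tf', content: '...' },
};

console.log(terraformTarget(sample));
console.log(terraformTarget({ tool_input: { file_path: 'README.md' } }));
```

Keeping this check separate means a malformed or unexpected event yields `null` instead of an exception inside the hook.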

Combining with Checkov

Adding Checkov alongside tfsec enables more comprehensive IaC scanning:

# Checkov scan (results passed to Claude Code)
claude "Analyze the following Checkov scan output and fix the failed checks it reports, most serious first:
$(checkov -d . --framework terraform --compact --output json 2>/dev/null)"
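
Raw scanner JSON can be long, so it may help to reduce it to a short list before putting it in a prompt. The sketch below assumes the report has a `results.failed_checks` array whose entries carry `check_id`, `check_name`, `resource`, `file_path`, and `file_line_range`. Verify those field names against your Checkov version. `summarizeFailedChecks` and the sample check ID are hypothetical.

```javascript
// Assumed report shape (verify against your Checkov version):
// { results: { failed_checks: [{ check_id, check_name, resource, file_path, file_line_range }] } }
function summarizeFailedChecks(report) {
  // Assumption: several frameworks produce an array of reports, so normalize first.
  const reports = Array.isArray(report) ? report : [report];
  return reports
    .flatMap(r => r?.results?.failed_checks ?? [])
    .map(check => ({
      id: check.check_id,
      name: check.check_name,
      resource: check.resource,
      location: `${check.file_path}:${(check.file_line_range ?? []).join('-')}`,
    }));
}

// Made-up sample data; the check ID is a placeholder, not a real Checkov rule.
const sampleReport = {
  results: {
    failed_checks: [
      {
        check_id: 'CKV_EXAMPLE_1',
        check_name: 'Example: security group allows open ingress',
        resource: 'aws_security_group.api',
        file_path: '/main.tf',
        file_line_range: [10, 20],
      },
    ],
  },
};

console.log(summarizeFailedChecks(sampleReport));
```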

AI-Powered terraform plan Analysis

The output of terraform plan can be overwhelming to parse manually. Claude Code can rapidly analyze these diffs and surface risks clearly.

# Run terraform plan and have Claude Code analyze it
terraform plan -out=tfplan.binary
terraform show -json tfplan.binary > tfplan.json
 
claude "Read tfplan.json and analyze it from these angles:
1. Risk assessment for destructive changes (destroy/replace)
2. Security impact of security group changes
3. Downtime risk from database changes
4. Cost delta estimate (added/removed resources)
5. A checklist of items to verify before applying to production
Format the output as a Markdown report"
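
The destructive-change part of that analysis can be cross-checked deterministically, because the JSON plan lists every resource under `resource_changes` with a `change.actions` array. A minimal sketch follows. `riskyChanges` is a hypothetical helper and the sample plan is invented for illustration.

```javascript
// A replace appears as ["delete", "create"] or ["create", "delete"];
// a plain destroy appears as ["delete"].
function riskyChanges(plan) {
  return (plan.resource_changes ?? [])
    .map(rc => {
      const actions = rc.change?.actions ?? [];
      const deletes = actions.includes('delete');
      const creates = actions.includes('create');
      const kind = deletes && creates ? 'replace' : deletes ? 'destroy' : null;
      return kind ? { address: rc.address, type: rc.type, kind } : null;
    })
    .filter(Boolean);
}

// Invented sample mirroring the structure of `terraform show -json` output.
const samplePlan = {
  resource_changes: [
    { address: 'aws_db_instance.main', type: 'aws_db_instance', change: { actions: ['delete', 'create'] } },
    { address: 'aws_instance.bastion', type: 'aws_instance', change: { actions: ['delete'] } },
    { address: 'aws_nat_gateway.main[0]', type: 'aws_nat_gateway', change: { actions: ['create'] } },
    { address: 'aws_security_group_rule.api_ingress', type: 'aws_security_group_rule', change: { actions: ['update'] } },
  ],
};

console.log(riskyChanges(samplePlan));
```

Comparing this list with the report Claude Code writes is a cheap way to confirm that no destroy or replace was overlooked.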

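Claude Code generates reports like the following. The resource names and figures here are illustrative, and cost numbers in particular should be checked against current AWS pricing for your region: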

## terraform plan Analysis Report
 
### 🔴 High-Risk Changes (Attention Required)
- `aws_db_instance.main`: **UPDATE (reboot required)**. The new parameter_group_name
  is applied in place, but the instance must reboot for it to take effect. Expected downtime: ~2–5 minutes
 
### 🟡 Security Changes
- `aws_security_group_rule.api_ingress`: Port 443 ingress source changed
  from `10.0.0.0/8` → `10.1.0.0/16` (narrower scope = improved security)
 
### 💰 Cost Estimate (Approximate)
- Added: `aws_nat_gateway` × 2 → +~$64/month
- Removed: `aws_instance.bastion` → -~$8/month
- Net change: +~$56/month
 
### ✅ Pre-Apply Checklist
- [ ] RDS snapshot taken
- [ ] Apply during maintenance window
- [ ] Rollback procedure confirmed

Automating Cost Optimization

Here's a workflow for automatically generating cloud cost optimization recommendations using Claude Code.

# Export AWS Cost Explorer data
aws ce get-cost-and-usage \
  --time-period Start=2026-03-01,End=2026-04-01 \
  --granularity MONTHLY \
  --metrics BlendedCost \
  --group-by Type=DIMENSION,Key=SERVICE \
  --output json > cost_report.json
 
# Analyze with Claude Code
claude "Cross-reference cost_report.json with the current Terraform code in modules/ and environments/ and provide:
1. Optimization recommendations for the top 5 most expensive services
2. Identification of low-utilization resources (inferred from tag data)
3. Reserved Instance / Savings Plan candidates
4. Estimated savings from nightly shutdown of dev/staging environments
Propose Terraform changes ready for a PR"
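
Before handing cost_report.json to Claude Code, it can be useful to see which services dominate the bill. The sketch below reads the `ResultsByTime[].Groups[]` structure returned by `get-cost-and-usage` when grouped by SERVICE. `topServices` is a hypothetical helper and the amounts are made up.

```javascript
// Sums BlendedCost per service across all returned periods and keeps the top n.
function topServices(report, n = 5) {
  const totals = new Map();
  for (const period of report.ResultsByTime ?? []) {
    for (const group of period.Groups ?? []) {
      const service = group.Keys[0];
      // Cost Explorer returns amounts as strings.
      const amount = Number(group.Metrics.BlendedCost.Amount);
      totals.set(service, (totals.get(service) ?? 0) + amount);
    }
  }
  return [...totals.entries()]
    .sort((a, b) => b[1] - a[1])
    .slice(0, n)
    .map(([service, cost]) => ({ service, cost: Number(cost.toFixed(2)) }));
}

// Made-up sample in the same shape as cost_report.json.
const sampleCosts = {
  ResultsByTime: [
    {
      Groups: [
        { Keys: ['Amazon Elastic Compute Cloud - Compute'], Metrics: { BlendedCost: { Amount: '120.50', Unit: 'USD' } } },
        { Keys: ['Amazon Relational Database Service'], Metrics: { BlendedCost: { Amount: '310.25', Unit: 'USD' } } },
        { Keys: ['Amazon Simple Storage Service'], Metrics: { BlendedCost: { Amount: '12.10', Unit: 'USD' } } },
      ],
    },
  ],
};

console.log(topServices(sampleCosts, 2));
```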

Automatic Spot Instance Migration Proposals

# Spot Instance optimization example (Claude Code proposal)
# modules/ec2/main.tf (optimized)
 
resource "aws_launch_template" "app" {
  name_prefix   = "${var.env}-app-"
  image_id      = data.aws_ami.amazon_linux.id
  instance_type = var.instance_type
 
  # Spot instance configuration (dev/staging only)
  dynamic "instance_market_options" {
    for_each = var.use_spot ? [1] : []
    content {
      market_type = "spot"
      spot_options {
        # Settings for interruption-tolerant workloads
        instance_interruption_behavior = "terminate"
        max_price                      = var.spot_max_price
      }
    }
  }
 
  lifecycle {
    create_before_destroy = true
  }
}

Drift Detection and Automated Remediation

In production environments, infrastructure drift from manual changes is a serious risk. Here's a Claude Code-powered workflow for detecting and resolving drift.

# Drift detection script
mkdir -p .claude/scripts
cat > .claude/scripts/detect-drift.sh << 'EOF'
#!/bin/bash
echo "🔍 Starting Terraform drift detection..."

# terraform plan refreshes state in memory before diffing, so the deprecated
# `terraform refresh` command (which rewrites state) is not needed here.
terraform plan -detailed-exitcode -out=drift.tfplan 2>&1
EXIT_CODE=$?

if [ "$EXIT_CODE" -eq 2 ]; then
  echo "⚠️  Drift detected"
  terraform show -json drift.tfplan > drift.json

  claude "Analyze drift.json and:
1. Identify resources likely changed manually
2. Infer the intent of each change (emergency fix vs. mistake)
3. Recommend how to reconcile (terraform import vs. code update)
4. Propose AWS Config rules to prevent recurrence"
elif [ "$EXIT_CODE" -eq 0 ]; then
  echo "✅ No drift: infrastructure matches the Terraform configuration"
else
  # Exit code 1 means the plan itself failed
  echo "❌ terraform plan failed (exit code $EXIT_CODE)"
fi

rm -f drift.tfplan drift.json
EOF
chmod +x .claude/scripts/detect-drift.sh
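
`-detailed-exitcode` distinguishes three outcomes, and the JSON plan also carries a `resource_drift` array describing changes made outside Terraform. The sketch below shows both interpretations in one place. The helper names are hypothetical and the sample data is invented.

```javascript
// Exit codes of `terraform plan -detailed-exitcode`.
function interpretPlanExit(code) {
  switch (code) {
    case 0: return 'clean';   // succeeded, no diff
    case 1: return 'error';   // the plan itself failed
    case 2: return 'changes'; // succeeded, diff present
    default: return 'unknown';
  }
}

// Lists resources whose real-world state changed outside Terraform.
function driftedAddresses(plan) {
  return (plan.resource_drift ?? [])
    .filter(rd => !(rd.change?.actions ?? []).every(action => action === 'no-op'))
    .map(rd => rd.address);
}

// Invented sample shaped like an excerpt of drift.json.
const sampleDrift = {
  resource_drift: [
    { address: 'aws_security_group.api', change: { actions: ['update'] } },
    { address: 'aws_s3_bucket.logs', change: { actions: ['no-op'] } },
  ],
};

console.log(interpretPlanExit(2), driftedAddresses(sampleDrift));
```

Treating exit code 1 as its own case matters in scheduled jobs, where a failed plan should not be reported as "no drift".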

GitHub Actions Integration

# .github/workflows/terraform-drift.yml
name: Terraform Drift Detection
 
on:
  schedule:
    - cron: '0 0 * * 1-5'  # Weekdays at 9:00 JST (UTC 00:00)
  workflow_dispatch:
 
jobs:
  drift-check:
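    runs-on: ubuntu-latest
    permissions:
      id-token: write   # required for OIDC role assumption
      contents: read    # required by actions/checkout
      issues: write     # required to open the drift issue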
    steps:
      - uses: actions/checkout@v4
 
      - name: Setup Terraform
        uses: hashicorp/setup-terraform@v3
        with:
          terraform_version: 1.7.x
          # Disable the wrapper script so the step sees terraform's real exit code
          terraform_wrapper: false
 
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: ${{ secrets.AWS_DRIFT_CHECK_ROLE }}
          aws-region: ap-northeast-1
 
      - name: Terraform Init
        run: terraform init
        working-directory: environments/prod
 
      - name: Detect Drift
        id: drift
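        run: |
          # Actions runs bash with -e; disable it so the exit code can be captured
          set +e
          terraform plan -detailed-exitcode -out=drift.tfplan
          code=$?
          echo "exit_code=$code" >> "$GITHUB_OUTPUT"
          # 0 = no changes, 2 = drift; anything else is a real failure
          if [ "$code" -ne 0 ] && [ "$code" -ne 2 ]; then exit "$code"; fi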
        working-directory: environments/prod
 
      - name: Notify if Drift Detected
        if: steps.drift.outputs.exit_code == '2'
        uses: actions/github-script@v7
        with:
          script: |
            await github.rest.issues.create({
              owner: context.repo.owner,
              repo: context.repo.repo,
              title: `[Alert] Terraform Drift Detected - ${new Date().toISOString().split('T')[0]}`,
              body: 'Terraform drift detected in production. Investigation and remediation required.',
              labels: ['infrastructure', 'urgent']
            });

Parallel Infrastructure Generation with Sub-Agents

For large-scale infrastructure changes, Claude Code's sub-agent feature lets independent components be generated in parallel, which can shorten generation time considerably.

claude "Use separate sub-agents to generate the following infrastructure components in parallel:
[Agent 1] modules/networking/:
  - VPC, subnets, route tables, IGW, NAT Gateway
[Agent 2] modules/security/:
  - WAF, Security Hub, GuardDuty config, Security Group templates
[Agent 3] modules/compute/:
  - ECS Fargate cluster, ALB, Auto Scaling
[Agent 4] modules/data/:
  - RDS Aurora PostgreSQL, ElastiCache Redis, S3 bucket suite
 
Once all agents complete, run a consistency check (VPC ID references, cross-module security group references)"

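Parallel sub-agent execution can compress infrastructure design work that would normally take 2–3 hours into roughly 20–30 minutes, though the gain depends on how independent the modules are and how much review the output needs.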

Common Errors and How to Handle Them

Error 1: Provider Version Mismatch

Error: Failed to query available provider packages

Ask Claude Code:

claude "Fix this error:
$(terraform init 2>&1)
 
Review the provider constraints in versions.tf and unify provider versions across all modules"
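
Version mismatches usually come down to constraints that cannot be satisfied together. The sketch below models only the `~>` (pessimistic) operator used in CLAUDE.md. It ignores pre-release versions and other operators, and the helper names are hypothetical.

```javascript
function parseVersion(version) {
  return version.split('.').map(Number);
}

// Compares two versions segment by segment; missing segments count as 0.
function compareVersions(a, b) {
  for (let i = 0; i < Math.max(a.length, b.length); i++) {
    const diff = (a[i] ?? 0) - (b[i] ?? 0);
    if (diff !== 0) return Math.sign(diff);
  }
  return 0;
}

// "~> 5.0" allows >= 5.0 and < 6.0; "~> 4.67.0" allows >= 4.67.0 and < 4.68.0.
function satisfiesPessimistic(constraint, version) {
  const lower = parseVersion(constraint);
  if (lower.length < 2) throw new Error('this sketch expects at least MAJOR.MINOR');
  const upper = lower.slice(0, -1);   // drop the last segment...
  upper[upper.length - 1] += 1;       // ...and bump the one before it
  const v = parseVersion(version);
  return compareVersions(v, lower) >= 0 && compareVersions(v, upper) < 0;
}

// True only if some candidate version satisfies every module's constraint.
function hasCommonVersion(constraints, candidates) {
  return candidates.some(v => constraints.every(c => satisfiesPessimistic(c, v)));
}

console.log(satisfiesPessimistic('5.0', '5.31.0'));
console.log(hasCommonVersion(['5.0', '4.67.0'], ['4.67.0', '5.31.0']));
```

When one module pins `~> 4.67.0` and another asks for `~> 5.0`, no release satisfies both. That is the situation `terraform init` reports and the prompt above asks Claude Code to untangle.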

Error 2: State Lock Contention

Error: Error acquiring the state lock
Error message: ConditionalCheckFailedException: The conditional request failed

claude "A Terraform state lock is stuck.
Walk me through:
1. How to check for currently running terraform processes
2. How to inspect the DynamoDB lock record
3. Safe steps to force-unlock
Step by step please"
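
When inspecting the lock record, the useful facts are who holds the lock and how long it has been held. The sketch below assumes the lock item stores a JSON string with `ID`, `Who`, `Operation`, and `Created` fields. Check this against the actual record in your table. `describeLock` and the sample values are hypothetical.

```javascript
// Assumption: infoJson is the JSON string stored with the lock record.
function describeLock(infoJson, now = new Date()) {
  const info = JSON.parse(infoJson);
  const ageMinutes = Math.floor((now - new Date(info.Created)) / 60000);
  return { id: info.ID, who: info.Who, operation: info.Operation, ageMinutes };
}

// Invented sample record for illustration.
const sampleInfo = JSON.stringify({
  ID: '00000000-0000-0000-0000-000000000000',
  Operation: 'OperationTypeApply',
  Who: 'ci@runner',
  Created: '2026-04-01T00:00:00Z',
});

console.log(describeLock(sampleInfo, new Date('2026-04-01T01:30:00Z')));
```

A lock that is only a few minutes old most likely belongs to a run still in progress, so force-unlocking is better reserved for locks whose owner is confirmed to be gone.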

Error 3: Circular References

Error: Cycle: module.a.resource, module.b.resource

claude "Resolve this Terraform circular reference:
$(terraform graph 2>&1 | head -50)
Analyze the dependency graph, explain which module should sit higher in the hierarchy, and apply the fix"
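
A cycle error means that following dependencies from some resource eventually leads back to it. The sketch below finds such a loop with a depth-first search over a plain adjacency map. The graph shape and the `findCycle` helper are illustrative assumptions, not Terraform's internal representation.

```javascript
// graph: { node: [nodes it depends on] }. Returns one cycle as a path, or null.
function findCycle(graph) {
  const state = new Map(); // node -> 'visiting' | 'done'
  const stack = [];

  function visit(node) {
    if (state.get(node) === 'done') return null;
    if (state.get(node) === 'visiting') {
      // Reached a node already on the current path: that segment is the cycle.
      return [...stack.slice(stack.indexOf(node)), node];
    }
    state.set(node, 'visiting');
    stack.push(node);
    for (const dep of graph[node] ?? []) {
      const cycle = visit(dep);
      if (cycle) return cycle;
    }
    stack.pop();
    state.set(node, 'done');
    return null;
  }

  for (const node of Object.keys(graph)) {
    const cycle = visit(node);
    if (cycle) return cycle;
  }
  return null;
}

// Two security groups that reference each other form the classic cycle.
const cyclic = {
  'module.a.aws_security_group.app': ['module.b.aws_security_group.db'],
  'module.b.aws_security_group.db': ['module.a.aws_security_group.app'],
};

console.log(findCycle(cyclic));
```

The usual fix is to break one edge of the loop, for example by moving the mutual rules into standalone rule resources that depend on both groups.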

Beyond Terraform — When to Reach for the deploy-on-aws Plugin and CDK

This guide has centered on Terraform, but if your target is AWS only, there's another path worth knowing: the Claude Code deploy-on-aws plugin. It's an official AWS plugin built on top of AWS CDK (Cloud Development Kit) that turns natural language instructions into CDK stacks and handles the deployment for you.

The workflow mirrors what we did with Terraform. Tell Claude Code "connect a Node.js 20 Lambda to an API Gateway," and it generates the handler plus a CDK stack, runs a cdk diff-style preview, and then deploys. IAM roles are created with least privilege, and a failed deploy rolls back to the previous CloudFormation state.

# Install the plugin and check its status
claude plugins install deploy-on-aws
claude deploy-on-aws status
# → shows AWS Account / Region / CDK Bootstrap status

Because the generated stack is TypeScript, autocompletion and type checking work out of the box.

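// Example CDK stack generated by deploy-on-aws (excerpt)
// Imports at the top of the stack file:
import * as lambda from "aws-cdk-lib/aws-lambda";
import * as apigatewayv2 from "aws-cdk-lib/aws-apigatewayv2";
import { HttpLambdaIntegration } from "aws-cdk-lib/aws-apigatewayv2-integrations";
import { NodejsFunction } from "aws-cdk-lib/aws-lambda-nodejs";

// Inside the Stack constructor: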
const helloFn = new NodejsFunction(this, "HelloFunction", {
  entry: "lambda/handler.ts",
  runtime: lambda.Runtime.NODEJS_20_X,
  handler: "handler",
});
 
const httpApi = new apigatewayv2.HttpApi(this, "HelloApi", { apiName: "hello-api" });
httpApi.addRoutes({
  path: "/hello",
  methods: [apigatewayv2.HttpMethod.GET],
  integration: new HttpLambdaIntegration("HelloIntegration", helloFn),
});

So how do you choose between Terraform and the CDK-based deploy-on-aws? Here's the decision axis.

Dimension	Terraform	deploy-on-aws (CDK)
Cloud coverage	Multi-cloud	AWS only
Language	HCL (declarative)	TypeScript and others (type-safe, autocomplete)
State management	Explicit tfstate (easy to lock and audit)	Managed internally by CloudFormation (less hands-on, less control)
Best fit	Multiple environments, large scale, strict state ops	Small-to-mid AWS-only builds, fast prototyping
Working with Claude Code	Write conventions in CLAUDE.md to raise generation quality	Natural language → CDK generation → diff → deploy in one flow

In my own work, I use Terraform for multi-cloud setups and for any foundation where I want a firm grip on state, and I hand small AWS-only APIs and prototypes to deploy-on-aws. When I'm working on the blog infrastructure and app backends for Dolice Labs as an indie developer, keeping that line drawn means fewer things to hold in my head and fewer moments where I stall. You don't have to commit to just one. Pick the tool that fits the target and the team.

Summary

This guide walked through practical patterns for integrating Claude Code across all phases of Terraform development.

Here are the key takeaways.

CLAUDE.md design is the foundation of everything. Pre-defining security policies, naming conventions, and cost guidelines dramatically improves the quality of generated code.

Automated security scanning via hooks complements human review. Integration with tfsec and Checkov substantially reduces the risk of vulnerable configurations entering your codebase. To go deeper on Claude Code's hooks capabilities, Claude Code Hooks Automation Techniques — Practical Guide to Automating Your Development Workflow is a valuable companion read.

AI-powered terraform plan analysis automates risk assessment and cost estimation, preventing missed validations before production applies.

Drift detection × GitHub Actions continuously maintains production reliability. For large-scale infrastructure changes where parallel sub-agents are needed, Claude Code Parallel Agent Development Mastery Guide — 10x Faster Development with Worktrees, Sub-Agents, and Task Management provides the implementation blueprint.