blog.tryterracotta.com
Why is Terraform still hard in 2025? - Terracotta AI
## Terraform in practice

We’ve worked with and spoken to platform teams at fast-moving startups, growing mid-size orgs, and multi-cloud enterprises. They all use Terraform and have built internal tooling, standards, and CI pipelines to manage it. And almost all of them still have the same problems: drift, copy-paste modules, fragile reviews, broken dependencies, unclear ownership, and too much tribal knowledge.

…

## Problem 1: Terraform is deceptively simple

“I just need to provision an S3 bucket and an IAM role. Shouldn’t take more than 10 minutes.” – Famous last words from a junior engineer

Terraform starts easy. But what you see is only the top of the HCL iceberg. You write a few resources. Then you need a backend. Then providers. Then variables. Then modules. Then environments. Then workspaces. Then you realize the code is the easy part, and the real complexity is in understanding *what this code will actually do in the real world.*

…

## Problem 2: Everyone’s Terraform is different!

The beauty of Terraform is its flexibility. The curse of Terraform is also its flexibility. Every company has its own flavors:

- Custom modules with their own naming patterns, output conventions, and hidden assumptions
- Some teams use `terraform apply`; others have 8-step CI workflows
- Some apply per environment, some per resource, some per team
- Half the team uses Terragrunt (wrong), the other half doesn’t (also wrong)

…

## Problem 3: Drift is inevitable

The moment your infrastructure touches the real world, it starts to drift.

→ An engineer fixes something manually in the AWS console.
→ Someone disables a policy to unblock a deploy.
→ An external system updates a tag.
→ A resource gets destroyed and recreated with new defaults.

You run `terraform plan` and it looks fine, because Terraform only knows what *you* told it. It doesn’t know why an out-of-band change was made, and it can’t see settings your config doesn’t track. At best, it shows the drift as one more diff to revert.
It happily wipes out your fix and calls it “safe.”

…

## Problem 4: The plan looks fine. Until it isn’t.

Terraform plans are supposed to bring safety and predictability. In theory, you see what will change before it changes.

In practice: You scroll through hundreds of lines, looking for anything suspicious, hoping nothing sneaky gets through.

The plan tells you *what* will change. It rarely tells you *why.* It doesn’t tell you what modules depend on this. It doesn’t tell you which services this IAM policy might break. It doesn’t tell you this change will wipe out and recreate a stateful database.

…

Helpful.

You trace through nested modules, try to remember if that variable is a string or a map, and mentally reconstruct the dependency tree. You check the state. You re-run the plan. You eventually just `console.log` your way through `terraform console`.

Terraform doesn't make debugging easy. It makes you guess.

## Problem 6: Infra review doesn’t scale

In theory, Terraform should enable fast, safe infra changes. In practice, infra PRs sit for days because:

- No one has enough context to approve them confidently
- Everyone is afraid of breaking something
- The person who owns the module is on PTO
- The plan is massive, and no one wants to read it

The result? You either block product teams or approve blindly. Neither is good. Both are common.

## Some things we haven’t even mentioned yet

We haven’t talked about:

- Plan noise from computed diffs
- Implicit vs. explicit dependencies
- Hidden resource recreation
- Conditional logic that breaks silently
- The tension between DRY and readable code
- Running `terraform destroy` in the wrong workspace 😬

## A better way?

If you’ve read this far, you’re probably not surprised by any of this. You’ve lived it.

Terraform gives you control, but it doesn’t provide you with confidence. Not when drift is invisible. Not when context lives in someone’s head. Not when reviewing a plan feels like decoding a black box.

Infra isn’t hard because we’re doing it wrong. It’s hard because the workflows weren’t designed for how teams actually build and ship today: fast-moving, multi-owner, highly parallel, deeply interconnected.
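One part of the plan-review problem from Problem 4, destructive actions hiding in hundreds of lines of plan output, can be partly mechanized. The following is a minimal sketch, not a complete review tool. It assumes you have exported a plan to JSON and relies on the `resource_changes` and `resource_drift` arrays from Terraform's JSON plan representation. The function names and the sample plan, including every resource address in it, are hypothetical and written by hand for illustration.

```python
import json
from typing import Any


def destructive_changes(plan: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (address, kind) for each resource the plan would delete or replace."""
    findings = []
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if "delete" not in actions:
            continue
        # A replacement appears as both "delete" and "create" (in either order).
        kind = "replace" if "create" in actions else "delete"
        findings.append((rc["address"], kind))
    return findings


def drifted_addresses(plan: dict[str, Any]) -> list[str]:
    """Return addresses Terraform reported as changed outside of Terraform."""
    return [rd["address"] for rd in plan.get("resource_drift", [])]


# Hypothetical, hand-written sample; a real plan JSON has many more fields.
SAMPLE_PLAN = json.loads("""
{
  "resource_changes": [
    {"address": "aws_s3_bucket.logs",    "change": {"actions": ["update"]}},
    {"address": "aws_db_instance.main",  "change": {"actions": ["delete", "create"]}},
    {"address": "aws_iam_role.legacy",   "change": {"actions": ["delete"]}},
    {"address": "aws_iam_policy.deploy", "change": {"actions": ["no-op"]}}
  ],
  "resource_drift": [
    {"address": "aws_s3_bucket.logs", "change": {"actions": ["update"]}}
  ]
}
""")

if __name__ == "__main__":
    for address, kind in destructive_changes(SAMPLE_PLAN):
        print(f"DESTRUCTIVE ({kind}): {address}")
    for address in drifted_addresses(SAMPLE_PLAN):
        print(f"DRIFT: {address}")
```

To try this against a real plan, one option is `terraform plan -out=tfplan` followed by `terraform show -json tfplan > plan.json`, then loading that file with `json.load` in place of the sample. A check like this only surfaces *what* is destructive; it still can't tell a reviewer *why* the change is happening or who depends on the resource.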
Related Pain Points (4)
Unsafe plan review and hidden destructive changes in large changesets
(8) Terraform plans with hundreds or thousands of changes are difficult for humans to review reliably. Destructive actions (resource deletion/recreation) hide in the noise of benign changes, making it easy to miss critical issues during code review.
Configuration drift detection and management
(6) Infrastructure managed by CloudFormation can drift when modified through AWS Console, SDK, or CLI. Without proper tools, detecting and reconciling these changes is manual and error-prone.
Every organization develops different Terraform conventions and custom tooling
(6) Terraform's flexibility allows each team to develop unique patterns: custom modules with different naming conventions, varying CI workflows (from simple apply to 8-step pipelines), conflicting use of Terragrunt, and different apply strategies. This fragmentation creates tribal knowledge and complicates onboarding.
Terraform feels deceptively simple but hides deep complexity in real-world usage
(5) Initial Terraform tasks (provisioning a bucket) appear simple, but complexity emerges across backends, providers, variables, modules, environments, workspaces, and dependency management. Understanding what code actually does in production requires deep system knowledge.