Sources
1577 sources collected
Even if you're not a Terraform fan, there's a good chance you'll have to support it, especially on projects with a history. Terraform starts to “creak” when the infrastructure becomes really large and multilevel: **There is no built-in support for multi-region and multi-environment setups**. **Managing dependencies between parts of the infrastructure is inconvenient**. **State files without an external backend (S3, GCS, etc.) are a pain**. **It is difficult to share resources between teams without Enterprise solutions**. … - They support a **limited number of cases**, and are often incompatible with existing configs. - They require a separate CLI utility for signing manifests. Opinions in the industry are divided: some find them useful, while others consider them a “crude solution with a good idea.” ... … Positive: - You can use familiar data structures, loops, and conditions. - They are better suited for developers than DevOps engineers. Negative: - Steep learning curve for ops teams. - Difficulties with typing (especially in CDK). - Not all clouds and resources are supported on par with Terraform. Many teams, having tried the CDK, then return to Terraform due to the inability to easily delegate maintenance to DevOps engineers.
State of Terraform at Scale 2025: Insights from Practitioners

This report compiles real-world insights from 20+ DevOps practitioners and Terraform users across diverse industries. As Terraform adoption scales, so do the challenges, ranging from environment inconsistencies and state management nightmares to collaboration hurdles and developer experience gaps. The document unpacks these recurring pain points, critiques current tooling, and shares practitioner workarounds, revealing what breaks at scale and what leaders wish they’d known earlier.

Key Takeaways …

● Environment Management is brittle: Teams struggle to keep dev/stage/prod consistent without duplicating code or fighting tool limitations (Terragrunt, Workspaces).
● State Management gets messy fast: Monolith vs microstates? Neither is perfect. Cross-repo dependencies and locking issues only get worse at scale.
● Reusable modules are still hard: Teams crave typed, API-like modules, but HCL’s flexibility is a double-edged sword for large teams.
● Collaboration is painful: large teams using Terraform lack proper workflow visibility and often serialize deployments to avoid chaos.
● Validation remains weak: "Best effort" testing leads to accidental infra drifts. Scripts and hacks are common.
● Developer experience matters: Product teams want abstractions and self-service. Many adopt CDKTF or build internal platforms to bridge the gap.

Acknowledgment & Thank You …

### Introduction

Terraform is a widely adopted Infrastructure as Code (IaC) tool, praised for its declarative approach and extensive provider ecosystem. However, as organizations scale their infrastructure and teams, practitioners often encounter significant challenges that can complicate management and slow down development. Conversations with experienced Terraform users reveal common pain points and highlight areas where existing solutions fall short or where new approaches are desired. … that it "just shifts the goalpost a little bit. It doesn't solve the actual underlying problem".
● Another approach involved using Terraform Workspaces for different environments, but this was found to be "really weirdly" handled by Terraform, at least in one user's experience.
● Many companies ultimately resort to having totally different Terraform templates for … is perceived as a "slowly changing beast," and avoid it for faster-moving components. "I will never put, you know, any fast moving stuff in Terraform because it's difficult to manage," confessed one user.
● Users desire a solution that provides a summary view of all repositories, pipelines, and statuses, with the ability to selectively run them.
… ### Validation and Testing Deficiencies

Ensuring that deployed infrastructure matches the intended configuration and functions correctly is challenging, and practitioners feel Terraform's built-in validation and testing tools are not fully mature. … "Whether it is functioning as intended is a totally different question," one practitioner stressed, despite acknowledging that Terraform's own testing features help validate the plan's intent.
● Some teams have had to resort to embedding validation scripts or commands within their Terraform code, often using local provisioners or depends_on to ensure they run … because "when people generate a plan, it would start saying that no one has to apply because there is a try block there," making the plan harder to validate. This happens because try is evaluated at the apply phase if values are not statically assigned.
● More built-in support for robust testing and validation within Terraform is desired. Terraform is trying to address this with tests, but it's "still not really mature".
### 1. State Management: The Double-Edged Sword The Terraform state file (`terraform.tfstate`) is the heart of Terraform. It's a JSON file that maps your code to real-world resources. This is how Terraform knows what it's managing. But it's also its biggest source of pain. - **The Problem:** The state file is a single source of truth that can become a single point of failure. If it gets corrupted, lost, or out of sync, Terraform loses its "memory," leading to chaos. - **The Impact:** Manually editing the state file is terrifying and error-prone. Concurrency issues arise when multiple people run `terraform apply` at the same time, leading to state corruption. And by default, state is stored locally, which is a non-starter for teams. … ### 2. Refactoring is Painful and Risky As your infrastructure evolves, your code needs to evolve with it. You'll want to rename resources for clarity, move them into modules, or reorganize your file structure. In a normal programming language, this is a simple refactor. In Terraform, it's a destructive operation. - **The Problem:** If you rename a resource in your `.tf` file (e.g., from `aws_instance.web` to `aws_instance.web_server`), Terraform sees one resource to be destroyed and one new resource to be created. - **The Impact:** This can cause catastrophic downtime and data loss for stateful resources like databases or storage buckets. … - **The Problem:** While major cloud providers are excellent, smaller or community-led providers can be buggy, lack features, or lag behind API updates. You're entirely dependent on the provider's implementation. - **The Impact:** You might find a bug where `terraform plan` shows no changes, but `apply` fails. Or a new cloud service is released, and you have to wait months for the provider to support it. … - **The Problem:** Terraform has no native, end-to-end secret management solution. The state file itself can contain sensitive values in plain text after an `apply`. 
- **The Impact:** Accidentally committing a `.tfvars` file with secrets or having an exposed state file can lead to a severe security breach. … - **The Problem:** Terraform needs to refresh the state of every resource in your configuration by making API calls to your cloud provider. For large setups, this can take many minutes. - **The Impact:** Long feedback loops kill developer productivity and make quick fixes anything but quick. … - **The Problem:** There's no built-in testing framework. Unit testing HCL is difficult, and integration testing (spinning up real infrastructure) is slow, expensive, and complex to manage. - **The Impact:** It's easy for bugs to slip into production, causing outages or security vulnerabilities. Confidence in making changes decreases as the infrastructure grows. … ### 9. Cryptic Error Messages While this has improved significantly in recent versions, Terraform can still produce error messages that are baffling, especially when dealing with complex modules or provider bugs. … - **The Problem:** Terraform only detects drift when you run a `plan` or `apply`. It doesn't have a built-in, continuous monitoring system to alert you when drift occurs. - **The Impact:** Your state file no longer represents reality, and the next `apply` could have unintended, destructive consequences by trying to "fix" the manual change.
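The rename problem described under "Refactoring is Painful and Risky" above has a declarative workaround in Terraform 1.1 and later: a `moved` block. The sketch below reuses the `aws_instance.web` to `aws_instance.web_server` rename from that snippet; the `ami_id` variable and the instance type are assumptions added only to make the fragment complete.

```hcl
# Hypothetical input, added so the fragment is self-contained.
variable "ami_id" {
  type        = string
  description = "AMI to launch (assumed input, not from the source article)."
}

# The resource was previously declared as aws_instance.web.
resource "aws_instance" "web_server" {
  ami           = var.ami_id
  instance_type = "t3.micro"
}

# Tells Terraform that the existing state entry for aws_instance.web
# now belongs to aws_instance.web_server, so the plan shows a move
# rather than one destroy and one create.
moved {
  from = aws_instance.web
  to   = aws_instance.web_server
}
```

On older versions, or for one-off fixes, `terraform state mv aws_instance.web aws_instance.web_server` edits the state to the same effect, but it is imperative and has to be repeated against every state that contains the resource.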
www.schibsted.pl
9 reasons why terraform is a pain, and 1 why you should still care - Schibsted Tech Polska
## The pains ### 1. The evil state The first thing you will complain about, when it comes to Terraform, is the fact that it’s stateful, and the implications that brings. I personally see two issues here: - the state has to be in sync with the infrastructure all the time – that also means that you have to go all-in when it comes to provisioning – i.e. no stack modifications can be made outside of the provisioning tool - you have to keep the state somewhere – and this has to be a secure location as state has to carry secrets … ### 2. Hard to start with the existing stack Back in the early days of Terraform, its issue tracker was full of complaints from people not being able to leverage Terraform with the existing stack. The reason was that Terraform was not able to incorporate the existing stack into the state (to my amazement, while looking for a sign of this, I’ve found my old PR that was trying to address that issue back then 😉 ). Fortunately, the import command was introduced, and this problem has been solved (at least at the system level). … ### 3. Complicated state modifications There is one additional thing that is a bit problematic when dealing with the state. While constantly refactoring your infrastructure definition, you may end up renaming resources (changing their identifiers) or moving them deeper into modules. Such changes are unfortunately hard for Terraform to follow, and leave it in a state where it doesn’t know that certain resources are simply misplaced. If you run … ### 4. Tricky conditional logic There are some people around the web who don’t like the fact that Terraform is not really an actual imperative programming language. To be perfectly honest I don’t share that opinion – I think the provisioning definition of the stack should be as declarative as it can – that leaves a lot less space for some deviations in the definitions. 
On the other hand, the conditional logic provided by Terraform is a bit tricky. For example to define a resource that is conditionally provisioned you make the resource to be a list, and use the count parameter to control it:
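The article's own example is cut off in this excerpt and is not reproduced here. As a stand-in, this is a minimal sketch of the `count` pattern the sentence describes; the variable name, bucket name, and output are hypothetical.

```hcl
# Hypothetical toggle controlling whether the resource exists at all.
variable "create_bucket" {
  type    = bool
  default = false
}

# With count set, the resource becomes a list of zero or one instances.
resource "aws_s3_bucket" "logs" {
  count  = var.create_bucket ? 1 : 0
  bucket = "example-logs-bucket" # placeholder name
}

# Because the resource is now a list, references must handle the empty case.
# one() returns the single element, or null when the list is empty.
output "logs_bucket_id" {
  value = one(aws_s3_bucket.logs[*].id)
}
```

The awkward part is the last block: every consumer of the resource has to index into it or use a helper such as `one()`, which is what makes this style of conditional feel tricky.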
itnext.io
The Pains in Terraform Collaboration
The snags that may stall your Terraform adoption and what to do I divide Infrastructure as Code (IaC) into three categories. **Mark-up languages** like CloudFormation and ARM have a simple format, but the body of code sprawls enormously as more objects are lumped together. **Domain specific languages** such as Terraform’s HCL feature flexible syntax and a mild dose of abstraction, creating a pleasant coding experience. Libraries that support **general-purpose programming languages**, such as AWS CDK and Pulumi, are extremely powerful yet require serious programming proficiency to tame the hyper-abstractions. … The open-source Terraform keeps states in workspaces. So we can address the first problem. However, the workspace feature does not attempt to deal with the second and third problems. For that reason, I regard the workspace feature in open-source Terraform as half-baked. It misses too much. I have seen teams using variable files to store per-workspace input variables. However, the input variables may contain secrets too. In addition, one more item to keep track of over time is whether each state remains consistent with the actual target resources (drift detection), which is also tricky. … There are many purpose-built extensions (GitHub, Azure DevOps) to facilitate Terraform installation and command invocation. However, as discussed, the real pain point with Terraform collaboration is the statefulness and consequent issues. Automation pipelines fall short in this regard, despite their significant role in continuous integration in the SDLC. Their scripting capability can achieve virtually any programmable task, but it is not fun to juggle numerous code paths to deal with state logistics and stateful resources.
scalr.com
7. Advanced Patterns
... As Terraform/OpenTofu structures become more complex with modules, multiple environments, split state, and potentially orchestration tools like Terragrunt, developers inevitably encounter challenging issues. ... Many common errors actually stem from neglecting the foundational best practices discussed earlier.
www.capterra.com
Terraform Reviews 2025. Verified Reviews, Pros & Cons | Capterra
Terraform's error messages are usually cryptic and hard to understand. Finally, when Terraform fails, it just drops the deployment and leaves the half-deployed resources as is. There is no built-in way to revert to a last good state. Supports a wide range of providers, such as cloud platforms and all sorts of server-based software such as HashiCorp Vault, Grafana, etc. ...
- Cryptic error messages
- The quality of the documentation varies a lot, but generally speaking it doesn't go into enough detail
- Terraform sometimes fails for obscure reasons
- Terraform sometimes updates resources that have no changes
- No built-in ability to roll back to a previously working state if the deployment fails
- Terraform is no longer FOSS
- Weird configuration language
- Renaming resources usually leads to painful problems
- Some limitations prevent the efficient use of Terraform in a multi-environment setup (which is usually the case), which led to the birth of Terragrunt to overcome those limitations.
… 1. There was generally one example for each resource in the Terraform documentation, which makes understanding a bit challenging. 2. There are very few developers with Terraform experience. 3. After writing the Terraform scripts, developers have to check the terraform plan output carefully before proceeding to the terraform apply command. There is a possibility that developers run the terraform apply command directly, which would lead to deletion or modification of resources, and once a resource is modified or deleted, there is no way to get it back. … Basically whenever your DevOps engineers are overseeing in excess of ten machines or when you need numerous groups not zeroed in on DevOps to assist with claiming the framework facilitating their code. Prominent sentiment is that Terraform isn't exceptionally secure, fight tried, and spilling mysteries happen effectively on mishap. 
Thus, Terraform is not so great when you need to store bunches of touchy mysteries that your organization is lawfully needed to watch in case it is the finish of you. … The actual language is somewhat surprising and this makes it difficult for new clients to get onboarded into the codebase. While it's improving with later deliveries, essential ideas like "map a variety of choices into a bunch of designs" or "apply this rationale in the event that a variable is indicated" are conceivable however superfluously unwieldy. … One of my main problems with Terraform is when I'm trying to delete things and I end up with dependencies blocking the deletion. Sometimes I need to manipulate state and am left with orphaned resources. When things don't work, it's tricky to troubleshoot. Really positive, we have the majority of our infrastructure represented as code which makes deployments and maintaining our infrastructure easier. … Looping strategies like for_each are rather complex to understand when you are new to terraform. It does not have a rollback feature & if something fails all the changes before that failed change would still be applied. Some conditional logics are unnecessarily cumbersome. Also overally analytics for users running plan & apply is missing, it's better for management purpose & debugging the plan which might have caused an infra issue.
www.shadecoder.com
Terraform: A Comprehensive Guide for 2025
Even experienced teams can run into predictable issues when adopting Terraform. Below are common pitfalls, why they happen, and practical fixes to avoid downtime and confusion. Poor state management • Why it happens: Teams keep state files locally or don’t enable locking, causing conflicts and state corruption. • Solution: Use remote state backends with locking (often supported by cloud object stores or managed services). Be careful when migrating state and avoid manual edits unless necessary.
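As one illustration of "remote state backends with locking", here is a minimal sketch using the S3 backend with a DynamoDB lock table. The bucket, key, region, and table names are placeholders, and the choice of AWS is an assumption; the guide itself does not name a backend.

```hcl
terraform {
  backend "s3" {
    bucket  = "example-terraform-state"        # placeholder bucket name
    key     = "prod/network/terraform.tfstate" # one key per root module
    region  = "eu-west-1"                      # placeholder region
    encrypt = true                             # server-side encryption of the state object

    # State locking: the table must already exist and use a string
    # partition key named "LockID". Placeholder table name.
    dynamodb_table = "example-terraform-locks"
  }
}
```

Backend blocks cannot reference variables, so the values must be literal or supplied at init time through `terraform init -backend-config=...`. Moving an existing local state into this backend is done with `terraform init -migrate-state`.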
Terraform hits hard when your infrastructure grows faster than your control over it. You run `terraform apply`, and the plan looks fine—until the change breaks something you didn’t expect. This is the core frustration: Terraform’s strength in managing large, complex clouds also exposes sharp edges when your state, modules, and workflows drift out of sync. The first pain point is state management. Remote state is supposed to solve collaboration issues, but locking, version conflicts, and backups can slow teams down or block deployments outright. Every mismatch between real resources and recorded state becomes a delay, a risk, and a source of hidden cost. The second is module complexity. Terraform encourages reusable modules, but deep dependency chains and over-abstracted components make debugging painful. Changing one variable in a shared module can trigger unrelated updates in production. Simple fixes can trigger large-scale plans, making rollbacks harder. Third, drift detection remains weak. Terraform can detect most changes, but out-of-band updates often slip through until something breaks. Manual audits and refresh commands add friction. Accurate drift detection should be automated and safe, but in practice, it is another manual chore. Lastly, execution speed matters. Large plans on multi-cloud setups run slow. Waiting minutes or hours to see if your change worked kills momentum. Quick feedback loops are rare, and workarounds often involve hacks to split plans or bypass certain checks.
Terraform is a powerful infrastructure-as-code (IaC) tool, but many teams hit the same pain points as they scale: remote state management, secrets ending up in state, configuration drift, module sprawl, slow plans and applies, and safe promotion across environments. In this article, we’ll walk through 13 of the biggest Terraform challenges, with practical tips to help you build faster, safer workflows. … 1. State management at scale 2. Sensitive data ending up in state and plan artifacts 3. Preventing and detecting configuration drift 4. Taming the dependency graph and resource ordering 5. Provider versioning and upgrade surprises 6. Dealing with cloud API rate limits and eventual consistency 7. Managing multiple environments without chaos 8. Managing Terraform modules at scale 9. Refactoring without accidental destroy/recreate 10. Importing existing (brownfield) infrastructure 11. Performance bottlenecks in large plans and applies 12. Making changes safe: review, testing, and policy guardrails 13. Licensing and governance uncertainty
## 1. State management at scale
Terraform state management gets tricky the moment your team and CI/CD start running Terraform in parallel. The `terraform.tfstate` file is Terraform’s “source of truth” for what it thinks exists. If two runs can write state at the same time (or the state is stored somewhere unreliable), you can end up with conflicting updates and painful recovery work. …
## 2. Sensitive data ending up in state and plan artifacts
Terraform is good at not splashing secrets all over your terminal, but that can create a false sense of safety. Even when the CLI shows `(sensitive value)`, the underlying state and plan data can still contain the real value, because Terraform needs a complete record of resource attributes to manage drift and future changes. State and plan files may include sensitive values like initial database passwords or API tokens, and local state is stored in plaintext by default. 
This becomes a real problem in CI/CD: It’s common to save `terraform plan -out=tfplan` and upload it as an artifact for a later apply job. That plan file can contain enough information to leak secrets if it’s accessible to the wrong people (or just ends up in the wrong place), turning “preview” artifacts into secret blobs you now have to secure like production credentials. …
## 4. Taming the dependency graph and resource ordering
Terraform builds a dependency graph to figure out resource ordering and run as much as possible in parallel, based mostly on the references it can “see” in your configuration. Trouble starts when the dependency is real but implicit: maybe a resource relies on a side effect (“this IAM policy must exist before that service can start”), or you’re passing IDs around as plain strings, so Terraform can’t infer the relationship. … Version constraints alone aren’t enough for reproducibility. Terraform uses constraints to decide what’s allowed and then records the exact chosen versions (plus checksums) in `.terraform.lock.hcl` so future runs make the same selections by default. If that lock file isn’t committed and consistently used, you can still get “works on my machine” drift between environments. …
## 6. Dealing with cloud API rate limits and eventual consistency
Sometimes your Terraform code is fine and the cloud just isn’t ready yet. Big applies can hit API throttling (429s / “Rate exceeded”) because Terraform is doing lots of create, read, and update calls at once, and most providers enforce per-account or per-region limits. Furthermore, many services are eventually consistent: The API accepts a change, but other endpoints won’t “see” it for seconds or minutes. …
## 12. Making changes safe: review, testing, and policy guardrails
At some point, the biggest risk isn’t “Terraform is wrong.” It’s that humans can’t reliably review what Terraform is saying. 
A plan with hundreds (or thousands) of changes is easy to rubber-stamp, and it’s hard to spot the one destructive action hiding in the noise. Correctness also isn’t just syntax. A configuration can be valid and still violate your organization’s rules (“no public S3,” “only these regions,” “no wide-open security groups”), or break module expectations in subtle ways. …
## 13. Licensing and governance uncertainty
For a lot of teams, “Terraform risk” isn’t technical; it’s licensing and governance. Terraform’s license changed to Business Source License 1.1 in August 2023, which created uncertainty for anyone redistributing Terraform, embedding it in products, or offering IaC as a hosted service. Many organizations can keep using Terraform internally, but the gray area is usually “Are we building something that could be considered competitive?” That question tends to trigger legal review and slow platform roadmaps. Governance adds a second layer: when a single vendor controls the roadmap, release cadence, and contribution process, teams need to plan for the possibility of future shifts (license terms, deprecations, feature direction) that ripple through their infrastructure workflow.
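Two of the guardrails touched on under item 12 above can be expressed directly in configuration. The sketch below is only an illustration: the approved region list, variable name, and bucket name are assumptions, and rules such as "no public S3" usually need a separate policy tool rather than variable validation.

```hcl
# "Only these regions": reject disallowed input before any plan is built.
variable "region" {
  type        = string
  description = "Deployment region (the approved list below is hypothetical)."

  validation {
    condition     = contains(["eu-west-1", "eu-central-1"], var.region)
    error_message = "Region must be one of the approved regions: eu-west-1, eu-central-1."
  }
}

# Guard against the one destructive action hiding in a large plan:
# any plan that would destroy this resource fails with an error.
resource "aws_s3_bucket" "audit_logs" {
  bucket = "example-audit-logs" # placeholder name

  lifecycle {
    prevent_destroy = true
  }
}
```

`prevent_destroy` only protects the resource while the block remains in the configuration; deleting the whole resource block removes the guard along with it, so it complements plan review rather than replacing it.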
controlmonkey.io
10 Common Terraform Errors & Best Practices to Avoid Them
Terraform errors are more common than most teams realize. While Terraform has become the IaC tool of choice for many organizations, the reality is that Terraform makes it deceptively easy to get started but considerably more challenging to get right. Many teams discover this only after they’ve accumulated significant technical debt. Simple deployments can quickly become maintenance nightmares when you overlook best practices. …
### Adopt Trunk-Based Development for Better Terraform Collaboration
... However, unlike application code, infrastructure can have only one version deployed. Keeping multiple long-lived branches in a Terraform repository is not common practice. ... …
## 2. Terraform Error: Ignoring Modules in Your Infrastructure
Without modules, lengthy and duplicated code appears across multiple environments as developers copy and paste configurations rather than reusing established patterns. It can cause inconsistencies across environments, and making a simple change would require updates in multiple places. Modules help keep provider versioning (for example, for the Terraform AWS or Azure provider) consistent across your configuration. …
## 3. Not Pinning Provider Versions: A Common Terraform Pitfall
When you don’t specify exact provider versions, Terraform automatically pulls the latest version during initialization, which can lead to unexpected behaviour or broken deployments when providers release breaking changes. Here is the right way: …
## 4. Terraform Mistake: Poor Resource Dependencies
Terraform builds its dependency graph based on explicit references between resources. But some dependencies exist at runtime that aren’t visible in configuration. Failing to declare these “hidden” dependencies can lead to subtle, hard-to-debug issues where resources are technically created but don’t function properly together. 
The example below shows why Terraform can miss important runtime dependencies and how `depends_on` can be used to fix it: …
## 7. Terraform Errors Caused by Inconsistent File Structures
One of the most common Terraform errors teams make is cramming numerous resources, data sources, and variables into a single monolithic .tf file. This approach might seem convenient initially, but as your infrastructure expands, it becomes increasingly difficult to navigate, troubleshoot, and collaborate effectively. A well-structured Terraform project typically includes several specialized files, each with a distinct purpose.
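The article's own examples for items 3 and 4 are elided in this excerpt. As a stand-in, the following sketch shows both ideas together: a pinned provider version and an explicit `depends_on` for a dependency Terraform cannot infer. All resource names, the `ami_id` variable, and the version numbers are hypothetical.

```hcl
terraform {
  required_version = ">= 1.5.0"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 5.0" # accepts 5.x releases, rejects 6.0 and later
    }
  }
}

variable "ami_id" {
  type        = string
  description = "AMI for the example instance (assumed input)."
}

resource "aws_iam_role" "app" {
  name = "example-app-role"
  assume_role_policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect    = "Allow"
      Action    = "sts:AssumeRole"
      Principal = { Service = "ec2.amazonaws.com" }
    }]
  })
}

resource "aws_iam_role_policy" "app_s3_read" {
  name = "example-s3-read"
  role = aws_iam_role.app.id
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:GetObject"]
      Resource = "arn:aws:s3:::example-bucket/*"
    }]
  })
}

resource "aws_iam_instance_profile" "app" {
  name = "example-app-profile"
  role = aws_iam_role.app.name
}

resource "aws_instance" "app" {
  ami                  = var.ami_id
  instance_type        = "t3.micro"
  iam_instance_profile = aws_iam_instance_profile.app.name

  # The instance references the profile, not the policy, so Terraform
  # sees no ordering between them. If software on the instance reads
  # from S3 at boot, the policy must exist first; depends_on states that.
  depends_on = [aws_iam_role_policy.app_s3_read]
}
```

The version constraint works together with the committed `.terraform.lock.hcl` file: the constraint bounds what is allowed, and the lock file records the exact version selected.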
terramate.io
# 10 Biggest Pitfalls of Terraform
Terraform, a popular tool in Infrastructure as Code (IaC), faces challenges with managing multiple environments and scaling, leading to complex and error-prone setups. Terramate, a new CLI tool, addresses these issues by introducing stack concepts, global variables, and code generation to simplify environment management and reduce code complexity. It improves upon Terraform's limitations in versioning, backend configuration, and resource lifecycle management, offering a more streamlined and flexible approach for complex infrastructure management. Terramate's innovative features enhance efficiency and control, making it a valuable addition to the IaC toolset. Terraform (or OpenTofu if you prefer open source) has emerged as a pivotal player in the evolving Infrastructure as Code (IaC) landscape, facilitating the management and provision of cloud resources through code. However, like any tool, it has drawbacks and tradeoffs. Challenges such as **managing multiple environments with workspaces**, **maintaining module versions** and **backend configurations**, and **managing resource lifecycles** often make Terraform code hard to read and prone to errors. Moreover, scaling can be cumbersome due to a lack of stack concept, leading to complications in more intricate environments. …
## 1. Terraform Workspaces
Terraform Workspaces help you manage different environments, like staging, development, and production. However, they can be tricky to handle. For example, the code can be difficult to understand because you have to use the `count` parameter a lot to create resources based on conditions. Also, it gets harder when you want to scale or grow with Terraform Workspaces because you need to add more connections between them when managing different environments. …
## 2. Maintaining Module Versions
In Terraform, a feature called the module block lets users use pre-set modules. 
But there's a problem with this block. The `source` and `version` attributes in this block, which are used to specify where the module comes from and which version of the module to use, don't allow for variable interpolation. Variable interpolation is replacing a placeholder in a string with its actual value. This limitation can cause trouble when you're trying to set up modules in a flexible or dynamic way. …
## 3. Hardcoding Backend Configuration
When you’re working with Terraform, you might need to make copies of Root Modules, but this can cause unexpected problems if you’re not careful with the backend configuration. The backend configuration is where Terraform stores information about your infrastructure. If you copy the backend configuration without changing the `key` or …
## 7. Missing Stack Concept
Terraform is unique in the world of IaC tools because it doesn’t have a stack concept. A stack is a collection of resources that are managed together. Instead, Terraform only focuses on what’s happening within a single directory, a root module. This can cause problems when dealing with bigger, more complex environments because it’s not designed to handle multiple collections of resources at once. …
## 8. Code Duplication
In Terraform, when you want to use a module (which is a pre-set piece of code) multiple times, you have to copy the call to the module and the arguments (the specific instructions you give to the module) each time. This leads to repeated code, making your codebase larger and harder to maintain. Terramate offers a solution to this problem with its Code Generation feature. This feature creates the necessary blocks of code based on your configurations. This means you don’t have to repeat the same blocks of code multiple times, which reduces code duplication and makes your code more efficient.
## 9. Monostacks
If you’re managing a lot of resources (like virtual machines, databases, etc.) in Terraform, it can cause some problems. 
For example, if something goes wrong, it could affect many of your resources (this is known as a “big blast radius”). Also, executing plans and applying changes can take a long time when dealing with many resources. Additionally, if there are discrepancies or “drifts” in a single resource, it can prevent you from applying new changes. … ## 10. Deep Merging of Maps and Objects In Terraform, merging or combining maps and objects at multiple levels, also known as “deep merging”, is not allowed. A map is a collection of key-value pairs, and an object is a complex structure containing multiple data types. This limitation makes it hard to merge default configurations with user inputs. For instance, it’s difficult to create keys or attributes that conflict, and changing the value of an attribute in a nested structure is impossible. … ## Conclusion Terraform has played a key role in popularizing the concept of Infrastructure as Code, where you manage your IT infrastructure using code. However, it’s not without its challenges. These include issues like code that is hard to read, difficulty scaling with workspaces, problems maintaining versions of modules, the need to hardcode backend configurations and the complexity of managing the lifecycle of resources.
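The deep-merge limitation in item 10 above is easiest to see with the built-in `merge()` function, which combines only the top level. A minimal sketch with made-up defaults and overrides:

```hcl
locals {
  # Hypothetical module defaults.
  defaults = {
    logging = {
      enabled        = true
      retention_days = 30
    }
  }

  # Hypothetical user input that only wants to change one nested value.
  overrides = {
    logging = {
      retention_days = 7
    }
  }

  # merge() is shallow: the whole "logging" value from overrides
  # replaces the one from defaults, so "enabled" is lost.
  merged = merge(local.defaults, local.overrides)
  # local.merged is { logging = { retention_days = 7 } }
}

output "merged" {
  value = local.merged
}
```

A common workaround is to merge each nested level explicitly, for example `merge(local.defaults.logging, local.overrides.logging)`, which becomes verbose as nesting grows.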