Terraform drift recovery: stabilize IaC without stalling delivery
Drift grows quietly until applies feel dangerous. This is a recovery plan that restores safe change control without freezing delivery.
- Manual changes are patched into production outside Terraform
- Applies are avoided because the blast radius is unknown
- State is shared across environments or teams
- Modules are brittle and undocumented
Terraform drift is usually a process failure, not a tooling failure
Drift appears when changes land outside of IaC and nobody can reconcile them safely. It often starts with a hotfix, then becomes a habit. Over time, the state file stops matching reality and teams lose trust.
- Incidents force manual changes that are never reconciled.
- Multiple teams edit infrastructure without a shared review gate.
- Environments share state, and ownership of that state is unclear or contested.
A six-step plan that restores safe applies
Keep delivery moving while you rebuild trust in IaC.
1. Freeze unsafe changes
Pause high-risk applies and document what is actually deployed, not what the state file claims.
2. Inventory drift
Identify manual changes, unknown resources, and unmanaged dependencies.
3. Split ownership
Separate environments and reduce cross-team coupling in state.
4. Rebuild modules
Simplify critical modules and document intent and constraints.
5. Re-introduce safe applies
Use narrowly targeted plans and keep the blast radius of each change small.
6. Create guardrails
Make off-path changes visible and expensive again.
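Parts of steps 2 and 5 can be scripted against Terraform's machine-readable plan. The following is a minimal sketch, not a definitive implementation. It assumes the plan was exported with `terraform show -json` and that each entry under `resource_changes` carries an `address` and a `change.actions` list. The function names, the sample resource addresses, and the `max_destructive` threshold are hypothetical. It only sees resources Terraform already manages; resources created entirely outside Terraform need a separate inventory from the provider.

```python
import json
from collections import defaultdict

# Hypothetical sample shaped like the output of `terraform show -json <planfile>`.
# In practice, load the real file with json.load() instead.
SAMPLE_PLAN = {
    "resource_changes": [
        {"address": "aws_security_group.web", "change": {"actions": ["update"]}},
        {"address": "aws_instance.app", "change": {"actions": ["delete", "create"]}},
        {"address": "aws_s3_bucket.logs", "change": {"actions": ["no-op"]}},
    ]
}


def summarize_plan(plan: dict) -> dict:
    """Group resource addresses by the kind of change the plan proposes."""
    summary = defaultdict(list)
    for rc in plan.get("resource_changes", []):
        actions = rc.get("change", {}).get("actions", [])
        if not actions or actions in (["no-op"], ["read"]):
            continue  # nothing would change for this resource
        if "create" in actions and "delete" in actions:
            kind = "replace"  # destroy and recreate, in either order
        else:
            kind = actions[0]
        summary[kind].append(rc["address"])
    return dict(summary)


def within_blast_radius(summary: dict, max_destructive: int = 0) -> bool:
    """Return True if deletes plus replacements stay within the allowed count."""
    destructive = len(summary.get("delete", [])) + len(summary.get("replace", []))
    return destructive <= max_destructive


if __name__ == "__main__":
    summary = summarize_plan(SAMPLE_PLAN)
    print(json.dumps(summary, indent=2))
    print("safe to apply:", within_blast_radius(summary))
```

A gate like `within_blast_radius` could run in CI before an apply. Starting with a threshold of zero forces every destructive change through an explicit review until trust in the plan output returns.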
Habits that keep drift from returning
- Pre-apply checklists and ownership gates.
- Change reviews that include runtime impact, not just diffs.
- Runbooks for emergency changes with reconciliation steps.
- Weekly drift checks on critical modules and environments.
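The weekly drift check in the last habit can be scheduled as a small script. This is a sketch under stated assumptions: it relies on `terraform plan -detailed-exitcode`, which exits with 0 for an empty plan, 1 for an error, and 2 for a non-empty plan. A non-empty plan covers both drift and configuration changes that have not been applied yet, so a result of "changes" is a prompt to investigate rather than proof of drift. The directory names and function names are hypothetical.

```python
import subprocess

# Exit codes documented for `terraform plan -detailed-exitcode`.
EXIT_MEANING = {0: "clean", 1: "error", 2: "changes"}


def classify_exit(code: int) -> str:
    """Map a plan exit code to a status; unknown codes count as errors."""
    return EXIT_MEANING.get(code, "error")


def check_directory(workdir: str) -> str:
    """Run a read-only plan in workdir and report its status.

    Assumes `terraform init` has already been run in that directory.
    """
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false", "-no-color"],
        cwd=workdir,
        capture_output=True,
        text=True,
    )
    return classify_exit(result.returncode)


def needs_attention(statuses: dict) -> list:
    """Return the directories whose status is anything other than clean."""
    return sorted(d for d, status in statuses.items() if status != "clean")


if __name__ == "__main__":
    # Hypothetical layout: one directory per critical environment.
    critical = ["envs/prod/network", "envs/prod/database"]
    statuses = {d: check_directory(d) for d in critical}
    for directory in needs_attention(statuses):
        print(f"{directory}: {statuses[directory]}")
```

Reporting only the directories that are not clean keeps the weekly output short enough that someone actually reads it.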
If applies feel risky, request an Infrastructure Review.
We can stabilize IaC, pipelines, and delivery without slowing your team.