Deletion protection: the basics
Why one boolean across so many services is its own capability
Deletion and termination protection is the same idea wearing different field names across the estate. A CloudFormation stack has EnableTerminationProtection, which blocks DeleteStack. An RDS DB instance and an Aurora or DocumentDB cluster have DeletionProtection, which blocks the delete API. A DynamoDB table has DeletionProtectionEnabled. An Application, Network or Gateway Load Balancer has the deletion_protection.enabled attribute. An ECS capacity provider managing an Auto Scaling group uses scale-in protection. A Cognito user pool has DeletionProtection. In every case the flag is free, defaults to off, and does exactly one thing: it makes the delete call fail until someone deliberately turns protection off first.
AWS Security Hub turns each of these into its own control, which is why a single estate can fail several deletion-protection checks at once. CloudFormation.3 covers stacks; RDS.7 covers Aurora and other DB clusters while RDS.8 covers standalone DB instances; DocumentDB.5 covers DocumentDB clusters; DynamoDB.6 covers tables; ELB.6 covers load balancers; ECS.19 covers capacity-provider scale-in protection; Cognito.6 covers user pools. They look like separate findings on the report, but they are one capability: turn the irreversible one-step delete into a deliberate two-step action on anything long-lived.
The severity ratings on these are mostly Low or Medium, but the rating reflects likelihood, not blast radius. The thing each one prevents is the permanent, unrecoverable destruction of production data or routing. A protected resource behaves identically to its unprotected twin on every normal day; the difference only shows up at the exact moment a delete would have gone through. That is the whole value: the flag adds a hard stop precisely when the human operator most needs one.
In this lesson you will learn how AWS expresses deletion and termination protection across stacks, databases, tables, load balancers, container capacity and identity, why most of these are scoped to long-lived production resources and not disposable ones, and the one-call fix for each. You will also see the trap that catches teams (fixing the live resource without fixing the source template, so the next deploy quietly re-opens the gap) and the per-resource retention controls that pair with these flags for defence in depth. The Controls this lesson covers section lists every Security Hub control in this capability, each linking to a deep page with the exact check and a copy-and-paste fix.
The flag that exists because "we'll be careful" did not scale
AWS added RDS deletion protection in 2018, years after RDS launched, and CloudFormation termination protection arrived on a similar timeline. They were not there from day one. Enough teams ran an automation that deleted the wrong instance with the final snapshot skipped, or ran delete-stack against the wrong AWS profile during a Friday cleanup, that AWS introduced a hard block: a property that makes the delete API call itself fail. The whole family of these flags exists for the same reason: across millions of API calls and thousands of customers, "everyone will remember not to make the mistake" turned out not to be a control at all.
Finding unprotected production resources across an estate
Marco runs the security cadence at a B2B SaaS company. Security Hub fires deletion-protection findings spread across CloudFormation stacks, RDS instances and load balancers in three accounts that pre-date the team's safe-defaults standard. None of it is exotic: the flags simply default to off, and the old Terraform modules never set them.
Rather than enable protection blindly everywhere (which would turn every CI teardown into a two-step ceremony), he scopes to long-lived production resources and starts with the databases, because a deleted production database is the least recoverable thing on the list.
Audit every RDS DB instance for the DeletionProtection flag so you can see which production databases fail RDS.8.
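One way to run that audit is sketched below. The function name audit_rds_deletion_protection and the Instance, Engine and Protected column labels are illustrative choices, and the sketch assumes the AWS CLI is already configured with credentials and a default region for the account being checked.

```shell
# Sketch: list every RDS DB instance in the current region with its
# engine and its DeletionProtection flag.
# Assumes a configured AWS CLI; the function name is hypothetical.
audit_rds_deletion_protection() {
  aws rds describe-db-instances \
    --query 'DBInstances[].{Instance:DBInstanceIdentifier,Engine:Engine,Protected:DeletionProtection}' \
    --output table
}

# Example call (uncomment to run against the configured account):
# audit_rds_deletion_protection
```

describe-db-instances only returns instances in the region the CLI is pointed at, so the function would need to be run once per account and region.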
The False rows on production engines are the RDS.8 failures that matter. The same shape of audit applies to stacks, tables and load balancers.
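For stacks, the equivalent check might look like the sketch below. It assumes a configured AWS CLI and filters on the EnableTerminationProtection field that describe-stacks returns; the function name list_unprotected_stacks is hypothetical.

```shell
# Sketch: print the name of every stack in the current region whose
# termination protection is off, one name per line.
# Assumes a configured AWS CLI; the function name is hypothetical.
list_unprotected_stacks() {
  aws cloudformation describe-stacks \
    --query 'Stacks[?EnableTerminationProtection==`false`].StackName' \
    --output text | tr '\t' '\n'
}

# Example call (uncomment to run against the configured account):
# list_unprotected_stacks
```

The list still needs a human pass to separate long-lived production stacks from disposable ones before anything is changed.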
How AWS enforces the two-step delete: deep dive
Every control in this group resolves to the same enforcement pattern: a boolean attribute read at the front of the delete call. CloudFormation rejects DeleteStack with a ValidationError when EnableTerminationProtection is true. RDS returns an InvalidParameterCombination error on DeleteDBInstance or DeleteDBCluster when DeletionProtection is true (RDS.8 evaluates the instance resource, RDS.7 the cluster resource; DocumentDB.5 is the same property on DocumentDB clusters). ELB returns OperationNotPermitted on DeleteLoadBalancer when deletion_protection.enabled is true (ELB.6). DynamoDB blocks DeleteTable when DeletionProtectionEnabled is true (DynamoDB.6). Cognito blocks user-pool deletion when DeletionProtection is ACTIVE (Cognito.6). ECS.19 uses Auto Scaling scale-in protection so a managed instance is not terminated out from under a task. There is no override flag on the delete itself; the only path through is a prior modify that turns protection off.
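A cleanup script can use those error codes to tell "blocked by protection" apart from other failures. The sketch below is a rough classifier under stated assumptions: it expects the AWS CLI's usual error text of the form "An error occurred (Code) when calling the Operation operation: ...", the function name and the labels it prints are hypothetical, and because these error codes are not unique to protection a match should be treated as likely rather than certain.

```shell
# Sketch: map the stderr text of a failed delete call to a likely cause.
# Input: the error text as the first argument.
# Output: a hypothetical label; "other" means no known pattern matched.
classify_delete_error() {
  case "$1" in
    *"(ValidationError)"*DeleteStack*)
      echo "stack-termination-protection" ;;
    *"(InvalidParameterCombination)"*DeleteDBInstance*|*"(InvalidParameterCombination)"*DeleteDBCluster*)
      echo "db-deletion-protection" ;;
    *"(OperationNotPermitted)"*DeleteLoadBalancer*)
      echo "lb-deletion-protection" ;;
    *)
      echo "other" ;;
  esac
}
```

A caller might capture stderr from the delete command, pass it to classify_delete_error, and stop with a clear message instead of retrying when a protection label comes back.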
Enabling or disabling these is a metadata-only change on the resource: it applies effectively immediately, requires no reboot, and incurs no cost. Importantly, protection blocks deletion only, not modification. You can still resize a database, swap a load balancer's listeners, update a table's capacity, or run an UpdateStack. So it is not a freeze; it is specifically a guard against the worst irreversible mistake, the whole-resource delete.
Two subtleties catch teams. First, deletion protection does not stop infrastructure-as-code from re-opening the gap: if a Terraform module's default is false, the next apply flips the live resource back to unprotected, which is why fixing the resource without fixing the template is treating the symptom. Second, on CloudFormation the stack-level flag does not stop an UpdateStack from replacing (and destroying) a stateful resource; that is what per-resource DeletionPolicy: Retain is for. The whole-stack flag and the per-resource policy are complementary, and you want both.
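To catch the first subtlety before the next apply, a quick scan of the Terraform source can flag explicit false settings. The sketch below is a heuristic, not a full check: the function name is hypothetical, it only looks at *.tf files, and it cannot see modules that omit the attribute entirely or set it through a variable.

```shell
# Sketch: find Terraform lines that explicitly turn deletion protection off.
# Matches deletion_protection, enable_deletion_protection and
# deletion_protection_enabled set to the literal false.
# Argument: directory to scan (defaults to the current directory).
find_unprotected_defaults() {
  dir="${1:-.}"
  find "$dir" -type f -name '*.tf' -exec \
    grep -nHE 'deletion_protection(_enabled)?[[:space:]]*=[[:space:]]*false' {} +
}
```

Each hit is printed as file, line number and matching text, which gives a starting list of modules whose default needs to change alongside the live resource.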
What is the impact of leaving production resources unprotected?
The direct impact is exposure to an irreversible event. A single delete call (issued in error, by a runaway script, by a stale Terraform plan, or by a compromised credential) permanently destroys the resource. A delete-stack against a production stack walks the dependency graph and removes the VPC, the database, the KMS keys and the IAM roles in order, with no confirmation. A delete-load-balancer drops every in-flight connection and can leave a dangling DNS record. A delete-table, or a delete-db-instance with the final snapshot skipped, can leave nothing to restore from unless a separate backup already exists. For production, that gap is the difference between an incident and a catastrophe.
There is no cost dimension to the fix, and that is what makes the unprotected state hard to justify. These flags are free, apply instantly, and have zero operational side effects on normal work; the only change is that deletion now requires a deliberate two-step. There is no trade-off to weigh, which is why a persistent failure here is almost always a defaults problem rather than a considered decision. And because protection defaults to off everywhere, an unprotected production resource usually means the template or process that created it did not set the safe default, so the same process is likely producing other unprotected resources.
On the compliance side, these controls map to data-availability and change-management expectations across SOC 2, ISO 27001, PCI DSS and HIPAA. "An engineer ran the wrong command and we had an outage" is exactly the failure mode auditors look for, and a two-step deletion requirement on production is one of the cheapest controls to point at. The financial impact dwarfs the cost: public post-mortems of accidental deletes routinely cite six-figure incident costs, while the protection is free. The math is not subtle.
How do you enable protection safely?
Work the capability as one loop. The order matters: protect the long-lived resources, leave the disposable ones alone, pair the flag with per-resource retention where loss is unrecoverable, and make the safe default automatic so the findings do not come back.
1. Inventory and bucket resources by environment
Pull the state of each protection flag across every account and region: stacks (EnableTerminationProtection), DB instances and clusters (DeletionProtection), DynamoDB tables (DeletionProtectionEnabled), load balancers (deletion_protection.enabled), ECS capacity providers (scale-in protection) and Cognito user pools. Separate long-lived production resources, which are the urgent fixes, from intentionally disposable ones (CI pipelines, sandbox stacks, feature-branch environments). Protection on a throwaway is just friction, so the target is a residual count that reflects disposable resources, not zero across the board.
2. Enable protection on every long-lived production resource
Flip the flag with the relevant modify call: update-termination-protection for stacks, modify-db-instance / modify-db-cluster for RDS, Aurora and DocumentDB, update-table for DynamoDB, modify-load-balancer-attributes for ELB. These are metadata-only changes, so they are non-disruptive and free. Run them in a loop across the filtered inventory rather than by hand, and the findings clear on the next evaluation.
3. Pair the flag with per-resource retention for stateful resources
The whole-resource flag blocks deletion but not replacement during an update. On CloudFormation, add DeletionPolicy: Retain (and UpdateReplacePolicy: Retain) to the template entries for stateful resources (RDS instances, S3 buckets, KMS keys) so an UpdateStack that replaces them leaves the old one in place. On RDS, set the final-snapshot behaviour so a permitted delete still leaves a recoverable copy. Belt and braces: the flag handles the API-level accident, the retention policy handles the template-level one.
4. Make the safe default automatic so it cannot drift back
Enabling the flag on a live resource is undone the moment an IaC apply runs from a template whose default is false. Set protection on by default in the Terraform / CloudFormation modules, and keep the matching AWS Config rules enabled so any new unprotected long-lived resource is flagged immediately. For the strongest posture, add a Service Control Policy that denies the delete action on resources tagged production, giving two independent layers (one at the IAM gate, one at the service control plane). Detection plus prevention keeps the count where it belongs without manual policing.
# Enable deletion protection on every unprotected standalone RDS instance in a region.
for id in $(aws rds describe-db-instances \
    --query 'DBInstances[?DeletionProtection==`false`].DBInstanceIdentifier' --output text); do
  aws rds modify-db-instance --db-instance-identifier "$id" \
    --deletion-protection --apply-immediately
  echo "Protected RDS instance: $id"
done

# Termination-protect every production-tagged CloudFormation stack (eyeball the list first).
aws cloudformation describe-stacks \
  --query "Stacks[?Tags[?Key=='Environment' && Value=='production']].StackName" \
  --output text | tr '\t' '\n' | while read -r stack; do
  aws cloudformation update-termination-protection \
    --stack-name "$stack" --enable-termination-protection
  echo "Protected stack: $stack"
done

# Deletion-protect a production load balancer (LB_ARN must hold its ARN).
aws elbv2 modify-load-balancer-attributes --load-balancer-arn "$LB_ARN" \
  --attributes Key=deletion_protection.enabled,Value=true
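DynamoDB tables follow the same pattern through update-table. The sketch below takes table names as arguments instead of protecting every table in the region, so disposable tables can be left out; the function name is hypothetical and a configured AWS CLI is assumed.

```shell
# Sketch: enable deletion protection on each DynamoDB table named on the
# command line. Assumes a configured AWS CLI; the function name is hypothetical.
protect_dynamodb_tables() {
  for table in "$@"; do
    if aws dynamodb update-table --table-name "$table" \
        --deletion-protection-enabled > /dev/null; then
      echo "Protected DynamoDB table: $table"
    else
      echo "Failed to protect DynamoDB table: $table" >&2
    fi
  done
}

# Example call with placeholder table names:
# protect_dynamodb_tables orders invoices
```

Passing the names explicitly keeps the production-versus-disposable decision with the operator, in line with the scoping in step 1.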
Keep learning
Go deeper on how the two-step delete works across the services in this capability.
- Protecting a CloudFormation stack from being deleted: How termination protection works on stacks, the IAM permissions involved, and the edge cases.
- Deletion protection for Amazon RDS DB instances: What the DeletionProtection flag blocks on instances and clusters, and how to enable or disable it.
- CloudFormation DeletionPolicy attribute: Per-resource retention controls that pair with whole-resource protection for defence in depth.
You can now treat deletion and termination protection as one capability rather than a scatter of findings: inventory the protection flag across stacks, databases, tables, load balancers, container capacity and user pools, enable it on everything long-lived while leaving disposable resources alone, pair it with per-resource retention where loss is unrecoverable, and make the safe default automatic in your templates so the findings do not come back. The Controls this lesson covers section below links every control in this group to its deep page and fix.