Enable deletion and termination protection

One capability across CloudFormation, RDS, Aurora, DocumentDB, DynamoDB, ELB, ECS and Cognito: turn the irreversible one-step delete into a deliberate two-step action so a single mistyped command cannot wipe out production.

Last reviewed 16 June 2026

Remediates AWS Security Hub: CloudFormation.3 Cognito.6 DocumentDB.5 DynamoDB.6 ECS.19 ELB.6 RDS.7 RDS.8

Deletion protection: the basics

Why one boolean across so many services is its own capability

Deletion and termination protection is the same idea wearing different field names across the estate. A CloudFormation stack has EnableTerminationProtection, which blocks DeleteStack. An RDS DB instance and an Aurora or DocumentDB cluster have DeletionProtection, which blocks the delete API. A DynamoDB table has DeletionProtectionEnabled. An Application, Network or Gateway Load Balancer has the deletion_protection.enabled attribute. An ECS capacity provider managing an Auto Scaling group uses scale-in protection. A Cognito user pool has DeletionProtection. In every case the flag is free, defaults to off, and does exactly one thing: it makes the delete call fail until someone deliberately turns protection off first.

AWS Security Hub turns each of these into its own control, which is why a single estate can fail several deletion-protection checks at once. CloudFormation.3 covers stacks; RDS.7 covers Aurora and other DB clusters while RDS.8 covers standalone DB instances; DocumentDB.5 covers DocumentDB clusters; DynamoDB.6 covers tables; ELB.6 covers load balancers; ECS.19 covers capacity-provider scale-in protection; Cognito.6 covers user pools. They look like separate findings on the report, but they are one capability: turn the irreversible one-step delete into a deliberate two-step action on anything long-lived.
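To see how the separate findings roll up into one capability, a per-control count can be pulled from Security Hub. The following is a minimal sketch, assuming consolidated control findings are enabled so that the ComplianceSecurityControlId filter matches these IDs; failed_count and report_controls are hypothetical helper names, not AWS commands.

```shell
# Sketch only: count ACTIVE, FAILED findings for one Security Hub control ID.
# failed_count is a hypothetical helper; the filter assumes consolidated control findings.
failed_count() {
  local control="$1" filters
  filters=$(printf '{"ComplianceSecurityControlId":[{"Value":"%s","Comparison":"EQUALS"}],"ComplianceStatus":[{"Value":"FAILED","Comparison":"EQUALS"}],"RecordState":[{"Value":"ACTIVE","Comparison":"EQUALS"}]}' "$control")
  # One resource ID per finding; tabs become newlines so grep can count them.
  aws securityhub get-findings --filters "$filters" \
    --query 'Findings[].Resources[0].Id' --output text \
    | tr '\t' '\n' | grep -c .
}

# Print one "control<TAB>count" line for every control in this capability.
report_controls() {
  local c
  for c in CloudFormation.3 Cognito.6 DocumentDB.5 DynamoDB.6 ECS.19 ELB.6 RDS.7 RDS.8; do
    printf '%s\t%s\n' "$c" "$(failed_count "$c")"
  done
}
```

Calling report_controls in each account and region gives a single table to track, which fits the idea that these are one capability rather than eight unrelated findings.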

The severity ratings on these are mostly Low or Medium, but the rating reflects likelihood, not blast radius. The thing each one prevents is the permanent, unrecoverable destruction of production data or routing. A protected resource behaves identically to its unprotected twin on every normal day; the difference only shows up at the exact moment a delete would have gone through. That is the whole value: it adds a hard stop precisely when the human operator most needs one.

In this lesson you will learn how AWS expresses deletion and termination protection across stacks, databases, tables, load balancers, container capacity and identity, why most of these are scoped to long-lived production resources and not disposable ones, and the one-call fix for each. You will also see the trap that catches teams (fixing the live resource without fixing the source template, so the next deploy quietly re-opens the gap) and the per-resource retention controls that pair with these flags for defence in depth. The "Controls this lesson covers" section lists every Security Hub control in this capability, each linking to a deep page with the exact check and a copy-and-paste fix.

Fun fact

The flag that exists because "we'll be careful" did not scale

AWS added RDS deletion protection in 2018, years after RDS launched, and CloudFormation termination protection arrived on a similar timeline. They were not there from day one. Enough teams ran an automation that deleted the wrong instance with the final snapshot skipped, or ran delete-stack against the wrong AWS profile during a Friday cleanup, that AWS introduced a hard block: a property that makes the delete API call itself fail. The whole family of these flags exists for the same reason: across millions of API calls and thousands of customers, "everyone will remember not to make the mistake" turned out not to be a control at all.

Finding unprotected production resources across an estate

Marco runs the security cadence at a B2B SaaS company. Security Hub fires deletion-protection findings spread across CloudFormation stacks, RDS instances and load balancers in three accounts that pre-date the team's safe-defaults standard. None of it is exotic: the flags simply default to off, and the old Terraform modules never set them.

Rather than enable protection blindly everywhere (which would turn every CI teardown into a two-step ceremony), he scopes to long-lived production resources and starts with the databases, because a deleted production database is the least recoverable thing on the list.

Audit every RDS DB instance for the DeletionProtection flag so you can see which production databases fail RDS.8.

$ aws rds describe-db-instances --query 'DBInstances[].{Id:DBInstanceIdentifier,Engine:Engine,Protected:DeletionProtection}' --output table
----------------------------------------------------------
| prod-orders-db  | postgres | False |
| prod-billing-db | postgres | False |
| prod-auth-db    | mysql    | True  |
| dev-scratch-db  | mariadb  | False |
----------------------------------------------------------
# Two production instances unprotected; dev-scratch is fine to leave open.

The False rows on production engines are the RDS.8 failures that matter. The same shape of audit applies to stacks, tables and load balancers.
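The same audit for stacks, tables and load balancers might look like the sketch below. The function names are illustrative only; the field and attribute names are the ones this lesson already uses (EnableTerminationProtection, DeletionProtectionEnabled, deletion_protection.enabled).

```shell
# Sketch: one audit helper per service. Helper names are hypothetical.

# Stack name and termination-protection flag, one stack per line.
audit_stacks() {
  aws cloudformation describe-stacks \
    --query 'Stacks[].[StackName,EnableTerminationProtection]' --output text
}

# Table name and DeletionProtectionEnabled, one table per line.
audit_tables() {
  local table
  for table in $(aws dynamodb list-tables --query 'TableNames[]' --output text); do
    printf '%s\t%s\n' "$table" "$(aws dynamodb describe-table --table-name "$table" \
      --query 'Table.DeletionProtectionEnabled' --output text)"
  done
}

# Load balancer ARN and the deletion_protection.enabled attribute value.
audit_load_balancers() {
  local arn
  for arn in $(aws elbv2 describe-load-balancers \
      --query 'LoadBalancers[].LoadBalancerArn' --output text); do
    printf '%s\t%s\n' "$arn" "$(aws elbv2 describe-load-balancer-attributes \
      --load-balancer-arn "$arn" \
      --query "Attributes[?Key=='deletion_protection.enabled'].Value | [0]" \
      --output text)"
  done
}
```

Load balancer attributes come back as the strings true or false, while the stack and table flags are booleans that the text output renders as True or False, so any follow-up filtering needs to allow for both spellings.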

How AWS enforces the two-step delete: deep dive

Every control in this group resolves to the same enforcement pattern: a boolean attribute read at the front of the delete call. CloudFormation rejects DeleteStack with a ValidationError when EnableTerminationProtection is true. RDS returns an InvalidParameterCombination error on DeleteDBInstance or DeleteDBCluster when DeletionProtection is true (RDS.8 evaluates the instance resource, RDS.7 the cluster resource; DocumentDB.5 is the same property on DocumentDB clusters). ELB returns OperationNotPermitted on DeleteLoadBalancer when deletion_protection.enabled is true (ELB.6). DynamoDB blocks DeleteTable when DeletionProtectionEnabled is true (DynamoDB.6). Cognito blocks user-pool deletion when DeletionProtection is ACTIVE (Cognito.6). ECS.19 uses Auto Scaling scale-in protection so a managed instance is not terminated out from under a task. There is no override flag on the delete itself; the only path through is a prior modify that turns protection off.
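To make the two-step concrete, here is a minimal sketch of what a deliberate delete of a protected RDS instance looks like. The helper name and the "-final" snapshot suffix are assumptions for illustration; the point is only that the modify has to come first and succeed.

```shell
# Sketch of the deliberate two-step for one RDS instance.
# delete_protected_db_instance is a hypothetical helper name.
delete_protected_db_instance() {
  local id="$1"
  # Step 1: turn protection off. Without this, the delete call below is rejected.
  aws rds modify-db-instance --db-instance-identifier "$id" \
    --no-deletion-protection --apply-immediately || return 1
  # Step 2: delete, keeping a final snapshot (the snapshot name is an assumption).
  aws rds delete-db-instance --db-instance-identifier "$id" \
    --final-db-snapshot-identifier "${id}-final"
}
```

Running only the second call against a protected instance is the case that fails, which is exactly the hard stop the flag is meant to provide.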

Enabling or disabling these is a metadata-only change on the resource: it applies effectively immediately, requires no reboot, and incurs no cost. Importantly, protection blocks deletion only, not modification. You can still resize a database, swap a load balancer's listeners, update a table's capacity, or run an UpdateStack. So it is not a freeze; it is specifically a guard against the worst irreversible mistake, the whole-resource delete.

Two subtleties catch teams. First, deletion protection does not stop infrastructure-as-code from re-opening the gap: if a Terraform module's default is false, the next apply flips the live resource back to unprotected, which is why fixing the resource without fixing the template is treating the symptom. Second, on CloudFormation the stack-level flag does not stop an UpdateStack from replacing (and destroying) a stateful resource; that is what per-resource DeletionPolicy: Retain is for. The whole-stack flag and the per-resource policy are complementary, and you want both.

What is the impact of leaving production resources unprotected?

The direct impact is exposure to an irreversible event. A single delete call (issued in error, by a runaway script, by a stale Terraform plan, or by a compromised credential) permanently destroys the resource. A delete-stack against a production stack walks the dependency graph and removes the VPC, the database, the KMS keys and the IAM roles in order, with no confirmation. A delete-load-balancer drops every in-flight connection and can leave a dangling DNS record. A delete-table, or a delete-db-instance with the final snapshot skipped, can leave no automatic backup at all. For production, that gap is the difference between an incident and a catastrophe.

There is no cost dimension to the fix, and that is what makes the unprotected state hard to justify. These flags are free, apply instantly, and have zero operational side effects on normal work; the only change is that deletion now requires a deliberate two-step. There is no trade-off to weigh, which is why a persistent failure here is almost always a defaults problem rather than a considered decision. And because protection defaults to off everywhere, an unprotected production resource usually means the template or process that created it did not set the safe default, so the same process is likely producing other unprotected resources.

On the compliance side, these controls map to data-availability and change-management expectations across SOC 2, ISO 27001, PCI DSS and HIPAA. "An engineer ran the wrong command and we had an outage" is exactly the failure mode auditors look for, and a two-step deletion requirement on production is one of the cheapest controls to point at. The financial impact dwarfs the cost: public post-mortems of accidental deletes routinely cite six-figure incident costs, while the protection is free. The math is not subtle.

How do you enable protection safely?

Work the capability as one loop. The order matters: protect the long-lived resources, leave the disposable ones alone, pair the flag with per-resource retention where loss is unrecoverable, and make the safe default automatic so the findings do not come back.

1. Inventory and bucket resources by environment

Pull the state of each protection flag across every account and region: stacks (EnableTerminationProtection), DB instances and clusters (DeletionProtection), DynamoDB tables (DeletionProtectionEnabled), load balancers (deletion_protection.enabled), ECS capacity providers (scale-in protection) and Cognito user pools. Separate long-lived production resources, which are the urgent fixes, from intentionally disposable ones (CI pipelines, sandbox stacks, feature-branch environments). Protection on a throwaway is just friction, so the target is a residual count that reflects disposable resources, not zero across the board.
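One way to sketch the bucketing for RDS is to filter on an environment tag while walking regions. This assumes production resources carry the tag Environment=production, the same convention the stack example in this lesson uses; the helper names are hypothetical.

```shell
# Sketch: unprotected RDS instances tagged Environment=production in one region.
# The tag key and value are an assumed convention; adjust to your own tagging.
unprotected_prod_instances() {
  aws rds describe-db-instances --region "$1" \
    --query "DBInstances[?DeletionProtection==\`false\` && TagList[?Key=='Environment' && Value=='production']].DBInstanceIdentifier" \
    --output text
}

# Walk every region enabled for the account and print "region<TAB>instance" pairs.
inventory_all_regions() {
  local region id
  for region in $(aws ec2 describe-regions --query 'Regions[].RegionName' --output text); do
    for id in $(unprotected_prod_instances "$region"); do
      printf '%s\t%s\n' "$region" "$id"
    done
  done
}
```

Anything this returns is an urgent fix; untagged or disposable instances drop out of the list by design, so they still need a separate look if tagging is incomplete.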

2. Enable protection on every long-lived production resource

Flip the flag with the relevant modify call: update-termination-protection for stacks, modify-db-instance / modify-db-cluster for RDS, Aurora and DocumentDB, update-table for DynamoDB, modify-load-balancer-attributes for ELB. These are metadata-only changes, so they are non-disruptive and free. Run them in a loop across the filtered inventory rather than by hand, and the findings clear on the next evaluation.

3. Pair the flag with per-resource retention for stateful resources

The whole-resource flag blocks deletion but not replacement during an update. On CloudFormation, add DeletionPolicy: Retain (and UpdateReplacePolicy: Retain) to the template entries for stateful resources (RDS instances, S3 buckets, KMS keys) so an UpdateStack that replaces them leaves the old one in place. On RDS, set the final-snapshot behaviour so a permitted delete still leaves a recoverable copy. Belt and braces: the flag handles the API-level accident, the retention policy handles the template-level one.
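A minimal sketch of the retention pairing, written as a shell helper that emits a small CloudFormation template. The file path, helper name and the AuditBucket resource are illustrative assumptions; the two attributes are the ones named above.

```shell
# Sketch: write a minimal template whose stateful resource is retained on
# stack deletion and on replacement. AuditBucket is a hypothetical resource.
write_retain_template() {
  cat > "$1" <<'YAML'
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  AuditBucket:
    Type: AWS::S3::Bucket
    DeletionPolicy: Retain
    UpdateReplacePolicy: Retain
YAML
}

# Example use (not run here):
#   write_retain_template retain-example.yaml
#   aws cloudformation validate-template --template-body file://retain-example.yaml
```

The same two attributes sit at the resource level, next to Type, for an RDS instance or a KMS key; they are not properties of the resource itself.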

4. Make the safe default automatic so it cannot drift back

Enabling the flag on a live resource is undone the moment an IaC apply runs from a template whose default is false. Set protection on by default in the Terraform / CloudFormation modules, and keep the matching AWS Config rules enabled so any new unprotected long-lived resource is flagged immediately. For the strongest posture, add a Service Control Policy that denies the delete action on resources tagged production, giving two independent layers (one at the IAM gate, one at the service control plane). Detection plus prevention keeps the count where it belongs without manual policing.
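The Service Control Policy layer could be sketched as below. This is an illustration under assumptions: the tag convention Environment=production, the policy name, and the action list are all choices made for the example, and whether each action honours the aws:ResourceTag condition key should be checked against the service authorization reference before relying on it.

```shell
# Sketch: write an SCP document that denies delete calls on production-tagged
# resources. Tag key/value and the action list are assumptions for illustration.
write_scp() {
  cat > "$1" <<'JSON'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDeleteOnProductionTagged",
      "Effect": "Deny",
      "Action": [
        "rds:DeleteDBInstance",
        "rds:DeleteDBCluster",
        "cloudformation:DeleteStack",
        "elasticloadbalancing:DeleteLoadBalancer"
      ],
      "Resource": "*",
      "Condition": {
        "StringEquals": { "aws:ResourceTag/Environment": "production" }
      }
    }
  ]
}
JSON
}

# Register the policy in AWS Organizations (it still has to be attached to an
# OU or account afterwards). The policy name is hypothetical.
create_scp() {
  aws organizations create-policy --name deny-production-delete \
    --description "Deny delete on resources tagged Environment=production" \
    --type SERVICE_CONTROL_POLICY --content "file://$1"
}
```

Because the SCP is evaluated before the service sees the call, it holds even if someone has already turned the resource-level flag off, which is what makes the two layers independent.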

# Enable deletion protection on every unprotected standalone RDS instance in a region.
# This includes disposable instances; narrow the query first if those should stay unprotected.
for id in $(aws rds describe-db-instances \
  --query 'DBInstances[?DeletionProtection==`false`].DBInstanceIdentifier' --output text); do
  # Report success only if the modify call actually succeeded.
  aws rds modify-db-instance --db-instance-identifier "$id" \
    --deletion-protection --apply-immediately \
    && echo "Protected RDS instance: $id"
done

# Termination-protect every production-tagged CloudFormation stack (eyeball the list first).
aws cloudformation describe-stacks \
  --query "Stacks[?Tags[?Key=='Environment' && Value=='production']].StackName" \
  --output text | tr '\t' '\n' | while read -r stack; do
  # Skip the blank line produced when no stack matches the tag filter.
  [ -n "$stack" ] || continue
  aws cloudformation update-termination-protection \
    --stack-name "$stack" --enable-termination-protection \
    && echo "Protected stack: $stack"
done

# Deletion-protect a production load balancer (LB_ARN must hold its ARN).
aws elbv2 modify-load-balancer-attributes --load-balancer-arn "${LB_ARN:?set LB_ARN first}" \
  --attributes Key=deletion_protection.enabled,Value=true
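The DynamoDB and cluster fixes follow the same shape as the commands above. This is a sketch with hypothetical helper names; DocumentDB also has its own aws docdb modify-db-cluster command that takes the same --deletion-protection flag.

```shell
# Sketch: protect one DynamoDB table by name.
protect_table() {
  aws dynamodb update-table --table-name "$1" --deletion-protection-enabled
}

# Sketch: protect every cluster that describe-db-clusters reports as unprotected.
# As with the instance loop, narrow the query first if disposable clusters exist.
protect_unprotected_clusters() {
  local id
  for id in $(aws rds describe-db-clusters \
      --query 'DBClusters[?DeletionProtection==`false`].DBClusterIdentifier' \
      --output text); do
    aws rds modify-db-cluster --db-cluster-identifier "$id" \
      --deletion-protection --apply-immediately \
      && echo "Protected cluster: $id"
  done
}
```

Keeping each service in its own small function makes it easier to run them against the filtered production inventory rather than everything in the account.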

Keep learning

Go deeper on how the two-step delete works across the services in this capability.

You can now treat deletion and termination protection as one capability rather than a scatter of findings: inventory the protection flag across stacks, databases, tables, load balancers, container capacity and user pools, enable it on everything long-lived while leaving disposable resources alone, pair it with per-resource retention where loss is unrecoverable, and make the safe default automatic in your templates so the findings do not come back. The "Controls this lesson covers" section below links every control in this group to its deep page and fix.

Deletion protection: the cost and risk view

A family of free safeguards against an expensive, unrecoverable event

Every control in this group is a setting, not a service, so there is no line item attached to fixing it. Enabling termination protection on a stack, deletion protection on a database, or the deletion-protection attribute on a load balancer changes nothing on the AWS bill. The cost is bounded engineering time to flip the flags and fix the templates that ship them off by default.

The reason this lands on a finance or risk radar is the asymmetry. The safeguards are free, but the events they prevent (a deleted production database, a wiped CloudFormation stack that owned the whole environment, a load balancer dropped mid-deployment) are among the most expensive things that can happen to a business: lost data, regulatory exposure, recovery effort, customer credits and reputational damage that dwarf any cloud line item. Public post-mortems of accidental deletes routinely land in six figures.

From a governance standpoint, a persistent failure in this group is a finding about discipline, not dollars. It usually means infrastructure is being created from templates that do not set the safe defaults, which means the same templates are spawning other unprotected resources too. The number to watch is not a cost; it is the count of long-lived production resources failing these controls, and whether that count trends to zero and stays there.

This lesson is for the finance partner who sees a cluster of deletion-protection findings on the security report and wants to know what to do and what it costs. It covers why these are free to fix, why the residual finding count should reflect intentionally disposable resources rather than zero, and the two governance levers (safe defaults in templates, and protection as a precondition for production sign-off) that keep the count where it belongs.

How a finance partner frames the deletion-protection findings

Priya is the finance and risk partner for the platform team. Security Hub fires a spread of deletion-protection failures across CloudFormation stacks, RDS instances and load balancers in three accounts that pre-date the safe-defaults standard. Her instinct is not to ask for a budget line, because there is none: every control in this group is a setting, not a service, and flipping the flag changes nothing on the AWS bill. Her question is which of these resources is long-lived production, because those are the ones where a deletion would be unrecoverable and expensive.

The team buckets the inventory by environment tier. Two production RDS instances and several production stacks are unprotected; a handful of CI and sandbox resources are also flagged but are intentionally disposable, where protection would just add friction to automated teardowns. Priya prices each unprotected production resource as an uninsured deletion-accident exposure (recovery hours plus SLA credits plus regulatory notification) against a zero cost to fix, which makes the prioritisation self-evident. Her output for the risk pack is a count, not a dollar figure: the number of long-lived production resources failing these controls, with a note that the residual target reflects disposable resources rather than zero, and a question about whether the source templates were fixed so the count does not creep back up.

Why this belongs on the risk register

Most compliance controls involve a trade-off. This family is unusual because the remediation is free and the risk of ignoring it is quantifiable. Enabling protection adds no compute, storage or licensing charge; it is a boolean with no price tag. The relevant number is the count of long-lived production resources failing these controls and the size of the loss if one of them were deleted.

The asymmetry is the whole story. These controls are rated Low or Medium because deletion events are rare, but the cost when one happens is among the largest a business can absorb. That is the textbook case for a cheap, always-on safeguard: you do not price it against its frequency, you price it against its worst outcome, and the fact that it is free removes even the usual cost-benefit debate.

For governance, the right framing is uninsured liability. A production stack, database or load balancer without protection is an uncovered exposure with a documented cost range. Enabling protection on every long-lived resource converts that open liability to a closed one at zero premium. The finance contribution is to classify resources by environment tier, treat protection as a precondition for production sign-off, and require a recorded exception for any long-lived resource intentionally left unprotected, rather than a silent finding.

What finance can actually do about deletion protection

1. Inventory and bucket resources by environment

2. Enable protection on every long-lived production resource

3. Pair the flag with per-resource retention for stateful resources

4. Make the safe default automatic so it cannot drift back

Keep learning

You have finished the finance view of deletion and termination protection. You know every control in this group is free to fix because it is a setting and not a service, that the metric is the count of unprotected long-lived production resources rather than a dollar figure, and that the residual target reflects intentionally disposable resources rather than zero. Next time the cluster appears, you will price each production resource as an uninsured deletion-accident exposure and check that the source template was fixed, not just the live resource.

Deletion protection: the headline

Whether a single wrong command can permanently destroy production

Most production resources can be deleted with one command by anyone holding the right permission. Deletion and termination protection is a free flag that requires that command to be a deliberate two-step action: disable protection, then delete. Security Hub flags the long-lived resources where the guard is off, across stacks, databases, tables, load balancers, container capacity providers and user pools.

This is a low-frequency, high-consequence category. The controls are rated Low or Medium because the bad event is rare, but the right way to read them is as cheap insurance against a tail risk no business wants to be on the wrong side of. A clean score is a small but real signal that the organisation sets safe defaults on the infrastructure that holds its most important data and serves its most important traffic.

None of this is a cost decision. The flags are free. It is a governance decision about whether critical systems are protected by design or only by everyone remembering not to run the wrong command in the wrong terminal window.

A short read for the leader who needs to know what an unprotected production resource exposes, why closing the gap is a governance decision rather than a budget one, and what a defensible end state looks like: protection on for anything long-lived, intentionally off for disposable environments, and every exception documented.

What it looks like when the safe default is automatic

After a near-miss where a stale Terraform plan almost deleted a production database, the CTO asked the platform team a single question: can a wrong command still permanently destroy production? The honest answer was yes, because the flags that block a one-step delete defaulted to off and the old modules never set them. The fix was free; the gap was that nobody had formally decided critical systems require deliberate, two-step deletion.

The team made the safe default automatic rather than chasing the findings quarter to quarter. Protection went on by default in the approved infrastructure templates, alongside backups and encryption; per-resource retention policies were added for stateful resources so an UpdateStack that replaced a database would leave the original in place; and a Service Control Policy denied the delete action on resources tagged production. Disposable environments were left intentionally unprotected so teardowns stayed frictionless. The next review answered the CTO's question differently: production deletion now takes a deliberate two-step, enforced automatically, with disposable exceptions on the record. That standing answer, with documented exceptions, is the governance signal, not the technical depth behind it.

Why this is a leadership item

Nearly every accidental-deletion incident comes back to the same root cause: a critical resource that could be deleted in one step was deleted in one step, by an internal change against the wrong target. Deletion and termination protection makes that root cause structurally difficult, and pairing it with per-resource retention policies and Service Control Policies makes it close to impossible.

What makes it a leadership item rather than a pure engineering one is accountability and the fact that the fix is free. An unprotected production resource represents a policy gap: the team has not formally decided that critical systems require deliberate, two-step deletion. The healthy end state (protection on by default on everything long-lived, intentionally off on disposable environments, every exception documented) is a governance signal, not a technical one. It says critical systems are protected by design, not by luck, and the leadership move is to insist the safe default is automatic so the risk is structurally closed and never has to be revisited.

The leadership move on deletion protection

1. Inventory and bucket resources by environment

2. Enable protection on every long-lived production resource

3. Pair the flag with per-resource retention for stateful resources

4. Make the safe default automatic so it cannot drift back

Keep learning

Two takeaways: a critical resource that can be deleted in one step is an uninsured tail risk, and the fix is a free flag, so the only real decision is governance. Make the safe default automatic in the approved templates and back it with a Service Control Policy, leave disposable environments intentionally open, and document every exception. A standing yes to whether production is protected by design tells you the discipline is healthy.

Controls this lesson covers

One capability, many AWS Security Hub controls. This lesson is the shared playbook; each control below keeps its own deep page with the exact check, severity and a copy-and-paste fix.

CloudFormation

Cognito

DocumentDB

DynamoDB

ECS

ELB

RDS
