Site Reliability

Protect RDS instances with AWS Backup

Native RDS backups die with the database — bring your RDS instances and Aurora clusters under a centralized AWS Backup plan so they're protected by policy, not per-DB settings.

13 min·10 sections·AWS

Last reviewed 27 May 2026

Unprotected RDS instances: the basics

Why native automated backups aren't the whole story

Every RDS instance can take native automated backups: daily snapshots plus transaction logs that give you point-in-time recovery (PITR) anywhere in a retention window of 1 to 35 days. That's excellent for fast operational recovery: a fat-fingered DELETE, a bad migration, a corrupted table at 2pm. But native backups have one fatal property: they live with the instance. They sit in the same account, can't be retained beyond 35 days, and by default are deleted when the database is deleted. RDS can retain automated backups or take a final snapshot at deletion time, but whoever deletes the instance makes that choice, and anything retained stays in the same account. A rogue admin, a compromised root credential, or terraform destroy against the wrong workspace can take the database and every native backup in a single action.

AWS Backup is the centralized alternative. Instead of each DB owning its own retention setting, a backup plan defines policy once — schedule, retention, lifecycle, copy rules — and selects resources by tag. Recovery points land in a separate backup vault that can live in another Region and another AWS account entirely. Crucially, they survive deletion of the source database. Delete the RDS instance and the AWS Backup recovery point is still sitting in the vault, ready to restore. For ransomware resilience, rogue-admin protection, and any compliance regime that demands isolated, long-retained, immutable copies, this is the difference between a recoverable incident and a resume-generating one.

Continuity check COV-002 ("Unprotected RDS Instances") cross-references every RDS instance and Aurora cluster against the resources covered by an AWS Backup plan. A database with only native automated backups and no AWS Backup coverage fails the check — because native backups alone do not survive the deletion of the thing they protect. Severity is HIGH: the gap is invisible until the worst possible moment, and by then there's nothing left to restore from.

In this lesson you'll learn the difference between native RDS automated backups and centralized AWS Backup, why the two are complementary rather than competing, and how to detect RDS databases that have no AWS Backup coverage. You'll see how tag-based resource selection auto-protects new databases, how backup vaults plus Vault Lock give you immutable WORM copies, how lifecycle rules manage retention (and cold-storage tiering where the resource type supports it), and how cross-Region and cross-Account copy give you a backup an attacker in the production account can't reach. You'll get the exact CLI calls to find unprotected databases, attach them to a plan by tag, and kick off an on-demand backup of a specific RDS ARN.

Fun fact

The deletion that took the backups with it

In a widely-discussed 2014 incident, a code-hosting startup was hit by an attacker who gained access to its AWS console and deleted the production database, the EC2 instances, and — critically — the backups, all from the same account in a matter of minutes. The native backups lived in the same blast radius as production, so a single set of stolen credentials was enough to erase everything. The company never recovered and shut down within days. The lesson the whole industry took away: a backup that an attacker in your production account can delete is not a backup. AWS Backup's cross-account copy into a separate, Vault-Locked account exists precisely so the same set of credentials can't reach both production and its recovery points.

Closing the RDS coverage gap in action

Marco is the SRE on call when COV-002 fires for the production account: 9 RDS instances and 2 Aurora clusters with no AWS Backup coverage, 4 of them flagged HIGH because they're tagged Environment=prod and DataClass=pii. The flagged set includes the primary orders database — exactly the system of record that has to survive a worst-case event, not just a Tuesday-afternoon bad migration.

He doesn't assume the absence of AWS Backup means no backups at all. Native automated backups are almost certainly on with a 7-day window, which is fine for fast PITR. What's missing is the isolated, long-retained, deletion-proof copy — and AWS Backup's list-protected-resources endpoint is the source of truth for who has one. Anything not in that list has nothing that survives deletion of the source database, regardless of what the native retention setting says.

He starts by cross-referencing the live RDS inventory against the set of resources AWS Backup currently protects, scoped to RDS and Aurora.

First, build the coverage gap query — every RDS instance minus everything AWS Backup currently considers protected.

$ comm -23 \
    <(aws rds describe-db-instances --query 'DBInstances[].DBInstanceArn' --output text | tr '\t' '\n' | sort) \
    <(aws backup list-protected-resources --query "Results[?ResourceType=='RDS'].ResourceArn" --output text | tr '\t' '\n' | sort)
arn:aws:rds:us-east-1:123456789012:db:orders-prod
arn:aws:rds:us-east-1:123456789012:db:billing-prod
arn:aws:rds:us-east-1:123456789012:db:identity-prod
arn:aws:rds:us-east-1:123456789012:db:analytics-stage
# ... 5 more ...
# 9 databases have native backups only; nothing survives a delete of the instance.

The set-difference between live RDS instances and AWS Backup protected resources is the coverage gap.
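The query above covers DB instances only. Aurora clusters appear in AWS Backup under their own resource type, so the same set difference can be run for them. The following is a minimal sketch, assuming the resource type string is Aurora and that Aurora engine names start with "aurora". The function name aurora_coverage_gap is hypothetical, and temp files stand in for process substitution so it also runs under plain sh.

```shell
# Sketch: Aurora clusters with no AWS Backup coverage (hypothetical helper name).
aurora_coverage_gap() (
  inventory=$(mktemp) && protected=$(mktemp) || exit 1

  # Live Aurora clusters. The Engine filter is an assumption that skips
  # other cluster types returned by the same RDS API call.
  aws rds describe-db-clusters \
    --query "DBClusters[?starts_with(Engine, 'aurora')].DBClusterArn" \
    --output text | tr '\t' '\n' | sed '/^$/d' | sort > "$inventory"

  # Clusters AWS Backup already reports as protected.
  aws backup list-protected-resources \
    --query "Results[?ResourceType=='Aurora'].ResourceArn" \
    --output text | tr '\t' '\n' | sed '/^$/d' | sort > "$protected"

  # Lines present only in the inventory are the coverage gap.
  comm -23 "$inventory" "$protected"
  rm -f "$inventory" "$protected"
)
```

Calling aurora_coverage_gap prints one cluster ARN per line. An empty result means every Aurora cluster the call can see is already listed as protected.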

Create a tag-based selection on the backup plan so every database tagged BackupRequired=true is picked up automatically at the next plan run — including databases that don't exist yet.

$ aws backup create-backup-selection \
    --backup-plan-id 8a2c5e9f-prod-daily-35d \
    --backup-selection 'SelectionName=tag-based-rds,IamRoleArn=arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole,ListOfTags=[{ConditionType=STRINGEQUALS,ConditionKey=BackupRequired,ConditionValue=true}]'
{
  "SelectionId": "b1d4...selection",
  "BackupPlanId": "8a2c5e9f-prod-daily-35d",
  "CreationDate": "2026-05-26T10:14:22.000Z"
}
# The default service role is created under the /service-role/ path; adjust the ARN if you use a custom role.
# Now tag the unprotected databases for inclusion:
$ aws rds add-tags-to-resource --resource-name arn:aws:rds:us-east-1:123456789012:db:orders-prod \
    --tags Key=BackupRequired,Value=true Key=BackupTier,Value=daily-35d
# Tag-based selection means coverage scales with the fleet, not with engineering hours.

One selection rule, tag-driven — new databases inherit protection the moment they're tagged.

For an immediate isolated copy of a flagged database, kick off an on-demand backup job against the RDS ARN. This lands a recovery point in the vault now, before the next scheduled window.

$ aws backup start-backup-job \
    --backup-vault-name prod-isolated-vault \
    --resource-arn arn:aws:rds:us-east-1:123456789012:db:orders-prod \
    --iam-role-arn arn:aws:iam::123456789012:role/service-role/AWSBackupDefaultServiceRole \
    --lifecycle DeleteAfterDays=2555
{
  "BackupJobId": "3f7c-8a91-orders-prod",
  "CreationDate": "2026-05-26T10:18:44.000Z"
}
# Recovery point lands in prod-isolated-vault. Plan copy rules apply to scheduled jobs, so replicate
# this on-demand point off-account with aws backup start-copy-job once the job completes.
# DeleteAfterDays=2555 = 7-year retention, far beyond native's 35-day ceiling.
# MoveToColdStorageAfterDays is omitted: at the time of writing AWS Backup ignores it for RDS recovery points.

On-demand backup gives you an isolated, long-retained recovery point immediately — no waiting for the schedule.
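start-backup-job returns as soon as the job is queued, so the recovery point does not exist yet when the command exits. Below is a minimal polling sketch built on aws backup describe-backup-job. The helper name wait_for_backup_job and the POLL_SECONDS variable are hypothetical, and the 30-second default interval is an assumption.

```shell
# Sketch: block until an AWS Backup job reaches a terminal state.
wait_for_backup_job() {
  job_id=$1
  while :; do
    # State is one of CREATED, PENDING, RUNNING, ABORTING, ABORTED,
    # COMPLETED, FAILED, EXPIRED, PARTIAL.
    state=$(aws backup describe-backup-job --backup-job-id "$job_id" \
              --query 'State' --output text) || return 2
    case "$state" in
      COMPLETED)
        echo "$job_id: COMPLETED"
        return 0 ;;
      FAILED|ABORTED|EXPIRED|PARTIAL)
        echo "$job_id: $state" >&2
        return 1 ;;
      *)
        # Still in flight; wait before asking again.
        sleep "${POLL_SECONDS:-30}" ;;
    esac
  done
}
```

A typical call would be wait_for_backup_job 3f7c-8a91-orders-prod, followed by the list-recovery-points-by-resource check to confirm the point landed in the vault.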

Native backups vs AWS Backup under the hood

Native RDS automated backups are storage-level snapshots plus a continuous stream of transaction logs, both retained for a window you set between 1 and 35 days. The transaction logs are what enable point-in-time recovery to any second in that window, up to the latest restorable time. They're free up to the size of your database, take no operational effort, and are perfect for fast operational recovery. Their constraint is architectural: they're a property of the DB instance, stored in AWS-managed storage tied to your account, capped at 35 days, and by default reaped when the instance is deleted. Automated backups can't be kept beyond 35 days or placed somewhere the source account can't reach. Manual snapshots can outlive the window, but they're a per-database chore rather than a policy.

AWS Backup turns backup into a policy object that lives independently of the database. A backup plan defines schedule, retention (up to indefinite), lifecycle transition to cold storage, and copy rules; a backup selection binds resources to the plan by tag or ARN. Recovery points land in a backup vault — a logical container with its own access policy and KMS key. Two vault features change the risk profile entirely: Vault Lock enforces WORM (write-once-read-many) immutability so that, in compliance mode, not even the AWS account root can delete a recovery point before its retention expires; and cross-Region / cross-Account copy rules replicate recovery points into a vault in a different Region and a different AWS account. That isolated copy is the one that survives ransomware, a compromised root credential, or a delete-db-instance against the wrong account.
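To make the "policy object" concrete, here is a hedged sketch of a plan document with one daily rule and one copy action, in the JSON shape accepted by aws backup create-backup-plan. The plan name, rule name, source vault, schedule, and destination vault ARN are all assumptions for illustration. The destination vault is shown in a second account in the same Region, because support for combining cross-Region and cross-account in a single copy action varies by resource type and should be checked for your engine.

```shell
# Sketch: render a backup plan document (hypothetical helper and names).
render_backup_plan() {
  dest_vault_arn=$1            # vault in the isolated backup account
  retention_days=${2:-2555}    # 2555 days is roughly 7 years
  cat <<EOF
{
  "BackupPlanName": "daily-7y-isolated",
  "Rules": [
    {
      "RuleName": "daily",
      "TargetBackupVaultName": "prod-isolated-vault",
      "ScheduleExpression": "cron(0 5 ? * * *)",
      "StartWindowMinutes": 60,
      "CompletionWindowMinutes": 360,
      "Lifecycle": { "DeleteAfterDays": $retention_days },
      "CopyActions": [
        {
          "DestinationBackupVaultArn": "$dest_vault_arn",
          "Lifecycle": { "DeleteAfterDays": $retention_days }
        }
      ]
    }
  ]
}
EOF
}

# Usage (not run here):
#   render_backup_plan arn:aws:backup:us-east-1:210987654321:backup-vault:dr-locked-vault > plan.json
#   aws backup create-backup-plan --backup-plan file://plan.json
```

Keeping the plan as a rendered file means it can be reviewed and versioned like any other policy, and the same document can be reused per tier by changing only the retention argument.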

The two are complementary, not competing, and a mature setup runs both. Native automated backups handle the high-frequency, low-latency case: "a migration corrupted a table ten minutes ago, rewind PITR." AWS Backup handles the low-frequency, high-stakes case: "the account was compromised / we need a 7-year retained copy for the auditor / the Region is down." Lifecycle rules in the plan expire recovery points automatically and, for resource types that support it, transition older ones to cold storage. At the time of writing AWS Backup ignores the cold-storage setting for RDS and Aurora recovery points, so multi-year retention for databases is billed at the warm snapshot rate; check the feature availability table before budgeting around a cold tier. RDS snapshot storage is incremental, so the marginal cost of each additional recovery point is only the changed blocks.

# Confirm a database has ONLY native backups and no AWS Backup recovery points.
# 1. Native automated backup window (lives with the instance, max 35 days):
aws rds describe-db-instances \
  --db-instance-identifier orders-prod \
  --query 'DBInstances[0].{Retention:BackupRetentionPeriod,Window:PreferredBackupWindow}'

# 2. Any AWS Backup recovery points for the same ARN (survive instance deletion):
aws backup list-recovery-points-by-resource \
  --resource-arn arn:aws:rds:us-east-1:123456789012:db:orders-prod \
  --query 'RecoveryPoints[].{Arn:RecoveryPointArn,Vault:BackupVaultName,Status:Status,Created:CreationDate}' \
  --output table

# Empty result from step 2 = native-only = fails COV-002.

What is the impact of leaving an RDS instance unprotected?

The direct impact is binary, exactly like EC2: when a recovery event happens, either you have a copy that survives the event or you don't. For a database the failure modes that matter most are the ones native backups can't help with — a deleted instance (and with it every native snapshot), a compromised account where an attacker wipes everything reachable, or a need to restore data older than the 35-day native ceiling. In all three cases a database relying on native backups alone has nothing to restore from. The recovery path collapses to "reconstruct from application logs and replicas if any survived," which for a system of record means hours to days of downtime and probable permanent data loss.

The second-order impact is decision pressure during the incident, magnified because it's data rather than compute. Lose a stateless app server and you rebuild it. Lose the orders database and its only backups together, and the incident commander is choosing between "restore from a manual snapshot someone took for a migration four months ago," "reconstruct balances from event logs and accept the gaps," or "tell customers their data is gone." A current, isolated AWS Backup recovery point reduces that entire decision tree to "restore the recovery point into a new instance and verify."

On the regulatory side the bar is higher for databases than almost anything else, because they hold the regulated data itself. SOC 2 CC9.1, ISO 27001 A.12.3, PCI-DSS requirement 12.10, and HIPAA's contingency-planning rule all expect demonstrable, tested, and crucially isolated backups for systems holding regulated records. A database flagged unprotected by your own continuity check is documented awareness of a gap — and once the gap is a known, recorded finding, leaving it open is far worse in an audit than never having checked. Immutable Vault-Locked copies are increasingly the explicit control auditors look for as ransomware-recovery expectations harden.

The cost side is real but modest, and the asymmetry is the whole point. AWS Backup for RDS is billed as snapshot storage, roughly $0.095/GB-month in US-East, and it's incremental, so each new recovery point bills only changed blocks. At the time of writing there's no cheaper cold tier for RDS recovery points, so long retention is billed at that same rate. For a 200 GB database with daily backups, 35-day retention, and a cross-account copy, expect somewhere in the range of $15-40/month all-in, depending on how much of the volume is actually used and how fast it changes. Against that you're insuring the system of record: the single most expensive thing on the bill to lose, and the only one you genuinely cannot rebuild. This is the cheapest insurance policy you'll buy and the one you'll be most grateful for exactly once.
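The arithmetic behind an estimate like this can be sketched in a few lines. This is a rough model under stated assumptions, not a pricing calculator: it treats the first recovery point as a full copy of the used data, every later daily point as a fixed fraction of changed blocks, and each copy as billed at the same rate. The function name and the example inputs (100 GB used, 1% daily change) are hypothetical, and data-transfer charges are ignored.

```shell
# Sketch: rough monthly storage cost for incremental daily recovery points.
# Args: used_gb daily_change_fraction retention_days price_per_gb_month copies
estimate_backup_cost() {
  awk -v used_gb="$1" -v daily_change="$2" -v retention_days="$3" \
      -v price="$4" -v copies="$5" 'BEGIN {
    # One full point plus (retention_days - 1) incremental points.
    stored_gb = used_gb + used_gb * daily_change * (retention_days - 1)
    printf "%.2f\n", stored_gb * price * copies
  }'
}

# 100 GB used, 1% daily change, 35-day retention, $0.095/GB-month, 2 copies:
# stored = 100 + 100 * 0.01 * 34 = 134 GB per copy
estimate_backup_cost 100 0.01 35 0.095 2   # prints 25.46
```

Varying the change rate shows why the estimate is a range: the full copy sets the floor, and churn decides how far above it the bill lands.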

How do you bring an RDS instance under protection?

Closing the gap is a four-step loop: find what's exposed, decide the protection tier each database needs, bring it under a centralized plan with isolation, and make sure new databases can't slip through uncovered.

1. Inventory the coverage gap and confirm it's native-only

Cross-reference describe-db-instances (and describe-db-clusters for Aurora) against backup list-protected-resources scoped to RDS. The set difference is your gap. For each database in the gap, confirm what protection actually exists: native automated backups are almost always on, but they're capped at 35 days and die with the instance. A database with native backups only still fails the check, because the failure modes that matter — deletion, account compromise, long-retention compliance — are exactly the ones native backups can't cover. Prioritize by data class: production and regulated databases first.

2. Define plans by tier and select resources by tag

Create one or two backup plans (e.g. daily-35d for general production, daily-7y-isolated for regulated data) with tag-based resource selection on BackupRequired=true and a BackupTier tag that routes to the right plan. Tagging for inclusion makes coverage scale with the fleet — every new database gets the tag in its Terraform module or launch process and AWS Backup picks it up at the next plan run. Native automated backups stay on alongside this: they're complementary. Native handles fast PITR; the centralized plan handles isolated long-term recovery.

3. Isolate and immutabilize: separate vault, cross-account copy, Vault Lock

The recovery point has to land somewhere an attacker in the production account can't reach. Configure the plan's copy rule to replicate into a vault in a separate AWS account (and ideally a different Region); that's what survives a compromised root credential or a delete-db-instance against the wrong account. Apply AWS Backup Vault Lock in compliance mode on the destination vault so recovery points are WORM-immutable and cannot be deleted before retention expires, even by the account root. Set the lifecycle's DeleteAfterDays to the retention each tier requires. AWS Backup currently ignores MoveToColdStorageAfterDays for RDS and Aurora, so budget multi-year retention at the warm rate.

4. Prevent recurrence with AWS Config and IaC defaults

Enable the AWS Config managed rules rds-resources-protected-by-backup-plan (DB instances) and aurora-resources-protected-by-backup-plan (Aurora clusters) to flag any database that isn't covered by a backup plan. For prevention, bake BackupRequired=true and the appropriate BackupTier into your Terraform/CloudFormation modules for every database pattern, and lint pull requests to flag new RDS resources without a backup tag. The goal is that an unprotected production database simply can't be created: protection is a property of the module, not a step someone has to remember.

# Bulk-tag every RDS instance in the account that isn't already protected by AWS Backup.
UNPROTECTED=$(comm -23 \
  <(aws rds describe-db-instances \
      --query 'DBInstances[].DBInstanceArn' --output text | tr '\t' '\n' | sort) \
  <(aws backup list-protected-resources \
      --query "Results[?ResourceType=='RDS'].ResourceArn" --output text | tr '\t' '\n' | sort))

for arn in $UNPROTECTED; do
  aws rds add-tags-to-resource --resource-name "$arn" \
    --tags Key=BackupRequired,Value=true Key=BackupTier,Value=daily-35d
done

# Verify coverage after the next plan run (date -d is GNU date syntax).
aws backup list-backup-jobs \
  --by-state COMPLETED --by-resource-type RDS \
  --by-created-after "$(date -u -d '24 hours ago' +%FT%TZ)" \
  --query 'BackupJobs[].ResourceArn'
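Steps 3 and 4 above can be sketched the same way. The helpers below wrap aws backup put-backup-vault-lock-configuration and aws configservice put-config-rule. The function names, vault name, and retention values are hypothetical, and the Aurora rule identifier is assumed to follow the same naming pattern as the RDS one, so confirm both identifiers against the AWS Config managed rules list before relying on them.

```shell
# Step 3 sketch: lock the destination vault (run in the backup account).
# Setting --changeable-for-days selects compliance mode after that grace
# period; 3 days is the minimum and the lock cannot be removed afterwards.
lock_vault() {
  vault=$1 min_days=$2 max_days=$3 grace_days=${4:-3}
  aws backup put-backup-vault-lock-configuration \
    --backup-vault-name "$vault" \
    --min-retention-days "$min_days" \
    --max-retention-days "$max_days" \
    --changeable-for-days "$grace_days"
}

# Step 4 sketch: enable the managed Config rules for instances and clusters.
enable_backup_coverage_rules() {
  for id in RDS_RESOURCES_PROTECTED_BY_BACKUP_PLAN \
            AURORA_RESOURCES_PROTECTED_BY_BACKUP_PLAN; do
    # Derive the rule name from the identifier: lower-case, hyphenated.
    name=$(printf '%s' "$id" | tr 'A-Z_' 'a-z-')
    aws configservice put-config-rule --config-rule \
      "{\"ConfigRuleName\":\"$name\",\"Source\":{\"Owner\":\"AWS\",\"SourceIdentifier\":\"$id\"}}"
  done
}
```

For example, lock_vault dr-locked-vault 35 2555 would enforce a minimum retention of 35 days and a maximum of 2555 days on the vault. Test a compliance-mode lock on a throwaway vault first, since it becomes irreversible once the grace period ends.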

Quick quiz


Your production orders database has native automated backups with a 14-day retention window. COV-002 flags it as unprotected. What's the right next move?

Keep learning

Dig deeper into centralized backup, immutability, cross-account isolation, and continuous coverage detection for databases.

You've completed Protect RDS instances with AWS Backup. You now know why native automated backups — capped at 35 days and deleted with the instance — aren't enough on their own, how a centralized AWS Backup plan adds isolated, immutable, long-retained copies via tag-based selection, Vault Lock, and cross-account/cross-Region copy, and that the two are complementary. The next time COV-002 fires on a production database, you'll have a four-step loop ready: inventory the gap, tier the protection, isolate the copy, and prevent recurrence.


Unprotected RDS instances: what it means for risk

A protection gap that costs nothing until it costs everything

When engineers say a database "has backups," they usually mean the automatic daily copy that AWS takes for free. That copy is genuinely useful — it lets the team rewind a database to any moment in the last few weeks. But it has a property nobody mentions until an incident: it lives inside the same account as the database, can't be kept longer than 35 days, and gets deleted the instant the database is deleted. If an attacker, a disgruntled insider, or an automation error removes the database, the built-in backup goes with it. The protection and the thing it protects share a single point of failure.

This finding flags databases that rely only on those built-in backups and aren't enrolled in a centralized backup plan. The fix is cheap insurance: a policy-driven copy that lands in a separate, locked vault — ideally in another account and another Region — and survives even if the original database is wiped. The cost of that protection is modest, typically a few dollars to low tens of dollars per database per month in storage. The exposure it covers is not modest: the cost of a multi-day customer-data outage, a failed audit, or a ransomware event that finds your only backups sitting in the same blast radius as production.

From a risk and budgeting standpoint, the right framing is exposure, not line-item. The relevant question at the operational review isn't "how much does the backup cost?" — it's "what is our exposure if this database and its backups disappear together, and is every regulated database covered by a copy that an attacker or an accident can't reach?" Good looks like 100% coverage of production and regulated databases under a centralized plan, with copies isolated in a separate account. A coverage number below that is a quantified, named risk, not a cost-optimization opportunity.

This lesson is for the finance partner who sees "backups are on" and assumes the risk is covered. It walks through why built-in backups share a fate with the database they protect, what the centralized alternative costs versus the exposure it removes, what a sensible coverage target looks like (100% of production and regulated databases), and the questions to ask at the operational review when the coverage number isn't where it should be. By the end you'll know how to frame this as quantified risk rather than a discretionary spend line, and what to push engineering on when a regulated database shows up uncovered.


How a finance partner frames the exposure

Sam is the finance partner embedded with the platform team. At the quarterly operational review, the engineering lead reports that "all databases have backups enabled." Sam asks the question that's now standard on the agenda: "How many of our production and regulated databases have a backup copy that survives the database being deleted, and that an attacker in the production account can't reach?" The honest answer is 4 of the 11 are uncovered by the centralized plan — they only have the built-in backups that live in the same account and vanish with the instance.

The conversation that follows isn't technical. Sam doesn't ask about retention windows, snapshot dedup, or KMS keys. She asks three things: which databases hold customer or regulated data, whether each one has an isolated copy in a separate account, and what the recovery story is if the production account itself is compromised. The framing is exposure, not cost — the centralized copies add maybe $40 a month across the whole estate, while the exposure of losing the orders database and its only backups together is a company-ending event and a near-certain audit failure.

Engineering commits to bringing 100% of production and PII databases under the centralized plan with cross-account copy within the sprint. Sam adds "% of regulated databases with isolated, deletion-proof backups" as a standing line on the operational review — not a dollar figure, a coverage percentage. She knows the target is 100% for that tier and that anything less is a named, quantified risk she can put in front of the audit committee, not a cost-optimization opportunity to be traded away.

Why this is exposure, not a cost line

The spend involved is small and easy to approve — a few dollars to low tens of dollars per database per month for the isolated, long-retained copy. The mistake is treating that number as the thing to optimize. The relevant number isn't the cost of the backup; it's the magnitude of the exposure it covers. For a system of record, the exposure side of the ledger is a multi-day outage, a regulatory penalty, and in the genuine worst case a company-ending data-loss event. No reasonable view of that asymmetry justifies trimming the protection to save tens of dollars.

The exposure is also concentrated and non-diversifiable, which is what makes it a board-level risk rather than a routine cost. A few specific databases — the orders DB, the billing DB, the identity store — carry almost all of the consequence. You cannot average this risk across the estate the way you can with idle compute; one uncovered production database is one uncovered production database, and its loss is not offset by the other ten being fine. That concentration is exactly why the right metric is coverage of the critical tier, expressed as a percentage, with a target of 100%.

There's a compliance and audit-credibility dimension that finance feels directly. When an auditor or regulator asks to see the backup-and-recovery control for regulated data, "we have automatic backups" is no longer a sufficient answer if those backups share an account and a fate with production. An isolated, immutable, tested copy is the evidence that satisfies SOC 2, PCI-DSS, and HIPAA contingency requirements. A documented coverage gap on a regulated database is a finding waiting to happen and a far harder conversation than the spend ever was.

Finally, the coverage percentage is a leading indicator of DR maturity. If production and regulated databases drift below 100% centralized, deletion-proof coverage, it almost always means tagging discipline or the backup-plan-by-default convention is slipping — the same slippage that predicts gaps in EBS protection, cross-Region failover, and recovery testing. Watch the regulated-tier coverage number as a signal of whether continuity engineering is actually keeping pace with the estate, not as a cost to be minimized.

What finance can actually do about this

Finance can't configure a backup vault, but it can make sure the right risk is being managed and that protection isn't traded away for trivial savings. Four levers, used together at the operational review.

1. Track coverage of the critical tier as a percentage, not a dollar

Add "% of production and regulated databases with isolated, deletion-proof backups" as a standing line on the operational review. The target is 100% for that tier. This is a risk metric, not a cost metric — frame it that way so it can't be quietly de-prioritized against a few dollars of storage. Anything below 100% is a named, quantified exposure you can put in front of the audit committee.

2. Tie it to the audit and compliance calendar

Demonstrable, isolated, tested database backups are explicit controls under SOC 2, PCI-DSS, and HIPAA. Make sure the coverage number is reviewed ahead of every audit cycle and that a restore test has actually been run — "we have backups" and "we have tested that we can restore" are different claims, and only the second one satisfies an auditor. Finance owning the calendar link keeps this from being discovered cold during an audit.

3. Protect the protection spend from cost-cutting

When a cost-optimization sweep looks at storage line items, isolated backup copies for the system of record must be explicitly out of scope. The exposure they cover dwarfs the spend by orders of magnitude. The lever here is making sure the backup line is categorized as risk mitigation, not discretionary storage, so it isn't trimmed by a well-meaning savings initiative that doesn't understand what it's cutting.

4. Treat any regulated-database gap as escalate-now, not monitor

Unlike diffuse waste categories, this one doesn't have an acceptable residual for the critical tier. A single uncovered production or regulated database is a concentrated risk that isn't offset by the others being fine. The right response to a gap there isn't to watch the trend — it's to escalate immediately and get it closed within the current sprint, because the cost of waiting is asymmetric and unbounded.

Quick quiz


An auditor asks to see the backup-and-recovery control for your PII databases. Coverage of the regulated tier under the centralized, isolated plan is currently at 80%. As the finance partner, what's the right next move?


You've finished the finance partner's view of unprotected RDS instances. You know why the built-in backup everyone assumes is sufficient shares a fate with the database it protects, why this is exposure rather than a cost line, and what the four finance levers are: track critical-tier coverage as a percentage, tie it to the audit calendar, protect the protection spend from cost-cutting, and escalate any regulated-database gap immediately rather than monitoring it. Next time 'backups are on' comes up at the operational review, you'll have a sharper question: does every regulated database have an isolated, tested copy that survives losing the environment?


Unprotected RDS instances: the headline

Business continuity that depends on backups living in the same blast radius

Databases are where the business keeps the record that matters — customers, orders, transactions, regulated data. The default "automatic backup" AWS provides lives in the same account as the database and is deleted the moment the database is deleted. If a single compromised credential or operator error removes the production database, the safety net goes with it. That is a continuity risk hiding behind a feature everyone assumes is sufficient.

Closing this gap is inexpensive and largely a policy decision: keep an isolated, immutable copy of every important database in a separate account and Region, retained long enough to satisfy auditors and ransomware-recovery scenarios. The headline isn't the cost of the backups — it's whether the company could still recover its system of record after a worst-case event. A coverage gap here is one of the cleanest signals that disaster-recovery posture is aspirational rather than real.

A five-minute read on a continuity risk that hides behind a feature everyone assumes is enough, written for the exec who wants the headline and the one question to ask. You'll get the rule-of-thumb framing — backups that share a blast radius with production aren't really backups — what a coverage gap signals about wider DR readiness, and what "good" looks like at an org level. No commands, no implementation detail.


What it looks like when the org gets this right

At one company, the disaster-recovery section of the board pack used to say "automated database backups: enabled." Technically true, and quietly misleading — those backups lived in the same account as production and would be gone in any scenario where production itself was lost. The exec sponsor stopped accepting "backups: enabled" and started asking a sharper question: "If someone deleted our production account tomorrow, could we still recover the customer database?"

Within a quarter the answer changed from "probably not" to "yes, within hours, from an isolated copy in a separate account in another Region." The board pack line became "100% of regulated databases have an immutable, off-account backup copy, tested quarterly." The cost of getting there was trivial next to the exposure it removed; the value was that the continuity claim was now true rather than aspirational.

That's the right outcome state. The goal isn't "backups are on" — it's "every system of record has a recovery copy that survives the loss of the environment that holds it." Coverage of the regulated tier becomes a confidence signal at the leadership review, not a recurring action item.

Why this is on the report at all

The cost in this category is trivial; it's tracked because of what a gap exposes. Databases hold the system of record, and the default backup that everyone assumes is sufficient shares an account and a fate with the thing it's supposed to protect. A coverage gap on a production or regulated database is a concentrated, non-diversifiable continuity risk — the kind where a single event (a compromised credential, an operator error, a ransomware payload) can take both production and its only safety net at once. That is precisely the scenario that ends companies, so it belongs on the risk register, not buried in a cost report.

There's a regulatory and reputational dimension too. Demonstrable, isolated, tested database backups are an explicit expectation under SOC 2, PCI-DSS, and HIPAA, and increasingly the specific control auditors probe as ransomware-recovery scrutiny intensifies. A known, documented gap on a regulated database is worse in an audit than the spend was ever going to be. So this sits at the intersection of business continuity, security posture, and compliance — and the leadership question is whether the company could still recover its system of record after losing the environment that holds it.

The leadership move on this category

The actionable handle for an executive isn't to manage the backup spend — it's to insist the continuity claim is true rather than aspirational, and to set the norms that keep it true.

1. Demand isolation, not just "backups are on"

Insist that every system of record has a recovery copy in a separate account and Region — one that survives losing the production environment entirely. "Backups enabled" is not the same as "recoverable after a worst-case event," and the gap between those two claims is exactly where companies have died.

2. Require a tested restore, not just an existing backup

A backup that has never been restored is a hypothesis. Ask for evidence of a periodic restore test of the critical databases — recovery time and recovery point actually measured, not assumed. This converts continuity from a slide that says 'enabled' into a number you can trust under pressure.

3. Make critical-tier coverage a confidence signal at the leadership review

Ask one question: "Are 100% of our regulated databases covered by an isolated, immutable, tested backup?" A 'yes' for several quarters running means DR posture is real; anything else is a quantified risk to act on now. It's a one-minute item that tells you whether continuity engineering is keeping pace, without any technical depth.

Quick quiz


You're reviewing the DR section of the board pack. It states '100% of regulated databases have an immutable, off-account backup copy, tested quarterly.' What's the right read?


That's the lesson. Two takeaways worth holding onto: a backup that lives in the same blast radius as production isn't really a backup, and the coverage of your regulated database tier is a continuity signal, not a cost line. The leadership question is simple — could we still recover the system of record after losing the environment that holds it?


Part of the learning path Build in resilience
