Aurora Backtracking: the basics
What does Security Hub RDS.14 actually check?
Aurora Backtracking is an Aurora-MySQL-only feature that lets you rewind the entire cluster to a point in the recent past (up to 72 hours back) without restoring a snapshot or creating a new cluster. The cluster keeps a change log alongside its data pages; triggering a backtrack replays the cluster backwards to your chosen timestamp and resumes in place. There is no DNS swap, no new endpoint, and no application reconfiguration, and the rewind itself typically completes in seconds to minutes.
Security Hub control RDS.14 fails when an Aurora MySQL cluster has BacktrackWindow=0, meaning backtracking is disabled. The check is binary: either the cluster was created with a non-zero window (the value is given in seconds, up to 259200, which is 72 hours), or it wasn't. There's no "enable on existing cluster" path; if backtracking wasn't enabled at create time you can't bolt it on later. The only ways to get it are to clone the cluster, restore it from a snapshot, or migrate to a fresh cluster, each time with the window set.
The control exists because point-in-time recovery via snapshot restore takes 20-60 minutes for a sizeable cluster, requires a new endpoint, and forces every application to cut over. Backtracking turns the same recovery into a 30-second operation against the same endpoint — the difference between a five-minute incident and a two-hour one when someone runs DROP TABLE customers at 3pm on a Tuesday.
In this lesson you'll learn what Aurora Backtracking does, when it can save you and when it can't, the cost trade-off of enabling a 72-hour window, and the exact CLI flow to investigate a failed RDS.14 finding and remediate it. You'll see the describe call that surfaces the misconfiguration, the clone-and-cutover required to enable backtracking on a cluster that didn't have it, and the SCP pattern to prevent the next cluster from being created without it.
The 30-second DROP TABLE rescue
At a fintech in 2021, a senior engineer running a one-off migration accidentally ran DROP TABLE transactions against prod instead of staging: 240M rows, six years of history. Snapshot restore was quoted at 47 minutes. The on-call lead noticed backtracking was enabled with a 24-hour window, triggered a rewind to 90 seconds before the drop, and the cluster was serving traffic again in 34 seconds. The post-mortem put the cost at about $0.40 in backtrack storage, against a saved trading day.
Investigating an RDS.14 failure in action
Marco is reviewing a fresh batch of Security Hub findings on a Monday morning. RDS.14 has fired against prod-orders-cluster — an Aurora MySQL 8.0 cluster carrying the order pipeline. Severity: MEDIUM. The cluster has been running for 14 months, hosts six application services, and has never had backtracking enabled.
He doesn't immediately panic — RDS.14 is a preventative control, not a breach signal. But the cluster is exactly the kind of workload where backtracking pays for itself: high write volume, multiple services touching the same tables, and a recent history of "someone ran the wrong migration" incidents that took 30+ minutes to restore from snapshot.
He starts by confirming the current configuration.
First, describe the cluster and slice to just engine, version, and backtrack window — the three fields that decide whether RDS.14 passes.
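A minimal sketch of that describe call, wrapped in shell functions so it can be reused across clusters. The function names are hypothetical, and treating engines other than aurora-mysql as not applicable is an assumption about how the control scopes itself, not something this lesson states.

```shell
# Hypothetical helper: print engine, engine version, and backtrack window
# (tab-separated) for one cluster.
describe_backtrack_fields() {
  aws rds describe-db-clusters \
    --db-cluster-identifier "$1" \
    --query 'DBClusters[0].[Engine,EngineVersion,BacktrackWindow]' \
    --output text
}

# Hypothetical helper: turn engine + window (seconds) into a verdict.
# Assumption: a missing window (printed as "None" by --output text) counts as 0.
rds14_verdict() {
  engine="$1"
  window="${2:-0}"
  if [ "$window" = "None" ]; then
    window=0
  fi
  if [ "$engine" != "aurora-mysql" ]; then
    echo "NOT_APPLICABLE"
  elif [ "$window" -gt 0 ]; then
    echo "PASSED"
  else
    echo "FAILED"
  fi
}
```

Running `describe_backtrack_fields prod-orders-cluster` gives the three fields for Marco's cluster; feeding an aurora-mysql engine and a window of 0 into `rds14_verdict` prints FAILED.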
Engine is eligible, but the window is zero — that's the RDS.14 failure.
Since you can't enable backtracking on an existing cluster, the path is a clone with the window set. Aurora clones are copy-on-write — fast and cheap.
Copy-on-write clone with a 24-hour backtrack window — the migration path RDS.14 expects.
Backtracking under the hood: deep dive
Aurora's storage layer is a distributed log-structured system: every page change is appended to a redo log replicated across three Availability Zones. When backtracking is enabled, Aurora retains an additional reverse-direction change-log for the configured window. Triggering a backtrack replays that log in reverse against the live cluster volume — the same cluster ID, same endpoint, same parameter group, just rewound.
Storage cost is metered per million change records retained, at roughly $0.012 per million in most regions. For typical OLTP workloads a 24-hour window costs single-digit dollars per month; a 72-hour window on a heavy-write cluster is rarely more than $30-50/month. The break-even against a single avoided snapshot-restore incident is essentially zero.
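To make the arithmetic concrete, here is a rough estimator. It assumes the charge applies per hour to the change records currently retained (write rate times window length), which is my reading of the pricing model and worth checking against the current Aurora pricing page. The 10,000 change records per hour figure is purely illustrative, and the function name is hypothetical.

```shell
# Rough monthly backtrack cost in USD.
# Args: change records per hour, window in hours,
#       optional price per million records (default 0.012),
#       optional billing hours per month (default 730).
backtrack_monthly_cost() {
  awk -v rate="$1" -v window="$2" -v price="${3:-0.012}" -v hours="${4:-730}" \
    'BEGIN { printf "%.2f\n", rate * window / 1000000 * price * hours }'
}

backtrack_monthly_cost 10000 24   # prints 2.10
backtrack_monthly_cost 10000 72   # prints 6.31
```

Under these assumptions the cost scales linearly with both write rate and window length, so tripling the window from 24 to 72 hours triples the bill.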
The hard limit is the engine: backtracking is supported only on Aurora MySQL clusters (5.6 originally, currently 5.7 and 8.0), and only when the window was set at cluster create or via restore-db-cluster-to-point-in-time with --backtrack-window. Aurora PostgreSQL has no equivalent — for Postgres clusters the closest analogue is restore-db-cluster-to-point-in-time creating a new cluster at the target time, which is slower and produces a separate endpoint.
# How Security Hub evaluates RDS.14 — any Aurora MySQL cluster with BacktrackWindow=0 fails.
aws rds describe-db-clusters \
  --query "DBClusters[?Engine=='aurora-mysql' && BacktrackWindow==\`0\`].DBClusterIdentifier"
What is the impact of running without backtracking?
The direct impact is recovery time. Without backtracking, every data-corruption incident — bad migration, accidental DELETE, runaway script, application bug writing garbage — forces a snapshot restore. For a 500 GB cluster that's 30-60 minutes minimum before a new cluster is ready, plus the time to repoint applications at the new endpoint, plus the data loss between the snapshot and the bad write. Whole engineering teams spend Tuesday afternoons restoring snapshots.
The second-order impact is the operational rigidity it creates. Teams that don't trust their recovery path become risk-averse with migrations and one-off fixes — every schema change becomes a multi-week project because reverting is painful. Backtracking turns reverts into a 30-second operation, which fundamentally changes how willing engineers are to ship corrective changes quickly.
From a compliance standpoint, RDS.14 maps to broader data-resilience requirements in SOC 2 (CC9.1 — recovery objectives) and ISO 27001 (A.12.3 — backup). Auditors increasingly ask not just "do you have backups" but "can you demonstrate a sub-five-minute recovery for a defined corruption scenario." Backtracking is the cleanest evidence for that.
The cost trade-off is almost trivially in favour of enabling: $30/month of backtrack storage versus four engineers spending an afternoon on a recovery that should have taken 30 seconds. The math is clear; the friction is just that you can't enable it after the fact, so it has to be set at cluster creation or via a clone-and-cutover.
How do you remediate an RDS.14 failure safely?
Because backtracking can't be enabled on a running cluster, remediation is a four-step loop: inventory eligible clusters, clone with the window set, cut over, then prevent future clusters from being created without it.
1. Inventory every Aurora MySQL cluster with BacktrackWindow=0
Run a describe-db-clusters query filtered to engine aurora-mysql and BacktrackWindow==0. Group results by environment — production clusters obviously matter most, but non-prod clusters that hold representative data should be on the list too. Backtracking on a staging cluster is what lets you safely test destructive migrations before running them in prod.
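One way to run that inventory across several regions is sketched below. The function name is hypothetical, and the region list is whatever your organisation actually uses. Grouping by environment depends on your own naming or tagging convention, so it is left out.

```shell
# Hypothetical helper: print "region<TAB>cluster-id" for every Aurora MySQL
# cluster whose backtrack window is zero, in each region passed as an argument.
inventory_rds14() {
  for region in "$@"; do
    aws rds describe-db-clusters --region "$region" \
      --query "DBClusters[?Engine=='aurora-mysql' && BacktrackWindow==\`0\`].DBClusterIdentifier" \
      --output text |
      tr '\t' '\n' |
      while read -r id; do
        # Skip the blank or "None" line that an empty result can produce.
        if [ -n "$id" ] && [ "$id" != "None" ]; then
          printf '%s\t%s\n' "$region" "$id"
        fi
      done
  done
}
```

A call such as `inventory_rds14 us-east-1 eu-west-1` yields one line per failing cluster, which is easy to sort or pipe into a ticketing script.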
2. Clone with the window set, then cut over at a maintenance window
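Use restore-db-cluster-to-point-in-time with --restore-type copy-on-write and --backtrack-window 86400 (24h) or 259200 (72h). The clone shares storage with the original until the two diverge, so it's fast and cheap. A clone is an independent cluster, so there is nothing to promote: add a DB instance to it, pause writes to the original (anything written there after the clone is created never reaches the clone), repoint applications, then decommission the original. This is the standard blue-green pattern. Downtime is bounded by how long writes stay paused; the endpoint switch itself takes seconds if the application layer can do a DNS or configuration flip.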
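Because --backtrack-window takes seconds while people think in hours, a small conversion helper avoids off-by-a-factor mistakes. This is a sketch with a hypothetical function name; the 1 to 72 hour bounds come from the limits described in this lesson.

```shell
# Hypothetical helper: convert a backtrack window in whole hours (1-72)
# to the seconds value that --backtrack-window expects.
backtrack_window_seconds() {
  hours="$1"
  case "$hours" in
    ''|*[!0-9]*)
      echo "window must be a whole number of hours" >&2
      return 1
      ;;
  esac
  if [ "$hours" -lt 1 ] || [ "$hours" -gt 72 ]; then
    echo "window must be between 1 and 72 hours" >&2
    return 1
  fi
  echo $((hours * 3600))
}

backtrack_window_seconds 24   # prints 86400
backtrack_window_seconds 72   # prints 259200
```

In a script, the result can be passed straight through, for example `--backtrack-window "$(backtrack_window_seconds 24)"`.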
3. Verify the rewind before you need it
Backtracking is destructive from the application's point of view: it rewinds the whole cluster, not a single table, and every write since the target time disappears. Run a tabletop exercise: write a sentinel row, wait five minutes, backtrack to before the write, confirm the row is gone. This is the only way to know the recovery path actually works and to build the muscle memory for the day someone runs the wrong DROP. Also remember that a backtrack briefly interrupts connections, so pause applications first. If you overshoot, you can backtrack again to a later time inside the window, which is how AWS suggests homing in on the moment a bad change happened.
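The drill can be scripted roughly as follows, for a non-production cluster. Everything named here is hypothetical: the ops.backtrack_drill table (assumed to exist and be empty before the drill), the DB_HOST variable, and MySQL credentials supplied through a client config file. The AWS calls are backtrack-db-cluster and describe-db-cluster-backtracks; the status comparison is lowercased defensively because I am not certain of the casing the API returns.

```shell
# Timestamp in the ISO 8601 UTC form passed to --backtrack-to.
utc_now() { date -u +%Y-%m-%dT%H:%M:%SZ; }

run_backtrack_drill() {
  cluster="$1"
  target="$(utc_now)"     # rewind target, captured before the sentinel write
  sleep 60                # keep the sentinel clearly after the target
  mysql -h "$DB_HOST" -e "INSERT INTO ops.backtrack_drill (note) VALUES ('sentinel')" || return 1
  sleep 300               # the five-minute wait from the exercise

  bt_id="$(aws rds backtrack-db-cluster \
    --db-cluster-identifier "$cluster" \
    --backtrack-to "$target" \
    --query BacktrackIdentifier --output text)" || return 1

  # Poll until the backtrack finishes or fails.
  while :; do
    status="$(aws rds describe-db-cluster-backtracks \
      --db-cluster-identifier "$cluster" \
      --backtrack-identifier "$bt_id" \
      --query 'DBClusterBacktracks[0].Status' --output text |
      tr '[:upper:]' '[:lower:]')"
    case "$status" in
      completed) break ;;
      failed) echo "backtrack failed" >&2; return 1 ;;
    esac
    sleep 10
  done

  # Expect a count of 0: the sentinel was written after the target time.
  mysql -h "$DB_HOST" -e "SELECT COUNT(*) FROM ops.backtrack_drill WHERE note = 'sentinel'"
}
```

Capturing the target timestamp before the write, rather than computing it afterwards, removes any doubt about clock arithmetic during a real incident rehearsal.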
4. Prevent recurrence with an SCP and AWS Config
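Where the available IAM condition keys allow it, attach an SCP at the org level that denies rds:CreateDBCluster when the engine is aurora-mysql and no backtrack window is set. RDS exposes rds:DatabaseEngine as a condition key, but confirm in the Service Authorization Reference that a key for the backtrack window exists before relying on this; if it doesn't, enforce the setting in your provisioning templates instead. Add the AWS Config managed rule aurora-mysql-backtracking-enabled to surface drift shortly after a cluster is created or changed. New clusters arrive with the window set by default, and Security Hub stays green.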
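Deploying the detection half might look like the sketch below. The managed rule identifier I know of for this check is AURORA_MYSQL_BACKTRACKING_ENABLED; treat that as something to confirm against the AWS Config managed-rules list. The rule name is a free choice, and the function name is hypothetical.

```shell
# Emit the rule definition for put-config-rule.
# Assumption: AURORA_MYSQL_BACKTRACKING_ENABLED is the managed rule identifier.
backtrack_config_rule_json() {
  cat <<'EOF'
{
  "ConfigRuleName": "aurora-mysql-backtracking-enabled",
  "Source": {
    "Owner": "AWS",
    "SourceIdentifier": "AURORA_MYSQL_BACKTRACKING_ENABLED"
  },
  "Scope": {
    "ComplianceResourceTypes": ["AWS::RDS::DBCluster"]
  }
}
EOF
}

# Usage (needs an active AWS Config recorder in the region):
#   aws configservice put-config-rule --config-rule "$(backtrack_config_rule_json)"
```

Keeping the definition in a function makes it easy to review in version control before anything is pushed to the account.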
# Clone the cluster with a 24-hour backtrack window.
aws rds restore-db-cluster-to-point-in-time \
  --source-db-cluster-identifier prod-orders-cluster \
  --db-cluster-identifier prod-orders-cluster-bt \
  --restore-type copy-on-write \
  --use-latest-restorable-time \
  --backtrack-window 86400

# Wait for the clone to be available, then add an instance.
aws rds wait db-cluster-available \
  --db-cluster-identifier prod-orders-cluster-bt
aws rds create-db-instance \
  --db-instance-identifier prod-orders-cluster-bt-1 \
  --db-cluster-identifier prod-orders-cluster-bt \
  --db-instance-class db.r6g.xlarge \
  --engine aurora-mysql

# Trigger an actual backtrack (DESTRUCTIVE: loses all writes since target time).
aws rds backtrack-db-cluster \
  --db-cluster-identifier prod-orders-cluster-bt \
  --backtrack-to 2026-05-15T14:32:00Z
Keep learning
Dig deeper into Aurora's recovery model and the broader RDS compliance controls.
- Amazon Aurora Backtracking documentation: Full reference for enabling, configuring, and triggering backtracking on Aurora MySQL.
- AWS Security Hub control RDS.14: The control definition, severity, and AWS-published remediation steps.
- AWS Config rule aurora-mysql-backtracking-enabled: Continuous detection for any Aurora MySQL cluster without a backtrack window.
- Aurora point-in-time recovery vs backtracking: When to use snapshot-based PITR (Postgres, lossless) versus backtracking (MySQL, in-place).
You've completed Enable Aurora MySQL backtracking. You can now spot an RDS.14 failure, understand why the fix requires a clone rather than a config flip, walk through the copy-on-write cutover, and prevent the next cluster from being created without a window. The next time someone runs DROP TABLE in prod, you'll have a four-step loop — and a 30-second recovery — ready to run.