Prune ancient EBS snapshots

Years of incremental snapshots silently outgrow their source volumes — find the long tail and decide what's still recoverable.

14 min·10 sections·AWS

Old EBS snapshots: the basics

Why does a 50 GB volume end up with 281 snapshots?

An EBS snapshot is a point-in-time backup of a volume, stored in S3 behind the scenes and billed at roughly $0.05 per GB-month. Snapshots are incremental — only the blocks that changed since the previous snapshot get stored — so the first one is full-size and every subsequent one is (usually) cheap. That's the design that makes daily snapshots affordable.

The trap is that nobody deletes them. A nightly Lambda or a hand-rolled cron job starts firing snapshots one day, the team rotates, the policy never gets written, and four years later there are 1,742 snapshots for a volume that's still 8 GB. The wastage check flags these as EC2-009 — "Old EBS Snapshots" — typically anything older than 90 days where no retention policy exists.

The per-snapshot cost looks small ($0.01–$11.54/mo in the inbox findings) but the long tail multiplies fast. A thousand stale snapshots averaging $2/mo each is $24k a year on backups nobody has restored from since the previous AWS console redesign. The fix isn't dramatic — it's just deciding what's worth keeping, with a policy that prevents the next four years of drift.

In this lesson you'll learn how EBS snapshot incrementality actually works (and why deleting a middle snapshot doesn't free what you'd expect), how to safely audit the long tail of old snapshots, when to use the Snapshot Archive tier vs. just deleting, and how to set a Data Lifecycle Manager or AWS Backup policy that prevents the problem from coming back.

Fun fact

The phantom storage bill

Deleting an old snapshot rarely frees as much storage as you'd expect. Because snapshots are incremental, the blocks "belonging" to snapshot N may still be referenced by snapshot N+1 — when you delete N, AWS just transfers ownership of those blocks to the next snapshot in the chain. The bill only drops meaningfully once you delete the snapshots that hold blocks no other snapshot still needs. AWS doesn't expose this dependency graph directly, which is why teams often delete 200 old snapshots and watch the line item barely move.

Snapshot pruning in action

Nina is the platform lead at a SaaS company that's been on AWS since 2019. The wastage report flags 281 snapshots older than 90 days against a single 50 GB volume — the oldest is 1,595 days old (4.4 years), the newest in the flagged set is 256 days.

She pulls the list. The pattern is depressingly familiar: a deprecated cron job from a previous platform team has been firing daily snapshots since 2021, nobody ever deleted them, and the team that owned the original workload is now disbanded. The volume itself is still attached to a running instance — so the recent snapshots have value, but the 1,500-day-old ones almost certainly don't.

Before she deletes anything, she checks two things: are any of these snapshots referenced by an AMI (EC2 refuses to delete a snapshot that backs a registered AMI, so a bulk delete would fail partway through), and what does compliance actually require? Their SOC 2 retention policy says 90 days for production data. Everything older is fair game.

First, list the self-owned snapshots of one volume that are older than 90 days, oldest first. The date in the query is the 90-day cutoff, and filtering to a single volume keeps the audit focused.

$ aws ec2 describe-snapshots --owner-ids self --filters Name=volume-id,Values=vol-0abc123def456 --query 'sort_by(Snapshots[?StartTime<=`2026-02-14`], &StartTime)[].[SnapshotId,StartTime,VolumeSize,Description]' --output table
---------------------------------------------------------------------------------
| DescribeSnapshots |
+----------------------+----------------------------+--------+-----------------+
| snap-0f9a1b2c3d4 | 2021-09-04T03:00:12.000Z | 50 | daily-backup |
| snap-1a8b7c6d5e3 | 2021-09-05T03:00:09.000Z | 50 | daily-backup |
| snap-2c4d6e8f0a1 | 2021-09-06T03:00:15.000Z | 50 | daily-backup |
| ... 275 rows elided ... |
| snap-9e8d7c6b5a4 | 2025-08-30T03:00:08.000Z | 50 | daily-backup |
| snap-8d7c6b5a4f3 | 2025-08-31T03:00:11.000Z | 50 | daily-backup |
+----------------------+----------------------------+--------+-----------------+
# 281 snapshots, oldest 1,595 days, all from the same orphaned cron.

The full long tail for one 50 GB volume — 4.4 years of untouched daily backups.
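
The hard-coded date in the query above goes stale the day after it is written. A minimal sketch that derives the cutoff instead, assuming GNU date (the -d flag behaves differently on BSD/macOS); snapshot_cutoff and list_old_snapshots are hypothetical helper names, not part of the AWS CLI.

```shell
# snapshot_cutoff DAYS [TODAY]: print the UTC date DAYS before TODAY (default: now).
# Assumes GNU date; TODAY is an optional YYYY-MM-DD that makes runs repeatable.
snapshot_cutoff() {
  days=$1
  today=${2:-$(date -u +%F)}
  date -u -d "$today $days days ago" +%F
}

# list_old_snapshots VOLUME_ID [DAYS]: snapshots of one volume older than DAYS
# (default 90), oldest first. Hypothetical wrapper around describe-snapshots.
list_old_snapshots() {
  cutoff=$(snapshot_cutoff "${2:-90}")
  aws ec2 describe-snapshots --owner-ids self \
    --filters "Name=volume-id,Values=$1" \
    --query "sort_by(Snapshots[?StartTime<=\`$cutoff\`], &StartTime)[].[SnapshotId,StartTime,VolumeSize,Description]" \
    --output table
}

# Example: list_old_snapshots vol-0abc123def456 90
```

Passing the reference date explicitly (for example `snapshot_cutoff 90 2026-05-15`) is handy when the same audit has to be reproduced later.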

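Before deleting, cross-check that no AMI depends on these snapshots. EC2 rejects the delete for a snapshot that backs a registered AMI (the call fails with InvalidSnapshot.InUse), so an unchecked bulk delete stops partway with errors. Finding those AMIs first lets you decide whether to deregister them or skip their snapshots.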

$ aws ec2 describe-images --owners self --filters Name=block-device-mapping.snapshot-id,Values=snap-0f9a1b2c3d4,snap-1a8b7c6d5e3 --query 'Images[].[ImageId,Name,CreationDate]' --output table
------------------------------------------
| DescribeImages |
+----------------------------------------+
|| ||
++--------------------------------------++
# Empty result — no AMI depends on either snapshot. Safe to delete.

Always run this check on any snapshot you're about to delete in bulk.
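
Checking two snapshot IDs by hand does not scale to 281. One way to run the same check across a whole batch, sketched under assumptions: ami_snapshot_ids and safe_to_delete are hypothetical helper names, and the candidate list is a file with one snapshot ID per line.

```shell
# ami_snapshot_ids: print every snapshot ID referenced by an AMI you own, one per line.
ami_snapshot_ids() {
  aws ec2 describe-images --owners self \
    --query 'Images[].BlockDeviceMappings[].Ebs.SnapshotId' \
    --output text | tr '\t' '\n' | sort -u
}

# safe_to_delete CANDIDATES REFERENCED: print candidate IDs that no AMI references.
# Both arguments are files with one snapshot ID per line; -x forces whole-line matches.
safe_to_delete() {
  grep -vxFf "$2" "$1"
}

# Example:
#   ami_snapshot_ids > referenced.txt
#   safe_to_delete candidates.txt referenced.txt > deletable.txt
```

Fetching the AMI references once and filtering locally avoids one describe-images call per snapshot.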

Snapshot incrementality under the hood (deep dive)

EBS snapshots are billed per GB of unique data stored, not per snapshot. Snapshot 1 stores every used block on the volume. Snapshot 2 stores only the blocks that changed since snapshot 1, and it references the unchanged blocks from snapshot 1. Snapshot 3 references both. This chain is invisible to the API — there's no describe-snapshot-chain call — but it's what determines the actual storage bill.

When you delete a snapshot, AWS walks the chain and re-attributes any blocks that other snapshots still need to those snapshots. Blocks that no surviving snapshot references are actually freed and stop billing. The practical implication: a block is released only once every snapshot that references it is gone, so deleting isolated snapshots scattered through a long chain frees very little. Deleting a contiguous run, such as everything older than the retention cutoff, frees the most, because it also releases data that was written and later overwritten within that run.
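
The re-attribution rule is easier to see with a toy model. The sketch below is an illustration, not how EBS is implemented: it assumes each group of blocks can be described by the first and last snapshot that reference it, and that a group is freed only when every snapshot in that range is deleted. The names sample_chain and freed_gb, and all the numbers, are made up for the example.

```shell
# Each row: FIRST LAST GB -> blocks first captured by snapshot FIRST and last
# referenced by snapshot LAST (snapshots numbered 1..5, oldest first).
sample_chain() {
  cat <<'EOF'
1 5 40
1 2 4
2 3 3
3 3 1
3 4 2
4 5 2
5 5 1
EOF
}

# freed_gb "IDS": GB released by deleting the listed snapshot numbers.
# A row counts as freed only if every snapshot from FIRST to LAST is deleted.
freed_gb() {
  awk -v del="$1" '
    BEGIN { n = split(del, d, " "); for (i = 1; i <= n; i++) gone[d[i]] = 1 }
    {
      free = 1
      for (s = $1 + 0; s <= $2 + 0; s++) if (!(s in gone)) { free = 0; break }
      if (free) total += $3
    }
    END { print total + 0 }
  '
}

sample_chain | freed_gb "2"        # one middle snapshot
sample_chain | freed_gb "1 3"      # two scattered snapshots
sample_chain | freed_gb "1 2 3"    # a contiguous run of the oldest three
```

With this sample data the three calls print 0, 1 and 8: a lone middle snapshot frees nothing, two scattered snapshots free 1 GB, and the contiguous run frees 8 GB.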

The Snapshot Archive tier (launched late 2021) changes the math entirely. Archived snapshots are billed at $0.0125/GB-month — a quarter of standard — but with a 90-day minimum retention and a 24-72 hour restore time. They also bill as full snapshots, not incremental, so archiving doesn't help you if the snapshot is small because most of its blocks are shared. Archive is right for the snapshot you must keep for compliance but are never going to restore in a hurry; it's wrong for daily backups of a live workload.

# Move a single snapshot to the Archive tier — pays off if you'd keep it >90 days.
aws ec2 modify-snapshot-tier \
  --snapshot-id snap-0f9a1b2c3d4 \
  --storage-tier archive

# Restore an archived snapshot permanently to the standard tier (24-72 hours).
# For a temporary restore, pass --temporary-restore-days N instead of --permanent-restore.
aws ec2 restore-snapshot-tier \
  --snapshot-id snap-0f9a1b2c3d4 \
  --permanent-restore
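
Whether archiving actually saves money depends on how much of a snapshot's data is unique to it, since Archive bills the full size. A minimal sketch of that comparison using the two rates quoted in this section; compare_tiers is a hypothetical helper, and the 90-day minimum and any retrieval charges are left out.

```shell
# compare_tiers FULL_GB UNIQUE_GB
#   FULL_GB   = size of the snapshot as a full copy (what the Archive tier bills)
#   UNIQUE_GB = data only this snapshot holds in the standard tier
# Rates are the per-GB-month figures quoted in this lesson; check current regional pricing.
compare_tiers() {
  awk -v full="$1" -v uniq="$2" 'BEGIN {
    standard = uniq * 0.05
    archive  = full * 0.0125
    printf "standard=%.4f archive=%.4f cheaper=%s\n", standard, archive, (archive < standard ? "archive" : "standard")
  }'
}

compare_tiers 50 5     # mostly shared blocks
compare_tiers 50 40    # mostly unique blocks
```

At these rates the break-even sits where unique data is a quarter of the full size: below that, leaving the snapshot in the standard tier is the cheaper option.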

What is the impact of hoarding old snapshots?

The direct cost is unglamorous but real. A single 50 GB volume with 281 stale snapshots, assuming roughly 1-2.5% of the volume changes each day, holds about 200-400 GB of unique snapshot data (one full 50 GB copy plus 280 daily deltas): $10-20/mo per volume, every month, indefinitely. Multiply across an estate of a few hundred volumes and the snapshot line item quietly becomes one of the largest in EBS billing, second only to the volumes themselves.
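
The storage estimate is simple enough to reproduce. A rough sketch, assuming the first snapshot stores the whole volume, every later snapshot stores a fixed percentage of it, and the $0.05 per GB-month rate quoted earlier; estimate_snapshot_cost is a hypothetical helper name.

```shell
# estimate_snapshot_cost VOLUME_GB DAILY_CHURN_PCT SNAPSHOT_COUNT [USD_PER_GB_MONTH]
# Model: the first snapshot stores the whole volume; each later one stores
# DAILY_CHURN_PCT percent of it. Assumes a full volume and constant churn.
estimate_snapshot_cost() {
  awk -v vol="$1" -v churn="$2" -v n="$3" -v rate="${4:-0.05}" 'BEGIN {
    gb = vol + vol * churn / 100 * (n - 1)
    printf "%.0f GB, $%.2f/month\n", gb, gb * rate
  }'
}

estimate_snapshot_cost 50 1 281
estimate_snapshot_cost 50 2.5 281
```

These two calls print `190 GB, $9.50/month` and `400 GB, $20.00/month`, so under this model a 200-400 GB footprint corresponds to about 1-2.5% daily churn. Higher churn scales the result linearly.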

The second-order cost is restore-time confidence. When you have 1,742 snapshots for one volume, recovery becomes a guessing game: which one was "before the bad deploy"? Teams default to the most recent and miss the point — old snapshots that nobody has documented are operationally useless even if they're technically still there.

Compliance teams care for the opposite reason. If your retention policy says 90 days for production data and you're holding 4-year-old snapshots, you've quietly become a data-retention risk. A regulator or customer auditor asking "show me you delete data per your stated schedule" will not be charmed by "we forgot." Pruning old snapshots is sometimes a compliance requirement, not an optimisation.

There's also a quota dimension. The default per-region soft limit is 100,000 snapshots per account. Most teams never hit it, but a few orphaned cron jobs across enough volumes will eventually start throwing SnapshotLimitExceeded errors on legitimate backup runs — a much more visible problem than the bill.

How do you prune old snapshots safely?

Snapshot hygiene is a four-step loop: inventory the long tail, validate dependencies, delete or archive what's stale, then automate retention so the problem stops re-creating itself.

1. Inventory the long tail

Pull every self-owned snapshot in every region, joined with its source volume's existence and tags. Group by source volume and age. The findings worth acting on are usually obvious: a single volume with hundreds of snapshots, source volume tiny, oldest snapshot measured in years. Don't try to boil the ocean — sort by total stored GB and start with the top 5%.
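
A first pass at the inventory can be as small as a count of snapshots per source volume. The sketch below uses the count as a proxy for stored GB, because the VolumeSize field in describe-snapshots reports the source volume's size rather than the data a snapshot holds; count_by_volume is a hypothetical helper name.

```shell
# count_by_volume: read one volume ID per line on stdin and print "VOLUME COUNT",
# largest count first.
count_by_volume() {
  sort | uniq -c | sort -rn | awk '{ print $2, $1 }'
}

# Example for one region (repeat per region with --region):
#   aws ec2 describe-snapshots --owner-ids self \
#     --query 'Snapshots[].[VolumeId]' --output text | count_by_volume | head -20
```

The volumes at the top of this list are the ones worth auditing first.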

2. Cross-check AMI and Recycle Bin dependencies

Before deleting, run describe-images --filters Name=block-device-mapping.snapshot-id,Values=... against every snapshot in the batch. If any AMI still references the snapshot, deregister the AMI first (or skip the snapshot). Also enable Recycle Bin (introduced 2021) with a 7-30 day retention rule: soft-delete gives you a rollback window if a colleague yells about the snapshot two days later.

3. Archive what compliance requires, delete the rest

Anything you're keeping purely for "just in case" beyond 90 days belongs in the Snapshot Archive tier — 4× cheaper, same durability, slow restore. Anything genuinely orphaned (no AMI, no compliance requirement, no documented owner) can be deleted outright. Don't sit in the middle: indefinite standard-tier retention is the most expensive choice for snapshots you'll never touch.
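
Once the deletable list is settled, a dry-run-first loop keeps the bulk delete reviewable. A sketch, assuming one snapshot ID per line on stdin; prune_snapshots and the DRY_RUN variable are hypothetical names, and dry run is the default.

```shell
# prune_snapshots: read snapshot IDs from stdin, one per line.
# With DRY_RUN=1 (the default) it only prints what it would delete;
# with DRY_RUN=0 it calls delete-snapshot and reports failures on stderr.
prune_snapshots() {
  while IFS= read -r id; do
    [ -n "$id" ] || continue
    if [ "${DRY_RUN:-1}" = "1" ]; then
      echo "would delete $id"
    else
      aws ec2 delete-snapshot --snapshot-id "$id" </dev/null \
        || echo "failed: $id" >&2
    fi
  done
}

# Example:
#   prune_snapshots < deletable.txt             # review the plan
#   DRY_RUN=0 prune_snapshots < deletable.txt   # actually delete
```

Failures are reported and skipped rather than aborting the loop, so one in-use snapshot does not block the rest of the batch.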

4. Replace cron with a lifecycle policy

The reason you have 1,742 snapshots is that a script created them without ever deleting any. Replace it with Data Lifecycle Manager (free, simple, retention by count or age) or AWS Backup (more powerful, supports cross-account/cross-region, plays nicely with Backup Audit Manager). Set retention once — "keep 30 daily, 12 monthly, 7 yearly" — and never touch this again.

# Create a DLM policy: daily snapshots, retain 30, applied to volumes tagged Backup=true.
aws dlm create-lifecycle-policy \
  --execution-role-arn arn:aws:iam::123456789012:role/AWSDataLifecycleManagerDefaultRole \
  --description 'Daily EBS snapshots, 30-day retention' \
  --state ENABLED \
  --policy-details '{
    "PolicyType": "EBS_SNAPSHOT_MANAGEMENT",
    "ResourceTypes": ["VOLUME"],
    "TargetTags": [{"Key":"Backup","Value":"true"}],
    "Schedules": [{
      "Name": "daily",
      "CreateRule": {"Interval": 24, "IntervalUnit": "HOURS", "Times": ["03:00"]},
      "RetainRule": {"Count": 30}
    }]
  }'

# Enable Recycle Bin for accidental-delete recovery (30-day soft-delete window).
aws rbin create-rule \
  --resource-type EBS_SNAPSHOT \
  --retention-period RetentionPeriodValue=30,RetentionPeriodUnit=DAYS \
  --description 'Soft-delete snapshots for 30 days'

You've completed Prune ancient EBS snapshots. You now know how snapshot incrementality actually bills, when archive beats delete, how to validate AMI dependencies before pruning, and how to set a DLM policy so the next four years don't repeat the last four. Inventory, validate, archive-or-delete, automate — that's the loop.
