Right-size EBS storage volumes

Match volume size, IOPS, and throughput to actual usage — most gp3 volumes are provisioned for headroom nobody uses.

EBS right-sizing: the basics

What does it mean to right-size an EBS volume?

An EBS volume has three independent dimensions you pay for: capacity in GB, provisioned IOPS, and provisioned throughput in MiB/s. On gp3, capacity bills per GB-month, with 3,000 IOPS and 125 MiB/s included for free; anything above that is paid for separately. On io1/io2 every provisioned IOP is billed, at roughly $0.065 per PIOPS-month for io1. Right-sizing means making sure the number on each of those three knobs actually matches what the workload uses.

The most common over-provisioning pattern shows up after a gp2 → gp3 migration. Teams script the upgrade and tell it to "match the prior IOPS" of the gp2 baseline, which scaled with volume size at 3 IOPS per GB. A 4 TB gp2 volume had a 12,000 IOPS baseline — so the migration provisions 12,000 IOPS on the new gp3 volume "to be safe." Most of those workloads never saw more than 500 IOPS on the original volume. The headroom was incidental, but the bill for it on gp3 is real.
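To make the migration arithmetic concrete, here is a minimal shell sketch. It assumes the gp2 baseline of 3 IOPS per GiB with a 100 IOPS floor and a 16,000 IOPS ceiling, and a gp3 surcharge of roughly $0.005 per IOPS-month above the included 3,000. The function names are illustrative and not part of any AWS tool.

```shell
# gp2 baseline IOPS for a volume size in GiB: 3 IOPS per GiB,
# clamped to a 100 IOPS floor and a 16,000 IOPS ceiling.
gp2_baseline_iops() {
  local size_gib=$1
  local iops=$(( size_gib * 3 ))
  (( iops < 100 )) && iops=100
  (( iops > 16000 )) && iops=16000
  echo "$iops"
}

# Monthly gp3 surcharge (USD) for carrying that baseline over unchanged.
# Assumption: ~$0.005 per IOPS-month above the 3,000 IOPS included with gp3.
gp3_match_surcharge() {
  local iops
  iops=$(gp2_baseline_iops "$1")
  awk -v iops="$iops" 'BEGIN {
    extra = iops - 3000; if (extra < 0) extra = 0
    printf "%.2f\n", extra * 0.005
  }'
}

gp2_baseline_iops 4096      # prints 12288
gp3_match_surcharge 4096    # prints 46.44
```

A 4 TB volume is 4,096 GiB, so the exact baseline is 12,288 IOPS, which the text rounds to 12,000.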

The flip side is also true: an under-provisioned volume throttles, latency rises, and the application starts looking unhealthy in ways that don't immediately point at storage. EBS right-sizing isn't about cutting to the bone — it's about matching each of the three dimensions to actual observed usage with a sensible buffer, and then re-checking quarterly as the workload changes.

In this lesson you'll learn how EBS pricing splits into capacity, IOPS, and throughput; which CloudWatch metrics actually tell you what a volume is using; how to apply IOPS and throughput changes online with modify-volume; and why capacity-shrink is a fundamentally different (and harder) operation that requires a snapshot-and-swap. You'll see a Compute Optimizer EBS recommendation and the exact CLI calls to apply the safe parts of it.

Fun fact

The gp2-to-gp3 migration tax

When AWS launched gp3 in late 2020, the official upgrade guidance was to migrate gp2 volumes and "preserve the prior performance baseline" — which translated, for most automation, into provisioning IOPS at 3× the volume size. The result was that a wave of teams locked in gp3 volumes with 6,000–16,000 IOPS provisioned, paying for extra IOPS they never used on gp2 (where the baseline was "free" because it was bundled). Three years later, those volumes are still sitting there with the same over-provisioned IOPS — and they're now one of the single largest sources of low-risk gp3 savings in any mature AWS account.

EBS right-sizing in action

Nina runs SRE for a healthcare analytics platform. A monthly cost review flags that EBS is the second-largest line on the AWS bill — and a sizable chunk of it is gp3 IOPS surcharges, not capacity. She picks one of the biggest offenders: a 2 TB gp3 volume on a Postgres replica, provisioned at 12,000 IOPS and 500 MiB/s throughput.

She pulls 14 days of CloudWatch data. VolumeReadOps and VolumeWriteOps sum to a p99 of about 1,400 IOPS — well under the 3,000 IOPS that come free on gp3. Throughput peaks at 65 MiB/s, about half the free baseline. The volume is paying for 9,000 IOPS and 375 MiB/s of headroom that nobody is touching.

She cross-checks against Compute Optimizer's EBS recommendations, which flag the volume as Not optimized. The recommended target: 3,000 IOPS, 125 MiB/s, capacity unchanged. Projected monthly savings: roughly $60 per volume at the list prices broken down later in this lesson (9,000 IOPS at $0.005 plus 375 MiB/s at $0.04), and there are 34 volumes with the same shape across the fleet.

First, confirm the IOPS signal by pulling the operations metrics directly. CloudWatch reports them as a count per period — divide by the period seconds to get IOPS.

# get-metric-statistics returns at most 1,440 datapoints per call, so at --period 300 pull the 14 days in slices of 5 days or less. This is the first slice.
$ aws cloudwatch get-metric-statistics --namespace AWS/EBS --metric-name VolumeReadOps --dimensions Name=VolumeId,Value=vol-0abc123def456 --start-time $(date -u -d '14 days ago' +%FT%TZ) --end-time $(date -u -d '10 days ago' +%FT%TZ) --period 300 --statistics Sum
{
"Datapoints": [
{ "Timestamp": "2026-05-01T08:00:00Z", "Sum": 420000, "Unit": "Count" },
{ "Timestamp": "2026-05-02T08:00:00Z", "Sum": 415000, "Unit": "Count" },
{ "Timestamp": "2026-05-03T08:00:00Z", "Sum": 438000, "Unit": "Count" }
]
}
# 420k ops / 300 s = ~1,400 IOPS, well below the 3,000 IOPS gp3 baseline.

5-minute read-operation sums on the Postgres replica volume, first slice of the 14-day window.
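The per-period counts still need to be combined across reads and writes before they mean anything as IOPS. The sketch below is one way to do that, assuming each metric has been exported as "timestamp sum" lines (for example with `--query 'Datapoints[].[Timestamp,Sum]' --output text`). The function name and file layout are illustrative.

```shell
# Reads "timestamp sum" lines for reads (file 1) and writes (file 2),
# adds the two counts per timestamp, and prints "timestamp iops" sorted by time.
# The third argument is the CloudWatch period in seconds (default 300).
combined_iops() {
  local reads=$1 writes=$2 period=${3:-300}
  awk -v period="$period" '
    { ops[$1] += $2 }
    END { for (ts in ops) printf "%s %.0f\n", ts, ops[ts] / period }
  ' "$reads" "$writes" | sort
}
```

Because the result is an average over each period, short bursts inside a 5-minute window are smoothed out. A shorter period gives a sharper picture at the cost of more datapoints.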

Now ask Compute Optimizer for its EBS recommendation. It's already been observing the volume and produces a target shape with a performance risk score.

$ aws compute-optimizer get-ebs-volume-recommendations --volume-arns arn:aws:ec2:eu-west-1:123456789012:volume/vol-0abc123def456 --query 'volumeRecommendations[0]'
{
"finding": "NotOptimized",
"currentConfiguration": { "volumeType": "gp3", "volumeSize": 2048, "volumeBaselineIOPS": 12000, "volumeBaselineThroughput": 500 },
"volumeRecommendationOptions": [
{
"configuration": { "volumeType": "gp3", "volumeSize": 2048, "volumeBaselineIOPS": 3000, "volumeBaselineThroughput": 125 },
"performanceRisk": 1.0,
"savingsOpportunity": { "savingsOpportunityPercentage": 26.8, "estimatedMonthlySavings": { "value": 60.00, "currency": "USD" } }
}
]
}
# performanceRisk 1.0 = LOW. Drop IOPS and throughput; keep capacity.

Compute Optimizer's recommendation on the same volume.

EBS right-sizing under the hood

gp3 pricing in us-east-1 is roughly $0.08 per GB-month, $0.005 per provisioned IOP-month above the 3,000 baseline, and $0.04 per provisioned MiB/s-month above the 125 baseline. That means a 2 TB volume with 12,000 IOPS and 500 MiB/s is paying about $164/mo for capacity, $45/mo for the extra 9,000 IOPS, and $15/mo for the extra 375 MiB/s — roughly 27% of the line is performance, not storage. On io1/io2 the IOPS line is even larger: every PIOPS is billed at $0.065/mo on io1, so cutting from 12,000 to 3,000 PIOPS on io1 saves about $585/mo on a single volume.
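The breakdown above can be reproduced with a few lines of shell. This is a sketch that hard-codes the approximate us-east-1 prices quoted in the paragraph, so treat the constants as assumptions and check current pricing for your region. The function name is illustrative.

```shell
# Monthly gp3 cost breakdown in USD: capacity, extra IOPS, extra throughput,
# total, and the share of the total that is performance rather than storage.
# Arguments: size_gb provisioned_iops provisioned_throughput_mibps
gp3_monthly_cost() {
  awk -v gb="$1" -v iops="$2" -v tput="$3" 'BEGIN {
    capacity   = gb * 0.08
    extra_iops = (iops > 3000) ? (iops - 3000) * 0.005 : 0
    extra_tput = (tput > 125)  ? (tput - 125) * 0.04   : 0
    total = capacity + extra_iops + extra_tput
    printf "capacity=%.2f iops=%.2f throughput=%.2f total=%.2f perf_share=%.0f%%\n",
      capacity, extra_iops, extra_tput, total, 100 * (extra_iops + extra_tput) / total
  }'
}

gp3_monthly_cost 2048 12000 500
# prints capacity=163.84 iops=45.00 throughput=15.00 total=223.84 perf_share=27%
```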

IOPS and throughput modifications are online: you call ec2 modify-volume and the modification moves through modifying → optimizing → completed while the volume stays in-use and the workload keeps running. The main constraint is the 6-hour cooldown between successive modifications on the same volume, so you can't oscillate. Capacity changes are also online, but only in one direction: EBS can grow a volume but it cannot shrink one. There is no --size 1024 call on a 2,048 GB volume that just gives you back the difference.

To actually reduce capacity, you need a snapshot-and-swap: snapshot the source volume as a rollback point, create a new, smaller empty volume, copy the data across at the filesystem or database level (a volume created from a snapshot cannot be smaller than the snapshot's source volume), then detach the old volume and attach the new one. For stateful workloads this means downtime or careful coordination, typically a maintenance window, a brief read-only mode, or a database-level replication switchover. This asymmetry is why the cheapest and safest right-sizing target is almost always IOPS and throughput first, capacity second.

# Online: drop IOPS and throughput on a gp3 volume back to the free baseline.
aws ec2 modify-volume \
  --volume-id vol-0abc123def456 \
  --iops 3000 \
  --throughput 125

# Watch the modification progress; the workload stays online throughout.
aws ec2 describe-volumes-modifications \
  --volume-ids vol-0abc123def456 \
  --query 'VolumesModifications[0].[ModificationState,Progress,TargetIops,TargetThroughput]'
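Before scripting modify-volume across many volumes, it can help to sort each proposed change into "safe online" versus "needs a swap". The guard below is a minimal sketch of that check. It assumes gp3 floors of 3,000 IOPS and 125 MiB/s, and the function name is hypothetical.

```shell
# Classifies a proposed gp3 change as "online", "swap" (capacity shrink), or "invalid".
# Arguments: current_size_gb target_size_gb target_iops target_throughput_mibps
classify_change() {
  local cur_size=$1 new_size=$2 new_iops=$3 new_tput=$4
  if (( new_iops < 3000 || new_tput < 125 )); then
    echo "invalid: gp3 cannot go below 3000 IOPS or 125 MiB/s"
  elif (( new_size < cur_size )); then
    echo "swap: capacity cannot shrink in place"
  else
    echo "online: safe to apply with modify-volume"
  fi
}

classify_change 2048 2048 3000 125   # prints the "online" message
classify_change 2048 1024 3000 125   # prints the "swap" message
```

This is also why the Postgres replica lands at 3,000 IOPS rather than its observed 1,400: the free baseline is the floor.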

What is the impact of running over-provisioned EBS volumes?

The immediate impact is direct spend. A fleet with 200 over-provisioned gp3 volumes carrying an average of 8,000 IOPS and 300 MiB/s above baseline is paying roughly $125k/year in IOPS and throughput surcharges alone (about $52 per volume per month), before you touch the capacity line. On io1/io2 the multiplier is much harsher: IOPS is the dominant cost component, and "we provisioned 20,000 PIOPS to be safe" is a $1,300/month decision per volume on io1, or about $15.6k/year.
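For a fleet-level estimate, the same per-volume arithmetic can be summed over an inventory. The sketch below assumes an input of "volume-id iops throughput" lines (a hypothetical format you would export yourself) and the approximate us-east-1 gp3 surcharge rates of $0.005 per IOPS-month and $0.04 per MiB/s-month.

```shell
# Sums the annual gp3 performance surcharge (USD) over "id iops throughput"
# lines on stdin. Only IOPS above 3,000 and throughput above 125 MiB/s count.
fleet_annual_surcharge() {
  awk '{
    extra_iops = ($2 > 3000) ? $2 - 3000 : 0
    extra_tput = ($3 > 125)  ? $3 - 125  : 0
    monthly += extra_iops * 0.005 + extra_tput * 0.04
  } END { printf "%.0f\n", monthly * 12 }'
}

# 200 volumes, each 8,000 IOPS and 300 MiB/s above baseline:
seq 200 | awk '{ print "vol-" $1, 11000, 425 }' | fleet_annual_surcharge
# prints 124800
```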

The second-order impact is that over-provisioned EBS distorts your storage tier choices. If gp3 volumes are silently expensive because everyone over-IOPSes them, teams start migrating workloads to instance-store volumes, NVMe-equipped families, or even back to gp2 "because gp3 is more expensive" — which is only true relative to a misconfigured baseline. Right-sizing the gp3 fleet first usually flips the comparison and lets gp3 deliver what it was designed for: cheaper, more predictable performance.

Capacity over-provisioning has a different impact profile. The marginal cost per unused GB is small (about $0.08/mo on gp3), but it rarely stays on one volume: replicas, DR targets, and volumes restored for a restore-test rehearsal must be provisioned at least as large as the source. A 4 TB volume with 1 TB of actual data pays for 3 TB of empty space again on every one of those copies. EBS snapshots themselves bill for stored blocks rather than provisioned size, so never-written space does not add to the snapshot line, although blocks that were written once and later freed by the filesystem still do. That compounds quickly.

Finally, over-provisioned IOPS hides genuine performance signals. If your volumes are sized for 5× their actual peak, you'll never see throttling alarms — even when a query plan regresses or an application starts doing the wrong thing at scale. Right-sizing brings the baseline closer to actual usage and re-engages the early-warning signal you want anyway.

How do you right-size EBS volumes safely?

EBS right-sizing is a four-step loop: measure honestly, change the cheap dimensions first, plan the hard dimension as a project, and re-check on a cadence. Each step is straightforward; skipping any of them is where teams get hurt.

1. Measure IOPS and throughput against the 14-day p99

Pull VolumeReadOps + VolumeWriteOps from CloudWatch over 14 days at 5-minute resolution, sum them per period, and convert to IOPS by dividing by the period in seconds. Do the same for VolumeReadBytes + VolumeWriteBytes to get MiB/s. Use the p99 (not the average — averages hide bursts). Compare those numbers to the gp3 free baselines (3,000 IOPS, 125 MiB/s). Anything well below baseline with extra IOPS or throughput provisioned is a target.
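The p99 step can be sketched with standard tools. The helpers below assume one number per line on stdin (per-period IOPS, or per-period byte counts) and use the nearest-rank percentile definition. Both function names are illustrative.

```shell
# Nearest-rank percentile of newline-separated numbers on stdin.
# Usage: percentile 99 < iops.txt
percentile() {
  local p=$1
  sort -n | awk -v p="$p" '
    { v[NR] = $1 }
    END {
      if (NR == 0) exit 1
      rank = int((p * NR + 99) / 100)   # ceil(p * NR / 100) for integer p
      if (rank < 1) rank = 1
      print v[rank]
    }'
}

# Converts per-period byte counts on stdin to MiB/s for a period in seconds.
bytes_to_mibps() {
  awk -v period="${1:-300}" '{ printf "%.1f\n", $1 / period / 1048576 }'
}

seq 1 100 | percentile 99              # prints 99
echo 20447232000 | bytes_to_mibps 300  # prints 65.0
```

Nearest-rank always returns a value that was actually observed, which keeps the result easy to trace back to a specific period.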

2. Apply the IOPS and throughput cut online via modify-volume

ec2 modify-volume changes IOPS and throughput without detaching. The workload keeps running. The volume goes through modifying → optimizing and lands at the new shape, usually within minutes. Respect the 6-hour cooldown — you can't make two modifications in quick succession on the same volume — and use describe-volumes-modifications to confirm before moving on. This is by far the highest savings-per-risk move.
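A small helper can stop automation from tripping over the cooldown. This sketch takes Unix epoch seconds and simplifies by measuring the 6 hours from the start time of the last modification. That is an assumption, so confirm the volume's modification state before relying on it.

```shell
# Seconds left before another modification is allowed on the same volume.
# Arguments: last_modification_start_epoch now_epoch
cooldown_remaining() {
  local last_start=$1 now=$2
  local cooldown=$(( 6 * 3600 ))
  local elapsed=$(( now - last_start ))
  if (( elapsed >= cooldown )); then
    echo 0
  else
    echo $(( cooldown - elapsed ))
  fi
}

cooldown_remaining 1000000 1003600   # one hour in: prints 18000
```

The start time can be read from describe-volumes-modifications (the StartTime field) and converted with `date -u -d "$start" +%s` on systems with GNU date.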

3. Plan capacity reduction as a snapshot-and-swap, not a click

Capacity cannot be shrunk in place, and a volume created from a snapshot cannot be smaller than the snapshot's source volume. To actually reduce a volume's size you snapshot it as a rollback point, create a new, smaller empty volume, copy the data across at the filesystem level or rebuild it from a logical backup, then detach the old and attach the new. Schedule a maintenance window, document a rollback path back to the original volume, and verify the workload reads/writes cleanly before deleting the old volume. For database hosts, prefer a replica-based switchover: promote a smaller replica, repoint the application, then decommission the oversized primary.
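As a planning aid, the swap can be written down as a dry run before anyone touches production. The sketch below only prints the AWS CLI calls and executes nothing. It assumes the new volume is created empty and filled by a filesystem-level copy, since a volume created from a snapshot must be at least as large as the snapshot's source. The function name, arguments, and example IDs are hypothetical.

```shell
# Prints (does not run) the AWS CLI calls for a capacity-reducing swap.
# Arguments: old_volume_id new_size_gib availability_zone instance_id device
print_swap_plan() {
  local old_vol=$1 new_size=$2 az=$3 instance=$4 device=$5
  cat <<EOF
# 1. Rollback point for the original volume
aws ec2 create-snapshot --volume-id $old_vol --description "pre-shrink rollback for $old_vol"
# 2. New, smaller empty volume in the same Availability Zone
aws ec2 create-volume --volume-type gp3 --size $new_size --availability-zone $az
# 3. Attach it alongside the old volume, create a filesystem, copy the data (for example with rsync), verify
# 4. In the maintenance window: stop writers, run a final sync, then swap
aws ec2 detach-volume --volume-id $old_vol
aws ec2 attach-volume --volume-id <new-volume-id> --instance-id $instance --device $device
EOF
}

print_swap_plan vol-0abc123def456 1024 eu-west-1a i-0123456789abcdef0 /dev/sdf
```

Keeping the old volume and its snapshot until the workload has run cleanly on the new one is what makes the rollback path real.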

4. Re-evaluate quarterly and on workload change

A right-sized volume today is wrong in three months — data grew, a new index changed the IO pattern, or a feature shipped that 10×'d traffic. Wire Compute Optimizer's EBS recommendations into a weekly digest, and put a 30-minute review on the FinOps cadence. Tag volumes with their last-reviewed date so you can spot the ones that have been ignored.
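The last-reviewed tag only helps if something checks it. Here is a minimal sketch of a staleness test, assuming a hypothetical tag key such as `last-reviewed` holding a YYYY-MM-DD date and a 90-day default to match the quarterly cadence. It relies on GNU date, as the other commands in this lesson do.

```shell
# Succeeds (exit 0) when the last review is more than max_age_days old.
# Arguments: last_reviewed_date today_date [max_age_days], dates as YYYY-MM-DD
review_is_stale() {
  local last_reviewed=$1 today=$2 max_age_days=${3:-90}
  local last_s today_s
  last_s=$(date -u -d "$last_reviewed" +%s) || return 2
  today_s=$(date -u -d "$today" +%s) || return 2
  (( (today_s - last_s) / 86400 > max_age_days ))
}

if review_is_stale 2026-01-01 2026-05-01; then
  echo "review overdue"
fi
```

The tag itself can be written after each review with `aws ec2 create-tags --resources <volume-id> --tags Key=last-reviewed,Value=$(date -u +%F)`.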

# Bulk-list every EBS volume Compute Optimizer marks as not optimized whose first recommendation is low-risk.
aws compute-optimizer get-ebs-volume-recommendations \
  --filters name=Finding,values=NotOptimized \
  --query 'volumeRecommendations[?volumeRecommendationOptions[0].performanceRisk==`1.0`].[volumeArn,volumeRecommendationOptions[0].configuration]' \
  --output json > over-provisioned-ebs.json

# Review the list, exclude any volumes under change-freeze, then schedule modify-volume calls.
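To turn a reviewed list into commands without running anything by accident, a dry-run generator is one option. The sketch below assumes the recommendations were exported as "volume-arn target-iops target-throughput" lines, for example with `--query 'volumeRecommendations[].[volumeArn,volumeRecommendationOptions[0].configuration.volumeBaselineIOPS,volumeRecommendationOptions[0].configuration.volumeBaselineThroughput]' --output text`. The function name is hypothetical.

```shell
# Reads "volume-arn target-iops target-throughput" lines on stdin and prints
# one modify-volume command per line for review. Nothing is executed.
print_modify_commands() {
  local arn iops tput
  while read -r arn iops tput; do
    [ -z "$arn" ] && continue
    # The volume ID is the part of the ARN after the final slash.
    echo "aws ec2 modify-volume --volume-id ${arn##*/} --iops $iops --throughput $tput"
  done
}

echo "arn:aws:ec2:eu-west-1:123456789012:volume/vol-0abc123def456 3000 125" | print_modify_commands
# prints aws ec2 modify-volume --volume-id vol-0abc123def456 --iops 3000 --throughput 125
```

The printed commands carry no region, so run them with the region from each ARN set (for example via `--region` or `AWS_REGION`).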

You've completed Right-size EBS storage volumes. You now know the three dimensions of EBS pricing, why the gp2 → gp3 migration left so much over-provisioned IOPS on the table, how to cut IOPS and throughput online with modify-volume, and why capacity reduction is a snapshot-and-swap project rather than a click. The next time the bill flags EBS, you'll have a four-step loop — measure, modify, swap, re-check — ready to run.
