Right-size EC2 instance

Match instance types to actual workload — stop overpaying for unused capacity.

15 min·10 sections·AWS

Right-sizing: the basics

What does it mean to right-size an EC2 instance?

Right-sizing is the practice of matching an EC2 instance's family, generation, and size to what the workload running on it actually consumes — CPU, memory, network, and storage IOPS. Most fleets drift the other way: instances are picked once, often deliberately oversized for headroom or as a copy-paste from another environment, and then quietly run that way for years.

An oversized instance burns money on capacity nobody is using. A 4xlarge holding a workload that peaks at 12% CPU and 25% memory is paying roughly four times what it should — and the bill compounds across every replica, every region, every month. Reserved Instances and Savings Plans don't fix it; they just lock you into the wrong shape for one to three years.

An undersized instance is the opposite trap: latency spikes, p99s break SLOs, autoscaling thrashes, and the workload becomes unstable in ways that look like "the app is slow" instead of "we picked the wrong instance." Both directions cost money — one in wasted spend, the other in incidents and engineering hours.

In this lesson you'll learn how to spot oversized and undersized EC2 instances, which utilisation signals actually matter, and how to safely move a running workload to a smaller (or differently shaped) instance type. You'll see CloudWatch metrics, a Compute Optimizer recommendation, and the AWS CLI calls that apply the change with minimal customer impact.

Fun fact

The CPU credit cliff

Burstable instances (t2/t3/t3a/t4g) build up CPU credits while running below their baseline and spend them under load. Teams often "right-size" a steady-state workload onto a t3.large, see it run fine for a few weeks, and then watch performance fall off a cliff once the accrued CPU credits run out. That throttling happens in standard credit mode; in unlimited mode, the default for T3 and T4g, the instance keeps bursting and the surplus credits show up on the bill instead. Either way the instance is the right size on paper but the wrong family: t-types are for spiky workloads, not steady ones. Compute Optimizer's recommendation engine accounts for this; eyeballing average CPU does not.
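
To see why the cliff arrives after days or weeks rather than immediately, here is a minimal sketch of the credit arithmetic. It assumes the published t3.large figures (2 vCPUs, 36 credits earned per hour, a maximum balance of 864 credits), standard credit mode, and a perfectly flat load; the function name is illustrative and not part of any AWS API.

```python
from typing import Optional

# Published t3.large figures at the time of writing (assumption: verify
# against the current EC2 documentation before relying on them).
T3_LARGE_VCPUS = 2
T3_LARGE_CREDITS_PER_HOUR = 36.0
T3_LARGE_MAX_BALANCE = 864.0


def hours_until_credits_exhausted(
    avg_cpu_percent: float,
    vcpus: int = T3_LARGE_VCPUS,
    earn_rate: float = T3_LARGE_CREDITS_PER_HOUR,
    starting_balance: float = T3_LARGE_MAX_BALANCE,
) -> Optional[float]:
    """Return hours until the balance hits zero, or None if it never does.

    One CPU credit is one vCPU running at 100% for one minute, so a steady
    load spends (utilisation x vCPUs x 60) credits per hour.
    """
    spend_rate = (avg_cpu_percent / 100.0) * vcpus * 60.0
    net_drain = spend_rate - earn_rate
    if net_drain <= 0:
        return None  # at or below baseline: the balance never runs out
    return starting_balance / net_drain


if __name__ == "__main__":
    for cpu in (25, 32, 40):
        hours = hours_until_credits_exhausted(cpu)
        label = "never" if hours is None else f"{hours / 24:.1f} days"
        print(f"steady {cpu}% CPU -> credits exhausted: {label}")
```

Under these assumptions a steady load only slightly above the 30% baseline takes about two weeks to drain a full balance, which is why the problem tends to surface long after the change was declared a success.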

Right-sizing in action

Sara runs the platform team at a media company. A finance review flags one of their EKS node groups as responsible for $14k of monthly spend: six m5.4xlarge instances running a queue worker.

She pulls 14 days of CloudWatch metrics for one of the instances. CPU is averaging 11% with brief spikes to 28%. Memory utilisation (via the CloudWatch agent) sits around 22%. Network throughput is comfortable at 100 Mbps against a 10 Gbps ceiling.

She cross-checks against Compute Optimizer, which has been observing the workload for two weeks and is confidently recommending m5.xlarge — a 4× drop. Projected monthly savings: roughly $10.5k across the six nodes.

First, pull the CPU utilisation distribution to confirm the signal isn't dominated by short bursts.

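$ aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization --dimensions Name=InstanceId,Value=i-0abc123def456 --start-time "$(date -u -d '14 days ago' +%FT%TZ)" --end-time "$(date -u +%FT%TZ)" --period 3600 --statistics Average Maximum
{
"Datapoints": [
{ "Timestamp": "2026-04-30T08:00:00Z", "Average": 11.4, "Maximum": 28.1, "Unit": "Percent" },
{ "Timestamp": "2026-05-01T08:00:00Z", "Average": 12.1, "Maximum": 31.6, "Unit": "Percent" },
{ "Timestamp": "2026-05-02T08:00:00Z", "Average": 10.9, "Maximum": 26.4, "Unit": "Percent" }
]
}
# get-metric-statistics accepts --statistics or --extended-statistics, not both, so percentiles need a second call.
$ aws cloudwatch get-metric-statistics --namespace AWS/EC2 --metric-name CPUUtilization --dimensions Name=InstanceId,Value=i-0abc123def456 --start-time "$(date -u -d '14 days ago' +%FT%TZ)" --end-time "$(date -u +%FT%TZ)" --period 3600 --extended-statistics p99
{
"Datapoints": [
{ "Timestamp": "2026-04-30T08:00:00Z", "ExtendedStatistics": { "p99": 19.7 }, "Unit": "Percent" },
{ "Timestamp": "2026-05-01T08:00:00Z", "ExtendedStatistics": { "p99": 21.2 }, "Unit": "Percent" },
{ "Timestamp": "2026-05-02T08:00:00Z", "ExtendedStatistics": { "p99": 18.3 }, "Unit": "Percent" }
]
}
# Mean ~11%, p99 well below 25%: clear oversize signal.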

14-day hourly utilisation for one of the m5.4xlarge nodes.
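
One way to turn that output into a repeatable check is to summarise the datapoints and compare them with thresholds. The sketch below is illustrative only: the function name and the 20% and 40% cut-offs are assumptions rather than AWS guidance, and it reads just the Average and Maximum fields of a get-metric-statistics response.

```python
import json
from statistics import mean


def summarise_cpu(response: dict, avg_limit: float = 20.0, peak_limit: float = 40.0) -> dict:
    """Summarise CloudWatch datapoints and flag a likely oversized instance.

    avg_limit and peak_limit are hypothetical thresholds; tune them per workload.
    """
    points = response.get("Datapoints", [])
    if not points:
        raise ValueError("no datapoints: check the instance ID and time range")
    overall_avg = mean(p["Average"] for p in points)
    overall_peak = max(p["Maximum"] for p in points)
    return {
        "datapoints": len(points),
        "average": round(overall_avg, 1),
        "peak": overall_peak,
        # Both the typical load and the worst observed hour must be low.
        "likely_oversized": overall_avg < avg_limit and overall_peak < peak_limit,
    }


if __name__ == "__main__":
    sample = json.loads("""
    {"Datapoints": [
      {"Timestamp": "2026-04-30T08:00:00Z", "Average": 11.4, "Maximum": 28.1, "Unit": "Percent"},
      {"Timestamp": "2026-05-01T08:00:00Z", "Average": 12.1, "Maximum": 31.6, "Unit": "Percent"},
      {"Timestamp": "2026-05-02T08:00:00Z", "Average": 10.9, "Maximum": 26.4, "Unit": "Percent"}
    ]}
    """)
    print(summarise_cpu(sample))
```

Checking the peak as well as the mean guards against the case where a low average hides short, heavy bursts.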

Now ask Compute Optimizer for its recommendation. It's already analysed this instance against known workload patterns.

$ aws compute-optimizer get-ec2-instance-recommendations --instance-arns arn:aws:ec2:eu-west-1:123456789012:instance/i-0abc123def456 --query 'instanceRecommendations[0].recommendationOptions[0]'
{
"instanceType": "m5.xlarge",
"performanceRisk": 1.0,
"projectedUtilizationMetrics": [
{ "name": "CPU", "statistic": "Maximum", "value": 41.2 },
{ "name": "Memory", "statistic": "Maximum", "value": 58.7 }
],
"savingsOpportunity": { "savingsOpportunityPercentage": 75.0, "estimatedMonthlySavings": { "value": 1751.04, "currency": "USD" } }
}
# performanceRisk 1.0 = LOW — projected headroom is comfortable.

Compute Optimizer's projection on m5.xlarge for the same workload.
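
A recommendation like this can be gated in code before anyone acts on it. The following is a minimal sketch, assuming the option has been parsed into a dict with the fields shown above; the function names and the 80% projected-utilisation ceiling are hypothetical policy choices, not Compute Optimizer defaults.

```python
def is_safe_to_apply(option: dict, max_risk: float = 1.0, max_projected: float = 80.0) -> bool:
    """Return True when a recommendation option clears both gates.

    max_risk and max_projected are hypothetical policy values.
    """
    if option["performanceRisk"] > max_risk:
        return False
    metrics = option.get("projectedUtilizationMetrics", [])
    # Every projected metric (CPU, memory, ...) must stay under the ceiling.
    return all(m["value"] <= max_projected for m in metrics)


def fleet_monthly_savings(option: dict, node_count: int) -> float:
    """Scale the per-instance savings estimate to a group of identical nodes."""
    per_node = option["savingsOpportunity"]["estimatedMonthlySavings"]["value"]
    return per_node * node_count


if __name__ == "__main__":
    option = {
        "performanceRisk": 1.0,
        "projectedUtilizationMetrics": [
            {"name": "CPU", "statistic": "Maximum", "value": 41.2},
            {"name": "Memory", "statistic": "Maximum", "value": 58.7},
        ],
        "savingsOpportunity": {
            "savingsOpportunityPercentage": 75.0,
            "estimatedMonthlySavings": {"value": 1751.04, "currency": "USD"},
        },
    }
    print(is_safe_to_apply(option), round(fleet_monthly_savings(option, 6), 2))
```

Multiplying the per-instance estimate by the six nodes in the group gives a figure in line with the roughly $10.5k monthly saving quoted earlier.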

Right-sizing under the hood

EC2 pricing is roughly linear in vCPU and memory within a generation: an m5.4xlarge costs about 4× an m5.xlarge for 4× the vCPUs and 4× the RAM. Move from 4xlarge to xlarge and the on-demand bill drops by roughly 75%, with no amortisation to unwind. Reserved Instances are not necessarily a blocker either: regional RIs for Linux with default tenancy are size-flexible within an instance family, while zonal RIs and other platforms are tied to a specific size. The new rate applies as soon as the replacement or resized instance is running; usage is metered according to the relevant EC2 billing increment, commonly per second with a 60-second minimum on supported platforms.
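
The price math can be sketched in a few lines. This example assumes cost scales with the size units AWS uses to normalise instance sizes within a family (large = 4, xlarge = 8, and so on); real prices are only approximately linear, and the function name is illustrative.

```python
# Normalisation units per instance size within a family.
SIZE_UNITS = {
    "large": 4, "xlarge": 8, "2xlarge": 16, "4xlarge": 32,
    "8xlarge": 64, "12xlarge": 96, "16xlarge": 128, "24xlarge": 192,
}


def projected_monthly_savings(current_monthly_cost: float, current_size: str, target_size: str) -> float:
    """Estimate savings from a same-family resize, assuming price scales with size."""
    ratio = SIZE_UNITS[target_size] / SIZE_UNITS[current_size]
    return current_monthly_cost * (1.0 - ratio)


if __name__ == "__main__":
    # The $14k node group from the worked example, moved from 4xlarge to xlarge.
    print(projected_monthly_savings(14_000, "4xlarge", "xlarge"))
```

Treat the result as a first estimate; the Compute Optimizer savings figure or the current price list is the better source once a specific target type is chosen.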

The risky part isn't the price math; it's the change itself. EBS-backed instances need to be stopped to change type, which means a brief outage for a single-instance workload. For ASG/EKS-managed nodes, the safer path is to update the launch template and let the autoscaler roll the fleet, draining one node at a time. For RDS the equivalent operation is modify-db-instance with --no-apply-immediately (ApplyImmediately=false in the API), so the change waits for the next maintenance window.

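Compute Optimizer analyses the past 14 days of CloudWatch data by default; enabling the paid enhanced infrastructure metrics feature extends the lookback to 93 days for a steadier signal. Each recommendation option also carries a performanceRisk score from 0 (the recommended type is predicted to always have enough capacity) to 4 (highest risk). Anything above 2 deserves a closer look at the underlying workload pattern (steady? spiky? memory-bound? burst-credit-dependent?) before applying.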

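# Create a new launch template version with the recommended type and capture its number.
NEW_VERSION=$(aws ec2 create-launch-template-version \
  --launch-template-id lt-0abc123def456 \
  --source-version '$Latest' \
  --launch-template-data '{"InstanceType":"m5.xlarge"}' \
  --query 'LaunchTemplateVersion.VersionNumber' --output text)

# Make it the default so Auto Scaling groups that track $Default pick it up.
aws ec2 modify-launch-template \
  --launch-template-id lt-0abc123def456 \
  --default-version "$NEW_VERSION"

# Roll the managed node group onto the new version. Managed node groups pin a
# template version, and omitting --force keeps pod disruption budgets in effect.
aws eks update-nodegroup-version \
  --cluster-name prod \
  --nodegroup-name workers \
  --launch-template "id=lt-0abc123def456,version=$NEW_VERSION"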

What is the impact of running oversized instances?

The most visible impact is the bill. A 4× oversized instance is paying for compute it doesn't need every hour, every day, for the entire life of the workload. Across a fleet of even a few hundred nodes this is hundreds of thousands of dollars a year disappearing into idle vCPUs.

The second-order impact is harder to see but at least as expensive: oversized instances mask architectural problems. A workload that should have been profiled, made concurrent, or moved off a single-threaded runtime instead just gets bigger boxes thrown at it. By the time someone runs the numbers, the team has built two years of muscle memory around "add nodes when it slows down" — and the actual fix is a much bigger refactor than it would have been early on.

On the FinOps side, oversized fleets distort Reserved Instance and Savings Plan commitments. You sign a one-year commitment sized for the oversized fleet, the workload gets right-sized later, and now part of that commitment is stranded: you keep paying for capacity you no longer run. Right-sizing should always come before committing, not after.
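
How much of a commitment ends up stranded can be estimated in normalised size units. The sketch below assumes a size-flexible regional Reserved Instance in a single instance family and uses the standard size units (large = 4 up to 4xlarge = 32); the function name and inputs are illustrative.

```python
SIZE_UNITS = {"large": 4, "xlarge": 8, "2xlarge": 16, "4xlarge": 32}


def stranded_fraction(reserved: dict, running: dict) -> float:
    """Fraction of reserved capacity left unused, in normalised units.

    Both arguments map a size name (for example "4xlarge") to an instance
    count within one instance family.
    """
    reserved_units = sum(SIZE_UNITS[size] * count for size, count in reserved.items())
    running_units = sum(SIZE_UNITS[size] * count for size, count in running.items())
    if reserved_units == 0:
        return 0.0
    unused = max(reserved_units - running_units, 0)
    return unused / reserved_units


if __name__ == "__main__":
    # Six 4xlarge reservations, fleet later right-sized to six xlarge nodes.
    print(stranded_fraction({"4xlarge": 6}, {"xlarge": 6}))
```

Unless other instances in the same family absorb the spare units, that fraction of the commitment is paid for and unused until the term ends.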

Undersized instances cost money differently — incidents, p99 alerts, autoscaling churn, customer-visible latency. The bill might look better but the engineering hours and SLO breach risk usually swamp the savings.

How do you right-size safely?

Right-sizing is a four-step loop that runs continuously as workloads evolve. Each step is cheap; the real cost is skipping any of them.

1. Instrument utilisation honestly

By default, EC2 publishes only hypervisor-level metrics to CloudWatch: CPU, network, and basic disk I/O. Memory and disk-space utilisation are visible only from inside the guest, so they require the CloudWatch agent. Without memory data you'll right-size purely on CPU and end up with OOMs in production. Install the agent on every fleet, ship at 1-minute resolution for at least your top-spend 20% of instances, and store at least 14 days for the recommendation engines to chew on.

2. Use Compute Optimizer (or equivalent)

Compute Optimizer is free and accurate enough for the vast majority of decisions — it accounts for burst credits, projected headroom, and family changes (e.g. m5 → c5 if you're CPU-bound, m5 → r5 if memory-bound). Trust its LOW performance-risk recommendations; eyeball the MEDIUM+ ones before applying.

3. Apply changes via launch templates, not by hand

Never rely on a console click to change an instance type — it doesn't survive an ASG replacement. Update the launch template and let the autoscaler roll. For stateful workloads (RDS, ElastiCache, single-instance EC2) schedule the change for a maintenance window with a documented rollback path.

4. Re-evaluate at least quarterly

Workloads drift. An instance that is right-sized today can be oversized in three months because a feature shipped that reduced traffic, or undersized because a campaign tripled it. Treat right-sizing as a continuous process: send Compute Optimizer recommendations out as a Slack or email digest, and hold a standing 30-minute review on the FinOps cadence.

# Export every over-provisioned instance whose top option is LOW perf-risk or better.
# This only lists candidates; it does not change any instance.
aws compute-optimizer get-ec2-instance-recommendations \
  --filters name=Finding,values=Overprovisioned \
  --query 'instanceRecommendations[?recommendationOptions[0].performanceRisk<=`1`]' \
  > overprovisioned.json

# Pipe the result through your change-management tooling; never apply blindly.
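
For the review digest described in step 4, the exported file can be reduced to one line per instance. This is a sketch under assumptions: it expects overprovisioned.json to contain the list written by the command above, the function name and line format are illustrative, and posting the lines to Slack or email is left to your own tooling.

```python
import json
import sys


def build_digest(recommendations: list, max_risk: float = 1.0) -> list:
    """Return one summary line per instance whose top option is low risk."""
    lines = []
    for rec in recommendations:
        options = rec.get("recommendationOptions", [])
        if not options:
            continue
        best = options[0]
        if best["performanceRisk"] > max_risk:
            continue  # leave higher-risk options for manual review
        savings = best.get("savingsOpportunity", {}).get("estimatedMonthlySavings", {})
        instance_id = rec["instanceArn"].rsplit("/", 1)[-1]
        lines.append(
            f'{instance_id}: {rec["currentInstanceType"]} -> {best["instanceType"]} '
            f'(risk {best["performanceRisk"]}, saves ~{savings.get("value", 0):.0f} '
            f'{savings.get("currency", "USD")}/month)'
        )
    return lines


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python digest.py overprovisioned.json
    with open(sys.argv[1], encoding="utf-8") as handle:
        for line in build_digest(json.load(handle)):
            print(line)
```

Keeping the risk filter in code as well as in the CLI query means the digest stays conservative even if someone widens the export later.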

You've completed Right-size EC2 instance. You now know how to read utilisation honestly, when to trust Compute Optimizer, and how to apply a size change without taking customer-visible downtime. The next time a finance review flags a high-spend node group, you'll have a four-step loop ready to run.
