Cost

Right-size ECS tasks and Fargate services

Fargate bills for the vCPU and memory you request in the task definition, not what your containers actually use — over-requesting is pure waste, multiplied by every running task.

15 min·10 sections·AWS

Last reviewed 28 May 2026

Right-sizing Fargate: the basics

Why you pay for the size you ask for, not the size you use

AWS Fargate runs containers without requiring you to manage the underlying servers. Its pricing model is based on the CPU and memory allocated to each task definition — billed per vCPU-second and per GB-second while the task is running.

The important detail is that billing is driven by the resources you request, not what the container actually consumes at runtime. A task configured for 4 vCPU and 8 GB of memory will be billed for that full allocation even if the application only peaks at 0.4 vCPU and 1.2 GB in practice.
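
To make that gap concrete, here is a minimal sketch of the arithmetic. The per-hour rates are assumptions (illustrative Linux/x86 on-demand figures in the style of us-east-1 pricing), as are the function name and the 730-hour month; substitute the current rates for your region before relying on the numbers.

```python
# Assumed, illustrative on-demand rates (Linux/x86); check the current price list.
VCPU_PER_HOUR = 0.04048
GB_PER_HOUR = 0.004445
HOURS_PER_MONTH = 730  # common approximation for an always-on month


def monthly_cost(vcpu: float, memory_gb: float, task_count: int = 1) -> float:
    """Monthly cost of always-on tasks, driven only by the requested size."""
    hourly = vcpu * VCPU_PER_HOUR + memory_gb * GB_PER_HOUR
    return hourly * HOURS_PER_MONTH * task_count


billed = monthly_cost(4, 8)              # what the task definition requests
touched = monthly_cost(0.4, 1.2)         # what the workload peaks at (never billed)
print(f"billed per task: ${billed:.2f}/month")
print(f"cost of the capacity actually touched: ${touched:.2f}/month")
```

Nothing in the function takes runtime usage as an input, which is the point: only the requested vCPU, the requested memory, and the task count move the result.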

That over-allocation compounds quickly at scale. A service with a desiredCount of 20 over-provisioned tasks repeats the same waste twenty times, continuously, before any additional Service Auto Scaling capacity is added. Depending on region and pricing model, a heavily over-sized task can easily cost several times more than a right-sized equivalent running the exact same workload.

Unlike ECS on EC2, where task-level CPU and memory can be set freely, Fargate does not allow arbitrary CPU and memory combinations. It only accepts predefined valid pairings, for example:

0.25 vCPU supports 0.5 GB, 1 GB, or 2 GB of memory
1 vCPU supports 2–8 GB of memory in 1 GB increments

Effective Fargate optimisation therefore means analysing real utilisation data, selecting the smallest valid CPU/memory combination that still provides safe headroom for peak demand, and deploying the change through a new task definition revision.

In this lesson you'll learn how the AWS Fargate billing model makes over-requested tasks unnecessarily expensive, how to read real container utilisation using CloudWatch Container Insights (CPU and memory utilised versus reserved), and how to choose from the limited set of valid Fargate CPU↔memory combinations. You'll also walk through the safe rollout pattern for right-sizing services: registering a smaller task definition revision, updating the ECS service, and reviewing desiredCount along with any Service Auto Scaling target-tracking policies. Finally, you'll explore additional optimisation opportunities including Fargate Spot, ARM64/Graviton-based workloads, AWS Compute Optimizer recommendations for ECS on Fargate, and how the optimisation approach differs for ECS on EC2 where the underlying instances and capacity providers must also be right-sized.

Fun fact

The combination you can't actually pick

Fargate doesn't let you choose arbitrary CPU and memory values — it only accepts a predefined set of valid combinations. For example, a task configured with 1 vCPU can only use memory values within a specific supported range, while 0.25 vCPU tasks are limited to a much smaller set of options. Teams often identify an ideal target size from utilisation metrics, only to discover that the exact combination they want is not valid on Fargate. At that point, many simply round up to the next supported configuration, unintentionally increasing cost again. Effective right-sizing on Fargate therefore requires two things: understanding real workload utilisation, and understanding the valid CPU↔memory combinations that Fargate will actually accept.
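
The snap-to-grid step can be sketched as a lookup. The table below is an assumption transcribed from the AWS task-size documentation for Linux tasks at the time of writing, and the function name is hypothetical; verify the grid against the current documentation before using it for real sizing.

```python
from typing import Optional, Tuple

# CPU units -> valid memory values in MiB (assumed grid for Linux tasks).
VALID_COMBOS = {
    256: [512, 1024, 2048],
    512: list(range(1024, 4096 + 1, 1024)),
    1024: list(range(2048, 8192 + 1, 1024)),
    2048: list(range(4096, 16384 + 1, 1024)),
    4096: list(range(8192, 30720 + 1, 1024)),
    8192: list(range(16384, 61440 + 1, 4096)),
    16384: list(range(32768, 122880 + 1, 8192)),
}


def smallest_valid_combo(cpu_units: int, memory_mib: int) -> Optional[Tuple[int, int]]:
    """Return the smallest valid (cpu, memory) pair covering both needs, or None."""
    for cpu in sorted(VALID_COMBOS):
        if cpu < cpu_units:
            continue  # this CPU tier is too small
        for mem in VALID_COMBOS[cpu]:
            if mem >= memory_mib:
                return cpu, mem
        # CPU fits but memory does not: fall through to the next CPU tier.
    return None


# A workload peaking near 0.4 vCPU (410 units) and 1.2 GB (1229 MiB).
print(smallest_valid_combo(410, 1229))
```

The inputs should already include whatever headroom you want; the function only rounds up to the grid, and it returns None when no Fargate size is large enough.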

Right-sizing Fargate in action

Marcus runs the platform team at a logistics company. During a finance review, the team discovers that a single ECS service — an image-resizing worker — is responsible for a disproportionately large share of monthly Fargate spend. The service runs 20 tasks, each configured with 4 vCPU and 8 GB of memory.

Marcus opens CloudWatch Container Insights and reviews two weeks of utilisation data. CpuUtilized averages around 0.5 vCPU against 4 vCPU reserved, while MemoryUtilized sits close to 1.3 GB against 8 GB reserved. Even during batch-processing peaks, the service only reaches around 0.9 vCPU and 1.7 GB of memory usage. Using this data, he selects the smallest valid Fargate CPU↔memory combination that still provides safe operational headroom: 1 vCPU and 2 GB of memory.

Marcus registers the smaller configuration as a new task definition revision and rolls it out using a standard ECS rolling deployment. He keeps the service desiredCount unchanged during the rollout, then later tightens the Service Auto Scaling target-tracking policy once stability is confirmed. The result is a major reduction in monthly Fargate spend with no measurable impact on throughput, latency, or customer experience.

First, pull the real container utilisation from Container Insights — CPU used versus CPU reserved — to confirm the service is over-requested.

$ aws cloudwatch get-metric-statistics \
    --namespace ECS/ContainerInsights --metric-name CpuUtilized \
    --dimensions Name=ClusterName,Value=prod Name=ServiceName,Value=image-worker \
    --start-time $(date -u -d '14 days ago' +%FT%TZ) \
    --end-time $(date -u +%FT%TZ) --period 3600 --statistics Average Maximum

{
  "Label": "CpuUtilized",
  "Datapoints": [
    { "Timestamp": "2026-05-10T08:00:00Z", "Average": 512.4, "Maximum": 921.3, "Unit": "None" },
    { "Timestamp": "2026-05-11T08:00:00Z", "Average": 488.1, "Maximum": 874.6, "Unit": "None" }
  ]
}

# CpuUtilized is in CPU units (1024 = 1 vCPU). ~500 used vs 4096 reserved.
# Peak ~920 (0.9 vCPU): 1 vCPU (1024) covers it with headroom.

14-day hourly CPU used vs the 4096-unit (4 vCPU) reservation — a clear over-request.
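
Turning that response into vCPU and utilisation figures is simple arithmetic. The sketch below is one way to do it, assuming the JSON shape shown above; the function name is hypothetical, and averaging the hourly averages is a simplification rather than a weighted mean.

```python
import json

CPU_UNITS_PER_VCPU = 1024


def summarise_cpu(response_json: str, reserved_units: int) -> dict:
    """Summarise CpuUtilized datapoints against the reserved CPU units."""
    datapoints = json.loads(response_json)["Datapoints"]
    avg_units = sum(d["Average"] for d in datapoints) / len(datapoints)
    peak_units = max(d["Maximum"] for d in datapoints)
    return {
        "avg_vcpu": avg_units / CPU_UNITS_PER_VCPU,
        "peak_vcpu": peak_units / CPU_UNITS_PER_VCPU,
        "avg_utilisation": avg_units / reserved_units,
        "peak_utilisation": peak_units / reserved_units,
    }


# The two datapoints from the sample output above.
SAMPLE = """
{"Label": "CpuUtilized", "Datapoints": [
  {"Timestamp": "2026-05-10T08:00:00Z", "Average": 512.4, "Maximum": 921.3, "Unit": "None"},
  {"Timestamp": "2026-05-11T08:00:00Z", "Average": 488.1, "Maximum": 874.6, "Unit": "None"}
]}
"""

summary = summarise_cpu(SAMPLE, reserved_units=4096)
for key, value in summary.items():
    print(f"{key}: {value:.3f}")
```

The same function works for MemoryUtilized if you pass the reserved memory in the metric's own unit and ignore the vCPU fields.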

Register a smaller task-definition revision at a valid Fargate combo (1 vCPU / 2 GB), then roll the service onto it. desiredCount and the scaling policy come after.

$ aws ecs register-task-definition --family image-worker \
    --requires-compatibilities FARGATE --network-mode awsvpc \
    --cpu 1024 --memory 2048 \
    --container-definitions file://image-worker.json \
  && aws ecs update-service --cluster prod --service image-worker \
    --task-definition image-worker --force-new-deployment

{
  "service": {
    "serviceName": "image-worker",
    "taskDefinition": "arn:aws:ecs:us-east-1:123456789012:task-definition/image-worker:47",
    "desiredCount": 20,
    "deploymentConfiguration": { "minimumHealthyPercent": 100, "maximumPercent": 200 },
    "deployments": [ { "rolloutState": "IN_PROGRESS" } ]
  }
}

# update-service output, abridged. Revision :47 is 1 vCPU / 2 GB. Rolling deploy keeps the service up.

New revision at a valid CPU↔memory pairing, rolled out with zero downtime.

Fargate right-sizing under the hood (deep dive)

Fargate pricing is linear and billed per second (with a one-minute minimum), based on the CPU and memory defined in the task definition. In practical terms, larger task sizes scale cost almost directly in proportion to the vCPU and memory requested. That means a heavily over-provisioned task can cost several times more than a right-sized equivalent, even when both process the same workload. Because the billing meter reads the task definition's requested cpu and memory values, the only way to materially reduce spend is by changing the task definition itself — actual runtime utilisation does not directly affect the bill.
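
A short sketch of that meter, under stated assumptions: the default rates are illustrative Linux/x86 on-demand figures, the function name is hypothetical, and extras such as additional ephemeral storage are left out.

```python
import math


def task_run_cost(
    vcpu: float,
    memory_gb: float,
    runtime_seconds: float,
    vcpu_per_hour: float = 0.04048,   # assumed illustrative rate
    gb_per_hour: float = 0.004445,    # assumed illustrative rate
) -> float:
    """Cost of one task run: per-second billing with a one-minute minimum."""
    billed_seconds = max(60, math.ceil(runtime_seconds))
    hourly = vcpu * vcpu_per_hour + memory_gb * gb_per_hour
    return hourly * billed_seconds / 3600


# A 10-second run is billed as a full minute.
print(task_run_cost(1, 2, 10) == task_run_cost(1, 2, 60))
# Quadrupling both dimensions quadruples the cost for the same runtime.
print(task_run_cost(4, 8, 600) / task_run_cost(1, 2, 600))
```

The only inputs are the requested size and the run duration, so shrinking the task definition is the one lever that changes the result for a given runtime.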

Fargate CPU and memory combinations are a hard platform constraint, not a recommendation. Each CPU tier only supports a defined range of memory values, which means right-sizing is limited to the combinations Fargate will actually accept. In practice, this often forces teams to choose the nearest supported configuration rather than an exact theoretical target.

CloudWatch Container Insights exposes the metrics needed to make those decisions properly: CpuUtilized versus CpuReserved, and MemoryUtilized versus MemoryReserved, at both service and task level. These metrics show the gap between what the workload actually consumes and what the task definition reserves for billing purposes.

AWS Compute Optimizer can now ingest this utilisation data and generate ECS-on-Fargate right-sizing recommendations, including projected savings and optimisation findings, in much the same way it already provides recommendations for EC2 instances.

# Ask Compute Optimizer for ECS-on-Fargate service right-sizing recommendations.
aws compute-optimizer get-ecs-service-recommendations \
  --service-arns arn:aws:ecs:us-east-1:123456789012:service/prod/image-worker \
  --query 'ecsServiceRecommendations[0].{Finding:finding,CurrentCpu:currentServiceConfiguration.cpu,CurrentMem:currentServiceConfiguration.memory,Option:serviceRecommendationOptions[0]}'

# Inspect memory used vs reserved to confirm the smaller memory tier is safe.
aws cloudwatch get-metric-statistics \
  --namespace ECS/ContainerInsights --metric-name MemoryUtilized \
  --dimensions Name=ClusterName,Value=prod Name=ServiceName,Value=image-worker \
  --start-time $(date -u -d '14 days ago' +%FT%TZ) \
  --end-time $(date -u +%FT%TZ) --period 3600 --statistics Average Maximum

What is the impact of over-requested Fargate tasks?

The most visible impact is the bill, and it's larger than it looks because the over-request is multiplied by the running task count. One service defined at 4× the capacity it needs, running 20 tasks, is paying for ~60 vCPU and ~120 GB it never touches — thousands of dollars a month for a single workload. Across an estate of dozens of services, low-utilisation Fargate is routinely 30–60% of container spend that could be reclaimed with no architectural change at all.

The second-order impact is that over-requesting masks where the workload actually lives. A team that pads every task "to be safe" never learns its real CPU and memory profile, so it can't reason about concurrency, batching, or whether a noisy neighbour problem is real. Generous task sizes paper over the questions that, answered, would make the service both cheaper and more predictable.

There's a compounding-discount impact too. Spot (~70% off) and Graviton (~20% off) both apply to the requested size, so part of every discount is spent on idle capacity: 70% off the wrong size is still paying for capacity you don't use. And Compute Savings Plans, which can cover Fargate, get committed against inflated usage. You end up locking in a one- or three-year commitment sized to waste, the same stranding trap as over-sized Reserved Instances on EC2. Right-size before you commit.

Finally, the autoscaling interaction bites. A Service Auto Scaling target-tracking policy that targets, say, 50% CPU on a task reserving 4× what it needs will almost never scale — the per-task utilisation is structurally low — so you over-provision and lose the elasticity you thought you had. Right-sizing the task is what makes target-tracking work the way it's supposed to.
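
The effect is easy to see with a simplified proportional model. This is an assumption for illustration, not the actual Application Auto Scaling algorithm (which also applies cooldowns, alarms, and min/max capacity), and the function names are hypothetical. The key input is that service CPU utilisation is measured against the reserved CPU.

```python
import math


def utilisation_pct(used_vcpu: float, reserved_vcpu: float) -> float:
    """Per-task CPU utilisation as the scaling policy sees it."""
    return 100 * used_vcpu / reserved_vcpu


def desired_tasks(current_tasks: int, current_pct: float, target_pct: float) -> int:
    """Simplified target tracking: scale the count in proportion to the metric."""
    return max(1, math.ceil(current_tasks * current_pct / target_pct))


peak_used_vcpu = 0.9  # the same peak load in both cases

oversized = utilisation_pct(peak_used_vcpu, reserved_vcpu=4)
right_sized = utilisation_pct(peak_used_vcpu, reserved_vcpu=1)

print(f"4 vCPU task at peak: {oversized:.1f}% -> {desired_tasks(20, oversized, 50)} tasks")
print(f"1 vCPU task at peak: {right_sized:.1f}% -> {desired_tasks(20, right_sized, 50)} tasks")
```

With the 4 vCPU reservation the peak never reaches the 50% target, so the policy has no reason to scale out; with the 1 vCPU reservation the same load sits well above the target and the policy adds tasks.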

How do you right-size Fargate safely?

Right-sizing containers is a four-step loop that runs continuously as workloads evolve: read real utilisation, pick the smallest valid shape with headroom, roll it out as a new revision, then layer discounts and tighten autoscaling.

1. Read real utilisation from Container Insights

Enable CloudWatch Container Insights on the cluster so you get CpuUtilized/CpuReserved and MemoryUtilized/MemoryReserved per service and per task. Look at 14+ days, average and peak — memory especially, because a container that briefly touches its limit gets OOM-killed, not throttled. Size to cover the peak with comfortable headroom (target steady-state utilisation around 50–60%), not the average. Without this data you're guessing, and guessing is what created the over-request.

2. Choose the smallest valid CPU↔memory combination

Fargate only accepts fixed pairings, so right-sizing is a snap-to-grid exercise: find the smallest valid combo that still covers your peak. If CPU wants 0.5 vCPU but memory needs 6 GB, you're forced up to the 1 vCPU / 6 GB cell, because the 0.5 vCPU tier tops out at 4 GB. That mismatch is itself a signal to check whether the workload is memory-bound and might suit a different design. Let Compute Optimizer's ECS-on-Fargate recommendation propose the shape; trust its LOW-risk findings, eyeball the rest.

3. Roll out via a new task-definition revision, then tune desiredCount and autoscaling

Never edit a running task in place — register a new revision with the smaller cpu/memory and update-service --force-new-deployment so the rolling deploy keeps minimumHealthyPercent capacity up the whole time. Once the smaller tasks are stable, revisit desiredCount and the Service Auto Scaling target-tracking policy: a right-sized task makes target-tracking actually responsive, so you often need fewer baseline tasks and let scaling handle the peaks.

4. Layer Spot and Graviton, and don't commit before right-sizing

Once the shape is correct, compound the discounts. Move interruption-tolerant tasks (queue workers, batch, stateless replicas) to Fargate Spot via a capacity-provider strategy for ~70% off, and rebuild images multi-arch to run on ARM64/Graviton Fargate for ~20% off. Only after right-sizing should you size a Compute Savings Plan — committing against inflated usage strands the commitment. For ECS-on-EC2, also right-size the cluster's instances and capacity-provider Auto Scaling group and use bin-pack placement so tasks pack tightly.

# Route a service's tasks to Fargate Spot via a capacity-provider strategy (~70% off).
aws ecs update-service \
  --cluster prod --service image-worker \
  --capacity-provider-strategy \
    capacityProvider=FARGATE_SPOT,weight=4 \
    capacityProvider=FARGATE,weight=1,base=2 \
  --force-new-deployment

# Tighten target-tracking so a right-sized task actually scales on demand.
aws application-autoscaling put-scaling-policy \
  --service-namespace ecs \
  --resource-id service/prod/image-worker \
  --scalable-dimension ecs:service:DesiredCount \
  --policy-name cpu-target-50 --policy-type TargetTrackingScaling \
  --target-tracking-scaling-policy-configuration \
    '{"TargetValue":50.0,"PredefinedMetricSpecification":{"PredefinedMetricType":"ECSServiceAverageCPUUtilization"}}'
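
To reason about what the capacity-provider strategy above means for a 20-task service, the sketch below estimates the split: base tasks are placed first, then the remainder is shared in proportion to the weights. The function name is hypothetical, and the result is an expected share only; ECS places whole tasks and the exact rounding is not modelled here.

```python
from typing import Dict, List


def expected_split(desired_count: int, strategy: List[dict]) -> Dict[str, float]:
    """Estimate tasks per capacity provider: base first, then by weight."""
    counts: Dict[str, float] = {}
    remaining = desired_count
    for item in strategy:
        base = min(item.get("base", 0), remaining)
        counts[item["capacityProvider"]] = float(base)
        remaining -= base
    total_weight = sum(item["weight"] for item in strategy)
    if total_weight > 0:
        for item in strategy:
            counts[item["capacityProvider"]] += remaining * item["weight"] / total_weight
    return counts


strategy = [
    {"capacityProvider": "FARGATE_SPOT", "weight": 4},
    {"capacityProvider": "FARGATE", "weight": 1, "base": 2},
]
split = expected_split(20, strategy)
print(split)
```

The base of 2 keeps a small on-demand floor that survives Spot interruptions, while the 4:1 weighting sends most of the remaining tasks to the discounted capacity.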

Quick quiz

An ECS service runs 20 Fargate tasks defined at 4 vCPU / 8 GB. Container Insights shows CPU averaging 0.5 vCPU, peaking at 0.9, and memory averaging 1.3 GB, peaking at 1.7. What's the right next move?

Keep learning

Dig deeper into Fargate pricing, container utilisation tooling, and the right-sizing strategy around it.

You've completed Right-size ECS tasks and Fargate services. You now know why Fargate bills for the capacity you request rather than what you use, how to read real CPU and memory utilisation from Container Insights, how to snap to a valid CPU↔memory combo, and how to roll a smaller task-definition revision out with zero downtime — then compound the saving with Spot and Graviton. The next time a finance review flags a high-spend service, you'll have a four-step loop ready to run.

Right-sizing Fargate: what it means for the bill

The container line item is set by a request, not by usage

Most cloud bills are usage-based — you pay for what you consume. Fargate is subtly different and it catches finance teams out: you pay for what engineering asked for when they defined the container, regardless of how little the container actually uses. If a team requested a large container and the workload only touches a fraction of it, the bill reflects the large request, every hour, forever, until someone changes the definition.

This makes the Fargate line on the invoice deceptively sticky. It doesn't fall when traffic falls; it only falls when an engineer registers a smaller task definition. And because the over-request is multiplied by the number of running copies of the service (a service can run anywhere from one to dozens of identical tasks), a single oversized definition can quietly become one of the larger lines in the container spend without anyone noticing — there's no spike, just a flat number that's twice what it should be.

For budgeting and forecasting this matters in two ways. First, container spend that's set by request rather than usage won't self-correct, so it has to be actively reviewed. Second, the right metric isn't dollars in isolation — it's the gap between requested capacity and used capacity, often called utilisation. A service running at 15% CPU utilisation is the budget equivalent of a half-empty truck on every delivery. The question to bring to the monthly cost review is: across our container services, what's the average requested-vs-used ratio, and which services are furthest from a healthy number?
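
For whoever builds the cost pack, the requested-vs-used ratio is a small calculation. The sketch below is one possible shape, assuming per-service averages have already been exported; the class and function names, the 50% floor, and the second service are hypothetical, while the image-worker figures come from the engineering example in this lesson.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ServiceUsage:
    name: str
    requested_vcpu: float
    used_vcpu: float
    requested_gb: float
    used_gb: float

    @property
    def utilisation(self) -> float:
        """Utilisation on the dimension that binds (whichever is higher)."""
        return max(self.used_vcpu / self.requested_vcpu,
                   self.used_gb / self.requested_gb)


def below_band(services: List[ServiceUsage], floor: float = 0.5) -> List[ServiceUsage]:
    """Services under the target band, worst first."""
    flagged = [s for s in services if s.utilisation < floor]
    return sorted(flagged, key=lambda s: s.utilisation)


services = [
    ServiceUsage("image-worker", 4, 0.5, 8, 1.3),
    ServiceUsage("checkout-api", 1, 0.55, 2, 1.1),  # hypothetical healthy service
]
for s in below_band(services):
    print(f"{s.name}: {s.utilisation:.0%} of requested capacity in use")
```

The output is the list to bring to the monthly review: which services sit below the band, ordered by how far they are from a healthy number.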

This lesson is for the finance partner who sees a flat "container compute" line and assumes it tracks usage. It explains why Fargate bills for requested capacity instead, how to read the requested-vs-used utilisation gap that tells you whether the line is healthy, how the number behaves under Spot and Graviton discounts and your existing commitments, and what "good" looks like as a utilisation target. By the end you'll know which one number to ask for at the monthly review and how to tell a genuine traffic-driven cost from waste that just needs a smaller definition.

How a finance partner closes the loop

Priya is the finance partner embedded with the platform team at a logistics company. At the monthly cost review the container line is flat at around $4.3k for one service, and Priya asks the question that's now standard on the agenda: "What utilisation is that service running at — how much of what we're paying for is it actually using?" The engineering lead pulls up Container Insights: about an eighth, roughly 12% on CPU and 16% on memory.

The conversation isn't technical. Priya doesn't ask about task definitions or vCPU grids. She asks three things: is this driven by real traffic or by a request set once and forgotten, what would a sensible utilisation target look like, and when can it be reviewed. The answer is clear — it's an over-request, the team is comfortable targeting 50–60% utilisation with autoscaling for the spikes, and they can ship a smaller definition this sprint. Priya adds requested-vs-used utilisation as a standing line on the cost pack so it can't drift back.

A month later the same service is ~$1.1k at healthy utilisation. Priya knows that's the right floor — real work, sized sensibly — and she also knows that if utilisation across the container estate drifts below ~30% again, that's the conversation, not the raw dollar figure. The dollars are a lagging indicator; the utilisation gap is the leading one.

Why this matters to the budget, not just the bill

The aggregate impact is material precisely because it hides. There's no spike to investigate and no anomaly alert — just a flat container line that happens to be two-to-four times larger than the workload warrants, multiplied across every service that was padded "to be safe." In many estates this is the single largest pool of no-architecture-change savings available, and it sits in plain sight on the invoice as a perfectly normal-looking number.

It also distorts commitment economics. Compute Savings Plans can cover Fargate, and if you commit against today's inflated usage you lock in a one- or three-year promise sized to waste, then right-size later and strand part of the commitment. The discount looks great on paper and quietly underperforms. The sequencing rule that protects the budget is simple: right-size first, commit second. The commitment-utilisation number on the monthly report is where this shows up when it's done out of order.
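
A rough sketch of the stranding arithmetic, using the $4.3k and $1.1k figures from the example above. It is deliberately simplified: it ignores the Savings Plan discount rate and assumes no other eligible compute absorbs the unused commitment. The function names are hypothetical.

```python
def stranded_commitment(monthly_commitment: float, monthly_eligible_usage: float) -> float:
    """Commitment paid for but not consumed in a month."""
    return max(0.0, monthly_commitment - monthly_eligible_usage)


def commitment_utilisation(monthly_commitment: float, monthly_eligible_usage: float) -> float:
    """Share of the commitment that usage actually consumes (capped at 100%)."""
    return min(1.0, monthly_eligible_usage / monthly_commitment)


# Committed against the inflated $4.3k/month, then right-sized to $1.1k/month.
print(stranded_commitment(4300, 1100))
print(f"{commitment_utilisation(4300, 1100):.0%}")

# Right-sized first, then committed to the $1.1k baseline.
print(stranded_commitment(1100, 1100))
```

In the out-of-order case most of the commitment goes unused every month for the rest of the term; in the right-size-first case nothing is stranded.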

The third impact is on chargeback and forecast credibility. When a service's cost is driven by an arbitrary request rather than its real usage, the figure you charge back to a business unit isn't defensible — it's an accident of how the task definition was written, not a reflection of value delivered. Right-sizing makes the chargeback honest: the unit pays for capacity proportional to what its workload actually consumes.

Finally, it's a leading indicator. A container estate where average utilisation drifts down month over month is telling you that requests are set once and never revisited — and that same set-and-forget pattern predicts other waste categories (idle services, oversized databases, unused load balancers) trending the same way. Watch the requested-vs-used ratio as a signal, not just the dollars.

What finance can actually do about this

Finance can't register task definitions, but it can set the conditions that keep utilisation healthy. Four levers, used together at the monthly cost cadence.

1. Put requested-vs-used utilisation on the monthly report

Add container utilisation — average used capacity against requested capacity — as a recurring line alongside the dollar amount, with a count of services below the target band. The dollars are the lagging headline; the utilisation gap is the leading indicator. If utilisation drifts down two months running, that's the prompt to escalate, regardless of whether the total has moved yet.

2. Set a utilisation target as a budget expectation

Agree a sensible steady-state target with engineering — commonly 50–60% on the dimension that binds — and make "services persistently below the band" a standing review item rather than a one-off cleanup. The point isn't to police task definitions; it's to make sizing-to-reality a normal expectation that teams own themselves, the same way they own their budget envelope.

3. Enforce right-size-before-commit on Savings Plans

Before signing any Compute Savings Plan that covers Fargate, require that the covered services have been right-sized first. Committing against inflated usage locks in a one- or three-year promise sized to waste and strands part of it the moment the workload is corrected. The order is non-negotiable: right-size, then commit to the right-sized baseline.

4. Treat the utilisation trend as the metric, not the raw dollars

Some services genuinely need large containers, and Spot/Graviton discounts will move the dollar figure for reasons unrelated to sizing. So don't chase the absolute number — watch whether utilisation is flat-and-healthy or drifting down. A flat estate at 55% utilisation is good even if spend grows with traffic; a falling utilisation trend is bad even if the bill happens to dip.

Quick quiz

Average container utilisation across your Fargate services has drifted from 45% to 22% over six months while total spend is roughly flat. As the finance partner, what's the right next move?

You've finished the finance partner's view of Fargate right-sizing. You know why container spend is set by request rather than usage, why the requested-vs-used utilisation gap is the metric that matters, how the number interacts with Spot, Graviton, and Savings Plan commitments, and what the four finance levers are: utilisation on the report, a target as a budget expectation, right-size-before-commit, and watching the utilisation trend rather than the raw dollars. Next time the container line shows up at the monthly review, you'll have a sharper question than "can we make it cheaper?"

Right-sizing Fargate: the headline

Paying for ordered capacity, not used capacity

Containers on Fargate are billed for the capacity the engineering team requested when they defined the workload — not for what the workload actually uses. When those requests are set high "to be safe" and never revisited, the business pays for headroom that's never touched, multiplied by every running copy of the service.

This is a recurring, low-risk savings category and a discipline signal. A healthy container estate runs at sensible utilisation and gets reviewed as workloads change; a low-utilisation estate means requests are set once and forgotten. The leadership question isn't the dollar figure — it's whether container utilisation is measured at all and trending the right way.

A five-minute read on a quiet, recurring category of container waste, for the exec who wants the headline and the one question to ask. You'll get the rule-of-thumb framing — Fargate bills for ordered capacity, not used capacity — what low utilisation signals about engineering discipline, and what "good" looks like at an org level. No commands, no internals.

What it looks like when the org gets this right

At one company the quarterly review used to show container spend as a single growing line with no context — just a bigger number each quarter. The exec sponsor stopped asking "why is it going up?" and started asking "what utilisation are these services running at, and is that number on the dashboard at all?"

Two quarters later the slide had changed. Container spend was still there, but next to it sat an average-utilisation figure with a target band, and a count of services below the band. The image-worker service that had been $4.3k was $1.1k, and three similar over-requests had been caught the same way. The exec hadn't asked anyone to chase dollars — she'd asked them to measure utilisation, and the dollars followed.

That's the right outcome state. "Minimise container spend" is the wrong goal — some workloads genuinely need big containers. "Every service runs at a measured, sensible utilisation and gets reviewed" is the right one. The cost line stops being an action item and becomes a confidence signal.

Why this is on the report at all

Container spend on its own is a number without a verdict — it could be entirely justified or it could be twice what it should be, and the dollar figure alone won't tell you which. What tells you is utilisation: how much of the capacity the business is paying for is actually being used. A healthy, measured utilisation figure means engineering is sizing workloads to reality; a low or unmeasured one means capacity is being ordered on guesswork and never reconciled.

There's a second-order point. Whether utilisation is even on the dashboard is itself the signal. An org that measures and reviews container utilisation almost always has the broader cost discipline that keeps the bigger spend categories honest too; one that can't answer "what utilisation are we running at?" usually has the same blind spot across its whole estate. This category is cheap to fix and disproportionately revealing about everything else.

The leadership move on this category

The handle for an executive isn't to drive container spend down — it's to insist the org measures and reviews utilisation, which makes the dollars take care of themselves.

1. Insist utilisation is measured and on the dashboard

The single highest-leverage move is requiring that container utilisation — used versus requested — appears on the cost dashboard at all. You can't manage a request-based bill you don't measure, and the act of measuring it usually surfaces the worst over-requests within a quarter.

2. Set right-size-before-commit as policy

Make it a standing rule that workloads are right-sized before the org commits to multi-year Savings Plans covering them. This protects against locking in a commitment sized to waste and is a one-line policy that prevents a recurring, expensive mistake.

3. Ask for the utilisation trend at the leadership review

"Is container utilisation flat and healthy, or drifting down?" is a one-minute item that tells you whether engineering is sizing to reality without any technical depth. A steady, measured utilisation across three quarters means the discipline is working and leadership attention belongs elsewhere.

Quick quiz

The cost pack now shows container utilisation alongside spend; it's been flat at around 55% for three quarters with a target band of 50–60%. What's the right read?

That's the lesson. Two takeaways worth holding onto: Fargate bills for the capacity engineering ordered, not what it uses, and the signal that matters is whether utilisation is measured and healthy — not the raw dollar figure. The leadership question is about utilisation and review cadence, not about chasing the bill down.

Part of the learning path Right-size your compute
