Migrate Lambda functions to Graviton (ARM)

Roughly 20% lower price per GB-second, plus better price-performance that shortens duration — a one-line flip for pure-interpreted runtimes, real build work for native dependencies.

Lambda on Graviton: the basics

Why arm64 Lambda is cheaper and usually faster at the same time

Every Lambda function runs on one of two CPU architectures: x86_64 (Intel/AMD) or arm64 (AWS Graviton2). The architecture is a per-function property you set at deploy time. By default functions land on x86_64 — which means the majority of functions in most accounts are sitting on the more expensive option simply because nobody changed the default. Graviton2 is the same Neoverse-based ARM silicon that powers Graviton EC2; in Lambda it's exposed as a single config flag rather than an instance family.

The price difference is straightforward: Graviton2 Lambda lists at roughly 20% lower per GB-second than x86_64 in the same region — about $0.0000133334 per GB-second on arm64 versus $0.0000166667 on x86 in US-East-1. But the saving usually compounds, because functions frequently run faster on Graviton. Lambda bills duration in 1ms increments, so if a function that took 180ms on x86 takes 150ms on arm64, you pay the lower rate and for less time. The two effects multiply, which is why a 20% rate cut routinely shows up as 25-30% lower function cost on the bill.
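The compounding of the two effects can be checked with a few lines of arithmetic. The sketch below is illustrative only: it uses the US-East-1 rates and the 180ms to 150ms example from the paragraph above, assumes a 512 MB function (memory cancels out of the ratio), and ignores the per-request charge. The function names are made up for this example.

```python
# USD per GB-second in US-East-1, as quoted in the text above.
X86_PRICE = 0.0000166667
ARM_PRICE = 0.0000133334


def compute_cost(duration_ms: float, memory_mb: int, price_per_gb_s: float) -> float:
    """Compute charge for one invocation (excludes the per-request fee)."""
    return (duration_ms / 1000.0) * (memory_mb / 1024.0) * price_per_gb_s


def combined_saving(x86_ms: float, arm_ms: float, memory_mb: int = 512) -> float:
    """Fraction of compute cost saved by moving from x86_64 to arm64."""
    x86 = compute_cost(x86_ms, memory_mb, X86_PRICE)
    arm = compute_cost(arm_ms, memory_mb, ARM_PRICE)
    return 1.0 - arm / x86


# Rate cut alone: same duration on both architectures.
print(f"rate only:    {combined_saving(180, 180):.1%}")
# Rate cut plus the 180ms -> 150ms speed-up from the example.
print(f"with speedup: {combined_saving(180, 150):.1%}")
```

With no duration change the saving is the 20% rate cut; the speed-up in the example lifts it to about a third. Real functions fall wherever their measured durations put them.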

The catch lives in the binary. For pure-interpreted runtimes — Python, Node.js, Ruby, and even the JVM — the switch is genuinely a one-line change: set Architectures: [arm64] and redeploy, because AWS ships the runtime for both architectures. The real work appears when a function carries native code: Python packages with C extensions, Node native addons, anything in Go or Rust, or a container-image function. Those have to be compiled or pulled as aarch64 builds before the flip will work. This lesson covers both paths and the test-before-flip discipline that keeps the migration boring.

In this lesson you'll learn why arm64 Lambda is both cheaper per GB-second and usually faster, the difference between the one-line flip for pure-interpreted runtimes and the real build work for native dependencies (aarch64 wheels, Node native addons, recompiled Go/Rust, arm64 container images), the canary discipline of shifting traffic via a weighted alias before you commit, and why you must re-run Lambda Power Tuning after the switch because the cost/performance curve moves. You'll see the CLI to find every x86 function, flip the architecture, and roll it out safely behind an alias.

Fun fact

The runtime is already there waiting for you

When AWS added Graviton2 support to Lambda in 2021, they didn't ship a separate ARM-only product. They built arm64 versions of the managed runtimes (Python, Node.js, Java, .NET, Ruby), supported arm64 on the custom provided.al2 runtime used for compiled languages such as Go and Rust, and made the architecture a single switch. For a function that imports nothing with native code, the migration is literally changing one field from x86_64 to arm64. AWS's own benchmarking at launch showed up to 34% better price-performance, and because Lambda bills in 1ms increments, faster execution turns straight into a lower invoice. The cheaper, faster option has been one flag away the entire time; most functions just never got the flag flipped.

Migrating Lambda to Graviton in action

Nina runs the serverless platform at a mid-sized SaaS company. The cost dashboard shows Lambda compute at about $6,200 a month, and a breakdown reveals roughly 80% of that is still on x86_64. She pulls the list of functions and finds two clear groups: a long tail of pure-Python and Node API handlers that import nothing exotic, and a smaller set of image-processing functions that depend on Pillow (C extensions) and a Rust-based hashing addon.

The pure-interpreted group is the easy win. She picks the busiest one, an order-validation function on Python 3.12 (512 MB, ~40M invocations a month at ~190ms average), and flips it to arm64 in a staging account first. It works on the first try once the deployment package is rebuilt on arm64, because the only dependency is pydantic, which publishes aarch64 wheels. Average duration drops to ~160ms. At the lower rate plus the shorter duration, that single function's compute cost falls by about a third (roughly 29% once the per-request charge, which is the same on both architectures, is included).

She doesn't blast the change across production. She publishes a new version, points a live alias at a weighted split — 10% arm64, 90% x86 — and watches error rate, p99, and init duration for 48 hours. The curves hold. She walks the weight to 50%, then 100%, then re-runs Lambda Power Tuning, which now recommends dropping the memory from 512 MB to 384 MB because Graviton hits the same latency with less memory — a second saving the original projection never captured. The Pillow and Rust functions go into a separate sprint: rebuild aarch64 wheels and recompile the addon for ARM before they can move.

First, find every function still running on x86_64 — these are your migration candidates.

$ aws lambda list-functions --query "Functions[?Architectures[0]=='x86_64'].{Name:FunctionName,Runtime:Runtime,Mem:MemorySize,Arch:Architectures[0]}" --output table
------------------------------------------------------
|                   ListFunctions                    |
+--------------------+-------------+-------+---------+
| Name               | Runtime     | Mem   | Arch    |
+--------------------+-------------+-------+---------+
| order-validation   | python3.12  | 512   | x86_64  |
| webhook-dispatch   | nodejs20.x  | 256   | x86_64  |
| image-thumbnailer  | python3.12  | 1024  | x86_64  |
+--------------------+-------------+-------+---------+
# Pure-interpreted handlers flip in one step; image-thumbnailer pulls Pillow (C ext) — needs aarch64 wheels first.

Filter the fleet to x86_64 and triage: pure-interpreted functions are one-line flips, native deps need build work.
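The triage can be sketched as a small pure function over the records that list-functions returns. This is a rough first pass under stated assumptions, not a definitive classifier: it relies only on the Architectures, PackageType and Runtime fields of each record, the bucket names are invented for this example, and a managed-runtime function still needs its dependencies checked by hand before it counts as a one-line flip.

```python
def triage(functions):
    """Bucket list-functions records into rough migration groups."""
    buckets = {
        "already_arm64": [],       # nothing to do
        "container_image": [],     # image must be rebuilt for arm64
        "custom_runtime": [],      # compiled binary (Go, Rust, ...) to recompile
        "check_dependencies": [],  # managed runtime: flip if no native deps
    }
    for fn in functions:
        name = fn["FunctionName"]
        # Records without the field are treated as the x86_64 default.
        archs = fn.get("Architectures") or ["x86_64"]
        if "arm64" in archs:
            buckets["already_arm64"].append(name)
        elif fn.get("PackageType", "Zip") == "Image":
            buckets["container_image"].append(name)
        elif fn.get("Runtime", "").startswith("provided"):
            buckets["custom_runtime"].append(name)
        else:
            buckets["check_dependencies"].append(name)
    return buckets


# Hypothetical records shaped like the list-functions output.
sample = [
    {"FunctionName": "order-validation", "Runtime": "python3.12",
     "Architectures": ["x86_64"], "PackageType": "Zip"},
    {"FunctionName": "pdf-render", "Architectures": ["x86_64"],
     "PackageType": "Image"},
    {"FunctionName": "hasher", "Runtime": "provided.al2023",
     "Architectures": ["x86_64"], "PackageType": "Zip"},
    {"FunctionName": "audit-log", "Runtime": "nodejs20.x",
     "Architectures": ["arm64"], "PackageType": "Zip"},
]
for bucket, names in triage(sample).items():
    print(bucket, names)
```

The records could come from the CLI query above with --output json, or from an SDK paginator; either way the function itself stays free of AWS calls, which keeps it easy to test.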

For a pure-interpreted function, flip the architecture in place — then publish a version and shift traffic gradually behind a weighted alias.

$ aws lambda update-function-code --function-name order-validation --architectures arm64 --zip-file fileb://function.zip
{
"FunctionName": "order-validation",
"Runtime": "python3.12",
"Architectures": [
"arm64"
],
"MemorySize": 512,
"LastUpdateStatus": "InProgress"
}
# Now: publish-version, point the 'live' alias at a 10/90 weighted split, watch p99 + errors before promoting.

One field changes for pure-interpreted runtimes — but never flip 100% blind; roll it behind a weighted alias.

Lambda on Graviton under the hood

arm64 Lambda runs on the same Graviton2 Neoverse cores AWS uses elsewhere, fronted by Firecracker microVMs exactly like x86 functions. Every byte of executed code has to be aarch64: the AWS-managed runtime layer (which AWS provides for both architectures), every Lambda layer you attach, and every dependency your handler imports. For Python, Node, Ruby, Java, and .NET the runtime is handled for you. Breakage lives one layer down, in a manylinux wheel compiled for x86, a Node native addon (.node built via node-gyp), or a statically-linked Go/Rust binary targeted at amd64. Pip will silently install an x86 wheel on an x86 build host, and the function will fail only at invocation time: typically an import error for a shared library built for the wrong architecture, or an exec-format error for a standalone binary.
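One way to catch a wrong-architecture binary before invocation time is to read the ELF header of every file in the deployment zip. The sketch below is a minimal check, assuming a zip-based package held in memory; the helper names are invented here, and the machine codes 62 (x86-64) and 183 (AArch64) come from the ELF specification.

```python
import io
import struct
import zipfile

EM_X86_64 = 62    # e_machine value for x86-64
EM_AARCH64 = 183  # e_machine value for AArch64
_ARCH_NAMES = {EM_X86_64: "x86_64", EM_AARCH64: "arm64"}


def elf_machine(header: bytes):
    """Return the architecture named in an ELF header, or None if not ELF."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # Byte 5 (EI_DATA) gives the byte order: 1 = little-endian, 2 = big-endian.
    byte_order = "<" if header[5] == 1 else ">"
    # e_machine is the 16-bit field at offset 18.
    (machine,) = struct.unpack_from(byte_order + "H", header, 18)
    return _ARCH_NAMES.get(machine, f"other({machine})")


def native_archs(zip_bytes: bytes) -> dict:
    """Map each ELF file inside a deployment zip to its architecture."""
    found = {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        for info in zf.infolist():
            if info.is_dir():
                continue
            with zf.open(info) as fh:
                arch = elf_machine(fh.read(20))
            if arch is not None:
                found[info.filename] = arch
    return found
```

Calling native_archs(open("function.zip", "rb").read()) and failing the build when any value is not "arm64" would turn the silent failure into a CI error. An empty result suggests the package carries no native code at all.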

Pricing is per GB-second with a separate per-request charge that's identical across architectures. In US-East-1 arm64 compute is ~$0.0000133334 per GB-second versus ~$0.0000166667 on x86 — the ~20% gap. Because duration is billed in 1ms increments, any execution-time improvement from the faster cores converts directly into fewer billed milliseconds. The compounding is real but workload-dependent: CPU-bound and memory-bandwidth-bound functions (parsing, crypto, compression, image work) see the biggest duration wins; functions that mostly wait on I/O (a DynamoDB call, an HTTP fetch) see the rate cut but little duration change, since the chip isn't the bottleneck.

Because the cost/performance curve shifts, the optimal memory setting often moves too — and Lambda allocates CPU proportionally to memory, so this is also a CPU-tuning decision. A function tuned to 512 MB on x86 may hit the same p99 at 384 MB on arm64, or conversely a different memory point may now minimise cost-per-invocation. This is why you must re-run Lambda Power Tuning after the switch rather than assuming the old memory setting is still optimal — the migration and the right-sizing are two savings that should be captured together.
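The memory re-tune amounts to picking the cheapest memory setting that still meets a latency ceiling. The sketch below shows that selection over per-memory duration measurements; it is not the Power Tuning tool's own algorithm, and the duration figures are invented for illustration (only the 190ms at 512 MB and 160ms at 512 MB points echo the example in this lesson).

```python
X86_PRICE = 0.0000166667  # USD per GB-second, US-East-1 (from the text)
ARM_PRICE = 0.0000133334


def cheapest_memory(measurements, price_per_gb_s, max_ms=None):
    """Pick the memory size with the lowest compute cost per invocation.

    measurements maps memory in MB to average duration in ms.
    Sizes slower than max_ms (if given) are skipped.
    Returns (memory_mb, cost_per_invocation) or None if nothing qualifies.
    """
    best = None
    for memory_mb, duration_ms in sorted(measurements.items()):
        if max_ms is not None and duration_ms > max_ms:
            continue
        cost = (duration_ms / 1000.0) * (memory_mb / 1024.0) * price_per_gb_s
        if best is None or cost < best[1]:
            best = (memory_mb, cost)
    return best


# Hypothetical measurements: more memory means more CPU and shorter runs.
x86_runs = {256: 420, 384: 270, 512: 190, 768: 150, 1024: 140}
arm_runs = {256: 330, 384: 195, 512: 160, 768: 135, 1024: 125}

print("x86:", cheapest_memory(x86_runs, X86_PRICE, max_ms=200))
print("arm:", cheapest_memory(arm_runs, ARM_PRICE, max_ms=200))
```

With these made-up numbers and a 200ms ceiling, the cheapest qualifying setting is 512 MB on x86 and 384 MB on arm64, which is the kind of shift the paragraph above describes. Real measurements decide the actual answer.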

# Flip a pure-interpreted function to arm64. The architecture is set through
# update-function-code, so the deployment package is re-uploaded with it.
aws lambda update-function-code \
  --function-name order-validation \
  --architectures arm64 \
  --zip-file fileb://function.zip

# Wait for the update to finish, then publish an immutable version.
aws lambda wait function-updated --function-name order-validation
VERSION=$(aws lambda publish-version \
  --function-name order-validation \
  --query 'Version' --output text)

# Keep the 'live' alias on its current x86 version and send 10% of traffic
# to the new arm64 version as a canary.
aws lambda update-alias \
  --function-name order-validation \
  --name live \
  --routing-config "AdditionalVersionWeights={\"$VERSION\"=0.10}"

# After it holds, re-run Power Tuning to re-find the optimal memory on ARM.
aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:us-east-1:123456789012:stateMachine:powerTuningStateMachine \
  --input '{"lambdaARN":"arn:aws:lambda:us-east-1:123456789012:function:order-validation","powerValues":[256,384,512,768,1024],"num":50,"payload":{}}'

What is the impact of leaving Lambda on x86 when Graviton fits?

The direct impact is the compute line. A function billed at the x86 rate is paying ~20% more per GB-second than the identical arm64 function would, and on CPU-bound workloads it's also burning more billed milliseconds because the x86 cores finish slower. For the order-validation example (512 MB, ~40M invocations/month at ~190ms), the x86 compute cost is roughly $63/month (40M × 0.19 s × 0.5 GB ≈ 3.8M GB-seconds); the arm64 version at ~160ms is closer to $43, a ~$21/month gap on a single function, or about a third of its compute cost. Multiply across a fleet of hundreds of similar functions and the unrealised saving reaches thousands of dollars a month on workloads that would behave identically.

The second-order cost is the missed right-sizing. Because the optimal memory point moves on ARM, a fleet that flipped architecture but never re-ran Power Tuning leaves a further slice on the table — functions over-provisioned with memory (and therefore CPU) they no longer need at the new performance curve. The two savings are meant to be captured together; doing the architecture flip alone banks maybe two-thirds of what was available.

There's a runtime-version trap on the other side. The architecture flip is trivial on a current managed runtime, but functions stuck on a deprecated runtime (an old Node, an unsupported Python) often can't move cleanly because their pinned native dependencies have no maintained aarch64 build. Those functions surface the migration as a forcing function for a runtime upgrade that should have happened anyway — the Graviton saving is the carrot that finally justifies the overdue maintenance.

Finally, container-image functions add a build-pipeline cost that the projection ignores. An image-based Lambda must be built for arm64, for example with docker buildx build --platform linux/arm64, and pushed as a single-architecture image, because Lambda does not support multi-architecture container images. That means the CI pipeline, base image, and every layer have to support ARM. For teams that never set this up, the per-function saving is real but gated behind a one-time pipeline investment; budget the engineering time honestly rather than treating every function as a one-line flip.

How do you migrate Lambda to Graviton safely?

It's a triage-then-rollout loop: separate the one-line flips from the build-work functions, build aarch64 for the natives, canary behind a weighted alias, and re-tune memory after the switch. Skip the canary and you'll find the x86-only wheel the hard way — at invocation time, in production.

1. Triage: separate one-line flips from native-dependency functions

List every x86_64 function and bucket it. Pure-interpreted handlers on a current managed runtime (Python/Node/Ruby/Java importing only pure-language packages) are one-line flips — set Architectures: [arm64] and redeploy. Anything with native code is build work: Python packages with C extensions (Pillow, numpy, pandas, psycopg2, anything with a manylinux wheel), Node native addons built via node-gyp, Go or Rust binaries, and all container-image functions. Check pip/npm lockfiles and the function's deployment artifact to tell which bucket each one is in before touching anything.

2. Build native dependencies for aarch64

For zip-based functions, build the deployment package on an arm64 host or in an arm64 container so pip pulls aarch64 manylinux wheels instead of x86 ones; building on an x86 CI runner and flipping the flag is the classic silent failure. For Go/Rust, recompile with the ARM target (GOOS=linux GOARCH=arm64, or a --target aarch64-unknown-linux-gnu Rust build). For container-image functions, build an arm64 image with docker buildx build --platform linux/arm64 --push under its own tag; Lambda does not support multi-architecture images, so a combined linux/amd64,linux/arm64 manifest will not deploy. Pin base images and Lambda layers that publish arm64.

3. Canary behind a weighted alias before committing

Never flip a production function straight to 100% arm64. Publish a new version on arm64, then point the function's alias at a weighted split — start at 5-10% arm64 — and watch error rate, p99 duration, init (cold-start) duration, and any function-specific metrics for 24-48 hours. Walk the weight up (10% → 50% → 100%) only as each stage holds steady through a full traffic cycle. The weighted alias gives you an instant rollback: drop the weight back to 0 and traffic returns to the proven x86 version with no redeploy.

4. Re-run Lambda Power Tuning after the switch

The cost/performance curve moves on Graviton, and Lambda scales CPU with memory, so the memory setting that was optimal on x86 usually isn't optimal on arm64. Re-run the Lambda Power Tuning state machine against the migrated function to re-find the memory point that minimises cost-per-invocation — frequently you can drop memory (and cost) further while holding the same latency. Treat the architecture flip and the memory re-tune as one initiative; capturing only the first banks roughly two-thirds of what's available.
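The canary walk in step 3 can be sketched as two small helpers: one that builds the alias settings for a given weight, and one that decides whether to advance or roll back. This is a sketch under stated assumptions, not a prescribed rollout policy: the thresholds (0.1 percentage points of extra errors, 10% p99 regression) and the helper names are invented, and the returned dictionary is shaped to match the FunctionVersion and RoutingConfig parameters of the Lambda UpdateAlias call.

```python
def alias_update(stable_version: str, canary_version: str, weight: float) -> dict:
    """Alias settings that send `weight` of traffic to the canary version."""
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    if weight >= 1.0:
        # Full promotion: the canary becomes the alias's primary version.
        return {"FunctionVersion": canary_version,
                "RoutingConfig": {"AdditionalVersionWeights": {}}}
    if weight == 0.0:
        # Rollback: all traffic returns to the stable version.
        return {"FunctionVersion": stable_version,
                "RoutingConfig": {"AdditionalVersionWeights": {}}}
    return {"FunctionVersion": stable_version,
            "RoutingConfig": {"AdditionalVersionWeights": {canary_version: weight}}}


def next_weight(current, baseline, canary, steps=(0.10, 0.50, 1.00),
                max_error_delta=0.001, max_p99_ratio=1.10):
    """Return the next canary weight, or 0.0 to roll back.

    baseline and canary are dicts with 'error_rate' and 'p99_ms' keys.
    The default thresholds are illustrative assumptions.
    """
    healthy = (
        canary["error_rate"] - baseline["error_rate"] <= max_error_delta
        and canary["p99_ms"] <= baseline["p99_ms"] * max_p99_ratio
    )
    if not healthy:
        return 0.0
    for step in steps:
        if step > current:
            return step
    return current  # already fully promoted


x86_stats = {"error_rate": 0.002, "p99_ms": 400}
arm_stats = {"error_rate": 0.002, "p99_ms": 410}
weight = next_weight(0.10, x86_stats, arm_stats)
print(weight, alias_update("7", "8", weight))
```

Keeping the decision separate from the AWS call means the metrics can come from any source, and the same dictionary can be passed to an SDK call or translated into the update-alias command. Init duration and function-specific metrics from step 3 could be added as further health conditions.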

# Build a zip-based Python function's deps as aarch64 wheels in an ARM container.
docker run --rm --platform linux/arm64 \
  -v "$PWD":/var/task public.ecr.aws/sam/build-python3.12 \
  /bin/sh -c "pip install -r requirements.txt -t package/"

# Package with dependencies at the zip root so the handler can import them.
(cd package && zip -r ../function.zip .)
zip function.zip handler.py

# Upload the code and switch the architecture to arm64 in one call.
aws lambda update-function-code \
  --function-name image-thumbnailer --architectures arm64 \
  --zip-file fileb://function.zip

# Roll back instantly if the canary alias shows trouble — no redeploy needed.
aws lambda update-alias \
  --function-name image-thumbnailer --name live \
  --routing-config 'AdditionalVersionWeights={}'

Quick quiz

A Python 3.12 image-processing Lambda depends on Pillow (which has C extensions) and is currently on x86_64. Cost data shows it's a strong Graviton candidate. Your CI runs on x86 build hosts. What's the right next move?

You've completed Migrate Lambda functions to Graviton (ARM). You now know why arm64 Lambda is both ~20% cheaper per GB-second and usually faster, how to tell a one-line flip from a function that needs aarch64 build work, how to canary safely behind a weighted alias, and why the memory re-tune with Power Tuning is part of the same saving. The next x86 function in your fleet is ready to be triaged and moved — not left on the expensive default by inertia.
