Compliance

Configure ElastiCache clusters with a custom subnet group

Security Hub ElastiCache.7 — a cluster on the default subnet group inherits the default VPC's wide-open networking; a custom group is how you control exactly which subnets your cache lives in.

12 min·10 sections·AWS

Last reviewed 27 May 2026

Remediates AWS Security Hub: ElastiCache.7

ElastiCache default subnet groups: the basics

Why the subnet group your cluster lands in is a security decision

An ElastiCache cluster doesn't float free in your account — it lives inside specific subnets, and the set of subnets it's allowed to use is called a subnet group. When you launch a cluster without naming a subnet group, ElastiCache creates or reuses one literally named default, built from the subnets of your default VPC. That's the convenient path, and it's exactly the path this control flags.

ElastiCache.7 checks whether a cluster's CacheSubnetGroupName is the string default. If it is, the control fails at High severity. The reasoning is network blast radius: the default VPC and its subnets are deliberately permissive so that getting started is easy, which is the opposite of what you want around a data store holding session tokens, cached credentials, or query results. A custom subnet group lets you place the cluster in private subnets you've deliberately chosen, with the route tables and NACLs you've deliberately set.

It's flagged High because a cache is a high-value, low-visibility target. Nobody logs into ElastiCache the way they log into a database, so a cluster sitting in the default VPC with broad networking can be reachable far more widely than its owner assumes — and the misconfiguration is invisible until someone audits it or, worse, exploits it.

In this lesson you'll learn what an ElastiCache subnet group actually controls, why the default group inherits the permissive networking of the default VPC, and how to move a cluster onto a purpose-built custom subnet group made of private subnets. You'll see the AWS CLI commands to find non-compliant clusters and create a proper subnet group, the catch that you can't change a running cluster's subnet group in place, and the guardrails — Config rules and infrastructure-as-code defaults — that stop the problem recurring.

Fun fact

The cache nobody knew was reachable

When you launch an ElastiCache cluster through the console and skip the subnet-group step, AWS doesn't stop you: it silently creates a group named default from your default VPC's subnets and proceeds. Because AWS provisions a default VPC whose subnets are all attached to a route table with an internet-gateway route unless you've changed them, a cluster on the default subnet group can end up one security-group rule away from the open internet. Teams routinely discover during their first security audit that a cache they assumed was "internal" had been sitting in default networking for years, not because anyone chose it, but because nobody chose anything.

Moving a cluster off the default subnet group in action

Dev is the platform engineer on call when the Security Hub digest lands with a new High finding: ElastiCache.7 on prod-sessions-cache, a Redis cluster the auth service leans on. The control failed because its CacheSubnetGroupName is default — it was spun up during a hackathon eighteen months ago and never moved.

He confirms the blast radius first. The cluster is in the default VPC, and two of that VPC's three subnets route to an internet gateway. The cache's security group is tight today, but he's one careless rule away from exposing a store of live session tokens. This is worth fixing properly, not patching the security group.

He can't repoint a running cluster's subnet group — ElastiCache doesn't allow it. So he creates a custom subnet group from the three private subnets in the app VPC, takes a backup of the live cluster, and provisions a replacement cluster on the new group during the next maintenance window. The app's connection string moves to the new endpoint, the old cluster is deleted, and the finding clears on the next Config evaluation. Total hands-on time: under an hour, most of it waiting on the new cluster to come up.

First, find every ElastiCache cluster still using the default subnet group — that's exactly what the control flags.

$ aws elasticache describe-cache-clusters --query 'CacheClusters[].{Id:CacheClusterId,Engine:Engine,SubnetGroup:CacheSubnetGroupName,Status:CacheClusterStatus}' --output table

[Output table: DescribeCacheClusters, with columns Id, Engine, SubnetGroup, Status]

# Two clusters on 'default' — both fail ElastiCache.7. billing-cache is already compliant.

Any cluster whose SubnetGroup is 'default' fails the control; the named custom group is the compliant pattern.

Create a custom subnet group from your private subnets. This is the network the new cluster will live in.

$ aws elasticache create-cache-subnet-group --cache-subnet-group-name app-private-sng --cache-subnet-group-description 'Private subnets for app data caches' --subnet-ids subnet-0aa11bb22 subnet-0cc33dd44 subnet-0ee55ff66

{
  "CacheSubnetGroup": {
    "CacheSubnetGroupName": "app-private-sng",
    "CacheSubnetGroupDescription": "Private subnets for app data caches",
    "VpcId": "vpc-0a1b2c3d4e5f",
    "Subnets": [
      { "SubnetIdentifier": "subnet-0aa11bb22", "SubnetAvailabilityZone": { "Name": "us-east-1a" } },
      { "SubnetIdentifier": "subnet-0cc33dd44", "SubnetAvailabilityZone": { "Name": "us-east-1b" } }
    ]
  }
}

# Group created. Note: you can't repoint a running cluster — launch a replacement on this group.

A custom subnet group built from private subnets you control — the destination for a compliant replacement cluster.

ElastiCache subnet groups under the hood (deep dive)

A cache subnet group is a named collection of VPC subnets that tells ElastiCache where it's allowed to place cache nodes. At launch, ElastiCache picks a subnet from the group for each node, attaches an elastic network interface in that subnet, and the node inherits everything that subnet implies: its route table (and therefore whether it routes to an internet gateway or NAT), its network ACLs, and its availability zone. The security group attached to the cluster layers on top, but the subnet decides the foundational network position. ElastiCache.7 fails precisely when CacheSubnetGroupName equals default, because that group is composed from the default VPC's subnets, which are public-routed unless explicitly changed.
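Before adding a subnet to a custom group, it can help to check the network position described above by looking at the gateways in the subnet's route table. The following is a minimal sketch, not part of the original walkthrough. The function name subnet_route_position is hypothetical, and it assumes the subnet has an explicit route-table association; a subnet that falls back to the VPC's main route table is reported as unknown rather than guessed.

```shell
# Hypothetical helper: classify a subnet by the gateways in its explicitly
# associated route table. Prints "public", "private", or "unknown".
# Assumption: "private" here only means "no route targets an internet gateway".
subnet_route_position() {
  srp_gateways=$(aws ec2 describe-route-tables \
    --filters "Name=association.subnet-id,Values=$1" \
    --query 'RouteTables[].Routes[].GatewayId' \
    --output text)
  case "$srp_gateways" in
    *igw-*)  echo public ;;   # at least one route targets an internet gateway
    ""|None) echo unknown ;;  # no explicit association: check the main route table by hand
    *)       echo private ;;  # only local or other non-IGW gateway routes
  esac
}
```

Calling subnet_route_position subnet-0aa11bb22 for each candidate subnet gives a quick first pass; anything reported as public or unknown deserves a manual look before it goes into the group.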

The critical operational constraint is that a cluster's subnet group is immutable after creation. modify-cache-cluster has no --cache-subnet-group-name parameter, so there is no in-place modification. To move a cluster off default you must create a new cluster on the desired subnet group and migrate to it. For a standalone cache you snapshot and restore onto the new group; for a Redis OSS / Valkey replication group you restore a backup into a new replication group (or use online migration where it supports your source), then cut the application's endpoint over. All subnets in the new group must belong to a single VPC, the one the replacement cluster will live in, and the group should span at least two availability zones for any multi-node or failover-enabled deployment.
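The two-availability-zone guidance above can be checked mechanically once the group exists. This is a small sketch under stated assumptions: the function name subnet_group_az_count is hypothetical, and it reads the same Subnets[].SubnetAvailabilityZone.Name fields that create-cache-subnet-group returns.

```shell
# Hypothetical helper: count the distinct availability zones covered by a
# cache subnet group. Prints the count as a single number.
subnet_group_az_count() {
  aws elasticache describe-cache-subnet-groups \
    --cache-subnet-group-name "$1" \
    --query 'CacheSubnetGroups[0].Subnets[].SubnetAvailabilityZone.Name' \
    --output text |
    tr '\t' '\n' |                       # one zone name per line
    sort -u |                            # drop duplicate zones
    awk 'NF { n++ } END { print n + 0 }' # count non-empty lines, 0 if none
}
```

A check such as [ "$(subnet_group_az_count app-private-sng)" -ge 2 ] can then guard the launch of a failover-enabled replacement.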

Worth knowing: the default subnet group is created lazily. If you launch a cluster via the console or API without specifying a group and one named default doesn't yet exist, ElastiCache creates it from the default VPC's subnets on the spot — so a cluster can end up non-compliant without anyone ever having typed the word default. The AWS Config rule elasticache-subnet-group-check backs the control and re-evaluates on a periodic schedule, which is why a freshly remediated cluster clears on the next evaluation rather than instantly.
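Because the rule evaluates periodically, it is useful to ask AWS Config directly what it currently considers non-compliant rather than waiting for the Security Hub digest. The sketch below is illustrative only. The function name noncompliant_resources is hypothetical, and the rule name is passed as an argument because the deployed name in your account may differ from the managed rule identifier (assumption: Security Hub deploys the rule under its own prefixed name, so look it up first).

```shell
# Hypothetical helper: print the resource IDs that a given AWS Config rule
# currently reports as NON_COMPLIANT, one per line.
# $1 is the rule name as deployed in your account (an assumption to verify).
noncompliant_resources() {
  aws configservice get-compliance-details-by-config-rule \
    --config-rule-name "$1" \
    --compliance-types NON_COMPLIANT \
    --query 'EvaluationResults[].EvaluationResultIdentifier.EvaluationResultQualifier.ResourceId' \
    --output text |
    tr '\t' '\n' |  # one resource ID per line
    awk 'NF'        # drop empty lines when nothing is non-compliant
}
```

An empty result after a migration only means the rule has already re-evaluated; until it does, the old cluster may still be listed.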

# 1. Confirm the offending cluster's subnet group and capture its endpoint.
aws elasticache describe-cache-clusters \
  --cache-cluster-id prod-sessions-cache \
  --show-cache-node-info \
  --query 'CacheClusters[0].{SubnetGroup:CacheSubnetGroupName,Nodes:CacheNodes[].Endpoint}'

# 2. Snapshot the live cluster before rebuilding (Redis OSS / Valkey only).
aws elasticache create-snapshot \
  --cache-cluster-id prod-sessions-cache \
  --snapshot-name prod-sessions-pre-migration

# 3. Restore into a NEW cluster on the custom subnet group.
aws elasticache create-cache-cluster \
  --cache-cluster-id prod-sessions-cache-v2 \
  --snapshot-name prod-sessions-pre-migration \
  --cache-subnet-group-name app-private-sng \
  --engine redis \
  --num-cache-nodes 1 \
  --cache-node-type cache.r6g.large

# 4. Cut the app over to the new endpoint, verify, then delete the old cluster.
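Step 4 above is left as a comment, so here is one way its mechanical parts might look. This is a sketch, not a prescribed procedure: both function names are hypothetical, the final-snapshot name is simply the old cluster ID with a -final suffix, and the application cutover and verification between the two calls stay manual.

```shell
# Hypothetical helper: block until the replacement cluster reports 'available'.
wait_for_cluster() {
  aws elasticache wait cache-cluster-available --cache-cluster-id "$1"
}

# Hypothetical helper: delete the old cluster, keeping a final snapshot as a
# rollback point (final snapshots apply to Redis OSS / Valkey clusters).
delete_with_final_snapshot() {
  aws elasticache delete-cache-cluster \
    --cache-cluster-id "$1" \
    --final-snapshot-identifier "$1-final"
}
```

A typical sequence would be wait_for_cluster prod-sessions-cache-v2, then the cutover and verification, and only then delete_with_final_snapshot prod-sessions-cache.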

What is the impact of an ElastiCache cluster on the default subnet group?

The direct impact is network exposure. The default VPC ships with subnets attached to a route table that points to an internet gateway, so a cluster placed there is, by default, in a publicly-routable network segment. The cluster's security group is the only thing standing between it and broad reachability — and security groups drift. A single over-broad ingress rule, a 0.0.0.0/0 added "temporarily" for debugging, or a peering connection into the default VPC can turn an assumed-internal cache into an internet-adjacent one holding live data.

The second-order impact is segmentation failure. The whole point of placing data stores in dedicated private subnets is to contain blast radius: if an application host is compromised, the attacker's lateral movement is bounded by network design. A cache sitting in the shared default VPC undermines that boundary, because it sits alongside whatever else got launched into default networking, with the permissive intra-VPC connectivity the default configuration encourages. The control maps to NIST 800-53 information-flow and boundary-protection requirements (AC-4 and SC-7, with several of their enhancements) precisely because it's a segmentation control, not a cosmetic one.

The compliance impact is concrete and recurring. ElastiCache.7 is a High-severity control, so it materially affects your Security Hub score, it surfaces in audit evidence, and it's the kind of line item that appears verbatim on enterprise customer security questionnaires. An open High finding against a data store is hard to wave away in a SOC 2 review or a vendor-risk assessment; "we run caches on the default network" is not an answer that builds trust.

Notably, there is no recurring cost impact and no performance trade-off. Moving to a custom subnet group doesn't change the instance type, the node count, or the data-transfer profile. This is a pure posture fix: the only "cost" is the one-time engineering effort of rebuilding the cluster on a new group and cutting the application over, plus the short period in which the old and new clusters run side by side. That asymmetry, meaningful risk reduction for near-zero cost, is exactly why it's worth doing promptly rather than deferring.

How do you move an ElastiCache cluster to a custom subnet group?

Because a cluster's subnet group can't be changed in place, remediation is a build-and-migrate loop: create the right network, stand up a replacement cluster on it, cut over the application, then prevent recurrence at the deployment layer.

1. Inventory every cluster on the default subnet group

List all ElastiCache clusters across every region and account and filter for CacheSubnetGroupName == default. For each, record the engine, the VPC it's in, whether it holds sensitive data, and whether its current security group or route table actually exposes it. The exposed-and-sensitive ones are the priority; an internal-only dev cache on default networking is the same finding but a lower-urgency fix.

2. Create a custom subnet group from private subnets

Build a cache subnet group from subnets that have no route to an internet gateway, spanning at least two availability zones for any failover-enabled or multi-node cluster. Reuse one well-named group per data tier (for example app-private-sng) rather than creating one per cluster — it keeps the network design legible and the group itself is free. The subnets must be in the same VPC the replacement cluster will live in.

3. Rebuild the cluster on the new group and cut over

You cannot repoint a running cluster, so snapshot the existing one and restore into a new cluster created with --cache-subnet-group-name set to your custom group (for Redis OSS / Valkey replication groups, online migration may also be an option; confirm it supports your source first). A snapshot is a point-in-time copy, so anything written to the old cluster after the snapshot is not carried over; pause writers or accept that gap before cutting over. Validate the new endpoint, update the application's connection string, confirm traffic has moved, then delete the old cluster. Schedule the cutover in a maintenance window, because clients see a brief reconnect.

4. Prevent recurrence at the deployment layer

Set the standard in infrastructure-as-code so no cluster is ever created without an explicit, non-default subnet group — make cache_subnet_group_name a required input in your Terraform/CloudFormation module. Back it with the AWS Config rule elasticache-subnet-group-check (the rule behind this control) and, ideally, a release gate that blocks any deployment targeting a default VPC. The one-time fix only holds if the next cluster is born compliant.
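The release gate mentioned in step 4 could be as simple as a pipeline step that fails whenever a cluster on the default group exists. The sketch below is a detective check rather than a true pre-deployment block, and it only covers the region the CLI is configured for. The function name assert_no_default_subnet_group is hypothetical.

```shell
# Hypothetical CI gate: succeed only when no cluster in the current region
# uses the subnet group named 'default'.
# Returns 0 when clean, 1 when offenders exist, 2 when the API call fails.
assert_no_default_subnet_group() {
  ansg_count=$(aws elasticache describe-cache-clusters \
    --query "length(CacheClusters[?CacheSubnetGroupName=='default'])" \
    --output text) || return 2  # treat an API failure as a gate failure
  case "$ansg_count" in
    0) echo "OK: no clusters on the default subnet group" ;;
    *) echo "FAIL: $ansg_count cluster(s) on the default subnet group" >&2
       return 1 ;;
  esac
}
```

Running it as the last step of a deployment job means a cluster created without an explicit subnet group fails the pipeline that created it, instead of surfacing later in an audit.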

# Find all clusters on the default subnet group in the current region.
aws elasticache describe-cache-clusters \
  --query "CacheClusters[?CacheSubnetGroupName=='default'].[CacheClusterId,Engine,CacheClusterStatus]" \
  --output table

# Create the compliant custom subnet group from private subnets (>= 2 AZs).
aws elasticache create-cache-subnet-group \
  --cache-subnet-group-name app-private-sng \
  --cache-subnet-group-description 'Private subnets for app data caches' \
  --subnet-ids subnet-0aa11bb22 subnet-0cc33dd44 subnet-0ee55ff66

# Verify which clusters still need migrating after each batch.
aws elasticache describe-cache-clusters \
  --query "length(CacheClusters[?CacheSubnetGroupName=='default'])"
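The commands above query only the region the CLI is configured for, while step 1 asks for an inventory across every region. One possible way to widen the sweep is sketched below. The function name list_default_group_clusters is hypothetical, and the sketch covers a single account only; other accounts would need the same loop run under their own credentials.

```shell
# Hypothetical helper: list clusters on the default subnet group in every
# region enabled for the account.
# Prints one line per cluster: region<TAB>cluster-id<TAB>engine
list_default_group_clusters() {
  for ldgc_region in $(aws ec2 describe-regions \
      --query 'Regions[].RegionName' --output text); do
    aws elasticache describe-cache-clusters \
      --region "$ldgc_region" \
      --query "CacheClusters[?CacheSubnetGroupName=='default'].[CacheClusterId,Engine]" \
      --output text |
      awk -v region="$ldgc_region" 'NF { print region "\t" $0 }'  # prefix each row with its region
  done
}
```

Redirecting the output to a file gives a worklist that can be sorted by sensitivity and exposure, as step 1 recommends.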

Quick quiz

Question 1 of 5

ElastiCache.7 fires on a production Redis cluster: its CacheSubnetGroupName is default. The cluster holds live session tokens. What's the right next move?


You've completed Configure ElastiCache clusters with a custom subnet group. You now know what a subnet group controls, why the default group inherits the permissive networking of the default VPC, the build-and-migrate loop required because the group can't be changed in place, and the infrastructure-as-code guardrails that keep new clusters compliant from birth. The next time ElastiCache.7 fires, you'll have a defensible path from "High finding" to "cleared and prevented."


ElastiCache default subnet groups: what it means for risk

A convenience default that quietly widens exposure

ElastiCache is the in-memory cache layer most applications use to stay fast — it holds frequently accessed data so the app doesn't have to hit the main database every time. Where that cache lives on the network matters, because it often holds sensitive working data: session tokens, partial customer records, cached lookups. This finding says a cluster was launched on the "default" network configuration AWS provides out of the box rather than a deliberately restricted one your team designed.

The default configuration exists to make getting started painless, which means it's permissive by design. Using it for a real workload is the equivalent of standing up a filing cabinet of working data in an unlocked shared hallway because that was the first room available. Nothing has gone wrong yet, but the exposure is wider than anyone intended, and it stays that way silently until an audit or an incident surfaces it.

From a risk and compliance standpoint this is a posture finding, not a cost finding. It maps to network-segmentation requirements in frameworks like NIST 800-53, so it shows up in security reviews and customer security questionnaires. The fix carries no recurring cost — it's a one-time configuration change — but ignoring it is the kind of unforced error that turns a routine audit into a finding and a routine incident into a much larger one.

This lesson is for the finance or governance partner who sees "ElastiCache.7: High" on a security posture report and needs to understand whether it's urgent, what it costs to fix, and what to ask engineering. It explains what the finding means in plain terms, why it's a one-time free fix rather than a recurring cost, how it ties to compliance frameworks and customer security questionnaires, and the governance levers that keep it from coming back, chiefly making it a release gate and tracking it as a posture trend. No AWS internals required.


How a governance partner handles the finding

Priya runs the security-and-compliance cadence with the platform team. A High-severity ElastiCache.7 finding lands on the posture dashboard against the production session cache. Her first question isn't technical — it's "is anything exposed right now, and how long has it been like this?" The answer is that the cache holds live session data and has been on default networking since a hackathon eighteen months ago. That's enough to prioritise it.

The conversation that follows is about sequencing, not subnets. Priya doesn't ask which CIDR ranges or route tables — she asks three things: is this a free fix or does it add cost, does fixing it require downtime, and when can it land. The engineer confirms it's a one-time configuration change with no recurring cost, needs a brief maintenance window because the cluster has to be rebuilt on the new network, and can be done this week. Priya schedules it, notes it on the compliance tracker, and flags it to the customer-trust team because this exact control shows up on the security questionnaire a large prospect just sent over.

A week later the finding is cleared and the posture report is clean on this control. Priya's lasting move isn't the one fix — it's adding "no resource ships on a default subnet group" to the release checklist so the next cache is born compliant. The finding was free to fix; her job is making sure it's the last time it appears.

Why this matters to risk and compliance, not the budget

This finding has essentially no recurring cost attached — moving a cache to a custom subnet group doesn't change what you pay AWS. So the budget angle isn't about saving money; it's about avoiding a category of cost that doesn't show up on the cloud bill: the cost of a security incident, a failed audit, or a stalled enterprise deal.

It's a High-severity control mapped to network-segmentation requirements in NIST 800-53. In practice that means it appears in your compliance posture, in audit evidence packs, and on the security questionnaires that enterprise customers send before they'll sign. An open High finding against a data store holding live session or customer data is a real friction point in those processes — the kind of thing that turns a two-week procurement security review into a two-month one.

The economics strongly favour fixing it immediately. The remediation is a one-time engineering task with no ongoing cost and no performance trade-off — there's no "we'll save more by waiting" argument and no recurring spend to budget for. The only real decision is sequencing, because the cluster has to be rebuilt on the new network, which usually means a short maintenance window.

As a governance signal, clusters on default subnet groups indicate that infrastructure is being created ad hoc rather than from a hardened template. If this finding keeps reappearing on new resources, the issue isn't the individual cache — it's that the team's default deployment path doesn't bake in network discipline, and that pattern predicts other posture findings down the line.

What governance can actually do about this

Governance can't rebuild a cluster, but it can decide how urgently the finding gets fixed and ensure it never recurs. Four levers, applied at the security cadence.

1. Triage by data sensitivity and exposure, not finding count

Not every ElastiCache.7 finding is equally urgent. Ask engineering to rank them by two factors: does the cache hold sensitive data, and is it actually reachable today. A production session cache that's exposed is a this-week fix; an internal dev cache is a tidy-up. This keeps the team fixing the riskiest exposure first rather than burning the same effort on every finding equally.

2. Make 'no default networking' a release gate

The durable fix isn't remediation, it's prevention. Agree with engineering that no new data store ships onto a default VPC or default subnet group, and back it with an automated check that blocks the deployment. This converts a recurring finding into a one-time cleanup, because every future cluster is born compliant rather than caught later in an audit.

3. Track it as a posture trend on the security review

Add High-severity network-segmentation findings as a standing line on the security posture pack — the count and whether it's trending up or down. A flat zero after cleanup is the healthy state. A reappearing finding is the signal that the release gate isn't holding, and that's the conversation to have, not the individual cache.

4. Connect it to customer-trust commitments

This exact control turns up on enterprise security questionnaires and in audit evidence. Treat clearing it as part of the sales-enablement and compliance workstream, not just an engineering chore — a clean answer to "do your data stores run on isolated private networks?" removes friction from procurement and renewal conversations.

Quick quiz

Question 1 of 5

A High-severity ElastiCache.7 finding lands on a production cache holding customer session data, and the same control just appeared on an enterprise prospect's security questionnaire. As the governance partner, what's the right move?


You've finished the governance partner's view of ElastiCache.7. You know it's a High-severity network-segmentation finding with no recurring cost, why it interacts with audits and customer security questionnaires, and the four governance levers: triage by sensitivity and exposure, make 'no default networking' a release gate, track the posture trend, and connect it to customer-trust commitments. Next time it lands on the report, you'll have a sharper question than "how much does this cost to fix?"


ElastiCache default subnet groups: the headline

Sensitive cached data sitting in the network's most open zone

ElastiCache is the fast in-memory layer that keeps applications responsive, and it frequently holds sensitive working data. This finding means one or more of those caches was deployed onto the default, deliberately-permissive network configuration AWS ships for convenience, rather than a restricted network your team controls.

This is a network-hygiene and segmentation issue, not a cost one. The remediation is free and one-time; the risk of leaving it is that a high-value data store sits in the most exposed part of the network, which is exactly the kind of thing that turns up in security audits, customer questionnaires, and post-incident reviews. The broader signal is whether the organisation defaults to deliberate network design or to whatever AWS hands you first.

A short read for the executive who wants the headline on a High-severity network finding and the one question to ask. You'll get the plain-language risk, why the fix is free and fast, what this category signals about the org's network discipline, and what "good" looks like — no commands, no implementation detail.


What it looks like when the org gets this right

At one company a High-severity finding — a production cache sitting on default networking — surfaced during prep for an enterprise customer's security review. The exec sponsor's instinct wasn't to ask how it gets fixed; it was to ask why it existed at all: "Why is a store of customer session data on the network we ship by default? Who signed off on that, and how many more are like it?"

The team fixed the one cache in under a day at no recurring cost. The more important change was structural: deploying anything onto a default VPC or default subnet group became a release-blocking check, so the question could never come up in a customer audit again. Within a quarter the security questionnaire answer flipped from an awkward "we're remediating" to a clean "all data stores run on purpose-built private networks."

That's the right outcome state. The goal isn't to chase individual findings to zero by hand — it's to make "deliberate network design" the default so this class of finding stops being generated. The posture report becomes a confidence signal rather than a recurring to-do.

Why this is on the report at all

This is a High-severity security finding with no associated cost — the fix is free and one-time. It's on the report because it represents a meaningful, easily-avoided exposure: a high-value data store placed in the most permissive part of the network. The risk is real, the remediation is cheap, and that combination makes leaving it unaddressed an unforced error.

The reason to care beyond the single finding is what it signals. A clean record here means infrastructure is being built from deliberate, hardened patterns. A recurring stream of these findings means new resources are being deployed on whatever AWS provides by default — and the same gap that puts a cache on the default network tends to show up across the rest of the estate. This control sits squarely in the security and audit-readiness conversation, and it's exactly the sort of thing enterprise customers probe before they trust you with their data.

The leadership move on this category

The handle for an executive isn't to chase individual findings — it's to make deliberate network design the default so this class of finding stops being generated.

1. Insist new infrastructure starts from hardened templates

The reason a cache ends up on default networking is that someone took the path of least resistance. Mandate that data stores are provisioned from reviewed infrastructure-as-code patterns with private networking baked in, so the easy path and the safe path are the same path.

2. Make 'no default VPC for production' a non-negotiable

Default VPCs are for experimentation, not production data. A simple, enforceable rule — nothing holding real data runs in a default VPC or default subnet group — eliminates this finding and a cluster of related ones at the source, without any need for technical detail at your level.

3. Ask for the trend as a confidence signal

"Are High-severity network findings flat at zero, or do they keep reappearing?" is a one-minute review item that tells you whether infrastructure discipline is real or aspirational. A sustained zero after the initial cleanup means the guardrails are working and attention belongs elsewhere.

Quick quiz

Question 1 of 5

You're reviewing the security posture pack. High-severity network findings have been flat at zero for three quarters, and engineering reports new data stores are provisioned from reviewed templates with private networking. What's the right read?


That's the lesson. Two takeaways: a cache on the default subnet group is a high-value data store in the network's most open zone, and the fix is free and one-time. The leadership move is making deliberate network design the default so this class of finding stops being generated — then watching the trend as a confidence signal.
