Compliance

Remove long-stopped EC2 instances

Security Hub EC2.4: instances stopped for more than 30 days are usually abandoned. They keep accruing EBS charges while their AMIs and snapshots go stale.


Last reviewed 27 May 2026

Remediates AWS Security Hub: EC2.4

Long-stopped EC2 instances: the basics

Why "stopped" doesn't mean "safe"

When you stop an EC2 instance, AWS shuts down the OS, releases the underlying host capacity, and stops charging you for compute. The instance record stays in your account along with its attached EBS volumes, network interfaces, tags, and IAM instance profile — frozen in time at the moment it stopped. Restarting the instance brings it back exactly as it was: same kernel, same packages, same secrets, same AMI baseline.

That "exactly as it was" is the problem. An instance stopped for six months has missed six months of security patches. Its baked-in IAM credentials, SSM agent version, OS package signatures, and TLS root store are all six months stale. The AMI it was launched from may have been deregistered or marked as deprecated by the vendor. If something in your fleet auto-starts it for a one-off test or a stuck pipeline reruns it, you've just put an unpatched, unmonitored host on your network.

AWS Security Hub flags this pattern under control EC2.4 — "Stopped EC2 instances should be removed after a specified time period." The default threshold is 30 days; the rationale is hygiene as much as cost. After 30 days the instance has either been deliberately preserved (rare, usually a one-off forensic snapshot) or quietly abandoned (the common case). Either way it deserves a decision, not an indefinite limbo.

In this lesson you'll learn how to identify long-stopped EC2 instances, how to tell abandoned from intentionally-preserved, and how to safely retire them without losing data anyone still cares about. You'll see the AWS CLI investigation pattern, a decision matrix for what to do per instance, and a snapshot-then-terminate flow that leaves you with recoverable artifacts if you ever need them back.

Fun fact

The 4am restart that wasn't

In one well-known retrospective, a SaaS company traced a 4am outage to an instance that had been stopped for 14 months. A misconfigured Auto Scaling group, after a cooldown timer fired, picked the stopped instance to "replace" a healthy one and started it up. It came back running an OS three major versions behind, pulled bad config from a deprecated S3 bucket, and answered traffic for 11 minutes before health checks killed it. The fix was one line — but the lesson was: stopped is not deleted, and AWS will happily turn it back on for you.

Cleaning up long-stopped instances in action

Marco runs platform operations at a fintech. A quarterly compliance scan returns 47 EC2.4 findings across three accounts — 47 instances stopped for more than 30 days, the oldest sitting at 412 days. Severity is MEDIUM but the auditor flagged it as a recurring item from the last review, which moves it up his queue fast.

He doesn't just bulk-terminate. Some of these are deliberate — a security forensics box held for a pending investigation, a snapshot-source instance used as a base AMI builder. Some are clearly abandoned — old developer sandboxes, half-finished proof-of-concept work whose owner left the company a year ago. The difference is in the tags and the StateTransitionReason.

He starts by listing every stopped instance with the dates that matter.

First, list every stopped instance in the region with launch time, state-transition reason, and the user-applied Owner tag (if any).

$ aws ec2 describe-instances --filters Name=instance-state-name,Values=stopped --query "Reservations[*].Instances[*].{Id:InstanceId,Launched:LaunchTime,Stopped:StateTransitionReason,Owner:Tags[?Key=='Owner']|[0].Value}" --output table

┌─────────────────────┬──────────────────────┬───────────────────────────────────┬───────────────┐
│ Id                  │ Launched             │ Stopped                           │ Owner         │
├─────────────────────┼──────────────────────┼───────────────────────────────────┼───────────────┤
│ i-0a1b2c3d4e5f60001 │ 2024-03-12T09:14:00Z │ User initiated (2025-01-08 16:02) │ data-platform │
│ i-0a1b2c3d4e5f60002 │ 2023-11-04T12:30:00Z │ User initiated (2024-08-22 11:17) │ (none)        │
│ i-0a1b2c3d4e5f60003 │ 2026-04-19T08:05:00Z │ User initiated (2026-04-22 18:40) │ ami-builder   │
│ i-0a1b2c3d4e5f60004 │ 2024-09-01T14:22:00Z │ User initiated (2025-06-14 10:55) │ ex-employee   │
└─────────────────────┴──────────────────────┴───────────────────────────────────┴───────────────┘

# Two over a year stopped, one with no owner, one belonging to someone who left. Only the ami-builder is recent.

Stopped instances across the region — age and ownership at a glance.

For each candidate to retire, snapshot every attached EBS volume first so the data is recoverable. This is the safety net that lets you terminate without losing sleep.

$ aws ec2 create-snapshots --instance-specification InstanceId=i-0a1b2c3d4e5f60002,ExcludeBootVolume=false --description "EC2.4 cleanup pre-terminate i-0a1b2c3d4e5f60002" --tag-specifications 'ResourceType=snapshot,Tags=[{Key=Purpose,Value=ec2.4-cleanup},{Key=SourceInstance,Value=i-0a1b2c3d4e5f60002}]'

{
  "Snapshots": [
    { "SnapshotId": "snap-0aa11bb22cc33dd01", "VolumeId": "vol-09887766554433221", "State": "pending", "VolumeSize": 100 },
    { "SnapshotId": "snap-0aa11bb22cc33dd02", "VolumeId": "vol-09887766554433222", "State": "pending", "VolumeSize": 250 }
  ]
}

# Both volumes captured — tagged for traceability, recoverable for 90 days under our lifecycle policy.

Crash-consistent, multi-volume snapshot of every attached volume before terminating.

How EC2.4 detects long-stopped instances: deep dive

Security Hub control EC2.4 is backed by the AWS Config managed rule ec2-stopped-instance. The rule evaluates every EC2 instance in your account on a configurable cadence (default: every 24 hours) and compares the time since the most recent state-transition into stopped against the AllowedDays parameter. The default is 30 days; you can dial it up or down per-rule and per-account to match your operational reality.

The data point the rule reads is StateTransitionReason, a string AWS records on the instance when it changes state and returns in the describe-instances output (it is not served by the instance metadata service). For user-triggered stops the string typically reads User initiated (YYYY-MM-DD HH:MM:SS GMT); stops triggered in other ways (for example by an Auto Scaling group, a Spot interruption, or a host failure) carry different strings. The rule parses that timestamp and does the math. There is no separate "stopped at" field, which is why scripts often read StateTransitionReason directly.

Crucially, the rule fires on the instance only. It doesn't surface the attached EBS volumes, AMIs created from the instance, or the snapshots behind those AMIs. That's why a complete remediation means walking the dependency graph yourself: snapshot the volumes, deregister any AMIs built from the instance, then terminate it. If you skip the AMI step, stale images stay launchable after the instance is gone, and deregistering them later still leaves their backing snapshots behind (and billing) until you delete those too.

# Approximate the rule's input: list every stopped instance with its StateTransitionReason, which embeds the stop timestamp.
aws ec2 describe-instances \
  --filters Name=instance-state-name,Values=stopped \
  --query "Reservations[].Instances[].[InstanceId, StateTransitionReason]" \
  --output text
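The command above returns the raw reason string, not an age. As a rough sketch of the "parse the timestamp and do the math" step, the helper below turns one StateTransitionReason value into whole days since the stop. The function name days_since_stop is hypothetical, and the sketch assumes the User initiated (YYYY-MM-DD HH:MM:SS GMT) shape and a date command that accepts -d (GNU coreutils); it is not the AWS Config rule's actual implementation.

```shell
#!/usr/bin/env bash
# Sketch only: derive "days since stop" from a StateTransitionReason string.
# Assumes the "... (YYYY-MM-DD HH:MM:SS GMT)" shape and a date(1) that
# supports -d. days_since_stop is a hypothetical helper name.

days_since_stop() {
  local reason=$1
  local now=${2:-$(date -u +%s)}   # optional second arg: "now" in epoch seconds
  local ts stopped

  # Extract the timestamp between the parentheses; empty if the shape differs.
  ts=$(printf '%s\n' "$reason" |
    sed -n 's/.*(\([0-9-]\{10\} [0-9:]\{8\}\) GMT).*/\1/p')
  [ -n "$ts" ] || return 1

  # -u makes date read the timestamp as UTC, matching the GMT suffix.
  stopped=$(date -u -d "$ts" +%s) || return 1
  echo $(( (now - stopped) / 86400 ))
}

# Example: compare the result against an AllowedDays threshold of 30.
# days=$(days_since_stop "User initiated (2025-01-08 16:02:11 GMT)") &&
#   [ "$days" -gt 30 ] && echo "non-compliant"
```

Passing "now" as an optional argument keeps the helper easy to check against known dates; a reason string in any other shape makes it return a non-zero status instead of guessing.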

What is the impact of leaving instances stopped indefinitely?

The first impact is security hygiene. A stopped instance is a frozen attack surface: missing patches accumulate, baked-in IAM credentials and SSH keys may have been rotated organisation-wide while this instance kept its originals, and the AMI it was launched from may now be flagged as deprecated. Bring it back online and you've just attached an unmonitored, out-of-policy host to the VPC.

The second impact is cost, covered in detail in the related lesson on Stopped EC2 Instances with EBS, but worth restating: stopping the compute doesn't stop the EBS billing. A 250 GB gp3 volume sitting on a stopped instance costs roughly $20/month indefinitely (250 GB × $0.08/GB/month); multiply that by the dozens of stopped instances most accounts accumulate and you have a four- or five-figure annual line item with zero workload behind it.

The third impact is compliance and audit posture. EC2.4 is a recurring finding type — auditors check it on every visit. Open findings that have been outstanding for multiple quarters become evidence of weak operational hygiene, which raises questions about every other control in scope. "Why haven't you remediated this?" is a much harder conversation than "Here's our automated retire-after-30-days policy."

The fourth impact is operational risk: ASGs and orchestration tooling can sometimes restart stopped instances by mistake (a misconfigured replacement strategy, a stale launch-template reference, a manual recovery script). When that happens, the unpatched ghost rejoins the network and starts answering traffic before anyone realises what's running.

How do you safely retire long-stopped instances?

Retiring a stopped instance is a four-step loop. The order matters — you want a recoverable artifact before anything destructive, and a prevention rule in place so the same drift doesn't refill the queue next quarter.

1. Inventory and classify by intent

Pull every stopped instance with its LaunchTime, StateTransitionReason, and tags. Apply a decision matrix: recently stopped (<30 days) by an automation → leave it alone; stopped months ago with an owner tag → ping the owner once with a tag-and-warn label and an ExpiresAt date; stopped months ago with no owner or an owner who left → schedule for retirement; long-running before stop and still relevant to the business → schedule a patched re-launch from a current AMI rather than restarting the stale one.

2. Snapshot every attached EBS volume

Use aws ec2 create-snapshots --instance-specification to take crash-consistent snapshots of the boot and data volumes in one call. Tag the snapshots with Purpose=ec2.4-cleanup and SourceInstance=<id> so you can find them later if someone needs the data. Set a snapshot lifecycle policy that retains them for 90 days: long enough to recover from a wrong call, short enough that the snapshot bill doesn't replace the volume bill.

3. Deregister dependent AMIs, then terminate

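If the instance has been used as an AMI source, deregister those AMIs first (aws ec2 deregister-image). Deregistering does not delete the snapshots behind an AMI, so decide at the same time whether to delete or retain them rather than discovering them on a later bill. Then call aws ec2 terminate-instances for the instance itself. Terminating deletes each attached EBS volume whose DeleteOnTermination flag is true; volumes with the flag set to false survive as unattached volumes and keep billing until you delete them. The EC2.4 finding closes on the next AWS Config evaluation.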

4. Prevent recurrence with AWS Config and tagging policy

Enable the AWS Config managed rule ec2-stopped-instance with AllowedDays set to your real threshold (30 is the default; many teams use 14 in non-prod). Add a tagging policy: every instance must carry Lifecycle=temporary|permanent and, when temporary, ExpiresAt=YYYY-MM-DD. A scheduled EventBridge rule then invokes a Lambda function that reads the tag and runs the snapshot-and-terminate flow on the expiry date, so drift never accumulates past the explicit intent.
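The decision matrix in step 1 can be sketched as a small function. This is an illustration under stated assumptions, not a prescribed policy engine: classify_instance, its argument order, the ALLOWED_DAYS variable, and the action labels are all hypothetical names, and the "stopped by an automation" nuance is collapsed into a plain age check.

```shell
#!/usr/bin/env bash
# Sketch of the step 1 decision matrix. All names here are hypothetical.
# Usage: classify_instance DAYS_STOPPED OWNER [OWNER_ACTIVE] [STILL_NEEDED]
#   OWNER         value of the Owner tag, or "" when the tag is missing
#   OWNER_ACTIVE  "yes" (default) or "no" when the owner has left
#   STILL_NEEDED  "yes" when the workload still matters; default "no"

classify_instance() {
  local days=$1 owner=$2 owner_active=${3:-yes} still_needed=${4:-no}
  local threshold=${ALLOWED_DAYS:-30}

  if [ "$days" -lt "$threshold" ]; then
    echo "leave"                        # recently stopped: no action yet
  elif [ "$still_needed" = "yes" ]; then
    echo "relaunch-from-current-ami"    # rebuild patched, don't restart stale
  elif [ -z "$owner" ] || [ "$owner_active" != "yes" ]; then
    echo "retire"                       # no owner, or the owner has left
  else
    echo "tag-and-warn"                 # ping the owner, set ExpiresAt
  fi
}

# Example: an unowned instance stopped for 400 days.
# classify_instance 400 ""
```

Keeping the output to a single action label per instance makes the result easy to write back as a tag or to feed into a review spreadsheet.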

# Snapshot, deregister, terminate: the safe retire flow for one instance.
INSTANCE=i-0a1b2c3d4e5f60002

aws ec2 create-snapshots \
  --instance-specification "InstanceId=$INSTANCE,ExcludeBootVolume=false" \
  --description "EC2.4 cleanup pre-terminate $INSTANCE" \
  --tag-specifications "ResourceType=snapshot,Tags=[{Key=Purpose,Value=ec2.4-cleanup},{Key=SourceInstance,Value=$INSTANCE}]"

# Wait for the snapshots to complete before doing anything destructive.
aws ec2 wait snapshot-completed \
  --filters "Name=tag:SourceInstance,Values=$INSTANCE"

# Find AMIs tagged as derived from this instance and deregister them.
# This relies on your AMIs carrying a SourceInstance tag.
aws ec2 describe-images --owners self \
  --filters "Name=tag:SourceInstance,Values=$INSTANCE" \
  --query 'Images[].ImageId' --output text | \
  xargs -n1 -r aws ec2 deregister-image --image-id

aws ec2 terminate-instances --instance-ids "$INSTANCE"
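For the ExpiresAt tag described in step 4, the expiry decision itself is a date comparison. A minimal sketch, assuming the tag value is a plain YYYY-MM-DD string and that an instance counts as expired on the expiry date itself; is_expired is a hypothetical helper, not part of any AWS tooling.

```shell
#!/usr/bin/env bash
# Sketch: decide whether an ExpiresAt=YYYY-MM-DD tag value has been reached.
# is_expired is a hypothetical helper name.
# Usage: is_expired EXPIRES_AT [TODAY]
#   exit 0 = expired, exit 1 = not yet, exit 2 = malformed tag value

is_expired() {
  local expires_at=$1
  local today=${2:-$(date -u +%F)}   # optional second arg: today as YYYY-MM-DD

  # Refuse anything that is not YYYY-MM-DD so a typo never triggers a terminate.
  case $expires_at in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
    *) return 2 ;;
  esac

  # ISO dates sort lexicographically, so a string comparison is enough:
  # expired when ExpiresAt is not later than today.
  [ ! "$expires_at" \> "$today" ]
}

# Example:
# if is_expired "2026-05-01"; then echo "run snapshot-and-terminate"; fi
```

Returning a distinct status for malformed values lets the calling automation route those instances to a human instead of treating them as either expired or safe.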


You've completed Remove long-stopped EC2 instances. You can now identify stopped instances by age and intent, snapshot their data for recoverability, deregister dependent AMIs cleanly, terminate without orphaning resources, and prevent the same drift from refilling your queue next quarter. The next time EC2.4 fires on 47 instances, you'll have a four-step loop ready to run.


Long-stopped EC2 instances: the cost and compliance basics

Stopped compute still bills — and accumulates audit liability over time

Stopping an EC2 instance halts compute charges but does not stop the EBS volume billing attached to it. A 250 GB gp3 volume runs roughly $20/month whether the instance it belongs to is on or off. Accounts that accumulate dozens of stopped instances — a common pattern in teams with active development cycles — are often carrying hundreds of dollars a month in volume costs with zero workload behind them. Security Hub EC2.4 surfaces this as a Medium-severity finding after 30 days of stoppage.

The 30-day threshold is the FinOps signal. An instance stopped for more than a month has almost certainly been abandoned rather than deliberately preserved. Every day past that threshold is unnecessary spend on EBS storage, plus the accumulation of stale AMIs, orphaned snapshots, and network interfaces that inflate the inventory and the bill further.

From a chargeback standpoint, long-stopped instances are difficult to allocate — they often outlive the project or person who created them, carry incomplete tags, and sit in accounts with no obvious cost owner. Making EC2.4 a tracked metric gives FinOps practitioners a concrete lever: identify, classify, snapshot-then-terminate, and prevent recurrence. The payback is immediate (EBS billing stops at termination) and the audit trail is clean.

This lesson is for the finance partner who wants to understand what EC2.4 findings mean on the cloud bill and how to act on them. You'll get a plain-English explanation of why stopped instances keep costing money, how to estimate the monthly waste from EBS volumes on stopped instances, the classification approach that separates genuine waste from intentional holds, and the governance mechanism — tagging, an expiry-date policy, and a tracked metric — that prevents the queue from refilling. No AWS CLI required; the focus is on cost framing, chargeback clarity, and audit-ready remediation.


How a finance partner frames the EC2.4 cleanup

Dana is the FinOps lead for a SaaS company. At the monthly cloud cost review, she notices a persistent EBS line item that hasn't tracked with any workload change. She pulls the EC2.4 findings: 31 instances stopped for more than 30 days across two accounts. The stopped compute costs nothing — but the attached EBS volumes are billing at roughly $18–22 per volume per month. Across 31 instances, many with multiple volumes, the total is over $900/month with zero output.

Dana doesn't treat it as a bulk delete. She asks engineering to tag each instance against a decision: intentional hold with an owner and expiry date, or retire. The goal is a chargeback-ready inventory: every retained instance has a cost centre, an owner, and an explicit expiry. Every retired instance has a snapshot as the audit trail and a termination record so the EBS billing drop is traceable to a decision.

Her takeaway for the next budget review is concrete: terminating the abandoned instances saves approximately $800/month in EBS storage immediately, with the remaining $100/month allocated to two documented holds with expiry dates. The EC2.4 finding count dropped from 31 to 2, both suppressed with recorded justifications. The saving is realised in the next billing cycle and is easy to show in the unit economics report.

Why this matters to the budget and the waste report

The cost impact of long-stopped instances is almost entirely EBS storage. A gp3 volume bills at $0.08/GB/month regardless of whether its attached instance is running. A 250 GB data volume on an instance stopped for six months has cost $120 in storage (250 × $0.08 × 6) with zero workload behind it. Multiply that across 30–50 stopped instances, the typical count in an active engineering org, and the monthly waste figure is measurable and eliminable without affecting any running system.
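The arithmetic above can be reproduced with a few lines of shell, no AWS access needed. This is a sketch using the $0.08/GB/month gp3 rate quoted in this lesson as a default; ebs_cost is a hypothetical helper name, and the rate should be replaced with the one on your own bill.

```shell
#!/usr/bin/env bash
# Sketch: storage cost of a volume over a number of months.
# ebs_cost is a hypothetical helper; 0.08 is the gp3 rate quoted in the lesson.
# Usage: ebs_cost SIZE_GB MONTHS [RATE_PER_GB_MONTH]

ebs_cost() {
  awk -v gb="$1" -v months="$2" -v rate="${3:-0.08}" \
    'BEGIN { printf "%.2f\n", gb * rate * months }'
}

# Examples:
# ebs_cost 250 1    # one month of a 250 GB gp3 volume
# ebs_cost 250 6    # the same volume over six months stopped
```

Calling it with one month gives the recurring figure and with the months stopped gives the amount already spent, which are the two numbers a budget conversation usually needs.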

The FinOps insight is that this waste is invisible in standard cost dashboards because it sits in EBS storage spend rather than EC2 compute spend. Teams that monitor "are my EC2 costs down?" after a sprint ends miss the EBS tail entirely. EC2.4 is the signal that makes it visible: each finding represents at least one volume billing with no owner consuming it.

There is also an AMI and snapshot component. Instances stopped for months often have derived AMIs and manual snapshots that are never cleaned up. Those aren't free — snapshots bill at $0.05/GB/month. A 100 GB root volume snapshot left indefinitely costs $5/month forever. Multiply by the number of AMIs and snapshots in a typical account and the number adds up. The safe retire flow — snapshot before terminating, then set a lifecycle policy to expire that snapshot in 90 days — converts an indefinite cost into a bounded one.

From a chargeback standpoint, abandoned instances are the hardest category to allocate: they often lack owner tags, their projects are closed, and their cost centre is ambiguous. Every EC2.4 finding that goes unaddressed is a month of unallocated spend. Clearing the queue and tagging the survivors is the fastest path to clean cost allocation in the EC2 category.

What finance can drive on EC2.4 remediation

Finance doesn't run the CLI commands, but it owns the cost framing that shapes how engineering prioritises and documents the retirement. Four levers, applied at the monthly cadence.

1. Quantify the EBS waste before the remediation conversation

Pull the stopped-instance list and multiply each attached volume's size in GB by $0.08 (the gp3 rate) to get the recoverable monthly waste; multiply again by the months stopped for the amount already spent. Present both as dollar numbers, not as a Security Hub finding count. A finding count of 47 is abstract; $1,100/month in EBS billing with zero workload behind it is a budget conversation.

2. Require owner tags and cost-centre allocation before any exception is granted

If an instance is being intentionally preserved, it needs an owner name, a cost centre, and an ExpiresAt date. Without those three tags, it is unallocated spend. This converts the decision from 'keep or delete?' into 'who is paying for this and for how long?' — which is the right question for both chargeback accuracy and budget transparency.

3. Track the post-termination EBS saving as a realised FinOps saving

Each batch of terminated instances reduces the EBS line item in the next billing cycle. Tag that reduction in the monthly cost report with a reference to the EC2.4 remediation effort. That creates an audit trail — the saving is attributable, quantified, and linked to a governance action rather than appearing as an unexplained variance.

4. Set a snapshot retention budget as part of the retire flow

The safety-net snapshots taken before termination also cost money — $0.05/GB/month indefinitely if no lifecycle policy is set. Budget 90 days of snapshot retention as part of the remediation cost, then expire them. A 100 GB root volume snapshot costs roughly $5/month; 90-day retention is $15 total, well worth the recoverability window. After that the cost should drop to zero and the saving is fully realised.
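Levers 1 and 4 can be put side by side for a whole batch of retirements. The sketch below reads a simple inventory (one line per attached volume) and prints the monthly volume cost being recovered next to the one-off snapshot retention budget. The retire_budget name and the inventory layout are hypothetical, the rates are the ones quoted in this lesson, and billing snapshots at the full volume size is an upper-bound assumption, since snapshots only store the blocks actually written.

```shell
#!/usr/bin/env bash
# Sketch: recoverable monthly EBS cost vs. one-off snapshot retention budget.
# retire_budget and the inventory format are hypothetical.
# Usage: retire_budget [RETENTION_DAYS] < inventory
# Inventory lines: "<instance-id> <volume-size-gb>", one per attached volume.

retire_budget() {
  awk -v days="${1:-90}" -v vol_rate=0.08 -v snap_rate=0.05 '
    NF >= 2 { gb += $2 }                       # sum the size column
    END {
      printf "total_gb=%d\n", gb
      printf "monthly_volume_cost=%.2f\n", gb * vol_rate
      # Upper bound: assumes each snapshot is as large as its volume.
      printf "snapshot_budget=%.2f\n", gb * snap_rate * days / 30
    }'
}

# Example with the two volumes from the engineering walkthrough:
# printf '%s\n' "i-0a1b2c3d4e5f60002 100" "i-0a1b2c3d4e5f60002 250" | retire_budget
```

Reporting the two figures together shows how quickly the retention cost is paid back by the volume charges that stop at termination.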


You've finished the finance partner's view of EC2.4. You know that the real cost of long-stopped instances is in EBS storage billing — not compute — and that the waste is quantifiable per finding. You have the four FinOps levers: put a dollar number on the EBS waste before the engineering conversation, require owner tags and cost-centre allocation before any exception is granted, record the post-termination saving as a realised reduction, and budget a 90-day snapshot window as the recoverability cost so the saving is fully booked after that. Next time EC2.4 shows up, you'll frame it as a cost recovery opportunity with a traceable audit trail, not just a compliance checkbox.


Long-stopped EC2 instances: what leadership needs to know

Abandoned infrastructure is a governance problem as much as a cost one

When an EC2 instance is stopped rather than terminated, it sits in your cloud account indefinitely, frozen at the state it was in when someone last touched it. AWS Security Hub flags instances that have been stopped for more than 30 days under control EC2.4. The typical finding is not a one-off edge case; it is a symptom of weak operational hygiene at scale. Most accounts accumulate these across project endings, team changes, and sprint leftovers.

The governance concern is that a stopped instance is not inert. Its IAM credentials, OS, and packages age without patching. Cloud orchestration tooling can restart it without a human decision. And because its cost sits in EBS storage rather than visible compute spend, it tends to be overlooked in budget reviews. EC2.4 forces an active decision — retire it with an audit trail, or explicitly record that it is being preserved and why.

The right end state is a policy, not a cleanup campaign: instances that are not actively used are terminated on a defined schedule, with snapshots taken for recoverability. That makes every resource in the account an intentional, owned asset rather than an inherited liability.

A short read for the leader who wants to understand what EC2.4 represents and what a mature response looks like. You'll learn why "stopped" is not the same as "safe" or "free," why these findings tend to recur without a structural fix, and what a healthy end state looks like: a defined lifecycle policy that terminates idle instances automatically, with recoverable snapshots and an ownership record for every decision. No technical depth — just the governance picture and the one leadership question worth asking.


What it looks like when an organisation gets this right

At one company, the VP of Engineering used to get a quarterly compliance report showing a recurring cluster of EC2.4 findings that never fully cleared. The pattern was always the same: engineers stopped instances at the end of projects and moved on, and nobody made the termination decision because it felt risky — what if someone needed the data back?

After establishing a lifecycle policy — every instance gets an ExpiresAt tag on creation, automated snapshots run before the expiry date, and termination fires automatically — the pattern broke. The EC2.4 queue no longer refills. The few instances that are intentionally held carry an owner's name, a cost-centre tag, and a reviewed expiry date. The ones with no owner and no tag are retired within 72 hours of the policy running.

The VP's question at the quarterly review changed from 'why do we keep having this finding?' to 'how many instances did the policy retire this quarter and how much did we recover?' That's the right question. It means the organisation is governed by a policy, not by individual decisions that accumulate into technical debt.

Why this is on the leadership agenda

Long-stopped instances are a proxy for organisational hygiene. In a well-run engineering culture, resources are owned by named teams, have explicit lifecycles, and are terminated when their purpose ends. When EC2.4 shows dozens of instances stopped for months or years, it is evidence that the lifecycle discipline is missing — resources are created on demand and abandoned when the work shifts, with no cleanup norm.

The direct operational risk is the accidental restart. AWS orchestration tools — Auto Scaling Groups, recovery scripts, CI/CD pipelines with stale configuration — can restart a stopped instance without a human making that decision. An instance that has been stopped for 14 months and gets restarted is running an OS and application stack that have missed hundreds of security patches. It joins your production VPC and starts answering traffic before any monitoring system flags it as anomalous. EC2.4 findings are not theoretical; they are a list of assets that could become that unpatched ghost.

The compliance dimension is also direct. EC2.4 is a recurring Security Hub finding. When the same findings appear across multiple quarterly reviews without remediation, auditors treat it as evidence of weak operational control — not just on EC2 lifecycle, but as a proxy for the overall maturity of the cloud governance programme. A finding that has been open for 18 months is a harder conversation than a policy that retires instances within 30 days automatically.

The leadership move on EC2.4

The executive action isn't to approve each termination individually — it's to establish the policy that removes the decision from the individual engineer's plate.

1. Set a default lifecycle: instances expire unless explicitly preserved

Policy should require that every EC2 instance carries an ExpiresAt tag at launch or within its first operational week. Instances without an expiry date after 30 days of stoppage are candidates for automated retirement. That flips the default from 'keep until someone deletes' to 'expire unless someone extends' — the one change that stops the queue from refilling every quarter.

2. Accept a snapshot window as the safety net, not a reason to delay

The common objection to automated termination is 'what if we need the data?' The answer is a tagged snapshot retained for 90 days. Any data recovery need in that window is addressed from the snapshot, not from the stopped instance. Establishing that as the accepted recovery model removes the blocking concern and lets the policy run.

3. Ask for the EC2.4 queue length as a single health signal

At the quarterly review, one question covers it: how many instances have been stopped for more than 30 days with no owner tag and no recorded expiry? A number trending toward zero means the policy is working. A number holding steady or growing means either the tagging policy isn't enforced or the automation isn't running. Either is an operational process question, not a technical one.


That's the lesson. The core insight is that EC2.4 is a lifecycle governance problem, not a one-time cleanup task. If the finding queue refills every quarter, the org lacks a policy — and the fix is an automated expiry-and-retire process, not another manual sprint. The three leadership signals to track: the EC2.4 queue length trending toward zero, every retained instance carrying an owner and an expiry date, and the EBS saving from retirements appearing as a measurable line in the monthly FinOps report. When those three are in place, the finding stops being a recurring audit liability and becomes evidence of a mature cloud operations posture.


Part of the learning path Right-size your compute
