
Set retention on CloudWatch log groups

By default log groups keep data forever — set a retention policy and let AWS prune the long tail automatically.

Log retention: the basics

What does it mean for a log group to have no retention policy?

Every CloudWatch log group has a retentionInDays setting. When you create a group manually it defaults to null — which AWS renders in the console as "Never Expire." That's not marketing language; it is literally the policy. Logs ingested today will sit in the group until somebody, sometime, deletes them by hand.

Most log groups in a typical AWS account weren't created by hand. Lambda creates one the first time a function logs. API Gateway execution logging, ECS, EKS, RDS Enhanced Monitoring, VPC Flow Logs, CodeBuild, Step Functions, and AppSync all create groups on first use or as a side effect of a setup wizard, and unless someone sets retention explicitly, every one of those groups is born with none. They quietly grow for years before anyone runs a bill audit.

Wastage checks flag any log group with retentionInDays = null (or, less commonly, a retention that's obviously wrong for the data — debug logs kept for 10 years, compliance logs kept for 3 days). The fix is one API call per group. The hard part isn't applying it; the hard part is deciding, for each group, what "right" actually is.

In this lesson you'll learn how CloudWatch Logs is actually billed, why ingestion almost always dominates the line item but storage compounds over years, how to pick retention windows by log purpose, and how to bulk-fix an account that's been running with no policy for a while. You'll also see how to stop the bleeding so newly auto-created log groups don't start the cycle over again.

Fun fact

Lambda's never-expiring legacy

Lambda has auto-created log groups since 2014, and every one of them is born with Never Expire. In late 2023 AWS added advanced logging controls that let you point a function at a log group you created yourself, with retention already set, but that only helps functions you configure that way. Every legacy group still sits at infinite retention until someone fixes it. Mature accounts can have 40,000+ Lambda log groups, most empty or near-empty, a few hoarding gigabytes of data nobody has read since 2019.

Setting retention in action

Nina runs platform infra at a SaaS company. The CloudWatch line on the monthly invoice has crept from $400 to $2,800 over three years. Ingestion explains most of it — they ship more logs than they used to — but storage is now a third of the bill and growing.

She lists every log group in the production account and pipes it through jq to count how many have no retention. The answer is 4,612 out of 5,108. The largest single offender is a /aws/lambda/legacy-image-processor group from 2021 holding 38 GB of debug output for a function that was decommissioned 18 months ago.

She applies a default policy: 30 days for application logs, 7 days for debug groups, 400 days for the handful of audit groups. Storage drops by 71% over the next two weeks as AWS prunes the long tail automatically.

First, list every log group in the account and find the ones with no retention set.

$ aws logs describe-log-groups --query 'logGroups[?retentionInDays==`null`].[logGroupName,storedBytes]' --output table
----------------------------------------------------------------------
|                         DescribeLogGroups                          |
+----------------------------------------------------+---------------+
|  /aws/lambda/legacy-image-processor                |  40802189312  |
|  /aws/lambda/billing-cron                          |  12533493760  |
|  /aws/apigateway/welcome                           |  94371840     |
|  /aws/codebuild/frontend-deploy                    |  3221225472   |
|  /ecs/payments-service                             |  8589934592   |
|  /aws/lambda/old-stripe-webhook                    |  1073741824   |
+----------------------------------------------------+---------------+
# 4,612 groups returned. These 6 alone hold about 62 GB; at $0.03/GB-mo that is roughly $1.85/mo, every month, forever.

Log groups with retentionInDays = null and their stored byte counts.
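
To sanity-check figures like these, the stored byte counts can be converted into a monthly storage charge. The following is a minimal sketch: `monthly_storage_usd` is a hypothetical helper (not an AWS CLI command), it assumes the $0.03/GB-month Standard rate used in this lesson, and it treats a GB as 1024^3 bytes.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the AWS CLI): convert a storedBytes value
# into an approximate monthly storage charge.
# $1 = bytes, $2 = price per GB-month (assumed default: 0.03, the Standard rate).
monthly_storage_usd() {
  awk -v bytes="$1" -v rate="${2:-0.03}" \
    'BEGIN { printf "%.2f\n", bytes / (1024 * 1024 * 1024) * rate }'
}

# Sum of the six storedBytes values in the listing above.
top6_bytes=$((40802189312 + 12533493760 + 94371840 + 3221225472 + 8589934592 + 1073741824))

monthly_storage_usd "$top6_bytes"   # prints 1.85
```

The per-group number is small; the point of the exercise is that it recurs every month across thousands of groups.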

Now apply a 30-day retention policy to one of them and watch CloudWatch start pruning.

$ aws logs put-retention-policy --log-group-name /aws/lambda/legacy-image-processor --retention-in-days 30
# Command returns no output on success.
$ aws logs describe-log-groups --log-group-name-prefix /aws/lambda/legacy-image-processor
{
"logGroups": [
{
"logGroupName": "/aws/lambda/legacy-image-processor",
"retentionInDays": 30,
"storedBytes": 40802189312
}
]
}
# AWS will prune anything older than 30 days in the next 24-72 hours.

PutRetentionPolicy is idempotent and instant — pruning happens asynchronously.

How CloudWatch Logs is actually billed (deep dive)

CloudWatch Logs has two main charges: ingestion at $0.50/GB and storage at $0.03/GB-month (Standard class, us-east-1 list prices; other regions differ). Ingestion is paid once, at write time, on the uncompressed bytes shipped. Storage is paid every month the data sits there, on the compressed footprint plus a small per-event metadata overhead. For a noisy app the ingestion bill almost always dominates: you can pay $500 ingesting 1 TB in a single day and at most about $30 to store the result for a month.

What changes that math is time. Byte for byte at these prices, cumulative storage cost matches the original ingestion cost after about 17 months ($0.50 / $0.03 ≈ 16.7), and after three years storage has cost more than twice what ingestion did. Compression pushes the real crossover later, but the direction is the same. Accounts that have never set retention often still have logs from 2019 on the bill, and the storage line has quietly grown into a large share of the total. Setting retention doesn't refund the ingestion you've already paid, but it stops the storage meter from running on data you don't read.
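
The break-even arithmetic can be reproduced in a few lines. This is a sketch under stated assumptions: both function names are hypothetical, the prices are the list prices quoted above, and the comparison is byte for byte, ignoring compression of stored data.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for the break-even arithmetic. Prices are passed in
# explicitly; compression of stored data is ignored.

# Months until cumulative storage equals the one-time ingestion charge.
# $1 = ingestion $/GB, $2 = storage $/GB-month
breakeven_months() {
  awk -v ingest="$1" -v store="$2" 'BEGIN { printf "%.1f\n", ingest / store }'
}

# Cumulative storage cost after N months, as a multiple of the ingestion charge.
# $1 = months, $2 = ingestion $/GB, $3 = storage $/GB-month
storage_to_ingest_ratio() {
  awk -v months="$1" -v ingest="$2" -v store="$3" \
    'BEGIN { printf "%.2f\n", months * store / ingest }'
}

breakeven_months 0.50 0.03            # prints 16.7
storage_to_ingest_ratio 36 0.50 0.03  # prints 2.16
```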

In 2023 AWS introduced an Infrequent Access (IA) log class. It halves the ingestion price (about $0.25/GB in us-east-1) while storage stays at the Standard rate, and in exchange you give up features such as Live Tail, metric filters, and subscription filters. The class is chosen when the log group is created and cannot be changed afterwards. IA is the right call for groups you keep purely for forensics or compliance and almost never actually look at; it is the wrong call for anything an on-call engineer needs to grep during an incident.

# Inspect a single group's retention + class + size.
aws logs describe-log-groups \
  --log-group-name-prefix /aws/lambda/payments \
  --query 'logGroups[*].{name:logGroupName, days:retentionInDays, class:logGroupClass, bytes:storedBytes}' \
  --output table

# Create a new low-traffic forensic group in the Infrequent Access class.
# The class is fixed at creation; an existing group cannot be converted.
aws logs create-log-group \
  --log-group-name /aws/security/audit-trail \
  --log-group-class INFREQUENT_ACCESS

What is the impact of never-expiring log groups?

The direct cost is the easy part. A single ECS service shipping 100 MB/day of debug output with no retention set will, after five years, be sitting on roughly 180 GB of stored data: about $5.40/month, forever, for logs nobody will ever read. Multiply by a few hundred services and the storage line on CloudWatch grows past a thousand dollars a month of pure deadweight.

The second-order impact is operational. Logs Insights queries get slower the more data they have to scan, and queries are billed per GB scanned. A team that runs a weekly forensics query across five years of data is paying both for storage and for the query scan, when 99% of investigations only need the last 30 days. Pruning shortens query times and cuts the scan-cost line at the same time.

On the compliance side the calculus reverses. Some logs have to be kept (CloudTrail management events, PCI-DSS audit trails, HIPAA access logs), and storing them in CloudWatch Logs for years at $0.03/GB-month is much more expensive than the standard pattern: ship them to S3 via a subscription filter (typically feeding a Firehose delivery stream), apply an S3 lifecycle rule to transition to Glacier Deep Archive at 90 days, and pay $0.00099/GB-month from there. CloudWatch is for hot, queryable logs; S3 + Glacier is for cold, compliance logs.
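
The S3 side of that pattern might look like the lifecycle configuration below. This is a sketch only: the rule ID, the `cloudwatch-logs/` prefix, and the `example-log-archive` bucket name are hypothetical, and the apply command is left commented out because it changes real resources.

```shell
#!/usr/bin/env bash
# Sketch only: rule ID, prefix, and bucket name are hypothetical placeholders.
# Writes a lifecycle configuration that moves objects to Glacier Deep Archive
# 90 days after creation.
cat > archive-lifecycle.json <<'EOF'
{
  "Rules": [
    {
      "ID": "logs-to-deep-archive",
      "Status": "Enabled",
      "Filter": { "Prefix": "cloudwatch-logs/" },
      "Transitions": [
        { "Days": 90, "StorageClass": "DEEP_ARCHIVE" }
      ]
    }
  ]
}
EOF

# To apply it to the archive bucket:
# aws s3api put-bucket-lifecycle-configuration \
#   --bucket example-log-archive \
#   --lifecycle-configuration file://archive-lifecycle.json
```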

The risk side: a log group with no retention is also a data-retention liability. Personal data, API tokens, or stack traces with customer IDs sit there indefinitely, expanding the blast radius of any IAM mishap that gives someone read access. Retention is a privacy and security control as much as a cost one.

How do you set retention safely across an account?

The fix is a four-step loop: inventory what you have, decide retention by purpose, apply the change, then prevent the next generation of orphan groups.

1. Inventory every log group and its size

Use aws logs describe-log-groups (it paginates — handle the nextToken) and dump everything to JSON. You want logGroupName, retentionInDays, storedBytes, and logGroupClass. Sort by storedBytes descending; the top 1% of groups almost always account for >80% of the storage. Fix those first and you've claimed most of the savings without touching the long tail.

2. Pick retention by purpose, not by service

Debug and ephemeral build logs: 3-7 days. Application logs an on-call needs to grep during an incident: 14-30 days. Business-critical app logs for trend analysis: 90 days. Audit/compliance logs (CloudTrail, IAM Access Analyzer, security tooling): 400 days in CloudWatch, then 7+ years archived to S3 + Glacier. Do not pick a uniform retention for the whole account: it will be either too short for compliance or too long for debug.

3. Bulk-apply with PutRetentionPolicy

Loop your inventory through aws logs put-retention-policy --log-group-name $name --retention-in-days $days. The call is idempotent and rate-limited at 5 TPS per account, so for 5,000 groups budget about 20 minutes. Pruning happens asynchronously over the following 24-72 hours; do not panic if the bill doesn't drop the same day. The Infrequent Access class cannot be applied to an existing group, so for never-actually-queried compliance logs, create a new group with --log-group-class INFREQUENT_ACCESS and point the log source at it.

4. Prevent recurrence

Stop Lambda from auto-creating groups with no retention: create each function's log group yourself, with retention set, in the same IaC module that defines the function, before the function first runs. Enable the AWS Config managed rule cw-loggroup-retention-period-check so any future group born with no retention triggers a non-compliant finding. For greenfield infra, bake retention into the Terraform/CDK module that creates the group: never let aws_cloudwatch_log_group be defined without a retention_in_days argument.
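
The ranking in step 1 can be sketched offline before touching any policy. The helper below is hypothetical and assumes its input is one `name<TAB>storedBytes` pair per line, which is the shape `aws logs describe-log-groups --query 'logGroups[].[logGroupName,storedBytes]' --output text` produces.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read "name<TAB>storedBytes" lines on stdin and print the
# N largest groups plus the share of total stored bytes they hold.
# $1 = how many groups to show (default 10)
top_groups_by_size() {
  sort -t "$(printf '\t')" -k2,2nr |
    awk -F '\t' -v n="${1:-10}" '
      { name[NR] = $1; bytes[NR] = $2; total += $2 }
      END {
        if (total == 0) exit
        for (i = 1; i <= NR && i <= n; i++) {
          top += bytes[i]
          printf "%s\t%.1f GB\n", name[i], bytes[i] / (1024 * 1024 * 1024)
        }
        printf "top %d of %d groups hold %.0f%% of stored bytes\n", (n < NR ? n : NR), NR, 100 * top / total
      }'
}

# Offline demo with three of the groups from the earlier listing.
printf '%s\t%s\n' \
  /aws/apigateway/welcome 94371840 \
  /aws/lambda/legacy-image-processor 40802189312 \
  /ecs/payments-service 8589934592 |
  top_groups_by_size 1
```

Against a real account, pipe the `describe-log-groups` output into `top_groups_by_size 20` and work down the list from the top.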

# Bulk-apply a 30-day default to every group that has no retention set.
aws logs describe-log-groups \
  --query 'logGroups[?retentionInDays==`null`].logGroupName' \
  --output text \
  | tr '\t' '\n' \
  | while read -r LG; do
      [ -n "$LG" ] || continue   # skip blank lines
      echo "Setting 30d on $LG"
      aws logs put-retention-policy \
        --log-group-name "$LG" \
        --retention-in-days 30
      sleep 0.25                 # stay under the 5 TPS limit
    done

# Then enable the Config rule that catches the next one.
aws configservice put-config-rule --config-rule '{
  "ConfigRuleName": "cw-loggroup-retention-period-check",
  "Source": { "Owner": "AWS", "SourceIdentifier": "CW_LOGGROUP_RETENTION_PERIOD_CHECK" },
  "InputParameters": "{\"MinRetentionTime\":\"30\"}"
}'
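
The loop above applies a flat 30 days, which is a reasonable starting default. Step 2 argues for choosing retention by purpose, and one way to sketch that is a small name-based mapping with a dry-run planner. Both functions are hypothetical, and the name patterns are assumptions for illustration that will need adapting to your own naming scheme.

```shell
#!/usr/bin/env bash
# Hypothetical mapping from log group name to a retention window, following the
# purpose-based tiers in step 2. Patterns are checked top to bottom, so the
# first match wins.
retention_days_for() {
  case "$1" in
    /aws/codebuild/*|*debug*)   echo 7   ;;  # ephemeral build and debug output
    *audit*|*cloudtrail*)       echo 400 ;;  # audit and compliance trails
    *)                          echo 30  ;;  # application logs used by on-call
  esac
}

# Dry run: print the command that would be issued for each group name on stdin.
plan_retention() {
  while IFS= read -r lg; do
    [ -n "$lg" ] || continue
    printf 'aws logs put-retention-policy --log-group-name %s --retention-in-days %s\n' \
      "$lg" "$(retention_days_for "$lg")"
  done
}

printf '%s\n' /aws/codebuild/frontend-deploy /aws/security/audit-trail /ecs/payments-service |
  plan_retention
```

Printing the plan first makes it easy to review the chosen windows with the teams that own each group before anything is pruned.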

You've completed Set retention on CloudWatch log groups. You now know that "Never Expire" is the AWS default, why storage compounds even when ingestion dominates the headline, and how to pick retention by log purpose. The next time the CloudWatch line creeps up on an invoice, you'll have a four-step loop — inventory, decide, bulk-apply, prevent recurrence — ready to run.
