
Manage secrets (rotation and hygiene)

One capability across Secrets Manager and Kubernetes: keep credentials rotating on a schedule that actually succeeds, retire the secrets nobody uses, and make sure your stored secrets are genuinely encrypted rather than merely encoded.


Last reviewed 16 June 2026

Remediates AWS Security Hub: EKS.3, SecretsManager.1, SecretsManager.2, SecretsManager.3, SecretsManager.4

Managing secrets: the basics

What does secret hygiene actually cover beyond storing a password?

AWS Secrets Manager is a vault for credentials: database passwords, API keys, OAuth tokens, third-party service credentials. Storing a secret well is only the start. A secret has a lifecycle, and Security Hub turns each part of it into a control. SecretsManager.1 checks that automatic rotation is enabled. SecretsManager.2 checks that the configured rotation actually succeeds, because enabling rotation is not the same as it working. SecretsManager.3 checks that unused secrets are removed. SecretsManager.4 checks that rotation-enabled secrets are actually rotated within a configured window. The estate can fail several at once, but they are one capability: keep credentials fresh, in use, and protected.

One more control in this group lives outside Secrets Manager. EKS.3 checks that Kubernetes secrets in your EKS clusters are envelope-encrypted with a KMS key you control, rather than the base64 encoding Kubernetes uses by default, which is not encryption at all: anyone who can read the secret object or a copy of etcd can decode it instantly. Different store, same discipline: the credentials that gate your data should be genuinely encrypted, replaced on a schedule, and not left lying around when the service that needed them is gone.
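To make the encoding-versus-encryption point concrete, here is a minimal sketch. The encoded value is a made-up example rather than anything from a real cluster, and the snippet assumes the coreutils base64 command is available. Note that no key is involved at any point.

```shell
#!/usr/bin/env bash
# A Kubernetes Secret manifest stores values base64-encoded, for example:
#   data:
#     password: c3VwZXJzZWNyZXQ=
# Decoding needs no key: anyone who can read the object can do this.
encoded='c3VwZXJzZWNyZXQ='   # hypothetical example value
decoded=$(printf '%s' "$encoded" | base64 --decode)
echo "$decoded"   # prints: supersecret
```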

What ties these together is that a long-lived or exposed credential is a leak waiting to happen. The blast radius of a leak grows with the credential's age: six months in, you genuinely cannot enumerate everyone who holds a copy in a log, a laptop, a CI cache or an old environment variable. Rotation bounds that window. Cleanup shrinks the attack surface. Encryption stops a single leaked snapshot from handing over every secret at once. Managing secrets well is about closing each of those gaps before the leak you do not know about becomes the breach you do.

In this lesson you will learn how AWS expresses secret hygiene across rotation, rotation success, cleanup and encryption, how the four-stage rotation Lambda actually works and where it usually breaks, why an unused secret is both a cost line and an attack surface, and why base64 in etcd is encoding rather than encryption. The Controls this lesson covers section lists every Security Hub control in this capability, each linking to a deep page with the exact check and a copy-and-paste fix.

Fun fact

The rotation that failed for 400 days straight

A team enabled 30-day automatic rotation on their primary RDS master credential and moved on. Thirteen months later a routine Security Hub review surfaced a rotation-success finding in a failed state. The rotation Lambda had been placed in a private subnet with no route to the RDS endpoint; every scheduled rotation since day one had timed out at the setSecret step and rolled back. The secret's LastRotatedDate was the day it was created. The password had never actually changed, despite every dashboard showing rotation enabled the entire time, and the CloudWatch logs held 400 days of identical connection-timeout traces that no alarm had ever been wired to. Enabled is not the same as working.

Auditing secret hygiene in action

Marco is finishing a SOC 2 prep cycle when Security Hub fires a batch of secrets findings: several with rotation disabled, one rotation-enabled secret whose last rotation silently failed, and a long list of secrets nobody has read in over 90 days.

Rather than mass-fix, he starts by listing rotation-enabled secrets with the one fact that separates working from broken: how long ago each actually rotated against its configured interval. A LastRotatedDate older than the interval is the fingerprint of a rotation that is enabled but failing, which is more dangerous than one that is simply off.

List rotation-enabled secrets with their last rotation date and configured interval:

$ aws secretsmanager list-secrets --query 'SecretList[?RotationEnabled==`true`].{Name:Name,Rotated:LastRotatedDate,Days:RotationRules.AutomaticallyAfterDays}' --output table
------------------------------------------------------------------
| prod/payments/db-master | 2026-04-02T09:11Z | 30 |
| prod/api/stripe-key     | 2026-05-20T03:00Z | 30 |
| prod/cache/redis-auth   | 2026-05-18T03:00Z | 30 |
------------------------------------------------------------------
# db-master last rotated 54 days ago on a 30-day schedule: rotation is enabled but failing.

Enabled is not the same as working. A last-rotation date older than the interval is the fingerprint of a silently failing rotation; read the Lambda logs to find the failing step.
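The comparison Marco does by eye can be sketched as a small helper. This is an illustration under stated assumptions, not an AWS tool: days_between and rotation_status are hypothetical names, the script relies on GNU date (the -d flag), and the as-of date of 26 May 2026 is inferred from the "54 days ago" note in the sample output.

```shell
#!/usr/bin/env bash
# Classify a rotation as OK or STALE from its last-rotation date and interval.
# Assumes GNU date (-d); function names are illustrative, not AWS APIs.

# days_between <earlier ISO timestamp> <later ISO timestamp> -> whole days elapsed
days_between() {
  local start end
  start=$(date -u -d "$1" +%s)
  end=$(date -u -d "$2" +%s)
  echo $(( (end - start) / 86400 ))
}

# rotation_status <last rotated> <interval in days> <as-of timestamp>
rotation_status() {
  local age
  age=$(days_between "$1" "$3")
  if (( age > $2 )); then
    echo "STALE ${age}d"   # older than the configured interval
  else
    echo "OK ${age}d"
  fi
}

rotation_status 2026-04-02T09:11:00Z 30 2026-05-26T12:00:00Z   # prints: STALE 54d
rotation_status 2026-05-20T03:00:00Z 30 2026-05-26T12:00:00Z   # prints: OK 6d
```

In practice you would feed each row of the list-secrets output into rotation_status and review anything marked STALE first.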

How secret hygiene is evaluated (deep dive)

Every Secrets Manager rotation is driven by a Lambda invoked four times in sequence: createSecret generates a new value staged AWSPENDING, setSecret writes it into the target database or service (where most failures happen, because it needs network reachability and credentials), testSecret connects with the pending value to prove it works, and finishSecret promotes AWSPENDING to AWSCURRENT and demotes the old version to AWSPREVIOUS. If any step throws, AWSCURRENT is left untouched, so applications keep using the old value, which is exactly why a failure is silent. SecretsManager.1 checks that rotation is enabled at all; SecretsManager.4 checks that it has actually run within the configured window; and SecretsManager.2 checks the RotationOccurringAsScheduled flag, which is true only when the last scheduled rotation completed on time.
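The label movement described above can be sketched as a toy model. It makes no AWS calls: simulate_rotation and the version ids v1 and v2 are hypothetical, and the model assumes a failed step simply stops the run with the pending version still staged. Its only purpose is to show why AWSCURRENT does not move when a step fails.

```shell
#!/usr/bin/env bash
# Toy model of staging labels during a rotation. No AWS calls are made;
# version ids and simulate_rotation are hypothetical, for illustration only.

# simulate_rotation <name of the step that fails, or "none">
# Prints which version holds each staging label afterwards.
simulate_rotation() {
  local fail_at=$1
  local current="v1" previous="" pending=""
  local step
  for step in createSecret setSecret testSecret finishSecret; do
    if [ "$step" = "$fail_at" ]; then
      break   # a failed step stops the run before any promotion
    fi
    case $step in
      createSecret) pending="v2" ;;   # new value staged as AWSPENDING
      setSecret)    : ;;              # write the pending value to the target
      testSecret)   : ;;              # connect using the pending value
      finishSecret) previous=$current; current=$pending; pending="" ;;
    esac
  done
  echo "AWSCURRENT=$current AWSPREVIOUS=${previous:-none} AWSPENDING=${pending:-none}"
}

simulate_rotation none        # prints: AWSCURRENT=v2 AWSPREVIOUS=v1 AWSPENDING=none
simulate_rotation setSecret   # prints: AWSCURRENT=v1 AWSPREVIOUS=none AWSPENDING=v2
```

To inspect the real labels on a secret, the VersionIdsToStages field returned by aws secretsmanager describe-secret shows which version currently carries each stage.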

SecretsManager.3 looks at a different signal: LastAccessedDate, which is updated when the secret value is retrieved with GetSecretValue and is absent (null) for a secret never read since creation. Once a secret is judged unused, delete-secret with a 7-to-30-day recovery window puts it in a pending-deletion state that fails any caller loudly but can be undone with restore-secret, which is why the recovery window is the safe default and force-delete-without-recovery is reserved for known-throwaway secrets.
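A triage pass over that signal might look like the sketch below. classify_access is a hypothetical helper, the 90-day threshold and the dates are illustrative, and GNU date is assumed. It treats a missing LastAccessedDate as its own category, because a plain date comparison would typically skip secrets where the field is absent.

```shell
#!/usr/bin/env bash
# Triage a secret by its LastAccessedDate. Assumes GNU date (-d);
# classify_access is a hypothetical helper, not part of the AWS CLI.

# classify_access <LastAccessedDate, or "None"/"null"/empty> <threshold days> <as-of date>
classify_access() {
  local last=$1 threshold=$2 as_of=$3
  # The CLI prints "None" in text output when the field is absent.
  if [ -z "$last" ] || [ "$last" = "None" ] || [ "$last" = "null" ]; then
    echo "never-read"
    return
  fi
  local age=$(( ( $(date -u -d "$as_of" +%s) - $(date -u -d "$last" +%s) ) / 86400 ))
  if (( age > threshold )); then
    echo "stale"
  else
    echo "active"
  fi
}

classify_access None       90 2026-06-16   # prints: never-read
classify_access 2026-01-10 90 2026-06-16   # prints: stale
classify_access 2026-06-01 90 2026-06-16   # prints: active
```

Keeping never-read secrets separate lets the owning team decide whether they are newly created or simply abandoned before anything is scheduled for deletion.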

EKS.3 is a separate mechanism again. On EKS, supplying an encryptionConfig with the secrets resource and a KMS key ARN turns on envelope encryption: the API server generates a data key, encrypts the secret payload with it, then calls KMS to encrypt that data key with your customer managed key, so only the encrypted data key and ciphertext land in etcd. This sits on top of the default EBS volume encryption, which only protects against physical disk theft, not logical access to a running cluster or a leaked etcd backup. The association is forward-only and irreversible, encrypts only secrets written after it is enabled (so you re-save existing ones), and makes the KMS key a critical dependency: if the key is disabled or deleted, the cluster can no longer decrypt its secrets.

What is the impact of poorly managed secrets?

The headline impact is blast radius that grows with age. Every system that has ever read a secret keeps a copy, in memory, in env vars, in container layers, in CI logs, on a former contractor's laptop, so the longer a credential lives the more copies exist. Rotation bounds how long any leaked copy stays useful; a silently failing rotation removes that bound while every dashboard reports the credential as managed, which is more dangerous than no rotation at all because the org stops watching it.

Unused and unencrypted secrets widen the surface. A dormant credential (SecretsManager.3) is a live key lying on the floor: a path for lateral movement and one more credential to rule out during forensics, on top of the recurring storage cost. An EKS cluster without secret encryption (EKS.3) exposes every Kubernetes secret to anyone who reaches the API or a copy of etcd: one foothold turns into every database password, API token and TLS private key the cluster holds, at once, because base64 was never a lock.

On the compliance side, every framework that matters here, PCI DSS, SOC 2, NIST 800-53 and ISO 27001, expects documented credential rotation, lifecycle controls that retire unused credentials, and encryption of stored secrets at rest. A rotation-enabled-but-failing secret produces a clear-cut audit finding when an assessor pulls LastRotatedDate and finds it predates the policy window, and a list of hundreds of unused credentials reads as a missing decommissioning process, not an isolated oversight.

How do you manage secrets safely?

Work the capability as one loop: enable rotation and confirm it succeeds, retire what nobody uses, encrypt what is only encoded, and then make each fix a default so the findings cannot recur.

1. Enable rotation and verify it actually succeeds

Turn on rotation for high-value secrets with the AWS-managed Lambda for RDS, DocumentDB, Redshift and ElastiCache, or a custom four-stage function for third-party keys, at a 30-to-90-day cadence matched to the threat model. Then verify the outcome, not just the config: confirm LastRotatedDate is current, and for any silently failing rotation read the Lambda's CloudWatch logs to find the failing step (usually setSecret), fix the root cause (IAM, KMS, networking, or target credentials), and run rotate-secret --rotate-immediately to re-run rather than waiting a full interval.

2. Retire the secrets nobody uses

Audit with a script, filtering on LastAccessedDate older than 90 days, and produce a triage list with owner and IaC-stack tags. Notify owning teams, then schedule deletion with delete-secret --recovery-window-in-days 7 so anything still depending on a secret fails loudly inside the window and restore-secret can undo a mistake. Reserve force-delete-without-recovery for known-disposable dev secrets only.

3. Encrypt what is only encoded

For EKS clusters, associate a dedicated customer managed KMS key with the secrets resource via associate-encryption-config, after confirming the cluster role and key policy allow kms:Encrypt, kms:Decrypt and kms:DescribeKey. The change is one-way and forward-only, so re-save existing secrets afterwards (per namespace) so already-stored values are rewritten through the envelope, and treat the key as a critical dependency with deletion protection.

4. Make each fix a default and alarm on failure

The durable fix is preventing recurrence. Bake rotation and EKS encryptionConfig into the Terraform and CloudFormation modules so new credentials and clusters are born compliant, set IaC removal policies to delete secrets on stack teardown, and keep the backing Config rules live. Most importantly, alarm on the rotation Lambda's Errors metric so a failed rotation pages a human the day it happens rather than at the next audit, the single highest-leverage thing you can add here.

# Enable rotation on an RDS-backed secret with the AWS-managed Lambda, 30-day cadence.
aws secretsmanager rotate-secret \
  --secret-id prod/payments/db-master \
  --rotation-lambda-arn arn:aws:lambda:eu-west-1:123456789012:function:SecretsManagerRDSPostgreSQLRotationSingleUser \
  --rotation-rules AutomaticallyAfterDays=30

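# Schedule deletion of stale secrets behind a recovery window (never force-delete in prod).
# Assumes GNU date and AWS CLI v2, where timestamps are returned as ISO 8601 strings.
# Secrets that have never been read have no LastAccessedDate and are not matched by this filter.
CUTOFF=$(date -u -d '90 days ago' +%FT%TZ)
for arn in $(aws secretsmanager list-secrets \
  --query "SecretList[?LastAccessedDate<='${CUTOFF}'].ARN" \
  --output text); do
  aws secretsmanager delete-secret --secret-id "$arn" --recovery-window-in-days 7
done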

# Alarm so the next failed rotation pages a human, not the next audit.
# The alarm, the rotation Lambda and the SNS topic must all be in the same Region.
aws cloudwatch put-metric-alarm \
  --alarm-name secrets-rotation-failed --namespace AWS/Lambda --metric-name Errors \
  --dimensions Name=FunctionName,Value=SecretsManagerRDSPostgreSQLRotationSingleUser \
  --statistic Sum --period 3600 --evaluation-periods 1 \
  --threshold 1 --comparison-operator GreaterThanOrEqualToThreshold \
  --alarm-actions arn:aws:sns:eu-west-1:123456789012:security-oncall
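
Step 3 can be sketched the same way. The cluster name and key ARN below are placeholders, and encryption_config_json and associate_secret_encryption are hypothetical wrapper names. The sketch defaults to a dry run that only prints the command, since the association cannot be undone once applied.

```shell
#!/usr/bin/env bash
# Build the --encryption-config payload for `aws eks associate-encryption-config`.
# Cluster name and key ARN are placeholders. DRY_RUN defaults to 1, so nothing
# is sent to AWS unless you explicitly set DRY_RUN=0.

encryption_config_json() {
  local key_arn=$1
  printf '[{"resources":["secrets"],"provider":{"keyArn":"%s"}}]' "$key_arn"
}

associate_secret_encryption() {
  local cluster=$1 key_arn=$2
  local config
  config=$(encryption_config_json "$key_arn")
  if [ "${DRY_RUN:-1}" = "1" ]; then
    # Print the command for review instead of running it.
    echo "aws eks associate-encryption-config --cluster-name $cluster --encryption-config '$config'"
  else
    aws eks associate-encryption-config --cluster-name "$cluster" --encryption-config "$config"
  fi
}

associate_secret_encryption prod-cluster \
  arn:aws:kms:eu-west-1:123456789012:key/00000000-0000-0000-0000-000000000000
```

After the association completes, existing secrets still need to be rewritten so they pass through the envelope. One common approach is to touch every secret with kubectl, for example by adding or overwriting an annotation on each one.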


You can now treat secret management as one capability rather than a scatter of findings: enable rotation and verify it actually succeeds, retire the secrets nobody uses behind a recovery window, encrypt the Kubernetes secrets that were only base64-encoded, and make each fix a default with infrastructure-as-code and a failure alarm so the findings cannot recur. The Controls this lesson covers section below links every control in this group to its deep page and fix.


Managing secrets: the cost and risk view

A handful of cents per secret per month standing in front of breach-scale exposure

Secrets Manager charges about $0.40 per secret per month plus a few cents per ten thousand API calls, and the AWS-managed rotation Lambdas for RDS and similar databases are pre-built and effectively free to run. EKS secret encryption costs one KMS key, roughly a dollar a month. The remediation across this whole capability is engineering time and rounding-error spend, while the downside it mitigates, a breach traceable to an unrotated or unencrypted credential, routinely runs six to eight figures in incident response, notification and customer trust.

There is a genuine cost angle on the cleanup side. Stale secrets accumulate quietly: a typical multi-account org carries hundreds to thousands of secrets within a couple of years, and the storage line climbs past a few hundred dollars a month for credentials nobody uses. Because each secret has a unit cost, an owner tag and a LastAccessedDate, this is unusually tractable for FinOps: you can produce an aged inventory, attribute waste to teams, and calculate exact savings before deleting anything.
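The storage arithmetic is simple enough to sketch. The 40-cent unit price is the approximate figure quoted in this lesson, the count of 750 stale secrets is purely illustrative, and monthly_cost is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Monthly storage cost for a number of secrets, computed in integer cents
# to avoid floating point. 40 cents per secret is an approximate price.

PRICE_CENTS_PER_SECRET=40

# monthly_cost <number of secrets> -> formatted dollar amount
monthly_cost() {
  local count=$1
  local cents=$(( count * PRICE_CENTS_PER_SECRET ))
  printf '$%d.%02d\n' $(( cents / 100 )) $(( cents % 100 ))
}

monthly_cost 750   # prints: $300.00
```

Run per team against an aged inventory, the same calculation gives the exact saving to expect before anything is deleted.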

Frame each failing control by the credential behind it. A production payments database master password that is unrotated (SecretsManager.1/.4) or whose rotation is silently failing (SecretsManager.2) is a this-sprint priority with direct PCI exposure; a cancelled-POC secret nobody has read in a year (SecretsManager.3) is a five-minute cleanup that also trims the bill. Prioritise by sensitivity and by what each credential can reach, not by the raw finding count.

This lesson is for the finance partner who sees Secrets Manager and EKS findings on the security and cost report and wants to know what they represent in both dollar and breach-risk terms. It covers why rotation is near-free relative to the exposure it removes, why unused-secret cleanup is the rare control that is both a compliance and a cost win, and how to tier the work by credential sensitivity so the most exposed secrets get fixed first.


How a finance partner frames the secrets findings

Priya is the finance and risk partner for a payments platform. Security Hub fires a batch of secrets findings: several with rotation disabled, one rotation-enabled secret whose last rotation has been silently failing, an EKS cluster without secret encryption, and a long list of secrets nobody has read in over a year. Her first instinct is not to ask what the fixes cost, because the answer is rounding-error spend: Secrets Manager runs about $0.40 per secret per month, the AWS-managed rotation Lambdas are effectively free to run, and EKS secret encryption is one KMS key at roughly a dollar a month. The downside it mitigates, a breach traceable to an unrotated or unencrypted credential, routinely runs six to eight figures.

She reads each failing control by the credential behind it rather than by the finding count. The production payments database master password whose rotation is silently failing (SecretsManager.2) is the one with direct PCI exposure and a this-sprint priority, because the dashboard reports it as managed while the credential ages undetected. The EKS.3 finding on the production cluster is next, because base64 was never a lock and one foothold turns into every secret the cluster holds. The long list of unused secrets (SecretsManager.3) is the rare control that is both a security and a cost win: each secret has a unit cost, an owner tag and a LastAccessedDate, so she produces an aged inventory, attributes the waste per team, and the cleanup trims a storage line that had climbed past a few hundred dollars a month. Her output is a worklist tiered by sensitivity, with the storage line tracked alongside the finding count so a falling line shows the cleanup is sticking.

Why secret hygiene belongs on the risk register

The cost model is lopsided. Rotation is near-free, EKS secret encryption is about a dollar a month, and unused-secret cleanup actively reduces the bill. The downside is breach-scale: incident response, regulatory notification, customer churn and the cost of rotating every exposed credential, none of which appears on the cloud invoice but all of which lands on the business. There is no plausible cost-benefit case for leaving a high-value credential unrotated or a production cluster's secrets unencrypted.

The cleanup side is the rare control that is a cost win as well as a security one. Each secret has an identifiable unit cost, an attributable owner and a clear staleness signal, so you can produce an aged inventory, attribute waste per team, and recover the spend with a schedule-then-delete pass behind a recovery window. Track the Secrets Manager storage line alongside the finding count: a flat or falling line means the cleanup is sticking; a rising one means lifecycle discipline has slipped.

Tier the work by sensitivity. Rotation findings on PCI- or regulated-scope credentials sit in the must-fix-this-sprint bucket with the audit exposure spelled out; internal-tooling and dev secrets sit lower. The finance contribution is to make the asymmetry explicit so a control this cheap to fix never sits open across a reporting cycle on a credential that matters.

What finance can do about the secrets findings

Finance cannot write a rotation Lambda or associate a KMS key, but it can tier the work by credential sensitivity, turn the one cleanup control into a tracked cost win, and make the breach-versus-cents asymmetry impossible to ignore. Three levers.

1. Tier the work by what each credential can reach, not by the finding count

Frame each failing control by the credential behind it. A production payments database master password that is unrotated or silently failing rotation is a must-fix-this-sprint item with direct PCI exposure; a cancelled-POC secret nobody has read in a year is a five-minute cleanup. Internal-tooling and dev secrets sit lower. Prioritising by sensitivity and by what each credential can reach, not by the raw finding count, is what stops a control this cheap to fix sitting open across a reporting cycle on a credential that matters.

2. Turn the cleanup control into a tracked cost win

Unused-secret cleanup (SecretsManager.3) is the rare control that is both a compliance and a cost win, because each secret has an identifiable unit cost, an attributable owner and a clear staleness signal in LastAccessedDate. Produce an aged inventory, attribute the waste per team, and recover the spend with a schedule-then-delete pass behind a recovery window. Track the Secrets Manager storage line alongside the finding count: a flat or falling line means the cleanup is sticking, a rising one means lifecycle discipline has slipped.

3. Make the breach-versus-cents asymmetry explicit and require exceptions

Rotation is near-free, EKS secret encryption is about a dollar a month, and cleanup actively reduces the bill, while the downside is breach-scale incident response, regulatory notification and customer churn that never appears on the cloud invoice. There is no plausible cost-benefit case for leaving a high-value credential unrotated or a production cluster unencrypted. Spell out that asymmetry so the work keeps its priority, and require a documented, finance-visible exception for any high-value credential that genuinely cannot rotate.


You have finished the finance view of secret hygiene. You know rotation is near-free, EKS encryption is about a dollar a month, and unused-secret cleanup is the rare control that actively trims the bill, all standing in front of breach-scale exposure that never lands on the cloud invoice. The right approach is to tier the work by credential sensitivity, run cleanup as a tracked cost win with the storage line watched alongside the finding count, and make the breach-versus-cents asymmetry explicit. Next time a batch of secrets findings lands, you will produce a sensitivity-tiered worklist rather than ask what the fixes cost.


Managing secrets: the headline

Whether a leaked credential is exploitable indefinitely, or only until the next rotation

Applications connect to databases and external services using passwords and keys. This group of controls asks whether those credentials are well managed: do they rotate so a leak is only useful for a bounded window, does that rotation actually succeed rather than just being switched on, are unused credentials retired, and are stored secrets genuinely encrypted rather than merely encoded.

The leadership question is simple to state. If a credential leaked today and nobody noticed, how long would an attacker have it? With rotation on a 30-day cadence that actually succeeds, the answer is at most 30 days. Without rotation, or with rotation silently failing, the answer is forever. A rotation that is enabled but failing is the worst case of all, because the dashboard reports it as managed and the org looks elsewhere.

This is a breach-containment discipline, not a spending decision. The fixes are cheap; the value is in treating rotation success as a monitored outcome, retiring credentials when their service retires, and making real encryption a default on every cluster. The defensible end state is that every high-value credential rotates by policy, with documented exceptions for the rare cases that cannot.

A short read for the leader who needs to know what secret management protects, why rotation that succeeds matters more than rotation that is merely enabled, and what good looks like: every high-value credential rotating by policy with monitored success, unused credentials retired when their service retires, and real encryption a default on every cluster.


What it looks like when secret hygiene is a default

After a routine review found that the primary RDS master credential had reported rotation enabled for 400 days while never actually rotating, because the rotation Lambda had no network route to the database and every scheduled run had timed out and rolled back, the CTO asked the security team one question: if a credential leaked today and nobody noticed, how long would an attacker have it? For that secret, the honest answer was forever, and the dashboard had said managed the entire time.

The team stopped treating rotation as a setting to tick and made secret hygiene a monitored outcome. Rotation is now on for every high-value credential at a cadence matched to its threat model, and a CloudWatch alarm on the rotation Lambda's Errors metric pages a human the day a rotation fails rather than at the next audit. Unused secrets are retired when their service retires, behind a recovery window so anything still depending on one fails loudly and can be undone. Every production EKS cluster has KMS envelope encryption on its secrets by default, with the key treated as a critical dependency. Rotation and encryptionConfig are baked into the Terraform and CloudFormation modules so new credentials and clusters are born compliant. The next review answered the CTO differently: every production credential rotates by policy with documented exceptions, and a failed rotation is a same-day alert. That standing posture, monitored as an outcome, is the governance signal.

Why this is a board-level risk

The core question this capability answers is the one a board asks after any breach: if a credential was stolen today without anyone knowing, how long would an attacker have access? Without rotation, indefinitely. With rotation that actually succeeds, at most the rotation interval. The difference is the gap between a contained incident and a multi-year breach, and it is decided by whether rotation is monitored as an outcome rather than ticked as a setting.

The regulatory and governance stakes are concrete. A rotation-enabled-but-failing secret, a pile of unused credentials, or an EKS cluster with unencrypted secrets are each the kind of finding a regulator or assessor cites directly, and in a post-breach review an unrotated credential the tooling flagged months earlier is a governance failure, not just a technical one. The leadership question is whether every production credential rotates by policy with documented exceptions, and whether a failed rotation pages a human the day it happens.

The leadership move on secret hygiene

The executive handle is to treat rotation as a monitored outcome rather than a setting, so a leaked credential is useful for a bounded window rather than forever. Three moves.

1. Set the standard: every high-value credential rotates by policy with monitored success

Make it policy that every production credential rotates on a cadence matched to its threat model and that rotation success, not just the rotation setting, is what counts as done. A rotation that is enabled but failing is the worst case of all, because the dashboard reports it as managed and the org looks elsewhere while the credential ages. The defensible end state is every high-value credential rotating by policy, with documented exceptions for the rare cases that genuinely cannot, and unused credentials retired when their service retires.

2. Demand that a failed rotation pages a human the day it happens

A clean report is not the proof; the proof is that a silent failure becomes a same-day alert. Ask to see the CloudWatch alarm on the rotation Lambda's Errors metric wired to on-call, the single highest-leverage thing in this capability, because it is the difference between catching a failed rotation today and discovering it 400 days later in an audit. The question to hold the team to is whether rotation is monitored as an outcome, since that is what separates a contained incident from a multi-year breach.

3. Demand that each fix is a default that cannot recur

The durable fix is preventing recurrence, not clearing the current report. Ask to see rotation and EKS encryptionConfig baked into the Terraform and CloudFormation modules so new credentials and clusters are born compliant, removal policies that delete secrets on stack teardown, and the backing Config rules kept live. Real encryption a default on every cluster and rotation a default on every high-value credential, enforced rather than remembered, is what turns these findings from a recurring chase into a one-time fix with exceptions on the record.


Two takeaways. First, this is a breach-containment discipline, not a spending decision: the question that matters is how long a leaked credential stays useful, which is at most one rotation interval if rotation actually succeeds and forever if it is off or silently failing. Second, the right end state is every high-value credential rotating by policy with monitored success, unused credentials retired when their service retires, real encryption a default on every cluster, and a failed rotation paging a human the day it happens. Treat rotation as an outcome, not a setting.


Controls this lesson covers

One capability, many AWS Security Hub controls. This lesson is the shared playbook; each control below keeps its own deep page with the exact check, severity and a copy-and-paste fix.

EKS

EKS.3 (Medium): EKS clusters should use encrypted Kubernetes secrets

SecretsManager

Part of the learning path Lock down access
