Managing secrets: the basics
What does secret hygiene actually cover beyond storing a password?
AWS Secrets Manager is a vault for credentials: database passwords, API keys, OAuth tokens, third-party service credentials. Storing a secret well is only the start. A secret has a lifecycle, and Security Hub turns each part of it into a control. SecretsManager.1 checks that automatic rotation is enabled. SecretsManager.2 checks that the configured rotation actually succeeds, because enabling rotation is not the same as it working. SecretsManager.3 checks that unused secrets are removed. SecretsManager.4 checks that rotation-enabled secrets are actually rotated within a configured window. The estate can fail several at once, but they are one capability: keep credentials fresh, in use, and protected.
One more control in this group lives outside Secrets Manager. EKS.3 checks that Kubernetes secrets in your EKS clusters are envelope-encrypted with a KMS key you control, rather than the base64 encoding Kubernetes uses by default, which is not encryption at all: anyone who can read the secret object or a copy of etcd can decode it instantly. Different store, same discipline: the credentials that gate your data should be genuinely encrypted, replaced on a schedule, and not left lying around when the service that needed them is gone.
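The point about base64 can be checked in a few lines. This is a minimal sketch using only the standard base64 utility; the secret value is made up for illustration, and `base64 -d` assumes a GNU-style implementation.

```shell
# A made-up secret value, encoded the way a Kubernetes Secret carries its data.
secret_value='s3cr3t-db-password'
encoded=$(printf '%s' "$secret_value" | base64)

# Decoding needs no key: anyone who can read the encoded string can reverse it.
decoded=$(printf '%s' "$encoded" | base64 -d)

echo "encoded:   $encoded"
echo "recovered: $decoded"
```

The round trip succeeds with no key material at all, which is the difference between encoding and the KMS envelope encryption that EKS.3 checks for.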
What ties these together is that a long-lived or exposed credential is a leak waiting to happen. The blast radius of a leak grows with the credential's age: six months in, you genuinely cannot enumerate everyone who holds a copy in a log, a laptop, a CI cache or an old environment variable. Rotation bounds that window. Cleanup shrinks the attack surface. Encryption stops a single leaked snapshot from handing over every secret at once. Managing secrets well is about closing each of those gaps before the leak you do not know about becomes the breach you do.
In this lesson you will learn how AWS expresses secret hygiene across rotation, rotation success, cleanup and encryption, how the four-stage rotation Lambda actually works and where it usually breaks, why an unused secret is both a cost line and an attack surface, and why base64 in etcd is encoding rather than encryption. The Controls this lesson covers section lists every Security Hub control in this capability, each linking to a deep page with the exact check and a copy-and-paste fix.
The rotation that failed for 400 days straight
A team enabled 30-day automatic rotation on their primary RDS master credential and moved on. More than a year later a routine Security Hub review surfaced a rotation-success finding in a failed state. The rotation Lambda had been placed in a private subnet with no route to the RDS endpoint; every scheduled rotation since day one had timed out at the setSecret step and rolled back. The secret's LastRotatedDate was the day it was created. The password had never actually changed, despite every dashboard showing rotation enabled the entire time, and the CloudWatch logs held 400 days of identical connection-timeout traces that no alarm had ever been wired to. Enabled is not the same as working.
Auditing secret hygiene in action
Marco is finishing a SOC 2 prep cycle when Security Hub fires a batch of secrets findings: several with rotation disabled, one rotation-enabled secret whose last rotation silently failed, and a long list of secrets nobody has read in over 90 days.
Rather than mass-fix, he starts by listing rotation-enabled secrets with the one fact that separates working from broken: how long ago each actually rotated against its configured interval. A LastRotatedDate older than the interval is the fingerprint of a rotation that is enabled but failing, which is more dangerous than one that is simply off.
List rotation-enabled secrets with their last rotation and interval. A LastRotatedDate older than the interval is a silently failing rotation.
Enabled is not working. A last-rotation date older than the interval is the fingerprint of a silently failing rotation; read the Lambda logs to find the failing step.
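The listing Marco runs could look something like the sketch below. It assumes GNU date and the ISO 8601 timestamps that AWS CLI v2 prints by default; the function name stale_rotations is hypothetical, and secrets that rotate on a cron or rate expression (no AutomaticallyAfterDays) are skipped rather than evaluated.

```shell
# List rotation-enabled secrets whose last rotation is older than their interval.
# Assumes GNU date and ISO 8601 timestamps in the CLI output (AWS CLI v2 default).
stale_rotations() {
  now=$(date -u +%s)
  tab=$(printf '\t')
  aws secretsmanager list-secrets \
    --query 'SecretList[?RotationEnabled].[Name, LastRotatedDate, RotationRules.AutomaticallyAfterDays]' \
    --output text |
  while IFS="$tab" read -r name last days; do
    # The CLI prints "None" for a field that is absent from the response.
    if [ "$last" = "None" ]; then
      echo "$name: rotation enabled but no LastRotatedDate recorded"
      continue
    fi
    # No day-based interval (for example a schedule expression): skip.
    [ "$days" = "None" ] && continue
    age_days=$(( (now - $(date -u -d "$last" +%s)) / 86400 ))
    if [ "$age_days" -gt "$days" ]; then
      echo "$name: last rotated $age_days days ago, interval is $days days"
    fi
  done
}
```

Calling stale_rotations with no arguments prints one line per suspect secret; an empty result means every day-based rotation is inside its interval.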
How secret hygiene is evaluated (deep dive)
Every Secrets Manager rotation is driven by a Lambda invoked four times in sequence: createSecret generates a new value staged AWSPENDING, setSecret writes it into the target database or service (where most failures happen, because it needs network reachability and credentials), testSecret connects with the pending value to prove it works, and finishSecret promotes AWSPENDING to AWSCURRENT and demotes the old version to AWSPREVIOUS. If any step throws, AWSCURRENT is left untouched, so applications keep using the old value, which is exactly why a failure is silent. SecretsManager.1 checks that rotation is enabled at all; SecretsManager.4 checks that it has actually run within the configured window; and SecretsManager.2 checks the RotationOccurringAsScheduled flag, which is true only when the last scheduled rotation completed on time.
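One way to see a half-finished rotation from the outside is to look at the staging labels. The sketch below is an illustration rather than an official check: the function name pending_version is hypothetical, and it simply prints any version that carries AWSPENDING without also carrying AWSCURRENT.

```shell
# Print version IDs that were staged as AWSPENDING but never promoted to AWSCURRENT.
pending_version() {
  tab=$(printf '\t')
  aws secretsmanager list-secret-version-ids --secret-id "$1" \
    --query "Versions[].[VersionId, join(',', VersionStages)]" \
    --output text |
  while IFS="$tab" read -r version stages; do
    case ",$stages," in
      *,AWSCURRENT,*) ;;                  # the live version, nothing to report
      *,AWSPENDING,*) echo "$version" ;;  # created by createSecret, never promoted
    esac
  done
}
```

For example, `pending_version prod/payments/db-master` printing a version ID suggests the last rotation stopped somewhere between createSecret and finishSecret, which is the cue to read the Lambda's logs.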
SecretsManager.3 looks at a different signal: LastAccessedDate, which is updated when the secret's value is retrieved (for example by GetSecretValue) and which is null for a secret never read since creation. delete-secret with a 7-to-30-day recovery window puts the secret in a pending-deletion state that fails any caller loudly but can be undone with restore-secret, which is why the recovery window is the safe default and force-delete-without-recovery is reserved for known-throwaway secrets.
EKS.3 is a separate mechanism again. On EKS, supplying an encryptionConfig with the secrets resource and a KMS key ARN turns on envelope encryption: the API server generates a data key, encrypts the secret payload with it, then calls KMS to encrypt that data key with your customer managed key, so only the encrypted data key and ciphertext land in etcd. This sits on top of the default EBS volume encryption, which only protects against physical disk theft, not logical access to a running cluster or a leaked etcd backup. The association is forward-only and irreversible, encrypts only secrets written after it is enabled (so you re-save existing ones), and makes the KMS key a critical dependency: if the key is disabled or deleted, the cluster can no longer decrypt its secrets.
What is the impact of poorly managed secrets?
The headline impact is blast radius that grows with age. Every system that has ever read a secret keeps a copy, in memory, in env vars, in container layers, in CI logs, on a former contractor's laptop, so the longer a credential lives the more copies exist. Rotation bounds how long any leaked copy stays useful; a silently failing rotation removes that bound while every dashboard reports the credential as managed, which is more dangerous than no rotation at all because the org stops watching it.
Unused secrets and unencrypted ones widen the surface. A dormant credential (SecretsManager.3) is a live key lying on the floor: a path for lateral movement and extra noise that makes forensics harder, on top of the recurring storage cost. An EKS cluster without secret encryption (EKS.3) exposes every Kubernetes secret to anyone who reaches the API or a copy of etcd: one foothold turns into every database password, API token and TLS private key the cluster holds, at once, because base64 was never a lock.
On the compliance side, every framework that matters here, PCI DSS, SOC 2, NIST 800-53 and ISO 27001, expects documented credential rotation, lifecycle controls that retire unused credentials, and encryption of stored secrets at rest. A rotation-enabled-but-failing secret produces a clear-cut audit finding when an assessor pulls LastRotatedDate and finds it predates the policy window, and a list of hundreds of unused credentials reads as a missing decommissioning process, not an isolated oversight.
How do you manage secrets safely?
Work the capability as one loop: enable rotation and confirm it succeeds, retire what nobody uses, encrypt what is only encoded, and then make each fix a default so the findings cannot recur.
1. Enable rotation and verify it actually succeeds
Turn on rotation for high-value secrets with the AWS-managed Lambda for RDS, DocumentDB, Redshift and ElastiCache, or a custom four-stage function for third-party keys, at a 30-to-90-day cadence matched to the threat model. Then verify the outcome, not just the config: confirm LastRotatedDate is current, and for any silently failing rotation read the Lambda's CloudWatch logs to find the failing step (usually setSecret), fix the root cause (IAM, KMS, networking, or target credentials), and run rotate-secret --rotate-immediately to re-run rather than waiting a full interval.
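Verifying the outcome rather than the config can be scripted. The sketch below re-runs a rotation and waits for AWSCURRENT to move to a new version; the function names and the POLLS and POLL_SECONDS variables are hypothetical conveniences, not AWS settings, and the timeouts are arbitrary defaults.

```shell
# Return the version ID that currently carries the AWSCURRENT label.
current_version() {
  aws secretsmanager list-secret-version-ids --secret-id "$1" \
    --query "Versions[?contains(VersionStages, 'AWSCURRENT')].VersionId | [0]" \
    --output text
}

# Trigger a rotation now and poll until AWSCURRENT points at a new version.
rerun_rotation() {
  secret=$1
  before=$(current_version "$secret")
  aws secretsmanager rotate-secret --secret-id "$secret" --rotate-immediately >/dev/null || return 1
  tries=0
  while [ "$tries" -lt "${POLLS:-30}" ]; do
    after=$(current_version "$secret")
    if [ "$after" != "$before" ]; then
      echo "$secret: rotated, AWSCURRENT is now $after"
      return 0
    fi
    sleep "${POLL_SECONDS:-10}"
    tries=$((tries + 1))
  done
  # Rotation did not complete in time: the old value is still live.
  echo "$secret: AWSCURRENT unchanged, check the rotation Lambda's logs" >&2
  return 1
}
```

A non-zero exit from `rerun_rotation prod/payments/db-master` means the root cause is still there, so the fix should not be marked done until the function reports a new version.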
2. Retire the secrets nobody uses
Audit with a script, filtering on LastAccessedDate older than 90 days, and produce a triage list with owner and IaC-stack tags. Notify owning teams, then schedule deletion with delete-secret --recovery-window-in-days 7 so anything still depending on a secret fails loudly inside the window and restore-secret can undo a mistake. Reserve force-delete-without-recovery for known-disposable dev secrets only.
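The audit step might be sketched as follows. It assumes GNU date, ISO 8601 timestamps in the CLI output (the AWS CLI v2 default), and a tag key named owner, which is a hypothetical convention; substitute whatever tag your organisation uses.

```shell
# Print a tab-separated triage list (name, last access, owner tag) of secrets
# not read within the given number of days (default 90).
unused_secrets() {
  max_days=${1:-90}
  cutoff=$(date -u -d "$max_days days ago" +%s)
  tab=$(printf '\t')
  aws secretsmanager list-secrets \
    --query "SecretList[].[Name, LastAccessedDate, Tags[?Key=='owner'] | [0].Value]" \
    --output text |
  while IFS="$tab" read -r name accessed owner; do
    if [ "$accessed" = "None" ]; then
      # No LastAccessedDate at all: the value has never been retrieved.
      echo "$name${tab}never-read${tab}$owner"
    elif [ "$(date -u -d "$accessed" +%s)" -lt "$cutoff" ]; then
      echo "$name${tab}$accessed${tab}$owner"
    fi
  done
}
```

Never-read secrets are listed too, including ones created recently, so the output is a list to review with owners, not a list to delete from directly.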
3. Encrypt what is only encoded
For EKS clusters, associate a dedicated customer managed KMS key with the secrets resource via associate-encryption-config, after confirming the cluster role and key policy allow kms:Encrypt, kms:Decrypt and kms:DescribeKey. The change is one-way and forward-only, so re-save existing secrets afterwards (per namespace) so already-stored values are rewritten through the envelope, and treat the key as a critical dependency with deletion protection.
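Because the association cannot be undone, it is worth checking the cluster's current state before applying it. The following is a sketch under that assumption; the function name is hypothetical, and the cluster name and key ARN are supplied by the caller.

```shell
# Associate a KMS key with the "secrets" resource of an existing EKS cluster,
# skipping clusters that already have one (the change is one-way).
enable_eks_secret_encryption() {
  cluster=$1
  key_arn=$2
  current=$(aws eks describe-cluster --name "$cluster" \
    --query "cluster.encryptionConfig[?contains(resources, 'secrets')].provider.keyArn | [0]" \
    --output text)
  # The CLI prints "None" when no encryption config covers secrets.
  if [ "$current" != "None" ]; then
    echo "$cluster: secrets already encrypted with $current"
    return 0
  fi
  aws eks associate-encryption-config --cluster-name "$cluster" \
    --encryption-config "[{\"resources\":[\"secrets\"],\"provider\":{\"keyArn\":\"$key_arn\"}}]"
}
```

The association runs as an asynchronous cluster update, so any re-save of existing secrets should wait until that update has finished.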
4. Make each fix a default and alarm on failure
The durable fix is preventing recurrence. Bake rotation and EKS encryptionConfig into the Terraform and CloudFormation modules so new credentials and clusters are born compliant, set IaC removal policies to delete secrets on stack teardown, and keep the backing Config rules live. Most importantly, alarm on the rotation Lambda's Errors metric so a failed rotation pages a human the day it happens rather than at the next audit. That alarm is the single highest-leverage thing you can add here.
# Enable rotation on an RDS-backed secret with the AWS-managed Lambda, 30-day cadence.
aws secretsmanager rotate-secret \
  --secret-id prod/payments/db-master \
  --rotation-lambda-arn arn:aws:lambda:eu-west-1:123456789012:function:SecretsManagerRDSPostgreSQLRotationSingleUser \
  --rotation-rules AutomaticallyAfterDays=30

# Schedule deletion of stale secrets behind a recovery window (never force-delete in prod).
# Requires GNU date and ISO 8601 timestamps in CLI output (AWS CLI v2 default).
# Secrets that have never been read have no LastAccessedDate and are not matched here.
CUTOFF=$(date -u -d '90 days ago' +%FT%TZ)
for arn in $(aws secretsmanager list-secrets \
    --query "SecretList[?LastAccessedDate<='${CUTOFF}'].ARN" \
    --output text); do
  aws secretsmanager delete-secret --secret-id "$arn" --recovery-window-in-days 7
done

# Alarm so the next failed rotation pages a human, not the next audit.
# FunctionName must be the rotation function attached to the secret.
# Rotation Lambdas run rarely, so periods with no data are treated as healthy.
aws cloudwatch put-metric-alarm \
  --alarm-name secrets-rotation-failed --namespace AWS/Lambda --metric-name Errors \
  --dimensions Name=FunctionName,Value=RotatePaymentsDbMaster \
  --statistic Sum --period 3600 --evaluation-periods 1 \
  --threshold 1 --comparison-operator GreaterThanOrEqualToThreshold \
  --treat-missing-data notBreaching \
  --alarm-actions arn:aws:sns:us-east-1:123456789012:security-oncall
Keep learning
Go deeper on how secret hygiene works across the services in this capability.
- Rotating your AWS Secrets Manager secrets How scheduled rotation works, the four-stage rotation function, and the version-stage labels.
- Troubleshooting AWS Secrets Manager rotation of secrets Common rotation failures and how to diagnose which of the four steps broke and why.
- Enabling secret encryption on an existing EKS cluster Associating a KMS key with the secrets resource and the one-way, forward-only nature of the change.
You can now treat secret management as one capability rather than a scatter of findings: enable rotation and verify it actually succeeds, retire the secrets nobody uses behind a recovery window, encrypt the Kubernetes secrets that were only base64-encoded, and make each fix a default with infrastructure-as-code and a failure alarm so the findings cannot recur. The Controls this lesson covers section below links every control in this group to its deep page and fix.