Compliance

Enable cluster and search audit logging

One capability across EKS clusters and Elasticsearch search domains: capture and watch the control-plane and search activity that records who called the API, what they queried, and what failed.

14 min·10 sections·AWS

Last reviewed 16 June 2026

Remediates AWS Security Hub: EKS.8 ES.4 ES.5 GuardDuty.5

Cluster and search audit logging: the basics

What does a quiet control plane actually hide?

This capability covers the cluster control planes and search domains that broker access to your workloads: the Amazon EKS Kubernetes API server, and Amazon Elasticsearch Service (the former name of Amazon OpenSearch Service) search domains. Each runs a front door that every action passes through, and by default that front door keeps no durable, watched record of who called it or what they asked for. Cluster and search audit logging is about turning that recorder on and, for EKS, having a managed detector read the tape.

AWS Security Hub turns each layer into its own control, which is why a single estate can fail several at once. EKS.8 checks that an EKS cluster has the audit control-plane log type exporting to CloudWatch Logs. GuardDuty.5 checks that GuardDuty EKS Audit Log Monitoring is analysing that Kubernetes audit stream for threats. ES.4 checks that an Elasticsearch domain publishes its error logs, and ES.5 checks that it publishes its audit logs. They look like separate problems on the report, but they are one capability: make sure the cluster and search layers record what happens to them, and that something is watching.

It is flagged because these are the most attacker-active and audit-relevant surfaces in their respective stacks. The Kubernetes API is where an attacker who lands in a cluster enumerates, escalates, and reads secrets; a search domain holds the most queryable copy of an organisation's sensitive data. Without the audit log there is no answer to "who deleted that deployment?" or "who ran that query at 2am?": the events were never recorded, so no later investigation can recover them. Audit logging is cheap to leave on and impossible to retrofit onto an incident that has already happened.

In this lesson you will learn what the EKS control plane and Elasticsearch search domains actually log, the difference between the agentless control-plane audit log and the managed detector that reads it, and how the Elasticsearch error and audit logs differ and depend on each other. The Controls this lesson covers section lists every Security Hub control in this capability, each linking to a deep page with the exact check and a copy-and-paste fix.

Fun fact

The audit log was already there

When GuardDuty EKS Audit Log Monitoring first shipped, a surprising number of teams found their clusters had been writing Kubernetes audit logs the whole time, with nobody analysing them. GuardDuty does not even require you to enable EKS control-plane logging to CloudWatch first; it consumes the audit stream directly at no extra EKS logging cost. One platform team turned the feature on across 40 clusters in an afternoon with a single organization-level setting and had their first finding (an over-permissive service account binding created months earlier) within hours. The data had been flowing past unwatched for the better part of a year. The same lesson holds for Elasticsearch: the moment someone asks for the access trail on a customer-data domain is the moment they discover audit logging was never on.

Finding unmonitored clusters and domains

Diego runs platform security at a healthcare SaaS company. Security Hub flags EKS.8 on three of seven clusters, GuardDuty.5 as failed across the organization, and ES.5 on a legacy Elasticsearch domain backing customer-record search. None of these surfaces is recording or being watched.

Rather than work the findings one by one, he starts by confirming which clusters have the audit log type enabled, so he can see the scope of the gap before changing anything.

Sweep the EKS fleet for the audit control-plane log type. A cluster missing it fails EKS.8 and is running its API server unrecorded.

$ aws eks list-clusters --query 'clusters[]' --output text

prod-platform staging-a staging-b data-eks sandbox

# prod-platform: audit=true

# staging-a: audit=false

# staging-b: audit=false

# Three clusters logging their control plane; two going unrecorded.

Confirm the audit log type across the fleet before enabling, so the change is deliberate per cluster rather than a blind toggle.
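A minimal sketch of that sweep, assuming the AWS CLI is already configured for the account and region being checked. The function names audit_status and sweep_fleet are illustrative, not part of any AWS tooling, and the output format simply mirrors the per-cluster lines shown above.

```shell
# Print "<cluster>: audit=true|false" for one cluster.
# Assumption: describe-cluster reports cluster.logging.clusterLogging as a
# list of {types, enabled} entries, so we collect the types that are enabled.
audit_status() {
  local name="$1" enabled_types
  enabled_types=$(aws eks describe-cluster --name "$name" \
    --query 'cluster.logging.clusterLogging[?enabled==`true`].types[]' \
    --output text)
  # Collapse tabs and newlines to single spaces, then look for the word "audit".
  case " $(echo $enabled_types) " in
    *" audit "*) echo "$name: audit=true" ;;
    *)           echo "$name: audit=false" ;;
  esac
}

# Run the check for every cluster in the current region.
sweep_fleet() {
  local cluster
  for cluster in $(aws eks list-clusters --query 'clusters[]' --output text); do
    audit_status "$cluster"
  done
}
```

Calling sweep_fleet once per region and account gives the one-line pass/fail view needed to see the scope of the gap; it only reads configuration and changes nothing.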

How clusters and search domains record activity (deep dive)

EKS exposes five independently toggleable control-plane log types: api, audit, authenticator, controllerManager and scheduler. EKS.8 evaluates only audit (the Kubernetes audit log of every API request, the authenticated identity, the verb, the resource and the response), and its Config rule eks-cluster-log-enabled is parameterised with logTypes: audit. When enabled, EKS ships these records to a CloudWatch Logs group named /aws/eks/<cluster-name>/cluster. GuardDuty.5 is distinct: it is the EKS_AUDIT_LOGS detector feature that reads the Kubernetes audit stream directly from the control plane (agentless, no node software, and no requirement to enable CloudWatch export first) and raises Kubernetes/* threat findings. One gives you the durable log in your account; the other is a managed detector.
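Once the audit log type is exporting, the durable copy can answer questions such as "who deleted that deployment?". The sketch below is one way to ask it with CloudWatch Logs Insights, assuming audit events land in log streams whose names contain kube-apiserver-audit and carry the standard Kubernetes audit fields (user.username, verb, objectRef). The function names and the one-hour window are illustrative choices.

```shell
# Build a Logs Insights query for Kubernetes audit events.
# $1 = verb (for example delete), $2 = resource (for example deployments).
audit_query() {
  printf '%s\n' \
    'fields @timestamp, user.username, verb, objectRef.namespace, objectRef.name' \
    '| filter @logStream like /kube-apiserver-audit/' \
    "| filter verb = \"$1\" and objectRef.resource = \"$2\"" \
    '| sort @timestamp desc'
}

# Start the query over the last hour for one cluster and print the query ID.
# $1 = cluster name, $2 = verb, $3 = resource.
# Fetch the rows afterwards with: aws logs get-results --query-id <id>
start_audit_query() {
  local cluster="$1" now
  now=$(date +%s)
  aws logs start-query \
    --log-group-name "/aws/eks/$cluster/cluster" \
    --start-time $((now - 3600)) --end-time "$now" \
    --query-string "$(audit_query "$2" "$3")" \
    --query 'queryId' --output text
}
```

For example, start_audit_query prod-platform delete deployments returns a query ID to poll. Nothing comes back for any period before the audit log type was enabled, which is the point the lesson makes about retrofitting.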

Elasticsearch domains configure logging through the LogPublishingOptions map, keyed by log type: ES_APPLICATION_LOGS (error logs, the ES.4 control), the slow-log types, and AUDIT_LOGS (the security trail, the ES.5 control). ES.4 fails when ES_APPLICATION_LOGS is absent or disabled; ES.5 fails when AUDIT_LOGS is. The hard dependency is that audit logs require fine-grained access control, which in turn requires node-to-node encryption, encryption at rest, and HTTPS enforcement, so enabling AUDIT_LOGS on a domain without FGAC is rejected. The error log has no such prerequisite, which is why ES.4 is usually the quicker fix.
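A hedged sketch of reading those settings back for one domain, so the ES.4 and ES.5 state and the fine-grained access control prerequisite can be seen side by side. It assumes the aws es command set and relies on the CLI printing None in text output when a key is absent from the LogPublishingOptions map. The function names are illustrative.

```shell
# Print the error-log, audit-log and FGAC state for one Elasticsearch domain.
domain_log_status() {
  local domain="$1" status
  status=$(aws es describe-elasticsearch-domain --domain-name "$domain" \
    --query 'DomainStatus.[LogPublishingOptions.ES_APPLICATION_LOGS.Enabled, LogPublishingOptions.AUDIT_LOGS.Enabled, AdvancedSecurityOptions.Enabled]' \
    --output text)
  # Split the three tab-separated values into $1, $2 and $3.
  set -- $status
  echo "$domain: error_logs=$1 audit_logs=$2 fgac=$3"
}

# Run the check for every domain in the current region.
sweep_domains() {
  local d
  for d in $(aws es list-domain-names --query 'DomainNames[].DomainName' --output text); do
    domain_log_status "$d"
  done
}
```

A line reading audit_logs=None with fgac=False marks a domain where ES.5 cannot be fixed by a logging change alone, because the access control prerequisite has to come first.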

Both layers have a permission wrinkle and an evaluation lag. Elasticsearch writes as the es.amazonaws.com service principal and needs a CloudWatch Logs resource policy on the target log group (not an IAM role on the domain); skip it and the domain reconfigures cleanly while no events arrive. In a GuardDuty organization, only the delegated administrator can enable EKS_AUDIT_LOGS and the cleanest pattern is auto-enable so new members inherit it, with the notorious edge case that a suspended member lacking the feature keeps GuardDuty.5 red until it is disassociated. Security Hub re-evaluates on a periodic or change-triggered cycle, so a fix can lag the change by a short window.
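The resource policy is the step most often skipped, so here is a minimal sketch of one. It assumes the two actions the service needs are logs:PutLogEvents and logs:CreateLogStream, and it takes the resource ARN as given by the caller rather than guessing its shape. The policy name and function names are hypothetical.

```shell
# Emit a CloudWatch Logs resource policy that lets the Elasticsearch
# service principal write to one log group. $1 = resource ARN to allow.
es_log_policy() {
  cat <<EOF
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": {"Service": "es.amazonaws.com"},
    "Action": ["logs:PutLogEvents", "logs:CreateLogStream"],
    "Resource": "$1"
  }]
}
EOF
}

# Attach the policy. $1 = policy name (hypothetical), $2 = resource ARN.
apply_es_log_policy() {
  aws logs put-resource-policy \
    --policy-name "$1" \
    --policy-document "$(es_log_policy "$2")"
}
```

Because a missing policy fails silently, it is worth sending a test request to the domain after applying it and confirming that events arrive in the log group.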

What is the impact of leaving these surfaces unmonitored?

The primary impact is investigative blindness on the highest-value targets. The Kubernetes API brokers every meaningful action in a cluster, and escaping one container can mean owning all of them; a search domain holds the most queryable copy of sensitive data. With audit logging off, a security incident on either has no trail: you cannot determine which identity escalated privileges, read a secret, or ran a bulk export, because the events were never recorded. For GuardDuty.5 specifically, the events are being logged by the control plane and analysed by no one, so the precursors to a breach pass by unseen.

The second impact is operational. Elasticsearch error logs are where the domain records circuit-breaker trips, shard allocation failures, and mapping conflicts; without them in CloudWatch, an engineer responding to a degraded search domain restarts blind and stretches a short incident into a long one. Cluster and search audit logs are routinely the first place teams look during the next reliability incident too, not just the next security one.

On the compliance side, EKS.8, ES.4 and ES.5 map to the NIST 800-53 audit family (AU-2, AU-3, AU-12) and to PCI DSS requirement 10.2.1, and GuardDuty.5 is a High-severity control whose persistent failure drags down the overall security score and surfaces in audits and customer security questionnaires. In a GuardDuty organization there is also no partial credit: the control only clears when the delegated administrator and every active member have the feature on, so covering 28 of 30 accounts reads the same as covering none.
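The no-partial-credit rule is easy to encode. The sketch below reads one "account status" pair per line and passes only when every account reports ENABLED. How the pairs are collected is left open; the member_status helper shows one possible source, assuming get-member-detectors returns a Features list per member, and both function names are illustrative.

```shell
# Read "<account-id> <status>" lines on stdin and apply the rule:
# PASS only if every listed account reports ENABLED.
coverage_verdict() {
  local total=0 missing=0 account status
  while read -r account status; do
    [ -n "$account" ] || continue
    total=$((total + 1))
    if [ "$status" != "ENABLED" ]; then
      missing=$((missing + 1))
      echo "gap: $account ($status)"
    fi
  done
  if [ "$total" -gt 0 ] && [ "$missing" -eq 0 ]; then
    echo "PASS: $total of $total accounts covered"
  else
    echo "FAIL: $((total - missing)) of $total accounts covered"
  fi
}

# One possible feed, run from the delegated administrator.
# $1 = detector ID, remaining arguments = member account IDs.
member_status() {
  local detector="$1"
  shift
  aws guardduty get-member-detectors --detector-id "$detector" \
    --account-ids "$@" \
    --query "MemberDataSourceConfigurations[].[AccountId, Features[?Name=='EKS_AUDIT_LOGS'].Status | [0]]" \
    --output text
}
```

Piping member_status into coverage_verdict prints each gap and then a single verdict, so 28 of 30 shows up as FAIL rather than as a comforting percentage.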

How do you enable cluster and search audit logging safely?

Work the capability as one loop rather than chasing individual findings. The order matters: confirm prerequisites and scope before flipping switches, and set retention before logs start piling up.

1. Inventory which clusters and domains record and which are watched

Across every region and account, check which EKS clusters have the audit log type enabled, whether GuardDuty EKS Audit Log Monitoring is on at the delegated administrator, and which Elasticsearch domains publish error and audit logs. EKS.8 and the GuardDuty control are change-triggered or periodic, so a cluster or domain created before the standard will silently fail until someone touches it. Produce a one-line pass/fail per resource and rank by data sensitivity.

2. Enable the EKS audit log type and turn on the detector

Enable the audit log type with update-cluster-config; it is a non-disruptive control-plane change that does not restart the API server or evict pods, and EKS creates the /aws/eks/<cluster-name>/cluster log group on first enable. Separately, enable EKS_AUDIT_LOGS on the GuardDuty detector (the delegated administrator in an organization) and auto-enable it for all members so new accounts inherit it. If GuardDuty.5 stays red, disassociate any suspended member that lacks the feature.

3. Confirm FGAC, then enable the Elasticsearch error and audit logs

Error logging (ES.4) has no prerequisite: prepare a CloudWatch Logs group with the es.amazonaws.com resource policy and set ES_APPLICATION_LOGS. Audit logging (ES.5) requires fine-grained access control first; if AdvancedSecurityOptions is disabled, enabling it triggers a blue/green deployment, so schedule a window before setting AUDIT_LOGS. After each change, run a test request and confirm events actually land in the log group, because a clean reconfigure with a missing resource policy produces zero events silently.

4. Cap retention and prevent recurrence

Set a retention policy on every CloudWatch log group; the audit stream is the highest-volume EKS log type and the default is never-expire. Match the window to the strictest compliance obligation in scope. Then bake the audit log type, FGAC, the log publishing options, GuardDuty auto-enable, and retention into the provisioning template and Config rules (eks-cluster-log-enabled, opensearch-audit-logging-enabled) so new clusters and domains arrive compliant and the controls stay green by construction.

# Enable the EKS audit log type (non-disruptive), then bound the cost with retention.
aws eks update-cluster-config \
  --name prod-platform \
  --logging '{"clusterLogging":[{"types":["audit"],"enabled":true}]}'

aws logs put-retention-policy \
  --log-group-name /aws/eks/prod-platform/cluster \
  --retention-in-days 90

# Turn on GuardDuty EKS Audit Log Monitoring and auto-enable for the whole org.
DETECTOR=$(aws guardduty list-detectors --query 'DetectorIds[0]' --output text)
aws guardduty update-detector --detector-id "$DETECTOR" \
  --features '[{"Name":"EKS_AUDIT_LOGS","Status":"ENABLED"}]'
aws guardduty update-organization-configuration --detector-id "$DETECTOR" \
  --features '[{"Name":"EKS_AUDIT_LOGS","AutoEnable":"ALL"}]'
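The commands above cover the EKS and GuardDuty side. Step 3 can be sketched along the same lines, assuming the aws es command set, a log group that already exists, and a resource policy for es.amazonaws.com already in place. The function names are illustrative, and the audit-log helper refuses to proceed when fine-grained access control is off rather than letting the API reject the change.

```shell
# Enable error logs (ES.4). $1 = domain name, $2 = log group ARN.
enable_es_error_logs() {
  aws es update-elasticsearch-domain-config --domain-name "$1" \
    --log-publishing-options \
    "ES_APPLICATION_LOGS={CloudWatchLogsLogGroupArn=$2,Enabled=true}"
}

# Enable audit logs (ES.5), but only when FGAC is already on.
# $1 = domain name, $2 = log group ARN.
enable_es_audit_logs() {
  local fgac
  fgac=$(aws es describe-elasticsearch-domain --domain-name "$1" \
    --query 'DomainStatus.AdvancedSecurityOptions.Enabled' --output text)
  if [ "$fgac" != "True" ]; then
    echo "skip $1: fine-grained access control is not enabled" >&2
    return 1
  fi
  aws es update-elasticsearch-domain-config --domain-name "$1" \
    --log-publishing-options \
    "AUDIT_LOGS={CloudWatchLogsLogGroupArn=$2,Enabled=true}"
}
```

Keeping the two helpers separate reflects the difference in prerequisites: the error log can be switched on immediately, while the audit log waits for the scheduled window in which FGAC is enabled.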

Quick quiz

Question 1 of 5

Security Hub shows EKS.8, GuardDuty.5 and ES.5 all failing. What is the most efficient way to think about them?

Keep learning

Go deeper on how the cluster and search layers record and detect activity.

You can now treat cluster and search audit logging as one capability rather than a scatter of findings: inventory which clusters and domains record and which are watched, enable the EKS audit log type and the GuardDuty detector, confirm the FGAC prerequisite before turning on Elasticsearch audit logs, and cap retention so the cost stays bounded. The Controls this lesson covers section below links every control in this group to its deep page and fix.

Cluster and search audit logging: the cost and risk view

A low-cost detective layer on the most-attacked parts of the platform

This is a compliance and detection capability with a small, tunable cost dimension. The EKS and Elasticsearch audit logs stream into CloudWatch Logs, where you pay for ingestion and storage; GuardDuty EKS Audit Log Monitoring is agentless and billed by the volume of audit-log events analysed, with no per-node licence. For most fleets these are minor lines on bills you already pay, and they are controllable through retention policies on the log groups.

Frame each failing control by the risk it covers, not the dollars. A compromised Kubernetes control plane can mean every workload in the cluster is exposed at once; an unlogged search domain holding regulated data means you cannot prove who accessed it in an audit or a breach. Both are the kind of gap an assessor singles out precisely because the fix is cheap and well understood, which makes a lingering red status look like neglect rather than a considered trade-off.

The one place cost can drift is unbounded retention on a busy cluster's audit stream, which is the highest-volume EKS log type. The finance contribution is to fund the logging as a platform cost, require a deliberate retention policy on every log group, and track the controls to green across the whole fleet rather than a sample.

This lesson is for the finance partner who sees CloudWatch Logs and GuardDuty climb after a security push and wants to know what they buy and whether they are justified. It covers why the cost is small and usage-based, why retention is the only real cost lever, and how to track these controls to green across the fleet so detection and audit coverage do not silently regress.

How a finance partner frames the cluster and search logging findings

Diego brings the findings to his finance partner before the healthcare-SaaS security review because two of them, the EKS audit log and GuardDuty EKS Audit Log Monitoring, add usage-based lines to bills the company already pays, and he wants the spend sized before he turns anything on. The partner's read on cost is calm: the EKS and Elasticsearch audit streams are modest CloudWatch Logs lines, and GuardDuty EKS Audit Log Monitoring is agentless and billed by the volume of audit events analysed, with no per-node licence. For most fleets none of this moves a quarterly number, and it charges back cleanly to the team that owns each cluster or domain.

What the partner cares about is the asymmetry. These controls cover the most-attacked layers of the container and search stacks: a compromised Kubernetes control plane can expose every workload at once, and an unlogged search domain holding regulated patient records means the company cannot prove who accessed it in an audit or a breach. The events the logging catches, privilege escalation and unprovable access to sensitive data, are the precursors to six- and seven-figure incidents. The one trap the partner flags is unbounded retention on the audit stream, which is the highest-volume EKS log type and converts a small predictable cost into an open-ended one, so the finance condition is a deliberate retention policy on every log group at the moment logging is enabled.

Why this matters to the budget and the risk register

The cost side of this capability is genuinely small. The EKS and Elasticsearch audit streams are modest CloudWatch Logs lines, and GuardDuty EKS Audit Log Monitoring is agentless and usage-based with no per-node licence. For most fleets none of these moves a quarterly number, and they charge back cleanly to the team that owns each cluster or domain.

The risk side is where the asymmetry lives. These controls cover the most-targeted layers of the container and search stacks, and the events they catch (privilege escalation, anonymous access to secrets, unprovable access to regulated data) are precisely the precursors to six- and seven-figure incidents. A small, fixed monitoring spend against that tail risk is one of the better risk-adjusted lines in the security portfolio.

The failure mode finance should watch for is not the control being off; it is the control turned on everywhere with no retention policy, which converts a small predictable cost into an unbounded one. The right report line is the controls green across all clusters and domains with retention set, not just the dollar figure, and the trend to react to is the status regressing, not the usage-based fee growing with healthy workload growth.

What finance can do about the cluster and search logging gap

Finance cannot toggle a log type, but it can fund the logging as a platform cost, keep the one real cost lever in check, and track the controls to green across the whole fleet. Three levers.

1. Fund the logging and detection as a small, charged-back platform cost

The EKS and Elasticsearch audit streams are modest CloudWatch Logs lines and GuardDuty EKS Audit Log Monitoring is agentless and usage-based with no per-node licence, so treat the whole capability as a planned platform line that charges back to the team owning each cluster or domain. It is one of the better risk-adjusted lines in the security portfolio: a small fixed spend against the tail risk of an un-investigable control-plane compromise or unprovable access to regulated data.

2. Require a deliberate retention policy on every log group

The failure mode to watch is not the control being off; it is the control turned on everywhere with no retention policy, which converts a small predictable cost into an unbounded one. The audit stream is the highest-volume EKS log type and the default is never-expire. Make a retention policy matched to the strictest compliance obligation in scope a non-negotiable part of enabling logging, not a follow-up, so storage stays bounded as the cluster grows.

3. Report the controls green across the fleet, not a sampled dollar figure

The right report line is the controls green across all clusters and domains with retention set, not just the usage-based fee. Because a GuardDuty organization gives no partial credit, covering most accounts reads the same as covering none, so the metric is fleet-wide coverage. React to the status regressing, which is a control failure to remediate now, rather than to the fee growing in step with healthy workload growth, which is expected.
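The retention lever can be checked mechanically, which gives finance a concrete list to ask about rather than a policy statement. This is a minimal sketch, assuming the platform team runs it with the AWS CLI and that text output prints None where retentionInDays is unset. The function name and the prefix argument are illustrative.

```shell
# List log groups under a name prefix that have no retention policy,
# meaning they never expire. $1 = log group name prefix, e.g. /aws/eks/
unbounded_log_groups() {
  local name retention
  aws logs describe-log-groups --log-group-name-prefix "$1" \
    --query 'logGroups[].[logGroupName, retentionInDays]' --output text |
  while read -r name retention; do
    if [ "$retention" = "None" ]; then
      echo "$name"
    fi
  done
}
```

An empty result for the cluster and domain log groups is the "retention set" half of the report line; any name it prints is a log group whose storage cost is currently open-ended.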

Quick quiz

Question 1 of 5

How is GuardDuty EKS Audit Log Monitoring billed?

Keep learning

Go deeper on how the cluster and search layers record and detect activity.

You have finished the finance view of cluster and search audit logging. You know the direct cost is small (modest CloudWatch Logs lines plus agentless, usage-based GuardDuty with no per-node licence), that unbounded retention on the highest-volume EKS audit stream is the one real cost trap to cap, and that the exposure is an un-investigable control-plane compromise or unprovable access to regulated data. Next time this group of findings appears, you will fund the logging as a charged-back platform cost, require a retention policy on every log group, and report the controls green across the whole fleet rather than a sampled fee.

Cluster and search audit logging: the headline

Whether the business can prove what happened inside its clusters and search

If the business runs on Kubernetes or on managed search, the cluster control plane and the search domain are where attackers do their work and where auditors expect an access trail. This capability is the record of who did what inside those layers, and whether something is watching it. The report shows it as separate findings across EKS, GuardDuty and Elasticsearch, but the question underneath is one: can we reconstruct and detect activity on our most-attacked surfaces?

The leadership question is not the command; it is whether audit logging is on by default for every cluster and every sensitive search domain, with no exceptions that no one signed off on, and whether a detector is watching the highest-risk layer. The defensible end state is detection and an access trail by design, inherited automatically by new clusters and domains, not switched on after a scanner complains.

The economics are lopsided: a modest, tunable logging and detection cost against the inability to investigate a breach or pass an audit. That is the rare trade leadership should be quickest to make a default.

A short read for the leader who needs to know what cluster and search audit logging proves, why making it a default is a governance move rather than a budget one, and what good looks like: detection on the riskiest layers and an access trail on every sensitive domain, inherited by new resources automatically.

What it looks like when audit logging is a default, not a scanner response

The findings reached the executive review after the security team realised that three production clusters had been running their Kubernetes API server unrecorded and a legacy Elasticsearch domain backing customer-record search kept no access trail at all. The lesson leadership drew was not about those specific resources. It was that the business runs on Kubernetes and managed search, that the control plane and the search domain are exactly where attackers do their work and where auditors expect a trail, and that the company could neither reconstruct nor detect activity on its most-attacked surfaces.

So the executive framing settled on two things. First, the economics are lopsided: a modest, tunable logging and detection cost against the inability to investigate a breach or pass a PCI and NIST audit, which is the rare trade to make a default. Second, the end state is detection and an access trail by design, inherited automatically by new clusters and domains rather than switched on after a scanner complains. In a GuardDuty organization there is no partial credit either, so covering 28 of 30 accounts reads the same as covering none. The question leadership keeps asking is whether audit logging is on by default everywhere and a detector is watching the highest-risk layer, not what the line item costs.

Why this is a board-level risk

These controls sit at the intersection of security and compliance. The audit log is what lets the company prove what happened inside its container platform and its search infrastructure, to investigators after an incident and to auditors on demand, and the GuardDuty layer is what turns that log into detection. A failing control means that proof and that detection do not exist for the affected resources.

What makes it a clean leadership decision is the lopsided economics: a modest, tunable logging and detection cost against the inability to investigate a breach and a concrete PCI and NIST audit finding. A High-severity control that is cheap to fix but stays open, or that does not auto-inherit to new accounts, also signals that the security operating model is not closing obvious gaps. The leadership move is to make these controls on-by-default and inherited, then ask for the status rather than the dollar.

The leadership move on cluster and search audit logging

The lever is not to approve each toggle; it is to make audit logging on-by-default and inherited, then ask for the status rather than the dollar.

1. Make audit logging and detection a default, inherited by new resources

Set the standard that every cluster exports its audit log and every sensitive search domain publishes its audit and error logs, with GuardDuty EKS Audit Log Monitoring auto-enabled across the organization so new member accounts inherit it. Bake the audit log type, the FGAC prerequisite for search, the log publishing options and retention into the provisioning template and Config rules so new clusters and domains arrive compliant and the controls stay green by construction.

2. Treat the lopsided economics as a clean reason to act, with no quiet exceptions

A High-severity control that is cheap to fix but stays open, or that does not auto-inherit to new accounts, signals that the security operating model is not closing obvious gaps. The cost is a modest, tunable logging and detection line against the inability to investigate a breach and a concrete PCI and NIST finding, so any exception should be a deliberate, recorded decision, not a finding nobody signed off on.

3. Ask for the status across the fleet, not the command

Because a GuardDuty organization gives no partial credit, the one question worth asking is whether audit logging is on for every cluster and sensitive domain and the detector is watching the highest-risk layer, with no unsigned-off exceptions. A clean, fleet-wide attested answer is the acceptable one; a 28-of-30 answer is the same as none and is an accountability conversation.

Quick quiz

Question 1 of 5

What is the single leadership question behind the EKS, GuardDuty and Elasticsearch findings?

Keep learning

Go deeper on how the cluster and search layers record and detect activity.

Two takeaways: the cluster control plane and the search domain are where attackers work and where auditors expect a trail, and a failing control means that proof and that detection do not exist for the affected resources. The economics are lopsided enough that on-by-default and inherited is the right call, and in a GuardDuty organization there is no partial credit, so coverage must be fleet-wide. The thing to ask for is the status across every cluster and sensitive domain, not the dollar.

Controls this lesson covers

One capability, many AWS Security Hub controls. This lesson is the shared playbook; each control below keeps its own deep page with the exact check, severity and a copy-and-paste fix.

EKS

EKS.8 Medium EKS clusters should have audit logging

ES

GuardDuty

GuardDuty.5 High GuardDuty EKS audit log monitoring is off

Part of the learning path See what's happening
