
Harden ECS container workloads

One capability across ECS task definitions, services, task sets and clusters: drop the privileges, close the network paths, move secrets out of plaintext and turn on the logging so a single compromised container stays a contained incident.


Last reviewed 16 June 2026

Remediates AWS Security Hub: ECS.3 ECS.4 ECS.5 ECS.8 ECS.9 ECS.12 ECS.16 ECS.20 ECS.21

Hardening ECS workloads: the basics

Why a default ECS task definition is more privileged and more exposed than you think

An Amazon ECS task definition is the blueprint for a running container: its image, CPU and memory, networking, the user it runs as, what it can write, what it logs and what secrets it carries. The trouble is that the defaults lean towards convenience, not safety. Leave the user field unset and a Linux container runs as the image's default user, which is root unless the Dockerfile sets USER; leave readonlyRootFilesystem unset and the container can rewrite its own filesystem at runtime; paste a credential into the environment block and it is stored in plaintext in that revision, readable by anyone who can describe the task definition. Set pidMode to host or privileged to true and the wall between the container and the EC2 instance it shares with other tasks effectively disappears.

AWS Security Hub turns each weak default into its own control, which is why a single cluster can fail half a dozen ECS checks at once. ECS.3 flags a shared host process namespace (pidMode host); ECS.4 flags privileged containers; ECS.5 flags a writable root filesystem; ECS.8 flags AWS credentials in the environment block; ECS.9 flags missing log configuration; ECS.12 flags clusters without Container Insights; ECS.16 flags task sets that assign public IPs; ECS.20 and ECS.21 flag Linux containers running as root and Windows containers running as containeradministrator. They look like separate problems on the report, but they are one capability: give each container the least privilege, the smallest network footprint and the most visibility it can do its job with.

The reason these matter is blast radius. A container that runs as root, can write to itself, shares the host's process namespace, or carries a plaintext key is a far better launchpad for an attacker than a constrained one. Most of the failures are drift: a Dockerfile that never set a USER, or a task definition copied from a Stack Overflow answer. The job is to find every over-privileged or over-exposed task definition, harden it at the source, then gate the pipeline so the failing configuration cannot ship again.

In this lesson you will learn how ECS expresses privilege, network exposure, secret handling and observability across task definitions, task sets and clusters, how to find every weakly-configured workload in an account, and how to harden them without breaking apps that legitimately need to write or carry config. The Controls this lesson covers section lists every Security Hub control in this capability, each linking to a deep page with the exact check and a copy-and-paste fix.

Fun fact

The backdoor that couldn't find a home

In a red-team exercise against a payments platform the attackers popped a container through a vulnerable dependency and tried their standard move: write a persistence script and a reverse-shell binary to disk so they would survive a restart. Every write failed with Read-only file system. The task had a non-root user, a read-only root filesystem, no privileged flag, and exactly one narrow writable mount that was wiped on every recycle. The foothold lasted until the next deployment, hours later, with nothing persisted. The whole class of post-exploitation moves was off the table because of a handful of one-line settings in the task definition.

Finding over-privileged containers across an estate

Devon is on the platform team at a B2B SaaS company preparing for a SOC 2 renewal. Security Hub shows ECS failures spread across task definitions in the production cluster: containers running as root, a writable root filesystem, and at least one plaintext credential.

Rather than work the findings one by one, he starts by inspecting the highest-risk task definition, the public-facing checkout-api, to see how many controls a single workload is failing at once.

Start with a public-facing service and inspect the user, filesystem and privilege settings together. One task definition often fails several controls.

$ aws ecs describe-task-definition --task-definition checkout-api --query 'taskDefinition.containerDefinitions[].{Name:name,User:user,ReadOnly:readonlyRootFilesystem,Priv:privileged}' --output table


# user None defaults to root (fails ECS.20); readonly None defaults to writable (fails ECS.5).

An unset user and an unset read-only flag both default to the insecure value. One workload here trips ECS.20 and ECS.5 at once; fix it at the source in one revision.
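The same default-resolution logic can be sketched in Python, assuming the JSON from describe-task-definition has already been loaded into a dict. The helper names `is_root_user` and `summarise_containers` are illustrative and not part of any AWS SDK, and the example task definition is hypothetical.

```python
def is_root_user(user):
    """Return True when a task-definition `user` value resolves to root.

    An unset value counts as root: the container then runs as the image's
    default user, and ECS.20 also fails on an absent user.
    Handles the user, uid, user:group and uid:gid forms.
    """
    if user is None or str(user).strip() == "":
        return True
    name = str(user).split(":", 1)[0].strip()
    return name in ("root", "0")


def summarise_containers(task_definition):
    """Return one row per container with the insecure defaults made explicit."""
    rows = []
    for c in task_definition.get("containerDefinitions", []):
        rows.append({
            "name": c.get("name"),
            "runs_as_root": is_root_user(c.get("user")),
            # readonlyRootFilesystem defaults to false when omitted.
            "writable_root": c.get("readonlyRootFilesystem") is not True,
            # privileged defaults to false when omitted.
            "privileged": c.get("privileged") is True,
        })
    return rows


# Hypothetical task definition with user and readonlyRootFilesystem unset.
example = {"containerDefinitions": [{"name": "app", "image": "example/app:1"}]}
for row in summarise_containers(example):
    print(row)
```

Resolving the defaults up front means a missing field is reported as the insecure value it implies rather than as a blank cell.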

How ECS hardening actually works (deep dive)

Most ECS controls resolve to one of three concerns, all evaluated on the latest active revision of the task definition (or the cluster, or the task set). The first is privilege: the user field (ECS.20 for Linux, ECS.21 for Windows), the privileged flag (ECS.4) and pidMode (ECS.3). The second is exposure and integrity: readonlyRootFilesystem (ECS.5) and assignPublicIp on a task set's network configuration (ECS.16). The third is secrets and observability: AWS credential keys in the environment block versus the secrets field (ECS.8), the logConfiguration block (ECS.9) and containerInsights on the cluster (ECS.12).
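As a rough illustration of that grouping, the sketch below sorts the task-definition-level controls under the three concerns. It assumes a dict shaped like the `taskDefinition` object from describe-task-definition; the function name and concern labels are made up for this example, and ECS.12, ECS.16 and ECS.21 are deliberately left out because they live on clusters, task sets and Windows definitions.

```python
# Environment variable names that ECS.8 looks for.
AWS_CREDENTIAL_KEYS = {
    "AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "ECS_ENGINE_AUTH_DATA",
}


def failing_controls_by_concern(task_definition):
    """Group failing controls for one Linux task definition by concern."""
    concerns = {
        "privilege": set(),
        "exposure_integrity": set(),
        "secrets_observability": set(),
    }

    # pidMode is set on the task, not on individual containers.
    if task_definition.get("pidMode") == "host":
        concerns["privilege"].add("ECS.3")

    for c in task_definition.get("containerDefinitions", []):
        user = (c.get("user") or "").split(":", 1)[0]
        if user in ("", "root", "0"):
            concerns["privilege"].add("ECS.20")
        if c.get("privileged") is True:
            concerns["privilege"].add("ECS.4")
        if c.get("readonlyRootFilesystem") is not True:
            concerns["exposure_integrity"].add("ECS.5")
        env_names = {e.get("name") for e in c.get("environment", [])}
        if env_names & AWS_CREDENTIAL_KEYS:
            concerns["secrets_observability"].add("ECS.8")
        if not c.get("logConfiguration"):
            concerns["secrets_observability"].add("ECS.9")

    return {name: sorted(ids) for name, ids in concerns.items()}
```

One failing container is enough to fail the whole task definition, which is why the result is a set per concern rather than a per-container list.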

Task definitions are immutable, so every fix is the same shape: register a new revision with the corrected fields, then redeploy the service so running tasks pick it up. Registering a revision does not restart running tasks, so a control can turn green while tasks from the old, insecure revision keep running until you redeploy. ECS.5 is reported NOT_APPLICABLE for Windows containers; ECS.20 only evaluates Linux task definitions and ECS.21 only Windows; ECS.16 is the task-set sibling of the service-level public-IP check, because task sets carry their own network configuration during blue-green and external deployments.
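Because a revision cannot be edited in place, a fix usually starts from the described definition and produces input for a new one. The sketch below shows one way to do that in Python; `to_register_payload` and the `patch_container` callback are hypothetical names, and the list of output-only fields reflects what describe-task-definition returns alongside the registerable parameters.

```python
import copy

# Fields that describe-task-definition returns but that are not accepted
# as input when registering a new revision.
READ_ONLY_FIELDS = (
    "taskDefinitionArn", "revision", "status", "requiresAttributes",
    "compatibilities", "registeredAt", "registeredBy", "deregisteredAt",
)


def to_register_payload(described, patch_container):
    """Build input for a new revision from a described task definition.

    `patch_container` is a caller-supplied function applied to a copy of
    each container definition; the original dict is left untouched.
    """
    payload = copy.deepcopy(described)
    for field in READ_ONLY_FIELDS:
        payload.pop(field, None)
    payload["containerDefinitions"] = [
        patch_container(c) for c in payload.get("containerDefinitions", [])
    ]
    return payload
```

The result can be written out as JSON and passed to register-task-definition with --cli-input-json; running tasks still need a redeploy before they use it.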

The strongest position is preventive. ECS.8 in particular needs care: the secrets field pulls a value from Secrets Manager or SSM Parameter Store at launch using the task execution role (not the task role), so the execution role needs secretsmanager:GetSecretValue (or ssm:GetParameters for Parameter Store) plus kms:Decrypt if a customer-managed key is used, and any credential that ever sat in the environment block must be rotated, not just relocated. Bake non-root users, read-only filesystems, the secrets pattern, log configuration and Container Insights into a shared task-definition template, and add a CI policy check (cfn-guard or OPA) that rejects any non-compliant definition before it deploys, with an AWS Config rule as a detective backstop for anything that slips past the pipeline.
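A minimal sketch of the execution-role permissions described above, expressed as a function that builds the IAM policy document. The function name is illustrative, the KMS key ARN in the example is a placeholder, and a real policy would be scoped to the specific secrets each service reads.

```python
import json


def execution_role_secrets_policy(secret_arns=(), parameter_arns=(), kms_key_arn=None):
    """Build an IAM policy document for the task execution role.

    Secrets Manager secrets need secretsmanager:GetSecretValue, Parameter
    Store parameters need ssm:GetParameters, and kms:Decrypt is only
    required when a customer-managed key encrypts the values.
    """
    statements = []
    if secret_arns:
        statements.append({
            "Effect": "Allow",
            "Action": ["secretsmanager:GetSecretValue"],
            "Resource": list(secret_arns),
        })
    if parameter_arns:
        statements.append({
            "Effect": "Allow",
            "Action": ["ssm:GetParameters"],
            "Resource": list(parameter_arns),
        })
    if kms_key_arn:
        statements.append({
            "Effect": "Allow",
            "Action": ["kms:Decrypt"],
            "Resource": [kms_key_arn],
        })
    return {"Version": "2012-10-17", "Statement": statements}


policy = execution_role_secrets_policy(
    secret_arns=["arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/checkout/db-AbCdEf"],
    kms_key_arn="arn:aws:kms:us-east-1:123456789012:key/EXAMPLE-KEY-ID",  # placeholder
)
print(json.dumps(policy, indent=2))
```

Attaching this to the execution role rather than the task role keeps the application code itself from being able to read the secret store directly.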

What is the impact of leaving containers unhardened?

The direct impact is a wider blast radius on any compromise. A container running as root, with a writable filesystem, or with the privileged flag or a shared host process namespace, gives an attacker who lands a foothold the tools to escalate, persist, and break out onto the host that runs all your tasks. Read-only filesystems and non-root users remove most of the post-exploitation playbook; isolating the process namespace and dropping privileged keep a single compromised service from reaching the box and its neighbours.

The second-order impact is credential and data exposure. A plaintext AWS key in an environment block is readable by anyone who can describe the task definition and is a common root cause of expensive account compromise: automated scanners can find a leaked key within hours. A task set that assigns public IPs quietly puts a backend on the internet during a blue-green cutover even when the service looks locked down. Each of these is a path from a misconfiguration to a real incident.

There is an observability impact too: a container with no log configuration is a black box when it crashes, and a cluster without Container Insights is one you pay for but cannot see into until an incident forces the question. On the compliance side these controls map to NIST 800-53 access-control, audit-and-accountability and PCI DSS requirements, so an open backlog drags the posture score, surfaces in SOC 2 and PCI assessments, and becomes friction in enterprise procurement, independent of whether any breach ever occurs.

How do you harden containers safely?

Work the capability as one loop rather than chasing individual findings. The order matters: work out what each container legitimately needs (to write, to log, to read as a secret) before you tighten it, so you do not break a running service.

1. Inventory every task definition, task set and cluster

Across accounts and regions, list the latest active revision of each task definition and flag containers that run as root or containeradministrator, set privileged true, set pidMode host, lack readonlyRootFilesystem, carry credential keys in the environment block, or have no logConfiguration. List task sets with assignPublicIp ENABLED and clusters without Container Insights. Prioritise internet-facing and high-privilege services first; remember ECS.20 is Linux only and ECS.21 Windows only.
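The task-set and cluster half of the inventory can be sketched as two small filters over API output. This assumes `clusters` is the list returned by describe-clusters called with the SETTINGS include and `task_sets` is the list from describe-task-sets; treating both "enabled" and "enhanced" as passing for Container Insights is an assumption of this example.

```python
def clusters_without_container_insights(clusters):
    """Return names of clusters whose containerInsights setting is not on."""
    flagged = []
    for cluster in clusters:
        settings = {
            s.get("name"): s.get("value") for s in cluster.get("settings", [])
        }
        if settings.get("containerInsights") not in ("enabled", "enhanced"):
            flagged.append(cluster.get("clusterName"))
    return flagged


def task_sets_with_public_ip(task_sets):
    """Return ids of task sets whose awsvpc configuration assigns a public IP."""
    flagged = []
    for task_set in task_sets:
        vpc = task_set.get("networkConfiguration", {}).get("awsvpcConfiguration", {})
        if vpc.get("assignPublicIp") == "ENABLED":
            flagged.append(task_set.get("id"))
    return flagged
```

Running both across every account and region gives the ECS.12 and ECS.16 side of the inventory, which the task-definition scan alone would miss.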

2. Work out each container's legitimate needs before tightening

Before flipping flags, find out what each container actually needs: which paths it writes (for a narrow tmpfs or volume mount under a read-only root), whether it genuinely needs the privileged flag or host pidMode (almost never), and which environment values are really secrets. For ECS.8, treat any value that ever lived in the environment block as leaked and rotate it before relocating. The cleanest apps write nothing to root and carry no secrets in env, and harden with no other change.
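Deciding which environment values are really secrets is partly a judgment call, but a first pass can be automated. The sketch below separates the exact key names ECS.8 looks for from names that merely look secret-like; the regular expression is a heuristic chosen for this example and will produce false positives and negatives, so its output is a review list rather than a verdict.

```python
import re

# Names that ECS.8 checks for in the environment block.
ECS8_KEYS = {"AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "ECS_ENGINE_AUTH_DATA"}

# Heuristic only (an assumption of this sketch): names that often hold secrets.
SECRET_NAME_HINT = re.compile(
    r"(PASSWORD|PASSWD|SECRET|TOKEN|API_?KEY|PRIVATE_?KEY)", re.IGNORECASE
)


def triage_environment(container):
    """Sort a container's environment variable names into three buckets."""
    result = {"ecs8": [], "likely_secret": [], "plain_config": []}
    for entry in container.get("environment", []):
        name = entry.get("name", "")
        if name in ECS8_KEYS:
            result["ecs8"].append(name)
        elif SECRET_NAME_HINT.search(name):
            result["likely_secret"].append(name)
        else:
            result["plain_config"].append(name)
    return result
```

Anything in the first two buckets is a candidate for rotation and a move to the secrets field; the third bucket can usually stay where it is.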

3. Register hardened revisions and redeploy, highest impact first

Set a non-root user, readonlyRootFilesystem true with narrow writable mounts, privileged false, pidMode unset or task, a logConfiguration block, and move secrets to the secrets field referencing a Secrets Manager or SSM ARN (granting the execution role read plus kms:Decrypt). Disable assignPublicIp on task sets and enable Container Insights on clusters. Register a new revision and redeploy the service, since task definitions are immutable; roll out one service as a canary, confirm no Read-only file system errors and healthy tasks, then proceed.
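The container-level part of this step can be sketched as a pure function that returns a hardened copy of one Linux container definition. The function name and the `secret_arns` mapping are hypothetical; the sketch assumes each mapped value has already been rotated and stored, and it leaves logConfiguration and writable mounts to the caller because they depend on the application.

```python
import copy


def harden_container(container, secret_arns, user="1000:1000"):
    """Return a hardened copy of one Linux container definition.

    `secret_arns` maps an environment variable name to the Secrets Manager
    or SSM ARN that now holds its value.
    """
    hardened = copy.deepcopy(container)
    hardened["user"] = user
    hardened["readonlyRootFilesystem"] = True
    # Dropping the key falls back to the default, which is non-privileged.
    hardened.pop("privileged", None)

    kept_env = []
    secrets = list(hardened.get("secrets", []))
    for entry in hardened.get("environment", []):
        arn = secret_arns.get(entry["name"])
        if arn:
            # Reference the stored value instead of carrying it in plaintext.
            secrets.append({"name": entry["name"], "valueFrom": arn})
        else:
            kept_env.append(entry)
    hardened["environment"] = kept_env
    if secrets:
        hardened["secrets"] = secrets
    return hardened
```

Keeping the transformation side-effect free makes it easy to diff the old and new definitions before registering the revision for the canary service.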

4. Gate the pipeline so the failing configuration can't ship again

Cleanup without prevention just resets the clock. Bake the hardened settings into a shared task-definition template or IaC module, and add a CI policy check (cfn-guard, OPA) that rejects any task definition with a root user, a writable root, a privileged container, host pidMode, a credential in env, or no logging. Keep an AWS Config rule behind it as a detective backstop for anything deployed outside the pipeline. Make the secure choice the default choice so engineers get it for free.
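For teams without cfn-guard or OPA in place yet, the gate can be approximated with a short script over CloudFormation templates. This is a sketch under stated assumptions: the template is already parsed into a dict, property values are literals (intrinsic functions such as Ref are not resolved), and the function names are illustrative rather than an existing tool.

```python
CREDENTIAL_KEYS = {"AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "ECS_ENGINE_AUTH_DATA"}


def _is_true(value):
    """CloudFormation accepts both JSON true and the string "true"."""
    return value is True or str(value).lower() == "true"


def template_violations(template):
    """List (logical id, reason) pairs for every ECS task definition resource."""
    violations = []
    for logical_id, resource in template.get("Resources", {}).items():
        if resource.get("Type") != "AWS::ECS::TaskDefinition":
            continue
        props = resource.get("Properties", {})
        if props.get("PidMode") == "host":
            violations.append((logical_id, "host pidMode"))
        for c in props.get("ContainerDefinitions", []):
            name = c.get("Name", "?")
            user = str(c.get("User") or "").split(":", 1)[0]
            if user in ("", "root", "0"):
                violations.append((logical_id, f"{name}: root user"))
            if not _is_true(c.get("ReadonlyRootFilesystem")):
                violations.append((logical_id, f"{name}: writable root"))
            if _is_true(c.get("Privileged")):
                violations.append((logical_id, f"{name}: privileged"))
            if not c.get("LogConfiguration"):
                violations.append((logical_id, f"{name}: no logging"))
            env = {e.get("Name") for e in c.get("Environment", [])}
            if env & CREDENTIAL_KEYS:
                violations.append((logical_id, f"{name}: credential in env"))
    return violations


def gate(template):
    """Return a process exit code: 1 if anything is non-compliant, else 0."""
    found = template_violations(template)
    for logical_id, reason in found:
        print(f"{logical_id}: {reason}")
    return 1 if found else 0
```

A CI step that exits with the value returned by `gate` fails the build on any violation, which is the behaviour the pipeline gate needs.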

# Inventory: flag containers running as root or with a writable root filesystem.
# Passing a family name without a revision inspects its latest ACTIVE revision.
for fam in $(aws ecs list-task-definition-families --status ACTIVE \
    --query 'families[]' --output text); do
  # An unset user (null) counts as root; so do the root:group and 0:gid forms.
  aws ecs describe-task-definition --task-definition "$fam" \
    --query "taskDefinition.containerDefinitions[?user==\`null\` || user=='root' || user=='0' || starts_with(user, 'root:') || starts_with(user, '0:') || readonlyRootFilesystem!=\`true\`].{Family:'$fam',Name:name,User:user,ReadOnly:readonlyRootFilesystem}" \
    --output text
done

# Harden at the source. Dockerfile (Alpine/BusyBox syntax; UID and GID are pinned
# to 1000 so they match the task definition's user field):
#   RUN addgroup -S -g 1000 app && adduser -S -u 1000 -G app appuser
#   USER appuser
# Task definition: non-root user, read-only root with one narrow writable volume
# ("scratch" must also be declared under the task-level "volumes"), secrets via ARN.
#   "user": "1000:1000",
#   "readonlyRootFilesystem": true,
#   "mountPoints": [{ "sourceVolume": "scratch", "containerPath": "/tmp", "readOnly": false }],
#   "secrets": [{ "name": "DB_PASSWORD",
#     "valueFrom": "arn:aws:secretsmanager:us-east-1:123456789012:secret:prod/checkout/db-AbCdEf" }]

# Register the hardened revision and roll it out (tasks only update on redeploy).
aws ecs register-task-definition --cli-input-json file://checkout-api-hardened.json
aws ecs update-service --cluster prod --service checkout-api \
  --task-definition checkout-api --force-new-deployment


Keep learning

Go deeper on the ECS controls in this capability, the task-definition parameters they read, and how to enforce them.

You can now treat ECS hardening as one capability rather than a scatter of findings: inventory each task definition, task set and cluster, work out what each container legitimately needs, register hardened revisions and redeploy highest-impact first, and gate the pipeline so the failing configuration can't ship again. The Controls this lesson covers section below links every control in this group to its deep page and fix.


Hardening ECS workloads: the cost and risk view

A near-zero-cost capability that contains the blast radius of any container compromise

Containers are the small, disposable units of software that run most modern applications, and ECS is the AWS service that schedules them. Almost every control in this group costs nothing in AWS spend to fix. Setting a non-root user, locking the filesystem read-only, moving a secret to a vault and isolating the process namespace are configuration changes, not new tools or headcount. The one control with a small recurring cost is Container Insights (CloudWatch ingestion), and it is rounding error against the incidents it helps you catch.

Frame each failing control as a line on the risk register rather than a compliance checkbox. The exposure is not the hourly cost of the task, it is what a compromised container can reach: the host, the other tasks on it, the task role's permissions, and any plaintext credentials it carries. A public-facing service running as root with a plaintext AWS key carries far higher expected loss than an internal sandbox, yet both can show up as red findings. Map the failures to exposure and prioritise accordingly.

The cost shape of these findings is audit and incident, not bill. Open High-severity findings drag down the Security Hub posture score auditors and enterprise customers look at, surface in SOC 2 and PCI reviews, and stall procurement security questionnaires. The durable answer is a build-time gate so the hardened configuration is the only one that ships, which converts a recurring cleanup into a one-time engineering investment.

This lesson is for the finance partner who sees a cluster of ECS findings on the security report and wants to know what the right response is and what it costs. It covers why nearly all of these controls are free to fix, why an open High-severity finding is an audit and sales cost rather than a cloud cost, and how to turn a list of red findings into a risk-ordered plan with a pipeline gate as the acceptance bar.


How a finance partner risk-orders a wall of ECS findings

Priya is the finance partner working with the platform team on a SOC 2 renewal. Security Hub returns a scatter of ECS failures across the production cluster: containers running as root, a writable root filesystem, a plaintext AWS key in one task definition's environment block, and a cluster without Container Insights. Her instinct is not to ask how much it costs, because almost all of it is free configuration, but to ask what each compromised container could actually reach.

The team annotates the list. The public-facing checkout-api runs as root with a plaintext key, so its blast radius is the host, the neighbouring tasks and whatever that key can do; an internal batch job runs as root too but is sandboxed and carries no secrets. Priya re-orders the work by exposure rather than by the report's order, and on the one plaintext-key finding she sets a hard condition for the finance pack: the item is only closed when the key is rotated, not merely relocated, or the risk is hidden rather than removed. Her line for the pack reads: the public-facing high-privilege services harden this sprint, the one exposed credential is rotated and vaulted, and a build-time pipeline gate ships so the hardened configuration is the only one that can deploy.

Why ECS hardening belongs on the risk register

Most of these findings carry no meaningful direct cost, because the fixes are configuration changes. The financial exposure is indirect and asymmetric: a low-probability, high-impact event (a container compromise that reaches the host, or a leaked key that runs up a five-figure mining bill overnight) on one side, and the steady cost of audit and sales friction on the other. That asymmetry is exactly why a backlog of these deserves standing attention even though it rarely shows up as a dollar line.

The clearest place it lands financially is the audit and sales cycle. These controls map to recognised frameworks and contribute to the Security Hub compliance percentage you report. An open High-severity finding is the kind of thing that turns a clean SOC 2 into a qualified one, or adds a remediation commitment to an enterprise contract, both real costs in deal delays and audit rework. For ECS.8 specifically, insist a closed finding means both relocated to a vault AND the exposed credential rotated, or the risk is only hidden, not closed.

Treat the finding count and its trend as a governance metric, and fund the durable fix: a build-time gate that blocks non-compliant task definitions costs almost nothing to run and converts an ongoing remediation chore into a one-time investment that protects the audit position indefinitely.

What finance can do about the ECS hardening gap

Finance cannot register a task-definition revision, but it can set the framing that turns a scatter of red findings into a risk-ordered plan with a pipeline gate as the acceptance bar. Three levers, used at the monthly risk-and-cost review.

1. Treat the cost as audit and incident, not bill

Almost every control here (a non-root user, a read-only filesystem, isolating the process namespace, moving a secret to a vault) is a free configuration change, not new tooling or headcount. The only recurring cost is Container Insights ingestion, which is rounding error against the incidents it helps catch. The financial exposure is indirect and asymmetric: a low-probability, high-impact event such as a container compromise reaching the host or a leaked key running up a five-figure mining bill overnight, set against the steady cost of audit and sales friction from open High-severity findings.

2. Risk-weight each failing workload by what a compromise could reach

The report lists findings by control, not by exposure. Re-order them by what a compromised container can reach: the host, the other tasks on it, the task role's permissions, and any plaintext credentials it carries. A public-facing service running as root with a plaintext AWS key carries far higher expected loss than an internal sandbox, yet both can show up as red findings. For ECS.8 specifically, insist a closed finding means both relocated to a vault AND the exposed credential rotated, or the risk is only hidden, not closed.

3. Make the pipeline gate the acceptance bar for closing the work

Cleanup without prevention just resets the clock. Fund the durable fix: a build-time gate (cfn-guard or OPA, with an AWS Config rule as a detective backstop) that blocks non-compliant task definitions costs almost nothing to run and converts an ongoing remediation chore into a one-time engineering investment that protects the audit position indefinitely. Treat the gate, not the cleared finding count, as the deliverable that signals the work is genuinely closed.
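The risk-weighting in the second lever can be illustrated with a few lines of arithmetic. Everything specific here is hypothetical: the flag names, the weights and the two example workloads are placeholders, and a real risk register would set its own values.

```python
# Illustrative weights only; higher means a compromise can reach further.
WEIGHTS = {
    "internet_facing": 5,
    "plaintext_credential": 5,
    "privileged_or_host_pid": 4,
    "runs_as_root": 3,
    "writable_root": 2,
    "no_logging": 1,
}


def exposure_score(workload):
    """Sum the weights of every exposure flag set on a workload."""
    return sum(weight for flag, weight in WEIGHTS.items() if workload.get(flag))


def risk_order(workloads):
    """Return workloads sorted from highest to lowest exposure."""
    return sorted(workloads, key=exposure_score, reverse=True)


workloads = [
    {"name": "internal-batch", "runs_as_root": True},
    {"name": "checkout-api", "internet_facing": True, "runs_as_root": True,
     "plaintext_credential": True},
]
for w in risk_order(workloads):
    print(w["name"], exposure_score(w))
```

Both workloads fail the same root-user control, but the ordering puts the public-facing service with the plaintext key first, which is the point of ranking by reach rather than by finding.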


You have finished the finance view of ECS hardening. You know almost every fix is free configuration (Container Insights is the one small recurring cost), that the real exposure is audit and incident rather than bill, and that the right move is to risk-order workloads by what a compromise could reach, insist that exposed credentials are rotated and not just relocated, and treat the build-time pipeline gate as the deliverable that closes the work. Next time a wall of ECS findings lands, you will fund the durable gate rather than chasing the same cleanup each quarter.


Hardening ECS workloads: the headline

Whether containers ship with the least privilege they need, by default

Most of our applications run as containers. By default each one can run as the most powerful user, write to its own disk, share the host's internals, and carry credentials in plaintext, which means a single software vulnerability can turn one compromised container into a compromise of the host and everything else running on it. The report shows this as a scatter of separate findings across task definitions, services and clusters.

This is a security-posture and discipline issue, not a cost one. The fixes are configuration changes that cost essentially nothing, and they map directly to the frameworks enterprise customers and auditors evaluate. The leadership question is whether containers ship hardened by default, enforced in the deployment pipeline, or whether each engineer has to remember to do the right thing.

The defensible end state is that least privilege, network isolation, secret management and logging are the default a team starts from, with a build-time gate that blocks anything non-compliant. That turns a recurring backlog into a confidence signal.

A short read for the leader who needs to know what an over-privileged container estate exposes, why hardening it is a governance and discipline decision rather than a budget one, and what a secure-by-default end state, enforced in the pipeline, looks like.


What it looks like when containers ship hardened by default

After a red-team report showed that an attacker who popped one container through a vulnerable dependency could have written a persistence script to disk, escalated as root and reached the host that ran every other task, the CTO asked a direct question: do our containers ship with the least privilege they need, or does every engineer have to remember to lock each one down?

The honest answer was the latter, so the team made hardening a default rather than a habit. A shared task-definition template now sets a non-root user, a read-only root filesystem with one narrow writable mount, no privileged flag, secrets pulled from a vault by ARN, log configuration and Container Insights, and a CI policy check rejects any task definition that strays from it. The same red team came back a quarter later, popped a test container, and found every persistence move failing with Read-only file system and no plaintext credential to steal. The leadership answer had changed from we hope each team remembered to the secure configuration is the only one that ships.

Why this is on the report at all

The risk these controls address is rare but severe: a compromised container with too much privilege has a far easier path to the host and the rest of the environment, and a leaked credential is a fast, automated route to account compromise. Tracking the category is about whether the organisation ships workloads with the least privilege they need by default. A clean result means yes; a backlog means no.

There is a commercial dimension too. These findings map to the security frameworks enterprise customers and auditors evaluate, so an unaddressed backlog becomes a line in a SOC 2 report or a security questionnaire, and that translates into deal friction. The leadership value is closing the gap at the source, a pipeline default and gate, so it stops appearing on audits and questionnaires entirely.

The leadership move on container hardening

The executive lever is not to approve each task-definition fix individually; it is to require that containers ship hardened by default and that the deployment pipeline blocks anything non-compliant.

1. Set least privilege as the default a team starts from

State plainly that containers ship with the least privilege they need: a non-root user, a read-only root filesystem, no privileged flag, no shared host process namespace, secrets in a vault rather than plaintext, and logging on. The right place to enforce this is a shared template and the pipeline, not a per-engineer checklist. That reframes the existing findings from a backlog to a default that everyone gets for free.

2. Require rotation, not just relocation, as proof on credential findings

For any plaintext credential in an environment block, insist the closing definition of done is that the key was rotated or revoked first, then moved to a vault. Relocating a still-valid key leaves the risk in circulation, and a leaked key is a fast, automated route to account compromise. Ask for proof of rotation, not just a screenshot of the relocated secret, before treating the finding as closed.

3. Require a build-time gate as the durable deliverable

Ask the one closing question: can a non-compliant task definition still ship? If yes, the cleanup is temporary and the findings will regrow. The durable outcome is a pipeline gate that rejects a root user, a writable root, a privileged container, host pidMode, a credential in env or missing logging, so the controls stay green on their own and the category stops appearing on audits and questionnaires entirely.


Two takeaways: an over-privileged container turns one software vulnerability into a compromise of the host and its neighbours, and the right question is not the finding count but whether containers ship hardened by default and the pipeline blocks anything that is not. The fixes are near-free configuration changes; the durable outcome is least privilege as the default a team starts from, enforced by a build-time gate, with any exposed credential rotated rather than merely moved.


Controls this lesson covers

One capability, many AWS Security Hub controls. This lesson is the shared playbook; each control below keeps its own deep page with the exact check, severity and a copy-and-paste fix.

ECS
