Workload Identity and IRSA
At some point in your Kubernetes journey, someone on your team will create a Kubernetes Secret containing an AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY, mount it into a pod, and call it done. It works. It is also a slow-motion security incident waiting to happen.
Static credentials do not expire. They get baked into images and pushed to registries, surface in log aggregators when someone prints environment variables, and persist in Secret objects whose values are base64-encoded, which is an encoding rather than encryption, and readable by anyone with cluster-admin. The credential lifecycle is entirely manual. When a key leaks, the blast radius is everything its IAM user is permitted to do, which is often far more than the workload needs.
There is a better mechanism, and it has been available in EKS since 2019. The adoption is still lower than it should be.
How IRSA Works
IRSA — IAM Roles for Service Accounts — is the EKS implementation of workload identity. The mechanism relies on OIDC federation between your EKS cluster and AWS IAM.
The pod never holds a long-lived key. It holds a short-lived JWT issued by the Kubernetes API server. AWS STS validates that JWT against your cluster's OIDC provider and exchanges it for temporary credentials scoped to exactly the IAM role you specified — nothing more.
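To make the exchange concrete, the sketch below decodes the payload of a service account JWT and reads the claims that STS compares against the trust policy. It is illustrative only: it does not verify the signature, and the sample token is assembled locally for the example (the issuer reuses the placeholder ID from this article, and the `exp` value is arbitrary).

```python
import base64
import json


def decode_jwt_claims(token: str) -> dict:
    """Return the claims from a JWT payload WITHOUT verifying the signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected header.payload.signature")
    payload = parts[1]
    # JWTs use unpadded base64url, so restore the padding before decoding.
    payload += "=" * (-len(payload) % 4)
    return json.loads(base64.urlsafe_b64decode(payload))


def _b64url(obj: dict) -> str:
    """Encode a dict the way a JWT segment is encoded (helper for the demo)."""
    raw = json.dumps(obj).encode()
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


# Hypothetical token built for illustration; a real one is signed by the cluster.
sample_claims = {
    "iss": "https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E",
    "sub": "system:serviceaccount:payments:payments-api",
    "aud": ["sts.amazonaws.com"],
    "exp": 1700003600,
}
sample_token = ".".join([_b64url({"alg": "RS256"}), _b64url(sample_claims), "signature"])

claims = decode_jwt_claims(sample_token)
print(claims["iss"], claims["sub"], claims["aud"])
```

Inside a running pod, the same function could be pointed at the contents of the projected token file to confirm which ServiceAccount identity the pod is actually presenting.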
Setting It Up End to End
Step 1: Enable the OIDC provider for your cluster
# For an existing EKS cluster
eksctl utils associate-iam-oidc-provider \
  --region us-east-1 \
  --cluster production \
  --approve
# Verify
aws iam list-open-id-connect-providers
Step 2: Create the IAM role with a trust policy scoped to a specific ServiceAccount
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::123456789012:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E:sub": "system:serviceaccount:payments:payments-api",
          "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
The sub condition is the critical part. This role can only be assumed by the payments-api ServiceAccount in the payments namespace. A pod in billing with a different ServiceAccount cannot assume it, even if an attacker escalates pod privileges.
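Hand-editing the issuer ID in three places is an easy way to end up with a trust policy that silently never matches. As a minimal sketch, the helper below generates the same document from four inputs. The function name and parameters are hypothetical; the issuer URL is assumed to be the value your cluster reports as its OIDC issuer.

```python
import json


def build_trust_policy(account_id: str, issuer_url: str,
                       namespace: str, service_account: str) -> dict:
    """Build an IRSA trust policy pinned to one ServiceAccount in one namespace."""
    # The provider ARN and the condition keys use the issuer without its scheme.
    issuer = issuer_url
    if issuer.startswith("https://"):
        issuer = issuer[len("https://"):]
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {
                    "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{issuer}"
                },
                "Action": "sts:AssumeRoleWithWebIdentity",
                "Condition": {
                    "StringEquals": {
                        f"{issuer}:sub": f"system:serviceaccount:{namespace}:{service_account}",
                        f"{issuer}:aud": "sts.amazonaws.com",
                    }
                },
            }
        ],
    }


policy = build_trust_policy(
    account_id="123456789012",
    issuer_url="https://oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B716D3041E",
    namespace="payments",
    service_account="payments-api",
)
print(json.dumps(policy, indent=2))
```

Generating the policy also makes the one-role-per-ServiceAccount pairing explicit: each call produces a document that names exactly one namespace and one ServiceAccount.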
Step 3: Annotate the ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: payments-api
  namespace: payments
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::123456789012:role/payments-api-role
    # Optional: override token expiry in seconds (default 86400)
    eks.amazonaws.com/token-expiration: "3600"
Step 4: Reference the ServiceAccount in the Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api
  namespace: payments
spec:
  replicas: 3
  selector:
    matchLabels:
      app: payments-api
  template:
    metadata:
      labels:
        app: payments-api
    spec:
      serviceAccountName: payments-api # This line does the work
      containers:
        - name: api
          image: 123456789012.dkr.ecr.us-east-1.amazonaws.com/payments-api:v1.2.0
          env:
            - name: AWS_REGION
              value: us-east-1
          # No AWS_ACCESS_KEY_ID. No AWS_SECRET_ACCESS_KEY. Never again.
The EKS pod identity webhook projects the token into the pod at /var/run/secrets/eks.amazonaws.com/serviceaccount/token and sets the AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE environment variables. A reasonably recent AWS SDK picks these up automatically and uses the IRSA flow. Your application code does not change.
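When a pod still ends up with the wrong identity, a quick first check is whether the IRSA settings actually reached the container. The sketch below inspects the two environment variables the web identity flow relies on, AWS_ROLE_ARN and AWS_WEB_IDENTITY_TOKEN_FILE, and also flags leftover static keys, which in most SDK credential chains are consulted first and would shadow IRSA. The function name and return shape are hypothetical.

```python
import os
from typing import Mapping, Optional


def irsa_settings(env: Mapping[str, str] = os.environ) -> Optional[dict]:
    """Return the IRSA settings visible in the environment, or None if absent."""
    role_arn = env.get("AWS_ROLE_ARN")
    token_file = env.get("AWS_WEB_IDENTITY_TOKEN_FILE")
    if not role_arn or not token_file:
        return None
    return {
        "role_arn": role_arn,
        "token_file": token_file,
        # Static keys left in the environment may take precedence over IRSA.
        "static_keys_present": "AWS_ACCESS_KEY_ID" in env,
    }


# Inside a pod this reads the real environment; elsewhere it will likely print None.
print(irsa_settings())
```

Passing a plain dict instead of the real environment makes the check easy to exercise in a unit test or a startup probe.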
Scoping Roles — The Part Most Teams Get Wrong
The trust policy sub condition is not enough on its own. You also need to scope the IAM permissions on the role itself. The pattern that works:
One role per service. One role scoped to exactly the resources that service needs. No shared roles between services. No s3:* permissions because "we might need it later."
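The "no wildcards" rule is simple enough to check mechanically before a policy ships. The following is a minimal sketch, not a replacement for a real policy analyzer: it flags only wildcard actions and a bare "*" resource in Allow statements. The function name is hypothetical.

```python
def find_wildcards(policy: dict) -> list:
    """List Allow statements that use a wildcard action or a bare '*' resource."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also accepts a single statement object
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        resources = stmt.get("Resource", [])
        # Action and Resource may each be a string or a list of strings.
        actions = [actions] if isinstance(actions, str) else actions
        resources = [resources] if isinstance(resources, str) else resources
        for action in actions:
            if "*" in action:
                findings.append(f"statement {index}: wildcard action {action}")
        for resource in resources:
            if resource == "*":
                findings.append(f"statement {index}: unrestricted resource")
    return findings


too_broad = {
    "Version": "2012-10-17",
    "Statement": [{"Effect": "Allow", "Action": "s3:*", "Resource": "*"}],
}
for finding in find_wildcards(too_broad):
    print(finding)
```

A policy scoped like the payments-api example that follows passes with no findings, since a prefix such as payments-receipts/* narrows the resource rather than opening it up.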
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::payments-receipts/*"
    },
    {
      "Effect": "Allow",
      "Action": ["sqs:SendMessage", "sqs:ReceiveMessage", "sqs:DeleteMessage"],
      "Resource": "arn:aws:sqs:us-east-1:123456789012:payments-events"
    },
    {
      "Effect": "Allow",
      "Action": ["secretsmanager:GetSecretValue"],
      "Resource": "arn:aws:secretsmanager:us-east-1:123456789012:secret:payments/*"
    }
  ]
}
GKE Workload Identity
On GKE, the mechanism is Workload Identity. The concept is identical: a Kubernetes ServiceAccount maps to a Google service account through an IAM binding. Only the annotation format differs.
# GKE: annotate the Kubernetes ServiceAccount
apiVersion: v1
kind: ServiceAccount
metadata:
  name: payments-api
  namespace: payments
  annotations:
    iam.gke.io/gcp-service-account: payments-api@my-project.iam.gserviceaccount.com
# Bind the GSA to the KSA
gcloud iam service-accounts add-iam-policy-binding \
  payments-api@my-project.iam.gserviceaccount.com \
  --role roles/iam.workloadIdentityUser \
  --member "serviceAccount:my-project.svc.id.goog[payments/payments-api]"
Auditing What You Already Have
If you inherited a cluster, the first thing to check is how many Secrets in kube-system or application namespaces hold AWS access keys. Access key IDs are easy to spot because they start with AKIA (long-term IAM user keys) or ASIA (temporary STS keys).
# Find Secrets that look like they contain static credentials (needs jq 1.6+ for @base64d)
# Secrets with no data are skipped, and each matching Secret is listed once
kubectl get secrets -A -o json | \
  jq -r '.items[] | select(.data != null) | select(any(.data[]; @base64d | test("(AKIA|ASIA)[0-9A-Z]{16}"))) | "\(.metadata.namespace)/\(.metadata.name)"'
Any result from that command is a remediation ticket.
Key Takeaways
- Static AWS credentials inside Kubernetes Secrets are a persistent security liability — they do not expire, are stored base64-encoded in etcd, and require manual rotation.
- IRSA and GKE Workload Identity replace long-lived keys with short-lived tokens issued by the Kubernetes API server and validated by AWS STS or GCP IAM.
- The trust policy sub condition scopes the IAM role to a specific ServiceAccount in a specific namespace. A compromised pod in another namespace cannot assume it.
- Create one IAM role per service, scoped to exactly the AWS resources that service needs. Shared roles and wildcard permissions negate the security model.
- Your application code does not change. The AWS SDK handles the IRSA flow automatically when it detects the projected token.
- Audit existing clusters for Secrets containing static credentials and replace them. This is not optional hygiene — it is incident prevention.