Compute: Rightsizing and Graviton
Series
Cloud Cost Engineering

Compute is typically 40–60 % of an AWS bill. It is also the easiest category to overprovision, because engineers default to "same as prod" for every environment and "one size up" whenever anything is slow. The result is a fleet where average CPU utilization sits at 8 % while the bill reflects 100 % of provisioned capacity.
Finding Oversized Instances
The signal is simple: CPU utilization over a 14-day window. Anything averaging below 20 % and peaking below 50 % is a candidate for downsizing.
```python
import boto3
from datetime import datetime, timedelta, timezone


def get_low_utilization_instances(threshold_avg=20, threshold_max=50):
    ec2 = boto3.client('ec2')
    cw = boto3.client('cloudwatch')

    # Collect all running instances with their Name tag
    paginator = ec2.get_paginator('describe_instances')
    instances = []
    for page in paginator.paginate(
        Filters=[{'Name': 'instance-state-name', 'Values': ['running']}]
    ):
        for reservation in page['Reservations']:
            for inst in reservation['Instances']:
                instances.append({
                    'id': inst['InstanceId'],
                    'type': inst['InstanceType'],
                    'name': next((t['Value'] for t in inst.get('Tags', [])
                                  if t['Key'] == 'Name'), 'unnamed'),
                })

    end = datetime.now(timezone.utc)
    start = end - timedelta(days=14)

    # Flag instances that are cold on both average and peak CPU
    candidates = []
    for inst in instances:
        metrics = cw.get_metric_statistics(
            Namespace='AWS/EC2',
            MetricName='CPUUtilization',
            Dimensions=[{'Name': 'InstanceId', 'Value': inst['id']}],
            StartTime=start, EndTime=end,
            Period=86400, Statistics=['Average', 'Maximum'],
        )
        if not metrics['Datapoints']:
            continue
        avg_cpu = sum(d['Average'] for d in metrics['Datapoints']) / len(metrics['Datapoints'])
        max_cpu = max(d['Maximum'] for d in metrics['Datapoints'])
        if avg_cpu < threshold_avg and max_cpu < threshold_max:
            candidates.append({**inst, 'avg_cpu': round(avg_cpu, 1), 'max_cpu': round(max_cpu, 1)})

    return sorted(candidates, key=lambda x: x['avg_cpu'])


if __name__ == '__main__':
    for c in get_low_utilization_instances():
        print(f"{c['id']:20s} {c['type']:15s} avg={c['avg_cpu']:5.1f}% max={c['max_cpu']:5.1f}% ({c['name']})")
```

Run this across all accounts. The output is your rightsizing backlog.
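"Run this across all accounts" in practice means assuming a role in each account. A sketch, assuming every account exposes a uniformly named cross-account role (`OrganizationAccountAccessRole` here is an assumption; substitute your own audit role):

```python
def audit_role_arn(account_id: str, role_name: str = 'OrganizationAccountAccessRole') -> str:
    """Build the ARN of the assumed-role in a member account."""
    return f'arn:aws:iam::{account_id}:role/{role_name}'


def sessions_for_all_accounts(role_name: str = 'OrganizationAccountAccessRole'):
    """Yield (account_id, boto3.Session) for every active account in the org."""
    import boto3  # deferred so audit_role_arn stays usable without AWS credentials
    org = boto3.client('organizations')
    sts = boto3.client('sts')
    for page in org.get_paginator('list_accounts').paginate():
        for account in page['Accounts']:
            if account['Status'] != 'ACTIVE':
                continue
            creds = sts.assume_role(
                RoleArn=audit_role_arn(account['Id'], role_name),
                RoleSessionName='rightsizing-audit',
            )['Credentials']
            yield account['Id'], boto3.Session(
                aws_access_key_id=creds['AccessKeyId'],
                aws_secret_access_key=creds['SecretAccessKey'],
                aws_session_token=creds['SessionToken'],
            )
```

Each yielded session can create the `ec2` and `cloudwatch` clients the audit script expects, so one run covers the whole organization.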
Instance Family Decision Tree
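The tree reduces to a ratio test on observed utilization, mapping to the conventional EC2 family letters (m general purpose, c compute-optimized, r memory-optimized). A minimal sketch; the 2x thresholds are illustrative assumptions, not AWS guidance:

```python
def suggest_family(avg_cpu_pct: float, avg_mem_pct: float) -> str:
    """Map observed utilization to an EC2 instance family letter."""
    if avg_cpu_pct >= 2 * avg_mem_pct:
        return 'c'  # CPU-bound: compute-optimized, pay for cores not RAM
    if avg_mem_pct >= 2 * avg_cpu_pct:
        return 'r'  # memory-bound: memory-optimized, pay for RAM not cores
    return 'm'      # balanced: general purpose
```

A workload at 80 % CPU and 30 % memory maps to `c`; a balanced 50/50 workload stays on `m`.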
The Graviton Math
Graviton3 (m7g) is ~15 % cheaper than the equivalent x86 instance (m6i) at On-Demand prices. With a Compute Savings Plan on top, the effective discount against On-Demand x86 reaches ~46 %.
| Instance | vCPU | RAM | On-Demand/hr | vs m6i.xlarge |
|---|---|---|---|---|
| m6i.xlarge (x86) | 4 | 16 GiB | $0.192 | baseline |
| m7g.xlarge (Graviton3) | 4 | 16 GiB | $0.1632 | −15 % |
| m7g.xlarge + 1yr no-upfront CSP | 4 | 16 GiB | ~$0.103 | −46 % |
For a fleet of 100 m6i.xlarge running continuously:
```
x86 baseline:   100 × $0.192/hr × 8,760 h = $168,192/yr
Graviton3 CSP:  100 × $0.103/hr × 8,760 h = $90,228/yr
Annual saving:  $77,964 (~46 %)
```

That is not a rounding error. That is a hiring decision.
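The same arithmetic generalizes to any fleet size and rate pair. A small sketch; the rates are the ones used above and will drift, so check current pricing:

```python
HOURS_PER_YEAR = 8760

def annual_cost(fleet_size: int, hourly_rate: float) -> float:
    """Cost of a fleet running continuously for a year."""
    return fleet_size * hourly_rate * HOURS_PER_YEAR

def annual_saving(fleet_size: int, rate_from: float, rate_to: float) -> tuple[float, float]:
    """Absolute and fractional saving from moving a fleet between rates."""
    before = annual_cost(fleet_size, rate_from)
    after = annual_cost(fleet_size, rate_to)
    return before - after, 1 - after / before

saving, pct = annual_saving(100, 0.192, 0.103)
print(f'${saving:,.0f}/yr ({pct:.0%})')  # prints $77,964/yr (46%)
```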
Migration Playbook
A safe Graviton migration follows four stages.
Build the multi-arch image in CI. The Dockerfile needs no changes — the build pipeline adds the platform flags:
```dockerfile
# Dockerfile (unchanged — works on both architectures)
FROM python:3.12-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8000"]
```

```yaml
# GitHub Actions — multi-arch build
- name: Set up QEMU # emulation for the non-native architecture
  uses: docker/setup-qemu-action@v3
- name: Set up Docker Buildx
  uses: docker/setup-buildx-action@v3
- name: Build and push multi-arch image
  uses: docker/build-push-action@v5
  with:
    platforms: linux/amd64,linux/arm64
    push: true
    tags: ${{ env.IMAGE_URI }}:${{ github.sha }}
```

Terraform: EKS Node Group with Graviton
```hcl
resource "aws_eks_node_group" "graviton" {
  cluster_name    = aws_eks_cluster.main.name
  node_group_name = "graviton-general"
  node_role_arn   = aws_iam_role.node.arn
  subnet_ids      = var.private_subnet_ids

  ami_type       = "AL2023_ARM_64_STANDARD"
  instance_types = ["m7g.xlarge", "m7g.2xlarge"] # multi-type for Spot fallback

  scaling_config {
    desired_size = 3
    min_size     = 1
    max_size     = 20
  }

  labels = {
    "kubernetes.io/arch"                 = "arm64"
    "node.kubernetes.io/instance-family" = "graviton"
  }

  tags = {
    managed-by = "terraform"
    team       = var.team
    env        = var.environment
  }
}
```

Common Migration Blockers
| Blocker | Resolution |
|---|---|
| Native x86 binaries in Docker image | Rebuild from source; remove pre-built amd64 wheels |
| JVM on a 32-bit or x86-only build | Switch to a 64-bit aarch64 JDK build (Corretto, Temurin, and most modern distributions ship them) |
| Node.js native addons | npm rebuild on arm64 base image |
| Python C extensions | Use multi-arch wheels from PyPI or build in CI |
| Lambda functions | Change architecture to arm64 in function config — no rebuild needed for interpreted runtimes |
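The Lambda row is scriptable. A sketch listing functions still on x86_64 with an interpreted runtime, i.e. the easy arm64 candidates (the runtime-prefix list is an assumption; extend it for your stack):

```python
INTERPRETED_PREFIXES = ('python', 'nodejs', 'ruby')

def is_arm64_candidate(fn: dict) -> bool:
    """fn is a FunctionConfiguration dict as returned by lambda:ListFunctions."""
    runtime = fn.get('Runtime', '')
    archs = fn.get('Architectures', ['x86_64'])  # an absent field means x86_64
    return 'arm64' not in archs and runtime.startswith(INTERPRETED_PREFIXES)

def list_arm64_candidates():
    import boto3  # deferred so is_arm64_candidate is testable without AWS
    client = boto3.client('lambda')
    for page in client.get_paginator('list_functions').paginate():
        for fn in page['Functions']:
            if is_arm64_candidate(fn):
                yield fn['FunctionName'], fn['Runtime']
```

Container-image and compiled-runtime functions are excluded on purpose: those need the multi-arch build work described above before the architecture flip.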
Key Takeaways
- Average CPU below 20 % over 14 days is a reliable rightsizing signal; add memory utilization from CloudWatch agent for a complete picture.
- Graviton3 is 10–15 % cheaper at On-Demand and 35–46 % cheaper when combined with Compute Savings Plans — this is the single largest lever on the compute line.
- Multi-arch Docker images are the prerequisite; build them in CI before touching any production infrastructure.
- Canary deployment with automatic rollback eliminates the migration risk that teams fear; the technical risk of Graviton is near zero for containerized workloads.
- Lambda arm64 is the easiest Graviton win — one config change, no Dockerfile, immediate 20 % cost reduction.
- Rightsizing and Graviton are independent optimizations that compound; apply both and the savings are multiplicative, not additive.