# Storage Tiers

*Series: Cloud Cost Engineering*

Storage costs do not spike dramatically. They accumulate silently. An S3 bucket created for a one-time data migration sits in Standard class for three years at $23/TB/month while Glacier Instant Retrieval would cost $4. Nobody deleted it because nobody knew it existed. Multiply that by a hundred buckets and you have a six-figure storage bill for data that nobody reads.
## S3 Storage Class Economics
There are seven S3 storage classes. Only two decisions matter: how often is this data accessed, and can you tolerate retrieval latency?
| Storage Class | Price/TB/mo | Min Duration | Retrieval | Best For |
|---|---|---|---|---|
| Standard | $23.00 | None | ms | Active data, < 30 days |
| Intelligent-Tiering | $23.00* | None | ms | Unknown access pattern |
| Standard-IA | $12.50 | 30 days | ms | Monthly access |
| One Zone-IA | $10.00 | 30 days | ms | Re-creatable data |
| Glacier Instant | $4.00 | 90 days | ms | Quarterly access |
| Glacier Flexible | $3.60 | 90 days | 1–12 h | Annual access |
| Glacier Deep Archive | $0.99 | 180 days | 12–48 h | 7-year retention |
*Intelligent-Tiering adds a monitoring and automation charge of $0.0025 per 1,000 objects per month, and objects smaller than 128 KB are never auto-tiered, so it is worthwhile only for workloads dominated by objects larger than 128 KB.
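To make the table concrete, here is a minimal sketch that prices a given volume of data in each class using the table's numbers. The dictionary and function are illustrative helpers, not an AWS API, and the prices are storage-only list prices:

```python
# Monthly storage price per TB, taken from the table above.
PRICE_PER_TB_MONTH = {
    "STANDARD": 23.00,
    "INTELLIGENT_TIERING": 23.00,  # plus the per-object monitoring fee
    "STANDARD_IA": 12.50,
    "ONEZONE_IA": 10.00,
    "GLACIER_IR": 4.00,
    "GLACIER": 3.60,               # Glacier Flexible Retrieval
    "DEEP_ARCHIVE": 0.99,
}

def annual_storage_cost(tb: float, storage_class: str) -> float:
    """Annual storage-only cost; ignores retrieval, request, and transition charges."""
    return tb * PRICE_PER_TB_MONTH[storage_class] * 12

# The bucket from the intro: 1 TB forgotten in Standard for three years,
# versus the same data parked in Glacier Instant Retrieval.
print(annual_storage_cost(1, "STANDARD") * 3)    # 828.0
print(annual_storage_cost(1, "GLACIER_IR") * 3)  # 144.0
```

This is storage-only math; retrieval and request charges can flip the result for data that turns out to be hot, which is why the access analysis below comes before any transition.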
## Lifecycle Automation via Terraform
Manual lifecycle rules drift. Codify them.
resource "aws_s3_bucket_lifecycle_configuration" "data_lake" {
bucket = aws_s3_bucket.data_lake.id
rule {
id = "transition-and-expire-raw"
status = "Enabled"
filter {
prefix = "raw/"
}
transition {
days = 30
storage_class = "STANDARD_IA"
}
transition {
days = 90
storage_class = "GLACIER_IR"
}
transition {
days = 365
storage_class = "DEEP_ARCHIVE"
}
expiration {
days = 2555 # 7-year retention, then delete
}
noncurrent_version_transition {
noncurrent_days = 7
storage_class = "STANDARD_IA"
}
noncurrent_version_expiration {
noncurrent_days = 30
}
}
rule {
id = "abort-incomplete-multipart"
status = "Enabled"
filter {} # applies to all objects
abort_incomplete_multipart_upload {
days_after_initiation = 7
}
}
}The abort_incomplete_multipart_upload rule is easy to forget and silently accumulates cost — incomplete uploads are billed at Standard rates indefinitely.
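To see how much abandoned upload debris a bucket is already carrying, list its in-progress multipart uploads directly. A minimal boto3 sketch; the bucket name is a placeholder:

```python
import boto3

def list_stale_multipart_uploads(bucket: str):
    """Print in-progress multipart uploads, oldest first.

    Anything weeks old is almost certainly abandoned and is being
    billed at Standard rates until aborted.
    """
    s3 = boto3.client("s3")
    paginator = s3.get_paginator("list_multipart_uploads")
    uploads = []
    for page in paginator.paginate(Bucket=bucket):
        uploads.extend(page.get("Uploads", []))
    for up in sorted(uploads, key=lambda u: u["Initiated"]):
        print(up["Initiated"], up["Key"], up["UploadId"])
    return uploads

list_stale_multipart_uploads("my-data-lake-bucket")  # placeholder name
```

Once the lifecycle rule above is applied, anything surfaced here is aborted automatically within seven days.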
## S3 Object Access Analysis
Before setting lifecycle rules, measure actual access patterns with S3 Storage Lens or server access logging.
```sql
-- Query S3 server access logs in Athena.
-- Surfaces objects whose last logged access is more than 90 days ago.
SELECT
    key,
    MAX(PARSE_DATETIME(requestdatetime,
        'dd/MMM/yyyy:HH:mm:ss Z')) AS last_access,
    COUNT(*) AS request_count,
    DATE_DIFF('day',
        MAX(PARSE_DATETIME(requestdatetime, 'dd/MMM/yyyy:HH:mm:ss Z')),
        CURRENT_TIMESTAMP) AS days_since_access
FROM s3_access_logs_db.bucket_logs
WHERE operation IN ('REST.GET.OBJECT', 'REST.HEAD.OBJECT')
GROUP BY key
-- Athena does not allow SELECT aliases in HAVING, so repeat the expression.
HAVING DATE_DIFF('day',
    MAX(PARSE_DATETIME(requestdatetime, 'dd/MMM/yyyy:HH:mm:ss Z')),
    CURRENT_TIMESTAMP) > 90
ORDER BY days_since_access DESC
LIMIT 100;
```

Objects surfaced here are candidates for immediate transition to Glacier Instant Retrieval. Note that objects with no logged access at all will not appear in this query; cross-check against a full inventory.
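Lifecycle rules handle transitions on a schedule, but for a one-off sweep you can change an object's class in place with a same-key copy. A sketch with placeholder names:

```python
import boto3

def transition_object(bucket: str, key: str, storage_class: str = "GLACIER_IR"):
    """Rewrite an object onto a cheaper storage class via a same-key copy.

    Caveats: this re-PUTs the object (request charges apply), resets its
    age for lifecycle purposes, and copy_object is limited to objects
    up to 5 GB; larger objects need a multipart copy.
    """
    s3 = boto3.client("s3")
    s3.copy_object(
        Bucket=bucket,
        Key=key,
        CopySource={"Bucket": bucket, "Key": key},
        StorageClass=storage_class,
        MetadataDirective="COPY",  # keep existing user metadata
    )

transition_object("my-data-lake-bucket", "raw/2021/events.parquet")  # placeholders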
## The Lifecycle Decision Flow

Work down from measured access frequency, then check latency tolerance:

- Accessed constantly, or younger than 30 days → Standard
- Access pattern unknown, objects mostly > 128 KB → Intelligent-Tiering
- Accessed roughly monthly → Standard-IA (One Zone-IA if the data is re-creatable)
- Accessed roughly quarterly, millisecond reads still required → Glacier Instant Retrieval
- Accessed roughly annually, hours of retrieval latency acceptable → Glacier Flexible Retrieval
- Retention-only, 12+ hours of retrieval latency acceptable → Glacier Deep Archive
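The same flow as a toy function, for wiring into tagging or audit scripts. The thresholds come from the table above; the function itself is a sketch, not an AWS API:

```python
def pick_storage_class(days_between_accesses: float,
                       hours_of_retrieval_latency_ok: float = 0,
                       recreatable: bool = False) -> str:
    """Map a measured access cadence to an S3 storage class per the table above."""
    if days_between_accesses < 30:
        return "STANDARD"
    if days_between_accesses < 90:
        return "ONEZONE_IA" if recreatable else "STANDARD_IA"
    if hours_of_retrieval_latency_ok < 1:
        return "GLACIER_IR"   # quarterly access, still millisecond reads
    if hours_of_retrieval_latency_ok < 12:
        return "GLACIER"      # Glacier Flexible Retrieval
    return "DEEP_ARCHIVE"

print(pick_storage_class(45))       # STANDARD_IA
print(pick_storage_class(200, 48))  # DEEP_ARCHIVE
```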
## EBS Volume Optimization
EBS is often the second-largest storage cost. The three problems: wrong volume type, oversized volumes, and snapshots that outlive their usefulness.
Migrating gp2 volumes to gp3 requires zero downtime and saves 20 % immediately. Script it:
```python
import boto3

def migrate_gp2_to_gp3(dry_run=True):
    """Find all gp2 volumes and (optionally) convert them to gp3 in place."""
    ec2 = boto3.client('ec2')
    paginator = ec2.get_paginator('describe_volumes')
    gp2_volumes = []
    for page in paginator.paginate(Filters=[
        {'Name': 'volume-type', 'Values': ['gp2']},
        {'Name': 'status', 'Values': ['in-use', 'available']},
    ]):
        gp2_volumes.extend(page['Volumes'])
    print(f"Found {len(gp2_volumes)} gp2 volumes")

    for vol in gp2_volumes:
        vid = vol['VolumeId']
        size = vol['Size']
        iops = vol.get('Iops', 0)
        # gp3 baseline: 3000 IOPS, 125 MB/s. Match or exceed the gp2
        # provisioned IOPS so the conversion never degrades performance.
        target_iops = max(3000, iops)
        print(f"  {vid} {size} GiB {iops} IOPS -> gp3 {target_iops} IOPS")
        if not dry_run:
            ec2.modify_volume(
                VolumeId=vid,
                VolumeType='gp3',
                Iops=target_iops,
                # Large gp2 volumes can burst up to 250 MB/s; raise this above
                # the 125 MB/s gp3 baseline if a workload depends on it.
                Throughput=125,
            )
    return gp2_volumes

migrate_gp2_to_gp3(dry_run=True)    # audit first
# migrate_gp2_to_gp3(dry_run=False) # then execute
```

## Snapshot Lifecycle Management
Snapshots cost $0.05/GB/month. Snapshots are incremental, but a 1 TB volume snapshotted daily and retained for a year still costs at least $600 in snapshots alone (the full 1 TB baseline, before counting changed blocks), on top of the volume cost.
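Before adding a retention policy, it is worth measuring how much snapshot debt already exists. A minimal boto3 sketch that sums snapshot volume older than a cutoff; it sums provisioned volume sizes, so it is an upper bound on billed incremental storage:

```python
from datetime import datetime, timedelta, timezone
import boto3

def audit_old_snapshots(days: int = 90):
    """List self-owned snapshots older than `days` with a rough cost ceiling."""
    ec2 = boto3.client('ec2')
    cutoff = datetime.now(timezone.utc) - timedelta(days=days)
    old, total_gb = [], 0
    paginator = ec2.get_paginator('describe_snapshots')
    for page in paginator.paginate(OwnerIds=['self']):
        for snap in page['Snapshots']:
            if snap['StartTime'] < cutoff:
                old.append(snap)
                total_gb += snap['VolumeSize']
    # Upper bound: incremental dedupe means actual billed GB is lower.
    print(f"{len(old)} snapshots older than {days} days, "
          f"<= {total_gb} GB, <= ${total_gb * 0.05:.2f}/month")
    return old

audit_old_snapshots(90)
```

Then let Data Lifecycle Manager enforce rotation going forward: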
resource "aws_dlm_lifecycle_policy" "ebs_snapshots" {
description = "14-day daily, 90-day weekly retention"
execution_role_arn = aws_iam_role.dlm.arn
state = "ENABLED"
policy_details {
resource_types = ["VOLUME"]
schedule {
name = "daily-14-day-retention"
create_rule {
interval = 24
interval_unit = "HOURS"
times = ["03:00"]
}
retain_rule { count = 14 }
copy_tags = true
}
schedule {
name = "weekly-90-day-retention"
create_rule {
cron_expression = "cron(0 3 ? * SUN *)"
}
retain_rule { count = 13 } # ~90 days
copy_tags = true
}
target_tags = {
backup = "true"
}
}
}Key Takeaways
- Moving data from S3 Standard to Glacier Deep Archive at day 365 cuts storage costs by 96 %; on most bills this is the single largest storage optimization available.
- Abort-incomplete-multipart-upload rules are routinely missed and silently accumulate cost; add them to every bucket lifecycle configuration.
- gp2-to-gp3 migration is zero-downtime and saves 20 % on every converted volume, with no performance degradation as long as you match the gp2 volume's provisioned IOPS and throughput. There is no reason to delay it.
- S3 Storage Lens and server access logs reveal actual access patterns before you commit to lifecycle rules; transition data you know is cold, not data you assume is cold.
- Snapshot rotation via Data Lifecycle Manager pays for itself within days on any fleet with more than 50 volumes.
- Intelligent-Tiering is not free: the per-object monitoring fee adds up, and objects under 128 KB are never auto-tiered, so small-object workloads pay without benefit; use Standard-IA with explicit lifecycle rules for those instead.