Kafka in Production

Cost Optimization

Ravinder · 6 min read
Kafka · Streaming · Distributed Systems · Cost Optimization · Tiered Storage

A Kafka cluster's cloud bill has four line items that most teams don't examine until the bill arrives: compute (brokers), disk (EBS or equivalent), network (inter-AZ traffic), and managed service fees if you're on Confluent Cloud or MSK. Of these, the two that cause the most surprise are disk and network—because they scale with data volume in ways that aren't obvious until you're paying for them.

This post works through each cost driver, gives you the math to estimate your current spend, and covers the levers that actually move the needle.

Breaking Down Where the Money Goes

pie title Typical Kafka Cost Distribution (Self-Managed on AWS)
  "EC2 Compute (brokers)" : 35
  "EBS Storage" : 30
  "Cross-AZ Network" : 25
  "Data Transfer Out" : 10

The distribution shifts significantly on managed services. On Confluent Cloud, the compute cost is bundled into CKU pricing, and cross-AZ traffic is typically not billed separately. On MSK, you pay EC2 + EBS + cross-AZ data transfer—the last of which is the most common budget surprise.

Cross-AZ Traffic: The Hidden Bill

AWS bills cross-AZ data transfer at $0.01/GB in each direction, so a gigabyte moved between your own instances in different AZs costs $0.02 in total. At RF=3 with brokers spread across three AZs, every byte written by a producer crosses AZ boundaries twice during replication: once from the leader to each of the two followers in other AZs.

cross_az_gb_per_month = ingress_MB_s × 2 × 86400 × 30 / 1024
cost_per_month = cross_az_gb_per_month × $0.02   # $0.01 out + $0.01 in per crossing

For 100 MB/s ingress:

100 MB/s × 2 × 86400 × 30 / 1024 = 506,250 GB
506,250 GB × $0.02 = $10,125/month

That's replication traffic alone; consumers fetching from a broker in a different AZ add more on top. This is why a cluster that looks cheap on compute can carry a surprising network bill.
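
To estimate your own replication bill, here is a minimal sketch of the same arithmetic. The function name and defaults are illustrative (not from any library), it assumes the effective $0.02/GB rate above, and it counts only replication traffic:

def monthly_cross_az_cost_usd(
    ingress_mb_s: float,
    remote_replicas: int = 2,        # RF=3 across three AZs: two followers off-AZ
    price_per_gb: float = 0.02,      # $0.01 egress + $0.01 ingress per crossing
    compression_ratio: float = 1.0,  # 0.6 means payloads shrink to 60% of raw size
) -> float:
    gb_per_month = (ingress_mb_s * compression_ratio * remote_replicas
                    * 86400 * 30 / 1024)
    return gb_per_month * price_per_gb

# 100 MB/s ingress, uncompressed vs. lz4 at a 0.6 ratio
print(f"uncompressed: ${monthly_cross_az_cost_usd(100):,.2f}/month")    # $10,125.00
print(f"lz4 @ 0.6:    ${monthly_cross_az_cost_usd(100, compression_ratio=0.6):,.2f}/month")  # $6,075.00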

Mitigation 1: Rack-aware replica placement

Configure brokers with broker.rack matching their AZ. Kafka's rack-aware replica assignment then spreads each partition's replicas across AZs, which is already what you want for availability. On the consumer side, follower fetching (KIP-392, available since Kafka 2.4) lets a consumer read from a replica in its own AZ instead of crossing to the leader.

# server.properties for broker in us-east-1a
broker.rack=us-east-1a
# Required for follower fetching (KIP-392)
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector

// Consumer (Java): fetch from the closest replica
props.put(ConsumerConfig.CLIENT_RACK_CONFIG, "us-east-1a");

Mitigation 2: Compress before replication

Compression reduces the bytes that cross AZ boundaries. If lz4 shrinks payloads to 60% of their raw size (a compression_ratio of 0.6, matching the default in the disk-cost function below), the replication bill above drops from $10,125 to ~$6,075/month.

Retention: The Disk Cost Multiplier

Retention is the biggest single lever for disk cost. From post 1's math: disk cost scales linearly with retention.

def monthly_disk_cost_usd(
    ingress_mb_s: float,
    rf: int,
    retention_hours: int,
    disk_price_per_gb_month: float = 0.10,  # gp3 pricing
    compression_ratio: float = 0.60,
) -> float:
    total_gb = (ingress_mb_s * rf * retention_hours * 3600
                * compression_ratio / 1024)
    return total_gb * disk_price_per_gb_month
 
# 100 MB/s, RF=3, varying retention
for hours in [24, 48, 168]:  # 1d, 2d, 7d
    cost = monthly_disk_cost_usd(100, 3, hours)
    print(f"{hours:4}h retention: ${cost:,.0f}/month")
  24h retention: $1,519/month
  48h retention: $3,038/month
 168h retention: $10,631/month

The question to ask every topic owner: what is your actual re-processing window? Most teams say "7 days" without knowing that their downstream consumers replay from S3, not from Kafka. If the consumer fails and replays from S3, Kafka retention beyond 48 hours is paying for safety theater.

Tiered Storage: Decoupling Retention from Broker Disk

Tiered storage moves log segments older than a configurable threshold from broker disk to object storage (S3, GCS). Brokers retain only recent data; historical data is fetched on demand.

flowchart LR
  P[Producer] --> B[Broker disk\nlast 24h]
  B -->|age > 24h| S3[S3 / Object Storage\ndays to years]
  C1[Real-time consumer] --> B
  C2[Backfill consumer] --> S3
  style B fill:#e3f2fd
  style S3 fill:#fff9c4
# server.properties (Confluent Platform or MSK with tiered storage)
remote.log.storage.system.enable=true
log.local.retention.ms=86400000        # Keep 24h on broker disk
log.retention.ms=2592000000            # Keep 30 days total (rest in S3)
 
# Topic-level override
kafka-configs --bootstrap-server broker1:9092 \
  --alter --entity-type topics --entity-name orders \
  --add-config remote.storage.enable=true,\
local.retention.ms=86400000,\
retention.ms=2592000000

The economics: S3 costs ~$0.023/GB/month, while EBS gp3 costs ~$0.08–0.10/GB/month. On top of the raw price gap, tiered storage uploads a single copy of each segment rather than RF copies (object storage provides its own durability), so for long-retention topics it cuts storage cost by 85–90%.

Cost for the same 100 MB/s, RF=3, 30-day retention:

  • Without tiered storage (30 days on EBS): ≈ $45,563/month (the 720-hour case of monthly_disk_cost_usd above)
  • With tiered storage (24h on EBS, a single 30-day copy in S3): EBS ≈ $1,519 + S3 ≈ $3,493 ≈ $5,012/month
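
Those bullets are the disk-cost function from earlier plus one extra term for S3. As a sanity check, here is a sketch of the combined math; the single-copy S3 assumption reflects tiered storage uploading each segment once, and the function name is illustrative:

def monthly_tiered_cost_usd(
    ingress_mb_s: float,
    rf: int,
    local_retention_hours: int,
    total_retention_hours: int,
    s3_price_per_gb_month: float = 0.023,
    compression_ratio: float = 0.60,
) -> float:
    # Hot tier: RF copies on broker disk, local retention window only
    ebs = monthly_disk_cost_usd(ingress_mb_s, rf, local_retention_hours,
                                compression_ratio=compression_ratio)
    # Cold tier: a single copy in S3 covering the full retention window
    s3_gb = (ingress_mb_s * total_retention_hours * 3600
             * compression_ratio / 1024)
    return ebs + s3_gb * s3_price_per_gb_month

# 100 MB/s, RF=3, 24h local, 30 days total
print(f"${monthly_tiered_cost_usd(100, 3, 24, 720):,.0f}/month")  # $5,012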

Compression Codec Selection for Cost

Compression affects three cost dimensions: CPU (broker), disk, and network. The tradeoff:

Codec   Disk reduction   CPU cost   Cross-AZ traffic reduction
none    0%               zero       0%
lz4     40–60%           very low   40–60%
zstd    50–70%           medium     50–70%
gzip    45–65%           high       45–65%

For cost optimization specifically: zstd at level 3 gives the best ratio for the CPU cost on modern brokers. For latency-sensitive topics where CPU is constrained, stick with lz4.
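
Before standardizing on a codec, it's worth compressing a sample of your real payloads rather than trusting the table. A minimal sketch, assuming the third-party zstandard and lz4 packages (pip install zstandard lz4); the payload here is an illustrative stand-in for your records:

import gzip
import json
import time

import lz4.frame
import zstandard

# Hypothetical sample payload -- substitute a real batch of your records
payload = json.dumps(
    [{"order_id": i, "status": "shipped", "region": "us-east-1"} for i in range(5000)]
).encode()

codecs = {
    "lz4":  lambda d: lz4.frame.compress(d),
    "zstd": lambda d: zstandard.ZstdCompressor(level=3).compress(d),
    "gzip": lambda d: gzip.compress(d, compresslevel=6),
}

for name, compress in codecs.items():
    start = time.perf_counter()
    out = compress(payload)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{name:5} ratio={len(out) / len(payload):.2f}  {elapsed_ms:6.1f} ms")

Ratios on repetitive JSON like this will look better than on pre-compressed or binary payloads, so measure with real data.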

# Topic-level compression setting
kafka-configs --bootstrap-server broker1:9092 \
  --alter --entity-type topics --entity-name orders \
  --add-config compression.type=zstd

Setting compression.type at the topic level overrides any producer-level setting: batches that arrive in a different codec are recompressed by the broker. This lets you enforce compression even for producers that forgot to set it, at the cost of some broker CPU for the recompression.

Partition Count and Small-Message Overhead

Each partition is a directory of log segment files. High partition counts with low-throughput topics create many small segments, each of which occupies space and requires metadata tracking.

# List partition counts per topic; cross-reference against per-topic
# throughput metrics to find consolidation candidates.
# Note: awk field positions depend on the kafka-topics output format;
# $6 assumes Kafka 2.8+ (the TopicId column shifts PartitionCount's
# value), use $4 on older brokers without TopicId.
kafka-topics --bootstrap-server broker1:9092 --describe \
  | grep "PartitionCount" \
  | awk '{print $6, $2}' \
  | sort -rn \
  | head -20

Topics with > 100 partitions and < 1 MB/s throughput are candidates for consolidation. Kafka can't shrink a topic's partition count in place, so consolidating means creating a new topic and migrating producers and consumers; the payoff for going from 100 to 12 partitions is 88 fewer partitions (264 fewer replica directories across the cluster at RF=3) and less controller metadata to track. The sketch below ranks candidates without depending on CLI output format.
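
A version-independent alternative to parsing CLI output: a short sketch using the confluent-kafka package (an assumption, not something this post's stack requires) to rank topics by partition count. Throughput still has to come from your metrics system.

from confluent_kafka.admin import AdminClient

# Connect and pull cluster metadata (broker address is illustrative)
admin = AdminClient({"bootstrap.servers": "broker1:9092"})
metadata = admin.list_topics(timeout=10)

# Rank topics by partition count; check the top entries against
# per-topic byte rates before consolidating anything
ranked = sorted(metadata.topics.items(),
                key=lambda kv: len(kv[1].partitions), reverse=True)
for name, topic in ranked[:20]:
    print(f"{len(topic.partitions):5d}  {name}")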

Monitoring Cost Metrics

# Prometheus alerting rules for cost signals
# (metric names depend on your JMX exporter mapping; adjust to match)
groups:
  - name: kafka-cost
    rules:
      - alert: KafkaDiskUsageHigh
        expr: kafka_log_size_bytes / kafka_disk_capacity_bytes > 0.75
        for: 10m
        labels:
          severity: warning
        annotations:
          summary: "Broker disk above 75% — check retention settings"

      - alert: KafkaCompressionIneffective
        expr: kafka_server_brokertopicmetrics_compressionratio > 0.9
        for: 30m
        labels:
          severity: warning
        annotations:
          summary: "Compression ratio above 0.9: payloads are barely shrinking; verify codec config"

Key Takeaways

  • Cross-AZ replication at RF=3 sends 2× your ingress rate across AZ boundaries, billed at $0.01/GB in each direction—at the effective $0.02/GB, 100 MB/s of ingress generates ~$10,000/month in replication traffic alone, before any consumer fetches.
  • Retention is a cost dial, not a reliability setting—cutting from 7 days to 2 days reduces disk cost by 3.5×; audit each topic's actual consumer replay window before treating 7-day retention as a requirement.
  • Tiered storage cuts long-retention storage cost by 85–90%—it replaces RF copies on EBS ($0.08–0.10/GB/month each) with a single copy in S3 ($0.023/GB/month) for historical data; the break-even is any topic retaining more than 48 hours.
  • Topic-level compression enforcement catches misconfigured producers—setting compression.type at the topic level overrides producer settings; use zstd level 3 for the best cost-to-CPU tradeoff on throughput-sensitive topics.
  • Consumer rack awareness eliminates cross-AZ fetch traffic—configuring client.rack and replica.selector.class routes consumers to same-AZ followers; this removes the largest variable cost item after replication.
  • Low-throughput topics with high partition counts waste metadata resources—consolidate topics below 1 MB/s with partition counts above their consumer parallelism needs; the operational overhead per partition is not free.