Partition Strategy
Series: Kafka in Production

The question that comes up in every Kafka design review: "How many partitions should this topic have?" The honest answer is that partition count is a bet you're making about the future throughput of that topic. Get it wrong and you'll pay a painful tax—either under-parallelized consumers or a repartitioning migration that requires double the infrastructure and a careful cutover window.
Partitioning decisions are almost never purely technical. They're a product of what you know about your data distribution, your consumer parallelism requirements, and your tolerance for operational complexity. This post gives you a framework for making that bet deliberately.
What Partitions Actually Control
Before picking a number, be precise about what you're optimizing for.
Partitions control three things simultaneously:
- Consumer parallelism — Max concurrent consumers in a group equals the partition count. 10 partitions = max 10 active consumers.
- Ordering scope — Kafka guarantees order only within a partition. Your key choice determines which messages share a partition.
- Physical data distribution — Partitions spread across brokers. More partitions = finer-grained load distribution.
The tension: you want enough partitions for consumer parallelism and load distribution, but not so many that partition overhead dominates (metadata, leader elections, file handles).
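Both the partition count and the replication factor are set when the topic is created, which is why this bet has to be made early. Here is a minimal Admin client sketch; the topic name, counts, and bootstrap address are placeholders and exception handling is omitted.

// Minimal sketch — placeholder topic name, sizes, and connection properties
Properties props = new Properties();
props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
try (Admin admin = Admin.create(props)) {
    // 12 partitions, replication factor 3
    admin.createTopics(List.of(new NewTopic("order-events", 12, (short) 3))).all().get();
}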
Key Selection: The Most Consequential Decision
Your partition key determines your ordering guarantee and your data distribution. Choose a bad key and you get hot partitions. Choose no key and you get round-robin (sticky, in newer Java clients) distribution with no meaningful ordering at all.
Good key properties:
- High cardinality (many distinct values)
- Roughly uniform distribution across values
- Meaningful for your ordering requirement
Common patterns by use case:
| Use Case | Key Choice | Why |
|---|---|---|
| Order events | order_id | All events for one order land in one partition |
| User activity | user_id | Per-user ordering, good cardinality |
| IoT sensor data | device_id | Per-device stream |
| Payment transactions | account_id | Not transaction_id—you want account-level ordering |
| Log aggregation | None (null) | No ordering requirement, maximize throughput |
// Explicit key in Java producer
ProducerRecord<String, OrderEvent> record = new ProducerRecord<>(
"order-events",
order.getOrderId(), // partition key
orderEvent // value
);
producer.send(record);

# Python producer with explicit key
producer.produce(
topic="order-events",
key=order_id.encode("utf-8"),
value=serialize(order_event),
on_delivery=delivery_callback,
)

The Hot Partition Problem
Even with a good key, you'll hit hot partitions if your key distribution is skewed. The classic case: a multi-tenant SaaS platform keying on tenant_id where one enterprise customer generates 80% of the traffic.
Symptoms of a hot partition:
- One broker's BytesInPerSec is 5× the others
- Consumer lag accumulating on one partition only
- ReplicaFetcherManager.MaxLag elevated on one broker
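If you would rather confirm the second symptom programmatically than eyeball dashboards, here is a rough sketch using the Java Admin client. The group id is a placeholder, admin is an Admin client like the one created earlier, and it assumes every partition has a committed offset.

// Per-partition lag = latest end offset minus the group's committed offset
Map<TopicPartition, OffsetAndMetadata> committed =
        admin.listConsumerGroupOffsets("order-processors")   // placeholder group id
             .partitionsToOffsetAndMetadata().get();

Map<TopicPartition, OffsetSpec> latest = new HashMap<>();
committed.keySet().forEach(tp -> latest.put(tp, OffsetSpec.latest()));

var endOffsets = admin.listOffsets(latest).all().get();      // TopicPartition -> ListOffsetsResultInfo
committed.forEach((tp, meta) ->
        System.out.printf("%s lag=%d%n", tp, endOffsets.get(tp).offset() - meta.offset()));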
Mitigation strategies:
// Strategy 1: Salted key for high-volume keys
String effectiveKey = isHighVolumeKey(tenantId)
? tenantId + "-" + (System.nanoTime() % SALT_BUCKETS)
: tenantId;
// Downside: breaks per-tenant ordering

// Strategy 2: Custom partitioner
public class TenantAwarePartitioner implements Partitioner {
    // Placeholder: known hot tenant ids (in practice, load from config)
    private static final Set<String> HOT_TENANTS = Set.of("tenant-whale-1");
    private static final int HOT_TENANT_PARTITION_POOL = 4;

    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        String tenantId = (String) key;
        int numPartitions = cluster.partitionCountForTopic(topic);
        if (HOT_TENANTS.contains(tenantId)) {
            // Spread hot tenants across the first N partitions
            return Math.floorMod(
                    tenantId.hashCode() + ThreadLocalRandom.current().nextInt(4),
                    Math.min(numPartitions, HOT_TENANT_PARTITION_POOL));
        }
        // Everyone else: same murmur2 hashing the default partitioner uses
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    @Override public void configure(Map<String, ?> configs) { }
    @Override public void close() { }
}

Calculating Partition Count
The formula most teams use:
partitions = max(target_throughput / throughput_per_partition, max_consumer_parallelism)

A single partition on a modern broker handles roughly 10–50 MB/s depending on message size and compression. Start conservative—20 MB/s per partition—then benchmark your actual setup.
import math

def min_partitions(
target_throughput_mbs: float,
throughput_per_partition_mbs: float = 20.0,
max_consumers_planned: int = 0,
) -> int:
from_throughput = math.ceil(target_throughput_mbs / throughput_per_partition_mbs)
return max(from_throughput, max_consumers_planned)
# Example: 100 MB/s topic, planning for 12 consumers
print(min_partitions(100, max_consumers_planned=12))  # → 12 (throughput needs only 5)

Round up to a number that divides evenly by your planned consumer count. 12 partitions and 4 consumers = 3 partitions per consumer. Uneven splits mean some consumers carry more load.
The Repartitioning Problem
Here's the thing nobody tells you upfront: you cannot change a topic's partition count without consequences. Adding partitions changes the key-to-partition mapping for all existing keys. Messages for key user-123 that landed in partition 2 will now land in partition 7. Any consumer maintaining per-partition state (like a running aggregate) breaks.
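To make that concrete, here is a small standalone sketch of the arithmetic the Java client's default partitioner applies to keyed records (murmur2 over the key bytes, modulo the partition count). The exact partition numbers will differ from the 2 and 7 above, but for most keys the result changes as soon as the count does.

import java.nio.charset.StandardCharsets;
import org.apache.kafka.common.utils.Utils;

public class KeyMappingDemo {
    // Same hashing the default partitioner uses for records that have a key
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    public static void main(String[] args) {
        // For most keys the partition changes when the count changes,
        // which is exactly what breaks per-partition consumer state.
        System.out.println(partitionFor("user-123", 8));   // before adding partitions
        System.out.println(partitionFor("user-123", 16));  // after
    }
}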
Your options when you genuinely need more partitions:
Option A: New topic, dual-write period
Week 1: Create order-events-v2 with new partition count
Week 1-2: Producers write to both order-events and order-events-v2
Week 2: Consumers migrate to order-events-v2
Week 3: Retire order-events

This is safe but operationally expensive. You need twice the storage during the window.
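On the producer side the dual-write window can be as simple as the sketch below, reusing the order objects from the earlier snippet. Writing the same key to both topics keeps per-order ordering intact within each topic, at the cost of doubled produce traffic.

// Dual-write window: same key, same payload, both topics
String key = order.getOrderId();
producer.send(new ProducerRecord<>("order-events", key, orderEvent));
producer.send(new ProducerRecord<>("order-events-v2", key, orderEvent));
// Once the last consumer has moved to order-events-v2, drop the first send
// and retire the old topic.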
Option B: Kafka Streams repartition
Use a Kafka Streams topology to read from the old topic and write to a new topic with the desired partitioning:
StreamsBuilder builder = new StreamsBuilder();
builder.stream("order-events", Consumed.with(Serdes.String(), orderSerde))
.selectKey((oldKey, value) -> value.getNewPartitionKey())
.to("order-events-v2", Produced.with(Serdes.String(), orderSerde));Option C: Accept the ordering break
For topics where ordering is not a hard requirement (metrics, logs, analytics events), just add partitions and document the behavior change.
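For completeness, a minimal sketch of adding partitions in place with the Admin client; the topic name and target count are placeholders, and this only makes sense when the ordering caveat above is acceptable.

// Grow an existing topic to 24 partitions (placeholder name and count).
// Existing records stay where they are; only new records use the wider mapping.
try (Admin admin = Admin.create(props)) {
    admin.createPartitions(Map.of("app-logs", NewPartitions.increaseTo(24))).all().get();
}

The kafka-topics.sh --alter --partitions command does the same thing from a shell.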
Key Takeaways
- Partition count controls consumer parallelism ceiling—max active consumers in a group equals partition count; you cannot parallelize beyond it without adding partitions.
- Key selection is more important than partition count—a high-cardinality, uniformly distributed key prevents hot partitions that no amount of partitions can fix.
- Repartitioning is painful by design—changing partition count breaks the key-to-partition mapping, so size generously upfront with a 20% growth buffer.
- Hot partitions reveal themselves in broker-level metrics—watch BytesInPerSec per broker and consumer lag per partition, not just cluster-wide averages.
- Round partition count to a multiple of your consumer count—uneven splits give some consumers 33% more work than others, creating artificial lag imbalances.
- Null keys are a valid choice for order-agnostic topics—round-robin distribution maximizes throughput and eliminates hot partitions when per-key ordering is not required.