Choosing a Streaming Platform: Kafka, Pulsar, Kinesis, Redpanda
Every benchmark for streaming platforms shows the same thing: they all handle millions of messages per second. Throughput is solved. The real question is what happens when a broker goes down at 3am, when you need to replay six months of events across three regions, or when your team's median Kafka expertise is "I've seen it in a diagram."
This post is about the decisions that actually matter when choosing between Kafka, Pulsar, Kinesis, and Redpanda in 2026.
The Landscape in 2026
All four platforms share the same fundamental model: durable, ordered, partitioned logs with consumer groups and offset-based replay. The differences are in architecture, operational complexity, ecosystem, and where each one makes your life easier or harder.
Kafka: The Default for a Reason
Kafka is the default because it has the deepest ecosystem, the most battle-tested operators, and the largest community. If you're building a streaming pipeline in 2026 and don't have strong reasons to pick something else, Kafka is the correct baseline.
What Kafka does well:
- Kafka Connect has 200+ production-ready connectors. Every database, data warehouse, SaaS tool, and cloud service has a connector. This is not a small advantage.
- The ecosystem around Kafka (Kafka Streams, ksqlDB, Flink's Kafka source/sink, Spark Structured Streaming) is unmatched.
- KRaft mode (replacing ZooKeeper) is now production-stable and dramatically simplifies operations. The ZooKeeper excuse for avoiding Kafka is gone.
- Retention-based replay: keep data for 7 days or 1 year, replay at will. This is Kafka's killer feature for audit and debugging.
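Replay in practice is just a consumer that asks the broker which offsets correspond to a timestamp and starts there. A minimal sketch with confluent_kafka; the broker address, topic name, consumer group, and six-partition count are assumptions for illustration:

```python
import time

def replay_start_ms(days_back, now_ms=None):
    """Epoch-millisecond timestamp for 'replay the last N days'."""
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    return now_ms - days_back * 24 * 3600 * 1000

if __name__ == '__main__':
    # Requires a reachable broker; pip install confluent-kafka
    from confluent_kafka import Consumer, TopicPartition

    consumer = Consumer({
        'bootstrap.servers': 'broker1:9092',
        'group.id': 'audit-replay',        # fresh group so committed offsets are ignored
        'auto.offset.reset': 'earliest',
    })
    target_ms = replay_start_ms(days_back=7)
    # The offset field of each TopicPartition carries the timestamp to look up;
    # the broker answers with the earliest offset at or after that time
    parts = [TopicPartition('order-events', p, target_ms) for p in range(6)]
    offsets = consumer.offsets_for_times(parts, timeout=10.0)
    consumer.assign(offsets)               # start consuming from those offsets
    while True:
        msg = consumer.poll(1.0)
        if msg is not None and msg.error() is None:
            print(msg.timestamp(), msg.value())
```

The same pattern works for any window the retention policy still covers, which is exactly why long retention is so useful for audits.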
Where Kafka struggles:
- Partition count is a deployment-time decision that's painful to change. Under-partition and you hit throughput limits; over-partition and you pay in memory, file handles, and replication overhead.
- Multi-tenancy is bolted on. Kafka topics are flat namespaces. Quota management works but is not ergonomic.
- Geo-replication with MirrorMaker 2 is functional but operationally heavy. Active-active setups require careful offset translation.
```python
# Kafka producer with idempotency and explicit acks
from confluent_kafka import Producer

producer = Producer({
    'bootstrap.servers': 'broker1:9092,broker2:9092',
    'enable.idempotence': True,   # exactly-once at producer level
    'acks': 'all',                # wait for all ISR replicas
    'compression.type': 'lz4',
    'linger.ms': 5,               # batch for 5ms to improve throughput
    'batch.size': 65536,
})

def delivery_callback(err, msg):
    if err:
        print(f'Delivery failed: {err}')
    else:
        print(f'Delivered to {msg.topic()} [{msg.partition()}] @ {msg.offset()}')

producer.produce(
    'order-events',
    key=b'order-123',
    value=b'{"event": "placed", "amount": 99.99}',
    callback=delivery_callback
)
producer.flush()
```

Pulsar: The Right Answer for Multi-Tenancy
Pulsar's architecture is genuinely different: it separates compute (brokers) from storage (BookKeeper). Brokers are stateless; all data lives in BookKeeper. This makes broker scaling, failover, and geo-replication fundamentally simpler.
Where Pulsar wins:
- Multi-tenancy is first-class. Pulsar has organizations → tenants → namespaces → topics, with quotas and policies at each level. For platform teams managing streaming infrastructure for multiple internal teams, this is transformative.
- Geo-replication is built in. Configure replication at the namespace level and Pulsar handles it. Active-active across regions without MirrorMaker.
- Subscription models: Pulsar supports exclusive, shared, failover, and key-shared subscriptions natively. Kafka's consumer group model is simpler but less flexible.
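The key-shared model preserves per-key ordering while fanning work out across consumers. A hedged sketch; the topic and subscription names are made up, and the routing helper only illustrates the same-key-same-consumer invariant, not Pulsar's actual hash-range implementation:

```python
import hashlib

def route_key(key, num_consumers):
    """Illustration of key-based routing: every message with the same key
    lands on the same consumer. (Pulsar's real broker-side algorithm hashes
    keys into ranges; this helper only demonstrates the invariant.)"""
    digest = hashlib.md5(key.encode()).digest()
    return int.from_bytes(digest[:4], 'big') % num_consumers

if __name__ == '__main__':
    # Requires a running Pulsar cluster; pip install pulsar-client
    import pulsar
    client = pulsar.Client('pulsar://localhost:6650')
    consumer = client.subscribe(
        'persistent://my-tenant/my-namespace/order-events',
        subscription_name='order-processor',
        consumer_type=pulsar.ConsumerType.KeyShared,  # per-key ordering, N consumers
    )
    while True:
        msg = consumer.receive()
        print(msg.partition_key(), msg.data())
        consumer.acknowledge(msg)
```

Run several copies of this process and Pulsar spreads keys across them while keeping each key's messages in order, with no Kafka-style stop-the-world rebalance.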
Where Pulsar struggles:
- BookKeeper adds operational complexity. You're now running Pulsar brokers + BookKeeper + ZooKeeper (or etcd). Three systems instead of one.
- Kafka connector ecosystem is larger. Pulsar has Kafka protocol compatibility, but native Pulsar connectors lag behind.
- Community and tooling are smaller. Fewer blog posts, fewer Stack Overflow answers, fewer engineers who've debugged it in production.
```python
import pulsar
from pulsar.schema import JsonSchema, Record, String, Float

# Pulsar supports schemas natively; JsonSchema takes a Record class, not a bare dict
class OrderEvent(Record):
    event = String()
    order_id = String()
    amount = Float()

client = pulsar.Client('pulsar://localhost:6650')
producer = client.create_producer(
    'persistent://my-tenant/my-namespace/order-events',
    schema=JsonSchema(OrderEvent),
    batching_enabled=True,
    batching_max_publish_delay_ms=10,
)
producer.send(OrderEvent(event='placed', order_id='ord-123', amount=99.99))
client.close()
```

Kinesis: Operational Simplicity at AWS Lock-in Cost
Kinesis Data Streams is AWS's managed offering. The pitch is simple: no infrastructure to operate, scales automatically, integrates natively with Lambda, Glue, Firehose, and the rest of the AWS ecosystem.
Where Kinesis wins:
- Zero ops. No brokers to patch, no disks to monitor, no replication to configure.
- Native integration with AWS services is unmatched. Lambda triggers on a Kinesis stream with one checkbox.
- Enhanced fan-out (dedicated 2 MB/s throughput per consumer) solves the shared-throughput problem for multiple consumers.
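Enhanced fan-out is two API calls: register a dedicated consumer, then subscribe to a shard and let Kinesis push records over HTTP/2 instead of making you poll. A sketch with boto3; the stream ARN, consumer name, and shard ID are placeholders:

```python
from datetime import datetime, timezone

def starting_position(at=None):
    """Build a SubscribeToShard StartingPosition: AT_TIMESTAMP when a
    datetime is given, otherwise LATEST."""
    if at is None:
        return {'Type': 'LATEST'}
    return {'Type': 'AT_TIMESTAMP', 'Timestamp': at}

if __name__ == '__main__':
    import boto3  # pip install boto3; needs AWS credentials
    kinesis = boto3.client('kinesis', region_name='us-east-1')
    # A registered consumer gets its own 2 MB/s per shard instead of sharing
    # the shard's read limit with every other application
    registered = kinesis.register_stream_consumer(
        StreamARN='arn:aws:kinesis:us-east-1:111111111111:stream/order-events',
        ConsumerName='fraud-detector',
    )
    stream = kinesis.subscribe_to_shard(
        ConsumerARN=registered['Consumer']['ConsumerARN'],
        ShardId='shardId-000000000000',
        StartingPosition=starting_position(datetime(2026, 1, 1, tzinfo=timezone.utc)),
    )
    for event in stream['EventStream']:
        for record in event['SubscribeToShardEvent']['Records']:
            print(record['SequenceNumber'], record['Data'])
```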
Where Kinesis struggles:
- 365-day retention maximum. Kafka keeps data as long as you have disk.
- 1 MB message size limit. Fine for most events, a blocker for large payloads.
- Shard management is still manual. Kinesis doesn't auto-scale shards; you call UpdateShardCount and pay a roughly 30-second unavailability window while the stream reshards.
- Ecosystem outside AWS is thin. Kafka Connect, Flink, and Spark all integrate far more deeply with Kafka than with Kinesis.
- You're in the AWS ecosystem permanently. Migrating off Kinesis is a real engineering project.
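Manual shard management at least reduces to one call plus some capacity arithmetic. A sketch, assuming the documented 1 MB/s write limit per shard and a made-up stream name; the sizing helper and its 2x headroom factor are illustrative, not an AWS recommendation:

```python
import math

def target_shards(peak_write_mb_per_s, headroom=2.0):
    """Each shard ingests at most 1 MB/s, so provision peak * headroom shards."""
    return max(1, math.ceil(peak_write_mb_per_s * headroom))

if __name__ == '__main__':
    import boto3  # pip install boto3; needs AWS credentials
    kinesis = boto3.client('kinesis', region_name='us-east-1')
    kinesis.update_shard_count(
        StreamName='order-events',
        TargetShardCount=target_shards(peak_write_mb_per_s=6),
        ScalingType='UNIFORM_SCALING',  # the only scaling type Kinesis supports
    )
```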
```python
import boto3, json

kinesis = boto3.client('kinesis', region_name='us-east-1')

# Shard count defines maximum throughput: 1 MB/s write, 2 MB/s read per shard
response = kinesis.put_record(
    StreamName='order-events',
    Data=json.dumps({'event': 'placed', 'order_id': 'ord-123'}).encode(),
    PartitionKey='ord-123',  # determines shard assignment
)
print(f"Shard: {response['ShardId']}, Sequence: {response['SequenceNumber']}")
```

Redpanda: Kafka API, Better Ops Story
Redpanda is a Kafka API-compatible streaming platform written in C++. No JVM, no ZooKeeper, no Kafka. It implements the Kafka wire protocol, so any Kafka client works without modification.
Where Redpanda wins:
- Operational simplicity. Single binary. No JVM GC tuning. No ZooKeeper. Setup is genuinely simpler than Kafka.
- Lower tail latency. The C++ implementation and thread-per-core architecture produce more consistent P99 latency than JVM-based Kafka, especially under memory pressure.
- Kafka API compatibility. Drop-in replacement. Your Kafka consumers, producers, and Connect connectors work as-is.
- Faster for small clusters. Running 3 Redpanda nodes is less operationally intensive than 3 Kafka brokers + configuration management.
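Drop-in compatibility means the producer configuration from the Kafka section runs unchanged against Redpanda; only the bootstrap servers differ. A sketch with assumed node addresses:

```python
def redpanda_producer_config(brokers):
    """Same confluent_kafka config as against Kafka; only the endpoints change."""
    return {
        'bootstrap.servers': ','.join(brokers),
        'enable.idempotence': True,   # Redpanda supports idempotent producers
        'acks': 'all',
        'compression.type': 'lz4',
    }

if __name__ == '__main__':
    from confluent_kafka import Producer  # pip install confluent-kafka
    producer = Producer(redpanda_producer_config(['redpanda-0:9092', 'redpanda-1:9092']))
    producer.produce('order-events', key=b'order-123', value=b'{"event": "placed"}')
    producer.flush()
```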
Where Redpanda struggles:
- Licensing is different. Kafka is Apache-licensed; Redpanda's core is source-available, and some enterprise features require Redpanda Enterprise.
- Smaller ecosystem. Kafka's 200+ connectors are not all tested against Redpanda's Kafka compatibility layer.
- Less battle-tested at extreme scale. Kafka has been running at millions of partitions at LinkedIn and Netflix for a decade. Redpanda has fewer of those war stories.
- Wasm transforms (Redpanda's in-broker compute) are not production-stable for all use cases.
Decision Framework
The selection criteria that actually matter:
| Criterion | Kafka | Pulsar | Kinesis | Redpanda |
|---|---|---|---|---|
| Ops complexity (self-hosted) | Medium | High | None | Low |
| Multi-tenancy | Poor | Excellent | Poor | Poor |
| Geo-replication | Hard | Easy | Medium | Medium |
| Ecosystem / connectors | Best | Good | AWS-only | Good (Kafka compat) |
| Retention flexibility | Unlimited | Unlimited | 365 days | Unlimited |
| Tail latency (P99) | Medium | Medium | Variable | Low |
| Lock-in risk | Low | Low | High | Low |
Consumer Ergonomics: The Underrated Factor
Consumer ergonomics — how easy it is to write a correct consumer — matters more than most teams admit when evaluating platforms.
Kafka's consumer group API is well-understood but has sharp edges: offset commits must be managed carefully, rebalances can cause duplicate processing, and exactly-once semantics require transactional producers and idempotent consumers.
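The usual mitigation for those sharp edges is to disable auto-commit, commit only after processing succeeds, and make processing idempotent so rebalance-induced redelivery is harmless. A sketch with confluent_kafka; the in-memory seen set stands in for whatever durable dedup store you actually use:

```python
def process_once(msg_id, seen, handler):
    """Idempotent-processing guard: a message redelivered after a rebalance is
    recognized by its (topic, partition, offset) identity and skipped."""
    if msg_id in seen:
        return False
    handler(msg_id)
    seen.add(msg_id)
    return True

if __name__ == '__main__':
    from confluent_kafka import Consumer  # pip install confluent-kafka
    consumer = Consumer({
        'bootstrap.servers': 'broker1:9092',
        'group.id': 'order-processor',
        'enable.auto.commit': False,   # commit only after the side effect succeeds
        'auto.offset.reset': 'earliest',
    })
    consumer.subscribe(['order-events'])
    seen = set()
    while True:
        msg = consumer.poll(1.0)
        if msg is None or msg.error():
            continue
        process_once((msg.topic(), msg.partition(), msg.offset()), seen,
                     lambda _id: print('processed', _id))
        consumer.commit(message=msg, asynchronous=False)  # at-least-once + dedup
```

This gives at-least-once delivery with deduplication, which is what most teams actually run; full exactly-once requires transactional producers on top.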
Pulsar's subscription model is more flexible. Key-shared subscriptions allow ordered processing across multiple consumers without a full rebalance. For workflows where you need ordering per key but want horizontal consumer scaling, this is genuinely better than Kafka.
Kinesis's iterator model is simpler to understand but Enhanced Fan-Out adds a separate subscription concept. AWS Lambda integration abstracts most of this away, which is why Kinesis is popular for serverless consumers.
Redpanda presents the same consumer API as Kafka — so the ergonomics are identical, for better or worse.
Key Takeaways
- Throughput benchmarks are not the decision criteria — ops burden, geo-replication, consumer model, and ecosystem lock-in are what differentiate these platforms in production.
- Kafka is the right default for most teams: deepest ecosystem, most experienced operators in the job market, and KRaft mode has eliminated the ZooKeeper complexity argument.
- Pulsar wins decisively for multi-tenant platform teams that need namespace-level isolation and active-active geo-replication as a first-class feature — but the BookKeeper ops overhead is real.
- Kinesis is the correct choice if you're committed to AWS, don't need more than 365 days retention, and want zero infrastructure management; the lock-in cost is worth it for teams that want to stay in the AWS control plane.
- Redpanda is the best option for teams that want Kafka API compatibility with lower operational complexity on self-hosted clusters, particularly for smaller deployments where JVM tuning is a burden.
- Evaluate your consumer team's experience and your geo-replication requirements before you compare throughput numbers; those factors will drive more of your operational cost than raw ingestion rate.