Outbox and Inbox
Every event-driven system eventually hits the dual-write problem. You save an order to the database and publish an OrderPlaced event to Kafka. What happens when the database write succeeds and the Kafka publish fails? Or the publish succeeds but the database transaction rolls back? You either lose events or produce phantom events—both are silent consistency failures.
The standard advice is "don't do distributed transactions." The transactional outbox pattern is the practical alternative that achieves the same durability guarantee without two-phase commit.
The Dual-Write Problem
The naive approach:
```python
def place_order(order: Order) -> None:
    db.save(order)                           # Step 1: persist
    kafka.publish("orders.placed", order)    # Step 2: publish
```

Failure scenarios:
- DB succeeds, Kafka publish fails → event lost, order exists but nothing reacts
- Kafka publishes, DB transaction rolls back → phantom event, no corresponding order
- Network partition between steps → unpredictable
Even with retry logic on the Kafka publish, you can't retry a rolled-back transaction. The two operations are not atomic.
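The failure mode is easy to reproduce. This sketch (SQLite and a fake broker stand in for the real database and Kafka, purely for illustration) commits the order, then fails the publish — the order row survives but the event never exists:

```python
import sqlite3

class FlakyBroker:
    """Stand-in for Kafka: publish can be made to fail on demand."""
    def __init__(self) -> None:
        self.events: list[str] = []
        self.fail = False

    def publish(self, event: str) -> None:
        if self.fail:
            raise ConnectionError("broker unreachable")
        self.events.append(event)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY)")
broker = FlakyBroker()

def place_order(order_id: str) -> None:
    with db:                                        # commits on success
        db.execute("INSERT INTO orders VALUES (?)", (order_id,))
    broker.publish(f"OrderPlaced:{order_id}")       # separate operation: can fail alone

broker.fail = True
try:
    place_order("o-1")
except ConnectionError:
    pass

rows = db.execute("SELECT id FROM orders").fetchall()
print(rows)            # [('o-1',)] — the order is persisted
print(broker.events)   # []         — but no event was ever published
```

Retrying `broker.publish` helps only with the first failure mode; nothing can retry the mirror-image case where the publish succeeded and the transaction later rolled back.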
The Transactional Outbox
The outbox pattern makes event publication part of the database transaction. Instead of publishing directly to the broker, write a record to an outbox table in the same transaction as your business data.
```sql
-- outbox table
CREATE TABLE outbox (
    id             UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    aggregate_type VARCHAR(100) NOT NULL,
    aggregate_id   VARCHAR(255) NOT NULL,
    event_type     VARCHAR(255) NOT NULL,
    payload        JSONB NOT NULL,
    created_at     TIMESTAMPTZ DEFAULT NOW(),
    published_at   TIMESTAMPTZ,
    status         VARCHAR(20) DEFAULT 'PENDING'
);
```

```python
def place_order(order: Order) -> None:
    with db.transaction():
        db.save(order)
        db.insert("outbox", {
            "aggregate_type": "Order",
            "aggregate_id": order.id,
            "event_type": "OrderPlaced",
            "payload": order.to_dict(),
            "status": "PENDING",
        })
    # The transaction commits both writes atomically.
    # If it rolls back, neither the order nor the outbox record exists.
```

A separate relay process reads unpublished outbox records and publishes them to the broker.
If the relay crashes after publishing but before marking the record published, it will re-publish on restart. This means the broker receives the event at least once. Consumers must be idempotent (covered in Post 7).
Relay Implementation Strategies
Polling relay: a background job runs on a schedule, queries the outbox table, and publishes pending records. Simple to implement, adds load to the database, has latency proportional to polling interval.
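Because polling repeatedly scans the outbox table, a partial index over unpublished rows keeps that query cheap even as published history accumulates (a sketch; the index name is illustrative):

```sql
-- Index only the rows the relay actually polls
CREATE INDEX outbox_pending_idx
    ON outbox (created_at)
    WHERE status = 'PENDING';
```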
```python
# Polling relay (simplified)
import logging
import time

logger = logging.getLogger(__name__)

def run_relay() -> None:
    while True:
        pending = db.query(
            "SELECT * FROM outbox WHERE status = 'PENDING' ORDER BY created_at LIMIT 100"
        )
        for record in pending:
            try:
                kafka.publish(
                    topic=topic_for(record["event_type"]),
                    key=record["aggregate_id"],
                    value=record["payload"],
                )
                db.execute(
                    "UPDATE outbox SET status = 'PUBLISHED', published_at = NOW() WHERE id = %s",
                    [record["id"]],
                )
            except Exception as e:
                # Leave the record PENDING; it will be retried on the next poll.
                logger.error(f"Failed to publish {record['id']}: {e}")
        time.sleep(0.5)
```

CDC relay (Change Data Capture): a CDC tool like Debezium tails the database transaction log and streams outbox record insertions directly to Kafka. No polling, near-real-time delivery, and negligible extra query load on the database (the log is read via logical decoding).
CDC is operationally more complex (Debezium cluster, connector config, WAL retention settings) but is the right choice for high-throughput systems where polling latency or database load is a concern.
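As a rough picture of the CDC setup, a Debezium Postgres connector can watch the outbox table and use the outbox event router SMT to turn inserts into routed Kafka events. The fragment below is a hedged sketch only: connection details are placeholders, the field mappings match the table above, and the exact option names should be verified against your Debezium version's documentation.

```json
{
  "name": "orders-outbox-connector",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "database.hostname": "db",
    "database.dbname": "orders",
    "database.user": "debezium",
    "database.password": "<secret>",
    "topic.prefix": "orders",
    "table.include.list": "public.outbox",
    "transforms": "outbox",
    "transforms.outbox.type": "io.debezium.transforms.outbox.EventRouter",
    "transforms.outbox.route.by.field": "aggregate_type",
    "transforms.outbox.table.field.event.key": "aggregate_id",
    "transforms.outbox.table.field.event.payload": "payload"
  }
}
```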
Inbox Pattern: Deduplication Downstream
The outbox guarantees at-least-once delivery. The inbox pattern handles the consumer side: deduplicate events before processing to achieve effectively-once semantics.
The inbox table records processed event IDs:
```sql
CREATE TABLE inbox (
    event_id     UUID PRIMARY KEY,
    event_type   VARCHAR(255) NOT NULL,
    processed_at TIMESTAMPTZ DEFAULT NOW()
);
```

Consumer processing:
```python
def handle_order_placed(event: dict) -> None:
    event_id = event["id"]
    with db.transaction():
        # Attempt to claim the event
        inserted = db.execute(
            "INSERT INTO inbox (event_id, event_type) VALUES (%s, %s) ON CONFLICT DO NOTHING",
            [event_id, event["type"]],
        )
        if inserted.rowcount == 0:
            # Already processed — skip idempotently
            logger.info(f"Duplicate event {event_id}, skipping")
            return
        # Process the event within the same transaction
        reserve_inventory(event["data"]["orderId"], event["data"]["items"])
    # Transaction commits inbox record + business operation atomically
```

The ON CONFLICT DO NOTHING combined with the transaction gives effectively-once processing: either the inbox record and the business operation both commit, or neither does.
Inbox Retention and Cleanup
Inbox tables grow unbounded without cleanup. Events have a natural expiry based on your broker's retention policy—if an event cannot be replayed after 7 days, there's no point keeping inbox records older than 7 days.
```sql
-- Run daily via pg_cron or a scheduled job
DELETE FROM inbox
WHERE processed_at < NOW() - INTERVAL '7 days';
```

Match the inbox retention to your broker's message retention. If your Kafka topics retain 30 days of events (for replay), retain inbox records for at least 30 days to cover any replay scenario.
Outbox vs Saga
The outbox ensures that an event gets published at least once. A saga coordinates a multi-step workflow across services. These are complementary: each saga step uses the outbox pattern to reliably emit the event that triggers the next step.
Every event that crosses a service boundary goes through an outbox. No step can silently drop an event.
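A single saga step can apply both patterns in one transaction: claim the incoming event in the inbox, do the work, and enqueue the outgoing event in the outbox for the relay to publish. A minimal runnable sketch with SQLite (table shapes simplified; the table and event names are illustrative):

```python
import json
import sqlite3
import uuid

db = sqlite3.connect(":memory:")
db.executescript("""
    CREATE TABLE inbox        (event_id TEXT PRIMARY KEY);
    CREATE TABLE reservations (order_id TEXT PRIMARY KEY);
    CREATE TABLE outbox       (id TEXT PRIMARY KEY, event_type TEXT, payload TEXT);
""")

def handle_order_placed(event: dict) -> bool:
    """One saga step; returns False when the event is a duplicate delivery."""
    with db:  # one transaction: inbox claim + business write + outbox insert
        claimed = db.execute(
            "INSERT OR IGNORE INTO inbox (event_id) VALUES (?)",
            (event["id"],),
        )
        if claimed.rowcount == 0:
            return False                       # already processed — skip
        order_id = event["data"]["orderId"]
        db.execute("INSERT INTO reservations VALUES (?)", (order_id,))
        db.execute(                            # trigger for the next saga step
            "INSERT INTO outbox VALUES (?, ?, ?)",
            (str(uuid.uuid4()), "InventoryReserved", json.dumps({"orderId": order_id})),
        )
        return True

event = {"id": "evt-1", "data": {"orderId": "o-1"}}
print(handle_order_placed(event))   # first delivery: processed
print(handle_order_placed(event))   # redelivery: deduplicated by the inbox
```

SQLite's INSERT OR IGNORE plays the role Postgres's ON CONFLICT DO NOTHING plays in the handler above; the relay (not shown) would later drain the outbox row exactly as in the earlier sections.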
Key Takeaways
- Dual-write (save to DB + publish to broker in separate operations) creates silent consistency failures when either step fails.
- The transactional outbox writes the event to the same database transaction as the business data, making publication atomic with persistence.
- Polling relays are simple to implement; CDC relays (Debezium) are faster and impose less database load at higher throughput.
- The outbox delivers at-least-once; consumers use the inbox pattern to deduplicate and achieve effectively-once processing.
- Inbox table retention must match or exceed broker message retention to handle replay scenarios correctly.
- Outbox and saga patterns are complementary—each saga step emits its trigger event via the outbox.