Webhooks
Series
API Design Mastery
Webhooks invert the polling model: instead of clients asking "did anything change?", your API pushes notifications the moment something does. Done well, they are one of the most powerful integration primitives you can offer. Done poorly, they become a distributed reliability problem: missed events, duplicate deliveries, spoofed payloads, and ordering violations that corrupt your integrators' data. The gap between a webhook that works in testing and one that holds up in production is almost entirely in the delivery contract.
Webhook Payload Design
Every webhook event should be self-describing, versioned, and carry enough context to be actionable without a follow-up API call:
{
  "id": "evt_01J8K2MNPQ3RS4TUV5WX6YZ7A",
  "type": "order.shipped",
  "apiVersion": "2025-11-01",
  "createdAt": "2025-11-19T14:23:45Z",
  "data": {
    "object": {
      "id": "ord_789",
      "status": "shipped",
      "customerId": "cust_123",
      "items": [
        {"productId": "prod_A", "quantity": 2}
      ],
      "trackingNumber": "1Z999AA1012345678",
      "shippedAt": "2025-11-19T14:20:00Z"
    },
    "previousAttributes": {
      "status": "processing"
    }
  }
}
Key fields:
- id: a unique event ID, used for deduplication.
- type: namespaced event type (resource.action).
- apiVersion: pinned to the API version when the subscription was created.
- data.previousAttributes: what changed (the delta), so receivers do not need to call back immediately.
Embed enough data for the most common use case. If the payload is too large, include the object ID and let clients fetch details — but document this explicitly.
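As a rough illustration of the envelope described above, the following sketch wraps a resource snapshot and its previous attributes in the same top-level fields. The function name build_event and the UUID-based ID scheme are assumptions made for this example; any globally unique ID format would serve.

```python
import json
import uuid
from datetime import datetime, timezone


def build_event(event_type: str, api_version: str, obj: dict,
                previous_attributes: dict) -> dict:
    """Wrap a resource snapshot in the event envelope shown above."""
    return {
        # Hypothetical ID scheme: any globally unique string works.
        "id": f"evt_{uuid.uuid4().hex}",
        "type": event_type,
        "apiVersion": api_version,
        "createdAt": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "data": {
            "object": obj,
            "previousAttributes": previous_attributes,
        },
    }


event = build_event(
    "order.shipped",
    "2025-11-01",
    {"id": "ord_789", "status": "shipped"},
    {"status": "processing"},
)
print(json.dumps(event, indent=2))
```

Building every event through one function like this keeps the envelope uniform across event types, which is what lets receivers route on type and deduplicate on id without inspecting the payload.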
HMAC Signing
Without signing, any entity on the internet can POST to your customers' webhook endpoints pretending to be you. HMAC-SHA256 signing prevents this:
X-Signature-256: sha256=a1b2c3d4e5f6...
X-Signature-Timestamp: 1700395425
Signing procedure:
import hmac
import hashlib
import time

def sign_payload(secret: str, payload: bytes, timestamp: int) -> str:
    signed_content = f"{timestamp}.".encode() + payload
    signature = hmac.new(
        secret.encode(),
        signed_content,
        hashlib.sha256
    ).hexdigest()
    return f"sha256={signature}"
Receiver verification:
def verify_webhook(request, secret: str) -> bool:
    try:
        timestamp = int(request.headers["X-Signature-Timestamp"])
        signature = request.headers["X-Signature-256"]
    except (KeyError, ValueError):
        # Missing or malformed signature headers: reject
        return False
    # Reject stale requests (replay window: 5 minutes)
    if abs(time.time() - timestamp) > 300:
        return False
    # request.body must be the raw bytes as received, not re-serialized JSON
    expected = sign_payload(secret, request.body, timestamp)
    # Constant-time comparison
    return hmac.compare_digest(expected, signature)
Include the timestamp in the signed content so that an attacker cannot pair an old payload with a fresh timestamp header: a payload captured 24 hours ago still carries its original signed timestamp and fails the staleness check. Use hmac.compare_digest (constant-time comparison) to prevent timing attacks.
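On the sending side, the two headers shown earlier can be produced together for each delivery attempt. This is a minimal sketch that follows the signing procedure above; the helper name build_signature_headers, its now parameter, and the example secret are assumptions for illustration.

```python
import hashlib
import hmac
import time
from typing import Optional


def build_signature_headers(secret: str, payload: bytes,
                            now: Optional[int] = None) -> dict:
    """Return the two signature headers for one delivery attempt."""
    timestamp = int(time.time()) if now is None else now
    # Sign "<timestamp>.<raw body>", matching the signing procedure above.
    signed_content = f"{timestamp}.".encode() + payload
    digest = hmac.new(secret.encode(), signed_content, hashlib.sha256).hexdigest()
    return {
        "X-Signature-256": f"sha256={digest}",
        "X-Signature-Timestamp": str(timestamp),
    }


headers = build_signature_headers("whsec_example", b'{"id":"evt_1"}', now=1700395425)
print(headers["X-Signature-Timestamp"])  # 1700395425
```

One consequence of the 5-minute staleness check is worth noting: if the sender reused the timestamp from the first attempt, a retry made 30 minutes later would be rejected by a correct receiver. Signing each attempt with the current time, as the default now=None does here, avoids that.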
Delivery and Retry
Retry policy: exponential backoff with jitter, bounded by a max retry count and deadline.
Attempt 1: immediately
Attempt 2: 30s
Attempt 3: 5m
Attempt 4: 30m
Attempt 5: 2h
Attempt 6: 8h
Attempt 7: 24h
Give up: move to dead-letter queue
Jitter prevents a thundering herd when a downstream endpoint recovers after downtime and would otherwise receive all retried events simultaneously.
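A delivery is successful only when the endpoint returns 2xx within a timeout (20–30 seconds). Treat 4xx as permanent failures (stop retrying) and 5xx and timeouts as transient (retry). Two 4xx codes are commonly handled as transient instead: 408 Request Timeout and 429 Too Many Requests.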
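The retry schedule and the response classification can be sketched in a few lines. The base delays come from the schedule above; the function names, the 20% jitter width, retrying 408 and 429, and treating 3xx as a permanent failure are all assumptions of this example rather than part of the policy as stated.

```python
import random

# Base delays in seconds before attempts 2..7, taken from the schedule above.
BASE_DELAYS = [30, 300, 1800, 7200, 28800, 86400]

# Assumption: these 4xx codes signal a temporary condition and are retried.
RETRYABLE_4XX = {408, 429}


def next_delay(retry_index: int, jitter: float = 0.2, rng=random):
    """Return the jittered delay before the given retry, or None to give up."""
    if retry_index >= len(BASE_DELAYS):
        return None  # retries exhausted: route the event to the dead-letter queue
    base = BASE_DELAYS[retry_index]
    # Spread deliveries uniformly over base +/- (jitter * base).
    return base * (1 + rng.uniform(-jitter, jitter))


def classify(status):
    """Map an HTTP status (None for a timeout) to a delivery outcome."""
    if status is None or status >= 500 or status in RETRYABLE_4XX:
        return "retry"
    if 200 <= status < 300:
        return "success"
    return "permanent_failure"


print(classify(204), classify(503), classify(404))  # success retry permanent_failure
```

A delivery worker would call classify on each response and, for a "retry" outcome, schedule the next attempt after next_delay(retry_index) seconds, moving the event to the dead-letter queue once that returns None.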
Ordering Guarantees
Webhooks do not guarantee ordering by default — network conditions, retries, and parallel workers can invert event sequences. Design receivers to be order-tolerant:
{
  "type": "order.status_changed",
  "data": {
    "object": {
      "id": "ord_789",
      "status": "delivered",
      "sequenceNumber": 7
    }
  }
}
Include a monotonically increasing sequenceNumber per resource so receivers can detect out-of-order delivery. If event 7 arrives before event 5, the receiver can buffer event 7, request a replay of the gap, or re-fetch the current resource state.
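A receiver can detect out-of-order delivery by tracking the highest sequence number it has applied per resource. The sketch below assumes sequence numbers start at 1 and increase by exactly 1 per event, which the text does not state; the in-memory dict and the name check_sequence are likewise illustrative, and a real receiver would persist this state.

```python
# Resource ID -> highest sequenceNumber applied so far (in-memory for the sketch).
last_seen = {}


def check_sequence(resource_id: str, sequence_number: int) -> str:
    """Classify an incoming event relative to what was already applied."""
    previous = last_seen.get(resource_id, 0)
    if sequence_number <= previous:
        return "stale"  # already applied or superseded: safe to skip
    if sequence_number > previous + 1:
        return "gap"    # an earlier event is missing: buffer, replay, or re-fetch
    last_seen[resource_id] = sequence_number
    return "apply"


print(check_sequence("ord_789", 1))  # apply
print(check_sequence("ord_789", 3))  # gap
print(check_sequence("ord_789", 1))  # stale
```

If sequence numbers are only guaranteed to increase, not to be contiguous, the "gap" branch cannot be detected this way and the receiver is limited to discarding stale events.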
For strict ordering requirements, offer a per-resource ordered queue (one goroutine/worker per resource ID). Events for different resources can still be processed in parallel, but events for a single resource are delivered in order.
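One common way to approximate a worker per resource ID with a fixed pool is to partition events by a stable hash of the ID, so that all events for one resource land on the same sequential worker. This is a sketch of that variation, not the only way to build an ordered queue; worker_for and the list-based queues are hypothetical stand-ins for real worker infrastructure.

```python
import hashlib


def worker_for(resource_id: str, worker_count: int) -> int:
    """Pick a stable worker index for a resource ID."""
    # Use a stable hash rather than Python's built-in hash(), which is
    # randomized per process for strings and would remap IDs on restart.
    digest = hashlib.sha256(resource_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % worker_count


# Four queues, each drained in order by a single worker.
queues = [[] for _ in range(4)]
for resource_id, sequence_number in [("ord_1", 1), ("ord_2", 1), ("ord_1", 2)]:
    queues[worker_for(resource_id, len(queues))].append((resource_id, sequence_number))
```

Because each queue has exactly one consumer, events for ord_1 are delivered in the order they were enqueued, while ord_2 may proceed in parallel on another worker. The trade-off is that a slow endpoint for one resource delays every other resource that hashes to the same worker.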
Deduplication
At-least-once delivery is the realistic guarantee. Receivers will see duplicate events. Design for it:
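import redis

redis_client = redis.Redis()  # assumes a reachable Redis instance

def handle_webhook(event: dict) -> None:
    event_id = event["id"]
    # Idempotency check: SET with NX succeeds only for the first writer,
    # and EX expires the marker after 24 hours
    first_seen = redis_client.set(f"webhook:seen:{event_id}", 1, nx=True, ex=86400)
    if first_seen:
        # First time seeing this event: process it
        process_event(event)
    # Otherwise it is a duplicate: acknowledge but skip
Acknowledge the duplicate with 200 OK so the sender's delivery system knows not to retry it.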
Replay API
Provide a replay mechanism for events your customers missed:
POST /webhooks/subscriptions/{subId}/replay HTTP/1.1
Content-Type: application/json

{
  "since": "2025-11-18T00:00:00Z",
  "until": "2025-11-19T00:00:00Z",
  "types": ["order.shipped", "order.delivered"]
}
Store all events (with their original payload) for at least 7 days. Event replay is the escape hatch when a customer's endpoint was down for maintenance, their deploy was broken, or their database migration ate some records.
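Server-side, a replay request reduces to selecting stored events by time window and type. The sketch below uses an in-memory list in place of a real event store and treats the window as half-open (since inclusive, until exclusive); both choices, along with the function names, are assumptions for illustration.

```python
from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse a UTC timestamp such as 2025-11-18T00:00:00Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ")


def select_for_replay(events, since, until, types=None):
    """Return stored events with since <= createdAt < until, oldest first."""
    start, end = parse_ts(since), parse_ts(until)
    matches = [
        e for e in events
        if start <= parse_ts(e["createdAt"]) < end
        and (types is None or e["type"] in types)
    ]
    return sorted(matches, key=lambda e: parse_ts(e["createdAt"]))


stored = [
    {"id": "evt_3", "type": "order.delivered", "createdAt": "2025-11-18T18:00:00Z"},
    {"id": "evt_1", "type": "order.shipped", "createdAt": "2025-11-18T09:30:00Z"},
    {"id": "evt_2", "type": "order.created", "createdAt": "2025-11-18T10:00:00Z"},
    {"id": "evt_4", "type": "order.shipped", "createdAt": "2025-11-19T00:00:00Z"},
]
replayed = select_for_replay(
    stored,
    "2025-11-18T00:00:00Z",
    "2025-11-19T00:00:00Z",
    ["order.shipped", "order.delivered"],
)
print([e["id"] for e in replayed])  # ['evt_1', 'evt_3']
```

Replaying events with their original id and payload, rather than minting new IDs, lets the receiver's deduplication logic skip anything it already processed.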
Key Takeaways
- Every webhook event needs a unique ID for deduplication, a namespaced type, a pinned API version, and enough payload data to be actionable without a follow-up call.
- Sign all payloads with HMAC-SHA256 including a timestamp; receivers must verify the signature and reject requests older than 5 minutes to prevent replay attacks.
- Use exponential backoff with jitter for retries; treat 4xx as permanent failures and 5xx/timeouts as transient; route undeliverable events to a dead-letter queue.
- Include a per-resource sequence number so receivers can detect and handle out-of-order delivery.
- Guarantee at-least-once delivery and design receiver logic to be idempotent using the event ID as a deduplication key.
- Provide a replay API backed by at least 7 days of event history — it is your customers' safety net when things go wrong.