Newsfeed Walkthrough
The newsfeed question is deceptively familiar. Most candidates have mental models from "Design Twitter" blog posts that are five years old and describe architectures that the companies themselves have long since replaced. The concepts underneath are still valid, but they require careful application — particularly around fanout strategy, which is where most candidates take a firm position and then get demolished by the follow-up.
Scope and Constraints First
Before designing anything, establish:
- Scale: 500M registered users, 100M DAU. Average user follows 200 people. 1% of users post per day = 1M new posts/day ≈ 12 posts/second average.
- Read/write ratio: Heavily read-dominant. Assume 100:1 — for every post, 100 feed reads.
- Latency target: Feed load < 200ms p99. Post publish latency < 1 second.
- Consistency: Eventual consistency is acceptable — you can see posts up to a few seconds late.
- Content: Text, images, links. Video is out of scope.
Derived: 12 writes/second, ~1,200 reads/second average, ~3,600 reads/second at peak (3x factor). Storage: 1M posts/day × average 500 bytes = 500MB/day of post data, ~180GB/year — trivial. The hard problem is feed assembly, not storage.
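These back-of-envelope numbers are easy to sanity-check in a few lines (all figures come from the constraints above):

```python
SECONDS_PER_DAY = 86_400

posts_per_day = 1_000_000                      # 1% of 100M DAU post once per day
write_rps = posts_per_day / SECONDS_PER_DAY    # ~11.6, round to 12
read_rps = write_rps * 100                     # 100:1 read/write ratio -> ~1,200
peak_read_rps = read_rps * 3                   # 3x peak factor -> ~3,500

post_bytes = 500
storage_per_day = posts_per_day * post_bytes   # 500 MB/day
storage_per_year = storage_per_day * 365       # ~180 GB/year
```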
The Fanout Decision Is the Core Design
Fanout is the process of distributing a new post to the feeds of all followers. There are two strategies, and both are wrong in isolation.
Fanout on write (push model): When a user posts, immediately write to the feed caches of all N followers. Feed reads are O(1) — just read a prebuilt list. The problem: a user with 10M followers causes 10M writes per post. This is untenable for celebrities.
Fanout on read (pull model): When a user requests their feed, merge the recent posts from all N accounts they follow in real time. No write amplification. The problem: assembling a feed by merging 200 timelines, ranking, and paginating them in < 200ms at 3,600 RPS is expensive. It works at small scale; it becomes a bottleneck as follow counts grow.
The real answer: hybrid fanout. Split authors by follower count. For the vast majority of users — those below a follower threshold — fan out on write as normal. For high-follower accounts above the threshold, skip fanout entirely and pull their recent posts at read time, merging them into the viewer's precomputed feed. Each feed read is then a cheap cache read plus a small merge over the handful of celebrities the viewer follows.
The threshold (10k followers in this example) is tunable. In practice, you monitor the fanout queue depth and adjust the threshold to keep queue processing within SLA.
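A minimal sketch of the hybrid split — the threshold value, field names, and merge logic here are illustrative, not a reference implementation:

```python
CELEBRITY_THRESHOLD = 10_000  # tunable, per the note above

def fanout_strategy(follower_count: int) -> str:
    """Push (fanout on write) for normal accounts, pull for celebrities."""
    return "pull" if follower_count >= CELEBRITY_THRESHOLD else "push"

def assemble_feed(pushed, pulled, limit=50):
    """Merge the viewer's precomputed feed with posts pulled from celebrities.

    Both inputs are lists of (timestamp, post_id); the merge is newest-first.
    """
    merged = sorted(pushed + pulled, reverse=True)
    return [post_id for _, post_id in merged[:limit]]
```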
The Fanout Queue
Fanout-on-write is never synchronous. The publish request acknowledges immediately; a background job fans out to follower feeds via a message queue.
```
POST /post → Post Service → {
  1. Write post to Post Store (Cassandra/DynamoDB, keyed by post_id)
  2. Publish message to Fanout Queue (Kafka)
  3. Return 200 to client
}

Fanout Workers (consume from Kafka) → {
  For each follower_id of poster:
    ZADD feed:{follower_id} timestamp post_id   ← Redis sorted set, trimmed to 1000 entries
}
```

This decouples the write path from the fanout. If the fanout queue backs up (e.g., a celebrity posts), post latency is unaffected. The queue depth becomes your leading indicator of fanout lag.
Feed Storage: What To Cache
Cache the feed as a list of post IDs, not full post objects. Post objects (content, like counts, author metadata) are mutable — like counts change constantly. If you cache full post objects in the feed, every like update requires invalidating potentially millions of feed cache entries.
Instead: cache a sorted list of post IDs per user. On read, fetch the top N IDs from the feed cache, then bulk-fetch post objects from a separate post object cache (TTL of 30 minutes). Like counts can be eventually consistent — served from a separate counter store (Redis counters or a dedicated counting service).
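The read path in sketch form, with dict-like stand-ins for the two caches and the post store, and the bulk fetch standing in for a Redis MGET:

```python
def load_feed(user_id, feed_cache, post_cache, post_store, n=50):
    """Fetch top-N post IDs from the feed cache, then bulk-fetch post objects."""
    post_ids = feed_cache.get(user_id, [])[:n]
    posts = {pid: post_cache.get(pid) for pid in post_ids}    # MGET equivalent
    for pid in [p for p, obj in posts.items() if obj is None]:
        posts[pid] = post_cache[pid] = post_store[pid]        # backfill cache misses
    return [posts[pid] for pid in post_ids]
```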
```
Feed cache (Redis):
  Key:   feed:{user_id}
  Value: sorted set of [post_id, score] — trimmed to 1000 entries
  TTL:   none (updated on every fanout write)

Post object cache (Redis):
  Key:   post:{post_id}
  Value: {author, text, media_urls, timestamp}
  TTL:   30 minutes

Like count store (Redis):
  Key:   likes:{post_id}
  Value: integer counter (INCR on each like)
```

Ranking: The Part Most Candidates Skip
A chronological feed is not a ranked feed. Every major social platform uses a ranking model, and the interview often probes here.
A simplified ranking score:

```
score(post) = recency_weight    * time_decay(post.age)
            + affinity_weight   * affinity(viewer, author)
            + engagement_weight * (likes + comments + shares)
```

In a real system, affinity(viewer, author) is a precomputed score updated in the background based on historical interaction signals. Storing and serving this per-pair score at scale requires its own infrastructure (a feature store, typically).
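The score above, sketched with an exponential time decay — the weights and half-life here are invented for illustration; real values come from offline tuning:

```python
RECENCY_W, AFFINITY_W, ENGAGEMENT_W = 1.0, 2.0, 0.1  # illustrative weights
HALF_LIFE_HOURS = 6.0  # assumed: a post's recency value halves every 6 hours

def time_decay(age_hours: float) -> float:
    """Exponential decay with a fixed half-life."""
    return 0.5 ** (age_hours / HALF_LIFE_HOURS)

def score(post: dict, affinity: float) -> float:
    """recency + viewer-author affinity + raw engagement, weighted."""
    engagement = post["likes"] + post["comments"] + post["shares"]
    return (RECENCY_W * time_decay(post["age_hours"])
            + AFFINITY_W * affinity
            + ENGAGEMENT_W * engagement)
```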
For the interview, demonstrating that you know ranking exists, what inputs it uses, and how it interacts with the fanout system is sufficient. You do not need to design a full ML pipeline.
Capacity Calculation Check
| Component | Estimate |
|---|---|
| Post writes | 12/sec avg, ~35/sec peak |
| Feed read RPS | ~1,200 avg, ~3,600 peak |
| Fanout queue messages/sec | 12 posts/sec × 200 avg followers = 2,400 fanout ops/sec |
| Feed cache entries | 100M DAU × 1,000 post IDs × 8 bytes = ~800GB |
| Redis nodes at 25GB each | ~32 nodes for feed cache |
| Post object cache | 1M posts/day × 500 bytes = 500MB/day, keep 7-day window = ~3.5GB — trivially cached |
The feed cache size is the key constraint. 800GB across 32 Redis nodes is manageable but requires consistent hashing with replication (3x = ~2.4TB total Redis capacity).
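The sizing arithmetic, for the skeptical:

```python
DAU = 100_000_000
IDS_PER_FEED = 1_000
BYTES_PER_ID = 8
NODE_CAPACITY = 25 * 10**9   # 25 GB per Redis node
REPLICATION = 3

feed_cache_bytes = DAU * IDS_PER_FEED * BYTES_PER_ID      # 800 GB
nodes = feed_cache_bytes / NODE_CAPACITY                  # 32
total_with_replication = feed_cache_bytes * REPLICATION   # 2.4 TB
```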
Key Takeaways
- Fanout strategy is the core decision in newsfeed design — neither push-only nor pull-only scales; hybrid fanout by follower count is the practitioner's answer.
- Fanout must be asynchronous through a queue so that post publish latency stays decoupled from fanout time.
- Cache post IDs in the feed, not full post objects, to avoid cache invalidation hell when mutable fields (like counts) change.
- Ranking exists in every real feed — even a simplified affinity + recency + engagement score demonstrates architectural awareness.
- The 800GB+ feed cache is the dominant infrastructure cost; design sharding strategy before estimating node count.
- Queue depth from fanout operations is your leading indicator of feed freshness SLA — monitor it as a first-class metric.