Trade-off Vocabulary That Lands
The most common sentence in system design interviews is also the most abused: "well, according to the CAP theorem, we can only have two of three." Interviewers have heard this hundreds of times. Most of the time it signals that the candidate memorized a triangle and stopped thinking. Knowing the vocabulary is necessary — using it precisely, in context, with concrete implications, is what actually moves the conversation.
CAP: What It Actually Says
CAP theorem states that in the presence of a network partition, a distributed system must choose between consistency and availability. Not "pick any two of three" as a general design principle — specifically, what happens when nodes cannot communicate.
Consistency (in CAP) means linearizability: every read reflects the most recent write, as if the system were a single machine. This is a strong guarantee.
Availability means every request received by a non-failing node gets a response (not necessarily one reflecting the most recent write).
Partition tolerance is not optional in any system that crosses a network boundary. Partitions happen — cables get cut, switches fail, cloud availability zones lose connectivity. Rejecting partition tolerance means rejecting distributed systems entirely.
So the real choice is: when a partition occurs, do you return an error (choose C) or return potentially stale data (choose A)? That is the decision. Everything else is context.
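That decision can be made concrete in a few lines. The sketch below is illustrative only (the `Replica` class and its fields are hypothetical): during a partition, a CP replica refuses to answer, while an AP replica serves whatever it has locally.

```python
from dataclasses import dataclass

@dataclass
class Replica:
    local_value: str   # local copy, possibly stale
    partitioned: bool  # True when the primary is unreachable
    mode: str          # "CP" or "AP"

    def read(self) -> str:
        if self.partitioned:
            if self.mode == "CP":
                # Choose consistency: refuse to answer rather than risk staleness.
                raise TimeoutError("primary unreachable; retry later")
            # Choose availability: answer with the best data we have.
            return self.local_value
        # No partition: the normal path serves the up-to-date local state.
        return self.local_value
```

Note that when `partitioned` is False, the two modes behave identically; CAP only constrains the partition branch.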
PACELC: The More Useful Extension
The problem with CAP is that partitions are relatively rare in well-engineered networks. PACELC asks what you trade off when there is no partition — which is most of the time:
- If there is a Partition: choose Availability or Consistency
- Else (no partition): choose Latency or Consistency
This is much more useful for real design conversations. DynamoDB in its default configuration is PA/EL — it sacrifices consistency during partitions and prefers low latency over strict consistency in normal operation. Spanner is PC/EC — it maintains consistency during partitions and trades latency for consistency at all times.
When you are designing a user profile service, the interviewer probably does not care about partition behavior. They care about the daily normal-operation trade-off: do you want 2ms reads with potential staleness, or 8ms reads with strong consistency? That is an EL vs EC question.
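The EL/EC choice often surfaces as a single read parameter. Here is a toy model (all names hypothetical) of a profile store with an asynchronously lagging replica: `strong=True` is the EC choice (pay coordination latency for freshness), `strong=False` is the EL choice (fast replica read, possibly stale).

```python
class ProfileStore:
    """Toy model of the PACELC EL-vs-EC read choice."""

    def __init__(self):
        self.primary = {}   # always current
        self.replica = {}   # lags until replication runs

    def write(self, key, value):
        self.primary[key] = value  # replica catches up asynchronously

    def replicate(self):
        # Stand-in for the async replication process.
        self.replica.update(self.primary)

    def read(self, key, strong=False):
        # strong=True  -> EC: read the primary, fresh but slower in practice
        # strong=False -> EL: read the replica, fast but possibly stale
        return (self.primary if strong else self.replica).get(key)
```

Real stores expose the same knob, e.g. a consistent-read flag on the request, but the mechanics above are the essence of it.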
Naming Consistency Models Precisely
"Eventually consistent" is not one thing. There is a spectrum, and using precise names signals depth:
Linearizability (strong consistency): reads always return the most recent write. Requires coordination on every operation — expensive, slow. Use for: financial balances, inventory counts, seat booking.
Sequential consistency: operations appear to execute in some global order consistent with each program order, but not necessarily real-time order. Slightly weaker than linearizability, rarely used as an explicit design choice.
Causal consistency: if operation A caused operation B, all processes see A before B. Comments appear after the posts they reply to. Use for: distributed social apps, collaborative editing.
Read-your-writes (session consistency): after you write, you always read your own write; other users are not guaranteed to see it immediately. Use for: profile updates (you see your own changes immediately), social posts (you see your own post immediately).
Eventual consistency: given no new writes, all replicas will converge. Weak guarantee — no timing bound, no ordering guarantee. Use for: analytics aggregates, social feed counts, DNS propagation.
| Model | Strength | Common Use Case | Latency Cost |
|---|---|---|---|
| Linearizable | Highest | Payments, inventory | High |
| Causal | Medium | Social replies, chat | Medium |
| Read-your-writes | Medium | Profile edits | Low-medium |
| Eventual | Weakest | Counters, feeds, DNS | Low |
Availability: Nines Are Not Enough
"We need five nines" is another phrase that lands badly without context. Availability percentages only matter when you pair them with what constitutes downtime and over what time window.
99.9% availability = ~8.76 hours of downtime per year = ~44 minutes per month. For a social feed, tolerable. For a payment processor at peak, catastrophic.
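The arithmetic behind the nines is worth internalizing; a one-line helper makes it concrete:

```python
HOURS_PER_YEAR = 365 * 24  # 8,760

def downtime_per_year_hours(availability: float) -> float:
    """Allowed downtime per year, in hours, for a given availability fraction."""
    return (1 - availability) * HOURS_PER_YEAR

# 99.9%  ("three nines") -> ~8.76 hours/year
# 99.99% ("four nines")  -> ~53 minutes/year
# 99.999% ("five nines") -> ~5.3 minutes/year
```

Each extra nine cuts the budget by 10x, which is why "we need five nines" is an engineering commitment, not a slogan.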
The more useful framing: what is the blast radius of an outage? A 5-minute outage during off-peak hours has different business impact than a 5-minute outage during Black Friday checkout. Design your SLA around business risk, not raw percentage.
Also: availability is not just about uptime. A service that is "up" but returning 30% errors is not available. Your SLA should include error rate budgets (e.g., "< 0.1% 5xx responses") not just uptime.
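A request-based error budget captures this better than wall-clock uptime. A minimal sketch (the function and its defaults are illustrative, not a standard API):

```python
def error_budget_remaining(slo: float, total_requests: int, failed: int) -> float:
    """Fraction of the window's error budget still unspent.

    slo: target success rate, e.g. 0.999 allows 0.1% of requests to fail.
    """
    budget = (1 - slo) * total_requests  # allowed failed requests this window
    if budget == 0:
        return 0.0  # an SLO of 100% leaves no budget at all
    return max(0.0, 1 - failed / budget)
```

A service that served a million requests with 500 5xx responses under a 99.9% SLO has burned exactly half its budget, even if it never "went down."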
Using Trade-off Language in the Interview
The goal is to make every architectural choice sound deliberate. Compare:
Weak framing: "I'll use eventual consistency because it's faster."
Strong framing: "For the user timeline cache, I'll accept read-your-writes consistency but not linearizability. After a user posts, they should see their own post immediately — I'll route their next few reads to the primary for 5 seconds after a write. For everyone else's reads, eventual consistency through the replica is fine; a 1–2 second lag in seeing someone else's post is imperceptible."
The strong framing names the model, explains the business justification, and describes the implementation mechanism. That is a complete trade-off argument.
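The routing mechanism in the strong framing is simple to sketch. This is a hypothetical `SessionRouter`, assuming a primary/replica split and a 5-second pin window as described above:

```python
import time

class SessionRouter:
    """Pin a user's reads to the primary briefly after their own write,
    giving read-your-writes without paying for linearizability globally."""

    PIN_SECONDS = 5.0  # illustrative window, per the example above

    def __init__(self):
        self.last_write = {}  # user_id -> monotonic timestamp of last write

    def record_write(self, user_id: str) -> None:
        self.last_write[user_id] = time.monotonic()

    def choose_backend(self, user_id: str) -> str:
        t = self.last_write.get(user_id)
        if t is not None and time.monotonic() - t < self.PIN_SECONDS:
            return "primary"  # the writer sees their own post immediately
        return "replica"      # everyone else tolerates 1-2s replication lag
```

In production this state typically lives in the session (a cookie or token carrying the last-write timestamp) rather than in router memory, but the decision logic is the same.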
Partition Handling: The Practical Question
When you do need to address partition behavior explicitly (geo-distributed systems, multi-region designs), stop and ask: what is worse — an error message or wrong data?
Wrong data is worse than an error: financial transactions, booking confirmations, authentication. Choose CP. Return an error and let the caller retry or escalate.
An error is worse than stale data: DNS lookups (stale cached record is fine, no response is catastrophic), social feeds, product catalogs. Choose AP. Return the best data you have.
In practice, this often becomes a mixed strategy: most of your system is AP by default, with a small CP core for the data that absolutely cannot be wrong (account balances, write-once identifiers). Design the blast radius: minimize how much of the system must be CP.
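The mixed strategy can be made explicit in the routing layer. A sketch with hypothetical data-class names, showing the CP core as an allowlist rather than the default:

```python
# Small, explicit CP core: only the data that absolutely cannot be wrong.
CP_DATA_CLASSES = {"account_balance", "seat_reservation", "auth_credentials"}

def handler_for(data_class: str) -> str:
    """Route each data class to the matching subsystem.

    Everything not explicitly listed falls through to the AP default,
    which keeps the CP blast radius as small as possible.
    """
    return "cp_core" if data_class in CP_DATA_CLASSES else "ap_default"
```

Making AP the fall-through case encodes the design principle directly: adding data to the CP core is a deliberate, reviewed decision, not an accident.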
Vocabulary Traps to Avoid
"CAP theorem means we can't have all three." Partition tolerance is not a design variable — it is a given for any networked system. The real choice is always C vs. A during partitions.
"We'll use eventual consistency for performance." Eventual consistency is not a performance optimization technique. It is a correctness tradeoff. If your bottleneck is something else (CPU, I/O, connection pooling), eventual consistency will not fix it.
"We need strong consistency." Linearizability is extremely expensive in distributed systems. You almost always need something weaker — read-your-writes or causal consistency — not full linearizability. Know which operations truly need the strongest guarantee and apply it narrowly.
Key Takeaways
- CAP's real question is narrow: when a partition occurs, do you return an error or stale data? Partition tolerance is not a design choice — it is a given.
- PACELC is more practically useful: it asks about latency vs. consistency during normal operation, which is the trade-off you make daily.
- Know the consistency model spectrum precisely — linearizable, causal, read-your-writes, eventual — and match each to a concrete use case.
- Availability SLAs need error budgets and blast-radius framing, not just uptime percentages.
- Strong tradeoff arguments name the model, justify it with business impact, and describe the implementation mechanism.
- Apply CP behavior narrowly to the data that cannot be wrong; let the rest of the system be AP for resilience and performance.