Series · 10 parts · ~73 min total

Building Production RAG

1
Problem Framing and Dataset Honesty
Before you write a single line of RAG code, you need to be ruthlessly honest about what problem you're solving and whether your data can actually solve it.
7 min
Jan 15, 2025
2
Chunking Strategies That Actually Move Recall
The chunking decision you make in hour one will haunt your recall numbers for months — here's what the tradeoffs actually look like in production.
7 min
Jan 22, 2025
3
Embedding Choice and Dimensionality
The embedding model you pick shapes every downstream tradeoff in your RAG pipeline — here's how to choose without regretting it six months later.
8 min
Jan 29, 2025
4
Hybrid Search: BM25 + Vector
Pure vector search leaves precision on the table for exact-match queries — here's how to combine lexical and semantic retrieval without making your pipeline a mess.
8 min
Feb 5, 2025
5
Re-ranking Architectures
First-stage retrieval gets you candidates — re-ranking is what turns a decent recall number into an answer users actually trust.
6 min
Feb 12, 2025
6
Caching, Batching, and Cost Control
A RAG system that works but costs $40K/month to run isn't a product — here's the concrete cost math and the levers that actually move it.
8 min
Feb 19, 2025
7
Evaluation Harness from Scratch
A RAG pipeline without an evaluation harness is a system you can only improve by accident — here's how to build the infrastructure that makes intentional progress possible.
7 min
Feb 26, 2025
8
Observability for Retrieval
If your RAG system fails silently and you don't know until a user screenshots the bad answer, you don't have observability — here's what to actually instrument.
7 min
Mar 5, 2025
9
Multi-Tenant Isolation
Letting multiple customers share a RAG pipeline is an engineering win until one tenant's data leaks into another's answer — here's how to prevent that.
7 min
Mar 12, 2025
10
Failure Modes and Runbooks
Every RAG system that runs in production will fail in ways you didn't anticipate — here are the incidents that actually happen and how to resolve them at 2am.
8 min
Mar 19, 2025