Indexes, End to End
The most common database performance antipattern is not a missing index—it is a table with fifteen indexes, half of which the planner never uses, all of which slow down every write. Indexes are a read/write tradeoff. Understanding that tradeoff end to end—how each index type is structured, when the planner will use it, and what it costs on writes—is what separates engineers who tune databases from engineers who randomly add indexes until things stop being slow.
How B-tree Indexes Are Structured
The default index type in PostgreSQL, MySQL, and most relational databases is a B-tree. An index is a separate data structure that maps column values to heap row locations (TIDs in PostgreSQL: page number + slot offset).
B-tree index on users(email):
Root: [m@ex.com | z@ex.com]
/ | \
[a@ex.com .. l@ex.com] [m@ex.com .. y@ex.com] [z@ex.com ..]
|
[TID(3,1) → a@ex.com]
[TID(7,4) → b@ex.com]
[TID(1,2) → c@ex.com]
...When you query WHERE email = 'a@example.com':
- The planner traverses the B-tree from root to leaf: O(log N) page reads.
- It finds the matching TID(s).
- It fetches the heap row at that TID.
Step 3—the heap fetch—is the cost that index-only scans eliminate. If the query only needs columns present in the index, the heap fetch is skipped entirely.
Index Types and When to Use Each
PostgreSQL supports multiple index access methods, each optimized for different operators:
| Index Type | Structure | Best For | Not For |
|---|---|---|---|
| B-tree | Balanced tree | =, <, >, BETWEEN, LIKE 'prefix%' |
Full-text, array containment |
| Hash | Hash table | = only |
Range queries |
| GIN | Inverted index | Array containment, full-text search, JSONB | Ordered access |
| GiST | Generalized search tree | Geometric types, range overlap, similarity | High write throughput |
| BRIN | Block range index | Monotonically increasing columns (timestamps) | Random access patterns |
| Bloom | Bit signature filter | Equality on many columns simultaneously | Range queries |
-- GIN index for JSONB queries
CREATE INDEX idx_events_payload ON events USING GIN (payload);
-- Enables: WHERE payload @> '{"type": "click"}'
-- BRIN for time-series (tiny index, good enough for appended data)
CREATE INDEX idx_logs_created USING BRIN (created_at);
-- 128-page blocks; works because new rows always have larger timestamps
-- Partial index: only index rows that match a condition
CREATE INDEX idx_orders_pending
ON orders (created_at)
WHERE status = 'pending';
-- Tiny index, fast for the common query, no overhead for completed ordersThe Write Cost of Indexes
Every index is updated on every write that touches the indexed columns. For a table with 10 indexes receiving 50,000 inserts/second:
Each INSERT:
1 heap page write
+ 10 index page writes (one per index)
= 11 page writes per row
At 50,000 rows/sec = 550,000 page writes/sec
vs 50,000 page writes/sec with no indexesThis is write amplification by index count. It compounds with B-tree page splits under random-key inserts.
-- Find unused indexes (safe to drop candidates)
SELECT
schemaname,
tablename,
indexname,
idx_scan,
pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
WHERE idx_scan = 0
AND indexrelid NOT IN (
SELECT conindid FROM pg_constraint WHERE contype IN ('p','u')
)
ORDER BY pg_relation_size(indexrelid) DESC;
-- idx_scan = 0 since last stats reset → the planner never used this indexDrop unused indexes. They cost writes, consume disk, and the planner must consider them during planning (adding overhead).
Covering Indexes and Index-Only Scans
A covering index includes all columns a query needs, eliminating the heap fetch:
-- Query: SELECT user_id, total FROM orders WHERE status = 'shipped'
-- Without covering index: index scan → TID → heap fetch for each row
-- With covering index (include non-key columns):
CREATE INDEX idx_orders_status_covering
ON orders (status)
INCLUDE (user_id, total);
-- EXPLAIN now shows "Index Only Scan" — no heap access
EXPLAIN SELECT user_id, total FROM orders WHERE status = 'shipped';
-- Index Only Scan using idx_orders_status_covering on orders
-- (cost=0.43..134.22 rows=4210 width=12)Index-only scans are the fastest possible read path. The only overhead is the visibility map check (to verify that the heap page is all-visible, meaning VACUUM has processed it). On a well-vacuumed table, index-only scans avoid all heap I/O.
Composite Index Column Order
For a composite index (a, b, c), the index is useful for queries filtering on:
aalonea, btogethera, b, ctogether
It is NOT useful for queries filtering on b alone or c alone. Column order in a composite index follows the leftmost prefix rule.
-- Index: (region, status, created_at)
-- Useful for:
WHERE region = 'us-east'
WHERE region = 'us-east' AND status = 'active'
WHERE region = 'us-east' AND status = 'active' AND created_at > '2025-01-01'
-- NOT useful for:
WHERE status = 'active' -- skips leading column
WHERE created_at > '2025-01-01' -- skips two leading columnsOrder guideline: equality filters first (high-selectivity columns before low-selectivity), range filter last.
Index Bloat
B-tree indexes accumulate bloat under heavy UPDATE/DELETE load, just like heap tables. Dead index entries remain until VACUUM cleans them. Index bloat degrades scan performance (more pages to read) and wastes disk.
-- Check index bloat using pgstattuple extension
CREATE EXTENSION IF NOT EXISTS pgstattuple;
SELECT * FROM pgstattuple('idx_orders_status_covering');
-- dead_tuple_percent > 10% → consider REINDEX CONCURRENTLYREINDEX CONCURRENTLY rebuilds the index without locking the table—essential for production systems.
Key Takeaways
- Every index is a write tax: each additional index adds one index page write per INSERT/UPDATE/DELETE on the indexed columns; audit and drop unused indexes regularly.
- Index type selection matters: B-tree for ordered comparisons, GIN for array/JSONB/full-text, BRIN for monotonically increasing columns, partial indexes for filtered subsets.
- Covering indexes with
INCLUDEcolumns enable index-only scans, eliminating heap fetches entirely and providing the fastest possible read path. - Composite index column order follows the leftmost prefix rule: equality-filtered columns first, range-filtered columns last.
- Index bloat from DELETE/UPDATE accumulation degrades scan performance; monitor with
pgstattupleand rebuild withREINDEX CONCURRENTLY. - The planner will not use an index if it estimates that the filter is not selective enough (seq scan is cheaper)—stale statistics or missing column statistics are the usual cause.