Indexing Past B-tree
Part 2 →
Locking and Contention
Most engineers reach for a B-tree index reflexively—and for good reason. B-trees handle equality and range queries over orderable types efficiently, cover the majority of OLTP access patterns, and require no configuration beyond the column name. The problem surfaces when your workload drifts outside that comfort zone: full-text search, geometric proximity, sparse high-cardinality columns, monotonically growing timestamps across billions of rows. At that point the B-tree either produces a bloated index that slows writes more than it helps reads, or simply cannot express the query semantics you need.
This post walks through the index types Postgres provides beyond B-tree, with enough mechanical grounding to make them first-class tools rather than last-resort curiosities.
Why Index Internals Matter
Before diving into specific types, it helps to frame what an index actually is in Postgres: a separate on-disk structure that stores a partial projection of table data in a form optimized for a specific class of lookup. Each index type defines its own:
- Access method — how Postgres traverses the structure during scans
- Operator classes — which operators (
=,<@,@@,&&) the index can satisfy - Write amplification — how much extra I/O each INSERT or UPDATE produces
Operator class mismatch is the silent killer: if the operators in your WHERE clause don't match the index's operator class, Postgres will skip the index entirely with no warning.
GIN — Generalized Inverted Index
GIN is the right choice when a single column contains multiple indexable items and queries need to find rows where the column contains one or more of those items. The canonical examples are:
text[]orinteger[]array columns queried with@>,&&tsvectorcolumns for full-text search with@@jsonbcolumns queried with@>or?
GIN stores a posting list per element—every distinct element in the column maps to the set of CTIDs (row pointers) containing it. Lookup is an inverted-index scan: find the element, fetch the posting list, return matching rows.
-- Full-text search: GIN over tsvector
CREATE INDEX idx_articles_fts
ON articles
USING GIN (to_tsvector('english', title || ' ' || body));
-- Query that uses it
SELECT id, title
FROM articles
WHERE to_tsvector('english', title || ' ' || body) @@ to_tsquery('english', 'postgres & indexing');
-- Array containment: tag filtering
CREATE INDEX idx_posts_tags
ON posts
USING GIN (tags);
SELECT id FROM posts WHERE tags @> ARRAY['postgres', 'performance'];Write trade-off. GIN maintains a pending list (a small heap) of new entries and merges them lazily. This keeps write amplification manageable for bulk loads, but vacuum is required to finalize merges. Set gin_pending_list_limit conservatively if vacuum is infrequent.
Covering vs. non-covering. GIN cannot be a covering index—it cannot include extra columns for an index-only scan. If you need to project additional columns, the query must still hit the heap.
GiST — Generalized Search Tree
GiST is a framework, not a single data structure. It exposes hooks (consistent, union, compress, decompress, penalty, picksplit) that extension authors implement to produce balanced tree access methods for arbitrary data types. This is how PostGIS, intarray, and the built-in range types all gain index support.
For backend engineers, the practical GiST cases are:
- Geometric types (
point,circle,polygon) with&&,@>,<->(distance) - Range types (
tsrange,daterange,int4range) with&&and@> - Full-text search (though GIN is usually faster for FTS)
-- Schedule conflict detection with a GiST range index
CREATE TABLE bookings (
id bigserial PRIMARY KEY,
resource_id int NOT NULL,
during tsrange NOT NULL,
EXCLUDE USING GiST (resource_id WITH =, during WITH &&)
);
-- The exclusion constraint implicitly creates a GiST index.
-- This query uses it:
SELECT *
FROM bookings
WHERE resource_id = 42
AND during && '[2025-06-10 09:00, 2025-06-10 17:00)'::tsrange;GiST is lossy in some configurations: the index may return false positives that the executor filters with a recheck. This is expected behavior, not a bug.
BRIN — Block Range INdex
BRIN stores a summary (min, max, and optionally other statistics) per block range—by default 128 pages. It is orders of magnitude smaller than a B-tree (often 1/10,000th the size) and has nearly zero write overhead. The catch: it is only useful when the physical order of rows in the heap correlates with the indexed column's sort order.
The ideal BRIN candidate is an append-only table with a monotonically increasing column: event logs, time-series ingestion tables, audit trails.
-- Timeseries events table - created_at correlates strongly with heap order
CREATE TABLE events (
id bigserial PRIMARY KEY,
created_at timestamptz NOT NULL DEFAULT now(),
event_type text NOT NULL,
payload jsonb
);
CREATE INDEX idx_events_created_brin
ON events
USING BRIN (created_at)
WITH (pages_per_range = 64);
-- Range scan: Postgres reads only the block ranges whose min/max overlap
SELECT * FROM events
WHERE created_at >= '2025-06-01' AND created_at < '2025-06-02';If rows are inserted out of order—say, backfilled historical data—correlation drops and BRIN degrades to a near-full sequential scan. Check correlation with SELECT correlation FROM pg_stats WHERE tablename = 'events' AND attname = 'created_at'. Values above 0.9 are BRIN-friendly.
Partial Indexes
A partial index indexes only the rows satisfying a WHERE predicate. This yields a smaller, faster index for any workload where the query target is a subset of the table.
-- Only index unprocessed jobs - the hot working set
CREATE INDEX idx_jobs_unprocessed
ON jobs (created_at)
WHERE status = 'pending';
-- Query must include the same predicate for Postgres to use the index
SELECT * FROM jobs
WHERE status = 'pending'
AND created_at < now() - interval '5 minutes';
-- Unique constraint on active records only
CREATE UNIQUE INDEX idx_users_active_email
ON users (email)
WHERE deleted_at IS NULL;The planner will use a partial index only when the query's WHERE clause implies (is logically entailed by) the index predicate. The predicate must be identical or more restrictive—Postgres does not perform semantic simplification.
Expression Indexes
An expression index indexes the result of an expression rather than raw column values. This makes function results indexable without materializing them into a column.
-- Case-insensitive email lookup
CREATE INDEX idx_users_email_lower
ON users (lower(email));
SELECT * FROM users WHERE lower(email) = lower('User@Example.com');
-- Computed JSON field - avoids redundant column
CREATE INDEX idx_orders_customer_id
ON orders ((payload->>'customer_id'));
SELECT * FROM orders WHERE payload->>'customer_id' = '12345';
-- Date truncation for day-level bucketing
CREATE INDEX idx_events_day
ON events (date_trunc('day', created_at));
SELECT count(*) FROM events
WHERE date_trunc('day', created_at) = '2025-06-01';Expression indexes must exactly match the expression in the query. lower(email) and LOWER(email) are identical after parsing, but date_trunc('day', created_at) and created_at::date are not—they produce different values for non-midnight timestamps.
Choosing the Right Index Type
Before creating any index: run EXPLAIN (ANALYZE, BUFFERS) on the query, check pg_stat_user_indexes for existing unused indexes, and measure write amplification on a staging system with realistic data volumes.
Key Takeaways
- B-tree covers equality and range on orderable scalars; every other index type fills a distinct gap—not a better B-tree.
- GIN is the correct choice for multi-valued columns (arrays, JSONB, tsvector); expect larger index size and lazy merge overhead.
- GiST enables geometric and range overlaps and powers exclusion constraints; it can be lossy and requires a recheck pass.
- BRIN is tiny and fast for append-only, naturally ordered tables; correlation must be high or it degrades to a sequential scan.
- Partial indexes shrink the index to the working set—pair them with matching WHERE predicates in queries.
- Expression indexes must exactly mirror the expression in the query, including function argument literals.
Part 2 →
Locking and Contention