API Design Mastery

Pagination and Filtering

Ravinder·October 22, 2025·4 min read

API DesignRESTGraphQLgRPCPagination

Series

API Design Mastery

Part 4 of 10

← Part 3

Error Design

Part 5 →

Auth and Scopes

A list endpoint that returns all records is a timebomb. It works fine in development with ten rows and detonates in production with ten million. Pagination, filtering, and sorting are not optional features — they are the contract that lets your API scale independently of the data it serves. Getting this contract wrong means breaking clients when you try to fix it later.

Offset Pagination: Simple but Fragile

Offset pagination is the default because it maps directly to SQL LIMIT and OFFSET:

GET /orders?page=3&pageSize=25 HTTP/1.1

{
  "data": [...],
  "pagination": {
    "page": 3,
    "pageSize": 25,
    "totalItems": 1847,
    "totalPages": 74
  }
}

This works until:

A record is inserted between page 2 and page 3 requests — the client sees a duplicate or skips a record.
OFFSET 50000 on a million-row table forces the database to scan 50,000 rows before returning 25. Performance degrades linearly with page depth.
You try to parallelize fetching — pages shift under concurrent writes.

Use offset pagination only for human-facing UIs where users jump to page numbers and the dataset is relatively stable (reports, admin dashboards).

Cursor-Based Pagination: Stable and Scalable

A cursor encodes a position in the dataset, not an absolute offset. The database can seek directly to that position with an index.

GET /orders?limit=25 HTTP/1.1

{
  "data": [
    {"id": "ord_A", "createdAt": "2025-10-01T10:00:00Z"},
    ...
    {"id": "ord_Y", "createdAt": "2025-10-01T09:00:00Z"}
  ],
  "pagination": {
    "nextCursor": "eyJpZCI6Im9yZF9ZIiwiY3JlYXRlZEF0IjoiMjAyNS0xMC0wMVQwOTowMDowMFoifQ==",
    "hasMore": true
  }
}

GET /orders?limit=25&cursor=eyJpZCI6Im9yZF9ZIiwiY3JlYXRlZEF0IjoiMjAyNS0xMC0wMVQwOTowMDowMFoifQ== HTTP/1.1

The cursor is an opaque, base64-encoded value. Internally it might be {"id":"ord_Y","createdAt":"2025-10-01T09:00:00Z"}, which translates to a SQL keyset query:

SELECT * FROM orders
WHERE (created_at, id) < ('2025-10-01T09:00:00Z', 'ord_Y')
ORDER BY created_at DESC, id DESC
LIMIT 25;

This query uses an index and costs the same regardless of how deep into the dataset you are.

sequenceDiagram participant Client participant API participant DB Client->>API: GET /orders?limit=25 API->>DB: SELECT ... LIMIT 25 DB-->>API: rows + last row position API-->>Client: data + nextCursor Client->>API: GET /orders?limit=25&cursor= API->>DB: SELECT ... WHERE (created_at,id) < (pos) LIMIT 25 DB-->>API: rows (index seek, O(1)) API-->>Client: data + nextCursor

Trade-off: cursors do not support random page access ("jump to page 50"). If you need both, expose two endpoints or make the pagination style a query parameter.

Filtering

A well-designed filter interface is composable, predictable, and documented.

Simple equality filters as query parameters work for most cases:

GET /orders?status=shipped&customerId=cust_123 HTTP/1.1

For range and comparison operators, use a consistent suffix or bracket notation:

GET /orders?createdAt[gte]=2025-10-01&createdAt[lt]=2025-11-01
GET /orders?amount[gt]=100&amount[lte]=500

For full-featured filter expressions, consider a structured query string or a request body (POST to a search endpoint):

POST /orders/search HTTP/1.1
Content-Type: application/json
 
{
  "filter": {
    "and": [
      {"field": "status", "op": "in", "value": ["shipped", "delivered"]},
      {"field": "amount", "op": "gte", "value": 100},
      {"field": "customer.country", "op": "eq", "value": "US"}
    ]
  },
  "sort": [{"field": "createdAt", "direction": "desc"}],
  "limit": 50
}

Documenting filterable fields explicitly prevents clients from trying to filter on unindexed columns and blowing up your database.

Sorting

Expose sort as a field name with optional direction:

GET /orders?sort=-createdAt,amount HTTP/1.1

Convention: prefix - for descending. No prefix means ascending. Multiple sort fields are comma-separated.

Always include a tiebreaker. Sorting by createdAt alone is non-deterministic when two records share the same timestamp. Add id as a secondary sort:

ORDER BY created_at DESC, id DESC

Match this in your cursor encoding — the cursor must capture all sort fields to remain stable.

Response Envelope for Collections

Standardize your list response shape:

{
  "data": [...],
  "pagination": {
    "nextCursor": "...",
    "prevCursor": "...",
    "limit": 25,
    "hasMore": true
  },
  "meta": {
    "totalCount": 1847,
    "filteredCount": 312
  }
}

totalCount is expensive — avoid it in cursor-based APIs unless the client genuinely needs it. If you must provide it, cap the count estimate at a threshold (e.g., return "totalCount": ">10000") to avoid full table scans.

Key Takeaways

Use offset pagination only for stable, human-navigable datasets; use cursor-based (keyset) pagination for anything at scale or that requires real-time consistency.
Encode cursors as opaque base64 strings — their internal structure is an implementation detail, not a client contract.
Support filter operators via a consistent convention (bracket notation or structured POST body) and document which fields are filterable.
Always include a tiebreaker field in your sort order to guarantee deterministic pagination.
Do not return unbounded totalCount from cursor-based endpoints — it forces full table scans; prefer capped estimates or omit entirely.
Match your cursor encoding to all active sort fields, or page boundaries will drift when sorting changes.

Series

API Design Mastery

Part 4 of 10

← Part 3

Error Design

Part 5 →

Auth and Scopes