MCP vs REST for AI Integrations: When Not to Use MCP
The MCP ecosystem has a boosterism problem. Every tutorial frames the question as "how do I build an MCP server?" and skips the prior question: "should I?" There are real situations where wrapping your API in MCP adds complexity without adding value, and shipping the wrong abstraction is worse than shipping nothing. This post is the direct comparison I wish existed when I was making this decision.
I am not arguing that MCP is overhyped. It solves a genuine problem. But "solves a genuine problem" is not the same as "is the right choice for your situation," and the criteria for choosing are not obvious from the protocol specification or its marketing.
What Each Protocol Is Actually Optimized For
REST is optimized for human-authored clients. You define an interface, publish documentation, and humans read the docs to write clients. The client knows the contract at build time. Discovery is a documentation problem, not a runtime problem. This model has worked for decades because humans are good at reading docs and translating them into code.
MCP is optimized for LLM-authored client calls. The model discovers your capabilities at runtime via tools/list, reads machine-readable descriptions, and constructs calls without any pre-existing knowledge of your specific API. Discovery is a runtime protocol concern, not a documentation problem. This is a fundamentally different constraint.
The moment you understand this distinction, the choice criteria become clearer.
The Discovery Argument
REST has no standard discovery mechanism. OpenAPI is widely used but not universal, and even when it exists, an LLM integrating with a REST API needs the spec injected into its context. That costs tokens, requires the spec to be current, and puts the burden of schema interpretation on the model at inference time.
MCP's tools/list response gives the model exactly what it needs to make a call: the name, a natural-language description, and a JSON Schema for the input. Nothing extra. No parsing an OpenAPI document. No guessing about which fields are required versus optional based on documentation prose.
Here is what a tools/list response looks like at the wire level:
```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "tools": [
      {
        "name": "query_orders",
        "description": "Fetch orders for a customer. Returns at most 50 orders, sorted by creation date descending. Use cursor for pagination.",
        "inputSchema": {
          "type": "object",
          "properties": {
            "customerId": {
              "type": "string",
              "description": "The customer UUID, e.g. '550e8400-e29b-41d4-a716-446655440000'"
            },
            "cursor": {
              "type": "string",
              "description": "Pagination cursor from previous response. Omit for first page."
            },
            "status": {
              "type": "string",
              "enum": ["pending", "fulfilled", "cancelled"],
              "description": "Filter by order status. Omit to return all statuses."
            }
          },
          "required": ["customerId"]
        }
      }
    ]
  }
}
```

Compare this to what you would need to inject into an LLM's context to make it call a REST endpoint reliably: the endpoint URL, the HTTP method, the authentication scheme, the request body shape, the error response shape, and enough prose to explain the semantics. That is a significant context budget.
MCP wins on discovery. Full stop.
Stream Semantics: Where MCP Adds Real Value
REST streaming is an afterthought. Server-Sent Events and chunked transfer encoding work, but they are not part of the REST mental model. Every client library handles them differently. Error recovery mid-stream is undefined behavior in most implementations.
MCP's streaming is built into the transport layer. Progress notifications, partial results, and long-running operations are first-class protocol concepts. When an agent triggers a long computation, the server can emit progress notifications while the result is being computed. The client handles this the same way regardless of what tool it called.
```typescript
// MCP server: streaming progress during a long operation
server.setRequestHandler(CallToolRequestSchema, async (request, { sendNotification }) => {
  if (request.params.name === "bulk_export") {
    const { customerId, format } = request.params.arguments as {
      customerId: string;
      format: "csv" | "json";
    };
    const progressToken = request.params._meta?.progressToken;
    await sendNotification({
      method: "notifications/progress",
      params: { progressToken, progress: 0, total: 100 },
    });
    const rows = await fetchAllOrders(customerId);
    await sendNotification({
      method: "notifications/progress",
      params: { progressToken, progress: 50, total: 100 },
    });
    const output = format === "csv" ? toCsv(rows) : JSON.stringify(rows, null, 2);
    await sendNotification({
      method: "notifications/progress",
      params: { progressToken, progress: 100, total: 100 },
    });
    return {
      content: [
        {
          type: "resource",
          resource: {
            // Buffer.from avoids btoa's Latin-1-only restriction in Node
            uri: `data:text/${format};base64,${Buffer.from(output).toString("base64")}`,
            mimeType: `text/${format}`,
          },
        },
      ],
    };
  }
});
```

With REST, you would build this as a polling pattern or a webhook — both of which require the agent to understand your specific conventions. With MCP, the agent already knows how to handle progress notifications because the protocol defines them.
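For contrast, here is a rough sketch of the REST polling equivalent. Everything in it is a convention the client must already know ahead of time: the job-status shape, the terminal states, the poll interval, and the `getJob` fetcher (which would wrap something like `fetch("/jobs/" + id)`) are all illustrative, not part of any protocol.

```typescript
// Hypothetical REST polling loop. Nothing in HTTP tells an agent that this
// is how your long-running exports work; it has to be explained in prose.
type JobStatus = { state: "running" | "done" | "failed"; progress: number; resultUrl?: string };

async function pollUntilDone(
  getJob: () => Promise<JobStatus>, // e.g. () => fetch(`/jobs/${id}`).then((r) => r.json())
  intervalMs = 1000,
  maxAttempts = 60,
): Promise<JobStatus> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const job = await getJob();
    // Terminal states end the loop; anything else means "keep waiting"
    if (job.state === "done" || job.state === "failed") return job;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error("job did not finish within the polling budget");
}
```

Every one of those decisions (which states are terminal, how often to poll, when to give up) is per-API knowledge that MCP's progress notifications standardize away.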
When REST Is the Right Answer
Here is the part most MCP tutorials skip.
Your clients are primarily humans writing code. If your API is consumed mostly by developers building applications — not AI agents orchestrating workflows — REST with OpenAPI is the better investment. The tooling ecosystem (Postman, Swagger UI, code generation, SDK generators) is vastly more mature. MCP's advantage in runtime discovery is irrelevant when the "client" is a developer reading docs.
You need fine-grained HTTP caching. MCP runs over a persistent connection (SSE or stdio). HTTP's cache semantics — ETags, Cache-Control headers, conditional GETs — do not translate cleanly. If your workload is read-heavy with cacheable responses, REST with a CDN in front will outperform an MCP server at a fraction of the operational cost.
Your API has a stable, simple shape. MCP shines when the capability surface is large and varied — dozens of tools, complex schemas, optional parameters everywhere. If you have three endpoints with predictable inputs, the overhead of running an MCP server (process management, transport, capability negotiation) is not justified. A REST API with an OpenAPI spec injected into the model's system prompt is fine.
You are integrating with existing REST infrastructure. If you have an API gateway, rate-limiting middleware, audit logging, and WAF rules all tuned for HTTP, slapping an MCP server in front duplicates that work and splits your security posture. The better move is to make the REST API LLM-friendly (good error messages, predictable schemas, sensible limits) and inject the OpenAPI spec.
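As a concrete sketch of that "inject the spec" alternative: the endpoint, base URL, auth line, and `buildSystemPrompt` helper below are all illustrative, not from any real spec or library.

```typescript
// A trimmed OpenAPI excerpt injected into the system prompt instead of
// running an MCP server. Workable when the surface is small and stable.
const openApiExcerpt = {
  paths: {
    "/orders": {
      get: {
        summary: "Fetch orders for a customer (max 50, newest first)",
        parameters: [
          { name: "customerId", in: "query", required: true, schema: { type: "string" } },
          { name: "cursor", in: "query", required: false, schema: { type: "string" } },
        ],
      },
    },
  },
};

function buildSystemPrompt(spec: object): string {
  return [
    "You can call the following REST API. Base URL: https://api.example.com",
    "Authenticate with: Authorization: Bearer <token>",
    "Endpoints (OpenAPI excerpt):",
    JSON.stringify(spec, null, 2),
  ].join("\n");
}
```

The trade-off is exactly the one described above: this spends context tokens on every request and goes stale the moment the spec does, but it reuses all of your existing HTTP infrastructure.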
The Hybrid Pattern That Actually Works
The real production answer is not either/or. Build a REST API with an OpenAPI spec. Add an MCP server that wraps your REST API for AI clients. The MCP server handles discovery, schema description, and progress notifications. The REST API handles caching, rate limiting, and human client integrations.
```typescript
// mcp-server/src/tools/orders.ts
import { z } from "zod";
import { restClient } from "../rest-client";

export const queryOrdersTool = {
  name: "query_orders",
  description:
    "Fetch orders for a customer. Returns at most 50 orders sorted by creation date descending.",
  inputSchema: {
    type: "object" as const,
    properties: {
      customerId: { type: "string", description: "Customer UUID" },
      status: {
        type: "string",
        enum: ["pending", "fulfilled", "cancelled"],
        description: "Filter by status. Omit for all.",
      },
      cursor: { type: "string", description: "Pagination cursor from previous call." },
    },
    required: ["customerId"],
  },
  handler: async (args: unknown) => {
    const { customerId, status, cursor } = z
      .object({
        customerId: z.string().uuid(),
        status: z.enum(["pending", "fulfilled", "cancelled"]).optional(),
        cursor: z.string().optional(),
      })
      .parse(args);
    // Delegate to the existing REST API — no business logic duplication
    const response = await restClient.get("/orders", {
      params: { customerId, status, cursor },
    });
    const orders = response.data.orders as Array<{
      id: string;
      status: string;
      total: number;
      createdAt: string;
    }>;
    return {
      content: [
        {
          type: "text",
          text: orders
            .map((o) => `Order ${o.id}: ${o.status} — $${o.total} (${o.createdAt})`)
            .join("\n"),
        },
      ],
    };
  },
};
```

This pattern lets you evolve the REST API independently. The MCP server is a thin adaptation layer — it adds description, schema validation, and result formatting, but contains no business logic. When you add a new REST endpoint, you add a corresponding MCP tool. The surface stays in sync.
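To show how thin the adaptation layer stays, here is a minimal dispatch sketch in the shape of the two MCP handlers. The `Tool` type, registry, and function names are our own simplification for illustration, not the SDK's actual API.

```typescript
// A tool registry that backs both handlers: tools/list advertises
// metadata only, tools/call looks up and delegates.
type Tool = {
  name: string;
  description: string;
  inputSchema: object;
  handler: (args: unknown) => Promise<{ content: Array<{ type: string; text: string }> }>;
};

const registry = new Map<string, Tool>();

function register(tool: Tool) {
  registry.set(tool.name, tool);
}

// tools/list: name + description + schema, nothing else crosses the wire
function listTools() {
  return { tools: [...registry.values()].map(({ handler, ...meta }) => meta) };
}

// tools/call: pure lookup and delegation; no business logic lives here
async function callTool(name: string, args: unknown) {
  const tool = registry.get(name);
  if (!tool) throw new Error(`unknown tool: ${name}`);
  return tool.handler(args);
}
```

Registering `queryOrdersTool` from above is then a one-liner, and adding a new REST endpoint means writing one more tool object and one more `register` call.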
Performance Reality Check
MCP has overhead that REST does not. The capability negotiation handshake adds one round trip before any tool call. Over stdio, latency is sub-millisecond; over SSE, you pay network RTT plus HTTP overhead. For interactive agents where the model is thinking between calls, this overhead is invisible. For batch processing pipelines that call tools thousands of times per second, it is not.
If your AI workload looks like: agent thinks → calls one tool → agent thinks → calls next tool, MCP's overhead is negligible. If it looks like: fetch 10,000 records in parallel via tool calls, you should not be using tool calls for that at all — inject the data into context or use a resource endpoint.
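A hedged sketch of that resource-style alternative: the URI scheme and the injected `loadAllOrders` loader below are hypothetical, chosen so the sketch stays self-contained.

```typescript
// Bulk data exposed as a single readable resource instead of thousands of
// tool calls: the client reads one URI and gets the whole dataset.
type OrderRow = { id: string; status: string; total: number };
type Resource = { uri: string; mimeType: string; read: () => Promise<string> };

function makeOrderExportResource(
  loadAllOrders: () => Promise<OrderRow[]>, // hypothetical bulk loader
): Resource {
  return {
    uri: "orders://export/latest", // illustrative URI scheme
    mimeType: "text/csv",
    read: async () => {
      // One read produces the full CSV; no per-record round trips
      const rows = await loadAllOrders();
      return ["id,status,total", ...rows.map((r) => `${r.id},${r.status},${r.total}`)].join("\n");
    },
  };
}
```

The point is the shape, not the format: one coarse read replaces ten thousand fine-grained calls, which is the right trade whenever the model needs the data rather than the act of fetching it.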
Key Takeaways
- MCP's primary advantage is runtime capability discovery — the model learns what tools exist and how to call them without a pre-existing client or injected documentation.
- REST is the right choice when primary consumers are human developers, when you need HTTP caching, or when you already have mature HTTP infrastructure you do not want to duplicate.
- MCP's progress notifications and streaming are first-class protocol features that are genuinely better than ad-hoc REST polling patterns for long-running agent operations.
- The hybrid pattern — REST API for human clients, thin MCP server wrapping it for AI clients — is the production-grade default for most teams.
- Do not put business logic in the MCP server; it is an adaptation and description layer, not a separate application tier.
- If your capability surface is small and stable, an OpenAPI spec injected into the model's system prompt is a perfectly valid substitute for an MCP server.