Authentication for MCP Servers: OAuth, Scoped Tokens, and Audit

Ravinder · 9 min read
MCP · Authentication · OAuth · Security

The default MCP server example in every tutorial is unauthenticated. It binds to localhost, trusts every caller, and returns data freely. That is fine for a local demo. It is a catastrophe in production, where your MCP server sits in front of a database, internal API, or CI system that an LLM agent can now call on behalf of any user.

Authentication for MCP is an unsolved problem in most teams I have spoken to. The protocol itself is transport-agnostic — it does not mandate how you authenticate. That flexibility is a trap: it means every team rolls something ad-hoc and usually gets it wrong. This post is what I wish someone had handed me: a complete auth stack with OAuth 2.1, scoped tokens, transport-layer enforcement, and audit logging, including a threat model that will help you decide how far to go.

The Threat Model First

Before writing a single line of code, get explicit about what you are defending against. MCP's threat surface is different from a REST API because the caller is an autonomous agent that can chain tool calls without human review.

graph LR
  subgraph Attackers
    A1[Prompt Injection]
    A2[Token Theft]
    A3[Over-Privileged Tool]
    A4[Confused Deputy]
  end
  subgraph "MCP Stack"
    T[Transport Layer]
    S[MCP Server]
    B[Backend Service]
  end
  A1 -->|injects instructions into tool output| S
  A2 -->|stolen bearer token reused| T
  A3 -->|LLM calls cancel_all_orders| B
  A4 -->|agent acts on behalf of wrong principal| S

The four threats that keep me up at night:

Prompt injection via tool output. A malicious data source returns text that contains LLM instructions. The model reads them as legitimate commands. Mitigation: treat all tool output as untrusted user data. Never embed tool output verbatim in the system prompt.

Token theft. A long-lived bearer token stored in the client environment leaks. Mitigation: short-lived tokens, refresh token rotation, and binding tokens to client IP or fingerprint.

Over-privileged tools. The agent has a delete_database tool it should never call. It calls it. Mitigation: scope tokens to the minimum set of operations the session needs.

Confused deputy. The agent acts on behalf of User A but accidentally uses a token scoped to User B. Mitigation: bind tokens to subject claims and verify them server-side on every request.
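The confused-deputy mitigation reduces to one check on every request: the verified token's subject must match the user this session was opened for. A minimal sketch, assuming a `Principal` shape like the one this post attaches in middleware:

```typescript
interface Principal {
  sub: string;
  scopes: string[];
}

// Reject any request where the token's subject does not match the user
// the session was opened for -- the confused-deputy check. The expected
// sub comes from server-side session state, never from the client.
function assertActingFor(principal: Principal, expectedSub: string): void {
  if (principal.sub !== expectedSub) {
    throw new Error(
      `confused deputy: token sub ${principal.sub} != session user ${expectedSub}`
    );
  }
}
```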

Transport-Layer Authentication

MCP supports multiple transports. For HTTP-based transports (SSE and Streamable HTTP), authentication lives in the Authorization header. For stdio transports, authentication must happen at the process-spawn level — environment variables or OS-level identity.
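For stdio, one common pattern (not mandated by the spec) is to mint a short-lived token and inject it into the child process environment at spawn time; the server reads it on startup and refuses to serve without it. A sketch, where the variable name `MCP_AUTH_TOKEN` is an assumption of this example:

```typescript
import { spawn, type ChildProcess } from "node:child_process";

// Inherit the parent environment but inject a short-lived token under a
// dedicated variable (MCP_AUTH_TOKEN is a name chosen for this sketch).
function buildServerEnv(token: string): NodeJS.ProcessEnv {
  return { ...process.env, MCP_AUTH_TOKEN: token };
}

// Spawn the stdio MCP server with that environment. The server should
// exit immediately if the variable is missing or the token is invalid.
function spawnMcpServer(command: string, token: string): ChildProcess {
  return spawn(command, {
    env: buildServerEnv(token),
    stdio: ["pipe", "pipe", "inherit"],
  });
}
```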

HTTP Transport: Bearer Tokens

The MCP specification aligns with OAuth 2.1 for HTTP transports. Every request must carry a Bearer token. Implement a middleware that validates it before the MCP handler sees the request.

import express from "express";
import { createRemoteJWKSet, jwtVerify } from "jose";
 
const JWKS = createRemoteJWKSet(
  new URL(`${process.env.AUTH_ISSUER}/.well-known/jwks.json`)
);
 
export async function authMiddleware(
  req: express.Request,
  res: express.Response,
  next: express.NextFunction
) {
  const authHeader = req.headers.authorization;
  if (!authHeader?.startsWith("Bearer ")) {
    res.status(401).json({ error: "missing_token" });
    return;
  }
 
  const token = authHeader.slice(7);
  try {
    const { payload } = await jwtVerify(token, JWKS, {
      issuer: process.env.AUTH_ISSUER,
      audience: process.env.AUTH_AUDIENCE,
    });
 
    // Attach principal to request for downstream use
    (req as any).principal = {
      sub: payload.sub,
      scopes: ((payload.scope as string) ?? "").split(" ").filter(Boolean),
    };
    next();
  } catch (err) {
    res.status(401).json({ error: "invalid_token", detail: String(err) });
  }
}

Apply this middleware before mounting the MCP HTTP handler:

import { StreamableHTTPServerTransport } from "@modelcontextprotocol/sdk/server/streamableHttp.js";
 
const app = express();
app.use(express.json());
app.use("/mcp", authMiddleware);  // auth first
 
const transport = new StreamableHTTPServerTransport({
  sessionIdGenerator: () => crypto.randomUUID(),
});
app.post("/mcp", (req, res) => transport.handleRequest(req, res));
app.listen(3000);

OAuth 2.1 Authorization Code Flow for MCP Clients

When your MCP client is a user-facing application (not a service account), use the Authorization Code flow with PKCE. The client redirects to your authorization server, gets a short-lived access token, and passes it to the MCP server.

sequenceDiagram
  participant U as User
  participant C as MCP Client
  participant AS as Auth Server
  participant M as MCP Server
  U->>C: Open session
  C->>AS: Authorization request + PKCE challenge
  AS->>U: Login & consent
  U->>AS: Approve
  AS->>C: Authorization code
  C->>AS: Code + PKCE verifier → token exchange
  AS->>C: access_token (15 min) + refresh_token
  C->>M: POST /mcp (Bearer access_token)
  M->>M: Verify JWT, extract scopes
  M->>C: MCP response
  C->>AS: Refresh when expired

Key points for this flow in an MCP context:

  • Access token lifetime: 15 minutes maximum. MCP sessions are short; there is no reason for longer-lived tokens.
  • Refresh token rotation: every refresh issues a new refresh token and invalidates the old one. A stolen refresh token used after rotation is immediately detected.
  • PKCE is mandatory. The code_challenge_method must be S256; plain is not acceptable.
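The S256 challenge is nothing exotic: it is the base64url-encoded SHA-256 of the verifier, per RFC 7636. A minimal sketch using Node's built-in crypto module:

```typescript
import { randomBytes, createHash } from "node:crypto";

// Generate a PKCE verifier/challenge pair using S256 (RFC 7636).
// 32 random bytes -> a 43-character base64url verifier, which sits
// inside the 43-128 character range the RFC requires.
function generatePkcePair(): { verifier: string; challenge: string } {
  const verifier = randomBytes(32).toString("base64url");
  const challenge = createHash("sha256").update(verifier).digest("base64url");
  return { verifier, challenge };
}
```

The client sends the challenge in the authorization request and holds the verifier back until the token exchange, so an attacker who intercepts the authorization code alone cannot redeem it.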

Token Scoping: Least Privilege at the Tool Level

OAuth scopes are your primary mechanism for tool-level authorization. Define one scope per logical capability, not per tool. Scopes should map to business operations, not implementation details.

mcp:orders:read       - read order data
mcp:orders:write      - create or update orders
mcp:orders:cancel     - cancel orders (destructive, separate scope)
mcp:inventory:read    - read inventory
mcp:reports:generate  - trigger report generation
mcp:admin             - administrative operations

Enforce scopes server-side in every tool handler. Do not trust the client to only request tools it has access to — the LLM makes that decision at runtime.

function requireScope(scope: string) {
  return function (req: express.Request) {
    const principal = (req as any).principal;
    if (!principal?.scopes?.includes(scope)) {
      throw new McpError(
        ErrorCode.InvalidRequest,
        `Insufficient scope. Required: ${scope}`
      );
    }
  };
}
 
server.tool(
  "cancel_order",
  "[MUTATES] Cancel an order. Requires mcp:orders:cancel scope.",
  { order_id: z.string().uuid() },
  async ({ order_id }, { requestContext }) => {
    requireScope("mcp:orders:cancel")(requestContext.req);
    await orderService.cancel(order_id);
    return { content: [{ type: "text", text: `Order ${order_id} cancelled.` }] };
  }
);

Passing Principal Context Through MCP

The MCP SDK passes a context object to tool handlers. Use a custom context to thread the authenticated principal through every handler without global state.

import { McpServer, RequestHandlerContext } from "@modelcontextprotocol/sdk/server/mcp.js";
 
interface AuthContext extends RequestHandlerContext {
  principal: { sub: string; scopes: string[] };
}
 
// In your transport setup, attach principal from middleware to context
transport.onRequest = (req) => ({
  principal: (req as any).principal,
});

This ensures every tool handler can access context.principal.sub to:

  1. Filter results to only the authenticated user's data.
  2. Write audit log entries with the correct subject.
  3. Enforce row-level security at the database layer.
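For point 1, the filter must key on the verified subject, never on anything the model supplies as a tool argument. A sketch over a hypothetical `Order` row shape (the `owner_sub` column is an assumption of this example):

```typescript
// Hypothetical row shape: owner_sub records the JWT sub of the user
// who created the order.
interface Order {
  id: string;
  owner_sub: string;
  total: number;
}

// Return only rows owned by the authenticated principal. principalSub
// comes from the verified token, not from LLM-controlled arguments.
function visibleOrders(rows: Order[], principalSub: string): Order[] {
  return rows.filter((r) => r.owner_sub === principalSub);
}
```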

Audit Logging

Every tool invocation that touches sensitive data or has side effects must be logged. A good audit log record contains: who, what, when, with what arguments, and the outcome.

interface AuditEntry {
  timestamp: string;     // ISO 8601
  principal: string;     // JWT sub
  tool: string;          // tool name
  args: unknown;         // sanitized input (PII redacted)
  outcome: "success" | "error" | "denied";
  error?: string;
  duration_ms: number;
  session_id: string;
}
 
async function withAudit<T>(
  ctx: AuthContext,
  tool: string,
  args: unknown,
  fn: () => Promise<T>
): Promise<T> {
  const start = Date.now();
  let outcome: AuditEntry["outcome"] = "success";
  let error: string | undefined;
 
  try {
    return await fn();
  } catch (err) {
    outcome = err instanceof McpError && err.code === ErrorCode.InvalidRequest
      ? "denied"
      : "error";
    error = String(err);
    throw err;
  } finally {
    const entry: AuditEntry = {
      timestamp: new Date().toISOString(),
      principal: ctx.principal.sub,
      tool,
      args: redactPii(args),
      outcome,
      error,
      duration_ms: Date.now() - start,
      session_id: ctx.sessionId ?? "unknown",
    };
    await auditLog.write(entry);  // awaited so failed writes surface; buffer/batch in production
  }
}

Ship audit logs to your SIEM (Splunk, Datadog, OpenSearch) and set alerts on:

  • More than N tool invocations per minute from one principal (rate abuse)
  • Any invocation of destructive tools outside business hours
  • Auth failures from the same IP within a short window (credential stuffing)
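In production the third alert lives in your SIEM, but the underlying logic is a sliding-window counter keyed by IP; a self-contained sketch:

```typescript
// Track auth-failure timestamps per IP and flag any IP that exceeds
// `threshold` failures inside `windowMs` -- a credential-stuffing signal.
class FailureWindow {
  private failures = new Map<string, number[]>();

  constructor(private threshold: number, private windowMs: number) {}

  // Record one failure; returns true when the IP should trigger an alert.
  record(ip: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    const recent = (this.failures.get(ip) ?? []).filter((t) => t > cutoff);
    recent.push(now);
    this.failures.set(ip, recent);
    return recent.length > this.threshold;
  }
}
```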

Rate Limiting

Rate limiting unauthenticated traffic by IP is trivially bypassed with proxies and rotating addresses. Token-based rate limiting is the right approach: each JWT sub gets its own quota.

import Bottleneck from "bottleneck";
 
const limiters = new Map<string, Bottleneck>();  // unbounded: evict idle entries in production
 
function getLimiter(sub: string): Bottleneck {
  if (!limiters.has(sub)) {
    limiters.set(sub, new Bottleneck({
      reservoir: 60,              // 60 requests
      reservoirRefreshAmount: 60,
      reservoirRefreshInterval: 60 * 1000,  // per minute
      maxConcurrent: 3,
    }));
  }
  return limiters.get(sub)!;
}
 
// Wrap tool execution
async function rateLimitedExec<T>(sub: string, fn: () => Promise<T>): Promise<T> {
  return getLimiter(sub).schedule(fn);
}

Use Redis-backed rate limiters in multi-instance deployments. An in-process Map only works on a single server.
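One way to make that switch painless is to code against a tiny counter interface: back it with Redis in production (INCR plus EXPIRE on a per-sub, per-window key) and with a Map in tests or single-instance setups. A hedged sketch of a fixed-window variant, assuming that store shape:

```typescript
// Minimal counter abstraction: Redis INCR/EXPIRE in production,
// an in-memory Map for tests or a single instance.
interface CounterStore {
  incr(key: string): Promise<number>;
}

class InMemoryStore implements CounterStore {
  private counts = new Map<string, number>();
  async incr(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
}

// Fixed-window limiter: the key embeds the window index, so counts
// reset every windowMs without cleanup (Redis EXPIRE handles the TTL).
async function allowRequest(
  store: CounterStore,
  sub: string,
  limit: number,
  windowMs: number,
  now: number = Date.now()
): Promise<boolean> {
  const window = Math.floor(now / windowMs);
  const count = await store.incr(`${sub}:${window}`);
  return count <= limit;
}
```

Fixed windows allow a burst of up to 2x the limit at a window boundary; if that matters, swap in a sliding-window or token-bucket algorithm behind the same interface.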

Rotating Secrets Without Downtime

MCP servers run continuously. Rotating signing keys mid-session is a pain point. The solution is key ID (kid) in JWT headers and a JWKS endpoint that serves multiple simultaneous keys.

During rotation:

  1. Generate new key pair, add to JWKS endpoint.
  2. Update auth server to sign new tokens with the new key.
  3. Keep old key in JWKS for the maximum token lifetime (15 minutes).
  4. Remove old key from JWKS only after it has expired from all active tokens.
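The overlap in step 3 is the whole trick: during rotation the JWKS simply serves both public keys, distinguished by kid. A sketch using Node's crypto (Ed25519 for brevity; your auth server may use RSA or EC, and the kid values here are illustrative):

```typescript
import { generateKeyPairSync, type KeyObject } from "node:crypto";

// Build a JWKS document from several public keys, tagging each entry
// with a kid so verifiers can select the right key during overlap.
function buildJwks(keys: { kid: string; publicKey: KeyObject }[]) {
  return {
    keys: keys.map(({ kid, publicKey }) => ({
      ...(publicKey.export({ format: "jwk" }) as Record<string, unknown>),
      kid,
      use: "sig",
    })),
  };
}

// During rotation, both the retiring key and the new key are served.
const oldKey = generateKeyPairSync("ed25519");
const newKey = generateKeyPairSync("ed25519");
const jwks = buildJwks([
  { kid: "2024-05", publicKey: oldKey.publicKey },
  { kid: "2024-06", publicKey: newKey.publicKey },
]);
```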

Your jwtVerify call against the JWKS endpoint handles this automatically because jose matches kid in the token header to the correct public key in the set.

Key Takeaways

  • The MCP protocol does not enforce authentication — you must implement it at the transport layer before any MCP handler runs.
  • Use OAuth 2.1 Authorization Code + PKCE for user-facing clients; use Client Credentials for service-to-service MCP connections.
  • Access tokens should expire in 15 minutes; use refresh token rotation to detect theft without forcing re-login.
  • Scope tokens to logical business operations (not individual tools) and enforce scopes server-side in every tool handler — never rely on the LLM to self-restrict.
  • Audit every tool invocation with principal, tool name, sanitized args, outcome, and duration; ship logs to your SIEM and alert on anomalies.
  • Rotate signing keys via JWKS without downtime by keeping the old key available for one maximum token lifetime after rotation.