
MCP for Internal Platforms: Wrapping Jenkins, Buildkite, and Internal CLIs

Ravinder · 9 min read
MCP · DevOps · Internal Tools · AI · Platform Engineering

The engineering platform team is the highest-leverage place to deploy MCP. Jenkins, Buildkite, Backstage, internal CLIs, on-call runbooks — these systems contain the accumulated operational knowledge of your organization. When an LLM agent can read build logs, trigger deploys, query service ownership, and page the right team, you stop paying the tax of context-switching between tools and start having conversations with your infrastructure.

The risk is real and the stakes are high. A poorly scoped MCP server over Jenkins is an AI agent with the ability to cancel every build, modify pipeline configurations, and restart production services. The difference between a transformative internal tool and a liability is RBAC, audit logging, and a thoughtful rollout. This post covers all three, plus the concrete implementation for wrapping Jenkins and Buildkite.

What Internal Platform MCP Looks Like

The goal is an MCP server that acts as a unified gateway to all internal platform APIs. Engineers (and AI agents working on their behalf) interact with one MCP server instead of learning ten internal API formats.

graph TD
  subgraph "Engineer Workstation"
    A[IDE / Claude]
    B[MCP Client]
  end
  subgraph "MCP Gateway"
    C[Auth + RBAC Layer]
    D[Audit Logger]
    E[Jenkins Adapter]
    F[Buildkite Adapter]
    G[CLI Runner]
    H[Backstage Adapter]
  end
  subgraph "Internal Systems"
    I[Jenkins]
    J[Buildkite API]
    K[Internal CLIs]
    L[Backstage API]
  end
  A --> B
  B -->|Bearer token + MTLS| C
  C --> D
  C --> E
  C --> F
  C --> G
  C --> H
  E --> I
  F --> J
  G --> K
  H --> L

The MCP gateway is the choke point. Every request goes through auth and audit before reaching the underlying platform. The underlying systems do not need to be modified — only the adapter layer changes.
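The choke point can be enforced by construction: register every adapter's tools through a single wrapper so a handler cannot reach the underlying system without passing auth and audit first. A minimal sketch, where the `Ctx` shape, `can` checker, and `guarded` helper are assumptions for illustration, not MCP SDK APIs:

```typescript
// Every tool handler in the gateway goes through guarded(), so a
// forgotten permission check is impossible by construction.
interface Ctx {
  principal: string;                                 // authenticated caller
  can: (system: string, action: string) => boolean;  // RBAC check (assumed)
}

type Handler<A, R> = (args: A, ctx: Ctx) => Promise<R>;

function guarded<A, R>(
  toolName: string,
  system: string,
  action: string,
  audit: (entry: { principal: string; tool: string; args: unknown }) => void,
  handler: Handler<A, R>
): Handler<A, R> {
  return async (args, ctx) => {
    if (!ctx.can(system, action)) {
      throw new Error(`${ctx.principal} denied: ${action} on ${system}`);
    }
    // Audit before execution so denied-after-start cases cannot hide
    audit({ principal: ctx.principal, tool: toolName, args });
    return handler(args, ctx);
  };
}
```

Each adapter then registers `guarded(...)` handlers with the MCP server rather than raw functions.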

RBAC: Roles, Not Scopes

OAuth scopes are binary: you either have jenkins:write or you don't. For internal platforms, you need something finer: roles that capture organizational intent and map to sets of permitted operations.

Define roles in a central configuration file that is version-controlled and reviewed:

# roles.yaml
roles:
  engineer:
    description: "Standard engineer access"
    permissions:
      jenkins:
        - read_build
        - read_logs
        - trigger_build:own-team-pipelines
      buildkite:
        - read_pipeline
        - read_build
        - trigger_build:own-team-pipelines
      backstage:
        - read_service
        - read_ownership
 
  platform-on-call:
    description: "On-call platform engineer"
    extends: engineer
    permissions:
      jenkins:
        - read_build
        - read_logs
        - trigger_build:all-pipelines
        - cancel_build:all-pipelines
      buildkite:
        - read_pipeline
        - read_build
        - trigger_build:all-pipelines
        - cancel_build:all-pipelines
 
  platform-admin:
    description: "Platform admin — for break-glass scenarios only"
    extends: platform-on-call
    permissions:
      jenkins:
        - "*"
      buildkite:
        - "*"

Load this at server startup and enforce it in every tool handler:

import yaml from "js-yaml";
import fs from "fs";
 
interface RoleConfig {
  roles: Record<string, {
    description: string;
    extends?: string;
    permissions: Record<string, string[]>;
  }>;
}
 
function loadRoles(path: string): RoleConfig {
  return yaml.load(fs.readFileSync(path, "utf8")) as RoleConfig;
}
 
function resolvePermissions(
  roleName: string,
  config: RoleConfig,
  system: string
): string[] {
  const role = config.roles[roleName];
  if (!role) return [];
 
  const inherited = role.extends
    ? resolvePermissions(role.extends, config, system)
    : [];
  const own = role.permissions[system] ?? [];
  return [...new Set([...inherited, ...own])];
}
 
function hasPermission(
  roleName: string,
  config: RoleConfig,
  system: string,
  action: string,
  resource?: string
): boolean {
  const perms = resolvePermissions(roleName, config, system);
  return perms.some(p => {
    if (p === "*") return true;
    const [permAction, permResource] = p.split(":");
    if (permAction !== action) return false;
    if (!permResource) return true;
    return resource?.startsWith(permResource.replace("*", "")) ?? false;
  });
}

Wrapping Jenkins

Jenkins has a REST API, but it was designed for humans with curl, not LLMs. Wrap it with intent-aligned tools.

import { McpServer } from "@modelcontextprotocol/sdk/server/mcp.js";
import { z } from "zod";
import axios from "axios";
 
const server = new McpServer({ name: "internal-platform", version: "1.0.0" });
 
const jenkins = axios.create({
  baseURL: process.env.JENKINS_URL,
  auth: {
    username: process.env.JENKINS_USER!,
    password: process.env.JENKINS_API_TOKEN!,
  },
});
 
server.tool(
  "get_build_status",
  "Get the status of a Jenkins build by job name and build number. Use 'lastBuild' as build_number for the most recent build.",
  {
    job_name: z.string().describe("Full job path, e.g. 'my-team/service-name/main'"),
    build_number: z.union([z.number().int().positive(), z.literal("lastBuild")]),
  },
  async ({ job_name, build_number }, ctx) => {
    assertPermission(ctx, "jenkins", "read_build");
 
    const path = job_name.split("/").map(p => `job/${p}`).join("/");
    const res = await jenkins.get(`/${path}/${build_number}/api/json`);
    const build = res.data;
 
    return {
      content: [{
        type: "text",
        text: JSON.stringify({
          number: build.number,
          status: build.result ?? (build.building ? "RUNNING" : "UNKNOWN"),  // result is null while building
          duration_ms: build.duration,
          url: build.url,
          started_at: new Date(build.timestamp).toISOString(),
          causes: build.actions
            ?.find((a: any) => a._class?.includes("CauseAction"))
            ?.causes?.map((c: any) => c.shortDescription) ?? [],
        }),
      }],
    };
  }
);
 
server.tool(
  "get_build_log_tail",
  "Get the last N lines of a Jenkins build log. Use this to diagnose build failures.",
  {
    job_name: z.string(),
    build_number: z.union([z.number().int().positive(), z.literal("lastBuild")]),
    lines: z.number().int().min(10).max(500).default(100),
  },
  async ({ job_name, build_number, lines }, ctx) => {
    assertPermission(ctx, "jenkins", "read_logs");
 
    const path = job_name.split("/").map(p => `job/${p}`).join("/");
    const res = await jenkins.get(`/${path}/${build_number}/logText/progressiveText`, {
      params: { start: 0 },
    });
 
    const logLines = (res.data as string).split("\n");
    const tail = logLines.slice(-lines).join("\n");
 
    return { content: [{ type: "text", text: tail }] };
  }
);
 
server.tool(
  "trigger_build",
  "[MUTATES] Trigger a Jenkins job. Optionally pass build parameters as key-value pairs.",
  {
    job_name: z.string(),
    parameters: z.record(z.string()).optional().describe("Build parameters"),
  },
  async ({ job_name, parameters }, ctx) => {
    assertPermission(ctx, "jenkins", "trigger_build", job_name);
    await auditLog(ctx, "jenkins.trigger_build", { job_name, parameters });
 
    const path = job_name.split("/").map(p => `job/${p}`).join("/");
    const endpoint = parameters
      ? `/${path}/buildWithParameters`
      : `/${path}/build`;
 
    const res = await jenkins.post(endpoint, null, {
      params: parameters,
    });
 
    const queueUrl = res.headers["location"];
    return {
      content: [{
        type: "text",
        text: `Build triggered for ${job_name}. Queue item: ${queueUrl}`,
      }],
    };
  }
);
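The `assertPermission` and `auditLog` helpers used in the handlers above are referenced but not defined; a minimal sketch, where the `Ctx` shape is an assumption and the in-memory sink stands in for a durable log shipper:

```typescript
// Sketch only: wire `can` to your role resolution and replace the
// in-memory auditSink with a durable, append-only log shipper.
interface Ctx {
  principal: string;  // e.g. resolved from the bearer token
  role: string;       // e.g. "engineer"
  can: (system: string, action: string, resource?: string) => boolean;
}

function assertPermission(
  ctx: Ctx, system: string, action: string, resource?: string
): void {
  if (!ctx.can(system, action, resource)) {
    throw new Error(`Role '${ctx.role}' lacks '${action}' on '${system}'`);
  }
}

interface AuditEntry {
  ts: string;
  principal: string;
  role: string;
  event: string;
  details: Record<string, unknown>;
}

const auditSink: AuditEntry[] = [];  // stand-in for a real log pipeline

async function auditLog(
  ctx: Ctx, event: string, details: Record<string, unknown>
): Promise<void> {
  auditSink.push({
    ts: new Date().toISOString(),
    principal: ctx.principal,
    role: ctx.role,
    event,
    details,
  });
}
```

Note that `auditLog` records the principal and role alongside the event, which is exactly what the audit requirements in the takeaways below depend on.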

Wrapping Buildkite

Buildkite has a modern REST API. The challenge is that pipeline slugs and organization slugs are opaque strings that the LLM must discover before it can act.

const buildkite = axios.create({
  baseURL: "https://api.buildkite.com/v2",
  headers: { Authorization: `Bearer ${process.env.BUILDKITE_API_TOKEN}` },
});
 
// Resource: list pipelines so the agent can discover slugs
server.resource(
  "buildkite-pipelines",
  "buildkite://pipelines",
  { mimeType: "application/json" },
  async () => {
    const org = process.env.BUILDKITE_ORG!;
    const res = await buildkite.get(`/organizations/${org}/pipelines`, {
      params: { per_page: 100 },
    });
 
    return {
      contents: [{
        uri: "buildkite://pipelines",
        mimeType: "application/json",
        text: JSON.stringify(
          res.data.map((p: any) => ({
            slug: p.slug,
            name: p.name,
            description: p.description,
            default_branch: p.default_branch,
            uri: `buildkite://pipelines/${p.slug}`,
          }))
        ),
      }],
    };
  }
);
 
server.tool(
  "get_buildkite_build",
  "Get the latest build for a Buildkite pipeline, optionally filtered to a specific branch.",
  {
    pipeline_slug: z.string(),
    branch: z.string().optional(),
  },
  async ({ pipeline_slug, branch }, ctx) => {
    assertPermission(ctx, "buildkite", "read_build");
    const org = process.env.BUILDKITE_ORG!;
    const res = await buildkite.get(
      `/organizations/${org}/pipelines/${pipeline_slug}/builds`,
      { params: { branch, per_page: 1 } }
    );
    const build = res.data[0];
    if (!build) {
      return { content: [{ type: "text", text: "No builds found." }] };
    }
    return {
      content: [{
        type: "text",
        text: JSON.stringify({
          number: build.number,
          state: build.state,
          branch: build.branch,
          commit: build.commit,
          created_at: build.created_at,
          finished_at: build.finished_at,
          url: build.web_url,
          creator: build.creator?.name,
        }),
      }],
    };
  }
);
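A trigger counterpart follows the same pattern as Jenkins. Buildkite's create-build endpoint (`POST /organizations/{org}/pipelines/{slug}/builds`) takes the commit, branch, and message in the request body; the `makeTriggerBody` helper below is a hypothetical name, kept pure so the payload can be tested apart from the HTTP call:

```typescript
// Hypothetical helper: builds the body for Buildkite's create-build
// endpoint. Keeping it pure makes the mutating tool easy to unit-test.
interface BuildkiteTriggerBody {
  commit: string;
  branch: string;
  message: string;
}

function makeTriggerBody(
  branch: string,
  commit = "HEAD",            // Buildkite resolves HEAD to the branch tip
  message?: string
): BuildkiteTriggerBody {
  return {
    commit,
    branch,
    message: message ?? `Triggered via MCP (${branch}@${commit})`,
  };
}

// In the tool handler (after assertPermission and auditLog, as with Jenkins):
//   const res = await buildkite.post(
//     `/organizations/${org}/pipelines/${pipeline_slug}/builds`,
//     makeTriggerBody(branch)
//   );
//   // res.data.web_url links to the newly created build
```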

Wrapping Internal CLIs with Sandboxing

Internal CLIs are the most dangerous surface to expose. A tool that runs arbitrary shell commands is an RCE vector. You must sandbox the CLI execution.

Use an allowlist of permitted commands and arguments. Never pass raw LLM output to exec or spawn.

import { spawn } from "child_process";
 
const ALLOWED_COMMANDS: Record<string, {
  executable: string;
  allowed_subcommands: string[];
  max_runtime_ms: number;
}> = {
  kubectl: {
    executable: "/usr/local/bin/kubectl",
    allowed_subcommands: ["get", "describe", "logs", "rollout status"],
    max_runtime_ms: 30_000,
  },
  terraform: {
    executable: "/usr/local/bin/terraform",
    allowed_subcommands: ["plan", "output", "state list"],
    max_runtime_ms: 120_000,
  },
};
 
async function runSandboxedCLI(
  command: string,
  subcommand: string,
  args: string[]
): Promise<string> {
  const config = ALLOWED_COMMANDS[command];
  if (!config) throw new Error(`Command '${command}' not in allowlist`);
 
  // Require an exact (word-for-word) subcommand match. A prefix check like
  // subcommand.startsWith(s) would also admit "get-contexts" under "get".
  const subTokens = subcommand.trim().split(/\s+/);
  const isAllowed = config.allowed_subcommands.includes(subTokens.join(" "));
  if (!isAllowed) throw new Error(`Subcommand '${subcommand}' not permitted for ${command}`);
 
  // Sanitize args: no shell metacharacters. spawn without { shell: true }
  // never invokes a shell, so this is defense in depth.
  const safeArgs = args.map(a => a.replace(/[;&|`$()\\]/g, ""));
 
  return new Promise((resolve, reject) => {
    // Multi-word subcommands like "rollout status" must be separate argv entries
    const proc = spawn(config.executable, [...subTokens, ...safeArgs], {
      timeout: config.max_runtime_ms,
      env: { PATH: "/usr/local/bin:/usr/bin:/bin" },  // minimal PATH
    });
 
    let stdout = "";
    let stderr = "";
    proc.stdout.on("data", d => (stdout += d));
    proc.stderr.on("data", d => (stderr += d));
    proc.on("close", code => {
      if (code !== 0) reject(new Error(`Command failed (${code}): ${stderr.slice(0, 500)}`));
      else resolve(stdout.slice(0, 50_000));  // cap output size
    });
  });
}

The minimal PATH, argument sanitization, command allowlist, hard timeout, and output cap are all non-negotiable for production use.
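To see what the sanitization step actually buys, here is the same strip applied to hostile arguments. This is a standalone sketch; note that `spawn` without `{ shell: true }` never invokes a shell, so metacharacters are already inert and the strip is defense in depth rather than the primary barrier:

```typescript
// Same regex as the sanitization step in runSandboxedCLI above.
const stripShellMeta = (a: string): string => a.replace(/[;&|`$()\\]/g, "");

// Injection attempts degrade into harmless (if odd-looking) tokens:
stripShellMeta("pod-name; rm -rf /");  // -> "pod-name rm -rf /"
stripShellMeta("$(whoami)");           // -> "whoami"
stripShellMeta("a && b | c");          // -> "a  b  c"
```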

Rollout Strategy

Exposing internal platforms to LLM agents is a gradual process. Do not flip it on for everyone on day one.

gantt
  title MCP Internal Platform Rollout
  dateFormat YYYY-MM-DD
  section Phase 1 - Shadow Mode
  Read-only tools, platform team only :2026-02-01, 14d
  section Phase 2 - Pilot
  Read-only + trigger for 5 teams :2026-02-15, 21d
  section Phase 3 - Controlled Expansion
  Mutating tools with approval workflow :2026-03-08, 30d
  section Phase 4 - General Availability
  All tools, all engineers :2026-04-07, 30d

Phase 1 — Shadow mode. Deploy read-only tools to the platform team only. Log every tool call with the full request and response. Review logs weekly. Fix misunderstood descriptions and bad error messages.

Phase 2 — Pilot. Enable triggering (mutating) tools for five volunteer teams. Require a "dry run" flag on all mutating tools so engineers can see what would happen before committing.

Phase 3 — Controlled expansion. Roll out to more teams with an approval workflow gate on high-impact tools (cancel_build, rollback_service). Use a Slack DM or PR comment to ask for human confirmation before the MCP server executes the call.

Phase 4 — General availability. All tools available to all engineers. Monitor anomaly metrics weekly. Keep the admin role break-glass and require two-person authorization.
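The dry-run flag from Phase 2 and the approval gate from Phase 3 compose naturally into one wrapper around every mutating tool. A sketch, where `ApprovalFn` and `gatedMutation` are assumed names for illustration, with the approval backed by a Slack DM or PR comment as described above:

```typescript
// Combines the Phase 2 dry-run flag with the Phase 3 human-approval gate.
// ApprovalFn is a stand-in for a Slack DM or PR-comment confirmation.
type ApprovalFn = (summary: string) => Promise<boolean>;

async function gatedMutation(
  summary: string,              // human-readable description of the effect
  dryRun: boolean,
  approve: ApprovalFn,
  execute: () => Promise<string>
): Promise<string> {
  if (dryRun) return `[dry-run] Would: ${summary}`;
  if (!(await approve(summary))) {
    return `Approval denied: ${summary}`;
  }
  return execute();
}
```

A `cancel_build` handler, for example, would pass its job name and build number into `summary` and only reach the Jenkins API inside `execute`.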

Key Takeaways

  • Internal platform MCP is highest-leverage but highest-risk: wrap Jenkins, Buildkite, and CLIs through a single MCP gateway with auth and audit as the mandatory choke point.
  • Use role-based permissions defined in a version-controlled YAML file rather than OAuth scopes alone; roles capture organizational intent (e.g., "on-call engineer") better than flat permission lists.
  • Expose pipeline and service catalogues as resources so the agent can discover slugs and names before invoking tools — this eliminates a class of hallucinated identifiers.
  • Sandbox CLI execution with an explicit allowlist of commands and subcommands, a minimal PATH, argument sanitization, a hard timeout, and output size caps — never pass raw LLM-generated strings to spawn.
  • Roll out in phases: read-only first, then mutating with dry-run flags, then mutating with approval workflows, and only then full general availability; each phase requires a log review before advancing.
  • Audit every mutating tool call with principal, role, tool name, sanitized arguments, and outcome; alert on mutating calls outside business hours and on more than N calls per minute from any single principal.