
Golden Paths

Ravinder · 5 min read

"Golden path" sounds like marketing. In practice it's a constraint that feels like freedom: instead of a product engineer making 47 decisions before deploying their first service, they make three. The platform has already made the other 44 — and made them well.

The key word is "opinionated." An unopinionated platform is just documentation. The whole point is that the platform bets on a set of defaults so product teams don't have to.

What Makes a Path Golden

A golden path is not a mandate. It is the answer to: "what should I do if I don't have a strong reason to do something else?"

That framing matters because it immediately surfaces two things:

  1. The defaults must be genuinely good — not just the tech the platform team prefers.
  2. There must be a known, non-punishing way to diverge when the defaults don't fit.

A path with no escape hatch is a cage. A path with escape hatches everywhere is no path at all.

graph LR
  A[New Service] --> B{On the golden path?}
  B -- Yes --> C[Use platform template]
  C --> D[Automated CI/CD]
  D --> E[Built-in observability]
  E --> F[Ship it]
  B -- No --> G[Escape hatch process]
  G --> H{Justified?}
  H -- Yes --> I[Custom path, documented exception]
  H -- No --> J[Redirect to golden path]
  I --> F

Designing the Defaults

Start by auditing what your best teams already do. Not what you wish they did — what the ones who ship reliably and have low incident rates actually do. That is your empirical golden path.

Common components:

Runtime. Pick one. Two is survivable. Three starts to fracture knowledge. If your default is "containerised service on Kubernetes," that constrains (helpfully) your CI, your networking, your secrets model, and your rollout strategy.

CI/CD skeleton. A reusable workflow that handles build, test, security scan, and deploy. Teams call the skeleton rather than writing a pipeline from scratch.

# .github/workflows/service-ci.yml — reusable workflow called by product teams
name: Service CI
 
on:
  workflow_call:
    inputs:
      service_name:
        required: true
        type: string
      deploy_env:
        required: false
        type: string
        default: staging
 
jobs:
  build-and-test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
 
      - name: Build image
        run: |
          docker build \
            --label "service=${{ inputs.service_name }}" \
            --label "git-sha=${{ github.sha }}" \
            -t ${{ inputs.service_name }}:${{ github.sha }} .
 
      - name: Run tests
        run: docker run --rm ${{ inputs.service_name }}:${{ github.sha }} make test
 
      - name: Security scan
        # @master is a moving ref; pin to a release tag or commit SHA in real use
        uses: aquasecurity/trivy-action@master
        with:
          image-ref: ${{ inputs.service_name }}:${{ github.sha }}
          severity: HIGH,CRITICAL
          exit-code: 1
 
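  deploy:
    needs: build-and-test
    # deploy.yml is not shown; it must be able to pull this image tag, so a real
    # pipeline needs a push-to-registry step after the build above
    uses: ./.github/workflows/deploy.yml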
    with:
      service_name: ${{ inputs.service_name }}
      image_tag: ${{ github.sha }}
      environment: ${{ inputs.deploy_env }}
    secrets: inherit
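From the product team's side, calling the reusable workflow above might look like the sketch below. The repository `acme/platform-workflows`, the tag `v1`, and the service name are hypothetical placeholders; only the input names come from the workflow itself.

```yaml
# .github/workflows/ci.yml in a product team's repo (hypothetical example)
name: CI

on:
  push:
    branches: [main]

jobs:
  service-ci:
    # Hypothetical platform repo, pinned to a release tag rather than @main
    uses: acme/platform-workflows/.github/workflows/service-ci.yml@v1
    with:
      service_name: payments-api   # hypothetical service name
      deploy_env: staging          # optional; staging is already the default
    secrets: inherit               # only works within the same org or enterprise
```

The whole pipeline is one job of a few lines, which is the point: the team supplies a name and an environment, and the platform owns everything else.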

Observability bootstrap. Every service gets structured logs, a metrics endpoint, and a standard health check. Not optional, just included. We'll cover this in depth in post 7.
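If the default runtime is a containerised service on Kubernetes, the template can bake the health check into the Deployment it generates. This is a minimal sketch, assuming the service answers `/healthz` and exposes `/metrics` on port 8080; the paths, port, image reference, and names are illustrative assumptions, not a prescribed layout.

```yaml
# Deployment generated by the platform template (illustrative)
apiVersion: apps/v1
kind: Deployment
metadata:
  name: payments-api                 # hypothetical service name
spec:
  replicas: 2
  selector:
    matchLabels:
      app: payments-api
  template:
    metadata:
      labels:
        app: payments-api
    spec:
      containers:
        - name: app
          image: registry.example.com/payments-api:GIT_SHA   # placeholder image reference
          ports:
            - name: http             # /metrics is assumed to be served on this port too
              containerPort: 8080
          # Standard health check: the kubelet restarts the container if this fails
          livenessProbe:
            httpGet:
              path: /healthz
              port: http
            periodSeconds: 10
          # Traffic is routed to the pod only once this succeeds
          readinessProbe:
            httpGet:
              path: /healthz
              port: http
            periodSeconds: 5
```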

Secrets access. Workload identity, not static credentials. Post 6 goes deep here, but the golden path should make the secure option the default option.
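In the CI/CD skeleton, one way to get workload identity is to exchange GitHub's OIDC token for short-lived cloud credentials. The sketch below assumes AWS and the `aws-actions/configure-aws-credentials` action; the role ARN and region are placeholders, and other clouds offer equivalent mechanisms.

```yaml
# Deploy job that assumes a role via OIDC instead of using stored access keys (illustrative)
jobs:
  deploy:
    runs-on: ubuntu-latest
    permissions:
      id-token: write   # allows the job to request an OIDC token
      contents: read
    steps:
      - uses: actions/checkout@v4

      - name: Assume deploy role via OIDC
        uses: aws-actions/configure-aws-credentials@v4
        with:
          role-to-assume: arn:aws:iam::123456789012:role/example-deploy-role   # placeholder ARN
          aws-region: eu-west-1                                                # placeholder region

      - name: Verify identity
        run: aws sts get-caller-identity
```

No long-lived key is stored in the repository or its secrets, so there is nothing for a team to rotate or leak.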

Real Examples From the Field

Spotify's "Just Use This" culture. Spotify's platform team built Backstage precisely because they had 2,000 engineers making independent tooling decisions. The golden path was a catalog entry plus a predefined plugin stack. The insight: you don't need everyone on the path, you need enough critical mass that the path is self-reinforcing.

A fintech I worked with built a golden path for "deploy a new API service" that took 47 decisions down to 4: language (Go or Python), database (Postgres or none), async or sync, public or internal. Everything else was pre-decided. Time-to-first-deploy went from 2 weeks to half a day.
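Those four choices fit in a file short enough to read at a glance. The schema below is a hypothetical illustration, not the fintech's actual format; every field name is invented for the example.

```yaml
# service.yaml: hypothetical scaffold input, all field names are illustrative
service:
  name: payments-api     # hypothetical service name
  language: go           # go | python
  database: postgres     # postgres | none
  mode: sync             # sync | async
  exposure: internal     # public | internal
```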

AWS App Runner vs ECS vs EKS. This is a common internal debate. The right answer for a golden path is usually: App Runner for simple HTTP services (zero K8s knowledge needed), ECS Fargate for services needing more control, EKS for teams with specific requirements that justify the complexity. Document which is which, make the right tier the default per service type.

The Escape Hatch Contract

Every escape hatch should have:

  • A documented reason it exists
  • A known owner who is on-call for it
  • A review step (not a ticket — a conversation or an automated check)
  • A sunset plan or a promotion path to the golden path

# exception.yaml: checked into the service repo
exception:
  from: golden-path/container-runtime
  reason: "ML inference requires GPU nodes not in standard node pool"
  owner: "@ml-infra-team"
  approved_by: "@platform-team"
  review_date: "2026-07-01"
  custom_path: "docs/ml-infra-runbook.md"

The review date matters. Exceptions that never expire become undocumented forks. In two years nobody remembers why the ML team runs on its own GPU node pool, and the runbook is stale.
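The review date can be enforced by the automated check mentioned in the list above. This is a minimal sketch of a scheduled workflow, assuming `exception.yaml` sits at the repository root with a quoted `YYYY-MM-DD` `review_date` as in the example; the workflow name and schedule are arbitrary choices.

```yaml
# .github/workflows/exception-review.yml (illustrative; lives in the service repo)
name: Exception review

on:
  schedule:
    - cron: "0 6 * * 1"   # Mondays at 06:00 UTC
  workflow_dispatch:

jobs:
  check-review-date:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Fail if the exception is past its review date
        run: |
          # Pull the quoted YYYY-MM-DD value out of exception.yaml
          review_date=$(sed -nE 's/^[[:space:]]*review_date:[[:space:]]*"([0-9]{4}-[0-9]{2}-[0-9]{2})".*/\1/p' exception.yaml)
          today=$(date -u +%F)
          # ISO dates sort lexicographically, so a string comparison is enough
          if [[ -z "$review_date" || "$today" > "$review_date" ]]; then
            echo "::error file=exception.yaml::Exception review is overdue (review_date: ${review_date:-missing})"
            exit 1
          fi
```

A missing or malformed date is treated as overdue, so an exception cannot dodge review by omitting the field.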

Pitfalls

The golden path nobody uses. If adoption is low, the problem is either discoverability (nobody knows it exists), fit (it doesn't cover the common cases), or trust (teams tried it and got burned). All three are fixable. Mandating adoption before fixing the root cause fixes none of them.

The golden path that's too narrow. If the path covers one language, one database, and one cloud region, it will fit one team perfectly and no one else. Start with the 80% case and extend.

Updating the path without communicating the change. A golden path that silently changes its CI defaults breaks every team on it simultaneously. Version your reusable workflows. Pin to tags, not @main.

Key Takeaways

  • A golden path is opinionated defaults that make the right choice the easy choice — not a mandate, but a strong prior.
  • Defaults should be reverse-engineered from what your highest-performing teams already do, not from what the platform team prefers.
  • Every path needs an escape hatch with documented ownership, a justification, and a review date.
  • Reusable CI/CD workflows, standardised observability, and workload identity are among the most valuable things to put on the golden path.
  • Version your reusable workflows and communicate breaking changes — silent updates to shared infrastructure are a reliability hazard.
  • Low adoption is a signal, not a compliance problem. Debug the path before mandating it.