The Cost of a Microservice
Microservices Have a Price Tag Nobody Advertises
When teams adopt microservices, they budget for compute. They do not budget for everything else. Six months in, engineers are drowning in runbooks, on-call pages span three services for one incident, and a two-line change requires coordinating four deployments. The compute cost was the smallest item on the bill.
This post puts real numbers on the costs that do not appear in your AWS invoice: the per-service overhead for CI/CD, observability, on-call rotation, and inter-service dependencies. Then it makes the case for when those costs are worth it — and when they are not.
The Unit of Cost Is the Service, Not the Feature
Most cost analyses treat microservices as a cloud-bill problem. That is wrong. The dominant cost is engineering time, and it scales with the number of services, not with traffic.
Each service you add to your architecture requires:
- A CI/CD pipeline it owns
- An observability baseline (metrics, logs, traces)
- A deploy runbook
- An on-call owner
- Dependency management for everything it calls
- An API contract that other services depend on
None of these are one-time costs. They are recurring. A service you stopped actively developing still needs patching, still pages someone when it degrades, still needs its dependencies upgraded when there is a CVE.
Per-Service CI/CD Cost
A minimal CI/CD pipeline for a microservice takes roughly half a day to set up correctly. That includes the pipeline config, the Docker build, the deployment manifest, the smoke test that runs post-deploy, and the rollback mechanism.
But setup cost is not the real number. Maintenance is.
Every time you update your base image, your build system, your Kubernetes version, or your secret management approach, you update it N times — once per service. A platform team that maintains that infrastructure for 30 services is doing 30x the work of a platform team maintaining one monolith.
The mitigation is a golden path: a shared pipeline template that services inherit from. This works until a service has a legitimate reason to deviate, at which point you are debugging a customized fork of the template.
```yaml
# Golden path pipeline template - teams should not fork this
# unless they have a documented reason
name: standard-service-ci
on: [push, pull_request]
jobs:
  build-test-push:
    uses: ./.github/workflows/standard-build.yml
    with:
      service-name: ${{ vars.SERVICE_NAME }}
      registry: ${{ vars.REGISTRY }}
    secrets: inherit
```

A realistic cost estimate: 0.1 FTE per service per year for pipeline maintenance. At 20 services, that is 2 FTE. Most teams do not account for this at all.
Observability Tax
A single service in production needs, at minimum:
- Metrics: request rate, error rate, latency (P50/P95/P99), resource utilisation
- Structured logs: trace IDs included so you can correlate across services
- Distributed traces: spans emitted for every outbound call
- Alerts: on error rate, latency degradation, and resource saturation
Each of these requires setup, calibration, and ongoing tuning. Alerts that fire too often get ignored. Alerts that fire too rarely miss real incidents. Finding the right thresholds takes time, and those thresholds change as traffic patterns evolve.
```python
# Every service should emit this baseline — no exceptions
from fastapi import FastAPI
from opentelemetry import metrics
from opentelemetry.instrumentation.fastapi import FastAPIInstrumentor

app = FastAPI()

# Automatic instrumentation covers the basics
FastAPIInstrumentor.instrument_app(app)

# But business-level metrics require manual instrumentation
meter = metrics.get_meter(__name__)
orders_processed = meter.create_counter(
    name="orders.processed",
    description="Total orders processed",
    unit="1",
)

# You cannot alert on what you do not measure
payment_latency = meter.create_histogram(
    name="payment.duration",
    description="Payment processing latency",
    unit="ms",
)
```

In a monolith, a single instrumentation effort covers the whole system. In a microservices system, each service needs its own instrumentation, its own dashboards, and its own alert definitions. The per-service cost is real.
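What that calibration work looks like is worth making concrete. One common approach from the SRE literature derives paging thresholds from an SLO via error-budget burn rates instead of hand-picking a raw error percentage. A minimal sketch; the 99.9% SLO, the window pair, and the 14.4 threshold are illustrative assumptions, not recommendations:

```python
# Illustrative threshold math for a multiwindow burn-rate alert.
# All numbers here are assumptions; every service needs its own calibration.

SLO = 0.999                 # assumed 99.9% availability target
ERROR_BUDGET = 1 - SLO      # 0.1% of requests may fail

def burn_rate(error_rate: float) -> float:
    """How fast the service consumes its error budget.
    1.0 means exactly on budget; 14.4 exhausts a 30-day budget in ~2 days."""
    return error_rate / ERROR_BUDGET

def should_page(long_window_error_rate: float,
                short_window_error_rate: float,
                threshold: float = 14.4) -> bool:
    # Both windows must be burning hot: the long window filters out blips,
    # the short window confirms the problem is still happening right now.
    return (burn_rate(long_window_error_rate) >= threshold
            and burn_rate(short_window_error_rate) >= threshold)

print(should_page(0.02, 0.02))  # True: burn rate ~20 exceeds 14.4
```

Even with a formula, the windows and threshold still need re-tuning as traffic patterns shift. That is exactly the recurring work this section is pricing.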
On-Call and Incident Cost
A microservices incident almost never involves exactly one service. It involves a cascade: Service A is slow because Service B is timing out because Service C has a saturated database connection pool.
Diagnosing this cascade requires:
- Knowing that Service A is the entry point (user-visible symptom)
- Following the trace to Service B (requires distributed tracing)
- Understanding that Service B's owner is a different team (requires org chart knowledge)
- Getting Service C's owner on a call (if it is after hours, that is a third person paged)
This is the hidden cost of distributed ownership. An incident that would take 20 minutes to diagnose in a monolith takes 90 minutes across three teams with three separate runbooks.
A rough cost estimate: each inter-service boundary adds 15–20 minutes to the mean time to resolution for incidents that cross it. A system with 5 services in the critical path has 4 boundaries. That is up to 80 additional minutes per incident.
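As a sanity check, the arithmetic is simple enough to write down. A sketch using the post's own per-boundary estimate, which is a rough figure rather than a measured constant:

```python
# Back-of-envelope MTTR overhead for an incident that crosses
# every boundary in the critical path. The 15-20 minute range is
# this post's estimate; substitute your own incident data.

def added_mttr_minutes(services_in_critical_path: int,
                       per_boundary_min: float = 15.0,
                       per_boundary_max: float = 20.0) -> tuple[float, float]:
    boundaries = max(services_in_critical_path - 1, 0)
    return (boundaries * per_boundary_min, boundaries * per_boundary_max)

print(added_mttr_minutes(5))  # (60.0, 80.0): up to 80 extra minutes
```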
The Dependency Upgrade Math
Every service depends on other services, and those services change. When Service B releases a breaking change, every service that calls it must be updated. In a monolith, this is a refactor inside one codebase. In a microservices system, it is a coordinated multi-team effort.
The math compounds. If Service A calls B and C, and B calls D and E, a change to D may require:
- D to release a new version
- B to update its D dependency and release a new version
- A to update its B dependency
- All consumers of A to test that the transitive change has no impact
This is a dependency graph problem, and it grows worse as the graph deepens and broadens.
```
Monolith:       1 change, 1 refactor, 1 deploy
Microservices:  1 change → fan-out across the dependency tree
```

The mitigation is API versioning with explicit deprecation windows. But versioning has its own overhead: every service must support multiple versions of its API until all consumers have migrated, which can take quarters.
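The fan-out is easy to compute once you have the call graph. A minimal sketch, using the hypothetical A-through-E topology from above, that lists every service forced to update, retest, and re-release when one service ships a breaking change:

```python
from collections import deque

# Hypothetical call graph: edges point from caller to callee.
CALLS = {
    "A": ["B", "C"],
    "B": ["D", "E"],
    "C": [],
    "D": [],
    "E": [],
}

def affected_by(changed: str) -> list[str]:
    """Every service that transitively depends on `changed`,
    i.e. the coordination surface of one breaking change."""
    # Invert the graph: who calls whom.
    callers: dict[str, list[str]] = {s: [] for s in CALLS}
    for caller, callees in CALLS.items():
        for callee in callees:
            callers[callee].append(caller)
    # Walk upward from the changed service.
    seen, queue = set(), deque([changed])
    while queue:
        svc = queue.popleft()
        for caller in callers[svc]:
            if caller not in seen:
                seen.add(caller)
                queue.append(caller)
    return sorted(seen)

print(affected_by("D"))  # ['A', 'B']: both must update, test, and release
```

In a real system you would derive the graph from a service mesh or tracing data rather than hardcoding it, but the shape of the problem is the same: every extra edge multiplies the releases one change can force.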
When the Cost Is Worth It
Microservices pay off when the problems they solve are real and significant. The conditions:
Independent scaling requirements. If your image resizing service needs 10x the compute of your auth service at peak, putting them in the same process means you overprovision auth to satisfy image resizing. This is real money.
Different deployment cadences. A reporting service that changes quarterly and a real-time pricing service that deploys dozens of times per day are genuinely different things. Coupling their deploys creates friction in both directions.
Team autonomy at scale. When you have 5+ teams working on the same codebase, merge conflicts, diluted ownership, and coordination overhead become real costs. Independently deployable units reduce that coupling — but only if teams genuinely own their services end-to-end.
Technology heterogeneity. A machine learning pipeline has different runtime requirements from a web API. Sometimes you genuinely need different languages or runtimes.
None of these conditions apply to most companies below 50 engineers. And even above 50, they apply to a subset of services — not to the whole system.
The Monolith Comparison
A well-structured monolith deployed to production has:
- One CI/CD pipeline
- One observability setup
- One on-call rotation with full context
- One deploy runbook
- Zero cross-service network calls for in-process operations
The engineering overhead is dramatically lower. The trade-off is coupling: a slow function in one module can affect the whole process, and a deploy of any feature deploys all features.
That trade-off is acceptable for most systems most of the time. The coupling is manageable with good code boundaries and fast test suites. The shared-deploy risk is manageable with feature flags and progressive rollouts.
The question is not "monolith vs microservices." The question is: "Which specific problems do I have that require distributing this system across process boundaries?" If you cannot name those problems concretely, you probably should not pay the microservices tax.
A Cost Accounting Framework
Before splitting a service out of an existing system, price it honestly:
| Cost Item | Estimate |
|---|---|
| CI/CD pipeline setup | 0.5 engineer-days |
| Observability setup + initial alerts | 1 engineer-day |
| Deploy runbook | 0.5 engineer-days |
| On-call rotation update + training | 0.5 engineer-days |
| API design + versioning strategy | 1 engineer-day |
| One-time total | ~3.5 engineer-days |
| Pipeline maintenance (annual) | 0.1 FTE |
| Alert tuning + dashboard upkeep | 0.05 FTE |
| Incident overhead (cross-service) | situational |
| Annual recurring total | ~0.15 FTE |
At a loaded cost of $200K/FTE, a single microservice costs roughly $30K/year in recurring engineering overhead before you touch the compute bill. Across 20 services, that is $600K/year.
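That arithmetic is worth keeping in a form you can re-run with your own numbers. A sketch, with the table's estimates as defaults:

```python
# Recurring cost model for a proposed service split.
# Defaults are this post's estimates; replace them with your own data.

def annual_overhead_usd(num_services: int,
                        loaded_cost_per_fte: float = 200_000,
                        pipeline_fte: float = 0.10,
                        observability_fte: float = 0.05) -> float:
    per_service_fte = pipeline_fte + observability_fte  # ~0.15 FTE/service
    return num_services * per_service_fte * loaded_cost_per_fte

print(annual_overhead_usd(1))    # 30000.0  -> ~$30K/year per service
print(annual_overhead_usd(20))   # 600000.0 -> ~$600K/year at 20 services
```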
That number should change how you think about service decomposition.
Key Takeaways
- The dominant cost of microservices is engineering time, not compute. It scales with service count, not traffic.
- Each service carries a recurring tax: CI/CD maintenance (~0.1 FTE/year), observability tuning, on-call context, and dependency negotiation.
- Every inter-service boundary adds 15–20 minutes to incident MTTR. A 5-service critical path can add over an hour to each incident.
- Microservices pay off with genuinely independent scaling needs, different deployment cadences, team autonomy at scale, or real technology heterogeneity.
- A well-structured monolith has one pipeline, one observability setup, and zero cross-service network hops. That simplicity has real value.
- Before splitting a service, price it: roughly 3.5 engineer-days of setup and $30K/year in recurring overhead. Make sure the benefit exceeds that cost.