When You Don't Need K8s
There is a pattern I see repeatedly: a team of four engineers, two microservices, and an ambitious CTO who just came back from KubeCon. Six months later, those four engineers are now principally occupied with cluster maintenance, Helm chart debugging, and arguing about whether to use Flux or Argo. The two microservices still exist. They are not meaningfully more reliable.
Kubernetes is genuinely excellent for specific problems. The mistake is treating it as the default answer for container orchestration rather than a tool with a sharp cost-benefit tradeoff.
The Honest Cost Inventory
Before you provision a cluster, the cost inventory needs to be honest. Kubernetes costs are not just the EC2 or GKE bill.
Operational surface area you now own:
- Control plane upgrades (each minor version is supported for roughly 12–14 months, so you are upgrading at least annually or running unsupported)
- Node pool management, AMI patches, kernel updates
- CNI, CSI, and admission webhook compatibility across every upgrade
- Certificate rotation and etcd health
- Ingress controller maintenance
- Debugging networking failures that are invisible from application code
A managed control plane (GKE Autopilot, EKS with Fargate, AKS) eliminates some of this. Not all of it. The application layer — deployments, services, ingress rules, RBAC, NetworkPolicies, resource quotas, PodDisruptionBudgets — remains yours entirely.
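To make that concrete: even on a fully managed control plane, every service still needs objects like the following, written and kept current by your team. A minimal sketch — the names and values are illustrative, not from any particular cluster:

# Two of the objects you still own on a managed control plane
# (names and values here are illustrative)
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: api-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: api
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "8"
    requests.memory: 16Gi

Multiply by every object kind in the list above, then by the number of services, and the size of the maintenance surface becomes clear.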
The Alternatives That Actually Work
AWS ECS (Fargate)
ECS with Fargate is underrated by engineers who have drunk the K8s Kool-Aid. You get:
- No node management. Zero. Fargate provisions compute per task.
- Native IAM task roles (no IRSA complexity)
- Service Connect for service-to-service discovery
- Deep CloudWatch integration with no custom metrics pipeline
- ALB weighted routing for canary deployments without Argo Rollouts
# ECS Task Definition — the K8s Pod equivalent, minus 80% of the fields
{
  "family": "api-service",
  "networkMode": "awsvpc",
  "requiresCompatibilities": ["FARGATE"],
  "cpu": "512",
  "memory": "1024",
  "taskRoleArn": "arn:aws:iam::123456789:role/api-service-task-role",
  "containerDefinitions": [
    {
      "name": "api",
      "image": "123456789.dkr.ecr.us-east-1.amazonaws.com/api:latest",
      "portMappings": [{ "containerPort": 8080 }],
      "logConfiguration": {
        "logDriver": "awslogs",
        "options": {
          "awslogs-group": "/ecs/api-service",
          "awslogs-region": "us-east-1",
          "awslogs-stream-prefix": "ecs"
        }
      }
    }
  ]
}

No Ingress resource, no Service object, no HPA YAML. ALB does the load balancing. CloudWatch does the alerting. IAM does the identity. You can be productive on day one.
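The canary routing mentioned above is one API call against the ALB listener, not a progressive-delivery controller. A hedged sketch — the listener and target-group ARNs are supplied as placeholder shell variables:

# Sketch: shift 10% of ALB traffic to a canary target group.
# LISTENER_ARN, STABLE_TG_ARN, and CANARY_TG_ARN are placeholders.
aws elbv2 modify-listener \
  --listener-arn "$LISTENER_ARN" \
  --default-actions '[{
    "Type": "forward",
    "ForwardConfig": {
      "TargetGroups": [
        {"TargetGroupArn": "'"$STABLE_TG_ARN"'", "Weight": 90},
        {"TargetGroupArn": "'"$CANARY_TG_ARN"'", "Weight": 10}
      ]
    }
  }]'

Promote by moving the canary weight to 100; roll back by setting it to 0.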
Fly.io
Fly is what Heroku should have become. You deploy from a Dockerfile or a fly.toml, and it runs in Fly's global anycast network. For latency-sensitive APIs serving global users, this is genuinely hard to replicate in Kubernetes without a multi-region cluster setup that will consume your entire quarter.
# fly.toml — entire deployment config
app = "my-api"
primary_region = "iad"
[build]
dockerfile = "Dockerfile"
[http_service]
internal_port = 8080
force_https = true
auto_stop_machines = true
auto_start_machines = true
min_machines_running = 1
[[vm]]
memory = "512mb"
cpu_kind = "shared"
cpus = 1

Scale to zero, automatic TLS, private networking between apps. The operational complexity is near zero.
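Operating it is equally small. A sketch of the day-two commands, assuming current flyctl (verify subcommands against fly help on your version):

# Sketch: day-two operations for the app above
fly deploy        # build the Dockerfile, push, roll out with health checks
fly status        # machine state and regions at a glance
fly scale count 2 # run two machines instead of one
fly logs          # tail the app's logs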
Render
Render occupies the space between Heroku and full ECS. Zero-downtime deploys, managed PostgreSQL, Redis, cron jobs, and private services. If you are building an internal API or a B2B SaaS MVP, Render removes an entire category of infrastructure decisions.
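Render's infrastructure-as-code story is a single blueprint file. A minimal sketch, assuming a Dockerized web service — field names follow Render's render.yaml blueprint spec, and the service names are illustrative:

# render.yaml — minimal blueprint sketch (names are illustrative)
services:
  - type: web
    name: my-api
    env: docker
    plan: starter
    healthCheckPath: /healthz
databases:
  - name: my-api-db
    plan: starter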
When the Equation Flips
None of this is to say Kubernetes is always wrong. The inflection point comes when:
- You have more than 15–20 services and need consistent deployment patterns across teams
- You need fine-grained network policies between services (see the sketch after this list)
- You have GPU workloads or custom scheduling requirements
- You're running batch workloads alongside long-running services and need bin-packing
- Compliance requires you to run in your own VPC with locked-down node images
- You have a dedicated platform team (at minimum 2 engineers) whose job is the cluster
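The network-policy case is where Kubernetes genuinely earns its keep: a default-deny posture with explicit per-service allowances is a few declarative, label-selected objects. A minimal sketch with illustrative names:

# Sketch: deny all ingress in a namespace, then allow only the gateway
# to reach the api pods (namespace and labels are illustrative)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: prod
spec:
  podSelector: {}
  policyTypes: ["Ingress"]
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-gateway-to-api
  namespace: prod
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes: ["Ingress"]
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: gateway
      ports:
        - protocol: TCP
          port: 8080

ECS can approximate this with per-task security groups, but the declarative, label-selected form is a real Kubernetes strength.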
The Honest Conversation With Your CTO
The migration from a simple deployment model to Kubernetes is not free. There is a cost in engineering hours, a learning curve tax, and a sustained operational burden. If your team does not have the capacity to absorb that cost, you will end up with a half-configured cluster that provides neither the simplicity of ECS nor the power of a well-run Kubernetes deployment.
The question to ask is not "should we use Kubernetes?" The question is: "Do we have the people and time to run it well, or will it run us?"
If the honest answer is no — reach for ECS, Fly, Render, or Cloud Run. Ship product. Revisit the question when you have a platform team.
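Cloud Run is the same one-command story on GCP: a container image in, a scaled, TLS-terminated HTTPS service out. A sketch — the project, image, and service names are illustrative:

# Sketch: deploy a container image to Cloud Run (names are illustrative)
gcloud run deploy my-api \
  --image=us-docker.pkg.dev/my-project/apps/api:1.4.2 \
  --region=us-east1 \
  --allow-unauthenticated \
  --min-instances=1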
Key Takeaways
- Kubernetes has real operational costs beyond the compute bill: upgrades, CNI, RBAC, and networking complexity are ongoing.
- ECS Fargate eliminates node management and integrates natively with AWS IAM, ALB, and CloudWatch — it is a serious platform, not a stepping stone.
- Fly.io and Render solve global deployment and developer experience problems with near-zero operational overhead.
- The inflection point for Kubernetes is roughly 15–20 services, a dedicated platform team, and requirements that managed PaaS cannot satisfy.
- The worst outcome is a Kubernetes cluster that consumes your entire platform engineering capacity while your product team waits.
- Make the choice deliberately. The YAML can always come later.
Part 2 →
The Cluster You Actually Want