Python for the JVM Engineer

Python in Containers

Ravinder · 6 min read

Containerising a Spring Boot application is straightforward: start from eclipse-temurin:21-jre-jammy, copy your fat JAR, set ENTRYPOINT ["java", "-jar", "app.jar"]. The image is predictable, JVM startup options are well-known, and layered image caching with Spring Boot's layered JAR tool is a one-liner. Python containerisation has the same structure but different gotchas — mostly around dependency installation layers, the interpreter version, and runtime tuning flags that are not obvious unless you know to look for them.

Image Selection

The Python image taxonomy mirrors the JRE image taxonomy:

| Python Image | Java Equivalent | Size (approx.) |
| --- | --- | --- |
| python:3.12 | eclipse-temurin:21-jdk | 1.0 GB |
| python:3.12-slim | eclipse-temurin:21-jre-jammy | 130 MB |
| python:3.12-slim-bookworm | eclipse-temurin:21-jre-jammy | 130 MB |
| python:3.12-alpine | eclipse-temurin:21-alpine | 55 MB |

Always use python:3.12-slim (or the current stable version) as your production base. The alpine variant is tempting for size but breaks packages with C extensions because Alpine uses musl libc instead of glibc — the equivalent of the musl vs glibc friction you occasionally hit with JNI native libraries.

A Naive Dockerfile (What Not to Do)

FROM python:3.12-slim
WORKDIR /app
COPY . .
RUN pip install -r requirements.txt
ENTRYPOINT ["python", "main.py"]

Problems:

  1. Every code change rebuilds the dependency layer — expensive in CI.
  2. Dependencies are installed as root, and the pip cache is left baked into the image.
  3. The application runs as root — a security risk.
  4. No .pyc compilation — first-run import time is slower.
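
Problem 4 is easy to see locally: compiling ahead of time writes bytecode into __pycache__, which the interpreter then loads instead of re-parsing the source on first import. A quick sketch with a throwaway module:

```python
import compileall
import pathlib
import tempfile

# Create a throwaway module and compile it ahead of time —
# the same thing `python -m compileall src/` does during an image build.
with tempfile.TemporaryDirectory() as d:
    src = pathlib.Path(d) / "hello.py"
    src.write_text("GREETING = 'hi'\n")

    compileall.compile_dir(d, quiet=1)

    # Bytecode lands in __pycache__ next to the source
    compiled = list((pathlib.Path(d) / "__pycache__").glob("hello.*.pyc"))
    print(len(compiled))  # → 1
```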

Production Dockerfile with Multi-Stage Build

# ─── Stage 1: build / install dependencies ───────────────────────────────────
FROM python:3.12-slim AS builder
 
WORKDIR /build
 
# Install build tools for packages with C extensions
RUN apt-get update && apt-get install -y --no-install-recommends \
        gcc \
        libpq-dev \
    && rm -rf /var/lib/apt/lists/*
 
# Copy dependency specs first — cached if they don't change
COPY pyproject.toml uv.lock ./
 
# Install to a prefix directory (like Maven's local repo in stage 1)
RUN pip install uv && \
    uv pip install --system --prefix=/install -r pyproject.toml
 
# ─── Stage 2: runtime image ──────────────────────────────────────────────────
FROM python:3.12-slim AS runtime
 
WORKDIR /app
 
# Non-root user
RUN groupadd --gid 1001 appgroup && \
    useradd --uid 1001 --gid appgroup --no-create-home appuser
 
# Copy installed packages from builder (no gcc, no build cache)
COPY --from=builder /install /usr/local
 
# Copy application source
COPY --chown=appuser:appgroup src/ ./src/
 
# Pre-compile .pyc files — faster startup
RUN python -m compileall -q src/
 
USER appuser
 
# Tuning: flush output immediately so logs reach the container runtime in
# real time. Note: Python enables these flags for ANY non-empty value, so
# "0" would NOT turn one off. .pyc files were pre-compiled above, so also
# suppress runtime bytecode writing.
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PYTHONFAULTHANDLER=1
 
EXPOSE 8000
 
ENTRYPOINT ["python", "-m", "uvicorn", "src.main:app", "--host", "0.0.0.0", "--port", "8000"]
flowchart LR
    subgraph "Stage 1: builder"
        B1["python:3.12-slim + gcc"] --> B2["install deps\n(uv pip install)"]
        B2 --> B3["/install prefix\n(wheels unpacked)"]
    end
    subgraph "Stage 2: runtime"
        R1["python:3.12-slim\n(no gcc)"]
        B3 -->|"COPY --from=builder"| R1
        R1 --> R2["application source"]
        R2 --> R3["compileall .pyc"]
    end
    R3 --> FINAL["final image\n~160MB"]

This mirrors Spring Boot's layered JAR + multi-stage Docker pattern — dependencies in one layer, application code in another, no build tools in the final image.
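
Assuming the Dockerfile above sits at the repository root, the build-and-run loop looks like this (image name and env values are placeholders):

```shell
# Rebuilds reuse the cached dependency layer as long as
# pyproject.toml / uv.lock are unchanged
docker build -t my-python-app .

# Pass runtime configuration as env vars, 12-factor style
docker run --rm -p 8000:8000 \
  -e DATABASE_URL="postgresql://user:pass@db:5432/app" \
  -e MALLOC_ARENA_MAX=2 \
  my-python-app
```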

Layer Ordering for Cache Efficiency

Order COPY instructions from least-changed to most-changed:

# 1. Dependency specs (rarely change)
COPY pyproject.toml uv.lock ./
RUN uv pip install ...
 
# 2. Application code (changes every commit)
COPY src/ ./src/

This is identical advice to the Spring Boot layered JAR best practice: dependencies layer → snapshot-dependencies layer → application layer.
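
Caching also depends on what enters the build context: if COPY src/ ./src/ sweeps in virtualenvs or __pycache__ directories, the layer is invalidated on every build. A minimal .dockerignore (entries illustrative — trim to your repository layout):

.git/
.venv/
__pycache__/
*.pyc
.env
tests/
Dockerfile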

Environment Variable Configuration

Python applications configured via environment variables follow the 12-factor app pattern — same as Spring Boot's @ConfigurationProperties loaded from SPRING_* env vars:

# config.py
from pydantic_settings import BaseSettings, SettingsConfigDict
 
class Settings(BaseSettings):
    # pydantic-settings v2 style: configuration lives in model_config
    model_config = SettingsConfigDict(
        env_prefix="",     # reads DATABASE_URL, REDIS_URL, etc.
        env_file=".env",   # local dev only
    )
 
    database_url: str
    redis_url: str = "redis://localhost:6379/0"
    debug: bool = False
    workers: int = 4
    log_level: str = "INFO"
 
settings = Settings()
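
What BaseSettings automates is plain env-var reading plus type coercion; a dependency-free sketch of the same idea (the class and field names here are illustrative, mirroring the config above):

```python
import os
from dataclasses import dataclass

@dataclass
class PlainSettings:
    database_url: str
    redis_url: str = "redis://localhost:6379/0"
    debug: bool = False
    workers: int = 4

def load_settings() -> PlainSettings:
    # Env vars are always strings — coerce them to the declared field types,
    # which is what pydantic-settings does for you, with validation on top
    return PlainSettings(
        database_url=os.environ["DATABASE_URL"],  # required: KeyError if missing
        redis_url=os.environ.get("REDIS_URL", "redis://localhost:6379/0"),
        debug=os.environ.get("DEBUG", "false").lower() in ("1", "true", "yes"),
        workers=int(os.environ.get("WORKERS", "4")),
    )

os.environ["DATABASE_URL"] = "postgresql://user:pass@db:5432/app"
os.environ["WORKERS"] = "8"
print(load_settings().workers)  # → 8
```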

In Docker / Kubernetes:

env:
  - name: DATABASE_URL
    valueFrom:
      secretKeyRef:
        name: db-secret
        key: url
  - name: REDIS_URL
    value: "redis://redis-service:6379/0"
  - name: WORKERS
    value: "4"

Critical Runtime Environment Variables

| Variable | Effect | JVM Analogue |
| --- | --- | --- |
| PYTHONUNBUFFERED=1 | Flush stdout/stderr immediately | N/A — JVM console output auto-flushes |
| PYTHONFAULTHANDLER=1 | Dump traceback on crash (SIGSEGV, etc.) | hs_err_pid crash logs |
| PYTHONDONTWRITEBYTECODE=1 | Skip .pyc generation (any non-empty value enables it; pre-compile in the image instead) | N/A |
| PYTHONHASHSEED=random | Random hash seed (default since 3.3) | -Djava.security.egd |
| MALLOC_ARENA_MAX=2 | Limit glibc malloc arenas (reduces RSS) | -Xmx (loosely — footprint control) |

MALLOC_ARENA_MAX=2 deserves special mention: glibc's default arena limit (8 per CPU core on 64-bit systems) can make Python processes appear to use far more memory than they actually do — setting it to 2 is a common production fix. There is no exact JVM flag equivalent; it is allocator-level tuning in the same spirit as capping heap size with -Xmx to keep a process's footprint predictable.
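
The PYTHONHASHSEED row is easy to observe in practice: a pinned seed makes str hashes reproducible across interpreter runs, while the default random seed does not. A sketch that spawns fresh interpreters, since the seed is fixed at startup:

```python
import os
import subprocess
import sys

def str_hash_in_fresh_interpreter(seed: str) -> str:
    # The hash seed is applied at interpreter startup, so spawn a new process
    env = {**os.environ, "PYTHONHASHSEED": seed}
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('a'))"],
        capture_output=True, text=True, env=env, check=True,
    )
    return result.stdout.strip()

# Pinned seed: identical across runs (handy when debugging, not for prod)
print(str_hash_in_fresh_interpreter("0") == str_hash_in_fresh_interpreter("0"))  # → True

# Default randomised seed: almost certainly different between runs
print(str_hash_in_fresh_interpreter("random") != str_hash_in_fresh_interpreter("random"))
```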

Worker Process Tuning

For WSGI (Gunicorn):

# --workers uses the standard formula: 2*CPU + 1
# (an inline comment after a backslash would break the line continuation)
gunicorn \
  --workers $((2 * $(nproc) + 1)) \
  --worker-class uvicorn.workers.UvicornWorker \
  --bind 0.0.0.0:8000 \
  --timeout 30 \
  --graceful-timeout 20 \
  src.main:app

For ASGI (Uvicorn directly):

uvicorn src.main:app \
  --host 0.0.0.0 \
  --port 8000 \
  --workers 4 \
  --loop uvloop         # drop-in faster event loop (C extension)

The 2*CPU + 1 formula for Gunicorn workers is the Python equivalent of server.tomcat.threads.max — sized to overlap I/O wait across workers.

Health Checks and Graceful Shutdown

# NOTE: curl is not present in python:3.12-slim — install it in the runtime
# stage, or use the stdlib instead:
#   CMD python -c "import urllib.request; urllib.request.urlopen('http://localhost:8000/health')"
HEALTHCHECK --interval=30s --timeout=5s --retries=3 \
  CMD curl -f http://localhost:8000/health || exit 1

# FastAPI graceful shutdown with lifespan
from contextlib import asynccontextmanager
from fastapi import FastAPI
 
@asynccontextmanager
async def lifespan(app: FastAPI):
    # startup: connect DB, warm caches
    yield
    # shutdown: close connections, flush queues
 
app = FastAPI(lifespan=lifespan)
 
@app.get("/health")
async def health() -> dict:
    return {"status": "ok"}

This mirrors Spring Boot's ApplicationListener<ContextClosedEvent> + /actuator/health pattern.

Key Takeaways

  • Use python:3.12-slim as the production base — alpine breaks C extension packages due to musl vs glibc differences.
  • Multi-stage builds separate the build environment (gcc, dev tools) from the runtime image — same pattern as Spring Boot layered JARs.
  • Copy dependency specs before source code so that the dependency layer is cached between commits.
  • Set PYTHONUNBUFFERED=1 (logs reach the container runtime in real time) and PYTHONFAULTHANDLER=1 (traceback on hard crashes — the closest analogue to the JVM's hs_err crash files) in production.
  • Set MALLOC_ARENA_MAX=2 to prevent glibc arena fragmentation inflating the reported RSS of Python workers.
  • Use pydantic-settings for environment-variable-driven configuration — it is the @ConfigurationProperties equivalent with full type validation.