Python for the JVM Engineer

The Runtime Model

Ravinder · 5 min read
Python, JVM, Java, CPython, Bytecode, Runtime

If you have spent years writing Java you carry a precise mental model of execution: source → javac → .class files → class loader → JIT compilation in the HotSpot (or GraalVM) runtime. That model shapes every assumption you bring to a new language. When you first open a Python file and run python app.py, the absence of a build step feels suspicious. Where are the class files? Is there a JIT? What is actually happening?

The short answer: CPython compiles your source to bytecode and interprets that bytecode in a C eval loop — no JIT, no ahead-of-time optimisation by default. The longer answer is worth understanding in detail, because it changes how you reason about performance, tooling, and debugging.

Source to Bytecode

CPython's pipeline has three stages: parse, compile, execute.

flowchart LR
    A[".py source"] --> B["Tokeniser\n(lexer)"]
    B --> C["AST\n(abstract syntax tree)"]
    C --> D["Compiler\n(compile.c)"]
    D --> E[".pyc bytecode\n(__pycache__)"]
    E --> F["ceval loop\n(CPython C runtime)"]
    F --> G["Result / Side Effects"]
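The same stages can be driven by hand from the standard library, which makes the pipeline less abstract. This is a minimal sketch: the source string and the `<demo>` filename are made up for illustration, and `exec` stands in for what `python app.py` does with a whole file.

```python
import ast

source = "result = 2 + 3"

# Parse: source text -> abstract syntax tree
tree = ast.parse(source)

# Compile: AST -> code object holding the bytecode
code = compile(tree, "<demo>", "exec")

# Execute: the eval loop runs the bytecode inside a namespace dict
namespace = {}
exec(code, namespace)

print(type(tree).__name__)   # Module
print(type(code).__name__)   # code
print(namespace["result"])   # 5
```

The code object is the in-memory form of what a .pyc file stores on disk.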

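The .pyc files in __pycache__ are the closest Python equivalent of .class files, but they are only a cache, and CPython writes them for imported modules, not for the script you run directly. If the source has changed since the cache was written, CPython regenerates it automatically. You can inspect the bytecode of any live function with the dis module: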

import dis
 
def add(a: int, b: int) -> int:
    return a + b
 
dis.dis(add)

Output (CPython 3.12, byte offsets omitted):

  3           RESUME           0
 
  4           LOAD_FAST        0 (a)
              LOAD_FAST        1 (b)
              BINARY_OP        0 (+)
              RETURN_VALUE

Compare this to javap -c output for the equivalent Java method — the instruction sets look similar in spirit (stack-based operations), but the JVM bytecode is far richer, with typed instructions and a verifier pass. CPython bytecode is dynamically typed at the instruction level: BINARY_OP will call __add__ at runtime and figure out the types then.
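To see that late dispatch in action, the sketch below calls one untyped function with three different operand types. `Metres` is a hypothetical class invented for this example; the point is only that the same bytecode serves all three calls and the `+` is resolved per call.

```python
class Metres:
    """Hypothetical value type used only for this illustration."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Invoked by the interpreter when it evaluates `a + b` for Metres operands
        return Metres(self.value + other.value)


def add(a, b):
    return a + b


print(add(1, 2))                          # 3
print(add("by", "te"))                    # byte
print(add(Metres(2), Metres(3)).value)    # 5
```

A Java method would need three overloads (or generics plus an interface) to accept the same three calls, and each overload would compile to differently typed instructions.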

The Eval Loop vs the JVM's JIT

The JVM's HotSpot JIT counts method invocations and loop back-edges and compiles hot paths to native machine code. CPython 3.11 introduced a specialising adaptive interpreter (PEP 659), and 3.12 builds on it: the interpreter observes operand types at runtime and rewrites individual bytecodes into faster specialised forms (BINARY_OP → BINARY_OP_ADD_INT, LOAD_GLOBAL → LOAD_GLOBAL_BUILTIN). This is incremental optimisation inside the interpreter, not a full JIT: no machine code is generated.

sequenceDiagram
    participant Source
    participant CPython
    participant Specialiser
    participant C_Runtime
    Source->>CPython: execute function
    CPython->>C_Runtime: generic BINARY_OP
    C_Runtime->>Specialiser: record type observation
    Specialiser-->>CPython: rewrite to BINARY_OP_ADD_INT
    CPython->>C_Runtime: fast-path on next call
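You can watch the rewrite happen with dis, as a rough sketch. It assumes CPython 3.11 or later, where `dis.get_instructions` accepts an `adaptive` keyword; the `opnames` helper is made up for this example, and the exact specialised names and the number of warm-up calls needed vary between releases, so no output is shown.

```python
import dis
import sys


def add(a, b):
    return a + b


def opnames(func, adaptive):
    """Hypothetical helper: list the opcode names dis reports for func."""
    if sys.version_info >= (3, 11):
        # adaptive=True asks dis for the specialised (quickened) bytecode
        return [ins.opname for ins in dis.get_instructions(func, adaptive=adaptive)]
    # Older interpreters have no adaptive bytecode to show
    return [ins.opname for ins in dis.get_instructions(func)]


before = opnames(add, adaptive=False)   # the generic instructions

for _ in range(1000):                   # warm the function up with int operands
    add(1, 2)

after = opnames(add, adaptive=True)     # whatever the interpreter has rewritten

print(before)
print(after)
```

On a 3.11 or 3.12 interpreter you would typically expect the second list to contain a specialised form of BINARY_OP. Calling add with strings afterwards may cause the instruction to be specialised differently or fall back to the generic form.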

For sustained compute-heavy workloads where HotSpot would shine, Python engineers reach for NumPy (C extension arrays), Cython (C compilation), or swap the runtime entirely for PyPy (a full tracing JIT).

Object Model: Everything is a Heap Object

In Java, a primitive int is a plain value (a local lives in the stack frame, a field lives inline in its object); Integer is a heap-allocated object. In CPython, every value is a PyObject on the heap, including integers. CPython preallocates the integers from -5 to 256, so a is b is True for small integers, a trap that Java's Integer cache also sets:

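# int("...") builds each value at runtime; equal literals in one file
# can be merged by the compiler, which would hide the effect.
a = int("256")
b = int("256")
print(a is b)   # True: both names refer to the cached object
 
a = int("257")
b = int("257")
print(a is b)   # False on CPython: two separate heap objects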

Java comparison:

Integer a = 127;
Integer b = 127;
System.out.println(a == b);  // true  — Integer cache
 
Integer x = 128;
Integer y = 128;
System.out.println(x == y);  // false — new heap objects

The caching range differs, but the pattern is identical: Python's is behaves like Java's == on references, and Python's == behaves like equals(). Always use == for value comparison in Python; reserve is for identity checks (is None, is True, is False).
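The rule is easier to remember with a case where the two operators disagree. In this small sketch, `find` is a hypothetical helper written for the example; it shows why a sentinel check should use `is None` rather than truthiness.

```python
first = [1, 2, 3]
second = [1, 2, 3]
alias = first

print(first == second)   # True: same contents
print(first is second)   # False: two distinct list objects
print(first is alias)    # True: two names for one object


def find(items, target):
    """Return the index of target, or None when it is absent (hypothetical helper)."""
    for index, item in enumerate(items):
        if item == target:
            return index
    return None


position = find(first, 1)
print(position is None)  # False: the value was found at index 0
print(not position)      # True: 0 is falsy, so a truthiness test would misreport
```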

Frames, Stacks, and the Call Stack

Every function call in CPython gets a frame, roughly equivalent to a Java stack frame. Before 3.11, each call allocated a full PyFrameObject on the heap. From 3.11 onwards, the interpreter keeps lightweight internal frames in a contiguous per-thread stack and materialises a PyFrameObject lazily, only when something such as inspect or a debugger asks for one, which reduces heap pressure significantly.

import inspect
 
def outer():
    def inner():
        for frame_info in inspect.stack():
            print(frame_info.function)
    inner()
 
outer()
# inner
# outer
# <module>

The inspect module gives you the call stack without a debugger — useful for building introspective tooling, similar to Thread.currentThread().getStackTrace() in Java.
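As one example of such tooling, a function can find out who called it by walking one step up the frame chain. This is a sketch only: `who_called_me` and `handler` are made-up names, and `inspect.currentframe()` is a CPython facility that other implementations may not provide.

```python
import inspect


def who_called_me():
    """Hypothetical helper: name and line number of the calling function."""
    frame = inspect.currentframe()       # the frame of who_called_me itself
    try:
        caller = frame.f_back            # one step up: the caller's frame
        return caller.f_code.co_name, caller.f_lineno
    finally:
        del frame                        # drop the reference to avoid a cycle


def handler():
    return who_called_me()


name, line = handler()
print(name)   # handler
```

Asking for the frame is what makes CPython build the full frame object, so helpers like this are best kept out of hot paths.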

The Import System

Python's module system is closer to Java's classpath than it first appears. When you write import requests, CPython first checks sys.modules, a dict that acts like the JVM's loaded-class cache. On a miss, it searches sys.path (a list of directories and zip files) in order, finds the package, compiles it to bytecode if needed, executes it, and stores the resulting module object in sys.modules.

import sys
 
import json
print(sys.modules["json"])   # <module 'json' from '...'>
print(sys.modules["json"] is json)  # True — same object
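The search step can also be observed on its own. The sketch below asks the import machinery where it would load json from, using `importlib.util.find_spec`; the printed paths depend on your installation, so none are shown.

```python
import importlib.util
import sys

# Ask the finders on sys.meta_path how "json" would be loaded
spec = importlib.util.find_spec("json")

print(spec.name)      # json
print(spec.origin)    # the source file the module is loaded from
print(spec.cached)    # the matching .pyc path under __pycache__
print(sys.path[:3])   # the first few locations searched, in order
```

The `cached` attribute is the .pyc location, which ties the import system back to the bytecode cache described earlier.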

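Re-importing a module returns the cached version, not a fresh copy. This is why the "reload trick" (importlib.reload) exists: it re-executes the module's code inside the existing module object, loosely analogous to reloading a class with a custom ClassLoader. References obtained earlier with from module import name still point at the old objects.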
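A self-contained way to try this is to write a throwaway module to a temporary directory, import it, edit it, and reload it. Everything named here (`demo_settings`, `VALUE`) is hypothetical, and bytecode writing is switched off for the duration so that a stale .pyc cannot mask the edit.

```python
import importlib
import sys
import tempfile
from pathlib import Path

previous_flag = sys.dont_write_bytecode
sys.dont_write_bytecode = True   # no .pyc, so the edited source is always recompiled

with tempfile.TemporaryDirectory() as tmp:
    source = Path(tmp) / "demo_settings.py"   # hypothetical module name
    source.write_text("VALUE = 1\n")
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()

    import demo_settings
    first_value = demo_settings.VALUE

    import demo_settings as again             # served from sys.modules
    same_object = again is demo_settings

    source.write_text("VALUE = 2\n")
    importlib.reload(demo_settings)           # re-executes the module body
    second_value = demo_settings.VALUE

    # Tidy up so the demo leaves no trace in the import state
    sys.path.remove(tmp)
    del sys.modules["demo_settings"]

sys.dont_write_bytecode = previous_flag
print(first_value, same_object, second_value)   # 1 True 2
```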

What This Means for You

Coming from the JVM, the most important runtime differences to internalise are:

  • No JIT by default — tight loops are slower than you expect until you profile and reach for native extensions.
  • No type verification at startup — type errors surface at runtime, not load time (this is why type hints + mypy exist; more on that in post 4).
  • The GIL — only one thread runs Python bytecode at a time (covered in depth in post 3).
  • Definitions are mutable and dynamic: classes, functions, and modules are all first-class objects you can replace at runtime.
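Two of these points fit in a few lines. The sketch below uses made-up names (`Greeter`, `add`) to show a method being replaced on a live class, and a type error that only appears when the offending call actually runs.

```python
class Greeter:
    def greet(self):
        return "hello"


def add(a, b):
    return a + b


greeter = Greeter()
Greeter.greet = lambda self: "hi"   # swap the method on the class at runtime
print(greeter.greet())              # hi

caught = None
try:
    add(1, "2")                     # nothing rejected this call before it ran
except TypeError as error:
    caught = type(error).__name__
print(caught)                       # TypeError
```

The existing instance picks up the new method because lookup goes through the class on every call; there is no equivalent of a verified, fixed vtable.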

Key Takeaways

  • CPython compiles .py to bytecode (.pyc) automatically; dis.dis() lets you inspect it like javap -c.
  • The eval loop is a bytecode interpreter, not a JIT; the adaptive specialiser (introduced in CPython 3.11) is a partial mitigation, not a full solution.
  • Every Python value is a heap-allocated PyObject; integer caching (-5..256) is the equivalent of Java's Integer cache.
  • is tests object identity; == tests value equality — use is only for singletons (None, True, False).
  • sys.path and sys.modules are the Python analogues of the JVM classpath and class cache.
  • For compute-heavy workloads, reach for NumPy, Cython, or PyPy rather than optimising pure-Python loops.