GC Choice in 2026: G1, ZGC, or Shenandoah?
Most Java services in production today are running G1GC because it's the default and nobody ever questioned it. That was a reasonable position in 2018. It is laziness in 2026. ZGC graduated from experimental to production in Java 15, gained generational support in Java 21, and Shenandoah has been stable in Red Hat's distribution for years. The GC landscape is different enough that blindly inheriting the default costs real users real latency.
This post gives you the decision framework, the numbers, and the JVM flags that matter.
The Three Axes That Drive GC Choice
Before reaching for a GC name, answer three questions about your service:
- What is your latency SLA? P99 < 10 ms? P99 < 100 ms? "Best effort"?
- What is your heap size? < 4 GB, 4–32 GB, or > 32 GB?
- What is your allocation rate? Heavy (analytics, stream processing) or moderate (CRUD APIs)?
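The three answers can be sketched as a small decision function. This is a hedged illustration: the thresholds mirror this post's rules of thumb, not hard limits, and `GcDecision` is a hypothetical name.

```java
public class GcDecision {
    enum Gc { G1, ZGC, SHENANDOAH }

    // Thresholds are starting points taken from this post, not laws.
    static Gc choose(double p99SlaMs, int heapGb, boolean allocationHeavy) {
        if (p99SlaMs < 10) {
            // Sub-10 ms tails need a fully concurrent collector;
            // Shenandoah is often cheaper on smaller heaps.
            return heapGb >= 32 ? Gc.ZGC : Gc.SHENANDOAH;
        }
        if (heapGb > 32 || allocationHeavy) {
            return Gc.ZGC;   // large heap or heavy allocation: generational ZGC
        }
        return Gc.G1;        // moderate heap, 50-200 ms SLA territory
    }

    public static void main(String[] args) {
        System.out.println(choose(100, 8, false)); // G1
        System.out.println(choose(5, 64, true));   // ZGC
        System.out.println(choose(5, 8, false));   // SHENANDOAH
    }
}
```

Treat the output as a hypothesis to benchmark, not a verdict.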
G1GC: The Workhorse
G1 (Garbage First) divides the heap into equal-sized regions and collects the regions with the most garbage first — hence the name. It is a mostly-concurrent collector with stop-the-world (STW) phases. The STW pauses are bounded by MaxGCPauseMillis, but this is a soft target, not a hard guarantee.
Where G1 wins: Services with heaps in the 4–32 GB range, moderate allocation rates, and latency SLAs in the 50–200 ms range. Most Spring Boot CRUD APIs live here.
```shell
# Sensible G1 starting configuration.
# Comments sit above the command: a '#' after a trailing '\' breaks
# shell line continuation.
#   -Xms = -Xmx             always pin min=max to avoid resize pauses
#   MaxGCPauseMillis        soft target; tune down carefully
#   G1HeapRegionSize        larger regions for large heaps
#   G1NewSizePercent and G1MaxNewSizePercent are experimental flags,
#   hence UnlockExperimentalVMOptions.
java \
  -XX:+UseG1GC \
  -Xms4g -Xmx4g \
  -XX:MaxGCPauseMillis=100 \
  -XX:G1HeapRegionSize=16m \
  -XX:+UnlockExperimentalVMOptions \
  -XX:G1NewSizePercent=20 \
  -XX:G1MaxNewSizePercent=40 \
  -XX:InitiatingHeapOccupancyPercent=45 \
  -jar app.jar
```

G1 failure modes:

- Humongous allocations (objects larger than half a region) are allocated directly into contiguous old-gen regions, sidestepping the generational hypothesis. Size regions with `-XX:G1HeapRegionSize` so your typical large objects fall below the humongous threshold of half a region.
- Long mixed GC pauses when old-gen has high live-data density. If you see 500 ms+ pauses with G1 on a 32 GB heap, move to ZGC.
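The humongous-threshold arithmetic is worth internalizing. A minimal sketch, with hypothetical object sizes and ignoring object-header details:

```java
public class HumongousCheck {
    // Assumes -XX:G1HeapRegionSize=16m; G1 treats any object larger
    // than half a region as humongous.
    static final long REGION_SIZE = 16L * 1024 * 1024;
    static final long HUMONGOUS_THRESHOLD = REGION_SIZE / 2;

    static boolean isHumongous(long objectBytes) {
        return objectBytes > HUMONGOUS_THRESHOLD;
    }

    public static void main(String[] args) {
        // A 9 MB buffer exceeds half of a 16 MB region, so G1 allocates
        // it directly in humongous (old-gen) regions.
        long nineMb = 9L * 1024 * 1024;
        System.out.println(isHumongous(nineMb));             // true
        // The same buffer is under half of a 32 MB region.
        System.out.println(nineMb > 32L * 1024 * 1024 / 2);  // false
    }
}
```

If your service routinely allocates buffers of a known size, doubling the region size as in the second check is often the entire fix.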
ZGC: Sub-Millisecond at Scale
ZGC was designed for one thing: pause times under 1 ms regardless of heap size. It achieves this via colored pointers and load barriers — references carry metadata bits that allow the GC to remap objects concurrently with application threads.
Generational ZGC (Java 21+, enabled with -XX:+ZGenerational) added the missing piece: separate young and old generation treatment. Pre-generational ZGC collected everything in one concurrent cycle, which was expensive for short-lived objects. Generational ZGC cuts CPU overhead by 30–40% for allocation-heavy workloads. Note that generational mode became the default in JDK 23 (JEP 474) and the non-generational mode was removed in JDK 24 (JEP 490), so the flag only matters on JDK 21–22.
```shell
# ZGC with generational mode (Java 21+).
#   ZGenerational           needed on JDK 21-22; default from JDK 23
#   ConcGCThreads           tune based on core count
#   ZCollectionInterval=0   cycles driven by allocation pressure
java \
  -XX:+UseZGC \
  -XX:+ZGenerational \
  -Xms16g -Xmx16g \
  -XX:ConcGCThreads=4 \
  -XX:ZCollectionInterval=0 \
  -jar app.jar
```

ZGC vs G1: the real throughput cost. ZGC uses more CPU for its concurrent phases. Expect a 5–15% throughput reduction compared to G1 on the same workload. For a latency-sensitive service where P99 is the metric that matters, this is the right trade. For a batch job where CPU efficiency is paramount, it is the wrong one.

ZGC failure modes:

- Allocation stalls: if the application allocates faster than ZGC can reclaim, allocating threads block until memory is available — not a classic STW pause, but an effectively unbounded stall for the requests on those threads. Leave headroom with `-XX:SoftMaxHeapSize` and size your heap generously. Rule of thumb: heap should be 3–4× your peak live set.
- Too few concurrent GC threads: on a 64-core host running 50 services, each ZGC instance may get only 1–2 cores. Set `-XX:ConcGCThreads` explicitly.
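The sizing rules of thumb are simple arithmetic, but worth writing down. A sketch with a hypothetical live-set figure — measure yours from a heap histogram or JFR, never guess:

```java
public class HeapSizing {
    static long gb(double x) { return (long) Math.ceil(x); }

    public static void main(String[] args) {
        double liveSetGb = 5.0;  // hypothetical measured peak live set
        // 3-4x live set for the concurrent collectors, 2-2.5x for G1,
        // per this post's rules of thumb.
        System.out.println("ZGC/Shenandoah: " + gb(liveSetGb * 3)
                + "-" + gb(liveSetGb * 4) + " GB");
        System.out.println("G1: " + gb(liveSetGb * 2)
                + "-" + gb(liveSetGb * 2.5) + " GB");
    }
}
```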
Shenandoah: The Underdog
Shenandoah's trick is moving objects concurrently (unlike G1's evacuation, which is STW). It originally used a Brooks pointer — a forwarding-pointer word added to each object — to allow concurrent compaction; since JDK 13 the forwarding information lives in the object header instead, eliminating the extra word. The result: pause times scale with the GC root set size, not the heap size.
Shenandoah is not in the Oracle JDK. It ships in OpenJDK builds from Red Hat, Adoptium, and Amazon Corretto. If you're deploying to AWS and using Corretto, you already have it.
```shell
# Shenandoah configuration.
#   ShenandoahGCHeuristics=adaptive is the default; shown for explicitness.
java \
  -XX:+UseShenandoahGC \
  -Xms8g -Xmx8g \
  -XX:ShenandoahGCHeuristics=adaptive \
  -jar app.jar
```

(Older guides recommend `-XX:ShenandoahGCMode=iu` for lower barrier overhead; the incremental-update mode was experimental and has been removed from recent JDK builds, leaving the default SATB mode.)

Shenandoah vs ZGC: On heaps under 16 GB, Shenandoah often shows lower CPU overhead than ZGC while achieving similar pause profiles. On very large heaps (64 GB+), ZGC's load-barrier approach scales better. Benchmark both for your workload — the answer is not universal.
Benchmark Reality: What the Numbers Look Like
The following figures are from a representative REST API service (Spring Boot 3.3, 200 RPS sustained, PostgreSQL backend, heap 8 GB) run on Java 21:
| GC | Throughput (req/s) | P50 latency | P99 latency | P999 latency |
|---|---|---|---|---|
| G1GC (default) | 198 | 4.2 ms | 87 ms | 340 ms |
| G1GC (tuned) | 197 | 4.1 ms | 62 ms | 180 ms |
| ZGC (non-gen) | 186 | 4.3 ms | 8.1 ms | 14 ms |
| ZGC (generational) | 193 | 4.2 ms | 6.8 ms | 11 ms |
| Shenandoah | 190 | 4.1 ms | 7.4 ms | 13 ms |
The P50 numbers are nearly identical — GC is not on the critical path for normal requests. The difference emerges at P99 and P999, exactly where SLAs are written and alerting fires.
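Because the signal lives in the tail, your harness must record every request's latency and compute percentiles over the full distribution — averages hide exactly what you care about. A minimal sketch of the percentile math on synthetic data (every 500th request takes a 100 ms "pause"; real data would come from your load generator):

```java
import java.util.Arrays;

public class TailLatency {
    // Nearest-rank percentile over a sorted array of nanosecond latencies,
    // returned in milliseconds.
    static double percentile(long[] sortedNanos, double p) {
        int idx = (int) Math.ceil(p * sortedNanos.length) - 1;
        return sortedNanos[Math.max(idx, 0)] / 1_000_000.0;
    }

    public static void main(String[] args) {
        long[] latencies = new long[100_000];
        for (int i = 0; i < latencies.length; i++) {
            // Mostly 4 ms requests, with a 100 ms outlier every 500th
            // request (0.2% of traffic) standing in for GC pauses.
            latencies[i] = (i % 500 == 0 ? 100 : 4) * 1_000_000L;
        }
        Arrays.sort(latencies);
        System.out.println("P50  = " + percentile(latencies, 0.50) + " ms");
        System.out.println("P99  = " + percentile(latencies, 0.99) + " ms");
        System.out.println("P999 = " + percentile(latencies, 0.999) + " ms");
    }
}
```

Note how 0.2% of slow requests leave P50 and P99 untouched but dominate P999 — the same shape as the G1 rows in the table above.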
Monitoring GC: What to Measure
Enabling GC logging is not optional in production:
```shell
-Xlog:gc*:file=/var/log/app/gc.log:time,uptime,level,tags:filecount=5,filesize=20m
```

Key signals to alert on:
| Signal | G1 | ZGC | Shenandoah |
|---|---|---|---|
| Long STW pause | GCPhasePause > target | Allocation stall events | Degenerated GC |
| Concurrent mode failure | Yes (falls back to Full GC) | Allocation stall | Degenerated → Full |
| Throughput drop | Mixed GC frequency | ConcGCThreads saturation | Background GC lag |
JFR is your friend. Run with `-XX:StartFlightRecording=duration=60s,filename=gc.jfr` and analyze with JDK Mission Control. The `jdk.GCHeapSummary`, `jdk.GarbageCollection`, and `jdk.GCPhasePause` events tell the complete story.
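If you alert from the GC log rather than JFR, a small amount of parsing goes a long way. A hedged sketch that extracts pause durations from unified-logging lines — the sample line and the 100 ms threshold are illustrative, and real log lines vary by collector and decorators:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class GcLogPauseAlert {
    // Matches the trailing pause duration of a unified GC log pause line,
    // e.g. "... GC(42) Pause Young (Normal) ... 8M->4M(512M) 123.456ms"
    static final Pattern PAUSE =
            Pattern.compile("GC\\((\\d+)\\).*Pause.*?([0-9.]+)ms\\s*$");

    // Returns the pause in ms, or -1 if the line is not a pause line.
    static double pauseMillis(String line) {
        Matcher m = PAUSE.matcher(line);
        return m.find() ? Double.parseDouble(m.group(2)) : -1.0;
    }

    public static void main(String[] args) {
        String sample = "[12.345s][info][gc] GC(42) Pause Young (Normal) "
                + "(G1 Evacuation Pause) 8M->4M(512M) 123.456ms";
        double ms = pauseMillis(sample);
        if (ms > 100.0) { // alert threshold: your soft pause target
            System.out.println("ALERT: pause of " + ms + " ms");
        }
    }
}
```

In production you would feed `tail -F` output through this, or better, let your log shipper do the extraction.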
The Tuning Order of Operations
Do not start with GC flags. Start here:

1. Fix allocation rate — the best GC tuning is allocating less. Profile with async-profiler: `asprof -e alloc -d 30 -f alloc.html <pid>`
2. Fix object lifetimes — objects that survive multiple young-gen collections get promoted to old-gen and stress every collector.
3. Set heap size correctly — pin `-Xms` = `-Xmx`; size to 3–4× the live set for ZGC/Shenandoah, 2–2.5× for G1.
4. Choose the collector — use the three-axis framework above.
5. Set pause target / concurrent threads — only after steps 1–4.

Key Takeaways
- G1GC remains the right default for most services with moderate heap sizes and P99 targets in the 50–200 ms range; stop treating it as permanent without measurement.
- Generational ZGC (Java 21+) is the correct choice for latency-sensitive services on large heaps — it closes the throughput gap that made plain ZGC awkward for allocation-heavy workloads.
- Shenandoah is a legitimate alternative to ZGC on smaller heaps and deserves a benchmark slot, especially on Corretto or OpenJDK distributions.
- Always pin `-Xms` equal to `-Xmx` to eliminate heap resize pauses in production.
- The most effective GC tuning is reducing allocation rate — profile first with async-profiler before touching any JVM flags.
- P99 and P999 latency, not throughput, are where GC choice makes its mark; build your benchmarking harness to capture tail latency or you're measuring the wrong thing.