Modern JVM Garbage Collection

Modern JVM
Garbage Collection


Java User Group Switzerland
2020-12-01


Puzzle ITC



What to Expect

  • Some GC theory
  • New GCs in OpenJDK HotSpot
  • How to choose and tune GC in OpenJDK HotSpot
  • GC in containers

Garbage Collector Theory


What Does a GC Do?

  • Tracks every object in JVM Heap
  • Removes unused objects

    ⇒ Easy?


GC Design Factors

  • Program throughput
  • GC throughput
  • Heap overhead
  • Pause times
  • Pause frequency
  • Pause distribution
  • Allocation performance
  • Compaction
 

GC Design Factors

  • Concurrency
  • Scaling
  • Tuning
  • Warmup time
  • Page release
  • Portability
  • Compatibility

    ⇒ Very hard, lots of tradeoffs

References: 1


Generational Hypothesis

  • Empirical observation
  • Weak: Most objects die young
  • Strong: The older an object is, the less likely it is to die
  • Notable exception: LRU caches

References: 1


Throughput and Latency

Throughput
% of time not spent in garbage collection
Latency
Responsiveness of an application, affected by GC pauses

References: 1
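The throughput definition above can be approximated from inside a running JVM. The following is a minimal sketch, assuming the accumulated collection time reported by the standard `GarbageCollectorMXBean`s is a usable stand-in for "time spent in garbage collection"; for concurrent collectors that figure may include concurrent work and not only pauses, so treat the result as an estimate. The class and method names are illustrative.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

final class GcThroughput {

    /** Fraction of elapsed time not spent in GC, clamped to [0, 1]. */
    static double throughput(long gcMillis, long elapsedMillis) {
        if (elapsedMillis <= 0) {
            return 1.0; // nothing measured yet
        }
        double ratio = 1.0 - (double) gcMillis / elapsedMillis;
        return Math.max(0.0, Math.min(1.0, ratio));
    }

    /** Sums the accumulated collection time reported by all collector beans. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime(); // -1 if the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        System.out.printf("GC throughput: %.2f%%%n", 100 * throughput(totalGcMillis(), uptime));
    }
}
```

Latency cannot be derived from these totals; it needs per-pause data such as GC logs.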


STW, Parallel and Concurrent

STW (Stop-The-World)
All app threads are stopped during GC
Concurrent
All app threads are running during GC
Parallel
GC uses multiple threads in STW and/or concurrent phases

OpenJDK HotSpot Garbage Collectors


JVM Ergonomics

  • JVM ergonomics selects GC if none specified 1
  • Serial GC if 1 CPU or < 1.75 GiB RAM 2
  • OpenJDK < 8u191 not fully aware of containers 3
  • GC ergonomics auto-tune low-level parameters
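The selection rule above can be sketched as a small function and compared with what the running JVM actually picked. This is a simplified model, not the HotSpot implementation: `expectedDefaultGc` is a hypothetical helper that encodes only the CPU and memory thresholds from the slide, and it assumes Parallel as the server-class default before JDK 9 and G1 from JDK 9 on.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class GcErgonomics {

    static final long SERVER_CLASS_MIN_BYTES = 1792L * 1024 * 1024; // 1.75 GiB

    /** Rule of thumb from the slide: Serial on 1 CPU or with less than 1.75 GiB RAM. */
    static String expectedDefaultGc(int cpus, long physicalMemoryBytes, int jdkMajor) {
        if (cpus < 2 || physicalMemoryBytes < SERVER_CLASS_MIN_BYTES) {
            return "Serial";
        }
        return jdkMajor >= 9 ? "G1" : "Parallel";
    }

    /** Names of the collector beans the running JVM actually registered. */
    static List<String> activeCollectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(gc.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("CPUs seen by JVM: " + Runtime.getRuntime().availableProcessors());
        System.out.println("Active collectors: " + activeCollectorNames());
    }
}
```

Printing the active collectors at startup is a cheap way to notice when a small container silently fell back to Serial GC.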

OpenJDK 8 Available GCs

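| GC         | Option                  | Comment               |
|------------|-------------------------|-----------------------|
| Serial     | -XX:+UseSerialGC        | default               |
| Parallel   | -XX:+UseParallelGC      | default               |
| CMS        | -XX:+UseConcMarkSweepGC |                       |
| G1         | -XX:+UseG1GC            |                       |
| Shenandoah | -XX:+UseShenandoahGC    | non-mainline backport |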

References: 1 2


OpenJDK 11 Available GCs

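| GC         | Option                  | Comment                          |
|------------|-------------------------|----------------------------------|
| Serial     | -XX:+UseSerialGC        | default                          |
| Parallel   | -XX:+UseParallelGC      |                                  |
| CMS        | -XX:+UseConcMarkSweepGC | deprecated                       |
| G1         | -XX:+UseG1GC            | default                          |
| Shenandoah | -XX:+UseShenandoahGC    | non-mainline backport            |
| ZGC        | -XX:+UseZGC             | experimental, Linux x86_64 only  |
| Epsilon    | -XX:+UseEpsilonGC       | experimental                     |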

References: 1 2 3 4


Epsilon GC

  • No-op GC: does not collect garbage at all
  • Useful for:
    • Measurements
    • Short-lived processes
    • Garbage-free applications

References: 1 2


GC Concurrency

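| GC         | Young Generation    | Old Generation          |
|------------|---------------------|-------------------------|
| Serial     | Copy                | Mark, Compact           |
| Parallel   | Copy                | Mark, Compact           |
| CMS        | Copy                | Conc Mark, Conc Sweep   |
| G1         | Copy                | Conc Mark, Compact      |
| Shenandoah | (single generation) | Conc Mark, Conc Compact |
| ZGC        | (single generation) | Conc Mark, Conc Compact |

Phases marked Conc run concurrently with the application; all other phases are Stop-The-World.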

References: 1


Concurrent GC Phases

[Figure: Concurrent GC Phases]

References: 1


Concurrent GCs

  • All expensive GC phases run concurrently
  • Coordination between GC & app through barriers
  • STW pauses for root set scan and cleanup
  • Pauses are short and predictable
  • Better scalability for large heaps

Barriers

  • Machine code injected by JIT compiler
  • Additional metadata needed for coordination
  • Throughput reduction
    • Predictable
    • Can be offset with more resources

Shenandoah Garbage Collector


Shenandoah Garbage Collector

  • Developed by Red Hat
  • Named after Shenandoah National Park
  • Originally based on G1 GC
  • Concurrent marking with SATB like G1

Shenandoah Garbage Collector

  • Pause times independent of heap and live-set size
  • Single generation, multiple regions
  • Multiple heuristics and failure modes

Shenandoah Garbage Collector

  • Concurrent compaction:
    • v1: Brooks pointers, read and write barriers
    • v2: on-heap forwarding pointers, load barriers
  • Barrier loop optimizations

References: 1 2 3 4 5


On-Heap Forwarding Pointers

[Figure: Shenandoah On-Heap Forwarding Pointers]

References: 1 2


Load (Reference) Barriers

Extra code when object reference is loaded from heap:

Object obj2 = obj.field1;  // Loading an object reference from heap
// Load barrier needed here

Object obj3 = obj2;        // No barrier, not a load from heap
obj.doSomething();         // No barrier, not a load from heap
int i = obj.field2;        // No barrier, not an object reference

Optimized for common case

References: 1


Shenandoah Load Barrier

Pseudocode:

load_reference_barrier(addr):
  if in_evac_phase() and in_collection_set(addr):
    if is_forwarded(addr):
      return get_fwd_pointer(addr)  # Already copied
    new_addr = copy_object(addr)
    if cas_fwd_pointer(addr, new_addr):
      return new_addr
    else:
      return get_fwd_pointer(addr)  # Another thread copied object
  return addr  # Common case: nothing to do

References: 1 2
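The copy-then-CAS step in the pseudocode can be modelled in plain Java. This is a toy simulation, not Shenandoah code: `Obj`, its `forwardee` slot and `loadReferenceBarrier` are hypothetical names, and an `AtomicReference` stands in for the forwarding pointer that the real collector keeps in the object header.

```java
import java.util.concurrent.atomic.AtomicReference;

final class ForwardingSketch {

    /** Toy heap object: a payload plus a forwarding slot (null = not forwarded). */
    static final class Obj {
        final int payload;
        final boolean inCollectionSet;
        final AtomicReference<Obj> forwardee = new AtomicReference<>();

        Obj(int payload, boolean inCollectionSet) {
            this.payload = payload;
            this.inCollectionSet = inCollectionSet;
        }
    }

    /** Returns the reference the application should use for {@code obj}. */
    static Obj loadReferenceBarrier(Obj obj, boolean evacuating) {
        if (obj == null || !evacuating || !obj.inCollectionSet) {
            return obj; // common case: nothing to do
        }
        Obj existing = obj.forwardee.get();
        if (existing != null) {
            return existing; // already copied
        }
        Obj copy = new Obj(obj.payload, false); // copy lives outside the collection set
        // Only one thread wins the CAS; losers drop their copy and use the winner's.
        return obj.forwardee.compareAndSet(null, copy) ? copy : obj.forwardee.get();
    }
}
```

Because every thread ends up with the CAS winner's copy, no two threads ever see different copies of the same object.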


Shenandoah JDK Support

  • Available in OpenJDK, except Oracle builds
  • Fixes and updates backported to LTS JDKs

| JDK   | Support | Status                     |
|-------|---------|----------------------------|
| 8     | LTS     | Ready for production       |
| 11    | LTS     | Ready for production       |
| 12-14 | STS     | Experimental, discontinued |
| 15    | STS     | Ready for production       |

References: 1 2


Z Garbage Collector


Z Garbage Collector

  • Developed by Oracle, initially proprietary
  • Inspired by patented Azul C4 GC
  • Pause times independent of heap and live-set size
  • Single generation, multiple regions

Z Garbage Collector

  • Concurrent marking and compaction:
    • Load barriers and colored pointers
      • 64 bit only
      • no compressed oops
    • Off-heap forwarding pointers

Z Garbage Collector

  • Failure modes not documented
  • Windows and macOS support in JDK 14+

References: 1 2 3 4


Colored Pointers

[Figure: ZGC Colored Pointers]

Object address size was changed from 42 to 44 bits in JDK 13 1
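The idea of colored pointers can be illustrated with plain bit arithmetic on a `long`. This is a sketch under assumptions: the 44 address bits come from the slide, while the four metadata bit names, their order and the simplified "bad color" test are illustrative and not taken from the ZGC sources.

```java
final class ColoredPointerSketch {

    static final int ADDRESS_BITS = 44;                       // JDK 13+ per the slide
    static final long ADDRESS_MASK = (1L << ADDRESS_BITS) - 1;

    // Assumed order of the four metadata ("color") bits directly above the address.
    static final long MARKED0     = 1L << ADDRESS_BITS;
    static final long MARKED1     = 1L << (ADDRESS_BITS + 1);
    static final long REMAPPED    = 1L << (ADDRESS_BITS + 2);
    static final long FINALIZABLE = 1L << (ADDRESS_BITS + 3);
    static final long COLOR_MASK  = MARKED0 | MARKED1 | REMAPPED | FINALIZABLE;

    /** Strips the color bits and returns the plain object address. */
    static long address(long pointer) {
        return pointer & ADDRESS_MASK;
    }

    /** Returns the same address tagged with the given color. */
    static long recolor(long pointer, long goodColor) {
        return address(pointer) | goodColor;
    }

    /** Simplified check: bad if any color bit other than the current good one is set. */
    static boolean isBad(long pointer, long goodColor) {
        return (pointer & COLOR_MASK & ~goodColor) != 0;
    }
}
```

With 44 address bits the addressable heap is 2^44 bytes, that is 16 TiB. A load barrier in this model would call `isBad` on each loaded reference and take the slow path only when it returns true.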


Heap Multi-Mapping on x86_64

[Figure: ZGC Heap Multi-Mapping]

References: 1 2


ZGC Load Barrier

Pseudocode:

load_reference_barrier(addr):
  if color(addr) is bad:
    return slow_path(addr)  # mark/relocate/remap, depends on GC phase
  return addr  # Fast path: color is good

References: 1 2


ZGC JDK Support

  • Available in OpenJDK, including Oracle builds
  • No backports ⇒ upgrade for fixes & new features

| JDK   | Support | Status                     |
|-------|---------|----------------------------|
| 11    | LTS     | Experimental, discontinued |
| 12-14 | STS     | Experimental, discontinued |
| 15    | STS     | Production ready           |

References: 1


Choosing and Tuning GCs


When to use which GC?

| Criteria                               | GC                    |
|----------------------------------------|-----------------------|
| Heap ≤ 100 MiB                         | Serial                |
| Single CPU, long pauses ok             | Serial                |
| Maximum throughput, long pauses ok     | Parallel              |
| Minimum latency, reduced throughput ok | Shenandoah or ZGC     |
| Minimum latency, JDK 8 or 11 LTS       | Shenandoah            |
| Large heap, long pauses not ok         | Shenandoah or ZGC     |
| Slow hardware, long pauses not ok      | Shenandoah (or ZGC?)  |
| Balanced / otherwise                   | G1                    |

References: 1 2


Garbage Collector Design Goals

| GC         | Throughput | Pause time |
|------------|------------|------------|
| Serial     | 99%        |            |
| Parallel   | 99%        |            |
| G1         | 90%        | 200 ms     |
| Shenandoah | 85%        | 10 ms      |
| ZGC        | 85%        | 10 ms      |
| Epsilon    | 100%       | 0 ms       |

List actual values: -XX:+UnlockDiagnosticVMOptions -XX:+PrintFlagsFinal

References: 1 2 3
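The effective goals can also be read from inside the process. The sketch below assumes a HotSpot JVM, where `com.sun.management.HotSpotDiagnosticMXBean` is available, and uses the documented meaning of `-XX:GCTimeRatio=N` (GC may take 1 / (1 + N) of total time) to turn the flag into a throughput percentage. The class and helper names are illustrative.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

final class GcGoals {

    /** Throughput goal implied by -XX:GCTimeRatio=N. */
    static double throughputGoal(long gcTimeRatio) {
        return 1.0 - 1.0 / (1.0 + gcTimeRatio);
    }

    /** Reads the effective value of a HotSpot flag; throws IllegalArgumentException if unknown. */
    static String flag(String name) {
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        return bean.getVMOption(name).getValue();
    }

    public static void main(String[] args) {
        long ratio = Long.parseLong(flag("GCTimeRatio"));
        System.out.printf("GCTimeRatio=%d -> throughput goal %.1f%%%n",
                ratio, 100 * throughputGoal(ratio));
        System.out.println("MaxGCPauseMillis=" + flag("MaxGCPauseMillis"));
    }
}
```

A ratio of 99 corresponds to the 99% goal listed for Serial and Parallel, and a ratio of 9 to 90%.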


Generic GC Tuning

  • Set max heap size based on live-set and alloc rate
  • For Parallel and G1, if there is a specific requirement:
    • Set max pause-time or throughput goal
  • Leave other options on default
  • If the goal isn't met, tune according to the next slides
  • If the goal still isn't met, tune according to the references

References: 1 2


Serial GC Tuning

  • Throughput goal not met:
    • Increase max heap size
  • Pauses too long:
    • Choose different GC

References: 1


Parallel GC Tuning

  • Throughput goal not met:
    • Increase max heap size
    • Remove or raise max pause-time goal
  • Pauses too long:
    • Set or lower max pause-time goal

References: 1


G1 GC Tuning

  • Throughput too low:
    • Increase max heap size
    • Raise max pause-time goal
  • Pauses too long:
    • Lower max pause-time goal
    • Disable string deduplication

References: 1 2


ZGC Tuning

  • Throughput too low:
    • Decrease number of GC threads
  • Latency too high / allocation failures:
    • Increase max heap size
    • Increase GC threads

References: 1


Shenandoah Tuning

  • Latency too high / allocation failures:
    • Increase max heap size
    • Change/tune heuristics to run GC sooner
    • Increase allocator thread pacing delay

References: 1


Monitoring GCs

  • GCs provide detailed metrics in logs 1
  • GCs provide partial metrics via MBeans 2 3
  • Each GC reports metrics differently
  • Compare with GC logs to find what is reported
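To see what a given collector exposes via MBeans, a snapshot of the standard `GarbageCollectorMXBean`s is usually enough. This is a minimal sketch with illustrative names; the bean names and what each counter covers depend on the selected GC (G1, for example, typically registers "G1 Young Generation" and "G1 Old Generation").

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.LinkedHashMap;
import java.util.Map;

final class GcMetricsSnapshot {

    /** Maps each collector bean name to {collectionCount, collectionTimeMillis}. */
    static Map<String, long[]> snapshot() {
        Map<String, long[]> result = new LinkedHashMap<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // Both getters return -1 when the collector does not define the value.
            result.put(gc.getName(), new long[] {gc.getCollectionCount(), gc.getCollectionTime()});
        }
        return result;
    }

    public static void main(String[] args) {
        snapshot().forEach((name, v) ->
                System.out.printf("%-30s count=%d time=%d ms%n", name, v[0], v[1]));
    }
}
```

Running this next to a GC log of the same process shows which log events each bean actually counts.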

Monitoring GCs

  • Use tool like GCViewer to interpret logs 1
  • Export metrics to Prometheus with Micrometer 2
    • Supports concurrent GCs in 1.6+ 3
    • Reports all GC phases as pauses
  • Compare with non-GC metrics 4

Java GCs in Containers


JVM Ergonomics in Containers

  • Same rules apply in containers since 8u191 1
    • Serial GC if 1 CPU or < 1792 MiB RAM
  • Red Hat/fabric8 images use run-java.sh 2
    • Defaults to Parallel GC on JVM < 10
    • Sets max heap size by default

Heap Size

  • Optimize performance 1:
    • Set min heap size = max heap size
    • Enable -XX:+AlwaysPreTouch
  • Optimize cost:
    • Select GC with page release
      • G1 (JDK 12+), Shenandoah or ZGC 2 3
    • Don't set min heap size

GraalVM Native Image

  • Community Edition: Serial GC
  • Enterprise Edition: Serial GC, G1 GC (Linux only)
  • Implemented in Java
  • Native image not as efficient as JIT code

References: 1 2 3


Finalization

Finalization is deprecated. Instead use:

  • Explicit clean-up method
  • AutoCloseable
  • PhantomReference and ReferenceQueue

References: 1
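The first two alternatives combine naturally: an explicit `close()` method exposed through `AutoCloseable` and called by try-with-resources. A minimal sketch follows; `NativeBuffer` is a hypothetical resource that only records its lifecycle in a list so the ordering is visible.

```java
import java.util.ArrayList;
import java.util.List;

final class CleanupSketch {

    /** Hypothetical resource that records when it is opened, used and released. */
    static final class NativeBuffer implements AutoCloseable {
        private final List<String> log;
        private boolean closed;

        NativeBuffer(List<String> log) {
            this.log = log;
            log.add("open");
        }

        void write(String data) {
            if (closed) {
                throw new IllegalStateException("buffer already closed");
            }
            log.add("write:" + data);
        }

        @Override
        public void close() {
            if (!closed) { // idempotent, so a second close is harmless
                closed = true;
                log.add("close");
            }
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        try (NativeBuffer buffer = new NativeBuffer(log)) {
            buffer.write("hello");
        } // close() runs here, deterministically, even if write() throws
        return log;
    }
}
```

Unlike a finalizer, the clean-up happens at a known point and does not depend on when, or whether, the GC gets to the object.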


What's Next

  • Check references in slides
  • Look at some benchmarks 1 2 3
  • Verify and tune GC in your apps
  • Collect GC metrics in your apps
  • Read more about GC Theory 4