Comment by jonasn
15 hours ago
Great question! I actually just touched on this in another thread that went up right around the same time you asked this. It is clearly the next big frontier!
The short answer is: it's something I'm actively thinking about, but instrumenting micro-level events (like ZGC's load barriers or G1's write barriers) directly inside application threads without destroying throughput, or creating observer effects that invalidate the measurements, is incredibly difficult.
Do you think it can be done by adjusting GC aggressiveness (or even disabling it for short periods of time) and correlating it with execution time?
That is spot on. Effectively disabling GC to establish a baseline is exactly the methodology used in the Blackburn & Hosking paper [1] I referenced.
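For HotSpot today, the closest practical analogue of "disabling GC" is Epsilon, the no-op collector (JEP 318): it allocates but never reclaims, so a run with it gives you a zero-GC-cycle baseline (provided the heap is sized to fit the whole run). A rough sketch of the comparison, using the standard management API to read the JVM's own GC-time accounting (the class name is mine, and note this mostly captures pause time, not concurrent cycle work):

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Sketch: read the JVM's own accounting of time spent in GC.
// Run once with a real collector (e.g. -XX:+UseG1GC) and once with the
// no-op collector (-XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC)
// to get a baseline with no GC cycles at all. Epsilon never reclaims
// memory, so -Xmx must be large enough for the whole run.
public class GcTime {
    public static void main(String[] args) {
        long totalMs = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            // getCollectionTime() is cumulative, in milliseconds; -1 if unsupported.
            long t = gc.getCollectionTime();
            if (t > 0) totalMs += t;
        }
        System.out.println("reported GC time >= 0: " + (totalMs >= 0));
    }
}
```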
In general, for a production JVM like HotSpot, the implicit cost comes largely from the barriers (instructions baked directly into the application code). So even if we disable GC cycles, those barriers are still executing.
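To make "baked directly into the application code" concrete, here is a toy model of a card-marking write barrier of the kind generational collectors like G1 use. This is purely illustrative (HotSpot's real barriers are a few inlined instructions emitted by the JIT, not a Java method call), but it shows why the cost lands on mutator threads regardless of whether a cycle ever runs:

```java
// Toy model of a card-marking write barrier (illustrative only; the
// constants and class name here are made up, not HotSpot's).
public class CardBarrier {
    static final int CARD_SHIFT = 9;                  // 512-byte "cards"
    static final byte[] CARD_TABLE = new byte[1 << 11];

    // Every reference store in the application pays this cost,
    // even if the collector never runs a single cycle.
    static void writeBarrier(int fieldAddress) {
        CARD_TABLE[fieldAddress >>> CARD_SHIFT] = 1;  // dirty the card
    }

    public static void main(String[] args) {
        writeBarrier(4096);                            // simulate obj.field = ref
        System.out.println("card dirtied: " + (CARD_TABLE[4096 >>> CARD_SHIFT] == 1));
    }
}
```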
If we were to remove the barriers during execution, the hard part becomes maintaining correctness: we would need a way to ensure we don't treat a live (reachable) object as dead the moment we re-enable the collector.
[1] https://dl.acm.org/doi/pdf/10.1145/1029873.1029891
Would running an application with a chosen GC, subtracting the GC time reported by the methods you introduced, and then comparing with an Epsilon-based run be a good estimate of barrier overhead?
Thank you for the well written article!