Comment by kouteiheika

10 months ago

I regularly profile heavy, time-sensitive workloads (as in: if the code takes too long to run, it breaks), and I even do non-sampling memory profiling (meaning: on every memory allocation and deallocation I grab a full backtrace, which is orders of magnitude more data than normal sampling profiling). It works just fine with minimal slowdown, even though I get the unwinding info from vanilla DWARF.
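To make "non-sampling" concrete, here is a toy sketch in Python. It is not the tooling described above: `TracingAllocator` and `live_bytes_by_stack` are hypothetical names, the "allocator" only hands out integer handles, and the backtrace comes from Python's `traceback` module rather than from DWARF. The only point is that every single allocation and deallocation records a full stack, with nothing skipped.

```python
import traceback
from collections import Counter


class TracingAllocator:
    """Toy allocator (hypothetical) that records a backtrace on every alloc and free."""

    def __init__(self):
        self.live = {}      # handle -> (size, stack captured at allocation)
        self.events = []    # every alloc/free event; nothing is sampled away
        self._next_handle = 1

    def _backtrace(self):
        # Function names from the outermost caller to the innermost one,
        # dropping this helper and the alloc/free method that called it.
        return tuple(frame.name for frame in traceback.extract_stack()[:-2])

    def alloc(self, size):
        handle = self._next_handle
        self._next_handle += 1
        stack = self._backtrace()
        self.live[handle] = (size, stack)
        self.events.append(("alloc", handle, size, stack))
        return handle

    def free(self, handle):
        size, _ = self.live.pop(handle)
        self.events.append(("free", handle, size, self._backtrace()))


def live_bytes_by_stack(allocator):
    """Sum the bytes still allocated, grouped by the stack that allocated them."""
    totals = Counter()
    for size, stack in allocator.live.values():
        totals[stack] += size
    return totals
```

A sampling profiler would record only a fraction of these events. Here the event list grows with every call, which is why the cost of each individual unwind matters so much.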

Granted, this is using optimized tooling which uses a bunch of tricks to sidestep the problem of DWARF being slow. I also only profile native code (and some VMs which do ahead-of-time codegen), and I've never worked with the JVM. In principle, though, I don't see why it wouldn't be practical on the JVM too, although it certainly would be harder and might require better tooling (which might not exist currently). If you have the luxury of enabling frame pointers, then that certainly would be easier and simpler.
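As a rough illustration of why frame pointers are the simpler option, here is a sketch of a frame-pointer unwinder over simulated stack memory. The assumptions are mine: a 64-bit target using the conventional x86-64 layout, where the word at the frame pointer holds the caller's saved frame pointer and the next word up holds the return address. The `memory` dict and the function name are hypothetical stand-ins for reading a real process's stack.

```python
WORD = 8  # bytes per machine word on the assumed 64-bit target


def unwind_with_frame_pointers(memory, frame_pointer, max_depth=128):
    """Return the return addresses on the stack, innermost frame first.

    `memory` maps an address to the word stored there (a simulated stack).
    """
    return_addresses = []
    while frame_pointer != 0 and len(return_addresses) < max_depth:
        saved_fp = memory[frame_pointer]                          # caller's frame pointer
        return_addresses.append(memory[frame_pointer + WORD])     # return address
        # The stack grows downward, so the caller's frame must sit at a
        # higher address; anything else means the chain is corrupt.
        if saved_fp != 0 and saved_fp <= frame_pointer:
            break
        frame_pointer = saved_fp
    return return_addresses


# Three simulated frames; the outermost one has a saved frame pointer of 0.
memory = {
    0x7000: 0x7040, 0x7008: 0x401234,
    0x7040: 0x7080, 0x7048: 0x401300,
    0x7080: 0x0,    0x7088: 0x401400,
}
trace = unwind_with_frame_pointers(memory, 0x7000)
```

The whole walk is a linked-list traversal with two memory reads per frame. A DWARF-based unwinder instead has to find the unwind table entry covering each return address and evaluate its rules to recover the caller's registers, which is where the extra cost and complexity come from.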

(Somewhat related, but I really wish we would standardize on something better than DWARF for unwinding tables and basic debug info. Having done a lot of work with DWARF and its complexity, I wouldn't wish it upon my worst enemy.)