Comment by pajko

10 months ago

There's another option: https://lesenechal.fr/en/linux/unwinding-the-stack-the-hard-...

Brendan mentions DWARF unwinding, actually, and briefly mentions why he considers it insufficient.

  • The biggest objection seems to be the Java/JIT case. eh_frame supports a "personality function" which is AIUI basically a callback for performing custom unwinding. If the personality function could also support custom logic for producing backtraces, then the profiling sampler could effectively read the JVM's own metadata about the JIT'ted code, which I assume it must have in order to produce backtraces for the JVM itself.
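To make the callback idea in the reply above concrete, here is a minimal sketch of an unwinder that dispatches each frame to a per-region "step" callback, so a JIT could register its own logic for its code cache. Everything here is hypothetical and for illustration only: `Unwinder`, `native_step`, `jit_step`, the address ranges, and the frame layouts are made up, and this is not how `.eh_frame` personality routines or any real JVM actually work.

```python
from typing import Callable, Dict, List, Optional, Tuple

Regs = Tuple[int, int]    # (program counter, stack pointer)
Memory = Dict[int, int]   # address -> word; stands in for the target's memory
StepFn = Callable[[Regs, Memory], Optional[Regs]]


class Unwinder:
    """Walks a stack by asking the callback that owns each PC for the caller."""

    def __init__(self) -> None:
        self._regions: List[Tuple[int, int, StepFn]] = []

    def register(self, start: int, end: int, step: StepFn) -> None:
        # A runtime would call this for each code region it knows how to unwind.
        self._regions.append((start, end, step))

    def backtrace(self, regs: Optional[Regs], memory: Memory,
                  max_frames: int = 64) -> List[int]:
        pcs: List[int] = []
        while regs is not None and len(pcs) < max_frames:
            pc, _sp = regs
            pcs.append(pc)
            # Find the callback whose region contains this PC, if any.
            step = next((s for lo, hi, s in self._regions if lo <= pc < hi), None)
            regs = step(regs, memory) if step is not None else None
        return pcs


def native_step(regs: Regs, memory: Memory) -> Optional[Regs]:
    # Assumed layout: return address at [sp], fixed 16-byte frames.
    _pc, sp = regs
    ret = memory.get(sp)
    return None if ret is None else (ret, sp + 16)


def jit_step(regs: Regs, memory: Memory) -> Optional[Regs]:
    # Assumed layout from the JIT's own metadata: return address at [sp + 8],
    # fixed 32-byte frames.
    _pc, sp = regs
    ret = memory.get(sp + 8)
    return None if ret is None else (ret, sp + 32)


unwinder = Unwinder()
unwinder.register(0x4000, 0x5000, native_step)  # pretend native code
unwinder.register(0x7000, 0x8000, jit_step)     # pretend JIT code cache

# One JIT frame called from two native frames.
memory = {0x1008: 0x4020, 0x1020: 0x4100}
trace = unwinder.backtrace((0x7010, 0x1000), memory)
print([hex(pc) for pc in trace])
```

The point of the design is that the sampler never needs to understand JIT frames itself; it only needs a way to look up who owns a PC and ask them for the caller's registers.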

DWARF unwinding isn't practical: https://rwmj.wordpress.com/2023/02/14/frame-pointers-vs-dwar...

  • TBH this sounds more like perf's implementation is bad.

    I'm waiting for this to happen: https://github.com/open-telemetry/community/issues/1918

    • There's always room for improvement. For example, Samply [0] is a wonderful profiler that uses the same APIs as `perf` but unwinds stacks as they arrive, rather than dumping them all to disk and processing them in bulk afterwards.

      Samply unwinds significantly faster than `perf` because it caches unwind information.

      That said, this approach still has limitations. For example, very deep stacks won't be fully unwound, because the amount of process stack the kernel copies into each sample is quite limited.

      - [0]: https://github.com/mstange/samply
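The two points in the comment above (cache the unwind information, and deep stacks get cut off because only part of the stack is copied) can be illustrated with a toy sketch. This is not Samply's or `perf`'s actual code: `StreamingUnwinder`, `parse_table`, the module name, and the frame layout (return address in the last 8 bytes of each frame) are all assumptions made up for the example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical unwind table: PC -> size in bytes of that function's frame.
UnwindTable = Dict[int, int]


class StreamingUnwinder:
    """Unwinds each sample as it arrives, parsing a module's table only once."""

    def __init__(self, parse_table: Callable[[str], UnwindTable]) -> None:
        self._parse_table = parse_table
        self._cache: Dict[str, UnwindTable] = {}
        self.parse_count = 0  # how many times the expensive parse actually ran

    def _table(self, module: str) -> UnwindTable:
        if module not in self._cache:
            self._cache[module] = self._parse_table(module)
            self.parse_count += 1
        return self._cache[module]

    def unwind(self, module: str, pc: int, stack_copy: bytes) -> Tuple[List[int], bool]:
        """Return (pcs, truncated) for one sample's copied stack bytes."""
        table = self._table(module)
        pcs = [pc]
        offset = 0
        while pc in table:
            end = offset + table[pc]
            if end > len(stack_copy):
                # The caller's return address lies beyond the bytes we were
                # given, so the rest of the stack cannot be recovered.
                return pcs, True
            pc = int.from_bytes(stack_copy[end - 8:end], "little")
            pcs.append(pc)
            offset = end
        return pcs, False  # reached a PC with no unwind info; stop there


def parse_table(module: str) -> UnwindTable:
    # Stand-in for parsing real unwind sections from a binary on disk.
    return {0x10: 16, 0x20: 16}


def word(value: int) -> bytes:
    return value.to_bytes(8, "little")


# Two 16-byte frames: 0x10 returns to 0x20, which returns to 0x30.
full_stack = word(0) + word(0x20) + word(0) + word(0x30)

unwinder = StreamingUnwinder(parse_table)
full = unwinder.unwind("libfoo.so", 0x10, full_stack)
short = unwinder.unwind("libfoo.so", 0x10, full_stack[:24])
print(full, short, unwinder.parse_count)
```

With the full copy, the walk reaches the outermost frame; with the shortened copy, it stops early and reports truncation, which is the deep-stack limitation described above. The second sample reuses the cached table instead of parsing it again.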