Comment by weinzierl
10 months ago
Also I've heard that the whole .eh_frame unwinding is more fragile than a simple frame pointer. I've seen enough broken stack traces myself, but honestly I never tried if -fno-omit-frame-pointer would have helped.
Yes and no. A simple frame pointer needs to be present in all libraries, and depending on build settings, this might not be the case. .eh_frame tends to be emitted almost everywhere...
So it's both similarly fragile, but one is almost never disabled.
The broader point is: For HLL runtimes you need to be able to switch between native and interpreted unwinds anyhow, so you'll always do some amount of lifting in eBPF land.
And yes, having frame pointers removes a lot of complexity, so it's net a very good thing. It's just that the situation wasnt nearly as dire as described, because people that care about profiling had built solutions.
Forget eBPF even -- why do the job of userspace in the kernel? Instead of unwinding via eBPF, we should ask userspace to unwind itself using a synchronous signal delivered to userspace whenever we've requested a stack sample.
Context switches are incredibly expensive. Given the sampling rate of eBPF profilers all the useful information would get lost in the context switch noise.
Things get even more complicated because context switches can mean CPU migrations, making many of your data useless.
1 reply →