Comment by higherhalf

1 day ago

clock_gettime() goes through vDSO, avoiding a context switch. It shows up on the flamegraph as well.

Only for some clocks (CLOCK_MONOTONIC, etc.) and some clock sources. For the VIRT/SCHED CPU-time clocks, the vDSO shim still has to invoke the actual syscall. You can't avoid the kernel transition when you need per-thread accounting.

  • Oh, and for some time after its introduction, CLOCK_MONOTONIC_RAW wasn't vDSO'd, and it took a while and some syscall profiling ('huh, why do I see these as syscalls in perf record -e syscalls' ...) to understand what was going on.
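A quick way to see the split described above for yourself: a minimal sketch, assuming Linux with glibc. The helper names `sample_clock` and `run_demo` are made up for illustration. It calls clock_gettime() the same number of times on a clock that usually has a vDSO fast path and on the per-thread CPU clock.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <time.h>

/* Call clock_gettime() n times on one clock id.
 * Returns 0 on success, -1 if any call fails. (Hypothetical helper.) */
int sample_clock(clockid_t id, int n, struct timespec *last)
{
    assert(n > 0 && last != NULL);
    for (int i = 0; i < n; i++) {
        if (clock_gettime(id, last) != 0)
            return -1;
    }
    return 0;
}

/* Sample both clocks 1000 times each and print the last reading.
 * (Hypothetical driver; call it from main().) */
int run_demo(void)
{
    struct timespec ts;

    if (sample_clock(CLOCK_MONOTONIC, 1000, &ts) != 0)
        return -1;
    printf("CLOCK_MONOTONIC          %lld.%09ld\n",
           (long long)ts.tv_sec, ts.tv_nsec);

    if (sample_clock(CLOCK_THREAD_CPUTIME_ID, 1000, &ts) != 0)
        return -1;
    printf("CLOCK_THREAD_CPUTIME_ID  %lld.%09ld\n",
           (long long)ts.tv_sec, ts.tv_nsec);

    return 0;
}
```

If you call run_demo() from main() and run the binary under `strace -e trace=clock_gettime`, I'd expect the CLOCK_THREAD_CPUTIME_ID calls to show up as real syscalls, while the CLOCK_MONOTONIC ones mostly don't. That depends on the kernel, the libc, and the active clock source, so treat it as an experiment rather than a guarantee.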

If you look below the vDSO frame, there is still a syscall. I think that the vDSO implementation is missing a fast path for this particular clock id (it could be implemented though).

edit: agh, no. CLOCK_THREAD_CPUTIME_ID falls through the vDSO to the kernel, which makes sense as it would likely need to look at the task struct.

Here it gets the task struct: https://elixir.bootlin.com/linux/v6.18.5/source/kernel/time/... and here: https://elixir.bootlin.com/linux/v6.18.5/source/kernel/time/... and this is where it actually pulls the value out: https://elixir.bootlin.com/linux/v6.18.5/source/kernel/sched...

Here is the vDSO clock pick logic: https://elixir.bootlin.com/linux/v6.18.5/source/lib/vdso/get... and here is the fallback to the syscall if it's not a vDSO clock: https://elixir.bootlin.com/linux/v6.18.5/source/lib/vdso/get...
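For anyone who doesn't want to click through, here is a rough userspace model of that pick-or-fall-back decision. This is a sketch under assumptions, not the kernel source: the `MODEL_*` names and `model_vdso_handles` are invented, and the three bitmask groups reflect my reading of the generic vDSO code, so check them against the linked files. It also ignores the clock-source side entirely (even a listed clock falls back if the current clock source can't be read from userspace).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <time.h>

/* Assumed upper bound on "simple" clock ids; anything at or above it
 * (including negative ids, once cast to unsigned) is not handled. */
#define MODEL_MAX_CLOCKS 16u
#define MODEL_BIT(n)     (1u << (n))

/* Assumed groups of clock ids that have a userspace fast path. */
#define MODEL_HRES   (MODEL_BIT(CLOCK_REALTIME) | MODEL_BIT(CLOCK_MONOTONIC) | \
                      MODEL_BIT(CLOCK_BOOTTIME) | MODEL_BIT(CLOCK_TAI))
#define MODEL_COARSE (MODEL_BIT(CLOCK_REALTIME_COARSE) | \
                      MODEL_BIT(CLOCK_MONOTONIC_COARSE))
#define MODEL_RAW    (MODEL_BIT(CLOCK_MONOTONIC_RAW))

/* The thread CPU clock passes the range check below, but it is in none
 * of the masks, which is why it ends up in the syscall fallback. */
static_assert(CLOCK_THREAD_CPUTIME_ID < MODEL_MAX_CLOCKS,
              "thread CPU clock id is inside the simple id range");

/* Returns true if this model would answer in userspace,
 * false if it would fall back to the real syscall. */
bool model_vdso_handles(clockid_t clock)
{
    if ((uint32_t)clock >= MODEL_MAX_CLOCKS)
        return false;

    uint32_t msk = MODEL_BIT((uint32_t)clock);
    return (msk & (MODEL_HRES | MODEL_COARSE | MODEL_RAW)) != 0;
}
```

In this model CLOCK_MONOTONIC and CLOCK_MONOTONIC_RAW are answered in userspace, while CLOCK_THREAD_CPUTIME_ID and CLOCK_PROCESS_CPUTIME_ID take the fallback, which matches the "needs the task struct" reasoning above.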