Comment by ori_b
16 days ago
I can't reproduce. The code is here: https://github.com/gsauthof/osjitter/blob/master/bench_sysca... Here are the numbers on the computers I have:
AMD Ryzen 7 9700X Desktop:
----------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations
----------------------------------------------------------------------------
bench_getuid                            38.6 ns         38.5 ns     18160546
bench_getpid                            39.9 ns         39.9 ns     17703749
bench_close                             45.2 ns         45.1 ns     15711379
bench_syscall                           42.2 ns         42.1 ns     16638675
bench_sched_yield                       81.7 ns         81.6 ns      8623522
bench_clock_gettime                     15.9 ns         15.9 ns     44010857
bench_clock_gettime_tai                 15.9 ns         15.9 ns     43997256
bench_clock_gettime_monotonic           15.9 ns         15.9 ns     44012908
bench_clock_gettime_monotonic_raw       15.9 ns         15.9 ns     43982277
bench_nanosleep0                       49961 ns          370 ns       100000
bench_nanosleep0_slack1                10839 ns          351 ns      1000000
bench_nanosleep1_slack1                10878 ns          358 ns      1000000
bench_pthread_cond_signal               1.37 ns         1.37 ns    503715097
bench_assign                           0.563 ns        0.562 ns   1000000000
bench_sqrt                              1.63 ns         1.63 ns    430096636
bench_sqrtrec                           5.33 ns         5.33 ns    132574542
bench_nothing                          0.394 ns        0.394 ns   1000000000
12th Gen Intel(R) Core(TM) i5-12600H:
----------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations
----------------------------------------------------------------------------
bench_getuid                            70.0 ns         70.0 ns      9985369
bench_getpid                            71.6 ns         71.6 ns      9763016
bench_close                             76.7 ns         76.7 ns      9131090
bench_syscall                           66.8 ns         66.8 ns     10533946
bench_sched_yield                        160 ns          160 ns      4377987
bench_clock_gettime                     12.2 ns         12.2 ns     57432496
bench_clock_gettime_tai                 12.1 ns         12.1 ns     57826299
bench_clock_gettime_monotonic           12.2 ns         12.2 ns     57736141
bench_clock_gettime_monotonic_raw       12.3 ns         12.3 ns     57070425
bench_nanosleep0                       63154 ns        11834 ns        55756
bench_nanosleep0_slack1                 2933 ns         1700 ns       348675
bench_nanosleep1_slack1                 2654 ns         1479 ns       467420
bench_pthread_cond_signal               1.39 ns         1.39 ns    483995101
bench_assign                           0.868 ns        0.868 ns    821103909
bench_sqrt                              1.69 ns         1.69 ns    422094139
bench_sqrtrec                           4.06 ns         4.06 ns    174511095
bench_nothing                          0.750 ns        0.750 ns    941204159
AMD Ryzen 5 PRO 7545U Laptop:
----------------------------------------------------------------------------
Benchmark                                  Time             CPU   Iterations
----------------------------------------------------------------------------
bench_getuid                             106 ns          106 ns      6581746
bench_getpid                             111 ns          111 ns      6271878
bench_close                              116 ns          116 ns      5944154
bench_syscall                           85.9 ns         85.9 ns      7317584
bench_sched_yield                        315 ns          315 ns      2249333
bench_clock_gettime                     17.6 ns         17.6 ns     39935693
bench_clock_gettime_tai                 17.6 ns         17.6 ns     39920957
bench_clock_gettime_monotonic           17.5 ns         17.5 ns     39962966
bench_clock_gettime_monotonic_raw       17.5 ns         17.5 ns     39561163
bench_nanosleep0                       52720 ns         3058 ns       100000
bench_nanosleep0_slack1                13815 ns         2969 ns       244790
bench_nanosleep1_slack1                13710 ns         2722 ns       254666
bench_pthread_cond_signal               2.66 ns         2.66 ns    264735233
bench_assign                           0.930 ns        0.930 ns    813279743
bench_sqrt                              2.43 ns         2.43 ns    286953468
bench_sqrtrec                           5.67 ns         5.67 ns    123889652
bench_nothing                          0.812 ns        0.812 ns    860562208
So, I've tested multiple times in multiple ways, and the results don't seem to match.
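For anyone who wants a cross-check that doesn't go through the benchmark harness at all, here is a minimal sketch of a hand-rolled timing loop. It is not the code from the linked repo: `ns_per_getpid` is a name I made up, and it assumes Linux with `syscall(2)` and `SYS_getpid` available. It times the loop overhead together with the syscall, so treat the result as an upper bound.

```cpp
#include <chrono>
#include <cstdint>
#include <sys/syscall.h>
#include <unistd.h>

// Hypothetical helper (not from the osjitter repo): average wall-clock
// nanoseconds per raw getpid syscall over `iters` calls.
// syscall(2) is used directly so that no libc-level caching can skip
// the kernel entry.
double ns_per_getpid(std::uint64_t iters) {
    auto start = std::chrono::steady_clock::now();
    for (std::uint64_t i = 0; i < iters; ++i) {
        syscall(SYS_getpid);  // opaque external call, not removed by the compiler
    }
    auto stop = std::chrono::steady_clock::now();
    std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(iters);
}
```

Calling `ns_per_getpid(10'000'000)` from `main` and printing the result should land in the same ballpark as the `bench_getpid` row on the same machine if the harness numbers are sound.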
Interesting, because on my machine I can reproduce the results. It's a pretty hefty 5.3GHz and recentish (Raptor Lake) Intel i7-13850HX CPU:
EDIT: also reproducible on my Skylake-X (Gold 6152) machine.
With turbo boost enabled (@3.7GHz):
With turbo boost disabled (@2.1GHz base frequency):
I wonder why your results are so different. Mine scale almost linearly with the core frequency.
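To make the "scales linearly with frequency" claim checkable, a small sketch of the arithmetic: if an operation costs a fixed number of core cycles, then cycles = ns × GHz, and the time at another clock is just the inverse ratio. The function names are made up for illustration, and the 100 ns input in the usage note is a hypothetical value, not a measurement from this thread. Only the 3.7GHz and 2.1GHz clocks come from the comment above.

```cpp
// Convert a measured latency to approximate core cycles: ns * GHz = cycles.
constexpr double ns_to_cycles(double ns, double ghz) {
    return ns * ghz;
}

// If the cost is a fixed cycle count, the time at another clock frequency
// scales inversely with that frequency.
constexpr double expected_ns(double ns_at_a, double ghz_a, double ghz_b) {
    return ns_at_a * ghz_a / ghz_b;
}
```

For example, a hypothetical 100 ns call at 3.7GHz is about 370 cycles, and `expected_ns(100.0, 3.7, 2.1)` predicts roughly 176 ns at 2.1GHz. Results that deviate a lot from this prediction point at something other than core clock (memory, mitigations, power management).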
Something is definitely up. Is there a VM? Are you running in a container with seccomp?
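A quick way to answer the seccomp part from inside the process is to read the `Seccomp:` field of `/proc/self/status` (0 = disabled, 1 = strict, 2 = filter). A minimal sketch, assuming Linux with procfs mounted; both function names are mine, not from any library:

```cpp
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: extract the "Seccomp:" value from the text of a
// /proc/<pid>/status file. Returns -1 if the field is not present.
int parse_seccomp_mode(const std::string& status_text) {
    std::istringstream in(status_text);
    std::string line;
    const std::string key = "Seccomp:";
    while (std::getline(in, line)) {
        // Matches "Seccomp:" only, not the separate "Seccomp_filters:" line.
        if (line.compare(0, key.size(), key) == 0) {
            return std::stoi(line.substr(key.size()));  // stoi skips the tab
        }
    }
    return -1;
}

// Reads the status of the calling process; -1 if procfs is unavailable.
int current_seccomp_mode() {
    std::ifstream f("/proc/self/status");
    std::stringstream buf;
    buf << f.rdbuf();
    return parse_seccomp_mode(buf.str());
}
```

A return value of 2 from `current_seccomp_mode()` means a filter is installed, which adds work to every syscall entry and would inflate the syscall rows.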
Why are your calls to sqrt so slow on your newest machine? Why is sqrtrec free on the others?
No VM, no container. I could check the asm later on, but sqrtrec is likely "free" because it was optimized away. There are no fences in the code either, so this might be an artifact of different versions of gcc being used on the two platforms.
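To illustrate what "optimized away" and "no fences" mean here, a minimal sketch of a compiler barrier in the spirit of Google Benchmark's `DoNotOptimize`. This is not the repo's code: `do_not_optimize` and `sqrt_loop` are names invented for the example, and the inline asm is a GCC/Clang extension.

```cpp
#include <cmath>

// Empty asm statement that claims to read and write `value` in memory.
// The compiler must materialize the value and assume it may have changed,
// so the surrounding computation can be neither hoisted nor deleted.
template <typename T>
inline void do_not_optimize(T& value) {
    asm volatile("" : "+m"(value) : : "memory");
}

// Hypothetical benchmark body: without the two barriers, a compiler may
// compute sqrt(x) once outside the loop, or drop the loop entirely if the
// result is unused.
double sqrt_loop(double x, long iters) {
    double acc = 0.0;
    for (long i = 0; i < iters; ++i) {
        do_not_optimize(x);    // input looks freshly modified each iteration
        acc += std::sqrt(x);
        do_not_optimize(acc);  // result looks observed each iteration
    }
    return acc;
}
```

If a sub-nanosecond sqrtrec number goes up once barriers like these are in place, the earlier figure was measuring an empty loop rather than the operation.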
As for sqrt, I don't think it is unusually slow compared with the results in the table above. It's definitely not an outlier: the recorded range is from 1ns to 15ns, and I recorded 8ns. Why that is so is not the question here.
The better question is: why are your results such a big outlier?