Comment by menaerus
16 days ago
Interesting, because on my machine I can reproduce the results. It's a pretty hefty and recent-ish 5.3GHz Intel i7-13850HX (Raptor Lake) CPU:
----------------------------------------------------------------------------
Benchmark                                   Time           CPU    Iterations
----------------------------------------------------------------------------
bench_getuid                              384 ns        384 ns       1822307
bench_getpid                              382 ns        382 ns       1835289
bench_close                               390 ns        390 ns       1796493
bench_syscall                             374 ns        374 ns       1874165
bench_sched_yield                         611 ns        611 ns       1143456
bench_clock_gettime                      44.1 ns       44.1 ns      15872740
bench_clock_gettime_tai                  44.1 ns       44.1 ns      15879915
bench_clock_gettime_monotonic            44.1 ns       44.1 ns      15887383
bench_clock_gettime_monotonic_raw        44.4 ns       44.4 ns      15755225
bench_nanosleep0                        55617 ns       4647 ns        100000
bench_nanosleep0_slack1                  7144 ns       4362 ns        160448
bench_nanosleep1_slack1                  7159 ns       4369 ns        160645
bench_pthread_cond_signal                7.38 ns       7.38 ns      94670062
bench_assign                            0.523 ns      0.523 ns    1000000000
bench_sqrt                               8.04 ns       8.04 ns      86998912
bench_sqrtrec                            11.4 ns       11.4 ns      61428535
bench_nothing                           0.000 ns      0.000 ns    1000000000
EDIT: also reproducible on my Skylake-X (Gold 6152) machine.
With turbo boost enabled (@3.7GHz):
----------------------------------------------------------------------------
Benchmark                                   Time           CPU    Iterations
----------------------------------------------------------------------------
bench_getuid                              619 ns        616 ns       1153007
bench_getpid                              632 ns        627 ns       1150829
bench_close                               629 ns        626 ns       1110226
bench_syscall                             617 ns        613 ns       1160239
bench_sched_yield                         974 ns        969 ns        702773
bench_clock_gettime                      17.9 ns       17.8 ns      39368735
bench_clock_gettime_tai                  17.8 ns       17.7 ns      39109544
bench_clock_gettime_monotonic            17.9 ns       17.8 ns      39591364
bench_clock_gettime_monotonic_raw        19.0 ns       18.8 ns      38902038
bench_nanosleep0                        63993 ns       4381 ns        100000
bench_nanosleep0_slack1                  7445 ns       2115 ns        328474
bench_nanosleep1_slack1                  7346 ns       2111 ns        334833
bench_pthread_cond_signal                2.13 ns       2.12 ns     327903411
bench_assign                            0.167 ns      0.166 ns    1000000000
bench_sqrt                               1.87 ns       1.85 ns     374885774
bench_sqrtrec                           0.000 ns      0.000 ns    1000000000
bench_nothing                           0.000 ns      0.000 ns    1000000000
With turbo-boost disabled (@2.1GHz base frequency):
----------------------------------------------------------------------------
Benchmark                                   Time           CPU    Iterations
----------------------------------------------------------------------------
bench_getuid                             1019 ns       1012 ns        688965
bench_getpid                             1057 ns       1048 ns        688020
bench_close                              1039 ns       1029 ns        684537
bench_syscall                            1010 ns       1003 ns        696919
bench_sched_yield                        1653 ns       1642 ns        434212
bench_clock_gettime                      30.7 ns       30.4 ns      22999055
bench_clock_gettime_tai                  30.5 ns       30.2 ns      23716873
bench_clock_gettime_monotonic            29.8 ns       29.6 ns      23643198
bench_clock_gettime_monotonic_raw        30.5 ns       30.3 ns      23277717
bench_nanosleep0                        65256 ns       5114 ns        100000
bench_nanosleep0_slack1                 11649 ns       3402 ns        197983
bench_nanosleep1_slack1                 11572 ns       3528 ns        209371
bench_pthread_cond_signal                3.62 ns       3.60 ns     195696177
bench_assign                            0.255 ns      0.253 ns    1000000000
bench_sqrt                               3.13 ns       3.10 ns     225561559
bench_sqrtrec                           0.000 ns      0.000 ns    1000000000
bench_nothing                           0.000 ns      0.000 ns    1000000000
I wonder why your results are so different. Mine scale almost linearly with the core frequency.
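To put a number on "almost linearly", here is a rough C++ sketch that divides a few base-clock timings by the matching turbo timings from the two Gold 6152 tables and prints them next to the nominal clock ratio 3.7/2.1 (about 1.76). The helper names `slowdown` and `print_scaling` are made up for illustration. That the core actually sustained 3.7GHz during the run is an assumption; only the timings are taken from the tables.

```cpp
#include <cassert>
#include <cstdio>

// How many times slower the base-clock run is than the turbo run.
double slowdown(double base_ns, double turbo_ns) {
    assert(turbo_ns > 0.0);
    return base_ns / turbo_ns;
}

// Prints the per-benchmark slowdown next to the nominal clock ratio.
void print_scaling() {
    // Assumption: nominal frequencies quoted in the comment (3.7GHz turbo, 2.1GHz base).
    const double clock_ratio = 3.7 / 2.1;

    struct Row { const char* name; double turbo_ns; double base_ns; };
    // Wall-clock "Time" column values copied from the two tables.
    const Row rows[] = {
        {"bench_getuid",        619.0, 1019.0},
        {"bench_sched_yield",   974.0, 1653.0},
        {"bench_clock_gettime",  17.9,   30.7},
        {"bench_sqrt",           1.87,   3.13},
    };

    std::printf("nominal clock ratio: %.2f\n", clock_ratio);
    for (const Row& r : rows) {
        std::printf("%-22s %.2fx\n", r.name, slowdown(r.base_ns, r.turbo_ns));
    }
}
```

Calling `print_scaling()` from `main` prints the ratios. The selected rows land between roughly 1.65x and 1.72x, a bit under the nominal 1.76x.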
Something is definitely up. Is there a VM? Are you running in a container with seccomp?
Why are your calls to sqrt so slow on your newest machine? Why is sqrtrec free on the others?
No VM, no container. I could check the asm later on, but sqrtrec is likely "free" because it was optimized away. There are no fences in the code either, so this might be an artifact of different versions of gcc being used across the two platforms.
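For illustration, a minimal sketch of how an unused result can be optimized away, and one common way to prevent that on GCC/Clang. The benchmark source is not shown in this thread, so every name here (`keep_alive`, `sqrt_loop_discarded`, `sqrt_loop_kept`) is hypothetical, and the empty inline-asm barrier is only an assumption about what a "fence" in such a benchmark could look like.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (GCC/Clang only): an empty asm statement that claims to
// read and write `value` in memory, so the compiler cannot prove the value is
// unused or unchanged and has to keep the computation that produced it.
template <typename T>
inline void keep_alive(T& value) {
    asm volatile("" : "+m"(value) : : "memory");
}

// The sqrt result is never used, so an optimizing compiler is free to delete
// the call (and then the whole loop). A timing of ~0 ns would be the symptom.
double sqrt_loop_discarded(int iterations) {
    assert(iterations >= 0);
    double x = 2.0;
    for (int i = 0; i < iterations; ++i) {
        double r = std::sqrt(x);
        (void)r;  // silences the warning, does not keep the work alive
    }
    return x;
}

// Same loop, but the input and the result pass through keep_alive, so the
// sqrt has to be computed on every iteration.
double sqrt_loop_kept(int iterations) {
    assert(iterations >= 0);
    double x = 2.0;
    double r = 0.0;
    for (int i = 0; i < iterations; ++i) {
        keep_alive(x);     // compiler must assume x may have changed
        r = std::sqrt(x);
        keep_alive(r);     // compiler must assume r is observed
    }
    return r;
}
```

Whether the first loop is really removed depends on the compiler version and flags, which would fit with the same benchmark looking "free" under one gcc and not under another.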
As for sqrt, I don't think it is unusually slow compared with the results in the table above. It's definitely not an outlier: the recorded range is from 1ns to 15ns, and I recorded 8ns. Why that is so is not the question here.
A better question is: why are your results such a big outlier?
Are you sure they're outliers? Here's someone else with similar results:
https://arkanis.de/weblog/2017-01-05-measurements-of-system-...
Google also reported similar numbers in 2011, when publicizing their fiber work.
I can also get similar numbers (~68ns) on 9front, though a little higher.