Comment by rwmj
12 hours ago
Let me test that now. Note I only have one Intel machine, so any results are very specific to this laptop.
-j            time (mean ± σ)
12 (#P+#E)    130.889 s ± 4.072 s
13 (..+1)     135.049 s ± 2.270 s
4  (#P)       179.845 s ± 1.783 s
8  (#E)       141.669 s ± 3.441 s
Machine: 13th Gen Intel(R) Core(TM) i7-1365U; 2 x P-cores (4 threads), 8 x E-cores
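To make the table easier to compare at a glance, here is a small sketch that expresses each mean as a multiple of the fastest one. The numbers are copied from the table above; the names `timings` and `relative_to_fastest` are made up for illustration and are not part of the original measurement setup.

```python
# Timings copied from the table above: -j value -> (mean seconds, sigma seconds).
timings = {
    12: (130.889, 4.072),  # #P threads + #E cores
    13: (135.049, 2.270),  # one more than that
    4: (179.845, 1.783),   # #P threads only
    8: (141.669, 3.441),   # #E cores only
}


def relative_to_fastest(results):
    """Return {jobs: mean / fastest_mean} for a dict of (mean, sigma) tuples."""
    fastest = min(mean for mean, _sigma in results.values())
    return {jobs: mean / fastest for jobs, (mean, _sigma) in results.items()}


# Print one line per -j value, ordered by job count.
for jobs, ratio in sorted(relative_to_fastest(timings).items()):
    print(f"-j{jobs}: {ratio:.2f}x the fastest mean")
```

Note that the gap between -j12 and -j13 is about the size of the reported σ for -j12, so those two rows may not be meaningfully different on this one laptop.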
Your processor has two P cores, and ten cores total, not twelve. The HyperThreading (SMT) does not make the two P cores into four cores. Your experiment with 4 threads will most likely result in using both P cores and two E cores, as no sane OS would double up threads on the P cores before the E cores were full with one thread each.
The hyperthreading should cover up memory latency, since the workload (compiling qemu) might not fit into L3 cache. Although I take your point that it doesn't magically create two core-equivalents.
“Hyperthreading” is a write pipe hack.
If the core stalls on a write then the other thread gets run.
I am sure rwmj was smart enough to use `taskset` to make this experiment meaningful.
Hehe, if only :-( However I do want to know what's best with the default Linux scheduler and just using 'make' rather than more complicated commands.
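For anyone who wants to repeat the run with pinning rather than the default scheduler, the following is a rough sketch of a `taskset`-style approach in Python. It assumes a Linux kernel that exposes the hybrid core lists at `/sys/devices/cpu_core/cpus` and `/sys/devices/cpu_atom/cpus`; that path layout is an assumption here and should be checked on the actual machine. The helper names are hypothetical.

```python
import os


def parse_cpu_list(text):
    """Parse a Linux CPU list string such as "0-3,8" into a set of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def read_core_type_cpus(pmu_name):
    """Read the logical CPU IDs for one core type.

    Assumption: pmu_name is "cpu_core" (P-cores) or "cpu_atom" (E-cores) and
    the kernel exposes /sys/devices/<pmu_name>/cpus on this hybrid CPU.
    """
    with open(f"/sys/devices/{pmu_name}/cpus") as f:
        return parse_cpu_list(f.read())


def pin_current_process(cpus):
    """Restrict this process to the given CPUs (Linux only).

    Child processes started afterwards, such as make, inherit the mask.
    """
    os.sched_setaffinity(0, cpus)
```

A benchmark script could call `pin_current_process(read_core_type_cpus("cpu_atom"))` and then start `make -j8` via `subprocess.run`, so that the "#E" row really runs on E-cores only. This answers a different question from the unpinned runs above, which measure what the default scheduler does with plain `make`.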