Comment by adrian_b

1 day ago

As a general rule, the amount of physical memory installed in a computer should also be proportional to the number of hardware threads provided by its CPU.

Besides the fact that the operating system may allocate some memory for each thread, when you launch a multi-threaded application that is able to use all available threads, for instance the compilation of a big software project, it will frequently allocate working memory in an amount proportional to the number of worker threads.

I have encountered many multi-threaded applications that need up to 2 GB per thread to work well.

This corresponds to having 64 GB for a desktop CPU with 32 threads, like the Ryzen 9 9950X.

For the compilation example, I have seen software projects, like Chrome/Chromium and its derivatives, that need an amount of memory proportional to the number of hardware threads. If you do not have enough, e.g. when you have only 32 GB for a 16 core/32 thread CPU, you must reduce the number of concurrent compilations, e.g. with an appropriate parameter to "make -j", leaving some threads and cores idle, because otherwise you may encounter out-of-memory errors.
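A rough sketch of how one might pick the "make -j" value from the available memory. The 2 GB per job figure is the one mentioned above; the function name and the reserve_gb knob (memory left for the OS and other programs) are my own assumptions, not something any build tool provides.

```python
def max_parallel_jobs(total_ram_gb, gb_per_job, hw_threads, reserve_gb=0.0):
    """Largest job count whose combined working memory fits in RAM.

    The result is capped at hw_threads and is never lower than 1.
    reserve_gb is a hypothetical safety margin, not a measured value.
    """
    usable_gb = max(total_ram_gb - reserve_gb, 0.0)
    jobs_by_memory = int(usable_gb // gb_per_job)
    return max(1, min(hw_threads, jobs_by_memory))


if __name__ == "__main__":
    # 32 GB of RAM, 32 hardware threads, 2 GB assumed per compile job.
    jobs = max_parallel_jobs(total_ram_gb=32, gb_per_job=2, hw_threads=32)
    print(f"make -j{jobs}")  # prints "make -j16"
```

With 64 GB the same call returns 32, so every hardware thread can be used. The per-job figure has to be measured for each project, since it varies a lot.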

> when you have only 32 GB for a 16 core/32 thread CPU, you must reduce the number of concurrent compilations

Also, depending on the architecture, avoiding odd (or even) virtual cores might free more L2 or L3 for the worker threads and speed up the process.
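A minimal sketch of the odd/even idea on Linux, assuming the two SMT siblings of a core are numbered as adjacent logical CPUs (2k and 2k+1). That numbering is not guaranteed; on some machines the siblings are enumerated differently, so check /sys/devices/system/cpu/cpu0/topology/thread_siblings_list first. The helper names here are hypothetical.

```python
import os


def every_other_cpu(cpus, keep_even=True):
    """Return only the even-numbered (or odd-numbered) logical CPU ids."""
    wanted = 0 if keep_even else 1
    return {cpu for cpu in cpus if cpu % 2 == wanted}


def pin_current_process(keep_even=True):
    """Restrict this process to every other logical CPU (Linux only).

    Child processes started afterwards inherit the affinity mask, so a
    build launched from here stays on the selected CPUs.
    """
    allowed = os.sched_getaffinity(0)
    subset = every_other_cpu(allowed, keep_even)
    if subset:  # never apply an empty mask
        os.sched_setaffinity(0, subset)
    return subset
```

Calling pin_current_process() before spawning the build leaves one logical CPU per core in use, if the numbering assumption holds. Whether that actually speeds anything up has to be measured.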

Compiling flash-attn (Flash Attention) is another great stress test for CPU+RAM, as just using 16 threads can already balloon you into 128 GB RAM usage territory. Same thing there: you need to limit concurrency when compiling it.

  • I have this problem with NixOS, as one of my build servers doesn't have enough RAM. There doesn't seem to be a way to know whether a compilation is likely to be RAM-heavy, and then either use a tagged server with more RAM or use fewer threads on servers with less RAM.

It's an important point. I went from 4c/8t and 32 GB to 16c/32t and 96 GB. Dramatically less memory per thread. Some software (looking at you, Vivado) can take incredible amounts of memory per parallel job, so some projects can only run on a subset of my cores. At least until I stepped up my work laptop to 10.66 GB/thread. That seems to be manageable.

Yes! I have also observed that with compilation VMs on a big server.