Comment by embedding-shape

1 day ago

Compiling flash-attn (Flash Attention) is another great stress test for CPU and RAM: using just 16 threads can already balloon into 128GB of RAM usage. Same story with having to limit concurrency when compiling it.

I have this problem with NixOS, as one of my build servers doesn't have enough RAM. There doesn't seem to be a way to know in advance whether a compilation is likely to be RAM-heavy, and then either route it to a tagged server with more RAM or use fewer threads on servers with less RAM.
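
Lacking a built-in way to predict this, one rough workaround is to cap the job count from the memory that is actually free before starting the build. The sketch below is only an illustration under stated assumptions: the 8 GB-per-job budget is just the 16 threads / 128GB figure from the comment above divided out, not a measurement, and the helper names `read_available_gb` and `pick_jobs` are hypothetical. It reads `MemAvailable` from `/proc/meminfo`, so it only applies to Linux.

```python
import os


def read_available_gb(meminfo_path="/proc/meminfo"):
    """Return MemAvailable in GiB from a Linux meminfo file, or None if unavailable."""
    try:
        with open(meminfo_path) as f:
            for line in f:
                if line.startswith("MemAvailable:"):
                    kib = int(line.split()[1])  # the kernel reports this value in kiB
                    return kib / (1024 * 1024)
    except OSError:
        return None
    return None


def pick_jobs(available_gb, cores, gb_per_job=8.0):
    """Largest job count that fits the memory budget, clamped to [1, cores].

    gb_per_job is an assumed per-job peak (128GB / 16 threads), not a measured one.
    """
    fit = int(available_gb // gb_per_job)
    return max(1, min(cores, fit))


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    available = read_available_gb()
    # Fall back to a single job when free memory cannot be determined.
    jobs = 1 if available is None else pick_jobs(available, cores)
    print(f"MAX_JOBS={jobs}")
```

The printed value could then be passed to whatever job-count knob the build exposes; for flash-attn, its README describes a `MAX_JOBS` environment variable for this purpose. This does not solve the scheduling side (sending heavy builds to the bigger server), it only keeps a small machine from overcommitting.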