Comment by spease

8 hours ago

There was a library for Rust called “faster” which worked similarly to Rayon, but for SIMD.

The simple-minded way to do what you’re saying would be to have the compiler generate both a PTX version and a native version of each Rayon-style parallel construct, and then choose which one to invoke at runtime.
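
To make that concrete, here is a rough sketch of the runtime-dispatch half in plain Rust. Everything in it is hypothetical: `Backend`, `detect_backend`, `scale`, and the `USE_GPU_SKETCH` environment variable are names I made up for illustration, and the "GPU" path is only a stub that runs on the CPU, since actually generating and launching PTX is the part the compiler or a GPU crate would have to supply.

```rust
/// Which compiled version of the operation to run.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Backend {
    Native,
    Gpu,
}

/// Hypothetical probe. A real one would ask the GPU driver whether a
/// usable device exists; this one reads an environment variable so the
/// sketch runs anywhere.
pub fn detect_backend() -> Backend {
    match std::env::var("USE_GPU_SKETCH") {
        Ok(v) if v == "1" => Backend::Gpu,
        _ => Backend::Native,
    }
}

/// Native version: ordinary host code.
fn scale_native(data: &mut [f32], factor: f32) {
    for x in data.iter_mut() {
        *x *= factor;
    }
}

/// Stand-in for the PTX version. A real implementation would copy `data`
/// to the device, launch the kernel, and copy the result back. This stub
/// just reuses the host arithmetic so both paths give the same answer.
fn scale_gpu_stub(data: &mut [f32], factor: f32) {
    scale_native(data, factor);
}

/// Single entry point: the caller writes the operation once and the
/// backend is picked at runtime.
pub fn scale(data: &mut [f32], factor: f32, backend: Backend) {
    match backend {
        Backend::Native => scale_native(data, factor),
        Backend::Gpu => scale_gpu_stub(data, factor),
    }
}
```

You would call it as `scale(&mut values, 2.0, detect_backend())`. The point is only that the call site stays the same no matter which version ends up running.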