Comment by jamiejquinn

2 days ago

It's always been possible, but the time cost of moving data between GPU and CPU memory is now too high to ignore. Branching may be slower on the GPU, but it's still faster than moving the data to the CPU for a while and then back again. The maturation of direct GPU-to-GPU transfers over the network has also helped enable GPU-only MPI codes.
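
To put rough shape on that trade-off, here's a back-of-the-envelope sketch. Every number in it is a made-up placeholder rather than a measurement of any real hardware, and the function names are just for illustration. It assumes the round trip costs a fixed latency plus bytes over bandwidth in each direction, and that nothing overlaps with compute.

```python
def round_trip_seconds(n_bytes, link_bytes_per_s, latency_s):
    """Time to copy a buffer GPU -> CPU and back again over the host link.

    Simple model: each direction costs a fixed latency plus bytes / bandwidth.
    """
    one_way = latency_s + n_bytes / link_bytes_per_s
    return 2 * one_way


def stay_on_gpu_is_faster(gpu_step_s, cpu_step_s, n_bytes,
                          link_bytes_per_s, latency_s):
    """Compare running the branchy step in place on the GPU against
    shipping the data to the CPU, running it there, and shipping it back."""
    offload_s = cpu_step_s + round_trip_seconds(n_bytes, link_bytes_per_s, latency_s)
    return gpu_step_s < offload_s


# Hypothetical inputs, chosen only to illustrate the comparison.
n_bytes = 1_000_000_000      # 1 GB working set
link = 10_000_000_000        # 10 GB/s host link (assumed)
latency = 10e-6              # 10 microseconds per transfer (assumed)
cpu_step = 0.05              # branchy step on the CPU, seconds (assumed)
gpu_step = 0.15              # same step on the GPU, 3x slower (assumed)

print(stay_on_gpu_is_faster(gpu_step, cpu_step, n_bytes, link, latency))
```

With these placeholder values the round trip alone (about 0.2 s) already costs more than the slow GPU version of the step, so the 3x branching penalty is still the cheaper option. Plug in your own measurements before drawing any conclusions.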