Comment by dragontamer

1 year ago

> But the biggest problem I'm having is management of buffer space for intermediate objects. That's not relevant to the core of raytracing because you're fundamentally just accumulating an integral, then writing out the answer for a single pixel at the end.

True. Allocation just seems to be a "forced sequential" operation: a "stop the world, figure out what RAM is available" kind of thing.

If you can work with pre-allocated buffers, then GPUs operate by reading from lists (consume operations) and writing out to lists (append operations). That can be done with gather/scatter, or more precisely with stream expansion and stream compaction, in a grossly parallel manner.
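To make the compaction half concrete, here's a minimal CPU-side sketch of the classic prefix-sum compaction pattern (the same idea Thrust's `copy_if` or CUB's `DeviceSelect` implement on a GPU). All names are illustrative; the loops are sequential here, but each one maps to a parallel kernel phase.

```python
from itertools import accumulate

def stream_compact(data, keep):
    """Keep elements where keep(x) is true, into a pre-allocated buffer.

    Phase 1: every "thread" evaluates its predicate (a flag).
    Phase 2: an exclusive prefix sum over the flags gives each survivor
             a unique output slot -- no allocation happens mid-kernel.
    Phase 3: a parallel scatter writes survivors to their slots.
    """
    flags = [1 if keep(x) else 0 for x in data]        # per-element predicate
    # Exclusive prefix sum: slots[i] = number of survivors before i.
    slots = [0] + list(accumulate(flags))[:-1]
    out_len = slots[-1] + flags[-1] if data else 0
    out = [None] * out_len                             # "pre-allocated" output buffer
    for i, x in enumerate(data):                       # scatter (parallel on a GPU)
        if flags[i]:
            out[slots[i]] = x
    return out

# e.g. compacting the still-live rays after a bounce:
rays = [7, -3, 0, 12, -1, 5]
print(stream_compact(rays, lambda r: r > 0))           # [7, 12, 5]
```

The key property is that every write location is known before any write happens, which is why thousands of threads can do this without a "stop the world" allocator.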

---------

If that's not enough "memory management" for you, then yeah, the CPU is the better device to work with. At which point I'd again point back to the 192-core EPYC Zen5c example: we have grossly parallel CPUs today if you need them, just a few clicks away to rent from cloud providers like Amazon or Azure.

GPUs are good at certain things (and I consider them the pinnacle of "Connection Machine"-style programming; today's GPUs are just far more parallel, far easier to program, and far faster than the old 1980s stuff).

Some problems cannot be split up (ex: web requests are so unique I can't imagine they'd ever be programmed on a GPU, due to their divergence). However, CPUs still exist for that.