Comment by MindSpunk
15 days ago
What exactly is the difference between these?
cuMemAlloc -> vmaAllocate + VMA_MEMORY_USAGE_GPU_ONLY
cuMemAllocHost -> vmaAllocate + VMA_MEMORY_USAGE_CPU_ONLY
It seems like the functionality is the same, just the memory usage is implicit in cuMemAlloc instead of being typed out? If it's that big of a deal, write a wrapper function and be done with it?
Usage flags never come up in CUDA because everything is just a bag-of-bytes buffer. Vulkan also has to deal with render targets and textures, which historically had to be placed in special memory regions and are still accessed through big blocks of fixed-function hardware that remain very much relevant. And each of the ~6 GPU vendors, across 10+ years of generational iteration, does all of this differently, with different memory architectures and performance cliffs.
It's cumbersome, but it can also be wrapped (e.g. VMA). Who cares whether the "easy mode" comes in vulkan.h or vma.h; someone's got to implement it anyway. At least if it's in vma.h I can fix issues, unlike if we trusted all the vendors to do it right (they won't).
> and are still accessed through big blocks of fixed function hardware that are very much still relevant
But is it relevant for malloc? Everything is put into the same physical device memory, so what difference would the usage flag make? Specialized texture-fetching and caching hardware would come into play anyway once you start fetching texels via samplers.
> It seems like the functionality is the same, just the memory usage is implicit in cuMemAlloc instead of being typed out? If it's that big of a deal, write a wrapper function and be done with it?
The main reason I did not even give VMA a chance is the GitHub example that does in 7 lines what CUDA does in 2. You now say it's not too bad, but that's not reflected in the very first VMA examples.