Comment by ahartmetz
2 days ago
Seems very likely that the hardware decompresses the data more or less on the fly. The acceleration structures are for the hardware, arithmetic hardware is cheap (compared to memory access), and if hardware support weren't necessary, they could use the compressed structures on older hardware with new drivers.
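To make the "arithmetic is cheap compared to memory access" point concrete, here is a minimal sketch of one way a child bounding box could be stored compactly and expanded on the fly. Everything in it is an assumption for illustration: the 8-bit quantization relative to a parent origin, the power-of-two scale exponents, and the names `pack_child_box` and `unpack_child_box` are hypothetical and do not describe any vendor's actual format.

```python
import math
import struct

def pack_child_box(lo, hi, origin, exp):
    """Quantize one child AABB to 8 bits per coordinate (hypothetical layout).

    Coordinates are stored relative to the parent's origin, in units of 2**exp.
    """
    scale = [2.0 ** e for e in exp]
    # Round lower bounds down and upper bounds up so the decoded box
    # always encloses the original one (conservative quantization).
    qlo = [max(0, min(255, math.floor((lo[i] - origin[i]) / scale[i]))) for i in range(3)]
    qhi = [max(0, min(255, math.ceil((hi[i] - origin[i]) / scale[i]))) for i in range(3)]
    return struct.pack("6B", *qlo, *qhi)  # 6 bytes instead of 24 for six float32s

def unpack_child_box(packed, origin, exp):
    """Decode the 6-byte box: one multiply-add per coordinate."""
    q = struct.unpack("6B", packed)
    scale = [2.0 ** e for e in exp]
    lo = [origin[i] + q[i] * scale[i] for i in range(3)]
    hi = [origin[i] + q[3 + i] * scale[i] for i in range(3)]
    return lo, hi
```

The decode step is a handful of multiply-adds per box, which is the kind of work that is plausibly cheap to do in dedicated circuitry compared with fetching four times as many bytes.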
Right, the point of raytracing extensions is that there can definitely be wins thanks to specialized circuitry.
What I do wonder about: as you mention, older chips could probably use the more optimized structures in software (after all, my naive-ish raytracer is written entirely in OpenGL and could be modified to use these structures instead). With memory being the big pain point, which hardware optimizations/specializations matter most for getting big gains compared to what can be done in "microcode"? Circuitry for triangle intersections and bit-unpacking, presumably, but considering stack management there are probably other parts left to microcode.
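To show where the pieces mentioned here sit relative to each other, this is a rough sketch of a stack-based BVH traversal loop. It is not how any particular GPU splits the work; the node layout `(lo, hi, children, prim)` and the function names are made up for illustration, and the ray direction is assumed to have no zero components. The box test is the sort of fixed, regular arithmetic that suits dedicated circuitry, while the loop and the stack are the control-flow part.

```python
def ray_hits_box(orig, inv_dir, lo, hi):
    """Slab test of a ray against an axis-aligned box."""
    t_near, t_far = 0.0, float("inf")
    for i in range(3):
        t0 = (lo[i] - orig[i]) * inv_dir[i]
        t1 = (hi[i] - orig[i]) * inv_dir[i]
        if t0 > t1:
            t0, t1 = t1, t0
        t_near = max(t_near, t0)
        t_far = min(t_far, t1)
    return t_near <= t_far

def traverse(nodes, orig, direction):
    """Return the primitives whose leaf boxes the ray enters.

    nodes[i] is a hypothetical tuple (lo, hi, children, prim); leaves have
    prim set and no children. A real tracer would run an exact triangle
    test on each candidate returned here.
    """
    inv_dir = [1.0 / d for d in direction]  # assumes no zero components
    stack = [0]  # indices of nodes still to visit, starting at the root
    candidates = []
    while stack:
        lo, hi, children, prim = nodes[stack.pop()]
        if not ray_hits_box(orig, inv_dir, lo, hi):
            continue  # whole subtree is skipped
        if prim is not None:
            candidates.append(prim)
        else:
            stack.extend(children)
    return candidates
```

In this form the box test touches only the node data and the ray, while the stack push/pop is the part that depends on traversal order and per-ray state.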