← Back to context

Comment by auto

18 days ago

I'm in the depths of optimization on a game right now, and it's interesting how the gains I'm making currently all seem to be a matter of scaling the concept of lookup tables, and using the right tool for the job.

What I mean is that traditionally I think peoples' ideas of lookup tables are things like statically baked arrays setup at compile time, or even first thing at runtime, and they never change. But if you loosen your adherence to that last idea a bit, where a lookup table can change slightly over time, you can get a ton of mileage out of a comparatively small amount of memory compared to wasting cycles every frame.

As for the right tool for the job, I've read tons of dev logs and research papers over the years about moving work to the GPU, but this last few months stint of ripping my game inside out has really made me see the light. It's not just lookup tables built at compile or early runtime, but lookup tables modified slightly over time, and sent to the GPU as textures and used there.

Follow this train of thought long enough, and now we're just calling memory writes and reads "lookup tables" when they aren't really that anymore, but whos to say where the barrier really lies?

To some extent it is required that all serious work by a computer be the kind of repititious thing that can be at least partially addressed by a lookup table.

If you take the example of a game, drawing sprites say. Drawing a single preloaded sprite of reasonable size is always cheap, so a slow frame must have an excessive number. It's very hard to construct a practical scenario of a large number of truly distinct sprites though. A level has a finite tile palette, a finite cast of characters, abilities, etc. It's hard to logistically get them all into a scene together, and even then it won't be that many. So the only scenario left where sprite drawing will be slow is drawing the same handful of sprites over and over again. By contrast that's super common: just spam a persistent projectile, tap the analog stick to generate dust particles, etc.

Without getting too much into detail (because the people I worked for were really paranoid, and I don't want to give them agita), we used to build lookup tables "on the fly," sometimes, deep inside iterators.

For example, each block of pixels might have some calculated characteristics that were accessed by a hash into a LUT, but the characteristics would change, as we went through the image.

We'd do a "triage" run, where we'd build the LUT, then a "detailed" run, where we'd apply the LUT to the pixels.

It could get pretty hairy.