Comment by anticrymactic
4 months ago
This could be different in game dev, but in my last few years of writing Rust (outside of learning the language) I have very rarely needed to index any collection.
There is a very particular way Rust is supposed to be used, which is a negative on its own, but it leads to a fulfilling and productive programming experience. (My opinion.) If you need to regularly index something, then you're using the language wrong.
I'm no game dev but I have had friends who do it professionally.
Long story short, yes, it's very different in game dev. It's very common to pre-allocate space for all your working data as large statically sized arrays because dynamic allocation is bad for performance. Oftentimes the data gets organized in parallel arrays (https://en.wikipedia.org/wiki/Parallel_array) instead of in collections of structs. This can save a lot of memory (because the data gets packed more densely), be more cache-friendly, and make it much easier to make efficient use of SIMD instructions.
This is also fairly common in scientific computing (which is more my wheelhouse), and for the same reason: it's good for performance.
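To make the layout difference concrete, here's a minimal Rust sketch of array-of-structs vs. parallel arrays (struct-of-arrays); the type and field names are purely illustrative, not from any particular engine:

```rust
// Array-of-structs: one Vec of complete objects. Convenient, but a loop
// that only touches positions still drags velocities and health through
// the cache with it. Shown only for contrast.
#[allow(dead_code)]
struct Particle {
    position: [f32; 3],
    velocity: [f32; 3],
    health: f32,
}

#[allow(dead_code)]
struct ParticlesAos {
    items: Vec<Particle>,
}

// Parallel arrays ("struct-of-arrays"): one Vec per field. A per-frame
// position update streams through two densely packed arrays, which is
// cache-friendly and straightforward for the compiler to vectorize.
struct ParticlesSoa {
    positions: Vec<[f32; 3]>,
    velocities: Vec<[f32; 3]>,
    healths: Vec<f32>,
}

impl ParticlesSoa {
    fn integrate(&mut self, dt: f32) {
        for (p, v) in self.positions.iter_mut().zip(&self.velocities) {
            p[0] += v[0] * dt;
            p[1] += v[1] * dt;
            p[2] += v[2] * dt;
        }
    }
}

fn main() {
    // Pre-allocate everything up front, as the comment above describes.
    let n = 1024;
    let mut world = ParticlesSoa {
        positions: vec![[0.0; 3]; n],
        velocities: vec![[1.0, 0.0, 0.0]; n],
        healths: vec![100.0; n],
    };
    world.integrate(1.0 / 60.0);
    assert_eq!(world.positions[0][0], 1.0 / 60.0);
}
```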
> Oftentimes the data gets organized in parallel arrays (https://en.wikipedia.org/wiki/Parallel_array) instead of in collections of structs. This can save a lot of memory (because the data gets packed more densely), be more cache-friendly, and make it much easier to make efficient use of SIMD instructions.
That seems like something that could very easily be turned into a compiler optimisation and enabled with something like an annotation. It would have some issues when calling across library boundaries (a lot like the handling of gradual types), but within a single codebase that'd be easy.
The underlying issue with game engine coding is that the problem is shaped in this way:
* Everything should be random access (because you want to have novel rulesets and interactions)
* It should also be fast to iterate over per-frame (since it's real-time)
* It should have some degree of late-binding so that you can reuse behaviors and assets and plug them together in various ways
* There are no ideal data structures to fulfill all of this across all types of scene, so you start hacking away at something good enough with what you have
* Pretty soon you have some notion of queries and optional caching and memory layouts to make specific iterations easier. Also it all changes when the hardware does.
* Congratulations, you are now the maintainer of a bespoke database engine
You can succeed at automating parts of it, but note that the parent said "oftentimes", not "always". It's a treadmill of whack-a-mole engineering, just like every other optimizing compiler; the problem never fully generalizes into a right answer for all scenarios. And realistically, gamedevs probably haven't come close to maxing out what is possible in a systems-level sense since the '90s. Instead we have a few key algorithms that go really fast and then a muddle of glue for the rest of it.
It's not at all easy to implement as an optimisation, because it changes a lot of semantics, especially around references and pointers. It is something that you can, e.g., implement using Rust procedural macros, but it's far from transparent to switch between the two representations.
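A rough illustration of why the switch isn't transparent (the types here are made up for the example): with array-of-structs you can hand out a reference to a whole element, but with parallel arrays no such element exists in memory, so every signature that borrowed one has to change.

```rust
struct Particle { position: [f32; 3], velocity: [f32; 3] }

struct Aos { items: Vec<Particle> }
struct Soa { positions: Vec<[f32; 3]>, velocities: Vec<[f32; 3]> }

// With AoS, callers can borrow one complete particle...
fn nth_aos(world: &mut Aos, i: usize) -> &mut Particle {
    &mut world.items[i]
}

// ...but with SoA there is no `Particle` in memory to point at, so the
// best you can return is a bundle of per-field references. Every caller
// and every `&Particle` in the codebase has to be rewritten, which is
// why a macro can generate the layout but can't make the change invisible.
fn nth_soa(world: &mut Soa, i: usize) -> (&mut [f32; 3], &mut [f32; 3]) {
    (&mut world.positions[i], &mut world.velocities[i])
}

fn main() {
    let mut aos = Aos { items: vec![Particle { position: [0.0; 3], velocity: [1.0; 3] }] };
    nth_aos(&mut aos, 0).position[0] += 1.0;

    let mut soa = Soa { positions: vec![[0.0; 3]], velocities: vec![[1.0; 3]] };
    let (p, v) = nth_soa(&mut soa, 0);
    p[0] += v[0];
}
```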
(It's also not always a win: it can work really well if you primarily operate on the 'columns', and on each column more or less once per update loop, but otherwise you can run into memory bandwidth limitations. For example, games with a lot of heavily interacting systems and an entity list that doesn't fit in cache will probably be better off with trying to load and update each entity exactly once per loop. Factorio is a good example of a game which is limited by this, though it is a bit of an outlier in terms of simulation size.)
Meh. I've tried "SIMD magic wand" tools before, and found them to be a case of verschlimmbessern: making things worse by trying to improve them.
At least on the scientific computing side of things, having the way the code says the data is organized match the way the data is actually organized ends up being a lot easier in the long run than organizing it in a way that gives frontend developers warm fuzzies and then doing constant mental gymnastics to keep track of what the program is actually doing under the hood.
I think it's probably like sock knitting. People who do a lot of sock knitting tend to use double-pointed needles. They take some getting used to and look intimidating, though. So people who are just learning to knit socks tend to jump through all sorts of hoops and use clever tricks, like magic loop, to keep using the kind of needles they're already used to. From there it can go two ways: either they get frustrated, decide sock knitting is not for them, and go back to knitting other things; or they get frustrated, decide magic loop is not for them, and learn how to use double-pointed needles.
I'm not a game dev, but what's a straightforward way of adjusting some channel of a pixel at coordinate X,Y without indexing the underlying raster array? Iterators are fine when you want to perform some operation on every item in a collection but that is far from the only thing you ever might want to do with a collection.
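For what it's worth, the straightforward CPU-side version really is just index arithmetic; this sketch assumes a row-major RGBA8 buffer, and the exact layout of course varies by image library:

```rust
/// Set one channel of the pixel at (x, y) in a row-major RGBA8 buffer.
/// `channel` is 0..4 for R, G, B, A.
fn set_channel(pixels: &mut [u8], width: usize, x: usize, y: usize, channel: usize, value: u8) {
    let idx = (y * width + x) * 4 + channel;
    pixels[idx] = value;
}

fn main() {
    let (width, height) = (4, 4);
    let mut pixels = vec![0u8; width * height * 4];
    set_channel(&mut pixels, width, 2, 1, 0, 255); // make pixel (2, 1) fully red
    assert_eq!(pixels[(1 * width + 2) * 4], 255);
}
```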
Game dev here. If you're concerned about performance, the only answer to this is a pixel shader, as anything else involves either CPU-based rendering or a texture copy back and forth.
A compute shader could update some subset of pixels in a texture. It's on the programmer to prevent race conditions though. However that would again involve explicit indexing.
In general I think GP is correct. There is some subset of problems that absolutely requires indexing to express efficiently.
On the contrary, I find indices to be the most natural way to represent anything that resembles a graph in Rust. They allow you to sidestep the usual issues that arise with ownership and borrowing, particularly with mutability, by handing ownership to the collection and using indices to allow nodes to refer to one another. It's delightfully simple compared to the mess of Arc and RefCell that tends to result when one tries to apply patterns from languages that leave "shared XOR mutable" as the programmer's responsibility. That's not to say that Vec and usize are appropriate for the task, but Rust's type system can be used to do a lot better.
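A minimal sketch of that pattern (the `NodeId` newtype and the method names are mine, not from any particular crate):

```rust
/// Typed index into `Graph::nodes`, so a node id can't be confused with
/// any other usize floating around the program.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct NodeId(usize);

struct Node {
    value: i32,
    neighbors: Vec<NodeId>,
}

struct Graph {
    nodes: Vec<Node>,
}

impl Graph {
    fn new() -> Self {
        Graph { nodes: Vec::new() }
    }

    fn add_node(&mut self, value: i32) -> NodeId {
        let id = NodeId(self.nodes.len());
        self.nodes.push(Node { value, neighbors: Vec::new() });
        id
    }

    // The graph owns every node, so "shared" cross-references are just
    // copies of NodeId; no Rc/RefCell is needed to mutate through them.
    fn add_edge(&mut self, from: NodeId, to: NodeId) {
        self.nodes[from.0].neighbors.push(to);
    }

    fn value_mut(&mut self, id: NodeId) -> &mut i32 {
        &mut self.nodes[id.0].value
    }
}

fn main() {
    let mut g = Graph::new();
    let a = g.add_node(1);
    let b = g.add_node(2);
    g.add_edge(a, b);
    g.add_edge(b, a); // cycles are fine; they're just ids, not borrows
    *g.value_mut(a) += 10;
    assert_eq!(g.nodes[a.0].value, 11);
}
```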
This is getting downvoted but it's kind of true. Indexing collections all the time usually means you're not using iterators enough. (Although iterators become very annoying for fallible code that you want to return a Result, so sometimes it's cleaner not to use them.)
However, this problem does still come up in iterator contexts. For example, Iterator::take takes a usize.
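For the fallible case, both styles work; here's a hedged example with made-up inputs showing why the plain loop sometimes reads better than threading Results through a closure:

```rust
fn parse_all(inputs: &[&str]) -> Result<Vec<i32>, std::num::ParseIntError> {
    // Iterator version: collect() stops at the first Err and returns it.
    inputs.iter().map(|s| s.parse::<i32>()).collect()
}

fn parse_all_loop(inputs: &[&str]) -> Result<Vec<i32>, std::num::ParseIntError> {
    // Plain loop: `?` reads more naturally once the per-element body
    // grows beyond a one-liner.
    let mut out = Vec::with_capacity(inputs.len());
    for s in inputs {
        out.push(s.parse::<i32>()?);
    }
    Ok(out)
}

fn main() {
    assert_eq!(parse_all(&["1", "2"]), Ok(vec![1, 2]));
    assert!(parse_all_loop(&["1", "oops"]).is_err());
}
```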
An iterator works if you're sequentially visiting every item in the collection, in the order they're stored. It's terrible if you need random access, though.
Concrete example: pulling a single item out of a zip file, which supports random access, is O(1). Pulling a single item out of a *.tar.gz file, which can only be accessed by iterating it, is O(N).
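A sketch of that contrast, assuming the commonly used zip, tar, and flate2 crates are in Cargo.toml; the crate choice and error handling are my own, not something the commenter specified:

```rust
use std::error::Error;
use std::fs::File;
use std::io::Read;

// Zip: the central directory at the end of the archive lets us seek
// straight to one member, so pulling a single file out is O(1) in the
// number of members.
fn read_from_zip(path: &str, member: &str) -> Result<Vec<u8>, Box<dyn Error>> {
    let mut archive = zip::ZipArchive::new(File::open(path)?)?;
    let mut file = archive.by_name(member)?;
    let mut buf = Vec::new();
    file.read_to_end(&mut buf)?;
    Ok(buf)
}

// .tar.gz: the gzip stream has to be decompressed and the tar entries
// walked in order until the wanted one shows up, so the same operation
// is O(N) in the archive size.
fn read_from_tar_gz(path: &str, member: &str) -> Result<Vec<u8>, Box<dyn Error>> {
    let gz = flate2::read::GzDecoder::new(File::open(path)?);
    let mut archive = tar::Archive::new(gz);
    for entry in archive.entries()? {
        let mut entry = entry?;
        if entry.path()?.to_string_lossy() == member {
            let mut buf = Vec::new();
            entry.read_to_end(&mut buf)?;
            return Ok(buf);
        }
    }
    Err("member not found".into())
}
```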
History lesson for the cheap seats in the back:
Compressed tars are terrible for random access because the compression occurs after the concatenation and so knows nothing about inner file metadata, but they're good for streaming and backups. Uncompressed tars are much better for random access. (Tar was used as a backup mechanism for tape; the name stands for "tape archive".)
Zips are terrible for streaming because their metadata is stored at the end, but are better for 1-pass creation and on-disk random access. (Remember that zip files and programs were created in an era of multiple floppy disk-based backups.)
When fast tar enumeration is desired, at the cost of compatibility and compression potential, it might be worth compressing the files first and then tarring them, if zipping alone isn't achieving enough compression and/or decompression performance. FUSE mounting of compressed tars gets to be really expensive with terabyte archives.
In C++, random access iterators are a thing. Indeed, raw pointers satisfy the requirements of a random access iterator concept. Is that not the case in Rust?
While you maybe "shouldn't" be indexing collections often (which I also don't agree with; there is a reason we have more collections than linked lists, and lookup is important), even just getting the size of a collection, which is often closely tied to business logic, can be quite annoying.
For data that needs to be looked up, mostly I want a hashtable. Not always, but mostly. It's rare that I want to look something up by its position in a list.
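A tiny illustration of the distinction (keys and values are arbitrary):

```rust
use std::collections::HashMap;

fn main() {
    // Lookup by a meaningful key: what I usually want.
    let mut by_name: HashMap<&str, u32> = HashMap::new();
    by_name.insert("ammo", 30);
    by_name.insert("health", 100);
    assert_eq!(by_name.get("health"), Some(&100));

    // Lookup by position: occasionally useful, but the index means
    // nothing outside the accident of insertion order.
    let values = vec![30, 100];
    assert_eq!(values[1], 100);
}
```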