Comment by _bin_
4 months ago
I saw a good talk, though I don't remember the name, that went over the array-index approach. It correctly pointed out that by then, you're basically recreating your own pointers without any of the guarantees rust, or even C++ smart pointers, provide.
> It correctly pointed out that by then, you're basically recreating your own pointers without any of the guarantees rust, or even C++ smart pointers, provide.
I've gone back and forth on this, myself.
I wrote a custom b-tree implementation in rust for a project I've been working on. I use my own implementation because I need it to be an order-statistic tree, and I need internal run length encoding. The original version of my b-tree works just like how you'd implement it in C. Each internal node / leaf is a raw allocations on the heap.
Because leaves need to point back up the tree, there's unsafe everywhere, and a lot of raw pointers. I ended up with separate Cursor and CursorMut structs which held different kinds of references to the tree itself. Trying to avoid duplicating code for those two cursor types added a lot of complex types and trait magic. The implementation works, and its fast. But its horrible to work with, and it never passed MIRI's strict checks. Also, rust has really bad syntax for interacting with raw pointers.
Recently I rewrote the b-tree to simply use a vec of internal nodes, and a vec of leaves. References became array indexes (integers). The resulting code is completely safe rust. Its significantly simpler to read and work with - there's way less abstraction going on. I think its about 40% less code. Benchmarks show its about 25% faster than the raw pointer version. (I don't know why - but I suspect the reason is due to better cache locality.)
I think this is indeed peak rust.
It doesn't feel like it, but using an array-index style still preserves many of rust's memory safety guarantees because all array lookups are bounds checked. What it doesn't protect you from is use-after-free bugs.
Interestingly, I think this style would also be significantly more performant in GC languages like javascript and C#, because a single array-of-objects is much simpler for the garbage collector to keep track of than a graph of nodes & leaves which all reference one another. Food for thought!
>Benchmarks show its about 25% faster than the raw pointer version. (I don't know why - but I suspect the reason is due to better cache locality.)
Cache locality matters, but so does having less allocator pressure. Use 32-bit unsigned ints as indices, and you get improvements on that as well.
>The original version of my b-tree works just like how you'd implement it in C. Each internal node / leaf is a raw allocations on the heap.
I'd always try to avoid that type of allocation pattern in C++, FWIW :-).
> Recently I rewrote the b-tree to simply use a vec of internal nodes
Doesn't this also require you to correctly and efficiently implement (equivalents of C's) malloc() and free()? IIUC your requirements are more constrained, in that malloc() will only ever be called with a single block size, meaning you could just maintain a stack of free indices -- though if tree nodes are comparable in size to integers this increases memory usage by a significant fraction.
(I just checked and Rust has unions, but they require unsafe. So, on pain of unsafe, you could implement a "traditional" freelist-based allocator that stores the index of the next free block in-place inside the node.)
Depends on if you need to allocate/deallocate nodes. If you construct the tree once and don’t modify it thereafter you don’t need to. If you do need to modify and alloc/dealloc nodes you can use a bitmap to track free/occupied slots which is very fast (find first set + bitmanip) and has minuscule overhead even for integer sized elements.
1 reply →
GC languages like C# don't need these tricks, because it is feature rich enough to do C++ style low level programming, and has value types.
Having gone full-in on this approach before, with some good success, it still feels wrong to me today. Contiguous storage may work for reasonable numbers of elements, but it's potentially blocking a huge contiguous chunk of address space especially for large numbers of elements.
I probably say this because I still have to main 32-bit binaries (only 2G of address space), but it can potentially be problematic even on 64-bit machines (typically 256 TB of address space), especially if the data structure should be a reusable container with unknown number of instances. If you don't know a reasonable upper bound of elements beforehand, you have to reallocate later, or drastically over-reserve from the start. The former removes a pointer stability guarantee, the later is uneconomical, it may even be uneconomical on 64-bit depending on how many instances of the data structures you plan to have. And having to reallocate when overflowing the preallocated space makes operations less deterministic with regards to execution time.
> Having gone full-in on this approach before, with some good success, it still feels wrong to me today. Contiguous storage may work for reasonable numbers of elements, but it's potentially blocking a huge contiguous chunk of address space especially for large numbers of elements.
That makes sense. If my btree was gigabytes in size, I might rethink the approach for a number of reasons. But in my case, even for quite large input, the data structure never gets more than a few megabytes in size. Thats small enough that resizing the vec has a negligible performance impact.
It helps that my btree stores its contents using lossless internal run-length encoding. Eg, if I have values like this:
Then I store them like this:
In my use case, this compaction decreases the size of the data structure by about 20x. There's some overhead in joining and splitting values - but its easily worth it.
> What it doesn't protect you from is use-after-free bugs.
Yes. I've found that problem in index-allocated code.
Also, when you do this, you need an allocator for the indexes. I've found bugs in those.
Could std::rc::Weak solve the backreference problem?
Weak is very helpful in preventing ownership loops which prevent deallocation. Weak plus RefCell lets you do back pointers cleanly. You call ".borrow()" to get access to the data protected by a RefCell. The run-time borrow panics if someone else is using the data item. This prevents two mutable pointers to the same data, which Rust requires.
Static analysis could potentially check for those potential panics at compile time. If that was implemented, the run time check, and the potential for a panic, would go away. It's not hard to check, provided that all borrows have limited scope. You just have to determine, conservatively, that no two borrow scopes for the same thing overlap.
If you had that check, it would be possible to have something that behaves like RefCell, but is checked entirely at compile time. Then you know you're free of potential double-borrow panics.
I started a discussion on this on a Rust forum. A problem is that you have to perform that check after template expansion, and the Rust compiler is not set up to do global analysis after template expansion. This idea needs further development.
This check belongs to the same set of checks which prevent deadlocking a mutex against itself. There's been some work on Rust static deadlock analysis, but it's still a research topic.
I didn't consider that. Looking at how weak references work, that might work. It would reduce the need for raw pointers and unsafe code. But in exchange, it would add 16 bytes of overhead to every node in my data structure. That's pure overhead - since the reference count of all nodes should always be exactly 1.
However, I'm not sure what the implications are around mutability. I use a Cursor struct which stores a reference to a specific leaf node in the tree. Cursors can walk forward in the tree (cursor.next_entry()). The tree can also be modified at the cursor location (cursor.insert(item)). Modifying the tree via the cursor also updates some metadata all the way up from the leaf to the root.
If the cursor stored a Rc<Leaf> or Weak<Leaf>, I couldn't mutate the leaf item because rc.get_mut() returns None if there are other strong or weak pointers pointing to the node. (And that will always be the case!). Maybe I could use a Rc<Cell<Leaf>>? But then my pointers down the tree would need the same, and pointers up would be Weak<Cell<Leaf>> I guess? I have a headache just thinking about it.
Using Rc + Weak would mean less unsafe code, worse performance and code thats even harder to read and reason about. I don't have an intuitive sense of what the performance hit would be. And it might not be possible to implement this at all, because of mutability rules.
Switching to an array improved performance, removed all unsafe code and reduced complexity across the board. Cursors got significantly simpler - because they just store an array index. (And inserting becomes cursor.insert(item, &mut tree) - which is simple and easy to reason about.)
I really think the Vec<Node> / Vec<Leaf> approach is the best choice here. If I were writing this again, this is how I'd approach it from the start.
1 reply →
One can also use this array-index approach in C++, utilize the `at` methods and have "memory safety guarantees", no ?
> What it doesn't protect you from is use-after-free bugs.
How about using hash maps/hash tables/dictionaries/however it's called in Rust? You could generate unique IDs for the elements rather than using vector indices.
But Unity game objects are the same way: you allocate them when they spawn into the scene, and you deallocate them when they despawn. Accessing them after you destroyed them throws an exception. This is exactly the same as entity IDs! The GC doesn't buy you much, other than memory safety, which you can get in other ways (e.g. generational indices, like Bevy does).
But in rust you have to fight the borrow checker a lot, and sometimes concede, with complex referential stuff. I say this as someone who writes a good bit of rust and enjoys doing so.
I just don't, and even less often with game logic which tends to be rather simple in terms of the data structures needed. In my experience, the ownership and borrowing rules are in no way an impediment to game development. That doesn't invalidate your experience, of course, but it doesn't match mine.
4 replies →
Given my experience with Bevy this doesn't happen very often, if ever.
The only challenge is not having an ecosystem with ready made everything like you do in "batteries included" frameworks. You are basically building a game engine and a game at the same time.
We need a commercial engine in Rust or a decade of OSS work. But what features will be considered standard in Unreal Engine 2035?
2 replies →
> fight the borrow checker
I see this and I am reminded when I had to fight the 0 indexing, when I was cutting my teeth in C, for class.
I wonder why no one complains about 0 indexing anymore. Isn't it weird how you have to go 0 to length - 1, and implement algorithm differently than in a math book?
30 replies →
You can't do possibly-erroneous pointer math on a C# object reference. You don't need to deal with the game life cycle AND the memory life cycle with a GC. In Unity they free the native memory when a game object calls Destroy() but the C# data is handled by the GC. Same with any plain C# objects.
To say it's the same as using array indices is just not true.
> You can't do possibly-erroneous pointer math on a C# object reference.
Bevy entity IDs are opaque and you have to try really hard to do arithmetic on them. You can technically do math on instance IDs in Unity too; you might say "well, nobody does that", which is my point exactly.
> You don't need to deal with the game life cycle AND the memory life cycle with a GC.
I don't know what this means. The memory for a `GameObject` is freed once you call `Destroy`, which is also how you despawn an object. That's managing the memory lifecycle.
> In Unity they free the native memory when a game object calls Destroy() but the C# data is handled by the GC. Same with any plain C# objects.
Is there a use for storing data on a dead `GameObject`? I've never had any reason to do so. In any case, if you really wanted to do that in Bevy you could always use an `EntityHashMap`.
1 reply →
While we don't need, we can, that is the beauty of languages like C#, that offer the productivity of automatic memory management, and the tools to go low level if desired/needed.
At least in terms of doing math on indices, I have to imagine you could just wrap the type to make indices opaque. The other concerns seem valid though.
Yes but regarding use of uninitialized/freed memory, neither GC nor memory safety really help. Both "only" help with totally incidental and unintentional and small scale violations.
That sounds like Jonathan blow's "rant" on the subject. You can watch it on YouTube https://youtu.be/4t1K66dMhWk
pointers sure are useful