Only when used in a naïve way, which Rust does not. Increments happen only when "clone" is called and decrements only on scope exit, and thanks to Rust's ownership/borrow checking both are rare, combining the best of both worlds (but yes, implementations that aggressively increment/decrement in loops and on every function call can be very slow). Rust also separates Arc (atomic refcounts) from Rc (non-atomic refcounts) and enforces the usage scenarios in the type checker, giving you cheap Rc in single-threaded code. Reference counting done in a smart way works pretty well, but you obviously have to be a little careful about cycles (which in my experience are pretty rare and fairly obvious when you have such a data type).
The increment only occurs on an explicit call to .clone(), with the matching decrement when that clone is dropped. No .clone(), no increment/decrement.
You won't see many clones in Rust code.
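A minimal sketch of the behavior described above, using the standard library's Rc and Arc (strong_count is the real accessor; the vector and string contents are just placeholders):

    use std::rc::Rc;
    use std::sync::Arc;

    fn main() {
        // Rc: non-atomic counts; the type checker keeps it off other
        // threads (Rc is !Send), which is why the count can stay cheap.
        let a = Rc::new(vec![1, 2, 3]);
        assert_eq!(Rc::strong_count(&a), 1);

        let b = Rc::clone(&a); // the only place the count goes up
        assert_eq!(Rc::strong_count(&a), 2);
        drop(b);               // ...and the only place it goes down
        assert_eq!(Rc::strong_count(&a), 1);

        // Borrowing never touches the count at all.
        fn sum(v: &Rc<Vec<i32>>) -> i32 { v.iter().sum() }
        assert_eq!(sum(&a), 6);

        // Arc: same API, atomic counts, so it may cross threads.
        let c = Arc::new(String::from("shared"));
        let d = Arc::clone(&c);
        std::thread::spawn(move || println!("{d}")).join().unwrap();
        assert_eq!(Arc::strong_count(&c), 1);
    }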
It's how often reference counts are adjusted on hot paths that matters (including in libraries), and back to the original point, reference counting doesn't let you free groups of objects in one go (unlike a tracing GC).
Also it'd be nice if the reference counts were stored separately from the objects. Storing them alongside the object being tracked is a classic mistake made by reference counting implementations: it spreads the writes over a large number of cache lines (see the layout sketch after this comment). I was actually surprised that Rust doesn't get this right.
Another issue with manual memory management is that you can't compact the heap.
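For concreteness, a simplified sketch of the two layouts being contrasted above; RcBoxLike approximates what std's Rc actually does, while the side table is purely hypothetical:

    use std::cell::Cell;
    use std::collections::HashMap;

    // Roughly std's Rc layout: the counts share an allocation (and
    // often a cache line) with the value, so every clone/drop dirties
    // the object's own memory.
    struct RcBoxLike<T> {
        strong: Cell<usize>,
        weak: Cell<usize>,
        value: T,
    }

    // The layout argued for above: all counts packed together in a
    // side table, keeping count updates in a small, hot region.
    // Hypothetical; nothing in std works this way.
    struct SideTable {
        counts: HashMap<usize, usize>, // object address -> strong count
    }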
The amount of reference-counted pointers in most Rust code is a tiny fraction compared to boxes or compiler-tracked-lifetime references.
Yes, in theory it would be more efficient to store all the reference counts together, but that's in theory. In practice, most Rust apps won't call clone on a shared pointer on a hot path, and if they do it's usually one such pointer and they do something with the data as well (so it's all one cache line anyway).
You can't compare Rust/C++ with Swift/Nim when it comes to RC; there just aren't enough reference count operations for it to matter much (unless you're in a shitty OO C++ codebase like mine that pretends it's Java with std::shared_ptr everywhere).
Apps where heap compaction would be relevant in a low-level language like Rust or C++ will typically use a bump allocator which will trounce any kind of GC.
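As a sketch of that last point (this uses the bumpalo crate; the Node type is just a placeholder): a bump allocator hands out memory by advancing a pointer and frees the whole group at once when the arena is dropped.

    use bumpalo::Bump;

    struct Node<'a> {
        value: u64,
        next: Option<&'a Node<'a>>,
    }

    fn main() {
        let arena = Bump::new();

        // Each allocation is just a pointer bump: no free lists,
        // no per-object bookkeeping, no reference counts.
        let mut head: Option<&Node> = None;
        for i in 0..1_000_000u64 {
            head = Some(arena.alloc(Node { value: i, next: head }));
        }
        println!("front: {}", head.unwrap().value);
    } // `arena` drops here: the entire list is freed in one go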
Tracing is the worst in terms of performance
Anyone claiming something like this obviously hasn't dug into GCs. Do you honestly think that writing to memory on every access, especially atomically, is anywhere near the performance of a GC that can do most of its work in parallel and just flip a bit to, in effect, delete everything that's no longer reachable?
A bit flip is not writing?
Also, doesn't tracing have to work atomically? The program needs to stop; you can't have it check roots as it runs.
I'll admit I'm no GC researcher with PhD experience, but your comment makes it seem like you aren't either.
That depends. Deallocating a zillion little objects one at a time can be slower than freeing them all in a batch.
Not really; here it is winning hands down over Swift's ARC implementation:
https://github.com/ixy-languages/ixy-languages
Wasn't this the comparison that decided to use reference types for everything for no real reason?
This isn't the kind of program Swift was designed to perform well for.
Nor is wall-clock speed even what the system should be optimizing for, since you buy phones to run apps, not to run the system. You should be measuring how well it gets out of the way of the important work.