Comment by zozbot234
7 months ago
A somewhat different approach was recently proposed here: https://news.ycombinator.com/item?id=44319427 but it seems to have non-trivial overhead. (Still very much worthwhile, given the potential advantages of deterministic cycle collection.) The paper you reference is quite a bit older so it would of course be interesting to do a proper comparison.
The talk for this paper came up on YouTube just the other day: https://www.youtube.com/watch?v=GwXjydSQjD8
I'll look at that. About performance: people in practice have always favored GC, so I think there's a lot to be discovered in optimization of reference counting algorithms, including concurrent traversal (which is easier because each node has local info in the form of refcounts and flags) and maybe detection of problematic worse-case graphs
Naive ref counting (RC) and tracing GC are very different, but they start looking more and more similar the more you optimize them. Adding cycle collection to RC means adding some tracing. Adding deferred/batched destruction to RC is similar to making a tracing GC incremental. Saturated ref counts (or otherwise avoiding updates) are similar to creating an older generation in a tracing GC. Barriers in a tracing GC (for incremental/generational/concurrent collection) are similar to the ref count updates when mutating RC objects. RC cycle collection time is heavily determined by how much of the graph is traced through from "suspected" roots, so it can be optimized by tracing known-live stuff and removing it from consideration.
But some significant performance-relevant differences remain. RC's cycle collection tends to take time proportional to the amount of dead stuff. Tracing GC tends to take time proportional to the amount of live stuff. (Both use optimizations that weaken the connection, but they still show their origins.)