Comment by pjmlp

13 years ago

> ...WinRT has evolved into something very efficient (e.g. by using ref counting rather than GC).

Ah, you mean by causing cache stalls when doing increments/decrements after each assignment, and by allowing cyclic references?

If the ref counting is manual or optimized, it doesn't happen after every assignment, only after ones which change object ownership. If you're really nuts about avoiding cache misses, you can pack most ref counts into the low bits of the class pointer, since allocations tend to be 16-byte aligned, including allocations for class objects. Assuming you use the object at least once following a change of ownership, the overall cost in cache misses becomes 0. Or you can store the ref counts in a global table with better cache properties (IIRC this is what ObjC does). Or you can rely on the fact that cache lines are typically large enough to grab a few words at a time, so fetching the class pointer will automatically fetch the refcount (again with the assumption that you use objects at least once per assignment). Honestly, I'm having difficulty imagining how cache behavior could become a problem unless you wanted it to.
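
A rough sketch of the low-bits packing trick described above, using plain Python integers to stand in for machine words. The function names and the 16-byte alignment constant are assumptions for illustration; a real runtime would do this on raw pointers and would need a fallback (for example a side table) once a count no longer fits in the spare bits.

```python
ALIGNMENT = 16              # assumed allocation alignment, as in the comment above
COUNT_MASK = ALIGNMENT - 1  # the low 4 bits of an aligned address are always zero


def pack(class_addr, count):
    """Combine an aligned class address and a small ref count into one word."""
    if class_addr % ALIGNMENT != 0:
        raise ValueError("class address must be 16-byte aligned")
    if not 0 <= count <= COUNT_MASK:
        raise ValueError("count does not fit in the spare low bits")
    return class_addr | count


def class_pointer(word):
    """Recover the class address by clearing the low bits."""
    return word & ~COUNT_MASK


def refcount(word):
    """Recover the inline ref count from the low bits."""
    return word & COUNT_MASK


def retain(word):
    """Increment the inline count; larger counts would need a fallback."""
    if refcount(word) == COUNT_MASK:
        raise OverflowError("inline count is full (15); a side table would be needed")
    return word + 1


def release(word):
    """Decrement the inline count."""
    if refcount(word) == 0:
        raise ValueError("release on an object with no inline references")
    return word - 1
```

Because the count lives in the same word as the class pointer, any code that reads the class pointer has already pulled the count into cache, which is the point the comment is making.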

As for not allowing cycles, I consider that a feature. The headache of memory management doesn't go away with GC. You still have to avoid inadvertently keeping references to objects through the undo stack & so on. Unintentional strong refs creep through GC code just as easily as memory leaks creep through code with manual memory management. Almost universally, I find that GC'd projects large enough to have memory management best practices implicitly do away with this freedom at the first opportunity by calling for acyclic or single-parent object ownership graphs. These restrictions primarily make it easier to think about object lifecycle -- the fact that they allow refcounting to suffice for memory management is icing on the cake.
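
As an illustration of the single-parent ownership graphs mentioned above, here is a minimal sketch in which parents hold strong references to their children and children only hold a weak back-reference. The `Node` class is hypothetical, and the immediate reclamation shown relies on CPython's reference counting.

```python
import weakref


class Node:
    """Tree node: the parent owns its children, a child only observes its parent."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []  # strong references: ownership flows downward
        # Weak back-reference, so parent <-> child never forms a strong cycle.
        self._parent = weakref.ref(parent) if parent is not None else None
        if parent is not None:
            parent.children.append(self)

    @property
    def parent(self):
        """Return the parent, or None if there is none or it has been freed."""
        return self._parent() if self._parent is not None else None
```

With this shape, dropping the last strong reference to a parent frees it right away, without needing a cycle collector, and the child's `parent` property simply starts returning `None`.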

  • > If the ref counting is manual or optimized, it doesn't happen after every assignment, only ones which change object ownership.

    The Rust compiler does this. Even so, 19% of the binary size in rustc is adjusting reference counts.

    I am not exaggerating this. One-fifth of the code in the binary is sitting there wasted adjusting reference counts. This is much of the reason we're moving to tracing garbage collection.

    All of those optimizations that you're talking about (for example, packing reference counts into the class pointer) will make adjusting reference counts take even more code size.

    > As for not allowing cycles, I consider that a feature. The headache of memory management doesn't go away with GC.

    The fact that memory management is still an issue doesn't mean that we shouldn't make it work as well as we can.

    In large software projects, cycles are a persistent issue. Gecko had to add a backup cycle collector over its reference counting framework, and observed memory usage decreased significantly once it was implemented.

    • That's the evolution Python went through as well: reference counting, then (in CPython 2.0) an adjunct cycle-collector, then (in PyPy) a non-reference-counting garbage collector.

      Hopefully in future versions of PyPy an even better garbage collector. :-)

  • Very nice explanation, except you are forgetting WinRT is actually an extension of COM.

    This means each increment/decrement is a virtual method call with the corresponding invocation costs.

    This is why the .NET runtime caches the WinRT objects, and even re-uses them, instead of mapping them 1:1 the way C++/CX does.
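
The backup cycle collector mentioned in the replies above can be observed directly in CPython. This is a small sketch, not a benchmark: the `Node` class and `demo` function are made up for illustration, and the behavior shown assumes CPython, where reference counting is the primary mechanism and the `gc` module handles cycles.

```python
import gc
import weakref


class Node:
    """Minimal object that can point at another Node."""

    def __init__(self):
        self.other = None


def make_cycle():
    """Build two nodes that reference each other; return a weak probe to one."""
    a, b = Node(), Node()
    a.other, b.other = b, a
    return weakref.ref(a)


def demo():
    """Return (alive after refcounting only, alive after cycle collection)."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic cycle collector out of the way
    try:
        probe = make_cycle()
        # The local names are gone, but the cycle keeps both counts above zero.
        alive_after_refcounting = probe() is not None
        gc.collect()  # run the cycle collector explicitly
        alive_after_collection = probe() is not None
        return alive_after_refcounting, alive_after_collection
    finally:
        if was_enabled:
            gc.enable()
```

Calling `demo()` shows the pair surviving plain reference counting and being reclaimed only once the cycle collector runs.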

The caching misbehavior of reference counting has been greatly exaggerated, especially in the context of UI, where responsiveness is much more important than raw CPU speed. Also, the ref counting tradeoff seems to work better for devices (e.g. all the cool kids [1] are doing it).

[1] https://developer.apple.com/library/mac/documentation/Genera...

.NET still does GC; it is only the WinRT APIs (something like COM) that manage resources through ref counting. There is some cool interop magic that makes this somewhat transparent to the programmer.

  • > The caching misbehavior of reference counting has been greatly exaggerated, especially in the context of UI, where responsiveness is much more important than raw CPU speed. Also, the ref counting tradeoff seems to work better for devices (e.g. all the cool kids [1] are doing it).

    Apple's ARC is a compiler hack. It only works if you happen to use a set of Cocoa libraries that are recognized by the compiler and allow for some magic incantation.

    You cannot make just any Objective-C library ARC-aware.

    ARC is a good solution for Objective-C, because Apple never managed to have their GC working safely in all situations, given the underlying C type system.

    > .NET still does GC; it is only the WinRT APIs (something like COM) that manage resources through ref counting. There is some cool interop magic that makes this somewhat transparent to the programmer.

    True, but C++ does not and you still have the performance penalty of reference counting.

    The only way to have reference counting with good performance is to do what Parasol or Objective-C do, removing unnecessary increment/decrement operations thanks to dataflow analysis.

    That is something you cannot do just by using the WinRT runtime; you need to rely on the cleverness of the compilers.

    Actually, are you aware that reference counting tends to be presented in the first chapter of many CS books about garbage collection as a poor man's GC?

    • > Apple's ARC is a compiler hack. It only works if you happen to use a set of Cocoa libraries that are recognized by the compiler

      I don't understand. How does ARC rely on more than what NSObject provides? I'm coming from the C++ shared_ptr world and ARC does not feel much different so far (less explicit, but easier to optimise).


    • > True, but C++ does not and you still have the performance penalty of reference counting.

      Not very relevant. For UI code, responsive resource allocation and de-allocation is much more important than saving a few percent of CPU cycles.

      > Actually, are you aware that reference counting tends to be presented on the first chapter of many CS books about garbage collection as poor man's GC?

      Yes! Sometimes the conventional wisdom doesn't hold in the end, or it doesn't hold for all use cases. But I should have been more clear that ref counting is GC, whereas what I refer to as GC involves sweeping reference roots periodically to identify garbage.
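
The compiler-side removal of unnecessary increment/decrement operations mentioned earlier in this subthread can be illustrated with a toy pass over a straight-line list of operations. This is only a sketch under strong assumptions, not how any real compiler does it: the op names (`retain`, `release`, `use`, `call`) are invented here, distinct names are assumed to refer to distinct objects, and any `call` is treated as opaque code that might release anything.

```python
def elide_redundant_pairs(ops):
    """Drop retain/release pairs on the same name with no barrier between them.

    ops is a list of (kind, name) tuples, where kind is one of
    "retain", "release", "use" or "call".
    """
    keep = [True] * len(ops)
    pending = {}  # name -> index of a retain that has not been matched yet
    for i, (kind, name) in enumerate(ops):
        if kind == "retain":
            pending[name] = i
        elif kind == "release":
            j = pending.pop(name, None)
            if j is not None:
                # Nothing in between could have dropped the object,
                # so the +1 and -1 cancel and both can go.
                keep[j] = keep[i] = False
        elif kind == "call":
            # An opaque call may release anything: forget all pending retains.
            pending.clear()
        # "use" neither changes a count nor acts as a barrier.
    return [op for op, kept in zip(ops, keep) if kept]
```

A runtime library that only exposes increment and decrement entry points cannot see whole sequences like this, which is why the comment above says the work has to happen in the compiler.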
