Comment by kaba0

3 years ago

With all due respect, you are talking about everything, but you never gave any objective comparison where the choice between RC and a tracing GC was the culprit.

> RC is super cheap. Seriously. You can do about several billion of them per second. Your malloc/free call is going to be more expensive.

Yes, and that is completely irrelevant. Good tracing GCs can use a thread-local bump allocator, and they can even defragment the heap automatically later.
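To make the bump-allocator point concrete, here is a minimal sketch in C++. `BumpArena` and `thread_arena` are names made up for this illustration and are not taken from any real collector; the 1 MiB arena size is also an arbitrary assumption. The only point is that allocation is an alignment round-up plus a pointer bump, with no free-list search and no locking.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical illustration of a bump allocator, not any real GC's code.
class BumpArena {
public:
    explicit BumpArena(std::size_t capacity) : buffer_(capacity), offset_(0) {}

    // Returns nullptr when the arena is exhausted. A real collector would
    // trigger a collection or hand out a fresh buffer at this point.
    void* allocate(std::size_t size,
                   std::size_t align = alignof(std::max_align_t)) {
        assert(align != 0 && (align & (align - 1)) == 0);  // power of two
        const auto base = reinterpret_cast<std::uintptr_t>(buffer_.data());
        // Round the next free address up to the requested alignment.
        const std::uintptr_t aligned_addr =
            (base + offset_ + align - 1) & ~(static_cast<std::uintptr_t>(align) - 1);
        const std::size_t aligned = static_cast<std::size_t>(aligned_addr - base);
        if (aligned > buffer_.size() || size > buffer_.size() - aligned) {
            return nullptr;
        }
        offset_ = aligned + size;  // the "bump"
        return buffer_.data() + aligned;
    }

    std::size_t used() const { return offset_; }

    // Frees everything at once; individual objects are never freed.
    void reset() { offset_ = 0; }

private:
    std::vector<unsigned char> buffer_;
    std::size_t offset_;
};

// One arena per thread, so the fast path needs no synchronization.
inline BumpArena& thread_arena() {
    thread_local BumpArena arena(std::size_t{1} << 20);  // 1 MiB, arbitrary
    return arena;
}
```

A compacting collector can afford this because it moves live objects out of the arena later and then reuses the whole buffer, which is what the `reset()` call stands in for here.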

> Sure. Atomic counters are relatively expensive. But in good designs there’s very few of them. And they easily show up in hotspot

That’s just false, unless you are using something like Cachegrind. Any sane compiler will inline the 2-3 instructions of a counter increment or decrement, so the cost is spread across every call site. It is the stereotypical “death by a thousand cuts” that never shows up in profiling.
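For illustration, this is roughly what those 2-3 instructions look like in source form. It is a sketch of the common intrusive-refcount pattern, with `RefCounted`, `retain`, and `release` as hypothetical names; it is not the implementation of any particular runtime.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical intrusive reference-counted base type.
struct RefCounted {
    std::atomic<std::uint32_t> refs{1};  // the creator holds the first reference
    virtual ~RefCounted() = default;
};

inline void retain(RefCounted* obj) {
    // Relaxed ordering suffices: the caller already owns a reference,
    // so the object cannot be destroyed concurrently.
    obj->refs.fetch_add(1, std::memory_order_relaxed);
}

inline void release(RefCounted* obj) {
    // acq_rel so the thread that drops the last reference observes all
    // writes made by other owners before it runs the destructor.
    if (obj->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        delete obj;
    }
}
```

On x86 each of these typically becomes a single locked read-modify-write instruction once inlined, so a sampling profiler attributes the cost to whatever function happened to copy or drop the reference, not to a `retain`/`release` symbol you could spot in a hotspot list.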

> Most languages that use RC actually avoid most memory allocations/frees by leveraging value composition instead of referential ownership.

I’m sorry, but this has absolutely nothing to do with the topic here. There are plenty of languages with a tracing GC that can do that as well, like D, Go, C#, and Nim, just off the top of my head. That’s a completely different axis.

> I did actually provide proof by the way. Apple’s phones use half the RAM as Android and are at least as equally fast even if you discount better HW

That only proves that RC uses less memory, which, as has been pointed out in this thread many times, is one of its few positives. It does make sense on a mobile phone, but then you finish off with “big hyperscaler is unlikely to be using Java for their core performance-critical infrastructure”, which is just bad logic and straight up false. Half of the internet literally runs on Java, with heap sizes going up into the terabyte range. Alibaba, many parts of Google, literally every Apple backend system, Twitter, and whole cloud providers (!) all run on the JVM.

> You only do this when you need to share ownership

That’s like... the point of garbage collection algorithms? Otherwise, why don’t you just randomly increment a counter here and there?!

> Not only was the initial version faster than the equivalent Java code (not surprising since c++ will outperform due to no auto boxing, at least at the time

You literally give the reason why it was faster: you are comparing a sequentially laid out data structure to one made of pointers. The effect size of that will trump any concern about which GC algorithm is used, so your point doesn’t apply here at all.
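A small sketch of the two layouts being compared, written in C++ so both can sit side by side. `Point`, `sum_inline`, and `sum_boxed` are hypothetical names for this example; the boxed version only approximates what a collection of boxed objects looks like in memory.

```cpp
#include <memory>
#include <vector>

struct Point {
    double x;
    double y;
};

// Sequential layout: the Points are stored back to back in one buffer,
// so iteration walks memory linearly.
double sum_inline(const std::vector<Point>& pts) {
    double total = 0.0;
    for (const Point& p : pts) {
        total += p.x + p.y;
    }
    return total;
}

// Pointer layout: the vector holds pointers, and every Point is its own
// heap allocation, so each element costs an extra dependent load.
double sum_boxed(const std::vector<std::unique_ptr<Point>>& pts) {
    double total = 0.0;
    for (const auto& p : pts) {
        total += p->x + p->y;
    }
    return total;
}
```

Both functions compute the same result; the difference is purely in memory layout, and it exists regardless of whether the boxed objects are reclaimed by reference counting or by a tracing collector.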