Comment by roca
3 years ago
I agree that GC languages are faster for prototyping, but this article downplays the disadvantages of GC. E.g. "deterministic destruction" is listed as a benefit of reference counting, but it really should be listed as a disadvantage of GC, since all the other memory-management methods have it.
The "leaky abstractions" argument doesn't work for me. E.g. in a Java API you don't commit (at the language level) to the ownership of passed-in or returned mutable objects. In Rust, you do, and the author claims this is a problem. I say that in fact you depend on the ownership of the referenced object either way; Rust forces you to write down what it is and enforce that the callers honor it, while in Java you cannot. Arguably therefore Rust is less leaky in that hidden assumptions are documented and enforced at the API layer.
I've had lots of good experiences refactoring Rust code. It's probably more typing than other languages but the increased assurance that you didn't break the code (especially in the presence of parallelism) more than makes up for it.
Because no two languages are implemented the same, and there are plenty of languages with tracing GC that also support deterministic destruction.
Unfortunately, people keep reading only about Java and then think they know something about GC-based languages.
Modula-3 and Mesa/Cedar are two examples from the past, D one from modern times, with many others in between.
Probably the same for the others; in D you have RAII, but destruction is only deterministic for stack/static variables. There isn't deterministic destruction for heap-allocated stuff unless you count manual freeing.
Except the little detail that manual heap allocation with RAII is also a thing.
D is not a gc based language. GC is just another tool D makes available to the programmer.
For other folks who don't write D:
D has GC, no GC ("@nogc" mode), or Ref-Counting (experimentally) with "@safe" and "@live" modes.
It is according to the CS definition, and it will keep being one unless @nogc is enabled by default for everything.
> The "leaky abstractions" argument doesn't work for me. E.g. in a Java API you don't commit (at the language level) to the ownership of passed-in or returned mutable objects.
Yep, and so you get all sorts of bugs or defensive code everywhere. For that reason I also have a hard time with one section being
> GC'd code can be more correct
But apparently issues like iterator invalidation and other mutability conflicts don’t rate.
While technically true, I'm not sure that the iterator invalidation argument is that strong in practice. Rust doesn't seem to eliminate those problems, it more moves them around and transforms them into different (though sometimes easier) problems.
We often think that the borrow checker protects us from these kinds of mutability conflict / single-threaded race conditions, by making it so we don't have a reference to an object while someone else might modify it. However, the classic workaround is to turn one of those references into an index/ID into a central collection. In real programming, that index/ID is equivalent to the reference. But while I hold that index/ID, the object can still change, which kind of throws the benefit into question.
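A rough sketch of that workaround (the `Enemy`/`enemies` names are invented): the borrow checker is satisfied, but the thing behind the index can still change underneath you.

```rust
struct Enemy {
    hp: u32,
}

fn main() {
    let mut enemies = vec![Enemy { hp: 100 }, Enemy { hp: 50 }];

    // Hold an index instead of a &Enemy reference to appease the borrow checker.
    let target: usize = 1;

    // Meanwhile, other code mutates the collection...
    enemies[target].hp = 0;
    enemies.swap(0, 1); // ...or reorders it, silently changing what `target` means.

    // No borrow error, but the "stale reference" problem has become
    // a "stale index" problem.
    println!("target hp = {}", enemies[target].hp);
}
```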
In other situations, where we only have multiple read-only references, there's not really a motivating problem to begin with, even in other languages that don't enforce that those references are read-only.
I think the other benefits the article mentions (concurrency benefits, plus encouraging flatter, cleaner architectures) are a bit more compelling.
I get it about Rust, because Rust really won't let you hide the fact that you're holding on to a temporary reference (GC is an abstraction over the lifetime of objects). Borrow checking is also intentionally less flexible about getters (a getter lets the library author do anything inside it, but that makes an exclusive-mutable getter exclusive over the whole object, not just one field). So Java programmers are in for a shock: public fields are encouraged, and APIs have to commit to a particular memory-management strategy.
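A tiny illustration of the getter point (the `Player` type is made up): a `&mut` getter borrows the whole struct exclusively, whereas direct field access lets the compiler see that borrows of different fields are disjoint.

```rust
struct Player {
    name: String,
    score: u32,
}

impl Player {
    // This getter exclusively borrows *all* of self, not just `score`.
    fn score_mut(&mut self) -> &mut u32 {
        &mut self.score
    }
}

fn main() {
    let mut p = Player { name: "alice".into(), score: 0 };

    let s = p.score_mut();
    // println!("{}", p.name); // error: `p` is already mutably borrowed via the getter
    *s += 1;

    // With direct field access the borrows are field-granular:
    let score = &mut p.score;
    let name = &p.name; // fine: a different field
    *score += 1;
    println!("{} has {}", name, score);
}
```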
But OTOH not having to worry about defensive programming is so great. Nothing is NULL. Nothing will unexpectedly mutate. Thread-safety is explicit. It's always clear when vectors and objects are passed, copied, or only briefly inspected.
Welcome to ML-derived languages with tracing GC, especially pure ones.
Java isn't the be-all and end-all of GC-based languages.
The author mentions how experienced C developers use some techniques that drastically improve the correctness of their code.
The same goes for Java: we never use mutability in modern code bases anymore unless we have strong reasons to do so and can keep that mutability as locally constrained as possible.
We also avoid inheritance these days, have you heard about that!? Because yeah, inheritance can be helpful but tends to bite you in the end.
> But apparently issues like iterator invalidation and other mutability conflicts don’t rate.
They don't, because you're using java.util.concurrent when you need data structures that deal with concurrency.
I miss java.util.concurrent terribly when I have to program in any other language. I don't miss Java, but I weep every time I need the equivalent of ConcurrentHashMap or CopyOnWriteArrayList in another language.
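For what it's worth, the usual std-only stand-in in Rust is wrapping a HashMap in Arc<RwLock<...>> (crates like dashmap offer finer-grained locking), and it's exactly the kind of boilerplate ConcurrentHashMap spares you:

```rust
use std::collections::HashMap;
use std::sync::{Arc, RwLock};
use std::thread;

fn main() {
    // Shared, mutable map guarded by a single reader-writer lock.
    let counts: Arc<RwLock<HashMap<String, u64>>> = Arc::new(RwLock::new(HashMap::new()));

    let handles: Vec<_> = (0..4)
        .map(|i| {
            let counts = Arc::clone(&counts);
            thread::spawn(move || {
                // Unlike ConcurrentHashMap, this locks the whole map per update.
                let mut map = counts.write().unwrap();
                *map.entry(format!("worker-{i}")).or_insert(0) += 1;
            })
        })
        .collect();

    for h in handles {
        h.join().unwrap();
    }
    println!("{:?}", counts.read().unwrap());
}
```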
> in a Java API you don't commit (at the language level) to the ownership of passed-in or returned mutable objects. In Rust, you do[.]
That's only the default, of course. Rc is commonly used in Rust to decouple ownership management from the simple use or passing of objects throughout the program, much like in any GC language. (Of course Rc on its own does not collect reference cycles; thus, "proper" tracing GC is now becoming available, with projects like Samsara https://redvice.org/2023/samsara-garbage-collector/ that try to leverage Rc for the most part, while efficiently tracing possibly-cyclical references.)
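For illustration, a minimal sketch of that Rc pattern (the `Document` type is invented); note that Rc alone won't reclaim cycles, and Arc is the thread-safe variant:

```rust
use std::rc::Rc;

struct Document {
    text: String,
}

fn main() {
    let doc = Rc::new(Document { text: "hello".to_string() });

    // Hand out as many co-owning handles as you like; no lifetime shows up
    // in the API, much like passing references around under a GC.
    let a = Rc::clone(&doc);
    let b = Rc::clone(&doc);

    println!("{} {} (refcount = {})", a.text, b.text, Rc::strong_count(&doc));
    // The Document is freed deterministically when the last Rc is dropped.
}
```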
Manual memory management does closely couple memory layout/ownership with the public API, which objectively makes refactors of that layout breaking changes and forces many downstream changes.
A quote from the GC Handbook: “modules should not have to know the rules of the memory management game played by other modules”
While Rust does make these refactors safe, it is nonetheless work that simply doesn’t have to be done in case of Java for example.
I disagree. It depends on the type of project you're working on. If you're writing a public-facing, high-performance application, it MAY be more beneficial to use deterministic memory management.
Most tools, even very technical tools like assemblers, debuggers and compilers, don't need deterministic memory management. Only operating systems, device drivers and some time-critical applications (like video/audio encoding/decoding) need deterministic memory management.
The trick is, once you have deterministic destruction you can use it for many more things than just memory management. E.g. need a metric measuring how many "somethings" are active? Just attach a tracker to the struct representing that "something", and the tracker can count itself because it knows exactly when it is created and destroyed. We tried to add such a metrics system to a Java app the other day, and it turned out to be virtually impossible: due to GC nondeterminism we couldn't tell when most of the objects die.
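For anyone curious what that tracker looks like in a language with deterministic destruction, here's a minimal sketch in Rust (names like `ActiveCounter` and `Connection` are made up), counting live instances via Drop:

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

static ACTIVE: AtomicUsize = AtomicUsize::new(0);

// Guard that increments a global gauge on creation and decrements it in Drop.
struct ActiveCounter;

impl ActiveCounter {
    fn new() -> Self {
        ACTIVE.fetch_add(1, Ordering::Relaxed);
        ActiveCounter
    }
}

impl Drop for ActiveCounter {
    fn drop(&mut self) {
        ACTIVE.fetch_sub(1, Ordering::Relaxed);
    }
}

// Embed the tracker in the struct whose live population we want to measure.
struct Connection {
    _tracker: ActiveCounter,
}

fn main() {
    let c1 = Connection { _tracker: ActiveCounter::new() };
    {
        let _c2 = Connection { _tracker: ActiveCounter::new() };
        println!("active = {}", ACTIVE.load(Ordering::Relaxed)); // 2
    } // _c2 dropped here; the gauge is updated immediately
    println!("active = {}", ACTIVE.load(Ordering::Relaxed)); // 1
    drop(c1);
    println!("active = {}", ACTIVE.load(Ordering::Relaxed)); // 0
}
```

The count is exact at every point because destruction happens at a known place, which is precisely what a finalizer-based version can't promise.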
Do you really need to know? All that matters is that the GC will eventually clean up the used memory. When you're interested in that sort of info you're using the wrong language or asking the wrong kind of questions.
> The "leaky abstractions" argument doesn't work for me. E.g. in a Java API you don't commit (at the language level) to the ownership of passed-in or returned mutable objects. In Rust, you do, and the author claims this is a problem. I say that in fact you depend on the ownership of the referenced object either way; Rust forces you to write down what it is and enforce that the callers honor it, while in Java you cannot. Arguably therefore Rust is less leaky in that hidden assumptions are documented and enforced at the API layer.
A lot of programs and large parts of a lot more programs just don't need these concepts at all and would run correctly with a GC that did not collect any garbage, memory permitting. Most Python code I write is in this genre.
On the flip side, at work we mainly work on a low-latency networked application that has per-coroutine arenas created once at startup and long-term state created once at startup. The arenas are used mostly for deserializing messages into native format to apply to the long-term state and serializing messages to wire format. There's not much benefit to be had from writing all the time that everything that isn't on the stack belongs to the per-coroutine arena, except for all instances of like 3 structs, which belong to the long-term state.
As an older dev who spent a lot of time with C on applications where performance and timing were key, I have found the timing of GC overhead hard to predict. So I tend to pre-allocate a lot of variables and structures in advance and hold onto them, so I don't have to worry about GC in a loop. If someone knows a good tutorial on how to predict when garbage collection will happen in my program, I would love to read it and learn more.
It is more typing than julia or numpy maybe, but probably not more than c#, maybe not more than go, and most certainly not more than java.
I think that even within GC languages, developer velocity varies a lot based on things like the type system, reflection, a REPL, and maybe interpreted vs. compiled. My experience is that this variance is greater than what can be put at the feet of memory management.
Java is much much less verbose than Go, and contrary to popular opinion, it really is not particularly verbose, especially modern Java.
I haven't written much go or much java (in this millennium anyway). The little go I wrote didn't have many error checks, is that what usually makes it verbose? The modern java I wrote was a little more functional than 1999 java, but it utilised reflection heavily, and it was a test harness for a black box, so not idiomatic probably, but there was still a lot of boilerplate. C# seems about normal verbosity, and so does rust. Both are less verbose than a lot of c++98 I run into because of lambdas and iterators etc. Julia on the other hand is quite possibly the least verbose language I have ever seen, apart from matlab (which isn't truly general purpose), and python/numpy/pytorch close behind.
My beef with Java has always been the extremelyLongMethodNamesInTheStdLibraryFactoryFactory.
Even if you use a non-GC language, you are not guaranteed deterministic destruction (e.g. in data structures with loops or a highly variable number of nodes).
Deterministic is not the same as bounded. Deterministic means you can determine the moment the destruction happens and you can estimate its upper bound (knowing how much you want to release), not that it always happens in 100 ns. You also get a guarantee that when a certain point in code has been reached, then a resource has been freed.
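A small sketch of that "guarantee at a point in code" property, using std file handles (the file names are arbitrary): after `drop(file)` the OS handle is definitely closed, so the very next line can rely on it.

```rust
use std::fs::File;
use std::io::Write;

fn main() -> std::io::Result<()> {
    let mut file = File::create("out.tmp")?;
    file.write_all(b"data")?;
    drop(file); // deterministic: the handle is closed *here*, not "eventually"

    // Safe to act on the closed file immediately, e.g. an atomic rename.
    std::fs::rename("out.tmp", "out.txt")?;
    Ok(())
}
```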
As for the cost of releasing the resources: with MMM/borrow checking/refcounting it is typically bounded by the amount of stuff to release. With a tracing GC it is bounded by the total amount of stuff on the heap, which is typically several orders of magnitude larger.