Comment by 9rx

1 month ago

> it is you writing a garbage collector in C.

That's right, you can write a garbage collector in C for C to use. You can also write a garbage collector in C for JavaScript to use, and you could even write a garbage collector in C for Rust to use, but in this case we are talking about a garbage collector for C to use.

If you are writing a garbage collector that will not be used, why bother?

This feels like it's turning into trolling, so one last round:

The C language does not use your C code, it is your C code that uses the C language.

The tools available to C are what the language specification dictates and what the compiler implements. For example, C might use stack memory and related CPU instructions, because the C specification describes "automatic memory" and the compiler implements it with the CPU's stack functionality. It might insert calls to "memcpy", as this function is part of the C language spec. For C++, the compiler will insert calls to constructors and destructors, as the language specifies them.

The C language does not specify a garbage collector so it can never use one.

You, however, can use C to write a garbage collector to manually use in your C code. C remains entirely unaware of the garbage collector's existence as it has no idea what the code you write does - it will never call it on its own and the compiler will never make any decisions based on its existence. From C's perspective, it's still just memory managed manually by your application with your logic.
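To make that concrete, here is a minimal sketch of a hand-rolled mark-and-sweep collector. Every name in it (`Obj`, `gc_new`, `gc_collect`, `gc_demo`) is hypothetical and invented for illustration, and it assumes each object holds at most one outgoing reference. Note that the program has to route allocations through the collector, list its own roots, and call the collection itself; nothing here is done by the compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy collector: all names are invented for illustration. */
typedef struct Obj {
    struct Obj *next_alloc; /* intrusive list of every allocation */
    struct Obj *ref;        /* the single outgoing reference an object may hold */
    int marked;
} Obj;

static Obj *all_objects = NULL;
static size_t live_count = 0;

/* Allocation goes through the collector only because the program chooses to. */
static Obj *gc_new(Obj *ref) {
    Obj *o = malloc(sizeof *o);
    assert(o != NULL);
    o->next_alloc = all_objects;
    o->ref = ref;
    o->marked = 0;
    all_objects = o;
    live_count++;
    return o;
}

/* Mark everything reachable from one root by following ref pointers. */
static void gc_mark(Obj *o) {
    while (o != NULL && !o->marked) {
        o->marked = 1;
        o = o->ref;
    }
}

/* Sweep: free unmarked objects, clear marks on survivors. Returns count freed. */
static size_t gc_sweep(void) {
    size_t freed = 0;
    Obj **link = &all_objects;
    while (*link != NULL) {
        Obj *o = *link;
        if (o->marked) {
            o->marked = 0;
            link = &o->next_alloc;
        } else {
            *link = o->next_alloc;
            free(o);
            live_count--;
            freed++;
        }
    }
    return freed;
}

/* The program must name its roots itself: the compiler will not scan the stack. */
static size_t gc_collect(Obj **roots, size_t nroots) {
    for (size_t i = 0; i < nroots; i++) {
        gc_mark(roots[i]);
    }
    return gc_sweep();
}

/* Builds four objects, keeps two reachable, and returns how many were freed. */
size_t gc_demo(void) {
    Obj *a = gc_new(NULL);
    Obj *b = gc_new(a);  /* b -> a */
    gc_new(NULL);        /* unreachable garbage */
    gc_new(b);           /* points at b, but nothing points at it */
    Obj *roots[] = { b };
    return gc_collect(roots, 1);
}
```

Calling `gc_demo()` from a `main` function exercises the whole thing. If the call to `gc_collect` is left out, nothing is ever reclaimed, which is the sense in which the language itself stays unaware of the collector.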

In JavaScript and Go, the language specifies the presence of garbage collection and how that should work, and so any runtime is required to implement it accordingly. You can write that runtime in C, but the C code and C compiler will still not be garbage collected.

  • The C standard is actually carefully written to allow for placing distinct "objects" in separate memory segments of a non-flat address space, such that ordinary pointer arithmetic cannot be expected to reach across to a separate "object". This is not far from allowing for some sort of GC as part of low-level C implementation, and in fact the modern Fil-C relies on it.

    • This is actually quite an interesting topic in its own right, but not quite the discussion above, which is about whether C or Rust include a GC or are considered to "use" the GC you hand-rolled or pulled in from a library.

      I wouldn't consider Fil-C's GC a GC in the conventional sense either, in that code compiled with Fil-C is still managed manually, with the (fully-fledged) GC effectively only serving as a way to do runtime validation and to turn what would otherwise be a UAF with undefined behavior into a well-defined and immediate panic. This aspect is in essence an alternative approach to what is done by AddressSanitizer.

      I'll have to look a bit more into Fil-C though. Might be interesting to see how it compares to the usual sanitizers in practice.

  • > The C language does not use your C code

    Your impression that there is a semantic authority misses the mark. While you are free to use English as you see fit, so too is everyone else. We already agreed on the intent of the message, so when I say something like "C uses C code", it absolutely does, even if you wouldn't say it that way yourself. I could be alone in this usage and it would remain valid. Only intent is significant.

    However, I am clearly not alone in that style of usage. I read things like "Rust can use code written in C" on here and in other developer venues all the time. Nobody ever even appears confused by such a statement. If Rust can use code written in C, why can't C use code written in C?

    > The C language does not specify a garbage collector so it can never use one.

    The C language also does not specify a linked list. Go tell your developer friends that C can never use a linked list. Please take a photo when they look at you like you have two heads. Admittedly I lack the ability to say something so outlandish to another human with a straight face, but for the sake of science I put that into an LLM. It called me out on the bullshit, pointing out that C can, in fact, use a linked list.

    For what it is worth, I also put "C can never use a garbage collector" into an LLM. It also called me out on that bullshit just the same. LLMs are really good at figuring out how humans generally use terminology. It is inherent to how they are trained. If an LLM is making that connection, so too would many humans.

    > In JavaScript and Go, the language specifies the presence of garbage collection

    The Go language spec does, no doubt as a result of Pike's experience with Alef. The JavaScript spec[1] does not. Assuming you aren't making things up, I am afraid your intent was lost. What were you actually trying to say?

    > C code and C compiler will still not be garbage collected.

    That depends. GC use isn't typical in the C ecosystem, granted, but you absolutely can use garbage collection in a C program. You can even use something like the CCured compiler to have GC added automatically. The world is your oyster. There is no way you couldn't have already realized that, though, especially since we already went over it earlier. It is apparent that your intent wasn't successfully transferred again. What are you actually trying to say here?

    > This is turning into trolling.

    The mightiest tree in the forest could be cut down with that red herring!

    [1] The standard calls itself ECMAScript, but I believe your intent here is understood.

    • > The C language also does not specify a linked list. Go tell your developer friends that C can never use a linked list.

      They would not blink because the statement is accurate. To the C language and to the C compiler, there are no linked lists - just random structs with random pointers pointing to god knows what. C does know about arrays though.
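      A small sketch of that distinction, with hypothetical names (`struct node`, `list_length`, and the two demo functions are invented for illustration): the "list" exists only as a convention that the program's own functions follow, whereas an array's element count is part of its type.

      ```c
      #include <assert.h>
      #include <stddef.h>

      /* Hypothetical node type; the name and fields are illustrative only. */
      struct node {
          int value;
          struct node *next; /* to the compiler: a pointer member, nothing more */
      };

      /* "List length" exists only as a convention this function imposes. */
      size_t list_length(const struct node *head) {
          size_t n = 0;
          for (; head != NULL; head = head->next) {
              n++;
          }
          return n;
      }

      /* Arrays are a language-level type: sizeof knows the element count. */
      size_t array_length_demo(void) {
          int values[4] = {1, 2, 3, 4};
          return sizeof values / sizeof values[0];
      }

      /* Chains three stack-allocated nodes and walks them. */
      size_t list_length_demo(void) {
          struct node c = {3, NULL};
          struct node b = {2, &c};
          struct node a = {1, &b};
          return list_length(&a);
      }
      ```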

      > The JavaScript spec[1] does not [specify garbage collection].

      I have good reason to believe that you are not familiar with the specification, although to be fair most developers would not be familiar with its innards.

      The specification spends quite a while outlining object liveness and rules for when garbage is allowed to be collected. WeakRefs, FinalizationRegistries, the KeptObjects list on the agent record, ...

      Just like with Go, it is perfectly valid to have an implementation of a "garbage collector" that is a no-op and never collects anything. The application will then continuously leak memory until it runs out and crashes, as the language provides no mechanism to free memory. For Go, you can switch to this with `GOGC=off`. The specific wording from ECMA-262:

      > This specification does not make any guarantees that any object or symbol will be garbage collected. Objects or symbols which are not live may be released after long periods of time, or never at all. For this reason, this specification uses the term "may" when describing behaviour triggered by garbage collection.

      If you're not used to reading language specs the rest of the details can be a bit dry to extract, but the general idea is that the spec outlines automatic allocation, permission for a runtime to deallocate things that are not considered "live", and the rules under which something is considered to be "live". And importantly, it provides no means within the language to take on the task of managing memory yourself.

      This is how languages specify garbage collection, as the language does not want to limit you to a specific garbage collection algorithm and only cares about what the language needs to guarantee.

      > [1] The standard calls itself ECMAScript, but I believe your intent here is understood.

      sigh.

      > For what it is worth, I also put "C can never use a garbage collector" into an LLM. It also called me out on that bullshit just the same.

      more sigh. LLMs always just wag their tails when you beg the question on an opinion. They do not do critical thinking for you or have any strong opinions to give of their own.

      I'm done, have a nice day.
