Comment by samsquire

2 days ago

Thanks for such a detailed article.

In my spare time, working with C as a hobby, I am usually in "vertical mode": just getting things done end-to-end as fast as possible, without checking at every step that we have no memory errors. That is different from how I would work (carefully) at work. Since I am just trying to get the algorithm working end-to-end in my hobby time, I do not actually worry about memory management when writing C, and I let the operating system handle freeing memory.

And since I wrote everything in Python or Javascript initially, I am usually porting from Python to C.

If I were using Rust, it would force me to be careful in the same way, due to the borrow checker.

I am curious: we have reference counting and we have Profile guided optimisation.

Could "reference counting" be compiled into a debug/profiled build and then detect which regions of time we free things in before or after (there is a happens before relation with dropping out of scopes that reference counting needs to run) to detect where to insert frees? (We write timing metadata from the RC build that encapsulates the happens-before relationships.)

Then we could recompile with a happens-before relation file that has correlations where things should be freed to be safe.

EDIT: Any discussion about those stack diagrams and alignment should include a link to this Wikipedia page:

https://en.wikipedia.org/wiki/Data_structure_alignment
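
To make the alignment point concrete, here is a small sketch in C. The struct names are made up for illustration, and the exact sizes depend on the compiler and target, so the code only measures padding rather than promising specific numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structs: the same three fields in two different orders. */
struct loose {
    char     tag;    /* 1 byte, usually followed by padding */
    uint32_t value;  /* must start at a multiple of its alignment */
    char     flag;   /* 1 byte, usually followed by tail padding */
};

struct tight {
    uint32_t value;  /* largest member first */
    char     tag;
    char     flag;   /* any padding ends up only at the tail */
};

/* The first member of a struct always sits at offset 0. */
static_assert(offsetof(struct loose, tag) == 0, "first member at offset 0");

/* Bytes in each struct that hold no field data (implementation-defined). */
size_t loose_padding(void)
{
    return sizeof(struct loose) - (2 * sizeof(char) + sizeof(uint32_t));
}

size_t tight_padding(void)
{
    return sizeof(struct tight) - (2 * sizeof(char) + sizeof(uint32_t));
}
```

On common desktop targets the first layout tends to carry more padding than the second, but that is a property of the ABI, not something the C standard fixes.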

16 comments


jvanderbot 2 days ago

> which is just getting things done end-to-end as fast as possible, not careful at every step that we have no memory errors.

One horrible but fun thing a former professor of mine pointed out: If your program isn't going to live long, then you never have to deallocate memory. Once it exits, the OS will happily clean it up for you.

This works in C or perhaps lazy GC languages, but for stateful objects where destructors do meaningful work, like in C++, this is dangerous. This is one of the reasons I hate C++ so much: Unintended side effects that you have to trigger.

> Could "reference counting" be compiled into a debug/profiled build and then detect which regions of time we free things in before or after (there is a happens before relation with dropping out of scopes that reference counting needs to run) to detect where to insert frees?

This is what Rust does, kinda.

C++ also does this with "stack" allocated objects - it "frees" (calls destructor and cleans up) when they go out of scope. And in C++, heap allocated data (if you're using a smart pointer) will automatically deallocate when the last reference drops, but this is not done at compile time.

Those are the only two memory management models I'm familiar with enough to comment on.

MarkSweep 2 days ago
There is this old chestnut about “null garbage collectors”:
https://devblogs.microsoft.com/oldnewthing/20180228-00/?p=98...
> This sparked an interesting memory for me. I was once working with a customer who was producing on-board software for a missile. In my analysis of the code, I pointed out that they had a number of problems with storage leaks. Imagine my surprise when the customers chief software engineer said "Of course it leaks". He went on to point out that they had calculated the amount of memory the application would leak in the total possible flight time for the missile and then doubled that number. They added this much additional memory to the hardware to "support" the leaks. Since the missile will explode when it hits its target or at the end of its flight, the ultimate in garbage collection is performed without programmer intervention.
- jvanderbot 2 days ago
  
  Rapid disassembly as GC. Love it.
  Have you heard the related story about the Patriot missile system?
  https://www.cs.unc.edu/~smp/COMP205/LECTURES/ERROR/lec23/nod...
  Not a GC issue, but fun software bug.
- gpderetta 1 day ago
  
  Until the software is reused for a newer model with longer range and they forget to increase the RAM size.
  But of course that would never happen, would it?
pjmlp 2 days ago
The wonders of corrupted data, stale advisory locks and UNIX IPC leftovers, because they weren't properly flushed, or closed before process termination.
- jvanderbot 2 days ago
  
  I'll narrow my scope more explicitly:
  close(x) is not memory management - not at the user level. This should be done.
  free(p) has no O/S side effects like this in C - this can be not-done if you don't malloc all your memory.
  You can get away with not de-allocating program memory, but (as mentioned) that has nothing to do with freeing OS / kernel / networking resources in C.
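
A rough sketch of that distinction in C, using stdio rather than raw `close()` so it stays portable. `write_report` is a hypothetical helper invented for this example; the leak is deliberate and only reasonable in a process that is about to exit.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: writes msg plus a newline to path.
   Returns 0 on success, -1 on failure. */
int write_report(const char *path, const char *msg)
{
    assert(path != NULL && msg != NULL);

    /* Scratch copy that is deliberately never freed: in a short-lived
       process the OS reclaims it at exit, with no other side effects. */
    char *scratch = malloc(strlen(msg) + 2);
    if (scratch == NULL)
        return -1;
    strcpy(scratch, msg);
    strcat(scratch, "\n");

    FILE *f = fopen(path, "w");
    if (f == NULL)
        return -1;

    int ok = fputs(scratch, f) != EOF;

    /* The file is a different kind of resource: fclose() flushes the
       stdio buffer and can report a write error, so it is checked. */
    if (fclose(f) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

Skipping the `free()` loses nothing once the process ends; skipping the `fclose()` check can lose data without any error being seen.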
  

caspper69 2 days ago

Nothing is going to tell you where to put your free() calls to guarantee memory safety (otherwise Rust wouldn't exist).

There are tools that will tell you they're missing, however. Read up on Valgrind and ASAN.

In C, non-global variables go out of scope when the function they are created in ends. So if you malloc() in a fn, free() at the end.
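
As a minimal sketch of that pattern (the function and its name are just an example, not anything from the article): the scratch buffer is allocated inside the function, used, and freed before the function returns, so its lifetime matches the call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Stores the median of values[0..n-1] in *out (lower middle for even n).
   Returns 0 on success, -1 if n is 0 or allocation fails.
   The caller's array is left untouched. */
int median(const int *values, size_t n, int *out)
{
    assert(values != NULL && out != NULL);
    if (n == 0)
        return -1;

    /* Heap scratch copy: needed only for the duration of this call. */
    int *scratch = malloc(n * sizeof *scratch);
    if (scratch == NULL)
        return -1;

    memcpy(scratch, values, n * sizeof *scratch);
    qsort(scratch, n, sizeof *scratch, cmp_int);
    *out = scratch[(n - 1) / 2];

    free(scratch);  /* malloc() in this function, free() at the end */
    return 0;
}
```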

If you're doing everything with globals in a short-running program, let the OS do it if that suits you (makes me feel dirty).

This whole problem doesn't get crazy until your program gets more complicated: once you have a lot of pointers among objects with different lifetimes, or you decide to add some concurrency (or parallelism), or you have a lot of cooks in the kitchen.

In the applications you say you are writing, just ask yourself if you're going to use a variable again. If not, and it is using dynamically-allocated memory, free() it.

Don't psych yourself out, it's just C.

And yes, there are ref-counting libraries for C. But I wouldn't want to write my program twice, once to use the ref-counting library in debug mode and another to use malloc/free in release mode. That sounds exhausting for all but the most trivial programs.
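
For a sense of what hand-rolled reference counting looks like in C, here is a minimal, single-threaded sketch. The `rc_buf` type and its functions are hypothetical and not taken from any particular library.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical reference-counted byte buffer; not thread-safe. */
struct rc_buf {
    size_t refs;           /* number of live references */
    size_t len;            /* number of bytes in data */
    unsigned char data[];  /* flexible array member */
};

/* Allocates a zeroed buffer with one reference, or returns NULL. */
struct rc_buf *rc_buf_new(size_t len)
{
    if (len > SIZE_MAX - sizeof(struct rc_buf))
        return NULL;  /* size computation would overflow */
    struct rc_buf *b = calloc(1, sizeof *b + len);
    if (b == NULL)
        return NULL;
    b->refs = 1;
    b->len = len;
    return b;
}

/* Takes an additional reference and returns the same pointer. */
struct rc_buf *rc_buf_retain(struct rc_buf *b)
{
    assert(b != NULL && b->refs > 0);
    b->refs++;
    return b;
}

/* Drops one reference. Returns 1 if this call freed the buffer, else 0. */
int rc_buf_release(struct rc_buf *b)
{
    assert(b != NULL && b->refs > 0);
    if (--b->refs > 0)
        return 0;
    free(b);
    return 1;
}
```

The release call that returns 1 marks the spot where a plain `free()` would have gone on that particular run, which is the information the debug-build idea above would be recording.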

SkiFire13 2 days ago

> I am curious: we have reference counting and we have Profile guided optimisation.
>
> Could "reference counting" be compiled into a debug/profiled build and then detect which regions of time we free things in before or after (there is a happens before relation with dropping out of scopes that reference counting needs to run) to detect where to insert frees?

Profile guided optimizations can only gather information about what's most probable; they can't tell you what will surely happen. For freeing, however, you most often want that certainty, because not freeing will result in a memory leak (and freeing too early will result in a use-after-free, which you definitely want to avoid, so the analysis needs to be conservative!). In the end this can only be an _optimization_ (just like profile guided _optimization_s are just optimizations!) on top of a workflow that is ok with leaking everything.

mgaunard 2 days ago

In C, not all objects need to be their own allocated entity (like they are in other languages). They can be stored in-line within another object, which means the lifetime of that object is necessarily constrained by that of its parent.

You could make every object its own allocated entity, but then you're losing most of the benefits of using C, which is the ability to control memory layout of objects.
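
A small sketch of the two styles in C, with made-up types. In the inline version the members live and die with the parent in a single allocation; in the boxed version every piece has its own lifetime that must be managed by hand.

```c
#include <assert.h>
#include <stdlib.h>

struct vec2 { float x, y; };

/* Inline: pos and vel are stored inside the particle itself. */
struct particle_inline {
    struct vec2 pos;
    struct vec2 vel;
};

/* Boxed: each member is its own heap allocation. */
struct particle_boxed {
    struct vec2 *pos;
    struct vec2 *vel;
};

/* One malloc(); a single free() on the result releases everything. */
struct particle_inline *particle_inline_new(void)
{
    struct particle_inline *p = malloc(sizeof *p);
    if (p != NULL)
        *p = (struct particle_inline){0};
    return p;
}

/* Three malloc() calls, each of which can fail on its own. */
struct particle_boxed *particle_boxed_new(void)
{
    struct particle_boxed *p = malloc(sizeof *p);
    if (p == NULL)
        return NULL;
    p->pos = malloc(sizeof *p->pos);
    p->vel = malloc(sizeof *p->vel);
    if (p->pos == NULL || p->vel == NULL) {
        free(p->pos);  /* free(NULL) is a no-op */
        free(p->vel);
        free(p);
        return NULL;
    }
    *p->pos = (struct vec2){0};
    *p->vel = (struct vec2){0};
    return p;
}

/* Children first, then the parent. */
void particle_boxed_free(struct particle_boxed *p)
{
    if (p == NULL)
        return;
    free(p->pos);
    free(p->vel);
    free(p);
}
```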

pjmlp 2 days ago
As does any systems programming language, including those that predate C by a decade. And C still doesn't allow full control without compiler extensions; if you really want full control of the memory layout of objects, Assembly is the only way.
- gizmo686 2 days ago
  
  In practice C lets you control memory layout just fine. You might need to use __attribute__((packed)), which is technically non-standard.
  I've written hardware device drivers in pure C where you need to peek and poke at specific bits on the memory bus. I defined a struct that matched the exact memory layout that the hardware specifies, then cast an integer to a pointer to that struct type. At that point I could interact with the hardware by directly reading/writing fields of the struct (most of which were not even byte aligned).
  It is not quite that simple, as you also have to deal with bypassing the cache, memory barriers, possibly virtual memory, and finding the errata that clarifies the originally published register address was completely wrong. But I don't think any of that is what people mean when they say "memory layout".
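
A rough sketch of the struct-overlay approach, assuming GCC or Clang for `__attribute__((packed))`. The register block, its field names, offsets, and bit values are all invented for illustration, and it uses whole-byte fields rather than the sub-byte fields described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device register block (not a real device). */
struct __attribute__((packed)) demo_regs {
    uint8_t  status;   /* offset 0 */
    uint32_t data;     /* offset 1: unaligned, possible only when packed */
    uint16_t control;  /* offset 5 */
};

/* Check at compile time that the layout matches the (made-up) datasheet. */
static_assert(offsetof(struct demo_regs, data) == 1, "data offset");
static_assert(offsetof(struct demo_regs, control) == 5, "control offset");
static_assert(sizeof(struct demo_regs) == 7, "no hidden padding");

/* On real hardware regs would come from casting a fixed address, e.g.
   (volatile struct demo_regs *)0x40001000, where the address is made up. */
void demo_regs_start(volatile struct demo_regs *regs, uint32_t value)
{
    regs->data = value;      /* volatile: the store is not optimized away */
    regs->control = 0x0001;  /* assumed "start" bit */
}

/* Returns nonzero while the assumed "busy" bit (bit 7) is set. */
int demo_regs_busy(const volatile struct demo_regs *regs)
{
    return (regs->status & 0x80) != 0;
}
```

Pointing the same functions at an ordinary `struct demo_regs` variable in memory is a convenient way to exercise the layout without the hardware.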
  