Comment by infinity0

8 years ago

Could someone enlighten me on why malloc and free don't automatically zero memory by default?

Someone pointed me to MALLOC_PERTURB_ and I've just run a few test programs with it set - including a stage1 GCC compile, which granted may not be the best test - and it really doesn't dent performance by much (edit: not noticeably, or at all, in fact).
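
For anyone who wants to try it themselves, here's a minimal sketch of what MALLOC_PERTURB_ does (the fill bytes described in the comments are glibc-specific behaviour):

```c
/* Sketch: observe glibc's MALLOC_PERTURB_ fill.
 * Run as:  MALLOC_PERTURB_=42 ./a.out
 * With the variable set to 42, glibc fills fresh allocations with
 * (42 ^ 0xff) and freed blocks with 42, so stale heap contents don't
 * survive a malloc/free cycle unmodified. */
#include <stdio.h>
#include <stdlib.h>

int main(void)
{
    unsigned char *p = malloc(4096);
    if (!p)
        return 1;
    /* Reading uninitialised malloc() memory is done here only to show
     * the perturb pattern; don't rely on it in real code. */
    printf("first byte of a fresh allocation: 0x%02x\n", p[0]);
    free(p);
    return 0;
}
```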

People who prefer extreme performance over prudent security should be the ones forced to mess about with extra settings, anyway.

Some old IBM environments initialized fresh allocations to 0xDEADBEEF, which had the advantage that the result you got from using such memory would (usually) be obviously incorrect. The fact that it was done decades ago is pretty good evidence that it's not about the actual initialization cost: these things cost a lot more back then.

What changed is the paged memory model: modern systems don't actually tie an address to a page of physical RAM until the first time you try to use it (or something else on that page). Initializing the memory on malloc() would "waste" memory in some cases, where the allocation spans multiple pages and you don't end up using the whole thing. Some software assumes this, and would use quite a bit of extra RAM if malloc() automatically wiped memory. It would also tend to chew through your CPU cache, which mattered less in the past because any nontrivial operation already did that.
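
To make the demand-paging point concrete, here's a rough sketch (Linux-only: it reads RSS from /proc/self/statm, so the numbers are approximate and the approach doesn't carry over to other systems):

```c
/* Sketch (Linux-only): a large malloc() barely changes resident
 * memory until the pages are actually touched. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static long resident_pages(void)
{
    long size = 0, resident = 0;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f) {
        if (fscanf(f, "%ld %ld", &size, &resident) != 2)
            resident = -1;
        fclose(f);
    }
    return resident;
}

int main(void)
{
    printf("RSS before malloc:  %ld pages\n", resident_pages());

    size_t len = 256u * 1024 * 1024;          /* 256 MiB */
    char *p = malloc(len);
    if (!p)
        return 1;
    printf("RSS after malloc:   %ld pages\n", resident_pages());

    memset(p, 1, len);                        /* now the pages get backed */
    printf("RSS after touching: %ld pages\n", resident_pages());

    free(p);
    return 0;
}
```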

I personally don't think this is a good enough reason, but it is a little more than just a minor performance issue.

That all being said, while it would likely have helped slightly in this case, it would not solve the problem: active allocations would still be revealed.

  • > Some old IBM environments initialized fresh allocations to 0xDEADBEEF, which had the advantage that the result you got from using such memory would (usually) be obviously incorrect.

    On BSDs, malloc.conf can still be configured to do that: on OpenBSD, junking (which fills allocations with 0xdb and deallocations with 0xdf) is enabled by default for small allocations, and "J" enables it for all allocations. On FreeBSD, "J" initialises all allocations with 0xa5 and deallocations with 0x5a.

  • > What changed is the paged memory model: modern systems don't actually tie an address to a page of physical RAM until the first time you try to use it (or something else on that page). Initializing the memory on malloc() would "waste" memory in some cases, where the allocation spans multiple pages and you don't end up using the whole thing. Some software assumes this, and would use quite a bit of extra RAM if malloc() automatically wiped memory. It would also tend to chew through your CPU cache, which mattered less in the past because any nontrivial operation already did that.

    Maybe an alternative approach is simply to mark the pages, in the MMU's page table entries, to be lazily zeroed out when attached. They wouldn't be zeroed at the time of the malloc() call, but only when they are attached to a physical memory location (the first time you use them).

    • And it seems to me the OS should ensure the pages are zero'd out rather than user space (via malloc()) doing it, because it's still a security hole to let a process read data that it's not supposed to have access to (whether it's from another process or the kernel - it doesn't matter).

    • Unsure, not my job. But I read stuff along those lines. A modern OS plays all sorts of games to delay doing work. Allocate a couple of megs of memory and the OS sets up some pointers in a page table. And yes it'll keep already zero'd pages handy. And mark pages as dirty to be scraped clean later.

  • It doesn't need to affect your CPU cache, because x64 processors have non-temporal writes (streaming stores) that bypass the cache.
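
    For illustration only, a minimal sketch of a cache-bypassing clear using SSE2 intrinsics - the helper is hypothetical, not something any real allocator is claimed to use:

    ```c
    #include <emmintrin.h>   /* SSE2: _mm_stream_si128, _mm_sfence */
    #include <stddef.h>

    /* Zero a 16-byte-aligned buffer (len a multiple of 16) with
     * non-temporal stores, so the zeroing doesn't evict useful data
     * from the CPU cache on its way to RAM. */
    static void zero_streaming(void *buf, size_t len)
    {
        __m128i zero = _mm_setzero_si128();
        for (size_t off = 0; off < len; off += 16)
            _mm_stream_si128((__m128i *)((char *)buf + off), zero);
        _mm_sfence();   /* make the streaming stores visible to later loads */
    }
    ```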

    The stuff about eagerly allocating pages is spot on though.

    There is calloc(), which allocates and zeroes memory, but people don't use it as often as they should.
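
    A small example of the difference, with a placeholder struct; note that calloc is also expected to check the element-count multiplication for overflow, which the usual malloc pattern does not:

    ```c
    #include <stdlib.h>

    struct item { int id; char name[60]; };   /* placeholder type */

    struct item *make_items(size_t n)
    {
        /* Zeroed and overflow-checked in one call. */
        struct item *a = calloc(n, sizeof *a);

        /* The common alternative does neither:
         *   struct item *a = malloc(n * sizeof *a);  // n * size may overflow
         *   memset(a, 0, n * sizeof *a);             // easy to forget
         */
        return a;
    }
    ```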

  • Parsers don't usually need to hold onto what they're parsing for very long, so unless they were running this in parallel on a machine with 4k cores, I'd imagine it's much more likely that a buffer overrun hits the middle of an already-freed allocation than that it reaches into an active one.

    In terms of "wasting" memory, perhaps the kernel could detect that you are writing 0s to a COW 0 page and still not actually tie the page to physical RAM. (If you're overwriting non-0 data, well it's already in a physical page.)

    I don't quite follow the details of the CPU cache issue and why that is more-than-minor.

    I do think in this day and age we should be revisiting this question seriously in our C standard libraries. If the performance cost really is a major problem for specific systems, the old behaviour could be kept for them, but only after benchmarking shows that it is a problem.

    • > In terms of "wasting" memory, perhaps the kernel could detect that you are writing 0s to a COW 0 page and still not actually tie the page to physical RAM.

      Writing to your COW zero page causes a page fault. Now, in theory you could disassemble the executing instruction and if it's some kind of zero write, just bump the instruction pointer and go back to userspace - but then the very next instruction in your loop that zeroes the next 8 bytes will cause the same page fault. And the next. And the next...

      Taking a page fault for every 8 bytes in your allocation is completely infeasible. You'd be better off taking the hit of the additional memory usage.

  • An invariant you get from most kernels is that all new memory pages are zeroed when mapped into processes (normally through mmap or sbrk), so you only have the paging problem when initializing with a value other than zero.
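
    A quick sanity check of that invariant using anonymous mmap (MAP_ANONYMOUS may need a feature-test macro on some systems):

    ```c
    #include <assert.h>
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 1 << 20;   /* 1 MiB of fresh anonymous memory */
        unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                                MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return 1;

        /* The kernel must not leak another process's data, so every
         * byte of a freshly mapped page reads back as zero. */
        for (size_t i = 0; i < len; i++)
            assert(p[i] == 0);

        puts("all zero, as expected");
        munmap(p, len);
        return 0;
    }
    ```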

Zeroing on malloc and/or free would not have prevented this type of error, since the information disclosure was due to an overflow into an adjacent allocated buffer.

However, zeroing on free is generally a useful defense-in-depth measure, because it can minimize the risk of some types of information disclosure vulnerabilities. If you use grsecurity, this feature is provided by grsecurity's PAX_MEMORY_SANITIZE [0].

[0]: https://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity...
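
Outside of grsecurity, the same idea can be applied by hand for particularly sensitive buffers. Here's a sketch using explicit_bzero (available on glibc 2.25+ and the BSDs; the wrapper name is made up for this example):

```c
#include <stdlib.h>
#include <string.h>

/* Zero a sensitive buffer before returning it to the allocator.
 * explicit_bzero() is used because a plain memset() right before
 * free() may be optimised away as a dead store. */
static void free_sensitive(void *p, size_t len)
{
    if (p != NULL) {
        explicit_bzero(p, len);
        free(p);
    }
}
```

Calling free_sensitive(key, key_len) instead of a bare free(key) keeps key material from lingering in freed heap chunks, at the cost of one extra pass over the buffer.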

Zeroing on alloc/free probably wouldn't have helped much with this bug. Data in live allocations would still be leaked.

> Could someone enlighten me on why malloc and free don't automatically zero memory by default?

The computational cost of doing so, I suspect.

  • Just like why most filesystems don't zero deleted files.

    • Neither of these is a good reason: I already talked about MALLOC_PERTURB_ (man mallopt) and my naive performance tests in my post, and we rarely get bad security holes based on data from deleted files left on filesystems.

Are these results hardware-independent? Maybe it makes a difference on older machines or on different architectures.

I imagine clearing memory on free is more relevant than MALLOC_PERTURB_?