My best guess would be to prevent the 'useless instruction' from being optimized out, but an or with 0 is still useless, and I don't see what optimizer lies between this GCC feature and the final binary.
Maybe the segfault only occurs on a write?
Apparently, because it's shorter [0].
[0]: See https://lkml.org/lkml/2017/11/10/348
Why is it shorter? Both MOV and OR have one-byte opcodes, and with the OR you either have to use an immediate zero (which burns a byte) or materialize zero in some other way. As that email points out, the entire sequence would be shorter using a different addressing mode anyways. And a read-modify-write is definitely slower at runtime.
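For what it's worth, a size difference does show up once the immediate is counted. A rough sketch, assuming 32-bit operand size and a plain `(%esp)` operand with no displacement: OR has a form that takes a sign-extended 8-bit immediate (`83 /1 ib`), while a MOV of an immediate to memory (`C7 /0 id`) always carries a full 32-bit immediate. The byte arrays below are hand-assembled for illustration and are not copied from the linked disassembly; `bytes_saved_by_or` is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>

/* orl $0x0,(%esp): opcode 83 /1, ModRM 0c, SIB 24, then a 1-byte immediate. */
static const unsigned char or_imm8[] = { 0x83, 0x0c, 0x24, 0x00 };

/* movl $0x0,(%esp): opcode C7 /0, ModRM 04, SIB 24, then a 4-byte immediate. */
static const unsigned char mov_imm32[] = { 0xc7, 0x04, 0x24,
                                           0x00, 0x00, 0x00, 0x00 };

/* Hypothetical helper: how many bytes the OR form saves over the MOV form. */
size_t bytes_saved_by_or(void)
{
    assert(sizeof mov_imm32 > sizeof or_imm8);
    return sizeof mov_imm32 - sizeof or_imm8;
}
```

Note the two are not equivalent anyway: the MOV overwrites whatever is at the probed address, while OR with 0 leaves it intact. A store from a register would be shorter still, but it would also clobber the location.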
I wonder if it's because it's safer, in that it doesn't change anything there if you've gone over the stack limit and into the heap? I know that -fstack-protect was designed a long time ago, possibly before guard pages and before 64-bit addressing.
Because gcc's -fstack-check is garbage. Gentoo Hardened should not be using it.
From https://lkml.org/lkml/2017/11/10/310, discussing the disassembly:
> This code is so wrong I don't even no where to start. ... I suppose we could try to make the kernel fail to build at all on a broken configuration like this.
Very good point. The key phrase is "an offset > 4096 is just bogus. That's big enough to skip right over the guard page": if a function allocates more than a page's worth of stack (4K in most cases), it should touch every page of that allocation before returning, not just one address past it.
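To make the "skip right over the guard page" point concrete, here is a minimal sketch of per-page probing, assuming a 4096-byte page and using an ordinary buffer as a stand-in for a stack allocation (a real stack grows downward and the compiler would emit the probes itself). `probe_pages` is a hypothetical name, not anything GCC provides. Because consecutive probes are never more than one page apart, a guard page of that size cannot sit unnoticed between them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: touch one byte in every page-sized step of
 * [base, base + len) and return how many probes were made. */
size_t probe_pages(volatile unsigned char *base, size_t len, size_t page)
{
    assert(page > 0);
    size_t probes = 0;
    for (size_t off = 0; off < len; off += page) {
        /* Read-modify-write that leaves the byte unchanged, in the
         * spirit of "or $0": it faults on an unmapped or guard page
         * but does not corrupt valid memory. */
        base[off] |= 0;
        probes++;
    }
    return probes;
}
```

A single probe at an offset larger than the page size, which is what the quoted email complains about, corresponds to running only the last iteration of this loop.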
This guy is a class act. Smart AF