Comment by gpderetta

1 day ago

Do you fully agree? I finally went and read n3005.pdf. The important item there is that a cast to integer exposes the pointer, and from then on the compiler must be conservative and assume that the pointed-to object might be changed via non-trackable pointers. This seems quite a reasonable compromise to make existing code work without affecting the vast majority of objects whose address is never cast to an integer. But ncruces wants defined semantics for arbitrary forged pointers.

You are right, I wasn't thinking straight. I do not fully agree. Creating arbitrary pointers cannot work. Forging pointers into an implementation-defined memory region would be OK, though.

  • Why can't it work though?

    And I'm talking about both things.

    First, integer arithmetic that produces pointers that are just out of bounds of an object. Why can't this work? Why can't the compiler assume that, since I explicitly converted a pointer to an integer, the pointed-to object can't be put into a register, or made to go out of scope early?

    Second, fabricating pointers. If I have a pointer to mmap/sbrk memory, shouldn't I be allowed to “fabricate” arbitrary pointers from integers that point into that area? If not, why not?

    Finally, Wasm. The linear memory is addressable from address 0 to __builtin_wasm_memory_size * PAGESIZE. Given this, and except maybe for address zero, why should it be undefined behavior to dereference any other address?

    What's the actual advantage of making these undefined behavior? What do we gain in return?
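To make the mmap question concrete, here is a minimal sketch of what "fabricating" a pointer into a mapped region could look like. It assumes a POSIX system where `mmap` with `MAP_ANONYMOUS` is available and where pointer-to-integer conversion is a flat address; the function name `poke_mapped` is hypothetical. Note that the integer here is derived from the pointer `mmap` returned, which is the easier case; whether an integer with no such origin may be used is exactly what the thread is arguing about.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc-style feature macro exposing MAP_ANONYMOUS */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Maps `len` bytes, stores `value` at byte `offset` through a pointer rebuilt
   from an integer, then reads the byte back through the original mapping.
   Returns -1 if the offset is out of range or the mapping fails. */
int poke_mapped(size_t len, size_t offset, unsigned char value)
{
    if (len == 0 || offset >= len)
        return -1;

    unsigned char *base = mmap(NULL, len, PROT_READ | PROT_WRITE,
                               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED)
        return -1;

    /* Plain integer arithmetic on the address of the mapping. */
    uintptr_t addr = (uintptr_t)(void *)base + offset;

    /* Rebuild a pointer from the integer; it stays inside the mapping. */
    unsigned char *forged = (unsigned char *)(void *)addr;
    assert(forged == base + offset);
    *forged = value;

    int seen = base[offset];
    munmap(base, len);
    return seen;
}
```

Calling `poke_mapped(4096, 100, 0xAB)` is expected to return `0xAB` on such a system, while an offset at or past `len` is rejected before any pointer is formed.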

    • In practice, if you do a volatile read at an arbitrary mapped address, it will work. But you have no guarantee regarding what you will read from it, even if it happens to match the address of a variable you just wrote into.

      Formally it is undefined, as there is no way to give it sane semantics and it will definitely trip sanitizers and similar memory safety tools.

    • Integer arithmetic working correctly, converting a pointer to an integer, and converting an integer back to the same object are things which should work. This is what we made sure to guarantee in the provenance TS. Making this work for memory outside of any object, or for converting back to a different object whose address was not previously converted to an integer, is difficult. It can be made to work, but then you need to give up a lot (and rewrite your compilers). The case of memory outside of an object is clear, because there might be no memory mapped there. Converting back to an arbitrary object does not work because the compiler must somehow know that it cannot put the object into a register. If you allow conversion back to arbitrary objects, it cannot put anything into registers. This would be bad.

      Fabricating pointers in an implementation-defined way is obviously OK. This would cover mmap / sbrk or memory-mapped I/O. Note also that historically, the C standard left things UB exactly so that implementations could use them for extensions (e.g. mmap). The idea that UB == invalid program is fairly recent misinformation, but we have to react to it and make things at least implementation-defined (which also used to mean something slightly different).
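As an illustration of the case described above as guaranteed, here is a small sketch of a pointer round-tripped through an integer, plus integer arithmetic that stays inside the same object. It assumes `uintptr_t` is available (it is optional in the C standard) and a flat address mapping, as on mainstream implementations; the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts p to an integer and back, then stores through the result. */
int roundtrip_store(int *p, int value)
{
    uintptr_t bits = (uintptr_t)(void *)p;  /* this cast exposes the object */
    int *q = (int *)(void *)bits;           /* converted back to the same object */
    assert(q == p);
    *q = value;
    return *p;
}

/* Reaches element `index` of a 4-element array using integer arithmetic only.
   Returns -1 for an index outside the array instead of forming the pointer. */
int integer_offset_store(size_t index, int value)
{
    int a[4] = {0, 0, 0, 0};
    if (index >= 4)
        return -1;

    uintptr_t bits = (uintptr_t)(void *)a;  /* exposes the array */
    bits += index * sizeof(int);            /* address stays inside the array */
    int *q = (int *)(void *)bits;
    *q = value;
    return a[index];
}
```

Both functions only ever convert back to an object whose address was converted to an integer first. An address that lands outside the object, or inside a different object that was never exposed, falls into the difficult category described above.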