Comment by gronpi

3 days ago

In C, you can alias pointers if they have compatible types. Not the case in Rust for mutable references. And the rules of Rust have tripped up even senior Rust developers.

https://github.com/rust-lang/rust/commit/71f5cfb21f3fd2f1740...

Without MIRI, a lot of Rust developers would be lost, as they do not even attempt to understand unsafe. And MIRI cannot and does not cover everything, no matter how good and beloved it is.

It should have been possible for senior Rust developers to write UB-free code without having to hope that MIRI saves them.

The situation is not that bad. The rules of unsafe code were pretty badly defined back then, but they are in the process of becoming a lot clearer, and like the grandparent argues, with a well-defined aliasing model like Tree Borrows, they are easier to understand than C's.

If you look into the code you linked, the problem was about accessing undefined bytes through an aliased, differently-typed pointer – something you would have a hard time doing in C to begin with. MaybeUninit was a new thing back then. I think that nowadays, a senior Rust developer would clear the hurdles better.

  • I am very sorry, but you do not address the point that TBAA, which C has by default, is generally easier to follow than a blanket no-aliasing rule, which is what Rust has for mutable references. This is a major difference. C code can opt into a similar kind of no-aliasing constraint, namely by using _restrict_, but that is opt-in, while it is always on for Rust.

    And there is more recent UB in the Rust stdlib as well:

    https://github.com/rust-lang/rust/pull/139553

    • > TBAA, like C has by default, generally is easier than just no aliasing [note: I assume no aliasing here to refer to Rust's mutable references, not C's restrict]

      I don't accept that statement. Rust's mutable references are statically checked, so they are impossible to get wrong (modulo compiler bugs) if you aren't using unsafe code. Using raw pointers has no no-aliasing requirements, so again, it's easier than in C.

      The hard part is mixing mutable references and raw pointers. In 95% of Rust code, you are not required to do that, and you shouldn't do that. In the remaining 5%, you should understand the aliasing model. In that case, indeed, you need to know more than what TBAA requires you to know. But that's for the case where you can also _DO_ more than TBAA would allow you to do.
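A minimal sketch of the raw-pointer half of that claim, assuming both pointers come from the same raw borrow and no reference is used while they are live. The function name `bump_via_aliases` is made up for illustration.

```rust
use std::ptr;

/// Increments `x` three times through two aliasing raw pointers.
/// (Hypothetical helper, only meant to illustrate the point above.)
fn bump_via_aliases(mut x: i32) -> i32 {
    // Create a raw pointer directly, without an intermediate `&mut`.
    let p1: *mut i32 = ptr::addr_of_mut!(x);
    // Raw pointers are `Copy`; `p2` aliases `p1`, and that is allowed.
    let p2: *mut i32 = p1;
    // SAFETY: both pointers point to the live local `x`, and no reference
    // to `x` is created or used while the pointers are in use.
    unsafe {
        *p1 += 1;
        *p2 += 1; // interleaved writes through the alias
        *p1 += 1;
    }
    x
}
```

Calling `bump_via_aliases(0)` returns 3. The interleaved writes need `unsafe` to dereference, but no aliasing rule is broken because no reference is involved.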


> And the rules of Rust have tripped up even senior Rust developers.

Yeah, even senior Rust devs make mistakes. Thanks to Miri, we can catch such mistakes. No reasonable person would expect even senior Rust devs to be magic superheroes that can write tricky unsafe code without making any mistake.

How confident are you that glibc has zero Undefined Behavior? I rather doubt it. The Rust standard library has its entire test suite (well, almost everything, except for some parts in std::fs and std::net) run through Miri. That's not a proof there's no UB in corner cases not covered by the tests, but it means we are much, much more likely to find such bugs and fix them than comparable C code.

C has an opt-out that works sometimes, if a compatible type exists. Rust has an opt-out that works always: use raw pointers (or interior mutable shared references) for all accesses, and you can stop worrying about aliasing altogether.
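To illustrate the interior-mutability route, here is a small sketch in safe Rust using `std::cell::Cell`. The function `add_then_double` is a made-up example, not something from the standard library.

```rust
use std::cell::Cell;

/// Adds `amount` through one shared reference, then doubles the value
/// through another. `a` and `b` are allowed to refer to the same `Cell`.
/// (Hypothetical helper for illustration.)
fn add_then_double(a: &Cell<i32>, b: &Cell<i32>, amount: i32) {
    a.set(a.get() + amount);
    b.set(b.get() * 2);
}
```

If `c` is `Cell::new(1)`, then `add_then_double(&c, &c, 2)` leaves 6 in `c`: both parameters alias the same cell and both mutate it, with no `unsafe` needed. The compiler cannot assume `a` and `b` are distinct, which is exactly the optimization you give up in exchange.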

  • I think that way of describing it is really weird.

    In C, you do not use any special keywords to opt into or opt out of TBAA, instead it is the rule by default that one must follow. I do not consider that 'opting out'. One can disable that in some compilers by disabling 'strict aliasing', as the Linux kernel does, but that is usually on a whole-program basis and not standard.

    In Rust, using raw pointers is using a different mechanism, and mutable references are always 'no aliasing'.

    An example of opting in would be C's "restrict" keyword, where one opts into a similar constraint to that of Rust's 'no aliasing' for mutable references.

    >use raw pointers (or interior mutable shared references) for all accesses, and you can stop worrying about aliasing altogether.

    And dereferencing a raw pointer requires 'unsafe', right? And if one messes up the rules for it, then UB.

    Can you confirm that the interaction between raw pointers and mutable references still requires care? Is this comment accurate?

    >It is safe to hold a raw pointer, *const T or *mut T, at the same time as a mutable reference, &mut T, to the same data. However, it is Undefined Behaviour if you dereference that raw pointer while the mutable reference is still live.

    • > Can you confirm that the interaction between raw pointers and mutable references still requires care?

      Yes, that still requires care.
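A rough sketch of the kind of care involved, as I understand the current aliasing models. The function `bump_both_ways` is a hypothetical example. It shows an ordering that is fine, and the comments mark where a further raw-pointer access would become a problem.

```rust
/// Increments `*r` once through a raw pointer derived from `r`,
/// then once through `r` itself. (Hypothetical helper for illustration.)
fn bump_both_ways(r: &mut u32) {
    // Derive the raw pointer from the mutable reference via a reborrow.
    let p: *mut u32 = &mut *r;
    // SAFETY: `p` was derived from `r`, points to a live `u32`, and `r`
    // has not been used since `p` was created.
    unsafe { *p += 1 };
    // Using `r` again is fine, but from here on `p` should be treated as
    // invalidated: dereferencing it after this write would be UB under
    // Stacked Borrows (Tree Borrows is, as far as I know, more lenient).
    *r += 1;
}
```

Starting from 40, a call leaves 42 behind. The conservative rule of thumb is: derive the raw pointer from the reference, finish with the raw pointer, and only then go back to the reference.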

      > In C, you do not use any special keywords to opt into or opt out of TBAA, instead it is the rule by default that one must follow

      That's exactly the problem: sometimes you need to write code where there's more aliasing, and C just makes that impossible. It tells you to use memcpy instead, causing extra copies which cost performance.

      In Rust, you can still write the code without the extra copies. Yes, it requires unsafe, but it's at least possible. (Remember that writing C is like having all code be unsafe, so in a comparison with C, if 10% of the Rust code needs to be unsafe that's still 90% less unsafe than C.)

  • The opt-out in C is to use char*, which is specifically allowed to alias other types.

    There are also GCC extensions to build other types that may alias.

    • That doesn't let you treat data in-place at two different types, unless one of them happens to be char. So you still need to make a copy in many cases, e.g. to convert between an array of uint32_t and an array of uint16_t.

      In some cases you can use unions, but that, too, is very limited, and I am not sure it would let you do this particular case.
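For comparison, a minimal sketch of how the u32-to-u16 case could look in Rust without a copy. The helper name `as_u16_mut` is invented for this example, and only the direction from the more-aligned type to the less-aligned one is shown.

```rust
/// Views a `u32` slice in place as twice as many `u16` values.
/// (Hypothetical helper; no data is copied.)
fn as_u16_mut(words: &mut [u32]) -> &mut [u16] {
    let len = words.len() * 2;
    let ptr = words.as_mut_ptr().cast::<u16>();
    // SAFETY: `u16` has a smaller alignment requirement than `u32`, every
    // bit pattern is a valid `u16`, the byte range is unchanged, and the
    // returned slice keeps `words` mutably borrowed for its whole lifetime.
    unsafe { std::slice::from_raw_parts_mut(ptr, len) }
}
```

Writing 0x0101 into every half through the returned slice leaves 0x01010101 in every word of the original array. Which half of a word comes first depends on the target's endianness. The opposite direction (u16 to u32) additionally needs an alignment check, for which something like `align_to_mut` would be a better starting point.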
