← Back to context

Comment by jcranmer

6 hours ago

C currently remains the language of system ABIs, and there remains functionality that C can express that Rust cannot (principally bitfields).

Furthermore, in terms of extensions to the language to support more obtuse architecture, Rust has made a couple of decisions that make it hard for some of those architectures to be supported well. For example, Rust has decided that the array index type, the object size type, and the pointer size type are all the same type, which is not the case for a couple of architectures; it's also the case that things like segmented pointers don't really work in Rust (of course, they barely work in C, but barely is more than nothing).

Can you expand on bitfields? There’s crates that implement bitfield structs via macros so while not being baked into the language I’m not sure what in practice Rust isn’t able to do on that front.

  • Yeah, not sure what they're saying... I use bitfields in multiple of my rust projects using those macros.

    • I'm not a rust or systems programmer but I think it meant that as an ABI or foreign function interface bitfields are not stable or not intuitive to use, as they can't be declared granularily enough.

    • Across binary libraries ABI, regardless of static or dynamically linked?

That first sentence though. Bitfields and ABI alongside each other.

Bitfield packing rules get pretty wild. Sure the user facing API in the language is convenient, but the ABI it produces is terrible (particularly in evolution).

  • I would like a revision to bitfields and structs to make them behave the way a programmer things, with the compiler free to suggest changes which optimize the layout. As well as some flag that indicates the compiler should not, it's a finalized structure.

I'm genuinely surprised that usize <=> pointer convertibility exists. Even Go has different types for pointer-width integers (uintptr) and sizes of things (int/uint). I can only guess that Rust's choice was seen as a harmless simplification at the time. Is it something that can be fixed with editions? My guess is no, or at least not easily.

  • There is a cost to having multiple language-level types that represent the exact same set of values, as C has (and is really noticeable in C++). Rust made an early, fairly explicit decision that a) usize is a distinct fundamental type from the other types, and not merely a target-specific typedef, and b) not to introduce more types for things like uindex or uaddr or uptr, which are the same as usize on nearly every platform.

    Rust worded in its initial guarantee that usize was sufficient to roundtrip a pointer (making it effectively uptr), and there remains concern among several of the maintainers about breaking that guarantee, despite the fact that people on the only target that would be affected basically saying they'd rather see that guarantee broken. Sort of the more fundamental problem is that many crates are perfectly happy opting out of compiling for weirder platform--I've designed some stuff that relies on 64-bit system properties, and I'd rather like to have the ability to say "no compile for you on platform where usize-is-not-u64" and get impl From<usize> for u64 and impl From<u64> for usize. If you've got something like that, it also provides a neat way to say "I don't want to opt out of [or into] compiling for usize≠uptr" and keeping backwards compatibility.

    If you want to see some long, gory debates on the topic, https://internals.rust-lang.org/t/pre-rfc-usize-is-not-size-... is a good starting point.

    • > ...not to introduce more types for things like uindex or uaddr or uptr, which are the same as usize on nearly every platform. ... there remains concern among several of the maintainers about breaking that guarantee, despite the fact that people on the only target that would be affected basically saying they'd rather see that guarantee broken.

      The proper approach to resolving this in an elegant way is to make the guarantee target-dependent. Require all depended-upon crates to acknowledge that usize might differ from uptr in order to unlock building for "exotic" architectures, much like how no-std works today. That way "nearly every platform" can still rely on the guarantee with no rise in complexity.

  • > Is it something that can be fixed with editions? My guess is no, or at least not easily.

    Assuming I'm reading these blog posts [0, 1] correctly, it seems that the size_of::<usize>() == size_of::<*mut u8>() assumption is changeable across editions.

    Or at the very least, if that change (or a similarly workable one) isn't possible, both blog posts do a pretty good job of pointedly not saying so.

    [0]: https://faultlore.com/blah/fix-rust-pointers/#redefining-usi...

    [1]: https://tratt.net/laurie/blog/2022/making_rust_a_better_fit_...

In what architecture are those types different? Is there a good reason for it there architecturally, or is it just a toolchain idiosyncrasy in terms of how it's exposed (like LP64 vs. LLP64 etc.)?

  • CHERI has 64-bit object size but 128-bit pointers (because the pointer values also carry pointer provenance metadata in addition to an address). I know some of the pointer types on GPUs (e.g., texture pointers) also have wildly different sizes for the address size versus the pointer size. Far pointers on segmented i386 would be 16-bit object and index size but 32-bit address and pointer size.

    There was one accelerator architecture we were working that discussed making the entire datapath be 32-bit (taking less space) and having a 32-bit index type with a 64-bit pointer size, but this was eventually rejected as too hard to get working.

    • I guess today, instead of 128bit pointers we have 64bit pointers and secret provenance data inside the cpu, at least on the most recent shipped iPhones and Macs.

      In the end, I’m not sure that’s better, or maybe we should have had extra large pointers again (in that way back 32bit was so large we stuffed other stuff in there) like CHERI proposes (though I think it still has secret sidecar of data about the pointers).

      Would love to Apple get closer to Cheri. They could make a big change as they are vertically integrated, though I think their Apple Silicon for Mac moment would have been the time.

      I wonder what big pointers does to performance.

      1 reply →

Also you can't do self-referential strutcs.

Double-linked lists are also pain to implement, and they're are heavily used in kernel.

  • > Also you can't do self-referential strutcs.

    You mean in safe rust? You can definitely do self-referential structs with unsafe and Pin to make a safe API. Heck every future generated by the compiler relies on this.

  • There's going to be ~zero benefit of Rust safety mechanisms in the kernel, which I anticipate will be littered with the unsafe sections. I personally see this move not as much technical as I see it as a political lobbying one (US gov). IMHO for Linux kernel development this is not a good news since it will inevitably introduce friction in development and consequently reduce the velocity, and most likely steer away certain amount of long-time kernel developers too.