Comment by jcranmer

2 months ago

C currently remains the language of system ABIs, and there remains functionality that C can express that Rust cannot (principally bitfields).

Furthermore, in terms of extensions to the language to support more obscure architectures, Rust has made a couple of decisions that make it hard for some of those architectures to be supported well. For example, Rust has decided that the array index type, the object size type, and the pointer size type are all the same type, which is not the case for a couple of architectures; it's also the case that things like segmented pointers don't really work in Rust (of course, they barely work in C, but barely is more than nothing).


raggi 2 months ago

That first sentence though. Bitfields and ABI alongside each other.

Bitfield packing rules get pretty wild. Sure the user facing API in the language is convenient, but the ABI it produces is terrible (particularly in evolution).

mjevans 2 months ago

I would like a revision to bitfields and structs to make them behave the way a programmer thinks, with the compiler free to suggest changes that optimize the layout, as well as some flag that tells the compiler it should not, because it's a finalized structure.

vlovich123 2 months ago

Can you expand on bitfields? There’s crates that implement bitfield structs via macros so while not being baked into the language I’m not sure what in practice Rust isn’t able to do on that front.
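
For context, a hand-written sketch of roughly what such macro crates generate: a newtype around an integer with shift-and-mask accessors. The `Header` type and its layout (a 3-bit `kind` in the low bits, a 5-bit `count` above it) are made up for illustration. The layout here is spelled out in the source rather than left to compiler-specific bitfield packing rules, but nothing makes it agree with what a C compiler would pick for an equivalent bitfield struct.

```rust
/// Hypothetical 8-bit header: bits 0..3 hold `kind`, bits 3..8 hold `count`.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
#[repr(transparent)]
pub struct Header(u8);

impl Header {
    const KIND_MASK: u8 = 0b0000_0111;
    const COUNT_SHIFT: u32 = 3;
    const COUNT_MASK: u8 = 0b0001_1111;

    /// Packs both fields; out-of-range bits are silently masked off.
    pub fn new(kind: u8, count: u8) -> Self {
        Header((kind & Self::KIND_MASK) | ((count & Self::COUNT_MASK) << Self::COUNT_SHIFT))
    }

    pub fn kind(self) -> u8 {
        self.0 & Self::KIND_MASK
    }

    pub fn count(self) -> u8 {
        (self.0 >> Self::COUNT_SHIFT) & Self::COUNT_MASK
    }

    /// Replaces `count` while leaving `kind` untouched.
    pub fn set_count(&mut self, count: u8) {
        self.0 = (self.0 & Self::KIND_MASK) | ((count & Self::COUNT_MASK) << Self::COUNT_SHIFT);
    }

    /// Raw byte as it would be written to memory or the wire.
    pub fn bits(self) -> u8 {
        self.0
    }
}
```

`#[repr(transparent)]` gives the wrapper the same layout as a plain `u8`, so the only ABI question left is which bits mean what.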

structural 2 months ago

Now, try and use two or more libraries that expose data structures with bitfields, and they have all chosen different crates for this (or even the same crate but different, non-ABI-compatible-versions of it).
There's a ton of standardization work that really should be done before these are safe for library APIs. Mostly fine to just write an application that uses one of these crates though.
ZeWaka 2 months ago
Yeah, not sure what they're saying... I use bitfields in multiple of my rust projects using those macros.
- maweki 2 months ago
  
  I'm not a rust or systems programmer but I think it meant that as an ABI or foreign function interface bitfields are not stable or not intuitive to use, as they can't be declared granularly enough.
  
- pjmlp 2 months ago
  
  Across binary libraries ABI, regardless of static or dynamically linked?
  

kbolino 2 months ago

I'm genuinely surprised that usize <=> pointer convertibility exists. Even Go has different types for pointer-width integers (uintptr) and sizes of things (int/uint). I can only guess that Rust's choice was seen as a harmless simplification at the time. Is it something that can be fixed with editions? My guess is no, or at least not easily.

jcranmer 2 months ago
There is a cost to having multiple language-level types that represent the exact same set of values, as C has (and is really noticeable in C++). Rust made an early, fairly explicit decision that a) usize is a distinct fundamental type from the other types, and not merely a target-specific typedef, and b) not to introduce more types for things like uindex or uaddr or uptr, which are the same as usize on nearly every platform.
Rust's initial guarantee was worded so that usize was sufficient to round-trip a pointer (making it effectively uptr), and there remains concern among several of the maintainers about breaking that guarantee, despite the fact that people on the only target that would be affected are basically saying they'd rather see that guarantee broken. The more fundamental problem is that many crates are perfectly happy opting out of compiling for weirder platforms. I've designed some stuff that relies on 64-bit system properties, and I'd rather like to have the ability to say "no compile for you on platforms where usize-is-not-u64" and get impl From<usize> for u64 and impl From<u64> for usize. If you've got something like that, it also provides a neat way to say "I don't want to opt out of [or into] compiling for usize≠uptr" while keeping backwards compatibility.
If you want to see some long, gory debates on the topic, https://internals.rust-lang.org/t/pre-rfc-usize-is-not-size-... is a good starting point.
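
A rough sketch of the closest thing stable Rust offers today to that kind of opt-out: a `compile_error!` gated on `target_pointer_width`, after which usize <=> u64 conversions can be treated as lossless. The helper names are hypothetical, the wished-for `From` impls do not exist in std (plain `as` casts stand in for them), and how `target_pointer_width` would be defined on a target where usize and pointer width differ is part of the open question.

```rust
use std::convert::TryFrom;

// Refuse to build on targets where usize is not 64 bits wide.
#[cfg(not(target_pointer_width = "64"))]
compile_error!("this crate assumes a 64-bit usize");

/// Lossless on every target the guard above lets through.
pub fn usize_to_u64(n: usize) -> u64 {
    n as u64
}

/// Lossless for the same reason.
pub fn u64_to_usize(n: u64) -> usize {
    n as usize
}

/// Portable alternative: report failure at run time instead of refusing to compile.
pub fn checked_u64_to_usize(n: u64) -> Option<usize> {
    usize::try_from(n).ok()
}
```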
- zozbot234 2 months ago
  
  > ...not to introduce more types for things like uindex or uaddr or uptr, which are the same as usize on nearly every platform. ... there remains concern among several of the maintainers about breaking that guarantee, despite the fact that people on the only target that would be affected basically saying they'd rather see that guarantee broken.
  The proper approach to resolving this in an elegant way is to make the guarantee target-dependent. Require all depended-upon crates to acknowledge that usize might differ from uptr in order to unlock building for "exotic" architectures, much like how no-std works today. That way "nearly every platform" can still rely on the guarantee with no rise in complexity.
- kbolino 2 months ago
  
  I brought up Go because it was designed around the same time and, while it gets a lot of flak for some of its other design decisions, this particular one seems prescient. However, I would be remiss if I gave the impression that the reasoning behind the decision was anticipation of some yet unseen future; the reality was that int and uint (which are not aliases for sized intN or uintN) were not initially the same as ptrdiff_t and size_t (respectively) on all platforms. Early versions of Go for 64-bit systems had 32-bit int and uint, so naturally uintptr had to be different (and it's also not an alias). It was only later that int and uint became machine-word-sized on all platforms and so made uintptr seem a bit redundant. However, this distinction is fortuitous for CHERI etc. support. Still, Go on CHERI with 128-bit uintptr might break some code, but such code is likely in violation of the unsafe pointer rules anyway: https://pkg.go.dev/unsafe#Pointer
  Yet Rust is not Go and this solution is probably not the right one for Rust. As laid out in a link on a sibling comment, one possibility is to do away with pointer <=> integer conversions entirely, and use methods on pointers to access and mutate their addresses (which may be the only thing they represent on some platforms, but is just a part of their representation on others). The broader issue is really about evolving the language and ecosystem away from the mentality that "pointers are just integers with fancy sigil names".
- torginus 2 months ago
  
  I'd say, that even more than pointer sizes, the idea that a pointer is just a number really needs to die, and is in no way a forward looking decision expected of a modern language.
  Pointers should at no point be converted into numbers and back as that trips up many assumptions (special runtimes, static/dynamic analysis tools, compiler optimizations).
  Additionally, I would make it a priority that writing FFIs should be as easy as possible, and require as little human deliberation as possible. Even if Rust is safe, its safety can only be assumed as long as the underlying external code upholds the invariants.
  Which is a huge risk factor for Rust, especially in today's context of the Linux kernel. If I have an object created/handled by external native code, how do I make sure that it respects Rust's lifetime/aliasing rules?
  What's the exact list of rules my C code must conform to?
  Are there any static analysis/fuzzing tools that can verify that my code is indeed compliant?
  
aw1621107 2 months ago
> Is it something that can be fixed with editions? My guess is no, or at least not easily.
Assuming I'm reading these blog posts [0, 1] correctly, it seems that the size_of::<usize>() == size_of::<*mut u8>() assumption is changeable across editions.
Or at the very least, if that change (or a similarly workable one) isn't possible, both blog posts do a pretty good job of pointedly not saying so.
[0]: https://faultlore.com/blah/fix-rust-pointers/#redefining-usi...
[1]: https://tratt.net/laurie/blog/2022/making_rust_a_better_fit_...
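
For reference, the assumption under discussion can be written down as a compile-time check. This is only a sketch: it builds on mainstream targets today and is exactly the line that would fail on a CHERI-style target if usize were redefined as address-sized. The function name is made up for illustration.

```rust
use std::mem::size_of;

// Fails the build if usize and a thin raw pointer differ in size.
const _: () = assert!(size_of::<usize>() == size_of::<*mut u8>());

/// Returns (size of a thin raw pointer, size of usize) in bytes.
pub fn pointer_and_usize_sizes() -> (usize, usize) {
    (size_of::<*mut u8>(), size_of::<usize>())
}
```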
- kbolino 2 months ago
  
  Great info!
  Personally, I like 3.1.2 from your link [0] best, which involves getting rid of pointer <=> integer casts entirely, and just adding methods to pointers, like addr and with_addr. This needs no new types and no new syntax, though it does make pointer arithmetic a little more cumbersome. However, it also makes it much clearer that pointers have provenance.
  I think the answer to "can this be solved with editions" is more "kinda" rather than "no"; you can make hard breaks with a new edition, but since the old editions must still be supported and interoperable, the best you can do with those is issue warnings. Those warnings can then be upgraded to errors on a per-project basis with compiler flags and/or Cargo.toml options.
  

gf000 2 months ago

> that C can express that Rust cannot

The reverse is probably more true, though. Rust has native SIMD support for example, while in standard C there is no way to express that.

'C is not a low-level language' is a great blog post about the topic.

exDM69 2 months ago

> Rust has native SIMD support
std::simd is nightly only.
> while in standard C there is no way to express that.
In ISO Standard C(++) there's no SIMD.
But in practice C vector extensions are available in Clang and GCC which are very similar to Rust std::simd (can use normal arithmetic operations).
Unless you're talking about CPU specific intrinsics, which are available in both languages (core::arch intrinsics vs. xmmintrin.h) in all big compilers.
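
To make the intrinsics point concrete, a minimal sketch of a 4-lane float add using the stable `std::arch::x86_64` intrinsics, with a scalar fallback on other architectures. The `add4` function is hypothetical; the intrinsics are the same ones exposed by `xmmintrin.h` in C.

```rust
/// Adds two 4-lane f32 vectors with SSE intrinsics.
#[cfg(target_arch = "x86_64")]
pub fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    use std::arch::x86_64::{_mm_add_ps, _mm_loadu_ps, _mm_storeu_ps};

    let mut out = [0.0f32; 4];
    // SAFETY: SSE is part of the x86_64 baseline, and the unaligned
    // load/store intrinsics touch exactly the 4 floats of each array.
    unsafe {
        let sum = _mm_add_ps(_mm_loadu_ps(a.as_ptr()), _mm_loadu_ps(b.as_ptr()));
        _mm_storeu_ps(out.as_mut_ptr(), sum);
    }
    out
}

/// Scalar fallback for every other architecture.
#[cfg(not(target_arch = "x86_64"))]
pub fn add4(a: [f32; 4], b: [f32; 4]) -> [f32; 4] {
    [a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3]]
}
```

With nightly `std::simd` or the C vector extensions, the body would just be `a + b` on a vector type, which is the ergonomic gap being discussed.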
ndiddy 2 months ago

AFAIK std::simd is still nightly only. You can use the raw intrinsics in std::arch, but that's not any better than "#include <immintrin.h>".
kristianp 2 months ago

So use the fairly common SIMD extensions in C, that's not much of an argument.

dataflow 2 months ago

In what architecture are those types different? Is there a good reason for it there architecturally, or is it just a toolchain idiosyncrasy in terms of how it's exposed (like LP64 vs. LLP64 etc.)?

jcranmer 2 months ago
CHERI has 64-bit object size but 128-bit pointers (because the pointer values also carry pointer provenance metadata in addition to an address). I know some of the pointer types on GPUs (e.g., texture pointers) also have wildly different sizes for the address size versus the pointer size. Far pointers on segmented i386 would be 16-bit object and index size but 32-bit address and pointer size.
There was one accelerator architecture we were working on that discussed making the entire datapath be 32-bit (taking less space) and having a 32-bit index type with a 64-bit pointer size, but this was eventually rejected as too hard to get working.
- sroussey 2 months ago
  
  I guess today, instead of 128bit pointers we have 64bit pointers and secret provenance data inside the cpu, at least on the most recent shipped iPhones and Macs.
  In the end, I’m not sure that’s better, or maybe we should have had extra-large pointers again (in that, way back, 32 bits was so large we stuffed other stuff in there) like CHERI proposes (though I think it still has a secret sidecar of data about the pointers).
  Would love to see Apple get closer to CHERI. They could make a big change as they are vertically integrated, though I think their Apple Silicon for Mac moment would have been the time.
  I wonder what big pointers do to performance.
  

yencabulator 2 months ago

> For example, Rust has decided that the array index type, the object size type, and the pointer size type are all the same type, [...]

Not really, at least not anymore: https://doc.rust-lang.org/std/ptr/index.html#provenance

This is also how Rust will work on CHERI. You can run your pointer-fiddling code in Miri today to check that you're following the rules.
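
A small sketch of what that looks like in practice, assuming a toolchain where the strict provenance methods `addr` and `map_addr` are stable (Rust 1.84 or later, as far as I know). The pointer-tagging helpers are hypothetical; the point is that the address is manipulated through methods on the pointer, so provenance is never dropped by a round trip through usize.

```rust
/// Sets the low bit of a pointer as a tag.
/// Assumes `T` has alignment >= 2, so the low address bit is otherwise zero.
pub fn set_tag<T>(p: *mut T) -> *mut T {
    p.map_addr(|a| a | 1)
}

/// Clears the tag bit, recovering the original pointer with its provenance.
pub fn clear_tag<T>(p: *mut T) -> *mut T {
    p.map_addr(|a| a & !1)
}

/// Reads only the address part of the pointer; no int-to-pointer cast is involved.
pub fn is_tagged<T>(p: *mut T) -> bool {
    p.addr() & 1 == 1
}
```

Because nothing here uses `as usize` or `as *mut T`, this is the style of code that Miri's strict provenance checking is meant to accept.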

throw_a_grenade 2 months ago

Also you can't do self-referential structs.

Doubly linked lists are also a pain to implement, and they're heavily used in the kernel.

K0nserv 2 months ago

> Also you can't do self-referential strutcs.
You mean in safe rust? You can definitely do self-referential structs with unsafe and Pin to make a safe API. Heck every future generated by the compiler relies on this.
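
A minimal sketch of that pattern, modeled on the example in the `std::pin` documentation. The `SelfRef` type and its method names are made up for illustration: the struct stores a pointer to one of its own fields, and `Pin<Box<_>>` plus `PhantomPinned` ensure the value cannot be moved after the pointer is set.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;
use std::ptr::NonNull;

/// Hypothetical self-referential type: `this` points at the `data` field.
pub struct SelfRef {
    data: String,
    this: NonNull<String>,
    _pin: PhantomPinned, // makes the type !Unpin so safe code cannot move it
}

impl SelfRef {
    pub fn new(data: String) -> Pin<Box<Self>> {
        let mut boxed = Box::pin(SelfRef {
            data,
            this: NonNull::dangling(), // fixed up below, once the address is final
            _pin: PhantomPinned,
        });
        let this = NonNull::from(&boxed.data);
        // SAFETY: we only overwrite a field; the pinned value is never moved.
        unsafe {
            let inner: Pin<&mut Self> = Pin::as_mut(&mut boxed);
            Pin::get_unchecked_mut(inner).this = this;
        }
        boxed
    }

    /// Reads `data` through the stored self-pointer.
    pub fn data_via_self_pointer(self: Pin<&Self>) -> &str {
        // SAFETY: `this` points at `self.data`, which stays put while pinned.
        unsafe { self.this.as_ref().as_str() }
    }

    /// True if the stored pointer really targets this value's own field.
    pub fn is_self_referential(self: Pin<&Self>) -> bool {
        std::ptr::eq(self.this.as_ptr() as *const String, &self.data)
    }
}
```

Moving the `Pin<Box<SelfRef>>` handle around is fine, since that moves the box pointer and not the heap value it points to.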
menaerus 2 months ago
[flagged]
- baq 2 months ago
  
  Don’t spread FUD, you can check some example code yourself.
  https://git.kernel.org/pub/scm/linux/kernel/git/a.hindborg/l...
  

qwm 2 months ago

Correct. C has a mostly standardized ABI, Rust does not. C has a spec, Rust does not. That matters in things like kernels.