Comment by wg0
1 year ago
Otherwise it is a decent language, but what makes it difficult is the borrow semantics and lifetimes. Lifetimes are more complicated to get your head around.
But then there's this Arc, Ref, Pinning and what not - how deep is that rabbit hole?
Context: I'm writing a novel kernel in Rust.
Lifetimes aren't bad, though the learning curve is admittedly a bit high. Post-v1 Rust significantly reduced the number of places you need them, and a recent update allows you to elide them even more, if memory serves.
Arc isn't any different than other languages, not sure what you're referring to by ref but a reference is just a pointer with added semantic guarantees, and Pin isn't necessary unless you're doing async (not a single Pin shows up in the kernel thus far and I can't imagine why I'd have one going forward).
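To make the elision point concrete, here is a minimal sketch. The function names are made up for illustration. The first two functions have the same signature as far as the compiler is concerned; the third shows a case where the annotation still has to be written out because there are two reference parameters to choose from.

```rust
// Explicit form: the returned slice borrows from `s`.
fn first_word_explicit<'a>(s: &'a str) -> &'a str {
    s.split_whitespace().next().unwrap_or("")
}

// Elided form: with a single reference parameter, the compiler infers
// the same signature as the explicit version above.
fn first_word(s: &str) -> &str {
    s.split_whitespace().next().unwrap_or("")
}

// With two reference parameters the compiler cannot guess which one the
// result borrows from, so the lifetime must be spelled out.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}

fn main() {
    let text = String::from("hello kernel world");
    println!("{}", first_word(&text));
    println!("{}", first_word_explicit(&text));
    println!("{}", longest("lifetime", "pin"));
}
```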
Would you not want to use Pin when sharing memory with a driver or extension written in a different language (eg: C)?
Pin is a pure compile-time abstraction for a single problem: memory safety of self-referential structs.
Pin leverages the type system to tell the programmer who receives a pointer to a pinned object that this object holds a pointer to itself (a self-referential struct). You had better be mindful not to move this object to a different memory location unless you know for sure that it is safe to do so. The Pin abstraction makes this harder to forget, and easier to notice during code review, by forcing you to use the keyword unsafe for any operation on the pinned object that could move it around.
In C, there is no such way to warn the programmer besides documentation. It is up to the programmer to be very careful.
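A rough sketch of what that looks like in practice. `Node` is a hypothetical stand-in for a self-referential struct (it holds no real self-pointer, to keep the example short); `PhantomPinned` is what opts it out of `Unpin`. Getting a plain `&mut Node` out of the pin requires `unsafe`, which is the "hard to forget" part.

```rust
use std::marker::PhantomPinned;
use std::pin::Pin;

// Hypothetical type standing in for a self-referential struct.
struct Node {
    value: u32,
    // Opts out of `Unpin`, so a pinned `Node` cannot be moved in safe code.
    _pin: PhantomPinned,
}

fn new_pinned(value: u32) -> Pin<Box<Node>> {
    Box::pin(Node { value, _pin: PhantomPinned })
}

fn set_value(node: Pin<&mut Node>, value: u32) {
    // SAFETY: we only write a field and never move the `Node` out of place.
    let inner = unsafe { node.get_unchecked_mut() };
    inner.value = value;
}

// Address of the pinned value, used only to show that it stays put.
fn address(node: &Pin<Box<Node>>) -> usize {
    &**node as *const Node as usize
}

fn main() {
    let mut node = new_pinned(1);
    let before = address(&node);

    set_value(node.as_mut(), 2);

    assert_eq!(node.value, 2);
    assert_eq!(address(&node), before);
    // `&mut *node` is rejected by the compiler here, because `Pin<Box<Node>>`
    // only implements `DerefMut` when `Node` is `Unpin`.
}
```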
Nah not really. Pin is for self-referential data typically. It's compile time only so that information would get lost in C anyway, and there's no way to distinguish that data at runtime.
The kernel is doing so much anyway with memory maps and flipping in / out pages for scheduling and context switching that Pin doesn't add any value in such cases anyway.
It was also specifically built for async rust. I've never personally seen it in the wild in any other context.
If you’re writing C and don’t track ownership of values, you’re in a world of hurt. Rust makes you do from day one what you could do in C but unless you have years of experience you think it isn’t necessary.
Okay, I think it is more like TypeScript. You hate it, but one day you write a small JS program and convert it to TypeScript, only to discover that static analysis alone reveals many code paths that would have resulted in uncaught errors, and after that you always feel very uncomfortable writing plain JavaScript.
But what about tools like valgrind in context of C?
Valgrind can only tell you about issues that your testcases exercise. It doesn't provide the same guarantees as static checking of memory safety invariants. But if you're really concerned (especially about unsafe code), belt-and-braces is a good strategy, and valgrind will work with Rust binaries as well. Rust also has a tool called Miri which can similarly flag up issues in testcases. It's effectively an interpreter for the intermediate representation in the compiler, and it can detect undefined behaviour even if the compiled assembly would happen to look OK. It still has the same limitation of needing extensive testcases, though.
A few years ago I was working on a Byzantine mess of a JS project. I converted everything to TS for the sole reason of somewhat safely refactoring the project as a whole.
There was so little trust, given the fragility of the original, that it took a few months to convince everyone the refactored TS branch was safe.
After that, feature development was a lot faster in terms of productivity again.
You probably should run your rust programs through valgrind regardless. Rust is safer than C, but any unsafe code drops you to approximately C level of safety and any C FFI calls are obviously outside of rust's control or responsibility.
Valgrind is great, especially if you write extensive tests and you actually run them through it regularly. And even then, it does not prove the absence of any kind of bugs. Safe rust has strong guarantees.
frama-c
It was true until LLMs arrived. Future compilers + IDEs can be integrated with LLMs to help programmers.
Rust was a great idea, before LLMs, but I don't see the motivation for Rust when LLMs can be the initial solution for C/C++ 'problems'.
Relying on LLMs to code for you in no way solves the safety problem of C/C++ and probably worsens it.
Rust compiler checks things for you. People trust the Rust compiler because it enforces rules they want, so people don’t have to be in its place. Your suggestion is to be that checker to LLM-generated code. Back to square one.
On the contrary LLMs make using safe but constraining languages easier - you can just ask it how to do what you want in Rust, perhaps even by asking it to translate C-ish pseudocode.
I don’t entirely agree, you can get used to the borrow checker relatively quickly and you mostly stop thinking about it.
What tends to make Rust complex is advanced use of traits, generics, iterators, closures, wrapper types, async, error types… You start getting these massive semi-autogenerated nested types, the syntax sugar starts generating complex logic for you in the background that you cannot see but have to keep in mind.
It’s tempting to use the advanced type system to encode and enforce complex API semantics, using Rust almost like a formal verifier / theorem prover. But things can easily become overwhelming down that rabbit hole.
It's just overengineered. Many Rust folks don't realize it because they come from C++ and suffer from Stockholm Syndrome.
How is it overengineered?
I always feel Arc is the admission that the borrow checker with different/overlapping lifetimes is too difficult, despite what many Rust developers - who liberally use Arc - claim.
Lifetime tracking and ownership are very difficult. That's why languages like C and C++ don't do it. It's also why those languages need tons of extra validation steps and analysis tools to prevent bugs.
Arc is nothing more than reference counting. C++ can do that too, and I'm sure there are C libraries for it. That's not an admission of anything, it's actually solving the problem rather than ignoring it and hoping it doesn't crash your program in fun and unexpected ways.
Using Arc also comes with a performance hit because validation needs to be done at runtime. You can go back to the faster C/C++ style data exchange by wrapping your code in unsafe {} blocks, though, but the risks of memory corruption, concurrent access, and using deallocated memory are on you if you do it, and those are generally the whole reason people pick Rust over C++ in the first place.
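For anyone who hasn't looked at it, the "nothing more than reference counting" part can be seen directly with `Arc::strong_count`. This is just a small sketch; the `counts` helper is made up for the example.

```rust
use std::sync::Arc;

// Returns the strong count observed at three points in time.
fn counts() -> (usize, usize, usize) {
    let a = Arc::new(String::from("shared"));
    let initial = Arc::strong_count(&a);

    // Cloning an Arc copies the pointer and atomically bumps the counter;
    // the String itself is not duplicated.
    let b = Arc::clone(&a);
    let with_clone = Arc::strong_count(&a);

    // Dropping a handle atomically decrements the counter again.
    drop(b);
    let after_drop = Arc::strong_count(&a);

    (initial, with_clone, after_drop)
}

fn main() {
    assert_eq!(counts(), (1, 2, 1));
}
```

The atomic increment and decrement are the runtime cost; the String is freed when the last handle goes away.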
Looking at the code, it consists of long chains of get().unwrap().to_mut().unwrap().get() noise. Looks more like coping with library design than ownership tracking. Also why Result<Option<T>>? Isn't Result already Option by itself? I guess that's why you need get().unwrap().to_mut() to get a value from Result<Option<T>> from an average function call?
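On the Result<Option<T>> question: the two layers usually answer different questions. Result says whether the operation itself failed, Option says whether there was anything to find. A small sketch, with a hypothetical `find_port` config lookup invented for illustration:

```rust
use std::num::ParseIntError;

// Hypothetical lookup:
//   Ok(Some(p)) -> a port line exists and parsed fine
//   Ok(None)    -> no port line at all (not an error)
//   Err(e)      -> a port line exists but is malformed
fn find_port(config: &str) -> Result<Option<u16>, ParseIntError> {
    for line in config.lines() {
        if let Some(raw) = line.strip_prefix("port=") {
            return raw.trim().parse::<u16>().map(Some);
        }
    }
    Ok(None)
}

// `?` propagates the error case; `unwrap_or` supplies a default for the
// absent case. No chain of unwrap() calls is needed.
fn port_or_default(config: &str) -> Result<u16, ParseIntError> {
    Ok(find_port(config)?.unwrap_or(8080))
}

fn main() {
    println!("{:?}", find_port("host=a\nport=22"));
    println!("{:?}", find_port("host=a"));
    println!("{:?}", find_port("port=abc"));
    println!("{:?}", port_or_default("host=a"));
}
```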
It's not that the borrow checker is too difficult, it's that it's too limiting.
The _static_ borrow checker can only check what is _statically_ verifiable, which is but a subset of valid programs. There are few things more frustrating than doing something you know is correct, but that you cannot express in your language.
I've found the opposite. Every time I attempt to subvert the borrow checker, I eventually discover that I'm attempting to write a bug.
Kernels are where it gets particularly difficult (and I suspect database engines might be added to the list, since they seem to have similar requirements to be both scalable and deal with massive amounts of shared state, but I'm not overly familiar with them).
Several kernels for example use type-stable memory, memory that is guaranteed to only hold objects of a particular type, though perhaps only providing that guarantee for as long as you hold an RCU read-lock (this is the case in Linux with SLAB_TYPESAFE_BY_RCU). It is possible in some cases to be able to safely deal with references to objects where the "lifetime" of the referent has ended, but where by dint of it being guaranteed to be the same type of object, you can still do what you want to do.
This comes in handy when you have a problem that commonly appears in kernels where you need to invert a typical lock ordering (a classic case is that the page fault codepath might want to lock, say, VM object then page queue, but the page-replacement codepath will want to lock page-queue then VM object.)
Unfortunately it's hard to think of how the preconditions for these tricks could be formally expressed.
It's not just difficult, sometimes it's impossible to statically know the lifetime of a value, so you must track it dynamically. Arc is one such tool.
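A sketch of the kind of case meant here: worker threads finish in an order nobody knows at compile time, so "who frees the data" has to be decided at runtime. The `run_workers` helper is invented for the example, and the `Weak` handle is only there to observe when the allocation goes away.

```rust
use std::sync::{Arc, Weak};
use std::thread;

// Spawns `n` workers that each sum the shared data. Returns the combined
// total and whether the data was freed once every owner was gone.
fn run_workers(n: usize) -> (u64, bool) {
    let data = Arc::new(vec![1u64, 2, 3]);
    // A Weak handle does not keep the data alive; it only lets us check.
    let observer: Weak<Vec<u64>> = Arc::downgrade(&data);

    let handles: Vec<_> = (0..n)
        .map(|_| {
            let local = Arc::clone(&data);
            thread::spawn(move || local.iter().sum::<u64>())
        })
        .collect();

    // The spawning side gives up its handle; workers may still hold theirs.
    drop(data);

    let total: u64 = handles.into_iter().map(|h| h.join().unwrap()).sum();

    // Every worker has finished and dropped its clone, so the last owner
    // (whichever thread that was) has freed the vector.
    (total, observer.upgrade().is_none())
}

fn main() {
    let (total, freed) = run_workers(4);
    assert_eq!(total, 24);
    assert!(freed);
}
```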
If tracking lifetimes is simple 90% of the time and complex 10% of the time, maybe a tool that lets you have them automatically managed (with some runtime overhead) that 10% of the time is the right way forward.
Then you can use a language and runtime like C# or Java. Or you can use patterns like Go promotes.
There are lots of options if you want.
Arc usually means you've not structured your code to have clear lifetimes where one object clearly outlives another. Typically I see C++ applications avoid it but actually suffer from bugs due to the same structural deficiencies. That said, I think it's almost always possible to avoid it if you try hard enough. With async you need to use structured concurrency.
A Rust lifetime is just a label for a region of memory holding various data, which is discarded at the end of its lifetime. When the compiler enters a function, it creates a memory block to hold the data of all variables in the function, and then discards this block on exit from the function, so these variables are valid for the lifetime of the function call only.