Comment by zozbot234
20 hours ago
One potential way to solve this in a principled manner is to turn at least some "unsafe" annotations into ghost capability tokens that are explicitly threaded through the code and consistently checked by the compiler. Manufacturing the capability could itself be left as an unsafe operation, or require a runtime check of some kind.
You already see this in some cases, for example the NonZero<T> generic type can be viewed as a T endowed with a capability or token that just says "this particular value of type T is nonzero, so the zero value is available for niche purposes". But this could be expanded a lot, especially with some AI assistance.
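A small sketch of the NonZero<T> example above, using the standard library's NonZeroU32: constructing the value requires a check, and the "nonzero" guarantee frees up the zero bit-pattern as a niche.

```rust
use std::num::NonZeroU32;

fn main() {
    // Manufacturing the "capability" requires a runtime check:
    // new() returns None for zero.
    let n = NonZeroU32::new(42).expect("42 is nonzero");
    assert_eq!(n.get(), 42);
    assert!(NonZeroU32::new(0).is_none());

    // The zero niche means Option<NonZeroU32> needs no extra tag byte:
    // it is the same size as a plain u32.
    assert_eq!(
        std::mem::size_of::<Option<NonZeroU32>>(),
        std::mem::size_of::<u32>()
    );
}
```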
This already happens all the time in Rust, including in the standard library. The typical pattern is to define your CheckedType as
pub struct CheckedType(UncheckedType);
i.e. with its inner field private. Then you only expose safe constructors that check your invariant, and only provide methods that maintain the invariant.
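As a sketch of that pattern (with a hypothetical Email type and a deliberately naive validity check), the private field forces all construction through the checking constructor:

```rust
mod email {
    // The inner String is private, so code outside this module
    // cannot build an Email without going through new().
    pub struct Email(String);

    impl Email {
        /// Safe constructor: checks the invariant before wrapping.
        /// (The check here is a placeholder, not real validation.)
        pub fn new(s: String) -> Result<Email, String> {
            if s.contains('@') {
                Ok(Email(s))
            } else {
                Err(s) // hand the rejected input back to the caller
            }
        }

        /// Methods may expose the inner value, but never break the invariant.
        pub fn as_str(&self) -> &str {
            &self.0
        }
    }
}

fn main() {
    assert!(email::Email::new("a@example.com".to_string()).is_ok());
    assert!(email::Email::new("not-an-email".to_string()).is_err());
}
```

Once the module boundary is in place, every Email in the program is known to have passed the check, without any token to thread around.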
For a concrete example, String in Rust is a Vec<u8> with the guarantee that the underlying bytes are valid UTF-8. Concretely, it is defined as
#[derive(PartialEq, PartialOrd, Eq, Ord)]
#[stable(feature = "rust1", since = "1.0.0")]
#[lang = "String"]
pub struct String {
    vec: Vec<u8>,
}
You can construct a string from a vec of bytes via
fn from_utf8(vec: Vec<u8>) -> Result<String, FromUtf8Error>;
as well as the unsafe method
unsafe fn from_utf8_unchecked(vec: Vec<u8>) -> String;
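Both constructors in use, showing where the invariant check happens in each case:

```rust
fn main() {
    // The safe constructor validates the bytes before wrapping them.
    let s = String::from_utf8(vec![0x68, 0x69]).unwrap();
    assert_eq!(s, "hi");

    // Invalid UTF-8 is rejected; the error hands the bytes back.
    let err = String::from_utf8(vec![0xFF]).unwrap_err();
    assert_eq!(err.into_bytes(), vec![0xFF]);

    // The unchecked constructor shifts the burden to the caller,
    // which is exactly what a SAFETY comment documents.
    // SAFETY: 0x68 0x69 is "hi", which is valid UTF-8.
    let s2 = unsafe { String::from_utf8_unchecked(vec![0x68, 0x69]) };
    assert_eq!(s2, "hi");
}
```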
Note that there isn't a separate capability/token here. Separate tokens are typically viewed as bad practice in Rust, since a token handed to you on the side can simply be ignored rather than checked. See for example Rust's Mutex<T>, which itself carries the data (T) you want access to: to get at the data, you must call .lock(). There is a similar philosophy behind Rust's `Result` type: to get the data underlying it, you must handle the possibility of an error somehow (which can of course include panicking upon detecting the error).
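The Mutex point in miniature: because the mutex owns the data, there is no code path that reaches the data without taking the lock first.

```rust
use std::sync::Mutex;

fn main() {
    // The Mutex owns the counter; the only way in is lock().
    let counter = Mutex::new(0u32);

    {
        // lock() returns a guard, and the data is only reachable
        // through that guard.
        let mut guard = counter.lock().unwrap();
        *guard += 1;
    } // guard dropped here, releasing the lock

    assert_eq!(*counter.lock().unwrap(), 1);
}
```

Contrast this with a lock that lives next to the data: nothing would stop you from touching the data while forgetting to take the lock.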
Yes, or you could review the code.
It’d only take an hour if you reviewed a million lines per hour
[Sorry guys, I couldn't review this code because I generated it all]
Even before AI, deterministic checks by compilers are almost always better than "review the code"
"review the code" as a solution will eventually fail and cause a problem, even pre-AI.
The entire point of unsafe blocks and SAFETY comments is that they are easy for humans to find and audit, but not compiler checkable. If it can be compiler-checked by some clever token system, then ... it's just plain safe rust, and you don't need to document any special safety invariants in the first place
even when you can review the code, it's good to have the compiler check for you. The reasons are similar to why it's better to have CI check correctness on every code change than to test the code thoroughly once and then be careful going forward.