Comment by petcat
1 day ago
Cool you can just search specifically for potentially unsafe code in Rust. How do you search for unsafe code in Zig? Or do you just have to assume it's everywhere?
1 day ago
Cool you can just search specifically for potentially unsafe code in Rust. How do you search for unsafe code in Zig? Or do you just have to assume it's everywhere?
If half of your code is unsafe then unless you exercise tremendous discipline (Claude basically doesn't) you will just end up with a big ball of unsafe, peppered with hallucinations in whatever random documentary comments Claude decided to make. I doubt they enforced the confinement of unsafe to a specific architectural layer or anything like that.
Aren't the Rust unsafes a reflection of the Zig it was ported from? However now that you're working with Rust, you're in a position to continue improving and eliminating the unsafes.
Plus I seem to recall the Rust community solved this issue by making tooling that proofs if unsafe code is truly unsafe, I remember one of the concurrent frameworks got scanned and people freaked out, the creator was about to abandon ship entirely as a result, don't recall what fully came of it. Anyway my overall point being, if there's already tooling to find the truly unsafe / bad code, it might make fixing it simpler / quicker to accomplish.
4 replies →
There is a qualitative difference between unsafe Rust and Zig as far as I know.
In principle static analysis is possible. (Note: WIP)
https://github.com/ityonemo/clr
if half of your files in a million line codebase are unsafe that doesn't tell you much any more. Presumably the point of a Rust rewrite is that you actually make use of Rust's safety features in a coherent way.
But given the whole "let AI rewrite this for me" stunt nature of this project that was not going to happen because that would require well, actual thinking and a re-design. So now you have Zig disguised as Rust and a line-by-line port because the semantics of idiomatic Rust don't map on the semantics of Zig.
>if half of your files in a million line codebase are unsafe that doesn't tell you much any more.
If half of your files in the first pass of a million line rewrite are unsafe then that's completely fine. Do you understand what the tag actually is? It doesn't even mean that the code is actually unsafe, just that the compiler can't guarantee its safety, which can happen for a number of reasons, some benign.
Who rewrites a 700K codebase trying to be idiomatic from the get go ? That's setting yourself up for failure, whether you're a human or a machine.
A 1:1 translation, warts and all, is the _only_ foolproof way to do a language to language rewrite. Anything else in a non-trivial codebase is almost guaranteed to introduce regressions.
And? This is absolutely the correct and standardized way to do mechanical rewrites: you do a rewrite that maps directly to the original source so you can rely on the original correctness guarantees and bug-for-bug compatibility and log issues, and then you go into the next phase where you begin to use idiomatic constructs.
This is the same in COBOL-to-Java ports that have been done in banking and insurance for the past 20 years.
If the rewrite was zig to C and half the code was in __asm blocks is that different or the same?
COBOL to Java is a completely different thing and pretty much unrelated.
Rust can easily call C libraries and vice versa and so can Zig. A more appropriate and designed rewrite would identify the core pieces of the Zig code that were the primary sources of all the big issues. Then, you rewrite that component in Rust and verify that you get the expected improvements. That keeps the codebase stable, it keeps you honest on actually reducing bugs and issues, and other benefits. Then you either just keep it that way or slowly rinse and repeat.
Without doing the analysis of what the core issues were in the first place, the author of Bun can make no claims towards the rewrite. He claims to have fixed flaky tests and improved memory safety. Where is the analysis that shows this? Where is the proof and data? Does he even know where the issues in the Zig codebase were at? I saw a commit where a test had a one second sleep put in place.
Compare this to say the Racket rewrite where a significant portion of the C core was replaced by Chez Scheme and Racket itself. There were several blog posts doing both pre- and post-analysis, and Racket has far less users than Bun.
This rewrite is totally unprofessional and has been poorly and even antagonistically communicated. The author was on this site just days ago telling everyone to relax and that he'd probably throw out this code, and that was even after it had been brought up that this wasn't pre-communicated to users. If I was a dependent in Bun, I would migrate off immediately.
So I push back in the idea that this is the way to do a rewrite like this.
>This is the same in COBOL-to-Java ports
it isn't, because those guys didn't think a naive 1-1 machine translation would give them the benefits of Java, which somehow the people involved in this rust rewriting seem to think they've already gained despite the virtually identical code.
If the whole point genuinely would have been to do a purely mechanical translation they could and should have written a transpiler, which would have had significantly higher correctness guarantees than this given that it'd be deterministic, but of course that would have defeated the PR purpose of this whole thing, which just looks like a marketing for Anthropic frankly
9 replies →
[dead]
It's worth pointing out that "unsafe" in rust is not a very sound concept - it's not like a monad or "function colour" whereby the compiler can say "this code ultimately calls unsafe". It's more like a comment on steroids; you call unsafe in a function, write a comment about it, and no caller of that function would have any idea that it's calling unsafe code.
Yes, the point of unsafe is that you promise it's safe, you promise to preserve the necessary invariants to make it safe to call no matter from where. It was never supposed to "taint" all code that calls it, that would defeat its purpose. It's sound enough, it's just not at all trying to do that.
Yes I understand what they were trying to do and this is the ugly hack they came up with.
I just don't like it. I am not ignorant of their intentions, it just does not work well.
Unsafe code is normal. Trying to hide it is unsound. And I stand by that.