Comment by josephg

5 months ago

> When you program with totally unsafe languages, you develop more strategies than just relying on a type checker and borrow checker: RAII, "crash early", TDD, completion-compatible naming conventions, even syntax highlighting (coloring octal numbers differently)...

Having written a fair bit of rust and C, I don't consider the tools for safety in C to be good enough.

In C, it's so easy for small mistakes to turn into CVEs. ASAN and friends help, but they're a long way from perfect. Testing helps too. But in C, a fair bit of time usually passes between when I make a mistake and when I discover the bug through testing. It's also easy for bugs to hide in the cracks of UB.

One of my clearest experiences with C and rust was a rope library I wrote several years ago. Ropes are "fancy strings". They're strings, but they support O(log n) insert & delete, at arbitrary positions. I wrote my library in pure C, implemented on top of a skip list. The code is very subtle - like, there's a lot of very careful logic. A single incorrect line of code will often cause silent data corruption or memory errors that don't show up until much later.

It took about as long to properly test & debug the library as it took to write it in the first place. Debugging it was exhausting - there were a myriad of obscure edge cases that I needed fuzzing to track down. When the fuzzer found problems, going from a failing fuzzer trace to a code fix was a big job.

Before I started, I had a bunch of optimisations in mind that I wanted to add to the library. But it was so exhausting getting it working at all that I never got around to most of them. Eg I wanted to make each node in the skip list into a gap buffer to reduce memcopies. But implementing that would have required significant code changes - which in turn would have meant a new round of memory bugs and debugging. I never brought myself to do it.
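For anyone who hasn't seen a gap buffer, here is a minimal sketch of the idea in rust. It is not the code from librope or jumprope-rs. The `GapBuffer` type, its fixed capacity and its method names are all assumptions made up for illustration. The unused space sits in the middle of the array, so repeated inserts at the same spot don't have to shift the tail of the text each time.

```rust
/// A fixed-capacity gap buffer over bytes.
/// Hypothetical sketch, not taken from librope or jumprope-rs.
pub struct GapBuffer {
    buf: Vec<u8>,     // layout: [text before gap][gap][text after gap]
    gap_start: usize, // index of the first unused byte
    gap_end: usize,   // one past the last unused byte
}

impl GapBuffer {
    pub fn with_capacity(cap: usize) -> Self {
        GapBuffer { buf: vec![0; cap], gap_start: 0, gap_end: cap }
    }

    /// Number of bytes of actual content (everything except the gap).
    pub fn len(&self) -> usize {
        self.buf.len() - (self.gap_end - self.gap_start)
    }

    /// Move the gap so that it starts at logical position `pos`.
    fn move_gap(&mut self, pos: usize) {
        assert!(pos <= self.len());
        if pos < self.gap_start {
            // Shift the bytes in [pos, gap_start) up to the end of the gap.
            let n = self.gap_start - pos;
            self.buf.copy_within(pos..self.gap_start, self.gap_end - n);
            self.gap_start -= n;
            self.gap_end -= n;
        } else if pos > self.gap_start {
            // Shift the first n bytes after the gap down to the gap's start.
            let n = pos - self.gap_start;
            self.buf.copy_within(self.gap_end..self.gap_end + n, self.gap_start);
            self.gap_start += n;
            self.gap_end += n;
        }
    }

    /// Insert `data` at `pos`. Returns false if the gap is too small.
    pub fn insert(&mut self, pos: usize, data: &[u8]) -> bool {
        if data.len() > self.gap_end - self.gap_start {
            return false;
        }
        self.move_gap(pos);
        self.buf[self.gap_start..self.gap_start + data.len()].copy_from_slice(data);
        self.gap_start += data.len();
        true
    }

    /// Delete `n` bytes starting at `pos` by growing the gap over them.
    pub fn delete(&mut self, pos: usize, n: usize) {
        assert!(pos + n <= self.len());
        self.move_gap(pos);
        self.gap_end += n;
    }

    /// Copy the content out, skipping the gap.
    pub fn to_vec(&self) -> Vec<u8> {
        let mut out = self.buf[..self.gap_start].to_vec();
        out.extend_from_slice(&self.buf[self.gap_end..]);
        out
    }
}
```

Everything here is safe rust, so an off-by-one in `move_gap` shows up as a panic at the faulty line rather than as silent corruption later on. In a rope, each node would presumably hold one small buffer like this instead of a plain array.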

At some point I rewrote the library in rust, with liberal use of raw pointers. I made just as many mistakes in the implementation - though the compiler caught a lot of them. The first time I ran it, it segfaulted. And I thought "here we go again". But despite using raw pointers, there were only 2 unsafe functions in the whole program. A segfault in rust can only happen from unsafe code. So I took a read of that code - and lo and behold, there was my bug, plain as day. Time to fix: 2 minutes. The library never segfaulted again in all my testing. The first time I benchmarked it, it was ~10% faster than the C version. I still have no idea why.
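To illustrate why the search space for a segfault is so small, here is a hypothetical sketch of a raw-pointer data structure in rust. It is a simple linked queue, not the skip list from jumprope-rs, and the `Node` and `Queue` names are assumptions for the example. The raw pointers are stored and passed around in safe code. Only the two places that dereference or reclaim them need `unsafe`, so those are the only lines that need auditing when memory goes wrong.

```rust
use std::ptr;

// Hypothetical example types, not from jumprope-rs.
struct Node {
    value: u32,
    next: *mut Node, // raw pointer, much like a `next` pointer in a skip list
}

pub struct Queue {
    head: *mut Node,
    tail: *mut Node,
}

impl Queue {
    pub fn new() -> Self {
        Queue { head: ptr::null_mut(), tail: ptr::null_mut() }
    }

    /// Append a value at the tail.
    pub fn push(&mut self, value: u32) {
        // Creating and storing a raw pointer is safe. Only dereferencing is not.
        let node = Box::into_raw(Box::new(Node { value, next: ptr::null_mut() }));
        if self.tail.is_null() {
            self.head = node;
        } else {
            // SAFETY: tail is non-null and points at a live node owned by this queue.
            unsafe { (*self.tail).next = node; }
        }
        self.tail = node;
    }

    /// Remove and return the value at the head, if any.
    pub fn pop(&mut self) -> Option<u32> {
        if self.head.is_null() {
            return None;
        }
        // SAFETY: head is non-null and was produced by Box::into_raw in push.
        let node = unsafe { Box::from_raw(self.head) };
        self.head = node.next;
        if self.head.is_null() {
            self.tail = ptr::null_mut();
        }
        Some(node.value)
    }
}

impl Drop for Queue {
    // Free any remaining nodes by popping them.
    fn drop(&mut self) {
        while self.pop().is_some() {}
    }
}
```

If this queue ever crashed with a memory error, the two `unsafe` blocks and the invariants written above them would be the first place to look. In the equivalent C, every line that touches a pointer is a suspect.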

It was so much easier to write that a little while later, I put the gap buffer optimisation in. Now the rust library is 2-3x faster than C. In this case, memory safety made my program easier to write. And that resulted in better performance.

If anyone is curious, C / rust code is here:

https://github.com/josephg/librope

https://github.com/josephg/jumprope-rs

> BUT. the cultural characteristics of the programmers are only one-quarter of the story. The bigger part is about company culture, and more specifically the availability of programmers.

Yeah absolutely. I think this is the biggest downside of rust. Rust is really hard - and painful - to learn. It front-loads all the pain. In C, you suffer while debugging. In rust, all that suffering happens while learning the language in the first place. I spent months fighting the borrow checker. And it's very demotivating not being able to compile your program at all. Once you understand it, it makes sense. But I think there will always be a limited pool of programmers willing to struggle through.

Even chatgpt is bad at rust. It makes all sorts of classic beginner mistakes with lifetimes, and the resulting code often won't compile. Even after pointing out the problem, chatgpt is often unable to correct lifetime bugs.
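For a sense of the kind of mistake meant here, this is the textbook example of a beginner lifetime error. It is an illustration only, not an actual chatgpt transcript, and the `longest` function is a hypothetical helper. The commented-out version is rejected by the compiler, and the version below it compiles.

```rust
// Doesn't compile: the compiler can't tell which input the returned
// reference borrows from (error E0106, missing lifetime specifier).
//
// fn longest(a: &str, b: &str) -> &str {
//     if a.len() >= b.len() { a } else { b }
// }

// Compiles: the lifetime 'a ties the result to both inputs, so the
// returned reference can't outlive either of them.
fn longest<'a>(a: &'a str, b: &'a str) -> &'a str {
    if a.len() >= b.len() { a } else { b }
}
```

The fix is one annotation here. The hard cases are the same problem spread across structs and several functions, where it's much less obvious which borrow the compiler is complaining about.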
