Comment by codedokode

3 years ago

I think it was a mistake when CPU and language designers decided to make overflow in arithmetic operations not an error. This is wrong. Overflow can produce invalid data, which can break something elsewhere. Overflow has been a reason for security vulnerabilities.

For example, you don't want an overflow when calculating the total cost of ordered goods or the size of an allocated memory block.

One could argue that it is the developer's responsibility to make sure overflow doesn't happen, but history shows that developers usually fail to protect their programs from this kind of error. We have memory protection, so why can't we have overflow protection?

In my opinion, an overflow should cause an exception unless the program explicitly asks to ignore it. Of course this means that every addition now becomes a (very unlikely) conditional branch, but I guess this can be optimized by static prediction. Every memory access is already a conditional branch, and it doesn't cause problems.
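The cost described above can be sketched in Rust. This is only an illustration, not a claim about how any compiler or CPU implements it: `add_or_trap` is a hypothetical helper name, while `overflowing_add` is the standard-library method that returns the wrapped sum together with an overflow flag. The addition itself is unchanged; the extra work is one branch that is almost never taken.

```rust
/// Hypothetical helper: add two values and trap (panic) on overflow.
fn add_or_trap(a: u32, b: u32) -> u32 {
    // `overflowing_add` yields the wrapped sum plus a flag saying
    // whether the true result did not fit in 32 bits.
    let (sum, overflowed) = a.overflowing_add(b);
    if overflowed {
        // The "very unlikely" conditional branch.
        panic!("arithmetic overflow in {} + {}", a, b);
    }
    sum
}
```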

But CPUs and programming languages seem to make writing overflow-safe code more difficult than writing unsafe code. In assembly, you have to insert additional branch instructions, and in business-oriented languages like Java you have to write long calls like Math.addExact(). If you write a complicated formula this way, it becomes unreadable.

Sadly, modern languages like Rust haven't fixed the mistake and also penalize writing safe code.

The only language I know with overflow protection is Swift: an overflow causes a runtime error (a trap) there. Well done, Apple, you are light-years ahead of open source software.

dist1ll 3 years ago

If your unsafe code is sound, you can't cause memory corruption with wrapping integer overflows, no matter what you do with them. If your Rust code is engineered correctly, overflows are primarily logic bugs, not memory safety issues.
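A small sketch of that distinction, assuming safe Rust only: `element_at` is a hypothetical function whose index arithmetic is deliberately allowed to wrap. The wrapped index can select the wrong element (a logic bug), but the bounds-checked `get` means it can never read outside the slice.

```rust
/// Hypothetical lookup whose index arithmetic is allowed to wrap.
fn element_at(data: &[u8], base: usize, offset: usize) -> Option<u8> {
    // May wrap around silently: this is the logic bug.
    let index = base.wrapping_add(offset);
    // `get` is bounds-checked, so a bad index yields None
    // instead of an out-of-bounds read.
    data.get(index).copied()
}
```

With `data = [10, 20, 30]`, calling `element_at(&data, usize::MAX, 2)` wraps the index to 1 and returns `Some(20)`, which is wrong but in bounds; `element_at(&data, usize::MAX, 5)` wraps to 4 and returns `None`.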

> Every memory access is already a conditional branch, and it doesn't cause problems.

What are you referring to here? The MMU/TLB?

> Sadly, modern languages like Rust haven't fixed the mistake and also penalize writing safe code

Writing safe code is smooth as butter. You just follow borrowing and lifetime rules and you get memory safety basically for free. Same with safe concurrent data structures.

Unsafe code is the one penalized. The amount of boilerplate and unstable features I need to use in Rust for ergonomic unsafe code is pretty staggering.

MiguelX413 3 years ago

What's wrong with checked_add() and its friends in Rust?
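For context, here is a rough sketch of what `checked_add()` and its friends return for the same inputs. `add_all_ways` is a hypothetical name used only for this illustration; the four methods are the standard ones on Rust's integer types.

```rust
/// Hypothetical helper: compute `a + b` under each overflow policy.
fn add_all_ways(a: u8, b: u8) -> (Option<u8>, u8, u8, (u8, bool)) {
    (
        a.checked_add(b),     // None on overflow
        a.wrapping_add(b),    // wraps modulo 2^8
        a.saturating_add(b),  // clamps to u8::MAX
        a.overflowing_add(b), // wrapped value plus an overflow flag
    )
}
```

For example, `add_all_ways(250, 10)` gives `(None, 4, 255, (4, true))`, while `add_all_ways(1, 2)` gives `(Some(3), 3, 3, (3, false))`.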

codedokode 3 years ago
They are longer to type than `+`.
It is easy to write unsafe code and difficult to write safe code. As a result, while most mathematical operations must be safe, developers use unsafe operations because they are easier to type and more readable.
Operators like `+` should be equivalent to checked_add().unwrap(), not to wrapping_add(), which is what they amount to in release builds by default. Why choose an unsafe and rarely needed operation as the default? It is a mistake. Swift uses sane defaults.
UPD: for example, let's say we need to allocate memory for N elements of size S and a header of size H. Here is the unsafe code in Rust:
let s: u32 = N * S + H;
And here is the safe code:
let s: u32 = N.checked_mul(S).unwrap().checked_add(H).unwrap();
And here is the safe code in Swift:
let s: UInt32 = N * S + H
And the unsafe code in Swift:
let s = N &* S &+ H
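As a hedged aside on the allocation-size example: in a Rust function that returns `Option`, the checked version can lean on the `?` operator instead of repeated `unwrap()` calls. `alloc_size` and `alloc_size_wrapping` are hypothetical names for this sketch; the second one shows what the wrapping arithmetic produces for the same inputs.

```rust
/// Hypothetical helper: size of `n` elements of size `s` plus a header `h`.
/// Returns None if any intermediate step overflows u32.
fn alloc_size(n: u32, s: u32, h: u32) -> Option<u32> {
    n.checked_mul(s)?.checked_add(h)
}

/// Same formula with wrapping arithmetic, for contrast.
fn alloc_size_wrapping(n: u32, s: u32, h: u32) -> u32 {
    n.wrapping_mul(s).wrapping_add(h)
}
```

With `n = 0x4000_0001`, `s = 4`, `h = 16`, the checked version returns `None`, while the wrapping version returns 20, a far smaller block than the caller intended.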
- mlindner 3 years ago
  
  In the "unsafe" example in Swift can you get to undefined behavior?
  In your "unsafe" example in Rust, you still don't get undefined behavior.
  https://huonw.github.io/blog/2016/04/myths-and-legends-about...
  

scoutt 3 years ago

> CPU and language designers

Which CPU? There are a lot of CPU architectures and their variants, some older than 40 years, and C is supported on most of them. Each CPU has its use case. Also, safety is not always a top requirement.

But I also think that true safety will come at the silicon level, and not from a programming language.

codedokode 3 years ago
> Also, safety is not always a top requirement.
Safety is a requirement for desktop, mobile, and server processors, but x86 and ARM penalize using safe math operations (you have to insert branch instructions).
> There are a lot of CPU architectures and their variants, some older than 40 years, and C is supported on most of them.
Between safety on modern CPUs and supporting 40-year-old CPUs, I would choose the former.
- scoutt 3 years ago
  
  > Between safety on modern CPUs and supporting 40-year-old CPUs, I would choose the former.
  I don't know what you mean by "modern", but x86, for example, was first introduced in 1978, with 32-bit following in 1985 and 64-bit in 2003.
  There may be a water purifier somewhere running on a 1995 x86 box that still needs to be maintained.
  > Safety is a requirement for desktop, mobile, and server
  There are several degrees of "safe", and I don't mean just "a hacker stole my data".
  I am sure you hear about internet services crashing all the time. A failing service could be a safety issue too. And no, they are not written in C. There are entire companies living on providing redundancy and watchdog services. Often companies just go with "I don't care if the program crashes, we put another server up, just keep developing new features in whatever safe language we are using".
  

UncleEntity 3 years ago

> Every memory access is already a conditional branch, and it doesn't cause problems.

Is that how it actually works or is it the kernel/hardware not letting you dereference 0?

codedokode 3 years ago

I mean that every memory access can cause an exception, and handling an exception is a kind of branch.