Comment by jchw
5 hours ago
I wonder if some day we'll look back differently on the "billion-dollar mistake" thing. The key problem with null references is that they force you either to check any given reference for null when you don't already know, or to rely on a contract that it can't be null. Not having null references really does solve that problem, but in everyday programs you still often wind up in situations where you can know from the outside that some function will return a non-empty value, while the function itself can't make that guarantee in a way the compiler can enforce; in those cases, you have no choice but to face the same dilemma. In Rust this plays out with `unwrap()`, which most reasonably-sized codebases end up with some of in practice. You could always forbid it, but that's only somewhat of an improvement, because in a lot of cases there's nothing logical to do once the invariant has failed. (Though for critical production workloads, it's probably a good idea to find something to do other than let the program crash outright, even if it's still treated as an emergency.)
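Here's a minimal Rust sketch of what I mean (the names `find_user`/`greet` are just made up for illustration):

```rust
use std::collections::HashMap;

// The type system can only say "maybe there's a value here".
fn find_user(users: &HashMap<u32, String>, id: u32) -> Option<&String> {
    users.get(&id)
}

fn greet(users: &HashMap<u32, String>, id: u32) {
    // The caller "knows" the id was just inserted, but the compiler
    // can't see that invariant, so we're back to the same dilemma:
    // unwrap() (panic if we're wrong) or invent some fallback.
    let name = find_user(users, id).unwrap();
    println!("hello, {name}");
}

fn main() {
    let mut users = HashMap::new();
    users.insert(1, "alice".to_string());
    greet(&users, 1); // fine today; a refactor elsewhere could silently break the invariant
}
```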
In other words: after all this time, I feel that Tony Hoare framing null references as the billion-dollar mistake may be overselling it at least a little. Making references non-nullable by default is an improvement, but the same problem plays out so long as you ever have a situation where the type system can't guarantee the presence of a value you "know" must be there. (And even with formal specifications/proofs, I'm not sure we'll ever reach the point where such a proof is always feasible.) The only real question is how much of the problem is solved by not having null references, and I think it's less than people acknowledge.
(edit: Of course, it might actually be possible to quantify this, but I wasn't able to find publicly-available data. If any organization were positioned to, I reckon it would be Uber, since they've developed and deployed both NullAway (Java) and NilAway (Go). But sadly, I don't think they've published any figures on the number of NPEs/panics before and after. My guess is that it's split: it probably did help some services significantly reduce production issues, but I bet it's even better at catching, even earlier, the kinds of bugs that would likely have been caught pre-production anyway.)
I think Hoare is bang on, because the similar values we know of in many languages are also problematic, even though they're not related to memory.
NaNs are, as their name indicates, not numbers. So the fact that this 32-bit floating-point parameter might be NaN, which isn't even a number, is as unhelpful as discovering that the Goose you were passed as a parameter is null (i.e. not actually a Goose at all).
There's a good chance you've run into at least one bug where, oops, that value is NaN, and now the NaN has spread and everything is ruined.
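A quick Rust illustration of that spread:

```rust
fn main() {
    // 0.0 / 0.0 is the classic way to mint a NaN.
    let nan = 0.0f32 / 0.0;

    // Once produced, NaN contaminates every arithmetic result downstream...
    let total = nan + 1.0;
    println!("{total}");            // NaN
    println!("{}", total * 100.0);  // NaN

    // ...and it isn't even equal to itself, so naive equality checks miss it.
    println!("{}", total == total); // false
    println!("{}", total.is_nan()); // true: the reliable test
}
```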
The IEEE NaNs are baked into the hardware everybody uses, so it will be harder to break away from this situation than from the Billion Dollar Mistake, but it's clearly not a coincidence that this type of problem occurs for other types too, so I'd say Hoare was right on the money, and that we're finally moving in the correct direction.