Comment by WalterBright

16 hours ago

NaNs are a very underappreciated feature of IEEE-754 floating point. In the D programming language, floats get default initialized to NaN, not to 0.0.

    double y = 0.0; // initialized to 0.0
    double x; // initialized to NaN

The discussion routinely comes up as "why not default initialize to 0.0?" The reason is that forgetting to initialize a variable is a routine programming mistake. With a default of 0.0, one may never realize that the results of the floating point calculation are wrong. But with NaN, the result of the computation will be NaN, which is unlikely to go unnoticed.
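
A minimal sketch of the difference, in D. The `average` helper and the variable names are made up for illustration; only the default-initialization behavior comes from the comment above.

```d
import std.math : isNaN;

// Hypothetical helper, used only to have some arithmetic to run.
double average(double a, double b)
{
    return (a + b) / 2.0;
}

void main()
{
    double forgotten;      // no initializer: D sets this to double.nan
    double zeroed = 0.0;   // what a zero default would have given instead

    // The NaN propagates through the arithmetic, so the mistake shows up.
    assert(isNaN(average(forgotten, 10.0)));

    // With 0.0 the result is a plausible-looking number that may be wrong.
    assert(average(zeroed, 10.0) == 5.0);
}
```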

I don't know of any other programming language with this safety feature.

Also, the D `char` type is initialized to 0xFF, not 0, because Unicode says that 0xFF is an invalid character.

23 comments

p1necone 15 hours ago

Just requiring explicit assignment before first use feels like the superior approach to automatic initialization, regardless of whether the automatic initialization is with 0 or with NaN.

WalterBright 15 hours ago
That suggestion is often made.
The trouble with it is a bug I've seen often. People will get an error message about an "uninitialized variable". Then they go into "just get the compiler to shut up" mode, and pick "0" as the initializer. Then, the program compiles and runs, and silently produces the wrong answer. Code reviews will simply pass over the "0" initializer, as it looks right.
With default NaN initialization, the programmer is more likely to stop and think about it, not just insert 0.
Another issue with it is:

    float x = 0.0;
    setFloat(&x);

    void setFloat(float* px) { *px = 3.0; }

For the purposes of code clarity I don't want to see a variable initialized to a value that is never used, just to shut the compiler up.
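
For comparison, a sketch of how the same snippet can be written in D without the dummy initializer. `setFloat` is taken from the comment above; the `main` wrapper and the checks are added for illustration.

```d
import std.math : isNaN;

void setFloat(float* px) { *px = 3.0; }

void main()
{
    float x;              // no throwaway "0.0": x holds float.nan until it is set
    assert(isNaN(x));

    setFloat(&x);         // the callee supplies the real value
    assert(x == 3.0f);
}
```
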
- ncurses1010 12 hours ago
  
  With the default initialization to nan, do you ever run into situations where people are searching for common sources for nan (nan literals, div by zero) and they can't find it? Or cases where only some branches but not others initialize the float?
  
billforsternz 10 hours ago
How long did you think about this before making this declaration? How long did Walter Bright think about this before making his decision when designing his language? Not saying you're wrong, just something to think about perhaps.
- electroly 10 hours ago
  
  C# requires explicit assignment. If an appeal to authority sways you (it shouldn't), you can substitute Anders Hejlsberg instead of this random OP. How long do you suppose Anders Hejlsberg thought about this?
  But I contend it's more useful (and interesting) to think about the idea with your own mind instead of tallying up the perceived authority of its supporters and relying on trust. It was also somewhat rude to suggest that the OP had not given their idea much thought. This is a forum for discussion, isn't it?
  
- WalterBright 10 hours ago
  
  Thank you. I've made many counter-intuitive decisions based on long experience. Sometimes I just have to say "trust me".
  Like not allowing macros in D, or version algebra.
lmm 12 hours ago
Yep. This is NaN as a billion dollar mistake all over again.
- WalterBright 8 hours ago
  
  Unrecognized subtle errors in floating point calculations are worse problems.
  

WalterBright 15 hours ago

Another crucial use of NaNs is if you have a sensor. If the sensor has failed, the sensed value should be transmitted as NaN, not 0, so the receiver knows the data is bad.
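
A rough sketch of what this can look like on the receiving side, in D. The function names and the sample values are hypothetical; the point is only that a NaN marks a bad sample, while a substituted 0 quietly skews the result.

```d
import std.math : isNaN;

// Hypothetical receiver: plain mean over every sample it is given.
double naiveMean(const double[] samples)
{
    double sum = 0.0;
    foreach (s; samples)
        sum += s;
    return sum / samples.length;
}

// Hypothetical receiver: mean over valid samples only, NaN if none are valid.
double meanOfValid(const double[] samples)
{
    double sum = 0.0;
    size_t count = 0;
    foreach (s; samples)
    {
        if (!isNaN(s))
        {
            sum += s;
            ++count;
        }
    }
    return count == 0 ? double.nan : sum / count;
}

void main()
{
    // Second sensor failed and was reported as NaN.
    double[] withNaN = [20.5, double.nan, 21.5];
    assert(isNaN(naiveMean(withNaN)));       // failure is visible
    assert(meanOfValid(withNaN) == 21.0);    // or can be skipped deliberately

    // Same failure reported as 0: the mean is wrong and nothing flags it.
    double[] withZero = [20.5, 0.0, 21.5];
    assert(naiveMean(withZero) == 14.0);
}
```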

bumby 2 hours ago

Doesn’t this completely depend on the sensor failure mode? Eg if a voltage sensor internally shorts to ground, the failure will read 0V, not NaN. Or are you using “failed sensor” to only mean “not reporting” here?
I think your initialization is smart in many use cases, but the sensor application probably isn’t one of them except for that single failure mode. It can still lead to masked failures and false assumptions (“the sensor is getting a value so it must be working”). That’s the same issue as what you’re supposedly fixing by that design choice. It still requires engineering knowledge to assess correctly.
AlotOfReading 15 hours ago
My experience is that if you write an interface that (rarely) returns NaNs, someone will use it assuming it's never NaN no matter how good the docs are. Then their code does bad things and you have to patiently explain why they're wrong and yes, they are holding isnan() wrong (in C/C++).
- adrian_b 7 hours ago
  
  When such users are expected, there exists only one solution.
  Do not mask the invalid operation exception. That was actually the original recommendation of the IEEE standard: the default behavior should be to mask all exceptions except the invalid operation exception.
  When the invalid operation exception is not masked, NaNs are never generated and any NaN present in the input data will generate an exception, which will abort the program, unless the exception is handled.
  This behavior avoids the bugs caused by careless programmers. Unfortunately, the original suggestion was not adopted by most programming language implementers, so nowadays the typical default setting is to have all exceptions masked. When programmers also fail to handle the special values, bugs may remain unnoticed.
  Special values need not be handled everywhere, because infinities and NaNs will propagate through many operations, so they will remain in the final results. But wherever a value is not persistent, that is, it is used in some decision and then discarded, special values like NaNs must be handled correctly.
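
The last point, a NaN consumed by a decision and then discarded, can be sketched in D as follows. The `Action` enum, the function names, and the threshold are hypothetical; the sketch covers only the explicit check at the decision point, not unmasking the hardware exception.

```d
import std.math : isNaN;

enum Action { open, close, fault }

// Hypothetical decision with no NaN handling.
Action naiveDecision(double pressure, double limit)
{
    // Every ordered comparison with NaN is false, so a NaN falls into "close".
    return pressure > limit ? Action.open : Action.close;
}

// Same decision, but the special value is handled before it is discarded.
Action checkedDecision(double pressure, double limit)
{
    if (isNaN(pressure))
        return Action.fault;
    return pressure > limit ? Action.open : Action.close;
}

void main()
{
    assert(naiveDecision(double.nan, 5.0) == Action.close);   // NaN lost here
    assert(checkedDecision(double.nan, 5.0) == Action.fault); // NaN reported
    assert(checkedDecision(7.0, 5.0) == Action.open);
}
```
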
- WalterBright 13 hours ago
  
  NaN for a failed sensor is objectively better than any other value. But at some point you just cannot help some people.

anitil 15 hours ago

That's a very thoughtful decision, I always enjoy your updates on D

wpollock 15 hours ago

> ... Unicode says that 0xFF is an invalid character.

Not so. You may be thinking of UTF-8 encoding. 0xff is DEL in Unicode.

LittleLily 12 hours ago

DEL is Unicode code point U+007F, which is the byte 0x7F in UTF-8, not 0xFF. Perhaps you were thinking of ÿ, which is code point U+00FF and encodes to the bytes 0xC3 0xBF in UTF-8.
WalterBright 15 hours ago

The "char" type in D represents a UTF-8 code unit; the byte 0xFF is not a valid character code and is strictly forbidden.
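
A small D sketch tying the three comments above together. The `isValidUtf8` wrapper is a hypothetical helper; `validate` and `UTFException` are from Phobos' `std.utf`.

```d
import std.utf : validate, UTFException;

// Hypothetical wrapper: true if s is well-formed UTF-8.
bool isValidUtf8(const(char)[] s)
{
    try
    {
        validate(s);
        return true;
    }
    catch (UTFException)
    {
        return false;
    }
}

void main()
{
    char c;                        // default-initialized
    assert(c == 0xFF);             // char.init is the code unit 0xFF

    string y = "\u00FF";           // the code point U+00FF (ÿ)
    assert(y.length == 2);         // two UTF-8 code units
    assert(y[0] == 0xC3 && y[1] == 0xBF);
    assert(isValidUtf8(y));

    // A raw 0xFF byte is rejected as malformed UTF-8.
    char[] bad = ['a', cast(char) 0xFF, 'b'];
    assert(!isValidUtf8(bad));
}
```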