Comment by thechao
3 years ago
And a 64b sweep is ~21 days. As a HW guy who tests fp units... just saying.
The real issue is that the verification code is much, much, slower.
> And a 64b sweep is ~21 days
Per GPU. A consumer GPU at that (A100 is faster).
With 4 GPUs (AMD 6900 XT) per node and maybe 10 nodes in a rack (4U per node, 42U cabinet), you can do a 64-bit sweep in about half a day.
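The back-of-the-envelope scaling, as a quick sketch. It assumes the ~21 days per GPU quoted upthread and perfectly linear scaling across GPUs, which a real rack won't quite hit; the function and parameter names are made up for illustration.

```python
def sweep_days(days_per_gpu: float, gpus_per_node: int, nodes: int) -> float:
    """Days for one full sweep, assuming the work splits evenly across all GPUs."""
    return days_per_gpu / (gpus_per_node * nodes)

# Rack from the comment: 4 GPUs per node, 10 nodes (10 x 4U = 40U of a 42U cabinet).
rack_days = sweep_days(21, 4, 10)
print(f"{rack_days:.3f} days, about {rack_days * 24:.1f} hours")
```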
> The real issue is that the verification code is much, much, slower.
Isn't verification of 64b multipliers actually a 128-bit problem? (x * y == z, so you have two 64-bit inputs to work with.)
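To make the two-operand point concrete, here is a scaled-down sketch: an exhaustive check of an 8-bit multiplier, which already means 2^16 input pairs. `shift_add_mul` is a hypothetical stand-in for the unit under test, not anyone's real design. The same loop at 64 bits would have to cover 2^128 pairs, i.e. 2^64 of the single-operand sweeps discussed above.

```python
from itertools import product

def shift_add_mul(x: int, y: int, bits: int) -> int:
    """Hypothetical unit under test: shift-and-add multiplier, full-width product."""
    acc = 0
    for i in range(bits):
        if (y >> i) & 1:          # add a shifted copy of x for every set bit of y
            acc += x << i
    return acc

def exhaustive_check(bits: int):
    """Compare against Python's built-in multiply on every (x, y) pair.

    Returns (pairs_checked, first_mismatch_or_None).
    """
    checked = 0
    for x, y in product(range(1 << bits), repeat=2):
        checked += 1
        if shift_add_mul(x, y, bits) != x * y:
            return checked, (x, y)
    return checked, None

pairs, mismatch = exhaustive_check(8)
print(pairs, mismatch)
```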
Also, I think they use BDDs. Although BDDs for multiplication grow exponentially, the exponent is still small enough that 128-bit BDDs fit on today's computers. So it's not a brute-force thing like what's discussed in this topic, but actually a very elegant data structure (though with tons of complications, as is the nature of NP-complete and EXPSPACE problems).
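For a feel of the data structure, here is a toy sketch that counts the internal nodes of a reduced ordered BDD for the middle output bit of a small multiplier. It builds the whole truth table first, so it only works for tiny widths and is not how a real BDD package or verification tool operates. The function names and the fixed variable order (all bits of x, then all bits of y) are my own choices for illustration.

```python
def robdd_node_count(table: tuple) -> int:
    """Count internal nodes of the reduced ordered BDD for a truth table.

    `table` has length 2**m; the most significant index bit is the top variable.
    """
    nodes = set()  # one entry per distinct subfunction that tests its top variable

    def visit(sub: tuple) -> None:
        if len(sub) == 1:           # terminal 0 or 1
            return
        half = len(sub) // 2
        low, high = sub[:half], sub[half:]
        if low == high:             # top variable is irrelevant: no node needed
            visit(low)
        elif sub not in nodes:      # shared subfunctions are counted only once
            nodes.add(sub)
            visit(low)
            visit(high)

    visit(table)
    return len(nodes)

def multiplier_bit_table(bits: int, out_bit: int) -> tuple:
    """Truth table of one output bit of x * y, indexed by (x << bits) | y."""
    size = 1 << bits
    return tuple(((x * y) >> out_bit) & 1 for x in range(size) for y in range(size))

# Print the node count for a few small operand widths to watch how it grows.
for n in range(1, 7):
    print(n, robdd_node_count(multiplier_bit_table(n, n - 1)))
```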
It’s a lot easier to do formal verification… which is a fancy BDD, in many ways.
At that point I might just take the first 2^32 numbers, the last 2^32 numbers, and some range in between, and feel good about it.
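That sampling plan could look roughly like this. It's only a sketch: `edge_and_middle_ranges` and `first_mismatch` are hypothetical helpers, and the middle window is picked at random from a seed rather than by any principled rule.

```python
import random

def edge_and_middle_ranges(bits: int, width: int, seed: int = 0) -> list:
    """Three non-overlapping windows of `width` values: lowest, one in between, highest."""
    top = 1 << bits
    if width <= 0 or 3 * width > top:
        raise ValueError("need room for three non-overlapping windows")
    rng = random.Random(seed)
    mid_start = rng.randrange(width, top - 2 * width + 1)
    return [range(0, width),
            range(mid_start, mid_start + width),
            range(top - width, top)]

def first_mismatch(fn, reference, ranges):
    """Return the first input where fn disagrees with reference, or None."""
    for window in ranges:
        for v in window:
            if fn(v) != reference(v):
                return v
    return None

# The plan from the comment: 64-bit inputs, 2^32 values per window.
# Ranges are lazy, so building the plan does not run anything.
plan = edge_and_middle_ranges(64, 1 << 32)

# Small demo: a made-up 16-bit function with a single bad input near the top.
def reference(v: int) -> int:
    return (v * 3) & 0xFFFF

def buggy(v: int) -> int:
    return reference(v) ^ 1 if v == 0xFFFE else reference(v)

demo = edge_and_middle_ranges(16, 256)
print(first_mismatch(buggy, reference, demo))
```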
https://en.wikipedia.org/wiki/Bounds_checking still seems sensible: if 2^32-1 works, why wouldn't 2^32-2?