Comment by raincole
20 days ago
Low-level numerical operation optimizations are often not reproducible. For example: https://www.intel.com/content/dam/develop/external/us/en/doc... (2013)
But it's still surprising that the LLM doesn't work on the iPhone 16 at all. After all, LLMs are known for their tolerance to quantization.
Yes, "floating point accumulation doesn't commute" is a mantra everyone should have in their head, and when I first read this article, I was jumping at the bit to dismiss it out of hand for that reason.
But what got me about this is that:
* every other Apple device delivered the same results
* Apple's own LLM silently failed on this device
To me, that behavior suggests an unexpected failure rather than a fundamental issue; it seems Bad (TM) that Apple would ship devices where their own LLM didn't work.
> floating point accumulation doesn't commute
It is commutative (except for NaN). It isn't associative though.
I think it commutes even when one or both inputs are NaN? The output is always NaN.
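For what it's worth, here's a minimal sketch of that point (ordinary IEEE 754 doubles, nothing authoritative); since NaN == NaN is always false, the NaN case has to be checked with std::isnan rather than by comparing the two sums directly:

    #include <cmath>
    #include <cstdio>
    #include <limits>

    int main() {
        double a = 0.1, b = 0.3;
        // Addition commutes: a + b and b + a are the same double.
        std::printf("a + b == b + a: %d\n", (int)(a + b == b + a));  // 1

        // With a NaN input the sum is NaN in either order, but you can't
        // test that with ==, so use std::isnan.
        double n = std::numeric_limits<double>::quiet_NaN();
        std::printf("isnan(a + n): %d, isnan(n + a): %d\n",
                    (int)std::isnan(a + n), (int)std::isnan(n + a));  // 1, 1
        return 0;
    }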
I would go even further and state that "you should never assume that floating point functions will evaluate the same on two different computers, or even on two different versions of the same application", as the results of floating point evaluations can differ depending on platform, compiler optimizations, compilation flags, run-time FPU environment (rounding mode, &c.), and even memory alignment of run-time data.
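To make the rounding-mode point concrete, a sketch along these lines (assuming an IEEE 754 target; strictly you also need FENV_ACCESS or -frounding-math for the compiler to respect the mode change) gets two different doubles out of the same division:

    #include <cfenv>
    #include <cstdio>

    int main() {
        // volatile keeps the divisions at runtime so the compiler can't
        // constant-fold them under the default rounding mode.
        volatile double x = 1.0, y = 3.0;

        std::fesetround(FE_TONEAREST);
        double nearest = x / y;

        std::fesetround(FE_UPWARD);
        double upward = x / y;

        std::printf("%.17g\n%.17g\n", nearest, upward);  // differ in the last bit
        return 0;
    }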
There's a C++26 paper about compile time math optimizations with a good overview and discussion about some of these issues [P1383]. The paper explicitly states:
1. It is acceptable for evaluation of mathematical functions to differ between translation time and runtime.
2. It is acceptable for constant evaluation of mathematical functions to differ between platforms.
So C++ has very much accepted the fact that floating point functions should not be presumed to give identical results in all circumstances.
Now, it is of course possible to ensure that floating point-related functions give identical results on all your target machines, but it's usually not worth the hassle.
[P1383]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p13...
Even the exact same source code can give different results when compiled with different compilers, or with the same compiler and different options.
The Intel compiler, for example, uses less than IEEE 754 precision for floating point ops by default.
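One concrete way this shows up: whether the compiler contracts a*b + c into a fused multiply-add (steered by flags like -ffp-contract on GCC/Clang, with defaults that vary by compiler and target) changes the answer. A rough sketch:

    #include <cfloat>
    #include <cmath>
    #include <cstdio>

    int main() {
        double a = 1.0 + DBL_EPSILON;   // 1 + 2^-52
        double b = 1.0 - DBL_EPSILON;   // 1 - 2^-52
        double c = -1.0;

        // If the multiply stays separate, a*b rounds to exactly 1.0 and the
        // sum is 0; if the compiler contracts this into an FMA, it matches
        // the fused result below instead.
        double separate = a * b + c;

        // An explicit FMA keeps the exact product 1 - 2^-104 before adding c.
        double fused = std::fma(a, b, c);

        std::printf("separate: %g\nfused:    %g\n", separate, fused);
        return 0;
    }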
FYI, the saying is "champing at the bit"; it comes from horses being restrained.
Huh. I never knew "champing" was the proper spelling [0]
[0] https://www.npr.org/sections/memmos/2016/06/09/605796769/che...
Hey, I appreciate your love of language and your sharing it with us.
I'm wondering if we couldn't re-think "bit" to the computer science usage instead of the thing that goes in the horse's mouth, and what it would mean for an AI agent to "champ at the bit"?
What new sayings will we want?
chomping at the bit
As a sister comment said, floating point computations are commutative, but not associative.
a * b = b * a for all "normal" floating point numbers.
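And the associativity half is easy to see with everyone's favorite decimals (a minimal sketch, plain doubles):

    #include <cstdio>

    int main() {
        double left  = (0.1 + 0.2) + 0.3;   // 0.60000000000000009
        double right = 0.1 + (0.2 + 0.3);   // 0.59999999999999998
        std::printf("%.17g\n%.17g\nequal: %d\n", left, right, (int)(left == right));  // equal: 0
        return 0;
    }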