Comment by raincole
20 days ago
Low-level numerical operation optimizations are often not reproducible. For example: https://www.intel.com/content/dam/develop/external/us/en/doc... (2013)
But it's still surprising that the LLM doesn't work on the iPhone 16 at all. After all, LLMs are known for their tolerance to quantization.
Yes, "floating point accumulation doesn't commute" is a mantra everyone should have in their head, and when I first read this article, I was jumping at the bit to dismiss it out of hand for that reason.
But what got me about this is that:
* every other Apple device delivered the same results
* Apple's own LLM silently failed on this device
To me, that behavior suggests an unexpected failure rather than a fundamental issue; it seems Bad (TM) that Apple would ship devices where their own LLM didn't work.
> floating point accumulation doesn't commute
It is commutative (except for NaN). It isn't associative though.
I think it commutes even when one or both inputs are NaN? The output is always NaN.
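For what it's worth, here's a minimal sketch of that point (ordinary IEEE 754 doubles, nothing authoritative); since NaN == NaN is always false, the NaN case has to be checked with std::isnan rather than by comparing the two sums directly:

    #include <cmath>
    #include <cstdio>
    #include <limits>

    int main() {
        double a = 0.1, b = 0.3;
        // Addition commutes: a + b and b + a are the same double.
        std::printf("a + b == b + a: %d\n", (int)(a + b == b + a));  // 1

        // With a NaN input the sum is NaN in either order, but you can't
        // test that with ==, so use std::isnan.
        double n = std::numeric_limits<double>::quiet_NaN();
        std::printf("isnan(a + n): %d, isnan(n + a): %d\n",
                    (int)std::isnan(a + n), (int)std::isnan(n + a));  // 1, 1
        return 0;
    }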
I would go even further and state that "you should never assume that floating point functions will evaluate the same on two different computers, or even on two different versions of the same application", as the results of floating point evaluations can differ depending on platform, compiler optimizations, compilation flags, run-time FPU environment (rounding mode, &c.), and even memory alignment of run-time data.
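To make the rounding-mode point concrete, a sketch along these lines (assuming an IEEE 754 target; strictly you also need FENV_ACCESS or -frounding-math for the compiler to respect the mode change) gets two different doubles out of the same division:

    #include <cfenv>
    #include <cstdio>

    int main() {
        // volatile keeps the divisions at runtime so the compiler can't
        // constant-fold them under the default rounding mode.
        volatile double x = 1.0, y = 3.0;

        std::fesetround(FE_TONEAREST);
        double nearest = x / y;

        std::fesetround(FE_UPWARD);
        double upward = x / y;

        std::printf("%.17g\n%.17g\n", nearest, upward);  // differ in the last bit
        return 0;
    }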
There's a C++26 paper about compile time math optimizations with a good overview and discussion about some of these issues [P1383]. The paper explicitly states:
1. It is acceptable for evaluation of mathematical functions to differ between translation time and runtime.
2. It is acceptable for constant evaluation of mathematical functions to differ between platforms.
So C++ has very much accepted the fact that floating point functions should not be presumed to give identical results in all circumstances.
Now, it is of course possible to ensure that floating point-related functions give identical results on all your target machines, but it's usually not worth the hassle.
[P1383]: https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p13...
Even the exact same source code can give different results when compiled with different compilers, or with the same compiler and different options.
The Intel compiler, for example, uses less than IEEE 754 precision for floating point ops by default.
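One concrete way this shows up: whether the compiler contracts a*b + c into a fused multiply-add (steered by flags like -ffp-contract on GCC/Clang, with defaults that vary by compiler and target) changes the answer. A rough sketch:

    #include <cfloat>
    #include <cmath>
    #include <cstdio>

    int main() {
        double a = 1.0 + DBL_EPSILON;   // 1 + 2^-52
        double b = 1.0 - DBL_EPSILON;   // 1 - 2^-52
        double c = -1.0;

        // If the multiply stays separate, a*b rounds to exactly 1.0 and the
        // sum is 0; if the compiler contracts this into an FMA, it matches
        // the fused result below instead.
        double separate = a * b + c;

        // An explicit FMA keeps the exact product 1 - 2^-104 before adding c.
        double fused = std::fma(a, b, c);

        std::printf("separate: %g\nfused:    %g\n", separate, fused);
        return 0;
    }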
FYI, the saying is "champing at the bit"; it comes from horses being restrained.
Huh. I never knew "champing" was the proper spelling [0]
[0] https://www.npr.org/sections/memmos/2016/06/09/605796769/che...
Hey, I appreciate your love of language and your sharing it with us.
I'm wondering if we couldn't re-think "bit" to the computer science usage instead of the thing that goes in the horse's mouth, and what it would mean for an AI agent to "champ at the bit"?
What new sayings will we want?
chomping at the bit
As a sister comment said, floating point computations are commutative, but not associative.
a * b = b * a for all "normal" floating point numbers.
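And the associativity half is easy to see with everyone's favorite decimals (a minimal sketch, plain doubles):

    #include <cstdio>

    int main() {
        double left  = (0.1 + 0.2) + 0.3;   // 0.60000000000000009
        double right = 0.1 + (0.2 + 0.3);   // 0.59999999999999998
        std::printf("%.17g\n%.17g\nequal: %d\n", left, right, (int)(left == right));  // equal: 0
        return 0;
    }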