Comment by duped
1 year ago
I mean, if you emit compiler commands from any build system they're going to be completely illegible due to the sheer number of -L, -l, -I, -i, -D flags, most of which are generated by things like pkg-config and your build configuration.
There aren't many optimization flags that people get fine-grained with; the exception is floating point, because -ffast-math alone is extremely inadvisable.
It goes even further.
Technically, compilers can choose to make undefined behavior implementation-defined behavior instead. But they don't.
That's kind of also how C++ std::span wound up without bounds checks in practice. And my_arr.at(i) just isn't really being used by anybody.
Seems very user-hostile to me.
-ffast-math and -Ofast are inadvisable on principle:
Tl;dr: Python's gevent messes up your floating-point control register (MXCSR) because a shared library it loads was built with -ffast-math (yes.)
https://moyix.blogspot.com/2022/09/someones-been-messing-wit...
I disagree with "on principle." There are flaws in the design of IEEE 754, and omitting strict adherence for the sake of performance is fine, if not required, for some applications.
For example, recursive filters (even the humble averaging filter) will suffer untold pain without enabling DAZ/FTZ mode.
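To make that concrete, here's a minimal sketch of the failure mode and the escape hatch (assuming x86 with SSE; the iteration count and the function names are mine, not from any particular codebase):

    #include <immintrin.h>  /* _MM_SET_FLUSH_ZERO_MODE, _MM_SET_DENORMALS_ZERO_MODE */
    #include <stdio.h>

    /* One step of the humble averaging filter: y[n] = y[n-1] + a*(x - y[n-1]). */
    static float step(float y, float x, float a) { return y + a * (x - y); }

    int main(void) {
        /* With a quiet input the state decays geometrically (factor ~0.9 per
           step); after roughly 850 steps it drops below the smallest normal
           float (~1.2e-38) into the subnormal range, where x86 arithmetic
           typically falls off a microcoded cliff. */
        float y = 1.0f;
        for (int n = 0; n < 900; n++)
            y = step(y, 0.0f, 0.1f);
        printf("state after 900 quiet samples: %g (subnormal)\n", y);

        /* DAZ/FTZ treats subnormal inputs/results as zero, trading IEEE 754
           gradual underflow for predictable throughput. */
        _MM_SET_FLUSH_ZERO_MODE(_MM_FLUSH_ZERO_ON);
        _MM_SET_DENORMALS_ZERO_MODE(_MM_DENORMALS_ZERO_ON);
        printf("one more step under DAZ/FTZ: %g\n", step(y, 0.0f, 0.1f));
        return 0;
    }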
fwiw the linked issue has been remedied in recent compilers, and it isn't a Python problem, it's a gcc problem. That said, if your algorithm requires subnormal numbers, for the love of numeric stability, guard your scopes and set the MXCSR register accordingly!
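A minimal sketch of that kind of scope guard (x86/SSE assumed; filter_block is a made-up name and the bit masks are written out by hand, so treat it as illustrative):

    #include <immintrin.h>

    /* This algorithm needs gradual underflow, so clear FTZ (bit 15) and
       DAZ (bit 6) for the duration of the scope, then put MXCSR back the
       way the caller (or some -ffast-math-built .so) left it. */
    void filter_block(float *buf, int n) {
        unsigned int saved = _mm_getcsr();
        _mm_setcsr(saved & ~0x8040u);   /* subnormals behave per IEEE 754 */
        for (int i = 0; i < n; i++)
            buf[i] *= 0.5f;             /* ...the actual numeric work... */
        _mm_setcsr(saved);              /* restore on the way out */
    }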
A big problem with -ffast-math is that it causes isnan and isinf to be completely, silently broken (gcc and clang).
Like, "oh you want faster FP operations? well then surely you have no need to ever be able to detect infinite or NaN values.."
In practice, "some applications" seems to include almost all of NumPy and Python. Good call.
Like with the Java sin() fixes: if you don't care about the results being correct, why not constant-fold in an arbitrary number? Way faster at run time.
I find that building and testing my code with -Ofast and -ffast-math from the beginning helps to avoid a lot of the issues with them. Any new code that breaks with them on probably wasn't particularly stable anyway and should be rethought.
"what kind of math does the compile usually do without this funsafemath flag? Sad dangerous math?"
There are things like floating-point exceptions (IEEE 754) and subnormal numbers (values very close to zero that fill the gap below the smallest normal float, with relative precision worse than the usual machine epsilon). The idea is to degrade gracefully: underflow loses precision bit by bit instead of snapping straight to zero. These additional features require additional transistors and processing, which raises latency.
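A quick sketch of that graceful degradation (gradual underflow) below the smallest normal float:

    #include <float.h>
    #include <stdio.h>

    int main(void) {
        float x = FLT_MIN;              /* smallest normal float, ~1.18e-38 */
        /* Each halving stays representable as a subnormal, shedding one bit
           of precision at a time, until ~1.4e-45 finally rounds to zero. */
        for (int i = 0; i < 5; i++) {
            x /= 2.0f;
            printf("FLT_MIN / 2^%d = %g\n", i + 1, x);
        }
        return 0;
    }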
If you really know (and want to know) what you are doing, turning this stuff off may help. Some people even advocate brute-forcing all 2^32 single floats in your test cases, because it is kind of feasible to do so: https://news.ycombinator.com/item?id=34726919
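A minimal sketch of that exhaustive harness; my_expf is a hypothetical stand-in for whatever routine you want to validate, with double-precision exp as the reference:

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    /* Hypothetical function under test: swap in your own implementation. */
    static float my_expf(float x) { return expf(x); }

    int main(void) {
        uint64_t mismatches = 0;
        /* Walk every one of the 2^32 single-precision bit patterns. */
        for (uint64_t bits = 0; bits <= UINT32_MAX; bits++) {
            uint32_t b = (uint32_t)bits;
            float x;
            memcpy(&x, &b, sizeof x);           /* bit-cast without UB */
            float got = my_expf(x);
            float ref = (float)exp((double)x);  /* higher-precision reference */
            uint32_t gb, rb;
            memcpy(&gb, &got, sizeof gb);
            memcpy(&rb, &ref, sizeof rb);
            if (gb != rb)                       /* flags even 1-ulp drift */
                mismatches++;
        }
        printf("%llu bit-exact mismatches\n", (unsigned long long)mismatches);
        return 0;
    }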