Comment by twic
21 hours ago
And yet, there is a good chance that C++ will start doing exactly this [1]. Because [2]:
> The performance impact is negligible (less than 0.5% regression) to slightly positive (that is, some code gets faster by up to 1%). The code size impact is negligible (smaller than 0.5%). Compile-time regressions are negligible. Were overheads to matter for particular coding patterns, compilers would be able to obviate most of them.
> The only significant performance/code regressions are when code has very large automatic storage duration objects. We provide an attribute to opt-out of zero-initialization of objects of automatic storage duration. We then expect that programmers can audit their code for this attribute, and ensure that the unsafe subset of C++ is used in a safe manner.
> This change was not possible 30 years ago because optimizations simply were not as good as they are today, and the costs were too high. The costs are now negligible.
[1] https://github.com/cplusplus/papers/issues/1401
[2] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p27...
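For a concrete sense of what that would mean in code, here is a minimal sketch. Compilers already ship this behaviour behind a flag, so the flag and attribute spellings below are the existing Clang/GCC ones, not necessarily whatever the standard eventually adopts:

    // Build with e.g.  clang++ -O2 -ftrivial-auto-var-init=zero demo.cpp
    // (GCC 12+ accepts the same flag and attribute.)
    #include <cstdio>

    int read_first() {
        int buf[256];   // automatic storage, never written
        return buf[0];  // today: undefined behaviour; under the proposal: 0
    }

    int read_first_opted_out() {
        // The opt-out for hot paths with large automatic objects, spelled
        // here with the existing compiler attribute rather than whatever
        // the paper's attribute ends up being called.
        __attribute__((uninitialized)) int buf[256];
        return buf[0];  // stays indeterminate: you asked for it
    }

    int main() {
        std::printf("%d\n", read_first());  // prints 0 under the flag
        return 0;
    }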
Thanks for the references - that was interesting reading, particularly that initialisation can be good for instruction pipelining.
A trick we were using with SSE was something like
    __m128 zero = _mm_undefined_ps();
    zero = _mm_xor_ps(zero, zero);
Now, we were really careful to view our ops as data dependencies when reasoning about pipelining efficiency, but our profiling tools were not measuring that.
We did avoid _mm_set1_ps(0.0f), which was actually showing up as cache misses.
I wonder if we actually ended up slower, simply because cache misses are something we can measure?!
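In case it helps, here is a minimal, self-contained sketch of the two idioms (illustrative function names and harness, not the original code):

    // Build with e.g.  g++ -O2 zero.cpp
    #include <immintrin.h>
    #include <cstdio>

    static inline __m128 zero_via_xor() {
        // xor-zeroing: no memory access, and x86 treats xorps xmm,xmm as
        // dependency-breaking, so it does not wait on the register's old value.
        __m128 z = _mm_undefined_ps();
        return _mm_xor_ps(z, z);
    }

    static inline __m128 zero_via_set() {
        // Depending on the compiler and optimisation level, this may be
        // lowered to a load/broadcast from a constant pool - which is where
        // cache misses would show up in a profiler.
        return _mm_set1_ps(0.0f);
    }

    int main() {
        float a[4], b[4];
        _mm_storeu_ps(a, zero_via_xor());
        _mm_storeu_ps(b, zero_via_set());
        std::printf("%g %g\n", a[0], b[0]);
        return 0;
    }

For what it's worth, _mm_setzero_ps() exists for exactly this, and modern compilers usually lower both it and the set1 form to a plain xorps anyway, so any measured difference most likely came down to how that particular toolchain materialised the constant.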