
Comment by twic

21 hours ago

And yet, there is a good chance that C++ will start doing exactly this [1]. Because [2]:

> The performance impact is negligible (less than 0.5% regression) to slightly positive (that is, some code gets faster by up to 1%). The code size impact is negligible (smaller than 0.5%). Compile-time regressions are negligible. Were overheads to matter for particular coding patterns, compilers would be able to obviate most of them.

> The only significant performance/code regressions are when code has very large automatic storage duration objects. We provide an attribute to opt-out of zero-initialization of objects of automatic storage duration. We then expect that programmers can audit their code for this attribute, and ensure that the unsafe subset of C++ is used in a safe manner.

> This change was not possible 30 years ago because optimizations simply were not as good as they are today, and the costs were too high. The costs are now negligible.

[1] https://github.com/cplusplus/papers/issues/1401

[2] https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p27...
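
To make the quoted trade-off concrete, here is a rough sketch using the -ftrivial-auto-var-init=zero mode that Clang (and recent GCC) already ship, as a stand-in for the proposed default. The opt-out spelling below is Clang's existing [[clang::uninitialized]] attribute, used purely for illustration - the paper itself doesn't fix a name:

    #include <cstddef>

    // Stand-in for some I/O; fills the buffer so the example is self-contained.
    static std::size_t fake_read(unsigned char* dst, std::size_t cap) {
        for (std::size_t i = 0; i < cap; ++i)
            dst[i] = static_cast<unsigned char>(i);
        return cap;
    }

    std::size_t handle_packet() {
        // A very large automatic object: zero-filling 1 MiB on every call is
        // the one regression the paper calls out, hence the opt-out, e.g.
        //   [[clang::uninitialized]] unsigned char buf[1 << 20];
        unsigned char buf[1 << 20];

        // Small scalars like this are the "negligible cost" case.
        std::size_t seen = 0;

        seen += fake_read(buf, sizeof buf);
        return seen;
    }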

Thanks for the references - that was interesting reading, particularly the point that initialisation can be good for instruction pipelining.

A trick we were using with SSE was something like:

    __m128 zero = _mm_undefined_ps();
    zero = _mm_xor_ps(zero, zero);

Now, we were really careful to view our ops as data dependencies when reasoning about pipelining efficiency, but our profiling tools weren't measuring that.

We did avoid _mm_set1_ps(0.0f), which was actually showing up as cache misses.

I wonder if we actually ended up slower, just because cache misses are the thing we could measure?!
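
For what it's worth, a minimal standalone comparison of the ways of materialising a zero register (assuming an SSE-capable x86 target; recent GCC/Clang/MSVC typically compile the first two down to the same xorps):

    #include <immintrin.h>

    // The trick described above: start from an "undefined" register and xor
    // it with itself. xor-of-a-register-with-itself is a recognised zeroing
    // idiom on modern x86 cores, so it breaks the dependency on whatever was
    // in the register before.
    static __m128 zero_via_xor(void) {
        __m128 z = _mm_undefined_ps();
        return _mm_xor_ps(z, z);
    }

    // The plain intrinsic: current compilers generally emit the same
    // xorps xmm, xmm for this, with no _mm_undefined_ps dance needed.
    static __m128 zero_via_setzero(void) {
        return _mm_setzero_ps();
    }

    // Broadcasting a 0.0f constant: depending on compiler and vintage this
    // could become a load of a constant from memory rather than a register
    // xor, which would be consistent with it showing up as cache misses.
    static __m128 zero_via_set1(void) {
        return _mm_set1_ps(0.0f);
    }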