Comment by cookiengineer
2 days ago
I thought SSE2 and everything that came after it, like SSE4 or AVX-512, was made for streaming, using the cache only for direct access to speed things up?
I haven't used SSE instructions for anything beyond fiddling around with them, so I don't know whether this assumption is wrong. I understand the lock-state argument about cores, since at most 2 cores can ever access the same cache/memory... but wouldn't this have to be identical for FPUs if we compare them with SIMD + AVX?