
Comment by sidewndr46

15 days ago

Not to suggest you weren't competent, but did you consider, and try to control for, the possibility that your measurement itself was the problem?

Not going to dismiss it, but I did try to not do stupid stuff. I used QueryPerformanceCounter outside the loop, pinned the benchmark thread to a single core, and the array of elements it processed was fairly large. So I don't think timer overhead or throttling was an issue. The measurements were very consistent and repeatable.
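For illustration only, a setup along those lines might look like the sketch below. It is not the code from the original benchmark: `pin_to_core0`, `now_seconds`, `time_kernel`, and `sum_u32` are hypothetical names, core 0 is an arbitrary choice, and the non-Windows branch is just a fallback so the sketch builds elsewhere (it skips pinning). The timer is read once before and once after the whole loop, so its cost is spread over all repetitions.

```c
#define _POSIX_C_SOURCE 199309L  /* for clock_gettime in the non-Windows fallback */
#include <stddef.h>
#include <stdint.h>

#ifdef _WIN32
#include <windows.h>
/* Pin the calling thread to logical core 0 (arbitrary choice for this sketch). */
static void pin_to_core0(void) {
    SetThreadAffinityMask(GetCurrentThread(), 1);
}
static double now_seconds(void) {
    LARGE_INTEGER freq, t;
    QueryPerformanceFrequency(&freq);
    QueryPerformanceCounter(&t);
    return (double)t.QuadPart / (double)freq.QuadPart;
}
#else
#include <time.h>
/* Fallback: no pinning, monotonic clock instead of QueryPerformanceCounter. */
static void pin_to_core0(void) {}
static double now_seconds(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (double)ts.tv_sec + (double)ts.tv_nsec * 1e-9;
}
#endif

typedef uint64_t (*kernel_fn)(const uint32_t *data, size_t n);

/* Hypothetical kernel under test: sums the array. */
static uint64_t sum_u32(const uint32_t *data, size_t n) {
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++) s += data[i];
    return s;
}

/* Returns average seconds per call. The timer is read outside the loop. */
static double time_kernel(kernel_fn fn, const uint32_t *data, size_t n,
                          int reps, uint64_t *sink) {
    pin_to_core0();
    uint64_t acc = fn(data, n);          /* warm-up run, not timed */
    double t0 = now_seconds();
    for (int i = 0; i < reps; i++) acc += fn(data, n);
    double t1 = now_seconds();
    *sink = acc;                         /* hand the result back so the work is used */
    return (t1 - t0) / reps;
}
```

An optimizing compiler may still inline and hoist a pure kernel out of the loop, so it is worth checking the generated code before trusting the numbers.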

  • Fair enough. I've only really ever found assembly-level optimization on embedded microcontrollers to make any degree of sense. Performance optimization usually means something along the lines of "convince co-workers not to implement their own bubble sort" in my line of work.

    • Yeah, I've also come across a lot of assembly code that was faster 10 years ago but that the compiler now beats. So for a while now my take has been to mostly avoid asm, but if needed, always have a compiled version too, and always do runtime performance detection to select the optimal version.
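As a rough sketch of that last idea (an assumption about what it could look like, not the commenter's actual code): keep a plain compiled baseline next to each hand-tuned variant, time them all once at startup on sample data, and call whichever wins through a function pointer. All names here (`sum_plain`, `sum_unrolled`, `time_once`, `select_fastest`) are hypothetical, and the unrolled C function only stands in for a hand-written asm routine. The standard `clock()` timer is used for portability; it is coarse, so a real version would want a finer timer.

```c
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef uint64_t (*sum_fn)(const uint32_t *data, size_t n);

/* Baseline: plain C, left entirely to the compiler. */
static uint64_t sum_plain(const uint32_t *d, size_t n) {
    uint64_t s = 0;
    for (size_t i = 0; i < n; i++) s += d[i];
    return s;
}

/* Stand-in for a hand-tuned variant (would be asm in the scenario above). */
static uint64_t sum_unrolled(const uint32_t *d, size_t n) {
    uint64_t a = 0, b = 0, c = 0, e = 0;
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        a += d[i]; b += d[i + 1]; c += d[i + 2]; e += d[i + 3];
    }
    for (; i < n; i++) a += d[i];        /* leftover tail elements */
    return a + b + c + e;
}

/* CPU time in seconds for `reps` calls of one candidate. */
static double time_once(sum_fn fn, const uint32_t *sample, size_t n, int reps) {
    volatile uint64_t sink = 0;          /* keeps the calls from being dropped */
    clock_t t0 = clock();
    for (int i = 0; i < reps; i++) sink += fn(sample, n);
    clock_t t1 = clock();
    (void)sink;
    return (double)(t1 - t0) / CLOCKS_PER_SEC;
}

/* Times every candidate on the sample and returns the fastest.
   cands[0] should be the compiled baseline; it wins ties. */
static sum_fn select_fastest(const sum_fn *cands, size_t count,
                             const uint32_t *sample, size_t n) {
    sum_fn best = cands[0];
    double best_t = time_once(best, sample, n, 100);
    for (size_t i = 1; i < count; i++) {
        double t = time_once(cands[i], sample, n, 100);
        if (t < best_t) { best_t = t; best = cands[i]; }
    }
    return best;
}
```

Putting the baseline first means the hand-tuned version is only picked when it measurably wins on the machine it is running on, which covers the "faster 10 years ago" case.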