Comment by tsimionescu
3 hours ago
It's because removing a monster with 20 fields from an SoA structure means resizing 20 arrays. Removing the same monster from an AoS array involves resizing a single array, which you're going to process in a very cache friendly way.
Assuming ordering isn't a concern, can't you just have a field called "removed" and skip those when iterating?
Or swap it with the last monster, and keeping an index for the last monster alive.
Then you have to read the "removed" field on every field read on every operation.
SoA is only useful when you don't read multiple fields for most operations.
Two fields should be fine, actually. The way caches are organized you are very unlikely to thrash with the lookups (due to n-way associativity) while only keeping relevant data in the cache at the same time. You still have roughly the following layout (in the cache), where A is the field and V is valid:
The former access pattern still yields a clean cache layout where no unnecessary data is loaded (which is the most costly operation here by far) as opposed to
In the general case there will exist a number of fields for which SOA layout will be worse if all are accessed close to each other, but for just a validity indicator this should not be the case. I think your statement is not wrong, but also not 100% correct.
This is on par to linear search being faster than binary search for small n. As soon as caches and branch prediction chime in many rules of thumb just change. Most importantly, however, is that a distinction between small and large n basically _needs_ to happen at that point.
I'm not sure why anybody would at the same time be implementing SoA AND resizing 20 arrays for a single delete, those things seem to be on either ends of the "I care about performance" spectrum.