Comment by kaoD
17 days ago
> It's not by chance that eg. V8 will try to write objects parsed from a JSON array of objects one right after the other.
Yes, but note this is still a different pattern of access (array of "entities"). V8 does this because it assumes that e.g. `foo.name` is very likely going to be accessed along with `foo.lastName` (which is likely the 99% case for general computing) whereas ECS assumes `foo.name` is very likely going to be accessed along with `foo2.name`, `foo3.name`, ..., `fooN.name` (which is the 99% case for videogame timestep loops).
> Software engineering is architecture is full of trade-offs :) I'm just hoping that the ones we've made will prove to make sense.
To clarify: my comment is not a criticism of Nova's design decisions. I was only trying to clarify the answer to "Why would a game benefit from ECS?" for those foreign to ECS's existential motive.
I'm sure Nova's tradeoffs make sense for some workloads and I wish you the best!
Thank you very much for your well-wishes <3
> Yes, but note this is still a different pattern of access (array of "entities").
I was referring to the `[foo, foo2, foo3]` objects themselves; V8 does use an "cache local" placement for those so you'll find them laid out in memory as:
> [foo_proto, foo_elems, foo_props, foo_name, foo_lastName, foo2_proto, foo2_elems, foo2_props, foo2_name, foo2_lastName, ...]
For what it's worth, I am interested in laying object properties out in an ECS like manner in Nova, so the properties would be laid out as `[foo.name, foo2.name, foo3.name, ...]`, but currently the properties are laid out similarly to V8, `[foo.name, foo.lastName]`. The only difference is that we do not have "in object properties".
That being said: I am obviously biased, but I do wonder if an ECS-like layout wouldn't be nearly universally beneficial. Thinking of the `foo.name` and `foo.lastName` access: If those are on the same cache line then accessing the two only reads one cache line. This is nice. But if there are more properties in the objects (and there often are), then those will pollute the cache. If you do this access once, it doesn't matter. If you do this a million times, now the cache pollution becomes a real issue: In Node.js even the optimal case would be that you read read 625,000 cache lines worth of data, only to discard 250,000 cache lines of it.
If instead we use an ECS-like layout, then accessing these two properties reads two 10100cache lines: That's bad, but on the other hand if this happens once then it won't even make a blip on the screen. If a million of these accesses are done, you could think that we'd suddenly be slow as molasses but now the ECS-like layout is probably going to help you: You're more likely reading the next `name` and `lastName` property values on each access. If you have it bad and only half of the property data you read is actually the `name` and `lastName` properties you want, then you read 750,000 cache lines and lose out to the traditional engine by 100,000 cache lines. If you get 67% "hit rate" then you break even. And that's comparing to the case where the objects only contain `name` and `lastName` and nothing more.
It of course all comes down to statistics but... I'm very interested in the potential benefits here :)
Again, thank you for your comments, I've enjoyed discussing and pondering this <3
Would this mean that each shape/structure/map get its own vector for each field in order for this to work?
That would be one way; it would offer the best theoretical memory layout for well-behaved programs. But, I expect it to be very painful to work with and to come with some performance penalties in mixed use cases due to the extra indirection required.
No, my thinking is that properties would be stored into tables based on their size class: All objects that have 4-7 properties are in the same table, and all of their first property would be in the same slice, second property in another etc.
2 replies →