Comment by NBJack

9 months ago

Aside from a physics simulation, I'm curious as to what you think would be a positive cost benefit from that level of multithreading for the majority of game engines. Graphical pipelines take advantage of the concept but offload as much work as possible to the GPU.

We were doing threading beyond that in 2010; you could easily have rendering, physics, animation, audio and other subsystems chugging along on different threads. As I was leaving the industry, most engines were trending towards highly parallel job execution systems.
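To make "subsystems on different threads" concrete, here is a minimal sketch using only the Rust standard library. The `Frame` struct, its fields, and `run_frame` are hypothetical placeholders, not taken from any engine mentioned in this thread; a real engine would use a persistent worker pool or job system rather than spawning threads every frame.

```rust
use std::thread;

/// Hypothetical per-frame state; each field stands in for one subsystem's data.
#[derive(Debug, Default, PartialEq)]
pub struct Frame {
    pub physics_steps: u32,
    pub animation_ticks: u32,
    pub audio_samples: u32,
}

/// Ticks three subsystems concurrently, one scoped thread each.
/// Assumption: the subsystems touch disjoint data, so no locking is needed.
pub fn run_frame(frame: &mut Frame) {
    // Destructure so each thread gets an independent mutable borrow.
    let Frame {
        physics_steps,
        animation_ticks,
        audio_samples,
    } = frame;

    thread::scope(|s| {
        s.spawn(|| *physics_steps += 1);
        s.spawn(|| *animation_ticks += 1);
        s.spawn(|| *audio_samples += 512);
        // All spawned threads are joined when the scope ends.
    });
}
```

Calling `run_frame(&mut frame)` once per frame advances all three counters; the borrow checker rejects the code if two threads try to mutate the same field.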

The PS3 was also an interesting architecture (i.e. the SPUs) from that perspective, but it was so far removed from other hardware of the time that it never really took off. Getting existing things ported to it was a beast.

Bevy really gets the concurrency right IMO (having worked on AA/AAA engines in the past). It's missing a ton in other dimensions, but the actual ECS + scheduling APIs are a joy. The last "proper" engine I worked on was a rat's nest of concurrency in comparison.
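For readers who have not used an ECS scheduler, the rough idea is that each system declares which component types it reads and writes, and the scheduler only runs systems together when those declarations do not conflict. The sketch below is NOT Bevy's API or its actual algorithm; `SystemAccess` and `plan_batches` are hypothetical names, and the greedy batching ignores explicit ordering constraints that a real scheduler also has to honor.

```rust
use std::collections::HashSet;

/// Hypothetical description of a system: its name plus declared data access.
pub struct SystemAccess {
    pub name: &'static str,
    pub reads: HashSet<&'static str>,
    pub writes: HashSet<&'static str>,
}

impl SystemAccess {
    pub fn new(name: &'static str, reads: &[&'static str], writes: &[&'static str]) -> Self {
        Self {
            name,
            reads: reads.iter().copied().collect(),
            writes: writes.iter().copied().collect(),
        }
    }

    /// Two systems conflict if either one writes data the other reads or writes.
    pub fn conflicts_with(&self, other: &SystemAccess) -> bool {
        !self.writes.is_disjoint(&other.writes)
            || !self.writes.is_disjoint(&other.reads)
            || !self.reads.is_disjoint(&other.writes)
    }
}

/// Greedily places each system in the first batch where it conflicts with nothing.
/// Systems inside one batch could run in parallel; batches run one after another.
pub fn plan_batches(systems: &[SystemAccess]) -> Vec<Vec<&'static str>> {
    let mut batches: Vec<Vec<&SystemAccess>> = Vec::new();
    for system in systems {
        let slot = batches
            .iter()
            .position(|batch| batch.iter().all(|s| !s.conflicts_with(system)));
        match slot {
            Some(i) => batches[i].push(system),
            None => batches.push(vec![system]),
        }
    }
    batches
        .iter()
        .map(|batch| batch.iter().map(|s| s.name).collect())
        .collect()
}
```

With a physics system writing `Transform`, an audio system touching only audio data, and an animation system also writing `Transform`, physics and audio land in the same batch while animation is pushed to a later one.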

That said, as a few other people pointed out, the key is iteration, hot-reload and other things. Given the choice I'd probably do (and have done) a Rust-based engine core where you need performance/stability and some dynamic language on top (Lua, quickjs, etc.) for actual game content.

  • > That said, as a few other people pointed out, the key is iteration, hot-reload and other things. Given the choice I'd probably do (and have done) a Rust-based engine core where you need performance/stability and some dynamic language on top (Lua, quickjs, etc.) for actual game content.

    I fully agree that this will likely be the solution a lot of people want to go with in Bevy: scripting for quick iteration, Rust for the stuff that has to be fast. (Also thank you for the kind words!)

    • Yeah, it's a fairly clean and natural divide. You see it in most of the major engines and it was present in all the proprietary engines I worked on (we mostly used Lua/LuaJIT since this predated some great recent options like quickjs).

      We even had things like designers writing scripts for AI in literate programming with Lua using coroutines. We fit in 400 KB of space for code + runtime using Lua on the PSP (man, that platform was a nightmare, but the scripting worked out really well).

      Rust excels when you know what you want to build, and core engine tech fits that category pretty cleanly. Once you get up into game logic/behavior, that iteration loop is so dynamic that you are prototyping more than developing.

In big-world high-detail games, the rendering operation wants so much time that the main thread has time for little else. There's physics, there's networking, there's game movement, there's NPC AI - those all need some time. If you can get that time from another CPU, rendering tends to go faster.

I tend to overdo parallelism. Load this file into the Tracy profiler, version 0.10.0, and you can see what all the threads in my program are doing.[1] Currently I'm dealing with locking stalls at the WGPU level. If you have application/Rend3/WGPU/Vulkan/GPU parallelism, every layer has to get it right.

Why? Because the C++ clients hit a framerate wall, with the main thread at 100% and no way to get faster.

[1] https://animats.com/sl/misc/traces/clockhavenspeed02.tracy

Animations are an example. I landed code in Bevy 0.13 to evaluate all AnimationTargets (in Unity speak, animators) for all objects in parallel. (This can't be done on GPU because animations can affect the transforms of entities, which can cause collisions, etc. triggering arbitrary game logic.) For my test workload with 10,000 skinned meshes, it bumped up the FPS by quite a bit.
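As a rough illustration of the pattern described above (evaluate every animation target in parallel on the CPU, then let game logic read the resulting transforms), here is a standard-library-only sketch. It is not the Bevy implementation: `Target`, `evaluate_all`, and `count_past` are hypothetical names, the "animation" is a single linear interpolation, and splitting the slice into one chunk per worker is an assumption made for simplicity.

```rust
use std::thread;

/// Hypothetical stand-in for an animated entity; not Bevy's `AnimationTarget`.
#[derive(Debug, Clone, PartialEq)]
pub struct Target {
    pub start_x: f32,
    pub end_x: f32,
    /// The transform value the animation writes each frame.
    pub x: f32,
}

/// Samples the (linear) animation at normalized time `t` in [0, 1].
fn sample(target: &Target, t: f32) -> f32 {
    target.start_x + (target.end_x - target.start_x) * t
}

/// Evaluates every target in parallel by handing each worker a disjoint chunk.
pub fn evaluate_all(targets: &mut [Target], t: f32, workers: usize) {
    let workers = workers.max(1);
    // Round up so all targets are covered; never pass 0 to `chunks_mut`.
    let chunk_len = ((targets.len() + workers - 1) / workers).max(1);
    thread::scope(|scope| {
        for chunk in targets.chunks_mut(chunk_len) {
            scope.spawn(move || {
                for target in chunk {
                    target.x = sample(target, t);
                }
            });
        }
    });
}

/// Game logic that depends on the animated transforms, run after evaluation.
/// This dependency is why the results have to be available on the CPU.
pub fn count_past(targets: &[Target], wall_x: f32) -> usize {
    targets.iter().filter(|t| t.x >= wall_x).count()
}
```

A frame would call `evaluate_all(&mut targets, t, worker_count)` first and only then run checks like `count_past`, so collision-style logic always sees fully updated transforms.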