Comment by nicoburns

4 days ago

It's not a formal benchmark, but my browser engine / webview (https://github.com/DioxusLabs/blitz/) has pluggable rendering backends (via https://github.com/DioxusLabs/anyrender), with Vello (GPU), Vello CPU, and Skia (various backends incl. Vulkan, Metal, OpenGL, and CPU) currently implemented.

On my Apple M1 Pro, the Vello CPU renderer is competitive with the GPU renderers on simple scenes but falls behind on more complex ones, and it seems to struggle especially with large raster images. This is also without a glyph cache, which isn't implemented yet (so every glyph is re-rasterized every time, although there is a hinting cache). These results depend on multi-threading being enabled, and the renderer can consume a largish portion of all-core CPU while it runs. Skia raster (CPU) gets similar numbers, which is quite impressive if it is single-threaded.

I think Vello CPU will always struggle with raster images, because it does a bounds check for every pixel fetched from a source image. This behavior has at least been described somewhere in the Vello PRs.

The obsession with memory safety just doesn't pay off in some cases: if you can batch 64 pixels at once with SIMD, a per-pixel processor with a branch in its hot path simply cannot compete.
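
To make the per-pixel vs. batched distinction concrete, here is a minimal Rust sketch. It is not Vello's code: the `Image` type and both functions are hypothetical, and it assumes an 8-bit grayscale image stored row-major with `pixels.len() == width * height`. The first function checks the coordinates on every fetch. The second clamps the span once and then walks a slice, which leaves the compiler with a branch-free inner loop that it may be able to vectorize.

```rust
/// Hypothetical 8-bit grayscale image (not a Vello type).
/// Assumes `pixels.len() == width * height`, stored row-major.
pub struct Image {
    pub width: usize,
    pub height: usize,
    pub pixels: Vec<u8>,
}

/// Per-pixel fetch: the coordinates are checked for every pixel,
/// so the inner loop contains a branch. Out-of-range pixels count as 0.
pub fn sum_row_checked(img: &Image, y: usize, x0: usize, len: usize) -> u32 {
    let mut sum = 0u32;
    for x in x0..x0 + len {
        let px = if x < img.width && y < img.height {
            img.pixels[y * img.width + x]
        } else {
            0
        };
        sum += u32::from(px);
    }
    sum
}

/// Batched fetch: the span is clamped to the row once, then a single
/// slice is taken (one bounds check for the whole span) and iterated.
pub fn sum_row_batched(img: &Image, y: usize, x0: usize, len: usize) -> u32 {
    if y >= img.height || x0 >= img.width {
        return 0;
    }
    let end = (x0 + len).min(img.width);
    let row_start = y * img.width;
    let row = &img.pixels[row_start + x0..row_start + end];
    row.iter().map(|&p| u32::from(p)).sum()
}
```

Both versions are safe Rust and return the same result. Whether the second shape applies to a given renderer depends on how it samples: a span that maps to a straight run of source pixels can be clamped up front, while a rotated or perspective-transformed fetch may not reduce to one contiguous slice.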