Comment by oofabz

2 days ago

Very impressive work. For those who aren't familiar with this field: Valve invented SDF text rendering for their games and published a groundbreaking paper on the subject in 2007. It remains a very popular technique in video games, largely unchanged since.

In 2012, Behdad Esfahbod wrote GLyphy, an implementation of SDF text rendering that runs on the GPU using OpenGL ES. It has been widely admired for its performance and for enabling new capabilities, such as rapidly transforming text. However, it has not been widely adopted.

Modern operating systems and web browsers do not use either of these techniques, preferring to rely on 1990s-style TrueType rasterization. This is a lightweight and effective approach, but it lacks many capabilities. It can't do subpixel alignment or arbitrary subpixel layout, as demonstrated in the article. Zooming carries a heavy performance penalty, and more complex transforms like skew, rotation, or 3D transforms can't be done in the text rendering engine. If you must have rotated or transformed text, you are stuck resampling bitmaps, which looks terrible because it destroys all the small features that make text legible.

Why the lack of advancement? Maybe it's just too much work and too much risk for too little gain. Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering? It would be a daunting task. Rendering glyphs is one thing, but how about handling line breaking? It seems like it would require a lot of communication between the CPU and the GPU, which is slow, and deep integration between the software and the GPU, which is difficult.

> Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering? […] Rendering glyphs is one thing but how about handling line breaking?

I’m not sure why you’re saying this: text shaping and layout (including line breaking) are almost completely unrelated to rendering.

> Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering?

https://github.com/servo/pathfinder uses GPU compute shaders to do this, which has way better performance than trying to fit this task into the hardware 3D rendering pipeline (the SDF approach).

> Can you imagine rewriting a modern web browser engine to use GPU-accelerated text rendering?

It is tricky, but I thought they already (partly) do that. https://keithclark.co.uk/articles/gpu-text-rendering-in-webk... (2014):

“If an element is promoted to the GPU in current versions of Chrome, Safari or Opera then you lose subpixel antialiasing and text is rendered using the greyscale method”

So, what’s missing? Given that comment, at least part of the step from UTF-8 string to bitmap can be done on the GPU, can’t it?

  • The issue is not subpixel rendering per se (at least if you're willing to go with the GPU compute shader approach, for a pixel-perfect result), it's just that you lose the complex software hinting that TrueType and OpenType fonts have. But then the whole point of rendering fonts on the GPU is to support smooth animation, whereas a software-hinted font is statically "snapped" to the pixel/subpixel grid. The two use cases are inherently incompatible.

Just for the record, text rendering - including with subpixel antialiasing - has been GPU-accelerated on Windows for ages, and in Chrome and Firefox for just as long. Probably Safari too, but I can't testify to that personally.

The idea that the state of the art, or what's actually being shipped to customers, hasn't advanced is false.

SDF is not a panacea.

SDF works by encoding a localized _D_istance from each pixel to the nearest edge of the character as a _F_ield, i.e. a 2D array of data, with a _S_ign indicating whether that pixel lies inside or outside the character. Each character gets its own little map of data, and those maps are packed together into an image file of some GPU-friendly type (generically called a "map" when it doesn't represent an image meant for human consumption), along with a descriptor file recording where each character's sub-image sits within that image, so the SDF rendering shader can find it.

This definition of a character turns out to be very robust against linear interpolation between field values, enabling near-perfect zoom capability for relatively low resolution maps. And GPUs are pretty good at interpolating pixel values in a map.
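
For a concrete sense of how little per-pixel work the rendering step involves, here is a rough CPU-side sketch in Python of what an SDF text shader does: bilinearly sample the field, then turn the signed distance into coverage with a smoothstep. The names, the atlas layout, and the "0.5 means the glyph edge" encoding (as in Valve's paper) are assumptions for illustration, not anything taken from the article.

    # Rough CPU-side sketch of the per-pixel work an SDF text shader does.
    # Assumes distances are stored remapped to [0, 1] with 0.5 at the glyph
    # edge (as in Valve's 2007 paper); `atlas` and `smoothing` are
    # illustrative names, not from the article.

    def bilinear_sample(atlas, u, v):
        # Sample the SDF (a 2D list of floats in [0, 1]) with bilinear
        # filtering, which the GPU's texture unit normally does for free.
        h, w = len(atlas), len(atlas[0])
        x, y = u * (w - 1), v * (h - 1)
        x0, y0 = int(x), int(y)
        x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
        fx, fy = x - x0, y - y0
        top = atlas[y0][x0] * (1 - fx) + atlas[y0][x1] * fx
        bot = atlas[y1][x0] * (1 - fx) + atlas[y1][x1] * fx
        return top * (1 - fy) + bot * fy

    def smoothstep(edge0, edge1, x):
        t = max(0.0, min(1.0, (x - edge0) / (edge1 - edge0)))
        return t * t * (3 - 2 * t)

    def glyph_coverage(atlas, u, v, smoothing=0.05):
        # Convert the sampled distance into coverage (0 = outside, 1 = inside),
        # with a small smooth band around the edge for antialiasing.
        d = bilinear_sample(atlas, u, v)  # 0.5 is the glyph edge
        return smoothstep(0.5 - smoothing, 0.5 + smoothing, d)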

But most significantly, those maps have to be pre-processed during development, from an existing font, for every character you care to render. Every. Character. Your. Font. Supports. It's significantly less data than rendering every character to a high-resolution bitmap font, but it's also significantly more data than the font's contour definitions themselves.

Anything that wants to support all the potential text of the world--like an OS or a browser--cannot use SDF as its text rendering system, because it would require SDF maps for the entire Unicode character set. That would be far too much data to ship. It really only works for games because games can (generally) get away with not being localized very well, not displaying completely arbitrary text, and so on.
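
To put rough numbers on that (the per-glyph resolution below is an assumption picked purely for illustration, not a figure from the article or from any shipping engine):

    # Back-of-envelope sketch of pre-baking SDFs for all of Unicode.
    glyph_texels    = 64 * 64      # one small single-channel SDF per glyph (assumed size)
    bytes_per_texel = 1            # 8-bit distance value
    assigned_chars  = 150_000      # roughly the number of assigned Unicode characters

    atlas_bytes = glyph_texels * bytes_per_texel * assigned_chars
    print(atlas_bytes / 1e6, "MB") # ~614 MB, before any descriptor/metadata overhead

A complete CJK font, whose contour data covers tens of thousands of glyphs, is by comparison typically on the order of tens of megabytes.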

The original SDF also cannot support emoji, because it only encodes distance to the edges of a glyph and nothing about the color inside it. Multi-channel variants exist (MSDF, whose extra channels are mainly used to preserve sharp corners), but the number of colors any such scheme can represent is very limited.

Indeed, if you look closely at games that A) use SDF for in-game text and B) have chat systems in which global communities interact, you'll very likely see differences in text rendering between the in-game text and the chat.

  • If I understand correctly, the author's approach doesn't really have this problem, since only the glyphs actually being used are uploaded to the GPU (at runtime). Yes, you still have to pre-compute them for your font, but that should be fine.

    • but the grandparent post is talking about a browser - how would a browser pre-compute a font, when the fonts are specified by the webpage being loaded?

  • Why not prepare SDFs on-demand, as the text comes in? Realistically, even for CJK fonts you only need a couple thousand characters. Ditto for languages with complex characters.

    • Generating SDFs is really slow, especially if you can't use the GPU to do it, and the faster algorithms tend to produce fields with glitches in them (see the rough cost sketch below).
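
      As a rough sketch of why the naive approach is slow (the binary-bitmap input and brute-force search below are illustrative assumptions, not how any particular generator works): every output texel scans the whole glyph image for the nearest pixel of opposite coverage, so the cost grows roughly with the square of the pixel count.

        import math

        def naive_sdf(bitmap, spread=8):
            # Brute-force SDF from a binary glyph bitmap (list of 0/1 rows).
            # Each texel searches every other texel for the nearest one with
            # opposite coverage -- O((w*h)^2) -- which is why real generators
            # use smarter sweeps or the GPU instead.
            h, w = len(bitmap), len(bitmap[0])
            field = [[0.0] * w for _ in range(h)]
            for y in range(h):
                for x in range(w):
                    inside = bitmap[y][x]
                    nearest = float("inf")
                    for yy in range(h):
                        for xx in range(w):
                            if bitmap[yy][xx] != inside:
                                nearest = min(nearest, math.hypot(xx - x, yy - y))
                    signed = nearest if inside else -nearest
                    # clamp to +/- spread, remap to [0, 1] with 0.5 at the edge
                    field[y][x] = 0.5 + max(-spread, min(spread, signed)) / (2 * spread)
            return field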

> complex transforms like skew, rotation, or 3D transforms can't be done

Good. My text document viewer only needs to render text in straight lines left to right. I assume right to left is almost as easy. Do the Chinese still want top to bottom?

  • If you work with ASCII-only monospaced-only text, then yeah sure. It gets weird real quick outside of those boundaries.

  • > Good. My text document viewer only needs to render text in straight lines left to right.

    Yes, inconceivable that somebody might ever want to render text in anything but a "text document viewer"!