I thought somehow the animation was playing "by itself," but I guess it was accomplished by holding down the '.' key? The font code swaps a run of N dots with the glyph corresponding to the Nth frame of animation.
Fontemon [0] makes this a bit more obvious by including a web page with the font embedded, so we can control the animation by typing, rather than watching someone else type. However, mmulet embeds some sort of Blender project [1], rather than a wasm binary to accomplish the font shaping.
[0] https://www.coderelay.io/fontemon.html
[1] https://github.com/mmulet/code-relay/blob/main/markdown/Tuto...
So, the Blender project was just used to create the game (set up each decision tree and the position of all images). From there I compile everything into complex ligatures in the GSUB table. The wasm binary feature wasn’t around when I made Fontemon, but it looks like it would have made development a lot easier!
Yeah, in retrospect I really should have added something like an overlay so it was possible to see what keys I pressed.
As someone completely ignorant of the inner working of fonts, how is this different from ligatures? Those also produce special glyphs based on combinations.
It's exactly ligatures. I presume the author avoided that term so you didn't confuse it with the ligatures that are included in Unicode.
Yes, these are exactly ligatures. I just became aware that it was possible to use wasm for them in HarfBuzz. If you want to see some wasm examples that are not possible with ligatures, have a look at https://github.com/harfbuzz/harfbuzz-wasm-examples
Obviously, having WASM in font files feels unsafe, but I'm also aware that font layout engines are already Turing complete, which leads me to wonder: have there been any high-profile malware font examples? That entire stack feels a lot like an attack surface to me, especially given that Windows used to render fonts in the kernel layer.
Multiple iOS jailbreaks--both by comex--were buffer overflows of the virtual machine stack due to bugs in how a few instructions were handled in FreeType's implementation of TrueType font hinting. The resulting exploit was embedded in a PDF file (which was itself deployed by a website), but that was just a convenient way to embed the font and trigger very deterministic hinting: the bug wasn't in the PDF renderer, per se (though I imagine a lot of people were confused on that front in the popular press about the issue).
He open sourced the exploit concurrent to the website going up, and it was immediately adjusted for use against different targets (including Foxit Reader or something like that on Windows), and as FreeType was used by a lot of Linux distributions in addition to iOS, I imagine it was used in a ton of malware (which might or might not have been "high profile"). I actually use those vulnerabilities as a case study in the ethical trade-offs of open source weaponization in my talks.
(There were two such jailbreaks, as there were/are separate implementations of two similar yet slightly different virtual machine versions, each of which had bugs that I remember to be related to the same fundamental mistake; and--as you can read about in another big thread on this website today, most developers think coming up with difficult abstractions isn't worth their effort and would rather fix things by playing whack-a-mole.)
Wasn't there also a Telugu glyph that could in some weird corner cases brick an iPhone?
Font layout engines are only Turing-complete if the stack is unbounded (to be fair: that's true of actual computers too: they're not Turing-complete because they don't have infinite RAM), and AFAIK the major font engines all impose a quite strict limit on the stack size.
I suppose with WASM you can just write an infinite loop?
Font files already have embedded code for hinting. So while this might increase the attack surface somewhat, it was already there and I honestly trust wasm execution more than the severely underdocumented and poorly understood hinting VM.
https://www.truetype-typography.com/tthints.htm
Wasm is sandboxed, so it's not really any different than rendering a web view inside an app.
Note the author had to modify GIMP to get it to run the wasm. It's not something most apps would allow just for font rendering.
I only had to enable it in HarfBuzz, as GIMP uses dynamic linking. So luckily I did not have to build it as well.
> That entire stack feels a lot like an attack surface to me, especially given stuff like the fact that windows used to render fonts in the kernel layer.
Indeed. https://googleprojectzero.github.io/0days-in-the-wild/0day-R...
Something like that is the reason that most OSes have dropped support for PostScript Type 1 fonts.
The blog post talks a lot about how he got the frames into the font, but very little about how the animation works.
AFAICT this is how it is done (edit: I am wrong, it uses Wasm):
- The frames of the video are simply stored as glyphs in the font
- There is a ligature mapping for sequences of dots to glyphs (for example "." is mapped to glyph 1, ".." is mapped to glyph 2, "..." is mapped to glyph 3, etc.)
- If you use the font in an editable part of the browser and hold the "." key pressed, dots get added by autorepeat and a growing sequence of dots is inserted. This sequence of dots is converted by the font's ligature mapping to different animation frame glyphs, thus showing the animation.
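To make the list above concrete, here is a minimal Python sketch of that static-ligature idea. It is only a model: the glyph names `frame1`, `frame2`, ... and both function names are made up for illustration, and a real font would store this mapping in its GSUB table rather than in a dictionary.

```python
def build_ligature_table(frame_count):
    """Map a run of n dots to a hypothetical glyph name 'frame{n}'."""
    return {"." * n: f"frame{n}" for n in range(1, frame_count + 1)}


def shape(text, ligatures):
    """Greedy longest-match substitution, loosely like a ligature lookup."""
    longest = max(map(len, ligatures), default=0)
    glyphs = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so "..." wins over "." + "..".
        for size in range(min(longest, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in ligatures:
                glyphs.append(ligatures[chunk])
                i += size
                break
        else:
            # No ligature starts here: keep the character unchanged.
            glyphs.append(text[i])
            i += 1
    return glyphs
```

Typing one more dot changes which entry matches, so the displayed glyph advances by one frame per keystroke (or per autorepeat tick).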
I have no idea why WASM and HarfBuzz are needed (it should work in any modern browser without them), but it looks like a fun little experiment.
A new experimental feature of HarfBuzz allows the font to include WASM code for the shaper within the font itself. So the code shown in the post is inside the font and getting run "live," rather than being something that generated or modified the ligature tables in the font file in advance.
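As a rough illustration of what "running live" buys you, the sketch below computes the frame glyph from the run length instead of looking it up in a prebuilt table. This is not the author's Rust code and not the HarfBuzz Wasm API; the glyph IDs, the frame count, and the clamping behaviour are all assumptions chosen for the example.

```python
DOT_GLYPH = 1            # hypothetical glyph ID of "."
FIRST_FRAME_GLYPH = 100  # hypothetical glyph ID of animation frame 1
FRAME_COUNT = 50         # hypothetical number of frames in the font


def shape_buffer(glyph_ids):
    """Replace each run of N dots with the single glyph for frame N."""
    shaped = []
    i = 0
    while i < len(glyph_ids):
        if glyph_ids[i] != DOT_GLYPH:
            shaped.append(glyph_ids[i])
            i += 1
            continue
        run_start = i
        while i < len(glyph_ids) and glyph_ids[i] == DOT_GLYPH:
            i += 1
        run_length = i - run_start
        # Assumption: runs longer than the animation stay on the last frame.
        frame = min(run_length, FRAME_COUNT)
        shaped.append(FIRST_FRAME_GLYPH + frame - 1)
    return shaped
```

The point of the comparison is that no per-frame substitution rule has to be stored: the mapping is a few lines of arithmetic, whatever the number of frames.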
I wondered myself about just using "simple" ligatures, but I don't know whether or not it's feasible to statically store several thousand ligature definitions in a font that are each mostly runs of several thousand characters being substituted. But maybe? OpenType has mysterious depths.
Should be no problem. GSUB lookup type 4.1* uses a uint16 to store the number of ligatures, so 65000 ligatures should be feasible. To store the actual glyphs, 32-bit offsets are used, so you theoretically have 2 GB of space available, which should be plenty (although I have never seen a font larger than 15 MB).
Using Wasm for this animation really is overkill IMHO.
*) https://learn.microsoft.com/en-us/typography/opentype/spec/g...
Edit: IIRC Ligatures are applied recursively, so you can have a ligature based on other ligatures. If I am right here, each ligature can consist only of two glyphs (the glyph of the previous animation frame followed by a dot). This would keep the GSUB table small.
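A small Python model of that two-glyph idea, with a count of how many component glyphs each approach would need. It is only a sketch of the suggestion: the `frame{k}` names and the functions are hypothetical, and the left-to-right fold here does not claim to reproduce the order in which a real shaper applies GSUB lookups.

```python
def build_pair_rules(frame_count):
    """One rule per frame: (previous frame glyph, dot) -> next frame glyph."""
    return {(f"frame{k}", "."): f"frame{k + 1}" for k in range(1, frame_count)}


def shape_chained(text, rules):
    """Fold the input left to right, growing the current frame dot by dot."""
    glyphs = []
    for ch in text:
        if glyphs and (glyphs[-1], ch) in rules:
            glyphs[-1] = rules[(glyphs[-1], ch)]  # advance to the next frame
        elif ch == ".":
            glyphs.append("frame1")  # a lone dot starts at frame 1
        else:
            glyphs.append(ch)
    return glyphs


def component_counts(frame_count):
    """Component glyphs needed: one full run per frame vs. pairwise rules."""
    naive = frame_count * (frame_count + 1) // 2  # 1 + 2 + ... + frame_count
    chained = 2 * (frame_count - 1)               # two glyphs per rule
    return naive, chained
```

For a hypothetical 6,000 frames, `component_counts` gives roughly 18 million components for the one-run-per-frame table versus about 12 thousand for the pairwise rules, which is the size saving the edit above is pointing at.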
So the WASM code of the font is just for a dynamic ligature engine?
This reminds me of a torrent on Nyaa that implemented Bad Apple!! in ASS subtitles by retracing the frames into SVG (seems to be better quality than simply using potrace by itself), converting the SVGs into ASS Dialogue events, and muxing it into a Matroska container with a placeholder video. Therefore the "video" window can be resized without rescaling raster images (and it actually runs well on most hardware and players, unlike his other torrents putting whole anime episodes in subtitles). The subtitle attachment could also be extracted from the container and executed as a valid shell script which would run mpv or ffplay, use the respective options to create a libavfilter filter to create a blank video (to overlay the subtitles on), use its own filename for subtitles, and play the song by decoding and piping a base64 string at the bottom of the script to mpv/ffmpeg's stdin.
The post also links to some examples of using Wasm to solve complex typographical problems, which I found interesting.
https://github.com/harfbuzz/harfbuzz-wasm-examples
Okay, now you have the frames as glyphs in the font, but how are they going to animate? The most interesting part of the explanation is missing.
The glyph corresponds to how many (e.g.) dots you have in a row and the author set key-repeat at 30/s and held down the . key.
...okay, that is less automatic than I was led to expect. I mean, it's still cool, I guess.
In the middle of the article you see a line "RUST Full code for character replacement". If you click on that, it will show you the Wasm code.
It looks like it uses Wasm to replace a sequence of dots with a glyph from the font, which shows a frame from the animation, similar to ligatures, but using Wasm. You could do the same by storing the SVG paths for each animation frame in an array and then using JavaScript to iterate over and display these paths, but this uses Wasm, HarfBuzz and a font.
More importantly, seeing the other comments, it uses those and a keyboard.
I came here to ask the same thing, then saw interroboink's guess in another comment and it seems plausible: https://news.ycombinator.com/item?id=37317687
Stupid question: Is this an entire blog post about an animated font without showing the animation in action, or does it simply not work on my device? (iOS 15) I’m not sure where to look.
They posted a YouTube video near the end of the blog post showing the result. See:
https://m.youtube.com/watch?v=GF2sn2DXjlA
Yes, it is, and that's because the trick is that it relies on ligatures to combine series of dots into frames. One '.' shows the first frame, '..' the second, '...' the third, you get the idea.
The only way to animate the font is thus to hold down the . key, which you can't really do in a blog post, at least without some custom JavaScript.
Sure, but they could have embedded the video in the page, or at the very least included some screenshots of individual frames.
Yep. I'm not seeing anything on Firefox on Windows either.
I don’t understand the WASM part at all and I feel dumb
How can WASM be in a font? A font is a font, not a WASM file. It’s a different format.
In theory, a font is purely a set of vector graphics. In practice, just rasterizing vector graphics usually doesn't lead to good results at small font sizes combined with low pixel density, so the vector graphics need to be adjusted to better fit the pixel grid. There are multiple ways to do it; one of them is to write a script that adjusts the graphics so they fit the pixel grid better. For example, TrueType fonts contain a virtual machine that is capable of just that [1]. By conceptual extension, a font format might just as well contain a full-blown virtual machine with potentially a program per glyph, and WASM is a reasonable candidate for something like that.
- [1] https://learn.microsoft.com/en-us/typography/truetype/hintin...
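To give a feel for what "adjusting to the pixel grid" means, here is a toy Python sketch that snaps a vertical stem to whole pixels. It only illustrates the effect hinting aims for; it is not the TrueType instruction set, and the function names and the "at least one pixel wide" rule are assumptions made for the example.

```python
def to_pixels(font_units, units_per_em, pixels_per_em):
    """Scale a coordinate from font design units to (fractional) pixels."""
    return font_units * pixels_per_em / units_per_em


def grid_fit_stem(left_units, right_units, units_per_em, pixels_per_em):
    """Snap a stem's left edge and width to whole pixels."""
    left = to_pixels(left_units, units_per_em, pixels_per_em)
    right = to_pixels(right_units, units_per_em, pixels_per_em)
    # Assumption: a stem should never vanish, so keep it at least 1 px wide.
    width = max(1, round(right - left))
    snapped_left = round(left)
    return snapped_left, snapped_left + width
```

With 1000 units per em rendered at 12 pixels per em, a stem from 430 to 470 units is only about half a pixel wide before fitting; the sketch widens it to one full pixel so it does not blur or disappear.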
But then nobody can interpret the machine, only software that you need to write on your own, which makes the whole exercise useless as you can just read the wasm directly?
I’m surely missing something btw
WASM is a binary format and can be embedded in other data formats.
But then what will actually interpret the WASM? The font reader? That doesn’t have WASM compatibility?
Hey, I'm the person who made SmilApple (typo'd here as SmllApple)! Super great to see my quick and dirty project used like this, I just wish I'd thought of it first
https://github.com/Eiim/SmilApple
I always love a reference to Bad Apple
Even though I feel somewhat familiar with most Internet culture, I've never come across the Bad Apple meme, and I've just spent a good hour down the YouTube rabbit hole watching variations of Bad Apple. Fantastic stuff
Part of the reason why it became such a cultural phenomenon is the permissive license governing the use of the characters:
https://touhou.fandom.com/wiki/Touhou_Wiki:Copyrights
Just a warning, getting into Touhou PVs is a good way to lose a month.
It's a convenient double-reference, given Apple's history of well-known font-rendering exploits: https://news.ycombinator.com/item?id=16177535
compute + input + visual output: can it run DOOM? visual output: can it animate Bad Apple?
Cool idea. Something very similar is possible without WASM in the font renderer. I remember a font that displayed its own size, so it changed zooming in and out. I think you could adapt that to play a short animation.
In the future, we have given up on fonts and unicode...
All communication will be through sequences of SVG images and animations.
Picard and Riker sitting side by side, their faces buried in their hands.
A bit off-topic: I recently searched for a technical-looking font that is not monospaced. In principle I like the design of the blog, but reading prose in a monospaced font is just not very pleasant, I think. All the (freely available) coding fonts I found only have monospaced variants, and all the non-monospaced fonts didn't look like coding fonts. Any ideas?
Does your browser have a "Reader" mode?
There are font systems that target code and aren't monospace.
An example is Input: https://input.djr.com/ - https://input.djr.com/preview/
The niche seems to be people who like to code without monospace, or who present code without monospace, e.g. in slides or in blog posts. Or if you want typographical consistency between non-code and code, as I understand you are suggesting.
I like "the Half-Life HUD font", which is some variant of something specified by the German DIN technical standards, from what I understand.
I have used the font "Alte DIN 1451 Mittelschrift" before, and it fits your requirements pretty well.
https://www.1001fonts.com/alte-din-1451-mittelschrift-font.h...
The Iosevka font family has quasi-proportional variants that might fit your needs.
Stuff like this does get me excited, as a web developer who's always wanted to explore graphics but never really dipped into native development.
On the other hand, I know enough to know that Chromium uses HarfBuzz and Skia to render a webpage that, in itself, is going to use another instance of HarfBuzz and Skia to render into a canvas element. Intuitively, it feels dirty.
Now how can you print this so it stays animated on the paper sheet?
You can also speak of their bad | in terminal
It shows how current font rendering systems have accumulated quite a bit of bloat.
This makes me nostalgic for bitmap fonts.
I don’t think this is really bloat.
OpenType supported ligatures in ’96. PostScript Type 1 and even Knuth’s TeX supported ligatures to a certain extent.
It’s a pretty standard base-level feature for any sort of typesetting.
Imo this is akin to making a terminal animation by outputting blocks of ascii art. It’s not that terminals added video playback support— which would be bloat—but instead someone pushed a standard feature to a novel extent.
The goal is to print with a computer something like a late 15th century humanist document, a tradition of typography 500 years old, and not to print on a 200x320 screen, which had a tradition of merely a decade.
The early computer age of the 80s and 90s was merely playing catch-up to established standards. The standards of the 80s and 90s are not what we wanted to achieve ultimately.
Same with cinema: we shot on 4K-equivalent film for the past 100 years. Only in the 80s and 90s, with computerization and videotape, did we have a temporary standard of 480i, which we have overcome with sheer computing power, and we’re back to where we actually wanted to be in the beginning.