Comment by paulgb

3 months ago

This morning I got nerd-sniped by this comment[1] on the Open Heart Protocol about encoding arbitrary-length data in an emoji.

I wanted to try it out and sure enough, you can encode an arbitrary string message in one Unicode character!

The approach I took is to encode the string as UTF-8, and then turn each byte into a pair of “variation selector” characters, one per 4-bit half of the byte (the Variation Selectors block, U+FE00 to U+FE0F, conveniently has 16 codepoints, and 16x16 = 256). Variation selectors don't show up visually, but are retained when the character is copied/pasted.
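For anyone who wants to see the idea in miniature, here is a rough Python sketch of the scheme described above. It is not the linked code. The function names, the high-nibble-first ordering, and the choice to ignore anything outside U+FE00 to U+FE0F when decoding are all my own assumptions for illustration.

```python
# Sketch only: names and nibble order are assumptions, not the linked code.
VS_BASE = 0xFE00  # VS1..VS16 occupy U+FE00..U+FE0F


def encode(base: str, message: str) -> str:
    """Append two variation selectors per UTF-8 byte of `message` to `base`."""
    out = [base]
    for byte in message.encode("utf-8"):
        out.append(chr(VS_BASE + (byte >> 4)))    # high nibble (assumed first)
        out.append(chr(VS_BASE + (byte & 0x0F)))  # low nibble
    return "".join(out)


def decode(text: str) -> str:
    """Recover the hidden message from the variation selectors in `text`."""
    nibbles = [
        ord(ch) - VS_BASE
        for ch in text
        if VS_BASE <= ord(ch) <= VS_BASE + 0x0F
    ]
    if len(nibbles) % 2 != 0:
        raise ValueError("odd number of variation selectors")
    data = bytes(
        (hi << 4) | lo for hi, lo in zip(nibbles[0::2], nibbles[1::2])
    )
    return data.decode("utf-8")


if __name__ == "__main__":
    hidden = encode("\N{SMILING FACE WITH SMILING EYES}", "hello")
    print(len(hidden))     # base character plus two selectors per byte
    print(decode(hidden))
```

Decoding this way only works if every selector survives the trip, so a platform that drops them will produce either an error or an empty message.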

Some platforms normalize the character, which loses the hidden data, but others don't (Google Docs in particular does not!)

Here's the code: https://news.ycombinator.com/item?id=42823876