I Stored a Website in a Favicon

13 hours ago (timwehrle.de)

100 comments

theanonymousone

cfrs 5 minutes ago

That reminded me of Inigo's "real pixel coding" https://www.youtube.com/watch?v=FvS_DG8yIqQ

A 256b intro coded by placing pixels in photoshop and saving into an exe.

Tepix 12 hours ago

Instead of going via pixels, why not use an SVG favicon, store the markup directly inside it, and extract it?

Use this favicon.svg:

    <svg xmlns="http://www.w3.org/2000/svg">
    <circle cx="50%" cy="50%" r="50%" fill="orange"/>
    <p>hello HN!</p>
    </svg>

Use this in your <head> to use an SVG favicon:

    <link id="favicon" rel="icon" href="favicon.svg" type="image/svg+xml">

Finally, use this in your <body> to extract it and add it to your document body:

    <script>
    fetch(favicon.href).then(r => r.text()).then(t => document.body.innerHTML += t.match(/<p[\s\S]*p>/)[0]);
    </script>

montebicyclelo 7 hours ago

"Why not [alternative]?" would be better framed as "here's a fun variation", because both approaches are just playing around with technology, for fun / curiosity / exploration. Storing in the pixels is a fun approach, resulting in something Rube Goldberg-esque.
weetii 12 hours ago
Hey, yeah, I wrote the article. This (of course) would be more practical. Thanks for pointing it out. I wanted the payload to "live" in actual pixel data rather than hidden text inside an XML file. That’s why I went this way :)
- cogman10 3 hours ago
  
  If you wanted to play around and do something a little more challenging (though you'd be bulking up the JavaScript), one thing you could do is play with a bespoke HTML compression. You could store each tag in 4 bits, e.g. `0001`: the first bit says whether the tag opens or closes, and the remaining 3 bits indicate which tag is being used (div/p/b/h1/etc). At least one of the values, like `0111`, would indicate that text follows, and another, like `1111`, that an unsupported tag follows.
  If you extend it out to 8 bits you can pretty nearly store all the html tags (it'd give you 256 tags to play with).
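
A rough JavaScript sketch of that 4-bit scheme, under several assumptions that are not in the comment: the tag table `TAGS` is made up, a text run is stored as a one-byte length followed by its UTF-8 bytes, and `1111` is only used as padding here rather than as a real unsupported-tag escape. Text is not HTML-escaped either.

```javascript
// Hypothetical tag table: 3 bits give 8 slots, and slot 7 is reserved below.
const TAGS = ["div", "p", "b", "h1", "a", "span", "i"];
const CLOSE = 0b1000;       // first bit set = closing tag
const TEXT = 0b0111;        // "text follows"
const UNSUPPORTED = 0b1111; // only used as padding in this sketch

// Tokens look like { open: "p" }, { close: "p" } or { text: "..." }.
function encode(tokens) {
  const nibbles = [];
  for (const token of tokens) {
    if ("text" in token) {
      const bytes = new TextEncoder().encode(token.text);
      if (bytes.length > 255) throw new Error("text run too long for a 1-byte length");
      nibbles.push(TEXT, bytes.length >> 4, bytes.length & 15);
      for (const b of bytes) nibbles.push(b >> 4, b & 15);
    } else {
      const closing = "close" in token;
      const index = TAGS.indexOf(closing ? token.close : token.open);
      if (index < 0) throw new Error("tag not in table");
      nibbles.push((closing ? CLOSE : 0) | index);
    }
  }
  if (nibbles.length % 2) nibbles.push(UNSUPPORTED); // pad to a whole byte
  const out = new Uint8Array(nibbles.length / 2);
  for (let i = 0; i < out.length; i++) {
    out[i] = (nibbles[2 * i] << 4) | nibbles[2 * i + 1];
  }
  return out;
}

function decode(bytes) {
  const nibbles = [];
  for (const b of bytes) nibbles.push(b >> 4, b & 15);
  let html = "";
  let i = 0;
  while (i < nibbles.length) {
    const n = nibbles[i++];
    if (n === UNSUPPORTED) break; // padding: nothing more to read
    if (n === TEXT) {
      const length = (nibbles[i] << 4) | nibbles[i + 1];
      i += 2;
      const raw = new Uint8Array(length);
      for (let j = 0; j < length; j++, i += 2) {
        raw[j] = (nibbles[i] << 4) | nibbles[i + 1];
      }
      html += new TextDecoder().decode(raw);
    } else {
      html += (n & CLOSE ? "</" : "<") + TAGS[n & 0b0111] + ">";
    }
  }
  return html;
}

const packed = encode([{ open: "p" }, { text: "hello HN!" }, { close: "p" }]);
console.log(packed.length, decode(packed)); // 12 <p>hello HN!</p>
```

The saving comes from the tags only; text-heavy pages would barely shrink with this layout, since every text run still costs its full byte length plus three nibbles of overhead.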
- peter-m80 12 hours ago
  
  The ICO file format allows icons at multiple resolutions, so it can hold a lot of data
  
chrismorgan 10 hours ago
Regular expressions? Ugh. Encode it properly as XML in the correct namespace, load it so, and take it from that.
Or just serve the SVG file and use <foreignObject> to embed the HTML, and include <link rel="icon" href=""> inside it. In theory you should be able to define a <view id="icon"> and use <link rel="icon" href="#icon">, but in practice neither Firefox nor Chromium seems to be handling that properly in a favicon, which is disappointing.
- Tepix 7 hours ago
  
  It's a hack. A one-liner. Go crazy with it. Or touch grass ;-)
  Oh yeah and favicon isn't part of the DOM.
reichstein 9 hours ago
Just because it's my windmill to tilt at: `[\s\S]` can be written shorter and more precisely as `[^]`.
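
For anyone who wants to check, a small JavaScript comparison (the sample string is made up). `[^]` is an empty negated class, which JavaScript treats as "match any character, newlines included"; as far as I know that reading is specific to JavaScript. The `s` (dotAll) flag is a third way to get the same result.

```javascript
// Made-up sample: the <p> element spans two lines.
const sample = "<svg>\n<p>hello\nHN!</p>\n</svg>";

const withClass = sample.match(/<p[\s\S]*p>/)[0];     // the original one-liner
const withEmptyNegation = sample.match(/<p[^]*p>/)[0]; // the shorter spelling
const withDotAll = sample.match(/<p.*p>/s)[0];         // dotAll flag (ES2018)
const plainDot = sample.match(/<p.*p>/);               // "." stops at newlines

console.log(withClass === withEmptyNegation); // true
console.log(withClass === withDotAll);        // true
console.log(plainDot);                        // null
```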
- MomsAVoxell 8 hours ago
  
  [\s\S] vs. [^]
  A quixotic windmill tilt if ever I saw one.
- GoToRO 8 hours ago
  
  ai says [^] is not portable; I did not test it. Too bad, I'll stick to [\s\S].
  
berkes 10 hours ago

An SVG can embed raster images: base64 encoded bytes.
So you could layer this experiment: favicon is svg, that contains encoded raster, whose bytes are encoded html.
At the very least it would make a mindboggling CTF step.

Retr0id 6 hours ago

> You still need a tiny bootstrap loader to decode the image.

Nope, you can do it all in a single file with an html/png polyglot (and nowadays you can get better compression ratios with newer formats like webp).

https://web.archive.org/web/20120801001616/http://daeken.com...

gildas 6 hours ago

You can even make the file compatible with ZIP (and PDF) on top of that, see https://github.com/gildas-lormeau/Polyglot-HTML-ZIP-PNG/raw/... (and https://github.com/gildas-lormeau/Polyglot-HTML-ZIP-PNG)

sheept 13 hours ago

You can use the favicon cache as storage too, by redirecting users across domains. It's been proposed as a potential fingerprinting risk[0], and if a browser naively reuses the cache for incognito mode, it could be used to track users across browser profiles.

[0]: https://www.schneier.com/blog/archives/2021/02/browser-track...

koolala 12 hours ago
Wasn't this fixed or mostly fixed?
ai_fry_ur_brain 10 hours ago

My thoughts instinctively went to "this has to be being used for fingerprinting" when I read OP's blog. Are anti-fingerprinting measures taking into account the use of the canvas API with favicons?
The link to the supercookie site is dead unfortunately.

Walf 12 hours ago

PNG has comment chunks tEXt, zTXt, and iTXt. You can have a completely normal image whose file is stuffed with as much content as you want. That is less fun, I suppose.
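
As a rough illustration of how little is needed, here is a JavaScript sketch that builds and reads back a single tEXt chunk (4-byte big-endian length, type, keyword, NUL separator, text, CRC-32 over type plus data). The helper names and the `html` keyword are made up for the example, and splicing the chunk into an actual PNG file is left out.

```javascript
// Standard CRC-32 (reflected polynomial 0xEDB88320), computed bit by bit.
function crc32(bytes) {
  let crc = 0xffffffff;
  for (const b of bytes) {
    crc ^= b;
    for (let k = 0; k < 8; k++) {
      crc = (crc & 1) ? (crc >>> 1) ^ 0xedb88320 : crc >>> 1;
    }
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// tEXt is Latin-1 only, so one byte per character is enough here.
function latin1(str) {
  return Uint8Array.from(str, (ch) => ch.charCodeAt(0) & 0xff);
}

// Hypothetical helper: returns the raw bytes of one tEXt chunk.
function makeTextChunk(keyword, text) {
  const typeAndData = latin1("tEXt" + keyword + "\0" + text);
  const chunk = new Uint8Array(4 + typeAndData.length + 4);
  const view = new DataView(chunk.buffer);
  view.setUint32(0, typeAndData.length - 4); // data length, big-endian
  chunk.set(typeAndData, 4);
  view.setUint32(4 + typeAndData.length, crc32(typeAndData));
  return chunk;
}

// Hypothetical helper: parses a chunk produced by makeTextChunk.
function readTextChunk(chunk) {
  const view = new DataView(chunk.buffer, chunk.byteOffset, chunk.byteLength);
  const body = chunk.subarray(8, 8 + view.getUint32(0));
  const separator = body.indexOf(0);
  // Spreading is fine for small payloads; large ones would need a loop.
  const toString = (bytes) => String.fromCharCode(...bytes);
  return {
    keyword: toString(body.subarray(0, separator)),
    text: toString(body.subarray(separator + 1)),
  };
}

const chunk = makeTextChunk("html", "<p>hello HN!</p>");
console.log(chunk.length, readTextChunk(chunk));
```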

weetii 12 hours ago
Yes, that would also work, thanks for pointing it out

franciscop 13 hours ago

Is this timing a coincidence? I just submitted, 1h ago (30 mins before this), a website I just made about storing your stock portfolio in a URL + favicon!

https://news.ycombinator.com/item?id=48606396

Tagbert 2 hours ago

Then there is this one. Seems to be a trend.
“Pong in a Favicon” https://news.ycombinator.com/item?id=48608681

esquivalience 12 hours ago

I found the aggressively staccato, clearly LLM-generated content extremely difficult to read.

k2enemy 8 hours ago

Halfway through I was sure that there would be a reveal at the end of the article that the article itself was stored in the site's favicon, thus explaining the short, terse sentences. I was genuinely disappointed when I realized it wasn't. Missed opportunity!
stevenhuang 13 minutes ago

There should be a pathology for thinking things must be LLM generated when it's simply not always the case.
People's ability to discern is completely fried.
benhill70 6 hours ago
I like the way it's written. I often write in a similar manner and I have never used LLMs to generate any writing for me. I have written exactly this way at work.
To me, the author is just trying to get to the point. They know people start skimming if there is too much text.
- SamBam 3 hours ago
  
  > The important catch
  > The favicon doesn't actually contain the whole website itself.
  This is the kind of thing that is extremely idiomatic LLM speak. There's nothing particularly wrong about it per se, but it just makes everyone who is familiar with LLMs say "oh, it's written by an AI" and it just becomes disappointing.
bstsb 11 hours ago
for the first time in a while on HN, i disagree with the characterisation as AI-generated. at most it was drafted with an LLM, but the final output is pretty human to me.
they used the wrong it’s/its, made But. its own one-word sentence, didn’t capitalise HTML, and used “okayy” in parentheses. all of this isn’t to criticise the writer - i enjoyed it more seeing these little imperfections that make up a blog post
- fortuitous-frog 9 hours ago
  
  Looks largely AI-written, with some human edits: https://www.pangram.com/history/9afe7542-1085-4264-9691-2172...
  FWIW -- I'm not as repulsed by it as the parent comment. But I do want to substantiate that it _is_ heavily LLM-written.
  (If you're unfamiliar, Pangram has garnered a reputation as the leading LLM-detector, with a minimal rate of false positives; IME this has come with the tradeoff of being easy to manipulate/tweak your way into turning an LLM-generated piece of text into reporting a false negative, but for most folks that's worthwhile.)
  
istjohn 4 hours ago

I found the writing engaging and enjoyable to read.
estetlinus 12 hours ago

It’s the new internet. So, so annoying.
scottmcdot 12 hours ago
Which bit? The short sentences?
- bonoboTP 5 hours ago
  
  Not just the length but the structure, the way the headlines are phrased, the use of "honestly", the "not X but Y", many things cumulatively, not one particular thing in itself. If you work a lot with LLM writing, you notice. Same way you recognize the writing style of famous authors. It's never one particular thing but many.
bonoboTP 6 hours ago

Agreed. Disappointing that more people don't notice it's AI.
noduerme 12 hours ago
Yeah, but it's kinda weird. The typical LLM headers and bullet points are there, but it's like someone took an axe to the rest of the spew. I too would rather read someone's original bad writing than their bad editing of AI writing, but it's kinda interesting how this all shakes out.
- netsharc 10 hours ago
  
  It doesn't seem to be LLM, but reads like one. The author is German, maybe it's a language expertise thing, maybe he likes the LLM style (unrelated to his nationality).
  But yeah, sentences that only have 3-4 words each feel like 3rd grade writing; I couldn't read it.
  
- darianvc 9 hours ago
  
  Might stop using bullet points to avoid being flagged as AI lol
  "Very small" -> yeah, this header is mostly AI generated. No hate against the author but this doesn't make any sense as a header
- bartvk 11 hours ago
  
  I wish people would include their prompts.

MomsAVoxell 8 hours ago

Oh, I am so aligned with this mentality:

    A monitor is storage.

    A keyboard is storage.

Forum posts are storage. Markov-approved tweaks in an edit, over time, certainly enough for quite a lot of storage. Dual-use storage to boot, since .. you know .. sometimes the comments are socially interesting.

Best thing is, nobody really knows if their chicken casserole recipe isn't just a handle to a carefully constructed GUID pointing across to .. let's say, for humor .. a thousand different forum postings ...

I do have to wonder if the author is familiar with PoC||GTFO, for this is certainly a technique one will find deep within the depths of the Alchemist Owls' holy tomes...

drob518 4 hours ago

Codes within codes. Wheels within wheels.

jorisw 10 hours ago

Fun Fact: You can use any inline SVG for a favicon and keep it right in the HTML document.

This also allows you to use an emoji directly as a favicon, like so:

  <link
    rel="icon"
    type="image/svg+xml"
    href="data:image/svg+xml,<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'><text y='.9em' font-size='90'>(your emoji here)</text></svg>"
  />

(HN isn't showing the emoji)

Timwi 1 hour ago

Just as a heads-up, if you do this and you want to use #rrggbb color codes or url(#id) links, you have to escape the # as %23, otherwise it gets parsed as a URL fragment and your SVG code is cut off there.
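
A small JavaScript sketch of that escaping step. The function name `svgToDataUri` and the sample SVG are made up for illustration; the `URL` parser is only used to show where the fragment would start if the `#` were left alone.

```javascript
// Hypothetical helper: builds a data: URI for an inline SVG favicon,
// escaping "#" so color codes and url(#id) links survive.
function svgToDataUri(svg) {
  return "data:image/svg+xml," + svg.replace(/#/g, "%23");
}

const svg =
  "<svg xmlns='http://www.w3.org/2000/svg' viewBox='0 0 100 100'>" +
  "<rect width='100' height='100' fill='#ff6600'/></svg>";

const rawUri = "data:image/svg+xml," + svg; // unescaped, for comparison
const uri = svgToDataUri(svg);

// Unescaped: everything from "#" onwards is parsed as the fragment.
console.log(new URL(rawUri).hash.startsWith("#ff6600")); // true
// Escaped: no fragment, so the whole SVG stays in the URI body.
console.log(new URL(uri).hash === ""); // true
```

The result can then go into the `href` of the `<link rel="icon">` element.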

inglor 8 hours ago

Cool! Here is a GH repo demonstrating unbounded favicons I made 11 years ago - it crashes some browsers - wanna guess how long it took each one to fix it :D https://github.com/benjamingr/favicon-bug

echoangle 5 hours ago

> The length header is important because the image itself may contain unused pixels at the end. If there's no length value, there's no way to know where the real payload stops.

Not really, can’t you just pad with 0 bytes and stop reading when you encounter one that’s not part of the current Unicode codepoint?

zahlman 2 hours ago
Zero bytes won't ever be part of a multi-byte character in UTF-8. They simply represent code point 0 (which is valid, but wouldn't appear in normal text) by themselves.
- echoangle 1 hour ago
  
  Ah, even better, so you can just use null-terminated strings.
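
A minimal JavaScript sketch of the null-terminated variant discussed here, assuming the payload is UTF-8 text that never contains U+0000. The function names and the 48-byte capacity are made up; in the article's setup the capacity would be whatever the pixel data can hold.

```javascript
const encoder = new TextEncoder();
const decoder = new TextDecoder();

// Hypothetical helper: UTF-8 encodes the text and zero-pads it to `capacity`.
function padPayload(text, capacity) {
  const bytes = encoder.encode(text);
  if (bytes.includes(0)) throw new Error("payload must not contain U+0000");
  if (bytes.length >= capacity) {
    throw new Error("payload needs at least one spare byte for the terminator");
  }
  const out = new Uint8Array(capacity); // typed arrays start zero-filled
  out.set(bytes);
  return out;
}

// Hypothetical helper: reads up to the first zero byte, no length header.
function readPayload(bytes) {
  const end = bytes.indexOf(0);
  return decoder.decode(bytes.subarray(0, end === -1 ? bytes.length : end));
}

const stored = padPayload("<p>héllo ✓</p>", 48);
console.log(readPayload(stored)); // <p>héllo ✓</p>
```

This works because, as noted above, a zero byte never appears inside a multi-byte UTF-8 sequence, so the first zero found is always the terminator.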

divvsaxena 4 hours ago

This is one of those projects that's completely impractical but makes the web more interesting. I love seeing people explore weird constraints just to see what's possible.

tetrisgm 10 hours ago

Love it. Did you see the old effort to store the page in the url? https://github.com/jstrieb/urlpages

purple-leafy 10 hours ago

That’s awesome. I took this a bit further a few years ago by making a URL-only notepad quine that creates itself as you add data to it, and that can be saved as a bookmarklet. You have to watch the GIF to understand.
https://github.com/con-dog/serverless-architecture

berkes 10 hours ago

I'd imagine the (aggressive) caching of the favicon by browsers makes it a challenge, but you could generate the favicon dynamically, then have JS extract the payloads sequentially. Basically streaming arbitrarily large content to a webpage via favicons, in blocks of 239 bytes.

It may be a fun, novel way to proxy webpages that are otherwise blocked. Though, I guess, the service rendering the favicons can just as easily be blocked then.
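
The splitting and reassembly part of that idea could be sketched in JavaScript as below. Only the 239-byte block size comes from the comment; the function names and sample page are made up, and the actual favicon generation, fetching and cache-busting are not shown.

```javascript
const BLOCK_SIZE = 239; // figure taken from the comment above

// Hypothetical helper: cuts a payload into favicon-sized blocks.
function splitIntoBlocks(bytes, size = BLOCK_SIZE) {
  const blocks = [];
  for (let i = 0; i < bytes.length; i += size) {
    blocks.push(bytes.subarray(i, i + size));
  }
  return blocks;
}

// Hypothetical helper: concatenates the blocks in the order they arrived.
function joinBlocks(blocks) {
  const total = blocks.reduce((sum, block) => sum + block.length, 0);
  const out = new Uint8Array(total);
  let offset = 0;
  for (const block of blocks) {
    out.set(block, offset);
    offset += block.length;
  }
  return out;
}

const page = new TextEncoder().encode("<p>" + "hello HN! ".repeat(100) + "</p>");
const blocks = splitIntoBlocks(page);
console.log(page.length, blocks.length); // 1007 5
```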

Gabrys1 7 hours ago

Have an index.html that's also served (byte-for-byte equal) as favicon.ico. If that page "works" and the favicon doesn't show garbage, it is a website stored in a favicon (by my standards).

herodoturtle 9 hours ago

How long before someone ports DOOM into a favicon? ^_^

(For the technical gurus here, would that even be possible?)

shakna 9 hours ago

You can already play it in a favicon [0].
But as favicons can be svgs, and let you store foreign objects... You could store the whole thing in the favicon, but might also need a line of JS to extract it.
[0] https://vidferris.github.io/FaviconDoom/
drob518 4 hours ago

3… 2… 1…

titularcomment 9 hours ago

Painful read.

Related interesting project: https://github.com/EtherDream/web2img

brtkwr 8 hours ago

Hmm this is cool but what are the practical use cases?

It didn’t load first time round on my browser (Brave) without disabling its prevent tracking feature…

MomsAVoxell 8 hours ago

Practical use cases for stashing data in places people least expect it?
Wallet password.
New ecosystem for the kids.
That's two, at least.

momoraul 8 hours ago

The browser already asks for the favicon on every page. Might as well put it to work.

superjose 13 hours ago

Pretty cool tbh!!! Would have loved seeing the decoder code!!!

It's also pretty interesting to think how an attacker could exploit images on his behalf. Never thought that would be a way!!!

Thanks!

schobi 13 hours ago
I guess the decoder is more than the 208 bytes that this page uses..
But maybe you can misuse this and store a session ID / cookie in a favicon (give everyone a unique one) and survive some cookie cleanup and evade privacy restrictions?
Maybe you could still make the favicon look a little like an image, so as not to raise suspicion?
Favicons seem to be cached across private browsing sessions. Oh no
- RetroTechie 9 hours ago
  
  I'm tempted to think that only someone working for a company in the advertising industry could come up with that.
  Must EVERYTHING be polluted by ad tech & privacy intrusions?

franze 7 hours ago

also https://pong-in-a-favicon.franzai.com/ for further favicon (mis)use

franze 7 hours ago

and the obvious Doom Example https://vidferris.github.io/FaviconDoom/favicondoom.html

Izmaki 9 hours ago

Wait 'til the author discovers that you can use ping (ICMP) to transfer data, too! :)

beardyw 12 hours ago

I would have used a minimal service worker to unpack the web data and present it as if it were just a normal page being loaded.

soanvig 11 hours ago

Honestly it didn't interest me, but I do remember, from back in the day, full websites rendered by a browser from... empty files. https://mathiasbynens.be/notes/css-without-html

aaubry 9 hours ago

A neat improvement would be to make the decoder into a bookmarklet. This would avoid the overhead of serving the script. Of course you would rely on the user having the bookmarklet installed, but when you serve HTML you also rely on the user having a web browser installed.

bozdemir 12 hours ago

Very cool. I wonder if it's possible to make a simple game by also leveraging WebAssembly?

netsharc 10 hours ago

https://violet78910.github.io/faviconSnake/
weetii 12 hours ago
Yes, probably. I guess, you’d need a bigger favicon since the minimal Rust WASM binary is around 20KB+ (?)
- alex_suzuki 11 hours ago
  
  You might find my tinkering useful: https://strich.io/blog/posts/embedding-webassembly-in-qrcode... A QR code isn’t much different from a favicon I guess. :)
  

abc123abc123 8 hours ago

Bravo Maestro! Wonderful performance! =)

frankzero 8 hours ago

I personally won't do things this way, but this is really cool and I could see the applications already.

ab_wahab01 12 hours ago

Fascinating concept! Thanks for sharing this!

fitsumbelay 12 hours ago

Very cool and interesting. After reading just the title, I wrongly assumed this would be about SVG.

neon_me 11 hours ago

Is it cake? Game for devs.

scoot 12 hours ago

Would have been more fun if the blogpost was rendered from the favicon.

charcircuit 9 hours ago

You can literally just use the file itself as the favicon. There is no need to overcomplicate it.

    cp index.html favicon.png

deemwar 8 hours ago

looks good

jibal 12 hours ago

Surprised that a minimal "website" only requires a small image = few pixels = few bytes to store it? Um, ok.

shaharamir 12 hours ago

Amazing!

fvckaigotohell 6 hours ago

AI generated garbage. Blocked