Supercookie: Browser Fingerprinting via Favicon (2021)

3 months ago (github.com)

At some point we need actual consequences for sites that intentionally hide their tracking. It should be criminal. It is stalking and has real world consequences. Just because an exploit exists doesn't mean it should be used. That logic is like saying it is OK to break into a house because the lock on the door was weak. If we don't get real protections, at what point does it become justified to go offensive against sites that exploit things like this? If I found someone putting trackers on me with the intent to sell that information (harm me) I would defend myself. When am I allowed to do that in the digital world?

Quick side note here. I appreciate the research calling this out. We need to know the dangers out there to figure out how to protect ourselves, especially since governments don't seem to take this seriously.

  • I think two things keep the status quo where the end-user is exploited and attacked constantly. The first is the VC / Startup model. Because VC is the true customer, and not the end-user. The second is the current marketing and advertising model. Can it keep working well enough to be worth the money? When it's not, the bottom falls out.

    Old business model: solve a problem for your customer, add some value, take home a cut. Current business model: solve investment return for your investors, get the returns by addicting your end-user to something they don't need. Future business model: ?

    • > The first is the VC / Startup model. Because VC is the true customer, and not the end-user.

      I don't see how that's related? Anyone looking to increase their revenue looks at tracking. Even I, with my popular open source projects, receive emails asking me to add tracking, let alone businesses that need money to pay their employees.

  • I am not concerned with being tracked, and assume that large entities on the net have the ability to track and find anything or anybody, ho hum, but my simple personal requirement is not to be then sold to petty merchants and harassed in my own home with ads and fake "personalisations", and offered unasked-for "help", so I watch closely, and go to any length to disable ads, or "fingerprinting", "profiling", or whatever. The net is horrible. I need socks, but as I am now sensitised to being tracked and followed, I will just get socks at the hardware store rather than try to track down what were mentioned as perfect travellers' socks and other gear, because the mountain of equipment relentlessly devoted to selling me anything from the waist down hereafter is impossible to contemplate. I now only use search for items required for my business, but am often forced to give up, as the vast majority of the web has been co-opted by major retailers. Even though I have never been on social media and have no accounts with any of the retailers, people are telling me that they found me and my business through an LLM of some flavor, and/or were convinced of my abilities by my "5 star ratings". I am too busy currently to unravel exactly how the data is put together and then used, but quite clearly there is no way to use the net (however "lightly") and not be swallowed up and commodified.

  • The only reason these things work is because we let our browsers silently execute arbitrary code. That logic is more like saying it is OK to enter a house because the owner sent you an invitation, then greeted you at the door and said "GO NUTS!".

    • Trust is a powerful multiplier. By that I mean if you have trust in your city as a safe place you generally don't see bars on windows and have more open, inviting and usable spaces. You have more businesses and happier people. Right now the web is like the worst crime ridden city in the world. There is 0 trust and it means we can't have nice things. Society builds trust by being open and allowing but with enforcement when things do happen. We need to bring that to the web. Right now the enforcement either happens before-hand by blocking something or not at all. I want good browser features. I want companies to use them for my benefit but I also want social and legal repercussions when those features are abused. We need to build up both of those in a durable way. When people see offending sites they should avoid them and spread the word that those businesses are bad. When they cross the line then we need enforcement of not just civil, but also criminal penalties. Basically, we need to avoid removing features and instead start evolving society to be able to interact in this environment in a way that we can trust it.

  • If you visit my eg. physical clothing store I'm allowed to monitor your in-shop behavior to better optimize my store for your needs. Same for a restaurant etc. That's how _you_ get _much improved services_ and I get _happier customers_.

    Ofc I'm not allowed to freaking resell that data. THIS is the problem online: reselling and data-brokers. Just KILL these categories of businesses off completely and make _them_ criminal (like even give f prison sentences to their operators).

    We should get back to our sanity in ONLINE. As long as you're on _my (online) property_ and using _my services_ I can of course see EVERYTHING you f do, and should stop pretending I don't (as a business, ofc - anonymization exists and not any random employee can access any customer's data, probably should never access both data and identity correlated unless they're actively investigating some serious fraud). As long as I'm not sharing this data with anyone else, I should be 100% allowed to use every drop of this data to improve my services to you and totally differentiate myself from the incompetent competition that can't properly do this.

    Data privacy (from EU's GDPR to... everything else) only helps big corporations fend-off competition from small startups or boutique shops that could easily out-compete them by offering hyper-personalized hand tailored micro-optimized experiences for their smaller number of customers based on the loads of data they collect from them. In the EU I've only ever seen these kinds of laws severely hamper small boutique or family businesses that wanted to hyperpersonalize to everyone's gain while big corpos easily surf around them with their teams of lawyers.

    ...we've all been brainwashed by this privacy psyop to sheepishly "fight for our privacy" in ways that are detrimental to us and only help our corporate oligarch overlords maintain an even tighter grip on power, while offering us worse and worse services. Wake the f up, DATA IS MEANT TO BE USED to IMPROVE goods and services, not remain uncollected or sit unused!

    • + as a bonus we'd also incentivize businesses to internalize their marketing and related tech operations (since sharing data with 3rd parties would not be allowed), same for AI-customizations etc., forcing them to tech-ify and become more tech-savvy businesses instead of externalizing all such things to evil big tech (eg. a clothing store chain could compete not only by producing better clothes, but also by developing better monitoring and generative AI for human-in-the-loop hyperpersonalization, spreading tech out... instead of outsourcing these to tech or big-consulting companies as they do now, when the too-little data they collect anyhow is otherwise easily shareable with third parties)

    • > As long as you're on _my (online) property_ and using _my services_ I can of course see EVERYTHING you f do

      That's fine, but you are not allowed to send me malware, that runs on _my property_ and snoops on _my data_.

      Also data doesn't stop being mine, just because you have it. You also can't take photographs of random people and claim this is yours now. That's an important difference between the USA and European countries.

  • Umm... but it is criminal. The GDPR, at least, doesn't care how you track users, whether through cookies, local storage, favicons or whatever other mechanism you've developed. If you track users you must follow certain rules, and if you don't, you will be facing fines if/when you're caught.

I was sure this has been a thing for a while, either that or Safari has had a UI bug since forever.

I regularly get the wrong favicon on specific sites, for example the Ars Technica favicon on Reddit.

  • My hacker news icon has been stuck as the icon for a weather site that I sometimes check. It’s been stuck that way for close to a year now, and has survived an iOS update too.

    It persists across profiles and into private browsing mode.

  • For me the iOS HN icon changes between the Reddit and GitHub icons, depending on which one I've been using the most on my phone recently. This happens on both iOS Safari and Kagi's Orion.

    I thought that this was just a bug in iOS but based on the comments in this thread, it seems to be common not only across OSes but browser vendors too (I assume iOS Orion uses the same engine as Safari)

  • I thought I was the only one! Something in the UI cache is so horribly corrupted and it has been for years on my MacBook, I just gave up hope.

  • I have the same: the YouTube icon is the Hacker News icon, and the other way round. I have to assume this is some sort of race condition, data corruption, or something else, and it's quite widespread too given all these reports.

What is the live demo supposed to do? I just get stuck in an endless redirect loop with a counter going from 1 to 18 and then restarting. I’m using Safari on iOS.

  • Look at the GitHub repo:

    - The last update was 2 years ago.

    - It says that MS Edge 87 is affected. The current version of Edge is 142.

    This is no longer an issue, but it is interesting thinking about how long the NSA knew about this before the general population did.

  • On Android/Firefox it showed me my unique ID after the first 18. Then there was a button to try again and that put me in the same loop you're having.

    • Safari on iOS. It goes to 18/18 and then starts over from 1/18 again for me too. I had not pressed any retry button, this happened the first time I visited the page. And I wasn’t even in private browsing mode. Just navigated to it normally.

Reminds me I noticed macOS Safari pulling in the favicons somewhat frequently when I load the new tab page with favorites on it.

Definitely something I don't want. Maybe I should just remove the favorites or maybe I can save them as redirects or HTML or something.

Note I use private windows most often & shoutout Little Snitch for driving the discovery.

The urge to mine users' data in every possible way is being escalated to higher levels each day. We are, for sure, living in the data rush era.

Another interesting method for web fingerprinting, explored by a team of researchers back in 2022, uses the GPU to create unique fingerprints for persistent web tracking. Codenamed 'DrawnApart' [1], it relies on WebGL to count the number and speed of the execution units in the GPU, measure the time needed to complete vertex renders, handle stall functions, and more. It uses short GLSL programs executed by the target GPU as part of the vertex shader to overcome the challenge of having random execution units handling the computations. Hence, the workload allocation is predictable and standardized.

__________

1. https://www.bleepingcomputer.com/news/security/researchers-u...

Nonpersistent vm-based browser, I use qemu + cage + firefox and some glue logic to fire up a copy of a base image which gets deleted on exit. Fires up slower than a native firefox instance but runs all the same.

Can containerize for the less paranoid and less work but browsers touching host kernel gives me the ick as does the idea of trying to write ebpf policies for firefox to mitigate. Browsers are pain.

  • Tried a similar approach but found that putting the browser in a VM tends to expose a few data points that stand out as less trustworthy (like using SwiftShader as the renderer, or not having some fonts installed, among other things), which means you end up getting a lot of captchas on some websites. Lying about these can typically be detected as well (like injecting noise into a canvas, or modifying the advertised renderer). If you've found any solutions to these please share.

I just got a refresh per second and a counter from 1/18 to 18/18 and repeat. Feels like I wasted 20s.

Totally unrelated, but what I found interesting: the README hasn’t been touched for years, yet it looks entirely AI generated. Including the commits.

People actually wrote READMEs / commit messages like that before? Have I been living under a rock?

  • Where do you think the AI training data comes from?

    Emoji-heavy documentation/commit messages always seem very popular in JS projects. As this seems to be the project of a 12 (Edit: It's 20, misread) year old, I'm not too surprised that it's a bit unusual compared to others.

    • Ah, I didn’t know this was made by a child, that makes sense then.

      I knew this was part of the JS community, I just didn’t realize AI was literally 1:1 using the same style.

      I guess I didn’t realize that the NodeJS community was so dominant.

      Or maybe it is because the NodeJS community always had a style of “many small libraries”, which causes them to be overrepresented?

I use a browser that does not support favicon

Wondering why users of popular browsers believe favicon is needed

(I'm assuming users asked the authors of those browsers for favicon)

  • Popular browsers support tabs. When you have many tabs open, it's hard to show a meaningful title for each one. An icon takes up less space and is easier to scan for visually.

    • Mozilla Firefox doesn't shrink tabs any further, but instead lets the tab list go off screen and you can scroll. I think that is a Google Chrome specific thing.

  • Do tabs in the popular graphical browsers display a number on each tab by default?

    This might be useful when switching from, e.g., tab#1 to tab#7, using keyboard shortcut Ctrl-7

I have never liked how Safari always tries to reload favicons. Seems like an obvious and annoying privacy leak.

Lovely attack vector. It's fun to open the various SQLite databases stored by browsers like Chrome to see the kind of information they store. I wouldn't be surprised if a similar attack vector existed for a different stored artefact.

Probably not a popular opinion here, but I'm honestly impressed that someone made this work.

  • There is ad money at stake, and it is unfortunately one of the key revenue models in the modern web. I don't know if this particular research was sponsored by ad-tech or if it's preventive, but it shouldn't be generally surprising that this kind of thing is heavily researched.

I don't understand the live demo

it gave me some ID, but how do I test that some different website can track me resulting in same ID?

or is it only "detect private browsing/container on same browser" kind of stuff?

  • The "supercookie" phenomenon from a few years ago (when this was created) was that despite using private browsing or deleting cookies, the site id remains the same for the same browser.
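For anyone wondering how an ID can live in a favicon cache at all, here is a minimal sketch of the idea as I understand it from the repo's description: the server walks the browser through N routes via a redirect chain, and the ID is the pattern of which routes have a cached favicon. Everything below (`FakeBrowser`, `TrackingServer`, the `/r0`, `/r1`, ... route names) is a hypothetical toy simulation in Python, not the project's actual code, and a real browser's cache behaviour is more complicated. The 1/18 to 18/18 counter people describe in this thread would be consistent with such a chain over 18 routes.

```python
from typing import Optional, Set


class FakeBrowser:
    """Toy stand-in for a browser with a persistent favicon cache (assumption)."""

    def __init__(self) -> None:
        self.favicon_cache: Set[str] = set()

    def visit(self, route: str, server: "TrackingServer") -> None:
        # A favicon is only requested when it is not already cached.
        if route not in self.favicon_cache:
            icon = server.favicon_request(route)
            if icon is not None:
                self.favicon_cache.add(route)


class TrackingServer:
    """Hypothetical server that stores one bit of the ID per route."""

    def __init__(self, n_bits: int) -> None:
        self.routes = [f"/r{i}" for i in range(n_bits)]
        self.mode = "read"
        self.serve: Set[str] = set()   # routes that get a favicon in write mode
        self.seen: Set[str] = set()    # routes whose favicon was requested

    def favicon_request(self, route: str) -> Optional[bytes]:
        self.seen.add(route)
        if self.mode == "write" and route in self.serve:
            return b"icon"             # cacheable favicon => bit 1
        return None                    # like a 404: nothing gets cached

    def write_id(self, browser: FakeBrowser, ident: int) -> None:
        self.mode = "write"
        self.serve = {r for i, r in enumerate(self.routes) if (ident >> i) & 1}
        for route in self.routes:      # stands in for the redirect chain
            browser.visit(route, self)

    def read_id(self, browser: FakeBrowser) -> int:
        self.mode = "read"
        self.seen = set()
        for route in self.routes:
            browser.visit(route, self)
        # No favicon request for a route means it was cached, i.e. bit 1.
        return sum(1 << i for i, r in enumerate(self.routes) if r not in self.seen)


server = TrackingServer(18)
browser = FakeBrowser()
server.write_id(browser, 123456)
recovered = server.read_id(browser)
```

In this toy model reading does not disturb the cache, so the same ID comes back on every revisit. A fresh browser reads back 0, so a real deployment would need some way to tell a new visitor from ID 0; the sketch ignores that.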

This is an insightful read. One question I have is, how do you ensure a user visits all of the N routes for the ID to be generated or to be verified on revisits?

Does it work if you disable favicons? (I disabled favicons when I set up the computer, but for a different reason; it is a feature that I don't use.)

  • If websites can detect that you've disabled favicons, then you are easy to track between all websites because you are very unusual.

Can’t wait for this to be abused and linked to your digital ID through the wallet app!

Why doesn't this apply to any kind of cached content?

  • I guess that you can do fingerprinting with any cached content, but the insane persistence of the favicon cache makes this much more concerning.

    • If caching is bounded in time, can't you use other fingerprinting methods to seal the gaps?

This is great, I needed more tools for tracking bad users who have been banned and try to ban evade. I have been using Samy Kamkar's evercookie, which is pretty good, but some of the techniques are dated.

did anyone ever make use of this in practice? 32 redirects to construct a unique id seems very impractical

  • Ad networks don’t care. It’s a data leak. Even a few extra bits can be valuable to tag you with a better uid.
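For a rough sense of scale on the redirect counts mentioned in this thread (18 in the demo's counter, 32 in the comment above), here is a small back-of-the-envelope calculation, assuming each redirect contributes exactly one bit. The function names and the 8 billion population figure are my own illustrative assumptions.

```python
def id_space(n_routes: int) -> int:
    """Number of distinct IDs if each route stores one bit."""
    return 2 ** n_routes


def bits_needed(population: int) -> int:
    """Smallest number of bits that can give everyone in `population` a distinct ID."""
    return max(population - 1, 0).bit_length()


demo_ids = id_space(18)                     # 18 redirects, as in the demo counter
wide_ids = id_space(32)                     # 32 redirects, as in the comment above
world_bits = bits_needed(8_000_000_000)     # assumed rough world population
```

Each extra bit doubles the ID space, which is why a handful of bits is still useful when combined with other fingerprinting signals.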