Comment by Santosh83

5 years ago

"More importantly, the caching of favicons in modern browsers exhibits several unique characteristics that render this tracking vector particularly powerful, as it is persistent (not affected by users clearing their browser data), non-destructive (reconstructing the identifier in subsequent visits does not alter the existing combination of cached entries), and even crosses the isolation of the incognito mode."

Why are favicons cached separately? I assume it is just code from pre-commercial www days that no one has since bothered to examine or rewrite?
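
For anyone wondering what "the existing combination of cached entries" means in practice, here is a toy Python sketch of how I read the quoted mechanism: each bit of an identifier maps to one path, and a bit is 1 if that path's favicon is cached. Everything here is made up for illustration (the `FaviconCache` class, the `/bitN` paths, the 8-bit width); it is a simplified model, not real browser or server code.

```python
from typing import List, Set

N_BITS = 8  # hypothetical identifier width; one path per bit


class FaviconCache:
    """Toy stand-in for a browser's favicon cache (hypothetical model)."""

    def __init__(self) -> None:
        self._cached: Set[str] = set()

    def visit(self, path: str, server_serves_icon: bool) -> bool:
        """Visit a path; return True if the browser had to request its favicon."""
        if path in self._cached:
            return False  # cache hit: the server sees no favicon request
        if server_serves_icon:
            self._cached.add(path)  # a served icon gets stored
        return True  # cache miss: the server sees a request


def bit_paths() -> List[str]:
    return [f"/bit{i}" for i in range(N_BITS)]


def write_id(cache: FaviconCache, identifier: int) -> None:
    """First visit: send the browser only to paths whose bit is 1."""
    for i, path in enumerate(bit_paths()):
        if (identifier >> i) & 1:
            cache.visit(path, server_serves_icon=True)


def read_id(cache: FaviconCache) -> int:
    """Later visit: serve no icons and note which requests never arrive."""
    identifier = 0
    for i, path in enumerate(bit_paths()):
        requested = cache.visit(path, server_serves_icon=False)
        if not requested:  # no request means the icon was cached, so bit = 1
            identifier |= 1 << i
    return identifier


cache = FaviconCache()
write_id(cache, 0b10110010)
first = read_id(cache)
second = read_id(cache)  # nothing is served while reading, so the cache is unchanged
```

Because the read pass never serves an icon, it adds nothing to the cache, which is the "non-destructive" property the quote mentions: `first` and `second` come out the same.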

I feel like privacy within modern browsers is a Sisyphean struggle. Their vast and ever-expanding API surface can never be brought under sensible control without splitting the browser into several unrelated tasks that must cross strictly locked down interprocess communication channels. The existing multi-process architecture must be taken to the next level, but who will do the difficult work involved given that of the major players only Mozilla and Apple have a stated incentive for privacy and even there their stated incentive is on fairly weak grounds since one is a profitable corporation while the other is funded by profitable corporations?

Maybe we should consider legal solutions?

There are real-world threats that you can't 100% defend against and yet we are mostly safe because the law is an effective deterrent.

Why not apply the same on the web? How come we have draconian anti-hacking laws (that are sometimes abused), but none of them are used against this tracking where it's essentially the same result as installing spyware?

  • Strongly this.

    Privacy is not a technology problem. It's a business problem. As long as the adtech industry is allowed to thrive, as long as people build companies with ad-based or data-resale-based business models, this will be an endless game of whac-a-mole, with the browsers only ever growing in complexity, and building anything on the web only becoming more difficult.

    We have to address the root cause: advertising as a business model. My suggestion: let's apply regulatory measures to kill this business model entirely.

    • > We have to address the root cause: advertising as a business model.

      Isn't the root cause advertising that depends on data sharing, rather than advertising itself? I think it's fine if a site wants to display advertising that it serves from its own domain, without passing on any data to third parties.

    • It's not just ad-tech. You have full-fledged business models based on siphoning off and selling your private data, like Plaid and Visa. In fact, every CEO asks their company a very important question: how do we weaponize our data? It's a revenue stream for everyone.

    • One idea I’ve chewed on: make it a law that organizations have to pay people for their private data, i.e. charged by the minute (by the second would be best), to the tune of minimum wage or an organization's WTP (willingness to pay) as a salary for 24/7 access. The idea is to price and legislate it at the point where it makes sense for the average citizen/user. Creating the notion of private property and reinforcing it is a fundamental purpose of government.

  • The legal system cannot keep up with the complexity and rapid change of software. You will end up with one regulation that says you cannot track users, and another regulation that says you have to prove that you are effectively blocking Iranian users from using your product. If you log IPs connecting to your servers you'll be accused of tracking users, and if you don't log IPs you'll be accused of not doing your due diligence in confirming you blocked foreign users.

    • Is this an actual problem or is this a typical knee-jerk argument people make when someone is talking about regulation? (despite the current situation being so bad that it's hard to imagine regulation making it worse)

      Regarding your specific example, the GDPR appears to deal with it easily: any data processing to comply with the law is allowed and does not require explicit consent. This seems to work well (of course, the GDPR is bad because it's not being enforced seriously, but if it were, the scenario you describe wouldn't be a problem).

      Also, when I talk about regulation, I'm talking about regulating the intent and/or outcome rather than a particular implementation. If you track someone without their explicit consent for the purposes of targeted advertising or marketing you are in breach of the regulation, regardless of whether you obtained that data online, in the real-world (mobile phone tracking, facial recognition, loyalty cards, etc), by using Tarot cards or even a fortune-telling goldfish.

"Why are favicons cached separately?"

That's a great question. My guess is that it's because they are used for things like bookmarks and the chrome page that shows frequently visited websites. And that something about those uses made a separate cache logical. A bit of googling does show lots of confusion and bugs because of it though.

> Why are favicons cached separately?

Likely because they are used for bookmarks and you don't want clearing the cache to remove all of the icons from your bookmarks.

Of course, you could do this only for URLs which are bookmarked. However, it would be more work (probably why it wasn't done) and would remove icons from your browser history (probably a minor loss).

TL;DR Because they are used outside the context of browsing.

  • why couldn't we solve that by having a separate cache for bookmarks sandboxed away from web content processes?

    • Presumptive user feedback: "How come when I click this bookmark which has the icon nice and right there it sometimes takes minutes for the icon to show up on the tab?"

      The principle of least surprise for the user is probably at play here. Bookmarks and tab icons seem like reasonably similar "chrome" to the user.

      Separating the caches isn't necessarily easy either: it is just as likely to hand trackers a good signal for which people have bookmarked a site, based on whatever heuristic ends up being used to refresh that cache once it is no longer tied to "recently accessed tabs".

Maybe we should consider switching to alternative browsers like Kingpin. Chrome is too powerful, and Google has its own (business) goals.