Tales of Favicons and Caches: Persistent Tracking in Modern Browsers [pdf]

5 years ago (cs.uic.edu)

56 comments

amenghra

"More importantly, the caching of favicons in modern browsers exhibits several unique characteristics that render this tracking vector particularly powerful, as it is persistent (not affected by users clearing their browser data), non-destructive (reconstructing the identifier in subsequent visits does not alter the existing combination of cached entries), and even crosses the isolation of the incognito mode."

Why are favicons cached separately? I assume it is just code from pre-commercial www days that no one has since bothered to examine or rewrite?

I feel like privacy within modern browsers is a Sisyphean struggle. Their vast and ever-expanding API surface can never be brought under sensible control without splitting the browser into several unrelated tasks that must cross strictly locked down interprocess communication channels. The existing multi-process architecture must be taken to the next level, but who will do the difficult work involved given that of the major players only Mozilla and Apple have a stated incentive for privacy and even there their stated incentive is on fairly weak grounds since one is a profitable corporation while the other is funded by profitable corporations?

Nextgrid 5 years ago
Maybe we should consider legal solutions?
There are real-world threats that you can't 100% defend against and yet we are mostly safe because the law is an effective deterrent.
Why not apply the same on the web? How come we have draconian anti-hacking laws (that are sometimes abused), but none of them are used against this tracking where it's essentially the same result as installing spyware?
- TeMPOraL 5 years ago
  
  Strongly this.
  Privacy is not a technology problem. It's a business problem. As long as the adtech industry is allowed to thrive, as long as people build companies with ad-based or data-resale-based business models, this will be an endless game of whac-a-mole, with the browsers only ever growing in complexity, and building anything on the web only becoming more difficult.
  We have to address the root cause: advertising as a business model. My suggestion: let's apply regulatory measures to kill this business model entirely.
  
- bob33212 5 years ago
  
  The legal system cannot keep up with the complexity and rapid change of software. You will end up with one regulation that says you cannot track users, and another regulation that says you have to prove that you are effectively blocking Iranian users from using your product. If you log IPs connecting to your servers you'll be accused of tracking users, and if you don't log IPs you'll be accused of not doing your due diligence in confirming you blocked foreign users.
  
tyingq 5 years ago

"Why are favicons cached separately?"
That's a great question. My guess is that it's because they are used for things like bookmarks and the chrome page that shows frequently visited websites. And that something about those uses made a separate cache logical. A bit of googling does show lots of confusion and bugs because of it though.
kevincox 5 years ago
> Why are favicons cached separately?
Likely because they are used for bookmarks and you don't want clearing the cache to remove all of the icons from your bookmarks.
Of course, you could do this only for URLs which are bookmarked; however, it would be more work (probably why it wasn't done) and would remove icons from your browser history (probably a minor loss).
TL;DR Because they are used outside the context of browsing.
- clairity 5 years ago
  
  why couldn't we solve that by having a separate cache for bookmarks sandboxed away from web content processes?
  
vladojsem 5 years ago

Maybe we should consider switching to alternative browsers like Kingpin. Chrome is too powerful, and Google has its own (business) goals.

grahameb 5 years ago

"Firefox: ... However, it never actually uses the cache to fetch the entries. As a result, Firefox actually issues requests to re-fetch favicons that are already present in the cache. We have reported this bug to the Mozilla team, who verified and acknowledged it. At the time of submission, this remains an open issue. Nonetheless, we believe that once this bug is fixed our attack will work in Firefox..."

Gosh, I hope the favicon cache bug the authors filed isn't fixed until a broader mitigation against this is implemented.

hnaccy 5 years ago

Bugzilla link since I didn't see it in paper: https://bugzilla.mozilla.org/show_bug.cgi?id=1618257
I find it kinda weird that Solomos reported it as a normal defect and even prompted for a fix update months later without making it clear that the fix would make FF vulnerable to this issue...
weinzierl 5 years ago

> " However, it never actually uses the cache to fetch the entries."
I doubt the "never" because it regularly shows me the wrong favicon. This has been true for so many years that I consider it a familiar quirk more than a bug...
rrampage 5 years ago

Firefox bug is being tracked here: https://bugzilla.mozilla.org/show_bug.cgi?id=1618257
123123141 5 years ago
Later in the paper:
> we have disclosed our research to all the browser vendors.
Please consider that the researchers apparently submitted TWO bug reports. One because functionally the cache is broken, one because there's a potential privacy issue.
- mccr8 5 years ago
  
  The account used to file that bug has not filed any other bug reports, so it isn't clear to me if they did report the underlying security issue that they found. (Disclaimer: I work on Firefox, but I'm just speaking for myself.)

floatingatoll 5 years ago

It looks like someone else posted their paper to their bug.

amenghra 5 years ago

Good. This sums it up pretty well:

    I also think that it would have been appropriate to notify about the
    ulterior motive behind this defect report at the latest when the paper got
    published. This underhanded approach of reporting a defect just leaves a bad
    taste, really.

    The behavior may be an actual defect in the classical sense, but I'm just
    wondering what would have happened, had this been addressed "in time" by the
    developers. It would seem that the researchers would then have triumphantly
    proclaimed that all major browsers are prone to their newly found attack.
    Must be somewhat disappointing that it didn't get fixed "in time" to make it
    into the paper that way.


LeonB 5 years ago

Straight up Black Hat work. Not cool.

dannyw 5 years ago

It's unbelievable that any form of unclearable cache is allowed to exist.

"Clear Browsing Data" must clear ALL browser data, as if I was doing a completely fresh install of my browser but maintaining my settings, extensions, bookmarks, and auto-fill.

That is IT. Yes, Google Chrome, you must also delete Google cookies (which they do not do).

ad404b8a372f2b9 5 years ago
That's why I set up my Linux install to work like a live CD, with a two-layer filesystem: a read-only base and a read-write overlay that lives in RAM. The files that I know I want to keep are bound from a read-write partition on the disk to the RAM filesystem, and all the rest gets deleted every time I shut down my PC.
A lot of pieces of software non-maliciously keep records of everything you do with them through logs or caches that aren't straightforward to delete and it's the only way I found to have control over it.
- elteto 5 years ago
  
  How do you persist files you care about? Another separate partition?
  This is an interesting approach. Do you have any documentation on how it was set up? Also, how do you change a setting in your browser? Do you have to rebuild your base layer?
  
steerablesafe 5 years ago
Clearing browser history should be interpreted as "nuke my browser container's cache directory please". This also requires that all "cache" gets into the cache directory though, which might not be the case.
Unfortunately, nuking the whole container, while effective, is probably not desired, as it contains various browser settings and browser extensions.
- sloshnmosh 5 years ago
  
  The open-source BleachBit does an excellent job of clearing out caches and vacuuming SQLite databases, and can also remove icons and thumbnails.
  bleachbit.org
matheusmoreira 5 years ago

What's unbelievable is the audacity of these companies. Programmers want to improve performance for everyone so they come up with caching mechanisms. So what do the companies do? They abuse the feature in order to track users.
The ability to clear browser data is not quite enough. Caching should be disabled by default in all browsers due to the potential for abuse. Oh no, now companies are getting less conversions and sales due to the loss in performance... Sucks to be them. Actually the more their abuse costs them the better.
yaris 5 years ago
I wonder if it is possible to implement an "out-of-band" cache clearing command(s). On Linux it would be quite straightforward, but I know next to nothing about Windows or OSX.
- colejohnson66 5 years ago
  
  Check out BleachBit[0]. It’s available for Windows and Linux.
  [0]: https://bleachbit.org

dessant 5 years ago

Browser vendors don't take clearing browser data seriously, see how Firefox has implemented the browsingData extension API [1]. These bugs compromise the security and privacy of Firefox users, but fixing them has not been a priority over the years.

Built-in clearing options in Firefox will also leave classes of cached data behind. The only reliable way to wipe everything has been to delete specific files from the Firefox profile folder before the browser launches.

[1] https://armin.dev/blog/2019/03/firefox-extensions-browsing-d...
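
For anyone who wants to try the "delete specific files from the profile folder" approach, here is a minimal sketch in Python. It rests on assumptions: the file names in `FAVICON_FILES` are what I believe Firefox uses for its favicon store and should be checked against your own profile, and `remove_favicon_stores` is a hypothetical helper, not part of any browser tooling. It defaults to a dry run, and it should only be run while the browser is closed. Note that it also drops the icons shown next to bookmarks.

```python
from pathlib import Path
from typing import List

# Assumed names of the Firefox favicon store and its SQLite side files.
# Verify these against your own profile folder before deleting anything.
FAVICON_FILES = (
    "favicons.sqlite",
    "favicons.sqlite-wal",
    "favicons.sqlite-shm",
)


def remove_favicon_stores(profile_dir: Path, dry_run: bool = True) -> List[Path]:
    """Return the favicon store files found in profile_dir.

    With dry_run=False the files are also deleted. Only files directly
    inside profile_dir whose names match FAVICON_FILES are considered.
    """
    matched: List[Path] = []
    for name in FAVICON_FILES:
        candidate = profile_dir / name
        if candidate.is_file():
            if not dry_run:
                candidate.unlink()
            matched.append(candidate)
    return matched
```

Calling `remove_favicon_stores(Path("/path/to/profile"))` only lists what would go; pass `dry_run=False` to actually delete.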

vxNsr 5 years ago

So two things:

1) this is insane! It even breaks the “sandbox” of incognito mode.

2) Based on how it works I would assume it absolutely decimates the back button functionality, which depending on what you’re trying to accomplish might be a good thing, and 2 seconds isn’t a short period of time. Ppl wouldn’t be that ok with waiting 2 secs even with today’s js heavy loads.

uppsalax 5 years ago

First of all, thanks for sharing because it's such an insightful paper!

Some thoughts/doubts on it:

1. It's unbelievable that in a world where we promote privacy and freedom of individuals such cross-country trackers exist. It seems more an Orwellian story rather than reality.

2. I'm a bit ignorant on this theme on a technical level (I have a business background, even if working at a tech startup focused on security). There is a growing concern globally over an increasing sensitisation over privacy and over the importance of security. Even Google has promised to remove third party cookies within 2 years, and there is going to be a migration from Whatsapp to Signal (even if Whatsapp clarified a bit on that). Do you think that such fresh tools like these "favicons" or simple tracking will remain long term?

jannes 5 years ago
Do you speak Spanish by any chance? The way that you are using "doubts" (dudas) sounds slightly off to me. I live in Spain and I hear it a lot :)
- uppsalax 5 years ago
  
  I am Italian actually, pero hablo un poco español también! The idioms are really similar.

EastSmith 5 years ago

> Specifically, websites can create and store a unique browser identifier through a unique combination of entries in the favicon cache. To be more precise, this tracking can be easily performed by any website by redirecting the user accordingly through a series of subdomains. These subdomains serve different favicons and, thus, create their own entries in the Favicon-Cache. Accordingly, a set of N-subdomains can be used to create an N-bit identifier, that is unique for each browser. Since the attacker controls the website, they can force the browser to visit subdomains without any user interaction. In essence, the presence of the favicon for subdomain i in the cache corresponds to a value of 1 for the i-th bit of the identifier, while the absence denotes a value of 0.

So the bulk of it is: caching favicons, timing requests, multiple redirects through controlled subdomains.
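
The bit encoding in the quoted passage can be sketched as a small Python simulation. This is not the authors' code: the favicon cache is modelled as a plain set of hostnames, and `N_BITS`, `BASE_DOMAIN`, `subdomain`, `write_id` and `read_id` are all hypothetical names chosen for illustration. It also assumes that during the read phase the server serves no cacheable icon, which is what would keep reading non-destructive.

```python
from typing import Set

N_BITS = 16                      # hypothetical identifier width
BASE_DOMAIN = "tracker.example"  # hypothetical attacker-controlled domain


def subdomain(i: int) -> str:
    """Hostname that stands for bit i of the identifier."""
    return f"s{i}.{BASE_DOMAIN}"


def write_id(identifier: int, favicon_cache: Set[str]) -> None:
    """Write phase: visit subdomain i only when bit i is 1.

    Visiting a subdomain makes the browser fetch and cache its favicon,
    modelled here as adding the hostname to the set.
    """
    for i in range(N_BITS):
        if (identifier >> i) & 1:
            favicon_cache.add(subdomain(i))


def read_id(favicon_cache: Set[str]) -> int:
    """Read phase: visit every subdomain and note which favicons are requested.

    A cached favicon produces no request, so a missing request means bit 1.
    The cache is left untouched, on the assumption that the server serves
    no cacheable icon while reading.
    """
    identifier = 0
    for i in range(N_BITS):
        favicon_requested = subdomain(i) not in favicon_cache
        if not favicon_requested:
            identifier |= 1 << i
    return identifier


if __name__ == "__main__":
    cache: Set[str] = set()
    write_id(0xBEEF, cache)
    print(hex(read_id(cache)))  # prints 0xbeef
```

With 16 subdomains this distinguishes 65,536 browsers; each extra subdomain doubles that, at the cost of one more redirect.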

DanielDent 5 years ago

This paper references https://www.ndss-symposium.org/wp-content/uploads/2019/02/nd...

They in turn reference my 2015 take on this: http://dnscookie.com/

With homage to Moxie's Cryptographic Doom Principle, I propose the Cache Doom Principle: If a system's behaviour can be influenced by a cache, eventually someone will figure out a way to use that cache to leak data.

jackewiehose 5 years ago

Does this make much of a difference? My impression was that we already lost against fingerprinting and browser vendors keep adding more and more feature crap which makes it only worse.

mleonhard 5 years ago

Could one perform this attack without redirects by changing the page's DOM.head.link(rel=icon).href value with JavaScript?

RegW 5 years ago

Well apparently javascript can be used to modify the favicon dynamically: https://stackoverflow.com/questions/260857/changing-website-... - presumably this will then have the same interactions with the cache.
Perhaps you could just rely on the user navigating across a number of pages on your attack site.

everdrive 5 years ago

Does anyone know if you can disable favicons in firefox?

johnwayne666 5 years ago

You can on iOS Safari. I never understood why it was disabled by default.

1vuio0pswjnm7 5 years ago

I block favicon requests with a forward proxy. One could probably block them using DevTools or an ad blocker.

XCSme 5 years ago

For me favicons are a huge UX requirement, I can barely use my browser tabs without favicons.

Klais 5 years ago

Really cool