Comment by theptip
5 months ago
> Why do they need to know my screen brightness, memory amount, current volume and if I'm wearing headphones?
This is clearly adding entropy to de-anonymize users between apps, rather than to add specificity to ad bids.
It would be amazing if you could build and send fake profiles of this information to create fake browser fingerprints and help track the trackers. Similarly, creating a lot of random noise here may help hide the true signal, or at least make their job a lot harder.
Unfortunately fingerprinting prevention/resistance tactics become a readily identifiable signal unto themselves. I.e., the 'random noise' becomes fingerprintable if not widely utilized.
Everyone would need to be generating the same 'random noise' for any such tactics to be truly effective.
A sufficient number of people would need to, not everyone. And if I were the only one then tracking companies wouldn't adjust for just me. Basically, if this were to catch on then ad trackers wouldn't adjust until there was enough traffic for it to work. Also, that doesn't negate the ability to use this to create fake credentials that aid in tracking ads back to their source.
That's why it should be the browsers & OS's that enforce such privacy measures... it shouldn't be an option that my Grandma needs to enable...
> adding entropy to de-anonymize users
_removing_ entropy, by adding more information bits
Technically, information is the bits you DON'T know. Once you know the bits, it isn't "information" in the Shannon sense, in that it takes no energy to reset a message if you know all the bits, but takes N units of energy for N unknown bits of information. (See Feynman's Lectures on Computation.)
It's also useful for making ads, and manipulation overall, more effective. As long as you can connect the data you track & buy, you can use Thompson sampling. In fact, why would we think knowing the name of a person is anything but bad business?
Straight up fingerprinting us without consent is pure insanity.
They’ve basically turned every phone into a tracking beacon.
I'm sure there is a choir of "told you so"-singers somewhere.
I'm in this industry, and I have knowledge about this.
It's important to point out that it takes a long time for uptake of new versions of ad SDKs. The general assumption is that it takes about 6 months after release of a new version for 50% of ad traffic to come from that version or newer. Also, for every version you release, approximately 1% of traffic will never upgrade past that version.
In that kind of world, over-collecting data makes sense, especially if you think nobody will ever find out. Like total and free disk space. There's no good reason to need those, right? But let's say an advertiser comes to you and says "we want to spend $1M / day to advertise our 10GB game, but only to devices that could install it." All of a sudden it's useful to know that a device only has 8GB of disk space, or only 100MB of free space.
So OK, if we didn't collect disk space, now it makes sense to collect disk space. Let's add it to the SDK. It takes a month or two to release a new version of the SDK. 3 months to get any meaningful traffic from it, and another 3 months to get up to 50% of your traffic. Assuming the ramps are linear, 4 months of 0%, and then 3 months of ramping to 50%, 30 days per month, you'll make $22.5M in the first 7 months. But if you had the logic in there to begin with, you'd have made $210M during the same time period. That makes it an easy choice for the business folks.
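To make the arithmetic above easy to check, here is a minimal sketch of the same calculation. It assumes exactly what the comment states ($1M / day, 30-day months, 4 months at 0%, then a linear 3-month ramp to 50%); the function and variable names are only illustrative.

```python
DAILY_SPEND = 1_000_000   # advertiser budget in dollars per day (from the comment)
DAYS_PER_MONTH = 30       # simplification used in the comment


def revenue_with_late_rollout(dead_months=4, ramp_months=3, final_share=0.5):
    """Revenue when the field ships late and adoption ramps linearly."""
    # During the dead months no traffic carries the new field, so revenue is zero.
    dead_revenue = 0.0 * dead_months
    # A linear ramp from 0 to final_share averages half of final_share.
    average_share = final_share / 2
    ramp_days = ramp_months * DAYS_PER_MONTH
    return dead_revenue + DAILY_SPEND * ramp_days * average_share


def revenue_if_already_collected(total_months=7):
    """Revenue when every SDK version in the field already reports the value."""
    return DAILY_SPEND * total_months * DAYS_PER_MONTH


late = revenue_with_late_rollout()
early = revenue_if_already_collected()
print(f"late rollout:      ${late:,.0f}")
print(f"already collected: ${early:,.0f}")
```

This reproduces the $22.5M versus $210M figures; the gap is almost entirely the four months in which no traffic carries the new field.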
There are answers to this, but they all have drawbacks. You could limit data that ad agencies can collect. This reduces the value of ads. And agencies have learned that some data (like location) is low-value and high-risk, so they're removing the ability to supply it. I think it'd be better to support a model where ad code can be updated independently of the app. This way we could push out bug fixes faster, and could remove our just-in-case collection, but Apple has shown no signs that this is coming soon, and Google's answer has been such a shit-show that we aren't considering it viable over the next 4 years.
Edit: To address screen brightness specifically, it's a very rough proxy for age of the user.
> But let's say an advertiser comes to you and says "we want to spend $1M / day to advertise our 10GB game, but only to devices that could install it."
I don't want to call you a liar, but having seen ads that are presumably targeted at me, it feels like a total fiction to say that anyone is actually capable or interested in doing this.
I get advertisements for just absolute nonsense garbage that has no bearing on my life, and no bearing on anything that could have possibly been collected from my device.
The closest thing is that when I was in Mexico for a week, some of my podcast pre-roll ads were in Spanish. (Which, I should note, I do not speak fluently enough to even understand.) Even now, the occasional ad I'm served on a podcast is in Spanish.
And that's it. They saw that my IP came from Quintana Roo, and (somewhat reasonably) decided that I need to hear Spanish-language content. Even when I physically moved back to the United States.
The mobile ad industry is weird, and has some perverse incentives. Good games don't advertise (they don't need to). Games that hook the users just enough that they can show them more ads tend to plow that money right back into advertising to get more users. Those are the ads you see 99% of the time, and they're not really targeted. They're just people who know that the average 15 second interstitial will net them $0.006 in revenue, so they bid for it at $0.005.
Are there whales that spend $1M / day in advertising? Absolutely, 100%. Are they running at all times? No. We typically see that kind of spend from a single advertiser around 30 days out of the year. They're short campaigns, typically around a launch of a big title, and they always try to target as narrowly as they can to maximize their impact.
You're right about it using IP geo-location to guess where you are and what language you want. We also use that to determine if we should show you the GDPR disclosures. But try looking at ads on a Xiaomi phone versus a Samsung and you'll see a different set of ads, because one of those purchasers tends to have more disposable income.
I believe some apps actually have to automatically brighten your screen when displaying a QR code for scanning, and then restore the brightness to its previous setting when you leave the QR code. I believe the Whole Foods app does this for its first screen.
Surely that could be done without sending the brightness to some 3rd party.
Everything listed changes way too often to be useful for tracking. My guess is that it's for anti-fraud purposes. Someone setting up fake devices and/or device farms is likely to get similar values, which means they can be detected via ML or whatever.
> screen brightness, memory amount, current volume and if I'm wearing headphones
None of those are likely to change when you navigate from one website to another, with tracking/ads disabled, which is what they want to be able to track. Otherwise they'd just use their cookies.
One device visits a site where you sell ads. A minute later, an unknown device with identical battery, volume, headphone, brightness, model number, browser version, and boot time to the second arrives on another site you run ads on. There's a pretty good chance they're related, because the odds of all of those matching by coincidence, given the two sites and the close timing, are rather low: https://coveryourtracks.eff.org/
Plus it doesn't have to be perfect. It just has to be good enough in bulk to sell.
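A rough sketch of the kind of matching described above, not anyone's actual pipeline. The attribute names, the site names, the 5-minute window, and every value are made up for illustration; a real system would presumably use fuzzier matching and many more signals.

```python
from collections import defaultdict

# All values below are hypothetical.
DEVICE_A = {
    "battery": 0.63,
    "volume": 0.40,
    "headphones": True,
    "brightness": 0.82,
    "model": "Pixel 7",
    "browser": "Chrome 120",
    "boot_time": 1_700_000_000,
}
DEVICE_B = dict(DEVICE_A, battery=0.21, boot_time=1_700_050_000)

# (site, seconds since some reference time, reported attributes)
observations = [
    ("site-a.example", 0, DEVICE_A),
    ("site-b.example", 60, DEVICE_A),
    ("site-b.example", 70, DEVICE_B),
]


def fingerprint(attrs):
    """Turn the attribute dict into a hashable key, independent of key order."""
    return tuple(sorted(attrs.items()))


def link_visits(observations, window_seconds=300):
    """Pair visits on different sites that share a fingerprint within the window."""
    by_fingerprint = defaultdict(list)
    for site, timestamp, attrs in observations:
        by_fingerprint[fingerprint(attrs)].append((timestamp, site))

    links = []
    for visits in by_fingerprint.values():
        visits.sort()
        for (t1, s1), (t2, s2) in zip(visits, visits[1:]):
            if s1 != s2 and t2 - t1 <= window_seconds:
                links.append((s1, s2, t2 - t1))
    return links


links = link_visits(observations)
print(links)  # [('site-a.example', 'site-b.example', 60)]
```

Exact-match on the whole tuple is the crudest version possible; it still links the two visits without any cookie or shared identifier.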
Combine this with IP, timestamp, and some behavioral patterns, and you’ve got an extremely robust tracking mechanism that operates outside of explicit consent mechanisms.
Screen brightness can identify whether you are outside or inside.
Taken as one of a thousand attributes, it's likely to provide at least some discriminatory lift in isolating a single user, even if tiny.
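One way to put numbers on "some discriminatory lift" is to count identifying bits per attribute, as in the sketch below. The fractions are invented, and treating the attributes as independent is a simplifying assumption; correlated attributes would contribute fewer bits than the sum suggests.

```python
import math


def surprisal_bits(fraction_sharing_value):
    """Bits of identifying information from a value shared by this fraction of users."""
    return -math.log2(fraction_sharing_value)


# Hypothetical: the fraction of users who report the same value as you do.
attribute_fractions = {
    "brightness_bucket": 0.5,     # half of users fall in the same brightness bucket
    "headphones_plugged": 0.25,
    "volume_level": 0.125,
}

# Assuming independence, the bits from each attribute simply add up.
total_bits = sum(surprisal_bits(f) for f in attribute_fractions.values())

# Bits needed to single out one user in a hypothetical population of a million.
bits_needed = math.log2(1_000_000)

print(f"bits from these attributes: {total_bits:.1f}")
print(f"bits needed for 1 in 1,000,000: {bits_needed:.1f}")
```

Three weak signals get roughly a third of the way to one-in-a-million under these made-up numbers, which is why each extra attribute matters even when it is nearly useless alone.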