Comment by espadrine

8 years ago

It is far from over, too! Google Cache still has loads of sensitive information, a link away!

Look at this, click on the downward arrow, "Cached": https://www.google.com/search?q="CF-Host-Origin-IP:"+"author...

(And then, in Google Cache, "view source", search for "authorization".)

(Various combinations of HTTP headers to search for yield more results.)

> The infosec team worked to identify URIs in search engine caches that had leaked memory and get them purged. With the help of Google, Yahoo, Bing and others, we found 770 unique URIs that had been cached and which contained leaked memory. Those 770 unique URIs covered 161 unique domains. The leaked memory has been purged with the help of the search engines.

So I tried it too, and there's still data cached there.

Am I misunderstanding something - that above statement must be wrong, surely?

They can't have found everything even in the big search engines if it's still showing up in Google's cache, let alone the infinity other caches around the place.

EDIT: In case the Cloudflare team sees this: I see leaked credentials for these domains:

android-cdn-api.fitbit.com

iphone-cdn-client.fitbit.com

api-v2launch.trakt.tv

  • I'm also seeing a ton from cn-dc1.uber.com with oauth, cookies and even geolocation info. https://webcache.googleusercontent.com/search?q=cache:VlVylT...

  • Could someone enlighten me on why malloc and free don't automatically zero memory by default?

    Someone pointed me to MALLOC_PERTURB_ and I've just run a few test programs with it set - including a stage1 GCC compile, which granted may not be the best test - and it really doesn't dent performance by much. (edit: noticeably, at all, in fact)

    People who prefer extreme performance over prudent security should be the ones forced to mess about with extra settings, anyway.
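
    For the curious, here is a rough sketch of what MALLOC_PERTURB_ does (glibc-specific; the file name and the byte value 165 are just illustrative). With it set to a nonzero value N, glibc fills freshly malloc()ed memory with the complement of N and fills freed memory with N, so stale heap contents don't survive reallocation:

        /* perturb.c - illustrate glibc's MALLOC_PERTURB_ (sketch only).
           Build: gcc perturb.c -o perturb
           Run:   ./perturb        vs.   MALLOC_PERTURB_=165 ./perturb  */
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>

        int main(void) {
            char *a = malloc(32);
            strcpy(a, "secret session token");
            free(a);                 /* scrubbed by glibc if MALLOC_PERTURB_ is set */

            char *b = malloc(32);    /* very likely reuses the same chunk */
            for (int i = 0; i < 32; i++)
                printf("%02x ", (unsigned char)b[i]);   /* old secret vs. perturb bytes */
            printf("\n");

            free(b);
            return 0;
        }

    Run it once normally and once with MALLOC_PERTURB_=165 in the environment; in the second case the reused chunk comes back as a repeated perturb byte instead of whatever was freed there.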

    • Some old IBM environments initialized fresh allocations to 0xDEADBEEF, which had the advantage that the result you got from using such memory would (usually) be obviously incorrect. The fact that it was done decades ago is pretty good evidence that it's not about the actual initialization cost: these things cost a lot more back then.

      What changed is the paged memory model: modern systems don't actually tie an address to a page of physical RAM until the first time you try to use it (or something else on that page). Initializing the memory on malloc() would "waste" memory in some cases, where the allocation spans multiple pages and you don't end up using the whole thing. Some software assumes this, and would use quite a bit of extra RAM if malloc() automatically wiped memory. It would also tend to chew through your CPU cache, which mattered less in the past because any nontrivial operation already did that.

      I personally don't think this is a good enough reason, but it is a little more than just a minor performance issue.

      That all being said, while it would likely have helped slightly in this case, it would not solve the problem: active allocations would still be revealed.
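
      A quick way to see the demand-paging point is to watch resident memory around a large allocation. This is a Linux-specific sketch (it reads /proc/self/statm; the 256 MiB figure is arbitrary): the malloc() itself barely moves RSS, while touching every page, which is what an unconditional zero-on-malloc would have to do, commits all of them:

          /* rss_demo.c - why zero-on-malloc would "waste" memory (Linux sketch). */
          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>
          #include <unistd.h>

          /* Resident set size in KiB, from /proc/self/statm. */
          static long resident_kb(void) {
              long total = 0, resident = 0;
              FILE *f = fopen("/proc/self/statm", "r");
              if (!f || fscanf(f, "%ld %ld", &total, &resident) != 2) {
                  if (f) fclose(f);
                  return -1;
              }
              fclose(f);
              return resident * sysconf(_SC_PAGESIZE) / 1024;
          }

          int main(void) {
              size_t n = 256u * 1024 * 1024;            /* 256 MiB of address space */
              printf("before malloc: %ld KiB resident\n", resident_kb());

              char *p = malloc(n);
              if (!p) return 1;
              printf("after  malloc: %ld KiB resident\n", resident_kb()); /* ~unchanged */

              memset(p, 0, n);                          /* what zero-on-malloc forces */
              printf("after  memset: %ld KiB resident\n", resident_kb()); /* +~256 MiB */

              free(p);
              return 0;
          }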

      14 replies →

    • Zeroing on malloc and/or free would not have prevented this type of error, since the information disclosure was due to an overflow into an adjacent allocated buffer.

      However, zeroing on free is generally a useful defense-in-depth measure because it can minimize the risk of some types of information disclosure vulnerabilities. If you use grsecurity, this feature is provided by grsecurity's PAX_MEMORY_SANITIZE [0].

      [0]: https://en.wikibooks.org/wiki/Grsecurity/Appendix/Grsecurity...
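
      In userland the same idea can be approximated with a free wrapper. A minimal sketch, assuming glibc (explicit_bzero() needs glibc >= 2.25, and malloc_usable_size() is a non-standard extension; secure_free is just a name used here for illustration):

          #include <string.h>     /* explicit_bzero (glibc >= 2.25 / BSD) */
          #include <stdlib.h>
          #include <malloc.h>     /* malloc_usable_size (glibc extension) */

          /* Scrub a heap buffer before returning it to the allocator, so freed
             memory never keeps old secrets around. explicit_bzero() is used
             because a plain memset() right before free() may be optimized away. */
          static void secure_free(void *p) {
              if (p == NULL)
                  return;
              explicit_bzero(p, malloc_usable_size(p));
              free(p);
          }

          /* usage: char *token = strdup("..."); ... ; secure_free(token); */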

    • Zeroing on alloc/free probably wouldn't have helped much with this bug. Data in live allocations would still be leaked.
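
      Roughly the failure mode, as a contrived sketch (the buffer names and the bogus length are made up, and reading past a buffer is undefined behaviour, so this is illustration only, not the actual parser bug): the code over-reads its own buffer and ships whatever live allocation happens to sit next to it, zeroed-on-free or not:

          #include <stdio.h>
          #include <stdlib.h>
          #include <string.h>

          int main(void) {
              char *page   = malloc(32);   /* buffer being parsed */
              char *secret = malloc(32);   /* someone else's live request data */
              strcpy(page,   "<p>harmless html</p>");
              strcpy(secret, "authorization: Bearer abc123");

              /* Buggy "parser": trusts a length that is larger than the buffer. */
              size_t claimed_len = 96;                 /* should have been <= 32 */
              fwrite(page, 1, claimed_len, stdout);    /* may emit the neighbour too */
              printf("\n");

              free(page);
              free(secret);
              return 0;
          }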

    • > Could someone enlighten me on why malloc and free don't automatically zero memory by default?

      The computational cost of doing so, I suspect.

      5 replies →

    • Are these results hardware independent? Maybe it makes a difference on older machines, or different architectures.

    • I imagine clearing memory on free is more relevant than MALLOC_PERTURB_?

  • > that above statement must be wrong, surely?

    Either they believe it's right, which means they're not competent enough to really assess the scope of the leak; or they don't believe it, but they went "fuck it, that's the best we can do".

    In either case, it doesn't really inspire trust in their service.

    • you missed one possibility: that they're deliberately attempting to downplay the severity to make themselves look less incompetent

  • jgrahamc: can you list which public caches you worked with to attempt to address this? It does not inspire confidence when even Google is still showing obvious results.

    • Google, Microsoft Bing, Yahoo, DDG, Baidu, Yandex, and more. The caches other than Google were quick to clear and we've not been able to find active data on them any longer. We have a team that is continuing to search these and other potential caches online and our support team has been briefed to forward any reports immediately to this team.

      I agree it's troubling that Google is taking so long. We were working with them to coordinate disclosure after their caches were cleared. While I am thankful to the Project Zero team for informing us of the issue quickly, I'm troubled that they went ahead with disclosure before Google's crawl team could complete the refresh of their own cache. We have continued to escalate this within Google to get the crawl team to prioritize the clearing of their caches, as that is the highest-priority remaining remediation step.

      91 replies →

https://webcache.googleusercontent.com/search?q=cache:lw4K9G...

    Internal Upstream Server Certificate
    ...
    /C=US/ST=California/L=San Francisco/O=Cloudflare Inc./OU=Cloudflare Services - nginx-cache/CN=Internal Upstream Server Certificate

That really doesn't look good.

Lol, Google just purged that search.

EDIT: but there's still plenty of fish: http://webcache.googleusercontent.com/search?q=cache:lw4K9G2...

This will take weeks to clean, and that's just for Google.

EDIT2: found other oauth tokens, lots of fitbit calls... And this just by searching for typical CF internal headers on Google and Bing. There is no way to know what else is out there. What a mess.

  • Ouch, you really see everything:

    > authorization: OAuth oauth_consumer_key ...

    What a shit show. I'm sorry, but at this point there must be consequences for incompetence. Some might argue "but nobody could have done anything" ...

    I'm sorry, CF has the money to ditch C entirely and rewrite everything from the ground up in a safer language; I don't care what it is, Go, Rust, whatever.

    At that point, people using C directly are playing with fire. C isn't a language for highly distributed applications; it will only distribute memory leaks ... With all the wealth there is in the whole Silicon Valley, trillions of dollars, there is absolutely zero effort to come up with an acceptable solution? All these startups can't come together and say, "Ok, we're going to design or choose a real safe language and stick to that"? Where does all that money go, then? Because this bug is going to cost A LOT OF MONEY to A LOT OF PEOPLE.

    • These guys were probably saved by using OAuth - there is a consumer secret (which the "_key" is just an identifier for) and an access token secret, both of which are not sent over the wire. Just a signature based on them. (The timestamp and nonce prevent replay attacks.)

      OAuth2 "simplified" things and just sends the secret over the wire, trusting SSL to keep things safe.
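
      To make the contrast concrete, a rough sketch (not a real OAuth 1.0a implementation; the base string and signing key are simplified stand-ins, and it assumes OpenSSL's libcrypto for the HMAC):

          /* oauth_contrast.c - Build: gcc oauth_contrast.c -lcrypto */
          #include <stdio.h>
          #include <string.h>
          #include <openssl/evp.h>
          #include <openssl/hmac.h>

          int main(void) {
              /* OAuth1-style: the secrets never leave the client; only an
                 HMAC-SHA1 signature over the request goes on the wire. */
              const char *signing_key = "consumer_secret&token_secret";
              const char *base_string =
                  "GET&https%3A%2F%2Fapi.example.com%2Fv1%2Fuser&oauth_nonce%3D...";
              unsigned char sig[EVP_MAX_MD_SIZE];
              unsigned int sig_len = 0;
              HMAC(EVP_sha1(), signing_key, (int)strlen(signing_key),
                   (const unsigned char *)base_string, strlen(base_string),
                   sig, &sig_len);
              printf("OAuth1 on the wire: oauth_signature=<%u-byte HMAC>\n", sig_len);

              /* OAuth2-style: the bearer token *is* the secret, sent verbatim,
                 so a leaked header is a leaked credential. */
              printf("OAuth2 on the wire: Authorization: Bearer <secret token>\n");
              return 0;
          }

      A captured OAuth1 signature is tied to a specific nonce and timestamp, so it can't simply be replayed, whereas a captured bearer token keeps working until it's revoked.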

      1 reply →

  • Good. They're trying to clean up all the private data leaked everywhere. I'm tempted to say "why couldn't they figure out this Google dork themselves", but they've probably been slammed for the past 7 days cleaning up a bunch of stuff anyway.

  • > This will take weeks to clean, and that's just for Google.

    Couldn't Google just purge all cached documents which match any Cloudflare header? This will probably purge a lot of false positives, but it's just cached data, so would that loss really matter? My guess is that this approach should not take more than a few hours on Google's infrastructure.

    Of course, this leaves the problem of all the other non-Google caches out there.

  • OAuth1 doesn't send the secrets with the requests, just a key to identify the secret and a signature made with the secret.

    OAuth2 does send the secret, typically in an "Authorization: Bearer ..." header.

    The uber stuff that somebody else linked to looks like a home-grown auth scheme and it appears that "x-uber-token" is a secret, but hard to know for sure.

  • So while people are having fun here with search queries, how many scripts are already up and running in the wild, scraping every caching service they can think of in creative ways for useful data...

    This is an ongoing disaster, wasn't this disclosed too soon?

The "well-known chat service" mentioned by Tavis appears to be Discord, for the record.

edit: Uber also seems to be affected.

>It is a snapshot of the page as it appeared on Feb 21, 2017 20:20:45 GMT

So the issue wasn't fully fixed on Feb 19, or Google's cache date isn't accurate?

It seems like the reasonable thing for Google to do is to clear their entire cache. The whole thing. This is the one thing that they could do to be certain that they aren't caching any of this.

  • What about Bing, Baidu, Yandex, The Internet Archive, and Common Crawl? What about caches that are surely maintained by the NSA, ФСБ, and 3PLA?

    • Of course. Google dumping their cache puts only a small dent into the problem, but I feel that it's their responsibility to the innocent site operators caught in the middle of this.

      2 replies →

  • CF should be thankful Google is doing any of this; clearing their entire cache would cost Google money to re-index the web from scratch.

  • That might be a bit too extreme. But they should do something quickly to try to find all of these.

    • I would say Cloudflare should hire them to try to find these. It's really not on Google, IMO (unless caching has some implications regarding storing sensitive data).

Wow, I just tried this, the first result with a google cache copy has a bunch of the kind of data described. Although there was only one result with a cache.

  • PII, OAuth data, etc.

    • I've so far seen an OAuth key for Fitbit (via their Android app) and API keys for Trakt (though apparently that service doesn't use them?).

      I don't know, this just seems catastrophic.

The first couple I looked at were requests to Uber and Fitbit...

  • One of my Uber rides two weeks ago went completely nuts. Both my app and my driver's app screwed up at the same time; I was never picked up, and then seconds later the app claimed I had reached my destination.

    You have to wonder whether something like this is implicated.