Comment by tptacek

8 years ago

Oh, my god.

Read the whole event log.

If you were behind Cloudflare and it was proxying sensitive data (the contents of HTTP POSTs, &c), they've potentially been spraying it into caches all across the Internet; it was so bad that Tavis found it by accident just looking through Google search results.

The crazy thing here is that the Project Zero people were joking last night about a disclosure that was going to keep everyone at work late today. And, this morning, Google announced the SHA-1 collision, which everyone (including the insiders who leaked that the SHA-1 collision was coming) thought was the big announcement.

Nope. A SHA-1 collision, it turns out, is the minor security news of the day.

This is approximately as bad as it ever gets. A significant number of companies probably need to compose customer notifications; it's, at this point, very difficult to rule out unauthorized disclosure of anything that traversed Cloudflare.

In case you're wondering how this could be worse than Heartbleed:

Yes, apparently the allocation patterns inside Cloudflare mean TLS keys aren't exposed to this vulnerability.

But Heartbleed happened at the TLS layer. To get secrets from Heartbleed, you had to make a particular TLS request that nobody normally makes.

Cloudbleed is a bug in Cloudflare's HTML parser, and the secrets it discloses are mixed in with, apparently, HTTP response data. The modern web is designed to cache HTTP responses aggressively, so whatever secrets Cloudflare revealed could be saved in random caches indefinitely.
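
To make the bug class concrete, here is a toy model of a buffer over-read. It is only a sketch, not Cloudflare's parser code: the function names, the fixed step size, and the memory layout are all made up for illustration. The assumption is that a page's bytes sit directly next to someone else's data, and that the end-of-buffer check tests for equality, so a cursor that steps over the end never stops.

```python
# Toy model of a buffer over-read. NOT Cloudflare's actual parser code;
# every name here is hypothetical and exists only for illustration.

def make_memory(page: bytes, neighbour: bytes) -> tuple[bytearray, int]:
    """Lay out `page` followed directly by unrelated `neighbour` bytes."""
    return bytearray(page + neighbour), len(page)


def copy_page(memory: bytearray, end: int, step: int, safe: bool) -> bytes:
    """Copy the page out of `memory`, advancing `step` bytes at a time.

    safe=True  stops once the cursor is at or past `end` and clamps the
               final chunk, so nothing beyond the page is copied.
    safe=False stops only when the cursor lands exactly on `end`; if a
               step jumps over `end`, copying continues into the
               neighbour's bytes until the toy memory runs out.
    """
    out = bytearray()
    pos = 0
    while True:
        if safe:
            if pos >= end:
                break
            stop = min(pos + step, end)
        else:
            # The len(memory) guard only exists so the toy terminates.
            if pos == end or pos >= len(memory):
                break
            stop = pos + step
        out += memory[pos:stop]
        pos += step
    return bytes(out)
```

With a 10-byte page and a step of 4, the unsafe cursor goes 0, 4, 8, 12 and never equals 10, so the neighbour's bytes end up in the output. With an 8-byte page the same code stops cleanly, which is one way a bug like this can stay hidden for most inputs.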

You really want to see Cloudflare spend more time discussing how they've quantified the leak here.

  • > You really want to see Cloudflare spend more time discussing how they've quantified the leak here.

    What would you like to see? The SAFE_CHAR logging allowed us to get data on the rate, which is how I got the % of requests figure.

    • Perhaps as a follow up to this bug, you can write a temporary rule to log the domain of any http responses with malformed HTML that would have triggered a memory leak. That way you can patch the bug immediately, and observe future traffic to find the domains that were most likely affected by the bug when it was running.

      Or is the problem that one domain can trigger the memory leak, and another (unpredictable) domain is the "victim" that has its data dumped from memory?

  • It shouldn't be too difficult to feed an instrumented copy of the parser some fraction of their cached pages (after all, that's what they're for.. right?) and calculate a percentage of how many triggered e.g. valgrind, or just some magic string tacked on the end of the input appearing in the output or similar

    I prefer CloudScare to Cloudbleed :)
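
The "magic string" idea above could be sketched roughly as follows. This is a minimal harness under stated assumptions: the parser is modelled as a callable taking a buffer and the declared input length, `CANARY` is an arbitrary marker, and `buggy_parser` / `fixed_parser` are toy stand-ins, not the real parser.

```python
from typing import Callable, Iterable

# Hypothetical marker, chosen to be unlikely to occur in real pages.
CANARY = b"@@CANARY-7f3a@@"

# Assumed parser shape: (buffer, declared input length) -> output bytes.
Parser = Callable[[bytes, int], bytes]


def leak_rate(parser: Parser, pages: Iterable[bytes]) -> float:
    """Return the fraction of pages whose output contains the canary.

    The canary is placed immediately after each page, outside the
    declared length, so it can only reach the output via an over-read.
    """
    total = 0
    leaked = 0
    for page in pages:
        total += 1
        output = parser(page + CANARY, len(page))
        if CANARY in output:
            leaked += 1
    return leaked / total if total else 0.0


def buggy_parser(buf: bytes, length: int) -> bytes:
    """Toy stand-in: over-reads when the page ends inside an unclosed tag."""
    page = buf[:length]
    if page.rfind(b"<") > page.rfind(b">"):
        return buf  # runs past the declared end
    return page


def fixed_parser(buf: bytes, length: int) -> bytes:
    """Toy stand-in that respects the declared length."""
    return buf[:length]
```

Running `leak_rate` over a sample of cached pages would give a rough percentage of inputs that trigger the over-read, though it says nothing about whose data was adjacent in memory at the time.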

It is far from over, too! Google Cache still has loads of sensitive information, a link away!

Look at this, click on the downward arrow, "Cached": https://www.google.com/search?q="CF-Host-Origin-IP:"+"author...

(And then, in Google Cache, "view source", search for "authorization".)

(Various combinations of HTTP headers to search for yield more results.)
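
For anyone sweeping their own cached copies rather than clicking through search results, the same idea can be sketched as a simple marker scan. The marker list only contains header names mentioned in this thread, and the function name is made up; a real sweep would need a far longer list and would still produce false positives (any page that merely discusses HTTP headers, for instance).

```python
# Header names mentioned in this thread, lowercased. This list is an
# illustration, not a complete set of leak indicators.
LEAK_MARKERS = ("cf-host-origin-ip:", "authorization:", "x-uber-token")


def find_leak_markers(cached_page: str) -> list[str]:
    """Return the markers that occur in the cached page text.

    Matching is case-insensitive because HTTP header names are.
    """
    lowered = cached_page.lower()
    return [marker for marker in LEAK_MARKERS if marker in lowered]
```

A hit means the page deserves a closer look, not that it definitely contains leaked memory.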

  • > The infosec team worked to identify URIs in search engine caches that had leaked memory and get them purged. With the help of Google, Yahoo, Bing and others, we found 770 unique URIs that had been cached and which contained leaked memory. Those 770 unique URIs covered 161 unique domains. The leaked memory has been purged with the help of the search engines.

    So I tried it too, and there's still data cached there.

    Am I misunderstanding something - that above statement must be wrong, surely?

    They can't have found everything even in the big search engines if it's still showing up in Google's cache, let alone the infinity of other caches around the place.

    EDIT: If the Cloudflare team sees this, I see leaked credentials for these domains:

    android-cdn-api.fitbit.com

    iphone-cdn-client.fitbit.com

    api-v2launch.trakt.tv

    • Could someone enlighten me on why malloc and free don't automatically zero memory by default?

      Someone pointed me to MALLOC_PERTURB_ and I've just run a few test programs with it set - including a stage1 GCC compile, which granted may not be the best test - and it really doesn't dent performance by much. (edit: noticeably, at all, in fact)

      People who prefer extreme performance over prudent security should be the ones forced to mess about with extra settings, anyway.

    • > that above statement must be wrong, surely?

      Either they believe it's right, which means they're not competent enough to really assess the scope of the leak; or they don't believe it, but they went "fuck it, that's the best we can do".

      In either case, it doesn't really inspire trust in their service.

    • jgrahamc: can you list which public caches you worked with to attempt to address this? It does not inspire confidence when even Google is still showing obvious results.

  • https://webcache.googleusercontent.com/search?q=cache:lw4K9G...

        Internal Upstream Server Certificate
        ...
        /C=US/ST=California/L=San Francisco/O=Cloudflare Inc./OU=Cloudflare Services - nginx-cache/CN=Internal Upstream Server Certificate
    

    That really doesn't look good.

    • Just to point out, this is apparently a cert used for communicating between Cloudflare's services, which has (presumably) been replaced. Cloudflare customers' certs weren't exposed.

  • Lol, Google just purged that search.

    EDIT: but there's still plenty of fish: http://webcache.googleusercontent.com/search?q=cache:lw4K9G2...

    This will take weeks to clean, and that's just for Google.

    EDIT2: found other oauth tokens, lots of fitbit calls... And this just by searching for typical CF internal headers on Google and Bing. There is no way to know what else is out there. What a mess.

    • Ouch, you really see everything:

      > authorization: OAuth oauth_consumer_key ...

      What a shit show. I'm sorry, but at that point there must be consequences for incompetence. Some might argue "But nobody can do anything" ...

      I'm sorry, CF has the money to ditch C entirely and rewrite everything from the ground up in a safer language. I don't care what it is: Go, Rust, whatever.

      At that point people using C directly are playing with fire. C isn't a language for highly distributed applications; it will only distribute memory leaks ... With all the wealth there is in the whole of Silicon Valley, trillions of dollars, there is absolutely zero effort to come up with an acceptable solution? All these startups can't come together and say: "OK, we're going to design or choose a real safe language and stick to that"? Where does all that money go, then? Because this bug is going to cost A LOT OF MONEY to A LOT OF PEOPLE.

    • Good. They're trying to clean up all the private data leaked everywhere. I'm tempted to say "why couldn't they figure out this Google dork themselves" but they've probably been slammed for the past 7 days cleaning up a bunch of stuff anyway.

    • > This will take weeks to clean, and that's just for Google.

      Couldn't Google just purge all cached documents which match any Cloudflare header? This will probably purge a lot of false positives, but it's just cached data, so would that loss really matter? My guess is that this approach should not take more than a few hours on Google's infrastructure.

      Of course, this leaves the problem of all the other non-Google caches out there.

    • OAuth1 doesn't send the secrets with the requests, just a key to identify the secret and a signature made with the secret.

      OAuth2 does send the secret, typically in an "Authorization: Bearer ..." header.

      The uber stuff that somebody else linked to looks like a home-grown auth scheme and it appears that "x-uber-token" is a secret, but hard to know for sure.
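
The difference between the two schemes can be sketched like this. It is a simplified illustration, not a spec-complete OAuth 1.0 implementation: percent-encoding, nonces, timestamps, and base-string construction are omitted, and the function names are made up.

```python
import base64
import hashlib
import hmac


def oauth1_style_signature(base_string: str, consumer_secret: str,
                           token_secret: str = "") -> str:
    """HMAC-SHA1 over the request's base string, keyed by the secrets.

    Simplified: real OAuth 1.0 percent-encodes the key parts and builds
    the base string from the method, URL, and sorted parameters.
    """
    key = f"{consumer_secret}&{token_secret}".encode()
    digest = hmac.new(key, base_string.encode(), hashlib.sha1).digest()
    return base64.b64encode(digest).decode()


def oauth1_style_header(consumer_key: str, signature: str) -> str:
    """Header carries the key identifier and the signature, not the secret."""
    return (f'OAuth oauth_consumer_key="{consumer_key}", '
            f'oauth_signature_method="HMAC-SHA1", '
            f'oauth_signature="{signature}"')


def bearer_header(token: str) -> str:
    """Header carries the token itself, so a leaked header leaks the credential."""
    return f"Bearer {token}"
```

So a leaked OAuth1-style header exposes a signature tied to one request, while a leaked bearer header hands over something that can be reused until the token is revoked or expires.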

    • So while people are having fun here with search queries, how many scripts are already up and running in the wild, scraping every caching service they can think of in creative ways for useful data...

      This is an ongoing disaster, wasn't this disclosed too soon?

  • The "well-known chat service" mentioned by Tavis appears to be Discord, for the record.

    edit: Uber also seems to be affected.

  • >It is a snapshot of the page as it appeared on Feb 21, 2017 20:20:45 GMT

    So the issue wasn't fully fixed on Feb 19, or Google's cache date isn't accurate?

  • It seems like the reasonable thing for Google to do is to clear their entire cache. The whole thing. This is the one thing that they could do to be certain that they aren't caching any of this.

    • What about Bing, Baidu, Yandex, The Internet Archive, and Common Crawl? What about caches that are surely maintained by the NSA, ФСБ, and 3PLA?

    • CF should be thankful Google is doing any of this; clearing their entire cache would cost Google money to re-index the web from scratch.

  • The first couple I looked at were requests to Uber and Fitbit...

    • One of my Uber rides two weeks ago went completely nuts. Both my app and my driver's app screwed up at the same time; I was never picked up, and then seconds later the app claimed I had reached my destination.

      You have to wonder whether something like this is implicated.

If anyone here is HIPAA-regulated or you have a customer who is, and you used Cloudflare during those dates, it is Big Red Button time. You've almost certainly got a reportable breach; depending on how tightly you're able to scope it maybe it won't be company-ending.

  • > If anyone here is HIPAA-regulated or you have a customer who is

    Cloudflare certainly does; I founded a health tech company, and Cloudflare was the recommended go-to for health tech startups who needed a CDN while serving PHI.

    And this is definitely a reportable breach. Technically any breach is supposed to be reported to HHS, but in reality, a lot of covered entities (e.g. insurers) fail to report smaller breaches (which, as a patient, should terrify you). The big ones, though, are really, really bad, and when reported, the consequences can be very serious and potentially even include serving time, depending on the circumstances.

    The reason I can be so confident that this is a reportable breach is that the definition of PHI is so broad that even revealing the existence of information between two known entities can be considered protected information. Anything more specific, like a phone number or DOB, or time of an appointment (even if you don't know who the appointment corresponds to) - that's always protected. And Cloudflare certainly has many of those.

    • Well, HIPAA wouldn't allow your HTTPS traffic to flow unencrypted through a shared proxy, right? This means Cloudflare couldn't offer that feature, so they probably didn't?

      Just think about the HIPAA document describing a single endpoint of dozens of sensitive datastreams, decrypting and then encrypting them all on the same machine, a machine that does some random HTML parsing for snippet caching on the side.

      I don't see that passing review, but perhaps I'm naive.

  • Isn't it worse than that? Even if you are not a CF user, if your apps make calls to a third party site protected by CF, you could be at risk (stolen credentials, API keys), and could be attacked using those now.

    • That's also a bad thing, but you can roll creds and check if anyone has exfiltrated data from your various accounts. You can't roll patient identities. There doesn't appear to be any way to figure out which of your HTTPS pages served in the last 6 months are presently publicly exposed.

      I feel for folks who lost API keys -- really -- but everyone regulated should be in full-on disaster recovery mode right now.

  • If you are/were using Cloudflare to cache PHI through their CDN without a BAA, you were likely in breach before this.

    Some have suggested that Cloudflare might not be a business associate because of an exception to the definition of business associate known as the "conduit" exception.

    Cloudflare is almost certainly not a conduit. HHS's recent guidance on cloud computing takes a very narrow view[0]:

    "The conduit exception applies where the only services provided to a covered entity or business associate customer are for transmission of ePHI that do not involve any storage of the information other than on a temporary basis incident to the transmission service."

    OCR hasn't clarified what "temporary" means or whether a CDN would qualify, but again, almost certainly not. ISPs qualify, but your data just sits on the CDN indefinitely.

    p.s. Hi Patrick and Aditya!

    [0] https://www.hhs.gov/hipaa/for-professionals/special-topics/c...

    • Agree completely with you on this, and based on my experience with OCR, I'd say they would as well. The analogy for a "mere conduit" is the postal service. And that analogy falls apart as soon as you realize that CloudFlare, when being used as an SSL termination point, is opening and repackaging each "letter" on the way to the destination.

      I do hate for CloudFlare to be the example for companies playing fast and loose with the rules, but I am hoping we'll have an opportunity in this to clarify the conduit definition a bit more.

      Would like to mention that I don't think this declaration applies to every scenario. CloudFlare isn't just one service. I don't see an immediate issue using CloudFlare for DNS on a healthcare app. Neither do I see an issue using CloudFlare as the CDN for static assets. Both of these cases should be evaluated in a risk analysis, but they don't necessitate the level of shared responsibility a BAA entails.

I remember Tavis tweeted Friday night asking for a Cloudflare engineer to contact him, and everyone joked that the last thing you want on a Friday evening is an urgent message from Tavis Ormandy.

  • That was my tweet believe it or not. I had to turn notifications off on my phone because out of nowhere it was getting bombarded with shares/likes...

I would say the crazy thing is a mere t-shirt as their "bug bounty" top tier award given how they've pitched themselves as an extremely secure service.

https://hackerone.com/cloudflare

I'm sorry but when the reward for breaking into you is basically a massive pinata of personal information...that simply is a bad joke. Security flaws are going to happen and if you aren't going to even offer a reasonable financial reward to report them to you, well, that is just begging to be exploited with a pinata that size.

  • Nah. Bug bounties don't work for services like CDNs. Maybe they do elsewhere. But for enterprise services, the noise rate is too high, and the very good bug finders are either salaried, free, or working for the adversary.

    • I think I'd need to see some sort of evidence of this assertion. Bug bounties are commonly offered across a huge variety of online services, and they get results...not always, not necessarily consistently high quality, but even the giants (facebook comes to mind) have had reasonably serious bugs found by people seeking bounties.

    • > Nah. Bug bounties don't work for services like CDNs. Maybe they do elsewhere. But for enterprise services, the noise rate is too high, and the very good bug finders are either salaried, free, or working for the adversary.

      Yes, running a real bug bounty system requires professional security engineers and a professional security posture to sort through the noise. However, when the sole product you are selling is security (i.e. Cloudflare) you kind of have to admit it should be expected that they do so.

      It isn't "too high", it simply requires a serious financial commitment to security in the terms of salaried security engineers.

      As to your other point, no one works for free. Project Zero is paid for by Google. Security engineers are going to prioritize the purposes that make them real, hard cash.

  • What would make sense (to me, not a business/marketing guy, nor a lawyer, at all) would be a t-shirt and free subscription as the offered thing, something which costs the company nothing.

    Then for anything like this, publicly give a bonus gift which makes it worth people reporting to them rather than selling it on the black market, once it's gone through the legal dept. and so on.

    Then they can be very quick with handing out t-shirts and so on for any and every micro-issue report, without the people running triage having to care about amounts or tax or whatever.

    Having any kind of publicly offered payment for service (beyond a t-shirt bounty or services in kind) is just begging for legal issues, right?

  • The reward includes a t-shirt, it isn't a mere t-shirt. You also get "12 months of CloudFlare's Pro or 1 month of Business service on us" (~$200). The reward is also not tiered.

    The award may still not be all that much, but let's not make things up about them.

    • That's still pretty much as silly as a tshirt. When a vulnerability was found in my hobby project I paid 200 to the reporter as a thanks. From my own pocket for my own open source program.

    • If I needed CF Pro though I'd already be on it.

      I mean I guess it's good if you're already on Pro and could do with the freebie year but it's not really much to get the whitehats auditing your systems for free*

      * free unless they find something

    • > The reward includes a t-shirt, it isn't a mere t-shirt. You also get "12 months of CloudFlare's Pro or 1 month of Business service on us" (~$200). The reward is also not tiered.

      I've never put any of my sites behind Cloudflare precisely because I never had faith their WAF would always be bug free and I'm not comfortable with their MitM position.

      Getting me to use your service on a time-limited basis falls more under the category of a "try-it-so-you-buy-it" marketing ploy than a real bonus to me. It benefits Cloudflare more than the researcher for that reason: if they use it, they'll be invested in continuing to "help" Cloudflare, since they'll be dependent on it.

      I'm sorry, I just don't buy that this is anything but a marketing ploy wrapped up as a bonus.

Can someone tell me the implications of this in layman's terms?

For instance, what does "sprayed into caches" mean? What cache? DNS cache? Browser cache? If the latter, does it mean you are safe if the person who owns that cache is an innocent, non-technical user?

  • There are caches all over the Internet; Google and Microsoft run some of them, but so do virtually every Fortune 500 company, most universities, and governments all over the world.

    The best way to understand the bug is this: if a particular HTTP response happened to be generated in response to a request, the response would be intermingled with random memory contents from Cloudflare's proxies. If that request/response happened through someone else's HTTP proxy --- for instance, because it was initiated by someone at a big company that routes all its traffic through a Bluecoat appliance --- then that appliance might still have that improperly disclosed memory saved.

  • There are all kinds of places where things are cached, both on- and offline. Your data may end up in:

    * Browser caches.

    * Sites like the Wayback Machine or search engines that make copies of webpages and save them.

    * Tools that store data downloaded from the web, e.g. RSS readers.

    * Caching proxies.

    * The list goes on and on.

    I think what tptacek wanted to say: It's just so common that people download things from the web and store them without even thinking much about it. And all those places where this happens now potentially can contain sensitive data.

  • Many services on the internet keep a copy of a page they have loaded in the past. Google does this, for example. It lets them do things like search across websites quickly.

    Many of these caches are available online, to anyone who wants to look at them.

    This bug meant that any time a page was sent through Cloudflare, the requester might receive the page plus some sensitive personal information, or credentials that could be used to log in to a stranger's account. Some of these credentials might let a bad actor pretend to be a service like Uber or Fitbit.

    This very sensitive information might end up saved in a public cache, where anyone could find it and use it to do harm.

    • What are my rough odds of having stored a credential, if I were a provider?

      What are the odds I had a credential stored?

      We know the impact, but what are the odds for a provider and for a possible exposee?

  • It's reminiscent of the earlier days of the Squid cache.

    When it had bugs and delivered up cached files, the typical symptom was that everyone in the company got unwanted porn.

    Because the biggest user (by far) of the 'net was the person into porn and so 90% of the Squid cache was porn.

    • It served the wrong resource instead of failing to serve a resource? Back then, if I were to suffer this, what is the likelihood of a porn for cats experience?

  • Far worse than this. Yes, browser caches, but also the caches of web crawlers (like Google). This means that anyone who requested certain public content could have instead received secret content from completely unrelated websites.

  • As for the SHA-1 collision mentioned by jgrahamc[1] earlier today:

    How am I going to explain this to my wife?

    Actually a serious question. How do we communicate something like this to the general public?

    [1] https://news.ycombinator.com/item?id=13713826

    • "It's like some extremely popular remailer company accidentally put badly or barely shredded copies of handled letters into other people's envelopes. Strangers' sensitive info is potentially sitting inside unsuspecting mailboxes worldwide."

> A significant number of companies probably need to compose customer notifications;

As a one-man company who has never done this before (and, to the best of my knowledge, never needed to): any guides or examples for writing a customer notification for security slip-ups like this? Or just recommendations? Thanks.

  • It's as easy as throwing a red banner on your website that explains the situation briefly and recommends that users change their passwords. If you take this more seriously, you can force a password reset for all users. It depends on how sensitive the information is that your users trust your site to hold.

What a mess.

On the plus side, all those booter services hiding behind Cloudflare are probably being probed and classified/identified/disabled by competitors and probably the FBI. That is good.

>Tavis found it by accident just looking through Google search results.

Curious whether there could be some automated way of preventing such widespread cache poisoning in the future. Some ML trained on valid pages from a given domain?

Is it even possible to recover the original content of the documents or was the data randomly inserted into different parts?