← Back to context

Comment by rckoepke

6 years ago

What types of AWS data would be trawled? Are we talking about data inside S3 buckets, database schemas, particular architecure styles, the fact that a product is consuming {x, y, z} amounts of cloud resources, or simply "spending $m / year" in gross?

I worked in an area where it is really hard to figure out exactly what workloads were being run and where it would have been extremely useful to know even basic things like CPU utilization patterns, network throughput patterns, etc for a specific customer.

We had access to absolutely none of that information. We flew blind, relying entirely on the fact that we gave our customers enough hand-holding support that they would willingly volunteer information about their workloads so we could help them optimize it/save money.

No one even attempted to get more detailed customer information AFAIK because it would have been extremely against company culture. That isn't Earning Trust or having Customer Obsession. The idea of reading data in someone's S3 bucket or inspecting what is happening inside of someone's EC2 instance in any way was unthinkable. Amazon is huge and imperfect, but from what I saw AWS takes data privacy extremely seriously.

I can confidently tell you that Amazon's employees cannot see customers data inside S3 buckets or EC2 instances. They are extremely serious about that stuff since they know that will erode their customer's confidence.

But there's probably other superficial business data that's helpful to evaluate that.

  • > I can confidently tell you that Amazon's employees cannot see customers data inside S3 buckets or EC2 instances.

    From a technical standpoint, that statement is false.

    Every employee might not have the credentials to, but for AWS to function as it does, SOMEONE inside the company has to have those credentials.

    If you change 'cannot' to 'don't', well then we've just gotta take you at your word, which is where we started anyway.

    • > SOMEONE

      That's not necessary unless SOMEONE includes computer programs.

      Yes, when things go very seriously wrong, I believe AWS can have literal people override that permission, which will leave a mile long audit trail and likely accompanied by an internet scale outage.

      5 replies →

    • Except they get audited by 3rd parties on statements like that, and have controls tested. It's not like they're just ... digital ocean or somebody.

      8 replies →

    • Yeah. True. I guess what I meant is that just a handful of employees have access to that and they need to have legitimate reasons.

    • Also, it is possible to build systems such that, no, there isn't a 'root' or 'unlimited permission' or whatever. Or that there is, but it's a multi-person credential.

      This is one area where AWS takes things MUCH more seriously than it's competition, and they don't talk about it enough publicly.

    • The critical factor here is whether there are controls in place to prevent it. Sure, somebody probably could, but what to what lengths must that person go to do it, and what happens when it is discovered? Most things are not technically impossible, after all.

    • for its faults aws takes data privacy super serious. if you are in support you cant even see attachments customers put on cases without providing auditable justification

      and you def cant see in s3 buckets or instances. hell if a customer sends you a link to an object in their s3 youre not supposed to open it

      5 replies →

  • This is incorrect, at least from a logical POV and why it's hard to trust what cloud vendors say. A statement like this is either naive (most likely) or actively attempting to mislead.

    Technically, its absolutely possible. Most likely you'll just need a support ticket or bug, and then you can troll around as engineer.

    Also, security teams also usually have access to stuff when things get interesting.

    Better to say that access is strictly on a case by case basis and monitored thoroughly.

    Ideally customer is notified each time it happens - that would be cool, but likely technically not possible since data ends up in so many systems (like logs, SIEM, telemetry, debug files, backups, data scientist desktops,....)

    • > Ideally customer is notified each time it happens - that would be cool, but likely technically not possible

      You're underestimating the investments that AWS (and Amazon at large) make in to security, confidentiality, and auditing. You're also missing a fundamental implication of building AWS on AWS primitives.

      As a relevant example there is only one AWS IAM and one CloudTrail. It's a core tenant of AWS IAM to put that control and root of trust in to the customers control. That means when developer support is helping with your ticket they do so via your accounts AWSServiceRoleForSupport role. That means you can control whether that role exists, which principals can assume it, the capabilities it has, and you can see those same API calls in your CloudTrail logs. Although it would make support difficult you're welcome to delete that service linked role and prevent support.amazonaws.com from assuming said role in your account.

      https://docs.aws.amazon.com/awssupport/latest/user/using-ser...

      2 replies →

    • This really isn't how modern security works at most cloud companies. Even if you have root class credentials or the ability to escalate to them in some way (and that's a big if by itself), its a LOT of steps to get access to customer data, almost always involving broken glass, many protection layers, and often requires cooperation of multiple other root level people/credentials from completely different teams.

      Depending on how the infrastructure is built, or what the particular service set up, it may not even be possible to gain access to specific data without extraordinary means, possibly involving replacing physical hardware.

    • I already corrected my statement in another reply. You're right. I said probably only a handful of people can access customer data to do their job. I personally never met one. The goal of my comment was to illustrate that in my experience handling customer data there was a big deal. It's not like something you can casually query to see if a particular customer has a good business or not.

  • Amazon is a massive company. How can you know this with confidence? Are you in the C-Suite?

    • It’s the thing they tell you the most when you work there. Like in a a obnoxious way. Most infosec training is about that.

      If someone has access to customer’s data for their work they have to do a bunch of extra training and do other stuff. Potentially sign some things and there’s probably a different way to authenticate. I really don’t know because I never had to do that and nobody I knew had that type of access but I heard when you do you have to put with more things.

      6 replies →

  • 1. Did you work on a team at Amazon in the likes of what user throwaway_aws mentioned?

    2. What measures that you know of is Amazon implementing to make sure no employees across all teams are having access to said resources?

    • As I said below this is something that they will talk a about like every freaking day. They talk about customer’s data as the most important thing to take care of.

      Basically is preferable to get a bullet in the head than to ever reveal or tamper with customer’s data.

      I cannot answer your question about who has access or not but I’m telling you what’s the culture when it comes to customer’s data.

      At the end of the day I was just another IC doing menial work so probably not a good reference, but that was my experience

  • I'm sorry but what you just said is patently false:

    https://www.bloomberg.com/news/articles/2019-07-29/capital-o...

    Quote:

    Capital One Financial Corp. said data from about 100 million people in the U.S. was illegally accessed after prosecutors accused a Seattle woman identified by Amazon.com Inc. as one of its former cloud service employees of breaking into the bank’s server.

    While the complaint doesn’t identify the cloud provider that stored the allegedly stolen data, the charging papers mention information stored in S3, a reference to Simple Storage Service, Amazon Web Services’ popular data storage software.

    • My reading of this is that the ex-employee used the knowledge about EC2 instance credentials being accessible as a path to gain unauthorized access to data. In theory anyone could have exploited this vulnerability even if they had never worked for Amazon. They never say that Amazon employees had privileged credentaials that would give them unauthorized access to customer data.

      AWS customers that want to avoid this vulnerability should disable IMDSv1 as per https://aws.amazon.com/blogs/security/defense-in-depth-open-...

      1 reply →

    • That leak didn't involve any insider access. So it doesn't prove that employees get access to the S3 data.

Can speak for AWS. Only the later. Basically the usage information for cloud resources. This constitutes the foundation for billing. BTW, this is be true for any cloud, any SAAS.

There is no way an employee can look into customer data. There's enough trail inside AWS to prove that without any doubt.

  • What are the measures being implemented to ensure that no employee can look into customer's data?

    • I used to work for AWS and had to deep dive into IAM to build a feature.

      Basically Everytime you touch AWS your session is tagged with your credentials and has a unique ID. So everything downstream you touch has your session ID associated with it.

      Now say somebody from Redshift wants to access the customer's data. They will then need to access to the encryption key in KMS. The trail will be there since KMS lives in the customer's account (you can audit your own access). And for production services, human actors cannot access these keys - only production credentials can. An engineer who can log into a prod host in theory can grab the temporary credentials there but it expires in 15 minutes so your trail will be rather visible. Also access to prod host has a high bar - only senior people can do it.

      Now in theory somebody can coordinate with a malicious user in KMS team - but the bar is high. Also the actual master key never leaves the premise for KMS so your attack surface is very limited.

      Of course there are some core teams like IAM and KMS where if they become vulnerable the whole thing falls apart. But that's a big stretch for those systems since they are the core to the business.

      9 replies →

    • I can tell you generally how this works in Azure, I can't speak for AWS, but unless a customer is using BYOK for encryption of their data, I can't imagine how AWS c o u l d n ' t be capable of accessing data, and even then I wouldn't gurantee they couldn't still get your data. In Azure (as of a couple years ago), in order to access a customer's tenant it required VP approval, the support engineer was granted access for a specific amount of time, and typically only to specific services, all with the customers knowledge beforehand. It may have changed since the last time I had to go through this process and was restricted to blue badge employees. I have worked support cases since then and the support engineer would not even do a log me in/WebEx, etc session as they said they were not allowed to see the portal. But it may have been that they were not a blue badge and/or bcuz the customer was a critical infrastructure customer.

      In order for AWS to comply with LEO's they must have some way of accessing data, that is NOT to say they do this for business purposes.

      1 reply →

    • Other than the data each service actually retains themselves (i.e. the Lambda service themselves store your Lambda Functions because they need to execute them) customer data is generally stored encrypted at rest with KMS keys belonging to the customer (or sometimes managed by the storage team). It wouldn't be possible to peer into unencrypted data without persuading the KMS API to authenticate your access to the key. Presumably this capability exists, because otherwise Amazon wouldn't be able to honor warrants for customer data, but the premise that KMS is handing out decryption tokens for customer data for the benefit of Amazon Retail's business analysts is pretty silly.

      And of course, you're always vulnerable to someone with access to the physical host of an EC2 instance where your workload is running. Only GCP AFAIK offers an encrypted-in-processing compute service, and it's like a week old.

      https://cloud.google.com/blog/products/identity-security/int...

Given how granular AWS billing data is, I would expect the odds to be fairly good that it alone is sufficient to make a good analysis for which third-party offerings are compelling markets. Then AWS takes their execution advantage, along with things like the lower friction that arises from first-party integration with IAM and billing, as well as not having to pay retail for the cloud resources, and it becomes very difficult to retain a moat unless you have a paradigm or perspective that is both critical to succeeding and is also incompatible with AWS culture.

  • You’re correct. It’s disturbingly detailed as far as what it reveals about architecture.