This is just the storage cost. That is, they will keep your data on their servers, nothing more.
Now if you want to do something with the data, that's where you need to hold on to your wallet. Either you use their compute ($$$ for Amazon) or you send it to your own data centre (egress means $$$ for Amazon).
When you start to do the math, hard drives are cheap when you go for capacity and not performance.
0.00099 * 1000 is 0.99 a month for a TB, so about $12 a year. Extrapolate that over a 5 or 10 year period and you get $60 to $120 per TB. Even at 3 to 5x redundancy, those numbers start to add up.
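A quick sketch of that arithmetic, assuming the 0.00099 figure is a price in USD per GB per month and that 1 TB is 1000 GB. The function names are made up for illustration, and `budget_per_raw_tb` is only one reading of the redundancy remark: the revenue per stored TB spread across the raw drive capacity needed at a given redundancy factor.

```python
# Assumption: 0.00099 is USD per GB per month (not stated explicitly in the thread).
PRICE_PER_GB_MONTH = 0.00099
GB_PER_TB = 1000


def storage_cost_per_tb(years: int) -> float:
    """What a customer pays to keep 1 TB stored for `years` years."""
    monthly_per_tb = PRICE_PER_GB_MONTH * GB_PER_TB
    return monthly_per_tb * 12 * years


def budget_per_raw_tb(years: int, redundancy: float) -> float:
    """Hypothetical helper: revenue per raw TB of drives when every
    stored TB occupies `redundancy` TB of physical capacity."""
    return storage_cost_per_tb(years) / redundancy


for years in (1, 5, 10):
    print(f"{years:>2} yr: ${storage_cost_per_tb(years):.2f} per TB stored")
# prints:
#  1 yr: $11.88 per TB stored
#  5 yr: $59.40 per TB stored
# 10 yr: $118.80 per TB stored

for redundancy in (3, 5):
    print(f"10 yr at {redundancy}x: ${budget_per_raw_tb(10, redundancy):.2f} per raw TB")
```

The rounded figures in the comment ($12, $60, $120) match these to within a dollar or two.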
S3 does not spend 3x the drives to provide redundancy. Probably 20% more drives or something like that. They split data into chunks and use erasure coding to store them across multiple drives with little overhead.
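To make "split data into chunks and use erasure coding" concrete, here is a toy sketch using the simplest possible code: one XOR parity shard, the same idea as RAID 5. This is not S3's actual scheme (production systems use codes such as Reed-Solomon that survive several lost shards), and all function names are invented for the example.

```python
from typing import List, Optional


def split_chunks(data: bytes, k: int) -> List[bytes]:
    """Split data into k equal-sized chunks, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes, k: int) -> List[bytes]:
    """Return k data shards plus one parity shard (k + 1 shards in total)."""
    chunks = split_chunks(data, k)
    parity = chunks[0]
    for chunk in chunks[1:]:
        parity = xor_bytes(parity, chunk)
    return chunks + [parity]


def recover(shards: List[Optional[bytes]]) -> List[bytes]:
    """Rebuild at most one missing shard (marked None); return the data shards."""
    missing = [i for i, s in enumerate(shards) if s is None]
    if len(missing) > 1:
        raise ValueError("single parity can only repair one lost shard")
    if missing:
        present = [s for s in shards if s is not None]
        rebuilt = present[0]
        for s in present[1:]:
            rebuilt = xor_bytes(rebuilt, s)
        shards = shards[:missing[0]] + [rebuilt] + shards[missing[0] + 1:]
    return shards[:-1]  # drop the parity shard


data = b"hello erasure coding"
shards = encode(data, k=4)   # 5 shards, each meant for a different drive
shards[2] = None             # simulate one drive dying
restored = b"".join(recover(shards))[:len(data)]
print(restored == data)      # True
```

Here 4 data shards plus 1 parity shard cost 1.25x raw storage and survive one lost drive, whereas keeping full copies needs 2x to survive the same failure.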
S3 uses 5-of-9 erasure coding[1]. That's 9/5 = 1.8x raw storage, so roughly 2x overhead.
[1] https://bigdatastream.substack.com/p/how-aws-s3-scales-with-...
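For comparing the numbers thrown around here, a small sketch of the overhead of a k-of-n scheme: data is split into k shards, n shards are stored in total, and any k of them are enough to rebuild the data. The 5-of-9 figure comes from the comment above. The 10-of-12 layout is purely a hypothetical example of what "20% more drives" would look like, not something S3 is known to use.

```python
from fractions import Fraction


def erasure_overhead(k: int, n: int) -> Fraction:
    """Raw bytes stored per byte of user data for a k-of-n scheme."""
    if not 0 < k <= n:
        raise ValueError("need 0 < k <= n")
    return Fraction(n, k)


def tolerated_losses(k: int, n: int) -> int:
    """How many shards can be lost while the data stays recoverable."""
    return n - k


schemes = {
    "3x replication (1-of-3)": (1, 3),
    "5-of-9 erasure coding": (5, 9),
    "10-of-12 (hypothetical)": (10, 12),
}
for name, (k, n) in schemes.items():
    overhead = float(erasure_overhead(k, n))
    print(f"{name}: {overhead:.2f}x raw storage, survives {tolerated_losses(k, n)} lost shards")
# prints:
# 3x replication (1-of-3): 3.00x raw storage, survives 2 lost shards
# 5-of-9 erasure coding: 1.80x raw storage, survives 4 lost shards
# 10-of-12 (hypothetical): 1.20x raw storage, survives 2 lost shards
```

So 5-of-9 uses less raw space than 3x replication while tolerating more simultaneous failures.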
AFAIK geo-replication between regions _does_ replicate the entire dataset. It sounds like you're describing RAID configurations, which are common ways to provide redundancy and increased performance within a given disk array. They definitely do that too, but within a zone.
wait, can you elaborate on how this works?
And S3 RRS and Glacier do even less.
They charge little for storage and upload, but download, i.e. getting your data back, is pricey.
Mate, this is better than an entire nation's data getting burned.
Yes, it's pricey, but possible.
Now it's literally impossible.
I think AWS Glacier should be the preferred option at that scale. They had their own in-house copy of the data, but they still should've wanted an external backup, and they are literally the govt., so they of all people shouldn't worry about prices.
Keep secure encrypted backups in AWS and elsewhere too, and design the system around the threat model: filter THE MOST important stuff out of those databases and handle it separately. But that would require them to label it, which I suppose would attract even more attention from anyone trying to exfiltrate it to the likes of North Korea/China, so it's definitely a mixed bag.
My question, as I've said multiple times: why didn't they build a backup within South Korea, using some other South Korean datacentre, so they wouldn't have to worry about the encryption thing? I don't really know. IMO it would make more sense for them to just have a backup in AWS and not worry about encryption, since I find the tangents about breaking encryption a bit unreasonable: if that's the case, then all bets are off and the servers would get hacked too. That was the point of the Phrack piece with the advanced persistent threat, and so much more...
Are we all forgetting that Intel has a proprietary OS, MINIX, running in the most privileged state, which can even take Java bytecode over the network and execute it, and it's all proprietary? Personally, that is a bigger security threat to me if they are indeed using it, which I suppose they might be.
I just responded to "How does this even make sense business wise for AWS?"
It's expensive if you calculate what it would cost a third party to compete. Or see e.g. this graph from a recent HN submission: https://si.inc/posts/the-heap/#the-cost-breakdown-cloud-alte...