Comment by WaxProlix

4 years ago

I wrote one of these as a proof of concept while at AWS: it stored data sharded across all the free namespaces (think Lambda names), with each chunk carrying a pointer to the next chunk of data.
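For anyone wondering what "data in names, with pointers to the next chunk" could look like, here is a minimal sketch. It is not the original POC: the 64-character name limit, the `id-next-payload` layout, and the function names `encode_names` / `decode_names` are all assumptions made up for illustration, and a plain list of strings stands in for whatever service actually holds the names.

```python
import base64

MAX_NAME_LEN = 64   # assumed per-name limit; real limits vary by service
ID_LEN = 4          # hex digits used for a chunk id
PAYLOAD_LEN = MAX_NAME_LEN - 2 * ID_LEN - 2  # room left after "id-next-"
END = "f" * ID_LEN  # sentinel id meaning "no next chunk"


def encode_names(data: bytes) -> list[str]:
    """Split data into names of the form '<id>-<next id>-<base32 payload>'."""
    text = base64.b32encode(data).decode("ascii").rstrip("=").lower()
    pieces = [text[i:i + PAYLOAD_LEN]
              for i in range(0, len(text), PAYLOAD_LEN)] or [""]
    names = []
    for i, piece in enumerate(pieces):
        last = i == len(pieces) - 1
        next_id = END if last else format(i + 1, f"0{ID_LEN}x")
        names.append(f"{i:0{ID_LEN}x}-{next_id}-{piece}")
    return names


def decode_names(names) -> bytes:
    """Follow the next-pointers from chunk 0 and rebuild the original bytes."""
    by_id = {}
    for name in names:
        chunk_id, next_id, piece = name.split("-", 2)
        by_id[chunk_id] = (next_id, piece)
    parts = []
    current = "0" * ID_LEN
    while current != END:
        current, piece = by_id[current]
        parts.append(piece)
    text = "".join(parts).upper()
    text += "=" * (-len(text) % 8)  # restore the base32 padding we stripped
    return base64.b32decode(text)
```

Because each name carries its own id and next-pointer, the names can be listed back in any order and the chain still reassembles. With 4 hex digits for the id this tops out at 65,535 chunks per chain, which is plenty for a toy.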

I like to think you could unify all of these into a FUSE filesystem and just mount your transparent multi-cloud remote FS as usual.

It's inefficient, but free! So you can have as much space as you want. And it's potentially brittle, but free! So you can replicate/stripe the data across as many providers as you want.

I was an eng manager on Lambda for a time, and we definitely knew people were doing this, and had plans to cut it out if it ever became a problem. :D

  • Yeah, you'd need some sort of auto-balancing that detects this kind of bitrot from over-aggressive engineering managers & their ilk and rebalances the data across other sources. I think the multiple-shuffle-shard approach has been done before; maybe we could steal some algo from a RAID driver, or DynamoDB.
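A rough sketch of the detect-and-rebalance idea, under heavy assumptions: this is neither shuffle sharding nor anything lifted from a RAID driver or DynamoDB. It swaps in rendezvous (highest-random-weight) hashing to decide which providers should hold each chunk, and in-memory dicts stand in for the providers. The names `pick_providers`, `put`, and `repair` are hypothetical. It only handles chunks that went missing, not chunks that were silently corrupted.

```python
import hashlib


def pick_providers(chunk_id: str, providers: list[str], copies: int) -> list[str]:
    """Rendezvous hashing: rank providers per chunk and keep the top `copies`."""
    def score(provider: str) -> bytes:
        return hashlib.sha256(f"{provider}:{chunk_id}".encode()).digest()
    return sorted(providers, key=score, reverse=True)[:copies]


def put(stores: dict[str, dict[str, bytes]], chunk_id: str,
        blob: bytes, copies: int) -> None:
    """Write a chunk to the `copies` providers that rank highest for it."""
    for provider in pick_providers(chunk_id, sorted(stores), copies):
        stores[provider][chunk_id] = blob


def repair(stores: dict[str, dict[str, bytes]], copies: int) -> int:
    """Re-copy surviving chunks onto whichever providers should now hold them.

    Returns the number of replicas written. Stale extra copies are left alone.
    """
    providers = sorted(stores)
    surviving = {cid: blob
                 for store in stores.values()
                 for cid, blob in store.items()}
    written = 0
    for cid, blob in surviving.items():
        for provider in pick_providers(cid, providers, copies):
            if cid not in stores[provider]:
                stores[provider][cid] = blob
                written += 1
    return written
```

Usage would be something like: `put` every chunk with `copies=2`, drop a provider from `stores` when it cuts you off, add a replacement, then call `repair(stores, 2)`. The appeal of rendezvous hashing here is that removing one provider only moves the chunks that lived on it, so a rebalance stays cheap.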