Comment by rakoo

2 months ago

If it's a common enough occurrence to have _all_ your nodes down at the same time maybe you should reevaluate your deployment choices. The whole point of multi-nodes clustering is that _some_ of the nodes will always be up and running otherwise what you're doing is useless.

Also, garage gives you the possibility to automatically snapshot the metadata, advices on how to do the snapshotting at the filesystem level and to restore that.

2 comments

rakoo

Dylan16807 2 months ago

All nodes going down doesn't have to be common to make that much data loss a terrible design. It just has to be reasonably possible. And it is. Thinking your nodes will never go down together is hubris. Admitting the risk is being realistic, not something that makes the system useless.

How do filesystem level snapshots work if nodes might get corrupted by power loss? Booting from a snapshot looks exactly the same to a node as booting from a power loss event. Are you implying that it does always recover from power loss and you're defending a flaw it doesn't even have?

rakoo 1 month ago

No, the snapshotting and restore is manual