Comment by z3t4
7 hours ago
Had an accidental reboot, and it could not boot. Had redundancy, but the other server had failed silently days prior. Solved it with three way redundancy and extra monitoring. Systems fail in many ways at the same time. If you do not test it, there is a chance it wont work. Controlled failure is preferred over unknowns, like rebooting once in a while just to make sure it works.
No comments yet
Contribute on Hacker News ↗