Comment by Johnny555
5 years ago
You're talking about the grid, the OP was talking about datacenter infrastructure -- which one is the weak link?
If a datacenter can't go from idle (but powered on) servers to fully utilized servers without taking down the power grid, then it seems that they'd have software controls in place to prevent this, since failure modes other than a global Facebook outage could cause the same behavior.
Unfortunately the article doesn’t provide enough explicit detail to be 100% sure one way or the other, but my read is that it’s probably the grid.
> Individual data centers were reporting dips in power usage in the range of tens of megawatts, and suddenly reversing such a dip in power consumption could put everything from electrical systems to caches at risk.
“Electrical systems” is vague and could refer to internal systems, external systems, or both.
That said, if the DC is capable of running under sustained load at peak (which we have to assume it is, since that’s its normal state when FB is operational), it seems to me that the external grid is the more likely candidate.
In terms of software controls preventing this kind of failure mode, they do have one: load shedding. They’ll cut your supply until capacity is made available.