Comment by evancox100
4 years ago
It's not losing pending writes, it's the drive saying there are no pending writes but losing them anyway, i.e. the drive is most likely lying.
As I said in the recent Apple discussion, pretty much all drives are lying and have been for decades at this point. The good brands just spec out enough capacitance that you don't see the difference externally.
https://news.ycombinator.com/item?id=30370551#30374585
The more you look at speculative execution and these drive issues, the more you see that we're giving up a lot of what made computing "safe" just for performance.
Brings to mind Goodhart's law:
> When a measure becomes a target, it ceases to be a good measure.
In this case, performance as a measure of value.
it's mongodb all over again
gotta get dem sweet sweet benchmarks, to hell if we're persisting garbage
So if SSDs rely solely on capacitors for data integrity and lie about flushes, what do they do on a flush that takes any amount of time? Are they just taking a speed hit for funsies? Heck, from this test, the magnitude of the speed hit isn't even correlated with whether they lose writes...
At one point it was different barriers on the different submission queues inside the drive. Not externally visible queues, but between internal implementation layers.
It's been a few years since I've checked up on this, and it was mostly pre-SSD.
Probably implementing a barrier for ordering io…
When you look at how long it takes to perform a block write on a flash device, you'll see that no SSD is going to honor flush semantics.
Considering the number of 10uF tantalum caps (30) on one of the bricked enterprise SSDs I opened, I'm not surprised at all.
300uF is not a supercap.
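For scale, here is a back-of-the-envelope sketch of what 30 x 10uF stores. The cap count and value come from the comment above; the rail voltage and brown-out voltage are assumptions picked purely for illustration, not measurements from that drive. Energy uses the standard E = 1/2 * C * V^2.

```python
# Rough hold-up energy for a bank of capacitors wired in parallel.
CAP_COUNT = 30        # from the comment above
CAP_FARADS = 10e-6    # 10uF each, from the comment above


def bank_capacitance(count: int, farads_each: float) -> float:
    """Capacitors in parallel simply add."""
    return count * farads_each


def usable_energy_joules(capacitance: float, v_start: float, v_min: float) -> float:
    """Energy released while the bank discharges from v_start down to v_min."""
    return 0.5 * capacitance * (v_start ** 2 - v_min ** 2)


total = bank_capacitance(CAP_COUNT, CAP_FARADS)

# ASSUMPTION: caps charged to 12 V, controller stops working below 6 V.
# Both numbers are hypothetical; substitute the real rail voltages if known.
energy = usable_energy_joules(total, v_start=12.0, v_min=6.0)

print(f"{total * 1e6:.0f} uF total, {energy * 1e3:.1f} mJ usable")
```

Whether that is enough depends on how much power the controller and flash draw while draining the write cache, which the thread doesn't give.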
You should have cooling in your datacenter.
Any idea how one finds out whether a drive is actually capped properly to handle power-off data loss?
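One empirical approach is the kind of pull-the-plug test discussed in this thread: write numbered records, flush after each one, note the last number that was acknowledged, cut power, then see what actually survived. Below is a minimal sketch of that idea, not a rigorous tool. The record layout and the function names (`write_records`, `last_durable_seq`) are made up for illustration, and it assumes the filesystem passes the flush through to the drive.

```python
import os
import struct
import sys
import zlib

RECORD = struct.Struct("<QI")  # sequence number, CRC32 of the sequence bytes


def flush(fd: int) -> None:
    """Ask the OS to push written data down to stable storage."""
    if sys.platform == "darwin":
        import fcntl
        # On macOS, fsync alone does not ask the drive to flush its cache.
        fcntl.fcntl(fd, fcntl.F_FULLFSYNC)
    else:
        os.fsync(fd)


def write_records(path: str, count: int) -> int:
    """Append `count` records, flushing after each; return the last acknowledged seq."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    acked = -1
    try:
        for seq in range(count):
            body = struct.pack("<Q", seq)
            os.write(fd, RECORD.pack(seq, zlib.crc32(body)))
            flush(fd)
            acked = seq
            # Log this somewhere that survives the power cut (e.g. another machine).
            print(acked, flush=True)
    finally:
        os.close(fd)
    return acked


def last_durable_seq(path: str) -> int:
    """After power-up: highest seq in an unbroken run of valid records, or -1."""
    last = -1
    with open(path, "rb") as f:
        while True:
            chunk = f.read(RECORD.size)
            if len(chunk) < RECORD.size:
                break  # truncated tail
            seq, crc = RECORD.unpack(chunk)
            if seq != last + 1 or crc != zlib.crc32(struct.pack("<Q", seq)):
                break  # gap or corrupted record
            last = seq
    return last
```

Run `write_records` on the drive under test, cut power mid-run, and after reboot compare `last_durable_seq` with the last number that was logged. If the surviving number is lower, something acknowledged a flush that was not persisted. A clean result on one run proves little, so repeat it many times.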
Understood. Low-end drives have always done this because: performance.