I tested four NVMe SSDs from four vendors – half lose FLUSH'd data on power loss (2022)

2 years ago (twitter.com)

We shipped a shader cache in the latest release of OBS and quickly had reports come in that the cached data was invalid. After investigating, we found the cache files were the correct size on disk but their contents were all zeros. On a journaled file system this seems like it should be impossible, so the current guess is that some users have SSDs that ignore flushes and experience data corruption on crash / power loss.

  • I think this is typical behaviour with ext4 on Linux, if the application doesn't do fsync/fdatasync to flush the data to disk.

    Depending on mount options, ext4fs does metadata journaling, ensuring the FS itself is not borked, but not data journaling, which would safeguard the file contents in the event of an unclean shutdown with pending writes in the caches.

    The same phenomenon is at play when people complain that their log files contain NUL bytes after a crash. The file system metadata has been updated for the size of the file to fit the appended write, but the data itself was not written out yet.
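
    For reference, the usual application-side mitigation is the write + fsync + rename dance, so the new contents are durable before the directory entry points at them. A minimal sketch in C (error handling abbreviated; the paths are placeholders):

        /* Crash-safe file update sketch: write a temp file, fsync it,
           rename over the target, then fsync the directory so the
           rename itself is durable. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <unistd.h>

        int write_durably(const char *dir, const char *tmp,
                          const char *dst, const void *buf, size_t len)
        {
            int fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd < 0) return -1;
            if (write(fd, buf, len) != (ssize_t)len) { close(fd); return -1; }
            if (fsync(fd) != 0) { close(fd); return -1; }  /* push data + metadata to the device */
            close(fd);

            if (rename(tmp, dst) != 0) return -1;

            int dfd = open(dir, O_RDONLY);                 /* make the rename itself durable */
            if (dfd < 0) return -1;
            int rc = fsync(dfd);
            close(dfd);
            return rc;
        }

    Even this only works if the drive honors the flush it is sent; if it lies about FLUSH, as the linked thread suggests some do, no amount of fsync saves you.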

    • The current default is data=ordered, which should prevent this problem if the hardware doesn't lie. The data doesn't go in the journal, but it has to be written before the journal is committed.

      There was a point where ext3 defaulted to data=writeback, which can definitely give you files full of null bytes.

      And data=journal exists but is overkill for this situation.

      18 replies →

    • I don't think that's how it works: Flushing metadata before data would be a security concern (consider e.g. the metadata change of increasing a file's length due to an append before the data change itself), so file systems usually only ever do the opposite, which is safe.

      Getting back zeroes after a metadata sync (which must follow a data sync) would accordingly be an indication of something weird having happened at the disk level: We'd expect to either see no data at all, or correct data, but not zeroes or any other file's or previously written stale data.

      10 replies →

  • I had this exact experience with my workstation SSD (NTFS) after a short power loss while NPM was running. After I turned the computer back on, several files (package.json, package-lock.json and many others inside node_modules) had the correct size on disk but were filled with zeros.

    I think the last time I had corrupted files after a power loss was in a FAT32 disk on Win98, but you'd usually get garbage data, not all zeros.

    • > but you'd usually get garbage data, not all zeros.

      You are less likely to get garbage with an SSD in combination with a modern filesystem because of TRIM. Even if the SSD has not (yet) wiped the data, it knows that a block that is marked as unused can be returned as a block of 0s without needing to check what is currently stored for that block.

      Traditional drives had no such facility to have blocks marked as unused from their PoV, so they always performed the read and returned what they found, which was most likely junk (old data from deleted files that would make sense in another context), though it could also be a block of zeros (because that block hadn't been used since the drive had a full format or someone zeroed free space).

    • They may be pointing to unallocated space, which on an SSD running TRIM would return all zeros. NTFS is an extremely resilient yet boring filesystem; I cannot remember the last time I had to run chkdsk, even after an improper shutdown.

      4 replies →

  • Journaling filesystems (including NTFS, and ext3/ext4 using default mount options) typically only track file structure metadata in the journal, so that is WAI - the filesystem structure was not corrupted, but all bets are off when it comes to the contents of the files.

  • I lost Audacity projects due to BSODs on a Surface Book several times in ~2019: the *_data/**.au files were intact, each containing just a few seconds of audio, but the .aup XML file that maps them and contains whatever else makes up the project was all zeroed. My memory’s fuzzy, but I think exiting sometimes triggered the BSOD, and save-on-exit corrupted the project consistently when it did, so the workaround was to remember to save first; then, if it BSODed, you were OK.

  • >experience data corruption on crash / power loss

    You mean on complete system crash, right? Your application crashing shouldn't lead to files being full of zeroes as long as you've already written everything out.

Misleading headline, since after testing eight more drives, none of the additional ones failed.

2/12 is not nearly as dramatic as “half”, and the ones that lost data are the cheap brands as one would expect.

  • You can either not editorialize the title, and accept that the thread contains updates, or editorialize it and violate HN guidelines.

    Either choice will lead somebody to complain

  • > "... and the ones that lost data are the cheap brands as one would expect."

    What a sad world to live in, when one comes to expect cheap storage devices not to fulfill their intended function.

  • SK Hynix is a major brand and the P31 is a great midrange SSD... except for the fact that it seemingly doesn't care about your data.

    • > SK Hynix is a major brand

      Is it? I passed on an offer for a drive carrying that name, and got something else for slightly more, the other day as I didn't know the name.

      Perhaps their noteworthiness varies internationally? Or do they mainly sell to manufacturers rather than direct to the likes of me?

      1 reply →

    • I have a Sabrent M2 in my own PC, bought it because it was the cheapest option. Incidentally I suspect it's the cause of system-wide slowdown in the past few months, even opening the file explorer takes over ten seconds sometimes.

  • To me the real thing missing is whether those drives advertise power loss protection or not. The next question is whether they are to be used in a laptop, where power loss protection is less relevant given the local battery.

    • That should be irrelevant, because flush is flush right? If your SSD does not write the data after a flush it's violating basic hard drive functionality.

      1 reply →

There is a flood of fake SSDs currently, mostly of big brands. I recently purchased a counterfeit 1TB drive. It passes all the tests, performance is ok, it works... except it has episodes where ioping reports anything between 0.7 ms and 15 seconds, and that is under zero load. And these are quality fakes from a physical appearance perspective. The only way I could tell mine was fake is that the official Kingston firmware update tool would not recognize the drive.

  • Where are you seeing counterfeits? AliExpress, Ebay, Amazon?

    • Probably Chinese sellers on all those sites. I've noticed a common thread with people who complain about counterfeits: they're literally buying alphabet-soup-brand fakes from Chinese FBA sellers instead of buying products directly sold by Amazon or from more traditional retail channels.

      5 replies →

  • Did you get the fake in an official box? Or OEM version? This is quite a big claim.

    • It doesn't strike me as a big claim; I bought some RAM for a NUC a few weeks ago on Amazon, only to determine that it was likely counterfeit. It came in an official box with all packaging intact.

      2 replies →

  • That's interesting. I have a Samsung 990 Pro bought on Amazon and get the random lags. I've only noticed it in the terminal, so I figured something else might be the culprit. It has never gone to 15 seconds, but it can be around 1s.

    The Samsung Magician app on Windows reports it as "genuine" and it was able to apply two firmware updates. The only thing it complains about is that I should be using PCIE 4 instead of 3, but I can't do anything about that.

    • I have been able to fix these random lags by doing multiple full disk reads. The first one will take very long, because it will trigger these lags. Subsequent ones will be much better.

      The leading theory I have read is that maintenance/refreshing on the SSD is not done preventatively/correctly by the firmware, and you need to trigger it by accessing the data.

      1 reply →

  • If you dig into the vendor data stored in the drive firmware, fakes are easy to spot. Model numbers, vendor IDs, and serial numbers will be zeroed out or won't conform to manufacturer spec.

    I purchased a bunch of fake kingston SD cards in China that worked well enough for the price, but crapped out within a year of mild use. I didn’t lose data. It was as if one day they worked. Then one day they were fried.

  • That’s wild. Is this limited to specific distribution channels or can you get them from anywhere?

Under long term heavy duty, I've routinely seen cheap modern platter outperform cheap brand name NVME.

There's some cost cutting somewhere. The NVMEs can't seem to sustain throughput.

It's been pretty disappointing to move I/O bound workloads over and not see notable improvements. The magnitude of data I'm talking about is 500-~3000GB

I've only got two NVME machines for what I'm doing so I'll gladly accept that it's coincidentally flaky bus hardware on two machines, but I haven't been impressed except for the first few seconds.

I know everyone says otherwise, which is why I brought it up. Someone tell me why I'm crazy.

Edit: no, I'm not crazy. https://htwingnut.com/2022/03/06/review-leven-2tb-2-5-sata-s... This is similar to what I'm seeing with Crucial and Adata hardware: almost binary performance.

  • For write loads this is expected, even for good drives, at some level. They tend to have some faster storage which takes your writes and the controller later pushes the changes to the main body of the drive. If you write in bulk the main, slower, portion can't keep up so the faster cache fills and your write has to wait and will perform as per the slowest part of the drive. Furthermore: good drives tend to have an amount of even faster DRAM cache too, so you'll see two drop-offs in performance during bulk write operations. For mainly read based loads any proper SSD¹ will outperform a traditional drive, but if your use case involves a lot of writing³ you need to make more careful choices⁵ to get good performance.

    I can't say I've ever seen a recent SSD (that isn't otherwise faulty) get slow enough to say it is outperformed by a traditional drive, even just counting the fastest end of the disk, but I've certainly seen them drop to around the same speed during a bulk write.

    --

    [1] unlike this sort of thing: https://www.tomshardware.com/news/low-performance-external-m...

    [2] get SLC-only⁴ drives, not QLC-with-SLC-cache or just-QLC, and so forth

    [3] bulk data processing tasks such as video editing are where you'll feel this significantly, unless your number-crunching is also bottlenecked at the CPU/GPU

    [4] SLC-only is going to be very expensive for large drives; even high-grade enterprise drives tend to be MLC-with-SLC-cache. SLC>MLC>TLC>QLC…

    [5] this can be quite difficult in the “consumer” market because you'll sometimes find a later revision of the same drive having a completely different memory and/or controller arrangement despite the headline model name/number not changing at all – this is one reason why early reviews can be very misleading
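
    To see the cache-exhaustion drop-off described above for yourself, a rough approach (my own sketch, not a proper benchmark: assumes Linux, O_DIRECT to keep the page cache out of the picture, and "testfile" as a placeholder path on the drive under test) is to time each chunk of a long sequential write and watch throughput fall once the fast cache fills:

        /* Rough sequential-write timing sketch (Linux, O_DIRECT).
           Prints throughput per 1 GiB chunk; expect it to drop once
           the drive's SLC/DRAM cache is exhausted. */
        #define _GNU_SOURCE
        #include <fcntl.h>
        #include <stdio.h>
        #include <stdlib.h>
        #include <string.h>
        #include <time.h>
        #include <unistd.h>

        int main(void)
        {
            const size_t blk = 1 << 20;           /* 1 MiB per write */
            const size_t blks_per_chunk = 1024;   /* report every 1 GiB */
            void *buf;
            if (posix_memalign(&buf, 4096, blk)) return 1;
            memset(buf, 0xA5, blk);

            int fd = open("testfile", O_WRONLY | O_CREAT | O_TRUNC | O_DIRECT, 0644);
            if (fd < 0) { perror("open"); return 1; }

            for (int chunk = 0; chunk < 64; chunk++) {   /* 64 GiB total */
                struct timespec t0, t1;
                clock_gettime(CLOCK_MONOTONIC, &t0);
                for (size_t i = 0; i < blks_per_chunk; i++)
                    if (write(fd, buf, blk) != (ssize_t)blk) { perror("write"); return 1; }
                clock_gettime(CLOCK_MONOTONIC, &t1);
                double s = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
                printf("chunk %2d: %.0f MiB/s\n", chunk, blks_per_chunk / s);
            }
            close(fd);
            return 0;
        }

    On a QLC-with-SLC-cache drive the first few chunks typically look great and then throughput collapses, which matches the "almost binary performance" described elsewhere in the thread.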

  • I think cheaper QLC chips use a part of their storage space as SLC, which is fast to write. But once you’ve written more than fits in the SLC cache, write throughput quickly tanks as the drive has to push the data on to the slower QLC parts.

    • Yeah I guess it works well for how most people use computers which is not actually for computation...

      Modern platter is actually pretty decent and cheap. It's probably still the way to go for large loads unless you have a grove of money trees

  • I used to use an HP EX920 for my system drive and it was abysmally slow at syncs. I'd open Signal and the computer would grind to a halt while it loaded messages from group chats. After much debugging, I found out Signal was saving each message to sqlite in a transaction causing lots of syncing.

    I found some bash script that looped and wrote small blocks synchronously; the HP EX920 managed something like 20 syncs/sec while my WD RE4 spinner was around 150. Other SSDs were much faster (it was a few years ago, so I can't remember the exact numbers).
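
    For anyone who wants to reproduce that kind of measurement, a rough equivalent of such a script in C (my sketch, not the original; "synctest.dat" is just a placeholder file on the drive being tested) counts how many small fdatasync'd writes per second the drive sustains:

        /* Sync-rate microbenchmark sketch: small appends, each followed
           by fdatasync(), counted over roughly five seconds. */
        #include <fcntl.h>
        #include <stdio.h>
        #include <time.h>
        #include <unistd.h>

        int main(void)
        {
            char block[4096] = {0};
            int fd = open("synctest.dat", O_WRONLY | O_CREAT | O_TRUNC, 0644);
            if (fd < 0) { perror("open"); return 1; }

            struct timespec start, now;
            clock_gettime(CLOCK_MONOTONIC, &start);
            long syncs = 0;
            do {
                if (write(fd, block, sizeof block) != (ssize_t)sizeof block) { perror("write"); return 1; }
                if (fdatasync(fd) != 0) { perror("fdatasync"); return 1; }
                syncs++;
                clock_gettime(CLOCK_MONOTONIC, &now);
            } while (now.tv_sec - start.tv_sec < 5);

            printf("~%ld synced writes/sec\n", syncs / 5);
            close(fd);
            return 0;
        }

    Numbers in the low tens per second, like the EX920 above, are a red flag for anything that syncs a lot (SQLite, package managers, mail servers).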

  • 1) Nobody says otherwise about cheap anything NVMe. They're pretty terrible once they've exhausted the write cache. This is well-known and addressed in every decent review by reputable sites.

    2) Sustaining throughput seems the least of our problems when some unknown number of NVMe SSDs might be literally losing flushed data.

  • >Under long term heavy duty, I've routinely seen cheap modern platter outperform cheap brand name NVME.

    Saw this happen at a previous job. I upgraded several Windows devices to Windows 10, and the fastest PC was a Dell desktop with an HDD.

    The others were midrange to lower-mid laptops coupled with low-end SSDs.

Writes are completed to the host when they land on the SSD controller, not when written to Flash. The SSD controller has to accumulate enough data to fill its write unit to Flash (the absolute minimum would be a Flash page, typically 16kB). If it waited for the write to Flash to send a completion, the latency would be unbearable. If it wrote every write to Flash as quickly as possible, it could waste much of the drive's capacity padding Flash pages. If a host tried to flush after every write to force the latter behavior, it would end up with the same problem.

Non-consumer drives solve the problem with back-up capacitance. Consumer drives do not have this. Also, if the author repeated this test 10 or 100 times on each drive, I suspect that he would uncover a failure rate for each consumer drive. It's a game of chance.

  • The whole point of explicit flush is to tell the drive that you want the write at the expense of performance. Either the drive should not accept the flush command or it should fulfill it, not lie.

    (BTW this points out the crappy use of the word “performance” in computing to mean nothing but “speed”. The machine should “perform” what the user requests — if you hired someone to do a task and they didn’t do it, we’d say they failed to perform. That’s what’s going on here.)

    • The more dire problem is the case where the drive runs out of physical capacity before logical capacity. If the host flushes data that is smaller than the physical write unit of the SSD, capacity is lost to padding (if the SSD honors every Flush). A "reasonable" amount of Flush would not make too much of a difference, but a pathological case like flush-after-every-4k would cause the SSD to run out of space prematurely. There should be a better interface to handle all this, but the IO stack would need to be modified to solve what amounts to a cost issue at the SSD level. It's a race to the bottom selling 1TB consumer SSDs for less than $100.

      11 replies →

  • This is the whole point of a FLUSH though. You expect latency penalties and worse performance (and extra pages) if you flush, but that's the expected behaviour: not for it to (apparently) completely disregard the command while pretending like it's done it.

  • > Non-consumer drives solve the problem with back-up capacitance.

    I’m pretty sure they used to be on consumer drives too. Then they got removed and all the review sites gave the manufacturer a free pass even though they’re selling products that are inadequate.

    Disks have one job, save data. If they can’t do that reliably they’re defective IMO.

  • > If a host tried to flush after every write to force the latter behavior, it would end up with the same problem.

    So? No reason to break the contract that flush makes all submitted writes durable. The drive can compact space in the background.

    • Yes, GC should be smart enough to free up space from padding. But then there's a write amplification penalty and meeting endurance specifications is impossible. A padded write already carries a write amplification >1, then GC needs to be invoked much more frequently on top of that to drive it even higher. With pathological Flush usage, you have to pick your poison. Run out of space, run out of SSD life.

Twitter yuk, can somebody just post the names of the four tested drives and which passed/failed please?

Does advertising a product as adhering to some standard, but secretly knowing that it doesn't 100%, count as e.g. fraud? I.e., is there any established case law on the matter?

I'm thinking of this example, but also more generally USB devices, Bluetooth devices, etc.

  • > there any established case law on the matter

    Always makes me laugh.

    Anyways, not in the US, which is probably where you're asking about, but yes, the vast majority of the developed world has that. It's called "false advertising", and exists at least in the EU, Australia, and the UK. You can't put a label on your product or advert that is false or misleading.

    So if the box says this is a WiFi6E router, but it's actually only WiFi 5 because it's using the wrong components to save on costs, you can report them to the relevant authority and they'll be fined (and depending on the case and scenario you get compensation). The process is harder, bordering on impossible, if you bought from AliExpress from a random no-name vendor, but as long as the vendor, platform, or store exists in a country with sensible regulation you can report it.

    • That’s not really what the commenter was asking. That’d be false advertising in the US too.

      I think the question is less “if they skimp on parts and lie” and more along the lines of incompleteness. Like “it’s an HTTP server, but they saved on effort and implement PUT as POST, which works fine for most use cases”.

      That said, I’d guess this would be a pretty hard case to win. The law typically requires intent for false advertising, so if they didn’t know they didn’t follow the spec they might be fine. And it depends on the claims and what the consumer can expect. If you deliberately don’t explain the exact spec your SSD complies with, and you make no explicit promises of compatibility, it’s a harder win. I bet few SSD manufacturers will say “Serial ATA v3.5 (May 2023), tested and compatible with OpenXFS commit XYZ on Debian Linux running kernel version 4.3.2”. But if they just say “super fast SSD with a physical SATA cable socket”, then what exactly was false if it doesn’t support the full spec?

  • I was under the impression that a lot of off-brand USB devices didn't use the USB logo specifically to get around certification requirements. Basically, they just aren't advertising adherence to a standard. No idea about NVMe or BT.

  • Not a lawyer, but I doubt it – otherwise you might have a case against Intel and AMD regarding Spectre and Meltdown?

    It might be a different story if the spec was intentionally violated, though (rather than incidentally, i.e. due to an idea that should have been transparent/indistinguishable externally but didn't work out).

    • "Oops we didn't mean to do that" isn't a defense from liability for product not doing what you told the purchaser it would.

      It's their responsibility to develop the product correctly, do QA, and if a defect is found, advise customers or stop selling the defective goods.

      The greatest scam the computer industry pulled was convincing people that computers are magical, unpredictable devices that are too complex for the industry to be held responsible for things not working as claimed.

      1 reply →

  • Merchantability and implied fitness? You absolutely could try suing them in small claims court for damages.

    For extra fun: if the box carries a trademark from a standards group, you could try adding them into the suit; use of their trademarked logo could be argued to be implied fitness, if there are standards the drive is supposed to meet to use it.

    At the very least they might get tired of the expense of sending someone to defend the claim, and it would cease to be profitable to engage in this scammery.

    • I don't think it's even implied fitness. Declaring you support SCSI commands is probably a direct advertisement of conformance.

  • I would probably use stronger words than that; data persistence is a big deal, so the missing part of the spec is a fundamental flaw. What's a disk whose persistence is random? You can probably legally assail the substance of the product.

  • For IT products, I doubt it. For sectors where regulation is more mature of course: take food, automotive, etc.

  • I wouldn't say fraud but this issue should trigger a recall.

    • I think it's more or less the same thing: the recall is the way to legally prove you didn't intend to disseminate the flawed product, whereas leaving it on the market after learning of the problem shows intent to keep it there. I would be surprised if discovery at those companies would not surface an email from engineers discussing this problem.

This is (2022).

Wondering if anything changed since the original tests...

  • > Wondering if anything changed since the original tests...

    You're wondering if firmware writers lie to layers higher up in the stack? I think it's a 100% certainty that there's drive firmware that lies.

    There's a reason why many vendors have compatibility lists, approved firmware versions, and even their "own" (rebranded from an OEM) drives that you have to buy if you want official support (and it's not entirely a money grab: a QA testing infrastructure does cost money).

    • I'm curious whether any of the brands which failed this test owned up to the issue and released firmware updates.

Meanwhile I'm over here jamming Micron 7450 pros into my work laptop for better sync write performance.

I have very little trust in consumer flash these days after seeing the firmware shortcuts and stealth hardware replacements manufacturers resort to in order to cut costs.

  • Have a solid vendor for these that isn't insanely priced (for home use)? The last couple of times I tried to buy one, they sent 7300s and tried to buy me off with a small refund (eBay).

Losing flushes is obviously bad.

I wonder how much perf is on the table in various scenarios if we can give up needing to flush. If you know the drive has some resilience, say 0.5s during which it can safely finish writing back, maybe you can give up flushes (in some cases). How much faster is the app then?

It'd be neat to see some low-cost improvements here. Obviously in most cases, just get an enterprise drive with supercaps or batteries onboard. But an ATX power rail that has extra resilience from the supply, or an add-in/pass-through 6-pin SATA power supercap... that could be useful too.

  • If the write-cache is reordering requests (and it does, that's the whole point), you can't guarantee that $milliseconds will be enough unless you stop all requests, wait $milliseconds, write your commit record, wait $milliseconds, then resume requests. This is essentially re-implementing write-barriers in an ad-hoc, buggy way which requires stalling requests even longer.

    Flush+FUA requires the data to be stored to non-volatile media. Capacitor-backed RAM dumping to flash is non-volatile. When a drive knows it has enough capacitor-time to finish flushing all preceding writes from the cache, it can immediately say the flush was completed. This can all be handled on the device without the software having to make guesses at how long something has to be written before it's durable.

  • Performance gains wouldn’t be that large as enterprise SSDs already have internal capacitors to flush pending writes to NAND.

    During typical usage the flash controller is constantly journaling LBA to physical addresses in the background, so that the entire logical to physical table isn’t lost when the drive loses power. With a larger capacitor you could potentially remove this background process and instead flush the entire logical to physical table when the drive registers power loss. But as this area makes up ~2% of the total NAND, that’s at absolute best a 2% performance benefit we are potentially missing out on.

    • You could gain much more by coalescing repeated writes to the same address - database scenarios for example

I guess it's time for `fsync_but_really_actually_sync_it_please(2)` (and the lower level equivalents in SATA, NVMe etc.)?

  • > (and the lower level equivalents in SATA, NVMe etc.)?

    This is not a technical problem that needs yet another SATA/SAS/etc command to be standardized. It's a 'social' problem that there's no real incentives for firmware writers to tell the truth 100% of the time.

    The best you can hope for is if you buy a fancy-pants enterprise storage solution with compatibility lists and approved firmware versions.

Flushing in this case is from the SSDs internal DRAM cache to the actual NAND flash?

  • It’s the computer telling the drive “write everything to durable storage (as opposed to some kind of in-drive cache/RAM) and tell me when it’s done”.

    After that command it should be 100% safe to pull the power because everything SHOULD have been written to flash. That’s the point of the command.

    It’s interesting that the drives that do it wrong still take time indicating they’re doing something.

  • The DRAM cache does not hold user data. It holds the flash translation layer that links LBAs to NAND pages. Higher performance drives use 1GB of DRAM per 1TB of NAND. In cheap DRAM-less drives, if the I/O to be serviced is not cached in the 1MB or so of SRAM, it has to do a double lookup: once to retrieve the relevant part of the FTL table from NAND, and a second lookup to actually service the I/O.

It'd be nice if there were a database of known-bad/known-good hardware to reference. I know there have been some spreadsheets and special-purpose efforts, like the USB-C cables Benson Leung tested.

Especially for consumer hardware on Linux--there's a lot of stuff that "works" but is not necessarily stable long term or that required a lot of hacking on the kernel side to work around issues

Well, yes, but which were those 2 out of 4 vendors?

The model I’d be interested in would be the SK Hynix/Solidigm P44 Pro, as it competes with the Samsung 9xx Evo and Pro models.

I am a bit annoyed that everyone here takes this at face value. There's zero evidence given; not even the vendors and models are named, so none of this can be confirmed.

On a related note, I tested 4 DDR5 RAM kits from major vendors - half of them corrupt data when exposed to UV light.

This has always been the case? At least it was something we learned in a course when we wrote our own device drivers for Minix; even the controllers on spinning metal fib about flushes.

At this point, any storage vendor should be required to pass the SQLite test suite before they can sell their product.

Also…would modern journaling file systems protect against this sort of data loss?

If you need PLP use an enterprise drive. That's what they're for.

Cheap drives don't include large dram caches, lack fast SLC areas, and leave off super-capacitors that allow chips to drain buffers during a power-failure.

"Buy cheap, buy twice" as they say... =)

Without any more information this post is just bullshit. For example, it's not documented how the flushing has been done. On Linux, even issuing 'sync' is not enough: https://unix.stackexchange.com/questions/98568/difference-be...

The bottom answer especially states that "blockdev --flushbufs may still be required if there is a large write cache and you're disconnecting the device immediately after"

The hdparm utility has a parameter for syncing and flushing the device's own buffers. It seems like all three should be done for a complete flush at all levels.
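
For what it's worth, a rough sketch of those layers from C on Linux (my assumptions: /dev/nvme0n1 as a placeholder device, root privileges; the ATA-level flush that hdparm -F issues is not shown) would be sync() for the dirty page cache, fsync() on the block device node, which asks the kernel to send a cache flush to the drive, and the BLKFLSBUF ioctl that blockdev --flushbufs uses:

    /* Flush at several levels: global page cache, the block device's
       buffers plus a device cache flush, then BLKFLSBUF as
       blockdev --flushbufs would issue. Needs root. */
    #include <fcntl.h>
    #include <linux/fs.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <unistd.h>

    int main(void)
    {
        sync();                                   /* flush dirty pages system-wide */

        int fd = open("/dev/nvme0n1", O_RDONLY);  /* placeholder device path */
        if (fd < 0) { perror("open"); return 1; }

        if (fsync(fd) != 0)                       /* block device flush incl. drive cache */
            perror("fsync");
        if (ioctl(fd, BLKFLSBUF, 0) != 0)         /* flush/invalidate the buffer cache */
            perror("BLKFLSBUF");

        close(fd);
        return 0;
    }

Of course, none of this helps if the drive acknowledges the flush and then drops the data anyway, which is exactly what the thread claims.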

Don't use home-grade SSDs for storing anything that is considered critical.

The rule is not that hard to remember.

Name the offenders please.

I suspect it might be easy to spot visually - the lack of a substantial capacitor on the board would indicate a high likelihood of data loss.

That is unfortunate, but I guess those SSDs performed really well and outclassed all others in performance benchmarks? lol