Comment by hedora

6 hours ago

I think it was killed primarily because the DIMM version had a terrible programming API. There was no way to pin a cache line, update it, and flush it, so no existing database buffer pool algorithms were compatible with it. Some academic work tried to address this, but I don’t know of any products.
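
As a toy sketch (entirely hypothetical names, not any real Optane or database API), this is the pin/update/flush contract that classic buffer pool algorithms assume, and which the comment says the DIMM programming model couldn't express:

```python
# Hypothetical illustration: the buffer-pool contract databases rely on.
# A page is pinned (guaranteed resident), mutated in place, then explicitly
# flushed to durable storage before the pin is released.

class BufferPool:
    def __init__(self):
        self.storage = {}     # stand-in for the durable medium
        self.frames = {}      # in-memory copies of pages
        self.pin_counts = {}

    def pin(self, page_id):
        # Guarantee the page stays resident while pinned.
        if page_id not in self.frames:
            self.frames[page_id] = self.storage.get(page_id, b"")
        self.pin_counts[page_id] = self.pin_counts.get(page_id, 0) + 1
        return self.frames[page_id]

    def update(self, page_id, data):
        assert self.pin_counts.get(page_id, 0) > 0, "page must be pinned"
        self.frames[page_id] = data

    def flush(self, page_id):
        # Explicit write-back: durability is under the caller's control.
        self.storage[page_id] = self.frames[page_id]

    def unpin(self, page_id):
        self.pin_counts[page_id] -= 1


pool = BufferPool()
pool.pin(1)
pool.update(1, b"row-v2")
pool.flush(1)            # durable *before* the pin is released
pool.unpin(1)
print(pool.storage[1])   # b'row-v2'
```

The key point is the explicit, caller-controlled flush step between update and unpin; per the comment, the DIMM API gave software no equivalent control over when a dirty cache line reached the persistent medium.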

The SSD form factor wasn’t any faster at writes than NAND with capacitor-backed power-loss protection. The read path was faster, but only in time to first byte; NAND had comparable or better throughput. I forget where the cutoff was, but I think it was below 4–16 KB, which are typical database read sizes.

So, the DIMMs were unprogrammable, and the SSDs had a “sometimes faster, but it depends” performance story.

The DIMMs were their own shitshow and I don't know how they even made it as far as they did.

The SSDs were never going to be dominant at straight read or write workloads, but they were absolutely king of the hill at mixed workloads because, as you note, time to first byte was so low that they switched between read and write faster than anything short of DRAM. This was really, really useful for a lot of workloads, but benchmarkers rarely bothered to look at this corner... despite it being, say, the exact workload of an OS boot drive.

For years there was nothing that could touch them in that corner (OS drive, swap drive, etc.), and to this day it’s unclear whether the best modern drives can compete.

It sounds like they didn't do a good job of putting the DIMM version in the hands of folks who'd write the drivers just for fun.

The read path is sort of a wash, but writes are still unequalled. NAND writes feel like you're mailing a letter to the floating gate...

  • Isn't this addressed by newer PCIe standards? Of course, even the "new" Optane media reviewed in OP is stuck on PCIe 4.0...

    • No; the issue with the DIMMs wasn’t drivers. The issue was that the only people allowed to target the DIMMs directly were the Xeon hardware team.

      There was a startup doing good work with similar storage chips that were pin-compatible (BGA) with standard memory. Not sure what happened to them. That’d be easier to program than 3D XPoint.

      As for the new PCIe standard (you probably mean CXL), that’s also basically dead on arrival. The CPU is the power and money bottleneck for the applications CXL targets, yet CXL provides a synchronous hardware API that stalls the processor pipeline when accessing high-latency devices.

      Contrast this to NVMe, which can be set up to either never block the CPU or amortize multiple I/Os per cache miss.
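
      A toy illustration of that contrast (simulated latency only, no real NVMe or CXL involved): a synchronous interface pays full device latency on every access, while a submission/completion-queue model lets requests overlap so latencies amortize:

```python
# Simulated comparison: synchronous per-access stalls vs. overlapped,
# queue-style I/O. DEVICE_LATENCY is a made-up stand-in value.

import time
from concurrent.futures import ThreadPoolExecutor

DEVICE_LATENCY = 0.01  # pretend each I/O takes 10 ms

def device_read(block):
    time.sleep(DEVICE_LATENCY)  # stand-in for device access latency
    return f"data-{block}"

def synchronous_reads(blocks):
    # CXL-style synchronous access: each request stalls the caller.
    return [device_read(b) for b in blocks]

def queued_reads(blocks):
    # NVMe-style model: submit many requests, reap completions later;
    # the per-request latencies overlap instead of adding up.
    with ThreadPoolExecutor(max_workers=len(blocks)) as pool:
        return list(pool.map(device_read, blocks))

blocks = list(range(8))

t0 = time.perf_counter()
sync_result = synchronous_reads(blocks)
t_sync = time.perf_counter() - t0

t0 = time.perf_counter()
async_result = queued_reads(blocks)
t_async = time.perf_counter() - t0

assert sync_result == async_result
print(f"sync {t_sync:.3f}s vs queued {t_async:.3f}s")
```

      The eight synchronous reads take roughly eight device latencies back to back; the queued reads take roughly one, because they are in flight concurrently. That is the amortization the comment describes.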

      Companies like NVIDIA already maintain massive I/O concurrency over PCIe without CXL, because they have a programming model (the GPU) that supports it. CXL might be a small win there.