Comment by wtallis
4 years ago
Consumer SSDs don't have much room to offer a different abstraction from emulating the semantics of hard drives and older technology. But in the enterprise SSD space, there's a lot of experimentation with exactly this kind of thing. One of the most popular approaches right now is zoned namespaces (ZNS), which separate write and erase operations but otherwise still abstract away most of the esoteric details that vary between products and chip generations. That makes it a usable model for both flash and SMR hard drives. It doesn't completely preclude dishonest caching, but it removes some of the incentive for it.
Check out https://www.snia.org/ if you want to track development in this area.
There is no strong reason why a consumer SSD can't allow reformatting into a smaller normal namespace and a separate zoned namespace. Zone-aware CoW file systems allow efficiently combining FS-level compaction/space reclamation with NAND-level rewrites/wear leveling.
I'd probably pay for "unlocking" ZNS on my Samsung 980 Pro, if just to reduce the write amplification.
Enabling those features on the drive side is little more than changing some #ifdef statements in the firmware, since the same controllers are used in high-end consumer drives and low-power data center drives. But that doesn't begin to address the changes necessary to make those features actually usable to a non-trivial number of customers, such as anyone running Windows.
Isn't this a chicken and egg problem? Why would OS vendors spend time implementing this on their side if the drives don't support it?
The difference here is that it's not clear to me that there's much cost on the drive side to actually allow this, aside maybe from the will to segment the market.
To me, this looks like the whole sector size situation. OSes, including regular Windows, have supported 4K drives for quite a while now. I bought a Samsung 980 (non-Pro) the other day that still pretends to have 512B sectors. The OEM drive in my laptop (some kind of Samsung) can be formatted with a 4K namespace, but the default is also 512B. The 980 doesn't even support this.
> Consumer SSDs don't have much room to offer a different abstraction from emulating the semantics of hard drives and older technology.
From what I understand, the abstraction works a lot like virtual memory. The drive shows up as a virtual address space pretending to be a disk drive, and then the drive's firmware maps virtual addresses to physical ones.
That doesn't seem at all incompatible with exposing the mappings to the OS through newer APIs so the OS can inspect or change the mappings instead of having the firmware do it.
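For anyone who wants the virtual-memory analogy spelled out, here is a minimal sketch of the firmware-side mapping. It is a toy model only: the class name `ToyFTL` and all of its fields are made up for illustration, and real firmware also does garbage collection, wear leveling and error correction, none of which is modeled here.

```python
class ToyFTL:
    """Toy flash translation layer: logical block address -> physical page.

    Hypothetical illustration only; not how any particular drive works.
    """

    def __init__(self, num_physical_pages):
        self.mapping = {}                                  # logical address -> physical page
        self.free_pages = list(range(num_physical_pages))  # never-written pages
        self.stale_pages = set()                           # pages holding superseded data
        self.flash = {}                                    # physical page -> data

    def write(self, lba, data):
        # NAND pages cannot be overwritten in place, so each write goes to a
        # fresh physical page and the previous page is only marked stale.
        # (Runs out of pages eventually: garbage collection is not modeled.)
        new_page = self.free_pages.pop(0)
        old_page = self.mapping.get(lba)
        if old_page is not None:
            self.stale_pages.add(old_page)
        self.flash[new_page] = data
        self.mapping[lba] = new_page

    def read(self, lba):
        # The host only ever sees logical addresses.
        return self.flash[self.mapping[lba]]


ftl = ToyFTL(num_physical_pages=8)
ftl.write(5, b"v1")
first_page = ftl.mapping[5]
ftl.write(5, b"v2")   # same logical address, different physical page
```

After the second write, logical address 5 points at a new physical page and the old one sits in `stale_pages` until something erases it. That cleanup step is the part the host currently has no say in.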
The current standard block storage abstraction presented by SSDs is a logical address space of either 512-byte or 4kB blocks (but pretty much always 4kB behind the scenes). Allocation is implicit upon writing to a block, and deallocation is explicit but optional. This model is indeed a good match for how virtual memory is handled, especially on systems with 4kB pages; there are already NVMe commands analogous to e.g. madvise().
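As a rough sketch of those semantics from the host's point of view (write allocates implicitly, deallocate is an optional hint, and a deallocated block is treated as implicitly zero), something like the following captures the contract. `ToyBlockDevice` and its method names are hypothetical, not an NVMe API, and the read-as-zero behavior is an assumption of this model.

```python
BLOCK_SIZE = 4096  # 4kB logical blocks, as in the comment above


class ToyBlockDevice:
    """Hypothetical model of the logical block abstraction, host side only."""

    def __init__(self, num_blocks):
        self.num_blocks = num_blocks
        self.allocated = {}  # lba -> bytes; a missing key means deallocated

    def _check(self, lba):
        if not 0 <= lba < self.num_blocks:
            raise IndexError("logical block address out of range")

    def write(self, lba, data):
        self._check(lba)
        if len(data) != BLOCK_SIZE:
            raise ValueError("writes are whole blocks")
        self.allocated[lba] = bytes(data)   # allocation is implicit

    def deallocate(self, lba):
        # Explicit but optional: skipping this call never breaks correctness,
        # it only leaves the device tracking data the host no longer needs.
        self._check(lba)
        self.allocated.pop(lba, None)

    def read(self, lba):
        self._check(lba)
        # In this model, deallocated blocks read back as zeros.
        return self.allocated.get(lba, bytes(BLOCK_SIZE))


dev = ToyBlockDevice(num_blocks=16)
dev.write(3, b"\x01" * BLOCK_SIZE)
dev.deallocate(3)
```

Any block can be written at any time in any order, which is exactly the RAM-like freedom that existing file systems rely on.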
The problem is that it's not a good match for how flash memory actually works, especially with regard to the extreme disparity in granularity between a NAND page write and a NAND erase block. Giving the OS an interface to query which blocks the SSD considers live/allocated rather than deallocated and implicitly zero doesn't seem all that useful. Giving the OS an interface to manipulate the SSD's logical-to-physical mappings (while retaining the rest of the abstraction's features) would be rather impractical, as both the SSD and the host system would have to care about implementation details like wear leveling.
Going beyond the current HDD-like abstraction augmented with optional hints to an abstraction that is actually more efficient and a better match for the fundamental characteristics of NAND flash memory requires moving away from a RAM/VM-like model and toward something that imposes extra constraints that the host software must obey (e.g. append-only zones). Those constraints are what break compatibility with existing software.
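To make the "extra constraints" concrete, here is a minimal sketch of an append-only zone: writes must land at the write pointer, and space is only reclaimed by resetting the whole zone. `ToyZone` and its methods are made-up names for illustration; the real zoned command sets have more states and rules than this.

```python
class ToyZone:
    """Hypothetical append-only zone with a write pointer."""

    def __init__(self, capacity_blocks):
        self.capacity = capacity_blocks
        self.write_pointer = 0   # next offset that may be written
        self.blocks = []

    def append(self, data):
        if self.write_pointer >= self.capacity:
            raise RuntimeError("zone full; reset it before writing again")
        self.blocks.append(data)
        offset = self.write_pointer
        self.write_pointer += 1
        return offset            # tells the host where the data landed

    def write_at(self, offset, data):
        # The constraint that breaks existing software: no random overwrites.
        if offset != self.write_pointer:
            raise ValueError("writes must land exactly at the write pointer")
        return self.append(data)

    def read(self, offset):
        if not 0 <= offset < self.write_pointer:
            raise IndexError("offset has not been written")
        return self.blocks[offset]

    def reset(self):
        # Erase is explicit and covers the whole zone at once.
        self.blocks.clear()
        self.write_pointer = 0


zone = ToyZone(capacity_blocks=2)
first = zone.append(b"a")
second = zone.append(b"b")
```

A file system that updates blocks in place cannot run on this without changes, whereas a log-structured or CoW design can treat its own compaction pass as the moment to reset zones.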
If anything, consumer-level SSDs are moving in the opposite direction. On the Samsung 980 Pro, it is not even possible to change the sector size from 512 bytes to 4K.