Comment by mgerdts
2 days ago
I remember seeing another strategy where a remote block device was (lazily?) mirrored to a local SSD. The mirror was configured such that reads from the local device were preferred and writes would go to both devices. I think this was done by someone on GCP.
Does this ring any bells? I’ve searched for this a time or two and can’t find it again.
Discord: https://discord.com/blog/how-discord-supercharges-network-di...
(Somehow the name "SuperDisks" was burned into my brain for this. Although Discord's post does use 'Super-Disks' in a section header, if you search the Internet for SuperDisks you'll find that everything is about the LS-120 floppies that went by that name.)
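If you want roughly that behavior on a stock Linux box, md RAID1 with the remote leg flagged write-mostly gets you most of the way: the kernel satisfies reads from the local SSD while still mirroring every write to both legs. Here's a minimal sketch (Python wrapping mdadm; the device paths are placeholders, and I'm not claiming this is exactly Discord's setup):

    import subprocess

    # Placeholder device paths -- substitute your own devices.
    LOCAL_SSD = "/dev/nvme0n1"   # fast local device, preferred for reads
    REMOTE = "/dev/sdb"          # network-attached device, still receives every write

    # RAID1 mirror whose remote leg is flagged write-mostly, so md directs
    # reads at the local SSD and only reads the remote device if the SSD
    # leg is missing or has failed.
    subprocess.run(
        [
            "mdadm", "--create", "/dev/md0",
            "--level=1", "--raid-devices=2",
            LOCAL_SSD,
            "--write-mostly", REMOTE,
        ],
        check=True,
    )

mdadm also has --write-behind (it requires a write-intent bitmap) if you want writes to the write-mostly leg acknowledged before they reach the remote device.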
This is not quite the same; it's for migrating from one device to another while keeping the file system writable, but it's quite neat: dm-clone[1]
I've used it before for a low-downtime migration of VMs between two machines - it was a personal project and I could have just kept the VM offline for the migration, but it was fun to play around with it.
You give it a read-only backing device and a writable device that's at least as big. It slowly copies the data from the read-only device to the writable device. When a read is issued to the dm-clone target, it's served from the writable device if that region has already been cloned, or forwarded to the read-only device if not. Writes always go to the writable device, and afterwards the read-only device is ignored for that block.
It's not the fastest, but it's relatively easy to set up, even though using device mapper directly is a bit clunky. It's also not super efficient: IIRC, if a read hits a chunk that hasn't been copied yet, the fetched data is handed to the reading program but not stored on the writable device, so it has to be fetched again later. If the file system being copied isn't full, it's a good idea to run a trim after creating the dm-clone target, since discarded blocks are marked as not needing to be fetched.
[1] https://docs.kernel.org/admin-guide/device-mapper/dm-clone.h...
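For the curious, setting one up is basically a single dmsetup table: you hand it a small metadata device, the destination, the source, and a region size. A rough sketch of that, with Python wrapping dmsetup/blockdev and placeholder device paths (double-check the table syntax against the doc above for your kernel):

    import subprocess

    # Placeholder devices -- adjust for your setup.
    SOURCE = "/dev/vdb"           # read-only device being migrated away from
    DEST = "/dev/nvme0n1p2"       # writable device, at least as large as SOURCE
    METADATA = "/dev/nvme0n1p1"   # small device for dm-clone's hydration bookkeeping

    def sectors(dev: str) -> int:
        # Size of a block device in 512-byte sectors, via blockdev(8).
        out = subprocess.run(["blockdev", "--getsz", dev],
                             check=True, capture_output=True, text=True)
        return int(out.stdout.strip())

    # dm-clone table line: "<start> <length> clone <metadata> <destination> <source> <region size>".
    # Lengths are in 512-byte sectors; a region size of 8 sectors means 4 KiB regions.
    table = f"0 {sectors(SOURCE)} clone {METADATA} {DEST} {SOURCE} 8"
    subprocess.run(["dmsetup", "create", "cloned-disk", "--table", table], check=True)

Once dmsetup status reports all regions hydrated, you can swap the table for a plain linear mapping onto the destination and drop the source device entirely.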
I've done this on EC2 -- in particular back in the days when EBS billed per I/O (as opposed to the "provisioned IOPS" model where you say up front how much I/O performance you need). I haven't bothered recently since EBS performance is good enough for most purposes and there are no automatic cost savings.
There was some discussion amongst the ZFS devs about such a feature.
As I recall, it was to change the current mirrored read strategy to be aware of the speed of the underlying devices and prefer the faster one if it has spare capacity. A fixed pool property to always read from a given device may also have been discussed; it's been a while, so my memory is hazy.
The use case was similar, IIRC: a customer wanted to combine a local SSD with a remote block device.
So, might come to ZFS.
Google's L4 cache? https://cloud.google.com/blog/products/storage-data-transfer...