
Comment by formerly_proven

5 days ago

My understanding is that ZFS does virtual <-> physical translation in the vdev layer, i.e. all block references in ZFS contain a (vdev, vblock) tuple, and the vdev knows how to translate that virtual block offset into actual on-disk block offset(s).

This kinda implies that you can't actually remove data vdevs, because in practice you can't rewrite all references. You also can't do offline deduplication without rewriting references (i.e. actually touching the files in the filesystem). And that's why ZFS can't deduplicate snapshots after the fact.

On the other hand, reshaping a vdev is possible, because that "just" requires shuffling the vblock -> physical block associations inside the vdev.
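To make that concrete, here is a toy sketch of the model described above. It is not ZFS code: `BlockRef`, `ToyVdev`, `reshape` and the lookup table are hypothetical names made up for illustration, and a real vdev computes the mapping rather than storing a per-block table. It only shows that a stored (vdev, vblock) reference stays valid while the vdev reshuffles where the data physically lives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BlockRef:
    """What the filesystem stores: a virtual address, never a disk address."""
    vdev_id: int
    vblock: int


class ToyVdev:
    """Hypothetical vdev owning its own vblock -> (disk, physical block) table."""

    def __init__(self, table):
        self.table = dict(table)

    def translate(self, vblock):
        return self.table[vblock]

    def reshape(self, new_disk):
        # Move every odd vblock onto new_disk. Only the vdev's private
        # table changes; no BlockRef anywhere has to be rewritten.
        for vblock in list(self.table):
            if vblock % 2 == 1:
                self.table[vblock] = (new_disk, vblock // 2)


def resolve(pool, ref):
    """Follow a block reference down to a physical location."""
    return pool[ref.vdev_id].translate(ref.vblock)


pool = {0: ToyVdev({n: ("disk-a", n) for n in range(4)})}
ref = BlockRef(vdev_id=0, vblock=3)

before = resolve(pool, ref)
pool[0].reshape("disk-b")
after = resolve(pool, ref)
```

The same `ref` resolves to a different physical location after the reshape, which is why this kind of change can stay local to the vdev.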

There is a clever trick that is used to make top-level vdev removal work. The code makes the vdev read-only. Then it copies its contents into free space on the other vdevs (essentially, the contents are stored behind the scenes in a file). Finally, it redirects reads on that vdev to the stored copy. This indirection is what allows you to remove the vdev. It is not implemented for RAID-Z at present though.
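A rough sketch of those three steps, as a toy model only. `ToyPool`, `remove_vdev` and the `indirect` table are invented for illustration and do not correspond to actual ZFS structures or function names; the point is just that old references keep working because reads are redirected through a mapping.

```python
class ToyPool:
    """Toy pool: each vdev is a dict of vblock -> data (an assumption)."""

    def __init__(self):
        self.vdevs = {}        # vdev_id -> {vblock: data}
        self.readonly = set()  # vdev ids that no longer accept writes
        self.indirect = {}     # removed vdev_id -> {old vblock: (vdev_id, vblock)}

    def add_vdev(self, vdev_id, blocks):
        self.vdevs[vdev_id] = dict(blocks)

    def write(self, vdev_id, vblock, data):
        if vdev_id in self.readonly:
            raise PermissionError(f"vdev {vdev_id} is read-only")
        self.vdevs[vdev_id][vblock] = data

    def read(self, vdev_id, vblock):
        # Step 3: reads aimed at a removed vdev are redirected to the copy.
        if vdev_id in self.indirect:
            vdev_id, vblock = self.indirect[vdev_id][vblock]
        return self.vdevs[vdev_id][vblock]

    def remove_vdev(self, victim, target):
        # Step 1: stop new writes to the vdev being removed.
        self.readonly.add(victim)
        # Step 2: copy its contents into free space on another vdev,
        # remembering where each old block ended up.
        mapping = {}
        next_free = max(self.vdevs[target], default=-1) + 1
        for vblock, data in self.vdevs[victim].items():
            self.vdevs[target][next_free] = data
            mapping[vblock] = (target, next_free)
            next_free += 1
        self.indirect[victim] = mapping
        # The physical device can go away; only the mapping remains.
        del self.vdevs[victim]


pool = ToyPool()
pool.add_vdev(0, {0: b"a", 1: b"b"})
pool.add_vdev(1, {0: b"x"})
pool.remove_vdev(victim=0, target=1)

old_ref_data = pool.read(0, 1)  # an unchanged reference into the removed vdev
```

In this sketch the references into vdev 0 are never rewritten; the pool keeps a mapping for the removed vdev id and consults it on every read.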

  • Though the vdev itself still exists after doing that? It just happens to be backed by, essentially, a "file" in the pool, instead of the original physical block devices, right?