← Back to context

Comment by m4rtink

6 years ago

Can't you over provision even just by creating too many many snapshots ? Even if you never make the filesystems bigger then the backing pool, the snapshots will allocate some blocks from the pool and over time, boom.

Snapshots can't cause over-provisioning, not for file systems. If I mutate my data and keep snapshots forever, eventually my pool will run out of free space. But that's not a problem of over-provisioning, that's just running out of space.

With ZFS, if I take a snapshot and then delete 10GB of data my file system will appear to have shrunk by 10GB. If I compare the output of df before and after deleting the data, df will tell me that "size" and "used" have decreased by 10GB while "available" remained constant. Once the snapshot is deleted that 10GB will be made available again and the "size" and "available" columns in df will increase. It avoids over-provisioning by never promising more available space than it can guarantee you're able to write.

I think you're trying to relate ZFS too much to how LVM works, where LVM is just a volume manager that exposes virtual devices. The analogue to thin provisioned LVM volumes is thin-provisioned zvols, not regular ZFS file systems. I can choose to use ZFS in place of LVM as a volume manager with XFS as my file system. Over-provisioned zvols+XFS will have functionally equivalent problems as over-provisioned LVM+XFS.

ZFS doesn't work this way. The free blocks in the ZFS pool are available to all datasets (filesystems). The datasets themselves don't take up any space up front until you add data to them. Snapshots don't take up any space initially. They only take up space when the original dataset is modified, and altered blocks are moved onto a "deadlist". Since the modification allocates new blocks, if the pool runs out of space it will simply return ENOSPC at some point. There's no possibility of over-provisioning.

ZFS has quotas and reservations. The former is a maximum allocation for a dataset. The latter is a minimum guaranteed allocation. Neither actually allocate blocks from the pool. These don't relate in any comparable way to how LVM works. They are just numbers to check when allocating blocks.