[linux-lvm] Reserve space for specific thin logical volumes

Gionatan Danti g.danti at assyoma.it
Wed Sep 13 07:53:27 UTC 2017


On 13-09-2017 01:22, matthew patton wrote:
>> Step-by-step example:
>  > - create a 40 GB thin volume and subtract its size from the thin
>  > pool (USED 40 GB, FREE 60 GB, REFER 0 GB);
>  > - overwrite the entire volume (USED 40 GB, FREE 60 GB, REFER 40 GB);
>  > - snapshot the volume (USED 40 GB, FREE 60 GB, REFER 40 GB);
> 
> And 3 other threads also take snapshots against the same volume, or
> frankly any other volume in the pool.
> Since the next step (overwrite) hasn't happened yet or has written
> less than 20GB, all succeed.
> 
>  > - completely overwrite the original volume (USED 80 GB, FREE 20 GB,
>  > REFER 40 GB);
> 
> 4 threads all try to write their respective 40GB. After all, they got
> the green-light since their snapshot was allowed to be taken.
> Your thinLV blows up spectacularly.
> 
>  > - a new snapshot creation will fail (REFER is higher than FREE).
> nobody cares about new snapshot creation attempts at this point.
> 
> 
>> When do you decide it ?  (you need to see this is total race-lend)
> 
> exactly!

In all the examples I gave, the snapshots are supposed to be read-only
or at least never written. I thought that was implicitly clear, since
ZFS snapshots (used as the example) are read-only by default. Sorry for
not stating that explicitly.

However, the refreservation mechanism can protect the original volume 
even when snapshots are writeable. Here we go:

# Create a 400M ZVOL and fill it
[root@localhost ~]# zfs create -V 400M tank/vol1
[root@localhost ~]# dd if=/dev/zero of=/dev/zvol/tank/vol1 bs=1M oflag=direct
dd: error writing ‘/dev/zvol/tank/vol1’: No space left on device
401+0 records in
400+0 records out
419430400 bytes (419 MB) copied, 23.0573 s, 18.2 MB/s
[root@localhost ~]# zfs list -t all
NAME        USED  AVAIL  REFER  MOUNTPOINT
tank        416M   464M    24K  /tank
tank/vol1   414M   478M   401M  -

# Create some snapshots (note how the USED value increased due to the
# snapshot reserving space for all "live" data in the ZVOL)
[root@localhost ~]# zfs set snapdev=visible tank/vol1
[root@localhost ~]# zfs snapshot tank/vol1@snap1
[root@localhost ~]# zfs snapshot tank/vol1@snap2
[root@localhost ~]# zfs list -t all
NAME              USED  AVAIL  REFER  MOUNTPOINT
tank              816M  63.7M    24K  /tank
tank/vol1         815M   478M   401M  -
tank/vol1@snap1     0B      -   401M  -
tank/vol1@snap2     0B      -   401M  -

# Clone the snapshot (to be able to overwrite it)
[root@localhost ~]# zfs clone tank/vol1@snap1 tank/cvol1
[root@localhost ~]# zfs list -t all
NAME              USED  AVAIL  REFER  MOUNTPOINT
tank              815M  64.6M    24K  /tank
tank/cvol1          1K  64.6M   401M  -
tank/vol1         815M   479M   401M  -
tank/vol1@snap1     0B      -   401M  -
tank/vol1@snap2     0B      -   401M  -

# Writing to the cloned ZVOL fails (after only 66 MB written) *without*
# impacting the original volume
[root@localhost ~]# dd if=/dev/zero of=/dev/zvol/tank/cvol1 bs=1M oflag=direct
dd: error writing ‘/dev/zvol/tank/cvol1’: Input/output error
64+0 records in
63+0 records out
66060288 bytes (66 MB) copied, 25.9189 s, 2.5 MB/s

After the last write, the cloned cvol1 is clearly corrupted, but the
original volume has no problem at all.
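
As a quick sanity check of the numbers in the transcript above, the
arithmetic can be redone in the shell. This is only a sketch: the
variable names are mine, and the sizes are the rounded values printed
by `zfs list`, treated here as MiB.

```shell
#!/bin/sh
# Pool-level USED before and after taking the snapshots (MiB, rounded).
pool_used_before=416
pool_used_after=816
echo "extra space reserved at snapshot time: $((pool_used_after - pool_used_before)) MiB"

# dd reported 63 complete 1 MiB records written to the clone.
records_written=63
echo "bytes written to the clone: $((records_written * 1024 * 1024))"
```

The roughly 400 MiB jump in USED matches the size of the live data in
vol1, and the clone could write only about as much as the pool still had
available (64.6M), which is why it failed while vol1 stayed intact.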

Now, I am *not* advocating turning thinp into a ZFS-like thing (for
example, note the write speed, which is low even for my super-slow
notebook HDD). However, it would be useful to have a mechanism with
which we can tell LVM "hey, this volume should have all its space
reserved; don't hesitate to prevent new snapshots and/or freeze
existing ones when free space runs out".

This was more or less the case with classical, fat LVM: a snapshot
running out of space *will* fail, but the original volume remains
unaffected.
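
The kind of reservation rule I have in mind can be sketched as a toy
model in plain shell, using the numbers of the step-by-step example
quoted at the top (100 GB pool, 40 GB volume). This is neither ZFS nor
LVM code; every name in it (pool_size, used, refer, can_snapshot, ...)
is made up for illustration only.

```shell
#!/bin/sh
# Toy model of a "fully reserved volume" rule (sizes in GB).
pool_size=100
used=0
refer=0

free_space() { echo $((pool_size - used)); }

# Creating a fully reserved volume charges its whole size up front.
create_volume() { used=$((used + $1)); }

# Filling a volume that has no snapshot consumes only its own reservation.
fill_volume() { refer=$1; }

# A snapshot is admitted only if the pool could still absorb a complete
# overwrite of the data currently referenced by the volume.
can_snapshot() { [ "$refer" -le "$(free_space)" ]; }

# Overwriting after a snapshot: the old blocks stay pinned by the
# snapshot, so the newly written blocks are charged to the pool.
overwrite_after_snapshot() { used=$((used + refer)); }

create_volume 40
fill_volume 40
can_snapshot && echo "first snapshot: allowed (free=$(free_space))"
overwrite_after_snapshot
can_snapshot || echo "second snapshot: refused (refer=$refer, free=$(free_space))"
```

With these numbers the first snapshot is admitted (FREE 60 GB), and
after the full overwrite the next one is refused (REFER 40 GB against
FREE 20 GB), while the original volume can always be rewritten.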

Thanks.

-- 
Danti Gionatan
Supporto Tecnico
Assyoma S.r.l. - www.assyoma.it
email: g.danti at assyoma.it - info at assyoma.it
GPG public key ID: FF5F32A8



