[linux-lvm] Snapshots and disk re-use

Wed Feb 23 19:49:01 UTC 2011

On 02/23/2011 10:39 AM, Les Mikesell wrote:
> On 2/23/2011 12:19 PM, Jonathan Tripathy wrote:
>>
>>>> So you're not worried about the security implication of leftovers in
>>>> free space, and just want a base image to clone for new customers?
>>>>
>>>> The logical thing to do is to keep the origin volume untouched (except
>>>> for upgrading now and then), and take a snapshot for each customer.
>>>> Each snapshot would then be a new clone of the origin. Unfortunately,
>>>> large numbers of snapshots are inefficient for writes to new data,
>>>> so you'd likely have to "dd" to an independent LV instead. (This is
>>>> being worked on, and there are 3rd party products like Zumastor that
>>>> fix it now.)
>>> Actually, if you never (or rarely) write to the origin, lots of
>>> snapshots should be fine.
>>
>>> But every write to the origin will first copy the
>>> original origin data to every snapshot.
>>>
>> Why would origin data be copied over to the snapshot after the snapshot
>> has been created? Surely the point of a snapshot is to have "frozen"
>> data?
The content of the snapshot does not change. Initially each block in the
snapshot is just a pointer to the origin.  When you rewrite a block in
the origin, the old data is first copied to the cow device, which is
owned by the snapshot, not by the origin.  So the view through the
snapshot stays the same; only for the blocks that were changed in the
origin does the snapshot now point to the preserved copy in its cow
instead of to the origin volume.

I would be careful with the word "frozen", though, because you can
actually write to the snapshot volume if you want to, so it is not
really frozen.
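
To make that concrete, here is a toy model in Python.  It is only a
sketch of the idea, not how device-mapper is actually implemented: the
Origin and Snapshot classes are made up for illustration, and real
snapshots work on fixed-size chunks on disk rather than list entries.

```python
class Origin:
    """Toy origin volume: a list of blocks plus the snapshots taken of it."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshots = []

    def snapshot(self):
        snap = Snapshot(self)
        self.snapshots.append(snap)
        return snap

    def write(self, index, data):
        # Before overwriting, preserve the old block in every snapshot
        # that does not already hold its own copy of it.
        for snap in self.snapshots:
            if index not in snap.cow:
                snap.cow[index] = self.blocks[index]
        self.blocks[index] = data


class Snapshot:
    """Toy snapshot: reads fall through to the origin unless the cow has the block."""

    def __init__(self, origin):
        self.origin = origin
        self.cow = {}  # block index -> data private to this snapshot

    def read(self, index):
        if index in self.cow:
            return self.cow[index]
        return self.origin.blocks[index]

    def write(self, index, data):
        # A write to the snapshot lands in its cow; the origin is untouched.
        self.cow[index] = data


origin = Origin(["a", "b", "c"])
snap = origin.snapshot()
origin.write(1, "B")  # the old "b" is copied into snap's cow first
snap.write(2, "C")    # the snapshot is writable, hence not really "frozen"
```

After these two writes, snap.read(1) still returns the old "b" while the
origin holds "B", and the "C" written to the snapshot never shows up in
the origin.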

>
> Yes, is the way this actually works explained somewhere?  I would have
> expected the 'copy-on-write' blocks to be copied only on the side
> where the write is happening and relocated instead of rewriting all
> the snapshots that might be outstanding with the old data.
>
I don't know for sure, but it sounds like that is how it is currently
done.  Ideally there would be some kind of data sharing between multiple
snapshots.  I suspect the data gets copied (instead of just changing
pointers) to avoid excessive fragmentation in the origin volume.
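
If it does work that way, the cost of writing to the origin grows with
the number of snapshots.  Here is a rough Python sketch of the
arithmetic, assuming every snapshot keeps its own private cow with no
sharing (count_cow_copies is a made-up helper, not an LVM tool):

```python
def count_cow_copies(num_snapshots, origin_writes):
    """Count block copies when each snapshot keeps its own private cow store."""
    cow_stores = [set() for _ in range(num_snapshots)]
    copies = 0
    for block in origin_writes:
        for store in cow_stores:
            # A block is copied to a given snapshot only the first time
            # it is overwritten in the origin.
            if block not in store:
                store.add(block)
                copies += 1
    return copies


writes = [0, 1, 1, 2]  # the second write to block 1 needs no new copy
one_snapshot = count_cow_copies(1, writes)
ten_snapshots = count_cow_copies(10, writes)
```

Under that assumption the three distinct blocks cost 3 copies with one
snapshot and 30 copies with ten, which is why lots of snapshots are only
cheap if the origin is rarely written.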

Nataraj