[linux-lvm] Data deduplication for Linux : lessfs

Les Mikesell lesmikesell at gmail.com
Wed Jun 24 21:21:59 UTC 2009


malahal at us.ibm.com wrote:
> 
>>>> Block level deduplication isn't going to know/care about the difference 
>>>> between file contents and metadata.  It is either stored in blocks that 
>>>> match other blocks or not and the difference should not be visible to the 
>>>> filesystem living on top of the block device.
>>> My point exactly. If dedup was to be done on the block layer, you'd need 
>>> a flag to say "do not dedup this".
>> Why?  How can it possibly make any difference? It's not likely that you'd 
>> have dupes in the metadata block, but if you do it doesn't matter that they 
>> are transparently mapped into one.  You need a copy-on-write mechanism 
>> anyway since if you write to either they won't be dups any more.
> 
> Because some file systems create duplicate copies of metadata for
> recovery if some sectors go bad on the media. You really don't
> want to merge them!

My experience with disks is that if any part of one fails you don't 
want to trust data from any other part of it.  So I'd consider those 
duplicate metadata copies a big waste of time, and I generally keep data 
that matters on mirrored drives.  Hmmm, I suppose you would want the 
dedup layer to know not to de-dup the mirrored blocks.
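
To make the mechanics discussed above concrete, here is a toy sketch in
Python of hash-based block dedup with remap-on-write and a per-write
"do not dedup" flag. It is only an illustration under stated assumptions:
it is not how lessfs or any real block layer does it, and the class name
DedupBlockDevice, its methods, and the no_dedup parameter are all
hypothetical.

```python
import hashlib


class DedupBlockDevice:
    """Toy in-memory block store with hash-based dedup (illustration only)."""

    def __init__(self):
        self.physical = {}   # physical id -> block contents
        self.refcount = {}   # physical id -> number of logical blocks using it
        self.by_hash = {}    # content hash -> physical id (shareable blocks only)
        self.logical = {}    # logical block number -> physical id
        self.next_id = 0

    def _release(self, lbn):
        """Drop the mapping for lbn and free the physical block if unused."""
        pid = self.logical.pop(lbn, None)
        if pid is None:
            return
        self.refcount[pid] -= 1
        if self.refcount[pid] == 0:
            data = self.physical.pop(pid)
            del self.refcount[pid]
            digest = hashlib.sha256(data).hexdigest()
            # Only forget the hash if it pointed at this physical block.
            if self.by_hash.get(digest) == pid:
                del self.by_hash[digest]

    def write(self, lbn, data, no_dedup=False):
        """Write a block. A write never modifies a shared physical block in
        place; the logical block is remapped, so other users are unaffected."""
        self._release(lbn)
        digest = hashlib.sha256(data).hexdigest()
        if not no_dedup and digest in self.by_hash:
            # Identical shareable content already stored: map to it.
            pid = self.by_hash[digest]
            self.refcount[pid] += 1
        else:
            # New content, or caller asked for a private physical copy.
            pid = self.next_id
            self.next_id += 1
            self.physical[pid] = bytes(data)
            self.refcount[pid] = 1
            if not no_dedup:
                self.by_hash[digest] = pid
        self.logical[lbn] = pid

    def read(self, lbn):
        return self.physical[self.logical[lbn]]
```

Writing the same data to logical blocks 0 and 1 stores one physical
block; rewriting block 1 afterwards leaves block 0 intact, which is the
copy-on-write behaviour mentioned in the quoted text. Passing
no_dedup=True for two identical blocks keeps two separate physical
copies, which is what a filesystem's redundant metadata or a mirror
would need.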

-- 
   Les Mikesell
    lesmikesell at gmail.com



