[linux-lvm] Data deduplication in LVM?

Wed Jun 10 19:04:45 UTC 2009

On 10 June 2009, at 20:41, Roy Sigurd Karlsbakk wrote:

> Hi all
>
> I've been reading up a little on data deduplication, and have been
> searching for an OSS filesystem with dedup, without much luck.
> While testing snapshots and so on in LVM, I started wondering whether
> dedup would be better off in LVM than in the filesystem. Would it be
> possible/efficient to add dedup to the LVM layer, or perhaps to a layer
> above LVM? This could make dedup work for all or most
> filesystems. Make a hash table over 4k (or whatever) blocks, make
> virtual blocks pointing to the physical blocks, and run a
> remapping/deduping job at night. On writes, copy-on-write could be
> used to increase speed.

Answering myself: it seems this could be a problem without a
rather large change in the APIs. If I understand it correctly,
deduplicating metadata may impose a rather large performance
penalty on writes, and from the block layer, how do you know what's
metadata and what's not?
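
For illustration, the scheme from the quoted message might be sketched roughly as below. This is a toy in-memory model only, not LVM or device-mapper code; the class DedupVolume and its methods are hypothetical names made up for this sketch, and SHA-256 is an assumed choice of hash. It shows virtual blocks mapped to physical blocks, an offline dedup pass, and copy-on-write when a shared block is written.

```python
import hashlib

BLOCK_SIZE = 4096  # 4k blocks, as suggested in the quoted message


class DedupVolume:
    """Toy model of a deduplicating block layer (hypothetical, in memory)."""

    def __init__(self, num_blocks):
        zero = bytes(BLOCK_SIZE)
        # physical block id -> block contents
        self.physical = {i: zero for i in range(num_blocks)}
        # virtual block number -> physical block id
        self.vmap = {i: i for i in range(num_blocks)}
        # physical block id -> number of virtual blocks pointing at it
        self.refs = {i: 1 for i in range(num_blocks)}
        self.next_id = num_blocks

    def read(self, vblock):
        return self.physical[self.vmap[vblock]]

    def write(self, vblock, data):
        assert len(data) == BLOCK_SIZE
        pid = self.vmap[vblock]
        if self.refs[pid] > 1:
            # Shared block: copy-on-write into a fresh physical block.
            self.refs[pid] -= 1
            pid = self.next_id
            self.next_id += 1
            self.refs[pid] = 1
            self.vmap[vblock] = pid
        self.physical[pid] = data

    def dedup_pass(self):
        """Offline ("nightly") job: remap identical blocks to one copy."""
        seen = {}  # content hash -> physical block id that is kept
        freed = 0
        for vblock in sorted(self.vmap):
            pid = self.vmap[vblock]
            digest = hashlib.sha256(self.physical[pid]).digest()
            keep = seen.setdefault(digest, pid)
            # Compare contents as well, to guard against hash collisions.
            if keep != pid and self.physical[keep] == self.physical[pid]:
                self.vmap[vblock] = keep
                self.refs[keep] += 1
                self.refs[pid] -= 1
                if self.refs[pid] == 0:
                    del self.physical[pid], self.refs[pid]
                    freed += 1
        return freed


vol = DedupVolume(4)
vol.write(0, b"a" * BLOCK_SIZE)
vol.write(1, b"a" * BLOCK_SIZE)
vol.write(2, b"b" * BLOCK_SIZE)
freed = vol.dedup_pass()         # blocks 0 and 1 now share one physical block
vol.write(1, b"c" * BLOCK_SIZE)  # copy-on-write; block 0 is left untouched
```

Note that the sketch hashes and remaps every block the same way. It has no notion of which blocks hold filesystem metadata, which is exactly the limitation raised above.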

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
roy at karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to
avoid excessive use of idioms of foreign origin. In most cases,
adequate and relevant synonyms exist in Norwegian.