Rsync and Compression

WipeOut wipe_out at users.sourceforge.net
Wed Apr 21 10:19:00 UTC 2004


Ron Goulard wrote:

>On Wed, 2004-04-21 at 02:38, Tom 'Needs A Hat' Mitchell wrote:
>  
>
>>On a local link I turn compression off.  The CPU effort and latency to
>>compress then uncompress does not justify the time saved in transfer
>>time.
>>
>>Also most digital content (images, rpms) does not compress enough to
>>justify the cycles.  Some files even increase in size.
>>
>>Since this depends on your content and your local system capabilities
>>I can only advise you to list all the possible environment knobs in
>>your script and then benchmark by turning them on and off.
>>
>>Of interest: if compression proves to be an advantage for backups, then
>>you should check into compression on the httpd server side.  It is
>>possible to present compressed content to an aware client that is then
>>expanded locally by the browser.
>>
>>If you think about it a distant proxy server could do this and make
>>the link look faster.  Some services are apparently doing this and
>>charging extra for it.  As a content provider you should do this to
>>save both you and your customers bandwidth.  It might be interesting
>>to make sure that precompressed content is not expanded to make a link
>>look slow.  I seriously doubt that any service would like to be caught
>>doing this ... but ... ya never know.
>>
>>The backup impact of this is that the pages on the web site are
>>compressed already and will not compress any more.  So why bother.
>>    
>>
>
>Something else to consider is if you are keeping multiple copies of the
>backup.  Tar is much faster at doing this than rsync _if_ the files
>being backed up do not already exist on the other end.  Rsync gets its
>speed by moving only what has changed (it compares both ends of the
>connection before deciding).
>
>Therefore, if you are keeping only a single copy of the backup, you can
>use rsync, but if you are keeping multiple copies, for example a
>complete copy of the data for every day of the past month, then I think
>tar may be the better choice.  (there are reasons for doing it either
>way, but I won't get into that here, just providing an option)
>
>
>  
>
I am taking a full backup every day, but using tar would mean pulling a 
complete compressed copy across every night, which is very inefficient.

Using rsync with the --link-dest option means that a hard link to the 
previous backup's copy is created for each file that hasn't changed, and 
only changed files are brought across. Add compression to that (the 
reason for my initial question) and you effectively have a full backup 
every day with only the incremental changes being compressed and shipped 
across the wire. To my mind this is about as efficient as it can hope to 
be in terms of both backup performance and bandwidth conservation.
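
For anyone wanting to try this, here is a rough sketch of how the nightly 
command could be put together. It is only an illustration, not my actual 
script: the host name, the paths, the one-directory-per-day layout and the 
build_rsync_command helper are all made up for the example. The rsync 
flags themselves (-a, -z, --delete, --link-dest) are real options.

```python
import shlex
from datetime import date, timedelta
from pathlib import PurePosixPath


def build_rsync_command(source, backup_root, today, compress=True):
    """Build the rsync argv for one dated snapshot (hypothetical helper).

    Assumes one directory per day under backup_root, named YYYY-MM-DD,
    and that yesterday's snapshot is the one to hard-link against.
    """
    root = PurePosixPath(backup_root)
    target = root / today.isoformat()
    previous = root / (today - timedelta(days=1)).isoformat()

    # -a preserves permissions/times; --delete drops files removed at the source.
    cmd = ["rsync", "-a", "--delete"]
    if compress:
        # -z compresses file data on the wire only, not on disk.
        cmd.append("-z")
    # Unchanged files become hard links into the previous snapshot.
    cmd.append(f"--link-dest={previous}")
    cmd += [source, f"{target}/"]
    return cmd


if __name__ == "__main__":
    # Hypothetical host and paths, pulling from the remote web server.
    argv = build_rsync_command(
        "backup@web.example.com:/var/www/",
        "/srv/backups",
        date(2004, 4, 21),
    )
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Passing compress=False gives the same command without -z, which makes it 
easy to time both variants on a local link, as Tom suggested. As far as I 
know, if the previous day's directory is missing rsync just transfers 
everything, so the first run is an ordinary full copy.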

Later..




