Rsync and Compression

Ron Goulard foz at techville.org
Wed Apr 21 07:13:25 UTC 2004


On Wed, 2004-04-21 at 02:38, Tom 'Needs A Hat' Mitchell wrote:
> On a local link I turn compression off.  The CPU effort and latency to
> compress then uncompress does not justify the time saved in transfer
> time.
> 
> Also, most digital content (images, rpms) does not compress enough to
> justify the cycles.  Some files even increase in size.
> 
> Since this depends on your content and your local system capabilities
> I can only advise you to list all the possible environment knobs in
> your script and then benchmark by turning them on and off.
> 
> Of interest: if compression proves to be an advantage for backups, then
> you should check into compression on the httpd server side.  It is
> possible to present compressed content to an aware client that is then
> expanded locally by the browser.
> 
> If you think about it a distant proxy server could do this and make
> the link look faster.  Some services are apparently doing this and
> charging extra for it.  As a content provider you should do this to
> save both you and your customers bandwidth.  It might be interesting
> to make sure that precompressed content is not expanded to make a link
> look slow.  I seriously doubt that any service would like to be caught
> doing this ... but ... ya never know.
> 
> The backup impact of this is that the pages on the web site are
> compressed already and will not compress any more.  So why bother.

Something else to consider is whether you are keeping multiple copies of
the backup.  Tar is much faster than rsync at making a copy _if_ the
files being backed up do not already exist on the other end.  Rsync gets
its speed by moving only what has changed (it compares both ends of the
connection before deciding).
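
To make the rsync side concrete, here is a rough sketch of what a
single-copy mirror might look like.  The sync_backup name, the host, and
the paths are placeholders of mine, not anything from the thread; -a
(archive mode) and -i (itemize changes) are standard rsync options.  With
-i, rsync lists only the items it actually updates, which shows the
"only what has changed" behaviour described above.

```shell
# Hypothetical helper: mirror the contents of directory $1 into $2.
# -a preserves permissions and timestamps; -i lists each item rsync updates.
sync_backup() {
    rsync -a -i "$1"/ "$2"/
}

# Example use (host and paths are placeholders; not run here):
#   sync_backup /to_be_backed_up yourdomain.com:/backup
```

Whether to add -z for compression is the kind of knob Tom suggests
benchmarking on your own link and content.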

Therefore, if you are keeping only a single copy of the backup, you can
use rsync, but if you are keeping multiple copies, for example a
complete copy of the data for every day of the past month, then I think
tar may be the better choice.  (There are reasons for doing it either
way, but I won't get into that here; I'm just providing an option.)

An example for the tar version would be:

tar cf - -X /path_to_excluded_files.txt /to_be_backed_up | \
    ssh yourdomain.com "cat > /backup-somedate.tar"

Substitute your own values of course.  The excluded-files file is just a
list of files and directories you don't want backed up (and can easily be
omitted if not needed).  I've never tried it with compression on the fly,
though; I just compress it locally when it's done, if I feel the need.
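
For anyone who does want compression on the fly, something along these
lines would probably do it.  This is only a sketch: the stream_backup
name is made up for illustration, and it assumes a tar that accepts the
z (gzip) option, as GNU tar does.  The host and paths are the same
placeholders as in the command above.

```shell
# Hypothetical helper: write a gzip-compressed tar of $1 to stdout,
# skipping anything matched by the optional exclude list given as $2.
stream_backup() {
    src=$1
    exclude_file=${2:-}
    if [ -n "$exclude_file" ]; then
        tar czf - -X "$exclude_file" "$src"
    else
        tar czf - "$src"
    fi
}

# Example use (host and paths are placeholders; not run here):
#   stream_backup /to_be_backed_up /path_to_excluded_files.txt |
#       ssh yourdomain.com "cat > /backup-somedate.tar.gz"
```

As Tom points out, content that is already compressed (images, rpms)
will not shrink much, so it is worth timing this against the plain
version before adopting it.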

This is out of the 'Linux Server Hacks' book from O'Reilly (I think
that's what the book was called; this is all from memory).  Some very
good stuff in there.

Ron




