[K12OSN] OT: Off-site data replication
Christopher K. Johnson
ckjohnson at gwi.net
Thu Apr 7 21:46:09 UTC 2005
Mike Lichtenwalner wrote:
> Les Mikesell wrote:
>
>> On Wed, 2005-04-06 at 13:19, Mike Lichtenwalner wrote:
>>
>>
>>> What are some alternatives in replicating data in a relatively-live
>>> manner? A SAN is an option, but is something that extensive
>>> required? Would doing an rsync every 15 minutes bring the box to its
>>> knees?
>>
>>
>>
>> There is a lot of overhead in the directory-scan part of an rsync. If
>> you can confine the frequent runs to smaller subdirectories at a time
>> it will not have much impact. Since maildir type mailboxes can have
>> a huge number of files, you might go so far as to script a loop over
>> individual home directories to rsync one at a time.
>
>
> Thanks to those of you who suggested rsync - I'll give it a try. I
> think I will NFS mount the volumes I want to rsync from another box,
> and then run the rsync between the NFS mounted volume and a local
> volume. I'm hoping that will cut the load on the original server.
I'm not sure what topology you meant by your description. A lot of
factors affect rsync performance, including CPU speed and load, disk
I/O speed, and network bandwidth. If you are truly looking for off-site
data replication, be sure not to mount NFS across the wide-area
network; you want only rsync traffic crossing any slow link. You might
also want to run rsync over ssh to secure your site-to-site
communications - explore the "-e ssh" rsync option. If you do, chances
are you will also want public/private key based authentication so the
ssh login works unattended from a cron job.
It could look like this:
[ busy server  ]           [ local backup proxy      ]                     [ offsite backup server ]
[ running nfsd ] -- nfs -- [ cron job runs rsync cmd ] -- rsync via ssh -- [ running sshd          ]
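
For what it's worth, here is a rough sketch of what the cron job on the
local backup proxy might run. It combines the "-e ssh" idea with Les's
suggestion of looping over individual home directories so each rsync
scans a smaller tree. It is only an illustration under assumptions: the
mount point, remote account, destination path, and key file are all
hypothetical placeholders, and the key is assumed to have no passphrase
so that cron can use it unattended.

```python
import subprocess
from pathlib import Path

# All names below are hypothetical placeholders, not taken from this thread.
NFS_HOMES = Path("/mnt/busy-server/home")   # NFS mount of the busy server's volume
REMOTE = "backup@offsite.example.org"       # account on the offsite sshd host
REMOTE_BASE = "/srv/backup/home"            # destination directory on that host
SSH_KEY = "/root/.ssh/backup_key"           # private key used for the cron login


def build_rsync_cmd(home_dir):
    """Return the rsync argument list that copies one home directory over ssh."""
    return [
        "rsync",
        "-az",                                # archive mode, compress in transit
        "-e", f"ssh -i {SSH_KEY} -o BatchMode=yes",  # never prompt for a password
        f"{home_dir}/",                       # trailing slash: copy the contents
        f"{REMOTE}:{REMOTE_BASE}/{home_dir.name}/",
    ]


def sync_all(homes=NFS_HOMES, run=subprocess.run):
    """Rsync each home directory separately; return the number of failures."""
    failures = 0
    for home_dir in sorted(p for p in homes.iterdir() if p.is_dir()):
        result = run(build_rsync_cmd(home_dir))
        if result.returncode != 0:
            failures += 1
    return failures
```

A small wrapper script that calls sync_all() and exits non-zero on any
failure could then be scheduled from cron. Whether to add options such
as --delete depends on how closely the offsite copy should mirror the
original.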
Chris
--
"Spend less! Do more! Go Open Source..." -- Dirigo.net
Chris Johnson, RHCE #807000448202021