Automated Mirror Selection [Re: Worst experience with Up2Date ever.]

Fulko.Hew at sita.aero Fulko.Hew at sita.aero
Wed Sep 29 18:41:17 UTC 2004



Jeff Spaleta <jspaleta at gmail.com>@redhat.com on 09/29/2004 01:22:16 PM
commented:


> On Wed, 29 Sep 2004 09:56:27 -0500, Chris Adams <cmadams at hiwaay.net> wrote:
>

... snip ...

> >  For example, mirror.hiwaay.net is in the list, but I don't
> > mirror all the rawhide architectures (just i386 and x86_64).  Does
> > up2date recognize a 404 and try another mirror?  Is this action
> > remembered?
>
> I don't think there is any cleverness about remembering if mirrors
> have gone sour and removing them locally from the client. You have to
> be careful with this sort of logic. A specific mirror could fail for a
> temporary reason, so trying to be clever and remember that it failed
> could prevent you from ever using that mirror again just because it
> had a short period downtime event once.

On the other hand 'muohio' seems to be on the mirror list, and it has never
(apparently) worked, and up2date just hangs on it forever (or at least
a _very_ long time.)  I end up killing up2date, and trying again.

The trouble is that up2date has a variety of stages that it goes through,
and it apparently re-picks from the list of hosts during each stage.

So even though it went to a good site in the beginning to detect that
there are new packages, it may choose to go to a bad site to fetch the
RPM headers.
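
A rough sketch of what I mean, in Python.  This is not how up2date works
today; the class name, the cooldown value and the probe callback are all
made up for illustration.  The idea is to pick one mirror and stick with it
for every stage of a run, and to skip a mirror that failed only for a
limited time, so a short outage doesn't ban it forever.  The probe is
expected to give up after a timeout instead of hanging.

```python
import time
from typing import Callable, Dict, List, Optional


class MirrorPicker:
    """Pick one mirror per session; skip recently failed ones for a while.

    Hypothetical helper, not part of up2date.
    """

    def __init__(self, mirrors: List[str], cooldown: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.mirrors = list(mirrors)
        self.cooldown = cooldown            # seconds a failed mirror is skipped
        self.clock = clock                  # injectable so it can be tested
        self.failed_at: Dict[str, float] = {}
        self.current: Optional[str] = None  # mirror chosen for this session

    def _usable(self, mirror: str) -> bool:
        failed = self.failed_at.get(mirror)
        return failed is None or self.clock() - failed >= self.cooldown

    def pick(self, probe: Callable[[str], bool]) -> Optional[str]:
        """Return the session's mirror, probing candidates only when needed.

        `probe` should return False on error or timeout rather than block.
        """
        if self.current is not None and self._usable(self.current):
            return self.current             # every stage reuses the same host
        for mirror in self.mirrors:
            if not self._usable(mirror):
                continue                    # still cooling down after a failure
            if probe(mirror):
                self.current = mirror
                return mirror
            self.failed_at[mirror] = self.clock()
        self.current = None
        return None

    def report_failure(self, mirror: str) -> None:
        """Record a failure seen mid-session and drop the sticky choice."""
        self.failed_at[mirror] = self.clock()
        if self.current == mirror:
            self.current = None
```

Each stage (checking for updates, fetching headers, fetching RPMs) would
call pick() and get the same host back unless that host has failed in the
meantime.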

The list of mirrors should/could at the very least be a package that
we can download to get the 'latest set' of behaving mirrors.

> > One problem with syncing rawhide to mirrors is that the daily build and
> > push of rawhide to the master servers takes a different amount of time
> > (and so finishes at a different time) each day.  Some days, my sync
> > finishes before the push is complete, so until my mirror syncs again
> > (once a day), it will be effectively fubared.

Also the sync of headers is independent of the sync of packages.
Also the notification of packages is independent of syncing of any parts
of any mirrors.

In the end my rule of thumb is that if I see the update notification icon,
and I can't get the files now (because they're not sync'ed yet)  I'll try
again in a few hours.  There is no point in trying them sooner, since
syncing mirrors takes a completely variable amount of time.

> Okay this sort of problem could maybe be preventable with small
> changes on the master server, if there was a way to implement a
> start/stop file on the master servers that mirrors could check to see
> if a rawhide push was still in process and delay starting a sync
> attempt until the rawhide push on the master server was done.  But
> even if the small technical change on the master server was made, all
> the mirrors would have to write/use scripts to take advantage of the
> change and there is no way you can enforce mirror admins to do that
> across the board. It's the politics of individual mirror admins. So
> while you might use this to make your mirror more reliable... I don't
> think you can count on all admins to do it.

Can the master server check the mirrors to see if (all of) their transfers
have been completed, and only then re-add that mirror to the list of
candidates?
Sure, it would be nicer if the mirror could just 'tell' the master, rather
than have the master poll...
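
Something like this on the master side, sketched in Python.  It assumes
(nothing like this exists as far as I know) that every push gets an
identifier, and that each mirror writes that identifier to a small stamp
file as the very last step of its sync.  The master fetches the stamps
however it likes and only lists the mirrors that match.  The function name
and the host names are invented.

```python
from typing import Dict, List, Optional


def mirrors_in_sync(master_stamp: str,
                    mirror_stamps: Dict[str, Optional[str]]) -> List[str]:
    """Return the mirrors whose published stamp matches the master's push.

    A stamp of None means the mirror could not be reached or has no stamp
    file, so it is left out of the candidate list.
    """
    return sorted(
        host for host, stamp in mirror_stamps.items()
        if stamp is not None and stamp == master_stamp
    )
```

A mirror that is mid-sync still carries the previous stamp, so it drops off
the list until its transfer is complete and then comes back on its own.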

> > One thing I've considered
> > doing is to change my mirroring scripts to watch for actual file syncs
> > and loop (with a small delay) until the sync runs without changing any
> > files.  That would help, but there's still a period where my mirror of
> > the tree is not in sync.  I guess if I synced the header.info and
> > repodata files last and then deleted, I should always have a consistent
> > tree, but scripting that will be somewhat of a PITA.

That would be the first major boon... if RPMs were sync'ed before headers,
at least there shouldn't be a 'premature' change indication without the
files being available.
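
The ordering is simple enough to sketch in Python.  Given the files that
changed and the files that should go away, copy the packages first, then
the metadata, and delete last.  The rule for what counts as metadata
(header.info files and anything under a repodata/ or headers/ directory) is
my assumption about the tree layout, and the function names are invented.

```python
from pathlib import PurePosixPath
from typing import Iterable, List, Tuple


def is_metadata(path: str) -> bool:
    """Assumed rule: header.info files, or anything under repodata/ or headers/."""
    p = PurePosixPath(path)
    return p.name == "header.info" or bool({"repodata", "headers"} & set(p.parts))


def plan_sync(changed: Iterable[str],
              stale: Iterable[str]) -> List[Tuple[str, str]]:
    """Order the work so metadata never points at a package that is missing."""
    changed = sorted(changed)
    packages = [p for p in changed if not is_metadata(p)]
    metadata = [p for p in changed if is_metadata(p)]
    plan = [("copy", p) for p in packages]           # 1. new packages first
    plan += [("copy", p) for p in metadata]          # 2. then headers/repodata
    plan += [("delete", p) for p in sorted(stale)]   # 3. old files go last
    return plan
```

At every point during the run, the metadata on the mirror only refers to
packages that are already there (old ones before step 2, new ones after).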

> The heart of the problem... anything that is going to make mirrors
> significantly more reliable is going to end up being a solution that
> puts a heavy burden on the mirror admins.
> PITA scripts or extra services that communicate back or whatever. The
> maximum reliability in terms of how out of sync an individual mirror
> can get basically comes down to how much care and feeding individual
> mirror maintainers are willing to do. So even if you get scripts in
> place on your mirror that work to keep your mirror synced with the
> master mirror, it's not clear to me that you can get equal effort from
> the other mirrors who are volunteering their bandwidth to host the
> packages.

PITA scripts should be a prerequisite to becoming a mirror, or at the very
least to being a mirror that gets put into the mirror list.

BTW.  It seems to me that the mirror list is dynamically retrieved during
each run of up2date  (true or false)?  If false, then where is my local
mirror list stored?  (so I can get rid of that damn 'muohio'!)

> And if we can't get all the mirrors to be reliable, then the only
> other way to do it, is to have some scheme that dynamically checks for
> out of synced mirrors and ranks the reliability of mirrors so that
> over time, clients who are connecting to mirrors get a preferred set
> and never get a set that is known to be out of sync.  But, I have not
> seen a scheme that doesn't require an admin to run extra services or
> scripts, that is guaranteed to catch a mirror being out of sync and
> prevent a client computer from contacting a mirror that is out of
> sync.

I think that's too complicated.  Something that notifies when the rsync is
done is better.  For the foreseeable future I don't think a mirror will be
'constantly' in flux.  I'm sure there will be at least a few minutes of the
day when it _will_ be in sync.  When we get to the point where updates
are coming out on all/any packages so quickly that clients are _always_
in the mode of fetching at least _one_ package, _all_ the time, then boy
have we either got a problem with buggy software, or _way_ too many
packages... or not enough bandwidth.  (I guess the latter is always true.)  :-)






