Mirror monitor for meta data repos

david dfarning at sbcglobal.net
Tue Jun 7 18:00:36 UTC 2005


Jeff Spaleta wrote:

>On 6/7/05, david <dfarning at sbcglobal.net> wrote:
>  
>
>>I'm looking at pulling the timestamp from the master about every hour.
>>
>>If a master version/arch timestamp changes, I will check every hour
>>until it becomes up to date.
>>If a prior version/arch timestamp fails, I will check every hour until
>>it becomes up to date.
>>
>>Every {hour | couple of hours} I will check a random version/arch
>>timestamp per mirror to ensure that the entire mirror isn't dead.
>>    
>>
>
>So you are checking every mirror once an hour after you see the master
>mirror timestamp change?  So the time between checking the master and
>the time to check an individual mirror is at maximum 1 hour? You
>need to have a state of "unknown" for each mirror between when you
>know the master has changed and you have yet to check to see if the
>mirror has been updated. You can't assume that a mirror is out of date
>until you check that mirror.  Nor can you assume the mirror is up2date
>either. Mirrors are simply in an unverified state in the meantime.
>
>  
>
Good heavens man, I just started working on this a few hours ago. So as they
say, "The concrete ain't set up real firm yet."
I originally started looking at the mirror status problem in order to set up
some sort of geoip/mirror_list/dns_server for a repository, and wanted to
ensure that I am not referring users to dead or out-of-date mirrors. But
that's a good way off.

In the meantime:

My intention is to probe all master timestamps each hour. So, for the
development branch
1 mirror X 6 arch X ~1.1K (size of repomd.xml) = ~6.6K traffic per hour

If a master timestamp has changed - probe all mirrors
~250 mirror X 1 arch X ~1.1K  = ~275K traffic per arch per change
(development changes ~ once per day)

If a master timestamp has not changed - probe only those mirrors that
did not report up2date on the last probe
~150 mirror X 1 arch X ~1.1K  = ~165K traffic first hour after change
(sync delay)
~75 mirror X 1 arch X ~1.1K  = ~83K traffic second hour after change
(sync delay)
...
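
As a sanity check on the back-of-envelope numbers above, here is a small
Python sketch that just multiplies mirrors X arches X file size. The
1100-byte figure is an assumed reading of "~1.1K", and the function name
is made up for illustration; it is not part of m3d.

```python
# Rough traffic estimate for one round of probes.
# Assumption: "~1.1K" per repomd.xml is taken as 1100 bytes.
REPOMD_BYTES = 1100


def probe_traffic(mirrors: int, arches: int, size: int = REPOMD_BYTES) -> int:
    """Return the bytes fetched when probing `mirrors` mirrors for `arches` arches."""
    return mirrors * arches * size


if __name__ == "__main__":
    cases = [
        ("master, all arches", 1, 6),
        ("all mirrors after a change, one arch", 250, 1),
        ("first hour after change, one arch", 150, 1),
        ("second hour after change, one arch", 75, 1),
    ]
    for label, mirrors, arches in cases:
        kilobytes = probe_traffic(mirrors, arches) / 1000
        print(f"{label}: ~{kilobytes:.1f}K")
```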

Every time a master timestamp is changed all mirrors are immediately 
reprobed to eliminate "the dreaded unknown state."
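
The probing rule described above could be sketched roughly like this in
Python. Everything here is hypothetical naming for illustration (the
State values, classify, mirrors_to_probe), and treating a mirror
timestamp equal to or newer than the master as "up2date" is an
assumption. Mirrors are marked unknown only for the short window between
the master changing and their own probe coming back.

```python
from enum import Enum
from typing import Dict, List, Optional


class State(Enum):
    UNKNOWN = "unknown"          # master changed, mirror not yet reprobed
    UP_TO_DATE = "up2date"
    OUT_OF_DATE = "out of date"
    FAILED = "failed"            # probe timed out or errored


def classify(master_ts: int, mirror_ts: Optional[int]) -> State:
    """Compare one probed mirror timestamp against the master timestamp."""
    if mirror_ts is None:
        return State.FAILED
    # Assumption: a mirror is current once it carries the master's timestamp.
    return State.UP_TO_DATE if mirror_ts >= master_ts else State.OUT_OF_DATE


def mirrors_to_probe(master_changed: bool, states: Dict[str, State]) -> List[str]:
    """Pick the mirrors to probe this hour.

    If the master changed, every mirror is reset to UNKNOWN and reprobed.
    Otherwise only mirrors that did not report up2date last time are probed.
    """
    if master_changed:
        for mirror in states:
            states[mirror] = State.UNKNOWN
        return sorted(states)
    return sorted(m for m, s in states.items() if s is not State.UP_TO_DATE)
```

After each probe returns, the caller would store classify(master_ts,
mirror_ts) back into the states table for that mirror.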

>>[dfarning at localhost m3d]$ time /home/dfarning/workspace/m3d/mirmon.pl -v
>>-get all -c development/core.conf -probes 25
>>real    13m12.098s
>>user    0m51.962s
>>sys     0m31.346s
>>
>>An update should be pretty quick because the master timestamp seldom
>>changes.
>>    
>>
>
>that 25 means 25 different mirrors?  How long does it take to do the
>whole set of mirrors assuming just x86? Thinking again about the
>unverified state for the web page summary... if it takes 15 minutes to
>work through a list of mirrors.. you want to make sure that for those
>15 minutes the webpage isnt misleading.
>
>  
>
A probe is a little chunk of C code (part of a repository tool kit that
I am working on) that returns the primary.xml timestamp for a given repo.

e.g. rtk_probe
ftp://ftp.linux.ncsu.edu/pub/fedora/linux/core//development/ returns
1118144219
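
To illustrate what a probe extracts, here is a rough Python stand-in
(this is not the rtk_probe C code). It assumes the repomd.xml text has
already been fetched, and that the primary.xml timestamp sits in a
<timestamp> element under <data type="primary">. The function name is
made up for this sketch.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def _local_name(tag: str) -> str:
    """Strip an XML namespace prefix such as '{uri}data' down to 'data'."""
    return tag.rsplit("}", 1)[-1]


def primary_timestamp(repomd_xml: str) -> Optional[int]:
    """Return the primary.xml timestamp recorded in repomd.xml, or None."""
    root = ET.fromstring(repomd_xml)
    for data in root:
        if _local_name(data.tag) != "data" or data.get("type") != "primary":
            continue
        for child in data:
            if _local_name(child.tag) == "timestamp" and child.text:
                return int(child.text.strip())
    return None
```

Matching on the local tag name keeps the sketch working whether or not
the file declares an XML namespace.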

Hmm, let's rerun for only one arch (x86), up the probes to 250, and
see the results on my DSL line.

It took 23 seconds to get all but the last 14 mirrors. Those reported
timeout failures after 360 seconds, as set in the probe.
Not bad.
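
The reason a handful of dead mirrors does not hold up the rest is that
the probes run in parallel. A minimal Python sketch of that idea follows;
it is not how mirmon.pl does it. The probe argument is a hypothetical
callable that returns a timestamp or raises on failure, and it is assumed
to enforce its own timeout (360 seconds in the run above).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Optional


def probe_all(
    urls: Iterable[str],
    probe: Callable[[str], int],
    max_probes: int = 250,
) -> Dict[str, Optional[int]]:
    """Probe every URL with up to `max_probes` probes running at once.

    Returns a mapping of URL to timestamp, or None where the probe failed.
    """
    results: Dict[str, Optional[int]] = {}
    with ThreadPoolExecutor(max_workers=max_probes) as pool:
        futures = {url: pool.submit(probe, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception:
                # A timeout or connection error marks the mirror as failed.
                results[url] = None
    return results
```

With enough parallel probes, the total wall-clock time is roughly the
time of the slowest probe, which is why the timeout setting dominates.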

How would you feel if I stated the time since the last probe, so users
are aware of how old the status is?

Also, the mirror master could (with the help of m3d) clean up the mirror 
list so not as much effort is spent contacting dead mirrors.


>-jef
>
>  
>
thanks
 -dtf



