InstantMirror needs a rethink

chasd chasd at silveroaks.com
Thu Jan 24 18:21:30 UTC 2008


> Today InstantMirror is pretty useful for home and small office mirrors,
> but its limitations make it unsustainable without manual intervention of
> the sysadmin.

I am using it now so our ~20 systems don't waste T-1 bandwidth.

> - Synchronization/locking of multiple connections downloading the same
> file is awkward and broken.

My usage is low-volume enough that I haven't run into that.

> - There is no good way to clean up aborted tmp files.

Haven't had any.

> - There is no good way to know what are old files that need pruning.

With disk space relatively cheap here in the USA, and a new Fedora  
every ~6 months, I just rm -rf the old release directories after I  
migrate to the new version. I don't worry about multiple updates to  
the same package, except for giant ones like OOo.

Another outgrowth of the Fedora release cycle is that I usually only
apply security updates, or updates that fix specific problems I
experience. There is no sense in my downloading and applying updates
for hardware I don't use, for example. I figure I'll pick up
application updates in 6 months when the next release drops; I usually
don't need the update _right now_.

I don't need an rsync of a mirror, just a cache of the updates I
choose to apply, because those specific updates will be applied across
multiple machines.

> - There is no good way of keeping track of the "Big Picture" of its own
> cache, "least recently used" knowing what files were unpopular locally
> and should be pruned.

I don't have a need for that functionality with my usage.

> Any thoughts?

Ignoring the temp file and multiple connection issues, the  
synchronization part could be solved by InstantMirror writing some  
type of log file or access popularity file. A separate cron script  
could read in that data and prune the unpopular / duplicate files.
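
A rough sketch of what that cron script might look like, in Python.
Everything specific here is made up for illustration: the log format
(one "unix-timestamp<TAB>path-relative-to-cache-root" line per request
served), the function names, and the age cutoff. It only shows the
idea of pruning files nobody has asked for recently.

```python
import os


def last_access_times(log_lines):
    """Map each cached path to the newest timestamp seen in the log.

    Assumed (hypothetical) line format: "<unix timestamp>\t<relative path>".
    """
    latest = {}
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        stamp, _, path = line.partition("\t")
        try:
            when = float(stamp)
        except ValueError:
            continue  # skip malformed lines rather than abort the run
        if path and when > latest.get(path, 0.0):
            latest[path] = when
    return latest


def stale_paths(latest, now, max_age_days):
    """Return the paths whose last logged access is older than the cutoff."""
    cutoff = now - max_age_days * 86400
    return sorted(p for p, when in latest.items() if when < cutoff)


def prune(cache_root, paths, dry_run=True):
    """Delete the given relative paths under cache_root.

    With dry_run=True nothing is deleted; the return value lists what
    would have been removed.
    """
    root = os.path.realpath(cache_root)
    removed = []
    for rel in paths:
        target = os.path.realpath(os.path.join(root, rel))
        # Refuse anything that resolves outside the cache tree.
        if not target.startswith(root + os.sep):
            continue
        if os.path.isfile(target):
            if not dry_run:
                os.remove(target)
            removed.append(rel)
    return removed
```

From cron one would read the log, call stale_paths() with time.time()
and a cutoff such as 180 days, and run prune() with dry_run=True a few
times before trusting it. Files that never appear in the log are left
alone by this sketch.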

From a separate message:

> 1) Origin HTTP mirrors can be configured to serve "Cache-Control:
> max-age=0" in HTTP headers whenever they serve repodata/* files.  This
> can become a standard recommendation for all Fedora mirrors.  Does
> anyone know how to configure Apache to do this?


<Directory /var/ftp/pub/fedora/linux/releases/8/Everything/x86_64/os/repodata>
Header always set Cache-Control: max-age=0
</Directory>

Probably the best way would be to put this in a .htaccess file for
each repodata directory as that directory is created ( createrepo ? ).
A .htaccess file applies to the directory it sits in, so it would hold
just the Header line, with no <Directory> block or full path. That
requires mod_headers to be loaded and AllowOverride FileInfo to be in
effect for that tree. Otherwise the main Apache config ( or a file in
conf.d ) would need to be updated / added each time a release is made
( or an arch is added ).
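
The per-directory .htaccess idea could be scripted roughly as below,
again in Python. This is only a sketch: the function name and the
idea of walking the whole tree after the fact (rather than hooking
createrepo) are my own assumptions, and it never overwrites a
.htaccess that is already there.

```python
import os

# Same directive as in the <Directory> example; inside .htaccess no
# <Directory> wrapper or path is needed.
HTACCESS_BODY = "Header always set Cache-Control: max-age=0\n"


def write_repodata_htaccess(tree_root):
    """Drop a .htaccess into every directory named "repodata" under tree_root.

    Existing .htaccess files are left untouched. Returns the sorted
    list of files that were newly written.
    """
    written = []
    for dirpath, dirnames, filenames in os.walk(tree_root):
        if os.path.basename(dirpath) != "repodata":
            continue
        target = os.path.join(dirpath, ".htaccess")
        if not os.path.exists(target):
            with open(target, "w") as handle:
                handle.write(HTACCESS_BODY)
            written.append(target)
    return sorted(written)
```

Run against the top of the mirror tree (for example from the same
cron job that syncs the mirror), this would pick up new releases and
arches without touching the main Apache config.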

> 2) Squid refresh_pattern can use a regex to override max-age=0 for
> repodata/* files.  I haven't figured out exactly what the syntax is for
> this.  Anybody know squid.conf?


refresh_pattern \/repodata\/.*		0	0%	0

> <hno> Apache do not have this same abstract internal layer, and writing
> a mod_disk_cache replacement which keeps a mirror type file structure
> should be pretty easy thing to do.

This seems to best leverage existing code / apps, although I am not  
in a position to help here.


Charles Dostale
System Admin - Silver Oaks Communications
http://www.silveroaks.com/
824 17th Street, Moline  IL  61265



