Yum, Proxy Cache Safety, Storage Backend

Les Mikesell lesmikesell at gmail.com
Thu Jan 24 14:05:23 UTC 2008


Warren Togami wrote:
> Les Mikesell wrote:
>>
>> Interesting, but it still requires custom setup for any distro/version 
>> that the proxy admin would want to support. What I'd really like to 
>> happen is for yum to just always prefer the same URL when working 
>> through the same proxy so caching would work by default without 
>> needing to be aware of the cache content.  This would work 
>> automatically if the target was a single site, RRDNS, or geo-ip 
>> managed DNS, but you probably can't arrange that for all the repo 
>> mirrors. There has to be some clever way to get the same effect even 
>> when using a mirrorlist - like making sure the mirrorlist itself is 
>> cached and always picking the same entry so any client will use the 
>> same URL that the mirrormanager gave to the first one that made a 
>> request.  Of course you'd need a reasonable retry mechanism to pick 
>> something else if this choice fails but I'd guess it would be a big 
>> win in bandwidth use and load on the mirrors if it worked most of the 
>> time to take advantage of existing local caches with no modifications.
>>
> 
> I just thought of a simple but gross solution for you.
> 
> http://mirrors.fedoraproject.org/mirrorlist?repo=fedora-$releasever&arch=$basearch 
> 
> 
> It sounds like you are using a transparent proxy.  Just redirect 
> mirrors.fedoraproject.org to localhost at another port and serve files 
> so the mirrorlist URLs hand back a single mirror of your choosing.

I think you are missing my point, which is that it would be a huge win 
if yum automatically worked with typical existing caching proxies with 
no extra setup on anyone's part. Any number of people behind such a 
proxy would then get the cached packages without knowing about each 
other, and without having to do something special to defeat the random 
URLs.  I used to run a number of CentOS 3 boxes in several locations, 
and it always worked nicely to just run:
http_proxy=http://my_proxy.domain:port yum update
pointing at a local squid. Because the mirrors used RRDNS, the URLs 
were the same on every machine. The same thing would have happened 
automatically with a transparent proxy, or on machines set to use a 
proxy by default, as they must be in many locations.  Since yum started 
randomizing the requests with a mirrorlist, updates are a lot slower, 
because each client is likely to request a different URL and miss the 
cache.
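
As a rough illustration of the "always pick the same entry, but retry 
if it fails" idea, here is a minimal Python sketch. It is not how yum 
works today; the function names (stable_order, fetch_with_fallback) and 
the caller-supplied fetch callable are made up for this example, and it 
assumes every client behind the proxy receives the same mirrorlist.

```python
import hashlib


def stable_order(mirrors):
    """Return the mirrors in an order that is identical on every client.

    Sorting by a hash of the URL is an arbitrary choice for this sketch;
    any ordering that depends only on the list contents would do.
    """
    def key(url):
        return hashlib.sha256(url.encode("utf-8")).hexdigest()
    return sorted(set(mirrors), key=key)


def fetch_with_fallback(mirrors, path, fetch):
    """Try each mirror in the stable order and return (url, data).

    `fetch` is a hypothetical callable that takes a URL and either
    returns the content or raises OSError (urllib's URLError is a
    subclass of OSError, so a urllib-based fetcher fits here).
    """
    last_error = None
    for base in stable_order(mirrors):
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return url, fetch(url)
        except OSError as err:
            last_error = err  # remember the failure, move to next mirror
    raise OSError("all mirrors failed") from last_error
```

With something like this, two clients given the same list would request 
the same URL first, so the second one could be served from the proxy 
cache. The sketch ignores load balancing across mirrors, which a real 
implementation would have to weigh against cache hits.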

Maybe yum needs to do some tricks with cache-control headers, or append 
random arguments, to ensure the repo data is fresh, but there has to be 
some way to make it reuse packages already downloaded into a local 
proxy cache without any local changes.  We have several locations where 
everyone in a large building has to use the same proxy to get out, but 
the people installing or updating their own Linux boxes would not know 
what anyone else is doing, and would be unlikely to coordinate the 
choice of a URL if they had to change anything. I'd guess that's a 
common situation.
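
To make the cache-control idea concrete, here is a small Python sketch 
using the standard urllib.request module. It only builds the requests; 
the build_request name and the "/repodata/" path test are assumptions 
for illustration, not yum's actual behavior. The thinking is that repo 
metadata should be revalidated through the proxy, while package files, 
whose names normally include the version, can be served from cache.

```python
import urllib.request


def build_request(url):
    """Build a request that forces revalidation only for repo metadata.

    A request header of "Cache-Control: no-cache" asks intermediate
    caches not to answer from a stored copy without checking with the
    origin server first. Package URLs get no such header, so a proxy
    is free to return its cached copy.
    """
    req = urllib.request.Request(url)
    if "/repodata/" in url:  # assumed marker for metadata paths
        req.add_header("Cache-Control", "no-cache")
    return req
```

The client would still honor http_proxy as usual; only the headers 
differ between metadata and package requests.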

-- 
   Les Mikesell
    lesmikesell at gmail.com





More information about the fedora-devel-list mailing list