[Pulp-list] pulp v1 vs pulp v2 rpm repo sync times

Justin Sherrill jsherril at redhat.com
Mon Feb 25 19:55:49 UTC 2013


On 02/25/2013 02:35 PM, Mike McCune wrote:
> On 02/25/2013 10:29 AM, Randy Barlow wrote:
>> On Mon, 25 Feb 2013, Jay Dobies wrote:
>>> Touch base with Randy. He tweaked a bunch of those numbers before v2
>>> released and should be able to point you to the best places to start
>>> playing with.
>>
>> I had done some lab testing with the --num-threads parameter in
>> December. I had learned that our Grinder code was using threads for CPU
>> intensive work during downloads. Due to the Python GIL, this was
>> actually causing the threads to thrash each other, which significantly
>> lowered performance for synchronization.
>>
>> For my test, I used traffic control to limit my bandwidth to 20 Mbps, 10
>> Mbps, and 1 Mbps between myself and a LAN-reachable CentOS repository,
>> and in all three cases I found that having one thread resulted in the
>> best performance. Due to this finding, I set the default number of
>> threads to 1. The --num-threads flag can be used to override the
>> default.
>>
>> One thing I did not simulate in my testing was network latency. If there
>> were high network latency, I would guess that adding more threads might
>> eventually lead to better performance, as more of them would be in a
>> waiting state instead of thrashing each other. I didn't have much time to
>> simulate this scenario, so if you find that adding threads improves
>> performance, it might be due to latency. I'd like to know if that does
>> help, as it would warrant another test.
>>
>> Thanks!
>>
>
> in the first test I mentioned in my initial post there was no real
> network latency, since it was all over my local GigE network.
>
> I repeated the above test with the addition of 4 threads and it went
> from ~3m -> 2m20s
>
> so for very low latency syncs:
>
> Pulp V1                       : 1m18s
> Pulp V2 with 4 threads        : 2m20s
> Pulp V2 default with 1 thread : 3m12s
>
> repeating my test with a larger network latency between the pulp
> server and the remote repo (roughly 500K/sec download speed, 100ms
> ping) and the difference actually gets much wider:
>
> high latency sync:
>
> Pulp V1                       : 5m10s
> Pulp V2 with 4 threads        : 7m27s
> Pulp V2 default with 1 thread : 25m18s
>
> so, by default it is *really* bad; with a bit of tuning it gets much
> better, but it is still slower than Pulp V1
>
> Mike
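
The latency point made in the quoted messages can be illustrated with a small sketch. This is not Pulp or Grinder code: fake_download, timed_sync, LATENCY_S and NUM_FILES are hypothetical names, and time.sleep stands in for waiting on the network (it releases the GIL, so waiting threads do not contend with each other). It only shows why extra threads can help when a sync is dominated by waiting rather than by CPU work.

```python
import time
from concurrent.futures import ThreadPoolExecutor

LATENCY_S = 0.05  # hypothetical per-request wait, standing in for network latency
NUM_FILES = 8     # hypothetical number of packages to fetch


def fake_download(index):
    # time.sleep releases the GIL, much like a socket blocked on the network.
    time.sleep(LATENCY_S)
    return index


def timed_sync(num_threads):
    """Run all fake downloads on num_threads workers; return (seconds, results)."""
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        results = list(pool.map(fake_download, range(NUM_FILES)))
    return time.perf_counter() - start, results


if __name__ == "__main__":
    for n in (1, 4):
        elapsed, _ = timed_sync(n)
        print(f"{n} thread(s): {elapsed:.2f}s")
```

Under these assumptions the 4-worker run should finish in roughly a quarter of the 1-worker time. CPU-bound work under the GIL would not be expected to show the same gain, which is consistent with the thrashing described earlier in the thread.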

To give an example of a larger repo with latency, the RHEL 6Server
x86_64 repo:

Pulp V1 (I believe 4 threads) : 108 minutes
Pulp V2 with 4 threads        : 192 minutes
Pulp V2 with 1 thread         : 300 minutes

-Justin
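
To put the three sets of numbers in this thread on one scale, the sketch below converts each timing to seconds and divides it by the Pulp V1 time for the same scenario. The timings are copied from the messages above; the names to_seconds, TIMINGS and slowdown_vs_v1 are made up for illustration, and the V1 thread count for the RHEL repo is only believed to be 4.

```python
def to_seconds(minutes, seconds=0):
    """Convert a minutes/seconds timing to seconds."""
    return minutes * 60 + seconds


# Timings as reported in this thread, in seconds.
TIMINGS = {
    "low latency": {
        "v1": to_seconds(1, 18),
        "v2_4_threads": to_seconds(2, 20),
        "v2_1_thread": to_seconds(3, 12),
    },
    "high latency": {
        "v1": to_seconds(5, 10),
        "v2_4_threads": to_seconds(7, 27),
        "v2_1_thread": to_seconds(25, 18),
    },
    "RHEL 6Server x86_64": {
        "v1": to_seconds(108),
        "v2_4_threads": to_seconds(192),
        "v2_1_thread": to_seconds(300),
    },
}


def slowdown_vs_v1(scenario):
    """Return each timing divided by the Pulp V1 timing for that scenario."""
    baseline = scenario["v1"]
    return {name: t / baseline for name, t in scenario.items()}


if __name__ == "__main__":
    for label, scenario in TIMINGS.items():
        for name, ratio in slowdown_vs_v1(scenario).items():
            print(f"{label:22s} {name:13s} {ratio:.2f}x")
```

On these numbers, V2 with 4 threads comes out roughly 1.4x to 1.8x slower than V1, and V2 with 1 thread roughly 2.5x to 4.9x slower.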



