[Pulp-dev] Performance while syncing very large repositories

Lubos Mjachky lmjachky at redhat.com
Fri Feb 28 10:58:20 UTC 2020


Dear colleagues,

I am currently working on issue https://pulp.plan.io/issues/6121. It
was reported that syncing very large repositories takes much longer than
it did in Pulp 2.

I profiled the code and found that we repeatedly fetch data from the
database in a loop and manually exclude units that should not be added
to a repository mirrored by Pulp.

Today, I submitted a PR (https://github.com/pulp/pulpcore/pull/565) that
may resolve this issue. It replaces the aforesaid loop with a single
database call which, I believe, does the same thing. Please see the latest
note, https://pulp.plan.io/issues/6121#note-11, to better understand my
findings. Also, do not hesitate to review the submitted PR.

Thank you!