[Pulp-list] Sometimes 'purge_duplicates' takes an abnormal amount of time

Sean Myers sean.myers at redhat.com
Mon Nov 21 23:00:28 UTC 2016

On 11/21/2016 05:52 PM, Eric Helms wrote:
> On Mon, Nov 21, 2016 at 4:03 PM, Sean Myers <sean.myers at redhat.com> wrote:
>> On 11/21/2016 03:29 PM, Eric Helms wrote:
>>> Hi,
>>> I've noticed that at times, during a sync, the 'purge_duplicates' step
>>> seems to take what I would consider an abnormal amount of time. A sync
>>> that usually takes 5 or so minutes can suddenly take an hour and it seems
>>> to just sit at this particular step. Since Pulp progress reports do not
>>> contain timing data [1] I cannot give hard numbers on this. Is there
>>> anything that might explain this behavior? Is this a valid or known bug?
>>> I am happy to provide as much data and information as I can.
>> I don't know of issues where this step just pauses.
>> What version(s) of pulp/mongodb is this running on? What distribution(s)?
> 2.8.7 on mongodb 2.6.11
>> Is there a particular feed that reliably causes this?
> I wish. This does not happen all of the time, which makes it hard to zero
> in on the problem. We experience high IO time at some points, and I have
> noticed that during these times, when we do large queries to Pulp and thus
> mongodb, everything slows down. I am working to get some better hard
> evidence by adding cProfiles.

cProfile stuff would be gravy, but I suspect that a pulp developer with a
fresh take on this would be able to take my work from the original effort
to speed up the purging of duplicate nevra and find some obvious things to
optimize. This is a new Redmine issue that should be filed in Pulp's
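
For the cProfile evidence discussed above, a minimal sketch of how a single
step could be wrapped to capture both wall-clock time and a profile report.
Everything named here is an assumption for illustration: `profile_call` is a
hypothetical helper, and `fake_purge_duplicates` is a stand-in, not Pulp's
actual purge code or API. It is written for Python 3; the Pulp version in
this thread may need adjustments.

```python
import cProfile
import io
import pstats
import time


def profile_call(func, *args, **kwargs):
    """Run func under cProfile.

    Returns (result, elapsed_seconds, report_text). Hypothetical helper,
    not part of Pulp.
    """
    profiler = cProfile.Profile()
    start = time.perf_counter()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        # Always stop profiling, even if the wrapped call raises.
        profiler.disable()
    elapsed = time.perf_counter() - start

    # Render the ten most expensive entries by cumulative time into a string
    # so the report can be logged alongside the sync's progress report.
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(10)
    return result, elapsed, stream.getvalue()


def fake_purge_duplicates(units):
    """Stand-in for the real step (assumption): drop repeated unit keys."""
    seen = set()
    kept = []
    for unit in units:
        if unit not in seen:
            seen.add(unit)
            kept.append(unit)
    return kept


if __name__ == "__main__":
    units = [("foo", "1.0"), ("foo", "1.0"), ("bar", "2.0")]
    result, elapsed, report = profile_call(fake_purge_duplicates, units)
    print("kept %d units in %.6f seconds" % (len(result), elapsed))
    print(report)
```

Logging the elapsed time on every sync and keeping the full report only for
slow runs would give hard numbers for the intermittent case without filling
the logs during normal syncs.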
