<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<p>Looks like it was the OOM killer killing the task. Thanks for
pointing me in the right direction, Brian!<br>
</p>
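<p>For anyone who finds this thread later, here is a minimal sketch of
the kind of check that confirms an OOM kill. It assumes the kernel log
has been saved as text (for example from <code>journalctl -k</code> or
<code>dmesg</code>). The function name <code>find_oom_kills</code> and
the sample lines are illustrative only and are not taken from my logs;
the exact wording of the kernel message can vary between kernel
versions.</p>
<pre>
```python
import re

# Matches the kernel's OOM-killer report line, e.g.
# "Out of memory: Kill process 27776 (python) score 912 or sacrifice child"
# Newer kernels print "Killed" instead of "Kill", so both are accepted.
OOM_LINE = re.compile(r"Out of memory: Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return (pid, command) pairs for every OOM kill found in log_text."""
    return [(int(pid), comm) for pid, comm in OOM_LINE.findall(log_text)]


# Illustrative sample only; not taken from the logs in this thread.
sample = (
    "Dec 12 10:48:19 host kernel: Out of memory: Kill process 27776 (python) "
    "score 912 or sacrifice child\n"
    "Dec 12 10:48:19 host kernel: Killed process 27776 (python) "
    "total-vm:1kB, anon-rss:1kB, file-rss:0kB\n"
)
print(find_oom_kills(sample))
```
</pre>
<p>If the PID reported there matches the PID of the worker child that
exited with signal 9, the OOM killer is the likely cause.</p>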
<pre class="moz-signature" cols="72">
David Gersting
Linux Systems Administrator
WVU Information Technology Services
</pre>
<div class="moz-cite-prefix">On 12/12/16 1:13 PM, Brian Bouterse
wrote:<br>
</div>
<blockquote
cite="mid:CAAcvrTHuvSowXuRJ+xzJYNMtK9OT=xY-j2+EN3Rn2hshtSyz7A@mail.gmail.com"
type="cite">
<div dir="ltr">
<div>I think the celery worker is either hitting a segfault or
being killed by the OOM killer. If the OOM killer is responsible,
there will be evidence in the logs. A segfault in pure Python is
unlikely, so if that is the cause it is probably happening while
calling out to the system using subprocess, which Pulp does in
various places. I haven't looked through the publish code of
platform and rpm for subprocess usage, but that would
probably hint at the problem. To really debug something like
that you would want to capture a coredump. I think celery has the
ability to capture coredumps, but I've never done it.<br>
<br>
</div>
<div>The pulp-smash tests for publish showed they were working.
Could this be an environment issue? Is it
possible to reproduce the issue on separate hardware to rule
that out? If it is reproducible, I recommend opening a bug
[0].<br>
</div>
<div><br>
[0]: <a moz-do-not-send="true"
href="https://pulp.plan.io/projects/pulp/issues/new"
target="_blank">https://pulp.plan.io/projects/<wbr>pulp/issues/new</a><br>
<br>
</div>
-Brian<br>
</div>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Mon, Dec 12, 2016 at 11:49 AM, David
Gersting <span dir="ltr">&lt;<a moz-do-not-send="true"
href="mailto:dgersting@systems.wvu.edu" target="_blank">dgersting@systems.wvu.edu</a>&gt;</span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">Hello
everyone,<br>
<br>
I've been banging my head against the desk for a while on
this one, and<br>
could use the group's help.<br>
<br>
I have a rather large repo (OEL 6's base repo with 36,684
RPMs) that I'm<br>
trying to mirror locally to speed up our OS patching, and
every time I<br>
try to publish the repo the task fails just after the
"Publishing Delta<br>
RPMs" step starts. After some digging it seems to me that
the worker is<br>
timing out. Has anyone else seen this and/or know how I can
fix it or<br>
increase the timeout for this task?<br>
<br>
I've attached the full shell output for anyone who wants it,
but the<br>
error message I'm seeing from the worker is:<br>
# journalctl --unit=pulp_worker-5<br>
*SNIP*<br>
Dec 12 10:48:19 *HOSTNAME* pulp[1403]:
celery.worker.job:ERROR:<br>
(1403-27776) Task<br>
pulp.server.managers.repo.<wbr>publish.publish[e3d25854-757c-<wbr>40af-8979-d0b7287263ed]<br>
raised unexpected: WorkerLostError('Worker exited
prematurely: signal 9<br>
(SIGKILL).',)<br>
Dec 12 10:48:19 *HOSTNAME* pulp[1403]:
celery.worker.job:ERROR:<br>
(1403-27776) Traceback (most recent call last):<br>
Dec 12 10:48:19 *HOSTNAME* pulp[1403]:
celery.worker.job:ERROR:<br>
(1403-27776) File<br>
"/usr/lib64/python2.7/site-<wbr>packages/billiard/pool.py",
line 1171, in<br>
mark_as_worker_lost<br>
Dec 12 10:48:19 *HOSTNAME* pulp[1403]:
celery.worker.job:ERROR:<br>
(1403-27776) human_status(exitcode)),<br>
Dec 12 10:48:19 *HOSTNAME* pulp[1403]:
celery.worker.job:ERROR:<br>
(1403-27776) WorkerLostError: Worker exited prematurely:
signal 9 (SIGKILL).<br>
Dec 12 10:48:21 *HOSTNAME* pulp[49191]:
py.warnings:WARNING:<br>
(49191-27776) /usr/lib64/python2.7/site-<wbr>packages/pymongo/topology.py:<wbr>74:<br>
UserWarning: MongoClient opened before fork. Create
MongoClient with<br>
connect=False, or create client after forking. Se<br>
Dec 12 10:48:21 *HOSTNAME* pulp[49191]:
py.warnings:WARNING:<br>
(49191-27776) "MongoClient opened before fork. Create
MongoClient "<br>
Dec 12 10:48:21 *HOSTNAME* pulp[49191]:
py.warnings:WARNING:<br>
(49191-27776)<br>
Dec 12 10:48:22 *HOSTNAME* pulp[49191]:<br>
pulp.server.async.tasks:INFO: Task failed :<br>
[e3d25854-757c-40af-8979-<wbr>d0b7287263ed]<br>
<br>
<br>
<br>
Any help would be much appreciated!<br>
<span class="HOEnZb"><font color="#888888"><br>
--<br>
David Gersting<br>
Linux Systems Administrator<br>
WVU Information Technology Services<br>
<br>
</font></span><br>
</blockquote>
</div>
<br>
</div>
</blockquote>
<br>
</body>
</html>