<div dir="ltr"><a class="gmail_plusreply" id="plusReplyChip-0" href="mailto:dellweg@atix.de" tabindex="-1">@Matthias Dellweg</a><div> did you restart Pulp after upgrading redis-py?   Are you seeing this on fresh boxes, or does it require some modification to reproduce?  I don't think any of us have experienced this thus far.<br></div><div><br></div><div>The only thing I can think of is that maybe when the worker process forks it ends up using a different version of redis-py than the parent worker.<br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jan 21, 2020 at 4:49 AM Matthias Dellweg <<a href="mailto:dellweg@atix.de">dellweg@atix.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">[@ Brian:<br>
<a href="https://github.com/pulp/pulpcore/commit/e36e7b5f0eccc176a6e6298df29293b014f4710c" rel="noreferrer" target="_blank">https://github.com/pulp/pulpcore/commit/e36e7b5f0eccc176a6e6298df29293b014f4710c</a>]<br>
<br>
Hi Daniel,<br>
thank you for looking into this.<br>
What i am seeing is:<br>
<br>
Jan 21 09:23:10 <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a><br>
gunicorn[23274]: 127.0.0.1 - admin [21/Jan/2020:09:23:10 +0000] "PATCH<br>
/pulp/api/v3/repositories/file/file/3a31ed13-585d-4a36-8398-7df40560ffa4/<br>
HTTP/1.1" 202 67 "-" "python-requests/2.22.0" Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: pulp:<br>
rq.worker:INFO: <a href="mailto:23269@pulp3-source-fedora30.anubis.example.com" target="_blank">23269@pulp3-source-fedora30.anubis.example.com</a>:<br>
3a776d4d-ff0c-4b44-afae-08dc7c6cc415 Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23270]: pulp:<br>
rq.worker:INFO: resource-manager: Job OK<br>
(aa22aed0-0363-45ae-8254-4ab14893a4a1) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: pulp:<br>
rq.worker:ERROR: Worker<br>
<a href="mailto:rq%3Aworker%3A23269@pulp3-source-fedora30.anubis.example.com" target="_blank">rq:worker:23269@pulp3-source-fedora30.anubis.example.com</a>: found an<br>
unhandled exception, quitting... Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: Traceback (most<br>
recent call last): Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line<br>
782, in prepare_job_execution Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
pipeline.execute() Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py",<br>
line 3707, in execute Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:     self.reset()<br>
Jan 21 09:23:10 <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py",<br>
line 3476, in reset Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
self.connection_pool.release(self.connection) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/connection.py",<br>
line 1114, in release Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
self._in_use_connections.remove(connection) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: KeyError:<br>
Connection<host=localhost,port=6379,db=0> Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: During handling of<br>
the above exception, another exception occurred: Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: Traceback (most<br>
recent call last): Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line<br>
515, in work Jan 21 09:23:10 <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a><br>
rq[23269]:     self.execute_job(job, queue) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/home/vagrant/devel/pulpcore/pulpcore/tasking/worker.py", line 72, in<br>
execute_job Jan 21 09:23:10 <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a><br>
rq[23269]:     super().execute_job(*args, **kwargs) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line<br>
727, in execute_job Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
self.fork_work_horse(job, queue) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line<br>
667, in fork_work_horse Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
self.main_work_horse(job, queue) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line<br>
744, in main_work_horse Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:     raise e Jan 21<br>
09:23:10 <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line<br>
741, in main_work_horse Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
self.perform_job(job, queue) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/home/vagrant/devel/pulpcore/pulpcore/tasking/worker.py", line 103, in<br>
perform_job Jan 21 09:23:10 <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a><br>
rq[23269]:     return super().perform_job(job, queue) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line<br>
866, in perform_job Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
self.prepare_job_execution(job, heartbeat_ttl) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line<br>
782, in prepare_job_execution Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
pipeline.execute() Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py",<br>
line 3445, in __exit__ Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:     self.reset()<br>
Jan 21 09:23:10 <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py",<br>
line 3476, in reset Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
self.connection_pool.release(self.connection) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
"/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/connection.py",<br>
line 1114, in release Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
self._in_use_connections.remove(connection) Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: KeyError:<br>
Connection<host=localhost,port=6379,db=0> Jan 21 09:23:10<br>
<a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> gunicorn[23274]: 127.0.0.1 -<br>
admin [21/Jan/2020:09:23:10 +0000] "GET<br>
/pulp/api/v3/tasks/3a776d4d-ff0c-4b44-afae-08dc7c6cc415/ HTTP/1.1" 200<br>
477 "-" "python-requests/2.22.0"<br>
<br>
and that last line repeats forever.<br>
<br>
On Fri, 17 Jan 2020 09:09:01 -0500<br>
Daniel Alley <<a href="mailto:dalley@redhat.com" target="_blank">dalley@redhat.com</a>> wrote:<br>
<br>
> Different issue perhaps?  Are you seeing anything in the logs that<br>
> looks like this? <a href="https://github.com/rq/rq/issues/1044" rel="noreferrer" target="_blank">https://github.com/rq/rq/issues/1044</a><br>
> <br>
> On Fri, Jan 17, 2020 at 9:06 AM Daniel Alley <<a href="mailto:dalley@redhat.com" target="_blank">dalley@redhat.com</a>><br>
> wrote:<br>
> <br>
> > Strange, I'm pretty sure an issue like this is why we pinned<br>
> > originally, but upstream said that that particular issue was<br>
> > (supposedly) fixed.<br>
> ><br>
> > <a href="https://github.com/andymccurdy/redis-py/issues/1136#issuecomment-571168161" rel="noreferrer" target="_blank">https://github.com/andymccurdy/redis-py/issues/1136#issuecomment-571168161</a><br>
> ><br>
> > On Fri, Jan 17, 2020 at 4:15 AM Matthias Dellweg <<a href="mailto:dellweg@atix.de" target="_blank">dellweg@atix.de</a>><br>
> > wrote: <br>
> >> Hello all,<br>
> >> I believe i have  found a new incarnation of hanging tasks (tm).<br>
> >> This time it is pulp3 and as hard to nail down as ever.<br>
> >> I think, it is introduced by<br>
> >> e36e7b5f0eccc176a6e6298df29293b014f4710c. Where the dependency on<br>
> >> redis was dropped with the result that 3.3.smth instead of<br>
> >> 3.1.smth was installed.<br>
> >><br>
> >> Before filing an issue, is there anyone out there to share that<br>
> >> experience?<br>
> >><br>
> >> Also as a thought protocol of how to reproduce:<br>
> >> I have seen tasks hanging in both "waiting" and "running" state<br>
> >> when using the command `prestart; django-admin test pulp_deb` or<br>
> >> `<...> test pulpcore`. All the tasks I have seen were `sync`,<br>
> >> `general_create` or `general_delete` and looked like they never<br>
> >> started to do anything for real. To have consistent results, i had<br>
> >> the impression that i needed to rebuild the vagrant boxes for<br>
> >> every bisecting step. Also updating the python-redis package on a<br>
> >> box that worked, produced a hanging task in the next run.<br>
> >><br>
> >> Have a good day,<br>
> >>   Matthias<br>
</blockquote></div>