<div dir="ltr">4 runs on Fedora 31 has not triggered it either. This is using the latest version of redis-py, 3.3.11, and redis 5.0.7.<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jan 21, 2020 at 11:16 AM Daniel Alley <<a href="mailto:dalley@redhat.com" target="_blank">dalley@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">So far I've run it 4 times back to back on a CentOS 7 box and not had any lockups.  I'll try Fedora<br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jan 21, 2020 at 9:36 AM Matthias Dellweg <<a href="mailto:dellweg@atix.de" target="_blank">dellweg@atix.de</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">@Daniel<br>
It is happening with fresh installations (pulplift with libvirt) on<br>
both boxes i have tried (pulp3-source-debian10 and<br>
pulp3-source-fedora30). To test, i run `prestart; django-admin test<br>
pulpcore` multiple times until it hangs (though i have seen it hang on<br>
the first try). There is no modification that i know of. Also, up until<br>
now, i seem to be able to reproduce reliably, while downgrading<br>
(pinning) redis-py completely solves the issue.<br>
<br>
On Tue, 21 Jan 2020 09:16:43 -0500<br>
Daniel Alley <<a href="mailto:dalley@redhat.com" target="_blank">dalley@redhat.com</a>> wrote:<br>
<br>
> @Matthias Dellweg <<a href="mailto:dellweg@atix.de" target="_blank">dellweg@atix.de</a>><br>
> did you restart Pulp after upgrading redis-py?  Are you seeing this on<br>
> fresh boxes, or does it require some modification to reproduce?  I<br>
> don't think any of us have experienced this thus far.<br>
> <br>
> The only thing I can think of is that maybe when the worker process<br>
> forks it ends up using a different version of redis-py than the<br>
> parent worker.<br>
> <br>
> On Tue, Jan 21, 2020 at 4:49 AM Matthias Dellweg <<a href="mailto:dellweg@atix.de" target="_blank">dellweg@atix.de</a>><br>
> wrote:<br>
> <br>
> > [@ Brian:<br>
> ><br>
> > <a href="https://github.com/pulp/pulpcore/commit/e36e7b5f0eccc176a6e6298df29293b014f4710c" rel="noreferrer" target="_blank">https://github.com/pulp/pulpcore/commit/e36e7b5f0eccc176a6e6298df29293b014f4710c</a><br>
> > ]<br>
> ><br>
> > Hi Daniel,<br>
> > thank you for looking into this.<br>
> > What i am seeing is:<br>
> ><br>
> > Jan 21 09:23:10 <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a><br>
> > gunicorn[23274]: 127.0.0.1 - admin [21/Jan/2020:09:23:10 +0000]<br>
> > "PATCH<br>
> > /pulp/api/v3/repositories/file/file/3a31ed13-585d-4a36-8398-7df40560ffa4/<br>
> > HTTP/1.1" 202 67 "-" "python-requests/2.22.0" Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: pulp:<br>
> > rq.worker:INFO: <a href="mailto:23269@pulp3-source-fedora30.anubis.example.com" target="_blank">23269@pulp3-source-fedora30.anubis.example.com</a>:<br>
> > 3a776d4d-ff0c-4b44-afae-08dc7c6cc415 Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23270]: pulp:<br>
> > rq.worker:INFO: resource-manager: Job OK<br>
> > (aa22aed0-0363-45ae-8254-4ab14893a4a1) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: pulp:<br>
> > rq.worker:ERROR: Worker<br>
> > <a href="mailto:rq%3Aworker%3A23269@pulp3-source-fedora30.anubis.example.com" target="_blank">rq:worker:23269@pulp3-source-fedora30.anubis.example.com</a>: found an<br>
> > unhandled exception, quitting... Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: Traceback (most<br>
> > recent call last): Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py",<br>
> > line 782, in prepare_job_execution Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > pipeline.execute() Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py",<br>
> > line 3707, in execute Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self.reset() Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py",<br>
> > line 3476, in reset Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self.connection_pool.release(self.connection) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/connection.py",<br>
> > line 1114, in release Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self._in_use_connections.remove(connection) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: KeyError:<br>
> > Connection<host=localhost,port=6379,db=0> Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: During handling<br>
> > of the above exception, another exception occurred: Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: Traceback (most<br>
> > recent call last): Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py",<br>
> > line 515, in work Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self.execute_job(job, queue) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/home/vagrant/devel/pulpcore/pulpcore/tasking/worker.py", line 72,<br>
> > in execute_job Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > super().execute_job(*args, **kwargs) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py",<br>
> > line 727, in execute_job Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self.fork_work_horse(job, queue) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py",<br>
> > line 667, in fork_work_horse Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self.main_work_horse(job, queue) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py",<br>
> > line 744, in main_work_horse Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:     raise e Jan<br>
> > 21 09:23:10 <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py",<br>
> > line 741, in main_work_horse Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self.perform_job(job, queue) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/home/vagrant/devel/pulpcore/pulpcore/tasking/worker.py", line<br>
> > 103, in perform_job Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:     return<br>
> > super().perform_job(job, queue) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py",<br>
> > line 866, in perform_job Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self.prepare_job_execution(job, heartbeat_ttl) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py",<br>
> > line 782, in prepare_job_execution Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > pipeline.execute() Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py",<br>
> > line 3445, in __exit__ Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self.reset() Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py",<br>
> > line 3476, in reset Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self.connection_pool.release(self.connection) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:   File<br>
> > "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/connection.py",<br>
> > line 1114, in release Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]:<br>
> > self._in_use_connections.remove(connection) Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> rq[23269]: KeyError:<br>
> > Connection<host=localhost,port=6379,db=0> Jan 21 09:23:10<br>
> > <a href="http://pulp3-source-fedora30.anubis.example.com" rel="noreferrer" target="_blank">pulp3-source-fedora30.anubis.example.com</a> gunicorn[23274]: 127.0.0.1<br>
> > - admin [21/Jan/2020:09:23:10 +0000] "GET<br>
> > /pulp/api/v3/tasks/3a776d4d-ff0c-4b44-afae-08dc7c6cc415/ HTTP/1.1"<br>
> > 200 477 "-" "python-requests/2.22.0"<br>
> ><br>
> > and that last line repeats forever.<br>
> ><br>
> > On Fri, 17 Jan 2020 09:09:01 -0500<br>
> > Daniel Alley <<a href="mailto:dalley@redhat.com" target="_blank">dalley@redhat.com</a>> wrote:<br>
> >  <br>
> > > Different issue perhaps?  Are you seeing anything in the logs that<br>
> > > looks like this? <a href="https://github.com/rq/rq/issues/1044" rel="noreferrer" target="_blank">https://github.com/rq/rq/issues/1044</a><br>
> > ><br>
> > > On Fri, Jan 17, 2020 at 9:06 AM Daniel Alley <<a href="mailto:dalley@redhat.com" target="_blank">dalley@redhat.com</a>><br>
> > > wrote:<br>
> > >  <br>
> > > > Strange, I'm pretty sure an issue like this is why we pinned<br>
> > > > originally, but upstream said that that particular issue was<br>
> > > > (supposedly) fixed.<br>
> > > ><br>
> > > >  <br>
> > <a href="https://github.com/andymccurdy/redis-py/issues/1136#issuecomment-571168161" rel="noreferrer" target="_blank">https://github.com/andymccurdy/redis-py/issues/1136#issuecomment-571168161</a><br>
> >  <br>
> > > ><br>
> > > > On Fri, Jan 17, 2020 at 4:15 AM Matthias Dellweg<br>
> > > > <<a href="mailto:dellweg@atix.de" target="_blank">dellweg@atix.de</a>> wrote:  <br>
> > > >> Hello all,<br>
> > > >> I believe i have  found a new incarnation of hanging tasks<br>
> > > >> (tm). This time it is pulp3 and as hard to nail down as ever.<br>
> > > >> I think, it is introduced by<br>
> > > >> e36e7b5f0eccc176a6e6298df29293b014f4710c. Where the dependency<br>
> > > >> on redis was dropped with the result that 3.3.smth instead of<br>
> > > >> 3.1.smth was installed.<br>
> > > >><br>
> > > >> Before filing an issue, is there anyone out there to share that<br>
> > > >> experience?<br>
> > > >><br>
> > > >> Also as a thought protocol of how to reproduce:<br>
> > > >> I have seen tasks hanging in both "waiting" and "running" state<br>
> > > >> when using the command `prestart; django-admin test pulp_deb`<br>
> > > >> or `<...> test pulpcore`. All the tasks I have seen were<br>
> > > >> `sync`, `general_create` or `general_delete` and looked like<br>
> > > >> they never started to do anything for real. To have consistent<br>
> > > >> results, i had the impression that i needed to rebuild the<br>
> > > >> vagrant boxes for every bisecting step. Also updating the<br>
> > > >> python-redis package on a box that worked, produced a hanging<br>
> > > >> task in the next run.<br>
> > > >><br>
> > > >> Have a good day,<br>
> > > >>   Matthias<br>
</blockquote></div>
</blockquote></div>