[Pulp-dev] Hanging tasks (reloaded)

Daniel Alley dalley at redhat.com
Tue Jan 21 16:16:43 UTC 2020


So far I've run it 4 times back to back on a CentOS 7 box and not had any
lockups.  I'll try Fedora next.
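
Running the suite back to back can be automated so that a hang is detected instead of waited on. The following is only a sketch: the helper name `run_until_hang`, the number of attempts and the timeout are assumptions for illustration, not part of Pulp or pulplift. The `prestart` step from the quoted command is left out and would have to be run separately.

```python
import subprocess
import sys


def run_until_hang(cmd, attempts=10, timeout=600):
    """Run cmd up to `attempts` times in a row.

    Returns the 1-based number of the first attempt that exceeded
    `timeout` seconds (treated here as a hang), or None if every
    attempt finished in time. The exit status of cmd is ignored.
    """
    for attempt in range(1, attempts + 1):
        try:
            # subprocess.run kills the child itself when the timeout expires.
            subprocess.run(cmd, timeout=timeout, check=False)
        except subprocess.TimeoutExpired:
            return attempt
    return None
```

A call such as `run_until_hang(["django-admin", "test", "pulpcore"])` would then report which run, if any, locked up. Note that only the direct child process is killed on timeout; worker processes started elsewhere are left alone.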

On Tue, Jan 21, 2020 at 9:36 AM Matthias Dellweg <dellweg at atix.de> wrote:

> @Daniel
> It is happening with fresh installations (pulplift with libvirt) on
> both boxes I have tried (pulp3-source-debian10 and
> pulp3-source-fedora30). To test, I run `prestart; django-admin test
> pulpcore` multiple times until it hangs (though I have seen it hang on
> the first try). There is no modification that I know of. Also, so far
> I seem to be able to reproduce it reliably, while downgrading
> (pinning) redis-py completely solves the issue.
>
> On Tue, 21 Jan 2020 09:16:43 -0500
> Daniel Alley <dalley at redhat.com> wrote:
>
> > @Matthias Dellweg <dellweg at atix.de>
> > did you restart Pulp after upgrading redis-py?  Are you seeing this on
> > fresh boxes, or does it require some modification to reproduce?  I
> > don't think any of us have experienced this thus far.
> >
> > The only thing I can think of is that maybe when the worker process
> > forks it ends up using a different version of redis-py than the
> > parent worker.
> >
> > On Tue, Jan 21, 2020 at 4:49 AM Matthias Dellweg <dellweg at atix.de>
> > wrote:
> >
> > > [@ Brian:
> > > https://github.com/pulp/pulpcore/commit/e36e7b5f0eccc176a6e6298df29293b014f4710c
> > > ]
> > >
> > > Hi Daniel,
> > > thank you for looking into this.
> > > What I am seeing is:
> > >
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com gunicorn[23274]: 127.0.0.1 - admin [21/Jan/2020:09:23:10 +0000] "PATCH /pulp/api/v3/repositories/file/file/3a31ed13-585d-4a36-8398-7df40560ffa4/ HTTP/1.1" 202 67 "-" "python-requests/2.22.0"
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]: pulp: rq.worker:INFO: 23269 at pulp3-source-fedora30.anubis.example.com: 3a776d4d-ff0c-4b44-afae-08dc7c6cc415
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23270]: pulp: rq.worker:INFO: resource-manager: Job OK (aa22aed0-0363-45ae-8254-4ab14893a4a1)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]: pulp: rq.worker:ERROR: Worker rq:worker:23269 at pulp3-source-fedora30.anubis.example.com: found an unhandled exception, quitting...
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]: Traceback (most recent call last):
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 782, in prepare_job_execution
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     pipeline.execute()
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py", line 3707, in execute
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self.reset()
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py", line 3476, in reset
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self.connection_pool.release(self.connection)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/connection.py", line 1114, in release
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self._in_use_connections.remove(connection)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]: KeyError: Connection<host=localhost,port=6379,db=0>
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]: During handling of the above exception, another exception occurred:
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]: Traceback (most recent call last):
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 515, in work
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self.execute_job(job, queue)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/home/vagrant/devel/pulpcore/pulpcore/tasking/worker.py", line 72, in execute_job
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     super().execute_job(*args, **kwargs)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 727, in execute_job
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self.fork_work_horse(job, queue)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 667, in fork_work_horse
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self.main_work_horse(job, queue)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 744, in main_work_horse
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     raise e
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 741, in main_work_horse
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self.perform_job(job, queue)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/home/vagrant/devel/pulpcore/pulpcore/tasking/worker.py", line 103, in perform_job
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     return super().perform_job(job, queue)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 866, in perform_job
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self.prepare_job_execution(job, heartbeat_ttl)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/rq/worker.py", line 782, in prepare_job_execution
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     pipeline.execute()
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py", line 3445, in __exit__
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self.reset()
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/client.py", line 3476, in reset
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self.connection_pool.release(self.connection)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:   File "/usr/local/lib/pulp/lib64/python3.7/site-packages/redis/connection.py", line 1114, in release
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]:     self._in_use_connections.remove(connection)
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com rq[23269]: KeyError: Connection<host=localhost,port=6379,db=0>
> > > Jan 21 09:23:10 pulp3-source-fedora30.anubis.example.com gunicorn[23274]: 127.0.0.1 - admin [21/Jan/2020:09:23:10 +0000] "GET /pulp/api/v3/tasks/3a776d4d-ff0c-4b44-afae-08dc7c6cc415/ HTTP/1.1" 200 477 "-" "python-requests/2.22.0"
> > >
> > > and that last line repeats forever.
> > >
> > > On Fri, 17 Jan 2020 09:09:01 -0500
> > > Daniel Alley <dalley at redhat.com> wrote:
> > >
> > > > Different issue perhaps?  Are you seeing anything in the logs that
> > > > looks like this? https://github.com/rq/rq/issues/1044
> > > >
> > > > On Fri, Jan 17, 2020 at 9:06 AM Daniel Alley <dalley at redhat.com>
> > > > wrote:
> > > >
> > > > > Strange, I'm pretty sure an issue like this is why we pinned
> > > > > originally, but upstream said that that particular issue was
> > > > > (supposedly) fixed.
> > > > >
> > > > >
> > >
> https://github.com/andymccurdy/redis-py/issues/1136#issuecomment-571168161
> > >
> > > > >
> > > > > On Fri, Jan 17, 2020 at 4:15 AM Matthias Dellweg
> > > > > <dellweg at atix.de> wrote:
> > > > >> Hello all,
> > > > > >> I believe I have found a new incarnation of hanging tasks
> > > > > >> (tm). This time it is pulp3, and it is as hard to nail down
> > > > > >> as ever. I think it was introduced by
> > > > > >> e36e7b5f0eccc176a6e6298df29293b014f4710c, where the dependency
> > > > > >> on redis was dropped, with the result that 3.3.smth was
> > > > > >> installed instead of 3.1.smth.
> > > > >>
> > > > > >> Before filing an issue: has anyone else out there experienced
> > > > > >> this?
> > > > > >>
> > > > > >> Also, as notes from memory on how to reproduce:
> > > > > >> I have seen tasks hanging in both the "waiting" and "running"
> > > > > >> states when using the command `prestart; django-admin test
> > > > > >> pulp_deb` or `<...> test pulpcore`. All the tasks I have seen
> > > > > >> were `sync`, `general_create` or `general_delete` and looked
> > > > > >> like they never really started doing anything. To get
> > > > > >> consistent results, I had the impression that I needed to
> > > > > >> rebuild the vagrant boxes for every bisecting step. Also,
> > > > > >> updating the python-redis package on a box that worked
> > > > > >> produced a hanging task in the next run.
> > > > >>
> > > > >> Have a good day,
> > > > >>   Matthias
>
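
The hypothesis quoted above, that the forked worker process ends up with a different redis-py than its parent, could be checked with something along these lines. This is a rough sketch, not code from rq or pulpcore: the helper names `module_info` and `module_info_in_fork` are made up for illustration, and it relies on `os.fork`, so it only runs on POSIX systems.

```python
import importlib
import json
import os


def module_info(name):
    """Return pid, version and file path of module `name` as seen by this process."""
    mod = importlib.import_module(name)
    return {
        "pid": os.getpid(),
        "version": getattr(mod, "__version__", None),
        "file": getattr(mod, "__file__", None),
    }


def module_info_in_fork(name):
    """Fork, collect module_info(name) in the child, and return it to the parent."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: report over the pipe, then exit without running
        # the parent's cleanup handlers.
        status = 1
        try:
            os.close(read_fd)
            with os.fdopen(write_fd, "w") as out:
                json.dump(module_info(name), out)
            status = 0
        finally:
            os._exit(status)
    # Parent: read the child's report, then reap the child.
    os.close(write_fd)
    with os.fdopen(read_fd) as inp:
        child = json.load(inp)
    os.waitpid(pid, 0)
    return child
```

Comparing `module_info("redis")` with `module_info_in_fork("redis")` inside the worker's virtualenv would show whether the version or file path differs between the two processes. A plain fork inherits the modules the parent has already imported, so identical output would speak against the hypothesis.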