[Pulp-dev] Port Pulp3 to use RQ
ipanova at redhat.com
Wed Mar 21 12:13:18 UTC 2018
+1 to what dalley said.
Whatever we decide to replace Celery with, it should not land before the
beta, that's for sure.
I am +10000 to get rid of Celery, but only for something that does not come
with limitations of its own that would just bring a different kind of pain.
Let's keep searching and evaluating alternatives.
Software Engineer | Pulp | Red Hat Inc.
"Do not go where the path may lead,
go instead where there is no path and leave a trail."
On Tue, Mar 20, 2018 at 9:52 PM, Daniel Alley <dalley at redhat.com> wrote:
> Another option is TaskTiger (https://github.com/closeio/tasktiger) which
> really hooked me with their tagline.
> But I really just don't see how we could pull this off responsibly in the
> next month (or even 3 months). Assuming the functionality gaps can be
> worked out, it then becomes a question of whether that amount of change
> would be acceptable in the interim period between betas.
> On Tue, Mar 20, 2018 at 4:39 PM, Daniel Alley <dalley at redhat.com> wrote:
>> As Brian said, Celery has a lot of limitations and drawbacks, a lot of
>> code complexity, and an upstream that is not terribly responsive. I, too,
>> would love to see us move away from Celery at some point.
>> But having done a little bit of research over the last few hours since it
>> was mentioned, I have some concerns about the gaps between Celery and RQ,
>> and I don't think that changing Pulp to use RQ would be as trivial as we
>> I'll start with the benefits of RQ, from what I've read so far.
>> - It has task prioritization that *actually works*, which would help
>> resolve the issue where reserved resource work tasks get choked out by
>> less important tasks like applicability. The officially recommended
>> solution that Celery provides for this is... have dedicated workers for
>> each priority level. Not ideal.
>> - The documentation is pretty good, from what I can tell. The Celery
>> documentation is usually OK but sometimes... lacking.
>> - RQ is a lot more straightforward and less complex to use, from
>> what I can tell.
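A rough illustration of the queue ordering behind the first bullet: an RQ worker started with several queue names (for example `rq worker high low`) checks them in the order given, so work on the first queue is picked up before the next queue is touched. The sketch below models that behaviour with the standard library only. It is not RQ code, and the queue names, job labels, and the `next_job` and `drain` helpers are made up for illustration.

```python
from collections import deque


def next_job(queues, order):
    """Return (queue_name, job) from the first non-empty queue in `order`, else None."""
    for name in order:
        if queues[name]:
            return name, queues[name].popleft()
    return None


def drain(queues, order, arrivals=None):
    """Process jobs one at a time until every queue is empty.

    `arrivals` maps a step index to a (queue_name, job) pair that is
    enqueued just before that step, to simulate work arriving mid-run.
    """
    arrivals = arrivals or {}
    processed = []
    step = 0
    while True:
        if step in arrivals:
            name, job = arrivals[step]
            queues[name].append(job)
        item = next_job(queues, order)
        if item is None:
            break
        processed.append(item)
        step += 1
    return processed


queues = {
    "high": deque(),
    "low": deque(["applicability-1", "applicability-2", "applicability-3"]),
}
# A reserved-resource task arrives after the first low-priority job has run.
processed = drain(queues, ["high", "low"], arrivals={1: ("high", "sync-repo")})
for queue_name, job in processed:
    print(queue_name, job)
```

The late-arriving "high" job runs as soon as the worker finishes its current job, ahead of the remaining applicability work, without needing a dedicated worker per priority level.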
>> But, problems:
>> - RQ does not support revoking tasks. If you send the worker a
>> SIGINT, it will finish the task and then stop processing new ones. If you
>> send the worker SIGKILL, it will stop immediately, but I don't think it
>> gracefully handles this circumstance.
>> - People have rolled their own revoke functionality, but we should
>> really look at this.
>> - When an RQ task fails, it does not provide a mechanism to
>> automatically run a piece of code. It puts the task on a "failed" queue
>> and the Python handle for it will have is_failed set to True. This means
>> we would have to redesign how failed tasks are cleaned up.
>> - I have no idea what happens when RQ loses its connection to Redis; I
>> couldn't find that info anywhere. Celery (in theory, at least, reality is
>> mushy) will try to reconnect to the broker.
>> - I have no idea how well RQ deals with persistence.
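One possible shape for the cleanup redesign raised in the failed-task bullet is a periodic sweep over the "failed" queue instead of a per-task failure callback. The sketch below is a standard-library model of that idea under stated assumptions: `Job`, `run_job`, `sweep_failed`, and the `reserved` set are all hypothetical names, and none of this is RQ's or Pulp's actual API.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Job:
    job_id: str
    func: Callable[[], None]
    resource: str          # hypothetical: the resource this job has reserved
    is_failed: bool = False


def run_job(job: Job, failed_queue: List[Job], reserved: Set[str]) -> None:
    """Run a job. On success release its resource; on failure park it on the failed queue."""
    try:
        job.func()
    except Exception:
        # No failure hook runs here: the job is only marked and parked.
        job.is_failed = True
        failed_queue.append(job)
    else:
        reserved.discard(job.resource)


def sweep_failed(failed_queue: List[Job], reserved: Set[str]) -> List[str]:
    """Cleanup pass: release resources held by failed jobs and report their ids."""
    cleaned = []
    while failed_queue:
        job = failed_queue.pop(0)
        reserved.discard(job.resource)
        cleaned.append(job.job_id)
    return cleaned


def ok() -> None:
    pass


def boom() -> None:
    raise RuntimeError("sync failed")


failed_queue: List[Job] = []
reserved = {"repo-a", "repo-b"}
run_job(Job("1", ok, "repo-a"), failed_queue, reserved)
run_job(Job("2", boom, "repo-b"), failed_queue, reserved)
print(sorted(reserved))              # repo-b is still held by the failed job
cleaned = sweep_failed(failed_queue, reserved)
print(cleaned, sorted(reserved))
```

The trade-off is latency: a resource held by a failed task stays reserved until the next sweep runs, which a failure callback would avoid.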
>> Also... we have shaped large parts of our API around what Celery does.
>> Undoing this would be very... nontrivial and I don't think it is possible
>> before the beta date, and definitely not if we want to guarantee some level
>> of stability.
>> I'll keep looking but as much as I despise working with Celery I don't
>> think we can make this move without a lot more research to make sure these
>> problems are solvable.
>> On Tue, Mar 20, 2018 at 4:03 PM, Austin Macdonald <austin at redhat.com>
>> wrote:
>>> Not being familiar with RQ, I have questions (but no opinion).
>>> Will we also be replacing RabbitMQ with Redis?
>>> Does anyone on the team have experience with RQ? In production?
>>> How well does RQ scale?
>>> Is RQ's use of `pickle` a problem? https://pulp.plan.io/issues/23
>>> RQ doesn't work on Windows. Is that a problem? (jk)
>>> On Tue, Mar 20, 2018 at 3:35 PM, Brian Bouterse <bbouters at redhat.com>
>>> wrote:
>>>> 1. Celery causes many bugs and issues for Pulp2 and 3 users and there
>>>> is no end in sight.
>>>> 2. The Pulp core team spends a lot of effort fixing Celery bugs. It's
>>>> often just us doing it, with little or no assistance from the upstream
>>>> communities. It's across 4 projects: celery, kombu, billiard, and pyamqp.
>>>> 3. Celery will never allow a coverage report to be generated when
>>>> pulp-smash runs because Celery forked the multiprocessing library into
>>>> something called billiard. This will limit Pulp forever.
>>>> 4. I don't want to work with Celery anymore and I think the other
>>>> maintainers (@dalley, @daviddavis) may feel the same. It's an endless
>>>> headache. Even basic things don't work in Celery regularly.
>>>> Proposed change: Replace Pulp3's usage of Celery with RQ (
>>>> We would keep the exact same design of a resource manager with n
>>>> workers, each worker pulling its work exclusively from a dedicated queue.
>>>> I've looked into porting pulp3 to it and it's doable because all the same
>>>> concepts are there. There are a few details to work out, but I wanted to
>>>> start the "should we" discussion before we do all-out technical planning.
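The design described above (a resource manager in front of n workers, each worker reading only from its own queue) can be sketched without any task library. This is a minimal model, assuming a routing policy the thread does not spell out: tasks that touch an already-reserved resource go to the worker holding that reservation, and anything else goes to the shortest queue. `ResourceManager`, `dispatch`, and the worker names are hypothetical, and reservations are never released in this sketch.

```python
from collections import deque


class ResourceManager:
    """Routes each task to exactly one worker's dedicated queue."""

    def __init__(self, worker_names):
        # One dedicated queue per worker; a worker only ever reads its own.
        self.queues = {name: deque() for name in worker_names}
        # resource -> worker currently holding the reservation (assumed policy)
        self.reservations = {}

    def dispatch(self, task, resource):
        worker = self.reservations.get(resource)
        if worker is None:
            # Assumed tie-break: the shortest queue wins, first listed on a tie.
            worker = min(self.queues, key=lambda name: len(self.queues[name]))
            self.reservations[resource] = worker
        self.queues[worker].append(task)
        return worker


manager = ResourceManager(["worker-1", "worker-2"])
first = manager.dispatch("sync repo-a", "repo-a")
second = manager.dispatch("sync repo-b", "repo-b")
third = manager.dispatch("publish repo-a", "repo-a")
print(first, second, third)
```

Because both repo-a tasks land on the same queue, they run in order on one worker, which is the serialization the resource manager exists to provide.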
>>>> When would we do this? I'm proposing soon. It doesn't need to block the
>>>> beta, but soon would be good. I don't think users will care much except for
>>>> their systemd files, but it is fundamental and important to pulp3 so we
>>>> want to get it tested sooner rather than later.
>>>> Ideas, comments, questions are welcome!
>>>> Pulp-dev mailing list
>>>> Pulp-dev at redhat.com