[Pulp-dev] Port Pulp3 to use RQ

Tue Mar 20 20:52:55 UTC 2018

Another option is TaskTiger (https://github.com/closeio/tasktiger) which
really hooked me with their tagline.

But I really just don't see how we could pull this off responsibly in the
next month (or even 3 months).  Assuming the functionality gaps can be
worked out, it then becomes a question of whether that amount of change
would be acceptable in the interim period between betas.

On Tue, Mar 20, 2018 at 4:39 PM, Daniel Alley <dalley at redhat.com> wrote:

> As Brian said, Celery has a lot of limitations and drawbacks, a lot of
> code complexity, and an upstream that is not terribly responsive.  I, too,
> would love to see us move away from Celery at some point.
>
> But having done a little bit of research over the last few hours since it
> was mentioned, I have some concerns about the gaps between Celery and RQ,
> and I don't think that changing Pulp to use RQ would be as trivial as we
> hope.
>
> I'll start with the benefits of RQ, from what I've read so far.
>
>
>    - It has task prioritization that *actually works*, which would help
>    resolve the issue where reserved resource work tasks get choked out by
>    less important tasks like applicability.  The officially recommended
>    solution that Celery provides for this is... have dedicated workers for
>    each priority level.  Not ideal.
>    - The documentation is pretty good, from what I can tell.  The Celery
>    documentation is usually OK but sometimes... lacking.
>    - RQ is a lot more straightforward and less complex to use, from what
>    I can tell.
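
To make the prioritization point concrete: an RQ worker is started with an ordered list of queue names (e.g. `rq worker high low`) and takes work from the earlier queues first. Below is a toy, stdlib-only model of that behavior. It is not RQ's code, and the queue names `reserved` and `applicability` and the task names are made up for illustration.

```python
from collections import deque


class OrderedQueueWorker:
    """Toy model of a worker that polls several queues in a fixed order."""

    def __init__(self, queue_names):
        # Earlier names have higher priority (dicts keep insertion order).
        self.queues = {name: deque() for name in queue_names}

    def enqueue(self, queue_name, task):
        self.queues[queue_name].append(task)

    def next_task(self):
        # Scan queues in priority order and take the first available task.
        for name, pending in self.queues.items():
            if pending:
                return name, pending.popleft()
        return None


def drain(worker):
    """Pull tasks until every queue is empty; return them in processing order."""
    order = []
    item = worker.next_task()
    while item is not None:
        order.append(item)
        item = worker.next_task()
    return order


# Hypothetical queue and task names.
worker = OrderedQueueWorker(["reserved", "applicability"])
worker.enqueue("applicability", "regenerate-applicability-1")
worker.enqueue("applicability", "regenerate-applicability-2")
worker.enqueue("reserved", "sync-repo")  # enqueued last, processed first

processed = drain(worker)
```

In this model the `sync-repo` task jumps ahead of the applicability work even though it was enqueued last, which is the behavior we would want for reserved resource tasks.
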
>
> But, problems:
>
>    - RQ does not support revoking tasks.  If you send the worker a
>    SIGINT, it will finish the task and then stop processing new ones.  If you
>    send the worker SIGKILL, it will stop immediately, but I don't think it
>    gracefully handles this circumstance.
>       - People have rolled their own revoke functionality, but we should
>       really look at this.
>    - When a RQ task fails, it does not provide a mechanism to
>    automatically run a piece of code.  It puts the task on a "failed" queue
>    and the Python handle for it will have is_failed set to True.  This means
>    we would have to redesign how failed tasks are cleaned up.
>    - I have no idea what happens when RQ loses its connection to Redis; I
>    couldn't find that info anywhere.  Celery (in theory, at least; reality is
>    mushy) will try to reconnect to the broker.
>    - I have no idea how well RQ deals with persistence
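
On the failed-task cleanup point, one library-agnostic option is to wrap each task function so that a cleanup callback runs whenever the task raises. This is only a minimal sketch under that assumption, not RQ or Pulp code; `with_failure_cleanup`, `release_reservation`, and `sync` are hypothetical names.

```python
import functools


def with_failure_cleanup(cleanup):
    """Decorator factory: call cleanup(exc, *args, **kwargs) if the task raises."""

    def decorator(task):
        @functools.wraps(task)
        def wrapper(*args, **kwargs):
            try:
                return task(*args, **kwargs)
            except Exception as exc:
                cleanup(exc, *args, **kwargs)
                # Re-raise so the queue library still records the failure.
                raise

        return wrapper

    return decorator


released = []


def release_reservation(exc, repo_id):
    # Hypothetical cleanup: record which reservation was released and why.
    released.append((repo_id, type(exc).__name__))


@with_failure_cleanup(release_reservation)
def sync(repo_id):
    # Hypothetical task that always fails, to exercise the cleanup path.
    raise RuntimeError("remote unreachable")


try:
    sync("repo-1")
except RuntimeError:
    pass  # the worker would normally see this exception and mark the job failed
```

This only covers tasks that fail by raising an exception. It does nothing for a worker that is killed outright, so the SIGKILL question above still needs a separate answer.
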
>
> Also... we have shaped large parts of our API around what Celery does.
> Undoing this would be very... nontrivial and I don't think it is possible
> before the beta date, and definitely not if we want to guarantee some level
> of stability.
>
> I'll keep looking, but as much as I despise working with Celery, I don't
> think we can make this move without a lot more research to make sure these
> problems are solvable.
>
> On Tue, Mar 20, 2018 at 4:03 PM, Austin Macdonald <austin at redhat.com>
> wrote:
>
>> Not being familiar with RQ, I have questions (but no opinion).
>>
>> Will we also be replacing RabbitMQ with Redis?
>> Does anyone on the team have experience with RQ? In production?
>> How well does RQ scale?
>> Is RQ's use of `pickle` a problem? https://pulp.plan.io/issues/23
>> RQ doesn't work on Windows. Is that a problem? (jk)
>>
>>
>> On Tue, Mar 20, 2018 at 3:35 PM, Brian Bouterse <bbouters at redhat.com>
>> wrote:
>>
>>> Motivation:
>>> 1. Celery causes many bugs and issues for Pulp2 and 3 users and there is
>>> no end in sight.
>>>
>>> 2. The Pulp core team spends a lot of effort fixing Celery bugs. It's
>>> often just us doing it, with little or no assistance from the upstream
>>> communities. It's across 4 projects: celery, kombu, billiard, and pyamqp.
>>>
>>> 3. Celery will never allow a coverage report to be generated when
>>> pulp-smash runs because Celery forked the multiprocessing library into
>>> something called billiard. This will limit Pulp forever.
>>>
>>>
>>> 4. I don't want to work with Celery anymore and I think the other
>>> maintainers (@dalley, @daviddavis) may feel the same. It's an endless
>>> headache. Even basic things don't work in Celery regularly.
>>>
>>> Proposed change: Replace Pulp3's usage of Celery with RQ (
>>> http://python-rq.org/)
>>>
>>> We would keep the exact same design of a resource manager with n
>>> workers, each worker pulling its work exclusively from a dedicated queue.
>>> I've looked into porting pulp3 to it and it's doable because all the same
>>> concepts are there. There are a few details to work out, but I wanted to
>>> start the "should we" discussion before we do all-out technical planning.
>>>
>>> When would we do this? I'm proposing soon. It doesn't need to block the
>>> beta, but soon would be good. I don't think users will care much except for
>>> their systemd files, but it is fundamental and important to pulp3 so we
>>> want to get it into testing sooner.
>>>
>>> Ideas, comments, questions are welcome!
>>>
>>> Thanks,
>>> Brian
>>>
>>> _______________________________________________
>>> Pulp-dev mailing list
>>> Pulp-dev at redhat.com
>>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>>
>>>