[Pulp-dev] Port Pulp3 to use RQ

Tue Mar 20 20:39:00 UTC 2018

As Brian said, Celery has a lot of limitations and drawbacks, a lot of code
complexity, and an upstream that is not terribly responsive.  I, too, would
love to see us move away from Celery at some point.

But having done a little bit of research over the last few hours since it
was mentioned, I have some concerns about the gaps between Celery and RQ,
and I don't think that changing Pulp to use RQ would be as trivial as we
hope.

I'll start with the benefits of RQ, from what I've read so far.

   - It has task prioritization that *actually works*, which would help
   resolve the issue where reserved resource work tasks get choked out by
   less important tasks like applicability.  The officially recommended
   solution that Celery provides for this is... have dedicated workers for
   each priority level.  Not ideal.
   - The documentation is pretty good, from what I can tell.  The Celery
   documentation is usually OK but sometimes... lacking.
   - RQ is a lot more straightforward and less complex to use, from what I
   can tell.
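
To illustrate the prioritization point: as far as I understand it, an RQ
worker started with several queues checks them in the order they were
listed, so a higher-priority queue is always drained first.  The snippet
below is only a toy model of that polling order in plain Python, not RQ
code; `ToyWorker`, the queue names, and the job names are all made up for
illustration.

```python
from collections import deque


class ToyWorker:
    """Toy model of strict-priority polling; this is NOT the RQ API."""

    def __init__(self, queues):
        # Queues are listed from highest to lowest priority.
        self.queues = queues

    def next_job(self):
        # Always rescan from the first queue, so a newly arrived
        # high-priority job is picked before any lower-priority job.
        for name, jobs in self.queues:
            if jobs:
                return name, jobs.popleft()
        return None


def drain(worker):
    """Pull jobs until every queue is empty and return them in order."""
    order = []
    item = worker.next_job()
    while item is not None:
        order.append(item)
        item = worker.next_job()
    return order


# Hypothetical queues: reserved-resource work outranks applicability.
reserved = deque(["sync-repo-1"])
applicability = deque(["applicability-1", "applicability-2"])
worker = ToyWorker([("reserved", reserved), ("applicability", applicability)])

first = worker.next_job()
reserved.append("sync-repo-2")  # arrives while low-priority work is pending
rest = drain(worker)
```

In this model the late-arriving reserved job still runs before either
applicability job, which is the behavior we would want from one worker
without dedicating workers to each priority level.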

But, problems:

   - RQ does not support revoking tasks.  If you send the worker a SIGINT,
   it will finish the task and then stop processing new ones.  If you send the
   worker SIGKILL, it will stop immediately, but I don't think it gracefully
   handles this circumstance.
      - People have rolled their own revoke functionality, but we should
      really look at this.
   - When an RQ task fails, it does not provide a mechanism to automatically
   run a piece of code.  It puts the task on a "failed" queue and the Python
   handle for it will have is_failed set to True.  This means we would have to
   redesign how failed tasks are cleaned up.
   - I have no idea what happens when RQ loses its connection to Redis; I
   couldn't find that info anywhere.  Celery (in theory, at least, reality is
   mushy) will try to reconnect to the broker.
   - I have no idea how well RQ deals with persistence.
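
One way the failed-task cleanup could be approached, sketched here under
assumptions: wrap each task function so that our own cleanup code runs when
it raises, then re-raise so the queue still records the failure.  This is
plain Python, not an RQ feature; `with_failure_cleanup`,
`release_resources`, and `sync` are hypothetical names.

```python
import functools


def with_failure_cleanup(cleanup):
    """Run `cleanup` when the wrapped task raises (hypothetical helper)."""

    def decorator(task):
        @functools.wraps(task)
        def wrapper(*args, **kwargs):
            try:
                return task(*args, **kwargs)
            except Exception as exc:
                cleanup(exc, *args, **kwargs)
                raise  # re-raise so the task queue still sees the failure

        return wrapper

    return decorator


released = []


def release_resources(exc, repo):
    # Stand-in for real cleanup, e.g. releasing reserved resources.
    released.append((repo, type(exc).__name__))


@with_failure_cleanup(release_resources)
def sync(repo):
    # Made-up task body; fails only for the repo named "broken".
    if repo == "broken":
        raise RuntimeError("sync failed")
    return "synced " + repo
```

A wrapper like this would not help when the worker process itself is killed
(e.g. by SIGKILL), since no Python code gets to run in that case, so that
scenario would still need separate handling.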

Also... we have shaped large parts of our API around what Celery does.
Undoing this would be very... nontrivial and I don't think it is possible
before the beta date, and definitely not if we want to guarantee some level
of stability.

I'll keep looking, but as much as I despise working with Celery, I don't
think we can make this move without a lot more research to make sure these
problems are solvable.

On Tue, Mar 20, 2018 at 4:03 PM, Austin Macdonald <austin at redhat.com> wrote:

> Not being familiar with RQ, I have questions (but no opinion).
>
> Will we also be replacing RabbitMQ with Redis?
> Does anyone on the team have experience with RQ? In production?
> How well does RQ scale?
> Is RQ's use of `pickle` a problem? https://pulp.plan.io/issues/23
> RQ doesn't work on Windows. Is that a problem? (jk)
>
>
> On Tue, Mar 20, 2018 at 3:35 PM, Brian Bouterse <bbouters at redhat.com>
> wrote:
>
>> Motivation:
>> 1. Celery causes many bugs and issues for Pulp2 and 3 users and there is
>> no end in sight.
>>
>> 2. The Pulp core team spends a lot of effort fixing Celery bugs. It's
>> often just us doing it with little or no assistance from the upstream
>> communities. It's across 4 projects: celery, kombu, billiard, and pyamqp.
>>
>> 3. Celery will never allow a coverage report to be generated when
>> pulp-smash runs because Celery forked the multiprocessing library into
>> something called billiard. This will limit Pulp forever.
>>
>> 4. I don't want to work with Celery anymore and I think the other
>> maintainers (@dalley, @daviddavis) may feel the same. It's an endless
>> headache. Even basic things don't work in Celery regularly.
>>
>> Proposed change: Replace Pulp3's usage of Celery with RQ (
>> http://python-rq.org/)
>>
>> We would keep the exact same design of a resource manager with n workers,
>> each worker pulling its work exclusively from a dedicated queue. I've
>> looked into porting pulp3 to it and it's doable because all the same
>> concepts are there. There are a few details to work out, but I wanted to
>> start the "should we" discussion before we do all-out technical planning.
>>
>> When would we do this? I'm proposing soon. It doesn't need to block the
>> beta, but soon would be good. I don't think users will care much except for
>> their systemd files, but it is fundamental and important to pulp3 so we
>> want to get it into testing sooner.
>>
>> Ideas, comments, questions are welcome!
>>
>> Thanks,
>> Brian
>>
>> _______________________________________________
>> Pulp-dev mailing list
>> Pulp-dev at redhat.com
>> https://www.redhat.com/mailman/listinfo/pulp-dev
>>
>>
>