[Pulp-dev] "Internal" periodic tasks in Pulp3?

Brian Bouterse bbouters at redhat.com
Wed Mar 29 14:44:52 UTC 2017

We can easily continue to dispatch internal periodic tasks using code we
write and maintain, or we can find a dedicated project for an HA cron, like
rcron[0]. Using Celerybeat made sense in Pulp2 because dynamic, user-facing
schedules were a requirement, and that is exactly what Celerybeat provides.
Pulp3 no longer has that requirement, which is what changes the benefit of
Celerybeat.

Now here are the problems with Celerybeat and Pulp:

* It's fragile. Pulp's custom scheduler relies on internal behaviors of
Celerybeat that are not part of an API. We have spent a lot more time
fixing it than we would spend maintaining a much smaller code base of our
own.

* Celerybeat does not support high availability (HA) or failover but Pulp's
use of Celerybeat is specifically HA. Their feature is not a good fit for
our needs. This requires our Scheduler to take special care with how we
hack the internals of Celery. We would do better to get the periodic
dispatch feature without Celery.

* Not much benefit from Celerybeat. For all the time and care we need to
put into working with it (see above), we aren't getting much benefit in
Pulp3 because "dynamic user facing schedules" are not part of Pulp3. With
that requirement change, the value proposition of Celerybeat is very
limited, which makes the risk not worth it.

* The alternatives are very simple and easy. A single thread in the
Celerybeat process that does not call into Celery code could do all of this
in ~20 lines of code.
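
For illustration, the single-thread idea in the last bullet could look
roughly like the sketch below. This is only a sketch under assumptions: the
class name PeriodicDispatcher, its add() and stop() methods, and the tick
parameter are all made up for this example and are not existing Pulp or
Celery code. Each registered callable stands in for whatever would actually
queue the work in Pulp3. It does not attempt to solve HA or failover.

```python
import logging
import threading
import time


class PeriodicDispatcher(threading.Thread):
    """Hypothetical helper: run registered callables at fixed intervals."""

    def __init__(self, tick=1.0):
        super().__init__(daemon=True)
        self._tick = tick                      # seconds between schedule checks
        self._stop_event = threading.Event()
        self._jobs = []                        # entries: [interval, next_run, func]

    def add(self, interval, func):
        """Register func to be called roughly every `interval` seconds."""
        self._jobs.append([interval, time.monotonic() + interval, func])

    def run(self):
        # Event.wait() returns False on timeout and True once stop() is called.
        while not self._stop_event.wait(self._tick):
            now = time.monotonic()
            for job in self._jobs:
                interval, next_run, func = job
                if now >= next_run:
                    job[1] = now + interval
                    try:
                        func()
                    except Exception:
                        # One failing job should not kill the dispatcher thread.
                        logging.exception("periodic job failed")

    def stop(self):
        self._stop_event.set()
```

Usage would be something like creating one dispatcher, calling
add(3600, some_callable) for each internal task, and then start(). Jobs
should be registered before start(), since the job list is not locked.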

As an aside, if we did engineer a way to not have the periodic tasks as
part of the Pulp3 MVP, we would be able to eliminate Celerybeat from the
Pulp architecture entirely.

[0]: https://github.com/EvanK/rcron


On Wed, Mar 29, 2017 at 10:13 AM, Michael Hrivnak <mhrivnak at redhat.com>
wrote:

> I think we could probably engineer our way around the requirement to run
> that periodic task if we had to. But we should weigh that against what it
> would take to continue using celerybeat.
> Generally, I think we'll continue to find value in having something that
> can run periodic tasks for internal purposes. Trimming history is one good
> example, not just for the cases Pulp 2 currently does, but also for things
> like repo versions.  Celerybeat doesn't have to be the thing that initiates
> periodic tasks, but it's a reasonable starting place since we already use
> it that way.
> Can you shed some light on the problems with using celerybeat with Pulp 3?
> Michael
> On Tue, Mar 28, 2017 at 3:55 PM, Brian Bouterse <bbouters at redhat.com>
> wrote:
>> This came out of IRC and discussion on a Pulp3 MVP call...
>> In Pulp2 we had three "internal" periodic tasks [0] that would do things
>> like database maintenance. The "nice to have" maintained periodic tasks can
>> be left out of the Pulp3 MVP, but @mhrivnak identified that
>> 'download_deferred_content' is required for lazy downloading correctness.
>> Can others confirm that for Pulp3 we will require the
>> download_deferred_content Celery task?
>> Once we confirm ^, I want to discuss why we should not implement it in
>> Pulp3 the way we did in Pulp2.
>> Note, internal periodic calls are not the same as "user scheduled calls".
>> Pulp3 is not supporting user scheduled calls for reasons identified in
>> these blog posts [1][2].
>> [0]: https://github.com/pulp/pulp/blob/fba39f1a82bbf8666df64999e89bc21d6d5ec897/server/pulp/server/async/celery_instance.py#L24-L40
>> [1]: http://pulpproject.org/2016/12/07/deprecating-nodes/
>> [2]: http://pulpproject.org/2016/10/31/pulp-3.0-mvp/
>> -Brian