[Pulp-dev] Tasking System Changes and Feedback

Brian Bouterse bmbouter at redhat.com
Mon May 3 19:55:29 UTC 2021


Just catching up after being out for a few weeks.

One additional point: RQ (and queueing systems generally) assume
first-come, first-served ordering. Pulp, on the other hand, has
"resource"-based task dependencies: any task can be worked on as long as
no task "ahead" of it also requires one of the same resources. I believe
RQ would welcome additional "backends" other than Redis, but I do not think
they would accept us breaking the first-come, first-served assumption.
Pulp needs to break it, though, because with traditional queueing
approaches it misses many opportunities for parallel execution.
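
To make the difference concrete, here is a minimal sketch. It is not Pulp or
RQ code; the function names, task names and resource names are all made up
for illustration. It compares which tasks of a backlog could start right away
under strict first-come, first-served handling versus the resource-based rule
described above.

```python
def startable_fifo(backlog):
    """Strict first-come, first-served: stop at the first conflicting task."""
    started, in_use = [], set()
    for name, resources in backlog:
        if in_use & resources:
            break  # everything behind the blocked head has to wait as well
        started.append(name)
        in_use |= resources
    return started


def startable_by_resource(backlog):
    """A task may start if no task ahead of it needs one of its resources."""
    started, ahead = [], set()
    for name, resources in backlog:
        if not (ahead & resources):
            started.append(name)
        ahead |= resources  # later tasks must not overtake this one
    return started


# Hypothetical backlog, oldest task first.
backlog = [
    ("sync-1", {"repo-a"}),
    ("sync-2", {"repo-a"}),
    ("sync-3", {"repo-b"}),
]

fifo_result = startable_fifo(backlog)
resource_result = startable_by_resource(backlog)
```

With this backlog the strict variant can only start sync-1, while the
resource-based variant can start sync-1 and sync-3 in parallel.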


On Tue, Apr 13, 2021 at 5:59 PM Matthias Dellweg <mdellweg at redhat.com>
wrote:

>
>
> On Tue, Apr 13, 2021 at 6:55 PM Daniel Alley <dalley at redhat.com> wrote:
>
>>> Are there any benefits to improving RQ vs the invented here method? I'm
>>> just curious about the cost of maintaining a tasking system versus being
>>> part of a community built one. This feels like the kind of problem that
>>> many other applications should have in the Python world -- or are there
>>> elements of Pulp's deployment architecture that make it unique here?
>>>
>>
>> This shouldn't be viewed through an "invented here" lens, because most of
>> what the proposal does would actually reduce our dependence on "invented
>> here".
>>
>> Basically there is a fundamental problem with having task state split
>> between the database, and some external service (RQ), which is incredibly
>> difficult to keep consistent.  There is a lot of existing complexity around
>> resource locking that would completely go away if we just switch to keeping
>> the "task queue" in the database, and using normal transactions and
>> row/table locks rather than separate "lock objects" in additional tables,
>> etc.
>>
>> The idea is, we already have all of this information about the tasks in
>> the database (reporting what happened and so on), and if we just store 2
>> extra pieces of information - the function to execute, and the parameters
>> to execute with - we will essentially have the "front half" of a task
>> queue, that we can much more easily keep in a consistent state with
>> everything else.
>>
>> This actually is a fairly common problem - it's called the outbox
>> pattern: https://microservices.io/patterns/data/transactional-outbox.html
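
A minimal sketch of that idea, assuming a made-up schema: the table names,
column names and the dotted function path below are hypothetical, and sqlite3
only stands in for the real database. The point it illustrates is that the
task record (function plus parameters) is written in the same transaction as
the change it belongs to, so the two cannot drift apart.

```python
import json
import sqlite3


def enqueue_with_change(conn, repo_name, func_path, kwargs, fail=False):
    """Store a change and its follow-up task in a single transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "INSERT INTO repository (name) VALUES (?)", (repo_name,)
            )
            if fail:
                raise RuntimeError("simulated failure before the task is stored")
            conn.execute(
                "INSERT INTO task (state, function, kwargs)"
                " VALUES ('waiting', ?, ?)",
                (func_path, json.dumps(kwargs)),
            )
        return True
    except RuntimeError:
        return False


def count(conn, table):
    """Row count of one of the two known tables."""
    assert table in ("repository", "task")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE repository (name TEXT PRIMARY KEY)")
conn.execute(
    "CREATE TABLE task ("
    "id INTEGER PRIMARY KEY, state TEXT, function TEXT, kwargs TEXT)"
)

ok = enqueue_with_change(conn, "repo-a", "myapp.tasks.sync", {"repo": "repo-a"})
failed = enqueue_with_change(
    conn, "repo-b", "myapp.tasks.sync", {"repo": "repo-b"}, fail=True
)
```

After the simulated failure neither the second repository row nor a task row
for it exists, so there is never a change without its task or a task without
its change.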
>>
>> Regarding the "back half", which is dealing with the actual process of
>> spawning the process, I'm less certain.  Maybe Matthias can explain what
>> the plan is, there.  IMO even if we continued using RQ for that portion
>> (part 3 in the diagram in the link), the change to the "front half"
>> (everything up to and including the Pulp resource manager) makes a lot of
>> sense and would be a significant net reduction in complexity.
>>
> Sure. The idea here is to not even have a task queue as such. Whereas
> today tasks are dispatched by the resource manager into the queues of
> individual workers, the new system would just collect the tasks to be done
> in the database, like a backlog. Idle workers would look for the next
> available task and then try to acquire the needed resources, so a task is
> assigned to a worker only at the moment it is set to running. This
> happens in a simple endless loop. The actual task is planned to run in a
> forked process to avoid memory leaks.
> One benefit of distributing tasks this way is that there is no need to
> cancel tasks that have been assigned to a worker but are still queued just
> because that worker is restarted.
> "The next available task" is the oldest unstarted task that does not use
> resources (locks) held by a running task or requested by an even older
> unstarted task. This definition is so Pulp-specific that we do not
> believe a ready-to-use solution exists.
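
One step of such a worker loop could look roughly like the sketch below. This
is not the PoC code. The table layout, the state names and the use of the row
id as a stand-in for task age are assumptions for illustration, sqlite3 stands
in for the real database, and the forking of the actual task is left out.

```python
import json
import sqlite3


def claim_next_task(conn, worker):
    """Set the next available task to running for this worker, else None."""
    conn.execute("BEGIN IMMEDIATE")  # only one worker decides at a time
    try:
        rows = conn.execute(
            "SELECT id, state, resources FROM task"
            " WHERE state IN ('waiting', 'running') ORDER BY id"
        ).fetchall()
        # Resources held by running tasks block every waiting task.
        blocked = set()
        for _, state, resources in rows:
            if state == "running":
                blocked.update(json.loads(resources))
        claimed = None
        for task_id, state, resources in rows:
            if state != "waiting":
                continue
            needed = set(json.loads(resources))
            if needed.isdisjoint(blocked):
                conn.execute(
                    "UPDATE task SET state = 'running', worker = ?"
                    " WHERE id = ?",
                    (worker, task_id),
                )
                claimed = task_id
                break
            # An older waiting task also blocks younger ones on its resources.
            blocked.update(needed)
        conn.execute("COMMIT")
        return claimed
    except Exception:
        conn.execute("ROLLBACK")
        raise


# isolation_level=None: transactions are controlled explicitly above.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute(
    "CREATE TABLE task ("
    "id INTEGER PRIMARY KEY, state TEXT, resources TEXT, worker TEXT)"
)
for resources in (["repo-a"], ["repo-a"], ["repo-b"]):
    conn.execute(
        "INSERT INTO task (state, resources) VALUES ('waiting', ?)",
        (json.dumps(resources),),
    )

first = claim_next_task(conn, "worker-1")
second = claim_next_task(conn, "worker-2")
third = claim_next_task(conn, "worker-3")
```

Here worker-1 gets task 1, worker-2 skips task 2 (it needs repo-a, which
task 1 holds) and gets task 3, and worker-3 finds nothing to do until task 1
finishes. A task is never bound to a worker before it is running, so a
restarted worker leaves no queued tasks behind.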
>
>>
>>
>>> This is sort of an aside to this general change. Are Pulp tasks cleaned up
>>> from the database today?
>>>
>>
>> They aren't.  We don't clean up anything automatically; cleanup is
>> user-driven.
>>
>> On Tue, Apr 13, 2021 at 11:18 AM Eric Helms <ehelms at redhat.com> wrote:
>>
>>>
>>>
>>> On Thu, Apr 8, 2021 at 5:24 PM Daniel Alley <dalley at redhat.com> wrote:
>>>
>>>> Eric,
>>>>
>>>> * The idea is to move away from RQ entirely.  RQ is fine (and vastly
>>>> better than Celery IMO), but managing task state across both 1) the
>>>> database and 2) a separate, external registry is still problematic.  If all
>>>> of the information can simply be kept in the database, then it will be much
>>>> easier to maintain consistent state.
>>>>
>>>
>>> Are there any benefits to improving RQ vs the invented here method? I'm
>>> just curious about the cost of maintaining a tasking system versus being
>>> part of a community built one. This feels like the kind of problem that
>>> many other applications should have in the Python world -- or are there
>>> elements of Pulp's deployment architecture that make it unique here?
>>>
>>>
>>>> * *Maybe*.  We're considering using Redis as a cache to improve
>>>> content serving performance (after all, caching is one of the primary uses
>>>> of Redis). If we do, then Redis would remain in the architecture, but it
>>>> could potentially be an optional component and would be easier to remove at
>>>> some point in the future.
>>>> * We'd just be adding a small amount of information to each task
>>>> record, and it wouldn't prevent cleanup later.
>>>>
>>>
>>> This is sort of an aside to this general change. Are Pulp tasks cleaned
>>> up from the database today?
>>>
>>>
>>>>
>>>>
>>>>
>>>> On Thu, Apr 8, 2021 at 4:42 PM Eric Helms <ehelms at redhat.com> wrote:
>>>>
>>>>> A few initial questions that get a bit into the stack but will help
>>>>> the Foreman project think on the proposed changes:
>>>>>
>>>>>  * Does this move away from RQ entirely or just RQ workers?
>>>>>  * Do the new workers remove Pulp 3's use of Redis altogether?
>>>>>  * Will using the database result in any additional buildup of
>>>>> tasking information that can impact performance over time? (Or does all
>>>>> task data get cleaned up eventually?)
>>>>>
>>>>> Thanks for sending this along early.
>>>>>
>>>>> On Fri, Apr 2, 2021 at 4:43 PM Brian Bouterse <bmbouter at redhat.com>
>>>>> wrote:
>>>>>
>>>>>> FYI, @mdellweg and I have been collaborating on the tasking system
>>>>>> changes. This email is to share some info to transition the work to
>>>>>> @mdellweg while I'm out. With the new-style disabled by default I am hoping
>>>>>> it can go into 3.13.
>>>>>>
>>>>>> ## The PoC and ticket info
>>>>>>
>>>>>> The PoC is basically functional, but it's a PoC:
>>>>>> https://github.com/pulp/pulpcore/pull/1222/
>>>>>>
>>>>>> * The epic is being tracked here which recaps why we're doing this
>>>>>> and the high level approach. The sub-tasks capture the various detailed
>>>>>> changes. https://pulp.plan.io/issues/8495
>>>>>>
>>>>>> * This is totally separate from the RQ workers you use today, and
>>>>>> those will continue to be available for a while.
>>>>>>
>>>>>> ## Next Steps
>>>>>>
>>>>>> * @mdellweg will continue the work and hopefully merge the PoC while
>>>>>> I'm out
>>>>>>
>>>>>> * Once it's demo-able I've asked @mdellweg to give a 20 minute,
>>>>>> public (hopefully recorded) technical demo. While it is designed to be a
>>>>>> drop-in replacement from a user perspective, we think sharing the internals
>>>>>> will be helpful to get feedback and increase the list of those who
>>>>>> understand the work.
>>>>>>
>>>>>> All the best,
>>>>>> Brian
>>>>>>
>>>>>> _______________________________________________
>>>>>> Pulp-dev mailing list
>>>>>> Pulp-dev at redhat.com
>>>>>> https://listman.redhat.com/mailman/listinfo/pulp-dev
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Eric Helms
>>>>> Principal Software Engineer
>>>>> Satellite
>>>>>
>>>>
>>>
>>> --
>>> Eric Helms
>>> Principal Software Engineer
>>> Satellite
>>>
>>