[Pulp-dev] Meeting Minutes: Lockless pulp

Matthias Dellweg mdellweg at redhat.com
Wed Aug 12 15:24:52 UTC 2020


Today we met to brainstorm about the pulp design around locking and
the resource manager.
Here's the meeting transcript:

# djnd (Lockless Pulp)

# General Problem Statements

- Resource manager is a problem.
    - It's a bottleneck. Every task goes through the resource manager.
- When tasks die, the state in RQ and the state in Pulp become inconsistent
- Pulp is inefficient. It's always FIFO when that may not be optimal.
    - Work has to wait if there's a reservation ahead of it waiting
for another resource
- Orphan cleanup blocks all work
    - Orphan cleanup might be handled separately by ensuring that any
resource that is to be cleaned up (currently content units and artifacts)
is owned by at least one thing (RepositoryVersion, Task or User) at
any time until it is no longer needed
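
A minimal sketch of the ownership idea from the last point, assuming a
hypothetical in-memory `OwnershipRegistry` (not an existing Pulp class) and
string identifiers for resources and owners. A real implementation would
need the acquire/release steps to be atomic in the database:

```python
from collections import defaultdict
from typing import Dict, Set


class OwnershipRegistry:
    """Hypothetical registry: a resource is an orphan once it has no owners."""

    def __init__(self) -> None:
        self._owners: Dict[str, Set[str]] = defaultdict(set)

    def acquire(self, resource: str, owner: str) -> None:
        # A RepositoryVersion, Task or User claims the resource.
        self._owners[resource].add(owner)

    def release(self, resource: str, owner: str) -> None:
        # The owner no longer needs the resource.
        self._owners[resource].discard(owner)

    def orphans(self) -> Set[str]:
        # Only resources without any owner are safe to clean up.
        return {res for res, owners in self._owners.items() if not owners}


registry = OwnershipRegistry()
registry.acquire("artifact-1", "task-42")
registry.acquire("artifact-1", "repository-version-7")
registry.release("artifact-1", "task-42")
still_owned = registry.orphans()   # the repository version still owns it
registry.release("artifact-1", "repository-version-7")
now_orphaned = registry.orphans()
```

Under this assumption, cleanup only ever touches resources nobody owns, so
it would not have to block other work.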


# Why do we have mutable resources?

Some things are immutable, e.g. content, but other things are mutable,
e.g. a Repository's data, such as its name.

* Mutable data creates write-write race conditions
* Users have a first-come-first-serve (FCFS) expectation, e.g. a
sync-then-publish


# Why do we use locking today?

* We use it to solve the base-path problem for Distributions
* It provides the FCFS guarantee of work w.r.t a specific resource,
e.g. Repository
* Orphan cleanup is a singleton so it locks content units that are in
use by other tasks when deleting
* Deletion of resources is synchronized by locking
* Updating of resources is synchronized by locking
* Creation of RepositoryVersions is serialized by locking on the Repository


# Solution

...getting out of the synchronization quadrant with immutability...
Locks or bottlenecks are needed to prevent one process from using a
resource while another process is changing it.
If all resources were immutable, they could be used by an unlimited
number of processes simultaneously.

Pros:
  - no need for resource manager
  - all (remaining) services are scalable

Cons:
  - Harder to design


## Exposed immutability

*All* resources are immutable

Pros:
  - relatively easy to implement; no need to redesign the database
  - blockless; no waiting on resources, ever (the user has to wait...)
  - another user cannot modify a resource you are about to use

Cons:
  - all burden pushed to the user
  - changing a resource requires replacing it
  - unable to change parameters behind a "name" (natural key)

## Futures (delayed immutability)

Resources are created as futures and resolved by tasks.
Once resolved, resources are immutable.
Resources start in the unshared mutable corner and move to the shared
immutable corner.
Futures can readily be used to create new resources while not yet resolved.

Pros:
  - accounts for actions like sync and publish taking time
  - futures form a DAG -> no possibility of deadlocks
  - still no waiting on finished resources

Cons:
  - waiting on resources; blocking
  - futures can fail while waiting on failing dependent futures
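
A minimal sketch of how such futures might behave, assuming a hypothetical
`ResourceFuture` class (nothing here is existing Pulp API). Dependencies are
fixed at creation time, so a future can only point at futures that already
exist, which is what keeps the graph acyclic:

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple


class FutureFailed(Exception):
    """Raised when a future, or one of its dependencies, cannot be resolved."""


@dataclass
class ResourceFuture:
    name: str
    compute: Callable[..., Any]                   # the task that resolves it
    depends_on: Tuple["ResourceFuture", ...] = ()
    state: str = "pending"                        # pending -> resolved | failed
    value: Optional[Any] = None

    def resolve(self) -> Any:
        if self.state == "resolved":
            return self.value                     # finished: no waiting
        if self.state == "failed":
            raise FutureFailed(self.name)
        try:
            # Resolving dependencies first is the "waiting" part.
            inputs = [dep.resolve() for dep in self.depends_on]
            self.value = self.compute(*inputs)
        except Exception as exc:
            self.state = "failed"                 # failure propagates downstream
            raise FutureFailed(self.name) from exc
        self.state = "resolved"
        return self.value


# A publish future is created while the sync future is still unresolved.
sync = ResourceFuture("sync", lambda: ("pkg-a", "pkg-b"))
publish = ResourceFuture(
    "publish", lambda content: f"published {len(content)} units", (sync,)
)
result = publish.resolve()

# A failing dependency makes the future that waits on it fail as well.
broken = ResourceFuture("broken-sync", lambda: 1 / 0)
dependent = ResourceFuture("dependent", lambda content: content, (broken,))
try:
    dependent.resolve()
except FutureFailed:
    pass
```

The sketch resolves synchronously for brevity; in practice the resolution
would be done by worker tasks.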

## Copy-on-write with lookup table

Resources are immutable, and referenced by a key lookup table.
Changing a resource means creating a modified copy and changing the
reference in the lookup table.
Tasks reference the actual resources.

Pros:
  - users can "change" resources as they are used to
  - blockless; no waiting on resources, ever (the user has to wait...)
  - tasks can never fail on missing resources (being deleted in the meantime)

Cons:
  - extra table join on lookup
  - impossible to retrieve the natural key
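
A minimal sketch of the lookup-table idea, assuming a hypothetical in-memory
`Store` with a frozen `Remote` dataclass standing in for a database row
(names and URLs are made up for illustration):

```python
import itertools
from dataclasses import dataclass, replace
from typing import Dict


@dataclass(frozen=True)
class Remote:
    """Immutable resource; note that it does not carry its own name."""
    pk: int
    url: str


class Store:
    def __init__(self) -> None:
        self._pks = itertools.count(1)
        self.resources: Dict[int, Remote] = {}    # immutable rows
        self.lookup: Dict[str, int] = {}          # natural key -> current pk

    def create(self, name: str, url: str) -> Remote:
        remote = Remote(next(self._pks), url)
        self.resources[remote.pk] = remote
        self.lookup[name] = remote.pk
        return remote

    def get(self, name: str) -> Remote:
        # The extra "join": key -> pk -> resource.
        return self.resources[self.lookup[name]]

    def change(self, name: str, **fields) -> Remote:
        # Copy-on-write: create a modified copy, then repoint the key.
        modified = replace(self.get(name), pk=next(self._pks), **fields)
        self.resources[modified.pk] = modified
        self.lookup[name] = modified.pk
        return modified


store = Store()
store.create("fedora", "https://example.org/v1")
task_input = store.get("fedora")     # a task keeps the actual resource
store.change("fedora", url="https://example.org/v2")
current = store.get("fedora")
```

The task still sees the resource it started with, while new lookups by name
get the modified copy. Since `Remote` has no name field, the sketch also
shows why the natural key cannot be recovered from the resource itself.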



