[Pulp-dev] pulp3: Task Resource proposal

Michael Hrivnak mhrivnak at redhat.com
Fri Jun 30 18:36:26 UTC 2017

Thanks for digging into this.

On Thu, Jun 29, 2017 at 3:16 PM, David Davis <daviddavis at redhat.com> wrote:

> I've been working on supporting tags in Pulp 3[0] and have gotten some
> feedback from various other developers. I'd like to open the discussion
> up to a broader audience, though, and see if anyone has any feedback.
>
> Background:
> Currently in Pulp 2, there are two types of task tags: action and
> resource. Action tags indicate the action being performed (sync, publish,
> etc), while resource tags indicate the resource type (repository,
> publisher, etc) and the resource id of the resource being acted upon.

For extra context, this is all a relic from before Pulp even used celery.
Back then it had a home-grown task system that ran entirely inside a single
process, and these tags were used more extensively than they are today. At
this point I think they are primarily used by REST API clients to get some
understanding of what a task is doing. There may be no other use.

> Proposal:
> In Pulp 3, we ought to get rid of tags. For action tags, we'll simply add
> a field named "name" to the task table. This will store information about
> which action the task is performing (sync, publish, etc).

I suggest these correlate directly to the task name as celery knows it,
which we can (and probably should) manually set. It defaults to the python
path, but it's better to have a stable value that won't change due to

> For resource tags, these will be replaced with Task Resources. Like
> resource tags, these will be a one-to-many relationship to tasks (i.e. a task
> will have many task resources). These Task Resources then have a one-to-one
> generic relation[1] to any object in Pulp that a task can act on.

When you say one-to-one, I think that would actually be a one-to-many,
right? A GenericForeignKey? A resource needs to be able to be referenced
by more than one task.

Should we use the same generic relation for ReservedResource? Currently it
stores "resource" as a text field. Consistency seems valuable.
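
A rough sketch of what I mean by one resource being referenced by more than one task, using sqlite3 from the standard library instead of the Django ORM. The column names come from the proposal; the table names `task` and `task_resource`, the use of TEXT for uuids, and the sample data are assumptions for illustration only.

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE task (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE task_resource (
        id TEXT PRIMARY KEY,
        task_id TEXT REFERENCES task(id),
        content_type TEXT,   -- e.g. "Repository"
        object_id TEXT       -- id of the repo/publisher/etc.
    );
""")

# One repository, acted on by two different tasks.
repo_id = str(uuid.uuid4())
for name in ("sync", "publish"):
    task_id = str(uuid.uuid4())
    conn.execute("INSERT INTO task VALUES (?, ?)", (task_id, name))
    conn.execute(
        "INSERT INTO task_resource VALUES (?, ?, ?, ?)",
        (str(uuid.uuid4()), task_id, "Repository", repo_id),
    )

# Look up every task that touched the repository via (content_type, object_id).
rows = conn.execute(
    "SELECT t.name FROM task t JOIN task_resource r ON r.task_id = t.id "
    "WHERE r.content_type = ? AND r.object_id = ? ORDER BY t.name",
    ("Repository", repo_id),
).fetchall()
task_names = [row[0] for row in rows]
```

Both task_resource rows point at the same (content_type, object_id) pair, so the relation from the resource's side cannot be one-to-one.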

> Database Tables:
>
> Task
> ---
> ...
> name - varchar
>
> TaskResource
> ---
> id - uuid
> task_id - uuid (foreign key to a task)
> content_type - varchar (e.g. "Repository")
> object_id - uuid (generic foreign key to a repo/publisher/etc)
>
> Rest API:
> For task names, this field will be returned when querying tasks and can
> also be filtered, etc when searching tasks. Pretty straightforward.
> For Task Resources, I'm imagining that there won't be a separate API for
> Task Resources. Instead, task resource information will be returned along
> with tasks in a list field such as "resources." This presents a bit of a
> problem though: each resource will have different data depending on the
> type of resource (repo, publisher, etc)[2]. Alternatively, we could return
> a homogeneous list of hashes with fields like type (Publisher/Repository,
> etc), id, and href. However, I think this is less useful to users.

I think this should be serialized as a simple list of URLs to the
corresponding resources. That is REST's way of Uniformly doing Resource
Identification. ;) This is a topic I've been meaning to email this list
about anyway, to get us thinking more RESTful than in the past. We need to
break away from the idea that REST API clients should know about primary
keys, natural keys, or anything similar as the means through which to
reference a resource. This rant by the creator of REST is a fun starting
point for reading about how to identify resources:


> For filtering, I think it would be possible to filter on content_type and
> object_id in Task Resource but I think this would be inconvenient for
> users. I think it would be easier if we define some shortcuts (e.g.
> repository_id, publisher_id, etc) that users can use to filter on. I'm
> imagining one such id filter for every possible content_type.

If they can filter based on a resource's URI, I think that would cover the
basic filtering use cases for this field.
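
As a minimal sketch of both ideas (serializing resources as a plain list of URLs, and filtering tasks by one of those URLs), something like the following could work. The URL layout under `/api/v3/`, the `PATHS` mapping, and the helper names are all hypothetical; nothing in this thread fixes the real paths.

```python
from dataclasses import dataclass, field

# Hypothetical URL layout; the real Pulp 3 paths are not decided in this thread.
PATHS = {"Repository": "repositories", "Publisher": "publishers"}


def href(content_type, object_id):
    """Build the URL that identifies a resource."""
    return f"/api/v3/{PATHS[content_type]}/{object_id}/"


@dataclass
class Task:
    name: str
    # (content_type, object_id) pairs, as stored in TaskResource
    resources: list = field(default_factory=list)


def serialize(task):
    """Return the task with its resources as a simple list of URLs."""
    return {
        "name": task.name,
        "resources": [href(ct, oid) for ct, oid in task.resources],
    }


def filter_by_resource(tasks, uri):
    """Keep only the tasks that reference the resource identified by uri."""
    return [t for t in tasks if uri in serialize(t)["resources"]]


tasks = [
    Task("sync", [("Repository", "r1")]),
    Task("publish", [("Repository", "r1"), ("Publisher", "p1")]),
    Task("sync", [("Repository", "r2")]),
]
matches = filter_by_resource(tasks, "/api/v3/repositories/r1/")
```

The client never sees content_type or object_id; it only passes back a URL it already received from the API.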


Michael Hrivnak

Principal Software Engineer, RHCE

Red Hat