[Pulp-dev] pulp3: Task Resource proposal

Fri Jun 30 19:03:29 UTC 2017

On Fri, Jun 30, 2017 at 2:36 PM, Michael Hrivnak <mhrivnak at redhat.com>
wrote:

> Thanks for digging into this.
>
> On Thu, Jun 29, 2017 at 3:16 PM, David Davis <daviddavis at redhat.com>
> wrote:
>
>> I've been working on supporting tags in Pulp 3[0] and have gotten some
>> feedback from other developers. I'd like to open the discussion up to a
>> broader audience, though, and see if anyone else has feedback.
>>
>> Background:
>>
>> Currently in Pulp 2, there are two types of task tags: action and
>> resource. Action tags indicate the action being performed (sync, publish,
>> etc), while resource tags indicate the resource type (repository,
>> publisher, etc) and the resource id of the resource being acted upon.
>>
>
> For extra context, this is all a relic from before Pulp was even using
> celery. Back then it had a home-grown task system that all operated inside
> a single process, and these tags were used more extensively than today. At
> this point I think they are primarily used by REST API clients to have some
> understanding of what a task is doing. There may be no other use.
>
>
>>
>> Proposal:
>>
>> In Pulp 3, we ought to get rid of tags. For action tags, we'll simply add
>> a field named "name" to the task table. This will store information about
>> which action the task is performing (sync, publish, etc).
>>
>
> I suggest these correlate directly to the task name as celery knows it,
> which we can (and probably should) set manually. It defaults to the python
> path, but it's better to have a stable value that won't change when the
> code is refactored.
>

Agreed.
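
To illustrate the point about stable names, here is a rough sketch. It is
not Celery itself: the `task` decorator and `REGISTRY` below are hypothetical
stand-ins that only mimic the behaviour described above (the name defaults to
the Python path unless one is given). Celery's own task decorators accept a
`name` argument for the same purpose. The name "pulp.repository.sync" is an
assumed naming scheme, not something decided in this thread.

```python
from typing import Callable, Dict, Optional

# Hypothetical registry standing in for a task system's name -> task mapping.
REGISTRY: Dict[str, Callable] = {}


def task(name: Optional[str] = None):
    """Register a function under an explicit name, or under its Python path."""
    def decorator(func):
        # Fall back to "<module>.<qualname>", which changes if code moves.
        task_name = name or f"{func.__module__}.{func.__qualname__}"
        REGISTRY[task_name] = func
        func.task_name = task_name
        return func
    return decorator


@task()  # default name: tied to the module layout
def sync_default(repo):
    return f"synced {repo}"


@task(name="pulp.repository.sync")  # assumed name: survives a refactor
def sync(repo):
    return f"synced {repo}"
```

With something like this, the value stored in the task table's "name" field
would stay the same even if `sync` moved to a different module.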

>
>
>>
>> For resource tags, these will be replaced with Task Resources. Like
>> resource tags, these will be a one-to-many relationship to tasks (i.e. a task
>> will have many task resources). These Task Resources then have a one-to-one
>> generic relation[1] to any object in Pulp that a task can act on.
>>
>
> When you say one-to-one, I think that would actually be a one-to-many,
> right? That is, a GenericForeignKey? A resource needs to be able to be
> referenced by more than one task.
>

Yes, good catch.
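
For anyone following along, the corrected relationship can be sketched in
plain Python. This is only an illustration of the proposed TaskResource
table, not Django model code: the dataclass and the `tasks_for` helper are
hypothetical, and the field names are taken from the table in the proposal.

```python
from dataclasses import dataclass, field
from typing import List
from uuid import UUID, uuid4


@dataclass(frozen=True)
class TaskResource:
    """One row linking a task to any object it acts on."""
    task_id: UUID       # foreign key to a task
    content_type: str   # e.g. "Repository"
    object_id: UUID     # id of the repo/publisher/etc
    id: UUID = field(default_factory=uuid4)


repo_id = uuid4()
sync_task, publish_task = uuid4(), uuid4()

# Two different tasks reference the same repository, so the generic
# relation cannot be one-to-one.
rows = [
    TaskResource(sync_task, "Repository", repo_id),
    TaskResource(publish_task, "Repository", repo_id),
]


def tasks_for(rows: List[TaskResource], content_type: str, object_id: UUID) -> List[UUID]:
    """Return the ids of every task that references the given object."""
    return [
        r.task_id
        for r in rows
        if (r.content_type, r.object_id) == (content_type, object_id)
    ]
```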

>
> Should we use the same generic relation for ReservedResource? Currently it
> stores "resource" as a text field. Consistency seems valuable.
>

Yeah, I think that would be a good idea.

>
>
>>
>> Database Tables:
>>
>> Task
>> ---
>> ...
>> name - varchar
>>
>> TaskResource
>> ---
>> id - uuid
>> task_id - uuid (foreign key to a task)
>> content_type - varchar (e.g. "Repository")
>> object_id - uuid (generic foreign key to a repo/publisher/etc)
>>
>> REST API:
>>
>> For task names, this field will be returned when querying tasks and can
>> also be used as a filter when searching tasks. Pretty straightforward.
>>
>> For Task Resources, I'm imagining that there won't be a separate API for
>> Task Resources. Instead, task resource information will be returned along
>> with tasks in a list field such as "resources." This presents a bit of a
>> problem though: each resource will have different data depending on the
>> type of resource (repo, publisher, etc)[2]. Alternatively, we could return
>> a homogeneous list of hashes with fields like type (Publisher/Repository,
>> etc), id, and href. However, I think this is less useful to users.
>>
>
> I think this should be serialized as a simple list of URLs to the
> corresponding resources. That is REST's way of Uniformly doing Resource
> Identification. ;) This is a topic I've been meaning to email this list
> about anyway, to get us thinking more RESTful than in the past. We need to
> break away from the idea that REST API clients should know about primary
> keys, natural keys, or anything similar as the means through which to
> reference a resource. This rant by the creator of REST is a fun starting
> point for reading about how to identify resources:
>
> http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
>

+1
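
A small sketch of what "a simple list of URLs" could look like when a task is
serialized. The URL prefixes, the `resource_href` and `serialize_task`
helpers, and the task name are all assumptions for illustration; the actual
Pulp 3 URL layout was not settled in this thread.

```python
from typing import Dict, List, Tuple
from uuid import UUID

# Assumed URL layout, keyed by content type.
URL_PREFIXES: Dict[str, str] = {
    "Repository": "/api/v3/repositories/",
    "Publisher": "/api/v3/publishers/",
}


def resource_href(content_type: str, object_id: UUID) -> str:
    """Build the URL that identifies a resource."""
    return f"{URL_PREFIXES[content_type]}{object_id}/"


def serialize_task(name: str, resources: List[Tuple[str, UUID]]) -> dict:
    """Serialize a task with its resources as a flat list of URLs."""
    return {
        "name": name,
        "resources": [resource_href(ct, oid) for ct, oid in resources],
    }
```

The client never sees content_type or object_id directly; it only gets URLs
it can follow.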

>
>
>
>>
>> For filtering, I think it would be possible to filter on content_type and
>> object_id in Task Resource but I think this would be inconvenient for
>> users. I think it would be easier if we define some shortcuts (e.g.
>> repository_id, publisher_id, etc) that users can use to filter on. I'm
>> imagining one such id filter for every possible content_type.
>>
>
> If they can filter based on a resource's URI, I think that would cover the
> basic filtering use cases for this field.
>

+1
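
Filtering by URI could then be as simple as the sketch below. The hrefs, task
names, and the `filter_by_resource` helper are hypothetical, and a real
implementation would presumably resolve the URI to a database lookup rather
than scan a list in memory.

```python
from typing import Dict, List

# Hypothetical resource URLs.
REPO_ZOO = "/api/v3/repositories/zoo/"
REPO_FOO = "/api/v3/repositories/foo/"
PUBLISHER_A = "/api/v3/publishers/a/"

# Tasks as they might be serialized, with resources as a list of URLs.
TASKS: List[Dict] = [
    {"name": "pulp.repository.sync", "resources": [REPO_ZOO]},
    {"name": "pulp.publisher.publish", "resources": [REPO_ZOO, PUBLISHER_A]},
    {"name": "pulp.repository.sync", "resources": [REPO_FOO]},
]


def filter_by_resource(tasks: List[Dict], href: str) -> List[Dict]:
    """Return the tasks that reference the resource identified by href."""
    return [t for t in tasks if href in t["resources"]]
```

One filter parameter covers every resource type, so there is no need for a
separate repository_id, publisher_id, etc. filter per content_type.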

>
> --
>
> Michael Hrivnak
>
> Principal Software Engineer, RHCE
>
> Red Hat
>