[Pulp-dev] pulp3: Task Resource proposal

Wed Jul 5 19:32:21 UTC 2017

I added an updated version of the proposal, along with user stories, to the
task tag issue:

https://pulp.plan.io/issues/2482

Looking for some reviews. I’d like to create redmine issues before next
triage.

Also, I opened a task for converting the foreign key on ReservedResource:

https://pulp.plan.io/issues/2869

David

On Fri, Jun 30, 2017 at 3:30 PM, David Davis <daviddavis at redhat.com> wrote:

> Talking ReservedResource over further with @mhrivnak and @bmbouter on IRC,
> it sounds like only repositories can be ReservedResources. Therefore, we
> should probably convert the resource text field into a resource_id foreign
> key to the repositories table instead of having a generic foreign key.
>
> David
>
> On Fri, Jun 30, 2017 at 3:03 PM, David Davis <daviddavis at redhat.com>
> wrote:
>
>> On Fri, Jun 30, 2017 at 2:36 PM, Michael Hrivnak <mhrivnak at redhat.com>
>> wrote:
>>
>>> Thanks for digging into this.
>>>
>>> On Thu, Jun 29, 2017 at 3:16 PM, David Davis <daviddavis at redhat.com>
>>> wrote:
>>>
>>>> I've been working on supporting tags in Pulp 3[0] and have gotten some
>>>> feedback from various other developers. I'd like to open the discussion
>>>> to a broader audience, though, and see if anyone has feedback.
>>>>
>>>> Background:
>>>>
>>>> Currently in Pulp 2, there are two types of task tags: action and
>>>> resource. Action tags indicate the action being performed (sync, publish,
>>>> etc), while resource tags indicate the resource type (repository,
>>>> publisher, etc) and the resource id of the resource being acted upon.
>>>>
>>>
>>> For extra context, this is all a relic from before Pulp was even using
>>> celery. Back then it had a home-grown task system that all operated inside
>>> a single process, and these tags were used more extensively than today. At
>>> this point I think they are primarily used by REST API clients to have some
>>> understanding of what a task is doing. There may be no other use.
>>>
>>>
>>>>
>>>> Proposal:
>>>>
>>>> In Pulp 3, we ought to get rid of tags. For action tags, we'll simply
>>>> add a field named "name" to the task table. This will store information
>>>> about which action the task is performing (sync, publish, etc.).
>>>>
>>>
>>> I suggest these correlate directly to the task name as celery knows it,
>>> which we can (and probably should) manually set. It defaults to the python
>>> path, but it's better to have a stable value that won't change due to
>>> refactor.
>>>
>>
>> Agreed.
>>
>>
>>>
>>>
>>>>
>>>> For resource tags, these will be replaced with Task Resources. Like
>>>> resource tags, these will be a one-to-many relationship to tasks (i.e. a
>>>> task will have many task resources). These Task Resources then have a
>>>> one-to-one generic relation[1] to any object in Pulp that a task can act on.
>>>>
>>>
>>> When you say one-to-one, I think that would actually be a one-to-many,
>>> right? A GenericForeignKey? A resource needs to be able to be referenced
>>> by more than one task.
>>>
>>
>> Yes, good catch.
>>
>>
>>>
>>> Should we use the same generic relation for ReservedResource? Currently
>>> it stores "resource" as a text field. Consistency seems valuable.
>>>
>>
>> Yea, I think that would be a good idea.
>>
>>
>>>
>>>
>>>>
>>>> Database Tables:
>>>>
>>>> Task
>>>> ---
>>>> ...
>>>> name - varchar
>>>>
>>>> TaskResource
>>>> ---
>>>> id - uuid
>>>> task_id - uuid (foreign key to a task)
>>>> content_type - varchar (e.g. "Repository")
>>>> object_id - uuid (generic foreign key to a repo/publisher/etc)
>>>>
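
A minimal sketch of the two tables above, assuming SQLite via Python's
standard library instead of Pulp's real database layer. The table names,
the add_task and task_names_for helpers, and the sample task names are all
hypothetical and only illustrate how a task maps to many TaskResource rows
that point at arbitrary objects by (content_type, object_id).

```python
import sqlite3
import uuid

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE task (
    id   TEXT PRIMARY KEY,   -- uuid stored as text
    name TEXT NOT NULL       -- replaces the Pulp 2 action tag
);
CREATE TABLE task_resource (
    id           TEXT PRIMARY KEY,
    task_id      TEXT NOT NULL REFERENCES task(id),
    content_type TEXT NOT NULL,  -- e.g. "Repository"
    object_id    TEXT NOT NULL   -- id of the repo/publisher/etc.
);
""")


def add_task(name, resources):
    """Insert a task plus one task_resource row per (content_type, object_id)."""
    task_id = str(uuid.uuid4())
    conn.execute("INSERT INTO task VALUES (?, ?)", (task_id, name))
    conn.executemany(
        "INSERT INTO task_resource VALUES (?, ?, ?, ?)",
        [(str(uuid.uuid4()), task_id, ctype, oid) for ctype, oid in resources],
    )
    return task_id


def task_names_for(content_type, object_id):
    """Return the names of all tasks that reference the given object, sorted."""
    rows = conn.execute(
        "SELECT t.name FROM task t JOIN task_resource r ON r.task_id = t.id "
        "WHERE r.content_type = ? AND r.object_id = ? ORDER BY t.name",
        (content_type, object_id),
    )
    return [name for (name,) in rows]


# Hypothetical sample data: one repository touched by two tasks.
repo_id = str(uuid.uuid4())
publisher_id = str(uuid.uuid4())
add_task("sync", [("Repository", repo_id)])
add_task("publish", [("Repository", repo_id), ("Publisher", publisher_id)])
```

Looking up tasks for a resource is then a single join on content_type and
object_id, which is the query any filtering shortcut would have to run.
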
>>>> REST API:
>>>>
>>>> For task names, this field will be returned when querying tasks and can
>>>> also be used as a filter when searching tasks. Pretty straightforward.
>>>>
>>>> For Task Resources, I'm imagining that there won't be a separate API
>>>> for Task Resources. Instead, task resource information will be returned
>>>> along with tasks in a list field such as "resources." This presents a bit
>>>> of a problem though: each resource will have different data depending on
>>>> the type of resource (repo, publisher, etc)[2]. Alternatively, we could
>>>> return a homogeneous list of hashes with fields like type
>>>> (Publisher/Repository, etc), id, and href. However, I think this is less
>>>> useful to users.
>>>>
>>>
>>> I think this should be serialized as a simple list of URLs to the
>>> corresponding resources. That is REST's way of Uniformly doing Resource
>>> Identification. ;) This is a topic I've been meaning to email this list
>>> about anyway, to get us thinking more RESTful than in the past. We need to
>>> break away from the idea that REST API clients should know about primary
>>> keys, natural keys, or anything similar as the means through which to
>>> reference a resource. This rant by the creator of REST is a fun starting
>>> point for reading about how to identify resources:
>>>
>>> http://roy.gbiv.com/untangled/2008/rest-apis-must-be-hypertext-driven
>>>
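
To make the "simple list of URLs" idea concrete, here is a rough sketch of
what a serialized task might look like. The URL prefixes, the
resource_href and serialize_task helpers, and the ids are assumptions made
up for illustration; they are not Pulp's actual routes or serializers.

```python
# Hypothetical URL prefixes per resource type; not Pulp's actual routes.
URL_PREFIXES = {
    "Repository": "/api/v3/repositories/",
    "Publisher": "/api/v3/publishers/",
}


def resource_href(content_type, object_id):
    """Build the URL that identifies one resource."""
    return f"{URL_PREFIXES[content_type]}{object_id}/"


def serialize_task(task):
    """Serialize a task with its resources as a flat list of URLs."""
    return {
        "name": task["name"],
        "resources": [resource_href(c, o) for c, o in task["resources"]],
    }


# Hypothetical task acting on one repository and one publisher.
task = {"name": "sync", "resources": [("Repository", "1234"), ("Publisher", "abcd")]}
serialized = serialize_task(task)
```

The client never sees a content type or a primary key on its own, only
links it can follow.
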
>>
>> +1
>>
>>
>>>
>>>
>>>
>>>>
>>>> For filtering, I think it would be possible to filter on content_type
>>>> and object_id in Task Resource but I think this would be inconvenient for
>>>> users. I think it would be easier if we define some shortcuts (e.g.
>>>> repository_id, publisher_id, etc) that users can use to filter on. I'm
>>>> imagining one such id filter for every possible content_type.
>>>>
>>>
>>> If they can filter based on a resource's URI, I think that would cover
>>> the basic filtering use cases for this field.
>>>
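
One possible shape for filtering on a resource URI, sketched with plain
Python data instead of a real queryset. The URL prefixes and the
parse_resource_uri and filter_tasks helpers are hypothetical; the point is
only that the server can map a URI back to (content_type, object_id)
itself, so clients do not need per-type id filters.

```python
# Hypothetical URL prefixes per resource type; not Pulp's actual routes.
PREFIX_TO_TYPE = {
    "/api/v3/repositories/": "Repository",
    "/api/v3/publishers/": "Publisher",
}


def parse_resource_uri(uri):
    """Split a resource URI into (content_type, object_id).

    Raises ValueError if the URI does not match a known prefix.
    """
    for prefix, content_type in PREFIX_TO_TYPE.items():
        if uri.startswith(prefix):
            object_id = uri[len(prefix):].strip("/")
            if object_id and "/" not in object_id:
                return content_type, object_id
    raise ValueError(f"unrecognized resource URI: {uri}")


def filter_tasks(tasks, resource_uri):
    """Return the tasks that reference the resource identified by the URI."""
    wanted = parse_resource_uri(resource_uri)
    return [t for t in tasks if wanted in t["resources"]]


# Hypothetical sample tasks.
tasks = [
    {"name": "sync", "resources": [("Repository", "1234")]},
    {"name": "publish", "resources": [("Repository", "1234"), ("Publisher", "abcd")]},
    {"name": "sync", "resources": [("Repository", "5678")]},
]
matches = filter_tasks(tasks, "/api/v3/repositories/1234/")
```
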
>>
>> +1
>>
>>
>>>
>>> --
>>>
>>> Michael Hrivnak
>>>
>>> Principal Software Engineer, RHCE
>>>
>>> Red Hat
>>>
>>
>>
>