[Pulp-dev] Publish API for Plugin Writers (Pulp3)
Jeff Ortel
jortel at redhat.com
Mon Apr 24 14:51:40 UTC 2017
On 04/24/2017 06:31 AM, Mihai Ibanescu wrote:
> Jeff,
>
> A few comments to your strawman:
>
> * What is an artifact? If it is a database model, then why not call it a unit if that's what it's called
> everywhere else in the code?
In pulp3, content units and their associated files are separate. Each content unit has one or more artifacts. An RPM
content unit has exactly 1 artifact, whereas a docker content unit has 3 artifacts.
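
To make the unit/artifact split concrete, here is a rough sketch. The Artifact and ContentUnit classes below are hypothetical stand-ins (not the actual pulp3 models), and every name and path is a placeholder:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Artifact:
    """Hypothetical stand-in for one stored file; not the real pulp3 model."""
    relative_path: str  # where the file appears inside a publication
    storage_path: str   # placeholder for wherever the platform keeps the bits


@dataclass
class ContentUnit:
    """Hypothetical stand-in for a content unit and the files that belong to it."""
    type_id: str
    name: str
    artifacts: List[Artifact] = field(default_factory=list)


# An RPM unit carries exactly one artifact: the package file itself.
rpm = ContentUnit(
    type_id="rpm",
    name="example-1.0-1.noarch",
    artifacts=[Artifact("example-1.0-1.noarch.rpm", "/placeholder/storage/rpm-file")],
)

# A docker unit carries three artifacts (placeholder names only).
docker = ContentUnit(
    type_id="docker",
    name="example-image",
    artifacts=[
        Artifact("artifact-%d" % n, "/placeholder/storage/docker-%d" % n)
        for n in (1, 2, 3)
    ],
)
```

The point is only that the unit holds the metadata and the artifacts hold the files, so a publisher can walk unit.artifacts when deciding what to associate().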
> * How would you deal with metadata-only units that don't have a file representation, but show up in some kind
> of metadata (e.g. package groups / errata)? associate() doesn't seem to give me that.
The straw-man splits published items into 2 categories:

1. Files associated with content which are artifacts. Handled by associate().
   - RPMs
   - ISOs
   - Docker images
   - etc ...
2. Files generated by the publisher. Handled by add().
   - Metadata (like everything in repodata/)
   - Metadata only units like errata and package groups.

For category #2 files, the publisher uses the information stored in the content unit to generate a file in the
staging directory, then uses add() to include it in the publication.
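
A minimal sketch of the two categories side by side, with lots of hand waving. The Publication here is a toy stand-in that only records what it was given, the publish() helper is hypothetical, and "errata.txt" is a made-up file name rather than real repodata:

```python
import os
import tempfile


class Publication:
    """Toy stand-in for the straw-man Publication; it only records what was published."""

    def __init__(self, staging_dir):
        self.staging_dir = staging_dir
        self.associated = {}  # relative path -> artifact
        self.added = []       # relative paths of publisher-generated files

    def associate(self, artifact, path):
        # Category 1: the platform decides later how the artifact is exposed.
        self.associated[path] = artifact

    def add(self, path):
        # Category 2: a file (or a directory of files) already in the staging dir.
        absolute = os.path.join(self.staging_dir, path)
        if os.path.isdir(absolute):
            for root, _dirs, files in os.walk(absolute):
                for name in sorted(files):
                    full = os.path.join(root, name)
                    self.added.append(os.path.relpath(full, self.staging_dir))
        else:
            self.added.append(path)


def publish(staging_dir, artifacts, errata):
    """Hypothetical publisher body: associate artifacts, generate metadata, add it."""
    publication = Publication(staging_dir)
    for relative_path, artifact in artifacts.items():
        publication.associate(artifact, relative_path)
    # Generate a metadata file from information stored in the metadata-only units.
    repodata = os.path.join(staging_dir, "repodata")
    os.makedirs(repodata, exist_ok=True)
    with open(os.path.join(repodata, "errata.txt"), "w") as fp:
        for erratum in errata:
            fp.write(erratum + "\n")
    publication.add("repodata/")
    return publication


with tempfile.TemporaryDirectory() as staging:
    pub = publish(staging, {"packages/a.rpm": "artifact-a"}, ["ERRATUM-1"])
    associated = sorted(pub.associated)
    added = list(pub.added)
```

Errata and package groups never go through associate() in this sketch; they only show up as content of the generated file.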
> * For that matter, how would you deal with files that are not representations of units, but new artifacts?
> (e.g. repomd.xml and the like). It feels like it can be possible by extending my commit() with writing the
> metadata and then calling the parent class' commit() (which does the atomic publish), but I think that's not
> pretty.
See above ^^.
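
As for what commit() might do for the "atomic magic", one possibility is the single-symlink swap Brian describes further down the thread. This is only a sketch under that assumption; atomic_publish() and all of its parameters are hypothetical names, not platform API:

```python
import os
import shutil
import tempfile


def atomic_publish(staging_dir, storage_root, live_link, version):
    """Copy the staged tree to a permanent location, then go live with one rename."""
    permanent = os.path.join(storage_root, version)
    # symlinks=True keeps any symlinks in the staging dir as symlinks.
    shutil.copytree(staging_dir, permanent, symlinks=True)
    # Create the new symlink under a temporary name, then rename it over the
    # live one. os.replace() is an atomic rename on POSIX filesystems.
    tmp_link = live_link + ".tmp"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(permanent, tmp_link)
    os.replace(tmp_link, live_link)
    return permanent


with tempfile.TemporaryDirectory() as root:
    staging = os.path.join(root, "staging")
    os.makedirs(os.path.join(staging, "repodata"))
    with open(os.path.join(staging, "repodata", "repomd.xml"), "w") as fp:
        fp.write("<repomd/>\n")
    storage = os.path.join(root, "publications")
    os.makedirs(storage)
    live = os.path.join(root, "live")
    first = atomic_publish(staging, storage, live, "1")
    second = atomic_publish(staging, storage, live, "2")
    live_target = os.readlink(live)
    published_ok = os.path.isfile(os.path.join(live, "repodata", "repomd.xml"))
```

With this shape, the plugin never touches the live location; it only fills the staging directory and the platform owns the swap.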
>
>
> On Fri, Apr 21, 2017 at 5:09 PM, Jeff Ortel <jortel at redhat.com> wrote:
>
> I like this, Brian, and want to take it one step further. I think there is value in abstracting how a
> publication is composed. Files like metadata need to be composed by the publisher (as needed) in the
> working_dir and then "added" to the publication. Artifacts could be "associated" to the publication, and the
> platform determines how this happens (symlinks/in the DB).
>
> This assumes the Publisher is instantiated with a 'working_dir' attribute.
>
> ---------------------------------------
>
> Something like this to kick around:
>
>
> class Publication:
>     """
>     The Publication provided by the plugin API.
>
>     Examples:
>
>         A crude example with lots of hand waving.
>
>         In Publisher.publish()
>
>         >>>
>         >>> publication = Publication(self.working_dir)
>         >>>
>         >>> # Artifacts
>         >>> for artifact in []:  # artifacts
>         >>>     path = ' <determine relative path>'
>         >>>     publication.associate(artifact, path)
>         >>>
>         >>> # Metadata created in self.staging_dir <here>.
>         >>>
>         >>> publication.add('repodata/primary.xml')
>         >>> publication.add('repodata/others.xml')
>         >>> publication.add('repodata/repomd.xml')
>         >>>
>         >>> # - OR -
>         >>>
>         >>> publication.add('repodata/')
>         >>>
>         >>> publication.commit()
>     """
>
>     def __init__(self, staging_dir):
>         """
>         Args:
>             staging_dir: Absolute path to where publication is staged.
>         """
>         self.staging_dir = staging_dir
>
>     def associate(self, artifact, path):
>         """
>         Associate an artifact to the publication.
>         This could result in creating a symlink in the staging directory
>         or (later) creating a record in the db.
>
>         Args:
>             artifact: A content artifact
>             path: Relative path within the staging directory AND eventually
>                 within the published URL.
>         """
>
>     def add(self, path):
>         """
>         Add a file within the staging directory to the publication by relative path.
>
>         Args:
>             path: Relative path within the staging directory AND eventually within
>                 the published URL. When *path* is a directory, all files
>                 within the directory are added.
>         """
>
>     def commit(self):
>         """
>         When committed, the publication is atomically published.
>         """
>         # atomic magic
>
>
>
>
>
> On 04/19/2017 10:16 AM, Brian Bouterse wrote:
> > I was thinking about the design here and I wanted to share some thoughts.
> >
> > For the MVP, I think a publisher implemented by a plugin developer would write all files into the working
> > directory and the platform will "atomically publish" that data into the location configured by the repository.
> > The "atomic publish" aspect would copy/stage the files in a permanent location but would use a single symlink
> > to the top level folder to go live with the data. This would make atomic publication the default behavior.
> > This runs after the publish() implemented by the plugin developer returns, after it has written all of its
> > data to the working dir.
> >
> > Note that ^ allows the plugin writer to write the actual contents of files in the working directory
> > instead of symlinks, causing Pulp to duplicate all content on disk with every publish. That would be an
> > incredibly inefficient way to write a plugin, but it's something the platform would not explicitly prevent.
> > I'm not sure if this is something we should improve on or not.
> >
> > At a later point, we could add incremental publish, perhaps as a method on a Publisher called
> > incremental_publish(), which would only be called if the previous publish only had units added.
> >
> >
> >
> > On Mon, Apr 17, 2017 at 4:22 PM, Brian Bouterse <bbouters at redhat.com> wrote:
> >
> > For plugin writers who are writing a publisher for Pulp3, what do they need to handle during publishing
> > versus what platform handles? To make a comparison against sync, the "Download API" and "Changesets" [0] allow the
> > plugin writer to tell platform about a remote piece of content. Then platform handles creating the unit,
> > fetching it, and saving it. Will there be a similar API to support publishing to ease the burden of a
> > plugin writer? Also will this allow platform to have a structured knowledge of a publication with Pulp3?
> >
> > I wanted to try to characterize the problem statement as two separate questions:
> >
> > 1) How will units be recorded to allow platform to know which units comprise a specific publish?
> > 2) What are plugin writer's needs at publish time, and what repetitive tasks could be moved to platform?
> >
> > As a quick recap of how Pulp2 works: each publisher would write files into the working directory and
> > then they would get moved into their permanent home. There is also the incrementalPublisher base machinery,
> > which allowed for an additive publication that would use the previous publish and was "faster". Finally,
> > in Pulp2, the only record of a publication is the symlinks on the filesystem.
> >
> > I have some of my own ideas on these things, but I'll start the conversation.
> >
> > [0]: https://github.com/pulp/pulp/pull/2876
> >
> > -Brian
> >
> >
> >
> >
> > _______________________________________________
> > Pulp-dev mailing list
> > Pulp-dev at redhat.com
> > https://www.redhat.com/mailman/listinfo/pulp-dev
> >
>
>