[Pulp-dev] Publish API for Plugin Writers (Pulp3)

Mon Apr 24 11:31:13 UTC 2017

Jeff,

A few comments on your strawman:

* What is an artifact? If it is a database model, then why not call it a
unit if that's what it's called everywhere else in the code?
* How would you deal with metadata-only units that don't have a file
representation but show up in some kind of metadata (e.g. package groups /
errata)? associate() doesn't seem to cover that case.
* For that matter, how would you deal with files that are not
representations of units, but new artifacts (e.g. repomd.xml and the
like)? It seems possible by extending commit() in a subclass to write
the metadata and then call the parent class' commit() (which does the
atomic publish), but I don't think that's pretty.
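
To make that last point concrete, here is a rough sketch of what I mean by
extending commit(). The Publication stub below only stands in for the class
in your strawman (its `added` and `committed` attributes are my invention
for illustration), and RpmPublication and the placeholder repomd.xml content
are hypothetical, not real Pulp code.

```python
import os
import tempfile


class Publication:
    """Stand-in for the strawman's Publication; not a real Pulp API."""

    def __init__(self, staging_dir):
        self.staging_dir = staging_dir
        self.added = []          # assumption: relative paths recorded by add()
        self.committed = False   # assumption: flag set by commit()

    def add(self, path):
        # Only files that already exist in the staging directory can be added.
        full_path = os.path.join(self.staging_dir, path)
        if not os.path.exists(full_path):
            raise FileNotFoundError(full_path)
        self.added.append(path)

    def commit(self):
        # Placeholder for the platform's atomic publish.
        self.committed = True


class RpmPublication(Publication):
    """Hypothetical plugin subclass that writes metadata at commit time."""

    def commit(self):
        # Write the non-unit file first, then add it to the publication ...
        repodata = os.path.join(self.staging_dir, 'repodata')
        os.makedirs(repodata, exist_ok=True)
        with open(os.path.join(repodata, 'repomd.xml'), 'w') as f:
            f.write('<repomd/>\n')  # placeholder content
        self.add('repodata/repomd.xml')
        # ... and only then let the parent class do the atomic publish.
        super().commit()


staging_dir = tempfile.mkdtemp()
publication = RpmPublication(staging_dir)
publication.commit()
```

It works, but the plugin has to remember to call the parent's commit() last,
which is the part I don't find pretty.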

On Fri, Apr 21, 2017 at 5:09 PM, Jeff Ortel <jortel at redhat.com> wrote:

> I like this Brian and want to take it one step further.  I think there is
> value in abstracting how a publication is composed.  Files like metadata
> need to be composed by the publisher (as needed) in the working_dir then
> "added" to the publication.  Artifacts could be "associated" to the
> publication and the platform determines how this happens (symlinks/in the
> DB).
>
> Assuming the Publisher is instantiated with a 'working_dir' attribute.
>
> ---------------------------------------
>
> Something like this to kick around:
>
>
> class Publication:
>     """
>     The Publication provided by the plugin API.
>
>     Examples:
>
>     A crude example with lots of hand waving.
>
>     In Publisher.publish()
>
>     >>>
>     >>> publication = Publication(self.working_dir)
>     >>>
>     >>> # Artifacts
>     >>> for artifact in []:  # artifacts
>     ...     path = '<determine relative path>'
>     ...     publication.associate(artifact, path)
>     >>>
>     >>> # Metadata created in self.working_dir <here>.
>     >>>
>     >>> publication.add('repodata/primary.xml')
>     >>> publication.add('repodata/others.xml')
>     >>> publication.add('repodata/repomd.xml')
>     >>>
>     >>> # - OR -
>     >>>
>     >>> publication.add('repodata/')
>     >>>
>     >>> publication.commit()
>     """
>
>     def __init__(self, staging_dir):
>         """
>         Args:
>             staging_dir: Absolute path to where publication is staged.
>         """
>         self.staging_dir = staging_dir
>
>     def associate(self, artifact, path):
>         """
>         Associate an artifact to the publication.
>         This could result in creating a symlink in the staging directory
>         or (later) creating a record in the db.
>
>         Args:
>             artifact: A content artifact
>             path: Relative path within the staging directory AND eventually
>                   within the published URL.
>         """
>
>     def add(self, path):
>         """
>         Add a file within the staging directory to the publication by
>         relative path.
>
>         Args:
>             path: Relative path within the staging directory AND
>                   eventually within the published URL.  When *path* is
>                   a directory, all files within the directory are added.
>         """
>
>     def commit(self):
>         """
>         When committed, the publication is atomically published.
>         """
>         # atomic magic
>
>
>
>
>
> On 04/19/2017 10:16 AM, Brian Bouterse wrote:
> > I was thinking about the design here and I wanted to share some thoughts.
> >
> > For the MVP, I think a publisher implemented by a plugin developer would
> > write all files into the working directory and the platform will
> > "atomically publish" that data into the location configured by the
> > repository. The "atomic publish" aspect would copy/stage the files in a
> > permanent location but would use a single symlink to the top level folder
> > to go live with the data. This would make atomic publication the default
> > behavior. This runs after the publish() implemented by the plugin
> > developer returns, after it has written all of its data to the working
> > dir.
> >
> > Note that ^ allows for the plugin writer to write the actual contents of
> > files in the working directory instead of symlinks, causing Pulp to
> > duplicate all content on disk with every publish. That would be an
> > incredibly inefficient way to write a plugin but it's something the
> > platform would not prevent in any explicit way. I'm not sure if this is
> > something we should improve on or not.
> >
> > At a later point, we could add in the incremental publish, maybe as a
> > method on a Publisher called incremental_publish() which would only be
> > called if the previous publish only had units added.
> >
> >
> >
> > On Mon, Apr 17, 2017 at 4:22 PM, Brian Bouterse <bbouters at redhat.com> wrote:
> >
> >     For plugin writers who are writing a publisher for Pulp3, what do
> >     they need to handle during publishing versus platform? To make a
> >     comparison against sync, the "Download API" and "Changesets" [0]
> >     allow the plugin writer to tell platform about a remote piece of
> >     content. Then platform handles creating the unit, fetching it, and
> >     saving it. Will there be a similar API to support publishing to ease
> >     the burden of a plugin writer? Also will this allow platform to have
> >     a structured knowledge of a publication with Pulp3?
> >
> >     I wanted to try to characterize the problem statement as two
> >     separate questions:
> >
> >     1) How will units be recorded to allow platform to know which units
> >        comprise a specific publish?
> >     2) What are plugin writers' needs at publish time, and what
> >        repetitive tasks could be moved to platform?
> >
> >     As a quick recap of how Pulp2 works: each publisher would write
> >     files into the working directory and then they would get moved into
> >     their permanent home. Also there is the incrementalPublisher base
> >     machinery which allowed for an additive publication which would use
> >     the previous publish and was "faster". Finally, in Pulp2, the only
> >     record of a publication is the symlinks on the filesystem.
> >
> >     I have some of my own ideas on these things, but I'll start the
> >     conversation.
> >
> >     [0]: https://github.com/pulp/pulp/pull/2876
> >
> >     -Brian
> >
> >
> >
> >
> > _______________________________________________
> > Pulp-dev mailing list
> > Pulp-dev at redhat.com
> > https://www.redhat.com/mailman/listinfo/pulp-dev
> >