[Pulp-dev] Pulp 3 plugin for Chef cookbooks

David Davis daviddavis at redhat.com
Wed May 16 20:14:33 UTC 2018


This is great. I had a chance to look at your plugin and am really excited
by having chef support in Pulp.

Some responses below.


On Tue, May 15, 2018 at 2:31 PM, Simon Baatz <gmbnomis at gmail.com> wrote:

> I created the  beginnings of a Pulp 3 plugin to manage Chef cookbooks
> [1].  Currently, it supports to create repos, create cookbook content
> units, and publish repos.  A published & distributed repo will offer
> a "universe" API endpoint for tools like Berkshelf.
>
> I did not implement sync yet. I am waiting for "PendingVersion" to be
> available
> first.
>
> I ran into a couple of problems/uncertainties described below (sorry for
> the
> lengthy mail). I am new to Django, DRF, and, obviously, Pulp 3 so any
> remarks or
> suggestions are welcome:
>
> - Create Content: The plugin reads as much meta-data as possible from the
> actual
>   cookbook Artifact when creating a content unit. The motivation for this
> is:
>
>   - One doesn't need a special tool to upload content, which makes
> uploading by e.g.
>     a CI job easier.
>   - It ensures consistency between metadata stored in Pulp and the actual
>     metadata in the cookbook.
>
>   However, this requires to extract a metadata file from a gzipped tar
> archive.
>   Content unit creation is synchronous and doing this work in a
> synchronous call
>   might not be optimal (we had a discussion in this topic on the pulp-dev
>   mailing list already).
>

I agree. Can you make this an async call in the plugin or is there a need
to also make this an async call in core/other plugins?


>
> - Publication/Distribution: The metadata file ("universe") for a published
>   cookbook repository contains absolute URLs for download (i.e. these point
>   to published artifacts in a distribution).
>
>   The current publication/distribution concept seems to have the underlying
>   assumption that a Publication is fully relocatable: PublishedMetadata
>   artifacts are created by the publishing task and creating a Distribution
> is a
>   synchronous call that determines the base path of the published
> artifacts.
>
>   This causes a problem with said "universe" file. Ideally, it could be
>   pre-computed (it lists all artifacts in the repo).  However, this can't
> be
>   done AFAIK since the base path is unknown at publication time and one
> can't
>   generate additional metadata artifacts for a specific distribution later.
>
>   The best solution I came up with was to implement a dynamic API. To
> reduce the
>   amount of work to be done, the API does a simple string replacement:
> During
>   publication, the universe file is pre-computed using placeholders. In the
>   dynamic API these placeholders are replaced with the actual base URL of
> the
>   distribution.
>
>   However, I would prefer not to be forced to implement a dynamic API for
> static
>   information. Is there a way to solve this differently?
>

Are relative paths an option? If not, I don’t think there’s currently an
alternative to a live api. I probably wouldn’t even use a published
metadata file TBH. I wonder though if there’s maybe functionality we could
add to do this.


>
> - Content-Type Header: The "universe" file is JSON and must have a
> corresponding
>   "Content-Type" HTTP header.
>
>   However, content type of the development server seems to be "text/html"
> by
>   default for all artifacts. Apparently, I can't set the content-type of a
>   (meta-data) artifact(?)
>

I think this goes back to not using a published metadata file to serve up
your api. However, I could see how it might be useful.


>
> - Getting the base url of a distribution in the dynamic API is surprisingly
>   complicated and depends on the inner structure of pulp core (I took the
>   implementation from 'pulp_ansible'). IMHO, a well defined way to obtain
> it
>   should be part of the plugin API.
>

I agree. Opened: https://pulp.plan.io/issues/3677


>
> - "Content" class: The way to use only a single artifact in Content (like
> done
>   in pulp_file) seems to require in-depth knowledge of the
>   Content/ContentSerializer class and its inner workings.
>
>   The downside of this can already be experienced in the "pulp_file"
> plugin: The
>   fields "id" and "created" are missing, since the implementation there
> just
>   overrides the 'fields' in the serializer).
>
>   I think two Content types should be part of the plugin API: one with
>   multiple artifacts, and a simpler one with a single artifact
>

I began working on this:

https://github.com/pulp/pulp/pull/3476

But there was opposition around having the Content model be responsible for
generating the relative path on the Content Artifact. I’ve opened an issue
to see if there’s another way to do this (e.g. using serializers):

https://pulp.plan.io/issues/3678


>
> - Uploading an Artifact that already exists returns an error, which is
>   annoying if you use http/curl to import artifacts. Suppose some other
> user
>   uploaded an artifact in the past. You won't get useful
>   information from the POST request uploading the same artifact:
>
>   HTTP/1.1 400 Bad Request
>   Allow: GET, POST, HEAD, OPTIONS
>   Content-Type: application/json
>   Date: Sat, 12 May 2018 17:50:54 GMT
>   Server: WSGIServer/0.2 CPython/3.6.2
>   Vary: Accept
>
>   {
>       "non_field_errors": [
>           "sha512 checksum must be unique."
>       ]
>   }
>
>   This forced me to do something like:
>
>     ...
>     sha256=$(sha256sum "$targz" | awk '{print $1}')
>     ARTIFACT_HREF=$(http :8000/pulp/api/v3/artifacts/?sha256=$sha256 | jq
> -r '.results[0]._href')
>     if [[ $ARTIFACT_HREF == "null" ]]; then
>         echo uploading artifact $cookbook_name sha256: $sha256
>         http --form POST http://localhost:8000/pulp/api/v3/artifacts/
> file@$targz
>         ARTIFACT_HREF=$(http :8000/pulp/api/v3/artifacts/?sha256=$sha256
> | jq -r '.results[0]._href')
>         ...
>
>   Perhaps a "303 See Other" to the existing artifact would help here.
>
>
>
Why not just do something like:

http --form POST http://localhost:8000/pulp/api/v3/artifacts/ file@$targz
|| true
ARTIFACT_HREF=$(http :8000/pulp/api/v3/artifacts/?sha256=$sha256 | jq -r
'.results[0]._href’)
if [[ $ARTIFACT_HREF == “null” ]]; then exit 1; fi

The error message could be more helpful though. It should probably contain
the existing artifact’s href. I looked at 303 and am a bit ambivalent
toward using it.


>
> [1]: https://github.com/gmbnomis/pulp_cookbook
>
> _______________________________________________
> Pulp-dev mailing list
> Pulp-dev at redhat.com
> https://www.redhat.com/mailman/listinfo/pulp-dev
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/pulp-dev/attachments/20180516/6d735e6f/attachment.htm>


More information about the Pulp-dev mailing list