<div dir="ltr"><div>Thanks for all the details. I would like to provide Pulp 3 users with a similar feature. In order to do that, the archive produced by Pulp will need to include all that extra metadata that comes from Katello right now. Pulp should support two use cases:</div><div><br></div><div> - As a user, I can generate an archive by specifying a list of pulp_hrefs. <br></div><div> - As a user, I can import an archive that was generated on another Pulp instance.</div><div><br></div><div>The archive would contain database migrations needed to restore all the resources. It would also have all the files needed to back the artifacts.<br></div><div><br></div><div>Users could then provide a list of repository versions, publications, and distributions when creating an archive. Once the archive is imported, Pulp can serve the content without having to republish. </div><div><br></div><div><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, Feb 20, 2020 at 9:53 AM Justin Sherrill <<a href="mailto:jsherril@redhat.com" target="_blank">jsherril@redhat.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>There are two different forms of export today in katello:</p>
<p>Legacy version: <br>
</p>
<p> * Uses pulp2's export functionality</p>
<p> * Takes the tarball as is</p>
<p>"New" Version</p>
<p> * Just copies published repository as is (following symlinks)</p>
<p> * Adds own 'katello' metadata to existing tarball</p>
<p><br>
</p>
<p>I would imagine that with pulp3 we would somewhat combine these
two approaches and take the pulp3 generated export file and add in
a metadata file of some sort.<br>
</p>
<p>Justin<br>
</p>
<div>On 2/19/20 2:28 PM, Dennis Kliban
wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div dir="ltr">Thank you for the details. More questions inline.<br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Feb 19, 2020 at 2:04
PM Justin Sherrill <<a href="mailto:jsherril@redhat.com" target="_blank">jsherril@redhat.com</a>> wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>The goal from our side is to give the user a very similar
experience. Today the user would:</p>
<p>* run a command (for example, something similar to
hammer content-view version export
--content-view-name=foobar --version=1.0)</p>
<p>* this creates a tarball on disk</p>
</div>
</blockquote>
<div>What all is in the tarball? Is this just a repository
export created by Pulp or is there extra information from
the Katello db?<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>* they copy the tarball to external media</p>
<p>* they move the external media to the disconnected
katello</p>
<p>* they run 'hammer content-view version import
--export-tar=/path/to/tarball'</p>
</div>
</blockquote>
<div>Does Katello untar this archive, create a repository in
Pulp, sync from the directory containing the unpacked archive, and
then publish? <br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p>I don't see this changing much for the user; anything
additional that needs to be done in Pulp can be done
behind the CLI/API in Katello. Thanks!<br>
</p>
</div>
</blockquote>
<div><br>
</div>
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p> </p>
<p>Justin<br>
</p>
<div>On 2/19/20 12:52 PM, Dennis Kliban wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">In Katello that uses Pulp 2, what steps
does the user need to take when importing an export
into an air-gapped environment? I am concerned about
making the process more complicated than what the user
is already used to. <br>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Feb 19, 2020
at 11:20 AM David Davis <<a href="mailto:daviddavis@redhat.com" target="_blank">daviddavis@redhat.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div dir="ltr">Thanks for the responses so far. I
think we could export publications along with the
repo version by exporting any publication that
points to a repo version.
<div><br>
</div>
<div>My concern with exporting repositories is
that users will probably get a bunch of content
they don't care about if they want to export a
single repo version. That said, if users do want
to export entire repos, we could add this
feature later I think?<br clear="all">
<div>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div><br>
</div>
<div>David<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
</div>
</div>
<br>
<div class="gmail_quote">
<div dir="ltr" class="gmail_attr">On Wed, Feb 19,
2020 at 10:30 AM Justin Sherrill <<a href="mailto:jsherril@redhat.com" target="_blank">jsherril@redhat.com</a>>
wrote:<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
<div>
<p><br>
</p>
<div>On 2/14/20 1:09 PM, David Davis wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>Grant and I met today to discuss
importers and exporters[0] and we'd like
some feedback before we proceed with the
design. To sum up this feature briefly:
users can export a repository version
from one Pulp instance and import it to
another. </div>
<div><br>
</div>
<div># Master/Detail vs Core</div>
<div><br>
</div>
<div>So one fundamental question is
whether we should use a Master/Detail
approach or just have core control the
flow but call out to plugins to get
export formats.</div>
<div><br>
</div>
<div>To give some background: we currently
define Exporters (e.g., FileSystemExporter)
in core as Master models. Plugins extend
this model, which allows them to
configure or customize the Exporter.
This was necessary because some plugins
need to export Publications (along with
repository metadata) while other plugins
that don't have Publications or metadata
export RepositoryVersions.</div>
<div><br>
</div>
<div>The other option is to have core
handle the workflow. The user would call
a core endpoint and provide a
RepositoryVersion. This would work
because for importing/exporting, you
wouldn't ever use Publications because
metadata won't be used for importing
back into Pulp. If needed, core could
provide a way for plugin writers to
write custom handlers/exporters for
content types.</div>
<div><br>
</div>
<div>If we go with the second option, the
question then becomes whether we should
divorce the concept of Exporters and
import/export. Or do we also switch
Exporters from Master/Detail to core
only?</div>
<div><br>
</div>
<div># Foreign Keys</div>
<div><br>
</div>
<div>Content can be distributed across
multiple tables (e.g., UpdateRecord has
UpdateCollection, etc.). In our export,
we could either use primary keys (UUIDs)
or natural keys to relate records. The
former assumes that UUIDs are unique
across Pulp instances. The safer but
more complex alternative is to use
natural keys. This would involve storing
a set of fields on a record that would
be used to identify a related record.</div>
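<div><br>
</div>
<div>A minimal sketch of the natural-key option, under stated
assumptions: the records are plain dicts rather than Pulp models, the
natural key of an UpdateRecord is taken to be (id, version), and the
advisory id and the function names are invented for illustration.
The exported child carries the parent's natural key instead of its
UUID, so the importer can re-link it to a local row whose primary key
differs.</div>
<pre>
```python
import uuid


def export_collection(collection, record):
    """Serialize a collection, referring to its parent by natural key.

    Both arguments are hypothetical dicts, not Pulp's actual models.
    """
    return {
        "name": collection["name"],
        # Fields that identify the parent on any instance (assumed key).
        "update_record": {"id": record["id"], "version": record["version"]},
    }


def import_collection(exported, records_by_natural_key):
    """Re-link an exported collection to the matching local parent row."""
    ref = exported["update_record"]
    parent = records_by_natural_key[(ref["id"], ref["version"])]
    return {
        "pk": uuid.uuid4(),  # fresh primary key on the importing instance
        "name": exported["name"],
        "update_record_pk": parent["pk"],
    }


# Source instance: primary keys are local UUIDs.
src_record = {"pk": uuid.uuid4(), "id": "EXAMPLE-2020:0001", "version": "1"}
src_collection = {"pk": uuid.uuid4(), "name": "collection-0"}
exported = export_collection(src_collection, src_record)

# Destination instance: same advisory, different primary key.
dst_record = {"pk": uuid.uuid4(), "id": "EXAMPLE-2020:0001", "version": "1"}
index = {(dst_record["id"], dst_record["version"]): dst_record}
imported = import_collection(exported, index)
```
</pre>
<div>No source UUID appears in the exported data, which is what makes
this safer than relating records by primary key; the cost is that every
related model needs an agreed set of identifying fields.</div>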
<div><br>
</div>
<div># Incremental Exports</div>
<div><br>
</div>
<div>There are two big pieces of data
contained in an export: the dataset of
Content from the database and the
artifact files. An incremental export
cuts down on the size of an export by
only exporting the differences. However,
when performing an incremental export,
we could still export the complete
dataset instead of just a set of
differences
(additions/removals/updates). This
approach would be simpler and it would
allow us to ensure that the new repo
version matches the exported repo
version exactly. It would, however,
increase the export size, though not by much
I think: probably some number of
megabytes at most.</div>
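<div><br>
</div>
<div>The approach above could be sketched roughly as follows. This
assumes each repo version can be reduced to a mapping from a content
key to the sha256 of its artifact; that shape, the function name, and
the sample values are all hypothetical. The plan ships the complete
content dataset but only the artifact files that the previous export
did not already include.</div>
<pre>
```python
def plan_incremental_export(previous_version, current_version):
    """Plan an export with all content records but only new artifact files.

    Both arguments are hypothetical dicts mapping a content key to the
    sha256 of its artifact; this is not Pulp's real data model.
    """
    already_shipped = set(previous_version.values())
    wanted = set(current_version.values())
    return {
        # Complete dataset, so the importer can rebuild the version exactly.
        "content": dict(current_version),
        # Only the artifact files the destination does not have yet.
        "artifacts": sorted(wanted - already_shipped),
    }


v1 = {"foo-1.0.rpm": "sha-aaa", "bar-1.0.rpm": "sha-bbb"}
v2 = {"bar-1.0.rpm": "sha-bbb", "baz-2.0.rpm": "sha-ccc"}
plan = plan_incremental_export(v1, v2)
print(plan["artifacts"])
```
</pre>
<div>Because the content dataset is complete, removals need no special
handling: anything absent from "content" is simply not part of the
new repo version.</div>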
</div>
</blockquote>
<p>If it's simpler, I would go with that.
Saving even ~100-200 MB isn't that big of a
deal IMO. The biggest savings are in the RPM
content. <br>
</p>
<p><br>
</p>
<blockquote type="cite">
<div dir="ltr">
<div><br>
</div>
<div>[0] <a href="https://pulp.plan.io/issues/6134" target="_blank">https://pulp.plan.io/issues/6134</a></div>
<div><br>
</div>
<div>
<div>
<div dir="ltr">
<div dir="ltr">
<div>
<div dir="ltr">
<div>
<div dir="ltr">
<div>David<br>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
</div>
<br>
<fieldset></fieldset>
<pre>_______________________________________________
Pulp-dev mailing list
<a href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a>
<a href="https://www.redhat.com/mailman/listinfo/pulp-dev" target="_blank">https://www.redhat.com/mailman/listinfo/pulp-dev</a>
</pre>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</blockquote>
</div>
</div>
</blockquote>
</div>
</blockquote></div></div></div>