<div dir="ltr"><div>I am in favour of letting core handle the workflow, with the user providing the RepositoryVersion. <br></div><div>I don't know whether it makes sense in general to export a publication, especially once we get to the incremental export implementation.</div><div><br></div><div>I do think we should separate the Export and Exporter concepts:</div><div>- when exporting/importing a repo version, it makes sense to export only the repo version, so we can then plug it in.</div><div>- the Exporter workflow (rsync, for example) will need to "export" the repo version content together with the repodata (publication), if available, so that the content is already consumable when the task finishes. So I think Exporter should stay Master/Detail, because the presence or absence of a publication depends on the plugin.</div><div><br></div><div>+1 to UUIDs</div><div><br></div><div>I am not sure (or maybe I am not following) whether exporting the complete dataset will help if other repo versions have been created between the full export and the incremental export.<br></div><div><div><div dir="ltr" data-smartmail="gmail_signature"><div dir="ltr"><div><div dir="ltr"><br><br>--------<br>Regards,<br><br>Ina Panova<br>Senior Software Engineer| Pulp| Red Hat Inc.<br><br>"Do not go where the path may lead,<br> go instead where there is no path and leave a trail."<br></div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, Feb 14, 2020 at 7:11 PM David Davis &lt;<a href="mailto:daviddavis@redhat.com" target="_blank">daviddavis@redhat.com</a>&gt; wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>Grant and I met today to discuss importers and exporters[0] and we'd like some feedback before we proceed with the design. 
To sum up this feature briefly: users can export a repository version from one Pulp instance and import it into another. </div><div><br></div><div># Master/Detail vs Core</div><div><br></div><div>So one fundamental question is whether we should use a Master/Detail approach or just have core control the flow but call out to plugins to get export formats.</div><div><br></div><div>To give some background: we currently define Exporters (i.e. FileSystemExporter) in core as Master models. Plugins extend this model, which allows them to configure or customize the Exporter. This was necessary because some plugins need to export Publications (along with repository metadata) while other plugins that don't have Publications or metadata export RepositoryVersions.</div><div><br></div><div>The other option is to have core handle the workflow. The user would call a core endpoint and provide a RepositoryVersion. This would work because importing/exporting would never use Publications: the metadata isn't used when importing back into Pulp. If needed, core could provide a way for plugin writers to write custom handlers/exporters for content types.</div><div><br></div><div>If we go with the second option, the question then becomes whether we should divorce the concept of Exporters from import/export. Or do we also switch Exporters from Master/Detail to core only?</div><div><br></div><div># Foreign Keys</div><div><br></div><div>Content can be distributed across multiple tables (e.g. UpdateRecord has UpdateCollection, etc). In our export, we could either use primary keys (UUIDs) or natural keys to relate records. The former assumes that UUIDs are unique across Pulp instances. The safer but more complex alternative is to use natural keys. 
This would involve storing a set of fields on a record that would be used to identify a related record.</div><div><br></div><div># Incremental Exports</div><div><br></div><div>There are two big pieces of data contained in an export: the dataset of Content from the database and the artifact files. An incremental export cuts down on the size of an export by only exporting the differences. However, when performing an incremental export, we could still export the complete dataset instead of just a set of differences (additions/removals/updates). This approach would be simpler and it would allow us to ensure that the new repo version matches the exported repo version exactly. It would however increase the export size but not by much I think--probably some number of megabytes at most.</div><div><br></div><div>[0] <a href="https://pulp.plan.io/issues/6134" target="_blank">https://pulp.plan.io/issues/6134</a></div><div><br></div><div><div><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><div><div dir="ltr"><div>David<br></div></div></div></div></div></div></div></div></div></div></blockquote></div>