[Pulp-dev] Changesets Challenges

Austin Macdonald amacdona at redhat.com
Tue Apr 10 14:00:02 UTC 2018


>
>
> 1. There is redundant "differencing" code in all plugins. The Changeset
> interface requires the plugin writer to determine what units need to be
> added and those to be removed. This requires all plugin writers to write
> the same non-trivial differencing code over and over. For example, you can
> see the same non-trivial differencing code present in pulp_ansible
> <https://github.com/pulp/pulp_ansible/blob/d0eb9d125f9a6cdc82e2807bcad38749967a1245/pulp_ansible/app/tasks/synchronizing.py#L217-L306>
> , pulp_file
> <https://github.com/pulp/pulp_file/blob/30afa7cce667b57d8fe66d5fc1fe87fd77029210/pulp_file/app/tasks/synchronizing.py#L114-L193>,
> and pulp_python
> <https://github.com/pulp/pulp_python/blob/066d33990e64b5781c8419b96acaf2acf1982324/pulp_python/app/tasks/sync.py#L172-L223>.
> Line-wise, this "differencing" code makes up a large portion (maybe 50%) of
> the sync code itself in each plugin.
>

I agree that core could provide some of this logic. However, I'm not sure
that the changeset is the right scope. The more monolithic the tool, the
less flexible we will be, so I hope that we can provide a tool that is
separate from the changeset. That way, a plugin can determine for itself
what needs to be added and removed if the general approach doesn't work
for it.
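
To make the idea of a separate tool concrete, here is a rough sketch of a
standalone differencing helper. It is only an illustration: the names
find_delta, Delta, and the mirror flag are hypothetical and are not part of
pulpcore or the Changeset API. It assumes each content unit can be reduced
to a hashable natural key.

```python
from typing import Hashable, Iterable, NamedTuple, Set


class Delta(NamedTuple):
    """Keys to add to and remove from a repository (hypothetical type)."""
    additions: Set[Hashable]
    removals: Set[Hashable]


def find_delta(remote_keys: Iterable[Hashable],
               local_keys: Iterable[Hashable],
               mirror: bool = True) -> Delta:
    """Compare remote and local natural keys.

    Assumption: when mirror is False the sync is additive, so nothing
    is scheduled for removal.
    """
    remote = set(remote_keys)
    local = set(local_keys)
    additions = remote - local  # on the remote, missing locally
    removals = local - remote if mirror else set()  # local only
    return Delta(additions, removals)


delta = find_delta(remote_keys={"a", "b", "c"}, local_keys={"b", "d"})
```

A plugin that needs different semantics could skip the helper and compute
its own additions and removals, which is the flexibility I'd like to keep.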

> 2. Plugins can't do end-to-end stream processing. The Changesets themselves
> do stream processing, but when you call into changeset.apply_and_drain()
> you have to have fully parsed the metadata already. Currently when fetching
> all metadata from Galaxy, pulp_ansible takes about 380 seconds (6+ min).
> This means that the actual Changeset content downloading starts 380 seconds
> later than it could. At the heart of the problem, the fetching+parsing of
> the metadata is not part of the stream processing.
>

I can see how this isn't ideal. I'm curious to see how you would address it.
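
For the sake of discussion, one possible shape for end-to-end streaming is
a chain of generators, so that downloading can begin as soon as the first
page of metadata is parsed. This is a toy sketch, not a proposal for the
actual interface: fetch_pages, parse_metadata, and download are hypothetical
stand-ins, and the "pages" are hard-coded lists rather than real Galaxy
responses.

```python
from typing import Iterable, Iterator, List


def fetch_pages(log: List[str]) -> Iterator[List[str]]:
    """Stand-in for paginated metadata fetching (hard-coded pages)."""
    for number, page in enumerate([["a", "b"], ["c"]], start=1):
        log.append(f"fetch page {number}")
        yield page


def parse_metadata(pages: Iterable[List[str]]) -> Iterator[str]:
    """Yield each unit as soon as its page has been parsed."""
    for page in pages:
        for unit in page:
            yield unit


def download(units: Iterable[str], log: List[str]) -> Iterator[str]:
    """Stand-in for content downloading; consumes units lazily."""
    for unit in units:
        log.append(f"download {unit}")
        yield unit


log: List[str] = []
synced = list(download(parse_metadata(fetch_pages(log)), log))
```

In this sketch the log shows the downloads for page 1 happening before
page 2 is fetched, which is the interleaving that is missing today.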



On Mon, Apr 9, 2018 at 5:38 AM, Milan Kovacik <mkovacik at redhat.com> wrote:

> Brian,
>
> thanks for the proposal!
>
> On Fri, Apr 6, 2018 at 4:15 PM, Brian Bouterse <bbouters at redhat.com>
> wrote:
> > Several plugins have started using the Changesets including pulp_ansible,
> > pulp_python, pulp_file, and perhaps others. The Changesets provide several
> > distinct points of value which are great, but there are two challenges I
> > want to bring up. I want to focus only on the problem statements first.
> >
> > 1. There is redundant "differencing" code in all plugins. The Changeset
> > interface requires the plugin writer to determine what units need to be
> > added and those to be removed. This requires all plugin writers to write
> > the same non-trivial differencing code over and over. For example, you can
> > see the same non-trivial differencing code present in pulp_ansible,
> > pulp_file, and pulp_python. Line-wise, this "differencing" code makes up a
> > large portion (maybe 50%) of the sync code itself in each plugin.
>
> +1; DRY is always better
>
> >
> > 2. Plugins can't do end-to-end stream processing. The Changesets
> > themselves do stream processing, but when you call into
> > changeset.apply_and_drain() you have to have fully parsed the metadata
> > already. Currently when fetching all metadata from Galaxy, pulp_ansible
> > takes about 380 seconds (6+ min). This means that the actual Changeset
> > content downloading starts 380 seconds later than it could. At the heart
> > of the problem, the fetching+parsing of the metadata is not part of the
> > stream processing.
> >
> > Do you see the same challenges I do? Are these the right problem
> > statements? I think with clear problem statements a solution will be easy
> > to see and agree on.
>
> Totally, esp. on these, easier-to-see-the-value-of ones.
>
> Cheers,
> milan
>
> >
> > Thanks!
> > Brian
> >
> > _______________________________________________
> > Pulp-dev mailing list
> > Pulp-dev at redhat.com
> > https://www.redhat.com/mailman/listinfo/pulp-dev
> >
>