<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
</head>
<body text="#000000" bgcolor="#FFFFFF">
<br>
<br>
<div class="moz-cite-prefix">On 04/11/2018 10:59 AM, Brian Bouterse
wrote:<br>
</div>
<blockquote type="cite"
cite="mid:CAAcvrTHGeceNxkTJzNg1KPPFJ7u48NSo4VUtuDjm=yeeoydyzQ@mail.gmail.com">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Tue, Apr 10, 2018 at 10:43 AM,
Jeff Ortel <span dir="ltr"><<a
href="mailto:jortel@redhat.com" target="_blank"
moz-do-not-send="true">jortel@redhat.com</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="HOEnZb">
<div class="h5">
<div text="#000000" bgcolor="#FFFFFF"> <br>
<div
class="m_-5307126087037616587moz-forward-container"><br>
<br>
<table
class="m_-5307126087037616587moz-email-headers-table"
border="0" height="86" cellspacing="0"
cellpadding="0" width="380">
<tbody>
<tr>
<th valign="BASELINE" align="RIGHT"
nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
<tr>
<th valign="BASELINE" align="RIGHT"
nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
<tr>
<th valign="BASELINE" align="RIGHT"
nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
<tr>
<th valign="BASELINE" align="RIGHT"
nowrap="nowrap"><br>
</th>
<td><br>
</td>
</tr>
</tbody>
</table>
<br>
<div class="m_-5307126087037616587moz-cite-prefix">On
04/06/2018 09:15 AM, Brian Bouterse wrote:<br>
</div>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div>
<div>Several plugins have started
using the Changesets including
pulp_ansible, pulp_python,
pulp_file, and perhaps others. The
Changesets provide several distinct
points of value which are great, but
there are two challenges I want to
bring up. I want to focus only on
the problem statements first.<br>
<br>
</div>
1. There is redundant "differencing"
code in all plugins. The Changeset
interface requires the plugin writer
to determine what units need to be
added and those to be removed. This
requires all plugin writers to write
the same non-trivial differencing code
over and over. For example, you can
see the same non-trivial differencing
code present in <a
href="https://github.com/pulp/pulp_ansible/blob/d0eb9d125f9a6cdc82e2807bcad38749967a1245/pulp_ansible/app/tasks/synchronizing.py#L217-L306"
target="_blank"
moz-do-not-send="true">pulp_ansible</a>,
<a
href="https://github.com/pulp/pulp_file/blob/30afa7cce667b57d8fe66d5fc1fe87fd77029210/pulp_file/app/tasks/synchronizing.py#L114-L193"
target="_blank"
moz-do-not-send="true">pulp_file</a>,
and <a
href="https://github.com/pulp/pulp_python/blob/066d33990e64b5781c8419b96acaf2acf1982324/pulp_python/app/tasks/sync.py#L172-L223"
target="_blank"
moz-do-not-send="true">pulp_python</a>.
Line-wise, this "differencing" code
makes up a large portion (maybe 50%)
of the sync code itself in each
plugin.<br>
</div>
</div>
</div>
</div>
</div>
</blockquote>
<br>
Ten lines of trivial set logic hardly seems like a
big deal but any duplication is worth exploring. <br>
</div>
</div>
</div>
</div>
</blockquote>
<div>It's more than ten lines. Take pulp_ansible for
example. By my count (the linked to section) it's 89
lines, which out of 306 lines of plugin code for sync is
29% of extra redundant code. The other plugins have
similar numbers. So with those numbers in mind, what do
you think?<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
I was counting the lines (w/o comments) in find_delta() based on the
linked code. Which functions are you counting?<br>
<br>
<blockquote type="cite"
cite="mid:CAAcvrTHGeceNxkTJzNg1KPPFJ7u48NSo4VUtuDjm=yeeoydyzQ@mail.gmail.com">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="HOEnZb">
<div class="h5">
<div text="#000000" bgcolor="#FFFFFF">
<div
class="m_-5307126087037616587moz-forward-container">
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div>
<div><br>
</div>
2. Plugins can't do end-to-end stream
processing. The Changesets themselves do
stream processing, but when you call
into changeset.apply_and_drain() you
have to have fully parsed the metadata
already. Currently when fetching all
metadata from Galaxy, pulp_ansible takes
about 380 seconds (6+ min). This means
that the actual Changeset content
downloading starts 380 seconds later
than it could. At the heart of the
problem, the fetching+parsing of the
metadata is not part of the stream
processing.<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
The additions/removals can be any interable (like
generator) and by using ChangeSet.apply() and
iterating the returned object, the pluign can
"turn the crank" while downloading and processing
the metadata. The ChangeSet.apply_and_drain() is
just a convenience method. I don't see how this
is a limitation of the ChangeSet.<br>
</div>
</div>
</div>
</div>
</blockquote>
<div><br>
</div>
<div>That is new info for me (and maybe everyone). OK so
Changesets have two interfaces. apply() and
apply_and_drain(). Why do we have two interfaces when
apply() can support all existing use cases (that I know
of) and do end-to-end stream processing but
apply_and_drain() cannot? I see all of our examples (and
all of our new plugins) using apply_and_drain().<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
The ChangeSet.apply() was how I designed (and documented) it. Not
sure when/who added the apply_and_drain(). +1 for removing it.<br>
<br>
<blockquote type="cite"
cite="mid:CAAcvrTHGeceNxkTJzNg1KPPFJ7u48NSo4VUtuDjm=yeeoydyzQ@mail.gmail.com">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class="HOEnZb">
<div class="h5">
<div text="#000000" bgcolor="#FFFFFF">
<div
class="m_-5307126087037616587moz-forward-container">
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div>
<div><br>
</div>
Do you see the same challenges I do? Are
these the right problem statements? I
think with clear problem statements a
solution will be easy to see and agree on.<br>
</div>
</div>
</div>
</blockquote>
<br>
I'm not convinced that these are actual
problems/challenges that need to be addressed in
the near term.<br>
<br>
<blockquote type="cite">
<div dir="ltr">
<div>
<div><br>
</div>
Thanks!<br>
</div>
Brian<br>
</div>
<br>
<fieldset
class="m_-5307126087037616587mimeAttachmentHeader"></fieldset>
<br>
<pre>______________________________<wbr>_________________
Pulp-dev mailing list
<a class="m_-5307126087037616587moz-txt-link-abbreviated" href="mailto:Pulp-dev@redhat.com" target="_blank" moz-do-not-send="true">Pulp-dev@redhat.com</a>
<a class="m_-5307126087037616587moz-txt-link-freetext" href="https://www.redhat.com/mailman/listinfo/pulp-dev" target="_blank" moz-do-not-send="true">https://www.redhat.com/<wbr>mailman/listinfo/pulp-dev</a>
</pre>
</blockquote>
<br>
</div>
</div>
</div>
</div>
<br>
______________________________<wbr>_________________<br>
Pulp-dev mailing list<br>
<a href="mailto:Pulp-dev@redhat.com"
moz-do-not-send="true">Pulp-dev@redhat.com</a><br>
<a href="https://www.redhat.com/mailman/listinfo/pulp-dev"
rel="noreferrer" target="_blank" moz-do-not-send="true">https://www.redhat.com/<wbr>mailman/listinfo/pulp-dev</a><br>
<br>
</blockquote>
</div>
<br>
</div>
</div>
</blockquote>
<br>
</body>
</html>