<div dir="ltr"><div>I wrote up a description of the opportunity I see here [0]. I put a high level pro/con analysis below. I would like feedback on (a) if this adequately addresses the problem statements, (b) if there are alternatives, and (c) does this improve the plugin wrtier's experience enough to adopt this?<br><br></div><div>pros:<br></div><div>* significantly less plugin code to write. Compare the Thing example code versus the current docs.<br></div><div>* Higher performing with metadata downloading and parsing being included in stream processing. This causes sync's for pulp_ansible to start 6+ min earlier.<br></div><div><br>cons:<br></div><div>* Progress reporting doesn't know how many things it's processing (it's a stream). So user's would see progress as "X things completed", not "X of Y things completed". Y can't be known until just before the stream processing completes otherwise it's not stream processing.<br><br></div><div>[0]: <a href="https://pulp.plan.io/issues/3570">https://pulp.plan.io/issues/3570</a><br><br></div><div>Thanks!<br></div><div>Brian<br></div><div><br><br></div></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Apr 12, 2018 at 7:12 PM, Jeff Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF"><div><div class="h5">
    <br>
    <br>
    <div class="m_-6788984366494356404moz-cite-prefix">On 04/12/2018 04:00 PM, Brian Bouterse
      wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra"><br>
          <div class="gmail_quote">On Thu, Apr 12, 2018 at 11:53 AM,
            Jeff Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
            wrote:<br>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                  <br>
                  <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615moz-cite-prefix">On
                    04/12/2018 10:01 AM, Brian Bouterse wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div dir="ltr"><br>
                      <div class="gmail_extra"><br>
                        <div class="gmail_quote">On Wed, Apr 11, 2018 at
                          6:07 PM, Jeff Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
                          wrote:<br>
                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                            <div text="#000000" bgcolor="#FFFFFF"><span>
                                <br>
                                <br>
                                <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278moz-cite-prefix">On
                                  04/11/2018 03:29 PM, Brian Bouterse
                                  wrote:<br>
                                </div>
                                <blockquote type="cite">
                                  <div dir="ltr">I think we should look
                                    into this in the near-term. Changing
                                    an interface on an object used by
                                    all plugins will be significantly
                                    easier, earlier.<br>
                                    <br>
                                    <div>
                                      <div>
                                        <div>
                                          <div class="gmail_extra"><br>
                                            <div class="gmail_quote">On
                                              Wed, Apr 11, 2018 at 12:25
                                              PM, Jeff Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
                                              wrote:<br>
                                              <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                <div text="#000000" bgcolor="#FFFFFF"><span>
                                                    <br>
                                                    <br>
                                                    <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045moz-cite-prefix">On
                                                      04/11/2018 10:59
                                                      AM, Brian Bouterse
                                                      wrote:<br>
                                                    </div>
                                                    <blockquote type="cite">
                                                      <div dir="ltr"><br>
                                                        <div class="gmail_extra"><br>
                                                          <div class="gmail_quote">On
                                                          Tue, Apr 10,
                                                          2018 at 10:43
                                                          AM, Jeff Ortel
                                                          <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
                                                          wrote:<br>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045HOEnZb">
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045h5">
                                                          <div text="#000000" bgcolor="#FFFFFF"> <br>
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-forward-container"><br>
                                                          <br>
                                                          <table class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-email-headers-table" width="380" cellspacing="0" cellpadding="0" height="86" border="0">
                                                          <tbody>
                                                          <tr>
                                                          <th valign="BASELINE" nowrap align="RIGHT"><br>
                                                          </th>
                                                          <td><br>
                                                          </td>
                                                          </tr>
                                                          <tr>
                                                          <th valign="BASELINE" nowrap align="RIGHT"><br>
                                                          </th>
                                                          <td><br>
                                                          </td>
                                                          </tr>
                                                          <tr>
                                                          <th valign="BASELINE" nowrap align="RIGHT"><br>
                                                          </th>
                                                          <td><br>
                                                          </td>
                                                          </tr>
                                                          <tr>
                                                          <th valign="BASELINE" nowrap align="RIGHT"><br>
                                                          </th>
                                                          <td><br>
                                                          </td>
                                                          </tr>
                                                          </tbody>
                                                          </table>
                                                          <br>
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-cite-prefix">On
                                                          04/06/2018
                                                          09:15 AM,
                                                          Brian Bouterse
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div>
                                                          <div>
                                                          <div>
                                                          <div>Several
                                                          plugins have
                                                          started using
                                                          the Changesets
                                                          including
                                                          pulp_ansible,
                                                          pulp_python,
                                                          pulp_file, and
                                                          perhaps
                                                          others. The
                                                          Changesets
                                                          provide
                                                          several
                                                          distinct
                                                          points of
                                                          value which
                                                          are great, but
                                                          there are two
                                                          challenges I
                                                          want to bring
                                                          up. I want to
                                                          focus only on
                                                          the problem
                                                          statements
                                                          first.<br>
                                                          <br>
                                                          </div>
                                                          1. There is
                                                          redundant
                                                          "differencing"
                                                          code in all
                                                          plugins. The
                                                          Changeset
                                                          interface
                                                          requires the
                                                          plugin writer
                                                          to determine
                                                          what units
                                                          need to be
                                                          added and
                                                          those to be
                                                          removed. This
                                                          requires all
                                                          plugin writers
                                                          to write the
                                                          same
                                                          non-trivial
                                                          differencing
                                                          code over and
                                                          over. For
                                                          example, you
                                                          can see the
                                                          same
                                                          non-trivial
                                                          differencing
                                                          code present
                                                          in <a href="https://github.com/pulp/pulp_ansible/blob/d0eb9d125f9a6cdc82e2807bcad38749967a1245/pulp_ansible/app/tasks/synchronizing.py#L217-L306" target="_blank">pulp_ansible</a>, <a href="https://github.com/pulp/pulp_file/blob/30afa7cce667b57d8fe66d5fc1fe87fd77029210/pulp_file/app/tasks/synchronizing.py#L114-L193" target="_blank">pulp_file</a>, and <a href="https://github.com/pulp/pulp_python/blob/066d33990e64b5781c8419b96acaf2acf1982324/pulp_python/app/tasks/sync.py#L172-L223" target="_blank">pulp_python</a>. Line-wise, this
                                                          "differencing"
                                                          code makes up
                                                          a large
                                                          portion (maybe
                                                          50%) of the
                                                          sync code
                                                          itself in each
                                                          plugin.<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          Ten lines of
                                                          trivial set
                                                          logic hardly
                                                          seems like a
                                                          big deal but
                                                          any
                                                          duplication is
                                                          worth
                                                          exploring. <br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <div>It's more
                                                          than ten
                                                          lines. Take
                                                          pulp_ansible
                                                          for example.
                                                          By my count
                                                          (the linked to
                                                          section) it's
                                                          89 lines,
                                                          which out of
                                                          306 lines of
                                                          plugin code
                                                          for sync is
                                                          29% of extra
                                                          redundant
                                                          code. The
                                                          other plugins
                                                          have similar
                                                          numbers. So
                                                          with those
                                                          numbers in
                                                          mind, what do
                                                          you think?<br>
                                                          </div>
                                                          </div>
                                                        </div>
                                                      </div>
                                                    </blockquote>
                                                    <br>
                                                  </span> I was counting
                                                  the lines (w/o
                                                  comments) in
                                                  find_delta() based on
                                                  the linked code. 
                                                  Which functions are
                                                  you counting?<span><br>
                                                  </span></div>
                                              </blockquote>
                                              <div><br>
                                              </div>
                                              <div>I was counting the
                                                find_delta,
                                                build_additions, and
                                                build_removals methods.
                                                Regardless of how the
                                                lines are counted, that
                                                differencing code is the
                                                duplication I'm talking
                                                about. There isn't a way
                                                to use the changesets
                                                without duplicating that
                                                differencing code in a
                                                plugin.<br>
                                              </div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </blockquote>
                                <br>
                              </span> The differencing code is limited
                              to find_delta() and perhaps
                              build_removals().  Agreed, the line count
                              is less useful than specifically
                              identifying duplicate code.  Outside of
                              find_delta(), I see similar code (in part
                              because it got copied from file plugin)
                              but not seeing actual duplication.  Can
                              you be more specific?<span><br>
                              </span></div>
                          </blockquote>
                          <div><br>
                          </div>
                          <div>Very similar code or identical code, I
                            think it begs the question why are we having
                            plugin writer's do this at all? What value
                            are they creating with it? I don't have a
                            reasonable answer to that question, so the
                            requirement for plugin writer's to write
                            that code brings me back to the problem
                            statement: "plugin writers have redundant
                            differencing code when using Changesets".
                            More info on why it is valuable for the
                            plugin writer to do the differencing code
                            versus the Changesets would be helpful.<br>
                          </div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                  <br>
                </span> The ChangeSet abstraction (and API) is based on
                following division of responsibility:<br>
                <br>
                The plugin  (with an understanding of the remote and its
                content):<br>
                  - Download metadata.<br>
                  - Parse metadata<br>
                  - Based on the metadata:<br>
                    - determine content to be added to the repository.<br>
                      - define how artifacts are downloaded.<br>
                      - construct content<br>
                    - determine content to be removed to the repository.<br>
                <br>
                Core (without understand of specific remote or its
                content):<br>
                  - Provide low level API for plugin to affect the
                changes it has determined need to be made to the
                repository.  This is downloaders, models etc.<br>
                  - Provide high(er) level API for plugin to affect the
                changes it has determined need to be made to the
                repository.  This is the ChangeSet.<br>
                <br>
                Are you proposing that this is not the correct division?<span><br>
                </span></div>
            </blockquote>
            <div><br>
            </div>
            <div>Yes I believe these problem statements suggest we
              should adjust the plugin writer's responsibilities when
              interacting with the Changesets in two specific ways. It's
              not exactly the language you used, but I believe the
              following two responsibilities could be moved into the
              Changesets entirely:<br>
              <br>
            </div>
            <div>- determining if any given Artifact or Content unit is
              already present in Pulp (aka computing what needs tobe
              added)<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></div></div>
    Did you mean <i>added</i> to the repository or <i>created</i> in
    pulp.  Currently, the plugin determines the content that needs to be
    added to the repository.  This is modeled using a PendingContent
    which fully defines the Content (unit) and its PendingArtifact(s)
    which are included in the <i>additions</i>. The ChangeSet does
    determine whether or not any artifacts need to be downloaded (and
    downloads them based on policy) and determines which Content needs
    to be <i>created</i> vs simply added to the repository.  The plugin
    blindly assumes that none of the <i>pending</i> content has yet
    been created pulp.  This accomplishes 2 things.  1) reduces
    complexity and decision making by the plugin.  2) provides the
    ChangeSet with all the information needed to <i>create</i> and <i>download</i>
    as needed.  The <i>additions</i> represents what the plugin wants
    to be added to the repository to synchronize it with the remote
    repository. <br><span class="">
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div>- determining which content units need to be removed
              (aka computing the removals)<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></span>
    I don't see how the ChangeSet has enough information to do this. 
    The plugin can (most likely will) make the decision about what to
    remove based on remote metadata, policy and configuration.<span class=""><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div> <br>
            </div>
            <div>^ goals are a restating of the problem statement that
              plugin writers are asked to do differencing calculations
              when the Changesets could provide that to the plugin
              writer instead.<br>
            </div>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"><span>
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div class="gmail_extra">
                        <div class="gmail_quote">
                          <div> </div>
                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                            <div text="#000000" bgcolor="#FFFFFF"><span>
                                <br>
                                <blockquote type="cite">
                                  <div dir="ltr">
                                    <div>
                                      <div>
                                        <div>
                                          <div class="gmail_extra">
                                            <div class="gmail_quote">
                                              <div><br>
                                                So a shorter, simpler
                                                problem statement is:
                                                "to use the changesets
                                                plugin writers have to
                                                do extra work to compute
                                                additions and removals
                                                parameters".<br>
                                              </div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </blockquote>
                                <br>
                              </span> This statement ^ is better but
                              still too vague to actually solve.  Can we
                              elaborate on specifically what "to do
                              extra work" means?<span><br>
                              </span></div>
                          </blockquote>
                          <div><br>
                          </div>
                          <div>Sure. Removing that vague language is one
                            way to resolve its vagueness. Here's a
                            revised problem statement: <span class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933gmail-">"to
                              use the changesets plugin writers have to
                              compute additions and removals
                              parameters". This problem statement would
                              be resolved by a solution that causes the
                              plugin writer to never have to produce
                              these parameters and be replaced by an
                              interface that would require less effort
                              from a plugin writer.</span><br>
                          </div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                  <br>
                </span> I think it's the plugin's responsibility to
                determine the difference.  Aside from that: without an
                understanding of the metadata and content type, how
                could the ChangeSet do this?  What might that looks
                like?<span><br>
                </span></div>
            </blockquote>
            <div> <br>
            </div>
            <div>If I'm understanding this correctly, the Changesets
              already do this for additions right? Help check my
              understanding. If a plugin writer delivers PendingContent
              and PendingArtifacts to the Changesets as 'additions', the
              Changesets will recognize them as already downloaded and
              not download them right? If this is the case, what is the
              benefit of having plugin writers also try to figure out if
              things should be downloaded?<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></span>
    As you pointed out, the plugin writer does not need to figure out
    what needs to be downloaded.<span class=""><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
            </div>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div class="gmail_extra">
                        <div class="gmail_quote">
                          <div> <br>
                          </div>
                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                            <div text="#000000" bgcolor="#FFFFFF"><span>
                                <br>
                                <blockquote type="cite">
                                  <div dir="ltr">
                                    <div>
                                      <div>
                                        <div>
                                          <div class="gmail_extra">
                                            <div class="gmail_quote">
                                              <div> </div>
                                              <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                <div text="#000000" bgcolor="#FFFFFF"><span>
                                                    <br>
                                                    <blockquote type="cite">
                                                      <div dir="ltr">
                                                        <div class="gmail_extra">
                                                          <div class="gmail_quote">
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045HOEnZb">
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045h5">
                                                          <div text="#000000" bgcolor="#FFFFFF">
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-forward-container">
                                                          <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div>
                                                          <div>
                                                          <div><br>
                                                          </div>
                                                          2. Plugins
                                                          can't do
                                                          end-to-end
                                                          stream
                                                          processing.
                                                          The Changesets
                                                          themselves do
                                                          stream
                                                          processing,
                                                          but when you
                                                          call into
                                                          changeset.apply_and_drain()
                                                          you have to
                                                          have fully
                                                          parsed the
                                                          metadata
                                                          already.
                                                          Currently when
                                                          fetching all
                                                          metadata from
                                                          Galaxy,
                                                          pulp_ansible
                                                          takes about
                                                          380 seconds
                                                          (6+ min). This
                                                          means that the
                                                          actual
                                                          Changeset
                                                          content
                                                          downloading
                                                          starts 380
                                                          seconds later
                                                          than it could.
                                                          At the heart
                                                          of the
                                                          problem, the
                                                          fetching+parsing
                                                          of the
                                                          metadata is
                                                          not part of
                                                          the stream
                                                          processing.<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          The
                                                          additions/removals
                                                          can be any
                                                          interable
                                                          (like
                                                          generator) and
                                                          by using
                                                          ChangeSet.apply()
                                                          and iterating
                                                          the returned
                                                          object, the
                                                          pluign can
                                                          "turn the
                                                          crank" while
                                                          downloading
                                                          and processing
                                                          the metadata. 
                                                          The
                                                          ChangeSet.apply_and_drain()
                                                          is just a
                                                          convenience
                                                          method.  I
                                                          don't see how
                                                          this is a
                                                          limitation of
                                                          the ChangeSet.<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <div><br>
                                                          </div>
                                                          <div>That is
                                                          new info for
                                                          me (and maybe
                                                          everyone). OK
                                                          so Changesets
                                                          have two
                                                          interfaces.
                                                          apply() and
                                                          apply_and_drain().
                                                          Why do we have
                                                          two interfaces
                                                          when apply()
                                                          can support
                                                          all existing
                                                          use cases
                                                          (that I know
                                                          of) and do
                                                          end-to-end
                                                          stream
                                                          processing but
apply_and_drain() cannot? I see all of our examples (and all of our new
                                                          plugins) using
apply_and_drain().<br>
                                                          </div>
                                                          </div>
                                                        </div>
                                                      </div>
                                                    </blockquote>
                                                    <br>
                                                  </span> The
                                                  ChangeSet.apply() was
                                                  how I designed (and
                                                  documented) it.  Not
                                                  sure when/who added
                                                  the
                                                  apply_and_drain().  +1
                                                  for removing it.<span><br>
                                                  </span></div>
                                              </blockquote>
                                              <div><br>
                                              </div>
                                              <div>I read through the
                                                changeset docs. I think
                                                this stream processing
                                                thing is still a problem
                                                but perhaps in how we're
                                                presenting the Changeset
                                                with it's arguments. I
                                                don't think apply()
                                                versus apply_and_drain()
                                                are at all related.
                                                Regardless of if you are
                                                using apply() or
                                                apply_and_drain(), the
                                                Changeset requires an
                                                'additions' and
                                                'removals' arguments.
                                                This sends a clear
                                                message to the plugin
                                                writer that they need to
                                                compute additions and
                                                removals. They will
                                                fetch the metadata to
                                                compute these which is
                                                mostly how the changeset
                                                documentation reads. To
                                                know that they could
                                                present a generator that
                                                would correctly allow
                                                the metdata from inside
                                                the Changeset is I feel
                                                as non-obvious. I want
                                                the high-performing
                                                implementation to be the
                                                obvious one.<br>
                                                <br>
                                                So what about a problem
                                                statement like this:
                                                "Changesets are
                                                presented such that when
                                                you call into them you
                                                should already have
                                                fetched the metadata"?<br>
                                              </div>
                                            </div>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </blockquote>
                                <br>
                              </span> I'm not sure what is meant by
                              "presented".  If this means that we should
                              provide an example of how the ChangeSet
                              can be used by plugins (with large
                              metadata) in such a way that does not
                              require downloading all the metadata first
                              - that sounds like a good idea. <br>
                            </div>
                          </blockquote>
                          <div><br>
                          </div>
                          <div>Cool so this is transitioning to ideas
                            for resolution. The solution to add
                            documentation on how to do this with the
                            existing interface is one option. My concern
                            with adding additional docs on how to use
                            the current interface better is that if
                            users choose to follow the existing docs
                            then they will have the stream processing
                            problem once again. To me, this suggests
                            that this new example should actually
                            replace the existing documentation.<br>
                          </div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                  <br>
                </span> Seems like both example would be useful.  I'm
                not convinced that all plugins would benefit from this. 
                For example: the File plugin manifest is small and would
                likely not benefit from the extra complexity.  For
                complicated plugins (like RPM), can differencing
                decision be made before analyzing the entire metadata
                (eg: primary.xml)?  Also, it's not clear to me how this
                would work using the Downloader.  Are you suggesting
                that the plugin would parse/process metadata files while
                they're being downloaded?  Perhaps a better
                understanding of the flow to be supported would help me
                understand this.<span><br>
                </span></div>
            </blockquote>
            <div><br>
            </div>
            <div>Yes I am suggesting just that: that the Changesets
              could facilitate parse/processing metadata files while
              actual content named in those files is also being
              downloaded. I have a straightforward idea on how to
              achieve this. It's short and easy enough to write up (no
              code), but I want to make sure I'm not moving beyond the
              problem statement without others. Is there more we want to
              do on these problem statements, or would answering a bit
              about one way it could work be helpful?<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></span>
    The <i>additions</i> can be (and usually is) a generator.  The
    generator can yield based on metadata as it is downloaded and
    digested.  In this way, the ChangeSet already facilitates this.<span class=""><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div><br>
              Just to state my expectations: Moving beyond the problem
              statement I don't consider to be a commitment to solve it;
              just an agreement on what we're solving as we discuss
              various resolutions. Problem statements can also always be
              revisited. Either way forward is fine w/ me, just let me
              know how we should continue<br>
            </div>
          </div>
        </div>
      </div>
    </blockquote>
    <br></span>
    So far, I'm not convinced that any specific problems/deficiencies
    have been identified.  That said, you seem to have a different
    abstraction in mind. I would be interested in reviewing it and how
    it would be used by plugin writers.  It may help illustrate the
    gains that you are envisioning.<span class=""><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div class="gmail_extra">
          <div class="gmail_quote">
            <div> </div>
            <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
              <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div class="gmail_extra">
                        <div class="gmail_quote">
                          <div> </div>
                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                            <div text="#000000" bgcolor="#FFFFFF"><span>
                                <br>
                                <blockquote type="cite">
                                  <div dir="ltr">
                                    <div>
                                      <div>
                                        <div>
                                          <div class="gmail_extra">
                                            <div class="gmail_quote">
                                              <div> <br>
                                              </div>
                                              <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                <div text="#000000" bgcolor="#FFFFFF"><span>
                                                    <br>
                                                    <blockquote type="cite">
                                                      <div dir="ltr">
                                                        <div class="gmail_extra">
                                                          <div class="gmail_quote">
                                                          <div> </div>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045HOEnZb">
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045h5">
                                                          <div text="#000000" bgcolor="#FFFFFF">
                                                          <div class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-forward-container">
                                                          <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div>
                                                          <div><br>
                                                          </div>
                                                          Do you see the
                                                          same
                                                          challenges I
                                                          do? Are these
                                                          the right
                                                          problem
                                                          statements? I
                                                          think with
                                                          clear problem
                                                          statements a
                                                          solution will
                                                          be easy to see
                                                          and agree on.<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          I'm not
                                                          convinced that
                                                          these are
                                                          actual
problems/challenges that need to be addressed in the near term.<br>
                                                          <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div><br>
                                                          </div>
                                                          Thanks!<br>
                                                          </div>
                                                          Brian<br>
                                                          </div>
                                                          <br>
                                                          <fieldset class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587mimeAttachmentHeader"></fieldset>
                                                          <br>
                                                          <pre>______________________________<wbr>_________________
Pulp-dev mailing list
<a class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-txt-link-abbreviated" href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a>
<a class="m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-txt-link-freetext" href="https://www.redhat.com/mailman/listinfo/pulp-dev" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/pulp-dev</a>
</pre>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          <br>
______________________________<wbr>_________________<br>
                                                          Pulp-dev
                                                          mailing list<br>
                                                          <a href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a><br>
                                                          <a href="https://www.redhat.com/mailman/listinfo/pulp-dev" rel="noreferrer" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/pulp-dev</a><br>
                                                          <br>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                        </div>
                                                      </div>
                                                    </blockquote>
                                                    <br>
                                                  </span></div>
                                              </blockquote>
                                            </div>
                                            <br>
                                          </div>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </blockquote>
                                <br>
                              </span></div>
                          </blockquote>
                        </div>
                        <br>
                      </div>
                    </div>
                  </blockquote>
                  <br>
                </span></div>
              <br>
              ______________________________<wbr>_________________<br>
              Pulp-dev mailing list<br>
              <a href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a><br>
              <a href="https://www.redhat.com/mailman/listinfo/pulp-dev" rel="noreferrer" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/pulp-dev</a><br>
              <br>
            </blockquote>
          </div>
          <br>
        </div>
      </div>
    </blockquote>
    <br>
  </span></div>

<br>______________________________<wbr>_________________<br>
Pulp-dev mailing list<br>
<a href="mailto:Pulp-dev@redhat.com">Pulp-dev@redhat.com</a><br>
<a href="https://www.redhat.com/mailman/listinfo/pulp-dev" rel="noreferrer" target="_blank">https://www.redhat.com/<wbr>mailman/listinfo/pulp-dev</a><br>
<br></blockquote></div><br></div>