<div dir="ltr"><div>Both of those tickets look good to me. I can clearly see how the proposed changes will simplify the plugin writing experience.<br><br></div>Thank you both for putting together this plan.<br></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Apr 19, 2018 at 2:50 PM, Brian Bouterse <span dir="ltr"><<a href="mailto:bbouters@redhat.com" target="_blank">bbouters@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div id="m_-8036996345952189131gmail-magicdomid1929" class="m_-8036996345952189131gmail-ace-line"><span class="m_-8036996345952189131gmail-author-a-z71zyo2lz69z7bz66zgqz86z4haz67z">Jeff and I met and we put together two pieces of work which would create a declarative interface for a plugin writer to use. This would be used instead of the ChangeSet interface by plugin writers</span><span class="m_-8036996345952189131gmail-author-a-ewjnmz73zz74zz71zmaz78zz77zunz83zz78z">. Whether or not to continue including the ChangeSet in the plugin API is still being discussed.</span></div><div id="m_-8036996345952189131gmail-magicdomid1428" class="m_-8036996345952189131gmail-ace-line"><br></div><div id="m_-8036996345952189131gmail-magicdomid1427" class="m_-8036996345952189131gmail-ace-line"><span class="m_-8036996345952189131gmail-author-a-z71zyo2lz69z7bz66zgqz86z4haz67z">There seemed to be interest in offering an interface like this so on Monday we will put together a PR so that we can see what it looks like and how hard it would be to switch. 
Look at these stories in the hopes that we can groom them and put them on the sprint.</span></div><div id="m_-8036996345952189131gmail-magicdomid1429" class="m_-8036996345952189131gmail-ace-line"><br></div><div id="m_-8036996345952189131gmail-magicdomid1431" class="m_-8036996345952189131gmail-ace-line"><span class="m_-8036996345952189131gmail-author-a-z71zyo2lz69z7bz66zgqz86z4haz67z">* </span><span class="m_-8036996345952189131gmail-author-a-z71zyo2lz69z7bz66zgqz86z4haz67z m_-8036996345952189131gmail-url"><a href="https://pulp.plan.io/issues/3570" target="_blank">https://pulp.plan.io/issues/<wbr>3570</a></span></div><div id="m_-8036996345952189131gmail-magicdomid1433" class="m_-8036996345952189131gmail-ace-line"><span class="m_-8036996345952189131gmail-author-a-z71zyo2lz69z7bz66zgqz86z4haz67z">* </span><span class="m_-8036996345952189131gmail-author-a-z71zyo2lz69z7bz66zgqz86z4haz67z m_-8036996345952189131gmail-url"><a href="https://pulp.plan.io/issues/3582" target="_blank">https://pulp.plan.io/issues/<wbr>3582</a></span></div><div id="m_-8036996345952189131gmail-magicdomid1199" class="m_-8036996345952189131gmail-ace-line"><br></div><div id="m_-8036996345952189131gmail-magicdomid1810" class="m_-8036996345952189131gmail-ace-line"><span class="m_-8036996345952189131gmail-author-a-z71zyo2lz69z7bz66zgqz86z4haz67z">Our plan is to start on ^ on Monday so if there are questions, ideas, or concerns let us know. Once we have something to share, we'll email back to this thread. 
Feel free to comment on the issues directly also.</span></div><div id="m_-8036996345952189131gmail-magicdomid1812" class="m_-8036996345952189131gmail-ace-line"><br></div><div id="m_-8036996345952189131gmail-magicdomid1818" class="m_-8036996345952189131gmail-ace-line"><span class="m_-8036996345952189131gmail-author-a-z71zyo2lz69z7bz66zgqz86z4haz67z">Thanks,</span></div><div id="m_-8036996345952189131gmail-magicdomid1831" class="m_-8036996345952189131gmail-ace-line"><span class="m_-8036996345952189131gmail-author-a-z71zyo2lz69z7bz66zgqz86z4haz67z">Brian & Jeff</span></div><br></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Mon, Apr 16, 2018 at 3:10 PM, Dennis Kliban <span dir="ltr"><<a href="mailto:dkliban@redhat.com" target="_blank">dkliban@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span>On Mon, Apr 16, 2018 at 2:13 PM, Dennis Kliban <span dir="ltr"><<a href="mailto:dkliban@redhat.com" target="_blank">dkliban@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><span>On Mon, Apr 16, 2018 at 12:21 PM, Jeff Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
  
    
  
  <div text="#000000" bgcolor="#FFFFFF">
    <font face="DejaVu Sans">Thanks for the proposal, Brian.  I also
      commented on the issue.</font><span><br>
    <br>
    <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965moz-cite-prefix">On 04/16/2018 09:41 AM, Brian Bouterse
      wrote:<br>
    </div>
    <blockquote type="cite">
      <div dir="ltr">
        <div>I wrote up a description of the opportunity I see here [0].
          I put a high level pro/con analysis below. I would like
          feedback on (a) whether this adequately addresses the problem
          statements, (b) whether there are alternatives, and (c) whether this
          improves the plugin writer's experience enough to adopt it.<br>
          <br>
        </div>
        <div>pros:<br>
        </div>
        <div>* significantly less plugin code to write. Compare the
          Thing example code versus the current docs.<br>
        </div>
      </div>
    </blockquote></span>
    +1<span><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div>* Higher performing with metadata downloading and parsing
          being included in stream processing. This causes syncs for
          pulp_ansible to start 6+ min earlier.<br>
        </div>
      </div>
    </blockquote>
    <br></span>
    This could also be done currently with the ChangeSet as-is.<span><br>
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div><br>
          cons:<br>
        </div>
        <div>* Progress reporting doesn't know how many things it's
          processing (it's a stream). So users would see progress as "X
          things completed", not "X of Y things completed". Y can't be
          known until just before the stream processing completes
          otherwise it's not stream processing.<br>
        </div>
      </div>
    </blockquote>
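    <div>The progress-reporting limitation described above can be sketched roughly as follows. This is only an illustration under stated assumptions: StreamProgress and process are hypothetical names, not part of the Pulp plugin API, and the optional total is an assumption made for the sketch.</div>
    <pre>
```python
# Hypothetical sketch; StreamProgress and process are NOT Pulp classes or functions.
class StreamProgress:
    """Counts completed items; the total is optional because a stream has no known length."""

    def __init__(self, total=None):
        self.done = 0
        self.total = total  # stays None unless the caller happens to know it

    def message(self):
        # Without a total, only "X things completed" can be reported.
        if self.total is None:
            return "{} things completed".format(self.done)
        return "{} of {} things completed".format(self.done, self.total)


def process(stream, progress):
    """Pass items through one at a time, updating progress as each one completes."""
    for item in stream:
        progress.done += 1
        yield item


progress = StreamProgress()
results = list(process(iter(["a", "b", "c"]), progress))
print(progress.message())
```
    </pre>
    <div>When the caller does know the size up front, passing total=... switches the message to the "X of Y" form.</div>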
    <br></span>
    I'm not a fan of the SizedIterator either.<br>
    I contemplated this when designing the ChangeSet.  An alternative I
    considered was to report progress like OSTree does.  It reports
    progress by periodically updating the expected TOTAL.  It's better
    than nothing.<div><div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866h5"><br></div></div></div></blockquote><div><br></div></span><div>What if we allow plugin writers to optionally provide a total number when instantiating the ChangeSet? I bet there will be cases where the number of items in the repository version will be known without having to fully parse all the metadata. In these cases the progress reporting could be more informative. <br></div><div><div class="m_-8036996345952189131m_4769324167015933096h5"><div> </div></div></div></div></div></div></blockquote><div><br></div></span><div>Here is another idea for progress reporting for stream processing: have ChangeSet create a separate progress report for downloads. The total could be dynamically updated as downloads are scheduled. The completed count can be updated after each successful download. <br></div><div><br></div><div>Any limitations in progress reporting are outweighed by the efficiency gained by having plugins always use stream processing. Just imagine not having to wait for the RPM plugin to finish "processing metadata" to start downloading content. <br></div><div><div class="m_-8036996345952189131h5"><div> </div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><div><div class="m_-8036996345952189131m_4769324167015933096h5"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div text="#000000" bgcolor="#FFFFFF"><div><div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866h5">
    <br>
    <blockquote type="cite">
      <div dir="ltr">
        <div><br>
        </div>
        <div>[0]: <a href="https://pulp.plan.io/issues/3570" target="_blank">https://pulp.plan.io/issues/35<wbr>70</a><br>
          <br>
        </div>
        <div>Thanks!<br>
        </div>
        <div>Brian<br>
        </div>
        <div><br>
          <br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <div class="gmail_quote">On Thu, Apr 12, 2018 at 7:12 PM, Jeff
          Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
            <div text="#000000" bgcolor="#FFFFFF">
              <div>
                <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965h5"> <br>
                  <br>
                  <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404moz-cite-prefix">On
                    04/12/2018 04:00 PM, Brian Bouterse wrote:<br>
                  </div>
                  <blockquote type="cite">
                    <div dir="ltr">
                      <div class="gmail_extra"><br>
                        <div class="gmail_quote">On Thu, Apr 12, 2018 at
                          11:53 AM, Jeff Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
                          wrote:<br>
                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                            <div text="#000000" bgcolor="#FFFFFF"><span>
                                <br>
                                <br>
                                <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615moz-cite-prefix">On
                                  04/12/2018 10:01 AM, Brian Bouterse
                                  wrote:<br>
                                </div>
                                <blockquote type="cite">
                                  <div dir="ltr"><br>
                                    <div class="gmail_extra"><br>
                                      <div class="gmail_quote">On Wed,
                                        Apr 11, 2018 at 6:07 PM, Jeff
                                        Ortel <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
                                        wrote:<br>
                                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                          <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                                              <br>
                                              <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278moz-cite-prefix">On
                                                04/11/2018 03:29 PM,
                                                Brian Bouterse wrote:<br>
                                              </div>
                                              <blockquote type="cite">
                                                <div dir="ltr">I think
                                                  we should look into
                                                  this in the near-term.
                                                  Changing an interface
                                                  on an object used by
                                                  all plugins will be
                                                  significantly easier,
                                                  earlier.<br>
                                                  <br>
                                                  <div>
                                                    <div>
                                                      <div>
                                                        <div class="gmail_extra"><br>
                                                          <div class="gmail_quote">On
                                                          Wed, Apr 11,
                                                          2018 at 12:25
                                                          PM, Jeff Ortel
                                                          <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
                                                          wrote:<br>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                                                          <br>
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045moz-cite-prefix">On
                                                          04/11/2018
                                                          10:59 AM,
                                                          Brian Bouterse
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">
                                                          <div dir="ltr"><br>
                                                          <div class="gmail_extra"><br>
                                                          <div class="gmail_quote">On
                                                          Tue, Apr 10,
                                                          2018 at 10:43
                                                          AM, Jeff Ortel
                                                          <span dir="ltr"><<a href="mailto:jortel@redhat.com" target="_blank">jortel@redhat.com</a>></span>
                                                          wrote:<br>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045HOEnZb">
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045h5">
                                                          <div text="#000000" bgcolor="#FFFFFF"> <br>
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-forward-container"><br>
                                                          <br>
                                                          <br>
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-cite-prefix">On
                                                          04/06/2018
                                                          09:15 AM,
                                                          Brian Bouterse
                                                          wrote:<br>
                                                          </div>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div>
                                                          <div>
                                                          <div>
                                                          <div>Several
                                                          plugins have
                                                          started using
                                                          the Changesets
                                                          including
                                                          pulp_ansible,
                                                          pulp_python,
                                                          pulp_file, and
                                                          perhaps
                                                          others. The
                                                          Changesets
                                                          provide
                                                          several
                                                          distinct
                                                          points of
                                                          value which
                                                          are great, but
                                                          there are two
                                                          challenges I
                                                          want to bring
                                                          up. I want to
                                                          focus only on
                                                          the problem
                                                          statements
                                                          first.<br>
                                                          <br>
                                                          </div>
                                                          1. There is
                                                          redundant
                                                          "differencing"
                                                          code in all
                                                          plugins. The
                                                          Changeset
                                                          interface
                                                          requires the
                                                          plugin writer
                                                          to determine
                                                          what units
                                                          need to be
                                                          added and
                                                          those to be
                                                          removed. This
                                                          requires all
                                                          plugin writers
                                                          to write the
                                                          same
                                                          non-trivial
                                                          differencing
                                                          code over and
                                                          over. For
                                                          example, you
                                                          can see the
                                                          same
                                                          non-trivial
                                                          differencing
                                                          code present
                                                          in <a href="https://github.com/pulp/pulp_ansible/blob/d0eb9d125f9a6cdc82e2807bcad38749967a1245/pulp_ansible/app/tasks/synchronizing.py#L217-L306" target="_blank">pulp_ansible</a>, <a href="https://github.com/pulp/pulp_file/blob/30afa7cce667b57d8fe66d5fc1fe87fd77029210/pulp_file/app/tasks/synchronizing.py#L114-L193" target="_blank">pulp_file</a>, and <a href="https://github.com/pulp/pulp_python/blob/066d33990e64b5781c8419b96acaf2acf1982324/pulp_python/app/tasks/sync.py#L172-L223" target="_blank">pulp_python</a>. Line-wise, this
                                                          "differencing"
                                                          code makes up
                                                          a large
                                                          portion (maybe
                                                          50%) of the
                                                          sync code
                                                          itself in each
                                                          plugin.<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
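                                                          <div>For reference, the kind of "differencing" described above can be sketched with plain set operations. This is a minimal illustration, not the code from any of the linked plugins: this find_delta is a hypothetical stand-in that takes plain collections of keys, and the sample keys are made up.</div>
                                                          <pre>
```python
# Hypothetical sketch; not the find_delta() from pulp_file, pulp_ansible, or pulp_python.
def find_delta(remote_keys, local_keys):
    """Return (additions, removals) given the keys seen remotely and those already stored."""
    remote = set(remote_keys)
    local = set(local_keys)
    additions = remote - local  # present upstream, missing from the repository
    removals = local - remote   # present in the repository, gone upstream
    return additions, removals


# Made-up keys, purely for illustration.
additions, removals = find_delta({"a", "b", "c"}, {"b", "c", "d"})
print(sorted(additions), sorted(removals))
```
                                                          </pre>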
                                                          <br>
                                                          Ten lines of
                                                          trivial set
                                                          logic hardly
                                                          seems like a
                                                          big deal but
                                                          any
                                                          duplication is
                                                          worth
                                                          exploring. <br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <div>It's more
                                                          than ten
                                                          lines. Take
                                                          pulp_ansible
                                                          for example.
                                                          By my count
                                                          (the linked to
                                                          section) it's
                                                          89 lines,
                                                          which out of
                                                          306 lines of
                                                          plugin code
                                                          for sync is
                                                          29% of extra
                                                          redundant
                                                          code. The
                                                          other plugins
                                                          have similar
                                                          numbers. So
                                                          with those
                                                          numbers in
                                                          mind, what do
                                                          you think?<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          </span> I was
                                                          counting the
                                                          lines (w/o
                                                          comments) in
                                                          find_delta()
                                                          based on the
                                                          linked code. 
                                                          Which
                                                          functions are
                                                          you counting?<span><br>
                                                          </span></div>
                                                          </blockquote>
                                                          <div><br>
                                                          </div>
                                                          <div>I was
                                                          counting the
                                                          find_delta,
                                                          build_additions,
                                                          and
                                                          build_removals
                                                          methods.
                                                          Regardless of
                                                          how the lines
                                                          are counted,
                                                          that
                                                          differencing
                                                          code is the
                                                          duplication
                                                          I'm talking
                                                          about. There
                                                          isn't a way to
                                                          use the
                                                          changesets
                                                          without
                                                          duplicating
                                                          that
                                                          differencing
                                                          code in a
                                                          plugin.<br>
                                                          </div>
                                                          </div>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </div>
                                                </div>
                                              </blockquote>
                                              <br>
                                            </span> The differencing
                                            code is limited to
                                            find_delta() and perhaps
                                            build_removals().  Agreed,
                                            the line count is less
                                            useful than specifically
                                            identifying duplicate code. 
                                            Outside of find_delta(), I
                                            see similar code (in part
                                            because it got copied from
                                            the file plugin) but I'm not
                                            seeing actual duplication.  Can you
                                            be more specific?<span><br>
                                            </span></div>
                                        </blockquote>
                                        <div><br>
                                        </div>
                                        <div>Whether the code is very similar
                                          or identical, I think it
                                          raises the question: why are we
                                          having plugin writers do this
                                          at all? What value are they
                                          creating with it? I don't have
                                          a reasonable answer to that
                                          question, so the requirement
                                          for plugin writers to write
                                          that code brings me back to
                                          the problem statement: "plugin
                                          writers have redundant
                                          differencing code when using
                                          Changesets". More info on why
                                          it is valuable for the plugin
                                          writer to write the differencing
                                          code rather than having the Changesets
                                          do it would be helpful.<br>
                                        </div>
                                      </div>
                                    </div>
                                  </div>
                                </blockquote>
                                <br>
                              </span> The ChangeSet abstraction (and
                              API) is based on the following division of
                              responsibility:<br>
                              <br>
                              The plugin  (with an understanding of the
                              remote and its content):<br>
                                - Download metadata.<br>
                                - Parse metadata<br>
                                - Based on the metadata:<br>
                                  - determine content to be added to the
                              repository.<br>
                                    - define how artifacts are
                              downloaded.<br>
                                    - construct content<br>
                                  - determine content to be removed from
                              the repository.<br>
                              <br>
                              Core (without an understanding of the specific
                              remote or its content):<br>
                                - Provide low level API for plugin to
                              affect the changes it has determined need
                              to be made to the repository.  This is
                              downloaders, models etc.<br>
                                - Provide high(er) level API for plugin
                              to affect the changes it has determined
                              need to be made to the repository.  This
                              is the ChangeSet.<br>
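                              <br>
                              A minimal sketch of the plugin-side
                              differencing implied by this division, using
                              plain Python sets. The function below is a
                              hypothetical standalone helper that works on
                              natural keys; it only borrows the find_delta
                              name from the discussion and is not the actual
                              plugin or core code:<br>
<pre>
```python
def find_delta(remote_keys, local_keys):
    """Return (additions, removals) as sets of natural keys.

    Hypothetical helper: 'remote_keys' come from parsed remote metadata,
    'local_keys' from the content already in the repository.
    """
    remote = set(remote_keys)
    local = set(local_keys)
    additions = remote - local   # in the remote, not yet in the repository
    removals = local - remote    # in the repository, gone from the remote
    return additions, removals


# Illustrative data only.
remote_keys = ["a-1.0", "b-2.0", "c-3.0"]
local_keys = ["b-2.0", "d-4.0"]
additions, removals = find_delta(remote_keys, local_keys)
```
</pre>
                              Under this division the plugin owns this step
                              and hands the resulting additions and removals
                              to the ChangeSet.<br>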
                              <br>
                              Are you proposing that this is not the
                              correct division?<span><br>
                              </span></div>
                          </blockquote>
                          <div><br>
                          </div>
                          <div>Yes I believe these problem statements
                            suggest we should adjust the plugin writer's
                            responsibilities when interacting with the
                            Changesets in two specific ways. It's not
                            exactly the language you used, but I believe
                            the following two responsibilities could be
                            moved into the Changesets entirely:<br>
                            <br>
                          </div>
                          <div>- determining if any given Artifact or
                            Content unit is already present in Pulp (aka
                            computing what needs to be added)<br>
                          </div>
                        </div>
                      </div>
                    </div>
                  </blockquote>
                  <br>
                </div>
              </div>
              Did you mean <i>added</i> to the repository or <i>created</i>
              in Pulp?  Currently, the plugin determines the content
              that needs to be added to the repository.  This is modeled
              using a PendingContent which fully defines the Content
              (unit) and its PendingArtifact(s) which are included in
              the <i>additions</i>. The ChangeSet does determine
              whether or not any artifacts need to be downloaded (and
              downloads them based on policy) and determines which
              Content needs to be <i>created</i> vs simply added to the
              repository.  The plugin blindly assumes that none of the <i>pending</i>
              content has yet been created in Pulp.  This accomplishes 2
              things:  1) it reduces complexity and decision making by the
              plugin;  2) it provides the ChangeSet with all the
              information needed to <i>create</i> and <i>download</i>
              as needed.  The <i>additions</i> represent what the
              plugin wants to be added to the repository to synchronize
              it with the remote repository. <br>
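              <br>
              To illustrate the split described above, here is a rough
              sketch in plain Python. The Pending class and the plan()
              function are hypothetical stand-ins invented for this example;
              they are not the PendingContent or ChangeSet implementation,
              only a model of the decisions the ChangeSet is said to make:<br>
<pre>
```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pending:
    """Hypothetical stand-in for a pending content unit and its artifact."""
    key: str  # natural key of the content unit
    url: str  # where its artifact would be downloaded from


def plan(additions, existing_keys, downloaded_urls):
    """Classify pending units: create vs. already created, and what to download.

    Every unit in 'additions' is meant to end up in the repository; the
    consumer, not the plugin, decides what must be created or downloaded.
    """
    to_create, already_created, to_download = [], [], []
    for pending in additions:
        if pending.key in existing_keys:
            already_created.append(pending.key)
        else:
            to_create.append(pending.key)
        if pending.url not in downloaded_urls:
            to_download.append(pending.url)
    return to_create, already_created, to_download


# The plugin lists everything it wants in the repository, without checking
# what already exists. Illustrative data only.
wanted = [
    Pending("a", "http://example.com/a"),
    Pending("b", "http://example.com/b"),
]
to_create, already_created, to_download = plan(
    wanted,
    existing_keys={"b"},
    downloaded_urls={"http://example.com/b"},
)
```
</pre>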
              <span> <br>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div class="gmail_extra">
                      <div class="gmail_quote">
                        <div>- determining which content units need to
                          be removed (aka computing the removals)<br>
                        </div>
                      </div>
                    </div>
                  </div>
                </blockquote>
                <br>
              </span> I don't see how the ChangeSet has enough
              information to do this.  The plugin can (most likely will)
              make the decision about what to remove based on remote
              metadata, policy and configuration.<span><br>
                <br>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div class="gmail_extra">
                      <div class="gmail_quote">
                        <div> <br>
                        </div>
                        <div>^ These goals restate the problem
                          statement: plugin writers are asked to do
                          differencing calculations when the Changesets
                          could provide that to the plugin writer
                          instead.<br>
                        </div>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF"><span>
                              <blockquote type="cite">
                                <div dir="ltr">
                                  <div class="gmail_extra">
                                    <div class="gmail_quote">
                                      <div> </div>
                                      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                        <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                                            <blockquote type="cite">
                                              <div dir="ltr">
                                                <div>
                                                  <div>
                                                    <div>
                                                      <div class="gmail_extra">
                                                        <div class="gmail_quote">
                                                          <div><br>
                                                          So a shorter,
                                                          simpler
                                                          problem
                                                          statement is:
                                                          "to use the
                                                          changesets
                                                          plugin writers
                                                          have to do
                                                          extra work to
                                                          compute
                                                          additions and
                                                          removals
                                                          parameters".<br>
                                                          </div>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </div>
                                                </div>
                                              </div>
                                            </blockquote>
                                            <br>
                                          </span> This statement ^ is
                                          better but still too vague to
                                          actually solve.  Can we
                                          elaborate on specifically what
                                          "to do extra work" means?<span><br>
                                          </span></div>
                                      </blockquote>
                                      <div><br>
                                      </div>
                                      <div>Sure. Removing that vague
                                        language is one way to resolve
                                        its vagueness. Here's a revised
                                        problem statement: <span class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933gmail-">"to
                                          use the changesets plugin
                                          writers have to compute
                                          additions and removals
                                          parameters". This problem
                                          statement would be resolved by
                                          a solution in which the
                                          plugin writer never has to
                                          produce these parameters, which
                                          are replaced by an interface
                                          that requires less effort
                                          from a plugin writer.</span><br>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </blockquote>
                              <br>
                            </span> I think it's the plugin's
                            responsibility to determine the difference. 
                            Aside from that: without an understanding of
                            the metadata and content type, how could the
                            ChangeSet do this?  What might that look
                            like?<span><br>
                            </span></div>
                        </blockquote>
                        <div> <br>
                        </div>
                        <div>If I'm understanding this correctly, the
                          Changesets already do this for additions
                          right? Help check my understanding. If a
                          plugin writer delivers PendingContent and
                          PendingArtifacts to the Changesets as
                          'additions', the Changesets will recognize
                          them as already downloaded and not download
                          them right? If this is the case, what is the
                          benefit of having plugin writers also try to
                          figure out if things should be downloaded?<br>
                        </div>
                      </div>
                    </div>
                  </div>
                </blockquote>
                <br>
              </span> As you pointed out, the plugin writer does not
              need to figure out what needs to be downloaded.<span><br>
                <br>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div class="gmail_extra">
                      <div class="gmail_quote">
                        <div><br>
                        </div>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                              <blockquote type="cite">
                                <div dir="ltr">
                                  <div class="gmail_extra">
                                    <div class="gmail_quote">
                                      <div> <br>
                                      </div>
                                      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                        <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                                            <blockquote type="cite">
                                              <div dir="ltr">
                                                <div>
                                                  <div>
                                                    <div>
                                                      <div class="gmail_extra">
                                                        <div class="gmail_quote">
                                                          <div> </div>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div class="gmail_extra">
                                                          <div class="gmail_quote">
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045HOEnZb">
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045h5">
                                                          <div text="#000000" bgcolor="#FFFFFF">
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-forward-container">
                                                          <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div>
                                                          <div>
                                                          <div><br>
                                                          </div>
                                                          2. Plugins
                                                          can't do
                                                          end-to-end
                                                          stream
                                                          processing.
                                                          The Changesets
                                                          themselves do
                                                          stream
                                                          processing,
                                                          but when you
                                                          call into
                                                          changeset.apply_and_drain()
                                                          you have to
                                                          have fully
                                                          parsed the
                                                          metadata
                                                          already.
                                                          Currently when
                                                          fetching all
                                                          metadata from
                                                          Galaxy,
                                                          pulp_ansible
                                                          takes about
                                                          380 seconds
                                                          (6+ min). This
                                                          means that the
                                                          actual
                                                          Changeset
                                                          content
                                                          downloading
                                                          starts 380
                                                          seconds later
                                                          than it could.
                                                          At the heart
                                                          of the
                                                          problem, the
                                                          fetching+parsing
                                                          of the
                                                          metadata is
                                                          not part of
                                                          the stream
                                                          processing.<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          The
                                                          additions/removals
                                                          can be any
                                                          iterable
                                                          (like a
                                                          generator) and
                                                          by using
                                                          ChangeSet.apply()
                                                          and iterating
                                                          the returned
                                                          object, the
                                                          plugin can
                                                          "turn the
                                                          crank" while
                                                          downloading
                                                          and processing
                                                          the metadata. 
                                                          The
                                                          ChangeSet.apply_and_drain()
                                                          is just a
                                                          convenience
                                                          method.  I
                                                          don't see how
                                                          this is a
                                                          limitation of
                                                          the ChangeSet.<br>
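                                                          <br>
                                                          A rough sketch of
                                                          the "turn the
                                                          crank" idea, using
                                                          plain Python
                                                          generators. The
                                                          functions below are
                                                          hypothetical
                                                          stand-ins, not the
                                                          real ChangeSet API;
                                                          apply() here only
                                                          imitates a consumer
                                                          that lazily iterates
                                                          its additions:<br>
<pre>
```python
log = []  # records the order in which work happens


def fetch_metadata_pages():
    """Stand-in for paged remote metadata; yields one page at a time."""
    for number, page in enumerate((["role-a", "role-b"], ["role-c"])):
        log.append(("fetched page", number))
        yield page


def additions():
    """Generator of pending units; nothing is fetched until it is iterated."""
    for page in fetch_metadata_pages():
        for name in page:
            yield name


def apply(pending):
    """Hypothetical changeset-like consumer, itself a generator of reports."""
    for name in pending:
        log.append(("downloaded", name))
        yield name


# Iterating the returned object drives metadata fetching and downloading
# together, one unit at a time.
reports = list(apply(additions()))
```
</pre>
                                                          In this sketch the
                                                          first download is
                                                          logged before the
                                                          second metadata page
                                                          is fetched, which is
                                                          the interleaving
                                                          described above.<br>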
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <div><br>
                                                          </div>
                                                          <div>That is
                                                          new info for
                                                          me (and maybe
                                                          everyone). OK
                                                          so Changesets
                                                          have two
                                                          interfaces.
                                                          apply() and
                                                          apply_and_drain().
                                                          Why do we have
                                                          two interfaces
                                                          when apply()
                                                          can support
                                                          all existing
                                                          use cases
                                                          (that I know
                                                          of) and do
                                                          end-to-end
                                                          stream
                                                          processing but
apply_and_drain() cannot? I see all of our examples (and all of our new
                                                          plugins) using
apply_and_drain().<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          </span> The
                                                          ChangeSet.apply()
                                                          was how I
                                                          designed (and
                                                          documented)
                                                          it.  Not sure
                                                          when/who added
                                                          the
                                                          apply_and_drain(). 
                                                          +1 for
                                                          removing it.<span><br>
                                                          </span></div>
                                                          </blockquote>
                                                          <div><br>
                                                          </div>
                                                          <div>I read
                                                          through the
                                                          changeset
                                                          docs. I think
                                                          this stream
                                                          processing
                                                          thing is still
                                                          a problem, but
                                                          perhaps in how
                                                          we're
                                                          presenting the
                                                          Changeset with
                                                          its
                                                          arguments. I
                                                          don't think
                                                          the choice of
                                                          apply() versus
                                                          apply_and_drain()
                                                          is related to
                                                          it at all.
                                                          Regardless of
                                                          whether you
                                                          are using
                                                          apply() or
                                                          apply_and_drain(),
                                                          the Changeset
                                                          requires
                                                          'additions'
                                                          and 'removals'
                                                          arguments.
                                                          This sends a
                                                          clear message
                                                          to the plugin
                                                          writer that
                                                          they need to
                                                          compute
                                                          additions and
                                                          removals. They
                                                          will fetch the
                                                          metadata to
                                                          compute these,
                                                          which is
                                                          mostly how the
                                                          changeset
                                                          documentation
                                                          reads. That
                                                          they could
                                                          instead present
                                                          a generator
                                                          that would
                                                          let the
                                                          metadata be
                                                          fetched from
                                                          inside the
                                                          Changeset is, I
                                                          feel,
                                                          non-obvious. I
                                                          want the
                                                          high-performing
                                                          implementation
                                                          to be the
                                                          obvious one.<br>
                                                          <br>
                                                          So what about
                                                          a problem
                                                          statement like
                                                          this:
                                                          "Changesets
                                                          are presented
                                                          such that when
                                                          you call into
                                                          them you
                                                          should already
                                                          have fetched
                                                          the metadata"?<br>
                                                          </div>
                                                        </div>
                                                      </div>
                                                    </div>
                                                  </div>
                                                </div>
                                              </div>
                                            </blockquote>
                                            <br>
                                          </span> I'm not sure what is
                                          meant by "presented".  If this
                                          means that we should provide
                                          an example of how the
                                          ChangeSet can be used by
                                          plugins (with large metadata)
                                          in such a way that does not
                                          require downloading all the
                                          metadata first - that sounds
                                          like a good idea. <br>
                                        </div>
                                      </blockquote>
                                      <div><br>
                                      </div>
                                      <div>Cool, so this is transitioning
                                        to ideas for resolution. Adding
                                        documentation on
                                        how to do this with the existing
                                        interface is one option. My
                                        concern with adding additional
                                        docs on how to use the current
                                        interface better is that if
                                        users choose to follow the
                                        existing docs then they will
                                        have the stream processing
                                        problem once again. To me, this
                                        suggests that this new example
                                        should actually replace the
                                        existing documentation.<br>
                                      </div>
                                    </div>
                                  </div>
                                </div>
                              </blockquote>
                              <br>
                            </span> Seems like both examples would be
                            useful.  I'm not convinced that all plugins
                            would benefit from this.  For example: the
                            File plugin manifest is small and would
                            likely not benefit from the extra
                            complexity.  For complicated plugins (like
                            RPM), can differencing decisions be made
                            before analyzing the entire metadata (e.g.,
                            primary.xml)?  Also, it's not clear to me
                            how this would work using the Downloader. 
                            Are you suggesting that the plugin would
                            parse/process metadata files while they're
                            being downloaded?  Perhaps a better
                            understanding of the flow to be supported
                            would help me understand this.<span><br>
                            </span></div>
                        </blockquote>
                        <div><br>
                        </div>
                        <div>Yes, I am suggesting just that: the
                          Changesets could facilitate parsing/processing
                          metadata files while the actual content named in
                          those files is also being downloaded. I have a
                          straightforward idea on how to achieve this.
                          It's short and easy enough to write up (no
                          code), but I want to make sure I'm not moving
                          beyond the problem statement without others.
                          Is there more we want to do on these problem
                          statements, or would answering a bit about one
                          way it could work be helpful?<br>
                        </div>
                      </div>
                    </div>
                  </div>
                </blockquote>
                <br>
              </span> The <i>additions</i> can be (and usually is) a
              generator.  The generator can yield based on metadata as
              it is downloaded and digested.  In this way, the ChangeSet
              already facilitates this.<span><br>
                <br>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div class="gmail_extra">
                      <div class="gmail_quote">
                        <div><br>
                          Just to state my expectations: I don't consider
                          moving beyond the problem statement to be a
                          commitment to solve it, just an agreement on
                          what we're solving as we discuss various
                          resolutions. Problem statements can also
                          always be revisited. Either way forward is
                          fine with me; just let me know how we should
                          continue.<br>
                        </div>
                      </div>
                    </div>
                  </div>
                </blockquote>
                <br>
              </span> So far, I'm not convinced that any specific
              problems/deficiencies have been identified.  That said,
              you seem to have a different abstraction in mind. I would
              be interested in reviewing it and how it would be used by
              plugin writers.  It may help illustrate the gains that you
              are envisioning.<span><br>
                <br>
                <blockquote type="cite">
                  <div dir="ltr">
                    <div class="gmail_extra">
                      <div class="gmail_quote">
                        <div> </div>
                        <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                          <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                              <blockquote type="cite">
                                <div dir="ltr">
                                  <div class="gmail_extra">
                                    <div class="gmail_quote">
                                      <div> </div>
                                      <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                        <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                                            <blockquote type="cite">
                                              <div dir="ltr">
                                                <div>
                                                  <div>
                                                    <div>
                                                      <div class="gmail_extra">
                                                        <div class="gmail_quote">
                                                          <div> <br>
                                                          </div>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div text="#000000" bgcolor="#FFFFFF"><span> <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div class="gmail_extra">
                                                          <div class="gmail_quote">
                                                          <div> </div>
                                                          <blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045HOEnZb">
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045h5">
                                                          <div text="#000000" bgcolor="#FFFFFF">
                                                          <div class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-forward-container">
                                                          <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div>
                                                          <div><br>
                                                          </div>
                                                          Do you see the
                                                          same
                                                          challenges I
                                                          do? Are these
                                                          the right
                                                          problem
                                                          statements? I
                                                          think with
                                                          clear problem
                                                          statements a
                                                          solution will
                                                          be easy to see
                                                          and agree on.<br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          I'm not
                                                          convinced that
                                                          these are
                                                          actual
problems/challenges that need to be addressed in the near term.<br>
                                                          <br>
                                                          <blockquote type="cite">
                                                          <div dir="ltr">
                                                          <div>
                                                          <div><br>
                                                          </div>
                                                          Thanks!<br>
                                                          </div>
                                                          Brian<br>
                                                          </div>
                                                          <br>
                                                          <fieldset class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587mimeAttachmentHeader"></fieldset>
                                                          <br>
                                                          <pre>______________________________<wbr>_________________
Pulp-dev mailing list
<a class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-txt-link-abbreviated" href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a>
<a class="m_-8036996345952189131m_4769324167015933096m_-9031017347325771866m_-5537642402529512965m_-6788984366494356404m_9076289667921108317m_2885702093445463615m_-2307966803311662826m_-4677839867557154933m_-6005010689108662278m_2987425865952863938m_-2372350557339798045m_-5307126087037616587moz-txt-link-freetext" href="https://www.redhat.com/mailman/listinfo/pulp-dev" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/pulp-dev</a>
</pre>
                                                          </blockquote>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          </div>
                                                          <br>
                                                          </blockquote>
                                                          </div>
                                                          <br>
                                                          </div>
                                                          </div>
                                                          </blockquote>
                                                          <br>
                                                          </span></div>
                                                          </blockquote>
                                                        </div>
                                                        <br>
                                                      </div>
                                                    </div>
                                                  </div>
                                                </div>
                                              </div>
                                            </blockquote>
                                            <br>
                                          </span></div>
                                      </blockquote>
                                    </div>
                                    <br>
                                  </div>
                                </div>
                              </blockquote>
                              <br>
                            </span></div>
                          <br>
                        </blockquote>
                      </div>
                      <br>
                    </div>
                  </div>
                </blockquote>
                <br>
              </span></div>
            <br>
          </blockquote>
        </div>
        <br>
      </div>
    </blockquote>
    <br>
  </div></div></div>

<br></blockquote></div></div></div><br></div></div>
</blockquote></div></div></div><br></div></div>
<br></blockquote></div><br></div>
</div></div></blockquote></div><br></div>