<div dir="ltr"><div>As described in issue 3770, pulp_file syncs 2.4x slower than Pulp2 [0]. I believe we want Pulp3 to sync at least as fast as Pulp2. I think we should consider making the goal of "have Pulp3 sync as fast as Pulp2" a Pulp3 GA requirement. My reasoning is twofold: (a) users aren't going to switch to something over twice as slow, and (b) we will likely have to make some non-trivial database changes, so we should do them now.<br></div><div><br></div><div>How do you feel about this goal/need?<br></div><div><br></div><div>In terms of tackling the problems themselves, I've separated the performance issue into 3 different performance problems:<br></div><div><br></div><div><a href="https://pulp.plan.io/issues/3812" target="_blank">https://pulp.plan.io/issues/38<wbr>12</a><br></div><div><a href="https://pulp.plan.io/issues/3813" target="_blank">https://pulp.plan.io/issues/38<wbr>13</a></div><div><a href="https://pulp.plan.io/issues/3814" target="_blank">https://pulp.plan.io/issues/38<wbr>14</a></div><div><br></div><div>Any feedback or discussion on these is welcome. I plan to help organize ideas as we explore possible solutions. Once more info and a few vetted ideas are available, I plan to bring them back to the list. 
If anyone wants to talk through them before then, feel free to reach out to me.</div><div><br></div><div>[0]: <a href="https://pulp.plan.io/issues/3770#note-5">https://pulp.plan.io/issues/3770#note-5</a><br></div><div><br></div><div>-Brian<br></div><div><br></div><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 21, 2018 at 4:50 PM, Brian Bouterse <span dir="ltr">&lt;<a href="mailto:bbouters@redhat.com" target="_blank">bbouters@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>I just tried an implementation of DeclarativeVersion that uses bulk_create for all content units, content artifacts, and remote artifacts.</div><div><br></div><div>The content units are incompatible with bulk_create(). Trying to save a batch of content units with bulk_create raises: ValueError: Can't bulk create a multi-table inherited model<br></div></div><div class="gmail-m_648007178859630485m_3606263288339347409HOEnZb"><div class="gmail-m_648007178859630485m_3606263288339347409h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 21, 2018 at 4:19 PM, Brian Bouterse <span dir="ltr">&lt;<a href="mailto:bbouters@redhat.com" target="_blank">bbouters@redhat.com</a>&gt;</span> wrote:<br><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div></div><div>I'm only considering these changes for the plugin writer API to help resolve the performance issues.<br></div></div><div class="gmail-m_648007178859630485m_3606263288339347409m_280425820938108662HOEnZb"><div class="gmail-m_648007178859630485m_3606263288339347409m_280425820938108662h5"><div class="gmail_extra"><br><div class="gmail_quote">On Thu, Jun 21, 2018 at 4:11 PM, Austin Macdonald <span dir="ltr">&lt;<a href="mailto:amacdona@redhat.com" target="_blank">amacdona@redhat.com</a>&gt;</span> wrote:<br><blockquote 
class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr"><div>For models, bulk_create seems good to me. Endpoints that kick off tasks like sync, which use bulk_create, seem fine.</div><div><br></div><div>Are you also proposing we have bulk_create for non-task REST API calls? Should a user be able to POST a list of dictionaries that becomes a set of Content? I'm open to it, but it seems like it could get ugly.<br></div></div><div class="gmail_extra"><br><div class="gmail_quote"><div><div class="gmail-m_648007178859630485m_3606263288339347409m_280425820938108662m_-7831704515762135928h5">On Thu, Jun 21, 2018 at 3:54 PM, Brian Bouterse <span dir="ltr">&lt;<a href="mailto:bbouters@redhat.com" target="_blank">bbouters@redhat.com</a>&gt;</span> wrote:<br></div></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div><div class="gmail-m_648007178859630485m_3606263288339347409m_280425820938108662m_-7831704515762135928h5"><div dir="ltr"><div>I've run cProfile on some of the sync code for Pulp3 and I've noticed that we may have some problems with bulk_create on some of the object types.</div><div><br></div><div>Here is a small analysis I did: <a href="https://pulp.plan.io/issues/3770#note-2" target="_blank">https://pulp.plan.io/issues/37<wbr>70#note-2</a></div><div><br></div><div>As an aside, we don't have a bulk add option for RepositoryVersion.add_content, so each round trip to the db handles only one unit. When you're processing 70K units, that's a lot of trips. 
I don't think we have to add this right now, but to resolve an issue like 3770 we may need to.<br></div><div><br></div><div>I do think we should make our models compatible with bulk_create now either way.<br></div><div><br></div><div>What do you think?</div><span class="gmail-m_648007178859630485m_3606263288339347409m_280425820938108662m_-7831704515762135928m_-5060667670849886742HOEnZb"><font color="#888888"><div><br></div><div>-Brian<br></div></font></span></div>
<br></div></div><span>______________________________<wbr>_________________<br>
Pulp-dev mailing list<br>
<a href="mailto:Pulp-dev@redhat.com" target="_blank">Pulp-dev@redhat.com</a><br>
<a href="https://www.redhat.com/mailman/listinfo/pulp-dev" rel="noreferrer" target="_blank">https://www.redhat.com/mailman<wbr>/listinfo/pulp-dev</a><br>
<br></span></blockquote></div><br></div>
</blockquote></div><br></div>
</div></div></blockquote></div><br></div>
</div></div></blockquote></div><br></div></div>
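<div><br></div><div>To illustrate the round-trip concern raised in the thread (one database trip per unit when adding content, versus a bulk option), here is a minimal sketch using only Python's standard-library sqlite3 module. It is not Pulp or Django code: the repo_content table and the helper names make_db, add_one_at_a_time, and add_in_batches are hypothetical, and the number of SQL statements issued stands in for network round trips, since an in-memory SQLite database has none.</div><div><br></div>

```python
import sqlite3


def make_db():
    """Create an in-memory database with a hypothetical repo_content table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE repo_content (unit_id INTEGER PRIMARY KEY)")
    return conn


def add_one_at_a_time(conn, unit_ids):
    """Issue one INSERT per unit and return the number of statements sent."""
    for unit_id in unit_ids:
        conn.execute("INSERT INTO repo_content (unit_id) VALUES (?)", (unit_id,))
    return len(unit_ids)


def add_in_batches(conn, unit_ids, batch_size=500):
    """Issue one multi-row INSERT per batch and return the statement count."""
    statements = 0
    for start in range(0, len(unit_ids), batch_size):
        batch = unit_ids[start:start + batch_size]
        # One "(?)" placeholder group per row in this batch.
        placeholders = ", ".join(["(?)"] * len(batch))
        conn.execute(
            f"INSERT INTO repo_content (unit_id) VALUES {placeholders}", batch
        )
        statements += 1
    return statements


if __name__ == "__main__":
    unit_ids = list(range(70_000))  # the 70K units mentioned in the thread
    print(add_one_at_a_time(make_db(), unit_ids))  # 70000 statements
    print(add_in_batches(make_db(), unit_ids))     # 140 statements
```

<div>Both helpers leave the same rows in the table; only the number of statements differs. With a networked database, each statement costs at least one round trip, which is why a bulk path matters at this scale.</div>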