<div dir="ltr"><div dir="ltr">On Tue, Feb 15, 2022 at 7:01 PM Richard W.M. Jones &lt;<a href="mailto:rjones@redhat.com">rjones@redhat.com</a>&gt; wrote:<br></div><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On Tue, Feb 15, 2022 at 06:38:55PM +0200, Nir Soffer wrote:<br> > On Tue, Feb 15, 2022 at 5:54 PM Richard W.M. Jones &lt;<a href="mailto:rjones@redhat.com" target="_blank">rjones@redhat.com</a>&gt; wrote:<br> > <br> > Pick the nbdcopy --requests parameter to target an implicit buffer<br> > size of 64M inside nbdcopy. However don't set nbdcopy --requests &lt; 64.<br> > <br> > If request_size == 256K (the default) => requests = 256<br> > If request_size == 8M => requests = 64 (buffer size 512M)<br> > <br> > <br> > Considering the total bytes buffered makes sense. I did the same in another<br> > application that only reads from NBD using libnbd async API. I'm using:<br> > <br> > max_requests = 16<br> > max_bytes = 2m<br> > <br> > So if you have small requests (e.g. 
4k), you get 16 inflight requests per<br> > connection<br> > and with 4 connections 64 inflight requests on the storage side.<br> > <br> > But if you have large requests (256k), you get only 8 requests per connection<br> > and<br> > 32 requests on the storage side.<br> > <br> > This was tested in a read-only case both on my laptop with fast NVMe<br> > (Samsung 970 EVO Plus 1T) and with super fast NVMe on Dell server,<br> > and with shared storage (NetApp iSCSI).<br> > <br> > With fast NVMe, limiting the maximum buffered bytes to 1M is actually<br> > ~10% faster, but with shared storage using more requests is faster.<br> > <br> > What you suggest here will result in:<br> > small requests: 256 requests per connection, 1024 requests on storage side<br> > large requests: 64 requests per connection, 256 requests on storage side.<br> <br> So a note here that we're not using multi-conn when converting from<br> VDDK because VDDK doesn't behave well:<br> <br> <a href="https://github.com/libguestfs/virt-v2v/commit/bb0e698360470cb4ff5992e8e01a3165f56fe41e" rel="noreferrer" target="_blank">https://github.com/libguestfs/virt-v2v/commit/bb0e698360470cb4ff5992e8e01a3165f56fe41e</a><br> <br> > I don't think any storage can handle such a large amount of connections better.<br> > <br> > I think we should test --requests 8 first, it may show nice speedup compared<br> > to what we see in <br> > <a href="https://bugzilla.redhat.com/show_bug.cgi?id=2039255#c33" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=2039255#c33</a><br> > <br> > Looks like in <br> > <a href="https://bugzilla.redhat.com/show_bug.cgi?id=2039255#c32" rel="noreferrer" target="_blank">https://bugzilla.redhat.com/show_bug.cgi?id=2039255#c32</a><br> > <br> > We introduced 2 changes at the same time, which makes it impossible to tell<br> > the effect of any single change.<br> <br> I couldn't measure any performance benefit from increasing the number<br> of requests, but also it didn't have 
any down-side. Ming Xie also did<br> a test and she didn't see any benefit or loss either.<br> <br> The purpose of the patch (which I didn't explain well) was to ensure<br> that if we make the request-size larger, we don't blow up nbdcopy<br> memory usage too much. So aim for a target amount of memory consumed<br> in nbdcopy buffers (64M), but conservatively never reducing #buffers<br> below the current setting (64).</blockquote><div><br></div><div>The intent is good. I think we need to refine the</div><div>actual sizes, but this can be done later.</div><div><br></div><div>Also, it would be better to do this in nbdcopy instead of virt-v2v, since it</div><div>would improve all callers of nbdcopy.</div><div><br></div><div>Nir</div></div></div>
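<div>For reference, a minimal Python sketch of the two sizing policies discussed in this thread. The function names and parameters are hypothetical, chosen only for illustration; this is not code from virt-v2v or nbdcopy. It assumes 1K = 1024 bytes and that the request size divides the byte budget evenly.</div>

```python
KIB = 1024
MIB = 1024 * KIB


def requests_for_target(request_size, target_bytes=64 * MIB, min_requests=64):
    """Policy from the patch: aim for target_bytes of buffers in total,
    but never go below min_requests (the current setting)."""
    return max(min_requests, target_bytes // request_size)


def requests_capped(request_size, max_requests=16, max_bytes=2 * MIB):
    """Policy from the reply: per connection, allow at most max_requests
    in flight and at most max_bytes buffered, whichever limit is hit first."""
    return max(1, min(max_requests, max_bytes // request_size))


if __name__ == "__main__":
    for size in (256 * KIB, 8 * MIB):
        n = requests_for_target(size)
        print(f"target policy: request_size={size // KIB}K "
              f"requests={n} buffer={n * size // MIB}M")
    for size in (4 * KIB, 256 * KIB):
        n = requests_capped(size)
        print(f"capped policy: request_size={size // KIB}K requests={n}")
```

<div>With the defaults above, the first policy gives 256 requests at 256K and 64 requests at 8M (a 512M buffer), and the second gives 16 requests at 4k and 8 requests at 256k, matching the numbers quoted in the thread.</div>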