[Libguestfs] FYI: perf commands I'm using to benchmark nbdcopy
Nir Soffer
nsoffer at redhat.com
Wed May 26 22:16:00 UTC 2021
On Thu, May 27, 2021 at 1:11 AM Nir Soffer <nsoffer at redhat.com> wrote:
>
> On Thu, May 27, 2021 at 12:36 AM Eric Blake <eblake at redhat.com> wrote:
> >
> > On Wed, May 26, 2021 at 03:15:13PM +0100, Richard W.M. Jones wrote:
> > > On Wed, May 26, 2021 at 04:49:50PM +0300, Nir Soffer wrote:
> > > > On Wed, May 26, 2021 at 4:03 PM Richard W.M. Jones <rjones at redhat.com> wrote:
> > > > > In my testing, nbdcopy is a clear 4x faster than qemu-img convert, with
> > > > > 4 also happening to be the default number of connections/threads.
> > > > > Why use nbdcopy --connections=1? That completely disables threads in
> > > > > nbdcopy.
> > > >
> > > > Because qemu-nbd does not report multi-conn when writing, so in practice
> > > > you get one NBD handle for writing.
> > >
> > > Let's see if we can fix that. Crippling nbdcopy because of a missing
> > > feature in qemu-nbd isn't right. I wonder what Eric's reasoning for
> > > multi-conn not being safe is.
> >
> > multi-conn implies that connection A writes, connection B flushes, and
> > connection C is then guaranteed to read what connection A wrote.
> > Furthermore, if client A and B plan on doing overlapping writes, the
> > presence of multi-conn means that whoever flushes last is guaranteed
> > to have that last write stick. Without multi-conn, even if A writes, B
> > writes, B flushes, then A flushes, you can end up with A's data
> > (rather than B's) as the final contents on disk, because the separate
> > connections are allowed to have separate caching regions where the
> > order of flushes determines which cache (with potentially stale data)
> > gets flushed when. And note that the effect of overlapping writes may
> > happen even when your client requests are not overlapping: if client A
> > and B both write distinct 512 byte regions within a larger 4k page,
> > the server performing RMW caching of that page will behave as though
> > there are overlapping writes.
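Eric's scenario can be illustrated with a toy model (not libnbd, just a
hypothetical simulation): two connections each buffer writes in a private
write-back cache, and a flush pushes that cache to the shared disk. With
separate caches, the order of *flushes*, not the order of writes, decides
whose data survives.

```python
class Connection:
    """Toy NBD connection with its own write-back cache."""

    def __init__(self, disk):
        self.disk = disk     # shared backing store (dict: offset -> byte)
        self.cache = {}      # private cache, only visible after flush()

    def write(self, offset, data):
        for i, byte in enumerate(data):
            self.cache[offset + i] = byte

    def flush(self):
        # Push this connection's cached writes to the shared disk.
        self.disk.update(self.cache)
        self.cache.clear()

disk = {}
a, b = Connection(disk), Connection(disk)

a.write(0, b"AAAA")   # A writes first...
b.write(0, b"BBBB")   # ...then B overwrites the same region
b.flush()             # B flushes first
a.flush()             # A flushes last

# A's older data wins, because A's cache was flushed last.
print(bytes(disk[i] for i in range(4)))   # b'AAAA'
```

This is exactly the stale-data hazard described above: without the
multi-conn guarantee, each connection's cache is independent.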
> >
> > During nbdcopy or qemu-img convert, we aren't reading what we just
> > wrote and can easily arrange to avoid overlapping writes, so we don't
> > care about the bulk of the semantics of multi-conn (other than it is a
> > nice hint of a server that accepts multiple clients). So at the end
> > of the day, it boils down to:
> >
> > If the server advertised multi-conn: connect multiple clients, then
> > when all of them are done writing, only ONE client has to flush, and
> > the flush will be valid for what all of the clients wrote.
> >
> > If the server did not advertise multi-conn, but still allows multiple
> > clients: connect those clients, write to distinct areas (avoid
> > overlapping writes, and hopefully your writes are sized large enough
> > that you are also avoiding overlapping cache granularities); then when
> > all clients are finished writing, ALL of them must call flush
> > (ideally, only the first flush takes any real time, and the server can
> > optimize the later flushes as having nothing further to flush - but we
> > can't guarantee that).
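The two cases Eric describes can be sketched as a small helper deciding
which handles need a flush at the end of the copy (the names here are
hypothetical, not nbdcopy's real internals):

```python
def handles_to_flush(handles, can_multi_conn):
    """With multi-conn, one flush covers what every connection wrote;
    without it, every connection must flush its own cache."""
    if can_multi_conn:
        return handles[:1]   # any single handle suffices
    return list(handles)     # all handles must flush

print(handles_to_flush(["h0", "h1", "h2"], True))   # ['h0']
print(handles_to_flush(["h0", "h1", "h2"], False))  # ['h0', 'h1', 'h2']
```

Flushing all handles is always safe; the multi-conn flag only permits
skipping the redundant flushes.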
>
> In the nbdcopy case, all writers write 128 MiB segments, except the last
> one, which may be a shorter segment, but that cannot overlap with
> anything. So I think we cannot have overlapping writes due to partial
> blocks.
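The segment split can be modelled roughly like this (a sketch of the
division described above, not nbdcopy's actual code): the image is cut
into fixed-size 128 MiB segments, each worker writes only its own
segments, and only the final segment can be shorter.

```python
SEGMENT = 128 * 1024 * 1024   # 128 MiB, as in the message

def segments(disk_size):
    """Yield (offset, length) pairs covering disk_size with no overlaps;
    only the final segment may be shorter than SEGMENT."""
    offset = 0
    while offset < disk_size:
        length = min(SEGMENT, disk_size - offset)
        yield offset, length
        offset += length

segs = list(segments(300 * 1024 * 1024))   # a 300 MiB image
# 128 MiB + 128 MiB + 44 MiB, back to back with no gaps or overlaps
print([(o // 2**20, l // 2**20) for o, l in segs])  # [(0, 128), (128, 128), (256, 44)]
```

Since each segment starts exactly where the previous one ends, no two
writers ever touch the same byte range.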
>
> Optimizing flush to do only one flush when the server supports multi-conn
> seems like a very low priority to me, so we can do this later.
>
> Looks like nbd_ops_flush is already doing the right thing, flushing all
> open handles.
>
> So the only change needed seems to be:
>
> diff --git a/copy/main.c b/copy/main.c
> index b9dbe1d..924fedb 100644
> --- a/copy/main.c
> +++ b/copy/main.c
> @@ -321,10 +321,6 @@ main (int argc, char *argv[])
> exit (EXIT_FAILURE);
> }
>
> - /* If multi-conn is not supported, force connections to 1. */
> - if (! src->ops->can_multi_conn (src) || ! dst->ops->can_multi_conn (dst))
> - connections = 1;
> -
> /* Calculate the number of threads from the number of connections. */
> if (threads == 0) {
> long t;
Missing the next chunk:
@@ -384,10 +380,8 @@ main (int argc, char *argv[])
*/
if (connections > 1) {
assert (threads == connections);
- if (src->ops->can_multi_conn (src))
- src->ops->start_multi_conn (src);
- if (dst->ops->can_multi_conn (dst))
- dst->ops->start_multi_conn (dst);
+ src->ops->start_multi_conn (src);
+ dst->ops->start_multi_conn (dst);
}
/* If the source is NBD and we couldn't negotiate meta
>
> We always use the requested number of connections, ignoring can_multi_conn.
>
> What do you think?