[Libguestfs] [PATCH virt-v2v 1/2] v2v: -o rhv-upload: Split plugin functions from create/finalize

Tue Aug 3 19:39:44 UTC 2021

On Tue, Aug 03, 2021 at 09:18:50PM +0300, Nir Soffer wrote:
> I see, so if I understand how it works correctly, we have:
> 
> after fork:
> - called once
> - create pool of connections
> 
> open:
> - called once per nbd connection
> - does nothing since we already opened the connections
> 
> write/zero:
> - called once per nbd command
> - pick random connection and send the request
> 
> flush:
> - called once per nbd connection
> - flush all connections

If the other end of the http connections is multi-conn consistent,
then you only have to flush on one random connection.  (Right now,
qemu-nbd doesn't yet advertise multi-conn for writable exports, but
everything I've seen in the code says that it IS consistent, because
it serializes flush requests at the block layer backend regardless of
which client front-end sent them.  It's still on my todo list to get
qemu 6.2 to advertise multi-conn for writable qemu-nbd.)

If the other end of the http connection is not multi-conn consistent,
then you do indeed have to flush ALL of the http connections; at which
point, you are somewhat recreating the effects of the nbdkit
multi-conn filter.  However, the multi-conn filter only knows how to
propagate flush across openings into the plugin, and not into the pool
of http connections embedded within the plugin, so the duplication is
probably necessary rather than relying on the filter.
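
To make the two cases concrete, here is a minimal sketch of the flush
decision inside the plugin's pool.  It is not the rhv-upload code:
FakeHTTPConnection, flush_pool and the
server_is_multi_conn_consistent flag are hypothetical names, and the
sketch picks the first connection rather than a random one so the
result is deterministic.

```python
class FakeHTTPConnection:
    """Hypothetical stand-in for one http connection in the pool."""

    def __init__(self, name):
        self.name = name
        self.flush_count = 0

    def flush(self):
        # A real connection would send a flush request to the server.
        self.flush_count += 1


def flush_pool(pool, server_is_multi_conn_consistent):
    """Flush the pool and return how many flush requests were sent."""
    if server_is_multi_conn_consistent:
        # A flush on any one connection covers writes sent on all of them.
        pool[0].flush()
        return 1
    # Otherwise every connection may hold unflushed state of its own.
    for conn in pool:
        conn.flush()
    return len(pool)


pool = [FakeHTTPConnection(f"http-{i}") for i in range(4)]
sent_consistent = flush_pool(pool, True)
sent_inconsistent = flush_pool(pool, False)
```

With a pool of 4, the consistent case sends one request and the
inconsistent case sends four.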

> 
> close:
> - called once per nbd connection
> - close all connections on first call
> - does nothing on later call

Or do you want it to be more reference-counted, where it leaves all
connections open until the LAST nbd connection is closed?
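
A reference-counted variant could look roughly like the sketch below.
SharedPool and its methods are hypothetical names, and the strings in
the list stand in for real http connections; the only point is that
the pool is torn down on the last close rather than the first.

```python
import threading


class SharedPool:
    """Hypothetical pool shared by every NBD connection to the plugin."""

    def __init__(self, size):
        self._lock = threading.Lock()
        self._users = 0
        # Placeholder strings instead of real http connections.
        self.connections = [f"http-{i}" for i in range(size)]
        self.closed = False

    def open(self):
        # Called once per NBD connection.
        with self._lock:
            self._users += 1

    def close(self):
        # Called once per NBD connection; only the last caller tears
        # the pool down.
        with self._lock:
            self._users -= 1
            if self._users == 0:
                self.connections.clear()
                self.closed = True


shared = SharedPool(4)
for _ in range(4):
    shared.open()
for _ in range(3):
    shared.close()
closed_after_three = shared.closed
shared.close()
closed_after_four = shared.closed
```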

> 
> So we have some issues:
> - calling flush 16 times instead of 4 times

Are the later flushes expensive, or is the server smart enough to
optimize that if nothing has changed since the first flush, the later
flushes return immediately?

And whether it is called 1, 4, or 16 times (assuming 4 NBD clients
each talking to a pool of 4 http connections) depends on which parts
of the chain advertise (or assume) multi-conn semantics.  If you have
multi-conn all the way through, a single flush on any one http
connection is immediately visible to all other connections, and you
don't need the remaining flushes.  If you do not have multi-conn
consistency natively, then yes, replicating the flush across all
connections is necessary, although the multi-conn filter may let you
tweak whether that replication has to be done by the client or can be
emulated by nbdkit.  Also, the multi-conn filter can turn a single
flush from the NBD client into multiple flushes to the plugin; but it
also has an operational mode where it recognizes multiple flushes from
NBD clients and turns the later flushes into no-ops if nothing has
changed in the plugin in the meantime.
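
As a rough illustration of the arithmetic, and of how tracking "has
anything changed since the last flush" removes the redundant work,
consider the sketch below.  This is not how the multi-conn filter is
implemented; DirtyTrackingBackend is a hypothetical name, and a single
dirty flag for the whole export is an assumption made for brevity.

```python
NBD_CLIENTS = 4
POOL_SIZE = 4

# Without any optimization, every NBD client flush is replicated to
# every http connection in the pool.
naive_flushes = NBD_CLIENTS * POOL_SIZE


class DirtyTrackingBackend:
    """Hypothetical backend that skips a flush when nothing changed."""

    def __init__(self, pool_size):
        self.pool_size = pool_size
        self.dirty = False
        self.http_flushes = 0

    def write(self):
        self.dirty = True

    def flush(self):
        if not self.dirty:
            # Nothing written since the previous flush: no-op.
            return
        # Flush every http connection once, then mark the export clean.
        self.http_flushes += self.pool_size
        self.dirty = False


backend = DirtyTrackingBackend(POOL_SIZE)
backend.write()
for _ in range(NBD_CLIENTS):
    backend.flush()
```

Here the naive count is 16, while the dirty-tracking backend sends 4
http flushes for the same 4 client flushes.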

> 
> Can we disable the parallel threading model with multi-conn, or must
> we use it to have concurrent calls?

Parallel threading is necessary if you want concurrent calls (more
than one outstanding read or write at a time).  That's independent of
whether you have multi-conn.  Note that you can have a semi-serialized
connection that advertises multi-conn but does not allow parallel
requests on a given connection (you have to use parallel NBD clients
to get any parallelism); and conversely you can have a parallel
connection that does not advertise multi-conn (parallel requests are
supported on any connection, but flush on one connection does not
necessarily affect the other connections, and you may see stale data
from a per-connection cache if you don't flush on all connections).
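
The last case, where flush on one connection does not cover the
others, can be modelled with a toy write-back cache per connection.
This is purely illustrative: CachedConnection is a hypothetical class,
and a real server's caching behaviour may differ.

```python
class CachedConnection:
    """Hypothetical connection with its own write-back cache."""

    def __init__(self, store):
        self.store = store      # shared backing store
        self.pending = {}       # writes not yet flushed

    def write(self, block, data):
        self.pending[block] = data

    def read(self, block):
        # Own pending writes are visible; other connections' are not.
        if block in self.pending:
            return self.pending[block]
        return self.store.get(block, b"old")

    def flush(self):
        # Only this connection's pending writes reach the store.
        self.store.update(self.pending)
        self.pending.clear()


store = {}
a = CachedConnection(store)
b = CachedConnection(store)
reader = CachedConnection(store)

a.write(0, b"new")
b.write(1, b"new")

a.flush()
after_one_flush = (reader.read(0), reader.read(1))

b.flush()
after_all_flushes = (reader.read(0), reader.read(1))
```

After flushing only connection a, the reader still sees the old
contents of block 1; it sees both new blocks only once b has flushed
as well.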

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org