[Libguestfs] Fwd: nbdkit async

Richard W.M. Jones rjones at redhat.com
Sun Feb 19 16:34:20 UTC 2017


----- Forwarded message -----

Date: Sat, 18 Feb 2017 22:21:19 -0500
Subject: nbdkit async

Hello,

Hope this is the right person to contact regarding nbdkit design.

I have a high latency massively parallel device that I am currently
implementing as an nbdkit plugin in c++ and have run into some design
limitations due to the synchronous callback interface nbdkit requires.

Nbdkit is currently designed to call the plugin
pread/pwrite/trim/flush/zero ops as synchronous calls and expects when the
plugin functions return that it can then send the nbd reply to the socket.
It's parallel thread model is also not implemented as of yet but the
current design still mandates a worker thread per parallel op in progress
due to the synchronous design of the plugin calls.

I would like to modify this to allow for an alternative operating mode
where nbdkit calls the plugin functions and expects the plugin to callback
to nbdkit when a request has completed rather than responding right after
the plugin call returns to nbdkit.

If you are familiar with fuse low level api design, something very similar
to that.

An example flow for a read request would be as follows:

1) nbdkit reads and validates the request from the socket
2) nbdkit calls handle_request but now also passing in the nbd request
handle value
3) nbdkit bundles the nbd request handle value, bool flush_on_update, and
read size into an opaque ptr to struct
4) nbdkit calls my_plugin.pread passing in the usual args + the opaque ptr
5) my_plugin.pread makes an asynchronous read call with a handler set on
completion to call nbdkit_reply_read(conn, opaque ptr, buf) or on error
nbdkit_reply_error(conn, opaque_ptr, error)
6) my_plugin.pread returns back to nbdkit without error after it has
started the async op but before it has completed
7) nbdkit doesn't send a response to the conn->sockout beause  when the
async op has completed my_plugin will callback to nbdkit for it to send the
response
8) nbdkit loop continues right away on the next request and it reads and
validates the next request from conn->sockin without waiting for the
previous request to complete
*) Now requires an additional mutex on the conn->sockout for writing
responses

The benefit of this approach is that 2 threads (1 thread for reading
requests from the socket and kicking off requests to the plugin and 1
thread (or more) in a worker pool executing the async handler callbacks)
can service 100s of slow nbd requests in parallel overcoming high latency.

The current design with synchronous callbacks we can provide async in our
plugin layer for pwrites and implement our flush to enforce it but we can't
get around a single slow high latency read op blocking the entire pipe.

I'm willing to work on this in a branch and push this up as opensource but
first wanted to check if this functionality extension is in fact something
redhat would want for nbdkit and if so if there were suggestions to the
implementation.

Initial implementation approach was going to be similar to the
fuse_low_level approach and create an entirely separate header file for the
asynchronous plugin api because the plugin calls now need an additional
parameter (opaque ptr to handle for nbdkit_reply_). This header file
nbdkit_plugin_async.h defines the plugin struct with slightly different
function ptr prototypes that  accepts the opaque ptr to nbd request handle
and some additional callback functions nbdkit_reply_error, nbdkit_reply,
and nbdkit_reply_read. The user of this plugin interface is required to
call either nbdkit_reply_error or nbdkit_reply[_read] in each of the
pread/pwrite/flush/trim/zero ops.

If you got this far thank you for the long read and please let me know if
there is any interest.
Thank you,
 - Shaun McDowell

----- End forwarded message -----

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-builder quickly builds VMs from scratch
http://libguestfs.org/virt-builder.1.html




More information about the Libguestfs mailing list