[Libguestfs] Extending nbdkit to support listing exports

Tue Jul 21 14:04:04 UTC 2020

On 7/21/20 7:46 AM, Richard W.M. Jones wrote:
> Hi Eric, Nir.
> 
> $SUBJECT - it's complicated!  Imagine a scenario where we have
> extended the file plugin so you can point it at a directory and it
> will use the client-supplied export name to select a file to serve.
> Also we have extended the tar filter so that you can map tar file
> components into export names.  You may be able to see where this is
> going ...
> 
> Now we point the file plugin at a directory of tar files, and layer
> the tar filter on top, and then what happens?  The client can only
> supply a single export name, so how can it specify component X of tar
> file Y?
> 
>    tar filter       --->   file plugin
>    tar-exportname=on       exportname=on
>                            directory=/my-files ---> directory of tar files
> 
> It gets worse!  At the moment filters and plugins can read the export
> name sent by the client by calling the global function
> nbdkit_export_name().  I'm not so worried about filters, but plugins
> must be able to continue to use this global function.  In the scenario
> above both the tar filter and the file plugin would see the same
> string (if we did nothing), and it would only make sense to one or the
> other but never to both.
> 
> Listing exports is also complicated: Should we list only the nearest
> filter's exports?  Or the plugin's exports?  Cartesian product of both
> somehow?

Definitely something for filters to decide, and not the core engine. 
But it does sound like we can come up with quite a bit of flexibility, 
and the idea of a Cartesian product (a single export name can tell both 
the tar filter which component to extract and the file plugin which tar 
file to serve) is indeed tempting.

> 
> 
> That's the background, but how do we solve it?
> ----------------------------------------------
> 
> I'd like to say first of all that I believe export names are already a
> dumpster fire of potential insecurity, and that we shouldn't have the
> core server attempt to parse them, so solutions involving splitting
> the export names into components (like tar-component/file-name) should
> not be considered because they are too risky, and run into arbitrary
> limits (what escape character to use? what is the max length?)  Also
> changing the NBD protocol is not under consideration.

Not directly, but maybe it is worth adding optional support to the NBD 
protocol to make it possible for NBD_OPT_LIST to support a hierarchical 
return.  That is, pointing the file plugin to serve directory=base 
containing the following  layout:
base
  + f1
  + dir1
    + f2
    + f3
  + dir2
    + f4
the current semantics of NBD_OPT_LIST only lend themselves to returning 
4 NBD_REP_SERVER replies in a row: "f1", "dir1/f2", "dir1/f3", 
"dir2/f4".  But what if the protocol added NBD_OPT_LIST_HIER which took 
a starting point argument, and new replies NBD_REP_SERVER_DIR and 
NBD_REP_ERR_NOTDIR, such that we could then have:

NBD_OPT_LIST_HIER("") => SERVER("f1"), SERVER_DIR("dir1/"), 
SERVER_DIR("dir2/")
NBD_OPT_LIST_HIER("dir1/") => SERVER("dir1/f2"), SERVER("dir1/f3")
NBD_OPT_LIST_HIER("dir2/") => SERVER("dir2/f4")
NBD_OPT_LIST_HIER("f1") => NBD_REP_ERR_NOTDIR
NBD_OPT_LIST_HIER("nosuch") => NBD_REP_ERR_UNKNOWN

True, this hierarchy is only available to new client/server pairs (if 
either side lacks the new code, you're back to flat listings), but it is 
food for thought.  I'll propose something on the upstream NBD list if we 
think it has potential.

> 
> I think we need to consider these aspects separately:
> 
> (a) How filters and plugins get the client exportname.
> 
> (b) How filters and plugins respond when the client issues NBD_OPT_LIST.
> 
> (c) How the server, filters and plugins respond to NBD_OPT_INFO.

I think (c) is already handled - they respond the same as they do to 
NBD_OPT_GO, but rather than starting to serve traffic, they go back to 
waiting for the next NBD_OPT_*.

> 
> 
> (a) Client exportname
> ---------------------
> 
> The client sends the export name of the export it wants to access with
> NBD_OPT_EXPORT_NAME or the newer NBD_OPT_GO.  This is an opaque string
> (not a filename, pathname etc).  Currently filters and plugins can
> fetch this using nbdkit_export_name().  However it cannot be modified
> by filters.
> 
> My proposal is that we extend the .open() callback with an extra
> export_name parameter:
> 
>    void *filter_open (int readonly, const char *export_name);
> 
> For plugins I have already proposed this for inclusion in API V3.  We
> already do this in nbdkit-sh-plugin.  For backwards compatibility with
> existing plugins we'd make nbdkit_export_name() return the export name
> passed down by the last filter in the chain, and deprecate this
> function in V3.

Yeah, for plugins, we can't do anything easy without moving to a v3 
protocol; but making the existing nbdkit_export_name() smarter about 
grabbing the name from the last filter rather than directly from the 
client makes sense for v2 filters.

> 
> For filters the next_open field would take the export_name and this
> would allow filters to modify the export name.

For filters, we have total liberty to change our .open signature right 
now, even without introducing plugin v3.

> 
> nbdkit-tar-filter would take an explicit tar-export-name=<new>
> parameter to pass a different export name down to the underlying
> plugin.  If the tar filter was implementing its own export name
> processing then this parameter would either be required or would
> default to "".  (If the tar filter was not implementing export name
> functionality it would pass it down unchanged).

Makes sense.

> 
> 
> (b) Listing exports with NBD_OPT_LIST
> -------------------------------------
> 
> I believe we need a new .list_exports() callback for filters and
> plugins to handle this case.  I'm not completely clear on the API, but
> it could be something as simple as:
> 
>    const char **list_exports (void);
> 
> returning a NULL-terminated list of strings.

But what is the lifetime on those strings?  Can the list change over 
time (that is, if I'm running 'nbdkit file dir=base', does it rerun 
opendir()/stat()/closedir() to populate a fresh list for each client, or 
does it only populate the list once, regardless of later underlying 
changes in the directory hierarchy it is wrapping)?  Returning const 
char ** means the plugin has to manage the memory (the server will not 
touch anything), so whether the plugin computes the list up-front 
(compute the list during .load, free it during .unload) or per-client 
(compute the list during .preconnect or .list_exports, free during 
.close) should be reasonable.

Thus, I'm assuming the sequence would be:

client connects
   .preconnect
if client calls NBD_OPT_LIST
   .list_exports
if client calls NBD_OPT_INFO or NBD_OPT_GO
   .open
   .get_size
   .can_FOO...
client disconnects
   .close

that is, .preconnect always corresponds to the initial client connection 
before NBD_OPT_ handling, and .open corresponds to the point when the 
client requests a specific export, but .close can gracefully clean up 
whatever per-client results it generated for any client that actually 
asked for a listing.  It also gives us the flexibility that a single 
client can both obtain a listing and choose which export to then 
request, rather than having to do two separate connections (one for the 
listing, the second specifying a specific export).

I was wondering if we should instead have the core server handle memory, 
similar to how we handle .exports (that is, the server allocates a 
container, passes it to the plugin, and the plugin calls 
nbdkit_export_list_add("...") as many times as it wants, where the 
server now frees the list instead of throwing the burden on the plugin). 
  Offhand, I'm thinking either approach could be made to work, so it 
becomes a decision on which is easier to maintain (both from the 
perspective of the core server code, and for an arbitrary plugin writer).

> 
> Is it conceivable we will need additional fields in future?

For now, only one field - a description string.  That's all the more 
that NBD_REP_SERVER supports, but 'qemu-nbd -D' is an example of setting it.

If my idea of adding NBD_OPT_INFO_HIER and NBD_REP_SERVER_DIR goes 
anywhere, that may be an opportunity to pass other information as well.

> 
> Plugins which do not implement this would be assumed to return a
> single export "" (or perhaps the -e parameter??)

Both sound fine.  In turn, what happens if -e is in use when a plugin 
does have .list_exports?  Do we just ignore -e in that case?

> 
> Filters could ignore or replace exports by optionally implementing
> this callback.  I wouldn't recommend that filters modify export names
> however.
> 
> 
> (c) Handling NBD_OPT_INFO
> -------------------------
> 
> I believe our current implementation of this option (which is
> definitely not being robustly tested!) is something like this:
> 
>    - NBD_OPT_INFO received from client (containing the export name)
>    - open a backend handle
>    - query backend for flags
>    - reply to client
>    - close backend handle
>    - process the next option
> 
> Assuming this is actually correct, probably we can make sure that it
> passes the export name to the backend_open, and everything should just
> work.  In fact this would happen as a side effect of fixing (a) above
> so I don't believe there's anything to do here except to have some
> test coverage.

Yep, which is why I stated above that I think we already do (c).

> 
> 
> ----------------------------------------------------------------------
> 
> Thoughts welcomed as always,
> 
> Rich.
> 

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org