[Libguestfs] URI Handling Patch

Pino Toscano ptoscano at redhat.com
Mon Jun 29 15:37:52 UTC 2015


In data venerdì 26 giugno 2015 18:50:16, Gabriel Hartmann ha scritto:
> >
> > > By default, when saving a URI using xmlSaveUri it escapes everything in
> > the URI.  QEMU doesn't want anything escaped, so now I unescape
> > everything after the URI is generated.  Unfortunately there's no flag to
> > change the default behavior.
> >
> > I'm not sure that's the actual issue here, but I'm somehow included to
> > think this is another consequence of the lack of query string handling
> > for http/https URIs.
> >
> > In any case, do you have a simple reproducer for this escaping handling
> > for qemu?
> >
> 
> I don't have a simple repro for qemu, but this is pretty close.  In
> guestfish do this:
> 
> add
> "/vhds/osdiskforconsul0-osdisk.vhd?se=2016-01-01T00%3A00%3A00Z&sp=r&sv=2014-02-14&sr=b&sig=LOlrHXrQeaqlSEP51hRi7E5KDa9lnkqSvLTaZBmTkrQ%3D"
> readonly:true protocol:https server:
> gabhartswarmstorage.blob.core.windows.net
> 
> After you execute 'run', you should see qemu complaining about something
> like a 404 Error and the file not existing.  After the unescape change the
> above will start working.

In this case, you can check in a simplier way:
$ guestfish -x -a '/vhds/etc...'
you'll see a "libguestfs: trace: add_drive ..." line contaning the
actual add_drive call which guestfish will do to the library, and the
parameters passed to it.

Also, I'm thinking we should focus on making sure to have the proper
query string for http(s) URIs to add_drive, we'll deal with the
percent-encoding on the way.

> > > The other problem was that only the "path" portion of the URI struct
> > was being used to indicate the path.  That's natural enough, but that
> > practice is what was dropping the query string.  The query string is kept
> > separately from the path.  I now concat the query string back onto the URI
> > with a separating '?'.
> >
> > I don't think that appending the query string to the path is a good idea.
> > For example, we do a minimum of parsing of the URI, and for some
> > protocols (like the "We may have to adjust the path ..." comment says)
> > we adjust path according to the elements in the query string.
> 
> 
> Are you talking about this comment?
> /* We may have to adjust the path depending on the protocol.  For
>  * example ceph/rbd URIs look like rbd:///pool/disk, but the
>  * exportname expected will be "pool/disk".  Here, uri->path will be
>  * "/pool/disk" so we have to knock off the leading '/' character.
>  */
> 
> This is talking about removing leading the '/' in the path sometimes
> depending on protocol.  Nothing to do with the query string.

Yes, although it was not directly to the specified code snippet. The
advice here is that some protocols don't need a query string in the
path, because parts of it might be handled differently. See also below.

> > Ad additional checking for URIs, we have fish/test-add-uri.sh, which
> > fails with your patch. You might want to also add additional checks
> > with for query string http/https URIs.
> >
> 
> The test is now failing on this case:
> + guestfish -x -a 'nbd:///export?socket=/sk'
> + grep -sq 'add_drive *"/export"* "protocol:nbd" "server:unix:/sk"'
> test-add-uri.out
> 
> because I am now appending the query string as in:
> add_drive* "/export?socket=/sk"* "protocol:nbd" "server:unix:/sk"

It didn't expect a query string before, and it must not take it now,
as the parameters are not related to the actual path, but they are
connection parameters.

> In which cases should the query string be appended and in what cases should
> it be dropped?  I believe in the http and https case it should be
> appended.  I'm not familiar with the nbd protocol.  Is dropping the query
> string always the right thing to do for this protocol?  Other protocols?

Surely http(s) drives need to have the query string that was specified
in guestfish URIs, just like the add_drive API need to properly deal
with such extra parameter, and manage them depending on the protocol.

We have few possibilities at this point, in order of complexity:

(a) split the query string into an "hash table" (i.e. a C list of
key/value items), pass that as new parameter to add_drive, and rebuild
a query string depending on the backend (direct/qemu, libvirt, etc) and
on the protocol (http, nbd, etc)

(b) pass the query string as-is (percent-encoded or not) to add_drive,
and add it back depending on the protocol

(c) join the query string to the path, depending on the protocol, in
guestfish

Your solution so far is (c), while IMHO a better one would be (b) or
even (a).

Also, something more to check is how e.g. libvirt needs such kind of
URI parameter after the path, and how. You can check whether your
change also work with libvirt by exporting LIBGUESTFS_BACKEND=libvirt
(or also using "set-backend libvirt" in guestfish), provided you have
built libguestfs with libvirt support.

Thanks,
-- 
Pino Toscano




More information about the Libguestfs mailing list