[PATCH 01/10] viruri: Search params case insensitively

Michal Prívozník mprivozn at redhat.com
Tue Feb 7 07:33:03 UTC 2023


On 2/6/23 16:02, Peter Krempa wrote:
> On Mon, Feb 06, 2023 at 10:16:49 +0100, Michal Privoznik wrote:
>> Our URI handling code (doRemoteOpen() specifically), uses case
>> insensitive parsing of query part of URI. For instance:
>>
>>   qemu:///system?socket=/some/path
>>   qemu:///system?SoCkEt=/some/path
>>
>> are the same URI. Even though the latter is probably not used
>> anywhere, let's switch to STRCASEEQ() instead of STREQ() at two
>> places: virURIGetParam() and virURICheckUnixSocket().
> 
> rfc3986 (Uniform Resource Identifier (URI): Generic Syntax) states:
> 
> 6.2.2.1.  Case Normalization
> 
>    For all URIs, the hexadecimal digits within a percent-encoding
>    triplet (e.g., "%3a" versus "%3A") are case-insensitive and therefore
>    should be normalized to use uppercase letters for the digits A-F.
> 
>    When a URI uses components of the generic syntax, the component
>    syntax equivalence rules always apply; namely, that the scheme and
>    host are case-insensitive and therefore should be normalized to
>    lowercase.  For example, the URI <HTTP://www.EXAMPLE.com/> is
>    equivalent to <http://www.example.com/>.  The other generic syntax
>    components are assumed to be case-sensitive unless specifically
>    defined otherwise by the scheme (see Section 6.2.3).
> 
> Thus in our scheme we can assume that the query parameters are indeed
> case in-sensitive, but others such as HTTP may disagree:
> 
> rfc7230 (Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing)
> 
> 2.7.3.  http and https URI Normalization and Comparison
> 
>    Since the "http" and "https" schemes conform to the URI generic
>    syntax, such URIs are normalized and compared according to the
>    algorithm defined in Section 6 of [RFC3986], using the defaults
>    described above for each scheme.
> 
>    If the port is equal to the default port for a scheme, the normal
>    form is to omit the port subcomponent.  When not being used in
>    absolute form as the request target of an OPTIONS request, an empty
>    path component is equivalent to an absolute path of "/", so the
>    normal form is to provide a path of "/" instead.  The scheme and host
>    are case-insensitive and normally provided in lowercase; all other
>    components are compared in a case-sensitive manner.  Characters other
> 
>                                 ^^^^^^^^^^^^^^^^^^^^^
> 
>    than those in the "reserved" set are equivalent to their
>    percent-encoded octets: the normal form is to not encode them (see
>    Sections 2.1 and 2.2 of [RFC3986]).
> 
> 
> The functions you are modifying are used (for now) exclusively for
> handling of the URIs based on libvirt-defined schemes, so this does not
> create problems for current usage.
> 
> Nevertheless the code is placed in a common helper module I suggest you
> add a comment describing that it's case sensitive by design.
you mean case IN-sensitive, right?

For HTTP - I don't think we ever parse that.
But for me, the deciding factor to do this change was consistency. We
already have case insensitive lookup, so maybe there's somebody using
that (although, we don't really document that, so it's not written in
stone, yet).

> 
> 
>>
>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>> ---
>>  src/util/viruri.c | 4 ++--
>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/src/util/viruri.c b/src/util/viruri.c
>> index 88bb0cc1f8..53f85ed705 100644
>> --- a/src/util/viruri.c
>> +++ b/src/util/viruri.c
>> @@ -371,7 +371,7 @@ virURIGetParam(virURI *uri, const char *name)
>>      size_t i;
>>  
>>      for (i = 0; i < uri->paramsCount; i++) {
>> -        if (STREQ(uri->params[i].name, name))
>> +        if (STRCASEEQ(uri->params[i].name, name))
>>              return uri->params[i].value;
>>      }
>>  
>> @@ -403,7 +403,7 @@ virURICheckUnixSocket(virURI *uri)
>>          return false;
>>  
>>      for (i = 0; i < uri->paramsCount; i++) {
>> -        if (STREQ(uri->params[i].name, "socket"))
>> +        if (STRCASEEQ(uri->params[i].name, "socket"))
>>              return true;
>>      }
> 
> 
> If you add a warning into the comment:

Fair enough. Consider this squashed in:

diff --git i/src/util/viruri.c w/src/util/viruri.c
index 53f85ed705..4afdd30c59 100644
--- i/src/util/viruri.c
+++ w/src/util/viruri.c
@@ -365,6 +365,17 @@ virURIResolveAlias(virConf *conf, const char *alias, char **uri)
 }
 
 
+/**
+ * virURIGetParam:
+ * @uri: URI to get parameter from
+ * @name: name of the parameter
+ *
+ * For parsed @uri, find parameter with name @name and return its value. The
+ * string comparison is case insensitive, by design.
+ *
+ * Returns: a value on success, or
+ *          NULL on error (with error reported)
+ */
 const char *
 virURIGetParam(virURI *uri, const char *name)
 {
@@ -389,6 +400,8 @@ virURIGetParam(virURI *uri, const char *name)
  * scenario the socket might be proxied to a remote server even though the URI
  * looks like it is only local.
  *
+ * The "socket" parameter is looked for in case insensitive manner, by design.
+ *
  * Returns: true if the URI might be proxied to a remote server
  */
 bool


> 
> Reviewed-by: Peter Krempa <pkrempa at redhat.com>

Thanks.

> 
> P.S: virURICheckUnixSocket really doesn't look like a helper that should
> reside in the generic URI module.
> 

I can see where you're coming from, but since this function is used in
two distinct modules (src/libvirt-domain.c and
src/remote/remote_driver.c) I can't think of a better place.

Michal



More information about the libvir-list mailing list