[Libguestfs] [hivex PATCH] lib: write: improve key collation compatibility with Windows

Laszlo Ersek lersek at redhat.com
Mon Sep 13 18:50:48 UTC 2021


On 09/10/21 09:48, Richard W.M. Jones wrote:
> On Fri, Sep 10, 2021 at 01:06:17AM +0200, Laszlo Ersek wrote:
>> There are multiple problems with using strcasecmp() for ordering registry
>> keys:
>>
>> (1) strcasecmp() is influenced by LC_CTYPE.
>>
>> (2) strcasecmp() cannot implement case conversion for multibyte UTF-8
>>     sequences.
>>
>> (3) Even with LC_CTYPE=POSIX and key names consisting solely of ASCII
>>     characters, strcasecmp() converts characters to lowercase for the
>>     comparison. But on Windows, the CompareStringOrdinal() function
>>     converts characters to uppercase. This makes a difference when
>>     comparing a letter to one of the characters that fall between 'Z'
>>     (0x5A) and 'a' (0x61), namely {'[', '\\', ']', '^', '_', '`'}. For
>>     example,
>>
>>       'c' (0x63) > '_' (0x5F)
>>       'C' (0x43) < '_' (0x5F)
>>
>> Compare key names byte for byte, upper-casing ASCII letters first. This
>> eliminates problems (1) and (3), but not (2).
>>
>> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1648520
>> Signed-off-by: Laszlo Ersek <lersek at redhat.com>
>> ---
>>  lib/write.c | 32 +++++++++++++++++++++++++++++++-
>>  1 file changed, 31 insertions(+), 1 deletion(-)
>>
>> diff --git a/lib/write.c b/lib/write.c
>> index 70105c9d9907..d9a13a3c18b6 100644
>> --- a/lib/write.c
>> +++ b/lib/write.c
>> @@ -462,7 +462,37 @@ compare_name_with_nk_name (hive_h *h, const char *name, hive_node_h nk_offs)
>>      return 0;
>>    }
>>  
>> -  int r = strcasecmp (name, nname);
>> +  /* Perform a limited case-insensitive comparison. ASCII letters will be
>> +   * *upper-cased*. Multibyte sequences will produce nonsensical orderings.
>> +   */
>> +  int r = 0;
>> +  const char *s1 = name;
>> +  const char *s2 = nname;
>> +
>> +  for (;;) {
>> +    unsigned char c1 = *(s1++);
>> +    unsigned char c2 = *(s2++);
>> +
>> +    if (c1 >= 'a' && c1 <= 'z')
>> +      c1 = 'A' + (c1 - 'a');
>> +    if (c2 >= 'a' && c2 <= 'z')
>> +      c2 = 'A' + (c2 - 'a');
>> +    if (c1 < c2) {
>> +      /* Also covers the case when "name" is a prefix of "nname". */
>> +      r = -1;
>> +      break;
>> +    }
>> +    if (c1 > c2) {
>> +      /* Also covers the case when "nname" is a prefix of "name". */
>> +      r = 1;
>> +      break;
>> +    }
>> +    if (c1 == '\0') {
>> +      /* Both strings end. */
>> +      break;
>> +    }
>> +  }
>> +
>>    free (nname);
>>  
>>    return r;
> 
> Thanks for the detailed analysis on the BZ.
> 
> ACK - since it's an incremental improvement over what we have now and
> fixes the bug.

Thank you! Commit d5a522c0bb73.

> There may be registries with multibyte keys (nothing surprises me
> about the Windows registry), but as this only affects the ability to
> insert new keys into a node that has such keys, the problem that we
> don't handle those is limited in practice.

Laszlo
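
For illustration only, problem (3) from the commit message can be reproduced
outside of hivex with a minimal standalone sketch. All names below
(fold_upper, fold_lower, compare_folded, compare_upper, compare_lower,
self_check) are hypothetical and are not part of lib/write.c; the loop merely
mirrors the structure of the patch. compare_lower stands in for what
strcasecmp() does with ASCII input in the POSIX locale.

```c
#include <assert.h>

/* Fold an ASCII lowercase letter to uppercase; leave other bytes alone. */
static unsigned char fold_upper (unsigned char c)
{
  return (c >= 'a' && c <= 'z') ? (unsigned char) ('A' + (c - 'a')) : c;
}

/* Fold an ASCII uppercase letter to lowercase; leave other bytes alone. */
static unsigned char fold_lower (unsigned char c)
{
  return (c >= 'A' && c <= 'Z') ? (unsigned char) ('a' + (c - 'A')) : c;
}

/* Hypothetical helper: compare two NUL-terminated strings byte by byte
 * after folding each byte with "fold". Returns -1, 0 or 1.
 */
static int compare_folded (const char *s1, const char *s2,
                           unsigned char (*fold) (unsigned char))
{
  for (;;) {
    unsigned char c1 = fold ((unsigned char) *s1++);
    unsigned char c2 = fold ((unsigned char) *s2++);

    if (c1 < c2)
      return -1;            /* also covers "s1 is a prefix of s2" */
    if (c1 > c2)
      return 1;             /* also covers "s2 is a prefix of s1" */
    if (c1 == '\0')
      return 0;             /* both strings ended together */
  }
}

/* Upper-casing comparison, in the spirit of the patch. */
int compare_upper (const char *a, const char *b)
{
  return compare_folded (a, b, fold_upper);
}

/* Lower-casing comparison, modelling ASCII-only strcasecmp(). */
int compare_lower (const char *a, const char *b)
{
  return compare_folded (a, b, fold_lower);
}

/* The two foldings disagree on a letter versus '_' (0x5F). */
void self_check (void)
{
  assert (compare_upper ("c", "_") < 0);   /* 'C' (0x43) < '_' (0x5F) */
  assert (compare_lower ("c", "_") > 0);   /* 'c' (0x63) > '_' (0x5F) */
  assert (compare_upper ("abc", "ABC") == 0);
}
```

The same disagreement shows up for any of the six characters between 'Z' and
'a', which is why the direction of the case folding matters for the ordering
even with pure ASCII key names.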



