[libvirt] [PATCH] Replace hashing algorithm with murmurhash

Eric Blake eblake at redhat.com
Wed Jan 25 16:55:25 UTC 2012


On 01/25/2012 09:38 AM, Daniel P. Berrange wrote:
>>> -static unsigned long virCgroupPidCode(const void *name)
>>> +static int32_t virCgroupPidCode(const void *name, int32_t seed)
>>>  {
>>> -    return (unsigned long)name;
>>> +    unsigned long pid = (unsigned long)name;
>>> +    return virHashCodeGen(&pid, sizeof(pid), seed);
>>
>> I'm not sure if it matters, but you could shorten these two lines to
>> just one:
>>
>> return virHashCodeGen(&name, sizeof(unsigned long), seed);
> 
> I actually preferred the slightly more verbose style, to make
> it clearer that we're actually passing in the PID as an int,
> cast to a pointer, instead of an int pointer.

And given the recent thread about mingw32, this is wrong anyways.
There, 'unsigned long' is 32 bits, but pid_t is 64 bits, so we'd be
losing information in our hash.  We probably need to fix this code to be
passing (void *)&pid_t and dereferencing the pointer to get to a full
pid, rather than cheating with (void*)(unsigned long)pid_t and possibly
losing bits (although since this is in a context of hashing, lost bits
only means more chance for collision, and not a fatal algorithmic flaw).

>>
>> where existing calls to virRandom(256) will now be virRandomBits(8),
>> virRandom(1024*1024*1024) will now be virRandomBits(30), and your code
>> would use uint32_t seed = virRandom(32).

Of course, here I meant virRandomBits(32),

> 
> Yep, I've done that.

but I'm sure you figured that out.

>>> +
>>> +    return le32toh(r);
>>
>> <endian.h>, and thus le32toh(), is not yet standardized (although POSIX
>> will be adding it in the future), nor is it currently provided by
>> gnulib.  We'd have to get that fixed first.
> 
> The le32toh call was only here because the code I copied wanted to be
> endian neutral. I don't think libvirt really cares if its hash codes
> are endian neutral, so I trivially just removed the le32toh call and
> avoid the problem of endian.h

Agreed - we aren't sharing hash values over the wire, so all hash values
within a particular libvirtd process will be the same endianness,
without having to waste time on swapping bytes around.

-- 
Eric Blake   eblake at redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 620 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20120125/b64a4b4a/attachment-0001.sig>


More information about the libvir-list mailing list