[libvirt] [PATCH 1/8] Add virRandom() API to generate numbers with non-power-of-2 limit

Fri Aug 10 16:06:39 UTC 2012

On 08/10/2012 09:31 AM, Daniel P. Berrange wrote:
> On Fri, Aug 10, 2012 at 08:58:04AM -0600, Eric Blake wrote:
>> On 08/10/2012 07:47 AM, Daniel P. Berrange wrote:
>>> From: "Daniel P. Berrange" <berrange at redhat.com>
>>>
>>> The current virRandomBits() API is only usable if the caller wants
>>> a random number in the range [0, (n-1)] where n is a power of two.
>>> This adds a virRandom() API which works for upper limits which are
>>> not a power of two. It works by using virRandomBits() to generate
>>> a number to the next nearest power of 2 limit, and then scales it
>>> down.
>>
>> I like the idea, but NAK to this version of the patch.  That algorithm
>> is not uniform.
>>

>>> +    while (tmp) {
>>> +        tmp >>= 1;
>>> +        bits++;
>>> +    }
>>
>> Slow.  I would suggest using gcc's __builtin_clz (the way qemu does),
>> except that gnulib doesn't yet have a module exposing it.  I guess that
>> means I'd better write a quick gnulib module.  Once you have clz(), this
>> is much simpler as:
>>
>> assert(max);
>> bits = 64 - clz(max);

This part is independently useful, while we still discuss the algorithm;
I'll have the gnulib module complete later today.

>> that is, you have a 50% chance of getting 0, but only a 25% chance of
>> getting 1 or 2 - that is not a uniform distribution.  The ONLY way to do
>> this is:
>>
>> do {
>>     ret = virRandomBits(bits);
>> } while (ret >= max);
>>
>> Then, with the same input of max == 3, you have a 25% chance of calling
>> virRandomBits at least twice, a 12.5% chance of calling it at least
>> three times, and so on, but you are guaranteed that you will eventually
>> get a result that is uniformly distributed.
> 
> Hmm, it was the looping that I used originally, but I wanted to get
> something that had a deterministic upper bound, instead of just a
> probablistic bound.

Oh, I see your point - you don't want to drag on for a long time on the
rare occasions where probabilities are against you.

> 
> The alternative I wanted to use was  drand48_t() which returns a double
> in the range 0.0 -> 1.0, with 48 bits of entropy. You could multiply by
> 'max', and cast to int. But this isn't portable and is not fixed by
> GNULIB.

Might not be too hard to make drand48_r into a gnulib module (harder
than clz, but not intractible, so I'll take a shot at it, but no
promises).  Using drand48_r does still hit non-uniformity due to
rounding, but with much smaller fuzz factor (that is, instead of my
example of 50/25/25, you'd be more like 25.000004, 24.999993,
24.999993), which I can live with as being in the noise.  Furthermore,
since it only gives 48 bits of entropy, if 'max' is larger than 1<<47
you'll have to do some legwork yourself (I'm not quite sure what that
legwork involves to still be fair), so I'd feel more comfortable if we
capped this function to uint32_t and require the use of powers-of-2 for
anything larger than a 32-bit max.

> I'm wondering if we could emulate drand48_t() ourselves by doing
> something like this:
> 
>   #include <ieee754.h>
> 
>   double virRandom(void) {
>     union ieee754_double val;
> 
>     val.ieee.negative = 0;
>     val.ieee.exponent = IEEE754_DOUBLE_BIAS;
>     val.ieee.mantissa0 = virRandomBits(20);
>     val.ieee.mantissa0 = virRandomBits(32);
> 
>     return val.d - 1.0;
>   }

No need to do that.  -lm already gives us a very nice function: ldexp().
 Basically, you take an integer in the range [0,1<<48), then multiply
that by 2**-48 using ldexp().

-- 
Eric Blake   eblake at redhat.com    +1-919-301-3266
Libvirt virtualization library http://libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 620 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20120810/812b5220/attachment-0001.sig>