[libvirt] [PATCH 1/8] Add virRandom() API to generate numbers with non-power-of-2 limit

Fri Aug 10 15:31:11 UTC 2012

On Fri, Aug 10, 2012 at 08:58:04AM -0600, Eric Blake wrote:
> On 08/10/2012 07:47 AM, Daniel P. Berrange wrote:
> > From: "Daniel P. Berrange" <berrange at redhat.com>
> > 
> > The current virRandomBits() API is only usable if the caller wants
> > a random number in the range [0, (n-1)] where n is a power of two.
> > This adds a virRandom() API which works for upper limits which are
> > not a power of two. It works by using virRandomBits() to generate
> > a number to the next nearest power of 2 limit, and then scales it
> > down.
> 
> I like the idea, but NAK to this version of the patch.  That algorithm
> is not uniform.
> 
> > +/**
> > + * virRandom:
> > + * @max: Upper bound on random number (not inclusive)
> > + *
> > + * Generate an evenly distributed random number between [0,max-1]
> > + * If @max is a power of 2, then use virRandomBits instead
> > + *
> > + * Return: a random number with @nbits entropy
> > + */
> > +uint64_t virRandom(unsigned long long max)
> > +{
> > +    int bits = 0;
> > +    unsigned long long tmp = max - 1;
> > +    uint64_t ret;
> > +
> > +    while (tmp) {
> > +        tmp >>= 1;
> > +        bits++;
> > +    }
> 
> Slow.  I would suggest using gcc's __builtin_clz (the way qemu does),
> except that gnulib doesn't yet have a module exposing it.  I guess that
> means I'd better write a quick gnulib module.  Once you have clz(), this
> is much simpler as:
> 
> assert(max);
> bits = 64 - clz(max);
> 
> > +
> > +    ret = virRandomBits(bits);
> > +
> > +    if ((1 << bits) != max) {
> > +        double d = ret;
> > +        d *= max;
> > +        d /= (1 << bits);
> > +        ret = (uint64_t)d;
> > +    }
> 
> Wrong.  Consider when max == 3 (i.e. generate a random choice between 0,
> 1, or 2).  bits is 2, so d starts as 0, 1, 2, or 3, each with 25%
> probability.  Then after this computation:
> 0 -> 0*3 / 4.0 -> 0
> 1 -> 1*3 / 4.0 -> 0
> 2 -> 2*3 / 4.0 -> 1
> 3 -> 3*3 / 4.0 -> 2
> 
> that is, you have a 50% chance of getting 0, but only a 25% chance of
> getting 1 or 2 - that is not a uniform distribution.  The ONLY way to do
> this is:
> 
> do {
>     ret = virRandomBits(bits);
> } while (ret >= max);
> 
> Then, with the same input of max == 3, you have a 25% chance of calling
> virRandomBits at least twice, a 12.5% chance of calling it at least
> three times, and so on, but you are guaranteed that you will eventually
> get a result that is uniformly distributed.

Hmm, it was the looping that I used originally, but I wanted to get
something that had a deterministic upper bound, instead of just a
probablistic bound.

The alternative I wanted to use was  drand48_t() which returns a double
in the range 0.0 -> 1.0, with 48 bits of entropy. You could multiply by
'max', and cast to int. But this isn't portable and is not fixed by
GNULIB.

I'm wondering if we could emulate drand48_t() ourselves by doing
something like this:

  #include <ieee754.h>

  double virRandom(void) {
    union ieee754_double val;

    val.ieee.negative = 0;
    val.ieee.exponent = IEEE754_DOUBLE_BIAS;
    val.ieee.mantissa0 = virRandomBits(20);
    val.ieee.mantissa0 = virRandomBits(32);

    return val.d - 1.0;
  }

  int virRandomInt(int max) {
     return virRandom() * max;
  }

Guess it depends whether ieee754.h is portable ? Or can we in fact just
define that union ourselves ?

  union ieee754_double
  {
    double d;

    /* This is the IEEE 754 double-precision format.  */
    struct
      {
#if     __BYTE_ORDER == __BIG_ENDIAN
        unsigned int negative:1;
        unsigned int exponent:11;
        /* Together these comprise the mantissa.  */
        unsigned int mantissa0:20;
        unsigned int mantissa1:32;
#endif                          /* Big endian.  */
#if     __BYTE_ORDER == __LITTLE_ENDIAN
# if    __FLOAT_WORD_ORDER == __BIG_ENDIAN
        unsigned int mantissa0:20;
        unsigned int exponent:11;
        unsigned int negative:1;
        unsigned int mantissa1:32;
# else
        /* Together these comprise the mantissa.  */
        unsigned int mantissa1:32;
        unsigned int mantissa0:20;
        unsigned int exponent:11;
        unsigned int negative:1;
# endif
#endif                          /* Little endian.  */
      } ieee;

 };

Regards,
Daniel
-- 
|: http://berrange.com      -o-    http://www.flickr.com/photos/dberrange/ :|
|: http://libvirt.org              -o-             http://virt-manager.org :|
|: http://autobuild.org       -o-         http://search.cpan.org/~danberr/ :|
|: http://entangle-photo.org       -o-       http://live.gnome.org/gtk-vnc :|