[Libguestfs] [PATCH v2 nbdkit] common: Improve pseudo-random number generation.

Eric Blake eblake at redhat.com
Mon Dec 31 15:35:42 UTC 2018


On 12/28/18 2:55 PM, Richard W.M. Jones wrote:
> Currently we use non-cryptographically secure random numbers in two
> places, the error filter (to inject errors at random) and the random
> plugin.  For this we have used either random_r or a home-brew-ish
> Linear Congruential Generator.  Use of random_r is problematic on BSDs
> because it doesn't exist there.  Use of the LCG is simply a bad
> choice.
> 
> Replace both uses with a better quality and faster PRNG from David
> Blackman and Sebastiano Vigna called ‘xoshiro256** 1.0’
> (http://xoshiro.di.unimi.it/).  This is released into the public
> domain (where possible) so it compatible with the licensing of nbdkit.
> 
> This also fixes a bug in the random plugin where it could never
> generate the byte 255 because I used modulo 255 instead of modulo 256
> arithmetic.  Ooops.
> 
> This also adds a simple sniff test that the random plugin is producing
> something which at least looks like random data.
> ---

> +#include <stdint.h>
> +
> +/* Generate pseudo-random numbers, quickly, with explicit state.
> + *
> + * This is based on the xoshiro/xoroshiro generators by David Blackman
> + * and Sebastiano Vigna at http://xoshiro.di.unimi.it/ .  Specifically
> + * this is ‘xoshiro256** 1.0’.
> + *
> + * This does _NOT_ generate cryptographically secure random numbers
> + * (CSPRNG) and so should not be used when cryptography or security is
> + * required - use gcrypt if you need those.
> + */
> +
> +/* You can seed ‘struct random_state’ by setting the s[] elements
> + * directly - but not you must NOT set it all to zero.  OR if you have

s/not you/note you/

> + * a 64 bit seed, you can use xsrandom below to initialize the state.
> + */
> +struct random_state {
> +  uint64_t s[4];
> +};
> +
> +static inline uint64_t
> +snext (uint64_t *seed)
> +{
> +  uint64_t z = (*seed += 0x9e3779b97f4a7c15);
> +  z = (z ^ (z >> 30)) * 0xbf58476d1ce4e5b9;
> +  z = (z ^ (z >> 27)) * 0x94d049bb133111eb;
> +  return z ^ (z >> 31);
> +}

Okay, that matches http://xoshiro.di.unimi.it/splitmix64.c.  And even
though there is one particular value of seed which can result in snext()
returning 0 (namely 0x61c8864680b583eb), the fact that xsrandom() calls
snext() four times means that random_state is guaranteed to have at
least three non-zero values, which was a documented prerequisite of the
seeding.

> +/* Returns 64 random bits.  Updates the state. */
> +static inline uint64_t
> +xrandom (struct random_state *state)
> +{
> +  const uint64_t result_starstar = rotl (state->s[1] * 5, 7) * 9;
> +  const uint64_t t = state->s[1] << 17;
> +
> +  state->s[2] ^= state->s[0];
> +  state->s[3] ^= state->s[1];
> +  state->s[1] ^= state->s[2];
> +  state->s[0] ^= state->s[3];
> +
> +  state->s[2] ^= t;
> +
> +  state->s[3] = rotl (state->s[3], 45);
> +
> +  return result_starstar;

and that matches http://xoshiro.di.unimi.it/xoshiro256starstar.c

Looks like a reasonable copy of the PRNG implementation, and it is nice
that the reference site includes enough information to justify this
choice of a PRNG.

> +++ b/plugins/random/random.c

>  /* Seed. */
>  static uint32_t seed;
>  

Interesting that you kept this as a 32-bit number; I don't see it as
being a real drawback, though, especially since...

> @@ -135,13 +117,27 @@ random_pread (void *handle, void *buf, uint32_t count, uint64_t offset,
>  {
>    uint32_t i;
>    unsigned char *b = buf;
> -  uint32_t s;
> +  uint64_t s;
>  
>    for (i = 0; i < count; ++i) {
> -    s = LCG (seed) + offset + i;
> -    s = LCG (s);
> -    s = LCG (s);
> -    s = s % 255;

Oops indeed - %255 and &255 are quite different operations :)

> +    /* We use nbdkit common/include/random.h to make random numbers.
> +     *
> +     * However we're not quite using it in the ordinary way.  In order
> +     * to be able to read any byte of data without needing to run the
> +     * PRNG from the start, the random data is computed from the index
> +     * and seed through three rounds of PRNG:
> +     *
> +     * index i     PRNG(seed+i)   -> PRNG -> PRNG -> mod 256 -> b[i]
> +     * index i+1   PRNG(seed+i+1) -> PRNG -> PRNG -> mod 256 -> b[i+1]
> +     * etc
> +     */
> +    struct random_state state;
> +
> +    xsrandom (seed + offset + i, &state);

...you are adding in 64-bit offset for every re-seed.  And my above
analysis of the seeding function shows that there is no magic seed value
which can result in random_state being all zeros.  It's also nice that
you keep explicit state here and separately in error.c, so that using
the filter and the plugin together does not cause contention where the
repeated reseeding from the plugin could negatively affect the error rates.


> +++ b/tests/test-random.c

> +   * complete statistical study.
> +   */
> +  for (i = 0; i < SIZE; ++i) {
> +    unsigned char c = (unsigned char) data[i];
> +    histogram[c]++;
> +  }
> +  for (i = 0; i < 256; ++i) {
> +    const unsigned expected = SIZE / 256;
> +    if (histogram[i] < 80 * expected / 100) {
> +      fprintf (stderr, "test-random: "
> +               "random data is not uniformly distributed\n"
> +               "eg. byte %d occurs %u times (expected about %u times)\n",
> +               i, histogram[i], expected);

Sane enough for a first-order analysis.

The typo fix is trivial, then this should be good to go.

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3226
Virtualization:  qemu.org | libvirt.org

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20181231/698dad08/attachment.sig>


More information about the Libguestfs mailing list