[Libguestfs] [PATCH nbdkit] sparse-random: Don't generate random content in blocks by default

Richard W.M. Jones rjones at redhat.com
Tue May 25 16:36:15 UTC 2021


Testing nbdcopy with nbdkit-sparse-random-plugin as a test harness
under perf showed that 34% of the total time was taken calling
read_block() to generate random data within each block (about 18% when
reading and another 16% when writing and verifying).  While we could
probably optimize this a bit, it's all pointless make-work when
testing.  As long as there is some non-zero data in each block it's
still a valid test of nbdcopy.

Therefore add a new flag (random-content=true|false) to enable random
content inside each block.  The new default is random-content=false
which means each block has the same random non-zero byte repeated
across the whole block, which is fast.  To get the old behaviour use
random-content=true.

Total time spent running read_block() went from 34% down to 7%.
Total time spent running nbdkit went from 58% down to 37%.
---
 .../nbdkit-sparse-random-plugin.pod           | 19 +++++++++++--
 plugins/sparse-random/sparse-random.c         | 28 ++++++++++++++-----
 2 files changed, 37 insertions(+), 10 deletions(-)
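
For readers without the nbdkit tree to hand, here is a minimal standalone
sketch of the random-content=false idea: each data block is one repeated
non-zero byte derived from seed + offset, so filling is a single memset
and checking is a single pass comparing against one expected byte.
Assumptions: mix64() is a hypothetical stand-in generator (a
splitmix64-style mixer), not nbdkit's xsrandom()/xrandom(); the helper
names fill_byte(), fill_block() and check_block() are illustrative only;
and BLOCKSIZE is taken to be 4096, following the "4K block size" comment
in the plugin source.

```c
#include <stdint.h>
#include <string.h>

#define BLOCKSIZE 4096  /* assumed, per the "4K block size" comment */

/* Hypothetical stand-in for the plugin's PRNG (NOT xsrandom/xrandom):
 * a splitmix64-style mixer that maps a seed to a scrambled 64 bit value.
 */
static uint64_t
mix64 (uint64_t x)
{
  x += UINT64_C(0x9e3779b97f4a7c15);
  x = (x ^ (x >> 30)) * UINT64_C(0xbf58476d1ce4e5b9);
  x = (x ^ (x >> 27)) * UINT64_C(0x94d049bb133111eb);
  return x ^ (x >> 31);
}

/* Derive the fill byte for the block starting at offset.  The result
 * is repeatable for the same (seed, offset) and is never zero, so a
 * data block can never be mistaken for a hole.
 */
static unsigned char
fill_byte (uint32_t seed, uint64_t offset)
{
  uint64_t s = mix64 (seed + offset) & 255;

  return s == 0 ? 1 : (unsigned char) s;
}

/* Fill one whole data block with the repeated byte. */
void
fill_block (uint32_t seed, uint64_t offset, void *buf)
{
  memset (buf, fill_byte (seed, offset), BLOCKSIZE);
}

/* Return 1 if every byte of the block equals the expected fill byte,
 * otherwise 0.
 */
int
check_block (uint32_t seed, uint64_t offset, const void *buf)
{
  const unsigned char *b = buf;
  unsigned char expected = fill_byte (seed, offset);
  size_t i;

  for (i = 0; i < BLOCKSIZE; ++i)
    if (b[i] != expected)
      return 0;
  return 1;
}
```

The sketch calls the generator once per block rather than once per byte,
which is where the saving described above comes from.  It is not the
plugin code; the actual change is in the diff that follows.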

diff --git a/plugins/sparse-random/nbdkit-sparse-random-plugin.pod b/plugins/sparse-random/nbdkit-sparse-random-plugin.pod
index fff98f3c..12e2798b 100644
--- a/plugins/sparse-random/nbdkit-sparse-random-plugin.pod
+++ b/plugins/sparse-random/nbdkit-sparse-random-plugin.pod
@@ -4,7 +4,9 @@ nbdkit-sparse-random-plugin - make sparse random disks
 
 =head1 SYNOPSIS
 
- nbdkit sparse-random [size=]SIZE [seed=SEED] [percent=N] [runlength=N]
+ nbdkit sparse-random [size=]SIZE [seed=SEED]
+                      [percent=N] [runlength=N]
+                      [random-content=true]
 
 =head1 DESCRIPTION
 
@@ -27,7 +29,12 @@ tries to create runs of data and runs of empty space.  The
 C<runlength> parameter controls the average length of each run of
 random data.
 
-The random data is generated using an I<insecure> method.
+The data in each block normally consists of the same random non-zero
+byte repeated over the whole block.  If you want fully random content
+within each block use C<random-content=true>.  This is not the default
+because earlier testing of this plugin showed that a great deal of
+time was spent generating random content.  The random content is
+generated using a method which is I<not> cryptographically secure.
 
 =head2 Writes and testing copying
 
@@ -51,6 +58,12 @@ data versus sparse empty space.  The default is 10 (10%).
 C<percent=0> will create a completely empty disk and C<percent=100>
 will create a completely full disk.
 
+=item B<random-content=true>
+
+By default a single random non-zero byte is repeated over the whole
+block, which is fast to generate and check.  If you want blocks where
+each byte is random, use this setting.
+
 =item B<runlength=>N
 
 Specify the average length of runs of random data.  This is expressed
@@ -109,4 +122,4 @@ Richard W.M. Jones
 
 =head1 COPYRIGHT
 
-Copyright (C) 2018 Red Hat Inc.
+Copyright (C) 2018-2021 Red Hat Inc.
diff --git a/plugins/sparse-random/sparse-random.c b/plugins/sparse-random/sparse-random.c
index 2512130a..34347dbf 100644
--- a/plugins/sparse-random/sparse-random.c
+++ b/plugins/sparse-random/sparse-random.c
@@ -1,5 +1,5 @@
 /* nbdkit
- * Copyright (C) 2017-2020 Red Hat Inc.
+ * Copyright (C) 2017-2021 Red Hat Inc.
  *
  * Redistribution and use in source and binary forms, with or without
  * modification, are permitted provided that the following conditions are
@@ -55,6 +55,7 @@ static uint32_t seed;           /* Random seed. */
 static double percent = 10;     /* Percentage of data. */
 static uint64_t runlength =     /* Expected average run length of data (bytes)*/
   UINT64_C(16*1024*1024);
+static int random_content;      /* false: Repeat same byte  true: Random bytes*/
 
 /* We need to store 1 bit per block.  Using a 4K block size means we
  * need 32M to map each 1T of virtual disk.
@@ -111,6 +112,11 @@ sparse_random_config (const char *key, const char *value)
       return -1;
     }
   }
+  else if (strcmp (key, "random-content") == 0) {
+    random_content = nbdkit_parse_bool (value);
+    if (random_content == -1)
+      return -1;
+  }
   else {
     nbdkit_error ("unknown parameter '%s'", key);
     return -1;
@@ -123,7 +129,8 @@ sparse_random_config (const char *key, const char *value)
   "size=<SIZE>  (required) Size of the backing disk\n" \
   "seed=<SEED>             Random number generator seed\n" \
   "percent=<PERCENT>       Percentage of data\n" \
-  "runlength=<BYTES>       Expected average run length of data"
+  "runlength=<BYTES>       Expected average run length of data\n" \
+  "random-content=true     Fully random content in each block"
 
 /* Create the random bitmap of data and holes.
  *
@@ -276,19 +283,26 @@ static void
 read_block (uint64_t blknum, uint64_t offset, void *buf)
 {
   unsigned char *b = buf;
+  uint64_t s;
+  uint32_t i;
+  struct random_state state;
 
   if (bitmap_get_blk (&bm, blknum, 0) == 0) /* hole */
     memset (buf, 0, BLOCKSIZE);
-  else {                        /* data */
-    uint32_t i;
-    struct random_state state;
-
+  else if (!random_content) {   /* data when random-content=false */
+    xsrandom (seed + offset, &state);
+    s = xrandom (&state);
+    s &= 255;
+    if (s == 0) s = 1;
+    memset (buf, (int)s, BLOCKSIZE);
+  }
+  else {                        /* data when random-content=true */
     /* This produces repeatable data for the same offset.  Note it
      * works because we are called on whole blocks only.
      */
     xsrandom (seed + offset, &state);
     for (i = 0; i < BLOCKSIZE; ++i) {
-      uint64_t s = xrandom (&state);
+      s = xrandom (&state);
       s &= 255;
       b[i] = s;
     }
-- 
2.31.1



