[Libguestfs] [PATCH 2/2] build: Replace ./nbdkit with a C program.

Richard W.M. Jones rjones at redhat.com
Wed Nov 14 09:45:50 UTC 2018


On Tue, Nov 13, 2018 at 08:10:34PM -0600, Eric Blake wrote:
> On 11/13/18 4:51 PM, Richard W.M. Jones wrote:
[...]
> >+  size_t i;
> >+  char *s;
> >+
> >+  /* If NBDKIT_VALGRIND=1 is set in the environment, then we run the
> >+   * program under valgrind.  This is used by the tests.  Similarly if
> >+   * NBDKIT_GDB=1 is set, we run the program under GDB, useful during
> >+   * development.
> >+   */
> >+  s = getenv ("NBDKIT_VALGRIND");
> >+  if (s && strcmp (s, "1") == 0) {
> 
> The shell did this if NBDKIT_VALGRIND was set to anything non-empty;
> now you are requiring it to be exactly "1". Intentional, or do you
> really want 'if (s && *s)'?

I was following the documentation rather than what the shell script
did.  The existing code sets NBDKIT_VALGRIND=1 and in other tests we
have awkward tests such as:

  if [ "$NBDKIT_VALGRIND" = "1" ]; then

so I guess the intention is NBDKIT_VALGRIND=1 rather than just is set.

[...]
> >+  /* Option parsing.  We don't really parse options here.  We are only
> >+   * interested in which options have arguments and which need
> >+   * rewriting.
> >+   */
> >+  for (;;) {
> >+    long_index = -1;
> 
> Why do we need this?
> 
> /me reads ahead...
> 
> Oh, because you are taking the lazy way out by rewriting all user's
> input back in long-opt form, regardless of how the user spelled it
> on input, rather than duplicating a switch statement that has to be
> maintained in parallel with main.c.

Right, if long_index is available then we don't need to entirely
duplicate the switch statement from the main program.

> >+    c = getopt_long (argc, argv, short_options, long_options, &long_index);
> >+    if (c == -1)
> >+      break;
> >+
> >+    /* long_index is only set if it's an actual long option.  However
> >+     * in nbdkit all short options have long equivalents so we can
> >+     * associate those with a long_index, but we must do it ourselves.
> >+     * https://stackoverflow.com/a/34070441
> >+     */
> >+    if (long_index == -1) {
> >+      for (i = 0; long_options[i].name != NULL; ++i) {
> >+        if (long_options[i].val == c) {
> >+          long_index = i;
> >+          break;
> >+        }
> >+      }
> >+      if (long_index == -1)
> >+        abort ();
> 
> I can abort this, by passing in a garbage argument - at which point
> getopt_long returns '?', but that doesn't map back to
> long_options[].
> 
> $ ./nbdkit -1
> ./nbdkit: invalid option -- '1'
> Aborted (core dumped)
> $ ./nbdkit --huh
> ./nbdkit: unrecognized option '--huh'
> Aborted (core dumped)
> $ ./nbdkit --filter
> ./nbdkit: option '--filter' requires an argument
> Aborted (core dumped)

OK I fixed these as you suggested by testing if c == '?'.  getopt
already prints a usable error message so:

$ ./nbdkit --huh
./nbdkit: unrecognized option '--huh'
$ ./nbdkit -1
./nbdkit: invalid option -- '1'
$ ./nbdkit -?
./nbdkit: invalid option -- '?'
$ ./nbdkit -boo
./nbdkit: invalid option -- 'b'
$ ./nbdkit --filter
./nbdkit: option '--filter' requires an argument

All exiting with code 1.

> Best is probably to just exit(1) if c == '?' (you've already
> diagnosed the option error, so passing through to the real nbdkit
> would diagnose it again), or MAYBE to do the equivalent of
> "src/nbdkit --help" (but then still exit with non-zero status - that
> gets trickier, unless you also move the usage text to options.h in
> patch 1) to get the effects of normal nbdkit printing usage after
> improper options.
> 
> >+    }
> 
> Still, I wonder if this loop is necessary - if long_index is -1, you
> know the user passed a short opt, AND you know that printf("-%c", c)
> will spell that short opt (except for c == '?' - but you should
> already be special-casing that). So is it really necessary to map
> all the user's short options into a long option before doing the
> passthrough?

I believe yes, we always need this loop, so that we can test
long_options[long_index].has_arg.

> >+
> >+    if (c == FILTER_OPTION) {
> >+      /* Filters can be rewritten if purely alphanumeric. */
> >+      if (is_short_name (optarg)) {
> >+        passthru_format ("--%s", /* --filter */
> >+                         long_options[long_index].name);
> 
> Here, you know that there is no short option for --filter, but you
> ALSO know that the option is spelled exactly "--filter", so why
> bother with formatting the reverse lookup into long_options when you
> can just hardcode passthru("--format")?
> 
> >+        passthru_format ("%s/filters/%s/.libs/nbdkit-%s-filter.so",
> >+                         builddir, optarg, optarg);
> >+      }
> >+      else goto opt_with_arg;
> >+    }
> >+    else if (c == 'v') {
> >+      /* Verbose is special because we will print the final command. */
> >+      verbose = 1;
> >+      passthru_format ("--%s", long_options[long_index].name);
> 
> Okay, here if you don't reverse map long_index when the user passed
> a short option, then you have to do a runtime decision between
> "--%s" or "-%c",
> 
> >+    }
> >+    else if (long_options[long_index].has_arg) {
> >+      /* Remaining options that take an argument are passed through. */
> >+    opt_with_arg:
> >+      passthru_format ("--%s", long_options[long_index].name);
> 
> and repeat that runtime decision,
> 
> >+      passthru (optarg);
> >+    }
> >+    else {
> >+      /* Remaining options that do not take an argument. */
> >+      passthru_format ("--%s", long_options[long_index].name);
> 
> and repeat again.  At the end of the day, it may still be slightly
> more efficient to avoid the reverse lookup loop, but I don't know if
> it becomes any easier to maintain (you'd be best off adding in a
> helper function rather than open-coding the decision).  Offhand,
> 
> void passthru_option (int c, int long_index)
> {
>   assert (c != '?');
>   if (long_index == -1)
>     passthru_format ("--%s", long_options[long_index].name);
>   else {
>     assert (c <= CHAR_MAX)
>     passthru_format ("-%c", c);
>   }
> }
> 
> >+    }
> >+  }
> >+
> >+  /* Are there any non-option arguments? */
> >+  if (optind < argc) {
> >+    /* The first non-option argument is the plugin name.  If it is
> >+     * purely alphanumeric then rewrite it.
> >+     */
> >+    if (is_short_name (argv[optind])) {
> 
> is_short_name(" ") and is_short_name("-") both return true, leading
> to some unusual messages:
> 
> $ ./nbdkit -
> nbdkit: /home/eblake/nbdkit/plugins/-/.libs/nbdkit---plugin.so:
> /home/eblake/nbdkit/plugins/-/.libs/nbdkit---plugin.so: cannot open
> shared object file: No such file or directory
> $ ./nbdkit ' '
> nbdkit: /home/eblake/nbdkit/plugins/ /.libs/nbdkit- -plugin.so:
> /home/eblake/nbdkit/plugins/ /.libs/nbdkit- -plugin.so: cannot open
> shared object file: No such file or directory
> 
> but that's pre-existing even with the real nbdkit, and not something
> you need to worry about in this patch other than the comment.  Among
> other things, the comment lies (while I can allow "-" as a stretch
> that is nearly alphanumeric, " " certainly doesn't fit the bill).
> Note that it IS different from the shell script, where you did the
> rewrite only after checking the regex ^[a-zA-Z0-9]+$ (no '_'?) and
> was therefore purely alphanumeric - but by being consistent with
> src/main.c, now you can rewrite is_short_name() in a separate patch
> to get both the main binary and the wrapper to change their rules on
> how to treat the plugin name.

Right, the script and the program were inconsistent.  I made the
wrapper consistent but used the comment from the shell script.  Fixed
in the next version.

> >+      /* Special plugins written in Perl. */
> >+      if (strcmp (argv[optind], "example4") == 0 ||
> >+          strcmp (argv[optind], "tar") == 0) {
> >+        passthru_format ("%s/plugins/perl/.libs/nbdkit-perl-plugin.so",
> >+                         builddir);
> >+        passthru_format ("%s/plugins/%s/nbdkit-%s-plugin",
> >+                         builddir, argv[optind], argv[optind]);
> >+      }
> >+      else {
> >+        passthru_format ("%s/plugins/%s/.libs/nbdkit-%s-plugin.so",
> >+                         builddir, argv[optind], argv[optind]);
> >+      }
> >+      ++optind;
> >+    }
> 
> Hmm. If the user calls 'nbdkit file -- -my-file' as an alternative
> to 'nbdkit file ./-my-file', you end up not repeating the "--"
> argument. You probably want to do passthru("--") right here, whether
> or not it was present in the caller's command line, to ensure that
> the real nbdkit doesn't see any remaining arguments as options.

Good point.  Currently:

$ ./nbdkit -f -v file -- -my-file

is rewritten to this:

/home/rjones/d/nbdkit/src/nbdkit --foreground --verbose /home/rjones/d/nbdkit/plugins/file/.libs/nbdkit-file-plugin.so -my-file

which fails (also "--" is actually dropped).

I added passthru ("--") before the plugin name parsing and it is now
rewritten to:

/home/rjones/d/nbdkit/src/nbdkit -f -v -- /home/rjones/d/nbdkit/plugins/file/.libs/nbdkit-file-plugin.so -my-file

> >+
> >+    /* Everything else is passed through without rewriting. */
> >+    while (optind < argc) {
> >+      passthru (argv[optind]);
> >+      ++optind;
> >+    }
> >+  }
> >+
> >+  end_passthru ();
> >+  if (verbose)
> >+    print_command ();
> 
> The rewrite from './nbdkit -f' into '/path/to/src/nbdkit
> --foreground' looks weird; I argued that a helper function to
> preserve the short option spelling when possible might be nicer
> (you'll still rewrite './nbdkit -vf' into '/path/to/src/nbdkit -v
> -f' even if you preserve short options, but that's not as drastic).
> Similarly, the rewrite from './nbdkit --filter=foo' into
> '/path/to/src/nbdkit --filter foo' is reasonable, as is './nbdkit
> -Pfile.pid' into '/path/to/src/nbdkit -P file.pid'.

I still think, because we test long_options.has_arg, we must compute
long_index.  However in the second version I have made it preserve
original long/short arguments using a variation of your technique
described above.

> >+
> >+  /* Run the final command. */
> >+  execvp (cmd[0], (char **) cmd);
> >+  perror (cmd[0]);
> >+  exit (EXIT_FAILURE);
> >+}
> >
> 
> Hmm - should 'nbdkit --help --garbage' warn about --garbage being
> unrecognized, or should it unconditionally print help regardless of
> the rest of the command line?  Right now it does the former; but if
> it switches to the latter, then you have to special-case --help in
> wrapper.c to get the same effect.  Ditto for --version.

I'm not sure.  At the moment you are right it does:

$ ./nbdkit --help --garbage
./nbdkit: unrecognized option '--garbage'

which seems reasonable to me (after all, if you want help, then
just use --help ...)

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines.  Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v




More information about the Libguestfs mailing list