[Libguestfs] [PATCH 2/2] build: Replace ./nbdkit with a C program.
Richard W.M. Jones
rjones at redhat.com
Wed Nov 14 09:45:50 UTC 2018
On Tue, Nov 13, 2018 at 08:10:34PM -0600, Eric Blake wrote:
> On 11/13/18 4:51 PM, Richard W.M. Jones wrote:
[...]
> >+ size_t i;
> >+ char *s;
> >+
> >+ /* If NBDKIT_VALGRIND=1 is set in the environment, then we run the
> >+ * program under valgrind. This is used by the tests. Similarly if
> >+ * NBDKIT_GDB=1 is set, we run the program under GDB, useful during
> >+ * development.
> >+ */
> >+ s = getenv ("NBDKIT_VALGRIND");
> >+ if (s && strcmp (s, "1") == 0) {
>
> The shell did this if NBDKIT_VALGRIND was set to anything non-empty;
> now you are requiring it to be exactly "1". Intentional, or do you
> really want 'if (s && *s)'?
I was following the documentation rather than what the shell script
did. The existing code sets NBDKIT_VALGRIND=1 and in other tests we
have awkward tests such as:
if [ "$NBDKIT_VALGRIND" = "1" ]; then
so I guess the intention is NBDKIT_VALGRIND=1 rather than just is set.
[...]
> >+ /* Option parsing. We don't really parse options here. We are only
> >+ * interested in which options have arguments and which need
> >+ * rewriting.
> >+ */
> >+ for (;;) {
> >+ long_index = -1;
>
> Why do we need this?
>
> /me reads ahead...
>
> Oh, because you are taking the lazy way out by rewriting all user's
> input back in long-opt form, regardless of how the user spelled it
> on input, rather than duplicating a switch statement that has to be
> maintained in parallel with main.c.
Right, if long_index is available then we don't need to entirely
duplicate the switch statement from the main program.
> >+ c = getopt_long (argc, argv, short_options, long_options, &long_index);
> >+ if (c == -1)
> >+ break;
> >+
> >+ /* long_index is only set if it's an actual long option. However
> >+ * in nbdkit all short options have long equivalents so we can
> >+ * associate those with a long_index, but we must do it ourselves.
> >+ * https://stackoverflow.com/a/34070441
> >+ */
> >+ if (long_index == -1) {
> >+ for (i = 0; long_options[i].name != NULL; ++i) {
> >+ if (long_options[i].val == c) {
> >+ long_index = i;
> >+ break;
> >+ }
> >+ }
> >+ if (long_index == -1)
> >+ abort ();
>
> I can abort this, by passing in a garbage argument - at which point
> getopt_long returns '?', but that doesn't map back to
> long_options[].
>
> $ ./nbdkit -1
> ./nbdkit: invalid option -- '1'
> Aborted (core dumped)
> $ ./nbdkit --huh
> ./nbdkit: unrecognized option '--huh'
> Aborted (core dumped)
> $ ./nbdkit --filter
> ./nbdkit: option '--filter' requires an argument
> Aborted (core dumped)
OK I fixed these as you suggested by testing if c == '?'. getopt
already prints a usable error message so:
$ ./nbdkit --huh
./nbdkit: unrecognized option '--huh'
$ ./nbdkit -1
./nbdkit: invalid option -- '1'
$ ./nbdkit -?
./nbdkit: invalid option -- '?'
$ ./nbdkit -boo
./nbdkit: invalid option -- 'b'
$ ./nbdkit --filter
./nbdkit: option '--filter' requires an argument
All exiting with code 1.
> Best is probably to just exit(1) if c == '?' (you've already
> diagnosed the option error, so passing through to the real nbdkit
> would diagnose it again), or MAYBE to do the equivalent of
> "src/nbdkit --help" (but then still exit with non-zero status - that
> gets trickier, unless you also move the usage text to options.h in
> patch 1) to get the effects of normal nbdkit printing usage after
> improper options.
>
> >+ }
>
> Still, I wonder if this loop is necessary - if long_index is -1, you
> know the user passed a short opt, AND you know that printf("-%c", c)
> will spell that short opt (except for c == '?' - but you should
> already be special-casing that). So is it really necessary to map
> all the user's short options into a long option before doing the
> passthrough?
I believe yes, we always need this loop, so that we can test
long_options[long_index].has_arg.
> >+
> >+ if (c == FILTER_OPTION) {
> >+ /* Filters can be rewritten if purely alphanumeric. */
> >+ if (is_short_name (optarg)) {
> >+ passthru_format ("--%s", /* --filter */
> >+ long_options[long_index].name);
>
> Here, you know that there is no short option for --filter, but you
> ALSO know that the option is spelled exactly "--filter", so why
> bother with formatting the reverse lookup into long_options when you
> can just hardcode passthru("--format")?
>
> >+ passthru_format ("%s/filters/%s/.libs/nbdkit-%s-filter.so",
> >+ builddir, optarg, optarg);
> >+ }
> >+ else goto opt_with_arg;
> >+ }
> >+ else if (c == 'v') {
> >+ /* Verbose is special because we will print the final command. */
> >+ verbose = 1;
> >+ passthru_format ("--%s", long_options[long_index].name);
>
> Okay, here if you don't reverse map long_index when the user passed
> a short option, then you have to do a runtime decision between
> "--%s" or "-%c",
>
> >+ }
> >+ else if (long_options[long_index].has_arg) {
> >+ /* Remaining options that take an argument are passed through. */
> >+ opt_with_arg:
> >+ passthru_format ("--%s", long_options[long_index].name);
>
> and repeat that runtime decision,
>
> >+ passthru (optarg);
> >+ }
> >+ else {
> >+ /* Remaining options that do not take an argument. */
> >+ passthru_format ("--%s", long_options[long_index].name);
>
> and repeat again. At the end of the day, it may still be slightly
> more efficient to avoid the reverse lookup loop, but I don't know if
> it becomes any easier to maintain (you'd be best off adding in a
> helper function rather than open-coding the decision). Offhand,
>
> void passthru_option (int c, int long_index)
> {
> assert (c != '?');
> if (long_index == -1)
> passthru_format ("--%s", long_options[long_index].name);
> else {
> assert (c <= CHAR_MAX)
> passthru_format ("-%c", c);
> }
> }
>
> >+ }
> >+ }
> >+
> >+ /* Are there any non-option arguments? */
> >+ if (optind < argc) {
> >+ /* The first non-option argument is the plugin name. If it is
> >+ * purely alphanumeric then rewrite it.
> >+ */
> >+ if (is_short_name (argv[optind])) {
>
> is_short_name(" ") and is_short_name("-") both return true, leading
> to some unusual messages:
>
> $ ./nbdkit -
> nbdkit: /home/eblake/nbdkit/plugins/-/.libs/nbdkit---plugin.so:
> /home/eblake/nbdkit/plugins/-/.libs/nbdkit---plugin.so: cannot open
> shared object file: No such file or directory
> $ ./nbdkit ' '
> nbdkit: /home/eblake/nbdkit/plugins/ /.libs/nbdkit- -plugin.so:
> /home/eblake/nbdkit/plugins/ /.libs/nbdkit- -plugin.so: cannot open
> shared object file: No such file or directory
>
> but that's pre-existing even with the real nbdkit, and not something
> you need to worry about in this patch other than the comment. Among
> other things, the comment lies (while I can allow "-" as a stretch
> that is nearly alphanumeric, " " certainly doesn't fit the bill).
> Note that it IS different from the shell script, where you did the
> rewrite only after checking the regex ^[a-zA-Z0-9]+$ (no '_'?) and
> was therefore purely alphanumeric - but by being consistent with
> src/main.c, now you can rewrite is_short_name() in a separate patch
> to get both the main binary and the wrapper to change their rules on
> how to treat the plugin name.
Right, the script and the program were inconsistent. I made the
wrapper consistent but used the comment from the shell script. Fixed
in the next version.
> >+ /* Special plugins written in Perl. */
> >+ if (strcmp (argv[optind], "example4") == 0 ||
> >+ strcmp (argv[optind], "tar") == 0) {
> >+ passthru_format ("%s/plugins/perl/.libs/nbdkit-perl-plugin.so",
> >+ builddir);
> >+ passthru_format ("%s/plugins/%s/nbdkit-%s-plugin",
> >+ builddir, argv[optind], argv[optind]);
> >+ }
> >+ else {
> >+ passthru_format ("%s/plugins/%s/.libs/nbdkit-%s-plugin.so",
> >+ builddir, argv[optind], argv[optind]);
> >+ }
> >+ ++optind;
> >+ }
>
> Hmm. If the user calls 'nbdkit file -- -my-file' as an alternative
> to 'nbdkit file ./-my-file', you end up not repeating the "--"
> argument. You probably want to do passthru("--") right here, whether
> or not it was present in the caller's command line, to ensure that
> the real nbdkit doesn't see any remaining arguments as options.
Good point. Currently:
$ ./nbdkit -f -v file -- -my-file
is rewritten to this:
/home/rjones/d/nbdkit/src/nbdkit --foreground --verbose /home/rjones/d/nbdkit/plugins/file/.libs/nbdkit-file-plugin.so -my-file
which fails (also "--" is actually dropped).
I added passthru ("--") before the plugin name parsing and it is now
rewritten to:
/home/rjones/d/nbdkit/src/nbdkit -f -v -- /home/rjones/d/nbdkit/plugins/file/.libs/nbdkit-file-plugin.so -my-file
> >+
> >+ /* Everything else is passed through without rewriting. */
> >+ while (optind < argc) {
> >+ passthru (argv[optind]);
> >+ ++optind;
> >+ }
> >+ }
> >+
> >+ end_passthru ();
> >+ if (verbose)
> >+ print_command ();
>
> The rewrite from './nbdkit -f' into '/path/to/src/nbdkit
> --foreground' looks weird; I argued that a helper function to
> preserve the short option spelling when possible might be nicer
> (you'll still rewrite './nbdkit -vf' into '/path/to/src/nbdkit -v
> -f' even if you preserve short options, but that's not as drastic).
> Similarly, the rewrite from './nbdkit --filter=foo' into
> '/path/to/src/nbdkit --filter foo' is reasonable, as is './nbdkit
> -Pfile.pid' into '/path/to/src/nbdkit -P file.pid'.
I still think, because we test long_options.has_arg, we must compute
long_index. However in the second version I have made it preserve
original long/short arguments using a variation of your technique
described above.
> >+
> >+ /* Run the final command. */
> >+ execvp (cmd[0], (char **) cmd);
> >+ perror (cmd[0]);
> >+ exit (EXIT_FAILURE);
> >+}
> >
>
> Hmm - should 'nbdkit --help --garbage' warn about --garbage being
> unrecognized, or should it unconditionally print help regardless of
> the rest of the command line? Right now it does the former; but if
> it switches to the latter, then you have to special-case --help in
> wrapper.c to get the same effect. Ditto for --version.
I'm not sure. At the moment you are right it does:
$ ./nbdkit --help --garbage
./nbdkit: unrecognized option '--garbage'
which seems reasonable to me (after all, if you want help, then
just use --help ...)
Thanks,
Rich.
--
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-p2v converts physical machines to virtual machines. Boot with a
live CD or over the network (PXE) and turn machines into KVM guests.
http://libguestfs.org/virt-v2v
More information about the Libguestfs
mailing list