[Libguestfs] [PATCH] RFC: OCaml tools: add and use a Getopt module
Pino Toscano
ptoscano at redhat.com
Mon Jul 11 15:57:43 UTC 2016
On Monday, 27 June 2016 15:42:46 CEST Richard W.M. Jones wrote:
> On Fri, Jun 24, 2016 at 05:42:37PM +0200, Pino Toscano wrote:
> > Add a new Getopt module to mllib, to parse command line arguments with
> > handlers close to the ones used with Arg, but using getopt(3) (actually
> > getopt_long_only) to do the real parsing. This allow us to provide
> > options for OCaml tools with a syntax similar to the C tools, and use
> > the additional features getopt offers and Arg does not.
> >
> > Do a single-step conversion of Common_utils and all the OCaml tools to
> > the syntax of Getopt.
> >
> > As side-change due to the conversion, extra arguments for sysprep
> > operation can have more keys for the same argument.
>
> In general terms, it's a very good change which really improves the
> tools.
>
> I have a few fairly minor issues below. ACK if you can clean all of
> those up.
[...]
> > diff --git a/mllib/getopt-c.c b/mllib/getopt-c.c
> > new file mode 100644
> > index 0000000..d44448f
> > --- /dev/null
> > +++ b/mllib/getopt-c.c
> > @@ -0,0 +1,398 @@
>
> > +value
> > +guestfs_int_mllib_getopt_parse (value argsv, value specsv, value anon_funv, value usage_msgv)
>
> I'm not convinced that this function is safe against OCaml GC compaction.
>
> In particular there are problems such as:
>
> ...
> > + for (j = 0; j < len; ++j) {
> > + const char *key = String_val (Field (keysv, j));
>
> key now points to a string on the OCaml heap, and then ...
>
> > + if (newopts == NULL)
> > + caml_raise_out_of_memory ();
> > + longopts = newopts;
> > + longopts[longopts_len].name = key;
>
> the same pointer is copied to longopts, but ...
>
> > + case 0: /* Unit of (unit -> unit) */
> > + do_call1 (Field (actionv, 0), Val_unit);
>
> At this point you're calling an OCaml function which is likely to
> allocate, and could therefore call the GC, and could therefore compact
> the heap, which would move that string around, and make your pointer
> invalid. (You could try adding `Gc.compact ()' to one of these
> callback functions -- I'm fairly sure at least some of the time you
> could get a segfault, and if not, valgrind wouldn't be happy).
I see, it fails that way indeed. Fixed it by:
- copying long option strings on the heap (with the ugly cast in
cleanup_option_list, but almost unavoidable)
- making sure that operations that trigger allocations are executed
one-by-one (i.e. caml_copy_string out of direct function parameters,
etc)
> > +external getopt_parse : string array -> (c_keys * spec * doc) array -> ?anon_fun:anon_fun -> usage_msg -> unit = "guestfs_int_mllib_getopt_parse"
> > +
> > +let parse_argv argv specs ?anon_fun usage_msg =
>
> I feel this function could do some sanity checking on the inputs,
> especially the keys (but see my comment below). If the sanity check
> fails, it should either assert or failwith, but definitely not
> continue.
Moved the sanity checks in the OCaml part.
> > + let specs = specs @ [
> > + (* Handled internally by getopt_parse. *)
> > + [ "-h"; "-help"; "--help" ], Unit (fun () -> ()), s_"Display brief help";
> > + ] in
> > + let specs = List.map (
> > + fun (keys, spec, doc) ->
> > + (Array.of_list keys), spec, doc
>
> You don't need parens here, since function application always binds tightest.
Removed, thanks (leftover of other code previously there).
> > +type spec =
> > + | Unit of (unit -> unit)
> > + (* Simple option with no argument; call the function. *)
> > + | Set of bool ref
> > + (* Simple option with no argument; set the reference to true. *)
> > + | Clear of bool ref
> > + (* Simple option with no argument; set the reference to false. *)
>
> Does getopt_long create the '--no-X' options automatically?
No, it doesn't.
> > + | String of string * (string -> unit)
> > + (* Option requiring an argument; the first element in the tuple
> > + is the documentation string of the argument, and the second
> > + is the function to call. *)
> > + | Set_string of string * string ref
> > + (* Option requiring an argument; the first element in the tuple
> > + is the documentation string of the argument, and the second
> > + is the reference to be set. *)
> > + | Int of string * (int -> unit)
> > + (* Option requiring an integer value as argument; the first
> > + element in the tuple is the documentation string of the
> > + argument, and the second is the function to call. *)
> > + | Set_int of string * int ref
> > + (* Option requiring an integer value as argument; the first
> > + element in the tuple is the documentation string of the
> > + argument, and the second is the reference to be set. *)
> > +
> > +type keys = string list
>
> I had a vague idea that you might make this more type safe by
> changing this type to:
>
> type optstring = S of char (** short option *) | L of string (** --long *)
> and keys = optstring list
>
> It requires a bunch of extra changes through the code, but also avoids
> needing to write horrible unsafe code like:
>
> if String.is_prefix arg "-" && not (String.is_prefix arg "--") then
>
> You could also put optstring into a submodule so that just the L and S
> definitions can be imported into client modules without needing to
> import the whole of Getopt, so that makes the syntax quite brief:
>
> open Getopt.Optstring
> ...
> [ S'l'; L"list" ], Getopt.Unit list_mode, s_"List available templates";
>
> or if you prefer:
>
> module O = Getopt.Optstring
> ...
> [ O.S'l'; O.L"list" ], Getopt.Unit list_mode, s_"List available templates";
I thought about this, and at the moment it feels to me a bit too
complicated. I might revise that in the future though.
> > --- a/v2v/cmdline.ml
> > +++ b/v2v/cmdline.ml
> > @@ -164,55 +164,47 @@ let parse_cmdline () =
> > and o_options =
> > String.concat "|" (Modules_list.output_modules ()) in
> >
> > - let ditto = " -\"-" in
> > let argspec = [
> > - "-b", Arg.String add_bridge, "in:out " ^ s_"Map bridge 'in' to 'out'";
> > - "--bridge", Arg.String add_bridge, "in:out " ^ ditto;
> > - "--compressed", Arg.Set compressed, " " ^ s_"Compress output file";
> > - "--dcpath", Arg.String (set_string_option_once "--dcpath" dcpath),
> > - "path " ^ s_"Override dcPath (for vCenter)";
> > - "--dcPath", Arg.String (set_string_option_once "--dcPath" dcpath),
> > - "path " ^ ditto;
> > - "--debug-overlay",Arg.Set debug_overlays,
> > - " " ^ s_"Save overlay files";
> > - "--debug-overlays",Arg.Set debug_overlays,
> > - ditto;
> > - "-i", Arg.String set_input_mode, i_options ^ " " ^ s_"Set input mode (default: libvirt)";
> > - "-ic", Arg.String (set_string_option_once "-ic" input_conn),
> > - "uri " ^ s_"Libvirt URI";
> > - "-if", Arg.String (set_string_option_once "-if" input_format),
> > - "format " ^ s_"Input format (for -i disk)";
> > - "--in-place", Arg.Set in_place, " " ^ s_"Only tune the guest in the input VM";
> > - "--machine-readable", Arg.Set machine_readable, " " ^ s_"Make output machine readable";
> > - "-n", Arg.String add_network, "in:out " ^ s_"Map network 'in' to 'out'";
> > - "--network", Arg.String add_network, "in:out " ^ ditto;
> > - "--no-copy", Arg.Clear do_copy, " " ^ s_"Just write the metadata";
> > - "--no-trim", Arg.String no_trim_warning,
> > - "-" ^ " " ^ s_"Ignored for backwards compatibility";
> > - "-o", Arg.String set_output_mode, o_options ^ " " ^ s_"Set output mode (default: libvirt)";
> > - "-oa", Arg.String set_output_alloc,
> > - "sparse|preallocated " ^ s_"Set output allocation mode";
> > - "-oc", Arg.String (set_string_option_once "-oc" output_conn),
> > - "uri " ^ s_"Libvirt URI";
> > - "-of", Arg.String (set_string_option_once "-of" output_format),
> > - "raw|qcow2 " ^ s_"Set output format";
> > - "-on", Arg.String (set_string_option_once "-on" output_name),
> > - "name " ^ s_"Rename guest when converting";
> > - "-os", Arg.String (set_string_option_once "-os" output_storage),
> > - "storage " ^ s_"Set output storage location";
> > - "--password-file", Arg.String (set_string_option_once "--password-file" password_file),
> > - "file " ^ s_"Use password from file";
> > - "--print-source", Arg.Set print_source, " " ^ s_"Print source and stop";
> > - "--qemu-boot", Arg.Set qemu_boot, " " ^ s_"Boot in qemu (-o qemu only)";
> > - "--root", Arg.String set_root_choice,"ask|... " ^ s_"How to choose root filesystem";
> > - "--vdsm-image-uuid", Arg.String add_vdsm_image_uuid, "uuid " ^ s_"Output image UUID(s)";
> > - "--vdsm-vol-uuid", Arg.String add_vdsm_vol_uuid, "uuid " ^ s_"Output vol UUID(s)";
> > - "--vdsm-vm-uuid", Arg.String (set_string_option_once "--vdsm-vm-uuid" vdsm_vm_uuid),
> > - "uuid " ^ s_"Output VM UUID";
> > - "--vdsm-ovf-output", Arg.String (set_string_option_once "--vdsm-ovf-output" vdsm_ovf_output),
> > - " " ^ s_"Output OVF file";
> > - "--vmtype", Arg.String vmtype_warning,
> > - "- " ^ s_"Ignored for backwards compatibility";
> > + [ "-b"; "--bridge" ], Getopt.String ("in:out", add_bridge), s_"Map bridge 'in' to 'out'";
> > + [ "--compressed" ], Getopt.Set compressed, s_"Compress output file";
> > + [ "--dcpath"; "--dcPath" ], Getopt.String ("path", set_string_option_once "--dcpath" dcpath),
> > + s_"Override dcPath (for vCenter)";
> > + [ "--debug-overlay"; "--debug-overlays" ], Getopt.Set debug_overlays, s_"Save overlay files";
> > + [ "-i" ], Getopt.String (i_options, set_input_mode), s_"Set input mode (default: libvirt)";
> > + [ "-ic" ], Getopt.String ("uri", set_string_option_once "-ic" input_conn),
> > + s_"Libvirt URI";
> > + [ "-if" ], Getopt.String ("format", set_string_option_once "-if" input_format),
> > + s_"Input format (for -i disk)";
>
> I'm interested to know if these awkward "single dash long options"
> actually work now? And the -o* ones below.
Thanks to getopt_long_only, they work (and thus the v2v test suite
passes).
Thanks for the thorough review, v2 coming in a moment.
--
Pino Toscano
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: This is a digitally signed message part.
URL: <http://listman.redhat.com/archives/libguestfs/attachments/20160711/07d5b81f/attachment.sig>
More information about the Libguestfs
mailing list