[libvirt] [v4 2/5] virsh: Improve readline generators and readline completion

Sun Sep 22 14:57:51 UTC 2013

On 17/09/13 at 09:22pm, Eric Blake wrote:
> On 09/10/2013 09:54 AM, Tomas Meszaros wrote:
> > This patch is rather big and introduces several new functions,
> > but I kept all new functions in the one patch because they are
> > all connected together.
> > 
> > I had to extend vshReadlineOptionsGenerator() a lot, so it is possible
> > to fully complete options by calling appropriate opt->completer().
> > 
> > vshReadlineOptionsCompletionGenerator() is simple new function
> > which is used only when we need to fully complete specific option,
> > e.g.: virsh # vol-key --vol <TAB>
> > 
> > ---
> 
> I still think you are mixing a bit too much into one patch, and that you
> are reimplementing pieces of the parser instead of refactoring what is
> already existing in other places for easier reuse.
> 
> On the other hand, I'm seeing some nice improvements. Below, I'm using
> ^^ to highlight what I typed; best viewed in fixed width font.
> 
> Pre-patch, I'm seeing:
> 
> virsh# vol-k<TAB>
>        ^^^^^^^^^^
> virsh# vol-key <TAB>
>                ^^^^^
> virsh# vol-key --pool <TAB>
>                       ^^^^^
> virsh# vol-key --pool --pool
> 
> while post-patch, it changes to:
> 
> virsh# vol-k<TAB>
>        ^^^^^^^^^^
> virsh# vol-key --<TAB>
>                  ^^^^^
> --help  --pool  --vol
> virsh# vol-key --<TAB>
>                  ^^^^^
> --help  --pool  --vol
> virsh# vol-key --p<TAB>
>                  ^^^^^^
> virsh# vol-key --pool <TAB>
>                       ^^^^^
> --help  --vol
> 
> Which is definitely better, but still not perfect (--pool takes a
> mandatory argument, so we should NOT be offering --vol as a valid
> completion once --pool has been typed, until AFTER the argument of pool
> has also been typed).

> 
> > +++ b/tools/virsh.c
> > @@ -2587,6 +2587,118 @@ vshCloseLogFile(vshControl *ctl)
> >   * -----------------
> >   */
> >  
> > +static const vshCmdDef *
> > +vshDetermineCommandName(void)
> > +{
> > +    const vshCmdDef *cmd = NULL;
> > +    char *p;
> > +    char *cmdname;
> > +
> > +    if (!(p = strchr(rl_line_buffer, ' ')))
> > +        return NULL;
> > +
> > +    cmdname = vshMalloc(NULL, (p - rl_line_buffer) + 1);
> > +    memcpy(cmdname, rl_line_buffer, p - rl_line_buffer);
> > +
> > +    cmd = vshCmddefSearch(cmdname);
> > +    VIR_FREE(cmdname);
> > +
> > +    return cmd;
> > +}
> 
> This particular part of the patch looks like it is just code motion
> (moving it earlier in the file to avoid having to do a forward
> reference).  If so, please split that into a separate patch.

Yes it is just refactored code. I'll put it in the separate patch.

> Both pre- and post-patch, I noticed:
> 
> virsh# e''<TAB>
>        ^^^^^^^^
> list of files in current directory starting with e
> 
> although what I really want is:
> 
> virsh# e''<TAB>
>        ^^^^^^^^
> echo  edit  emulatorpin  exit
> virsh# e''cho hi<ENTER>
>           ^^^^^^^^^^^^^
> hi
> 
> that is, the virsh parser is already pretty sophisticated about doing
> quote removal, but the readline parser is not.  I think this function
> needs to take advantage of what parsing we already do.
> 
> Ideally, you should be using stuff like vshCommandStringGetArg to take
> rl_line_buffer and split that into de-quoted words, and only then try to
> figure out what completions to attempt.  That is, instead of trying to
> complete on "e''" (which is not a prefix of any command, and therefore
> falls back to the default completion of file names), you would be
> completing on "e" (the results after vshCommandStringGetArg removes
> quotes).  Then again, if you do that, then you also have to figure out
> the readline hooks so that when the completion is not ambiguous (or when
> it is ambiguous but has a non-ambiguous prefix and the user has
> requested that <TAB> auto-type the common prefix), you have to know
> which part of the completion is the suffix to append to what the user
> typed without stripping the quotes they already typed.  It can get
> tricky fast, so maybe the current behavior is okay, but food for thought.

I have looked over vshCommandStringGetArg but cannot figure out how to
use it to parse rl_line_buffer... can you give me any hints?

> > +
> > +/*
> > + * Check if option with opt_name is already fully completed.
> > + */
> > +static bool
> > +vshOptFullyCompleted(const char *opt_name)
> > +{
> > +    const vshCmdDef *cmd = NULL;
> > +    char *completed_name = NULL;
> > +    char **completed_list = NULL;
> > +    size_t completed_list_index;
> > +    size_t opts_index = 0;
> > +    bool opt_fully_completed = false;
> > +
> > +    if (!opt_name)
> > +        return opt_fully_completed;
> > +
> > +    cmd = vshDetermineCommandName();
> 
> Here's an interesting exercise - run under gdb and put a breakpoint on
> vshDetermineCommandName() (that gets tricky - I find it easiest to start
> virsh in batch mode in one terminal, then use 'gdb --pid $(pidof
> lt-virsh)' in another), and see how many times you are calling this
> function (not all necessarily from this call site, but still
> instructive).  For example, I counted:
> 
> virsh# e<TAB> => once
> virsh# ec<TAB> => once
> virsh# echo <TAB> => seven times
> virsh# echo --s<TAB> => six times

Yay, thats a bug. I've fixed it so it does not call vsHDetermineCommandName()
over and over again.

> Wow.  Once should be sufficient; which means we aren't caching what
> we've learned about rl_line_buffer.  Really, when the user hits <TAB>,
> we should COMPLETELY parse what they've typed so far (tokenize it into
> an array of arguments with quotes stripped, with special handling if the
> final argument is an unterminated quote such as hitting <TAB> after
> unmatched '), and then ALL of our completion functions should refer back
> to that cached structure for the remainder of their work, without having
> to reparse the line.  Readline already gives you the hints you need for
> proper memory management of that cache (argument of 0 when <TAB> is hit,
> non-zero for all subsequent work until you finally return NULL to end
> the completion list).  (Yeah, I wish it were via an opaque pointer,
> rather than requiring you to store it in global variables, but such is
> life when dealing with super-old APIs that fortunately still happen to
> only need single-threading even today).

That's good idea. I will try to reimplement it so it can use tokenized
rl_line_buffer. Anyway, we will still need to call
vshDetermineCommandName() to get vshCmdDef *cmd of the current command
(or we declare another global variable which will take care of this, so
we can refer to it every time we want to check current cmd in
rl_line_buffer).

> 
> Your only saving grace is that tab completion is interactive and that we
> don't have any command with more than 32 options by virtue of how we
> track which options have been seen so far in an int bitmap, so even
> though your behavior feels like O(n*2) (or maybe even O(n*3)), we don't
> necessarily hit the penalties of scale where the user will notice the
> wasted cpu cycles.  On the other hand, I did NOT test is whether you
> call vshDetermineCommandName() once per volume in the per-volume
> completer - and there, we KNOW there are users that stick literally
> thousands of volumes in one pool, where the cost of calling
> vshDetermineCommandName() per volume would quickly become a noticeable
> delay.  There is no reason why we cannot get the parsing down to O(1)
> (once per tab).
> 
> > +
> > +    if (!cmd)
> > +        return opt_fully_completed;
> > +
> > +    while (cmd->opts[opts_index].name) {
> > +        const vshCmdOptDef *opt = &cmd->opts[opts_index];
> > +        opts_index++;
> > +
> > +        if (!STREQ(opt->name, opt_name) || !opt->completer)
> > +            continue;
> > +
> > +        completed_list_index = 0;
> > +        completed_list = opt->completer(opt->completer_flags);
> > +
> > +        if (!completed_list)
> > +            continue;
> > +
> > +        while ((completed_name = completed_list[completed_list_index])) {
> > +            completed_list_index++;
> > +            if (strstr(rl_line_buffer, completed_name))
> 
> strstr() is NOT the right function to be using.  A demonstration that
> strstr is not right:
> 
> virsh# echo "string with --shell in middle" --shell<ENTER>
> 'string with --shell in middle'
> virsh# echo "string with --shell in middle" --<TAB>
> --help  --str   --xml
> 
> Oops.  Notice that the way virsh parses command lines, it is VALID to
> put --shell AFTER the first --str argument; but your strstr mistakenly
> hits on the embedded --shell inside the middle of the (implicit --str)
> argument and refuses to treat --shell as a valid completion at the point
> where I hit TAB.
> 
> Again, I argue that you should parse the complete line into an array of
> strings, and then evaluate those strings in order, similar to how
> vshCmddefOptParse is behaving.  In fact, I think that your first order
> of business is to prepare a preliminary patch that refactors the body of
> vshCmddefOptParse into calls to a helper function, so that both it and
> your new code share the same parse engine, and the only place they
> differ is what they do once the parse is complete - vshCmddefOptParse
> raises errors if an option is missing an argument, if a required option
> is missing, if an unknown option is found, or invokes the command; while
> tab completion knows that an option missing an argument or a missing
> required option means to do the per-option completer for that particular
> option, unknown options mean that there is no longer anything to
> complete, and the end result is displaying the completion instead of
> running the command.
> 
> I also think it would be wise, for testing purposes, to implement my
> proposed new 'virsh complete ...' command EARLY in the series, so that
> you can more easily test scenarios from the command line instead of
> having to fire up interactive gdb sessions across two terminals (that
> is, it's easier to debug 'gdb --args virsh complete echo -' than it is
> to fire up 'virsh' in one window, attach gdb in the other window, type
> 'echo -<TAB>' in the first window, then interact with gdb in the
> second).  That, and having 'virsh complete' working would let us write
> unit tests (tests/virshtest.c and tests/virsh-optparse already do a lot
> of argument parsing testing on 'echo', these tests would be a great
> starting point for a new test that virsh complete returns what we think
> it should).

If I get it right: "complete" should be new command which would perform
auto-completion on it's arguments.
So when I type this from shell:

 $ ./virsh s<TAB>

virsh should be called with "complete s" arguments.

So it would be enough when I create new command called "complete"?
Bash would take care of everything else?

> Remember, if you parse this exactly once, you would be left with:
> 
> argv[0] = "echo", associated with cmd name
> argv[1] = "string with --shell in middle", associated with --str
> argv[2] = "--", needs completion
> 
> along with the knowledge that --help, --shell, --str, and --xml are all
> still valid at that point (--help because it is always valid if we
> aren't waiting for an option's required argument; --str because it can
> appear more than once thanks to its VSH_OT_ARGV data-type, and the other
> two because they haven't been seen yet).
> 
> At this point, I'm torn on whether it is effective to review the rest of
> your patch as-is, knowing that parts of it may be irrelevant if you
> first do the cleanup to fix the algorithms behind the patch.  I'm glad
> that you're making progress, and I want to encourage you to keep trying
> (especially since I know this is a summer of code project, where you are
> under a timeline to get something submitted to qualify for benefits
> outside the scope of this list).  And I also apologize that it has taken
> me a week to review your patch (you posted on the 10th, and it is now
> the 17th), as that cuts into your timeline - if nothing else, I hope
> that this serves as a positive learning experience that getting the
> design right up front is KEY to getting the code approved later with
> minimal churn (coding first, only to have the design ripped to shreds
> and having to do more iterations of code, can be frustrating, both to
> the coder and to the reviewers).  Please, even if the summer of code
> deadlines pass, continue your work on this until it is in - I definitely
> want this in virsh.

I definitely want to continue working on this, even after official GSoC
deadline.

> 
> Revisiting some examples from my earlier mails:
> 
> >> virsh# vol-key vol <TAB>
> >> pool1 pool2 --pool
> 
> The one-time parser should split this into:
> 
> argv[0] = "vol-key", cmd name
> argv[1] = "vol", argument to implicit --vol
> argv[2] = "", needs completion
> at this point, --pool is the next available option, so the completion is
> the union of the per-option pool completer (but with names beginning
> with - filtered out, because the parser would treat leading - as the
> start of an option rather than a pool name) and the possible remaining
> options (--pool and the ever-present --help)
> 
> >> 
> >> virsh# vol-key -<TAB>
> >> --vol --pool
> 
> argv[0] = "vol-key", cmd name
> argv[1] = "-", needs completion
> at this point, we know we have to complete an option, so we don't use
> any per-option completer (neither volume nor pool names can appear
> here), and use just the option completer (--vol, --pool, and the
> ever-present --help)
> 
> >> 
> >> virsh# vol-key v<TAB>
> >> vol1 vol2
> 
> argv[0] = "vol-key", cmd name
> argv[1] = "v", needs completion
> at this point, we know we have to complete a non-option; as no options
> have been consumed yet, this is the first option that requires an
> argument, so we treat it as an implicit --vol, and run just the volume
> name completer (filtered to just names beginning with 'v').
> 
> >> 
> >> virsh# vol-key --pool <TAB>
> >> pool1 pool2
> 
> argv[0] = "vol-key", cmd name
> argv[1] = "--pool", --pool option requires argument
> argv[2] = "", needs completion
> at this point, we know we have to supply the argument to the --pool
> option, so we run just the pool name completer, and have nothing to
> filter on (all pool names are valid, even those beginning with '-').
> Guess what - even though I said "--help" is ever-present, here it does
> not get listed (--help is only present when an option can appear, but
> here we expect only an argument)
> 
> >> 
> >> virsh# vol-key --pool pool1 <TAB>
> >> vol1 vol2 --vol
> 
> argv[0] = "vol-key", cmd name
> argv[1] = "--pool", --pool option requires argument
> argv[2] = "pool1", provides argument to --pool
> argv[3] = "", needs completion
> at this point, we can take either an option or an argument to an
> implicit option; the next available option needing an argument is --vol,
> so we run the union of the per-option volume completer (filter out
> leading '-') and the remaining option completer (filter out --pool,
> leaving just --vol and the ever-present --help).
> 
> >> 
> >> virsh# vol-key --pool=[<TAB>
> >> --pool=pool1 --pool=pool2
> 
> argv[0] = "vol-key", cmd name
> argv[1] = "--pool", --pool option requires an argument
> argv[2] = "[", needs completion
> ooh tricky - I split "--pool=<TAB>" into two arguments, and then
> complete on just the per-option pool completer (filtered to names
> beginning with 'p'); then have to reconstruct it back into a single
> string for display.  But having the parser normalize the code once, then
> generate the completion list, then munge the generated list back into
> display format at the end, will be easier than having to write the
> generation list to handle '--option arg' and '--option=arg' as
> equivalent throughout the entire body of completer code.

Yeah, I understand this and all the logic behind it. I'll try to rewrite
my code so it can work over tokenized line buffer instead of parsing
actual rl_line_buffer.

> [Oh, and I have homework too - I STILL need to submit my patch for a
> much simpler ./configure --without-readline.  It's a challenge to see
> who can post the next patch :) ]
> 
> -- 
> Eric Blake   eblake redhat com    +1-919-301-3266
> Libvirt virtualization library http://libvirt.org
> 

-- 
Tomas Meszaros