[PATCH v2 1/7] util: introduce a parser for kernel cmdline arguments

Erik Skultety eskultet at redhat.com
Thu Jun 11 08:09:08 UTC 2020


On Wed, Jun 10, 2020 at 03:55:14PM +0200, Paulo de Rezende Pinatti wrote:
>
> On 08/06/20 16:35, Erik Skultety wrote:
> > On Fri, May 29, 2020 at 12:10:03PM +0200, Paulo de Rezende Pinatti wrote:
> > > Introduce two utility functions to parse a kernel command
> > > line string according to the kernel code parsing rules in
> > > order to enable the caller to perform operations such as
> > > verifying whether certain argument=value combinations are
> > > present or retrieving an argument's value.
> > >
> > > Signed-off-by: Paulo de Rezende Pinatti <ppinatti at linux.ibm.com>
> > > Signed-off-by: Boris Fiuczynski <fiuczy at linux.ibm.com>
> > > ---
> > >   src/libvirt_private.syms |   2 +
> > >   src/util/virutil.c       | 169 +++++++++++++++++++++++++++++++++++++++
> > >   src/util/virutil.h       |  17 ++++
> > >   tests/utiltest.c         | 141 ++++++++++++++++++++++++++++++++
> > >   4 files changed, 329 insertions(+)
> > >
> > > diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
> > > index a6af44fe1c..a206a943c5 100644
> > > --- a/src/libvirt_private.syms
> > > +++ b/src/libvirt_private.syms
> > > @@ -3433,6 +3433,8 @@ virHostGetDRMRenderNode;
> > >   virHostHasIOMMU;
> > >   virIndexToDiskName;
> > >   virIsDevMapperDevice;
> > > +virKernelCmdlineMatchParam;
> > > +virKernelCmdlineNextParam;
> > >   virMemoryLimitIsSet;
> > >   virMemoryLimitTruncate;
> > >   virMemoryMaxValue;
> > > diff --git a/src/util/virutil.c b/src/util/virutil.c
> > > index fb46501142..749c9d7116 100644
> > > --- a/src/util/virutil.c
> > > +++ b/src/util/virutil.c
> > > @@ -1725,6 +1725,175 @@ virHostGetDRMRenderNode(void)
> > >       return ret;
> > >   }
> > >
> > > +
> > > +static const char *virKernelCmdlineSkipDbQuote(const char *cmdline,
> >
> > minor nitpick: we can call ^this simply ...SkipQuote, we don't need to be so
> > specific about it being a double quotation mark, do we?
> >
>
> Sure, changed to SkipQuote.
>
> > > +                                               bool *is_quoted)
> > > +{
> > > +    if (cmdline[0] == '"') {
> > > +        *is_quoted = !(*is_quoted);
> > > +        cmdline++;
> > > +    }
> > > +    return cmdline;
> > > +}
> > > +
> > > +
> > > +static size_t virKernelCmdlineSearchForward(const char *cmdline,
> > > +                                            bool *is_quoted,
> > > +                                            bool include_equal)
> >
> > Hmm, what if instead we tried to find and return the index of the '=' character
> > but iterated all the way until the next applicable (i.e. taking quotation into
> > account) space and saved that end of arg/arg=val parameter into **res. The
> > caller of this function would then continue directly from *res with the next
> > arg/arg=val parameter.
> > We could then call this something like virKernelCmdlineFindEqual and return -1
> > if there is no '=' sign, indicating that it's a standalone parameter with no
> > value.
> >
>
> Yes, we can do that. It would look like this:
>
>     size_t index;

nitpick: ^this is a standard iteration variable, so naming it "i" might look
more "expected"

>     size_t equal_index = 0;
>
>     for (index = 0; cmdline[index]; index++) {
>         if (!(is_quoted) && g_ascii_isspace(cmdline[index]))
>             break;
>         if (equal_index == 0 && cmdline[index] == '=') {
>             equal_index = index;
>             continue;
>         }
>         virKernelCmdlineSkipQuote(cmdline + index, &is_quoted);
>     }
>     *res = cmdline + index;
>     return equal_index;
>
>
> I found it more convenient to use 0 rather than -1 so that we can also
> handle the case when the argument itself starts with the equal sign.

Yes, that's fine, but please provide a doc string for this function mentioning
this.

...

>
> So the function would look like this now:
>
>     virSkipSpaces(&cmdline);
>     cmdline = virKernelCmdlineSkipQuote(cmdline, &is_quoted);
>     equal_index = virKernelCmdlineFindEqual(cmdline, is_quoted, &next);
>
>     if (next == cmdline)
>         return next;
>
>     /* param has no value */
>     if (equal_index == 0) {
>         if (is_quoted && next[-1] == '"')
>             *param = g_strndup(cmdline, next - cmdline - 1);
>         else
>             *param = g_strndup(cmdline, next - cmdline);
>         return next;
>     }
>
>     *param = g_strndup(cmdline, equal_index);
>
>     if (cmdline[equal_index + 1] == '"') {
>         is_quoted = true;
>         equal_index++;
>     }
>
>     if (is_quoted && next[-1] == '"')
>         *val = g_strndup(cmdline + equal_index + 1,
>                          next - cmdline - equal_index - 2);
>     else
>         *val = g_strndup(cmdline + equal_index + 1,
>                          next - cmdline - equal_index - 1);
>     return next;
>
> There's still a bit of handling required due to the kernel's quoting rules.
> I took the liberty of using 'next' in place of 'res' as I think it conveys
> better its purpose. Let me know if that looks good to you.

That is fine, looks good.

>
> > > +}
> > > +
> > > +
> > > +#define VIR_CMDLINE_STR_CMP(kernel_val, caller_val, flags) \
> > > +    (((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ) && \
> > > +      STREQ(kernel_val, caller_val)) || ((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX) && \
> > > +                                         STRPREFIX(kernel_val, caller_val)))
> > > +
> > > +
> > > +/*
> > > + * Try to match the provided kernel cmdline string with the provided @arg
> > > + * and the list @values of possible values according to the matching strategy
> > > + * defined in @flags. Possible options include:
> > > + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX: do a substring comparison of values
> >
> > I know you used it in the following patches to match against the accepted
> > values, but do we really need to match with a prefix, I'd be fine with simple
> > stirng equality matching in all cases and not mix the matching strategy for
> > both values and kernel arguments.
> >
>
> The prefix based comparison is to make sure libvirt matches exactly like the
> kernel code does for the parameter prot_virt (see arch/s390/kernel/uv.c ->
> prot_virt_setup -> kstrtobool) which performs prefix based matching. Using
> equality matching should work for most cases but there would be certain
> corner cases missed (i.e. prot_virt=yfoo will be recognized as 'activate' by
> the kernel but not by libvirt).

That..is...interesting...

>
> If you think such cases are not worth the trouble I can remove the PREFIX
> flag, let me know what you prefer.

Well, given that kernel supposedly recognizes "yfoo" as "on" just because of
the prefix, we obviously can't neglect the PREFIX matching can we? If we did
and someone indeed used such a horrible example of a cmdline,
virt-host-validate would lie about the feature not being enabled (even though
it would be) because libvirt would only do an equality match, so it has to stay.

<reprove>bad kernel, bad kernel</reprove>

Just to re-assure myself, that's a generic behavior used for all option parsing
in the kernel not just prot_virt, right? If so, then contrary to my previous
response, we should always use and default to prefix matching, because given
the situation equality matching would clearly not be a satisfactory behaviour.

...

>
>
> > > + * - VIR_KERNEL_CMDLINE_FLAGS_SEARCH_STICKY: first positive match satifies search
> >
> > is "sticky searching" some terminus technicus? If not, we probably should name
> > this FLAGS_MATCH_FIRST and FLAGS_MATCH_LAST respectively.
> >
>
> I guess the flag explanation was not clear enough, sticky actually means
> "stop as soon as you find a match, no matter the order", so

Well, isn't "stop as soon as you find a match" == match first occurrence?

> FLAGS_SEARCH_FIRST doesn't really fit here. A better name could be
> FLAGS_SEARCH_ANY. Let me know if you agree.
>

...

> > >
> > > +struct testKernelCmdlineNextParamData
> > > +{
> > > +    const char *cmdline;
> > > +    const char *param;
> > > +    const char *val;
> > > +    const char *next;
> > > +};
> > > +
> > > +static struct testKernelCmdlineNextParamData kEntries[] = {
> > > +    { "arg1 arg2 arg3=val1",                            "arg1",           NULL,                  " arg2 arg3=val1" },
> > > +    { "arg1=val1 arg2 arg3=val3 arg4",                  "arg1",           "val1",                " arg2 arg3=val3 arg4" },
> > > +    { "arg3=val3 ",                                     "arg3",           "val3",                " " },
> > > +    { "arg3=val3",                                      "arg3",           "val3",                "" },
> > > +    { "arg-3=val3 arg4",                                "arg-3",          "val3",                " arg4" },
> > > +    { "arg_3=val3 arg4",                                "arg-3",          "val3",                " arg4" },
> > > +    { " arg_3=val3 arg4",                               "arg-3",          "val3",                " arg4" },
> > > +    { "  arg-3=val3 arg4",                              "arg-3",          "val3",                " arg4" },
> > > +    { "arg2=\"value with spaces\" arg3=val3",           "arg2",           "value with spaces",   " arg3=val3" },
> > > +    { " arg2=\"value with spaces\"   arg3=val3",        "arg2",           "value with spaces",   "   arg3=val3" },
> > > +    { "  \"arg2=value with spaces\" arg3=val3",         "arg2",           "value with spaces",   " arg3=val3" },
> > > +    { "arg2=\"val\"ue arg3",                            "arg2",           "val\"ue",             " arg3" },
> > > +    { " arg3\" escaped=val2\"",                         "arg3\" escaped", "val2",                "" },
> >
> > ^Is this even valid for the kernel itself? Looking at [1], they clearly don't
> > allow escaped \" in the arg/value.
> >
> > [1] https://github.com/torvalds/linux/blob/db54615e21419c3cb4d699a0b0aa16cc44d0e9da/lib/cmdline.c
> >
>
> I guess the word "escaped" in this test is a bit misleading; it's actually
> escaping the blank space, not the quote itself. This is valid for the
> kernel. In order to assure our parsing results would match those of the
> kernel I executed the code in cmdline.c in a standalone file to generate the
> reference values for the test.
>
> I'll change "arg3\" escaped" to "arg3\" with spaces" to clearly state the
> intention here.

Sure, that will help, although I wasn't relating to the word "escaped" in my
reply. Nevermind though, my bad, I just couldn't wrap my head around seeing
"val\"ue" and somehow could not make sense out of it, too much QE to process I
guess, it's fine.

Regards,
Erik




More information about the libvir-list mailing list