[PATCH v2 1/7] util: introduce a parser for kernel cmdline arguments

Paulo de Rezende Pinatti ppinatti at linux.ibm.com
Wed Jun 10 13:55:14 UTC 2020


On 08/06/20 16:35, Erik Skultety wrote:
> On Fri, May 29, 2020 at 12:10:03PM +0200, Paulo de Rezende Pinatti wrote:
>> Introduce two utility functions to parse a kernel command
>> line string according to the kernel code parsing rules in
>> order to enable the caller to perform operations such as
>> verifying whether certain argument=value combinations are
>> present or retrieving an argument's value.
>>
>> Signed-off-by: Paulo de Rezende Pinatti <ppinatti at linux.ibm.com>
>> Signed-off-by: Boris Fiuczynski <fiuczy at linux.ibm.com>
>> ---
>>   src/libvirt_private.syms |   2 +
>>   src/util/virutil.c       | 169 +++++++++++++++++++++++++++++++++++++++
>>   src/util/virutil.h       |  17 ++++
>>   tests/utiltest.c         | 141 ++++++++++++++++++++++++++++++++
>>   4 files changed, 329 insertions(+)
>>
>> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
>> index a6af44fe1c..a206a943c5 100644
>> --- a/src/libvirt_private.syms
>> +++ b/src/libvirt_private.syms
>> @@ -3433,6 +3433,8 @@ virHostGetDRMRenderNode;
>>   virHostHasIOMMU;
>>   virIndexToDiskName;
>>   virIsDevMapperDevice;
>> +virKernelCmdlineMatchParam;
>> +virKernelCmdlineNextParam;
>>   virMemoryLimitIsSet;
>>   virMemoryLimitTruncate;
>>   virMemoryMaxValue;
>> diff --git a/src/util/virutil.c b/src/util/virutil.c
>> index fb46501142..749c9d7116 100644
>> --- a/src/util/virutil.c
>> +++ b/src/util/virutil.c
>> @@ -1725,6 +1725,175 @@ virHostGetDRMRenderNode(void)
>>       return ret;
>>   }
>>
>> +
>> +static const char *virKernelCmdlineSkipDbQuote(const char *cmdline,
> 
> minor nitpick: we can call ^this simply ...SkipQuote, we don't need to be so
> specific about it being a double quotation mark, do we?
> 

Sure, changed to SkipQuote.

>> +                                               bool *is_quoted)
>> +{
>> +    if (cmdline[0] == '"') {
>> +        *is_quoted = !(*is_quoted);
>> +        cmdline++;
>> +    }
>> +    return cmdline;
>> +}
>> +
>> +
>> +static size_t virKernelCmdlineSearchForward(const char *cmdline,
>> +                                            bool *is_quoted,
>> +                                            bool include_equal)
> 
> Hmm, what if instead we tried to find and return the index of the '=' character
> but iterated all the way until the next applicable (i.e. taking quotation into
> account) space and saved that end of arg/arg=val parameter into **res. The
> caller of this function would then continue directly from *res with the next
> arg/arg=val parameter.
> We could then call this something like virKernelCmdlineFindEqual and return -1
> if there is no '=' sign, indicating that it's a standalone parameter with no
> value.
> 

Yes, we can do that. It would look like this:

     size_t index;
     size_t equal_index = 0;

     for (index = 0; cmdline[index]; index++) {
         if (!(is_quoted) && g_ascii_isspace(cmdline[index]))
             break;
         if (equal_index == 0 && cmdline[index] == '=') {
             equal_index = index;
             continue;
         }
         virKernelCmdlineSkipQuote(cmdline + index, &is_quoted);
     }
     *res = cmdline + index;
     return equal_index;


I found it more convenient to use 0 rather than -1 so that we can also 
handle the case when the argument itself starts with the equal sign.

>> +{
>> +    size_t index;
>> +
>> +    for (index = 0; cmdline[index]; index++) {
>> +        if ((!(*is_quoted) && g_ascii_isspace(cmdline[index])) ||
>> +            (include_equal && cmdline[index] == '='))
>> +            break;
>> +        virKernelCmdlineSkipDbQuote(cmdline + index, is_quoted);
>> +    }
>> +    return index;
>> +}
>> +
>> +
>> +static size_t virKernelCmdlineNextSpace(const char *cmdline,
>> +                                        bool *is_quoted)
>> +{
>> +    return virKernelCmdlineSearchForward(cmdline, is_quoted, false);
>> +}
>> +
>> +
>> +static size_t virKernelCmdlineNextSpaceOrEqual(const char *cmdline,
>> +                                               bool *is_quoted)
>> +{
>> +    return virKernelCmdlineSearchForward(cmdline, is_quoted, true);
>> +}
> 
> If we implement what I suggested above for virKernelCmdlineSearchForward, we
> won't need 2 wrappers for the same thing differentiating only in whether we do
> or do not need to search for the '=' sign as well.
> 

Yes, wrappers are gone.

>> +
>> +
>> +static char* virKernelArgNormalize(const char *arg)
>> +{
>> +    return virStringReplace(arg, "_", "-");
>> +}
>> +
>> +
>> +static char* virKernelCmdlineArgNormalize(const char *cmdline, size_t offset)
>> +{
>> +    g_autofree char *param = g_strndup((cmdline), offset);
>> +
>> +    return virKernelArgNormalize(param);
> 
> See below for comments why we don't need this wrapper.
> 

This wrapper is also gone.

>> +}
>> +
>> +
>> +/*
>> + * Parse the kernel cmdline and store the next parameter in @param
>> + * and the value of @param in @val which can be NULL if @param has
>> + * no value. In addition returns the address right after @param=@value
>> + * for possible further processing.
>> + *
>> + * @cmdline: kernel command line string to be checked for next parameter
>> + * @param: pointer to hold retrieved parameter, will be NULL if none found
>> + * @val: pointer to hold retrieved value of @param
>> + *
>> + * Returns a pointer to address right after @param=@val in the
>> + * kernel command line, will point to the string's end (NULL)
>> + * in case no next parameter is found
>> + */
>> +const char *virKernelCmdlineNextParam(const char *cmdline,
>> +                                      char **param,
>> +                                      char **val)
>> +{
>> +    size_t offset;
>> +    bool is_quoted = false;
>> +    *param = NULL;
>> +    *val = NULL;
>> +
>> +    virSkipSpaces(&cmdline);
>> +    cmdline = virKernelCmdlineSkipDbQuote(cmdline, &is_quoted);
>> +    offset = virKernelCmdlineNextSpaceOrEqual(cmdline, &is_quoted);
> 
> if you get the index of the equal sign, but iterate all the way until the end
> of the arg/arg=val parameter, you can use index and do:
> 
> *param = g_strndup(cmdline, equal_index);
> 

Yep, done.

>> +    if (offset == 0)
> 
> If we used something like char **res (taking it as a parameter to this function)
> to represent the rest of the unparsed cmdline, then it should be NULL in this
> case, so the check could remain with a slight adjustment (I haven't implemented
> it, but I do hope it should work :))
> 

See below how it looks like now.


>> +        return cmdline;
>> +
>> +    *param = virKernelCmdlineArgNormalize(cmdline, offset);
> 
> ^This normalization should be done in virKernelCmdlineMatchParam instead. That
> way, you won't need a wrapper over virKernelArgNormalize.
> 

Yes, done.

>> +    cmdline = cmdline + offset;
>> +    /* param has no value */
>> +    if (*cmdline != '=')
>> +        return cmdline;
> 
> ^This check could then be dropped along with moving the cursor in the cmdline
> string.
> 

Yes, done.

>> +
>> +    cmdline = virKernelCmdlineSkipDbQuote(++cmdline, &is_quoted);
>> +    offset = virKernelCmdlineNextSpace(cmdline, &is_quoted);
> 
> If we do the above, we should be able to ditch ^this second loop searching for
> the next space.
> 

Yes, done.

>> +    if (cmdline[offset-1] == '"')
>> +        *val = g_strndup(cmdline, offset-1);
>> +    else
>> +        *val = g_strndup(cmdline, offset);
> 
> ^Here you'd have to do arithmetics using the *res variable instead.
> 
>> +
>> +    return cmdline + offset;
> 
> We'd simply return *res;
> 

So the function would look like this now:

     virSkipSpaces(&cmdline);
     cmdline = virKernelCmdlineSkipQuote(cmdline, &is_quoted);
     equal_index = virKernelCmdlineFindEqual(cmdline, is_quoted, &next);

     if (next == cmdline)
         return next;

     /* param has no value */
     if (equal_index == 0) {
         if (is_quoted && next[-1] == '"')
             *param = g_strndup(cmdline, next - cmdline - 1);
         else
             *param = g_strndup(cmdline, next - cmdline);
         return next;
     }

     *param = g_strndup(cmdline, equal_index);

     if (cmdline[equal_index + 1] == '"') {
         is_quoted = true;
         equal_index++;
     }

     if (is_quoted && next[-1] == '"')
         *val = g_strndup(cmdline + equal_index + 1,
                          next - cmdline - equal_index - 2);
     else
         *val = g_strndup(cmdline + equal_index + 1,
                          next - cmdline - equal_index - 1);
     return next;

There's still a bit of handling required due to the kernel's quoting 
rules. I took the liberty of using 'next' in place of 'res' as I think 
it conveys better its purpose. Let me know if that looks good to you.

>> +}
>> +
>> +
>> +#define VIR_CMDLINE_STR_CMP(kernel_val, caller_val, flags) \
>> +    (((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ) && \
>> +      STREQ(kernel_val, caller_val)) || ((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX) && \
>> +                                         STRPREFIX(kernel_val, caller_val)))
>> +
>> +
>> +/*
>> + * Try to match the provided kernel cmdline string with the provided @arg
>> + * and the list @values of possible values according to the matching strategy
>> + * defined in @flags. Possible options include:
>> + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX: do a substring comparison of values
> 
> I know you used it in the following patches to match against the accepted
> values, but do we really need to match with a prefix, I'd be fine with simple
> stirng equality matching in all cases and not mix the matching strategy for
> both values and kernel arguments.
> 

The prefix based comparison is to make sure libvirt matches exactly like 
the kernel code does for the parameter prot_virt (see 
arch/s390/kernel/uv.c -> prot_virt_setup -> kstrtobool) which performs 
prefix based matching. Using equality matching should work for most 
cases but there would be certain corner cases missed (i.e. 
prot_virt=yfoo will be recognized as 'activate' by the kernel but not by 
libvirt).

If you think such cases are not worth the trouble I can remove the 
PREFIX flag, let me know what you prefer.

>> + *   (uses size of value provided as input)
>> + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ: do a strict string comparison of values
> 
> ^This should be default used in all cases IMO unless the prefix matching
> provides us with a considerable performance gain.
> 

Yes, that makes sense. I made it the default now.


>> + * - VIR_KERNEL_CMDLINE_FLAGS_SEARCH_STICKY: first positive match satifies search
> 
> is "sticky searching" some terminus technicus? If not, we probably should name
> this FLAGS_MATCH_FIRST and FLAGS_MATCH_LAST respectively.
> 

I guess the flag explanation was not clear enough, sticky actually means 
"stop as soon as you find a match, no matter the order", so 
FLAGS_SEARCH_FIRST doesn't really fit here. A better name could be 
FLAGS_SEARCH_ANY. Let me know if you agree.


>> + *   (in case of multiple argument occurrences)
>> + * - VIR_KERNEL_CMDLINE_FLAGS_SEARCH_LAST: use the result of last argument occurence
>> + *   (in case of multiple argument occurrences)
>> + *
>> + * @cmdline: kernel command line string to be checked for @arg
>> + * @arg: kernel command line argument
>> + * @values: array of possible values to match @arg
>> + * @len_values: size of array, it can be 0 meaning a match will be positive if the
>> + * argument has no value.
>> + * @flags: flag mask defining the strategy for matching and comparing
>> + *
>> + * Returns true if a match is found, false otherwise
>> + */
>> +bool virKernelCmdlineMatchParam(const char *cmdline,
>> +                                const char *arg,
>> +                                const char **values,
>> +                                size_t len_values,
>> +                                virKernelCmdlineFlags flags)
>> +{
>> +    bool match = false;
>> +    size_t i;
>> +    const char *next = cmdline;
>> +    g_autofree char *norm_arg = virKernelArgNormalize(arg);
>> +    g_autofree char *kparam = NULL;
>> +    g_autofree char *kval = NULL;
> 
> ^These last two variables can be moved into the while loop, you won't need the
> explicit VIR_FREEs then.
> 

Done.

>> +
>> +    while (next[0] != '\0') {
>> +        VIR_FREE(kparam);
>> +        VIR_FREE(kval);
>> +        next = virKernelCmdlineNextParam(next, &kparam, &kval);
> 
> Insert a blank line in between all these "ifs" for better readability (I know
> the coding guideline we have doesn't mention it).
> 

Done.

>> +        if (!kparam)
>> +            break;
> 
> You'd do the normalization of the parsed arg value here.
> 

Done.

>> +        if (STRNEQ(kparam, norm_arg))
>> +            continue;
>> +        if (!kval) {
>> +            match = (len_values == 0) ? true : false;
>> +        } else {
>> +            match = false;
>> +            for (i = 0; i < len_values; i++) {
>> +                if (VIR_CMDLINE_STR_CMP(kval, values[i], flags)) {
>> +                    match = true;
>> +                    break;
>> +                }
>> +            }
>> +        }
>> +        if (match && (flags & VIR_KERNEL_CMDLINE_FLAGS_SEARCH_STICKY))
>> +            break;
>> +    }
>> +
>> +    return match;
>> +}
>> +
>> +
>>   /*
>>    * Get a password from the console input stream.
>>    * The caller must free the returned password.
>> diff --git a/src/util/virutil.h b/src/util/virutil.h
>> index 49b4bf440f..7499b78153 100644
>> --- a/src/util/virutil.h
>> +++ b/src/util/virutil.h
>> @@ -147,6 +147,23 @@ bool virHostHasIOMMU(void);
>>
>>   char *virHostGetDRMRenderNode(void) G_GNUC_NO_INLINE;
>>
>> +typedef enum {
>> +    VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX = 1,
>> +    VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ = 2,
>> +    VIR_KERNEL_CMDLINE_FLAGS_SEARCH_STICKY = 4,
>> +    VIR_KERNEL_CMDLINE_FLAGS_SEARCH_LAST = 8,
>> +} virKernelCmdlineFlags;
>> +
>> +const char *virKernelCmdlineNextParam(const char *cmdline,
>> +                                      char **param,
>> +                                      char **val);
>> +
>> +bool virKernelCmdlineMatchParam(const char *cmdline,
>> +                                const char *arg,
>> +                                const char **values,
>> +                                size_t len_values,
>> +                                virKernelCmdlineFlags flags);
>> +
>>   /**
>>    * VIR_ASSIGN_IS_OVERFLOW:
>>    * @rvalue: value that is checked (evaluated twice)
>> diff --git a/tests/utiltest.c b/tests/utiltest.c
>> index 5ae04585cb..01fb8c89f5 100644
>> --- a/tests/utiltest.c
>> +++ b/tests/utiltest.c
>> @@ -254,6 +254,145 @@ testOverflowCheckMacro(const void *data G_GNUC_UNUSED)
>>   }
>>
>>
>> +struct testKernelCmdlineNextParamData
>> +{
>> +    const char *cmdline;
>> +    const char *param;
>> +    const char *val;
>> +    const char *next;
>> +};
>> +
>> +static struct testKernelCmdlineNextParamData kEntries[] = {
>> +    { "arg1 arg2 arg3=val1",                            "arg1",           NULL,                  " arg2 arg3=val1" },
>> +    { "arg1=val1 arg2 arg3=val3 arg4",                  "arg1",           "val1",                " arg2 arg3=val3 arg4" },
>> +    { "arg3=val3 ",                                     "arg3",           "val3",                " " },
>> +    { "arg3=val3",                                      "arg3",           "val3",                "" },
>> +    { "arg-3=val3 arg4",                                "arg-3",          "val3",                " arg4" },
>> +    { "arg_3=val3 arg4",                                "arg-3",          "val3",                " arg4" },
>> +    { " arg_3=val3 arg4",                               "arg-3",          "val3",                " arg4" },
>> +    { "  arg-3=val3 arg4",                              "arg-3",          "val3",                " arg4" },
>> +    { "arg2=\"value with spaces\" arg3=val3",           "arg2",           "value with spaces",   " arg3=val3" },
>> +    { " arg2=\"value with spaces\"   arg3=val3",        "arg2",           "value with spaces",   "   arg3=val3" },
>> +    { "  \"arg2=value with spaces\" arg3=val3",         "arg2",           "value with spaces",   " arg3=val3" },
>> +    { "arg2=\"val\"ue arg3",                            "arg2",           "val\"ue",             " arg3" },
>> +    { " arg3\" escaped=val2\"",                         "arg3\" escaped", "val2",                "" },
> 
> ^Is this even valid for the kernel itself? Looking at [1], they clearly don't
> allow escaped \" in the arg/value.
> 
> [1] https://github.com/torvalds/linux/blob/db54615e21419c3cb4d699a0b0aa16cc44d0e9da/lib/cmdline.c
> 

I guess the word "escaped" in this test is a bit misleading; it's 
actually escaping the blank space, not the quote itself. This is valid 
for the kernel. In order to assure our parsing results would match those 
of the kernel I executed the code in cmdline.c in a standalone file to 
generate the reference values for the test.

I'll change "arg3\" escaped" to "arg3\" with spaces" to clearly state 
the intention here.

>> +    { " arg2longer=someval arg2=val2 arg3 ",            "arg2longer",     "someval",             " arg2=val2 arg3 " },
>> +    { "=val1 arg2=val2",                                NULL,             NULL,                  "=val1 arg2=val2" },
>> +    { " ",                                              NULL,             NULL,                  "" },
>> +    { "",                                               NULL,             NULL,                  "" },
>> +};
>> +
>> +static int
>> +testKernelCmdlineNextParam(const void *data G_GNUC_UNUSED)
>> +{
>> +    char *param = NULL;
>> +    char *val = NULL;
>> +    const char *next;
>> +    size_t i;
>> +
>> +    for (i = 0; i < G_N_ELEMENTS(kEntries); ++i) {
>> +        VIR_FREE(param);
>> +        VIR_FREE(val);
>> +
>> +        next = virKernelCmdlineNextParam(kEntries[i].cmdline, &param, &val);
>> +
>> +        if (STRNEQ_NULLABLE(param, kEntries[i].param) ||
>> +            STRNEQ_NULLABLE(val, kEntries[i].val) ||
>> +            STRNEQ(next, kEntries[i].next)) {
>> +            VIR_TEST_DEBUG("\nKernel cmdline [%s]", kEntries[i].cmdline);
>> +            VIR_TEST_DEBUG("Expect param [%s]", kEntries[i].param);
>> +            VIR_TEST_DEBUG("Actual param [%s]", param);
>> +            VIR_TEST_DEBUG("Expect value [%s]", kEntries[i].val);
>> +            VIR_TEST_DEBUG("Actual value [%s]", val);
>> +            VIR_TEST_DEBUG("Expect next [%s]", kEntries[i].next);
>> +            VIR_TEST_DEBUG("Actual next [%s]", next);
>> +
>> +            VIR_FREE(param);
>> +            VIR_FREE(val);
>> +
>> +            return -1;
>> +        }
>> +    }
>> +
>> +    VIR_FREE(param);
>> +    VIR_FREE(val);
>> +
>> +    return 0;
>> +}
> 
> I appreciate the thorough unit testing :)
> 

:)

> Erik
> 

-- 
Best regards,

Paulo de Rezende Pinatti




More information about the libvir-list mailing list