[PATCH v2 1/7] util: introduce a parser for kernel cmdline arguments
Paulo de Rezende Pinatti
ppinatti at linux.ibm.com
Fri Jun 12 14:51:40 UTC 2020
On 11/06/20 10:09, Erik Skultety wrote:
> On Wed, Jun 10, 2020 at 03:55:14PM +0200, Paulo de Rezende Pinatti wrote:
>>
>> On 08/06/20 16:35, Erik Skultety wrote:
>>> On Fri, May 29, 2020 at 12:10:03PM +0200, Paulo de Rezende Pinatti wrote:
>>>> Introduce two utility functions to parse a kernel command
>>>> line string according to the kernel code parsing rules in
>>>> order to enable the caller to perform operations such as
>>>> verifying whether certain argument=value combinations are
>>>> present or retrieving an argument's value.
>>>>
>>>> Signed-off-by: Paulo de Rezende Pinatti <ppinatti at linux.ibm.com>
>>>> Signed-off-by: Boris Fiuczynski <fiuczy at linux.ibm.com>
>>>> ---
>>>> src/libvirt_private.syms | 2 +
>>>> src/util/virutil.c | 169 +++++++++++++++++++++++++++++++++++++++
>>>> src/util/virutil.h | 17 ++++
>>>> tests/utiltest.c | 141 ++++++++++++++++++++++++++++++++
>>>> 4 files changed, 329 insertions(+)
>>>>
>>>> diff --git a/src/libvirt_private.syms b/src/libvirt_private.syms
>>>> index a6af44fe1c..a206a943c5 100644
>>>> --- a/src/libvirt_private.syms
>>>> +++ b/src/libvirt_private.syms
>>>> @@ -3433,6 +3433,8 @@ virHostGetDRMRenderNode;
>>>> virHostHasIOMMU;
>>>> virIndexToDiskName;
>>>> virIsDevMapperDevice;
>>>> +virKernelCmdlineMatchParam;
>>>> +virKernelCmdlineNextParam;
>>>> virMemoryLimitIsSet;
>>>> virMemoryLimitTruncate;
>>>> virMemoryMaxValue;
>>>> diff --git a/src/util/virutil.c b/src/util/virutil.c
>>>> index fb46501142..749c9d7116 100644
>>>> --- a/src/util/virutil.c
>>>> +++ b/src/util/virutil.c
>>>> @@ -1725,6 +1725,175 @@ virHostGetDRMRenderNode(void)
>>>> return ret;
>>>> }
>>>>
>>>> +
>>>> +static const char *virKernelCmdlineSkipDbQuote(const char *cmdline,
>>>
>>> minor nitpick: we can call ^this simply ...SkipQuote, we don't need to be so
>>> specific about it being a double quotation mark, do we?
>>>
>>
>> Sure, changed to SkipQuote.
>>
>>>> + bool *is_quoted)
>>>> +{
>>>> + if (cmdline[0] == '"') {
>>>> + *is_quoted = !(*is_quoted);
>>>> + cmdline++;
>>>> + }
>>>> + return cmdline;
>>>> +}
>>>> +
>>>> +
>>>> +static size_t virKernelCmdlineSearchForward(const char *cmdline,
>>>> + bool *is_quoted,
>>>> + bool include_equal)
>>>
>>> Hmm, what if instead we tried to find and return the index of the '=' character
>>> but iterated all the way until the next applicable (i.e. taking quotation into
>>> account) space and saved that end of arg/arg=val parameter into **res. The
>>> caller of this function would then continue directly from *res with the next
>>> arg/arg=val parameter.
>>> We could then call this something like virKernelCmdlineFindEqual and return -1
>>> if there is no '=' sign, indicating that it's a standalone parameter with no
>>> value.
>>>
>>
>> Yes, we can do that. It would look like this:
>>
>> size_t index;
>
> nitpick: ^this is a standard iteration variable, so naming it "i" might look
> more "expected"
>
Changed.
>> size_t equal_index = 0;
>>
>> for (index = 0; cmdline[index]; index++) {
>> if (!(is_quoted) && g_ascii_isspace(cmdline[index]))
>> break;
>> if (equal_index == 0 && cmdline[index] == '=') {
>> equal_index = index;
>> continue;
>> }
>> virKernelCmdlineSkipQuote(cmdline + index, &is_quoted);
>> }
>> *res = cmdline + index;
>> return equal_index;
>>
>>
>> I found it more convenient to use 0 rather than -1 so that we can also
>> handle the case when the argument itself starts with the equal sign.
>
> Yes, that's fine, but please provide a doc string for this function mentioning
> this.
>
Will do.
> ...
>
>>
>> So the function would look like this now:
>>
>> virSkipSpaces(&cmdline);
>> cmdline = virKernelCmdlineSkipQuote(cmdline, &is_quoted);
>> equal_index = virKernelCmdlineFindEqual(cmdline, is_quoted, &next);
>>
>> if (next == cmdline)
>> return next;
>>
>> /* param has no value */
>> if (equal_index == 0) {
>> if (is_quoted && next[-1] == '"')
>> *param = g_strndup(cmdline, next - cmdline - 1);
>> else
>> *param = g_strndup(cmdline, next - cmdline);
>> return next;
>> }
>>
>> *param = g_strndup(cmdline, equal_index);
>>
>> if (cmdline[equal_index + 1] == '"') {
>> is_quoted = true;
>> equal_index++;
>> }
>>
>> if (is_quoted && next[-1] == '"')
>> *val = g_strndup(cmdline + equal_index + 1,
>> next - cmdline - equal_index - 2);
>> else
>> *val = g_strndup(cmdline + equal_index + 1,
>> next - cmdline - equal_index - 1);
>> return next;
>>
>> There's still a bit of handling required due to the kernel's quoting rules.
>> I took the liberty of using 'next' in place of 'res' as I think it conveys
>> better its purpose. Let me know if that looks good to you.
>
> That is fine, looks good.
>
>>
>>>> +}
>>>> +
>>>> +
>>>> +#define VIR_CMDLINE_STR_CMP(kernel_val, caller_val, flags) \
>>>> + (((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_EQ) && \
>>>> + STREQ(kernel_val, caller_val)) || ((flags & VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX) && \
>>>> + STRPREFIX(kernel_val, caller_val)))
>>>> +
>>>> +
>>>> +/*
>>>> + * Try to match the provided kernel cmdline string with the provided @arg
>>>> + * and the list @values of possible values according to the matching strategy
>>>> + * defined in @flags. Possible options include:
>>>> + * - VIR_KERNEL_CMDLINE_FLAGS_CMP_PREFIX: do a substring comparison of values
>>>
>>> I know you used it in the following patches to match against the accepted
>>> values, but do we really need to match with a prefix, I'd be fine with simple
>>> stirng equality matching in all cases and not mix the matching strategy for
>>> both values and kernel arguments.
>>>
>>
>> The prefix based comparison is to make sure libvirt matches exactly like the
>> kernel code does for the parameter prot_virt (see arch/s390/kernel/uv.c ->
>> prot_virt_setup -> kstrtobool) which performs prefix based matching. Using
>> equality matching should work for most cases but there would be certain
>> corner cases missed (i.e. prot_virt=yfoo will be recognized as 'activate' by
>> the kernel but not by libvirt).
>
> That..is...interesting...
>
>>
>> If you think such cases are not worth the trouble I can remove the PREFIX
>> flag, let me know what you prefer.
>
> Well, given that kernel supposedly recognizes "yfoo" as "on" just because of
> the prefix, we obviously can't neglect the PREFIX matching can we? If we did
> and someone indeed used such a horrible example of a cmdline,
> virt-host-validate would lie about the feature not being enabled (even though
> it would be) because libvirt would only do an equality match, so it has to stay.
>
> <reprove>bad kernel, bad kernel</reprove>
>
> Just to re-assure myself, that's a generic behavior used for all option parsing
> in the kernel not just prot_virt, right? If so, then contrary to my previous
> response, we should always use and default to prefix matching, because given
> the situation equality matching would clearly not be a satisfactory behaviour.
>
No, actually other parameters might use equality matching, this is the
case with 'mem_encrypt' (see arch/x86/mm/mem_encrypt_identity.c ->
sme_enable -> strncmp at line 575) which was being checked for SEV in
the first version of this series. I think we are fine with the current
approach as we allow the caller to choose which matching strategy to use
by specifying the appropriate flag.
> ...
>
>>
>>
>>>> + * - VIR_KERNEL_CMDLINE_FLAGS_SEARCH_STICKY: first positive match satifies search
>>>
>>> is "sticky searching" some terminus technicus? If not, we probably should name
>>> this FLAGS_MATCH_FIRST and FLAGS_MATCH_LAST respectively.
>>>
>>
>> I guess the flag explanation was not clear enough, sticky actually means
>> "stop as soon as you find a match, no matter the order", so
>
> Well, isn't "stop as soon as you find a match" == match first occurrence?
>
I suppose one can also see that way :) I will thus use SEARCH_FIRST as
requested.
>> FLAGS_SEARCH_FIRST doesn't really fit here. A better name could be
>> FLAGS_SEARCH_ANY. Let me know if you agree.
>>
>
> ...
>
>>>>
>>>> +struct testKernelCmdlineNextParamData
>>>> +{
>>>> + const char *cmdline;
>>>> + const char *param;
>>>> + const char *val;
>>>> + const char *next;
>>>> +};
>>>> +
>>>> +static struct testKernelCmdlineNextParamData kEntries[] = {
>>>> + { "arg1 arg2 arg3=val1", "arg1", NULL, " arg2 arg3=val1" },
>>>> + { "arg1=val1 arg2 arg3=val3 arg4", "arg1", "val1", " arg2 arg3=val3 arg4" },
>>>> + { "arg3=val3 ", "arg3", "val3", " " },
>>>> + { "arg3=val3", "arg3", "val3", "" },
>>>> + { "arg-3=val3 arg4", "arg-3", "val3", " arg4" },
>>>> + { "arg_3=val3 arg4", "arg-3", "val3", " arg4" },
>>>> + { " arg_3=val3 arg4", "arg-3", "val3", " arg4" },
>>>> + { " arg-3=val3 arg4", "arg-3", "val3", " arg4" },
>>>> + { "arg2=\"value with spaces\" arg3=val3", "arg2", "value with spaces", " arg3=val3" },
>>>> + { " arg2=\"value with spaces\" arg3=val3", "arg2", "value with spaces", " arg3=val3" },
>>>> + { " \"arg2=value with spaces\" arg3=val3", "arg2", "value with spaces", " arg3=val3" },
>>>> + { "arg2=\"val\"ue arg3", "arg2", "val\"ue", " arg3" },
>>>> + { " arg3\" escaped=val2\"", "arg3\" escaped", "val2", "" },
>>>
>>> ^Is this even valid for the kernel itself? Looking at [1], they clearly don't
>>> allow escaped \" in the arg/value.
>>>
>>> [1] https://github.com/torvalds/linux/blob/db54615e21419c3cb4d699a0b0aa16cc44d0e9da/lib/cmdline.c
>>>
>>
>> I guess the word "escaped" in this test is a bit misleading; it's actually
>> escaping the blank space, not the quote itself. This is valid for the
>> kernel. In order to assure our parsing results would match those of the
>> kernel I executed the code in cmdline.c in a standalone file to generate the
>> reference values for the test.
>>
>> I'll change "arg3\" escaped" to "arg3\" with spaces" to clearly state the
>> intention here.
>
> Sure, that will help, although I wasn't relating to the word "escaped" in my
> reply. Nevermind though, my bad, I just couldn't wrap my head around seeing
> "val\"ue" and somehow could not make sense out of it, too much QE to process I
> guess, it's fine.
>
> Regards,
> Erik
>
--
Best regards,
Paulo de Rezende Pinatti
More information about the libvir-list
mailing list