[Libguestfs] [libguestfs-common PATCH v2 2/2] options/keys: introduce unescape_device_mapper_lvm()

Fri May 19 09:30:58 UTC 2023

On 5/18/23 17:59, Richard W.M. Jones wrote:
> On Thu, May 18, 2023 at 05:42:03PM +0200, Laszlo Ersek wrote:
>> On 5/18/23 15:42, Richard W.M. Jones wrote:
>>> On Thu, May 18, 2023 at 03:09:42PM +0200, Laszlo Ersek wrote:
>>>> Libguestfs uses the
>>>>
>>>>   /dev/VG/LV
>>>>
>>>> format internally, for identifying LVM logical volumes, but the user might
>>>> want to specify the
>>>>
>>>>   /dev/mapper/VG-LV ID
>>>>
>>>> format with the "--key ID:..." options.
>>>>
>>>> Introduce unescape_device_mapper_lvm() for turning
>>>>
>>>>   /dev/mapper/VG-LV
>>>>
>>>> key IDs into
>>>>
>>>>   /dev/VG/LV
>>>>
>>>> ones, unescaping doubled hyphens to single hyphens in both VG and LV in
>>>> the process.
>>>>
>>>> Call unescape_device_mapper_lvm() from key_store_import_key(). That is,
>>>> translate the ID as soon as the "--key" option is processed -- let the
>>>> keystore only know about the usual /dev/VG/LV format, for later matching.
>>>>
>>>> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168506
>>>> Signed-off-by: Laszlo Ersek <lersek at redhat.com>
>>>> ---
>>>>
>>>> Notes:
>>>>     v2:
>>>>     
>>>>     - rewrite without regular expressions [Rich, Laszlo]
>>>>     
>>>>     - restructure the commit message
>>>>     
>>>>     v1:
>>>>     
>>>>     regcomp() must definitely allocate memory dynamically, and we
>>>>     (intentionally) never free that -- we never call regfree(). I assume
>>>>     valgrind will catch this as a leak, so we might have to extend
>>>>     "valgrind-suppressions" in each dependent superproject. However, I'm
>>>>     unable to run "make check-valgrind" successfully in e.g. virt-v2v even
>>>>     before these patches; see also
>>>>     <https://listman.redhat.com/archives/libguestfs/2023-May/031496.html>.
>>>>
>>>>  options/keys.c | 100 ++++++++++++++++++++
>>>>  1 file changed, 100 insertions(+)
>>>>
>>>> diff --git a/options/keys.c b/options/keys.c
>>>> index bc7459749276..52b27369016a 100644
>>>> --- a/options/keys.c
>>>> +++ b/options/keys.c
>>>> @@ -260,6 +260,105 @@ key_store_add_from_selector (struct key_store *ks, const char *selector)
>>>>    return key_store_import_key (ks, &key);
>>>>  }
>>>>  
>>>> +/* Turn /dev/mapper/VG-LV into /dev/VG/LV, in-place. */
>>>> +static void
>>>> +unescape_device_mapper_lvm (char *id)
>>>> +{
>>>> +  static const char dev[] = "/dev/", dev_mapper[] = "/dev/mapper/";
>>>> +  const char *input_start;
>>>> +  char *output;
>>>> +  enum { M_SCAN, M_FILL, M_DONE } mode;
>>>> +
>>>> +  if (!STRPREFIX (id, dev_mapper))
>>>> +    return;
>>>> +
>>>> +  /* Start parsing "VG-LV" from "id" after "/dev/mapper/". */
>>>> +  input_start = id + (sizeof dev_mapper - 1);
>>>> +
>>>> +  /* Start writing the unescaped "VG/LV" output after "/dev/". */
>>>> +  output = id + (sizeof dev - 1);
>>>> +
>>>> +  for (mode = M_SCAN; mode < M_DONE; ++mode) {
>>>> +    char c;
>>>> +    const char *input = input_start;
>>>> +    const char *hyphen_buffered = NULL;
>>>> +    bool single_hyphen_seen = false;
>>>> +
>>>> +    do {
>>>> +      c = *input;
>>>> +
>>>> +      switch (c) {
>>>> +      case '-':
>>>> +        if (hyphen_buffered == NULL)
>>>> +          /* This hyphen may start an escaped hyphen, or it could be the
>>>> +           * separator in VG-LV.
>>>> +           */
>>>> +          hyphen_buffered = input;
>>>> +        else {
>>>> +          /* This hyphen completes an escaped hyphen; unescape it. */
>>>> +          if (mode == M_FILL)
>>>> +            *output++ = '-';
>>>> +          hyphen_buffered = NULL;
>>>> +        }
>>>> +        break;
>>>> +
>>>> +      case '/':
>>>> +        /* Slash characters are forbidden in VG-LV anywhere. If there's any,
>>>> +         * we'll find it in the first (i.e., scanning) phase, before we output
>>>> +         * anything back to "id".
>>>> +         */
>>>> +        assert (mode == M_SCAN);
>>>> +        return;
>>>> +
>>>> +      default:
>>>> +        /* Encountered a non-slash, non-hyphen character -- which also may be
>>>> +         * the terminating NUL.
>>>> +         */
>>>> +        if (hyphen_buffered != NULL) {
>>>> +          /* The non-hyphen character comes after a buffered hyphen, so the
>>>> +           * buffered hyphen is supposed to be the single hyphen that separates
>>>> +           * VG from LV in VG-LV. There are three requirements for this
>>>> +           * separator: (a) it must be unique (we must not have seen another
>>>> +           * such separator earlier), (b) it must not be at the start of VG-LV
>>>> +           * (because VG would be empty that way), (c) it must not be at the end
>>>> +           * of VG-LV (because LV would be empty that way). Should any of these
>>>> +           * be violated, we'll catch that during the first (i.e., scanning)
>>>> +           * phase, before modifying "id".
>>>> +           */
>>>> +          if (single_hyphen_seen || hyphen_buffered == input_start ||
>>>> +              c == '\0') {
>>>> +            assert (mode == M_SCAN);
>>>> +            return;
>>>> +          }
>>>> +
>>>> +          /* Translate the separator hyphen to a slash character. */
>>>> +          if (mode == M_FILL)
>>>> +            *output++ = '/';
>>>> +          hyphen_buffered = NULL;
>>>> +          single_hyphen_seen = true;
>>>> +        }
>>>> +
>>>> +        /* Output the non-hyphen character (including the terminating NUL)
>>>> +         * regardless of whether there was a buffered hyphen separator (which,
>>>> +         * by now, we'll have attempted to translate and flush).
>>>> +         */
>>>> +        if (mode == M_FILL)
>>>> +          *output++ = c;
>>>> +      }
>>>> +
>>>> +      ++input;
>>>> +    } while (c != '\0');
>>>> +
>>>> +    /* We must have seen the VG-LV separator. If that's not the case, we'll
>>>> +     * catch it before modifying "id".
>>>> +     */
>>>> +    if (!single_hyphen_seen) {
>>>> +      assert (mode == M_SCAN);
>>>> +      return;
>>>> +    }
>>>> +  }
>>>> +}
>>>
>>> So this code can never return an error?
>>
>> It surely can; the one error mode is exactly the same as it was in v1
>> (the regex version). Namely, the error mode is that the input pathname
>> in "id" does not match the expected form (i.e., /dev/mapper/VG-LV), and
>> then we don't modify "id" at all, we let "id" be added to the keystore
>> exactly as the user specified it.
>>
>> The error mode is activated whenever we reach any of the explicit return
>> statements in the function:
>>
>> - when the first STRPREFIX call falls
>>
>> - or when any of the other three "return" statements are reached (which
>> can only be reached during the scanning phase, M_SCAN; that is, *before*
>> we write anything at all into "id").
>>
>> The point of separating M_SCAN from M_FILL is to verify the full format
>> match before we start modifying "id" in-place. The traversal (parsing)
>> through "VG-LV" is exactly the same in both cases, but in the first
>> pass, we just check. If that completes fine, we repeat the same
>> traversal, but then we also modify "id" in-place.
>>
>> The translation only ever contracts VG-LV (it produces at most as many
>> bytes as it consumes), and the write point is always to the left of the
>> read point, so the in-place modification, including the final
>> NUL-termination, is safe.
>>
>>> eg if the input was "A-B-C",
>>> I think it would translate the string to "A/B-C" which is an output
>>> but the input seems like it was wrong.
>>
>> If "id" contains the string "/dev/mapper/A-B-C", then it *mismatches*
>> the expected form "/dev/mapper/VG-LV" just as much as an ID containing
>> "/dev/sda2" would mismatch the expected form.
>>
>> In either of those cases, "id" does not get modified, and gets added to
>> the keystore precisely as the user specified it on the command line.
>>
>> We don't say "this ID is wrong, error!". We don't know what the ID is,
>> we only know what it is *not*. We only say "this is not a
>> /dev/mapper/VG-LV specification, we just don't know what it is supposed
>> to be, so we won't touch it, we'll add it as it is, to the keystore".
>>
>> That's exactly the behavior (= blind addition) that we have right now,
>> i.e., pre-patch.
> 
> OK got it, in that case:
> 
> Reviewed-by: Richard W.M. Jones <rjones at redhat.com>

Thank you; commit range 38e6988c1864..b636c3f20a1b.
Laszlo

> 
>> unescape_device_mapper_lvm() either modifies "id" in place (if "id" on
>> input has the expected form), or leaves "id" alone.
>>
>> The logic following the unescape_device_mapper_lvm() *call site* doesn't
>> know which way the function went, but it does not *need* to know.
>>
>>> Or is it that only the VG is
>>> escaped?  (I don't imagine there is some specification for /dev/mapper
>>> device names.)
>>
>> "/dev/mapper/A-B-C" simply cannot be interpreted according to the
>> "/dev/mapper/VG-LV" scheme. For such an interpretation to be possible,
>> we'd have to have one of the following strings in "id":
>>
>> - "/dev/mapper/A--B-C" --> "/dev/A-B/C"
>> - "/dev/mapper/A-B--C" --> "/dev/A/B-C"
>>
>> If the user specifies (for example) "/dev/mapper/A-B-C", we reject that,
>> so the ID remains "/dev/mapper/A-B-C", and that is what gets added to
>> the keystore -- exactly as it happens pre-patch.
>>
>> If the user specifies (for example) "/dev/mapper/A--B--C", we also
>> reject that (there is no separator to tell us what is the VG and what is
>> the LV), so the ID remains "/dev/mapper/A--B--C", and that is what gets
>> added to the keystore -- exactly as it happens pre-patch.
>>
>> FWIW, this aspect is identical between version 1 and version 2 of this
>> patch: in v1, if the regex didn't match, then "id" would simply not be
>> rewritten, and would get added to the keystore as specified by the user.
>> The v1->v2 update is an implementation detal, internal to
>> unescape_device_mapper_lvm() -- it only affects how the expected form is
>> recognized (regex versus manual parsing), and how a matching input is
>> translated (the output being constructed from matched regex
>> subexpressions, versus byte-wise buffering and flushing integrated with
>> scanning).
> 
> Thanks,
> 
> Rich.
>