[Libguestfs] [libguestfs-common PATCH v2 2/2] options/keys: introduce unescape_device_mapper_lvm()

Laszlo Ersek lersek at redhat.com
Thu May 18 15:42:03 UTC 2023


On 5/18/23 15:42, Richard W.M. Jones wrote:
> On Thu, May 18, 2023 at 03:09:42PM +0200, Laszlo Ersek wrote:
>> Libguestfs uses the
>>
>>   /dev/VG/LV
>>
>> format internally, for identifying LVM logical volumes, but the user might
>> want to specify the
>>
>>   /dev/mapper/VG-LV ID
>>
>> format with the "--key ID:..." options.
>>
>> Introduce unescape_device_mapper_lvm() for turning
>>
>>   /dev/mapper/VG-LV
>>
>> key IDs into
>>
>>   /dev/VG/LV
>>
>> ones, unescaping doubled hyphens to single hyphens in both VG and LV in
>> the process.
>>
>> Call unescape_device_mapper_lvm() from key_store_import_key(). That is,
>> translate the ID as soon as the "--key" option is processed -- let the
>> keystore only know about the usual /dev/VG/LV format, for later matching.
>>
>> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168506
>> Signed-off-by: Laszlo Ersek <lersek at redhat.com>
>> ---
>>
>> Notes:
>>     v2:
>>     
>>     - rewrite without regular expressions [Rich, Laszlo]
>>     
>>     - restructure the commit message
>>     
>>     v1:
>>     
>>     regcomp() must definitely allocate memory dynamically, and we
>>     (intentionally) never free that -- we never call regfree(). I assume
>>     valgrind will catch this as a leak, so we might have to extend
>>     "valgrind-suppressions" in each dependent superproject. However, I'm
>>     unable to run "make check-valgrind" successfully in e.g. virt-v2v even
>>     before these patches; see also
>>     <https://listman.redhat.com/archives/libguestfs/2023-May/031496.html>.
>>
>>  options/keys.c | 100 ++++++++++++++++++++
>>  1 file changed, 100 insertions(+)
>>
>> diff --git a/options/keys.c b/options/keys.c
>> index bc7459749276..52b27369016a 100644
>> --- a/options/keys.c
>> +++ b/options/keys.c
>> @@ -260,6 +260,105 @@ key_store_add_from_selector (struct key_store *ks, const char *selector)
>>    return key_store_import_key (ks, &key);
>>  }
>>  
>> +/* Turn /dev/mapper/VG-LV into /dev/VG/LV, in-place. */
>> +static void
>> +unescape_device_mapper_lvm (char *id)
>> +{
>> +  static const char dev[] = "/dev/", dev_mapper[] = "/dev/mapper/";
>> +  const char *input_start;
>> +  char *output;
>> +  enum { M_SCAN, M_FILL, M_DONE } mode;
>> +
>> +  if (!STRPREFIX (id, dev_mapper))
>> +    return;
>> +
>> +  /* Start parsing "VG-LV" from "id" after "/dev/mapper/". */
>> +  input_start = id + (sizeof dev_mapper - 1);
>> +
>> +  /* Start writing the unescaped "VG/LV" output after "/dev/". */
>> +  output = id + (sizeof dev - 1);
>> +
>> +  for (mode = M_SCAN; mode < M_DONE; ++mode) {
>> +    char c;
>> +    const char *input = input_start;
>> +    const char *hyphen_buffered = NULL;
>> +    bool single_hyphen_seen = false;
>> +
>> +    do {
>> +      c = *input;
>> +
>> +      switch (c) {
>> +      case '-':
>> +        if (hyphen_buffered == NULL)
>> +          /* This hyphen may start an escaped hyphen, or it could be the
>> +           * separator in VG-LV.
>> +           */
>> +          hyphen_buffered = input;
>> +        else {
>> +          /* This hyphen completes an escaped hyphen; unescape it. */
>> +          if (mode == M_FILL)
>> +            *output++ = '-';
>> +          hyphen_buffered = NULL;
>> +        }
>> +        break;
>> +
>> +      case '/':
>> +        /* Slash characters are forbidden in VG-LV anywhere. If there's any,
>> +         * we'll find it in the first (i.e., scanning) phase, before we output
>> +         * anything back to "id".
>> +         */
>> +        assert (mode == M_SCAN);
>> +        return;
>> +
>> +      default:
>> +        /* Encountered a non-slash, non-hyphen character -- which also may be
>> +         * the terminating NUL.
>> +         */
>> +        if (hyphen_buffered != NULL) {
>> +          /* The non-hyphen character comes after a buffered hyphen, so the
>> +           * buffered hyphen is supposed to be the single hyphen that separates
>> +           * VG from LV in VG-LV. There are three requirements for this
>> +           * separator: (a) it must be unique (we must not have seen another
>> +           * such separator earlier), (b) it must not be at the start of VG-LV
>> +           * (because VG would be empty that way), (c) it must not be at the end
>> +           * of VG-LV (because LV would be empty that way). Should any of these
>> +           * be violated, we'll catch that during the first (i.e., scanning)
>> +           * phase, before modifying "id".
>> +           */
>> +          if (single_hyphen_seen || hyphen_buffered == input_start ||
>> +              c == '\0') {
>> +            assert (mode == M_SCAN);
>> +            return;
>> +          }
>> +
>> +          /* Translate the separator hyphen to a slash character. */
>> +          if (mode == M_FILL)
>> +            *output++ = '/';
>> +          hyphen_buffered = NULL;
>> +          single_hyphen_seen = true;
>> +        }
>> +
>> +        /* Output the non-hyphen character (including the terminating NUL)
>> +         * regardless of whether there was a buffered hyphen separator (which,
>> +         * by now, we'll have attempted to translate and flush).
>> +         */
>> +        if (mode == M_FILL)
>> +          *output++ = c;
>> +      }
>> +
>> +      ++input;
>> +    } while (c != '\0');
>> +
>> +    /* We must have seen the VG-LV separator. If that's not the case, we'll
>> +     * catch it before modifying "id".
>> +     */
>> +    if (!single_hyphen_seen) {
>> +      assert (mode == M_SCAN);
>> +      return;
>> +    }
>> +  }
>> +}
> 
> So this code can never return an error?

It surely can; the one error mode is exactly the same as it was in v1
(the regex version). Namely, the error mode is that the input pathname
in "id" does not match the expected form (i.e., /dev/mapper/VG-LV), and
then we don't modify "id" at all, we let "id" be added to the keystore
exactly as the user specified it.

The error mode is activated whenever we reach any of the explicit return
statements in the function:

- when the first STRPREFIX call falls

- or when any of the other three "return" statements are reached (which
can only be reached during the scanning phase, M_SCAN; that is, *before*
we write anything at all into "id").

The point of separating M_SCAN from M_FILL is to verify the full format
match before we start modifying "id" in-place. The traversal (parsing)
through "VG-LV" is exactly the same in both cases, but in the first
pass, we just check. If that completes fine, we repeat the same
traversal, but then we also modify "id" in-place.

The translation only ever contracts VG-LV (it produces at most as many
bytes as it consumes), and the write point is always to the left of the
read point, so the in-place modification, including the final
NUL-termination, is safe.

> eg if the input was "A-B-C",
> I think it would translate the string to "A/B-C" which is an output
> but the input seems like it was wrong.

If "id" contains the string "/dev/mapper/A-B-C", then it *mismatches*
the expected form "/dev/mapper/VG-LV" just as much as an ID containing
"/dev/sda2" would mismatch the expected form.

In either of those cases, "id" does not get modified, and gets added to
the keystore precisely as the user specified it on the command line.

We don't say "this ID is wrong, error!". We don't know what the ID is,
we only know what it is *not*. We only say "this is not a
/dev/mapper/VG-LV specification, we just don't know what it is supposed
to be, so we won't touch it, we'll add it as it is, to the keystore".

That's exactly the behavior (= blind addition) that we have right now,
i.e., pre-patch.

unescape_device_mapper_lvm() either modifies "id" in place (if "id" on
input has the expected form), or leaves "id" alone.

The logic following the unescape_device_mapper_lvm() *call site* doesn't
know which way the function went, but it does not *need* to know.

> Or is it that only the VG is
> escaped?  (I don't imagine there is some specification for /dev/mapper
> device names.)

"/dev/mapper/A-B-C" simply cannot be interpreted according to the
"/dev/mapper/VG-LV" scheme. For such an interpretation to be possible,
we'd have to have one of the following strings in "id":

- "/dev/mapper/A--B-C" --> "/dev/A-B/C"
- "/dev/mapper/A-B--C" --> "/dev/A/B-C"

If the user specifies (for example) "/dev/mapper/A-B-C", we reject that,
so the ID remains "/dev/mapper/A-B-C", and that is what gets added to
the keystore -- exactly as it happens pre-patch.

If the user specifies (for example) "/dev/mapper/A--B--C", we also
reject that (there is no separator to tell us what is the VG and what is
the LV), so the ID remains "/dev/mapper/A--B--C", and that is what gets
added to the keystore -- exactly as it happens pre-patch.

FWIW, this aspect is identical between version 1 and version 2 of this
patch: in v1, if the regex didn't match, then "id" would simply not be
rewritten, and would get added to the keystore as specified by the user.
The v1->v2 update is an implementation detal, internal to
unescape_device_mapper_lvm() -- it only affects how the expected form is
recognized (regex versus manual parsing), and how a matching input is
translated (the output being constructed from matched regex
subexpressions, versus byte-wise buffering and flushing integrated with
scanning).

Laszlo

> 
> Rich.
> 
>>  struct key_store *
>>  key_store_import_key (struct key_store *ks, struct key_store_key *key)
>>  {
>> @@ -278,6 +377,7 @@ key_store_import_key (struct key_store *ks, struct key_store_key *key)
>>      error (EXIT_FAILURE, errno, "realloc");
>>  
>>    ks->keys = new_keys;
>> +  unescape_device_mapper_lvm (key->id);
>>    ks->keys[ks->nr_keys] = *key;
>>    ++ks->nr_keys;
>>  
>> _______________________________________________
>> Libguestfs mailing list
>> Libguestfs at redhat.com
>> https://listman.redhat.com/mailman/listinfo/libguestfs
> 



More information about the Libguestfs mailing list