[Libguestfs] [libguestfs-common PATCH v2 2/2] options/keys: introduce unescape_device_mapper_lvm()

Richard W.M. Jones rjones at redhat.com
Thu May 18 15:59:19 UTC 2023


On Thu, May 18, 2023 at 05:42:03PM +0200, Laszlo Ersek wrote:
> On 5/18/23 15:42, Richard W.M. Jones wrote:
> > On Thu, May 18, 2023 at 03:09:42PM +0200, Laszlo Ersek wrote:
> >> Libguestfs uses the
> >>
> >>   /dev/VG/LV
> >>
> >> format internally, for identifying LVM logical volumes, but the user might
> >> want to specify the
> >>
> >>   /dev/mapper/VG-LV ID
> >>
> >> format with the "--key ID:..." options.
> >>
> >> Introduce unescape_device_mapper_lvm() for turning
> >>
> >>   /dev/mapper/VG-LV
> >>
> >> key IDs into
> >>
> >>   /dev/VG/LV
> >>
> >> ones, unescaping doubled hyphens to single hyphens in both VG and LV in
> >> the process.
> >>
> >> Call unescape_device_mapper_lvm() from key_store_import_key(). That is,
> >> translate the ID as soon as the "--key" option is processed -- let the
> >> keystore only know about the usual /dev/VG/LV format, for later matching.
> >>
> >> Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=2168506
> >> Signed-off-by: Laszlo Ersek <lersek at redhat.com>
> >> ---
> >>
> >> Notes:
> >>     v2:
> >>     
> >>     - rewrite without regular expressions [Rich, Laszlo]
> >>     
> >>     - restructure the commit message
> >>     
> >>     v1:
> >>     
> >>     regcomp() must definitely allocate memory dynamically, and we
> >>     (intentionally) never free that -- we never call regfree(). I assume
> >>     valgrind will catch this as a leak, so we might have to extend
> >>     "valgrind-suppressions" in each dependent superproject. However, I'm
> >>     unable to run "make check-valgrind" successfully in e.g. virt-v2v even
> >>     before these patches; see also
> >>     <https://listman.redhat.com/archives/libguestfs/2023-May/031496.html>.
> >>
> >>  options/keys.c | 100 ++++++++++++++++++++
> >>  1 file changed, 100 insertions(+)
> >>
> >> diff --git a/options/keys.c b/options/keys.c
> >> index bc7459749276..52b27369016a 100644
> >> --- a/options/keys.c
> >> +++ b/options/keys.c
> >> @@ -260,6 +260,105 @@ key_store_add_from_selector (struct key_store *ks, const char *selector)
> >>    return key_store_import_key (ks, &key);
> >>  }
> >>  
> >> +/* Turn /dev/mapper/VG-LV into /dev/VG/LV, in-place. */
> >> +static void
> >> +unescape_device_mapper_lvm (char *id)
> >> +{
> >> +  static const char dev[] = "/dev/", dev_mapper[] = "/dev/mapper/";
> >> +  const char *input_start;
> >> +  char *output;
> >> +  enum { M_SCAN, M_FILL, M_DONE } mode;
> >> +
> >> +  if (!STRPREFIX (id, dev_mapper))
> >> +    return;
> >> +
> >> +  /* Start parsing "VG-LV" from "id" after "/dev/mapper/". */
> >> +  input_start = id + (sizeof dev_mapper - 1);
> >> +
> >> +  /* Start writing the unescaped "VG/LV" output after "/dev/". */
> >> +  output = id + (sizeof dev - 1);
> >> +
> >> +  for (mode = M_SCAN; mode < M_DONE; ++mode) {
> >> +    char c;
> >> +    const char *input = input_start;
> >> +    const char *hyphen_buffered = NULL;
> >> +    bool single_hyphen_seen = false;
> >> +
> >> +    do {
> >> +      c = *input;
> >> +
> >> +      switch (c) {
> >> +      case '-':
> >> +        if (hyphen_buffered == NULL)
> >> +          /* This hyphen may start an escaped hyphen, or it could be the
> >> +           * separator in VG-LV.
> >> +           */
> >> +          hyphen_buffered = input;
> >> +        else {
> >> +          /* This hyphen completes an escaped hyphen; unescape it. */
> >> +          if (mode == M_FILL)
> >> +            *output++ = '-';
> >> +          hyphen_buffered = NULL;
> >> +        }
> >> +        break;
> >> +
> >> +      case '/':
> >> +        /* Slash characters are forbidden in VG-LV anywhere. If there's any,
> >> +         * we'll find it in the first (i.e., scanning) phase, before we output
> >> +         * anything back to "id".
> >> +         */
> >> +        assert (mode == M_SCAN);
> >> +        return;
> >> +
> >> +      default:
> >> +        /* Encountered a non-slash, non-hyphen character -- which also may be
> >> +         * the terminating NUL.
> >> +         */
> >> +        if (hyphen_buffered != NULL) {
> >> +          /* The non-hyphen character comes after a buffered hyphen, so the
> >> +           * buffered hyphen is supposed to be the single hyphen that separates
> >> +           * VG from LV in VG-LV. There are three requirements for this
> >> +           * separator: (a) it must be unique (we must not have seen another
> >> +           * such separator earlier), (b) it must not be at the start of VG-LV
> >> +           * (because VG would be empty that way), (c) it must not be at the end
> >> +           * of VG-LV (because LV would be empty that way). Should any of these
> >> +           * be violated, we'll catch that during the first (i.e., scanning)
> >> +           * phase, before modifying "id".
> >> +           */
> >> +          if (single_hyphen_seen || hyphen_buffered == input_start ||
> >> +              c == '\0') {
> >> +            assert (mode == M_SCAN);
> >> +            return;
> >> +          }
> >> +
> >> +          /* Translate the separator hyphen to a slash character. */
> >> +          if (mode == M_FILL)
> >> +            *output++ = '/';
> >> +          hyphen_buffered = NULL;
> >> +          single_hyphen_seen = true;
> >> +        }
> >> +
> >> +        /* Output the non-hyphen character (including the terminating NUL)
> >> +         * regardless of whether there was a buffered hyphen separator (which,
> >> +         * by now, we'll have attempted to translate and flush).
> >> +         */
> >> +        if (mode == M_FILL)
> >> +          *output++ = c;
> >> +      }
> >> +
> >> +      ++input;
> >> +    } while (c != '\0');
> >> +
> >> +    /* We must have seen the VG-LV separator. If that's not the case, we'll
> >> +     * catch it before modifying "id".
> >> +     */
> >> +    if (!single_hyphen_seen) {
> >> +      assert (mode == M_SCAN);
> >> +      return;
> >> +    }
> >> +  }
> >> +}
> > 
> > So this code can never return an error?
> 
> It surely can; the one error mode is exactly the same as it was in v1
> (the regex version). Namely, the error mode is that the input pathname
> in "id" does not match the expected form (i.e., /dev/mapper/VG-LV), and
> then we don't modify "id" at all, we let "id" be added to the keystore
> exactly as the user specified it.
> 
> The error mode is activated whenever we reach any of the explicit return
> statements in the function:
> 
> - when the first STRPREFIX call falls
> 
> - or when any of the other three "return" statements are reached (which
> can only be reached during the scanning phase, M_SCAN; that is, *before*
> we write anything at all into "id").
> 
> The point of separating M_SCAN from M_FILL is to verify the full format
> match before we start modifying "id" in-place. The traversal (parsing)
> through "VG-LV" is exactly the same in both cases, but in the first
> pass, we just check. If that completes fine, we repeat the same
> traversal, but then we also modify "id" in-place.
> 
> The translation only ever contracts VG-LV (it produces at most as many
> bytes as it consumes), and the write point is always to the left of the
> read point, so the in-place modification, including the final
> NUL-termination, is safe.
> 
> > eg if the input was "A-B-C",
> > I think it would translate the string to "A/B-C" which is an output
> > but the input seems like it was wrong.
> 
> If "id" contains the string "/dev/mapper/A-B-C", then it *mismatches*
> the expected form "/dev/mapper/VG-LV" just as much as an ID containing
> "/dev/sda2" would mismatch the expected form.
> 
> In either of those cases, "id" does not get modified, and gets added to
> the keystore precisely as the user specified it on the command line.
> 
> We don't say "this ID is wrong, error!". We don't know what the ID is,
> we only know what it is *not*. We only say "this is not a
> /dev/mapper/VG-LV specification, we just don't know what it is supposed
> to be, so we won't touch it, we'll add it as it is, to the keystore".
>
> That's exactly the behavior (= blind addition) that we have right now,
> i.e., pre-patch.

OK got it, in that case:

Reviewed-by: Richard W.M. Jones <rjones at redhat.com>

> unescape_device_mapper_lvm() either modifies "id" in place (if "id" on
> input has the expected form), or leaves "id" alone.
> 
> The logic following the unescape_device_mapper_lvm() *call site* doesn't
> know which way the function went, but it does not *need* to know.
> 
> > Or is it that only the VG is
> > escaped?  (I don't imagine there is some specification for /dev/mapper
> > device names.)
> 
> "/dev/mapper/A-B-C" simply cannot be interpreted according to the
> "/dev/mapper/VG-LV" scheme. For such an interpretation to be possible,
> we'd have to have one of the following strings in "id":
> 
> - "/dev/mapper/A--B-C" --> "/dev/A-B/C"
> - "/dev/mapper/A-B--C" --> "/dev/A/B-C"
> 
> If the user specifies (for example) "/dev/mapper/A-B-C", we reject that,
> so the ID remains "/dev/mapper/A-B-C", and that is what gets added to
> the keystore -- exactly as it happens pre-patch.
> 
> If the user specifies (for example) "/dev/mapper/A--B--C", we also
> reject that (there is no separator to tell us what is the VG and what is
> the LV), so the ID remains "/dev/mapper/A--B--C", and that is what gets
> added to the keystore -- exactly as it happens pre-patch.
> 
> FWIW, this aspect is identical between version 1 and version 2 of this
> patch: in v1, if the regex didn't match, then "id" would simply not be
> rewritten, and would get added to the keystore as specified by the user.
> The v1->v2 update is an implementation detal, internal to
> unescape_device_mapper_lvm() -- it only affects how the expected form is
> recognized (regex versus manual parsing), and how a matching input is
> translated (the output being constructed from matched regex
> subexpressions, versus byte-wise buffering and flushing integrated with
> scanning).

Thanks,

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top


More information about the Libguestfs mailing list