[libvirt] [PATCH v2 10/12] qemu_hotplug: Hotplug of reservations

Thu Mar 8 00:47:33 UTC 2018

On 03/06/2018 12:31 PM, Michal Privoznik wrote:
> On 03/02/2018 06:55 PM, John Ferlan wrote:
>>
>>
>> On 02/21/2018 01:11 PM, Michal Privoznik wrote:
>>> Surprisingly, nothing special is happening here. If we are the
>>> first to use the managed helper then spawn it. If not, we're
>>> almost done.
>>
>> But wasn't there a very special reason to start it between fork and
>> exec? 
> 
> No. If you are referring to previous patch, the very special reason
> applies to calling qemuProcessSetupOnePRDaemonHook() which places the
> qemu-pr-helper process (well, at this stage it's still just our own
> fork) into the namespace of qemu process.
> 
>> Does that still hold true? That is why can we start the daemon
>> after exec now, but we couldn't before in the previous patch?
>>
>> It feels quite "disjoint" to have the unplug in a subsequent patch too.
>> Considering them both in one just seems "better", but it's not a deal
>> breaker.
>>
>>>
>>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>>> ---
>>>  src/qemu/qemu_hotplug.c | 72 +++++++++++++++++++++++++++++++++++++++++++++++++
>>>  src/qemu/qemu_process.c | 38 +++++++++++++++++++++-----
>>>  src/qemu/qemu_process.h |  7 +++++
>>>  3 files changed, 110 insertions(+), 7 deletions(-)
>>>
>>> diff --git a/src/qemu/qemu_hotplug.c b/src/qemu/qemu_hotplug.c
>>> index f28006e3c..2ebb68fbc 100644
>>> --- a/src/qemu/qemu_hotplug.c
>>> +++ b/src/qemu/qemu_hotplug.c
>>> @@ -348,6 +348,58 @@ qemuDomainChangeEjectableMedia(virQEMUDriverPtr driver,
>>>  }
>>>  
>>>  
>>> +static int
>>> +qemuBuildPRDefInfoProps(virDomainObjPtr vm,
>>
>> qemuBuild?  Why not qemuDomainAdd?
> 
> Dunno. Other functions have the same prefix, e.g.
> qemuBuildSecretInfoProps().
> 

OK - I see... Still qemuBuildSecretInfoProps is in qemu_command - there
are not qemuBuild API's in qemu_hotplug.

>>
>>> +                        virDomainDiskDefPtr disk,
>>> +                        virJSONValuePtr *prmgrProps,
>>> +                        const char **prAlias,
>>> +                        const char **prPath)
>>> +{
>>> +    qemuDomainObjPrivatePtr priv = vm->privateData;
>>> +    qemuDomainStorageSourcePrivatePtr srcPriv;
>>> +    virJSONValuePtr props = NULL;
>>> +    int ret = -1;
>>> +
>>> +    srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src);
>>
>> ohh, umm, in qemuDomainAttachDiskGeneric there's a :
>>
>>      srcPriv = QEMU_DOMAIN_STORAGE_SOURCE_PRIVATE(disk->src);
>>      if (srcPriv) {
>>          secinfo = srcPriv->secinfo;
>>          encinfo = srcPriv->encinfo;
>>      }
>>
>> Which makes light dawn that it "was" possible for srcPriv == NULL and
>> the "thought" that the subsequent deref is going to fail rather
>> spectacularly. See commit '8056721cb' (and a couple of others).
>>
>> Although it seems patch 4 and your change to qemuDomainPrepareDiskSource
>> to call qemuDomainStorageSourcePrivateNew instead of having it called in
>> qemuDomainSecretStorageSourcePrepare would seem to ensure that disk
>> source privateData is always allocated now - meaning a number of other
>> places where srcPriv is/was checked are no longer necessary.
> 
> Yes. There are such places.
> 

And Peter also made changes in the domain private code that you'll have
to merge with.  Seeing this here and thinking about the recent history
of this code caused me to "elaborate".

>>
>> Maybe that one change needs to be "extracted" out to make things
>> obvious. Not required, but just thinking and typing thoughts.
> 
> Okay, I can try that. Also try removing those unnecessary checks, but as
> I was proven many times, touching this part of libvirt code always ends
> bad with me.
> 

We're in the same boat. Usually what happens is something I take as a
given during review process bites me later on when something else I was
depending on is changed as part of the review, but I neglect to remember
to go back to the subsequent patches that may rely on the earlier
assumption.  Someone usually finds those pretty quick though.

>>
>>> +
>>> +    *prmgrProps = NULL;
>>> +
>>> +    if (priv->prPid != (pid_t) -1 ||
>>> +        !srcPriv->prd ||
>>> +        !srcPriv->prd->alias)
>>> +        return 0;
>>> +
>>> +    if (virJSONValueObjectCreate(&props,
>>> +                                 "s:path", srcPriv->prd->path,
>>> +                                 NULL) < 0)
>>> +        goto cleanup;
>>> +> +    if (qemuProcessSetupOnePRDaemon(vm, disk) < 0)
>>> +        goto cleanup;> +
>>> +    *prAlias = srcPriv->prd->alias;
>>> +    *prPath = srcPriv->prd->path;
>>> +    *prmgrProps = props;
>>> +    props = NULL;
>>> +    ret = 0;
>>> + cleanup:
>>> +    virJSONValueFree(props);
>>> +    return ret;
>>> +}
>>> +
>>> +
>>> +static void
>>> +qemuDestroyPRDefObject(virDomainObjPtr vm,
>>
>> qemuDestroy?  Why not qemuDomainDel?
> 
> It's counterpart. qemuBuildPRDef.. qemuDestroyPRDef..
> 
>>
>>> +                       const char *alias,
>>> +                       const char *path)
>>> +{
>>> +    if (!alias)
>>> +        return;
>>
>> BTW: This is where I'd expect to remove the object and then...
>>
>>> +
>>> +    qemuProcessKillPRDaemons(vm, path, false);
>>
>> Just unlink the path...  See some thoughts below...
>>
>>> +}
>>> +
>>> +
>>>  /**
>>>   * qemuDomainAttachDiskGeneric:
>>>   *
>>> @@ -365,12 +417,16 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver,
>>>      char *devstr = NULL;
>>>      char *drivestr = NULL;
>>>      char *drivealias = NULL;
>>> +    const char *prAlias = NULL;
>>> +    const char *prPath = NULL;
>>>      bool driveAdded = false;
>>>      bool secobjAdded = false;
>>>      bool encobjAdded = false;
>>> +    bool prmgrAdded = false;
>>>      virQEMUDriverConfigPtr cfg = virQEMUDriverGetConfig(driver);
>>>      virJSONValuePtr secobjProps = NULL;
>>>      virJSONValuePtr encobjProps = NULL;
>>> +    virJSONValuePtr prmgrProps = NULL;
>>>      qemuDomainStorageSourcePrivatePtr srcPriv;
>>>      qemuDomainSecretInfoPtr secinfo = NULL;
>>>      qemuDomainSecretInfoPtr encinfo = NULL;
>>> @@ -403,6 +459,9 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver,
>>>                                        disk->info.alias) < 0)
>>>          goto error;
>>>  
>>> +    if (qemuBuildPRDefInfoProps(vm, disk, &prmgrProps, &prAlias, &prPath) < 0)
>>> +        goto error;
>>> +
>>>      if (!(drivestr = qemuBuildDriveStr(disk, false, priv->qemuCaps)))
>>>          goto error;
>>>  
>>> @@ -435,6 +494,15 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver,
>>>          encobjAdded = true;
>>>      }
>>>  
>>> +    if (prmgrProps) {
>>> +        rv = qemuMonitorAddObject(priv->mon, "pr-manager-helper", prAlias,
>>> +                                  prmgrProps);
>>> +        prmgrProps = NULL; /* qemuMonitorAddObject consumes */
>>> +        if (rv < 0)
>>> +            goto exit_monitor;
>>> +        prmgrAdded = true;
>>> +    }
>>> +
>>
>> So for one of the managed modes (as noted much earlier) we could be
>> creating a object with a duplicated alias - does that work? I thought
>> id= has to be unique.
> 
> How come? The first thing that qemuBuildPRDefInfoProps() does is it
> checks 'priv->prPid != (pid_t) -1'. The fact that this pid is not -1
> means that there is a pr-helper process already running.
> 

Not sure why I'm under the impression that each object needs a unique id
value. Whether two objects could actually be created because of some
other check - well I guess not all the dots are connecting for me yet.

>>
>>>      if (qemuMonitorAddDrive(priv->mon, drivestr) < 0)
>>>          goto exit_monitor;
>>>      driveAdded = true;
>>> @@ -455,6 +523,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver,
>>>   cleanup:
>>>      virJSONValueFree(secobjProps);
>>>      virJSONValueFree(encobjProps);
>>> +    virJSONValueFree(prmgrProps);
>>>      qemuDomainSecretDiskDestroy(disk);
>>>      VIR_FREE(devstr);
>>>      VIR_FREE(drivestr);
>>> @@ -472,6 +541,8 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver,
>>>          ignore_value(qemuMonitorDelObject(priv->mon, secinfo->s.aes.alias));
>>>      if (encobjAdded)
>>>          ignore_value(qemuMonitorDelObject(priv->mon, encinfo->s.aes.alias));
>>> +    if (prmgrAdded)
>>> +        ignore_value(qemuMonitorDelObject(priv->mon, prAlias));
>>>      if (qemuDomainObjExitMonitor(driver, vm) < 0)
>>>          ret = -2;
>>>      virErrorRestore(&orig_err);
>>> @@ -481,6 +552,7 @@ qemuDomainAttachDiskGeneric(virQEMUDriverPtr driver,
>>>   error:
>>>      qemuDomainDelDiskSrcTLSObject(driver, vm, disk->src);
>>>      ignore_value(qemuHotplugPrepareDiskAccess(driver, vm, disk, NULL, true));
>>> +    qemuDestroyPRDefObject(vm, prAlias, prPath);
>>
>> oh, I see, you're mixing the way TLS object tear down occurred and how
>> other objects happen.  If you're going to mimic TLS, then change
>> qemuBuildPRDefInfoProps to be something like qemuDomainAddPRDefObject
>> which would do the EnterMonitorAsync, Props mgmt, AddObject, and
>> ExitMonitor.
>>
>> That way qemuDestroyPRDefObject changes to qemuDomainDelPRDefObject
>> which would handle the failure path.
>>
>> Then I believe you don't have to pass around prAlias and prPath since
>> they'd be "obtainable".
> 
> Actually, I'm mimicing qemuBuildSecretInfoProps().
> 

And teardown of that isn't in the "error" label other than of course the
TLS objects (including those secinfo things).  And that's because the
decision was to build those objects wholly/separately with an Enter/Exit
Monitor before we get into the guts of the Attach code.  I think you may
miss an error path by putting it here, but I forget exactly what I was
thinking about when I wrote the comment.

>>
>>>      goto cleanup;
>>>  }
>>>  
>>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>>> index 33e0ad30c..3a914b6c6 100644
>>> --- a/src/qemu/qemu_process.c
>>> +++ b/src/qemu/qemu_process.c
>>> @@ -2556,16 +2556,40 @@ qemuProcessResctrlCreate(virQEMUDriverPtr driver,
>>>  }
>>>  
>>>  
>>
>> This subsequent hunk feels like it could have gone in for the previous
>> patch.  Or at the very least some function that would unlink the socket
>> path for each of the sockets in ndisks that were created.  Then that
>> single unlink API gets exposed.
> 
> Well, the only socket we can unlink is for the managed pr-helper.
> Otherwise we might be unlinking a socket that is still active. And since
> I need to use those static functions only in this patch I'm exposing
> them here.
> 

Yeah - this is definitely a part I'm not understanding - socket mgmt
with the daemon.

>>
>>> -static void
>>> -qemuProcessKillPRDaemons(virDomainObjPtr vm)
>>> +void
>>> +qemuProcessKillPRDaemons(virDomainObjPtr vm,
>>> +                         const char *socketPath,
>>> +                         bool force)
>>
>> The only time force == true is when socketPath == NULL... and that's
>> only in the shutdown path.  When socketPath != NULL, force == false, and
>> we're going to unplug the socket, right?
> 
> Yes.
> 
>>
>> Perhaps this would be cleaner if only @socketPath was in play. If NULL,
>> then walk the ndisks looking for paths that you'll unlink first before
>> killing the daemon.
>>
>> I actually think having a separate unlink the socket would be just fine.
>> Does it really matter if it's the "last" one to stop the helper daemon?
> 
> I think it does, because if we did not stop it I can see people
> complaining immediately. We would be leaving a SUID process around for
> longer than needed. Which increases the attack surface (the process has
> a socket which is writable to UID:GID of the corresponding qemu process
> and can do RAWIO operations).
> 

Oh, well then, yes a SUID process shouldn't stick around if not
necessary... Of course you know QE will come up with a way to start/stop
incessantly and break something.

John