[libvirt] [PATCH v4 07/14] qemu_cgroup: Allow /dev/mapper/control for PR

Michal Privoznik mprivozn at redhat.com
Mon Apr 16 14:56:43 UTC 2018


On 04/14/2018 02:55 PM, John Ferlan wrote:
> 
> 
> On 04/10/2018 10:58 AM, Michal Privoznik wrote:
>> Just like in previous commit, qemu-pr-helper might want to open
>> /dev/mapper/control under certain circumstances. Therefore we
>> have to allow it in cgroups.
> 
> Perhaps the two patches should be combined then...  yes, I know cgroups
> is different than namespace, so I understand the separation.

Yeah, I'd like to keep them separated. Even though they allow access to
the same path they do that in different areas of the code.

> 
>>
>> Signed-off-by: Michal Privoznik <mprivozn at redhat.com>
>> ---
>>  src/qemu/qemu_cgroup.c  | 33 ++++++++++++++++++++++++++++++---
>>  src/util/virdevmapper.c |  8 +++++++-
>>  2 files changed, 37 insertions(+), 4 deletions(-)
>>
>> diff --git a/src/qemu/qemu_cgroup.c b/src/qemu/qemu_cgroup.c
>> index d88eb7881f..546a4c8e63 100644
>> --- a/src/qemu/qemu_cgroup.c
>> +++ b/src/qemu/qemu_cgroup.c
>> @@ -114,6 +114,8 @@ qemuSetupImagePathCgroup(virDomainObjPtr vm,
>>  }
>>  
>>  
>> +#define DEVICE_MAPPER_CONTROL_PATH "/dev/mapper/control"
>> +
> 
> Almost too bad we didn't have a common place for this in some existing
> included .h file.

Yeah :(

> 
>>  static int
>>  qemuSetupImageCgroupInternal(virDomainObjPtr vm,
>>                               virStorageSourcePtr src,
>> @@ -125,6 +127,10 @@ qemuSetupImageCgroupInternal(virDomainObjPtr vm,
>>          return 0;
>>      }
>>  
>> +    if (virStoragePRDefIsManaged(src->pr) &&
>> +        qemuSetupImagePathCgroup(vm, DEVICE_MAPPER_CONTROL_PATH, false) < 0)
>> +        return -1;
>> +
> 
> Could the above be done potentially many times?  Could be expensive, no?
>  Considering what virDevMapperGetTargets[Impl] does...

It shouldn't be expensive. At the very first call of
virDevMapperGetTargetsImpl() we get ENXIO (this the change below) and
thus there only very small time penalty added.

> 
>>      return qemuSetupImagePathCgroup(vm, src->path, src->readonly || forceReadonly);
>>  }
>>  
>> @@ -142,9 +148,8 @@ qemuTeardownImageCgroup(virDomainObjPtr vm,
>>                          virStorageSourcePtr src)
>>  {
>>      qemuDomainObjPrivatePtr priv = vm->privateData;
>> -    int perms = VIR_CGROUP_DEVICE_READ |
>> -                VIR_CGROUP_DEVICE_WRITE |
>> -                VIR_CGROUP_DEVICE_MKNOD;
>> +    int perms = VIR_CGROUP_DEVICE_RWM;
>> +    size_t i;
>>      int ret;
>>  
>>      if (!virCgroupHasController(priv->cgroup,
>> @@ -157,6 +162,28 @@ qemuTeardownImageCgroup(virDomainObjPtr vm,
>>          return 0;
>>      }
>>  
>> +    for (i = 0; i < vm->def->ndisks; i++) {
>> +        virStorageSourcePtr diskSrc = vm->def->disks[i]->src;
>> +
>> +        if (src == diskSrc)
>> +            continue;
>> +
>> +        if (virStoragePRDefIsManaged(diskSrc->pr))
>> +            break;
>> +    }
>> +
>> +    if (i == vm->def->ndisks) {
> 
> If there was only one that's managed and it's @src, then we don't get
> here...
> 

How come? Say there are 3 disks for a domain and the first one is
managed: disks = {A, B, C} (where A.managed = yes}.

So the loop goes like this:

  qemuTeardownImageCgroup(A):
    i = 0;
    if (A.src == disks[0].src) //true
      continue;

    i = 1;
    if (A.src == disks[1].src) //false
      continue;

    if (isManaged(disks[1]) //false
      break;

    i = 2;
    if (A.src == disks[2].src) //false
      continue;

    if (isManaged(disks[2]) //false
      break;

    // Here, i == ndisks == 3;

Or am I missing something?


>> +        VIR_DEBUG("Disabling device mapper control");
>> +        ret = virCgroupDenyDevicePath(priv->cgroup,
>> +                                      DEVICE_MAPPER_CONTROL_PATH, perms, true);
> 
> You go through great lengths to ensure this is only done once converse
> to the qemuSetupImageCgroupInternal change...

Yes, because we can't deny helper the access if there's still another
disk that might be using it (note that only managed disks use helper
which runs in qemu's namespace/cgroup context). But we can allow it
multiple times.

> 
> I guess I would think that if we can determine at Prepare time that NS
> and cgroups would need to add /dev/mapper/control and we only want to do
> that config once, then we should be able
> 
> Yes, of course we then we potentially miss adding for hotplug. Honestly
> it seems more of a "global" to qemu_domain rather than "local" to a
> particular disk source even though it's needed by the local disk source,
> is it really something the disk source itself (or it's private data
> structure) should be managing?

Well, I view qemuSetupImageCgroup() and qemuTeardownImageCgroup() as
entry points to CGroup mgmt. Therefore it's less lines of code to do
decision there. Doing it anywhere else would be impractical IMO.

> 
>> +        virDomainAuditCgroupPath(vm, priv->cgroup, "deny",
>> +                                 DEVICE_MAPPER_CONTROL_PATH,
>> +                                 virCgroupGetDevicePermsString(perms), ret);
>> +        if (ret < 0)
>> +            return ret;
>> +    }
>> +
>> +
>>      VIR_DEBUG("Deny path %s", src->path);
>>  
>>      ret = virCgroupDenyDevicePath(priv->cgroup, src->path, perms, true);
>> diff --git a/src/util/virdevmapper.c b/src/util/virdevmapper.c
>> index d2c25af003..ef4b1e480a 100644
>> --- a/src/util/virdevmapper.c
>> +++ b/src/util/virdevmapper.c
>> @@ -101,8 +101,14 @@ virDevMapperGetTargetsImpl(const char *path,
>>  
>>      dm_task_no_open_count(dmt);
>>  
>> -    if (!dm_task_run(dmt))
>> +    if (!dm_task_run(dmt)) {
>> +        if (errno == ENXIO) {
>> +            /* In some cases devmapper realizes this late device
>> +             * is not managed by it. */
>> +            ret = 0;
>> +        }
> 
> Should this be separated? Or is it related to /dev/mapper/control?

No, this is exactly at the right spot in the correct patch. If
virDevMapperGetTargetsImpl() sees /dev/mapper/control the dm_task_run()
returns ENXIO. Speaking of which, this is not documented anywhere. I had
to dig through devmapper sources to find out this errno is in fact
returned by kernel.

Michal




More information about the libvir-list mailing list