[libvirt] running Libvirt from source code, IPC_LOCK and VFIO

Daniel Henrique Barboza danielhb413 at gmail.com
Mon Feb 4 18:26:41 UTC 2019


Hey Erik,


On 2/4/19 8:11 AM, Erik Skultety wrote:
> On Fri, Feb 01, 2019 at 07:40:36PM -0200, Daniel Henrique Barboza wrote:
>> Update: I've figured it out.
>>
>> The bug here was that, even running as root, I was getting errors like:
>>
>> error : virQEMUCapsNewForBinaryInternal:4687 : internal error: Failed to
>> probe QEMU binary with
>> QMP: libvirt:  error : prctl failed to enable 'dac_override' in the AMBIENT
>> set:
>> Operation not permitted
> Being responsible for the latest changes wrt to capabilities, this error itself
> is very strange because the prctl man page says the following about EPERM errno:
>
> "option is PR_CAP_AMBIENT and arg2 is PR_CAP_AMBIENT_RAISE, but either the
> capability specified in arg3 is not present in the process's permitted and
> inheritable capability sets, or the PR_CAP_AMBIENT_LOWER securebit has been
> set."
>
> So I'm wondering how can that be since that prctl call happens after we applied
> the capabilities we want with capng_apply. Just out of curiosity, what happens
> if you move the whole PR_CAP_AMBIENT at the very end of virSetUIDGIDWithCaps
> function? Does it change anything?

Moving the code as you suggested got rid of the internal error:


--- a/src/util/virutil.c
+++ b/src/util/virutil.c
@@ -1587,27 +1587,6 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
          goto cleanup;
      }

-# ifdef PR_CAP_AMBIENT
-    /* we couldn't do this in the loop earlier above, because the capabilities
-     * were not applied yet, since in order to add a capability into the AMBIENT
-     * set, it has to be present in both the PERMITTED and INHERITABLE sets
-     * (capabilities(7))
-     */
-    for (i = 0; i <= CAP_LAST_CAP; i++) {
-        capstr = capng_capability_to_name(i);
-
-        if (capBits & (1ULL << i)) {
-            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
-                virReportSystemError(errno,
-                                     _("prctl failed to enable '%s' in the "
-                                       "AMBIENT set"),
-                                     capstr);
-                goto cleanup;
-            }
-        }
-    }
-# endif
-
      /* Set bounding set while we have CAP_SETPCAP.  Unfortunately we cannot
       * do this if we failed to get the capability above, so ignore the
       * return value.
@@ -1630,6 +1609,27 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
          goto cleanup;
      }

+# ifdef PR_CAP_AMBIENT
+    /* we couldn't do this in the loop earlier above, because the capabilities
+     * were not applied yet, since in order to add a capability into the AMBIENT
+     * set, it has to be present in both the PERMITTED and INHERITABLE sets
+     * (capabilities(7))
+     */
+    for (i = 0; i <= CAP_LAST_CAP; i++) {
+        capstr = capng_capability_to_name(i);
+
+        if (capBits & (1ULL << i)) {
+            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
+                virReportSystemError(errno,
+                                     _("prctl failed to enable '%s' in the "
+                                       "AMBIENT set"),
+                                     capstr);
+                goto cleanup;
+            }
+        }
+    }
+# endif
+




However, the following change still doesn't add IPC_LOCK as a capability:


index 0d58f1ee57..f4b46abc08 100644
--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -4525,6 +4525,9 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
      /* QEMU might run into permission issues, e.g. /dev/sev (0600), override
       * them just for the purpose of probing */
      virCommandAllowCap(cmd->cmd, CAP_DAC_OVERRIDE);
+    virCommandAllowCap(cmd->cmd, CAP_IPC_LOCK);
+    virCommandAllowCap(cmd->cmd, CAP_IPC_OWNER);
+
  #endif



So I am not sure whether my change above is wrong, or whether moving the
PR_CAP_AMBIENT code as you suggested made the error go away only because
the capabilities are no longer being set at all. I'll investigate it more.
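
One way to tell the two cases apart would be to look at the capability
masks the kernel reports for the spawned process in /proc/<pid>/status
(the CapInh, CapPrm, CapEff and CapAmb lines). Below is a minimal sketch
of such a check. The helper names (read_proc_status, status_cap_mask,
cap_in_mask) and the CAPBIT_* macros are made up for this example and
are not part of Libvirt; the bit numbers are the ones from
<linux/capability.h>. The probing QEMU is short-lived, so this assumes
the pid can be caught while the process is still running.

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Capability numbers as defined in <linux/capability.h>.
 * The CAPBIT_* names are hypothetical, used only in this sketch. */
#define CAPBIT_DAC_OVERRIDE 1
#define CAPBIT_IPC_LOCK     14
#define CAPBIT_IPC_OWNER    15

/* Read /proc/<pid>/status into buf (NUL-terminated, possibly truncated).
 * Returns 0 on success, -1 if the file cannot be opened. */
int
read_proc_status(long pid, char *buf, size_t size)
{
    char path[64];
    FILE *fp;
    size_t n;

    snprintf(path, sizeof(path), "/proc/%ld/status", pid);
    if (!(fp = fopen(path, "r")))
        return -1;
    n = fread(buf, 1, size - 1, fp);
    buf[n] = '\0';
    fclose(fp);
    return 0;
}

/* Look for a line such as "CapAmb:\t0000000000004002" in the status text
 * and store the hex mask in *mask.
 * Returns 0 on success, -1 if the field is missing or has no hex value. */
int
status_cap_mask(const char *status, const char *field,
                unsigned long long *mask)
{
    size_t len = strlen(field);
    const char *line = status;

    while (line && *line) {
        if (strncmp(line, field, len) == 0 && line[len] == ':') {
            const char *p = line + len + 1;

            while (*p == ' ' || *p == '\t')
                p++;
            if (!isxdigit((unsigned char)*p))
                return -1;
            *mask = strtoull(p, NULL, 16);
            return 0;
        }
        /* advance to the start of the next line, if any */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}

/* Return 1 if capability number 'cap' is set in 'mask', 0 otherwise. */
int
cap_in_mask(unsigned long long mask, int cap)
{
    return (int)((mask >> cap) & 1ULL);
}
```

Reading the status of the QEMU pid with read_proc_status(), then calling
status_cap_mask() for "CapAmb" and "CapEff" and testing CAPBIT_IPC_LOCK
with cap_in_mask(), should show whether ipc_lock made it into the
process or was dropped along the way.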



DHB


>
> Thanks,
> Erik
>
>> The reason is that the host has libcap-ng installed. ./configure uses it if
>> available,
>> setting WITH_CAPNG in the code. I am unsure if this has something to do with
>> the libcap-ng configuration in this system I'm using or if there is
>> something
>> missing in the Libvirt code, but the spawned QEMU process isn't inheriting
>> the
>> capabilities it should have.
>>
>> Disabling support of this lib with "--with-capng=no" in autogen.sh and
>> rebuilding Libvirt fixed the problem. I was even able to see more NUMA
>> nodes than I was before using the system libvirt (which is the original
>> bug I am/was investigating).
>>
>>
>> Thanks!
>>
>>
>>
>>
>>
>> On 2/1/19 4:04 PM, Daniel Henrique Barboza wrote:
>>> Hi,
>>>
>>> I'm facing a strange behavior when running Libvirt from source code,
>>> latest upstream, on an Ubuntu 18.04.1 LTS Power 9 server. My QEMU
>>> guest - which is using VFIO and GPU passthrough - breaks on boot when
>>> trying to allocate a DMA window inside KVM.
>>>
>>> Debugging the code, I've found out that the problem is related to the
>>> process
>>> not having CAP_IPC_LOCK - at least from the host kernel perspective.
>>>
>>> This is strange because:
>>>
>>> - the same VM running directly from QEMU command line works
>>> - the same VM running in the system Libvirt (v4.0.0, Ubuntu version)
>>> also works
>>>
>>> What am I missing? My understanding on Linux process is that a process
>>> running as root should inherit the same capabilities of the user, which
>>> includes
>>> CAP_IPC_LOCK. Running Libvirt from source code should grant ipc_lock
>>> to it ... right?
>>>
>>>
>>>
>>> Any help is appreciated. I can provide more details (VM XML for example)
>>> if necessary.
>>>
>>>
>>> Thanks!
>> --
>> libvir-list mailing list
>> libvir-list at redhat.com
>> https://www.redhat.com/mailman/listinfo/libvir-list



