[libvirt] running Libvirt from source code, IPC_LOCK and VFIO

Daniel Henrique Barboza danielhb413 at gmail.com
Mon Feb 4 22:44:21 UTC 2019


Hi Erik,

Just to let you know that the error I reported in one of my replies was
caused by a local change I forgot to undo. This error here:


error : virQEMUCapsNewForBinaryInternal:4687 : internal error: Failed to probe QEMU binary with QMP: libvirt:  error : prctl failed to enable 'dac_override' in the AMBIENT set: Operation not permitted


was happening because I had commented out this line in
qemu_capabilities.c:

--- a/src/qemu/qemu_capabilities.c
+++ b/src/qemu/qemu_capabilities.c
@@ -4519,7 +4519,7 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
                                      "-daemonize",
                                      NULL);
      virCommandAddEnvPassCommon(cmd->cmd);
-    virCommandClearCaps(cmd->cmd);
+   // virCommandClearCaps(cmd->cmd);

  #if WITH_CAPNG
      /* QEMU might run into permission issues, e.g. /dev/sev (0600), override


Thus there is no need to move the PR_CAP_AMBIENT block around to prevent the
error message. Sorry for any false alarm I might have raised there.
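
For anyone following the thread: the prctl(2) rule quoted further down (a
capability can only be raised into the ambient set if it is already in both
the permitted and inheritable sets) can be checked outside of libvirt with a
small standalone sketch. This is not libvirt code; the helper names
cap_permitted_and_inheritable() and try_raise_ambient() are made up for
illustration, and it assumes a Linux host with the kernel uapi headers
available.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <unistd.h>
#include <sys/prctl.h>
#include <sys/syscall.h>
#include <linux/capability.h>

/* Fallbacks for older userspace headers; values are from linux/prctl.h. */
#ifndef PR_CAP_AMBIENT
# define PR_CAP_AMBIENT 47
# define PR_CAP_AMBIENT_RAISE 2
#endif

/* Hypothetical helper: return 1 if 'cap' is in both the permitted and the
 * inheritable set of the calling thread, 0 if not, -1 if capget() failed. */
static int cap_permitted_and_inheritable(int cap)
{
    struct __user_cap_header_struct hdr;
    struct __user_cap_data_struct data[_LINUX_CAPABILITY_U32S_3];

    assert(cap >= 0 && cap <= CAP_LAST_CAP);

    memset(&hdr, 0, sizeof(hdr));
    memset(data, 0, sizeof(data));
    hdr.version = _LINUX_CAPABILITY_VERSION_3;
    hdr.pid = 0;                        /* 0 means the calling thread */

    if (syscall(SYS_capget, &hdr, data) < 0)
        return -1;

    return (data[CAP_TO_INDEX(cap)].permitted & CAP_TO_MASK(cap)) &&
           (data[CAP_TO_INDEX(cap)].inheritable & CAP_TO_MASK(cap));
}

/* Hypothetical helper: try to raise 'cap' into the ambient set.
 * Returns 0 on success, or the errno value set by prctl() on failure. */
static int try_raise_ambient(int cap)
{
    if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE,
              (unsigned long)cap, 0UL, 0UL) < 0)
        return errno;
    return 0;
}
```

Calling both helpers with CAP_DAC_OVERRIDE from a process whose inheritable
set does not contain it should make try_raise_ambient() return EPERM, which
would match the "Operation not permitted" in the log above.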


I'm still experiencing the issue with IPC_LOCK inside the guest, though.
I'll update here when I have concrete findings about it.
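
To check from the host whether a given process really holds CAP_IPC_LOCK,
one option is to read the CapEff line of /proc/<pid>/status and test the
bit. A minimal sketch, assuming a Linux /proc filesystem; the function names
are hypothetical and not part of libvirt or QEMU.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <linux/capability.h>   /* CAP_IPC_LOCK and friends */

/* Parse a "CapEff:\t<hex>" line as found in /proc/<pid>/status.
 * Returns 0 and stores the mask on success, -1 if the line does not match. */
static int parse_capeff_line(const char *line, unsigned long long *mask)
{
    static const char prefix[] = "CapEff:";
    char *end;

    if (strncmp(line, prefix, sizeof(prefix) - 1) != 0)
        return -1;
    line += sizeof(prefix) - 1;
    *mask = strtoull(line, &end, 16);   /* skips the leading tab */
    return end == line ? -1 : 0;
}

/* Return 1 if capability number 'cap' is set in 'mask', else 0. */
static int mask_has_cap(unsigned long long mask, int cap)
{
    assert(cap >= 0 && cap < 64);
    return (int)((mask >> cap) & 1ULL);
}

/* Return 1 or 0 depending on whether 'pid' has 'cap' in its effective set,
 * or -1 if /proc/<pid>/status could not be read or parsed. */
static int pid_has_effective_cap(pid_t pid, int cap)
{
    char path[64], line[256];
    unsigned long long mask;
    int ret = -1;
    FILE *fp;

    snprintf(path, sizeof(path), "/proc/%ld/status", (long)pid);
    if (!(fp = fopen(path, "r")))
        return -1;

    while (fgets(line, sizeof(line), fp)) {
        if (parse_capeff_line(line, &mask) == 0) {
            ret = mask_has_cap(mask, cap);
            break;
        }
    }
    fclose(fp);
    return ret;
}
```

Comparing pid_has_effective_cap(pid, CAP_IPC_LOCK) for the QEMU process
spawned by libvirt against the one started by hand would show whether the
capability is present in one case and missing in the other.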


Thanks,

DHB

On 2/4/19 4:26 PM, Daniel Henrique Barboza wrote:
> Hey Erik,
>
>
> On 2/4/19 8:11 AM, Erik Skultety wrote:
>> On Fri, Feb 01, 2019 at 07:40:36PM -0200, Daniel Henrique Barboza wrote:
>>> Update: I've figured it out.
>>>
>>> The bug here was that, even running as root, I was getting errors like:
>>>
>>> error : virQEMUCapsNewForBinaryInternal:4687 : internal error: Failed to probe QEMU binary with QMP: libvirt:  error : prctl failed to enable 'dac_override' in the AMBIENT set: Operation not permitted
>> Being responsible for the latest changes with respect to capabilities,
>> I find this error very strange, because the prctl man page says the
>> following about the EPERM errno:
>>
>> "option is PR_CAP_AMBIENT and arg2 is PR_CAP_AMBIENT_RAISE, but either
>> the capability specified in arg3 is not present in the process's
>> permitted and inheritable capability sets, or the PR_CAP_AMBIENT_LOWER
>> securebit has been set."
>>
>> So I'm wondering how that can be, since that prctl call happens after we
>> applied the capabilities we want with capng_apply. Just out of curiosity,
>> what happens if you move the whole PR_CAP_AMBIENT block to the very end
>> of the virSetUIDGIDWithCaps function? Does it change anything?
>
> Moving the code as you suggested got rid of the internal error:
>
>
> --- a/src/util/virutil.c
> +++ b/src/util/virutil.c
> @@ -1587,27 +1587,6 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
>          goto cleanup;
>      }
>
> -# ifdef PR_CAP_AMBIENT
> -    /* we couldn't do this in the loop earlier above, because the capabilities
> -     * were not applied yet, since in order to add a capability into the AMBIENT
> -     * set, it has to be present in both the PERMITTED and INHERITABLE sets
> -     * (capabilities(7))
> -     */
> -    for (i = 0; i <= CAP_LAST_CAP; i++) {
> -        capstr = capng_capability_to_name(i);
> -
> -        if (capBits & (1ULL << i)) {
> -            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
> -                virReportSystemError(errno,
> -                                     _("prctl failed to enable '%s' in the "
> -                                       "AMBIENT set"),
> -                                     capstr);
> -                goto cleanup;
> -            }
> -        }
> -    }
> -# endif
> -
>      /* Set bounding set while we have CAP_SETPCAP.  Unfortunately we cannot
>       * do this if we failed to get the capability above, so ignore the
>       * return value.
> @@ -1630,6 +1609,27 @@ virSetUIDGIDWithCaps(uid_t uid, gid_t gid, gid_t *groups, int ngroups,
>          goto cleanup;
>      }
>
> +# ifdef PR_CAP_AMBIENT
> +    /* we couldn't do this in the loop earlier above, because the capabilities
> +     * were not applied yet, since in order to add a capability into the AMBIENT
> +     * set, it has to be present in both the PERMITTED and INHERITABLE sets
> +     * (capabilities(7))
> +     */
> +    for (i = 0; i <= CAP_LAST_CAP; i++) {
> +        capstr = capng_capability_to_name(i);
> +
> +        if (capBits & (1ULL << i)) {
> +            if (prctl(PR_CAP_AMBIENT, PR_CAP_AMBIENT_RAISE, i, 0, 0) < 0) {
> +                virReportSystemError(errno,
> +                                     _("prctl failed to enable '%s' in the "
> +                                       "AMBIENT set"),
> +                                     capstr);
> +                goto cleanup;
> +            }
> +        }
> +    }
> +# endif
> +
>
>
>
>
> However, this code still doesn't add IPC_LOCK as a capability:
>
>
> index 0d58f1ee57..f4b46abc08 100644
> --- a/src/qemu/qemu_capabilities.c
> +++ b/src/qemu/qemu_capabilities.c
> @@ -4525,6 +4525,9 @@ virQEMUCapsInitQMPCommandRun(virQEMUCapsInitQMPCommandPtr cmd,
>      /* QEMU might run into permission issues, e.g. /dev/sev (0600), override
>       * them just for the purpose of probing */
>      virCommandAllowCap(cmd->cmd, CAP_DAC_OVERRIDE);
> +    virCommandAllowCap(cmd->cmd, CAP_IPC_LOCK);
> +    virCommandAllowCap(cmd->cmd, CAP_IPC_OWNER);
> +
>  #endif
>
>
>
> So I am not sure whether my modification above is wrong, or whether your
> suggestion of moving the PR_CAP_AMBIENT code made the error go away
> without setting the capabilities at all. I'll investigate it more.
>
>
>
> DHB
>
>
>>
>> Thanks,
>> Erik
>>
>>> The reason is that the host has libcap-ng installed. ./configure uses
>>> it if available, setting WITH_CAPNG in the code. I am unsure whether
>>> this has something to do with the libcap-ng configuration on the system
>>> I'm using or whether something is missing in the Libvirt code, but the
>>> spawned QEMU process isn't inheriting the capabilities it should have.
>>>
>>> Disabling support for this lib with "--with-capng=no" in autogen.sh and
>>> rebuilding Libvirt fixed the problem. I was even able to see more NUMA
>>> nodes than I could before with the system libvirt (which is the
>>> original bug I am/was investigating).
>>>
>>>
>>> Thanks!
>>>
>>>
>>>
>>>
>>>
>>> On 2/1/19 4:04 PM, Daniel Henrique Barboza wrote:
>>>> Hi,
>>>>
>>>> I'm facing a strange behavior when running Libvirt from source code,
>>>> latest upstream, on an Ubuntu 18.04.1 LTS Power 9 server. My QEMU
>>>> guest - which is using VFIO and GPU passthrough - breaks on boot when
>>>> trying to allocate a DMA window inside KVM.
>>>>
>>>> Debugging the code, I've found out that the problem is related to the
>>>> process not having CAP_IPC_LOCK - at least from the host kernel
>>>> perspective.
>>>>
>>>> This is strange because:
>>>>
>>>> - the same VM running directly from QEMU command line works
>>>> - the same VM running in the system Libvirt (v4.0.0, Ubuntu version)
>>>> also works
>>>>
>>>> What am I missing? My understanding of Linux processes is that a
>>>> process running as root should inherit the capabilities of that user,
>>>> which include CAP_IPC_LOCK. Running Libvirt from source code should
>>>> grant ipc_lock to it ... right?
>>>>
>>>>
>>>>
>>>> Any help is appreciated. I can provide more details (VM XML for
>>>> example) if necessary.
>>>>
>>>>
>>>> Thanks!
>>> -- 
>>> libvir-list mailing list
>>> libvir-list at redhat.com
>>> https://www.redhat.com/mailman/listinfo/libvir-list
>
