[libvirt PATCH] qemu: Allow sockets in long or deep paths.

Laine Stump laine at redhat.com
Wed Apr 19 02:13:50 UTC 2023


On 4/18/23 1:18 PM, Nick Guenther wrote:
> April 18, 2023 3:37 AM, "Peter Krempa" <pkrempa at redhat.com> wrote:
> 
>> cases of code style not being aligned from what libvirt does normally ...
> 
> I'm very happy to conform my style as needed. I just want my users to be able to use libvirt (if they can't I'll teach them to use qemu directly I suppose but that's annoying). I looked for style guidelines in https://libvirt.org/hacking.html but I didn't find them. The style robot did nag me at https://gitlab.com/kousu/libvirt/-/pipelines/840365326 though, so I will try my best to learn from it!
> 
>> This will not work in a multi-threaded process, which can potentially
>> call this function multiple times at the same time for starting multiple
>> VMs. One of the threads changes the directory of the process, second thread
>> changes it again and then one of the threads creates the socket in the
>> wrong directory.
> 
> I didn't think about multithreading. I'll look around the codebase to learn what locking API you're using.
> 
>> Also any failure here doesn't restore the directory.
> 
> Ah true, that's obvious in retrospect. My python is showing. I'll fix that, but maybe after discussing the better ideas below.
> 
> 
> April 18, 2023 4:12 AM, "Daniel P. Berrangé" <berrange at redhat.com> wrote:
> 
>> On Tue, Apr 18, 2023 at 02:59:26AM -0400, Nick Guenther wrote:
>>
>>> The qemu driver creates IPC sockets using absolute paths,
>>> but under POSIX socket paths are constrained pretty tightly.
>>> On systems with homedirs on an unusual mount point, like
>>> network homedirs, or just particularly long usernames, this
>>> could make starting VMs under qemu:///session impossible.
>>>
>>> Resolves https://gitlab.com/libvirt/libvirt/-/issues/466
>>
>> Example path from that bug
>>
>> /home/WAVES.EXAMPLE.ORG/u123456/.config/libvirt/qemu/channel/target/domain-11-alpinelinux3.14/org.qe
>> u.guest_agent.0
>>
>> IMHO the key problem here is not your $HOME location, but rather
>> libvirt being ridiculously verbose in the nested dir structuion
>>
>> Looking at the pieces
>>
>> * /home/WAVES.EXAMPLE.ORG/u123456 -> 31 chars
>>
>> $HOME, we can't change this
>>
>> * /.config/libvirt/qemu -> 21 chars
>>
>> Libvirt's config directory scope to the driver, we don't
>> want to change this
>>
>> * /channel/target/domain-11-alpinelinux3.14 => 42 chars
>>
>> Arbitrarily inventing nesting for the sockets, we can
>> change at will
>>
>> * /org.qemu.guest_agent.0 -> 23 chars
>>
>> Either user specified, or matches the port name, ideally
>> don't change this
>>
>> So we've got 31 + 21 + 23 == 75 chars we don't want to /can't
>> change, but that leaves 33 to play with
>>
>> I feel like having 'domain-' in the path is redudant as
>> everything under /channel is for a domain. Having 'target'
>> also feels redundant to me.
>>
>> We could use '/channel/tgt-11-alpinelinux3.14' == 31 chars
>> or just /channel/11-alpinelinux3.14 == 27 chars, which
>> gives extra scope for a longer $HOME.
>>
> 
> I agree that these paths are a bit excessive and reducing them would help. I'm all for that.
> But any patch that fiddles with lengths while still using $HOME is not a long-term solution,
> a sysadmin can and someone always will set a longer home directory. And in those cases the
> end-user is out of luck, because they can't control where their home directory is.
> 
> Plus the user could also replace ".config" by setting XDG_CONFIG_DIR,

True, but to be fair XDG_CONFIG_DIR (which is retrieved vit 
virGetUserConfigDirectory() --> g_get_user_config_dir()) shouldn't have 
been used for the user agent socket in the first place; should have been 
XDG_RUNTIME_DIR (virGetRuntimeDirectory() --> g_get_user_runtime_dir()) 
instead.

Also, if you change XDG_CONFIG_DIR, btw, then any other path based on 
virGetUserConfigDirectory() (see src/*/*_conf.c) also will change. 
Hopefully there's nothing in that path that's actual config, and needs 
to be there and not in /var/run/user/$UID/libvirt/* (I think not, but 
I'm just thinking of as many potential pitfalls as possible, since 
that's what I'm best at :-))


> lengthen the port name
> (org.qemu.guest_agent.0) by specifying it, or the  user-specified and the name of the domain-specific subfolder (domain-11-alpinelinux3.14) is user-specified (partly: libvirt clips it to a maximum length, experimentally, 30?).
> These are more nuisances than outright bugs because, being user-controlled, the user could, in theory, figure out how to shorten them if they wanted to use libvirt. But it would be nicer if we didn't make them fight libvirt.
> 
> 
> Instead of fighting with $HOME, what about using /run? That's where sockets usually go. ~/.config is a strange place to be sticking live sockets -- they aren't configuration at all.
> 
> If we moved everything to /run/user/$UID/libvirt/qemu/ then as $UID is a 32 bit integer it's <= 10 decimal digits which makes this path length <= 34

Note that changing the location of any runtime file/socket of libvirt 
will likely lead to problems during a libvirt upgrade, unless the path 
for every one of those objects is stored in the domain status XML - if 
libvirt gets updated to use a different path while there are active 
domains (or networks), then the new libvirt binary will attempt to use 
the new path assuming the file/socket/whatever is in the new location, 
but it will still be in the old location. In order to change the path 
for any runtime object without requiring a total shutdown for an upgrade 
(which is something that we *never* want to do), we also need to store 
the full path for every affected file/socket in the 
domain/network(/storage?/other?) status XML.

> 
> Experimentally the "/channel/target/domain-11-alpinelinux3.14" part is clipped to 36 chars (maybe 38? I haven't tested what happens with hundreds of VMs, presumably they get named "domain-2424-alpinelinux3.14")
> 
> And as you said "org.qemu.guest_agent.0" is always 23.
> 
> So in total that's <= 93 chars. That's still tight, pretty close to the 107 limit (or 104 on macOS), but feasible.
> 
> 
> In fact libvirt is already using this folder for tracking qemu's PIDs:
> 
> /run/user/703204575/libvirt/qemu
> └── run
>      ├── abcdefg.pid
>      ├── abcdefg.xml
>      ├── abcdef.pid
>      ├── abcdef.xml
>      ├── abcde.pid
>      ├── abcde.xml
>      ├── abcd.pid
>      ├── abcd.xml
>      ├── a.pid
>      ├── autostarted
>      ├── a.xml
>      ├── dbus
>      ├── driver.pid
>      └── slirp
> 
> 3 directories, 12 files
> 
> We could move the rest in there and fix the bug. I'd also like to incorporate your suggestion of dropping the "target" subfolder, and maybe picking either just `domain-$N` or `$name` instead of `domain-$N-$name`.

Once we overcome the upgrade problem, I think there's a *lot* of excess 
in many paths. For example, cfg->stateDir is 
"/var/run/user/$UID/qemu/run" - the 2nd "run" is completely superfluous. 
Also, cfg->slirpDir is /var/run/user/$UID/qemu/run/slirp, and 
passtStateDir and dbusStateDir are similarly repetitive (and the 
associated files/sockets for each of those all have slirp/passt/dbus in 
the name anyway, so *two* levels of the path are repetitively redundant).
But again, it's not as simple as just changing the path in the code - we 
need to make sure that when upgrading a live system from the "old paths" 
to the "new paths" that everything works properly without requiring 
either the host or the guests to be restarted.



More information about the libvir-list mailing list