[libvirt] [PATCH] qemu: Do not reattach PCI device used by other domain when shutdown

Osier Yang jyang at redhat.com
Thu Oct 13 00:53:28 UTC 2011


于 2011年10月13日 00:41, Eric Blake 写道:
> On 09/27/2011 12:53 AM, Osier Yang wrote:
>
> Apologies on the delayed review. This is some hairy code, and I want 
> to make sure we get it right, so I kind of shelved it knowing it would 
> be a longer review.
>
>> When failing on starting a domain, it tries to reattach all the PCI
>> devices defined in the domain conf, regardless of whether the devices
>> are still used by other domain. This will cause the devices are deleted
>
> s/are deleted/to be deleted/
>
>> from the list qemu_driver->activePciHostdevs, thus the devices will be
>> thought as usable even if it's not true. And following commands
>> nodedev-{reattach,reset} will be successful.
>>
>> How to reproduce:
>> 1) Define two domains with same PCI device defined in the confs.
>> 2) # virsh start domain1
>> 3) # virsh start domain2
>> 4) # virsh nodedev-reattach $pci_device
>>
>> You will see the device will be reattached to host successfully.
>> As pciDeviceReattach just check if the device is still used by
>> other domain via checking if the device is in list 
>> driver->activePciHostdevs,
>> however, the device is deleted from the list by step 2).
>
> Ouch, and definitely needs patching.
>
>>
>> This patch is to prohibit the bug by:
>> 1) Prohibit a domain starting or device attachment right at
>> preparation period (qemuPrepareHostdevPCIDevices) if the
>> device is in list driver->activePciHostdevs, which means
>> it's used by other domain.
>
> Off-hand, I'm not sure if this is quite right - it is completely valid 
> to have two persistent domains both using the same hostdev, _as long 
> as_ you guarantee that at most one of those two domains is active at a 
> time. Is qemuPrepareHostdevPCIDevices called at define time (when the 
> persistent xml is stored) or at start time (when actually starting the 
> guest)?
>
> [/me goes and looks...]
>
> Yay! qemuPrepareHostdevPCIDevices is only called during hot-plug 
> (domain already active) and by qemuPrepareHostDevices(), which in turn 
> is only used in qemuProcessStart. So it does indeed look like this 
> patch merely prohibits starting domain2 if domain1 is running in your 
> above example, but does not prohibit both definitions from existing.

Yes, I checked this too when doing the patch.

>
>>
>> 2) Introduces a new field for struct _pciDevice, (char *used_by),
>> it will be set as the domain name at preparation period,
>> (qemuPrepareHostdevPCIDevices). Thus we can prohibit deleting
>> the device from driver->activePciHostdevs if it's still used by
>> other domain when stopping the domain process.
>>
>> * src/pci.h (define two internal functions, pciDeviceSetUsedBy and
>> pciDevceGetUsedBy)
>> * src/pci.c (new field "char *used_by" for struct _pciDevice,
>> implementations for the two new functions)
>> * src/libvirt_private.syms (Add the two new internal functions)
>> * src/qemu_hostdev.h (Modify the definition of functions
>> qemuPrepareHostdevPCIDevices, and qemuDomainReAttachHostdevDevices)
>> * src/qemu_hostdev.c (Prohibit preparation and don't delete the
>> device from activePciHostdevs list if it's still used by other domain)
>> * src/qemu_hotplug.c (Update function usage, as the definitions are
>> changed)
>> ---
>> src/libvirt_private.syms | 2 ++
>> src/qemu/qemu_hostdev.c | 31 ++++++++++++++++++++++++++++---
>> src/qemu/qemu_hostdev.h | 2 ++
>> src/qemu/qemu_hotplug.c | 4 ++--
>> src/util/pci.c | 22 ++++++++++++++++++++++
>> src/util/pci.h | 3 +++
>> 6 files changed, 59 insertions(+), 5 deletions(-)
>>
>> @@ -126,7 +127,10 @@ int qemuPrepareHostdevPCIDevices(struct 
>> qemud_driver *driver,
>> for (i = 0; i< pciDeviceListCount(pcidevs); i++) {
>> pciDevice *dev = pciDeviceListGet(pcidevs, i);
>> if (!pciDeviceIsAssignable(dev, !driver->relaxedACS))
>> - goto reattachdevs;
>> + goto cleanup;
>
> This doesn't seem right, when there are multiple devices.
>
> Pre-patch, suppose we had:
> dev1 - is assignable, is managed, and detach succeeds (nevermind the 
> spelling error in pciDettachDevice)
> dev2 - is not assignable
> goto reattachdevs
> dev1 - is managed, so it is reattached
>
> Now we have:
> dev1 - is assignable, is not active, and detach succeeds
> dev2 - is not assignable
> goto cleanup
>
> Oops - we left dev1 detached.
>
> I think you _have_ to break this into two separate loops (and adjust 
> the line earlier about 4 loops to now call out 5 loops!).
>
> Loop 1 - prove that all desired devices are assignable and not already 
> active
>
> Loop 2 - for all managed devices, detach them

Oops, I ignored in the loop there is also pciDeviceDetach. And yes,
breaking them into 2 loops is good.

>
>> +
>> + if (pciDeviceListFind(driver->activePciHostdevs, dev))
>> + goto cleanup;
>
> This is correct for loop 1, validating that the requested device is 
> not already in use by another active domain.
>
>>
>> if (pciDeviceGetManaged(dev)&&
>> pciDettachDevice(dev, driver->activePciHostdevs)< 0)
>> @@ -156,6 +160,14 @@ int qemuPrepareHostdevPCIDevices(struct 
>> qemud_driver *driver,
>> pciDeviceListSteal(pcidevs, dev);
>> }
>>
>> + /* Now set the used_by_domain of the device in 
>> driver->activePciHostdevs
>> + * as domain name.
>> + */
>> + for (i = 0; i< pciDeviceListCount(driver->activePciHostdevs); i++) {
>> + pciDevice * dev = pciDeviceListGet(driver->activePciHostdevs, i);
>
> Formatting: s/* dev/*dev/
>
> This iterates over all active pci devices, but it _should_ be 
> iterating only over the pci devices in use by the given domain. 
> Otherwise, you have:
>
> dom1 - takes dev1, dev1 in use by "dom1"
> dom2 - takes dev2, now dev1 and dev2 both in use by "dom2"

Ah, right. I was blind on this with testing on just one domain.

>
>> + pciDeviceSetUsedBy(dev, name);
>
> Ouch. pciDeviceSetUsedBy is malloc'ing memory, but you aren't checking 
> for failure. More on this below...

okay

>
>> @@ -277,6 +291,17 @@ void qemuDomainReAttachHostdevDevices(struct 
>> qemud_driver *driver,
>>
>> for (i = 0; i< pciDeviceListCount(pcidevs); i++) {
>> pciDevice *dev = pciDeviceListGet(pcidevs, i);
>> + pciDevice *activeDev = NULL;
>> +
>> + /* Never delete the dev from list driver->activePciHostdevs
>> + * if it's used by other domain.
>> + */
>> + activeDev = pciDeviceListFind(driver->activePciHostdevs, dev);
>> + if (activeDev&&
>> + (used_by = pciDeviceGetUsedBy(activeDev))&&
>> + STRNEQ(used_by, name))
>
> I would have simplified these two conditions to one line:
>
> STRNEQ(name, pciDeviceGetUsedBy(activeDev))

Okay, make sense.

>
>> +++ b/src/util/pci.c
>> @@ -62,6 +62,7 @@ struct _pciDevice {
>> char name[PCI_ADDR_LEN]; /* domain:bus:slot.function */
>> char id[PCI_ID_LEN]; /* product vendor */
>> char *path;
>> + char *used_by; /* The domain which uses the device */
>
> As promised above, why not make this 'const char *used_by', since a 
> domain can only own a device for as long as it is active, and as long 
> as the domain is active, then the domain's name will be in scope. 
> Then, you don't have to strdup() the name, and don't have to worry 
> about malloc failure. But you _do_ have to worry about setting the 
> name back to NULL after a domain goes inactive or the device is 
> hotplugged, to update the fact that no active domain owns the device.

Sounds good, making a v2.

>
>> int fd;
>>
>> unsigned initted;
>> @@ -1312,6 +1313,7 @@ pciGetDevice(unsigned domain,
>> dev->bus = bus;
>> dev->slot = slot;
>> dev->function = function;
>> + dev->used_by = NULL;
>>
>> if (snprintf(dev->name, sizeof(dev->name), "%.4x:%.2x:%.2x.%.1x",
>> dev->domain, dev->bus, dev->slot,
>> @@ -1374,6 +1376,7 @@ pciFreeDevice(pciDevice *dev)
>> VIR_DEBUG("%s %s: freeing", dev->id, dev->name);
>> pciCloseConfig(dev);
>> VIR_FREE(dev->path);
>> + VIR_FREE(dev->used_by);
>
> If you use my suggestion of pointing to the pre-existing domain name, 
> rather than strdup(), then you don't want this line.
>
>> VIR_FREE(dev);
>> }
>>
>> @@ -1387,6 +1390,25 @@ unsigned pciDeviceGetManaged(pciDevice *dev)
>> return dev->managed;
>> }
>>
>> +int
>> +pciDeviceSetUsedBy(pciDevice *dev, const char *name)
>> +{
>> + dev->used_by = strdup(name);
>> +
>> + if (!dev->used_by) {
>> + virReportOOMError();
>> + return -1;
>> + }
>> +
>> + return 0;
>
> and here, you could make the function return void (straight pointer 
> assignment, rather than strdup()).
>
> Needs a v2, but we should definitely get this patched.
>




More information about the libvir-list mailing list