[libvirt] [PATCH 2/4] qemu: process: detect if dimm aliases are broken on reconnect
John Ferlan
jferlan at redhat.com
Thu Nov 10 13:56:48 UTC 2016
On 11/10/2016 04:11 AM, Peter Krempa wrote:
> On Wed, Nov 09, 2016 at 18:40:28 -0500, John Ferlan wrote:
>>
>>
>> On 11/03/2016 02:12 AM, Peter Krempa wrote:
>>> Detect on reconnect to a running qemu VM whether the alias of a
>>> hotpluggable memory device (dimm) does not match the dimm slot number
>>> where it's connected to. This is necessary as qemu is actually
>>> considering the alias as machine ABI used to connect the backend object
>>> to the dimm device.
>>>
>>> This will require us to keep them consistent so that we can reliably
>>> restore them on migration. In some situations it was currently possible
>>> to create a mismatched configuration and qemu would refuse to restore
>>> the migration stream.
>>>
>>> To avoid breaking existing VMs we'll need to keep the old algorithm
>>> though.
>>> ---
>>> src/qemu/qemu_domain.h | 3 +++
>>> src/qemu/qemu_process.c | 25 +++++++++++++++++++++++++
>>> 2 files changed, 28 insertions(+)
>>>
>>> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
>>> index 2ee1829..1b7b375 100644
>>> --- a/src/qemu/qemu_domain.h
>>> +++ b/src/qemu/qemu_domain.h
>>> @@ -232,6 +232,9 @@ struct _qemuDomainObjPrivate {
>>> /* private XML) - need to restore at process reconnect */
>>> uint8_t *masterKey;
>>> size_t masterKeyLen;
>>> +
>>> + /* note whether memory device alias does not correspond to slot number */
>>> + bool memHotplugAliasMismatch;
>>> };
>>>
>>> # define QEMU_DOMAIN_PRIVATE(vm) \
>>> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
>>> index 1b67aee..15b8ec8 100644
>>> --- a/src/qemu/qemu_process.c
>>> +++ b/src/qemu/qemu_process.c
>>> @@ -3205,6 +3205,29 @@ qemuDomainPerfRestart(virDomainObjPtr vm)
>>> return 0;
>>> }
>>>
>>> +
>>> +static void
>>> +qemuProcessReconnectCheckMemoryHotplugMismatch(virDomainObjPtr vm)
>>> +{
>>> + size_t i;
>>> + int aliasidx;
>>> + virDomainDefPtr def = vm->def;
>>> + qemuDomainObjPrivatePtr priv = vm->privateData;
>>> +
>>> + if (!virDomainDefHasMemoryHotplug(def) || def->nmems == 0)
>>> + return;
>>> +
>>> + for (i = 0; i < def->nmems; i++) {
>>> + aliasidx = qemuDomainDeviceAliasIndex(&def->mems[i]->info, "dimm");
>>
>> When/how does the info.alias get filled in during restart processing?
>
> The live definition along with aliases is saved to the status XML and
> then reloaded from the disk.
>
> Otherwise if we'd not remember the aliases the whole hotunplug code
> would not work.
>
face-palm - I knew I was missing something obvious, but the brain just
wasn't able to recognize it, <sigh>.
Still consider the changed XML example from patch1 (without any
hotplug), we have:
<address type='dimm' slot='0' ...> => alias == "dimm0"
<address type='dimm' slot='2' ...> => alias == "dimm1"
so the bool could be "memAliasOrderMismatch" or "memAliasUnordered".
Since it's not necessarily "Hotplug" related, I think changing the
variable and function name should be done, but it's not a requirement
for the ACK since it's understandable why Hotplug is used.
John
FWIW:
One other oddball path that "might" disrupt things is the
ignore_value(qemuDomainUpdateMemoryDeviceInfo(...) hotplug code where
we're not "guaranteed" that the dimm.slot is filled in...
>> I see it being defined during qemuAssignDeviceMemoryAlias and
>> qemuAssignDeviceAliases. The former is only called in hotplug processing
>> and the latter during qemuProcessPrepareDomain (domain startup). So I
>> think the answer is we return -1 always here, but could be proved wrong.
>
> The aliases are reloaded so they are valid. Re-assigning them would
> break all hotunplug if you generate a different alias. It works this way
> in other parts for quite a long time now.
>
>> Thus, I think this is doomed from the start. I also wonder how existing
>> code is affected since it's based on getting alias index - which
>
> As most other code that needs aliases after restart ...
>
>> wouldn't be defined, thus wouldn't a hotplug after libvirtd restart
>> result in "dimm0"?
>
> Nope. Both paragraphs don't make sense due to the fact above.
>
>> The good news is we do have a way to fetch/return a 'meminfo' from
>> qemuMonitorGetMemoryDeviceInfo which would be a hash table indexed by
>> the ID's provided at startup. Thus we'd just need a mechanism to match
>> 'mems' with each element of the returned meminfo hash table and
>> "generate"/assign/steal the alias from that to mems.
>
> No need. It's saved by libvirt.
>
>> At the very least - whatever we set will be whatever we created or was
>> hotplugged. It won't be stored in the config XML (yet), but it would
>> seemingly be bug for bug compatible.
>
> No it's explicitly stored into the live XML. The address and slot number
> are necessary to ensure migration compatibility.
>
>> This way - we shouldn't need all of patch4, I think... Or at least we
>
> Patch 4 is necessary as QEMU actually makes the alias part of the qemu
> migration stream. This effectively makes the alias part of the machine
> ABI. If the aliases don't match on migration qemu rejects it.
>
>> wouldn't need the memHotplugAliasMismatch logic. Forcing the alias ID to
>> match the slot would probably still be good goal - still doesn't mean
>
> That is a necessary goal. The whole purpose of this series!
>
>> the mems list is ordered 0..def->mem.memory_slots. You could have
>> 0,4,2,1,3...
>
> Exactly. Due do the fact above the alias needs to be tied to the slot
> number rather than the order.
>
> This patch is to make sure that if we employed the wrong alias alocation
> scheme the code will not refuse to hotplug memory into a existing VM
> that has mismatched slots and aliases.
>
More information about the libvir-list
mailing list