[libvirt] [PATCH 2/4] qemu: process: detect if dimm aliases are broken on reconnect

John Ferlan jferlan at redhat.com
Wed Nov 9 23:40:28 UTC 2016



On 11/03/2016 02:12 AM, Peter Krempa wrote:
> Detect on reconnect to a running qemu VM whether the alias of a
> hotpluggable memory device (dimm) does not match the dimm slot number
> where it's connected to. This is necessary as qemu is actually
> considering the alias as machine ABI used to connect the backend object
> to the dimm device.
> 
> This will require us to keep them consistent so that we can reliably
> restore them on migration. In some situations it was currently possible
> to create a mismatched configuration and qemu would refuse to restore
> the migration stream.
> 
> To avoid breaking existing VMs we'll need to keep the old algorithm
> though.
> ---
>  src/qemu/qemu_domain.h  |  3 +++
>  src/qemu/qemu_process.c | 25 +++++++++++++++++++++++++
>  2 files changed, 28 insertions(+)
> 
> diff --git a/src/qemu/qemu_domain.h b/src/qemu/qemu_domain.h
> index 2ee1829..1b7b375 100644
> --- a/src/qemu/qemu_domain.h
> +++ b/src/qemu/qemu_domain.h
> @@ -232,6 +232,9 @@ struct _qemuDomainObjPrivate {
>      /* private XML) - need to restore at process reconnect */
>      uint8_t *masterKey;
>      size_t masterKeyLen;
> +
> +    /* note whether memory device alias does not correspond to slot number */
> +    bool memHotplugAliasMismatch;
>  };
> 
>  # define QEMU_DOMAIN_PRIVATE(vm)	\
> diff --git a/src/qemu/qemu_process.c b/src/qemu/qemu_process.c
> index 1b67aee..15b8ec8 100644
> --- a/src/qemu/qemu_process.c
> +++ b/src/qemu/qemu_process.c
> @@ -3205,6 +3205,29 @@ qemuDomainPerfRestart(virDomainObjPtr vm)
>      return 0;
>  }
> 
> +
> +static void
> +qemuProcessReconnectCheckMemoryHotplugMismatch(virDomainObjPtr vm)
> +{
> +    size_t i;
> +    int aliasidx;
> +    virDomainDefPtr def = vm->def;
> +    qemuDomainObjPrivatePtr priv = vm->privateData;
> +
> +    if (!virDomainDefHasMemoryHotplug(def) || def->nmems == 0)
> +        return;
> +
> +    for (i = 0; i < def->nmems; i++) {
> +        aliasidx = qemuDomainDeviceAliasIndex(&def->mems[i]->info, "dimm");

When/how does the info.alias get filled in during restart processing?

I see it being defined during qemuAssignDeviceMemoryAlias and
qemuAssignDeviceAliases. The former is only called in hotplug processing
and the latter during qemuProcessPrepareDomain (domain startup). So I
think the answer is we return -1 always here, but could be proved wrong.

Thus, I think this is doomed from the start. I also wonder how existing
code is affected since it's based on getting alias index - which
wouldn't be defined, thus wouldn't a hotplug after libvirtd restart
result in "dimm0"?

The good news is we do have a way to fetch/return a 'meminfo' from
qemuMonitorGetMemoryDeviceInfo which would be a hash table indexed by
the ID's provided at startup. Thus we'd just need a mechanism to match
'mems' with each element of the returned meminfo hash table and
"generate"/assign/steal the alias from that to mems.

At the very least - whatever we set will be whatever we created or was
hotplugged. It won't be stored in the config XML (yet), but it would
seemingly be bug for bug compatible.

This way - we shouldn't need all of patch4, I think... Or at least we
wouldn't need the memHotplugAliasMismatch logic. Forcing the alias ID to
match the slot would probably still be good goal - still doesn't mean
the mems list is ordered 0..def->mem.memory_slots. You could have
0,4,2,1,3...

It's late... been a long day... hopefully this all makes sense.

John

> +
> +        if (def->mems[i]->info.addr.dimm.slot != aliasidx) {
> +            priv->memHotplugAliasMismatch = true;
> +            break;
> +        }
> +    }
> +}
> +
> +
>  struct qemuProcessReconnectData {
>      virConnectPtr conn;
>      virQEMUDriverPtr driver;
> @@ -3389,6 +3412,8 @@ qemuProcessReconnect(void *opaque)
>      if (qemuProcessUpdateDevices(driver, obj) < 0)
>          goto error;
> 
> +    qemuProcessReconnectCheckMemoryHotplugMismatch(obj);
> +
>      /* Failure to connect to agent shouldn't be fatal */
>      if ((ret = qemuConnectAgent(driver, obj)) < 0) {
>          if (ret == -2)
> 




More information about the libvir-list mailing list