[edk2-devel] A problem with live migration of UEFI virtual machines

Andrew Fish via Groups.Io afish=apple.com at groups.io
Fri Feb 28 04:04:00 UTC 2020



> On Feb 26, 2020, at 1:42 AM, Laszlo Ersek <lersek at redhat.com> wrote:
> 
> Hi Andrew,
> 
> On 02/25/20 22:35, Andrew Fish wrote:
> 
>> Laszlo,
>> 
>> The FLASH offsets changing breaking things makes sense.
>> 
>> I now realize this is like updating the EFI ROM without rebooting the
>> system.  Thus changes in how the new EFI code works is not the issue.
>> 
>> Is this migration event visible to the firmware? Traditionally the
>> NVRAM is a region in the FD so if you update the FD you have to skip
>> NVRAM region or save and restore it. Is that activity happening in
>> this case? Even if the ROM layout does not change how do you not lose
>> the contents of the NVRAM store when the live migration happens? Sorry
>> if this is a remedial question but I'm trying to learn how this
>> migration works.
> 
> With live migration, the running guest doesn't notice anything. This is
> a general requirement for live migration (regardless of UEFI or flash).
> 
> You are very correct to ask about "skipping" the NVRAM region. With the
> approach that OvmfPkg originally supported, live migration would simply
> be unfeasible. The "build" utility would produce a single (unified)
> OVMF.fd file, which would contain both NVRAM and executable regions, and
> the guest's variable updates would modify the one file that would exist.
> This is inappropriate even without considering live migration, because
> OVMF binary upgrades (package updates) on the virtualization host would
> force guests to lose their private variable stores (NVRAMs).
> 
> Therefore, the "build" utility produces "split" files too, in addition
> to the unified OVMF.fd file. Namely, OVMF_CODE.fd and OVMF_VARS.fd.
> OVMF.fd is simply the concatenation of the latter two.
> 
> $ cat OVMF_VARS.fd OVMF_CODE.fd | cmp - OVMF.fd
> [prints nothing]


Laszlo,

Thanks for the detailed explanation. 

Maybe I was overcomplicating this. Given your explanation, I think the part I was missing is that, in this split model, OVMF implies the FLASH layout from the sizes of OVMF_CODE.fd and OVMF_VARS.fd. Given that, if OVMF_CODE.fd gets bigger, the variable store address changes from QEMU's point of view. So it is the QEMU API's assumptions about the relative layout of the FD in the split model that make a migration to a larger ROM not work: as currently defined, the -pflash API does not support changing the size of the ROM without moving the NVRAM.

Given the above it seems like the 2 options are:
1) Pad OVMF_CODE.fd to be very large so there is room to grow.
2) Add some feature to QEMU that allows the variable store address to not be based on OVMF_CODE.fd size.
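
To make the size dependency concrete, here is a minimal sketch. It assumes the two flash images are stacked downward from 4 GiB, with the code image ending at 4 GiB and the variable store immediately below it. That rule and all of the sizes are assumptions chosen for illustration; they are not taken from the QEMU source or from this thread.

```python
# Illustrative sketch only: the "stack downward from 4 GiB" rule and all
# sizes below are assumptions, not values taken from QEMU or OVMF.
FOUR_GIB = 1 << 32
KIB = 1024


def pflash_bases(code_size, vars_size):
    """Return (code_base, vars_base) for a code image that ends at 4 GiB
    with the variable store placed immediately below it."""
    code_base = FOUR_GIB - code_size
    vars_base = code_base - vars_size
    return code_base, vars_base


vars_size = 128 * KIB   # assumed varstore size
old_code = 1920 * KIB   # assumed size of the old OVMF_CODE.fd
new_code = 3968 * KIB   # assumed size of a larger OVMF_CODE.fd

_, old_vars_base = pflash_bases(old_code, vars_size)
_, new_vars_base = pflash_bases(new_code, vars_size)

# The varstore base moves down by exactly the growth of the code image.
shift = old_vars_base - new_vars_base
print(hex(old_vars_base), hex(new_vars_base), shift // KIB, "KiB")
```

Under these assumptions, option 1 amounts to fixing code_size at a padded maximum so that vars_base never moves, while option 2 would make vars_base an independent input rather than a value derived from code_size.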

I did see this [1] and combined with your email I either understand, or I'm still confused? :)

I'm not saying we need to change anything, I'm just trying to make sure I understand how OVMF and QEMU are tied together.

[1] https://www.redhat.com/archives/libvir-list/2019-January/msg01031.html

Thanks,

Andrew Fish




> 
> When you define a new domain (VM) on a virtualization host, the domain
> definition saves a reference (pathname) to the OVMF_CODE.fd file.
> However, the OVMF_VARS.fd file (the variable store *template*) is not
> directly referenced; instead, it is *copied* into a separate (private)
> file for the domain.
> 
> Furthermore, once booted, the guest has two flash chips, one that maps the
> firmware executable OVMF_CODE.fd read-only, and another pflash chip that
> maps its private varstore file read-write.
> 
> This makes it possible to upgrade OVMF_CODE.fd and OVMF_VARS.fd (via
> package upgrades on the virt host) without messing with varstores that
> were earlier instantiated from OVMF_VARS.fd. What's important here is
> that the various constants in the new (upgraded) OVMF_CODE.fd file
> remain compatible with the *old* OVMF_VARS.fd structure, across package
> upgrades.
> 
> If that's not possible for introducing e.g. a new feature, then the
> package upgrade must not overwrite the OVMF_CODE.fd file in place, but
> must provide an additional firmware binary. This firmware binary can
> then only be used by freshly defined domains (old domains cannot be
> switched over automatically). Old domains can be switched over manually,
> and only if the sysadmin decides it is OK to lose the current variable
> store contents. Then the old varstore file for the domain is deleted
> (manually), the domain definition is updated, and then a new (logically
> empty, pristine) varstore can be created from the *new* OVMF_2_VARS.fd
> that matches the *new* OVMF_2_CODE.fd.
> 
> 
> During live migration, the "RAM-like" contents of both pflash chips are
> migrated (the guest-side view of both chips remains the same, including
> the case when the writeable chip happens to be in "programming mode",
> i.e., during a UEFI variable write through the Fault Tolerant Write and
> Firmware Volume Block(2) protocols).
> 
> Once live migration completes, QEMU dumps the full contents of the
> writeable chip to the backing file (on the destination host). Going
> forward, flash writes from within the guest are reflected to said
> host-side file on-line, just like it happened on the source host before
> live migration. If the file backing the r/w pflash chip is on NFS
> (shared by both src and dst hosts), then this one-time dumping when the
> migration completes is superfluous, but it's also harmless.
> 
> The interesting question is, what happens when you power down the VM on
> the destination host (= post migration), and launch it again there, from
> zero. In that case, the firmware executable file comes from the
> *destination host* (it was never persistently migrated from the source
> host, i.e. never written out on the dst). It simply comes from the OVMF
> package that had been installed on the destination host, by the
> sysadmin. However, the varstore pflash does reflect the permanent result
> of the previous migration. So this is where things can fall apart, if
> both firmware binaries (on the src host and on the dst host) don't agree
> about the internal structure of the varstore pflash.
> 
> Thanks
> Laszlo