[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

Re: Two questions about NVDIMM devices

Daniel P. Berrangé <berrange redhat com> writes:

> On Thu, Sep 10, 2020 at 04:26:40PM +0200, Milan Zamazal wrote:
>> Daniel P. Berrangé <berrange redhat com> writes:
>> > On Thu, Jul 02, 2020 at 01:21:15PM +0200, Milan Zamazal wrote:
>> >> Hi,
>> >> 
>> >
>> >> I've met two situations with NVDIMM support in libvirt where I'm not
>> >> sure all the parties (libvirt & I) do the things correctly.
>> >> 
>> >> The first problem is with memory alignment and size changes.  In
>> >> addition to the size changes applied to NVDIMMs by QEMU, libvirt also
>> >> makes some NVDIMM size changes for better alignments, in
>> >> qemuDomainMemoryDeviceAlignSize.  This can lead to the size being
>> >> rounded up, exceeding the size of the backing device and QEMU failing to
>> >> start the VM for that reason (I've experienced that actually).  I work
>> >> with emulated NVDIMM devices, not a bare metal hardware, so one might
>> >> argue that in practice the device sizes should already be aligned, but
>> >> I'm not sure it must be always the case considering labels or whatever
>> >> else the user decides to set up.  And I still don't feel very
>> >> comfortable that I have to count with two internal size adjustments
>> >> (libvirt & QEMU) to the `size' value I specify, with the ultimate goal
>> >> of getting the VM started and having the NVDIMM aligned properly to make
>> >> (non-NVDIMM) memory hot plug working.  Is the size alignment performed
>> >> by libvirt, especially rounding up, completely correct for NVDIMMs?
>> >
>> > The comment on the function says QEMU aligns to "page size", which
>> > is something that can vary depending not only on architecture, and
>> > also the build config options for the kernel on that architecture.
>> > eg aarch64 has different page size in RHEL than other distros because
>> > of different choice of page size in kernel config. 
>> >
>> > Libvirt rounds up to 1 MB, essentially so that the size works no matter
>> > what architecture or build options were used. I think this is quite
>> > compelling as I don't think mgmt apps are likely to care enough about
>> > non-x86 architectures to pick the right rounded sizes. 
>> >
>> > If we're enforcing this 1 MB rounding though, we really should be
>> > documenting it clearly, so that apps can pick the right backing file
>> > size. I think we dropped the ball on docs.
>> I still can't see it in the documentation, would it be possible to be
>> clear about it in the docs, please?  For first, it's not very intuitive
>> to figure out that (if I've figured out it correctly) on POWER one
>> *must* specify the NVDIMM size S as
>>   S == aligned_size + label_size
>> and that size is used for the QEMU device; while on x86_64 one can
>> specify any size S and
>>   align_up(S)
>> will be used for the QEMU device (and label size doesn't influence the
>> value).  And additional alignment may be required for having any memory
>> hot plug working.
>> For second, and more importantly, I'm afraid that without documenting
>> it, future changes may break the current behavior without warning.  For
>> example, the recent changes regarding POWER alignment in 6.7.0 are for
>> good IMHO and one can use the same size with both 6.7 and 6.6 versions,
>> but they could still cause pre-6.7 sizes stop working.
> I don't know what changes you are referring to here, but if they were
> in libvirt I'd consider that a bug - we shouldn't break a previously
> working configuration by increasing required alignment.

I mean disabling the auto alignment in
and replacing it with validation in

That change can cause a VM fail to start but after (manually) adjusting
the device size, all should work all right.  Changes that would actually
change sizes would be more dangerous.


[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]