Two questions about NVDIMM devices

Daniel P. Berrangé berrange at redhat.com
Thu Sep 10 14:30:31 UTC 2020


On Thu, Sep 10, 2020 at 04:26:40PM +0200, Milan Zamazal wrote:
> Daniel P. Berrangé <berrange at redhat.com> writes:
> 
> > On Thu, Jul 02, 2020 at 01:21:15PM +0200, Milan Zamazal wrote:
> >> Hi,
> >> 
> >
> >> I've met two situations with NVDIMM support in libvirt where I'm not
> >> sure all the parties (libvirt & I) do the things correctly.
> >> 
> >> The first problem is with memory alignment and size changes.  In
> >> addition to the size changes applied to NVDIMMs by QEMU, libvirt also
> >> makes some NVDIMM size changes for better alignments, in
> >> qemuDomainMemoryDeviceAlignSize.  This can lead to the size being
> >> rounded up, exceeding the size of the backing device and QEMU failing to
> >> start the VM for that reason (I've experienced that actually).  I work
> >> with emulated NVDIMM devices, not a bare metal hardware, so one might
> >> argue that in practice the device sizes should already be aligned, but
> >> I'm not sure it must be always the case considering labels or whatever
> >> else the user decides to set up.  And I still don't feel very
> >> comfortable that I have to count with two internal size adjustments
> >> (libvirt & QEMU) to the `size' value I specify, with the ultimate goal
> >> of getting the VM started and having the NVDIMM aligned properly to make
> >> (non-NVDIMM) memory hot plug working.  Is the size alignment performed
> >> by libvirt, especially rounding up, completely correct for NVDIMMs?
> >
> > The comment on the function says QEMU aligns to "page size", which
> > is something that can vary depending not only on architecture, and
> > also the build config options for the kernel on that architecture.
> > eg aarch64 has different page size in RHEL than other distros because
> > of different choice of page size in kernel config. 
> >
> > Libvirt rounds up to 1 MB, essentially so that the size works no matter
> > what architecture or build options were used. I think this is quite
> > compelling as I don't think mgmt apps are likely to care enough about
> > non-x86 architectures to pick the right rounded sizes. 
> >
> > If we're enforcing this 1 MB rounding though, we really should be
> > documenting it clearly, so that apps can pick the right backing file
> > size. I think we dropped the ball on docs.
> 
> I still can't see it in the documentation, would it be possible to be
> clear about it in the docs, please?  For first, it's not very intuitive
> to figure out that (if I've figured out it correctly) on POWER one
> *must* specify the NVDIMM size S as
> 
>   S == aligned_size + label_size
> 
> and that size is used for the QEMU device; while on x86_64 one can
> specify any size S and
> 
>   align_up(S)
> 
> will be used for the QEMU device (and label size doesn't influence the
> value).  And additional alignment may be required for having any memory
> hot plug working.
> 
> For second, and more importantly, I'm afraid that without documenting
> it, future changes may break the current behavior without warning.  For
> example, the recent changes regarding POWER alignment in 6.7.0 are for
> good IMHO and one can use the same size with both 6.7 and 6.6 versions,
> but they could still cause pre-6.7 sizes stop working.

I don't know what changes you are referring to here, but if they were
in libvirt I'd consider that a bug - we shouldn't break a previously
working configuration by increasing required alignment.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|




More information about the libvirt-users mailing list