Predictable and consistent net interface naming in guests

Yalan Zhang yalzhang at redhat.com
Mon Dec 12 03:49:31 UTC 2022


Hi Igor,

I have tried some scenarios and recorded the status in this document[1].
Could you please check the test results?
Is my test matrix sufficient? (I will test again once QEMU is ready.)
Thank you!

BTW, current test results for pxb:
Q35+ pcie-expander-bus - works
PC + pci-expander-bus  - does not work
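
For reference, here is a minimal sketch of the QEMU-side rules as Igor
describes them in the quoted thread below (ACPI PCI hotplug enabled, a
hotpluggable parent bus, nothing below a pxb). It only restates that
description; it does not model the test results above. All names here
(NicPlacement, acpi_index_expected, the "root-bus" label) are hypothetical
and are not QEMU or libvirt APIs. Treating the PC root bus as supported is
an assumption based on the statement that acpi-index works on i440fx.

```python
from dataclasses import dataclass

# Hypothetical labels for illustration; they loosely follow libvirt
# controller model names but are not checked against any API.
HOTPLUG_CAPABLE_PARENTS = {"pcie-root-port", "pcie-to-pci-bridge", "pci-bridge"}


@dataclass
class NicPlacement:
    machine: str            # "q35" or "pc"
    parent: str             # controller the NIC is plugged into, or "root-bus"
    under_pxb: bool         # True if the NIC sits anywhere below a pxb
    acpi_pci_hotplug: bool  # True if ACPI PCI hotplug is enabled


def acpi_index_expected(nic: NicPlacement) -> bool:
    """Return True if, per the thread, acpi-index should be honoured."""
    # Requirement 1: ACPI PCI hotplug must be enabled.
    if not nic.acpi_pci_hotplug:
        return False
    # No ACPI hierarchy is generated for pxb, on either q35 or pc.
    if nic.under_pxb:
        return False
    # Directly on the host bridge: not working on q35 (work in progress).
    # Assumption: the pc (i440fx) root bus is supported.
    if nic.parent == "root-bus":
        return nic.machine == "pc"
    # Requirement 2: the parent must be a hotpluggable bus.
    return nic.parent in HOTPLUG_CAPABLE_PARENTS
```

For example, acpi_index_expected(NicPlacement("q35", "pcie-root-port",
False, True)) corresponds to the default libvirt placement on Q35 that
Laine reports as producing "eno4".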


[1]
https://docs.google.com/document/d/1C5wseFWLTpNPaeRls8Z8yppocLslTvz9HCIojR9bHHY/edit#


Yalan


On Fri, Dec 9, 2022 at 5:39 AM Igor Mammedov <imammedo at redhat.com> wrote:

> On Thu, Dec 8, 2022 at 5:44 PM Laine Stump <laine at redhat.com> wrote:
> >
> > On 12/8/22 11:15 AM, Julia Suvorova wrote:
> > > On Thu, Nov 3, 2022 at 9:26 AM Amnon Ilan <ailan at redhat.com> wrote:
> > >>
> > >>
> > >>
> > >> On Thu, Nov 3, 2022 at 12:13 AM Amnon Ilan <ailan at redhat.com> wrote:
> > >>>
> > >>>
> > >>>
> > >>> On Wed, Nov 2, 2022 at 6:47 PM Laine Stump <laine at redhat.com> wrote:
> > >>>>
> > >>>> On 11/2/22 11:58 AM, Igor Mammedov wrote:
> > >>>>> On Wed, 2 Nov 2022 15:20:39 +0000
> > >>>>> Daniel P. Berrangé <berrange at redhat.com> wrote:
> > >>>>>
> > >>>>>> On Wed, Nov 02, 2022 at 04:08:43PM +0100, Igor Mammedov wrote:
> > >>>>>>> On Wed, 2 Nov 2022 10:43:10 -0400
> > >>>>>>> Laine Stump <laine at redhat.com> wrote:
> > >>>>>>>
> > >>>>>>>> On 11/1/22 7:46 AM, Igor Mammedov wrote:
> > >>>>>>>>> On Mon, 31 Oct 2022 14:48:54 +0000
> > >>>>>>>>> Daniel P. Berrangé <berrange at redhat.com> wrote:
> > >>>>>>>>>
> > >>>>>>>>>> On Mon, Oct 31, 2022 at 04:32:27PM +0200, Edward Haas wrote:
> > >>>>>>>>>>> Hi Igor and Laine,
> > >>>>>>>>>>>
> > >>>>>>>>>>> I would like to revive a two-year-old discussion [1] about
> consistent network
> > >>>>>>>>>>> interfaces in the guest.
> > >>>>>>>>>>>
> > >>>>>>>>>>> That discussion mentioned that a guest PCI address may
> change in two cases:
> > >>>>>>>>>>> - The PCI topology changes.
> > >>>>>>>>>>> - The machine type changes.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Usually, the machine type is not expected to change,
> especially if one
> > >>>>>>>>>>> wants to allow migrations between nodes.
> > >>>>>>>>>>> I would hope to argue this should not be problematic in
> practice, because
> > >>>>>>>>>>> guest images would be made per a specific machine type.
> > >>>>>>>>>>>
> > >>>>>>>>>>> Regarding the PCI topology, I am not sure I understand what
> changes
> > >>>>>>>>>>> need to occur to the domxml for a defined guest PCI address
> to change.
> > >>>>>>>>>>> The only thing that I can think of is a scenario where
> hotplug/unplug is
> > >>>>>>>>>>> used,
> > >>>>>>>>>>> but even then I would expect existing devices to preserve
> their PCI address
> > >>>>>>>>>>> and the plug/unplug device to have a reserved address
> managed by the one
> > >>>>>>>>>>> acting on it (the management system).
> > >>>>>>>>>>>
> > >>>>>>>>>>> Could you please help clarify in which scenarios the PCI
> topology can cause
> > >>>>>>>>>>> a mess to the naming of interfaces in the guest?
> > >>>>>>>>>>>
> > >>>>>>>>>>> Are there any plans to add the acpi_index support?
> > >>>>>>>>>>
> > >>>>>>>>>> This was implemented a year & a half ago
> > >>>>>>>>>>
> > >>>>>>>>>>      https://libvirt.org/formatdomain.html#network-interfaces
> > >>>>>>>>>>
> > >>>>>>>>>> though due to QEMU limitations this only works for the old
> > >>>>>>>>>> i440fx chipset, not Q35 yet.
> > >>>>>>>>>
> > >>>>>>>>> Q35 should work partially too. In its case acpi-index support
> > >>>>>>>>> is limited to hotplug enabled root-ports and PCIe-PCI bridges.
> > >>>>>>>>> One also has to enable ACPI PCI hotplug (it's enabled by default
> > >>>>>>>>> on recent machine types) for it to work (i.e. it's not
> supported
> > >>>>>>>>> in native PCIe hotplug mode).
> > >>>>>>>>>
> > >>>>>>>>> So if mgmt can put nics on root-ports/bridges, then acpi-index
> > >>>>>>>>> should just work on Q35 as well.
> > >>>>>>>>
> > >>>>>>>> With only a few exceptions (e.g. the first ich9 audio device,
> which is
> > >>>>>>>> placed directly on the root bus at 00:1B.0 because that is
> where the
> > >>>>>>>> ich9 audio device is located on actual Q35 hardware), libvirt
> will
> > >>>>>>>> automatically put all PCI devices (including network
> interfaces) on a
> > >>>>>>>> pcie-root-port.
> > >>>>>>>>
> > >>>>>>>> After seeing reports that "acpi index doesn't work with Q35
> > >>>>>>>> machinetypes" I just assumed that was correct and didn't try
> it. But
> > >>>>>>>> after seeing the "should work partially" statement above, I
> tried it
> > >>>>>>>> just now and an <interface> of a Q35 guest that had its PCI
> address
> > >>>>>>>> auto-assigned by libvirt (and so was placed on a
> pcie-root-port) and
> > >>>>>>>> had <acpi index='4'/> was given the name "eno4". So what
> exactly is it
> > >>>>>>>> that *doesn't* work?
> > >>>>>>>
> > >>>>>>>   From QEMU side:
> > >>>>>>> acpi-index requires:
> > >>>>>>>    1. acpi pci hotplug enabled (which is default on relatively
> new q35 machine types)
> > >>>>>>>    2. hotpluggable pci bus (root-port, various pci bridges)
> > >>>>>>>    3. NIC can be cold or hotplugged, guest should pick up
> acpi-index of the device
> > >>>>>>>       currently plugged into slot
> > >>>>>>> what doesn't work:
> > >>>>>>>    1. device attached to host-bridge directly  (work in progress)
> > >>>>>>>          (q35)
> > >>>>>>>    2. devices attached to any PXB port and any hierarchy hanging
> off it (there are no plans to make it work)
> > >>>>>>>          (q35, pc)
> > >>>>>>
> > >>>>>> I'd say this is still relatively important, as the PXBs are
> needed
> > >>>>>> to create a NUMA placement aware topology for guests, and I'd say
> it
> > >>>>>> is undesirable to lose acpi-index if a guest is updated to be
> NUMA
> > >>>>>> aware, or if a guest image can be deployed in either normal or
> NUMA
> > >>>>>> aware setups.
> > >>>>>
> > >>>>> it's not only Q35 but also PC.
> > >>>>> We basically do not generate ACPI hierarchy for PXBs at all,
> > >>>>> so neither ACPI hotplug nor the dependent acpi-index would work.
> > >>>>> It's been so for many years and no one has asked to enable
> > >>>>> ACPI hotplug on them so far.
> > >>>>
> > >>>> I'm guessing (based on absolutely 0 information :-)) that there
> would be
> > >>>> more demand for acpi-index (and the resulting predictable interface
> > >>>> names) than for acpi hotplug for NUMA-aware setup.
> > >>>
> > >>>
> > >>> My guess is similar, but it is still desirable to have both (i.e.
> support ACPI-indexing/hotplug with Numa-aware)
> > >>> Adding @Peter Xu to check if our setups for SAP require NUMA-aware
> topology
> > >>>
> > >>> How big of a project would it be to enable ACPI-indexing/hotplug
> with PXB?
> > >
> > > Why would you need to add acpi hotplug on pxb?
> > >
> > >> Adding +Julia Suvorova and +Tsirkin, Michael to help answer this
> question
> > >>
> > >> Thanks,
> > >> Amnon
> > >>
> > >>>
> > >>> Since native PCI was improved, we can still compromise on switching
> to native-PCI-hotplug when PXB is required (and no fixed indexing)
> > >
> > > Native hotplug works on pxb as is, without disabling acpi hotplug.
> >
> > Are you saying you can add an acpi-index to a device plugged into a pxb,
> > that index will be recognized (and used to name the device), but it will
> > still do native hotplug?
>
> nope, acpi-index won't work on pxb hierarchy, it works only on the PCI tree
> hanging off main host bridge.
>
> >
> > That sounds okay to me, since it ticks all the functional marks
> > (hotplug, consistent device names, NUMA-aware). It's possible there are
> > some things I'm misunderstanding or haven't thought of though...
> >
> >
> > >
> > >>> Thanks,
> > >>> Amnon
> > >>>
> > >>>
> > >>>>
> > >>>>
> > >>>> Anyway, it sounds like (*within the confines of how libvirt
> constructs
> > >>>> the PCI topology*) we actually have functional parity of acpi-index
> > >>>> between 440fx and Q35.
> > >>>>
> > >
> >
>
>


More information about the libvirt-users mailing list