[libvirt] [PATCH] spapr: make default PHB optional
Shivaprasad G Bhat
sbhat at linux.vnet.ibm.com
Wed Jul 12 11:39:37 UTC 2017
On 07/12/2017 04:25 PM, Andrea Bolognani wrote:
> [libvir-list added to the loop]
>
> On Tue, 2017-07-04 at 10:47 +0200, Greg Kurz wrote:
>> On Tue, 4 Jul 2017 17:29:01 +1000 David Gibson <david at gibson.dropbear.id.au> wrote:
>>> On Mon, Jul 03, 2017 at 06:48:25PM +0200, Greg Kurz wrote:
>>>>
>>>> The sPAPR machine always creates a default PHB during initialization, even
>>>> if -nodefaults was passed on the command line. This forces the user to
>>>> rely on -global if she wants to set properties of the default PHB, such
>>>> as numa_node.
>>>>
>>>> This patch introduces a new create-default-phb machine property to control
>>>> whether the default PHB is created or not. It defaults to on in order
>>>> to preserve old setups (which is also the motivation to not alter the
>>>> current behavior of -nodefaults).
>>>>
>>>> If create-default-phb is set to off, the default PHB isn't created, nor
>>>> any other device usually created with it. It is mandatory to provide
>>>> a PHB on the command line to be able to use PCI devices (otherwise QEMU
>>>> won't start). For example, the following creates a PHB with the same
>>>> mappings as the default PHB and also sets the NUMA affinity:
>>>>
>>>> -machine type=pseries,create-default-phb=off \
>>>> -numa node,nodeid=0 -device spapr-pci-host-bridge,index=0,numa_node=0
>>>
>>> So, I agree that the distinction between default devices that are
>>> disabled with -nodefaults and default devices that aren't is a big
>>> mess in qemu configuration. But on the other hand this only addresses
>>> one tiny aspect of that, and in the meantime means we will silently
>>> ignore some other configuration options in some conditions.
>>>
>>> So, what's the immediate benefit / use case for this?

Setting numa_node for emulated devices is the benefit for now. On x86, as far
as I can tell, there is no way to set numa_node for the root controller, and
the emulated devices sitting there all have numa_node set to -1. Only the
devices on the pxb can have a sensible value specified. Does that mean the
emulated devices/drivers don't care which NUMA node they are on? Would it be
fine on PPC to disallow setting the NUMA node for the default PHB, since that
is where all the emulated devices sit?

>>
>> With the current code base, the only way to set properties of the default
>> PHB is to pass -global spapr-pci-host-bridge.prop=value for each property.
>> The immediate benefit of this patch is to unify the way libvirt passes
>> PHB description to the command line:
>>
>> i.e., do:
>>
>> -machine type=pseries,create-default-phb=off \
>> -device spapr-pci-host-bridge,prop1=a,prop2=b,prop3=c \
>> -device spapr-pci-host-bridge,prop1=d,prop2=e,prop3=f
>>
>> instead of:
>>
>> -machine type=pseries \
>> -global spapr-pci-host-bridge.prop1=a \
>> -global spapr-pci-host-bridge.prop2=b \
>> -global spapr-pci-host-bridge.prop3=c \
>> -device spapr-pci-host-bridge,prop1=d,prop2=e,prop3=f
> So, I'm thinking about this mostly in terms of NUMA nodes
> because that's the use case I'm aware of.
>
> The problem with using -global is not that it requires using
> a different syntax to set properties for the default PHB,
> but rather that such properties are then inherited by all
> other PHBs unless explicitly overridden. Not creating the
> default PHB at all would solve the issue.
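
To make the inheritance problem described above concrete, here is a toy model
in Python. It is not QEMU code, only a sketch of the behaviour as I understand
it: a value passed with -global becomes the default for every
spapr-pci-host-bridge instance, and only an explicit numa_node on -device
overrides it. The function name, the use of None for "not given" and the -1
"unset" default are my assumptions for illustration.

```python
# Toy model of property resolution for PHBs (illustration only, not QEMU code).
UNSET = -1  # assumed numa_node value for a PHB without NUMA affinity


def effective_numa_nodes(global_numa_node, extra_phbs):
    """Return {phb index: numa_node} for the default PHB plus extra PHBs.

    global_numa_node: value given via
        -global spapr-pci-host-bridge.numa_node=N, or None if not given.
    extra_phbs: {index: numa_node or None} for PHBs added with -device;
        None means numa_node was not specified on that -device.
    """
    # -global changes the default for every instance of the driver.
    default = UNSET if global_numa_node is None else global_numa_node

    nodes = {0: default}  # index 0 stands for the default PHB
    for index, explicit in extra_phbs.items():
        # An explicit property on -device wins; otherwise the PHB
        # inherits whatever -global set.
        nodes[index] = default if explicit is None else explicit
    return nodes


# PHB 1 was never given a NUMA node, yet it ends up on node 1.
print(effective_numa_nodes(1, {1: None, 2: 0}))  # {0: 1, 1: 1, 2: 0}
```

In this model, PHB 1 silently lands on node 1 just because the default PHB was
configured through -global, which is the side effect mentioned above.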
>
> On the other hand, libvirt would then need to either
>
> 1) only allow setting NUMA nodes for PHBs if QEMU supports
> the new option, leaving QEMU < 2.10 users behind; or
>
> 2) implement handling for both the new and old behavior.
>
> I'm not sure we could get away with 1), and going for 2)
> means more work both for QEMU and libvirt developers for
> very little actual gain, so I'd be inclined to scrap this
> and just build the libvirt glue on top of the existing
> interface.
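
For reference, handling both behaviours (option 2 above) could look roughly
like the sketch below. It is written in Python for brevity and is not libvirt
code. The phb_args name and the has_create_default_phb flag are hypothetical,
and create-default-phb is the property proposed in this patch, so nothing here
should be read as an existing interface. Refusing the "default PHB set, other
PHB unset" case in the old path is my own choice for the sketch.

```python
def phb_args(phb_nodes, has_create_default_phb):
    """Build QEMU arguments describing the PHBs of a pseries guest.

    phb_nodes: {index: numa_node or None}; index 0 is the default PHB.
    has_create_default_phb: hypothetical capability flag, True when QEMU
        understands the proposed create-default-phb machine property.
    """
    def device(index, node):
        arg = f"spapr-pci-host-bridge,index={index}"
        if node is not None:
            arg += f",numa_node={node}"
        return ["-device", arg]

    if has_create_default_phb:
        # New behaviour: every PHB, including index 0, is a plain -device.
        args = ["-machine", "type=pseries,create-default-phb=off"]
        for index in sorted(phb_nodes):
            args += device(index, phb_nodes[index])
        return args

    # Old behaviour: the default PHB can only be configured via -global.
    args = ["-machine", "type=pseries"]
    default_node = phb_nodes.get(0)
    if default_node is not None:
        args += ["-global",
                 f"spapr-pci-host-bridge.numa_node={default_node}"]
    for index in sorted(phb_nodes):
        if index == 0:
            continue  # created implicitly by the machine
        node = phb_nodes[index]
        if default_node is not None and node is None:
            # The PHB would inherit the -global value; refuse in this sketch.
            raise ValueError(f"PHB {index} needs an explicit numa_node")
        args += device(index, node)
    return args


print(phb_args({0: 0, 1: 1}, has_create_default_phb=False))
```

The two branches are small, but both would have to be tested and maintained,
which is the extra work referred to above.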
>
> That is, of course, unless
>
> 1) having a random selection of PHBs not assigned to any
> NUMA node is a sensible use case. This is something
> we just can't do reliably with the current interface:
> we can decide to set the NUMA node only for, say, PHBs
> 1 and 3 leaving 0 and 2 alone, but once we set it for
> the default PHB we *have* to set it for all remaining
> ones as well. libvirt will by default assign emulated
> devices to the default PHB, so I would rather expect
> users to leave that one alone and set a NUMA node for
> all other PHBs; or
>
> 2) there are other properties outside of numa_node we
> might want to deal with; or
>
> 3) it turns out it's okay to require a recent QEMU :)
>
> --
> Andrea Bolognani / Red Hat / Virtualization
>