[libvirt] [PATCH v4 2/5] numa: describe siblings distances within cells

Jim Fehlig jfehlig at suse.com
Thu Oct 12 18:09:45 UTC 2017


On 10/12/2017 04:37 AM, Wim ten Have wrote:
> On Fri, 6 Oct 2017 08:49:46 -0600
> Jim Fehlig <jfehlig at suse.com> wrote:
> 
>> On 09/08/2017 08:47 AM, Wim Ten Have wrote:
>>> From: Wim ten Have <wim.ten.have at oracle.com>
>>>
>>> Add libvirtd NUMA cell domain administration functionality to
>>> describe underlying cell id sibling distances in full fashion
>>> when configuring HVM guests.
>>
>> May I suggest wording this paragraph as:
>>
>> Add support for describing sibling vCPU distances within a domain's vNUMA cell
>> configuration.
> 
>    See below (v5 comment).
> 
>>> Schema updates are made to docs/schemas/cputypes.rng enforcing domain
>>> administration to follow the syntax below the numa cell id and
>>> docs/schemas/basictypes.rng to add "numaDistanceValue".
>>
>> I'm not sure this paragraph is needed in the commit message.
>>
>>> The minimum value of 10 represents LOCAL_DISTANCE; 0-9 are
>>> reserved values and cannot be used as System Locality Distance
>>> Information.  A value of 20 represents the default setting of
>>> REMOTE_DISTANCE, while the maximum value of 255 represents
>>> UNREACHABLE.
>>>
>>> Effectively, any cell sibling can be assigned a distance value
>>> within the range 'LOCAL_DISTANCE <= value <= UNREACHABLE'.
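
For reference, these values follow the ACPI SLIT conventions. Below is a
minimal range-check sketch; the constant and function names here are
hypothetical, not libvirt's actual code:

    #include <stdbool.h>

    #define LOCAL_DISTANCE  10   /* SLIT: a node's distance to itself */
    #define REMOTE_DISTANCE 20   /* SLIT: default inter-node distance */
    #define UNREACHABLE     255  /* SLIT: sibling cannot be reached   */

    /* Accept only LOCAL_DISTANCE <= v <= UNREACHABLE; 0-9 are reserved
     * by the ACPI specification. */
    static bool
    distanceValueIsValid(unsigned int v)
    {
        return v >= LOCAL_DISTANCE && v <= UNREACHABLE;
    }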
>>>
>>> [below is an example of a 4 node setup]
>>>
>>>     <cpu>
>>>       <numa>
>>>         <cell id='0' cpus='0' memory='2097152' unit='KiB'>
>>>           <distances>
>>>             <sibling id='0' value='10'/>
>>>             <sibling id='1' value='21'/>
>>>             <sibling id='2' value='31'/>
>>>             <sibling id='3' value='41'/>
>>>           </distances>
>>>         </cell>
>>>         <cell id='1' cpus='1' memory='2097152' unit='KiB'>
>>>           <distances>
>>>             <sibling id='0' value='21'/>
>>>             <sibling id='1' value='10'/>
>>>             <sibling id='2' value='31'/>
>>>             <sibling id='3' value='41'/>
>>>           </distances>
>>>         </cell>
>>>         <cell id='2' cpus='2' memory='2097152' unit='KiB'>
>>>           <distances>
>>>             <sibling id='0' value='31'/>
>>>             <sibling id='1' value='21'/>
>>>             <sibling id='2' value='10'/>
>>>             <sibling id='3' value='21'/>
>>>           </distances>
>>>         </cell>
>>>         <cell id='3' cpus='3' memory='2097152' unit='KiB'>
>>>           <distances>
>>>             <sibling id='0' value='41'/>
>>>             <sibling id='1' value='31'/>
>>>             <sibling id='2' value='21'/>
>>>             <sibling id='3' value='10'/>
>>>           </distances>
>>>         </cell>
>>>       </numa>
>>>     </cpu>
>>
>> How would this look when having more than one cpu in a cell? I suppose something
>> like
>>
>>    <cpu>
>>       <numa>
>>         <cell id='0' cpus='0-3' memory='2097152' unit='KiB'>
>>           <distances>
>>             <sibling id='0' value='10'/>
>>             <sibling id='1' value='10'/>
>>             <sibling id='2' value='10'/>
>>             <sibling id='3' value='10'/>
>>             <sibling id='4' value='21'/>
>>             <sibling id='5' value='21'/>
>>             <sibling id='6' value='21'/>
>>             <sibling id='7' value='21'/>
>>           </distances>
>>         </cell>
>>         <cell id='1' cpus='4-7' memory='2097152' unit='KiB'>
>>           <distances>
>>             <sibling id='0' value='21'/>
>>             <sibling id='1' value='21'/>
>>             <sibling id='2' value='21'/>
>>             <sibling id='3' value='21'/>
>>             <sibling id='4' value='10'/>
>>             <sibling id='5' value='10'/>
>>             <sibling id='6' value='10'/>
>>             <sibling id='7' value='10'/>
>>           </distances>
>>         </cell>
>>       </numa>
>>    </cpu>
> 
>    Nope.  That machine describes a 2-node vNUMA setup, where:
> 
>    * NUMA node(0), defined by <cell id='0'>, holds 4 cpus (cores)
>      '0-3' with 2 GByte of dedicated memory.
>    * NUMA node(1), defined by <cell id='1'>, holds 4 cpus (cores)
>      '4-7' with 2 GByte of dedicated memory.

Correct.

>        <cpu>
>           <numa>
>             <cell id='0' cpus='0-3' memory='2097152' unit='KiB'>
>               <distances>
>                 <sibling id='0' value='10'/>
>                 <sibling id='1' value='21'/>
>               </distances>
>             </cell>
>             <cell id='1' cpus='4-7' memory='2097152' unit='KiB'>
>               <distances>
>                 <sibling id='0' value='21'/>
>                 <sibling id='1' value='10'/>
>               </distances>
>            </cell>
>          </numa>
>        </cpu>

Duh. sibling id='x' refers to cell with id 'x'. For some reason I had it stuck 
in my head that it referred to vcpu with id 'x'.
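
In other words, each <sibling id='x'> entry is indexed by cell id, so the
corrected two-cell example above reduces to a 2x2 node distance matrix;
roughly, in illustrative C (not code from the patch):

    /* How the two <cell> entries map onto the matrix the guest will
     * later report via numactl or sysfs. */
    unsigned int distances[2][2] = {
        /*             to node 0   to node 1 */
        /* node 0 */ { 10,         21 },
        /* node 1 */ { 21,         10 },
    };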

> 
>    This specific configuration would typically be reported as below
>    when examined from within the guest domain (ignoring, for this
>    example, that it actually concerns a single-socket 8-CPU machine).
> 
>        [root at f25 ~]# lscpu
>        Architecture:          x86_64
>        CPU op-mode(s):        32-bit, 64-bit
>        Byte Order:            Little Endian
>        CPU(s):                8
>        On-line CPU(s) list:   0-7
>        Thread(s) per core:    1
>        Core(s) per socket:    8
>        Socket(s):             1
>    *>  NUMA node(s):          2
>        Vendor ID:             AuthenticAMD
>        CPU family:            21
>        Model:                 2
>        Model name:            AMD FX-8320E Eight-Core Processor
>        Stepping:              0
>        CPU MHz:               3210.862
>        BogoMIPS:              6421.83
>        Virtualization:        AMD-V
>        Hypervisor vendor:     Xen
>        Virtualization type:   full
>        L1d cache:             16K
>        L1i cache:             64K
>        L2 cache:              2048K
>        L3 cache:              8192K
>    *>  NUMA node0 CPU(s):     0-3
>    *>  NUMA node1 CPU(s):     4-7
>        Flags:                 fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm rep_good nopl cpuid extd_apicid pni pclmulqdq ssse3 fma cx16 sse4_1 sse4_2 x2apic popcnt aes xsave avx f16c hypervisor lahf_lm svm cr8_legacy abm sse4a misalignsse 3dnowprefetch ibs xop lwp fma4 tbm vmmcall bmi1 arat npt lbrv nrip_save tsc_scale vmcb_clean decodeassists pausefilter
> 
>        [root at f25 ~]# numactl -H
>        available: 2 nodes (0-1)
>        node 0 cpus: 0 1 2 3
>        node 0 size: 1990 MB
>        node 0 free: 1786 MB
>        node 1 cpus: 4 5 6 7
>        node 1 size: 1950 MB
>        node 1 free: 1820 MB
>        node distances:
>        node   0   1
>          0:  10  21
>          1:  21  10

Right, got it.
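
As a cross-check, the kernel also exposes each node's distance row via
sysfs inside the guest; a small sketch, assuming the standard Linux
/sys/devices/system/node layout:

    #include <stdio.h>

    int main(void)
    {
        /* node0's row prints as space-separated values, e.g. "10 21". */
        FILE *f = fopen("/sys/devices/system/node/node0/distance", "r");
        char buf[128];

        if (!f)
            return 1;
        if (fgets(buf, sizeof(buf), f))
            printf("node0 distances: %s", buf);
        fclose(f);
        return 0;
    }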

> 
>> In the V3 thread you mentioned "And to reduce even more we could also
>> remove LOCAL_DISTANCES as they make a constant factor where; (cell_id ==
>> sibling_id)". In the above example cell_id 1 == sibling_id 1, but it is not
>> LOCAL_DISTANCE.
>>
>>> Whenever a sibling id the cell LOCAL_DISTANCE does apply and for any
>>> sibling id not being covered a default of REMOTE_DISTANCE is used
>>> for internal computations.
>>
>> I'm having a hard time understanding this sentence...
> 
>    Me.2
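
If the intended meaning is "a sibling whose id equals the cell's own id
gets LOCAL_DISTANCE, and any sibling not listed defaults to
REMOTE_DISTANCE", a rough sketch of that fill-in logic might look like
the following (hypothetical code, not from the patch):

    #include <stddef.h>

    #define MAX_CELLS 16  /* arbitrary bound, for this sketch only */

    /* dist[i][j] == 0 is taken to mean "not configured in the XML". */
    static void
    fillDefaultDistances(unsigned int dist[][MAX_CELLS], size_t ncells)
    {
        size_t i, j;

        for (i = 0; i < ncells; i++) {
            for (j = 0; j < ncells; j++) {
                if (dist[i][j] != 0)
                    continue;  /* explicitly configured */
                dist[i][j] = (i == j) ? 10   /* LOCAL_DISTANCE  */
                                      : 20;  /* REMOTE_DISTANCE */
            }
        }
    }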
> 
>> I didn't look closely at the patch since I'd like to understand how multi-cpu
>> cells are handled before doing so.
> 
>    Let me prepare v5.  I found a silly error in the code, which is
>    being fixed, and given the confusion noted above I'd like to take a
>    better approach in the commit messages and within the cover letter.

Thanks. Hopefully I'll have time to review it without much delay.

Regards,
Jim



