[libvirt] OpenStack/libvirt CAT interface

Marcelo Tosatti mtosatti at redhat.com
Wed Jan 11 12:23:30 UTC 2017


On Wed, Jan 11, 2017 at 10:18:11AM +0000, Daniel P. Berrange wrote:
> On Tue, Jan 10, 2017 at 02:18:43PM -0200, Marcelo Tosatti wrote:
> > 
> > There have been queries about the OpenStack interface 
> > for CAT:
> 
> FYI, there's another mail discussing libvirt design here:
> 
>   https://www.redhat.com/archives/libvir-list/2017-January/msg00354.html
> 
> > http://bugzilla.redhat.com/show_bug.cgi?id=1299678
> > 
> > Comment 2 says:
> > Sahid Ferdjaoui 2016-01-19 10:58:48 EST
> > A spec will have to be addressed; after a first look, this feature needs
> > some work in several components of Nova to maintain/schedule/consume the
> > host's cache. I can work on that spec and its implementation once libvirt
> > provides information about the cache and the feature to use it for guests.
> > 
> > I could add a comment about parameters to resctrltool, but since
> > this depends on the libvirt interface, it would be good to know
> > what the libvirt interface exposes first.
> > 
> > I believe it should be essentially similar to OpenStack's
> > "reserved_host_memory_mb":
> > 
> >         Set the reserved_host_memory_mb to reserve RAM for host
> >         processes. For the purposes of testing I am going to use the
> >         default of 512 MB:
> >         reserved_host_memory_mb=512
> > 
> > But rather use:
> > 
> >         rdt_cat_cache_reservation=type=code/data/both,size=10mb,cache-id=2;
> >                                   type=code/data/both,size=2mb,cache-id=1;...
> > 
> > (per-vcpu).
> > 
> > Where cache-id is optional.
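
A sketch of parsing that value (Python; assumes the semicolon/comma
syntax proposed above, with cache-id left as None when omitted):

def parse_reservations(spec):
    # "type=both,size=10mb,cache-id=2;type=both,size=2mb,cache-id=1"
    out = []
    for entry in spec.split(";"):
        fields = dict(kv.split("=", 1) for kv in entry.split(","))
        out.append({"type": fields["type"],
                    "size": fields["size"],
                    "cache-id": fields.get("cache-id")})  # optional
    return out

print(parse_reservations("type=both,size=10mb,cache-id=2;"
                         "type=both,size=2mb,cache-id=1"))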
> > 
> > What is cache-id (from Documentation/x86/intel_rdt_ui.txt on recent
> > kernel sources):
> > Cache IDs
> > ---------
> > On current generation systems there is one L3 cache per socket and L2
> > caches are generally just shared by the hyperthreads on a core, but this
> > isn't an architectural requirement. We could have multiple separate L3
> > caches on a socket, multiple cores could share an L2 cache. So instead
> > of using "socket" or "core" to define the set of logical cpus sharing
> > a resource we use a "Cache ID". At a given cache level this will be a
> > unique number across the whole system (but it isn't guaranteed to be a
> > contiguous sequence, there may be gaps). To find the ID for each
> > logical CPU look in /sys/devices/system/cpu/cpu*/cache/index*/id
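
For example, a minimal sketch (Python) that dumps this mapping; it
assumes a kernel recent enough to expose the per-cache "id" file:

import glob, os

def cache_ids():
    # Map each logical CPU to {cache level: cache ID}. For CAT we
    # mostly care about the L3 entry.
    mapping = {}
    for index in glob.glob("/sys/devices/system/cpu/cpu[0-9]*/cache/index*"):
        cpu = index.split("/")[5]                # e.g. "cpu3"
        id_file = os.path.join(index, "id")
        if not os.path.exists(id_file):          # kernel without "id"
            continue
        with open(os.path.join(index, "level")) as f:
            level = f.read().strip()
        with open(id_file) as f:
            mapping.setdefault(cpu, {})["L" + level] = f.read().strip()
    return mapping

for cpu, ids in sorted(cache_ids().items()):
    print(cpu, ids)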
> 
> So it seems like cache ID is something we need to add to the XML
> I proposed at
> 
>   https://www.redhat.com/archives/libvir-list/2017-January/msg00489.html
> 
> > 
> > 
> > WHAT THE USER NEEDS TO SPECIFY FOR VIRTUALIZATION (KVM-RT)
> > ==========================================================
> > 
> > For virtualization the following scenario is desired,
> > on a given socket:
> > 
> >         * VM-A with VCPUs VM-A.vcpu-1, VM-A.vcpu-2.
> >         * VM-B with VCPUs VM-B.vcpu-1, VM-B.vcpu-2.
> > 
> > With one realtime workload on each vcpu-2.
> > 
> > Assume VM-A.vcpu-2 on pcpu 3.
> > Assume VM-B.vcpu-2 on pcpu 5.
> > 
> > Assume pcpus 0-5 on cacheid 0.
> > 
> > We want VM-A.vcpu-2 to have a certain region of cache reserved,
> > and VM-B.vcpu-2 as well. vcpu-1 for both VMs can use the default group
> > (that is, have no reserved L3 cache).
> > 
> > This translates to the following resctrltool-style reservations:
> > 
> >         res.vm-a.vcpu-2
> > 
> >                 type=both,size=VM-A-RESSIZE,cache-id=0
> > 
> >         res.vm-b.vcpu-2
> > 
> >                 type=both,size=VM-B-RESSIZE,cache-id=0
> > 
> > Which translate to the following in resctrlfs:
> > 
> >         res.vm-a.vcpu-2
> > 
> >                 type=both,size=VM-A-RESSIZE,cache-id=0
> >                 type=both,size=default-size,cache-id=1
> >                 ...
> > 
> >         res.vm-b.vcpu-2
> > 
> >                 type=both,size=VM-B-RESSIZE,cache-id=0
> >                 type=both,size=default-size,cache-id=1
> >                 ...
> > 
> > Which is what we want, since the VCPUs are pinned.
> > 
> > 
> > VM-A.vcpu-1 and VM-B.vcpu-1 don't need to be assigned
> > to any reservation, which means they'll remain in the
> > default group.
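
To make the resctrlfs side concrete, a sketch (Python) of what a tool
would do; the mask values here are invented, a real implementation
derives them from the requested size, the way size and the bits still
free:

import os

def make_group(name, l3_masks, cpu_mask):
    # Create a resctrl group, write its L3 schemata (non-CDP syntax,
    # e.g. "L3:0=00f0;1=ffff") and bind the pinned pcpu to it via
    # the "cpus" bitmask file.
    group = os.path.join("/sys/fs/resctrl", name)
    os.mkdir(group)
    line = "L3:" + ";".join("%d=%s" % (cid, mask) for cid, mask in l3_masks)
    with open(os.path.join(group, "schemata"), "w") as f:
        f.write(line + "\n")
    with open(os.path.join(group, "cpus"), "w") as f:
        f.write("%x\n" % cpu_mask)

# VM-A.vcpu-2 pinned to pcpu 3: real reservation on cache-id 0,
# default-size mask on cache-id 1. Likewise VM-B.vcpu-2 on pcpu 5.
make_group("res.vm-a.vcpu-2", [(0, "00f0"), (1, "ffff")], 1 << 3)
make_group("res.vm-b.vcpu-2", [(0, "0f00"), (1, "ffff")], 1 << 5)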
> 
> You're showing type=both here which, IIUC, means data
> and instruction cache.

No, type=both is for non-CDP hosts (data and instruction
reservations shared).

type=data,type=code is for CDP hosts (data and instruction
reservations separate).
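
Whether a host runs with CDP enabled shows up in the resctrl info
directory; a minimal sketch (Python):

import os

def l3_cdp_enabled(resctrl="/sys/fs/resctrl"):
    # A non-CDP mount exposes info/L3; a CDP mount exposes
    # info/L3DATA and info/L3CODE instead.
    info = os.path.join(resctrl, "info")
    return (os.path.isdir(os.path.join(info, "L3DATA")) and
            os.path.isdir(os.path.join(info, "L3CODE")))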

> Is that configuring one cache
> that serves both purposes?

Yes.

> Do we need to be able
> to configure them independently?

Yes.

> > RESTRICTIONS TO THE SYNTAX ABOVE
> > ================================
> > 
> > Rules for the parameters:
> > * type=code must be paired with a type=data entry.
> 
> What does this mean exactly when configuring guests? Do
> we have to configure data + instruction cache on the same
> cache ID, do they have to be the same size, or are they
> completely independent?

This means that a user can't specify this reservation:

	type=data,size=10mb,cache-id=1

They have to specify _both_ code and data
sizes:

	type=data,size=10mb,cache-id=1;
	type=code,size=2mb,cache-id=1

A single type=both reservation, on the other hand, is valid on its own:

	type=both,size=10mb,cache-id=1
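
That is, for every cache-id the set of code entries must match the set
of data entries. A checker is simple (sketch; assumes the string was
already parsed into (type, size, cache-id) tuples):

def validate(reservations):
    # Enforce "type=code must be paired with type=data" per cache-id.
    code_ids = {cid for typ, _, cid in reservations if typ == "code"}
    data_ids = {cid for typ, _, cid in reservations if typ == "data"}
    if code_ids != data_ids:
        raise ValueError("code/data reservations must be paired per cache-id")

validate([("data", "10mb", 1), ("code", "2mb", 1)])   # ok
validate([("both", "10mb", 1)])                       # ok
validate([("data", "10mb", 1)])                       # raises ValueError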

> > ABOUT THE LIST INTERFACE
> > ========================
> > 
> > About an interface for listing the reservations
> > of the system to OpenStack.
> > 
> > I think that what OpenStack needs is to check, before
> > starting a guest on a given host, that there is sufficient
> > space available for the reservation.
> > 
> > To do that, it can:
> > 
> >         1) resctrltool list (the end of the output mentions
> >            how much free space is available), or
> >            via resctrlfs directly (one has to lock the filesystem,
> >            read each directory AND each schemata, and count
> >            the number of zero bits).
> >         2) Via libvirt
> > 
> > We should fix resctrltool/the API to list the amount of contiguous free space.
> 
> OpenStack should just use libvirt APIs exclusively - there should not
> be any need for it to use other tools if we've designed the libvirt API
> correctly.

Got it.
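
FWIW, computing contiguous free space per cache-id boils down to OR-ing
together the CBMs of every group (default group included) and finding
the longest run of zero bits; a sketch, with the CBM width taken from
info/L3/cbm_mask:

def longest_free_run(cbms, cbm_bits):
    # OR the cache bitmasks of all groups, then scan for the longest
    # run of unallocated ways. Free bytes = free ways * way size.
    used = 0
    for cbm in cbms:
        used |= cbm
    best = run = 0
    for bit in range(cbm_bits):
        run = run + 1 if not (used >> bit) & 1 else 0
        best = max(best, run)
    return best

# 20-way cache: one group on the low 12 ways, another on the top 4,
# leaving 4 contiguous ways free.
print(longest_free_run([0x00fff, 0xf0000], 20))   # -> 4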



