[libvirt] Yet another RFC for CAT
Daniel P. Berrange
berrange at redhat.com
Mon Sep 4 15:57:31 UTC 2017
On Mon, Sep 04, 2017 at 04:14:00PM +0200, Martin Kletzander wrote:
> * The current design (finally something libvirt-related, right?)
>
> The discussion ended with the following conclusions (to the best of my
> knowledge; there were so many discussions about so many things that I
> would spend too much time looking them all up):
>
> - Users should not need to specify bit masks, such complexity should be
> abstracted. We'll use sizes (e.g. 4MB)
>
> - Multiple vCPUs might need to share the same allocation.
>
> - Exclusivity of allocations is to be assumed, that is only unoccupied
> cache should be used for new allocations.
>
> The last point seems trivial, but it is actually a very specific
> condition that, if removed, can cause several problems.  If it's hard
> to grasp the last point together with the second one, you're on the
> right track.  If not, then I'll try to make a case for why the last
> point should be removed in 3... 2... 1...
>
> * Design flaws
>
> 1) Users cannot specify any allocation that would share only part with
> some other allocation of the domain or the default group.
>
> 2) It was not specified what to do with the default resource group.
> There might be several ways to approach this, with varying pros and
> cons:
>
> a) Treat it as any other group. That is any bit set for this group
> will be excluded from usable bits when creating new allocation
> for a domain.
>
> - Very predictable behaviour
>
> - You will not be able to allocate any amount of cache without
> first adjusting the default group, as it has all the bits set
> by default, which makes all the cache appear occupied
>
> b) Automatically remove the appropriate number of bits needed
> for new domains.
>
> - No need to do any change to the system settings in order to
> use this new feature
>
> - We would have to change system settings, which is generally
> frowned upon when done "automatically" as a side effect of
> starting a domain, especially for such a scarce resource as
> cache
>
> - The change to system settings would not be entirely
> predictable
>
> c) Act like it doesn't exist and don't remove its allocations from
> consideration
>
> - Doesn't really make sense, as system processes might be
> thrashing the cache just like any VM; moreover, all VM
> processes without explicit allocations will live in the
> default group as well
>
> 3) There is no way for users to know what the particular settings are
> for any running domain.
>
> The first point was deemed a corner case.  Fair enough on its own, but
> considering point 2 and its solutions, it is rather difficult for me to
> justify it.  Also, let's say you have a domain with 4 vCPUs, one of
> which you know might be thrashing the cache.  You don't want to
> restrict it completely, while the others will utilize the cache very
> nicely.  Sensible allocations for such a domain's vCPUs might be:
>
> vCPU 0: 000f
> vCPUs 1-3: ffff
>
> as you want vCPUs 1-3 to utilize even the part of the cache that might
> get thrashed by vCPU 0.  Or they might share some data (especially
> guest-memory-related).
>
> The case above is impossible to set up with only a per-vCPU scalar
> setting.  And there are more such cases, as you might imagine.  For
> example, how do we behave with iothreads and emulator threads?
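The partial overlap above can be checked directly on the raw masks (a trivial sketch; the hex values come straight from the example):

```python
# The layout from the example: vCPU 0 is confined to 4 bits,
# while vCPUs 1-3 may use the whole cache, including those 4 bits.
MASK_VCPU0 = 0x000f
MASK_VCPU1_3 = 0xffff

overlap = MASK_VCPU0 & MASK_VCPU1_3   # == 0x000f
# The allocations intentionally share 4 bits -- something an
# "exclusive, size-only" model can never express, since any two
# exclusive allocations must have a zero intersection.
```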
Ok, I see what you're getting at. I've actually forgotten what
our current design looks like though :-)
What level of granularity were we allowing within a guest?
Do all vCPUs use separate cache regions from each other, or do all
vCPUs use a shared cache region that is separate from other guests,
or a mix?
> * My suggestion:
>
> - Provide an API for querying and changing the allocation of the
> default resource group. This would be similar to setting and
> querying hugepage allocations (see virsh's freepages/allocpages
> commands).
Reasonable
> - Let users specify the starting position in addition to the size, i.e.
> not only "size", but also "from".  If "from" is not specified, the
> whole allocation must be exclusive.  If "from" is specified, it will
> be set without checking for collisions.  The latter requires users to
> query the system or to know what settings are applied (which should
> be the case anyway), but that is better than adding non-specific
> and/or meaningless exclusivity settings (how would you specify the
> partial exclusivity of the cache from the example above?)
I'm concerned about the idea of not checking 'from' for collisions,
if we allow a mix of guests with & without 'from'.
eg consider
* Initially 24 MB of cache is free, starting at 8MB
* run guest A from=8M, size=8M
* run guest B size=8M
=> libvirt sets from=16M, so doesn't clash with A
* stop guest A
* run guest C size=8M
=> libvirt sets from=8M, so doesn't clash with B
* restart guest A
   => now clashes with guest C, whereas if you had
      left guest A running, C would have got
      from=24MB and avoided the clash
IOW, if we're to allow users to set 'from', I think we need to
have an explicit flag to indicate whether this is an exclusive
or shared allocation. That way guest A would set 'exclusive',
and so at least see an error when it got a clash with guest
C in the example.
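The sequence above can be played through with a toy first-fit allocator that, like the proposal, trusts an explicit "from" blindly (offsets and sizes in MiB; all names here are illustrative):

```python
# 24 MiB usable, starting at offset 8 -- as in the scenario above.
REGION_START, REGION_END = 8, 32
allocations = {}  # guest name -> (start, size)

def first_fit(size):
    """Lowest free offset with room for `size` (scanning MiB by MiB)."""
    for start in range(REGION_START, REGION_END - size + 1):
        if all(start + size <= s or start >= s + sz
               for s, sz in allocations.values()):
            return start
    return None

def run(guest, size, frm=None):
    # An explicit `frm` is recorded without any collision check.
    allocations[guest] = (frm if frm is not None else first_fit(size), size)

def clashes(a, b):
    (s1, z1), (s2, z2) = allocations[a], allocations[b]
    return s1 < s2 + z2 and s2 < s1 + z1

run("A", 8, frm=8)       # guest A: explicit from=8M
run("B", 8)              # libvirt picks from=16M, avoiding A
del allocations["A"]     # stop guest A
run("C", 8)              # libvirt picks from=8M, avoiding B
run("A", 8, frm=8)       # restart A: silently overlaps C
```

After the last step `clashes("A", "C")` is true, and nothing in the model ever reported an error, which is exactly the argument for an explicit exclusive/shared flag.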
> - After starting a domain, fill in any missing information about the
> allocation (I'm generalizing here, but for now it would only be the
> optional "from" attribute)
>
> - Add settings not only for vCPUs, but also for other threads as we do
> with pinning, schedulers, etc.
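For that last point, a hypothetical sketch of what per-thread cache settings could look like in the domain XML, mirroring the existing vcpupin/emulatorpin style (the element and attribute names below are made up for illustration, not an agreed schema):

```xml
<cputune>
  <!-- Hypothetical elements; vcpus/size/from follow the proposal above -->
  <cachetune vcpus="0" level="3" size="4" unit="MiB"/>
  <cachetune vcpus="1-3" level="3" size="16" unit="MiB" from="0"/>
  <cachetune emulator="yes" level="3" size="1" unit="MiB"/>
</cputune>
```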
Regards,
Daniel
--
|: https://berrange.com -o- https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org -o- https://fstop138.berrange.com :|
|: https://entangle-photo.org -o- https://www.instagram.com/dberrange :|