[libvirt] [jenkins-ci PATCH 0/1] jenkins: Start building on Ubuntu

Daniel P. Berrangé berrange at redhat.com
Tue Dec 10 14:54:36 UTC 2019


On Tue, Dec 10, 2019 at 02:54:22PM +0100, Andrea Bolognani wrote:
> This patch is intended to start a slightly larger discussion about
> our plans for the CentOS CI environment going forward.
> 
> At the moment, we have active builders for
> 
>   CentOS 7
>   Debian 9
>   Debian 10
>   Fedora 30
>   Fedora 31
>   Fedora Rawhide
>   FreeBSD 11
>   FreeBSD 12
> 
> but we don't have builder for
> 
>   Debian sid
>   FreeBSD -CURRENT
>   Ubuntu 16.04
>   Ubuntu 18.04
> 
> despite them being fully supported in the libvirt-jenkins-ci
> repository.
> 
> This makes sense for sid and -CURRENT, since the former covers the
> same "freshest Linux packages" angle that Rawhide already takes care
> of and the latter is often broken and not trivial to keep updated;
> both Ubuntu targets, however, should IMHO be part of the CentOS CI
> environment. Hence this series :)
> 
> Moreover, we're in the process of adding
> 
>   CentOS 8
>   openSUSE Leap 15.1
>   openSUSE Tumbleweed
> 
> as targets, of which the first two should also IMHO be added as they
> would provide useful additional coverage.
> 
> The only reason why I'm even questioning whether this should be done
> is capacity for the hypervisor host: the machine we're running all
> builders on has
> 
>   CPUs: 8
>   Memory: 32 GiB
>   Storage: 450 GiB
> 
> and each of the guests is configured to use
> 
>   CPUs: 2
>   Memory: 2 GiB
>   Storage: 20 GiB
> 
> So while we're good, and actually have plenty of room to grow, on
> the memory and storage front, we're already overcommitting our CPUs
> pretty significantly, which I guess is at least part of the reason
> why builds take so long.

NB the memory that's free is not really free - it is being used
as I/O cache for the VM disks, so adding more VMs will reduce the
I/O cache. Whether that will actually impact us I don't know though.

More importantly though, AFAICT, those are not 8 real CPUs.

virsh nodeinfo reports 8 cores, but virsh capabilities
reports it as a 1 socket, 4 core, 2 thread CPU.

IOW we haven't really got 8 CPUs, more like the equivalent of
5 CPUs, as HT only really gives a ~1.3x boost in the best case,
and I suspect builds are not likely to be hitting the best case.
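To make that concrete, a rough back-of-the-envelope sketch (the
1.3x HT factor is the usual rule of thumb, not a measured number
for this host):

```shell
# Rough sketch: estimate "effective" CPUs from the real topology.
# 'virsh capabilities' reports the host as 1 socket x 4 cores x 2
# threads, i.e. something like:
#   <topology sockets='1' cores='4' threads='2'/>
# Treating hyperthreading as a ~1.3x best-case multiplier on the
# 4 physical cores:
awk 'BEGIN { cores = 4; ht_factor = 1.3; print cores * ht_factor }'
# prints 5.2 - i.e. roughly 5 "effective" CPUs, not 8
```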


> Can we afford to add 50% more load on the machine without making it
> unusable? I don't know. But I think it would be worthwhile to at
> least try and see how it handles an additional 25%, which is exactly
> what this series does.

Giving it a try is ok I guess.

I expect there's probably more we can do to optimize the setup
too.

For example, what actual features of qcow2 are we using? We're
not snapshotting VMs, and we don't need grow-on-demand allocation.
AFAICT we're paying the performance cost of qcow2 (L1/L2 table
lookups & metadata caching) for no reason. Switching the VMs to
fully pre-allocated raw files may improve I/O performance.
Raw LVM logical volumes would be even better, but that would be
painful to set up given how the host is installed.

I also wonder if we have the optimal aio setting for disks,
as there's nothing in the XML.
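As a sketch, a raw, fully pre-allocated image (converted with
e.g. `qemu-img convert -O raw`) with explicit cache and aio
settings might look like this in the guest XML - the file path is
a placeholder, and whether io='native' actually wins on this host
would need measuring:

```xml
<disk type='file' device='disk'>
  <!-- raw pre-allocated file; path is a placeholder -->
  <driver name='qemu' type='raw' cache='none' io='native'/>
  <source file='/var/lib/libvirt/images/builder-os.raw'/>
  <target dev='vda' bus='virtio'/>
</disk>
```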

We could consider using cache=unsafe for the VMs, though for
that I think we'd want to split off a separate disk for
/home/jenkins, so that if there was a host OS crash we
wouldn't have to rebuild the entire VMs - we'd just throw
away the data disk & recreate it.
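For instance (a sketch only - the device name and path are made
up), the throwaway /home/jenkins disk could be a second disk
where cache=unsafe is acceptable:

```xml
<!-- second, throwaway disk for /home/jenkins; its contents are
     disposable, so losing it on a host crash is fine -->
<disk type='file' device='disk'>
  <driver name='qemu' type='raw' cache='unsafe'/>
  <source file='/var/lib/libvirt/images/builder-data.raw'/>
  <target dev='vdb' bus='virtio'/>
</disk>
```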

Since we've got plenty of RAM, another obvious thing would be
to turn on huge pages and use them for all guest RAM. This may
well give a very significant performance boost by reducing the
CPU overhead which is our biggest bottleneck.
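With each guest at 2 GiB of RAM, that's 1024 2MiB huge pages per
guest to reserve on the host (via vm.nr_hugepages); the per-guest
XML change itself is just:

```xml
<memoryBacking>
  <hugepages/>
</memoryBacking>
```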

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|



