[libvirt PATCH] ci: Reduce number of stages

Daniel P. Berrangé berrange at redhat.com
Wed Jun 10 12:31:55 UTC 2020

On Wed, Jun 10, 2020 at 01:14:51PM +0100, Daniel P. Berrangé wrote:
> On Wed, Jun 10, 2020 at 01:33:01PM +0200, Andrea Bolognani wrote:
> > Right now we're dividing the jobs into three stages: prebuild, which
> > includes DCO checking as well as building artifacts such as the
> > website, and native_build/cross_build, which do exactly what you'd
> > expect based on their names.
> > 
> > This organization is nice from the logical point of view, but results
> > in poor utilization of the available CI resources: in particular, the
> > fact that cross_build jobs can only start after all native_build jobs
> > have finished means that if even a single one of the latter takes a
> > bit longer the pipeline will stall, and with native builds taking
> > anywhere from less than 10 minutes to more than 20, this happens all
> > the time.
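
To make the stalling argument above concrete, here is a rough model of
it. This is only a sketch: it assumes there are enough runners for every
job in a stage to start at once, and the function names are made up for
illustration, not taken from the patch or from GitLab.

```python
def staged_duration(stages):
    """Wall-clock time when stages run one after another.

    `stages` is a list of stages, each a list of job durations in
    minutes. Every stage acts as a barrier: the next stage cannot start
    until the slowest job of the current one has finished.
    """
    return sum(max(jobs) for jobs in stages)


def merged_duration(stages):
    """Wall-clock time when all the jobs share a single stage.

    Assumes (hypothetically) unlimited runners, so the pipeline takes
    as long as its slowest job.
    """
    return max(job for jobs in stages for job in jobs)
```

In this idealized model a single 20-minute native build delays every
cross build by 20 minutes, whatever the other native builds take. It
deliberately ignores runner contention and cache effects, which also
influence real pipeline times.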
> > 
> > Building artifacts in a separate pipeline stage also doesn't have any
> > advantages, and only delays further stages by a couple of minutes.
> > The only job that really makes sense in its own stage is the DCO
> > check, because it's extremely fast (less than 1 minute) and, if that
> > fails, we can avoid kicking off all other jobs.
>
> The advantage of using stages is that it makes it easy to see at a
> glance where the pipeline was failing.
> > 
> > Reducing the number of stages results in significant speedups:
> > specifically, going from three stages to two stages reduces the
> > overall completion time for a full CI pipeline from ~45 minutes[1]
> > to ~30 minutes[2].
> > 
> > [1] https://gitlab.com/abologna/libvirt/-/pipelines/154751893
> > [2] https://gitlab.com/abologna/libvirt/-/pipelines/154771173
>
> I don't think this time comparison is showing a genuine difference.
> If we look at the original staged pipeline, every single individual
> job took much longer than every individual job in the simplified
> pipeline. I think the difference in job times accounts for most
> (possibly all) of the difference in the pipeline times.
>
> If we look at the history of libvirt pipelines:
>
>    https://gitlab.com/libvirt/libvirt/pipelines
>
> the vast majority of the time we're completing in 30 minutes or
> less already.
>
> If you want to demonstrate a time improvement from these merged
> stages, then run 20 pipelines over a couple of days and show
> that they're consistently better than what we see already, and
> not just a reflection of the CI infra load at a point in time.
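
One possible way to do that comparison, as a sketch only: collect the
total duration of each pipeline run in minutes and compare the median
and a high percentile of the two sets, so that a single lucky or
unlucky run doesn't decide the result. The function names and the
choice of the 90th percentile are assumptions made for illustration.

```python
from statistics import median, quantiles


def summarize(durations_min):
    """Return (median, 90th percentile) of pipeline durations in minutes.

    Needs at least two samples; a couple of dozen runs would be better.
    """
    # quantiles(n=10) yields nine cut points; the last is the 90th percentile.
    p90 = quantiles(durations_min, n=10, method="inclusive")[-1]
    return median(durations_min), p90


def consistently_faster(baseline, candidate):
    """True only if the candidate beats the baseline on both figures.

    Requiring both the typical case (median) and the slow case (p90) to
    improve guards against a result driven by a few outlier runs.
    """
    base_med, base_p90 = summarize(baseline)
    cand_med, cand_p90 = summarize(candidate)
    return cand_med < base_med and cand_p90 < base_p90
```

Feeding it the durations of the existing staged pipelines as `baseline`
and those of the merged-stage pipelines as `candidate` gives a yes/no
answer that is less sensitive to CI load at any one moment than a
single pair of runs.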

Also remember that we're using ccache, so slower builds may just be a
reflection of ccache having a low hit rate - a sequence of repeated
builds of the same branch should show whether that's the case.
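
A minimal sketch of that check, assuming the job durations of repeated
builds of the same branch have been recorded in the order they ran
(the function name is hypothetical):

```python
from statistics import median


def cold_cache_slowdown(durations_min):
    """Ratio of the first build's duration to the median of later builds.

    `durations_min` holds the durations, in minutes, of repeated builds
    of the same branch in the order they ran. A ratio well above 1
    suggests the first build ran with a mostly cold cache; a ratio close
    to 1 suggests the cache hit rate was not the main factor.
    """
    if len(durations_min) < 3:
        raise ValueError("need the first build plus at least two repeats")
    first, later = durations_min[0], durations_min[1:]
    return first / median(later)
```

This only looks at timings; it does not read ccache's own statistics,
which would be a more direct measurement where they are available.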

|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
