[libvirt PATCH 4/4] gitlab-ci: Introduce a new test 'integration' pipeline stage

Wed Mar 2 09:43:24 UTC 2022

On Wed, Mar 02, 2022 at 08:42:30AM +0100, Erik Skultety wrote:
> ...
> 
> > > > > +libvirt-perl-bindings:
> > > > > +  stage: bindings
> > > > > +  trigger:
> > > > > +    project: eskultety/libvirt-perl
> > > > > +    branch: multi-project-ci
> > > > > +    strategy: depend
> > > > > +
> > > > > +
> > > > > +centos-stream-8-tests:
> > > > > +  extends: .tests
> > > > > +  needs:
> > > > > +    - libvirt-perl-bindings
> > > > > +    - pipeline: $PARENT_PIPELINE_ID
> > > > > +      job: x86_64-centos-stream-8
> > > > > +    - project: eskultety/libvirt-perl
> > > > > +      job: x86_64-centos-stream-8
> > > > > +      ref: multi-project-ci
> > > > > +      artifacts: true
> > > >
> > > > IIUC from the documentation and from reading around, using
> > > > strategy:depend will cause the local job to reflect the status of the
> > > > triggered pipeline. So far so good.
> > > >
> > > > > What I am unclear about is, is there any guarantee that using
> > > > > artifacts:true with a job from an external project's pipeline will
> > > > > expose the artifacts from the job that was executed as part of the
> > > > > specific pipeline that we've just triggered ourselves as opposed to
> > > > > some other pipeline that might have already been completed in the
> > > > > past or might have completed in the meantime?
> > >
> > > > Not just by using artifacts:true or strategy:depend. The important bit is having
> > > 'libvirt-perl-bindings' in the job's needs list. Let me explain, if you don't
> > > put the bindings trigger job to the requirements list of another job
> > > (centos-stream-8-tests in this case) what will happen is that the trigger job
> > > will be waiting for the external pipeline to finish, but centos-stream-8-tests
> > > job would execute basically as soon as the container project builds are
> > > finished because artifacts:true would download the latest RPM artifacts from an
> > > earlier build...
> > 
> > No, I got that part. My question was whether
> > 
> >   other-project-pipeline:
> >     trigger:
> >       project: other-project
> >       strategy: depend
> > 
> >   our-job:
> >     needs:
> >       - other-project-pipeline
> >       - project: other-project
> >         job: other-project-job
> >         artifacts: true
> > 
> > actually guarantees that the instance of other-project-job whose
> > artifacts are available to our-job is the same one that was started
> > as part of the pipeline triggered by the other-project-pipeline job.
> 
> Sorry for a delayed response.
> 
> I don't think so. We can basically only rely on a fact that the jobs would
> actually be queued in order they arrive which means that jobs submitted earlier
> should finish earlier, but that of course is only a premise not a guarantee.
> 
> On the other hand I never intended to run the integration CI on every single
> push to the master branch, instead, I wanted to make this a scheduled pipeline
> which would effectively alleviate the problem, because with scheduled pipelines
> there would very likely not be a concurrent pipeline running in libvirt-perl
> which would make us download artifacts from a pipeline we didn't trigger
> ourselves.

Ultimately when we switch to using merge requests, the integration tests
should be run as a gating job, triggered from the merge train when the
code gets applied to git, so that we prevent regressions actually making
it into git master at all.
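
As a rough, untested sketch of what that gating could look like (it
assumes merge trains are enabled for the project; the job name and the
script are placeholders, not anything from the patch), the job can be
limited to merge train pipelines via GitLab's predefined
CI_MERGE_REQUEST_EVENT_TYPE variable:

  integration-tests:
    stage: integration
    rules:
      # Run only in pipelines created by a merge train
      - if: '$CI_MERGE_REQUEST_EVENT_TYPE == "merge_train"'
    script:
      - ./run-integration-tests.sh    # placeholder

A failing job then removes the merge request from the train, so the
regression never reaches master.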

Post-merge integration testing always exhibits the problem that people will
consider it somebody else's problem to fix the regression. So effectively
whoever creates the integration testing system ends up burdened with the
job of investigating failures and finding someone to poke to fix them. With
it run pre-merge, whoever wants to get their code merged needs to
investigate the problems. Now sometimes the problems will of course be
with the integration test system itself, not the submitter's code, but
this is OK because it leads to a situation where the job of maintaining
the integration tests is more equitably spread across all involved and
builds a mindset that functional / integration testing is a critical
part of delivering code, which is something we've lacked for too long
in libvirt.

> > > > Taking a step back, why exactly are we triggering a rebuild of
> > > > libvirt-perl in the first place? Once we change that project's
> > > > pipeline so that RPMs are published as artifacts, can't we just grab
> > > > the ones from the latest successful pipeline? Maybe you've already
> > > > explained why you did things this way and I just missed it!
> > >
> > > ...which brings us here. Well, I adopted the mantra that all libvirt-friends
> > > projects depend on libvirt and given that we need libvirt-perl bindings to test
> > > upstream, I'd like to always have the latest bindings available to test with
> > > the current upstream build. The other reason why I did the way you commented on
> > > is that during development of the proposal many times I had to make changes to
> > > both libvirt and libvirt-perl in lockstep and it was tremendously frustrating
> > > to wait for the pipeline to get to the integration stage only to realize that
> > > the integration job didn't wait for the latest bindings and instead picked up
> > > the previous latest artifacts which I knew were either faulty or didn't contain
> > > the necessary changes yet.
> > 
> > Of course that would be annoying when you're making changes to both
> > projects at the same time, but is that a scenario that we can expect
> > to be common once the integration tests are in place?
> > 
> > To be clear, I'm not necessarily against the way you're doing things
> > right now, it's just that it feels like using the artifacts from the
> > latest successful libvirt-perl pipeline would lower complexity, avoid
> > burning additional resources and reduce wait times.
> > 
> > If the only downside is having a worse experience when making
> > changes to the pipeline, and we can expect that to be infrequent
> > enough, perhaps that's a reasonable tradeoff.
> 
> I gave this more thought. What you suggest is viable, but the following is worth
> considering if we go with your proposal:
> 
> - libvirt-perl jobs build upstream libvirt first in order to build the bindings
>     -> generally it takes until right before the release that APIs/constants
>        are added to the respective bindings (Perl/Python)
>     -> if we rely on the latest libvirt-perl artifacts without actually
>        triggering the pipeline, yes, the artifacts would be stable, but fairly
>        old (unless we schedule a recurrent pipeline in the project to refresh
>        them), thus not giving us feedback from the integration stage that
>        bindings need to be added first, because the API coverage would likely
>        fail, thus failing the whole libvirt-perl pipeline and thus invalidating
>        the integration test stage in the libvirt project
>         => now, I admit this would get pretty annoying because it would force
>            contributors (or the maintainer) who add new APIs to add respective
>            bindings as well in a timely manner, but then again ultimately we'd
>            like our contributors to also introduce an integration test along
>            with their feature...

Note right now the perl API coverage tests are configured to only be gating
when run on nightly scheduled jobs. I stopped them being gating on contributions
because if someone is fixing a bug in the bindings it is silly to force
their merge request to also add new API bindings.
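
For reference, a "gating only on schedules" setup can be expressed
roughly like this (a sketch, not the actual libvirt-perl configuration;
the job name and script are placeholders):

  api-coverage:
    script:
      - ./run-api-coverage.sh    # placeholder
    rules:
      # Nightly scheduled pipelines: a failure fails the pipeline
      - if: '$CI_PIPELINE_SOURCE == "schedule"'
        allow_failure: false
      # Everything else: a failure only shows up as a warning
      - when: on_success
        allow_failure: true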

I'm thinking about whether we should make the API coverage tests
non-gating even for scheduled jobs. I dislike the fact that when we see a
notification of a failed pipeline we don't see at a glance whether it is
a genuine build failure or merely a new API missing.
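
One possible way to get that distinction, purely as an illustration: if
the coverage test exited with a dedicated code when an API is merely
missing (the value 42 and the script name below are made up), GitLab's
allow_failure:exit_codes would report that case as a warning while any
other failure still fails the pipeline:

  api-coverage:
    script:
      # placeholder; assumed to exit 42 only when a new API is missing
      - ./run-api-coverage.sh
    allow_failure:
      exit_codes: 42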

The python bindings are in a slightly different situation. Sometimes the
code generator can do the job on its own, but other times the code generator
trips over its cane and breaks a leg. In the latter cases, we're always going
to get a hard CI failure we can't ignore, unless we teach the code generator
to make it a soft failure and just skip the API with a warning when it is
something it can't cope with. I think if we're going to use the python
bindings from automated tests we'll have no choice but to make the code
generator treat it as a soft failure, if we want to use the tests as a
gating check, otherwise you'll end up with a chicken & egg problem between
merging new APIs to C lib and Python.

> > > What's the point, we'd have to constantly refresh the tags if the platforms
> > > come and go given our support, whereas fedora-vm and centos-stream-vm cover all
> > > currently supported versions - always!
> > > Other than that, I'm not sure that tags are passed on to the gitlab job itself,
> > > I may have missed it, but unless the tags are exposed as env variables, the
> > > provisioner script wouldn't know which template  to provision. Also, the tag is
> > > supposed to annotate the baremetal host in this case, so in that context having
> > > '-vm' in the tag name makes sense, but doesn't for the provisioner script which
> > > relies on/tries to be compatible with lcitool as much as possible.
> > 
> > Okay, my misunderstanding was caused by not figuring out the purpose
> > of DISTRO. I agree that more specific tags are not necessary.
> > 
> > Should we make them *less* specific instead? As in, is there any
> > reason for having different tags for Fedora and CentOS jobs as
> > opposed to using a generic "this needs to run in a VM" tag for both?
> 
> Well, I would not be against, but I feel this is more of a political issue:
> this HW was provided by Red Hat with the intention to be dedicated for Red Hat
> workloads. If another interested 3rd party comes (and I do hope they will) and
> provides HW, we should utilize the resources fairly in a way respectful to the
> donor's/owner's intentions, IOW if party A provides a single machine to run
> CI workloads using Debian VMs, we should not schedule Fedora/CentOS workloads
> in there effectively saturating it.
> So if the tags are to be adjusted, then I'd be in favour of recording the owner
> of the runner in the tag.

If we have hardware available, we should use it to the best of its ability.
Nothing is gained by leaving it idle if it has spare capacity to run jobs.

Until we start using it though, we will not have a clear idea of how many
distro combinations we can cope with for integration testing. We'll also
want to see how stable the jobs prove to be as we start using it for real.
With that in mind it makes sense to start off with a limited number of distro
jobs and monitor the situation. If it is reliable and the machine shows it
has capacity to run more then we can add more, picking distros that give
the maximum benefit in terms of identifying bugs.  IOW, I would much
rather run 1x CentOS Stream + 1x Fedora + 1x Debian + 1x Suse, than
2x CentOS and 2x Fedora, because the former will give much broader
ability to find bugs.
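
In terms of the job definitions in this patch, that would be something
along these lines. This is only a sketch: the .tests template, DISTRO
and the fedora-vm / centos-stream-vm tags come from this thread, while
the other job names, DISTRO values and the debian-vm / opensuse-vm tags
are invented for illustration:

  centos-stream-8-tests:
    extends: .tests
    tags:
      - centos-stream-vm
    variables:
      DISTRO: centos-stream-8

  fedora-tests:
    extends: .tests
    tags:
      - fedora-vm
    variables:
      DISTRO: fedora-35           # illustrative value

  debian-tests:
    extends: .tests
    tags:
      - debian-vm                 # hypothetical tag
    variables:
      DISTRO: debian-11           # illustrative value

  opensuse-tests:
    extends: .tests
    tags:
      - opensuse-vm               # hypothetical tag
    variables:
      DISTRO: opensuse-leap-153   # illustrative value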

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|