[libvirt PATCH 0/4] RFC: tests: introduce lavocado

Mon Jul 26 13:27:36 UTC 2021

On Wed, Jul 21, 2021 at 07:26:49PM +0100, Daniel P. Berrangé wrote:
> On Tue, Jul 20, 2021 at 02:19:25PM +0200, Erik Skultety wrote:
> > On Thu, Jul 01, 2021 at 07:04:32PM +0100, Daniel P. Berrangé wrote:
> > > Libvirt has consumers writing applications in a variety of
> > > languages, and periodically reporting bugs.  My general wish
> > > for a test harness would be for something that can invoke
> > > and consume arbitrary test cases. Essentially if someone
> > > can provide us a piece of code that demonstrates a problem
> > > in libvirt, I would like to be able to use that directly as
> > > a functional test, regardless of language it was written
> > > in.
> > 
> > Well, in theory yes - it is all nice to let anyone contribute tests in their
> > language of preference. Realistically, look at the nwfilter stuff in TCK, it's
> > a pile of unintelligible Bash which cannot be debugged. So, I understand why
> > you'd want to imagine or write tests in Go, but someone will have to maintain
> > them otherwise we'd end up with a 3rd framework nobody will want to use.
> 
> The nwfilter shell script was just badly written, and we should have
> pushed back on accepting it in that state, but it is was so broadly
> functional it was preferrable than getting no testing. It could have
> been just as horrible if written in Perl or Python too - the core
> approach of the tests themselves was just wrong. The use of shell
> was just icing on the cake.
> 
> 
> The main point is that we have applications consuming libvirt in
> multiple languages. Python and Go are the most important ones in
> terms of usage today, and we get some quite detailed bug reports
> with demo code. If we get a bug report with a Go demo program,
> it can't be assumed to be easy to just convert that to Python,
> and vica-verca.

If the bug is against the Go/Whatever bindings, the test should be integrated
there. If it's against libvirt core, I don't see how the test case could not
be ported nor reproduced with Python and so I'd strictly mandate Python - now,
I'm not saying we should reject anything, what I'm saying is that the same way
we don't accept just any code in any language in libvirt upstream (even if
people went the extra mile and made sure the whole project would just work)
and people have to re-spin a number of revisions until the community is
satisfied enough to merge it because of both stability and maintainability
reasons, we should approach the tests with the same care and set clear
expectations that must be met if a test case is to be contributed.

If a test case is contributed as part of some code changes, then it has to
conform to our expecations == Python.
On the other hand if it's part of a bug report, than we need to IMO invest the
time into porting that test case to our selected language of preference
ourselves.

...

> > > I'm wondering if we can't bridge the divide between what we
> > > have already in libvirt-tck, and what you're bringing to the
> > > table with avocado here. While we've not done much development
> > > with the TCK recently, there are some very valuable tests
> > > there, especially related to firewall support and I don't
> > > fancy rewriting them.
> > 
> > Nobody does fancy rewriting those, but there's no way going around that, we'll
> > have to do it at some point because like I said, they're very error prone
> > since they match against the verbatim output of firewall CLI tools which leads
> > to frequent breakages and you figuring out what went wrong is pure torment.
> 
> Agreed, we'll have to bite the bullet eventually. The hard thing is
> deciding just how we want to validate correctness. Comparing stdout
> format is an "easy" way. If we don't do that we need to figure out a
> way to declare what the firewall state is suposed to be, and how to
> validate a match, as we want to validate that the correct set of
> rules is actually generated.
> 
> Possibly we should have used a python binding for libxtables to
> query the firewall state and thus been in control of the expected
> data format.

I don't know anything about libxtables, so I'll happily leave it on the table
and hope for the best in terms of relying on a stable data format,
but I guess the proper way of handling networking is to come up with helper
code that constructs and sends raw packets to test whether the firewall rules
we put in place as part of a test case actually work.

...

> 
> > > Thus my suggestion is that we:
> > > 
> > >   - Put this avocado code into the libvirt-tck repository,
> > >     with focus on the supporting infra for making it easy to
> > >     write Python tests
> > 
> > Clearly I missed the section where you mentioned that we don't need to host
> > the framework in the main repo...Anyway, I disagree here, again, we have a
> > precedent of 2 failed frameworks where nobody really knows where to contribute
> > a test nor wants to deal with any additional repos. A few remarks
> > - it is much easier if the
> >   test suite lives alongside our unit test suite which is an "environment" every
> >   developer is familiar with.
> 
> I don't buy that argument. We already have countless repos and people
> who are motivated to contribute do so easily. The failures of the
> existing frameworks is down to a number of reasons. People really
> didn't want to write Perl code is the big one for the TCK. For the
> other framework its complexity and fact that it was not safe to
> run on precious hosts put off ever using it.
> 
> The biggest thing though is that we never even made an effort to
> develop a culture of writing functional tests as a prerequisite
> for merging feature.
> 
> In QEMU there is a avocado integration in-repo, but very few
> QEMU contributors ever do anything with it.  The block iotests
> integration is another functional test suite and that is very
> successful, because the block maintainers have set clear
> expectation that you must contributor tests to it if you want
> your features accepted.
> 
> We can see the same with other projects which have a strong
> culture of testing. In OpenStack you'll not get your feature
> accepted unless you've made a credible effort to provide
> functional tests - which live in a completely separate git
> repo.
> 
> > - integration into gitlab CI would be much easier with a single repo than with
> >   a multi-repo setup
> >     -> yes, we already do that with binding builds, but I'm talking about
> >        setups where we'd have to configure gitlab to clone 3 repos and build
> >        them in chain, setup the framework config and execute the tests;
> 
> As you say, we already deal with that. Cloning extra git repos is
> a complete non-issue in the grand scheme of everything needed to
> create and maintain a CI pipeline.  If we really wanted to we could
> create a psuedo-monolithic repo and graft in the individual repos
> using 'git subtree'.

Well, it's not about that it cannot be done, it's about developer experience.
Let's say I'm working on a feature to which I'm expected to contribute a test
case for.

I have to clone 2 repos and make changes to both - this is fine, even though
having it in a single repo would speed things up. Now I have to test my test
case, what then?
    - I need to build and deploy libvirt myself
    - I need to build/run the test suite against the libvirt I built in the
      previous step - all manually!

If we instead have everything in the same repo, we can (and we indeed plan to)
have automation helpers, that would prepare the environment for you and with a
single "make integration_test" an environment would be created, git would be
clone, libvirt built and installed, and test run against that. Also the review
process would be much more predictable, because a feature could not be merged
before the corresponding test was merged which is much easier to do in a single
repo than 2 individual ones, especially with a one that is solely based on a
merge request workflow which on its own is enough for many core libvirt
developers to be discouraged to bother with contributing.

All I'm trying to say here is that we need to put the developer's needs and
workflow first, because in the end, it's them who are going to both contribute
and maintain those tests. And so we should get more developer voices here (so
far it was just you Dan, Cleber, Beraldo, and myself) - if the overall
consensus is in favour of usage of arbitrary languages for test cases and
hosting the repo separately, by all means let's do it as long as the core
development team agrees on it, otherwise we'll just end up with another useless
framework and thus many human hours wasted on the CI efforts.

Erik