[Tendrl-devel] Former Console 2.0 QE upstream CI strategy

Martin Bukatovic mbukatov at redhat.com
Wed Sep 7 09:23:12 UTC 2016


On 09/06/2016 07:26 PM, John Spray wrote:
> On Tue, Sep 6, 2016 at 5:54 PM, Martin Bukatovic <mbukatov at redhat.com> wrote:
>> On 09/06/2016 05:38 PM, John Spray wrote:
>>> This point would be a good reason to consider using the existing Ceph
>>> test framework (teuthology) -- we have a public environment.  You
>>> could for example have a private teuthology environment if you wanted,
>>> but when folks wanted to run their tests in a public upstream
>>> environment they could either ask for resources in the "sepia" Ceph
>>> environment, or they could use teuthology's openstack mode to run on
>>> an openstack cloud.
>>
>> I don't think we could use teuthology directly. But maybe we could at
>> least create some wrapper which would run our tests in teuthology
>> environment.
> 
> Can you elaborate on why you can't use teuthology?  It seems like
> testing a management console on top of a Ceph cluster would have
> extremely similar environmental requirements to just testing the Ceph
> cluster.  Yes, at some stage you need to be able to add Gluster
> support too, but that's not a reason to start from scratch.

We need to work with both the gluster and ceph teams. The problem
here is that, while we integrate with both teams, we don't share the
same testing strategy: we don't need to retest all possible details
of ceph or gluster use cases. We will test only the use cases of the
console, assuming that ceph and gluster themselves work.

We use Selenium-based automation to click through the web interface,
or we script REST API requests. In both cases, when some operation is
started, we need to check on the actual machines whether it was done
as expected (e.g. a volume was created). So we have a centralized
testing model, where the tests run on a machine from which we access
the console via HTTP and the other nodes via ssh (when necessary).
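To make that centralized model concrete, here is a minimal sketch
(the helper names are hypothetical, not our actual code) of how a test
could verify on a storage node that an operation started through the
console really happened. The transport is injected so the check logic
can be exercised without a live cluster:

~~~python
# Sketch only: check over ssh that a gluster volume exists on a node.
import subprocess

def ssh_command(host, command):
    """Build the ssh invocation used to inspect a cluster node."""
    return ["ssh", host, command]

def volume_created(host, volume, run=subprocess.check_output):
    """Return True if the given volume is listed on the given node.

    `run` defaults to a real ssh call, but can be replaced by a stub
    in unit tests or by another transport.
    """
    out = run(ssh_command(host, "gluster volume list"))
    return volume in out.decode().split()

# Example with a fake runner standing in for ssh:
fake = lambda argv: b"vol_alpha\nvol_beta\n"
assert volume_created("mbukatov-usm2-node1.example.redhat.com",
                      "vol_alpha", run=fake)
~~~

In a real run, `run` would perform an actual ssh call to the node; the
`fake` stub above only demonstrates the check logic.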

And since we need to work with two different teams that have two
completely different testing strategies and automation stacks, it
wouldn't make sense for us (at least at first sight) to just pick
either the ceph or the gluster QE framework, as we would need to
extend it to:

* be able to test the web UI/REST API with it while running
  other test cases
* use the other test framework we didn't pick
* provide a way for both the gluster and ceph teams to run our tests
  (e.g. to set up a cluster via tendrl)

It would make more sense if we could reuse particular components
from both teams when necessary. But the problem here is that both
teuthology[1] and distaf[2] look like quite monolithic, all-in-one
solutions. There is also a new gluster QE tool, glusto[3], which
doesn't look that monolithic.

[1] http://docs.ceph.com/teuthology/docs/README.html
[2] https://github.com/gluster/distaf
[3] https://glusto.readthedocs.io/en/latest/

That said, maybe I'm missing a few important details and my impression
is wrong, but at least at this point, it seems to me that it would be
better for our team to aim for modular test automation, which would
allow us to share particular components with other teams. For this
reason, our current plan is to use the pytest framework for our
testing and ansible for setup/configuration.

Our current plan
================

Our plan looks like this:

 * use the pytest framework as a base
 * our test code is python 3 only, while libraries/extensions we
   maintain will support both python 2 and 3
 * write a few pytest plugins/extensions to provide the functionality
   we need
 * use some pytest extensions from other teams
 * use ansible for setup
 * use the ansible inventory file for both ansible itself and our
   tests (to describe the machines and their roles in the cluster)

We would like to draw inspiration from the ManageIQ QE team, which has
a similar scenario: they test a management server which provides
both a web UI and a REST API, and controls/manages (to some degree)
several other machines (virtual machines running in a cloud in the
ManageIQ case; storage machines in our case). Their repo is here:

https://github.com/ManageIQ/integration_tests

It would be great if we were able to cooperate with them and split
some modules out, so that both teams could use them without pulling
in the whole framework and its tests.

When it comes to managing the machines, we would like to be able to
utilize any kind of machine, from the real hardware the storage team
has in the lab to libvirt or OpenStack virtual machines. To describe
the machines of a particular cluster, we use an ansible inventory
file.

For example, this is what the mbukatov-usm2.hosts file looks like
(it's generated by our deployment automation service):

~~~
[usm_nodes]
mbukatov-usm2-mon1.example.redhat.com
mbukatov-usm2-mon2.example.redhat.com
mbukatov-usm2-mon3.example.redhat.com
mbukatov-usm2-node1.example.redhat.com
mbukatov-usm2-node2.example.redhat.com
mbukatov-usm2-node3.example.redhat.com
mbukatov-usm2-node4.example.redhat.com
mbukatov-usm2-node5.example.redhat.com
mbukatov-usm2-node6.example.redhat.com

[ceph_mon]
mbukatov-usm2-mon1.example.redhat.com
mbukatov-usm2-mon2.example.redhat.com
mbukatov-usm2-mon3.example.redhat.com

[ceph_osd]
mbukatov-usm2-node1.example.redhat.com
mbukatov-usm2-node2.example.redhat.com
mbukatov-usm2-node3.example.redhat.com
mbukatov-usm2-node4.example.redhat.com
mbukatov-usm2-node5.example.redhat.com
mbukatov-usm2-node6.example.redhat.com

[usm_server]
mbukatov-usm2-server.example.redhat.com
~~~

This file lists all machines of the particular test cluster
(mbukatov-usm2) and groups them into roles (from the tendrl
perspective).
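As an illustration of how test code could consume such a file, here is
a minimal hand-rolled parser (assumed names; real code would more
likely use ansible's own inventory API) that turns the group/host
format above into a role -> hosts mapping:

~~~python
# Sketch only: parse a simple ansible inventory (groups and hosts,
# no host variables) into a dict mapping group name to host list.
def parse_inventory(text):
    groups = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blanks and comments
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
            groups[current] = []
        elif current is not None:
            groups[current].append(line)
    return groups

sample = """\
[ceph_mon]
mbukatov-usm2-mon1.example.redhat.com
mbukatov-usm2-mon2.example.redhat.com

[usm_server]
mbukatov-usm2-server.example.redhat.com
"""
groups = parse_inventory(sample)
assert groups["usm_server"] == ["mbukatov-usm2-server.example.redhat.com"]
~~~

This only covers the simple host-per-line format shown above; host
variables or group nesting would need ansible's real inventory
handling.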

The inventory is then used:

* when running our QE ansible playbooks, both during deployment and
during setup for a particular test or set of tests (e.g. one playbook
would install a stress package on given roles, another would
configure the firewall based on the documentation; such a playbook
could be used both for manual testing and during an automated test
run)
* when we need to check something ad hoc (from the command line)
* from our test code, to perform some operation on every machine of a
particular group (e.g. the ceph_mon nodes)

This way, we have a unified elementary description of the cluster
machines.
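A sketch of how that description could be exposed to pytest test cases
through a session-scoped fixture (the fixture and test names are
illustrative, not part of any existing framework). An inventory
without host variables happens to parse as an ini file with value-less
keys:

~~~python
# Sketch only: share the role -> hosts mapping with all tests.
import configparser

import pytest

SAMPLE = """\
[ceph_mon]
mbukatov-usm2-mon1.example.redhat.com
mbukatov-usm2-mon2.example.redhat.com
"""

def load_inventory(text):
    # value-less keys require allow_no_value; restrict delimiters so
    # nothing in a hostname is misread as a key/value separator
    parser = configparser.ConfigParser(allow_no_value=True,
                                       delimiters=("=",))
    parser.read_string(text)
    return {group: list(parser[group]) for group in parser.sections()}

@pytest.fixture(scope="session")
def cluster_hosts():
    # in real use the inventory text would be read from the .hosts file
    return load_inventory(SAMPLE)

def test_every_mon_node(cluster_hosts):
    for host in cluster_hosts["ceph_mon"]:
        # placeholder for an actual ssh-based check against the node
        assert host.endswith(".example.redhat.com")
~~~

The same mapping could also drive parametrization, so one test
function runs once per machine of a group.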

>> Could you point me to a description of how teuthology describes
>> machines of a cluster?
> 
> http://docs.ceph.com/teuthology/docs/detailed_test_config.html#test-configuration
> 
> Machines themselves are generally identical within a particular
> cluster so there's no description as such.
> 
> CCing the sepia list which has bigger teuthology experts than myself.

That looks interesting, thanks for the pointer.

We would need to study teuthology in more detail anyway, so I have
created a ticket to track that here:

https://tendrl.atlassian.net/browse/TEN-23

-- 
Martin Bukatovic
USM QE team



