[Avocado-devel] Running tests in parallel

Wed Nov 23 15:26:45 UTC 2016

On Wed, Nov 23, 2016 at 12:46 PM Zubair Lutfullah Kakakhel <
Zubair.Kakakhel at imgtec.com> wrote:

> Hi,
>
> On 11/23/2016 02:28 PM, Cleber Rosa wrote:
> >
> > On 11/23/2016 07:07 AM, Zubair Lutfullah Kakakhel wrote:
> >> Hi,
> >>
> >> Thank you for your comprehensive reply!
> >>
> >> Comments inline.
> >>
> >> On 11/22/2016 02:11 PM, Cleber Rosa wrote:
> >>> On 11/22/2016 07:53 AM, Zubair Lutfullah Kakakhel wrote:
> >>>> Hi,
> >>>>
> >>>
> >>> Hi Zubair,
> >>>
> >>>> There are quite a few threads about this and a trello card
> >>>> https://trello.com/c/xNeR2slj/255-support-running-tests-in-parallel
> >>>>
> >>>> And the discussion leads to a complex multi-host RFC.
> >>>>
> >>>> https://www.redhat.com/archives/avocado-devel/2016-March/msg00025.html
> >>>>
> >>>> Our requirement is simpler.
> >>>> All we wanted to do is run disjoint simple (c executables) tests in
> >>>> parallel.
> >>>>
> >>>
> >>> Sounds fair enough.
> >>>
> >>>> I was wondering if somebody has a WIP branch that has some level of
> >>>> implementation for this?
> >>>
> >>> I'm not familiar with a WiP or PoC on this (yet).  If anyone has
> >>> experimented with it, I'd be happy to hear about it.
> >>>
> >>>> Or If somebody is familiar with the code base, I'd appreciate some
> >>>> direction on how to implement this.
> >>>>
> >>>
> >>> Avocado already runs every single test in a fresh new process.  This
> >>> is, at least theoretically, a good start.  Also, the test process is
> >>> handled based on the standard Python multiprocessing module:
> >>>
> >>>
> >>> https://github.com/avocado-framework/avocado/blob/master/avocado/core/runner.py#L363
> >>>
> >>>
> >>> The first experiment I'd try would be to use multiprocessing.Pool,
> >>> which is also part of the Python standard library:
> >>>
> >>>
> >>> https://docs.python.org/2.7/library/multiprocessing.html#using-a-pool-of-workers
> >>>
> >>
> >> In this case, there would be a separate python thread for each test
> >> being run in parallel.
> >> Each python thread would actually call the test executable using a
> >> sub-process?
> >>
> >
> > Ideally, the Avocado test runner would remain a single process, that is,
> > without one additional thread (or process) to manage each *test* process.
> >
> >> That can be OK for Desktops but won't scale well for using avocado in
> >> memory constrained Embedded devices.
> >>
> >
> > I must admit I haven't attempted to run Avocado in resource constrained
> > environments.  Can you explain what your biggest concern is?
>
> In our case, primarily memory, even for dormant processes, although CPU
> usage is also a concern.
>
> Imagine running Avocado on a slightly beefy WiFi router with 128 MB of RAM.
> A single Python process is already slow/difficult. Run a few Python
> processes in parallel and the kernel Out of Memory killer starts killing
> processes.
>
> >
> > Do you feel that Avocado (as a single process test *runner*) plus one
> > process for each *test* is not suitable to those environments?
>
> Avocado should only be running one process ideally.
> And each test should be running 'only' its process.
>
> I think we've confused the dialogue with terminology
> (threads/processes/subprocesses/multiprocessing).
> I'll attempt to clarify.
>
> My current understanding of Avocado
>
> Avocado-runner parent process
> runs - > Avocado test thread using multiprocessing.Process here [1]
>           run - > Actual test executable using subprocess here [2]
>

We use subprocess only if the test is an executable. If it's an Avocado
'instrumented' test, we execute the test class's main entry point.

> Is this correct?
> Is there a particular purpose the runner starts a separate thread which
> actually calls the test executable?
>

Yes, test isolation. We don't want tests modifying and messing with the
main runner process in any capacity.
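
To illustrate the isolation point, here is a minimal sketch (not Avocado's
actual code). The names RUNNER_STATE, misbehaving_test and run_isolated are
hypothetical, and the "fork" start method is an assumption that limits it to
POSIX systems. The test body runs in its own multiprocessing.Process, so
whatever it does to module-level state never reaches the runner:

```python
import multiprocessing
import os

# Stand-in for state owned by the runner (hypothetical, for illustration).
RUNNER_STATE = {"results_dir": "/tmp/job-results"}


def misbehaving_test(queue):
    """Hypothetical test body that tramples on module-level state."""
    RUNNER_STATE["results_dir"] = None  # only changes the child's copy
    queue.put({"pid": os.getpid(), "status": "PASS"})


def run_isolated(test_func):
    """Run test_func in its own process and return the status it reports."""
    # Assumption: "fork" (POSIX only) keeps the sketch self-contained.
    ctx = multiprocessing.get_context("fork")
    queue = ctx.Queue()
    proc = ctx.Process(target=test_func, args=(queue,))
    proc.start()
    status = queue.get(timeout=30)  # read before join() to avoid blocking
    proc.join()
    return status


if __name__ == "__main__":
    status = run_isolated(misbehaving_test)
    print(status["status"], "ran in a separate process:",
          status["pid"] != os.getpid())
    print("runner state:", RUNNER_STATE["results_dir"])  # unchanged here
```

The price of this isolation is the extra Python process per test that the
question is about.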

> Now coming back to running tests in parallel.
>
> You mentioned using multiprocessing.Pool. In that case, there could be
> a potential issue for constrained devices.
> e.g. Running 4 tests in parallel.
>
> Avocado-runner parent process
> runs - > Avocado test thread using multiprocessing.Process here [1]
>           run - > Actual test process using subprocess here [2]
> runs - > Avocado test thread using multiprocessing.Process here [1]
>           run - > Actual test process using subprocess here [2]
> runs - > Avocado test thread using multiprocessing.Process here [1]
>           run - > Actual test process using subprocess here [2]
> runs - > Avocado test thread using multiprocessing.Process here [1]
>           run - > Actual test process using subprocess here [2]
>
> So 4 tests would actually result in 9 processes being created.
> 1 runner in Python
> 4 mostly dormant Python multiprocessing processes (their purpose is a bit
> unclear)
> 4 actual executables.
>

I hope the use of multiprocessing is clearer now.

> Ideally, that number should be 5 for running 4 tests.
> Avocado-runner parent process
> run - > Actual test process using subprocess here [2]
> run - > Actual test process using subprocess here [2]
> run - > Actual test process using subprocess here [2]
> run - > Actual test process using subprocess here [2]
>

I suppose you could hack the entire test loader system to avoid
multiprocessing in the case we want to run executables, but I'm not a big
fan of the idea.
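
For reference, the "1 runner + N executables" layout could look roughly like
the sketch below. This is not how Avocado works today. TESTS and run_parallel
are hypothetical names, and the commands use the Python interpreter only as a
stand-in for real C executables. A single process starts every executable
with subprocess.Popen and polls them, so statuses come back in completion
order rather than submission order:

```python
import subprocess
import sys
import time

# Stand-ins for disjoint test executables (hypothetical commands).
TESTS = {
    "passes": [sys.executable, "-c", "raise SystemExit(0)"],
    "fails": [sys.executable, "-c", "raise SystemExit(1)"],
    "slow": [sys.executable, "-c", "import time; time.sleep(0.2)"],
    "quick": [sys.executable, "-c", "pass"],
}


def run_parallel(tests, poll_interval=0.05):
    """Start every executable at once; return (name, status) in finish order."""
    running = {name: subprocess.Popen(cmd) for name, cmd in tests.items()}
    finished = []
    while running:
        for name, proc in list(running.items()):
            returncode = proc.poll()  # None while the child is still running
            if returncode is not None:
                finished.append((name, "PASS" if returncode == 0 else "FAIL"))
                del running[name]
        if running:
            time.sleep(poll_interval)
    return finished


if __name__ == "__main__":
    for name, status in run_parallel(TESTS):
        print(name, status)
```

Note that the sketch has no timeout handling or output capture, and it only
applies to executables: an instrumented test would still need its own Python
process to stay isolated from the runner.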

> I hope this doesn't look even more confusing :)
>
> Regards,
> ZubairLK
>
> [1]
> https://github.com/avocado-framework/avocado/blob/master/avocado/core/runner.py#L363
> [2]
> https://github.com/avocado-framework/avocado/blob/master/avocado/utils/process.py#L273
>
> >
> > - Cleber.
> >
> >> Please correct me if I am reading this incorrectly.
> >>
> >> Regards,
> >> ZubairLK
> >>
> >>>
> >>> This would most certainly lead to changes in how Avocado currently
> >>> serially waits for the test status:
> >>>
> >>>
> >>> https://github.com/avocado-framework/avocado/blob/master/avocado/core/runner.py#L403
> >>>
> >>>
> >>> Which ultimately is added to the (Job wide) results:
> >>>
> >>>
> >>> https://github.com/avocado-framework/avocado/blob/master/avocado/core/runner.py#L455
> >>>
> >>>
> >>> Since the results for many tests will now be acquired in unpredictable
> >>> order, this will require changes to the ResultEvent based plugins (such
> >>> as the UI).
> >>>
> >>>> Thanks
> >>>>
> >>>> Regards,
> >>>> ZubairLK
> >>>>
> >>>
> >>> I hope this is a good initial set of pointers.  If you feel adventurous
> >>> and want to start hacking on this, you're more than welcome.
> >>>
> >>> BTW: we've had quite a number of features that started as
> >>> experiments/ideas/not-really-perfect-pull-requests from the community
> >>> that Avocado "core team" members embraced and pushed all the way to
> >>> completeness.
> >>>
> >
>
>