[Avocado-devel] [RFC] Improve job status

Olav Philipp Henschel olavph at linux.vnet.ibm.com
Sat Apr 9 01:23:00 UTC 2016


Hi guys,

I liked the idea of the ORable exit codes.
Have you thought about what the exit code and behavior should be when 
avocado is aborted with other signals, such as SIGTERM?
Is it AVOCADO_FAIL? Shouldn't it handle the signal similarly to SIGINT 
and interrupt the current test immediately?
I'm asking because I use an automated environment that calls 
avocado. I tried to abort it with SIGTERM, but it looks like it 
does not handle this signal, so I resorted to sending SIGINT, waiting 
out the 2-second interval, and sending SIGINT again. This worked for me, 
but the different behaviors seemed confusing at first.
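
The workaround described above can be sketched roughly as follows. This is only an illustration: `interrupt_twice` is a hypothetical helper name (not part of avocado), the 2-second default mirrors the interval mentioned above, and the `grace` timeout with a final kill is an assumption added so the sketch cannot hang.

```python
import signal
import subprocess
import time


def interrupt_twice(proc, delay=2.0, grace=10.0):
    """Send SIGINT, wait `delay` seconds, then send SIGINT again.

    `proc` is a subprocess.Popen object. Returns the process return code.
    """
    proc.send_signal(signal.SIGINT)
    time.sleep(delay)
    # Only send the second SIGINT if the process is still running.
    if proc.poll() is None:
        proc.send_signal(signal.SIGINT)
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        # Last resort; an assumption of this sketch, not avocado behavior.
        proc.kill()
        return proc.wait()
```

A caller would start the job with something like `subprocess.Popen(["avocado", "run", "mytest.py"])` (the test file name is a placeholder) and pass the resulting object to `interrupt_twice`.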

Regards,
Olav


On 08-04-2016 07:51, Lukáš Doktor wrote:
> Dne 5.4.2016 v 23:37 Amador Pahim napsal(a):
>> Hi folks,
>>
>> This is the RFC for the rework of the avocado job exit status. Some
>> discussion has already happened on GitHub, but we should still document
>> the decisions and open the discussion to a broader audience as well.
>
> Some discussion also already happened here: 
> https://trello.com/c/SU5fixgH/510-improve-job-statuses
>
>>
>> Motivation
>> =======
>>
>> Currently the job expects the runner to return a list of failed tests,
>> which it uses to determine the exit code Avocado finishes with. If the
>> list is empty, the exit code is 0; otherwise, it is 1. This
>> implementation is very limited, given the number of possible test end
>> statuses and exit codes. The goal of this RFC is to define the internal
>> API between job and runner, the relationship between the test statuses
>> and the Avocado exit codes, and the meaning of the exit codes.
>>
>> Use cases/current issues:
>>
>> - When all tests end with 'PASS' the avocado exit code is 0, which means
>> "AVOCADO_ALL_OK".
>> - When some or all tests end with 'FAIL', avocado exit code is 1, which
>> is defined as "AVOCADO_TESTS_FAIL".
>> - When the job is interrupted with CTRL+C: Current test is INTERRUPTED,
>> avocado exit code is "AVOCADO_TESTS_FAIL".
> This has been fixed by you; AVOCADO_JOB_INTERRUPTED is now reported.
>> - When the job hits the timeout before finishing the tests, we have 2
>> possible results:
>> -- Timeout during a test: The test is interrupted, user sees the status
>> ERROR (this status is buggy, it's being fixed, but it's not part of this
>> RFC) for the test and next tests are skipped, avocado exit code is
>> "AVOCADO_TESTS_FAIL".
> Also fixed by you.
>> -- Timeout between tests: Next tests are skipped, avocado exit code is
>> "AVOCADO_ALL_OK".
> IMO this last one is a bug and it should report `AVOCADO_JOB_INTERRUPTED`, 
> because the job was interrupted, but unless I'm wrong this is the 
> current behavior (after your fix).
>
>>
>>
>> Internals
>> ======
>>
>> We currently have a dictionary with the status as key and True or False
>> as the value for each status:
>>
>> mapping = {"SKIP": True, "ABORT": False, "ERROR": False, "FAIL": False,
>>            "WARN": True, "PASS": True, "START": True, "ALERT": False,
>>            "RUNNING": False, "NOSTATUS": False, "INTERRUPTED": False}
>>
>> That dictionary tells the runner whether a status is good or bad:
>>
>> ...
>> if not status.mapping[test_state['status']]:
>>      failures.append(test_state['name'])
>> ...
>> return failures
>> ...
>>
>> Based on that return, the job decides between 0 and 1 as the exit code:
>>
>> ...
>> tests_status = not bool(failures)
>> if tests_status:
>>      return exit_codes.AVOCADO_ALL_OK
>> else:
>>      return exit_codes.AVOCADO_TESTS_FAIL
>> ...
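
The two quoted fragments can be put together into a small self-contained sketch of the current flow. The function names `collect_failures` and `exit_code_from_failures` are hypothetical and only serve the illustration; the mapping and the two exit codes are taken from the text above.

```python
# Status mapping as quoted above: True means "good", False means "bad".
MAPPING = {"SKIP": True, "ABORT": False, "ERROR": False, "FAIL": False,
           "WARN": True, "PASS": True, "START": True, "ALERT": False,
           "RUNNING": False, "NOSTATUS": False, "INTERRUPTED": False}

AVOCADO_ALL_OK = 0
AVOCADO_TESTS_FAIL = 1


def collect_failures(test_states):
    """Runner side: return the names of tests whose status is 'bad'."""
    return [state["name"] for state in test_states
            if not MAPPING[state["status"]]]


def exit_code_from_failures(failures):
    """Job side: 0 when the failure list is empty, 1 otherwise."""
    return AVOCADO_ALL_OK if not failures else AVOCADO_TESTS_FAIL
```

With this scheme an INTERRUPTED test is indistinguishable from a FAILed one, and a job whose remaining tests were all skipped still ends with 0, which is the limitation the RFC describes.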
>>
>> Currently the exit codes available are:
>>
>> AVOCADO_ALL_OK = 0
>> AVOCADO_TESTS_FAIL = 1
>> AVOCADO_JOB_FAIL = 2
>> AVOCADO_FAIL = 3
>> AVOCADO_JOB_INTERRUPTED = 4
>>
>>
>> Recommended Solution
>> ===============
>>
>> The runner should be able to provide more accurate information to the
>> job, better representing what actually happened to the tests. After some
>> discussion on GitHub, we are currently proposing the minimum information
>> the runner needs to report so the job can decide the best fit
>> for the exit code:
>>
>> On the runner:
>> - Instead of a list called 'failures', the proposal is to have a set,
>> called 'summary'.
>> - If the job hits the timeout, whether the test is reported as
>> INTERRUPTED or SKIP, we add the string 'INTERRUPTED' to the 'summary'.
>> - If the test finishes with a bad status ('False' in the mapping), we
>> add the string FAIL to the 'summary'.
>> - If the test finishes with a good status, we don't add anything to the
>> 'summary'.
>> - If the runner somehow crashes, 'summary' will not be returned and the
>> job should handle that.
>>
>> On the job:
>> - Receive the summary and test:
>> -- If the string "INTERRUPTED" is there, exit with
>> "AVOCADO_JOB_INTERRUPTED", regardless of whether any test failed.
>> -- If we don't have "INTERRUPTED" in 'summary' but 'summary' is still
>> not empty, exit with "AVOCADO_TESTS_FAIL".
>> -- Empty 'summary' means job should exit with "AVOCADO_ALL_OK".
>> -- 'None' as 'summary' means runner crashed and job should exit with
>> "AVOCADO_JOB_FAIL".
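
The runner and job rules listed above can be sketched as follows. This is a minimal illustration, not the upstream code: the function names, the `timed_out` parameter, and the reading of "'None'" as the runner returning `None` in place of the set are assumptions. The exit code values are the current ones quoted earlier.

```python
AVOCADO_ALL_OK = 0
AVOCADO_TESTS_FAIL = 1
AVOCADO_JOB_FAIL = 2
AVOCADO_JOB_INTERRUPTED = 4

# Statuses marked True in the mapping quoted earlier.
GOOD_STATUSES = {"SKIP", "WARN", "PASS", "START"}


def build_summary(statuses, timed_out=False):
    """Runner side: reduce the test statuses to a small set of strings."""
    summary = set()
    if timed_out:
        # The job hit its timeout, during a test or between tests.
        summary.add("INTERRUPTED")
    for status in statuses:
        if status not in GOOD_STATUSES:
            summary.add("FAIL")
    return summary


def exit_code_from_summary(summary):
    """Job side: pick the exit code from the runner's summary."""
    if summary is None:
        # Assumption: a crashed runner yields None instead of a set.
        return AVOCADO_JOB_FAIL
    if "INTERRUPTED" in summary:
        return AVOCADO_JOB_INTERRUPTED
    if summary:
        return AVOCADO_TESTS_FAIL
    return AVOCADO_ALL_OK
```

Because "INTERRUPTED" is checked first, it does not matter whether an interrupted test also adds "FAIL" to the set.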
> Already upstream by you.
>
>>
>>
>> Additional Improvements
>> ================
>>
>> There is a request for the exit codes to be ORable. To do so, we have to
>> use codes different from the current ones, changing them to numbers
>> that have exactly one bit set when converted to binary:
>>
>> AVOCADO_ALL_OK = 0
>> AVOCADO_TESTS_FAIL = 1
>> AVOCADO_JOB_FAIL = 2
>> AVOCADO_FAIL = 4
>> AVOCADO_JOB_INTERRUPTED = 8
>>
>> That way, the exit status becomes a code that carries more
>> information about what happened to the group of tests. Example:
>>
>> Test1: PASS
>> Test2: FAIL
>> Test3: INTERRUPTED
>> Test4: SKIP
>>
>> In the example above, we have a FAILed test, making the job use the
>> AVOCADO_TESTS_FAIL code, and an INTERRUPTED test, making the job use
>> AVOCADO_JOB_INTERRUPTED. PASS and SKIP are considered good statuses, so
>> the final job exit code would be 9 (AVOCADO_ALL_OK | AVOCADO_TESTS_FAIL
>> | AVOCADO_JOB_INTERRUPTED).
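
The arithmetic of the example can be checked with a short sketch. The constants are the proposed values quoted above; `job_exit_code` and `decode` are hypothetical helpers, and mapping each INTERRUPTED test to the AVOCADO_JOB_INTERRUPTED bit follows the example rather than any existing code.

```python
AVOCADO_ALL_OK = 0
AVOCADO_TESTS_FAIL = 1
AVOCADO_JOB_FAIL = 2
AVOCADO_FAIL = 4
AVOCADO_JOB_INTERRUPTED = 8

# Statuses marked True in the mapping quoted earlier.
GOOD_STATUSES = {"SKIP", "WARN", "PASS", "START"}

FLAGS = {
    "AVOCADO_TESTS_FAIL": AVOCADO_TESTS_FAIL,
    "AVOCADO_JOB_FAIL": AVOCADO_JOB_FAIL,
    "AVOCADO_FAIL": AVOCADO_FAIL,
    "AVOCADO_JOB_INTERRUPTED": AVOCADO_JOB_INTERRUPTED,
}


def job_exit_code(statuses):
    """OR together one bit per kind of problem seen in the test statuses."""
    code = AVOCADO_ALL_OK
    for status in statuses:
        if status == "INTERRUPTED":
            code |= AVOCADO_JOB_INTERRUPTED
        elif status not in GOOD_STATUSES:
            code |= AVOCADO_TESTS_FAIL
    return code


def decode(code):
    """Return the names of all flags set in an exit code."""
    return [name for name, bit in FLAGS.items() if code & bit]
```

For the four tests in the example, `job_exit_code(["PASS", "FAIL", "INTERRUPTED", "SKIP"])` gives 9, and `decode(9)` lists AVOCADO_TESTS_FAIL and AVOCADO_JOB_INTERRUPTED, so a calling script can test individual bits instead of comparing against single values.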
> This is basically the idea from the trello card. I agree with it, it 
> just requires deeper changes to avocado and job.
>
> Regards,
> Lukáš
>
>>
>> This request is quite well designed, but there is still room for
>> discussion before it goes upstream.
>>
>> Thanks,
>> -- 
>> apahim
>>
>>
>>
>>
>>
>>
>>
>>
>> _______________________________________________
>> Avocado-devel mailing list
>> Avocado-devel at redhat.com
>> https://www.redhat.com/mailman/listinfo/avocado-devel
