[Avocado-devel] RFC: avocado.utils.process (or, how to handle the split stdout/stderr recording)

Wed Nov 8 19:42:16 UTC 2017

On 11/08/2017 11:01 AM, Lucas Meneghel Rodrigues wrote:
> On Wed, Nov 8, 2017 at 4:01 PM, Cleber Rosa <crosa at redhat.com> wrote:
>> Hi everyone,
>>
>> This is kind of brainstorm about the really annoying issue we've trying
>> to deal with in Avocado.  The root problem we're trying to fix is that
>> Avocado's "output check" feature can not be used when the test's
>> generated content combines both stdout and stderr in a single file.  One
>> example is QEMU's iotests[1].
>>
>> The root cause of this limitation is that avocado.utils.process.run()[2]
>> (because of avocado.utils.process.SubProcess[3]) decided a long time ago
>> to split the content of a process STDOUT and STDERR to separate files
>> (and different logging streams).  This is reflected on Avocado's "output
>> check" feature[4].  Users can choose to record the test's generated
>> STDOUT, STDERR, both (all) or none of them.  This is how the command
>> line options look like:
>>
>>   --output-check-record {none,all,stdout,stderr}
>>                Record output streams of your tests to reference files
>>                (valid options: none (do not record output streams),
>>                all (record both stdout and stderr), stdout (record
>>                only stderr), stderr (record only stderr). Current:
>>                none
>>
>> Which map to the following recorded files:
>>
>> +----------+----------------------------+--------------------+
>> | OPTION   | RECORDED FILES             | CONTENT            |
>> +----------+----------------------------+--------------------+
>> | none     |                            |                    |
>> +----------+----------------------------+--------------------+
>> |          | stdout                     | process FD 1       |
>> + all      +----------------------------+--------------------+
>> |          | stderr                     | process FD 2       |
>> +----------+----------------------------+--------------------+
>> | stdout   | stdout                     | process FD 1       |
>> +----------+----------------------------+--------------------+
>> | stderr   | stderr                     | process FD 2       |
>> +----------+----------------------------+--------------------+
>>
>> Notice how the "all" option still creates two separate files, with
>> content coming from individual file descriptors.  Adding another record
>> option would actually be pretty simple, that is, something like:
>>
>> +----------+----------------------------+--------------------+
>> | OPTION   | RECORDED FILES             | CONTENT            |
>> +----------+----------------------------+--------------------+
>> | combined | combined_stdout_stderr     | process FD 1 and 2 |
>> +----------+----------------------------+--------------------+
>>
>> What is *not* easily possible, is to have two options such as "all" and
>> "combined" at the same time.  The reason is that it the order of the
>> content written to "combined_stdout_stderr" can not be guaranteed to be
>> correct.
>>
>> I've attempted to allow for both options to be used at once, by using a
>> pair of file-like objects with specialized writes that would either
>> share single list of records or just share a secondary file with a lock
>> (pseudo/prototype code):
>>
>> class TeeFile(file):
>>     def __init__(self, path, tee_path, tee_lock, mode='a'):
>>         super(TeeFile, self).__init__(path, mode)
>>         self._tee_path = tee_path
>>         self._tee_lock = tee_lock
>>
>>     def write(self, record):
>>         self.tee_lock.acquire()
>>         super(TeeFile, self).write(record)
>>         with open(self._tee_path, 'a') as tee_file:
>>             tee_file.write(record)
>>             tee_file.flush()
>>         self.tee_lock.release()
>>
>> But the fact that the Python standard library "subprocess.Popen"
>> implementation works with file descriptors doesn't make it possible to
>> implement a shared/synchronized write strategy.  Rewriting parts of
>> "subprocess.Popen" or "avocado.utils.process.SubProcess" is, IMO, a
>> clear sign to not attempt something way more complicated and error prone
>> for the sake of this feature.
> 
> Yes, writing to a synchronized, combined output buffer should be
> possible, but it is as you said, complicated and error prone.
> 
> the SubProcess implementation was written considering that stdout and
> stderr streams are separate in POSIX systems, so it seemed the most
> obvious behavior. If having a combined stream is something important
> for the users, it's worthwhile to add a work item for such an
> implementation. Meanwhile, the solution proposed below is a good
> intermediate step.
> 

Yep, I also think it's the most obvious, and most flexible.  It's just
that Avocado needs to be flexible than "average", and those
unanticipated things can appear later on.

>> Accepting the fact that either combined (stdout+stderr) or a split
>> (stdout, stderr) will be available, the point of option names comes into
>> play.  Both "stdout", "stderr" and "none" are aptly named.  But, with
>> the introduction of a third *real* output record option/file, "all"
>> becomes a rather bad option name.
> 
> Well, sure. In hindsight, 'all' seems bad indeed.
> 
>> To keep things simple, I'd add an alias from "all" to "both",
>> deprecating the former.  The new option would be named either "combined"
>> and the generated file named "combined_stdout_stderr" by default.  It's
>> a rather verbose name, but I'd go with explicit option names rather than
>> confusing ones.
> 
> The term 'both' seems fine, although there's an argument to be made
> for 'combined'. The file name 'combined_output' is shorter while still
> being descriptive.
> 

Agreed, I think "combined_output" is a good file name.

- Cleber.

>> Comments?
>>
>> --
>>
>> [1] -
>> https://git.qemu.org/?p=qemu.git;a=blob;f=tests/qemu-iotests/check;h=e6b6ff7a047562099c2fb5f0fd35652780c9c006;hb=HEAD#l757
>> [2] -
>> http://avocado-framework.readthedocs.io/en/55.0/api/utils/avocado.utils.html#avocado.utils.process.run
>> [3] -
>> http://avocado-framework.readthedocs.io/en/55.0/api/utils/avocado.utils.html#avocado.utils.process.SubProcess
>> [4] -
>> http://avocado-framework.readthedocs.io/en/55.0/WritingTests.html#test-output-check-and-output-record-mode
>>
>> --
>> Cleber Rosa
>> [ Sr Software Engineer - Virtualization Team - Red Hat ]
>> [ Avocado Test Framework - avocado-framework.github.io ]
>> [  7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3  ]
>>
>>
> 
> 
> 

-- 
Cleber Rosa
[ Sr Software Engineer - Virtualization Team - Red Hat ]
[ Avocado Test Framework - avocado-framework.github.io ]
[  7ABB 96EB 8B46 B94D 5E0F  E9BB 657E 8D33 A5F2 09F3  ]

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/avocado-devel/attachments/20171108/3287d9c7/attachment.sig>