[Freeipa-devel] [PATCH] 0748 Handle encoding for ipautil.run

Jan Cholasta jcholast at redhat.com
Fri Dec 4 11:15:28 UTC 2015


On 4.12.2015 10:53, Petr Viktorin wrote:
> On 12/04/2015 08:51 AM, Jan Cholasta wrote:
>> On 3.12.2015 15:42, Petr Viktorin wrote:
>>> On 12/03/2015 10:55 AM, Jan Cholasta wrote:
>>> [...]
>>>>>>>>>> If we don't care about output somewhere, we should not capture it
>>>>>>>>>> there
>>>>>>>>>> at all.
>>>>>>>>>
>>>>>>>>> Then people need to remember to put "capture_output=True"
>>>>>>>>> everywhere
>>>>>>>>> (except that also disables logging of the output, so we'd need an
>>>>>>>>> additional option).
>>>>>>>>
>>>>>>>> capture_output=True is the default and capture_output=False
>>>>>>>> should be
>>>>>>>> set where output is not needed.
>>>>>>>
>>>>>>> "capture_output=True" being the default would mean that it's still
>>>>>>> possible to leave "capture_output=False" out, but ignore the
>>>>>>> output. It
>>>>>>> would make the wrong usage easier to type than the right one,
>>>>>>> which is
>>>>>>> something to avoid.
>>>>>>
>>>>>> I'm not sure I understand what are you trying to say here.
>>>>>>
>>>>>> My point is that capture_output=True is currently the default (see for
>>>>>> yourself) and it is indeed possible to leave capture_output=False but
>>>>>> ignore the output. I believe this is wrong and that you should always
>>>>>> have to explicitly request the output to be captured.
>>>>>
>>>>> OK, looks like I will need to refactor ipautil.run even for Python 2,
>>>>> then.
>>>>
>>>> If this is just about changing the default and fixing all run()
>>>> invocations, I can provide the patch myself, if that would work better
>>>> for you.
>>>>
>>>>> I seem to have have problems coming up with a solution that would match
>>>>> your expectations, so let me write the semantics as I understand you'd
>>>>> like hem:
>>>>>
>>>>> Would you be happy with the following signature?
>>>>>
>>>>> def run(args, stdin=None, raiseonerr=True,
>>>>>            nolog=(), env=None,
>>>>>            capture_output=False, skip_output=False,
>>>>>            cwd=None, runas=None, timeout=None, suplementary_groups=[],
>>>>>            capture_stderr=False,
>>>>>            encode_stdout=True,
>>>>>            encode_stderr=True,
>>>>>           ):
>>>>>
>>>>> Output and error would be always logged, except if skip_output is set.
>>>>> But they would be only returned if capture_output/capture_stderr were
>>>>> set.
>>>>
>>>> I would personally also rename capture_output to capture_stdout. If we
>>>> change the default, we have to update its every use, so we might as well
>>>> rename it. (Again, I can provide the patch.)
>>>
>>> If we agree on the semantics, writing the patch isn't that big a deal.
>>
>> Right.
>>
>>>
>>>>> If encode_* is False, bytes are returned, by default
>>>>> locale.getpreferredencoding() is used. (If the caller wants any other
>>>>> encoding, it can do that by itself.)
>>>>
>>>> I thought bytes should be the default after all? See my comment in the
>>>> previous mail.
>>>
>>> If the default is no capturing, I'd rather go with text by default.
>>> With capturing by deafult, "run(args)" could raise encoding errors *if
>>> the output is then ignored*. If it's "run(args, capture_stdout=True)", I
>>> think it's fine to encode.
>>
>> OK, makes sense.
>>
>>>
>>>>> stdin it encoded using locale.getpreferredencoding() if it's unicode.
>>>>
>>>> If we do encode/decode in run(), I think there should be a way to
>>>> override the default encoding. My idea was to either accept/return only
>>>> bytes and let the caller handle encoding themselves, or to handle
>>>> encoding in run(), but be able to override the defaults.
>>>>
>>>> Given we handle encoding in run(), I would imagine it like this:
>>>>
>>>>       def run(args, stdin=None, raiseonerr=True, nolog=(), env=None,
>>>>               capture_stdout=False, skip_output=False, cwd=None,
>>>>               runas=None, timeout=None, suplementary_groups=[],
>>>>               capture_stderr=False, encode_stdout=False,
>>>>               encode_stderr=False, encoding=None):
>>>>
>>>> i.e. nothing is captured by default, when stdout or stderr are captured
>>>> they are returned as bytes by default, when stdout or stderr are
>>>> returned as text they are decoded using the locale encoding by default,
>>>> or the encoding can be explicitly specified.
>>>>
>>>> Do you think this is feasible?
>>>
>>> Feasible, yes.
>>> In the majority of cases where the output is needed, we want text
>>> encoded with locale.getpreferredencoding(), so I'd rather make that the
>>> default rather than bytes.
>>>
>>> Or we could make "encode_stdout" imply "capture_stdout", so you wouldn't
>>> have to always specify both. (It would be better named "encoded_stdout"
>>> in that case.)
>>
>> I think it should be OK to decode by default. (IMO it would be best to
>> reverse the logic and name it "raw_stdout", with False default. Do we
>> need "raw_stderr" at all? I don't think we do.)
>
> Actually, now that I think about it: if the result was a namedtuple
> subclass, we can always set extra raw_stdout/raw_stderr attributes on
> it. (Unless skip_output=True, of course.)
>
>     result = run(['generate_binary_data'])
>     use(result.raw_stdout)
>
>     # and for backcompat,
>     rc, _out, _err = result

Good idea.

Perhaps we should use the same names as CalledProcessError for the 
attributes, i.e. "returncode", "output", and "error_output", 
"raw_output", "raw_error_output" for the new attributes?

-- 
Jan Cholasta




More information about the Freeipa-devel mailing list