[Freeipa-devel] [PATCH] 0748 Handle encoding for ipautil.run

Jan Cholasta jcholast at redhat.com
Fri Dec 4 07:51:26 UTC 2015


On 3.12.2015 15:42, Petr Viktorin wrote:
> On 12/03/2015 10:55 AM, Jan Cholasta wrote:
> [...]
>>>>>>>> If we don't care about output somewhere, we should not capture it
>>>>>>>> there
>>>>>>>> at all.
>>>>>>>
>>>>>>> Then people need to remember to put "capture_output=True" everywhere
>>>>>>> (except that also disables logging of the output, so we'd need an
>>>>>>> additional option).
>>>>>>
>>>>>> capture_output=True is the default and capture_output=False should be
>>>>>> set where output is not needed.
>>>>>
>>>>> "capture_output=True" being the default would mean that it's still
>>>>> possible to leave "capture_output=False" out, but ignore the output. It
>>>>> would make the wrong usage easier to type than the right one, which is
>>>>> something to avoid.
>>>>
>>>> I'm not sure I understand what are you trying to say here.
>>>>
>>>> My point is that capture_output=True is currently the default (see for
>>>> yourself) and it is indeed possible to leave capture_output=False but
>>>> ignore the output. I believe this is wrong and that you should always
>>>> have to explicitly request the output to be captured.
>>>
>>> OK, looks like I will need to refactor ipautil.run even for Python 2,
>>> then.
>>
>> If this is just about changing the default and fixing all run()
>> invocations, I can provide the patch myself, if that would work better
>> for you.
>>
>>> I seem to have have problems coming up with a solution that would match
>>> your expectations, so let me write the semantics as I understand you'd
>>> like hem:
>>>
>>> Would you be happy with the following signature?
>>>
>>> def run(args, stdin=None, raiseonerr=True,
>>>           nolog=(), env=None,
>>>           capture_output=False, skip_output=False,
>>>           cwd=None, runas=None, timeout=None, suplementary_groups=[],
>>>           capture_stderr=False,
>>>           encode_stdout=True,
>>>           encode_stderr=True,
>>>          ):
>>>
>>> Output and error would be always logged, except if skip_output is set.
>>> But they would be only returned if capture_output/capture_stderr were
>>> set.
>>
>> I would personally also rename capture_output to capture_stdout. If we
>> change the default, we have to update its every use, so we might as well
>> rename it. (Again, I can provide the patch.)
>
> If we agree on the semantics, writing the patch isn't that big a deal.

Right.

>
>>> If encode_* is False, bytes are returned, by default
>>> locale.getpreferredencoding() is used. (If the caller wants any other
>>> encoding, it can do that by itself.)
>>
>> I thought bytes should be the default after all? See my comment in the
>> previous mail.
>
> If the default is no capturing, I'd rather go with text by default.
> With capturing by deafult, "run(args)" could raise encoding errors *if
> the output is then ignored*. If it's "run(args, capture_stdout=True)", I
> think it's fine to encode.

OK, makes sense.

>
>>> stdin it encoded using locale.getpreferredencoding() if it's unicode.
>>
>> If we do encode/decode in run(), I think there should be a way to
>> override the default encoding. My idea was to either accept/return only
>> bytes and let the caller handle encoding themselves, or to handle
>> encoding in run(), but be able to override the defaults.
>>
>> Given we handle encoding in run(), I would imagine it like this:
>>
>>      def run(args, stdin=None, raiseonerr=True, nolog=(), env=None,
>>              capture_stdout=False, skip_output=False, cwd=None,
>>              runas=None, timeout=None, suplementary_groups=[],
>>              capture_stderr=False, encode_stdout=False,
>>              encode_stderr=False, encoding=None):
>>
>> i.e. nothing is captured by default, when stdout or stderr are captured
>> they are returned as bytes by default, when stdout or stderr are
>> returned as text they are decoded using the locale encoding by default,
>> or the encoding can be explicitly specified.
>>
>> Do you think this is feasible?
>
> Feasible, yes.
> In the majority of cases where the output is needed, we want text
> encoded with locale.getpreferredencoding(), so I'd rather make that the
> default rather than bytes.
>
> Or we could make "encode_stdout" imply "capture_stdout", so you wouldn't
> have to always specify both. (It would be better named "encoded_stdout"
> in that case.)

I think it should be OK to decode by default. (IMO it would be best to 
reverse the logic and name it "raw_stdout", with False default. Do we 
need "raw_stderr" at all? I don't think we do.)

So the signature will be this:

       def run(args, stdin=None, raiseonerr=True, nolog=(), env=None,
               capture_stdout=False, skip_output=False, cwd=None,
               runas=None, timeout=None, suplementary_groups=[],
               capture_stderr=False, raw_stdout=False, encoding=None):

Then, when you want to capture just error messages on stderr:

     rc, _out, err = run(args,
                         capture_stderr=True)

When you want to capture text on both stdout and stderr:

     rc, out, err = run(args,
                        capture_stdout=True,
                        capture_stderr=True)

When you want to capture bytes on stdout and text on stderr (think 
"certutil -L -r"):

     rc, out, err = run(args,
                        capture_stdout=True,
                        capture_stderr=True,
                        raw_stdout=True)

When you want to capture text on both stdout and stderr, where the text 
on stdout uses different encoding (think your wget example from before):

     rc, out, err = run(args,
                        capture_stdout=True,
                        capture_stderr=True,
                        raw_stdout=True)
     out = out.decode('utf-8')

And when you want to force the "C" locale:

     rc, _out, err = run(args,
                         env={'LC_ALL': 'C'},
                         capture_stderr=True,
                         encoding='ascii')

Right?

-- 
Jan Cholasta




More information about the Freeipa-devel mailing list