[Freeipa-devel] [PATCH] 0015 Only split CSV strings once

Petr Viktorin pviktori at redhat.com
Wed Mar 21 14:28:21 UTC 2012


On 03/20/2012 10:08 PM, Rob Crittenden wrote:
> Petr Viktorin wrote:
>> On 03/16/2012 12:55 PM, Petr Viktorin wrote:
>>> On 03/15/2012 08:55 PM, Rob Crittenden wrote:
>>>> Petr Viktorin wrote:
>>>>> https://fedorahosted.org/freeipa/ticket/2227 (Unable to add certain
>>>>> sudo
>>>>> commands to groups). What an interesting bug to get :)
>>>>>
>>>>>
>>>>> One problem with our CSV splitting is that it's not idempotent
>>>>> (baskslashes are "eaten" when there are escaped commas), but when we
>>>>> forward a call it gets done on both the client and the server. The
>>>>> attached patch fixes that in a somewhat hacky way. It's not a complete
>>>>> fix for the issue though, for that I need to ask what to do.
>>>>>
>>>>>
>>>>> Some Params use the CSV format when we just need a list of
>>>>> comma-separated values. Our flavor of CSV does escaping in two
>>>>> different
>>>>> ways. This is pretty bad for predictability, least-surprise,
>>>>> documentation, ...
>>>>>
>>>>> To recap, the two ways are escaping (escaped commas are ignored,
>>>>> backslashes also need to be escaped) and quoting (values are
>>>>> optionally
>>>>> enclosed in double quotes, to include a '"' in a quoted string, use
>>>>> '""').
>>>>> Escaping is good because users are used to it, but currently literal
>>>>> backslashes need to be doubled, making multivalue syntax different
>>>>> from
>>>>> single-value syntax, even if there are no commas in the value.
>>>>> Quoting is weird, but complete by itself so it doesn't also need
>>>>> escaping.
>>>>>
>>>>> Do we need to use both? Is this documented somewhere? Some of our
>>>>> tests
>>>>> check only for one, some assume the other. Users are probably even
>>>>> more
>>>>> confused.
>>>>>
>>>>>
>>>>> If we only keep one of those, the fix for #2227 should be quite easy.
>>>>> If not (backwards compatibility), we need to document this properly,
>>>>> test all the corner cases, and fix the UI to handle CSV escaping.
>>>>> Since it's subtly broken in lots of places, maybe it's best to break
>>>>> backwards compatibility and go for a simple solution now.
>>>>>
>>>>> Anyway, if I want to do a proper fix, CSV handling should be only done
>>>>> on the client. Unfortunately turning off CSV handling in the server
>>>>> will
>>>>> break the UI, which at places uses `join(',')` (no escaping commas, no
>>>>> single place to change it), even though it can just send a list.
>>>>
>>>> I'm with Honza, not too keen on the _forwarded_call part. I'd rather
>>>> you
>>>> do a mix of comparing self.env.in_server and whether the value is a
>>>> basestring to determine if we need to split it.
>>>
>>> This was a hack necessary because the WebUI relied on server-side
>>> splitting (in a few cases, and it did not escape correctly). Now that
>>> Petr Vobornik's patch 99 is in, it is unnecessary.
>>> This discussion thread has an updated patch (0015-03); what I'm
>>> attaching now builds on top of that to add several new changes in tests.
>>> To summarize it: the splitting is only done in the client; from the RPC
>>> call on no CSV magic is involved at all.
>>>
>>> Please tell me your reasons for basestring checking. I think it's
>>> entirely unnecessary and just adds complexity.
>>> In the rest of the framework a string, as a parameter value, has the
>>> same effect as a list of that one string. Why break this?
>>> We are concerned about the API here, not the command line. Shortcuts to
>>> save the user's keystrokes in the common case should *not* win over
>>> predictability and making sure the easier way is the correct, robust way
>>> to do it. Compare -- with basestring checking:
>>>
>>> some_command(arg=escape_commas(some_string))
>>> - is correct
>>>
>>> some_command(arg=[some_string])
>>> - is also correct
>>>
>>> some_command(arg=some_string)
>>> - is WRONG because the string is not escaped
>>> - but is the easiest solution!
>>>
>>> The problem is much worse because it is subtle: it only affects values
>>> with commas and backslashes, so *the wrong solution would work* in most
>>> of the cases.
>>> And I can assure you plugin writers *would* pick the wrong way if it
>>> would usually work. Our own web UI had examples.
>>>
>>> Contrast to my proposed behavior:
>>> some_command(arg=some_string)
>>> - is correct
>>>
>>> some_command(arg=[some_string])
>>> - is also correct
>>>
>>> some_command(arg=escape_commas(some_string))
>>> - is wrong, but doesn't make much sense because at this level you don't
>>> have to worry about CSV at all
>>>
>>> Basestring checking would add unnecessary complexity for both us *and*
>>> the plugin writer. The only case it would make it easier for the plugin
>>> writer is if the plugin took CSV values ­-- but why force a random
>>> format, tailored for our specific frontend, into RPC clients?
>>>
>>> The client should translate the command line into a RPC call. There's no
>>> reason to arbitrarily expect the server to do part of the job.
>>>
>>>
>>>> I'm not sure why this is broken out of normalize. Isn't this
>>>> normalizing
>>>> the incoming values into something more sane?
>>>
>>> "Normalization" implies it's an idempotent operation, which CSV
>>> splitting is not (at least its current incarnation, without the
>>> basestring checking):
>>> r"a,b\,c" splits to ["a", "b,c"] which splits to ["a", "b", "c"]
>>>
>>>
>>
>> Just to make things clear: freeipa-pviktori-0015-04 is the latest
>> version of this patch.
>> It's complemented by freeipa-pviktori-0021-03 which adds the tests for
>> this (and other things) (but contains one more refactor).
>>
>>
>> If and when that's all reviewed and pushed, I'd like to discuss
>> freeipa-pviktori-0018 (Simplify CSV escaping syntax). I'm not saying my
>> solution there is perfect, but the flavor of Python's CSV parsing we're
>> using now is quite quirky. It would be nice to get the syntax right, or
>> at least better, while we're fixing this from a totally-broken state.
>> Later there'll be backwards compatibility problems.
>> But, don't let that hold back this patch.
>>
>
> I'm still not completely convinced. This patch looks ok and I'm sold on
> client side only parsing but this doesn't actually fix any of the CSV
> bugs we have open.
>
> I'd like to evaluate this in more context.
>
> I think the simplest fix is to drop escapechar processing in the
> csvreader and have users quote strings with commas instead, this is how
> I originally intended it to work. The UI won't have to do anything
> special since it will already be split and it isn't a lot to ask to put
> quotes around a value on the CLI.
>
> This is my proposed fix on top of your patch:
>
> diff --git a/ipalib/parameters.py b/ipalib/parameters.py
> index 80b175d..ec6020e 100644
> --- a/ipalib/parameters.py
> +++ b/ipalib/parameters.py
> @@ -702,7 +702,7 @@ class Param(ReadOnly):
> # csv.py doesn't do Unicode; encode temporarily as UTF-8:
> csv_reader = csv.reader(self.__utf_8_encoder(unicode_csv_data),
> dialect=dialect,
> - delimiter=self.csv_separator, escapechar='\\',
> + delimiter=self.csv_separator, quotechar='"',
> skipinitialspace=self.csv_skipspace,
> **kwargs)
> for row in csv_reader:
>
> With this change:
> - all the self-tests pass
> - https://fedorahosted.org/freeipa/ticket/2417 is fixed
> - https://fedorahosted.org/freeipa/ticket/2227 is fixed
>
> I tested with:
>
> $ ipa sudocmdgroup-add-member test --sudocmd='/bin/chown -R
> apache\:developers /var/www/*/shared/log'
>
> $ ipa role-add-privilege --privileges='"u,r stuff"' test
>
> rob

Yes, that works well. ACK on your change; here it is squashed in.

https://fedorahosted.org/freeipa/ticket/2402 can wait for 3.0, then?

-- 
Petr³
-------------- next part --------------
A non-text attachment was scrubbed...
Name: freeipa-pviktori-0015-05-Only-split-CSV-in-the-client.patch
Type: text/x-patch
Size: 14514 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/freeipa-devel/attachments/20120321/040cbd79/attachment.bin>


More information about the Freeipa-devel mailing list