[Freeipa-devel] validating return values in XML-RPC

John Dennis jdennis at redhat.com
Wed Oct 7 23:48:52 UTC 2009


Sorry to harp on this :-) But the more I work with the XML-RPC interface 
from non-python code the more I think we've got a problem.

The first problem is what was discussed in the team meeting. You don't 
know what a function is going to return and nothing enforces the 
consistency of return values. Jason has done an awesome job of enforcing 
the consistency of input arguments, but that's only half the battle. 
What gets returned is purely a function of what the plugin author 
happens to stuff into the plugin's return statement. There is no 
enforcement of how many values get returned, what their types are, what 
is optional, what is mandatory, etc. In other words everything which is 
enforced on the input side of the call is absent on the output side, it 
should be obvious why this is a problem, especially for any callers of 
XML-RPC which are *not* in the python plugin framework.

The second problem I've run into with return values is especially 
pernicious because the plugin framework is hiding a very fundamental and 
apparently common error. Here is the issue:

* We've adopted the convention that *all* strings will be unicode objects.

* str objects will be treated as binary data

* Python will in many instances freely convert between str objects and 
unicode objects.

* If a plugin wants to return a string it *must* return a unicode 
object. If the plugin mistakenly returns a str object (a very very easy 
mistake to make) then what gets returned through XML-RPC is a *binary* 
base64 encoded blob, not an XML-RPC *string* value!

The above is so critical let me repeat it: FAILING TO ASSURE A RETURNED 
STRING IS UNICODE AND NOT STR RESULTS IN BINARY BLOBS ON THE RECEIVING 
END INSTEAD OF A STRING.

* However, the python framework *hides* the error on the return side 
because it decodes the base64 binary value back into a str object. 
Because str objects and unicode objects are often interchangeable the 
python code receiving the return value thinks it sees the right result 
even though it's not.

If we're going to have other clients of the XML-RPC interface then that 
client *must* know what the return values are and what their type is. It 
can't (or shouldn't) do things like:

* I was expecting a string but I got a binary blob so that must have 
been a mistake so I'll treat the binary blob as a string and hope it's 
correct.

-or-

* I was expecting an integer but I actually got a string (yes there are 
plugins which do this), so I'll try to read an integer value out of the 
string. But wait, suppose the plugin author who returned the integer as 
a string forget to assure that the string representing the integer was a 
unicode object and not a str object, then the receiver really has to 
start guessing because he's gotten back a binary blob. Is that binary 
blob a 2's compliment representation of a signed integer, is it unsigned 
integer, or is the binary blob a string representation of the integer? 
Clearly this doesn't work.

Now let's suppose another common scenario. The plugin author discovers 
he has mistakenly returned a str object when it should have been a 
unicode object and corrects his mistake, seemingly innocent because 
everything continues to work (but only in python). We have a non-python 
client of the XML-RPC interface who has corrected for the mistake by 
expecting binary data for the string, now that client fails! Or let's 
say the plugin was correctly returning a unicode object but some 
seemingly innocent change is made and the value ends up being a str 
object instead. Once again the python code continues to work correctly 
but the non-python code fails.

So how easy it for Python programmers to make the mistake between str 
and unicode? *VERY VERY EASY!* In fact it's so easy even Jason's 
documentation and examples sometimes make the mistake. It's especially 
easy mistake to make when calling another function because the vast 
majority of existing Python libraries return str objects for strings.

If we don't validate the return values from the plugin's I don't see how 
we'll ever have non-python XML-RPC clients which are robust.

I'm happy to work with Jason to get the return validation implemented. I 
now know the insides of the plugin rpc mechanism in gory detail :-)

I don't think we should postpone this, I think there are all sorts of 
errors lurking in the code, or will get introduced in the code in the 
future. For robustness sake it's better to fix this sooner rather than 
later IMHO.

Of course the other option is to do away with the rule that str objects 
will get converted into binary blobs in XML-RPC. I think there is a good 
argument for treating both str objects and unicode objects as strings in 
XML-RPC. How often do we really need to pass binary data back in 
XLM-RPC? I'm thinking it's not very often. For those limited cases 
wouldn't it be better to require the plugin to return a Binary object 
instead of a str?. This has the advantage that's it's explicit and 
removes the dangerous practice of automatic conversion of conventional 
python strings into binary blobs when 99% of the time that's not the 
intention of the programmer.

BTW, removing the automatic conversion of str to binary is a simple one 
line change which would have a huge impact on robustness without any 
loss of functionality. However that still does not address the issue of 
return value validation, but it does remove the single most common error 
we have with return values at the moment.

-- 
John Dennis <jdennis at redhat.com>

Looking to carve out IT costs?
www.redhat.com/carveoutcosts/




More information about the Freeipa-devel mailing list