[Libguestfs] New Python API? (was: Re: About the return value of value_value)

Richard W.M. Jones rjones at redhat.com
Sun Aug 10 21:18:04 UTC 2014


On Sun, Aug 10, 2014 at 10:19:39PM +0200, Peter Wu wrote:
> The Python documentation is scarce on the type of the various parameters and
> return values. Moreover, it states
> 
>     "Read the hivex(3) man page to find out how to use the API."
>
> Perhaps a second API should be created that is more pythonic (read:
> easier to use)?

One (negative) thing we learned doing libvirt is that unless you
generate the language bindings and C API together, the language
bindings inevitably get out of date, or (worse) contain non-systematic
errors which are difficult to discover and correct.

Therefore you're welcome to create a more Pythonic hivex API either on
top of the existing Python API or talking directly to C, but we
couldn't accept it upstream (well, unless it was fully generated and
included in generator.ml, but that seems unlikely to be possible).
Having said that ...

> hive = hivex.Hivex2("system", write=True)
> ccs_name = "ControlSet001"
> svc_viostor = hive.root()[ccs_name].Services.viostor
> if svc_viostor.Start != 4:
>     # Automatically detect that int '4' is a DWORD
>     svc_viostor.Start = 4
> hive.commit()

... a possible exception would be if it just involves adding some
extra code to the existing hivex.py file, e.g. adding just the extra
classes with __setattr__ and __getattr__ functions.
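
As a rough illustration of the kind of extra class meant here, the
following is a minimal sketch, not a proposed patch. Node and FakeHive
are hypothetical names. FakeHive is an in-memory stand-in so the example
runs without a hive file; its method names (root, node_get_child,
node_get_value, value_value, node_set_value) are meant to mirror the
existing Python binding, but returning None for a missing child or value
is an assumption of this sketch.

```python
import struct

REG_DWORD = 4  # registry type code for a 32-bit little-endian integer


class FakeHive:
    """Hypothetical in-memory stand-in mimicking a few Hivex method names."""

    def __init__(self):
        # node handle -> child names and (type, bytes) values
        self.nodes = {0: {"children": {}, "values": {}}}

    def root(self):
        return 0

    def node_add_child(self, node, name):
        handle = len(self.nodes)
        self.nodes[handle] = {"children": {}, "values": {}}
        self.nodes[node]["children"][name] = handle
        return handle

    def node_get_child(self, node, name):
        return self.nodes[node]["children"].get(name)

    def node_get_value(self, node, key):
        # In this stand-in a "value handle" is just the (node, key) pair.
        return (node, key) if key in self.nodes[node]["values"] else None

    def value_value(self, val):
        node, key = val
        return self.nodes[node]["values"][key]

    def node_set_value(self, node, entry):
        self.nodes[node]["values"][entry["key"]] = (entry["t"], entry["value"])


class Node:
    """Attribute-style view of one registry node (hypothetical helper)."""

    def __init__(self, hive, handle):
        # Store internal state without going through our own __setattr__.
        object.__setattr__(self, "_hive", hive)
        object.__setattr__(self, "_handle", handle)

    def __getitem__(self, name):
        return self.__getattr__(name)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        child = self._hive.node_get_child(self._handle, name)
        if child is not None:
            return Node(self._hive, child)
        val = self._hive.node_get_value(self._handle, name)
        if val is None:
            raise AttributeError(name)
        t, data = self._hive.value_value(val)
        if t == REG_DWORD:
            return struct.unpack("<I", data)[0]
        return data  # raw bytes: the caller decides how to decode them

    def __setattr__(self, name, value):
        if isinstance(value, int):
            entry = {"key": name, "t": REG_DWORD,
                     "value": struct.pack("<I", value)}
        elif isinstance(value, tuple):
            t, data = value  # explicit (type, bytes) pair
            entry = {"key": name, "t": t, "value": data}
        else:
            raise TypeError("pass an int or an explicit (type, bytes) pair")
        self._hive.node_set_value(self._handle, entry)


# Build ControlSet001\Services\viostor in the stand-in hive.
hive = FakeHive()
handle = hive.root()
for name in ("ControlSet001", "Services", "viostor"):
    handle = hive.node_add_child(handle, name)

viostor = Node(hive, hive.root())["ControlSet001"].Services.viostor
if "Start" not in hive.nodes[handle]["values"] or viostor.Start != 4:
    viostor.Start = 4  # an int is stored as a DWORD
print(viostor.Start)
```

Only ints are auto-typed in this sketch; anything else has to be passed
as an explicit (type, bytes) pair, so the wrapper never guesses a string
encoding.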

> I (ab)use the __getattr__ methods of an object to allow this kind of
> modification. See also the RegistryHandle helper class at
> https://github.com/Lekensteyn/qemu-tools/blob/master/vbox-to-qemu.py 
> (_import_callback at line 216 may also be interesting)

Noted.  If this could be added to the existing hivex.py ...

[...]

> In the current implementation, Python 3 bytes (Python 2 strings) are treated 
> as plain bytes(*). That is fine, but Unicode is not handled correctly. This 
> might also be an opportunity to treat Unicode strings as UTF-16 (LE) strings 
> which must be nul-terminated. So u'Bar' should become b'B\0a\0r\0\0\0'.

It's worth saying that encoding in the registry itself is not always
UTF-16LE.  It's sometimes UTF-8, ASCII or (in a case I found last
week) an NLS like ISO-8859-1 or Big5.  Essentially the consuming app
always has to know what encoding to use.  Doing "clever" stuff in the
bindings is therefore almost always going to be wrong in some case.
(This is also why the C functions like hivex_value_string are
deprecated).
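
To make the point concrete, here is a small sketch in which the caller
always names the encoding and the helpers never guess. encode_reg_sz and
decode_reg_sz are hypothetical helper names, not part of hivex, and the
sample byte strings are made up for illustration.

```python
def encode_reg_sz(text, encoding="utf-16-le"):
    """Encode text plus a terminating NUL in an encoding the caller picks."""
    return (text + "\0").encode(encoding)


def decode_reg_sz(data, encoding):
    """Decode raw value bytes with a caller-chosen encoding, dropping trailing NULs."""
    return data.decode(encoding).rstrip("\0")


# The UTF-16LE case from the quoted proposal.
raw = encode_reg_sz("Bar")
assert raw == b"B\0a\0r\0\0\0"
assert decode_reg_sz(raw, "utf-16-le") == "Bar"

# The same helper with an 8-bit encoding: only the caller can know which.
latin1_raw = b"caf\xe9\x00"
assert decode_reg_sz(latin1_raw, "iso-8859-1") == "caf\u00e9"

# Guessing wrong either fails outright or silently yields the wrong text.
try:
    decode_reg_sz(latin1_raw, "utf-8")
    guessed_ok = True
except UnicodeDecodeError:
    guessed_ok = False
assert not guessed_ok
```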

>  (*) Actually, Hivex 1.3.10 is broken in Python 3: it tries to convert all
> strings from UTF-8 to bytes, which does not work for UTF-16 strings, and it
> segfaults on other input[0].

> > In Ruby it seems as if the length could be calculated from the string.
> > On the other hand, I'm not sure there is any point in intentionally
> > removing the length from the return value, as that might break callers
> > for no particular reason.
> > 
> > The best plan here is probably to add a note to the Ruby documentation
> > for RLenTypeVal saying what the hash contains on Ruby.
>
> ... and mention that all other language bindings return a tuple /
> list / array with just two elements as the length can be found from
> the value?

Yup.

Rich.

-- 
Richard Jones, Virtualization Group, Red Hat http://people.redhat.com/~rjones
Read my programming and virtualization blog: http://rwmj.wordpress.com
virt-top is 'top' for virtual machines.  Tiny program with many
powerful monitoring features, net stats, disk stats, logging, etc.
http://people.redhat.com/~rjones/virt-top



