[Libguestfs] New Python API?

Peter Wu peter at lekensteyn.nl
Sun Aug 10 23:11:54 UTC 2014


On Sunday 10 August 2014 22:18:04 Richard W.M. Jones wrote:
> On Sun, Aug 10, 2014 at 10:19:39PM +0200, Peter Wu wrote:
> > The Python documentation is scarce on the types of the various parameters
> > and return values. Moreover, it states
> > 
> >     "Read the hivex(3) man page to find out how to use the API."
> > 
> > Perhaps a second API should be created that is more pythonic (read:
> > easier to use)?
> 
> One (negative) thing we learned doing libvirt is that unless you
> generate the language bindings and C API together, the language
> bindings inevitably get out of date, or (worse) contain non-systematic
> errors which are difficult to discover and correct.
> 
> Therefore you're welcome to create a more Pythonic hivex API either on
> top of the existing Python API or talking directly to C, but we
> couldn't accept it upstream (well, unless it was fully generated and
> included in generator.ml, but that seems unlikely to be possible).

I was thinking of basing the more Pythonic API on top of the current
hivex.Hivex class, not adding more functionality to that wrapper. Anyone who
wants to create a broken registry still has the full power of the low-level
API. Anyone who wants to access a registry without breaking it, on the other
hand, would benefit from a safer API: one that prevents a programmer from
writing 1 byte to a DWORD value, for example, and one that makes traversing
registry keys easier (as demonstrated before).
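
A minimal sketch of the kind of check meant here. The helper name pack_dword
is hypothetical and not part of hivex; the only facts assumed are that
REG_DWORD is registry type 4 and stores a 32-bit little-endian integer:

```python
import struct

REG_DWORD = 4  # registry value type for a 32-bit little-endian integer


def pack_dword(value):
    """Hypothetical helper: return the 4-byte payload for a REG_DWORD."""
    # Reject bytes, str and bool so a 1-byte payload cannot slip through.
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("REG_DWORD needs an int, got %s" % type(value).__name__)
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("REG_DWORD out of range: %r" % (value,))
    return struct.pack("<I", value)


print(repr(pack_dword(4)))  # b'\x04\x00\x00\x00'
```

A higher-level setter could call such a helper whenever it is handed a Python
int, so the stored value always has the length its type requires.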

Would there be interest in including such an API in hivex? Since it only uses
the existing Python methods, it should not break unless a change to those
methods also breaks other programs relying on them.

> Having said that ...
> 
> > hive = hivex.Hivex2("system", write=True)
> > ccs_name = "ControlSet001"
> > svc_viostor = hive.root()[ccs_name].Services.viostor
> > 
> > if svc_viostor.Start != 4:
> >     # Automatically detect that int '4' is a DWORD
> >     svc_viostor.Start = 4
> > 
> > hive.commit()
> 
> ... a possible exception would be if it just involves adding some
> extra code to the existing hivex.py file, eg. adding just the extra
> classes with __setattr__ and __getattr__ functions.

Yes, the low-level binding is left intact, it's just a new Hivex2 class that 
is being added. No more changes are needed in libhivexmod.
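
As a rough illustration of such a wrapper class, here is a sketch built on
__getattr__ and __setattr__. Both class names are hypothetical, and FakeHive
is only an in-memory stand-in (nested dicts) for the low-level binding; a real
version would delegate to the corresponding hivex.Hivex methods instead:

```python
class FakeHive(object):
    """In-memory stand-in for the low-level binding (illustration only)."""

    def get_child(self, node, name):
        child = node.get(name)
        return child if isinstance(child, dict) else None

    def get_value(self, node, name):
        return node[name]  # raises KeyError when the value is missing

    def set_value(self, node, name, value):
        node[name] = value


class Node(object):
    """Hypothetical wrapper: attributes map to subkeys or values."""

    def __init__(self, backend, handle):
        # Bypass our own __setattr__ when storing internal state.
        object.__setattr__(self, "_backend", backend)
        object.__setattr__(self, "_handle", handle)

    def __getattr__(self, name):
        # Only reached when normal lookup fails: try a subkey, then a value.
        child = self._backend.get_child(self._handle, name)
        if child is not None:
            return Node(self._backend, child)
        try:
            return self._backend.get_value(self._handle, name)
        except KeyError:
            raise AttributeError(name)

    def __getitem__(self, name):
        return getattr(self, name)

    def __setattr__(self, name, value):
        self._backend.set_value(self._handle, name, value)


tree = {"ControlSet001": {"Services": {"viostor": {"Start": 0}}}}
svc_viostor = Node(FakeHive(), tree)["ControlSet001"].Services.viostor
if svc_viostor.Start != 4:
    svc_viostor.Start = 4
print(tree["ControlSet001"]["Services"]["viostor"]["Start"])  # 4
```

All reads and writes go through the backend object, so the low-level layer
stays untouched.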

> > In the current implementation, Python 3 bytes (Python 2 strings) are
> > treated as plain bytes(*). That is fine, but Unicode is not handled
> > correctly. This might also be an opportunity to treat Unicode strings as
> > UTF-16 (LE) strings which must be nul-terminated. So u'Bar' should become
> > b'B\0a\0r\0\0\0'.
> 
> It's worth saying that encoding in the registry itself is not always
> UTF-16LE.  It's sometimes UTF-8, ASCII or (in a case I found last
> week) an NLS like ISO-8859-1 or Big5.  Essentially the consuming app
> always has to know what encoding to use.  Doing "clever" stuff in the
> bindings is therefore almost always going to be wrong in some case.
> (This is also why the C functions like hivex_value_string are
> deprecated).

In a registry export (.reg), all strings such as "Key"="Value" appear to be
UTF-16 strings. Trying to push a UTF-8 string into the registry results in
Chinese characters (the bytes being read as UTF-16?). Could you confirm or
reject this against the exports of your keys? Also, when the trailing NUL byte
is missing from the services values, a BSOD can be observed.
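
Both effects can be reproduced in plain Python 3 without touching a hive. This
is only a sketch of the byte-level behaviour, not of what hivex or Windows
does internally:

```python
text = u"Bar"

# UTF-16LE with a terminating NUL code unit (two zero bytes).
utf16 = text.encode("utf-16-le") + b"\0\0"
print(repr(utf16))  # b'B\x00a\x00r\x00\x00\x00'

# UTF-8/ASCII bytes read back as UTF-16LE: each pair of bytes becomes one
# code unit, and for ASCII letters those land in the CJK range.
misread = b"abcd".decode("utf-16-le")
print([hex(ord(c)) for c in misread])  # ['0x6261', '0x6463']
```

This would explain why a UTF-8 string shows up as Chinese characters when the
reader expects UTF-16.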

If it is necessary to support other encodings, it may be worth adding a
class that wraps the encoding, (type?) and value:

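UTF_16_LE = "utf-16-le"

class RegistryString(object):
    def __init__(self, type, value, encoding=UTF_16_LE):
        self.type = type          # registry value type
        self.text = value         # the unencoded (Unicode) string
        self.encoding = encoding

    def value(self):
        # Encoded bytes, including the terminating NUL in the same encoding.
        return (self.text + u"\0").encode(self.encoding)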

(maybe introduce a wrapper function for this to avoid long lines)
Strings are always NUL-terminated, right? I recall reading something like that 
in the MSDN documentation.

Kind regards,
Peter
https://lekensteyn.nl



