[Libguestfs] XML encoding of the registry
Simson Garfinkel
simsong at acm.org
Sat Mar 20 16:18:58 UTC 2010
>
>
> One issue that may be of concern is string encoding in registry
> values, which is not well defined. Naturally for XML I suppose you'd
> want to represent string values as UTF-8. However it's almost
> impossible to know for sure how strings are encoded in the registry,
> so doing this conversion would either involve a heuristic, or you'd
> have to store binary blobs in the XML (encoded as Base64 or as hex
> digits). The registry is a mess in this respect.
Rich,
We have encountered the same problem with the XML encoding of file names. Sometimes they are in ASCII, sometimes they are in a code page, sometimes they are in UTF-8, and sometimes they are in corrupt UTF-8.
This is the approach we are using:
1. Represent everything that can be represented in UTF-8 as UTF-8.
2. If something can't be shown as UTF-8, then we add a coding='base64' attribute to the XML tag and represent the value as Base64.
We would like to replace #2 with an explicit encoding of the invalid characters as Unicode entities, but we haven't written that.
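A minimal sketch of the two rules above in Python, for illustration only. This is not the fiwalk code: the function names make_element and is_xml_safe and the tag name "filename" are made up for the example, and the extra check for control characters is my own assumption (valid UTF-8 can still contain characters, such as NUL, that XML 1.0 does not allow).

```python
import base64
import xml.etree.ElementTree as ET


def is_xml_safe(text: str) -> bool:
    """Return True if every character is allowed in XML 1.0 text.

    Assumption for this sketch: reject C0 control characters other
    than tab, newline and carriage return, plus U+FFFE and U+FFFF.
    """
    for ch in text:
        cp = ord(ch)
        if cp < 0x20 and ch not in "\t\n\r":
            return False
        if cp in (0xFFFE, 0xFFFF):
            return False
    return True


def make_element(tag: str, raw: bytes) -> ET.Element:
    """Build an XML element from raw bytes of unknown encoding.

    Rule 1: if the bytes are valid UTF-8 (and XML-safe), store them as text.
    Rule 2: otherwise store Base64 and mark the element coding='base64'.
    """
    elem = ET.Element(tag)
    try:
        text = raw.decode("utf-8")  # strict: raises on invalid UTF-8
        if not is_xml_safe(text):
            raise ValueError("character not representable in XML 1.0")
        elem.text = text
    except ValueError:  # UnicodeDecodeError is a subclass of ValueError
        elem.set("coding", "base64")
        elem.text = base64.b64encode(raw).decode("ascii")
    return elem


if __name__ == "__main__":
    # A UTF-8 name, a Latin-1 (code page) name, and a name with a NUL byte.
    for name in ("résumé.txt".encode("utf-8"), b"caf\xe9.txt", b"a\x00b"):
        print(ET.tostring(make_element("filename", name), encoding="unicode"))
```

A reader of the XML only needs to check for the coding attribute: if it is present, Base64-decode the text to get the original bytes back; if it is absent, the text is the name itself.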
>
> [...]
>> You can find an example of the digital forensics XML at:
>> http://www.forensicswiki.org/wiki/Fiwalk
>
> Looks interesting. It should be easily possible to get libguestfs to
> write this format for disk images. There is already a (trivial) demo
> program I wrote along those lines:
>
> http://git.annexia.org/?p=libguestfs.git;a=blob;f=examples/to-xml.c;hb=HEAD
Thanks. I'll check that out. We've made a lot of progress writing programs in Python that process the Digital Forensics XML, and it is proving to be a good approach for integrating a range of computer forensic tools. You may be interested in my paper:
Garfinkel, Simson, "Automating Disk Forensic Processing with SleuthKit, XML and Python," Systematic Approaches to Digital Forensics Engineering (IEEE/SADFE 2009), Oakland, California.
http://simson.net/clips/academic/2009.SADFE.xml_forensics.pdf
>
> - - -
>
> If you have changes for libguestfs or hivex, please submit them to
> this mailing list as for any open source project:
>
> http://people.redhat.com/~rjones/how-to-supply-code-to-open-source-projects/
Thanks. My understanding is that the current code does not build on MacOS. I was just going to download the Git repository and have at it, but I was not sure how to send back changes.