[Libguestfs] Auditing a vm image - virt-diff - was: Read MBR and store in a file?

adrelanos adrelanos at riseup.net
Mon Nov 25 22:47:55 UTC 2013


Matthew Booth:
> On Fri, 2013-11-22 at 20:14 +0000, Richard W.M. Jones wrote:
>> On Fri, Nov 22, 2013 at 05:56:00PM +0000, adrelanos wrote:
>>> Thank you all for your suggestions!
>>>
>>> Richard W.M. Jones:
>>>> I keep meaning to write a comprehensive "virt-diff" tool.  I needed it
>>>> myself just yesterday.
>>>
>>> Most interesting. I guess there are two reasons for creating such a
>>> tool: just compare the images (show the diff) and/or check for malicious
>>> additions in the other image.
>>>
>>> Did you consider implementing the former or both?
>>
>> For all the reasons that Alex goes into, it would just be for checking
>> NON-malicious differences.  The use case is to reverse engineer what
>> files change in a guest when you perform an action (eg. install a
>> Windows driver or run some Linux administrative command).
>>
>> [...]
>>> At the moment I am not trying to write a virt-diff like tool, but
>>> something simpler. A tool to create a report of all of a vm image's
>>> contents. (Checksums for all files, filesystem, for MBR and Volume Boot
>>> Record.) When publishing VM images, it might be useful to publish such a
>>> report together with the image, so others who re-build from source can
>>> be certain, they ended up with a very similar image. When having created
>>> two such reports, one could easily get a virt-diff like tool.
>>
>> I think Matt Booth was doing something like this for Windows systems,
>> with the aim of being able to recreate a Windows VM from a (smaller)
>> description.  Don't know what state that was/is in.
> 
> I wrote a POC tool to store an MD5 of every file on a Windows
> filesystem. It looked like a good idea for what it was, but not very
> applicable here.
> 
>> [...]
>>> What other data can there be outside the filesystem?
>>>
>>> I can think of:
>>>
>>> - MBR
>>> - Volume Boot Record
>>>
>>> Anything else?
>>
>> Potentially all unused space inside and between partitions /
>> filesystems / logical volumes.  The boot loader is sometimes stored in
>> the space between the MBR and the first partition.  Other peculiar
>> things lie in other spaces.
> 
> Any mechanism for doing volume management. e.g. MBR, GPT, LVM (Linux),
> LDM (Windows). Sometimes these overlap and interact in complex ways,
> e.g. LDM has an MBR and a GPT, both of which it ignores in favour of its
> own metadata.
> 
>> However if you don't care about guests that are malicious / hiding
>> data, then you can ignore everything except for the MBR and any
>> non-zero data between the MBR and the first partition.  Note for GPT
>> you have to take into account two partition tables as well.
>>
>>> If these have been compared, the compared image should be as safe to use
>>> as the original one?
>>>
>>> (I could imagine that there can be extra data outside filesystem, maybe
>>> in regions outside the partition table, but those data shouldn't get
>>> executed after starting the image in a VM.)
> 
> I'm coming in to this discussion late, so I don't know what you're doing
> or how paranoid you need to be.

A few years ago, I would have said very paranoid. Otherwise, I wouldn't
do it in the first place. :) Nowadays, after the news coverage, I'd say
no paranoia at all, just reasonable precautions. ;)

> However, cranking up the paranoia a
> little, imagine the following scenario:
> 
> There's a bug in a critical boot element which means the boot relies on
> uninitialised disk space. As it happens, in a normal installation this
> uninitialised disk space is always safe and it's located somewhere which
> will rarely, if ever, be touched, so nobody has ever noticed it.
> (Paranoia level: state actor. Somebody put the bug there deliberately.)
> Malicious person modifies the uninitialised disk space. Your tool will
> never notice. The boot process is now compromised.
> 
> You could probably come up with more with a few minutes of thought. I'm
> pretty sure a dedicated team given a few months to work on this project
> could come up with some inventive ideas :)

I hope you are wrong. :) I am going to ask for more feedback on another
mailing list after the initial implementation of the script is done.

(At the moment I am making good progress. The initial report creation
script is almost finished; I am currently ironing out a few
non-deterministic /var/cache... files and folders by recreating them
during the first boot.)
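
For illustration only, the part of such a report that covers the MBR and
the space between the MBR and the first partition could be sketched
roughly as below. This is not the actual script. It assumes a raw
(not qcow2) image with 512-byte sectors and a classic MBR partition
table; the function names and report keys are made up for this example.
GPT disks, where both partition tables would need to be covered, are not
handled.

```python
import hashlib
import struct
from typing import Optional

SECTOR = 512  # assumption: 512-byte logical sectors


def first_partition_lba(mbr: bytes) -> Optional[int]:
    """Return the lowest start LBA among the four primary MBR entries."""
    if len(mbr) < SECTOR or mbr[510:512] != b"\x55\xaa":
        raise ValueError("no MBR boot signature found")
    starts = []
    for i in range(4):
        entry = mbr[446 + 16 * i : 446 + 16 * (i + 1)]
        ptype = entry[4]                             # partition type byte
        lba = struct.unpack_from("<I", entry, 8)[0]  # start LBA, little-endian
        if ptype != 0 and lba > 0:                   # skip unused entries
            starts.append(lba)
    return min(starts) if starts else None


def report_boot_area(image_path: str) -> dict:
    """Checksum the MBR and the gap between the MBR and the first partition."""
    with open(image_path, "rb") as f:
        mbr = f.read(SECTOR)
        start = first_partition_lba(mbr)
        # The gap covers sectors 1 .. start-1 (empty if no partition is found).
        gap = f.read((start - 1) * SECTOR) if start else b""
    return {
        "mbr_sha256": hashlib.sha256(mbr).hexdigest(),
        "first_partition_lba": start,
        "gap_sha256": hashlib.sha256(gap).hexdigest(),
        "gap_has_nonzero_data": any(gap),  # True if any byte is non-zero
    }
```

Two such reports, one from the published image and one from a rebuild,
could then be compared key by key. As discussed above, this only covers
the non-malicious case.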



