[Libguestfs] extract NTFS Master File Table for analysis

noxdafox noxdafox at gmail.com
Thu Feb 18 19:41:51 UTC 2016



On 02/02/16 21:35, Richard W.M. Jones wrote:
> On Tue, Feb 02, 2016 at 07:40:12PM +0200, noxdafox wrote:
>> Greetings,
>>
>> I'm playing around with an idea and I'd like to ask you some questions.
>>
>> I'd like to extract the MFT table from a disk image file. The idea
>> is to employ it to build a sort of reverse lookup table which, given
>> a cluster, could retrieve the corresponding file with the related
>> metadata.
>>
>> Such a table could be used to optimize the analysis of disk snapshots
>> in order to collect the changes which happened on the disk. As the
>> disk snapshots contain only the new or modified clusters, I could
>> avoid exploring the whole FS content and focus on what has really
>> changed on disk.
>>
>> Have you explored the concept at all?
> No.
>
>> Is there a way I can use libguestfs to locate and extract the MFT
>> table from a disk image?
> If there's an ntfsprogs command that does this (ntfsinfo --mft maybe?)
> then it's really easy to extract the output from that command.  You
> could hack it together using `debug sh', search this page:
>
>    http://libguestfs.org/guestfs-faq.1.html
>
> ... but if you wanted to do it "properly" then you could add an API
> modelled on one of the `FileOut' APIs, eg:
>
>    https://github.com/libguestfs/libguestfs/blob/master/daemon/base64.c#L100
>
> For information on adding APIs, see:
>
>    http://libguestfs.org/guestfs-hacking.1.html#adding-a-new-api
I played around a bit and I must confess I am impressed by how easy it 
is to add functionality to libguestfs.

I could easily extract the Master File Table using the download API and 
parse it with third-party tools.
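
A minimal sketch of that step, assuming the libguestfs Python bindings 
are installed and that the MFT uses the common 1024-byte FILE record 
size (the real size is stored in the NTFS boot sector). The image path, 
device name and function names are hypothetical; the header offsets 
follow the publicly documented FILE record layout.

```python
import struct

MFT_RECORD_SIZE = 1024  # assumption: the common default FILE record size


def download_mft(image, device, out_path):
    """Mount `device` read-only and copy /$MFT to `out_path` on the host."""
    import guestfs  # lazy import: the parser below works without libguestfs

    g = guestfs.GuestFS(python_return_dict=True)
    g.add_drive_opts(image, readonly=1)
    g.launch()
    try:
        g.mount_ro(device, "/")
        g.download("/$MFT", out_path)
    finally:
        g.shutdown()
        g.close()


def iter_mft_records(data, record_size=MFT_RECORD_SIZE):
    """Yield (index, in_use, is_directory) for each FILE record in `data`."""
    for offset in range(0, len(data) - record_size + 1, record_size):
        record = data[offset:offset + record_size]
        if record[:4] != b"FILE":
            continue  # slot without a FILE signature: skip it
        # The 16-bit flags field sits at offset 0x16 of the record header.
        (flags,) = struct.unpack_from("<H", record, 22)
        yield offset // record_size, bool(flags & 0x01), bool(flags & 0x02)
```

For example, download_mft("disk.img", "/dev/sda1", "mft.bin") followed 
by iter_mft_records(open("mft.bin", "rb").read()) lists which record 
slots are in use. This only reads the header flags; it does not apply 
the update sequence fixups or decode attributes, which is what the 
third-party tools do.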

I'd also like to extract the Update Sequence Number Journal 
($UsnJrnl), but it seems inaccessible via its path (C:\$Extend\$UsnJrnl).
I tried on a real disk and it seems to be a limitation of the NTFS-3G 
driver: it can extract C:\$MFT and C:\$LogFile, and it can list the 
contents of C:\$Extend, but it cannot access the files in it.

Curiously enough, the stat() syscall on C:\$Extend\$UsnJrnl seems to work 
and returns the correct inode number. Yet the size is wrong: it 
reports 0 while the real size is > 9 MB.

The next step I tried was to use the ntfscat command as follows: 
ntfscat -i <UsnJrnl inode number> /dev/sdXX and it worked 
flawlessly.
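
As a rough sketch, the same invocation can be scripted on the host, 
assuming ntfscat from ntfsprogs is on the PATH and the NTFS partition 
is reachable as a block device. The function names and the output path 
are hypothetical.

```python
import subprocess


def ntfscat_argv(device, inode):
    """Build the `ntfscat -i <inode> <device>` command line."""
    if inode < 0:
        raise ValueError("inode number must be non-negative")
    return ["ntfscat", "-i", str(inode), device]


def dump_inode(device, inode, out_path):
    """Write whatever ntfscat prints for `inode` to `out_path`."""
    with open(out_path, "wb") as out:
        # check=True raises CalledProcessError if ntfscat exits non-zero.
        subprocess.run(ntfscat_argv(device, inode), stdout=out, check=True)
```

The inode number passed in would be the one returned by stat() on 
C:\$Extend\$UsnJrnl.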

So I went ahead and added such an API to libguestfs, and I could extract 
the journal without any issue. The UsnJrnl file is very handy to check 
what changes were made on disk. Not only is it faster than using 
virt-diff on two different snapshots, but it also shows much more 
relevant information. I could, for example, track down temporary files 
created and deleted between the two snapshots.
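
To illustrate the temporary-file case, here is a minimal sketch that 
assumes the extracted journal is a sequence of USN_RECORD_V2 structures 
(60-byte fixed header, UTF-16LE name, 8-byte alignment) separated by 
zero-filled regions. The function names are hypothetical, and other 
record versions are simply skipped.

```python
import struct

# Fixed-size part of a USN_RECORD_V2: little-endian, 60 bytes.
USN_V2_HEADER = struct.Struct("<IHHQQqqIIIIHH")

USN_REASON_FILE_CREATE = 0x00000100
USN_REASON_FILE_DELETE = 0x00000200


def parse_usn_records(data):
    """Yield (file_ref, reason, file_name) for each version 2 record."""
    offset = 0
    while offset + USN_V2_HEADER.size <= len(data):
        (length, major, _minor, file_ref, _parent_ref, _usn, _timestamp,
         reason, _source, _security_id, _attrs,
         name_len, name_off) = USN_V2_HEADER.unpack_from(data, offset)
        if length < USN_V2_HEADER.size or offset + length > len(data):
            offset += 8  # zero fill or garbage: move to the next 8-byte slot
            continue
        if major == 2:
            start = offset + name_off
            raw_name = data[start:start + name_len]
            yield file_ref, reason, raw_name.decode("utf-16-le", "replace")
        offset += length


def created_then_deleted(records):
    """Return names of files that were both created and deleted."""
    created, names = set(), []
    for file_ref, reason, name in records:
        if reason & USN_REASON_FILE_CREATE:
            created.add(file_ref)
        if reason & USN_REASON_FILE_DELETE and file_ref in created:
            names.append(name)
    return names
```

Passing the output of parse_usn_records() for the extracted journal to 
created_then_deleted() gives the files that appeared and disappeared 
within the range the journal covers. Tracking by file reference number 
rather than by name avoids confusing two files that share a name.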

All of this to say I'd like to add the ability to extract files 
by their inode number. This functionality has the advantage of not 
requiring the FS to be mounted. Would libguestfs benefit from this?

If so, how should I proceed? Which API names should I use?

The most straightforward would be something like:

   ntfsicat(device, inode)

or

   ntfsidownload(device, inode)

I guess Linux guest disks would also benefit from this, but that 
requires a bit more research.

>
> This question of how do you find which disk block is associated with a
> particular file comes up often enough that I have looked at it various
> times on my blog:
>
>    https://rwmj.wordpress.com/2014/02/21/use-guestfish-and-nbdkit-to-examine-physical-disk-locations/
>
>    https://rwmj.wordpress.com/2014/11/23/mapping-files-to-disk/
>
> Rich.
>



