[Libguestfs] [PATCH 0/2] added icat and fls0 APIs for deleted files recovery

noxdafox noxdafox at gmail.com
Mon Mar 7 18:14:41 UTC 2016

On 07/03/16 13:29, Richard W.M. Jones wrote:
> On Sun, Mar 06, 2016 at 05:42:24PM +0200, Matteo Cafasso wrote:
>> As discussed in the topic: https://www.redhat.com/archives/libguestfs/2016-March/msg00018.html
>> I'd like to add to libguestfs the disk forensics capabilities offered by The Sleuth Kit.
>> http://www.sleuthkit.org/
>> The two APIs I'm adding with the patch are a simple example of which type of features TSK can enable.
> A few comments in general terms:
> The current splitting of the commits doesn't make much sense to me.
> I think it would be better as:
>   - commit to add TSK to the appliance
>   - commit to add the icat API
>   - tests for icat
>   - commit to add the fls0 API
>   - tests for fls0
> although it would be fine to combine the tests with the new API, or
> even have all the tests as a single separate commit (as now).
> This benefits you because it will allow patches to go upstream
> earlier.  For example, a commit to add TSK to the appliance is a
> simple and obvious change that I see no problem with.  Also the icat
> API is closer to being ready than the fls0 API (see below for
> explanation).
Indeed, I did quite a poor job of this. I will split it as suggested.
>>> <fs> fls0 /dev/sda2 /home/noxdafox/disk-content.txt
>> r/r 15711-128-1:        $Recycle.Bin/S-1-5-21-2379395878-2832339042-1309242031-1000/desktop.ini
>> -/r * 60015-128-1:      $Recycle.Bin/S-1-5-21-2379395878-2832339042-1309242031-1000/$R07QQZ2.txt
>> -/r * 60015-128-3:      $Recycle.Bin/S-1-5-21-2379395878-2832339042-1309242031-1000/$R07QQZ2.txt:Zone.Identifier
> What is `/home/noxdafox/disk-content.txt'?
It's the local (host-side) file in which the command output is stored.
> The problem with this API is it pushes all the parsing up in the
> stack, to libguestfs consumers.
> In general we'd like to avoid that and have just one place where all
> parsing needs to be done (ie. libguestfs itself), so it'd be nicer to
> have an API that returns a list of structs (RStructList) with all the
> important fields parsed out.
As the API documentation says, this is the low-level API, which I have
provided as an example.

I took inspiration from the guestfs_ls0 API, which does a similar job:
it stores the content of a directory in a host file.

If I understood correctly (the dynamic code generation is still
confusing me a bit), libguestfs implements commands which could have a
large output by first dumping the output to a local file and then
iterating over it.
This command would list the entire content of a disk, including the
deleted files, so we need to expect a large output.
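
To make the "dump to a local file, then iterate over it" pattern
concrete, here is a minimal consumer-side sketch. It assumes the host
file holds entries separated by \0 characters, which is what the
guestfs_ls0 documentation describes for its output file. The helper
name `iter_nul_separated` and the chunk size are hypothetical choices
for illustration; this is not how libguestfs itself implements the
iteration.

```python
from typing import Iterator


def iter_nul_separated(path: str, chunk_size: int = 65536) -> Iterator[bytes]:
    """Yield NUL-separated entries from a host file without loading it whole.

    Hypothetical helper: reads fixed-size chunks so that a very large
    listing never has to fit in memory at once.
    """
    pending = b""  # bytes of an entry that may continue in the next chunk
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            pending += chunk
            # Everything before the last separator is complete; the final
            # piece may be a partial entry and is carried over.
            *complete, pending = pending.split(b"\0")
            for entry in complete:
                yield entry
    if pending:
        # The file did not end with a separator; emit the trailing entry.
        yield pending
```

A caller would loop over `iter_nul_separated("/path/to/listing")` and
decode or parse each entry as needed, one at a time.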

What is missing is the higher-level implementation, which would look
pretty much like the guestfs_ls API. I need to better understand how to
implement it, and suggestions are more than appreciated. I tried to
trace how guestfs_find is implemented, for example, but I'm still a
bit disoriented by the automagic code generation.
> Does TSK have a machine-readable mode?  If it does, it'll definitely
> make things easier if (eg) JSON or XML output is available.  If not,
> push upstream to add that to TSK -- it's a simple change for them,
> which will make their tools much more usable, a win for everyone.
I personally disagree on this. The TSK `fls` command is a clone of the
Unix `ls` command. Maybe it's closer to `ls -al`, as it returns
additional information. IMHO asking upstream to add a JSON or XML
output format would sound pretty much like asking the `ls` maintainers
to do the same.

The end result would still be a list of structs or a list of strings,
but parsing the `fls` output shouldn't be that hard. Its
documentation is here:
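
As a rough illustration of that parsing step, the sketch below turns
lines shaped like the three sample lines earlier in this message into
structured records. It is only a sketch under assumptions: the
`FlsEntry` type, its field names, and `parse_fls_line` are hypothetical,
the interpretation of `60015-128-1` as inode, attribute type, and
attribute id is an assumption, and `fls` output forms not shown in the
sample are not handled.

```python
import re
from typing import NamedTuple, Optional


class FlsEntry(NamedTuple):
    """Hypothetical record for one line of `fls` output."""
    name_type: str            # type character before the slash, e.g. "r" or "-"
    meta_type: str            # type character after the slash
    deleted: bool             # True when the line carries the "*" marker
    inode: int
    attr_type: Optional[int]  # assumed meaning of the second number
    attr_id: Optional[int]    # assumed meaning of the third number
    path: str


# The address part holds only digits and dashes, so the first colon after
# it ends the address even when the path itself contains colons.
_FLS_LINE = re.compile(
    r"^(?P<name_type>\S)/(?P<meta_type>\S)\s+"
    r"(?P<deleted>\*\s+)?"
    r"(?P<inode>\d+)(?:-(?P<attr_type>\d+)(?:-(?P<attr_id>\d+))?)?:"
    r"\s+(?P<path>.*)$"
)


def parse_fls_line(line: str) -> Optional[FlsEntry]:
    """Parse one line; return None if it does not match the sample shape."""
    m = _FLS_LINE.match(line.rstrip("\n"))
    if m is None:
        return None

    def to_int(value: Optional[str]) -> Optional[int]:
        return int(value) if value is not None else None

    return FlsEntry(
        name_type=m.group("name_type"),
        meta_type=m.group("meta_type"),
        deleted=m.group("deleted") is not None,
        inode=int(m.group("inode")),
        attr_type=to_int(m.group("attr_type")),
        attr_id=to_int(m.group("attr_id")),
        path=m.group("path"),
    )
```

Returning None for unrecognised lines keeps the decision about how
strict to be with the caller, which could skip them or raise an error.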

> Rich.
