
Fri Mar 2 08:50:55 UTC 2007

On Fri, Mar 02, 2007 at 01:07:05AM -0500, Tony Nelson wrote:
> Also, if it were to always run:
> 
> Readahead-collector allocates memory in big chunks.  It uses lots of memory
> -- when I ran it, 39 MB of /var/log/readahead-rac.log (which produced about
> .33 MB of /etc/readahead.d/custom.* -- but see bz 230687).  (I note that
> readahead-collector will collect without limit, but that readahead will
> only use the first 32K entries.)  Thus, while readahead-collect uses too
> much memory now to run every time, if it used a better data structure, say
> a balanced tree, and parsed the audit data into the tree as the data
> arrived, it could use about 2% of what it currently does.

 It's not so easy. My first implementation collected only paths, but
 that approach is not reliable. You need to collect all events and parse
 them with libauparse, because every syscall produces three events
 (syscall, cwd and path) and the collector requires data from all three.
 The order of events can be *random*, so before parsing you need to have
 all the events for the syscall.
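
 To illustrate the grouping problem, here is a minimal sketch in Python
 (not the collector's actual C code). It assumes the usual text form of
 audit records, where every record of one event carries the same
 "msg=audit(timestamp:serial)" stamp. The function name and the choice
 to keep only one record per type are hypothetical simplifications.

```python
import re
from collections import defaultdict

# Every record of one audit event shares the same "msg=audit(<time>:<serial>)" stamp.
EVENT_ID = re.compile(r"msg=audit\((\d+\.\d+):(\d+)\)")
RECORD_TYPE = re.compile(r"^type=(\w+)")
NEEDED = {"SYSCALL", "CWD", "PATH"}


def group_complete_events(lines):
    """Yield (event_id, records) once SYSCALL, CWD and PATH have all arrived.

    Records may arrive in any order. Simplification (an assumption of this
    sketch): only the first record of each type is kept, so an event with
    several PATH items is reported with just one of them.
    """
    pending = defaultdict(dict)
    for line in lines:
        id_match = EVENT_ID.search(line)
        type_match = RECORD_TYPE.match(line)
        if not id_match or not type_match:
            continue  # not an audit record we understand
        rtype = type_match.group(1)
        if rtype not in NEEDED:
            continue
        event_id = id_match.group(1) + ":" + id_match.group(2)
        pending[event_id].setdefault(rtype, line)
        if NEEDED.issubset(pending[event_id]):
            yield event_id, pending.pop(event_id)
```

 Only incomplete events stay in memory with this approach; a complete
 event can be handed to the parser and dropped right away.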

 I think a simple solution is to reduce the number of fields in the events
 and store simplified event strings in memory. I hope libauparse
 doesn't care about the number of fields. This could save 80% of the
 memory used (I think). I'll try to implement it.
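
 A rough sketch of the field reduction, again in Python and only as an
 illustration. The KEEP table is a guess at which fields matter, not a
 list taken from the collector, and whether libauparse accepts such
 shortened records is still untested. The sketch also assumes that
 fields are separated by whitespace and that values contain no spaces.

```python
# Hypothetical choice of fields to keep per record type (an assumption).
KEEP = {
    "SYSCALL": {"syscall", "success", "exit"},
    "CWD": {"cwd"},
    "PATH": {"item", "name"},
}


def slim_record(line):
    """Return the record with only the type, the msg stamp and the kept fields."""
    tokens = line.split()
    if len(tokens) < 2 or not tokens[0].startswith("type="):
        return line  # not a recognizable record, leave untouched
    wanted = KEEP.get(tokens[0][len("type="):])
    if wanted is None:
        return line  # unknown record type, leave untouched
    # tokens[0] is "type=...", tokens[1] is "msg=audit(...):"; filter the rest.
    kept = [t for t in tokens[2:] if t.split("=", 1)[0] in wanted]
    return " ".join(tokens[:2] + kept)
```

 The saving depends on how many fields are dropped per record, so the
 80% figure would need to be measured on a real log.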

 Frankly, I'm not sure that 30MB of RAM is such a big problem in this
 particular case, since readahead is an effective solution for machines
 that have a lot of memory for the kernel cache.

 But you're right that there is room for optimization.

> Neither program seems to take account of the memory used by the files that
> are read, though readahead can report it. (Possibly readahead-collect
> should avoid the largest files, as they probably aren't mostly used and
> don't cause so much seeking.)

 Any example of a really large file (during boot)?

> Readahead-collector runs for 5 minutes, so its output might need pruning if
> it ran each boot.  When run manually, one knows to start stuff up and then
> wait for readahead to finish.  BTW, the collection loop has a 30 second
> timeout that isn't being used.  It might be reasonable to stop collecting
> if no event has come in in that time.

 Good idea, but I'm pessimistic that there is ever a 30s window when the
 system doesn't call open() :-)
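
 For reference, the idle-timeout idea could look roughly like this. It
 is a Python sketch using select() on a file descriptor, not the
 collector's real loop; the function name and the read size are
 hypothetical, and it relies on select() working on pipes, which holds
 on Unix.

```python
import os
import select


def collect_until_idle(fd, idle_timeout):
    """Read from fd until no data arrives for idle_timeout seconds (or EOF)."""
    chunks = []
    while True:
        ready, _, _ = select.select([fd], [], [], idle_timeout)
        if not ready:
            break  # nothing arrived within the timeout: stop collecting
        data = os.read(fd, 65536)
        if not data:
            break  # end of file: the writer closed its end
        chunks.append(data)
    return b"".join(chunks)
```

 Each new event restarts the timer, so the loop stops only after a
 full quiet period, not after a fixed total run time.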

> If readahead-collect could run automatically, readahead might request it
> for the next boot if "too many" files are not found (say, after a firefox
> update).

 Very good point.

 TODO updated:

    http://git.fedoraproject.org/?p=hosted/readahead;a=blob_plain;f=TODO;hb=HEAD

 Thanks.

        Karel

-- 
 Karel Zak  <kzak at redhat.com>