[PATCH] Reporting file descriptors created by pipe and socketpair

Tue Sep 12 18:25:50 UTC 2006

Alexander Viro <aviro at redhat.com> writes:

> c) just how do you propose to do "tracking file descriptors"?

We aren't proposing to track file descriptors.  We already have code
that does that.  Currently, we collect the traces with a modified
version of strace, but for a variety of reasons, autrace would be a
much better source of trace data.  First, we have to modify strace so
it includes security contexts on labeled objects.  Second, strace
output is designed to be easily consumed by humans, and getting a
program to understand it is a bear.  You can see the AWK/YACC/sed pipeline
required to put strace output into a form that can be easily consumed
by a python program by reading the source file
polgen/src/trackfd/trackstrace.in in the polgen CVS repository at
http://sf.net/projects/polgen.  You'll quickly realize why I am eager
for the parsing library to be introduced in audit 1.3.

The program that does the analysis is in polgen/src/trackfd/trackfd.py.
It analyzes the records for the following system calls:

  close open socket pipe socketpair dup dup2 fcntl64 read write bind
  accept connect recv send unlink execve clone

The key thing is that the program doesn't really track file
descriptors; instead, it tracks what they refer to.  The program
generates a data structure when each file descriptor is created via
an open or socket system call.  The system calls dup, fcntl, close,
and execve change the mappings of file descriptors to the data
structure, and a close system call causes a summary of reads and
writes to be written.  Here is the summary of the file descriptor
tracker from the polgen document.

  An essential part of the data reduction is the summarization of the
  life cycle of a file descriptor.  For each file descriptor created
  by a program, the @code{trackfd} program creates a data structure.
  The data structure is updated whenever a system call is found that
  applies to the file descriptor.  Finally, when a file descriptor is
  closed, a summary of the activity associated with the file
  descriptor is written to the output.
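
To make the idea concrete, here is a minimal sketch of that life
cycle.  It is not the code in trackfd.py; the names Target and Tracker
and the shape of the summary tuple are made up for illustration, and I
am assuming the summary is emitted when the last descriptor that
refers to an object is closed.

```python
class Target:
    """The object a descriptor refers to (hypothetical, not trackfd's class)."""

    def __init__(self, kind, name):
        self.kind = kind      # e.g. "file" or "socket"
        self.name = name      # path or address, as seen in the trace
        self.reads = 0        # bytes read through any descriptor
        self.writes = 0       # bytes written through any descriptor
        self.refs = 0         # descriptors currently mapped to this object


class Tracker:
    def __init__(self):
        self.fds = {}         # descriptor number -> Target
        self.summaries = []   # one tuple per object whose last descriptor closed

    def _bind(self, fd, target):
        self.close(fd)        # drop any stale mapping for this number first
        target.refs += 1
        self.fds[fd] = target

    def create(self, fd, kind, name):
        """open or socket: a new object and a mapping to it."""
        self._bind(fd, Target(kind, name))

    def dup(self, oldfd, newfd):
        """dup, dup2, fcntl(F_DUPFD): a second mapping to the same object."""
        if oldfd != newfd:
            self._bind(newfd, self.fds[oldfd])

    def read(self, fd, nbytes):
        self.fds[fd].reads += nbytes

    def write(self, fd, nbytes):
        self.fds[fd].writes += nbytes

    def close(self, fd):
        """Remove the mapping; summarize once nothing refers to the object."""
        target = self.fds.pop(fd, None)
        if target is None:
            return
        target.refs -= 1
        if target.refs == 0:
            self.summaries.append(
                (target.kind, target.name, target.reads, target.writes))
```

Because reads and writes are recorded on the object rather than on the
descriptor number, activity through a duplicated descriptor ends up in
the same summary as activity through the original.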

It's one of those programs that is either correct, or explodes and
dies horribly when a bug is exercised.  On Fedora Core, the program
has been quite solid for quite a while now.  It has been used to
analyze the Jabber server and an application running in a Java Virtual
Machine.

By the way, we do not claim to handle every possible path for
information flow yet.  In fact, we ignore all system calls implemented
by the ipc common kernel entry point.  Our experience is that the
current set of system calls we analyze handles a large number of
important target applications.

The trackfd.py file is about 600 lines of code.  I tried to make it
easy to read, in case someone takes the time to proofread my code.

John