Audit format utility

Steve Grubb sgrubb at redhat.com
Thu Oct 2 21:23:49 UTC 2014


On Thursday, October 02, 2014 07:44:32 AM Burn Alting wrote:
> On Wed, 2014-10-01 at 14:44 -0400, Steve Grubb wrote:
> > > Further, it has a couple of immediate issues given it's using
> > > libauparse.
> > > 
> > > -  it is "lossy" in that it wont parse poorly formed audit events (see
> > > the op key value pair below)
> > > 
> > >         [burn at swtf auformat]$ cat add_user.txt
> > >         node=swtf.swtf.dyndns.org type=ADD_USER
> > >         msg=audit(1411871714.393:47872): user pid=13455 uid=0 auid=500
> > >         ses=11
> > >         subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> > >         msg='op=adding home directory id=502 exe="/usr/sbin/useradd"
> > >         hostname=? addr=? terminal=pts/2 res=success'
> > >         [burn at swtf auformat]$ ./auformat "%node %date %time %milli %
> > >         serial: type=%TYPE msg=%msg op=%op auid=%auid pid=%pid  path=%
> > >         path exe=%exe subj=%subj hostname=%hostname terminal=%terminal
> > >         res=%res\n" add_user.txt
> > >         swtf.swtf.dyndns.org 09/28/2014 12:35:14 393 47872:
> > >         type=ADD_USER msg= op=adding auid=500 pid=13455  path=
> > >         exe="/usr/sbin/useradd"
> > >         subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> > >         hostname=? terminal=pts/2 res=success
> > >         [burn at swtf auformat]$
> > > 
> > > We lose the strings
> > > 
> > >     - 'user' before the pid key
> > 
> > Which is meaningless in this case.
> > 
> > >     - op='adding home directory' becomes op='adding'
> > > 
> > > This is particularly important for incorrectly formatted application
> > > level audit sent via auditd.
> > 
> > This is a problem in the shadow-utils package. It is the one that I'm
> > currently having to re-do for this reason and many more. Upstream seems to
> > have taken a stab at re-doing the audit events and pretty much used it
> > like syslog.
> 
> I suppose my concern is that until we have fixed all the incorrectly
> formatted key values, auparse is going to lose information.

OK. I see. I really think we should get some tool created that can help
identify these. It could be as simple as pushing each event into auparse,
iterating across the fields to recreate the record, then diffing the original
against the recreated one.

From that, we can get fixes in place. I think shadow-utils is the package most 
affected.
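
A rough sketch of that round-trip check follows. It does not call auparse;
the tokenizer below is a hypothetical stand-in that keeps only well-formed
name=value tokens, and the function names are mine. It also drops the
msg=' wrapper first, in the spirit of the "skip the msg= piece" comment in
ausearch-report.c quoted further down.

```python
import re

# Stand-in tokenizer (an assumption, not auparse): whitespace-separated
# tokens, where a double-quoted value may itself contain spaces.
TOKEN_RE = re.compile(r'(?:[^\s"]+|"[^"]*")+')

RECORD = (
    "node=swtf.swtf.dyndns.org type=ADD_USER "
    "msg=audit(1411871714.393:47872): user pid=13455 uid=0 auid=500 ses=11 "
    "subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023 "
    "msg='op=adding home directory id=502 exe=\"/usr/sbin/useradd\" "
    "hostname=? addr=? terminal=pts/2 res=success'"
)


def tokens(record):
    """Split a raw record into tokens, dropping the msg='...' wrapper."""
    body = record.strip().replace("msg='", "", 1)
    if body.endswith("'"):
        body = body[:-1]
    return TOKEN_RE.findall(body)


def roundtrip(record):
    """Recreate the record from its name=value fields and report what is lost.

    Returns (recreated, lost): the rebuilt text, and the tokens of the
    original that did not survive because they are not name=value pairs.
    """
    toks = tokens(record)
    fields = [t.split("=", 1) for t in toks if "=" in t]
    recreated = " ".join(f"{name}={value}" for name, value in fields)
    lost = [t for t in toks if "=" not in t]
    return recreated, lost


recreated, lost = roundtrip(RECORD)
print("lost tokens:", lost)
```

On the ADD_USER record from this thread, the lost tokens are the bare
"user" and the "home directory" tail of the op value, which are the two
losses reported above.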

 
> > > - 'rewinding' the event's cursor for each possible key, the call to
> > > auparse_first_record() in print_item(), is probably not what one would
> > > want - but then again, auformat is just a mock up at the moment.
> > 
> > Well, if you want your fields in a specific order and it's not the order in
> > the event, then we have no choice. Note that the event is already parsed
> > at this point so we are just literally changing the position in a linked
> > list. The cost is a series of strcmp calls.
> > 
> > > - one loses the parsing 'fix-up' that ausearch does in
> > > src/ausearch-report.c:output_interpreted_node()
> > 
> > Not sure what "fix-up" we are talking about. The intention is that auparse
> > completely mimics ausearch's interpretation ability (which ausearch was
> > switched over to use auparse a few releases back).
> 
> By 'fix-up' I meant the code like
>                 // Some user messages have msg='uid=500   in this case
>                 // skip the msg= piece since the real stuff is the uid=
> ...
>                // Value side  has commas and another field exists
>                // Known: LABEL_LEVEL_CHANGE banners=none,none
>                // Known: ROLE_ASSIGN new-role=r,r
>                // Known: any MAC LABEL can potentially have commas
> etc

Auparse should handle these.


> > > - to build a complete event, having addressed the 'rewinding' issue,
> > > would make the format look very messy - you would need to include every
> > > possible key to print all key/values.
> > 
> > If you wanted that, yeah. But I am thinking of cases where one may not
> > want
> > every field. For example, you might do something like this to check file
> > access:
> > 
> > 
> > # ausearch --start today -m path --raw |
> > 		auformat 'auid=%AUID res=%SUCCESS name=%NAME\n'
> > 
> > > - one should add event separation so that further tools could process
> > > the data more easily.
> > 
> > I am thinking of 1 event per line. This is kind of a requirement of Map
> > Reduce.
> 
> So you expect the complete event of my tailing audit.log
>         node=swtf.swtf.dyndns.org type=SYSCALL
>         msg=audit(1412198543.190:141570): arch=c000003e syscall=59
>         success=yes exit=0 a0=1a2d530 a1=1a2d350 a2=1a06f10 a3=20
>         items=2 ppid=19529 pid=32647 auid=500 uid=0 gid=0 euid=0 suid=0
>         fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=1 comm="tail"
>         exe="/usr/bin/tail"
>         subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
>         key="cmds"
>         node=swtf.swtf.dyndns.org type=EXECVE
>         msg=audit(1412198543.190:141570): argc=3 a0="tail" a1="-f"
>         a2="/var/log/audit/audit.log"
>         node=swtf.swtf.dyndns.org type=CWD
>         msg=audit(1412198543.190:141570):  cwd="/home/burn"
>         node=swtf.swtf.dyndns.org type=PATH
>         msg=audit(1412198543.190:141570): item=0 name="/usr/bin/tail"
>         inode=2135830 dev=fd:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
>         obj=system_u:object_r:bin_t:s0 nametype=NORMAL
>         node=swtf.swtf.dyndns.org type=PATH
>         msg=audit(1412198543.190:141570): item=1 name=(null)
>         inode=524293 dev=fd:00 mode=0100755 ouid=0 ogid=0 rdev=00:00
>         obj=system_u:object_r:ld_so_t:s0 nametype=NORMAL
> to generate one line of output?

I really don't think all of those fields are necessary to understand what is 
happening. I have a plan to be able to take that and reduce it down to 

[On node] at time, subj [acting-as] results action what using

Your event would become
At 17:22:23 uid-500 acting as root successfully run cmds using tail

I think it will take me a few weeks to get it to this point. But I suspect 
that this work will point the way to reducing logs smartly. This of course 
doesn't mean getting rid of the full event. But for processing massive amounts 
of data, it needs to become normalized.
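
To make the template concrete, here is a minimal sketch of how the SYSCALL
record above might be reduced to that sentence. How each slot gets filled is
a guess on my part: the field dictionary, the USERS and ACTIONS tables, and
the summarize() name are all hypothetical. Time is rendered in UTC here, so
the same event shows as 21:22:23 rather than the local 17:22:23.

```python
from datetime import datetime, timezone

# Hypothetical lookup tables; a real tool would interpret these on the host.
USERS = {"0": "root"}
ACTIONS = {"59": "run"}  # 59 is execve on x86_64 (arch=c000003e)

# Fields taken from the SYSCALL record quoted above.
EVENT = {
    "node": "swtf.swtf.dyndns.org",
    "time": "1412198543.190",
    "auid": "500",
    "uid": "0",
    "success": "yes",
    "syscall": "59",
    "key": "cmds",
    "comm": "tail",
}


def summarize(ev, include_node=False):
    """Render: [On node] at time, subj [acting-as] results action what using."""
    when = datetime.fromtimestamp(float(ev["time"]), tz=timezone.utc)
    clock = when.strftime("%H:%M:%S")
    parts = [f"On {ev['node']} at {clock}" if include_node else f"At {clock}"]
    parts.append(f"uid-{ev['auid']}")
    # Only mention "acting as" when the effective user differs from the login.
    if ev["uid"] != ev["auid"]:
        parts.append(f"acting as {USERS.get(ev['uid'], 'uid-' + ev['uid'])}")
    parts.append("successfully" if ev["success"] == "yes" else "unsuccessfully")
    parts.append(ACTIONS.get(ev["syscall"], f"syscall-{ev['syscall']}"))
    parts.append(ev["key"])
    parts.append(f"using {ev['comm']}")
    return " ".join(parts)


print(summarize(EVENT))
```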


> > > At the moment, the only tool I'm aware of that 'correctly' parses a log
> > > file is ausearch.
> > 
> > If there are omissions in auparse, I really want to know. It must be able
> > to correctly parse events.
> 
> By correctly, I meant completely. It currently, in
> output_interpreted_node() handles incorrectly formed key values like
>    op=adding home directory

A bugzilla should be opened on the package originating the event. These need 
to get fixed.


> as per
> [burn at swtf auformat]$ /sbin/ausearch -i -if add_user.txt
> ----
> node=swtf.swtf.dyndns.org type=ADD_USER msg=audit(09/28/2014
> 12:35:14.393:47872) : user pid=13455 uid=root auid=burn ses=11
> subj=unconfined_u:unconfined_r:unconfined_t:s0-s0:c0.c1023
> msg='op=adding home directory id=freddo exe=/usr/sbin/useradd hostname=?
> addr=? terminal=pts/2 res=success'
> [burn at swtf auformat]$
> 
> > > Perhaps we would be better served by adding another
> > > output option to ausearch to print events in a much more parse-able
> > > format (e.g. XML, JSON)
> > 
> > I am sort of going that way. I am thinking about logstash/elastic search
> > and Map reduce and how one might use the audit system when you have say
> > 10,000 systems.
> 
> Which is my use case.
> From my standpoint, I need each host to
> - enrich the data e.g. uid=500 to become uid=500(burn) (I want both the
> id and interpreted name for checking id mismatches in the enterprise),
> syscall=59 to become syscall=execve, etc
> - not lose important data (op=adding home directory)

The best thing here is opening a bz. We really need some test tool to search
for malformed events for when upstream tinkers with the events.


> - turn single and multi-line events into well defined and formatted
> events (xml/json),
> - send the data to an aggregation point within the enterprise.
> 
> At the aggregation point I can apply capability such as logstash/elastic
> search/map reduce and analyse the data.

Sure. I have not looked at what it takes to make a logstash plugin. But I 
could envision feeding the event into auparse and then using it to provide the 
interpreted fields as needed.
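
As a sketch of the enrichment and one-event-per-line output discussed above,
something like the following could sit between the parser and the
aggregation point. It is only an illustration: the lookup tables stand in
for whatever interpretation auparse would provide, and the record layout and
function names are assumptions, not an existing interface.

```python
import json

# Hypothetical interpretation tables; on a real host these would come from
# the local passwd database and the syscall table for the event's arch.
USERS = {"0": "root", "500": "burn"}
SYSCALLS = {"59": "execve"}  # x86_64
ID_FIELDS = {"uid", "auid", "euid", "suid", "fsuid", "ouid"}


def enrich(name, value):
    """Keep the raw id and append the interpreted name, e.g. 500 -> 500(burn)."""
    if name in ID_FIELDS and value in USERS:
        return f"{value}({USERS[value]})"
    if name == "syscall" and value in SYSCALLS:
        return SYSCALLS[value]
    return value


def event_to_json(serial, records):
    """Turn one multi-record event into a single JSON line.

    records is a list of (record_type, {field: value}) pairs that share
    the same event serial number.
    """
    doc = {
        "serial": serial,
        "records": [
            {"type": rtype, **{k: enrich(k, v) for k, v in fields.items()}}
            for rtype, fields in records
        ],
    }
    return json.dumps(doc, sort_keys=True)


EVENT = [
    ("SYSCALL", {"syscall": "59", "auid": "500", "uid": "0", "comm": "tail"}),
    ("CWD", {"cwd": "/home/burn"}),
    ("PATH", {"item": "0", "name": "/usr/bin/tail", "ouid": "0"}),
]

print(event_to_json(141570, EVENT))
```

Keeping both the id and the name in one value follows the uid=500(burn)
form asked for above, so id mismatches across hosts stay visible.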
 

> Ideally I'd extend ausearch-report.c:output_record() to output events in
> a well defined format (xml/json) - probably refactoring
> output_interpreted_node() to generate its current format or xml/json
> depending on a flag so we only have one 'parser' to maintain.

Sure. I could see that as a follow-on after getting some new capabilities in
place and working through how to enrich audit events in general. I'd like to
make a couple of prototypes as standalone utilities to experiment with. When
they seem to work well, then merge them into ausearch. I'd like to see others
in the community also help define what this can do.

Thanks,
-Steve



