Occasional delayed output of events

Paul Moore paul at paul-moore.com
Tue Jan 19 19:11:05 UTC 2021


On Tue, Jan 19, 2021 at 3:18 AM Burn Alting <burn.alting at iinet.net.au> wrote:
> I tend to have a rigorous auditing posture (see the rules loaded in https://github.com/linux-audit/audit-userspace/issues/148) which is not normal for most. Perhaps, Paul, you have hit the nail on the head by stating that this 'severe delay' is not that unreasonable given my rules posture and we just need to 'deal with it' in user space.
> We still get the event data, I just need to adjust the user space tools to deal with this occurrence.
> As for what the system is doing, in my home case it's a Centos 7 VM running a tomcat service which only gets busy every 20 minutes and the other is a HPE Z800 running Centos 8 with 4-5 VM's mostly dormant. I can put any code in these hosts to assist in 'validating'/testing the delay. Advise and I will run.

I took a (slightly) closer look at the queuing code just now and the
corner case I suspected doesn't look very promising here.  On failure
to send a record to userspace, the record is put back at the front of
the queue to preserve ordering so the kernel *should* still emit
records in order even when auditd is under pressure, dead, or in the
process of a restart.  Also, in this case the kernel kicks the auditd
connection, disconnecting auditd's netlink socket; I imagine that
would be a notable event on your systems.

However, looking at the timestamps in the audit events you posted, I
noticed something a little odd.  If I align the timestamps a bit
better let's see if it jumps out at you guys ...

type=XXX msg=audit(1609519900.159:44606): ...
type=XXX msg=audit(1609519900.161:44607): ...
type=XXX msg=audit(1609519900.163:44608): ...
type=XXX msg=audit(1609519896.829:44609): ...
type=XXX msg=audit(1609519900.170:44610): ...

Let me guess Burn, you're running NTP/PTP :)  I'm not sure how auditd
handles things like this, but it looks like there was a small negative
time correction between events which caused the odd scenario where
event N+1 actually occurred before event N according to the wall
clock.  In other words, your system is time traveling ;)

-- 
paul moore
www.paul-moore.com





More information about the Linux-audit mailing list