why I have lost messages on boot even with very big backlog while I hunting only 2 syscalls?

Steve Grubb sgrubb at redhat.com
Sat Sep 30 14:03:59 UTC 2017


On Saturday, September 30, 2017 8:48:23 AM EDT you wrote:
> Re: why I have lost messages on boot even with very big backlog while I
> hunting only 2 syscalls? 
> From:	Lev Olshvang <levonshe at yandex.com>
> To:	Me
> CC:	"linux-audit at redhat.com" <linux-audit at redhat.com>
> Date:	9/30/17 8:48 AM
> 
> 28.09.2017, 17:02, "Steve Grubb" <sgrubb at redhat.com>:
> > Hello,
> > 
> > On Thursday, September 28, 2017 4:51:38 AM EDT Lev Olshvang wrote:
> >>  28.09.2017, 00:32, "Steve Grubb" <sgrubb at redhat.com>:
> >>  > On Wednesday, September 27, 2017 4:41:29 PM EDT Lev Olshvang wrote:
> >>  >> Hello list !
> >>  >> 
> >>  >> A very technical question
> >>  >> I have Ubuntu 16.10 Virtual Box , auditd 2.7.8
> >>  >> I have audit=1 parameter in grub.cfg
> >>  >> I see that /proc/cmdline indeed sees it
> >>  >> 
> >>  >> I see that auditd is started with PID 564
> >>  >> 
> >>  >> root 312 2 0 23:12 ? 00:00:00 [kauditd]
> >>  >> root 564 1 0 23:12 ? 00:00:00 /sbin/auditd
> >>  >> 
> >>  >> And I have 15 lost messages ???
> >>  >> auditctl -s
> >>  >> enabled 1
> >>  >> failure 1
> >>  >> pid 564
> >>  >> rate_limit 0
> >>  >> backlog_limit 16384
> >>  >> lost 15
> >>  >> backlog 0
> >>  >> backlog_wait_time 30
> >>  >> loginuid_immutable 0 unlocked
> >>  >> 
> >>  >> auditctl -l
> >>  >> -a always,exit -F arch=b64 -S execve,execveat -F key=exec
> >>  >> 
> >>  >> Do I understand correctly that auiditd is indeed started by systemd
> >>  >> before
> >>  >> other services, except 2 that is listed in auditd.service
> >>  >> dependencuies
> >>  >> -
> >>  >> local-fs and some temp setup of systemd ?
> >>  > 
> >>  > Yes, it is started before most services. However. systemd-journal for
> >>  > some
> >>  > reason feels obligated to enable auditing. And sometimes people put
> >>  > audit=1 on the kernel command line. Either way, auditing is on way
> >>  > before
> >>  > auditd starts. The audit logs have a 64 entry buffer by default. So,
> >>  > as
> >>  > the system boots events pile up and eventually overflows the 64 entry
> >>  > limit.
> >>  > 
> >>  > The fix is to add another boot command option audit_backlog_limit=8192
> >>  > or
> >>  > some other suitable number. The test to check for this is to boot your
> >>  > system, login and run auditctl -s. If you have just booted and lost
> >>  > events during boot, this should fix it.
> >>  > 
> >>  > -Steve
> >>  
> >>  Hi Steve
> >>  
> >>  Thank you for your answer.
> >>  I added backlog parameter as you advised, but it did not solve the
> >>  problem
> >>  
> >>  cat /proc/cmdline
> >>  BOOT_IMAGE=/vmlinuz-4.8.0-59-generic root=/dev/mapper/kubuntu--vg-root
> >>  ro
> >>  net.ifnames=0 biosdevname=0 audit=1 audit_backlog_limit=8192 debug
> >>  splash
> >>  auditctl -s
> >>  enabled 1
> >>  failure 1
> >>  pid 672
> >>  rate_limit 0
> >>  backlog_limit 16384
> >>  lost 16
> >>  backlog 10
> >>  backlog_wait_time 30
> >>  loginuid_immutable 0 unlocked
> >>  
> >>  Perhaps something else in configuration ?
> > 
> > You have a backlog of 10. That should normally be 0 unless the system is
> > very busy. What do you have for the flush and freq settings in
> > auditd.conf?
> > 
> > -Steve
> 
> Hi Steve,
> 
> I overloked your mail yesterday, sorry for delay.
> 
> 
> Here the auditd.conf
> 
> local_events = yes
> write_logs = yes
> log_format = RAW
> log_file = /var/log/audit/audit.log
> log_group = root
> priority_boost = 16
> flush = INCREMENTAL_ASYNC
> freq = 20
> num_logs = 5
> disp_qos = lossy
> 
> 
> I increased priority_boost from 4 to 16 in a hope to solve lost messages
> problem. I observed other values of backlog, it was sometimes 6, sometimes
> 7.
> 
> Today I made very big backlog, here are results
> enabled 1
> failure 1
> pid 663
> rate_limit 0
> backlog_limit 32768
> lost 15
> backlog 0
> backlog_wait_time 15000
> 
> Still 15 losts, now events in backlog
> Perhaps I need to add some tracer to lost messages code in kernel to debug
> it.

Maybe adjust your freq from 20 to maybe 50. Other than that, I don't know of 
any other user space tricks to improve the flow rate. Maybe Paul or Richard 
has ideas. I see you have a 4.8 kernel. I think I remember there being some 
netlink comm issues prior to 4.12.

-Steve




More information about the Linux-audit mailing list