BIG performance hit with auditd on large cpus (>64 cpus)

Steve Grubb sgrubb at redhat.com
Fri May 19 21:00:24 UTC 2017


On Friday, May 19, 2017 4:22:24 PM EDT Klaus Lichtenwalder wrote:
> (note to moderator: i sent this before from the wrong address, hope it
> doesn't get duplicated)
> 
> Hi,
> 
> we have a few SAP systems on RHEV (so virtualized on KVM) with >= 74
> CPUs and >= 400G RAM.
> When the system is busy with large SAP jobs, it goes onto its knees with
> cpu %system up to 80%, thus making the SAP jobs run twice as long. As
> soon as you stop auditd everything returns to normal...
> 
> Facts:
> RHEL6 instances on RHEL7 hosts.
> The rule set (see below) runs fine on any other system with fewer CPUs
> (<64, maybe this is the cutoff?). We have smaller systems with this
> rule set that rotate the audit file nearly every minute without any
> noticeable performance hit; these SAP systems rotate once every
> 20-24 hours.
> 
> Does anyone have an idea?
> 
> Here's an excerpt from "perf top":
> with auditd running:
> 
> Samples: 28M of event 'cpu-clock', Event count (approx.): 236747914918
> Overhead  Shared Object  Symbol
>   23.13%  [kernel]       [k] get_task_cred
>   10.05%  [kernel]       [k] audit_filter_rules
>    4.21%  [kernel]       [k] _spin_unlock_irqrestore
>    3.30%  libdb2e.so.1   [.] sqlbfix
>    2.92%  [kernel]       [k] finish_task_switch
>    1.69%  disp+work      [.] rrol_in
>    1.69%  disp+work      [.] rrol_out
>    0.98%  [kernel]       [k] run_timer_softirq
>    0.96%  [kernel]       [k] rcu_process_gp_end
> 
> 
> auditd stopped:
> 
> Samples: 3M of event 'cpu-clock', Event count (approx.): 526535382557
> Overhead  Shared Object  Symbol
>    2.41%  disp+work      [.] memcmpU16
>    2.32%  disp+work      [.] MmxMalloc2
>    2.25%  disp+work      [.] ab_Rudi
>    2.07%  disp+work      [.] rrol_out
>    1.98%  disp+work      [.] rrol_in
>    1.95%  disp+work      [.] ab_CompByCmpCntx
>    1.88%  libdb2e.so.1   [.] sqlbfix
>    1.73%  disp+work      [.] MmxFree2
>    1.62%  [kernel]       [k] run_timer_softirq
>    1.56%  [kernel]       [k] __do_softirq
>    1.39%  disp+work      [.] ab_InitRcDecompress
> 
> These are the audit rules:
> auditctl -l
> -a always,exit -S all -F path=/etc/environment -F perm=wa -F auid>=400 -F key=CRIT_CONF

I clipped all the other rules. Out of curiosity, why do you include -S all in
every rule? That automatically sends every syscall through the syscall rules,
which affects the performance of every single syscall in every single
application. The majority of your rules are file watches, which generally take
a different route that is more efficient.

To fix this, just remove "-S all" in every rule. I bet it works much better 
after that.
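
As a rough sketch of that change (not a tested procedure), the helper below
strips "-S all" from a single rule line and leaves the other fields alone.
The function name strip_s_all is made up for illustration, and it assumes
the rules are plain one-per-line text like the auditctl -l output quoted
above. Review the result before loading it.

```shell
# Hypothetical helper: remove "-S all" from one audit rule line.
# Assumes the rule is a single line of text, as printed by "auditctl -l".
strip_s_all() {
    printf '%s\n' "$1" | sed \
        -e 's/[[:space:]]-S[[:space:]][[:space:]]*all[[:space:]]/ /g' \
        -e 's/[[:space:]]-S[[:space:]][[:space:]]*all$//'
}

# Example using the rule quoted in this thread.
strip_s_all '-a always,exit -S all -F path=/etc/environment -F perm=wa -F auid>=400 -F key=CRIT_CONF'
```

For the quoted rule this prints:
-a always,exit -F path=/etc/environment -F perm=wa -F auid>=400 -F key=CRIT_CONF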

-Steve



