Systemd Journald and audit logging causing journal issues

Brad Zynda bradley.v.zynda at nasa.gov
Wed Oct 18 15:14:31 UTC 2017


Hey Steve,

Here is an output from the server with PATH audit type re-allowed
(everything back to normal):

Key Summary Report
===========================
total  key
===========================
6019  perm_mod
3878  delete
964  access
96  privileged
57  time-change
51  session
41  modules
20  logins
6  system-locale
5  identity
2  mounts
2  scope
2  actions
1  MAC-policy

Now this is probably not a peak result but I have come across 2 questions..

1. I wanted to cron this and email results but get, verified path sbin:

Key Summary Report
===========================
total  key
===========================
<no events of interest were found>


2. If it ends up being perm_mod as the high count what is the next step
to identify the rule in question?

Thanks,
Brad

On 10/17/2017 01:13 PM, Brad Zynda wrote:
> Hi Steve,
> 
> Thanks for pointing me in the right direction and including the 2 year
> old ticket to reference ;)
> 
> I will see about getting the audit.socket masked if it is allowed under
> FIPS/NIST.
> 
> Thanks again,
> Brad
> 
> 
> On 10/17/2017 12:25 PM, Steve Grubb wrote:
>> On Tuesday, October 17, 2017 11:40:12 AM EDT Brad Zynda wrote:
>>> Hey Steve,
>>>
>>> No problem you guys are busy with updates..
>>>
>>> So I kind of stepped into a known issue with a current disagreement
>>> between the 2 maintainers?
>>
>> Its not a disagreement. Its systemd wants to do everything. Its a crond/
>> xinetd/syslogd/auditd/core dump collector/udev/login service/fstab/fs 
>> automounter/container manager/file system monitor/resource manager/daemon 
>> watchdog  and oh by the way, it does init.
>>
>>
>>> what can be done to resolve this going
>>> forward as it is killing services in production environments?
>>
>> End users have to take the situation into their own hands. There are 
>> configuration knobs for a reason. More info here:
>>
>> https://bugzilla.redhat.com/show_bug.cgi?id=1227379
>>
>>
>>> I agree with the need not to remove auditing as this is a slippery slope
>>> and should not occur but the decision was based on little documentation
>>> in regards to the problem and loss of service, I will look at further
>>> checking in hopes to find the specific rule. Though I think the latter
>>> to fix the issue is the appropriate avenue.
>>
>> Figuring out which rule is triggering is the best solution. It may turn out 
>> you just have a busy system. But most of the time its a bad rule.
>>
>>> The rules have been put in place across many organizations that check
>>> with tools like CIS-CAT and OSCAP, so a lot of rules and a point of
>>> possible single failure.
>>
>> They make mistakes, too.
>>
>>> In regards to the audit.socket what is the expected outcome of masking
>>> this service?
>>
>> The expected outcome is that journald stops getting audit records. It doesn't 
>> solve the problem of why you are getting so many events. Fixing the rule does 
>> that.
>>
>> -Steve
>>  
>>> On 10/17/2017 11:25 AM, Steve Grubb wrote:
>>>> Hello,
>>>>
>>>> I apologize for the late reply...just found the message.
>>>>
>>>> On Monday, October 2, 2017 1:30:19 PM EDT Brad Zynda wrote:
>>>>> I am sending along an issue brought to the systemd-journald dev list
>>>>> initially:
>>>>>
>>>>> On 10/02/2017 11:40 AM, Lennart Poettering wrote:
>>>>>> On Mo, 02.10.17 11:25, Brad Zynda (bradley.v.zynda at nasa.gov) wrote:
>>>>>>> Sep 28 13:50:03 server systemd-journal[565]: Suppressed 73244 messages
>>>>>>> from /system.slice/auditd.service
>>>>>>
>>>>>> The question is: why does auditd even log to the journal?
>>>>
>>>> It doesn't. I have had many arguments with the systemd people about
>>>> polluting syslog with audit events. If we wanted audit events there, we
>>>> would have wrote them there. The journal is listening on a multicast
>>>> audit socket that was created just for them and using a posix capability
>>>> that was created just for them. And journal also turns on auditing even
>>>> if you didn't want it. In short, they have, with intention, created your
>>>> problem.
>>>>
>>>>>>> Now we are required to have full audit rules and does this look like at
>>>>>>> rate limiting issue or an issue of journal not able to handle the
>>>>>>> traffic to logging?
>>>>>>
>>>>>> journald detected that it got flooded with too many messages in too
>>>>>> short a time from auditd. if this happens then something is almost
>>>>>> certainly off with auditd, as auditd is not supposed to flood journald
>>>>>> with messages, after all it maintains its own auditing log database.
>>>>
>>>> No...that's the way it works. If you want the audit stream, you have to be
>>>> able to handle it. My suggestion is that we have a separation of duties.
>>>> Auditd has audit events, journal has syslog. Besides, mixing audit and
>>>> syslog data means the security officer and system admin roles have been
>>>> combined. I think there is an audit.socket unit file that can be masked.
>>>>
>>>>>> Please ping the auditd folks for help
>>>>
>>>> They created the problem of audit events in syslog. That said, its been my
>>>> experience that whenever you get lots of events, there may be something
>>>> wrong with your rules.
>>>>
>>>> The normal technique to figure out what wrong is to run aureport --summary
>>>> --key during the time range of the flood to see what rule is triggering.
>>>> Then we can look at that rule to see if there's something wrong with it.
>>>>
>>>> More below...
>>>>
>>>>> Hey Everyone,
>>>>>
>>>>> Not sure if this is a bug so:
>>>>>
>>>>> systemctl status -l systemd-journald.service
>>>>> ● systemd-journald.service - Journal Service
>>>>>
>>>>>    Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service;
>>>>>
>>>>> static; vendor preset: disabled)
>>>>>
>>>>>    Active: active (running) since Tue 2017-09-26 20:01:16 UTC; 5 days ago
>>>>>    
>>>>>      Docs: man:systemd-journald.service(8)
>>>>>      
>>>>>            man:journald.conf(5)
>>>>>  
>>>>>  Main PID: 565 (systemd-journal)
>>>>>  
>>>>>    Status: "Processing requests..."
>>>>>    CGroup: /system.slice/systemd-journald.service
>>>>>    
>>>>>            └─565 /usr/lib/systemd/systemd-journald
>>>>>
>>>>> Sep 28 13:50:03 server systemd-journal[565]: Suppressed 73244 messages
>>>>> from /system.slice/auditd.service
>>>>> Sep 28 13:51:03 server systemd-journal[565]: Suppressed 98979 messages
>>>>> from /system.slice/auditd.service
>>>>> Sep 28 13:52:03 server systemd-journal[565]: Suppressed 109433 messages
>>>>> from /system.slice/auditd.service
>>>>> Sep 28 13:53:03 server systemd-journal[565]: Suppressed 99788 messages
>>>>> from /system.slice/auditd.service
>>>>> Sep 28 13:54:03 server systemd-journal[565]: Suppressed 111605 messages
>>>>> from /system.slice/auditd.service
>>>>> Sep 28 13:55:03 server systemd-journal[565]: Suppressed 111591 messages
>>>>> from /system.slice/auditd.service
>>>>> Sep 28 13:56:03 server systemd-journal[565]: Suppressed 107947 messages
>>>>> from /system.slice/auditd.service
>>>>> Sep 28 13:57:51 server systemd-journal[565]: Suppressed 32760 messages
>>>>> from /system.slice/auditd.service
>>>>> Sep 28 17:21:40 server systemd-journal[565]: Suppressed 210 messages
>>>>> from /system.slice/auditd.service
>>>>> Oct 01 02:16:01 server systemd-journal[565]: Suppressed 1333 messages
>>>>> from /system.slice/auditd.service
>>>>>
>>>>> journalctl --verify
>>>>> PASS: /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system.journal
>>>>> PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-000000000097f6c7-0005596b745b4d1c.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-000000000096a587-00055966f35ae59a.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-00000000009554f1-000559629c4cdb7e.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-0000000000940591-0005595e1811a2d1.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-000000000092b500-00055959f2de5ede.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-0000000000916479-0005595573137b74.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-0000000000901337-00055950d80cc3d8.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-00000000008ec2fb-0005594cad14b07a.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-00000000008d7373-0005594838683e58.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-00000000008c238e-00055943fe2072e3.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-00000000008ad1d9-0005593ff64a4f69.journal PASS:
>>>>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0
>>>>> b95 d8203c5e96a46-0000000000897f32-0005593e18c5758b.journal
>>>>>
>>>>>
>>>>> journalctl --disk-usage
>>>>> Archived and active journals take up 1.1G on disk.
>>>>>
>>>>>
>>>>> Initially we saw:
>>>>> 16733 PATH
>>>>> 5070 SYSCALL
>>>>> 5024 CWD
>>>>> 3765 AVC
>>>>> 323 CRYPTO_KEY_USER
>>>>> 223 USER_START
>>>>> 222 USER_ACCT
>>>>> 222 CRED_ACQ
>>>>> 220 LOGIN
>>>>> 220 CRED_REFR
>>>>> 218 USER_END
>>>>> 218 CRED_DISP
>>>>> 46 USER_LOGIN
>>>>> 12 EXECVE
>>>>> 4 USER_AUTH
>>>>> 2 CRYPTO_SESSION
>>>>> 1 USER_ROLE_CHANGE
>>>>> 1 USER_CMD
>>>>> 1 SERVICE_STOP
>>>>> 1 SERVICE_START
>>>>> 1 BPRM_FCAPS
>>>>>
>>>>> so we blocked type PATH in audit.rules
>>>>
>>>> This is not the right thing to do. If a security officer asks what is
>>>> being
>>>> accessed, you got rid of the information. The right thing is to figure out
>>>> which rule is being hit and see if something is wrong with it. For
>>>> example, I have seen people do this:
>>>>
>>>> -a always,exit -S open,openat -F exit=-EPERM
>>>>
>>>> The problem is that they did not restrict the rule an architecture and
>>>> they
>>>> were getting lots of events for the wrong syscall. I've also seen people
>>>> add -F success 0 to an open syscall. This also results in a large number
>>>> of events.
>>>>
>>>> So, I'd recommend making sure all rules have keys added and the running
>>>> the
>>>> key summary report to see what rule needs inspection.
>>>>
>>>> If you find the rule that's causing the problem and you want an opinion,
>>>> send it to the mail list.
>>>>
>>>> -Steve
>>>>
>>>>> But we are still seeing 100K of dropped/suppressed messages.
>>>>>
>>>>> Note: systemloglevel = INFO
>>>>>
>>>>> Centos 7 1708 3.10.0-693.2.2.el7.x86_64
>>>>>
>>>>> systemd.x86_64                      219-42.el7_4.1
>>>>>
>>>>>
>>>>> Now we are required to have full audit rules and does this look like at
>>>>> rate limiting issue or an issue of journal not able to handle the
>>>>> traffic to logging?
>>>>>
>>>>> Error we are seeing from services that have silently failed, in this
>>>>> case glassfish..
>>>>>
>>>>> systemctl status -l glassfish
>>>>> ● glassfish.service - SYSV: GlassFish start and stop  daemon
>>>>>
>>>>>    Loaded: loaded (/etc/rc.d/init.d/glassfish; bad; vendor preset:
>>>>>    disabled)
>>>>>
>>>>> Active: active (exited) since Tue 2017-09-26 20:01:36 UTC; 5 days ago
>>>>> Docs:
>>>>> man:systemd-sysv-generator(8)
>>>>>
>>>>>   Process: 1328 ExecStart=/etc/rc.d/init.d/glassfish start (code=exited,
>>>>>
>>>>> status=0/SUCCESS)
>>>>>
>>>>> Warning: Journal has been rotated since unit was started. Log output is
>>>>> incomplete or unavailable.
>>>>>
>>>>> Eventually glassfish will fail but it wont kill the service so we never
>>>>> get an nms service down trap from the OID.
>>>>>
>>>>> Please let me know if further info is needed or if certain limits need
>>>>> to be adjusted.
>>>>>
>>>>> Thanks,
>>>>> Brad Zynda
>>>>>
>>>>>
>>>>> --
>>>>> Linux-audit mailing list
>>>>> Linux-audit at redhat.com
>>>>> https://www.redhat.com/mailman/listinfo/linux-audit
>>
>>
>>




More information about the Linux-audit mailing list