Systemd Journald and audit logging causing journal issues

Brad Zynda bradley.v.zynda at nasa.gov
Tue Oct 17 15:40:12 UTC 2017


Hey Steve,

No problem you guys are busy with updates..

So I kind of stepped into a known issue with a current disagreement
between the 2 maintainers? what can be done to resolve this going
forward as it is killing services in production environments?

I agree with the need not to remove auditing as this is a slippery slope
and should not occur but the decision was based on little documentation
in regards to the problem and loss of service, I will look at further
checking in hopes to find the specific rule. Though I think the latter
to fix the issue is the appropriate avenue.

The rules have been put in place across many organizations that check
with tools like CIS-CAT and OSCAP, so a lot of rules and a point of
possible single failure.

In regards to the audit.socket what is the expected outcome of masking
this service?

Thanks,
Brad

On 10/17/2017 11:25 AM, Steve Grubb wrote:
> Hello,
> 
> I apologize for the late reply...just found the message.
> 
> On Monday, October 2, 2017 1:30:19 PM EDT Brad Zynda wrote:
>> I am sending along an issue brought to the systemd-journald dev list
>> initially:
>>
>> On 10/02/2017 11:40 AM, Lennart Poettering wrote:
>>> On Mo, 02.10.17 11:25, Brad Zynda (bradley.v.zynda at nasa.gov) wrote:
>>>> Sep 28 13:50:03 server systemd-journal[565]: Suppressed 73244 messages
>>>> from /system.slice/auditd.service
>>>
>>> The question is: why does auditd even log to the journal?
> 
> It doesn't. I have had many arguments with the systemd people about polluting 
> syslog with audit events. If we wanted audit events there, we would have wrote 
> them there. The journal is listening on a multicast audit socket that was 
> created just for them and using a posix capability that was created just for 
> them. And journal also turns on auditing even if you didn't want it. In short, 
> they have, with intention, created your problem.
> 
> 
>>>> Now we are required to have full audit rules and does this look like at
>>>> rate limiting issue or an issue of journal not able to handle the
>>>> traffic to logging?
>>>
>>> journald detected that it got flooded with too many messages in too
>>> short a time from auditd. if this happens then something is almost
>>> certainly off with auditd, as auditd is not supposed to flood journald
>>> with messages, after all it maintains its own auditing log database.
> 
> No...that's the way it works. If you want the audit stream, you have to be 
> able to handle it. My suggestion is that we have a separation of duties. 
> Auditd has audit events, journal has syslog. Besides, mixing audit and syslog 
> data means the security officer and system admin roles have been combined. I 
> think there is an audit.socket unit file that can be masked.
> 
> 
>>> Please ping the auditd folks for help
>  
> They created the problem of audit events in syslog. That said, its been my 
> experience that whenever you get lots of events, there may be something wrong 
> with your rules.
> 
> The normal technique to figure out what wrong is to run aureport --summary 
> --key during the time range of the flood to see what rule is triggering. Then 
> we can look at that rule to see if there's something wrong with it.
> 
> More below...
>  
>> Hey Everyone,
>>
>> Not sure if this is a bug so:
>>
>> systemctl status -l systemd-journald.service
>> ● systemd-journald.service - Journal Service
>>    Loaded: loaded (/usr/lib/systemd/system/systemd-journald.service;
>> static; vendor preset: disabled)
>>    Active: active (running) since Tue 2017-09-26 20:01:16 UTC; 5 days ago
>>      Docs: man:systemd-journald.service(8)
>>            man:journald.conf(5)
>>  Main PID: 565 (systemd-journal)
>>    Status: "Processing requests..."
>>    CGroup: /system.slice/systemd-journald.service
>>            └─565 /usr/lib/systemd/systemd-journald
>>
>> Sep 28 13:50:03 server systemd-journal[565]: Suppressed 73244 messages
>> from /system.slice/auditd.service
>> Sep 28 13:51:03 server systemd-journal[565]: Suppressed 98979 messages
>> from /system.slice/auditd.service
>> Sep 28 13:52:03 server systemd-journal[565]: Suppressed 109433 messages
>> from /system.slice/auditd.service
>> Sep 28 13:53:03 server systemd-journal[565]: Suppressed 99788 messages
>> from /system.slice/auditd.service
>> Sep 28 13:54:03 server systemd-journal[565]: Suppressed 111605 messages
>> from /system.slice/auditd.service
>> Sep 28 13:55:03 server systemd-journal[565]: Suppressed 111591 messages
>> from /system.slice/auditd.service
>> Sep 28 13:56:03 server systemd-journal[565]: Suppressed 107947 messages
>> from /system.slice/auditd.service
>> Sep 28 13:57:51 server systemd-journal[565]: Suppressed 32760 messages
>> from /system.slice/auditd.service
>> Sep 28 17:21:40 server systemd-journal[565]: Suppressed 210 messages
>> from /system.slice/auditd.service
>> Oct 01 02:16:01 server systemd-journal[565]: Suppressed 1333 messages
>> from /system.slice/auditd.service
>>
>> journalctl --verify
>> PASS: /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system.journal
>> PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-000000000097f6c7-0005596b745b4d1c.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-000000000096a587-00055966f35ae59a.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-00000000009554f1-000559629c4cdb7e.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-0000000000940591-0005595e1811a2d1.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-000000000092b500-00055959f2de5ede.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-0000000000916479-0005595573137b74.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-0000000000901337-00055950d80cc3d8.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-00000000008ec2fb-0005594cad14b07a.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-00000000008d7373-0005594838683e58.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-00000000008c238e-00055943fe2072e3.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-00000000008ad1d9-0005593ff64a4f69.journal PASS:
>> /run/log/journal/d28b0080ffe0432a974f36e4fb4bfa9b/system at 0d49221d68d04ef0b95
>> d8203c5e96a46-0000000000897f32-0005593e18c5758b.journal
>>
>>
>> journalctl --disk-usage
>> Archived and active journals take up 1.1G on disk.
>>
>>
>> Initially we saw:
>> 16733 PATH
>> 5070 SYSCALL
>> 5024 CWD
>> 3765 AVC
>> 323 CRYPTO_KEY_USER
>> 223 USER_START
>> 222 USER_ACCT
>> 222 CRED_ACQ
>> 220 LOGIN
>> 220 CRED_REFR
>> 218 USER_END
>> 218 CRED_DISP
>> 46 USER_LOGIN
>> 12 EXECVE
>> 4 USER_AUTH
>> 2 CRYPTO_SESSION
>> 1 USER_ROLE_CHANGE
>> 1 USER_CMD
>> 1 SERVICE_STOP
>> 1 SERVICE_START
>> 1 BPRM_FCAPS
>>
>> so we blocked type PATH in audit.rules
> 
> This is not the right thing to do. If a security officer asks what is being 
> accessed, you got rid of the information. The right thing is to figure out 
> which rule is being hit and see if something is wrong with it. For example, I 
> have seen people do this:
> 
> -a always,exit -S open,openat -F exit=-EPERM
> 
> The problem is that they did not restrict the rule an architecture and they 
> were getting lots of events for the wrong syscall. I've also seen people add 
> -F success 0 to an open syscall. This also results in a large number of 
> events.
> 
> So, I'd recommend making sure all rules have keys added and the running the 
> key summary report to see what rule needs inspection.
> 
> If you find the rule that's causing the problem and you want an opinion, send 
> it to the mail list.
> 
> -Steve
> 
> 
>> But we are still seeing 100K of dropped/suppressed messages.
>>
>> Note: systemloglevel = INFO
>>
>> Centos 7 1708 3.10.0-693.2.2.el7.x86_64
>>
>> systemd.x86_64                      219-42.el7_4.1
>>
>>
>> Now we are required to have full audit rules and does this look like at
>> rate limiting issue or an issue of journal not able to handle the
>> traffic to logging?
>>
>> Error we are seeing from services that have silently failed, in this
>> case glassfish..
>>
>> systemctl status -l glassfish
>> ● glassfish.service - SYSV: GlassFish start and stop  daemon
>>    Loaded: loaded (/etc/rc.d/init.d/glassfish; bad; vendor preset: disabled)
>> Active: active (exited) since Tue 2017-09-26 20:01:36 UTC; 5 days ago Docs:
>> man:systemd-sysv-generator(8)
>>   Process: 1328 ExecStart=/etc/rc.d/init.d/glassfish start (code=exited,
>> status=0/SUCCESS)
>>
>> Warning: Journal has been rotated since unit was started. Log output is
>> incomplete or unavailable.
>>
>> Eventually glassfish will fail but it wont kill the service so we never
>> get an nms service down trap from the OID.
>>
>> Please let me know if further info is needed or if certain limits need
>> to be adjusted.
>>
>> Thanks,
>> Brad Zynda
>>
>>
>> --
>> Linux-audit mailing list
>> Linux-audit at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-audit
> 
> 
> 




More information about the Linux-audit mailing list