handling disk full

Wed Dec 15 19:16:44 UTC 2004

On Wed, Dec 15, 2004 at 01:09:01PM -0500, Stephen Smalley wrote:
> On Wed, 2004-12-15 at 13:01, Klaus Weidner wrote:
> > Keep in mind that the CAPP audit requirements are fairly independent from
> > the SELinux uses of the audit subsystem. 
> > 
> > CAPP requires that specific actions don't complete if they can't be
> > audited, and those events will in general occur from a syscall context
> > where a sleep should not be a problem.
> 
> 1) What does "can't be audited" mean - that we couldn't send the audit
> record to userspace or that it couldn't reach the disk?

The more abstract requirement is that the implementation minimizes the
potential loss of audit records in a way that makes the maximum lossage
predictable, and ideally configurable based on admin preferences for a
reliability/performance tradeoff.

In this context, it would mean that the kernel knows that it won't be
able to send the record to userspace, for example because the queue is
full. auditd should also have some way to tell the kernel when it is out
of disk space, so that the kernel can stop queueing further events.
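
To make that concrete, here is a rough userspace model of the gating, not
the kernel's actual queue code. All names (audit_queue_model,
model_report_disk_full, model_enqueue) are made up for illustration, and
the daemon-to-kernel notification is assumed to be a simple flag.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; not the kernel's real audit queue. */
struct audit_queue_model {
    size_t len;        /* records currently queued for the daemon */
    size_t cap;        /* maximum number of records the queue may hold */
    bool   disk_full;  /* set when the daemon reports it has no disk space */
};

/* Assumed notification path: the daemon tells the "kernel" about disk state. */
void model_report_disk_full(struct audit_queue_model *q, bool full)
{
    q->disk_full = full;
}

/* True when a new record could neither be queued nor stored. */
bool model_cannot_audit(const struct audit_queue_model *q)
{
    return q->disk_full || q->len >= q->cap;
}

/*
 * Try to queue one record. Returns false when the record cannot be
 * audited; the caller is then expected to keep the audited action
 * from completing.
 */
bool model_enqueue(struct audit_queue_model *q)
{
    if (model_cannot_audit(q))
        return false;
    q->len++;
    return true;
}
```

The point of the model is only that "can't be audited" becomes a single
check the kernel can make before the action completes.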

Quoting from CAPP (LSPP's requirements here are identical):

	5.1.7   Guarantees of Audit Data Availability (FAU_STG.1)

	5.1.7.1 The TSF shall protect the stored audit records from
	unauthorized deletion. FAU_STG.1.1

	5.1.7.2 The TSF shall be able to prevent modifications to the
	audit records. FAU_STG.1.2

	Application Note: On many systems, in order to reduce the
	   performance impact of audit generation, audit records will be
	   temporarily buffered in memory before they are written to
	   disk.  In these cases, it is likely that some of these records
	   will be lost if the operation of the TOE is interrupted by
	   hardware or power failures. The developer needs to document
	   what the likely loss will be and show that it has been
	   minimized.

	Rationale: This component supports the O.AUDITING objective by
	   protecting the audit trail from tampering, via deletion or
	   modification of records in it. Further it ensures that it is
	   as complete as possible.

	5.1.9   Prevention of Audit Data Loss (FAU_STG.4)

	5.1.9.1 The TSF shall be able to prevent auditable events, except
	those taken by the authorized administrator, and [assignment:
	other actions to be taken in case of audit storage failure] if
	the audit trail is full. FAU_STG.4.1 / NOTE 5

	Application Note: The selection of "preventing" auditable
	   actions if audit storage is exhausted is minimal
	   functionality; providing a range of configurable choices
	   (e.g., ignoring auditable actions and/or changing to a
	   degraded mode) is allowable, as long as "preventing" is one of
	   the choices. If configurable, then FMT_MOF.1 should be
	   incorporated into the ST.

	Rationale: This component supports the O.AUDITING and O.MANAGE
	   objectives by providing the audit trail is complete with
	   respect to non-administrative users while providing
	   administrators with the ability to recover from the situation.
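
As an illustration of the configurable choices FAU_STG.4 allows, a
decision function could look roughly like the sketch below. The enum and
function names are hypothetical. What "degraded mode" actually does is
not defined by the text above, so the sketch assumes it only sets a flag
and lets the action proceed.

```c
#include <stdbool.h>

/* Hypothetical policy names, following the choices in the application note. */
enum audit_full_policy {
    AUDIT_FULL_PREVENT,   /* block auditable actions (the required choice) */
    AUDIT_FULL_IGNORE,    /* let actions proceed unaudited */
    AUDIT_FULL_DEGRADED   /* assumed: mark the system degraded, then proceed */
};

enum audit_decision { ACTION_ALLOW, ACTION_DENY };

struct audit_state_model {
    enum audit_full_policy policy;  /* admin-selected behaviour */
    bool trail_full;                /* audit trail has no room left */
    bool degraded;                  /* set once degraded mode is entered */
};

/* Decide whether an auditable action may proceed. */
enum audit_decision model_decide(struct audit_state_model *s, bool is_admin)
{
    if (!s->trail_full)
        return ACTION_ALLOW;
    if (is_admin)
        return ACTION_ALLOW;  /* the administrator must be able to recover */

    switch (s->policy) {
    case AUDIT_FULL_PREVENT:
        return ACTION_DENY;
    case AUDIT_FULL_IGNORE:
        return ACTION_ALLOW;
    case AUDIT_FULL_DEGRADED:
        s->degraded = true;
        return ACTION_ALLOW;
    }
    return ACTION_DENY;  /* unknown policy: fail closed */
}
```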

> 2) Even from process context, you'd have to make sure that the caller is
> never holding a lock when it calls audit_log*.

That's a potential advantage of generating audit events in the syscall
path.

> > The events generated by SELinux are not required by CAPP, and it's not a
> > problem for CAPP compliance if those messages get discarded if there is
> > no room for them and the kernel can't sleep.
> 
> Possibly, but audit_log* can't automatically detect whether it is safe
> to sleep.  Caller will have to provide that information via a flag or
> alternate interface.  

Yes, and it's not pretty...
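
For illustration only, the flag variant could look something like this
userspace sketch. The names (model_audit_log, MODEL_LOG_MAY_SLEEP and so
on) are invented and are not the existing audit_log* interface. The
actual sleep is left out; the function only reports which path would be
taken.

```c
#include <stdbool.h>

/* Hypothetical caller-supplied flags. */
enum model_log_flags {
    MODEL_LOG_NOWAIT    = 0,  /* caller may hold a lock; never sleep */
    MODEL_LOG_MAY_SLEEP = 1   /* caller is in a context where sleeping is safe */
};

enum model_log_result {
    MODEL_LOGGED,       /* record was queued */
    MODEL_WOULD_SLEEP,  /* queue full; caller would wait for room */
    MODEL_DROPPED       /* queue full and sleeping not allowed; record lost */
};

/* Pick the outcome for one record based on queue state and caller flags. */
enum model_log_result model_audit_log(bool queue_has_room, int flags)
{
    if (queue_has_room)
        return MODEL_LOGGED;
    if (flags & MODEL_LOG_MAY_SLEEP)
        return MODEL_WOULD_SLEEP;
    return MODEL_DROPPED;
}
```

Events required by CAPP would pass the may-sleep flag from the syscall
path, and the SELinux messages that can't sleep would take the drop path.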

> In any event, use of sigsuspend seems questionable.

I think it needs to be an in-kernel sleep, since a sigsuspend could be
undone by a sigresume from a different process.

-Klaus