[RFC PATCH] audit: reduce the number of kauditd_thread wakeups

Richard Guy Briggs rgb at redhat.com
Mon Jun 7 18:40:14 UTC 2021


On 2021-06-05 23:23, Paul Moore wrote:
> [NOTE: As this is an RFC patch, I wanted to add some commentary at
>  the top of the patch description explaining where this patch came
>  from and what testing has been done.  This patch is a derivative
>  of another unreleased patch that removed all of the wake up calls
>  from the audit_log_start() and audit_log_end() functions; while
>  that patch worked well, there we some record losees with high
>  volume burst traffic in the case of `auditctl -a task,never` or
>  CONFIG_AUDITSYSCALL=n.  As this patch doesn't completely remove
>  the wake up calls these two cases should behave similarly to how
>  they do today.  As far as testing is concerned, this patch passes
>  both the audit-testsuite and selinux-testsuite without problem and
>  with expected audit queue losses (no losses outside of the tests
>  which attempt to generate losses).  This patch also preserves the
>  ability for the system to continue to function, in the
>  `auditctl -a exit,always -S all` case, albeit very slowly.]
> 
> When audit is enabled, the kauditd_thread() function runs in its own
> kernel thread, emptying the audit record queue and sending the
> entries to userspace via different means.  As part of its normal
> processing it goes to sleep after emptying the queue and waits for
> the audit functions to wake it.
> 
> The current audit kernel code attempts to wake the kauditd thread
> after each entry is added to the queue.  Considering that a single
> syscall can have multiple audit records, with each wake attempt
> involving locking and additional processing, this can be rather
> wasteful.  In an effort to limit the number of wake attempts without
> unnecessarily risking the audit queue size this patch does the
> following:
> 
> * In the case of syscall auditing the wake up call is moved from
>   audit_log_end() to audit_log_exit() meaning that the kauditd
>   thread is only woken once, at the end of the syscall, after all of
>   the syscall's audit records have been added to the queue.
> 
> * In the case where audit records are generated outside of a syscall
>   context, the wake up call in audit_log_end() is preserved in order
>   to ensure that these records do not suffer any additional latency
>   or put unnecessary pressure on the queue.  This is done through
>   some additional checking in audit_log_start() and an additional
>   flag in the audit_buffer struct.
> 
> * The audit_log_start() function never attempts to wake the kauditd
>   thread.  In the current code this is only done when the queue is
>   under pressure, but at that point it is extremely likely that the
>   thread is already active or scheduled, do we should be able to
>   safely remove this wake attempt.
> 
> * Always wake the kauditd thread after processing management and
>   user records in audit_receive_msg().  This should be relatively
>   low frequency compared to the other audit sources, and doing a
>   wake call here helps keep record latency low and the queue size
>   in check.
> 
> * The kauditd thread itself is now a bit better at handling high
>   volume audit record bursts.  Previously after emptying the queue
>   the thread would wake every process that was blocked and then go
>   to sleep; possibly going to sleep just a flood of new records
>   are added to the queue.  The new kauditd thread approach wakes all
>   of the blocked processes and then reschedules itself in
>   anticipation of new records.  When the thread returns to execution
>   it checks the queue and if there are any records present it
>   immediately starts processing them, if the queue is empty the
>   kauditd thread goes back to sleep.
> 
> Signed-off-by: Paul Moore <paul at paul-moore.com>

This looks good to me.  Thank you for the thorough description.  I can
see how this work was provoked given some of the other work in progress
and some of the minor changes that will be needed to those other works
as a result.

Acked-by: Richard Guy Briggs <rgb at redhat.com>

> ---
>  kernel/audit.c   |   66 ++++++++++++++++++++++++++++++++++++++----------------
>  kernel/audit.h   |    2 ++
>  kernel/auditsc.c |    2 ++
>  3 files changed, 51 insertions(+), 19 deletions(-)
> 
> diff --git a/kernel/audit.c b/kernel/audit.c
> index 551a394bc8f42..7c3368835bb71 100644
> --- a/kernel/audit.c
> +++ b/kernel/audit.c
> @@ -197,9 +197,10 @@ static struct audit_ctl_mutex {
>   * to place it on a transmit queue.  Multiple audit_buffers can be in
>   * use simultaneously. */
>  struct audit_buffer {
> -	struct sk_buff       *skb;	/* formatted skb ready to send */
> -	struct audit_context *ctx;	/* NULL or associated context */
> -	gfp_t		     gfp_mask;
> +	struct sk_buff *skb;
> +	struct audit_context *ctx;
> +	gfp_t gfp_mask;
> +	unsigned int kauditd_wake:1;
>  };
>  
>  struct audit_reply {
> @@ -804,6 +805,17 @@ static void kauditd_send_multicast_skb(struct sk_buff *skb)
>  	nlmsg_multicast(sock, copy, 0, AUDIT_NLGRP_READLOG, GFP_KERNEL);
>  }
>  
> +/**
> + * kauditd_wakeup - Wake the kauditd_thread
> + *
> + * Description:
> + * Wake up the kauditd_thread thread so that it can process the audit queues.
> + */
> +void kauditd_wakeup(void)
> +{
> +	wake_up_interruptible(&kauditd_wait);
> +}
> +
>  /**
>   * kauditd_thread - Worker thread to send audit records to userspace
>   * @dummy: unused
> @@ -832,6 +844,7 @@ static int kauditd_thread(void *dummy)
>  		portid = ac->portid;
>  		rcu_read_unlock();
>  
> +all_queues:
>  		/* attempt to flush the hold queue */
>  		rc = kauditd_send_queue(sk, portid,
>  					&audit_hold_queue, UNICAST_RETRIES,
> @@ -861,25 +874,33 @@ static int kauditd_thread(void *dummy)
>  					kauditd_send_multicast_skb,
>  					(sk ?
>  					 kauditd_retry_skb : kauditd_hold_skb));
> -		if (ac && rc < 0)
> +		if (ac && rc < 0) {
>  			auditd_reset(ac);
> -		sk = NULL;
> +			goto disconnect;
> +		}
> +
> +		/* we have processed all the queues so wake everyone who might
> +		 * be waiting, reschedule ourselves to give others a chance to
> +		 * run, and check the queue once again when we return; if
> +		 * everything is quiet then we can go back to sleep */
> +		wake_up(&audit_backlog_wait);
> +		cond_resched();
> +		if (skb_queue_len(&audit_queue))
> +			goto all_queues;
>  
> +disconnect:
>  		/* drop our netns reference, no auditd sends past this line */
> +		sk = NULL;
>  		if (net) {
>  			put_net(net);
>  			net = NULL;
>  		}
>  
> -		/* we have processed all the queues so wake everyone */
> -		wake_up(&audit_backlog_wait);
> -
>  		/* NOTE: we want to wake up if there is anything on the queue,
>  		 *       regardless of if an auditd is connected, as we need to
>  		 *       do the multicast send and rotate records from the
>  		 *       main queue to the retry/hold queues */
> -		wait_event_freezable(kauditd_wait,
> -				     (skb_queue_len(&audit_queue) ? 1 : 0));
> +		wait_event_freezable(kauditd_wait, skb_queue_len(&audit_queue));
>  	}
>  
>  	return 0;
> @@ -1509,6 +1530,9 @@ static int audit_receive_msg(struct sk_buff *skb, struct nlmsghdr *nlh)
>  		break;
>  	}
>  
> +	/* poke the kauditd_thread to handle any records we generated */
> +	kauditd_wakeup();
> +
>  	return err < 0 ? err : 0;
>  }
>  
> @@ -1750,6 +1774,7 @@ static struct audit_buffer *audit_buffer_alloc(struct audit_context *ctx,
>  
>  	ab->ctx = ctx;
>  	ab->gfp_mask = gfp_mask;
> +	ab->kauditd_wake = 0;
>  
>  	return ab;
>  
> @@ -1831,14 +1856,10 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
>  
>  		while (audit_backlog_limit &&
>  		       (skb_queue_len(&audit_queue) > audit_backlog_limit)) {
> -			/* wake kauditd to try and flush the queue */
> -			wake_up_interruptible(&kauditd_wait);
> -
>  			/* sleep if we are allowed and we haven't exhausted our
>  			 * backlog wait limit */
>  			if (gfpflags_allow_blocking(gfp_mask) && (stime > 0)) {
>  				long rtime = stime;
> -
>  				DECLARE_WAITQUEUE(wait, current);
>  
>  				add_wait_queue_exclusive(&audit_backlog_wait,
> @@ -1864,10 +1885,14 @@ struct audit_buffer *audit_log_start(struct audit_context *ctx, gfp_t gfp_mask,
>  		return NULL;
>  	}
>  
> -	audit_get_stamp(ab->ctx, &t, &serial);
>  	/* cancel dummy context to enable supporting records */
>  	if (ctx)
>  		ctx->dummy = 0;
> +	/* force a kauditd wakeup if the normal syscall exit isn't doing it */
> +	if (!ctx || !ctx->in_syscall)
> +		ab->kauditd_wake = 1;
> +
> +	audit_get_stamp(ab->ctx, &t, &serial);
>  	audit_log_format(ab, "audit(%llu.%03lu:%u): ",
>  			 (unsigned long long)t.tv_sec, t.tv_nsec/1000000, serial);
>  
> @@ -2365,8 +2390,8 @@ int audit_signal_info(int sig, struct task_struct *t)
>   *
>   * We can not do a netlink send inside an irq context because it blocks (last
>   * arg, flags, is not set to MSG_DONTWAIT), so the audit buffer is placed on a
> - * queue and a kthread is scheduled to remove them from the queue outside the
> - * irq context.  May be called in any context.
> + * queue where a kthread processes the buffer from outside the irq context.
> + * May be called in any context.
>   */
>  void audit_log_end(struct audit_buffer *ab)
>  {
> @@ -2385,9 +2410,12 @@ void audit_log_end(struct audit_buffer *ab)
>  		nlh = nlmsg_hdr(skb);
>  		nlh->nlmsg_len = skb->len - NLMSG_HDRLEN;
>  
> -		/* queue the netlink packet and poke the kauditd thread */
> +		/* queue the netlink packet */
>  		skb_queue_tail(&audit_queue, skb);
> -		wake_up_interruptible(&kauditd_wait);
> +
> +		/* only wakeup kauditd if we can't do it later */
> +		if (ab->kauditd_wake)
> +			kauditd_wakeup();
>  	} else
>  		audit_log_lost("rate limit exceeded");
>  
> diff --git a/kernel/audit.h b/kernel/audit.h
> index 3b9c0945225a1..b5f677017e975 100644
> --- a/kernel/audit.h
> +++ b/kernel/audit.h
> @@ -213,6 +213,8 @@ static inline int audit_hash_ino(u32 ino)
>  /* Indicates that audit should log the full pathname. */
>  #define AUDIT_NAME_FULL -1
>  
> +extern void kauditd_wakeup(void);
> +
>  extern int audit_match_class(int class, unsigned syscall);
>  extern int audit_comparator(const u32 left, const u32 op, const u32 right);
>  extern int audit_uid_comparator(kuid_t left, u32 op, kuid_t right);
> diff --git a/kernel/auditsc.c b/kernel/auditsc.c
> index 47fb48f42c934..bad776497438f 100644
> --- a/kernel/auditsc.c
> +++ b/kernel/auditsc.c
> @@ -1600,6 +1600,8 @@ static void audit_log_exit(void)
>  		audit_log_end(ab);
>  	if (call_panic)
>  		audit_panic("error converting sid to string");
> +
> +	kauditd_wakeup();
>  }
>  
>  /**
> 
> --
> Linux-audit mailing list
> Linux-audit at redhat.com
> https://listman.redhat.com/mailman/listinfo/linux-audit

- RGB

--
Richard Guy Briggs <rgb at redhat.com>
Sr. S/W Engineer, Kernel Security, Base Operating Systems
Remote, Ottawa, Red Hat Canada
IRC: rgb, SunRaycer
Voice: +1.647.777.2635, Internal: (81) 32635




More information about the Linux-audit mailing list