null pointer dereference regression in 5.7

Paul Moore paul at paul-moore.com
Thu Jul 23 01:01:11 UTC 2020


On Tue, Jul 21, 2020 at 10:01 PM Richard Guy Briggs <rgb at redhat.com> wrote:
> On 2020-07-21 18:45, Paul Moore wrote:
> > On Tue, Jul 21, 2020 at 6:30 PM Paul Moore <paul at paul-moore.com> wrote:
> > > Richard, you broke it, you bought it :)  Did you want to take a closer
> > > look at this?  If you can't let me know.  Based on a quick look, my
> > > gut feeling is that either context->pwd is never set properly or it is
> > > getting free'd prematurely; I'm highly suspicious of the latter but
> > > the former seems like it might be a reasonable place to start.
> >
> > Actually, yes, I'm pretty certain the problem is that context->pwd is
> > never set in this case.
>
> Does the ghak96 upstream patch in audit/next on 5.8-rc1 fix it?
>         d7481b24b816 ("audit: issue CWD record to accompany LSM_AUDIT_DATA_* records")
>
> The avc is generated by common_lsm_audit() which calls
> dump_common_audit_data() that now calls audit_getcwd() on the 5
> LSM_AUDIT_DATA_* types that deal with paths.

I would expect that it would resolve the problem being reported, which
is good, but I'm not sure it is a general solution to the problem.  I
suspect there is bigger problem of context->pwd not always having a
"safe" value when the task exits or the syscall returns to userspace.

> > Normally context->pwd would be set by a call to
> > audit_getname()/__audit_getname(), but if there audit context is a
> > dummy context, that is skipped and context->pwd is never set.
> > Normally that is fine, expect with Richard's patch if the kernel
> > explicitly calls audit_log_start() we mark the context as ... not a
> > dummy?  smart?  I'm not sure of the right term here ... which then
> > triggers all the usual logging one would expect.  In this particular
> > case, a SELinux AVC, the audit_log_start() happens *after* the
> > pathname has been resolved and the audit_getname() calls are made;
> > thus in this case context->pwd is not valid when the normal audit
> > logging takes place on exit and things explode in predictable fashion.
>
> The first two AVCs that were accompanied by syscalls had "items=0" but
> the one that blew up had "items=2" so it appears the paths were already
> present in the context, but missing the pwd.

Yes, the issue is with context->pwd, although I suppose other fields
could also be suspect.

> > Unfortunately, it is beginning to look like 1320a4052ea1 ("audit:
> > trigger accompanying records when no rules present") may be more
> > dangerous than initially thought.  I'm borderline tempted to just
> > revert this patch, but I'll leave this open for discussion ...
> > Richard, I think you need to go through the code and audit all of the
> > functions that store data in an audit context that are skipped when
> > there is a dummy context to see which fields are potentially unset,
> > and then look at all the end of task/syscall code to make sure the
> > necessary set/unset checks are in place.
>
> Auditing all the callers is not a small task, but I agree it may be
> necessary.

Do you have a rough idea as to how long it would take to chase down
all the code paths?  I'm asking not to rush you, but to figure out if
we should revert the patch now to resolve the problem and restore it
later once we are confident there are no additional issues lurking.

-- 
paul moore
www.paul-moore.com




More information about the Linux-audit mailing list