Auditing nftables changes

Thu Mar 9 16:33:12 UTC 2023

I think I need to clarify where I'm confused ;-)

With iptables you could write a rule that would only catch system
calls that were for iptables changes. That is, you didn't need to
capture *all* setsockopt calls (not that there would be lots of
*those*) but rather you could add the a2=64 to only get the
op=IPT_SO_SET_REPLACE ones.

With nftables, however, the control interface is netlink, and a
netlink request is a message sent to a socket. The message is a struct
passed by pointer, and audit rules can only compare the syscall
argument values themselves, so there is no way to write a rule as
narrow as the one for iptables.

That's the first thing I want to confirm: is my understanding above
correct? I'm confused because your answer implies that it is, but you
didn't explicitly confirm my interpretation of how it works.

You talk about having an exclude filter on NETFILTER_CFG (or rather
excluding everything except NETFILTER_CFG??), but my understanding is
that this kind of selection can only be done after the fact, using
ausearch or correlation code written against the auparse library. In
that case you are still capturing a haystack and searching it for the
needle afterwards. Actually, that's a bad analogy, because ausearch
easily finds the events that contain type=NETFILTER_CFG records and
gives you the PROCTITLE, SOCKADDR, and SYSCALL (sendmsg) records
associated with them, at which point you can look at the auid and
decide what to do.
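
To make the "after the fact" step concrete, here is a minimal sketch
of the kind of post-filtering I mean. The function name nfcfg_by_auid
is made up for illustration, and it assumes raw (uninterpreted) audit
records on stdin, where numeric auids are shown and an unset auid (-1)
appears as 4294967295. It is not a replacement for proper auparse
code.

```shell
#!/bin/sh
# nfcfg_by_auid (hypothetical helper): read raw audit records on stdin
# and print "<event-id> auid=<auid> op=<op,...>" for every event that
# contains a NETFILTER_CFG record, skipping events whose auid is in
# the space-separated exclude list passed as $1.
nfcfg_by_auid() {
    awk -v skip="$1" '
    BEGIN { n = split(skip, s, " "); for (i = 1; i <= n; i++) excl[s[i]] = 1 }
    {
        # All records of one event share the text inside msg=audit(...).
        if (!match($0, /msg=audit\([^)]*\)/)) next
        id = substr($0, RSTART + 10, RLENGTH - 11)
        if (!(id in seen)) { seen[id] = 1; order[++cnt] = id }
        for (i = 2; i <= NF; i++) {
            # The login uid lives on the SYSCALL record of the event.
            if ($1 == "type=SYSCALL" && $i ~ /^auid=/)
                auid[id] = substr($i, 6)
            # Collect every netfilter operation reported for the event.
            if ($1 == "type=NETFILTER_CFG" && $i ~ /^op=/)
                ops[id] = ops[id] (ops[id] ? "," : "") substr($i, 4)
        }
    }
    END {
        for (k = 1; k <= cnt; k++) {
            id = order[k]
            if ((id in ops) && !(auid[id] in excl))
                print id, "auid=" auid[id], "op=" ops[id]
        }
    }'
}
```

Something like

    ausearch --raw -m NETFILTER_CFG | nfcfg_by_auid "4294967295 ${serviceuser_uid}"

would then list only the changes made by other login users, but only
if the SYSCALL record was captured alongside the NETFILTER_CFG record
in the first place.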

But this is very different from what was possible with iptables where
the rule itself can filter just the iptables-related setsockopt
syscalls.

It just seemed surprising that there is a non-trivial loss of audit
functionality and yet I could not find any obvious discussion of it.
By obvious discussion I mean one as explicit as what I'm trying to say
here.

The other thing I'm trying to understand is how heavy the audit load
would be from a rule that captures *all* sendmsg calls (well, all
except where auid=-1 or auid=${serviceuser_uid}). I don't have a good
enough understanding of systems programming to know where and how
often sendmsg is called. Of course I know this is highly dependent on
workload, but my knowledge is limited enough that I can convince
myself both that the audit load would be non-trivial but still
manageable in most cases, and that no sane sysadmin would consider
running such an audit rule. With file I/O it's easy to see that file
opens are worth auditing but that auditing file reads and writes would
be insane. It's not so clear to me for sockets.
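
One way I could imagine answering this empirically is to run the broad
rule for a short window on a test box and count what it catches. The
following is only a sketch: the helper name syscall_load_by_comm and
the key name used below are made up, and it assumes raw audit records
on stdin.

```shell
#!/bin/sh
# syscall_load_by_comm (hypothetical helper): read raw audit records on
# stdin and print "<count> <comm>" for the SYSCALL records, busiest
# command first, to see which programs would dominate the audit log.
syscall_load_by_comm() {
    awk '
    $1 == "type=SYSCALL" {
        for (i = 2; i <= NF; i++)
            if ($i ~ /^comm=/) {
                c = substr($i, 6)
                gsub(/"/, "", c)      # raw records quote the comm value
                n[c]++
            }
    }
    END { for (c in n) print n[c], c }' | sort -k1,1nr -k2,2
}
```

For example, after temporarily loading

    auditctl -a always,exit -F arch=b64 -S sendmsg -S sendto -F auid!=-1 -F auid!=${serviceuser_uid} -k sendmsgProbe

and letting a typical workload run for a while, the output of

    ausearch --raw -k sendmsgProbe | syscall_load_by_comm

should give a rough idea of the volume and of who is producing it.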

Cheers...
Bruce

On Wed, Mar 8, 2023 at 8:34 PM Paul Moore <paul at paul-moore.com> wrote:
>
> On Wed, Mar 8, 2023 at 7:13 PM Bruce Elrick <bruce.elrick at canonical.com> wrote:
> > Hello all,
> >
> > I'm not sure if this list is appropriate for questions so please let
> > me know and otherwise ignore if this message is not appropriate.
> >
> > I'm trying to help someone who is finally migrating from iptables to
> > nftables on the back-end and needs to therefore migrate their audit
> > capability.
> >
> > Currently they have a single simple audit rule to detect an
> > iptables change made by any audit user apart from their service
> > user, using a rule like the accepted answer given in this[0]
> > StackExchange question, although with added filters on the auid (I
> > have to admit I don't know the origin of auid=-1 events):
> >
> >     auditctl -a exit,always -F arch=b64 -F a2=64 -F auid!=-1 \
> >         -F auid!=${serviceuser_uid} -S setsockopt -k iptablesChange
> >
> > They are migrating from Ubuntu bionic to jammy and still using the
> > iptables front-end, but since the default back-end changes from
> > iptables to nftables they need to change their audit rules.
> >
> > They did strace testing and noted the syscall changing from
> >
> >     setsockopt(4, SOL_IP, IPT_SO_SET_REPLACE,
> > "filter\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"...,
> > 80952) = 0
> >
> > to
> >
> >     sendto(3, [{nlmsg_len=20,
> > nlmsg_type=NFNL_SUBSYS_NFTABLES<<8|NFT_MSG_GETGEN,
> > nlmsg_flags=NLM_F_REQUEST, nlmsg_seq=0, nlmsg_pid=0},
> > {nfgen_family=AF_UNSPEC, version=NFNETLINK_V0, res_id=htons(0)}], 20,
> > 0, {sa_family=AF_NETLINK, nl_pid=0, nl_groups=00000000}, 12) = 20
> >
> > between the two versions.
> >
> > In my own testing, I decided to approach from the audit tools
> > perspective so I created a broad rule to capture all system calls
> > related to a test user:
> >
> >     auditctl -a always,exit -S all -F auid=1001 # 1001 is uid of testuser
> >
> > Then I tried various operations using my testuser such as
> > iptables-restore of either a default-accept rule set with no rules or
> > with one or two simple drop rules. I also tested adding just a single
> > iptables rule. I then used ausearch to discover what the audit system
> > captured:
> >
> >     # ausearch -i -m NETFILTER_CFG
> >     ...
> >     ----
> >     type=PROCTITLE msg=audit(03/07/2023 17:18:55.152:143044) :
> > proctitle=iptables-restore
> >     type=SYSCALL msg=audit(03/07/2023 17:18:55.152:143044) :
> > arch=x86_64 syscall=sendmsg success=yes exit=764 a0=0x3
> > a1=0x7ffdb0e98db0 a2=0x0 a3=0x7ffdb0e98d9c items=0 ppid=5673 pid=5676
> > auid=testuser uid=root gid=root euid=root suid=root fsuid=root
> > egid=root sgid=root fsgid=root tty=pts2 ses=108 comm=iptables-restor
> > exe=/usr/sbin/xtables-nft-multi subj=unconfined key=(null)
> >     type=NETFILTER_CFG msg=audit(03/07/2023 17:18:55.152:143044) :
> > table=filter:30 family=ipv4 entries=12 op=nft_unregister_table
> > pid=5676 subj=unconfined comm=iptables-restor
> >     type=NETFILTER_CFG msg=audit(03/07/2023 17:18:55.152:143044) :
> > table=filter:30 family=ipv4 entries=7 op=nft_register_chain pid=5676
> > subj=unconfined comm=iptables-restor
> >     ----
> >     type=PROCTITLE msg=audit(03/07/2023 17:23:04.390:144459) :
> > proctitle=sudo /usr/sbin/iptables -A OUTPUT -d 10.100.249.64 -j DROP
> >     type=SOCKADDR msg=audit(03/07/2023 17:23:04.390:144459) : saddr={
> > saddr_fam=netlink nlnk-fam=16 nlnk-pid=0 }
> >     type=SYSCALL msg=audit(03/07/2023 17:23:04.390:144459) :
> > arch=x86_64 syscall=sendmsg success=yes exit=304 a0=0x3
> > a1=0x7ffc80659110 a2=0x0 a3=0x7ffc806590fc items=0 ppid=5703 pid=5704
> > auid=testuser uid=root gid=root euid=root suid=root fsuid=root
> > egid=root sgid=root fsgid=root tty=pts2 ses=108 comm=iptables
> > exe=/usr/sbin/xtables-nft-multi subj=unconfined key=(null)
> >     type=NETFILTER_CFG msg=audit(03/07/2023 17:23:04.390:144459) :
> > table=filter:31 family=ipv4 entries=1 op=nft_register_rule pid=5704
> > subj=unconfined comm=iptables
> >
> > The event sequences seem to make sense, with the SOCKADDR record
> > showing the netlink family, which agrees with the strace output.
> >
> > With the change in the back-end to nftables, I can see in either case
> > that the setsockopt system call with a nice, crisp, single argument
> > (a2=64/IPT_SO_SET_REPLACE) is replaced by either a sendto or a
> > sendmsg system call that takes a pointer to a message structure. I
> > read that audit rules cannot filter using data inside struct
> > arguments.
> >
> > My naive interpretation of this is that I'd need to have a rule that
> > captures all sendmsg syscalls (with auid!=-1 and
> > auid!=${serviceuser_uid}), but I don't know enough about socket
> > syscall usage to know whether this is too much. I see that write(2)
> > to a socket is the same as send(2) without the flags, so I might
> > assume that most socket syscalls that send data use write(2) and not
> > send/sendto/sendmsg(2), but I worry this would still be too much
> > audit data.
> >
> > Anyone care to comment or point me in the correct direction?
>
> The problem I think you're going to have, and I believe you've already
> suspected it, is that auditing socket writes is going to result in a
> firehose of records.  However, unless you have an exclude filter for
> NETFILTER_CFG records I believe they will be generated without an
> explicit filter rule triggering their generation.
>
> Or am I misunderstanding your question?
>
> --
> paul-moore.com