[libvirt] [PATCH v2 03/17] vircgroup: introduce virCgroupV2DevicesAttachProg

Pavel Hrdina phrdina at redhat.com
Tue Jan 15 12:07:55 UTC 2019


On Tue, Jan 15, 2019 at 10:14:33AM +0000, Daniel P. Berrangé wrote:
> On Tue, Jan 15, 2019 at 09:58:47AM +0100, Pavel Hrdina wrote:
> > On Mon, Jan 14, 2019 at 04:43:34PM +0000, Daniel P. Berrangé wrote:
> > > On Mon, Jan 14, 2019 at 04:47:37PM +0100, Pavel Hrdina wrote:
> > > > This function loads the BPF prog with prepared map into kernel and
> > > > attaches it into guest cgroup.  It can be also used to replace existing
> > > > program in the cgroup if we need to resize BPF map to store more rules
> > > > for devices. The old program will be closed and removed from kernel.
> > > > 
> > > > There are two possible ways how to create BPF program:
> > > > 
> > > >     - One way is to write simple C-like code which can by compiled into
> > > >       BPF object file which can be loaded into kernel using elfutils.
> > > > 
> > > >     - The second way is to define macros which looks like assembler
> > > >       instructions and can be used directly to create BPF program that
> > > >       can be directly loaded into kernel.
> > > > 
> > > > Since the program is not too complex we can use the second option.
> > > 
> > > I'm really not liking this to be honest. Even for this "simple" program,
> > > I struggle to understand what the code below is doing. It is basically
> > > assembly language, which is inevitably hard to follow even for simple
> > > things.
> > > 
> > > I'd like to see us use the BPF C compiler to build the loaded
> > > program unless there's a compelling reason why it won't work
> > 
> > I personally don't like it as well, but the issue is that for the
> > compilation we would have to use clang/llvm, AFAIK there is no support
> > to compile BPF in GCC.  Another issue is the loading part, you need to
> > read the object file, extract correct sections which is not trivial and
> > in order to have different size of maps we would have to modify the
> > loaded data from the compiled object file.
> 
> I don't see a problem with using clang for BPF.  Reading the DPDK lists,
> I see you can do something like this:
> 
>   $ clang -I iproute2/include/ -O2 -emit-llvm -c tap_bpf_program.c -o - \
>         | llc -march=bpf -filetype=obj -o tap_bpf_program.o
>   $ objdump -j l3_l4 -s tap_bpf_program.o > tap_bpf_program.bin
> 
> where the tap_bpf_program.bin should be the raw instructions to be
> loaded.

There is no technical problem with using clang for BPF, the only issue
that I have is that we will require clang to compile libvirt from
source code.

> Supposedly they also have a script that turns the .bin file into a
> C byte array which they then compile into their binary, so they
> can just load the uint8[] directly, instead of from a separate file

This is not an issue, there is a sample code in kernel that uses
elfutils to load the bpf.obj file into C byte array, just different
method how the get the list of instructions.

> The matter of dynamically sized device maps is more tricky. Assuming
> the device map size is basically a list of (major,minor) numbers,
> that's 2 bytes per entry. We could easily do a fixed size 50 elements
> giving 100 bytes per guest if that makes it easier. I guess the hard
> thing is actually getting the dynamic data into the code.

50 elements per guest will not be good enough.  Yes, in most cases it's
perfectly OK, but there are some users having hundreds of disks inside
a guest and we would break their usage and since we don't have any
written limitation of how many disk/devices can be inside a guest we
have to be able to store unlimited number of devices into the map until
we run out of memory.

> > I can try to prepare alternative patches for that but it might me even
> > more complex to understand than short assembly program.
> 
> Maybe there's a compromise/hybrid that would work simpler ? eg is
> it possible to structure things so all the checking code is
> statically compiled, and then we dynamically generate the instructions
> to define the list of devices and append that to the main static
> code. ?

This is not an issue, the list of devices is stored into the BPF map
from user-space (libvirt) and the BPF map is used by the BPF program
in the kernel.  The limits are that you cannot reallocate the BPF map
(this is solved by creating new program with bigger map) and you cannot
change the BPF map FD while the program is loaded into kernel.

Pavel
 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/libvir-list/attachments/20190115/3a53e6bd/attachment-0001.sig>


More information about the libvir-list mailing list