[libvirt] Re: [discuss] The new cgroup patches for libvirt

Dhaval Giani dhaval at linux.vnet.ibm.com
Fri Oct 3 18:40:41 UTC 2008


On Fri, Oct 03, 2008 at 07:13:58PM +0100, Daniel P. Berrange wrote:
> On Fri, Oct 03, 2008 at 09:31:52PM +0530, Balbir Singh wrote:
> > I understand that in the past there has been a perception that libcgroups might
> > not yet be ready, because we did not have ABI stability built into the library
> > and the header file had old comments about things changing. I would urge the
> > group to look at the current implementation of libcgroups (look at v0.32) and
> > help us
> > 
> > 1. Fix any issues you see or point them to us
> > 2. Add new API or request for new API that can help us integrate better with libvirt
> 
> To expand on what I said in my other mail about providing value-add over 
> the representation exposed by the kernel, here's some thoughts on the API
> exposed.
> 
> Consider the following high level use case of libvirt
> 
>  - A set of groups, in a 3 level hierarchy <APPNAME>/<DRIVER>/<DOMAIN>
>  - Control the ACL for block/char devices
>  - Control memory limits
> 
> This translates into an underlying implementation in which I need to create 3
> levels of cgroups in the filesystem, use the memory and device controllers,
> attach my PIDs at the 3rd level, and set values for attributes exposed by
> the controllers. Notice I'm not actually setting any config params at the
> 1st & 2nd levels, but they do still need to exist to ensure namespace
> uniqueness amongst different applications using cgroups.
> 
> The current cgroups API provides APIs that directly map to individual
> actions wrt the kernel filesystem exposed. So as an application developer
> I have to explicitly create the 3 levels of hierarchy, tell it I want
> to use memory & device controllers, format config values into the syntax
> required for each attribute, and remember the attribute names.
> 
>      // Create the hierarchy <APPNAME>/<DRIVER>/<DOMAIN>
>      c1 = cgroup_new_cgroup("libvirt")
>      c2 = cgroup_new_cgroup_parent(c1, "lxc")
>      c3 = cgroup_new_cgroup_parent(c2, domain.name)
> 
>      // Setup the controllers I want to use
>      cgroup_add_controller(c3, "devices")
>      cgroup_add_controller(c3, "memory")
>    
>      // Add my domain's PID to the cgroup
>      cgroup_attach_task(c3, domain.pid)
> 
>      // Set the device ACL limits
>      cgroup_set_value_string(c3, "devices.deny", "a");
> 
>      char buf[1024];
>      sprintf(buf, "%c %d:%d", 'c', 1, 3);
>      cgroup_set_value_string(c3, "devices.allow", buf);
> 
>      // Set memory limit
>      cgroup_set_value_uint64(c3, "memory.limit_in_bytes", domain.memory * 1024);
> 
> This really isn't providing any semantically useful abstraction over 
> the direct filesystem manipulation. Just a bunch of wrappers for mkdir(),
> mount() and read()/write() calls. My application still has to know far
> too much information about the details of cgroups as exposed by the
> kernel. 
> 

Good point! Let's see how we can improve upon this issue faced by
applications.
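
As a rough illustration of the hierarchy part only, here is a minimal
sketch of what a path-style helper might do underneath. This is not
libcgroup code: the name sketch_make_group_path and its root parameter
are hypothetical, and the sketch assumes the cgroup filesystem is already
mounted at root. It only creates the nested directories; it does not
mount anything or attach PIDs.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical helper, not a libcgroup function.
 * Creates root/parts[0]/.../parts[n-1] one level at a time and leaves the
 * full path in out. Levels that already exist are accepted.
 * Returns 0 on success, -1 on error or if out is too small. */
static int sketch_make_group_path(const char *root, const char *const *parts,
                                  size_t n, char *out, size_t outlen)
{
    assert(root != NULL && out != NULL);

    int len = snprintf(out, outlen, "%s", root);
    if (len < 0 || (size_t)len >= outlen)
        return -1;

    for (size_t i = 0; i < n; i++) {
        size_t used = strlen(out);
        int added = snprintf(out + used, outlen - used, "/%s", parts[i]);
        if (added < 0 || (size_t)added >= outlen - used)
            return -1;                      /* path would be truncated */
        if (mkdir(out, 0755) != 0 && errno != EEXIST)
            return -1;                      /* real failure, e.g. EACCES */
    }
    return 0;
}
```

The caller would pass something like { "libvirt", "lxc", domain.name }
and never has to create the intermediate levels by hand.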

> I do not care that there is a concept of  'controllers' at all, I just
> want to set device ACLs and memory limits. I do not care what the attributes
> in the filesystem are called, again I just want to set device ACLs and memory
> limits.  I do not care what the data format for them must be for device/memory
> settings. Memory settings could be stored in base-2, base-10 or base-16 I 
> should not have to know this information.
> 
> With this style of API, the library provides no real value-add or compelling
> reason to use it.
> 
> What might a more useful API look like? At least from my point of view,
> I'd like to be able to say:
> 
>       // Tell it I want $PID placed in <APPNAME>/<DRIVER>/<DOMAIN>
>       char *path[] = { "libvirt", "lxc", domain.name};
>       cg = cgroup_new_path(path, domain.pid)
> 
>       // I want to deny all devices
>       cgroup_deny_all_devices(cg);
> 
>       // Allow /dev/null - either by node/major/minor
>       cgroup_allow_device_node(cg, 'c', 1, 3);
>    
>       // Or more conveniently just give it a node to copy info from
>       cgroup_allow_device_node(cg, "/dev/null")
> 
>       // Set memory in KB
>       cgroup_set_memory_limit_kb(cg, domain.memory)
> 
> Notice how with such a style of API, I don't need to know anything about
> the low level implementation details - I'm working entirely in terms of
> semantically meaningful concepts.
> 

OK. This is something Balbir and I have been discussing: how to push
libcgroup forward. I do have a patch which started looking at
controller-specific stuff, but now that we are quite clear on what would
be good, it's much clearer in what direction it should proceed (and that
I should throw away what I wrote, and look to design it in this fashion).
I am on vacation for the next two weeks, but I shall look at pushing this
forward very soon.
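
To show the kind of detail such semantic calls could hide, here is a
minimal sketch of the value formatting alone. The function names are
hypothetical and are not existing libcgroup API. It assumes the devices
controller takes entries of the form "<type> <major>:<minor> <access>"
and that memory.limit_in_bytes takes a decimal byte count. Writing the
resulting strings into the cgroup files is left out.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: build one devices.allow / devices.deny entry.
 * type is 'c' (char) or 'b' (block); access is a non-empty mix of r, w, m.
 * Returns 0 on success, -1 on bad input or if out is too small. */
static int sketch_format_device_rule(char *out, size_t outlen, char type,
                                     unsigned int major_no,
                                     unsigned int minor_no,
                                     const char *access)
{
    assert(out != NULL && access != NULL);
    if ((type != 'c' && type != 'b') || strlen(access) == 0)
        return -1;
    int n = snprintf(out, outlen, "%c %u:%u %s",
                     type, major_no, minor_no, access);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}

/* Hypothetical helper: turn a limit given in KB into the decimal byte
 * string, so the caller never deals with units or number formatting.
 * Returns -1 if kb * 1024 would overflow or out is too small. */
static int sketch_format_memory_limit_kb(char *out, size_t outlen, uint64_t kb)
{
    assert(out != NULL);
    if (kb > UINT64_MAX / 1024)
        return -1;
    int n = snprintf(out, outlen, "%" PRIu64, kb * 1024);
    return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With helpers like these behind it, a call such as
cgroup_allow_device_node(cg, 'c', 1, 3) from the proposal above would not
need the application to know the attribute name or its syntax.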

> Now, comes the hard bit - you have to figure out what semantic concepts
> you want to expose to applications. The example here would be suitable
> for libvirt, but not necessarily for other applications. Picking the
> right APIs is very much harder than just exposing the kernel 
> capabilities directly as libcgroup.h does now, but the trade off is 
> that the resulting API would be much more useful and interesting to 
> app developers.
> 

I hope we can utilize your experience here to help us with libcgroup as
well.

thanks,
-- 
regards,
Dhaval



