[Crash-utility] [PATCH] Add the proccgroup extension

Dave Anderson anderson at redhat.com
Mon Apr 11 14:06:45 UTC 2016



----- Original Message -----
> Initial version of a crash module which can be used to show which cgroups
> is the currently active process member of.
> ---
> Hello this is a simple crash extension that I hacked up over the weekend, in my
> case when I look at kernel crash dump I want to quickly understand which cgroup
> is the current process member of. Currently it uses the process from the current
> context but this might change in the future. Here is an example output:
> 
> crash> show_cgroups
> subsys: cpuset               cgroup: c6666
> subsys: cpu                  cgroup: c6666
> subsys: cpuacct              cgroup: c6666
> subsys: io                   cgroup: c6666
> subsys: memory               cgroup: c6666
> subsys: devices              cgroup: c6666
> subsys: freezer              cgroup: c6666
> subsys: perf_event           cgroup: c6666
> subsys: pids                 cgroup: c6666
> 
> I have tested this on 4.6-rc2 with and without cgroup support enabled.
> 
> I'm just sending this to get an initial idea whether I have used crash's
> facilities correctly and canvas for future ideas. I'm aware there are already 2
> cgroup modules but when I tried running either they complained of no cgroup 
> support or the command did nothing. In any case provided that the code is ok I
> guess this can be used as a good example of how to traverse structures with crash
> 
> TODO:
>  * Make the command understand either task_struct pointer or pid being passed.
>  * Add support for pre-3.15 kernels (the cgroup name struct changed to kernfs at that point)
>  * Whatever people think might be useful


Hello Nikolay,

I have a few comments and suggestions:

+#include "defs.h"
+
+#define CGROUP_PATH_MAX

What is the purpose of CGROUP_PATH_MAX?  

+void proccgroup_init(void);
+void proccgroup_fini(void);
+
... [ cut ] ...
+
+void __attribute__((constructor))
+echo_init(void)
+{
+        register_extension(command_table);
+}
+
+void __attribute__((destructor))
+echo_fini(void) { }
+

You've got forward declarations of proccgroup_init() and proccgroup_fini(), 
but the two functions don't exist.  The "echo_xxx()" functions in the 
echo.c sample module are meant to be replaced by names reflecting your
module. 

If I run this module on a 3.10 kernel, I get this:
  
crash> extend proccgroup.so
./extensions/proccgroup.so: shared object loaded
crash> show_cgroups
Unsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionshow_cgroups: invalid kernel virtual address: 0  type: "cgroup_subsys_state->cgroup"
  Unsupported kernel version
crash>

because of this:
  
+        /* Handle the 2 cases of cgroup_name and the kernfs one */
+        if (MEMBER_EXISTS("cgroup", "kn")) {
+            get_kn_cgroup_name(cgroup, subsys_ptr);
+        } else if (MEMBER_EXISTS("cgroup", "name")) {
+            fprintf(fp, "Unsupported kernel version");
+        }

You should replace the "fprintf(fp, "Unsupported kernel version")" error message
above with an "error(FATAL, ...)" construct (and add a "\n" to the end), so that
the command will fail immediately and return to the "crash> " prompt.

Alternatively, you can have your module purposely fail to load if any required
structure members do not exist.  Typically modules will have their xxxx_init() 
constructor function do some pre-checks of the kernel being analyzed, and if the
command will not be able to run correctly, return before calling register_extension():
  
  void __attribute__((constructor))
  xxxx_init(void)
  {
   	if (!MEMBER_EXISTS(...) || !MEMBER_EXISTS(...)) {
  		fprintf(fp, "some error message\n");
  		return;
  	}
  
	register_extension(command_table);
  }

I note that all of your readmem() calls use RETURN_ON_ERROR, such as these examples:

+static void get_cgroup_name(ulong cgroup, char *buf, size_t buflen) 
+{
+
+    ulong kernfs_node;
+    ulong cgroup_name_ptr;
+    ulong kernfs_parent;
+
+    /* Get cgroup->kn */
+    readmem(cgroup + MEMBER_OFFSET("cgroup", "kn"), KVADDR, &kernfs_node, sizeof(void *),
+            "cgroup->kn", RETURN_ON_ERROR);
+
+    readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "parent"), KVADDR, &kernfs_parent, sizeof(void *),
+            "kernfs_node->parent", RETURN_ON_ERROR);
+
+    if (kernfs_parent == 0) {
+        sprintf(buf, "/");
+        return;
+    }
+
+    /* Get kn->name */
+    readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "name"), KVADDR, &cgroup_name_ptr, sizeof(void *),
+            "kernfs_node->name", RETURN_ON_ERROR);
+
+    read_string(cgroup_name_ptr, buf, buflen-1);
+}

Unless you can do something useful in the case of a readmem() error, it's best to
use FAULT_ON_ERROR.  You will get an error message automatically generated by readmem(),
but similar to "error(FATAL, ...)" it will return immediately to the "crash> " prompt.
In your module above, you would just get a stream of confusing error messages.
  
You might consider changing the name of the command to something that
flows off the fingertips a bit easier:

+static struct command_table_entry command_table[] = {
+        { "show_cgroups", show_proc_cgroups, help_proc_cgroups, 0},
+        { NULL },
+};

Maybe something like "cgrp", or "showcg", or anything short but to
the point.
 
Thanks,
  Dave






More information about the Crash-utility mailing list