[Crash-utility] [PATCH] Add the proccgroup extension
Dave Anderson
anderson at redhat.com
Mon Apr 11 14:06:45 UTC 2016
----- Original Message -----
> Initial version of a crash module which can be used to show which cgroups
> is the currently active process member of.
> ---
> Hello this is a simple crash extension that I hacked up over the weekend, in my
> case when I look at kernel crash dump I want to quickly understand which cgroup
> is the current process member of. Currently it uses the process from the current
> context but this might change in the future. Here is an example output:
>
> crash> show_cgroups
> subsys: cpuset cgroup: c6666
> subsys: cpu cgroup: c6666
> subsys: cpuacct cgroup: c6666
> subsys: io cgroup: c6666
> subsys: memory cgroup: c6666
> subsys: devices cgroup: c6666
> subsys: freezer cgroup: c6666
> subsys: perf_event cgroup: c6666
> subsys: pids cgroup: c6666
>
> I have tested this on 4.6-rc2 with and without cgroup support enabled.
>
> I'm just sending this to get an initial idea whether I have used crash's
> facilities correctly and canvas for future ideas. I'm aware there are already 2
> cgroup modules but when I tried running either they complained of no cgroup
> support or the command did nothing. In any case provided that the code is ok I
> guess this can be used as a good example of how to traverse structures with crash
>
> TODO:
> * Make the command understand either task_struct pointer or pid being passed.
> * Add support for pre-3.15 kernels (the cgroup name struct changed to kernfs at that point)
> * Whatever people think might be useful
Hello Nikolay,
I have a few comments and suggestions:
+#include "defs.h"
+
+#define CGROUP_PATH_MAX
What is the purpose of CGROUP_PATH_MAX?
+void proccgroup_init(void);
+void proccgroup_fini(void);
+
... [ cut ] ...
+
+void __attribute__((constructor))
+echo_init(void)
+{
+ register_extension(command_table);
+}
+
+void __attribute__((destructor))
+echo_fini(void) { }
+
You've got forward declarations of proccgroup_init() and proccgroup_fini(),
but the two functions don't exist. The "echo_xxx()" functions in the
echo.c sample module are meant to be replaced by names reflecting your
module.
If I run this module on a 3.10 kernel, I get this:
crash> extend proccgroup.so
./extensions/proccgroup.so: shared object loaded
crash> show_cgroups
Unsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionshow_cgroups: invalid kernel virtual address: 0 type: "cgroup_subsys_state->cgroup"
Unsupported kernel version
crash>
because of this:
+ /* Handle the 2 cases of cgroup_name and the kernfs one */
+ if (MEMBER_EXISTS("cgroup", "kn")) {
+ get_kn_cgroup_name(cgroup, subsys_ptr);
+ } else if (MEMBER_EXISTS("cgroup", "name")) {
+ fprintf(fp, "Unsupported kernel version");
+ }
You should replace the "fprintf(fp, "Unsupported kernel version")" error message
above with an "error(FATAL, ...)" construct (and add a "\n" to the end), so that
the command will fail immediately and return to the "crash> " prompt.
Alternatively, you can have your module purposely fail to load if any required
structure members do not exist. Typically modules will have their xxxx_init()
constructor function do some pre-checks of the kernel being analyzed, and if the
command will not be able to run correctly, return before calling register_extension():
void __attribute__((constructor))
xxxx_init(void)
{
if (!MEMBER_EXISTS(...) || !MEMBER_EXISTS(...)) {
fprintf(fp, "some error message\n");
return;
}
register_extension(command_table);
}
I note that all of your readmem() calls use RETURN_ON_ERROR, such as these examples:
+static void get_cgroup_name(ulong cgroup, char *buf, size_t buflen)
+{
+
+ ulong kernfs_node;
+ ulong cgroup_name_ptr;
+ ulong kernfs_parent;
+
+ /* Get cgroup->kn */
+ readmem(cgroup + MEMBER_OFFSET("cgroup", "kn"), KVADDR, &kernfs_node, sizeof(void *),
+ "cgroup->kn", RETURN_ON_ERROR);
+
+ readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "parent"), KVADDR, &kernfs_parent, sizeof(void *),
+ "kernfs_node->parent", RETURN_ON_ERROR);
+
+ if (kernfs_parent == 0) {
+ sprintf(buf, "/");
+ return;
+ }
+
+ /* Get kn->name */
+ readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "name"), KVADDR, &cgroup_name_ptr, sizeof(void *),
+ "kernfs_node->name", RETURN_ON_ERROR);
+
+ read_string(cgroup_name_ptr, buf, buflen-1);
+}
Unless you can do something useful in the case of a readmem() error, it's best to
use FAULT_ON_ERROR. You will get an error message automatically generated by readmem(),
but similar to "error(FATAL, ...)" it will return immediately to the "crash> " prompt.
In your module above, you would just get a stream of confusing error messages.
You might consider changing the name of the command to something that
flows off the fingertips a bit easier:
+static struct command_table_entry command_table[] = {
+ { "show_cgroups", show_proc_cgroups, help_proc_cgroups, 0},
+ { NULL },
+};
Maybe something like "cgrp", or "showcg", or anything short but to
the point.
Thanks,
Dave
More information about the Crash-utility
mailing list