[Crash-utility] [PATCH] Add the proccgroup extension
Nikolay Borisov
n.borisov.lkml at gmail.com
Mon Apr 11 14:22:27 UTC 2016
On 04/11/2016 05:06 PM, Dave Anderson wrote:
>
>
> ----- Original Message -----
>> Initial version of a crash module which can be used to show which cgroups
>> is the currently active process member of.
>> ---
>> Hello this is a simple crash extension that I hacked up over the weekend, in my
>> case when I look at kernel crash dump I want to quickly understand which cgroup
>> is the current process member of. Currently it uses the process from the current
>> context but this might change in the future. Here is an example output:
>>
>> crash> show_cgroups
>> subsys: cpuset cgroup: c6666
>> subsys: cpu cgroup: c6666
>> subsys: cpuacct cgroup: c6666
>> subsys: io cgroup: c6666
>> subsys: memory cgroup: c6666
>> subsys: devices cgroup: c6666
>> subsys: freezer cgroup: c6666
>> subsys: perf_event cgroup: c6666
>> subsys: pids cgroup: c6666
>>
>> I have tested this on 4.6-rc2 with and without cgroup support enabled.
>>
>> I'm just sending this to get an initial idea whether I have used crash's
>> facilities correctly and canvas for future ideas. I'm aware there are already 2
>> cgroup modules but when I tried running either they complained of no cgroup
>> support or the command did nothing. In any case provided that the code is ok I
>> guess this can be used as a good example of how to traverse structures with crash
>>
>> TODO:
>> * Make the command understand either task_struct pointer or pid being passed.
>> * Add support for pre-3.15 kernels (the cgroup name struct changed to kernfs at that point)
>> * Whatever people think might be useful
>
>
> Hello Nikolay,
>
> I have a few comments and suggestions:
>
> +#include "defs.h"
> +
> +#define CGROUP_PATH_MAX
>
> What is the purpose of CGROUP_PATH_MAX?
I was thinking of using this define to limit the size of various
buffers, but I saw that for most cases BUFSIZE works. This will likely
be removed.
>
> +void proccgroup_init(void);
> +void proccgroup_fini(void);
> +
> ... [ cut ] ...
> +
> +void __attribute__((constructor))
> +echo_init(void)
> +{
> + register_extension(command_table);
> +}
> +
> +void __attribute__((destructor))
> +echo_fini(void) { }
> +
>
> You've got forward declarations of proccgroup_init() and proccgroup_fini(),
> but the two functions don't exist. The "echo_xxx()" functions in the
> echo.c sample module are meant to be replaced by names reflecting your
> module.
I will rename the echo* function, I guess I forgot doing that. Actually
are forward declarations for the init/destroy functions really needed? I
guess not.
>
> If I run this module on a 3.10 kernel, I get this:
>
> crash> extend proccgroup.so
> ./extensions/proccgroup.so: shared object loaded
> crash> show_cgroups
> Unsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionUnsupported kernel versionshow_cgroups: invalid kernel virtual address: 0 type: "cgroup_subsys_state->cgroup"
> Unsupported kernel version
> crash>
>
> because of this:
>
> + /* Handle the 2 cases of cgroup_name and the kernfs one */
> + if (MEMBER_EXISTS("cgroup", "kn")) {
> + get_kn_cgroup_name(cgroup, subsys_ptr);
> + } else if (MEMBER_EXISTS("cgroup", "name")) {
> + fprintf(fp, "Unsupported kernel version");
> + }
>
> You should replace the "fprintf(fp, "Unsupported kernel version")" error message
> above with an "error(FATAL, ...)" construct (and add a "\n" to the end), so that
> the command will fail immediately and return to the "crash> " prompt.
I've since added supported for pre-3.15 kernels. So this will likely
work, but I did see the output you showed. Clearly the failure case
wasn't handled properly.
>
> Alternatively, you can have your module purposely fail to load if any required
> structure members do not exist. Typically modules will have their xxxx_init()
> constructor function do some pre-checks of the kernel being analyzed, and if the
> command will not be able to run correctly, return before calling register_extension():
>
> void __attribute__((constructor))
> xxxx_init(void)
> {
> if (!MEMBER_EXISTS(...) || !MEMBER_EXISTS(...)) {
> fprintf(fp, "some error message\n");
> return;
> }
>
> register_extension(command_table);
> }
>
> I note that all of your readmem() calls use RETURN_ON_ERROR, such as these examples:
>
> +static void get_cgroup_name(ulong cgroup, char *buf, size_t buflen)
> +{
> +
> + ulong kernfs_node;
> + ulong cgroup_name_ptr;
> + ulong kernfs_parent;
> +
> + /* Get cgroup->kn */
> + readmem(cgroup + MEMBER_OFFSET("cgroup", "kn"), KVADDR, &kernfs_node, sizeof(void *),
> + "cgroup->kn", RETURN_ON_ERROR);
> +
> + readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "parent"), KVADDR, &kernfs_parent, sizeof(void *),
> + "kernfs_node->parent", RETURN_ON_ERROR);
> +
> + if (kernfs_parent == 0) {
> + sprintf(buf, "/");
> + return;
> + }
> +
> + /* Get kn->name */
> + readmem(kernfs_node + MEMBER_OFFSET("kernfs_node", "name"), KVADDR, &cgroup_name_ptr, sizeof(void *),
> + "kernfs_node->name", RETURN_ON_ERROR);
> +
> + read_string(cgroup_name_ptr, buf, buflen-1);
> +}
>
> Unless you can do something useful in the case of a readmem() error, it's best to
> use FAULT_ON_ERROR. You will get an error message automatically generated by readmem(),
> but similar to "error(FATAL, ...)" it will return immediately to the "crash> " prompt.
> In your module above, you would just get a stream of confusing error messages.
I thought return_on_error essentially works with fault_on_error
semantics. Clearly I've misunderstood that so will fix it.
>
> You might consider changing the name of the command to something that
> flows off the fingertips a bit easier:
>
> +static struct command_table_entry command_table[] = {
> + { "show_cgroups", show_proc_cgroups, help_proc_cgroups, 0},
> + { NULL },
> +};
>
> Maybe something like "cgrp", or "showcg", or anything short but to
> the point.
Yes, I'm not very happy with the current name so will likely change that.
Thanks for taking the time to review it.
>
> Thanks,
> Dave
More information about the Crash-utility
mailing list