[Crash-utility] [RFC][PATCH]: crash aborts with cannot determine idle task
Chandru
chandru at in.ibm.com
Wed Jun 10 08:26:32 UTC 2009
Dave Anderson wrote:
> Sorry -- that's not what I meant...
>
> What I want to avoid is screwing around with the prstatus notes bookkeeping
> unless it is absolutely necessary, i.e., where there had been some cpus offlined
> prior to the crash. The original thread back in April 2008 mentioned something
> to the effect that your test system only had cpus 12 and 13 online at the time of
> the crash. When that is the case, is kt->cpus equal to 14? I.e., what
> does the "sys" command show for "CPUS:"?
>
>
I had the vmcore file from that test system and ran crash with -d1.
The cpu maps shown are ...
cpu_possible_map: 0 1 2 3 4 5 6 7 8 9 10 11 12 13
cpu_present_map: 8 9 10 11 12 13
cpu_online_map: 12 13
The 'sys' command shows CPUS as '14' (with the patch applied)
<snip>
crash> sys
KERNEL: ./vmlinux
DUMPFILE: ./vmcore
CPUS: 14
DATE: Tue Mar 25 14:43:39 2008
UPTIME: 08:01:57
LOAD AVERAGE: 12.73, 5.40, 3.72
TASKS: 262
I also had a vmcore collected on another system after offlining a couple of
cpus through the sysfs interface. The cpu maps on this machine with 'crash
-d1' show...
cpu_possible_map: 0 1 2 3
cpu_present_map: 0 1 2 3
cpu_online_map: 2 3
and 'sys' shows as
<snip>
crash> sys
KERNEL: ./vmlinux
DUMPFILE: ./vmcore
CPUS: 4
DATE: Sat Jun 6 15:00:24 2009
UPTIME: 15:56:30
> I ask because this is the way I'd prefer to go:
>
> void
> map_cpus_to_prstatus(void)
> {
>         void **nt_ptr;
>         int online, i, j, nrcpus;
>         size_t size;
>
>         if (!(online = get_cpus_online()) || (online == kt->cpus))
>                 return;
>
>         if (CRASHDEBUG(1))
>                 error(INFO,
>                     "cpus: %d online: %d NT_PRSTATUS notes: %d (remapping)\n",
>                     kt->cpus, online, nd->num_prstatus_notes);
>
>         size = NR_CPUS * sizeof(void *);
>
>         nt_ptr = (void **)GETBUF(size);
>         BCOPY(nd->nt_prstatus_percpu, nt_ptr, size);
>         BZERO(nd->nt_prstatus_percpu, size);
>
>         /*
>          *  Re-populate the array with the notes mapping to online cpus
>          */
>         nrcpus = (kt->kernel_NR_CPUS ? kt->kernel_NR_CPUS : NR_CPUS);
>
>         for (i = 0, j = 0; i < nrcpus; i++) {
>                 if (in_cpu_map(ONLINE, i))
>                         nd->nt_prstatus_percpu[i] = nt_ptr[j++];
>         }
>
>         FREEBUF(nt_ptr);
> }
>
> And since kt->cpus may not be finally initialized until later than
> kernel_init(), I moved the call to map_cpus_to_prstatus() to here
> in task_init():
>
>         if (ACTIVE()) {
>                 active_pid = REMOTE() ? pc->server_pid : pc->program_pid;
>                 set_context(NO_TASK, active_pid);
>                 tt->this_task = pid_to_task(active_pid);
>         }
>         else {
>                 if (KDUMP_DUMPFILE())
>                         map_cpus_to_prstatus();
>                 please_wait("determining panic task");
>                 set_context(get_panic_context(), NO_PID);
>                 please_wait_done();
>         }
>
> Can you test the map_cpus_to_prstatus() function above, along with the
> movement of the call to it from kernel_init() to task_init()?
>
>
>
Yes, I tested these changes and they work fine.
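For anyone following the thread, here is a small standalone sketch of what the
remapping does for the first vmcore above (14 possible cpus, only 12 and 13
online). It is not crash code: remap_notes(), MAX_CPUS and the plain arrays are
hypothetical stand-ins for nd->nt_prstatus_percpu and in_cpu_map(ONLINE, i),
and it assumes the notes are initially packed from index 0, one per online cpu.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_CPUS 14   /* assumption: the 14 possible cpus of the first vmcore */

/*
 * Hypothetical stand-in for the remapping step. On entry, notes[] holds the
 * NT_PRSTATUS notes packed from index 0. On return, notes[i] is non-NULL
 * only if cpu i is flagged in online[], and the notes keep their order.
 */
static void remap_notes(const void *notes[], const int online[], int ncpus)
{
        const void *packed[MAX_CPUS];
        int i, j;

        assert(ncpus <= MAX_CPUS);

        /* Save the packed notes, then clear the per-cpu array. */
        for (i = 0; i < ncpus; i++) {
                packed[i] = notes[i];
                notes[i] = NULL;
        }

        /* Hand the saved notes out to the online cpus, in cpu order. */
        for (i = 0, j = 0; i < ncpus; i++) {
                if (online[i])
                        notes[i] = packed[j++];
        }
}
```

With online[12] and online[13] set and two notes in slots 0 and 1, the notes
end up in slots 12 and 13 and every other slot is NULL.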
Thanks,
Chandru