[Crash-utility] crash aborts with cannot determine idle task

Dave Anderson anderson at redhat.com
Sat Apr 5 19:06:43 UTC 2008


Chandru wrote:
> Dave Anderson wrote:
>> As I suggested before, you're going to have to determine why
>> the tasklist[i] is bogus.  The first things to determine are:
>>
>> (1) what "nr_cpus" was calculated to be, and
>> (2) whether the SMP and PER_CPU_OFF flags are set in kt->flags.
>>
>> If those variables/settings make sense, then presumably the
>> problem is in the determination of the per-cpu offset values.
>> That's done in a machine-specific way, so I can't help you
>> without knowing what architecture you're dealing with, not
>> to mention what kernel version, or whether it's configured
>> CONFIG_SMP or not, and whether you can run crash on the live
>> system that generated the dumpfile.
>>
>> Dave
>>
> The machine is a ppc64 box with a RHEL5.1 based SMP kernel.  nr_cpus
> is equal to '2' in get_idle_threads(), but the system actually has 14
> cpus and 12 of them were offline when the vmcore was collected.
> kt->__per_cpu_offset[12] and [13] hold per-cpu offset values, whereas
> kt->__per_cpu_offset[0] through [11] are 0.  I changed
> kt->__per_cpu_offset[i] in ppc64_paca_init() to
> kt->__per_cpu_offset[cpus] and that let crash start up.  But the
> backtrace command 'bt' exited with a segmentation fault.  Looking
> further, the code in get_netdump_regs_ppc64()
>                if (nd->num_prstatus_notes > 1)
>                {
>                        note = (Elf64_Nhdr *)
>                                nd->nt_prstatus_percpu[bt->tc->processor];
>                }
> had bt->tc->processor as '12'. I changed it to '0' and that gave the 
> backtrace.
>
> Regards,
> Chandru
>

OK, it sounds like the kt->cpus value should have been set to 14 by 
ppc64_paca_init().

And it appears that when kdump created the vmcore, it only installed two
NT_PRSTATUS sections.  That being the case, the consumer of the ELF header
has to figure out which cpu each NT_PRSTATUS section belongs to.  I'm not
sure how that can be determined.
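
For illustration only, here is a minimal sketch of one possible mapping.
It ASSUMES that kdump writes one NT_PRSTATUS note per online cpu, in
ascending cpu order; nothing in this thread confirms that.  The helper
name prstatus_index_for_cpu() and the "online" table are hypothetical
and are not part of crash.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of crash.
 *
 * ASSUMPTION: the dump contains one NT_PRSTATUS note per online cpu,
 * stored in ascending cpu order.  Under that assumption the note for
 * a given cpu is found by counting the online cpus that precede it.
 *
 * Returns the note index for `cpu`, or -1 if the cpu is out of range,
 * was offline, or there are fewer notes than online cpus up to it.
 */
static int prstatus_index_for_cpu(const int *online, size_t ncpus,
                                  size_t num_notes, size_t cpu)
{
    size_t i, idx = 0;

    assert(online != NULL);

    if (cpu >= ncpus || !online[cpu])
        return -1;

    /* Count the online cpus with a lower number than `cpu`. */
    for (i = 0; i < cpu; i++)
        if (online[i])
            idx++;

    /* Never hand back an index past the notes actually present. */
    return idx < num_notes ? (int)idx : -1;
}
```

With the layout reported above (14 cpus, only 12 and 13 online, two
notes), this maps cpu 12 to note 0 and cpu 13 to note 1, and rejects
cpus 0 through 11 instead of indexing past the end of the note array.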

And it seems that there may be other oddities that may crop up when running
other commands.  Or maybe not...

I'm going to ultimately defer this back to IBM for resolution.
haren at us.ibm.com wrote the ppc64.c file, and he is on this mailing
list.  But would it be possible for you to make the vmlinux/vmcore pair
available to me?  (If so, you can send me the particulars off-line)

Thanks,
  Dave






