[Crash-utility] [RFC][PATCH v3] Use register value in elf note NT_PRSTATUS to do backtrace

Wang Chao wang.chao at cn.fujitsu.com
Fri May 6 09:16:06 UTC 2011


Hi Dave and all,

Thanks a lot for your suggestion.

According to the feedbacks, I made the following main changes since v2:
1) Add a new option --no_elf_notes. If specified, it'll skip processing
   elf notes in kdump-compressed dump files.
2) Dump notes related information including total number of notes, their
   buffers' pointer value and finally offset in the dump file.
3) Move nt_prstatus_percpu and nt_prstatus_percpu to diskdump_data struct.
   However, bt command may need sp/ip from the notes later, so I guess
   the buffer should not be freed after diskdump_read_header().

So the v3 patch is attached and we're looking forward to your opinions.

Thanks,
Wang Chao

Sent on 2011-4-26 23:15, Dave Anderson wrote:
> 
> 
> ----- Original Message -----
>> Hi Dave and list,
>>
>> I guess you still remember that a few weeks ago Wen has sent a patch which intends
>> to use register values in NT_PRSTATUS notes for backtracing if no panic task was
>> found. Thanks for your review and the suggestions are very useful.
>>
>> However, Wen was busy with other work for these days, so I'll continue with this
>> work and now the 2rd version patch is attached.
>>
>> v2: Address review feedbacks.
>> 1) Set up a flag in pc->flags2 and it's determined during the diskdump file init procedure.
>> 2) Seperate code according to the that flag.
>> 3) Also, we've done some tests on dumpfile of xen kernel and the trouble described
>> in the previous mail was gone.
>>
>> So we're looking forward to your feedback and if you still have any problems with
>> it, please tell us.
>>
>> Thanks,
>> Wang Chao
> 
> Hello Wang,
> 
> I haven't had a chance to run a complete test of this latest patch,
> but I do have several suggestions.
> 
> diskdump.c:
> 
> Put the nt_prstatus_percpu array pointer and the num_prstatus_notes
> into the diskdump_data structure -- like you did with the notes_buf.
> Make nt_prstatus_percpu a pointer that gets allocated and initialized
> only if necessary -- because if it's not used, it's a waste of space.
> 
> In read_dump_header(), when dd->notes_buf is malloc()'d, malloc() the
> dd->nt_prstatus_percpu array. 
> 
> And at the bottom of read_dump_header(), free both the dd->notes_buf
> and the dd->nt_prstatus_percpu array if they exist, under the "err:"
> label:
> 
> err:
>    ...
>    if (dd->notes_buf)
> 	free(dd->notes_buf);
>    if (dd->nt_prstatus_percpu)
>         free(dd->nt_prstatus_percpu;
> 
> In order to be able to debug/understand why things may fail, especially
> due to the exclusion of prstatus notes of offline cpus, it would be 
> very helpful to be able to see the dd->nt_prstatus pointers for 
> each cpu.  Similar to what is done with the vmcoreinfo data, can 
> you add a section to __diskdump_dump_header():
> 
>                 if (dh->header_version >= 3) {
>                         fprintf(fp, "   offset_vmcoreinfo: %lx\n",
>                                 (ulong)dd->sub_header_kdump->offset_vmcoreinfo);
>                         fprintf(fp, "     size_vmcoreinfo: %lu\n",
>                                 dd->sub_header_kdump->size_vmcoreinfo);
>                         if (dd->sub_header_kdump->offset_vmcoreinfo &&
>                                 dd->sub_header_kdump->size_vmcoreinfo) {
>                                 dump_vmcoreinfo(fp);
>                         }
>                 }
>                 if (dh->header_version >= 4) {
>                         fprintf(fp, "         offset_note: %lx\n",
>                                 (ulong)dd->sub_header_kdump->offset_note);
>                         fprintf(fp, "           size_note: %lu\n",
>                                 dd->sub_header_kdump->size_note);
> here ==================> 
>                 }
> 
> Call a function that dumps the file offset of each cpu's nt_prstatus data.  
> Then, if necessary, any cpu's register set can be read with "rd -f".
> 
> 
> netdump.c:
> 
> In get_netdump_regs_x86(), both of these settings are incorrect:
> 
>                 user_regs = get_regs_from_note((char *)note, &ip, &sp);
>      =====>     bt->flags |= BT_KDUMP_ELF_REGS;
>                 if (is_kernel_text(ip) &&
>                     (((sp >= GET_STACKBASE(bt->task)) &&
>                       (sp < GET_STACKTOP(bt->task))) ||
>                     in_alternate_stack(bt->tc->processor, sp))) {
>                         *eip = ip;
>                         *esp = sp;
>      =====>             bt->flags |= BT_KERNEL_SPACE;
>                         return;
>                 }
> 
> Neither of these flags should be set there.  BT_KDUMP_ELF_REGS
> is only applicable to x86_64, and BT_KERNEL_SPACE is only
> appicable to kvmdump dumpfiles.
> 
> Finally, in the interest of paranoia, give the user the capability
> of *not* using this facility.  In main.c, create a "--no_elf_notes"
> option (similar to "--zero_excluded"), and have it set a NO_ELF_NOTES
> bit in the globally-accessible "*diskdump_flags".  As an example, see
> how ZERO_EXCLUDED is set, used, and accessed.  Then in read_dump_header()
> skip the new ELF setup if that flag is set.  It will not be necessary
> to go as creating a new environment variable like "zero_excluded", but
> it would be nice to be able to restrict its use on the crash invocation
> command line.
>  
> Thanks,
>   Dave
> 
> 
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: 0001-Use-register-value-in-elf-note-NT_PRSTATUS-to-do-bac.patch
URL: <http://listman.redhat.com/archives/crash-utility/attachments/20110506/100140a3/attachment.ksh>


More information about the Crash-utility mailing list