[Crash-utility] User Stack back trace of the process

Rajesh rajeshkuri at rediffmail.com
Fri Sep 7 11:24:11 UTC 2007


  
Maybe I'm posting to the wrong mailing list. Kindly guide me...

I have modified the ELF core dump functionality to dump only the text, data and stack segments. I'm not interested in the dynamically allocated memory of the process.
Below is the modification I have made in the "binfmt_elf.c" file.
In the maydump() function I check whether the VMA is mapped to dynamic memory of the process or not.

-------------------------------------------------------
/* Skip anonymous VMAs that end at or below the start of the main stack. */
if ((vma->vm_file == NULL) &&
    !(current->mm->start_stack < vma->vm_end))
        return 0;
-------------------------------------------------------
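
To make the effect of this check easier to see, here is a small user-space model of it. This is only a sketch: `struct mock_vma`, `posted_check_skips()`, `demo_posted_check()` and all the addresses are made up for illustration and are not kernel code. It assumes the usual layout where the heap and the anonymous part of .bss lie below the main stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the vm_area_struct fields the check reads. */
struct mock_vma {
    unsigned long vm_start;
    unsigned long vm_end;
    const void *vm_file;        /* NULL for anonymous mappings */
};

/*
 * Same condition as the posted check: true when the VMA would be
 * skipped, i.e. it is anonymous and ends at or below start_stack.
 */
bool posted_check_skips(const struct mock_vma *vma, unsigned long start_stack)
{
    return vma->vm_file == NULL && !(start_stack < vma->vm_end);
}

/* Illustrative addresses only; real layouts differ. */
void demo_posted_check(void)
{
    const unsigned long start_stack = 0x7ffdf000UL;
    struct mock_vma text  = { 0x00400000UL, 0x00401000UL, "a.out" };
    struct mock_vma bss   = { 0x00602000UL, 0x00603000UL, NULL };
    struct mock_vma heap  = { 0x00603000UL, 0x00700000UL, NULL };
    struct mock_vma stack = { 0x7ffc0000UL, 0x7ffe0000UL, NULL };

    assert(!posted_check_skips(&text, start_stack));   /* file-backed: kept */
    assert(posted_check_skips(&bss, start_stack));     /* anonymous .bss: skipped */
    assert(posted_check_skips(&heap, start_stack));    /* heap: skipped */
    assert(!posted_check_skips(&stack, start_stack));  /* main stack: kept */
}
```

In this model the check cannot tell the heap from any other anonymous mapping below the main stack, so the anonymous part of .bss is skipped along with it.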

It works fine for single-threaded processes, but when I take the core dump of a multi-threaded process, I only get the core dump of the process I kill, and in gdb I'm not able to switch between the threads.
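
One possible contributing factor, offered as an assumption and not a confirmed diagnosis: thread stacks are normally anonymous mappings placed below the main stack, so the check above would skip them like heap memory. The user-space sketch below models that case. `struct vma_model`, both functions and all addresses are hypothetical, and the "stack pointer aware" variant is only an idea for illustration, not an existing kernel interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Mirrors the kernel's VM_GROWSDOWN bit; redefined here only for the model. */
#define MODEL_VM_GROWSDOWN 0x00000100UL

/* Hypothetical stand-in for the vm_area_struct fields the checks use. */
struct vma_model {
    unsigned long vm_start;
    unsigned long vm_end;
    unsigned long vm_flags;
    const void *vm_file;        /* NULL for anonymous mappings */
};

/* The posted check: false (skip) for anonymous VMAs at or below start_stack. */
bool posted_filter_dumps(const struct vma_model *vma, unsigned long start_stack)
{
    return !(vma->vm_file == NULL && !(start_stack < vma->vm_end));
}

/*
 * Hypothetical alternative: keep file-backed VMAs, grows-down VMAs,
 * and any VMA that contains the stack pointer of some thread.
 */
bool sp_aware_filter_dumps(const struct vma_model *vma,
                           const unsigned long *thread_sp, size_t nthreads)
{
    if (vma->vm_file != NULL || (vma->vm_flags & MODEL_VM_GROWSDOWN))
        return true;
    for (size_t i = 0; i < nthreads; i++)
        if (thread_sp[i] >= vma->vm_start && thread_sp[i] < vma->vm_end)
            return true;
    return false;
}

/* Illustrative addresses only: one main stack, one thread stack, one heap. */
void demo_thread_stack(void)
{
    const unsigned long start_stack = 0x7ffdf000UL;
    const unsigned long sps[] = { 0x7ffde000UL, 0x7f7ff000UL };
    struct vma_model tstack = { 0x7f000000UL, 0x7f800000UL, 0, NULL };
    struct vma_model heap   = { 0x00603000UL, 0x00700000UL, 0, NULL };

    assert(!posted_filter_dumps(&tstack, start_stack)); /* thread stack lost */
    assert(sp_aware_filter_dumps(&tstack, sps, 2));     /* thread stack kept */
    assert(!sp_aware_filter_dumps(&heap, sps, 2));      /* heap still skipped */
}
```

If this is what happens, gdb would have no stack memory for the other threads to unwind. In the kernel, the stack pointers would have to come from each thread's saved user registers, which this model does not attempt to show.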

Please let me know whether those modifications are correct or not.

--Regards,
rajesh

On Wed, 05 Sep 2007 Dave Anderson wrote :
>Rajesh wrote:
>>Sorry in my previous e-mail I mistyped.
>>
>>I want to dump only code and stack segments of a process.
>>
>>--Regards,
>>rajesh
>
>stack segments would have: (vma->vm_flags & VM_GROWSDOWN)
>
>
>>
>>
>>On Wed, 05 Sep 2007 Rajesh wrote :
>>  >Hi,
>>  >
>>  >Is there any way to find out, using kernel data structures, whether a VMA of a process belongs to the stack or the heap? It is easy to tell whether a VMA belongs to the code segment from the vm_area_struct structure, using the "vm_flags" variable.
>>  >
>>  >In "elf_core_dump()" function I'm planning to dump only code and data segments.
>>  >
>>  >Can anybody please guide me...
>>  >
>>  >--Regards,
>>  >rajesh
>>  >
>>  >On Wed, 05 Sep 2007 Dave Anderson wrote :
>>  > >Rajesh wrote:
>>  > >>Dave,
>>  > >>
>>  > >>Thanks for your explanation.
>>  > >>
>>  > >>Well, the reason behind my questions is that we have an application running at a customer site, and the application consumes around 60GB of system memory.
>>  > >>When this process receives a segmentation fault or an abort signal, the kernel starts to take the process core dump. Here is the problem: the kernel takes at least 1 hour (60 minutes) to finish the core dump. During this time the system is unresponsive (hung), and I feel it is because the system starts thrashing due to the huge memory usage of the process. This long downtime is not acceptable to the customer.
>>  > >>
>>  > >>So I started looking for a better way of tackling the problem.
>>  > >>
>>  > >>1>First we thought of changing the system page size from 4KB to 8KB. This change could not be made on our x86_64 architecture, since x86_64 doesn’t support a multi-page-size option.
>>  > >>
>>  > >>2>We wrote a program using the libbfd APIs and used it within our application. Whenever SIGSEGV or SIGABRT is received by the process, it logs the stack trace of all the threads within that process. This feature is not as effective or flexible as a process core dump.
>>  > >>
>>  > >>3>Lastly, we thought of using kcore/vmcore to analyze the cause of the SIGSEGV or SIGABRT.
>>  > >>
>>  > >>4>I have one more thought: making the “elf_core_dump()” function SMP-capable. This function is responsible for dumping the core, and it is located in “/usr/src/linux/fs/binfmt_elf.c”.
>>  > >>
>>  > >>
>>  > >>Any comments/ideas are welcome.
>>  > >>
>>  > >>--Regards,
>>  > >>rajesh
>>  > >
>>  > >Maybe tinker with maydump()?
>>  > >
>>  > >If you know that the core dump contains VMAs that are
>>  > >not necessary to dump, such as large shared memory segments,
>>  > >and you can identify them from the VMA, you can prevent
>>  > >them from being copied to the core dump.  There's this
>>  > >patch floating around, which may have been updated:
>>  > >
>>  > >  http://lkml.org/lkml/2007/2/16/149
>>  > >
>>  > >Dave
>>  > >
>>  > >
>>  > >
>>  > >
>>  >--
>>  >Crash-utility mailing list
>>  >Crash-utility at redhat.com
>>  >https://www.redhat.com/mailman/listinfo/crash-utility
>>
>>
>>
>>
>
>