[Crash-utility] [PATCH] do not check sp if ip points to user space

Wed Sep 28 09:03:45 UTC 2011

At 09/26/2011 08:51 PM, Dave Anderson Write:
> 
> 
> ----- Original Message -----
>> At 09/23/2011 09:41 PM, Dave Anderson Write:
>>>
>>>
>>> ----- Original Message -----
>>>> If the task is a user program, the sp can be points to anywhere,
>>>> because we can modify sp in assembly.
>>>> For example:
>>>>
>>>> .globl main
>>>>         .type   main, @function
>>>> main:
>>>>
>>>>         finit
>>>>         subq $16, (%rsp)
>>>>         movq $0, (%rsp)
>>>> .loop:
>>>>         jmp .loop
>>>>
>>>>
>>>
>>> Why would any user task do that?
>>>
>>> And what happens when a backtrace is attempted on such a task?
>>>
>>> Since the current code would not set BT_USER_SPACE, I'm guessing that it
>>> would run into this (at least on x86_64):
>>>
>>>         if (!(bt->flags & BT_USER_SPACE) && (!rsp || !accessible(rsp))) {
>>>                 error(INFO, "cannot determine starting stack pointer\n");
>>>                 return;
>>>         }
>>
>> Yes, crash will run into this on x86_64.
> 
> OK, so why not change the above to do something like this:
> 
>         if (!(bt->flags & BT_USER_SPACE) && (!rsp || !accessible(rsp))) {
>                 fprintf(ofp, "cannot determine starting stack pointer\n");
>                 if(KVMDUMP_DUMPFILE())
>                         kvmdump_display_regs(bt->tc->processor, ofp);
>                 else if (ELF_NOTES_VALID() && DISKDUMP_DUMPFILE())
>                         diskdump_display_regs(bt->tc->processor, ofp);
>                 else if (SADUMP_DUMPFILE()) {
>                         sadump_display_regs(bt->tc->processor, ofp);
>                 return;
>         }
> 
> Dave

Agree with it. But we should init ofp earlier, and we should add the
same code in the function x86_back_trace_cmd().

Thanks
Wen Congyang

>  
>>>
>>> I do believe that I put the additional in_user_stack() checks in those
>>> locations for a reason.  Consider a task running in kernel mode that
>>> corrupts its return address stack location with a non-kernel  address,
>>> or called a function indirectly that had a NULL pointer in it.  That
>>> would cause a kernel crash with a non-kernel RIP in its exception frame,
>>> and your patch would mistake it for user-space.
>>
>> I know the reason why you check if sp is in user stack.
>> What about this:
>> if !is_kernel_text(ip) && (sp is in kernel stack(include irq))
>>     try to backtrace according to sp
>> else
>>     display registers
>>
>> If both ip and sp is corrupted, and we can not determine sp according to
>> the content in the stack, I think we should display registers.
>>
>>>
>>> In any case, you're going to have to come up with a more compelling
>>> reason to change all of these locations.  (And for that matter, I wonder
>>> why you didn't patch Fujitsu's get_sadump_regs() the same way?)
>>
>> No only for Fujitsu's sadump, I think kvmdump has the same problem.
>>
>> By the way, crash try to use the register when the dump format is diskdump.
>> I think we should use the register value when the dump format is Fujitsu's
>> sadump.
>>
>> Thanks
>> Wen Congyang
>>
>>>
>>> Dave
> 
>