[Crash-utility] [PATCH] do not check sp if ip points to user space

Mon Sep 26 13:20:26 UTC 2011

----- Original Message -----
> Hello all,
> 
> From: Wen Congyang <wency at cn.fujitsu.com>
> Subject: Re: [Crash-utility] [PATCH] do not check sp if ip points to
> user space
> Date: Mon, 26 Sep 2011 13:47:39 +0800
> 
> > At 09/23/2011 09:41 PM, Dave Anderson Write:
> >> 
> >> 
> >> ----- Original Message -----
> >>> If the task is a user program, the sp can be points to anywhere,
> >>> because we can modify sp in assembly.
> >>> For example:
> >>>
> >>> .globl main
> >>>         .type   main, @function
> >>> main:
> >>>
> >>>         finit
> >>>         subq $16, (%rsp)
> >>>         movq $0, (%rsp)
> >>> .loop:
> >>>         jmp .loop
> >>>
> >>>
> >> 
> >> Why would any user task do that?
> >> 
> >> And what happens when a backtrace is attempted on such a task?
> >> 
> >> Since the current code would not set BT_USER_SPACE, I'm guessing that it
> >> would run into this (at least on x86_64):
> >> 
> >>         if (!(bt->flags & BT_USER_SPACE) && (!rsp ||
> >>         !accessible(rsp))) {
> >>                 error(INFO, "cannot determine starting stack
> >>                 pointer\n");
> >>                 return;
> >>         }
> > 
> > Yes, crash will run into this on x86_64.
> > 
> >> 
> >> I do believe that I put the additional in_user_stack() checks in those
> >> locations for a reason.  Consider a task running in kernel mode that
> >> corrupts its return address stack location with a non-kernel address,
> >> or called a function indirectly that had a NULL pointer in it.  That 
> >> would cause a kernel crash with a non-kernel RIP in its exception frame,
> >> and your patch would mistake it for user-space.
> > 
> > I know the reason why you check if sp is in user stack.
> > What about this:
> > if !is_kernel_text(ip) && (sp is in kernel stack(include irq))
> >     try to backtrace according to sp
> > else
> >     display registers
> > 
> > If both ip and sp is corrupted, and we can not determine sp according to
> > the content in the stack, I think we should display registers.
> > 
> >> 
> >> In any case, you're going to have to come up with a more compelling
> >> reason to change all of these locations.  (And for that matter, I wonder
> >> why you didn't patch Fujitsu's get_sadump_regs() the same way?)
> > 
> > No only for Fujitsu's sadump, I think kvmdump has the same problem.
> > 
> > By the way, crash try to use the register when the dump format is diskdump.
> > I think we should use the register value when the dump format is Fujitsu's
> > sadump.
> > 
> 
> To be honest, I douted the logic just as you when I read
> get_kvmdump_regs() for the first time, but at that time I've
> understood why it is so, just as Dave has already explained, that bt
> command in crash utility mainly targets kernel stacks generated by C
> programs, and it seems reasonable enough as the feature users want.
> 
> BTW, it is the fact that it's not sufficient to see only RIP and RSP
> to decide if the execution mode for a given task. More condition is
> neccesary.
> 
> On X86 archtectures, for example, CS gives information about such
> information, and can cover the assembly example you showed in the
> first post, but of course this is the specific logic on X86 and
> X86_64, and I have no idea on other architectures, and might be
> inapplicable to kvmdump that targets archs other than X86.

Right, but the fact of the matter is that a backtrace cannot be
performed for a task with a nonsensical SP value, so it doesn't
make a difference whether it was in user-space or kernel-space.
So the "cannot determine starting stack pointer" error message
should still be displayed as it currently does -- and with my patch
suggestion, the registers can be dumped (if available) before
returning.

Dave

> 
> Thanks.
> HATAYAMA, Daisuke
> 
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
>