[Crash-utility] invalid regs display in bt

James Washer washer at trlp.com
Thu Sep 27 14:34:47 UTC 2007


Richard, the SS is "bogus" because the processor does NOT save it unless the exception involves a privilege-level change, and in this case there was none. I think you'll find SS is valid when a fault occurs in user land, resulting in a privilege change as we enter the kernel.
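
To make that rule concrete, here is a minimal Python sketch. It is not crash code, and the function names are made up for illustration. It assumes protected mode with vm86 ignored, and a kernel handler running at ring 0, so the low two bits of the saved CS tell you which privilege level was interrupted.

```python
def rpl(selector):
    """Requested privilege level: the low two bits of a segment selector."""
    return selector & 0x3


def ss_esp_saved_by_cpu(saved_cs):
    """True if the CPU pushed SS:ESP on exception entry.

    Assumption: the handler runs at ring 0 and vm86 mode is ignored, so a
    stack switch (and the SS:ESP push) happens only when the interrupted
    code was running at a privilege level above 0.
    """
    return rpl(saved_cs) != 0


# Values from Richard's dump: CS = 0060, SS = 304f
print(rpl(0x0060))                  # privilege level of the faulting code
print(rpl(0x304f))                  # low bits of whatever sat in the "SS" slot
print(ss_esp_saved_by_cpu(0x0060))  # kernel-mode fault
print(ss_esp_saved_by_cpu(0x0073))  # 0x0073 is an assumed user-mode CS, for illustration only
```

For the dump above the check says the SS and ESP slots were never written by the processor, which is why they show leftover stack contents.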

And NO, I don't want to see a different format for privilege-level-change vs. non-privilege-level-change exceptions, as this makes it harder to post-process with perl, etc.
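
As a rough illustration of the post-processing point, here is a sketch in Python rather than perl. The regex and the `parse_regs` helper are my own assumptions, not part of crash. With one fixed layout, a single pattern picks up every register, and a script can rely on the same fields always being present.

```python
import re

# One pattern handles every "NAME: hexvalue" pair in the bt register block.
REG_PAIR = re.compile(r"([A-Z]+):\s+([0-9a-f]+)")


def parse_regs(text):
    """Return {register name: integer value} for a bt register dump."""
    return {name: int(value, 16) for name, value in REG_PAIR.findall(text)}


sample = """\
    EAX: 00000018  EBX: f8b43400  ECX: f8b4304f  EDX: 00200000
    DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
    SS:  304f      ESP: f8b4302b  EBP: f463c000
    CS:  0060      EIP: f8b43004  ERR: ffffffff  EFLAGS: 00210286
"""

regs = parse_regs(sample)
print(hex(regs["CS"]), hex(regs["EIP"]), hex(regs["EFLAGS"]))
```

If SS and ESP were dropped for some frames, every consumer like this would need a special case for the missing keys.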

As for the error code, I don't know why it would be replaced with -1.
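
For reference, a small Python sketch of why ERR: ffffffff cannot be what the processor stored. The helper names are hypothetical. The bit limits follow Richard's description below: bits 0-2 for page faults, and at most bits 0-15 for other exceptions.

```python
def as_signed32(value):
    """Interpret a 32-bit register value as a signed integer."""
    return value - (1 << 32) if value & 0x80000000 else value


def plausible_hw_error_code(err):
    """True if err fits in bits 0-15, the widest range cited in the quoted message."""
    return (err & 0xFFFF0000) == 0


def decode_page_fault(err):
    """Decode bits 0-2 of an x86 page-fault error code."""
    return {
        "protection_violation": bool(err & 0x1),  # bit 0 clear: page not present
        "write": bool(err & 0x2),                 # bit 1 clear: read access
        "user_mode": bool(err & 0x4),             # bit 2 clear: supervisor mode
    }


err = 0xffffffff                      # ERR value from the dump
print(as_signed32(err))               # the value read as a signed integer
print(plausible_hw_error_code(err))   # does it fit the hardware format?
print(decode_page_fault(0x2))         # 0x2 is a made-up example value
```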

 - jim

On Tue, 25 Sep 2007 22:58:44 +0100
Richard J Moore <richardj_moore at uk.ibm.com> wrote:

> I've been puzzling over why the regs formatted with a backtrace on an IA32 
> dump are invalid. Here's what I mean:
> 
> PID: 2692   TASK: f4656630  CPU: 0   COMMAND: "rmmod"
>  #0 [f463ce54] crash_kexec at c044a1f7
>  #1 [f463ce9c] die at c040651a
>  #2 [f463ced4] do_page_fault at c0603107
>  #3 [f463cf14] error_code (via page_fault) at c060190a
>     EAX: 00000018  EBX: f8b43400  ECX: f8b4304f  EDX: 00200000 
>     DS:  007b      ESI: 00000000  ES:  007b      EDI: 00000000
>     SS:  304f      ESP: f8b4302b  EBP: f463c000
>     CS:  0060      EIP: f8b43004  ERR: ffffffff  EFLAGS: 00210286 
> 
> 
> They are supposed to represent a valid set of regs that are presented to 
> do_page_fault, which I presume are meant to be valid at the time the 
> exception occurred.
> Of course, they can never be a valid set of regs, for the simple reason that 
> the CPL is 0 (CS=60) and the RPL of SS is 3, which is an automatic GPF.
> Since I manufactured the exception that caused this dump, by causing an 
> unrecoverable page fault in ring 0, I know the CS is correct but SS is 
> bogus. 
> Furthermore, the error code (ERR), which is stored by the processor as 
> part of the exception stack frame, uses only bits 0-2 for page faults and 
> at most bits 0-15 for other exceptions; the unused bit positions are zero. 
> So ERR is also bogus.
> 
> On looking at the code in entry.S at page_fault and the other exception 
> entry points I see no attempt to save regs to create a pt_regs struct. The 
> fact that do_page_fault takes pt_regs as the first arg is a hack to get at 
> CS:EIP and SS:ESP at the time of exception. Furthermore error_code loads 
> the exception error code into edx then wipes it out from the stack by 
> storing -1 into this location. I can't actually see a good reason for 
> wiping out the error code. By convention exceptions and interrupts have a 
> -ve integer stored at the error-code location to distinguish them from 
> system calls, but I don't think this is used. signal.c seems to be the 
> only place that looks for an error code >=0, but I don't see how an 
> exception affects signal.c. 
> 
> Can anyone confirm whether setting the error code to -1 is essential? If 
> it isn't, then I think we should consider leaving it in place.
> 
> 
> The long and short of it is: the only things that have any meaning are CS, 
> EIP and EFLAGS, all of which are saved by the processor. SS and ESP are 
> only saved when the exception occurred at a privilege level >0, but these 
> can never generate a panic. 
> 
> I'd recommend that we change the bt output to format only the three valid 
> regs (possibly SS and ESP, if CPL at time of exception >0). Is there any 
> reason why this shouldn't be changed?
> 
> Richard



