[Crash-utility] crash version 4.0-3.15 is available

Thu Dec 21 13:54:23 UTC 2006

Isaku Yamahata wrote:

> On Wed, Dec 20, 2006 at 10:15:32AM -0500, Dave Anderson wrote:
>
> > - Introduced support for xendumps of para-virtualized ia64 kernels.
> >   It should be noted that currently the ia64 Xen kernel does not
> >   lay down a switch_stack for the panic task, so only raw "bt -t"
> >   backtraces can be done on the panic task.  (anderson at redhat.com)
>
> Hi Dave.
>
> The current "xm dump-core" on ia64 loses some register information
> that is saved on the Xen register stack.
> E.g. r33, ... aren't saved in the domU xendump file.
> ia64-specific code would probably be necessary for it.
> This will be addressed as a post-3.0.4 effort, and the format will be changed.
>
> --
> yamahata

I'm not sure exactly what the ramifications are of an ia64 "xm dump-core"
on a paravirtualized kernel.  It would seem to depend upon what, if anything,
was "active" at the time.

My reference to the switch_stack above was for an ia64 kernel that panicked
on its own account; the test dump I used was killed with a write to
/proc/sysrq-trigger:

crash> bt
PID: 1554   TASK: e000000000988000  CPU: 0   COMMAND: "bash"
bt: xendump: switch_stack possibly not saved -- try "bt -t"
 #0 [BSP:e000000000988f00] schedule at a0000001005e0420
crash>

The crash utility uses the stale information left from the last time the
task called schedule(), so the backtrace fails.

Using "bt -t" walks the process stack for kernel return addresses,
and the "reverse" BSP information just above the task_struct shows
the path taken:

crash> bt -t
PID: 1554   TASK: e000000000988000  CPU: 0   COMMAND: "bash"
              START: schedule at a0000001005e0420
  [e000000000989238] xen_trace_syscall at a000000100065020
  [e000000000989288] sys_write at a000000100155d30
  [e0000000009892b8] vfs_write at a0000001001551e0
  [e000000000989308] write_sysrq_trigger at a0000001001e3250
  [e000000000989320] __handle_sysrq at a00000010039bca0
  [e000000000989380] sysrq_handle_crashdump at a00000010039c460
  [e0000000009893e8] do_wait at a00000010008b080
  [e000000000989438] schedule_timeout at a0000001005e22a0
  [e000000000989450] ext3_lookup at a0000002001eb7d0
  [e000000000989460] cleanup_module at a000000200202a10
  [e000000000989470] ext3_find_entry at a0000002001e77c0
  [e0000000009894b0] __wait_on_buffer at a00000010015b580
  [e0000000009894c0] ll_rw_block at a00000010015c2b0
  [e000000000989500] out_of_line_wait_on_bit at a0000001005e2940
  [e000000000989520] __wait_on_bit at a0000001005e27d0
  [e000000000989540] sync_buffer at a00000010015b7c0
  [e000000000989558] io_schedule at a0000001005e2170
  [e000000000989588] __delayacct_blkio_start at a0000001000fa4b0
  [e000000000989608] io_schedule at a0000001005e21a0
  [e000000000989630] __do_IRQ at a0000001000f2450
  [e000000000989660] do_softirq at a000000100093100
  [e000000000989698] blkif_int at a000000200152070
  [e000000000989710] end_that_request_first at a00000010027ae30
  [e000000000989748] __end_that_request_first at a00000010027a460
  [e000000000989778] bio_endio at a0000001001622b0
  [e00000000098fca0] schedule at a0000001005e0420
  [e00000000098fd10] vhpt_miss at a000000100000002
  [e00000000098fd60] vhpt_miss at a000000100000002
  [e00000000098fdc8] dummycon_dummy at a0000001002dd380
  [e00000000098fdd0] vhpt_miss at a000000100000003
crash>

Well, it at least shows it going as far as sysrq_handle_crashdump(),
and any further addresses of function calls were never pushed into
the BSP.  (?)
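
To make the "bt -t" idea concrete, here is a minimal sketch of that kind of
raw scan: walk the stack one word at a time and report every word that falls
inside a text range.  This is only an illustration and is not taken from the
crash sources; the text_range structure, the scan_stack_for_text() name, and
the single contiguous range are all assumptions made for the example (the
real command resolves candidate addresses against kernel and module symbols).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of a text region: [start, end). */
struct text_range {
        uint64_t start;
        uint64_t end;
};

/*
 * Scan nwords 64-bit stack words and store the index of every word
 * whose value lies inside the text range.  Returns the number of
 * indexes stored, never more than max_hits.
 */
static size_t scan_stack_for_text(const uint64_t *stack, size_t nwords,
                                  struct text_range text,
                                  size_t *hits, size_t max_hits)
{
        size_t n = 0;
        size_t i;

        assert(stack != NULL && hits != NULL);

        for (i = 0; i < nwords && n < max_hits; i++) {
                /* A word inside the text range is a possible return address. */
                if (stack[i] >= text.start && stack[i] < text.end)
                        hits[n++] = i;
        }
        return n;
}
```

A scan like this cannot tell a live return address from a leftover value of
an earlier call chain, which is why the listing above mixes current and
stale entries.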

Anyway, the problem is that the shutdown path in a para-virtualized ia64
kernel does not lay down a switch_stack -- as is done by the netdump,
diskdump and kdump facilities.  Without a switch_stack register dump,
a normal (non-raw) backtrace of the panic task is impossible.

It's a simple thing to do -- at some point during the shutdown
path, presumably xen_panic_event(), the panicking process would
need to make a call to the unw_init_running() function, which lays
down a switch_stack on the kernel stack, and then continues on to
the next function in the shutdown path.  For example, the kdump
facility for ia64 does this:

[ system crashes ]
  crash_kexec()
    machine_kexec()
       ...

The ia64 version of machine_kexec() does this:

void machine_kexec(struct kimage *image)
{
        unw_init_running(ia64_machine_kexec, image);
        for(;;);
}

The call to unw_init_running() never returns; instead,
it lays down a switch_stack on the kernel stack, and then
calls the ia64_machine_kexec() function:

extern void *efi_get_pal_addr(void);
static void ia64_machine_kexec(struct unw_frame_info *info, void *arg)
{
        struct kimage *image = arg;
        relocate_new_kernel_t rnk;
        void *pal_addr = efi_get_pal_addr();
        unsigned long code_addr = (unsigned long)page_address(image->control_code_page);
        unsigned long vector;
        int ii;

        if (image->type == KEXEC_TYPE_CRASH) {
                crash_save_this_cpu();
                current->thread.ksp = (__u64)info->sw - 16;
        }

        ... (continue shutdown path)

The address of the switch_stack is found in the unw_frame_info
structure passed in, and gets stored in the current->thread.ksp of
the panicking task.  With that simple procedure, the crash utility
will then have all that it needs to do a backtrace of the panicking task.

The para-virtualized ia64 kernel shuts down when panic() calls
atomic_notifier_call_chain(), which in turn goes through the
panic_notifier list and reaches the ia64 version of xen_panic_event():

static int
xen_panic_event(struct notifier_block *this, unsigned long event, void *ptr)
{
        HYPERVISOR_shutdown(SHUTDOWN_crash);
        /* we're never actually going to get here... */
        return NOTIFY_DONE;
}

The ia64 version would need to "jump through the hoop" of a call to
unw_init_running() before it calls HYPERVISOR_shutdown().
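
As a rough illustration only, the change might follow the same pattern as
the kdump code above.  The sketch below is a user-space mock, not a tested
kernel patch: the structure layouts, the "current" pointer, and the bodies of
unw_init_running() and HYPERVISOR_shutdown() are stand-ins so that the
control flow can be compiled on its own, and xen_panic_save_and_shutdown()
is a hypothetical helper name.  The first argument of xen_panic_event() is
renamed to nb here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mock types: placeholder layouts, not the real kernel definitions. */
struct switch_stack { uint64_t regs[8]; };
struct unw_frame_info { struct switch_stack *sw; };
struct thread_struct { uint64_t ksp; };
struct task_struct { struct thread_struct thread; };
struct notifier_block;                  /* opaque in this sketch */

#define SHUTDOWN_crash 3                /* mock value for this sketch */
#define NOTIFY_DONE    0

static struct task_struct mock_task;
static struct task_struct *current = &mock_task;  /* stand-in for "current" */
static int shutdown_calls;              /* counts mock hypercalls */
static uint64_t mock_sw_addr;           /* where the mock put the switch_stack */

/* Mock hypercall: the real one is not expected to return. */
static void HYPERVISOR_shutdown(int reason)
{
        (void)reason;
        shutdown_calls++;
}

/* Mock: lay down a switch_stack on the stack, then invoke the callback. */
static void unw_init_running(void (*callback)(struct unw_frame_info *, void *),
                             void *arg)
{
        struct switch_stack sw = {{0}};
        struct unw_frame_info info = { &sw };

        mock_sw_addr = (uint64_t)(uintptr_t)&sw;
        callback(&info, arg);
}

/* Hypothetical helper: record the switch_stack location, then shut down. */
static void xen_panic_save_and_shutdown(struct unw_frame_info *info, void *arg)
{
        (void)arg;
        assert(info != NULL && info->sw != NULL);
        /* Same bookkeeping as ia64_machine_kexec() does for kdump. */
        current->thread.ksp = (uint64_t)(uintptr_t)info->sw - 16;
        HYPERVISOR_shutdown(SHUTDOWN_crash);
}

static int xen_panic_event(struct notifier_block *nb, unsigned long event,
                           void *ptr)
{
        (void)nb; (void)event; (void)ptr;
        unw_init_running(xen_panic_save_and_shutdown, NULL);
        /* Reached only in the mock; the real hypercall would not return. */
        return NOTIFY_DONE;
}
```

The only point of the sketch is the ordering: the switch_stack is laid down
and its address recorded in current->thread.ksp before the hypercall is made.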

Thanks,
  Dave