[Crash-utility] Re: crash with Xen dom0 image from kdump

David Anderson anderson at redhat.com
Mon Jun 5 13:57:02 UTC 2006


David Anderson wrote:

> Horms wrote:
>
>>On Fri, Jun 02, 2006 at 09:28:07AM -0400, Dave Anderson wrote:
>>
>>>Kazuo Moriwaka wrote:
>>>
>>>With a xendump, the phys_to_machine_mapping array can be
>>>found by following the page tables from any of the cr3 values
>>>found in the dump header.
>>>
>>>It would seem simple enough to have the xen/kdump code store
>>>a legitimate dom0 cr3 value somewhere in the ELF header.  Would
>>>that be possible to explore?  Note that I'm not really interested
>>>in the other guest domains, at least at this point, but I could see
>>>it potentially helpful to recreate the crash environment for the
>>>guest domains as well, so perhaps an array of per-domain cr3
>>>values in an ELF header note section would be possible?
>>>
>>
>>I haven't looked into this, but yes I think that should be quite
>>possible as I believe that dom0's cr3 is stored in the hypervisor
>>somewhere. The main problem (for me) is where to put the extra crash
>>note and how to add it without having to modify the user-space kexec
>>tool - currently that code does not need to be modified in order to work
>>with xen. Would it be enough to have it stored in a symbol in the
>>hypervisor (to be honest I don't really understand why crash notes are
>>needed at all given that there is a symbol table)? Or alternatively,
>>could you give me suggestions on the crash-note front? I'll chase up
>>where dom0's cr3 is currently saved.
>>
>
> It's conceivable, but highly unlikely, that it can be found in the
> vmcore in its current state.  My suggestion of storing at least one
> cr3 for dom0, and even better, an array of cr3's for all of the
> domains, was based upon how I figure it out now using the xendump
> format.  It's just that for domains with writable page tables, I at
> least need a starting point "hook"; and with a legitimate cr3 (that
> has the kernel mappings), I can then walk the domain's page tables
> (all of which contain mfn's instead of pfn's in the case of writable
> page table domains).
>
> (Note that there's some churn going on with the kexec/kdump patch to
> replace its current manner of over-writing the pgd in use at the dump
> of the crash -- it's not clear to me whether that would be a problem
> until such time as that kdump quirk is "fixed".)
>
> But it would also be possible, if I could access the domain's
> xen_start_info->mfn_list value, to re-create the list of mfn's that
> make up the list of pages in each domain's phys_to_machine_mapping
> array.  The kernel does this, based upon what its max_pfn value is
> determined to be:
>
>                 phys_to_machine_mapping = alloc_bootmem_low_pages(
>                      max_pfn * sizeof(unsigned long));
>                 memset(phys_to_machine_mapping, ~0,
>                        max_pfn * sizeof(unsigned long));
>                 memcpy(phys_to_machine_mapping,
>                        (unsigned long *)xen_start_info->mfn_list,
>                        xen_start_info->nr_pages * sizeof(unsigned long));
>
> So my wild guess was that I could possibly find the xen_start_info
> structures stored in the vmcore as it exists now, but that's probably
> impossible.  I don't know enough about the hypervisor code.
>
> And again, I don't see why something could *not* be found in the xen
> symbol table *now*, but my point was that it unnecessarily adds the
> burden of having to incorporate that file in the process, instead of
> simply needing the vmlinux and vmcore file.  Using a PT_NOTE section
> is simply one manner of passing arbitrary data back to the ultimate
> user of the vmcore file.
>
> Dave
>   
>
>
>
Forget what I said about using the xen_start_info->mfn_list above -- that
won't work at all -- the mfn values initially stored in, and copied from,
the "mfn_list" are immediately free_bootmem()'d after the memcpy above.

What I *meant* was that if the cr3 values were not available, another
alternative would be to access the value of the pfn_to_mfn_frame_list_list
in the arch_shared_info structure:

typedef struct arch_shared_info {
    unsigned long max_pfn;                  /* max pfn that appears in table */
    /* Frame containing list of mfns containing list of mfns containing p2m. */
    unsigned long pfn_to_mfn_frame_list_list;
    unsigned long nmi_reason;
} arch_shared_info_t;

Each domain's xen_start_info structure has the machine address of the
"shared_info" structure:

typedef struct start_info {
    /* THE FOLLOWING ARE FILLED IN BOTH ON INITIAL BOOT AND ON RESUME.    */
    char magic[32];             /* "xen-<version>-<platform>".            */
    unsigned long nr_pages;     /* Total pages allocated to this domain.  */
    unsigned long shared_info;  /* MACHINE address of shared info struct. */
    uint32_t flags;             /* SIF_xxx flags.                         */
    unsigned long store_mfn;    /* MACHINE page number of shared page.    */
    uint32_t store_evtchn;      /* Event channel for store communication. */
    unsigned long console_mfn;  /* MACHINE address of console page.       */
    uint32_t console_evtchn;    /* Event channel for console messages.    */
    /* THE FOLLOWING ARE ONLY FILLED IN ON INITIAL BOOT (NOT RESUME).     */
    unsigned long pt_base;      /* VIRTUAL address of page directory.     */
    unsigned long nr_pt_frames; /* Number of bootstrap p.t. frames.       */
    unsigned long mfn_list;     /* VIRTUAL address of page-frame list.    */
    unsigned long mod_start;    /* VIRTUAL address of pre-loaded module.  */
    unsigned long mod_len;      /* Size (bytes) of pre-loaded module.     */
    int8_t cmd_line[MAX_GUEST_CMDLINE];
} start_info_t;

and the shared_info structure contains the arch_shared_info structure
shown above.

So it would appear that, given the mfn of a domain's xen_start_info
structure, the pfn_to_mfn_frame_list_list could be tracked down.
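
As a rough illustration of that chain, here is a minimal sketch in C. It
is not code from crash or from Xen. It assumes 4K pages and reads the
"Frame containing list of mfns containing list of mfns containing p2m"
comment literally as a three-level lookup. The machine[] array,
read_mfn(), get_arch_info(), build_toy_dump(), BAD_MFN and the arch_off
parameter are all hypothetical names made up for the example; in
particular, the offset of the arch member inside shared_info depends on
the Xen version, so it is passed in rather than guessed.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define ENTRIES    (PAGE_SIZE / sizeof(unsigned long))
#define NR_FRAMES  8UL          /* size of the toy "machine memory" */
#define BAD_MFN    (~0UL)       /* hypothetical failure marker */

struct arch_shared_info {
    unsigned long max_pfn;
    unsigned long pfn_to_mfn_frame_list_list;
    unsigned long nmi_reason;
};

/* Toy stand-in for the machine pages a dump reader would fetch. */
static unsigned long machine[NR_FRAMES][ENTRIES];

/* Hypothetical helper: return the page at machine frame mfn, or NULL. */
static unsigned long *read_mfn(unsigned long mfn)
{
    return mfn < NR_FRAMES ? machine[mfn] : NULL;
}

/*
 * Copy out arch_shared_info, given start_info.shared_info (a MACHINE
 * address).  ASSUMPTION: arch_off is the byte offset of the arch member
 * within shared_info; the caller must supply the right value.
 */
static int get_arch_info(unsigned long shared_info_maddr, size_t arch_off,
                         struct arch_shared_info *out)
{
    unsigned long *page = read_mfn(shared_info_maddr >> PAGE_SHIFT);
    size_t off = (shared_info_maddr & (PAGE_SIZE - 1)) + arch_off;

    if (!page || off + sizeof(*out) > PAGE_SIZE)
        return -1;
    memcpy(out, (char *)page + off, sizeof(*out));
    return 0;
}

/* frame_list_list page -> frame list page -> p2m page -> mfn. */
static unsigned long pfn_to_mfn(const struct arch_shared_info *arch,
                                unsigned long pfn)
{
    unsigned long p2m_page, *fll, *fl, *p2m;

    if (pfn >= arch->max_pfn)
        return BAD_MFN;
    p2m_page = pfn / ENTRIES;           /* which p2m page holds this pfn */
    if (p2m_page / ENTRIES >= ENTRIES)  /* beyond one frame_list_list page */
        return BAD_MFN;
    if (!(fll = read_mfn(arch->pfn_to_mfn_frame_list_list)))
        return BAD_MFN;
    if (!(fl = read_mfn(fll[p2m_page / ENTRIES])))
        return BAD_MFN;
    if (!(p2m = read_mfn(fl[p2m_page % ENTRIES])))
        return BAD_MFN;
    return p2m[pfn % ENTRIES];
}

/* Build a made-up 3-page domain; returns its shared_info machine address. */
static unsigned long build_toy_dump(void)
{
    struct arch_shared_info arch = { 3, 2, 0 };  /* max_pfn 3, list at mfn 2 */

    assert(NR_FRAMES > 4);                /* toy layout uses frames 1..4 */
    memset(machine, 0xff, sizeof(machine));
    memcpy(machine[1], &arch, sizeof(arch));  /* mfn 1: shared_info (arch at 0) */
    machine[2][0] = 3;                    /* mfn 2: frame_list_list */
    machine[3][0] = 4;                    /* mfn 3: frame list */
    machine[4][0] = 7;                    /* mfn 4: p2m page, pfn 0 -> mfn 7 */
    machine[4][1] = 5;
    machine[4][2] = 6;
    return 1UL << PAGE_SHIFT;
}
```

With the toy data, calling get_arch_info() on the address returned by
build_toy_dump() (with arch_off of 0, true only for this made-up layout)
and then pfn_to_mfn() for pfns 0 through 2 walks all three levels. A
real reader would replace read_mfn() with whatever the dump format
offers for fetching a machine page.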

Dave






