[Crash-utility] Re: crash with Xen dom0 image from kdump
David Anderson
anderson at redhat.com
Mon Jun 5 13:57:02 UTC 2006
David Anderson wrote:
> Horms wrote:
>
>>On Fri, Jun 02, 2006 at 09:28:07AM -0400, Dave Anderson wrote:
>>
>>>Kazuo Moriwaka wrote:
>>>
>>>With a xendump, the phys_to_machine_mapping array can be
>>>found by following the page tables from any of the cr3 values
>>>found in the dump header.
>>>
>>>It would seem simple enough to have the xen/kdump code store
>>>a legitimate dom0 cr3 value somewhere in the ELF header. Would
>>>that be possible to explore? Note that I'm not really interested
>>>in the other guest domains, at least at this point, but I could see
>>>it potentially helpful to recreate the crash environment for the
>>>guest domains as well, so perhaps an array of per-domain cr3
>>>values in an ELF header note section would be possible?
>>>
>>
>>I haven't looked into this, but yes I think that should be quite
>>possible as I believe that dom0's cr3 is stored in the hypervisor
>>somewhere. The main problem (for me) is where to put the extra crash
>>note and how to add it without having to modify the user-space kexec
>>tool - currently that code does not need to be modified in order to work
>>with xen. Would it be enough to have it stored in a symbol in the
>>hypervisor (to be honest I don't really understand why crash notes are
>>needed at all given that there is a symbol table)? Or alternatively,
>>could you give me suggestions on the crash-note front? I'll chase up
>>where dom0's cr3 is currently saved.
>>
>
> It's conceivable, but highly unlikely, that it can be found in the
> vmcore in its current state. My suggestion of storing at least one
> cr3 for dom0, and even better, an array of cr3's for all of the
> domains, was based upon how I figure it out now using the xendump
> format. It's just that for domains with writable page tables, I at
> least need a starting point "hook"; and with a legitimate cr3 (that
> has the kernel mappings), I can then walk the domain's page tables
> (all of which contain mfn's instead of pfn's in the case of writable
> page table domains).
>
> (Note that there's some churn going on with the kexec/kdump patch to
> replace its current manner of over-writing the pgd in use at the dump
> of the crash -- it's not clear to me whether that would be a problem
> until such time as that kdump quirk is "fixed".)
>
> But if I could access the domain's xen_start_info->mfn_list value,
> I could also re-create the list of mfn's that make up the list of
> pages in each domain's phys_to_machine_mapping array. The kernel does
> this, based upon what its max_pfn value is determined to be:
>
>     phys_to_machine_mapping = alloc_bootmem_low_pages(
>             max_pfn * sizeof(unsigned long));
>     memset(phys_to_machine_mapping, ~0,
>             max_pfn * sizeof(unsigned long));
>     memcpy(phys_to_machine_mapping,
>             (unsigned long *)xen_start_info->mfn_list,
>             xen_start_info->nr_pages * sizeof(unsigned long));
>
> So my wild guess was that I could possibly find the xen_start_info
> structures stored in the vmcore as it exists now, but that's probably
> impossible. I don't know enough about the hypervisor code.
>
> And again, I don't see why something could *not* be found in the xen
> symbol table *now*, but my point was that it unnecessarily adds the
> burden of having to incorporate that file in the process, instead of
> simply needing the vmlinux and vmcore file. Using a PT_NOTE section is
> simply one manner of passing arbitrary data back to the ultimate user
> of the vmcore file.
>
> Dave
>
Forget what I said about using the xen_start_info->mfn_list above -- that
won't work at all -- the mfn values initially stored in and copied from
the "mfn_list" are immediately free_bootmem()'d after the memcpy above.

What I *meant* was that if the cr3 values were not available, another
alternative would be to access the value of the pfn_to_mfn_frame_list_list
in the arch_shared_info structure:

    typedef struct arch_shared_info {
        unsigned long max_pfn;  /* max pfn that appears in table */
        /* Frame containing list of mfns containing list of mfns containing p2m. */
        unsigned long pfn_to_mfn_frame_list_list;
        unsigned long nmi_reason;
    } arch_shared_info_t;

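To make the three-level layout described in that comment concrete, here is a minimal sketch of the index arithmetic. It assumes 4 KiB pages and that each level is a page packed with unsigned long entries; those are my assumptions, not something stated above. The names SKETCH_PAGE_SIZE, SKETCH_FPP, struct p2m_slot and p2m_slot_for_pfn() are made up for illustration and do not come from the kernel, the hypervisor, or crash.

```c
#include <assert.h>

/* Assumption: 4 KiB pages, as on x86. */
#define SKETCH_PAGE_SIZE 4096UL
/* Number of unsigned long entries that fit in one page. */
#define SKETCH_FPP (SKETCH_PAGE_SIZE / sizeof(unsigned long))

/* Hypothetical helper type: where a pfn's entry sits in the three levels. */
struct p2m_slot {
    unsigned long top;   /* index into the frame_list_list page */
    unsigned long mid;   /* index into the selected frame_list page */
    unsigned long entry; /* index into the selected p2m page */
};

/* Split a pfn into the three indexes; pfn must be below max_pfn. */
static struct p2m_slot p2m_slot_for_pfn(unsigned long pfn, unsigned long max_pfn)
{
    struct p2m_slot s;
    unsigned long p2m_page;

    assert(pfn < max_pfn);

    p2m_page = pfn / SKETCH_FPP;      /* which page of the p2m array */
    s.top    = p2m_page / SKETCH_FPP; /* which frame_list page describes it */
    s.mid    = p2m_page % SKETCH_FPP; /* slot inside that frame_list page */
    s.entry  = pfn % SKETCH_FPP;      /* slot inside the p2m page itself */
    return s;
}
```

For example, on a build where unsigned long is 4 bytes, SKETCH_FPP is 1024, so pfn 1024 lands in top 0, mid 1, entry 0.
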
Each domain's xen_start_info structure has the mfn of the
"shared_info" structure:

    typedef struct start_info {
        /* THE FOLLOWING ARE FILLED IN BOTH ON INITIAL BOOT AND ON RESUME. */
        char magic[32];             /* "xen-<version>-<platform>". */
        unsigned long nr_pages;     /* Total pages allocated to this domain. */
        unsigned long shared_info;  /* MACHINE address of shared info struct. */
        uint32_t flags;             /* SIF_xxx flags. */
        unsigned long store_mfn;    /* MACHINE page number of shared page. */
        uint32_t store_evtchn;      /* Event channel for store communication. */
        unsigned long console_mfn;  /* MACHINE address of console page. */
        uint32_t console_evtchn;    /* Event channel for console messages. */
        /* THE FOLLOWING ARE ONLY FILLED IN ON INITIAL BOOT (NOT RESUME). */
        unsigned long pt_base;      /* VIRTUAL address of page directory. */
        unsigned long nr_pt_frames; /* Number of bootstrap p.t. frames. */
        unsigned long mfn_list;     /* VIRTUAL address of page-frame list. */
        unsigned long mod_start;    /* VIRTUAL address of pre-loaded module. */
        unsigned long mod_len;      /* Size (bytes) of pre-loaded module. */
        int8_t cmd_line[MAX_GUEST_CMDLINE];
    } start_info_t;

and the shared_info structure contains the arch_shared_info structure
shown above.

So it would appear that given the mfn of the xen_start_info structure
of a domain, then the pfn_to_mfn_frame_list_list could be tracked down.
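
The chain could be followed roughly as in the sketch below. It is only an illustration under several assumptions: the dump is treated as a flat image in which a machine address is simply a byte offset (a real vmcore would need its ELF segments translated first), pages are 4 KiB, the dump's word size matches the tool's, and the offset of the arch member inside shared_info (arch_offset) is supplied by the caller, since the shared_info layout is not shown here. All of the sk_* names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define SK_PAGE_SHIFT 12                    /* assumption: 4 KiB pages */
#define SK_PAGE_SIZE  (1UL << SK_PAGE_SHIFT)
#define SK_FPP        (SK_PAGE_SIZE / sizeof(unsigned long))

/* Hypothetical flat view of machine memory: machine address == offset. */
struct sk_image {
    const unsigned char *base;
    unsigned long long size;
};

/* Read one unsigned long at a machine address; 0 on success, -1 if out of range. */
static int sk_read_ulong(const struct sk_image *img, unsigned long long maddr,
                         unsigned long *out)
{
    if (maddr > img->size || img->size - maddr < sizeof(*out))
        return -1;
    memcpy(out, img->base + maddr, sizeof(*out));
    return 0;
}

/* Read entry idx of the page whose machine frame number is mfn. */
static int sk_read_slot(const struct sk_image *img, unsigned long mfn,
                        unsigned long idx, unsigned long *out)
{
    unsigned long long maddr = ((unsigned long long)mfn << SK_PAGE_SHIFT)
                             + (unsigned long long)idx * sizeof(unsigned long);
    return sk_read_ulong(img, maddr, out);
}

/*
 * Starting from start_info.shared_info (a machine address), fetch max_pfn
 * and pfn_to_mfn_frame_list_list, which are the first two members of
 * arch_shared_info. arch_offset is a caller-supplied assumption.
 */
static int sk_find_fll(const struct sk_image *img,
                       unsigned long long shared_info_maddr,
                       unsigned long arch_offset,
                       unsigned long *max_pfn, unsigned long *fll_mfn)
{
    unsigned long long arch = shared_info_maddr + arch_offset;

    if (sk_read_ulong(img, arch, max_pfn))
        return -1;
    return sk_read_ulong(img, arch + sizeof(unsigned long), fll_mfn);
}

/* Walk frame_list_list -> frame_list -> p2m page to translate one pfn. */
static int sk_pfn_to_mfn(const struct sk_image *img, unsigned long fll_mfn,
                         unsigned long max_pfn, unsigned long pfn,
                         unsigned long *mfn)
{
    unsigned long p2m_page, fl_mfn, p2m_mfn;

    if (pfn >= max_pfn)
        return -1;
    p2m_page = pfn / SK_FPP;
    if (sk_read_slot(img, fll_mfn, p2m_page / SK_FPP, &fl_mfn))
        return -1;
    if (sk_read_slot(img, fl_mfn, p2m_page % SK_FPP, &p2m_mfn))
        return -1;
    return sk_read_slot(img, p2m_mfn, pfn % SK_FPP, mfn);
}
```

Nothing here needs a cr3: the only inputs are the shared_info machine address and the layout of the structures, which is the appeal of this route.
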
Dave