[Crash-utility] x86 remap allocator in kernel 3.0
Dave Anderson
anderson at redhat.com
Tue Jan 10 19:24:58 UTC 2012
----- Original Message -----
> Hi folks,
>
> I've just discovered that the crash utility fails to initialize the vm
> subsystem properly on our latest SLES 32-bit kernels. It turns out that our
> kernels are compiled with CONFIG_DISCONTIGMEM=y, which causes pgdat structs to
> be allocated by the remap allocator (cf. arch/x86/mm/numa_32.c and also the
> code in setup_node_data).
>
> If you don't know what the remap allocator is (like I didn't before I hit the
> bug), it's a very special early-boot allocator which remaps physical pages
> from low memory to high memory, giving them virtual addresses from the
> identity mapping. Looks a bit like this:
>
>                        physical addr
>                        +------------+
>                        |            |
>                        +------------+
>                   +--> |  KVA RAM   |
>                   |    +------------+
>                   |    |            |
>                   |    \/\/\/\/\/\/\/
>                   |    /\/\/\/\/\/\/\
>                   |    |            |
>   virtual addr    |    |  highmem   |
>  +------------+   |    |------------|
>  |            | -----> |            |
>  +------------+   |    +------------+
>  |  remap va  | --+    |   KVA PG   | (unused)
>  +------------+        +------------+
>  |            |        |            |
>  |            | -----> | RAM bottom |
>  +------------+        +------------+
>
> This breaks a very basic assumption that crash makes about low-memory virtual
> addresses.
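
To make the broken assumption concrete, here is a minimal sketch of a low-memory translation that first checks a table of remapped regions and only then falls back to the usual "subtract PAGE_OFFSET" rule. The struct, the function name and the PAGE_OFFSET value (the default 3G/1G split) are illustrative assumptions; none of them are taken from crash or from the attached patch.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_OFFSET 0xc0000000U  /* assumption: default 3G/1G split */

/* Hypothetical description of one region set up by the remap allocator. */
struct remap_region {
    uint32_t va_start;    /* first remapped low-memory virtual address */
    uint32_t va_end;      /* one past the last remapped virtual address */
    uint64_t phys_start;  /* physical address where the data really lives */
};

/* Translate a low-memory kernel virtual address to a physical address. */
static uint64_t lowmem_vtop(uint32_t va, const struct remap_region *r, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (va >= r[i].va_start && va < r[i].va_end)
            return r[i].phys_start + (va - r[i].va_start);
    }
    /* Not remapped: the usual identity-style mapping applies. */
    return (uint64_t)va - PAGE_OFFSET;
}
```

With an empty table the function behaves like the plain identity-style translation, which is the behaviour that goes wrong for a remapped pgdat.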
Hmmm, yeah, I am also unaware of this, and I'm not entirely clear based upon
your explanation. What do "KVA PG" and "KVA RAM" mean exactly? And do just
the pgdat structures (which I know can be huge) get moved from low to high
physical memory (per-node perhaps), and then remapped with mapped virtual
addresses?
Anyway, I trust you know what you're doing...
>
> The attached patch fixes the issue for me, but may not be the cleanest method
> to handle these mappings.
Anyway, what I can't wrap my head around is that the initialization sequence
is being done by the first call to x86_kvtop_PAE(), which calls x86_kvtop_remap(),
which calls initialize_remap(), which calls readmem(), which calls x86_kvtop_PAE(),
starting the whole thing over again. How does that recursion work? Would it be
possible to call initialize_remap() earlier on instead of doing it upon the first
kvtop() call?
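
For illustration only, one common way to make this kind of lazy initialization safe against re-entry is a three-state flag: while initialization is in progress, nested translations skip the remap lookup and use the plain mapping. The sketch below is an assumption about how such a guard could look, not a description of what the patch does; every name and address in it is hypothetical, and fake_readmem() merely stands in for readmem().

```c
#include <stdint.h>

#define PAGE_OFFSET 0xc0000000U  /* assumption: default 3G/1G split */

enum remap_state { REMAP_UNINIT, REMAP_IN_PROGRESS, REMAP_READY };

static enum remap_state state = REMAP_UNINIT;
static int init_calls;                /* counts how often init_remap() ran */
static uint32_t remap_va_start, remap_va_end;
static uint64_t remap_phys_start;

static uint64_t kvtop(uint32_t va);

/* Stand-in for readmem(): all it does here is translate the address. */
static uint64_t fake_readmem(uint32_t va)
{
    return kvtop(va);
}

static void init_remap(void)
{
    init_calls++;
    state = REMAP_IN_PROGRESS;        /* nested kvtop() calls skip the lookup */
    (void)fake_readmem(0xc0400000U);  /* re-enters kvtop() without recursing */
    remap_va_start = 0xf4a00000U;     /* made-up region for the example */
    remap_va_end = 0xf4c00000U;
    remap_phys_start = 0x7fe00000ULL;
    state = REMAP_READY;
}

static uint64_t kvtop(uint32_t va)
{
    if (state == REMAP_UNINIT)
        init_remap();
    if (state == REMAP_READY && va >= remap_va_start && va < remap_va_end)
        return remap_phys_start + (va - remap_va_start);
    return (uint64_t)va - PAGE_OFFSET;
}
```

Calling the initialization explicitly at startup, as suggested above, would remove the need for the in-progress state as long as nothing is translated through the remap table before it has been filled in.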
Dave
>
> Ken'ichi Ohmichi, please note that makedumpfile is also affected by this
> deficiency. On my test system, it will fail to produce any output if I set
> dump level to anything greater than zero:
>
> makedumpfile -c -d 31 -x vmlinux-3.0.13-0.5-pae.debug vmcore kdump.31
> readmem: Can't convert a physical address(34a012b4) to offset.
> readmem: type_addr: 0, addr:f4a012b4, size:4
> get_mm_discontigmem: Can't get node_start_pfn.
>
> makedumpfile Failed.
>
> However, fixing this for makedumpfile is harder, and it will most likely
> require a few more lines in VMCOREINFO, because debug symbols may not be
> available at dump time, and I can't see any alternative method to locate the
> remapped regions.
>
> Regards,
> Petr Tesarik
> SUSE Linux
>
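
As a small worked check of the makedumpfile output quoted above: assuming the default PAGE_OFFSET of 0xc0000000 (an assumption, since the kernel configuration is not shown), the plain low-memory translation of the failing virtual address gives exactly the physical address printed in the error. That suggests the address was translated without taking the remapped region into account. The function name is illustrative.

```c
#include <stdint.h>

#define PAGE_OFFSET 0xc0000000U  /* assumption: default 3G/1G split */

/* Plain low-memory translation that knows nothing about remapped regions. */
static uint32_t naive_vtop(uint32_t va)
{
    return va - PAGE_OFFSET;
}
```

Here naive_vtop(0xf4a012b4) evaluates to 0x34a012b4, the value reported in the "Can't convert a physical address" line.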