[Crash-utility] [PATCH v1] arm64: fix kernel memory map handling for kaslr-enabled
AKASHI Takahiro
takahiro.akashi at linaro.org
Mon May 23 09:14:50 UTC 2016
On Fri, May 20, 2016 at 03:06:39PM -0400, Dave Anderson wrote:
>
>
> Hi Takahiro,
>
> Welcome to the mailing list -- you are a most valuable addition.
>
> To others in the list, Takahiro and I have been communicating offline
> for a couple weeks, and I convinced him to join us. He works on both
> kexec-tools and the crash utility for Linaro on the ARM64 architecture.
> If you are unaware, in Linux 4.6 there was a major change in the ARM64
> virtual memory layout, complicated by the layering of KASLR on top of it.
> The new VM has broken crash utility support completely, and Takahiro is
> tackling both.
>
> My comments and questions on the v1 patch follow...
>
> ----- Original Message -----
> > Hi,
> >
> > This patch is still rough-edged, but please review it;
> > any comments are very welcome.
> > I will try to fix the known issues before I submit a new
> > version of kexec/kdump patch for v4.7 merge window.
> >
> > Thanks,
> > -Takahiro AKASHI
> >
> > ===8<===
> > From fdc7c881d98ef00ed1ff38a058b4913a1d5bcda6 Mon Sep 17 00:00:00 2001
> > From: AKASHI Takahiro <takahiro.akashi at linaro.org>
> > Date: Mon, 16 May 2016 17:31:55 +0900
> > Subject: [PATCH v1] arm64: fix kernel memory map handling for kaslr-enabled
> > kernel
> >
> > In kernel v4.6, Kernel ASLR (KASLR) is supported on arm64, and the start
> > address of the kernel image can be randomized if CONFIG_RANDOMIZE_BASE is enabled.
> > Even worse, the kernel image is no longer mapped in the linear mapping,
> > but in the vmalloc area (i.e. below PAGE_OFFSET).
> >
> > Now, according to the kernel's memory.h, converting a virtual address to
> > a physical address should be done like below:
> >
> > #define __virt_to_phys(x) ({                                          \
> >     phys_addr_t __x = (phys_addr_t)(x);                               \
> >     __x & BIT(VA_BITS - 1) ? (__x & ~PAGE_OFFSET) + PHYS_OFFSET :     \
> >                              (__x - kimage_voffset); })
> >
> > Please note that PHYS_OFFSET is no longer equal to the start address of
> > the first usable memory block in System RAM due to the fact mentioned
> > above.
>
> So it is no longer possible to use /proc/iomem if KASLR is enabled
> on a live system? That being the case, we need a way for root to
> be able to determine what it is for live system analysis.
Now that PHYS_OFFSET is defined as "memstart_addr", we can obtain its
value on a live system as long as that symbol is accessible.
> >
> > This patch addresses this change and allows the crash utility to access
> > memory contents with correct addresses.
> > *However*, it is still rough-edged and we need more refinements.
> >
> > 1) Currently we have to specify the following command line:
> > $ crash --kaslr auto
> > --machdep phys_offset=XXX
> > --machdep kimage_voffset=YYY
> > <vmlinux> <vmcore>
>
> I presume that live system analysis requires the same extra arguments at this point?
I'm still investigating a live-system case.
>
> > To remove these extra options, I'm planning to enhance my kdump patch
> > so that /proc/vmcore holds the necessary information for analyzing the
> > contents of kernel memory.
> >
> > 2) My current change breaks backward compatibility.
> > Crash won't work with v4.5 or earlier kernels, partly because I changed
> > the calculation of VA_BITS. (I think that the existing code is wrong.)
> > So far I don't have enough time to look into this issue as I'm focusing
> > on KASLR support.
>
> I don't understand why you think the existing code is wrong. It is
> simply trying to determine what the value of CONFIG_ARM64_VA_BITS is.
> For example, we (Red Hat) are currently using 4.5.0 based kernel, which
> has this configuration:
>
> config-arm64:CONFIG_ARM64_VA_BITS=48
>
> And that's what crash-7.1.5 calculates:
>
> crash> sys | grep RELEASE
> RELEASE: 4.5.0-0.38.el7.aarch64
> crash> help -m
> ... [ cut ] ...
> VA_BITS: 48
> userspace_top: 0001000000000000
> page_offset: ffff800000000000
> vmalloc_start_addr: ffff000000000000
> vmalloc_end: ffff7fbfdffeffff
> modules_vaddr: ffff7ffffc000000
> modules_end: ffff7fffffffffff
> vmemmap_vaddr: ffff7fbfe0000000
> vmemmap_end: ffff7fffdfffffff
> phys_offset: 4000000000
> ...
>
> I'm also presuming that your VA_BITS-related changes are what break
> backwards-compatibility, because if I run "crash -d1" with your patch
> applied, I see this debug output on the same kernel above:
>
> VA_BITS: 47
>
> On the other hand, if I run your patch on a Fedora 4.6.0-0.rc7.git2.1.fc25
> live kernel, it comes up OK, and shows this:
>
> VA_BITS: 42
> userspace_top: 0000040000000000
> page_offset: fffffe0000000000
I still haven't had time to look into this issue, but
as far as those three values are concerned, they are consistent,
and VA_BITS seems to be correct.
> vmalloc_start_addr: fffffc0008000000
> vmalloc_end: fffffdff5ffeffff
> modules_vaddr: fffffc0000000000
> modules_end: fffffc0007ffffff
> vmemmap_vaddr: fffffdff80000000
> vmemmap_end: fffffdffffffffff
> phys_offset: 4000000000
>
> KASLR is NOT configured, so perhaps that figures into why it works?
>
>
> > 3) Backtracing a 'panic'ed task fails:
> > crash> bt
> > PID: 999 TASK: ffffffe960081800 CPU: 0 COMMAND: "sh"
> > bt: WARNING: corrupt prstatus? pstate=0x80000000, but no user frame found
> > bt: WARNING: cannot determine starting stack frame for task ffffffe960081800
> >
> > frame->pc indicates that it is a kernel address, so obviously something
> > is wrong. I'm digging into it.
This seems to be a bug in my kdump patch.
It mistakenly(?) returns spsr_el1 for regs->pstate instead of the current
pstate, but we can't read the current pstate directly on arm64.
> OK...
>
> First, before posting, please kick off a "make warn" compilation in order to clean up
> things like this:
>
> $ make warn
> ... [ cut ] ...
> cc -c -g -DARM64 -DLZO -DSNAPPY -DGDB_7_6 arm64.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security
> arm64.c:78:1: warning: no previous prototype for 'arm64_VTOP' [-Wmissing-prototypes]
> arm64_VTOP(ulong addr)
> ^
> arm64.c: In function 'arm64_parse_cmdline_args':
> arm64.c:635:8: warning: unused variable 'value' [-Wunused-variable]
> ulong value = 0;
> ^
> arm64.c:631:19: warning: unused variable 'err' [-Wunused-variable]
> int index, i, c, err;
> ^
> arm64.c: In function 'arm64_init':
> arm64.c:2539:17: warning: 'vmalloc_end' may be used uninitialized in this function [-Wmaybe-uninitialized]
> vmemmap_start = vmalloc_end + SZ_64K;
> ^
> arm64.c:2510:8: note: 'vmalloc_end' was declared here
> ulong vmalloc_end;
> ...
>
> As I mentioned in one of our earlier conversations, "vmalloc_end" is most definitely
> being used uninitialized.
Fixed all those warnings.
> Patch comments are intermingled below:
>
>
> > Signed-off-by: AKASHI Takahiro <takahiro.akashi at linaro.org>
> > ---
> > arm64.c | 165
> > +++++++++++++++++++++++++++++++++++++++++++++-----------------
> > defs.h | 22 ++++++---
> > main.c | 7 +--
> > symbols.c | 10 ++--
> > 4 files changed, 146 insertions(+), 58 deletions(-)
> >
> > diff --git a/arm64.c b/arm64.c
> > index 34c8c59..22ddade 100644
> > --- a/arm64.c
> > +++ b/arm64.c
> > @@ -22,6 +22,8 @@
> > #include <endian.h>
> >
> > #define NOT_IMPLEMENTED(X) error((X), "%s: function not implemented\n",
> > __func__)
> > +/* FIXME: temporarily used in kt->flags2 */
> > +#define ARM64_NEW_VMEMMAP 0x80000000
>
> Since ARM64_NEW_VMEMMAP is only used in arm64.c, please put it
> in the machdep->flags field (under IRQ_STACKS):
Yes.
>
> >
> > static struct machine_specific arm64_machine_specific = { 0 };
> > static int arm64_verify_symbol(const char *, ulong, char);
> > @@ -72,6 +74,21 @@ static int arm64_get_crash_notes(void);
> > static void arm64_calc_VA_BITS(void);
> > static int arm64_is_uvaddr(ulong, struct task_context *);
> >
> > +ulong
> > +arm64_VTOP(ulong addr)
> > +{
> > + if (!(kt->flags2 & ARM64_NEW_VMEMMAP) ||
> > + (addr >= machdep->machspec->page_offset)) {
> > + return machdep->machspec->phys_offset
> > + + (addr - machdep->machspec->page_offset);
> > + } else {
> > + if (machdep->machspec->kimage_voffset)
> > + return addr - machdep->machspec->kimage_voffset;
> > + else /* no randomness */
> > + return machdep->machspec->phys_offset
> > + + (addr - machdep->machspec->vmalloc_start_addr);
> > + }
> > +}
> >
> > /*
> > * Do all necessary machine-specific setup here. This is called several
> > times
> > @@ -105,6 +122,10 @@ arm64_init(int when)
> > break;
> >
> > case PRE_GDB:
> > + /* FIXME: Is kimage_voffset appropriate? */
> > + if (kernel_symbol_exists("kimage_voffset"))
> > + kt->flags2 |= ARM64_NEW_VMEMMAP;
> > +
>
> As far as I can tell, "kimage_voffset" is perfect.
Yes, I will remove the "FIXME" comment.
>
> > if (!machdep->pagesize) {
> > /*
> > * Kerneldoc Documentation/arm64/booting.txt describes
> > @@ -160,16 +181,33 @@ arm64_init(int when)
> > machdep->pagemask = ~((ulonglong)machdep->pageoffset);
> >
> > arm64_calc_VA_BITS();
> > - machdep->machspec->page_offset = ARM64_PAGE_OFFSET;
> > + ms = machdep->machspec;
> > + ms->page_offset = ARM64_PAGE_OFFSET;
> > machdep->identity_map_base = ARM64_PAGE_OFFSET;
> > - machdep->machspec->userspace_top = ARM64_USERSPACE_TOP;
> > - machdep->machspec->modules_vaddr = ARM64_MODULES_VADDR;
> > - machdep->machspec->modules_end = ARM64_MODULES_END;
> > - machdep->machspec->vmalloc_start_addr = ARM64_VMALLOC_START;
> > - machdep->machspec->vmalloc_end = ARM64_VMALLOC_END;
> > - machdep->kvbase = ARM64_VMALLOC_START;
> > - machdep->machspec->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> > - machdep->machspec->vmemmap_end = ARM64_VMEMMAP_END;
> > + ms->userspace_top = ARM64_USERSPACE_TOP;
> > + if (kt->flags2 & ARM64_NEW_VMEMMAP) {
> > + struct syment *sp;
> > +
> > + sp = kernel_symbol_search("_text");
> > + ms->kimage_text = (sp ? sp->value : 0);
> > + sp = kernel_symbol_search("_end");
> > + ms->kimage_end = (sp ? sp->value : 0);
> > +
> > + ms->modules_vaddr = ARM64_VA_START;
> > + if (kernel_symbol_exists("kasan_init"))
> > + ms->modules_vaddr += ARM64_KASAN_SHADOW_SIZE;
> > + ms->modules_end = ms->modules_vaddr
> > + + ARM64_MODULES_VSIZE -1;
> > +
> > + ms->vmalloc_start_addr = ms->modules_end + 1;
> > + } else {
> > + ms->modules_vaddr = ARM64_PAGE_OFFSET - MEGABYTES(64);
> > + ms->modules_end = ARM64_PAGE_OFFSET - 1;
> > + ms->vmalloc_start_addr = ARM64_VA_START;
> > + }
> > + ms->vmalloc_end = ARM64_VMALLOC_END;
> > + ms->vmemmap_vaddr = ARM64_VMEMMAP_VADDR;
> > + ms->vmemmap_end = ARM64_VMEMMAP_END;
> >
> > switch (machdep->pagesize)
> > {
> > @@ -543,6 +581,42 @@ arm64_dump_machdep_table(ulong arg)
> > }
> > }
>
> Although you haven't changed arm64_dump_machdep_table(), this reminds
> me: can you please display the 3 new machine_specific fields in
> arm64_dump_machdep_table(). And also the ARM64_NEW_VMEMMAP bit after
> you move it into the flags field? ("help -m" is your friend...)
Yes.
>
> > +static int
> > +arm64_parse_machdep_arg_l(char *argstring, char *param, ulong *value)
> > +{
> > + int len;
> > + int megabytes = FALSE;
> > + char *p;
> > +
> > + len = strlen(param);
> > + if (!STRNEQ(argstring, param) || (argstring[len] != '='))
> > + return FALSE;
> > +
> > + if ((LASTCHAR(argstring) == 'm') ||
> > + (LASTCHAR(argstring) == 'M')) {
> > + LASTCHAR(argstring) = NULLCHAR;
> > + megabytes = TRUE;
> > + }
> > +
> > + p = argstring + len + 1;
> > + if (strlen(p)) {
> > + int flags = RETURN_ON_ERROR | QUIET;
> > + int err = 0;
> > +
> > + if (megabytes) {
> > + *value = dtol(p, flags, &err);
> > + if (!err)
> > + *value = MEGABYTES(*value);
> > + } else {
> > + *value = htol(p, flags, &err);
> > + }
> > +
> > + if (!err)
> > + return TRUE;
> > + }
> > +
> > + return FALSE;
> > +}
> >
> > /*
> > * Parse machine dependent command line arguments.
> > @@ -580,39 +654,23 @@ arm64_parse_cmdline_args(void)
> > c = parse_line(buf, arglist);
> >
> > for (i = 0; i < c; i++) {
> > - err = 0;
> > -
> > - if (STRNEQ(arglist[i], "phys_offset=")) {
> > - int megabytes = FALSE;
> > - int flags = RETURN_ON_ERROR | QUIET;
> > -
> > - if ((LASTCHAR(arglist[i]) == 'm') ||
> > - (LASTCHAR(arglist[i]) == 'M')) {
> > - LASTCHAR(arglist[i]) = NULLCHAR;
> > - megabytes = TRUE;
> > - }
> > -
> > - p = arglist[i] + strlen("phys_offset=");
> > - if (strlen(p)) {
> > - if (megabytes)
> > - value = dtol(p, flags, &err);
> > - else
> > - value = htol(p, flags, &err);
> > - }
> > -
> > - if (!err) {
> > - if (megabytes)
> > - value = MEGABYTES(value);
> > -
> > - machdep->machspec->phys_offset = value;
> > -
> > - error(NOTE,
> > - "setting phys_offset to: 0x%lx\n\n",
> > - machdep->machspec->phys_offset);
> > + if (arm64_parse_machdep_arg_l(arglist[i],
> > + "phys_offset",
> > + &machdep->machspec->phys_offset)) {
> > + error(NOTE,
> > + "setting phys_offset to: 0x%lx\n\n",
> > + machdep->machspec->phys_offset);
> > +
> > + machdep->flags |= PHYS_OFFSET;
> > + continue;
> > + } else if (arm64_parse_machdep_arg_l(arglist[i],
> > + "kimage_voffset",
> > + &machdep->machspec->kimage_voffset)) {
> > + error(NOTE,
> > + "setting kimage_voffset to: 0x%lx\n\n",
> > + machdep->machspec->kimage_voffset);
> >
> > - machdep->flags |= PHYS_OFFSET;
> > - continue;
> > - }
> > + continue;
> > }
> >
> > error(WARNING, "ignoring --machdep option: %s\n",
> > @@ -2377,6 +2435,11 @@ arm64_IS_VMALLOC_ADDR(ulong vaddr)
> > {
> > struct machine_specific *ms = machdep->machspec;
> >
> > + if ((kt->flags2 & ARM64_NEW_VMEMMAP) &&
> > + (vaddr >= machdep->machspec->kimage_text) &&
> > + (vaddr <= machdep->machspec->kimage_end))
> > + return FALSE;
> > +
> > return ((vaddr >= ms->vmalloc_start_addr && vaddr <=
> > ms->vmalloc_end) ||
> > ((machdep->flags & VMEMMAP) &&
> > (vaddr >= ms->vmemmap_vaddr && vaddr <= ms->vmemmap_end))
> > ||
> > @@ -2407,7 +2470,11 @@ arm64_calc_VA_BITS(void)
> >
> > for (bitval = highest_bit_long(value); bitval; bitval--) {
> > if ((value & (1UL << bitval)) == 0) {
> > +#if 1
> > + machdep->machspec->VA_BITS = bitval + 1;
> > +#else /* FIXME: bug? */
> > machdep->machspec->VA_BITS = bitval + 2;
> > +#endif
>
> Yeah, the new scheme needs further investigation...
>
>
> > break;
> > }
> > }
> > @@ -2459,10 +2526,20 @@ arm64_calc_virtual_memory_ranges(void)
> > break;
> > }
> >
> > - vmemmap_size = ALIGN((1UL << (ms->VA_BITS - machdep->pageshift)) *
> > SIZE(page), PUD_SIZE);
> > + if (kt->flags2 & ARM64_NEW_VMEMMAP) {
> > +#define STRUCT_PAGE_MAX_SHIFT 6
> > + vmemmap_size = 1UL << (ms->VA_BITS - machdep->pageshift - 1
> > + + STRUCT_PAGE_MAX_SHIFT);
> > +
> > + vmemmap_start = ms->page_offset - vmemmap_size;
> > + vmemmap_end = ms->page_offset;
> > + } else {
> > + vmemmap_size = ALIGN((1UL << (ms->VA_BITS - machdep->pageshift)) *
> > SIZE(page), PUD_SIZE);
> > +
> > + vmemmap_start = vmalloc_end + SZ_64K;
> > + vmemmap_end = vmemmap_start + vmemmap_size;
> > + }
> > vmalloc_end = (ms->page_offset - PUD_SIZE - vmemmap_size - SZ_64K);
> > - vmemmap_start = vmalloc_end + SZ_64K;
> > - vmemmap_end = vmemmap_start + vmemmap_size;
> >
> > ms->vmalloc_end = vmalloc_end - 1;
> > ms->vmemmap_vaddr = vmemmap_start;
> > diff --git a/defs.h b/defs.h
> > index a09fa9a..40e02fc 100644
> > --- a/defs.h
> > +++ b/defs.h
> > @@ -2844,8 +2844,8 @@ typedef u64 pte_t;
> >
> > #define PTOV(X) \
> > ((unsigned
> > long)(X)-(machdep->machspec->phys_offset)+(machdep->machspec->page_offset))
> > -#define VTOP(X) \
> > - ((unsigned
> > long)(X)-(machdep->machspec->page_offset)+(machdep->machspec->phys_offset))
> > +
> > +#define VTOP(X) arm64_VTOP((ulong)(X))
> >
> > #define USERSPACE_TOP (machdep->machspec->userspace_top)
> > #define PAGE_OFFSET (machdep->machspec->page_offset)
> > @@ -2944,12 +2944,16 @@ typedef signed int s32;
> > * arch/arm64/include/asm/memory.h
> > * arch/arm64/include/asm/pgtable.h
> > */
> > -
> > -#define ARM64_PAGE_OFFSET ((0xffffffffffffffffUL) <<
> > (machdep->machspec->VA_BITS - 1))
> > +#define ARM64_VA_START ((0xffffffffffffffffUL) \
> > + << machdep->machspec->VA_BITS)
> > +#define ARM64_PAGE_OFFSET ((0xffffffffffffffffUL) \
> > + << (machdep->machspec->VA_BITS - 1))
> > #define ARM64_USERSPACE_TOP ((1UL) << machdep->machspec->VA_BITS)
> > -#define ARM64_MODULES_VADDR (ARM64_PAGE_OFFSET - MEGABYTES(64))
> > -#define ARM64_MODULES_END (ARM64_PAGE_OFFSET - 1)
> > -#define ARM64_VMALLOC_START ((0xffffffffffffffffUL) <<
> > machdep->machspec->VA_BITS)
> > +
> > +/* only used for v4.6 or later */
> > +#define ARM64_MODULES_VSIZE MEGABYTES(128)
> > +#define ARM64_KASAN_SHADOW_SIZE (1UL << (machdep->machspec->VA_BITS - 3))
> > +
> > /*
> > * The following 3 definitions are the original values, but are obsolete
> > * for 3.17 and later kernels because they are now build-time calculations.
> > @@ -3028,6 +3032,10 @@ struct machine_specific {
> > ulong kernel_flags;
> > ulong irq_stack_size;
> > ulong *irq_stacks;
> > + /* only needed for kaslr-enabled kernel */
> > + ulong kimage_voffset;
> > + ulong kimage_text;
> > + ulong kimage_end;
> > };
>
> Don't forget -- please display these in arm64_dump_machdep_table()! ;-)
Yes, I will.
> >
> > struct arm64_stackframe {
> > diff --git a/main.c b/main.c
> > index 821bb4e..3f24908 100644
> > --- a/main.c
> > +++ b/main.c
> > @@ -227,9 +227,10 @@ main(int argc, char **argv)
> > optarg);
> > }
> > } else if (STREQ(long_options[option_index].name, "kaslr")) {
> > - if (!machine_type("X86_64"))
> > - error(INFO, "--kaslr only valid "
> > - "with X86_64 machine type.\n");
> > + if (!machine_type("X86_64") &&
> > + !machine_type("ARM64"))
> > + error(INFO, "--kaslr not valid "
> > + "with this machine type.\n");
> > else if (STREQ(optarg, "auto"))
> > kt->flags2 |= (RELOC_AUTO|KASLR);
> > else {
> > diff --git a/symbols.c b/symbols.c
> > index a8d3563..df27793 100644
> > --- a/symbols.c
> > +++ b/symbols.c
> > @@ -593,7 +593,8 @@ kaslr_init(void)
> > {
> > char *string;
> >
> > - if (!machine_type("X86_64") || (kt->flags & RELOC_SET))
> > + if ((!machine_type("X86_64") && !machine_type("ARM64")) ||
> > + (kt->flags & RELOC_SET))
> > return;
> >
> > /*
> > @@ -712,7 +713,7 @@ store_symbols(bfd *abfd, int dynamic, void *minisyms,
> > long symcount,
> > if (machine_type("X86")) {
> > if (!(kt->flags & RELOC_SET))
> > kt->flags |= RELOC_FORCE;
> > - } else if (machine_type("X86_64")) {
> > + } else if (machine_type("X86_64") || machine_type("ARM64")) {
> > if ((kt->flags2 & RELOC_AUTO) && !(kt->flags & RELOC_SET))
> > derive_kaslr_offset(abfd, dynamic, from,
> > fromend, size, store);
> > @@ -783,7 +784,8 @@ store_sysmap_symbols(void)
> > error(FATAL, "symbol table namespace malloc: %s\n",
> > strerror(errno));
> >
> > - if (!machine_type("X86") && !machine_type("X86_64"))
> > + if (!machine_type("X86") && !machine_type("X86_64") &&
> > + !machine_type("ARM64"))
> > kt->flags &= ~RELOC_SET;
> >
> > first = 0;
> > @@ -4681,7 +4683,7 @@ value_search(ulong value, ulong *offset)
> > if ((sp = machdep->value_to_symbol(value, offset)))
> > return sp;
> >
> > - if (IS_VMALLOC_ADDR(value))
> > + if (IS_VMALLOC_ADDR(value))
> > goto check_modules;
> >
> > if ((sp = symval_hash_search(value)) == NULL)
> > --
> > 2.8.1
>
>
> That's all I've got for this pass.
Though I've not addressed all of your comments, I will send out
an updated patch tomorrow.
Thanks,
-Takahiro AKASHI
> Thanks again,
> Dave
>
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
--
Thanks,
-Takahiro AKASHI