[Crash-utility] [PATH v4 1/2] arm64: fix kernel memory map handling for kaslr-enabled kernel

Thu Jun 2 04:27:30 UTC 2016

On Wed, Jun 01, 2016 at 11:41:04AM -0400, Dave Anderson wrote:
> 
> ----- Original Message -----
> 
> > > >  
> > > >  	case PRE_GDB:
> > > > +		/* This check is somewhat redundant */
> > > > +		if (kernel_symbol_exists("kimage_voffset"))
> > > > +			machdep->flags |= NEW_VMEMMAP;
> > > > +
> > > 
> > > Certainly not redundant on a live system without CONFIG_RANDOMIZE_BASE.
> > 
> > Yeah, but for crashdump case, it's redundant.
> 
> I understand, but if you're going to add a comment, it should clarify
> rather than mystify...   ;-)

OK, how about this:
"Check for NEW_VMEMMAP again for a live system"

Or just remove it?

> 
> > > >  		arm64_calc_VA_BITS();
> > > > -		machdep->machspec->page_offset = ARM64_PAGE_OFFSET;
> > > > +		ms = machdep->machspec;
> > > > +		ms->page_offset = ARM64_PAGE_OFFSET;
> > > > +		/* FIXME: idmap for NEW_VMEMMAP */
> > > 
> > > What's the FIXME here?
> > 
> > Identity-mapping no longer starts from PAGE_OFFSET if KASLR.
> > Since machdep->identity_map_base is not used anywhere in crash,
> > I wonder how I should handle it.
> > 
> > > >  		machdep->identity_map_base = ARM64_PAGE_OFFSET;
> 
> You're right, machdep->identity_map_base is (currently) only used in 
> architecture-specific files, although arm64.c does not use it.  
> But what you have done above is correct.

So should I remove the comment?
> 
> > > > @@ -631,6 +718,19 @@ arm64_calc_phys_offset(void)
> > > >  	if (machdep->flags & PHYS_OFFSET) /* --machdep override */
> > > >  		return;
> > > >  
> > > > +	if (machdep->flags & NEW_VMEMMAP) {
> > > > +		struct syment *sp;
> > > > +		ulong value;
> > > > +
> > > > +		sp = kernel_symbol_search("memstart_addr");
> > > > +		if (sp && readmem(sp->value, KVADDR, (char *)&value,
> > > > +				sizeof(value), "memstart_addr",
> > > > +				QUIET|RETURN_ON_ERROR)) {
> > > > +			ms->phys_offset = value;
> > > > +			return;
> > > > +		}
> > > > +	}
> > > > +
> > > 
> > > As we've discussed before, I cannot accept the chicken-and-egg snippet
> > > above.
> > > 
> > > If machdep->machspec->kimage_voffset has not been determined, then the arm64_VTOP()
> > > call made by readmem() will utilize machdep->machspec->phys_offset -- which is what
> > > you're trying to initialize here.
> > 
> > Right, I was a bit confused because my patch basically had an assumption
> > that we have some sort of way to know kimage_voffset.
> > 
> > Meanwhile, if this hunk be moved after "if ACTIVE())" like,
> >     ms->phys_offset = 0;
> >     if (ACTIVE()) {
> >         read("/proc/iomem");
> >         ...
> >     } else {
> >         if (machdep->flags & NEW_VMEMMAP) {
> >             readmem("memstart_addr")
> >             ...
> >         } else if (DISKDUMP_DUMPFILE() ...) {
> >         } else if (KDUMP_DUMPFILE() ...) {
> >         }
> >     }
> > the crash command would still work for a live system if !KASLR.
> 
> Again, the readmem() is not necessary.
> 
> > 
> > > On my live system that has a phys_offset of 0x40000000000, the readmem() 
> > > will fail quietly, but that's not a acceptable usage of the RETURN_ON_ERROR
> > > failure mode, because "memstart_addr" is a legitimate virtual  address that
> > > should never fail.  Also, because the actual kernel phys_offset can be
> > > negative, it would seem to be entirely within the realm of possibility that
> > > the readmem() could mistakenly return successfully, but have read the wrong
> > > location.
> > > 
> > > So what it boils down to is that readmem() should NEVER be called until ALL of the
> > > pieces required by arm64_VTOP() have all been properly initialized, however
> > > that might be accomplished.  Not to mention that calling it this early sets  a
> > > dangerous precedent.
> > 
> > If you think so, we can add a specialized function like:
> >     arm64_readmem_kernel_symbol(char *, ...)
> 
> I don't think so -- the point is to avoid making a readmem() call this early.

I think that I fully understand your concerns here, but my point is:
- I give you a value of kimage_voffset in VMCOREINFO
- you know vmlinux and have vmcore
- you can and do determine kaslr_offset
- so why don't you identify PHYS_OFFSET by reading "memstart_addr"?

Using readmem() or not is a matter of implementation.
I think that we can use READMEM(), instead of readmem(), in
arm64_readme_kernel_symbol(), which is solely used, at least at the moment,
in arm64_calc_phys_offset().

Well, we may have a different story for makedumpfile as there seems to be
an assumption that we have only a vmcore file, neither vmlinux nor System.map.
So I suggested to Pratyush that he could post a patch later to add any
VMCOREINFO parameters that you wanted for makedumpfile.

> > > And in the case of kdump's ELF vmcore and compressed vmcore formats, there
> > > is an existing API between kdump and the crash utility to pass back the
> > > phys_base.  In the kexec-tool's makedumpfile.c file, there is the get_phys_base_arm64()
> > > function that
> > 
> > I can't find makedumpfile.c in kexec-tools.
>  
> Hold on, clarification below...
> 
> > 
> > > currently calculates the offset by using the PT_LOAD segments, and presumably will
> > > have to be modified to use new VMCOREINFO data.  But regardless of how it's one,
> > > the architecture-neutral write_kdump_header() laster copies that offset value into
> > > the kdump_sub_header.phys_base field for the crash utility to access.  Trying to do
> > > a readmem() this early in time is essentially breaking that API.
> > 
> > You may already notice that "phys_offset" may have two different meanings:
> >   * the start address of System RAM
> >     (PHYS_OFFSET on v4.5 or early, and __PHYS_OFFSET on v4.6 or later)
> >   * the value used in phys_to_virt()/virt_to_phys() for any address
> >     in linear-mapping
> >     (PHYS_OFFSET on v4.5 or early, and PHYS_OFFSET on v4.6 or later)
> > 
> > I think that current crash utility needs only the latter.
> > Please correct me if I misunderstand something.
> 
> Right, the __PHYS_OFFSET value seems to be the kernel entry point address, 
> but that is not used.  Crash needs the PHYS_OFFSET value to mimic the kernel's
> arm64 __virt_to_phys() function.  
> 
> Sorry about my reference to "kexec-tool's makedumpfile.c file", as I was
> talking about the Red Hat kexec-tools package, which also contains
> the makedumpfile facility.

I see.

> Makedumpfile is upstream in a couple places, where the most recently released 
> version (1.5.9) is here:
> 
>   https://sourceforge.net/projects/makedumpfile/files/makedumpfile/
> 
> Or the development repository is here:
> 
>   git://git.code.sf.net/p/makedumpfile/code
> 
> Regardless of which location, there are the generic get_phys_base() macros 
> here in "makedumpfile.h" for those architectures that need it:
> 
>   makedumpfile.h 814 #define get_phys_base() get_phys_base_arm64()
>   makedumpfile.h 826 #define get_phys_base() get_phys_base_arm()
>   makedumpfile.h 837 #define get_phys_base() stub_true()
>   makedumpfile.h 850 #define get_phys_base() get_phys_base_x86_64()
>   makedumpfile.h 861 #define get_phys_base() stub_true()
>   makedumpfile.h 871 #define get_phys_base() stub_true()
>   makedumpfile.h 882 #define get_phys_base() stub_true()
>   makedumpfile.h 894 #define get_phys_base() get_phys_base_ia64()
> 
> And where the arm64-specific function is in "arch/arm64.c":
>   
>   int
>   get_phys_base_arm64(void)
>   {
>           unsigned long phys_base = ULONG_MAX;
>           unsigned long long phys_start;
>           int i;
>           /*
>            * We resolve phys_base from PT_LOAD segments. LMA contains physical
>            * address of the segment, and we use the lowest start as
>            * phys_base.
>            */
>           for (i = 0; get_pt_load(i, &phys_start, NULL, NULL, NULL); i++) {
>                   if (phys_start < phys_base)
>                           phys_base = phys_start;
>           }
>   
>           if (phys_base == ULONG_MAX) {
>                   ERRMSG("Can't determine phys_base\n");
>                   return FALSE;
>           }
>   
>           info->phys_base = phys_base;
>   
>           DEBUG_MSG("phys_base    : %lx\n", phys_base);
>   
>           return TRUE;
>   }
>   
> Note that it currently looks for the lowest physical address that is 
> referenced in any of the PT_LOAD segments of /proc/vmcore.  As you have
> discussed for Linux 4.6, perhaps it will be necessary that the value 
> of PHYS_OFFSET ("memstart_addr") will have to be passed in a VMCOREINFO
> item.  It's not clear to me why PT_LOAD p_paddr values will no longer
> be usable, but that's up to the makedumpfile/kdump maintainers.  And
> certainly a convenient VMCOREINFO item would be ideal/preferable.
> 
> Anyway, later on, the info->phys_base value gets stored in the compressed
> kdump header in the write_kdump_header() function in "makedumpfile.c":
>   
>   int
>   write_kdump_header(void)
>   {
>           int ret = FALSE;
>           size_t size;
>           off_t offset_note, offset_vmcoreinfo;
>           unsigned long size_note, size_vmcoreinfo;
>           struct disk_dump_header *dh = info->dump_header;
>           struct kdump_sub_header kh;
>           char *buf = NULL;
>   
>   ... [ cut ] ... 
>   
>           /*
>            * Write sub header
>            */
>           size = sizeof(struct kdump_sub_header);
>           memset(&kh, 0, size);
>           /* 64bit max_mapnr_64 */
>           kh.max_mapnr_64 = info->max_mapnr;
>           kh.phys_base  = info->phys_base;
>           kh.dump_level = info->dump_level;
>   ...
>   
> And then in the crash utility source code, it simply gets accessed
> like so:
>   
>   int
>   diskdump_phys_base(unsigned long *phys_base)
>   {
>           if (KDUMP_CMPRS_VALID()) {
>                   *phys_base = dd->sub_header_kdump->phys_base;
>                   return TRUE;
>           }
>   
>           return FALSE;
>   }
>   
> Now, for ELF vmcores, the crash utility calls arm64_kdump_phys_base() 
> which calls arm_kdump_phys_base(), which simply mimics what the 
> makedumpfile.c's get_phys_base_arm64() does.  And that function may
> need to be changed for Linux 4.6 in order to access the new VMCOREINFO
> variable.

It sounds to me that, once VOMCOREINFO("PHYS_OFFSET") is added for
makedumpfile, crash utility wants to use it, too. Right?

If so, what about, say, VA_BITS which will presumably be needed as
VMCOREINFO later for makedumpfile but is calculated for now in
crash utility?

Please don't take me wrong. I, as an author of kdump for arm64, just want
to understand what parameters and why we want to have.

Thanks,
-Takahiro AKASHI

> Clear as mud?
> 
> Dave
> 
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility