[Crash-utility] [PATCH] x86_64: Make the conversion between 4level and 5level paging automatically

Dave Anderson anderson at redhat.com
Mon Jul 9 14:20:21 UTC 2018



----- Original Message -----

> Dear Dave,
> 
> At 07/06/2018 09:45 PM, Dave Anderson wrote:
> > 
> > 
> > ----- Original Message -----
> >> Currently, crash only enables support for the kernel's 5-level page tables
> >> when the command line option "--machdep vm=5level" is given. Since Linux 4.17,
> >> the same kernel can run with either 4-level or 5-level page tables, so this
> >> command line option does not work well.
> >>
> >> Use the "pgtable_l5_enabled" value from the vmcore to detect automatically
> >> whether the kernel is using 5-level page tables.
> > 
> > Hello Dou,
> > 
> > Presumably by the time arch_crash_save_vmcoreinfo calls pgtable_l5_enabled(),
> > things have been initialized appropriately, and so this should work OK for
> > kdump-generated vmcores.  But have you looked into how this should be accomplished
> > for live systems?  Since kernel commit 51be1335 reverts __pgtable_l5_enabled
> 
> I tested on a live system and it didn't work; "--machdep vm=5level" is still needed, as before.
> 
> > from being __initdata to __ro_after_init, would it be as simple as just reading
> > __pgtable_l5_enabled at POST_RELOC time?
> 
> Yes, I agree, but how can we read '__pgtable_l5_enabled' in
> crash? Is there a ready-made interface such as symbol_value() for
> symbol values?

Yes, symbol_value() will work correctly when machdep_init(POST_RELOC)
gets called.

> And it seems that reading it at POST_RELOC time is too late; it should
> happen earlier than PRE_GDB.

Since the "__pgtable_l5_enabled" symbol is a static data symbol located
in the __START_KERNEL_map region, x86_64_VTOP() only needs the kernel's 
"phys_base" value in order to translate the symbol value into a 
physical address:

  ulong x86_64_VTOP(ulong vaddr)
  {
          if (vaddr >= __START_KERNEL_map)
                  return ((vaddr) - (ulong)__START_KERNEL_map + machdep->machspec->phys_base);
          else
                  return ((vaddr) - PAGE_OFFSET);
  }
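
As a worked example of that arithmetic, here is a minimal standalone
sketch. The constants and the vtop_sketch() name are assumptions for
illustration only (crash determines the real values at runtime); the
direct-map base shown is the 4-level value in use at the time.

```c
#include <stdint.h>

/* Assumed values for illustration; crash derives the real ones at runtime. */
#define START_KERNEL_MAP   0xffffffff80000000ULL  /* stands in for __START_KERNEL_map */
#define PAGE_OFFSET_4LEVEL 0xffff880000000000ULL  /* assumed 4-level direct-map base */

/* Hypothetical standalone version of the translation done by x86_64_VTOP(). */
static uint64_t vtop_sketch(uint64_t vaddr, uint64_t phys_base)
{
        /* Kernel text/data mapping: offset into the image plus phys_base. */
        if (vaddr >= START_KERNEL_MAP)
                return vaddr - START_KERNEL_MAP + phys_base;

        /* Direct-mapped (unity-mapped) memory: subtract the direct-map base. */
        return vaddr - PAGE_OFFSET_4LEVEL;
}
```

With a phys_base of 0x1000000, a symbol at 0xffffffff82000000 would
translate to physical address 0x3000000. Only the second branch depends
on the paging mode, which is why a static data symbol can be translated
before the 4-level vs. 5-level decision has been made.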

So if the contents of "__pgtable_l5_enabled" are all that is needed,
I think you can do something like:

	case POST_RELOC:
+		if (!(machdep->flags & VM_5LEVEL) &&
+		    kernel_symbol_exists("__pgtable_l5_enabled")) {
+			int l5_enabled;
+			readmem(symbol_value("__pgtable_l5_enabled"), KVADDR,
+				&l5_enabled, sizeof(int), "__pgtable_l5_enabled",
+				FAULT_ON_ERROR);
+
+			if (l5_enabled) {
+				... execute the relevant section from PRE_GDB ...
+			}
+		}

which would be this section from PRE_GDB:

                case VM_5LEVEL:
                        machdep->machspec->userspace_top = USERSPACE_TOP_5LEVEL;
                        machdep->machspec->page_offset = PAGE_OFFSET_5LEVEL;
                        machdep->machspec->vmalloc_start_addr = VMALLOC_START_ADDR_5LEVEL;
                        machdep->machspec->vmalloc_end = VMALLOC_END_5LEVEL;
                        machdep->machspec->modules_vaddr = MODULES_VADDR_5LEVEL;
                        machdep->machspec->modules_end = MODULES_END_5LEVEL;
                        machdep->machspec->vmemmap_vaddr = VMEMMAP_VADDR_5LEVEL;
                        machdep->machspec->vmemmap_end = VMEMMAP_END_5LEVEL;
                        if (symbol_exists("vmemmap_populate"))
                                machdep->flags |= VMEMMAP;
                        machdep->machspec->physical_mask_shift = __PHYSICAL_MASK_SHIFT_5LEVEL;
                        machdep->machspec->pgdir_shift = PGDIR_SHIFT_5LEVEL;
                        machdep->machspec->ptrs_per_pgd = PTRS_PER_PGD_5LEVEL;
                        if ((machdep->machspec->p4d = (char *)malloc(PAGESIZE())) == NULL)
                                error(FATAL, "cannot malloc p4d space.");
                        machdep->machspec->last_p4d_read = 0;
                        machdep->uvtop = x86_64_uvtop_level4;  /* 5-level is optional per-task */
                }
                machdep->kvbase = (ulong)PAGE_OFFSET;
                machdep->identity_map_base = (ulong)PAGE_OFFSET;
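
One way to avoid duplicating that section might be to factor it into a
helper that both PRE_GDB and POST_RELOC can call. The following is only
a self-contained sketch using mock structures: every name in it
(mock_machdep, setup_5level, post_reloc_check, the *_ASSUMED constants)
is hypothetical, and only a few of the fields are shown.

```c
#include <stdlib.h>

/* All names and values below are hypothetical stand-ins, not crash's own. */
#define VM_5LEVEL_FLAG              0x1UL   /* stands in for VM_5LEVEL */
#define PGDIR_SHIFT_5LEVEL_ASSUMED  48
#define PTRS_PER_PGD_5LEVEL_ASSUMED 512
#define PAGE_SIZE_ASSUMED           4096

struct mock_machspec {
        unsigned long pgdir_shift;
        unsigned long ptrs_per_pgd;
        char *p4d;                      /* buffer for one p4d page */
        unsigned long last_p4d_read;
};

struct mock_machdep {
        unsigned long flags;
        struct mock_machspec spec;
};

/* Apply the 5-level settings once; returns 0 on success, -1 on malloc failure. */
static int setup_5level(struct mock_machdep *md)
{
        char *p4d = malloc(PAGE_SIZE_ASSUMED);

        if (p4d == NULL)
                return -1;

        md->spec.p4d = p4d;
        md->spec.last_p4d_read = 0;
        md->spec.pgdir_shift = PGDIR_SHIFT_5LEVEL_ASSUMED;
        md->spec.ptrs_per_pgd = PTRS_PER_PGD_5LEVEL_ASSUMED;
        md->flags |= VM_5LEVEL_FLAG;
        return 0;
}

/* Mirrors the POST_RELOC check: do nothing if already 5-level or not enabled. */
static int post_reloc_check(struct mock_machdep *md, int l5_enabled)
{
        if ((md->flags & VM_5LEVEL_FLAG) || !l5_enabled)
                return 0;
        return setup_5level(md);
}
```

Here l5_enabled stands for the value read from "__pgtable_l5_enabled";
a real version would also need to set the remaining address ranges and
update kvbase and identity_map_base afterwards.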

The only thing that I can think of that might be a problem is that the
readmem() of "__pgtable_l5_enabled" will need to get by this part
of x86_64_kvtop() in order to use x86_64_VTOP():

               if (!IS_VMALLOC_ADDR(kvaddr)) {
                        *paddr = x86_64_VTOP(kvaddr);
                        if (!verbose)
                                return TRUE;
               }

where IS_VMALLOC_ADDR() would still be using the 4-level addresses.
But that could be worked around some way.
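
As one possible shape for such a workaround (purely a sketch, not
something crash does today), the address could be recognized as part of
the kernel image mapping before the vmalloc check is consulted. The
helper name, the KERNEL_MAP_BASE constant and the 1 GiB image size used
in the example are all assumptions.

```c
#include <stdint.h>

/* Assumed base of the kernel image mapping (stands in for __START_KERNEL_map). */
#define KERNEL_MAP_BASE 0xffffffff80000000ULL

/*
 * Hypothetical helper: returns 1 if vaddr lies inside the kernel image
 * mapping, so it could be translated with only phys_base, independent of
 * the 4-level or 5-level vmalloc ranges; returns 0 otherwise.
 */
static int in_kernel_image(uint64_t vaddr, uint64_t image_size)
{
        return vaddr >= KERNEL_MAP_BASE &&
               (vaddr - KERNEL_MAP_BASE) < image_size;
}
```

For example, with an assumed image size of 1 GiB, 0xffffffff82345678
would be accepted while a vmalloc-range address such as
0xffffc90000000000 would not.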

Can you give that a test?

Thanks,
  Dave





> 
> Thanks,
> 	dou.
> 
> > 
> > Thanks,
> >    Dave
> > 
> >> Signed-off-by: Dou Liyang <douly.fnst at cn.fujitsu.com>
> >> ---
> >>   x86_64.c | 4 ++++
> >>   1 file changed, 4 insertions(+)
> >>
> >> diff --git a/x86_64.c b/x86_64.c
> >> index 6d1ae2f..be6164b 100644
> >> --- a/x86_64.c
> >> +++ b/x86_64.c
> >> @@ -203,6 +203,10 @@ x86_64_init(int when)
> >>   			machdep->machspec->kernel_image_size = dtol(string, QUIET, NULL);
> >>   			free(string);
> >>   		}
> >> +		if ((string = pc->read_vmcoreinfo("NUMBER(pgtable_l5_enabled)"))) {
> >> +			machdep->flags |= VM_5LEVEL;
> >> +			free(string);
> >> +		}
> >>   		if (SADUMP_DUMPFILE() || QEMU_MEM_DUMP_NO_VMCOREINFO() ||
> >>   		    VMSS_DUMPFILE())
> >>   			/* Need for calculation of kaslr_offset and phys_base */
> >> --
> >> 2.14.3
> >>
> >>
> >>
> >>
> > 
> > 
> > 
> 
> 
> 



