[Crash-utility] [PATCH] read physical memory layout from device tree (32-bit ARM)

Dave Anderson anderson at redhat.com
Thu Jul 2 15:03:17 UTC 2015



----- Original Message -----
> Hello Dave,
> here is patch mentioned in bug
> https://bugzilla.redhat.com/show_bug.cgi?id=1238151.
> It adds new option "--dtbmem" which reads physical memory layout from
> provided device tree and use it for kernel virtual
> to physical memory address translation. This is needed because some unusual
> boards have sparse memory and it is impossible
> to translate address using offset cause memory "holes" must be considered.
> Fail occurs during read of "mem_section" symbol.
> With this patch everything works ok. I use device tree parser based on
> 'fdtdump' utility from device tree compiler.
> 
> Of course it is just basic concept and code doesn't look good, but if this
> idea will be useful i'll improve it.


Hello Arseniy,

Just so the 32-bit ARM developers on this list can understand what this is
all about, here is the description from the bugzilla you filed:

  Description of problem: crash failed reading 'memory_section' symbol with sparse mem.

  How reproducible:
  I have board with sparse memory. When i'm trying to load coredump of this board,
  i get the following message:

  crash: read error: kernel virtual address: dc98f400  type: "memory section"

  This happens because 'arm_kvtop' translates virtual to physical just using offset,
  while mentioned board has the following physical memory layout(from, size):
  0x20000000, 0xf800000
  0x40000000, 0x2c000000
  0x84800000, 0x2b000000
  So when crash substracts offset from 0xdc98f400 it gets 0x3c98f400, which is
  out of physical addresses of coredump program segments. I've prepared simple
  patch which parses device tree in order to read these memory regions and use
  them instead of offset substraction during virtual to physical address
  translation.

So the problem is not with the CONFIG_SPARSEMEM per se, but rather with the
the crash utility's VTOP() macro for 32-bit ARM, which is this:

  #define PTOV(X) \
        ((unsigned long)(X)-(machdep->machspec->phys_base)+(machdep->kvbase))
  #define VTOP(X) \
        ((unsigned long)(X)-(machdep->kvbase)+(machdep->machspec->phys_base))

and where the machdep->machspec->phys_base value is determined during initialization
like so:

  (1) on live systems, by parsing /proc/iomem
  (2) in ELF kdump dumpfiles, from parsing the PT_LOAD segments, and selects
      the lowest "PhysAddr" value.
  (3) in compressed kdump dumpfiles, from the dumpile header
  (4) or the user must pass the offset with "--machdep phys_base=<addr>"

The crash utility's VTOP() is a direct reflection of how the 32-bit ARM kernel
does it, at least if the kernel is configured with CONFIG_FLATMEM:

  static inline phys_addr_t __virt_to_phys(unsigned long x)
  {
          return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
  }

Looking at the kernel sources, I can't understand exactly how it works
with multiple physical base addresses?  There is another __virt_to_phys()
that is constrained to CONFIG_PATCH_ARM_PHYS_VIRT, but it seems to 
depend on !SPARSEMEM?  Can you explain how the kernel does it?

So anyway, the problem is that crash does not support 32-bit ARM kernels configured
with CONFIG_SPARSEMEM which have *multiple* phys_base values.  Your patch adds a 
"--dtbmem <file>" option which reads the physical memory layout from a 
"provided device tree".   It parses that file, and creates an array of 
structures containing a physical-base-address/size pairs for each memory 
segment.

I don't know what "provided device tree" file means.  Where does the file come from?

Regardless of that, I have to believe that this cannot be accomplished in a much 
simpler manner.

Your dumpfile is an ELF vmcore.  Can you show me the output of "readelf -a"
on your vmcore?  It should reflect the physical address ranges you showed
above:

>  0x20000000, 0xf800000
>  0x40000000, 0x2c000000
>  0x84800000, 0x2b000000

I guessing that the PT_LOAD segments will precisely reflect the regions
above.  If that's true, why can't the information be gathered from the 
vmcore header as it does now?

Worse case, it seems that you should be able to create something like
a "--machdep phys_base=<addr:size,addr:size,addr:size>" option string 
that describes the multiple physical bases/sizes.

In any case, I have some initial comments about the patch.

First, can you please post your patches as attachments to your email?
It seems that it has been line-wrapped somewhere along the line.  I 
started fixing each complaint, but I give up:

  $ patch -p1 --dry-run < dtb.patch
  patching file Makefile
  patch: **** malformed patch at line 39: help.c task.c \

  $ vi +39 $dl/dtc.patch

    < fix line 39 >

  $ patch -p1 --dry-run < dtb.patch
  patching file Makefile
  patch: **** malformed patch at line 43: \

  $ vi +43 $dl/dtc.patch

    < fix line 43 >

  $ patch -p1 --dry-run < dtb.patch
  patching file Makefile
  Hunk #1 succeeded at 69 with fuzz 2 (offset -1 lines).
  patching file arm.c
  patch: **** malformed patch at line 77: PMD_TYPE_TABLE  1  #define PMD_TYPE_SECT_LPAE 1

  $

Your patch replaces the two VTOP() calls in arm.c with a new vtop_sparse()
function if this feature has been used:

 +              if (dyn_mems == NULL)
 +                      *paddr = VTOP(kvaddr);
 +              else
 +                      *paddr = vtop_sparse(kvaddr);

But the patch does not handle the cases where VTOP() is used by generic
crash code outside of arm.c.  And it doesn't handle PTOV() at all.

So what would have to be done is to change the ARM VTOP() and PTOV() 
macros in "defs.h" so that all callers will do the right thing.

The Makefile has declared mem_dtb.c and mem_dtb.o, but then compiles
it based upon "parse.o"?:

 +       ramdump.c vmware_vmss.c mem_dtb.c

 +       ramdump.o vmware_vmss.o mem_dtb.o

 +parse.o: ${GENERIC_HFILES} mem_dtb.c
 +       ${CC} -c ${CRASH_CFLAGS} mem_dtb.c ${WARNING_OPTIONS}
 ${WARNING_ERROR}
 +

I note this at the end of the new mem_dtb.c file:

 +#ifdef ARM
 ... [ cut ] ...
 +}
 +#else
 +int read_dtb_sparse_mem(const char *dtb_file_name) {
 +	error(INFO, "Sparse mem supported only for arm!\n");
 +	return 0;
 +}
 +#endif
 --

Kind of misleading -- it's not a problem with sparsemem, but
rather this device tree file business.  

But again, I appreciate the work you put into this, but please let's try
to avoid this additional file requirement.

Thanks,
  Dave





> 
> >From a634cf95aadff1a4372f066cca009332bcec2fc6 Mon Sep 17 00:00:00 2001
> From: Arseniy Krasnov <a.krasnov at samsung.com>
> Date: Wed, 1 Jul 2015 10:26:34 +0300
> Subject: [PATCH] crashtool: phys memory layout from device tree.
> 
> ---
>  Makefile  |   7 +-
>  arm.c     |  39 +++++++-
>  defs.h    |   9 ++
>  main.c    |   8 +-
>  mem_dtb.c | 299
> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  5 files changed, 354 insertions(+), 8 deletions(-)  create mode 100644
> mem_dtb.c
> 
> diff --git a/Makefile b/Makefile
> index 3c38ff5..e2188a5 100644
> --- a/Makefile
> +++ b/Makefile
> @@ -70,7 +70,7 @@ CFILES=main.c tools.c global_data.c memory.c filesys.c
> help.c task.c \
>  	unwind_x86_32_64.c unwind_arm.c \
>  	xen_hyper.c xen_hyper_command.c xen_hyper_global_data.c \
>  	xen_hyper_dump_tables.c kvmdump.c qemu.c qemu-load.c sadump.c ipcs.c
> \
> -	ramdump.c vmware_vmss.c
> +	ramdump.c vmware_vmss.c mem_dtb.c
>  
>  SOURCE_FILES=${CFILES} ${GENERIC_HFILES} ${MCORE_HFILES} \
>  	${REDHAT_CFILES} ${REDHAT_HFILES} ${UNWIND_HFILES} \ @@ -88,7 +88,7
> @@ OBJECT_FILES=main.o tools.o global_data.o memory.o filesys.o help.o
> task.o \
>  	unwind_x86_32_64.o unwind_arm.o \
>  	xen_hyper.o xen_hyper_command.o xen_hyper_global_data.o \
>  	xen_hyper_dump_tables.o kvmdump.o qemu.o qemu-load.o sadump.o ipcs.o
> \
> -	ramdump.o vmware_vmss.o
> +	ramdump.o vmware_vmss.o mem_dtb.o
>  
>  MEMORY_DRIVER_FILES=memory_driver/Makefile memory_driver/crash.c
> memory_driver/README
>  
> @@ -345,6 +345,9 @@ help.o: ${GENERIC_HFILES} help.c
>  memory.o: ${GENERIC_HFILES} memory.c
>  	${CC} -c ${CRASH_CFLAGS} memory.c ${WARNING_OPTIONS}
> ${WARNING_ERROR}
>  
> +parse.o: ${GENERIC_HFILES} mem_dtb.c
> +	${CC} -c ${CRASH_CFLAGS} mem_dtb.c ${WARNING_OPTIONS}
> ${WARNING_ERROR}
> +
>  test.o: ${GENERIC_HFILES} test.c
>  	${CC} -c ${CRASH_CFLAGS} test.c ${WARNING_OPTIONS} ${WARNING_ERROR}
>  
> diff --git a/arm.c b/arm.c
> index 534c501..c31e6a1 100644
> --- a/arm.c
> +++ b/arm.c
> @@ -85,6 +85,10 @@ static struct arm_pt_regs *panic_task_regs;  #define
> PMD_TYPE_TABLE  1  #define PMD_TYPE_SECT_LPAE 1
>  
> +/* number of memory areas and its array */ int mems_num; struct
> +mem_area *dyn_mems;
> +
>  static inline ulong *
>  pmd_page_addr(ulong pmd)
>  {
> @@ -1264,6 +1268,30 @@ arm_uvtop(struct task_context *tc, ulong uvaddr,
> physaddr_t *paddr, int verbose)
>  	return arm_vtop(uvaddr, pgd, paddr, verbose);  }
>  
> +static physaddr_t vtop_sparse(unsigned int vaddr) {
> +	physaddr_t ret;
> +	physaddr_t cur;
> +	int i;
> +
> +	ret = vaddr - machdep->kvbase;
> +
> +	cur = 0;
> +	for(i = 0;i < mems_num;i++) {
> +		if (ret < cur + dyn_mems[i].size) {
> +			if (i == 0)
> +				ret += dyn_mems[0].base;
> +			else
> +				ret = dyn_mems[i].base + (ret - cur);
> +			break;
> +		}
> +
> +		cur += dyn_mems[i].size;
> +	}
> +
> +	return ret;
> +}
> +
>  /*
>   * Translates a kernel virtual address to its physical address. cmd_vtop()
> sets
>   * the verbose flag so that the pte translation gets displayed; all other
> @@ -1279,14 +1307,19 @@ arm_kvtop(struct task_context *tc, ulong kvaddr,
> physaddr_t *paddr, int verbose)
>  		return arm_lpae_vtop(kvaddr, (ulong *)vt->kernel_pgd[0],
>  			paddr, verbose);
>  
> -
>  	if (!vt->vmalloc_start) {
> -		*paddr = VTOP(kvaddr);
> +		if (dyn_mems == NULL)
> +			*paddr = VTOP(kvaddr);
> +		else
> +			*paddr = vtop_sparse(kvaddr);
>  		return TRUE;
>  	}
>  
>  	if (!IS_VMALLOC_ADDR(kvaddr)) {
> -		*paddr = VTOP(kvaddr);
> +		if (dyn_mems == NULL)
> +			*paddr = VTOP(kvaddr);
> +		else
> +			*paddr = vtop_sparse(kvaddr);
>  		if (!verbose)
>  			return TRUE;
>  	}
> diff --git a/defs.h b/defs.h
> index ecadc29..94e5509 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -6281,4 +6281,13 @@ extern int have_full_symbols(void);  #define
> XEN_HYPERVISOR_ARCH  #endif
>  
> +#ifdef ARM
> +/* describes memory area for sparse memory systems */ struct mem_area {
> +	physaddr_t base;
> +	unsigned int size;
> +};
> +int read_dtb_sparse_mem(const char *filename); #endif
> +
>  #endif /* !GDB_COMMON */
> diff --git a/main.c b/main.c
> index fd7f7a8..eeccd68 100644
> --- a/main.c
> +++ b/main.c
> @@ -72,6 +72,7 @@ static struct option long_options[] = {
>  	{"no_strip", 0, 0, 0},
>  	{"hash", required_argument, 0, 0},
>  	{"offline", required_argument, 0, 0},
> +	{"dtbmem", required_argument, 0, 0},
>          {0, 0, 0, 0}
>  };
>  
> @@ -291,9 +292,10 @@ main(int argc, char **argv)
>  					error(INFO, "invalid --offline
> argument: %s\n", optarg);
>  					program_usage(SHORT_FORM);
>  				}
> -			}
> -
> -			else {
> +			} else if (STREQ(long_options[option_index].name,
> "dtbmem")) {
> +				if (read_dtb_sparse_mem(optarg) == 0)
> +					error(INFO, "Failed to read sparse
> mem config!\n");
> +			} else {
>  				error(INFO, "internal error: option %s
> unhandled\n",
>  					long_options[option_index].name);
>  				program_usage(SHORT_FORM);
> diff --git a/mem_dtb.c b/mem_dtb.c
> new file mode 100644
> index 0000000..23526bc
> --- /dev/null
> +++ b/mem_dtb.c
> @@ -0,0 +1,299 @@
> +/*
> + * Tiny device tree parser from 'fdtdump' application of device tree
> compiler.
> + * http://www.devicetree.org/Device_Tree_Compiler
> + * https://git.kernel.org/cgit/utils/dtc/dtc.git
> + */
> +#include <stdio.h>
> +#include <sys/mman.h>
> +#include <stdint.h>
> +#include <fcntl.h>
> +#include <stdlib.h>
> +#include <byteswap.h>
> +#include <sys/stat.h>
> +#include <sys/types.h>
> +#include <unistd.h>
> +#include <assert.h>
> +#include <string.h>
> +#include <ctype.h>
> +#include "defs.h"
> +
> +typedef uint32_t fdt32_t;
> +typedef uint64_t fdt64_t;
> +
> +#define FDT_BEGIN_NODE	0x1		/* Start node: full name */
> +#define FDT_END_NODE	0x2		/* End node */
> +#define FDT_PROP	0x3		/* Property: name off,
> +					   size, content */
> +#define FDT_NOP		0x4		/* nop */
> +#define FDT_END		0x9
> +
> +#define EXTRACT_BYTE(x, n)	((unsigned long long)((uint8_t *)&x)[n])
> +#define CPU_TO_FDT32(x) ((EXTRACT_BYTE(x, 0) << 24) | (EXTRACT_BYTE(x, 1)
> << 16) | \
> +			 (EXTRACT_BYTE(x, 2) << 8) | EXTRACT_BYTE(x, 3))
> #define
> +CPU_TO_FDT64(x) ((EXTRACT_BYTE(x, 0) << 56) | (EXTRACT_BYTE(x, 1) << 48) |
> \
> +			 (EXTRACT_BYTE(x, 2) << 40) | (EXTRACT_BYTE(x, 3) <<
> 32) | \
> +			 (EXTRACT_BYTE(x, 4) << 24) | (EXTRACT_BYTE(x, 5) <<
> 16) | \
> +			 (EXTRACT_BYTE(x, 6) << 8) | EXTRACT_BYTE(x, 7))
> +
> +#define ALIGN(x, a)	(((x) + ((a) - 1)) & ~((a) - 1))
> +#define PALIGN(p, a)	((void *)(ALIGN((unsigned long)(p), (a))))
> +#define GET_CELL(p)	(p += 4, *((const uint32_t *)(p-4)))
> +
> +#ifdef ARM
> +int memory_found;
> +int memory_node_found;
> +
> +extern struct mem_area *dyn_mems;
> +extern int mems_num;
> +
> +static inline uint32_t fdt32_to_cpu(fdt32_t x) {
> +	return (uint32_t)CPU_TO_FDT32(x);
> +}
> +
> +static inline uint64_t fdt64_to_cpu(fdt64_t x) {
> +	return (uint64_t)CPU_TO_FDT64(x);
> +}
> +
> +struct fdt_header {
> +	fdt32_t magic;			 /* magic word FDT_MAGIC */
> +	fdt32_t totalsize;		 /* total size of DT block */
> +	fdt32_t off_dt_struct;		 /* offset to structure */
> +	fdt32_t off_dt_strings;		 /* offset to strings */
> +	fdt32_t off_mem_rsvmap;		 /* offset to memory reserve map */
> +	fdt32_t version;		 /* format version */
> +	fdt32_t last_comp_version;	 /* last compatible version */
> +
> +	/* version 2 fields below */
> +	fdt32_t boot_cpuid_phys;	 /* Which physical CPU id we're
> +					    booting on */
> +	/* version 3 fields below */
> +	fdt32_t size_dt_strings;	 /* size of the strings block */
> +
> +	/* version 17 fields below */
> +	fdt32_t size_dt_struct;		 /* size of the structure block */
> +};
> +
> +struct fdt_reserve_entry {
> +	fdt64_t address;
> +	fdt64_t size;
> +};
> +
> +struct fdt_node_header {
> +	fdt32_t tag;
> +	char name[0];
> +};
> +
> +struct fdt_property {
> +	fdt32_t tag;
> +	fdt32_t len;
> +	fdt32_t nameoff;
> +	char data[0];
> +};
> +
> +static int util_is_printable_string(const void *data, int len) {
> +	const char *s = data;
> +	const char *ss, *se;
> +
> +	/* zero length is not */
> +	if (len == 0)
> +		return 0;
> +
> +	/* must terminate with zero */
> +	if (s[len - 1] != '\0')
> +		return 0;
> +
> +	se = s + len;
> +
> +	while (s < se) {
> +		ss = s;
> +		while (s < se && *s && isprint((unsigned char)*s))
> +			s++;
> +
> +		/* not zero, or not done yet */
> +		if (*s != '\0' || s == ss)
> +			return 0;
> +
> +		s++;
> +	}
> +
> +	return 1;
> +}
> +
> +static int utilfdt_print_data(const char *data, int len) {
> +	int i;
> +	const char *s;
> +
> +	/* no data, don't print */
> +	if (len == 0)
> +		return;
> +
> +	if (util_is_printable_string(data, len)) {
> +		s = data;
> +
> +		if (strcmp(s, "memory") == 0) {
> +			if (memory_node_found == 1)
> +				memory_found = 1;
> +		} else
> +			memory_found = 0;
> +
> +		do {
> +			s += strlen(s) + 1;
> +		} while (s < data + len);
> +
> +	} else if ((len % 4) == 0) {
> +		const uint32_t *cell = (const uint32_t *)data;
> +
> +		len /= 4;
> +
> +		if (memory_found == 1) {
> +			int k;
> +
> +			/* we are on property with memory regions */
> +			dyn_mems = malloc((len / 2) * (sizeof *dyn_mems));
> +
> +			if (dyn_mems == NULL)
> +				return -1;
> +
> +			for (k = 0, i = 0; i < len; k++, i += 2) {
> +				physaddr_t from;
> +				unsigned int size;
> +
> +				from = fdt32_to_cpu(cell[i]);
> +				size = fdt32_to_cpu(cell[i + 1]);
> +				dyn_mems[k].base = from;
> +				dyn_mems[k].size = size;
> +				printf("adding memregion: %08llX sz:
> %08X\n",
> +				       dyn_mems[k].base, dyn_mems[k].size);
> +			}
> +
> +			mems_num = k;
> +
> +			return 1;
> +		}
> +	}
> +
> +	return 0;
> +}
> +
> +static int parse_dt(void *blob)
> +{
> +	struct fdt_header *bph = blob;
> +	uint32_t off_mem_rsvmap = fdt32_to_cpu(bph->off_mem_rsvmap);
> +	uint32_t off_dt = fdt32_to_cpu(bph->off_dt_struct);
> +	uint32_t off_str = fdt32_to_cpu(bph->off_dt_strings);
> +	struct fdt_reserve_entry *p_rsvmap =
> +		(struct fdt_reserve_entry *)((char *)blob + off_mem_rsvmap);
> +	const char *p_struct = (const char *)blob + off_dt;
> +	const char *p_strings = (const char *)blob + off_str;
> +	uint32_t version = fdt32_to_cpu(bph->version);
> +	uint32_t tag;
> +	const char *p, *s, *t;
> +	int depth, sz;
> +	int i;
> +	uint64_t addr, size;
> +
> +	depth = 0;
> +
> +	for (i = 0; ; i++) {
> +		addr = fdt64_to_cpu(p_rsvmap[i].address);
> +		size = fdt64_to_cpu(p_rsvmap[i].size);
> +
> +		if (addr == 0 && size == 0)
> +			break;
> +	}
> +
> +	p = p_struct;
> +
> +	while ((tag = fdt32_to_cpu(GET_CELL(p))) != FDT_END) {
> +		int res;
> +
> +		if (tag == FDT_BEGIN_NODE) {
> +			s = p;
> +			p = PALIGN(p + strlen(s) + 1, 4);
> +
> +			if (*s == '\0')
> +				s = "/";
> +
> +			if (!strcmp(s, "memory"))
> +				memory_node_found = 1;
> +
> +			depth++;
> +			continue;
> +		}
> +
> +		if (tag == FDT_END_NODE) {
> +			depth--;
> +			continue;
> +		}
> +
> +		if (tag == FDT_NOP)
> +			continue;
> +
> +		if (tag != FDT_PROP)
> +			break;
> +
> +		sz = fdt32_to_cpu(GET_CELL(p));
> +		s = p_strings + fdt32_to_cpu(GET_CELL(p));
> +
> +		if (version < 16 && sz >= 8)
> +			p = PALIGN(p, 8);
> +
> +		t = p;
> +
> +		p = PALIGN(p + sz, 4);
> +
> +		res = utilfdt_print_data(t, sz);
> +
> +		/* error occured */
> +		if (res == -1)
> +			return 0;
> +
> +		/* memory region parsed, exit */
> +		if (res == 1)
> +			return 1;
> +
> +		/* again */
> +	}
> +
> +	return 0;
> +}
> +
> +int read_dtb_sparse_mem(const char *dtb_file_name) {
> +	int dt_fd;
> +	unsigned char *p;
> +	struct stat si;
> +	int res;
> +
> +	res = 0;
> +	dt_fd = open(dtb_file_name, O_RDONLY);
> +
> +	if (dt_fd == -1)
> +		goto out;
> +
> +	if (fstat(dt_fd, &si))
> +		goto out_close;
> +
> +	p = mmap(NULL, si.st_size, PROT_READ | PROT_WRITE,
> +		 MAP_PRIVATE, dt_fd, 0);
> +
> +	if (p == MAP_FAILED)
> +		goto out_close;
> +
> +	res = parse_dt(p);
> +out_close:
> +	close(dt_fd);
> +out:
> +	return res;
> +}
> +#else
> +int read_dtb_sparse_mem(const char *dtb_file_name) {
> +	error(INFO, "Sparse mem supported only for arm!\n");
> +	return 0;
> +}
> +#endif
> --
> 1.9.1
> 
> 
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
> 




More information about the Crash-utility mailing list