[Crash-utility] [PATCH] Speed up "kmem -[sS]" by optimizing is_page_ptr()

Kazuhito Hagio k-hagio at ab.jp.nec.com
Tue Feb 27 21:45:31 UTC 2018


Hi Dave,

I'm sorry for my long delay.

On 2/23/2018 12:20 PM, Dave Anderson wrote:
[...]
> This "#ifdef IS_VMEMMAP_PAGE_ADDR" patch is getting really ugly.  I've been
> playing around with this, and this is my latest counter-proposal.

Yes, I couldn't think of a way to refine it greatly..

> First, the mem_section numbers are ascending.  They may not necessarily start
> with 0, and there may be holes, but they are ascending.  That being the case,
> there is no need for is_page_ptr() to walk through NR_MEM_SECTIONS() worth
> of entries, because there will be an ending number that's typically much
> smaller.  Even on a 256GB dumpfile I have on hand, which has a NR_MEM_SECTIONS()
> value of 524288, the largest valid section number is 2055. (What is the smallest
> and largest number that you see on whatever large-memory system that you are 
> testing with?)
> 
> In any case, let's store the largest section number during initialization in 
> the vm_table, and use it as a delimeter in is_page_ptr().

I agree with you.  This will improve the worst case of the loop.  Also,
if the binary search is implemented in the future, it could be utilized.
(The largest valid section numbers of each architecture in my test logs
are 1543 on a 192GB x86_64 and 2047 on a 32GB ppc64.)

> diff --git a/defs.h b/defs.h
> index aa17792..8768fd5 100644
> --- a/defs.h
> +++ b/defs.h
> @@ -2369,6 +2369,7 @@ struct vm_table {                /* kernel VM-related data */
>  		ulong mask;
>  		char *name;
>  	} *pageflags_data;
> +	ulong max_mem_section_nr;
>  };
>  
>  #define NODES                       (0x1)
> diff --git a/memory.c b/memory.c
> index 0df8ecc..6770937 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -255,7 +255,7 @@ static void PG_reserved_flag_init(void);
>  static void PG_slab_flag_init(void);
>  static ulong nr_blockdev_pages(void);
>  void sparse_mem_init(void);
> -void dump_mem_sections(void);
> +void dump_mem_sections(int);
>  void list_mem_sections(void);
>  ulong sparse_decode_mem_map(ulong, ulong);
>  char *read_mem_section(ulong);
> @@ -13350,7 +13350,7 @@ is_page_ptr(ulong addr, physaddr_t *phys)
>  	physaddr_t section_paddr;
>  
>  	if (IS_SPARSEMEM()) {
> -		nr_mem_sections = NR_MEM_SECTIONS();
> +		nr_mem_sections = vt->max_mem_section_nr+1;
>  	        for (nr = 0; nr < nr_mem_sections ; nr++) {
>  	                if ((sec_addr = valid_section_nr(nr))) {
>  	                        coded_mem_map = section_mem_map_addr(sec_addr);
> @@ -13668,6 +13668,7 @@ dump_vm_table(int verbose)
>  	fprintf(fp, "   swap_info_struct: %lx\n", (ulong)vt->swap_info_struct);
>  	fprintf(fp, "            mem_sec: %lx\n", (ulong)vt->mem_sec);
>  	fprintf(fp, "        mem_section: %lx\n", (ulong)vt->mem_section);
> +	fprintf(fp, " max_mem_section_nr: %ld\n", vt->max_mem_section_nr);
>  	fprintf(fp, "       ZONE_HIGHMEM: %d\n", vt->ZONE_HIGHMEM);
>  	fprintf(fp, "node_online_map_len: %d\n", vt->node_online_map_len);
>  	if (vt->node_online_map_len) {
> @@ -16295,8 +16296,8 @@ dump_memory_nodes(int initialize)
>  		vt->numnodes = n;
>  	}
>  
> -	if (!initialize && IS_SPARSEMEM())
> -		dump_mem_sections();
> +	if (IS_SPARSEMEM())
> +		dump_mem_sections(initialize);
>  }
>  
>  /*
> @@ -17128,9 +17129,9 @@ pfn_to_map(ulong pfn)
>  }
>  
>  void 
> -dump_mem_sections(void)
> +dump_mem_sections(int initialize)
>  {
> -	ulong nr,addr;
> +	ulong nr, max, addr;
>  	ulong nr_mem_sections;
>  	ulong coded_mem_map, mem_map, pfn;
>  	char buf1[BUFSIZE];
> @@ -17140,6 +17141,15 @@ dump_mem_sections(void)
>  
>  	nr_mem_sections = NR_MEM_SECTIONS();
>  
> +	if (initialize) {
> +		for (nr = max = 0; nr < nr_mem_sections ; nr++) {
> +			if (valid_section_nr(nr))
> +				max = nr;
> +		}
> +		vt->max_mem_section_nr = max;
> +		return;
> +	}
> +
>  	fprintf(fp, "\n");
>  	pad_line(fp, BITS32() ? 59 : 67, '-');
>          fprintf(fp, "\n\nNR  %s  %s  %s  PFN\n",
> 
> 
> Now, with respect to the architecture-specific, VMEMMAP-only, part
> that is of most interest to you, let's do it with an architecture
> specific callback.  You can post it for x86_64, and the other architecture
> maintainers can write their own version.  For example, add a new
> callback function to the machdep_table structure, i.e., like this:
> 
>   struct machdep_table {
>           ulong flags;
>           ulong kvbase;
>   ...
>           void (*get_irq_affinity)(int);
>           void (*show_interrupts)(int, ulong *);
> +         int is_page_ptr(ulong, physaddr_t *);
>   };
>   
> Write the x86_64_is_page_ptr() function that works for VMEMMAP
> kernels, and returns FALSE otherwise.  And add the call to the top 
> of is_page_ptr() like this:
> 
>   int
>   is_page_ptr(ulong addr, physaddr_t *phys)
>   {
>           int n;
>           ulong ppstart, ppend;
>           struct node_table *nt;
>           ulong pgnum, node_size;
>           ulong nr, sec_addr;
>           ulong nr_mem_sections;
>           ulong coded_mem_map, mem_map, end_mem_map;
>           physaddr_t section_paddr;
> 
> +	  if (machdep->is_page_ptr(addr, phys))
> +		  return TRUE;
>   
>           if (IS_SPARSEMEM()) {
>   ...
>   
> To avoid having to check whether the machdep->is_page_ptr function
> exists, write a generic_is_page_ptr() function that just returns
> FALSE, and initialize machdep->is_page_ptr to generic_is_page_ptr in
> the setup_environment() function.  Later on, individual architectures
> can overwrite it when machdep_init(SETUP_ENV) is called.
> 
> How's that sound?

It looks readable and refined.

If an incoming address is not a page address, the IS_SPARSEMEM() section
is also executed, but I think that it does not matter because it is rare
that the situation occurs many times at once and it is likely that the code
will become ugly to avoid it.

So I'll prepare the x86_64 part based on the above.

Thanks,
Kazuhito Hagio

> 
> Dave
> 
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
> 




More information about the Crash-utility mailing list