[Crash-utility] [PATCH] Speed up "kmem -[sS]" by optimizing is_page_ptr()
Dave Anderson
anderson at redhat.com
Fri Mar 2 19:21:49 UTC 2018
----- Original Message -----
> Hi Dave,
>
> On 2/27/2018 4:45 PM, Kazuhito Hagio wrote:
> [...]
> >> First, the mem_section numbers are ascending. They may not necessarily start
> >> with 0, and there may be holes, but they are ascending. That being the case,
> >> there is no need for is_page_ptr() to walk through NR_MEM_SECTIONS() worth
> >> of entries, because there will be an ending number that's typically much
> >> smaller. Even on a 256GB dumpfile I have on hand, which has a NR_MEM_SECTIONS()
> >> value of 524288, the largest valid section number is 2055. (What is the smallest
> >> and largest number that you see on whatever large-memory system that you are
> >> testing with?)
> >>
> >> In any case, let's store the largest section number during initialization in
> >> the vm_table, and use it as a delimeter in is_page_ptr().
> >
> > I agree with you. This will improve the worst case of the loop. Also,
> > if the binary search is implemented in the future, it could be utilized.
> > (The largest valid section numbers of each architecture in my test logs
> > are 1543 on a 192GB x86_64 and 2047 on a 32GB ppc64.)
>
> I checked and tested the former patch you proposed below as it is
> and I didn't find any problem. Could you merge this?
Sure, I'll merge the infrastructure and send you an email when it's checked in.
Thanks,
Dave
> (or is there anything I should do?)
>
> >
> >> diff --git a/defs.h b/defs.h
> >> index aa17792..8768fd5 100644
> >> --- a/defs.h
> >> +++ b/defs.h
> >> @@ -2369,6 +2369,7 @@ struct vm_table { /* kernel
> >> VM-related data */
> >> ulong mask;
> >> char *name;
> >> } *pageflags_data;
> >> + ulong max_mem_section_nr;
> >> };
> >>
> >> #define NODES (0x1)
> >> diff --git a/memory.c b/memory.c
> >> index 0df8ecc..6770937 100644
> >> --- a/memory.c
> >> +++ b/memory.c
> >> @@ -255,7 +255,7 @@ static void PG_reserved_flag_init(void);
> >> static void PG_slab_flag_init(void);
> >> static ulong nr_blockdev_pages(void);
> >> void sparse_mem_init(void);
> >> -void dump_mem_sections(void);
> >> +void dump_mem_sections(int);
> >> void list_mem_sections(void);
> >> ulong sparse_decode_mem_map(ulong, ulong);
> >> char *read_mem_section(ulong);
> >> @@ -13350,7 +13350,7 @@ is_page_ptr(ulong addr, physaddr_t *phys)
> >> physaddr_t section_paddr;
> >>
> >> if (IS_SPARSEMEM()) {
> >> - nr_mem_sections = NR_MEM_SECTIONS();
> >> + nr_mem_sections = vt->max_mem_section_nr+1;
> >> for (nr = 0; nr < nr_mem_sections ; nr++) {
> >> if ((sec_addr = valid_section_nr(nr))) {
> >> coded_mem_map = section_mem_map_addr(sec_addr);
> >> @@ -13668,6 +13668,7 @@ dump_vm_table(int verbose)
> >> fprintf(fp, " swap_info_struct: %lx\n", (ulong)vt->swap_info_struct);
> >> fprintf(fp, " mem_sec: %lx\n", (ulong)vt->mem_sec);
> >> fprintf(fp, " mem_section: %lx\n", (ulong)vt->mem_section);
> >> + fprintf(fp, " max_mem_section_nr: %ld\n", vt->max_mem_section_nr);
> >> fprintf(fp, " ZONE_HIGHMEM: %d\n", vt->ZONE_HIGHMEM);
> >> fprintf(fp, "node_online_map_len: %d\n", vt->node_online_map_len);
> >> if (vt->node_online_map_len) {
> >> @@ -16295,8 +16296,8 @@ dump_memory_nodes(int initialize)
> >> vt->numnodes = n;
> >> }
> >>
> >> - if (!initialize && IS_SPARSEMEM())
> >> - dump_mem_sections();
> >> + if (IS_SPARSEMEM())
> >> + dump_mem_sections(initialize);
> >> }
> >>
> >> /*
> >> @@ -17128,9 +17129,9 @@ pfn_to_map(ulong pfn)
> >> }
> >>
> >> void
> >> -dump_mem_sections(void)
> >> +dump_mem_sections(int initialize)
> >> {
> >> - ulong nr,addr;
> >> + ulong nr, max, addr;
> >> ulong nr_mem_sections;
> >> ulong coded_mem_map, mem_map, pfn;
> >> char buf1[BUFSIZE];
> >> @@ -17140,6 +17141,15 @@ dump_mem_sections(void)
> >>
> >> nr_mem_sections = NR_MEM_SECTIONS();
> >>
> >> + if (initialize) {
> >> + for (nr = max = 0; nr < nr_mem_sections ; nr++) {
> >> + if (valid_section_nr(nr))
> >> + max = nr;
> >> + }
> >> + vt->max_mem_section_nr = max;
> >> + return;
> >> + }
> >> +
> >> fprintf(fp, "\n");
> >> pad_line(fp, BITS32() ? 59 : 67, '-');
> >> fprintf(fp, "\n\nNR %s %s %s PFN\n",
> >>
> >>
> >> Now, with respect to the architecture-specific, VMEMMAP-only, part
> >> that is of most interest to you, let's do it with an architecture
> >> specific callback. You can post it for x86_64, and the other architecture
> >> maintainers can write their own version. For example, add a new
> >> callback function to the machdep_table structure, i.e., like this:
> >>
> >> struct machdep_table {
> >> ulong flags;
> >> ulong kvbase;
> >> ...
> >> void (*get_irq_affinity)(int);
> >> void (*show_interrupts)(int, ulong *);
> >> + int is_page_ptr(ulong, physaddr_t *);
> >> };
> >>
> >> Write the x86_64_is_page_ptr() function that works for VMEMMAP
> >> kernels, and returns FALSE otherwise. And add the call to the top
> >> of is_page_ptr() like this:
> >>
> >> int
> >> is_page_ptr(ulong addr, physaddr_t *phys)
> >> {
> >> int n;
> >> ulong ppstart, ppend;
> >> struct node_table *nt;
> >> ulong pgnum, node_size;
> >> ulong nr, sec_addr;
> >> ulong nr_mem_sections;
> >> ulong coded_mem_map, mem_map, end_mem_map;
> >> physaddr_t section_paddr;
> >>
> >> + if (machdep->is_page_ptr(addr, phys))
> >> + return TRUE;
> >>
> >> if (IS_SPARSEMEM()) {
> >> ...
> >>
> >> To avoid having to check whether the machdep->is_page_ptr function
> >> exists, write a generic_is_page_ptr() function that just returns
> >> FALSE, and initialize machdep->is_page_ptr to generic_is_page_ptr in
> >> the setup_environment() function. Later on, individual architectures
> >> can overwrite it when machdep_init(SETUP_ENV) is called.
> >>
> >> How's that sound?
> >
> > It looks readable and refined.
> >
> > If an incoming address is not a page address, the IS_SPARSEMEM() section
> > is also executed, but I think that it does not matter because it is rare
> > that the situation occurs many times at once and it is likely that the code
> > will become ugly to avoid it.
> >
> > So I'll prepare the x86_64 part based on the above.
>
> I thought that you would merge the common part, but is it wrong?
> Could I post it with the x86_64 part?
>
> Sorry I didn't understand well how to proceed with this.
> And thank you very much for your help with this issue!
>
> Kazuhito Hagio
>
> >
> > Thanks,
> > Kazuhito Hagio
> >
> >>
> >> Dave
> >>
> >> --
> >> Crash-utility mailing list
> >> Crash-utility at redhat.com
> >> https://www.redhat.com/mailman/listinfo/crash-utility
> >>
> >
> > --
> > Crash-utility mailing list
> > Crash-utility at redhat.com
> > https://www.redhat.com/mailman/listinfo/crash-utility
> >
>
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
>
More information about the Crash-utility
mailing list