[Crash-utility] [PATCH 1/1 V2] crash: initial note of excluded page structures

Cliff Wickman cpw at sgi.com
Thu Jan 9 20:54:14 UTC 2014


On Thu, Jan 09, 2014 at 01:21:35PM -0500, Dave Anderson wrote:
> 
> 
> ----- Original Message -----
> > On Thu, Jan 09, 2014 at 11:40:59AM -0500, Dave Anderson wrote:
> > > 
> > > 
> > > ----- Original Message -----
> > > > 
> > > > Version 2
> > > > - Moves the warning to this point:
> > > >       ...
> > > >       This GDB was configured as "x86_64-unknown-linux-gnu"...
> > > > 
> > > >       WARNING: All unused vmemmap page structures are excluded from this dump.
> > > >                This will cause failures of the kmem command if it attempts to
> > > >                walk any list of free pages (with options -f -F -i -s and -S).
> > > > 
> > > >         SYSTEM MAP: /boot/System.map-2.6.32-cpw
> > > >       ...
> > > > 
> > > > - Drop patch 2, which added warnings to individual kmem options.
> > > > 
> > > > - Feel free to change the wording of the warning.
> > > > 
> > > > And this patch is contingent upon the acceptance of a change to the
> > > > makedumpfile command.
> > > >    http://marc.info/?l=kexec&m=138853299130125&w=2
> > > >   
> > > > 
> > > > If makedumpfile excludes unused page structures it will flag that
> > > > fact in the dump header.
> > > > (There are about 3.67 million pages full of page structures for
> > > >  every terabyte of system memory.  The great bulk of those
> > > >  page structures are not needed.)
> > > > Their exclusion is a makedumpfile option.
> > > > 
> > > > Crash will display a note during initialization if such structures
> > > > have been excluded.  Crash commands that walk page freelists, for
> > > > example, will fail. So the note will help the user understand why.
> > > > 
> > > > Signed-off-by: Cliff Wickman <cpw at sgi.com>
> > > > ---
> > > >  defs.h     |    1 +
> > > >  diskdump.c |    3 +++
> > > >  diskdump.h |    1 +
> > > >  main.c     |    6 ++++++
> > > >  4 files changed, 11 insertions(+)
> > > > 
> > > > Index: crash-7.0.4/diskdump.c
> > > > ===================================================================
> > > > --- crash-7.0.4.orig/diskdump.c
> > > > +++ crash-7.0.4/diskdump.c
> > > > @@ -749,6 +749,9 @@ restart:
> > > >  				dd->valid_pages[i]++;
> > > >  	}
> > > >  
> > > > +	if (header->status & DUMP_DH_EXCLUDED_VMEMMAP)
> > > > +		pc->flags2 |= VMEXCLUDED;
> > > > +
> > > >          return TRUE;
> > > >  
> > > >  err:
> > > > Index: crash-7.0.4/diskdump.h
> > > > ===================================================================
> > > > --- crash-7.0.4.orig/diskdump.h
> > > > +++ crash-7.0.4/diskdump.h
> > > > @@ -84,6 +84,7 @@ struct kdump_sub_header {
> > > >  #define DUMP_DH_COMPRESSED_ZLIB    0x1   /* page is compressed with zlib */
> > > >  #define DUMP_DH_COMPRESSED_LZO     0x2   /* page is compressed with lzo */
> > > >  #define DUMP_DH_COMPRESSED_SNAPPY  0x4   /* page is compressed with snappy */
> > > > +#define DUMP_DH_EXCLUDED_VMEMMAP   0x8   /* unused vmemmap pages are excluded */
> > > >  
> > > >  /* descriptor of each page for vmcore */
> > > >  typedef struct page_desc {
> > > > Index: crash-7.0.4/defs.h
> > > > ===================================================================
> > > > --- crash-7.0.4.orig/defs.h
> > > > +++ crash-7.0.4/defs.h
> > > > @@ -505,6 +505,7 @@ struct program_context {
> > > >  #define VMCOREINFO    (0x400ULL)
> > > >  #define ALLOW_FP      (0x800ULL)
> > > >  #define REM_PAUSED_F (0x1000ULL)
> > > > +#define VMEXCLUDED   (0x2000ULL)
> > > >  #define REMOTE_PAUSED() (pc->flags2 & REM_PAUSED_F)
> > > >  	char *cleanup;
> > > >  	char *namelist_orig;
> > > > Index: crash-7.0.4/main.c
> > > > ===================================================================
> > > > --- crash-7.0.4.orig/main.c
> > > > +++ crash-7.0.4/main.c
> > > > @@ -662,6 +662,12 @@ main_loop(void)
> > > >  	} else
> > > >  		SIGACTION(SIGINT, restart, &pc->sigaction, NULL);
> > > >  
> > > > +	if (pc->flags2 & VMEXCLUDED)
> > > > +                fprintf(fp,
> > > > +        "WARNING: All unused vmemmap page structures are excluded from this dump.\n"
> > > > +	"         This will cause failures of the kmem command if it attempts to\n"
> > > > +	"         walk any list of free pages (with options -f -F -i -s and -S).\n\n");
> > > > +
> > > >          /*
> > > >           *  Display system statistics and current context.
> > > >           */
> > > > 
> > > 
> > > This patch looks reasonable.
> > > 
> > > BTW, what happens if you enter "kmem <address>" alone with no option?
> > > It should fail as well, no?
> > 
> > Yes, it does.
> > 
> > crash> kmem 133360
> > kmem: page excluded: kernel virtual address: ffffea0000004350  type: "page.lru.next"
> > 
> > crash> kmem ffff8897baae7540
> > CACHE            NAME                 OBJSIZE  ALLOCATED     TOTAL  SLABS  SSIZE
> > ffff88c7bec70040 task_struct             2656       4578      5361   1787     8k
> > SLAB              MEMORY            TOTAL  ALLOCATED  FREE
> > ffff8897baae6040  ffff8897baae6080      3          2     1
> > FREE / [ALLOCATED]
> >   [ffff8897baae7540]
> > 
> > kmem: page excluded: kernel virtual address: ffffea0000007028  type: "first list entry"
> > 
> > Another way to search a freelist, that I didn't notice.
> > Do you want to add that to the warning?
>  
> I'm thinking that there might be others as well.  Like how does "kmem -p" handle
> the missing page structs?

kmem -p appears to work, but it cannot read the mem_map pages, so the
results it prints are false.
This is an example of what you feared.

If we do this:

---
 memory.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Index: crash-7.0.4/memory.c
===================================================================
--- crash-7.0.4.orig/memory.c
+++ crash-7.0.4/memory.c
@@ -5657,7 +5657,7 @@ fill_mem_map_cache(ulong pp, ulong ppend
 	 *  Try to read it in one fell swoop.
  	 */
 	if (readmem(pp, KVADDR, page_cache, SIZE(page) * PGMM_CACHED,
-      	    "page struct cache", RETURN_ON_ERROR|QUIET))
+      	    "page struct cache", FAULT_ON_ERROR))
 		return;
 
 	/*

Then we see the immediate failure:

crash> kmem -p
      PAGE         PHYSICAL      MAPPING       INDEX CNT FLAGS
kmem: page excluded: kernel virtual address: ffffea0000004000  type: "page struct cache"

What should be done about that?
I wonder why the reads of the mem_map were done with RETURN_ON_ERROR|QUIET
to begin with?

-Cliff

> 
> It might be better to just generally warn that results may be unpredictable
> because any command that accesses page structures, such as kmem, may fail.
> By listing specific options, it makes it sound like that they are the only
> ones, so it may be better to leave it open-ended.
> 
> Dave

-- 
Cliff Wickman
SGI
cpw at sgi.com
(651) 683-3824



