[Crash-utility] [PATCH 1/1 V2] crash: initial note of excluded page structures
Atsushi Kumagai
kumagai-atsushi at mxc.nes.nec.co.jp
Fri Jan 10 06:10:56 UTC 2014
On 2014/01/10 7:23:32, crash-utility-bounces at redhat.com wrote:
> On Thu, Jan 09, 2014 at 04:11:40PM -0500, Dave Anderson wrote:
> > > > > > > Version 2
> > > > > > > - Moves the warning to this point:
> > > > > > > ...
> > > > > > > This GDB was configured as "x86_64-unknown-linux-gnu"...
> > > > > > >
> > > > > > > WARNING: All unused vmemmap page structures are excluded from this dump.
> > > > > > > This will cause failures of the kmem command if it attempts to
> > > > > > > walk any list of free pages (with options -f -F -i -s and -S).
> > > > > > >
> > > > > > > SYSTEM MAP: /boot/System.map-2.6.32-cpw
> > > > > > > ...
> > > > > > >
> > > > > > > - Drop patch 2, which added warnings to individual kmem options.
> > > > > > >
> > > > > > > - Feel free to change the wording of the warning.
> > > > > > >
> > > > > > > And this patch is contingent upon the acceptance of a change to the
> > > > > > > makedumpfile command.
> > > > > > > http://marc.info/?l=kexec&m=138853299130125&w=2
> > > > > > >
> > > > > > >
> > > > > > > If makedumpfile excludes unused page structures it will flag that
> > > > > > > fact in the dump header.
> > > > > > > (There are about 3.67 million pages full of page structures for
> > > > > > > every terabyte of system memory. The great bulk of those
> > > > > > > page structures are not needed.)
> > > > > > > Their exclusion is a makedumpfile option.
> > > > > > >
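The "3.67 million pages per terabyte" figure quoted above can be reproduced with a small calculation. The sketch below assumes 4 KiB pages and a 56-byte struct page; neither value is stated in the thread, and both vary by architecture and kernel configuration. The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed values, not taken from the thread. */
#define ASSUMED_PAGE_SIZE        4096ULL  /* bytes per page */
#define ASSUMED_STRUCT_PAGE_SIZE 56ULL    /* bytes per struct page */

/*
 * Number of pages occupied by the page structures that describe
 * mem_bytes of system memory (one struct page per page of memory).
 */
uint64_t vmemmap_pages_for(uint64_t mem_bytes)
{
    assert(mem_bytes % ASSUMED_PAGE_SIZE == 0);

    uint64_t nr_pages = mem_bytes / ASSUMED_PAGE_SIZE;
    uint64_t vmemmap_bytes = nr_pages * ASSUMED_STRUCT_PAGE_SIZE;

    /* Round up to whole pages. */
    return (vmemmap_bytes + ASSUMED_PAGE_SIZE - 1) / ASSUMED_PAGE_SIZE;
}
```

Under these assumptions, vmemmap_pages_for(1ULL << 40) yields 3670016, that is, about 3.67 million pages (roughly 14 GiB) of page structures per terabyte.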
> > > > > > > Crash will display a note during initialization if such structures
> > > > > > > have been excluded. Crash commands that walk page freelists, for
> > > > > > > example, will fail. So the note will help the user understand why.
> > > > > > >
> > > > > > > Signed-off-by: Cliff Wickman <cpw at sgi.com>
> > > > > > > ---
> > > > > > > defs.h | 1 +
> > > > > > > diskdump.c | 3 +++
> > > > > > > diskdump.h | 1 +
> > > > > > > main.c | 6 ++++++
> > > > > > > 4 files changed, 11 insertions(+)
> > > > > > >
> > > > > > > Index: crash-7.0.4/diskdump.c
> > > > > > > ===================================================================
> > > > > > > --- crash-7.0.4.orig/diskdump.c
> > > > > > > +++ crash-7.0.4/diskdump.c
> > > > > > > @@ -749,6 +749,9 @@ restart:
> > > > > > > dd->valid_pages[i]++;
> > > > > > > }
> > > > > > >
> > > > > > > + if (header->status & DUMP_DH_EXCLUDED_VMEMMAP)
> > > > > > > + pc->flags2 |= VMEXCLUDED;
> > > > > > > +
> > > > > > > return TRUE;
> > > > > > >
> > > > > > > err:
> > > > > > > Index: crash-7.0.4/diskdump.h
> > > > > > > ===================================================================
> > > > > > > --- crash-7.0.4.orig/diskdump.h
> > > > > > > +++ crash-7.0.4/diskdump.h
> > > > > > > @@ -84,6 +84,7 @@ struct kdump_sub_header {
> > > > > > > #define DUMP_DH_COMPRESSED_ZLIB 0x1 /* page is compressed with zlib */
> > > > > > > #define DUMP_DH_COMPRESSED_LZO 0x2 /* page is compressed with lzo */
> > > > > > > #define DUMP_DH_COMPRESSED_SNAPPY 0x4 /* page is compressed with snappy */
> > > > > > > +#define DUMP_DH_EXCLUDED_VMEMMAP 0x8 /* unused vmemmap pages are excluded */
> > > > > > >
> > > > > > > /* descriptor of each page for vmcore */
> > > > > > > typedef struct page_desc {
> > > > > > > Index: crash-7.0.4/defs.h
> > > > > > > ===================================================================
> > > > > > > --- crash-7.0.4.orig/defs.h
> > > > > > > +++ crash-7.0.4/defs.h
> > > > > > > @@ -505,6 +505,7 @@ struct program_context {
> > > > > > > #define VMCOREINFO (0x400ULL)
> > > > > > > #define ALLOW_FP (0x800ULL)
> > > > > > > #define REM_PAUSED_F (0x1000ULL)
> > > > > > > +#define VMEXCLUDED (0x2000ULL)
> > > > > > > #define REMOTE_PAUSED() (pc->flags2 & REM_PAUSED_F)
> > > > > > > char *cleanup;
> > > > > > > char *namelist_orig;
> > > > > > > Index: crash-7.0.4/main.c
> > > > > > > ===================================================================
> > > > > > > --- crash-7.0.4.orig/main.c
> > > > > > > +++ crash-7.0.4/main.c
> > > > > > > @@ -662,6 +662,12 @@ main_loop(void)
> > > > > > > } else
> > > > > > > SIGACTION(SIGINT, restart, &pc->sigaction, NULL);
> > > > > > >
> > > > > > > + if (pc->flags2 & VMEXCLUDED)
> > > > > > > + fprintf(fp,
> > > > > > > + "WARNING: All unused vmemmap page structures are excluded from this dump.\n"
> > > > > > > + " This will cause failures of the kmem command if it attempts to\n"
> > > > > > > + " walk any list of free pages (with options -f -F -i -s and -S).\n\n");
> > > > > > > +
> > > > > > > /*
> > > > > > > * Display system statistics and current context.
> > > > > > > */
> > > > > > >
> > > > > >
> > > > > > This patch looks reasonable.
> > > > > >
> > > > > > BTW, what happens if you enter "kmem <address>" alone with no option?
> > > > > > It should fail as well, no?
> > > > >
> > > > > Yes, it does.
> > > > >
> > > > > crash> kmem 133360
> > > > > > kmem: page excluded: kernel virtual address: ffffea0000004350 type: "page.lru.next"
> > > > >
> > > > > crash> kmem ffff8897baae7540
> > > > > > CACHE            NAME         OBJSIZE  ALLOCATED  TOTAL  SLABS  SSIZE
> > > > > > ffff88c7bec70040 task_struct     2656       4578   5361   1787     8k
> > > > > > SLAB              MEMORY            TOTAL  ALLOCATED  FREE
> > > > > > ffff8897baae6040  ffff8897baae6080      3          2     1
> > > > > > FREE / [ALLOCATED]
> > > > > >   [ffff8897baae7540]
> > > > >
> > > > > kmem: page excluded: kernel virtual address: ffffea0000007028 type: "first list entry"
> > > > >
> > > > > Another way to search a freelist, that I didn't notice.
> > > > > Do you want to add that to the warning?
> > > >
> > > > I'm thinking that there might be others as well. Like how does "kmem -p" handle
> > > > the missing page structs?
> > >
> > > kmem -p gives the appearance of working, but is unable to read the
> > > mem_map pages. It is giving false results.
> > > This is an example of what you feared.
> > >
> > > If we do this:
> > >
> > > ---
> > > memory.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > Index: crash-7.0.4/memory.c
> > > ===================================================================
> > > --- crash-7.0.4.orig/memory.c
> > > +++ crash-7.0.4/memory.c
> > > @@ -5657,7 +5657,7 @@ fill_mem_map_cache(ulong pp, ulong ppend
> > > * Try to read it in one fell swoop.
> > > */
> > > if (readmem(pp, KVADDR, page_cache, SIZE(page) * PGMM_CACHED,
> > > - "page struct cache", RETURN_ON_ERROR|QUIET))
> > > + "page struct cache", FAULT_ON_ERROR))
> > > return;
> > >
> > > /*
> > >
> > > Then we see the immediate failure:
> > >
> > > crash> kmem -p
> > > PAGE PHYSICAL MAPPING INDEX CNT FLAGS
> > > kmem: page excluded: kernel virtual address: ffffea0000004000 type: "page struct cache"
> > >
> > > What should be done about that?
> >
> > I'd leave it alone -- at least it doesn't fail miserably like the other commands.
> >
> > > I wonder why the reads of the mem_map were done with RETURN_ON_ERROR|QUIET
> > > to begin with?
> >
> > It's done to guard against a vmemmap boundary (real page structs vs. unmapped
> > page structs) that happens to occur within the 512-page-size chunk of memory
> > that's being cached. So first it tries to read the full chunk in one readmem(),
> > but if that fails, it breaks up the read requests, zeroing out the unmapped pages.
> >
> > Dave
>
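The fallback Dave describes for fill_mem_map_cache() (one large read first, then smaller reads with zero-fill for whatever cannot be read) can be sketched roughly as follows. This is not the crash source: mock_readmem(), the chunk size, and the tiny "page" size are hypothetical stand-ins chosen to keep the example small, and the mock simply fails for one fixed page to simulate an unmapped or excluded region.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_PAGES 4   /* hypothetical; the thread mentions a 512-page chunk */
#define PAGE_BYTES  8   /* hypothetical tiny "page" for illustration */

/*
 * Hypothetical stand-in for readmem() with RETURN_ON_ERROR|QUIET:
 * returns 1 on success, 0 on failure. Any read touching page index 2
 * fails, simulating a page that is not present in the dump.
 */
int mock_readmem(size_t first_page, size_t npages, unsigned char *buf)
{
    assert(buf != NULL);
    for (size_t i = first_page; i < first_page + npages; i++)
        if (i == 2)
            return 0;
    memset(buf, 0xAB, npages * PAGE_BYTES);
    return 1;
}

/*
 * Try to read the whole chunk at once; if that fails, read page by
 * page and zero out the pages that cannot be read.
 * Returns the number of pages that were zero-filled.
 */
int fill_cache_sketch(unsigned char *cache)
{
    int zeroed = 0;

    if (mock_readmem(0, CHUNK_PAGES, cache))
        return 0;                       /* one fell swoop succeeded */

    for (size_t i = 0; i < CHUNK_PAGES; i++) {
        unsigned char *slot = cache + i * PAGE_BYTES;
        if (!mock_readmem(i, 1, slot)) {
            memset(slot, 0, PAGE_BYTES);  /* unreadable: zero-fill */
            zeroed++;
        }
    }
    return zeroed;
}
```

In a scheme like this, a page that was excluded from the dump ends up zero-filled just like an unmapped one, which would be consistent with "kmem -p" appearing to work while reporting false results.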
> Okay. I'll just make it issue a more general warning like:
>
> WARNING: All unused vmemmap page structures are excluded from this dump.
> This will cause failures in any command that accesses page
> structures of pages that are not included in the dump. This
> is particularly likely when using several options of the
> kmem command.
>
> If that looks reasonable to you.
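For reference, the more general wording proposed above might look like this if dropped into the same check as the V2 patch. This is only a sketch: the helper name is hypothetical, the indentation of the message is a guess, and the VMEXCLUDED value is copied from the defs.h hunk of the patch.

```c
#include <assert.h>
#include <stdio.h>

#define VMEXCLUDED (0x2000ULL)  /* value from the defs.h hunk in the patch */

/*
 * Hypothetical helper: print the general warning to fp when the
 * VMEXCLUDED bit is set in flags2. Returns 1 if the warning was
 * printed, 0 otherwise.
 */
int warn_if_vmemmap_excluded(unsigned long long flags2, FILE *fp)
{
    assert(fp != NULL);

    if (!(flags2 & VMEXCLUDED))
        return 0;

    fprintf(fp,
        "WARNING: All unused vmemmap page structures are excluded from this dump.\n"
        "         This will cause failures in any command that accesses page\n"
        "         structures of pages that are not included in the dump. This\n"
        "         is particularly likely when using several options of the\n"
        "         kmem command.\n\n");
    return 1;
}
```

In the patch itself the test is done inline in main_loop() against pc->flags2; the helper form is used here only so the example stands alone.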

Dave, excluding vmemmap obviously affects a user's investigation.
Is this really acceptable to you?

There is no chance to re-capture the same dump image, so
I think we should be more careful about filtering pages out,
since it's an irreversible change.

How many users want such a broken dump image, even if
it can be captured faster?
We capture dump images in order to analyze them, of course.
At the very least, we should supply alternatives to the affected commands.

Thanks
Atsushi Kumagai

> -Cliff
> >
> > >
> > > -Cliff
> > >
> > > >
> > > > It might be better to just generally warn that results may be unpredictable
> > > > because any command that accesses page structures, such as kmem, may fail.
> > > > By listing specific options, it makes it sound like that they are the only
> > > > ones, so it may be better to leave it open-ended.
> > > >
> > > > Dave
> > >
> > > --
> > > Cliff Wickman
> > > SGI
> > > cpw at sgi.com
> > > (651) 683-3824
> > >
>
> --
> Cliff Wickman
> SGI
> cpw at sgi.com
> (651) 683-3824
>
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility