[Crash-utility] The problems when running SuSE 12 on VirtualBox

Dave Anderson anderson at redhat.com
Thu Nov 19 21:38:01 UTC 2015



----- Original Message -----
> On 11/19/2015 12:20 PM, Dave Anderson wrote:
> > 
> > 
> > ----- Original Message -----
> >> On 11/19/2015 08:37 AM, David Mair wrote:
> >>> On 11/19/2015 07:45 AM, Dave Anderson wrote:
> >>>
> >>> Hi Dave,
> >>>
> >>>> ----- Original Message -----
> >>> <snip>
> >>>>
> >>>>> (2) Execute "crash -d8" on physical machine will cause crash utility
> >>>>> core
> >>>>> dump.
> >>>>
> >>>> I can reproduce this, so I'll look into it.  It's related to the
> >>>> /dev/mem "test"
> >>>> to determine whether the kernel was configured with
> >>>> CONFIG_STRICT_DEVMEM, where
> >>>> it tries to read from pfn 257 (just above the CONFIG_STRICT_DEVMEM
> >>>> limit), but
> >>>> gets into an infinite loop when used in conjunction with -d.
> >>>>
> >>>> Anyway, just continue to use /proc/kcore and you should be fine.
> >>>
> >>> This is the cause in readmem():
> >>>
> >>> switch(READMEM(...))
> >>> {
> >>> 	.
> >>> 	.
> >>> 	.
> >>> case READ_ERROR:
> >>> 	if (PRINT_ERROR_MESSAGE) ********** THIS ***********
> >>> 	{
> >>> 		causes a nested readmem() call before the goto gives it
> >>> 		to the caller to deal with
> >>> 	}
> >>> 	goto readmem_error
> >>> }
> >>> 	.
> >>> 	.
> >>> 	.
> >>> switch(error_handle)
> >>> {
> >>> 	case (RETURN_ON_ERROR):
> >>> }
> >>>
> >>> The PRINT_ERROR_MESSAGE I assume is an escalation from -d 8 in this case.
> >>>
> >>
> >> The whole switch_to_proc_kcore() probably shouldn't be conditional on
> >> the presence of PRINT_ERROR_MESSAGE, only the actual error message
> >> should be.
> >>
> >> Patch shortly.
> >>
> >> --
> >> David.
> > 
> > Yeah, I agree.  It was done that way to catch the very first non-QUIET
> > readmem()
> > call from kernel_init(), and make the bait-and-switch right there.  Without
> > the
> > CRASHDEBUG(x) override, it works as-is because the readmem() calls in
> > devmem_is_restricted() are purposely set to QUIET.
> > 
> > So we're thinking something like this, right?:
> > 
> > diff --git a/memory.c b/memory.c
> > index 824b3ae..2282ba9 100644
> > --- a/memory.c
> > +++ b/memory.c
> > @@ -2207,13 +2207,12 @@ readmem(ulonglong addr, int memtype, void *buffer,
> > long size,
> >                          goto readmem_error;
> >  
> >                 case READ_ERROR:
> > -                        if (PRINT_ERROR_MESSAGE) {
> > -                               if ((pc->flags & DEVMEM) && (kt->flags &
> > PRE_KERNEL_INIT) &&
> > -                                   devmem_is_restricted() &&
> > switch_to_proc_kcore())
> > -                                       return(readmem(addr, memtype,
> > bufptr, size,
> > -                                               type, error_handle));
> > +                       if ((pc->flags & DEVMEM) && (kt->flags &
> > PRE_KERNEL_INIT) &&
> > +                           !(error_handle & QUIET) &&
> > devmem_is_restricted() && switch_to_proc_kcore())
> > +                               return(readmem(addr, memtype, bufptr, size,
> > +                                       type, error_handle));
> > +                       if (PRINT_ERROR_MESSAGE)
> >                                 error(INFO, READ_ERRMSG,
> >                                 memtype_string(memtype, 0), addr, type);
> > -                       }
> >                          goto readmem_error;
> >  
> >                 case PAGE_EXCLUDED:
> > 
> > Works for me, and is better than hiding a recursive-call check in
> > devmem_is_restricted().
> 
> That protects from nesting devmem_is_restricted() as long as
> devmem_is_restricted() continues to do QUIET readmem()s but fails to
> switch_to_proc_kcore() for QUIET readmem() at other random failing
> addresses that could be fixed via /proc/kcore. QUIET isn't a global flag
> only for being in devmem_is_restricted().

I'm not following...

There are the 3 QUIET readmem() calls in devmem_is_restricted().  As soon
as the first one fails, it comes into the code segment above, and will
call switch_to_proc_kcore().  There are no readmem() calls there.  So
if /proc/kcore is usable, it switches to it, and we continue just fine.

And, for example, if I force switch_to_proc_kcore() to return FALSE, 
it still works:

$ crash -d1 /dev/mem
... [ cut ] ...
GNU gdb (GDB) 7.6
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-unknown-linux-gnu"...

crash: read error: physical address: 101000  type: "devmem_is_allowed - pfn 257"
crash: read error: kernel virtual address: ffffffff81e08000  type: "devmem_is_allowed - jiffies"
crash: this kernel may be configured with CONFIG_STRICT_DEVMEM, which
       renders /dev/mem unusable as a live memory source.
crash: read error: kernel virtual address: ffffffff81a114b0  type: "cpu_possible_mask"
# 

The first QUIET read fails, tries /proc/kcore and it fails, and the error message
is printed.  It returns back to devmem_is_restricted(), which calls readmem() again,
but since DEVMEM has been cleared, it won't retry the /proc/kcore switch, prints the
"jiffies" error message and returns.  At that point, it is marked as restricted, 
returns back to readmem() which fails the original "cpu_possible_mask" readmem() 
that came from kernel_init().  At that point the crash session is killed.
 
Note that because we have /dev/crash built-in, I had to put /dev/mem on the command line,
but I wouldn't think that would make a difference.

Am I missing something?  Do you see something different?

Dave





  

> 
> Here's mine, it's the kind of clumsy hardcore you were trying to avoid,
> it's not thread or signal safe. I do feel the same concern you expressed
> but I believe it does work for all failing QUIET readmem() cases.
> 
> diff --git a/memory.c b/memory.c
> index 72218e7..71f1e47 100644
> --- a/memory.c
> +++ b/memory.c
> @@ -319,6 +319,7 @@ static void dump_hstates(void);
>  #define ASCII_UNLIMITED ((ulong)(-1) >> 1)
> 
>  static ulong DISPLAY_DEFAULT;
> +static int IN_DEVMEM_IS_RESTRICTED;
> 
>  /*
>   *  Verify that the sizeof the primitive types are reasonable.
> @@ -338,6 +339,7 @@ mem_init(void)
>                  error(FATAL, "pointer size: %d is not sizeof(long):
> %d\n", sizeof(void *), sizeof(long));
> 
>          DISPLAY_DEFAULT = (sizeof(long) == 8) ? DISPLAY_64 : DISPLAY_32;
> +        IN_DEVMEM_IS_RESTRICTED = FALSE;
>  }
> 
> 
> @@ -2204,13 +2206,13 @@ readmem(ulonglong addr, int memtype, void
> *buffer, long size,
>                          goto readmem_error;
> 
>  		case READ_ERROR:
> -                        if (PRINT_ERROR_MESSAGE) {
> -				if ((pc->flags & DEVMEM) && (kt->flags & PRE_KERNEL_INIT) &&
> -				    devmem_is_restricted() && switch_to_proc_kcore())
> -					return(readmem(addr, memtype, bufptr, size,
> -						type, error_handle));
> +                        if (PRINT_ERROR_MESSAGE)
>  				error(INFO, READ_ERRMSG, memtype_string(memtype, 0), addr, type);
> -			}
> +
> +			if ((pc->flags & DEVMEM) && (kt->flags & PRE_KERNEL_INIT) &&
> +			    devmem_is_restricted() && switch_to_proc_kcore())
> +				return(readmem(addr, memtype, bufptr, size,
> +					type, error_handle));
>                          goto readmem_error;
> 
>  		case PAGE_EXCLUDED:
> @@ -2393,6 +2395,11 @@ devmem_is_restricted(void)
>  	long tmp;
>  	int restricted;
> 
> +	if (IN_DEVMEM_IS_RESTRICTED)
> +		return FALSE;
> +	IN_DEVMEM_IS_RESTRICTED = TRUE;
> +
> +	restricted = FALSE;
> 
>  	/*
>  	 *  Check for pre-CONFIG_STRICT_DEVMEM kernels.
> @@ -2401,11 +2408,9 @@ devmem_is_restricted(void)
>  		if (machine_type("ARM") || machine_type("ARM64") ||
>  		    machine_type("X86") || machine_type("X86_64") ||
>  		    machine_type("PPC") || machine_type("PPC64"))
> -			return FALSE;
> +			goto end_devmem_is_restricted;
>  	}
> 
> -	restricted = FALSE;
> -
>  	if (STREQ(pc->live_memsrc, "/dev/mem")) {
>  	    	if (machine_type("X86") || machine_type("X86_64")) {
>  			if (readmem(255*PAGESIZE(), PHYSADDR, &tmp,
> @@ -2429,6 +2434,9 @@ devmem_is_restricted(void)
>  			    "source.\n");
>  	}
> 
> +end_devmem_is_restricted:
> +	IN_DEVMEM_IS_RESTRICTED = FALSE;
> +
>  	return restricted;
>  }
> 
> 
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
> 




More information about the Crash-utility mailing list