[Crash-utility] The problems when running SuSE 12 on VirtualBox

David Mair dmair at suse.com
Thu Nov 19 22:14:48 UTC 2015


On 11/19/2015 02:38 PM, Dave Anderson wrote:
> 
> 
> ----- Original Message -----
>> On 11/19/2015 12:20 PM, Dave Anderson wrote:
>>>
>>>
>>> ----- Original Message -----
>>>> On 11/19/2015 08:37 AM, David Mair wrote:
>>>>> On 11/19/2015 07:45 AM, Dave Anderson wrote:
>>>>>
>>>>> Hi Dave,
>>>>>
>>>>>> ----- Original Message -----
>>>>> <snip>
>>>>>>
>>>>>>> (2) Execute "crash -d8" on physical machine will cause crash utility
>>>>>>> core
>>>>>>> dump.
>>>>>>
>>>>>> I can reproduce this, so I'll look into it.  It's related to the
>>>>>> /dev/mem "test"
>>>>>> to determine whether the kernel was configured with
>>>>>> CONFIG_STRICT_DEVMEM, where
>>>>>> it tries to read from pfn 257 (just above the CONFIG_STRICT_DEVMEM
>>>>>> limit), but
>>>>>> gets into an infinite loop when used in conjunction with -d.
>>>>>>
>>>>>> Anyway, just continue to use /proc/kcore and you should be fine.
>>>>>
>>>>> This is the cause in readmem():
>>>>>
>>>>> switch(READMEM(...))
>>>>> {
>>>>> 	.
>>>>> 	.
>>>>> 	.
>>>>> case READ_ERROR:
>>>>> 	if (PRINT_ERROR_MESSAGE) ********** THIS ***********
>>>>> 	{
>>>>> 		causes a nested readmem() call before the goto gives it
>>>>> 		to the caller to deal with
>>>>> 	}
>>>>> 	goto readmem_error
>>>>> }
>>>>> 	.
>>>>> 	.
>>>>> 	.
>>>>> switch(error_handle)
>>>>> {
>>>>> 	case (RETURN_ON_ERROR):
>>>>> }
>>>>>
>>>>> The PRINT_ERROR_MESSAGE I assume is an escalation from -d 8 in this case.
>>>>>
>>>>
>>>> The whole switch_to_proc_kcore() probably shouldn't be conditional on
>>>> the presence of PRINT_ERROR_MESSAGE, only the actual error message
>>>> should be.
>>>>
>>>> Patch shortly.
>>>>
>>>> --
>>>> David.
>>>
>>> Yeah, I agree.  It was done that way to catch the very first non-QUIET
>>> readmem()
>>> call from kernel_init(), and make the bait-and-switch right there.  Without
>>> the
>>> CRASHDEBUG(x) override, it works as-is because the readmem() calls in
>>> devmem_is_restricted() are purposely set to QUIET.
>>>
>>> So we're thinking something like this, right?:
>>>
>>> diff --git a/memory.c b/memory.c
>>> index 824b3ae..2282ba9 100644
>>> --- a/memory.c
>>> +++ b/memory.c
>>> @@ -2207,13 +2207,12 @@ readmem(ulonglong addr, int memtype, void *buffer,
>>> long size,
>>>                          goto readmem_error;
>>>  
>>>                 case READ_ERROR:
>>> -                        if (PRINT_ERROR_MESSAGE) {
>>> -                               if ((pc->flags & DEVMEM) && (kt->flags &
>>> PRE_KERNEL_INIT) &&
>>> -                                   devmem_is_restricted() &&
>>> switch_to_proc_kcore())
>>> -                                       return(readmem(addr, memtype,
>>> bufptr, size,
>>> -                                               type, error_handle));
>>> +                       if ((pc->flags & DEVMEM) && (kt->flags &
>>> PRE_KERNEL_INIT) &&
>>> +                           !(error_handle & QUIET) &&
>>> devmem_is_restricted() && switch_to_proc_kcore())
>>> +                               return(readmem(addr, memtype, bufptr, size,
>>> +                                       type, error_handle));
>>> +                       if (PRINT_ERROR_MESSAGE)
>>>                                 error(INFO, READ_ERRMSG,
>>>                                 memtype_string(memtype, 0), addr, type);
>>> -                       }
>>>                          goto readmem_error;
>>>  
>>>                 case PAGE_EXCLUDED:
>>>
>>> Works for me, and is better than hiding a recursive-call check in
>>> devmem_is_restricted().
>>
>> That protects from nesting devmem_is_restricted() as long as
>> devmem_is_restricted() continues to do QUIET readmem()s but fails to
>> switch_to_proc_kcore() for QUIET readmem() at other random failing
>> addresses that could be fixed via /proc/kcore. QUIET isn't a global flag
>> only for being in devmem_is_restricted().
> 
> I'm not following...
> 
> There are the 3 QUIET readmem() calls in devmem_is_restricted().  As soon
> as the first one fails, it comes into the code segment above, and will
> call switch_to_proc_kcore().  There are no readmem() calls there.  So
> if /proc/kcore is usable, it switches to it, and we continue just fine.
> 
> And, for example, if I force switch_to_proc_kcore() to return FALSE, 
> it still works:
> 
> $ crash -d1 /dev/mem
> ... [ cut ] ...
> GNU gdb (GDB) 7.6
> Copyright (C) 2013 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
> and "show warranty" for details.
> This GDB was configured as "x86_64-unknown-linux-gnu"...
> 
> crash: read error: physical address: 101000  type: "devmem_is_allowed - pfn 257"
> crash: read error: kernel virtual address: ffffffff81e08000  type: "devmem_is_allowed - jiffies"
> crash: this kernel may be configured with CONFIG_STRICT_DEVMEM, which
>        renders /dev/mem unusable as a live memory source.
> crash: read error: kernel virtual address: ffffffff81a114b0  type: "cpu_possible_mask"
> # 
> 
> The first QUIET read fails, tries /proc/kcore and it fails, and the error message
> is printed.  It returns back to devmem_is_restricted(), which calls readmem() again,
> but since DEVMEM has been cleared, it won't retry the /proc/kcore switch, prints the
> "jiffies" error message and returns.  At that point, it is marked as restricted, 
> returns back to readmem() which fails the original "cpu_possible_mask" readmem() 
> that came from kernel_init().  At that point the crash session is killed.
>  
> Note that because we have /dev/crash built-in, I had to put /dev/mem on the command line,
> but I wouldn't think that would make a difference.
> 
> Am I missing something?  Do you see something different?

I was originally thinking about the same as yourself but decided against
it due to thinking first that, in English, QUIET is a bit abstract for
"don't try to switch devmem source" as well as avoid output.

Plus, if any other readmem() that's QUIET fails READ_ERROR for any
random reason and devmem_is_restricted() will fail a
switch_to_proc_kcore() would have resolved it then that caller's use of
QUIET will prevent devmem_is_restricted() being tried and it won't
switch_to_proc_kcore(). I admit that means the status of
devmem_is_restricted() has changed between initialization and now, but
it's a live environment with it's own behavior.

Due to the small number of error_handle flags here's one that uses the
best of your idea without expanding the meaning of the word QUIET. It
adds another flag for the sole purpose of saying "don't try to switch
devmem source this time" and it avoids all the overhead I originally had.

The name for the flag is long enough that the value tab alignment
doesn't match with the other flags (I can't think of a shorter name for
the flag but maybe you can). I chose not to appear to make three extra
lines of changes just to expand the tab alignment on the other flags. It
also leaves PRINT_ERROR_MESSAGE over-riding QUIET as at present but I
don't have a preference for that -d8 has it's useful moments:

diff --git a/defs.h b/defs.h
index b0cdd42..29f46b4 100644
--- a/defs.h
+++ b/defs.h
@@ -328,6 +328,7 @@ struct number_option {
 #define HEX_BIAS         (0x8)
 #define LONG_LONG       (0x10)
 #define RETURN_PARTIAL  (0x20)
+#define NO_DEVMEM_SWITCH (0x40)

 #define SEEK_ERROR       (-1)
 #define READ_ERROR       (-2)
diff --git a/memory.c b/memory.c
index 72218e7..b4629a6 100644
--- a/memory.c
+++ b/memory.c
@@ -2204,14 +2204,14 @@ readmem(ulonglong addr, int memtype, void
*buffer, long size,
                         goto readmem_error;

 		case READ_ERROR:
-                        if (PRINT_ERROR_MESSAGE) {
-				if ((pc->flags & DEVMEM) && (kt->flags & PRE_KERNEL_INIT) &&
-				    devmem_is_restricted() && switch_to_proc_kcore())
-					return(readmem(addr, memtype, bufptr, size,
-						type, error_handle));
+                        if (PRINT_ERROR_MESSAGE)
 				error(INFO, READ_ERRMSG, memtype_string(memtype, 0), addr, type);
-			}
-                        goto readmem_error;
+
+			if ((pc->flags & DEVMEM) && (kt->flags & PRE_KERNEL_INIT) &&
+			    !(error_handle & NO_DEVMEM_SWITCH) && devmem_is_restricted() &&
switch_to_proc_kcore())
+				return(readmem(addr, memtype, bufptr, size,
+					type, error_handle));
+			goto readmem_error;

 		case PAGE_EXCLUDED:
 			RETURN_ON_PARTIAL_READ();
@@ -2410,16 +2410,16 @@ devmem_is_restricted(void)
 	    	if (machine_type("X86") || machine_type("X86_64")) {
 			if (readmem(255*PAGESIZE(), PHYSADDR, &tmp,
 			    sizeof(long), "devmem_is_allowed - pfn 255",
-			    QUIET|RETURN_ON_ERROR) &&
+			    QUIET|RETURN_ON_ERROR|NO_DEVMEM_SWITCH) &&
 			    !(readmem(257*PAGESIZE(), PHYSADDR, &tmp,
 			    sizeof(long), "devmem_is_allowed - pfn 257",
-			    QUIET|RETURN_ON_ERROR)))
+			    QUIET|RETURN_ON_ERROR|NO_DEVMEM_SWITCH)))
 				restricted = TRUE;
 		}
 		if (kernel_symbol_exists("jiffies") &&
 		    !readmem(symbol_value("jiffies"), KVADDR, &tmp,
 		    sizeof(ulong), "devmem_is_allowed - jiffies",
-		    QUIET|RETURN_ON_ERROR))
+		    QUIET|RETURN_ON_ERROR|NO_DEVMEM_SWITCH))
 			restricted = TRUE;

 		if (restricted)




More information about the Crash-utility mailing list