[Crash-utility] [PATCH] Report error for NULL list_head pointers

Rabin Vincent rabin.vincent at axis.com
Fri Mar 31 05:59:26 UTC 2017


On Thu, Mar 30, 2017 at 03:13:27PM -0400, Dave Anderson wrote:
> > If we hit a NULL next pointer on a struct list_head list it means the
> > list is corrupted.
> 
> Yeah, that is true -- although it's always been this way and there's never
> been a bug report.  I'm curious as to what happened in your case where
> you discovered this? 

I received a dump where the kernel had crashed while iterating through a
linked list.  We read out the list in crash and it was certainly
corrupted:

 crash-arm> list -xH 0x7f047394
 be44f544
 8ff69004
 8ff69f04
 8ff693c4
 8ffe0e44
 8ffe0244
 ...
 8ffb53c4
 be448b44
 8fc2c3c4
 8fc2c0c4
 8fc2c604
 8fc2cf04
 ffff
 list: invalid kernel virtual address: ffff  type: "list entry"
 crash-arm> 

Further investigation led us to suspect that this was not a simple case
of a freed element still being on the list, but some other larger memory
corruption.  We wanted to find out if there were more corrupted entries
on this list, so we dumped the list in reverse using the .prev pointers:

 crash-arm> list -rxH 0x7f047394
 b957fcc4
 b957f0c4
 b4d863c4
 bad41904
 bad416c4
 bad41c04
 bad41784
 bad41544
 be5f4b44
 ...
 8ff7de44
 8f7a4c04
 8f7a4f04
 8fc2c9c4
 8fc2c904
 8fc2c784
 8fc2ce44
 crash-arm> 

This, surprisingly, terminated successfully.

However, a closer look at the addresses showed that the last elements of
the reverse iteration are not the first elements of the forward
iteration.  So crash had silently stopped iteration halfway into the
list.  This was because the 8fc2ce44 element had a NULL prev pointer.

 crash-arm> struct list_head 8fc2ce44
 struct list_head {
   next = 0xffffffff, 
   prev = 0x0
 }

Since crash knows that the list is corrupted, it would seem appropriate
for it to alert the user to this fact instead of silently and
successfully terminating the iteration.
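
To illustrate the behaviour being asked for, here is a minimal standalone
sketch of a list walker that treats a NULL link as corruption rather than
as a normal end of list.  It is not crash's actual implementation: the
names walk_list and walk_status, the max_entries guard, and the error
message text are all hypothetical, and it walks ordinary in-process
pointers instead of reading dump memory.

```c
#include <stddef.h>
#include <stdio.h>

/* Same two-pointer layout as the kernel's struct list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Hypothetical result codes for this sketch. */
enum walk_status {
    WALK_COMPLETE,   /* iteration arrived back at the list head */
    WALK_NULL_LINK,  /* hit a NULL next/prev pointer: list is corrupted */
    WALK_TOO_LONG    /* gave up after max_entries (possible loop) */
};

/*
 * Walk a circular list starting from the entry after `head`, following
 * .next (reverse == 0) or .prev (reverse != 0).  The number of entries
 * visited before stopping is stored in *count.
 */
static enum walk_status walk_list(const struct list_head *head, int reverse,
                                  size_t max_entries, size_t *count)
{
    const struct list_head *cur = reverse ? head->prev : head->next;

    *count = 0;
    while (cur != head) {
        if (cur == NULL) {
            /* Report the corruption instead of ending quietly. */
            fprintf(stderr,
                    "list: NULL %s pointer after %zu entries: list corrupted\n",
                    reverse ? "prev" : "next", *count);
            return WALK_NULL_LINK;
        }
        if (*count >= max_entries)
            return WALK_TOO_LONG;
        (*count)++;
        cur = reverse ? cur->prev : cur->next;
    }
    return WALK_COMPLETE;
}
```

With a walker like this, the reverse iteration above would have stopped
at 8fc2ce44 with an error, not with what looks like a complete list.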
