[Crash-utility] [PATCH] Report error for NULL list_head pointers

Dave Anderson anderson at redhat.com
Fri Mar 31 17:42:16 UTC 2017



----- Original Message -----
> On Thu, Mar 30, 2017 at 03:13:27PM -0400, Dave Anderson wrote:
> > > If we hit a NULL next pointer on a struct list_head list it means the
> > > list is corrupted.

Thanks Rabin, queued for crash-7.1.9:

  https://github.com/crash-utility/crash/commit/a5ebe53b6b2a55ed64bf6f580e24edb76018ee48

Dave

  
> > 
> > Yeah, that is true -- although it's always been this way and there's never
> > been a bug report.  I'm curious as to what happened in your case where
> > you discovered this?
> 
> I received a dump where the kernel had crashed while iterating through a
> linked list.  We read out the list in crash and it was certainly
> corrupted:
> 
>  crash-arm> list -xH 0x7f047394
>  be44f544
>  8ff69004
>  8ff69f04
>  8ff693c4
>  8ffe0e44
>  8ffe0244
>  ...
>  8ffb53c4
>  be448b44
>  8fc2c3c4
>  8fc2c0c4
>  8fc2c604
>  8fc2cf04
>  ffff
>  list: invalid kernel virtual address: ffff  type: "list entry"
>  crash-arm>
> 
> Further investigation led us to suspect that this was not a simple case
> of a freed element still being on the list, but some other larger memory
> corruption.  We wanted to find out if there were more corrupted entries
> on this list, so we dumped the list in reverse using the .prev pointers:
> 
>  crash-arm> list -rxH 0x7f047394
>  b957fcc4
>  b957f0c4
>  b4d863c4
>  bad41904
>  bad416c4
>  bad41c04
>  bad41784
>  bad41544
>  be5f4b44
>  ...
>  8ff7de44
>  8f7a4c04
>  8f7a4f04
>  8fc2c9c4
>  8fc2c904
>  8fc2c784
>  8fc2ce44
>  crash-arm>
> 
> This, surprisingly, terminated successfully.
> 
> However, a closer look at the addresses showed that the last elements of
> the reverse iteration are not the first elements of the forward
> iteration.  So crash had silently stopped iteration halfway into the
> list.  This was because the 8fc2ce44 element had a NULL prev pointer.
> 
>  crash-arm> struct list_head 8fc2ce44
>  struct list_head {
>    next = 0xffffffff,
>    prev = 0x0
>  }
> 
> Since crash knows that the list is corrupted, it would seem appropriate
> for it to alert the user to this fact instead of silently and
> successfully terminating the iteration.
> 
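
For illustration, the behaviour requested above can be sketched as a small
list walk that treats a NULL link as an error rather than as a normal end of
list. This is only an in-process model and not the code from the commit:
crash reads each entry out of the dump instead of dereferencing pointers, and
the names walk_list, walk_status and max_entries are hypothetical.

```c
#include <stddef.h>
#include <stdio.h>

/* Same two-pointer layout as the kernel's struct list_head. */
struct list_head {
    struct list_head *next;
    struct list_head *prev;
};

/* Hypothetical result codes for this sketch. */
enum walk_status {
    WALK_OK = 0,        /* iteration came back around to the head */
    WALK_NULL_PTR = -1, /* hit a NULL link: the list is corrupted */
    WALK_TOO_LONG = -2  /* gave up after max_entries: probable loop */
};

/*
 * Walk a circular list from head, following .next (reverse == 0) or
 * .prev (reverse != 0).  The number of entries visited is stored in
 * *count.  A NULL link is reported on stderr and returned as an error
 * instead of silently ending the iteration.
 */
static enum walk_status walk_list(const struct list_head *head, int reverse,
                                  size_t max_entries, size_t *count)
{
    const struct list_head *pos = reverse ? head->prev : head->next;
    size_t n = 0;

    while (pos != head) {
        if (pos == NULL) {
            fprintf(stderr,
                    "list: NULL %s pointer after %zu entries: list corrupted\n",
                    reverse ? "prev" : "next", n);
            *count = n;
            return WALK_NULL_PTR;
        }
        if (n == max_entries) {
            *count = n;
            return WALK_TOO_LONG;
        }
        n++;
        pos = reverse ? pos->prev : pos->next;
    }

    *count = n;
    return WALK_OK;
}
```

With a check like this, a reverse walk that reaches an entry whose prev is 0x0,
as 8fc2ce44 does above, would end with an error rather than a clean return to
the prompt.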



