[Crash-utility] CFS runqueue list handling improvements

Thu Feb 2 01:41:49 UTC 2012

Hello,

From: David Mair <dmair at suse.com>
Subject: [Crash-utility] CFS runqueue list handling improvements
Date: Wed, 01 Feb 2012 12:07:46 -0700

> I was using a crash dump last week where the CFS runqueue for one CPU
> had a loop in it. It was caused by a red/black tree node that was the
> left child of one node at the same time as being the right child of a
> different node. The doubly-linked node has only one parent, one of the
> nodes that has it as a child here). Call the node with two links to it
> X, the node that is the stored parent of X call Y and the other node
> that links to X call Z. The parent of Y is Z here but X is the right
> node of Z (left of Y). So, you reach Z, do all it's left branch then go
> right to reach X. Go up to Z from the left not Y from the right and do
> all the right of Z. Go up to Y and your last node isn't the left or
> right child so you go do the whole left tree of Y again, etc.

So, I understand this as the following diagram, without explicit
parent notations:

            :
            :
            |
            V     parent
            Z  <----------  Y
            |              ^ |
   rb_right |       parent | | left
            V              | |
            X -------------+ |
              <--------------+

, and the loop continues just like:

            -> Z -> X -> Y
               ^         |
               |         |
               +---------+

Is this right?

One concern I have is when this kind of situation could
happen. Propbably only unde rtree operation?

Thanks.
HATAYAMA, Daisuke

> 
> In that case, the runq command (task.c) in crash never bails out of
> dump_CFS_runqueues() because the for loop that goes from rb_first()
> through all of rb_next() has no means of detecting the problem, although
> you can use the runq command with -d and only get the task list (which
> wasn't corrupted). The runq command alone then becomes very annoying to
> use when diagnosing the actual problem.
> 
> It occurs to me that this could be improved with one or more cases to
> detect as additional exit conditions with warning messages for that loop:
> 
> * Number of iterations exceeds cfs_rq_nr_running (perhaps allowing it to
> be exceeded by some small amount to have a chance of seeing the nature
> of a problem with it, perhaps with a runq command line switch to allow
> ignoring any problems)
> 
> * At the node making the cfs_rq_nr_running iteration, note the node
> address, then exit with a warning if that node is ever displayed again,
> thus showing any loop that starts within the first cfs_rq_nr_running nodes.
> 
> * A more complex loop detection that handles cases where
> cfs_rq_nr_running is significantly lower than the number of actual nodes
> in the valid part of the tree.
> 
> * I suppose if you still have the last node processed and you are going
> up then that last node should always be the left or right child of the
> parent you are reaching or the tree is broken.
> 
> I understand that cfs_rq_nr_running could contain the "corruption" so
> assuming it indicates the correct number of times to call rb_next()
> isn't correct either, so my preference is for some form of loop
> detection. Before I work on a patch, are there any opinions on the
> behavior of the runq command in the case of a corrupt CFS runqueue, e.g.
> one that contains a loop?
> 
> -- 
> David Mair
> SUSE Linux
> 
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility
>