[Crash-utility] x86_64: supporting cpu hot remove

Dave Anderson anderson at redhat.com
Thu Sep 18 19:57:03 UTC 2014


----- Original Message -----
>
> The attachment is what I am trying to implement. If you don't like it, we can
> go on discussing it.

Hello Qiao, 

This patch-set could use some clarification.  In fact, I was at
first confused as to the meaning of flag, and thought it was
"backwards".  Then I realized that you are hiding the offline
cpus data by default, and the user would have to turn it back
on to show the data.

First, as I indicated in our original discussion, I do not want
to make such a change in traditional crash behavior by default.  
The decision to hide an offline cpu's data should be up to the user. 

Second, back to semantics, the environment variable name and its 
arguments should be clarified to make it more obvious as to what
they actually mean.  The "[on|off]" is confusing when combined
with "offline_cpu".  Let me suggest something like:

  $ crash --offline [show|hide]    (where "show" is the default)

or during runtime:

  crash> set offline [show|hide]

And then with respect to setting its internal state:

  (pc->flags2 & OFFLINE_HIDE):  means that "hide" mode is turned on.

Third, your sample construct is kind of hard to understand, in fact
I thought the logic was backwards at first:

   next_cpu:
  +       if (!(pc->flags2 & OFFLINE_CPU) && check_offline_cpu(cpu)) {
  +               if (++cpu < kt->cpus)
  +                       goto next_cpu;
  +       }
  +

Since the check_offline_cpu() is only going to be used to determine
whether something is going to be shown or hidden, perhaps you could
create a more meaningful function name that could be used like this:

   next_cpu:
          if (hide_offline_cpu(cpu)) {
                  if (++cpu < kt->cpus)
                          goto next_cpu;
          }

And lastly, when the offline cpu data is to be hidden, I still feel 
that there should be *some* kind of indication that the cpu is offline.
In your example, instead of just skipping the TVEC_BASES[cpu] line, 
why not indicate why it's missing?  So in your example, show something
like this:

TVEC_BASES[3]: [OFFLINE]

I think these are all reasonable compromises.  

And also, don't forget crash.8, which should basically show the
same output as "crash -h".

Thanks,
  Dave






More information about the Crash-utility mailing list