[Crash-utility] x86_64: supporting cpu hot remove

Dave Anderson anderson at redhat.com
Tue Sep 16 15:46:45 UTC 2014



----- Original Message -----
> Hello Dave,
> 
> On 09/09/2014 10:04 PM, Dave Anderson wrote:
> >>> >  >  Many of the changes reflect the contents of per-cpu data structures
> >>> >  >  of offlined cpus, but even though the cpu is currently offline, the
> >>> >  >  data structures still exist.  Why prevent the user from viewing
> >>> >  >  their contents?
> >> >
> >> >  I think just showing online cpu's data is reasonable.
> > Why?  Give me an example as to when it is/was a problem?
> >
> >> >  What about adding an internal crash variable (set via the "set"
> >> >  command) to hide/show offline cpus' data?
> > I suppose that could be done, but again, in my opinion there is no
> > compelling reason to do so.  I could be wrong, but aside from maybe
> > "help -r", it seems that you are trying to answer a question that
> > nobody's asking.
> 
> I know it is important to be able to show the data of offline cpus, for
> example when debugging hot remove. But for those who don't care about the
> removed cpus, hiding offline cpus makes the output clearer. Let me explain
> why I think so.
> 
> I got a vmcore from a machine that started with 90 cpus, 30 of which were
> later physically removed. After the removal, the machine ran with 60 cpus.
> To those who don't care about the data of the removed cpus, the following
> is confusing:
> 
> 1. The machine has only 60 cpus, but crash shows 90 cpus.
> 2. When I execute the timer command, crash shows 90 TVEC_BASES, and some of
> them (maybe more than 30) are empty. I have to check which cpus are offline
> before I can tell whether an entry is empty because its cpu is offline or
> just because no timer was set on that cpu.
> 3. As for idle tasks, an offline cpu is halted and its idle task does not
> run, but crash shows it as running right now.
> 4. ...
> 
> After checking the kernel, I found that when a cpu is set offline, its
> processes, timers, interrupts, etc. are migrated to another cpu. So I tried
> to hide a cpu's data once it is set offline (logically removed), rather than
> only when it is physically removed.
> 
> The attachment is what I am trying to implement. If you don't like it, we can
> go on discussing it.

Hello Qiao,

Making it configurable sounds reasonable enough -- I will look into the details
of this patch-set later this week.
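
To make the idea concrete, here is a rough sketch of how a hide/show switch
could gate per-cpu output. It is not taken from the patch-set and is not
crash's actual code: the struct, the field names, the function names and the
"[OFFLINE]" label are all hypothetical, chosen only for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_CPUS 128  /* hypothetical upper bound for this sketch */

/* Hypothetical state: which cpus are online, plus the user's preference. */
struct cpu_view {
    bool online[MAX_CPUS];
    size_t nr_cpus;       /* cpus known to the dump, online or not */
    bool hide_offline;    /* hypothetical user-settable switch */
};

/* Return true if per-cpu data for 'cpu' should be displayed. */
static bool cpu_is_displayed(const struct cpu_view *v, size_t cpu)
{
    assert(v->nr_cpus <= MAX_CPUS && cpu < v->nr_cpus);
    return v->online[cpu] || !v->hide_offline;
}

/* Count the cpus that would be listed under the current setting. */
static size_t displayed_cpu_count(const struct cpu_view *v)
{
    size_t n = 0;
    for (size_t cpu = 0; cpu < v->nr_cpus; cpu++)
        if (cpu_is_displayed(v, cpu))
            n++;
    return n;
}

/* Label an empty per-cpu entry so that "cpu is offline" can be told
 * apart from "cpu is online but has nothing queued". */
static const char *empty_entry_label(const struct cpu_view *v, size_t cpu)
{
    assert(v->nr_cpus <= MAX_CPUS && cpu < v->nr_cpus);
    return v->online[cpu] ? "(empty)" : "[OFFLINE]";
}
```

With the switch off, all 90 cpus from your example would still be listed and
the offline ones merely labeled; with it on, only the 60 online cpus would
appear.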

Thanks,
  Dave


> 
> --
> Regards
> Qiao Nuohan
> 



