[Crash-utility] crash 4.0-2.8 fails on 2.6.14-rc5 (EM64T)

Wed Oct 26 21:48:42 UTC 2005

Badari Pulavarty wrote:

> On Wed, 2005-10-26 at 16:27 -0400, Dave Anderson wrote:
> > Badari Pulavarty wrote:
> >
> > > On Wed, 2005-10-26 at 14:41 -0400, Dave Anderson wrote:
> > > > Sorry I've generated some unnecessary confusion re: my comments
> > > > about the use of DEFINE_PER_CPU and DECLARE_PER_CPU.
> > > > That's what I get for trying to multi-task...
> > > >
> > > > Stepping back, the init_tss array is defined in "arch/x86_64/kernel/init_task.c".
> > > >
> > > > In 2.6.9, it's declared like so:
> > > >
> > > > /*
> > > >  * per-CPU TSS segments. Threads are completely 'soft' on Linux,
> > > >  * no more per-task TSS's. The TSS size is kept cacheline-aligned
> > > >  * so they are allowed to end up in the .data.cacheline_aligned
> > > >  * section. Since TSS's are completely CPU-local, we want them
> > > >  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
> > > >  */
> > > > DEFINE_PER_CPU(struct tss_struct, init_tss) ____cacheline_maxaligned_in_smp;
> > > >
> > > > In 2.6.13, it's slightly different in that it is initialized to INIT_TSS:
> > > >
> > > > /*
> > > >  * per-CPU TSS segments. Threads are completely 'soft' on Linux,
> > > >  * no more per-task TSS's. The TSS size is kept cacheline-aligned
> > > >  * so they are allowed to end up in the .data.cacheline_aligned
> > > >  * section. Since TSS's are completely CPU-local, we want them
> > > >  * on exact cacheline boundaries, to eliminate cacheline ping-pong.
> > > >  */
> > > > DEFINE_PER_CPU(struct tss_struct, init_tss) ____cacheline_maxaligned_in_smp = INIT_TSS;
> > > >
> > > > Both kernels have the same DECLARE_PER_CPU in the
> > > > "x86_64/processor.h" header file:
> > > >
> > > > DECLARE_PER_CPU(struct tss_struct,init_tss);
> > > >
> > > > That being the case, and not seeing why the INIT_TSS initialization should
> > > > have anything to do with the problem at hand, I am officially stumped at
> > > > why the 2.6.14 kernel shows the problem with your patch.
> > >
> > > Okay, I thought so too. I will take a closer look at it and let you
> > > know what I find. I am tempted to go back to 2.6.10 and see if
> > > crash works. Do you know the last kernel release on which crash
> > > is known to work?
> > >
> >
> > Sorry -- for x86_64, I can't say that I know the last version
> > that worked.  Maybe somebody else on the list who uses kernels
> > other than Red Hat RHEL4 does?
> >
> > Dave
> >
>
> Dave,
>
> I tried 2.6.10 and crash worked fine there. Here is what I found
> interesting: on 2.6.10 the values seem reasonable, but on 2.6.14 they
> are huge.
>
> 2.6.10:
> cpunum: 0 data_offset 10084b80f60
> cpunum: 1 data_offset 10084b88f60
>
> 2.6.14-rc5:
>
> cpunum: 0 data_offset ffff810084af5f60
> cpunum: 1 data_offset ffff810084afdf60
>
> I got curious about the top "0xffff8" part and trimmed it
> (basically I did data_offset & 0x00000fffffffffff).

Well that certainly needs further explanation...

>
> Now I run into the next problem :( I am missing something basic.
>
> crash: read error: kernel virtual address: ffff81000000fa90  type: "pglist_data node_next"
>

That's probably coming from node_table_init().  Could the pglist_data
list now be using per-cpu data structures?  But again, I don't understand
the significance of the ffff8 at the top of the address.

Dave
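
The trimming step Badari describes can be sketched in C as follows. This is
only an illustration of the arithmetic in his message, not code from crash:
the names TRIM_MASK and trim_data_offset are hypothetical, and the mask value
is the one he quotes, which keeps the low 44 bits of the address.

```c
#include <stdint.h>

/* Mask quoted in the message: clears the top 20 bits, keeps the low 44. */
#define TRIM_MASK 0x00000fffffffffffULL

/*
 * Hypothetical helper (not part of crash): apply the mask from the
 * message to a per-cpu data_offset value.
 */
uint64_t trim_data_offset(uint64_t data_offset)
{
    return data_offset & TRIM_MASK;
}
```

Applied to the two 2.6.14-rc5 values above, this gives 10084af5f60 and
10084afdf60, which have the same form as the 2.6.10 values. A value that
already fits in 44 bits, such as 10084b80f60, passes through unchanged.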

>
> Thanks,
> Badari
>