[Crash-utility] Unknown osrelease information in vmcore with xen

Dietmar Hahn dietmar.hahn at ts.fujitsu.com
Mon Dec 14 12:59:28 UTC 2015


Hi Dave,

On Friday, 11 December 2015, 14:35:56, Dave Anderson wrote:
> 
> ----- Original Message -----
> > 
> > 
> > ----- Original Message -----
> > > 
> > > 
> > > ----- Original Message -----
> > > > Hi,
> > > > 
> > > > I have a SuSE SLES11 vmcore with xen and tried to read the osrelease from
> > > > the vmcore with
> > > > # crash --osrelease vmcore
> > > > unknown
> > > > 
> > > > The problem is that there are two notes in the vmcore starting with
> > > > "VMCOREINFO":
> > > > 
> > > > Elf64_Nhdr:
> > > >                n_namesz: 11 ("VMCOREINFO")
> > > >                n_descsz: 1384
> > > >                  n_type: 0 (unused)
> > > >                          OSRELEASE=3.0.101-63-xen
> > > >                ...
> > > > Elf64_Nhdr:
> > > >                n_namesz: 15 ("VMCOREINFO_XEN")
> > > >                n_descsz: 4068
> > > >                  n_type: 0 (unused)
> > > >                ...
> > > > 
> > > > In the function dump_Elf64_Nhdr() I found:
> > > >     vmcoreinfo = STRNEQ(buf, "VMCOREINFO");
> > > > 
> > > > But because "VMCOREINFO_XEN" is the second one in the file, it wins!
> > > > 
> > > > When using
> > > >     vmcoreinfo = STREQ(buf, "VMCOREINFO");
> > > > all is fine and I get:
> > > > # crash --osrelease vmcore
> > > > 3.0.101-63-xen
> > > > 
> > > > So my question is: why is STRNEQ() used?
> > > > Thanks!
> > > 
> > > Hello Dietmar,
> > > 
> > > As I recall, I did all of the note name checks that way because the length
> > > of the name string is specified by the note->n_namesz field, and therefore
> > > not necessarily guaranteed to be a NULL-terminated string?  In reality,
> > > > there probably will be a NULL there, though.
> > > 
> > > Anyway, I wasn't even familiar with the existence of a VMCOREINFO_XEN note,
> > > so please feel free to post a patch to address it.
> > > 
> > > Dave
> > 
> > 
> > This should work, right?:
> > 
> > --- crash-7.1.3/netdump.c.orig
> > +++ crash-7.1.3/netdump.c
> > @@ -1940,7 +1940,8 @@ dump_Elf32_Nhdr(Elf32_Off offset, int st
> >  #endif
> >  	default:
> >  		xen_core = STRNEQ(buf, "XEN CORE") || STRNEQ(buf, "Xen");
> > -		vmcoreinfo = STRNEQ(buf, "VMCOREINFO");
> > +		if (!STRNEQ(buf, "VMCOREINFO_XEN"))
> > +			vmcoreinfo = STRNEQ(buf, "VMCOREINFO");
> >  		eraseinfo = STRNEQ(buf, "ERASEINFO");
> >  		qemuinfo = STRNEQ(buf, "QEMU");
> >  		if (xen_core) {
> > @@ -2196,7 +2197,8 @@ dump_Elf64_Nhdr(Elf64_Off offset, int st
> >  #endif
> >  	default:
> >  		xen_core = STRNEQ(buf, "XEN CORE") || STRNEQ(buf, "Xen");
> > -		vmcoreinfo = STRNEQ(buf, "VMCOREINFO");
> > +		if (!STRNEQ(buf, "VMCOREINFO_XEN"))
> > +			vmcoreinfo = STRNEQ(buf, "VMCOREINFO");
> >  		eraseinfo = STRNEQ(buf, "ERASEINFO");
> >  		qemuinfo = STRNEQ(buf, "QEMU");
> >                  if (xen_core) {
> > 
> > Dave
> 
> Actually, the patch above will prevent the VMCOREINFO_XEN strings from being
> dumped in readable strings by "help -[nD]" or by "crash -d#".  Try the attached
> patch instead, where both VMCOREINFO and VMCOREINFO_XEN string data will be 
> dumped appropriately.
> 
> I do have a couple old RHEL5 xen dom0 dumpfiles that have VMCOREINFO_XEN notes, 
> but they do not have VMCOREINFO notes.  It must be relatively newer Xen kernels
> that have both?  Anyway, check out the updated patch and let me know.

Thanks, the patch works very well!
Only a minor nit, now the n_type has changed:

Elf64_Nhdr:
               n_namesz: 15 ("VMCOREINFO_XEN")
               n_descsz: 4068
                 n_type: 0 (?)

I'm not familiar enough with this to know which is the right one:
 "n_type: 0 (?)" or "n_type: 0 (unused)" 

If it is the second, this works for 64-bit:
--- a/netdump.c
+++ b/netdump.c
@@ -2231,6 +2231,8 @@ dump_Elf64_Nhdr(Elf64_Off offset, int store)
                } else if (qemuinfo) {
                        pc->flags2 |= QEMU_MEM_DUMP_ELF;
                        netdump_print("(QEMUCPUState)\n");
+               } else if (vmcoreinfo_xen) {
+                        netdump_print("(unused)\n");
                 } else
                         netdump_print("(?)\n");
                 break;

> 
> And BTW, checking the ELF specs, the name string should be NULL-terminated, and
> the n_namesz count should include the NULL-byte:
> 
>   namesz and name
>       The first namesz bytes in name contain a null-terminated character
>       representation of the entry's owner and originator.
> 
> There's an accompanying diagram example that shows the namesz value counting
> the NULL terminator.
> 
> However, the reason for STRNEQ() in crash stems from the fact that the 
> original "netdump" generated ELF vmcores were incorrectly created such that
> the n_namesz count did not include the NULL, and I believe that the
> descriptor data started immediately after the name string.

OK, I understand.
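
A minimal sketch of the difference, in case it helps later readers of the
thread. The helper names and the sample byte arrays are hypothetical and
are not crash's macros or real dump contents; the sketch only assumes that
a prefix check compares strlen() of the expected name and that an exact
check needs a NUL directly after the name. The netdump-style array assumes
the descriptor bytes follow the name immediately, as described above.

```c
#include <string.h>

/* Hypothetical stand-ins for the two comparison styles discussed here. */

/* Prefix match: only strlen(want) bytes of buf are compared, so whatever
 * follows the expected name in buf is never looked at. */
static int name_prefix_match(const char *buf, const char *want)
{
	return strncmp(buf, want, strlen(want)) == 0;
}

/* Exact match: succeeds only if buf holds a NUL right after the name. */
static int name_exact_match(const char *buf, const char *want)
{
	return strcmp(buf, want) == 0;
}

/* Assumed netdump-style layout: n_namesz is 4, no NUL after "CORE",
 * made-up descriptor bytes follow directly (last byte keeps strcmp safe). */
static const char netdump_style[] = { 'C', 'O', 'R', 'E', 0x01, 0x02, 0x03, 0x00 };

/* kdump-style layout: n_namesz is 5, the NUL is part of the name,
 * followed by padding up to a 4-byte boundary. */
static const char kdump_style[] = { 'C', 'O', 'R', 'E', 0, 0, 0, 0 };
```

With these, the prefix check accepts both layouts, the exact check rejects
the netdump-style one, and the prefix check also accepts "VMCOREINFO_XEN"
when asked for "VMCOREINFO", which is the problem I ran into.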

> 
> For example, here's an old 2.6.9-based netdump generated vmcore, where
> the name is "CORE", and therefore *should* have a n_namesz of 5 if it
> included a NULL terminator:
> 
> Elf64_Nhdr:
>                n_namesz: 4 ("CORE")
>                n_descsz: 336
>                  n_type: 1 (NT_PRSTATUS)
> 
> The correct way would include the NULL-byte as well, as is done with
> kdump generated ELF vmcores:
> 
> Elf64_Nhdr:
>                n_namesz: 5 ("CORE")
>                n_descsz: 336
>                  n_type: 1 (NT_PRSTATUS)
> 
> So for backwards-compatibility, I'd prefer to leave the checks using STRNEQ().

Yes, you are right.
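
For the record, one way to keep prefix matching and still tell the two
notes apart is to test the longer name first. This is only a sketch under
that assumption, not the attached patch; the enum and function names are
hypothetical.

```c
#include <string.h>

/* Hypothetical classification of a note name; not taken from netdump.c. */
enum note_kind { NOTE_OTHER, NOTE_VMCOREINFO, NOTE_VMCOREINFO_XEN };

/* Prefix comparison bounded by the length of the expected name, so it
 * still works when the note name carries no terminating NUL. */
static int has_prefix(const char *buf, const char *want)
{
	return strncmp(buf, want, strlen(want)) == 0;
}

/* Check the longer name first: "VMCOREINFO" is itself a prefix of
 * "VMCOREINFO_XEN", so the order of the tests matters. */
static enum note_kind classify_note_name(const char *buf)
{
	if (has_prefix(buf, "VMCOREINFO_XEN"))
		return NOTE_VMCOREINFO_XEN;
	if (has_prefix(buf, "VMCOREINFO"))
		return NOTE_VMCOREINFO;
	return NOTE_OTHER;
}
```

That way OSRELEASE would only be taken from the plain VMCOREINFO note,
while the VMCOREINFO_XEN data can still be recognized and dumped.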
Many thanks!

Dietmar.

> 
> Thanks,
>   Dave


-- 
Company details: http://ts.fujitsu.com/imprint.html



