[Crash-utility] x86_64 limit of 454 cpu's?

tachibana at mxm.nes.nec.co.jp
Tue Apr 19 09:03:41 UTC 2011


Hi,

On Fri, 15 Apr 2011 12:13:51 -0400 (EDT)
Dave Anderson <anderson at redhat.com> wrote:

> 
> 
> ----- Original Message -----
> > (4/15/2011 09:04), Dave Anderson wrote:
> > >
> > >
> > > ----- Original Message -----
> > >> Hi Dave, and company,
> > >>
> > >> I get this error trying to open a dump of a large system:
> > >>
> > >> crash: compressed kdump: invalid nr_cpus value: 640
> > >> crash: vmcore: not a supported file format
> > >>
> > >> The message is from diskdump.c:
> > >> if (sizeof(*header) + sizeof(void *) * header->nr_cpus > block_size ||
> > >>     header->nr_cpus <= 0) {
> > >>         error(INFO, "%s: invalid nr_cpus value: %d\n",
> > >>
> > >> block_size is the page size of 4096
> > >> struct disk_dump_header looks like 464 bytes
> > >> void * is 8
> > >> So it looks like 454 is the maximum number of cpus.
> > >> 464 + 454*8 -> 4096
> > >>
> > >> Is this intentional?
> > >> It looks like a restriction that no one ever complained about. But there
> > >> are systems (Altix UV) with 2048 cpu's.
> > >>
> > >> Is there an easy fix?
> > >>
> > >> -Cliff
> > >
> > > To be honest, I don't know, I didn't design or write that code.
> > 
> > Yes, this is intentional for RHEL4/diskdump. In the RHEL4 kernel,
> > disk_dump_header is defined as follows.
> > 
> > struct disk_dump_header {
> > char signature[8]; /* = "DISKDUMP" */
> > (snip)
> > int nr_cpus; /* Number of CPUs */
> > struct task_struct *tasks[NR_CPUS];
> > };
> > 
> > And the maximum number of logical CPUs in RHEL4 is 32 (x86) or
> > 64 (x86_64), so this does not cause any problem.
> > 
> > On the other hand, as you and Dave said, this imposes a limit
> > in RHEL5/RHEL6 kernels. But as far as I know, makedumpfile does not use
> > this "tasks" member, so we can skip the check there.
> > 
> > if (is_diskdump?)
> > if (sizeof(*header) + sizeof(void *) * header->nr_cpus> block_size ||
> > header->nr_cpus<= 0) {
> > error(INFO, "%s: invalid nr_cpus value: %d\n",
> > goto err;
> > }
> > 
> > Something like this. Dave, Ohmichi-san, what do you think?
> > 
> > Thanks,
> > Takao Indoh
> 
> Looking at a couple sample compressed kdumps, it does appear that
> they do not set up the tasks[] array, and that the sub_header still
> starts at the "header->block_size".  That being the case, your
> proposal looks like a nice, simple fix!
> 
> Cliff, can you enclose that piece of code with something like:
> 
>         if (DISKDUMP_VALID()) {
>             if (sizeof(*header) + sizeof(void *) * header->nr_cpus > block_size ||
>                 header->nr_cpus <= 0) {
>                     error(INFO, "%s: invalid nr_cpus value: %d\n",
>                             DISKDUMP_VALID() ? "diskdump" : "compressed kdump",
>                             header->nr_cpus);
>                     goto err;
>             }
>         }
> 
> I suppose it should still check for (header->nr_cpus <= 0), but I'll
> let Ken'ichi confirm that.

I think it is still worth validating the header, even though
makedumpfile of course sets header->nr_cpus itself.

So I think the following code is better:

	if ((DISKDUMP_VALID() &&
	     sizeof(*header) + sizeof(void *) * header->nr_cpus > block_size) ||
	     header->nr_cpus <= 0) {
		error(INFO, "%s: invalid nr_cpus value: %d\n",
			DISKDUMP_VALID() ? "diskdump" : "compressed kdump",
			header->nr_cpus);
		goto err;
	}
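
For illustration, the same condition can be isolated as a standalone
predicate. This is only a sketch and not code from diskdump.c: the
function name nr_cpus_is_valid() and its parameters are hypothetical,
is_diskdump stands in for DISKDUMP_VALID(), and header_size stands in
for sizeof(*header).

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper, not part of crash: returns true when nr_cpus
 * is acceptable for the given dump format.
 */
static bool nr_cpus_is_valid(bool is_diskdump, size_t header_size,
			     size_t block_size, int nr_cpus)
{
	/* A non-positive CPU count is invalid for every format. */
	if (nr_cpus <= 0)
		return false;

	/*
	 * Only the original diskdump format stores tasks[] after the
	 * header, so only it needs header + pointers to fit in one block.
	 */
	if (is_diskdump &&
	    header_size + sizeof(void *) * (size_t)nr_cpus > block_size)
		return false;

	return true;
}
```

With the values from Cliff's report (a 464-byte header and a 4096-byte
block), a compressed kdump with 640 CPUs would pass this check, while
a non-positive nr_cpus would still be rejected for both formats.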

And,
- makedumpfile doesn't use the tasks array, as Indoh-san said.
- I don't think makedumpfile malfunctions with more than 454 CPUs.
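
As a side note, the figure of 454 follows directly from the old size
check. A minimal sketch of the arithmetic, assuming the 464-byte
header, 8-byte pointers and 4096-byte block size quoted earlier in
this thread (max_diskdump_cpus() is a hypothetical name used only
here):

```c
#include <stddef.h>

/*
 * Hypothetical helper: largest nr_cpus for which
 * header_size + ptr_size * nr_cpus <= block_size still holds.
 */
static size_t max_diskdump_cpus(size_t block_size, size_t header_size,
				size_t ptr_size)
{
	/* Guard against division by zero and unsigned underflow. */
	if (ptr_size == 0 || header_size > block_size)
		return 0;

	return (block_size - header_size) / ptr_size;
}
```

max_diskdump_cpus(4096, 464, 8) gives (4096 - 464) / 8 = 454, which
matches the limit Cliff ran into.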


Thanks,
tachibana

> 
> Thanks Takao,
>   Dave
> 
> > 
> > >
> > > And you're right. Dumpfiles with that many cpus are highly
> > > unusual, but looking at the code, it certainly does appear that the
> > > disk_dump_header plus the task pointers for each cpu must fit in a
> > > "block_size", or page size, and that the sub_header is the first item
> > > following the contents of the first page:
> > >
> > > ---
> > >   ^ disk_dump_header
> > >   |     task on cpu 0
> > > page ...
> > >   |     task on cpu x-1
> > >   V task on cpu x
> > > ---
> > >         sub_header
> > >         bitmap
> > >         remaining stuff...
> > >
> > > Since your dump is presumably a compressed kdump, I'm wondering
> > > what the makedumpfile code did in your case? Did it round up the
> > > location of the sub_header to a page boundary?
> > >
> > > I've cc'd this to Ken'ichi Ohmichi (makedumpfile), and to Takao Indoh
> > > (original compressed diskdump) for their input.
> > >
> > > Thanks,
> > >    Dave
> > >
> > >
> 