[Crash-utility] About displaying virtual memory information of exiting task

Dave Anderson anderson at redhat.com
Mon Dec 8 14:28:04 UTC 2014



----- Original Message -----
> On 12/06/2014 04:11 AM, Dave Anderson wrote:
> > Interestingly enough, today I was asked to look at a vmcore in which an oops
> > occurred during task exit after tsk->mm had been NULL'd out in exit_mm():
> 
> That almost matches what I am facing: tsk->mm has been set to NULL, but the
> memory mapping still needs to be displayed. This is a simpler implementation.
> I tried a command syntax like vm [taskp | pid | [-M mm_struct]], but that
> would require modifying a lot of things.
> 
> By the way, I feel the code is becoming more and more complicated; maybe
> some refactoring is needed.

Well, the vm_area_dump() function is relatively stable, so let's not go crazy
here for what's essentially an "experimental" option.

> 
> >
> > Of course it has its limitations.  Since the page tables are being broken down in this case,
> > "vm -p" fails:
> >
> >   crash>  vm -M ffff880495120dc0 -p
> >    PID: 4563   TASK: ffff88049863f500  CPU: 8   COMMAND: "postgres"
> >           MM               PGD          RSS    TOTAL_VM
> >           0                 0            0k       0k
> >          VMA           START       END     FLAGS FILE
> >    ffff8804a085ce90     400000     f56000 8001875 /usr/local/greenplum-db-4.3.3.1/bin/postgres
> >    VIRTUAL     PHYSICAL
> >    vm: invalid kernel virtual address: 50  type: "mm_struct pgd"
> >    crash>
> 
> At a glance, the pgd comes from the task_struct's mm. It would take a lot of work to
> replace it with the -M argument, and I don't think it's worth it right now.

Actually it doesn't take much work at all.  If both tc->mm_struct and tm->mm_struct_addr
are replaced with the supplied address: 

     tc->mm_struct = tm->mm_struct_addr = pc->curcmd_private;

then "vm -M ffff880495120dc0 -p" also works OK with my sample vmcore.
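
As a rough sketch of what that substitution amounts to (the struct definitions
here are cut-down stand-ins rather than crash's real ones, the 64-bit field
type is an assumption, and use_supplied_mm() is a hypothetical name, not a
function in the crash sources):

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins for illustration only; the real crash
 * structures carry many more fields than shown here. */
struct task_context    { uint64_t task; uint64_t mm_struct; };
struct task_mem_usage  { uint64_t mm_struct_addr; };
struct program_context { uint64_t curcmd_private; };

/*
 * Hypothetical helper: point both the task context and the memory-usage
 * record at the mm_struct address supplied with -M (assumed here to have
 * been stashed in pc->curcmd_private), so that later lookups such as the
 * pgd read done for "vm -p" use it instead of the NULL'd tsk->mm.
 */
void use_supplied_mm(struct task_context *tc,
                     struct task_mem_usage *tm,
                     const struct program_context *pc)
{
    assert(tc != NULL && tm != NULL && pc != NULL);
    tc->mm_struct = tm->mm_struct_addr = pc->curcmd_private;
}
```

Both fields have to change together; overriding only one of them would leave
part of the display still reading from the task's own (NULL) mm.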

> >
> > But it does seem like a worthwhile addition.
> >
> > The patch doesn't check whether mm->owner or mm->mm_count are legitimate, but I'm not
> > sure whether it's even worth it?  If it fails, it fails, and the help page should just
> > indicate that the command option is not guaranteed to work.  Does the attached patch work
> > for you?
> 
> It is similar to the core I got, and I modified the patch to add some checks. At the least,
> I think we need to make sure the address still belongs to an mm_struct object.

I suppose you could, although in all probability it's going to stay in the mm_struct
slab cache, and in the worst case will have been re-used by another task.
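
For what it's worth, a check of that kind could look roughly like the sketch
below. Everything in it is hypothetical: struct slab_range and
addr_is_slab_object() are made-up names, and the caller is assumed to have
already collected the slab ranges of the mm_struct cache by some other means.
It only shows the containment and alignment test, and it cannot tell whether
the object has since been re-used by another task.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one slab: where its objects start,
 * how big each object is, and how many objects it holds. */
struct slab_range {
    uint64_t base;
    size_t   objsize;
    size_t   nr_objs;
};

/*
 * Return true if addr is the start of an object in one of the given
 * slabs, i.e. it lies inside a slab and sits on an object boundary.
 */
bool addr_is_slab_object(uint64_t addr,
                         const struct slab_range *slabs, size_t nr_slabs)
{
    for (size_t i = 0; i < nr_slabs; i++) {
        const struct slab_range *s = &slabs[i];

        if (s->objsize == 0 || addr < s->base)
            continue;

        uint64_t off = addr - s->base;
        if (off < (uint64_t)s->objsize * s->nr_objs &&
            off % s->objsize == 0)
            return true;
    }
    return false;
}
```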

Dave




