[Crash-utility] patch to add vmss memory regions support

Dave Anderson anderson at redhat.com
Tue Mar 31 15:07:09 UTC 2015



----- Original Message -----
> Dave,
>   patch updated. please review, thanks.
>   * converted all the FATAL to INFO.
>   * the reason to "break" rather than "return FALSE" is that we might still
>     be able to read the "memory" group which is the only relevant checkpoint
>     group we care about.

Hi Dyno,

This patch looks OK, except for one minor thing.

I see that you removed the two CRASHDEBUG(1) messages in is_vmware_vmss()
as well.  In the case of the failure to fopen() or fread() the passed-in
filename, it is OK to call error(INFO, ...), but they are essentially both
no-ops.  Given that main() will reject any non-readable file, and several
earlier dumpfile-checking functions will call error(FATAL, ...) if the 
dumpfile cannot be opened, those messages will never be displayed.

However, in this case:

@@ -47,8 +53,7 @@ is_vmware_vmss(char *filename)
            hdr.id != CPTDUMP_PARTIAL_MAGIC_NUMBER &&
            hdr.id != CPTDUMP_RESTORED_MAGIC_NUMBER &&
            hdr.id != CPTDUMP_NORESTORE_MAGIC_NUMBER) {
-               if (CRASHDEBUG(1))
-                       error(INFO, LOGPRX"Unrecognized .vmss file (magic %x).\n", hdr.id);
+               error(INFO, LOGPRX"Unrecognized .vmss file (magic %x).\n", hdr.id);
                return FALSE;
        }

The CRASHDEBUG(1) absolutely should be there.  The is_vmware_vmss() function
is simply there to verify whether the passed-in filename is a .vmss file.
If it is not -- based upon the header contents -- then it should quietly return
FALSE.  There will be other file formats in the future, and most likely will be placed
following the is_vmware_vmss() call in main(), and we won't want that irrelevant
message.  So I've restored that particular CRASHDEBUG(1) qualifier.

Aside from that, the patch is queued for crash-7.1.1:

  https://github.com/crash-utility/crash/commit/1ed90b28af6dbb74fd785e36502f16a8f156e3b3

Thanks,
  Dave

  
> 
> rgds,
> Dyno
> 
> On 3/30/15 12:14 PM, Dave Anderson wrote:
> >
> > ----- Original Message -----
> >> Dave,
> >>    I've removed the assert now. please find the updated patch in
> >>    attachment.
> >>    thanks.
> >> rgds,
> >> Dyno
> > Hi Dyno,
> >
> > I should have combined this actual code review with my last
> > response re: the assert() calls, but there are just a few minor
> > things that still need adjustment.
> >
> > As I mentioned in an earlier response, during session initialization,
> > which is before RUNTIME gets set in pc->flags in main_loop(), any
> > error(FATAL, ...) call will print the error message and the crash
> > session immediately exits.  And that's OK if that's what you want
> > to happen.
> >
> > But in vmware_vmss_init() you changed many of the previous error messages
> > that used fprintf(vmss.ofp, ...) to use a construct using FATAL error
> > messages like this:
> >
> > +               error(FATAL, LOGPRX"%s: %s\n", filename, strerror(errno));
> > +               result = FALSE;
> > +               goto exit;
> >
> > But that doesn't make sense since the error message will get printed and
> > crash will exit immediately, and therefore will not return back here in
> > memory_sources_init():
> >
> > +               } else if (pc->flags & VMWARE_VMSS) {
> > +                       if (!vmware_vmss_init(pc->dumpfile, fp))
> > +                               error(FATAL, "%s: initialization failed\n",
> > +                                       pc->dumpfile);
> > +               }
> >
> > So preferably the calls should be changed to error(INFO, ...) so that the
> > user
> > will get the specific message followed by the general "initialization
> > failed"
> > message.
> >
> > Also, it would make sense to add an error(INFO, ...) message here as you
> > have done for the other error scenarios:
> >
> > +       if (fread(&hdr, sizeof(cptdumpheader), 1, fp) != 1) {
> > +               result = FALSE;
> > +               goto exit;
> > +       }
> >
> > If that initial fread() of the header fails, then the session will exit
> > with
> > just the single somewhat-confusing "initialization failed" message.
> >
> > I also have a general question about vmware_vmss_init() which is not
> > specific to this patch.  In the inner-most "for(;;)" loop, it appears
> > that it only legitimately breaks if it runs into a TAG_NULL:
> >
> >                        if (tag == NULL_TAG)
> >                                 break;
> >
> > But there are a number of fread() failures in the loop that do an
> > fprintf(ofp, ...) and simply "break".  If any of those failures occur,
> > it looks like vmware_vmss_init() will return success.  But should it?
> >
> > Anyway, if you can clean up the error message handling, we can get this
> > patch
> > checked in.
> >
> > Thanks,
> >   Dave
> >
> >
> >> On 3/27/15 11:05 AM, Dave Anderson wrote:
> >>> Dyno,
> >>>
> >>> I've got the sample multi-region dumpfile -- thanks for that.
> >>>
> >>> This latest patch tests OK on all three dumpfiles.  But can you please
> >>> remove the remaining assert() calls as I requested in my previous email?
> >>>
> >>> Thanks,
> >>>   Dave
> >>>
> >>>
> >>> ----- Original Message -----
> >>>> Dave,
> >>>>
> >>>>   An assert i put caused the crash, workstation VM memory has more block
> >>>>   item
> >>>>   than i thought.
> >>>> fixed it in the patch and tested against all the dump i have. i will try
> >>>> to
> >>>> provide a 4GB memory dump later.
> >>>>
> >>>> ESX VM 3G Memory
> >>>> ================
> >>>> - Group: memory pos=0x1f6f6 size=0xc000090c
> >>>> ------------------------------------
> >>>> align_mask[0, 0]                         => 0x00ffff
> >>>> regionsCount                             => 0x000000
> >>>> Memory[0, 0]                             => BLOCK, pos=0x20000,
> >>>> size=0xc0000000
> >>>>
> >>>> ESX VM 4G Memory
> >>>> ================
> >>>> - Group: memory pos=0x1f6f6 size=0x10000090c
> >>>> -----------------------------------
> >>>> align_mask[0, 0]                         => 0x00ffff
> >>>> regionsCount                             => 0x000002
> >>>> regionPageNum[0]                         => 0x000000
> >>>> regionPPN[0]                             => 0x000000
> >>>> regionSize[0]                            => 0x0c0000
> >>>> regionPageNum[1]                         => 0x0c0000
> >>>> regionPPN[1]                             => 0x100000
> >>>> regionSize[1]                            => 0x040000
> >>>> Memory[0, 0]                             => BLOCK, pos=0x20000,
> >>>> size=0x100000000
> >>>>
> >>>> WS VM Memory
> >>>> ============
> >>>> - Group: memory pos=0x93b1 size=0x10098
> >>>> ----------------------------------------
> >>>> align_mask[0, 0]                         => 0x00ffff
> >>>> regionsCount                             => 0x000000
> >>>> hotSetSize                               => 0x040000
> >>>> hotSet                                   => BLOCK, pos=0x9405,
> >>>> size=0x8000
> >>>> MainMemPageZeroStateSize                 => 0x040000
> >>>> MainMemKnownZero                         => BLOCK, pos=0x11447,
> >>>> size=0x8000
> >>>>
> >>>> rgds,
> >>>> Dyno
> >>>>
> >>>> On 3/26/15 1:12 PM, Dave Anderson wrote:
> >>>>> ----- Original Message -----
> >>>>>> Dave,
> >>>>>>   updated the patch and please review. thanks.
> >>>>>>   - the page_size/page_shift problem.
> >>>>>>   - change type cast to union.
> >>>>>>   - the read_vmware_vmss() regression.
> >>>>>>
> >>>>>> rgds,
> >>>>>> Dyno
> >>>>> Dyno,
> >>>>>
> >>>>> Since you cannot make any additional sample vmss2core-generated
> >>>>> dumpfiles available to me, I can only test this latest patch
> >>>>> on the two dumpfile/kernel pairs that you gave me in February:
> >>>>>   
> >>>>>   vmlinux-2.6.32-431.el6 CentOS6.5-11bd56db.vmss
> >>>>>
> >>>>> and
> >>>>>
> >>>>>   vmlinux-3.13.0-39-generic Ubuntu1404_64bit-65993542.vmss
> >>>>>   (with its companion Ubuntu1404_64bit-65993542.vmem file)
> >>>>>
> >>>>> The CentOS kernel works OK with your patch:
> >>>>>   
> >>>>>   $ crash vmlinux-2.6.32-431.el6 CentOS6.5-11bd56db.vmss
> >>>>>   
> >>>>>   crash 7.1.1rc13
> >>>>>   Copyright (C) 2002-2014  Red Hat, Inc.
> >>>>>   Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
> >>>>>   Copyright (C) 1999-2006  Hewlett-Packard Co
> >>>>>   Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
> >>>>>   Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> >>>>>   Copyright (C) 2005, 2011  NEC Corporation
> >>>>>   Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> >>>>>   Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> >>>>>   This program is free software, covered by the GNU General Public
> >>>>>   License,
> >>>>>   and you are welcome to change it and/or distribute copies of it under
> >>>>>   certain conditions.  Enter "help copying" to see the conditions.
> >>>>>   This program has absolutely no warranty.  Enter "help warranty" for
> >>>>>   details.
> >>>>>    
> >>>>>   GNU gdb (GDB) 7.6
> >>>>>   Copyright (C) 2013 Free Software Foundation, Inc.
> >>>>>   License GPLv3+: GNU GPL version 3 or later
> >>>>>   <https://urldefense.proofpoint.com/v2/url?u=http-3A__gnu.org_licenses_gpl.html&d=AwIFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=g2Vka_25x09RSowRkQw8pA&m=iQzEiRx9c9mhk2AN2ZgUquvryNUKAR8a3H2BxGelV-8&s=UTmIfMl3-60oCUaY2i-0rZlb-o78kBVx6f9F90s9r4Q&e=
> >>>>>   >
> >>>>>   This is free software: you are free to change and redistribute it.
> >>>>>   There is NO WARRANTY, to the extent permitted by law.  Type "show
> >>>>>   copying"
> >>>>>   and "show warranty" for details.
> >>>>>   This GDB was configured as "x86_64-unknown-linux-gnu"...
> >>>>>   
> >>>>>         KERNEL: vmlinux-2.6.32-431.el6
> >>>>>       DUMPFILE: CentOS6.5-11bd56db.vmss
> >>>>>           CPUS: 4
> >>>>>           DATE: Tue Feb  3 18:22:03 2015
> >>>>>         UPTIME: 00:01:06
> >>>>>   LOAD AVERAGE: 0.71, 0.22, 0.07
> >>>>>          TASKS: 302
> >>>>>       NODENAME: promd-1s-dhcp37.eng.vmware.com
> >>>>>        RELEASE: 2.6.32-431.el6.x86_64
> >>>>>        VERSION: #1 SMP Fri Nov 22 03:15:09 UTC 2013
> >>>>>        MACHINE: x86_64  (2394 Mhz)
> >>>>>         MEMORY: 511.5 MB
> >>>>>          PANIC: ""
> >>>>>            PID: 0
> >>>>>        COMMAND: "swapper"
> >>>>>           TASK: ffffffff81a8d020  (1 of 4)  [THREAD_INFO:
> >>>>>           ffffffff81a00000]
> >>>>>            CPU: 0
> >>>>>          STATE: TASK_RUNNING (ACTIVE)
> >>>>>        WARNING: panic task not found
> >>>>>   
> >>>>>   crash>
> >>>>>   
> >>>>> But the Ubuntu1404_64bit-65993542.vmss fails miserably:
> >>>>>
> >>>>>   $ crash Ubuntu1404_64bit-65993542.vmss vmlinux-3.13.0-39-generic
> >>>>>   
> >>>>>   crash 7.1.1rc13
> >>>>>   Copyright (C) 2002-2014  Red Hat, Inc.
> >>>>>   Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
> >>>>>   Copyright (C) 1999-2006  Hewlett-Packard Co
> >>>>>   Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
> >>>>>   Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> >>>>>   Copyright (C) 2005, 2011  NEC Corporation
> >>>>>   Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> >>>>>   Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> >>>>>   This program is free software, covered by the GNU General Public
> >>>>>   License,
> >>>>>   and you are welcome to change it and/or distribute copies of it under
> >>>>>   certain conditions.  Enter "help copying" to see the conditions.
> >>>>>   This program has absolutely no warranty.  Enter "help warranty" for
> >>>>>   details.
> >>>>>    
> >>>>>   crash: vmware_vmss.c:169: vmware_vmss_init: Assertion `__extension__
> >>>>>   ({
> >>>>>   size_t __s1_len, __s2_len; (__builtin_constant_p (name) &&
> >>>>>   __builtin_constant_p ("Memory") && (__s1_len = strlen (name),
> >>>>>   __s2_len
> >>>>>   =
> >>>>>   strlen ("Memory"), (!((size_t)(const void *)((name) + 1) -
> >>>>>   (size_t)(const void *)(name) == 1) || __s1_len >= 4) &&
> >>>>>   (!((size_t)(const void *)(("Memory") + 1) - (size_t)(const void
> >>>>>   *)("Memory") == 1) || __s2_len >= 4)) ? __builtin_strcmp (name,
> >>>>>   "Memory") : (__builtin_constant_p (name) && ((size_t)(const void
> >>>>>   *)((name) + 1) - (size_t)(const void *)(name) == 1) && (__s1_len =
> >>>>>   strlen (name), __s1_len < 4) ? (__builtin_constant_p ("Memory") &&
> >>>>>   ((size_t)(const void *)(("Memory") + 1) - (size_t)(const void
> >>>>>   *)("Memory") == 1) ? __builtin_strcmp (name, "Memory") :
> >>>>>   (__extension__
> >>>>>   ({ __const unsigned char *__s2 = (__const unsigned char *) (__const
> >>>>>   char
> >>>>>   *) ("Memory"); register int __result = (((__const unsigned char *)
> >>>>>   (__const char *) (name))[0] - __s2[0]); if (__s1_len > 0 && __result
> >>>>>   ==
> >>>>>   0)
> >>>> { __result = (((__const unsigned char *) (__const char *) (name))[1] -
> >>>> __s2[1]); if (__s1_len > 1 && __result == 0) { __result = (((__const
> >>>> unsigned char *) (__const char *) (name))[2] - __s2[2]); if (__s1_len >
> >>>> 2
> >>>> &&
> >>>> __result == 0) __result = (((__const unsigned char *) (__const char *)
> >>>> (name))[3] - __s2[3]); } } __result; }))) : (__builtin_constant_p
> >>>> ("Memory")
> >>>> && ((size_t)(const void *)(("Memory") + 1) - (size_t)(const void
> >>>> *)("Memory") == 1) && (__s2_len = strlen ("Memory"), __s2_len < 4) ?
> >>>> (__builtin_constant_p (name) && ((size_t)(const void *)((name) + 1) -
> >>>> (size_t)(const void *)(name) == 1) ? __builtin_strcmp (name, "Memory") :
> >>>> (__extension__ ({ __const unsigned char *__s1 = (__const unsigned char
> >>>> *)
> >>>> (__const char *) (name); register int __result = __s1[0] - ((__const
> >>>> unsigned char *) (__const char *) ("Memory"))[0]; if (__s2_len > 0 &&
> >>>> __result == 0) { __result = (__s1[1] - ((__const unsigned char *)
> >>>> (__const
> >>>> char *) ("Memory"))[1]); if (__s2_len > 1 && __result == 0) { __resul
> >>>> t = (__s1[2] - ((__const unsigned char *) (__const char *)
> >>>> ("Memory"))[2]);
> >>>> if (__s2_len > 2 && __result == 0) __result = (__s1[3] - ((__const
> >>>> unsigned
> >>>> char *) (__const char *) ("Memory"))[3]); } } __result; }))) :
> >>>> __builtin_strcmp (name, "Memory")))); }) == 0' failed.
> >>>>>   Aborted (core dumped)
> >>>>>   $
> >>>>>   
> >>>>> Note that crash-7.1.0 works OK:
> >>>>>   
> >>>>>   $ /usr/bin/crash Ubuntu1404_64bit-65993542.vmss
> >>>>>   vmlinux-3.13.0-39-generic
> >>>>>   
> >>>>>   crash 7.1.0
> >>>>>   Copyright (C) 2002-2014  Red Hat, Inc.
> >>>>>   Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation
> >>>>>   Copyright (C) 1999-2006  Hewlett-Packard Co
> >>>>>   Copyright (C) 2005, 2006, 2011, 2012  Fujitsu Limited
> >>>>>   Copyright (C) 2006, 2007  VA Linux Systems Japan K.K.
> >>>>>   Copyright (C) 2005, 2011  NEC Corporation
> >>>>>   Copyright (C) 1999, 2002, 2007  Silicon Graphics, Inc.
> >>>>>   Copyright (C) 1999, 2000, 2001, 2002  Mission Critical Linux, Inc.
> >>>>>   This program is free software, covered by the GNU General Public
> >>>>>   License,
> >>>>>   and you are welcome to change it and/or distribute copies of it under
> >>>>>   certain conditions.  Enter "help copying" to see the conditions.
> >>>>>   This program has absolutely no warranty.  Enter "help warranty" for
> >>>>>   details.
> >>>>>    
> >>>>>   vmw: Memory dump is not part of this vmss file.
> >>>>>   vmw: Try to locate the companion vmem file ...
> >>>>>   vmw: vmem file: Ubuntu1404_64bit-65993542.vmem
> >>>>>   
> >>>>>   GNU gdb (GDB) 7.6
> >>>>>   Copyright (C) 2013 Free Software Foundation, Inc.
> >>>>>   License GPLv3+: GNU GPL version 3 or later
> >>>>>   <https://urldefense.proofpoint.com/v2/url?u=http-3A__gnu.org_licenses_gpl.html&d=AwIFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=g2Vka_25x09RSowRkQw8pA&m=iQzEiRx9c9mhk2AN2ZgUquvryNUKAR8a3H2BxGelV-8&s=UTmIfMl3-60oCUaY2i-0rZlb-o78kBVx6f9F90s9r4Q&e=
> >>>>>   >
> >>>>>   This is free software: you are free to change and redistribute it.
> >>>>>   There is NO WARRANTY, to the extent permitted by law.  Type "show
> >>>>>   copying"
> >>>>>   and "show warranty" for details.
> >>>>>   This GDB was configured as "x86_64-unknown-linux-gnu"...
> >>>>>   
> >>>>>         KERNEL: vmlinux-3.13.0-39-generic
> >>>>>       DUMPFILE: Ubuntu1404_64bit-65993542.vmss
> >>>>>           CPUS: 1
> >>>>>           DATE: Thu Nov 13 14:10:53 2014
> >>>>>         UPTIME: 2 days, 03:40:33
> >>>>>   LOAD AVERAGE: 0.00, 0.01, 0.05
> >>>>>          TASKS: 669
> >>>>>       NODENAME: ubuntu
> >>>>>        RELEASE: 3.13.0-39-generic
> >>>>>        VERSION: #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC 2014
> >>>>>        MACHINE: x86_64  (2693 Mhz)
> >>>>>         MEMORY: 1 GB
> >>>>>          PANIC: ""
> >>>>>            PID: 0
> >>>>>        COMMAND: "swapper/0"
> >>>>>           TASK: ffffffff81c15480  [THREAD_INFO: ffffffff81c00000]
> >>>>>            CPU: 0
> >>>>>          STATE: TASK_RUNNING
> >>>>>        WARNING: panic task not found
> >>>>>   
> >>>>>   crash>
> >>>>>   
> >>>>> Anyway, besides fixing whatever the problem is, please remove the
> >>>>> assert()
> >>>>> calls entirely.  They did not exist in the original vmware_vmss.c file,
> >>>>> and shouldn't be added now.
> >>>>>
> >>>>> assert() is not used by the top-level crash sources (except for qemu.c
> >>>>> and qemu-load.c, which came from a 3rd party, and I didn't feel like
> >>>>> changing
> >>>>> all of them).  Instead, please use the crash convention by testing for
> >>>>> the
> >>>>> anomoly, and then send an appropriate error message to error(FATAL,
> >>>>> ...),
> >>>>> which will kill the crash session if it's during session
> >>>>> initialization,
> >>>>> or kill the current command if it's during crash runtime.
> >>>>>
> >>>>> Thanks,
> >>>>>   Dave
> >>>>>
> >>
> 
> 




More information about the Crash-utility mailing list