[Crash-utility] patch to add vmss memory regions support
Dyno (Hongjun) Fu
hfu at vmware.com
Tue Mar 31 16:28:47 UTC 2015
thanks for your patient guidance!
rgds,
Dyno
On 3/31/15 8:07 AM, Dave Anderson wrote:
>
> ----- Original Message -----
>> Dave,
>> patch updated. please review, thanks.
>> * converted all the FATAL to INFO.
>> * the reason to "break" rather than "return FALSE" is that we might still
>> be able to read the "memory" group which is the only relevant checkpoint
>> group we care about.
> Hi Dyno,
>
> This patch looks OK, except for one minor thing.
>
> I see that you removed the two CRASHDEBUG(1) messages in is_vmware_vmss()
> as well. In the case of the failure to fopen() or fread() the passed-in
> filename, it is OK to call error(INFO, ...), but they are essentially both
> no-ops. Given that main() will reject any non-readable file, and several
> earlier dumpfile-checking functions will call error(FATAL, ...) if the
> dumpfile cannot be opened, those messages will never be displayed.
>
> However, in this case:
>
> @@ -47,8 +53,7 @@ is_vmware_vmss(char *filename)
> hdr.id != CPTDUMP_PARTIAL_MAGIC_NUMBER &&
> hdr.id != CPTDUMP_RESTORED_MAGIC_NUMBER &&
> hdr.id != CPTDUMP_NORESTORE_MAGIC_NUMBER) {
> - if (CRASHDEBUG(1))
> - error(INFO, LOGPRX"Unrecognized .vmss file (magic %x).\n", hdr.id);
> + error(INFO, LOGPRX"Unrecognized .vmss file (magic %x).\n", hdr.id);
> return FALSE;
> }
>
> The CRASHDEBUG(1) absolutely should be there. The is_vmware_vmss() function
> is simply there to verify whether the passed-in filename is a .vmss file.
> If it is not -- based upon the header contents -- then it should quietly return
> FALSE. There will be other file formats in the future, and most likely will be placed
> following the is_vmware_vmss() call in main(), and we won't want that irrelevant
> message. So I've restored that particular CRASHDEBUG(1) qualifier.
>
> Aside from that, the patch is queued for crash-7.1.1:
>
> https://urldefense.proofpoint.com/v2/url?u=https-3A__github.com_crash-2Dutility_crash_commit_1ed90b28af6dbb74fd785e36502f16a8f156e3b3&d=AwICaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=g2Vka_25x09RSowRkQw8pA&m=OAtu5GVMTW4RSkslwDnYb94wwRRNEwxSMomEbEnil1M&s=9nX5ubhXrdRFAOTUEhlINNAPP1tYgN3O6d_vhH1MYi4&e=
>
> Thanks,
> Dave
>
>
>> rgds,
>> Dyno
>>
>> On 3/30/15 12:14 PM, Dave Anderson wrote:
>>> ----- Original Message -----
>>>> Dave,
>>>> I've removed the assert now. please find the updated patch in
>>>> attachment.
>>>> thanks.
>>>> rgds,
>>>> Dyno
>>> Hi Dyno,
>>>
>>> I should have combined this actual code review with my last
>>> response re: the assert() calls, but there are just a few minor
>>> things that still need adjustment.
>>>
>>> As I mentioned in an earlier response, during session initialization,
>>> which is before RUNTIME gets set in pc->flags in main_loop(), any
>>> error(FATAL, ...) call will print the error message and the crash
>>> session immediately exits. And that's OK if that's what you want
>>> to happen.
>>>
>>> But in vmware_vmss_init() you changed many of the previous error messages
>>> that used fprintf(vmss.ofp, ...) to use a construct using FATAL error
>>> messages like this:
>>>
>>> + error(FATAL, LOGPRX"%s: %s\n", filename, strerror(errno));
>>> + result = FALSE;
>>> + goto exit;
>>>
>>> But that doesn't make sense since the error message will get printed and
>>> crash will exit immediately, and therefore will not return back here in
>>> memory_sources_init():
>>>
>>> + } else if (pc->flags & VMWARE_VMSS) {
>>> + if (!vmware_vmss_init(pc->dumpfile, fp))
>>> + error(FATAL, "%s: initialization failed\n",
>>> + pc->dumpfile);
>>> + }
>>>
>>> So preferably the calls should be changed to error(INFO, ...) so that the
>>> user
>>> will get the specific message followed by the general "initialization
>>> failed"
>>> message.
>>>
>>> Also, it would make sense to add an error(INFO, ...) message here as you
>>> have done for the other error scenarios:
>>>
>>> + if (fread(&hdr, sizeof(cptdumpheader), 1, fp) != 1) {
>>> + result = FALSE;
>>> + goto exit;
>>> + }
>>>
>>> If that initial fread() of the header fails, then the session will exit
>>> with
>>> just the single somewhat-confusing "initialization failed" message.
>>>
>>> I also have a general question about vmware_vmss_init() which is not
>>> specific to this patch. In the inner-most "for(;;)" loop, it appears
>>> that it only legitimately breaks if it runs into a TAG_NULL:
>>>
>>> if (tag == NULL_TAG)
>>> break;
>>>
>>> But there are a number of fread() failures in the loop that do an
>>> fprintf(ofp, ...) and simply "break". If any of those failures occur,
>>> it looks like vmware_vmss_init() will return success. But should it?
>>>
>>> Anyway, if you can clean up the error message handling, we can get this
>>> patch
>>> checked in.
>>>
>>> Thanks,
>>> Dave
>>>
>>>
>>>> On 3/27/15 11:05 AM, Dave Anderson wrote:
>>>>> Dyno,
>>>>>
>>>>> I've got the sample multi-region dumpfile -- thanks for that.
>>>>>
>>>>> This latest patch tests OK on all three dumpfiles. But can you please
>>>>> remove the remaining assert() calls as I requested in my previous email?
>>>>>
>>>>> Thanks,
>>>>> Dave
>>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>>> Dave,
>>>>>>
>>>>>> An assert i put caused the crash, workstation VM memory has more block
>>>>>> item
>>>>>> than i thought.
>>>>>> fixed it in the patch and tested against all the dump i have. i will try
>>>>>> to
>>>>>> provide a 4GB memory dump later.
>>>>>>
>>>>>> ESX VM 3G Memory
>>>>>> ================
>>>>>> - Group: memory pos=0x1f6f6 size=0xc000090c
>>>>>> ------------------------------------
>>>>>> align_mask[0, 0] => 0x00ffff
>>>>>> regionsCount => 0x000000
>>>>>> Memory[0, 0] => BLOCK, pos=0x20000,
>>>>>> size=0xc0000000
>>>>>>
>>>>>> ESX VM 4G Memory
>>>>>> ================
>>>>>> - Group: memory pos=0x1f6f6 size=0x10000090c
>>>>>> -----------------------------------
>>>>>> align_mask[0, 0] => 0x00ffff
>>>>>> regionsCount => 0x000002
>>>>>> regionPageNum[0] => 0x000000
>>>>>> regionPPN[0] => 0x000000
>>>>>> regionSize[0] => 0x0c0000
>>>>>> regionPageNum[1] => 0x0c0000
>>>>>> regionPPN[1] => 0x100000
>>>>>> regionSize[1] => 0x040000
>>>>>> Memory[0, 0] => BLOCK, pos=0x20000,
>>>>>> size=0x100000000
>>>>>>
>>>>>> WS VM Memory
>>>>>> ============
>>>>>> - Group: memory pos=0x93b1 size=0x10098
>>>>>> ----------------------------------------
>>>>>> align_mask[0, 0] => 0x00ffff
>>>>>> regionsCount => 0x000000
>>>>>> hotSetSize => 0x040000
>>>>>> hotSet => BLOCK, pos=0x9405,
>>>>>> size=0x8000
>>>>>> MainMemPageZeroStateSize => 0x040000
>>>>>> MainMemKnownZero => BLOCK, pos=0x11447,
>>>>>> size=0x8000
>>>>>>
>>>>>> rgds,
>>>>>> Dyno
>>>>>>
>>>>>> On 3/26/15 1:12 PM, Dave Anderson wrote:
>>>>>>> ----- Original Message -----
>>>>>>>> Dave,
>>>>>>>> updated the patch and please review. thanks.
>>>>>>>> - the page_size/page_shift problem.
>>>>>>>> - change type cast to union.
>>>>>>>> - the read_vmware_vmss() regression.
>>>>>>>>
>>>>>>>> rgds,
>>>>>>>> Dyno
>>>>>>> Dyno,
>>>>>>>
>>>>>>> Since you cannot make any additional sample vmss2core-generated
>>>>>>> dumpfiles available to me, I can only test this latest patch
>>>>>>> on the two dumpfile/kernel pairs that you gave me in February:
>>>>>>>
>>>>>>> vmlinux-2.6.32-431.el6 CentOS6.5-11bd56db.vmss
>>>>>>>
>>>>>>> and
>>>>>>>
>>>>>>> vmlinux-3.13.0-39-generic Ubuntu1404_64bit-65993542.vmss
>>>>>>> (with its companion Ubuntu1404_64bit-65993542.vmem file)
>>>>>>>
>>>>>>> The CentOS kernel works OK with your patch:
>>>>>>>
>>>>>>> $ crash vmlinux-2.6.32-431.el6 CentOS6.5-11bd56db.vmss
>>>>>>>
>>>>>>> crash 7.1.1rc13
>>>>>>> Copyright (C) 2002-2014 Red Hat, Inc.
>>>>>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
>>>>>>> Copyright (C) 1999-2006 Hewlett-Packard Co
>>>>>>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
>>>>>>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
>>>>>>> Copyright (C) 2005, 2011 NEC Corporation
>>>>>>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
>>>>>>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
>>>>>>> This program is free software, covered by the GNU General Public
>>>>>>> License,
>>>>>>> and you are welcome to change it and/or distribute copies of it under
>>>>>>> certain conditions. Enter "help copying" to see the conditions.
>>>>>>> This program has absolutely no warranty. Enter "help warranty" for
>>>>>>> details.
>>>>>>>
>>>>>>> GNU gdb (GDB) 7.6
>>>>>>> Copyright (C) 2013 Free Software Foundation, Inc.
>>>>>>> License GPLv3+: GNU GPL version 3 or later
>>>>>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__gnu.org_licenses_gpl.html&d=AwIFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=g2Vka_25x09RSowRkQw8pA&m=iQzEiRx9c9mhk2AN2ZgUquvryNUKAR8a3H2BxGelV-8&s=UTmIfMl3-60oCUaY2i-0rZlb-o78kBVx6f9F90s9r4Q&e=
>>>>>>> >
>>>>>>> This is free software: you are free to change and redistribute it.
>>>>>>> There is NO WARRANTY, to the extent permitted by law. Type "show
>>>>>>> copying"
>>>>>>> and "show warranty" for details.
>>>>>>> This GDB was configured as "x86_64-unknown-linux-gnu"...
>>>>>>>
>>>>>>> KERNEL: vmlinux-2.6.32-431.el6
>>>>>>> DUMPFILE: CentOS6.5-11bd56db.vmss
>>>>>>> CPUS: 4
>>>>>>> DATE: Tue Feb 3 18:22:03 2015
>>>>>>> UPTIME: 00:01:06
>>>>>>> LOAD AVERAGE: 0.71, 0.22, 0.07
>>>>>>> TASKS: 302
>>>>>>> NODENAME: promd-1s-dhcp37.eng.vmware.com
>>>>>>> RELEASE: 2.6.32-431.el6.x86_64
>>>>>>> VERSION: #1 SMP Fri Nov 22 03:15:09 UTC 2013
>>>>>>> MACHINE: x86_64 (2394 Mhz)
>>>>>>> MEMORY: 511.5 MB
>>>>>>> PANIC: ""
>>>>>>> PID: 0
>>>>>>> COMMAND: "swapper"
>>>>>>> TASK: ffffffff81a8d020 (1 of 4) [THREAD_INFO:
>>>>>>> ffffffff81a00000]
>>>>>>> CPU: 0
>>>>>>> STATE: TASK_RUNNING (ACTIVE)
>>>>>>> WARNING: panic task not found
>>>>>>>
>>>>>>> crash>
>>>>>>>
>>>>>>> But the Ubuntu1404_64bit-65993542.vmss fails miserably:
>>>>>>>
>>>>>>> $ crash Ubuntu1404_64bit-65993542.vmss vmlinux-3.13.0-39-generic
>>>>>>>
>>>>>>> crash 7.1.1rc13
>>>>>>> Copyright (C) 2002-2014 Red Hat, Inc.
>>>>>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
>>>>>>> Copyright (C) 1999-2006 Hewlett-Packard Co
>>>>>>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
>>>>>>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
>>>>>>> Copyright (C) 2005, 2011 NEC Corporation
>>>>>>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
>>>>>>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
>>>>>>> This program is free software, covered by the GNU General Public
>>>>>>> License,
>>>>>>> and you are welcome to change it and/or distribute copies of it under
>>>>>>> certain conditions. Enter "help copying" to see the conditions.
>>>>>>> This program has absolutely no warranty. Enter "help warranty" for
>>>>>>> details.
>>>>>>>
>>>>>>> crash: vmware_vmss.c:169: vmware_vmss_init: Assertion `__extension__
>>>>>>> ({
>>>>>>> size_t __s1_len, __s2_len; (__builtin_constant_p (name) &&
>>>>>>> __builtin_constant_p ("Memory") && (__s1_len = strlen (name),
>>>>>>> __s2_len
>>>>>>> =
>>>>>>> strlen ("Memory"), (!((size_t)(const void *)((name) + 1) -
>>>>>>> (size_t)(const void *)(name) == 1) || __s1_len >= 4) &&
>>>>>>> (!((size_t)(const void *)(("Memory") + 1) - (size_t)(const void
>>>>>>> *)("Memory") == 1) || __s2_len >= 4)) ? __builtin_strcmp (name,
>>>>>>> "Memory") : (__builtin_constant_p (name) && ((size_t)(const void
>>>>>>> *)((name) + 1) - (size_t)(const void *)(name) == 1) && (__s1_len =
>>>>>>> strlen (name), __s1_len < 4) ? (__builtin_constant_p ("Memory") &&
>>>>>>> ((size_t)(const void *)(("Memory") + 1) - (size_t)(const void
>>>>>>> *)("Memory") == 1) ? __builtin_strcmp (name, "Memory") :
>>>>>>> (__extension__
>>>>>>> ({ __const unsigned char *__s2 = (__const unsigned char *) (__const
>>>>>>> char
>>>>>>> *) ("Memory"); register int __result = (((__const unsigned char *)
>>>>>>> (__const char *) (name))[0] - __s2[0]); if (__s1_len > 0 && __result
>>>>>>> ==
>>>>>>> 0)
>>>>>> { __result = (((__const unsigned char *) (__const char *) (name))[1] -
>>>>>> __s2[1]); if (__s1_len > 1 && __result == 0) { __result = (((__const
>>>>>> unsigned char *) (__const char *) (name))[2] - __s2[2]); if (__s1_len >
>>>>>> 2
>>>>>> &&
>>>>>> __result == 0) __result = (((__const unsigned char *) (__const char *)
>>>>>> (name))[3] - __s2[3]); } } __result; }))) : (__builtin_constant_p
>>>>>> ("Memory")
>>>>>> && ((size_t)(const void *)(("Memory") + 1) - (size_t)(const void
>>>>>> *)("Memory") == 1) && (__s2_len = strlen ("Memory"), __s2_len < 4) ?
>>>>>> (__builtin_constant_p (name) && ((size_t)(const void *)((name) + 1) -
>>>>>> (size_t)(const void *)(name) == 1) ? __builtin_strcmp (name, "Memory") :
>>>>>> (__extension__ ({ __const unsigned char *__s1 = (__const unsigned char
>>>>>> *)
>>>>>> (__const char *) (name); register int __result = __s1[0] - ((__const
>>>>>> unsigned char *) (__const char *) ("Memory"))[0]; if (__s2_len > 0 &&
>>>>>> __result == 0) { __result = (__s1[1] - ((__const unsigned char *)
>>>>>> (__const
>>>>>> char *) ("Memory"))[1]); if (__s2_len > 1 && __result == 0) { __resul
>>>>>> t = (__s1[2] - ((__const unsigned char *) (__const char *)
>>>>>> ("Memory"))[2]);
>>>>>> if (__s2_len > 2 && __result == 0) __result = (__s1[3] - ((__const
>>>>>> unsigned
>>>>>> char *) (__const char *) ("Memory"))[3]); } } __result; }))) :
>>>>>> __builtin_strcmp (name, "Memory")))); }) == 0' failed.
>>>>>>> Aborted (core dumped)
>>>>>>> $
>>>>>>>
>>>>>>> Note that crash-7.1.0 works OK:
>>>>>>>
>>>>>>> $ /usr/bin/crash Ubuntu1404_64bit-65993542.vmss
>>>>>>> vmlinux-3.13.0-39-generic
>>>>>>>
>>>>>>> crash 7.1.0
>>>>>>> Copyright (C) 2002-2014 Red Hat, Inc.
>>>>>>> Copyright (C) 2004, 2005, 2006, 2010 IBM Corporation
>>>>>>> Copyright (C) 1999-2006 Hewlett-Packard Co
>>>>>>> Copyright (C) 2005, 2006, 2011, 2012 Fujitsu Limited
>>>>>>> Copyright (C) 2006, 2007 VA Linux Systems Japan K.K.
>>>>>>> Copyright (C) 2005, 2011 NEC Corporation
>>>>>>> Copyright (C) 1999, 2002, 2007 Silicon Graphics, Inc.
>>>>>>> Copyright (C) 1999, 2000, 2001, 2002 Mission Critical Linux, Inc.
>>>>>>> This program is free software, covered by the GNU General Public
>>>>>>> License,
>>>>>>> and you are welcome to change it and/or distribute copies of it under
>>>>>>> certain conditions. Enter "help copying" to see the conditions.
>>>>>>> This program has absolutely no warranty. Enter "help warranty" for
>>>>>>> details.
>>>>>>>
>>>>>>> vmw: Memory dump is not part of this vmss file.
>>>>>>> vmw: Try to locate the companion vmem file ...
>>>>>>> vmw: vmem file: Ubuntu1404_64bit-65993542.vmem
>>>>>>>
>>>>>>> GNU gdb (GDB) 7.6
>>>>>>> Copyright (C) 2013 Free Software Foundation, Inc.
>>>>>>> License GPLv3+: GNU GPL version 3 or later
>>>>>>> <https://urldefense.proofpoint.com/v2/url?u=http-3A__gnu.org_licenses_gpl.html&d=AwIFaQ&c=Sqcl0Ez6M0X8aeM67LKIiDJAXVeAw-YihVMNtXt-uEs&r=g2Vka_25x09RSowRkQw8pA&m=iQzEiRx9c9mhk2AN2ZgUquvryNUKAR8a3H2BxGelV-8&s=UTmIfMl3-60oCUaY2i-0rZlb-o78kBVx6f9F90s9r4Q&e=
>>>>>>> >
>>>>>>> This is free software: you are free to change and redistribute it.
>>>>>>> There is NO WARRANTY, to the extent permitted by law. Type "show
>>>>>>> copying"
>>>>>>> and "show warranty" for details.
>>>>>>> This GDB was configured as "x86_64-unknown-linux-gnu"...
>>>>>>>
>>>>>>> KERNEL: vmlinux-3.13.0-39-generic
>>>>>>> DUMPFILE: Ubuntu1404_64bit-65993542.vmss
>>>>>>> CPUS: 1
>>>>>>> DATE: Thu Nov 13 14:10:53 2014
>>>>>>> UPTIME: 2 days, 03:40:33
>>>>>>> LOAD AVERAGE: 0.00, 0.01, 0.05
>>>>>>> TASKS: 669
>>>>>>> NODENAME: ubuntu
>>>>>>> RELEASE: 3.13.0-39-generic
>>>>>>> VERSION: #66-Ubuntu SMP Tue Oct 28 13:30:27 UTC 2014
>>>>>>> MACHINE: x86_64 (2693 Mhz)
>>>>>>> MEMORY: 1 GB
>>>>>>> PANIC: ""
>>>>>>> PID: 0
>>>>>>> COMMAND: "swapper/0"
>>>>>>> TASK: ffffffff81c15480 [THREAD_INFO: ffffffff81c00000]
>>>>>>> CPU: 0
>>>>>>> STATE: TASK_RUNNING
>>>>>>> WARNING: panic task not found
>>>>>>>
>>>>>>> crash>
>>>>>>>
>>>>>>> Anyway, besides fixing whatever the problem is, please remove the
>>>>>>> assert()
>>>>>>> calls entirely. They did not exist in the original vmware_vmss.c file,
>>>>>>> and shouldn't be added now.
>>>>>>>
>>>>>>> assert() is not used by the top-level crash sources (except for qemu.c
>>>>>>> and qemu-load.c, which came from a 3rd party, and I didn't feel like
>>>>>>> changing
>>>>>>> all of them). Instead, please use the crash convention by testing for
>>>>>>> the
>>>>>>> anomoly, and then send an appropriate error message to error(FATAL,
>>>>>>> ...),
>>>>>>> which will kill the crash session if it's during session
>>>>>>> initialization,
>>>>>>> or kill the current command if it's during crash runtime.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Dave
>>>>>>>
>>
More information about the Crash-utility
mailing list