[Crash-utility] [ANNOUNCE] crash version 7.1.6 is available

Dave Anderson anderson at redhat.com
Thu Oct 13 19:01:22 UTC 2016


Download from: http://people.redhat.com/anderson
                 or
               https://github.com/crash-utility/crash/releases

The github master branch serves as a development branch that will contain 
all patches that are queued for the next release:

  $ git clone git://github.com/crash-utility/crash.git


Changelog:
  
 - Introduction of support for "live" ramdump files, such as those that
   are specified by the QEMU mem-path argument of a memory-backend-file 
   object.  This allows the running of a live crash session against a
   QEMU guest from the host machine.  In this example, the /tmp/MEM file
   on a QEMU host represents the guest's physical memory:

     $ qemu-kvm ...other-options... \
     -object memory-backend-file,id=MEM,size=128m,mem-path=/tmp/MEM,share=on \
     -numa node,memdev=MEM -m 128

   and a live session can be run against the guest kernel like so:

     $ crash <path-to-guest-vmlinux> live:/tmp/MEM@0

   By prepending the ramdump image name with "live:", the crash session will
   act as if it were running a normal live session.
   (oleg at redhat.com)

 - Fix for the support of ELF vmcores created by the KVM "virsh dump 
   --memory-only" facility if the guest kernel was not configured with 
   CONFIG_KEXEC, or CONFIG_KEXEC_CORE in Linux 4.3 and later kernels.
   Without the patch, the crash session fails during initialization with
   the message "crash: cannot resolve kexec_crash_image".
   (hirofumi at mail.parknet.co.jp)

 - Added support for x86_64 ramdump files.  Without the patch, the crash
   session fails immediately with the message "ramdump: unsupported 
   machine type: X86_64".
   (anderson at redhat.com)

 - Fix for a "[-Werror=misleading-indentation]" compiler warning that
   is generated by gdb-7.6/bfd/elf64-s390.c when building S390X in a
   Fedora Rawhide environment with gcc-6.0.0.
   (anderson at redhat.com)

 - Recognize and parse the new QEMU_VM_CONFIGURATION and QEMU_VM_FOOTER
   sections used for live migration of KVM guests, which are seen in
   the "kvmdump" format generated if "virsh dump" is used without the 
   "--memory-only" option.
   (pagupta at redhat.com)

 - Fix for Linux commit edf14cdbf9a0e5ab52698ca66d07a76ade0d5c46, which
   has appended a NULL entry as the final member of the pageflag_names[] 
   array.  Without the patch, a message that indicates "crash: failed to
   read pageflag_names entry" is displayed during session initialization
   in Linux 4.6 kernels.
   (andrej.skvortzov at gmail.com)
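
   As a rough illustration of the idea (not the actual crash source), a
   reader of a name table of this kind has to stop at the terminating
   entry instead of trying to read its NULL name pointer.  The Python
   sketch below models the table as (mask, name) pairs; the function
   name and the sample flag values are hypothetical:

```python
def read_flag_names(entries):
    """Collect (mask, name) pairs, stopping at a NULL-name terminator."""
    names = []
    for mask, name in entries:
        if name is None:        # terminating entry: nothing to read here
            break
        names.append((mask, name))
    return names


# Hypothetical table contents; real masks depend on the kernel build.
table = [(0x1, "locked"), (0x2, "error"), (0, None)]
flag_names = read_flag_names(table)
```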

 - Fix for Linux commit 0139aa7b7fa12ceef095d99dc36606a5b10ab83a, which
   renamed the page._count member to page._refcount.  Without the patch,
   certain "kmem" commands fail with the message "kmem: invalid structure
   member offset: page_count".
   (anderson at redhat.com)

 - Fix for an ARM64 crash-7.1.5 "bt" regression for a task that has
   called panic().  Without the patch, the backtrace may fail with a
   message such as "bt: WARNING: corrupt prstatus? pstate=0x20000000, 
   but no user frame found" followed by "bt: WARNING: cannot determine 
   starting stack frame for task <address>".  The pstate register
   warning will still be displayed (as it is essentially a kdump bug),
   but the backtrace will proceed normally.
   (anderson at redhat.com)

 - Fix for the ARM64 "bt" command in Linux 4.5 and later kernels which
   use per-cpu IRQ stacks.  Without the patch, if an active non-crashing 
   task was running in user space when it received the shutdown IPI from
   the crashing task, the "--- <IRQ stack> ---" transition marker from 
   the IRQ stack to the process stack is not displayed, and a message
   indicating "bt: WARNING: arm64_unwind_frame: on IRQ stack: oriq_sp: 
   <address> fp: 0 (?)" gets displayed.
   (anderson at redhat.com)

 - Fix for the ARM64 "bt" command in Linux 4.5 and later kernels which
   are not configured with CONFIG_FUNCTION_GRAPH_TRACER.  Without the
   patch, backtraces that originate from a per-cpu IRQ stack will dump 
   an invalid exception frame before transitioning to the process stack.
   (anderson at redhat.com)

 - Introduction of ARM64 support for 4K pages with 4-level page tables
   and 48 VA bits.
   (takahiro.akashi at linaro.org)

 - Implemented support for the redesigned ARM64 kernel virtual memory 
   layout and associated KASLR support that was introduced in Linux 4.6.
   The kernel text and static data have been moved from unity-mapped 
   memory into the vmalloc region, and its start address can be 
   randomized if CONFIG_RANDOMIZE_BASE is configured.  Related support
   is being put into the kernel's kdump code, the kexec-tools package, 
   and makedumpfile(8); with that in place, the analysis of Linux 4.6 
   ARM64 dumpfiles with or without KASLR enabled should work normally
   by entering "crash vmlinux vmcore".  On live systems, Linux 4.6 ARM64
   kernels will only work automatically if CONFIG_RANDOMIZE_BASE is not
   configured.  Unfortunately, if CONFIG_RANDOMIZE_BASE is configured 
   on a live system, two --machdep command line arguments are required,
   at least for the time being.  The arguments are:

     --machdep phys_offset=<base physical address>
     --machdep kimage_voffset=<kernel kimage_voffset value>

   Without the patch, any attempt to analyze a Linux 4.6 ARM64 kernel
   fails during initialization with a stream of "read error" messages 
   followed by "crash: vmlinux and vmcore do not match!".
   (takahiro.akashi at linaro.org)
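
   To show why both values are needed, here is a minimal Python sketch
   of kernel virtual-to-physical translation under the new layout.  It
   is modelled on the Linux 4.6 __virt_to_phys() macro as best
   understood, and is not taken from the crash source; the function
   name, the 48-bit default, and the sample addresses in the usage
   lines are assumptions:

```python
MASK64 = (1 << 64) - 1


def arm64_vtop(vaddr, phys_offset, kimage_voffset, va_bits=48):
    """Translate an ARM64 kernel virtual address to a physical address."""
    page_offset = (MASK64 << (va_bits - 1)) & MASK64
    if vaddr & (1 << (va_bits - 1)):
        # Unity-mapped region: strip PAGE_OFFSET, add the base of RAM.
        return ((vaddr & ~page_offset) + phys_offset) & MASK64
    # Kernel image region: a constant offset separates virtual from physical.
    return (vaddr - kimage_voffset) & MASK64


# Made-up values, for illustration only.
linear = arm64_vtop(0xffff800000001000, 0x40000000, 0xfffeffffc8000000)
kimage = arm64_vtop(0xffff000008080000, 0x40000000, 0xfffeffffc8000000)
```

   The unity-mapped case depends only on phys_offset and the kernel
   image case only on kimage_voffset, which is why neither argument
   can be derived from the other.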

 - Linux 3.15 and later kernels configured with CONFIG_RANDOMIZE_BASE
   could be identified because of the "randomize_modules" kernel symbol,
   and if it existed, the "--kaslr=<offset>" and/or "--kaslr=auto" 
   options were unnecessary.  Since the "randomize_modules" symbol was 
   removed in Linux 4.1, this patch has replaced the KASLR identifier 
   with the "module_load_offset" symbol, which was also introduced in 
   Linux 3.15, but still remains.
   (anderson at redhat.com)

 - Improvement of the ARM64 "bt -f" display such that in most cases,
   each stack frame level delimiter will be set to the stack address
   location containing the old FP and old LR pair.
   (takahiro.akashi at linaro.org)

 - Fix for the introduction of ARM64 support for 64K pages with 3-level
   page tables in crash-7.1.5, which fails to translate user space 
   virtual addresses.  Without the patch, "vtop <user-space address>"
   fails to translate all user-space addresses, and any command that 
   needs to either translate or read user-space memory, such as "vm -p",
   "ps -a", and "rd -u" will fail.
   (anderson at redhat.com)

 - Enhancement of the error message generated by the "tree -t radix"
   option when a duplicate entry is encountered.  Without the patch,
   the error message shows the address of the radix_tree_node that 
   contains the duplicate entry, for example, "tree: duplicate tree 
   entry: <radix_tree_node>".  It has been changed to also display
   the radix_tree_node.slots[] array index and the duplicate entry
   value, for example, "tree: duplicate tree entry: radix_tree_node: 
   <radix_tree_node> slots[<index>]: <entry>".
   (anderson at redhat.com)

 - Introduction of a new "bt -v" option that checks the kernel stack of
   all tasks for evidence of stack overflows.  It does so by verifying
   the thread_info.task address, ensuring the thread_info.cpu value is
   a valid cpu number, and checking the end of the stack for the 
   STACK_END_MAGIC value.
   (anderson at redhat.com)
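
   The three checks can be sketched in Python as follows.  This is an
   illustrative model, not the crash implementation: the function and
   parameter names are hypothetical, and it assumes that "verifying the
   thread_info.task address" means checking that it points back to the
   owning task:

```python
STACK_END_MAGIC = 0x57AC6E9D   # value from the kernel's magic.h


def check_stack(task_addr, ti_task, ti_cpu, stack_end_word, nr_cpus):
    """Return a list of reasons to suspect that a kernel stack overflowed."""
    problems = []
    if ti_task != task_addr:
        problems.append("thread_info.task does not point back to the task")
    if not 0 <= ti_cpu < nr_cpus:
        problems.append("thread_info.cpu is not a valid cpu number")
    if stack_end_word != STACK_END_MAGIC:
        problems.append("STACK_END_MAGIC is missing from the end of the stack")
    return problems
```

   An empty list means that none of the three indicators fired; an
   overflow typically overwrites the magic value first, and then the
   thread_info fields.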

 - Fix to recognize a kernel thread that has user space virtual memory
   attached to it.  While kernel threads typically do not have an 
   mm_struct referencing a user-space virtual address space, they can 
   either temporarily reference one for a user-space copy operation, or
   in the case of KVM "vhost" kernel threads, keep a reference to the 
   user space of the "qemu-kvm" task that created them.  Without the 
   patch, they will be mistaken for user tasks; the "bt" command will
   display an invalid kernel-entry exception frame that indicates
   "[exception RIP: unknown or invalid address]", the "ps" command
   will not enclose the command name with brackets, and the "ps -[uk]" 
   and "foreach [user|kernel]" options will show the kernel thread as
   a user task.
   (anderson at redhat.com)

 - Fix for the "bt -[eE]" options on ARM64 to recognize kernel exception
   frames in VHE enabled systems, in which the kernel runs in EL2.
   (takahiro.akashi at linaro.org)

 - Fix for the extensions/trace.c extension module to account for the
   Linux 4.7 kernel commit dcb0b5575d24 that changed the bit index for
   the TRACE_EVENT_FL_TRACEPOINT flag.  Without the patch, the "extend"
   command fails to load the trace.so module, with the error message
   "extend: /path/to/crash/extensions/trace.so: no commands registered:
   shared object unloaded".  The patch reads the flag's enum value 
   dynamically instead of using a hard-coded value.
   (namhyung at gmail.com)

 - Incorporated Takahiro Akashi's alternative backtrace method as a 
   "bt" option, which can be accessed using "bt -o", and where "bt -O"
   will toggle the original and optional methods as the default.  The 
   original backtrace method has adopted two changes/features from 
   the optional method:
     (1) ORIG_X0 and SYSCALLNO registers are not displayed in kernel
         exception frames.
     (2) stackframe entry text locations are modified to be the PC 
         address of the branch instruction instead of the subsequent 
         "return" PC address contained in the stackframe link register.
   Accordingly, these are the essential differences between the original
   and optional methods:
     (1) optional: the backtrace will start with the IPI exception frame
         located on the process stack.
     (2) original: backtraces for the active, non-crashing tasks will
         continue to start at crash_save_cpu() on the IRQ stack.
     (3) optional: the exception entry stackframe is adjusted to be
         located farther down in the IRQ stack.
     (4) optional: bt -f does not display IRQ stack memory above the 
         adjusted exception entry stackframe.
     (5) optional: may display "(Next exception frame might be wrong)".
   (takahiro.akashi at linaro.org, anderson at redhat.com)

 - Fix for the failure of the "sym <symbol>" option in the extremely
   unlikely case where the symbol's name string is composed entirely of 
   hexadecimal characters.  For example, without the patch, "sym e820" 
   fails with the error message "sym: invalid address: e820".
   (anderson at redhat.com)
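
   One way to avoid this kind of ambiguity is to try the argument as a
   symbol name before interpreting it as a hexadecimal address.  The
   Python sketch below illustrates that ordering only; it is not the
   crash code, and the function name, the symbol table, and the address
   assigned to "e820" are hypothetical:

```python
def resolve_sym_arg(arg, symtab):
    """Resolve a command argument, preferring a symbol-name match."""
    if arg in symtab:                  # a name such as "e820" wins over hex
        return arg, symtab[arg]
    try:
        addr = int(arg, 16)
    except ValueError:
        raise ValueError("invalid argument: " + arg)
    return None, addr


# Hypothetical symbol table with a made-up address.
symtab = {"e820": 0xffffffff81c00000}
by_name = resolve_sym_arg("e820", symtab)
by_addr = resolve_sym_arg("ffff", symtab)
```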

 - Fix for the failure of the "dis <symbol>" option in the extremely
   unlikely case where the symbol's name string is composed entirely of 
   hexadecimal characters.  For example, without the patch, "dis f" 
   fails with the error message "dis: WARNING: f: no associated kernel 
   symbol found" followed by "0xf: Cannot access memory at address 0xf". 
   (anderson at redhat.com)

 - Fix for the X86_64 "bt -R <symbol>" option if the only reference
   to the kernel text symbol in a backtrace is contained within the 
   "[exception RIP: <symbol+offset>]" line of an exception frame
   dump.  Without the patch, the reference will only be picked up if 
   the exception RIP's hexadecimal address value is used.
   (anderson at redhat.com)

 - Fix for the ARM64 "bt -R <symbol>" option if the only reference
   to the kernel text symbol in a backtrace is contained within the 
   "PC: <address>  [<symbol+offset>]" line of an exception frame
   dump.  Without the patch, the reference will only be picked up if 
   the PC's hexadecimal address value is used.
   (anderson at redhat.com)

 - Fix for the gathering of module symbol name strings during session
   initialization.  In the unlikely case where the ordering of module 
   symbol name strings does not match the order of the kernel_symbol 
   structures, a faulty module symbol list entry may be created that 
   contains a bogus name string.
   (sebastien.piechurski at bull.net)

 - Fix the PERCENTAGE of total output of the "kmem -i" SWAP USED line 
   when the system has no swap pages at all.  Without the patch, both
   the PAGES and TOTAL columns show values of zero, but it confusingly 
   shows "100% of TOTAL SWAP", which upon first glance may seem to 
   indicate potential memory pressure.
   (jsiddle at redhat.com)
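
   The underlying issue is the special case of a zero total.  A minimal
   Python sketch of a guarded percentage calculation follows; the
   function name is hypothetical, and reporting 0 for a system without
   swap is an assumption about the intended output:

```python
def swap_used_percent(used_pages, total_pages):
    """Percentage of swap in use, reporting 0 when there is no swap at all."""
    if total_pages == 0:
        return 0
    return used_pages * 100 // total_pages
```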

 - Enhancement to determine structure member data if the member is
   contained within an anonymous structure or union.  Without the patch,
   it is necessary to parse the output of a discrete gdb "printf" 
   command to determine the offset of such a structure member.
   (Alexandr_Terekhov at epam.com)

 - Speed up session initialization by attempting MEMBER_OFFSET_INIT()
   before falling back to ANON_MEMBER_OFFSET_INIT() in several known 
   cases of structure members that are contained within anonymous 
   structures.
   (anderson at redhat.com)

 - Implemented new "list -S" and "tree -S" options that are similar to 
   each command's -s option, but instead of parsing gdb output, member 
   values are read directly from memory, so the command is much faster
   for 1-, 2-, 4-, and 8-byte members.
   (Alexandr_Terekhov at epam.com)
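
   The speed-up comes from decoding the member's bytes directly instead
   of formatting and re-parsing text.  The Python sketch below shows the
   idea against a raw memory image; it is not the crash implementation,
   the names are hypothetical, and little-endian unsigned members are
   assumed:

```python
import struct

# Little-endian unsigned formats for the directly readable member sizes.
FORMATS = {1: "<B", 2: "<H", 4: "<I", 8: "<Q"}


def read_member(memory, struct_addr, offset, size):
    """Decode an integer member straight from a raw memory image."""
    fmt = FORMATS.get(size)
    if fmt is None:
        raise ValueError("unsupported member size: %d" % size)
    return struct.unpack_from(fmt, memory, struct_addr + offset)[0]


# A made-up image: 8 bytes of padding, then a 4-byte and an 8-byte member.
image = bytes(8) + struct.pack("<IQ", 7, 0x1122334455667788)
first = read_member(image, 8, 0, 4)
second = read_member(image, 8, 4, 8)
```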

 - Fix to recognize and support x86_64 Linux 4.8-rc1 and later kernels 
   that are configured with CONFIG_RANDOMIZE_MEMORY, which randomizes
   the base addresses of the kernel's unity-map address (PAGE_OFFSET), 
   and the vmalloc region.  Without the patch, the crash utility fails 
   with a segmentation violation during session initialization on a
   live system, or will generate a number of WARNING messages followed
   by the fatal error message "crash: vmlinux and <dumpfile name> do
   not match!" with dumpfiles.
   (anderson at redhat.com)

 - Fix for Linux 4.1 commit d0a0de21f82bbc1737ea3c831f018d0c2bc6b9c2, 
   which renamed the x86_64 "init_tss" per-cpu variable to "cpu_tss".  
   Without the patch, the addresses of the 4 per-cpu exception stacks 
   cannot be determined, which causes backtraces that originate on
   any of the per-cpu DOUBLEFAULT, NMI, DEBUG, or MCE stacks to be
   truncated.
   (anderson at redhat.com)

 - With the introduction of radix MMU in Power ISA 3.0, there are 
   changes in kernel page table management accommodating it.  This patch
   series makes appropriate changes here to work for such kernels.  
   Also, this series fixes a few bugs along the way:
 
     ppc64: fix vtop page translation for 4K pages
     ppc64: Use kernel terminology for each level in 4-level page table
     ppc64/book3s: address changes in kernel v4.5
     ppc64/book3s: address change in page flags for PowerISA v3.0
     ppc64: use physical addresses and unfold pud for 64K page size
     ppc64/book3s: support big endian Linux page tables

   The patches are needed for Linux v4.5 and later kernels on all
   ppc64 hardware.
   (hbathini at linux.vnet.ibm.com)
   
 - Fix for Linux 4.8-rc1 commit 500462a9de657f86edaa102f8ab6bff7f7e43fc2,
   in which Thomas Gleixner redesigned the kernel timer mechanism to
   switch to a non-cascading wheel.  Without the patch, the "timer"
   command fails with the message "timer: zero-size memory allocation! 
   (called from <address>)".
   (anderson at redhat.com)

 - Support for PPC64/BOOK3S virtual address translation for radix MMU. 
   As both radix and hash MMU are supported in a single kernel on
   Power ISA 3.0 based server processors, identify the current MMU
   type and set page table index values accordingly.  Also, in Linux 
   4.7 and later kernels, PPC64/BOOK3S uses the same masked bit values
   in page table entries for 4K and 64K page sizes.
   (hbathini at linux.vnet.ibm.com)

 - Change the RESIZEBUF() macro so that it will accept buffer pointers
   that are not declared as "char *" types.  Change two prior direct 
   callers of resizebuf() to use RESIZEBUF(), and fix two prior users of
   RESIZEBUF() to correctly calculate the need to resize their buffers.
   (anderson at redhat.com)

 - Fix for the "trace.so" extension module to properly recognize Linux
   3.15 and later kernels.  In crash-7.1.6, the MEMBER_OFFSET() macro
   has been improved so that it is able to recognize members of embedded
   anonymous structures.  However, the module's manner of recognizing 
   Linux 3.15 and later kernels depended upon MEMBER_OFFSET() failing
   to handle anonymous members, and therefore the improvement prevented
   the module from successfully loading.
   (rabinv at axis.com)

 - If a "struct" command address argument is expressed using the per-cpu
   "symbol:cpuspec" format, and the symbol is a pointer type, i.e., not
   the address of the structure, display a WARNING message.
   (atomlin at redhat.com)

 - Exclude ARM64 kernel module linker mapping symbols like "$d" and "$x"
   as is done with 32-bit ARM.  Without the patch, a crash session may 
   fail during the "gathering module symbol data" stage with a message 
   similar to "crash: store_module_symbols_v2: total: 15 mcnt: 16".
   (takahiro.akashi at linaro.org)
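
   Mapping symbols are "$x" and "$d" on ARM64 ("$a", "$t", and "$d" on
   32-bit ARM), optionally followed by a "." and a suffix.  A filter
   along the following lines would skip them; this is a Python sketch
   of the naming rule only, with a hypothetical function name, and is
   not the check used in the crash source:

```python
def is_mapping_symbol(name):
    """True for ARM/ARM64 linker mapping symbols such as "$d" or "$x.12"."""
    if len(name) < 2 or name[0] != "$" or name[1] not in "adtx":
        return False
    return len(name) == 2 or name[2] == "."


# Made-up module symbol list.
symbols = ["init_module", "$x", "$d", "cleanup_module"]
kept = [s for s in symbols if not is_mapping_symbol(s)]
```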

 - Enhancement to the ARM64 "dis" command when the kernel has enabled 
   KASLR.  When KASLR is enabled on ARM64, a function call between a 
   module and the base kernel code will be done via a veneer (PLT) if 
   the displacement is more than +/-128MB.  As a result, disassembled
   code will show a branch to the in-module veneer location instead of
   the in-kernel target location.  To avoid confusion, the output of
   the "dis" command will translate the veneer location to the target
   location preceded by "plt:", for example, "<plt:printk>".
   (takahiro.akashi at linaro.org)
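
   The 128MB limit follows from the reach of a direct AArch64 B/BL
   instruction.  The Python sketch below checks whether a call site can
   reach its target directly; the function name is hypothetical, and it
   models only the range test, not the veneer lookup performed by "dis":

```python
BRANCH_RANGE = 128 * 1024 * 1024   # reach of a direct B/BL, in bytes


def needs_veneer(branch_addr, target_addr):
    """True when the target is too far away for a direct branch."""
    displacement = target_addr - branch_addr
    return not (-BRANCH_RANGE <= displacement < BRANCH_RANGE)
```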

 - Improvement of the "dev -d" option to display I/O statistics for disks
   whose device driver uses the blk-mq interface.  Currently "dev -d" 
   always displays 0 in all fields for a blk-mq disk because blk-mq 
   does not increment/decrement request_list.count[2] on I/O creation 
   and I/O completion.  Instead, blk-mq maintains the following
   counters:
     - I/O creation:   blk_mq_ctx.rq_dispatched[2]
     - I/O completion: blk_mq_ctx.rq_completed[2]
   So the number of in-progress I/Os can be calculated as follows:
     in-progress I/Os == rq_dispatched - rq_completed
   This patch displays the result of the above calculation for the disk. 
   A device driver is determined to use blk-mq if its
   request_queue.mq_ops is not NULL.  The "DRV" field is displayed as 
   "N/A(MQ)" if no value for in-flight I/Os in the device driver exists
   for blk-mq.
   (m.mizuma at jp.fujitsu.com)
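
   The calculation can be sketched in Python as follows.  This is an
   illustration, not the crash code: the function name and the sample
   counters are hypothetical, and summing over every blk_mq_ctx that
   belongs to the queue is an assumption:

```python
def in_progress_ios(contexts):
    """Sum rq_dispatched - rq_completed over all contexts, per array index."""
    totals = [0, 0]
    for ctx in contexts:
        for i in range(2):
            totals[i] += ctx["rq_dispatched"][i] - ctx["rq_completed"][i]
    return totals


# Made-up counters for two contexts.
contexts = [
    {"rq_dispatched": [10, 4], "rq_completed": [8, 4]},
    {"rq_dispatched": [3, 1], "rq_completed": [3, 0]},
]
pending = in_progress_ios(contexts)
```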



