[Crash-utility] [ANNOUNCE] crash version 7.2.4 is available

Dave Anderson anderson at redhat.com
Fri Sep 21 19:27:06 UTC 2018


Download from: http://people.redhat.com/anderson
                 or
               https://github.com/crash-utility/crash/releases

The github master branch serves as a development branch that will contain 
all patches that are queued for the next release:

  $ git clone git://github.com/crash-utility/crash.git


Changelog:
  
 - Fix for the "timer -r" command on Linux 4.10 and later kernels that
   contain commit 2456e855354415bfaeb7badaa14e11b3e02c8466, titled
   "ktime: Get rid of the union".  Without the patch, the command fails
   with the error message "timer: invalid structure member offset: 
   ktime_t_sec".
   (k-hagio at ab.jp.nec.com)

 - Fix for the x86 and x86_64 "mach -m" option on Linux 4.12 and later 
   kernels to account for the structure name changes "e820map" to 
   "e820_table", and "e820entry" to "e820_entry", and for the symbol 
   name change from "e820" to "e820_table".  Also updated the display 
   output to properly translate E820_PRAM and E820_RESERVED_KERN entries.  
   Without the patch on all kernels, E820_PRAM and E820_RESERVED_KERN 
   entries show "type 12" and "type 128" respectively.  Without the 
   patch on Linux 4.12 and later kernels, the command fails with the 
   error message "mach: cannot resolve e820".  
   (anderson at redhat.com)

 - Update for the recognition of the new x86_64 CPU_ENTRY_AREA virtual
   address range introduced in Linux 4.15.  The memory range exists 
   above the vmemmap range and below the mapped kernel static text/data
   region, and is where all of the x86_64 exception stacks have been moved.
   Without the patch, reads from the new memory region fail because the
   address range is not recognized as a legitimate virtual address.  
   Most notable is the failure of "bt" on tasks whose backtraces 
   originate from any of the exception stacks, which fail with the two
   error messages "bt: seek error: kernel virtual address: <address>
   type: stack contents" followed by "bt: read of stack at <address>
   failed".
   (anderson at redhat.com)

 - Fix to address a "__builtin___snprintf_chk" compiler warning if bpf.c 
   is compiled with -D_FORTIFY_SOURCE=2.
   (anderson at redhat.com)

 - Fix for the "bpf -t" option.  Although highly unlikely, without the 
   patch, the target function name of a BPF bytecode call instruction 
   may fail to be resolved correctly.
   (anderson at redhat.com)

 - Fix for the display of the /proc/kcore ELF header contents when
   /proc/kcore gets selected as the live memory source because the
   kernel was configured with CONFIG_STRICT_DEVMEM.  Without the patch,
   the ELF header contents are not displayed by "help -[dD]", or when
   the crash session is invoked with "crash -d<number>", unless
   "/proc/kcore" is explicitly entered on the crash command line.
   (anderson at redhat.com)

 - If the default live memory source /dev/mem is determined to be 
   unusable because the kernel was configured with CONFIG_STRICT_DEVMEM,
   the first memory read during session initialization will fail.  The
   current behavior results in a readmem() error message, followed by two
   notification messages that indicate that /dev/mem is restricted and 
   a switch to using /proc/kcore will be attempted; the readmem is
   reattempted from /proc/kcore, and if successful, the session will 
   continue initialization.  With this patch, the behavior will change
   such that if the switch to /proc/kcore and the reattempted readmem()
   are successful, no messages will be displayed unless the crash 
   session is invoked with "crash -d<number>".
   (anderson at redhat.com)
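
   The revised fallback can be pictured with the following minimal
   Python sketch.  It is not crash's C implementation: the function
   name, the reader callbacks and the message strings are all
   hypothetical, and only the control flow follows the description
   above.

```python
def read_live_memory(read_devmem, read_kcore, addr, debug=0, log=print):
    """Try /dev/mem first, then quietly fall back to /proc/kcore.

    read_devmem and read_kcore are hypothetical callables that return
    bytes or raise OSError.  Messages are emitted only when the fallback
    also fails, or when a debug level was requested.
    """
    try:
        return "/dev/mem", read_devmem(addr)
    except OSError as first_error:
        # Remember what happened, but do not print anything yet.
        notes = [
            f"readmem failed on /dev/mem: {first_error}",
            "/dev/mem appears restricted; trying /proc/kcore",
        ]

    try:
        data = read_kcore(addr)
    except OSError:
        # The fallback failed as well, so the user needs the details.
        for note in notes:
            log(note)
        raise

    if debug:
        # Equivalent in spirit to running with "crash -d<number>".
        for note in notes:
            log(note)
    return "/proc/kcore", data
```

   With debug left at 0 and a working /proc/kcore reader, the caller
   gets its data and nothing is logged.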

 - Fix for the ppc64/ppc64le "bt" command on Linux 4.7 and later kernels
   that contain commit d8bff643d81a58181356c0aa3ab771ac10da6894, 
   titled "[x86] asm: Make sure verify_cpu() has a good stack", which 
   inadvertently breaks the ppc64/ppc64le kernel stack size calculation
   when running with crash-7.2.2 or later.  Without the patch, "bt" may
   fail with a filtered kdump dumpfile with the two error messages 
   "bt: page excluded: kernel virtual address: <address> type: stack
   contents" and "bt: read of stack at <address> failed".
   (anderson at redhat.com)

 - Fix for PPC64 kernel virtual address translation in Linux 4.17 and
   later kernels with commit c2b4d8b7417a59b7f9a52d0d8402f5257cbbd398,
   titled "powerpc/mm/hash64: Increase the VA range", in which the 
   maximum virtual address value has been increased to 4PB.  Without
   the patch, the translation/access of high vmalloc space addresses
   fails; for example, the "kmem -[sS]" option fails the translation 
   of per-cpu kmem_cache_cpu addresses located in vmalloc space, with
   the error messages "kmem: invalid kernel virtual address: <address>
   type: kmem_cache_cpu.freelist" and "kmem: invalid kernel virtual 
   address: <address>  type: kmem_cache_cpu.page", and the "vtop"
   command shows the addresses as "(not mapped)".
   (hbathini at linux.ibm.com)
   
 - Fix for the x86_64 "bt" command in which a legitimate exception
   frame is appended with the message "bt: WARNING: possibly bogus
   exception frame".  This only happens in KASLR-enabled kernels when
   the text address that was executing when the exception occurred
   is marked as a "weak" symbol (type "W") instead of a text symbol
   (type "T" or "t").  As a result, the exception frame's RIP is not
   recognized as a text symbol, and the warning message is displayed.  
   (anderson at redhat.com)

 - Fix for the x86_64 "bt" command in Linux 4.16 and later kernels
   containing commit 3aa99fc3e708b9cd9b4cfe2df0b7a66cf293e3cf, titled
   "x86/entry/64: Remove 'interrupt' macro".  Without the patch, the
   exception frame display generated by an interrupt exception will
   show incorrect contents, and be followed by the message "bt: WARNING:
   possibly bogus exception frame".
   (anderson at redhat.com)

 - Fix for the failure of several "kmem" command options, most notably 
   seen if the command is piped directly into a crash session, or if 
   the command is contained in an input file.  For example:
     $ echo "kmem -i" | crash ...
     $ crash -i <input-file> ...
   Without the patch, the kmem command may fail with the error message 
   "<segmentation violation in gdb>".  While the bug is due to a buffer 
   overflow that has always existed, it is only triggered by certain 
   kernel configurations.
   (anderson at redhat.com)

 - Update for the "kmem -V" option to also dump the global entries that
   are contained in the "vm_numa_stat" array that was introduced in 
   Linux 4.14.  Also, the command output separates the "vm_zone_stat",
   "vm_node_stat" and "vm_numa_stat" entries into separate sections with 
   "VM_ZONE_STAT", "VM_NODE_STAT" and "VM_NUMA_STAT" headers.  Without 
   the patch, the "vm_zone_stat" and "vm_node_stat" entries are listed
   together under a "VM_STAT" header.
   (anderson at redhat.com)

 - Support for the "bpf" command on RHEL 3.10.0-913.el7 and later 
   3.10-based RHEL7 kernels, which contain a backport of the upstream
   eBPF code, but still use the older, pre-4.11, IDR facility that does
   not use radix trees for linking the active bpf_prog and bpf_map 
   structures.  Without the patch, the command indicates "bpf: command
   not supported or applicable on this architecture or kernel". 
   (anderson at redhat.com)

 - Third phase of support for x86_64 5-level page tables in Linux 4.17
   and later kernels.  With this patch, the usage of 5-level page tables
   is automatically detected on live systems and when running against
   vmcores that contain the new "NUMBER(pgtable_l5_enabled)" VMCOREINFO
   entry.  Without the patch, the "--machdep vm=5level" command line
   option is required.
   (douly.fnst at cn.fujitsu.com, anderson at redhat.com)
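
   As an illustration of the vmcore side of the detection, VMCOREINFO
   is a block of "KEY=value" text lines, so the new entry can be
   looked up with a simple scan.  The Python sketch below is only a
   model of that idea; the function name and the sample text are
   made up for the example and do not come from crash.

```python
def pgtable_l5_enabled(vmcoreinfo_text):
    """Return True/False from NUMBER(pgtable_l5_enabled), or None if absent.

    None models the case where the entry is missing, in which case
    5-level paging cannot be detected from VMCOREINFO alone.
    """
    for line in vmcoreinfo_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "NUMBER(pgtable_l5_enabled)":
            return int(value.strip()) == 1
    return None


# Illustrative VMCOREINFO fragment (values invented for the example).
sample = "PAGESIZE=4096\nNUMBER(pgtable_l5_enabled)=1\n"
five_level = pgtable_l5_enabled(sample)
```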

 - The existing "list" command uses a hash table to detect duplicate 
   items as it traverses the list.  The hash table approach has worked 
   well for many years.  However, with increasing memory sizes and list
   sizes, the overhead of the hash table can be substantial, often 
   leading to commands running for a very long time.  For large lists,
   we have found that the existing hash-based approach may slow the
   system to a crawl and possibly never complete.  You can turn off
   the hash with "set hash off", but then there is no loop detection;
   in that case, loop detection must be done manually, for example
   after dumping the list to disk.  This patch is an implementation
   of the cycle detection algorithm from R. P. Brent as an alternative
   algorithm for the "list" command.  The algorithm avoids the
   overhead of the hash table yet is still able to detect a loop.  In
   addition, further loop characteristics are printed, such as the
   distance to the start of the loop as well as the loop length.
   An excellent description of the algorithm can be found here on
   the crash-utility mailing list:

   https://www.redhat.com/archives/crash-utility/2018-July/msg00019.html

   A new "list -B" option has been added to the "list" command to
   invoke this new algorithm rather than using the hash table.  Besides
   lowering memory usage, it changes the output of the list command
   slightly when a loop is detected: in addition to the first duplicate
   entry, the length of the loop and the distance to the loop are
   printed.
   (dwysocha at redhat.com)
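
   For readers unfamiliar with Brent's algorithm, here is a minimal
   Python sketch of the technique.  It is not the code used by
   "list -B": the function name is hypothetical, and the list is
   modelled as a callable that returns the next node, or None at the
   end of the list.  It reports the two loop characteristics
   mentioned above using constant memory.

```python
def brent_cycle(next_of, head):
    """Return (distance_to_loop, loop_length), or None if the list ends.

    next_of(node) yields the following node, or None at the list's end.
    Only two node references are kept, so memory use is constant.
    """
    # Phase 1: find the loop length by letting the hare run ahead in
    # windows whose size doubles each time (1, 2, 4, ...).
    power = length = 1
    tortoise = head
    hare = next_of(head)
    while hare is not None and tortoise != hare:
        if power == length:
            # Start a new, larger window from the hare's position.
            tortoise = hare
            power *= 2
            length = 0
        hare = next_of(hare)
        length += 1
    if hare is None:
        return None  # reached the end: no loop

    # Phase 2: find the distance from the head to the loop's first node.
    tortoise = hare = head
    for _ in range(length):
        hare = next_of(hare)
    distance = 0
    while tortoise != hare:
        tortoise = next_of(tortoise)
        hare = next_of(hare)
        distance += 1
    return distance, length


# 0 -> 1 -> 2 -> 3 -> 4 -> back to 2
links = {0: 1, 1: 2, 2: 3, 3: 4, 4: 2}
result = brent_cycle(links.get, 0)   # (2, 3)
```

   In the example the loop starts two nodes from the head and is
   three nodes long.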

 - Fix for x86_64 "bt" command to prevent an in-kernel exception frame
   from not being displayed.  Without the patch, if the RIP in a pt_regs
   structure on the stack is not a kernel text address, such as a NULL 
   pointer, it is not recognized as an exception frame and the register 
   set is not displayed.
   (anderson at redhat.com)

 - Fix for the "repeat" command when the argument consists of an input 
   file construct, for example, "repeat -1 < input_file".  Without the
   patch, only the first command line in the input file is executed 
   each time.
   (anderson at redhat.com)

 - Fourth phase of support for x86_64 5-level page tables in Linux 4.17
   and later kernels.  This patch adds support for user virtual address
   translation when the kernel is configured with CONFIG_X86_5LEVEL. 
   (douly.fnst at cn.fujitsu.com)
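
   To show what the extra level means for address translation, the
   Python sketch below splits a virtual address into its page table
   indexes.  It assumes the standard x86_64 layout of 4KB pages and
   9 index bits per level; the function name is hypothetical and the
   code is not taken from crash.

```python
PAGE_SHIFT = 12       # 4KB pages: low 12 bits are the page offset
INDEX_BITS = 9        # 512 entries per page table level


def split_vaddr(vaddr, five_level=True):
    """Return the page offset and per-level table indexes of vaddr."""
    # Lowest level first; 5-level paging inserts a P4D below the PGD.
    levels = ["pte", "pmd", "pud"] + (["p4d"] if five_level else []) + ["pgd"]
    fields = {"offset": vaddr & ((1 << PAGE_SHIFT) - 1)}
    shift = PAGE_SHIFT
    for name in levels:
        fields[name] = (vaddr >> shift) & ((1 << INDEX_BITS) - 1)
        shift += INDEX_BITS
    return fields


# An address built so that each index is easy to recognize.
addr = (1 << 48) | (2 << 39) | (3 << 30) | (4 << 21) | (5 << 12) | 0x678
parts = split_vaddr(addr)
```

   With five levels the PGD index comes from bits 48-56, which is why
   a 4-level walk of the same address would start from the wrong
   table entry.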

 - Fix to prevent an unnecessary "read error" message during session
   initialization on live systems running a kernel that is configured
   with CONFIG_X86_5LEVEL.  Without the patch, a message indicating 
   "crash: read error: kernel virtual address: <address>  type: 
   __pgtable_l5_enabled" will be displayed if /proc/kcore gets 
   selected as the live memory source after /dev/mem is determined
   to be unusable.
   (anderson at redhat.com)

 - Update for "ps" and "foreach" commands to display and recognize two
   new process states, "ID" for the TASK_IDLE macro introduced in 
   Linux 4.2, and "NE" for the TASK_NEW bit introduced in Linux 4.8.
   (k-hagio at ab.jp.nec.com)
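
   The subtlety with TASK_IDLE is that it is a combination of two
   bits (TASK_UNINTERRUPTIBLE and TASK_NOLOAD), so it has to be
   tested before the plain "UN" state.  The Python sketch below
   illustrates that ordering.  The numeric values are assumptions
   based on the kernel's include/linux/sched.h and may differ between
   kernel versions, so a real tool should take them from the kernel
   being analyzed; the function name is hypothetical.

```python
# Assumed bit values; verify against the target kernel's sched.h.
TASK_RUNNING = 0x0000
TASK_INTERRUPTIBLE = 0x0001
TASK_UNINTERRUPTIBLE = 0x0002
TASK_NOLOAD = 0x0400
TASK_NEW = 0x0800
TASK_IDLE = TASK_UNINTERRUPTIBLE | TASK_NOLOAD


def state_abbrev(state):
    """Map a task state value to a two-letter abbreviation."""
    if state == TASK_RUNNING:
        return "RU"
    if state & TASK_NEW:
        return "NE"
    # Both bits must be set; check this before the lone UN bit.
    if (state & TASK_IDLE) == TASK_IDLE:
        return "ID"
    if state & TASK_UNINTERRUPTIBLE:
        return "UN"
    if state & TASK_INTERRUPTIBLE:
        return "IN"
    return "??"  # states not covered by this sketch
```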

 - Fix for running live on ARM64 kernels against /proc/kcore on kernels
   configured with CONFIG_RANDOMIZE_BASE.  Without the patch, depending
   upon the hardware platform, the session may fail with the error message
   "crash: vmlinux and /proc/kcore do not match!".
   (anderson at redhat.com)

 - Modify the output of the "kmem -[sS]" header and contents such that
   the slab cache "NAME" string is moved from the second column to the
   last column.  Since the slab cache name strings have become
   increasingly longer over time, without the patch, the numerical 
   column contents may be skewed so far to the right that the output 
   becomes difficult to read.
   (k-hagio at ab.jp.nec.com)

 - Fix for the "files" and "net -s" commands when a task has more than
   1024 (FD_SETSIZE) open file descriptors.  Without the patch, the
   commands may omit the display of open file descriptors.
   (tan.hu at zte.com.cn)

 - As an addendum to the new "kmem -[sS]" output format, align the slab
   cache name string so that it is beneath the "NAME" header column when
   the "kmem -I <slab-cache>" option is used to ignore a slab cache,
   or if the scan of the metadata of a slab cache encounters corruption.
   Also remove a superfluous line from the "help kmem" description of 
   the "kmem -I" option. 
   (k-hagio at ab.jp.nec.com, anderson at redhat.com)

 - Account for the addition of the new ORC unwinder "orc_entry.end" 
   member in kernel commit d31a580266eeb1f355df90fde8a71f480e30ad70, 
   titled "x86/unwind/orc: Detect the end of the stack".
   (anderson at redhat.com)

 - Fix for the "trace.c" extension module for RHEL7.6, which moved the
   ftrace_event_call.data member into a new structure contained within
   an anonymous union.  Without the patch, the module fails to load, 
   indicating "no commands registered: shared object unloaded".
   (xuhuan.fnst at cn.fujitsu.com)

 - Fix for the "vm -p", user-space "vtop", and "pte" commands in kernels
   where the dimension of the static swap_info[] array is not contained 
   in the vmlinux file's debuginfo data.  Without the patch, the 
   translation of a swapped-out PTE entry fails to determine the swap
   device, and the commands display "cannot determine swap location". 
   (anderson at redhat.com)

 - Fix for the swap offset calculation in the x86_64 "vm -p", "pte", and 
   user-space "vtop" commands.  The swap offset bits in an x86_64 PTE
   were changed in Linux 4.6, and then again in Linux 4.18.1 with the
   new L1TF security patchset.  Without the patch, the commands show an
   incorrect swap offset value on the later kernels, and on older
   kernels with an L1TF backport.
   (anderson at redhat.com)

 - Fix for the "kmem -V" option on Linux 4.14 and later kernels that are
   configured without CONFIG_NUMA, and therefore do not contain the
   "numa_stat_item" enumeration.  Without the patch, the command causes
   the crash session to abort with the error messages "double free or 
   corruption (!prev)" followed by "Aborted (core dumped)".
   (k-hagio at ab.jp.nec.com) 

 - Introduction of a new "kmem -r" option.  With the implementation of
   per-cgroup kmem_cache slabs, the number of slab caches displayed by
   "kmem -s" can number into the thousands.  Similar to /proc/slabinfo,
   this new option displays the accumulated data of the root cache and
   its children.  It is limited to Linux 4.11 and later kernels that 
   contain the "slab_root_caches" list.  Currently the command option 
   is restricted to kernels configured with CONFIG_SLUB.
   (k-hagio at ab.jp.nec.com) 

 - Fix for Linux 4.19-rc1 and later kernels that contain kernel commit
   2c4704756cab7cfa031ada4dab361562f0e357c0, titled "pids: Move the pgrp
   and session pid pointers from task_struct to signal_struct".  Without
   the patch, the crash session fails during initialization with the
   message "crash: invalid structure member offset: task_struct_pids".
   (anderson at redhat.com)

 - Fix for Linux 4.19-rc1 and later kernels that contain kernel commit
   7290d58095712a89f845e1bca05334796dd49ed2, titled "module: use 
   relative references for __ksymtab entries".  Without the patch,
   kernels configured with CONFIG_HAVE_ARCH_PREL32_RELOCATIONS fail
   during session initialization, with a dump of the internal buffer
   allocation stats followed by the message "crash: cannot allocate 
   any more memory!"
   (asmadeus at codewreck.org)
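
   A relative reference stores a signed 32-bit offset from the
   field's own address rather than an absolute pointer.  The Python
   sketch below shows how such a field could be turned back into an
   address.  It assumes a little-endian 64-bit target; the function
   name and the sample addresses are invented for illustration and
   are not taken from crash or from a real kernel.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def resolve_prel32(field_addr, raw):
    """Resolve a 4-byte place-relative reference read from field_addr.

    raw holds the 4 bytes stored at field_addr; they are interpreted
    as a signed little-endian offset from field_addr itself.
    """
    (offset,) = struct.unpack("<i", raw)
    return (field_addr + offset) & MASK64


# Hypothetical entry located 0x10 bytes after the symbol it refers to.
field = 0xFFFFFFFF82000010
target = resolve_prel32(field, struct.pack("<i", -0x10))
```

   Reading such a field as if it were an absolute pointer yields a
   meaningless address, so both the entry size and the decoding
   differ from the older format.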

 - Fix a cut-and-paste error in the previous patch application.
   (anderson at redhat.com)

 - Fix for the "files" command in Linux 4.17 and later kernels that
   contain commit b93b016313b3ba8003c3b8bb71f569af91f19fc7, titled
   "page cache: use xa_lock".  Without the patch, the "files -c" option
   fails with the message "files: -c option not supported or applicable
   on this architecture or kernel", and the "files -p <inode>" option
   fails in a similar manner.
   (k-hagio at ab.jp.nec.com) 

 - Fix for the "files -p <inode>" option.  Without the patch, the
   command attempts to translate radix tree node slot entries that 
   are RADIX_TREE_EXCEPTIONAL_ENTRY types, and as a result may fail
   prematurely with an error message of the sort "files: do_radix_tree:
   callback operation failed: entry: 5  item: 44788c5000a".
   (anderson at redhat.com)
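
   The item value in the sample error message has bit 1 set, which
   is how the kernel's radix tree marks an exceptional entry.  The
   Python sketch below shows the kind of filter that skips such
   slots.  The flag value 2 is an assumption taken from the kernel's
   include/linux/radix-tree.h for the kernels concerned, and the
   function names are hypothetical.

```python
# Assumed flag value; exceptional entries are not page pointers.
RADIX_TREE_EXCEPTIONAL_ENTRY = 2


def is_exceptional(entry):
    """True if the slot holds an exceptional entry rather than a pointer."""
    return bool(entry & RADIX_TREE_EXCEPTIONAL_ENTRY)


def translatable_slots(slots):
    """Keep only non-empty slots that can be treated as pointers."""
    return [s for s in slots if s and not is_exceptional(s)]


# The item from the error message above is flagged as exceptional.
flagged = is_exceptional(0x44788C5000A)
```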

 - Commit 3db3d3992d781c1e42587d2d2bf81e785408e0c2 in crash-7.1.8 was
   aimed at making the PPC64 "bt" command work for dumpfiles saved
   with the FADUMP facility, but it introduced a bit of unwarranted 
   complexity in "bt" command processing.  Reworked the "bt" command 
   processing for PPC64 arch to make it a little less complicated and
   also to print symbols for NIP and LR registers in exception frames.
   Without the patch, "bt" on non-panic active tasks may fail with 
   the message "bt: invalid kernel virtual address: <address>  
   type: Regs NIP value".
   (hbathini at linux.ibm.com)

 - An addendum to crash commit 5fe78861ea1589084f6a2956a6ff63677c9269e1,
   this patch for the PPC64 "bt" command prevents an invalid error 
   message from being displayed when an active non-panic task is 
   interrupted while running in user space.  Without the patch, the 
   command correctly indicates "Task is running in user space", dumps 
   the user-space exception frame, but then prints the invalid error 
   message "bt: invalid kernel virtual address: ffffffffffffff90 type:
   Regs NIP value".
   (anderson at redhat.com)



