[Crash-utility] [PATCH v2] crash: dis: introduce count in reverse and forward mode

Fri Jun 21 15:45:29 UTC 2019

----- Original Message -----
> Changes since v1:
> 
>  - Update 'dis' help page
>  - Resolve patch fuzz
> 
> 
> The purpose of this patch is to add support for a count value in reverse
> or forward mode, during disassembly:
> 
> 	'dis [-r|-f] [symbol|address] [count]'
> 
> For example:
> 
>   crash> dis -f list_del+0x16 4
>   0xffffffff812b3346 <list_del+22>:       jne    0xffffffff812b3381  <list_del+81>
>   0xffffffff812b3348 <list_del+24>:       mov    (%rbx),%rax
>   0xffffffff812b334b <list_del+27>:       mov    0x8(%rax),%r8
>   0xffffffff812b334f <list_del+31>:       cmp    %r8,%rbx
> 
>   crash> dis -r list_del+0x16 4
>   0xffffffff812b333c <list_del+12>:       mov    0x8(%rdi),%rax
>   0xffffffff812b3340 <list_del+16>:       mov    (%rax),%r8
>   0xffffffff812b3343 <list_del+19>:       cmp    %r8,%rdi
>   0xffffffff812b3346 <list_del+22>:       jne    0xffffffff812b3381  <list_del+81>

Hmmm, I don't understand why, but I still see the fuzz on x86_64.  More importantly,
your patch fails to compile on the ppc64 architecture.

Note that during the initial build, the gdb-7.6.tar.gz file is un-tarred,
and then 3 patches are applied to it prior to the first "make":

  gdb-7.6.patch
  gdb-7.6-ppc64le-support.patch
  gdb-7.6-proc_service.h.patch

The gdb-7.6-ppc64le-support.patch is an 86-file patch that changes
several files that are also modified by the generic gdb-7.6.patch, 
one of them being printcmd.c.  It *only* gets applied on ppc64le hosts
or if you run "make target=ppc64".  

If I check out a fresh git tree, apply your patch, and do a "make"
on a ppc64le host, the build fails.  The initial build log shows that 
the gdb-7.6.patch does not apply cleanly:

  if [ -f gdb-7.6.patch ] && [ -s gdb-7.6.patch ]; then \
          patch -p0 < gdb-7.6.patch; cp gdb-7.6.patch gdb-7.6; fi
  ... [ cut ] ...
  patching file gdb-7.6/gdb/printcmd.c
  Hunk #2 succeeded at 800 with fuzz 2.
  Hunk #5 succeeded at 2187 with fuzz 1 (offset 995 lines).

And then when the gdb-7.6-ppc64le-support.patch gets applied after that, 
it's not a clean application as it normally is:

  if [ "ppc64le" = "ppc64le" ] && [ -f gdb-7.6-ppc64le-support.patch ]; then \
          patch -d gdb-7.6 -p1 -F0 < gdb-7.6-ppc64le-support.patch ; \
  fi
  ... [ cut ] ...
  patching file gdb/printcmd.c
  Hunk #1 succeeded at 673 (offset 5 lines).
  Hunk #2 succeeded at 1435 (offset 269 lines).

And ultimately it fails to compile printcmd.c:

-g -O2 -m64 -fPIC -mminimal-toc  -I. -I. -I./common -I./config -DLOCALEDIR="\"/usr/local/share/locale\"" -DCRASH_MERGE -DHAVE_CONFIG_H -I./../include/opcode -I./../opcodes/.. -I./../readline/.. -I../bfd -I./../bfd -I./../include -I../libdecnumber -I./../libdecnumber  -I./gnulib/import -Ibuild-gnulib/import   -DTUI=1  -Wall -Wdeclaration-after-statement -Wpointer-arith -Wformat-nonliteral -Wno-pointer-sign -Wno-unused -Wunused-value -Wunused-function -Wno-switch -Wno-char-subscripts -Wmissing-prototypes -Wdeclaration-after-statement -Wempty-body  `echo " -Wall -Wdeclaration-after-statement -Wpointer-arith -Wformat-nonliteral -Wno-pointer-sign -Wno-unused -Wunused-value -Wunused-function -Wno-switch -Wno-char-subscripts -Wmissing-prototypes -Wdeclaration-after-statement -Wempty-body " | sed "s/ -Wformat-nonliteral / -Wno-format-nonliteral /g"` \
        -c -o printcmd.o -MT printcmd.o -MMD -MP -MF .deps/printcmd.Tpo ./printcmd.c
./printcmd.c: In function ‘display_info’:
./printcmd.c:2192:7: error: ‘need_to_update_next_address’ undeclared (first use in this function); did you mean ‘set_next_address’?
   if (need_to_update_next_address)
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~
       set_next_address
./printcmd.c:2192:7: note: each undeclared identifier is reported only once for each function it appears in
./printcmd.c:2193:20: error: ‘addr_rewound’ undeclared (first use in this function)
     next_address = addr_rewound;
                    ^~~~~~~~~~~~
make[4]: *** [Makefile:1566: printcmd.o] Error 1
make[3]: *** [Makefile:8265: all-gdb] Error 2
make[2]: *** [Makefile:835: all] Error 2

crash build failed

make[1]: *** [Makefile:234: gdb_merge] Error 1
make: *** [Makefile:225: all] Error 2

To be honest, it's not clear how that can be addressed, because any patches
to printcmd.c must be applicable to the non-ppc64le and the ppc64le versions
of the file. 

So let's take a step back, and consider the risk/reward of this quite intrusive
patchset to such a crucial gdb file.  

First, "dis -f <address> <count>" is pretty much meaningless.  In order
to accomplish that, just don't use the "-f" argument! 

And BTW, even if you wanted to support it, it could be done here in
cmd_dis() with a couple of lines:

                if (args[++optind]) {
                        if (reverse || forward) {
                                error(INFO,
                                    "count argument ignored with -%s option\n",
                                        reverse ? "r" : "f");
                        } else {
                                req->count = stol(args[optind],
                                        FAULT_ON_ERROR, NULL);
                                req->flags &= ~GNU_FUNCTION_ONLY;
                                count_entered++;
                        }
                }

i.e., by just setting "forward" back to FALSE if a count argument is appended.
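
To make that concrete, here is a rough standalone sketch of the idea.  It is
not the actual cmd_dis() code: the struct and the function name are
hypothetical stand-ins for the local variables used there, and the
GNU_FUNCTION_ONLY flag handling is left out.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the option state that cmd_dis() tracks. */
struct dis_opts {
	bool reverse;        /* -r was given */
	bool forward;        /* -f was given */
	bool count_entered;  /* a count argument followed the address */
	long count;          /* number of instructions requested */
};

/*
 * If a count is appended to "dis -f <address>", treat the request as a
 * plain "dis <address> <count>": clear the forward flag and keep the count.
 * With -r the count is still ignored, as in the current code.
 * Returns true if the count was accepted.
 */
static bool dis_apply_count(struct dis_opts *o, long count)
{
	if (o->reverse)
		return false;		/* count ignored with -r */

	o->forward = false;		/* "-f" plus a count acts like no "-f" */
	o->count = count;
	o->count_entered = true;
	return true;
}
```

With that, "dis -f list_del+0x16 4" would end up on the same path as
"dis list_del+0x16 4".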

Secondly, for "dis -r <address> count", you could simply run "dis -r <address> | tail -4".
For that matter, is it really all that onerous to see the fully disassembled function
up to the target address?

And again, even if you wanted to support it, it also seems like it could be accomplished
within the cmd_dis() function, given that the full output is pre-gathered into a tmpfile
prior to being displayed.  
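
As a sketch of what I mean, assuming the gathered output is available as an
ordinary stdio FILE pointer: the function below copies only the last "count"
lines of one stream to another, which is what "| tail -4" does.  The name
show_last_lines() and the fixed line-length limit are made up for
illustration; nothing like this exists in crash today.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define DIS_LINE_LEN 1024	/* assumed upper bound on one output line */

/*
 * Copy only the last COUNT lines of IN to OUT, using a ring buffer of
 * COUNT line slots.  A line longer than DIS_LINE_LEN - 1 characters is
 * read in pieces and each piece counts as a line.
 * Returns the number of lines written, or -1 if allocation fails.
 */
static long show_last_lines(FILE *in, FILE *out, long count)
{
	char (*ring)[DIS_LINE_LEN];
	char line[DIS_LINE_LEN];
	long total = 0, shown, start, i;

	if (count <= 0)
		return 0;

	ring = calloc((size_t)count, sizeof(*ring));
	if (!ring)
		return -1;

	/* Read everything; each new line overwrites the oldest slot. */
	rewind(in);
	while (fgets(line, sizeof(line), in)) {
		strcpy(ring[total % count], line);
		total++;
	}

	/* Emit the surviving lines, oldest first. */
	shown = total < count ? total : count;
	start = total - shown;
	for (i = 0; i < shown; i++)
		fputs(ring[(start + i) % count], out);

	free(ring);
	return shown;
}
```

The forward disassembly from the start of the function stays exactly as it
is today; only the display step changes, and printcmd.c is not touched.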

I appreciate the time and effort you've put into it, but making such a huge change to
the printcmd.c file for such a small reward scares the hell out of me.

Dave

> To support this feature, I have essentially incorporated GDB commit
> bb556f1facb ("Add negative repeat count to 'x' command"), with some
> additional changes to maintain default behaviour i.e. always display the
> target instruction with the examine command.
> 
> Signed-off-by: Aaron Tomlin <atomlin at redhat.com>
> ---
>  gdb-7.6.patch | 331 ++++++++++++++++++++++++++++++++++++++++++++++++++
>  help.c        |   4 +-
>  kernel.c      |  33 ++---
>  3 files changed, 351 insertions(+), 17 deletions(-)
> 
> diff --git a/gdb-7.6.patch b/gdb-7.6.patch
> index cd75dcf..96cdef4 100644
> --- a/gdb-7.6.patch
> +++ b/gdb-7.6.patch
> @@ -2447,3 +2447,334 @@ diff -up gdb-7.6/opcodes/configure.orig gdb-7.6/opcodes/configure
>   #else
>   # error "!__i386__ && !__x86_64__"
>   #endif
> +--- gdb-7.6/gdb/printcmd.c.orig
> ++++ gdb-7.6/gdb/printcmd.c
> +@@ -201,8 +201,13 @@ decode_format (char **string_ptr, int oformat, int osize)
> +   val.count = 1;
> +   val.raw = 0;
> +
> ++  if (*p == '-')
> ++    {
> ++      val.count = -1;
> ++      p++;
> ++    }
> +   if (*p >= '0' && *p <= '9')
> +-    val.count = atoi (p);
> ++    val.count *= atoi (p);
> +   while (*p >= '0' && *p <= '9')
> +     p++;
> +
> +@@ -795,6 +800,232 @@ print_address_demangle (const struct value_print_options *opts,
> + }
> +
> +
> ++/* Find the address of the instruction that is INST_COUNT instructions before
> ++   the instruction at ADDR.
> ++   Since some architectures have variable-length instructions, we can't just
> ++   simply subtract INST_COUNT * INSN_LEN from ADDR.  Instead, we use line
> ++   number information to locate the nearest known instruction boundary,
> ++   and disassemble forward from there.  If we go out of the symbol range
> ++   during disassembling, we return the lowest address we've got so far and
> ++   set the number of instructions read to INST_READ.  */
> ++
> ++static CORE_ADDR
> ++find_instruction_backward (struct gdbarch *gdbarch, CORE_ADDR addr,
> ++                           int inst_count, int *inst_read)
> ++{
> ++  /* The vector PCS is used to store instruction addresses within
> ++     a pc range.  */
> ++  CORE_ADDR loop_start, loop_end, p, func_addr;
> ++  VEC (CORE_ADDR) *pcs = NULL;
> ++  struct symtab_and_line sal;
> ++  struct cleanup *cleanup = make_cleanup (VEC_cleanup (CORE_ADDR), &pcs);
> ++  int actual_count = 0;
> ++
> ++  *inst_read = 0;
> ++  inst_count--;
> ++  loop_start = loop_end = addr;
> ++
> ++  find_pc_partial_function (addr, NULL, &func_addr, NULL);
> ++  for (p = func_addr; p != addr;)
> ++    {
> ++      p += gdb_insn_length (gdbarch, p);
> ++      actual_count++;
> ++    }
> ++  if (inst_count > actual_count)
> ++     inst_count = actual_count;
> ++
> ++  /* In each iteration of the outer loop, we get a pc range that ends before
> ++     LOOP_START, then we count and store every instruction address of the range
> ++     iterated in the loop.
> ++     If the number of instructions counted reaches INST_COUNT, return the
> ++     stored address that is located INST_COUNT instructions back from ADDR.
> ++     If INST_COUNT is not reached, we subtract the number of counted
> ++     instructions from INST_COUNT, and go to the next iteration.  */
> ++  do
> ++    {
> ++      VEC_truncate (CORE_ADDR, pcs, 0);
> ++      sal = find_pc_sect_line (loop_start, NULL, 1);
> ++      if (sal.line <= 0)
> ++        {
> ++          /* We reach here when line info is not available.  In this case,
> ++             we print a message and just exit the loop.  The return value
> ++             is calculated after the loop.  */
> ++          printf_filtered (_("No line number information available "
> ++                             "for address "));
> ++          wrap_here ("  ");
> ++          print_address (gdbarch, loop_start - 1, gdb_stdout);
> ++          printf_filtered ("\n");
> ++          break;
> ++        }
> ++
> ++      loop_end = loop_start;
> ++      loop_start = sal.pc;
> ++
> ++      /* This loop pushes instruction addresses in the range from
> ++         LOOP_START to LOOP_END.  */
> ++      for (p = loop_start; p < loop_end;)
> ++        {
> ++          VEC_safe_push (CORE_ADDR, pcs, p);
> ++          p += gdb_insn_length (gdbarch, p);
> ++        }
> ++
> ++      inst_count -= VEC_length (CORE_ADDR, pcs);
> ++      *inst_read += VEC_length (CORE_ADDR, pcs);
> ++    }
> ++  while (inst_count > 0);
> ++
> ++  /* After the loop, the vector PCS has instruction addresses of the last
> ++     source line we processed, and INST_COUNT has a negative value.
> ++     We return the address at the index of -INST_COUNT in the vector for
> ++     the reason below.
> ++     Let's assume the following instruction addresses and run 'x/-4i 0x400e'.
> ++       Line X of File
> ++          0x4000
> ++          0x4001
> ++          0x4005
> ++       Line Y of File
> ++          0x4009
> ++          0x400c
> ++       => 0x400e
> ++          0x4011
> ++     find_instruction_backward is called with INST_COUNT = 4 and expected to
> ++     return 0x4001.  When we reach here, INST_COUNT is set to -1 because
> ++     it was subtracted by 2 (from Line Y) and 3 (from Line X).  The value
> ++     4001 is located at the index 1 of the last iterated line (= Line X),
> ++     which is simply calculated by -INST_COUNT.
> ++     The case when the length of PCS is 0 means that we reached an area for
> ++     which line info is not available.  In such case, we return LOOP_START,
> ++     which was the lowest instruction address that had line info.  */
> ++  p = VEC_length (CORE_ADDR, pcs) > 0
> ++    ? VEC_index (CORE_ADDR, pcs, -inst_count)
> ++    : loop_start;
> ++
> ++  /* INST_READ includes all instruction addresses in a pc range.  Need to
> ++     exclude the beginning part up to the address we're returning.  That
> ++     is, exclude {0x4000} in the example above.  */
> ++  if (inst_count < 0)
> ++    *inst_read += inst_count;
> ++
> ++  do_cleanups (cleanup);
> ++  return p;
> ++}
> ++
> ++/* Backward read LEN bytes of target memory from address MEMADDR + LEN,
> ++   placing the results in GDB's memory from MYADDR + LEN.  Returns
> ++   a count of the bytes actually read.  */
> ++
> ++static int
> ++read_memory_backward (struct gdbarch *gdbarch,
> ++                      CORE_ADDR memaddr, gdb_byte *myaddr, int len)
> ++{
> ++  int errcode;
> ++  int nread;      /* Number of bytes actually read.  */
> ++
> ++  /* First try a complete read.  */
> ++  errcode = target_read_memory (memaddr, myaddr, len);
> ++  if (errcode == 0)
> ++    {
> ++      /* Got it all.  */
> ++      nread = len;
> ++    }
> ++  else
> ++    {
> ++      /* Loop, reading one byte at a time until we get as much as we can.  */
> ++      memaddr += len;
> ++      myaddr += len;
> ++      for (nread = 0; nread < len; ++nread)
> ++        {
> ++          errcode = target_read_memory (--memaddr, --myaddr, 1);
> ++          if (errcode != 0)
> ++            {
> ++              /* The read was unsuccessful, so exit the loop.  */
> ++              printf_filtered (_("Cannot access memory at address %s\n"),
> ++                               paddress (gdbarch, memaddr));
> ++              break;
> ++            }
> ++        }
> ++    }
> ++  return nread;
> ++}
> ++
> ++/* Returns true if X (which is LEN bytes wide) is the number zero.  */
> ++
> ++static int
> ++integer_is_zero (const gdb_byte *x, int len)
> ++{
> ++  int i = 0;
> ++
> ++  while (i < len && x[i] == 0)
> ++    ++i;
> ++  return (i == len);
> ++}
> ++
> ++/* Find the start address of a string in which ADDR is included.
> ++   Basically we search for '\0' and return the next address,
> ++   but if OPTIONS->PRINT_MAX is smaller than the length of a string,
> ++   we stop searching and return the address to print characters as many as
> ++   PRINT_MAX from the string.  */
> ++
> ++static CORE_ADDR
> ++find_string_backward (struct gdbarch *gdbarch,
> ++                      CORE_ADDR addr, int count, int char_size,
> ++                      const struct value_print_options *options,
> ++                      int *strings_counted)
> ++{
> ++  const int chunk_size = 0x20;
> ++  gdb_byte *buffer = NULL;
> ++  struct cleanup *cleanup = NULL;
> ++  int read_error = 0;
> ++  int chars_read = 0;
> ++  int chars_to_read = chunk_size;
> ++  int chars_counted = 0;
> ++  int count_original = count;
> ++  CORE_ADDR string_start_addr = addr;
> ++
> ++  gdb_assert (char_size == 1 || char_size == 2 || char_size == 4);
> ++  buffer = (gdb_byte *) xmalloc (chars_to_read * char_size);
> ++  cleanup = make_cleanup (xfree, buffer);
> ++  while (count > 0 && read_error == 0)
> ++    {
> ++      int i;
> ++
> ++      addr -= chars_to_read * char_size;
> ++      chars_read = read_memory_backward (gdbarch, addr, buffer,
> ++                                         chars_to_read * char_size);
> ++      chars_read /= char_size;
> ++      read_error = (chars_read == chars_to_read) ? 0 : 1;
> ++      /* Searching for '\0' from the end of buffer in backward direction.  */
> ++      for (i = 0; i < chars_read && count > 0 ; ++i, ++chars_counted)
> ++        {
> ++          int offset = (chars_to_read - i - 1) * char_size;
> ++
> ++          if (integer_is_zero (buffer + offset, char_size)
> ++              || chars_counted == options->print_max)
> ++            {
> ++              /* Found '\0' or reached print_max.  As OFFSET is the offset to
> ++                 '\0', we add CHAR_SIZE to return the start address of
> ++                 a string.  */
> ++              --count;
> ++              string_start_addr = addr + offset + char_size;
> ++              chars_counted = 0;
> ++            }
> ++        }
> ++    }
> ++
> ++  /* Update STRINGS_COUNTED with the actual number of loaded strings.  */
> ++  *strings_counted = count_original - count;
> ++
> ++  if (read_error != 0)
> ++    {
> ++      /* In error case, STRING_START_ADDR is pointing to the string that
> ++         was last successfully loaded.  Rewind the partially loaded string.  */
> ++      string_start_addr -= chars_counted * char_size;
> ++    }
> ++
> ++  do_cleanups (cleanup);
> ++  return string_start_addr;
> ++}
> ++
> + /* Examine data at address ADDR in format FMT.
> +    Fetch it from memory and print on gdb_stdout.  */
> +
> +@@ -808,12 +1039,16 @@ do_examine (struct format_data fmt, struct gdbarch *gdbarch, CORE_ADDR addr)
> +   int i;
> +   int maxelts;
> +   struct value_print_options opts;
> ++  int need_to_update_next_address = 0;
> ++  CORE_ADDR addr_rewound = 0;
> ++  int is_backward;
> +
> +   format = fmt.format;
> +   size = fmt.size;
> +   count = fmt.count;
> +   next_gdbarch = gdbarch;
> +   next_address = addr;
> ++  is_backward = count < 0;
> +
> +   /* Instruction format implies fetch single bytes
> +      regardless of the specified size.
> +@@ -878,9 +1113,43 @@ do_examine (struct format_data fmt, struct gdbarch *gdbarch, CORE_ADDR addr)
> +
> +   get_formatted_print_options (&opts, format);
> +
> ++  if (is_backward)
> ++    {
> ++      /* This is the negative repeat count case.
> ++         We rewind the address based on the given repeat count and format,
> ++         then examine memory from there in forward direction.  */
> ++
> ++      count = -count;
> ++      if (format == 'i')
> ++        {
> ++          next_address = find_instruction_backward (gdbarch, addr, count,
> ++                                                    &count);
> ++        }
> ++      else if (format == 's')
> ++        {
> ++          next_address = find_string_backward (gdbarch, addr, count,
> ++                                               TYPE_LENGTH (val_type),
> ++                                               &opts, &count);
> ++        }
> ++      else
> ++        {
> ++          next_address = addr - count * TYPE_LENGTH (val_type);
> ++        }
> ++
> ++      /* The following call to print_formatted updates next_address in every
> ++         iteration.  In backward case, we store the start address here
> ++         and update next_address with it before exiting the function.  */
> ++      addr_rewound = (format == 's'
> ++                      ? next_address - TYPE_LENGTH (val_type)
> ++                      : next_address);
> ++      need_to_update_next_address = 1;
> ++    }
> ++
> +   /* Print as many objects as specified in COUNT, at most maxelts per line,
> +      with the address of the next one at the start of each line.  */
> +
> ++  if (is_backward)
> ++    count++;
> +   while (count > 0)
> +     {
> +       QUIT;
> +@@ -923,6 +1192,9 @@ do_examine (struct format_data fmt, struct gdbarch *gdbarch, CORE_ADDR addr)
> +       printf_filtered ("\n");
> +       gdb_flush (gdb_stdout);
> +     }
> ++
> ++  if (need_to_update_next_address)
> ++    next_address = addr_rewound;
> + }
> +
> + static void
> +@@ -2535,7 +2807,8 @@ Format letters are o(octal), x(hex), d(decimal), u(unsigned decimal),\n\
> +   t(binary), f(float), a(address), i(instruction), c(char) and s(string).\n\
> + Size letters are b(byte), h(halfword), w(word), g(giant, 8 bytes).\n\
> + The specified number of objects of the specified size are printed\n\
> +-according to the format.\n\n\
> ++according to the format.  If a negative number is specified, memory is\n\
> ++examined backward from the address.\n\n\
> + Defaults for format and size letters are those previously used.\n\
> + Default count is 1.  Default address is following last thing printed\n\
> + with this command or \"print\"."));
> diff --git a/help.c b/help.c
> index 581e616..4d028e1 100644
> --- a/help.c
> +++ b/help.c
> @@ -7278,8 +7278,8 @@ char *help_dis[] = {
>  "         count  the number of instructions to be disassembled (default is 1).",
>  "                If no count argument is entered, and the starting address",
>  "                is entered as a text symbol, then the whole routine will be",
> -"                disassembled.  The count argument is ignored when used with",
> -"                the -r option.",
> +"                disassembled.  The count argument is supported when used with",
> +"                the -r and -f option.",
>  "\nEXAMPLES",
>  "  Disassemble the sys_signal() routine without, and then with, line numbers:\n",
>  "    %s> dis sys_signal",
> diff --git a/kernel.c b/kernel.c
> index f01dc2e..e1f0b7e 100644
> --- a/kernel.c
> +++ b/kernel.c
> @@ -1931,16 +1931,10 @@ cmd_dis(void)
>                  }
>  
>                  if (args[++optind]) {
> -			if (reverse || forward) {
> -				error(INFO,
> -			            "count argument ignored with -%s option\n",
> -				    	reverse ? "r" : "f");
> -			} else {
> -                        	req->count = stol(args[optind],
> +			req->count = stol(args[optind],
>  					FAULT_ON_ERROR, NULL);
> -				req->flags &= ~GNU_FUNCTION_ONLY;
> -				count_entered++;
> -			}
> +			req->flags &= ~GNU_FUNCTION_ONLY;
> +			count_entered++;
>  		}
>  
>  		if (sources) {
> @@ -1992,6 +1986,10 @@ cmd_dis(void)
>  			}
>  		}
>  
> +		if (reverse || forward)
> +			if (count_entered && req->count == 1)
> +				reverse = forward = 0;
> +
>  		if (reverse || forward) {
>  			target = req->addr;
>  			if ((sp = value_search(target, NULL)) == NULL)
> @@ -2006,14 +2004,19 @@ cmd_dis(void)
>  		do_machdep_filter = machdep->dis_filter(req->addr, NULL, radix);
>  		open_tmpfile();
>  
> -		if (reverse)
> -			sprintf(buf5, "x/%ldi 0x%lx",
> -				(target - req->addr) ? target - req->addr : 1,
> -				req->addr);
> -		else
> +		if (reverse || forward) {
> +			if (count_entered && req->count)
> +				sprintf(buf5, "x/%s%ldi 0x%lx", reverse ? "-" : "",
> +					req->count, target);
> +			else
> +				sprintf(buf5, "x/%ldi 0x%lx",
> +					forward ?  req->addr2 - req->addr :
> +					(target - req->addr) ? target - req->addr : 1,
> +					forward ? target : req->addr);
> +		} else
>  			sprintf(buf5, "x/%ldi 0x%lx",
>  				count_entered && req->count ? req->count :
> -				forward || req->flags & GNU_FUNCTION_ONLY ?
> +				req->flags & GNU_FUNCTION_ONLY ?
>  				req->addr2 - req->addr : 1,
>  				req->addr);
>  		gdb_pass_through(buf5, NULL, GNU_RETURN_ON_ERROR);
> --
> 2.20.1
> 