[Crash-utility] [PATCH v3 1/2] Update gdb to 10.1
HAGIO KAZUHITO(萩尾 一仁)
k-hagio-ab at nec.com
Fri Feb 19 02:45:50 UTC 2021
-----Original Message-----
> Hi Alexey,
>
> sorry for the late reply, I needed to learn the crash-gdb interaction and
> how to review such a huge patch as this. We appreciate your patience and
> understanding.
>
> I've almost read the patch, on the whole, it is absolutely excellent!
> I think it's going a right way.
>
> There are several comments including slight things:
>
> - Is it possible to separate the fixes only in crash (outside gdb-10.1.patch)
> and old patches removal from this 1/2 patch? i.e.
> - fix for "'bt' command often emits reduced output"
> - fix for "lack of information about struct members" and "unionprint"
> - removal of gdb-7.6.patch and gdb-7.6-proc_service.h.patch
> This would help us understand what issue a bunch of changes fixes, and
> we can read it easier. I know it's better not to split the gdb-10.1.patch.
>
>
> - Build error on architectures except for x86_64:
>
> /usr/bin/ld: ../../crashlib.a(gdb_interface.o): in function `crash_get_nr_cpus':
> /home/travis/build/k-hagio/crash/gdb_interface.c:1074: undefined reference to `sadump_get_nr_cpus'
> /usr/bin/ld: /home/travis/build/k-hagio/crash/gdb_interface.c:1076: undefined reference to
> `diskdump_get_nr_cpus'
> /usr/bin/ld: /home/travis/build/k-hagio/crash/gdb_interface.c:1078: undefined reference to
> `kdump_get_nr_cpus'
> collect2: error: ld returned 1 exit status
> make[4]: *** [Makefile:1872: gdb] Error 1
> make[3]: *** [Makefile:10072: all-gdb] Error 2
> make[2]: *** [Makefile:860: all] Error 2
> crash build failed
> make[1]: *** [Makefile:239: gdb_merge] Error 1
> make: *** [Makefile:314: warn] Error 2
> The command "make warn" exited with 2.
>
> ref. https://travis-ci.org/github/k-hagio/crash/builds/759444845
sorry, this is wrong URL, the correct one is:
https://travis-ci.org/github/k-hagio/crash/builds/759421444
The former is the results with the 2/2 patch.
>
>
> - "make target=x86" on an x86_64 machine also fails with additional errors:
>
> $ make target=x86
> ...
> ar: creating crashlib.a
> CXXLD gdb
> /usr/bin/ld: skipping incompatible ./../zlib/libz.a when searching for -lz
> /usr/bin/ld: skipping incompatible ./../zlib/libz.a when searching for -lz
> /usr/bin/ld: i386 architecture of input file `../../crashlib.a(main.o)' is incompatible with i386:x86-64
> output
> /usr/bin/ld: i386 architecture of input file `../../crashlib.a(tools.o)' is incompatible with i386:x86-64
> output
> /usr/bin/ld: i386 architecture of input file `../../crashlib.a(global_data.o)' is incompatible with
> i386:x86-64 output
> ...
> ../../crashlib.a(tools.o): In function `eval_common':
> /home/k-hagio/crash/x86-gdb10/tools.c:3012: undefined reference to `__udivdi3'
> /home/k-hagio/crash/x86-gdb10/tools.c:3015: undefined reference to `__umoddi3'
> ...
> decNumber.c:(.text+0x79a7): undefined reference to `__udivdi3'
> collect2: error: ld returned 1 exit status
> make[3]: *** [gdb] Error 1
> make[2]: *** [rebuild] Error 2
> make[1]: *** [gdb_merge] Error 2
> make: *** [all] Error 2
>
>
> - "p" does not print the address of "linux_banner" for some vmcores
> (relatively old kernels? like RHEL7).
>
> --- crash-master.log
> +++ crash-gdb10.1.log
>
> crash> p linux_banner
> -linux_banner = $1 = 0xffffffff816bc100 <linux_banner> "Linux version ...
> +linux_banner = $1 = "Linux version ...
hmm, this looks gdb's behavior. Maybe no need to match it with the current
output by force.
Thanks,
Kazu
>
>
> - "whatis" options print reduced/duplicated results:
>
> crash> whatis -r 512
> SIZE TYPE
> - 512 _legacy_mbr <<-- dropped
> 512 i387_fxsave_struct
> 512 netns_ipv4
> - 512 sgi_disklabel
> 512 user_i387_struct
>
> crash> whatis -m mm_struct
> SIZE TYPE
> 16 tlb_state
> - 24 flush_tlb_info <<-- dropped
> - 24 ftrace_raw_xen_mmu_pgd
> 24 futex_key
> ...
> 216 vm_area_struct
> 256 linux_binprm
> 2752 rq
> +2752 rq <<-- duplicated
> +2752 rq
> +2752 rq
> +2752 rq
> 4048 task_struct
> -8296 numa_maps_private
>
>
> > +--- gdb-10.1/gdb/main.c.orig
> > ++++ gdb-10.1/gdb/main.c
> > +@@ -929,8 +944,12 @@ captured_main_1 (struct captured_main_args
> > + catch_command_errors returns non-zero on success! */
> > + if (catch_command_errors (exec_file_attach, execarg,
> > + !batch_flag, RETURN_MASK_ALL))
> > ++#ifdef CRASH_MERGE
> > ++ catch_command_errors (symbol_file_add_main, symarg, 0, RETURN_MASK_ALL);
> > ++#else
> > + catch_command_errors (symbol_file_add_main, symarg,
> > + !batch_flag, RETURN_MASK_ALL);
> > ++#endif
> > + }
> > + else
> > + {
>
> - This is a dropped hunk, but without this, "crash -s" also prints the
> following message:
>
> # crash -s
> Reading symbols from /usr/lib/debug/lib/modules/3.10.0-1127.el7.x86_64/vmlinux...
> crash>
>
> I'm ok with this message without the "-s" option, but it would be preferable
> to print nothing with the option if there is no warning.
>
>
> > +@@ -992,8 +1011,12 @@ captured_main (void *data)
> > + {
> > + auto_load_local_gdbinit_loaded = 1;
> > +
> > ++#ifdef CRASH_MERGE
> > ++ catch_command_errors (source_script, local_gdbinit, -1, RETURN_MASK_ALL);
> > ++#else
> > + catch_command_errors (source_script, local_gdbinit, 0,
> > + RETURN_MASK_ALL);
> > ++#endif
> > + }
> > + }
> > +
> > +@@ -1039,6 +1062,12 @@ captured_main (void *data)
> > + while (1)
> > + {
> > + catch_errors (captured_command_loop, 0, "", RETURN_MASK_ALL);
> > ++#ifdef CRASH_MERGE
> > ++ {
> > ++ int console(char *, ...);
> > ++ console("<CAPTURED_MAIN WHILE LOOP>\n");
> > ++ }
> > ++#endif
> > + }
> > + /* No exit -- exit is through quit_command. */
> > + }
>
> - Why were the two hunks dropped? Is it possible not to drop?
>
>
> > ++static void
> > ++gdb_delete_symbol_file(struct gnu_request *req)
> > ++{
> > ++ for (objfile *objfile : current_program_space->objfiles ()) {
> > ++ if (STREQ(objfile_name(objfile), req->name) ||
> > ++ same_file((char *)objfile_name(objfile), req->name)) {
> > ++ break;
> > ++ }
> > ++ }
>
> - This does not delete the symbol file, so the symbols remain even after
> "mod -d" command.
>
>
> > ++static void
> > ++dump_enum(struct type *type, struct gnu_request *req)
> > ++{
> > ++ register int i;
> > ++ int len;
> > ++ int lastval;
>
> - The "lastval" variable should be "long long"?
>
>
> > #define DUMP_EMPTY_FILE 0x8
> > #define DUMP_FILE_NRPAGES 0x10
> > -#endif /* !GDB_COMMON */
> > int same_file(char *, char *);
> > +#endif /* !GDB_COMMON */
> > #ifndef GDB_COMMON
> > int cleanup_memory_driver(void);
>
> - We can remove this #endif and #ifndef?
>
>
> > ++#ifdef CRASH_MERGE
> > ++extern "C" int gdb_main_entry(int, char **);
> > ++extern void replace_ui_file_FILE(struct ui_file *, FILE *);
>
> - We don't have the replace_ui_file_FILE() any more?
>
>
> > +int crash_get_nr_cpus(void)
> > +{
> > + if (SADUMP_DUMPFILE())
> > + return sadump_get_nr_cpus();
> > + else if (DISKDUMP_DUMPFILE())
> > + return diskdump_get_nr_cpus();
> > + else if (KDUMP_DUMPFILE())
> > + return kdump_get_nr_cpus();
> > + else if (VMSS_DUMPFILE())
> > + return vmware_vmss_get_nr_cpus();
>
> - Seems diskdump_get_nr_cpus() and kdump_get_nr_cpus() works only with
> QEMU memory dumps and return 0 for normal vmcores. This causes crash
> to fail with the 2/2 patch?
>
>
> - The gdb-10.1.patch does not have a shell script at the head of it, once
> it's modified, "make" prints "gdb-10.1.patch: line 2: ---: command not found"
> and so on.
>
> $ make warn
> TARGET: X86_64
> CRASH: 7.2.9++
> GDB: 10.1
>
> + diff -aurp -X diff_exclude gdb-10.1.orig/gdb/cli/cli-cmds.c gdb-10.1/gdb/cli/cli-cmds.c
> diff: diff_exclude: No such file or directory
> + --- gdb-10.1.orig/gdb/cli/cli-cmds.c 2020-10-23 21:23:02.000000000 -0700
> gdb-10.1.patch: line 2: ---: command not found
> + +++ gdb-10.1/gdb/cli/cli-cmds.c 2020-11-10 13:06:56.423569114 -0800
> gdb-10.1.patch: line 3: +++: command not found
> gdb-10.1.patch: line 4: syntax error near unexpected token `('
> gdb-10.1.patch: line 4: `@@ -435,6 +435,11 @@ complete_command (const char *arg, int f'
> patching file gdb-10.1/gdb/cli/cli-cmds.c
> Reversed (or previously applied) patch detected! Skipping patch.
> 4 out of 4 hunks ignored
> ...
>
>
> - Compilation warnings.
>
> symtab.c: In function ‘void gdb_get_line_number(gnu_request*)’:
> symtab.c:7073:17: warning: variable ‘sym’ set but not used [-Wunused-but-set-variable]
> struct symbol *sym;
> ^
> CXX gcore.o
> symtab.c: In function ‘void gdb_get_datatype(gnu_request*)’:
> symtab.c:7137:51: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
> console("gdb_get_datatype [%s] (a)\n", req->name);
> ^
> CXX gdb-demangle.o
> symtab.c:7163:51: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
> console("gdb_get_datatype [%s] (b)\n", req->name);
> ^
> symtab.c:7172:57: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
> console("expr->elts[0].opcode: OP_VAR_VALUE\n");
> ^
> CXX gdb_bfd.o
> symtab.c:7191:52: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
> console("expr->elts[0].opcode: OP_TYPE\n");
> ^
> symtab.c:7224:31: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
> expr.get()->elts[0].opcode);
> ^
> symtab.c: In function ‘void eval_enum(type*, gnu_request*)’:
> symtab.c:7285:17: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
> req->tagname = "(unknown)";
> ^
> symtab.c: In function ‘void get_member_data(gnu_request*, type*, long int, int)’:
> symtab.c:7315:42: warning: deprecated conversion from string constant to ‘char*’ [-Wwrite-strings]
> req->name, req->member, type, newtype);
> ^
> symtab.c: In function ‘void gdb_command_exists(gnu_request*)’:
> symtab.c:7355:43: warning: variable ‘c’ set but not used [-Wunused-but-set-variable]
> register struct cmd_list_element *c;
>
> Thanks,
> Kazu
>
> -----Original Message-----
> > Fully redone gdb-7.6.patch to gdb-10.1.patch to keep all
> > functionality. Changes which were dropped are saved in
> > dropped-gdb-7.6-to-10.1.patch
> >
> > Main difference between gdb-7.6 and gdb-10.1 is the last
> > one was rewritten in C++.
> > I continue to keep crash code in C. Mark transition
> > functions as extern "C" to resolve linking issues.
> >
> > Eliminated error_hook() and SJLJ while running in C++ code
> > (after gdb_command_funnel()) use try-catch mechanism instead.
> >
> > request_types() was redone to do not call
> > GNU_GET_NEXT_DATATYPE multiple times but single usage of
> > GNU_ITERATE_DATATYPES with proper callback instead.
> > Complete iteration happens on C++ side now.
> > Removed "struct global_iterator" from request structure,
> > but added several fields (including callback pointer) to
> > be able to perform iteration on C++ side.
> >
> > Type of "linux_banner" symbol is reported as 'D' by new
> > gdb as its section ".rodata" marked as writable in vmlinux.
> >
> > BFD API has changed.
> >
> > deprecated_command_loop_hook got deprecated. So, call crash
> > main_loop() directly from gdb captured_main().
> >
> > Added symbol file (vmlinux) rebase in gdb by kaslr_offset.
> > by using new function: objfile_rebase().
> > As result, we do not need kernel symbol patching as well as
> > bait_and_switch hook anymore.
> >
> > Added crash_target for gdb to provide target operations
> > such as xfer_partial to read and write crash dump memory.
> > Removed previously used hooks for that in target.c.
> > Keep crash_target.c as a file in crash folder instead of
> > in gdb-10.1.patch for easier development and history
> > tracking.
> > crash_target can be enhanced in future to provide access
> > to CPU registers, so backtrace and frame related commands
> > from gdb can be used.
> >
> > Removed gdb-7.6-proc_service.h.patch is not required as
> > gdb-10.1 already has this change.
> >
> > Extra: add VMware copyright to the version info.
> >
> > TODO:
> > 1) gdb-10.1-ppc64le-support.patch has to be updated with
> > following commits.
> > 2) deprecate #if defined(GDB_X_Y) code as crash really
> > supports only the latest gdb (only one patch).
> > 3) move gdb_funnel_command() and subfunctions to separate
> > file, similar to crash_target.c
> > 4) remove legacy kernel patching and bait_and_switch hook.
> >
> > Signed-off-by: Alexey Makhalov <amakhalov at vmware.com>
> > ---
> > Makefile | 11 +-
> > configure.c | 20 +-
> > crash_target.c | 104 +
> > defs.h | 35 +-
> > dropped-gdb-7.6-to-10.1.patch | 303 +++
> > ...support.patch => gdb-10.1-ppc64le-support.patch | 0
> > gdb-10.1.patch | 1577 ++++++++++++
> > gdb-7.6-proc_service.h.patch | 67 -
> > gdb-7.6.patch | 2503 --------------------
> > gdb_interface.c | 85 +-
> > help.c | 1 +
> > kernel.c | 2 +-
> > main.c | 1 -
> > symbols.c | 125 +-
> > x86_64.c | 14 +-
> > 15 files changed, 2141 insertions(+), 2707 deletions(-)
> > create mode 100644 crash_target.c
> > create mode 100644 dropped-gdb-7.6-to-10.1.patch
> > rename gdb-7.6-ppc64le-support.patch => gdb-10.1-ppc64le-support.patch (100%)
> > create mode 100644 gdb-10.1.patch
> > delete mode 100644 gdb-7.6-proc_service.h.patch
> > delete mode 100644 gdb-7.6.patch
> >
>
>
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://listman.redhat.com/mailman/listinfo/crash-utility
More information about the Crash-utility
mailing list