[Crash-utility] [RFC/PATCH] s390x: Add live dump detection

Dave Anderson anderson at redhat.com
Thu Apr 19 18:44:43 UTC 2012



----- Original Message -----
> Hi Dave,
> 
> On s390 we will have a dump method that creates live dumps, similar to
> the snap.so crash plugin. Because Linux is not stopped while the dump
> is created, the resulting dump is not consistent. Therefore it is
> important that the crash tool informs the user about this issue.
> 
> The dump tool writes a magic number (ASCII "LIVEDUMP") into the first 8
> bytes of the dump memory. With this patch this is checked in POST_INIT
> by the s390x backend crash code. If the magic is found, the LIVE_SYSTEM
> flag is set. This ensures that commands that do not work with /dev/mem
> also will fail with s390x live dumps.
> 
> Example:
> 
> crash> bt -a
> bt: -a option not supported on a live system
> 
> In addition to that with this patch crash prints a "[LIVE DUMP]" info
> tag for live dump files at startup (similar to [PARTIAL DUMP]):
> 
> $ crash livedump vmlinux
>       KERNEL: /boot/vmlinux
>     DUMPFILE: dump.s390 [LIVE DUMP]
> 
> Michael

Interesting -- I'm amazed that even doing such a thing works!

Anyway, if you look at all the places where ACTIVE() is called, 
this patch, at a minimum, would be kind of inefficient.  

For example, consider that -- after each command -- the complete
task table would get re-initialized (as opposed to doing a single
invocation-time initialization from a dumpfile).  That's not a big
deal when reading RAM, but it could be costly when having to
read from the dumpfile each time.  And there are numerous other places
where a similar tack is taken to avoid unnecessary dumpfile accesses
if the data can be cached.

I would have no problem with adding a new LIVE_DUMP flag to
pc->flags2, and just checking it in display_sys_stats() and
non_matching_kernel() as you've done below.  In dealing with
dumpfiles generated from snap.so, the "bt" command is pretty
much the only command that probably should be restricted.
However, I don't restrict "bt" with snap.so vmcores because
currently there's no magic/signature/whatever that indicates
what kind of dump it is.  But if you implement a new LIVE_DUMP
flag, I might do it there as well so we've got some consistency.
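
As a rough illustration of that suggestion (not the actual crash code), the
sketch below keeps LIVE_SYSTEM clear for a live dump and records the
condition in a separate flags2 bit.  The struct, the bit values and both
helper functions are hypothetical stand-ins; the real definitions in defs.h
differ.

```c
#include <stdio.h>

/* Hypothetical bit values for illustration only, not those in defs.h. */
#define LIVE_SYSTEM (0x1UL)	/* flags:  memory source is a running kernel */
#define LIVE_DUMP   (0x1UL)	/* flags2: dumpfile was taken from a running kernel */

/* Hypothetical, cut-down stand-in for crash's program context. */
struct program_context_sketch {
	unsigned long flags;
	unsigned long flags2;
	const char *dumpfile;
};

/*
 * Build the DUMPFILE banner line.  A live dump is still treated as a
 * dumpfile (so cached data stays valid between commands); only the
 * "[LIVE DUMP]" tag is appended.
 */
static int format_dumpfile_line(const struct program_context_sketch *pc,
				char *buf, size_t len)
{
	return snprintf(buf, len, "    DUMPFILE: %s%s", pc->dumpfile,
			(pc->flags2 & LIVE_DUMP) ? " [LIVE DUMP]" : "");
}

/* "bt -a" is refused on a live system and, in this sketch, on a live dump. */
static int bt_all_allowed(const struct program_context_sketch *pc)
{
	return !(pc->flags & LIVE_SYSTEM) && !(pc->flags2 & LIVE_DUMP);
}
```

With this split, code that tests only LIVE_SYSTEM (as ACTIVE() does) keeps
treating the file as an ordinary dumpfile, and only the places that
explicitly test LIVE_DUMP change behavior.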

What do you think about that?

Dave

> ---
>  kernel.c |    8 ++++++--
>  s390x.c  |   12 ++++++++++++
>  2 files changed, 18 insertions(+), 2 deletions(-)
> 
> --- a/kernel.c
> +++ b/kernel.c
> @@ -975,7 +975,9 @@ non_matching_kernel(void)
>  	else
>          	fprintf(fp, "    DUMPFILE: ");
>          if (ACTIVE()) {
> -                if (REMOTE_ACTIVE())
> +		if (pc->dumpfile)
> +			fprintf(fp, "%s [LIVE DUMP]\n", pc->dumpfile);
> +		else if (REMOTE_ACTIVE())
>                          fprintf(fp, "%s@%s  (remote live system)\n",
>                                  pc->server_memsrc, pc->server);
>                  else
> @@ -4080,7 +4082,9 @@ display_sys_stats(void)
>  	else
>  		fprintf(fp, "    DUMPFILE: ");
>          if (ACTIVE()) {
> -		if (REMOTE_ACTIVE())
> +		if (pc->dumpfile)
> +			fprintf(fp, "%s [LIVE DUMP]\n", pc->dumpfile);
> +		else if (REMOTE_ACTIVE())
>  			fprintf(fp, "%s@%s  (remote live system)\n",
>  			    	pc->server_memsrc, pc->server);
>  		else
> --- a/s390x.c
> +++ b/s390x.c
> @@ -328,6 +328,17 @@ static void s390x_process_elf_notes(void
>  	}
>  }
>  
> +static void s390x_check_live(void)
> +{
> +	unsigned long long live_magic;
> +
> +	readmem(0, KVADDR, &live_magic, sizeof(live_magic), "live_magic",
> +		RETURN_ON_ERROR | QUIET);
> +
> +	if (live_magic == 0x4c49564544554d50ULL)
> +		pc->flags |= LIVE_SYSTEM;
> +}
> +
>  /*
>   *  Do all necessary machine-specific setup here.  This is called several
>   *  times during initialization.
> @@ -402,6 +413,7 @@ s390x_init(int when)
>  		break;
>  
>  	case POST_INIT:
> +		s390x_check_live();
>  		break;
>  	}
>  }
> 
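
For reference, the constant 0x4c49564544554d50 in the patch is the ASCII
string "LIVEDUMP" read as a big-endian 64-bit value, which matches the byte
order of s390x.  The standalone sketch below shows that relationship with a
byte-order-independent comparison; load_be64() and is_live_dump_magic() are
hypothetical helpers, not crash functions.  Note also that live_magic in the
patch is not initialized, so if readmem() returned an error the comparison
would presumably use an indeterminate value.

```c
#include <stdint.h>

/* ASCII "LIVEDUMP" interpreted as a big-endian 64-bit integer. */
#define LIVE_MAGIC_BE 0x4c49564544554d50ULL

/* Assemble 8 bytes into a 64-bit value, most significant byte first. */
static uint64_t load_be64(const unsigned char *p)
{
	uint64_t v = 0;
	int i;

	for (i = 0; i < 8; i++)
		v = (v << 8) | p[i];
	return v;
}

/* Return nonzero if the first 8 bytes of dump memory hold the magic. */
static int is_live_dump_magic(const unsigned char *first8)
{
	return load_be64(first8) == LIVE_MAGIC_BE;
}
```

Because the bytes are assembled explicitly, the check gives the same answer
regardless of the byte order of the host running it.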



