[Crash-utility] [PATCH v2] Show module taint flags

Dave Anderson anderson at redhat.com
Fri Apr 5 16:11:19 UTC 2013



----- Original Message -----
> 
> > 
> > On RHEL5 Xen kernel, it shows multiple garbage graphic characters, and I
> > cannot even transpose it, because it must have a CR, LF, or something else
> > embedded in the garbarge.
> > 
> > If you want to re-post, fix the module list truncation issue, try to do
> > something to handle kernel versions that do not have the same #define's
> > as what you're hardwiring, and perhaps make it into a "mod -t" option
> > that displays just the module name and its TAINT flag.
> > 
> > Dave
> > 
> 
> Hi Dave,
> 
> Firstly, sorry for the delay.
> Here's another attempt.
> 
> An example:
> 
> crash> linux_banner
> linux_banner = $3 = "Linux version 2.6.32-220.28.1.el6.x86_64
> (mockbuild at x86-022.build.eng.bos.redhat.com) (gcc version 4.4.6 20110731
> (Red Hat 4.4.6-3) (GCC) ) #1 SMP Wed Oct 3 12:26:28 EDT 2012\n"
> 
> crash> mod -T
> NAME                 TAINT
> vxspec               (P)(U)
> dmpaa                (P)(U)
> emcpgpx              (P)(U)
> dmpap                (P)(U)
> dmpjbod              (P)(U)
> emcp                 (P)(U)
> emcpmpx              (P)(U)
> emcpdm               (P)(U)
> emcpxcrypt           (P)(U)
> emcpvlumd            (P)(U)
> vxfs                 (P)(U)
> fdd                  (P)(U)
> vxportal             (P)(U)
> vxdmp                (P)(U)
> vxio                 (P)(U)
> llt                  (P)(U)
> gab                  (P)(U)
> vxfen                (P)(U)
> vxglm                (P)(U)
> vxgms                (P)(U)
> vxodm                (P)(U)
> 
> crash> mod -t vxfs
> NAME                 TAINT
> vxfs                 (P)(U)


Hi Aaron,

Before getting into my somewhat long-winded review, I do want to
emphasize that I think this is a valuable option for kernel debugging,
and it absolutely warrants being folded into the "mod" command.

The v2 patch tested OK on a wide variety of kernel versions without
running into any of the garbage display issues I saw with the first
version.

Here are my issues/comments with the v2 patch.  Let's get the simple 
things out of the way first.

At some point during your development, run the build through
"make warn", and address any issues there:

  $ make warn
  ...
  cc -c -g -DX86_64  -DGDB_7_3_1  kernel.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector 
  kernel.c:3512:1: warning: no previous prototype for 'is_module_taint' [-Wmissing-prototypes]
  kernel.c:3577:1: warning: no previous prototype for 'show_module_taint' [-Wmissing-prototypes]
  kernel.c: In function 'show_module_taint':
  kernel.c:3579:9: warning: unused variable 'bx' [-Wunused-variable]
  ...

I wouldn't bother with the "mod -t <module>" vs. "mod -T".
The number of modules that are tainted would not be so onerous
that it warrants a per-module option.  (Just pipe it to grep if
you *really* feel it's necessary.)  So I would just make one simple
"mod -t" option that checks all the modules and displays those with
taint flags.
  
If no modules are tainted, then indicate that -- instead of just this:

  crash> mod -T
  NAME                   TAINT
  crash>

You could change it to only show the header if there are actually tainted
modules, and if there are none, then show something like "no tainted modules".

And the "help mod" page would obviously need documentation...
  
Now to the meat of the matter, as I mentioned when reviewing the 
first patch, I really prefer that hardwiring of kernel #define's 
be avoided if at all possible.  Your patch seems to pick a
subset of the possible flags:

   /*
  +* Module taint flags
  +*/
  +
  +#define TAINT_PROPRIETARY_MODULE        0
  +#define TAINT_FORCED_MODULE             1
  +#define TAINT_UNSIGNED_MODULE           6
  +#define TAINT_CRAP                     10
  +#define TAINT_OOT_MODULE               12
  +#define TAINT_TECH_PREVIEW             29
  +

And looking at the kernel history, say, starting at upstream 2.6.18:
  
  #define TAINT_PROPRIETARY_MODULE        (1<<0)
  #define TAINT_FORCED_MODULE             (1<<1)
  #define TAINT_UNSAFE_SMP                (1<<2)
  #define TAINT_FORCED_RMMOD              (1<<3)
  #define TAINT_MACHINE_CHECK             (1<<4)
  #define TAINT_BAD_PAGE                  (1<<5)

where RHEL5 expounded upon the base 2.6.18 kernel like this:
  
  #define TAINT_PROPRIETARY_MODULE	(1<<0)
  #define TAINT_FORCED_MODULE		(1<<1)
  #define TAINT_UNSAFE_SMP		(1<<2)
  #define TAINT_FORCED_RMMOD		(1<<3)
  #define TAINT_MACHINE_CHECK		(1<<4)
  #define TAINT_BAD_PAGE			(1<<5)
  #define TAINT_UNSIGNED_MODULE		(1<<6)
  #define TAINT_7 			(1<<7)
  #define TAINT_8 			(1<<8)
  #define TAINT_9 			(1<<9)
  #define TAINT_10			(1<<10)
  #define TAINT_11			(1<<11)
  #define TAINT_12			(1<<12)
  #define TAINT_13			(1<<13)
  #define TAINT_14			(1<<14)
  #define TAINT_15			(1<<15)
  #define TAINT_16			(1<<16)
  #define TAINT_17			(1<<17)
  #define TAINT_18			(1<<18)
  #define TAINT_19			(1<<19)
  #define TAINT_20			(1<<20)
  #define TAINT_21			(1<<21)
  #define TAINT_22			(1<<22)
  #define TAINT_23			(1<<23)
  #define TAINT_24			(1<<24)
  #define TAINT_25			(1<<25)
  #define TAINT_26			(1<<26)
  #define TAINT_27			(1<<27)
  /* Reserving bits for vendor specific uses */
  #define TAINT_HARDWARE_UNSUPPORTED	(1<<28)
  #define TAINT_TECH_PREVIEW		(1<<29)
  #define TAINT_RESERVED30		(1<<30)
  #define TAINT_RESERVED31		(1<<31)
  
Then if we look at upstream 2.6.32, a few more bits
were added:
  
  #define TAINT_PROPRIETARY_MODULE        0
  #define TAINT_FORCED_MODULE             1
  #define TAINT_UNSAFE_SMP                2
  #define TAINT_FORCED_RMMOD              3
  #define TAINT_MACHINE_CHECK             4
  #define TAINT_BAD_PAGE                  5
  #define TAINT_USER                      6
  #define TAINT_DIE                       7
  #define TAINT_OVERRIDDEN_ACPI_TABLE     8
  #define TAINT_WARN                      9
  #define TAINT_CRAP                      10
  
But RHEL6 (2.6.32-based) went a bit further:
  
  #define TAINT_PROPRIETARY_MODULE	0
  #define TAINT_FORCED_MODULE		1
  #define TAINT_UNSAFE_SMP		2
  #define TAINT_FORCED_RMMOD		3
  #define TAINT_MACHINE_CHECK		4
  #define TAINT_BAD_PAGE			5
  #define TAINT_USER			6
  #define TAINT_DIE			7
  #define TAINT_OVERRIDDEN_ACPI_TABLE	8
  #define TAINT_WARN			9
  #define TAINT_CRAP			10
  #define TAINT_FIRMWARE_WORKAROUND	11
  #define TAINT_12			12
  #define TAINT_13			13
  #define TAINT_14			14
  #define TAINT_15			15
  #define TAINT_16			16
  #define TAINT_17			17
  #define TAINT_18			18
  #define TAINT_19			19
  #define TAINT_20			20
  #define TAINT_21			21
  #define TAINT_22			22
  #define TAINT_23			23
  #define TAINT_24			24
  #define TAINT_25			25
  #define TAINT_26			26
  #define TAINT_BIT_BY_ZOMBIE		27
  /* Reserving bits for vendor specific uses */
  #define TAINT_HARDWARE_UNSUPPORTED	28
  #define TAINT_TECH_PREVIEW		29
  /* Bits 30 - 31 are reserved for Red Hat use only */
  #define TAINT_RESERVED30		30
  #define TAINT_RESERVED31		31
   
And here's the state in the current upstream linux git tree:
  
  #define TAINT_PROPRIETARY_MODULE        0
  #define TAINT_FORCED_MODULE             1
  #define TAINT_UNSAFE_SMP                2
  #define TAINT_FORCED_RMMOD              3
  #define TAINT_MACHINE_CHECK             4
  #define TAINT_BAD_PAGE                  5
  #define TAINT_USER                      6
  #define TAINT_DIE                       7
  #define TAINT_OVERRIDDEN_ACPI_TABLE     8
  #define TAINT_WARN                      9
  #define TAINT_CRAP                      10
  #define TAINT_FIRMWARE_WORKAROUND       11
  #define TAINT_OOT_MODULE                12
  
So how would be the best way to deal with handling the other bits 
that you are not checking? 

When I reviewed your first post, I didn't notice that the
solution -- at least for 2.6.28 and later kernels -- is pretty
much implemented in the kernel by the introduction of the
kernel's "tnts" structure and the print_tainted() function:

  struct tnt {
          u8      bit;
          char    true;
          char    false;
  };
  
  static const struct tnt tnts[] = {
          { TAINT_PROPRIETARY_MODULE,     'P', 'G' },
          { TAINT_FORCED_MODULE,          'F', ' ' },
          { TAINT_UNSAFE_SMP,             'S', ' ' },
          { TAINT_FORCED_RMMOD,           'R', ' ' },
          { TAINT_MACHINE_CHECK,          'M', ' ' },
          { TAINT_BAD_PAGE,               'B', ' ' },
          { TAINT_USER,                   'U', ' ' },
          { TAINT_DIE,                    'D', ' ' },
          { TAINT_OVERRIDDEN_ACPI_TABLE,  'A', ' ' },
          { TAINT_WARN,                   'W', ' ' },
          { TAINT_CRAP,                   'C', ' ' },
          { TAINT_FIRMWARE_WORKAROUND,    'I', ' ' },
          { TAINT_OOT_MODULE,             'O', ' ' },
  };
  
Why not just read the "tnts[]" array and then use the same
logic as the kernel's print_tainted() function?  Doing that 
would even handle the vendor-specific issue, where for example,
RHEL6 has this version of the tnts array:

  static const struct tnt tnts[] = {
          { TAINT_PROPRIETARY_MODULE,     'P', 'G' },
          { TAINT_FORCED_MODULE,          'F', ' ' },
          { TAINT_UNSAFE_SMP,             'S', ' ' },
          { TAINT_FORCED_RMMOD,           'R', ' ' },
          { TAINT_MACHINE_CHECK,          'M', ' ' },
          { TAINT_BAD_PAGE,               'B', ' ' },
          { TAINT_USER,                   'U', ' ' },
          { TAINT_DIE,                    'D', ' ' },
          { TAINT_OVERRIDDEN_ACPI_TABLE,  'A', ' ' },
          { TAINT_WARN,                   'W', ' ' },
          { TAINT_CRAP,                   'C', ' ' },
          { TAINT_FIRMWARE_WORKAROUND,    'I', ' ' },
          { TAINT_12,                     '?', '-' },
          { TAINT_13,                     '?', '-' },
          { TAINT_14,                     '?', '-' },
          { TAINT_15,                     '?', '-' },
          { TAINT_16,                     '?', '-' },
          { TAINT_17,                     '?', '-' },
          { TAINT_18,                     '?', '-' },
          { TAINT_19,                     '?', '-' },
          { TAINT_20,                     '?', '-' },
          { TAINT_21,                     '?', '-' },
          { TAINT_22,                     '?', '-' },
          { TAINT_23,                     '?', '-' },
          { TAINT_24,                     '?', '-' },
          { TAINT_25,                     '?', '-' },
          { TAINT_26,                     '?', '-' },
          { TAINT_BIT_BY_ZOMBIE,          'Z', ' ' },
          { TAINT_HARDWARE_UNSUPPORTED,   'H', ' ' },
          { TAINT_TECH_PREVIEW,           'T', ' ' },
  };

For kernels earlier than 2.6.28, you could still hardwire
the values that were valid at that point in time, or you
could use the subset of bits that you've chosen.  But for
bits that are set that are *not* part of your chosen
subset (if you do it that way), you should probably put
the bit number in brackets, and then the user could refer
to the relevant kernel source for details on what the bit
number means.  

But from 2.6.28 on, please use the kernel data structure
contents.  

And lastly, what about the transition from module.gpgsig_ok to 
module.sig_ok?  And BTW, if the relevant "ok" bit is not set, 
will the (U) be displayed twice?  Maybe there should be a
special display to differentiate the two (U) possibilities?

Thanks,
  Dave




More information about the Crash-utility mailing list