[Crash-utility] [PATCH] files: support dump file page cache

Dave Anderson anderson at redhat.com
Mon Jun 15 15:44:06 UTC 2015



----- Original Message -----
> Sorry, I just realized that my email setting is not correct.
> 
> Resending the patch file here.
> 
> > Dave,
> >
> > This patch adds -M and -m options to the files command, which allow
> > dumping the page cache for a file.
> >
> > Please review and let me know your comments. Thanks!

Hello Oliver,

Before getting into the patch specifics, please make it apply to the current
git tree contents:

  $ git clone git://github.com/crash-utility/crash.git
  Cloning into 'crash'...
  remote: Counting objects: 954, done.
  remote: Compressing objects: 100% (3/3), done.
  remote: Total 954 (delta 0), reused 0 (delta 0), pack-reused 951
  Receiving objects: 100% (954/954), 2.08 MiB, done.
  Resolving deltas: 100% (634/634), done.
  $ cd crash
  $ patch -p1 < ../0001-files-support-dump-file-page-caches.patch
  patching file defs.h
  Hunk #3 succeeded at 4755 (offset -16 lines).
  patching file filesys.c
  patching file memory.c
  Hunk #1 succeeded at 134 with fuzz 1 (offset 2 lines).
  Hunk #2 succeeded at 6467 (offset 305 lines).
  patching file task.c
  Hunk #1 succeeded at 5612 (offset 11 lines).
  Hunk #2 succeeded at 5636 (offset 11 lines).
  Hunk #3 succeeded at 6144 (offset 11 lines).
  Hunk #4 succeeded at 6377 (offset 11 lines).
  $ 

And make sure it compiles cleanly with "make warn":
  
  $ make warn
  ... [ cut ] ...
  cc -c -g -DX86_64 -DLZO -DSNAPPY -DGDB_7_6  memory.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security 
  memory.c: In function 'dump_file_address_mappings':
  memory.c:6477:22: warning: unused variable 'ret' [-Wunused-variable]
  memory.c:6475:8: warning: unused variable 'radix_tree_rnode' [-Wunused-variable]
  memory.c: At top level:
  memory.c:6505:1: warning: no previous prototype for 'get_page_tree_count' [-Wmissing-prototypes]
  memory.c: In function 'get_page_tree_count':
  memory.c:6507:8: warning: unused variable 'radix_tree_rnode' [-Wunused-variable]
  cc -c -g -DX86_64 -DLZO -DSNAPPY -DGDB_7_6  filesys.c -Wall -O2 -Wstrict-prototypes -Wmissing-prototypes -fstack-protector -Wformat-security 
  filesys.c: In function 'cmd_files':
  filesys.c:2215:4: warning: implicit declaration of function 'dump_file_address_mappings' [-Wimplicit-function-declaration]
  filesys.c: In function 'file_dump':
  filesys.c:2894:4: warning: implicit declaration of function 'get_page_tree_count' [-Wimplicit-function-declaration]
  ...
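
The -Wmissing-prototypes and -Wimplicit-function-declaration warnings above
normally go away once each new global function is declared in a shared header
before it is defined or called, and the unused locals are dropped. A minimal
sketch of that pattern follows. The function name count_cached_pages() and its
signature are hypothetical, chosen only for illustration; they are not the
patch's actual interface.

```c
#include <assert.h>
#include <stddef.h>

/* --- would live in a shared header (defs.h in crash) --- */
/* Hypothetical prototype: declared once so every caller sees it. */
unsigned long count_cached_pages(const unsigned long *slots, size_t nslots);

/* --- would live in the .c file that implements it --- */
/* Counts the non-zero slots; every local is used, so -Wall stays quiet. */
unsigned long
count_cached_pages(const unsigned long *slots, size_t nslots)
{
	unsigned long count = 0;
	size_t i;

	assert(slots != NULL || nslots == 0);
	for (i = 0; i < nslots; i++)
		if (slots[i])
			count++;
	return count;
}
```

With the prototype visible to both the defining file and the calling file, the
compiler can check the call sites instead of assuming an implicit int return.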

I've only done some quick testing, but for starters, the PATH translation 
for /dev files is not working the same way as the regular "files" command:
  
  crash> files
  PID: 19772  TASK: ffff810278593820  CPU: 7   COMMAND: "sshd"
  ROOT: /    CWD: /
   FD       FILE            DENTRY           INODE       TYPE PATH
    0 ffff8102777021c0 ffff81027cc06078 ffff81027f3efa18 CHR  /dev/null
    1 ffff8102777021c0 ffff81027cc06078 ffff81027f3efa18 CHR  /dev/null
    2 ffff81027760abc0 ffff81027cc06078 ffff81027f3efa18 CHR  /dev/null
    3 ffff810213616dc0 ffff8102130de738 ffff81026d575610 SOCK socket:/[72770]
    4 ffff81027e5b5cc0 ffff8102130de8e8 ffff810274152ad0 SOCK socket:/[72980]
    5 ffff810216a63e80 ffff810229a28228 ffff81021006a910 PIPE 
    6 ffff810276ba8c80 ffff810229a28228 ffff81021006a910 PIPE 
    7 ffff81027e0757c0 ffff8102130dec48 ffff81027f2eb110 SOCK socket:/[72991]
    8 ffff8102136160c0 ffff810213079b70 ffff8102741525d0 SOCK socket:/[72992]
    9 ffff81027e0755c0 ffff81027f3f0588 ffff81027f3ef418 CHR  /dev/ptmx
   10 ffff81027e0755c0 ffff81027f3f0588 ffff81027f3ef418 CHR  /dev/ptmx
   11 ffff81027e0755c0 ffff81027f3f0588 ffff81027f3ef418 CHR  /dev/ptmx
  crash> files -M
  PID: 19772  TASK: ffff810278593820  CPU: 7   COMMAND: "sshd"
  ROOT: /    CWD: /
   FD    ADDR-SPACE      PGCACHE-PGS         INODE       TYPE PATH
    0 ffff81027f3efb28         0        ffff81027f3efa18 CHR  /null
    1 ffff81027f3efb28         0        ffff81027f3efa18 CHR  /null
    2 ffff81027f3efb28         0        ffff81027f3efa18 CHR  /null
    3 ffff81026d575720         0        ffff81026d575610 SOCK socket:/[72770]
    4 ffff810274152be0         0        ffff810274152ad0 SOCK socket:/[72980]
    5 ffff81021006aa20         0        ffff81021006a910 PIPE 
    6 ffff81021006aa20         0        ffff81021006a910 PIPE 
    7 ffff81027f2eb220         0        ffff81027f2eb110 SOCK socket:/[72991]
    8 ffff8102741526e0         0        ffff8102741525d0 SOCK socket:/[72992]
    9 ffff81027f3ef528         0        ffff81027f3ef418 CHR  /ptmx
   10 ffff81027f3ef528         0        ffff81027f3ef418 CHR  /ptmx
   11 ffff81027f3ef528         0        ffff81027f3ef418 CHR  /ptmx
  crash>

But more importantly, for "files -M", it's not clear to me what the PGCACHE-PGS count
should or does mean.

One might expect to pass any ADDR-SPACE address shown by "files -M" to the
"files -m <address-space>" option, and see PGCACHE-PGS worth of pages dumped.  
But that's not always true.

For example:
  
  crash> files -M
  PID: 30700  TASK: ffff810876c8d7a0  CPU: 0   COMMAND: "_progres"
  ROOT: /    CWD: /home/TCusa
   FD    ADDR-SPACE      PGCACHE-PGS         INODE       TYPE PATH
    0 ffff81080e7095c0         0        ffff81080e7094b0 CHR  /45
    1 ffff81080e7095c0         0        ffff81080e7094b0 CHR  /45
    2 ffff81080e7095c0         0        ffff81080e7094b0 CHR  /45
    3 ffff810fe6da85c0         0        ffff810fe6da84b0 PIPE 
    4 ffff81021b28eb88        35        ffff81021b28ea78 REG  /dlc/101c/convmap.cp
    5 ffff811024532b88         6        ffff811024532a78 REG  /ProgTemp/lbiVQS7cd
    6 ffff810224fbd850        78        ffff810224fbd740 REG  /usa/mfg/usa.lg
    7 ffff810218e31220        12        ffff810218e31110 REG  /usa/mfg/usa.db
    8 ffff810218e31220        12        ffff810218e31110 REG  /usa/mfg/usa.db
    9 ffff810e371fe260       42937      ffff810e371fe150 REG  /usa/mfg/usa.b1
   10 ffff81102f12ee40         0        ffff81102f12ed30 REG  /usa/mfg/usa.b2
   11 ffff810224fbde40        490       ffff810224fbdd30 REG  /usa/mfg/usa.d1
   12 ffff810224fbde40        490       ffff810224fbdd30 REG  /usa/mfg/usa.d1
  ...
  
Taking FD 7's address space structure, the 12 page cache pages can be dumped:
  
  crash> files -m ffff810218e31220
  Address Space ffff810218e31220 : 12 pages in page cache
  
        PAGE        PHYSICAL      MAPPING       INDEX CNT FLAGS
  ffff81010774d1f8 221609000 ffff810218e31220        0  1 22010000001006c
  ffff81010485b378 14ac59000 ffff810218e31220        1  1 14810000001006c
  ffff810107993a78 22bc79000 ffff810218e31220        2  1 228100000010028
  ffff8101049d3660 1517d4000 ffff810218e31220        3  1 150100000010028
  ffff810103b29670 10e742000 ffff810218e31220        4  1 108100000010028
  ffff810106d51ba0 1f3bec000 ffff810218e31220        5  1 1f0100000010028
  ffff810103ac95f0 10cbd2000 ffff810218e31220        6  1 108100000010028
  ffff810106c8cc18 1f03a5000 ffff810218e31220        7  1 1f0100000010028
  ffff8101077b1028 223293000 ffff810218e31220        8  1 220100000010028
  ffff810106cc03b0 1f125a000 ffff810218e31220        9  1 1f0100000010028
  ffff810107a04cd8 22dccd000 ffff810218e31220        a  1 228100000010028
  ffff8101078741b8 226a51000 ffff810218e31220        b  1 220100000010028
  crash> 
  
So taking FD 11's ffff810224fbde40, you would expect all 490 pages plus 
the 3-line header to be dumped:

  crash> files -m ffff810224fbde40 | wc -l
  46
  crash>

Or taking FD 9's ffff810e371fe260, you would expect 42937 pages plus the header:

  crash> files -m ffff810e371fe260 | wc -l
  2444
  crash>

So what does that mean exactly?  Should the PGCACHE-PGS display show "x of y", 
where "x" is the number of a file's "y" cached pages that are mapped into the 
specified address space?

I haven't looked too deeply at the patch-set yet, but in my quick test, I ran
into this in your new dump_file_address_mappings() function:

+       /* Now walk the tree, counting all the pages in the tree */
+       for (index = 0; index <= count; index++) {
+               rtp.index = index;
+               if (do_radix_tree(root_rnode, RADIX_TREE_SEARCH, &rtp)) {
+                       meminfo.spec_addr = (ulong)rtp.value;
+                       meminfo.memtype = KVADDR;
+                       meminfo.flags = ADDRESS_SPECIFIED;
+                       dump_mem_map_SPARSEMEM(&meminfo);
+               }
+       }
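
One possible reading of the short "files -m" output above, not verified against
the full patch: the loop stops at index == count, so it only probes page
indexes 0 through count. If a file's cached pages sit at sparse offsets, any
page whose index is larger than the page count is never visited. The toy model
below uses a plain array in place of the radix tree; all names in it are
hypothetical and it does not use any crash interfaces.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a radix-tree search: returns 1 if 'index' is cached. */
static int
page_cached(const unsigned long *cached, size_t n, unsigned long index)
{
	size_t i;

	for (i = 0; i < n; i++)
		if (cached[i] == index)
			return 1;
	return 0;
}

/* Mirrors the quoted loop: probes indexes 0..count inclusive. */
size_t
walk_bounded_by_count(const unsigned long *cached, size_t n)
{
	unsigned long index;
	size_t found = 0;

	for (index = 0; index <= n; index++)
		found += page_cached(cached, n, index);
	return found;
}

/* Alternative: probe up to the highest index actually present. */
size_t
walk_bounded_by_max_index(const unsigned long *cached, size_t n)
{
	unsigned long index, max = 0;
	size_t i, found = 0;

	assert(cached != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (cached[i] > max)
			max = cached[i];
	for (index = 0; n && index <= max; index++)
		found += page_cached(cached, n, index);
	return found;
}
```

For a file with cached pages at indexes 0, 1, 100 and 200, the first walk finds
only two of the four pages, while a dense file with indexes 0, 1, 2 is dumped
completely by either walk. That would match FD 7 working while FD 9 and FD 11
come up short.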

Also, if the kernel is not configured with CONFIG_SPARSEMEM, the "files -m" option
fails like this on a 2.6.9 RHEL4 x86_64 kernel (yes, the address space virtual
address is correct):

  crash> files -m 1016fcaa668
  Address Space 1016fcaa668 : 20 pages in page cache

  files: cannot resolve "mem_section"
  crash>
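
The failure above is consistent with the quoted loop calling
dump_mem_map_SPARSEMEM() unconditionally, which needs the "mem_section" symbol
that non-SPARSEMEM kernels do not have. A sketch of choosing the page-dumping
path from the kernel's memory model is shown here. The enum and the three
functions are illustrative stand-ins, not crash's real interfaces.

```c
#include <assert.h>

/* Hypothetical memory-model tag; crash keeps this in its own flags. */
enum mem_model { MODEL_FLATMEM, MODEL_SPARSEMEM };

/* Stand-ins for the two page-dumping back ends. */
static int dump_page_flat(unsigned long page)   { (void)page; return 1; }
static int dump_page_sparse(unsigned long page) { (void)page; return 2; }

/*
 * Pick the back end that matches the kernel configuration instead of
 * assuming SPARSEMEM.  Returns the back end's result so a caller can
 * tell which path ran.
 */
int
dump_cached_page(enum mem_model model, unsigned long page)
{
	assert(model == MODEL_FLATMEM || model == MODEL_SPARSEMEM);
	if (model == MODEL_SPARSEMEM)
		return dump_page_sparse(page);
	return dump_page_flat(page);
}
```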

For backwards-compatibility, I did a quick check on a couple older 32-bit x86 kernels, 
and on a RHEL4 2.6.9-based x86 kernel, "files -M" fails every time:

  crash> files -M
  PID: 4846   TASK: c09de0b0  CPU: 0   COMMAND: "dmach7"
  ROOT: /    CWD: /home/m7istp.4.6.18.b4/m7istp/dmach7/bin
   FD  ADDR-SPACE  PGCACHE-PGS   INODE    TYPE  PATH
  radix_tree_root at cf53a3ac:
  struct radix_tree_root {
    height = 0x220, 
    gfp_mask = 0x0, 
    rnode = 0x1d244b3c 
  }
  files: height 544 is greater than height_to_maxindex[] index 7
  crash> 

I thought it might be a problem with really old kernels, but it happens on
a 32-bit RHEL5 2.6.18-128.2.1.el5 kernel:

  crash> files -M
  PID: 22328  TASK: dbdeb000  CPU: 0   COMMAND: "sushiremote"
  ROOT: /    CWD: /afs/cs.wisc.edu/u/s/u/sushi
   FD  ADDR-SPACE  PGCACHE-PGS   INODE    TYPE  PATH
  radix_tree_root at de8cb798:
  struct radix_tree_root {
    height = 0x220, 
    gfp_mask = 0x0, 
    rnode = 0x1000000
  }
  files: height 544 is greater than height_to_maxindex[] index 7
  crash> 

And the same thing on the most recent 32-bit x86 kernel I have on hand, which
is 2.6.40.4-5.fc15.i686.PAE:

  crash> files -M
  PID: 3804   TASK: f466a5e0  CPU: 0   COMMAND: "crash"
  ROOT: /    CWD: /root/crash-5.1.8
   FD  ADDR-SPACE  PGCACHE-PGS   INODE    TYPE  PATH
  radix_tree_root at e6d6f644:
  struct radix_tree_root {
    height = 0x20, 
    gfp_mask = 0x0, 
    rnode = 0x101
  }
  files: height 32 is greater than height_to_maxindex[] index 7
  crash> 

So then I tried it on a 32-bit ARM 3.10.17 kernel, which also fails:

  crash> files -M
  PID: 13429  TASK: db944580  CPU: 1   COMMAND: "AudioIn_5F8"
  ROOT: /    CWD: /
   FD  ADDR-SPACE  PGCACHE-PGS   INODE    TYPE  PATH
  radix_tree_root at db6b1a7c:
  struct radix_tree_root {
    height = 0x20, 
    gfp_mask = 0x0, 
    rnode = 0x0
  }
  files: height 32 is greater than height_to_maxindex[] index 7
  crash> 

So I'm guessing that the patch fails on all 32-bit kernels.
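
In all four runs the hex height in the structure dump matches the decimal value
in the error message (0x220 is 544, 0x20 is 32), and both are far above the
largest accepted index of 7. That looks more like the radix_tree_root being
read from the wrong location than like a genuinely deep tree, although that is
only a guess. A small sketch of the bounds check implied by the message, with a
hypothetical function name:

```c
/* Largest index accepted, as reported in the error messages above. */
#define MAX_HEIGHT_INDEX 7U

/* The hex heights in the struct dumps equal the decimal ones reported. */
_Static_assert(0x220 == 544 && 0x20 == 32, "hex/decimal mismatch");

/* Returns 1 if 'height' is a usable table index, 0 otherwise. */
int
radix_height_is_plausible(unsigned int height)
{
	return height <= MAX_HEIGHT_INDEX;
}
```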

Dave




> >
> > Here is the usage:
> >
> > 1. Dump the page cache page counts for a process; the default is crash,
> > and it also works with a given pid:
> >
> > crash> files -M
> > PID: 22710  TASK: ffff8801077153e0  CPU: 1   COMMAND: "crash"
> > ROOT: /    CWD: /auto/home2/yango/workspace/crash
> > FD    ADDR-SPACE      PGCACHE-PGS         INODE       TYPE PATH
> >    0 ffff8801031edbe8         0        ffff8801031edaa0 CHR  /2
> >    1 ffff8801031edbe8         0        ffff8801031edaa0 CHR  /2
> >    2 ffff8801031edbe8         0        ffff8801031edaa0 CHR  /2
> >    3 ffff880139bf8950         0        ffff880139bf8808 CHR  /null
> >    4 ffff88011e561390         0        ffff88011e561248 CHR  /crash
> >    5 ffff88012f8345f0       37910      ffff88012f8344a8 REG  /usr/lib/debug/lib/modules/3.11.10-301.fc20.x86_64/vmlinux
> >    [snipped..........................]
> >
> > 2. Dump the pages in a given address space; this example uses
> > ffff88012f8345f0 from the output above. The page flags can indicate
> > dirty pages, which helps with fsync stress debugging:
> >
> > crash> files -m ffff88012f8345f0
> > Address Space ffff88012f8345f0 : 37910 pages in page cache
> >
> >        PAGE       PHYSICAL      MAPPING       INDEX CNT FLAGS
> > ffffea0001f5bc40 7d6f1000 ffff88012f8345f0        0  2 3ff0000000086c referenced,uptodate,lru,active,private
> > ffffea0001f5bc80 7d6f2000 ffff88012f8345f0        1  2 3ff0000000082c referenced,uptodate,lru,private
> > ..............................[snipped...].........................................................................
> > ffffea00016226c0 5889b000 ffff88012f8345f0     9414  2 3ff0000000086c referenced,uptodate,lru,active,private
> > ffffea000224f480 893d2000 ffff88012f8345f0     9415  2 3ff0000000086c referenced,uptodate,lru,active,private
> >
> > 3. "foreach files" does not work with -m, but it does work with -M:
> >
> > crash> foreach files -m
> > foreach: foreach files command does not support -m option
> >
> > So foreach can be used to find which processes or files have the most
> > page cache pages:
> >
> > crash> foreach files -M | grep REG | sort -k3 -n | tail -10
> > 20 ffff880137a70be0         2        ffff880137a70a98 REG  /ffinLFoAy
> >    4 ffff880037630de0        131       ffff880037630c98 REG  /var/log/audit/audit.log
> >    4 ffff880037630de0        131       ffff880037630c98 REG  /var/log/audit/audit.log
> > 36 ffff8801352e91d8        574       ffff8801352e9090 REG  /var/log/journal/2d6f0d3073ff4a60b1e52a8e38e48feb/user-530.journal
> > 34 ffff8801352e81f8        590       ffff8801352e80b0 REG  /var/log/journal/2d6f0d3073ff4a60b1e52a8e38e48feb/user-42.journal
> >    5 ffff8800a90219c8       9816       ffff8800a9021880 REG  /usr/lib/debug/lib/modules/3.11.10-301.fc20.x86_64/vmlinux
> > 13 ffff880135267198       14051      ffff880135267050 REG  /var/log/journal/2d6f0d3073ff4a60b1e52a8e38e48feb/system.journal
> >    5 ffff88012f8345f0       37910      ffff88012f8344a8 REG  /usr/lib/debug/lib/modules/3.11.10-301.fc20.x86_64/vmlinux
> >    1 ffff8800704f3d80       59468      ffff8800704f3c38 REG  /ws/irqstat/nohup.out
> >    2 ffff8800704f3d80       59468      ffff8800704f3c38 REG  /ws/irqstat/nohup.out
> >
> > With these commands, we can easily debug page cache flush stress
> > issues and find out which process or files had the problem.
> >
> >
> 
> 
> --
> Crash-utility mailing list
> Crash-utility at redhat.com
> https://www.redhat.com/mailman/listinfo/crash-utility



