[Linux-cachefs] Page reference leak in fscache or cachefiles?

Shantanu Goel sgoel01 at yahoo.com
Wed Jul 9 16:35:01 UTC 2014


Hi,

We are running Linux 3.10 + fscache-20130702 + commit bae6b235905ab9dc6659395f7802c1d36fb63f15 from dhowells' git tree and suspect there might be a page reference leak somewhere as evidenced by the low reclaim ratio.  We also see the kernel complaining about memory allocation failures in the readahead code as seen below.
The machine has plenty of filecache and there isn't any heavy write activity which could also pin the pages.
We wrote a script to monitor /proc/vmstat and see the following output with a one second interval.


   Refill      Scan      Free   FrRatio      Slab      Runs    Stalls
         0    154663      6855         4      1024         2         0
         0      7152      7152       100         0         4         0
         0      5960      5960       100         0         4         0
         0      7152      7152       100      1024         3         0
         0    152407     21698        14      1024         2         0


The fields are:

Refill  - # active pages scanned, i.e. deactivated
Scan    - # inactive pages scanned
Free    - # pages freed
FrRatio - free / scan ratio as percentage
Runs    - # times kswapd ran
Stalls  - # times direct reclaim ran


The free memory looks like this (free -m):

             total       used       free     shared    buffers     cached
Mem:         24175      23986        188          0         16      10453
-/+ buffers/cache:      13517      10657
Swap:         8191         85       8106


Are you aware of any such issue with fscache / cachefiles?  If not, could you suggest what other information we could gather to debug it further?


Thanks,
Shantanu
SysRq : HELP : loglevel(0-9) reboot(b) crash(c) terminate-all-tasks(e) memory-full-oom-kill(f) kill-all-tasks(i) thaw-filesystems(j) sak(k) show-backtrace-all-active-cpus(l) show-memory-usage(m) nice-all-RT-tasks(n) poweroff(o) show-registers(p) show-all-timers(q) unraw(r) sync(s) show-task-states(t) unmount(u) show-blocked-tasks(w) dump-ftrace-buffer(z) 
python: page allocation failure: order:0, mode:0x11110
CPU: 5 PID: 18997 Comm: python Tainted: G        W  O 3.10.36-el5.ia32e.lime.0 #1
Hardware name: Supermicro X8STi/X8STi, BIOS 2.0     06/03/2010
 0000000000000000 ffff880104d39578 ffffffff81426c43 ffff880104d39608
 ffffffff810fdcfc ffff88061fffd468 0000004000000000 ffff880104d395a8
 ffffffff810fd506 0000000000000000 ffffffff8109120d 0001111000000000
Call Trace:
 [<ffffffff81426c43>] dump_stack+0x19/0x1e
 [<ffffffff810fdcfc>] warn_alloc_failed+0xfc/0x140
 [<ffffffff810fd506>] ? drain_local_pages+0x16/0x20
 [<ffffffff8109120d>] ? on_each_cpu_mask+0x4d/0x70
 [<ffffffff810fef1c>] __alloc_pages_nodemask+0x6cc/0x910
 [<ffffffff81139a9a>] alloc_pages_current+0xba/0x160
 [<ffffffff810f6617>] __page_cache_alloc+0xa7/0xc0
 [<ffffffffa050a4ea>] cachefiles_read_backing_file+0x2ba/0x7b0 [cachefiles]
 [<ffffffffa050ac87>] cachefiles_read_or_alloc_pages+0x2a7/0x3d0 [cachefiles]
 [<ffffffff81062c4f>] ? wake_up_bit+0x2f/0x40
 [<ffffffffa04da24a>] ? fscache_run_op+0x5a/0xa0 [fscache]
 [<ffffffffa04daacb>] ? fscache_submit_op+0x1db/0x4c0 [fscache]
 [<ffffffffa04dbd65>] __fscache_read_or_alloc_pages+0x1f5/0x2c0 [fscache]
 [<ffffffffa0531e9e>] __nfs_readpages_from_fscache+0x7e/0x1b0 [nfs]
 [<ffffffffa052bc6a>] nfs_readpages+0xca/0x1f0 [nfs]
 [<ffffffff81139a9a>] ? alloc_pages_current+0xba/0x160
 [<ffffffff81101ae2>] __do_page_cache_readahead+0x1b2/0x260
 [<ffffffff81101bb1>] ra_submit+0x21/0x30
 [<ffffffff81101e85>] ondemand_readahead+0x115/0x240
 [<ffffffff81102038>] page_cache_async_readahead+0x88/0xb0
 [<ffffffff810f5f5e>] ? find_get_page+0x1e/0xa0
 [<ffffffff810f81fc>] generic_file_aio_read+0x4dc/0x720
 [<ffffffffa05224b9>] nfs_file_read+0x89/0x100 [nfs]
 [<ffffffff81158a4f>] do_sync_read+0x7f/0xb0
 [<ffffffff8115a3a5>] vfs_read+0xc5/0x190
 [<ffffffff8115a57f>] SyS_read+0x5f/0xa0
 [<ffffffff814327bb>] tracesys+0xdd/0xe2
Mem-Info:
Node 0 DMA32 per-cpu:
CPU    0: hi:  186, btch:  31 usd:   2
CPU    1: hi:  186, btch:  31 usd: 159
CPU    2: hi:  186, btch:  31 usd:  82
CPU    3: hi:  186, btch:  31 usd: 182
CPU    4: hi:  186, btch:  31 usd:  57
CPU    5: hi:  186, btch:  31 usd:   0
Node 0 Normal per-cpu:
CPU    0: hi:  186, btch:  31 usd:  22
CPU    1: hi:  186, btch:  31 usd: 117
CPU    2: hi:  186, btch:  31 usd:  16
CPU    3: hi:  186, btch:  31 usd:  79
CPU    4: hi:  186, btch:  31 usd:  59
CPU    5: hi:  186, btch:  31 usd:   0
active_anon:3461555 inactive_anon:451550 isolated_anon:0
 active_file:236104 inactive_file:1899298 isolated_file:0
 unevictable:467 dirty:109 writeback:0 unstable:0
 free:33860 slab_reclaimable:19654 slab_unreclaimable:5110
 mapped:15284 shmem:2 pagetables:10231 bounce:0
 free_cma:0
Node 0 DMA32 free:91764kB min:4824kB low:6028kB high:7236kB active_anon:2185528kB inactive_anon:566976kB active_file:108808kB inactive_file:670996kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:3660960kB managed:3646160kB mlocked:0kB dirty:152kB writeback:0kB mapped:1344kB shmem:0kB slab_reclaimable:17300kB slab_unreclaimable:580kB kernel_stack:80kB pagetables:6448kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 20607 20607
Node 0 Normal free:39164kB min:27940kB low:34924kB high:41908kB active_anon:11663332kB inactive_anon:1239224kB active_file:835608kB inactive_file:6926636kB unevictable:1868kB isolated(anon):0kB isolated(file):0kB present:21495808kB managed:21101668kB mlocked:1868kB dirty:612kB writeback:0kB mapped:60120kB shmem:8kB slab_reclaimable:61536kB slab_unreclaimable:19860kB kernel_stack:2088kB pagetables:34804kB unstable:0kB bounce:0kB free_cma:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
lowmem_reserve[]: 0 0 0
Node 0 DMA32: 506*4kB (UEM) 545*8kB (UEM) 569*16kB (UEM) 897*32kB (UEM) 541*64kB (UEM) 87*128kB (UM) 4*256kB (UM) 2*512kB (M) 0*1024kB 0*2048kB 0*4096kB = 92000kB
Node 0 Normal: 703*4kB (UE) 285*8kB (UEM) 176*16kB (UEM) 836*32kB (UEM) 7*64kB (UEM) 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 35108kB
Node 0 hugepages_total=0 hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
2137331 total pagecache pages
1215 pages in swap cache
Swap cache stats: add 7056879, delete 7055664, find 633567/703482
Free swap  = 8301172kB
Total swap = 8388604kB
6291455 pages RAM
102580 pages reserved
4391935 pages shared
4746779 pages non-shared




More information about the Linux-cachefs mailing list