FC2 locked up when killing runaway process

Simon Andrews simon.andrews at bbsrc.ac.uk
Fri Oct 15 11:00:20 UTC 2004


Yesterday we had a problem with a development box running FC2.  A 
process (chesire.cgi) went a bit mad and ate all the memory on the 
box (well, it is a development box!).  Normally when this happens the 
process just gets killed and life goes on.  Yesterday, however, the 
whole box locked up, accompanied by heavy disk thrashing.

The slowdown lasted 5-10 minutes, during which the box was 
unreachable from the network and *very* slow from a local terminal 
(Ctrl+Alt+F1 took over a minute to bring up a prompt).  Eventually 
whatever was causing the slowdown finished, everything returned to 
normal, and the box has been running fine ever since.  We've tried to 
reproduce the event, but although we can get the same process killed, 
we've never seen the slowdown again.

I'd like to figure out what actually happened during the lockup, to 
see whether there's anything we can do to fix it or whether this is 
something which should be bugzilla'd.

I've attached the log covering the incident.  The root login in the 
middle was me, right in the midst of the chaos, trying to figure out 
what was going on.

Any help or suggestions are appreciated.

Cheers

Simon.

Oct 14 12:47:58 bilin1 kernel: oom-killer: gfp_mask=0x1d2
Oct 14 12:47:58 bilin1 kernel: DMA per-cpu:
Oct 14 12:47:58 bilin1 kernel: cpu 0 hot: low 2, high 6, batch 1
Oct 14 12:47:59 bilin1 kernel: cpu 0 cold: low 0, high 2, batch 1
Oct 14 12:47:59 bilin1 kernel: cpu 1 hot: low 2, high 6, batch 1
Oct 14 12:47:59 bilin1 kernel: cpu 1 cold: low 0, high 2, batch 1
Oct 14 12:47:59 bilin1 kernel: Normal per-cpu:
Oct 14 12:47:59 bilin1 kernel: cpu 0 hot: low 32, high 96, batch 16
Oct 14 12:47:59 bilin1 kernel: cpu 0 cold: low 0, high 32, batch 16
Oct 14 12:48:00 bilin1 kernel: cpu 1 hot: low 32, high 96, batch 16
Oct 14 12:48:00 bilin1 kernel: cpu 1 cold: low 0, high 32, batch 16
Oct 14 12:48:00 bilin1 kernel: HighMem per-cpu: empty
Oct 14 12:48:00 bilin1 kernel:
Oct 14 12:48:00 bilin1 kernel: Free pages:        1840kB (0kB HighMem)
Oct 14 12:48:01 bilin1 kernel: Active:60456 inactive:60435 dirty:0 writeback:0 unstable:0 free:460 slab:4847 mapped:120578 pagetables:1395
Oct 14 12:48:01 bilin1 kernel: DMA free:1440kB min:20kB low:40kB high:60kB active:5280kB inactive:5008kB present:16384kB
Oct 14 12:48:01 bilin1 kernel: protections[]: 10 360 360
Oct 14 12:48:01 bilin1 kernel: Normal free:400kB min:700kB low:1400kB high:2100kB active:236544kB inactive:236732kB present:507840kB
Oct 14 12:48:01 bilin1 kernel: protections[]: 0 350 350
Oct 14 12:48:01 bilin1 kernel: HighMem free:0kB min:128kB low:256kB high:384kB active:0kB inactive:0kB present:0kB
Oct 14 12:48:02 bilin1 kernel: protections[]: 0 0 0
Oct 14 12:48:02 bilin1 kernel: DMA: 0*4kB 94*8kB 39*16kB 2*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 1440kB
Oct 14 12:48:02 bilin1 kernel: Normal: 0*4kB 0*8kB 1*16kB 4*32kB 2*64kB 1*128kB 0*256kB 0*512kB 0*1024kB 0*2048kB 0*4096kB = 400kB
Oct 14 12:48:02 bilin1 kernel: HighMem: empty
Oct 14 12:48:02 bilin1 kernel: Swap cache: add 3643743, delete 3643448, find 894687/1307178, race 4+52
Oct 14 12:48:03 bilin1 kernel: Out of Memory: Killed process 8411 (chesire.cgi).
Oct 14 12:48:03 bilin1 kernel: chesire.cgi: page allocation failure. order:0, mode:0xd2
Oct 14 12:48:03 bilin1 kernel:  [<02140445>] __alloc_pages+0x2a4/0x2be
Oct 14 12:48:03 bilin1 kernel:  [<0214bd83>] do_anonymous_page+0xb6/0x241
Oct 14 12:48:03 bilin1 kernel:  [<0214bf77>] do_no_page+0x69/0x3a0
Oct 14 12:48:03 bilin1 kernel:  [<0214c460>] handle_mm_fault+0xdf/0x1d4
Oct 14 12:48:04 bilin1 kernel:  [<0214a838>] follow_page+0xca/0x128
Oct 14 12:48:04 bilin1 kernel:  [<0214abfa>] get_user_pages+0x20f/0x3a5
Oct 14 12:48:04 bilin1 kernel:  [<0215862c>] rw_vm+0x178/0x331
Oct 14 12:48:04 bilin1 kernel:  [<02158bb5>] put_user_size+0x29/0x2d
Oct 14 12:48:04 bilin1 kernel:  [<0213cd77>] file_read_actor+0x73/0x101
Oct 14 12:48:04 bilin1 kernel:  [<0213cb04>] do_generic_mapping_read+0x153/0x353
Oct 14 12:48:04 bilin1 kernel:  [<0213cfa3>] __generic_file_aio_read+0x19e/0x1bc
Oct 14 12:48:04 bilin1 kernel:  [<0213cd04>] file_read_actor+0x0/0x101
Oct 14 12:48:05 bilin1 kernel:  [<0213d001>] generic_file_aio_read+0x40/0x47
Oct 14 12:48:05 bilin1 kernel:  [<0215b097>] do_sync_read+0x6a/0x99
Oct 14 12:48:05 bilin1 kernel:  [<0215b17e>] vfs_read+0xb8/0xe4
Oct 14 12:48:05 bilin1 login(pam_unix)[1736]: session opened for user root by (uid=0)
Oct 14 12:48:05 bilin1 kernel:  [<0215b363>] sys_read+0x3c/0x62
Oct 14 12:48:05 bilin1 kernel: chesire.cgi: page allocation failure. order:0, mode:0xd2
Oct 14 12:48:05 bilin1 kernel:  [<02140445>] __alloc_pages+0x2a4/0x2be
Oct 14 12:48:05 bilin1  -- root[1736]: ROOT LOGIN ON tty1
Oct 14 12:48:06 bilin1 kernel:  [<0214bd83>] do_anonymous_page+0xb6/0x241
Oct 14 12:48:06 bilin1 kernel:  [<0214bf77>] do_no_page+0x69/0x3a0
Oct 14 12:48:06 bilin1 kernel:  [<0214c460>] handle_mm_fault+0xdf/0x1d4
Oct 14 12:48:06 bilin1 kernel:  [<0214a838>] follow_page+0xca/0x128
Oct 14 12:48:06 bilin1 kernel:  [<0214abfa>] get_user_pages+0x20f/0x3a5
Oct 14 12:48:06 bilin1 kernel:  [<0215862c>] rw_vm+0x178/0x331
Oct 14 12:48:07 bilin1 kernel:  [<02158bb5>] put_user_size+0x29/0x2d
Oct 14 12:48:07 bilin1 kernel:  [<0213cdd4>] file_read_actor+0xd0/0x101
Oct 14 12:48:07 bilin1 kernel:  [<0213cb04>] do_generic_mapping_read+0x153/0x353
Oct 14 12:48:07 bilin1 kernel:  [<0213cfa3>] __generic_file_aio_read+0x19e/0x1bc
Oct 14 12:48:07 bilin1 kernel:  [<0213cd04>] file_read_actor+0x0/0x101
Oct 14 12:48:07 bilin1 kernel:  [<0213d001>] generic_file_aio_read+0x40/0x47
Oct 14 12:48:08 bilin1 kernel:  [<0215b097>] do_sync_read+0x6a/0x99
Oct 14 12:48:08 bilin1 kernel:  [<0215b17e>]<4>chesire.cgi: page 
allocation failure. order:0, mode:0x1d2
Oct 14 12:48:08 bilin1 kernel:  [<02140445>] vfs_read+0xb8/0xe4
Oct 14 12:48:08 bilin1 kernel:  [<0215b363>] __alloc_pages+0x2a4/0x2be
Oct 14 12:48:08 bilin1 kernel:  [<02142edb>] sys_read+0x3c/0x62
Oct 14 12:48:08 bilin1 kernel:  do_page_cache_readahead+0x138/0x1fa
Oct 14 12:48:09 bilin1 kernel:  [<02143118>] page_cache_readahead+0x17b/0x1b0
Oct 14 12:48:09 bilin1 kernel:  [<0213ca6f>] do_generic_mapping_read+0xbe/0x353
Oct 14 12:48:09 bilin1 kernel:  [<0213cfa3>] __generic_file_aio_read+0x19e/0x1bc
Oct 14 12:48:09 bilin1 kernel:  [<0213cd04>] file_read_actor+0x0/0x101
Oct 14 12:48:09 bilin1 kernel:  [<0213d001>] generic_file_aio_read+0x40/0x47
Oct 14 12:48:10 bilin1 kernel:  [<0215b097>] do_sync_read+0x6a/0x99
Oct 14 12:48:10 bilin1 kernel:  [<0215b17e>] vfs_read+0xb8/0xe4
Oct 14 12:48:10 bilin1 kernel:  [<0215b363>] sys_read+0x3c/0x62
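
As a rough cross-check of the memory figures in the log above, the 
per-zone free-list lines can be summed and compared with the totals the 
kernel printed.  The Python sketch below is only an illustration: the 
helper name buddy_free_kb is made up, the two input strings are the DMA 
and Normal free-list lines from the log with the syslog prefix removed, 
and the 4 kB page size is an assumption (the usual value on i386).

```python
import re

PAGE_KB = 4  # assumption: 4 kB pages, the usual size on i386


def buddy_free_kb(line):
    """Return (computed, reported) free kB for one per-zone free-list line.

    'computed' sums every count*size term; 'reported' is the figure the
    kernel printed after the '=' sign.
    """
    terms = re.findall(r"(\d+)\*(\d+)kB", line)
    computed = sum(int(count) * int(size) for count, size in terms)
    reported = int(re.search(r"=\s*(\d+)kB", line).group(1))
    return computed, reported


# Free-list lines taken from the log, syslog prefix removed.
dma = ("DMA: 0*4kB 94*8kB 39*16kB 2*32kB 0*64kB 0*128kB 0*256kB "
       "0*512kB 0*1024kB 0*2048kB 0*4096kB = 1440kB")
normal = ("Normal: 0*4kB 0*8kB 1*16kB 4*32kB 2*64kB 1*128kB 0*256kB "
          "0*512kB 0*1024kB 0*2048kB 0*4096kB = 400kB")

dma_kb, dma_reported = buddy_free_kb(dma)
normal_kb, normal_reported = buddy_free_kb(normal)
total_kb = dma_kb + normal_kb

free_pages = 460  # the "free:460" field in the log

print("DMA    computed/reported kB:", dma_kb, dma_reported)
print("Normal computed/reported kB:", normal_kb, normal_reported)
print("Total free kB:", total_kb, "from free pages:", free_pages * PAGE_KB)
```

Both zones add up to the figures the kernel reported, and together they 
match the "Free pages: 1840kB" line (460 pages of 4 kB each).  The log 
also shows the Normal zone at 400kB free against a min of 700kB.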





