more on bogged down server

Harold Hallikainen harold at hallikainen.com
Thu Apr 13 14:39:08 UTC 2006


> On Wed, 2006-04-12 at 13:55 -0700, Harold Hallikainen wrote:
>
> <snip>
>> >
>> > Have you done the "vmstat 3" thing yet to see if you have context
>> > switching going nuts?
>> >
>>
>> I guess I have to read about vmstat 3. I dunno what it means, but here's
>> some output:
>>
>>  vmstat 3
>> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>>  8  0  25068 159440  15576 456728    0    2   188    52  378    95 94  1  5  0
>>  9  0  25068 159320  15576 456856    0    0    43     0  358    73 100  0  0  0
>>  8  0  25068 159200  15584 456984    0    0    43    25  350    74 100  0  0  0
>>  8  0  25068 159080  15584 457112    0    0    43     0  348    74 100  0  0  0
>>  8  0  25068 159020  15588 457196    0    0    28    17  356    72 100  0  0  0
>>  8  0  25068 159020  15592 457196    0    0     0    17  352    66 100  0  0  0
>>  8  0  25068 159020  15592 457196    0    0     0     0  354    72 100  0  0  0
>>  8  0  25068 159020  15600 457196    0    0     0    19  350    73 100  0  0  0
>>  8  0  25068 159020  15608 457196    0    0     0    31  355    76 100  0  0  0
>>  8  0  25068 159020  15608 457196    0    0     0     0  350    72 100  0  0  0
>>  8  0  25068 159020  15616 457196    0    0     0    31  354    73 100  0  0  0
>> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>>  9  0  25068 159020  15616 457196    0    0     0     0  353    71 100  0  0  0
>>  8  0  25068 159020  15624 457196    0    0     0    27  352    77 100  0  0  0
>>  8  0  25068 159020  15624 457196    0    0     0    28  359    77 100  0  0  0
>>  8  0  25068 159020  15632 457196    0    0     0    17  349    76 100  0  0  0
>>  9  0  25068 150320  15640 457300    0    0    35    17  361    90 96  4  0  0
>>  9  1  25068 137820  15748 458000    0    0   269     0  419   206 87 13  0  0
>> 10  0  25068 129764  15928 461872    0    0  1348   211  629   630 93  7  0  0
>> 10  0  25068 128296  15932 462000    0    0    43    55  355    77 99  1  0  0
>> 10  0  25068 127876  15932 462128    0    0    43     0  350    72 99  1  0  0
>> 10  0  25068 127632  15936 462252    0    0    41    39  353    73 100  0  0  0
>> 10  0  25068 127508  15936 462252    0    0     0     0  355    66 100  0  0  0
>> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>> 11  0  25068 126336  15956 462476    0    0    79    23  350    77 100  0  0  0
>> 10  0  25068 126032  15960 462604    0    0    43    27  356    77 100  0  0  0
>> 10  0  25068 125912  15960 462732    0    0    43     0  351    74 100  0  0  0
>> 10  0  25068 125792  15964 462860    0    0    43    24  351    74 100  0  0  0
>> 10  0  25068 125672  15964 462988    0    0    43   217  365    82 100  0  0  0
>> 10  0  25068 125552  15972 463116    0    0    43    20  353    78 100  0  0  0
>> 10  0  25068 125296  15988 463128    0    0     7    36  385    82 100  0  0  0
>> 10  0  25068 125060  15988 463256    0    0    43     0  354    71 99  1  0  0
>> 10  0  25068 124632  15996 463384    0    0    43    39  349    73 99  1  0  0
>> 10  0  25068 124264  15996 463512    0    0    43     0  355    73 100  0  0  0
>> 10  0  25068 124204  16004 463584    0    0    24    28  350    76 100  0  0  0
>> procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
>>  r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
>> 11  0  25068 123960  16012 463712    0    0    43    21  356    76 100  0  0  0
>> 10  0  25068 123720  16016 463936    0    0    76     0  353    75 100  0  0  0
>> 10  0  25068 123360  16020 464320    0    0   128    21  350    72 100  0  0  0
>> 10  0  25068 123120  16020 464576    0    0    85     0  357    76 100  0  0  0
>> 10  0  25068 122812  16044 464832    0    0    92    21  366    79 100  0  0  0
>> 10  0  25068 122568  16052 464960    0    0    43    73  362    82 100  0  0  0
>> 10  0  25068 122336  16052 465216    0    0    85     0  356    70 100  0  0  0
>> 10  0  25068 122216  16060 465344    0    0    43    32  349    74 100  0  0  0
>> 12  0  25068 122156  16060 465420    0    0    25     0  357    72 100  0  0  0
>>
>>
>> I restarted httpd about an hour ago. top now reports a load average of
>> 10.06 9.06 8.16
>
> Hmmm.  Well, you're not going nutsy with the context switches (that's
> the "cs" column).  You aren't swapping ("si" and "so").  You also aren't
> bogged down in disk I/O ("bi" and "bo") and you're not getting swamped
> with interrupts ("in").  To be truthful, I'd expect more interrupt
> activity because of network I/O so you may still be throttled back by
> your ISP.  I can't be sure.
>
> You are spending a ridiculous amount of time in userspace, so that's an
> indicator that SOMETHING changed in your web config.  I'd go look at the
> yum logs and see if something you use in your site (mod_perl, perl, PHP,
> etc.) got updated and consequently broke.  You might even try an strace
> of a couple of the web processes to see what the hell they're doing.
>
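
The column-by-column reading above can also be done mechanically. What
follows is only a rough sketch, assuming each vmstat sample sits on a single
line (mail wrapping undone) and uses the 16 columns shown in the header.
The names parse_vmstat and summarize are made up for illustration and are
not part of vmstat or any library.

```python
# Column names as printed in the vmstat header quoted above.
COLUMNS = ["r", "b", "swpd", "free", "buff", "cache", "si", "so",
           "bi", "bo", "in", "cs", "us", "sy", "id", "wa"]


def parse_vmstat(text):
    """Return one dict per data row; header lines and '>' quote marks are skipped."""
    rows = []
    for line in text.splitlines():
        fields = line.lstrip("> ").split()
        # A data row has exactly 16 all-numeric fields.
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def summarize(rows):
    """Average the columns discussed in the reply over all samples."""
    if not rows:
        return {}
    names = ("r", "cs", "si", "so", "bi", "bo", "in", "us", "id")
    return {name: sum(r[name] for r in rows) / len(rows) for name in names}


# Three rows copied from the output above, rejoined onto single lines.
sample = """\
 r  b   swpd   free   buff  cache   si   so    bi    bo   in    cs us sy id wa
 8  0  25068 159440  15576 456728    0    2   188    52  378    95 94  1  5  0
 9  0  25068 159320  15576 456856    0    0    43     0  358    73 100  0  0  0
10  0  25068 129764  15928 461872    0    0  1348   211  629   630 93  7  0  0
"""

rows = parse_vmstat(sample)
summary = summarize(rows)
print(summary)
```

A high average in "us" with low "si", "so" and "cs" matches the reading
given in the reply: the time is going to userspace, not to swapping or
context switching.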


strace sounds interesting. I'll have to read up on it. Meanwhile, is there
some way to take a PID out of top and see what URL(s) httpd is working on?

Prior to the trip to Arkansas, when this problem first appeared, I DID
update Gallery, the photo gallery program. Looking at the httpd logs, I see
search engines calling the slideshow, which is pretty processor intensive.
So I've added gallery to my disallow list in robots.txt. Also, looking
through the Gallery config last night, I found an option that reduces CPU
usage by about 90% by only updating dynamic pages every 15 minutes instead
of recreating them on the fly. I'll see how these two changes help.
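
One way to double-check a robots.txt change like this is to run it through
Python's standard urllib.robotparser. This is a minimal sketch only: the
/gallery/ path, the example.com URLs, and the is_allowed helper are
assumptions for illustration, since the real path depends on where Gallery
is installed.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents; the actual Disallow path depends on
# where Gallery lives under the document root.
ROBOTS_TXT = """\
User-agent: *
Disallow: /gallery/
"""


def is_allowed(robots_txt, url, agent="*"):
    """Return True if a well-behaved crawler may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


# Placeholder URLs: one under the disallowed path, one outside it.
print(is_allowed(ROBOTS_TXT, "http://example.com/gallery/slideshow.php"))
print(is_allowed(ROBOTS_TXT, "http://example.com/index.html"))
```

Keep in mind robots.txt is advisory: it only helps with crawlers that
choose to honor it.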

It's also been suggested that I test this on the LAN with WAN requests
removed. I'll try that this weekend to see if I can reproduce the CPU load
with some known URL request.
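
For the LAN test, something along these lines could issue one known request
a few times while top or vmstat runs in another terminal. It is a sketch
under assumptions: time_request is a made-up helper, and the opener
parameter exists only so the function can be tried without a network. It
measures wall-clock response time on the client, not CPU time on the
server.

```python
import time
import urllib.request


def time_request(url, repeats=3, opener=urllib.request.urlopen):
    """Fetch url `repeats` times and return the elapsed seconds for each fetch."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        with opener(url) as response:
            response.read()  # read the whole body so the server finishes the page
        timings.append(time.perf_counter() - start)
    return timings
```

Calling time_request() with the URL of a slideshow page on the LAN address
of the server (whatever that is locally) and comparing the timings against
a static page should show whether that one request is enough to load the
CPU.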

THANKS to all!

Harold


-- 
FCC Rules Updated Daily at http://www.hallikainen.com




More information about the Redhat-install-list mailing list