[Linux-cluster] gfs_controld, aisexec and cman_tool

Corey Kovacs corey.kovacs at gmail.com
Sat Mar 21 09:16:59 UTC 2009


Do you have NTP set up? It's possible for the cluster to form without
it if the clocks are close enough, but once some skew sets in, the
cluster daemons have to work harder to stay in sync.
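
If you want a quick look at how far apart the clocks are, a rough sketch
like the one below might help. It assumes passwordless ssh to each node
and GNU date on the remote side. The function names clock_offset and
max_skew are made up for illustration, and the result is only accurate
to within the ssh round trip, so treat it as a sanity check rather than
a substitute for looking at ntpq -p on each node.

```python
import subprocess
import time


def clock_offset(node, timeout=10.0):
    """Estimate (node clock - local clock) in seconds.

    Runs `date +%s.%N` over ssh (assumes GNU date and key-based ssh) and
    assumes the remote reading was taken about halfway through the round
    trip, so the estimate is only as good as the ssh latency allows.
    """
    start = time.time()
    result = subprocess.run(
        ["ssh", node, "date", "+%s.%N"],
        capture_output=True, text=True, timeout=timeout, check=True,
    )
    end = time.time()
    return float(result.stdout.strip()) - (start + end) / 2


def max_skew(offsets):
    """Return the spread between the fastest and slowest clocks.

    `offsets` maps a node name to its offset in seconds, as returned by
    clock_offset().
    """
    return max(offsets.values()) - min(offsets.values())
```

To use it, build a dict such as {n: clock_offset(n) for n in nodes} with
your own host names and pass it to max_skew(). A spread that keeps
growing between runs would suggest the nodes are not being disciplined
by NTP.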

Regards,

Corey

On Mar 20, 2009, at 18:29, "Ed Sanborn" <Ed.Sanborn at genband.com> wrote:

> Hi folks,
>
> I have an 8-node cluster running on an IBM BladeCenter HS21, using
> RHEL 5.2 and GFS (not GFS2).
>
> The nodes are exhibiting high CPU load from the following apps:
>
> aisexec and cman_tool
>
> Both of these apps race the CPU without any other user apps doing much
> at all. Effectively, the user experience is dog-slow.
>
> After I reboot one of the nodes it clears up, and these apps (aisexec
> and cman_tool) seem to behave for a while. Eventually they race the CPU
> again, days to weeks later.
>
> Has anyone ever experienced this?  Top output is below.
>
> Thanks,
>
> Ed
>
>    [root at blade1]# top
> top - 13:47:51 up 40 days, 22:16, 37 users,  load average: 4.17, 3.94, 3.86
> Tasks: 372 total,   2 running, 369 sleeping,   1 stopped,   0 zombie
> Cpu(s):  5.9%us, 32.6%sy,  0.0%ni, 61.4%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:   8311372k total,  1934844k used,  6376528k free,    76332k buffers
> Swap:  8388600k total,   322976k used,  8065624k free,   443172k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  4352 root      RT   0 37404  35m 2020 R  100  0.4  10519:34 aisexec
> 20806 root      16   0  1684  560  484 S   42  0.0   8324:49 cman_tool
> 12501 root      15   0  1680  556  484 S   31  0.0 609:38.46 cman_tool
> 27245 root      16   0  1688  560  484 S   30  0.0 508:14.31 cman_tool
>  4635 root      34  19     0    0    0 S    2  0.0   1271:52 kipmi0
>  5047 root      18   0  405m  17m 6260 S    1  0.2  21:57.04 cimserver
> 28975 root      15   0  2564 1296  900 R    1  0.0   0:00.05 top
>     1 root      15   0  2064  576  524 S    0  0.0   0:02.91 init
>     2 root      RT  -5     0    0    0 S    0  0.0   0:02.98 migration/0
>     3 root      34  19     0    0    0 S    0  0.0   0:00.11 ksoftirqd/0
>     4 root      RT  -5     0    0    0 S    0  0.0   0:00.00 watchdog/0
>     5 root      RT  -5     0    0    0 S    0  0.0   0:01.29 migration/1
>