[Linux-cluster] Fencing and dead locks

Jürgen Ladstätter info at innova-studios.com
Tue Dec 2 10:29:04 UTC 2014


Hi guys,

We’re running a 9-node cluster with 5 GFS2 mounts. The cluster is mainly
used for load balancing web-based applications. Fencing is done via IPMI
and works.

Sometimes a server gets fenced but is unable to rejoin the cluster after
rebooting. This causes higher load and many open processes, which leads to
another server being fenced. That server is then unable to rejoin either,
and this continues until we lose quorum and have to restart the whole
cluster manually.

Sadly, this is not reproducible, but it seems to happen more often when
there is more write I/O.

Since a cluster-wide deadlock defeats the purpose of having a cluster, we
would appreciate any input on what we could do or change.

We’re running CentOS 6.6, kernel 2.6.32-504.1.3.el6.x86_64.

Has anyone tested GFS2 on CentOS 7? Are there any known major bugs that
could cause deadlocks?

Thanks in advance, Jürgen

 
