[Linux-cluster] Severe problems with 64-bit RHCS on RHEL5.1

Harri Päiväniemi harri.paivaniemi at tietoenator.com
Thu Apr 17 10:27:45 UTC 2008


Yes,

Cluster.conf attached.


I just resolved 1 thing:

When the cluster daemons on nodes a and b are both down and I start node a,
it hangs for 5 minutes in fencing, because...


man fence_tool says:

"Before joining or leaving the fence domain, fence_tool waits for the
cluster be in a quorate state"

And the qdisk man page says:

"CMAN must be running before the qdisk program can operate in full
capacity.  If CMAN is not running, qdisk will wait for it."

I started the daemons in this order: cman, qdiskd, rgmanager. In this case
it hangs because fencing is waiting for the cluster to become quorate, and
that is not going to happen because qdiskd is not running yet ;)
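
The circular wait can be written down as a small wait-for graph. This is
only a sketch of the reasoning above, not cluster code: the edges are
assumptions read off the two man page quotes and the observed hang, and
WAITS_FOR and find_cycle are made-up names for illustration.

```python
# Wait-for graph for a lone node started in the order cman, qdiskd, rgmanager.
# Each key waits for every item in its list. The edges are assumptions taken
# from the man page quotes and the hang described in this message.
WAITS_FOR = {
    "cman service start": ["fence domain join"],  # start hangs in fencing
    "fence domain join": ["quorum"],              # fence_tool waits for quorum
    "quorum": ["qdiskd vote"],                    # one node vote is not enough
    "qdiskd vote": ["cman service start"],        # qdiskd is started after cman
}


def find_cycle(graph, start):
    """Return the nodes of a wait cycle reachable from start, or None."""
    path = []        # current depth-first path
    on_path = set()  # same nodes as path, for fast membership tests
    visited = set()  # nodes already fully explored or being explored

    def visit(node):
        if node in on_path:
            # Found a node we are already waiting on: close the cycle.
            return path[path.index(node):] + [node]
        if node in visited:
            return None
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for dependency in graph.get(node, []):
            cycle = visit(dependency)
            if cycle:
                return cycle
        path.pop()
        on_path.remove(node)
        return None

    return visit(start)


cycle = find_cycle(WAITS_FOR, "cman service start")
print(" -> ".join(cycle))
```

In this model the start of cman ends up waiting on itself through the
quorum disk vote, which matches the hang until the fencing wait gives up.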

Yeehaw - so one problem solved. Now I can start the cluster one node at a time.


The second problem, which still exists, is:

Nodes a and b are both running and everything is OK. I stop node b's
cluster daemons. When I start them again on node b, this situation
persists forever:

----------------
node a - clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  areenasql1                            1 Online, Local, rgmanager
  areenasql2                            2 Offline
  /dev/sda                              0 Online, Quorum Disk

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  service:areena       areenasql1                     started

-------------------

node b - clustat

Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  areenasql1                            1 Online, rgmanager
  areenasql2                            2 Online, Local, rgmanager
  /dev/sda                              0 Offline, Quorum Disk

  Service Name         Owner (Last)                   State
  ------- ----         ----- ------                   -----
  service:areena       areenasql1                     started


So node b's quorum disk is offline, although the log says it registered OK
and the heuristic is UP... Node a sees node b as offline. If I reboot
node b, it works OK and joins fine...
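
To spot this kind of mismatch without reading the two tables by eye, the
member tables can be compared mechanically. A rough sketch, assuming the
clustat text layout shown in this message (it may differ in other
versions); parse_members and disagreements are hypothetical helper names,
and the sample text is copied from the output above.

```python
NODE_A = """\
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  areenasql1                            1 Online, Local, rgmanager
  areenasql2                            2 Offline
  /dev/sda                              0 Online, Quorum Disk

  Service Name         Owner (Last)                   State
"""

NODE_B = """\
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  areenasql1                            1 Online, rgmanager
  areenasql2                            2 Online, Local, rgmanager
  /dev/sda                              0 Offline, Quorum Disk

  Service Name         Owner (Last)                   State
"""


def parse_members(clustat_text):
    """Map member name -> set of status flags from the member table."""
    members = {}
    in_table = False
    for line in clustat_text.splitlines():
        fields = line.split()
        if fields[:2] == ["Member", "Name"]:
            in_table = True          # header row of the member table
            continue
        if fields[:2] == ["Service", "Name"]:
            break                    # the service table starts here
        if not in_table or not fields or set(fields[0]) == {"-"}:
            continue                 # skip preamble, blanks, and rulers
        name, _member_id, status = line.split(None, 2)
        members[name] = {flag.strip() for flag in status.split(",")}
    return members


def disagreements(view_a, view_b):
    """Yield (member, online_in_a, online_in_b) where the views differ."""
    for name in sorted(set(view_a) | set(view_b)):
        a_online = "Online" in view_a.get(name, set())
        b_online = "Online" in view_b.get(name, set())
        if a_online != b_online:
            yield name, a_online, b_online


for name, a_online, b_online in disagreements(
    parse_members(NODE_A), parse_members(NODE_B)
):
    print(f"{name}: node a sees online={a_online}, node b sees online={b_online}")
```

With the two outputs above, this reports /dev/sda and areenasql2 as the
members the nodes disagree on.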

Both nodes see:

Nodes: 2
Expected votes: 3
Total votes: 2
Quorum: 2
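
These numbers are consistent with the usual majority rule, quorum =
expected_votes / 2 + 1 with integer division. A minimal sketch of that
arithmetic follows; the function names are made up for illustration, and
one vote per node plus one for the quorum disk is an assumption, since
cluster.conf is not shown inline.

```python
def quorum_threshold(expected_votes):
    """Majority threshold: more than half of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(total_votes, expected_votes):
    """True when the votes currently present reach the threshold."""
    return total_votes >= quorum_threshold(expected_votes)


EXPECTED_VOTES = 3  # "Expected votes: 3" above

print(quorum_threshold(EXPECTED_VOTES))  # 2, matching "Quorum: 2"
print(is_quorate(2, EXPECTED_VOTES))     # True, matching "Total votes: 2"
print(is_quorate(1, EXPECTED_VOTES))     # False: a lone node vote, no quorum disk
```

Under the one-vote assumption, either clustat view adds up to 2: node a
counts itself plus the quorum disk, while node b counts both nodes but
not the disk. That would explain why both sides report Quorate even
though they disagree about membership.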



-hjp







On Thu, 2008-04-17 at 12:12 +0200, jr wrote:
> do you mind sending your cluster.conf?
> 
> johannes
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf
Type: application/xml
Size: 2638 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080417/ef17b50b/attachment.wsdl>

