[Linux-cluster] Less than smooth upgrade experience from RHEL5.2->5.3

denis denisb+gmane at gmail.com
Tue Jan 27 15:30:12 UTC 2009


Hi,

I'll just describe my upgrade process today. The cluster is back in a
quorate, operational state, but I don't fully understand what happened,
and any input on what to do differently next time would be welcome.

This is a two-node cluster running qdisk (so 3 total votes). The
resources are mysql and haproxy on SAN-backed storage. Both nodes mount
/var/www as GFS on a multipathed SAN device.
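To make the vote math explicit, here is a minimal sketch. It assumes each node and the qdisk carry one vote, and that quorum is a strict majority of the expected votes (expected_votes // 2 + 1). The function names are made up for illustration and are not part of any cluster tool.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Votes needed for quorum: a strict majority of the expected votes."""
    return expected_votes // 2 + 1


def is_quorate(votes_present: int, expected_votes: int) -> bool:
    """True if the votes currently present reach the quorum threshold."""
    return votes_present >= quorum_threshold(expected_votes)


# Assumed layout: node A = 1 vote, node B = 1 vote, qdisk = 1 vote.
EXPECTED_VOTES = 3

# One node down, qdisk still counted by the surviving node: 2 of 3 votes.
print("node + qdisk:", is_quorate(2, EXPECTED_VOTES))

# One node down and the qdisk vote lost as well: 1 of 3 votes.
print("node alone:  ", is_quorate(1, EXPECTED_VOTES))
```

Under these assumptions a single node stays quorate only as long as it still counts the qdisk vote. If that vote disappears at the same moment as the other node, the survivor is left with 1 of 3 votes.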

1. Migrated all services to node B
2. Upgraded node A with yum
3. Rebooted node A
4. Node A rejoined the cluster and took ownership of the resource for
which it has failoverdomain priority
5. I noticed /var/www was not mounted on node A
6. The error message was descriptive enough, so I removed mount options
until I could mount /var/www (removed noatime and noquota from fstab)
7. Migrated remaining service to node A
8. Upgraded node B with yum
9. Rebooted node B
10. When node B shut down, node A instantly reported quorum lost and
dissolved the cluster
11. On reboot, node B hung because the cluster was inquorate
12. Eventually, rebooting both nodes re-established quorum and cluster
services came up


The messages on node A from the point where cluster quorum was dissolved
say:

Jan 27 14:57:57 nodeb qdiskd[3806]: <info> Node 1 shutdown
Jan 27 14:58:03 nodeb clurgmgrd[4465]: <emerg> #1: Quorum Dissolved
Jan 27 14:58:03 nodeb kernel: dlm: closing connection to node 1
Jan 27 14:58:03 nodeb openais[3755]: [CMAN ] lost contact with quorum device
Jan 27 14:58:03 nodeb openais[3755]: [CMAN ] quorum lost, blocking activity
Jan 27 14:58:03 nodeb ccsd[3681]: Cluster is not quorate.  Refusing connection.
Jan 27 14:58:03 nodeb ccsd[3681]: Error while processing connect: Connection refused
Jan 27 14:58:03 nodeb ccsd[3681]: Invalid descriptor specified (-111).
Jan 27 14:58:03 nodeb ccsd[3681]: Someone may be attempting something evil.
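To line these messages up as a timeline, something like the following sketch could help. It assumes the usual syslog layout (timestamp, host, daemon with optional pid, message) and a log year of 2009, since syslog stamps carry no year. The regex and the helper names are illustrative only.

```python
import re
from datetime import datetime

# Assumed syslog layout: "Mon DD HH:MM:SS host daemon[pid]: message"
LOG_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+ \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<daemon>[\w-]+)(?:\[(?P<pid>\d+)\])?: "
    r"(?P<message>.*)$"
)


def parse_line(line):
    """Split one syslog line into its fields, or return None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def seconds_between(first, second, year=2009):
    """Seconds elapsed between two parsed entries; the year is an assumption."""
    fmt = "%Y %b %d %H:%M:%S"
    start = datetime.strptime(f"{year} {first['stamp']}", fmt)
    end = datetime.strptime(f"{year} {second['stamp']}", fmt)
    return (end - start).total_seconds()


SAMPLE = [
    "Jan 27 14:57:57 nodeb qdiskd[3806]: <info> Node 1 shutdown",
    "Jan 27 14:58:03 nodeb clurgmgrd[4465]: <emerg> #1: Quorum Dissolved",
    "Jan 27 14:58:03 nodeb kernel: dlm: closing connection to node 1",
    "Jan 27 14:58:03 nodeb openais[3755]: [CMAN ] lost contact with quorum device",
]

entries = [parse_line(line) for line in SAMPLE]
for entry in entries:
    print(entry["stamp"], entry["daemon"], "|", entry["message"])

# Gap between the qdiskd shutdown notice and the quorum-dissolved message.
print("gap in seconds:", seconds_between(entries[0], entries[1]))
```

Grouping by daemon this way makes it easier to see which component reported the loss first.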


I am still scratching my head over why quorum was dissolved when node B
was rebooted.

Regards
-- 
Denis Braekhus
Team Lead Managed Services
Redpill Linpro AS - Changing the game



