[Linux-cluster] node fails to stop when inquorate

Katriel Traum katriel at penguin-it.co.il
Wed Oct 18 21:07:50 UTC 2006


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Thanks,
I'll look into it.
I don't think having qdiskd reboot the node is a good solution for this scenario.
Are there any cases in which cman reboots a machine? Maybe this should
be configurable (not only when qdiskd tells it to reboot).

Thanks,
+Katriel

Lon Hohberger wrote:
> On Wed, 2006-10-18 at 21:38 +0200, Katriel Traum wrote:
> 
>> The (ugly) workaround I've been using is killing the process manually
>> and then manually removing /var/lock/subsys/rgmanager, which causes "rc"
>> to skip it.
> 
>> Is there a better way to restart a failed node? Shouldn't a failed node
>> be "hard booted" by cman?
> 
> Nodes don't "know" they're fenced with fabric-level fencing; it's a
> deficiency in the model itself.
> 
> The easiest thing to do is 'reboot -fn'.  A fenced node may have
> outstanding buffers which never get cleaned up - so you can't "un-fence"
> them until they have been rebooted anyway.
> 
> Rgmanager's child processes are probably trying to umount a file
> system that has been fenced and are stuck in disk-wait - which may be
> "forever", depending on the storage configuration.
> 
> There's a patch outstanding for qdiskd which makes it reboot the node on
> loss of score.  However, I don't think this is your
> problem.
> 
> -- Lon
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
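
For reference, one way to check the disk-wait theory above before resorting to 'reboot -fn' is to list processes in state "D" (uninterruptible sleep) from /proc. This is only a rough sketch, not part of rgmanager or cman; the function names parse_state and disk_wait_pids are made up for illustration, and it assumes a Linux /proc layout where the state is the first field after the parenthesised command name in /proc/<pid>/stat.

```python
import os


def parse_state(stat_line):
    """Return the one-letter state field from a /proc/<pid>/stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or ')', so take everything after the last ')'.
    after_comm = stat_line[stat_line.rfind(")") + 1:]
    return after_comm.split()[0]


def disk_wait_pids(proc_root="/proc"):
    """Return (pid, comm) pairs for processes currently in state D."""
    stuck = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # not a process directory
        stat_path = os.path.join(proc_root, entry, "stat")
        try:
            with open(stat_path, errors="replace") as f:
                line = f.read()
        except OSError:
            continue  # process exited while we were scanning
        if parse_state(line) == "D":
            comm = line[line.find("(") + 1:line.rfind(")")]
            stuck.append((int(entry), comm))
    return stuck


if __name__ == "__main__":
    for pid, comm in disk_wait_pids():
        print(pid, comm)
```

If umount or rgmanager children show up here and stay there, they are most likely blocked on the fenced storage, and killing them will not help; that would be consistent with the suggestion to force a reboot.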

- --
Katriel Traum, PenguinIT
RHCE, CLP
Mobile: 054-6789953
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org

iD8DBQFFNpefDWy+Hv/461sRAlwZAKCGMPfGwsFmsAd09Z0Z3Y3vxmudwQCfd+09
2oGyyKMkxpPV6SSQUH8J4jk=
=rrou
-----END PGP SIGNATURE-----



