[Linux-cluster] RHEL5 clvmd hangs only after a node crashes...

David Teigland teigland at redhat.com
Thu Oct 18 14:12:00 UTC 2007


On Thu, Oct 18, 2007 at 01:33:41PM +0200, tam_annie at alice.it wrote:
> Hi everybody,
> 
> 	I have successfully installed and (almost successfully) configured the RHEL5 Cluster Suite on a two-node cluster. It will hopefully soon become a three-node cluster (my boss's euros willing), which is why I also configured it with qdisk (on a raw partition).
> 	Two GFS (v. 1) filesystems are shared by both nodes.
> 	
> 	Well, everything really works like a breeze (HP iLO power fencing obviously included), even when I reboot any node _NORMALLY_.
> 	
> 	Problems arise only after a node _CRASHES_ (you see, I love performing cold reboots in test environments). This is exactly what happens:
> 	
> 	1) Node 2 crashes;
> 	2) Node 1 successfully fences node 2: I can go on working on GFS
> 	file systems after a freeze lasting less than one second;
> 	3) While booting up, node 2's startup sequence runs fine at first:
> 	cman services start successfully (I even get 'Starting fencing...
> 	[OK]'!), but things change when it comes to clvmd. The dlm
> 	connections are successfully established, but then the whole node
> 	hangs on 'Starting clvmd... '. Debugging the clvmd init script, I
> 	found that the problem is the vgscan command, which hangs
> 	indefinitely on something like 'Locking vg_flash_1... '. I can't
> 	find any related error in my logs.

Please remove qdisk from your configuration and see if anything changes.
If it still doesn't work, please send any dlm messages from
/var/log/messages, plus the cman_tool nodes and group_tool -v output
from all nodes.
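
Something along these lines, run as root on each node, would gather all
of that into one file per node. This is only a sketch: the collect_diag
function name and the output file name are made up for illustration, and
it assumes the dlm messages end up in /var/log/messages.

```shell
#!/bin/sh
# Collect the requested diagnostics into a single file.
# collect_diag and the default file name are illustrative only.
collect_diag() {
    out="$1"
    {
        echo "== node: $(uname -n) =="

        echo "== dlm lines from /var/log/messages =="
        # Case-insensitive match; report plainly if nothing is found
        # or the log cannot be read.
        grep -i dlm /var/log/messages 2>/dev/null || echo "(no dlm lines found)"

        echo "== cman_tool nodes =="
        if command -v cman_tool >/dev/null 2>&1; then
            cman_tool nodes 2>&1 || echo "(cman_tool nodes failed)"
        else
            echo "(cman_tool not found)"
        fi

        echo "== group_tool -v =="
        if command -v group_tool >/dev/null 2>&1; then
            group_tool -v 2>&1 || echo "(group_tool -v failed)"
        else
            echo "(group_tool not found)"
        fi
    } > "$out"
}

# Default output name includes the node name so files from
# different nodes do not overwrite each other.
collect_diag "${1:-cluster-diag-$(uname -n).txt}"
```

Run it on every node, including the one that stays up, and send the
resulting files.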

Dave



