[Linux-cluster] fs.sh status hangs after device failures

jose nuno neto jose.neto at liber4e.com
Mon Apr 19 14:07:39 UTC 2010


Hello,

I'm testing a SAN under multipath failures and have found behavior in
fs.sh that is not what I wanted.

When I simulate a SAN failure, either by taking the port down on the SAN
switch or from the OS ( echo offline > /sys/block/$DEVICE/device/state ),
the fs.sh status script does not return an error.

I looked at the script and think it hangs on the ls or touch test (which
one depends on timing).
In fact, if I run ls or touch on the failed mount points myself, the
command hangs forever.
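
To illustrate the hang described above, here is a minimal sketch of a write test with a deadline. It is not part of fs.sh; probe_mount, its arguments and the return code 2 are hypothetical names chosen for this example. The test runs in the background and the caller polls it, because a process blocked on I/O to an offline device may not react to signals, so simply killing it after a timeout may not help.

```shell
#!/bin/sh
# Hypothetical helper, NOT part of fs.sh: probe a mount point with a deadline.
# Usage: probe_mount <mountpoint> [deadline_seconds]
# Returns 0 if the write test succeeded, 2 if it was still blocked when the
# deadline passed, and the test's own non-zero status if it failed quickly.
probe_mount() {
    mp=$1
    deadline=${2:-10}
    probe_file="$mp/.probe_$$"

    # Run the write test in the background so blocked I/O cannot block us.
    ( touch "$probe_file" && rm -f "$probe_file" ) >/dev/null 2>&1 &
    pid=$!

    waited=0
    # Poll once per second until the test exits or the deadline passes.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$deadline" ]; then
            # Still running: report failure and do NOT wait for it,
            # since waiting on a blocked process would hang as well.
            return 2
        fi
        sleep 1
        waited=$((waited + 1))
    done

    # The test finished in time; pass its exit status on to the caller.
    wait "$pid"
}
```

Something like "probe_mount /mnt/data 10 || echo failed" (the mount point is only an example) would then return after at most about ten seconds instead of hanging. The blocked test process is left behind until the device comes back or is removed.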

If I instead fail the devices via /sys/block/$DEVICE/device/delete, the
touch test returns an error and the service switches over.

I found a reference in the Red Hat documentation to a parameter,
remove_on_dev_loss:
http://www.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/html/Online_Storage_Reconfiguration_Guide/modifying-link-loss-behavior.html

I set it with
echo 1 > /sys/module/scsi_transport_fc/parameters/remove_on_dev_loss
but didn't notice any change.
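
For reference, a small sketch of how the setting can be read back to confirm the write took effect. show_attr is a hypothetical helper name. The per-port dev_loss_tmo path under /sys/class/fc_remote_ports is an assumption about this system and may be absent if no FC remote ports are registered.

```shell
#!/bin/sh
# Hypothetical helper: print a sysfs attribute, or note that it is absent
# (for example when the FC transport module is not loaded).
show_attr() {
    if [ -r "$1" ]; then
        printf '%s = %s\n' "$1" "$(cat "$1")"
    else
        printf '%s: not readable\n' "$1"
    fi
}

# The module parameter set above.
show_attr /sys/module/scsi_transport_fc/parameters/remove_on_dev_loss

# Assumed location of the per-remote-port device loss timeout.
# If the glob matches nothing, the pattern itself is reported as not readable.
for f in /sys/class/fc_remote_ports/rport-*/dev_loss_tmo; do
    show_attr "$f"
done
```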

Any suggestions?

Thanks
Jose



