[Linux-cluster] self_fence for FS resource in RHEL 6.x operational?

Robert Hayden
Tue Jan 22 17:22:25 UTC 2013

I am testing RHCS 6.3 and found that the self_fence option for a file
system resource no longer functions as expected.  Before I log an SR
with RH, I was wondering if the design changed between RHEL 5 and RHEL 6.

In RHEL 5, I see logic in /usr/share/cluster/fs.sh that runs a
"reboot -fn" command when self_fence is set and the unmount fails.  In
RHEL 6, there is little to no logic around self_fence in the fs.sh file.
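
One way to compare the two releases is to search the agent directory for
every reference to self_fence, since the handling may live in a file other
than fs.sh.  A minimal sketch follows; the function name find_self_fence is
made up for illustration, and the default directory is simply the one
mentioned above.

```shell
#!/bin/sh
# List every line that mentions self_fence under an agent directory.
# find_self_fence is a hypothetical helper name, not part of rgmanager.
find_self_fence() {
    # Default to the directory named in this post; pass another to override.
    dir="${1:-/usr/share/cluster}"
    # -r recurses into subdirectories, -n prints line numbers.
    # Print a short notice instead of nothing when there are no matches.
    grep -rn 'self_fence' "$dir" 2>/dev/null \
        || echo "no self_fence references under $dir"
}
```

Running find_self_fence with no argument on a RHEL 5 node and on a RHEL 6
node and comparing the output should show whether the logic was removed or
only moved.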

Example of RHEL 5 logic in fs.sh that appears to be removed from RHEL 6:
        if [ -n "$umount_failed" ]; then
                ocf_log err "'umount $mp' failed, error=$ret_val"

                if [ "$self_fence" ]; then
                        ocf_log alert "umount failed - REBOOTING"
                        reboot -fn
                fi
                return $FAIL
        else
                return $SUCCESS
        fi

To test in RHEL 6, I simply create a file system resource (e.g. /test/data)
with self_fence="1" or self_fence="on" (as added by Conga).  I then mount a
small ISO image on top of the file system.  This extra mount prevents the
file system resource from being unmounted when the service stops, which
should trigger a self_fence.
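
The test steps above can be sketched as a small dry-run script.  This is
only an illustration under stated assumptions: the ISO path and the nested
directory (/test/data/iso) are hypothetical, the helper names are made up,
and the service name is taken from the log output in this post.  By default
it only prints the commands.

```shell
#!/bin/sh
# Dry-run sketch of the umount-failure test. It prints the commands unless
# DRY_RUN=0 is set, in which case it runs them (needs root, test node only).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise run it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical helper wrapping the three steps of the test.
reproduce_umount_failure() {
    service="$1"                 # rgmanager service name
    mountpoint="$2"              # mount point of the fs resource
    iso="$3"                     # ASSUMPTION: path to any small ISO image
    nested="$mountpoint/iso"     # ASSUMPTION: directory for the nested mount

    run mkdir -p "$nested"
    # A mount inside the file system keeps the parent mount busy.
    run mount -o loop,ro "$iso" "$nested"
    # Disabling the service makes rgmanager stop (and unmount) the resource.
    run clusvcadm -d "$service"
}
```

For example, reproduce_umount_failure fstest_node16 /test/data /tmp/small.iso
prints the mkdir, mount, and clusvcadm commands without running them.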

Testing RHEL 6, I see the following in /var/log/messages:

Jan 21 16:40:59 techval16 rgmanager[82637]: [fs] unmounting /test/data
Jan 21 16:40:59 techval16 rgmanager[82777]: [fs] Sending SIGTERM to processes on /test/data
Jan 21 16:41:04 techval16 rgmanager[82859]: [fs] unmounting /test/data
Jan 21 16:41:05 techval16 rgmanager[82900]: [fs] Sending SIGKILL to processes on /test/data
Jan 21 16:41:05 techval16 rgmanager[61929]: stop on fs "share16_data" returned 1 (generic error)
Jan 21 16:41:05 techval16 rgmanager[61929]: #12: RG service:fstest_node16 failed to stop; intervention required
Jan 21 16:41:05 techval16 rgmanager[61929]: Service service:fstest_node16 is failed

