[Linux-cluster] Fwd: High Available Transparent File System

yue ooolinux at 163.com
Sun Apr 10 15:08:05 UTC 2011


What is fencing?
Fencing is the act of forcefully removing a node from a cluster. A node with OCFS2 mounted will fence itself when it realizes that it doesn't have quorum in a degraded cluster. It does this so that other nodes won't get stuck trying to access its resources. Currently OCFS2 will panic the machine when it realizes it has to fence itself off from the cluster. As described above, it will do this when it sees more nodes heartbeating than it has connectivity to and fails the quorum test.
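
To make the quorum test concrete, here is a rough sketch of the decision a node makes. This message does not spell out the rule, so the rule used here is an assumption: a node keeps quorum if it is connected to more than half of the heartbeating nodes, or to exactly half of them including the lowest-numbered one. The function name and its arguments are hypothetical and are not part of OCFS2.

```shell
#!/bin/sh
# Hypothetical helper, not part of OCFS2: decide whether a node keeps quorum.
# Assumed rule: more than half of the heartbeating nodes are reachable, or
# exactly half are reachable and the lowest-numbered node is among them.
#   $1 = nodes seen heartbeating on disk (counting this node)
#   $2 = how many of those are reachable over the network (counting this node)
#   $3 = 1 if the lowest-numbered heartbeating node is reachable, else 0
has_quorum() {
    heartbeating=$1
    connected=$2
    has_lowest=$3
    if [ $((connected * 2)) -gt "$heartbeating" ]; then
        return 0    # connected to more than half
    fi
    if [ $((connected * 2)) -eq "$heartbeating" ] && [ "$has_lowest" -eq 1 ]; then
        return 0    # exactly half; tie broken by the lowest node number
    fi
    return 1        # quorum lost: the node would fence itself
}
```

For example, "has_quorum 3 1 1" fails, because a node that can reach only itself out of three heartbeating nodes is in the minority, and that is the situation in which it would fence itself.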
Due to user reports of nodes hanging during fencing, OCFS2 1.2.5 no longer uses "panic" for fencing. Instead, by default, it uses "machine restart". This should not only prevent nodes from hanging during fencing but also allow nodes to restart quickly and rejoin the cluster. While this change is internal in nature, we are documenting it to make users aware that they will no longer see the familiar panic stack trace during fencing. Instead they will see the message "*** ocfs2 is very sorry to be fencing this system by restarting ***", and probably only as part of the messages captured on the netdump/netconsole server.
If perchance the user wishes to use panic to fence (maybe to see the familiar oops stack trace, or on the advice of customer support to diagnose frequent reboots), one can do so by issuing the following command after the O2CB cluster is online.
	# echo 1 > /proc/fs/ocfs2_nodemanager/fence_method

Please note that this change is local to a node.
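
Because the setting is local to a node, the command has to be issued on every node where panic fencing is wanted. A minimal sketch of a wrapper that refuses to write when the file is not there (for instance because the O2CB cluster is not yet online) might look like the following. The function name and the FENCE_FILE override are hypothetical; only the path and the value 1 come from the text above.

```shell
#!/bin/sh
# Hypothetical wrapper around the echo command above; run it as root on each
# node that should fence by panic. FENCE_FILE is an illustrative override
# (handy for a dry run) and defaults to the path given in this message.
set_panic_fencing() {
    fence_file=${FENCE_FILE:-/proc/fs/ocfs2_nodemanager/fence_method}
    if [ ! -w "$fence_file" ]; then
        echo "cannot write $fence_file (is the O2CB cluster online?)" >&2
        return 1
    fi
    echo 1 > "$fence_file"
}
```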



At 2011-04-10 22:29:01,"Meisam Mohammadkhani" <meisam.mohammadkhani at gmail.com> wrote:

Hi All,


I'm new to GFS. I'm looking for a solution for our enterprise application, which is responsible for saving (and manipulating) historical data from industrial devices. Currently we have two stations that act as hot-redundant peers of each other. Our challenge is handling failures. For now, the application itself handles a fault by synchronizing the files that changed during the fault. It runs on two totally independent machines (one as the redundant), so each one has its own disk.
We are looking for something like a "high available transparent file system" that makes the fault transparent to the application, so that in case of a fault the redundant machine can still access the files even if the master machine is down (through replication or something similar).
Is there a fail-over feature in GFS that satisfies our requirement? In other words, can GFS help us in our case?

Regards
