[Linux-cluster] details of fencing

Thu Jul 12 08:03:46 UTC 2007

I am in the process of implementing clustering for shared data storage
across a number of nodes, with several nodes exporting large GNBD
volumes, and also new storage from an iSCSI RAID chassis with 6TB of
storage.  The nature of the application requires that the nodes that
access the data store are pretty much independent of each other, just
providing CPU and graphics support while reading several hundred
megabytes of image data in 32 MB chunks, and writing numerous small
summary files of this data.  Our current methodology, which works but is
slow, is to serve the data by NFS over gigabit ethernet. A similar
facility nearby, with the same application, has implemented GFS on FC
equipment, and is using the FC switch for fencing.  As I have somewhat
different storage hardware and data retention requirements, I need to
implement different fencing methods.

The storage network is on a 3Com switch, which is able to take down a
given link via a telnet command, and later restore it.  Also, each of
the storage nodes has a Smart UPS with control over the individual
outlets on the UPS, which could be used for power fencing of the GNBD
server nodes.  The only issue there is that these are not networked UPS
systems, but are connected via serial ports to some of the nodes.  The
issue with fencing at the network switch is that I am currently using
the storage net for cluster communications, so bringing down a port
also stops cluster communications for that node.

Each of the storage systems has at least two network interfaces (most
have 6 or more), one (or more) on the storage net, and one on our
intranet. The data processing units have two net interfaces, one on each
network.

I know I will probably have to write a fence agent for at least some
parts of this.  My questions are: what is the exact sequence of events
when a node is fenced, in particular which node initiates the fencing
operation, and what is the sequence of events for recovery and rejoining
the cluster after a reboot?  I currently have a test setup of four nodes
with a 4TB GNBD export from one of the nodes to the other three, using
fence_gnbd on those nodes, and fence_gnbd_notify with fence_manual on
the server, at least until I can get the UPS fence agent working.  If I
need to, I can put the UPS systems on a network terminal server to allow
any node to connect to the UPS for commands, but would prefer that it
connect to one of the cluster nodes directly using the serial port.  For
the iSCSI chassis, from the manual it appears that I can force an iSCSI
disconnect via SNMP or telnet using the management interface for the
chassis, which from my reading of the RFC, should be an effective fence
for iSCSI, as it will invalidate the current connection from the
initiator, and require re-authentication and renegotiation of the link
before allowing more communications with that node.
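
For reference, here is a rough sketch of the agent skeleton I have in
mind for the UPS and switch cases.  It assumes that fenced hands an
agent its options as name=value lines on standard input, which is my
understanding of how the existing agents work.  The option names
("port", "action") and the power_off/power_on callables are placeholders
of my own, not taken from any existing agent, and the actual serial or
telnet commands are deliberately left out.

```python
def parse_agent_options(stream):
    """Parse name=value lines into a dict, skipping blanks and # comments."""
    options = {}
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            continue  # ignore lines that are not name=value
        options[name.strip()] = value.strip()
    return options


def run_agent(options, power_off, power_on):
    """Dispatch on the requested action; return 0 on success, 1 on failure.

    power_off and power_on are hypothetical callables that take the port
    (outlet or switch port) and return True when the device confirms.
    """
    action = options.get("action", "reboot")  # placeholder option name
    port = options.get("port")                # placeholder option name
    if port is None:
        return 1
    if action == "off":
        ok = power_off(port)
    elif action == "on":
        ok = power_on(port)
    elif action == "reboot":
        # Only power back on if the power-off was confirmed.
        ok = power_off(port) and power_on(port)
    else:
        return 1
    return 0 if ok else 1
```

A real agent would call parse_agent_options(sys.stdin) and exit with the
value returned by run_agent, so that a failed or unconfirmed fence is
reported as a nonzero exit status.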

Hopefully, this gives enough information to at least get a start on this,
as there are several issues here, each of which may need a separate
follow-up.

Sincerely

James Fait
-- 
James Fait, Ph.D.
Beamline Scientist, SER-CAT
APS, building 436B-008
Argonne National Laboratory
9700 S Cass Ave
Argonne, IL 60439
phone 630-252-0644
fax   630-252-0652
email fait at anl.gov