[Linux-cluster] Starter Cluster / GFS

Gordan Bobic gordan at bobich.net
Thu Nov 11 09:04:20 UTC 2010


Digimer wrote:
> On 10-11-10 11:09 AM, Gordan Bobic wrote:
>> Digimer wrote:
>>> On 10-11-10 07:17 AM, Gordan Bobic wrote:
>>>>>> If you want the FS mounted on all nodes at the same time then all
>>>>>> those nodes must be a part of the cluster, and they have to be
>>>>>> quorate (majority of nodes have to be up). You don't need a quorum
>>>>>> block device, but it can be useful when you have only 2 nodes.
>>>>> Eventually I will have 7 to 10 nodes, but 2 at first for initial
>>>>> setup and testing. OK, so if I have a 3-node cluster, for example, I
>>>>> need at least 2 nodes for the cluster, and thus the GFS, to be up? I
>>>>> cannot have a running GFS with only one node?
>>>> In a 2-node cluster, you can have GFS running with just one node up,
>>>> but in that case it is advisable to have a quorum block device on the
>>>> SAN. With a 3-node cluster, you cannot have quorum with just 1 node,
>>>> and thus you cannot have GFS running. It will block until quorum is
>>>> re-established.
>>> With a quorum disk, you can in fact have one node left and still have
>>> quorum. This is because the quorum disk should have (nodes - 1) votes,
>>> thus always giving the last node 50% + 1 even with all other nodes
>>> dead.
>> I've never tried testing that use-case extensively, but I suspect that
>> it is only safe to do with SAN-side fencing. Otherwise two nodes could
>> lose contact with each other and still both have access to the SAN and
>> thus both be individually quorate.
>>
>> Gordan
> 
> Clustered storage *requires* fencing. Not using fencing is like driving
> tired; it's just a matter of time before something bad happens. That
> said, I should have been clearer in specifying the requirement for
> fencing.
> 
> Now, that said, fencing shouldn't need to be done at the SAN side,
> though that works fine as well.

The default fencing action, last time I checked, is reboot. Consider the
case where you have separate networks for various things and a network
failure cuts connectivity between the nodes, while they all still have
access to the SAN. One node gets fenced, reboots, comes up and connects
to the SAN. It reaches the quorum device, gains quorum without the other
nodes, mounts the file systems and starts writing - while all the other
nodes that have been partitioned off do the same thing. Unless you can
fence the nodes from the SAN side, a quorum device carrying 50% of the
votes is a recipe for disaster.
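
To make the vote arithmetic concrete, here is a minimal sketch of the
kind of configuration that invites this failure mode (the node names
and vote counts are hypothetical, purely illustrative). Three nodes
with one vote each plus a quorum disk carrying (nodes - 1) = 2 votes
gives expected_votes = 5 and a quorum threshold of 5/2 + 1 = 3:

  <!-- Hypothetical cluster.conf fragment; illustrative only. -->
  <cman expected_votes="5"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1"/>
    <clusternode name="node2" nodeid="2" votes="1"/>
    <clusternode name="node3" nodeid="3" votes="1"/>
  </clusternodes>
  <!-- Quorum disk weighted at (nodes - 1) votes, per the scheme above. -->
  <quorumd label="qdisk" votes="2"/>

If the interconnect fails but every node can still reach the SAN, node1
alone can count 1 + 2 = 3 votes while node2 and node3 can count
2 + 2 = 4, so both partitions can end up quorate, both mount GFS, and
both write.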

> The way it works is:
[...]

I'm well aware of how fencing works, but you overlooked one major 
failure mode that is essentially guaranteed to hose your data if you set 
up the quorum device to have 50% of the votes.

> With SAN-side fencing, a fence takes the form of a logical disconnection
> from the storage network. This has no inherent mechanism for recovery,
> so the sysadmin has to manually recover the node(s). For this reason, I
> do not prefer it.

Then don't give the quorum device more weight than an individual node.
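
For the two-node case, where a quorum disk genuinely helps, a minimal
sketch looks like this (again, the names and values are hypothetical):
each node gets one vote and the quorum disk gets exactly one, so
expected_votes = 3 and quorum needs 2 votes. The node that holds the
quorum disk stays quorate on its own, a node without it does not, and
qdiskd elects its master through the shared disk itself, so only one
side of a partition can count that vote.

  <!-- Hypothetical two-node cluster.conf fragment; illustrative only. -->
  <cman expected_votes="3"/>
  <clusternodes>
    <clusternode name="node1" nodeid="1" votes="1"/>
    <clusternode name="node2" nodeid="2" votes="1"/>
  </clusternodes>
  <!-- Quorum disk as a tiebreaker, no heavier than a single node. -->
  <quorumd interval="1" tko="10" votes="1" label="qdisk"/>

If you do want the heavier (nodes - 1) weighting, SAN-side fencing (for
example the fence_scsi agent, which uses SCSI-3 persistent reservations)
is what makes it safe, at the cost of the manual recovery you describe.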

Gordan



