[Linux-cluster] Two-node cluster: Node attempts stateful merge after clean reboot

emmanuel segura emi2fast at gmail.com
Wed Sep 11 22:57:49 UTC 2013


Consider that the hung/failed node was in the middle of a write to the SAN
and froze. Now imagine that at some point in the future it recovers. Having
no idea that time has passed, it has no reason to doubt that its locks are
still valid, so it just finishes the writes. Congrats, you could have just
corrupted your storage.

UMMMMMM

I use ext3 (LV) -> VG (exclusive=true with clvmd) -> PV -> multipath -> SAN.
As you know, the Red Hat cluster only supports failover resources, so your
example is not very clear to me. How can I corrupt the storage with
clean_start=1?
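
For reference, the settings named in this thread would sit in cluster.conf
roughly as shown here. This is only a sketch and not a copy of any production
config: the cluster name, node names, qdisk label and timing values are
hypothetical, the fence devices and the rgmanager section are left out, and
expected_votes="3" assumes one vote per node plus one for the quorum disk.

```xml
<?xml version="1.0"?>
<cluster name="example" config_version="1">
  <!-- two node votes + one qdisk vote, so expected_votes="1" is not used -->
  <cman expected_votes="3" two_node="0"/>
  <!-- clean_start="1" skips startup fencing, the setting debated here -->
  <fence_daemon clean_start="1" post_join_delay="30"/>
  <!-- master_wins="1": only the qdiskd master advertises the qdisk vote -->
  <quorumd label="example_qdisk" votes="1" interval="1" tko="10" master_wins="1"/>
  <clusternodes>
    <clusternode name="node1.example.com" nodeid="1" votes="1"/>
    <clusternode name="node2.example.com" nodeid="2" votes="1"/>
  </clusternodes>
</cluster>
```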


2013/9/12 Digimer <lists at alteeve.ca>

> The problem that Pascal has is that the node sees the peer, joins and
> fences anyway. So in this case, clean_start won't help.
>
> Even with a SAN/qdisk though, it's not needed to enable this. If the
> remaining node can't talk to qdisk, it won't have quorum and will not be
> offering services, so fencing it won't hurt. It's *always* better to put
> nodes into a known state, regardless of quorum.
>
> Consider that the hung/failed node was in the middle of a write to the SAN
> and froze. Now imagine that at some point in the future it recovers. Having
> no idea that time has passed, it has no reason to doubt that its locks are
> still valid, so it just finishes the writes. Congrats, you could have just
> corrupted your storage.
>
> _Never_ assume _anything_.
>
> "The only thing you don't know is what you don't know."
>
> digimer
>
> On 11/09/13 18:24, emmanuel segura wrote:
>
>> Fixed previous mail
>>
>> clean_start=1 disables startup fencing. If you use a quorum disk in your
>> cluster without expected_votes=1, then when a node starts after it has
>> been fenced, it doesn't try to fence the remaining node and doesn't try
>> to start the services, because rgmanager needs a quorate cluster. Many
>> people say clean_start=1 is dangerous, but no one gives a clear
>> reason. In my production cluster I have clvm + VG in exclusive
>> mode + (clean_start=1) + (master_wins). So if you can explain to me
>> where the problem is :) I would appreciate it.
>>
>>
>>
>> 2013/9/11 Digimer <lists at alteeve.ca>
>>
>>     On 11/09/13 12:04, emmanuel segura wrote:
>>
>>         Hello Pascal,
>>
>>         To disable startup fencing you need clean_start=1 in the
>>         fence_daemon tag. I saw in your previous mail that you are
>>         using expected_votes="1". With this setting the nodes can be
>>         partitioned into two clusters that operate independently.
>>         I recommend using a quorum disk with the master_wins
>>         parameter.
>>
>>
>>     This is a very bad idea and is asking for a split-brain, the main
>>     reason fencing exists at all.
>>
>>
>>     --
>>     Digimer
>>     Papers and Projects: https://alteeve.ca/w/
>>     What if the cure for cancer is trapped in the mind of a person
>>     without access to education?
>>
>>
>>
>>
>> --
>> This is my life, and I live it for as long as God wills
>>
>
>



-- 
This is my life, and I live it for as long as God wills