[Linux-cluster] Node is randomly fenced

Schaefer, Micah Micah.Schaefer at jhuapl.edu
Thu Jun 12 17:24:03 UTC 2014


The servers run no tasks other than those in the cluster service group.

Nodes 3 and 4 are physical servers with a lot of horsepower, and nodes 1
and 2 are virtual machines with far fewer resources available.

I adjusted the token settings and will watch for any change.
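
For anyone following along, a token adjustment of this kind might look
roughly like the fragment below. This is a minimal sketch, assuming a
cman-based cluster configured through /etc/cluster/cluster.conf; the
values shown are illustrative assumptions, not the settings applied on
these nodes.

```xml
<!-- Fragment of /etc/cluster/cluster.conf; goes inside the <cluster> element. -->
<!-- Values are illustrative assumptions, not the settings used in this thread. -->
<!-- token: totem token timeout, in milliseconds. -->
<!-- token_retransmits_before_loss_const: token retransmit attempts before -->
<!-- a new membership configuration is formed. -->
<totem token="30000" token_retransmits_before_loss_const="10"/>
```

On such a setup the config_version attribute of the <cluster> element
would also need to be incremented and the file propagated to every node
before the new values take effect.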

On 6/12/14, 1:08 PM, "Digimer" <lists at alteeve.ca> wrote:

>On 12/06/14 12:48 PM, Schaefer, Micah wrote:
>> As far as the switch goes, both are Cisco Catalyst 6509-E, no spanning
>> tree changes are happening and all the ports have port-fast enabled for
>> these servers. My switch logging level is very high and I have no
>> messages in relation to the time frames or ports.
>>
>> TOTEM reports that "A processor joined or left the membership...", but
>> that isn't enough detail.
>>
>> Also note that I did not have these issues until adding new servers:
>> node3 and node4 to the cluster. Node1 and node2 do not fence each other
>> (unless a real issue is there), and they are on different switches.
>
>Then I can't imagine it being network anymore. Seeing as both node 3 and
>4 get fenced, it's likely not hardware either. Are the workloads on 3
>and 4 much higher (or are the computers much slower) than 1 and 2? I'm
>wondering if the nodes are simply not keeping up with corosync traffic.
>You might try adjusting the corosync token timeout and retransmit counts
>to see if that reduces the node losses.
>
>-- 
>Digimer
>Papers and Projects: https://alteeve.ca/w/
>What if the cure for cancer is trapped in the mind of a person without
>access to education?
>
>-- 
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>https://www.redhat.com/mailman/listinfo/linux-cluster