[Linux-cluster] Problem in cluster with xen kernel

Nuno Fernandes npf at eurotux.com
Tue Apr 3 14:02:39 UTC 2007


On Tuesday 03 April 2007 11:49:31 carlopmart wrote:
> Nuno Fernandes wrote:
> > Hi,
> >
> > Just for your information, we've solved it. The problem was that the Xen
> > bridge scripts restarted the network interfaces while the cluster was
> > active.
> >
> > Changing the /etc/xen/xend-config.sxp line
> >
> > (network-script network-bridge)
> >
> > to
> >
> > (network-script /bin/true)
> >
> > and creating the bridge in /etc/sysconfig/network-scripts/ifcfg-* files
> > solved it.
> >
> > Thanks
> > Nuno Fernandes
> >
> > On Tuesday 03 April 2007 10:12:20 Nuno Fernandes wrote:
> >> Hi,
> >>
> >> I'm using the default RHEL5 kernel and everything seems OK.
> >>
> >> [root@xen1 ~]# clustat
> >> Member Status: Quorate
> >>
> >>   Member Name                        ID   Status
> >>   ------ ----                        ---- ------
> >>   xen1.dc.server.pt                      1 Online, Local
> >>   xen2.dc.server.pt                      2 Online
> >>   xen3.dc.server.pt                      3 Online
> >>
> >> Later on, I reboot xen3 into a Dom0 kernel and get this in the xen1 logs:
> >>
> >> Dec 19 23:02:47 xen1 openais[2747]: [TOTEM] The token was lost in the
> >> OPERATIONAL state.
> >> Dec 19 23:02:47 xen1 openais[2747]: [TOTEM] Receive multicast socket
> >> recv buffer size (262142 bytes).
> >> Dec 19 23:02:47 xen1 openais[2747]: [TOTEM] Transmit multicast socket
> >> send buffer size (262142 bytes).
> >> Dec 19 23:02:47 xen1 openais[2747]: [TOTEM] entering GATHER state from
> >> 2.
> >>
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] entering GATHER state from 0.
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Creating commit token
> >> because I am the rep.
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Saving state aru 2f high seq
> >> received 2f
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] entering COMMIT state.
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] position [0] member 172.16.40.107:
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] previous ring seq 84 rep 172.16.40.107
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] aru 2f high delivered 2f received flag 0
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] position [1] member 172.16.40.108:
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] previous ring seq 84 rep 172.16.40.107
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] aru 2f high delivered 2f received flag 0
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Did not need to originate
> >> any messages in recovery.
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Storing new sequence id for
> >> ring 58
> >> Dec 19 23:02:52 xen1 kernel: dlm: closing connection to node 3
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] Sending initial ORF token
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:02:52 xen1 openais[2747]: [SYNC ] This node is within the
> >> primary component and will provide service.
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:02:52 xen1 openais[2747]: [SYNC ] This node is within the
> >> primary component and will provide service.
> >> Dec 19 23:02:52 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
> >> Dec 19 23:02:52 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.107
> >> Dec 19 23:02:53 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.108
> >> Dec 19 23:02:53 xen1 openais[2747]: [CPG  ] got joinlist message from node 2
> >> Dec 19 23:02:53 xen1 openais[2747]: [CPG  ] got joinlist message from node 1
> >>
> >> So far so good, xen3 is offline while it reboots...
> >>
> >> [root@xen1 ~]# clustat
> >> Member Status: Quorate
> >>
> >>   Member Name                        ID   Status
> >>   ------ ----                        ---- ------
> >>   xen1.dc.server.pt                      1 Online, Local
> >>   xen2.dc.server.pt                      2 Online
> >>   xen3.dc.server.pt                      3 Offline
> >>
> >> After it reboots, I see the node join in the xen1 server logs:
> >>
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] entering GATHER state from 11.
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Creating commit token because I am the rep.
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Saving state aru 17 high seq
> >> received 17
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] entering COMMIT state.
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] position [0] member 172.16.40.107:
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] previous ring seq 88 rep 172.16.40.107
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] aru 17 high delivered 17 received flag 0
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] position [1] member 172.16.40.108:
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] previous ring seq 88 rep 172.16.40.107
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] aru 17 high delivered 17 received flag 0
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] position [2] member 172.16.40.116:
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] previous ring seq 4 rep 172.16.40.116
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] aru 9 high delivered 9 received flag 0
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Did not need to originate
> >> any messages in recovery.
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Storing new sequence id for
> >> ring 5c
> >> Dec 19 23:05:03 xen1 openais[2747]: [TOTEM] Sending initial ORF token
> >> Dec 19 23:05:03 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:05:03 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:05:03 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:05:03 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:05:04 xen1 openais[2747]: [SYNC ] This node is within the
> >> primary component and will provide service.
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
> >> Dec 19 23:05:04 xen1 openais[2747]: [SYNC ] This node is within the
> >> primary component and will provide service.
> >> Dec 19 23:05:04 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.107
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.108
> >> Dec 19 23:05:04 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.116
> >> Dec 19 23:05:04 xen1 openais[2747]: [CPG  ] got joinlist message from node 1
> >> Dec 19 23:05:04 xen1 openais[2747]: [CPG  ] got joinlist message from node 2
> >> Dec 19 23:05:12 xen1 kernel: dlm: connecting to 3
> >> Dec 19 23:05:12 xen1 kernel: dlm: got connection from 3
> >>
> >> Clustat also reports ok status:
> >>
> >> [root@xen1 ~]# clustat
> >> Member Status: Quorate
> >>
> >>   Member Name                        ID   Status
> >>   ------ ----                        ---- ------
> >>   xen1.dc.server.pt                      1 Online, Local
> >>   xen2.dc.server.pt                      2 Online
> >>   xen3.dc.server.pt                      3 Online
> >>
> >> Everything ok so far...
> >>
> >> Next I reboot xen2. When xen2 leaves, xen1 complains that it cannot speak
> >> with xen3 and fences it.
> >>
> >> Dec 19 23:08:48 xen1 openais[2747]: [TOTEM] Retransmit List: 32
> >> Dec 19 23:08:48 xen1 openais[2747]: [TOTEM] Retransmit List: 32
> >> Dec 19 23:08:48 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:55 xen1 last message repeated 47 times
> >> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:55 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:56 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:57 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:58 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Retransmit List: 32 33 34
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] FAILED TO RECEIVE
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering GATHER state from 6.
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering GATHER state from 11.
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Creating commit token because I am the rep.
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Saving state aru 34 high seq
> >> received 34
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering COMMIT state.
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] position [0] member 172.16.40.107:
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] previous ring seq 92 rep 172.16.40.107
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] aru 34 high delivered 34 received flag 0
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] position [1] member 172.16.40.108:
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] previous ring seq 92 rep 172.16.40.107
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] aru 34 high delivered 34 received flag 0
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Did not need to originate
> >> any messages in recovery.
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Storing new sequence id for
> >> ring 60
> >> Dec 19 23:08:59 xen1 openais[2747]: [TOTEM] Sending initial ORF token
> >> Dec 19 23:08:59 xen1 kernel: dlm: closing connection to node 3
> >>
> >>
> >> Dec 19 23:08:59 xen1 fenced[2763]: xen3.dc.aeiou.pt not a cluster member
> >> after 0 sec post_fail_delay
> >>
> >>
> >>
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:09:00 xen1 fenced[2763]: xen2.dc.aeiou.pt not a cluster member
> >> after 0 sec post_fail_delay
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:09:00 xen1 fenced[2763]: fencing node "xen3.dc.aeiou.pt"
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:09:00 xen1 openais[2747]: [SYNC ] This node is within the
> >> primary component and will provide service.
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:09:00 xen1 openais[2747]: [SYNC ] This node is within the
> >> primary component and will provide service.
> >> Dec 19 23:09:00 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.107
> >> Dec 19 23:09:00 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.108
> >> Dec 19 23:09:00 xen1 openais[2747]: [CPG  ] got joinlist message from node 2
> >> Dec 19 23:09:00 xen1 openais[2747]: [CPG  ] got joinlist message from node 1
> >> Dec 19 23:09:05 xen1 openais[2747]: [TOTEM] entering GATHER state from 11.
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] entering GATHER state from 0.
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Creating commit token because I am the rep.
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Saving state aru 1a high seq
> >> received 1a
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] entering COMMIT state.
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] position [0] member 172.16.40.107:
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] previous ring seq 96 rep 172.16.40.107
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] aru 1a high delivered 1a received flag 0
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] position [1] member 172.16.40.116:
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] previous ring seq 92 rep 172.16.40.107
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] aru 31 high delivered 31 received flag 0
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Did not need to originate
> >> any messages in recovery.
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Storing new sequence id for
> >> ring 64
> >> Dec 19 23:09:09 xen1 kernel: dlm: closing connection to node 2
> >> Dec 19 23:09:09 xen1 openais[2747]: [TOTEM] Sending initial ORF token
> >> Dec 19 23:09:09 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.108)
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:09:10 xen1 openais[2747]: [CMAN ] quorum lost, blocking activity
> >> Dec 19 23:09:10 xen1 openais[2747]: [SYNC ] This node is within the primary component and will provide service.
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
> >> Dec 19 23:09:10 xen1 openais[2747]: [SYNC ] This node is within the
> >> primary component and will provide service.
> >> Dec 19 23:09:10 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
> >> Dec 19 23:09:10 xen1 openais[2747]: [MAIN ] Node xen3.dc.aeiou.pt not
> >> joined to cman because it has rejoined an inquorate cluster
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.107
> >> Dec 19 23:09:10 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.116
> >> Dec 19 23:09:10 xen1 openais[2747]: [CPG  ] got joinlist message from node 3
> >> Dec 19 23:09:10 xen1 openais[2747]: [CPG  ] got joinlist message from node 1
> >> Dec 19 23:09:14 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:14 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:09:19 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:19 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:09:24 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:24 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:09:29 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:29 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:09:34 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:34 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:09:36 xen1 openais[2747]: [TOTEM] The token was lost in the
> >> OPERATIONAL state.
> >> Dec 19 23:09:36 xen1 openais[2747]: [TOTEM] Receive multicast socket
> >> recv buffer size (262142 bytes).
> >> Dec 19 23:09:36 xen1 openais[2747]: [TOTEM] Transmit multicast socket
> >> send buffer size (262142 bytes).
> >> Dec 19 23:09:36 xen1 openais[2747]: [TOTEM] entering GATHER state from 2.
> >> Dec 19 23:09:39 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:39 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] entering GATHER state from 0.
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Creating commit token because I am the rep.
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Saving state aru 18 high seq
> >> received 18
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] entering COMMIT state.
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] entering RECOVERY state.
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] position [0] member 172.16.40.107:
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] previous ring seq 100 rep 172.16.40.107
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] aru 18 high delivered 18 received flag 0
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Did not need to originate
> >> any messages in recovery.
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Storing new sequence id for
> >> ring 68
> >> Dec 19 23:09:40 xen1 openais[2747]: [TOTEM] Sending initial ORF token
> >> Dec 19 23:09:40 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:09:40 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.116)
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:09:41 xen1 openais[2747]: [SYNC ] This node is within the
> >> primary component and will provide service.
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] CLM CONFIGURATION CHANGE
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] New Configuration:
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ]     r(0) ip(172.16.40.107)
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] Members Left:
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] Members Joined:
> >> Dec 19 23:09:41 xen1 openais[2747]: [SYNC ] This node is within the
> >> primary component and will provide service.
> >> Dec 19 23:09:41 xen1 openais[2747]: [TOTEM] entering OPERATIONAL state.
> >> Dec 19 23:09:41 xen1 openais[2747]: [CLM  ] got nodejoin message 172.16.40.107
> >> Dec 19 23:09:41 xen1 openais[2747]: [CPG  ] got joinlist message from node 1
> >> Dec 19 23:09:44 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:44 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:09:49 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:49 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:09:54 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:54 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:09:59 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:09:59 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:10:04 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:10:04 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >> Dec 19 23:10:09 xen1 ccsd[2740]: Cluster is not quorate.  Refusing connection.
> >> Dec 19 23:10:09 xen1 ccsd[2740]: Error while processing connect: Connection refused
> >>
> >> The last errors are expected because the cluster isn't quorate anymore.
> >> xen2 was rebooting and xen3 was fenced, so xen1 was left alone in an
> >> inquorate cluster...
> >>
> >> The unusual thing is that it only happens when one of the nodes is running
> >> the RHEL5 Xen kernel. Maybe a bug involving bridge-utils and multicast?
> >> The same problem happens if I reboot the xen1 or the xen2 server with the
> >> Xen kernel.
> >>
> >>
> >> Any intel?
> >>
> >> Thanks
> >> Nuno Fernandes
>
> Hi Nuno,
>
>   Sorry for this question: how did you create the Xen bridges with
> ifcfg files? I am trying to do the same, but it doesn't work for me ...
>

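A note on the quorum loss at the end of the quoted logs: assuming each of
the three nodes carries one vote and there is no quorum disk, a simple
majority rule needs 2 of 3 votes, so xen1 on its own (1 vote) cannot stay
quorate. The arithmetic as a small shell sketch (the function names are
hypothetical and this is not cman's actual code):

```shell
#!/bin/sh
# Majority quorum: strictly more than half of the expected votes.
# Assumption: one vote per node, no quorum disk.
quorum_needed() {   # arg: expected_votes
    echo $(( $1 / 2 + 1 ))
}

is_quorate() {      # args: votes_present expected_votes
    [ "$1" -ge "$(quorum_needed "$2")" ]
}
```

With three expected votes, quorum_needed 3 prints 2; is_quorate 2 3
succeeds and is_quorate 1 3 fails.
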
Create/Edit the files:

::::::::::::::
File: /etc/sysconfig/network-scripts/ifcfg-eth0
::::::::::::::
DEVICE=eth0
ONBOOT=yes
BRIDGE=xenbr0
HWADDR=XX:XX:XX:XX:XX:XX

where XX:XX:XX:XX:XX:XX is the MAC address of your network card.

::::::::::::::
File: /etc/sysconfig/network-scripts/ifcfg-xenbr0
::::::::::::::

DEVICE=xenbr0
ONBOOT=yes
BOOTPROTO=static
IPADDR=172.16.40.116
NETMASK=255.255.255.0
GATEWAY=172.16.40.254
TYPE=Bridge
DELAY=0

The IP addresses are for my network. You have to change them to match yours.
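
When adapting the addresses, one easy mistake is a GATEWAY that is not
inside the bridge's subnet. A minimal shell sketch of that check follows;
the function names ip_to_int and same_subnet are made up for illustration
and are not part of the initscripts, and the sketch assumes plain
dotted-quad values without leading zeros.

```shell
#!/bin/sh
# Convert a dotted-quad IPv4 address to an integer.
# The body runs in a subshell, so the IFS change does not leak out.
ip_to_int() (
    IFS=.
    set -- $1
    echo $(( ($1 << 24) + ($2 << 16) + ($3 << 8) + $4 ))
)

# Succeed if both addresses fall in the same network under the netmask.
same_subnet() {     # args: address1 address2 netmask
    a=$(ip_to_int "$1")
    b=$(ip_to_int "$2")
    m=$(ip_to_int "$3")
    [ $(( a & m )) -eq $(( b & m )) ]
}
```

With the values above, same_subnet 172.16.40.116 172.16.40.254
255.255.255.0 succeeds (exit status 0).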

And then:

service network restart
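
These ifcfg files assume that xend no longer builds the bridge itself, i.e.
that the (network-script /bin/true) change quoted above is already in
place. A hedged sketch of that edit in shell; the function name and the
.orig backup suffix are illustrative choices, and it only touches an
uncommented (network-script network-bridge) line:

```shell
#!/bin/sh
# Sketch: make xend skip its own bridge script, keeping a backup copy.
# Defaults to the path named in this thread; pass another path to try it
# on a copy first.
disable_xend_bridge_script() {
    conf="${1:-/etc/xen/xend-config.sxp}"
    # Keep the original next to the edited file.
    cp -p "$conf" "$conf.orig" || return 1
    # Rewrite the stock setting to a no-op script.
    sed 's|^(network-script[[:space:]][[:space:]]*network-bridge)|(network-script /bin/true)|' \
        "$conf.orig" > "$conf"
}
```

Run it as root with no argument to edit /etc/xen/xend-config.sxp. xend
reads the file at startup, so the change should only apply once xend has
been restarted or the host rebooted; afterwards "brctl show" should list
xenbr0 with eth0 attached.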

Rgds
Nuno Fernandes


-- 
Nuno Pais Fernandes
Cisco Certified Network Associate
Oracle Certified Professional
Eurotux Informatica S.A.
Tel: +351 253257395
Fax: +351 253257396