[Linux-cluster] Network failure results cluster environment unstable & fragile

Deval kulshrestha deval.kulshrestha at progression.com
Mon Feb 27 05:55:58 UTC 2006


Hi,

By "network goes off" I mean the case where traffic on the network is so
heavy that there is no room for cluster messages to reach all cluster
members. This causes real chaos, and I have not been able to find a way out
of it.
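
To make the failure mode concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not how Red Hat Cluster Suite is implemented: it assumes membership is decided by a heartbeat deadline, so a peer whose heartbeats are delayed by congestion looks the same as a peer that is down. The function name `suspected_dead`, the node names, the timestamps, and the 10-second deadline are all hypothetical.

```python
def suspected_dead(last_heard: dict[str, float], now: float, deadline: float) -> list[str]:
    """Return peers whose most recent heartbeat is older than the deadline."""
    return sorted(peer for peer, seen in last_heard.items() if now - seen > deadline)


if __name__ == "__main__":
    # Hypothetical timestamps in seconds; node2's heartbeats are stuck behind heavy traffic.
    last_heard = {"node1": 98.0, "node2": 85.0}
    print(suspected_dead(last_heard, now=100.0, deadline=10.0))  # ['node2']
```

In this sketch node2 is still running, but its last heartbeat is 15 seconds old, so it is treated as failed.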

 

Even on this list, apart from one user, Jeff Herr, no one else has raised
this issue.

 

An enterprise-class solution cannot behave like this.

 

With regards

 

Deval Kulshrestha

 

 

-----Original Message-----
From: Hirantha Wijayawardena [mailto:hirantha at vcs.informatics.lk] 
Sent: Monday, February 27, 2006 9:23 AM
To: deval.kulshrestha at progression.com; linux clustering
Subject: RE: [Linux-cluster] Network failure results cluster environment
unstable & fragile

 

Hi,


I'm no expert on clustering. I have the same scenario as you do, but I'm not
sure what you mean by "network goes off". Assuming you mean an unplugged
network cable, a failed network switch, or a failed NIC, I succeeded with
bonded NICs and a redundant switch.


Someone else may be able to advise on other approaches.




Hope this is helpful



Thanks and regards


- Hirantha




----- Original Message -----

From: Deval kulshrestha <deval.kulshrestha at progression.com>
To: 'linux clustering' <linux-cluster at redhat.com>
Cc:
Date: Saturday, February 25 2006 11:40 AM
Subject: RE: [Linux-cluster] Network failure results cluster environment
unstable & fragile

Please help me to resolve my problem 


If the network goes down on node1, node1 starts the services that were not
running on it, along with their shared storage mount point, even though
node2 is already running those same services with the same mount point. The
two nodes cannot communicate with each other, so because of fencing each of
them tries to kill the other. Both of them hang at "Stoping Cluster manager
Services.". /var/log/messages shows fencing s1, fence successful.

If we disable fencing, then when the network comes back the nodes do not
synchronize with each other. The shared storage mount point is available to
both servers, and if they access the storage at the same time it gives I/O
errors. Hence the entire setup becomes very unstable and fragile.
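
The situation described above is commonly called split-brain. As an illustration only (this is not Red Hat Cluster Suite code), here is a minimal Python sketch of majority-vote quorum, assuming one vote per node. The function names `quorum_threshold` and `partition_has_quorum` are made up for this example. It shows why a two-node cluster that splits 1/1 cannot use a plain majority to decide which side keeps the services.

```python
def quorum_threshold(total_votes: int) -> int:
    """Smallest number of votes that forms a strict majority."""
    return total_votes // 2 + 1


def partition_has_quorum(partition_votes: int, total_votes: int) -> bool:
    """True if this side of a network split holds a strict majority."""
    return partition_votes >= quorum_threshold(total_votes)


if __name__ == "__main__":
    # Two-node cluster split into 1 + 1: neither side has a majority.
    print(partition_has_quorum(1, 2))  # False
    # Three-node cluster split into 2 + 1: only the larger side continues.
    print(partition_has_quorum(2, 3))  # True
    print(partition_has_quorum(1, 3))  # False
```

With only two nodes, a majority rule alone cannot pick a survivor, which is consistent with both nodes trying to fence each other as reported here.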

--- Deval kulshrestha
<deval.kulshrestha at progression.com> wrote:

> Hi
>
> I am struggling to get some help with the following configuration. This
> setup is intended to go live in a data center running 24x7x365, so any
> issue that makes my environment unstable is critical.
> 
> My HA cluster setup details:
>
> 1. HP DL 360 G4p server (2 units)
> 2. HP MSA 500 G2 (SAN) (1 unit)
> 3. Red Hat Enterprise Linux 4 ES
> 4. Red Hat Cluster Suite 4
> 
> 
> Each server has an HP SCSI HBA. The MSA 500 G2 is a SCSI-based SAN, and
> both servers are connected to it using SCSI VHDCI cables. I used a network
> switch to establish network connectivity for the servers. I created a disk
> array of three HDDs on the SAN with two logical volumes, then installed
> RHEL 4 Update 1 on both servers (the servers are configured with RAID 1),
> then installed all HP drivers and management agents. After server
> configuration and OS installation I installed Red Hat Cluster Suite v4 on
> both machines.
> 
> 
> 
> Then I configured the cluster using Cluster Configuration Manager: added
> the member hosts, configured the fence device and assigned it to the
> member hosts (HP iLO is certified as a fence device), configured a
> failover domain with node priority, configured resources such as a
> floating IP address, file system, and script, and then configured the
> service that needs to run in HA mode.
> 
> 
> 
> After configuring this I tested various scenarios. HA is working
> properly: whenever either machine is powered off, its services fail over
> to the available node.
> 
> Problem:
>
> If the network goes down on node1, node1 starts the services that were
> not running on it, along with their shared storage mount point, even
> though node2 is already running those same services with the same mount
> point. The two nodes cannot communicate with each other, so because of
> fencing each of them tries to kill the other. Both of them hang at
> "Stoping Cluster manager Services.". /var/log/messages shows fencing s1,
> fence successful.
>
> If we disable fencing, then when the network comes back the nodes do not
> synchronize with each other. The shared storage mount point is available
> to both servers, and if they access the storage at the same time it gives
> I/O errors. Hence the entire setup becomes very unstable and fragile.
> 
> With Regard
> 
> Deval
> 
> Progression Infonet Pvt. Ltd. 
> 55, Independent Electronic Modules, 
> Sector - 18, Electronic City, 
> Gurgaon - 122015
> 
> India
> Tel : - 0124 - 2455070, Ext. 215, Fax: 91-124-2398647
> Mobile : - 98186 -82509 
> URL : - www.progression.com 
> 
> 
> 
> ===========================================================
> Privileged or confidential information may be contained
> in this message. If you are not the addressee indicated
> in this message (or responsible for delivery of the
> message to such person), please delete this message and
> kindly notify the sender by an emailed reply. Opinions,
> conclusions and other information in this message that
> do not relate to the official business of Progression
> and its associate entities shall be understood as neither
> given nor endorsed by them.
>
> -------------------------------------------------------------
> Progression Infonet Private Limited, Gurgaon (Haryana), India
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster






 


e'switch 2.7.9 w/ MySQL on RedHat Enterprise 4 in production

 

_________________________________________________________________ 

Disclaimer and Confidentiality

 

This email and any files transmitted with it are confidential and intended
solely for the use of the individual or entity to whom they are addressed.
If you have received this email in error please notify the system manager.
This message contains confidential information and is intended only for the
individual named. If you are not the named addressee you should not
disseminate, distribute or copy this e-mail. Please notify the sender
immediately by e-mail if you have received this e-mail by mistake and delete
this e-mail from your system. If you are not the intended recipient you are
notified that disclosing, copying, distributing or taking any action in
reliance on the contents of this information is strictly prohibited.
