[Linux-cluster] Cluster environment issue

Srija swap_project at yahoo.com
Tue May 31 13:51:35 UTC 2011


Thanks again for the reply.

Yes, this cluster environment consists of Xen hosts. When the cluster breaks apart, all of the guests are still pingable, so there is no issue there. But, as I said, the clustat command shows everything 'offline', and I am also unable to execute the LVM-related commands.

iptables is already turned off in this cluster environment.
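
If it helps, this is roughly what I can capture from one of the 'offline' nodes the next time it breaks, before rebooting (just a sketch; the log path and the totem grep assume the default cman/openais setup):

    cman_tool status      # membership as cman sees it
    cman_tool nodes       # per-node view, shows who is really offline
    group_tool ls         # fence/dlm/clvmd groups; a stuck group usually explains hung vgs/lvs
    grep -i totem /var/log/messages | tail -50    # recent totem membership changes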

regards.

--- On Mon, 5/30/11, Hiroyuki Sato <hiroysato at gmail.com> wrote:

> From: Hiroyuki Sato <hiroysato at gmail.com>
> Subject: Re: [Linux-cluster] Cluster environment issue
> To: "linux clustering" <linux-cluster at redhat.com>
> Date: Monday, May 30, 2011, 11:03 PM
> Hello
> 
> I'm not sure whether this will be useful or not.
> 
> Have you checked ``ping some_where'' from a domU when the
> cluster is broken?
> (I assume you are using Xen, since you are running
> 2.6.18-194.3.1.el5xen.)
> If it does not respond at all, you should check
> iptables (e.g., disable iptables).
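> 
> For example, something like this from the affected node (only a
> sketch; replace the address with your gateway or another host on the
> cluster network):
> 
>   ping -c 3 192.168.xxx.1      # placeholder address
>   service iptables status      # confirm the firewall is really off
>   service iptables stop
>   chkconfig iptables off       # keep it off across reboots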
> 
> --
> Hiroyuki Sato
> 
> 2011/5/31 Srija <swap_project at yahoo.com>:
> > Thanks for your quick reply.
> >
> > I talked to the network people, but they say everything is fine at
> > their end. Is there any way, at the server end, to detect a switch
> > restart or lost multicast traffic?
> >
> > I think you have already looked at the cluster.conf file. Apart from the
> > missing quorum disk, do you think the configuration is sufficient for
> > handling a sixteen-node cluster?
> >
> > Thanks again.
> > Regards
> >
> > --- On Mon, 5/30/11, Kaloyan Kovachev <kkovachev at varna.net> wrote:
> >
> >> From: Kaloyan Kovachev <kkovachev at varna.net>
> >> Subject: Re: [Linux-cluster] Cluster environment issue
> >> To: "linux clustering" <linux-cluster at redhat.com>
> >> Date: Monday, May 30, 2011, 4:05 PM
> >> Hi,
> >> when your cluster gets broken, the most likely reason is a network
> >> problem (a switch restart, or multicast traffic lost for a while) on the
> >> interface where the serverX-priv IPs are configured. Having a quorum
> >> disk may help by giving a quorum vote to one of the servers, so it can
> >> fence the others, but the best thing to do is to fix your network and,
> >> preferably, add a redundant link for the cluster communication to avoid
> >> the breakage in the first place.
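> >>
> >> To see from the server side whether multicast actually disappears, you
> >> can watch the cluster interface while the problem happens, for example
> >> (a sketch; substitute your real interface name and the multicast
> >> address reported by cman_tool status):
> >>
> >>   netstat -g                            # multicast groups joined per interface
> >>   tcpdump -i eth1 host 239.192.xxx.xx   # placeholder interface and address
> >>
> >> If you later add a quorum disk, it is roughly: create it with mkqdisk
> >> and reference its label from cluster.conf, e.g. (the device, label,
> >> votes and timings here are placeholders you would need to size for
> >> sixteen nodes):
> >>
> >>   mkqdisk -c /dev/mapper/qdisk-lun -l newcluster_qdisk
> >>
> >>   <quorumd interval="1" tko="10" votes="1" label="newcluster_qdisk"/>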
> >>
> >> On Mon, 30 May 2011 12:17:07 -0700 (PDT), Srija <swap_project at yahoo.com> wrote:
> >> > Hi,
> >> >
> >> > I am very new to Red Hat Cluster and need some help and suggestions
> >> > for the cluster configuration.
> >> > We have a sixteen-node cluster running:
> >> >
> >> >     OS     : Linux Server release 5.5 (Tikanga)
> >> >     kernel : 2.6.18-194.3.1.el5xen
> >> >
> >> > The problem is that sometimes the cluster gets broken. So far the only
> >> > solution has been to reboot all sixteen nodes; otherwise the nodes do
> >> > not rejoin.
> >> >
> >> > We are using CLVM and no quorum disk; the quorum is the default.
> >> >
> >> > When the cluster breaks, clustat shows everything offline except the
> >> > node on which the clustat command was executed. If we execute the vgs
> >> > or lvs commands, they hang.
> >> >
> >> > Here is the clustat report at present:
> >> > -------------------------------------
> >> >
> >> > [server1]# clustat
> >> > Cluster Status for newcluster @ Mon May 30 14:55:10 2011
> >> > Member Status: Quorate
> >> >
> >> >  Member Name       ID   Status
> >> >  ------ ----       ---- ------
> >> >  server1            1   Online
> >> >  server2            2   Online, Local
> >> >  server3            3   Online
> >> >  server4            4   Online
> >> >  server5            5   Online
> >> >  server6            6   Online
> >> >  server7            7   Online
> >> >  server8            8   Online
> >> >  server9            9   Online
> >> >  server10          10   Online
> >> >  server11          11   Online
> >> >  server12          12   Online
> >> >  server13          13   Online
> >> >  server14          14   Online
> >> >  server15          15   Online
> >> >  server16          16   Online
> >> >
> >> > Here is the cman_tool status output from one server:
> >> > --------------------------------------------------
> >> >
> >> > [server1 ~]# cman_tool status
> >> > Version: 6.2.0
> >> > Config Version: 23
> >> > Cluster Name: newcluster
> >> > Cluster Id: 53322
> >> > Cluster Member: Yes
> >> > Cluster Generation: 11432
> >> > Membership state: Cluster-Member
> >> > Nodes: 16
> >> > Expected votes: 16
> >> > Total votes: 16
> >> > Quorum: 9
> >> > Active subsystems: 8
> >> > Flags: Dirty
> >> > Ports Bound: 0 11
> >> > Node name: server1
> >> > Node ID: 1
> >> > Multicast addresses: xxx.xxx.xxx.xx
> >> > Node addresses: 192.168.xxx.xx
> >> >
> >> >
> >> > Here is the cluster.conf file.
> >> > ------------------------------
> >> >
> >> > <?xml version="1.0"?>
> >> > <cluster alias="newcluster" config_version="23" name="newcluster">
> >> > <fence_daemon clean_start="1" post_fail_delay="0" post_join_delay="15"/>
> >> >
> >> > <clusternodes>
> >> >
> >> > <clusternode name="server1-priv" nodeid="1" votes="1">
> >> >         <fence><method name="1">
> >> >         <device name="ilo-server1r"/></method>
> >> >         </fence>
> >> > </clusternode>
> >> >
> >> > <clusternode name="server2-priv" nodeid="3" votes="1">
> >> >         <fence><method name="1">
> >> >         <device name="ilo-server2r"/></method>
> >> >         </fence>
> >> > </clusternode>
> >> >
> >> > <clusternode name="server3-priv" nodeid="2" votes="1">
> >> >         <fence><method name="1">
> >> >         <device name="ilo-server3r"/></method>
> >> >         </fence>
> >> > </clusternode>
> >> >
> >> > [ ... snip ... ]
> >> >
> >> > <clusternode name="server16-priv" nodeid="16" votes="1">
> >> >         <fence><method name="1">
> >> >         <device name="ilo-server16r"/></method>
> >> >         </fence>
> >> > </clusternode>
> >> >
> >> > </clusternodes>
> >> > <cman/>
> >> >
> >> > <dlm plock_ownership="1" plock_rate_limit="0"/>
> >> > <gfs_controld plock_rate_limit="0"/>
> >> >
> >> > <fencedevices>
> >> >         <fencedevice agent="fence_ilo" hostname="server1r" login="Admin"
> >> >                 name="ilo-server1r" passwd="xxxxx"/>
> >> >         ..........
> >> >         <fencedevice agent="fence_ilo" hostname="server16r" login="Admin"
> >> >                 name="ilo-server16r" passwd="xxxxx"/>
> >> > </fencedevices>
> >> > <rm>
> >> > <failoverdomains/>
> >> > <resources/>
> >> > </rm></cluster>
> >> >
> >> > Here is the lvm.conf file
> >> > --------------------------
> >> >
> >> > devices {
> >> >     dir = "/dev"
> >> >     scan = [ "/dev" ]
> >> >     preferred_names = [ ]
> >> >     filter = [ "r/scsi.*/", "r/pci.*/", "r/sd.*/", "a/.*/" ]
> >> >     cache_dir = "/etc/lvm/cache"
> >> >     cache_file_prefix = ""
> >> >     write_cache_state = 1
> >> >     sysfs_scan = 1
> >> >     md_component_detection = 1
> >> >     md_chunk_alignment = 1
> >> >     data_alignment_detection = 1
> >> >     data_alignment = 0
> >> >     data_alignment_offset_detection = 1
> >> >     ignore_suspended_devices = 0
> >> > }
> >> >
> >> > log {
> >> >     verbose = 0
> >> >     syslog = 1
> >> >     overwrite = 0
> >> >     level = 0
> >> >     indent = 1
> >> >     command_names = 0
> >> >     prefix = "  "
> >> > }
> >> >
> >> > backup {
> >> >     backup = 1
> >> >     backup_dir = "/etc/lvm/backup"
> >> >     archive = 1
> >> >     archive_dir = "/etc/lvm/archive"
> >> >     retain_min = 10
> >> >     retain_days = 30
> >> > }
> >> >
> >> > shell {
> >> >     history_size = 100
> >> > }
> >> >
> >> > global {
> >> >     library_dir = "/usr/lib64"
> >> >     umask = 077
> >> >     test = 0
> >> >     units = "h"
> >> >     si_unit_consistency = 0
> >> >     activation = 1
> >> >     proc = "/proc"
> >> >     locking_type = 3
> >> >     wait_for_locks = 1
> >> >     fallback_to_clustered_locking = 1
> >> >     fallback_to_local_locking = 1
> >> >     locking_dir = "/var/lock/lvm"
> >> >     prioritise_write_locks = 1
> >> > }
> >> >
> >> > activation {
> >> >     udev_sync = 1
> >> >     missing_stripe_filler = "error"
> >> >     reserved_stack = 256
> >> >     reserved_memory = 8192
> >> >     process_priority = -18
> >> >     mirror_region_size = 512
> >> >     readahead = "auto"
> >> >     mirror_log_fault_policy = "allocate"
> >> >     mirror_image_fault_policy = "remove"
> >> > }
> >> >
> >> > dmeventd {
> >> >     mirror_library = "libdevmapper-event-lvm2mirror.so"
> >> >     snapshot_library = "libdevmapper-event-lvm2snapshot.so"
> >> > }
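> >> >
> >> > Could the hang be because, with locking_type = 3, vgs and lvs go
> >> > through clvmd and the cluster DLM, and so block waiting for cluster
> >> > locks once membership is lost? A quick check I can run, assuming the
> >> > stock RHEL 5 init scripts:
> >> >
> >> >   service clvmd status
> >> >   ps -ef | grep clvmd
> >> >   group_tool ls      # the clvmd group should be listed here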
> >> >
> >> >
> >> > If you need more information, I can provide it.
> >> >
> >> > Thanks for your help
> >> > Priya
> >> >
> >> > --
> >> > Linux-cluster mailing list
> >> > Linux-cluster at redhat.com
> >> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >>
> >> --
> >> Linux-cluster mailing list
> >> Linux-cluster at redhat.com
> >> https://www.redhat.com/mailman/listinfo/linux-cluster
> >>
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
> 



