From swhiteho at redhat.com  Wed Sep  1 10:53:28 2010
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 01 Sep 2010 11:53:28 +0100
Subject: [Linux-cluster] GFS2 parameters
In-Reply-To: <971769.26249.qm@web112802.mail.gq1.yahoo.com>
References: <971769.26249.qm@web112802.mail.gq1.yahoo.com>
Message-ID: <1283338408.2462.2.camel@localhost>

Hi,

On Mon, 2010-08-30 at 09:42 -0700, Srija wrote:
> Hi,
> 
> I am using gfs2 in a cluster environment.
> 
> The OS is RHEL 5.5 x86_64,
> kernel: 2.6.18-194.3.1.el5xen
> 
> I am trying to tune the GFS file system, but a few parameters are not available,
> 
> like demote_secs.
> 
> The error is as follows:
> 
> gfs2_tool: can't open /sys/fs/gfs2/gfsred:GFS-XEN-IMAGES/tune/demote_secs: No such file or directory
> 
> Can anybody please tell me what I am missing?
> 
> thanks in advance
> 
> 
Why are you trying to change demote_secs? Do you have a performance
problem of some kind? This parameter is obsolete and has been removed in
gfs2.

Steve.

> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From bturner at redhat.com  Wed Sep  1 14:48:23 2010
From: bturner at redhat.com (Ben Turner)
Date: Wed, 1 Sep 2010 10:48:23 -0400 (EDT)
Subject: [Linux-cluster] Fencing through iLO and functioning of kdump
In-Reply-To: <545151688.665561283352478743.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Message-ID: <679070528.665741283352503659.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>

Here is a kbase article on fence_scsi; it should answer any questions you have:

https://access.redhat.com/kb/docs/DOC-17809

I usually run fence_scsi_test first to be sure my devices are capable. Note:

"To assist with finding and detecting devices which are (or are not) suitable for use with fence_scsi, a tool has been provided. The fence_scsi_test script will find devices visible to the node and report whether or not they are compatible with SCSI persistent reservations."

-Ben


----- "Chris Jankowski" <Chris.Jankowski at hp.com> wrote:

> Ben,
> 
> Thank you for pointing me at fence_scsi.
> It looks like fence_scsi will fit the bill elegantly. And it should be
> much more reliable than iLO fencing if the cluster uses a properly
> configured, dual-fabric FC SAN for shared storage.
> 
> I read the fence_scsi manual page and have one more question.
> 
> What do I need to do for my cluster to start using SCSI reservations?
> Is this done by default?
> 
> Thanks and regards,
> 
> Chris Jankowski
> 
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Ben Turner
> Sent: Saturday, 28 August 2010 03:29
> To: linux clustering
> Subject: Re: [Linux-cluster] Fencing through iLO and functioning of
> kdump
> 
> You have a couple of options here:
> 
> 1.  Switch to fence_scsi (which uses SCSI reservations, as you
> described) or another I/O fencing method that does not reboot the
> system.  This will enable your core dump to complete without power
> fencing interrupting it.
> 
> 2.  Put in a post fail delay long enough for fencing to complete.
> This is suboptimal, as your cluster services/resources will be hung
> for the duration of the post fail delay.  I usually only do this when
> I know I have a node that is crashing and no I/O fencing
> capabilities.
> 
> 3.  If you don't have access to an I/O fence agent and a post fail
> delay won't work for some reason, you can try the following, which is
> the best practice I can think of right now:
> 
>    1. Disable the power fence device on the host you're seeing panics
>       on. I have changed the IP for it in cluster.conf in the past.
>    2. When that node fails, the other nodes will attempt to fence the
>       host, and it will fail since the fence device was disabled.
>       (NOTE: between steps 2 and 3, cluster operation is suspended.)
>    3. The administrator can now do things like:
>       - disconnect the FC and network cables from the affected host,
>         ensuring that it is 'manually I/O fenced'
>       - run fence_ack_manual on the other host to override the failed
>         fencing operation and continue cluster operation on the other
>         nodes
>    4. Now the failed host is free to continue kdumping for as long
>       as need be.
> 
> Hope this helps.
> 
> -b
> 
> 
> ----- "Chris Jankowski" <Chris.Jankowski at hp.com> wrote:
> 
> > Hi,
> > 
> > How can I reconcile the need to have Kdump configured and operational
> > on cluster nodes with the need for fencing of a node, most commonly
> > and conveniently implemented through iLO on HP servers?
> > 
> > Customers require Kdump configured and operational to be able to have
> > kernel crashes analysed by Red Hat support. The taking of a crash dump
> > starts immediately after the crash, but it may take a very
> > considerable time on a machine with 512 GB of memory (more than an
> > hour) if done in dumplevel 0 and over a 1 GbE network. However, if I
> > use iLO fencing, then the crashed node will be powered off through
> > iLO, which will irrecoverably kill the kernel dump in progress and
> > erase the memory content containing the crashed kernel image.
> > 
> > Ideally, I would love to have the functionality that is present in
> > several UNIX clusters, where a crashed node completes its kernel crash
> > dump in peace. In UNIX clusters the crashed node can be configured to
> > reboot automatically after a kernel crash and rejoin the cluster. It
> > typically does the kernel dump as a part of the boot.
> > 
> > The UNIX clusters typically use SCSI reservation to protect the
> > integrity of storage. This enables them to keep the failed node
> > isolated whilst it is still able to do the kernel crash dump before
> > rejoining the cluster. I believe this option is not available in Linux
> > Cluster.
> > 
> > So, how can I have a functioning Linux cluster with the ability to
> > take a kernel crash dump of crashed nodes without blocking access to
> > the shared GFS2 filesystem for the hour or so that it may take to do
> > a crash dump on a very large system?
> > 
> > Thanks and regards,
> > 
> > Chris Jankowski
> > 


From rohara at redhat.com  Wed Sep  1 17:11:45 2010
From: rohara at redhat.com (Ryan O'Hara)
Date: Wed, 1 Sep 2010 12:11:45 -0500
Subject: [Linux-cluster] Fencing through iLO and functioning of kdump
In-Reply-To: <679070528.665741283352503659.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
References: <545151688.665561283352478743.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
	<679070528.665741283352503659.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Message-ID: <20100901171145.GD1721@redhat.com>

On Wed, Sep 01, 2010 at 10:48:23AM -0400, Ben Turner wrote:
> Here is a kbase on fence scsi:
> 
> https://access.redhat.com/kb/docs/DOC-17809
> 
> It should answer any questions you have:
> 
> https://access.redhat.com/kb/docs/DOC-17809
> 
> Usually I try the fence_scsi_test to be sure my devices are capable, note:
> 
> "To assist with finding and detecting devices which are (or are not) suitable for use with fence_scsi, a tool has been provided. The fence_scsi_test script will find devices visible to the node and report whether or not they are compatible with SCSI persistent reservations."

I just have to comment that fence_scsi_test is rather limited. I'm
currently working on making it more robust, such that it more
accurately tests device(s) for SCSI-PR support.

Basically there are two issues:

1. The current script does not verify that registrations exist on a
device -- it relies on the error code returned from sg_persist. This
usually works, but we have seen some arrays that will report false
positives.

2. The script *only* puts a registration on the device(s) and then
removes the registration from each device. This doesn't tell the whole
story, since the array must also support the preempt-and-abort
operation.
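
To illustrate the first issue, here is a minimal sketch of reading the keys
back instead of trusting the exit status. It is not the actual
fence_scsi_test logic: the helper name key_is_registered is hypothetical,
DEV and KEY are placeholders, and it assumes that sg_persist --read-keys
prints each registered key on its own line as 0x<hex>.

```shell
# Hypothetical check, not the real fence_scsi_test code: scan the output of
# "sg_persist --in --read-keys" on stdin for a line that ends with our key.
key_is_registered() {
    # $1 is the key in hex, without the 0x prefix (for example: 1)
    grep -qiE "0x$1[[:space:]]*\$"
}

# Intended use, on a scratch device only (DEV and KEY are placeholders):
#   sg_persist --out --register --param-sark="$KEY" --device="$DEV"
#   sg_persist --in --read-keys --device="$DEV" | key_is_registered "$KEY" \
#       || echo "register reported success but the key is absent" >&2
```

This covers only the registration check; it says nothing about whether the
array supports preempt-and-abort.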

A new fence_scsi_test script should be available in the very near
future. Here is the relevant BZ:

https://bugzilla.redhat.com/show_bug.cgi?id=603838

Ryan



From cos at aaaaa.org  Wed Sep  1 18:03:22 2010
From: cos at aaaaa.org (Ofer Inbar)
Date: Wed, 1 Sep 2010 14:03:22 -0400
Subject: [Linux-cluster] logging from the resource agent script
In-Reply-To: <20100824003816.GM18763@mip.aaaaa.org>
References: <20100824003816.GM18763@mip.aaaaa.org>
Message-ID: <20100901180322.GO18256@mip.aaaaa.org>

I got the answers to the questions about ocf_log that I posted last
week, so I'm following up to the list in case anyone finds these in
the list archives and wonders what the solution was.  I wasn't able
to find good answers via Google before, so hopefully this email will
fix that :)

On Mon, Aug 23, 2010 at 08:38:16PM -0400, I wrote:
> Right now, the question that's vexing me is how to log custom messages
> from this resource agent script, to give the operator more information
> about what the cluster is doing (such as, for example, the exact
> commands that are run when starting and stopping the service, or what
> the real return code from the health check is, rather than just "did
> it fail?").

> 1. ocf_log statements I put at the top level of the script do log,
> but any that I put inside functions such as start() and stop() don't.
> Why don't my custom log messages appear in /var/log/messages when
> other messages at the same level (such as info or notice) from
> rgmanager do, and when the start() or stop() function is clearly being
> called?
> 
> 2. ocf_log seems to sometimes, or always, output to stdout, which
> means I have to take care *not* to let it run when meta-data is the
> argument, because it'd pollute the metadata XML.  But then how do I
> log anything from the times the script is run for metadata, if I want?
> 
> Should this work?  Is there another, better way of making resource
> agent scripts log custom messages?
> 
> And what happens to the resource agent script's stdout, anyway?

So, first of all, the resource agent script's stdout and stderr are
tied to /dev/null *except* when it's being called for meta-data. They
are not logged anywhere.

Secondly, the problem with ocf_log not logging was very simple, but
obfuscated by the fact that stderr was thrown to the bit bucket.

ocf_log is a shell function which always outputs to stdout and also
calls a separate program called clulog to send stuff to syslog.  It
assumes clulog is in the path, which means the resource agent needs
/usr/sbin in its path, which was missing from my script.  A simple
oversight that would've been obvious if I'd seen the "clulog: command
not found" errors.
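
A minimal sketch of that fix, to go near the top of a resource agent before
the first ocf_log call. The helper name ensure_sbin_in_path is hypothetical
and not part of rgmanager; the only thing taken from the above is that
/usr/sbin has to be in the path.

```shell
# Hypothetical helper: append /usr/sbin to PATH unless it is already
# there, so that ocf_log can find clulog.
ensure_sbin_in_path() {
    case ":$PATH:" in
        *:/usr/sbin:*) ;;                # already present, nothing to do
        *) PATH="$PATH:/usr/sbin" ;;     # append it
    esac
    export PATH
}

ensure_sbin_in_path
```

Appending rather than prepending leaves the lookup order of everything
already in PATH unchanged.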

One potential hitch is that ocf_log just passes its string argument to
clulog on the command line enclosed in double quotes, so you could
have shell quoting issues.  Quoting once (in your call to ocf_log in
the resource agent script) is not necessarily enough; there's going to
be a second level of shell interpolation, though it's double-quoted.
One failure would be if you start your string with a - character,
because then clulog will think it's another command line switch.
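
As a rough illustration of the leading-dash case only, a wrapper can prefix
such messages before they reach ocf_log. The helper name log_safe and the
"msg: " prefix are hypothetical choices, not anything rgmanager provides, and
this does nothing about the second level of shell interpolation.

```shell
# Hypothetical helper, not part of rgmanager: prefix any message that
# starts with "-" so clulog cannot mistake it for a command line switch.
log_safe() {
    case "$1" in
        -*) printf 'msg: %s\n' "$1" ;;   # leading dash: add a harmless prefix
        *)  printf '%s\n' "$1" ;;        # anything else passes through as-is
    esac
}

# Intended use in the agent, once ocf_log is available:
#   ocf_log info "$(log_safe "$text")"
```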

Note: My confusion about ocf_log "sometimes" sending to stdout was
caused by the fact that the resource agent's stdout was going to
/dev/null except when it was being called for meta-data.  ocf_log
always writes to stdout, and rgmanager was sometimes looking at it
and sometimes bitbucketing it.


Finally, a very useful debugging tool I was not aware of when I first
asked the question, that makes it much easier to see what's going on:

rg_test test /etc/cluster/cluster.conf [status|start|stop] service [service]

(run as root, or with sudo)

This runs your resource agent as rgmanager would, but shows you stdout
and stderr.
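
A small sketch of wrapping that command, assuming you want to check the
command line before running it. The function name rg_debug, the DRY_RUN
variable and the service name my_service are all hypothetical; only the
rg_test arguments come from the line above.

```shell
# Hypothetical wrapper around the rg_test invocation shown above.
# With DRY_RUN=1 it only prints the command instead of running it.
rg_debug() {
    action="$1"
    svc="$2"
    case "$action" in
        status|start|stop) ;;
        *) echo "usage: rg_debug status|start|stop SERVICE" >&2; return 2 ;;
    esac
    # Rebuild the positional parameters as the full rg_test command line.
    set -- rg_test test /etc/cluster/cluster.conf "$action" service "$svc"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"      # needs root; the agent's stdout and stderr appear here
    fi
}

# Example (my_service is a placeholder for a service in cluster.conf):
#   DRY_RUN=1 rg_debug start my_service
```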
  -- Cos


From christopher.walker at gmail.com  Wed Sep  1 20:40:54 2010
From: christopher.walker at gmail.com (Chris Walker)
Date: Wed, 1 Sep 2010 16:40:54 -0400
Subject: [Linux-cluster] active/active NFS cluster question
Message-ID: <AANLkTin3U411LS8C7KAOhiOnpHWZvct+9fsfLDX2d2M7@mail.gmail.com>

Hello,

I suspect that I'm doing something fairly stupid, but I'm having a
problem with a cluster that is exporting the same GFS filesystems to
the same nfs clients.  Everything is fine until I relocate one
of the nfs services to another machine.  The relocation goes fine, but
once I have two nfs services on the same machine, I can't get them
apart.  When I relocate one of the two nfs services to a different
cluster host, the relocation wipes out the entries in
/var/lib/nfs/etab, forcing the second nfs service on that node to
relocate as well (I get the error "nfsclient:rc_nfs_clients is
missing!").

Any suggestions?  Other than modifying
/usr/share/cluster/nfsclient.sh, is there some way to prevent the etab
entries from being purged?

Thanks!
Chris

cluster.conf:

<cluster alias="gfs_cluster" config_version="5" name="gfs_cluster">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="gfs03" nodeid="1" votes="1">
                        <fence/>
                </clusternode>
                <clusternode name="gfs02" nodeid="2" votes="1">
                        <fence/>
                </clusternode>
                <clusternode name="gfs01" nodeid="3" votes="1">
                        <fence/>
                </clusternode>
        </clusternodes>
        <cman/>
        <fencedevices/>
        <rm>
                <failoverdomains/>
                <resources>
                        <ip address="10.242.62.138" monitor_link="1"/>
                        <ip address="10.242.62.140" monitor_link="1"/>
                        <ip address="10.242.62.139" monitor_link="1"/>
                        <nfsexport name="md3000i01_nfsexp"/>
                        <nfsclient allow_recover="1"
name="rc_nfs_clients" options="rw,async" target="10.242.0.0/16"/>
                        <nfsexport name="md3000i02_nfsexp"/>
                        <clusterfs
device="/dev/md3000i01_vg/md3000i01_lv" force_unmount="0" fsid="49569"
fstype="gfs2" mountpoint="/mnt/md3000i01" name="md3000i01_gfsres"
self_fence="0"/>
                        <clusterfs
device="/dev/md3000i02_vg/md3000i02_lv" force_unmount="0" fsid="50754"
fstype="gfs2" mountpoint="/mnt/md3000i02" name="md3000i02_gfsres"
self_fence="0"/>
                </resources>
                <service autostart="1" exclusive="0"
name="gfs-a_nfssvc" recovery="relocate">
                        <ip ref="10.242.62.138"/>
                        <clusterfs fstype="gfs" ref="md3000i01_gfsres">
                                <nfsexport ref="md3000i01_nfsexp">
                                        <nfsclient name=" "
ref="rc_nfs_clients"/>
                                </nfsexport>
                        </clusterfs>
                        <clusterfs fstype="gfs" ref="md3000i02_gfsres">
                                <nfsexport ref="md3000i02_nfsexp">
                                        <nfsclient name=" "
ref="rc_nfs_clients"/>
                                </nfsexport>
                        </clusterfs>
                </service>
                <service autostart="1" exclusive="0"
name="gfs-b_nfssvc" recovery="relocate">
                        <ip ref="10.242.62.139"/>
                        <clusterfs fstype="gfs" ref="md3000i01_gfsres">
                                <nfsexport ref="md3000i01_nfsexp">
                                        <nfsclient name=" "
ref="rc_nfs_clients"/>
                                </nfsexport>
                        </clusterfs>
                        <clusterfs fstype="gfs" ref="md3000i02_gfsres">
                                <nfsexport ref="md3000i02_nfsexp">
                                        <nfsclient name=" "
ref="rc_nfs_clients"/>
                                </nfsexport>
                        </clusterfs>
                </service>
                <service autostart="1" exclusive="0"
name="gfs-c_nfssvc" recovery="relocate">
                        <ip ref="10.242.62.140"/>
                        <clusterfs fstype="gfs" ref="md3000i01_gfsres">
                                <nfsexport ref="md3000i01_nfsexp">
                                        <nfsclient name=" "
ref="rc_nfs_clients"/>
                                </nfsexport>
                        </clusterfs>
                        <clusterfs fstype="gfs" ref="md3000i02_gfsres">
                                <nfsexport ref="md3000i02_nfsexp">
                                        <nfsclient name=" "
ref="rc_nfs_clients"/>
                                </nfsexport>
                        </clusterfs>
                </service>
        </rm>
</cluster>


From fdinitto at redhat.com  Thu Sep  2 12:45:18 2010
From: fdinitto at redhat.com (Fabio M. Di Nitto)
Date: Thu, 02 Sep 2010 14:45:18 +0200
Subject: [Linux-cluster] Cluster 3.0.16 stable release
Message-ID: <4C7F9C5E.6000606@redhat.com>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA256

The cluster team and its community are proud to announce the 3.0.16
stable release from the STABLE3 branch.

This release contains a few major bug fixes. We strongly recommend
people to update their clusters. We also welcome Digimer to the
development team and her contribution of fence_nodeassassin in time for
this release.

(if you are wondering where 3.0.15 is, the tarballs are available at the
usual URL, but due to a change in some headers, it probably will not
build on your system, 3.0.16 addresses that problem specifically)

In order to build/run the 3.0.16 release you will need:

- - corosync 1.2.8
- - openais 1.1.4
- - linux kernel 2.6.31 (only for GFS1 users)

The new source tarball can be downloaded here:

https://fedorahosted.org/releases/c/l/cluster/cluster-3.0.16.tar.bz2

To report bugs or issues:

   https://bugzilla.redhat.com/

Would you like to meet the cluster team or members of its community?

   Join us on IRC (irc.freenode.net #linux-cluster) and share your
   experience  with other sysadministrators or power users.

Thanks/congratulations to all people that contributed to achieve this
great milestone.

Happy clustering,
Fabio

Under the hood (from 3.0.15):

Fabio M. Di Nitto (1):
      cman: fix build with old headers (f12 and older)

 cman/cman_tool/main.c |    2 ++
 1 files changed, 2 insertions(+), 0 deletions(-)

Under the hood (from 3.0.14):

Bob Peterson (3):
      gfs2-utils: mkfs can't fsync device with 32MB RGs
      fsck.gfs2 deletes directories if they get too big
      fsck.gfs2 segfaults if journals are missing

David Teigland (5):
      dlm_controld: fix save_plocks initialization
      dlm_controld: fix plock owner syncing
      dlm_controld: fix plock signature in stored message
      Revert "dlm_controld: fix save_plocks initialization"
      dlm_controld: ignore plocks until checkpoint time

Fabio M. Di Nitto (11):
      cman: do not propagate old configurations around
      fence_na: import files pristine from upstream
      build: fix man page install from outside source tree
      fence_na: first cut at the Makefile
      fence_na: add example config file
      build: rename CONFFILEEXAMPLE to EXTRACONFFILE
      fence_na: generate files based on configure invokation
      fence_na: fix last installation bits required to work in our build env
      fence_na: add copyright/author information
      fence_na: add support to the validation schema
      config: Update ldif schema

Lon Hohberger (13):
      cman: Make qdiskd exit if removed from configuration
      cman: Clarify man page on config distribution
      rgmanager: Fix clustat return code
      rgmanager: Honor restricted FDs during migrations
      config: Add missing fence-agent options to RNG schema
      config: Add missing fence-agent options to LDAP schema
      rgmanager: Present flags in clustat output
      config: Fix broken fence_egenera options
      config: Add fence_egenera options to ldif
      doc: Update autogenerated documentation
      rgmanager: fix compiler warning in clulog.c
      config: Present fencing agent name in metadata
      config: Add fencing agent name to group for clarity

Marek 'marx' Grac (5):
      fence_drac5: make "port" a synonym of "module_name" for drac5
      fencing: Method to cause one node to delay fencing
      fencing: Method to cause one node to delay fencing [2]
      fencing: Method to cause one node to delay fencing - drac, egenera
      fencing: Method to cause one node to delay fencing - ipmilan

Ryan O'Hara (1):
      Fix syntax error in code that opens logfile.

 cman/cman_tool/main.c                       |   37 +-
 cman/man/cman_tool.8                        |   24 +-
 cman/qdisk/main.c                           |   33 +-
 config/plugins/ldap/99cluster.ldif          |   98 +++-
 config/plugins/ldap/ldap-base.csv           |   11 +-
 config/tools/xml/ccs_config_validate.in     |   34 +-
 config/tools/xml/cluster.rng.in             |  565 +++++++++++++----
 doc/COPYRIGHT                               |    3 +
 doc/cluster_conf.html                       |   26 +-
 fence/agents/drac/fence_drac.8              |    6 +
 fence/agents/drac/fence_drac.pl             |   13 +-
 fence/agents/egenera/fence_egenera.8        |    6 +
 fence/agents/egenera/fence_egenera.pl       |   10 +-
 fence/agents/ipmilan/ipmilan.c              |   29 +-
 fence/agents/lib/fence2rng.xsl              |    3 +-
 fence/agents/lib/fencing.py.py              |   23 +-
 fence/agents/node_assassin/Makefile         |   50 ++
 fence/agents/node_assassin/fence_na.conf.in |   84 +++
 fence/agents/node_assassin/fence_na.lib.in  |  919 +++++++++++++++++++++++++++
 fence/agents/node_assassin/fence_na.pl      |  162 +++++
 fence/agents/node_assassin/fence_na.pod.in  |  188 ++++++
 fence/agents/scsi/fence_scsi.pl             |    2 +-
 gfs2/convert/gfs2_convert.c                 |   12 +-
 gfs2/fsck/fs_recovery.c                     |   55 ++-
 gfs2/fsck/fs_recovery.h                     |    7 +-
 gfs2/fsck/metawalk.c                        |   45 +-
 gfs2/fsck/metawalk.h                        |    5 +-
 gfs2/fsck/pass1.c                           |  144 +++--
 gfs2/fsck/pass1b.c                          |   15 +-
 gfs2/libgfs2/libgfs2.h                      |    5 +-
 gfs2/libgfs2/rgrp.c                         |   14 +-
 gfs2/libgfs2/structures.c                   |   32 +-
 gfs2/libgfs2/super.c                        |   43 +--
 group/dlm_controld/cpg.c                    |   43 ++-
 group/dlm_controld/dlm_daemon.h             |    1 +
 group/dlm_controld/plock.c                  |  120 +++--
 make/install.mk                             |    6 +-
 make/uninstall.mk                           |    3 +
 rgmanager/src/daemons/rg_state.c            |    5 +
 rgmanager/src/utils/clulog.c                |    2 +-
 rgmanager/src/utils/clustat.c               |   53 ++-
 41 files changed, 2563 insertions(+), 373 deletions(-)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (MingW32)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQIcBAEBCAAGBQJMf5xcAAoJEFA6oBJjVJ+OZLEP/Rsvn6n1T29WCtlQCqdK/Ux0
Ljc2Py/JcPaptuR2oeqDAAjNSb5WnI8kBNMp5XJ0bbegn72m1OVCNTyCTnuvHGHe
CPwFLx7WfOpusBayhHzpErPBTjBMROt4noZI9+iWSkbjr1YERPowbBZ3NRpjKaye
QIX/Z4Wc7loqevzeg3h8HYmhf2Ka7t3VsMKzmGMRdUKeuFqUI6XWvqd8Q8YxR4gd
2mgu4OODHgvv7dD/vt1OSRI62/uUT92R5edRuK7Y0FizQ0ujWWOv10KsfAULzLKI
fLqhBaX29OZE68AeAkfSZ98p5E7vreVTXT0QAds6kIVjw53ZRJ9LH57pEzB6vMmh
xzb4vjD8ChU3WNCYE1GYDxF28cBHTzintNv1MNiSFAP1vC1r7UaAZ6GJGztE506a
ZGT/wOOfgFmkk0u1oT6cPwnkMXbIHDJVPqd1Ds+M0Pz3UMNZ+ta9k8YnkOkgPrZJ
Lne81a51u7wLqKc+2BD34TBwxSpETL4oHiYR5wnWVjugsipBnPV9f5bZMwOtsOSm
bi6/r8NcVY9wsuo28wIBFISkyyyppw7v0ohUS/nne3Colr3dJJAJB73nYTtenPQd
Ir9219EyRCevLrmI9K+7a9GSBPulIFXkGWXJLFBu/lyWfUSOHU5uGC/ZG8kJwEus
rug84hGXCJ06GNZmtoAc
=LV1N
-----END PGP SIGNATURE-----


From girishpati at yahoo.com  Wed Sep  8 10:05:30 2010
From: girishpati at yahoo.com (Girish Prajapati)
Date: Wed, 8 Sep 2010 03:05:30 -0700 (PDT)
Subject: [Linux-cluster] need help - Fencing problem
Message-ID: <178789.16151.qm@web120516.mail.ne1.yahoo.com>

Hello everybody,
I am having a problem fencing a cluster node. Let me explain in detail:
I have installed RHEL 5.4 on HP ProLiant DL280 G5 servers, with iLO 2 as the
fencing device. I am managing the cluster through Luci (Conga). It seems
everything is working fine. I can reboot cluster nodes through Luci, and the
service gets transferred to another node. After rebooting, the node reconnects
to the cluster automatically without any error.
The problem is that I cannot fence a node through Luci. When I try to fence
any node, I get the following error:

Sep  8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports: Unable to 
connect/login to fencing device
Sep  8 14:51:16 node2 fence_node[9106]: Fence of "node1.drctmb.com" was 
unsuccessful

My iLO license is: iLO 2 Advanced Evaluation
Do I need an iLO license, or is there a problem in the cluster
configuration?
How can I check the cluster log in detail?

Appreciate your help.
Thank you in advance.

Regards,
Girishkumar R Prajapati



From jacob.ishak at gmail.com  Wed Sep  8 11:09:58 2010
From: jacob.ishak at gmail.com (jacob ishak)
Date: Wed, 8 Sep 2010 14:09:58 +0300
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
Message-ID: <AANLkTik6O8i1TotGGicBNDei8ttFXOf1N_BX0oZzF1UX@mail.gmail.com>

It might be an iLO login issue; check the fencing device's authentication type.

I faced this issue on a Sun ILOM fencing device. I changed the authentication
type to md5 and it worked.

From cluster.conf: fencedevice agent="fence_ipmilan" auth="md5"



From girishpati at yahoo.com  Wed Sep  8 12:24:15 2010
From: girishpati at yahoo.com (Girish Prajapati)
Date: Wed, 8 Sep 2010 05:24:15 -0700 (PDT)
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <AANLkTik6O8i1TotGGicBNDei8ttFXOf1N_BX0oZzF1UX@mail.gmail.com>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
	<AANLkTik6O8i1TotGGicBNDei8ttFXOf1N_BX0oZzF1UX@mail.gmail.com>
Message-ID: <550346.91123.qm@web120518.mail.ne1.yahoo.com>

Hello Jacob,

Thanks for your reply. I tried the change you described, but the problem is 
still the same. When I try to fence any node from Luci, I get the following 
error in the web browser:

-- Unable to retrieve batch 1223037152 status from node2.drctmb.com:11111: 
fence_node failed: Node "node1.drctmb.com" is being fenced by node 
"node2.drctmb.com" -- You will be redirected in 5 seconds.
    Stop waiting for this job to complete 


-- Unable to retrieve batch 719909649 status from node1.drctmb.com:11111: 
fence_node failed: Node "node2.drctmb.com" is being fenced by node 
"node1.drctmb.com" -- You will be redirected in 5 seconds.
    Stop waiting for this job to complete 

Any idea why I am getting this error message?

Regards,
Girishkumar R Prajapati



From esggrupos at gmail.com  Wed Sep  8 12:57:25 2010
From: esggrupos at gmail.com (ESGLinux)
Date: Wed, 8 Sep 2010 14:57:25 +0200
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
Message-ID: <AANLkTi=4w75C4mU++U=6E71zO2s1jHkPXhOZv0wUKEJp@mail.gmail.com>

Hello,

Have you configured the iLO devices by entering the BIOS?

I remember I had to set up the user/password in the iLO and mark the iLO as
not shared.


HTH,

ESG


From Chris.Jankowski at hp.com  Wed Sep  8 22:30:57 2010
From: Chris.Jankowski at hp.com (Jankowski, Chris)
Date: Wed, 8 Sep 2010 22:30:57 +0000
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <AANLkTi=4w75C4mU++U=6E71zO2s1jHkPXhOZv0wUKEJp@mail.gmail.com>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
	<AANLkTi=4w75C4mU++U=6E71zO2s1jHkPXhOZv0wUKEJp@mail.gmail.com>
Message-ID: <036B68E61A28CA49AC2767596576CD596BACEB2F3D@GVW1113EXC.americas.hpqcorp.net>

Why did you have to set iLO as non-shared?

Thanks and regards,

Chris


From girishpati at yahoo.com  Thu Sep  9 05:29:51 2010
From: girishpati at yahoo.com (Girish Prajapati)
Date: Wed, 8 Sep 2010 22:29:51 -0700 (PDT)
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <AANLkTi=4w75C4mU++U=6E71zO2s1jHkPXhOZv0wUKEJp@mail.gmail.com>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
	<AANLkTi=4w75C4mU++U=6E71zO2s1jHkPXhOZv0wUKEJp@mail.gmail.com>
Message-ID: <541653.54252.qm@web120512.mail.ne1.yahoo.com>

Hello,

I have already configured the BIOS for iLO, but I am not sure why the iLO should 
not be shared.

Can anybody please help me with this problem? 
Do I need any extra setup for the fencing device?
Thanks



From brem.belguebli at gmail.com  Thu Sep  9 06:00:28 2010
From: brem.belguebli at gmail.com (Brem Belguebli)
Date: Thu, 09 Sep 2010 08:00:28 +0200
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <541653.54252.qm@web120512.mail.ne1.yahoo.com>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
	<AANLkTi=4w75C4mU++U=6E71zO2s1jHkPXhOZv0wUKEJp@mail.gmail.com>
	<541653.54252.qm@web120512.mail.ne1.yahoo.com>
Message-ID: <1284012028.3342.3.camel@newgen.localdomain>

Try running this from another node of the cluster:

fence_ilo -a "Ilo IP"  -l "Ilo user" -p "Ilo passwd" -o reboot


Additionally, by connecting through HTTP to the iLO, you should be able to
see the iLO logs (in the general tab) and check whether the failure is due
to a lack of licensing.
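
The manual test above can be wrapped in a small script so that the same check can be repeated from each node. This is only a sketch: it reuses the -a, -l, -p and -o options from the command above, and it assumes that the agent also accepts a non-destructive "status" action, which this thread does not state. The helper names are hypothetical.

```python
import subprocess


def build_fence_ilo_cmd(address, login, password, action="status"):
    """Return the fence_ilo argument list for one iLO device.

    The -a/-l/-p/-o options are the ones used in the command above.
    The default "status" action is an assumption, not taken from the thread.
    """
    return ["fence_ilo", "-a", address, "-l", login, "-p", password, "-o", action]


def run_fence_check(address, login, password, action="status"):
    """Run the agent once and return (exit_code, combined_output)."""
    result = subprocess.run(
        build_fence_ilo_cmd(address, login, password, action),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.returncode, result.stdout
```

Calling run_fence_check() from node2 with node1's iLO address and account (and the reverse from node1) would show whether the agent can log in with exactly the values that are later placed in cluster.conf.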

 


From girishpati at yahoo.com  Thu Sep  9 07:43:45 2010
From: girishpati at yahoo.com (Girish Prajapati)
Date: Thu, 9 Sep 2010 00:43:45 -0700 (PDT)
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <1284012028.3342.3.camel@newgen.localdomain>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
	<AANLkTi=4w75C4mU++U=6E71zO2s1jHkPXhOZv0wUKEJp@mail.gmail.com>
	<541653.54252.qm@web120512.mail.ne1.yahoo.com>
	<1284012028.3342.3.camel@newgen.localdomain>
Message-ID: <744023.36408.qm@web120512.mail.ne1.yahoo.com>

Hello,
I can run the following command successfully from another node, but I am still 
getting the same error message:

fence_ilo -a "Ilo IP"  -l "Ilo user" -p "Ilo passwd" -o reboot

Sep  9 14:37:00 node2 openais[2904]: [CLM  ] Members Joined: 
Sep  9 14:37:00 node2 openais[2904]: [SYNC ] This node is within the primary 
component and will provide service. 

Sep  9 14:37:00 node2 openais[2904]: [TOTEM] entering OPERATIONAL state. 
Sep  9 14:37:00 node2 openais[2904]: [CLM  ] got nodejoin message 192.168.0.28 
Sep  9 14:37:00 node2 openais[2904]: [CPG  ] got joinlist message from node 1 
Sep  9 14:37:00 node2 fenced[2923]: node1.drctmb.com not a cluster member after 
0 sec post_fail_delay
Sep  9 14:37:00 node2 fenced[2923]: fencing node "node1.drctmb.com"
Sep  9 14:37:10 node2 fenced[2923]: agent "fence_ilo" reports: Unable to 
connect/login to fencing device 

Sep  9 14:37:10 node2 fenced[2923]: fence "node1.drctmb.com" failed
Sep  9 14:37:15 node2 fenced[2923]: fencing node "node1.drctmb.com"
Sep  9 14:37:26 node2 fenced[2923]: agent "fence_ilo" reports: Unable to 
connect/login to fencing device 
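
To look at the fencing messages on their own, the syslog lines written by the fencing daemons can be filtered out of the rest. The sketch below is an illustration only: the function name is hypothetical, and it assumes the classic syslog line layout seen in the excerpt above, with one message per line (the line wrapping added by the mail client is not handled).

```python
import re

# Matches classic syslog lines such as:
#   Sep  9 14:37:10 node2 fenced[2923]: fence "node1.drctmb.com" failed
SYSLOG_LINE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s[\d:]{8})\s(?P<host>\S+)\s"
    r"(?P<daemon>[\w-]+)(?:\[\d+\])?:\s(?P<message>.*)$"
)


def fence_messages(lines, daemons=("fenced", "fence_node")):
    """Yield (timestamp, daemon, message) for lines logged by fencing daemons."""
    for line in lines:
        match = SYSLOG_LINE.match(line.rstrip("\n"))
        if match and match.group("daemon") in daemons:
            yield match.group("stamp"), match.group("daemon"), match.group("message")
```

Passing an open log file to fence_messages() (on this kind of system the messages usually end up in /var/log/messages, which is an assumption about the syslog setup) leaves only the fenced and fence_node entries, which makes repeated failures easier to see.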


node1 rebooted and rejoined the cluster, but now my webby service is not 
working. See the log below:

Broadcast message from root (Thu Sep  9 14:32:41 2010):
The system is going down for system halt NOW!
Sep  9 14:19:22 node1 last message repeated 17 times
Sep  9 14:32:41 node1 shutdown[25506]: shutting down for system halt
Sep  9 14:32:41 node1 pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not 
Found
Sep  9 14:32:43 node1 modclusterd: shutdown succeeded
Sep  9 14:32:43 node1 rgmanager: [25593]: <notice> Shutting down Cluster Service 
Manager... 

Sep  9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down 
Sep  9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down 
Sep  9 14:32:43 node1 clurgmgrd[3457]: <notice> Stopping service service:webby 
Sep  9 14:32:44 node1 avahi-daemon[3378]: Withdrawing address record for 
192.168.0.30 on eth0.
Read from remote host node1: Connection reset by peer
.
.
.
Sep  9 14:35:42 node1 smartd[3585]: Device: /dev/hda, packet devices [this 
device CD/DVD] not SMART capable 

Sep  9 14:35:42 node1 smartd[3585]: Device: /dev/sda, opened 
Sep  9 14:35:42 node1 smartd[3585]: Device: /dev/sda, IE (SMART) not enabled, 
skip device Try 'smartctl -s on /dev/sda' to turn on SMART features 

Sep  9 14:35:42 node1 smartd[3585]: Monitoring 0 ATA and 0 SCSI devices 
Sep  9 14:35:42 node1 smartd[3604]: smartd has fork()ed into background mode. 
New PID=3604. 

Sep  9 14:35:42 node1 avahi-daemon[3412]: Service "SFTP File Transfer on node1" 
(/services/sftp-ssh.service) successfully established.
Sep  9 14:35:45 node1 pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not 
Found
Sep  9 14:35:45 node1 last message repeated 3 times
Sep  9 14:35:45 node1 kernel: mtrr: type mismatch for d8000000,2000000 old: 
uncachable new: write-combining
Sep  9 14:35:46 node1 clurgmgrd: [3491]: <err> Checking Existence Of File 
/var/run/cluster/apache/apache:httpd.pid [apache:httpd] > Failed - File Doesn't 
Exist 


It seems that there is a problem in the fencing device configuration.
Here is my cluster.conf:


<?xml version="1.0"?>
<cluster alias="girish" config_version="21" name="girish">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="node2.drctmb.com" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="NODE2"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="node1.drctmb.com" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device name="NODE1"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_ilo" hostname="node1.drctmb.com" 
login="root" name="NODE1" passwd="redhat123"/>
                <fencedevice agent="fence_ilo" hostname="node2.drctmb.com" 
login="root" name="NODE2" passwd="redhat123"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="prefer_node1" nofailback="0" 
ordered="1" restricted="1">
                                <failoverdomainnode name="node2.drctmb.com" 
priority="2"/>
                                <failoverdomainnode name="node1.drctmb.com" 
priority="1"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <fs device="/dev/sda1" force_fsck="0" force_unmount="0" 
fsid="8669" fstype="ext3" mountpoint="/var/www/html" name="docroot" 
self_fence="0"/>
                        <ip address="192.168.0.30" monitor_link="1"/>
                        <apache config_file="conf/httpd.conf" name="httpd" 
server_root="/etc/httpd" shutdown_wait="5"/>
                </resources>
                <service autostart="1" domain="prefer_node1" exclusive="0" 
name="webby" recovery="relocate">
                        <ip ref="192.168.0.30"/>
                        <fs ref="docroot"/>
                        <apache ref="httpd"/>
                </service>
        </rm>
        <fence_xvmd/>
</cluster>
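
One thing that can be checked mechanically in a configuration like the one above is which address each fence device points at. As posted, both fencedevice entries use a cluster node's own hostname, while the manual fence_ilo test was run against the iLO IP. Whether that difference explains the failure is only a guess that this thread does not confirm. The sketch below uses a hypothetical function name and only the element and attribute names visible in the file above.

```python
import xml.etree.ElementTree as ET


def fence_device_report(cluster_conf_xml):
    """Map each cluster node to the fence devices it references.

    Returns a list of (node, device, agent, device_hostname, suspicious)
    tuples. "suspicious" is True when the device is missing, or when its
    hostname is itself the name of a cluster node rather than a separate
    management address.
    """
    root = ET.fromstring(cluster_conf_xml)
    node_names = {n.get("name") for n in root.iter("clusternode")}
    devices = {d.get("name"): d for d in root.iter("fencedevice")}

    report = []
    for node in root.iter("clusternode"):
        for ref in node.iter("device"):
            dev = devices.get(ref.get("name"))
            if dev is None:
                # The node references a device that is not defined.
                report.append((node.get("name"), ref.get("name"), None, None, True))
                continue
            host = dev.get("hostname")
            report.append(
                (node.get("name"), ref.get("name"), dev.get("agent"), host, host in node_names)
            )
    return report
```

Feeding the contents of cluster.conf to fence_device_report() and printing the rows marked suspicious gives a quick list of devices whose address should be compared with the iLO address used in the manual test.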

This is the first time I am working on clustering, so please help me.
I appreciate your help.

Thank you.



From esggrupos at gmail.com  Thu Sep  9 08:51:47 2010
From: esggrupos at gmail.com (ESGLinux)
Date: Thu, 9 Sep 2010 10:51:47 +0200
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <036B68E61A28CA49AC2767596576CD596BACEB2F3D@GVW1113EXC.americas.hpqcorp.net>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
	<AANLkTi=4w75C4mU++U=6E71zO2s1jHkPXhOZv0wUKEJp@mail.gmail.com>
	<036B68E61A28CA49AC2767596576CD596BACEB2F3D@GVW1113EXC.americas.hpqcorp.net>
Message-ID: <AANLkTinvwGjTYc01AgdghwTszSm2j5TJXyw+nEmEwGFe@mail.gmail.com>

Hi,

The only reason was that when I used it as shared, the speed of this device
was very, very low. Marked as non-shared, it works fine. I don't know the
reason. It was a trial-and-error test.

Greetings,

ESG

2010/9/9 Jankowski, Chris <Chris.Jankowski at hp.com>

>  Why did you have to set iLO as non-shared?
>
> Thank and regards,
>
> Chris
>
>  ------------------------------
> *From:* linux-cluster-bounces at redhat.com [mailto:
> linux-cluster-bounces at redhat.com] *On Behalf Of *ESGLinux
> *Sent:* Wednesday, 8 September 2010 22:57
> *To:* linux clustering
>
> *Subject:* Re: [Linux-cluster] need help - Fencing problem
>
> Hello,
>
> Have you configured the iLO devices entering in the BIOS?
>
> I remenber I have to set up the user/pass in the iLO and marked the iLo as
> not shared
>
>
> HTH,
>
> ESG
>
> 2010/9/8 Girish Prajapati <girishpati at yahoo.com>
>
>>  Hello Everybody,
>> i am having problem of fencing a cluster node  let me explain indetail :
>> I have installed RHEL 5.4 on  HP Prolaint DL280 G5 servers and iLO 2as
>> fencing device. Am managing cluster through Luci - (Conga). itseems
>> everything is working fine. I can reboot cluster nodes through Luci and
>> service get transfer to another node. After rebooting node connect to
>> cluster automatically without any error.
>> Problem is i can not do Fence this node through Luci, when i try to fence
>> any node i get following error :
>>
>> Sep  8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports: Unable
>> to connect/login to fencing device
>> Sep  8 14:51:16 node2 fence_node[9106]: Fence of "node1.drctmb.com" was
>> unsuccessful
>>
>> my iLO license is : iLO 2 Advanced Evaluation
>> Do i need to have  license of iLO or there is problem in configuration of
>> cluster ?
>> how i can check cluster log in details.
>>
>> Appreciate your help.
>> Thank you in advance.
>>
>> Regards,
>> Girishkumar R Prajapati
>>
>>

From rhurst at bidmc.harvard.edu  Thu Sep  9 13:34:20 2010
From: rhurst at bidmc.harvard.edu (rhurst at bidmc.harvard.edu)
Date: Thu, 9 Sep 2010 09:34:20 -0400
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
Message-ID: <50168EC934B8D64AA8D8DD37F840F3DE05640628E6@EVS2CCR.its.caregroup.org>

For what it is worth, our experiences with HP iLO management cards:

The iLO found on G1 servers does not need to be licensed; AFAIK, it does not have the option to do so anyway.

The iLO2 found on G2 and later servers does not need to be licensed either if you are only using it as a fencing device.  We licensed all of ours because the license enables a useful KVM with remote media capabilities that is superior to our Raritan KVM infrastructure.

Both management cards should have their firmware updated -- they were both problematic for us as factory-shipped, but applying their update packs allowed them to work as advertised.

Also, can't you add "-v" for verbose output and also something like "-D /tmp/fence.out" to save debugging info to an output file?  It might help to see exactly where the failure is occurring.  Good luck.

________________________________
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Girish Prajapati
Sent: Wednesday, September 08, 2010 6:06 AM
To: Linux-cluster at redhat.com
Subject: [Linux-cluster] need help - Fencing problem

Hello Everybody,
i am having problem of fencing a cluster node  let me explain indetail :
I have installed RHEL 5.4 on  HP Prolaint DL280 G5 servers and iLO 2as fencing device. Am managing cluster through Luci - (Conga). itseems everything is working fine. I can reboot cluster nodes through Luci and service get transfer to another node. After rebooting node connect to cluster automatically without any error.
Problem is i can not do Fence this node through Luci, when i try to fence any node i get following error :

Sep  8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports: Unable to connect/login to fencing device
Sep  8 14:51:16 node2 fence_node[9106]: Fence of "node1.drctmb.com" was unsuccessful

my iLO license is : iLO 2 Advanced Evaluation
Do i need to have  license of iLO or there is problem in configuration of cluster ?
how i can check cluster log in details.

Appreciate your help.
Thank you in advance.

Regards,
Girishkumar R Prajapati


From nehemiasjahcob at gmail.com  Thu Sep  9 14:18:31 2010
From: nehemiasjahcob at gmail.com (Nehemias Jahcob)
Date: Thu, 9 Sep 2010 10:18:31 -0400
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <50168EC934B8D64AA8D8DD37F840F3DE05640628E6@EVS2CCR.its.caregroup.org>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
	<50168EC934B8D64AA8D8DD37F840F3DE05640628E6@EVS2CCR.its.caregroup.org>
Message-ID: <AANLkTim-nS3c8e67kPycd-u0XFOMERR8EJorG6+xHn4M@mail.gmail.com>

1.) You can increase the verbosity level for troubleshooting:
   <cluster alias="girish" config_version="n+1" name="girish">
----
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3" log_level="7"/>
  <rm log_level="7">
-----
# ccs_tool update /etc/cluster/cluster.conf

Then copy and paste the resulting /var/log/messages output.


2.) Which version of the PSP do you have installed?

3.) If nothing works, I recommend using fence_ipmilan.
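
For the config_version="n+1" step in point 1, here is a minimal shell sketch of bumping the number before running ccs_tool update. The function name bump_config_version and the temporary-file approach are assumptions made for illustration; this is not part of any cluster tool, and it expects config_version to appear once in the file as a plain integer.

```shell
# Hypothetical helper: increment config_version in a cluster.conf-style file.
# Prints the new version number on success.
bump_config_version() {
    conf=$1
    # Read the current integer value of config_version (first match only).
    cur=$(sed -n 's/.*config_version="\([0-9][0-9]*\)".*/\1/p' "$conf" | head -n 1)
    if [ -z "$cur" ]; then
        echo "no config_version found in $conf" >&2
        return 1
    fi
    new=$((cur + 1))
    # Write to a temporary file first so a failed sed cannot truncate the config.
    tmp=$(mktemp) || return 1
    sed "s/config_version=\"$cur\"/config_version=\"$new\"/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    rc=$?
    rm -f "$tmp"
    [ "$rc" -eq 0 ] && echo "$new"
}
```

It could be used as: bump_config_version /etc/cluster/cluster.conf && ccs_tool update /etc/cluster/cluster.conf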

Greetings!


2010/9/9 <rhurst at bidmc.harvard.edu>

>  For what it is worth, our experiences with HP iLO management cards:
>
> iLO found on G1 servers does not need to be licensed, AFAIK, it does not
> have the option to do so anyways.
>
> iLO2 found on G2 and beyond does not need to be licensed either, if you are
> only using it as a fencing device.  We licensed all of ours, because it
> enabled useful KVM with remote media capabilities that are superior than our
> Raritan KVM infrastructure.
>
> Both management cards should have their firmware updated -- they were both
> problematic to us as factory-shipped, but applying their update
> packs allowed them to work as advertised.
>
> Also, can't you add "-v" for verbose output and also something like "-D
> /tmp/fence.out" to save debugging info to an output file?  It might help
> some to see where exactly the failure is occuring.  Good luck.
>
>  ------------------------------
> *From:* linux-cluster-bounces at redhat.com [mailto:
> linux-cluster-bounces at redhat.com] *On Behalf Of *Girish Prajapati
> *Sent:* Wednesday, September 08, 2010 6:06 AM
> *To:* Linux-cluster at redhat.com
> *Subject:* [Linux-cluster] need help - Fencing problem
>
>  Hello Everybody,
> i am having problem of fencing a cluster node  let me explain indetail :
> I have installed RHEL 5.4 on  HP Prolaint DL280 G5 servers and iLO 2as
> fencing device. Am managing cluster through Luci - (Conga). itseems
> everything is working fine. I can reboot cluster nodes through Luci and
> service get transfer to another node. After rebooting node connect to
> cluster automatically without any error.
> Problem is i can not do Fence this node through Luci, when i try to fence
> any node i get following error :
>
> Sep  8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> Sep  8 14:51:16 node2 fence_node[9106]: Fence of "node1.drctmb.com" was
> unsuccessful
>
> my iLO license is : iLO 2 Advanced Evaluation
> Do i need to have  license of iLO or there is problem in configuration of
> cluster ?
> how i can check cluster log in details.
>
> Appreciate your help.
> Thank you in advance.
>
> Regards,
> Girishkumar R Prajapati
>
>

From bturner at redhat.com  Thu Sep  9 15:58:45 2010
From: bturner at redhat.com (Ben Turner)
Date: Thu, 9 Sep 2010 11:58:45 -0400 (EDT)
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <374384.92522.qm@web120505.mail.ne1.yahoo.com>
Message-ID: <155361964.174311284047925612.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>

Judging from:

"Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports: Unable to connect/login to fencing device"

Chances are you are not using the correct username/password/IP or the ilo is not configured for telnet logins.  Try the following:

1.  Login to the ilo via telnet from the command line.  Be sure to use the username/password/IP you have in cluster.conf.

2.  If that is successful try:

# fence_ilo -v -a "Ilo IP from cluster.conf" -l "Ilo user from cluster.conf" -p "Ilo passwd from cluster.conf" -o status

The -v will display exactly what the fence agent sees and is very useful for debugging failing fences.  If the status fails send me the output.

3.  If fence_ilo is successful, try:

# fence_node <node name from cluster.conf>

If all 3 are successful, then fencing is set up properly and there may be a problem running it from Luci.  If any of the 3 fail, post the error back to the list and I'll look at it.
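
Steps 2 and 3 above can be chained in a small shell function, sketched here. The fence_ilo options and the fence_node call are the ones shown in this message; the function name check_fencing and its argument order are made up for illustration. Step 1 (the manual login test) is not covered. Note that step 3 really fences the node, so only run this against a node you are prepared to see power-cycled.

```shell
# Hypothetical wrapper around steps 2 and 3.
# Usage: check_fencing <ilo address> <ilo user> <ilo password> <node name>
check_fencing() {
    ilo_addr=$1
    ilo_user=$2
    ilo_pass=$3
    node=$4

    # Step 2: ask the iLO for power status, with verbose agent output.
    if ! fence_ilo -v -a "$ilo_addr" -l "$ilo_user" -p "$ilo_pass" -o status; then
        echo "fence_ilo status check failed for $ilo_addr" >&2
        return 1
    fi

    # Step 3: fence the node by its name from cluster.conf.
    # WARNING: this actually fences the node.
    if ! fence_node "$node"; then
        echo "fence_node failed for $node" >&2
        return 2
    fi

    echo "fencing path OK for $node"
}
```

A return value of 1 points at the iLO address or credentials; 2 points at how cluster.conf maps the node to its fence device.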

-Ben


----- "Girish Prajapati" <girishpati at yahoo.com> wrote:

> Hello,
> i can run following command successfully from another node but still
> getting same error message :
> 
> fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> 
> Sep 9 14:37:00 node2 openais[2904]: [CLM ] Members Joined:
> Sep 9 14:37:00 node2 openais[2904]: [SYNC ] This node is within the
> primary component and will provide service.
> Sep 9 14:37:00 node2 openais[2904]: [TOTEM] entering OPERATIONAL
> state.
> Sep 9 14:37:00 node2 openais[2904]: [CLM ] got nodejoin message
> 192.168.0.28
> Sep 9 14:37:00 node2 openais[2904]: [CPG ] got joinlist message from
> node 1
> Sep 9 14:37:00 node2 fenced[2923]: node1.drctmb.com not a cluster
> member after 0 sec post_fail_delay
> Sep 9 14:37:00 node2 fenced[2923]: fencing node "node1.drctmb.com"
> Sep 9 14:37:10 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> Sep 9 14:37:10 node2 fenced[2923]: fence "node1.drctmb.com" failed
> Sep 9 14:37:15 node2 fenced[2923]: fencing node "node1.drctmb.com"
> Sep 9 14:37:26 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> 
> node1 rebooted and get connect to the cluster but now my webby service
> not working see below log :
> 
> Broadcast message from root (Thu Sep 9 14:32:41 2010):
> The system is going down for system halt NOW!
> Sep 9 14:19:22 node1 last message repeated 17 times
> Sep 9 14:32:41 node1 shutdown[25506]: shutting down for system halt
> Sep 9 14:32:41 node1 pcscd: winscard.c:304:SCardConnect() Reader
> E-Gate 0 0 Not Found
> Sep 9 14:32:43 node1 modclusterd: shutdown succeeded
> Sep 9 14:32:43 node1 rgmanager: [25593]: <notice> Shutting down
> Cluster Service Manager...
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Stopping service
> service:webby
> Sep 9 14:32:44 node1 avahi-daemon[3378]: Withdrawing address record
> for 192.168.0.30 on eth0.
> Read from remote host node1: Connection reset by peer
> .
> .
> .
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/hda, packet devices
> [this device CD/DVD] not SMART capable
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, opened
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, IE (SMART) not
> enabled, skip device Try 'smartctl -s on /dev/sda' to turn on SMART
> features
> Sep 9 14:35:42 node1 smartd[3585]: Monitoring 0 ATA and 0 SCSI devices
> Sep 9 14:35:42 node1 smartd[3604]: smartd has fork()ed into background
> mode. New PID=3604.
> Sep 9 14:35:42 node1 avahi-daemon[3412]: Service "SFTP File Transfer
> on node1" (/services/sftp-ssh.service) successfully established.
> Sep 9 14:35:45 node1 pcscd: winscard.c:304:SCardConnect() Reader
> E-Gate 0 0 Not Found
> Sep 9 14:35:45 node1 last message repeated 3 times
> Sep 9 14:35:45 node1 kernel: mtrr: type mismatch for d8000000,2000000
> old: uncachable new: write-combining
> Sep 9 14:35:46 node1 clurgmgrd: [3491]: <err> Checking Existence Of
> File /var/run/cluster/apache/apache:httpd.pid [apache:httpd] > Failed
> - File Doesn't Exist
> 
> 
> 
> It seems that there problem in fencing device configuration.
> Please find here my cluster.conf :
> 
> 
> <?xml version="1.0"?>
> <cluster alias="girish" config_version="21" name="girish">
> <fence_daemon clean_start="0" post_fail_delay="0"
> post_join_delay="3"/>
> <clusternodes>
> <clusternode name=" node2.drctmb.com " nodeid="1" votes="1">
> <fence>
> <method name="1">
> <device name="NODE2"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="node1.drctmb.com" nodeid="2" votes="1">
> <fence>
> <method name="1">
> <device name="NODE1"/>
> </method>
> </fence>
> </clusternode>
> </clusternodes>
> <cman expected_votes="1" two_node="1"/>
> <fencedevices>
> <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
> login="root" name="NODE1" passwd="redhat123"/>
> <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
> login="root" name="NODE2" passwd="redhat123"/>
> </fencedevices>
> <rm>
> <failoverdomains>
> <failoverdomain name="prefer_node1" nofailback="0" ordered="1"
> restricted="1">
> <failoverdomainnode name="node2.drctmb.com" priority="2"/>
> <failoverdomainnode name="node1.drctmb.com" priority="1"/>
> </failoverdomain>
> </failoverdomains>
> <resources>
> <fs device="/dev/sda1" force_fsck="0" force_unmount="0" fsid="8669"
> fstype="ext3" mountpoint="/var/www/html" name="docroot"
> self_fence="0"/>
> <ip address="192.168.0.30" monitor_link="1"/>
> <apache config_file="conf/httpd.conf" name="httpd"
> server_root="/etc/httpd" shutdown_wait="5"/>
> </resources>
> <service autostart="1" domain="prefer_node1" exclusive="0"
> name="webby" recovery="relocate">
> <ip ref="192.168.0.30"/>
> <fs ref="docroot"/>
> <apache ref="httpd"/>
> </service>
> </rm>
> <fence_xvmd/>
> </cluster>
> ~
> 
> This is first time am working on Clustering so please help me.
> Appreciate your help.
> 
> Thank you.
> 
> 
> 
> From: Brem Belguebli <brem.belguebli at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Sent: Thu, September 9, 2010 11:30:28 AM
> Subject: Re: [Linux-cluster] need help - Fencing problem
> 
> try run this from another node of the cluster
> 
> fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> 
> 
> Additionnally, by connecting thru http to the Ilo, you should be able
> to
> see Ilo logs (in the general tab) and see if it is due to a lack of
> licensing
> 
> 
> On Wed, 2010-09-08 at 22:29 -0700, Girish Prajapati wrote:
> > Hello...
> >
> > I have already configure BIOS for iLO.. but am not sure why i don
> need
> > to shared ??
> > please anybody can help me out for this problem.
> > Do i need any extra setup for fencing device ?
> > thanks
> >
> >
> >
> >
> ______________________________________________________________________
> > From: ESGLinux < esggrupos at gmail.com >
> > To: linux clustering < linux-cluster at redhat.com >
> > Sent: Wed, September 8, 2010 2:57:25 PM
> > Subject: Re: [Linux-cluster] need help - Fencing problem
> >
> > Hello,
> >
> >
> > Have you configured the iLO devices entering in the BIOS?
> >
> >
> > I remenber I have to set up the user/pass in the iLO and marked the
> > iLo as not shared
> >
> >
> >
> >
> > HTH,
> >
> >
> > ESG
> >
> > 2010/9/8 Girish Prajapati < girishpati at yahoo.com >
> > Hello Everybody,
> > i am having problem of fencing a cluster node let me explain
> > indetail :
> > I have installed RHEL 5.4 on HP Prolaint DL280 G5 servers and
> > iLO 2as fencing device. Am managing cluster through Luci -
> > (Conga). itseems everything is working fine. I can reboot
> > cluster nodes through Luci and service get transfer to another
> > node. After rebooting node connect to cluster automatically
> > without any error.
> > Problem is i can not do Fence this node through Luci, when i
> > try to fence any node i get following error :
> >
> > Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo"
> > reports: Unable to connect/login to fencing device
> > Sep 8 14:51:16 node2 fence_node[9106]: Fence of
> > " node1.drctmb.com " was unsuccessful
> >
> > my iLO license is : iLO 2 Advanced Evaluation
> > Do i need to have license of iLO or there is problem in
> > configuration of cluster ?
> > how i can check cluster log in details.
> >
> > Appreciate your help.
> > Thank you in advance.
> >
> > Regards,
> > Girishkumar R Prajapati


From sagar.vipin at gmail.com  Thu Sep  9 15:43:09 2010
From: sagar.vipin at gmail.com (vipin sagar)
Date: Thu, 9 Sep 2010 21:13:09 +0530
Subject: [Linux-cluster] RHCS: High Availabilty on SAP Application
Message-ID: <AANLkTin7RsR2P9G1AyfcQD=9BZP_A6Ods3YGiDtwif5v@mail.gmail.com>

Hello There!

I am sure quite a lot of people here have worked on different types of
cluster setups. I myself have worked on setting up ROCKS and MPICH on the HPC
side.

Now I am looking for a head start on setting up an HA cluster for an "SAP
application", which includes the ABAP and ABAP+JAVA application stacks with
MaxDB on RHAS-5.5.
If any of you have worked with SAP on an RHCS HA setup, please share your
thoughts, inputs, and best practices; any kind of reference would be much
appreciated.

I have already read www.redhat.com/f/pdf/ha-sap-v1-6-4.pdf

Thank you for your time

~sagar

-- 
~O_0~
~sagar
http://vipinsagar.net
*...i've to look back when i heard a gong! i could only see a huge cobweb
and its shining, just got wonder, what the time it was?5AgAr*

From lhh at redhat.com  Thu Sep  9 17:59:05 2010
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 09 Sep 2010 13:59:05 -0400
Subject: [Linux-cluster] Creating custom OCF_RESKEY_ variables,
 now how and what to refresh?
In-Reply-To: <4C581396.2020107@gmail.com>
References: <AANLkTim+WEBDoEGeqAO6h75srW_v_dgUMuE9XxgZD6=m@mail.gmail.com>
	<4C581396.2020107@gmail.com>
Message-ID: <1284055145.2207.16059.camel@ayanami.boston.devel.redhat.com>

On Tue, 2010-08-03 at 08:03 -0500, Dustin Henry Offutt wrote:
> Still an unsolved mystery why new "rules" written into one of the
> cluster scripts located in /usr/share/cluster on an RHEL5U5 cluster
> won't get recognized by the cluster software, if anyone has a clue...
> > Hello,
> > 
> > Does anyone know how to force a cluster (the "Cluster Suite" as
> > released with the RHEL5.4 ISO, cman, rgmanager, et.al.) to recognize
> > that new OCF_RESKEY variables have been introduced in
> > a /usr/share/cluster/ script?


> > On one cluster the new variables are recognized and used by all
> > nodes. Same exact script, another cluster, just put the
> > updated /usr/share/cluster/ script in today, and it's like it hasn't
> > had something "refreshed", and doesn't see the new "rules," if that
> > makes any sense - despite bouncing the cluster suite and the cluster
> > nodes.

- The script needs to be mode 755.
- Update the config version in /etc/cluster/cluster.conf and run:
   ccs_tool update /etc/cluster/cluster.conf
   cman_tool version -r <new_config_version>
- Remove all backup files from /usr/share/cluster -OR- chmod -x them.
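
For the last item, a rough shell sketch for spotting executable backup copies in the agent directory. The function name list_backup_agents and the suffix patterns (*~, *.bak, *.orig, *.rpmsave) are assumptions; adjust them to whatever your editor or packaging leaves behind.

```shell
# Hypothetical helper: list files in the agent directory that are executable
# by their owner and whose names look like backup copies.
list_backup_agents() {
    dir=${1:-/usr/share/cluster}
    find "$dir" -maxdepth 1 -type f -perm -u+x \
        \( -name '*~' -o -name '*.bak' -o -name '*.orig' -o -name '*.rpmsave' \)
}
```

Anything it prints can then be removed or have its execute bit cleared with chmod -x, as described above.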

Sorry for the late response. :/

-- Lon


From lhh at redhat.com  Thu Sep  9 18:03:19 2010
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 09 Sep 2010 14:03:19 -0400
Subject: [Linux-cluster] What does FAIL_STOP_WAIT state mean for clvmd
 and rgmanager
In-Reply-To: <AANLkTimdQRytDkt2hYh0pvVA2KTXd+B9H+_x-FwjXSk2@mail.gmail.com>
References: <AANLkTimdQRytDkt2hYh0pvVA2KTXd+B9H+_x-FwjXSk2@mail.gmail.com>
Message-ID: <1284055399.2207.16065.camel@ayanami.boston.devel.redhat.com>

On Mon, 2010-08-23 at 17:58 +1000, Joel Heenan wrote:
> Can someone please explain what this means and what you can do to get
> out of it:
> 
> [root at cluster-host ~]# group_tool -v
> type             level name       id       state node id local_done
> fence            0     default    00010003 JOIN_STOP_WAIT 1 100050001 1
> [1 1 2 3 4]
> dlm              1     clvmd      00020003 FAIL_STOP_WAIT 2 200030003 1
> [1 2 3 4]
> dlm              1     rgmanager  00030003 FAIL_STOP_WAIT 2 200030003 1
> [1 2 3 4]

It looks like fencing has not completed.  How do you have 2 node 1's in
the fencing group?

-- Lon


From lhh at redhat.com  Thu Sep  9 18:06:22 2010
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 09 Sep 2010 14:06:22 -0400
Subject: [Linux-cluster] resource script vm.sh strange declare directive
In-Reply-To: <AANLkTinF9+WsNNf=pLsmhszRA=J_cQ_JYaReuYHhsUw6@mail.gmail.com>
References: <AANLkTinF9+WsNNf=pLsmhszRA=J_cQ_JYaReuYHhsUw6@mail.gmail.com>
Message-ID: <1284055582.2207.16071.camel@ayanami.boston.devel.redhat.com>

On Tue, 2010-08-24 at 17:57 +0200, brem belguebli wrote:
> Hi,
> 
> After not being able to live migrate cluster resource vm's from one
> node to the other using clusvcadm, I've put some debug in vm.sh and it
> allowed me to see a strange variable assignement that I do not
> understand and that prevents live migration to occur.
> 
> Rhel 5.5 /usr/share/cluster/vm.sh at line  790
> virsh_migrate()
>               declare $target=$1 <-- strange
> 
> Rhel 5.4 in /usr/share/cluster/vm.sh at line 631
> virsh_migrate()
>               declare $target=$1 <-- Same declaration
> 
> For information, when removing the $ before target, live migration
> works like a charm.

It should work either way.  That variable assignment doesn't actually
matter because of the way bash works.  The higher up function which
calls virsh_migrate function declares $target (correctly) and passes it
in as $1 to virsh_migrate.

Because $target's scope is actually global (declare does not create a
'local' variable; it creates a global one; the 'local' keyword creates a
'local' variable), the fact that there is a syntax error should not
matter in this case.

So, you'll get a weird error if you run this from the console but it
should not affect migration.

-- Lon


From lhh at redhat.com  Thu Sep  9 18:10:02 2010
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 09 Sep 2010 14:10:02 -0400
Subject: [Linux-cluster] active/active NFS cluster question
In-Reply-To: <AANLkTin3U411LS8C7KAOhiOnpHWZvct+9fsfLDX2d2M7@mail.gmail.com>
References: <AANLkTin3U411LS8C7KAOhiOnpHWZvct+9fsfLDX2d2M7@mail.gmail.com>
Message-ID: <1284055802.2207.16078.camel@ayanami.boston.devel.redhat.com>

On Wed, 2010-09-01 at 16:40 -0400, Chris Walker wrote:
> Hello,
> 
> I suspect that I'm doing something fairly stupid, but I'm having a
> problem with a cluster that is exporting the same GFS filesystems to
> the same nfs clients.  Everything thing is fine until I relocate one
> of the nfs services to another machine.  The relocation goes fine, but
> once I have two nfs services on the same machine, I can't get them
> apart.  When I relocate one of the two nfs services to a different
> cluster host, the relocation wipes out the entries in
> /var/lib/nfs/etab, forcing the second nfs service on that node to
> relocate as well (I get the error "nfsclient:rc_nfs_clients is
> missing!").

Please delete the name=" " from the nfsclient lines; it might be a bug
in luci, but you can't have "name" and "ref" in the same line.  It will
probably cause rgmanager to think the entire entry is missing in the
best case.

>                                 <nfsexport ref="md3000i01_nfsexp">
>                                         <nfsclient name=" "
> ref="rc_nfs_clients"/>
>                                 </nfsexport>

^^^^^

   <nfsclient ref="rc_nfs_clients" />
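
To find other entries with the same problem, something like the following shell sketch may help. The function name nfsclient_name_and_ref is made up, and matching XML with grep is only a rough approximation that assumes attributes are written with double quotes as in the snippet above.

```shell
# Hypothetical check: print <nfsclient> tags that carry both name= and ref=.
# Exit status is non-zero when no such tag is found (grep's convention).
nfsclient_name_and_ref() {
    conf=$1
    # Flatten the file so attributes wrapped onto the next line stay with their tag.
    tr '\n' ' ' < "$conf" |
        grep -o '<nfsclient [^>]*>' |
        grep ' name="' |
        grep ' ref="'
}
```

Run it as: nfsclient_name_and_ref /etc/cluster/cluster.conf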

-- Lon


From brem.belguebli at gmail.com  Thu Sep  9 18:37:29 2010
From: brem.belguebli at gmail.com (brem belguebli)
Date: Thu, 9 Sep 2010 20:37:29 +0200
Subject: [Linux-cluster] resource script vm.sh strange declare directive
In-Reply-To: <1284055582.2207.16071.camel@ayanami.boston.devel.redhat.com>
References: <AANLkTinF9+WsNNf=pLsmhszRA=J_cQ_JYaReuYHhsUw6@mail.gmail.com>
	<1284055582.2207.16071.camel@ayanami.boston.devel.redhat.com>
Message-ID: <AANLkTim6RPGw+bZx1f-taYa0X-OM86JPOQ58w3Gmv1qC@mail.gmail.com>

Hi Lon,

It did affect live migration. Once corrected, migration worked like a charm.

On my FC13 box, the script is "correct".
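
A small bash sketch of how the quoted line can behave differently depending on the caller. Only the declare line is taken from vm.sh; all function and node names here are made up. Whether the caller in vm.sh keeps the destination in a variable named target is not shown in this thread, so both cases are covered. In bash, $target is expanded before declare runs, so the buggy line declares a variable named after the *value* of target (or fails with "not a valid identifier" when that value is empty or contains dots).

```shell
#!/bin/bash
# Illustrative only: callee_buggy mimics the quoted vm.sh line; errors are
# silenced and ignored here so the sketch keeps running.
callee_buggy() {
    declare $target=$1 2>/dev/null || true   # $target is expanded first
    echo "target=${target:-<unset>}"
}

# The same function with the "$" removed.
callee_fixed() {
    declare target=$1                        # variable literally named "target"
    echo "target=${target:-<unset>}"
}

# Case A: the caller already holds the value in a variable named "target",
# which the callee can see while the caller is still running.
caller_sets_target() {
    local target=$1
    "$2" "$target"
}

# Case B: the caller keeps the value under a different name.
caller_other_name() {
    local dest=$1
    "$2" "$dest"
}
```

With no global variable named target set, "caller_sets_target nodeB callee_buggy" prints target=nodeB, while "caller_other_name nodeB callee_buggy" prints target=<unset>. The fixed function prints target=nodeB in both cases.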


Brem
2010/9/9 Lon Hohberger <lhh at redhat.com>:
> On Tue, 2010-08-24 at 17:57 +0200, brem belguebli wrote:
>> Hi,
>>
>> After not being able to live migrate cluster resource vm's from one
>> node to the other using clusvcadm, I've put some debug in vm.sh and it
>> allowed me to see a strange variable assignement that I do not
>> understand and that prevents live migration to occur.
>>
>> Rhel 5.5 /usr/share/cluster/vm.sh at line  790
>> virsh_migrate()
>>               declare $target=$1 <-- strange
>>
>> Rhel 5.4 in /usr/share/cluster/vm.sh at line 631
>> virsh_migrate()
>>               declare $target=$1 <-- Same declaration
>>
>> For information, when removing the $ before target, live migration
>> works like a charm.
>
> It should work either way.  That variable assignment doesn't actually
> matter because of the way bash works.  The higher up function which
> calls virsh_migrate function declares $target (correctly) and passes it
> in as $1 to virsh_migrate.
>
> Because $target's scope is actually global (declare does not create a
> 'local' variable; it creates a global one; the 'local' keyword creates a
> 'local' variable), the fact that there is a syntax error should not
> matter in this case.
>
> So, you'll get a weird error if you run this from the console but it
> should not affect migration.
>
> -- Lon
>


From anoop_rajkumar at merck.com  Thu Sep  9 20:12:25 2010
From: anoop_rajkumar at merck.com (Rajkumar, Anoop)
Date: Thu, 9 Sep 2010 16:12:25 -0400
Subject: [Linux-cluster] Linux-cluster Digest, Vol 77, Issue 5
In-Reply-To: <mailman.45.1284048007.14968.linux-cluster@redhat.com>
References: <mailman.45.1284048007.14968.linux-cluster@redhat.com>
Message-ID: <C651C3AA2A6A1D4980D35451DDE3F96B803C37@usctmx1160.merck.com>

 
Hi

It seems you are using the hostnames of the cluster nodes in place of the
hostnames of the iLO devices (each iLO should have its own IP address and
hostname in DNS).

In the config below, is node1.drctmb.com the hostname of the node or the
hostname of the iLO device? It should be the hostname of the iLO device.


<fencedevices>
> <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
> login="root" name="NODE1" passwd="redhat123"/>
> <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
> login="root" name="NODE2" passwd="redhat123"/>
> </fencedevices>
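
A quick way to check a cluster.conf for this, as a shell sketch: list any fencedevice hostname that is also used as a clusternode name. The function name fence_hosts_that_are_nodes is made up, and parsing XML with grep and sed is only an approximation that assumes double-quoted attributes as in the config above.

```shell
# Hypothetical check: print fencedevice hostname= values that are identical
# to a clusternode name= value in the same file.
fence_hosts_that_are_nodes() {
    conf=$1
    # Flatten the file so attributes wrapped onto the next line stay with their tag.
    flat=$(tr '\n' ' ' < "$conf")
    nodes=$(printf '%s\n' "$flat" | grep -o '<clusternode [^>]*>' |
            sed -n 's/.* name="\([^"]*\)".*/\1/p')
    printf '%s\n' "$flat" | grep -o '<fencedevice [^>]*>' |
        sed -n 's/.* hostname="\([^"]*\)".*/\1/p' |
        while read -r host; do
            for n in $nodes; do
                if [ "$host" = "$n" ]; then
                    echo "$host"
                fi
            done
        done
}
```

Any hostname it prints is worth checking: it should resolve to the iLO's own address rather than to the node itself.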

Thanks
Anoop

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of
linux-cluster-request at redhat.com
Sent: Thursday, September 09, 2010 12:00 PM
To: linux-cluster at redhat.com
Subject: Linux-cluster Digest, Vol 77, Issue 5

Send Linux-cluster mailing list submissions to
	linux-cluster at redhat.com

To subscribe or unsubscribe via the World Wide Web, visit
	https://www.redhat.com/mailman/listinfo/linux-cluster
or, via email, send a message with subject or body 'help' to
	linux-cluster-request at redhat.com

You can reach the person managing the list at
	linux-cluster-owner at redhat.com

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Linux-cluster digest..."


Today's Topics:

   1. Re: need help - Fencing problem (ESGLinux)
   2. Re: need help - Fencing problem (rhurst at bidmc.harvard.edu)
   3. Re: need help - Fencing problem (Nehemias Jahcob)
   4. Re: need help - Fencing problem (Ben Turner)


----------------------------------------------------------------------

Message: 1
Date: Thu, 9 Sep 2010 10:51:47 +0200
From: ESGLinux <esggrupos at gmail.com>
To: linux clustering <linux-cluster at redhat.com>
Subject: Re: [Linux-cluster] need help - Fencing problem
Message-ID:
	<AANLkTinvwGjTYc01AgdghwTszSm2j5TJXyw+nEmEwGFe at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

Hi,

the only reason was that when I used as shared the speed of this device
was
very very low. Marked it as non-shared it works fine. I don?t know the
reason. It was a try-error test,

Greetings,

ESG

2010/9/9 Jankowski, Chris <Chris.Jankowski at hp.com>

>  Why did you have to set iLO as non-shared?
>
> Thank and regards,
>
> Chris
>
>  ------------------------------
> *From:* linux-cluster-bounces at redhat.com [mailto:
> linux-cluster-bounces at redhat.com] *On Behalf Of *ESGLinux
> *Sent:* Wednesday, 8 September 2010 22:57
> *To:* linux clustering
>
> *Subject:* Re: [Linux-cluster] need help - Fencing problem
>
> Hello,
>
> Have you configured the iLO devices entering in the BIOS?
>
> I remenber I have to set up the user/pass in the iLO and marked the
iLo as
> not shared
>
>
> HTH,
>
> ESG
>
> 2010/9/8 Girish Prajapati <girishpati at yahoo.com>
>
>>  Hello Everybody,
>> i am having problem of fencing a cluster node  let me explain
indetail :
>> I have installed RHEL 5.4 on  HP Prolaint DL280 G5 servers and iLO
2as
>> fencing device. Am managing cluster through Luci - (Conga). itseems
>> everything is working fine. I can reboot cluster nodes through Luci
and
>> service get transfer to another node. After rebooting node connect to
>> cluster automatically without any error.
>> Problem is i can not do Fence this node through Luci, when i try to
fence
>> any node i get following error :
>>
>> Sep  8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports:
Unable
>> to connect/login to fencing device
>> Sep  8 14:51:16 node2 fence_node[9106]: Fence of "node1.drctmb.com"
was
>> unsuccessful
>>
>> my iLO license is : iLO 2 Advanced Evaluation
>> Do i need to have  license of iLO or there is problem in
configuration of
>> cluster ?
>> how i can check cluster log in details.
>>
>> Appreciate your help.
>> Thank you in advance.
>>
>> Regards,
>> Girishkumar R Prajapati
>>
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>

------------------------------

Message: 2
Date: Thu, 9 Sep 2010 09:34:20 -0400
From: <rhurst at bidmc.harvard.edu>
To: <linux-cluster at redhat.com>
Subject: Re: [Linux-cluster] need help - Fencing problem
Message-ID:
	<50168EC934B8D64AA8D8DD37F840F3DE05640628E6 at EVS2CCR.its.caregroup.org>
Content-Type: text/plain; charset="us-ascii"

For what it is worth, our experiences with HP iLO management cards:

iLO found on G1 servers does not need to be licensed, AFAIK, it does not
have the option to do so anyways.

iLO2 found on G2 and beyond does not need to be licensed either, if you
are only using it as a fencing device.  We licensed all of ours, because
it enabled useful KVM with remote media capabilities that are superior
to our Raritan KVM infrastructure.

Both management cards should have their firmware updated -- they were
both problematic to us as factory-shipped, but applying their update
packs allowed them to work as advertised.

Also, can't you add "-v" for verbose output and also something like
"-D /tmp/fence.out" to save debugging info to an output file?  It might help
some to see where exactly the failure is occurring.  Good luck.

________________________________
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Girish Prajapati
Sent: Wednesday, September 08, 2010 6:06 AM
To: Linux-cluster at redhat.com
Subject: [Linux-cluster] need help - Fencing problem

Hello Everybody,
I am having a problem fencing a cluster node; let me explain in detail:
I have installed RHEL 5.4 on HP ProLiant DL280 G5 servers with iLO 2 as
the fencing device. I am managing the cluster through Luci (Conga). It seems
everything is working fine. I can reboot cluster nodes through Luci, and the
service gets transferred to another node. After rebooting, the node connects
to the cluster automatically without any error.
The problem is that I cannot fence a node through Luci; when I try to
fence any node I get the following error:

Sep  8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports:
Unable to connect/login to fencing device
Sep  8 14:51:16 node2 fence_node[9106]: Fence of "node1.drctmb.com" was
unsuccessful

My iLO license is: iLO 2 Advanced Evaluation
Do I need to have a license for iLO, or is there a problem in the
configuration of the cluster?
How can I check the cluster log in detail?

Appreciate your help.
Thank you in advance.

Regards,
Girishkumar R Prajapati


------------------------------

Message: 3
Date: Thu, 9 Sep 2010 10:18:31 -0400
From: Nehemias Jahcob <nehemiasjahcob at gmail.com>
To: linux clustering <linux-cluster at redhat.com>
Subject: Re: [Linux-cluster] need help - Fencing problem
Message-ID:
	<AANLkTim-nS3c8e67kPycd-u0XFOMERR8EJorG6+xHn4M at mail.gmail.com>
Content-Type: text/plain; charset="iso-8859-1"

1.) You can increase the verbosity level for troubleshooting??
   <cluster alias="girish" config_version="n+1" name="girish">
----
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"
log_level="7"/>
  <rm log_level="7">
-----
# ccs_tool update /etc/cluster/cluster.conf

Copy-paste /var/log/messages


2.) What version of PSP do you have installed?

3.) If nothing works, I recommend using fence_ipmi
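
For point 1, here is a rough sketch of the edit in Python, in case it helps. It only assumes what is written above: config_version goes up by one and log_level="7" is set on fence_daemon and rm. The function name raise_log_level and the tiny EXAMPLE config are made up for illustration, and whether log_level is honoured may depend on your cluster version.

```python
import xml.etree.ElementTree as ET


def raise_log_level(conf_xml, level="7"):
    """Return cluster.conf text with config_version + 1 and log_level set.

    Hypothetical helper: it only touches the <fence_daemon> and <rm>
    elements that are direct children of <cluster>.
    """
    root = ET.fromstring(conf_xml)
    # The config version must increase before the update is propagated.
    root.set("config_version", str(int(root.get("config_version", "0")) + 1))
    for tag in ("fence_daemon", "rm"):
        elem = root.find(tag)
        if elem is not None:
            elem.set("log_level", level)
    return ET.tostring(root, encoding="unicode")


# Minimal stand-in for a real cluster.conf (an assumption, not the full file).
EXAMPLE = """<cluster alias="girish" config_version="21" name="girish">
  <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
  <rm/>
</cluster>
"""

updated = ET.fromstring(raise_log_level(EXAMPLE))
print(updated.get("config_version"), updated.find("fence_daemon").get("log_level"))
```

The result would still have to be written back to /etc/cluster/cluster.conf and pushed with the ccs_tool update command shown above.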

Greetings!


2010/9/9 <rhurst at bidmc.harvard.edu>

>  For what it is worth, our experiences with HP iLO management cards:
>
> iLO found on G1 servers does not need to be licensed, AFAIK, it does not
> have the option to do so anyways.
>
> iLO2 found on G2 and beyond does not need to be licensed either, if you
> are only using it as a fencing device.  We licensed all of ours, because
> it enabled useful KVM with remote media capabilities that are superior
> to our Raritan KVM infrastructure.
>
> Both management cards should have their firmware updated -- they were
> both problematic to us as factory-shipped, but applying their update
> packs allowed them to work as advertised.
>
> Also, can't you add "-v" for verbose output and also something like
> "-D /tmp/fence.out" to save debugging info to an output file?  It might help
> some to see where exactly the failure is occurring.  Good luck.
>
>  ------------------------------
> *From:* linux-cluster-bounces at redhat.com [mailto:
> linux-cluster-bounces at redhat.com] *On Behalf Of *Girish Prajapati
> *Sent:* Wednesday, September 08, 2010 6:06 AM
> *To:* Linux-cluster at redhat.com
> *Subject:* [Linux-cluster] need help - Fencing problem
>
>  Hello Everybody,
> I am having a problem fencing a cluster node; let me explain in detail:
> I have installed RHEL 5.4 on HP ProLiant DL280 G5 servers with iLO 2 as
> the fencing device. I am managing the cluster through Luci (Conga). It seems
> everything is working fine. I can reboot cluster nodes through Luci, and the
> service gets transferred to another node. After rebooting, the node connects
> to the cluster automatically without any error.
> The problem is that I cannot fence a node through Luci; when I try to fence
> any node I get the following error:
>
> Sep  8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> Sep  8 14:51:16 node2 fence_node[9106]: Fence of "node1.drctmb.com" was
> unsuccessful
>
> My iLO license is: iLO 2 Advanced Evaluation
> Do I need to have a license for iLO, or is there a problem in the
> configuration of the cluster?
> How can I check the cluster log in detail?
>
> Appreciate your help.
> Thank you in advance.
>
> Regards,
> Girishkumar R Prajapati
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>

------------------------------

Message: 4
Date: Thu, 9 Sep 2010 11:58:45 -0400 (EDT)
From: Ben Turner <bturner at redhat.com>
To: linux clustering <linux-cluster at redhat.com>
Subject: Re: [Linux-cluster] need help - Fencing problem
Message-ID:
	<155361964.174311284047925612.JavaMail.root at zmail07.collab.prod.int.phx2.redhat.com>
Content-Type: text/plain; charset=utf-8

Judging from:

"Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports:
Unable to connect/login to fencing device"

Chances are you are not using the correct username/password/IP or the
ilo is not configured for telnet logins.  Try the following:

1.  Login to the ilo via telnet from the command line.  Be sure to use
the username/password/IP you have in cluster.conf.

2.  If that is successful try:

# fence_ilo -v -a "Ilo IP from cluster.conf" -l "Ilo user from cluster.conf" -p "Ilo passwd from cluster.conf" -o status

The -v will display exactly what the fence agent sees and is very useful
for debugging failing fences.  If the status fails send me the output.

3.  If fence_ilo is successful, try:

# fence_node <node name from cluster.conf>

If all 3 are successful then fencing is set up properly and there may be
a problem running it from Luci. If any of the 3 fail, post the error back
to the list and I'll look at it.
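
To avoid typos in step 2, a small Python sketch along these lines could build the status command straight from the values in cluster.conf, so you test exactly what the cluster uses. The function name status_commands and the host, user and password in EXAMPLE are made up for illustration; only the fence_ilo options already shown above (-v, -a, -l, -p, -o status) are used.

```python
import shlex
import xml.etree.ElementTree as ET


def status_commands(conf_xml):
    """Build one 'fence_ilo ... -o status' command line per fence_ilo device."""
    root = ET.fromstring(conf_xml)
    commands = []
    for dev in root.iter("fencedevice"):
        if dev.get("agent") != "fence_ilo":
            continue  # other agents take different options
        argv = [
            "fence_ilo", "-v",
            "-a", dev.get("hostname", ""),
            "-l", dev.get("login", ""),
            "-p", dev.get("passwd", ""),
            "-o", "status",
        ]
        # shlex.join quotes any value that needs it for the shell.
        commands.append(shlex.join(argv))
    return commands


# Hypothetical device entry; hostname, login and passwd are placeholders.
EXAMPLE = """<cluster name="example">
  <fencedevices>
    <fencedevice agent="fence_ilo" hostname="ilo-node1.example.com"
                 login="fenceuser" name="NODE1" passwd="secret"/>
  </fencedevices>
</cluster>
"""

for line in status_commands(EXAMPLE):
    print(line)
```

The sketch only prints the command lines; it does not run them, so nothing gets fenced by accident.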

-Ben


----- "Girish Prajapati" <girishpati at yahoo.com> wrote:

> Hello,
> i can run following command successfully from another node but still
> getting same error message :
> 
> fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> 
> Sep 9 14:37:00 node2 openais[2904]: [CLM ] Members Joined:
> Sep 9 14:37:00 node2 openais[2904]: [SYNC ] This node is within the
> primary component and will provide service.
> Sep 9 14:37:00 node2 openais[2904]: [TOTEM] entering OPERATIONAL
> state.
> Sep 9 14:37:00 node2 openais[2904]: [CLM ] got nodejoin message
> 192.168.0.28
> Sep 9 14:37:00 node2 openais[2904]: [CPG ] got joinlist message from
> node 1
> Sep 9 14:37:00 node2 fenced[2923]: node1.drctmb.com not a cluster
> member after 0 sec post_fail_delay
> Sep 9 14:37:00 node2 fenced[2923]: fencing node "node1.drctmb.com"
> Sep 9 14:37:10 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> Sep 9 14:37:10 node2 fenced[2923]: fence "node1.drctmb.com" failed
> Sep 9 14:37:15 node2 fenced[2923]: fencing node "node1.drctmb.com"
> Sep 9 14:37:26 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> 
> node1 rebooted and get connect to the cluster but now my webby service
> not working see below log :
> 
> Broadcast message from root (Thu Sep 9 14:32:41 2010):
> The system is going down for system halt NOW!
> Sep 9 14:19:22 node1 last message repeated 17 times
> Sep 9 14:32:41 node1 shutdown[25506]: shutting down for system halt
> Sep 9 14:32:41 node1 pcscd: winscard.c:304:SCardConnect() Reader
> E-Gate 0 0 Not Found
> Sep 9 14:32:43 node1 modclusterd: shutdown succeeded
> Sep 9 14:32:43 node1 rgmanager: [25593]: <notice> Shutting down
> Cluster Service Manager...
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Stopping service
> service:webby
> Sep 9 14:32:44 node1 avahi-daemon[3378]: Withdrawing address record
> for 192.168.0.30 on eth0.
> Read from remote host node1: Connection reset by peer
> .
> .
> .
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/hda, packet devices
> [this device CD/DVD] not SMART capable
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, opened
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, IE (SMART) not
> enabled, skip device Try 'smartctl -s on /dev/sda' to turn on SMART
> features
> Sep 9 14:35:42 node1 smartd[3585]: Monitoring 0 ATA and 0 SCSI devices
> Sep 9 14:35:42 node1 smartd[3604]: smartd has fork()ed into background
> mode. New PID=3604.
> Sep 9 14:35:42 node1 avahi-daemon[3412]: Service "SFTP File Transfer
> on node1" (/services/sftp-ssh.service) successfully established.
> Sep 9 14:35:45 node1 pcscd: winscard.c:304:SCardConnect() Reader
> E-Gate 0 0 Not Found
> Sep 9 14:35:45 node1 last message repeated 3 times
> Sep 9 14:35:45 node1 kernel: mtrr: type mismatch for d8000000,2000000
> old: uncachable new: write-combining
> Sep 9 14:35:46 node1 clurgmgrd: [3491]: <err> Checking Existence Of
> File /var/run/cluster/apache/apache:httpd.pid [apache:httpd] > Failed
> - File Doesn't Exist
> 
> 
> 
> It seems that there problem in fencing device configuration.
> Please find here my cluster.conf :
> 
> 
> <?xml version="1.0"?>
> <cluster alias="girish" config_version="21" name="girish">
> <fence_daemon clean_start="0" post_fail_delay="0"
> post_join_delay="3"/>
> <clusternodes>
> <clusternode name=" node2.drctmb.com " nodeid="1" votes="1">
> <fence>
> <method name="1">
> <device name="NODE2"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="node1.drctmb.com" nodeid="2" votes="1">
> <fence>
> <method name="1">
> <device name="NODE1"/>
> </method>
> </fence>
> </clusternode>
> </clusternodes>
> <cman expected_votes="1" two_node="1"/>
> <fencedevices>
> <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
> login="root" name="NODE1" passwd="redhat123"/>
> <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
> login="root" name="NODE2" passwd="redhat123"/>
> </fencedevices>
> <rm>
> <failoverdomains>
> <failoverdomain name="prefer_node1" nofailback="0" ordered="1"
> restricted="1">
> <failoverdomainnode name="node2.drctmb.com" priority="2"/>
> <failoverdomainnode name="node1.drctmb.com" priority="1"/>
> </failoverdomain>
> </failoverdomains>
> <resources>
> <fs device="/dev/sda1" force_fsck="0" force_unmount="0" fsid="8669"
> fstype="ext3" mountpoint="/var/www/html" name="docroot"
> self_fence="0"/>
> <ip address="192.168.0.30" monitor_link="1"/>
> <apache config_file="conf/httpd.conf" name="httpd"
> server_root="/etc/httpd" shutdown_wait="5"/>
> </resources>
> <service autostart="1" domain="prefer_node1" exclusive="0"
> name="webby" recovery="relocate">
> <ip ref="192.168.0.30"/>
> <fs ref="docroot"/>
> <apache ref="httpd"/>
> </service>
> </rm>
> <fence_xvmd/>
> </cluster>
> ~
> 
> This is first time am working on Clustering so please help me.
> Appreciate your help.
> 
> Thank you.
> 
> 
> 
> From: Brem Belguebli <brem.belguebli at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Sent: Thu, September 9, 2010 11:30:28 AM
> Subject: Re: [Linux-cluster] need help - Fencing problem
> 
> try run this from another node of the cluster
> 
> fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> 
> 
> Additionally, by connecting through HTTP to the iLO, you should be able to
> see the iLO logs (in the general tab) and see if it is due to a lack of
> licensing
> 
> 
> On Wed, 2010-09-08 at 22:29 -0700, Girish Prajapati wrote:
> > Hello...
> >
> > I have already configure BIOS for iLO.. but am not sure why i don need
> > to shared ??
> > please anybody can help me out for this problem.
> > Do i need any extra setup for fencing device ?
> > thanks
> >
> >
> >
> >
> ______________________________________________________________________
> > From: ESGLinux < esggrupos at gmail.com >
> > To: linux clustering < linux-cluster at redhat.com >
> > Sent: Wed, September 8, 2010 2:57:25 PM
> > Subject: Re: [Linux-cluster] need help - Fencing problem
> >
> > Hello,
> >
> >
> > Have you configured the iLO devices entering in the BIOS?
> >
> >
> > I remenber I have to set up the user/pass in the iLO and marked the
> > iLo as not shared
> >
> >
> >
> >
> > HTH,
> >
> >
> > ESG
> >
> > 2010/9/8 Girish Prajapati < girishpati at yahoo.com >
> > Hello Everybody,
> > i am having problem of fencing a cluster node let me explain
> > indetail :
> > I have installed RHEL 5.4 on HP Prolaint DL280 G5 servers and
> > iLO 2as fencing device. Am managing cluster through Luci -
> > (Conga). itseems everything is working fine. I can reboot
> > cluster nodes through Luci and service get transfer to another
> > node. After rebooting node connect to cluster automatically
> > without any error.
> > Problem is i can not do Fence this node through Luci, when i
> > try to fence any node i get following error :
> >
> > Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo"
> > reports: Unable to connect/login to fencing device
> > Sep 8 14:51:16 node2 fence_node[9106]: Fence of
> > " node1.drctmb.com " was unsuccessful
> >
> > my iLO license is : iLO 2 Advanced Evaluation
> > Do i need to have license of iLO or there is problem in
> > configuration of cluster ?
> > how i can check cluster log in details.
> >
> > Appreciate your help.
> > Thank you in advance.
> >
> > Regards,
> > Girishkumar R Prajapati
> >
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster


------------------------------

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

End of Linux-cluster Digest, Vol 77, Issue 5
********************************************


From brem.belguebli at gmail.com  Fri Sep 10 08:14:17 2010
From: brem.belguebli at gmail.com (Brem Belguebli)
Date: Fri, 10 Sep 2010 10:14:17 +0200
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <744023.36408.qm@web120512.mail.ne1.yahoo.com>
References: <178789.16151.qm@web120516.mail.ne1.yahoo.com>
	<AANLkTi=4w75C4mU++U=6E71zO2s1jHkPXhOZv0wUKEJp@mail.gmail.com>
	<541653.54252.qm@web120512.mail.ne1.yahoo.com>
	<1284012028.3342.3.camel@newgen.localdomain>
	<744023.36408.qm@web120512.mail.ne1.yahoo.com>
Message-ID: <1284106457.3342.5.camel@newgen.localdomain>

The hostname field in the fence_ilo fencedevice line must be the hostname
(or IP address) of the iLO, not that of the node.

Regards

> <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
> login="root" name="NODE1" passwd="redhat123"/>
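
A quick way to spot this kind of mistake is sketched below in Python: it flags any fencedevice whose hostname is the same as a clusternode name. The function name suspicious_fence_devices is made up, and CLUSTER_CONF is a trimmed copy of the posted config (fence methods and passwords left out), so treat it as an illustration only.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the cluster.conf posted in this thread.
CLUSTER_CONF = """<?xml version="1.0"?>
<cluster alias="girish" config_version="21" name="girish">
  <clusternodes>
    <clusternode name="node2.drctmb.com" nodeid="1" votes="1"/>
    <clusternode name="node1.drctmb.com" nodeid="2" votes="1"/>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
                 login="root" name="NODE1"/>
    <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
                 login="root" name="NODE2"/>
  </fencedevices>
</cluster>
"""


def suspicious_fence_devices(conf_xml):
    """Return names of fence devices whose hostname equals a node name."""
    root = ET.fromstring(conf_xml)
    # Strip stray spaces around node names before comparing.
    node_names = {n.get("name", "").strip() for n in root.iter("clusternode")}
    return [
        dev.get("name")
        for dev in root.iter("fencedevice")
        if dev.get("hostname", "").strip() in node_names
    ]


print(suspicious_fence_devices(CLUSTER_CONF))
```

An empty list does not prove the hostnames are right; it only means none of them is literally a node name.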

On Thu, 2010-09-09 at 00:43 -0700, Girish Prajapati wrote:
> 
> 
> To: linux clustering <linux-cluster at redhat.com>
> Subject: Re: [Linux-cluster] need help - Fencing problem
> Date: Thu, 9 Sep 2010 00:43:45 -0700 (PDT) (09/09/2010 09:43:45 AM)
> 
> 
> Hello,
> i can run following command successfully from another node but still
> getting same error message : 


From girishpati at yahoo.com  Fri Sep 10 09:38:05 2010
From: girishpati at yahoo.com (Girish Prajapati)
Date: Fri, 10 Sep 2010 02:38:05 -0700 (PDT)
Subject: [Linux-cluster] Linux-cluster Digest, Vol 77, Issue 5
In-Reply-To: <C651C3AA2A6A1D4980D35451DDE3F96B803C37@usctmx1160.merck.com>
References: <mailman.45.1284048007.14968.linux-cluster@redhat.com>
	<C651C3AA2A6A1D4980D35451DDE3F96B803C37@usctmx1160.merck.com>
Message-ID: <449355.23468.qm@web120508.mail.ne1.yahoo.com>

Hello Mr. Anoop,

I have already tried with a different host and iLO name, but I am getting
the same error. Please let me know if there is any other way to
troubleshoot.

Thank you.

Regards,
Girishkumar R Prajapati


________________________________
From: "Rajkumar, Anoop" <anoop_rajkumar at merck.com>
To: linux-cluster at redhat.com
Sent: Fri, September 10, 2010 1:42:25 AM
Subject: Re: [Linux-cluster] Linux-cluster Digest, Vol 77, Issue 5


Hi

It seems you are using the hostname of the cluster nodes in place of the
hostname of the iLO (the iLO should have a separate IP and hostname in DNS).

In the config below, is node1.drctmb.com the hostname of the node or the
hostname of the iLO device? It should be the hostname of the iLO device.


<fencedevices>
> <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
> login="root" name="NODE1" passwd="redhat123"/>
> <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
> login="root" name="NODE2" passwd="redhat123"/>
> </fencedevices>

Thanks
Anoop

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of
linux-cluster-request at redhat.com
Sent: Thursday, September 09, 2010 12:00 PM
To: linux-cluster at redhat.com
Subject: Linux-cluster Digest, Vol 77, Issue 5

Message: 4
Date: Thu, 9 Sep 2010 11:58:45 -0400 (EDT)
From: Ben Turner <bturner at redhat.com>
To: linux clustering <linux-cluster at redhat.com>
Subject: Re: [Linux-cluster] need help - Fencing problem
Message-ID:
    
<155361964.174311284047925612.JavaMail.root at zmail07.collab.prod.int.phx2
.redhat.com>
    
Content-Type: text/plain; charset=utf-8

Judging from:

"Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports:
Unable to connect/login to fencing device"

Chances are you are not using the correct username/password/IP or the
ilo is not configured for telnet logins.  Try the following:

1.  Login to the ilo via telnet from the command line.  Be sure to use
the username/password/IP you have in cluster.conf.

2.  If that is successful try:

# fence_ilo -v -a "Ilo IP from cluster.conf" -l "Ilo user from cluster.conf" -p "Ilo passwd from cluster.conf" -o status

The -v will display exactly what the fence agent sees and is very useful
for debugging failing fences.  If the status fails send me the output.

3.  If fence_ilo is successful, try:

# fence_node <node name from cluster.conf>

If all 3 are successful then fencing is set up properly and there may be
a problem running it from Luci. If any of the 3 fail, post the error back
to the list and I'll look at it.

-Ben


----- "Girish Prajapati" <girishpati at yahoo.com> wrote:

> Hello,
> i can run following command successfully from another node but still
> getting same error message :
> 
> fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> 
> Sep 9 14:37:00 node2 openais[2904]: [CLM ] Members Joined:
> Sep 9 14:37:00 node2 openais[2904]: [SYNC ] This node is within the
> primary component and will provide service.
> Sep 9 14:37:00 node2 openais[2904]: [TOTEM] entering OPERATIONAL
> state.
> Sep 9 14:37:00 node2 openais[2904]: [CLM ] got nodejoin message
> 192.168.0.28
> Sep 9 14:37:00 node2 openais[2904]: [CPG ] got joinlist message from
> node 1
> Sep 9 14:37:00 node2 fenced[2923]: node1.drctmb.com not a cluster
> member after 0 sec post_fail_delay
> Sep 9 14:37:00 node2 fenced[2923]: fencing node "node1.drctmb.com"
> Sep 9 14:37:10 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> Sep 9 14:37:10 node2 fenced[2923]: fence "node1.drctmb.com" failed
> Sep 9 14:37:15 node2 fenced[2923]: fencing node "node1.drctmb.com"
> Sep 9 14:37:26 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> 
> node1 rebooted and get connect to the cluster but now my webby service
> not working see below log :
> 
> Broadcast message from root (Thu Sep 9 14:32:41 2010):
> The system is going down for system halt NOW!
> Sep 9 14:19:22 node1 last message repeated 17 times
> Sep 9 14:32:41 node1 shutdown[25506]: shutting down for system halt
> Sep 9 14:32:41 node1 pcscd: winscard.c:304:SCardConnect() Reader
> E-Gate 0 0 Not Found
> Sep 9 14:32:43 node1 modclusterd: shutdown succeeded
> Sep 9 14:32:43 node1 rgmanager: [25593]: <notice> Shutting down
> Cluster Service Manager...
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Stopping service
> service:webby
> Sep 9 14:32:44 node1 avahi-daemon[3378]: Withdrawing address record
> for 192.168.0.30 on eth0.
> Read from remote host node1: Connection reset by peer
> .
> .
> .
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/hda, packet devices
> [this device CD/DVD] not SMART capable
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, opened
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, IE (SMART) not
> enabled, skip device Try 'smartctl -s on /dev/sda' to turn on SMART
> features
> Sep 9 14:35:42 node1 smartd[3585]: Monitoring 0 ATA and 0 SCSI devices
> Sep 9 14:35:42 node1 smartd[3604]: smartd has fork()ed into background
> mode. New PID=3604.
> Sep 9 14:35:42 node1 avahi-daemon[3412]: Service "SFTP File Transfer
> on node1" (/services/sftp-ssh.service) successfully established.
> Sep 9 14:35:45 node1 pcscd: winscard.c:304:SCardConnect() Reader
> E-Gate 0 0 Not Found
> Sep 9 14:35:45 node1 last message repeated 3 times
> Sep 9 14:35:45 node1 kernel: mtrr: type mismatch for d8000000,2000000
> old: uncachable new: write-combining
> Sep 9 14:35:46 node1 clurgmgrd: [3491]: <err> Checking Existence Of
> File /var/run/cluster/apache/apache:httpd.pid [apache:httpd] > Failed
> - File Doesn't Exist
> 
> 
> 
> It seems that there is a problem in the fencing device configuration.
> Here is my cluster.conf:
> 
> 
> <?xml version="1.0"?>
> <cluster alias="girish" config_version="21" name="girish">
> <fence_daemon clean_start="0" post_fail_delay="0"
> post_join_delay="3"/>
> <clusternodes>
> <clusternode name=" node2.drctmb.com " nodeid="1" votes="1">
> <fence>
> <method name="1">
> <device name="NODE2"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="node1.drctmb.com" nodeid="2" votes="1">
> <fence>
> <method name="1">
> <device name="NODE1"/>
> </method>
> </fence>
> </clusternode>
> </clusternodes>
> <cman expected_votes="1" two_node="1"/>
> <fencedevices>
> <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
> login="root" name="NODE1" passwd="redhat123"/>
> <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
> login="root" name="NODE2" passwd="redhat123"/>
> </fencedevices>
> <rm>
> <failoverdomains>
> <failoverdomain name="prefer_node1" nofailback="0" ordered="1"
> restricted="1">
> <failoverdomainnode name="node2.drctmb.com" priority="2"/>
> <failoverdomainnode name="node1.drctmb.com" priority="1"/>
> </failoverdomain>
> </failoverdomains>
> <resources>
> <fs device="/dev/sda1" force_fsck="0" force_unmount="0" fsid="8669"
> fstype="ext3" mountpoint="/var/www/html" name="docroot"
> self_fence="0"/>
> <ip address="192.168.0.30" monitor_link="1"/>
> <apache config_file="conf/httpd.conf" name="httpd"
> server_root="/etc/httpd" shutdown_wait="5"/>
> </resources>
> <service autostart="1" domain="prefer_node1" exclusive="0"
> name="webby" recovery="relocate">
> <ip ref="192.168.0.30"/>
> <fs ref="docroot"/>
> <apache ref="httpd"/>
> </service>
> </rm>
> <fence_xvmd/>
> </cluster>
> ~
> 
> This is my first time working on clustering, so please help me.
> Appreciate your help.
> 
> Thank you.
> 
> 
> 
> From: Brem Belguebli <brem.belguebli at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Sent: Thu, September 9, 2010 11:30:28 AM
> Subject: Re: [Linux-cluster] need help - Fencing problem
> 
> Try running this from another node of the cluster:
> 
> fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> 
> 
> Additionally, by connecting through HTTP to the iLO, you should be able
> to see the iLO logs (in the general tab) and check whether it is due to a
> lack of licensing.
> 
> On Wed, 2010-09-08 at 22:29 -0700, Girish Prajapati wrote:
> > Hello...
> >
> > I have already configured the BIOS for iLO, but I am not sure why the iLO
> > must not be shared.
> > Can anybody please help me with this problem?
> > Do I need any extra setup for the fencing device?
> > thanks
> >
> >
> >
> >
> ______________________________________________________________________
> > From: ESGLinux < esggrupos at gmail.com >
> > To: linux clustering < linux-cluster at redhat.com >
> > Sent: Wed, September 8, 2010 2:57:25 PM
> > Subject: Re: [Linux-cluster] need help - Fencing problem
> >
> > Hello,
> >
> >
> > Have you configured the iLO devices by entering the BIOS?
> >
> >
> > I remember I had to set up the user/pass in the iLO and mark the
> > iLO as not shared.
> >
> >
> >
> >
> > HTH,
> >
> >
> > ESG
> >
> > 2010/9/8 Girish Prajapati < girishpati at yahoo.com >
> > Hello Everybody,
> > I am having a problem fencing a cluster node. Let me explain in
> > detail:
> > I have installed RHEL 5.4 on HP ProLiant DL280 G5 servers with
> > iLO 2 as the fencing device. I am managing the cluster through Luci
> > (Conga). It seems everything is working fine. I can reboot
> > cluster nodes through Luci and the service gets transferred to another
> > node. After rebooting, the node reconnects to the cluster automatically
> > without any error.
> > The problem is that I cannot fence a node through Luci. When I
> > try to fence any node I get the following error:
> >
> > Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo"
> > reports: Unable to connect/login to fencing device
> > Sep 8 14:51:16 node2 fence_node[9106]: Fence of
> > " node1.drctmb.com " was unsuccessful
> >
> > My iLO license is: iLO 2 Advanced Evaluation
> > Do I need an iLO license, or is there a problem in the
> > configuration of the cluster?
> > How can I check the cluster log in detail?
> >
> > Appreciate your help.
> > Thank you in advance.
> >
> > Regards,
> > Girishkumar R Prajapati
> >
> >
> >



From girishpati at yahoo.com  Fri Sep 10 09:32:35 2010
From: girishpati at yahoo.com (Girish Prajapati)
Date: Fri, 10 Sep 2010 02:32:35 -0700 (PDT)
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <155361964.174311284047925612.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
References: <155361964.174311284047925612.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Message-ID: <505609.49142.qm@web120502.mail.ne1.yahoo.com>

Hello Sir,

The 1st and 2nd steps passed successfully.
I also ran the command with the iLO's name and it succeeded, so there is no
DNS issue.

i) When I run the fence_node command I get the following error:

[root at node1 ~]# fence_node node2.drctmb.com
agent "fence_ilo" reports: Unable to connect/login to fencing device

ii) When I try to fence through Luci I get the following error:

Sep 10 11:13:10 tmb luci[24270]: Unable to retrieve batch 1700106142 status from 
node2.drctmb.com:11111: fence_node failed: 


Please let me know if there is any other way to troubleshoot this.

Thank you.

Regards,
Girishkumar 


________________________________
From: Ben Turner <bturner at redhat.com>
To: linux clustering <linux-cluster at redhat.com>
Sent: Thu, September 9, 2010 9:28:45 PM
Subject: Re: [Linux-cluster] need help - Fencing problem

Judging from:

"Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports: Unable to 
connect/login to fencing device"

Chances are you are not using the correct username/password/IP or the ilo is not 
configured for telnet logins.  Try the following:

1.  Login to the ilo via telnet from the command line.  Be sure to use the 
username/password/IP you have in cluster.conf.

2.  If that is successful try:

# fence_ilo -v -a "Ilo IP from cluster.conf" -l "Ilo user from cluster.conf" -p 
"Ilo passwd from cluster.conf" -o status

The -v will display exactly what the fence agent sees and is very useful for 
debugging failing fences.  If the status fails send me the output.

3.  If the fence_ilo status check is successful, try:

# fence_node <node name from cluster.conf>

If all 3 are successful then fencing is set up properly and there may be a 
problem running it from Luci. If any of the 3 fail, post the error back to the 
list and I'll look at it.

-Ben
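
A possible aid for step 2, offered only as a hedged sketch: the Python below
builds the fence_ilo status command from the fencedevice entries in
cluster.conf, so the manual test uses the same values the cluster would use.
It assumes the hostname, login and passwd attributes correspond to the -a, -l
and -p options shown above. The function name fence_ilo_status_commands and
the abbreviated SAMPLE_CONF (with a placeholder password) are made up for
illustration.

```python
import shlex
import xml.etree.ElementTree as ET

# Abbreviated, hypothetical cluster.conf; the password is a placeholder.
SAMPLE_CONF = """
<cluster name="girish">
  <fencedevices>
    <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
                 login="root" name="NODE1" passwd="PASSWORD"/>
    <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
                 login="root" name="NODE2" passwd="PASSWORD"/>
  </fencedevices>
</cluster>
"""


def fence_ilo_status_commands(cluster_conf_xml):
    """Return {fencedevice name: argv list} for each fence_ilo device."""
    root = ET.fromstring(cluster_conf_xml)
    commands = {}
    for dev in root.iter("fencedevice"):
        if dev.get("agent") != "fence_ilo":
            continue  # only iLO devices are of interest here
        commands[dev.get("name")] = [
            "fence_ilo", "-v",
            "-a", dev.get("hostname", ""),  # assumed mapping: hostname -> -a
            "-l", dev.get("login", ""),     # assumed mapping: login -> -l
            "-p", dev.get("passwd", ""),    # assumed mapping: passwd -> -p
            "-o", "status",
        ]
    return commands


# Print each command so it can be compared with what was typed by hand.
for name, argv in sorted(fence_ilo_status_commands(SAMPLE_CONF).items()):
    print(name + ": " + " ".join(shlex.quote(arg) for arg in argv))
```

If the -a value printed for a device is not the address that was typed by hand
when fence_ilo worked, that difference may be worth checking first.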



From Jost.Rakovec at snt.si  Sat Sep 11 16:36:44 2010
From: Jost.Rakovec at snt.si (Rakovec Jost)
Date: Sat, 11 Sep 2010 18:36:44 +0200
Subject: [Linux-cluster] fence in xen
Message-ID: <3754ED14F3EE0C459DEFE2DF184515FF0F101C719C@SIMAIL.snt-is.com>

Hi list!


I have a question about fence_xvm. 

The situation is:

One physical server with Xen: dom0 with 2 domUs. The cluster works fine between the domUs (reboot, relocate).

I'm using Red Hat 5.5.

The problem is with fencing from dom0 with "fence_xvm -H oelcl2": the domU is destroyed, but when it boots back up it can't join the cluster. The domU takes a very long time to boot --> FENCED_START_TIMEOUT=300


On the console I get this after node2 is up:

node2:

INFO: task clurgmgrd:2127 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
clurgmgrd     D 0000000000000010     0  2127   2126                     (NOTLB)
 ffff88006f08dda8  0000000000000286  ffff88007cc0b810  0000000000000000
 0000000000000003  ffff880072009860  ffff880072f6b0c0  00000000000455ec
 ffff880072009a48  ffffffff802649d7
Call Trace:
 [<ffffffff802649d7>] _read_lock_irq+0x9/0x19
 [<ffffffff8021420e>] filemap_nopage+0x193/0x360
 [<ffffffff80263a7e>] __mutex_lock_slowpath+0x60/0x9b
 [<ffffffff80263ac8>] .text.lock.mutex+0xf/0x14
 [<ffffffff88424b64>] :dlm:dlm_new_lockspace+0x2c/0x860
 [<ffffffff80222b08>] __up_read+0x19/0x7f
 [<ffffffff802d0abb>] __kmalloc+0x8f/0x9f
 [<ffffffff8842b6fa>] :dlm:device_write+0x438/0x5e5
 [<ffffffff80217377>] vfs_write+0xce/0x174
 [<ffffffff80217bc4>] sys_write+0x45/0x6e
 [<ffffffff802602f9>] tracesys+0xab/0xb6


During boot on node2:
 
Starting clvmd: dlm: Using TCP for communications
clvmd startup timed out
[FAILED]


node2:

[root at oelcl2 init.d]# clustat
Cluster Status for cluster1 @ Sat Sep 11 18:11:21 2010
Member Status: Quorate

 Member Name                                                ID   Status
 ------ ----                                                ---- ------
 oelcl1                                                  1 Online
 oelcl2                                                 2 Online, Local

[root at oelcl2 init.d]#


on first node:

[root at oelcl1 ~]# clustat
Cluster Status for cluster1 @ Sat Sep 11 18:12:07 2010
Member Status: Quorate

 Member Name                                                ID   Status
 ------ ----                                                ---- ------
 oelcl1                                                  1 Online, Local, rgmanager
 oelcl2                                                  2 Online, rgmanager

 Service Name                                      Owner (Last)                                      State
 ------- ----                                      ----- ------                                      -----
 service:webby                                     oelcl1                                     started
[root at oelcl1 ~]#


Then I have to destroy both domUs and create them again to get node2 working again.

I followed the how-tos at https://access.redhat.com/kb/docs/DOC-5937 and http://sources.redhat.com/cluster/wiki/VMClusterCookbook


cluster config on dom0


<?xml version="1.0"?>
<cluster alias="vmcluster" config_version="1" name="vmcluster">
        <clusternodes>
                <clusternode name="vm5" nodeid="1" votes="1"/>
        </clusternodes>
        <cman/>
        <fencedevices/>
        <rm/>
        <fence_xvmd/>
</cluster>


cluster config on domU


<?xml version="1.0"?>
<cluster alias="cluster1" config_version="49" name="cluster1">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="4"/>
        <clusternodes>
                <clusternode name="oelcl1.name.comi" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="oelcl1" name="xenfence1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="oelcl2.name.com" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="oelcl2" name="xenfence1"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_xvm" name="xenfence1"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="prefer_node1" nofailback="0" ordered="1" restricted="1">
                                <failoverdomainnode name="oelcl1.name.com" priority="1"/>
                                <failoverdomainnode name="oelcl2.name.com" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <ip address="xx.xx.xx.xx" monitor_link="1"/>
                        <fs device="/dev/xvdb1" force_fsck="0" force_unmount="0" fsid="8669" fstype="ext3" mountpoint="/var/www/html" name="docroot" self_fence="0"/>
                        <script file="/etc/init.d/httpd" name="apache_s"/>
                </resources>
                <service autostart="1" domain="prefer_node1" exclusive="0" name="webby" recovery="relocate">
                        <ip ref="xx.xx.xx.xx"/>
                        <fs ref="docroot"/>
                        <script ref="apache_s"/>
                </service>
        </rm>
</cluster>
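
Since the names used in different parts of cluster.conf are expected to agree
with each other, a quick consistency check may help before digging into
fencing itself. The following is a minimal Python sketch, not part of the
cluster tools: it reports clusternode names with stray whitespace,
failoverdomainnode names that match no clusternode, and fence device
references that match no fencedevice. The function name check_cluster_conf and
the abbreviated SAMPLE_CONF are illustrative assumptions.

```python
import xml.etree.ElementTree as ET

# Abbreviated copy of the domU configuration, for illustration only.
SAMPLE_CONF = """
<cluster name="cluster1">
  <clusternodes>
    <clusternode name="oelcl1.name.comi" nodeid="1">
      <fence><method name="1">
        <device domain="oelcl1" name="xenfence1"/>
      </method></fence>
    </clusternode>
    <clusternode name="oelcl2.name.com" nodeid="2">
      <fence><method name="1">
        <device domain="oelcl2" name="xenfence1"/>
      </method></fence>
    </clusternode>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_xvm" name="xenfence1"/>
  </fencedevices>
  <rm><failoverdomains><failoverdomain name="prefer_node1">
    <failoverdomainnode name="oelcl1.name.com" priority="1"/>
    <failoverdomainnode name="oelcl2.name.com" priority="2"/>
  </failoverdomain></failoverdomains></rm>
</cluster>
"""


def check_cluster_conf(xml_text):
    """Return a list of human-readable naming inconsistencies."""
    root = ET.fromstring(xml_text)
    problems = []

    # Collect node names and flag leading/trailing whitespace.
    node_names = set()
    for node in root.iter("clusternode"):
        name = node.get("name", "")
        if name != name.strip():
            problems.append("clusternode name has surrounding whitespace: %r" % name)
        node_names.add(name)

    # Every failover domain member should name an existing clusternode.
    for member in root.iter("failoverdomainnode"):
        name = member.get("name", "")
        if name not in node_names:
            problems.append("failoverdomainnode %r matches no clusternode" % name)

    # Every per-node fence device should refer to a defined fencedevice.
    device_names = {dev.get("name") for dev in root.iter("fencedevice")}
    for node in root.iter("clusternode"):
        for dev in node.iter("device"):
            if dev.get("name") not in device_names:
                problems.append("device %r on node %r matches no fencedevice"
                                % (dev.get("name"), node.get("name")))
    return problems


for problem in check_cluster_conf(SAMPLE_CONF):
    print(problem)
```

On the configuration as posted, this flags oelcl1.name.com in the failover
domain, because the clusternode is written oelcl1.name.comi. Whether that
spelling is in the real file or only a paste slip is not known.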


fence proces on dom0

[root at vm5 cluster]# ps -ef |grep fenc
root     18690     1  0 17:40 ?        00:00:00 /sbin/fenced
root     18720     1  0 17:40 ?        00:00:00 /sbin/fence_xvmd -I xenbr0
root     22633 14524  0 18:21 pts/3    00:00:00 grep fenc
[root at vm5 cluster]#


and on domU

[root at oelcl1 ~]# ps -ef|grep fen
root      1523     1  0 17:41 ?        00:00:00 /sbin/fenced
root     13695  2902  0 18:22 pts/0    00:00:00 grep fen
[root at oelcl1 ~]#


Does anybody have any idea why fencing doesn't work?

thx

br

jost


From Jost.Rakovec at snt.si  Mon Sep 13 07:31:37 2010
From: Jost.Rakovec at snt.si (Rakovec Jost)
Date: Mon, 13 Sep 2010 09:31:37 +0200
Subject: [Linux-cluster] fence in xen
In-Reply-To: <3754ED14F3EE0C459DEFE2DF184515FF0F101C719C@SIMAIL.snt-is.com>
References: <3754ED14F3EE0C459DEFE2DF184515FF0F101C719C@SIMAIL.snt-is.com>
Message-ID: <3754ED14F3EE0C459DEFE2DF184515FF0F101C719D@SIMAIL.snt-is.com>

Hi


Q: Does fence_xvmd also have to run in the domU?
I ask because I noticed that this works when I run it on a host while fence_xvmd is running:
 
[root at oelcl1 ~]# fence_xvm -H oelcl2 -ddd -o null
Debugging threshold is now 3
-- args @ 0x7fffe3f71fb0 --
  args->addr = 225.0.0.12
  args->domain = oelcl2
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 0
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 0
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 0
  args->debug = 3
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0x7fffe3f70f60 (4096 max size)
Actual key length = 4096 bytesSending to 225.0.0.12 via 127.0.0.1
Sending to 225.0.0.12 via 10.9.131.80
Sending to 225.0.0.12 via 10.9.131.83
Sending to 225.0.0.12 via 192.168.122.1
Waiting for connection from XVM host daemon.
Issuing TCP challenge
Responding to TCP challenge
TCP Exchange + Authentication done...
Waiting for return value from XVM host
Remote: Operation was successful


But if I try to fence (reboot), I get:

[root at oelcl1 ~]# fence_xvm -H oelc2
Remote: Operation was successful
[root at oelcl1 ~]#

but host2 is not rebooted.


If fence_xvmd is not running on the hosts, I get a timeout:


[root at oelcl1 sysconfig]# fence_xvm -H oelcl2 -ddd -o null
Debugging threshold is now 3
-- args @ 0x7fff1a6b5580 --
  args->addr = 225.0.0.12
  args->domain = oelcl2
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 0
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 0
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 0
  args->debug = 3
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0x7fff1a6b4530 (4096 max size)
Actual key length = 4096 bytesSending to 225.0.0.12 via 127.0.0.1
Sending to 225.0.0.12 via 10.9.131.80
Waiting for connection from XVM host daemon.
Sending to 225.0.0.12 via 127.0.0.1
Sending to 225.0.0.12 via 10.9.131.80
Waiting for connection from XVM host daemon.


Q: How can I test whether multicast is OK?
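
One rough way to test this, offered as a sketch rather than an official tool:
run a small listener on dom0 that joins the multicast group and port shown in
the debug output above (225.0.0.12, port 1229), then run "fence_xvm -H oelcl2
-o null" from a domU and see whether a packet arrives. This assumes the
fence_xvm request is a UDP multicast datagram and that fence_xvmd is stopped
during the test so the port is free. The function names below are
illustrative, not part of any cluster package.

```python
import socket
import struct


def make_membership_request(group, iface_ip="0.0.0.0"):
    """Pack the ip_mreq structure: group address, then local interface address."""
    return struct.pack("4s4s", socket.inet_aton(group), socket.inet_aton(iface_ip))


def listen_once(group="225.0.0.12", port=1229, iface_ip="0.0.0.0", timeout=30.0):
    """Join the group and wait for one datagram.

    Returns (sender address, payload length), or None if nothing arrived
    before the timeout. Pass the dom0 bridge address as iface_ip to test a
    specific interface.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM, socket.IPPROTO_UDP)
    try:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        sock.bind(("", port))
        sock.setsockopt(socket.IPPROTO_IP, socket.IP_ADD_MEMBERSHIP,
                        make_membership_request(group, iface_ip))
        sock.settimeout(timeout)
        data, sender = sock.recvfrom(65535)
        return sender[0], len(data)
    except socket.timeout:
        return None
    finally:
        sock.close()


# Example use on dom0 (not run automatically):
#   result = listen_once(iface_ip="<dom0 address on the bridge>")
#   print(result if result else "no multicast packet received")
```

If the listener sees nothing while fence_xvm reports "Sending to 225.0.0.12",
the multicast traffic is probably not reaching dom0 on that interface.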

Q: On which network interface must fence_xvmd run on dom0? I notice that on the hosts (domU) there is:

virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:40 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:7212 (7.0 KiB)


also virbr0

and on dom0 guest:

[root at vm5 ~]# fence_xvmd -fdd -I xenbr0
-- args @ 0xbfd26234 --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 7
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 1
  args->debug = 2
-- end args --
Opened ckpt vm_states
My Node ID = 1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
oelcl1                   2a53022c-5836-68f0-4514-02a5a0b07e81 00001 00002
oelcl2                   dd268dd4-f012-e0f7-7c77-aa8a58e1e6ab 00001 00002
oelcman                  09c783bd-9107-0916-ebbf-bd27bcc8babe 00001 00002
Storing oelcl1
Storing oelcl2


[root at vm5 ~]# fence_xvmd -fdd -I virbr0
-- args @ 0xbfd26234 --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 7
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 1
  args->debug = 2
-- end args --
Opened ckpt vm_states
My Node ID = 1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
oelcl1                   2a53022c-5836-68f0-4514-02a5a0b07e81 00001 00002
oelcl2                   dd268dd4-f012-e0f7-7c77-aa8a58e1e6ab 00001 00002
oelcman                  09c783bd-9107-0916-ebbf-bd27bcc8babe 00001 00002
Storing oelcl1
Storing oelcl2


No matter which interface I use, the fence is not done.


thx

br jost 


_____________________________________
From: linux-cluster-bounces at redhat.com [linux-cluster-bounces at redhat.com] On Behalf Of Rakovec Jost [Jost.Rakovec at snt.si]
Sent: Saturday, September 11, 2010 6:36 PM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] fence in xen

Hi list!


I have a question about fence_xvm.

Situation is:

one physical server with xen --> dom0  with 2 domU. Cluster work fine between domU --reboot, relocate,

I'm using redhat 5.5

Problem is with fence from dom0  with "fence_xvm -H oelcl2" ,  domU is destroyed but when it is booted back domU can't join to the cluster. domU boot very long time --> FENCED_START_TIMEOUT=300


on console I get after the node2 is up:

node2:

INFO: task clurgmgrd:2127 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
clurgmgrd     D 0000000000000010     0  2127   2126                     (NOTLB)
 ffff88006f08dda8  0000000000000286  ffff88007cc0b810  0000000000000000
 0000000000000003  ffff880072009860  ffff880072f6b0c0  00000000000455ec
 ffff880072009a48  ffffffff802649d7
Call Trace:
 [<ffffffff802649d7>] _read_lock_irq+0x9/0x19
 [<ffffffff8021420e>] filemap_nopage+0x193/0x360
 [<ffffffff80263a7e>] __mutex_lock_slowpath+0x60/0x9b
 [<ffffffff80263ac8>] .text.lock.mutex+0xf/0x14
 [<ffffffff88424b64>] :dlm:dlm_new_lockspace+0x2c/0x860
 [<ffffffff80222b08>] __up_read+0x19/0x7f
 [<ffffffff802d0abb>] __kmalloc+0x8f/0x9f
 [<ffffffff8842b6fa>] :dlm:device_write+0x438/0x5e5
 [<ffffffff80217377>] vfs_write+0xce/0x174
 [<ffffffff80217bc4>] sys_write+0x45/0x6e
 [<ffffffff802602f9>] tracesys+0xab/0xb6


between booting on node2:

Starting clvmd: dlm: Using TCP for communications
clvmd startup timed out
[FAILED]


node2:

[root at oelcl2 init.d]# clustat
Cluster Status for cluster1 @ Sat Sep 11 18:11:21 2010
Member Status: Quorate

 Member Name                                                ID   Status
 ------ ----                                                ---- ------
 oelcl1                                                  1 Online
 oelcl2                                                 2 Online, Local

[root at oelcl2 init.d]#


on first node:

[root at oelcl1 ~]# clustat
Cluster Status for cluster1 @ Sat Sep 11 18:12:07 2010
Member Status: Quorate

 Member Name                                                ID   Status
 ------ ----                                                ---- ------
 oelcl1                                                  1 Online, Local, rgmanager
 oelcl2                                                  2 Online, rgmanager

 Service Name                                      Owner (Last)                                      State
 ------- ----                                      ----- ------                                      -----
 service:webby                                     oelcl1                                     started
[root at oelcl1 ~]#


Then I have to destroy both domU guests and create them again to get node2 working.

I followed the how-tos at https://access.redhat.com/kb/docs/DOC-5937 and http://sources.redhat.com/cluster/wiki/VMClusterCookbook


cluster config on dom0


<?xml version="1.0"?>
<cluster alias="vmcluster" config_version="1" name="vmcluster">
        <clusternodes>
                <clusternode name="vm5" nodeid="1" votes="1"/>
        </clusternodes>
        <cman/>
        <fencedevices/>
        <rm/>
        <fence_xvmd/>
</cluster>


cluster config on domU


<?xml version="1.0"?>
<cluster alias="cluster1" config_version="49" name="cluster1">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="4"/>
        <clusternodes>
                <clusternode name="oelcl1.name.comi" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="oelcl1" name="xenfence1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="oelcl2.name.com" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="oelcl2" name="xenfence1"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_xvm" name="xenfence1"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="prefer_node1" nofailback="0" ordered="1" restricted="1">
                                <failoverdomainnode name="oelcl1.name.com" priority="1"/>
                                <failoverdomainnode name="oelcl2.name.com" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <ip address="xx.xx.xx.xx" monitor_link="1"/>
                        <fs device="/dev/xvdb1" force_fsck="0" force_unmount="0" fsid="8669" fstype="ext3" mountpoint="/var/www/html" name="docroot" self_fence="0"/>
                        <script file="/etc/init.d/httpd" name="apache_s"/>
                </resources>
                <service autostart="1" domain="prefer_node1" exclusive="0" name="webby" recovery="relocate">
                        <ip ref="xx.xx.xx.xx"/>
                        <fs ref="docroot"/>
                        <script ref="apache_s"/>
                </service>
        </rm>
</cluster>
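
A config like this can be sanity-checked mechanically. The sketch below (Python, standard library only) compares the failoverdomainnode names against the clusternode names and lists any that have no match. The helper name unmatched_failover_nodes is made up for illustration, and CONF is a trimmed copy of the config above, not the full file. A mismatch reported here may simply be a typo introduced while posting; it is not necessarily the cause of the fencing problem.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the domU cluster.conf above (only the parts the check reads).
CONF = """<?xml version="1.0"?>
<cluster alias="cluster1" config_version="49" name="cluster1">
  <clusternodes>
    <clusternode name="oelcl1.name.comi" nodeid="1" votes="1"/>
    <clusternode name="oelcl2.name.com" nodeid="2" votes="1"/>
  </clusternodes>
  <rm>
    <failoverdomains>
      <failoverdomain name="prefer_node1" nofailback="0" ordered="1" restricted="1">
        <failoverdomainnode name="oelcl1.name.com" priority="1"/>
        <failoverdomainnode name="oelcl2.name.com" priority="2"/>
      </failoverdomain>
    </failoverdomains>
  </rm>
</cluster>
"""


def unmatched_failover_nodes(conf_text):
    """Return failoverdomainnode names that match no clusternode name."""
    root = ET.fromstring(conf_text)
    node_names = {node.get("name") for node in root.iter("clusternode")}
    return sorted(
        member.get("name")
        for member in root.iter("failoverdomainnode")
        if member.get("name") not in node_names
    )


if __name__ == "__main__":
    # Prints the failover domain members that have no matching cluster node.
    print(unmatched_failover_nodes(CONF))
```

An empty list means every failover domain member refers to a defined cluster node.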


fence processes on dom0

[root at vm5 cluster]# ps -ef |grep fenc
root     18690     1  0 17:40 ?        00:00:00 /sbin/fenced
root     18720     1  0 17:40 ?        00:00:00 /sbin/fence_xvmd -I xenbr0
root     22633 14524  0 18:21 pts/3    00:00:00 grep fenc
[root at vm5 cluster]#


and on domU

[root at oelcl1 ~]# ps -ef|grep fen
root      1523     1  0 17:41 ?        00:00:00 /sbin/fenced
root     13695  2902  0 18:22 pts/0    00:00:00 grep fen
[root at oelcl1 ~]#


Does anybody have any idea why fencing doesn't work?

thx

br

jost



From girishpati at yahoo.com  Mon Sep 13 07:44:33 2010
From: girishpati at yahoo.com (Girish Prajapati)
Date: Mon, 13 Sep 2010 00:44:33 -0700 (PDT)
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <505609.49142.qm@web120502.mail.ne1.yahoo.com>
References: <155361964.174311284047925612.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
	<505609.49142.qm@web120502.mail.ne1.yahoo.com>
Message-ID: <962762.73149.qm@web120501.mail.ne1.yahoo.com>

Hello,

Any update, sir?


________________________________
From: Girish Prajapati <girishpati at yahoo.com>
To: linux clustering <linux-cluster at redhat.com>
Sent: Fri, September 10, 2010 11:32:35 AM
Subject: Re: [Linux-cluster] need help - Fencing problem


Hello Sir,

1st and 2nd option passed successfully.
i also try to run command with ilo's name and it run successfully so there is no 
issue of DNS.

i ) when i try to run fence_node command i get the following error:

[root at node1 ~]# fence_node node2.drctmb.com
agent "fence_ilo" reports: Unable to connect/login to fencing device

ii) when i try to fence through Luci i get following error:

Sep 10 11:13:10 tmb luci[24270]: Unable to retrieve batch 1700106142 status from 
node2.drctmb.com:11111: fence_node failed: 


Please let me know if there is any other why for troubleshoot

Thank you.

Regards,
Girishkumar 


________________________________
From: Ben Turner <bturner at redhat.com>
To: linux clustering <linux-cluster at redhat.com>
Sent: Thu, September 9, 2010 9:28:45 PM
Subject: Re: [Linux-cluster] need help - Fencing problem

Judging from:

"Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports: Unable to 
connect/login to fencing device"

Chances are you are not using the correct username/password/IP or the ilo is not 
configured for telnet logins.  Try the following:

1.  Login to the ilo via telnet from the command line.  Be sure to use the 
username/password/IP you have in cluster.conf.

2.  If that is successful try:

# fence_ilo -v -a "Ilo IP from cluster.conf" -l "Ilo user from cluster.conf" -p 
"Ilo passwd from cluster.conf" -o status

The -v will display exactly what the fence agent sees and is very useful for 
debugging failing fences.  If the status fails send me the output.

3.  If the fence_ilo successful try:

# fence_node <node name from cluster.conf>

If all 3 are successful then fencing is setup properly and there may be a 
problem running it from Luci, if any of the 3 fail post the error back to the 
list and I'll look at it.

-Ben


----- "Girish Prajapati" <girishpati at yahoo.com> wrote:

> Hello,
> i can run following command successfully from another node but still
> getting same error message :
> 
> fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> 
> Sep 9 14:37:00 node2 openais[2904]: [CLM ] Members Joined:
> Sep 9 14:37:00 node2 openais[2904]: [SYNC ] This node is within the
> primary component and will provide service.
> Sep 9 14:37:00 node2 openais[2904]: [TOTEM] entering OPERATIONAL
> state.
> Sep 9 14:37:00 node2 openais[2904]: [CLM ] got nodejoin message
> 192.168.0.28
> Sep 9 14:37:00 node2 openais[2904]: [CPG ] got joinlist message from
> node 1
> Sep 9 14:37:00 node2 fenced[2923]: node1.drctmb.com not a cluster
> member after 0 sec post_fail_delay
> Sep 9 14:37:00 node2 fenced[2923]: fencing node "node1.drctmb.com"
> Sep 9 14:37:10 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> Sep 9 14:37:10 node2 fenced[2923]: fence "node1.drctmb.com" failed
> Sep 9 14:37:15 node2 fenced[2923]: fencing node "node1.drctmb.com"
> Sep 9 14:37:26 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device
> 
> node1 rebooted and get connect to the cluster but now my webby service
> not working see below log :
> 
> Broadcast message from root (Thu Sep 9 14:32:41 2010):
> The system is going down for system halt NOW!
> Sep 9 14:19:22 node1 last message repeated 17 times
> Sep 9 14:32:41 node1 shutdown[25506]: shutting down for system halt
> Sep 9 14:32:41 node1 pcscd: winscard.c:304:SCardConnect() Reader
> E-Gate 0 0 Not Found
> Sep 9 14:32:43 node1 modclusterd: shutdown succeeded
> Sep 9 14:32:43 node1 rgmanager: [25593]: <notice> Shutting down
> Cluster Service Manager...
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Stopping service
> service:webby
> Sep 9 14:32:44 node1 avahi-daemon[3378]: Withdrawing address record
> for 192.168.0.30 on eth0.
> Read from remote host node1: Connection reset by peer
> .
> .
> .
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/hda, packet devices
> [this device CD/DVD] not SMART capable
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, opened
> Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, IE (SMART) not
> enabled, skip device Try 'smartctl -s on /dev/sda' to turn on SMART
> features
> Sep 9 14:35:42 node1 smartd[3585]: Monitoring 0 ATA and 0 SCSI devices
> Sep 9 14:35:42 node1 smartd[3604]: smartd has fork()ed into background
> mode. New PID=3604.
> Sep 9 14:35:42 node1 avahi-daemon[3412]: Service "SFTP File Transfer
> on node1" (/services/sftp-ssh.service) successfully established.
> Sep 9 14:35:45 node1 pcscd: winscard.c:304:SCardConnect() Reader
> E-Gate 0 0 Not Found
> Sep 9 14:35:45 node1 last message repeated 3 times
> Sep 9 14:35:45 node1 kernel: mtrr: type mismatch for d8000000,2000000
> old: uncachable new: write-combining
> Sep 9 14:35:46 node1 clurgmgrd: [3491]: <err> Checking Existence Of
> File /var/run/cluster/apache/apache:httpd.pid [apache:httpd] > Failed
> - File Doesn't Exist
> 
> 
> 
> It seems that there problem in fencing device configuration.
> Please find here my cluster.conf :
> 
> 
> <?xml version="1.0"?>
> <cluster alias="girish" config_version="21" name="girish">
> <fence_daemon clean_start="0" post_fail_delay="0"
> post_join_delay="3"/>
> <clusternodes>
> <clusternode name=" node2.drctmb.com " nodeid="1" votes="1">
> <fence>
> <method name="1">
> <device name="NODE2"/>
> </method>
> </fence>
> </clusternode>
> <clusternode name="node1.drctmb.com" nodeid="2" votes="1">
> <fence>
> <method name="1">
> <device name="NODE1"/>
> </method>
> </fence>
> </clusternode>
> </clusternodes>
> <cman expected_votes="1" two_node="1"/>
> <fencedevices>
> <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
> login="root" name="NODE1" passwd="redhat123"/>
> <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
> login="root" name="NODE2" passwd="redhat123"/>
> </fencedevices>
> <rm>
> <failoverdomains>
> <failoverdomain name="prefer_node1" nofailback="0" ordered="1"
> restricted="1">
> <failoverdomainnode name="node2.drctmb.com" priority="2"/>
> <failoverdomainnode name="node1.drctmb.com" priority="1"/>
> </failoverdomain>
> </failoverdomains>
> <resources>
> <fs device="/dev/sda1" force_fsck="0" force_unmount="0" fsid="8669"
> fstype="ext3" mountpoint="/var/www/html" name="docroot"
> self_fence="0"/>
> <ip address="192.168.0.30" monitor_link="1"/>
> <apache config_file="conf/httpd.conf" name="httpd"
> server_root="/etc/httpd" shutdown_wait="5"/>
> </resources>
> <service autostart="1" domain="prefer_node1" exclusive="0"
> name="webby" recovery="relocate">
> <ip ref="192.168.0.30"/>
> <fs ref="docroot"/>
> <apache ref="httpd"/>
> </service>
> </rm>
> <fence_xvmd/>
> </cluster>
> ~
> 
> This is first time am working on Clustering so please help me.
> Appreciate your help.
> 
> Thank you.
> 
> 
> 
> From: Brem Belguebli <brem.belguebli at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Sent: Thu, September 9, 2010 11:30:28 AM
> Subject: Re: [Linux-cluster] need help - Fencing problem
> 
> try run this from another node of the cluster
> 
> fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> 
> 
> Additionnally, by connecting thru http to the Ilo, you should be able
> to
> see Ilo logs (in the general tab) and see if it is due to a lack of
> licensing
> 
> 
> On Wed, 2010-09-08 at 22:29 -0700, Girish Prajapati wrote:
> > Hello...
> >
> > I have already configure BIOS for iLO.. but am not sure why i don
> need
> > to shared ??
> > please anybody can help me out for this problem.
> > Do i need any extra setup for fencing device ?
> > thanks
> >
> >
> >
> >
> ______________________________________________________________________
> > From: ESGLinux < esggrupos at gmail.com >
> > To: linux clustering < linux-cluster at redhat.com >
> > Sent: Wed, September 8, 2010 2:57:25 PM
> > Subject: Re: [Linux-cluster] need help - Fencing problem
> >
> > Hello,
> >
> >
> > Have you configured the iLO devices entering in the BIOS?
> >
> >
> > I remenber I have to set up the user/pass in the iLO and marked the
> > iLo as not shared
> >
> >
> >
> >
> > HTH,
> >
> >
> > ESG
> >
> > 2010/9/8 Girish Prajapati < girishpati at yahoo.com >
> > Hello Everybody,
> > i am having problem of fencing a cluster node let me explain
> > indetail :
> > I have installed RHEL 5.4 on HP Prolaint DL280 G5 servers and
> > iLO 2as fencing device. Am managing cluster through Luci -
> > (Conga). itseems everything is working fine. I can reboot
> > cluster nodes through Luci and service get transfer to another
> > node. After rebooting node connect to cluster automatically
> > without any error.
> > Problem is i can not do Fence this node through Luci, when i
> > try to fence any node i get following error :
> >
> > Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo"
> > reports: Unable to connect/login to fencing device
> > Sep 8 14:51:16 node2 fence_node[9106]: Fence of
> > " node1.drctmb.com " was unsuccessful
> >
> > my iLO license is : iLO 2 Advanced Evaluation
> > Do i need to have license of iLO or there is problem in
> > configuration of cluster ?
> > how i can check cluster log in details.
> >
> > Appreciate your help.
> > Thank you in advance.
> >
> > Regards,
> > Girishkumar R Prajapati

From susvirkar.3616 at gmail.com  Mon Sep 13 16:37:59 2010
From: susvirkar.3616 at gmail.com (umesh susvirkar)
Date: Mon, 13 Sep 2010 22:07:59 +0530
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <962762.73149.qm@web120501.mail.ne1.yahoo.com>
References: <155361964.174311284047925612.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
	<505609.49142.qm@web120502.mail.ne1.yahoo.com>
	<962762.73149.qm@web120501.mail.ne1.yahoo.com>
Message-ID: <AANLkTin8-1n6jssryS_8PK5JX-XHMjMzAmn+XL_gECdM@mail.gmail.com>

Hi

From your cluster.conf file:


<?xml version="1.0"?>
               <clusternode name="node2.drctmb.com" nodeid="1" votes="1">
                 <clusternode name="node1.drctmb.com" nodeid="2" votes="1">
         <fencedevices>
                <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
login="root" name="NODE1" passwd="redhat123"/>
                <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
login="root" name="NODE2" passwd="redhat123"/>
        </fencedevices>

Your node names and fence device hostnames are the same. They should be different: the fencedevice hostname has to be the address of the iLO, not of the node itself.

As you mentioned, the following command is working:

fence_ilo -a "IP" -l "login" -p "Pass" -o status

Replace the hostname of each fencedevice with the IP you pass to the -a option, then check again.
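
To spot this kind of mix-up quickly, something like the following could be run against cluster.conf. It is only a sketch (Python, standard library): the function name fence_devices_pointing_at_nodes is invented for this example, and CONF is a cut-down copy of the posted config rather than the real file. It lists every fencedevice whose hostname is identical to one of the clusternode names.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the posted cluster.conf (nodes and fence devices only).
CONF = """<cluster alias="girish" config_version="21" name="girish">
  <clusternodes>
    <clusternode name="node2.drctmb.com" nodeid="1" votes="1"/>
    <clusternode name="node1.drctmb.com" nodeid="2" votes="1"/>
  </clusternodes>
  <fencedevices>
    <fencedevice agent="fence_ilo" hostname="node1.drctmb.com" login="root" name="NODE1" passwd="redhat123"/>
    <fencedevice agent="fence_ilo" hostname="node2.drctmb.com" login="root" name="NODE2" passwd="redhat123"/>
  </fencedevices>
</cluster>
"""


def fence_devices_pointing_at_nodes(conf_text):
    """Return names of fence devices whose hostname equals a cluster node name."""
    root = ET.fromstring(conf_text)
    # strip() tolerates stray spaces around names, as seen in some quoted copies.
    node_names = {n.get("name", "").strip() for n in root.iter("clusternode")}
    return [
        dev.get("name")
        for dev in root.iter("fencedevice")
        if dev.get("hostname", "").strip() in node_names
    ]


if __name__ == "__main__":
    # Any device listed here is addressed at a node, not at its iLO.
    print(fence_devices_pointing_at_nodes(CONF))
```

Once each hostname= holds the iLO address that worked with -a, the list should come back empty.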


On Mon, Sep 13, 2010 at 1:14 PM, Girish Prajapati <girishpati at yahoo.com> wrote:

> Hello,
>
> Any update sir ?
>
>
>
>  ------------------------------
> *From:* Girish Prajapati <girishpati at yahoo.com>
>
> *To:* linux clustering <linux-cluster at redhat.com>
> *Sent:* Fri, September 10, 2010 11:32:35 AM
>
> *Subject:* Re: [Linux-cluster] need help - Fencing problem
>
>  Hello Sir,
>
>  1st and 2nd option passed successfully.
> i also try to run command with ilo's name and it run successfully so there
> is no issue of DNS.
>
> i ) when i try to run fence_node command i get the following error:
>
> [root at node1 ~]# fence_node node2.drctmb.com
> agent "fence_ilo" reports: Unable to connect/login to fencing device
>
> ii) when i try to fence through Luci i get following error:
>
> Sep 10 11:13:10 tmb luci[24270]: Unable to retrieve batch 1700106142 status
> from node2.drctmb.com:11111: fence_node failed:
>
> Please let me know if there is any other why for troubleshoot
>
> Thank you.
>
> Regards,
> Girishkumar
>
>  ------------------------------
> *From:* Ben Turner <bturner at redhat.com>
> *To:* linux clustering <linux-cluster at redhat.com>
> *Sent:* Thu, September 9, 2010 9:28:45 PM
> *Subject:* Re: [Linux-cluster] need help - Fencing problem
>
> Judging from:
>
> "Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports: Unable
> to connect/login to fencing device"
>
> Chances are you are not using the correct username/password/IP or the ilo
> is not configured for telnet logins.  Try the following:
>
> 1.  Login to the ilo via telnet from the command line.  Be sure to use the
> username/password/IP you have in cluster.conf.
>
> 2.  If that is successful try:
>
> # fence_ilo -v -a "Ilo IP from cluster.conf" -l "Ilo user from
> cluster.conf" -p "Ilo passwd from cluster.conf" -o status
>
> The -v will display exactly what the fence agent sees and is very useful
> for debugging failing fences.  If the status fails send me the output.
>
> 3.  If the fence_ilo successful try:
>
> # fence_node <node name from cluster.conf>
>
> If all 3 are successful then fencing is setup properly and there may be a
> problem running it from Luci, if any of the 3 fail post the error back to
> the list and I'll look at it.
>
> -Ben
>
>
>
>
>
> ----- "Girish Prajapati" <girishpati at yahoo.com> wrote:
>
> > Hello,
> > i can run following command successfully from another node but still
> > getting same error message :
> >
> > fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> >
> > Sep 9 14:37:00 node2 openais[2904]: [CLM ] Members Joined:
> > Sep 9 14:37:00 node2 openais[2904]: [SYNC ] This node is within the
> > primary component and will provide service.
> > Sep 9 14:37:00 node2 openais[2904]: [TOTEM] entering OPERATIONAL
> > state.
> > Sep 9 14:37:00 node2 openais[2904]: [CLM ] got nodejoin message
> > 192.168.0.28
> > Sep 9 14:37:00 node2 openais[2904]: [CPG ] got joinlist message from
> > node 1
> > Sep 9 14:37:00 node2 fenced[2923]: node1.drctmb.com not a cluster
> > member after 0 sec post_fail_delay
> > Sep 9 14:37:00 node2 fenced[2923]: fencing node "node1.drctmb.com"
> > Sep 9 14:37:10 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> > to connect/login to fencing device
> > Sep 9 14:37:10 node2 fenced[2923]: fence "node1.drctmb.com" failed
> > Sep 9 14:37:15 node2 fenced[2923]: fencing node "node1.drctmb.com"
> > Sep 9 14:37:26 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> > to connect/login to fencing device
> >
> > node1 rebooted and get connect to the cluster but now my webby service
> > not working see below log :
> >
> > Broadcast message from root (Thu Sep 9 14:32:41 2010):
> > The system is going down for system halt NOW!
> > Sep 9 14:19:22 node1 last message repeated 17 times
> > Sep 9 14:32:41 node1 shutdown[25506]: shutting down for system halt
> > Sep 9 14:32:41 node1 pcscd: winscard.c:304:SCardConnect() Reader
> > E-Gate 0 0 Not Found
> > Sep 9 14:32:43 node1 modclusterd: shutdown succeeded
> > Sep 9 14:32:43 node1 rgmanager: [25593]: <notice> Shutting down
> > Cluster Service Manager...
> > Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> > Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> > Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Stopping service
> > service:webby
> > Sep 9 14:32:44 node1 avahi-daemon[3378]: Withdrawing address record
> > for 192.168.0.30 on eth0.
> > Read from remote host node1: Connection reset by peer
> > .
> > .
> > .
> > Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/hda, packet devices
> > [this device CD/DVD] not SMART capable
> > Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, opened
> > Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, IE (SMART) not
> > enabled, skip device Try 'smartctl -s on /dev/sda' to turn on SMART
> > features
> > Sep 9 14:35:42 node1 smartd[3585]: Monitoring 0 ATA and 0 SCSI devices
> > Sep 9 14:35:42 node1 smartd[3604]: smartd has fork()ed into background
> > mode. New PID=3604.
> > Sep 9 14:35:42 node1 avahi-daemon[3412]: Service "SFTP File Transfer
> > on node1" (/services/sftp-ssh.service) successfully established.
> > Sep 9 14:35:45 node1 pcscd: winscard.c:304:SCardConnect() Reader
> > E-Gate 0 0 Not Found
> > Sep 9 14:35:45 node1 last message repeated 3 times
> > Sep 9 14:35:45 node1 kernel: mtrr: type mismatch for d8000000,2000000
> > old: uncachable new: write-combining
> > Sep 9 14:35:46 node1 clurgmgrd: [3491]: <err> Checking Existence Of
> > File /var/run/cluster/apache/apache:httpd.pid [apache:httpd] > Failed
> > - File Doesn't Exist
> >
> >
> >
> > It seems that there problem in fencing device configuration.
> > Please find here my cluster.conf :
> >
> >
> > <?xml version="1.0"?>
> > <cluster alias="girish" config_version="21" name="girish">
> > <fence_daemon clean_start="0" post_fail_delay="0"
> > post_join_delay="3"/>
> > <clusternodes>
> > <clusternode name=" node2.drctmb.com " nodeid="1" votes="1">
> > <fence>
> > <method name="1">
> > <device name="NODE2"/>
> > </method>
> > </fence>
> > </clusternode>
> > <clusternode name="node1.drctmb.com" nodeid="2" votes="1">
> > <fence>
> > <method name="1">
> > <device name="NODE1"/>
> > </method>
> > </fence>
> > </clusternode>
> > </clusternodes>
> > <cman expected_votes="1" two_node="1"/>
> > <fencedevices>
> > <fencedevice agent="fence_ilo" hostname="node1.drctmb.com"
> > login="root" name="NODE1" passwd="redhat123"/>
> > <fencedevice agent="fence_ilo" hostname="node2.drctmb.com"
> > login="root" name="NODE2" passwd="redhat123"/>
> > </fencedevices>
> > <rm>
> > <failoverdomains>
> > <failoverdomain name="prefer_node1" nofailback="0" ordered="1"
> > restricted="1">
> > <failoverdomainnode name="node2.drctmb.com" priority="2"/>
> > <failoverdomainnode name="node1.drctmb.com" priority="1"/>
> > </failoverdomain>
> > </failoverdomains>
> > <resources>
> > <fs device="/dev/sda1" force_fsck="0" force_unmount="0" fsid="8669"
> > fstype="ext3" mountpoint="/var/www/html" name="docroot"
> > self_fence="0"/>
> > <ip address="192.168.0.30" monitor_link="1"/>
> > <apache config_file="conf/httpd.conf" name="httpd"
> > server_root="/etc/httpd" shutdown_wait="5"/>
> > </resources>
> > <service autostart="1" domain="prefer_node1" exclusive="0"
> > name="webby" recovery="relocate">
> > <ip ref="192.168.0.30"/>
> > <fs ref="docroot"/>
> > <apache ref="httpd"/>
> > </service>
> > </rm>
> > <fence_xvmd/>
> > </cluster>
> > ~
> >
> > This is first time am working on Clustering so please help me.
> > Appreciate your help.
> >
> > Thank you.
> >
> >
> >
> > From: Brem Belguebli <brem.belguebli at gmail.com>
> > To: linux clustering <linux-cluster at redhat.com>
> > Sent: Thu, September 9, 2010 11:30:28 AM
> > Subject: Re: [Linux-cluster] need help - Fencing problem
> >
> > try run this from another node of the cluster
> >
> > fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> >
> >
> > Additionnally, by connecting thru http to the Ilo, you should be able
> > to
> > see Ilo logs (in the general tab) and see if it is due to a lack of
> > licensing
> >
> >
> > On Wed, 2010-09-08 at 22:29 -0700, Girish Prajapati wrote:
> > > Hello...
> > >
> > > I have already configure BIOS for iLO.. but am not sure why i don
> > need
> > > to shared ??
> > > please anybody can help me out for this problem.
> > > Do i need any extra setup for fencing device ?
> > > thanks
> > >
> > >
> > >
> > >
> > ______________________________________________________________________
> > > From: ESGLinux < esggrupos at gmail.com >
> > > To: linux clustering < linux-cluster at redhat.com >
> > > Sent: Wed, September 8, 2010 2:57:25 PM
> > > Subject: Re: [Linux-cluster] need help - Fencing problem
> > >
> > > Hello,
> > >
> > >
> > > Have you configured the iLO devices entering in the BIOS?
> > >
> > >
> > > I remenber I have to set up the user/pass in the iLO and marked the
> > > iLo as not shared
> > >
> > >
> > >
> > >
> > > HTH,
> > >
> > >
> > > ESG
> > >
> > > 2010/9/8 Girish Prajapati < girishpati at yahoo.com >
> > > Hello Everybody,
> > > i am having problem of fencing a cluster node let me explain
> > > indetail :
> > > I have installed RHEL 5.4 on HP Prolaint DL280 G5 servers and
> > > iLO 2as fencing device. Am managing cluster through Luci -
> > > (Conga). itseems everything is working fine. I can reboot
> > > cluster nodes through Luci and service get transfer to another
> > > node. After rebooting node connect to cluster automatically
> > > without any error.
> > > Problem is i can not do Fence this node through Luci, when i
> > > try to fence any node i get following error :
> > >
> > > Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo"
> > > reports: Unable to connect/login to fencing device
> > > Sep 8 14:51:16 node2 fence_node[9106]: Fence of
> > > " node1.drctmb.com " was unsuccessful
> > >
> > > my iLO license is : iLO 2 Advanced Evaluation
> > > Do i need to have license of iLO or there is problem in
> > > configuration of cluster ?
> > > how i can check cluster log in details.
> > >
> > > Appreciate your help.
> > > Thank you in advance.
> > >
> > > Regards,
> > > Girishkumar R Prajapati
> > >
> > >
> > >
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
> > >
> > >
> > >
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20100913/bf42ef67/attachment.htm>

From bturner at redhat.com  Tue Sep 14 20:58:44 2010
From: bturner at redhat.com (Ben Turner)
Date: Tue, 14 Sep 2010 16:58:44 -0400 (EDT)
Subject: [Linux-cluster] need help - Fencing problem
In-Reply-To: <AANLkTin8-1n6jssryS_8PK5JX-XHMjMzAmn+XL_gECdM@mail.gmail.com>
Message-ID: <546141917.636911284497924498.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>

Yep, that's what I see too.  The host names of your nodes should be different from the host names of your fence devices.  You probably used the correct hostname when you ran fence_ilo manually, which is why it succeeded.  Try changing the hostname= in cluster.conf to what you used in the tests where the reboot was successful.
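
To be sure the manual test uses exactly what is in cluster.conf, the status command can be generated from the file instead of typed by hand. A rough sketch in Python (standard library, needs 3.8+ for shlex.join): status_commands is a made-up helper name, CONF is a one-device excerpt of the posted config, and the script only prints the command line, it does not run fence_ilo.

```python
import shlex
import xml.etree.ElementTree as ET

# One-device excerpt of the posted cluster.conf.
CONF = """<cluster name="girish">
  <fencedevices>
    <fencedevice agent="fence_ilo" hostname="node1.drctmb.com" login="root" name="NODE1" passwd="redhat123"/>
  </fencedevices>
</cluster>
"""


def status_commands(conf_text):
    """Map each fence_ilo device name to the matching 'status' command line."""
    root = ET.fromstring(conf_text)
    commands = {}
    for dev in root.iter("fencedevice"):
        if dev.get("agent") != "fence_ilo":
            continue  # other agents take different options
        argv = [
            "fence_ilo", "-v",
            "-a", dev.get("hostname", "").strip(),
            "-l", dev.get("login", ""),
            "-p", dev.get("passwd", ""),
            "-o", "status",
        ]
        # shlex.join quotes any argument that needs it for the shell.
        commands[dev.get("name")] = shlex.join(argv)
    return commands


if __name__ == "__main__":
    for name, command in status_commands(CONF).items():
        print(name, "->", command)
```

If the printed command fails where the hand-typed one worked, the difference between the two -a values is the thing to fix in cluster.conf.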

-Ben


----- "umesh susvirkar" <susvirkar.3616 at gmail.com> wrote:

> Hi
> 
> 
> from your cluster.conf file
> 
> 
> 
> 
> <?xml version="1.0"?>
> <clusternode name=" node2.drctmb.com " nodeid="1" votes="1">
> <clusternode name=" node1.drctmb.com " nodeid="2" votes="1">
> <fencedevices>
> <fencedevice agent="fence_ilo" hostname=" node1.drctmb.com "
> login="root" name="NODE1" passwd="redhat123"/>
> <fencedevice agent="fence_ilo" hostname=" node2.drctmb.com "
> login="root" name="NODE2" passwd="redhat123"/>
> </fencedevices>
> 
> 
> Your node name & fence device hostname is same.that should be
> different.
> 
> As you mentioned following command in working
> 
> 
> fence_ilo -a "IP" -l "login" -p "Pass" -o status
> 
> 
> replace hostname of fencedevice with ip you specify with -a option &
> check.
> 
> On Mon, Sep 13, 2010 at 1:14 PM, Girish Prajapati <
> girishpati at yahoo.com > wrote:
> 
> 
> 
> 
> 
> Hello,
> 
> Any update sir ?
> 
> 
> 
> 
> 
> From: Girish Prajapati < girishpati at yahoo.com >
> 
> To: linux clustering < linux-cluster at redhat.com >
> Sent: Fri, September 10, 2010 11:32:35 AM
> 
> 
> 
> Subject: Re: [Linux-cluster] need help - Fencing problem
> 
> 
> 
> 
> 
> 
> Hello Sir,
> 
> 1st and 2nd option passed successfully.
> i also try to run command with ilo's name and it run successfully so
> there is no issue of DNS.
> 
> i ) when i try to run fence_node command i get the following error:
> 
> [root at node1 ~]# fence_node node2.drctmb.com
> agent "fence_ilo" reports: Unable to connect/login to fencing device
> 
> ii) when i try to fence through Luci i get following error:
> 
> Sep 10 11:13:10 tmb luci[24270]: Unable to retrieve batch 1700106142
> status from node2.drctmb.com:11111 : fence_node failed:
> 
> Please let me know if there is any other why for troubleshoot
> 
> Thank you.
> 
> Regards,
> Girishkumar
> 
> 
> 
> 
> From: Ben Turner < bturner at redhat.com >
> To: linux clustering < linux-cluster at redhat.com >
> Sent: Thu, September 9, 2010 9:28:45 PM
> Subject: Re: [Linux-cluster] need help - Fencing problem
> 
> Judging from:
> 
> "Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo" reports:
> Unable to connect/login to fencing device"
> 
> Chances are you are not using the correct username/password/IP or the
> ilo is not configured for telnet logins. Try the following:
> 
> 1. Login to the ilo via telnet from the command line. Be sure to use
> the username/password/IP you have in cluster.conf.
> 
> 2. If that is successful try:
> 
> # fence_ilo -v -a "Ilo IP from cluster.conf" -l "Ilo user from
> cluster.conf" -p "Ilo passwd from cluster.conf" -o status
> 
> The -v will display exactly what the fence agent sees and is very
> useful for debugging failing fences. If the status fails send me the
> output.
> 
> 3. If the fence_ilo successful try:
> 
> # fence_node <node name from cluster.conf>
> 
> If all 3 are successful then fencing is setup properly and there may
> be a problem running it from Luci, if any of the 3 fail post the error
> back to the list and I'll look at it.
> 
> -Ben
> 
> 
> 
> 
> 
> ----- "Girish Prajapati" < girishpati at yahoo.com > wrote:
> 
> > Hello,
> > i can run following command successfully from another node but still
> > getting same error message :
> >
> > fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> >
> > Sep 9 14:37:00 node2 openais[2904]: [CLM ] Members Joined:
> > Sep 9 14:37:00 node2 openais[2904]: [SYNC ] This node is within the
> > primary component and will provide service.
> > Sep 9 14:37:00 node2 openais[2904]: [TOTEM] entering OPERATIONAL
> > state.
> > Sep 9 14:37:00 node2 openais[2904]: [CLM ] got nodejoin message
> > 192.168.0.28
> > Sep 9 14:37:00 node2 openais[2904]: [CPG ] got joinlist message from
> > node 1
> > Sep 9 14:37:00 node2 fenced[2923]: node1.drctmb.com not a cluster
> > member after 0 sec post_fail_delay
> > Sep 9 14:37:00 node2 fenced[2923]: fencing node " node1.drctmb.com "
> > Sep 9 14:37:10 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> > to connect/login to fencing device
> > Sep 9 14:37:10 node2 fenced[2923]: fence " node1.drctmb.com " failed
> > Sep 9 14:37:15 node2 fenced[2923]: fencing node " node1.drctmb.com "
> > Sep 9 14:37:26 node2 fenced[2923]: agent "fence_ilo" reports: Unable
> > to connect/login to fencing device
> >
> > node1 rebooted and reconnected to the cluster, but now my webby
> > service is not working. See the log below:
> >
> > Broadcast message from root (Thu Sep 9 14:32:41 2010):
> > The system is going down for system halt NOW!
> > Sep 9 14:19:22 node1 last message repeated 17 times
> > Sep 9 14:32:41 node1 shutdown[25506]: shutting down for system halt
> > Sep 9 14:32:41 node1 pcscd: winscard.c:304:SCardConnect() Reader
> > E-Gate 0 0 Not Found
> > Sep 9 14:32:43 node1 modclusterd: shutdown succeeded
> > Sep 9 14:32:43 node1 rgmanager: [25593]: <notice> Shutting down
> > Cluster Service Manager...
> > Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> > Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Shutting down
> > Sep 9 14:32:43 node1 clurgmgrd[3457]: <notice> Stopping service
> > service:webby
> > Sep 9 14:32:44 node1 avahi-daemon[3378]: Withdrawing address record
> > for 192.168.0.30 on eth0.
> > Read from remote host node1: Connection reset by peer
> > .
> > .
> > .
> > Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/hda, packet devices
> > [this device CD/DVD] not SMART capable
> > Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, opened
> > Sep 9 14:35:42 node1 smartd[3585]: Device: /dev/sda, IE (SMART) not
> > enabled, skip device Try 'smartctl -s on /dev/sda' to turn on SMART
> > features
> > Sep 9 14:35:42 node1 smartd[3585]: Monitoring 0 ATA and 0 SCSI
> > devices
> > Sep 9 14:35:42 node1 smartd[3604]: smartd has fork()ed into
> > background mode. New PID=3604.
> > Sep 9 14:35:42 node1 avahi-daemon[3412]: Service "SFTP File Transfer
> > on node1" (/services/sftp-ssh.service) successfully established.
> > Sep 9 14:35:45 node1 pcscd: winscard.c:304:SCardConnect() Reader
> > E-Gate 0 0 Not Found
> > Sep 9 14:35:45 node1 last message repeated 3 times
> > Sep 9 14:35:45 node1 kernel: mtrr: type mismatch for
> > d8000000,2000000 old: uncachable new: write-combining
> > Sep 9 14:35:46 node1 clurgmgrd: [3491]: <err> Checking Existence Of
> > File /var/run/cluster/apache/apache:httpd.pid [apache:httpd] >
> > Failed - File Doesn't Exist
> >
> >
> >
> > It seems that there is a problem in the fencing device configuration.
> > Please find here my cluster.conf :
> >
> >
> > <?xml version="1.0"?>
> > <cluster alias="girish" config_version="21" name="girish">
> > <fence_daemon clean_start="0" post_fail_delay="0"
> > post_join_delay="3"/>
> > <clusternodes>
> > <clusternode name=" node2.drctmb.com " nodeid="1" votes="1">
> > <fence>
> > <method name="1">
> > <device name="NODE2"/>
> > </method>
> > </fence>
> > </clusternode>
> > <clusternode name=" node1.drctmb.com " nodeid="2" votes="1">
> > <fence>
> > <method name="1">
> > <device name="NODE1"/>
> > </method>
> > </fence>
> > </clusternode>
> > </clusternodes>
> > <cman expected_votes="1" two_node="1"/>
> > <fencedevices>
> > <fencedevice agent="fence_ilo" hostname=" node1.drctmb.com "
> > login="root" name="NODE1" passwd="redhat123"/>
> > <fencedevice agent="fence_ilo" hostname=" node2.drctmb.com "
> > login="root" name="NODE2" passwd="redhat123"/>
> > </fencedevices>
> > <rm>
> > <failoverdomains>
> > <failoverdomain name="prefer_node1" nofailback="0" ordered="1"
> > restricted="1">
> > <failoverdomainnode name=" node2.drctmb.com " priority="2"/>
> > <failoverdomainnode name=" node1.drctmb.com " priority="1"/>
> > </failoverdomain>
> > </failoverdomains>
> > <resources>
> > <fs device="/dev/sda1" force_fsck="0" force_unmount="0" fsid="8669"
> > fstype="ext3" mountpoint="/var/www/html" name="docroot"
> > self_fence="0"/>
> > <ip address="192.168.0.30" monitor_link="1"/>
> > <apache config_file="conf/httpd.conf" name="httpd"
> > server_root="/etc/httpd" shutdown_wait="5"/>
> > </resources>
> > <service autostart="1" domain="prefer_node1" exclusive="0"
> > name="webby" recovery="relocate">
> > <ip ref="192.168.0.30"/>
> > <fs ref="docroot"/>
> > <apache ref="httpd"/>
> > </service>
> > </rm>
> > <fence_xvmd/>
> > </cluster>
> > ~
> >
> > This is the first time I am working on clustering, so please help me.
> > Appreciate your help.
> >
> > Thank you.
> >
> >
> >
> > From: Brem Belguebli < brem.belguebli at gmail.com >
> > To: linux clustering < linux-cluster at redhat.com >
> > Sent: Thu, September 9, 2010 11:30:28 AM
> > Subject: Re: [Linux-cluster] need help - Fencing problem
> >
> > Try running this from another node of the cluster:
> >
> > fence_ilo -a "Ilo IP" -l "Ilo user" -p "Ilo passwd" -o reboot
> >
> >
> > Additionally, by connecting through HTTP to the iLO, you should be
> > able to see the iLO logs (in the general tab) and check whether the
> > failure is due to a lack of licensing.
> >
> >
> > On Wed, 2010-09-08 at 22:29 -0700, Girish Prajapati wrote:
> > > Hello...
> > >
> > > I have already configured the BIOS for iLO, but I am not sure why it
> > > must not be shared.
> > > Can anybody please help me with this problem?
> > > Do I need any extra setup for the fencing device?
> > > thanks
> > >
> > >
> > >
> > >
> > > ______________________________________________________________________
> > > From: ESGLinux < esggrupos at gmail.com >
> > > To: linux clustering < linux-cluster at redhat.com >
> > > Sent: Wed, September 8, 2010 2:57:25 PM
> > > Subject: Re: [Linux-cluster] need help - Fencing problem
> > >
> > > Hello,
> > >
> > >
> > > Have you configured the iLO devices entering in the BIOS?
> > >
> > >
> > > I remember I had to set up the user/pass in the iLO and mark the
> > > iLO as not shared.
> > >
> > >
> > >
> > >
> > > HTH,
> > >
> > >
> > > ESG
> > >
> > > 2010/9/8 Girish Prajapati < girishpati at yahoo.com >
> > > Hello Everybody,
> > > I am having a problem fencing a cluster node. Let me explain in
> > > detail:
> > > I have installed RHEL 5.4 on HP ProLiant DL280 G5 servers with
> > > iLO 2 as the fencing device. I am managing the cluster through Luci
> > > (Conga). It seems everything is working fine. I can reboot
> > > cluster nodes through Luci and the service gets transferred to
> > > another node. After rebooting, the node connects to the cluster
> > > automatically without any error.
> > > The problem is that I cannot fence a node through Luci. When I
> > > try to fence any node I get the following error:
> > >
> > > Sep 8 14:51:16 node2 fence_node[9106]: agent "fence_ilo"
> > > reports: Unable to connect/login to fencing device
> > > Sep 8 14:51:16 node2 fence_node[9106]: Fence of
> > > " node1.drctmb.com " was unsuccessful
> > >
> > > my iLO license is : iLO 2 Advanced Evaluation
> > > Do i need to have license of iLO or there is problem in
> > > configuration of cluster ?
> > > how i can check cluster log in details.
> > >
> > > Appreciate your help.
> > > Thank you in advance.
> > >
> > > Regards,
> > > Girishkumar R Prajapati
> > >
> > >
> > >
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster


From jakov.sosic at srce.hr  Wed Sep 15 13:42:56 2010
From: jakov.sosic at srce.hr (Jakov Sosic)
Date: Wed, 15 Sep 2010 15:42:56 +0200
Subject: [Linux-cluster] HA agents (cluster scripts)
Message-ID: <4C90CD60.2090301@srce.hr>

Hi.


I want to write a few scripts for my company's services, but I don't know
how to debug them.

Is there any documentation regarding the issue, or something similar?


Thank you.


-- 
|    Jakov Sosic    |    ICQ: 28410271    |   PGP: 0x965CAE2D   |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |


From cmaiolino at redhat.com  Wed Sep 15 14:08:46 2010
From: cmaiolino at redhat.com (Carlos Maiolino)
Date: Wed, 15 Sep 2010 11:08:46 -0300
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <4C90CD60.2090301@srce.hr>
References: <4C90CD60.2090301@srce.hr>
Message-ID: <20100915140846.GB5353@andromeda.usersys.redhat.com>

On Wed, Sep 15, 2010 at 03:42:56PM +0200, Jakov Sosic wrote:
> Hi.
> 
> 
> I want to write few scripts for my company's services, and I don't know
> how to debug the scripts?
> 

Hi Jakov,

The cluster uses the System V model to manage scripts, so your script needs to accept start, stop and status arguments, each of which returns 0 on success or 1 if an error occurred. You can then use "script" as the resource agent and point cluster.conf to your custom script.

As for debugging, what exactly do you want to debug? Whether it is working properly? If so, it should work when you run it manually: service <your_script> <action>  (where <action> is one of the three arguments above).
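
As a rough illustration of the start/stop/status contract described above, here is a minimal sketch of such a script. The service (a background "sleep"), the PID file location and the function names are placeholders invented for this example, not anything shipped with the cluster software:

```shell
#!/bin/bash
# Hypothetical SysV-style script for the rgmanager "script" resource.
# The "service" is just a background sleep; replace it with a real daemon.
PIDFILE="${PIDFILE:-/var/run/myservice.pid}"   # placeholder path

svc_start() {
    # Launch the placeholder daemon and remember its PID.
    sleep 300 >/dev/null 2>&1 &
    echo $! > "$PIDFILE"
}

svc_stop() {
    [ -f "$PIDFILE" ] || return 0              # already stopped
    kill "$(cat "$PIDFILE")" 2>/dev/null
    rm -f "$PIDFILE"
}

svc_status() {
    # 0 if the recorded process is alive, 1 otherwise.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

dispatch() {
    case "$1" in
        start)  svc_start ;;
        stop)   svc_stop ;;
        status) svc_status ;;
        *) echo "Usage: $0 {start|stop|status}" >&2; return 1 ;;
    esac
}

# Dispatch only when an action was given, so the file can also be
# sourced for testing.
if [ $# -gt 0 ]; then
    dispatch "$@"
    exit $?
fi
```

Running it by hand with each of the three arguments and checking "echo $?" after each call is a quick way to confirm that the exit codes are what the cluster expects.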

There is a document explaining how to set up these scripts, I'm trying to find it.

Hope it helps


-- 
---

Best Regards

Carlos Eduardo Maiolino


From jakov.sosic at srce.hr  Wed Sep 15 14:45:46 2010
From: jakov.sosic at srce.hr (Jakov Sosic)
Date: Wed, 15 Sep 2010 16:45:46 +0200
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <20100915140846.GB5353@andromeda.usersys.redhat.com>
References: <4C90CD60.2090301@srce.hr>
	<20100915140846.GB5353@andromeda.usersys.redhat.com>
Message-ID: <4C90DC1A.1020609@srce.hr>

On 09/15/2010 04:08 PM, Carlos Maiolino wrote:
> On Wed, Sep 15, 2010 at 03:42:56PM +0200, Jakov Sosic wrote:
>> Hi.
>>
>>
>> I want to write few scripts for my company's services, and I don't know
>> how to debug the scripts?
>>
> 
> Hi Jakov,
> 
> Cluster uses system V model to manage scripts, so, it needs to have a  start, stop and status arguments, where each argument can return 0 if ok or 1 if any error occurred. so, after that, you can use "script" as a resource agent and point cluster.conf to your custom script.
> 
> About debug it, what do you exactly want to debug ? if it is working properly ? So, it should work if you manually run: service <your_script> <status>  (status is one of the three above status).
> 
> There is a document explaining how to set up these scripts, I'm trying to find it.

I meant writing HA resource agent - like the ones that are in
/usr/share/cluster.... Not the classic init scripts...


-- 
|    Jakov Sosic    |    ICQ: 28410271    |   PGP: 0x965CAE2D   |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |


From jakov.sosic at srce.hr  Wed Sep 15 14:52:09 2010
From: jakov.sosic at srce.hr (Jakov Sosic)
Date: Wed, 15 Sep 2010 16:52:09 +0200
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <20100915140846.GB5353@andromeda.usersys.redhat.com>
References: <4C90CD60.2090301@srce.hr>
	<20100915140846.GB5353@andromeda.usersys.redhat.com>
Message-ID: <4C90DD99.8000809@srce.hr>

On 09/15/2010 04:08 PM, Carlos Maiolino wrote:

> Cluster uses system V model to manage scripts, so, it needs to have
> a start, stop and status arguments, where each argument can return 0
> if ok or 1 if any error occurred. so, after that, you can use "script"
> as a resource agent and point cluster.conf to your custom script.
>
> About debug it, what do you exactly want to debug ? if it is working
> properly ? So, it should work if you manually run:
> service <your_script> <status>  (status is one of the three above
> status).
> 
> There is a document explaining how to set up these scripts,
> I'm trying to find it.

I meant writing HA resource agent - like the ones that are in
/usr/share/cluster directory. I know about script resource and classic
init scripts, but would prefer to write a resource provider XML/script.

Thing is - I don't have a clue how to test them, because of all the
"OCF" variables they parse...


-- 
|    Jakov Sosic    |    ICQ: 28410271    |   PGP: 0x965CAE2D   |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |


From cos at aaaaa.org  Wed Sep 15 15:08:19 2010
From: cos at aaaaa.org (Ofer Inbar)
Date: Wed, 15 Sep 2010 11:08:19 -0400
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <4C90CD60.2090301@srce.hr>
References: <4C90CD60.2090301@srce.hr>
Message-ID: <20100915150819.GM18254@mip.aaaaa.org>

> I want to write few scripts for my company's services, and I don't know
> how to debug the scripts?
> 
> Is there any documentation regarding the issue, or something similar?

I ran into the same problem this summer.  Unfortunately, the answer is
that mostly, full documentation does not exist.  However, there are a
few scraps you may find very useful.

First, here are some notes on the Open Cluster Framework's resource
agent API:

http://www.opencf.org/cgi-bin/viewcvs.cgi/specs/ra/resource-agent-api.txt?rev=HEAD

If you're doing Red Hat Cluster Suite, there's more to it than the
OCF-common parts.  For example, RHCS metadata is, I believe, specific
to RHCS.  I pointed out some holes in the documentation to Lon on
IRC a few weeks ago, and he began writing this page:

http://sources.redhat.com/cluster/wiki/RGManagerResourceAgents

It's still very incomplete, but already very useful (thanks Lon!)

Also, some things you can do for debugging:

1. Sprinkle some ocf_log calls in strategic places in your script:

  ocf_log info "some debugging statement, including a $OCF_variable"

 -> Make sure /usr/sbin is in your resource agent's path!

2. Configure logging so that your log statements make it to syslog.

See http://sources.redhat.com/cluster/wiki/RGManager

3. Use rg_test, which I didn't know about at first.  For example:

  sudo rg_test rules
  sudo rg_test test /etc/cluster/cluster.conf
  sudo rg_test test /etc/cluster/cluster.conf status service myservice
  sudo rg_test test /etc/cluster/cluster.conf start service myservice

It turns out this is documented in the ResourceTrees page on the
cluster wiki, but I didn't read that far on that page because I
wasn't having any trouble understanding or configuring my resource
trees.  It's true that rg_test is useful for debugging resource
trees, however it's also useful for troubleshooting resource agents.
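
A minimal sketch of tip 1, for illustration only. It assumes the helper functions live in /usr/share/cluster/ocf-shellfuncs (check your installation); the FUNCS override, the stderr fallback and the trace function are invented for this example so the snippet also runs outside a cluster node:

```shell
#!/bin/bash
# Sketch: make sure /usr/sbin is on PATH, then load ocf_log if available.
PATH=/bin:/sbin:/usr/bin:/usr/sbin:$PATH
export PATH

# Assumed location of the helper functions; override FUNCS if it differs.
FUNCS="${FUNCS:-/usr/share/cluster/ocf-shellfuncs}"
if [ -r "$FUNCS" ]; then
    . "$FUNCS"
fi

# Illustrative fallback: log to stderr when ocf_log was not loaded.
if ! type ocf_log >/dev/null 2>&1; then
    ocf_log() {
        local level="$1"; shift
        echo "[$level] $*" >&2
    }
fi

# Hypothetical helper: record which action ran and with which name parameter.
trace() {
    ocf_log debug "name=${OCF_RESKEY_name:-unset} action=${1:-none}"
}
```

Calling something like "trace $1" near the top of each action handler gives a simple record of when the agent is called and with what.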

  -- Cos


From crosa at redhat.com  Wed Sep 15 16:20:14 2010
From: crosa at redhat.com (Cleber Rosa)
Date: Wed, 15 Sep 2010 13:20:14 -0300
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <20100915140846.GB5353@andromeda.usersys.redhat.com>
References: <4C90CD60.2090301@srce.hr>
	<20100915140846.GB5353@andromeda.usersys.redhat.com>
Message-ID: <4C90F23E.4000007@redhat.com>

  If they're bash scripts, you might want to try:

#bash -x <script> <arguments>.

CR.

On 09/15/2010 11:08 AM, Carlos Maiolino wrote:
> On Wed, Sep 15, 2010 at 03:42:56PM +0200, Jakov Sosic wrote:
>> Hi.
>>
>>
>> I want to write few scripts for my company's services, and I don't know
>> how to debug the scripts?
>>
> Hi Jakov,
>
> Cluster uses system V model to manage scripts, so, it needs to have a  start, stop and status arguments, where each argument can return 0 if ok or 1 if any error occurred. so, after that, you can use "script" as a resource agent and point cluster.conf to your custom script.
>
> About debug it, what do you exactly want to debug ? if it is working properly ? So, it should work if you manually run: service<your_script>  <status>   (status is one of the three above status).
>
> There is a document explaining how to set up these scripts, I'm trying to find it.
>
> Hope it helps
>
>
>


From cos at aaaaa.org  Wed Sep 15 16:37:41 2010
From: cos at aaaaa.org (Ofer Inbar)
Date: Wed, 15 Sep 2010 12:37:41 -0400
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <4C90F23E.4000007@redhat.com>
References: <4C90CD60.2090301@srce.hr>
	<20100915140846.GB5353@andromeda.usersys.redhat.com>
	<4C90F23E.4000007@redhat.com>
Message-ID: <20100915163741.GQ18254@mip.aaaaa.org>

Cleber Rosa <crosa at redhat.com> wrote:
>  If they're bash scripts, you might want to try:
> 
> #bash -x <script> <arguments>.

This won't work.  The interface between the resource manager and the
resource agent script is more than just "run the script".  It includes:

 - What environment does the script get when run by the RM?
   - Which includes a lot of parameters passed from the RM to the script

 - What metadata must the agent script provide to the RM?
   - ... and what the RM will do differently based on this metadata

 - When does the RM call the agent, with which parameters?
   - What actions will it take in response to exit codes
   - What happens to stdout and stderr (answer below)

As a very basic hack, you can sort of simulate running a resource
agent by doing something like this in bash:

$ sudo OCF_RESKEY_name=... OCF_RESKEY_otherparam=... /usr/share/cluster/myagent status

rg_test is better, but even it won't help troubleshoot all of this.
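
The one-liner above can be wrapped in a small helper so the parameters are easier to vary between runs. This is only a sketch; run_agent is a made-up name, and it sets nothing beyond the OCF_RESKEY_ variables you pass in, so it is still a rough approximation of what rgmanager does:

```shell
#!/bin/bash
# run_agent AGENT ACTION [key=value ...]
# Runs AGENT with ACTION, exporting each key=value as OCF_RESKEY_key=value.
run_agent() {
    local agent="$1" action="$2"
    shift 2
    local kv args=()
    for kv in "$@"; do
        args+=("OCF_RESKEY_$kv")
    done
    # env applies the assignments only to this one invocation.
    env "${args[@]}" "$agent" "$action"
}
```

For example: run_agent /usr/share/cluster/myagent status name=foo otherparam=bar (as root, or via sudo on the enclosing script, if the agent needs it). The exit status of run_agent is the exit status of the agent.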


P.S. In RHCS at least, a resource agent's stdout and stderr are always
sent to the bitbucket, *except* that when calling meta-data, rgmanager
will read all of the agent's stdout as the metadata.  One problem here
is that if you put ocf_log statements in your script, they *will* write
to stdout in addition to syslog; if any of your ocf_log's are at the
top level of the script you have to test that $1 isn't "meta-data", so
you don't write extra debugging or status output along with the XML.
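
One way to write that test, as a sketch. log_unless_metadata is a hypothetical wrapper, and the stand-in ocf_log below exists only so the snippet runs on its own; in a real agent ocf_log comes from the cluster's shell helper functions:

```shell
#!/bin/bash
# Stand-in for ocf_log (illustration only): writes "level: message" to
# stdout, mimicking the stdout side effect described above.
if ! type ocf_log >/dev/null 2>&1; then
    ocf_log() {
        local level="$1"; shift
        echo "$level: $*"
    }
fi

# log_unless_metadata ACTION MESSAGE...
# Logs MESSAGE unless the agent was called with "meta-data", so nothing
# but the XML reaches stdout in that case.
log_unless_metadata() {
    local action="$1"; shift
    if [ "$action" = "meta-data" ]; then
        return 0
    fi
    ocf_log info "$*"
}
```

At the top level of an agent this would be called as: log_unless_metadata "$1" "some status message".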

And here's another tricky and undocumented portion of the interface:
If your XML metadata doesn't validate (for example, if you write some
extra stuff to stdout when called with meta-data, such as ocf_log),
rgmanager will ignore your resource agent as invalid, and will ignore
its resources in your cluster.conf - which means that any service you
define that includes your custom resource will "successfully" start
without your custom resource, and rgmanager will treat that as okay!!
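
Since the failure is silent, it can help to check the meta-data output by hand before rgmanager sees it. The following is a rough sketch, not a real validator: check_metadata_output is a made-up helper that only checks that stdout starts with "<" and ends with a closing resource-agent tag, and additionally runs xmllint for well-formedness if it happens to be installed. It does not validate against whatever rules rgmanager itself applies:

```shell
#!/bin/bash
# check_metadata_output AGENT
# Returns 0 if "AGENT meta-data" prints something that looks like clean
# XML on stdout, 1 otherwise (for example, stray log lines before the XML).
check_metadata_output() {
    local agent="$1" out first last
    out="$("$agent" meta-data 2>/dev/null)" || return 1
    first="${out%%$'\n'*}"                      # first line of stdout
    last="$(printf '%s\n' "$out" | awk 'NF { line = $0 } END { print line }')"
    case "$first" in
        '<'*) ;;
        *) return 1 ;;
    esac
    case "$last" in
        *'</resource-agent>') ;;
        *) return 1 ;;
    esac
    # Optional stricter check when xmllint is available.
    if command -v xmllint >/dev/null 2>&1; then
        printf '%s\n' "$out" | xmllint --noout - 2>/dev/null || return 1
    fi
    return 0
}
```

For example: check_metadata_output /usr/share/cluster/myagent || echo "meta-data output looks broken".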

I think that behavior is stunningly awful and broken; a resource
group that includes a resource that failed to validate, should fail.

At least in RHEL/CentOS 5.3, it doesn't log anything to indicate this
condition.  I've heard that in 5.5, you do get a log message when the
metadata doesn't validate.
  -- Cos


From crosa at redhat.com  Wed Sep 15 17:02:44 2010
From: crosa at redhat.com (Cleber Rosa)
Date: Wed, 15 Sep 2010 14:02:44 -0300
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <20100915163741.GQ18254@mip.aaaaa.org>
References: <4C90CD60.2090301@srce.hr>	<20100915140846.GB5353@andromeda.usersys.redhat.com>	<4C90F23E.4000007@redhat.com>
	<20100915163741.GQ18254@mip.aaaaa.org>
Message-ID: <4C90FC34.1080104@redhat.com>

  On 09/15/2010 01:37 PM, Ofer Inbar wrote:
> Cleber Rosa<crosa at redhat.com>  wrote:
>>   If they're bash scripts, you might want to try:
>>
>> #bash -x<script>  <arguments>.
> This won't work.  The interface between the resource manager and the
> resource agent script is more than just "run the script".  It includes:
>
>   - What environment does the script get when run by the RM?
>     - Which includes a lot of parameters passed from the RM to the script
>
I've never run into issues where testing a sys-v like "script resource" 
script (redundancy intended!) in a shell resulted in behaviour 
different than when running under rgmanager.

Maybe you're trying to debug other "cluster scripts", that implement 
other resources (such as filesystems, floating IP addresses, etc).

Or maybe my mileage is indeed very very low regarding this...

>   - What metadata must the agent script provide to the RM?
>     - ... and what the RM will do differently based on this metadata
>
>   - When does the RM call the agent, with which paremeters?
>     - What actions will it take in response to exit codes
>     - What happens to stdout and stderr (answer below)
>


From cos at aaaaa.org  Wed Sep 15 17:14:25 2010
From: cos at aaaaa.org (Ofer Inbar)
Date: Wed, 15 Sep 2010 13:14:25 -0400
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <4C90FC34.1080104@redhat.com>
References: <4C90CD60.2090301@srce.hr>
	<20100915140846.GB5353@andromeda.usersys.redhat.com>
	<4C90F23E.4000007@redhat.com>
	<20100915163741.GQ18254@mip.aaaaa.org>
	<4C90FC34.1080104@redhat.com>
Message-ID: <20100915171425.GR18254@mip.aaaaa.org>

Cleber Rosa <crosa at redhat.com> wrote:
> >>#bash -x<script>  <arguments>.
> >
> >This won't work.  The interface between the resource manager and the
> >resource agent script is more than just "run the script".  It includes:
...
> I've never run into issues where a testing "script resource" sys-v like 
> script (redundancy intended!) in a shell resulted in  behaviour 
> different then one running under rgmanager.

That's the key difference: You're referring to SysV/LSB scripts used
by the "script" resource agent that is supplied with the cluster
software; the original poster on this thread, however, asked for
information about writing a custom resource agent script (IOW, not
using the "script" resource, which takes a standard SysV/LSB init
script and uses it as is).
  -- Cos


From crosa at redhat.com  Wed Sep 15 17:25:41 2010
From: crosa at redhat.com (Cleber Rosa)
Date: Wed, 15 Sep 2010 14:25:41 -0300
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <20100915171425.GR18254@mip.aaaaa.org>
References: <4C90CD60.2090301@srce.hr>	<20100915140846.GB5353@andromeda.usersys.redhat.com>	<4C90F23E.4000007@redhat.com>	<20100915163741.GQ18254@mip.aaaaa.org>	<4C90FC34.1080104@redhat.com>
	<20100915171425.GR18254@mip.aaaaa.org>
Message-ID: <4C910195.4070702@redhat.com>

  Should have kept my mouth shut... answering fast-read (and 
half-understood) questions tends to do more harm than good ;)
> That's the key difference: You're referring to SysV/LSB scripts used
> by the "script" resource agent that is supplied with the cluster
> software; the original poster on this thread, however, asked for
> information about writing a custom resource agent script (IOW, not
> using the "script" resource, which takes a standard SysV/LSB init
> script and uses it as is).
>    -- Cos


From jakov.sosic at srce.hr  Wed Sep 15 23:03:35 2010
From: jakov.sosic at srce.hr (Jakov Sosic)
Date: Thu, 16 Sep 2010 01:03:35 +0200
Subject: [Linux-cluster] HA agents (cluster scripts)
In-Reply-To: <20100915150819.GM18254@mip.aaaaa.org>
References: <4C90CD60.2090301@srce.hr> <20100915150819.GM18254@mip.aaaaa.org>
Message-ID: <4C9150C7.8030008@srce.hr>

On 09/15/2010 05:08 PM, Ofer Inbar wrote:
>> I want to write few scripts for my company's services, and I don't know
>> how to debug the scripts?
>>
>> Is there any documentation regarding the issue, or something similar?
> 
> I ran into the same problem this summer.  Unfortunately, the answer is
> that mostly, full documentation does not exist.  However, there are a
> few scraps you may find very useful.
> 
> First, here are some notes on the Open Cluster Framework's resource
> agent API:
> 
> http://www.opencf.org/cgi-bin/viewcvs.cgi/specs/ra/resource-agent-api.txt?rev=HEAD
> 
> If you're doing Red Hat Cluster Suite, there's more to it than the
> OCF-common parts.  For example, RHCS metada is, I believe, specific
> to RHCS.  I pointed out some holes in the documentation to Lon on
> IRC a few weeks ago, and he began writing this page:
> 
> http://sources.redhat.com/cluster/wiki/RGManagerResourceAgents
> 
> It's still very incomplete, but already very useful (thanks Lon!)
> 
> Also, some things you can do for debugging:
> 
> 1. Sprinkle some ocf_log calls in strategic places in your script:
> 
>   ocf_log info "some debugging statement, including a $OCF_variable"
> 
>  -> Make sure /usr/sbin is in your resource agent's path!
> 
> 2. Configure logging so that your log statements make it to syslog.
> 
> See http://sources.redhat.com/cluster/wiki/RGManager
> 
> 3. Use rg_test, which I didn't know about at first.  For example:
> 
>   sudo rg_test rules
>   sudo rg_test test /etc/cluster/cluster.conf
>   sudo rg_test test /etc/cluster/cluster.conf status service myservice
>   sudo rg_test test /etc/cluster/cluster.conf start service myservice
> 
> It turns out this is documented in the ResourceTrees page on the
> cluster wiki, but I didn't read that far on that page because I
> wasn't having any trouble understanding or configuring my resource
> trees.  It's true that rg_test is useful for debugging resource
> trees, however it's also useful for troubleshooting resource agents.

Thank you! This is exactly what I was looking for :)


-- 
|    Jakov Sosic    |    ICQ: 28410271    |   PGP: 0x965CAE2D   |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |


From jhowell at medianewsgroup.com  Thu Sep 16 20:43:56 2010
From: jhowell at medianewsgroup.com (Jeff Howell)
Date: Thu, 16 Sep 2010 14:43:56 -0600
Subject: [Linux-cluster] Continuing gfs2 problems: Am I doing something
 wrong????
In-Reply-To: <4C587029.6090909@cgl.ucsf.edu>
References: <4C587029.6090909@cgl.ucsf.edu>
Message-ID: <4C92818C.9010003@medianewsgroup.com>

  I'm having an identical problem.

I have 2 nodes running a Wordpress instance with a TCP load balancer in 
front of them distributing http requests between them.

In the last 2 days, I've had 10+ instances where the GFS2 volume hangs 
with:

Sep 16 14:05:10 wordpress3 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 16 14:05:10 wordpress3 kernel: delete_workqu D 00000272  2676  3687     19          3688  3686 (L-TLB)
Sep 16 14:05:10 wordpress3 kernel:        f7839e38 00000046 3f1c322e 00000272 00000000 f57ab400 f7839df8 0000000a
Sep 16 14:05:10 wordpress3 kernel:        c3217aa0 3f1dcca8 00000272 00019a7a 00000001 c3217bac c3019744 f57c5ac0
Sep 16 14:05:10 wordpress3 kernel:        f8afa21c 00000003 f26162f0 00000000 f2213df8 00000018 c3019c00 f7839e6c
Sep 16 14:05:10 wordpress3 kernel: Call Trace:
Sep 16 14:05:10 wordpress3 kernel:  [<f8afa21c>] gdlm_bast+0x0/0x78 [lock_dlm]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c3910e>] just_schedule+0x5/0x8 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c061d2f5>] __wait_on_bit+0x33/0x58
Sep 16 14:05:10 wordpress3 kernel:  [<f8c39109>] just_schedule+0x0/0x8 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c39109>] just_schedule+0x0/0x8 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c061d37c>] out_of_line_wait_on_bit+0x62/0x6a
Sep 16 14:05:10 wordpress3 kernel:  [<c0436098>] wake_bit_function+0x0/0x3c
Sep 16 14:05:10 wordpress3 kernel:  [<f8c39102>] gfs2_glock_wait+0x27/0x2e [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c4c667>] gfs2_check_blk_type+0xbc/0x18c [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c061d312>] __wait_on_bit+0x50/0x58
Sep 16 14:05:10 wordpress3 kernel:  [<f8c39109>] just_schedule+0x0/0x8 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c4c660>] gfs2_check_blk_type+0xb5/0x18c [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c4c3c8>] gfs2_rindex_hold+0x2b/0x148 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c48273>] gfs2_delete_inode+0x6f/0x1a1 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c4823b>] gfs2_delete_inode+0x37/0x1a1 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<f8c48204>] gfs2_delete_inode+0x0/0x1a1 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c048cb02>] generic_delete_inode+0xa5/0x10f
Sep 16 14:05:10 wordpress3 kernel:  [<c048c5a6>] iput+0x64/0x66
Sep 16 14:05:10 wordpress3 kernel:  [<f8c3a8bb>] delete_work_func+0x49/0x53 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c04332da>] run_workqueue+0x78/0xb5
Sep 16 14:05:10 wordpress3 kernel:  [<f8c3a872>] delete_work_func+0x0/0x53 [gfs2]
Sep 16 14:05:10 wordpress3 kernel:  [<c0433b8e>] worker_thread+0xd9/0x10b
Sep 16 14:05:10 wordpress3 kernel:  [<c041f81b>] default_wake_function+0x0/0xc
Sep 16 14:05:10 wordpress3 kernel:  [<c0433ab5>] worker_thread+0x0/0x10b
Sep 16 14:05:10 wordpress3 kernel:  [<c0435fa7>] kthread+0xc0/0xed
Sep 16 14:05:10 wordpress3 kernel:  [<c0435ee7>] kthread+0x0/0xed
Sep 16 14:05:10 wordpress3 kernel:  [<c0405c53>] kernel_thread_helper+0x7/0x10

And then a bunch more for the httpd processes. I can pretty much 
reproduce this consistently by untarring a large tarball on the volume. 
Seems like anything IO intensive is causing this behavior.

Running CentOS 5.5 with kernel 2.6.18-194.11.1.el5 #1 SMP Tue Aug 10 
19:09:06 EDT 2010 i686 i686 i386 GNU/Linux

I tried the hangalizer program and it always came back with:
/bin/ls: /gfs2/: No such file or directoryhb.medianewsgroup.com "/bin/ls 
/gfs2/"
/bin/ls: /gfs2/: No such file or directoryhb.medianewsgroup.com "/bin/ls 
/gfs2/"
No waiting glocks found on any node.

Any Ideas?

On 08/03/2010 01:38 PM, Scooter Morris wrote:
> HI all,
>     We continue to have gfs2 crashes and hangs on our production 
> cluster, so I'm beginning to think that we've done something really 
> wrong.  Here is our set-up:
>
>     * 4 node cluster, only 3 participate in gfs2 filesystems
>     * Running several services on multiple nodes using gfs2:
>           o IMAP (dovecot)
>           o Web (apache with lots of python)
>           o Samba (using ctdb)
>     * GFS2 partitions are multipathed on an HP EVA-based SAN (no LVM)
>       -- here is fstab from one node (the three nodes are all the same):
>
>         LABEL=/1                /                       ext3    defaults        1 1
>         LABEL=/boot1            /boot                   ext3    defaults        1 2
>         tmpfs                   /dev/shm                tmpfs   defaults        0 0
>         devpts                  /dev/pts                devpts  gid=5,mode=620  0 0
>         sysfs                   /sys                    sysfs   defaults        0 0
>         proc                    /proc                   proc    defaults        0 0
>         LABEL=SW-cciss/c0d0p2   swap                    swap    defaults        0 0
>         LABEL=plato:Mail        /var/spool/mail         gfs2    noatime,_netdev
>         LABEL=plato:VarTmp      /var/tmp                gfs2    _netdev
>         LABEL=plato:UsrLocal    /usr/local              gfs2    noatime,_netdev
>         LABEL=plato:UsrLocalProjects /usr/local/projects gfs2   noatime,_netdev
>         LABEL=plato:Home2       /home/socr              gfs2    noatime,_netdev
>         LABEL=plato:HomeNoBackup /home/socr/nobackup    gfs2    _netdev
>         LABEL=plato:DbBackup    /databases/backups      gfs2    noatime,_netdev
>         LABEL=plato:DbMol       /databases/mol          gfs2    noatime,_netdev
>         LABEL=plato:MolDbBlast  /databases/mol/blast    gfs2    noatime,_netdev
>         LABEL=plato:MolDbEmboss /databases/mol/emboss   gfs2    noatime,_netdev
>
>     * Kernel version is: 2.6.18-194.3.1.el5 and all nodes are x86_64.
>     * What's happening is every so often, we start seeing gfs2-related
>       task hangs in the logs.  In the last instance (last Friday)
>       we've got this:
>
>         Node 0:
>
>             [2010-07-30 13:23:25]INFO: task imap:25716 blocked for
>             more than 120 seconds.^M
>             [2010-07-30 13:23:25]"echo 0 >
>             /proc/sys/kernel/hung_task_timeout_secs" disables this
>             message.^M
>             [2010-07-30 13:23:25]imap          D ffff8100010825a0    
>             0 25716   9217         24080 25667 (NOTLB)^M
>             [2010-07-30 13:23:25] ffff810619b59bc8 0000000000000086
>             ffff810113233f10 ffffffff00000000^M
>             [2010-07-30 13:23:26] ffff81000f8c5cd0 000000000000000a
>             ffff810233416040 ffff81082fd05100^M
>             [2010-07-30 13:23:26] 00012196d153c88e 0000000000008b81
>             ffff810233416228 0000000f6a949180^M
>             [2010-07-30 13:23:26]Call Trace:^M
>             [2010-07-30 13:23:26] [<ffffffff887d0be6>]
>             :gfs2:gfs2_dirent_find+0x0/0x4e^M
>             [2010-07-30 13:23:26] [<ffffffff887d0c18>]
>             :gfs2:gfs2_dirent_find+0x32/0x4e^M
>             [2010-07-30 13:23:26] [<ffffffff887d5ee7>]
>             :gfs2:just_schedule+0x0/0xe^M
>             [2010-07-30 13:23:26] [<ffffffff887d5ef0>]
>             :gfs2:just_schedule+0x9/0xe^M
>             [2010-07-30 13:23:26] [<ffffffff80063a16>]
>             __wait_on_bit+0x40/0x6e^M
>             [2010-07-30 13:23:26] [<ffffffff887d5ee7>]
>             :gfs2:just_schedule+0x0/0xe^M
>             [2010-07-30 13:23:26] [<ffffffff80063ab0>]
>             out_of_line_wait_on_bit+0x6c/0x78^M
>             [2010-07-30 13:23:26] [<ffffffff800a0aec>]
>             wake_bit_function+0x0/0x23^M
>             [2010-07-30 13:23:26] [<ffffffff887d5ee2>]
>             :gfs2:gfs2_glock_wait+0x2b/0x30^M
>             [2010-07-30 13:23:26] [<ffffffff887e579e>]
>             :gfs2:gfs2_permission+0x83/0xd5^M
>             [2010-07-30 13:23:26] [<ffffffff887e5796>]
>             :gfs2:gfs2_permission+0x7b/0xd5^M
>             [2010-07-30 13:23:26] [<ffffffff8000ce97>]
>             do_lookup+0x65/0x1e6^M
>             [2010-07-30 13:23:26] [<ffffffff8000d918>]
>             permission+0x81/0xc8^M
>             [2010-07-30 13:23:26] [<ffffffff8000997f>]
>             __link_path_walk+0x173/0xf42^M
>             [2010-07-30 13:23:26] [<ffffffff8000e9e2>]
>             link_path_walk+0x42/0xb2^M
>             [2010-07-30 13:23:26] [<ffffffff8000ccb2>]
>             do_path_lookup+0x275/0x2f1^M
>             [2010-07-30 13:23:26] [<ffffffff8001280e>]
>             getname+0x15b/0x1c2^M
>             [2010-07-30 13:23:27] [<ffffffff80023876>]
>             __user_walk_fd+0x37/0x4c^M
>             [2010-07-30 13:23:27] [<ffffffff80028846>]
>             vfs_stat_fd+0x1b/0x4a^M
>             [2010-07-30 13:23:27] [<ffffffff800638b3>]
>             schedule_timeout+0x92/0xad^M
>             [2010-07-30 13:23:27] [<ffffffff80097dab>]
>             process_timeout+0x0/0x5^M
>             [2010-07-30 13:23:27] [<ffffffff800f8435>]
>             sys_epoll_wait+0x3b8/0x3f9^M
>             [2010-07-30 13:23:27] [<ffffffff800235a8>]
>             sys_newstat+0x19/0x31^M
>             [2010-07-30 13:23:27] [<ffffffff8005d229>]
>             tracesys+0x71/0xe0^M
>             [2010-07-30 13:23:27] [<ffffffff8005d28d>]
>             tracesys+0xd5/0xe0^M
>
>         Node 1:
>
>             [2010-07-30 13:23:59]INFO: task pdflush:623 blocked for
>             more than 120 seconds.^M
>             [2010-07-30 13:23:59]"echo 0 >
>             /proc/sys/kernel/hung_task_timeout_secs" disables this
>             message.^M
>             [2010-07-30 13:23:59]pdflush       D ffff810407069aa0    
>             0   623    291           624   622 (L-TLB)^M
>             [2010-07-30 13:23:59] ffff8106073c1bd0 0000000000000046
>             0000000000000001 ffff8103fea899a8^M
>             [2010-07-30 13:23:59] ffff8106073c1c30 000000000000000a
>             ffff8105fff7c0c0 ffff8107fff4c820^M
>             [2010-07-30 13:24:00] 0000ed85d9d7a027 0000000000011b50
>             ffff8105fff7c2a8 00000006f0a9d0d0^M
>             [2010-07-30 13:24:00]Call Trace:^M
>             [2010-07-30 13:24:00] [<ffffffff8001a927>]
>             submit_bh+0x10a/0x111^M
>             [2010-07-30 13:24:00] [<ffffffff88802ee7>]
>             :gfs2:just_schedule+0x0/0xe^M
>             [2010-07-30 13:24:00] [<ffffffff88802ef0>]
>             :gfs2:just_schedule+0x9/0xe^M
>             [2010-07-30 13:24:00] [<ffffffff80063a16>]
>             __wait_on_bit+0x40/0x6e^M
>             [2010-07-30 13:24:00] [<ffffffff88802ee7>]
>             :gfs2:just_schedule+0x0/0xe^M
>             [2010-07-30 13:24:00] [<ffffffff80063ab0>]
>             out_of_line_wait_on_bit+0x6c/0x78^M
>             [2010-07-30 13:24:00] [<ffffffff800a0aec>]
>             wake_bit_function+0x0/0x23^M
>             [2010-07-30 13:24:00] [<ffffffff88802ee2>]
>             :gfs2:gfs2_glock_wait+0x2b/0x30^M
>             [2010-07-30 13:24:00] [<ffffffff88813269>]
>             :gfs2:gfs2_write_inode+0x5f/0x152^M
>             [2010-07-30 13:24:00] [<ffffffff88813261>]
>             :gfs2:gfs2_write_inode+0x57/0x152^M
>             [2010-07-30 13:24:00] [<ffffffff8002fbf8>]
>             __writeback_single_inode+0x1e9/0x328^M
>             [2010-07-30 13:24:00] [<ffffffff80020ec9>]
>             sync_sb_inodes+0x1b5/0x26f^M
>             [2010-07-30 13:24:00] [<ffffffff800a08a6>]
>             keventd_create_kthread+0x0/0xc4^M
>             [2010-07-30 13:24:00] [<ffffffff8005123a>]
>             writeback_inodes+0x82/0xd8^M
>             [2010-07-30 13:24:00] [<ffffffff800c97b5>]
>             wb_kupdate+0xd4/0x14e^M
>             [2010-07-30 13:24:00] [<ffffffff80056879>] pdflush+0x0/0x1fb^M
>             [2010-07-30 13:24:00] [<ffffffff800569ca>]
>             pdflush+0x151/0x1fb^M
>             [2010-07-30 13:24:00] [<ffffffff800c96e1>]
>             wb_kupdate+0x0/0x14e^M
>             [2010-07-30 13:24:01] [<ffffffff80032894>]
>             kthread+0xfe/0x132^M
>             [2010-07-30 13:24:01] [<ffffffff8009d734>]
>             request_module+0x0/0x14d^M
>             [2010-07-30 13:24:01] [<ffffffff8005dfb1>]
>             child_rip+0xa/0x11^M
>             [2010-07-30 13:24:01] [<ffffffff800a08a6>]
>             keventd_create_kthread+0x0/0xc4^M
>             [2010-07-30 13:24:01] [<ffffffff80032796>] kthread+0x0/0x132^M
>             [2010-07-30 13:24:01] [<ffffffff8005dfa7>]
>             child_rip+0x0/0x11^M
>
>         Node 2:
>
>             [2010-07-30 13:24:46]INFO: task delete_workqueu:7175
>             blocked for more than 120 seconds.^M
>             [2010-07-30 13:24:46]"echo 0 >
>             /proc/sys/kernel/hung_task_timeout_secs" disables this
>             message.^M
>             [2010-07-30 13:24:46]delete_workqu D ffff81082b5cf860    
>             0  7175    329          7176  7174 (L-TLB)^M
>             [2010-07-30 13:24:46] ffff81081ed6dbf0 0000000000000046
>             0000000000000018 ffffffff887a84f3^M
>             [2010-07-30 13:24:46] 0000000000000286 000000000000000a
>             ffff81082dd477e0 ffff81082b5cf860^M
>             [2010-07-30 13:24:46] 00012166bf7ec21d 000000000002ed0b
>             ffff81082dd479c8 00000007887a9e5a^M
>             [2010-07-30 13:24:46]Call Trace:^M
>             [2010-07-30 13:24:46] [<ffffffff887a84f3>]
>             :dlm:request_lock+0x93/0xa0^M
>             [2010-07-30 13:24:47] [<ffffffff8884f556>]
>             :lock_dlm:gdlm_ast+0x0/0x311^M
>             [2010-07-30 13:24:47] [<ffffffff8884f2c1>]
>             :lock_dlm:gdlm_bast+0x0/0x8d^M
>             [2010-07-30 13:24:47] [<ffffffff887d3ee7>]
>             :gfs2:just_schedule+0x0/0xe^M
>             [2010-07-30 13:24:47] [<ffffffff887d3ef0>]
>             :gfs2:just_schedule+0x9/0xe^M
>             [2010-07-30 13:24:47] [<ffffffff80063a16>]
>             __wait_on_bit+0x40/0x6e^M
>             [2010-07-30 13:24:47] [<ffffffff887d3ee7>]
>             :gfs2:just_schedule+0x0/0xe^M
>             [2010-07-30 13:24:47] [<ffffffff80063ab0>]
>             out_of_line_wait_on_bit+0x6c/0x78^M
>             [2010-07-30 13:24:47] [<ffffffff800a0aec>]
>             wake_bit_function+0x0/0x23^M
>             [2010-07-30 13:24:47] [<ffffffff887d3ee2>]
>             :gfs2:gfs2_glock_wait+0x2b/0x30^M
>             [2010-07-30 13:24:47] [<ffffffff887e82cf>]
>             :gfs2:gfs2_check_blk_type+0xd7/0x1c9^M
>             [2010-07-30 13:24:47] [<ffffffff887e82c7>]
>             :gfs2:gfs2_check_blk_type+0xcf/0x1c9^M
>             [2010-07-30 13:24:47] [<ffffffff80063ab0>]
>             out_of_line_wait_on_bit+0x6c/0x78^M
>             [2010-07-30 13:24:47] [<ffffffff887e804f>]
>             :gfs2:gfs2_rindex_hold+0x32/0x12b^M
>             [2010-07-30 13:24:47] [<ffffffff887d5a29>]
>             :gfs2:delete_work_func+0x0/0x65^M
>             [2010-07-30 13:24:47] [<ffffffff887d5a29>]
>             :gfs2:delete_work_func+0x0/0x65^M
>             [2010-07-30 13:24:47] [<ffffffff887e3e3a>]
>             :gfs2:gfs2_delete_inode+0x76/0x1b4^M
>             [2010-07-30 13:24:47] [<ffffffff887e3e01>]
>             :gfs2:gfs2_delete_inode+0x3d/0x1b4^M
>             [2010-07-30 13:24:47] [<ffffffff8000d3ba>] dput+0x2c/0x114^M
>             [2010-07-30 13:24:48] [<ffffffff887e3dc4>]
>             :gfs2:gfs2_delete_inode+0x0/0x1b4^M
>             [2010-07-30 13:24:48] [<ffffffff8002f35e>]
>             generic_delete_inode+0xc6/0x143^M
>             [2010-07-30 13:24:48] [<ffffffff887d5a83>]
>             :gfs2:delete_work_func+0x5a/0x65^M
>             [2010-07-30 13:24:48] [<ffffffff8004d8f0>]
>             run_workqueue+0x94/0xe4^M
>             [2010-07-30 13:24:48] [<ffffffff8004a12b>]
>             worker_thread+0x0/0x122^M
>             [2010-07-30 13:24:48] [<ffffffff800a08a6>]
>             keventd_create_kthread+0x0/0xc4^M
>             [2010-07-30 13:24:48] [<ffffffff8004a21b>]
>             worker_thread+0xf0/0x122^M
>             [2010-07-30 13:24:48] [<ffffffff8008d087>]
>             default_wake_function+0x0/0xe^M
>             [2010-07-30 13:24:48] [<ffffffff800a08a6>]
>             keventd_create_kthread+0x0/0xc4^M
>             [2010-07-30 13:24:48] [<ffffffff800a08a6>]
>             keventd_create_kthread+0x0/0xc4^M
>             [2010-07-30 13:24:48] [<ffffffff80032894>]
>             kthread+0xfe/0x132^M
>             [2010-07-30 13:24:48] [<ffffffff8005dfb1>]
>             child_rip+0xa/0x11^M
>             [2010-07-30 13:24:48] [<ffffffff800a08a6>]
>             keventd_create_kthread+0x0/0xc4^M
>             [2010-07-30 13:24:48] [<ffffffff80032796>] kthread+0x0/0x132^M
>             [2010-07-30 13:24:48] [<ffffffff8005dfa7>]
>             child_rip+0x0/0x11^M
>
>     * Various messages related to hung_task_timeouts repeated on each
>       node (usually related to imap).
>     * Within a minute or two, the cluster was completely hung.  Root
>       could log into the console, but commands (like dmesg) would just
>       hang.
>
> So, my major question:  is there something wrong with my 
> configuration?  Have we done something really stupid?  The initial 
> response from RedHat was that we shouldn't run services on multiple 
> nodes that access gfs2, which seems a little confusing since we would 
> use ext3 or ext4 if we were going to node lock (or failover) the 
> partitions.  Have we missed something somewhere?
>
> Thanks in advance for any help anyone can give.  We're getting pretty 
> desperate here since the downtime is starting to have a significant 
> impact on our credibility.
>
> -- scooter
>
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

-- 
Jeff Howell
Sr. Linux Administrator
Media News Group interactive
303.563.6394 jhowell at medianewsgroup.com


From Chris.Jankowski at hp.com  Fri Sep 17 03:58:59 2010
From: Chris.Jankowski at hp.com (Jankowski, Chris)
Date: Fri, 17 Sep 2010 03:58:59 +0000
Subject: [Linux-cluster] GFS2 changes between RHEL AP V5.x and 6?
Message-ID: <036B68E61A28CA49AC2767596576CD596BADB1859E@GVW1113EXC.americas.hpqcorp.net>

Hi,

I read the beta 2 release notes for RHEL 6.  It mentions numerous changes in the cluster for RHEL 6, but nothing about GFS2.

Are there any GFS2 changes in RHEL 6 compared with RHEL 5.x?

Thanks and regards,

Chris Jankowski


From swhiteho at redhat.com  Fri Sep 17 09:19:34 2010
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Fri, 17 Sep 2010 10:19:34 +0100
Subject: [Linux-cluster] Continuing gfs2 problems: Am I doing something
 wrong????
In-Reply-To: <4C92818C.9010003@medianewsgroup.com>
References: <4C587029.6090909@cgl.ucsf.edu>
	<4C92818C.9010003@medianewsgroup.com>
Message-ID: <1284715174.2821.3.camel@dolmen>

Hi,

On Thu, 2010-09-16 at 14:43 -0600, Jeff Howell wrote:
> I'm having an identical problem.
> 
> I have 2 nodes running a Wordpress instance with a TCP load balancer in 
> front of them distributing http requests between them.
> 
> In the last 2 days, I've had 10+ instances where the GFS2 volume hangs 
> with:
> 
> Sep 16 14:05:10 wordpress3 kernel: "echo 0 > 
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 16 14:05:10 wordpress3 kernel: delete_workqu D 00000272  2676  
> 3687     19          3688  3686 (L-TLB)
> Sep 16 14:05:10 wordpress3 kernel:        f7839e38 00000046 3f1c322e 
> 00000272 00000000 f57ab400 f7839df8 0000000a
> Sep 16 14:05:10 wordpress3 kernel:        c3217aa0 3f1dcca8 00000272 
> 00019a7a 00000001 c3217bac c3019744 f57c5ac0
> Sep 16 14:05:10 wordpress3 kernel:        f8afa21c 00000003 f26162f0 
> 00000000 f2213df8 00000018 c3019c00 f7839e6c
> Sep 16 14:05:10 wordpress3 kernel: Call Trace:
> Sep 16 14:05:10 wordpress3 kernel:  [<f8afa21c>] gdlm_bast+0x0/0x78 
> [lock_dlm]
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c3910e>] just_schedule+0x5/0x8 
> [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<c061d2f5>] __wait_on_bit+0x33/0x58
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c39109>] just_schedule+0x0/0x8 
> [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c39109>] just_schedule+0x0/0x8 
> [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<c061d37c>] 
> out_of_line_wait_on_bit+0x62/0x6a
> Sep 16 14:05:10 wordpress3 kernel:  [<c0436098>] wake_bit_function+0x0/0x3c
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c39102>] 
> gfs2_glock_wait+0x27/0x2e [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c4c667>] 
> gfs2_check_blk_type+0xbc/0x18c [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<c061d312>] __wait_on_bit+0x50/0x58
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c39109>] just_schedule+0x0/0x8 
> [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c4c660>] 
> gfs2_check_blk_type+0xb5/0x18c [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c4c3c8>] 
> gfs2_rindex_hold+0x2b/0x148 [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c48273>] 
> gfs2_delete_inode+0x6f/0x1a1 [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c4823b>] 
> gfs2_delete_inode+0x37/0x1a1 [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c48204>] 
> gfs2_delete_inode+0x0/0x1a1 [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<c048cb02>] 
> generic_delete_inode+0xa5/0x10f
> Sep 16 14:05:10 wordpress3 kernel:  [<c048c5a6>] iput+0x64/0x66
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c3a8bb>] 
> delete_work_func+0x49/0x53 [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<c04332da>] run_workqueue+0x78/0xb5
> Sep 16 14:05:10 wordpress3 kernel:  [<f8c3a872>] 
> delete_work_func+0x0/0x53 [gfs2]
> Sep 16 14:05:10 wordpress3 kernel:  [<c0433b8e>] worker_thread+0xd9/0x10b
> Sep 16 14:05:10 wordpress3 kernel:  [<c041f81b>] 
> default_wake_function+0x0/0xc
> Sep 16 14:05:10 wordpress3 kernel:  [<c0433ab5>] worker_thread+0x0/0x10b
> Sep 16 14:05:10 wordpress3 kernel:  [<c0435fa7>] kthread+0xc0/0xed
> Sep 16 14:05:10 wordpress3 kernel:  [<c0435ee7>] kthread+0x0/0xed
> Sep 16 14:05:10 wordpress3 kernel:  [<c0405c53>] 
> kernel_thread_helper+0x7/0x10
> 
> And then a bunch more for the httpd processes. I can pretty much 
> reproduce this consistently by untarring a large tarball on the volume. 
> Seems like anything IO intensive is causing this behavior.
> 
> Running CentOS 5.5 with kernel 2.6.18-194.11.1.el5 #1 SMP Tue Aug 10 
> 19:09:06 EDT 2010 i686 i686 i386 GNU/Linux
> 
> I tried the hangalizer program and it always came back with:
> /bin/ls: /gfs2/: No such file or directoryhb.medianewsgroup.com "/bin/ls 
> /gfs2/"
> /bin/ls: /gfs2/: No such file or directoryhb.medianewsgroup.com "/bin/ls 
> /gfs2/"
> No waiting glocks found on any node.
> 
> Any Ideas?
> 
Can you report this via our support team? Or, if you don't have a
support contract, at least via bugzilla, so that we have a record of the
problem which won't get missed?

That doesn't look at all right to me, so I'd like to get to the bottom
of what is going on here.
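
When filing such a report, a short summary of which tasks the kernel
flagged as blocked can be useful alongside the full traces. The
following is only a minimal sketch, assuming the log text is in the
wrapped, quoted form seen in this thread; the function name
summarize_hung_tasks is made up for illustration and is not part of any
GFS2 tool:

```python
import re
from collections import Counter

# Header the kernel prints for a hung task, e.g.
# "INFO: task imap:25716 blocked for more than 120 seconds."
HUNG_TASK = re.compile(
    r"INFO: task (\S+):(\d+)\s+blocked for\s+more than (\d+) seconds"
)


def summarize_hung_tasks(log_text):
    """Count how often each task name is reported as blocked in log_text.

    Hypothetical helper: it only looks at the "INFO: task ..." header
    lines and ignores the call traces that follow them.
    """
    # Drop literal "^M" markers and real carriage returns left by the
    # serial console capture.
    flat = re.sub(r"\^M|\r", "", log_text)
    # Mail wrapping can split a header over two lines, so join lines and
    # strip any leading "> " quote markers and indentation.
    flat = re.sub(r"\n[> ]*", " ", flat)
    flat = re.sub(r"\s+", " ", flat)
    return Counter(name for name, _pid, _secs in HUNG_TASK.findall(flat))
```

Run over the logs from each node, the resulting counts give a quick
view of whether the hangs cluster around one service (imap, httpd) or
around GFS2's own worker threads.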

> On 08/03/2010 01:38 PM, Scooter Morris wrote:
> > HI all,
> >     We continue to have gfs2 crashes and hangs on our production 
> > cluster, so I'm beginning to think that we've done something really 
> > wrong.  Here is our set-up:
> >
> >     * 4 node cluster, only 3 participate in gfs2 filesystems
> >     * Running several services on multiple nodes using gfs2:
> >           o IMAP (dovecot)
> >           o Web (apache with lots of python)
> >           o Samba (using ctdb)
> >     * GFS2 partitions are multipathed on an HP EVA-based SAN (no LVM)
> >       -- here is fstab from one node (the three nodes are all the same):
> >
> >         LABEL=/1                /                       ext3   
> >         defaults        1 1
> >         LABEL=/boot1            /boot                   ext3   
> >         defaults        1 2
> >         tmpfs                   /dev/shm                tmpfs  
> >         defaults        0 0
> >         devpts                  /dev/pts                devpts 
> >         gid=5,mode=620  0 0
> >         sysfs                   /sys                    sysfs  
> >         defaults        0 0
> >         proc                    /proc                   proc   
> >         defaults        0 0
> >         LABEL=SW-cciss/c0d0p2   swap                    swap   
> >         defaults        0 0
> >         LABEL=plato:Mail        /var/spool/mail         gfs2   
> >         noatime,_netdev
> >         LABEL=plato:VarTmp      /var/tmp                gfs2    _netdev
> >         LABEL=plato:UsrLocal    /usr/local              gfs2   
> >         noatime,_netdev
> >         LABEL=plato:UsrLocalProjects /usr/local/projects gfs2  
> >         noatime,_netdev
> >         LABEL=plato:Home2       /home/socr              gfs2   
> >         noatime,_netdev
> >         LABEL=plato:HomeNoBackup /home/socr/nobackup    gfs2    _netdev
> >         LABEL=plato:DbBackup    /databases/backups      gfs2   
> >         noatime,_netdev
> >         LABEL=plato:DbMol       /databases/mol          gfs2   
> >         noatime,_netdev
> >         LABEL=plato:MolDbBlast  /databases/mol/blast    gfs2   
> >         noatime,_netdev
> >         LABEL=plato:MolDbEmboss /databases/mol/emboss   gfs2   
> >         noatime,_netdev
> >
> >     * Kernel version is: 2.6.18-194.3.1.el5 and all nodes are x86_64.
> >     * What's happening is every so often, we start seeing gfs2-related
> >       task hangs in the logs.  In the last instance (last Friday)
> >       we've got this:
> >
> >         Node 0:
> >
> >             [2010-07-30 13:23:25]INFO: task imap:25716 blocked for
> >             more than 120 seconds.^M
> >             [2010-07-30 13:23:25]"echo 0 >
> >             /proc/sys/kernel/hung_task_timeout_secs" disables this
> >             message.^M
> >             [2010-07-30 13:23:25]imap          D ffff8100010825a0    
> >             0 25716   9217         24080 25667 (NOTLB)^M
> >             [2010-07-30 13:23:25] ffff810619b59bc8 0000000000000086
> >             ffff810113233f10 ffffffff00000000^M
> >             [2010-07-30 13:23:26] ffff81000f8c5cd0 000000000000000a
> >             ffff810233416040 ffff81082fd05100^M
> >             [2010-07-30 13:23:26] 00012196d153c88e 0000000000008b81
> >             ffff810233416228 0000000f6a949180^M
> >             [2010-07-30 13:23:26]Call Trace:^M
> >             [2010-07-30 13:23:26] [<ffffffff887d0be6>]
> >             :gfs2:gfs2_dirent_find+0x0/0x4e^M
> >             [2010-07-30 13:23:26] [<ffffffff887d0c18>]
> >             :gfs2:gfs2_dirent_find+0x32/0x4e^M
> >             [2010-07-30 13:23:26] [<ffffffff887d5ee7>]
> >             :gfs2:just_schedule+0x0/0xe^M
> >             [2010-07-30 13:23:26] [<ffffffff887d5ef0>]
> >             :gfs2:just_schedule+0x9/0xe^M
> >             [2010-07-30 13:23:26] [<ffffffff80063a16>]
> >             __wait_on_bit+0x40/0x6e^M
> >             [2010-07-30 13:23:26] [<ffffffff887d5ee7>]
> >             :gfs2:just_schedule+0x0/0xe^M
> >             [2010-07-30 13:23:26] [<ffffffff80063ab0>]
> >             out_of_line_wait_on_bit+0x6c/0x78^M
> >             [2010-07-30 13:23:26] [<ffffffff800a0aec>]
> >             wake_bit_function+0x0/0x23^M
> >             [2010-07-30 13:23:26] [<ffffffff887d5ee2>]
> >             :gfs2:gfs2_glock_wait+0x2b/0x30^M
> >             [2010-07-30 13:23:26] [<ffffffff887e579e>]
> >             :gfs2:gfs2_permission+0x83/0xd5^M
> >             [2010-07-30 13:23:26] [<ffffffff887e5796>]
> >             :gfs2:gfs2_permission+0x7b/0xd5^M
> >             [2010-07-30 13:23:26] [<ffffffff8000ce97>]
> >             do_lookup+0x65/0x1e6^M
> >             [2010-07-30 13:23:26] [<ffffffff8000d918>]
> >             permission+0x81/0xc8^M
> >             [2010-07-30 13:23:26] [<ffffffff8000997f>]
> >             __link_path_walk+0x173/0xf42^M
> >             [2010-07-30 13:23:26] [<ffffffff8000e9e2>]
> >             link_path_walk+0x42/0xb2^M
> >             [2010-07-30 13:23:26] [<ffffffff8000ccb2>]
> >             do_path_lookup+0x275/0x2f1^M
> >             [2010-07-30 13:23:26] [<ffffffff8001280e>]
> >             getname+0x15b/0x1c2^M
> >             [2010-07-30 13:23:27] [<ffffffff80023876>]
> >             __user_walk_fd+0x37/0x4c^M
> >             [2010-07-30 13:23:27] [<ffffffff80028846>]
> >             vfs_stat_fd+0x1b/0x4a^M
> >             [2010-07-30 13:23:27] [<ffffffff800638b3>]
> >             schedule_timeout+0x92/0xad^M
> >             [2010-07-30 13:23:27] [<ffffffff80097dab>]
> >             process_timeout+0x0/0x5^M
> >             [2010-07-30 13:23:27] [<ffffffff800f8435>]
> >             sys_epoll_wait+0x3b8/0x3f9^M
> >             [2010-07-30 13:23:27] [<ffffffff800235a8>]
> >             sys_newstat+0x19/0x31^M
> >             [2010-07-30 13:23:27] [<ffffffff8005d229>]
> >             tracesys+0x71/0xe0^M
> >             [2010-07-30 13:23:27] [<ffffffff8005d28d>]
> >             tracesys+0xd5/0xe0^M
> >
> >         Node 1:
> >
> >             [2010-07-30 13:23:59]INFO: task pdflush:623 blocked for
> >             more than 120 seconds.^M
> >             [2010-07-30 13:23:59]"echo 0 >
> >             /proc/sys/kernel/hung_task_timeout_secs" disables this
> >             message.^M
> >             [2010-07-30 13:23:59]pdflush       D ffff810407069aa0    
> >             0   623    291           624   622 (L-TLB)^M
> >             [2010-07-30 13:23:59] ffff8106073c1bd0 0000000000000046
> >             0000000000000001 ffff8103fea899a8^M
> >             [2010-07-30 13:23:59] ffff8106073c1c30 000000000000000a
> >             ffff8105fff7c0c0 ffff8107fff4c820^M
> >             [2010-07-30 13:24:00] 0000ed85d9d7a027 0000000000011b50
> >             ffff8105fff7c2a8 00000006f0a9d0d0^M
> >             [2010-07-30 13:24:00]Call Trace:^M
> >             [2010-07-30 13:24:00] [<ffffffff8001a927>]
> >             submit_bh+0x10a/0x111^M
> >             [2010-07-30 13:24:00] [<ffffffff88802ee7>]
> >             :gfs2:just_schedule+0x0/0xe^M
> >             [2010-07-30 13:24:00] [<ffffffff88802ef0>]
> >             :gfs2:just_schedule+0x9/0xe^M
> >             [2010-07-30 13:24:00] [<ffffffff80063a16>]
> >             __wait_on_bit+0x40/0x6e^M
> >             [2010-07-30 13:24:00] [<ffffffff88802ee7>]
> >             :gfs2:just_schedule+0x0/0xe^M
> >             [2010-07-30 13:24:00] [<ffffffff80063ab0>]
> >             out_of_line_wait_on_bit+0x6c/0x78^M
> >             [2010-07-30 13:24:00] [<ffffffff800a0aec>]
> >             wake_bit_function+0x0/0x23^M
> >             [2010-07-30 13:24:00] [<ffffffff88802ee2>]
> >             :gfs2:gfs2_glock_wait+0x2b/0x30^M
> >             [2010-07-30 13:24:00] [<ffffffff88813269>]
> >             :gfs2:gfs2_write_inode+0x5f/0x152^M
> >             [2010-07-30 13:24:00] [<ffffffff88813261>]
> >             :gfs2:gfs2_write_inode+0x57/0x152^M
> >             [2010-07-30 13:24:00] [<ffffffff8002fbf8>]
> >             __writeback_single_inode+0x1e9/0x328^M
> >             [2010-07-30 13:24:00] [<ffffffff80020ec9>]
> >             sync_sb_inodes+0x1b5/0x26f^M
> >             [2010-07-30 13:24:00] [<ffffffff800a08a6>]
> >             keventd_create_kthread+0x0/0xc4^M
> >             [2010-07-30 13:24:00] [<ffffffff8005123a>]
> >             writeback_inodes+0x82/0xd8^M
> >             [2010-07-30 13:24:00] [<ffffffff800c97b5>]
> >             wb_kupdate+0xd4/0x14e^M
> >             [2010-07-30 13:24:00] [<ffffffff80056879>] pdflush+0x0/0x1fb^M
> >             [2010-07-30 13:24:00] [<ffffffff800569ca>]
> >             pdflush+0x151/0x1fb^M
> >             [2010-07-30 13:24:00] [<ffffffff800c96e1>]
> >             wb_kupdate+0x0/0x14e^M
> >             [2010-07-30 13:24:01] [<ffffffff80032894>]
> >             kthread+0xfe/0x132^M
> >             [2010-07-30 13:24:01] [<ffffffff8009d734>]
> >             request_module+0x0/0x14d^M
> >             [2010-07-30 13:24:01] [<ffffffff8005dfb1>]
> >             child_rip+0xa/0x11^M
> >             [2010-07-30 13:24:01] [<ffffffff800a08a6>]
> >             keventd_create_kthread+0x0/0xc4^M
> >             [2010-07-30 13:24:01] [<ffffffff80032796>] kthread+0x0/0x132^M
> >             [2010-07-30 13:24:01] [<ffffffff8005dfa7>]
> >             child_rip+0x0/0x11^M
> >
> >         Node 2:
> >
> >             [2010-07-30 13:24:46]INFO: task delete_workqueu:7175
> >             blocked for more than 120 seconds.^M
> >             [2010-07-30 13:24:46]"echo 0 >
> >             /proc/sys/kernel/hung_task_timeout_secs" disables this
> >             message.^M
> >             [2010-07-30 13:24:46]delete_workqu D ffff81082b5cf860    
> >             0  7175    329          7176  7174 (L-TLB)^M
> >             [2010-07-30 13:24:46] ffff81081ed6dbf0 0000000000000046
> >             0000000000000018 ffffffff887a84f3^M
> >             [2010-07-30 13:24:46] 0000000000000286 000000000000000a
> >             ffff81082dd477e0 ffff81082b5cf860^M
> >             [2010-07-30 13:24:46] 00012166bf7ec21d 000000000002ed0b
> >             ffff81082dd479c8 00000007887a9e5a^M
> >             [2010-07-30 13:24:46]Call Trace:^M
> >             [2010-07-30 13:24:46] [<ffffffff887a84f3>]
> >             :dlm:request_lock+0x93/0xa0^M
> >             [2010-07-30 13:24:47] [<ffffffff8884f556>]
> >             :lock_dlm:gdlm_ast+0x0/0x311^M
> >             [2010-07-30 13:24:47] [<ffffffff8884f2c1>]
> >             :lock_dlm:gdlm_bast+0x0/0x8d^M
> >             [2010-07-30 13:24:47] [<ffffffff887d3ee7>]
> >             :gfs2:just_schedule+0x0/0xe^M
> >             [2010-07-30 13:24:47] [<ffffffff887d3ef0>]
> >             :gfs2:just_schedule+0x9/0xe^M
> >             [2010-07-30 13:24:47] [<ffffffff80063a16>]
> >             __wait_on_bit+0x40/0x6e^M
> >             [2010-07-30 13:24:47] [<ffffffff887d3ee7>]
> >             :gfs2:just_schedule+0x0/0xe^M
> >             [2010-07-30 13:24:47] [<ffffffff80063ab0>]
> >             out_of_line_wait_on_bit+0x6c/0x78^M
> >             [2010-07-30 13:24:47] [<ffffffff800a0aec>]
> >             wake_bit_function+0x0/0x23^M
> >             [2010-07-30 13:24:47] [<ffffffff887d3ee2>]
> >             :gfs2:gfs2_glock_wait+0x2b/0x30^M
> >             [2010-07-30 13:24:47] [<ffffffff887e82cf>]
> >             :gfs2:gfs2_check_blk_type+0xd7/0x1c9^M
> >             [2010-07-30 13:24:47] [<ffffffff887e82c7>]
> >             :gfs2:gfs2_check_blk_type+0xcf/0x1c9^M
> >             [2010-07-30 13:24:47] [<ffffffff80063ab0>]
> >             out_of_line_wait_on_bit+0x6c/0x78^M
> >             [2010-07-30 13:24:47] [<ffffffff887e804f>]
> >             :gfs2:gfs2_rindex_hold+0x32/0x12b^M
> >             [2010-07-30 13:24:47] [<ffffffff887d5a29>]
> >             :gfs2:delete_work_func+0x0/0x65^M
> >             [2010-07-30 13:24:47] [<ffffffff887d5a29>]
> >             :gfs2:delete_work_func+0x0/0x65^M
> >             [2010-07-30 13:24:47] [<ffffffff887e3e3a>]
> >             :gfs2:gfs2_delete_inode+0x76/0x1b4^M
> >             [2010-07-30 13:24:47] [<ffffffff887e3e01>]
> >             :gfs2:gfs2_delete_inode+0x3d/0x1b4^M
> >             [2010-07-30 13:24:47] [<ffffffff8000d3ba>] dput+0x2c/0x114^M
> >             [2010-07-30 13:24:48] [<ffffffff887e3dc4>]
> >             :gfs2:gfs2_delete_inode+0x0/0x1b4^M
> >             [2010-07-30 13:24:48] [<ffffffff8002f35e>]
> >             generic_delete_inode+0xc6/0x143^M
> >             [2010-07-30 13:24:48] [<ffffffff887d5a83>]
> >             :gfs2:delete_work_func+0x5a/0x65^M
> >             [2010-07-30 13:24:48] [<ffffffff8004d8f0>]
> >             run_workqueue+0x94/0xe4^M
> >             [2010-07-30 13:24:48] [<ffffffff8004a12b>]
> >             worker_thread+0x0/0x122^M
> >             [2010-07-30 13:24:48] [<ffffffff800a08a6>]
> >             keventd_create_kthread+0x0/0xc4^M
> >             [2010-07-30 13:24:48] [<ffffffff8004a21b>]
> >             worker_thread+0xf0/0x122^M
> >             [2010-07-30 13:24:48] [<ffffffff8008d087>]
> >             default_wake_function+0x0/0xe^M
> >             [2010-07-30 13:24:48] [<ffffffff800a08a6>]
> >             keventd_create_kthread+0x0/0xc4^M
> >             [2010-07-30 13:24:48] [<ffffffff800a08a6>]
> >             keventd_create_kthread+0x0/0xc4^M
> >             [2010-07-30 13:24:48] [<ffffffff80032894>]
> >             kthread+0xfe/0x132^M
> >             [2010-07-30 13:24:48] [<ffffffff8005dfb1>]
> >             child_rip+0xa/0x11^M
> >             [2010-07-30 13:24:48] [<ffffffff800a08a6>]
> >             keventd_create_kthread+0x0/0xc4^M
> >             [2010-07-30 13:24:48] [<ffffffff80032796>] kthread+0x0/0x132^M
> >             [2010-07-30 13:24:48] [<ffffffff8005dfa7>]
> >             child_rip+0x0/0x11^M
> >
> >     * Various messages related to hung_task_timeouts repeated on each
> >       node (usually related to imap).
> >     * Within a minute or two, the cluster was completely hung.  Root
> >       could log into the console, but commands (like dmesg) would just
> >       hang.
> >
> > So, my major question:  is there something wrong with my 
> > configuration?  Have we done something really stupid?  The initial 
> > response from RedHat was that we shouldn't run services on multiple 
> > nodes that access gfs2, which seems a little confusing since we would 
> > use ext3 or ext4 if we were going to node lock (or failover) the 
> > partitions.  Have we missed something somewhere?
> >
That doesn't sound quite right... our guidance is not to run NFS/Samba
either together on the same GFS2 directory tree or in combination with
local applications. Otherwise there shouldn't be any issues with running
multiple applications on the same GFS2 tree/mount,

Steve.

> > Thanks in advance for any help anyone can give.  We're getting pretty 
> > desperate here since the downtime is starting to have a significant 
> > impact on our credibility.
> >
> > -- scooter
> >
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> 


From swhiteho at redhat.com  Fri Sep 17 21:10:06 2010
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Fri, 17 Sep 2010 22:10:06 +0100
Subject: [Linux-cluster] GFS2 changes between RHEL AP V5.x and 6?
In-Reply-To: <036B68E61A28CA49AC2767596576CD596BADB1859E@GVW1113EXC.americas.hpqcorp.net>
References: <036B68E61A28CA49AC2767596576CD596BADB1859E@GVW1113EXC.americas.hpqcorp.net>
Message-ID: <1284757806.2821.44.camel@dolmen>

Hi,

On Fri, 2010-09-17 at 03:58 +0000, Jankowski, Chris wrote:
> Hi,
>  
> I read the beta 2 release notes for RHEL 6.  It mentions numerous
> changes in the cluster for RHEL 6, but nothing about GFS2.
>  
> Are there any GFS2 changes in RHEL 6 compared with RHEL 5.x?

There are a few. Not a huge number though, as most of the changes are
cleanup, performance and stability related.

GFS2 in RHEL6 supports barriers, and the lock_dlm module has gone away
(gfs2 talks direct to the dlm without needing a layer between the two).
Support for discards is also new. Also a long standing bug relating to
umount order and bind mounts has been fixed. Tracepoints for GFS2 are
new too.

I'm sure there was supposed to be a list somewhere in the docs, but
maybe not the release notes... I forget now where it landed up,

Steve.

>  
> Thanks and regards,
>  
> Chris Jankowski
>  
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From sunhux at gmail.com  Sat Sep 18 03:26:02 2010
From: sunhux at gmail.com (sunhux G)
Date: Sat, 18 Sep 2010 11:26:02 +0800
Subject: [Linux-cluster] Netscape directory servers 6.1 stopped replicating
	to each other
Message-ID: <AANLkTinPVy83sLLnwPC9heUnXY6F8yqkeFbBL2FnYm90@mail.gmail.com>

We run a pair of Netscape directory servers V6.1 on two separate
servers (they're not clustered) and their LDAP entries replicate/
synchronize to each other.

This replication had been running fine for years, but a couple of
weeks ago it stopped working. At about that time, I noticed an
alert "port LDAP (389) is not responding".

Though TCP port 389 subsequently recovered, the replication
did not.

Judging from the logs, the following link describes this issue very well:
https://bugzilla.redhat.com/show_bug.cgi?id=233642

Red Hat has a patch for the directory server, but for the Linux version only:
http://rhn.redhat.com/errata/RHSA-2008-0602.html

As we can't locate the patch for our version/platform (HP-UX), we
followed the workaround given at the link below:
http://rhn.redhat.com/errata/RHSA-2008-0602.html
i.e.
1. Shut down both LDAP servers.
2. Remove the replication agreement.
3. Export data from the LDAP server with the latest data.
4. Import it into the other LDAP server.
5. Remove all change log DBs.
6. Add back the replication agreement.

This got replication working for a few weeks, but unfortunately
the same problem has resurfaced.

I would appreciate hearing from anyone who has any insight into
this issue.

Thanks
U

From kitgerrits at gmail.com  Sat Sep 18 09:04:05 2010
From: kitgerrits at gmail.com (Kit Gerrits)
Date: Sat, 18 Sep 2010 11:04:05 +0200
Subject: [Linux-cluster] Netscape directory servers 6.1 stopped
	replicatingto each other
In-Reply-To: <AANLkTinPVy83sLLnwPC9heUnXY6F8yqkeFbBL2FnYm90@mail.gmail.com>
Message-ID: <4c948081.887b0e0a.0249.589b@mx.google.com>

I found a note near the bottom of the bug:
https://bugzilla.redhat.com/show_bug.cgi?id=233642#c22

"We will probably not make any more RHEL4 binaries available.  It's even very
difficult for us to provide binaries that can run on RHEL5 (Fedora Core 6).  If
you want the latest code, you can build it yourself.  If you feel so inclined,
I can help you with that."

By the sound of this, you may need to rebuild the package from source, but
at least they may be able to help you with that.
 
Regards,
Kit Gerrits



From joelh at planetjoel.com  Mon Sep 20 06:21:29 2010
From: joelh at planetjoel.com (Joel Heenan)
Date: Mon, 20 Sep 2010 16:21:29 +1000
Subject: [Linux-cluster] What does FAIL_STOP_WAIT state mean for clvmd
	and rgmanager
In-Reply-To: <1284055399.2207.16065.camel@ayanami.boston.devel.redhat.com>
References: <AANLkTimdQRytDkt2hYh0pvVA2KTXd+B9H+_x-FwjXSk2@mail.gmail.com>
	<1284055399.2207.16065.camel@ayanami.boston.devel.redhat.com>
Message-ID: <AANLkTi=3E6S=xybCSjBi3uGWHWg+AYRkZuDVU28MYBwT@mail.gmail.com>

I'm not sure; possibly it was from doing a "service cman restart".

I understand it's always preferable to reboot with Cluster Suite, but some of
our physical hosts can take 20 minutes to do a full reboot, so I'm always
looking for some way to fix them online.

Joel

On Fri, Sep 10, 2010 at 4:03 AM, Lon Hohberger <lhh at redhat.com> wrote:

> On Mon, 2010-08-23 at 17:58 +1000, Joel Heenan wrote:
> > Can someone please explain what this means and what you can do to get
> > out of it:
> >
> > [root at cluster-host ~]# group_tool -v
> > type             level name       id       state node id local_done
> > fence            0     default    00010003 JOIN_STOP_WAIT 1 100050001
> > 1
> > [1 1 2 3 4]
> > dlm              1     clvmd      00020003 FAIL_STOP_WAIT 2 200030003
> > 1
> > [1 2 3 4]
> > dlm              1     rgmanager  00030003 FAIL_STOP_WAIT 2 200030003
> > 1
> > [1 2 3 4]
>
> It looks like fencing has not completed.  How do you have 2 node 1's in
> the fencing group?
>
> -- Lon
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>

From rohit2525 at gmail.com  Tue Sep 21 18:33:19 2010
From: rohit2525 at gmail.com (Rohit)
Date: Wed, 22 Sep 2010 00:03:19 +0530
Subject: [Linux-cluster] cluster of RHEL 5.4 AP in IBM Blade Center S with
	blade server HS22
Message-ID: <006901cb59bb$7994ea70$0c01a8c0@shilpi>

Dear Sir,

Please help me create a cluster of RHEL 5.4 AP in an IBM Blade Center S with blade server HS22.

I went through the Red Hat site for cluster installation, but I did not succeed, so please give me step-by-step instructions for installing a cluster on the hardware above.


Regards,

Rohit

From dgeevarg at redhat.com  Wed Sep 22 02:52:26 2010
From: dgeevarg at redhat.com (Dominic Geevarghese)
Date: Wed, 22 Sep 2010 08:22:26 +0530
Subject: [Linux-cluster] cluster of RHEL 5.4 AP in IBM Blade Center S
 with	blade server HS22
In-Reply-To: <006901cb59bb$7994ea70$0c01a8c0@shilpi>
References: <006901cb59bb$7994ea70$0c01a8c0@shilpi>
Message-ID: <4C996F6A.4080102@redhat.com>

On 09/22/2010 12:03 AM, Rohit wrote:
> Dear Sir
> pls help me to create cluster  of RHEL 5.4 AP in IBM 
> Blade Center S with blade server HS22
> pls help me i had go through redhat site for cluster installation but 
> i m not susceed so pls tell me and give me step by step installation 
> of cluster above said hardware.
The RH documentation is all you need to install and manage RH Cluster Suite.
So please share the steps you followed and the error you got, which
makes it much easier for everybody on this list to help.

-- Dominic
> regard
> Rohit
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From Jost.Rakovec at snt.si  Wed Sep 22 09:08:37 2010
From: Jost.Rakovec at snt.si (Rakovec Jost)
Date: Wed, 22 Sep 2010 11:08:37 +0200
Subject: [Linux-cluster] fence in xen
In-Reply-To: <3754ED14F3EE0C459DEFE2DF184515FF0F101C719D@SIMAIL.snt-is.com>
References: <3754ED14F3EE0C459DEFE2DF184515FF0F101C719C@SIMAIL.snt-is.com>,
	<3754ED14F3EE0C459DEFE2DF184515FF0F101C719D@SIMAIL.snt-is.com>
Message-ID: <3754ED14F3EE0C459DEFE2DF184515FF0F101C71BD@SIMAIL.snt-is.com>

Hi

Does anybody have any idea? Please help!


Now I can fence the node, but after booting it can't connect to the cluster.

On dom0:

 fence_xvmd -LX -I xenbr0 -U xen:/// -fdddddddddddddd


ipv4_connect: Connecting to client
ipv4_connect: Success; fd = 12
Rebooting domain oelcl21...
[REBOOT] Calling virDomainDestroy(0x99cede0)
libvir: Xen error : Domain not found: xenUnifiedDomainLookupByName
[[ XML Domain Info ]]
<domain type='xen' id='41'>
  <name>oelcl21</name>
  <uuid>07e31b27-1ff1-4754-4f58-221e8d2057d6</uuid>
  <memory>1048576</memory>
  <currentMemory>1048576</currentMemory>
  <vcpu>2</vcpu>
  <bootloader>/usr/bin/pygrub</bootloader>
  <os>
    <type>linux</type>
  </os>
  <clock offset='utc'/>
  <on_poweroff>destroy</on_poweroff>
  <on_reboot>restart</on_reboot>
  <on_crash>restart</on_crash>
  <devices>
    <disk type='block' device='disk'>
      <driver name='phy'/>
      <source dev='/dev/vg_datastore/oelcl21'/>
      <target dev='xvda' bus='xen'/>
    </disk>
    <disk type='block' device='disk'>
      <driver name='phy'/>
      <source dev='/dev/vg_datastore/skupni1'/>
      <target dev='xvdb' bus='xen'/>
      <shareable/>
    </disk>
    <interface type='bridge'>
      <mac address='00:16:3e:7c:60:aa'/>
      <source bridge='xenbr0'/>
      <script path='/etc/xen/scripts/vif-bridge'/>
      <target dev='vif41.0'/>
    </interface>
    <console type='pty' tty='/dev/pts/2'>
      <source path='/dev/pts/2'/>
      <target port='0'/>
    </console>
  </devices>
</domain>

[[ XML END ]]
Calling virDomainCreateLinux()..


On domU (node1):

fence_xvm -H oelcl21 -ddd

clustat on node1:

[root at oelcl11 ~]# clustat 
Cluster Status for cluster2 @ Wed Sep 22 11:04:49 2010
Member Status: Quorate

 Member Name                                        ID   Status
 ------ ----                                        ---- ------
 oelcl11                                                1 Online, Local, rgmanager
 oelcl21                                                2 Online, rgmanager

 Service Name                              Owner (Last)                              State         
 ------- ----                              ----- ------                              -----         
 service:web                               oelcl11                                   started       
[root at oelcl11 ~]#


But node2 waits for 300s and can't connect:

   Starting daemons... done
   Starting fencing... Sep 22 10:41:06 oelcl21 kernel: eth0: no IPv6 routers present
done
[  OK  ]

[root at oelcl21 ~]# clustat 
Cluster Status for cluster2 @ Wed Sep 22 11:04:19 2010
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 oelcl11                                     1 Online
 oelcl21                                     2 Online, Local

[root at oelcl21 ~]# 


br
jost


________________________________________
From: linux-cluster-bounces at redhat.com [linux-cluster-bounces at redhat.com] On Behalf Of Rakovec Jost [Jost.Rakovec at snt.si]
Sent: Monday, September 13, 2010 9:31 AM
To: linux clustering
Subject: Re: [Linux-cluster] fence in xen

Hi


Q: Must fence_xvmd also run in domU?
I ask because I noticed that if I run this on a host while fence_xvmd is running:

[root at oelcl1 ~]# fence_xvm -H oelcl2 -ddd -o null
Debugging threshold is now 3
-- args @ 0x7fffe3f71fb0 --
  args->addr = 225.0.0.12
  args->domain = oelcl2
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 0
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 0
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 0
  args->debug = 3
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0x7fffe3f70f60 (4096 max size)
Actual key length = 4096 bytesSending to 225.0.0.12 via 127.0.0.1
Sending to 225.0.0.12 via 10.9.131.80
Sending to 225.0.0.12 via 10.9.131.83
Sending to 225.0.0.12 via 192.168.122.1
Waiting for connection from XVM host daemon.
Issuing TCP challenge
Responding to TCP challenge
TCP Exchange + Authentication done...
Waiting for return value from XVM host
Remote: Operation was successful


But if I try to fence (reboot), I get:

[root at oelcl1 ~]# fence_xvm -H oelc2
Remote: Operation was successful
[root at oelcl1 ~]#

But host2 is not rebooted.


If fence_xvmd is not running on the hosts, I get a timeout.


[root at oelcl1 sysconfig]# fence_xvm -H oelcl2 -ddd -o null
Debugging threshold is now 3
-- args @ 0x7fff1a6b5580 --
  args->addr = 225.0.0.12
  args->domain = oelcl2
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 0
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 0
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 0
  args->debug = 3
-- end args --
Reading in key file /etc/cluster/fence_xvm.key into 0x7fff1a6b4530 (4096 max size)
Actual key length = 4096 bytesSending to 225.0.0.12 via 127.0.0.1
Sending to 225.0.0.12 via 10.9.131.80
Waiting for connection from XVM host daemon.
Sending to 225.0.0.12 via 127.0.0.1
Sending to 225.0.0.12 via 10.9.131.80
Waiting for connection from XVM host daemon.


Q: How can I test whether multicast is working?

Q: On which network interface must fence_xvmd run on dom0? I notice that the domU hosts have:

virbr0    Link encap:Ethernet  HWaddr 00:00:00:00:00:00
          inet addr:192.168.122.1  Bcast:192.168.122.255  Mask:255.255.255.0
          inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:40 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:0 (0.0 b)  TX bytes:7212 (7.0 KiB)


so virbr0 exists there too,

and on dom0:

[root at vm5 ~]# fence_xvmd -fdd -I xenbr0
-- args @ 0xbfd26234 --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 7
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 1
  args->debug = 2
-- end args --
Opened ckpt vm_states
My Node ID = 1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
oelcl1                   2a53022c-5836-68f0-4514-02a5a0b07e81 00001 00002
oelcl2                   dd268dd4-f012-e0f7-7c77-aa8a58e1e6ab 00001 00002
oelcman                  09c783bd-9107-0916-ebbf-bd27bcc8babe 00001 00002
Storing oelcl1
Storing oelcl2


[root at vm5 ~]# fence_xvmd -fdd -I virbr0
-- args @ 0xbfd26234 --
  args->addr = 225.0.0.12
  args->domain = (null)
  args->key_file = /etc/cluster/fence_xvm.key
  args->op = 2
  args->hash = 2
  args->auth = 2
  args->port = 1229
  args->ifindex = 7
  args->family = 2
  args->timeout = 30
  args->retr_time = 20
  args->flags = 1
  args->debug = 2
-- end args --
Opened ckpt vm_states
My Node ID = 1
Domain                   UUID                                 Owner State
------                   ----                                 ----- -----
Domain-0                 00000000-0000-0000-0000-000000000000 00001 00001
oelcl1                   2a53022c-5836-68f0-4514-02a5a0b07e81 00001 00002
oelcl2                   dd268dd4-f012-e0f7-7c77-aa8a58e1e6ab 00001 00002
oelcman                  09c783bd-9107-0916-ebbf-bd27bcc8babe 00001 00002
Storing oelcl1
Storing oelcl2


No matter which interface I use, the fence is not done.


thx

br jost


_____________________________________
From: linux-cluster-bounces at redhat.com [linux-cluster-bounces at redhat.com] On Behalf Of Rakovec Jost [Jost.Rakovec at snt.si]
Sent: Saturday, September 11, 2010 6:36 PM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] fence in xen

Hi list!


I have a question about fence_xvm.

The situation is:

one physical server with Xen --> dom0 with 2 domUs. The cluster works fine between the domUs (reboot, relocate).

I'm using Red Hat 5.5.

The problem is with fencing from dom0 with "fence_xvm -H oelcl2": the domU is destroyed, but when it is booted back it can't join the cluster. The domU takes a very long time to boot --> FENCED_START_TIMEOUT=300


On the console I get this after node2 is up:

node2:

INFO: task clurgmgrd:2127 blocked for more than 120 seconds.
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
clurgmgrd     D 0000000000000010     0  2127   2126                     (NOTLB)
 ffff88006f08dda8  0000000000000286  ffff88007cc0b810  0000000000000000
 0000000000000003  ffff880072009860  ffff880072f6b0c0  00000000000455ec
 ffff880072009a48  ffffffff802649d7
Call Trace:
 [<ffffffff802649d7>] _read_lock_irq+0x9/0x19
 [<ffffffff8021420e>] filemap_nopage+0x193/0x360
 [<ffffffff80263a7e>] __mutex_lock_slowpath+0x60/0x9b
 [<ffffffff80263ac8>] .text.lock.mutex+0xf/0x14
 [<ffffffff88424b64>] :dlm:dlm_new_lockspace+0x2c/0x860
 [<ffffffff80222b08>] __up_read+0x19/0x7f
 [<ffffffff802d0abb>] __kmalloc+0x8f/0x9f
 [<ffffffff8842b6fa>] :dlm:device_write+0x438/0x5e5
 [<ffffffff80217377>] vfs_write+0xce/0x174
 [<ffffffff80217bc4>] sys_write+0x45/0x6e
 [<ffffffff802602f9>] tracesys+0xab/0xb6


During boot on node2:

Starting clvmd: dlm: Using TCP for communications
clvmd startup timed out
[FAILED]


node2:

[root at oelcl2 init.d]# clustat
Cluster Status for cluster1 @ Sat Sep 11 18:11:21 2010
Member Status: Quorate

 Member Name                                                ID   Status
 ------ ----                                                ---- ------
 oelcl1                                                  1 Online
 oelcl2                                                 2 Online, Local

[root at oelcl2 init.d]#


on first node:

[root at oelcl1 ~]# clustat
Cluster Status for cluster1 @ Sat Sep 11 18:12:07 2010
Member Status: Quorate

 Member Name                                                ID   Status
 ------ ----                                                ---- ------
 oelcl1                                                  1 Online, Local, rgmanager
 oelcl2                                                  2 Online, rgmanager

 Service Name                                      Owner (Last)                                      State
 ------- ----                                      ----- ------                                      -----
 service:webby                                     oelcl1                                     started
[root at oelcl1 ~]#


and then I have to destroy both domUs and create them again to get node2 working again.

I followed the how-tos at https://access.redhat.com/kb/docs/DOC-5937 and http://sources.redhat.com/cluster/wiki/VMClusterCookbook


cluster config on dom0


<?xml version="1.0"?>
<cluster alias="vmcluster" config_version="1" name="vmcluster">
        <clusternodes>
                <clusternode name="vm5" nodeid="1" votes="1"/>
        </clusternodes>
        <cman/>
        <fencedevices/>
        <rm/>
        <fence_xvmd/>
</cluster>


cluster config on domU


<?xml version="1.0"?>
<cluster alias="cluster1" config_version="49" name="cluster1">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="4"/>
        <clusternodes>
                <clusternode name="oelcl1.name.comi" nodeid="1" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="oelcl1" name="xenfence1"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="oelcl2.name.com" nodeid="2" votes="1">
                        <fence>
                                <method name="1">
                                        <device domain="oelcl2" name="xenfence1"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <cman expected_votes="1" two_node="1"/>
        <fencedevices>
                <fencedevice agent="fence_xvm" name="xenfence1"/>
        </fencedevices>
        <rm>
                <failoverdomains>
                        <failoverdomain name="prefer_node1" nofailback="0" ordered="1" restricted="1">
                                <failoverdomainnode name="oelcl1.name.com" priority="1"/>
                                <failoverdomainnode name="oelcl2.name.com" priority="2"/>
                        </failoverdomain>
                </failoverdomains>
                <resources>
                        <ip address="xx.xx.xx.xx" monitor_link="1"/>
                        <fs device="/dev/xvdb1" force_fsck="0" force_unmount="0" fsid="8669" fstype="ext3" mountpoint="/var/www/html" name="docroot" self_fence="0"/>
                        <script file="/etc/init.d/httpd" name="apache_s"/>
                </resources>
                <service autostart="1" domain="prefer_node1" exclusive="0" name="webby" recovery="relocate">
                        <ip ref="xx.xx.xx.xx"/>
                        <fs ref="docroot"/>
                        <script ref="apache_s"/>
                </service>
        </rm>
</cluster>


Fence processes on dom0:

[root at vm5 cluster]# ps -ef |grep fenc
root     18690     1  0 17:40 ?        00:00:00 /sbin/fenced
root     18720     1  0 17:40 ?        00:00:00 /sbin/fence_xvmd -I xenbr0
root     22633 14524  0 18:21 pts/3    00:00:00 grep fenc
[root at vm5 cluster]#


and on domU

[root at oelcl1 ~]# ps -ef|grep fen
root      1523     1  0 17:41 ?        00:00:00 /sbin/fenced
root     13695  2902  0 18:22 pts/0    00:00:00 grep fen
[root at oelcl1 ~]#


Does anybody have any idea why fencing doesn't work?

thx

br

jost


--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster


From expertalert at gmail.com  Wed Sep 22 20:49:31 2010
From: expertalert at gmail.com (fosiul alam)
Date: Wed, 22 Sep 2010 21:49:31 +0100
Subject: [Linux-cluster] How to add mysql script into Add resource(luci
	interface)
Message-ID: <AANLkTin+w66Gg_Ew0Y3imqPXGmb6yCCHDMhZz938Hsb3@mail.gmail.com>

Hi
I have installed mysql server from source, so I need to start the mysql service
by executing the command below:

/usr/local/mysql/bin/mysqld_safe --user=mysql &

But I don't understand how to add this command in Resources.
Normally I use Add Resources -> script,
but the script resource will append "start" or "stop" to the end of the
command I provide,
which will not work in my case.

So can anyone please tell me how to add the above command to a Red Hat cluster
using the luci interface?
Please let me know if I am not clear.
Thanks for the help

From bturner at redhat.com  Thu Sep 23 19:04:48 2010
From: bturner at redhat.com (Ben Turner)
Date: Thu, 23 Sep 2010 15:04:48 -0400 (EDT)
Subject: [Linux-cluster] How to add mysql script into Add resource(luci
 interface)
In-Reply-To: <AANLkTin+w66Gg_Ew0Y3imqPXGmb6yCCHDMhZz938Hsb3@mail.gmail.com>
Message-ID: <905170572.166021285268688016.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>

In cases like this I write a wrapper script to use as the cluster script resource.  This will have to be a Sys V style init script that calls mysql using the command you specified.  Hope this helps.
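
A minimal sketch of such a wrapper follows, not a tested production script. The mysqld_safe path and --user=mysql come from the original question. The script name, the pid-file location and the wait limits are assumptions made for illustration. It stops the server by sending SIGTERM to the pid recorded in the pid file, which avoids needing database credentials in the script.

```shell
#!/bin/sh
# Hypothetical wrapper, e.g. /etc/init.d/mysql-src (the name is an assumption).
# The cluster script resource calls it with start, stop or status.

MYSQLD_SAFE=${MYSQLD_SAFE:-/usr/local/mysql/bin/mysqld_safe}  # path from the question
PIDFILE=${PIDFILE:-/var/run/mysqld-src.pid}                   # assumed location

mysql_running() {
    # True if the pid file names a live process.
    [ -f "$PIDFILE" ] && kill -0 "$(cat "$PIDFILE")" 2>/dev/null
}

mysql_start() {
    mysql_running && return 0            # starting twice must not be an error
    "$MYSQLD_SAFE" --user=mysql --pid-file="$PIDFILE" >/dev/null 2>&1 &
    i=0
    while [ "$i" -lt 30 ]; do            # assumed limit: wait up to 30s
        mysql_running && return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}

mysql_stop() {
    mysql_running || return 0            # stopping a stopped service succeeds
    kill "$(cat "$PIDFILE")"             # SIGTERM asks mysqld to shut down
    i=0
    while [ "$i" -lt 60 ]; do            # assumed limit: wait up to 60s
        mysql_running || return 0
        sleep 1
        i=$((i + 1))
    done
    return 1
}

mysql_wrapper() {
    case "$1" in
        start)  mysql_start ;;
        stop)   mysql_stop ;;
        status)
            if mysql_running; then
                echo "mysqld is running"
            else
                echo "mysqld is stopped"
                return 3                 # LSB code for "not running"
            fi ;;
        *)
            echo "Usage: $0 {start|stop|status}"
            return 2 ;;
    esac
}

# With no argument the file only defines the functions, so it can be
# sourced and tried out by hand; with an argument it acts as the init script.
[ $# -eq 0 ] || { mysql_wrapper "$1"; exit $?; }
```

The points that matter for a cluster script resource are that start and stop succeed when the service is already in the requested state, and that status returns non-zero when mysqld is down, so a failure can be noticed.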

-Ben


----- "fosiul alam" <expertalert at gmail.com> wrote:

> Hi
> I have installed mysql server from source, So I need to start mysql
> service by executing bellow command:
> 
> /usr/local/mysql/bin/mysqld_safe --user=mysql &
> 
> But I dont understand, How will i add this command in Resources.
> Normally I add , Add Resources ->script
> but script will try to execute "start" "stop" end of the command i
> will provide.
> which will not work in my case.
> 
> So can any one please tel me, how to add above command in redhat
> cluster by using luci interface
> Please let me know if i am not clear.
> Thanks for help
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster


From expertalert at gmail.com  Thu Sep 23 19:15:48 2010
From: expertalert at gmail.com (fosiul alam)
Date: Thu, 23 Sep 2010 20:15:48 +0100
Subject: [Linux-cluster] How to add mysql script into Add resource(luci
	interface)
In-Reply-To: <905170572.166021285268688016.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
References: <AANLkTin+w66Gg_Ew0Y3imqPXGmb6yCCHDMhZz938Hsb3@mail.gmail.com>
	<905170572.166021285268688016.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Message-ID: <AANLkTi==zgF-2GUJ8psiPcBaJEQtXNAtxLewfE8_X+Ut@mail.gmail.com>

Hi, thanks.
I did that this morning: I copied a script from /etc/init.d/ and
modified it to fit my needs,
and it is working fine.

Thanks


On 23 September 2010 20:04, Ben Turner <bturner at redhat.com> wrote:

> In cases like this I write a wrapper script to use as the cluster script
> resource.  This will have to be a Sys V style init script that calls mysql
> using the command you specified.  Hope this helps.
>
> -Ben
>
>
> ----- "fosiul alam" <expertalert at gmail.com> wrote:
>
> > Hi
> > I have installed mysql server from source, So I need to start mysql
> > service by executing bellow command:
> >
> > /usr/local/mysql/bin/mysqld_safe --user=mysql &
> >
> > But I dont understand, How will i add this command in Resources.
> > Normally I add , Add Resources ->script
> > but script will try to execute "start" "stop" end of the command i
> > will provide.
> > which will not work in my case.
> >
> > So can any one please tel me, how to add above command in redhat
> > cluster by using luci interface
> > Please let me know if i am not clear.
> > Thanks for help
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>

From expertalert at gmail.com  Thu Sep 23 19:26:24 2010
From: expertalert at gmail.com (fosiul alam)
Date: Thu, 23 Sep 2010 20:26:24 +0100
Subject: [Linux-cluster] ricci is very unstable in one nodes
Message-ID: <AANLkTi=kFipwqBrYbCgJ+Kj68J_9i1tu8EzN1X7BomTq@mail.gmail.com>

Hi
I have a 4-node cluster.
It was running fine, but today one node is giving trouble.

From the luci GUI interface, when I try to relocate a service onto this node,
or from this node to another node,
luci shows:

Unable to retrieve batch 1908047789 status from beaver.domain.local:11111:
clusvcadm start failed to start httpd1: Starting cluster service "httpd1" on
node "http1.domain.local" -- You will be redirected in 5 seconds.

and also:

The ricci agent for this node is unresponsive. Node-specific information is
not available at this time.

But ricci is running on the problematic node:
ricci     7324  0.0  0.1  58876  2932 ?        S<s  14:40   0:00 ricci -u 101

There is no firewall running:

 iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination

Chain OUTPUT (policy ACCEPT)
target     prot opt source               destination

Chain RH-Firewall-1-INPUT (0 references)
target     prot opt source               destination

Port 11111 is listening:

netstat -an | grep 11111
tcp        0      0 0.0.0.0:11111               0.0.0.0:*               LISTEN


But ricci is still very unstable, and I can't relocate any service onto this
node or away from it.

Running clustat on the problematic node shows:

 clustat
Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 beaver.xxx.local                  1 Online, rgmanager         ::: luci is running from this server
 publicdns1.xxxx.local              2 Online, rgmanager
 http1.xxxx.local                   3 Online, Local, rgmanager
 mail01.xxxxx.local                  4 Online, rgmanager

 Service Name                   Owner (Last)                   State
 ------- ----                   ----- ------                   -----
 service:httpd1                 mail01.xxxx.local     started
 service:mysql-server           http1.xxxx.local      started         ::: this is the problematic node
 service:public-dns             publicdns1.xxxxxx.local started

I can't move the mysql-server service off this node, and I can't relocate any
service onto it.
I am very confused.

What should I do to fix this issue?

Thanks for your advice.


From bturner at redhat.com  Fri Sep 24 14:33:06 2010
From: bturner at redhat.com (Ben Turner)
Date: Fri, 24 Sep 2010 10:33:06 -0400 (EDT)
Subject: [Linux-cluster] ricci is very unstable in one nodes
In-Reply-To: <1401211566.240251285338780283.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Message-ID: <383081641.240271285338786621.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>

There is an issue with ricci timeouts that was fixed recently:

https://bugzilla.redhat.com/show_bug.cgi?id=564490

I'm not sure, but you may be hitting that bug.  Symptoms include: luci isn't able to get the status from the node, timeouts when querying ricci, etc.  The fix should be released with 5.6.

On the mysql service there are some options that you need to set.  Here are all the options available to that agent:

mysql
Defines a MySQL database server

Attribute	Description
config_file	Define configuration file
listen_address	Define an IP address for MySQL server. If the address is not given then first IP address from the service is taken.
mysqld_options	Other command-line options for mysqld
name	Name
ref	Reference to existing mysql resource in the resources section.
service_name	Inherit the service name.
shutdown_wait	Wait X seconds for correct end of service shutdown
startup_wait	Wait X seconds for correct end of service startup
__enforce_timeouts	Consider a timeout for operations as fatal.
__failure_expire_time	Amount of time before a failure is forgotten.
__independent_subtree	Treat this and all children as an independent subtree.
__max_failures	Maximum number of failures before returning a failure to a status check.

If I recall correctly you may need to tweak:

shutdown_wait	Wait X seconds for correct end of service shutdown
startup_wait	Wait X seconds for correct end of service startup
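
As a rough illustration of where those two attributes live, here is a minimal Python sketch that builds a mysql resource element of the kind that would sit in cluster.conf. The attribute names come from the list above; the resource name "mysql-db" and the 30/60 second values are made-up placeholders, not recommendations.

```python
import xml.etree.ElementTree as ET

def mysql_resource(name, startup_wait, shutdown_wait):
    """Build a <mysql/> resource element carrying the two wait attributes."""
    return ET.Element("mysql", {
        "name": name,
        "startup_wait": str(startup_wait),    # seconds to wait for startup
        "shutdown_wait": str(shutdown_wait),  # seconds to wait for shutdown
    })

# Hypothetical values, chosen only for the example.
elem = mysql_resource("mysql-db", 30, 60)
print(ET.tostring(elem, encoding="unicode"))
```

The printed element is only a fragment; in a real cluster.conf it would go inside the service (or the resources section) that owns the database.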

There can be problems relocating the DB if it takes too long to start/shutdown.  If you are having problems relocating with luci it may be a good idea to test with:

# clusvcadm -r <service name> -m <cluster node>
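
If you end up retrying the relocation a few times, a small wrapper can help. This is only a sketch: the helper names are invented, it uses just the -r and -m options shown above, and by default it prints the command instead of running it.

```python
import subprocess

def relocate_cmd(service, node):
    """Return the argv for: clusvcadm -r <service> -m <node>."""
    return ["clusvcadm", "-r", service, "-m", node]

def relocate(service, node, dry_run=True):
    """Print the command (dry run) or run it and return its exit status."""
    cmd = relocate_cmd(service, node)
    if dry_run:
        print(" ".join(cmd))
        return 0
    return subprocess.call(cmd)

# Service and node names taken from the clustat output in this thread.
relocate("mysql-server", "mail01.xxxx.local")
```

Pass dry_run=False on a cluster node to actually invoke clusvcadm; a non-zero return value means the relocation did not succeed.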

-Ben


----- "fosiul alam" <expertalert at gmail.com> wrote:

> Hi
> I have 4 nodes cluster,
> It was running fine. but today one nodes is giving trouble
> 
> From luci Gui interface, when i try to relocate service into this node
> and trying to relocate from this nodes to another nodes
> 
> from luci gui interface, its showing :
> 
> Unable to retrieve batch 1908047789 status from
> beaver.domain.local:11111: clusvcadm start failed to start httpd1:
> Starting cluster service "httpd1" on node "http1.domain.local" -- You
> will be redirected in 5 seconds.
> also
> 
> The ricci agent for this node is unresponsive. Node-specific
> information is not available at this time. :
> 
> but ricci is running on problematic node ,
> ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
> 
> there is not any firewall running.
> 
> iptables -L
> Chain INPUT (policy ACCEPT)
> target prot opt source destination
> 
> Chain FORWARD (policy ACCEPT)
> target prot opt source destination
> 
> Chain OUTPUT (policy ACCEPT)
> target prot opt source destination
> 
> Chain RH-Firewall-1-INPUT (0 references)
> target prot opt source destination
> 
> port 11111 is runningg
> 
> netstat -an | grep 11111
> tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
> 
> 
> but still ricci is very unstable , and i cant relocate any service on
> this node or i cant relocate any service away from this node.
> 
> from problematic node if i type this
> 
> clustat
> Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
> Member Status: Quorate
> 
> Member Name ID Status
> ------ ---- ---- ------
> beaver.xxx.local 1 Online, rgmanager ::: luci is running from this
> server
> publicdns1.xxxx.local 2 Online, rgmanager
> http1.xxxx.local 3 Online, Local, rgmanager
> mail01.xxxxx.local 4 Online, rgmanager
> 
> Service Name Owner (Last) State
> ------- ---- ----- ------ -----
> service:httpd1 mail01.xxxx.local started
> service:mysql-server http1.xxxx.local started ------------------- this
> is the problematic node
> service:public-dns publicdns1.xxxxxx.local started
> 
> I cant move that service mysql-server from this node or cant relocate
> any service on this node ..
> I am very confused.
> 
> what shall i do to fix this issue ??
> 
> thanks for your advise.
> 
> 
> 
> 


From emilio at inet.it  Fri Sep 24 16:38:17 2010
From: emilio at inet.it (emilio brambilla)
Date: Fri, 24 Sep 2010 18:38:17 +0200
Subject: [Linux-cluster] porblem with quorum at cluster boot
Message-ID: <4C9CD3F9.40700@inet.it>

hello,

I have a 2-node cluster with a qdisk quorum partition.

Each node has 1 vote and the qdisk has 1 vote too; in cluster.conf I 
have this explicit declaration:
<cman expected_votes="3" two_node="0"/>

When both nodes are active, cman_tool status tells me this:

Version: 6.1.0
Nodes: 2
Expected votes: 3
Quorum device votes: 1
Total votes: 3
Node votes: 1
Quorum: 2

Then, if I power off a node, these values change as expected:
Nodes: 1
Total votes: 2

and the cluster is still quorate and functional.
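
To make the vote arithmetic concrete, here is a minimal sketch in Python. It assumes cman derives the quorum threshold as expected_votes // 2 + 1, which matches the output above (3 expected votes, quorum 2); the function names are made up for illustration.

```python
def quorum(expected_votes):
    """Quorum threshold, assuming the usual majority rule."""
    return expected_votes // 2 + 1

def is_quorate(node_votes, qdisk_votes, expected_votes):
    """True when the votes currently present reach the threshold."""
    return node_votes + qdisk_votes >= quorum(expected_votes)

print(quorum(3))            # threshold for expected_votes=3
print(is_quorate(2, 1, 3))  # both nodes plus the qdisk: 3 votes
print(is_quorate(1, 1, 3))  # one node plus the qdisk: 2 votes
print(is_quorate(1, 0, 3))  # one node alone, qdisk vote not counted: 1 vote
```

Under that assumption the first two cases are quorate, while a lone node whose qdisk vote is not counted is one vote short.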

The problem is when I power off both nodes and then power on only one 
of them: in this case the single node does not become quorate and the 
cluster does not start. I have to power on both nodes to get the cluster 
(and the services on it) working.

I'd like the cluster to work (and boot) even with a single node (i.e., if 
one of the nodes has a hardware failure and is down, I still want to be 
able to reboot the working node and have it bring up the cluster correctly).

Any hints? (Thanks for reading all this.)

-- 
bye,
emilio


From Jason_Henderson at Mitel.com  Fri Sep 24 16:52:06 2010
From: Jason_Henderson at Mitel.com (Jason_Henderson at Mitel.com)
Date: Fri, 24 Sep 2010 12:52:06 -0400
Subject: [Linux-cluster] porblem with quorum at cluster boot
In-Reply-To: <4C9CD3F9.40700@inet.it>
Message-ID: <OF325BF332.6C7EA96C-ON852577A8.005C8390-852577A8.005CA959@ottlnmta.mitel.com>

I think you still need two_node="1" in your conf file if you want a single 
node to become quorate.

linux-cluster-bounces at redhat.com wrote on 09/24/2010 12:38:17 PM:

> hello,
> 
> I have a 2 node cluster with qdisk quorum partition;
> 
> each node has 1 vote and the qdisk has 1 vote too; in cluster.conf I 
> have this explicit declaration:
> <cman expected_votes="3" two_node="0"/>
> 
> when I have both 2 nodes active cman_tool status tell me this:
> 
> Version: 6.1.0
> Nodes: 2
> Expected votes: 3
> Quorum device votes: 1
> Total votes: 3
> Node votes: 1
> Quorum: 2
> 
> then, if I power off a node these value, as expected, changed this way:
> Nodes: 1
> Total votes: 2
> 
> and the cluster is still quorate and functional.
> 
> the problem is if I power off both the node and them power on only one 
> of them: in this case the single node does not quorate and the cluster 
> does not start: I have to power on both the node to have the cluster 
> (and services on the cluster) working.
> 
> I'd like the cluster can work (and boot) even with a single node (ie, if 
> one of the node has hw failure and is down I still want to be able to 
> reboot the working node and have it booting correctly the cluster)
> 
> any hints? (thank's for reading all this)
> 
> -- 
> bye,
> emilio
> 

From rhurst at bidmc.harvard.edu  Fri Sep 24 17:32:52 2010
From: rhurst at bidmc.harvard.edu (rhurst at bidmc.harvard.edu)
Date: Fri, 24 Sep 2010 13:32:52 -0400
Subject: [Linux-cluster] cluster of RHEL 5.4 AP in IBM Blade Center S
 with	blade server HS22
In-Reply-To: <006901cb59bb$7994ea70$0c01a8c0@shilpi>
References: <006901cb59bb$7994ea70$0c01a8c0@shilpi>
Message-ID: <50168EC934B8D64AA8D8DD37F840F3DE0564062AC2@EVS2CCR.its.caregroup.org>

The HS22 blade type is irrelevant; there is nothing special about these blades to set up.

Make certain your (Cisco) switch has IGMP Snooping enabled -- you will not be able to form a cluster without it on.  Conversely, if you are using a Server Connectivity Module (SCM) only, disable IGMP Snooping on that device.

If you are using GFS / GFS2 clustered filesystems, your fence device is the BladeCenter's AMM.  Make certain you set up an account in your AMM that is allowed to power up / power off each blade in its chassis.

We have two AMMs for redundancy, Adama and Cain, and created a local account "cluster", i.e.,

cluster.conf snippet:

  <clusternode name="zodiac" nodeid="8" votes="1">
   <fence>
    <method name="1">
     <device blade="8" name="Adama"/>
    </method>
    <method name="2">
     <device blade="8" name="Cain"/>
    </method>
   </fence>
  </clusternode>
 <cman>
  <multicast addr="239.63.63.63"/>
 </cman>
 <fencedevices>
  <fencedevice agent="fence_bladecenter" ipaddr="adama" login="cluster" name="Adama" passwd="********"/>
  <fencedevice agent="fence_bladecenter" ipaddr="cain" login="cluster" name="Cain" passwd="********"/>
 </fencedevices>
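
A quick way to sanity-check a snippet like this is to confirm that every fence device a node refers to is actually defined. The following Python sketch does that with the standard library; the enclosing cluster and clusternodes elements and the placeholder password "x" are assumptions added so the fragment parses on its own.

```python
import xml.etree.ElementTree as ET

# The snippet above, wrapped in assumed <cluster>/<clusternodes> elements.
SNIPPET = """
<cluster>
 <clusternodes>
  <clusternode name="zodiac" nodeid="8" votes="1">
   <fence>
    <method name="1"><device blade="8" name="Adama"/></method>
    <method name="2"><device blade="8" name="Cain"/></method>
   </fence>
  </clusternode>
 </clusternodes>
 <fencedevices>
  <fencedevice agent="fence_bladecenter" ipaddr="adama" login="cluster" name="Adama" passwd="x"/>
  <fencedevice agent="fence_bladecenter" ipaddr="cain" login="cluster" name="Cain" passwd="x"/>
 </fencedevices>
</cluster>
"""

def undefined_fence_devices(xml_text):
    """Return names used in <device> that have no matching <fencedevice>."""
    root = ET.fromstring(xml_text.strip())
    defined = {fd.get("name") for fd in root.iter("fencedevice")}
    used = {dev.get("name") for dev in root.iter("device")}
    return sorted(used - defined)

print(undefined_fence_devices(SNIPPET))
```

An empty list means every reference resolves. Parsing also fails loudly (ParseError) on malformed XML such as an unterminated attribute value, which is easy to miss by eye.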

________________________________
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Rohit
Sent: Tuesday, September 21, 2010 2:33 PM
To: linux-cluster at redhat.com
Subject: [Linux-cluster] cluster of RHEL 5.4 AP in IBM Blade Center S with blade server HS22

Dear Sir,

Please help me create a cluster of RHEL 5.4 AP on an IBM BladeCenter S with HS22 blade servers.

I have gone through the Red Hat site for cluster installation but did not succeed, so please give me step-by-step installation instructions for a cluster on the hardware above.


Regards,

Rohit

From utpalchemiitkgp at gmail.com  Sat Sep 25 09:25:39 2010
From: utpalchemiitkgp at gmail.com (Utpal Sarkar)
Date: Sat, 25 Sep 2010 11:25:39 +0200
Subject: [Linux-cluster] request help for ubuntu 10 or fedora 7 clustering
	guide
Message-ID: <AANLkTimhVpviG0-86aK3_=OHjZxhAVwa4rNRpTY5ebGX@mail.gmail.com>

Dear Linux users,

I want to build a four-node Linux cluster (either Ubuntu 10 or Fedora 7)
for scientific computation.
My questions are:
1) I have installed Ubuntu 10 but was not able to install Fortran 90 (f90)
there. Can you please help me install it along with the BLAS and LAPACK
libraries?
2) When I installed Fedora 7 I saw that the C compiler and F95 compiler are
installed along with the BLAS library, but the LAPACK library is not, and I
was not able to install it in Fedora either. How can I do it?
3) Once the C compiler, F90 or F95, and math libraries like LAPACK and BLAS
are installed in either Ubuntu or Fedora, I would like a step-by-step guide
to clustering that system.

I am a novice in this field, so I am waiting for your kind help.
Utpal Sarkar
Physics Department
Assam University
Silchar
INDIA

From expertalert at gmail.com  Sat Sep 25 10:00:05 2010
From: expertalert at gmail.com (fosiul alam)
Date: Sat, 25 Sep 2010 11:00:05 +0100
Subject: [Linux-cluster] ricci is very unstable in one nodes
In-Reply-To: <383081641.240271285338786621.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
References: <1401211566.240251285338780283.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
	<383081641.240271285338786621.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Message-ID: <AANLkTinkgwqZy8LVsRE9HoCjtO=2n-A5CRS=XYkS0Zvz@mail.gmail.com>

Hi Ben,
Thanks.

I named it mysql-server, but I have not installed a MySQL database on it
yet.

Both luci and ricci, on the luci server and on node1, are running these
versions:

luci-0.12.2-12.el5.centos.1
ricci-0.12.2-12.el5.centos.1


Do you think this version has the problem as well?

Thanks for your help.


On 24 September 2010 15:33, Ben Turner <bturner at redhat.com> wrote:

> There is an issue with ricci timeouts that was fixed recently:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=564490
>
> I'm not sure but you may be hitting that bug.  Symptoms include: luci isn't
> able to get the status from the node, timeouts when querying ricci, etc.
>  The fix should be released with 5.6
>
> On the mysql service there are some options that you need to set.  Here are
> all the options available to that agent:
>
> mysql
> Defines a MySQL database server
>
> Attribute       Description
> config_file     Define configuration file
> listen_address  Define an IP address for MySQL server. If the address is
> not given then first IP address from the service is taken.
> mysqld_options  Other command-line options for mysqld
> name    Name
> ref     Reference to existing mysql resource in the resources section.
> service_name    Inherit the service name.
> shutdown_wait   Wait X seconds for correct end of service shutdown
> startup_wait    Wait X seconds for correct end of service startup
> __enforce_timeouts      Consider a timeout for operations as fatal.
> __failure_expire_time   Amount of time before a failure is forgotten.
> __independent_subtree   Treat this and all children as an independent
> subtree.
> __max_failures  Maximum number of failures before returning a failure to a
> status check.
>
> If I recall correctly you may need to tweak:
>
> shutdown_wait   Wait X seconds for correct end of service shutdown
> startup_wait    Wait X seconds for correct end of service startup
>
> There can be problems relocating the DB if it takes too long to
> start/shutdown.  If you are having problems relocating with luci it may be a
> good idea to test with:
>
> # clusvcadm -r <service name> -m <cluster node>
>
> -Ben
>
>
>
> ----- "fosiul alam" <expertalert at gmail.com> wrote:
>
> > Hi
> > I have 4 nodes cluster,
> > It was running fine. but today one nodes is giving trouble
> >
> > From luci Gui interface, when i try to relocate service into this node
> > and trying to relocate from this nodes to another nodes
> >
> > from luci gui interface, its showing :
> >
> > Unable to retrieve batch 1908047789 status from
> > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
> > Starting cluster service "httpd1" on node "http1.domain.local" -- You
> > will be redirected in 5 seconds.
> > also
> >
> > The ricci agent for this node is unresponsive. Node-specific
> > information is not available at this time. :
> >
> > but ricci is running on problematic node ,
> > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
> >
> > there is not any firewall running.
> >
> > iptables -L
> > Chain INPUT (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain FORWARD (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain OUTPUT (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain RH-Firewall-1-INPUT (0 references)
> > target prot opt source destination
> >
> > port 11111 is runningg
> >
> > netstat -an | grep 11111
> > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
> >
> >
> > but still ricci is very unstable , and i cant relocate any service on
> > this node or i cant relocate any service away from this node.
> >
> > from problematic node if i type this
> >
> > clustat
> > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
> > Member Status: Quorate
> >
> > Member Name ID Status
> > ------ ---- ---- ------
> > beaver.xxx.local 1 Online, rgmanager ::: luci is running from this
> > server
> > publicdns1.xxxx.local 2 Online, rgmanager
> > http1.xxxx.local 3 Online, Local, rgmanager
> > mail01.xxxxx.local 4 Online, rgmanager
> >
> > Service Name Owner (Last) State
> > ------- ---- ----- ------ -----
> > service:httpd1 mail01.xxxx.local started
> > service:mysql-server http1.xxxx.local started ------------------- this
> > is the problematic node
> > service:public-dns publicdns1.xxxxxx.local started
> >
> > I cant move that service mysql-server from this node or cant relocate
> > any service on this node ..
> > I am very confused.
> >
> > what shall i do to fix this issue ??
> >
> > thanks for your advise.
> >
> >
> >
> >

From brem.belguebli at gmail.com  Sat Sep 25 15:19:43 2010
From: brem.belguebli at gmail.com (Brem Belguebli)
Date: Sat, 25 Sep 2010 17:19:43 +0200
Subject: [Linux-cluster] porblem with quorum at cluster boot
In-Reply-To: <OF325BF332.6C7EA96C-ON852577A8.005C8390-852577A8.005CA959@ottlnmta.mitel.com>
References: <OF325BF332.6C7EA96C-ON852577A8.005C8390-852577A8.005CA959@ottlnmta.mitel.com>
Message-ID: <1285427983.23766.0.camel@newgen.localdomain>

On Fri, 2010-09-24 at 12:52 -0400, Jason_Henderson at Mitel.com wrote:
> 
> I think you still need two_node="1" in your conf file if you want a
> single node to become quorate. 
> 
two_node="1" is only valid if you do not have a quorum disk.

> linux-cluster-bounces at redhat.com wrote on 09/24/2010 12:38:17 PM:
> 
> > hello,
> > 
> > I have a 2 node cluster with qdisk quorum partition;
> > 
> > each node has 1 vote and the qdisk has 1 vote too; in cluster.conf I
> > have this explicit declaration:
> > <cman expected_votes="3" two_node="0"/>
> > 
> > when I have both 2 nodes active cman_tool status tell me this:
> > 
> > Version: 6.1.0
> > Nodes: 2
> > Expected votes: 3
> > Quorum device votes: 1
> > Total votes: 3
> > Node votes: 1
> > Quorum: 2
> > 
> > then, if I power off a node these value, as expected, changed this way:
> > Nodes: 1
> > Total votes: 2
> > 
> > and the cluster is still quorate and functional.
> > 
> > the problem is if I power off both the node and them power on only one
> > of them: in this case the single node does not quorate and the cluster
> > does not start: I have to power on both the node to have the cluster
> > (and services on the cluster) working.
> > 
> > I'd like the cluster can work (and boot) even with a single node (ie, if
> > one of the node has hw failure and is down I still want to be able to
> > reboot the working node and have it booting correctly the cluster)
> > 
> > any hints? (thank's for reading all this)
> > 
> > -- 
> > bye,
> > emilio
> > 


From jacob.ishak at gmail.com  Sat Sep 25 20:32:31 2010
From: jacob.ishak at gmail.com (jacob ishak)
Date: Sat, 25 Sep 2010 23:32:31 +0300
Subject: [Linux-cluster] porblem with quorum at cluster boot
In-Reply-To: <4C9CD3F9.40700@inet.it>
References: <4C9CD3F9.40700@inet.it>
Message-ID: <AANLkTi=aNpF=F9VANg1RekmyoTijepBBd00BcL3EOBvu@mail.gmail.com>

Hi,
In a 2-node cluster you don't need a quorum disk. There is a two-node
option that can be added to cluster.conf, and the cluster will then be
functional and quorate with only 2 nodes; consider it if you have just 2
nodes.

Regarding your problem: does it occur on one specific node or on both? If
on a specific node, check its connectivity to the shared disk. Also, how are
you shutting down the cluster (hard shutdown?), and are the nodes shut down
simultaneously or one after the other? It all matters. Also check the
fencing devices: is fencing occurring correctly (did you try unplugging the
network cables from one node and observing the behavior)? There are many
parameters to check; my recommendation is to troubleshoot step by step,
starting with the basics (hardware) and working your way up.

On Fri, Sep 24, 2010 at 7:38 PM, emilio brambilla <emilio at inet.it> wrote:

> hello,
>
> I have a 2 node cluster with qdisk quorum partition;
>
> each node has 1 vote and the qdisk has 1 vote too; in cluster.conf I have
> this explicit declaration:
> <cman expected_votes="3" two_node="0"/>
>
> when I have both 2 nodes active cman_tool status tell me this:
>
> Version: 6.1.0
> Nodes: 2
> Expected votes: 3
> Quorum device votes: 1
> Total votes: 3
> Node votes: 1
> Quorum: 2
>
> then, if I power off a node these value, as expected, changed this way:
> Nodes: 1
> Total votes: 2
>
> and the cluster is still quorate and functional.
>
> the problem is if I power off both the node and them power on only one of
> them: in this case the single node does not quorate and the cluster does not
> start: I have to power on both the node to have the cluster (and services on
> the cluster) working.
>
> I'd like the cluster can work (and boot) even with a single node (ie, if
> one of the node has hw failure and is down I still want to be able to reboot
> the working node and have it booting correctly the cluster)
>
> any hints? (thank's for reading all this)
>
> --
> bye,
> emilio
>

From rohit2525 at gmail.com  Mon Sep 27 05:18:32 2010
From: rohit2525 at gmail.com (Rohit tripathi)
Date: Mon, 27 Sep 2010 10:48:32 +0530
Subject: [Linux-cluster] cluster of RHEL 5.4 AP in IBM Blade Center S
 with blade server HS22
In-Reply-To: <50168EC934B8D64AA8D8DD37F840F3DE0564062AC2@EVS2CCR.its.caregroup.org>
References: <006901cb59bb$7994ea70$0c01a8c0@shilpi>
	<50168EC934B8D64AA8D8DD37F840F3DE0564062AC2@EVS2CCR.its.caregroup.org>
Message-ID: <AANLkTik0G+FXZm5s2QwWqTzv54PaePVLHJfAxV_Tmk1R@mail.gmail.com>

Thanks for the help.

I need help with one more thing: configuring IBM DB2 and LDAP in cluster
mode. Please help me in this regard.


Rohit

On Fri, Sep 24, 2010 at 11:02 PM, <rhurst at bidmc.harvard.edu> wrote:

>  HS22 type blades are irrelevant; there is nothing special about them to
> setup.
>
> Make certain your (Cisco) switch has IGMP Snooping enabled -- you will not
> be able to form a cluster without it on.  Conversely, if you are using a
> Server Connectivity Module (SCM) only, disable IGMP Snooping on that device.
>
> If you are using GFS / GFS2 clustered filesystems, your fence device is the
> BladeCenter's AMM.  Make certain you setup an account in your AMM that is
> allowed to power up / power off each blade in its chassis.
>
> We have two AMMs for redundancy, Adama and Cain, and created a local
> account "cluster", i.e.,
>
> cluster.conf snippet:
>
>   <clusternode name="zodiac" nodeid="8" votes="1">
>    <fence>
>     <method name="1">
>      <device blade="8" name="Adama"/>
>     </method>
>     <method name="2">
>      <device blade="8" name="Cain"/>
>     </method>
>    </fence>
>   </clusternode>
>  <cman>
>   <multicast addr="239.63.63.63"/>
>  </cman>
>  <fencedevices>
>   <fencedevice agent="fence_bladecenter" ipaddr="adama" login="cluster"
> name="Adama" passwd="********"/>
>   <fencedevice agent="fence_bladecenter" ipaddr="cain" login="cluster"
> name="Cain" passwd="********"/>
>  </fencedevices>
>
>  ------------------------------
> *From:* linux-cluster-bounces at redhat.com [mailto:
> linux-cluster-bounces at redhat.com] *On Behalf Of *Rohit
> *Sent:* Tuesday, September 21, 2010 2:33 PM
> *To:* linux-cluster at redhat.com
> *Subject:* [Linux-cluster] cluster of RHEL 5.4 AP in IBM Blade Center S
> with blade server HS22
>
>   Dear Sir
>
> pls help me to create cluster  of RHEL 5.4 AP in IBM Blade Center S with
> blade server HS22
>
> pls help me i had go through redhat site for cluster installation but i m
> not susceed so pls tell me and give me step by step installation of cluster
> above said hardware.
>
>
> regard
>
> Rohit
>

From rohit2525 at gmail.com  Mon Sep 27 13:12:25 2010
From: rohit2525 at gmail.com (Rohit tripathi)
Date: Mon, 27 Sep 2010 18:42:25 +0530
Subject: [Linux-cluster] redhat HA cluster we need shared storage
Message-ID: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>

Dear Sir,

I need to know whether shared storage is compulsory for a Red Hat HA
cluster, or whether we can configure an HA cluster without shared storage.


regards

Rohit

From linux at alteeve.com  Mon Sep 27 14:00:37 2010
From: linux at alteeve.com (Digimer)
Date: Mon, 27 Sep 2010 10:00:37 -0400
Subject: [Linux-cluster] redhat HA cluster we need shared storage
In-Reply-To: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>
References: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>
Message-ID: <4CA0A385.5010701@alteeve.com>

On 10-09-27 09:12 AM, Rohit tripathi wrote:
> Dear sir
>  
> I need to know for redhat HA cluster we need shared storage it is
> compulsory or not or we can configure HA cluster without shared storage.
>  
>  
> regards
>  
> Rohit

Short answer:

Yes, a cluster is required.

Longer answer:

Just to clarify, there are two types of cluster; HA (Heartbeat) and RHCS
(Red Hat Cluster Services). Assuming by "shared storage" you are talking
about GFS2 partitions, then yes, you must have a cluster.

The reason is that GFS2 (and possibly OCFS2, though I am not certain)
requires the DLM (Distributed Lock Manager) provided by the cluster to
ensure safe access to the shared storage. Further, the cluster provides
critical fencing so that nodes that fail can be removed from the cluster
and safe access to the shared storage can be regained.

Cheers,

-- 
Digimer
E-Mail:         linux at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org


From gordan at bobich.net  Mon Sep 27 13:44:49 2010
From: gordan at bobich.net (Gordan Bobic)
Date: Mon, 27 Sep 2010 14:44:49 +0100
Subject: [Linux-cluster] redhat HA cluster we need shared storage
In-Reply-To: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>
References: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>
Message-ID: <4CA09FD1.8070409@bobich.net>

If you'll excuse my stating the obvious, you only require shared storage 
if you intend to use it. If you only require resource fail-over, then 
you don't. What is your intended use case?

In any case, you will need operational fencing.

For shared block level storage without a SAN, you can try DRBD.

Gordan

Rohit tripathi wrote:
> Dear sir
>  
> I need to know for redhat HA cluster we need shared storage it is 
> compulsory or not or we can configure HA cluster without shared storage.
>  
>  
> regards
>  
> Rohit
>  
> 
> 


From omerfsen at gmail.com  Mon Sep 27 14:09:43 2010
From: omerfsen at gmail.com (Omer Faruk SEN)
Date: Mon, 27 Sep 2010 17:09:43 +0300
Subject: [Linux-cluster] redhat HA cluster we need shared storage
In-Reply-To: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>
References: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>
Message-ID: <AANLkTimDF1Uef1Bbub525iY=-z-CmX+PCtYh1C26jqgw@mail.gmail.com>

A shared resource is not a necessity. You may use separate storage on each
node, but ensure the contents of the files are the same (maybe with rsync).

On Mon, Sep 27, 2010 at 4:12 PM, Rohit tripathi <rohit2525 at gmail.com> wrote:
> Dear sir
>
> I need to know for redhat HA cluster we need shared storage it is compulsory
> or not or we can configure HA cluster without shared storage.
>
>
> regards
>
> Rohit
>


From Bennie_R_Thomas at raytheon.com  Mon Sep 27 14:49:28 2010
From: Bennie_R_Thomas at raytheon.com (Bennie R Thomas)
Date: Mon, 27 Sep 2010 09:49:28 -0500
Subject: [Linux-cluster] porblem with quorum at cluster boot
In-Reply-To: <1285427983.23766.0.camel@newgen.localdomain>
References: <OF325BF332.6C7EA96C-ON852577A8.005C8390-852577A8.005CA959@ottlnmta.mitel.com>
	<1285427983.23766.0.camel@newgen.localdomain>
Message-ID: <OF53C30CF7.D0AD6B90-ON862577AB.00514899-862577AB.005173D0@mck.us.ray.com>

Try setting your expected votes to 2 or 1.

Your cluster is hanging with one node because it wants 3 votes.


From:
Brem Belguebli <brem.belguebli at gmail.com>
To:
linux clustering <linux-cluster at redhat.com>
Date:
09/25/2010 10:30 AM
Subject:
Re: [Linux-cluster] porblem with quorum at cluster boot
Sent by:
linux-cluster-bounces at redhat.com


On Fri, 2010-09-24 at 12:52 -0400, Jason_Henderson at Mitel.com wrote:
> 
> I think you still need two_node="1" in your conf file if you want a
> single node to become quorate. 
> 
two_node="1" is only valid if you do not have a quorum disk.

> linux-cluster-bounces at redhat.com wrote on 09/24/2010 12:38:17 PM:
> 
> > hello,
> > 
> > I have a 2 node cluster with qdisk quorum partition;
> > 
> > each node has 1 vote and the qdisk has 1 vote too; in cluster.conf I
> > have this explicit declaration:
> > <cman expected_votes="3" two_node="0"/>
> > 
> > when I have both 2 nodes active cman_tool status tell me this:
> > 
> > Version: 6.1.0
> > Nodes: 2
> > Expected votes: 3
> > Quorum device votes: 1
> > Total votes: 3
> > Node votes: 1
> > Quorum: 2
> > 
> > then, if I power off a node these value, as expected, changed this way:
> > Nodes: 1
> > Total votes: 2
> > 
> > and the cluster is still quorate and functional.
> > 
> > the problem is if I power off both the node and them power on only one
> > of them: in this case the single node does not quorate and the cluster
> > does not start: I have to power on both the node to have the cluster
> > (and services on the cluster) working.
> > 
> > I'd like the cluster can work (and boot) even with a single node (ie, if
> > one of the node has hw failure and is down I still want to be able to
> > reboot the working node and have it booting correctly the cluster)
> > 
> > any hints? (thank's for reading all this)
> > 
> > -- 
> > bye,
> > emilio
> > 

From dist-list at lexum.com  Mon Sep 27 15:05:39 2010
From: dist-list at lexum.com (F M)
Date: Mon, 27 Sep 2010 11:05:39 -0400 (EDT)
Subject: [Linux-cluster] Xen + cluster : 5.4 to 5.5
In-Reply-To: <991248123.2012.1285598928061.JavaMail.root@vicenza.dmz.lexum.pri>
Message-ID: <1830117246.2036.1285599939769.JavaMail.root@vicenza.dmz.lexum.pri>

Hello,
I always wait a long time before updating my dom0s: 4 Xen servers sharing a GFS2 mount for failover, with ricci/luci managing the cluster suite.
But with the latest kernel security alerts, the time has come.
Any advice or gotchas for this update?

Regards,


From bturner at redhat.com  Mon Sep 27 15:26:29 2010
From: bturner at redhat.com (Ben Turner)
Date: Mon, 27 Sep 2010 11:26:29 -0400 (EDT)
Subject: [Linux-cluster] ricci is very unstable in one nodes
In-Reply-To: <1729750904.450591285600253338.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Message-ID: <1131975399.454141285601189939.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>

RHEL 5.6 hasn't been released yet, so your package probably contains the problem.  I'm not sure how in sync CentOS is with RHEL or whether they patch earlier, so I cannot give you a time frame for when the fix will be in CentOS, or say whether they have already patched it.  The problem in that BZ is more of an annoyance; you usually just have to retry a time or two and it works.  If you can't get luci working properly with your service at all, you should try enabling the service through the command line with clusvcadm -e.  If it is not working from the command line either, then there is a problem with the service config.

-Ben


----- "fosiul alam" <expertalert at gmail.com> wrote:

> Hi Ben
> Thanks
> 
> I named this cluster as mysql-server but i have not installed mysql
> database in their yet
> 
> and both luci and ricci on luci server and node1 is running this
> version
> 
> luci-0.12.2-12.el5.centos.1
> ricci-0.12.2-12.el5.centos.1
> 
> 
> do you think this version has problem as well ??
> 
> thanks for your help
> 
> 
> 
> 
> On 24 September 2010 15:33, Ben Turner < bturner at redhat.com > wrote:
> 
> 
> There is an issue with ricci timeouts that was fixed recently:
> 
> https://bugzilla.redhat.com/show_bug.cgi?id=564490
> 
> I'm not sure but you may be hitting that bug. Symptoms include: luci
> isn't able to get the status from the node, timeouts when querying
> ricci, etc. The fix should be released with 5.6
> 
> On the mysql service there are some options that you need to set. Here
> are all the options available to that agent:
> 
> mysql
> Defines a MySQL database server
> 
> Attribute              Description
> config_file            Define configuration file
> listen_address         Define an IP address for MySQL server. If the address
>                        is not given then first IP address from the service is taken.
> mysqld_options         Other command-line options for mysqld
> name                   Name
> ref                    Reference to existing mysql resource in the resources section.
> service_name           Inherit the service name.
> shutdown_wait          Wait X seconds for correct end of service shutdown
> startup_wait           Wait X seconds for correct end of service startup
> __enforce_timeouts     Consider a timeout for operations as fatal.
> __failure_expire_time  Amount of time before a failure is forgotten.
> __independent_subtree  Treat this and all children as an independent subtree.
> __max_failures         Maximum number of failures before returning a failure
>                        to a status check.
> 
> If I recall correctly you may need to tweak:
> 
> shutdown_wait Wait X seconds for correct end of service shutdown
> startup_wait Wait X seconds for correct end of service startup
> 
> There can be problems relocating the DB if it takes too long to
> start/shutdown. If you are having problems relocating with luci it may
> be a good idea to test with:
> 
> # clusvcadm -r <service name> -m <cluster node>
> 
> -Ben
> 
> 
> 
> 
> 
> 
> ----- "fosiul alam" < expertalert at gmail.com > wrote:
> 
> > Hi
> > I have 4 nodes cluster,
> > It was running fine. but today one nodes is giving trouble
> >
> > From luci Gui interface, when i try to relocate service into this
> node
> > and trying to relocate from this nodes to another nodes
> >
> > from luci gui interface, its showing :
> >
> > Unable to retrieve batch 1908047789 status from
> > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
> > Starting cluster service "httpd1" on node "http1.domain.local" --
> You
> > will be redirected in 5 seconds.
> > also
> >
> > The ricci agent for this node is unresponsive. Node-specific
> > information is not available at this time. :
> >
> > but ricci is running on problematic node ,
> > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
> >
> > there is not any firewall running.
> >
> > iptables -L
> > Chain INPUT (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain FORWARD (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain OUTPUT (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain RH-Firewall-1-INPUT (0 references)
> > target prot opt source destination
> >
> > port 11111 is runningg
> >
> > netstat -an | grep 11111
> > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
> >
> >
> > but still ricci is very unstable , and i cant relocate any service
> on
> > this node or i cant relocate any service away from this node.
> >
> > from problematic node if i type this
> >
> > clustat
> > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
> > Member Status: Quorate
> >
> > Member Name                  ID   Status
> > ------ ----                  ---- ------
> > beaver.xxx.local                1 Online, rgmanager   <-- luci is running from this server
> > publicdns1.xxxx.local           2 Online, rgmanager
> > http1.xxxx.local                3 Online, Local, rgmanager
> > mail01.xxxxx.local              4 Online, rgmanager
> >
> > Service Name                 Owner (Last)              State
> > ------- ----                 ----- ------              -----
> > service:httpd1               mail01.xxxx.local         started
> > service:mysql-server         http1.xxxx.local          started   <-- this is the problematic node
> > service:public-dns           publicdns1.xxxxxx.local   started
> >
> > I cant move that service mysql-server from this node or cant
> relocate
> > any service on this node ..
> > I am very confused.
> >
> > what shall i do to fix this issue ??
> >
> > thanks for your advise.
> >
> >
> >
> >


From marcello.percoco at diennea.com  Mon Sep 27 15:30:27 2010
From: marcello.percoco at diennea.com (Marcello Percoco - Diennea)
Date: Mon, 27 Sep 2010 17:30:27 +0200
Subject: [Linux-cluster] Gfs2 Problem
Message-ID: <CA2F45405F1F27488F305C7863282FB974D76013@dnaexc01.diennea.lan>

Hello.

We have a 3-node cluster with a shared GFS2 file system used by our application.
Every day all 3 machines get stuck; in dmesg I see many "task hang..." messages with a stack trace referring to dlm and GFS2 (I don't have a trace to post now, but next time I will add it to the mail).

The application writes and deletes many lock files, but every lock file has its own directory.
What could be the problem?

Our GFS2 has 512 MB journals; is this a good idea or not?

Thanks.

--
Marcello Percoco
IT Junior

Diennea
Viale G. Marconi, 30/14
48018 Faenza (RA) - Italy

E-Mail: marcello.percoco at diennea.com
Tel.: (+39) 0546 667432 - Int. 916
Fax:  (+39) 0546 399913

MagNews - E-Mail Marketing Solutions
http://www.magnews.it

Diennea - Technology for Marketing
http://www.diennea.com

DISCLAIMER
Questo messaggio e i suoi allegati si rivolgono esclusivamente ai destinatari e possono contenere informazioni personali, confidenziali o protette da diritti. Se ha ricevuto questo messaggio per errore, l'utilizzo dei suoi contenuti è proibito e può esporre a conseguenze penali o civili. La invitiamo pertanto a rispedire immediatamente il messaggio al mittente e cancellarne gli allegati senza conservarne una copia. Per ulteriori informazioni, La preghiamo di contattarci all'indirizzo postmaster at diennea.com. Grazie
This e-mail and any attachments may be confidential and the subject of legal professional privilege. Any disclosure, use, storage or copying of this e-mail without the consent of the sender is strictly prohibited. Please notify the sender immediately if you are not the intended recipient and then delete the e-mail from your inbox and do not disclose the contents to another person, use, copy or store the information in any medium. For further information write to postmaster at diennea.com. Thanks


From swhiteho at redhat.com  Mon Sep 27 15:52:26 2010
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Mon, 27 Sep 2010 16:52:26 +0100
Subject: [Linux-cluster] Gfs2 Problem
In-Reply-To: <CA2F45405F1F27488F305C7863282FB974D76013@dnaexc01.diennea.lan>
References: <CA2F45405F1F27488F305C7863282FB974D76013@dnaexc01.diennea.lan>
Message-ID: <1285602746.2476.23.camel@dolmen>

Hi,

On Mon, 2010-09-27 at 17:30 +0200, Marcello Percoco - Diennea wrote:
> Hallo.
> 
> 
> We have a 3 node cluster, with a shared Gfs2 used by our application.
> 
> Evry day all the 3 machines get stuck, in the dmesg i see many "task
> hang..." with a stack trace referign to dlm and Gfs2 (i didin't have a
> trace to post now, but next time i add it to the mail).
> 
That would be helpful in tracking down the cause. A lock dump would
be useful too (you need debugfs mounted somewhere).

>  
> 
> The application write and delete many lock file, but every lock file
> had is own directory.
> 
So it is unlikely to be directory lock contention then, it seems.

> Whath could be the problem?
> 
>  
There are a lot of possible causes. Without the traces it is very tricky
to narrow it down. I think you are looking in the right area though.

> 
> Our Gfs2 had 512Mb journals, is this a good idea or not?
> 
That sounds like a lot. It won't hurt though and I very much doubt that
is related to the hangs.

The key is to figure out whether the problem is actually a hang or
whether it is just being slow. If you can get two lock dumps from a node
that is having problems, say a minute or two apart, that should tell you
that vital information,

Steve.
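
A rough sketch of how the two lock dumps could be captured is below. It
assumes debugfs is mounted at /sys/kernel/debug and that the glock dump for
the file system is the "glocks" file under gfs2/<clustername>:<fsname>; the
dump_glocks helper name, the output prefix and the 90 second default are
just illustrations.

```shell
#!/bin/sh
# Sketch: take two glock dumps from one node, a fixed interval apart.
# Usage: dump_glocks GLOCKS_FILE OUT_PREFIX [INTERVAL_SECONDS]
# Writes OUT_PREFIX.1 and OUT_PREFIX.2.
dump_glocks() {
    glocks="$1"
    prefix="$2"
    interval="${3:-90}"
    cat "$glocks" > "$prefix.1" || return 1
    sleep "$interval"
    cat "$glocks" > "$prefix.2" || return 1
}

# Example invocation on an affected node (placeholders, not run here):
#   mount -t debugfs none /sys/kernel/debug
#   dump_glocks "/sys/kernel/debug/gfs2/<clustername>:<fsname>/glocks" /tmp/glocks
```

If the same holders and waiters show up unchanged in both files, that points
towards a real hang; if they differ, the node is more likely just slow.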

>  
> 
> Thnx.
> 
>  
> 
> -- 
> 
> Marcello Percoco
> IT Junior
> 
> 
> 
> Diennea
> Viale G. Marconi, 30/14
> 48018 Faenza (RA) - Italy
> 
> E-Mail: marcello.percoco at diennea.com
> Tel.: (+39) 0546 667432 - Int. 916
> Fax:  (+39) 0546 399913
> 
> 
> 
> MagNews - E-Mail Marketing Solutions
> http://www.magnews.it
> 
> Diennea - Technology for Marketing
> http://www.diennea.com 
> 
> 
> 
> DISCLAIMER
> Questo messaggio e i suoi allegati si rivolgono esclusivamente ai
> destinatari e possono contenere informazioni personali, confidenziali
> o protette da diritti. Se ha ricevuto questo messaggio per errore,
> l'utilizzo dei suoi contenuti è proibito e può esporre a conseguenze
> penali o civili. La invitiamo pertanto a rispedire immediatamente il
> messaggio al mittente e cancellarne gli allegati senza conservarne una
> copia. Per ulteriori informazioni, La preghiamo di contattarci
> all'indirizzo postmaster at diennea.com. Grazie 
> This e-mail and any attachments may be confidential and the subject of
> legal professional privilege. Any disclosure, use, storage or copying
> of this e-mail without the consent of the sender is strictly
> prohibited. Please notify the sender immediately if you are not the
> intended recipient and then delete the e-mail from your inbox and do
> not disclose the contents to another person, use, copy or store the
> information in any medium. For further information write to
> postmaster at diennea.com. Thanks
> 
>  
> 
> 


From expertalert at gmail.com  Mon Sep 27 15:48:27 2010
From: expertalert at gmail.com (fosiul alam)
Date: Mon, 27 Sep 2010 16:48:27 +0100
Subject: [Linux-cluster] ricci is very unstable in one nodes
In-Reply-To: <1131975399.454141285601189939.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
References: <1729750904.450591285600253338.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
	<1131975399.454141285601189939.JavaMail.root@zmail07.collab.prod.int.phx2.redhat.com>
Message-ID: <AANLkTimo0R=C1XP8KwoXeyO=VWNVnFckkiXUZnrjBgs0@mail.gmail.com>

Hi
I am trying to patch ricci. Let's see how it goes.

But clusvcadm is failing as well:

[root at http1 ~]#  clusvcadm -e httpd1 -m http1.xxxx.local
Member http1.xxxx.local trying to enable service:httpd1...Invalid operation for resource

Here, http1 is the node where I was trying to run the service from luci.

What could be the problem?
Is there any way to find out whether there is a problem with the config?

On 27 September 2010 16:26, Ben Turner <bturner at redhat.com> wrote:

> RHEL 5.6 hasn't been released yet so your package probably contains the
> problem.  I'm not sure how in sync Centos is with RHEL or if they patch
> earlier so I cannot give you a time frame when it will be in Centos or if
> they have already patched it.  The problem in that BZ is more of an
> annoyance, you usually just have to retry a time or two and it works.  If
> you can't get Luci working properly with your service at all you should try
> enabling the service through the command line with clusvcadm -e.  If it is
> not working from the command line either then there is a problem with the
> service config.
>
> -Ben
>
>
>
>
> ----- "fosiul alam" <expertalert at gmail.com> wrote:
>
> > Hi Ben
> > Thanks
> >
> > I named this cluster as mysql-server but i have not installed mysql
> > database in their yet
> >
> > and both luci and ricci on luci server and node1 is running this
> > version
> >
> > luci-0.12.2-12.el5.centos.1
> > ricci-0.12.2-12.el5.centos.1
> >
> >
> > do you think this version has problem as well ??
> >
> > thanks for your help
> >
> >
> >
> >
> > On 24 September 2010 15:33, Ben Turner < bturner at redhat.com > wrote:
> >
> >
> > There is an issue with ricci timeouts that was fixed recently:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=564490
> >
> > I'm not sure but you may be hitting that bug. Symptoms include: luci
> > isn't able to get the status from the node, timeouts when querying
> > ricci, etc. The fix should be released with 5.6
> >
> > On the mysql service there are some options that you need to set. Here
> > are all the options available to that agent:
> >
> > mysql
> > Defines a MySQL database server
> >
> > Attribute Description
> > config_file Define configuration file
> > listen_address Define an IP address for MySQL server. If the address
> > is not given then first IP address from the service is taken.
> > mysqld_options Other command-line options for mysqld
> > name Name
> > ref Reference to existing mysql resource in the resources section.
> > service_name Inherit the service name.
> > shutdown_wait Wait X seconds for correct end of service shutdown
> > startup_wait Wait X seconds for correct end of service startup
> > __enforce_timeouts Consider a timeout for operations as fatal.
> > __failure_expire_time Amount of time before a failure is forgotten.
> > __independent_subtree Treat this and all children as an independent
> > subtree.
> > __max_failures Maximum number of failures before returning a failure
> > to a status check.
> >
> > If I recall correctly you may need to tweak:
> >
> > shutdown_wait Wait X seconds for correct end of service shutdown
> > startup_wait Wait X seconds for correct end of service startup
> >
> > There can be problems relocating the DB if it takes too long to
> > start/shutdown. If you are having problems relocating with luci it may
> > be a good idea to test with:
> >
> > # clusvcadm -r <service name> -m <cluster node>
> >
> > -Ben
> >
> >
> >
> >
> >
> >
> > ----- "fosiul alam" < expertalert at gmail.com > wrote:
> >
> > > Hi
> > > I have 4 nodes cluster,
> > > It was running fine. but today one nodes is giving trouble
> > >
> > > From luci Gui interface, when i try to relocate service into this
> > node
> > > and trying to relocate from this nodes to another nodes
> > >
> > > from luci gui interface, its showing :
> > >
> > > Unable to retrieve batch 1908047789 status from
> > > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
> > > Starting cluster service "httpd1" on node "http1.domain.local" --
> > You
> > > will be redirected in 5 seconds.
> > > also
> > >
> > > The ricci agent for this node is unresponsive. Node-specific
> > > information is not available at this time. :
> > >
> > > but ricci is running on problematic node ,
> > > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
> > >
> > > there is not any firewall running.
> > >
> > > iptables -L
> > > Chain INPUT (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain FORWARD (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain OUTPUT (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain RH-Firewall-1-INPUT (0 references)
> > > target prot opt source destination
> > >
> > > port 11111 is runningg
> > >
> > > netstat -an | grep 11111
> > > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
> > >
> > >
> > > but still ricci is very unstable , and i cant relocate any service
> > on
> > > this node or i cant relocate any service away from this node.
> > >
> > > from problematic node if i type this
> > >
> > > clustat
> > > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
> > > Member Status: Quorate
> > >
> > > Member Name ID Status
> > > ------ ---- ---- ------
> > > beaver.xxx.local 1 Online, rgmanager ::: luci is running from this
> > > server
> > > publicdns1.xxxx.local 2 Online, rgmanager
> > > http1.xxxx.local 3 Online, Local, rgmanager
> > > mail01.xxxxx.local 4 Online, rgmanager
> > >
> > > Service Name Owner (Last) State
> > > ------- ---- ----- ------ -----
> > > service:httpd1 mail01.xxxx.local started
> > > service:mysql-server http1.xxxx.local started -------------------
> > this
> > > is the problematic node
> > > service:public-dns publicdns1.xxxxxx.local started
> > >
> > > I cant move that service mysql-server from this node or cant
> > relocate
> > > any service on this node ..
> > > I am very confused.
> > >
> > > what shall i do to fix this issue ??
> > >
> > > thanks for your advise.
> > >
> > >
> > >
> > >

From expertalert at gmail.com  Mon Sep 27 16:02:20 2010
From: expertalert at gmail.com (fosiul alam)
Date: Mon, 27 Sep 2010 17:02:20 +0100
Subject: [Linux-cluster] Unable to patch conga
Message-ID: <AANLkTimdQNO3x3g5EKc2ETMPePf3iA-Cptiih6rLb4Au@mail.gmail.com>

Hi,
Because of the same issue (I see exactly the same problem in my luci interface),
I am trying to patch conga.

I downloaded the source RPM and unpacked it:

http://mirrors.kernel.org/centos/5/os/SRPMS/conga-0.12.2-12.el5.centos.1.src.rpm
rpm -i conga-0.12.2-12.el5.centos.1.src.rpm
cd /usr/src/redhat/SOURCES

tar -xvzf conga-0.12.2.tar.gz
patch -p0 < /path/to/where_the_patch/ricci.patch

[root at beaver SOURCES]# cd conga-0.12.2

Now I am having trouble with the build:

./autogen.sh --include_zope_and_plone=yes
Zope-2.9.8-final.tgz passed sha512sum test
Plone-2.5.5.tar.gz passed sha512sum test
cat: clustermon.spec.in.in: No such file or directory

Run `./configure` to configure conga build,
or `make srpms` to build conga and clustermon srpms
or `make rpms` to build all rpms

[root at beaver conga-0.12.2]#  ./configure --include_zope_and_plone=yes
D-BUS version 1.1.2 detected  -> major 1, minor 1
missing zope directory, extract zope source-code into it and try again


Now, how do I tell ./configure where Zope and Plone are?
Do I need Zope and Plone at all?

Please give me some advice.

Fosiul

From pmdyer at ctgcentral2.com  Mon Sep 27 16:55:28 2010
From: pmdyer at ctgcentral2.com (Paul M. Dyer)
Date: Mon, 27 Sep 2010 11:55:28 -0500 (CDT)
Subject: [Linux-cluster] ricci is very unstable in one nodes
In-Reply-To: <AANLkTimo0R=C1XP8KwoXeyO=VWNVnFckkiXUZnrjBgs0@mail.gmail.com>
Message-ID: <1480320.10.1285606528829.JavaMail.root@athena>

http://rhn.redhat.com/errata/RHBA-2010-0716.html

It appears that this problem has been fixed in this erratum.

I installed the luci and ricci updates and did some light testing.  So far, the port 11111 timeout error has not shown up.
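
To see which build is installed on a node, the package strings can be split
into name, version and release and compared with the package list on the
errata page. The split_nvr helper below is only a sketch of mine; it assumes
rpm -q prints plain name-version-release (if an architecture suffix is
printed as well, it ends up in the release field).

```shell
#!/bin/sh
# Sketch: split an RPM "name-version-release" string into its three parts.
# Usage: split_nvr NAME-VERSION-RELEASE   -> prints "name version release"
split_nvr() {
    nvr="$1"
    release="${nvr##*-}"     # text after the last dash
    rest="${nvr%-*}"         # everything before the last dash
    version="${rest##*-}"    # text after the remaining last dash
    name="${rest%-*}"        # what is left is the package name
    echo "$name $version $release"
}

# On a node (not run here):
#   for p in luci ricci; do split_nvr "$(rpm -q "$p")"; done
```

Packages rebuilt by another distribution can share the same version and
differ only in the release field, so that is the part worth comparing.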

Paul

----- Original Message -----
From: "fosiul alam" <expertalert at gmail.com>
To: "linux clustering" <linux-cluster at redhat.com>
Sent: Monday, September 27, 2010 10:48:27 AM
Subject: Re: [Linux-cluster] ricci is very unstable in one nodes

Hi
i am trying to patch ricci . let see how it goes

but clusvcadm is failing as well

[root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
Member http1.xxxx.local trying to enable service:httpd1...Invalid
operation for resource

here, http1 , where i was trying to run the service from luci

what could be the problem ?
is there any way to find out if there is any problem with config ??

On 27 September 2010 16:26, Ben Turner < bturner at redhat.com > wrote:


RHEL 5.6 hasn't been released yet so your package probably contains the
problem. I'm not sure how in sync Centos is with RHEL or if they patch
earlier so I cannot give you a time frame when it will be in Centos or
if they have already patched it. The problem in that BZ is more of an
annoyance, you usually just have to retry a time or two and it works. If
you can't get Luci working properly with your service at all you should
try enabling the service through the command line with clusvcadm -e. If
it is not working from the command line either then there is a problem
with the service config.


-Ben


----- "fosiul alam" < expertalert at gmail.com > wrote:

> Hi Ben
> Thanks
>
> I named this cluster as mysql-server but i have not installed mysql
> database in their yet
>
> and both luci and ricci on luci server and node1 is running this
> version
>
> luci-0.12.2-12.el5.centos.1
> ricci-0.12.2-12.el5.centos.1
>
>
> do you think this version has problem as well ??
>
> thanks for your help
>
>
>
>
> On 24 September 2010 15:33, Ben Turner < bturner at redhat.com > wrote:
>
>
> There is an issue with ricci timeouts that was fixed recently:
>
> https://bugzilla.redhat.com/show_bug.cgi?id=564490
>
> I'm not sure but you may be hitting that bug. Symptoms include: luci
> isn't able to get the status from the node, timeouts when querying
> ricci, etc. The fix should be released with 5.6
>
> On the mysql service there are some options that you need to set. Here
> are all the options available to that agent:
>
> mysql
> Defines a MySQL database server
>
> Attribute Description
> config_file Define configuration file
> listen_address Define an IP address for MySQL server. If the address
> is not given then first IP address from the service is taken.
> mysqld_options Other command-line options for mysqld
> name Name
> ref Reference to existing mysql resource in the resources section.
> service_name Inherit the service name.
> shutdown_wait Wait X seconds for correct end of service shutdown
> startup_wait Wait X seconds for correct end of service startup
> __enforce_timeouts Consider a timeout for operations as fatal.
> __failure_expire_time Amount of time before a failure is forgotten.
> __independent_subtree Treat this and all children as an independent
> subtree. __max_failures Maximum number of failures before returning a
> failure to a status check.
>
> If I recall correctly you may need to tweak:
>
> shutdown_wait Wait X seconds for correct end of service shutdown
> startup_wait Wait X seconds for correct end of service startup
>
> There can be problems relocating the DB if it takes too long to
> start/shutdown. If you are having problems relocating with luci it may
> be a good idea to test with:
>
> # clusvcadm -r <service name> -m <cluster node>
>
> -Ben
>
>
>
>
>
>
> ----- "fosiul alam" < expertalert at gmail.com > wrote:
>
> > Hi
> > I have 4 nodes cluster,
> > It was running fine. but today one nodes is giving trouble
> >
> > From luci Gui interface, when i try to relocate service into this
> node
> > and trying to relocate from this nodes to another nodes
> >
> > from luci gui interface, its showing :
> >
> > Unable to retrieve batch 1908047789 status from
> > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
> > Starting cluster service "httpd1" on node "http1.domain.local" --
> You
> > will be redirected in 5 seconds.
> > also
> >
> > The ricci agent for this node is unresponsive. Node-specific
> > information is not available at this time. :
> >
> > but ricci is running on problematic node ,
> > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
> >
> > there is not any firewall running.
> >
> > iptables -L
> > Chain INPUT (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain FORWARD (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain OUTPUT (policy ACCEPT)
> > target prot opt source destination
> >
> > Chain RH-Firewall-1-INPUT (0 references)
> > target prot opt source destination
> >
> > port 11111 is runningg
> >
> > netstat -an | grep 11111
> > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
> >
> >
> > but still ricci is very unstable , and i cant relocate any service
> on
> > this node or i cant relocate any service away from this node.
> >
> > from problematic node if i type this
> >
> > clustat
> > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
> > Member Status: Quorate
> >
> > Member Name ID Status
> > ------ ---- ---- ------
> > beaver.xxx.local 1 Online, rgmanager ::: luci is running from this
> > server publicdns1.xxxx.local 2 Online, rgmanager
> > http1.xxxx.local 3 Online, Local, rgmanager
> > mail01.xxxxx.local 4 Online, rgmanager
> >
> > Service Name Owner (Last) State
> > ------- ---- ----- ------ -----
> > service:httpd1 mail01.xxxx.local started
> > service:mysql-server http1.xxxx.local started -------------------
> this
> > is the problematic node
> > service:public-dns publicdns1.xxxxxx.local started
> >
> > I cant move that service mysql-server from this node or cant
> relocate
> > any service on this node ..
> > I am very confused.
> >
> > what shall i do to fix this issue ??
> >
> > thanks for your advise.
> >
> >
> >
> >


From brem.belguebli at gmail.com  Mon Sep 27 17:05:06 2010
From: brem.belguebli at gmail.com (brem belguebli)
Date: Mon, 27 Sep 2010 19:05:06 +0200
Subject: [Linux-cluster] porblem with quorum at cluster boot
In-Reply-To: <OF53C30CF7.D0AD6B90-ON862577AB.00514899-862577AB.005173D0@mck.us.ray.com>
References: <OF325BF332.6C7EA96C-ON852577A8.005C8390-852577A8.005CA959@ottlnmta.mitel.com>
	<1285427983.23766.0.camel@newgen.localdomain>
	<OF53C30CF7.D0AD6B90-ON862577AB.00514899-862577AB.005173D0@mck.us.ray.com>
Message-ID: <AANLkTi=FOA-cj5hg11zBmZdzWyQiMpPCM9FZiKgFQHH9@mail.gmail.com>

The configuration you are trying to build, 2 cluster nodes (1 vote each)
plus a quorum disk with 1 vote (for a total of expected_votes = 3), must
remain up if you lose 1 of the members (as long as the remaining node can
still access the quorum disk), because there are still 2 active votes
(1 remaining node + 1 quorum disk), and 2 > expected_votes/2.

The quorum (majority) must be strictly greater than expected_votes/2
(i.e. more than 50% of the votes) for service to continue.
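
The majority rule above can be written down as a small sketch. The function
names are mine; the integer formula expected_votes/2 + 1 is simply the
smallest vote count that is strictly more than half, and for
expected_votes=3 it gives 2, which agrees with the "Quorum: 2" line in the
cman_tool output quoted below.

```shell
#!/bin/sh
# Sketch of the quorum arithmetic (illustrative helper names).
# quorum_threshold EXPECTED_VOTES -> smallest strict majority of the votes
quorum_threshold() {
    echo $(( $1 / 2 + 1 ))
}

# is_quorate ACTIVE_VOTES EXPECTED_VOTES -> exit status 0 when quorate
is_quorate() {
    [ "$1" -ge "$(quorum_threshold "$2")" ]
}

# Example: one node (1 vote) plus the quorum disk (1 vote), expected_votes=3
#   is_quorate 2 3 && echo "quorate"
```

With only the node vote and no quorum disk vote (1 of 3), the same check
fails, which is why the remaining node needs access to the quorum disk.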


2010/9/27 Bennie R Thomas <Bennie_R_Thomas at raytheon.com>

> Try setting your expected votes to 2 or 1.
>
> Your cluster is hanging with one node because it wants 3 votes.
>
>
>
> From: Brem Belguebli <brem.belguebli at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Date: 09/25/2010 10:30 AM
> Subject: Re: [Linux-cluster] porblem with quorum at cluster boot
> Sent by: linux-cluster-bounces at redhat.com
> ------------------------------
>
>
>
> On Fri, 2010-09-24 at 12:52 -0400, Jason_Henderson at Mitel.com wrote:
> >
> > I think you still need two_node="1" in your conf file if you want a
> > single node to become quorate.
> >
> two_node="1" is only valid if you do not have a quorum disk.
>
> > linux-cluster-bounces at redhat.com wrote on 09/24/2010 12:38:17 PM:
> >
> > > hello,
> > >
> > > I have a 2 node cluster with qdisk quorum partition;
> > >
> > > each node has 1 vote and the qdisk has 1 vote too; in cluster.conf I
> > > have this explicit declaration:
> > > <cman expected_votes="3" two_node="0"/>
> > >
> > > when I have both 2 nodes active cman_tool status tell me this:
> > >
> > > Version: 6.1.0
> > > Nodes: 2
> > > Expected votes: 3
> > > Quorum device votes: 1
> > > Total votes: 3
> > > Node votes: 1
> > > Quorum: 2
> > >
> > > then, if I power off a node these value, as expected, changed this
> > way:
> > > Nodes: 1
> > > Total votes: 2
> > >
> > > and the cluster is still quorate and functional.
> > >
> > > the problem is if I power off both the node and them power on only
> > one
> > > of them: in this case the single node does not quorate and the
> > cluster
> > > does not start: I have to power on both the node to have the
> > cluster
> > > (and services on the cluster) working.
> > >
> > > I'd like the cluster can work (and boot) even with a single node
> > (ie, if
> > > one of the node has hw failure and is down I still want to be able
> > to
> > > reboot the working node and have it booting correctly the cluster)
> > >
> > > any hints? (thank's for reading all this)
> > >
> > > --
> > > bye,
> > > emilio
> > >

From expertalert at gmail.com  Mon Sep 27 17:31:31 2010
From: expertalert at gmail.com (fosiul alam)
Date: Mon, 27 Sep 2010 18:31:31 +0100
Subject: [Linux-cluster] ricci is very unstable in one nodes
In-Reply-To: <1480320.10.1285606528829.JavaMail.root@athena>
References: <AANLkTimo0R=C1XP8KwoXeyO=VWNVnFckkiXUZnrjBgs0@mail.gmail.com>
	<1480320.10.1285606528829.JavaMail.root@athena>
Message-ID: <AANLkTikwtYxG3_gf0QxqJpGzZxowh4T7rGbwH-+MhWs8@mail.gmail.com>

Hi
Thanks for your advice.
Currently I have these:

luci-0.12.2-12.el5.centos.1
ricci-0.12.2-12.el5.centos.1

Are these the same RPMs as

luci-0.12.2-12.el5_5.4.i386.rpm
ricci-0.12.2-12.el5_5.4.i386.rpm ?

Thanks


On 27 September 2010 17:55, Paul M. Dyer <pmdyer at ctgcentral2.com> wrote:

> http://rhn.redhat.com/errata/RHBA-2010-0716.html
>
> It appears that this problem has been fixed in this errata.
>
> I installed the luci and ricci updates and did some lite testing.   So far,
> the timeout 11111 error has not shown up.
>
> Paul
>
> ----- Original Message -----
> From: "fosiul alam" <expertalert at gmail.com>
> To: "linux clustering" <linux-cluster at redhat.com>
> Sent: Monday, September 27, 2010 10:48:27 AM
> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
>
> Hi
> i am trying to patch ricci . let see how it goes
>
> but clusvcadm is failing as well
>
> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
> Member http1.xxxx.local trying to enable service:httpd1...Invalid
> operation for resource
>
> here, http1 , where i was trying to run the service from luci
>
> what could be the problem ?
> is there any way to find out if there is any problem with config ??
>
> On 27 September 2010 16:26, Ben Turner < bturner at redhat.com > wrote:
>
>
> RHEL 5.6 hasn't been released yet so your package probably contains the
> problem. I'm not sure how in sync Centos is with RHEL or if they patch
> earlier so I cannot give you a time frame when it will be in Centos or
> if they have already patched it. The problem in that BZ is more of an
> annoyance, you usually just have to retry a time or two and it works. If
> you can't get Luci working properly with your service at all you should
> try enabling the service through the command line with clusvcadm -e. If
> it is not working from the command line either then there is a problem
> with the service config.
>
>
>
>
> -Ben
>
>
>
>
> ----- "fosiul alam" < expertalert at gmail.com > wrote:
>
> > Hi Ben
> > Thanks
> >
> > I named this cluster as mysql-server but i have not installed mysql
> > database in their yet
> >
> > and both luci and ricci on luci server and node1 is running this
> > version
> >
> > luci-0.12.2-12.el5.centos.1
> > ricci-0.12.2-12.el5.centos.1
> >
> >
> > do you think this version has problem as well ??
> >
> > thanks for your help
> >
> >
> >
> >
> > On 24 September 2010 15:33, Ben Turner < bturner at redhat.com > wrote:
> >
> >
> > There is an issue with ricci timeouts that was fixed recently:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=564490
> >
> > I'm not sure but you may be hitting that bug. Symptoms include: luci
> > isn't able to get the status from the node, timeouts when querying
> > ricci, etc. The fix should be released with 5.6
> >
> > On the mysql service there are some options that you need to set. Here
> > are all the options available to that agent:
> >
> > mysql
> > Defines a MySQL database server
> >
> > Attribute              Description
> > config_file            Define configuration file
> > listen_address         Define an IP address for MySQL server. If the
> >                        address is not given then first IP address from
> >                        the service is taken.
> > mysqld_options         Other command-line options for mysqld
> > name                   Name
> > ref                    Reference to existing mysql resource in the
> >                        resources section.
> > service_name           Inherit the service name.
> > shutdown_wait          Wait X seconds for correct end of service shutdown
> > startup_wait           Wait X seconds for correct end of service startup
> > __enforce_timeouts     Consider a timeout for operations as fatal.
> > __failure_expire_time  Amount of time before a failure is forgotten.
> > __independent_subtree  Treat this and all children as an independent
> >                        subtree.
> > __max_failures         Maximum number of failures before returning a
> >                        failure to a status check.
> >
> > If I recall correctly you may need to tweak:
> >
> > shutdown_wait Wait X seconds for correct end of service shutdown
> > startup_wait Wait X seconds for correct end of service startup
> >
> > There can be problems relocating the DB if it takes too long to
> > start/shutdown. If you are having problems relocating with luci it may
> > be a good idea to test with:
> >
> > # clusvcadm -r <service name> -m <cluster node>
> >
> > -Ben
> >
> >
> >
> >
> >
> >
> > ----- "fosiul alam" < expertalert at gmail.com > wrote:
> >
> > > Hi
> > > I have 4 nodes cluster,
> > > It was running fine. but today one nodes is giving trouble
> > >
> > > From luci Gui interface, when i try to relocate service into this
> > node
> > > and trying to relocate from this nodes to another nodes
> > >
> > > from luci gui interface, its showing :
> > >
> > > Unable to retrieve batch 1908047789 status from
> > > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
> > > Starting cluster service "httpd1" on node "http1.domain.local" --
> > You
> > > will be redirected in 5 seconds.
> > > also
> > >
> > > The ricci agent for this node is unresponsive. Node-specific
> > > information is not available at this time. :
> > >
> > > but ricci is running on problematic node ,
> > > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
> > >
> > > there is not any firewall running.
> > >
> > > iptables -L
> > > Chain INPUT (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain FORWARD (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain OUTPUT (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain RH-Firewall-1-INPUT (0 references)
> > > target prot opt source destination
> > >
> > > port 11111 is runningg
> > >
> > > netstat -an | grep 11111
> > > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
> > >
> > >
> > > but still ricci is very unstable , and i cant relocate any service
> > on
> > > this node or i cant relocate any service away from this node.
> > >
> > > from problematic node if i type this
> > >
> > > clustat
> > > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
> > > Member Status: Quorate
> > >
> > > Member Name              ID   Status
> > > ------ ----              ---- ------
> > > beaver.xxx.local         1    Online, rgmanager  ::: luci is running from this server
> > > publicdns1.xxxx.local    2    Online, rgmanager
> > > http1.xxxx.local         3    Online, Local, rgmanager
> > > mail01.xxxxx.local       4    Online, rgmanager
> > >
> > > Service Name             Owner (Last)             State
> > > ------- ----             ----- ------             -----
> > > service:httpd1           mail01.xxxx.local        started
> > > service:mysql-server     http1.xxxx.local         started  ------------------- this is the problematic node
> > > service:public-dns       publicdns1.xxxxxx.local  started
> > >
> > > I cant move that service mysql-server from this node or cant
> > relocate
> > > any service on this node ..
> > > I am very confused.
> > >
> > > what shall i do to fix this issue ??
> > >
> > > thanks for your advise.
> > >
> > >
> > >
> > >
> > > -- Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster

From expertalert at gmail.com  Mon Sep 27 17:37:44 2010
From: expertalert at gmail.com (fosiul alam)
Date: Mon, 27 Sep 2010 18:37:44 +0100
Subject: [Linux-cluster] ricci is very unstable in one nodes
In-Reply-To: <AANLkTikwtYxG3_gf0QxqJpGzZxowh4T7rGbwH-+MhWs8@mail.gmail.com>
References: <AANLkTimo0R=C1XP8KwoXeyO=VWNVnFckkiXUZnrjBgs0@mail.gmail.com>
	<1480320.10.1285606528829.JavaMail.root@athena>
	<AANLkTikwtYxG3_gf0QxqJpGzZxowh4T7rGbwH-+MhWs8@mail.gmail.com>
Message-ID: <AANLkTi=DfrVMFkp8No9UbwD+fVoRx9FmpO+qzY2RxLPk@mail.gmail.com>

Hi, in addition to my previous email, have a look at this one.

From http1 (the node where I am trying to relocate a service):

[root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
Member http1.xxxx.local trying to enable service:httpd1...Success
Warning: service:httpd1 is now running on mail01.xxxx.local

So it reports Success,
but the service was not actually enabled on http1.
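
This mismatch can be checked mechanically. Below is a minimal Python sketch, assuming the clusvcadm output has exactly the format pasted above; the function running_node and the regular expression are illustrative only, not an official interface.

```python
import re


def running_node(output):
    """Return the node named in an 'is now running on' warning, or None."""
    match = re.search(r"Warning: service:\S+ is now running on (\S+)", output)
    return match.group(1) if match else None


# Output copied from the clusvcadm run above.
output = """Member http1.xxxx.local trying to enable service:httpd1...Success
Warning: service:httpd1 is now running on mail01.xxxx.local"""

requested = "http1.xxxx.local"
actual = running_node(output)
if actual is not None and actual != requested:
    print("requested %s but service is on %s" % (requested, actual))
```

Going by the pasted output, "Success" only means the enable request was accepted; the warning line is what names the node that actually holds the service.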

Thanks again


On 27 September 2010 18:31, fosiul alam <expertalert at gmail.com> wrote:

> Hi
> Thanks for your advise,
> Currently i got this
>
>
> luci-0.12.2-12.el5.centos.1
> ricci-0.12.2-12.el5.centos.1
>
> is this the same rpm as
>
> luci-0.12.2-12.el5_5.4.i386.rpm  ?
> ricci-0.12.2-12.el5_5.4.i386.rpm  ?
>
> Thanks
>
>
>
> On 27 September 2010 17:55, Paul M. Dyer <pmdyer at ctgcentral2.com> wrote:
>
>> http://rhn.redhat.com/errata/RHBA-2010-0716.html
>>
>> It appears that this problem has been fixed in this errata.
>>
>> I installed the luci and ricci updates and did some lite testing.   So
>> far, the timeout 11111 error has not shown up.
>>
>> Paul
>>
>> ----- Original Message -----
>> From: "fosiul alam" <expertalert at gmail.com>
>> To: "linux clustering" <linux-cluster at redhat.com>
>> Sent: Monday, September 27, 2010 10:48:27 AM
>> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
>>
>> Hi
>> i am trying to patch ricci . let see how it goes
>>
>> but clusvcadm is failing as well
>>
>> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
>> Member http1.xxxx.local trying to enable service:httpd1...Invalid
>> operation for resource
>>
>> here, http1 , where i was trying to run the service from luci
>>
>> what could be the problem ?
>> is there any way to find out if there is any problem with config ??
>>
>> On 27 September 2010 16:26, Ben Turner < bturner at redhat.com > wrote:
>>
>>
>> RHEL 5.6 hasn't been released yet so your package probably contains the
>> problem. I'm not sure how in sync Centos is with RHEL or if they patch
>> earlier so I cannot give you a time frame when it will be in Centos or
>> if they have already patched it. The problem in that BZ is more of an
>> annoyance, you usually just have to retry a time or two and it works. If
>> you can't get Luci working properly with your service at all you should
>> try enabling the service through the command line with clusvcadm -e. If
>> it is not working from the command line either then there is a problem
>> with the service config.
>>
>>
>>
>>
>> -Ben
>>
>>
>>
>>
>> ----- "fosiul alam" < expertalert at gmail.com > wrote:
>>
>> > Hi Ben
>> > Thanks
>> >
>> > I named this cluster as mysql-server but i have not installed mysql
>> > database in their yet
>> >
>> > and both luci and ricci on luci server and node1 is running this
>> > version
>> >
>> > luci-0.12.2-12.el5.centos.1
>> > ricci-0.12.2-12.el5.centos.1
>> >
>> >
>> > do you think this version has problem as well ??
>> >
>> > thanks for your help
>> >
>> >
>> >
>> >
>> > On 24 September 2010 15:33, Ben Turner < bturner at redhat.com > wrote:
>> >
>> >
>> > There is an issue with ricci timeouts that was fixed recently:
>> >
>> > https://bugzilla.redhat.com/show_bug.cgi?id=564490
>> >
>> > I'm not sure but you may be hitting that bug. Symptoms include: luci
>> > isn't able to get the status from the node, timeouts when querying
>> > ricci, etc. The fix should be released with 5.6
>> >
>> > On the mysql service there are some options that you need to set. Here
>> > are all the options available to that agent:
>> >
>> > mysql
>> > Defines a MySQL database server
>> >
>> > Attribute              Description
>> > config_file            Define configuration file
>> > listen_address         Define an IP address for MySQL server. If the
>> >                        address is not given then first IP address from
>> >                        the service is taken.
>> > mysqld_options         Other command-line options for mysqld
>> > name                   Name
>> > ref                    Reference to existing mysql resource in the
>> >                        resources section.
>> > service_name           Inherit the service name.
>> > shutdown_wait          Wait X seconds for correct end of service shutdown
>> > startup_wait           Wait X seconds for correct end of service startup
>> > __enforce_timeouts     Consider a timeout for operations as fatal.
>> > __failure_expire_time  Amount of time before a failure is forgotten.
>> > __independent_subtree  Treat this and all children as an independent
>> >                        subtree.
>> > __max_failures         Maximum number of failures before returning a
>> >                        failure to a status check.
>> >
>> > If I recall correctly you may need to tweak:
>> >
>> > shutdown_wait Wait X seconds for correct end of service shutdown
>> > startup_wait Wait X seconds for correct end of service startup
>> >
>> > There can be problems relocating the DB if it takes too long to
>> > start/shutdown. If you are having problems relocating with luci it may
>> > be a good idea to test with:
>> >
>> > # clusvcadm -r <service name> -m <cluster node>
>> >
>> > -Ben
>> >
>> >
>> >
>> >
>> >
>> >
>> > ----- "fosiul alam" < expertalert at gmail.com > wrote:
>> >
>> > > Hi
>> > > I have 4 nodes cluster,
>> > > It was running fine. but today one nodes is giving trouble
>> > >
>> > > From luci Gui interface, when i try to relocate service into this
>> > node
>> > > and trying to relocate from this nodes to another nodes
>> > >
>> > > from luci gui interface, its showing :
>> > >
>> > > Unable to retrieve batch 1908047789 status from
>> > > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
>> > > Starting cluster service "httpd1" on node "http1.domain.local" --
>> > You
>> > > will be redirected in 5 seconds.
>> > > also
>> > >
>> > > The ricci agent for this node is unresponsive. Node-specific
>> > > information is not available at this time. :
>> > >
>> > > but ricci is running on problematic node ,
>> > > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
>> > >
>> > > there is not any firewall running.
>> > >
>> > > iptables -L
>> > > Chain INPUT (policy ACCEPT)
>> > > target prot opt source destination
>> > >
>> > > Chain FORWARD (policy ACCEPT)
>> > > target prot opt source destination
>> > >
>> > > Chain OUTPUT (policy ACCEPT)
>> > > target prot opt source destination
>> > >
>> > > Chain RH-Firewall-1-INPUT (0 references)
>> > > target prot opt source destination
>> > >
>> > > port 11111 is runningg
>> > >
>> > > netstat -an | grep 11111
>> > > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
>> > >
>> > >
>> > > but still ricci is very unstable , and i cant relocate any service
>> > on
>> > > this node or i cant relocate any service away from this node.
>> > >
>> > > from problematic node if i type this
>> > >
>> > > clustat
>> > > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
>> > > Member Status: Quorate
>> > >
>> > > Member Name              ID   Status
>> > > ------ ----              ---- ------
>> > > beaver.xxx.local         1    Online, rgmanager  ::: luci is running from this server
>> > > publicdns1.xxxx.local    2    Online, rgmanager
>> > > http1.xxxx.local         3    Online, Local, rgmanager
>> > > mail01.xxxxx.local       4    Online, rgmanager
>> > >
>> > > Service Name             Owner (Last)             State
>> > > ------- ----             ----- ------             -----
>> > > service:httpd1           mail01.xxxx.local        started
>> > > service:mysql-server     http1.xxxx.local         started  ------------------- this is the problematic node
>> > > service:public-dns       publicdns1.xxxxxx.local  started
>> > >
>> > > I cant move that service mysql-server from this node or cant
>> > relocate
>> > > any service on this node ..
>> > > I am very confused.
>> > >
>> > > what shall i do to fix this issue ??
>> > >
>> > > thanks for your advise.
>> > >
>> > >
>> > >
>> > >
>> > > -- Linux-cluster mailing list
>> > > Linux-cluster at redhat.com
>> > > https://www.redhat.com/mailman/listinfo/linux-cluster

From rohit2525 at gmail.com  Mon Sep 27 17:49:57 2010
From: rohit2525 at gmail.com (Rohit tripathi)
Date: Mon, 27 Sep 2010 23:19:57 +0530
Subject: [Linux-cluster] pls help
Message-ID: <AANLkTikF4it2wRoO6ErnO4i_99W_Ok6SOY3q8BB0t6QK@mail.gmail.com>

Thanks for the help.

I need help with one more thing: configuring IBM DB2 and LDAP in cluster
mode. Please help me with this.


Rohit

From rohit2525 at gmail.com  Mon Sep 27 18:06:11 2010
From: rohit2525 at gmail.com (Rohit tripathi)
Date: Mon, 27 Sep 2010 23:36:11 +0530
Subject: [Linux-cluster] redhat HA cluster we need shared storage
In-Reply-To: <4CA09FD1.8070409@bobich.net>
References: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>
	<4CA09FD1.8070409@bobich.net>
Message-ID: <AANLkTimJnEFAW6tZQ=+cxNhvZZuT+G75BBnzy18KjuiC@mail.gmail.com>

I need to configure LDAP and WebSphere on an HA cluster.

On Mon, Sep 27, 2010 at 7:14 PM, Gordan Bobic <gordan at bobich.net> wrote:

> If you'll excuse my stating the obvious, you only require shared storage if
> you intend to use it. If you only require resource fail-over, then you
> don't. What is your intended use case?
>
> In any case, you will need operational fencing.
>
> For shared block level storage without a SAN, you can try DRBD.
>
> Gordan
>
> Rohit tripathi wrote:
>
>>  Dear sir
>>  I need to know for redhat HA cluster we need shared storage it is
>> compulsory or not or we can configure HA cluster without shared storage.
>>  regards
>>  Rohit
>>
>>
>> ------------------------------------------------------------------------
>>
>> --
>> Linux-cluster mailing list
>> Linux-cluster at redhat.com
>> https://www.redhat.com/mailman/listinfo/linux-cluster
>>
>

From linux at alteeve.com  Mon Sep 27 18:43:50 2010
From: linux at alteeve.com (Digimer)
Date: Mon, 27 Sep 2010 14:43:50 -0400
Subject: [Linux-cluster] pls help
In-Reply-To: <AANLkTikF4it2wRoO6ErnO4i_99W_Ok6SOY3q8BB0t6QK@mail.gmail.com>
References: <AANLkTikF4it2wRoO6ErnO4i_99W_Ok6SOY3q8BB0t6QK@mail.gmail.com>
Message-ID: <4CA0E5E6.5010603@alteeve.com>

On 10-09-27 01:49 PM, Rohit tripathi wrote:
> thanx for help
>  
> I need one more help to configure IBM DB2 and LDAP  in cluster mode. Pls
> help me in this regards
>  
>  
> Rohit

We'll need to know more about what you are trying to do, what operating
system you are using and what kind of a cluster you want to build.

-- 
Digimer
E-Mail:         linux at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org


From jakov.sosic at srce.hr  Mon Sep 27 19:18:51 2010
From: jakov.sosic at srce.hr (Jakov Sosic)
Date: Mon, 27 Sep 2010 21:18:51 +0200
Subject: [Linux-cluster] redhat HA cluster we need shared storage
In-Reply-To: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>
References: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>
Message-ID: <4CA0EE1B.4060401@srce.hr>

On 09/27/2010 03:12 PM, Rohit tripathi wrote:
> Dear sir
>  
> I need to know for redhat HA cluster we need shared storage it is
> compulsory or not or we can configure HA cluster without shared storage.

No, you do not need shared storage with Red Hat Cluster Suite (whereas it
is a requirement with a two-node Sun Cluster, for example).


-- 
|    Jakov Sosic    |    ICQ: 28410271    |   PGP: 0x965CAE2D   |
=================================================================
| start fighting cancer -> http://www.worldcommunitygrid.org/   |


From gordan at bobich.net  Mon Sep 27 21:33:37 2010
From: gordan at bobich.net (Gordan Bobic)
Date: Mon, 27 Sep 2010 22:33:37 +0100
Subject: [Linux-cluster] redhat HA cluster we need shared storage
In-Reply-To: <AANLkTimJnEFAW6tZQ=+cxNhvZZuT+G75BBnzy18KjuiC@mail.gmail.com>
References: <AANLkTi=zhaPaMotxmwm-JeF70DNp6-dWGcPN45mggYNB@mail.gmail.com>	<4CA09FD1.8070409@bobich.net>
	<AANLkTimJnEFAW6tZQ=+cxNhvZZuT+G75BBnzy18KjuiC@mail.gmail.com>
Message-ID: <4CA10DB1.7020509@bobich.net>

I don't know about WebSphere, but for LDAP you can set up
application-level replication, so for that you certainly don't need
shared storage; just get the LDAP servers to replicate to each other.

Gordan

On 09/27/2010 07:06 PM, Rohit tripathi wrote:
> i need to configure Ldap and webspeher on HA cluster
>
> On Mon, Sep 27, 2010 at 7:14 PM, Gordan Bobic <gordan at bobich.net
> <mailto:gordan at bobich.net>> wrote:
>
>     If you'll excuse my stating the obvious, you only require shared
>     storage if you intend to use it. If you only require resource
>     fail-over, then you don't. What is your intended use case?
>
>     In any case, you will need operational fencing.
>
>     For shared block level storage without a SAN, you can try DRBD.
>
>     Gordan
>
>     Rohit tripathi wrote:
>
>         Dear sir
>           I need to know for redhat HA cluster we need shared storage it
>         is compulsory or not or we can configure HA cluster without
>         shared storage.
>           regards
>           Rohit


From sslohar at gmail.com  Tue Sep 28 06:44:48 2010
From: sslohar at gmail.com (santosh lohar)
Date: Tue, 28 Sep 2010 12:14:48 +0530
Subject: [Linux-cluster] cluster issue
Message-ID: <AANLkTikZGrpMMZWB=8va7OYD2k-Wk7pAhysm4xNkwFMn@mail.gmail.com>

Hi all,

I am facing a problem with SGE and FlexLM licensing. Details are below.

*Hardware*: IBM 3650, 2 quad-core CPUs, 16 GB RAM; total number of nodes: 2 +
one master node, connected through an IB switch.
*Software*: ROCKS 5.1 / OS: RHEL4 Mars Hill / Fluent / MSC Mentat.

Problems:
1. When I submit jobs through SGE, "qhost -F MDAdv" shows the updated
status of licenses issued and available, but when I submit jobs outside
SGE, it is not able to recognize the latest status of the license tokens.
2. When jobs are submitted beyond 4 CPUs, the cluster computation slows
down.

Please suggest what to do in this case. Thanks in advance.

Regards
Santosh


On Mon, Sep 27, 2010 at 11:07 PM, <linux-cluster-request at redhat.com> wrote:

> Send Linux-cluster mailing list submissions to
>        linux-cluster at redhat.com
>
> To subscribe or unsubscribe via the World Wide Web, visit
>        https://www.redhat.com/mailman/listinfo/linux-cluster
> or, via email, send a message with subject or body 'help' to
>        linux-cluster-request at redhat.com
>
> You can reach the person managing the list at
>        linux-cluster-owner at redhat.com
>
> When replying, please edit your Subject line so it is more specific
> than "Re: Contents of Linux-cluster digest..."
>
>
> Today's Topics:
>
>   1. Unable to patch conga (fosiul alam)
>   2. Re: ricci is very unstable in one nodes (Paul M. Dyer)
>   3. Re: porblem with quorum at cluster boot (brem belguebli)
>   4. Re: ricci is very unstable in one nodes (fosiul alam)
>   5. Re: ricci is very unstable in one nodes (fosiul alam)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 27 Sep 2010 17:02:20 +0100
> From: fosiul alam <expertalert at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Subject: [Linux-cluster] Unable to patch conga
> Message-ID:
>        <AANLkTimdQNO3x3g5EKc2ETMPePf3iA-Cptiih6rLb4Au at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> hi
> Due to the same issue, I see exact same problem in my luci interface
> so i am trying to patch conga.
>
> I downloaded ,
>
>
> http://mirrors.kernel.org/centos/5/os/SRPMS/conga-0.12.2-12.el5.centos.1.src.rpm
> rpm -i conga-0.12.2-12.el5.centos.1.src.rpm
> cd /usr/src/redhat/SOURCE
>
> tar -xvzf conga-0.12.2.tar.gz
> patch -p0 < /path/to/where_the_patch/ricci.patch
>
> [root at beaver SOURCES]# cd conga-0.12.2
>
> Now i am facing the problem to install
>
> ./autogen.sh --include_zope_and_plone=yes
> Zope-2.9.8-final.tgz passed sha512sum test
> Plone-2.5.5.tar.gz passed sha512sum test
> cat: clustermon.spec.in.in: No such file or directory
>
> Run `./configure` to configure conga build,
> or `make srpms` to build conga and clustermon srpms
> or `make rpms` to build all rpms
>
> [root at beaver conga-0.12.2]#  ./configure --include_zope_and_plone=yes
> D-BUS version 1.1.2 detected  -> major 1, minor 1
> missing zope directory, extract zope source-code into it and try again
>
>
> Now, how will i tell ./configure where is zope and plone ?
> do i need this zope and plone ?
>
> Please give me some advise
>
> Fosiul
>
> ------------------------------
>
> Message: 2
> Date: Mon, 27 Sep 2010 11:55:28 -0500 (CDT)
> From: "Paul M. Dyer" <pmdyer at ctgcentral2.com>
> To: linux clustering <linux-cluster at redhat.com>
> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
> Message-ID: <1480320.10.1285606528829.JavaMail.root at athena>
> Content-Type: text/plain; charset=utf-8
>
> http://rhn.redhat.com/errata/RHBA-2010-0716.html
>
> It appears that this problem has been fixed in this errata.
>
> I installed the luci and ricci updates and did some lite testing.   So far,
> the timeout 11111 error has not shown up.
>
> Paul
>
> ----- Original Message -----
> From: "fosiul alam" <expertalert at gmail.com>
> To: "linux clustering" <linux-cluster at redhat.com>
> Sent: Monday, September 27, 2010 10:48:27 AM
> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
>
> Hi
> i am trying to patch ricci . let see how it goes
>
> but clusvcadm is failing as well
>
> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
> Member http1.xxxx.local trying to enable service:httpd1...Invalid
> operation for resource
>
> here, http1 , where i was trying to run the service from luci
>
> what could be the problem ?
> is there any way to find out if there is any problem with config ??
>
> On 27 September 2010 16:26, Ben Turner < bturner at redhat.com > wrote:
>
>
> RHEL 5.6 hasn't been released yet so your package probably contains the
> problem. I'm not sure how in sync Centos is with RHEL or if they patch
> earlier so I cannot give you a time frame when it will be in Centos or
> if they have already patched it. The problem in that BZ is more of an
> annoyance, you usually just have to retry a time or two and it works. If
> you can't get Luci working properly with your service at all you should
> try enabling the service through the command line with clusvcadm -e. If
> it is not working from the command line either then there is a problem
> with the service config.
>
>
>
>
> -Ben
>
>
>
>
> ----- "fosiul alam" < expertalert at gmail.com > wrote:
>
> > Hi Ben
> > Thanks
> >
> > I named this cluster as mysql-server but i have not installed mysql
> > database in their yet
> >
> > and both luci and ricci on luci server and node1 is running this
> > version
> >
> > luci-0.12.2-12.el5.centos.1
> > ricci-0.12.2-12.el5.centos.1
> >
> >
> > do you think this version has problem as well ??
> >
> > thanks for your help
> >
> >
> >
> >
> > On 24 September 2010 15:33, Ben Turner < bturner at redhat.com > wrote:
> >
> >
> > There is an issue with ricci timeouts that was fixed recently:
> >
> > https://bugzilla.redhat.com/show_bug.cgi?id=564490
> >
> > I'm not sure but you may be hitting that bug. Symptoms include: luci
> > isn't able to get the status from the node, timeouts when querying
> > ricci, etc. The fix should be released with 5.6
> >
> > On the mysql service there are some options that you need to set. Here
> > are all the options available to that agent:
> >
> > mysql
> > Defines a MySQL database server
> >
> > > Attribute              Description
> > > config_file            Define configuration file
> > > listen_address         Define an IP address for MySQL server. If the
> > >                        address is not given then first IP address from
> > >                        the service is taken.
> > > mysqld_options         Other command-line options for mysqld
> > > name                   Name
> > > ref                    Reference to existing mysql resource in the
> > >                        resources section.
> > > service_name           Inherit the service name.
> > > shutdown_wait          Wait X seconds for correct end of service shutdown
> > > startup_wait           Wait X seconds for correct end of service startup
> > > __enforce_timeouts     Consider a timeout for operations as fatal.
> > > __failure_expire_time  Amount of time before a failure is forgotten.
> > > __independent_subtree  Treat this and all children as an independent
> > >                        subtree.
> > > __max_failures         Maximum number of failures before returning a
> > >                        failure to a status check.
> >
> > If I recall correctly you may need to tweak:
> >
> > shutdown_wait Wait X seconds for correct end of service shutdown
> > startup_wait Wait X seconds for correct end of service startup
> >
> > There can be problems relocating the DB if it takes too long to
> > start/shutdown. If you are having problems relocating with luci it may
> > be a good idea to test with:
> >
> > # clusvcadm -r <service name> -m <cluster node>
> >
> > -Ben
> >
> >
> >
> >
> >
> >
> > ----- "fosiul alam" < expertalert at gmail.com > wrote:
> >
> > > Hi
> > > I have 4 nodes cluster,
> > > It was running fine. but today one nodes is giving trouble
> > >
> > > From luci Gui interface, when i try to relocate service into this
> > node
> > > and trying to relocate from this nodes to another nodes
> > >
> > > from luci gui interface, its showing :
> > >
> > > Unable to retrieve batch 1908047789 status from
> > > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
> > > Starting cluster service "httpd1" on node "http1.domain.local" --
> > You
> > > will be redirected in 5 seconds.
> > > also
> > >
> > > The ricci agent for this node is unresponsive. Node-specific
> > > information is not available at this time. :
> > >
> > > but ricci is running on problematic node ,
> > > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
> > >
> > > there is not any firewall running.
> > >
> > > iptables -L
> > > Chain INPUT (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain FORWARD (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain OUTPUT (policy ACCEPT)
> > > target prot opt source destination
> > >
> > > Chain RH-Firewall-1-INPUT (0 references)
> > > target prot opt source destination
> > >
> > > port 11111 is runningg
> > >
> > > netstat -an | grep 11111
> > > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
> > >
> > >
> > > but still ricci is very unstable , and i cant relocate any service
> > on
> > > this node or i cant relocate any service away from this node.
> > >
> > > from problematic node if i type this
> > >
> > > clustat
> > > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
> > > Member Status: Quorate
> > >
> > > > Member Name              ID   Status
> > > > ------ ----              ---- ------
> > > > beaver.xxx.local         1    Online, rgmanager  ::: luci is running from this server
> > > > publicdns1.xxxx.local    2    Online, rgmanager
> > > > http1.xxxx.local         3    Online, Local, rgmanager
> > > > mail01.xxxxx.local       4    Online, rgmanager
> > > >
> > > > Service Name             Owner (Last)             State
> > > > ------- ----             ----- ------             -----
> > > > service:httpd1           mail01.xxxx.local        started
> > > > service:mysql-server     http1.xxxx.local         started  ------------------- this is the problematic node
> > > > service:public-dns       publicdns1.xxxxxx.local  started
> > >
> > > I cant move that service mysql-server from this node or cant
> > relocate
> > > any service on this node ..
> > > I am very confused.
> > >
> > > what shall i do to fix this issue ??
> > >
> > > thanks for your advise.
> > >
> > >
> > >
> > >
> > > -- Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> > -- Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> > -- Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>
> -- Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
> -- Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
>
>
> ------------------------------
>
> Message: 3
> Date: Mon, 27 Sep 2010 19:05:06 +0200
> From: brem belguebli <brem.belguebli at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Subject: Re: [Linux-cluster] porblem with quorum at cluster boot
> Message-ID:
>        <AANLkTi=FOA-cj5hg11zBmZdzWyQiMpPCM9FZiKgFQHH9 at mail.gmail.com>
> Content-Type: text/plain; charset="iso-8859-1"
>
> The configuration you are trying to build, 2 cluster nodes (1 vote each)
> plus a quorum disk with 1 vote (expected_votes = 3 in total), must remain up
> if you lose 1 of the members (as long as the remaining node can still access
> the quorum disk), because there are still 2 active votes (1 remaining node
> + 1 quorum disk), and 2 > expected_votes/2.
>
> Quorum (a majority) requires strictly more than expected_votes/2 votes
> (more than 50%) for service to continue.
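
The vote arithmetic described above can be checked with a small sketch. This
is only an illustration of the "strictly more than half" rule, not cman's
actual code; the function names quorum_threshold and is_quorate are made up
for the example.

```python
def quorum_threshold(expected_votes: int) -> int:
    """Smallest vote count strictly greater than expected_votes / 2."""
    return expected_votes // 2 + 1


def is_quorate(active_votes: int, expected_votes: int) -> bool:
    """True when the active votes reach the majority threshold."""
    return active_votes >= quorum_threshold(expected_votes)


if __name__ == "__main__":
    expected = 3  # two nodes (1 vote each) + quorum disk (1 vote)
    print("threshold:", quorum_threshold(expected))
    print("one node + qdisk quorate:", is_quorate(2, expected))
    print("one node alone quorate:", is_quorate(1, expected))
```

With expected_votes = 3 the threshold comes out as 2, which matches the
"Quorum: 2" line in the cman_tool status output quoted in this thread.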
>
>
> 2010/9/27 Bennie R Thomas <Bennie_R_Thomas at raytheon.com>
>
> > Try setting your expected votes to 2 or 1.
> >
> > Your cluster is hanging with one node because it wants 3 votes.
> >
> >
> >
> >   From: Brem Belguebli <brem.belguebli at gmail.com> To: linux clustering <
> > linux-cluster at redhat.com> Date: 09/25/2010 10:30 AM Subject: Re:
> > [Linux-cluster] porblem with quorum at cluster boot Sent by:
> > linux-cluster-bounces at redhat.com
> > ------------------------------
> >
> >
> >
> > On Fri, 2010-09-24 at 12:52 -0400, Jason_Henderson at Mitel.com wrote:
> > >
> > > I think you still need two_node="1" in your conf file if you want a
> > > single node to become quorate.
> > >
> > two_node="1" is only valid if you do not have a quorum disk.
> >
> > > linux-cluster-bounces at redhat.com wrote on 09/24/2010 12:38:17 PM:
> > >
> > > > hello,
> > > >
> > > > I have a 2 node cluster with qdisk quorum partition;
> > > >
> > > > > each node has 1 vote and the qdisk has 1 vote too; in cluster.conf I
> > > > > have this explicit declaration:
> > > > > <cman expected_votes="3" two_node="0"/>
> > > > >
> > > > > When both nodes are active, cman_tool status tells me this:
> > > >
> > > > Version: 6.1.0
> > > > Nodes: 2
> > > > Expected votes: 3
> > > > Quorum device votes: 1
> > > > Total votes: 3
> > > > Node votes: 1
> > > > Quorum: 2
> > > >
> > > > > Then, if I power off a node, these values change, as expected, to:
> > > > Nodes: 1
> > > > Total votes: 2
> > > >
> > > > and the cluster is still quorate and functional.
> > > >
> > > > > The problem is if I power off both nodes and then power on only one
> > > > > of them: in this case the single node does not become quorate and the
> > > > > cluster does not start. I have to power on both nodes to get the
> > > > > cluster (and the services on it) working.
> > > > >
> > > > > I'd like the cluster to work (and boot) even with a single node (i.e.,
> > > > > if one of the nodes has a hardware failure and is down, I still want to
> > > > > be able to reboot the working node and have it bring up the cluster
> > > > > correctly).
> > > > >
> > > > > Any hints? (Thanks for reading all this.)
> > > >
> > > > --
> > > > bye,
> > > > emilio
> > > >
> > > > --
> > > > Linux-cluster mailing list
> > > > Linux-cluster at redhat.com
> > > > https://www.redhat.com/mailman/listinfo/linux-cluster
> > > --
> > > Linux-cluster mailing list
> > > Linux-cluster at redhat.com
> > > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
>
> ------------------------------
>
> Message: 4
> Date: Mon, 27 Sep 2010 18:31:31 +0100
> From: fosiul alam <expertalert at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
> Message-ID:
>        <AANLkTikwtYxG3_gf0QxqJpGzZxowh4T7rGbwH-+MhWs8 at mail.gmail.com<AANLkTikwtYxG3_gf0QxqJpGzZxowh4T7rGbwH-%2BMhWs8 at mail.gmail.com>
> >
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi
> Thanks for your advice.
> Currently I have these:
>
> luci-0.12.2-12.el5.centos.1
> ricci-0.12.2-12.el5.centos.1
>
> Are these the same as the following RPMs?
>
> luci-0.12.2-12.el5_5.4.i386.rpm
> ricci-0.12.2-12.el5_5.4.i386.rpm
>
> Thanks
>
>
> On 27 September 2010 17:55, Paul M. Dyer <pmdyer at ctgcentral2.com> wrote:
>
> > http://rhn.redhat.com/errata/RHBA-2010-0716.html
> >
> > It appears that this problem has been fixed in this erratum.
> >
> > I installed the luci and ricci updates and did some light testing. So
> > far, the timeout 11111 error has not shown up.
> >
> > Paul
> >
> > ----- Original Message -----
> > From: "fosiul alam" <expertalert at gmail.com>
> > To: "linux clustering" <linux-cluster at redhat.com>
> > Sent: Monday, September 27, 2010 10:48:27 AM
> > Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
> >
> > Hi
> > I am trying to patch ricci. Let's see how it goes.
> >
> > But clusvcadm is failing as well:
> >
> > [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
> > Member http1.xxxx.local trying to enable service:httpd1...Invalid
> > operation for resource
> >
> > Here, http1 is the node where I was trying to run the service from luci.
> >
> > What could be the problem?
> > Is there any way to find out whether there is a problem with the config?
> >
> > On 27 September 2010 16:26, Ben Turner < bturner at redhat.com > wrote:
> >
> >
> > RHEL 5.6 hasn't been released yet so your package probably contains the
> > problem. I'm not sure how in sync Centos is with RHEL or if they patch
> > earlier so I cannot give you a time frame when it will be in Centos or
> > if they have already patched it. The problem in that BZ is more of an
> > annoyance, you usually just have to retry a time or two and it works. If
> > you can't get Luci working properly with your service at all you should
> > try enabling the service through the command line with clusvcadm -e. If
> > it is not working from the command line either then there is a problem
> > with the service config.
> >
> >
> >
> >
> > -Ben
> >
> >
> >
> >
> > ----- "fosiul alam" < expertalert at gmail.com > wrote:
> >
> > > Hi Ben
> > > Thanks
> > >
> > > I named it mysql-server, but I have not installed a MySQL database
> > > there yet.
> > >
> > > Both the luci server and node1 are running these versions of luci and
> > > ricci:
> > >
> > > luci-0.12.2-12.el5.centos.1
> > > ricci-0.12.2-12.el5.centos.1
> > >
> > >
> > > Do you think this version has the problem as well?
> > >
> > > Thanks for your help
> > >
> > >
> > >
> > >
> > > On 24 September 2010 15:33, Ben Turner < bturner at redhat.com > wrote:
> > >
> > >
> > > There is an issue with ricci timeouts that was fixed recently:
> > >
> > > https://bugzilla.redhat.com/show_bug.cgi?id=564490
> > >
> > > I'm not sure but you may be hitting that bug. Symptoms include: luci
> > > isn't able to get the status from the node, timeouts when querying
> > > ricci, etc. The fix should be released with 5.6
> > >
> > > On the mysql service there are some options that you need to set. Here
> > > are all the options available to that agent:
> > >
> > > mysql
> > > Defines a MySQL database server
> > >
> > > Attribute               Description
> > > config_file             Define configuration file
> > > listen_address          Define an IP address for MySQL server. If the
> > >                         address is not given then first IP address from
> > >                         the service is taken.
> > > mysqld_options          Other command-line options for mysqld
> > > name                    Name
> > > ref                     Reference to existing mysql resource in the
> > >                         resources section.
> > > service_name            Inherit the service name.
> > > shutdown_wait           Wait X seconds for correct end of service shutdown
> > > startup_wait            Wait X seconds for correct end of service startup
> > > __enforce_timeouts      Consider a timeout for operations as fatal.
> > > __failure_expire_time   Amount of time before a failure is forgotten.
> > > __independent_subtree   Treat this and all children as an independent
> > >                         subtree.
> > > __max_failures          Maximum number of failures before returning a
> > >                         failure to a status check.
> > >
> > > If I recall correctly you may need to tweak:
> > >
> > > shutdown_wait Wait X seconds for correct end of service shutdown
> > > startup_wait Wait X seconds for correct end of service startup
> > >
> > > There can be problems relocating the DB if it takes too long to
> > > start/shutdown. If you are having problems relocating with luci it may
> > > be a good idea to test with:
> > >
> > > # clusvcadm -r <service name> -m <cluster node>
> > >
> > > -Ben
> > >
> > >
> > >
> > >
> > >
> > >
> > > ----- "fosiul alam" < expertalert at gmail.com > wrote:
> > >
> > > > Hi
> > > > I have a 4-node cluster.
> > > > It was running fine, but today one node is giving trouble.
> > > >
> > > > In the luci GUI, when I try to relocate a service onto this node, or
> > > > from this node to another node, it shows:
> > > >
> > > > Unable to retrieve batch 1908047789 status from
> > > > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
> > > > Starting cluster service "httpd1" on node "http1.domain.local" -- You
> > > > will be redirected in 5 seconds.
> > > >
> > > > and also:
> > > >
> > > > The ricci agent for this node is unresponsive. Node-specific
> > > > information is not available at this time.
> > > >
> > > > But ricci is running on the problematic node:
> > > > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
> > > >
> > > > There is no firewall running:
> > > >
> > > > iptables -L
> > > > Chain INPUT (policy ACCEPT)
> > > > target prot opt source destination
> > > >
> > > > Chain FORWARD (policy ACCEPT)
> > > > target prot opt source destination
> > > >
> > > > Chain OUTPUT (policy ACCEPT)
> > > > target prot opt source destination
> > > >
> > > > Chain RH-Firewall-1-INPUT (0 references)
> > > > target prot opt source destination
> > > >
> > > > Port 11111 is listening:
> > > >
> > > > netstat -an | grep 11111
> > > > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
> > > >
> > > >
> > > > But ricci is still very unstable, and I can't relocate any service to
> > > > this node or away from it.
> > > >
> > > > If I run this on the problematic node:
> > > >
> > > > clustat
> > > > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
> > > > Member Status: Quorate
> > > >
> > > > Member Name              ID  Status
> > > > ------ ----              --  ------
> > > > beaver.xxx.local          1  Online, rgmanager   (luci is running from this server)
> > > > publicdns1.xxxx.local     2  Online, rgmanager
> > > > http1.xxxx.local          3  Online, Local, rgmanager
> > > > mail01.xxxxx.local        4  Online, rgmanager
> > > >
> > > > Service Name           Owner (Last)              State
> > > > ------- ----           ----- ------              -----
> > > > service:httpd1         mail01.xxxx.local         started
> > > > service:mysql-server   http1.xxxx.local          started   (this is the problematic node)
> > > > service:public-dns     publicdns1.xxxxxx.local   started
> > > >
> > > > I can't move the mysql-server service off this node, and I can't
> > > > relocate any service onto it.
> > > > I am very confused.
> > > >
> > > > What should I do to fix this issue?
> > > >
> > > > Thanks for your advice.
> > > >
> >
>
> ------------------------------
>
> Message: 5
> Date: Mon, 27 Sep 2010 18:37:44 +0100
> From: fosiul alam <expertalert at gmail.com>
> To: linux clustering <linux-cluster at redhat.com>
> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
> Message-ID:
>        <AANLkTi=DfrVMFkp8No9UbwD+fVoRx9FmpO+qzY2RxLPk at mail.gmail.com<DfrVMFkp8No9UbwD%2BfVoRx9FmpO%2BqzY2RxLPk at mail.gmail.com>
> >
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi, in addition to my previous email, have a look at this one,
>
> from http1 (where I am trying to relocate a service):
>
> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
> Member http1.xxxx.local trying to enable service:httpd1...Success
> Warning: service:httpd1 is now running on mail01.xxxx.local
>
> So it reports Success,
> but the service did not actually start on http1.
>
> Thanks again
>
>
> ------------------------------
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>
> End of Linux-cluster Digest, Vol 77, Issue 23
> *********************************************
>


-- 
Santosh

From expertalert at gmail.com  Tue Sep 28 10:32:54 2010
From: expertalert at gmail.com (fosiul alam)
Date: Tue, 28 Sep 2010 11:32:54 +0100
Subject: [Linux-cluster] ricci is very unstable in one nodes
In-Reply-To: <AANLkTi=DfrVMFkp8No9UbwD+fVoRx9FmpO+qzY2RxLPk@mail.gmail.com>
References: <AANLkTimo0R=C1XP8KwoXeyO=VWNVnFckkiXUZnrjBgs0@mail.gmail.com>
	<1480320.10.1285606528829.JavaMail.root@athena>
	<AANLkTikwtYxG3_gf0QxqJpGzZxowh4T7rGbwH-+MhWs8@mail.gmail.com>
	<AANLkTi=DfrVMFkp8No9UbwD+fVoRx9FmpO+qzY2RxLPk@mail.gmail.com>
Message-ID: <AANLkTinYo3JmCqLgQEvqNFM5N4DLHnWAdbQunO9ymqYO@mail.gmail.com>

Hi,

I found something interesting, but I don't know whether it is normal.

I ran this command on three cluster nodes:

tcpdump -i eth0 ip multicast


On all three servers I see the same output, for example:

11:26:13.700399 IP http1.xxxxx.local.5149 > 239.192.2.185.netsupport: UDP, length 118


Is this normal? (http1 is the node that has trouble starting or relocating
services in the cluster.)

Basically, whatever I see on http1 I also see on the other nodes.

Here 239.192.2.185 is the multicast address of the cluster.
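
As a quick sanity check on the address itself, here is a minimal sketch using
Python's standard ipaddress module. It only confirms that 239.192.2.185 is a
multicast address and that it falls in the organization-local scope
239.192.0.0/14 (RFC 2365). The helper name describe is made up for the
example, and nothing here inspects the cluster configuration.

```python
import ipaddress

# Organization-local administratively scoped multicast range (RFC 2365).
ORG_LOCAL_SCOPE = ipaddress.ip_network("239.192.0.0/14")


def describe(addr: str) -> dict:
    """Report whether addr is multicast and whether it is in 239.192.0.0/14."""
    ip = ipaddress.ip_address(addr)
    return {
        "multicast": ip.is_multicast,
        "org_local": ip in ORG_LOCAL_SCOPE,
    }


if __name__ == "__main__":
    print(describe("239.192.2.185"))
```

Traffic sent to a multicast group is delivered to every host that has joined
the group, so if all cluster nodes have joined 239.192.2.185, each of them
would be expected to see the packets that http1 sends to it.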

Thanks
fosiul

On 27 September 2010 18:37, fosiul alam <expertalert at gmail.com> wrote:

> Hi, in addition to my previous email, have a look at this one,
>
> from http1 (where I am trying to relocate a service):
>
>
> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
> Member http1.xxxx.local trying to enable service:httpd1...Success
> Warning: service:httpd1 is now running on mail01.xxxx.local
>
> So it reports Success,
> but the service did not actually start on http1.
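
The mismatch in the output above ("Success", followed by a warning naming a
different node) can be spotted mechanically. The sketch below parses only the
two message shapes quoted in this thread; the regular expressions and the
function name check_enable_output are assumptions made for illustration, not
a documented clusvcadm output format.

```python
import re

# Patterns modelled on the two output lines quoted in this thread.
ENABLE_RE = re.compile(r"Member (\S+) trying to enable (service:\S+?)\.\.\.(.+)")
WARNING_RE = re.compile(r"Warning: (service:\S+) is now running on (\S+)")


def check_enable_output(output: str) -> dict:
    """Compare the member that was requested with where the service ended up."""
    requested = result = actual = None
    for line in output.splitlines():
        line = line.strip()
        m = ENABLE_RE.match(line)
        if m:
            requested, result = m.group(1), m.group(3).strip()
            continue
        m = WARNING_RE.match(line)
        if m:
            actual = m.group(2)
    # Without a warning line, assume a successful enable landed where requested.
    if actual is None and result == "Success":
        actual = requested
    return {
        "requested_member": requested,
        "result": result,
        "running_on": actual,
        "on_requested_member": requested is not None and actual == requested,
    }


if __name__ == "__main__":
    sample = (
        "Member http1.xxxx.local trying to enable service:httpd1...Success\n"
        "Warning: service:httpd1 is now running on mail01.xxxx.local\n"
    )
    print(check_enable_output(sample))
```

For the output quoted above, result is "Success" but on_requested_member is
False, which is the situation being described: the command succeeded, yet the
service is running on mail01 rather than on http1.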
>
> Thanks again
>
>

From marcello.percoco at diennea.com  Tue Sep 28 10:44:26 2010
From: marcello.percoco at diennea.com (Marcello Percoco - Diennea)
Date: Tue, 28 Sep 2010 12:44:26 +0200
Subject: [Linux-cluster] R:  Gfs2 Problem
In-Reply-To: <1285602746.2476.23.camel@dolmen>
References: <CA2F45405F1F27488F305C7863282FB974D76013@dnaexc01.diennea.lan>
	<1285602746.2476.23.camel@dolmen>
Message-ID: <CA2F45405F1F27488F305C7863282FB974D7605A@dnaexc01.diennea.lan>

This is the dmesg

Sep 28 10:12:16 sviluppo05 kernel: INFO: task java:32140 blocked for more than 120 seconds.
Sep 28 10:12:16 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:12:16 sviluppo05 kernel: java          D ffff81000100caa0     0 32140   9284         10218 30744 (NOTLB)
Sep 28 10:12:16 sviluppo05 kernel:  ffff810082e1dba8 0000000000000082 0000000000000000 ffff8101376e9800
Sep 28 10:12:16 sviluppo05 kernel:  0000000000000292 0000000000000009 ffff81006bf49860 ffff81010474a0c0
Sep 28 10:12:16 sviluppo05 kernel:  00003d141351f6a7 00000000002495aa ffff81006bf49a48 00000001884a2e5a
Sep 28 10:12:16 sviluppo05 kernel: Call Trace:
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884daebb>] :gfs2:gfs2_setattr+0x39/0x367
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884daeb3>] :gfs2:gfs2_setattr+0x31/0x367
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff800645ab>] __down_write_nested+0x12/0x92
Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff8002c893>] notify_change+0x145/0x2f3
Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff800dfb98>] do_truncate+0x67/0x82
Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff800126c0>] may_open+0x1d3/0x22f
Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff8001b200>] open_namei+0x2c4/0x6d5
Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff80027533>] do_filp_open+0x1c/0x38
Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff80019e5d>] do_sys_open+0x44/0xbe
Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 10:12:17 sviluppo05 kernel:
Sep 28 10:14:16 sviluppo05 kernel: INFO: task java:32140 blocked for more than 120 seconds.
Sep 28 10:14:16 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:14:16 sviluppo05 kernel: java          D ffff81000100caa0     0 32140   9284         10218 30744 (NOTLB)
Sep 28 10:14:16 sviluppo05 kernel:  ffff810082e1dba8 0000000000000082 0000000000000000 ffff8101376e9800
Sep 28 10:14:16 sviluppo05 kernel:  0000000000000292 0000000000000009 ffff81006bf49860 ffff81010474a0c0
Sep 28 10:14:16 sviluppo05 kernel:  00003d141351f6a7 00000000002495aa ffff81006bf49a48 00000001884a2e5a
Sep 28 10:14:16 sviluppo05 kernel: Call Trace:
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884daebb>] :gfs2:gfs2_setattr+0x39/0x367
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884daeb3>] :gfs2:gfs2_setattr+0x31/0x367
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff800645ab>] __down_write_nested+0x12/0x92
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff8002c893>] notify_change+0x145/0x2f3
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff800dfb98>] do_truncate+0x67/0x82
Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff800126c0>] may_open+0x1d3/0x22f
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8001b200>] open_namei+0x2c4/0x6d5
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80027533>] do_filp_open+0x1c/0x38
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80019e5d>] do_sys_open+0x44/0xbe
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 10:14:17 sviluppo05 kernel:
Sep 28 10:14:17 sviluppo05 kernel: INFO: task java:10243 blocked for more than 120 seconds.
Sep 28 10:14:17 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:14:17 sviluppo05 kernel: java          D ffff81010461c440     0 10243   6377         10244  9763 (NOTLB)
Sep 28 10:14:17 sviluppo05 kernel:  ffff810087ec3cf8 0000000000000086 0000000000000018 ffffffff884a14f3
Sep 28 10:14:17 sviluppo05 kernel:  0000000000000292 0000000000000008 ffff81013e82c7a0 ffff81000d2d5860
Sep 28 10:14:17 sviluppo05 kernel:  00003d2758ee2f11 000000000008805b ffff81013e82c988 00000002884a2e5a
Sep 28 10:14:17 sviluppo05 kernel: Call Trace:
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 10:14:17 sviluppo05 kernel:
Sep 28 10:14:17 sviluppo05 kernel: INFO: task java:10244 blocked for more than 120 seconds.
Sep 28 10:14:17 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:14:17 sviluppo05 kernel: java          D ffff81010461c440     0 10244   6377         10245 10243 (NOTLB)
Sep 28 10:14:17 sviluppo05 kernel:  ffff810087ec1cf8 0000000000000086 0000000000000018 ffffffff884a14f3
Sep 28 10:14:17 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff81000d2d5860 ffff8100579d77a0
Sep 28 10:14:17 sviluppo05 kernel:  00003d27591218a7 000000000006e36c ffff81000d2d5a48 00000002884a2e5a
Sep 28 10:14:17 sviluppo05 kernel: Call Trace:
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 10:14:18 sviluppo05 kernel:
Sep 28 10:14:18 sviluppo05 kernel: INFO: task java:10245 blocked for more than 120 seconds.
Sep 28 10:14:18 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:14:18 sviluppo05 kernel: java          D ffff81010461c3f8     0 10245   6377         10246 10244 (NOTLB)
Sep 28 10:14:18 sviluppo05 kernel:  ffff8100b9cf3cf8 0000000000000086 0000000000000018 ffffffff884a14f3
Sep 28 10:14:18 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff8100579d77a0 ffff8100a1a1a820
Sep 28 10:14:18 sviluppo05 kernel:  00003d275934bd1a 000000000006fd3d ffff8100579d7988 00000002884a2e5a
Sep 28 10:14:18 sviluppo05 kernel: Call Trace:
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 10:14:18 sviluppo05 kernel:
Sep 28 10:14:18 sviluppo05 kernel: INFO: task java:10246 blocked for more than 120 seconds.
Sep 28 10:14:18 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:14:18 sviluppo05 kernel: java          D ffff81010461c3f8     0 10246   6377         10247 10245 (NOTLB)
Sep 28 10:14:18 sviluppo05 kernel:  ffff8100b9cf5cf8 0000000000000086 0000000000000018 ffffffff884a14f3
Sep 28 10:14:18 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff8100a1a1a820 ffff810063432080
Sep 28 10:14:18 sviluppo05 kernel:  00003d275958c9ee 000000000000844e ffff8100a1a1aa08 00000002884a2e5a
Sep 28 10:14:18 sviluppo05 kernel: Call Trace:
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 10:14:19 sviluppo05 kernel:
Sep 28 10:14:19 sviluppo05 kernel: INFO: task java:10247 blocked for more than 120 seconds.
Sep 28 10:14:19 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:14:19 sviluppo05 kernel: java          D ffff81000100caa0     0 10247   6377         10248 10246 (NOTLB)
Sep 28 10:14:19 sviluppo05 kernel:  ffff8100b9cf7cf8 0000000000000086 0000000000000018 ffffffff884a14f3
Sep 28 10:14:19 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff810063432080 ffff81010474a0c0
Sep 28 10:14:19 sviluppo05 kernel:  00003d275c9772b8 0000000000050cc8 ffff810063432268 00000001884a2e5a
Sep 28 10:14:19 sviluppo05 kernel: Call Trace:
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 10:14:19 sviluppo05 kernel:
Sep 28 10:14:19 sviluppo05 kernel: INFO: task java:10248 blocked for more than 120 seconds.
Sep 28 10:14:19 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:14:19 sviluppo05 kernel: java          D ffff81010461c4a0     0 10248   6377         10249 10247 (NOTLB)
Sep 28 10:14:19 sviluppo05 kernel:  ffff81004da09cf8 0000000000000086 0000000000000018 ffffffff884a14f3
Sep 28 10:14:19 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff8100147810c0 ffff81006bf49100
Sep 28 10:14:19 sviluppo05 kernel:  00003d27599615c3 00000000000087a4 ffff8100147812a8 00000002884a2e5a
Sep 28 10:14:19 sviluppo05 kernel: Call Trace:
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 10:14:20 sviluppo05 kernel:
Sep 28 10:14:20 sviluppo05 kernel: INFO: task java:10249 blocked for more than 120 seconds.
Sep 28 10:14:20 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:14:20 sviluppo05 kernel: java          D ffff81010461c4a0     0 10249   6377         10250 10248 (NOTLB)
Sep 28 10:14:20 sviluppo05 kernel:  ffff81004da0bcf8 0000000000000086 0000000000000018 ffffffff884a14f3
Sep 28 10:14:20 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff81006bf49100 ffff810086b9f080
Sep 28 10:14:20 sviluppo05 kernel:  00003d2759b85cc0 0000000000076e29 ffff81006bf492e8 00000002884a2e5a
Sep 28 10:14:20 sviluppo05 kernel: Call Trace:
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
Sep 28 10:14:20 sviluppo05 kernel:
Sep 28 10:14:20 sviluppo05 kernel: INFO: task java:10250 blocked for more than 120 seconds.
Sep 28 10:14:20 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Sep 28 10:14:20 sviluppo05 kernel: java          D ffff81010461c4a0     0 10250   6377         10251 10249 (NOTLB)
Sep 28 10:14:20 sviluppo05 kernel:  ffff81004da0dcf8 0000000000000086 0000000000000018 ffffffff884a14f3
Sep 28 10:14:20 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff810086b9f080 ffff8100057ef100
Sep 28 10:14:20 sviluppo05 kernel:  00003d2759dbb866 0000000000075b03 ffff810086b9f268 00000002884a2e5a
Sep 28 10:14:20 sviluppo05 kernel: Call Trace:
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0

--
Marcello Percoco
IT Junior

Diennea
Viale G. Marconi, 30/14
48018 Faenza (RA) - Italy

E-Mail: marcello.percoco at diennea.com
Tel.: (+39) 0546 667432 - Int. 916
Fax:  (+39) 0546 399913

MagNews - E-Mail Marketing Solutions
http://www.magnews.it

Diennea - Technology for Marketing
http://www.diennea.com



From swhiteho at redhat.com  Tue Sep 28 11:00:18 2010
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Tue, 28 Sep 2010 12:00:18 +0100
Subject: [Linux-cluster] R:  Gfs2 Problem
In-Reply-To: <CA2F45405F1F27488F305C7863282FB974D7605A@dnaexc01.diennea.lan>
References: <CA2F45405F1F27488F305C7863282FB974D76013@dnaexc01.diennea.lan>
	<1285602746.2476.23.camel@dolmen>
	<CA2F45405F1F27488F305C7863282FB974D7605A@dnaexc01.diennea.lan>
Message-ID: <1285671618.2457.2.camel@dolmen>

Hi,

That looks to me like normal contention on a glock. The glock dumps
will confirm what is going on here, but there is nothing obviously wrong
in these traces.

Steve.
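
As a rough illustration of that reading of the traces, here is a minimal
Python sketch that groups the blocked tasks by the GFS2 operation they
entered before waiting on a glock. It is not a GFS2 or cluster tool: the
function name blocked_entry_points, the regular expressions, and the
shortened SAMPLE excerpt (a few lines copied from the dmesg output in this
thread) are assumptions made for the example.

```python
import re
from collections import defaultdict

# Hung-task header, e.g. "INFO: task java:32140 blocked for more than 120 seconds."
TASK_RE = re.compile(r"INFO: task (\S+):(\d+) blocked for more than (\d+) seconds")
# Stack frame, e.g. "[<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30"
FRAME_RE = re.compile(r"\[<[0-9a-f]+>\]\s+(?::(\w+):)?(\w+)\+0x[0-9a-f]+/0x[0-9a-f]+")

# Frames that only show the task is waiting, not what it was trying to do.
WAIT_HELPERS = {"just_schedule", "gfs2_glock_wait"}

# Shortened excerpt of the log posted in this thread (illustrative only).
SAMPLE = """\
Sep 28 10:12:16 sviluppo05 kernel: INFO: task java:32140 blocked for more than 120 seconds.
Sep 28 10:12:16 sviluppo05 kernel: Call Trace:
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884daebb>] :gfs2:gfs2_setattr+0x39/0x367
Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff800dfb98>] do_truncate+0x67/0x82
Sep 28 10:14:17 sviluppo05 kernel: INFO: task java:10243 blocked for more than 120 seconds.
Sep 28 10:14:17 sviluppo05 kernel: Call Trace:
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
Sep 28 10:14:17 sviluppo05 kernel: INFO: task java:10244 blocked for more than 120 seconds.
Sep 28 10:14:17 sviluppo05 kernel: Call Trace:
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
"""


def blocked_entry_points(log_text):
    """Return {gfs2 function: sorted pids} for each hung-task report.

    For every "INFO: task ... blocked" report, the first gfs2 frame that is
    not a pure wait helper is taken as the operation the task was attempting.
    """
    by_func = defaultdict(set)
    current_pid = None
    resolved = False
    for line in log_text.splitlines():
        task = TASK_RE.search(line)
        if task:
            current_pid = int(task.group(2))
            resolved = False
            continue
        frame = FRAME_RE.search(line)
        if frame and current_pid is not None and not resolved:
            module, func = frame.group(1), frame.group(2)
            if module == "gfs2" and func not in WAIT_HELPERS:
                by_func[func].add(current_pid)
                resolved = True
    return {func: sorted(pids) for func, pids in by_func.items()}


for func, pids in blocked_entry_points(SAMPLE).items():
    print(func, pids)
```

On the excerpt, this puts pid 32140 under gfs2_setattr (a truncate on open)
and the other tasks under gfs2_getattr (lstat), all of them parked in
gfs2_glock_wait, which fits ordinary glock contention rather than a crash.
The glock dumps are still needed to see who is actually holding the glock.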

On Tue, 2010-09-28 at 12:44 +0200, Marcello Percoco - Diennea wrote:
> This is the dmesg
> 
> Sep 28 10:12:16 sviluppo05 kernel: INFO: task java:32140 blocked for more than 120 seconds.
> Sep 28 10:12:16 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:12:16 sviluppo05 kernel: java          D ffff81000100caa0     0 32140   9284         10218 30744 (NOTLB)
> Sep 28 10:12:16 sviluppo05 kernel:  ffff810082e1dba8 0000000000000082 0000000000000000 ffff8101376e9800
> Sep 28 10:12:16 sviluppo05 kernel:  0000000000000292 0000000000000009 ffff81006bf49860 ffff81010474a0c0
> Sep 28 10:12:16 sviluppo05 kernel:  00003d141351f6a7 00000000002495aa ffff81006bf49a48 00000001884a2e5a
> Sep 28 10:12:16 sviluppo05 kernel: Call Trace:
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884daebb>] :gfs2:gfs2_setattr+0x39/0x367
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff884daeb3>] :gfs2:gfs2_setattr+0x31/0x367
> Sep 28 10:12:16 sviluppo05 kernel:  [<ffffffff800645ab>] __down_write_nested+0x12/0x92
> Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff8002c893>] notify_change+0x145/0x2f3
> Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff800dfb98>] do_truncate+0x67/0x82
> Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff800126c0>] may_open+0x1d3/0x22f
> Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff8001b200>] open_namei+0x2c4/0x6d5
> Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff80027533>] do_filp_open+0x1c/0x38
> Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff80019e5d>] do_sys_open+0x44/0xbe
> Sep 28 10:12:17 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> Sep 28 10:12:17 sviluppo05 kernel:
> Sep 28 10:14:16 sviluppo05 kernel: INFO: task java:32140 blocked for more than 120 seconds.
> Sep 28 10:14:16 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:14:16 sviluppo05 kernel: java          D ffff81000100caa0     0 32140   9284         10218 30744 (NOTLB)
> Sep 28 10:14:16 sviluppo05 kernel:  ffff810082e1dba8 0000000000000082 0000000000000000 ffff8101376e9800
> Sep 28 10:14:16 sviluppo05 kernel:  0000000000000292 0000000000000009 ffff81006bf49860 ffff81010474a0c0
> Sep 28 10:14:16 sviluppo05 kernel:  00003d141351f6a7 00000000002495aa ffff81006bf49a48 00000001884a2e5a
> Sep 28 10:14:16 sviluppo05 kernel: Call Trace:
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884daebb>] :gfs2:gfs2_setattr+0x39/0x367
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff884daeb3>] :gfs2:gfs2_setattr+0x31/0x367
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff800645ab>] __down_write_nested+0x12/0x92
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff8002c893>] notify_change+0x145/0x2f3
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff800dfb98>] do_truncate+0x67/0x82
> Sep 28 10:14:16 sviluppo05 kernel:  [<ffffffff800126c0>] may_open+0x1d3/0x22f
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8001b200>] open_namei+0x2c4/0x6d5
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80027533>] do_filp_open+0x1c/0x38
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80019e5d>] do_sys_open+0x44/0xbe
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> Sep 28 10:14:17 sviluppo05 kernel:
> Sep 28 10:14:17 sviluppo05 kernel: INFO: task java:10243 blocked for more than 120 seconds.
> Sep 28 10:14:17 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:14:17 sviluppo05 kernel: java          D ffff81010461c440     0 10243   6377         10244  9763 (NOTLB)
> Sep 28 10:14:17 sviluppo05 kernel:  ffff810087ec3cf8 0000000000000086 0000000000000018 ffffffff884a14f3
> Sep 28 10:14:17 sviluppo05 kernel:  0000000000000292 0000000000000008 ffff81013e82c7a0 ffff81000d2d5860
> Sep 28 10:14:17 sviluppo05 kernel:  00003d2758ee2f11 000000000008805b ffff81013e82c988 00000002884a2e5a
> Sep 28 10:14:17 sviluppo05 kernel: Call Trace:
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> Sep 28 10:14:17 sviluppo05 kernel:
> Sep 28 10:14:17 sviluppo05 kernel: INFO: task java:10244 blocked for more than 120 seconds.
> Sep 28 10:14:17 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:14:17 sviluppo05 kernel: java          D ffff81010461c440     0 10244   6377         10245 10243 (NOTLB)
> Sep 28 10:14:17 sviluppo05 kernel:  ffff810087ec1cf8 0000000000000086 0000000000000018 ffffffff884a14f3
> Sep 28 10:14:17 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff81000d2d5860 ffff8100579d77a0
> Sep 28 10:14:17 sviluppo05 kernel:  00003d27591218a7 000000000006e36c ffff81000d2d5a48 00000002884a2e5a
> Sep 28 10:14:17 sviluppo05 kernel: Call Trace:
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
> Sep 28 10:14:17 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> Sep 28 10:14:18 sviluppo05 kernel:
> Sep 28 10:14:18 sviluppo05 kernel: INFO: task java:10245 blocked for more than 120 seconds.
> Sep 28 10:14:18 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:14:18 sviluppo05 kernel: java          D ffff81010461c3f8     0 10245   6377         10246 10244 (NOTLB)
> Sep 28 10:14:18 sviluppo05 kernel:  ffff8100b9cf3cf8 0000000000000086 0000000000000018 ffffffff884a14f3
> Sep 28 10:14:18 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff8100579d77a0 ffff8100a1a1a820
> Sep 28 10:14:18 sviluppo05 kernel:  00003d275934bd1a 000000000006fd3d ffff8100579d7988 00000002884a2e5a
> Sep 28 10:14:18 sviluppo05 kernel: Call Trace:
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> Sep 28 10:14:18 sviluppo05 kernel:
> Sep 28 10:14:18 sviluppo05 kernel: INFO: task java:10246 blocked for more than 120 seconds.
> Sep 28 10:14:18 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:14:18 sviluppo05 kernel: java          D ffff81010461c3f8     0 10246   6377         10247 10245 (NOTLB)
> Sep 28 10:14:18 sviluppo05 kernel:  ffff8100b9cf5cf8 0000000000000086 0000000000000018 ffffffff884a14f3
> Sep 28 10:14:18 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff8100a1a1a820 ffff810063432080
> Sep 28 10:14:18 sviluppo05 kernel:  00003d275958c9ee 000000000000844e ffff8100a1a1aa08 00000002884a2e5a
> Sep 28 10:14:18 sviluppo05 kernel: Call Trace:
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
> Sep 28 10:14:18 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> Sep 28 10:14:19 sviluppo05 kernel:
> Sep 28 10:14:19 sviluppo05 kernel: INFO: task java:10247 blocked for more than 120 seconds.
> Sep 28 10:14:19 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:14:19 sviluppo05 kernel: java          D ffff81000100caa0     0 10247   6377         10248 10246 (NOTLB)
> Sep 28 10:14:19 sviluppo05 kernel:  ffff8100b9cf7cf8 0000000000000086 0000000000000018 ffffffff884a14f3
> Sep 28 10:14:19 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff810063432080 ffff81010474a0c0
> Sep 28 10:14:19 sviluppo05 kernel:  00003d275c9772b8 0000000000050cc8 ffff810063432268 00000001884a2e5a
> Sep 28 10:14:19 sviluppo05 kernel: Call Trace:
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> Sep 28 10:14:19 sviluppo05 kernel:
> Sep 28 10:14:19 sviluppo05 kernel: INFO: task java:10248 blocked for more than 120 seconds.
> Sep 28 10:14:19 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:14:19 sviluppo05 kernel: java          D ffff81010461c4a0     0 10248   6377         10249 10247 (NOTLB)
> Sep 28 10:14:19 sviluppo05 kernel:  ffff81004da09cf8 0000000000000086 0000000000000018 ffffffff884a14f3
> Sep 28 10:14:19 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff8100147810c0 ffff81006bf49100
> Sep 28 10:14:19 sviluppo05 kernel:  00003d27599615c3 00000000000087a4 ffff8100147812a8 00000002884a2e5a
> Sep 28 10:14:19 sviluppo05 kernel: Call Trace:
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:14:19 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> Sep 28 10:14:20 sviluppo05 kernel:
> Sep 28 10:14:20 sviluppo05 kernel: INFO: task java:10249 blocked for more than 120 seconds.
> Sep 28 10:14:20 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:14:20 sviluppo05 kernel: java          D ffff81010461c4a0     0 10249   6377         10250 10248 (NOTLB)
> Sep 28 10:14:20 sviluppo05 kernel:  ffff81004da0bcf8 0000000000000086 0000000000000018 ffffffff884a14f3
> Sep 28 10:14:20 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff81006bf49100 ffff810086b9f080
> Sep 28 10:14:20 sviluppo05 kernel:  00003d2759b85cc0 0000000000076e29 ffff81006bf492e8 00000002884a2e5a
> Sep 28 10:14:20 sviluppo05 kernel: Call Trace:
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> Sep 28 10:14:20 sviluppo05 kernel:
> Sep 28 10:14:20 sviluppo05 kernel: INFO: task java:10250 blocked for more than 120 seconds.
> Sep 28 10:14:20 sviluppo05 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Sep 28 10:14:20 sviluppo05 kernel: java          D ffff81010461c4a0     0 10250   6377         10251 10249 (NOTLB)
> Sep 28 10:14:20 sviluppo05 kernel:  ffff81004da0dcf8 0000000000000086 0000000000000018 ffffffff884a14f3
> Sep 28 10:14:20 sviluppo05 kernel:  0000000000000292 0000000000000006 ffff810086b9f080 ffff8100057ef100
> Sep 28 10:14:20 sviluppo05 kernel:  00003d2759dbb866 0000000000075b03 ffff810086b9f268 00000002884a2e5a
> Sep 28 10:14:20 sviluppo05 kernel: Call Trace:
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884a14f3>] :dlm:request_lock+0x93/0xa0
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff88548556>] :lock_dlm:gdlm_ast+0x0/0x311
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff885482c1>] :lock_dlm:gdlm_bast+0x0/0x8d
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccef0>] :gfs2:just_schedule+0x9/0xe
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063a16>] __wait_on_bit+0x40/0x6e
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff884ccee7>] :gfs2:just_schedule+0x0/0xe
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff80063ab0>] out_of_line_wait_on_bit+0x6c/0x78
> Sep 28 10:14:20 sviluppo05 kernel:  [<ffffffff800a0a02>] wake_bit_function+0x0/0x23
> Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff884ccee2>] :gfs2:gfs2_glock_wait+0x2b/0x30
> Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff884db26e>] :gfs2:gfs2_getattr+0x85/0xc4
> Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff884db266>] :gfs2:gfs2_getattr+0x7d/0xc4
> Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8000e390>] vfs_getattr+0x2d/0xa9
> Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8003f214>] vfs_lstat_fd+0x2f/0x47
> Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8002ab82>] sys_newlstat+0x19/0x31
> Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8005d229>] tracesys+0x71/0xe0
> Sep 28 10:14:21 sviluppo05 kernel:  [<ffffffff8005d28d>] tracesys+0xd5/0xe0
> 
> --
> Marcello Percoco
> IT Junior
> 
> Diennea
> Viale G. Marconi, 30/14
> 48018 Faenza (RA) - Italy
> 
> E-Mail: marcello.percoco at diennea.com
> Tel.: (+39) 0546 667432 - Int. 916
> Fax:  (+39) 0546 399913
> 
> MagNews - E-Mail Marketing Solutions
> http://www.magnews.it
> 
> Diennea - Technology for Marketing
> http://www.diennea.com
> 
> DISCLAIMER
> This message and its attachments are addressed exclusively to the recipients and may contain personal, confidential, or rights-protected information. If you have received this message in error, use of its contents is prohibited and may expose you to criminal or civil consequences. We therefore ask you to return the message to the sender immediately and to delete its attachments without keeping a copy. For further information, please contact us at postmaster at diennea.com. Thank you
> This e-mail and any attachments may be confidential and the subject of legal professional privilege. Any disclosure, use, storage or copying of this e-mail without the consent of the sender is strictly prohibited. Please notify the sender immediately if you are not the intended recipient and then delete the e-mail from your inbox and do not disclose the contents to another person, use, copy or store the information in any medium. For further information write to postmaster at diennea.com. Thanks


From marcello.percoco at diennea.com  Tue Sep 28 11:15:00 2010
From: marcello.percoco at diennea.com (Marcello Percoco - Diennea)
Date: Tue, 28 Sep 2010 13:15:00 +0200
Subject: [Linux-cluster] R:  R:  Gfs2 Problem
In-Reply-To: <1285671618.2457.2.camel@dolmen>
References: <CA2F45405F1F27488F305C7863282FB974D76013@dnaexc01.diennea.lan>
	<1285602746.2476.23.camel@dolmen>
	<CA2F45405F1F27488F305C7863282FB974D7605A@dnaexc01.diennea.lan>
	<1285671618.2457.2.camel@dolmen>
Message-ID: <CA2F45405F1F27488F305C7863282FB974D76064@dnaexc01.diennea.lan>

Here is the glock file from debugfs. Thanks.
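
As a rough aid for reading a dump like this one, the following Python sketch lists the holders that are still waiting for their glock. It is only an illustration and not part of gfs2 or gfs2_tool: the function name `waiting_holders` and the regular expressions are assumptions modelled on the line shapes visible in this dump, and it assumes that a `W` in a holder's `f:` field marks a queued (waiting) request, while `H` marks a granted one.

```python
import re

# Assumed line shapes, copied from the dump in this message:
#   "G:  s:UN n:2/2b17d0 f:l t:SH d:EX/0 l:0 a:0 r:4"          (one glock)
#   " H: s:SH f:aW e:0 p:10269 [java] gfs2_getattr+0x7d/0xc4 [gfs2]"  (one holder)
GLOCK_RE = re.compile(
    r"^G:\s+s:(?P<state>\S+)\s+n:(?P<name>\S+)\s+f:(?P<flags>\S*)\s+t:(?P<target>\S+)"
)
HOLDER_RE = re.compile(
    r"^\s*H:\s+s:(?P<state>\S+)\s+f:(?P<flags>\S*)\s+e:\S+\s+"
    r"p:(?P<pid>\d+)\s+\[(?P<comm>[^\]]*)\]\s*(?P<where>.*)$"
)


def waiting_holders(lines):
    """Return one dict per holder line whose flags contain 'W'.

    Each holder is attributed to the most recent "G:" line seen before it.
    Holder lines that appear before any glock line are skipped.
    """
    hits = []
    current = None  # regex match for the glock the next holders belong to
    for line in lines:
        glock = GLOCK_RE.match(line)
        if glock:
            current = glock
            continue
        holder = HOLDER_RE.match(line)
        if holder and current is not None and "W" in holder.group("flags"):
            hits.append({
                "glock": current.group("name"),
                "glock_state": current.group("state"),
                "glock_target": current.group("target"),
                "pid": int(holder.group("pid")),
                "comm": holder.group("comm"),
                "where": holder.group("where"),
            })
    return hits


# A few lines taken verbatim from the dump, used as sample input.
SAMPLE = [
    "G:  s:SH n:5/2b3601 f: t:SH d:EX/0 l:0 a:0 r:3",
    " H: s:SH f:EH e:0 p:8170 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]",
    "G:  s:UN n:2/2b17d0 f:l t:SH d:EX/0 l:0 a:0 r:4",
    " H: s:SH f:aW e:0 p:10269 [java] gfs2_getattr+0x7d/0xc4 [gfs2]",
    "G:  s:EX n:2/32eb34 f:y t:EX d:EX/0 l:0 a:0 r:2",
]

HITS = waiting_holders(SAMPLE)
for hit in HITS:
    print(hit["glock"], hit["glock_state"], "->", hit["glock_target"],
          hit["pid"], hit["comm"], hit["where"])
```

On a live node the same function could be fed the lines of the glocks file under the debugfs mount point; the exact path depends on where debugfs is mounted and on the filesystem's lock table name.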

H: s:SH f:EH e:0 p:7243 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b6c54 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1a14 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:11377 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/30ac2a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b35c3 f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:97298/2831811 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/2b750b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a7e8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7b92 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4900f4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20555a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2054fc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4789c6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4778a6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47733f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c602 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bed1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477bed f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2058e8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7e48 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32798f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7a70 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204e5e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a453 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e861 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202262 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328666 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327f94 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cad7d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e8da f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7ca9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205c83 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ae8c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481889 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7798 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2025ed f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477641 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b183a f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:95282/2824250 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/478004 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1b0e f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/26e8a6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca7b4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca67e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204ece f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e15a2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:3/30e2a6 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20345f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6208 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6033 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ba11 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6cc3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477f18 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f69d4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a041 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fac3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fd45 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3263e2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478a9e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c3dc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20495c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2041ea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203d97 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb738 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3284ce f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205019 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b61e1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478757 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205330 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7a93 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5c7b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478abc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f766b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47855a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477560 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b61af f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e769 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477662 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b35c1 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7757 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/3cb5b1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7b0e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30d058 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328555 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1ab4 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:10108 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/30ac3f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b1e36 f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:96047/2825782 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/49fdde f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bcbe f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30adc2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7a12 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b0df f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f72e6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6deb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c93eb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328755 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c704 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6f15 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32730c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477621 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6095 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30d070 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb6cd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3278b3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fe7e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7557 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b7e3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7265 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4822dd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20466f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20347b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4beba1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6ddf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a505 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca2bd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e2245 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205a2a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e162 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4820d9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cadcc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1d05 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203847 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47737d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ce34d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b17c4 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e932 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2026d3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30d07e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b6e3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b19d8 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:6985 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/4e1873 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1740 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205566 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cab5f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328323 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1ab3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6b3c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205575 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e25f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477d0b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1887 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b21c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20294e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32ec3e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb6fb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aed2f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6960 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fc3d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca10b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b8c2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477548 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef578 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47815c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1a94 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202c94 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20251b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202617 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481a67 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7a9b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca129 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9f44 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1b10 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:11046 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b7598 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205aa1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205adb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f796f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1d5f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e1b5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1b3a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b0a7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e148 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c9da f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202760 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478aa7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb09a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f653b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/20ebef f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8734 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2027ac f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203122 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b631e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20557c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2041fd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7241 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1856 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203691 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1688 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fc7a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2024b0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205cd8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e2166 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef663 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b616e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7827 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c980c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32ec43 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/309dda f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20552a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7b4a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a111 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204ecd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4bec8a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204fde f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4771ae f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb702 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ca45 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4beab0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb62f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1d87 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bbb1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca547 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204ffb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1f46 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8548 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:2/2b1e4c f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b38d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7363 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca7cf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4821bc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204271 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3fddb8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3fdcaa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f651a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20518c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202199 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7b6f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203be1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202e06 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205d71 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205cba f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a37b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205b91 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47838a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202782 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5d60 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1c8a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4beac0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fcec f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9254 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328794 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3fdca3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205888 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7189 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5baf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fc19 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7f9e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202508 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1856 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8365 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/4e1585 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2031b2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20576b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327ea9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6bdd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b28f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c21a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a015 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c91e8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7a8b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a38e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b400 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7aa6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a457 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f64fb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a050 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7a59 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6e9a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1ebb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a75f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1799 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6d24 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb513 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20526c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7506 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202b28 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9aaa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202b27 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20214d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20ec3f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c014 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2036dd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205584 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1b4c f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b777d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6a5c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481fe1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48199a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205912 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f71fb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32705b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205638 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202b0e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2042b3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c110 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6963 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2031aa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205cf3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b168a f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:95066/2823818 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/2b6da3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caba5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203ffe f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203f7a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205338 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/476fda f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b3601 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8170 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:2/2b17d0 f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10269 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/32eb34 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e2153 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aec09 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef1fc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202f3f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20571d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5971 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f77ee f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2041a7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203f4d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bf32 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202e8c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202db9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fa18 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47848e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ba23 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca375 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38df90 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e17d3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328732 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5ee2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47822c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a996 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202484 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e18db f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2025ca f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7209 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/476f9b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2054b0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b56c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c91cf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7c29 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caaea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327f16 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7e41 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9e0d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6717 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1707 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b74ea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205430 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6e25 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205a2d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3288b4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202d57 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2029cd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aec69 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fbaf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b3514 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7774 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/30c419 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6c6f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f771d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30af29 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cbac f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1ba3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30aa4c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb394 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a9bd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b34b7 f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:97165/2831543 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/20225f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cadeb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caa9e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a489 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aec35 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1ac2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7776 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32724f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4821ca f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aeccb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9e54 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef16a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1a56 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7239 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/30bc63 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7089 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204d6d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203bf1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1bba f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7789 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/4e1ad3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6a80 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef6c6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1692 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1fda f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10291 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/3c9bf8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b174e f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:12997 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/203c74 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1e9a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/326757 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e232b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1710 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6caa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb155 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1e1c f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7859 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:5/2b1860 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7368 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/205195 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef6c2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb6ed f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477b71 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef0f1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49f9d2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a5c6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205457 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4823cc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1b50 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481871 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5eb9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20509f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fd70 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a210 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202680 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b8147 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f78fb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203c6e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327307 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef1bc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9473 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1cdb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328744 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5f30 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fca0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1d8c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b7e0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202edc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f5f45 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203012 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b7f66 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:32372 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/32eb8a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a343 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b72d6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caf92 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f712d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b280 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229c16 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204b02 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2029de f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205700 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/309f3b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202ae2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fc7c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aee55 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202b39 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a1d4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a722 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1689 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203a77 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202ce9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47808d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b557 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb449 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e18ed f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fc42 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e16b4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20305b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a8d7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5fc0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f731f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205db1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477605 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b75a5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/482440 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22abd6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/490001 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203b5c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cc80 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f79b1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b810f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f61c1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7a73 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202dca f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f71f7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1968 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e157e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49018f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7534 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481f40 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aee05 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202419 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f66dd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204888 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caad8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1cea f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8406 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/30c9fb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6ce6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328753 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328247 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204e71 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b62aa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cabb7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aec51 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ddd43 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9205 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20533e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e18c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7522 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef693 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6a39 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2021e3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4816e3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2028ff f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3267e1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7547 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6b0c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205d67 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f640d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202af5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1cf6 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:10458 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b7275 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a96b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205292 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6a60 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30af68 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fba6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f65da f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1781 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22ae0b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b786e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6914 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202486 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb563 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1a4a f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:11362 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b74c1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4bea0b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1ccb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e8a6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b626c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e16af f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481e77 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3fddd3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2027c6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef700 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2039fc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327a1a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b72e6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478106 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aebdd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47707f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f727b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4819be f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cab6a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5b1f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fefb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3289ba f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2045c4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b60f4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca4b3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fc49 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f68f8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2051ef f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6aeb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a201 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2039d6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1c54 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1f55 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20534f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203ae6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1d50 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:10516 [java] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/48fc9c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6a44 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6982 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aee5a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481fd9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204b93 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1cd8 f:l t:SH d:EX/0 l:0 a:0 r:5
 H: s:SH f:aW e:0 p:10831 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
 H: s:SH f:aW e:0 p:10889 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/4ef736 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202f56 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7e29 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478466 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202485 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b18b0 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7868 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/3f722d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327a22 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477d18 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202714 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b75ce f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205834 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229f4b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f79d7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2049f2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1813 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cac9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b715e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4784b7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7b82 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205a07 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b350 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e15a3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1bb6 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c301 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca68d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30aa09 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f69e5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2059ff f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481f34 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b711a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4bebbc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bd8c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef44d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3284ab f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9341 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caf7e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1a58 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7cd0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca0e1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477189 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fcec f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e747 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cc42 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6f4c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aef24 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e20ee f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1bd6 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:10441 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/30bb84 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20478b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caf3e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ce3b5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cda9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202f83 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205447 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c91a6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204fff f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1b54 f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10337 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/2b7f8d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205095 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20281a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/25e936 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47833c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1c04 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229fad f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6596 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481e17 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ce237 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b78f5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327f77 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca69a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a100 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6255 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204766 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f68b9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202620 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2054a8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2026ea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb58b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328473 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c970b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478411 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30d06a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22acf7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a013 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7519 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7177 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20376d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b5cf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7230 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a21d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2021bd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e733 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202e48 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205139 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32ecd5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1ea8 f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10618 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/4e1b87 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30abed f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477dfd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cb49 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e172 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32758e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205ce4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1e40 f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10863 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/30c6f4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204ba0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1c88 f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10297 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/3289c5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ba3e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202b43 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204eb3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202d41 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2054e9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32ec4a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ce254 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2054ea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/21eb49 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1dce f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:10131 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/4e1fa3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204dc5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202369 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a58e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c3a4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20275c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2052fb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c926 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1e28 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1c14 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7819 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/202be5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b171e f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:10303 [java] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/47842a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c11c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef140 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202da0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2022e3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2058b0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cc73 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f695a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f60c7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204c70 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203bbd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b06a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b67cf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204d61 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b086 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b19f2 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8003 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/4ef65c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4786f0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328642 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9164 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b352d f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:97223/2831661 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/2b6f9b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205bef f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9b3c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7891 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f74fe f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205cdb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6e24 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2050bb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30beb0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2059b3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30acbd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2051e4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202823 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b5b2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e18b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb465 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/23ea21 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8863 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/48218a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204c9c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20540e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3092a0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4900e9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b5b9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3277c9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cef4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aed6b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202238 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9a68 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ad74 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2054aa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477142 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca3a6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47847b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6396 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327a27 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205a0a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b58eb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fd40 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4786da f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e234e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6d0a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cc79 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6e01 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/482393 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7da5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e2046 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a33c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c920b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7d9c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481f41 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30afc5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481b8d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6371 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b19e8 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:12956 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:UN n:2/2b1d1a f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f774c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f775f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b1c4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9a85 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3fdd20 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2021d8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b78e3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205380 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6e08 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229b9c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202220 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aec6e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202b32 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38dfa2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b0be f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1e3e f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f69de f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327145 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca8ff f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b1a9e f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:95588/2824862 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/4780e3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c92bc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7da9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7b5c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478a10 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cad02 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b760d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205d99 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2048b4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9cf0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b35b3 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7870 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/3f71af f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ab95 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/32e516 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7bda f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205181 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca7b0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2055f4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204c76 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7b1b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b66e5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b16c8 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8068 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/30b4cb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6ce5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f730f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef44a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b609c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e8dc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1f9b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4771b7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49f99f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/20eb86 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8719 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/3f6108 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7979 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481a7f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2057c1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/326f35 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30adf3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32857a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cacff f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fc2b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1d8a f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6667 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cae4e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327d25 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b5de f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef755 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f63c2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2055d4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229cb0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cc69 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481873 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48ff83 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fa01 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3fdd53 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4780f4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c8c1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b728f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6a8d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20531f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f71f0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202c6e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9c93 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aed17 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477956 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/482180 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a60f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/326f4a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4824b7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f776c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202cf7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f68df f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7182 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9d96 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f68e6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1e46 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:13099 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/32eb50 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7847 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca00e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b379f f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7879 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b7e78 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a785 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b56a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229c87 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fc6a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b19d8 f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:95489/2824664 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/3f7715 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/326f2f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3287b5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327907 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6af4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9aa6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20ec31 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20ebff f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f614d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fff4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1f66 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202350 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7a8e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef309 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1b95 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203530 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478433 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e17d2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb3f7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229dd3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7867 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1886 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b647f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ce94 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477bc2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c98f3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1c0a f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10305 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/2b616a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f72f5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca728 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f724a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477a51 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7b9c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22adeb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2059c4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204b4f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5938 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bb18 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e165c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30aa63 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6d26 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2029e5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c9b5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ae3a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e76e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2058b7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328a24 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f712c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca1ea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20389a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6fd7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cb8e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481ffc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2022ad f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a2e6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6f13 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204fbf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4784a8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1abc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205a51 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9597 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202167 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b177c f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8117 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/49010a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b16c0 f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:95093/2823872 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/2b65e8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef72c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4786c7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203bc1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6f24 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5a44 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22aa82 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cafc7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1c6a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30d05f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caf87 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20299a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203821 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aec00 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1c1e f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:10158 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b7582 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4778fc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328479 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328581 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30adfb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e198e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c607 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1d51 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b909 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aee50 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e15b1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2025e1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229b53 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30d03c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203a4e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca1ab f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4782f0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6495 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1d7f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327096 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478b58 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48212e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20572a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205311 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b3ef f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1942 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7759 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/30bb91 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202dd9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c964c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4789a5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7666 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ca6a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204efc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205708 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2034b5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3276ef f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20219b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327a03 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30af4b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5b52 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b6b5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327f9d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca9da f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328a09 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2056a7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a390 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202e75 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aedb7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202e9c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c92c5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c287 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202202 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204f97 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bd60 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203934 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2053f2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/482211 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb09e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca718 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477da1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1eb4 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7041 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b681e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1712 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8414 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/3cac57 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca520 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b59eb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fbf7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cda7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef359 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2053c5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e8a7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca594 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9b0f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478762 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7640 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c55c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203d90 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f61c0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6c27 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bafa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b637a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/490197 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a661 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b18be f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:11291 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/30b642 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2041d2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2040d6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2040f6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30aa78 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a6d2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205111 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2028be f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1dee f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e18aa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6be3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fd4e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b79a3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203191 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/4ed0f6 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7546 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb363 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7bb9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202258 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2023af f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1d24 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ddd27 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202986 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1576 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205893 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1d1f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477000 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef635 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1b50 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1c75 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b744b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7850 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb6ea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328052 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204beb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fbdd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb04e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1575 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6896 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4be9e3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e196a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30aac4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481f97 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478489 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ba99 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca26d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2022dc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2050d0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2033da f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6afa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1a22 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7240 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b7ca3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5c80 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202bdd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a04c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47708e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204ee2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3fdd82 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4bebbb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4775fd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b16e0 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205cf5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:3/36df0a f:y t:EX d:EX/0 l:0 a:0 r:3
 R: n:3596042 f:05 b:65343/65343 i:11
G:  s:EX n:2/34e735 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c9ae f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca09a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/23ea29 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205042 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cc4b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6e93 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fb8e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e1b2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3289a6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/204cdf f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1c48 f:l t:SH d:EX/0 l:0 a:0 r:5
 H: s:SH f:aW e:0 p:10821 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
 H: s:SH f:aW e:0 p:11466 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/481db6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca5d0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204fcf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2035b8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20ebe5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2043db f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6073 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20274b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef42b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202514 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32650a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b64ce f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caeef f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6d56 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e19a8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ce2c9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6221 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b68aa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6035 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c91a3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7606 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6ee9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef3b7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f79d6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f694c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a04e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477b06 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9f79 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7243 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202f0c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1efa f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caf3a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1595 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1c64 f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10290 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/20294c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20574d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2023fe f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2038de f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c93f9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3270a9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4776db f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/202139 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202708 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a178 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30d079 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48febb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6de8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c0ed f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2037bb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1da0 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7942 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/4e1fbf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30aa7f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4823f1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477265 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a4c5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202e67 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fb38 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47722f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205413 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328624 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b54c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202934 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e17d8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205173 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204f9a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c50b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9d7f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c91d6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20569c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7c63 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477197 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aebea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6ae7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205866 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4beaef f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c955c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477091 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bf52 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32761f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fce2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1eb6 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7041 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/328913 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3270d6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cd31 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e866 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9fde f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204ee6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e167e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477194 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7c5b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48231a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4780e9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bf15 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2041f0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478865 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1b52 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a9ed f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20540a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20363e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca29d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204d5e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2053d8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328490 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef2cd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b60b8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477901 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7dbd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2052e9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e2051 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7909 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b684a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a10c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202b62 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6f30 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1ae1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1a9e f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7329 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/32ebc1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caaf2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6511 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203931 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477a6d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1f9f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aecad f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e2328 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205572 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c6e5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202687 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7308 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f759b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca720 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caacd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4820dc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205d5f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a021 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6e20 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327224 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48201f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327fb4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aee0a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b563 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1883 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a8dc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/309e79 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a410 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205953 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b667b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205496 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6112 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef5d5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477939 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cab52 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20275f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/482324 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb623 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b08e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1918 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:11008 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/481e4a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202f37 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aef22 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f722e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c31b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2fe3e8 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ce3a9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7b25 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2055e9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6dd7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477757 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef745 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1d1e f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1e14 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7c91 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b6c5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204dc7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a146 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205a82 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb4b6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ba76 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ab3c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477066 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b78d3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205bd1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205c1e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b172e f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10336 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/2b6008 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32866a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fd7b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203b10 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481b2f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205cd5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca689 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7396 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1984 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fb69 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b18fa f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32813b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1a9c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1774 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7096 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/202364 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2031ac f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30cf5c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a892 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327557 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f610c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1c12 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8184 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/4aec2e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481673 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1700 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8077 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b72f9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481fd7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c18d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca807 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202c8a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6595 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/34e7d5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c535 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b639e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4784d8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327521 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2021c9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b803c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e16e7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3280bb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202548 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32754e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3284ea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20ebbe f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205a80 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20562a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b1e90 f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:96092/2825872 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/2b7a67 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/482028 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20560a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef2f9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2056ca f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202810 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1875 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9659 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/478480 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204233 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bff3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202d5d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6980 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f61ce f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3264cf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229b87 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204278 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3270e4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327e55 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7cb3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b8e8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4780ee f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30be45 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c9f3a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/309de4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1b66 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7086 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2028b6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2035d2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204161 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205c98 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fbea f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7333 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205b5f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229ec4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203320 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6564 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1866 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5f43 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fc8c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204e22 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caea0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30bf30 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229c0b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6868 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cad6f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481b2b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b95a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2031ad f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205db0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1c2a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef0fe f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3285ae f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2035bf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49f9ce f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7ab7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fb88 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e1f8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6d21 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e118 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47730d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1c6f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477100 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b72f1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205d81 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b809f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b71c1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ce215 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b1c2c f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:95786/2825260 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/2b69fb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b67ef f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5af3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2035a9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202520 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4beb87 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/309eaa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205884 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/326473 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b595d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b19be f:l t:SH d:EX/0 l:0 a:0 r:5
 H: s:SH f:aW e:0 p:10783 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
 H: s:SH f:aW e:0 p:11428 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/2b75bf f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aef36 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aef09 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203ce2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2032f6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1a88 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a32b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7b81 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b820 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e02a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20515d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328928 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e2319 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2057e4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b2d0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32650c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aeee7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2039a3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30adb3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5906 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203830 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7b76 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1ba5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ca96 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a766 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c946f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20424c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2034f1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b66a0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229c75 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47832c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205bdc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b165e f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10849 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/2b7ba4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7844 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229dd7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1fa8 f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10608 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/30cd58 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49facd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/327329 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1920 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:13017 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:SH n:2/2b34b3 f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:97163/2831539 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/2b6d36 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2050f1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b740 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4777e9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4bea5f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204ed5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a7b5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4816b3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fb82 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203674 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1b0e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb6a4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20347a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b34b9 f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:97166/2831545 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/20290e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e20f8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2035a3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ddd3c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b70fc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204ac9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c3de f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1e87 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7feb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7e7d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6bd2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c92b6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1e9e f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:13184 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2055d1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3caceb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20519b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1f92 f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10347 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:UN n:2/2b1d5a f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b19ce f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7947 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/49014e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7beb f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a374 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b081 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47812a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f65d2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7b9a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b187 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b659d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca732 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1cfc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f61ee f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481893 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b74e7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca487 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c98d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2023f2 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481e0d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b733c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a4d8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1568 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/328492 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb060 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6123 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47719f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38e030 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aedd0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca12a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2024ca f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1bad f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7109 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b6e38 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4782c8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b16d2 f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:95102/2823890 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/2046de f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30be18 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205af6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203eb1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb38a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c3ff f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ca6a5 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a3cc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481879 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c02c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/23e9f5 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8843 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/3caa5e f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a314 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f75ca f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204b44 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/21eb82 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8827 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/32678c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4ef352 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32886b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c222 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30baf9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b1a8c f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:7399 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b7df9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202706 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f68a9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204fcc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f79ed f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:2/2b195a f: t:SH d:EX/0 l:0 a:0 r:3
 I: n:95426/2824538 t:4 f:0x10 d:0x00000001 s:3864/3864
G:  s:EX n:2/3f6fda f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ce13 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1b62 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7508 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aec0d f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aecce f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b184e f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10338 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/48ff93 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a149 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3c96f9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4782b9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229e6c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ddd69 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202226 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4784d9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204d5f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fa64 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204c11 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f7244 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/47823c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1d3c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49f9de f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a999 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48fd02 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6ff1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ca1f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2028d7 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49014b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202175 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a382 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3cb6cc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22a987 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ddd4c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203b05 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/37e8a9 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:10225 [java] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/204f50 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/203841 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ddcfd f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2059fa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30c507 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22ae51 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205697 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1540 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6d04 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20532f f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f68f6 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a5dc f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:SH n:5/2b17c6 f: t:SH d:EX/0 l:0 a:0 r:3
 H: s:SH f:EH e:0 p:8198 [(ended)] gfs2_inode_lookup+0x12d/0x209 [gfs2]
G:  s:EX n:2/2b7b8c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481742 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/477099 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b6080 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205c89 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f79c1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/48216a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1eec f:l t:SH d:EX/0 l:0 a:0 r:4
 H: s:SH f:aW e:0 p:10371 [java] gfs2_getattr+0x7d/0xc4 [gfs2]
G:  s:EX n:2/205b71 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205358 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/49fc9a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20377b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4818a3 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4e1e42 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/202b40 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30a1b9 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/481f4a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204f9c f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/20ebc4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6e68 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/205738 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229caa f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/38dfc4 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30ba49 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2058c1 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2024f8 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b5a58 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3f6534 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:UN n:2/2b1d92 f: t:UN d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/229b13 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/32854b f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b74ee f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/22aa67 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/4aec42 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/30b4a0 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/3ce306 f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/2b7e2a f:y t:EX d:EX/0 l:0 a:0 r:2
G:  s:EX n:2/204774 f:y t:EX d:EX/0 l:0 a:0 r:2

--
Marcello Percoco
IT Junior

Diennea
Viale G. Marconi, 30/14
48018 Faenza (RA) - Italy

E-Mail: marcello.percoco at diennea.com
Tel.: (+39) 0546 667432 - Int. 916
Fax:  (+39) 0546 399913

MagNews - E-Mail Marketing Solutions
http://www.magnews.it

Diennea - Technology for Marketing
http://www.diennea.com

DISCLAIMER
Questo messaggio e i suoi allegati si rivolgono esclusivamente ai destinatari e possono contenere informazioni personali, confidenziali o protette da diritti. Se ha ricevuto questo messaggio per errore, l'utilizzo dei suoi contenuti è proibito e può esporre a conseguenze penali o civili. La invitiamo pertanto a rispedire immediatamente il messaggio al mittente e cancellarne gli allegati senza conservarne una copia. Per ulteriori informazioni, La preghiamo di contattarci all'indirizzo postmaster at diennea.com. Grazie
This e-mail and any attachments may be confidential and the subject of legal professional privilege. Any disclosure, use, storage or copying of this e-mail without the consent of the sender is strictly prohibited. Please notify the sender immediately if you are not the intended recipient and then delete the e-mail from your inbox and do not disclose the contents to another person, use, copy or store the information in any medium. For further information write to postmaster at diennea.com. Thanks


From swhiteho at redhat.com  Tue Sep 28 11:35:12 2010
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Tue, 28 Sep 2010 12:35:12 +0100
Subject: [Linux-cluster] R:  R:  Gfs2 Problem
In-Reply-To: <CA2F45405F1F27488F305C7863282FB974D76064@dnaexc01.diennea.lan>
References: <CA2F45405F1F27488F305C7863282FB974D76013@dnaexc01.diennea.lan>
	<1285602746.2476.23.camel@dolmen>
	<CA2F45405F1F27488F305C7863282FB974D7605A@dnaexc01.diennea.lan>
	<1285671618.2457.2.camel@dolmen>
	<CA2F45405F1F27488F305C7863282FB974D76064@dnaexc01.diennea.lan>
Message-ID: <1285673712.2457.5.camel@dolmen>

Hi,

On Tue, 2010-09-28 at 13:15 +0200, Marcello Percoco - Diennea wrote:
> G:  s:UN n:2/2b1f92 f:l t:SH d:EX/0 l:0 a:0 r:4
>  H: s:SH f:aW e:0 p:10347 [java] gfs2_getattr+0x7d/0xc4 [gfs2] 

The entries you need to look for are those like the above where there is
a 'W' in the holder's flags field. If you do two dumps a few mins apart
and the lock state of the locks with waiters looks identical then there
is a real bug. If the lock state is different, or if different locks
have waiters, then that suggests that the issue is contention,

Steve.
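
The comparison described above can be sketched in Python. This is only a
minimal sketch and not something posted in the thread: it assumes two glock
dumps have already been saved as plain text in the format shown earlier
(unquoted "G:" lines followed by indented "H:" holder lines), and the
function names waiting_glocks and compare_dumps are hypothetical helpers
made up for this illustration.

```python
import re

# "G:" lines describe a glock; only the number (n:) is captured as its name.
GLOCK_RE = re.compile(r"^G:\s+s:\S+\s+n:(\S+)\s+f:\S*\s+t:\S+")
# "H:" lines describe a holder; capture the flags, the pid and the command name.
HOLDER_RE = re.compile(
    r"^\s*H:\s+s:\S+\s+f:(\S*)\s+e:\S+\s+p:(\d+)\s+\[([^\]]*)\]"
)


def waiting_glocks(dump_text):
    """Return {glock number: (glock line, waiters)} for glocks with waiters.

    A holder counts as a waiter when its flags field contains 'W'.
    """
    found = {}
    current = None  # (glock number, stripped glock line) of the last G: line
    for line in dump_text.splitlines():
        glock = GLOCK_RE.match(line)
        if glock:
            current = (glock.group(1), line.strip())
            continue
        holder = HOLDER_RE.match(line)
        if holder and current and "W" in holder.group(1):
            number, glock_line = current
            entry = found.setdefault(number, (glock_line, []))
            entry[1].append((int(holder.group(2)), holder.group(3)))
    return {k: (v[0], tuple(sorted(v[1]))) for k, v in found.items()}


def compare_dumps(first, second):
    """Split glocks with waiters into (unchanged, changed) between two dumps.

    'unchanged' glocks have the same state line and the same waiters in both
    dumps; 'changed' glocks differ or have waiters in only one dump.
    """
    a, b = waiting_glocks(first), waiting_glocks(second)
    unchanged = sorted(k for k in a if k in b and a[k] == b[k])
    changed = sorted((set(a) | set(b)) - set(unchanged))
    return unchanged, changed
```

To use it, read the two saved dumps into strings and pass them to
compare_dumps. Glocks in the first list looked identical in both dumps, which
is the case described above as pointing to a real bug. Glocks in the second
list changed between dumps, which points towards contention.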


From marcello.percoco at diennea.com  Tue Sep 28 11:59:22 2010
From: marcello.percoco at diennea.com (Marcello Percoco - Diennea)
Date: Tue, 28 Sep 2010 13:59:22 +0200
Subject: [Linux-cluster] R:  R:  R:  Gfs2 Problem
In-Reply-To: <1285673712.2457.5.camel@dolmen>
References: <CA2F45405F1F27488F305C7863282FB974D76013@dnaexc01.diennea.lan>
	<1285602746.2476.23.camel@dolmen>
	<CA2F45405F1F27488F305C7863282FB974D7605A@dnaexc01.diennea.lan>
	<1285671618.2457.2.camel@dolmen>
	<CA2F45405F1F27488F305C7863282FB974D76064@dnaexc01.diennea.lan>
	<1285673712.2457.5.camel@dolmen>
Message-ID: <CA2F45405F1F27488F305C7863282FB974D76069@dnaexc01.diennea.lan>

Very well, I will now do a deeper analysis, but I think the problem is simply contention, so only our developers can solve the problem.
Thanks.

-- 
Marcello Percoco
IT Junior

Diennea
Viale G. Marconi, 30/14
48018 Faenza (RA) - Italy

E-Mail: marcello.percoco at diennea.com
Tel.: (+39) 0546 667432 - Int. 916
Fax:  (+39) 0546 399913

MagNews - E-Mail Marketing Solutions
http://www.magnews.it

Diennea - Technology for Marketing
http://www.diennea.com


-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On behalf of Steven Whitehouse
Sent: Tuesday, 28 September 2010 13:35
To: linux clustering
Subject: Re: [Linux-cluster] R: R: Gfs2 Problem

Hi,

On Tue, 2010-09-28 at 13:15 +0200, Marcello Percoco - Diennea wrote:
> G:  s:UN n:2/2b1f92 f:l t:SH d:EX/0 l:0 a:0 r:4
>  H: s:SH f:aW e:0 p:10347 [java] gfs2_getattr+0x7d/0xc4 [gfs2] 

The entries you need to look for are those like the above where there is
a 'W' in the holder's flags field. If you do two dumps a few mins apart
and the lock state of the locks with waiters looks identical then there
is a real bug. If the lock state is different, or if different locks
have waiters, then that suggests that the issue is contention,

Steve.



From expertalert at gmail.com  Tue Sep 28 12:10:49 2010
From: expertalert at gmail.com (fosiul alam)
Date: Tue, 28 Sep 2010 13:10:49 +0100
Subject: [Linux-cluster] ricci is very unstable in one nodes
In-Reply-To: <AANLkTinYo3JmCqLgQEvqNFM5N4DLHnWAdbQunO9ymqYO@mail.gmail.com>
References: <AANLkTimo0R=C1XP8KwoXeyO=VWNVnFckkiXUZnrjBgs0@mail.gmail.com>
	<1480320.10.1285606528829.JavaMail.root@athena>
	<AANLkTikwtYxG3_gf0QxqJpGzZxowh4T7rGbwH-+MhWs8@mail.gmail.com>
	<AANLkTi=DfrVMFkp8No9UbwD+fVoRx9FmpO+qzY2RxLPk@mail.gmail.com>
	<AANLkTinYo3JmCqLgQEvqNFM5N4DLHnWAdbQunO9ymqYO@mail.gmail.com>
Message-ID: <AANLkTinZ=hgCs29zS0=2gdtDU6gY9JPe7u6xGtLAnReC@mail.gmail.com>

Hi,

I rebooted the whole cluster, every single server.

Once every node had been rebooted, everything looked all right:

[root at http1 ~]# clustat
Cluster Status for ng1 @ Tue Sep 28 13:03:45 2010
Member Status: Quorate

 Member Name                             ID   Status
 ------ ----                             ---- ------
 beaver.xx.local                            1 Online, rgmanager
 publicdns1.xxx.local                       2 Online, rgmanager
 http1.xxx.local                            3 Online, Local, rgmanager
 mail01.xxx.local                           4 Online, rgmanager

 Service Name                   Owner (Last)                   State
 ------- ----                   ----- ------                   -----
 service:httpd1                 http1.xxx.local                started   <-- this is where it is supposed to be
 service:mysql-server           mail01.xxx.local               started
 service:public-dns             publicdns1.xxx.local           started

But now, when I try to relocate that service from http1.xxx.local to
mail01.xxx.local, or even try to access http1.xxx.local from the luci
server, I get the same problem again.

So something else is upsetting it, but I don't know what.

Is there any way to debug this or see what is happening inside?

Thanks for any advice.


On 28 September 2010 11:32, fosiul alam <expertalert at gmail.com> wrote:

> HI ya
>
> i found this interesting .. but dont know if its normal or not
>
> i typed this command in 3 cluster nodes
>
> tcpdump -i eth0 ip multicast
>
>
> and for some reason.. i am seeing same output in 3 server which is
>
> 11:26:13.700399 IP http1.xxxxx.local.5149 > 239.192.2.185.netsupport: UDP,
> length 118
>
>
> example.. Same output in every 3 server..
>
> is this normal output ?? ( here http1 is having the trouble to locate or
> relocate services in the cluster)
>
> so basically, what ever i am seeing in http1 server i am seeing the same
> out put on rest ..
>
> here 239.192.2.185 is  the multicast address of clsuter
>
> Thanks
> fosiul
>
>
> On 27 September 2010 18:37, fosiul alam <expertalert at gmail.com> wrote:
>
>> Hi, Addition to my previous email have a look to this one
>>
>> from http1 ( where i am trying to relocate a service)
>>
>>
>> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
>> Member http1.xxxx.local trying to enable service:httpd1...Success
>> Warning: service:httpd1 is now running on mail01.xxxx.local
>>
>> so, its saying its Success..
>> but it actually no..
>>
>> Thanks again
>>
>>
>>
>>
>> On 27 September 2010 18:31, fosiul alam <expertalert at gmail.com> wrote:
>>
>>> Hi
>>> Thanks for your advise,
>>> Currently i got this
>>>
>>>
>>> luci-0.12.2-12.el5.centos.1
>>> ricci-0.12.2-12.el5.centos.1
>>>
>>> is this the same rpm as
>>>
>>> luci-0.12.2-12.el5_5.4.i386.rpm  ?
>>> ricci-0.12.2-12.el5_5.4.i386.rpm  ?
>>>
>>> Thanks
>>>
>>>
>>>
>>> On 27 September 2010 17:55, Paul M. Dyer <pmdyer at ctgcentral2.com> wrote:
>>>
>>>> http://rhn.redhat.com/errata/RHBA-2010-0716.html
>>>>
>>>> It appears that this problem has been fixed in this errata.
>>>>
>>>> I installed the luci and ricci updates and did some lite testing.   So
>>>> far, the timeout 11111 error has not shown up.
>>>>
>>>> Paul
>>>>
>>>> ----- Original Message -----
>>>> From: "fosiul alam" <expertalert at gmail.com>
>>>> To: "linux clustering" <linux-cluster at redhat.com>
>>>> Sent: Monday, September 27, 2010 10:48:27 AM
>>>> Subject: Re: [Linux-cluster] ricci is very unstable in one nodes
>>>>
>>>> Hi
>>>> i am trying to patch ricci . let see how it goes
>>>>
>>>> but clusvcadm is failing as well
>>>>
>>>> [root at http1 ~]# clusvcadm -e httpd1 -m http1.xxxx.local
>>>> Member http1.xxxx.local trying to enable service:httpd1...Invalid
>>>> operation for resource
>>>>
>>>> here, http1 , where i was trying to run the service from luci
>>>>
>>>> what could be the problem ?
>>>> is there any way to find out if there is any problem with config ??
>>>>
>>>> On 27 September 2010 16:26, Ben Turner < bturner at redhat.com > wrote:
>>>>
>>>>
>>>> RHEL 5.6 hasn't been released yet so your package probably contains the
>>>> problem. I'm not sure how in sync Centos is with RHEL or if they patch
>>>> earlier so I cannot give you a time frame when it will be in Centos or
>>>> if they have already patched it. The problem in that BZ is more of an
>>>> annoyance, you usually just have to retry a time or two and it works. If
>>>> you can't get Luci working properly with your service at all you should
>>>> try enabling the service through the command line with clusvcadm -e. If
>>>> it is not working from the command line either then there is a problem
>>>> with the service config.
>>>>
>>>>
>>>>
>>>>
>>>> -Ben
>>>>
>>>>
>>>>
>>>>
>>>> ----- "fosiul alam" < expertalert at gmail.com > wrote:
>>>>
>>>> > Hi Ben
>>>> > Thanks
>>>> >
>>>> > I named this cluster as mysql-server but i have not installed mysql
>>>> > database in their yet
>>>> >
>>>> > and both luci and ricci on luci server and node1 is running this
>>>> > version
>>>> >
>>>> > luci-0.12.2-12.el5.centos.1
>>>> > ricci-0.12.2-12.el5.centos.1
>>>> >
>>>> >
>>>> > do you think this version has problem as well ??
>>>> >
>>>> > thanks for your help
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > On 24 September 2010 15:33, Ben Turner < bturner at redhat.com > wrote:
>>>> >
>>>> >
>>>> > There is an issue with ricci timeouts that was fixed recently:
>>>> >
>>>> > https://bugzilla.redhat.com/show_bug.cgi?id=564490
>>>> >
>>>> > I'm not sure but you may be hitting that bug. Symptoms include: luci
>>>> > isn't able to get the status from the node, timeouts when querying
>>>> > ricci, etc. The fix should be released with 5.6
>>>> >
>>>> > On the mysql service there are some options that you need to set. Here
>>>> > are all the options available to that agent:
>>>> >
>>>> > mysql
>>>> > Defines a MySQL database server
>>>> >
>>>> > Attribute               Description
>>>> > config_file             Define configuration file
>>>> > listen_address          Define an IP address for MySQL server. If the address
>>>> >                         is not given then first IP address from the service is taken.
>>>> > mysqld_options          Other command-line options for mysqld
>>>> > name                    Name
>>>> > ref                     Reference to existing mysql resource in the resources section.
>>>> > service_name            Inherit the service name.
>>>> > shutdown_wait           Wait X seconds for correct end of service shutdown
>>>> > startup_wait            Wait X seconds for correct end of service startup
>>>> > __enforce_timeouts      Consider a timeout for operations as fatal.
>>>> > __failure_expire_time   Amount of time before a failure is forgotten.
>>>> > __independent_subtree   Treat this and all children as an independent subtree.
>>>> > __max_failures          Maximum number of failures before returning a
>>>> >                         failure to a status check.
>>>> >
>>>> > If I recall correctly you may need to tweak:
>>>> >
>>>> > shutdown_wait Wait X seconds for correct end of service shutdown
>>>> > startup_wait Wait X seconds for correct end of service startup
>>>> >
>>>> > There can be problems relocating the DB if it takes too long to
>>>> > start/shutdown. If you are having problems relocating with luci it may
>>>> > be a good idea to test with:
>>>> >
>>>> > # clusvcadm -r <service name> -m <cluster node>
>>>> >
>>>> > -Ben
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> > ----- "fosiul alam" < expertalert at gmail.com > wrote:
>>>> >
>>>> > > Hi
>>>> > > I have 4 nodes cluster,
>>>> > > It was running fine. but today one nodes is giving trouble
>>>> > >
>>>> > > From luci Gui interface, when i try to relocate service into this
>>>> > node
>>>> > > and trying to relocate from this nodes to another nodes
>>>> > >
>>>> > > from luci gui interface, its showing :
>>>> > >
>>>> > > Unable to retrieve batch 1908047789 status from
>>>> > > beaver.domain.local:11111: clusvcadm start failed to start httpd1:
>>>> > > Starting cluster service "httpd1" on node "http1.domain.local" --
>>>> > You
>>>> > > will be redirected in 5 seconds.
>>>> > > also
>>>> > >
>>>> > > The ricci agent for this node is unresponsive. Node-specific
>>>> > > information is not available at this time. :
>>>> > >
>>>> > > but ricci is running on problematic node ,
>>>> > > ricci 7324 0.0 0.1 58876 2932 ? S<s 14:40 0:00 ricci -u 101
>>>> > >
>>>> > > there is not any firewall running.
>>>> > >
>>>> > > iptables -L
>>>> > > Chain INPUT (policy ACCEPT)
>>>> > > target prot opt source destination
>>>> > >
>>>> > > Chain FORWARD (policy ACCEPT)
>>>> > > target prot opt source destination
>>>> > >
>>>> > > Chain OUTPUT (policy ACCEPT)
>>>> > > target prot opt source destination
>>>> > >
>>>> > > Chain RH-Firewall-1-INPUT (0 references)
>>>> > > target prot opt source destination
>>>> > >
>>>> > > port 11111 is runningg
>>>> > >
>>>> > > netstat -an | grep 11111
>>>> > > tcp 0 0 0.0.0.0:11111 0.0.0.0:* LISTEN
>>>> > >
>>>> > >
>>>> > > but still ricci is very unstable , and i cant relocate any service
>>>> > on
>>>> > > this node or i cant relocate any service away from this node.
>>>> > >
>>>> > > from problematic node if i type this
>>>> > >
>>>> > > clustat
>>>> > > Cluster Status for ng1 @ Thu Sep 23 20:24:02 2010
>>>> > > Member Status: Quorate
>>>> > >
>>>> > > Member Name              ID  Status
>>>> > > ------ ----              --  ------
>>>> > > beaver.xxx.local         1   Online, rgmanager   ::: luci is running from this server
>>>> > > publicdns1.xxxx.local    2   Online, rgmanager
>>>> > > http1.xxxx.local         3   Online, Local, rgmanager
>>>> > > mail01.xxxxx.local       4   Online, rgmanager
>>>> > >
>>>> > > Service Name             Owner (Last)             State
>>>> > > ------- ----             ----- ------             -----
>>>> > > service:httpd1           mail01.xxxx.local        started
>>>> > > service:mysql-server     http1.xxxx.local         started   ------------------- this is the problematic node
>>>> > > service:public-dns       publicdns1.xxxxxx.local  started
>>>> > >
>>>> > > I cant move that service mysql-server from this node or cant
>>>> > relocate
>>>> > > any service on this node ..
>>>> > > I am very confused.
>>>> > >
>>>> > > what shall i do to fix this issue ??
>>>> > >
>>>> > > thanks for your advise.
>>>> > >
>>>> > >
>>>> > >
>>>> > >
>>>> > > -- Linux-cluster mailing list
>>>> > > Linux-cluster at redhat.com
>>>> > > https://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>>>
>>
>

From mulualem_vienna at yahoo.com  Wed Sep 29 07:47:19 2010
From: mulualem_vienna at yahoo.com (Mulualem Denekew)
Date: Wed, 29 Sep 2010 00:47:19 -0700 (PDT)
Subject: [Linux-cluster] Gfs2 Problem
Message-ID: <941151.5094.qm@web33208.mail.mud.yahoo.com>


Marcello,

I have a similar problem on my two-node cluster, which is used as a GFS2 storage cluster (GFS2 only, no service clustering). Both nodes have hung several times and we had to reboot them. They are running web servers. I have been in contact with Red Hat and was advised to run gfs2_hangalyzer during the next I/O or process hang. The Red Hat support engineer assumes it is a lock contention issue. In your case, you said that only your developers can solve this type of problem. What do you mean by that? Could you make it a little clearer for me? Do you mean the problem is caused by the applications running on your servers? In our case we are running Java and Tomcat; do you mean the Java developers should look into the matter?

By the way, I have just subscribed to linux-cluster at redhat.com. Is this the correct way to post messages?

Thanks
Mulu

> Very well, I will now do a deeper analysis, but I think the problem is simply contention, so only our developers can solve the problem.
> Thanks.
> -- 
> Marcello Percoco
> IT Junior

From dhoffutt at gmail.com  Thu Sep 30 00:13:15 2010
From: dhoffutt at gmail.com (Dustin Henry Offutt)
Date: Wed, 29 Sep 2010 19:13:15 -0500
Subject: [Linux-cluster] Gfs2 Problem
In-Reply-To: <941151.5094.qm@web33208.mail.mud.yahoo.com>
References: <941151.5094.qm@web33208.mail.mud.yahoo.com>
Message-ID: <4CA3D61B.7050902@gmail.com>

it is indeed the correct way, sir.

Mulualem Denekew wrote:
>
> Marcello,
>
>  
>
> I have similar problem on my two nodes cluster used as GFS2 storage 
> cluster. Just GFS2 not service clustering. Both nodes were hanging 
> several times and we had to reboot . They are running webservers. I 
> have been in contact with Redhat and I was advised to use 
> gfs2_hangalyzer during the next IO or process hang.  The Redhat 
> support engineer is assuming it is a lock contention issue, and on 
> your case you said only your developers can solve this type of 
> problem. What do you mean by that ? Can you make it a little bit clear 
> for me. You mean the problem is caused by applications running in your 
> servers ? In our case we have Java and Tomcat are running, you mean 
> the Java developers should see the matter ?
>
>  
>
> By the way I just subscribed to linux-cluster at redhat.com now, is
> this correct way to post messages ?
>
>  
>
> Thanks
>
> Mulu
>
>  
>
> Very well, I will now do a deeper analysis, but I think the problem is
> simply contention, so only our developers can solve the problem.
>
> Thanks.
>
> -- 
>
> Marcello Percoco
>
> IT Junior
>


From prasannakumar85 at gmail.com  Thu Sep 30 18:29:22 2010
From: prasannakumar85 at gmail.com (prasanna kumar)
Date: Thu, 30 Sep 2010 23:59:22 +0530
Subject: [Linux-cluster] Regarding Redhat Cluster Configuration
Message-ID: <AANLkTimUTESjW4UfMOLMdk6=B_jtYnx0zCfebpTRFBZu@mail.gmail.com>

Hello All,


I have recently joined this group. Could you please send me the Red Hat
cluster configuration steps for Red Hat 5?


I would also appreciate your advice on the question below.


Can a Red Hat cluster be configured (adding nodes) only in the
graphical environment, or can it also be configured in text mode?


-- 
Thanks & Regards
Prasanna
(Linux Learner)

From thomas at sjolshagen.net  Thu Sep 30 19:50:15 2010
From: thomas at sjolshagen.net (Thomas Sjolshagen)
Date: Thu, 30 Sep 2010 15:50:15 -0400
Subject: [Linux-cluster] Regarding Redhat Cluster Configuration
In-Reply-To: <AANLkTimUTESjW4UfMOLMdk6=B_jtYnx0zCfebpTRFBZu@mail.gmail.com>
References: <AANLkTimUTESjW4UfMOLMdk6=B_jtYnx0zCfebpTRFBZu@mail.gmail.com>
Message-ID: <43617dafa9f4c5bc519a96bd6e43b327@www.sjolshagen.net>


On Thu, 30 Sep 2010 23:59:22 +0530, prasanna kumar wrote:

> Hello All,
>
> I am newly joined in this group. Kindly send me the redhat
> cluster configuration steps for redhat 5 version.

How about starting with a simple Google search: "red hat cluster" howto

That would lead you to
http://www.centos.org/docs/5/pdf/Cluster_Administration.pdf or the
official Red Hat documentation of the same name.

If you've got a specific problem during your configuration, there are
lots of people who're more than willing to help out, but then you need
to provide many more details about your environment (hardware and
configuration), what you're trying to do, what it is that isn't working
and how you'd expect it to work, as well as any error
messages/warnings/etc. from the log files (after you've enabled
debugging).

The "question" you asked above (it reads more like a demand!) is very
counter-productive. Spend some time researching before you ask us to
help you with stuff that's clearly documented at multiple places
online, please.

// Thomas

From linux at alteeve.com  Thu Sep 30 20:10:03 2010
From: linux at alteeve.com (Digimer)
Date: Thu, 30 Sep 2010 16:10:03 -0400
Subject: [Linux-cluster] Regarding Redhat Cluster Configuration
In-Reply-To: <AANLkTimUTESjW4UfMOLMdk6=B_jtYnx0zCfebpTRFBZu@mail.gmail.com>
References: <AANLkTimUTESjW4UfMOLMdk6=B_jtYnx0zCfebpTRFBZu@mail.gmail.com>
Message-ID: <4CA4EE9B.5080103@alteeve.com>

On 10-09-30 02:29 PM, prasanna kumar wrote:
> Hello All,
> 
> I am newly joined in this group. Kindly send me the redhat cluster
> configuration steps for redhat 5 version.
> 
> Also  Could you please advise me for my below questions.
> 
>  Are we able to configure the Redhat Cluster (adding nodes) only in
> graphical environment? or we can configure in text mode also.

Welcome. :)

Clusters can be configured in many different ways, so to help you, you
need to better explain what you are trying to accomplish.

As for configuration method; I personally do the entire configuration
and maintenance from the command line. It is a question of a) What do
you prefer and b) Do the graphical tools support the various
configuration options that you need to fiddle with.

-- 
Digimer
E-Mail:         linux at alteeve.com
AN!Whitepapers: http://alteeve.com
Node Assassin:  http://nodeassassin.org