From alban.crequy at seanodes.com  Fri Jul  1 08:36:46 2005
From: alban.crequy at seanodes.com (Alban Crequy)
Date: Fri, 01 Jul 2005 10:36:46 +0200
Subject: [Linux-cluster] Fencing the gulm master node: problem
Message-ID: <42C5009E.4080108@seanodes.com>

Hello,

I want to test the fencing capability of GFS by unplugging the network on a 
node. But I experience some problems when the node I unplug is the gulm master.

I am using the RPM:
  - GFS-6.0.2.20-2
  - GFS-modules-smp-6.0.2.20-2

I have a 8-nodes cluster (sam21, sam22, ..., sam28). I mount a GFS 
filesystem on all nodes on /mnt/gfs

My config is:

----->8-------->8-------->8-------->8---
# fence.ccs
fence_devices {
         admin {
                 agent="fence_manual"
         }
}

# cluster.ccs
cluster {
         name="sam"
         lock_gulm {
                 servers=["sam21", "sam22", "sam23", "sam24", "sam25"]
         }
}

# nodes.ccs
nodes {
         sam21.toulouse {
                 ip_interfaces {
                 eth0 = "192.168.0.121"
                 }
                 fence {
                         human {
                                 admin {
                                         ipaddr = "192.168.0.121"
                                 }
                         }
                 }
         }
# etc. for sam22 ... sam28
----->8-------->8-------->8-------->8---

I want to check that the unplugged node is fenced and its locks are released 
when I run "fence_ack_manual" (and only when I run fence_ack_manual", not 
before).

In order to know when the locks are released, I wrote a small program:

----->8-------->8-------->8-------->8---
// lock.c
fd = open("/mnt/gfs/lock-test.tmp", O_RDWR|O_CREAT, S_IREAD|S_IWRITE);
if (fd == -1) {
   printf("ERROR: open failed.\n");
   return 1;
}
error = flock(fd, LOCK_EX);
if (error == -1) {
   printf("ERROR: lock failed.\n");
   return 1;
}
while (1) {
   printf("writing... pid %d : %d\n", pid, counter++);
   buf[0]=0;
   p = sprintf(buf, "pid %d : %d\n", pid, counter);
   write(fd, buf, p);
   sleep(1);
}
----->8-------->8-------->8-------->8---


First test (which works):
- I run my lock.c program on sam26 (not a gulm server)
- The lock is acquired on sam26
- I run my lock.c program on all other nodes
- The other nodes wait for the lock
- I unplug sam26 and wait until the gulm master (sam21) want to fence sam26
- The gulm master (sam21) want me to run fence_ack_manual
- The lock is not taken on other node
- I run fence_ack_manual
- The lock is released on the unplugged node (sam26) and taken by another
=> So when I unplug a node which is not a gulm server, all work correctly.


Second test (which doesn't work):
- I run my lock.c program on sam21 (the gulm master)
- The lock is acquired on sam21
- I run my lock.c program on all other nodes
- The other nodes wait for the lock
- I unplug sam21 and wait until a new gulm master (sam22) want to fence the 
old master (sam21)
- The new gulm master (sam22) want me to run fence_ack_manual BUT the lock 
is released immediately. I did not run fence_ack_manual and the lock is 
already released. This is my problem.

I read the bug reports [1][2] and the advisory RHBA-2005:466-11 [3] which 
says ? Fixed a problem in which a gulm lock server ran on GFS clients after 
the master server died. ? But I use GFS-6.0.2.20-2.

[1] https://bugzilla.redhat.com/beta/show_bug.cgi?id=148029
[2] https://bugzilla.redhat.com/beta/show_bug.cgi?id=149119
[3] http://rhn.redhat.com/errata/RHBA-2005-466.html

Is this a bug? Or a misunderstanding of the fencing mechanism?

The syslogs on the new gulm master (sam22) are:

----->8-------->8-------->8-------->8---
Jul  1 09:06:52 sam22 lock_gulmd_core[4195]: Failed to receive a timely 
heartbeat reply from Master. (t:1120201612489192 mb:1)
Jul  1 09:07:07 sam22 lock_gulmd_core[4195]: Failed to receive a timely 
heartbeat reply from Master. (t:1120201627509192 mb:2)
Jul  1 09:07:22 sam22 lock_gulmd_core[4195]: Failed to receive a timely 
heartbeat reply from Master. (t:1120201642529191 mb:3)
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: I see no Masters, So I am 
Arbitrating until enough Slaves talk to me.
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send quorum update to 
slave sam23.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send quorum update to 
slave sam28.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send quorum update to 
slave sam26.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send quorum update to 
slave sam25.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send quorum update to 
slave sam24.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send quorum update to 
slave sam22.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send quorum update to 
slave sam27.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: LastMaster 
sam21.toulouse:192.168.0.121, is being marked Expired.
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send membership 
update "Expired" about sam21.toulouse to slave sam23.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send membership 
update "Expired" about sam21.toulouse to slave sam28.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send membership 
update "Expired" about sam21.toulouse to slave sam26.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_LTPX[4197]: New Master at 
sam22.toulouse:192.168.0.122
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send membership 
update "Expired" about sam21.toulouse to slave sam25.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send membership 
update "Expired" about sam21.toulouse to slave sam24.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send membership 
update "Expired" about sam21.toulouse to slave sam22.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Could not send membership 
update "Expired" about sam21.toulouse to slave sam27.toulouse
Jul  1 09:07:37 sam22 lock_gulmd_core[4195]: Forked [4882] fence_node 
sam21.toulouse with a 0 pause.
Jul  1 09:07:37 sam22 lock_gulmd_core[4882]: Gonna exec fence_node 
sam21.toulouse
Jul  1 09:07:37 sam22 fence_node[4882]: Performing fence method, human, on 
sam21.toulouse.
Jul  1 09:07:37 sam22 fence_manual: Node 192.168.0.121 requires hard reset. 
  Run "fence_ack_manual -s 192.168.0.121" after power cycling the machine.
Jul  1 09:07:38 sam22 lock_gulmd_core[4195]: Still in Arbitrating: Have 1, 
need 3 for quorum.
Jul  1 09:07:39 sam22 lock_gulmd_core[4195]: Still in Arbitrating: Have 2, 
need 3 for quorum.
Jul  1 09:07:39 sam22 lock_gulmd_core[4195]: New Client: idx:5 fd:10 from 
(192.168.0.124:sam24.toulouse)
Jul  1 09:07:39 sam22 lock_gulmd_core[4195]: Member update message Logged in 
about sam24.toulouse to sam23.toulouse is lost because node is in OM
Jul  1 09:07:39 sam22 lock_gulmd_core[4195]: Member update message Logged in 
about sam24.toulouse to sam28.toulouse is lost because node is in OM
Jul  1 09:07:39 sam22 lock_gulmd_core[4195]: Member update message Logged in 
about sam24.toulouse to sam25.toulouse is lost because node is in OM
Jul  1 09:07:39 sam22 lock_gulmd_core[4195]: Member update message Logged in 
about sam24.toulouse to sam27.toulouse is lost because node is in OM
Jul  1 09:07:39 sam22 lock_gulmd_LT000[4196]: Attached slave 
sam24.toulouse:192.168.0.124 idx:2 fd:7 (soff:3 connected:0x8)
Jul  1 09:07:39 sam22 lock_gulmd_core[4195]: Still in Arbitrating: Have 2, 
need 3 for quorum.
Jul  1 09:07:39 sam22 lock_gulmd_core[4195]: Still in Arbitrating: Have 2, 
need 3 for quorum.
Jul  1 09:07:41 sam22 lock_gulmd_core[4195]: Now have Slave quorum, going 
full Master.
Jul  1 09:07:41 sam22 lock_gulmd_core[4195]: New Client: idx:6 fd:11 from 
(192.168.0.123:sam23.toulouse)
Jul  1 09:07:41 sam22 lock_gulmd_core[4195]: Member update message Logged in 
about sam23.toulouse to sam25.toulouse is lost because node is in OM
Jul  1 09:07:41 sam22 lock_gulmd_LTPX[4197]: Logged into LT000 at 
sam22.toulouse:192.168.0.122
Jul  1 09:07:41 sam22 lock_gulmd_LT000[4196]: New Client: idx 3 fd 8 from 
(192.168.0.122:sam22.toulouse)
Jul  1 09:07:41 sam22 lock_gulmd_LTPX[4197]: Finished resending to LT000
Jul  1 09:07:41 sam22 lock_gulmd_LT000[4196]: Attached slave 
sam23.toulouse:192.168.0.123 idx:4 fd:9 (soff:2 connected:0xc)
Jul  1 09:07:41 sam22 lock_gulmd_LT000[4196]: New Client: idx 5 fd 10 from 
(192.168.0.123:sam23.toulouse)
Jul  1 09:07:41 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=0: Trying to 
acquire journal lock...
Jul  1 09:07:41 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=0: Looking at 
journal...
Jul  1 09:07:41 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=0: Acquiring the 
transaction lock...
Jul  1 09:07:42 sam22 lock_gulmd_LT000[4196]: New Client: idx 6 fd 11 from 
(192.168.0.124:sam24.toulouse)
Jul  1 09:07:45 sam22 lock_gulmd_core[4195]: New Client: idx:3 fd:7 from 
(192.168.0.126:sam26.toulouse)
Jul  1 09:07:45 sam22 lock_gulmd_core[4195]: Member update message Logged in 
about sam26.toulouse to sam25.toulouse is lost because node is in OM
Jul  1 09:07:45 sam22 lock_gulmd_LT000[4196]: New Client: idx 7 fd 12 from 
(192.168.0.126:sam26.toulouse)
Jul  1 09:07:46 sam22 lock_gulmd_core[4195]: New Client: idx:7 fd:12 from 
(192.168.0.128:sam28.toulouse)
Jul  1 09:07:46 sam22 lock_gulmd_core[4195]: Member update message Logged in 
about sam28.toulouse to sam25.toulouse is lost because node is in OM
Jul  1 09:07:46 sam22 lock_gulmd_LT000[4196]: New Client: idx 8 fd 13 from 
(192.168.0.128:sam28.toulouse)
Jul  1 09:07:47 sam22 lock_gulmd_core[4195]: New Client: idx:8 fd:13 from 
(192.168.0.125:sam25.toulouse)
Jul  1 09:07:47 sam22 lock_gulmd_LT000[4196]: Attached slave 
sam25.toulouse:192.168.0.125 idx:9 fd:14 (soff:1 connected:0xe)
Jul  1 09:07:47 sam22 lock_gulmd_LT000[4196]: New Client: idx 10 fd 15 from 
(192.168.0.125:sam25.toulouse)
Jul  1 09:07:49 sam22 lock_gulmd_core[4195]: New Client: idx:9 fd:14 from 
(192.168.0.127:sam27.toulouse)
Jul  1 09:07:49 sam22 lock_gulmd_LT000[4196]: New Client: idx 11 fd 16 from 
(192.168.0.127:sam27.toulouse)
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=0: Replaying 
journal...
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=0: Replayed 0 of 
2 blocks
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=0: replays = 0, 
skips = 1, sames = 1
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=0: Journal 
replayed in 9s
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=0: Done
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=7: Trying to 
acquire journal lock...
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=7: Busy
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=6: Trying to 
acquire journal lock...
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=6: Busy
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=5: Trying to 
acquire journal lock...
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=5: Busy
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=7: Trying to 
acquire journal lock...
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=7: Busy
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=6: Trying to 
acquire journal lock...
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=6: Busy
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=5: Trying to 
acquire journal lock...
Jul  1 09:07:50 sam22 kernel: GFS: fsid=sam:grp_gfs.1: jid=5: Busy
----->8-------->8-------->8-------->8---

Sincerely,

Alban Crequy


From gwood at dragonhold.org  Fri Jul  1 09:31:50 2005
From: gwood at dragonhold.org (gwood at dragonhold.org)
Date: Fri, 1 Jul 2005 10:31:50 +0100 (BST)
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <42C429C7.3070200@possibilityforge.com>
References: <Pine.LNX.4.61.0506081351350.1493@tungsten.technicality.org>	<Pine.LNX.4.61.0506241511410.26421@tungsten.technicality.org>	<1119649059.13563.18.camel@ayanami.boston.redhat.com>	<Pine.LNX.4.61.0506241700340.7553@tungsten.technicality.org>	<42BF9FCF.7060000@redhat.com>	<Pine.LNX.4.61.0506281007070.27044@tungsten.technicality.org>	<42C24610.2@redhat.com>	<Pine.LNX.4.61.0506290648370.18433@tungsten.technicality.org>	<42C2AD19.5000601@redhat.com>	<Pine.LNX.4.61.0506291346000.27941@tungsten.technicality.org>	<42C39DFC.7070209@redhat.com>	<Pine.LNX.4.61.0506300853460.2741@tungsten.technicality.org>	<42C3FD0A.4040800@possibilityforge.com>	<42C4002C.80403@redhat.com>	<42C402E9.2080502@possibilityforge.com>	<42C4067F.6000903@redhat.com>	<Pine.LNX.4.61.0506301019310.2741@tungsten.technicality.org>	<42C40F58.4040401@oakland.edu>	<21339.208.178.77.200.1120145458.squirrel@208.178.77.200>	<42C42352.3000600@possibilityforge.com>
	<12897.208.178.77.200.1120151133.squirrel@208.178.! 77.200>
	<42C429C7.3070200@possibilityforge.com>
Message-ID: <16201.208.178.77.200.1120210310.squirrel@208.178.77.200>

>    node 1 accesses gfs mount/some_dir
>    node 2 accesses gfs mount/some_other_dir
>    node 3 accesses gfs mount/yet_some_other_dir
This won't totally solve the same problem - for two reasons.

Firstly, the metadata for "some_dir" and "some_other_dir" are going to be
stored in the same place ("mount"'s directory) - which means that access
time, permission changes, and other metadata will be cached within each of
the split brains quite happily.

The more serious problem is new/deleted files - since the 2 mini-clusters
will think they are allocating disk space from a device they are solely
using, they will each start allocating from the same place (e.g. nodes 1-3
could use block offset 12345 for new file /mount/some_dir/file1, at the
same time as nodes 4-6 usr block offset 12345 for new file
/mount/yet_some_other_dir/file2)

While each half uses the cached data, you'll just get data corruption (the
contents of the file will be the last one written).  However, as soon as
they go back to the disk to look things up they may notice the cross
linked files, and either whinge or die.  And when you next fsck, it'll
split them out and that's the first time you'll definitely know.

If you're just after sharing the space on disk (and not sharing the data
within the partitions) then clvm may be the answer - the only thing you
won't be able to do (from memory) without quorum is resize the partitions.

Run ext3 on top of that for each node?


From jscottorn at possibilityforge.com  Fri Jul  1 13:46:48 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Fri, 01 Jul 2005 07:46:48 -0600
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <42C46EAF.9050703@possibilityforge.com>
References: <Pine.LNX.4.44.0506301806530.21204-100000@algiers.clic.cs.columbia.edu>
	<42C46EAF.9050703@possibilityforge.com>
Message-ID: <42C54948.9070404@possibilityforge.com>

Does any one have any thoughts about my problem?  I have been fighting
with fenced for the last couple of days now.

Jon Scottorn wrote:

>Thanks, 
>
>Another little issue, I can't get fenced to start. 
>
>I issue fenced -cD and it gives
>
>1120169493 our_nodeid 1 our_name data
>1120169493 group_init error 0 111
>
>What have I done wrong here?
>
>Thanks,
>
>Jon
>
>Dan B. Phung wrote:
>
>  
>
>>while [ 1 ]; then
>> cman_tool expected -e 1
>> sleep 600
>>done
>>
>>
>>On 30, Jun, 2005, Jon Scottorn declared:
>>
>> 
>>
>>    
>>
>>>How would something like that look?
>>>
>>>would you use cman_tool to do it?
>>>
>>>
>>>
>>>Lon Hohberger wrote:
>>>
>>>   
>>>
>>>      
>>>
>>>>On Thu, 2005-06-30 at 10:52 -0600, Jon Scottorn wrote:
>>>>
>>>>
>>>>     
>>>>
>>>>        
>>>>
>>>>>Ok, so let me reiterate:
>>>>>
>>>>> If I don't even care about quorum and the cluster.  I just want a
>>>>>filesystem that will server out a block device, which is what gfs does. 
>>>>>I'm not worried about "split brain" issues.  If we need to have quorum,
>>>>>I want the number of nodes for quorum to be set at 1, which will be the
>>>>>main server containing the data.  Any other node that connectes can just
>>>>>access the data or go offline without causeing any quorum issues.
>>>>>Is this functionality going to be possible, I want to use GFS for this
>>>>>because if not, our other option is enbd but then we are limiting
>>>>>ourselves very much.  We would have to create seperate partitions for
>>>>>each node to mount, etc... a major PAIN to go that way but, it is not as
>>>>>painful of having quorum fail and cause all of our nodes to go down.
>>>>>  
>>>>>
>>>>>       
>>>>>
>>>>>          
>>>>>
>>>>Set a cronjob to reset expected votes every 10 minutes...
>>>>
>>>>Good luck.
>>>>
>>>>-- Lon
>>>>
>>>>
>>>>
>>>>--
>>>>Linux-cluster mailing list
>>>>Linux-cluster at redhat.com
>>>>http://www.redhat.com/mailman/listinfo/linux-cluster
>>>>
>>>>
>>>>
>>>>     
>>>>
>>>>        
>>>>
>>>--
>>>Linux-cluster mailing list
>>>Linux-cluster at redhat.com
>>>http://www.redhat.com/mailman/listinfo/linux-cluster
>>>
>>>   
>>>
>>>      
>>>
>> 
>>
>>    
>>
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>http://www.redhat.com/mailman/listinfo/linux-cluster
>
>  
>


From pcaulfie at redhat.com  Fri Jul  1 14:07:35 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Fri, 01 Jul 2005 15:07:35 +0100
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <42C54948.9070404@possibilityforge.com>
References: <Pine.LNX.4.44.0506301806530.21204-100000@algiers.clic.cs.columbia.edu>	<42C46EAF.9050703@possibilityforge.com>
	<42C54948.9070404@possibilityforge.com>
Message-ID: <42C54E27.1020303@redhat.com>

Jon Scottorn wrote:
> Does any one have any thoughts about my problem?  I have been fighting
> with fenced for the last couple of days now.
> 
> Jon Scottorn wrote:
> 
> 
>>Thanks, 
>>
>>Another little issue, I can't get fenced to start. 
>>
>>I issue fenced -cD and it gives
>>
>>1120169493 our_nodeid 1 our_name data
>>1120169493 group_init error 0 111
>>
>>What have I done wrong here?
>>

Are you using head of CVS? that's not really ready yet.

-- 

patrick


From jscottorn at possibilityforge.com  Fri Jul  1 14:13:26 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Fri, 01 Jul 2005 08:13:26 -0600
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <42C54E27.1020303@redhat.com>
References: <Pine.LNX.4.44.0506301806530.21204-100000@algiers.clic.cs.columbia.edu>	<42C46EAF.9050703@possibilityforge.com>	<42C54948.9070404@possibilityforge.com>
	<42C54E27.1020303@redhat.com>
Message-ID: <42C54F86.8010402@possibilityforge.com>

Yes, I am.  Which one should I use than?

Patrick Caulfield wrote:

>Jon Scottorn wrote:
>  
>
>>Does any one have any thoughts about my problem?  I have been fighting
>>with fenced for the last couple of days now.
>>
>>Jon Scottorn wrote:
>>
>>
>>    
>>
>>>Thanks, 
>>>
>>>Another little issue, I can't get fenced to start. 
>>>
>>>I issue fenced -cD and it gives
>>>
>>>1120169493 our_nodeid 1 our_name data
>>>1120169493 group_init error 0 111
>>>
>>>What have I done wrong here?
>>>
>>>      
>>>
>
>Are you using head of CVS? that's not really ready yet.
>
>  
>


From jscottorn at possibilityforge.com  Fri Jul  1 14:24:50 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Fri, 01 Jul 2005 08:24:50 -0600
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <42C54E27.1020303@redhat.com>
References: <Pine.LNX.4.44.0506301806530.21204-100000@algiers.clic.cs.columbia.edu>	<42C46EAF.9050703@possibilityforge.com>	<42C54948.9070404@possibilityforge.com>
	<42C54E27.1020303@redhat.com>
Message-ID: <42C55232.40808@possibilityforge.com>

Hi,

Thanks for the response.

I was using head but now I went back to the version I was using before
that was working.  Now I get this error

fenced -cD
fenced: 1120227794 our_nodeid 1 our_name data
fenced: 1120227794 delay post_join 6s post_fail 0s
fenced: 1120227794 clean start, skipping initial nodes
fenced: 1120227794 group_init error 0 111
free(): invalid pointer 0x8051300!

Any Ideas?

Thanks again,

Jon

Patrick Caulfield wrote:

>Jon Scottorn wrote:
>  
>
>>Does any one have any thoughts about my problem?  I have been fighting
>>with fenced for the last couple of days now.
>>
>>Jon Scottorn wrote:
>>
>>
>>    
>>
>>>Thanks, 
>>>
>>>Another little issue, I can't get fenced to start. 
>>>
>>>I issue fenced -cD and it gives
>>>
>>>1120169493 our_nodeid 1 our_name data
>>>1120169493 group_init error 0 111
>>>
>>>What have I done wrong here?
>>>
>>>      
>>>
>
>Are you using head of CVS? that's not really ready yet.
>
>  
>


From mtilstra at redhat.com  Fri Jul  1 15:12:50 2005
From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra)
Date: Fri, 1 Jul 2005 10:12:50 -0500
Subject: [Linux-cluster] Fencing the gulm master node: problem
In-Reply-To: <42C5009E.4080108@seanodes.com>
References: <42C5009E.4080108@seanodes.com>
Message-ID: <20050701151250.GA4400@redhat.com>

On Fri, Jul 01, 2005 at 10:36:46AM +0200, Alban Crequy wrote:
> I want to test the fencing capability of GFS by unplugging the network on a 
> node. But I experience some problems when the node I unplug is the gulm 
> master.
> 
> I am using the RPM:
>  - GFS-6.0.2.20-2
>  - GFS-modules-smp-6.0.2.20-2
> 
> I have a 8-nodes cluster (sam21, sam22, ..., sam28). I mount a GFS 
> filesystem on all nodes on /mnt/gfs
[snip]
> Is this a bug? Or a misunderstanding of the fencing mechanism?

No, you've got the right idea.  It looks like a bug.  ick.
So can you file a bugzilla for this?  
Also, can you try things where there is no gfs mounted on the gulm
master, then unplug that node and see if the cluster behaves?

thanks.
-- 
Michael Conrad Tadpol Tilstra
Any technology, no matter how primitive, is magic to those who don't
understand it.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050701/5a9eda33/attachment.sig>

From pcaulfie at redhat.com  Fri Jul  1 15:30:51 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Fri, 01 Jul 2005 16:30:51 +0100
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <42C54F86.8010402@possibilityforge.com>
References: <Pine.LNX.4.44.0506301806530.21204-100000@algiers.clic.cs.columbia.edu>	<42C46EAF.9050703@possibilityforge.com>	<42C54948.9070404@possibilityforge.com>	<42C54E27.1020303@redhat.com>
	<42C54F86.8010402@possibilityforge.com>
Message-ID: <42C561AB.7070309@redhat.com>

Jon Scottorn wrote:
> Yes, I am.  Which one should I use than?
> 

STABLE is probably the best CVS tag to use if you want something that
actually works - rather than testing new stuff.

Patrick


From lhh at redhat.com  Fri Jul  1 15:49:06 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 01 Jul 2005 11:49:06 -0400
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <16201.208.178.77.200.1120210310.squirrel@208.178.77.200>
References: <Pine.LNX.4.61.0506081351350.1493@tungsten.technicality.org>
	<Pine.LNX.4.61.0506241511410.26421@tungsten.technicality.org>
	<1119649059.13563.18.camel@ayanami.boston.redhat.com>
	<Pine.LNX.4.61.0506241700340.7553@tungsten.technicality.org>
	<42BF9FCF.7060000@redhat.com>
	<Pine.LNX.4.61.0506281007070.27044@tungsten.technicality.org>
	<42C24610.2@redhat.com>
	<Pine.LNX.4.61.0506290648370.18433@tungsten.technicality.org>
	<42C2AD19.5000601@redhat.com>
	<Pine.LNX.4.61.0506291346000.27941@tungsten.technicality.org>
	<42C39DFC.7070209@redhat.com>
	<Pine.LNX.4.61.0506300853460.2741@tungsten.technicality.org>
	<42C3FD0A.4040800@possibilityforge.com>	<42C4002C.80403@redhat.com>
	<42C402E9.2080502@possibilityforge.com>	<42C4067F.6000903@redhat.com>
	<Pine.LNX.4.61.0506301019310.2741@tungsten.technicality.org>
	<42C40F58.4040401@oakland.edu>
	<21339.208.178.77.200.1120145458.squirrel@208.178.77.200>
	<42C42352.3000600@possibilityforge.com>
	<12897.208.178.77.200.1120151133.squirrel@208.178.! 77.200>
	<42C429C7.3070200@possibilityforge.com>
	<16201.208.178.77.200.1120210310.squirrel@208.178.77.200>
Message-ID: <1120232946.13773.30.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-01 at 10:31 +0100, gwood at dragonhold.org wrote:

> If you're just after sharing the space on disk (and not sharing the data
> within the partitions) then clvm may be the answer - the only thing you
> won't be able to do (from memory) without quorum is resize the partitions.

You shouldn't even need CLVM if you don't intend to muddle with
partitions or cross-mount the file systems.  You'll lose resizing, but
in doing so, your clients no longer need to be cluster participants.

Partition once, and just make sure *everybody* mounts their own
partition.

-- Lon


From gwood at dragonhold.org  Fri Jul  1 15:53:45 2005
From: gwood at dragonhold.org (gwood at dragonhold.org)
Date: Fri, 1 Jul 2005 16:53:45 +0100 (BST)
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <1120232946.13773.30.camel@ayanami.boston.redhat.com>
References: <Pine.LNX.4.61.0506081351350.1493@tungsten.technicality.org>
	<Pine.LNX.4.61.0506241511410.26421@tungsten.technicality.org>
	<1119649059.13563.18.camel@ayanami.boston.redhat.com>
	<Pine.LNX.4.61.0506241700340.7553@tungsten.technicality.org>
	<42BF9FCF.7060000@redhat.com>
	<Pine.LNX.4.61.0506281007070.27044@tungsten.technicality.org>
	<42C24610.2@redhat.com>
	<Pine.LNX.4.61.0506290648370.18433@tungsten.technicality.org>
	<42C2AD19.5000601@redhat.com>
	<Pine.LNX.4.61.0506291346000.27941@tungsten.technicality.org>
	<42C39DFC.7070209@redhat.com>
	<Pine.LNX.4.61.0506300853460.2741@tungsten.technicality.org>
	<42C3FD0A.4040800@possibilityforge.com>	<42C4002C.80403@redhat.com>
	<42C402E9.2080502@possibilityforge.com>	<42C4067F.6000903@redhat.com>
	<Pine.LNX.4.61.0506301019310.2741@tungsten.technicality.org>
	<42C40F58.4040401@oakland.edu>
	<21339.208.178.77.200.1120145458.squirrel@208.178.77.200>
	<42C42352.3000600@possibilityforge.com>
	<12897.208.178.77.200.1120151133.squirrel@208.178.! 77.200>
	<42C429C7.3070200@possibilityforge.com>
	<16201.208.178.77.200.1120210310.squirrel@208.178.77.200>
	<1120232946.13773.30.camel@ayanami.boston.redhat.com>
Message-ID: <14504.208.178.77.200.1120233225.squirrel@208.178.77.200>

> You shouldn't even need CLVM if you don't intend to muddle with
> partitions or cross-mount the file systems.  You'll lose resizing, but
> in doing so, your clients no longer need to be cluster participants.
Oh certainly...

I was just trying to think of why someone would want to use GFS to do
this, and the idea of the machines not have pre-fixed partition sizes was
about the only thing I could think of.


From lhh at redhat.com  Fri Jul  1 15:58:44 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 01 Jul 2005 11:58:44 -0400
Subject: [Linux-cluster] ccsd performance / local sockets patch for
	STABLE / RHEL4 branches
In-Reply-To: <1120087941.28218.40.camel@ayanami.boston.redhat.com>
References: <1120087941.28218.40.camel@ayanami.boston.redhat.com>
Message-ID: <1120233524.13773.32.camel@ayanami.boston.redhat.com>

Committed to STABLE and HEAD.

-- Lon


From sasmaz at itu.edu.tr  Mon Jul  4 15:38:05 2005
From: sasmaz at itu.edu.tr (Aydin SASMAZ)
Date: Mon, 4 Jul 2005 18:38:05 +0300
Subject: [Linux-cluster] How to test GFS-6.0.2.20-1 on RHELAS3.0 ?
Message-ID: <!~!UENERkVCMDkAAQACAAAAAAAAAAAAAAAAABgAAAAAAAAAoqQoNjHj3E6N05HIoMcVWuKMAAAQAAAAQJg3eSz1ek6QEcDX+Zn+XAEAAAAA@itu.edu.tr>

Hi
 
I would like to be sure which method to test GFS-6.0.2.20-1installed 3 node
file cluster doesn't corrupt data while multiple nodes writing on the same
file. At the same time how LOCK_GULM manage filesystem ? Is there any known
test? or benchmark other than bonnie++ . I would like to deploy 3
filesystem, each one on one gfs server and export these file systems with
smb to other smb clients. I plan to use 3 smb instance for all 3 filesystem
on every server node and load balancer in front of them.
 
Platform            : 3x HP-DL380 G4
OP                   : RedHat Enterprise Linux Advanced Server 3.0 Update 4
Cluster Suite     : 3.0, clumanager-1.2.22-2
                            redhat-config-cluster-1.0.3-1.noarch.rpm
GFS                 :    GFS-devel-6.0.2.20-1
                            GFS-modules-smp-6.0.2.20-1
                            GFS-6.0.2.20-1
Fencing Dev      : fence_ilo
 
 
Any advice would be appreciated. 
 
Some notes about system :
------------------------------------------
 
[root at gfs-test2 root]# gfs_tool df
/users/lnxsrv1:
  SB lock proto = "lock_gulm"
  SB lock table = "gfs-test:lnxsrv1"
  SB ondisk format = 1308
  SB multihost format = 1401
  Block size = 4096
  Journals = 8
  Resource Groups = 1596
  Mounted lock proto = "lock_gulm"
  Mounted lock table = "gfs-test:lnxsrv1"
  Mounted host data = ""
  Journal number = 1
  Lock module flags = async 
  Local flocks = FALSE
  Local caching = FALSE
 
  Type           Total          Used           Free           use%

  ------------------------------------------------------------------------
  inodes         8              8              0              100%
  metadata       92131          60723          31408          66%
  data           104492113      30341120       74150993       29%
 
/users/lnxsrv2:
  SB lock proto = "lock_gulm"
  SB lock table = "gfs-test:lnxsrv2"
  SB ondisk format = 1308
  SB multihost format = 1401
  Block size = 4096
  Journals = 8
  Resource Groups = 1596
  Mounted lock proto = "lock_gulm"
  Mounted lock table = "gfs-test:lnxsrv2"
  Mounted host data = ""
  Journal number = 1
  Lock module flags = async 
  Local flocks = FALSE
  Local caching = FALSE
 
  Type           Total          Used           Free           use%

  ------------------------------------------------------------------------
  inodes         5              5              0              100%
  metadata       38             38             0              100%
  data           104584209      0              104584209      0%
 
/users/lnxsrv3:
  SB lock proto = "lock_gulm"
  SB lock table = "gfs-test:lnxsrv3"
  SB ondisk format = 1308
  SB multihost format = 1401
  Block size = 4096
  Journals = 8
  Resource Groups = 396
  Mounted lock proto = "lock_gulm"
  Mounted lock table = "gfs-test:lnxsrv3"
  Mounted host data = ""
  Journal number = 1
  Lock module flags = async 
  Local flocks = FALSE
  Local caching = FALSE
 
  Type           Total          Used           Free           use%

  ------------------------------------------------------------------------
  inodes         5              5              0              100%
  metadata       10             10             0              100%
  data           25949437       0              25949437       0%

[root at gfs-test1 root]# gulm_tool nodeinfo gfs-test{3,1}.-----
 Name: gfs-test1.-----
  ip    = 160.75.100.22
  state = Logged in
  mode = Slave
  missed beats = 0
  last beat = 1120235791271378
  delay avg = 6672475
  max delay = 9459850
 
[root at gfs-test1 root]# gulm_tool nodeinfo gfs-test{3,2}.------
 Name: gfs-test2.------
  ip    = 160.75.100.23
  state = Logged in
  mode = Slave
  missed beats = 0
  last beat = 1120235802888768
  delay avg = 6678723
  max delay = 6880217


Aydin SASMAZ
System Support Engineer
ITU BIDB
Phone: +90 212 2853930
Fax    : +90 212 2856936

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050704/79726010/attachment.htm>

From Birger.Wathne at ift.uib.no  Mon Jul  4 20:16:01 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Mon, 04 Jul 2005 22:16:01 +0200
Subject: [Linux-cluster] NFS exports disappeared
Message-ID: <42C99901.8090805@uib.no>

I currently have a situation on my FC4 system where all exports for one of 
the file systems disappeared.

Other file systems using the same nfsclient entries are still there, but I 
have not yet checked that all entries are there for all file systems. The 
one file system that had lost all exports was so obvious I spent my time 
trying to debug that one.

I tried bumping up the version number of cluster.conf and ran ccs_tool 
update with no messages that look like errors, but still no exports. I tried 
killing the clurmtabd process for that file system and kill -HUP clurgmgrd. 
It restarted the process, but still no exports. I then removed two nfsclient 
lines, bumped up the version, ran ccs_tool, restored the missing lines, 
bumped up the version and ran ccs_tool again. Now those two exports are 
back, but there others for that file system are still missing.

One client had the file system mounted and had stale NFS handles. After the 
above procedure the file system was ok without any remount, so that is good.

How can I find all info needed to debug this? What do you want?


From haydar2906 at hotmail.com  Tue Jul  5 04:49:30 2005
From: haydar2906 at hotmail.com (haydar Ali)
Date: Tue, 05 Jul 2005 00:49:30 -0400
Subject: [Linux-cluster] RedHat Cluster suite installation
Message-ID: <BAY104-F375EDC93E5CEB0541CADC0C5E60@phx.gbl>

Hi,

I'm looking for an installing and configuring procedure for RedHat  Cluster 
suite (examples).
We have 2 servers HP Proliant 380 G3 (RedHat Advanced Server 2.1) attached 
by fiber optic to the storage area network SAN HP MSA1000 and we want to 
install and configure Cluster suite to allow 2 servers to simultaneously 
read and write to a single shared file system by NFS (Word documents located 
into /u04) on the SAN.

Thanks.

Haydar


From npf at eurotux.com  Tue Jul  5 09:18:53 2005
From: npf at eurotux.com (Nuno Pais Fernandes)
Date: Tue, 5 Jul 2005 10:18:53 +0100
Subject: [Linux-cluster] RedHat Cluster suite installation
In-Reply-To: <BAY104-F375EDC93E5CEB0541CADC0C5E60@phx.gbl>
References: <BAY104-F375EDC93E5CEB0541CADC0C5E60@phx.gbl>
Message-ID: <200507051018.57454.npf@eurotux.com>

Hi

On Tuesday 05 July 2005 05:49, haydar Ali wrote:
> Hi,
>
> I'm looking for an installing and configuring procedure for RedHat  Cluster
> suite (examples).
> We have 2 servers HP Proliant 380 G3 (RedHat Advanced Server 2.1) attached
> by fiber optic to the storage area network SAN HP MSA1000 and we want to
> install and configure Cluster suite to allow 2 servers to simultaneously
> read and write to a single shared file system by NFS (Word documents

Read and write isn't possible with cluster suite.. you have to use GFS.

Nuno Fernandes

> located into /u04) on the SAN.
>
> Thanks.
>
> Haydar
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster

-- 
--------------------------
Nuno Miguel Pais Fernandes
<npf at eurotux.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050705/e47d44bf/attachment.sig>

From yazan at ccs.com.jo  Tue Jul  5 12:06:52 2005
From: yazan at ccs.com.jo (Yazan Al-Sheyyab)
Date: Tue, 5 Jul 2005 14:06:52 +0200
Subject: [Linux-cluster] RedHat Cluster suite installation
References: <BAY104-F375EDC93E5CEB0541CADC0C5E60@phx.gbl>
Message-ID: <001c01c5815a$07a65c30$69050364@yazanz>


 You have to use GFS in order to reach the shared from the two nodes at the 
same time.

  and about the documents , You can find many of them in the redhat network 
or on the net.


Regards
-------------------------------------------------

Yazan


----- Original Message ----- 
From: "haydar Ali" <haydar2906 at hotmail.com>
To: <linux-cluster at redhat.com>
Sent: Tuesday, July 05, 2005 6:49 AM
Subject: [Linux-cluster] RedHat Cluster suite installation


> Hi,
>
> I'm looking for an installing and configuring procedure for RedHat 
> Cluster suite (examples).
> We have 2 servers HP Proliant 380 G3 (RedHat Advanced Server 2.1) attached 
> by fiber optic to the storage area network SAN HP MSA1000 and we want to 
> install and configure Cluster suite to allow 2 servers to simultaneously 
> read and write to a single shared file system by NFS (Word documents 
> located into /u04) on the SAN.
>
> Thanks.
>
> Haydar
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster 


From haydar2906 at hotmail.com  Tue Jul  5 12:26:44 2005
From: haydar2906 at hotmail.com (haydar Ali)
Date: Tue, 05 Jul 2005 08:26:44 -0400
Subject: [Linux-cluster] RedHat Cluster suite installation
In-Reply-To: <001c01c5815a$07a65c30$69050364@yazanz>
Message-ID: <BAY104-F404DC1317CBD4C6AEAF2CDC5E60@phx.gbl>

Hi Yazan,

Yes I know that I have to use GFS to reach the shared from the two nodes at 
the same time but the RedHat technical support said me that I need 3 nodes 
(3 servers) connected to my SAN and we have only 2 nodes connected by 2 
Fiber channel each.
My question : can I use only 2 nodes in GFS solution??

Thanks

Haydar

>From: "Yazan Al-Sheyyab" <yazan at ccs.com.jo>
>Reply-To: linux clustering <linux-cluster at redhat.com>
>To: "linux clustering" <linux-cluster at redhat.com>
>Subject: Re: [Linux-cluster] RedHat Cluster suite installation
>Date: Tue, 5 Jul 2005 14:06:52 +0200
>
>
>You have to use GFS in order to reach the shared from the two nodes at the 
>same time.
>
>  and about the documents , You can find many of them in the redhat network 
>or on the net.
>
>
>
>Regards
>-------------------------------------------------
>
>Yazan
>
>
>----- Original Message ----- From: "haydar Ali" <haydar2906 at hotmail.com>
>To: <linux-cluster at redhat.com>
>Sent: Tuesday, July 05, 2005 6:49 AM
>Subject: [Linux-cluster] RedHat Cluster suite installation
>
>
>>Hi,
>>
>>I'm looking for an installing and configuring procedure for RedHat Cluster 
>>suite (examples).
>>We have 2 servers HP Proliant 380 G3 (RedHat Advanced Server 2.1) attached 
>>by fiber optic to the storage area network SAN HP MSA1000 and we want to 
>>install and configure Cluster suite to allow 2 servers to simultaneously 
>>read and write to a single shared file system by NFS (Word documents 
>>located into /u04) on the SAN.
>>
>>Thanks.
>>
>>Haydar
>>
>>
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>http://www.redhat.com/mailman/listinfo/linux-cluster
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>http://www.redhat.com/mailman/listinfo/linux-cluster


From REdelman at netstandard.com  Tue Jul  5 13:20:42 2005
From: REdelman at netstandard.com (Rich Edelman)
Date: Tue, 5 Jul 2005 08:20:42 -0500
Subject: [Linux-cluster] Fencing Problems
Message-ID: <06BAD0EE281D8C4F96DBCEC41260D47BEE8AF0@nsimail03.NetStandard.inc>

Hi all.

I've got a 2 node FC4 system here, connected via a Brocade FC switch to
FC SAN. I've followed usage.txt to a T, but seem to have some fencing
problems.

What happens is if one of the 2 nodes loses network connectivity, that
node immediately gets fenced by the remaining node, which is all fine
and good. The problem here is the fenced node (the one without network
connectivity) starts trying to fence the remaining node, and when
network connectivity is restored, succeeds in fencing the other node.
Now both my nodes are fenced. And not only that, but I have the 'split
brain' scenario, where each node thinks it is the only member of a
similarly named cluster.

Any ideas how to fix this?

Thanks,

Rich Edelman


From pcaulfie at redhat.com  Tue Jul  5 13:32:23 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Tue, 05 Jul 2005 14:32:23 +0100
Subject: [Linux-cluster] Fencing Problems
In-Reply-To: <06BAD0EE281D8C4F96DBCEC41260D47BEE8AF0@nsimail03.NetStandard.inc>
References: <06BAD0EE281D8C4F96DBCEC41260D47BEE8AF0@nsimail03.NetStandard.inc>
Message-ID: <42CA8BE7.4060701@redhat.com>

Rich Edelman wrote:
> Hi all.
> 
> I've got a 2 node FC4 system here, connected via a Brocade FC switch to
> FC SAN. I've followed usage.txt to a T, but seem to have some fencing
> problems.
> 
> What happens is if one of the 2 nodes loses network connectivity, that
> node immediately gets fenced by the remaining node, which is all fine
> and good. The problem here is the fenced node (the one without network
> connectivity) starts trying to fence the remaining node, and when
> network connectivity is restored, succeeds in fencing the other node.
> Now both my nodes are fenced. And not only that, but I have the 'split
> brain' scenario, where each node thinks it is the only member of a
> similarly named cluster.
> 
> Any ideas how to fix this?
> 

For a start you need to set the 2-node option to cman in the
/etc/cluster/cluster.conf file.

I think that 2-node option only works with power-switch fencing. It relies on
the fact that the two nodes will race to fence each other and only the first one
suceeds because the second is then dead.
-- 

patrick


From REdelman at netstandard.com  Tue Jul  5 13:44:32 2005
From: REdelman at netstandard.com (Rich Edelman)
Date: Tue, 5 Jul 2005 08:44:32 -0500
Subject: [Linux-cluster] Fencing Problems
Message-ID: <06BAD0EE281D8C4F96DBCEC41260D47BEE8AF3@nsimail03.NetStandard.inc>

Ah, yes, I did set the two-node option to cman in
/etc/cluster/cluster.conf.

Perhaps some documentation should be updated... But would it not be
better to have some kind of timeout value on the nodes when trying to
fence the others? Seems to me that in a two-node scenario, if one node
is repeatedly trying to fence the other, and after say 10 tries it
can't, it should just give up and assume it is the problem node. Then,
hopefully, everything else would go quite well when that node came back.

Rich

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Patrick Caulfield
Sent: Tuesday, July 05, 2005 8:32 AM
To: linux clustering
Subject: Re: [Linux-cluster] Fencing Problems

Rich Edelman wrote:
> Hi all.
> 
> I've got a 2 node FC4 system here, connected via a Brocade FC switch 
> to FC SAN. I've followed usage.txt to a T, but seem to have some 
> fencing problems.
> 
> What happens is if one of the 2 nodes loses network connectivity, that

> node immediately gets fenced by the remaining node, which is all fine 
> and good. The problem here is the fenced node (the one without network
> connectivity) starts trying to fence the remaining node, and when 
> network connectivity is restored, succeeds in fencing the other node.
> Now both my nodes are fenced. And not only that, but I have the 'split

> brain' scenario, where each node thinks it is the only member of a 
> similarly named cluster.
> 
> Any ideas how to fix this?
> 

For a start you need to set the 2-node option to cman in the
/etc/cluster/cluster.conf file.

I think that 2-node option only works with power-switch fencing. It
relies on the fact that the two nodes will race to fence each other and
only the first one suceeds because the second is then dead.
-- 

patrick

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster


From pcaulfie at redhat.com  Tue Jul  5 14:25:27 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Tue, 05 Jul 2005 15:25:27 +0100
Subject: [Linux-cluster] Fencing Problems
In-Reply-To: <06BAD0EE281D8C4F96DBCEC41260D47BEE8AF3@nsimail03.NetStandard.inc>
References: <06BAD0EE281D8C4F96DBCEC41260D47BEE8AF3@nsimail03.NetStandard.inc>
Message-ID: <42CA9857.9050507@redhat.com>

Rich Edelman wrote:
> Ah, yes, I did set the two-node option to cman in
> /etc/cluster/cluster.conf.
> 
> Perhaps some documentation should be updated... But would it not be
> better to have some kind of timeout value on the nodes when trying to
> fence the others? Seems to me that in a two-node scenario, if one node
> is repeatedly trying to fence the other, and after say 10 tries it
> can't, it should just give up and assume it is the problem node. Then,
> hopefully, everything else would go quite well when that node came back.
> 

Data integrity requires that the remaining node knows for sure that the other
node has been removed from the cluster. Without that assurance your data is toast.
-- 

patrick


From forgue at oakland.edu  Tue Jul  5 14:30:45 2005
From: forgue at oakland.edu (Andrew Forgue)
Date: Tue, 05 Jul 2005 10:30:45 -0400
Subject: [Linux-cluster] Fencing Problems
In-Reply-To: <42CA9857.9050507@redhat.com>
References: <06BAD0EE281D8C4F96DBCEC41260D47BEE8AF3@nsimail03.NetStandard.inc>
	<42CA9857.9050507@redhat.com>
Message-ID: <42CA9995.5000503@oakland.edu>


Patrick Caulfield wrote:

>
>Data integrity requires that the remaining node knows for sure that the other
>node has been removed from the cluster. Without that assurance your data is toast.
>

Would this be possible if GFS implemented an on-disk quota system?

-- 
Andrew J. Forgue
Systems Programmer II :: Oakland University
forgue at oakland.edu

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 256 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050705/9f692d8b/attachment.sig>

From pcaulfie at redhat.com  Tue Jul  5 14:48:07 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Tue, 05 Jul 2005 15:48:07 +0100
Subject: [Linux-cluster] Fencing Problems
In-Reply-To: <42CA9995.5000503@oakland.edu>
References: <06BAD0EE281D8C4F96DBCEC41260D47BEE8AF3@nsimail03.NetStandard.inc>	<42CA9857.9050507@redhat.com>
	<42CA9995.5000503@oakland.edu>
Message-ID: <42CA9DA7.10804@redhat.com>

Andrew Forgue wrote:
> Patrick Caulfield wrote:
> 
> 
>>Data integrity requires that the remaining node knows for sure that the other
>>node has been removed from the cluster. Without that assurance your data is toast.
>>
> 
> 
> Would this be possible if GFS implemented an on-disk quota system?
> 

GFS does have quotas but I don't see what they have to do with fencing. If you
meant a "quorum disk" then the answer is "yes". But we don't have a quorum disk
facility.

-- 

patrick


From lhh at redhat.com  Tue Jul  5 14:49:47 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 05 Jul 2005 10:49:47 -0400
Subject: [Linux-cluster] Fencing Problems
In-Reply-To: <42CA9995.5000503@oakland.edu>
Message-ID: <20050705.i7H.30528600@egw.corp.redhat.com>

Andrew Forgue (forgue at oakland.edu) wrote:

> Would this be possible if GFS implemented an on-disk quota system?

(Quorum?)

In either case (quorum *or* quota), fencing is still required, as there's no
assurance that a rogue node can't flush buffers.

Fencing *forcefully* prevents nodes from flushing any outstanding buffers to
disk after we've declared them out of the cluster, regardless of the
membership/quorum model used.

-- Lon


From lhh at redhat.com  Tue Jul  5 15:02:50 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 05 Jul 2005 11:02:50 -0400
Subject: [Linux-cluster] Fencing Problems
In-Reply-To: <42CA9DA7.10804@redhat.com>
Message-ID: <20050705.Lw9.35654900@egw.corp.redhat.com>

Patrick Caulfield (pcaulfie at redhat.com) wrote:

> > Would this be possible if GFS implemented an on-disk quota system?
> >
>
> GFS does have quotas but I don't see what they have to do with fencing. If you
> meant a "quorum disk" then the answer is "yes". But we don't have a quorum disk
> facility.

How does the presence or use of a quorum disk prevent queued buffers from
flying out and hitting the disk?  Typically, quorum disks have been used to
prevent (or detect) multiple partitions, as a primary form of cluster
membership, etc.

A SCSI reservation as a secondary quorum determinant is more like "fencing"
than a traditional quorum disk, but that's open to argument of course ;)

-- Lon


From haydar2906 at hotmail.com  Tue Jul  5 16:22:15 2005
From: haydar2906 at hotmail.com (haydar Ali)
Date: Tue, 05 Jul 2005 12:22:15 -0400
Subject: [Linux-cluster] GFS installation
In-Reply-To: <20050616153316.39456.qmail@web52910.mail.yahoo.com>
Message-ID: <BAY104-F167E0837063F401106879EC5E60@phx.gbl>

Hi Igor,

Thanks for this URL.
My question is: Have I to use 3 nodes to achieve GFS solution?
We have 2 servers HP Proliant 380 G3 (RedHat Advanced Server 2.1) attached 
by 2 fiber channels each to the storage area network SAN HP MSA1000 and we 
want to install and configure GFS to allow 2 servers to simultaneously read 
and write to a single shared file system (Word documents located into /u04) 
located on the Storage area network SAN HP MSA1000.
I read the example that you have sent to me and I see 3 nodes, 2 client 
nodes share a directory mounted on the 3d server node, but our solution the 
directory is located in the SAN.

Have you any explanation or ideas for our request?
Thanks

Haydar


>From: Igor <logastellus at yahoo.com>
>Reply-To: linux clustering <linux-cluster at redhat.com>
>To: linux clustering <linux-cluster at redhat.com>
>Subject: Re: [Linux-cluster] GFS installation
>Date: Thu, 16 Jun 2005 08:33:16 -0700 (PDT)
>
>Look at this URL that David suggested to me:
>
>http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/doc/min-gfs.txt?rev=1.3&content-type=text/x-cvsweb-markup&cvsroot=cluster
>
>it's pretty good.
>
>--- haydar Ali <haydar2906 at hotmail.com> wrote:
>
> > Hi,
> >
> > I'm looking for an installing and configuring
> > procedure for GFS (examples).
> > We have 2 servers HP Proliant 380 G3 (RedHat
> > Advanced Server 2.1) attached
> > by fiber optic to the storage area network SAN HP
> > MSA1000 and we want to
> > install and configure GFS to allow 2 servers to
> > simultaneously read and
> > write to a single shared file system (Word documents
> > located into /u04) on
> > the SAN.
> >
> > Thanks.
> >
> > Haydar
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > http://www.redhat.com/mailman/listinfo/linux-cluster
> >
>
>
>
>
>__________________________________
>Discover Yahoo!
>Stay in touch with email, IM, photo sharing and more. Check it out!
>http://discover.yahoo.com/stayintouch.html
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>http://www.redhat.com/mailman/listinfo/linux-cluster


From jscottorn at possibilityforge.com  Tue Jul  5 17:04:30 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Tue, 05 Jul 2005 11:04:30 -0600
Subject: [Linux-cluster] GFS freezing
Message-ID: <42CABD9E.8040404@possibilityforge.com>

Hi,

   I am wondering what is going wrong with my setup.  I am running a
min-gfs setup with 2 nodes accessing the storage server.  The node that
is the storage server is also acting as a client node.  So I have 2
physical systems only.  I have brought up the cluster and it works
fine.  Both systems can access the gfs share and what not.  When I go to
copy data to the share with rsync, or scp or cp, they gfs system freezes
and doesn't finish putting the data on gfs.  It will get about half way
through and that's it.  Have I missed something in the setup or what
else would I be missing?

Thanks,

Jon


From Scott.Money at lycos-inc.com  Tue Jul  5 18:32:00 2005
From: Scott.Money at lycos-inc.com (Scott.Money at lycos-inc.com)
Date: Tue, 5 Jul 2005 14:32:00 -0400
Subject: [Linux-cluster] GFS freezing
In-Reply-To: <42CABD9E.8040404@possibilityforge.com>
Message-ID: <OF5F232017.E24E8DDB-ON85257035.00651FB3-85257035.0066283C@ma.lycos.com>

We are seeing a similar issue. We have a 3 node gfs system that uses a 
gnbd server as storage. We originally ran into this problem quite 
frequently, but hard-setting our NICs to 100Mbit  full duplex has limited 
the system freezes to "large" data transfers. (e.g. copying 500mb files 
via scp or creating 500mb Oracle tablespaces). The good news is that the 
fencing works ;-)

Let me know if you get any information about this.

$cott


Jon Scottorn <jscottorn at possibilityforge.com> 
Sent by: linux-cluster-bounces at redhat.com
07/05/2005 01:04 PM
Please respond to
linux clustering <linux-cluster at redhat.com>


To
linux clustering <linux-cluster at redhat.com>
cc

Subject
[Linux-cluster] GFS freezing


Hi,

   I am wondering what is going wrong with my setup.  I am running a
min-gfs setup with 2 nodes accessing the storage server.  The node that
is the storage server is also acting as a client node.  So I have 2
physical systems only.  I have brought up the cluster and it works
fine.  Both systems can access the gfs share and what not.  When I go to
copy data to the share with rsync, or scp or cp, they gfs system freezes
and doesn't finish putting the data on gfs.  It will get about half way
through and that's it.  Have I missed something in the setup or what
else would I be missing?

Thanks,

Jon

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050705/dbf1cb36/attachment.htm>

From mtilstra at redhat.com  Tue Jul  5 19:06:25 2005
From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra)
Date: Tue, 5 Jul 2005 14:06:25 -0500
Subject: [Linux-cluster] GFS freezing
In-Reply-To: <OF5F232017.E24E8DDB-ON85257035.00651FB3-85257035.0066283C@ma.lycos.com>
References: <42CABD9E.8040404@possibilityforge.com>
	<OF5F232017.E24E8DDB-ON85257035.00651FB3-85257035.0066283C@ma.lycos.com>
Message-ID: <20050705190625.GA16854@redhat.com>

On Tue, Jul 05, 2005 at 02:32:00PM -0400, Scott.Money at lycos-inc.com wrote:
>    We are seeing a similar issue. We have a 3 node gfs system that
>    uses a gnbd server as storage. We originally ran into this problem
>    quite frequently, but hard-setting our NICs to 100Mbit  full duplex
>    has limited the system freezes to "large" data transfers. (e.g.
>    copying 500mb files via scp or creating 500mb Oracle tablespaces).
>    The good news is that the fencing works ;-)
>    Let me know if you get any information about this.

What you describe here sounds more like flooding of the network.  If you
send too much data over the same network device as the heartbeat&locking
traffic, you can starve out the heatbeats.  There was a bunch of emails
about this already on this list.  The way to deal with it is one of
1: don't ever flood the network, 2: use a provate network for heartbeats
& lock traffic, 3: use the traffic shaping kernel modules to provide a
garunteed bandwidth for the heartbeat & locking traffic.

-- 
Michael Conrad Tadpol Tilstra
What is your one purpose in life?
To explode of course!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050705/92becb81/attachment.sig>

From jscottorn at possibilityforge.com  Tue Jul  5 19:40:25 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Tue, 05 Jul 2005 13:40:25 -0600
Subject: [Linux-cluster] GFS freezing
In-Reply-To: <20050705190625.GA16854@redhat.com>
References: <42CABD9E.8040404@possibilityforge.com>	<OF5F232017.E24E8DDB-ON85257035.00651FB3-85257035.0066283C@ma.lycos.com>
	<20050705190625.GA16854@redhat.com>
Message-ID: <42CAE229.6000207@possibilityforge.com>

We already have #2 setup that way.  Also we are not flooding the
network.  I am copying data from with the storage server.  Also.  Here
is what my gfs_tool df displays.  I am wondering if the inodes and such
being so low will cause an issue.  I would think that is shouldn't be
what it is.

SB lock proto = "lock_dlm"
  SB lock table = "SAN1:VserversFS"
  SB ondisk format = 1309
  SB multihost format = 1401
  Block size = 4096
  Journals = 4
  Resource Groups = 2794
  Mounted lock proto = "lock_dlm"
  Mounted lock table = "SAN1:VserversFS"
  Mounted host data = ""
  Journal number = 0
  Lock module flags =
  Local flocks = FALSE
  Local caching = FALSE
  Oopses OK = FALSE

  Type           Total          Used           Free           use%
  ------------------------------------------------------------------------
  inodes         5              5              0              100%
  metadata       66             66             0              100%
  data           182996209      0              182996209      0%


Michael Conrad Tadpol Tilstra wrote:

>On Tue, Jul 05, 2005 at 02:32:00PM -0400, Scott.Money at lycos-inc.com wrote:
>  
>
>>   We are seeing a similar issue. We have a 3 node gfs system that
>>   uses a gnbd server as storage. We originally ran into this problem
>>   quite frequently, but hard-setting our NICs to 100Mbit  full duplex
>>   has limited the system freezes to "large" data transfers. (e.g.
>>   copying 500mb files via scp or creating 500mb Oracle tablespaces).
>>   The good news is that the fencing works ;-)
>>   Let me know if you get any information about this.
>>    
>>
>
>What you describe here sounds more like flooding of the network.  If you
>send too much data over the same network device as the heartbeat&locking
>traffic, you can starve out the heatbeats.  There was a bunch of emails
>about this already on this list.  The way to deal with it is one of
>1: don't ever flood the network, 2: use a provate network for heartbeats
>& lock traffic, 3: use the traffic shaping kernel modules to provide a
>garunteed bandwidth for the heartbeat & locking traffic.
>
>  
>
>------------------------------------------------------------------------
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>http://www.redhat.com/mailman/listinfo/linux-cluster
>


From mtilstra at redhat.com  Tue Jul  5 20:22:28 2005
From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra)
Date: Tue, 5 Jul 2005 15:22:28 -0500
Subject: [Linux-cluster] GFS freezing
In-Reply-To: <42CAE229.6000207@possibilityforge.com>
References: <42CABD9E.8040404@possibilityforge.com>
	<OF5F232017.E24E8DDB-ON85257035.00651FB3-85257035.0066283C@ma.lycos.com>
	<20050705190625.GA16854@redhat.com>
	<42CAE229.6000207@possibilityforge.com>
Message-ID: <20050705202228.GA17623@redhat.com>

On Tue, Jul 05, 2005 at 01:40:25PM -0600, Jon Scottorn wrote:
> We already have #2 setup that way.  Also we are not flooding the
> network.  I am copying data from with the storage server.  Also.  Here
> is what my gfs_tool df displays.  I am wondering if the inodes and such
> being so low will cause an issue.  I would think that is shouldn't be
> what it is.

yeah, I was responding specifically to Scott.Money at lycos-inc.com, sorry
for not being clear on that.


-- 
Michael Conrad Tadpol Tilstra
Too many errors on one line (make fewer)
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050705/02c918b3/attachment.sig>

From jscottorn at possibilityforge.com  Tue Jul  5 22:35:36 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Tue, 05 Jul 2005 16:35:36 -0600
Subject: [Linux-cluster] Network problems
Message-ID: <42CB0B38.7060000@possibilityforge.com>

Hi,

   How does a node rejoin the cluster if network connectivity is lost?

I have 3 nodes that are in the cluster.  I have unplugged the network
connection from one of the nodes, for testing purposes, the cluster sees
the node disappear, when I plug the network connection back in it does
not rejoin the cluster.  I looks as though it forms it's own cluster.  ie...

cat /proc/cluster/nodes shows two of the nodes with the third missing.
cat /proc/cluster/nodes on the system that lost network connectivity
shows itself in the clusterand the other two nodes missing and the
quorum is locked as well.

Can this issue be prevented when network connectivity is lost or when
that happens do I just have to reboot the system?

Jon


From fajar at telkom.net.id  Wed Jul  6 02:25:59 2005
From: fajar at telkom.net.id (Fajar A. Nugraha)
Date: Wed, 06 Jul 2005 09:25:59 +0700
Subject: [Linux-cluster] RedHat Cluster suite installation
In-Reply-To: <BAY104-F404DC1317CBD4C6AEAF2CDC5E60@phx.gbl>
References: <BAY104-F404DC1317CBD4C6AEAF2CDC5E60@phx.gbl>
Message-ID: <42CB4137.30204@telkom.net.id>

haydar Ali wrote:

> Hi Yazan,
>
> Yes I know that I have to use GFS to reach the shared from the two 
> nodes at the same time but the RedHat technical support said me that I 
> need 3 nodes (3 servers) connected to my SAN and we have only 2 nodes 
> connected by 2 Fiber channel each.
> My question : can I use only 2 nodes in GFS solution??
>
You can with RHEL4 and http://sources.redhat.com/cluster/ . I believe 
the official (supported) version was released recently but I can't find 
the PR stuff yet.

Regards,

Fajar


From jason at selu.edu  Wed Jul  6 03:52:09 2005
From: jason at selu.edu (Jason Lanclos)
Date: Tue, 5 Jul 2005 22:52:09 -0500
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <14504.208.178.77.200.1120233225.squirrel@208.178.77.200>
References: <Pine.LNX.4.61.0506081351350.1493@tungsten.technicality.org>
	<1120232946.13773.30.camel@ayanami.boston.redhat.com>
	<14504.208.178.77.200.1120233225.squirrel@208.178.77.200>
Message-ID: <200507052252.09456.jason@selu.edu>


> > You shouldn't even need CLVM if you don't intend to muddle with
> > partitions or cross-mount the file systems.  You'll lose resizing, but
> > in doing so, your clients no longer need to be cluster participants.
>

CLVM is cool, but its pretty much useless until LVM2 actually implements 
pvresize or a pvextend. One of the main advantages of having a SAN is being 
able to add space to a volume (LUN)   Currently when we expand a volume on 
the san, we have to unmount the filesystem, rescan the LUNs, then run fdisk 
on that volume, delete the partition entry, and recreate it to use all the 
space.. then at that point we can run ext2online / gfs_grow to resize the 
filesystem.

I would be VERY nice if pvresize / pvextend existed, that way one could expand 
the volume on the SAN, rescan LUNs on each cluster member, run pvresize / 
pvextend, run lvextend and then gfs_grow and call it a day.

I mentioned this at the RedHat summit, and got a few puzzled looks, but never 
got an answer on the matter.


> Oh certainly...
>
> I was just trying to think of why someone would want to use GFS to do
> this, and the idea of the machines not have pre-fixed partition sizes was
> about the only thing I could think of.
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster


From fajar at telkom.net.id  Wed Jul  6 04:04:08 2005
From: fajar at telkom.net.id (Fajar A. Nugraha)
Date: Wed, 06 Jul 2005 11:04:08 +0700
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <200507052252.09456.jason@selu.edu>
References: <Pine.LNX.4.61.0506081351350.1493@tungsten.technicality.org>	<1120232946.13773.30.camel@ayanami.boston.redhat.com>	<14504.208.178.77.200.1120233225.squirrel@208.178.77.200>
	<200507052252.09456.jason@selu.edu>
Message-ID: <42CB5838.4060802@telkom.net.id>

Jason Lanclos wrote:

>>>You shouldn't even need CLVM if you don't intend to muddle with
>>>partitions or cross-mount the file systems.  You'll lose resizing, but
>>>in doing so, your clients no longer need to be cluster participants.
>>>      
>>>
>
>CLVM is cool, but its pretty much useless until LVM2 actually implements 
>pvresize or a pvextend. 
>
What's wrong with vgextend?

>One of the main advantages of having a SAN is being 
>able to add space to a volume (LUN)   Currently when we expand a volume on 
>the san, we have to unmount the filesystem, rescan the LUNs, then run fdisk 
>on that volume, delete the partition entry, and recreate it to use all the 
>space.. then at that point we can run ext2online / gfs_grow to resize the 
>filesystem.
>
>  
>
A simpler method will be create a NEW LUN, scan it, pvcreate, add it to 
existing VG with vgextend, and extend your volume lvextend. At least you 
don't have to mess with existing partitions.

>I would be VERY nice if pvresize / pvextend existed, that way one could expand 
>the volume on the SAN, 
>
My IBM SAN can't expand exisiting volume on the SAN. It's not like I 
need it though. The above solution works better.

>rescan LUNs on each cluster member, run pvresize / 
>pvextend, run lvextend and then gfs_grow and call it a day.
>
>  
>
My only problem (for now) is I can't rescan LUNs without removing HBA 
modules (thus unmounting filesystems or restarting the server). If you 
can rescan LUNs online, it's a simple method of pvcreate, vgextend, 
lvextend, gfs_grow, and call it a day.

>I mentioned this at the RedHat summit, and got a few puzzled looks, 
>
I wonder why :-D

Regards,

Fajar


From JACOB_LIBERMAN at Dell.com  Wed Jul  6 04:04:56 2005
From: JACOB_LIBERMAN at Dell.com (JACOB_LIBERMAN at Dell.com)
Date: Tue, 5 Jul 2005 23:04:56 -0500
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
Message-ID: <BC430F453501174992B9D9E8AFB7519A085568@ausx3mps309.aus.amer.dell.com>

Jason,

LVM2 has a vgextend command that can be used to add a new PV to a vg.
There is also an lvextend and ext2online so you can increase the size of
the lv without dismounting it. This is slightly different than a
pvextend, but I believe it achieves the same end result.

So a process you might be able to try is:

1. extend the LUN on the SAN
2. use fdisk to create a new partition on the LUN
3. use partprobe -s to dynamically redetect the new partition
4. make a new filesystem on the partition
5. use vgextend to add the new PV to the VG
6. use lvextend to add new space to the LV
7. use ext2online of gfs_grow to tarck out the filesystem on the new
space *without* dismounting or rebooting

I've used to above procedure under RHEL4 with LVM2 and an ext3
filesystem. I have not tried it w/GFS yet because our reference lab is
down for maintenance.

Thanks, jacob

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jason Lanclos
> Sent: Tuesday, July 05, 2005 10:52 PM
> To: linux clustering
> Subject: Re: [Linux-cluster] A few GFS newbie questions: journals, etc
> 
> 
> > > You shouldn't even need CLVM if you don't intend to muddle with 
> > > partitions or cross-mount the file systems.  You'll lose 
> resizing, 
> > > but in doing so, your clients no longer need to be 
> cluster participants.
> >
> 
> CLVM is cool, but its pretty much useless until LVM2 actually 
> implements pvresize or a pvextend. One of the main advantages 
> of having a SAN is being 
> able to add space to a volume (LUN)   Currently when we 
> expand a volume on 
> the san, we have to unmount the filesystem, rescan the LUNs, 
> then run fdisk on that volume, delete the partition entry, 
> and recreate it to use all the space.. then at that point we 
> can run ext2online / gfs_grow to resize the filesystem.
> 
> I would be VERY nice if pvresize / pvextend existed, that way 
> one could expand the volume on the SAN, rescan LUNs on each 
> cluster member, run pvresize / pvextend, run lvextend and 
> then gfs_grow and call it a day.
> 
> I mentioned this at the RedHat summit, and got a few puzzled 
> looks, but never got an answer on the matter.
> 
> 
> 
> > Oh certainly...
> >
> > I was just trying to think of why someone would want to use 
> GFS to do
> > this, and the idea of the machines not have pre-fixed 
> partition sizes was
> > about the only thing I could think of.
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > http://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 


From fabbione at fabbione.net  Wed Jul  6 04:49:15 2005
From: fabbione at fabbione.net (Fabio Massimo Di Nitto)
Date: Wed, 06 Jul 2005 06:49:15 +0200
Subject: [Linux-cluster] [PATCH] add syscall missing includes in
	rgmanager/src/clulib/gettid.c
Message-ID: <42CB62CB.408@fabbione.net>


Hi guys,
   patch is pretty self-explanatory, when we did change the way in which we use
syscall, we forgot to add the relevant includes.
Patch is against the STABLE branch, but i am pretty sure it applies all over.
As a consequence it also shuts up a warning a build time.

Please apply.

Cheers
Fabio

-- 
no signature file found.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: add_missing_syscall_includes.dpatch
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050706/e34a8951/attachment.ksh>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 256 bytes
Desc: OpenPGP digital signature
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050706/e34a8951/attachment.sig>

From fabbione at fabbione.net  Wed Jul  6 05:00:19 2005
From: fabbione at fabbione.net (Fabio Massimo Di Nitto)
Date: Wed, 06 Jul 2005 07:00:19 +0200
Subject: [Linux-cluster] [PATCH] add syscall missing includes
	in	rgmanager/src/clulib/gettid.c
In-Reply-To: <42CB62CB.408@fabbione.net>
References: <42CB62CB.408@fabbione.net>
Message-ID: <42CB6563.5050007@fabbione.net>

Fabio Massimo Di Nitto wrote:
> Hi guys,
>    patch is pretty self-explanatory, when we did change the way in which we use
> syscall, we forgot to add the relevant includes.
> Patch is against the STABLE branch, but i am pretty sure it applies all over.
> As a consequence it also shuts up a warning a build time.
> 
> Please apply.
> 
> Cheers
> Fabio

whops.. wrong patch.. sorry.. here is the good one..

Fabio

-- 
no signature file found.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: add_missing_syscall_includes.dpatch
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050706/c8d24e41/attachment.ksh>

From fabbione at fabbione.net  Wed Jul  6 05:25:08 2005
From: fabbione at fabbione.net (Fabio Massimo Di Nitto)
Date: Wed, 06 Jul 2005 07:25:08 +0200
Subject: [Linux-cluster] Suggestion for the STABLE branch
Message-ID: <42CB6B34.6090800@fabbione.net>


Hi guys,
    given the nature of the branch, wouldn't make sense to setup the release info
to something more natural than:

RELEASE_MAJOR = DEVEL
RELEASE_MINOR = DATE

?

Perhaps start traking versions, libs soname and stuff like will give different vendors
the possibility to allign their package names and be less confusing for the end users.

Thanks.
Fabio

-- 
no signature file found.


From yazan at ccs.com.jo  Wed Jul  6 06:37:57 2005
From: yazan at ccs.com.jo (Yazan Al-Sheyyab)
Date: Wed, 6 Jul 2005 08:37:57 +0200
Subject: [Linux-cluster] GFS installation
References: <BAY104-F167E0837063F401106879EC5E60@phx.gbl>
Message-ID: <010401c581f5$3ecc1560$69050364@yazanz>

hi haydar,

  my friend i was having the same case : which is a two ml 370 hp proliant 
nodes and a shared storage msa500 and the two nodes connected to the shared 
by SCSI cable , the problem was with the lock server when implementing the 
GFS.

im using RHEL_ES_V3_U4 and a GFS for V3 U4 also,  but here in this GFS 
release you have to have an odd number of lock server , i mean that when you 
have two server so you have to have a three lock server , but in my case i 
have used only the first node as lock server and then i reached a poor 
redunduncy in the cluster cause i still studying the purchase of a third 
server., i heared that there is a release of GFS which is GFS 6.1 working 
with RHEL_V4 which will work without the need of the locking service  ..... 
some body correct me if that is not right.

but some body told me that i can use the third lock server as logical not 
physical server  , he meant that i dont need another HardWare like a server

but my question now is there any body on our list can explain me the use of 
the third server as logical way , is it as unix and hp-ux  as an area of the 
disk used for locking or what?

 can we manage one of the two cpu on the server as virtual to get a solution 
without another server.

Sorry for the long Email , but we have this last problem.


Regards
-------------------------------------------------

Yazan
---------------------------

----- Original Message ----- 
From: "haydar Ali" <haydar2906 at hotmail.com>
To: <linux-cluster at redhat.com>
Sent: Tuesday, July 05, 2005 6:22 PM
Subject: Re: [Linux-cluster] GFS installation


> Hi Igor,
>
> Thanks for this URL.
> My question is: Have I to use 3 nodes to achieve GFS solution?
> We have 2 servers HP Proliant 380 G3 (RedHat Advanced Server 2.1) attached 
> by 2 fiber channels each to the storage area network SAN HP MSA1000 and we 
> want to install and configure GFS to allow 2 servers to simultaneously 
> read and write to a single shared file system (Word documents located into 
> /u04) located on the Storage area network SAN HP MSA1000.
> I read the example that you have sent to me and I see 3 nodes, 2 client 
> nodes share a directory mounted on the 3d server node, but our solution 
> the directory is located in the SAN.
>
> Have you any explanation or ideas for our request?
> Thanks
>
> Haydar
>
>
>>From: Igor <logastellus at yahoo.com>
>>Reply-To: linux clustering <linux-cluster at redhat.com>
>>To: linux clustering <linux-cluster at redhat.com>
>>Subject: Re: [Linux-cluster] GFS installation
>>Date: Thu, 16 Jun 2005 08:33:16 -0700 (PDT)
>>
>>Look at this URL that David suggested to me:
>>
>>http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/doc/min-gfs.txt?rev=1.3&content-type=text/x-cvsweb-markup&cvsroot=cluster
>>
>>it's pretty good.
>>
>>--- haydar Ali <haydar2906 at hotmail.com> wrote:
>>
>> > Hi,
>> >
>> > I'm looking for an installing and configuring
>> > procedure for GFS (examples).
>> > We have 2 servers HP Proliant 380 G3 (RedHat
>> > Advanced Server 2.1) attached
>> > by fiber optic to the storage area network SAN HP
>> > MSA1000 and we want to
>> > install and configure GFS to allow 2 servers to
>> > simultaneously read and
>> > write to a single shared file system (Word documents
>> > located into /u04) on
>> > the SAN.
>> >
>> > Thanks.
>> >
>> > Haydar
>> >
>> >
>> > --
>> > Linux-cluster mailing list
>> > Linux-cluster at redhat.com
>> > http://www.redhat.com/mailman/listinfo/linux-cluster
>> >
>>
>>
>>
>>
>>__________________________________
>>Discover Yahoo!
>>Stay in touch with email, IM, photo sharing and more. Check it out!
>>http://discover.yahoo.com/stayintouch.html
>>
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>http://www.redhat.com/mailman/listinfo/linux-cluster
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster 


From JACOB_LIBERMAN at Dell.com  Wed Jul  6 06:06:00 2005
From: JACOB_LIBERMAN at Dell.com (JACOB_LIBERMAN at Dell.com)
Date: Wed, 6 Jul 2005 01:06:00 -0500
Subject: [Linux-cluster] GFS installation
Message-ID: <BC430F453501174992B9D9E8AFB7519A08556B@ausx3mps309.aus.amer.dell.com>

Yazan,

You can implement RLM (Redundant Lock Manager) on RHEL3 with 3 servers.
All of the lock servers do NOT need access to the shared storage,
although they can have access to the shared storage.  The file locking
in all handled over the heartbeat LAN. I believe that at least 1 of the
lockservers needs access to the shared storage, but I have not tried any
less than X -1 if X is the total number of lock servers.

The only important rule of thumb to keep in mind is that when the
lockserver has access to the shared storage, it must be fenced from BOTH
the disk and the network. If the lockserver only has access to the
network, it should be fenced from the network only. In both cases, a
network power switch is the preferred fencing method. 

If a non-lockserver GFS server (ie -- node) is accessing the shared
storage, it can be fenced from the storage via a fiber switch.

The method for implementing a non-disk attached RLM lockserver is to
make a copy of the current CCS information on a disk-attached
lockserver, copy it to the non-disk attached lockserver, and then make a
CCS file archive on the non-disk attached lockserver with ccs_tool.

Incidentally, it is far easier to set this up if you make one of the
disk-attached lockservers the master lockserver. 

Once you have copied the cluster config to a directory on the non-disk
attached lockserver, edit the /etc/sysconfig/ccsd CCS_ARCHIVE parameter
so that it points to the local directory that contains the config files.

Then, wafter you have copied the files to the non-disk attached
lockserver, all you need to do to generate the files is run cc_tool -v
create /etc/gfs/data0 /etc/gfs/data0.cca. (of course, your pathnames
will vary.)

Next, start ccsd on the non-disk lockserver with ccsd -f
/path/to/file.cca and then start lock_gulmd.

You can verify the new lockserver in /var/log/messages on the master
lock server. Then test fencing and you are done!

I hope this helps.

Cheers, jacob

 
> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Yazan 
> Al-Sheyyab
> Sent: Wednesday, July 06, 2005 1:38 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] GFS installation
> 
> hi haydar,
> 
>   my friend i was having the same case : which is a two ml 
> 370 hp proliant nodes and a shared storage msa500 and the two 
> nodes connected to the shared by SCSI cable , the problem was 
> with the lock server when implementing the GFS.
> 
> im using RHEL_ES_V3_U4 and a GFS for V3 U4 also,  but here in 
> this GFS release you have to have an odd number of lock 
> server , i mean that when you have two server so you have to 
> have a three lock server , but in my case i have used only 
> the first node as lock server and then i reached a poor 
> redunduncy in the cluster cause i still studying the purchase 
> of a third server., i heared that there is a release of GFS 
> which is GFS 6.1 working with RHEL_V4 which will work without 
> the need of the locking service  ..... 
> some body correct me if that is not right.
> 
> but some body told me that i can use the third lock server as 
> logical not physical server  , he meant that i dont need 
> another HardWare like a server
> 
> but my question now is there any body on our list can explain 
> me the use of the third server as logical way , is it as unix 
> and hp-ux  as an area of the disk used for locking or what?
> 
>  can we manage one of the two cpu on the server as virtual to 
> get a solution without another server.
> 
> Sorry for the long Email , but we have this last problem.
> 
> 
> 
> Regards
> -------------------------------------------------
> 
> Yazan
> ---------------------------
> 
> ----- Original Message -----
> From: "haydar Ali" <haydar2906 at hotmail.com>
> To: <linux-cluster at redhat.com>
> Sent: Tuesday, July 05, 2005 6:22 PM
> Subject: Re: [Linux-cluster] GFS installation
> 
> 
> > Hi Igor,
> >
> > Thanks for this URL.
> > My question is: Have I to use 3 nodes to achieve GFS solution?
> > We have 2 servers HP Proliant 380 G3 (RedHat Advanced 
> Server 2.1) attached 
> > by 2 fiber channels each to the storage area network SAN HP 
> MSA1000 and we 
> > want to install and configure GFS to allow 2 servers to 
> simultaneously 
> > read and write to a single shared file system (Word 
> documents located into 
> > /u04) located on the Storage area network SAN HP MSA1000.
> > I read the example that you have sent to me and I see 3 
> nodes, 2 client 
> > nodes share a directory mounted on the 3d server node, but 
> our solution 
> > the directory is located in the SAN.
> >
> > Have you any explanation or ideas for our request?
> > Thanks
> >
> > Haydar
> >
> >
> >>From: Igor <logastellus at yahoo.com>
> >>Reply-To: linux clustering <linux-cluster at redhat.com>
> >>To: linux clustering <linux-cluster at redhat.com>
> >>Subject: Re: [Linux-cluster] GFS installation
> >>Date: Thu, 16 Jun 2005 08:33:16 -0700 (PDT)
> >>
> >>Look at this URL that David suggested to me:
> >>
> >>http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/doc/min
-gfs.txt?rev=1.3&content-type=text/x-cvsweb-markup&cvsroot=cluster
>>
>>it's pretty good.
>>
>>--- haydar Ali <haydar2906 at hotmail.com> wrote:
>>
>> > Hi,
>> >
>> > I'm looking for an installing and configuring
>> > procedure for GFS (examples).
>> > We have 2 servers HP Proliant 380 G3 (RedHat
>> > Advanced Server 2.1) attached
>> > by fiber optic to the storage area network SAN HP
>> > MSA1000 and we want to
>> > install and configure GFS to allow 2 servers to
>> > simultaneously read and
>> > write to a single shared file system (Word documents
>> > located into /u04) on
>> > the SAN.
>> >
>> > Thanks.
>> >
>> > Haydar
>> >
>> >
>> > --
>> > Linux-cluster mailing list
>> > Linux-cluster at redhat.com
>> > http://www.redhat.com/mailman/listinfo/linux-cluster
>> >
>>
>>
>>
>>
>>__________________________________
>>Discover Yahoo!
>>Stay in touch with email, IM, photo sharing and more. Check it out!
>>http://discover.yahoo.com/stayintouch.html
>>
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>http://www.redhat.com/mailman/listinfo/linux-cluster
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster 

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster


From schlegel at riege.com  Wed Jul  6 06:30:51 2005
From: schlegel at riege.com (Gunther Schlegel)
Date: Wed, 06 Jul 2005 08:30:51 +0200
Subject: [Linux-cluster] RHEL3 Cluster network hangup
Message-ID: <42CB7A9B.30304@riege.com>

Hi,

I am running RHEL3 ES with the RedHat Cluster Suite (not GFS, simply 
failover cluster).

The clustered application does a lot of printing (lprng), 
faxing(hylafax) and mailing(sendmail). It uses shell scripts to pass the 
jobs to the operating systems daemons.

The client programs of these daemons, which pass jobs to the daemons 
using network connections to localhost start to behave irregular when 
the cluster is up for about 2 weeks.

Examples:
- hylafax faxstat stops listing the transmitted faxes in the middle of 
the list ( but always at the same job )
- sendmail opens a connection to the local daemon but does not transfer 
the message. Both processes sit there and wait, after some time the 
server closes the connection because of missing input from the clients side.
- same with lpr.

I assume that something locks up in the ip stack. Not all services are 
affected at the same time.

I guess this is related to the cluster software as we run that 
application on a lot of servers which all do not show this behaviour and 
that are all not clustered.

Any hints?

regards, Gunther
-------------- next part --------------
A non-text attachment was scrubbed...
Name: schlegel.vcf
Type: text/x-vcard
Size: 345 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050706/5cf32d5d/attachment.vcf>

From pcaulfie at redhat.com  Wed Jul  6 07:08:39 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Wed, 06 Jul 2005 08:08:39 +0100
Subject: [Linux-cluster] Network problems
In-Reply-To: <42CB0B38.7060000@possibilityforge.com>
References: <42CB0B38.7060000@possibilityforge.com>
Message-ID: <42CB8377.2090504@redhat.com>

Jon Scottorn wrote:
> Hi,
> 
>    How does a node rejoin the cluster if network connectivity is lost?
> 
> I have 3 nodes that are in the cluster.  I have unplugged the network
> connection from one of the nodes, for testing purposes, the cluster sees
> the node disappear, when I plug the network connection back in it does
> not rejoin the cluster.  I looks as though it forms it's own cluster.  ie...
> 
> cat /proc/cluster/nodes shows two of the nodes with the third missing.
> cat /proc/cluster/nodes on the system that lost network connectivity
> shows itself in the clusterand the other two nodes missing and the
> quorum is locked as well.
> 
> Can this issue be prevented when network connectivity is lost or when
> that happens do I just have to reboot the system?

You'll need to reboot the system - or at least restart all the cluster services
which probably amnounts to the same thing.

When a cluster is partitioned, the partitions (obviously) don't know what has
been going on in the other partition, so it can't just rejoin the cluster
because the whole lockspace/filesystem state may have completely changed.

When network connectivity is re-established the quorate partition (if there is
one) should fence the inquorate one. If it is an even partition where neither
has quorum then they will just stare sullenly at each other for ever, unfortunately.

-- 

patrick


From Birger.Wathne at ift.uib.no  Wed Jul  6 10:00:12 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Wed, 06 Jul 2005 12:00:12 +0200
Subject: [Linux-cluster] NFS exports disappeared
In-Reply-To: <42C99901.8090805@uib.no>
References: <42C99901.8090805@uib.no>
Message-ID: <42CBABAC.4000909@uib.no>

I'm still not certain how I lost the nfs export list in the first place, but 
I think I see why they never got 'fixed'.

I added some logging inside /usr/share/cluster/nfsclient.sh, and it seems 
like status is only checked for a few of my exports.

Is this a bug? Shouldn't all nfs exports get checked regularly?

-- 
birger


From Birger.Wathne at ift.uib.no  Wed Jul  6 10:12:22 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Wed, 06 Jul 2005 12:12:22 +0200
Subject: [Linux-cluster] Problem rebooting cluster node - CMANsendmsg
	failed: -101
Message-ID: <42CBAE86.3090101@uib.no>

When I try to reboot or shut down my cluster node (a 2 node cluster with 
only one node present. Still waiting for delivery of the second node) it 
stops answering to ping, and then seems to loop forever with the message 
CMANsendmsg failed: -101.

Perhaps it times out in the end, but I have ended up power cycling it to get 
up again.

-- 
birger


From Birger.Wathne at ift.uib.no  Wed Jul  6 10:16:34 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Wed, 06 Jul 2005 12:16:34 +0200
Subject: [Linux-cluster] NFS configuration question
Message-ID: <42CBAF82.8050900@uib.no>

When setting up a configuration for NFS server failover, can I ensure the IP 
address comes up after exports are ready, and more importantly that the IP 
address goes down before unexporting by placeing the <ip ...> entry at the 
end of the service declaration?

When taking down the nfs service I want the server to 'disappear' before it 
starts unexporting, as i want clients to hang waiting for the service to 
come back up instead of getting io errors.

I am unable to test this, since I had to go into production a bit too quickly.

-- 
birger


From pcaulfie at redhat.com  Wed Jul  6 10:36:08 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Wed, 06 Jul 2005 11:36:08 +0100
Subject: [Linux-cluster] Problem rebooting cluster node - CMANsendmsg
	failed: -101
In-Reply-To: <42CBAE86.3090101@uib.no>
References: <42CBAE86.3090101@uib.no>
Message-ID: <42CBB418.7020303@redhat.com>

Birger Wathne wrote:
> When I try to reboot or shut down my cluster node (a 2 node cluster with
> only one node present. Still waiting for delivery of the second node) it
> stops answering to ping, and then seems to loop forever with the message
> CMANsendmsg failed: -101.
> 

I have seen this a small number of times - though only on test machines that
don't invoke the cman shutdown script.

-- 

patrick


From Birger.Wathne at ift.uib.no  Wed Jul  6 11:29:53 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Wed, 06 Jul 2005 13:29:53 +0200
Subject: [Linux-cluster] Problem rebooting cluster node - CMANsendmsg
	failed: -101
In-Reply-To: <42CBB418.7020303@redhat.com>
References: <42CBAE86.3090101@uib.no> <42CBB418.7020303@redhat.com>
Message-ID: <42CBC0B1.7040309@uib.no>

Patrick Caulfield wrote:

> I have seen this a small number of times - though only on test machines that
> don't invoke the cman shutdown script.
> 

I have installed from source using the FC4 branch. It seems like a make 
install installs startup and shutdown links for clvmd_init and ccsd. for 
cman, rgmanager and fenced it only installs links for startup.

I'll fix my installation. Thanks for the tip. Is this a bug in the Makefile?

-- 
birger


From rkenna at redhat.com  Wed Jul  6 13:43:54 2005
From: rkenna at redhat.com (Rob Kenna)
Date: Wed, 06 Jul 2005 09:43:54 -0400
Subject: [Linux-cluster] RedHat Cluster suite installation
In-Reply-To: <42CB4137.30204@telkom.net.id>
References: <BAY104-F404DC1317CBD4C6AEAF2CDC5E60@phx.gbl>
	<42CB4137.30204@telkom.net.id>
Message-ID: <1120657434.4311.7.camel@rkenna-laptop>

GFS 6.1 for RHEL 4 shipped two weeks ago:

http://www.redhat.com/software/rha/gfs/ (General info page)
http://www.redhat.com/docs/manuals/csgfs/  (Documentation Pages)

GFS 6.1 can indeed run in 2 node configurations.  It includes the DLM
(Distributed Lock Manager) which "spreads" the locking across all GFS
nodes w/o separate lock servers. Though DLM is now the default, GULM
remains as an alternative lock manager.

- Rob

On Wed, 2005-07-06 at 09:25 +0700, Fajar A. Nugraha wrote:

> haydar Ali wrote:
> 
> > Hi Yazan,
> >
> > Yes I know that I have to use GFS to reach the shared from the two 
> > nodes at the same time but the RedHat technical support said me that I 
> > need 3 nodes (3 servers) connected to my SAN and we have only 2 nodes 
> > connected by 2 Fiber channel each.
> > My question : can I use only 2 nodes in GFS solution??
> >
> You can with RHEL4 and http://sources.redhat.com/cluster/ . I believe 
> the official (supported) version was released recently but I can't find 
> the PR stuff yet.
> 
> Regards,
> 
> Fajar
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster

-- 
Robert Kenna / Red Hat
Sr Product Mgr - Storage
o: (978) 392-2410 (x22410)
c: (978) 771-6314
rkenna at redhat.com
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050706/61d94786/attachment.htm>

From jason at selu.edu  Wed Jul  6 14:17:32 2005
From: jason at selu.edu (Jason Lanclos)
Date: Wed, 6 Jul 2005 09:17:32 -0500
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <42CB5838.4060802@telkom.net.id>
References: <Pine.LNX.4.61.0506081351350.1493@tungsten.technicality.org>
	<200507052252.09456.jason@selu.edu>
	<42CB5838.4060802@telkom.net.id>
Message-ID: <200507060917.32412.Jason@selu.edu>

On Tuesday 05 July 2005 11:04 pm, Fajar A. Nugraha wrote:
> Jason Lanclos wrote:
> 
> >>>You shouldn't even need CLVM if you don't intend to muddle with
> >>>partitions or cross-mount the file systems.  You'll lose resizing, but
> >>>in doing so, your clients no longer need to be cluster participants.
> >>>      
> >>>
> >
> >CLVM is cool, but its pretty much useless until LVM2 actually implements 
> >pvresize or a pvextend. 
> >
> What's wrong with vgextend?

Nothings wrong with vgextend.. but its useless when i want to extend a pv.


> 
> >One of the main advantages of having a SAN is being 
> >able to add space to a volume (LUN)   Currently when we expand a volume on 
> >the san, we have to unmount the filesystem, rescan the LUNs, then run fdisk 
> >on that volume, delete the partition entry, and recreate it to use all the 
> >space.. then at that point we can run ext2online / gfs_grow to resize the 
> >filesystem.
> >
> >  
> >
> A simpler method will be create a NEW LUN, scan it, pvcreate, add it to 
> existing VG with vgextend, and extend your volume lvextend. At least you 
> don't have to mess with existing partitions.

Ok.. what happens when you reach the max LUNs or max partitions???
And I can only imagine the administration nightmare it would be to manage with several
volumes spread over multiple LUNs. The whole point of a SAN is to simplify management of data storage.
And what about copy/swap? Spreading a volume over multiple LUNs would make using this feature of the SAN impossible / very
dangerous to do while volumes are mounted.
So.. One Volume, One LUN, One partition is the easiest way.
There are prolly a thousand different ways to "make it work",
but I wouldn't consider that part of the whole "Red Hat Enterprise Linux" concept. 

I know I'm not the 1st person to run into this issue.. 
so rather than having numerous work arounds, why not just implement pvresize/pvextend to do this?

  https://www.redhat.com/archives/linux-lvm/2003-July/msg00038.html

Oh.. and there's already a man page for it.. even though its not implemented in LVM2:
  http://mandoc.etopian.net/man/linux/8/pvresize

And a work around that I'd be reluctant using on a production system:
  http://www.redhat.com/archives/linux-lvm/2004-December/msg00049.html


> 
> >I would be VERY nice if pvresize / pvextend existed, that way one could expand 
> >the volume on the SAN, 
> >
> My IBM SAN can't expand exisiting volume on the SAN. It's not like I 
> need it though. The above solution works better.

XioTech Magnitude expands volumes with no problem.. Its one of the reasons we chose it for our
SAN.

> 
> >rescan LUNs on each cluster member, run pvresize / 
> >pvextend, run lvextend and then gfs_grow and call it a day.
> >
> >  
> >
> My only problem (for now) is I can't rescan LUNs without removing HBA 
> modules (thus unmounting filesystems or restarting the server).

I'm not sure if it will work with your FC card, but I've used the following script
a few times with a qlogic card and it worked.
  http://www.fifi.org/cgi-bin/man2html/usr/share/man/man8/rescan-scsi-bus.sh.8.gz


> If you  
> can rescan LUNs online, it's a simple method of pvcreate, vgextend, 
> lvextend, gfs_grow, and call it a day.
> 
> >I mentioned this at the RedHat summit, and got a few puzzled looks, 
> >
> I wonder why :-D

> 
> Regards,
> 
> Fajar
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 

-- 
Jason Lanclos                                        
Systems Administrator                                 
Red Hat Certified Engineer        
Southeastern Louisiana University		     


From amanthei at redhat.com  Wed Jul  6 16:04:33 2005
From: amanthei at redhat.com (Adam Manthei)
Date: Wed, 6 Jul 2005 11:04:33 -0500
Subject: [Linux-cluster] Problem rebooting cluster node - CMANsendmsg
	failed: -101
In-Reply-To: <42CBC0B1.7040309@uib.no>
References: <42CBAE86.3090101@uib.no> <42CBB418.7020303@redhat.com>
	<42CBC0B1.7040309@uib.no>
Message-ID: <20050706160433.GC15320@redhat.com>

On Wed, Jul 06, 2005 at 01:29:53PM +0200, Birger Wathne wrote:
> Patrick Caulfield wrote:
> 
> >I have seen this a small number of times - though only on test machines 
> >that
> >don't invoke the cman shutdown script.
> >
> 
> I have installed from source using the FC4 branch. It seems like a make 
> install installs startup and shutdown links for clvmd_init and ccsd. for 
> cman, rgmanager and fenced it only installs links for startup.
> 
> I'll fix my installation. Thanks for the tip. Is this a bug in the Makefile?

This is not a bug.  The Makefiles are responsible for installing the init
scripts and binaries onto your system.  They are not responsible for
configuring your system to start and stop services; that is the
responsibility of the rpm.

-- 
Adam Manthei  <amanthei at redhat.com>


From jason at selu.edu  Wed Jul  6 18:10:42 2005
From: jason at selu.edu (Jason Lanclos)
Date: Wed, 6 Jul 2005 13:10:42 -0500
Subject: [Linux-cluster] Problem rebooting cluster node - CMANsendmsg
	failed: -101
In-Reply-To: <20050706160433.GC15320@redhat.com>
References: <42CBAE86.3090101@uib.no> <42CBC0B1.7040309@uib.no>
	<20050706160433.GC15320@redhat.com>
Message-ID: <200507061310.42642.Jason@selu.edu>

On Wednesday 06 July 2005 11:04 am, Adam Manthei wrote:
> On Wed, Jul 06, 2005 at 01:29:53PM +0200, Birger Wathne wrote:
> > Patrick Caulfield wrote:
> > 
> > >I have seen this a small number of times - though only on test machines 
> > >that
> > >don't invoke the cman shutdown script.
> > >
> > 
> > I have installed from source using the FC4 branch. It seems like a make 
> > install installs startup and shutdown links for clvmd_init and ccsd. for 
> > cman, rgmanager and fenced it only installs links for startup.
>
> > I'll fix my installation. Thanks for the tip. Is this a bug in the Makefile?
> 
> This is not a bug.  The Makefiles are responsible for installing the init
> scripts and binaries onto your system.  They are not responsible for
> configuring your system to start and stop services; that is the
> responsibility of the rpm.
> 

I've run into this also.. For some reason any scripts that don't have the word "daemon" in them,
don't run when shutting the system down. And sometimes the install didn't setup the correct links in rc6.d.
To resolve this I just used the following script to put the word daemon in each script, and remove, and recreate the rc links.
  
  for i in ccsd cman fenced gfs rgmanager
	do
	   echo "# daemon" >> /etc/init.d/$i
	   chkconfig --del $i
	   chkconfig --add $i
   done


--
Jason Lanclos                                        
Systems Administrator                                 
Red Hat Certified Engineer        
Southeastern Louisiana University		     


From alewis at redhat.com  Wed Jul  6 19:01:28 2005
From: alewis at redhat.com (AJ Lewis)
Date: Wed, 6 Jul 2005 14:01:28 -0500
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <200507060917.32412.Jason@selu.edu>
References: <Pine.LNX.4.61.0506081351350.1493@tungsten.technicality.org>
	<200507052252.09456.jason@selu.edu>
	<42CB5838.4060802@telkom.net.id>
	<200507060917.32412.Jason@selu.edu>
Message-ID: <20050706190128.GB15005@null.msp.redhat.com>

On Wed, Jul 06, 2005 at 09:17:32AM -0500, Jason Lanclos wrote:
> > A simpler method will be create a NEW LUN, scan it, pvcreate, add it to 
> > existing VG with vgextend, and extend your volume lvextend. At least you 
> > don't have to mess with existing partitions.
> 
> Ok.. what happens when you reach the max LUNs or max partitions???
> And I can only imagine the administration nightmare it would be to manage with several
> volumes spread over multiple LUNs. The whole point of a SAN is to simplify management of data storage.
> And what about copy/swap? Spreading a volume over multiple LUNs would make using this feature of the SAN impossible / very
> dangerous to do while volumes are mounted.
> So.. One Volume, One LUN, One partition is the easiest way.
> There are prolly a thousand different ways to "make it work",
> but I wouldn't consider that part of the whole "Red Hat Enterprise Linux" concept. 

Just out of curiousity - if you're using your array to do volume management
anyway - why use lvm at all?

-- 
AJ Lewis                                   Voice:  612-638-0500
Red Hat                                    E-Mail: alewis at redhat.com
One Main Street SE, Suite 209
Minneapolis, MN 55414
   
Current GPG fingerprint = D9F8 EDCE 4242 855F A03D  9B63 F50C 54A8 578C 8715
Grab the key at: http://people.redhat.com/alewis/gpg.html or one of the
many keyservers out there...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050706/a07c6e68/attachment.sig>

From jason at selu.edu  Wed Jul  6 20:08:19 2005
From: jason at selu.edu (Jason Lanclos)
Date: Wed, 6 Jul 2005 15:08:19 -0500
Subject: [Linux-cluster] A few GFS newbie questions: journals, etc
In-Reply-To: <20050706190128.GB15005@null.msp.redhat.com>
References: <Pine.LNX.4.61.0506081351350.1493@tungsten.technicality.org>
	<200507060917.32412.Jason@selu.edu>
	<20050706190128.GB15005@null.msp.redhat.com>
Message-ID: <200507061508.20046.Jason@selu.edu>

On Wednesday 06 July 2005 02:01 pm, AJ Lewis wrote:
> Just out of curiousity - if you're using your array to do volume management
> anyway - why use lvm at all?

Currently, the linux cluster connected to the SAN is RedHat AS 2.1, ext3 is the filesystem.
Each volume is on its own LUN for failover reasons.

However once we upgrade to RHEL4 and start using GFS, we would like to consolidate some of the
smaller filesystems to one LUN and use CLVM, but at the same time avoid having a volume group spread over multiple LUNS.

right now, the 2 servers we have are identical.. 
but as we add new servers to the cluster, the device names might be different.
using CLVM would allow usage of /dev/<vg>/<lv> mappings instead of /dev/sd?? 

The only reason we wouldn't use CLVM is because it doesn't support extending PV size.


-- 
Jason Lanclos                                        
Systems Administrator                                 
Red Hat Certified Engineer        
Southeastern Louisiana University		     


From natecars at natecarlson.com  Wed Jul  6 20:08:38 2005
From: natecars at natecarlson.com (Nate Carlson)
Date: Wed, 6 Jul 2005 15:08:38 -0500 (CDT)
Subject: [Linux-cluster] cluster.conf changes without restarting cluster?
Message-ID: <Pine.LNX.4.61.0507061506250.23194@tungsten.technicality.org>

Hey all,

I'm adding a new node to my play cluster, and made the config changes with 
ccs_tool (added a few new fence devices, removed a node, added another, 
etc.)

On the node I made the changes on, I get this in syslog:

Jul  6 15:03:48 xen1 ccsd[2052]: Update of cluster.conf complete (version 15 -> 16).
Jul  6 15:04:12 xen1 ccsd[2052]: Update of cluster.conf complete (version 16 -> 17).
Jul  6 15:04:21 xen1 ccsd[2052]: Update of cluster.conf complete (version 17 -> 18).
Jul  6 15:04:26 xen1 ccsd[2052]: Update of cluster.conf complete (version 18 -> 19).
Jul  6 15:05:01 xen1 ccsd[2052]: Update of cluster.conf complete (version 19 -> 20).
Jul  6 15:05:22 xen1 ccsd[2052]: Update of cluster.conf complete (version 20 -> 21).

The rest of the nodes also have the updated config file. However, 
/proc/cluster/status on all boxes still says:

Config version: 15

Is there any way to get ccs to load the new config file without restarting 
everything?

I'm running STABLE as of a few days ago.

------------------------------------------------------------------------
| nate carlson | natecars at natecarlson.com | http://www.natecarlson.com |
|       depriving some poor village of its idiot since 1981            |
------------------------------------------------------------------------


From phung at cs.columbia.edu  Wed Jul  6 20:11:06 2005
From: phung at cs.columbia.edu (Dan B. Phung)
Date: Wed, 6 Jul 2005 16:11:06 -0400 (EDT)
Subject: [Linux-cluster] cluster.conf changes without restarting cluster?
In-Reply-To: <Pine.LNX.4.61.0507061506250.23194@tungsten.technicality.org>
Message-ID: <Pine.LNX.4.44.0507061610370.21204-100000@algiers.clic.cs.columbia.edu>

first run
   ccs_tool update /etc/clutser/cluster.conf
then
   cman_tool version -r <newversion>


On 6, Jul, 2005, Nate Carlson declared:

> Hey all,
> 
> I'm adding a new node to my play cluster, and made the config changes with 
> ccs_tool (added a few new fence devices, removed a node, added another, 
> etc.)
> 
> On the node I made the changes on, I get this in syslog:
> 
> Jul  6 15:03:48 xen1 ccsd[2052]: Update of cluster.conf complete (version 15 -> 16).
> Jul  6 15:04:12 xen1 ccsd[2052]: Update of cluster.conf complete (version 16 -> 17).
> Jul  6 15:04:21 xen1 ccsd[2052]: Update of cluster.conf complete (version 17 -> 18).
> Jul  6 15:04:26 xen1 ccsd[2052]: Update of cluster.conf complete (version 18 -> 19).
> Jul  6 15:05:01 xen1 ccsd[2052]: Update of cluster.conf complete (version 19 -> 20).
> Jul  6 15:05:22 xen1 ccsd[2052]: Update of cluster.conf complete (version 20 -> 21).
> 
> The rest of the nodes also have the updated config file. However, 
> /proc/cluster/status on all boxes still says:
> 
> Config version: 15
> 
> Is there any way to get ccs to load the new config file without restarting 
> everything?
> 
> I'm running STABLE as of a few days ago.
> 
> ------------------------------------------------------------------------
> | nate carlson | natecars at natecarlson.com | http://www.natecarlson.com |
> |       depriving some poor village of its idiot since 1981            |
> ------------------------------------------------------------------------
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 

-- 
email:  phung at cs.columbia.edu
www:    http://www.cs.columbia.edu/~phung
phone:  646-775-6090
office: CS Dept. 520, 1214 Amsterdam Ave., MC 0401, New York, NY 10027


From natecars at natecarlson.com  Wed Jul  6 20:15:02 2005
From: natecars at natecarlson.com (Nate Carlson)
Date: Wed, 6 Jul 2005 15:15:02 -0500 (CDT)
Subject: [Linux-cluster] cluster.conf changes without restarting cluster?
In-Reply-To: <Pine.LNX.4.44.0507061610370.21204-100000@algiers.clic.cs.columbia.edu>
References: <Pine.LNX.4.44.0507061610370.21204-100000@algiers.clic.cs.columbia.edu>
Message-ID: <Pine.LNX.4.61.0507061514550.23194@tungsten.technicality.org>

On Wed, 6 Jul 2005, Dan B. Phung wrote:
> first run
>   ccs_tool update /etc/clutser/cluster.conf
> then
>   cman_tool version -r <newversion>

Ah, very good - thanks!

------------------------------------------------------------------------
| nate carlson | natecars at natecarlson.com | http://www.natecarlson.com |
|       depriving some poor village of its idiot since 1981            |
------------------------------------------------------------------------


From eric at bootseg.com  Wed Jul  6 21:29:12 2005
From: eric at bootseg.com (Eric Kerin)
Date: Wed, 06 Jul 2005 17:29:12 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42CBAF82.8050900@uib.no>
References: <42CBAF82.8050900@uib.no>
Message-ID: <1120685352.5704.9.camel@auh5-0478>

On Wed, 2005-07-06 at 12:16 +0200, Birger Wathne wrote:
> When taking down the nfs service I want the server to 'disappear' before it 
> starts unexporting, as i want clients to hang waiting for the service to 
> come back up instead of getting io errors.

It should already work this way.  Look in /usr/share/cluster/service.sh,
there is a block of XML data that defines the service resource agent.
Twords the end of the block of XML is a "special" tag this defines the
child node types for that agent.  You'll notice each of the child nodes
has a start and stop number.  These define the order that the given
children are started and stopped  You'll see filesystems are started at
level 2, and ip addresses are started at 3.   Since a nfs export is
defined as a child of a fs agent, the nfs exports are turned on after
mounting the filesystem, and before the IP address is active.

Then they are stopped using the order defined by the stop attribute.  So
IP address is turned off, then filesystems are stopped, (nfs exports are
turned off, and then the filesystem is unmounted.)

Hope this helps.
-- 
Eric Kerin <eric at bootseg.com>


From Birger.Wathne at ift.uib.no  Wed Jul  6 23:44:07 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Thu, 07 Jul 2005 01:44:07 +0200
Subject: [Linux-cluster] Problem rebooting cluster node - CMANsendmsg
	failed: -101
In-Reply-To: <200507061310.42642.Jason@selu.edu>
References: <42CBAE86.3090101@uib.no>
	<42CBC0B1.7040309@uib.no>	<20050706160433.GC15320@redhat.com>
	<200507061310.42642.Jason@selu.edu>
Message-ID: <42CC6CC7.9040904@uib.no>

Jason Lanclos wrote:


> I've run into this also.. For some reason any scripts that don't have the word "daemon" in them,
> don't run when shutting the system down. And sometimes the install didn't setup the correct links in rc6.d.
> To resolve this I just used the following script to put the word daemon in each script, and remove, and recreate the rc links.
>   
>   for i in ccsd cman fenced gfs rgmanager
> 	do
> 	   echo "# daemon" >> /etc/init.d/$i
> 	   chkconfig --del $i
> 	   chkconfig --add $i
>    done
> 

I know I didn't install the startup links, as I would have used chkconfig to 
do it. I used chkconfig to add the links, and it did the job as expected.

I first installed the rpm's when fc4 was released, but they had too many old 
bugs. I uninstalled the rpm's and installed from cvs (FC4 branch). Since 
then the rpm's have been updated, so I will probably convert to those to 
ease management when I have the other cluster node available.

Anyway. Problem solved, and just to be safe I have also added the daemon 
lines in the scripts.

-- 
birger


From Birger.Wathne at ift.uib.no  Wed Jul  6 23:53:07 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Thu, 07 Jul 2005 01:53:07 +0200
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1120685352.5704.9.camel@auh5-0478>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
Message-ID: <42CC6EE3.6020005@uib.no>

Eric Kerin wrote:

> On Wed, 2005-07-06 at 12:16 +0200, Birger Wathne wrote:
> 
>>When taking down the nfs service I want the server to 'disappear' before it 
>>starts unexporting, as i want clients to hang waiting for the service to 
>>come back up instead of getting io errors.
> 
> 
> It should already work this way.  Look in /usr/share/cluster/service.sh,
> there is a block of XML data that defines the service resource agent.
.
.
.
Thanks. That is reassuring. I'll have to read all of those files so I really 
know what goes on. :-)

-- 
birger


From liangs at cse.ohio-state.edu  Thu Jul  7 01:53:25 2005
From: liangs at cse.ohio-state.edu (Shuang Liang)
Date: Wed, 06 Jul 2005 21:53:25 -0400
Subject: [Linux-cluster] installing GFS
In-Reply-To: <mailman.0.1120700026.12535.linux-cluster@redhat.com>
References: <mailman.0.1120700026.12535.linux-cluster@redhat.com>
Message-ID: <42CC8B15.7040705@cse.ohio-state.edu>

Hi,
   I am trying to install GFS on 3 cluster nodes with  redhat 
linux-2.6.9-5EL using gnbd. I met with some problems the mount, which says
"# mount -t gfs /dev/gnbd0 /mnt/gfs/                        
mount: Connection refused"

I have tried the installation with different order and configuration all 
end up with this error, hope sb can help solve this puzzle. Here are 
some step I have taken:
1.Insert gfs-lock and gfs kernel modules at the GFS node.
   Insert gnbd module at GFS node
2. run gnbd_serv -n at GNBD server;
    ./gnbd_export -d /dev/hda8 -e mygnbd -c
3. ./gnbd_import -i k35-as4 -n at GFS node
4.  Now I want to create logic volume at GFS node: it failed saying
   [root at k32-as4 sbin]# pvcreate /dev/gnbd0
   Device /dev/gnbd0 not found.

So I choose to do the logic volume stuff at the GNBD server side and 
gfs_mkfs at the GFS side, it succeeded. But it fails again when mounting.
Then I tried to mkfs at GNBD server and mount directly through gnbd 
device, same error again.

So I am wondering what should be the correct configuration trick.

Thanks for reading this far!


From haydar2906 at hotmail.com  Thu Jul  7 02:44:47 2005
From: haydar2906 at hotmail.com (haydar Ali)
Date: Wed, 06 Jul 2005 22:44:47 -0400
Subject: [Linux-cluster] GFS installation
In-Reply-To: <010401c581f5$3ecc1560$69050364@yazanz>
Message-ID: <BAY104-F2692AA181A4B77D121C04CC5D80@phx.gbl>

Hi Yazan,

But if the 3d node (lock server) will be down. will GFS offer the shared 
file system??

Thanks

Haydar

>From: "Yazan Al-Sheyyab" <yazan at ccs.com.jo>
>Reply-To: linux clustering <linux-cluster at redhat.com>
>To: "linux clustering" <linux-cluster at redhat.com>
>Subject: Re: [Linux-cluster] GFS installation
>Date: Wed, 6 Jul 2005 08:37:57 +0200
>
>hi haydar,
>
>  my friend i was having the same case : which is a two ml 370 hp proliant 
>nodes and a shared storage msa500 and the two nodes connected to the shared 
>by SCSI cable , the problem was with the lock server when implementing the 
>GFS.
>
>im using RHEL_ES_V3_U4 and a GFS for V3 U4 also,  but here in this GFS 
>release you have to have an odd number of lock server , i mean that when 
>you have two server so you have to have a three lock server , but in my 
>case i have used only the first node as lock server and then i reached a 
>poor redunduncy in the cluster cause i still studying the purchase of a 
>third server., i heared that there is a release of GFS which is GFS 6.1 
>working with RHEL_V4 which will work without the need of the locking 
>service  ..... some body correct me if that is not right.
>
>but some body told me that i can use the third lock server as logical not 
>physical server  , he meant that i dont need another HardWare like a server
>
>but my question now is there any body on our list can explain me the use of 
>the third server as logical way , is it as unix and hp-ux  as an area of 
>the disk used for locking or what?
>
>can we manage one of the two cpu on the server as virtual to get a solution 
>without another server.
>
>Sorry for the long Email , but we have this last problem.
>
>
>
>Regards
>-------------------------------------------------
>
>Yazan
>---------------------------
>
>----- Original Message ----- From: "haydar Ali" <haydar2906 at hotmail.com>
>To: <linux-cluster at redhat.com>
>Sent: Tuesday, July 05, 2005 6:22 PM
>Subject: Re: [Linux-cluster] GFS installation
>
>
>>Hi Igor,
>>
>>Thanks for this URL.
>>My question is: Have I to use 3 nodes to achieve GFS solution?
>>We have 2 servers HP Proliant 380 G3 (RedHat Advanced Server 2.1) attached 
>>by 2 fiber channels each to the storage area network SAN HP MSA1000 and we 
>>want to install and configure GFS to allow 2 servers to simultaneously 
>>read and write to a single shared file system (Word documents located into 
>>/u04) located on the Storage area network SAN HP MSA1000.
>>I read the example that you have sent to me and I see 3 nodes, 2 client 
>>nodes share a directory mounted on the 3d server node, but our solution 
>>the directory is located in the SAN.
>>
>>Have you any explanation or ideas for our request?
>>Thanks
>>
>>Haydar
>>
>>
>>>From: Igor <logastellus at yahoo.com>
>>>Reply-To: linux clustering <linux-cluster at redhat.com>
>>>To: linux clustering <linux-cluster at redhat.com>
>>>Subject: Re: [Linux-cluster] GFS installation
>>>Date: Thu, 16 Jun 2005 08:33:16 -0700 (PDT)
>>>
>>>Look at this URL that David suggested to me:
>>>
>>>http://sources.redhat.com/cgi-bin/cvsweb.cgi/cluster/doc/min-gfs.txt?rev=1.3&content-type=text/x-cvsweb-markup&cvsroot=cluster
>>>
>>>it's pretty good.
>>>
>>>--- haydar Ali <haydar2906 at hotmail.com> wrote:
>>>
>>> > Hi,
>>> >
>>> > I'm looking for an installing and configuring
>>> > procedure for GFS (examples).
>>> > We have 2 servers HP Proliant 380 G3 (RedHat
>>> > Advanced Server 2.1) attached
>>> > by fiber optic to the storage area network SAN HP
>>> > MSA1000 and we want to
>>> > install and configure GFS to allow 2 servers to
>>> > simultaneously read and
>>> > write to a single shared file system (Word documents
>>> > located into /u04) on
>>> > the SAN.
>>> >
>>> > Thanks.
>>> >
>>> > Haydar
>>> >
>>> >
>>> > --
>>> > Linux-cluster mailing list
>>> > Linux-cluster at redhat.com
>>> > http://www.redhat.com/mailman/listinfo/linux-cluster
>>> >
>>>
>>>
>>>
>>>
>>>__________________________________
>>>Discover Yahoo!
>>>Stay in touch with email, IM, photo sharing and more. Check it out!
>>>http://discover.yahoo.com/stayintouch.html
>>>
>>>--
>>>Linux-cluster mailing list
>>>Linux-cluster at redhat.com
>>>http://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>http://www.redhat.com/mailman/listinfo/linux-cluster
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>http://www.redhat.com/mailman/listinfo/linux-cluster


From Hansjoerg.Maurer at dlr.de  Thu Jul  7 06:26:39 2005
From: Hansjoerg.Maurer at dlr.de (=?ISO-8859-15?Q?Hansj=F6rg_Maurer?=)
Date: Thu, 07 Jul 2005 08:26:39 +0200
Subject: [Linux-cluster] GFS for RHEL4U1 (2.6.9-11)?
Message-ID: <42CCCB1F.7080602@dlr.de>

Hi,

are there any GFS SRPM's available for RHEL4U1 (2.6.9-11)?

The cluster webpage http://sources.redhat.com/cluster/
says, that there is a RHEL4 CVS, which builds against 2.6.9-5.

Is GFS-6.1 already officially released?
Are there any plans for making stable SRPM's available,
like there are for RHEL3?

Thank you very much

Hansj?rg


-- 
_________________________________________________________________

Dr.  Hansjoerg Maurer           | LAN- & System-Manager
                                |
Deutsches Zentrum               | DLR Oberpfaffenhofen
  f. Luft- und Raumfahrt e.V.   |
Institut f. Robotik             |
Postfach 1116                   | Muenchner Strasse 20
82230 Wessling                  | 82234 Wessling
Germany                         |
                                |
Tel: 08153/28-2431              | E-mail: Hansjoerg.Maurer at dlr.de
Fax: 08153/28-1134              | WWW: http://www.robotic.dlr.de/
__________________________________________________________________


There are 10 types of people in this world, 
those who understand binary and those who don't.


From birger at uib.no  Thu Jul  7 10:43:12 2005
From: birger at uib.no (Birger Wathne)
Date: Thu, 07 Jul 2005 12:43:12 +0200
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1120685352.5704.9.camel@auh5-0478>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
Message-ID: <42CD0740.8040102@uib.no>

Eric Kerin wrote:

>
>It should already work this way.  Look in /usr/share/cluster/service.sh,
>there is a block of XML data that defines the service resource agent.
>Twords the end of the block of XML is a "special" tag this defines the
>child node types for that agent.  You'll notice each of the child nodes
>has a start and stop number.  These define the order that the given
>children are started and stopped  You'll see filesystems are started at
>level 2, and ip addresses are started at 3.   Since a nfs export is
>defined as a child of a fs agent, the nfs exports are turned on after
>mounting the filesystem, and before the IP address is active.
>
It kind of works this way, but still it doesn't...

I just did a clusvcadm -R <service> and I also tried with -s followed by -e.

What happens is exacly what you describe. Exports come up all in one go, 
then the IP address.
But then, a split second later all exports except the one I have in 
/etc/exports are gone. It's as if something has done 'exportfs -r'. I'll 
have to look into this. Could be my own config problem, as I restart the 
lockd when bringing up the service. However the exports are all there 
when I reboot and let the services come up automatically, and if my 
script is the culprit it should behave the same way then, shouldn't it?

After this, the export lines get added again when the cluster tests for 
them and finds them missing. The problem there is that not all the 
export lines get tested. Of my 9 export entries in cluster.conf only 5 
get tested and reexported after disappearing. As I said they are all 
there if I reboot and let the service come up automatically.

-- 
birger


From mtilstra at redhat.com  Thu Jul  7 13:40:00 2005
From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra)
Date: Thu, 7 Jul 2005 08:40:00 -0500
Subject: [Linux-cluster] installing GFS
In-Reply-To: <42CC8B15.7040705@cse.ohio-state.edu>
References: <mailman.0.1120700026.12535.linux-cluster@redhat.com>
	<42CC8B15.7040705@cse.ohio-state.edu>
Message-ID: <20050707134000.GA27833@redhat.com>

On Wed, Jul 06, 2005 at 09:53:25PM -0400, Shuang Liang wrote:
> Hi,
>   I am trying to install GFS on 3 cluster nodes with  redhat 
> linux-2.6.9-5EL using gnbd. I met with some problems the mount, which says
> "# mount -t gfs /dev/gnbd0 /mnt/gfs/                        
> mount: Connection refused"
> 
> I have tried the installation with different order and configuration all 
> end up with this error, hope sb can help solve this puzzle. Here are 
> some step I have taken:
> 1.Insert gfs-lock and gfs kernel modules at the GFS node.
>   Insert gnbd module at GFS node
> 2. run gnbd_serv -n at GNBD server;
>    ./gnbd_export -d /dev/hda8 -e mygnbd -c
> 3. ./gnbd_import -i k35-as4 -n at GFS node
> 4.  Now I want to create logic volume at GFS node: it failed saying
>   [root at k32-as4 sbin]# pvcreate /dev/gnbd0
>   Device /dev/gnbd0 not found.
> 
> So I choose to do the logic volume stuff at the GNBD server side and 
> gfs_mkfs at the GFS side, it succeeded. But it fails again when mounting.
> Then I tried to mkfs at GNBD server and mount directly through gnbd 
> device, same error again.
> 
> So I am wondering what should be the correct configuration trick.

is cman/dlm loaded and running?

-- 
Michael Conrad Tadpol Tilstra
I can see clearly now, the brain is gone...
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050707/2b7440c9/attachment.sig>

From jscottorn at possibilityforge.com  Thu Jul  7 14:33:14 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Thu, 07 Jul 2005 08:33:14 -0600
Subject: [Linux-cluster] HELP! Diapered block device
Message-ID: <42CD3D2A.2050108@possibilityforge.com>

Hi,

   I have a major issue.  I was running fine with a 4 node cluster for a
few days and then I copied some data to the GFS share and within a few
hours the gfs share went down.  I restarted the cluster and now I am
getting this error from dmesg and it also says removing device node
/dev/diaperedmd0:

GFS: Trying to join cluster "fsck_dlm", "SAN1:VserversFS"
lock_harness:  can't find protocol fsck_dlm
GFS: can't mount proto = fsck_dlm, table = SAN1:VserversFS, hostdata =

I have removed the storage server from the cluster and shutdown gnbd and
I have been running gfs_fsck for about 24 hours and it still hasn't
completed.  I have a 750GB raid 5 setup and I can't remount or
anything.  What do I need to do to fix this?

Here is what gfs_fsck reports:

Initializing fsck
Initializing lists...
Initializing special inodes...
Setting block ranges...
Creating a block list of size 183146926...
Clearing journals (this may take a while)
Clearing journal 0
Clearing journal 1
Clearing journal 2
Clearing journal 3
Cleared journals
Starting pass1
Checking metadata in Resource Group 0
Checking metadata in Resource Group 1
Checking metadata in Resource Group 2
...... Skipped to conserve space .............
Checking metadata in Resource Group 2791
Checking metadata in Resource Group 2792
Checking metadata in Resource Group 2793
Pass1 complete
Starting pass1b
Looking for duplicate blocks...
Found dup block at 61573000
Found dup block at 61573014
Found dup block at 61573015
Found dup block at 61573016
Found dup block at 61573017
Found dup block at 61573018
Found dup block at 61573019
Found dup block at 61573020
Found dup block at 61573021
Found dup block at 61573022
Found dup block at 61573024
Found dup block at 61573047
Found dup block at 61573048
Found dup block at 61573052
Found dup block at 61623032
Found dup block at 61623033
Found dup block at 61623034
Found dup block at 61623035
Scanning filesystem for inodes containing duplicate blocks...

Any help is greatly appreciated.

Thanks,

Jon


From jason at selu.edu  Thu Jul  7 15:02:01 2005
From: jason at selu.edu (Jason Lanclos)
Date: Thu, 7 Jul 2005 10:02:01 -0500
Subject: [Linux-cluster] HELP! Diapered block device
In-Reply-To: <42CD3D2A.2050108@possibilityforge.com>
References: <42CD3D2A.2050108@possibilityforge.com>
Message-ID: <200507071002.01130.Jason@selu.edu>

On Thursday 07 July 2005 09:33 am, Jon Scottorn wrote:
> GFS: Trying to join cluster "fsck_dlm", "SAN1:VserversFS"
> lock_harness: ?can't find protocol fsck_dlm
> GFS: can't mount proto = fsck_dlm, table = SAN1:VserversFS, hostdata =
> 

 the lock protocol should be lock_dlm not fsck_dlm
 
 you can change this with gfs_tool
	gfs_tool sb <device> proto lock_dlm

-- 
Jason Lanclos                                        
Systems Administrator                                 
Red Hat Certified Engineer        
Southeastern Louisiana University		     


From jscottorn at possibilityforge.com  Thu Jul  7 15:16:45 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Thu, 07 Jul 2005 09:16:45 -0600
Subject: [Linux-cluster] HELP! Diapered block device
In-Reply-To: <200507071002.01130.Jason@selu.edu>
References: <42CD3D2A.2050108@possibilityforge.com>
	<200507071002.01130.Jason@selu.edu>
Message-ID: <42CD475D.3060609@possibilityforge.com>

Thanks,

    That made it so I can mount it from the other nodes, but now I can't
mount it on the storage server.
This is what dmesg reports:

GFS: fsid=SAN1:VserversFS.0: Joined cluster. Now mounting FS...
GFS: fsid=SAN1:VserversFS.0: jid=0: Trying to acquire journal lock...
GFS: fsid=SAN1:VserversFS.0: jid=0: Looking at journal...
GFS: fsid=SAN1:VserversFS.0: jid=0: Done
GFS: fsid=SAN1:VserversFS.0: jid=1: Trying to acquire journal lock...
GFS: fsid=SAN1:VserversFS.0: jid=1: Looking at journal...
GFS: fsid=SAN1:VserversFS.0: jid=1: Done
GFS: fsid=SAN1:VserversFS.0: jid=2: Trying to acquire journal lock...
GFS: fsid=SAN1:VserversFS.0: jid=2: Looking at journal...
GFS: fsid=SAN1:VserversFS.0: jid=2: Done
GFS: fsid=SAN1:VserversFS.0: jid=3: Trying to acquire journal lock...
GFS: fsid=SAN1:VserversFS.0: jid=3: Looking at journal...
GFS: fsid=SAN1:VserversFS.0: jid=3: Done
GFS: fsid=SAN1:VserversFS.0: fatal: filesystem consistency error
GFS: fsid=SAN1:VserversFS.0:   inode = 52732294/52732294
GFS: fsid=SAN1:VserversFS.0:   function = gfs_change_nlink
GFS: fsid=SAN1:VserversFS.0:   file =
/usr/src/cluster/cluster.STABLE.20050701/gfs-kernel/src/gfs/inode.c, line
= 843
GFS: fsid=SAN1:VserversFS.0:   time = 1120748944
GFS: fsid=SAN1:VserversFS.0: about to withdraw from the cluster
GFS: fsid=SAN1:VserversFS.0: waiting for outstanding I/O
GFS: fsid=SAN1:VserversFS.0: telling LM to withdraw
lock_dlm: withdraw abandoned memory
GFS: fsid=SAN1:VserversFS.0: withdrawn
  mh_magic = 0x01161970
  mh_type = 4
  mh_generation = 13
  mh_format = 400
  mh_incarn = 4
  no_formal_ino = 52732294
  no_addr = 52732294
  di_mode = 0600
  di_uid = 31
  di_gid = 32
  di_nlink = 0
  di_size = 0
  di_blocks = 1
  di_atime = 1120690913
  di_mtime = 1120690938
  di_ctime = 1120690938
  di_major = 0
  no_formal_ino = 0
  no_addr = 0
  di_eattr = 0
  di_reserved =
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00

I'm not sure what else I am missing.
Also, the mount I am doing from the clients are done with mounting the
gnbd device ie.. /dev/gnbd/fs
On the storage server itself, I am mounting the actual device ie.. /dev/md0
Don't know if that causes an issue doing it that way or not.

Any other thoughts as to how to get my errors fixed.

Thanks,

Jon

Jason Lanclos wrote:

>On Thursday 07 July 2005 09:33 am, Jon Scottorn wrote:
>  
>
>>GFS: Trying to join cluster "fsck_dlm", "SAN1:VserversFS"
>>lock_harness:  can't find protocol fsck_dlm
>>GFS: can't mount proto = fsck_dlm, table = SAN1:VserversFS, hostdata =
>>
>>    
>>
>
> the lock protocol should be lock_dlm not fsck_dlm
> 
> you can change this with gfs_tool
>	gfs_tool sb <device> proto lock_dlm
>
>  
>


From eric at bootseg.com  Thu Jul  7 15:22:03 2005
From: eric at bootseg.com (Eric Kerin)
Date: Thu, 07 Jul 2005 11:22:03 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42CD0740.8040102@uib.no>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no>
Message-ID: <1120749723.5534.7.camel@auh5-0478>

On Thu, 2005-07-07 at 12:43 +0200, Birger Wathne wrote:
> Eric Kerin wrote:
> 
> >
> >It should already work this way.  Look in /usr/share/cluster/service.sh,
> >there is a block of XML data that defines the service resource agent.
> >Twords the end of the block of XML is a "special" tag this defines the
> >child node types for that agent.  You'll notice each of the child nodes
> >has a start and stop number.  These define the order that the given
> >children are started and stopped  You'll see filesystems are started at
> >level 2, and ip addresses are started at 3.   Since a nfs export is
> >defined as a child of a fs agent, the nfs exports are turned on after
> >mounting the filesystem, and before the IP address is active.
> >
> It kind of works this way, but still it doesn't...

> What happens is exacly what you describe. Exports come up all in one go, 
> then the IP address.
> But then, a split second later all exports except the one I have in 
> /etc/exports are gone. It's as if something has done 'exportfs -r'. I'll 
> have to look into this. Could be my own config problem, as I restart the 
> lockd when bringing up the service. However the exports are all there 
> when I reboot and let the services come up automatically, and if my 
> script is the culprit it should behave the same way then, shouldn't it?
> 
Just because I'm curious, why do you restart lockd?  Are you restarting
any other nfs related services from rgmanager?

> Of my 9 export entries in cluster.conf only 5 
> get tested and reexported after disappearing. As I said they are all 
> there if I reboot and let the service come up automatically.
> 
It'd be interesting to see the relevant section of your cluster.conf
file.  Also you don't have any of the filesystems you are exporting in
the cluster setup in /etc/exports, do you?

-- 
Eric Kerin <eric at bootseg.com>


From alewis at redhat.com  Thu Jul  7 15:34:35 2005
From: alewis at redhat.com (AJ Lewis)
Date: Thu, 7 Jul 2005 10:34:35 -0500
Subject: [Linux-cluster] HELP! Diapered block device
In-Reply-To: <42CD475D.3060609@possibilityforge.com>
References: <42CD3D2A.2050108@possibilityforge.com>
	<200507071002.01130.Jason@selu.edu>
	<42CD475D.3060609@possibilityforge.com>
Message-ID: <20050707153435.GD15005@null.msp.redhat.com>

On Thu, Jul 07, 2005 at 09:16:45AM -0600, Jon Scottorn wrote:
> Thanks,
> 
>     That made it so I can mount it from the other nodes, but now I can't
> mount it on the storage server.

Gah!  Is the fsck still running?  You *CANNOT* run the fsck while other nodes
have the fs mounted.  The fsck changes the lock protocol to prevent others
from mounting after the fsck starts.  It will be changed back after
completion.

The fsck can take a while in the duplicate block code - could you tell if it
was still accessing storage?  If you have lots of inodes in the system, it's
gonna take a while to work through them in the dup block handling code.

Regards,
-- 
AJ Lewis                                   Voice:  612-638-0500
Red Hat                                    E-Mail: alewis at redhat.com
One Main Street SE, Suite 209
Minneapolis, MN 55414
   
Current GPG fingerprint = D9F8 EDCE 4242 855F A03D  9B63 F50C 54A8 578C 8715
Grab the key at: http://people.redhat.com/alewis/gpg.html or one of the
many keyservers out there...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050707/59c48db7/attachment.sig>

From jscottorn at possibilityforge.com  Thu Jul  7 15:40:49 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Thu, 07 Jul 2005 09:40:49 -0600
Subject: [Linux-cluster] HELP! Diapered block device
In-Reply-To: <20050707153435.GD15005@null.msp.redhat.com>
References: <42CD3D2A.2050108@possibilityforge.com>	<200507071002.01130.Jason@selu.edu>	<42CD475D.3060609@possibilityforge.com>
	<20050707153435.GD15005@null.msp.redhat.com>
Message-ID: <42CD4D01.20803@possibilityforge.com>

When I ran the fsck, i had everything unmounted as well as the gnbd serv
stopped.  I let it run for almost 24 hours and it was still running. 
That seems a little long for me.  Should I let it run again and see what
happens.  My main problem is I can't have the FS down for that long. 

Thanks,

Jon

AJ Lewis wrote:

>On Thu, Jul 07, 2005 at 09:16:45AM -0600, Jon Scottorn wrote:
>  
>
>>Thanks,
>>
>>    That made it so I can mount it from the other nodes, but now I can't
>>mount it on the storage server.
>>    
>>
>
>Gah!  Is the fsck still running?  You *CANNOT* run the fsck while other nodes
>have the fs mounted.  The fsck changes the lock protocol to prevent others
>from mounting after the fsck starts.  It will be changed back after
>completion.
>
>The fsck can take a while in the duplicate block code - could you tell if it
>was still accessing storage?  If you have lots of inodes in the system, it's
>gonna take a while to work through them in the dup block handling code.
>
>Regards,
>  
>
>------------------------------------------------------------------------
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>http://www.redhat.com/mailman/listinfo/linux-cluster
>


From lhh at redhat.com  Thu Jul  7 15:49:17 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 07 Jul 2005 11:49:17 -0400
Subject: [Linux-cluster] RHEL3 Cluster network hangup
In-Reply-To: <42CB7A9B.30304@riege.com>
References: <42CB7A9B.30304@riege.com>
Message-ID: <1120751357.15658.40.camel@ayanami.boston.redhat.com>

On Wed, 2005-07-06 at 08:30 +0200, Gunther Schlegel wrote:
> The clustered application does a lot of printing (lprng), 
> faxing(hylafax) and mailing(sendmail). It uses shell scripts to pass the 
> jobs to the operating systems daemons.

> The client programs of these daemons, which pass jobs to the daemons 
> using network connections to localhost start to behave irregular when 
> the cluster is up for about 2 weeks.

> Examples:
> - hylafax faxstat stops listing the transmitted faxes in the middle of 
> the list ( but always at the same job )
> - sendmail opens a connection to the local daemon but does not transfer 
> the message. Both processes sit there and wait, after some time the 
> server closes the connection because of missing input from the clients side.
> - same with lpr.
> 
> I assume that something locks up in the ip stack. Not all services are 
> affected at the same time.
> 
> I guess this is related to the cluster software as we run that 
> application on a lot of servers which all do not show this behaviour and 
> that are all not clustered.

I doubt it, but it's not out of the realm of possibility.  The cluster
software does three things mostly:

(a) figures out who's online
(b) shoots nodes
(c) manages services using shell scripts

The shell scripts call standard utilities (ifconfig, route, etc.).

Now -- here's the thing.  Earlier versions of clumanager (<1.2.22) had a
problem where sometimes (and randomly!), services would get a bogus
status return and restart on the same node.  Also, the most recent
errata fixed a signal handling problem which broke JVMs from running
under it.  Either of these may have caused the problems on your cluster,
I don't know.  The former would have associated log messages; the latter
wouldn't.

I'd try the latest release from RHN (clumanager-1.2.26.1-1).

If that doesn't work, I'd call Red Hat Support...

-- Lon


From lhh at redhat.com  Thu Jul  7 15:58:20 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 07 Jul 2005 11:58:20 -0400
Subject: [Linux-cluster] NFS exports disappeared
In-Reply-To: <42CBABAC.4000909@uib.no>
References: <42C99901.8090805@uib.no>  <42CBABAC.4000909@uib.no>
Message-ID: <1120751900.15658.48.camel@ayanami.boston.redhat.com>

On Wed, 2005-07-06 at 12:00 +0200, Birger Wathne wrote:
> I'm still not certain how I lost the nfs export list in the first place, but 
> I think I see why they never got 'fixed'.
> 
> I added some logging inside /usr/share/cluster/nfsclient.sh, and it seems 
> like status is only checked for a few of my exports.
> 
> Is this a bug? Shouldn't all nfs exports get checked regularly?
> 

Yes, they should be.

Can you include the relevant parts of your cluster.conf when you file
the bugzilla?

-- Lon


From lhh at redhat.com  Thu Jul  7 16:01:07 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 07 Jul 2005 12:01:07 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42CBAF82.8050900@uib.no>
References: <42CBAF82.8050900@uib.no>
Message-ID: <1120752067.15658.54.camel@ayanami.boston.redhat.com>

On Wed, 2005-07-06 at 12:16 +0200, Birger Wathne wrote:
> When setting up a configuration for NFS server failover, can I ensure the IP 
> address comes up after exports are ready, and more importantly that the IP 
> address goes down before unexporting by placeing the <ip ...> entry at the 
> end of the service declaration?

The IP is stopped before disk devices are stopped.

See the "child types" in service.sh.  IPs are started next-to-last
(scripts are started last).

If you do a typical configuration where the NFS exports are children of
file systems, and clients are children of exports, your IP will start
after NFS exports are complete and the IP will be taken off before the
NFS exports are unexported.

> When taking down the nfs service I want the server to 'disappear' before it 
> starts unexporting, as i want clients to hang waiting for the service to 
> come back up instead of getting io errors.

That's how NFS failover needs to be done, actually.

-- Lon


From lhh at redhat.com  Thu Jul  7 16:05:16 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 07 Jul 2005 12:05:16 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1120749723.5534.7.camel@auh5-0478>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no>  <1120749723.5534.7.camel@auh5-0478>
Message-ID: <1120752316.15658.59.camel@ayanami.boston.redhat.com>

On Thu, 2005-07-07 at 11:22 -0400, Eric Kerin wrote:
> > What happens is exacly what you describe. Exports come up all in one go, 
> > then the IP address.
> > But then, a split second later all exports except the one I have in 
> > /etc/exports are gone. It's as if something has done 'exportfs -r'. I'll 
> > have to look into this. Could be my own config problem, as I restart the 
> > lockd when bringing up the service. However the exports are all there 
> > when I reboot and let the services come up automatically, and if my 
> > script is the culprit it should behave the same way then, shouldn't it?
> > 
> Just because I'm curious, why do you restart lockd?  Are you restarting
> any other nfs related services from rgmanager?
> 
> > Of my 9 export entries in cluster.conf only 5 
> > get tested and reexported after disappearing. As I said they are all 
> > there if I reboot and let the service come up automatically.
> > 
> It'd be interesting to see the relevant section of your cluster.conf
> file.  Also you don't have any of the filesystems you are exporting in
> the cluster setup in /etc/exports, do you?

... and /etc/exports.

If the cluster is for some reason unexporting stuff in /etc/exports
which it shouldn't, it's a bug.

e.g.:

If you have /mnt/clusterexport in cluster.conf as an NFS export, and you
have "/tmp" as an export in /etc/exports - and the "/tmp" export is
dissappearing, that's a bug.

If you have /mnt/clusterexport in both cluster.conf
and /etc/exports, ... that's a config problem.  Let the cluster manage
the stuff you intend to export from the cluster. =)

-- Lon


From jscottorn at possibilityforge.com  Thu Jul  7 16:17:46 2005
From: jscottorn at possibilityforge.com (Jon Scottorn)
Date: Thu, 07 Jul 2005 10:17:46 -0600
Subject: [Linux-cluster] HELP! Diapered block device
In-Reply-To: <42CD4D01.20803@possibilityforge.com>
References: <42CD3D2A.2050108@possibilityforge.com>	<200507071002.01130.Jason@selu.edu>	<42CD475D.3060609@possibilityforge.com>	<20050707153435.GD15005@null.msp.redhat.com>
	<42CD4D01.20803@possibilityforge.com>
Message-ID: <42CD55AA.2040004@possibilityforge.com>

Ok, so I am rerunning gfs_fsck again.  I have everything unmounted and
the storage server not even in the cluster.  It has been running now for
30 mins again and it gets to this point and just doesn't look like it is
doing anything.  This is where is stayed after running it for 24 hours
yesterday:  Here is what the verbose output from gfs_fsck:

Initializing fsck
Initializing lists...
Initializing special inodes...
Setting block ranges...
Creating a block list of size 183146926...
Clearing journals (this may take a while)
Clearing journal 0
Clearing journal 1
Clearing journal 2
Clearing journal 3
Cleared journals
Starting pass1
Checking metadata in Resource Group 0
Checking metadata in Resource Group 1
Checking metadata in Resource Group 2
Checking metadata in Resource Group 3

........Omitted lines for space.........................

Checking metadata in Resource Group 2790
Checking metadata in Resource Group 2791
Checking metadata in Resource Group 2792
Checking metadata in Resource Group 2793
Pass1 complete
Starting pass1b
Looking for duplicate blocks...
Found dup block at 61573000
Found dup block at 61573014
Found dup block at 61573015
Found dup block at 61573016
Found dup block at 61573017
Found dup block at 61573018
Found dup block at 61573019
Found dup block at 61573020
Found dup block at 61573021
Found dup block at 61573022
Found dup block at 61573024
Found dup block at 61573047
Found dup block at 61573048
Found dup block at 61573052
Found dup block at 61623032
Found dup block at 61623033
Found dup block at 61623034
Found dup block at 61623035
Scanning filesystem for inodes containing duplicate blocks...

Once it gets to this point it just sits there.  gfs_fsck is using 99% of
the CPU for the whole time it runs.  What else can I do to get this fixed?

Thanks,

Jon

Jon Scottorn wrote:

>When I ran the fsck, i had everything unmounted as well as the gnbd serv
>stopped.  I let it run for almost 24 hours and it was still running. 
>That seems a little long for me.  Should I let it run again and see what
>happens.  My main problem is I can't have the FS down for that long. 
>
>Thanks,
>
>Jon
>
>AJ Lewis wrote:
>
>  
>
>>On Thu, Jul 07, 2005 at 09:16:45AM -0600, Jon Scottorn wrote:
>> 
>>
>>    
>>
>>>Thanks,
>>>
>>>   That made it so I can mount it from the other nodes, but now I can't
>>>mount it on the storage server.
>>>   
>>>
>>>      
>>>
>>Gah!  Is the fsck still running?  You *CANNOT* run the fsck while other nodes
>>have the fs mounted.  The fsck changes the lock protocol to prevent others
>>    
>>
>>from mounting after the fsck starts.  It will be changed back after
>  
>
>>completion.
>>
>>The fsck can take a while in the duplicate block code - could you tell if it
>>was still accessing storage?  If you have lots of inodes in the system, it's
>>gonna take a while to work through them in the dup block handling code.
>>
>>Regards,
>> 
>>
>>------------------------------------------------------------------------
>>
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>http://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>    
>>
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>http://www.redhat.com/mailman/listinfo/linux-cluster
>
>  
>


From lhh at redhat.com  Thu Jul  7 16:20:45 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 07 Jul 2005 12:20:45 -0400
Subject: [Linux-cluster] [PATCH] add syscall missing includes
	in	rgmanager/src/clulib/gettid.c
In-Reply-To: <42CB6563.5050007@fabbione.net>
References: <42CB62CB.408@fabbione.net>  <42CB6563.5050007@fabbione.net>
Message-ID: <1120753245.15658.67.camel@ayanami.boston.redhat.com>

On Wed, 2005-07-06 at 07:00 +0200, Fabio Massimo Di Nitto wrote:
> Fabio Massimo Di Nitto wrote:
> > Hi guys,
> >    patch is pretty self-explanatory, when we did change the way in which we use
> > syscall, we forgot to add the relevant includes.
> > Patch is against the STABLE branch, but i am pretty sure it applies all over.
> > As a consequence it also shuts up a warning a build time.
> > 
> > Please apply.
> > 
> > Cheers
> > Fabio

Thanks

-- Lon


From phung at cs.columbia.edu  Thu Jul  7 22:38:16 2005
From: phung at cs.columbia.edu (Dan B. Phung)
Date: Thu, 7 Jul 2005 18:38:16 -0400 (EDT)
Subject: [Linux-cluster] kernel versions supported by -rSTABLE?
Message-ID: <Pine.LNX.4.44.0507071836590.21204-100000@algiers.clic.cs.columbia.edu>

Just wondering, which kernel versions are known to 
work with gfs?

-dan


From phung at cs.columbia.edu  Fri Jul  8 01:01:33 2005
From: phung at cs.columbia.edu (Dan B. Phung)
Date: Thu, 7 Jul 2005 21:01:33 -0400 (EDT)
Subject: [Linux-cluster] kernel versions supported by -rSTABLE?
In-Reply-To: <Pine.LNX.4.44.0507071836590.21204-100000@algiers.clic.cs.columbia.edu>
Message-ID: <Pine.LNX.4.44.0507072101150.21204-100000@algiers.clic.cs.columbia.edu>

to be more specific, will it work for 2.6.11?


On 7, Jul, 2005, Dan B. Phung declared:

> Just wondering, which kernel versions are known to 
> work with gfs?
> 
> -dan
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 

-- 
email:  phung at cs.columbia.edu
www:    http://www.cs.columbia.edu/~phung
phone:  646-775-6090
office: CS Dept. 520, 1214 Amsterdam Ave., MC 0401, New York, NY 10027


From schlegel at riege.com  Fri Jul  8 06:27:13 2005
From: schlegel at riege.com (Gunther Schlegel)
Date: Fri, 08 Jul 2005 08:27:13 +0200
Subject: [Linux-cluster] RHEL3 Cluster network hangup
In-Reply-To: <1120751357.15658.40.camel@ayanami.boston.redhat.com>
References: <42CB7A9B.30304@riege.com>
	<1120751357.15658.40.camel@ayanami.boston.redhat.com>
Message-ID: <42CE1CC1.2070200@riege.com>

Lon,

> I doubt it, but it's not out of the realm of possibility.  The cluster
> software does three things mostly:
> 
> (a) figures out who's online
> (b) shoots nodes
> (c) manages services using shell scripts
> 
> The shell scripts call standard utilities (ifconfig, route, etc.).

 From the theory you are right ( and you probably know your software ). 
But what about this:

The software we run has "background job managers", which are started by 
the script I made up for the cluster. When I run lsof on such a 
bcj-process, it looks like this:

[root at tim root]# lsof -p 22993
COMMAND   PID USER   FD   TYPE DEVICE    SIZE     NODE NAME
plb     22993  rsi  cwd    DIR   8,19   24576  2949121 /opt/rsi/de/ham/data
plb     22993  rsi  rtd    DIR   8,10    4096        2 /
plb     22993  rsi  txt    REG   8,19 1043972 10944608 /opt/rsi/plb90f/plb
plb     22993  rsi  mem    REG   8,10 1571824   102450 
/lib/tls/libc-2.3.2.so
plb     22993  rsi  mem    REG   8,10   14868    74214 /lib/libdl-2.3.2.so
plb     22993  rsi  mem    REG   8,10   97712   104282 
/lib/tls/libpthread-0.60.so
plb     22993  rsi  mem    REG   8,10   23388    73133 
/lib/libcrypt-2.3.2.so
plb     22993  rsi  mem    REG    8,6   52584   637864 
/usr/lib/libz.so.1.1.4
plb     22993  rsi  mem    REG   8,10  213508   104281 
/lib/tls/libm-2.3.2.so
plb     22993  rsi  mem    REG   8,10  106912    73243 /lib/ld-2.3.2.so
plb     22993  rsi    0r   CHR    1,3            60122 /dev/null
plb     22993  rsi    1w   CHR    1,3            60122 /dev/null
plb     22993  rsi    2w   CHR    1,3            60122 /dev/null
plb     22993  rsi    3uw  REG   8,19       0 10944633 
/opt/rsi/plb90f/.^A^A^A^A^A^A^A^B
plb     22993  rsi    4u   REG   8,19    2048  2965515 
/opt/rsi/de/ham/data/cook.isi
plb     22993  rsi    5u   REG   8,19    4509  2949322 
/opt/rsi/de/ham/data/cook.txt
plb     22993  rsi    6u   REG   8,19    2304  3134942 
/opt/rsi/de/ham/scra/bcjmgr301100000009640.par
plb     22993  rsi    7u   REG   8,19    2048   868402 
/opt/rsi/de/data/bct.isi
plb     22993  rsi    8u   REG   8,19    1512   869372 
/opt/rsi/de/data/bct.txt
... data files shortened ...

If the BCJ-Process has been started by the cluster, it also has 
IP-sockets listet by lsof. The application is that old that it has no 
clue about IP, therefore it will not open IP sockets itself.

Would have added another lsof output, but I disabled the cluster 
software on customers demand.

Another weird fact is that the applications index file get broken if I 
use ext3. ext2 is fine. Applikation is technically old fashioned and 
stores data just is hundreds of text files with external index files and 
concurrent access. Without the cluster ext3 is fine as well. ( Same with 
LVM, besides. )

> Now -- here's the thing.  Earlier versions of clumanager (<1.2.22) had a

I have been running 1.2.22.

> status return and restart on the same node.  Also, the most recent
> errata fixed a signal handling problem which broke JVMs from running
> under it.  Either of these may have caused the problems on your cluster,
> I don't know.  The former would have associated log messages; the latter
> wouldn't.

There have not been any log messages.

> I'd try the latest release from RHN (clumanager-1.2.26.1-1).

Hmm, I will probably not start up the cluster again... :(

> If that doesn't work, I'd call Red Hat Support...

While calling support is always on option, I am pretty much sure that it 
will not lead to a solution. In the end they will not be able to 
reproduce it and I can't test on a customers production system.

Do not point me to test systems -- they are there, but they do not have 
the problem. Seems to be related to the workload of the machine, which 
is hard to simulate.

regards, Gunther
-------------- next part --------------
A non-text attachment was scrubbed...
Name: schlegel.vcf
Type: text/x-vcard
Size: 331 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050708/aa85b338/attachment.vcf>

From birger at uib.no  Fri Jul  8 07:57:12 2005
From: birger at uib.no (Birger Wathne)
Date: Fri, 08 Jul 2005 09:57:12 +0200
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1120749723.5534.7.camel@auh5-0478>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>	
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
Message-ID: <42CE31D8.4090206@uib.no>

Eric Kerin wrote:

>Just because I'm curious, why do you restart lockd?  Are you restarting
>any other nfs related services from rgmanager?
>  
>
I think I found the bug. A line had sneaked (snuck?) back into 
cluster.conf from an early experiment. I have just been blind to this 
line in the config file. I once had it set up so that the service 
started NFS. Later I  changed it so NFS gets started at boot with just a 
dummy export in my exports file.

Now, the line starting NFS was back in my config file. Must have been a 
copy/paste error at some time.

The lockd/statd that start at boot shouldn't really have anything to do, 
so when the service starts I simply run a modified version of 
/etc/init.d/nfslock that kills the existing daemons and start them again 
with some parameters set... I set the hostname to the one that follows 
the nfs service, and redirect the state dir to a gfs disk. When the 
service stops, these daemons get stopped, and the default ones started 
again. This way I hope to get the HA service to also take over locks.

Here is a snippet of my config file showing the resources and the nfs 
service. Of the listed exports, the first one for users01 and all 3 for 
iftscratch don't seem to get status checks, and thus never came up again.

<resources>
    <clusterfs fstype="gfs" name="cluadmfs" mountpoint="/cluadm" 
device="/dev/mapper/raid5-cluadm" options="acl"/>
    <clusterfs fstype="gfs" name="pakkefs" mountpoint="/service/pakke" 
device="/dev/mapper/raid5-pakke" options="acl"/>
    <clusterfs fstype="gfs" name="users01fs" 
mountpoint="/service/users01" device="/dev/mapper/raid5-users01" 
options="acl"/>
    <clusterfs fstype="gfs" name="iftscratchfs" 
mountpoint="/service/iftscratch" device="/dev/mapper/raid5-iftscratch" 
options="acl"/>
    <nfsclient name="nis-hosts" target="@nis-hosts" options="rw,sync"/>
    <nfsclient name="nis-hosts-ro" target="@nis-hosts" 
options="ro,sync,no_root_squash"/>
    <nfsclient name="h-nfs-root" target="@h-nfs-root" 
options="rw,sync,no_root_squash"/>
    <nfsclient name="h-nfs-insecure" target="@h-nfs-insecure" 
options="rw,sync,insecure"/>
    <nfsclient name="h-nfs-insecure-ro" target="@h-nfs-insecure" 
options="ro,sync,no_root_squash,insecure"/>
    <nfsclient name="dhcp-ro" target="@dhcp-hosts" 
options="ro,sync,no_root_squash"/>
  </resources>

  <service name="nfssvc" domain="nfsdomain">
    <ip address="129.177.40.30" monitor_link="yes"/>
    <script name="NFS lockd script" file="/etc/init.d/nfslock-svc"/>
    <clusterfs ref="cluadmfs"/>
    <clusterfs ref="pakkefs">
      <nfsexport name="pakkefs">
        <nfsclient ref="h-nfs-root"/>
        <nfsclient ref="h-nfs-insecure-ro"/>
        <nfsclient ref="nis-hosts-ro"/>
      </nfsexport>
    </clusterfs>
    <clusterfs ref="users01fs">
      <nfsexport name="users01fs">
        <nfsclient ref="h-nfs-root"/>
        <nfsclient ref="h-nfs-insecure"/>
        <nfsclient ref="nis-hosts"/>
      </nfsexport>
    </clusterfs>
    <clusterfs ref="iftscratchfs">
      <nfsexport name="iftscratchfs">
        <nfsclient ref="h-nfs-root"/>
        <nfsclient ref="h-nfs-insecure"/>
        <nfsclient ref="nis-hosts"/>
      </nfsexport>
    </clusterfs>
  </service>


From Birger.Wathne at ift.uib.no  Fri Jul  8 10:55:26 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Fri, 08 Jul 2005 12:55:26 +0200
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1120752316.15658.59.camel@ayanami.boston.redhat.com>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>	
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<1120752316.15658.59.camel@ayanami.boston.redhat.com>
Message-ID: <42CE5B9E.10700@ift.uib.no>

Lon Hohberger wrote:

>
>... and /etc/exports.
>
>If the cluster is for some reason unexporting stuff in /etc/exports
>which it shouldn't, it's a bug.
>  
>
No. It was my own config problem that unexported the file systems, I guess.

>e.g.:
>
>If you have /mnt/clusterexport in cluster.conf as an NFS export, and you
>have "/tmp" as an export in /etc/exports - and the "/tmp" export is
>dissappearing, that's a bug.
>
>If you have /mnt/clusterexport in both cluster.conf
>and /etc/exports, ... that's a config problem.  Let the cluster manage
>the stuff you intend to export from the cluster. =)
>  
>
No problem here. The export I have in /etc/exports was still there.

As I said above, the deletion of the cluster-administered exports was 
probably my fault, as I had gotten a line from an early config file back 
into my new one. This line tried to start the nfs startup script on a 
system that already had nfs running. I guess this results in 'exportfs 
-r' after the cluster has exported, and suddenly the list is back to 
just the contents of /etc/exports.

What I still think may be a bug is that of my 9 exports in cluster.conf 
only 5 get checked regularly through calls to "nfsclient.sh status", and 
thus only those 5 got reexported after they got deleted. I would never 
have seen this problem without messing up myself in the first place :-/

-- 
birger


From cfeist at redhat.com  Fri Jul  8 14:20:47 2005
From: cfeist at redhat.com (Chris Feist)
Date: Fri, 08 Jul 2005 09:20:47 -0500
Subject: [Linux-cluster] kernel versions supported by -rSTABLE?
In-Reply-To: <Pine.LNX.4.44.0507072101150.21204-100000@algiers.clic.cs.columbia.edu>
References: <Pine.LNX.4.44.0507072101150.21204-100000@algiers.clic.cs.columbia.edu>
Message-ID: <42CE8BBF.8040607@redhat.com>

It should work against that kernel.  Please let us know if it doesn't.

Thanks,
Chris

Dan B. Phung wrote:
> to be more specific, will it work for 2.6.11?
> 
> 
> On 7, Jul, 2005, Dan B. Phung declared:
> 
> 
>>Just wondering, which kernel versions are known to 
>>work with gfs?
>>
>>-dan
>>
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>http://www.redhat.com/mailman/listinfo/linux-cluster
>>
> 
> 


From lhh at redhat.com  Fri Jul  8 15:07:11 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 08 Jul 2005 11:07:11 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42CE5B9E.10700@ift.uib.no>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no>  <1120749723.5534.7.camel@auh5-0478>
	<1120752316.15658.59.camel@ayanami.boston.redhat.com>
	<42CE5B9E.10700@ift.uib.no>
Message-ID: <1120835231.15658.73.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-08 at 12:55 +0200, Birger Wathne wrote:

> What I still think may be a bug is that of my 9 exports in cluster.conf 
> only 5 get checked regularly through calls to "nfsclient.sh status", and 
> thus only those 5 got reexported after they got deleted. I would never 
> have seen this problem without messing up myself in the first place :-/

You're right, that is odd...

I think I should add a "dump services" facility to rgmanager for
debugging.  (Basically, build in the rg_test printing stuff so that it
activates on sigusr1 and dumps stuff to /tmp so you can compare).

-- Lon


From lhh at redhat.com  Fri Jul  8 15:52:00 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 08 Jul 2005 11:52:00 -0400
Subject: [Linux-cluster] RHEL3 Cluster network hangup
In-Reply-To: <42CE1CC1.2070200@riege.com>
References: <42CB7A9B.30304@riege.com>
	<1120751357.15658.40.camel@ayanami.boston.redhat.com>
	<42CE1CC1.2070200@riege.com>
Message-ID: <1120837920.15658.116.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-08 at 08:27 +0200, Gunther Schlegel wrote:

> I have been running 1.2.22.

Yup, that fixed the status problem, but...

> > Also, the most recent errata fixed a signal handling problem which
> > broke JVMs from running under it.

> There have not been any log messages.
>
> > I'd try the latest release from RHN (clumanager-1.2.26.1-1).

... it is very important to note that JVMs weren't the only thing that
broke because of the signal bug.

The signal bug was not fixed until to 1.2.26.1 (latest errata).  Some
processes use signals to communicate and avoid deadlocks or blocking,
but if the signals are blocked, they don't much help with those
problems...

As an example - a process which calls alarm(5) to set a timer to wake
itself up right before it calls, say, a blocking select().  5 seconds
later, SIGALRM comes in - but because it is blocked, the process gets
stuck in select() forever.


> > If that doesn't work, I'd call Red Hat Support...
> 
> While calling support is always on option, I am pretty much sure that it 
> will not lead to a solution. In the end they will not be able to 
> reproduce it and I can't test on a customers production system.

I suspect that the first thing they would have you do is try the latest
errata from RHN (which fixes the signal problem):

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=153070
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=161060
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=143867
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=149059

(Yeah, it's that bad.)

... which is why I recommended trying it *before* calling support.

> Do not point me to test systems -- they are there, but they do not have 
> the problem. Seems to be related to the workload of the machine, which 
> is hard to simulate.

> Hmm, I will probably not start up the cluster again... :(

(snipped from earlier)

Use your own judgment, and make the choices that are right for you and
your customer, whatever they are.  I am sorry I could not be more
helpful.

Good luck.

-- Lon


From lhh at redhat.com  Fri Jul  8 15:56:41 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 08 Jul 2005 11:56:41 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42CE31D8.4090206@uib.no>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<42CE31D8.4090206@uib.no>
Message-ID: <1120838201.15658.121.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-08 at 09:57 +0200, Birger Wathne wrote:
> Eric Kerin wrote:
> 
> >Just because I'm curious, why do you restart lockd?  Are you restarting
> >any other nfs related services from rgmanager?
> >  
> >
> I think I found the bug. A line had sneaked (snuck?) back into 
> cluster.conf from an early experiment. I have just been blind to this 
> line in the config file. I once had it set up so that the service 
> started NFS. Later I  changed it so NFS gets started at boot with just a 
> dummy export in my exports file.

Ah ha...!

> Here is a snippet of my config file showing the resources and the nfs 
> service. Of the listed exports, the first one for users01 and all 3 for 
> iftscratch don't seem to get status checks, and thus never came up again.

Ok, now we're getting somewhere...  That definitely sounds like a bug.
Want to take this to bugzilla?

-- Lon


From lhh at redhat.com  Fri Jul  8 15:57:11 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 08 Jul 2005 11:57:11 -0400
Subject: [Linux-cluster] kernel versions supported by -rSTABLE?
In-Reply-To: <42CE8BBF.8040607@redhat.com>
References: <Pine.LNX.4.44.0507072101150.21204-100000@algiers.clic.cs.columbia.edu>
	<42CE8BBF.8040607@redhat.com>
Message-ID: <1120838231.15658.123.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-08 at 09:20 -0500, Chris Feist wrote:
> It should work against that kernel.  Please let us know if it doesn't.

-STABLE is tracking 2.6.12, I thought?

-- Lon


From cfeist at redhat.com  Fri Jul  8 15:59:06 2005
From: cfeist at redhat.com (Chris Feist)
Date: Fri, 08 Jul 2005 10:59:06 -0500
Subject: [Linux-cluster] kernel versions supported by -rSTABLE?
In-Reply-To: <1120838231.15658.123.camel@ayanami.boston.redhat.com>
References: <Pine.LNX.4.44.0507072101150.21204-100000@algiers.clic.cs.columbia.edu>	
	<42CE8BBF.8040607@redhat.com>
	<1120838231.15658.123.camel@ayanami.boston.redhat.com>
Message-ID: <42CEA2CA.8060303@redhat.com>

Yes, that's correct.  I was confused about the versions.  -rSTABLE should 
always be following the latest upstream stable kernel on kernel.org.

Thanks,
Chris

Lon Hohberger wrote:
> On Fri, 2005-07-08 at 09:20 -0500, Chris Feist wrote:
> 
>>It should work against that kernel.  Please let us know if it doesn't.
> 
> 
> -STABLE is tracking 2.6.12, I thought?
> 
> -- Lon
> 
> 


From dawson at fnal.gov  Fri Jul  8 17:44:56 2005
From: dawson at fnal.gov (Troy Dawson)
Date: Fri, 08 Jul 2005 12:44:56 -0500
Subject: [Linux-cluster] Simplest 4 node GFS 6.1 cluster
In-Reply-To: <90D99853F9131D4388534489DA4FCBA602CE4083@ausx2kmpc104.aus.amer.dell.com>
References: <90D99853F9131D4388534489DA4FCBA602CE4083@ausx2kmpc104.aus.amer.dell.com>
Message-ID: <42CEBB98.5080909@fnal.gov>


Sorry for the delay, I was on vacation, then catching up from being on 
vacation.

Basically, our Fiber Channel switch is hands off to me.  Not that I 
don't have access to the password, but there are several much more 
important things connected to that switch.  If I were to mess them up, 
or cause them to go down for a bit ... well that wouldn't be good.

It looks like I'm going to be going with the power switch option.  The 
rack my macines are in doesn't currently have controllable power, put 
it's in the works, so I guess I'll just be patient.

Troy

JACOB_LIBERMAN at Dell.com wrote:
> Troy,
> 
> What is the model/vendor of the switch? What are the models/vendors of
> the servers? I ran into a similar problem in our lab (no network power
> switches) and there are a few ways to get around that problem.
> 
> Thanks, jacob 
> 
> 
>>-----Original Message-----
>>From: linux-cluster-bounces at redhat.com 
>>[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Troy Dawson
>>Sent: Thursday, June 23, 2005 8:43 AM
>>To: linux clustering
>>Subject: [Linux-cluster] Simplest 4 node GFS 6.1 cluster
>>
>>Hello,
>>I've really been trying to figure this out from the 
>>documentation, and if it's in there, please point me at it.
>>
>>Here is what I want to do.  I have 4 machines, all connected 
>>to the same SAN disk.  1 machine will be the only machine 
>>that can read and write to the disk.  The other 3 will be 
>>read only.  The read only machines will be the only ones that 
>>have outside services running, such as ftp, nfs, rsync.
>>
>>I don't have a reliable 5th machine.  So I don't have a way 
>>to do an external lock manager.
>>
>>I don't want any failover, but I also don't want a single 
>>point of failure.  Basically if one of the read only machines 
>>goes dead, then it's dead, nothing takes it's place, but the 
>>other machines can go right on working.  If the read, write 
>>machine goes dead, then it's dead, but the read only machines 
>>can go right on doing what they normally do.  And if two or 
>>three machines die, I still want the one to still be able to 
>>at least read the data.
>>
>>This idea of fencing is what's throwing me off.  If I'm 
>>reading things right, I can't do group GFS without them being 
>>in a cluster, and they can't be in a cluster without doing 
>>fencing.  But the fencing seems to just allow the various 
>>machines to take over for one another.
>>
>>I also don't have access to the SAN switch, other than my 
>>machines plug into it.  It's essentially a black box.  These 
>>machines also don't have any way to remotely turn power on an off.
>>
>>Is GFS what I really want?  I've tried just standard ext3, 
>>but I was getting a caching problem with my read only 
>>machines.  Do I just want to try and fix my caching problem?
>>
>>Troy Dawson
>>--

-- 
__________________________________________________
Troy Dawson  dawson at fnal.gov  (630)840-6468
Fermilab  ComputingDivision/CSS  CSI Group
__________________________________________________


From petersp at alleft.com  Fri Jul  8 18:35:33 2005
From: petersp at alleft.com (Poul Petersen)
Date: Fri, 8 Jul 2005 11:35:33 -0700
Subject: [Linux-cluster] GFS panic: sm_membership.c
Message-ID: <20050708183533.GA2300@odin.alleft.com>

	I just recently started playing around with GFS and I'm trying
to get it working using AoE/Vblade to share a device. I originally tried
the GFS RPMs that came with FC4, but lock_dlm had a bunch of missing
symbols, so I reverted to using the cluster package from sources.redhat.com
Here is the setup:
	
	Two nodes:

gandolf: 192.168.1.16	(Yeah, I know it's spelled wrong)
	 Fedora Core 3
	 Kernel: 2.6.12.2
	 cluster-1.00.00
	 aoe-tools 4

jupiter: 192.168.1.20
	 Fedora Core 4
	 Kernel: 2.6.12-1.1387_FC4smp
	 cluster-1.00.00
	 vblade-5
	 (5) 250GB SATA HD in Software RAID5 (/dev/md0)
	 /dev/vg1/media: 500GB LV in vg1 (in /dev/md0)

cluster.conf:

<?xml version="1.0"?>
<cluster name="mythtv" config_version="3">

<cman two_node="1" expected_votes="1">
</cman>

<clusternodes>
<clusternode name="jupiter">
        <fence>
                <method name="single">
                        <device name="human" ipaddr="192.168.1.20"/>
                </method>
        </fence>
</clusternode>

<clusternode name="gandolf">
        <fence>
                <method name="single">
                        <device name="human" ipaddr="192.168.1.16"/>
                </method>
        </fence>
</clusternode>
</clusternodes>

<fencedevices>
        <fencedevice name="human" agent="fence_manual"/>
</fencedevices>

</cluster>

# Start the cluster services

[root at jupiter ~]# modprobe gfs
[root at jupiter ~]# modprobe lock_dlm
[root at jupiter ~]# ccsd
[root at jupiter ~]# cman_tool -w join
[root at jupiter ~]# fence_tool -w join

[root at gandolf ~]# modprobe gfs
[root at gandolf ~]# modprobe lock_dlm
[root at gandolf ~]# ccsd
[root at gandolf ~]# cman_tool -w join
[root at gandolf ~]# fence_tool -w join

# Create the filesystem and export it with AoE

[root at jupiter ~]# gfs_mkfs -p lock_dlm -t mythtv:media -j 2 /dev/vg1/media 
[root at jupiter ~]# /usr/local/build/vblade-5/vblade 0 0 eth0 /dev/vg1/media &

# Verify the device is available and do a test mount, then unmount

[root at gandolf ~]# modprobe aoe
[root at gandolf ~]# aoe-stat
    e0.0            eth0              up
[root at gandolf ~]# mount -t gfs /dev/etherd/e0.0 /san/media/
[root at gandolf ~]# df -k /san/media
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/etherd/e0.0     523969792       212 523969580   1% /san/media
[root at gandolf ~]# umount /san/media

# Test mount from the other node, this time leave it mounted

[root at jupiter ~]# mount -t gfs /dev/vg1/media /san/media
[root at jupiter ~]# df -k /san/media
Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/mapper/vg1-media
                     523969792       212 523969580   1% /san/media

# Now try mounting on *both* nodes at the same time
[root at gandolf ~]# mount -t gfs /dev/etherd/e0.0 /san/media/

(from gandolf dmesg:)
GFS: Trying to join cluster "lock_dlm", "mythtv:media"
CMAN: removing node jupiter from the cluster : Missed too many heartbeats
dlm: media: dlm_dir_rebuild_local failed -1

	A this point, the mount command hangs and the other node
(jupiter in this case) panics with a message about an assertion
in line 106 of sm_membership.c. Whichever node mounts the 
filesystem second, panics the first. So close... Any thing
obvious that I am doing wrong? 

Many Thanks

-poul


From eric at bootseg.com  Fri Jul  8 20:18:38 2005
From: eric at bootseg.com (Eric Kerin)
Date: Fri, 08 Jul 2005 16:18:38 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1120838201.15658.121.camel@ayanami.boston.redhat.com>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<42CE31D8.4090206@uib.no>
	<1120838201.15658.121.camel@ayanami.boston.redhat.com>
Message-ID: <1120853918.5524.15.camel@auh5-0478>

On Fri, 2005-07-08 at 11:56 -0400, Lon Hohberger wrote:
> > Here is a snippet of my config file showing the resources and the nfs 
> > service. Of the listed exports, the first one for users01 and all 3 for 
> > iftscratch don't seem to get status checks, and thus never came up again.
> 
> Ok, now we're getting somewhere...  That definitely sounds like a bug.
> Want to take this to bugzilla?
> 

Well, I was able to track it down, it's being caused by the throttle on
the monitor operations for resources.

Basically, any time a shared resource is referenced more than once, it
will not get monitored for the 2nd+ time it's referenced.  This is
because it keeps track of the last time the resource was checked at the
resource level, and if it hasn't been more time than the amount of time
the monitor attribute says is the interval, it doesn't run the monitor
operation on it.

So here's a patch that seems to fix it in my quick testing, but I'm not
sure if it's the best way to fix the bug.  It copies the action list for
the resource to the resource_node when a resource is referenced.  It
then uses that copy of the action list when doing status checks.


Perhaps a better way would be to make a copy of the struct for the
shared resource_t any time it's referenced, rather than just using the
same one for all resource_node_t.  I'm willing to write up this patch if
you think it's a better course of action.

Thanks,
Eric Kerin <eric at bootseg.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: fixresourceactions.patch
Type: text/x-patch
Size: 4243 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050708/97a17689/attachment.bin>

From lhh at redhat.com  Fri Jul  8 20:40:28 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 08 Jul 2005 16:40:28 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1120853918.5524.15.camel@auh5-0478>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<42CE31D8.4090206@uib.no>
	<1120838201.15658.121.camel@ayanami.boston.redhat.com>
	<1120853918.5524.15.camel@auh5-0478>
Message-ID: <1120855228.15658.167.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-08 at 16:18 -0400, Eric Kerin wrote:

> 
> Well, I was able to track it down, it's being caused by the throttle on
> the monitor operations for resources.
> 
> Basically, any time a shared resource is referenced more than once, it
> will not get monitored for the 2nd+ time it's referenced.  This is
> because it keeps track of the last time the resource was checked at the
> resource level, and if it hasn't been more time than the amount of time
> the monitor attribute says is the interval, it doesn't run the monitor
> operation on it.
> 
> So here's a patch that seems to fix it in my quick testing, but I'm not
> sure if it's the best way to fix the bug.  It copies the action list for
> the resource to the resource_node when a resource is referenced.  It
> then uses that copy of the action list when doing status checks.
> 
> 
> Perhaps a better way would be to make a copy of the struct for the
> shared resource_t any time it's referenced, rather than just using the
> same one for all resource_node_t.  I'm willing to write up this patch if
> you think it's a better course of action.

Both work.

For now, I think we should use this, as copying an entire resource_t
structure has the downside of complicating reconfiguration, which is
already rather ... complicated :)

Ideally, we'd just have private "last-check-time" and "last-check-level"
in the resource node structure and not put it in the resource action
structures.  This would require a little more work.

-- Lon


From lhh at redhat.com  Fri Jul  8 21:48:16 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 08 Jul 2005 17:48:16 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1120855228.15658.167.camel@ayanami.boston.redhat.com>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<42CE31D8.4090206@uib.no>
	<1120838201.15658.121.camel@ayanami.boston.redhat.com>
	<1120853918.5524.15.camel@auh5-0478>
	<1120855228.15658.167.camel@ayanami.boston.redhat.com>
Message-ID: <1120859296.15658.169.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-08 at 16:40 -0400, Lon Hohberger wrote:
> > Perhaps a better way would be to make a copy of the struct for the
> > shared resource_t any time it's referenced, rather than just using the
> > same one for all resource_node_t.  I'm willing to write up this patch if
> > you think it's a better course of action.
> 
> Both work.
> 
> For now, I think we should use this, as copying an entire resource_t
> structure has the downside of complicating reconfiguration, which is
> already rather ... complicated :)
> 
> Ideally, we'd just have private "last-check-time" and "last-check-level"
> in the resource node structure and not put it in the resource action
> structures.  This would require a little more work.

Birger -- can you apply that patch and let me know if it works for you?

If so, can I convince one of you to file a bugzilla against rgmanager
for tracking? :)

-- Lon


From petersp at alleft.com  Sun Jul 10 20:47:34 2005
From: petersp at alleft.com (Poul Petersen)
Date: Sun, 10 Jul 2005 13:47:34 -0700
Subject: [Linux-cluster] GFS panic: sm_membership.c
In-Reply-To: <20050708183533.GA2300@odin.alleft.com>
References: <20050708183533.GA2300@odin.alleft.com>
Message-ID: <20050710204734.GA12148@odin.alleft.com>

	Hmmm. I dug thru the mailing list and saw a few references to *NOT*
using software RAID with GFS. So, I tried just using a physical disk and
switched to using GNBD instead of AoE. I also tried moving up to the
CVS version of the cluster software. Unfortunately, I get the exact
same kernel panic, same error, everything. I'm really not sure where 
I am going wrong, but I basically can not get GFS to work at all :(

-poul


From thomsonr at ucalgary.ca  Sun Jul 10 21:41:50 2005
From: thomsonr at ucalgary.ca (Ryan Thomson)
Date: Sun, 10 Jul 2005 15:41:50 -0600 (MDT)
Subject: [Linux-cluster] GFS for RHEL4U1 (2.6.9-11)?
Message-ID: <32922.68.146.207.245.1121031710.squirrel@68.146.207.245>

> Hi,
>
> are there any GFS SRPM's available for RHEL4U1 (2.6.9-11)?
>
> The cluster webpage http://sources.redhat.com/cluster/ says, that there is
> a RHEL4 CVS, which builds against 2.6.9-5.
>
> Is GFS-6.1 already officially released?
> Are there any plans for making stable SRPM's available,
> like there are for RHEL3?
>
> Thank you very much
>
> Hansj?rg

It seems possible to build the CVS code against RHEL4U1 (2.6.9-11). I
successfully built both the cluster CVS code and the LVM2 CVS code on
RHEL4U1 without issues.

You can also build src RPMS from the CVS code with the command 'make
srpms'. From there you can build binary RPMs for your systems as well.

-- 
Ryan


From pcaulfie at redhat.com  Mon Jul 11 07:05:13 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Mon, 11 Jul 2005 08:05:13 +0100
Subject: [Linux-cluster] GFS panic: sm_membership.c
In-Reply-To: <20050708183533.GA2300@odin.alleft.com>
References: <20050708183533.GA2300@odin.alleft.com>
Message-ID: <42D21A29.6000308@redhat.com>

Poul Petersen wrote:
> 	I just recently started playing around with GFS and I'm trying
> to get it working using AoE/Vblade to share a device. I originally tried
> the GFS RPMs that came with FC4, but lock_dlm had a bunch of missing
> symbols, so I reverted to using the cluster package from sources.redhat.com
> Here is the setup:

The Fedora RPMs work fine for me. It might just be that you're not running a
compatible kernel.

Which "package" did you get from sources.redhat.com? The tarball that's
currently on the ftp site looks ancient. You should either use the srpms or
checkout the STABLE branch from CVS.

-- 

patrick


From zuzhihui at gmail.com  Mon Jul 11 07:26:36 2005
From: zuzhihui at gmail.com (=?GB2312?Q?Zu_Zhihui_=28=D7=E6=D6=BE=BB=D4=29?=)
Date: Mon, 11 Jul 2005 15:26:36 +0800
Subject: [Linux-cluster] problems about gfs on FC4
Message-ID: <64360a0d0507110026372499e3@mail.gmail.com>

 According to the manual of GFS 6.1, GFS 6.1 depends on redhat
cluster.  FC4 includes the GFS 6.1 rpms. but doesnot include the
cluster manager.

Then, how to use GFS 6.1 on FC4 ?


From Axel.Thimm at ATrpms.net  Mon Jul 11 08:31:03 2005
From: Axel.Thimm at ATrpms.net (Axel Thimm)
Date: Mon, 11 Jul 2005 10:31:03 +0200
Subject: [Linux-cluster] Re: problems about gfs on FC4
In-Reply-To: <64360a0d0507110026372499e3@mail.gmail.com>
References: <64360a0d0507110026372499e3@mail.gmail.com>
Message-ID: <20050711083103.GB15471@neu.nirvana>

On Mon, Jul 11, 2005 at 03:26:36PM +0800, Zu Zhihui (???) wrote:
>  According to the manual of GFS 6.1, GFS 6.1 depends on redhat
> cluster.  FC4 includes the GFS 6.1 rpms. but doesnot include the
> cluster manager.
> 
> Then, how to use GFS 6.1 on FC4 ?

The cluster suite is a name for a collection of packages that does
exist in FC4, e.g. cman, ccs, fence, cluster-lvm2.

I'm running an FC4 GFS cluster for quite some time currently migrating
to RHEL4 (I need to repackage device-mapper-multipath and some other
smallish items).
-- 
Axel.Thimm at ATrpms.net
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050711/b4766256/attachment.sig>

From zuzhihui at gmail.com  Mon Jul 11 08:39:43 2005
From: zuzhihui at gmail.com (=?GB2312?Q?Zu_Zhihui_=28=D7=E6=D6=BE=BB=D4=29?=)
Date: Mon, 11 Jul 2005 16:39:43 +0800
Subject: [Linux-cluster] Re: problems about gfs on FC4
In-Reply-To: <20050711083103.GB15471@neu.nirvana>
References: <64360a0d0507110026372499e3@mail.gmail.com>
	<20050711083103.GB15471@neu.nirvana>
Message-ID: <64360a0d05071101396e30dc3a@mail.gmail.com>

Thanks! I find it. cman, ccs, fence, lvm2-cluster( not cluster-lvm2 :-)

2005/7/11, Axel Thimm <Axel.Thimm at atrpms.net>:
> On Mon, Jul 11, 2005 at 03:26:36PM +0800, Zu Zhihui (???) wrote:
> >  According to the manual of GFS 6.1, GFS 6.1 depends on redhat
> > cluster.  FC4 includes the GFS 6.1 rpms. but doesnot include the
> > cluster manager.
> >
> > Then, how to use GFS 6.1 on FC4 ?
> 
> The cluster suite is a name for a collection of packages that does
> exist in FC4, e.g. cman, ccs, fence, cluster-lvm2.
> 
> I'm running an FC4 GFS cluster for quite some time currently migrating
> to RHEL4 (I need to repackage device-mapper-multipath and some other
> smallish items).
> --
> Axel.Thimm at ATrpms.net
> 
> 
>


From pbruna at linuxcenterla.com  Mon Jul 11 17:33:05 2005
From: pbruna at linuxcenterla.com (Patricio Bruna V)
Date: Mon, 11 Jul 2005 13:33:05 -0400
Subject: [Linux-cluster] GFS & DRBD
Message-ID: <200507111333.05984.pbruna@linuxcenterla.com>

Its posibble to use GFS over DRBD?
-- 
Patricio Bruna V.
RHCE/RHCI
Jefe Soporte y Operaciones LinuxCenter S.A.
Mariano Sanchez Fontecilla 310
Las Condes, Chile
Fono:(+56-2) 4834000/4834042
http://www.linuxcenterla.com


From alewis at redhat.com  Mon Jul 11 18:21:14 2005
From: alewis at redhat.com (AJ Lewis)
Date: Mon, 11 Jul 2005 13:21:14 -0500
Subject: [Linux-cluster] GFS & DRBD
In-Reply-To: <200507111333.05984.pbruna@linuxcenterla.com>
References: <200507111333.05984.pbruna@linuxcenterla.com>
Message-ID: <20050711182114.GB23156@null.msp.redhat.com>

On Mon, Jul 11, 2005 at 01:33:05PM -0400, Patricio Bruna V wrote:
> Its posibble to use GFS over DRBD?

DRDB is active/passive, and GFS needs to be able to access the fs from both
nodes simultaneously, so no.   There has been some discussion about making
DRDB active/active, in which case you could probably run GFS over it (on 2
nodes) but AFAIK, it's not done yet, and if it is done, nobody's tested it.

Regards,
-- 
AJ Lewis                                   Voice:  612-638-0500
Red Hat                                    E-Mail: alewis at redhat.com
One Main Street SE, Suite 209
Minneapolis, MN 55414
   
Current GPG fingerprint = D9F8 EDCE 4242 855F A03D  9B63 F50C 54A8 578C 8715
Grab the key at: http://people.redhat.com/alewis/gpg.html or one of the
many keyservers out there...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050711/6e8a520c/attachment.sig>

From phung at cs.columbia.edu  Mon Jul 11 18:23:31 2005
From: phung at cs.columbia.edu (Dan B. Phung)
Date: Mon, 11 Jul 2005 14:23:31 -0400 (EDT)
Subject: [Linux-cluster] kernel versions supported by -rSTABLE?
In-Reply-To: <42CE8BBF.8040607@redhat.com>
Message-ID: <Pine.LNX.4.44.0507111421140.12582-100000@algiers.clic.cs.columbia.edu>

I have errors building with kernels for 2.6.11 and 2.6.8.1.  How would I
check out the versions that are the latest stable for those specific
kernel versions?

(I can send the errors if you'd like, but I think they're generally
recreatable when trying to compile against vanilla versions of these
kernels.)

-dan

On 8, Jul, 2005, Chris Feist declared:

> It should work against that kernel.  Please let us know if it doesn't.
> 
> Thanks,
> Chris
> 
> Dan B. Phung wrote:
> > to be more specific, will it work for 2.6.11?
> > 
> > 
> > On 7, Jul, 2005, Dan B. Phung declared:
> > 
> > 
> >>Just wondering, which kernel versions are known to 
> >>work with gfs?
> >>
> >>-dan
> >>
> >>--
> >>Linux-cluster mailing list
> >>Linux-cluster at redhat.com
> >>http://www.redhat.com/mailman/listinfo/linux-cluster
> >>
> > 
> > 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 

-- 
email:  phung at cs.columbia.edu
www:    http://www.cs.columbia.edu/~phung
phone:  646-775-6090
office: CS Dept. 520, 1214 Amsterdam Ave., MC 0401, New York, NY 10027


From cfeist at redhat.com  Mon Jul 11 18:26:12 2005
From: cfeist at redhat.com (Chris Feist)
Date: Mon, 11 Jul 2005 13:26:12 -0500
Subject: [Linux-cluster] kernel versions supported by -rSTABLE?
In-Reply-To: <Pine.LNX.4.44.0507111421140.12582-100000@algiers.clic.cs.columbia.edu>
References: <Pine.LNX.4.44.0507111421140.12582-100000@algiers.clic.cs.columbia.edu>
Message-ID: <42D2B9C4.8020406@redhat.com>

Dan B. Phung wrote:
> I have errors building with kernels for 2.6.11 and 2.6.8.1.  How would I
> check out the versions that are the latest stable for those specific
> kernel versions?
> 
> (I can send the errors if you'd like, but I think they're generally
> recreatable when trying to compile against vanilla versions of these
> kernels.)
> 
> -dan

Try 2.6.12 and let me know how it works.

Thanks,
Chris


From adam at popik.pl  Mon Jul 11 18:53:54 2005
From: adam at popik.pl (Adam Popik)
Date: Mon, 11 Jul 2005 20:53:54 +0200
Subject: [Linux-cluster] GFS for RHEL4 
Message-ID: <42D2C042.9020306@popik.pl>

Will are SRPMS available for RHEL4 (like for RHEL3) ?

PS. Sorry for broken English.
Adam

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 4115 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050711/02fdd50e/attachment.bin>

From alewis at redhat.com  Mon Jul 11 19:00:29 2005
From: alewis at redhat.com (AJ Lewis)
Date: Mon, 11 Jul 2005 14:00:29 -0500
Subject: [Linux-cluster] HELP! Diapered block device
In-Reply-To: <42CD55AA.2040004@possibilityforge.com>
References: <42CD3D2A.2050108@possibilityforge.com>
	<200507071002.01130.Jason@selu.edu>
	<42CD475D.3060609@possibilityforge.com>
	<20050707153435.GD15005@null.msp.redhat.com>
	<42CD4D01.20803@possibilityforge.com>
	<42CD55AA.2040004@possibilityforge.com>
Message-ID: <20050711190029.GD23156@null.msp.redhat.com>

Just for anyone following this, pls refer to
https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=162709 for more
information and updates.

On Thu, Jul 07, 2005 at 10:17:46AM -0600, Jon Scottorn wrote:
> Ok, so I am rerunning gfs_fsck again.  I have everything unmounted and
> the storage server not even in the cluster.  It has been running now for
> 30 mins again and it gets to this point and just doesn't look like it is
> doing anything.  This is where is stayed after running it for 24 hours
> yesterday:  Here is what the verbose output from gfs_fsck:
> 
> Initializing fsck
> Initializing lists...
> Initializing special inodes...
> Setting block ranges...
> Creating a block list of size 183146926...
> Clearing journals (this may take a while)
> Clearing journal 0
> Clearing journal 1
> Clearing journal 2
> Clearing journal 3
> Cleared journals
> Starting pass1
> Checking metadata in Resource Group 0
> Checking metadata in Resource Group 1
> Checking metadata in Resource Group 2
> Checking metadata in Resource Group 3
> 
> ........Omitted lines for space.........................
> 
> Checking metadata in Resource Group 2790
> Checking metadata in Resource Group 2791
> Checking metadata in Resource Group 2792
> Checking metadata in Resource Group 2793
> Pass1 complete
> Starting pass1b
> Looking for duplicate blocks...
> Found dup block at 61573000
> Found dup block at 61573014
> Found dup block at 61573015
> Found dup block at 61573016
> Found dup block at 61573017
> Found dup block at 61573018
> Found dup block at 61573019
> Found dup block at 61573020
> Found dup block at 61573021
> Found dup block at 61573022
> Found dup block at 61573024
> Found dup block at 61573047
> Found dup block at 61573048
> Found dup block at 61573052
> Found dup block at 61623032
> Found dup block at 61623033
> Found dup block at 61623034
> Found dup block at 61623035
> Scanning filesystem for inodes containing duplicate blocks...
> 
> Once it gets to this point it just sits there.  gfs_fsck is using 99% of
> the CPU for the whole time it runs.  What else can I do to get this fixed?
> 
> Thanks,
> 
> Jon
> 
> Jon Scottorn wrote:
> 
> >When I ran the fsck, i had everything unmounted as well as the gnbd serv
> >stopped.  I let it run for almost 24 hours and it was still running. 
> >That seems a little long for me.  Should I let it run again and see what
> >happens.  My main problem is I can't have the FS down for that long. 
> >
> >Thanks,
> >
> >Jon
> >
> >AJ Lewis wrote:
> >
> >  
> >
> >>On Thu, Jul 07, 2005 at 09:16:45AM -0600, Jon Scottorn wrote:
> >> 
> >>
> >>    
> >>
> >>>Thanks,
> >>>
> >>>   That made it so I can mount it from the other nodes, but now I can't
> >>>mount it on the storage server.
> >>>   
> >>>
> >>>      
> >>>
> >>Gah!  Is the fsck still running?  You *CANNOT* run the fsck while other nodes
> >>have the fs mounted.  The fsck changes the lock protocol to prevent others
> >>    
> >>
> >>from mounting after the fsck starts.  It will be changed back after
> >  
> >
> >>completion.
> >>
> >>The fsck can take a while in the duplicate block code - could you tell if it
> >>was still accessing storage?  If you have lots of inodes in the system, it's
> >>gonna take a while to work through them in the dup block handling code.
> >>
> >>Regards,
> >> 
> >>
> >>------------------------------------------------------------------------
> >>
> >>--
> >>Linux-cluster mailing list
> >>Linux-cluster at redhat.com
> >>http://www.redhat.com/mailman/listinfo/linux-cluster
> >>
> >>    
> >>
> >
> >--
> >Linux-cluster mailing list
> >Linux-cluster at redhat.com
> >http://www.redhat.com/mailman/listinfo/linux-cluster
> >
> >  
> >
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 

-- 
AJ Lewis                                   Voice:  612-638-0500
Red Hat                                    E-Mail: alewis at redhat.com
One Main Street SE, Suite 209
Minneapolis, MN 55414
   
Current GPG fingerprint = D9F8 EDCE 4242 855F A03D  9B63 F50C 54A8 578C 8715
Grab the key at: http://people.redhat.com/alewis/gpg.html or one of the
many keyservers out there...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050711/811d3bca/attachment.sig>

From Birger.Wathne at ift.uib.no  Mon Jul 11 20:13:27 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Mon, 11 Jul 2005 22:13:27 +0200
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1120853918.5524.15.camel@auh5-0478>
References: <42CBAF82.8050900@uib.no>
	<1120685352.5704.9.camel@auh5-0478>	<42CD0740.8040102@uib.no>
	<1120749723.5534.7.camel@auh5-0478>	<42CE31D8.4090206@uib.no>	<1120838201.15658.121.camel@ayanami.boston.redhat.com>
	<1120853918.5524.15.camel@auh5-0478>
Message-ID: <42D2D2E7.8000906@uib.no>

Sorry...

I have just fetched latest FC4 branch from cvs, but I am unable to apply the 
patch. It keeps complaining it can't find the file to patch, and if I give 
the file name manually it rejects the patch.

Could you give me a hint on how to apply it?

-- 
birger


From eric at bootseg.com  Mon Jul 11 20:44:51 2005
From: eric at bootseg.com (Eric Kerin)
Date: Mon, 11 Jul 2005 16:44:51 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42D2D2E7.8000906@uib.no>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<42CE31D8.4090206@uib.no>
	<1120838201.15658.121.camel@ayanami.boston.redhat.com>
	<1120853918.5524.15.camel@auh5-0478>  <42D2D2E7.8000906@uib.no>
Message-ID: <1121114691.5632.2.camel@auh5-0478>

On Mon, 2005-07-11 at 22:13 +0200, Birger Wathne wrote:
> Sorry...
> 
> I have just fetched latest FC4 branch from cvs, but I am unable to apply the 
> patch. It keeps complaining it can't find the file to patch, and if I give 
> the file name manually it rejects the patch.
> 
> Could you give me a hint on how to apply it?
> 

>From inside the cluster directory you checked out (the one with ccs,
cman, cman-kernel, etc) run: 
patch -p0 <filenameofpatch.patch

Thanks,
Eric Kerin <eric at bootseg.com>


From Birger.Wathne at ift.uib.no  Mon Jul 11 20:50:41 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Mon, 11 Jul 2005 22:50:41 +0200
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1121114691.5632.2.camel@auh5-0478>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>	
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>	
	<42CE31D8.4090206@uib.no>	
	<1120838201.15658.121.camel@ayanami.boston.redhat.com>	
	<1120853918.5524.15.camel@auh5-0478> <42D2D2E7.8000906@uib.no>
	<1121114691.5632.2.camel@auh5-0478>
Message-ID: <42D2DBA1.2090909@uib.no>

Eric Kerin wrote:
> 
> 
>>From inside the cluster directory you checked out (the one with ccs,
> cman, cman-kernel, etc) run: 
> patch -p0 <filenameofpatch.patch
> 
> Thanks,
> Eric Kerin <eric at bootseg.com>

# patch -p0 < ../rhcluster-fixresourceactions.patch
(Stripping trailing CRs from patch.)
patching file rgmanager/include/reslist.h
Hunk #1 FAILED at 126.
1 out of 1 hunk FAILED -- saving rejects to file rgmanager/include/reslist.h.rej
(Stripping trailing CRs from patch.)
patching file rgmanager/src/daemons/restree.c
Hunk #1 succeeded at 36 with fuzz 2 (offset -3 lines).
Hunk #2 FAILED at 488.
Hunk #3 FAILED at 589.
Hunk #4 FAILED at 747.
Hunk #5 FAILED at 776.
Hunk #6 FAILED at 801.
5 out of 6 hunks FAILED -- saving rejects to file 
rgmanager/src/daemons/restree.c.rej


From eric at bootseg.com  Mon Jul 11 21:06:15 2005
From: eric at bootseg.com (Eric Kerin)
Date: Mon, 11 Jul 2005 17:06:15 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42D2DBA1.2090909@uib.no>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<42CE31D8.4090206@uib.no>
	<1120838201.15658.121.camel@ayanami.boston.redhat.com>
	<1120853918.5524.15.camel@auh5-0478>  <42D2D2E7.8000906@uib.no>
	<1121114691.5632.2.camel@auh5-0478>  <42D2DBA1.2090909@uib.no>
Message-ID: <1121115975.5632.4.camel@auh5-0478>

On Mon, 2005-07-11 at 22:50 +0200, Birger Wathne wrote:
> Eric Kerin wrote:
> > 
> > 
> >>From inside the cluster directory you checked out (the one with ccs,
> > cman, cman-kernel, etc) run: 
> > patch -p0 <filenameofpatch.patch
> > 
> > Thanks,
> > Eric Kerin <eric at bootseg.com>
> 
> # patch -p0 < ../rhcluster-fixresourceactions.patch
> (Stripping trailing CRs from patch.)
> patching file rgmanager/include/reslist.h
> Hunk #1 FAILED at 126.
> 1 out of 1 hunk FAILED -- saving rejects to file rgmanager/include/reslist.h.rej
> (Stripping trailing CRs from patch.)
> patching file rgmanager/src/daemons/restree.c
> Hunk #1 succeeded at 36 with fuzz 2 (offset -3 lines).
> Hunk #2 FAILED at 488.
> Hunk #3 FAILED at 589.
> Hunk #4 FAILED at 747.
> Hunk #5 FAILED at 776.
> Hunk #6 FAILED at 801.
> 5 out of 6 hunks FAILED -- saving rejects to file 
> rgmanager/src/daemons/restree.c.rej

Ahh, looks like some whitespace snuck in that patch along the way.
use: patch -l -p0 < filenameofpatch.patch

Sorry about that.

thanks,
Eric Kerin <eric at bootseg.com>


From Thomas.Schwanhaeuser at aps-web.de  Mon Jul 11 23:10:00 2005
From: Thomas.Schwanhaeuser at aps-web.de (=?iso-8859-1?Q?Thomas_Schwanh=E4user?=)
Date: Tue, 12 Jul 2005 01:10:00 +0200
Subject: [Linux-cluster] GFS usage for simultanous access / shared-nothing
Message-ID: <173E658EF600C94396770F46C560BB0DE7EFA5@dmc-srv1.aps.de>

Hi all,

 
I wonder if it's possible that two servers maintain an exact copy of a file system in a "shared-nothing-architecture". Unfortunately, I haven't seen any hint on that although having already read through Redhat's documentation on GFS 6.1.

 
Can anyone confirm that or mention an URL with the corresponding parameters?

 
The set-up I plan to do is with two machines, each having about 1TB of storage in a RAID5-setup. On both of the machines some services are running and of course they should have access to the same data at any time. Access is done on a file-by-level and it's unlikely that both machines want to access the same file at any given moment (if so, one should wait until the lock is released). If one of the machines crashes, operation on the 2nd one shall continue and upon rebooting the failed one should synchronize its discs and restart the services after successful rebuild.

 
Thanks,

 
Thomas

----------------------------------------------------------------------

Thomas Schwanh?user

 
APS web.services GmbH, Mittelstr. 8, D-73733 Esslingen

Telefon: 07 11 / 327 78 - 950, Fax: -991

eMail: tschwanhaeuser at aps-web.de

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050712/af27a5ac/attachment.htm>

From Thomas.Schwanhaeuser at aps-web.de  Mon Jul 11 23:12:10 2005
From: Thomas.Schwanhaeuser at aps-web.de (=?iso-8859-1?Q?Thomas_Schwanh=E4user?=)
Date: Tue, 12 Jul 2005 01:12:10 +0200
Subject: [Linux-cluster] GNBD vs. file level performance
Message-ID: <173E658EF600C94396770F46C560BB0DE7EFA6@dmc-srv1.aps.de>

Hi,

 
the second thing I currently wonder about is  the performance of the GNBD driver. Has anyone performed some benchmarks comparing block-level access to file-level access? 

 
Many thanks in advance for any feedback,

 
Thomas

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050712/2f73286d/attachment.htm>

From Axel.Thimm at ATrpms.net  Tue Jul 12 00:17:17 2005
From: Axel.Thimm at ATrpms.net (Axel Thimm)
Date: Tue, 12 Jul 2005 02:17:17 +0200
Subject: [Linux-cluster] Re: GFS usage for simultanous access /
	shared-nothing
In-Reply-To: <173E658EF600C94396770F46C560BB0DE7EFA5@dmc-srv1.aps.de>
References: <173E658EF600C94396770F46C560BB0DE7EFA5@dmc-srv1.aps.de>
Message-ID: <20050712001717.GA13455@neu.nirvana>

On Tue, Jul 12, 2005 at 01:10:00AM +0200, Thomas Schwanh?user wrote:
> I wonder if it's possible that two servers maintain an exact copy of
> a file system in a "shared-nothing-architecture". Unfortunately, I
> haven't seen any hint on that although having already read through
> Redhat's documentation on GFS 6.1.

GFS requires a shared block device.

> Can anyone confirm that or mention an URL with the corresponding parameters?
> 
> The set-up I plan to do is with two machines, each having about 1TB
> of storage in a RAID5-setup. On both of the machines some services
> are running and of course they should have access to the same data
> at any time. Access is done on a file-by-level and it's unlikely
> that both machines want to access the same file at any given moment
> (if so, one should wait until the lock is released). If one of the
> machines crashes, operation on the 2nd one shall continue and upon
> rebooting the failed one should synchronize its discs and restart
> the services after successful rebuild.

Perhaps you are looking for something like
http://sourceware.org/cluster/ddraid/
-- 
Axel.Thimm at ATrpms.net
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050712/ef34313a/attachment.sig>

From eric at bootseg.com  Tue Jul 12 05:02:07 2005
From: eric at bootseg.com (Eric Kerin)
Date: Tue, 12 Jul 2005 01:02:07 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42D31844.4010502@uib.no>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<42CE31D8.4090206@uib.no>
	<1120838201.15658.121.camel@ayanami.boston.redhat.com>
	<1120853918.5524.15.camel@auh5-0478>  <42D2D2E7.8000906@uib.no>
	<1121114691.5632.2.camel@auh5-0478>  <42D2DBA1.2090909@uib.no>
	<1121115975.5632.4.camel@auh5-0478>  <42D2EE5E.5010904@uib.no>
	<1121120744.5632.10.camel@auh5-0478>  <42D31844.4010502@uib.no>
Message-ID: <1121144527.5275.5.camel@auh5-0478>

On Tue, 2005-07-12 at 03:09 +0200, Birger Wathne wrote:
> I did a make install and service rgmanager stop/start.
> 
> Now, status checks seem to work. After exportfs -r all file systems 
> eventually reappear.
> 

Good to hear that fixed the monitoring for you.

How about the exports being lost in the first place, did you get that
straightened out?  If not, can you send that script you are using for
restarting the nfs lock daemon?

Thanks,
Eric Kerin <eric at bootseg.com>


From Birger.Wathne at ift.uib.no  Tue Jul 12 08:14:11 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Tue, 12 Jul 2005 10:14:11 +0200
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1121144527.5275.5.camel@auh5-0478>
References: <42CBAF82.8050900@uib.no>
	<1120685352.5704.9.camel@auh5-0478>	<42CD0740.8040102@uib.no>
	<1120749723.5534.7.camel@auh5-0478>	<42CE31D8.4090206@uib.no>	<1120838201.15658.121.camel@ayanami.boston.redhat.com>	<1120853918.5524.15.camel@auh5-0478>
	<42D2D2E7.8000906@uib.no>	<1121114691.5632.2.camel@auh5-0478>
	<42D2DBA1.2090909@uib.no>	<1121115975.5632.4.camel@auh5-0478>
	<42D2EE5E.5010904@uib.no>	<1121120744.5632.10.camel@auh5-0478>
	<42D31844.4010502@uib.no> <1121144527.5275.5.camel@auh5-0478>
Message-ID: <42D37BD3.9080300@uib.no>

Eric Kerin wrote:

> On Tue, 2005-07-12 at 03:09 +0200, Birger Wathne wrote:
> 
>>I did a make install and service rgmanager stop/start.
>>
>>Now, status checks seem to work. After exportfs -r all file systems 
>>eventually reappear.
>>
> 
> 
> Good to hear that fixed the monitoring for you.
> 

Do you want the bad news as well?

The > 14 character long export path bug is still there in the FC4 branch.
If the exported path is > 14 characters it will always fail status check and 
get reexported. This is because exportfs breaks lines after the exported 
path if it is long. I think exportfs shouldn't do this, but I guess the 
cluster software will have to work around it.

The line that fails is:
exportfs | grep -q "^${OCF_RESKEY_path}\ .*${OCF_RESKEY_target}"

A proper fix would be a filter to remove the offending line breaks. My quick 
and dirty (and far less portable) fix is to use this instead:

grep -q "^${OCF_RESKEY_path}[   ]*${OCF_RESKEY_target}(" /var/lib/nfs/etab

Note that my code is a little bit more restrictive. It makes shure the 
target matches exactly, while the existing code will match if 
${OFC_RESKEY_target} is a prefix, won't it? In other words, if you look for 
an unqualified target and there is an export to a fully qualified host with 
the same hostname (but possibly in a different domain... hard to tell) the 
existing code will match. I guess there should be a '$' at the end of the 
regexp.

Should I bugzilla this, or will you take it? I don't remember my user, but I 
think I created one. Perhaps it's simply birger. :-)

-- 
birger


From robert at deakin.edu.au  Tue Jul 12 08:27:20 2005
From: robert at deakin.edu.au (Robert Ruge)
Date: Tue, 12 Jul 2005 18:27:20 +1000
Subject: [Linux-cluster] GFS Implementation Question
Message-ID: <200507120827.j6C8RMuR007100@deakin.edu.au>

I have two questions about implementing a GFS cluster for NFS and
samba failover:

1. Should one mount the GFS filesystem through the cluster software or
through fstab. It would appear to me that you could mount the GFS
filesystem through fstab and just have the NFS and samba services
through the cluster manager so that one could move the service but
still have access to the underlying filesystem. Pros and Cons?

2. Does a GFS filesystem have to be created through LVM or will it all
work using normal disk devices accessed through EMC powerpath
software?

PS EMC has just released Powerpath 4.4 for RHEL4 and Suse EL.

Thanks.

Robert Ruge   School of Information Technology, Deakin University 


From Birger.Wathne at ift.uib.no  Tue Jul 12 09:12:46 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Tue, 12 Jul 2005 11:12:46 +0200
Subject: [Linux-cluster] GFS Implementation Question
In-Reply-To: <200507120827.j6C8RMuR007100@deakin.edu.au>
References: <200507120827.j6C8RMuR007100@deakin.edu.au>
Message-ID: <42D3898E.3040807@uib.no>

Robert Ruge wrote:

> I have two questions about implementing a GFS cluster for NFS and
> samba failover:
> 
> 1. Should one mount the GFS filesystem through the cluster software or
> through fstab. It would appear to me that you could mount the GFS
> filesystem through fstab and just have the NFS and samba services
> through the cluster manager so that one could move the service but
> still have access to the underlying filesystem. Pros and Cons?

The cluster software gives gfs file systems a special treatment. They don't 
get umounted when a service moves away. So when you migrate a service you 
still have the file systems even if you let the cluster sw handle the mounting.

-- 
birger


From npf at eurotux.com  Tue Jul 12 11:14:11 2005
From: npf at eurotux.com (Nuno Pais Fernandes)
Date: Tue, 12 Jul 2005 12:14:11 +0100
Subject: [Linux-cluster] GFS & DRBD
In-Reply-To: <20050711182114.GB23156@null.msp.redhat.com>
References: <200507111333.05984.pbruna@linuxcenterla.com>
	<20050711182114.GB23156@null.msp.redhat.com>
Message-ID: <200507121214.12659.npf@eurotux.com>

Drbd 0.8 will be active-active.

Nuno Fernandes

On Monday 11 July 2005 19:21, AJ Lewis wrote:
> On Mon, Jul 11, 2005 at 01:33:05PM -0400, Patricio Bruna V wrote:
> > Its posibble to use GFS over DRBD?
>
> DRDB is active/passive, and GFS needs to be able to access the fs from both
> nodes simultaneously, so no.   There has been some discussion about making
> DRDB active/active, in which case you could probably run GFS over it (on 2
> nodes) but AFAIK, it's not done yet, and if it is done, nobody's tested it.
>
> Regards,

-- 
--------------------------
Nuno Miguel Pais Fernandes
<npf at eurotux.com>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050712/2aa359d6/attachment.sig>

From eric at bootseg.com  Tue Jul 12 12:53:32 2005
From: eric at bootseg.com (Eric Kerin)
Date: Tue, 12 Jul 2005 08:53:32 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42D37BD3.9080300@uib.no>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<42CE31D8.4090206@uib.no>
	<1120838201.15658.121.camel@ayanami.boston.redhat.com>
	<1120853918.5524.15.camel@auh5-0478>  <42D2D2E7.8000906@uib.no>
	<1121114691.5632.2.camel@auh5-0478>  <42D2DBA1.2090909@uib.no>
	<1121115975.5632.4.camel@auh5-0478>  <42D2EE5E.5010904@uib.no>
	<1121120744.5632.10.camel@auh5-0478>  <42D31844.4010502@uib.no>
	<1121144527.5275.5.camel@auh5-0478>  <42D37BD3.9080300@uib.no>
Message-ID: <1121172812.7227.6.camel@auh5-0478>

On Tue, 2005-07-12 at 10:14 +0200, Birger Wathne wrote:
> The > 14 character long export path bug is still there in the FC4 branch.
> If the exported path is > 14 characters it will always fail status check and 
> get reexported. This is because exportfs breaks lines after the exported 
> path if it is long. I think exportfs shouldn't do this, but I guess the 
> cluster software will have to work around it.
> 
The fix seems to already be in the RHEL4, STABLE, and HEAD branches.

Is FC4 even a branch that is keep up to date?  You should probably be
using either STABLE or RHEL4 for Fedora Core 4.

Thanks, 
Eric


From alewis at redhat.com  Tue Jul 12 13:39:22 2005
From: alewis at redhat.com (AJ Lewis)
Date: Tue, 12 Jul 2005 08:39:22 -0500
Subject: [Linux-cluster] GFS & DRBD
In-Reply-To: <200507121214.12659.npf@eurotux.com>
References: <200507111333.05984.pbruna@linuxcenterla.com>
	<20050711182114.GB23156@null.msp.redhat.com>
	<200507121214.12659.npf@eurotux.com>
Message-ID: <20050712133922.GF23156@null.msp.redhat.com>

On Tue, Jul 12, 2005 at 12:14:11PM +0100, Nuno Pais Fernandes wrote:
> Drbd 0.8 will be active-active.

Cool!  Then we just need to get people to try it out.  I'm assuming it's still
two-node only, right?
 
> Nuno Fernandes
> 
> On Monday 11 July 2005 19:21, AJ Lewis wrote:
> > On Mon, Jul 11, 2005 at 01:33:05PM -0400, Patricio Bruna V wrote:
> > > Its posibble to use GFS over DRBD?
> >
> > DRDB is active/passive, and GFS needs to be able to access the fs from both
> > nodes simultaneously, so no.   There has been some discussion about making
> > DRDB active/active, in which case you could probably run GFS over it (on 2
> > nodes) but AFAIK, it's not done yet, and if it is done, nobody's tested it.
> >
> > Regards,
> 
> -- 
> --------------------------
> Nuno Miguel Pais Fernandes
> <npf at eurotux.com>


> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster

-- 
AJ Lewis                                   Voice:  612-638-0500
Red Hat                                    E-Mail: alewis at redhat.com
One Main Street SE, Suite 209
Minneapolis, MN 55414
   
Current GPG fingerprint = D9F8 EDCE 4242 855F A03D  9B63 F50C 54A8 578C 8715
Grab the key at: http://people.redhat.com/alewis/gpg.html or one of the
many keyservers out there...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050712/f7f8aa2b/attachment.sig>

From mvz at nimium.hr  Mon Jul 11 16:45:23 2005
From: mvz at nimium.hr (Miroslav Zubcic)
Date: Mon, 11 Jul 2005 18:45:23 +0200
Subject: [Linux-cluster] IP adress missing from interface (sometimes)
Message-ID: <lzslylb8bg.fsf@devana.nimium.local>

Hi People.

I have 4 servers under RH Cluster Suite: 2 Clusters - one with oracle,
and one with two very important java daemons for local telecom.

Sometimes, I can see in log files (remote syslog outside cluster(s)):
--------------------------------------------------------
Jul 11 11:56:40 szgtr01 clusvcmgrd: [17188]: <err> service error: IP address 10.100.1.151 missing  
Jul 11 11:56:40 szgtr01 last message repeated 2 times
Jul 11 11:56:40 szgtr01 clusvcmgrd: [17188]: <err> service error: 0: error fetching interface information: Device not found  
Jul 11 11:56:40 szgtr01 last message repeated 2 times
Jul 11 11:56:40 szgtr01 clusvcmgrd: [17188]: <err> service error: Check status failed on IP addresses for tomcat  
Jul 11 11:56:40 szgtr01 last message repeated 2 times
Jul 11 11:56:40 szgtr01 clusvcmgrd[17187]: <warning> Restarting locally failed service tomcat  
Jul 11 11:56:40 szgtr01 last message repeated 2 times
Jul 11 11:56:40 szgtr01 clusvcmgrd: [17440]: <notice> service notice: Stopping service tomcat ...  
Jul 11 11:56:40 szgtr01 clusvcmgrd: [17440]: <notice> service notice: Stopping service tomcat ...  
Jul 11 11:56:40 szgtr01 clusvcmgrd: [17440]: <notice> service notice: Running user script '/etc/init.d/tomcat stop'  
Jul 11 11:56:40 szgtr01 clusvcmgrd: [17440]: <notice> service notice: Running user script '/etc/init.d/tomcat stop'  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17440]: <info> service info: Stopping IP address 10.100.1.151  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17440]: <info> service info: Stopping IP address 10.100.1.151  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17440]: <notice> service notice: Stopped service tomcat ...  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17440]: <notice> service notice: Stopped service tomcat ...  
Jul 11 11:56:46 szgtr01 clusvcmgrd[17187]: <notice> Starting stopped service tomcat  
Jul 11 11:56:46 szgtr01 clusvcmgrd[17187]: <notice> Starting stopped service tomcat  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17716]: <notice> service notice: Starting service tomcat ...  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17716]: <notice> service notice: Starting service tomcat ...  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17716]: <info> service info: Starting IP address 10.100.1.151  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17716]: <info> service info: Starting IP address 10.100.1.151  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17716]: <info> service info: Sending Gratuitous arp for 10.100.1.151 (00:12:79:D6:7F:30)  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17716]: <info> service info: Sending Gratuitous arp for 10.100.1.151 (00:12:79:D6:7F:30)  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17716]: <notice> service notice: Running user script '/etc/init.d/tomcat start'  
Jul 11 11:56:46 szgtr01 clusvcmgrd: [17716]: <notice> service notice: Running user script '/etc/init.d/tomcat start'  
Jul 11 11:56:58 szgtr01 clusvcmgrd: [17716]: <notice> service notice: Started service tomcat ...  
Jul 11 11:56:58 szgtr01 clusvcmgrd: [17716]: <notice> service notice: Started service tomcat ...  
--------------------------------------------------------

As I see, this "IP address <foo> missing" is coming from
/usr/lib/clumanager/services/svclib_ip script. Why? Nobody is removing
service address from interface, but sometimes script (or ifconfig) is
failing to find IP address. I didn't hacked anything. Everything is
configured by redhat-config-cluster graphic tool BTW. Cluster has 2
shared raw devices (sda1 and sdb1) from HP SAN (EVA) for
configuration. I have WTI NPS power switches ...

This is IPv4 configuration of the maschine is following:
--------------------------------------------------------
1: lo: <LOOPBACK,UP> mtu 16436 qdisc noqueue 
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 brd 127.255.255.255 scope host lo
2: bond0: <BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue 
    link/ether 00:12:79:d6:7f:30 brd ff:ff:ff:ff:ff:ff
    inet 10.100.1.20/24 brd 10.100.1.255 scope global bond0
    inet 10.100.1.152/24 brd 10.100.1.255 scope global secondary bond0:0
    inet 10.100.1.151/24 brd 10.100.1.255 scope global secondary bond0:1
3: eth0: <BROADCAST,MULTICAST,NOARP,SLAVE,UP> mtu 1500 qdisc pfifo_fast master bond1 qlen 1000
    link/ether 00:12:79:d6:7f:31 brd ff:ff:ff:ff:ff:ff
    inet 10.100.252.20/24 brd 10.100.252.255 scope global eth0
4: eth1: <BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:12:79:d6:7f:30 brd ff:ff:ff:ff:ff:ff
    inet 10.100.1.20/24 brd 10.100.1.255 scope global eth1
5: eth2: <BROADCAST,MULTICAST,NOARP,SLAVE,UP> mtu 1500 qdisc pfifo_fast master bond0 qlen 1000
    link/ether 00:12:79:d6:7f:30 brd ff:ff:ff:ff:ff:ff
    inet 10.100.1.20/24 brd 10.100.1.255 scope global eth2
6: eth3: <BROADCAST,MULTICAST,SLAVE,UP> mtu 1500 qdisc pfifo_fast master bond1 qlen 1000
    link/ether 00:12:79:d6:7f:31 brd ff:ff:ff:ff:ff:ff
    inet 10.100.252.20/24 brd 10.100.252.255 scope global eth3
7: bond1: <BROADCAST,MULTICAST,MASTER,UP> mtu 1500 qdisc noqueue 
    link/ether 00:12:79:d6:7f:31 brd ff:ff:ff:ff:ff:ff
    inet 10.100.252.20/24 brd 10.100.252.255 scope global bond1
--------------------------------------------------------

Network 10.100.1.0/24 on bond0 is data and heartbeat link, and net
10.100.252.0/24 on bond1 is special vlan/network for communication
with WTI NPS network power switches.

Every machine in cluster has 2 network cards - gigabit Broadcom and
gigabit intel card with bcm5700 and e1000 drivers.

Cluster node-systems are Red Hat Advanced Server 3. All machines are
updated from RHN to update 5. Clumanager package is 1.2.26.1-1.

This is cluster.xml on the first cluster:
------------------------------------------------------------------
# strings /dev/sda1
uszgtr01.tel.local
0/usr/sbin/clusvcmgrd
0/usr/sbin/clusvcmgrd
p/usr/sbin/clusvcmgrd
v%bE
8{+No!
Qwf?
UT!u
S-R*.
@&e:
'&gs
k/wB
ix.x
<?xml version="1.0"?>
<cluconfig version="3.0">
  <clumembd broadcast="no" interval="750000" loglevel="6" multicast="yes" multicast_ipaddress="225.0.0.11" thread="yes" tko_count="20"/>
  <cluquorumd loglevel="6" pinginterval="" tiebreaker_ip="10.100.1.1"/>
  <clurmtabd loglevel="6" pollinterval="4"/>
  <clusvcmgrd loglevel="6"/>
  <clulockd loglevel="6"/>
  <cluster config_viewnumber="10" key="55c6b6814c16718ea1728bdfcea5cf78" name="java"/>
  <sharedstate driver="libsharedraw.so" rawprimary="/dev/raw/raw1" rawshadow="/dev/raw/raw2" type="raw"/>
  <members>
    <member id="0" name="szgtr01" watchdog="no">
      <powercontroller id="0" ipaddress="10.100.252.222" password="xxxxxxx" port="1" type="wti_nps" user=""/>
      <powercontroller id="1" ipaddress="10.100.252.223" password="xxxxxxx" port="1" type="wti_nps" user=""/>
    </member>
    <member id="1" name="szgtr02" watchdog="no">
      <powercontroller id="0" ipaddress="10.100.252.222" password="xxxxxxxx" port="5" type="wti_nps" user=""/>
      <powercontroller id="1" ipaddress="10.100.252.223" password="xxxxxxxx" port="5" type="wti_nps" user=""/>
    </member>
  </members>
  <services>
    <service checkinterval="8" failoverdomain="javadom" id="0" maxfalsestarts="0" maxrestarts="0" name="tomcat" userscript="/etc/init.d/tomcat">
      <service_ipaddresses>
        <service_ipaddress broadcast="10.100.1.255" id="0" ipaddress="10.100.1.151" netmask="255.255.255.0"/>
      </service_ipaddresses>
    </service>
    <service checkinterval="8" failoverdomain="javadom" id="1" maxfalsestarts="0" maxrestarts="0" name="rad" userscript="/etc/init.d/radiusd">
      <service_ipaddresses>
        <service_ipaddress broadcast="10.100.1.255" id="0" ipaddress="10.100.1.152" netmask="255.255.255.0"/>
      </service_ipaddresses>
    </service>
  </services>
  <failoverdomains>
    <failoverdomain id="0" name="javadom" ordered="yes" restricted="yes">
      <failoverdomainnode id="0" name="szgtr01"/>
      <failoverdomainnode id="1" name="szgtr02"/>
    </failoverdomain>
  </failoverdomains>
</cluconfig>
------------------------------------------------------------------------

lsmod:
------------------------------------------------------------------------
Module                  Size  Used by
iptable_filter          2412   0  (autoclean) (unused)
ip_tables              16544   1  [iptable_filter]
cpqci                  28612   3 
audit                  90808   3 
bonding1               25156   1 
e1000                  83784   2 
bcm5700               110564   2 
bonding                25156   1 
microcode               6912   0  (autoclean)
keybdev                 2976   0  (unused)
mousedev                5688   0  (unused)
hid                    22532   0  (unused)
input                   6176   0  [keybdev mousedev hid]
ehci-hcd               20776   0  (unused)
usb-uhci               26860   0  (unused)
usbcore                81152   1  [hid ehci-hcd usb-uhci]
ext3                   89960   3 
jbd                    55156   3  [ext3]
sg                     37324   0 
qla2300               590844   9 
qla2300_conf          301560   0 
cciss                  45188   4 
sd_mod                 14128   8 
scsi_mod              115496   3  [sg qla2300 cciss sd_mod]
------------------------------------------------------------------------

I have set up this script to watch ifconfig output:

while `usleep 500000`
do
ifconfig bond0:1;echo "----------------------------"
done >> /tmp/ifconfig.log &

After 4-5 hours, i have one failure:

# grep addr:10.100.1.151 ifconfig.log  | wc -l
  23207
# grep 'HWaddr 00:12:79:D6:7F:30' ifconfig.log  | wc -l
  23208

Can somebody help me with this IP network and occationaly missing
service IP address?

Thanks ...

P.S.
Service failures are very randoom. Sometimes 2-3 in one day. Sometimes
only one weekly ... but this is not acceptible by my customer. :-(

P.P.S.
Situation (bug) is the same on all 4 cluster nodes. HW is HP ProLiant
DL 380 with hotswapable SCSI discs in HW RAID1, 2 CPUs each, and 12 GB
RAM.


-- 
Miroslav Zubcic, RHCE, Nimium d.o.o., email: <mvz at nimium.hr>
Tel: +385 01 4852 639, Fax: +385 01 4852 640, Mobile: +385 098 942 8672
Mrazoviceva 12, 10000 Zagreb, Hrvatska


From mvz at nimium.hr  Mon Jul 11 16:55:02 2005
From: mvz at nimium.hr (Miroslav Zubcic)
Date: Mon, 11 Jul 2005 18:55:02 +0200
Subject: [Linux-cluster] Unwanted cluster automigration after first node
	joins cluster
Message-ID: <lzirzhb7vd.fsf@devana.nimium.local>

I have another problem with Red Hat Cluster Suite:

Let's say one (active) node in 2-node cluster failed. It is rebooted
fenced etc... and services are migrated on the second node. That's OK.

But when I repair and boot again first (previously failed) node and
start cluster daemons with /etc/init.d/clumanager, services are
MIGRATED BACK again (!!!) automaticly, couple of seconds after firing
up first node.

Since I have equal hardware for all nodes in cluster, and REALLY
critical services, I'm experiencing unneeded service failover.

How can I configure RH CS *NOT* to failover services back on the first
node after first node is up again?

Thanks.


-- 
Miroslav Zubcic, RHCE, Nimium d.o.o., email: <mvz at nimium.hr>
Tel: +385 01 4852 639, Fax: +385 01 4852 640, Mobile: +385 098 942 8672
Mrazoviceva 12, 10000 Zagreb, Hrvatska


From lhh at redhat.com  Tue Jul 12 13:50:20 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 12 Jul 2005 09:50:20 -0400
Subject: [Linux-cluster] Unwanted cluster automigration after first
	node joins cluster
In-Reply-To: <lzirzhb7vd.fsf@devana.nimium.local>
References: <lzirzhb7vd.fsf@devana.nimium.local>
Message-ID: <1121176220.27080.0.camel@ayanami.boston.redhat.com>

On Mon, 2005-07-11 at 18:55 +0200, Miroslav Zubcic wrote:
> I have another problem with Red Hat Cluster Suite:
> 
> Let's say one (active) node in 2-node cluster failed. It is rebooted
> fenced etc... and services are migrated on the second node. That's OK.
> 
> But when I repair and boot again first (previously failed) node and
> start cluster daemons with /etc/init.d/clumanager, services are
> MIGRATED BACK again (!!!) automaticly, couple of seconds after firing
> up first node.
> 
> Since I have equal hardware for all nodes in cluster, and REALLY
> critical services, I'm experiencing unneeded service failover.
> 
> How can I configure RH CS *NOT* to failover services back on the first
> node after first node is up again?

Don't use failover domains.

-- Lon


From lhh at redhat.com  Tue Jul 12 14:50:37 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 12 Jul 2005 10:50:37 -0400
Subject: [Linux-cluster] IP adress missing from interface (sometimes)
In-Reply-To: <lzslylb8bg.fsf@devana.nimium.local>
References: <lzslylb8bg.fsf@devana.nimium.local>
Message-ID: <1121179837.27080.12.camel@ayanami.boston.redhat.com>

On Mon, 2005-07-11 at 18:45 +0200, Miroslav Zubcic wrote:

> As I see, this "IP address <foo> missing" is coming from
> /usr/lib/clumanager/services/svclib_ip script. Why? Nobody is removing
> service address from interface, but sometimes script (or ifconfig) is
> failing to find IP address. I didn't hacked anything. Everything is
> configured by redhat-config-cluster graphic tool BTW. Cluster has 2
> shared raw devices (sda1 and sdb1) from HP SAN (EVA) for
> configuration. I have WTI NPS power switches ...

There are three problems which could explain this...

- an odd problem in bond routing in U4 + U5 kernels.
- a problem with the e1000 driver with multiple NICs in bonded mode
which causes "ifconfig" to fail spuriously.
- a bug with clumanager versions < 1.2.26.1 re: bonded IPs.

I think it's one of the first two, or a combination of both.

The new (1.2.26.1) version of clumanager is available on RHN.
Unsupported test kernels which should fix the first two issues are
available here:

http://people.redhat.com/linville/kernels/rhel3/

You can also try downgrading the initscripts package to the GA version
to work around the regression in bond routing.  If neither solve your
problem, please file a ticket with Red Hat Support:

http://www.redhat.com/apps/support/

-- Lon


From lhh at redhat.com  Tue Jul 12 15:01:45 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 12 Jul 2005 11:01:45 -0400
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42D37BD3.9080300@uib.no>
References: <42CBAF82.8050900@uib.no> <1120685352.5704.9.camel@auh5-0478>
	<42CD0740.8040102@uib.no> <1120749723.5534.7.camel@auh5-0478>
	<42CE31D8.4090206@uib.no>
	<1120838201.15658.121.camel@ayanami.boston.redhat.com>
	<1120853918.5524.15.camel@auh5-0478>  <42D2D2E7.8000906@uib.no>
	<1121114691.5632.2.camel@auh5-0478>  <42D2DBA1.2090909@uib.no>
	<1121115975.5632.4.camel@auh5-0478>  <42D2EE5E.5010904@uib.no>
	<1121120744.5632.10.camel@auh5-0478>  <42D31844.4010502@uib.no>
	<1121144527.5275.5.camel@auh5-0478>  <42D37BD3.9080300@uib.no>
Message-ID: <1121180505.27080.16.camel@ayanami.boston.redhat.com>

On Tue, 2005-07-12 at 10:14 +0200, Birger Wathne wrote:
> Eric Kerin wrote:
> 
> > On Tue, 2005-07-12 at 03:09 +0200, Birger Wathne wrote:
> > 
> >>I did a make install and service rgmanager stop/start.
> >>
> >>Now, status checks seem to work. After exportfs -r all file systems 
> >>eventually reappear.
> >>
> > 
> > 
> > Good to hear that fixed the monitoring for you.
> > 
> 
> Do you want the bad news as well?
> 
> The > 14 character long export path bug is still there in the FC4 branch.
> If the exported path is > 14 characters it will always fail status check and 
> get reexported. This is because exportfs breaks lines after the exported 
> path if it is long. I think exportfs shouldn't do this, but I guess the 
> cluster software will have to work around it.
> 
> The line that fails is:
> exportfs | grep -q "^${OCF_RESKEY_path}\ .*${OCF_RESKEY_target}"
> 
> A proper fix would be a filter to remove the offending line breaks. My quick 
> and dirty (and far less portable) fix is to use this instead:
> 
> grep -q "^${OCF_RESKEY_path}[   ]*${OCF_RESKEY_target}(" /var/lib/nfs/etab

Isn't this fixed in STABLE?

	#
	# Status check fix from Birger Wathne:
	# * Exports longer than 14 chars have line breaks inserted, which
	#   broke the way the status check worked.
	#
	exportfs -v | tr -d "\n" | sed -e 's/([^)]*)/\n/g' | grep -q \
		"^${OCF_RESKEY_path}[\t ]*.*${OCF_RESKEY_target}"
	rv=$?

Looks like *you* fixed it already!  I thought the FC4 branch is dead,
but do you want me to put it in the -FC4 branch anyway?

-- Lon


From lhh at redhat.com  Tue Jul 12 15:05:34 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 12 Jul 2005 11:05:34 -0400
Subject: [Linux-cluster] GFS & DRBD
In-Reply-To: <200507121214.12659.npf@eurotux.com>
References: <200507111333.05984.pbruna@linuxcenterla.com>
	<20050711182114.GB23156@null.msp.redhat.com>
	<200507121214.12659.npf@eurotux.com>
Message-ID: <1121180734.27080.18.camel@ayanami.boston.redhat.com>

On Tue, 2005-07-12 at 12:14 +0100, Nuno Pais Fernandes wrote:
> Drbd 0.8 will be active-active.

Nice =)


From cfeist at redhat.com  Tue Jul 12 16:17:14 2005
From: cfeist at redhat.com (Chris Feist)
Date: Tue, 12 Jul 2005 11:17:14 -0500
Subject: [Linux-cluster] GFS for RHEL4
In-Reply-To: <42D2C042.9020306@popik.pl>
References: <42D2C042.9020306@popik.pl>
Message-ID: <42D3ED0A.7080907@redhat.com>

Yes, they are at: 
ftp://ftp.redhat.com/pub/redhat/linux/enterprise/4/en/RHCS/i386/SRPMS/ &
ftp://ftp.redhat.com/pub/redhat/linux/enterprise/4/en/RHGFS/i386/SRPMS/

Thanks,
Chris

Adam Popik wrote:
> Will are SRPMS available for RHEL4 (like for RHEL3) ?
> 
> PS. Sorry for broken English.
> Adam
> 
> 
> ------------------------------------------------------------------------
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster


From jason at selu.edu  Tue Jul 12 20:22:55 2005
From: jason at selu.edu (Jason Lanclos)
Date: Tue, 12 Jul 2005 15:22:55 -0500
Subject: [Linux-cluster] GFS for RHEL4
In-Reply-To: <42D3ED0A.7080907@redhat.com>
References: <42D2C042.9020306@popik.pl> <42D3ED0A.7080907@redhat.com>
Message-ID: <200507121522.55219.Jason@selu.edu>


I was attempting to build these rpm packages.. I'm missing fake-build-provides
but i can't find this anywhere

rpmbuild --target=i686 --rebuild cman-kernel-2.6.9-36.0.src.rpm
Installing cman-kernel-2.6.9-36.0.src.rpm
Building target platforms: i686
Building for target i686
error: Failed build dependencies:
        fake-build-provides is needed by cman-kernel-2.6.9-36.0.i686


On Tuesday 12 July 2005 11:17 am, Chris Feist wrote:
> Yes, they are at: 
> ftp://ftp.redhat.com/pub/redhat/linux/enterprise/4/en/RHCS/i386/SRPMS/ &
> ftp://ftp.redhat.com/pub/redhat/linux/enterprise/4/en/RHGFS/i386/SRPMS/
> 
> Thanks,
> Chris
> 
> Adam Popik wrote:
> > Will are SRPMS available for RHEL4 (like for RHEL3) ?
> > 
> > PS. Sorry for broken English.
> > Adam
> > 
> > 
> > ------------------------------------------------------------------------
> > 
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > http://www.redhat.com/mailman/listinfo/linux-cluster
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 

-- 
Jason Lanclos                                        
Systems Administrator                                 
Red Hat Certified Engineer        
Southeastern Louisiana University		     


From adam at popik.pl  Tue Jul 12 20:41:22 2005
From: adam at popik.pl (Adam Popik)
Date: Tue, 12 Jul 2005 22:41:22 +0200
Subject: [Linux-cluster] GFS for RHEL4
In-Reply-To: <200507121522.55219.Jason@selu.edu>
References: <42D2C042.9020306@popik.pl> <42D3ED0A.7080907@redhat.com>
	<200507121522.55219.Jason@selu.edu>
Message-ID: <42D42AF2.3070503@popik.pl>

Jason Lanclos wrote:
> I was attempting to build these rpm packages.. I'm missing fake-build-provides
> but i can't find this anywhere
> 
> rpmbuild --target=i686 --rebuild cman-kernel-2.6.9-36.0.src.rpm
> Installing cman-kernel-2.6.9-36.0.src.rpm
> Building target platforms: i686
> Building for target i686
> error: Failed build dependencies:
>         fake-build-provides is needed by cman-kernel-2.6.9-36.0.i686
> 
> 
I made with "--nodeps" option
Adam

> 
> On Tuesday 12 July 2005 11:17 am, Chris Feist wrote:
> 
>>Yes, they are at: 
>>ftp://ftp.redhat.com/pub/redhat/linux/enterprise/4/en/RHCS/i386/SRPMS/ &
>>ftp://ftp.redhat.com/pub/redhat/linux/enterprise/4/en/RHGFS/i386/SRPMS/
>>
>>Thanks,
>>Chris
>>
>>Adam Popik wrote:
>>
>>>Will are SRPMS available for RHEL4 (like for RHEL3) ?
>>>
>>>PS. Sorry for broken English.
>>>Adam
>>>
>>>
>>>------------------------------------------------------------------------
>>>
>>>--
>>>Linux-cluster mailing list
>>>Linux-cluster at redhat.com
>>>http://www.redhat.com/mailman/listinfo/linux-cluster
>>
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>http://www.redhat.com/mailman/listinfo/linux-cluster
>>
> 
> 

-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 4115 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050712/311f85f0/attachment.bin>

From Birger.Wathne at ift.uib.no  Tue Jul 12 23:48:52 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Wed, 13 Jul 2005 01:48:52 +0200
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <1121172812.7227.6.camel@auh5-0478>
References: <42CBAF82.8050900@uib.no>
	<1120685352.5704.9.camel@auh5-0478>	<42CD0740.8040102@uib.no>
	<1120749723.5534.7.camel@auh5-0478>	<42CE31D8.4090206@uib.no>	<1120838201.15658.121.camel@ayanami.boston.redhat.com>	<1120853918.5524.15.camel@auh5-0478>
	<42D2D2E7.8000906@uib.no>	<1121114691.5632.2.camel@auh5-0478>
	<42D2DBA1.2090909@uib.no>	<1121115975.5632.4.camel@auh5-0478>
	<42D2EE5E.5010904@uib.no>	<1121120744.5632.10.camel@auh5-0478>
	<42D31844.4010502@uib.no>	<1121144527.5275.5.camel@auh5-0478>
	<42D37BD3.9080300@uib.no> <1121172812.7227.6.camel@auh5-0478>
Message-ID: <42D456E4.3010001@uib.no>

Eric Kerin wrote:


> The fix seems to already be in the RHEL4, STABLE, and HEAD branches.
> 
> Is FC4 even a branch that is keep up to date?  You should probably be
> using either STABLE or RHEL4 for Fedora Core 4.

Knowing which branch to use is a science in itself... :-/

Which branch is used for the FC4 rpm's? As soon as this last fix is in, I 
would like to move to using rpm's on the production servers.

-- 
birger


From wkenji at labs.fujitsu.com  Wed Jul 13 05:15:37 2005
From: wkenji at labs.fujitsu.com (Kenji Wakamiya)
Date: Wed, 13 Jul 2005 14:15:37 +0900
Subject: [Linux-cluster] GFS cookie
Message-ID: <42D4A379.5040200@labs.fujitsu.com>

Hello,

In our three-node GFS cluster, one node has a different GFS cookie
from the other nodes for the same shared iSCSI LUN.

[root:node11]# gfs_tool list
4172664832 sdb cluster1:iscsi.2

[root:node12]# gfs_tool list
4172664832 sdb cluster1:iscsi.1

[root:node13]# gfs_tool list
4171632640 sdb cluster1:iscsi.0

Is this a possible (correct) state?  R/W is doable and gfs_fsck is
successful on all nodes.  But "gfs_tool freeze" does not accept a
mountpoint as an argument (cookie is OK).  Does this bug still
remains in FC4's GFS packages?

I'm using FC4-i386 2.6.11-1.1369_FC4smp and the following RPMs:
(iSCSI initiator is SFnet linux-iscsi 4.0.2.1.test)

GFS-6.1.0-3
GFS-kernel-smp-2.6.11.8-20050601.152643.FC4.2
GFS-kernheaders-2.6.11.8-20050601.152643.FC4.2
cman-1.0.0-1
cman-kernel-smp-2.6.11.5-20050601.152643.FC4.2
cman-kernheaders-2.6.11.5-20050601.152643.FC4.2
dlm-1.0.0-3
dlm-devel-1.0.0-3
dlm-kernel-smp-2.6.11.5-20050601.152643.FC4.2
dlm-kernheaders-2.6.11.5-20050601.152643.FC4.2
...

Things have not changed by re-gfs_mkfs or OS re-installation.

Thanks,

-- Kenji


From teigland at redhat.com  Wed Jul 13 05:32:28 2005
From: teigland at redhat.com (David Teigland)
Date: Wed, 13 Jul 2005 13:32:28 +0800
Subject: [Linux-cluster] GFS cookie
In-Reply-To: <42D4A379.5040200@labs.fujitsu.com>
References: <42D4A379.5040200@labs.fujitsu.com>
Message-ID: <20050713053228.GC3603@redhat.com>

On Wed, Jul 13, 2005 at 02:15:37PM +0900, Kenji Wakamiya wrote:
> Hello,
> 
> In our three-node GFS cluster, one node has a different GFS cookie
> from the other nodes for the same shared iSCSI LUN.
> 
> [root:node11]# gfs_tool list
> 4172664832 sdb cluster1:iscsi.2
> 
> [root:node12]# gfs_tool list
> 4172664832 sdb cluster1:iscsi.1
> 
> [root:node13]# gfs_tool list
> 4171632640 sdb cluster1:iscsi.0

The cookie is a local value (a memory address), I'm surprised it's the
same on two of your machines.

Dave


From wkenji at labs.fujitsu.com  Wed Jul 13 06:09:50 2005
From: wkenji at labs.fujitsu.com (Kenji Wakamiya)
Date: Wed, 13 Jul 2005 15:09:50 +0900
Subject: [Linux-cluster] GFS cookie
In-Reply-To: <20050713053228.GC3603@redhat.com>
References: <42D4A379.5040200@labs.fujitsu.com>
	<20050713053228.GC3603@redhat.com>
Message-ID: <42D4B02E.8010007@labs.fujitsu.com>

Thanks for your quick reply.

David Teigland wrote:
>>[root:node11]# gfs_tool list
>>4172664832 sdb cluster1:iscsi.2
>>
>>[root:node12]# gfs_tool list
>>4172664832 sdb cluster1:iscsi.1
>>
>>[root:node13]# gfs_tool list
>>4171632640 sdb cluster1:iscsi.0
> 
> 
> The cookie is a local value (a memory address), I'm surprised it's the
> same on two of your machines.

Oh, I see!  All nodes are quite the same spec.  So, why does node13
have always different cookie...  I stop caring about it :-)

-- Kenji


From alewis at redhat.com  Wed Jul 13 12:23:30 2005
From: alewis at redhat.com (AJ Lewis)
Date: Wed, 13 Jul 2005 07:23:30 -0500
Subject: [Linux-cluster] NFS configuration question
In-Reply-To: <42D456E4.3010001@uib.no>
References: <1121114691.5632.2.camel@auh5-0478> <42D2DBA1.2090909@uib.no>
	<1121115975.5632.4.camel@auh5-0478> <42D2EE5E.5010904@uib.no>
	<1121120744.5632.10.camel@auh5-0478> <42D31844.4010502@uib.no>
	<1121144527.5275.5.camel@auh5-0478> <42D37BD3.9080300@uib.no>
	<1121172812.7227.6.camel@auh5-0478> <42D456E4.3010001@uib.no>
Message-ID: <20050713122330.GA14421@null.msp.redhat.com>

On Wed, Jul 13, 2005 at 01:48:52AM +0200, Birger Wathne wrote:
> Eric Kerin wrote:
> 
> 
> >The fix seems to already be in the RHEL4, STABLE, and HEAD branches.
> >
> >Is FC4 even a branch that is keep up to date?  You should probably be
> >using either STABLE or RHEL4 for Fedora Core 4.
> 
> Knowing which branch to use is a science in itself... :-/

Welcome to CVS.  It's unfortunately really tricky to make this obvious for
everyone.  If you look back in the mail archives, there have been mails sent
describing which branches to use.
 
> Which branch is used for the FC4 rpm's? As soon as this last fix is in, I 
> would like to move to using rpm's on the production servers.

STABLE is used for FC4 AFAIK.  Look at the "Source Code" section of
http://sources.redhat.com/cluster/ for a description of the branches that are
currently active.

Regards,
-- 
AJ Lewis                                   Voice:  612-638-0500
Red Hat                                    E-Mail: alewis at redhat.com
One Main Street SE, Suite 209
Minneapolis, MN 55414
   
Current GPG fingerprint = D9F8 EDCE 4242 855F A03D  9B63 F50C 54A8 578C 8715
Grab the key at: http://people.redhat.com/alewis/gpg.html or one of the
many keyservers out there...

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050713/db6e4a6b/attachment.sig>

From htfrontier at gmail.com  Wed Jul 13 01:05:02 2005
From: htfrontier at gmail.com (Hanny Tidore)
Date: Wed, 13 Jul 2005 09:05:02 +0800
Subject: [Linux-cluster] Re: redhat clustersuite problem
In-Reply-To: <1121201260.27080.44.camel@ayanami.boston.redhat.com>
References: <2fa0bfca0507120611269107bc@mail.gmail.com>
	<1121201260.27080.44.camel@ayanami.boston.redhat.com>
Message-ID: <2fa0bfca050712180552f0cfbf@mail.gmail.com>

Here is the service configuration of the cluster service:

service name: gw
check interval: 10
user script: /apps/gw/main.sh
    service ip: 192.168.1.10

Thanks.
Hanny


On 7/13/05, Lon Hohberger <lhh at redhat.com> wrote:
> On Tue, 2005-07-12 at 21:11 +0800, Hanny Tidore wrote:
> > Hi,
> >
> > I have 2 dell servers with clustersuite installed. I can run xclock as
> > cluster service. However I got a problem running a particular service
> > called gw. It is a server listening to a certain port.
> > The script called by the cluster service is main.sh
> >
> > The problem is I can run main.sh by calling "./main.sh start"
> > manually. However, when I try running the service from the cluster (by
> > enabling the service), the service says started in the cluster but
> > when I check from the console, it does not actually start.
> >
> > Thanks for your help.
> 
> Can you send info to linux-cluster at redhat.com with the service
> configuration snippet?
> 
> -- Lon
> 
>


From htfrontier at gmail.com  Wed Jul 13 03:29:28 2005
From: htfrontier at gmail.com (Hanny Tidore)
Date: Wed, 13 Jul 2005 11:29:28 +0800
Subject: [Linux-cluster] (no subject)
Message-ID: <2fa0bfca050712202938298a66@mail.gmail.com>

Hi Lon,

My analysis so far is that it seems that gw is accessible by the cluster. 
I have specified the full path. What could be the issue?

Thanks.
Hanny


From htfrontier at gmail.com  Wed Jul 13 09:00:23 2005
From: htfrontier at gmail.com (Hanny Tidore)
Date: Wed, 13 Jul 2005 17:00:23 +0800
Subject: [Linux-cluster] Re: redhat clustersuite problem
In-Reply-To: <1121201260.27080.44.camel@ayanami.boston.redhat.com>
References: <2fa0bfca0507120611269107bc@mail.gmail.com>
	<1121201260.27080.44.camel@ayanami.boston.redhat.com>
Message-ID: <2fa0bfca0507130200567e025@mail.gmail.com>

Hi Lon,

Did you receive my config ? my test so far suggest that the actual
application (gw) cannot be run. I dont know why. Maybe the application
is not accessible ???  But, manually, I can run "./main.sh start" and
the application will run.

Thanks.
Hanny

On 7/13/05, Lon Hohberger <lhh at redhat.com> wrote:
> On Tue, 2005-07-12 at 21:11 +0800, Hanny Tidore wrote:
> > Hi,
> >
> > I have 2 dell servers with clustersuite installed. I can run xclock as
> > cluster service. However I got a problem running a particular service
> > called gw. It is a server listening to a certain port.
> > The script called by the cluster service is main.sh
> >
> > The problem is I can run main.sh by calling "./main.sh start"
> > manually. However, when I try running the service from the cluster (by
> > enabling the service), the service says started in the cluster but
> > when I check from the console, it does not actually start.
> >
> > Thanks for your help.
> 
> Can you send info to linux-cluster at redhat.com with the service
> configuration snippet?
> 
> -- Lon
> 
>


From lhh at redhat.com  Wed Jul 13 14:30:08 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 13 Jul 2005 10:30:08 -0400
Subject: [Linux-cluster] Re: redhat clustersuite problem
In-Reply-To: <2fa0bfca0507130200567e025@mail.gmail.com>
References: <2fa0bfca0507120611269107bc@mail.gmail.com>
	<1121201260.27080.44.camel@ayanami.boston.redhat.com>
	<2fa0bfca0507130200567e025@mail.gmail.com>
Message-ID: <1121265008.27080.54.camel@ayanami.boston.redhat.com>

On Wed, 2005-07-13 at 17:00 +0800, Hanny Tidore wrote:
> Hi Lon,
> 
> Did you receive my config ? my test so far suggest that the actual
> application (gw) cannot be run. I dont know why. Maybe the application
> is not accessible ???  But, manually, I can run "./main.sh start" and
> the application will run.

It looks like you're on RHEL 2.1 (based on your config) see this:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=73615

If you're on RHEL3 with Red Hat Cluster Suite 3, look here:

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=143867

The latest erratas should fix it.

-- Lon


From fabbione at fabbione.net  Thu Jul 14 06:09:17 2005
From: fabbione at fabbione.net (Fabio Massimo Di Nitto)
Date: Thu, 14 Jul 2005 08:09:17 +0200
Subject: [Linux-cluster] [PATCH] make kernel modules install path consistent
Message-ID: <42D6018D.8040306@fabbione.net>


Hi guys,
    another little clean up patch.
As it is now the patch target and the install target end up storing the kernel modules
in 2 different directories. kernel/cluster/$projectname/ for the former and kernel/cluster/
for the latter.

The patch (against the STABLE branch from 2005-07-14) make the behaviour consistent installing the
modules in kernel/cluster/$projectname/

Please apply

Cheers
Fabio

-- 
no signature file found.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: kernel-modules-install-paths.dpatch
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050714/2ef8906e/attachment.ksh>

From mvz+rhcluster at nimium.hr  Thu Jul 14 09:16:35 2005
From: mvz+rhcluster at nimium.hr (Miroslav Zubcic)
Date: Thu, 14 Jul 2005 11:16:35 +0200
Subject: [Linux-cluster] Unwanted cluster automigration after first node
	joins cluster
In-Reply-To: <1121176220.27080.0.camel@ayanami.boston.redhat.com> (Lon
	Hohberger's message of "Tue, 12 Jul 2005 09:50:20 -0400")
References: <lzirzhb7vd.fsf@devana.nimium.local>
	<1121176220.27080.0.camel@ayanami.boston.redhat.com>
Message-ID: <lzmzop693g.fsf@devana.nimium.local>

Lon Hohberger <lhh at redhat.com> writes:

> On Mon, 2005-07-11 at 18:55 +0200, Miroslav Zubcic wrote:

>> How can I configure RH CS *NOT* to failover services back on the first
>> node after first node is up again?

> Don't use failover domains.

Thanks.

One more question: it is not clear from documentation (rh-cs-en-3)
what is "ordered failover". Is this the cause of my unwanted
auto-back-migration problem, or I must turn off failover domains
entirely?


-- 
Miroslav Zubcic, Nimium d.o.o., email: <mvz at nimium.hr>
Tel: +385 01 4852 639, Fax: +385 01 4852 640, Mobile: +385 098 942 8672
Mrazoviceva 12, 10000 Zagreb, Hrvatska


From mvz+rhcluster at nimium.hr  Thu Jul 14 09:38:58 2005
From: mvz+rhcluster at nimium.hr (Miroslav Zubcic)
Date: Thu, 14 Jul 2005 11:38:58 +0200
Subject: [Linux-cluster] IP adress missing from interface (sometimes)
In-Reply-To: <1121179837.27080.12.camel@ayanami.boston.redhat.com> (Lon
	Hohberger's message of "Tue, 12 Jul 2005 10:50:37 -0400")
References: <lzslylb8bg.fsf@devana.nimium.local>
	<1121179837.27080.12.camel@ayanami.boston.redhat.com>
Message-ID: <lzirzd6825.fsf@devana.nimium.local>

Lon Hohberger <lhh at redhat.com> writes:

> There are three problems which could explain this...

> - an odd problem in bond routing in U4 + U5 kernels.
> - a problem with the e1000 driver with multiple NICs in bonded mode
> which causes "ifconfig" to fail spuriously.

Yes. I have setup another script last night to check output of
ifconfig every 1 second, it fails 7-8 times a day - 4 times last night
until this morning (central european time here).

> - a bug with clumanager versions < 1.2.26.1 re: bonded IPs.

I have: # rpm -qa | grep clum
clumanager-1.2.26.1-1

There is no new version on RHN, just checked with up2date(8).

> I think it's one of the first two, or a combination of both.

I agree. That was my thinking also. We have escalated problem with
e1000 driver to HP (e1000 is part of their official support pack for
RHEL 3 on DL 380).

e1000 driver version is e1000-5.7.6a-1. From
ftp://ftp.compaq.com/pub/products/servers/supportsoftware/linux

> The new (1.2.26.1) version of clumanager is available on RHN.
> Unsupported test kernels which should fix the first two issues are
> available here:
>
> http://people.redhat.com/linville/kernels/rhel3/

I'm afraid that I cannot put test kernels on that machines. They are
in production and my head can get on the chuck if I do this. :-)

> You can also try downgrading the initscripts package to the GA version

Which version of initscripts? I have initscripts-7.31.22.EL-2

> to work around the regression in bond routing.  If neither solve your
> problem, please file a ticket with Red Hat Support:

> http://www.redhat.com/apps/support/

OK. Thank you.


-- 
Miroslav Zubcic, Nimium d.o.o., email: <mvz at nimium.hr>
Tel: +385 01 4852 639, Fax: +385 01 4852 640, Mobile: +385 098 942 8672
Mrazoviceva 12, 10000 Zagreb, Hrvatska


From lhh at redhat.com  Thu Jul 14 12:57:03 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 14 Jul 2005 08:57:03 -0400
Subject: [Linux-cluster] IP adress missing from interface (sometimes)
In-Reply-To: <lzirzd6825.fsf@devana.nimium.local>
References: <lzslylb8bg.fsf@devana.nimium.local>
	<1121179837.27080.12.camel@ayanami.boston.redhat.com>
	<lzirzd6825.fsf@devana.nimium.local>
Message-ID: <1121345823.14224.16.camel@ayanami.boston.redhat.com>

On Thu, 2005-07-14 at 11:38 +0200, Miroslav Zubcic wrote:

> Yes. I have setup another script last night to check output of
> ifconfig every 1 second, it fails 7-8 times a day - 4 times last night
> until this morning (central european time here).

> There is no new version on RHN, just checked with up2date(8).

Good, latest release is good stuff. ;)

> > I think it's one of the first two, or a combination of both.
> 
> I agree. That was my thinking also. We have escalated problem with
> e1000 driver to HP (e1000 is part of their official support pack for
> RHEL 3 on DL 380).

Ok, the e1000+bonding problem is almost definitely what you're seeing.

> I'm afraid that I cannot put test kernels on that machines. They are
> in production and my head can get on the chuck if I do this. :-)

Understood.

> Which version of initscripts? I have initscripts-7.31.22.EL-2

Given the report from above (failing 4 times overnight), this is _not_
your problem.

-- Lon


From lhh at redhat.com  Thu Jul 14 13:01:32 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 14 Jul 2005 09:01:32 -0400
Subject: [Linux-cluster] Unwanted cluster automigration after first
	node joins cluster
In-Reply-To: <lzmzop693g.fsf@devana.nimium.local>
References: <lzirzhb7vd.fsf@devana.nimium.local>
	<1121176220.27080.0.camel@ayanami.boston.redhat.com>
	<lzmzop693g.fsf@devana.nimium.local>
Message-ID: <1121346092.14224.21.camel@ayanami.boston.redhat.com>

On Thu, 2005-07-14 at 11:16 +0200, Miroslav Zubcic wrote:
> Lon Hohberger <lhh at redhat.com> writes:
> 
> > On Mon, 2005-07-11 at 18:55 +0200, Miroslav Zubcic wrote:
> 
> >> How can I configure RH CS *NOT* to failover services back on the first
> >> node after first node is up again?
> 
> > Don't use failover domains.
> 
> Thanks.
> 
> One more question: it is not clear from documentation (rh-cs-en-3)
> what is "ordered failover". Is this the cause of my unwanted
> auto-back-migration problem, or I must turn off failover domains
> entirely?

Ordered is what is causing failback, yes.  Here's a pretty length
explanation:

http://people.redhat.com/lhh/fd.html

In 2-node failover clusters, the only useful kind of failover domain is
ordered.

Services not bound to a FD are considered part of the "unordered,
unrestricted" FD of all nodes in the cluster.

-- Lon


From JACOB_LIBERMAN at Dell.com  Thu Jul 14 14:26:35 2005
From: JACOB_LIBERMAN at Dell.com (JACOB_LIBERMAN at Dell.com)
Date: Thu, 14 Jul 2005 09:26:35 -0500
Subject: [Linux-cluster] IP adress missing from interface (sometimes)
Message-ID: <BC430F453501174992B9D9E8AFB7519A0855A0@ausx3mps309.aus.amer.dell.com>

I am sure this is not helpful at all, but I remember in the old rhel 2.1
clustering, you could set a management IP that would bond to a single
active cluster node... Not necessarily the service owver. Everyone would
confuse the management IP with the service IP and set them to the same
address. Then, when you would move the resource, the IP would stay with
the original node rather than migrating with the service. I don't think
rhel3 clustering has a management interface option, though. 

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Lon Hohberger
> Sent: Thursday, July 14, 2005 7:57 AM
> To: linux clustering
> Subject: Re: [Linux-cluster] IP adress missing from interface 
> (sometimes)
> 
> On Thu, 2005-07-14 at 11:38 +0200, Miroslav Zubcic wrote:
> 
> > Yes. I have setup another script last night to check output of 
> > ifconfig every 1 second, it fails 7-8 times a day - 4 times 
> last night 
> > until this morning (central european time here).
> 
> > There is no new version on RHN, just checked with up2date(8).
> 
> Good, latest release is good stuff. ;)
> 
> > > I think it's one of the first two, or a combination of both.
> > 
> > I agree. That was my thinking also. We have escalated problem with 
> > e1000 driver to HP (e1000 is part of their official support 
> pack for 
> > RHEL 3 on DL 380).
> 
> Ok, the e1000+bonding problem is almost definitely what you're seeing.
> 
> > I'm afraid that I cannot put test kernels on that machines. 
> They are 
> > in production and my head can get on the chuck if I do this. :-)
> 
> Understood.
> 
> > Which version of initscripts? I have initscripts-7.31.22.EL-2
> 
> Given the report from above (failing 4 times overnight), this 
> is _not_ your problem.
> 
> -- Lon
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 


From bujan at isqsolutions.com  Thu Jul 14 20:57:51 2005
From: bujan at isqsolutions.com (Manuel Bujan)
Date: Thu, 14 Jul 2005 16:57:51 -0400
Subject: [Linux-cluster] SMP and GFS
Message-ID: <04f401c588b6$b31db0a0$5001a8c0@spa.isqsolutions.com>

Hi there,

Is there any  issue I should be aware of if SMP is enabled in
my kernel ? What if I compile my kernel to be pre-emptible ? Any problem with that and GFS ?

I am running GFS in a dual Xeon server from DELL.
My current kernel config has:

CONFIG_SMP=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y

After a lot of time running my GFS setup I got the following error in one of our cluster servers, and I had to reboot it in order to restablish the service:

#################################################################################
Jul 14 14:19:35 atmail-2 kernel:  2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (18044) req reply einval ae2c0092 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (31381) req reply einval bf9901e7 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (2023) req reply einval d6c30333 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 last message repeated 2 times
Jul 14 14:19:35 atmail-2 kernel: gfs001 (22381) req reply einval e03903ee fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (9779) req reply einval e0b20396 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 (21318) req reply einval e3f00178 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 (12439) req reply einval e3390095 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 last message repeated 2 times
Jul 14 14:19:35 atmail-2 kernel: gfs001 (12439) req reply einval e57e033f fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 (17946) req reply einval ef3400a6 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 (6679) req reply einval f0dc0169 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel: gfs001 (10103) req reply einval f6a700d9 fr 1 r 1        2
Jul 14 14:19:35 atmail-2 kernel: gfs001 send einval to 1
Jul 14 14:19:35 atmail-2 kernel:  3,3 id 24a019c sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,103519 223022b 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,103519 3,3 id 223022b sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,b06ad 8d02f9 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,103519 3,3 id 223022b sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,b06ad 8d02f9 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,b06ad 3,3 id 8d02f9 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,33df5b ff9d01f5 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,33df5b 3,3 id ff9d01f5 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,1369770 fe1f02fd 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,1369770 3,3 id fe1f02fd sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,17e353 e7034e 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,17e353 3,3 id e7034e sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,17e33e ffc001a7 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,17e33e 3,3 id ffc001a7 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,b06b0 8901e1 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,b06b0 3,3 id 8901e1 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,3bdc60 1ca0351 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,3bdc60 3,3 id 1ca0351 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,ef8da ffdd0006 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,ef8da 3,3 id ffdd0006 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,40b4a 1fa012f 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,40b4a 3,3 id 1fa012f sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,136976e ff8c0371 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,136976e 3,3 id ff8c0371 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,1369832 de0060 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,1369832 3,3 id de0060 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,af9d6 690279 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,af9d6 3,3 id 690279 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,8baaf4 fffb0229 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,8baaf4 3,3 id fffb0229 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,136976f ff730126 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,136976f 3,3 id ff730126 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,17e34e 620175 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,17e34e 3,3 id 620175 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,3ce7e7 2a00002 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,3ce7e7 3,3 id 2a00002 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,1369833 ba00ae 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,1369833 3,3 id ba00ae sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,df936 2530027 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,df936 3,3 id 2530027 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,17e356 feb502d9 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,17e356 3,3 id feb502d9 sts -65538 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,1369874 ff93010d 3 0
Jul 14 14:19:35 atmail-2 kernel: 2035 un 2,13892c 1dc038b 3 0
Jul 14 14:19:35 atmail-2 kernel: 2002 qc 2,1369874 3,3 id ff93010d sts -65538 0

...........

Jul 14 14:19:35 atmail-2 kernel: lock_dlm:  Assertion failed on line 411 of file /usr/src/cluster/gfs-kernel/src/dlm/lock.c
Jul 14 14:19:35 atmail-2 kernel: lock_dlm:  assertion:  "!error"
Jul 14 14:19:35 atmail-2 kernel: lock_dlm:  time = 1698417809
Jul 14 14:19:35 atmail-2 kernel: gfs001: num=2,cb81a8 err=-22 cur=3 req=5 lkf=44
Jul 14 14:19:35 atmail-2 kernel:
Jul 14 14:19:35 atmail-2 kernel: ------------[ cut here ]------------
Jul 14 14:19:35 atmail-2 kernel: kernel BUG at /usr/src/cluster/gfs-kernel/src/dlm/lock.c:411!
Jul 14 14:19:35 atmail-2 kernel: invalid operand: 0000 [#1]
Jul 14 14:19:35 atmail-2 kernel: PREEMPT SMP
Jul 14 14:19:35 atmail-2 kernel: Modules linked in: ipmi_si ipmi_devintf ipmi_msghandler autofs e1000 eepro100 mii microcode lock_dlm dlm cman gfs lock_harness dm_mod ide_disk ide_core aic7xxx aacraid megaraid_mbox megaraid_mm
Jul 14 14:19:35 atmail-2 kernel: CPU:    0
Jul 14 14:19:35 atmail-2 kernel: EIP:    0060:[<f887cfe7>]    Not tainted VLI
Jul 14 14:19:35 atmail-2 kernel: EFLAGS: 00010296   (2.6.11.6y)
Jul 14 14:19:35 atmail-2 kernel: EIP is at do_dlm_lock+0x1d7/0x1f0 [lock_dlm]
Jul 14 14:19:35 atmail-2 kernel: eax: 00000001   ebx: ffffffea   ecx: 00008000   edx: 00000202
Jul 14 14:19:35 atmail-2 kernel: esi: e360d500   edi: f7521e00   ebp: 00000001   esp: cdbfbcc4
Jul 14 14:19:35 atmail-2 kernel: ds: 007b   es: 007b   ss: 0068
Jul 14 14:19:35 atmail-2 kernel: Process virtual (pid: 7819, threadinfo=cdbfb000 task=cdb56a40)
Jul 14 14:19:35 atmail-2 kernel: Stack: f8882cc9 f73299a0 00000002 00cb81a8 00000000 ffffffea 00000003 00000005
Jul 14 14:19:35 atmail-2 kernel:        00000044 f887d730 00000000 20202020 32202020 20202020 20202020 62632020
Jul 14 14:19:35 atmail-2 kernel:        38613138 c2000018 c0319d70 e360d500 00000000 de646b08 f8c21000 f887d0d9
Jul 14 14:19:35 atmail-2 kernel: Call Trace:
Jul 14 14:19:35 atmail-2 kernel:  [<f887d730>] lock_bast+0x0/0x10 [lock_dlm]
Jul 14 14:19:35 atmail-2 kernel:  [<f887d0d9>] lm_dlm_lock+0x79/0x90 [lock_dlm]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b5aa2a>] gfs_lm_lock+0x4a/0x70 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b4f2cf>] gfs_glock_xmote_th+0xbf/0x220 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b4e6d7>] rq_promote+0xd7/0x1b0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b4e9be>] run_queue+0xce/0xe0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b4ff95>] gfs_glock_nq+0x85/0x190 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b50ab9>] nq_m_sync+0x69/0xa0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b50990>] glock_compare+0x0/0xc0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b50c62>] gfs_glock_nq_m+0x172/0x1e0 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<f8b6828e>] gfs_link+0xae/0x410 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<c0166912>] permission+0x92/0xa0
Jul 14 14:19:35 atmail-2 kernel:  [<f8b681e0>] gfs_link+0x0/0x410 [gfs]
Jul 14 14:19:35 atmail-2 kernel:  [<c0169c8c>] vfs_link+0xec/0x170
Jul 14 14:19:35 atmail-2 kernel:  [<c0169e11>] sys_link+0x101/0x130
Jul 14 14:19:35 atmail-2 kernel:  [<c0162d67>] sys_stat64+0x37/0x40
Jul 14 14:19:35 atmail-2 kernel:  [<c0102919>] sysenter_past_esp+0x52/0x75
Jul 14 14:19:35 atmail-2 kernel: Code: 0c 89 54 24 10 8b 46 0c 89 44 24 08 8b 47 18 c7 04 24 a0 35 88 f8 89 44 24 04 e8 f5 d4 89 c7 c7 04 24 c9 2c 88 f8 e8 e9 d4 89 c7 <0f> 0b 9b 01 a0 33 88 f8 c7 04 24 60 34 88 f8 e8 b5 cb 89 c7 90


##########################################

We were running without problem until now the following version of the GFS suite from cvs:

gfs_tool version
gfs_tool DEVEL.1112190134 (built Mar 30 2005 08:43:42)
Copyright (C) Red Hat, Inc.  2004-2005  All rights reserved.

#cman_tool version
5.0.1 config 15

#ccs_tool -V
ccs_tool DEVEL.1112190133 (built Mar 30 2005 08:43:29)
Copyright (C) Red Hat, Inc.  2004  All rights reserved.


Any hints or recomendation,

Regards
Bujan
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050714/6e778b84/attachment.htm>

From teigland at redhat.com  Fri Jul 15 01:01:46 2005
From: teigland at redhat.com (David Teigland)
Date: Fri, 15 Jul 2005 09:01:46 +0800
Subject: [Linux-cluster] SMP and GFS
In-Reply-To: <04f401c588b6$b31db0a0$5001a8c0@spa.isqsolutions.com>
References: <04f401c588b6$b31db0a0$5001a8c0@spa.isqsolutions.com>
Message-ID: <20050715010146.GA8086@redhat.com>

On Thu, Jul 14, 2005 at 04:57:51PM -0400, Manuel Bujan wrote:
> What if I compile my kernel to be pre-emptible ? Any problem with that
> and GFS ?

We never use CONFIG_PREEMPT in our testing, so there could be some
problems with that.

> We were running without problem until now the following version of the
> GFS suite from cvs:

There's now an actual release you can use, cluster-1.00.00 at
  ftp://sources.redhat.com/pub/cluster/releases/

Dave


From fabbione at fabbione.net  Fri Jul 15 04:35:47 2005
From: fabbione at fabbione.net (Fabio Massimo Di Nitto)
Date: Fri, 15 Jul 2005 06:35:47 +0200
Subject: [Linux-cluster] Problems with system-config-cluster
Message-ID: <42D73D23.5030702@fabbione.net>


Hi,

I am having some problems with the GUI config tool (looks extremely nice btw).

Here is the scenario:

I created manually config 1 to make a 2 node cman cluster join the quorum (seems
to be required to startup the GUI).

<start the GUI>

As soon as i add any kind of fence information (in this attempt I edited the node
fence info that add <method name="1"/> to the node section), send the config to the
cluster and quit the GUI, I am not able to re-open the GUI anymore with the following
error:

Traceback (most recent call last):
  File "/usr/sbin/system-config-cluster", line 540, in ?
    runFullGUI()
  File "/usr/sbin/system-config-cluster", line 522, in runFullGUI
    baseapp = basecluster(glade_xml, app)
  File "/usr/sbin/system-config-cluster", line 135, in __init__
    self.bad_xml_text.get_buffer().set_text(e.getMessage())
TypeError: GtkTextBuffer.set_text() argument 1 must be string or read-only buffer, not file

In attachment there is also the generated config (num. 2).

You can also notice that even if i didn't specifically add extra fence information, a line
<fencedevices/> is added. (Removing doesn't help the GUI but it makes the config valid again
for ccsd).

Does anybody have any idea?

Cheers
Fabio

-- 
no signature file found.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf.1
Type: text/xml
Size: 380 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050715/a34799c8/attachment.xml>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.conf.2
Type: text/xml
Size: 615 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050715/a34799c8/attachment-0001.xml>

From fabbione at fabbione.net  Fri Jul 15 06:50:57 2005
From: fabbione at fabbione.net (Fabio Massimo Di Nitto)
Date: Fri, 15 Jul 2005 08:50:57 +0200
Subject: [Linux-cluster] Problems with system-config-cluster
In-Reply-To: <42D73D23.5030702@fabbione.net>
References: <42D73D23.5030702@fabbione.net>
Message-ID: <42D75CD1.7020707@fabbione.net>

Fabio Massimo Di Nitto wrote:
> Hi,
> 
> I am having some problems with the GUI config tool (looks extremely nice btw).
> 
> Here is the scenario:
> 
> I created manually config 1 to make a 2 node cman cluster join the quorum (seems
> to be required to startup the GUI).
> 
> <start the GUI>
> 
> As soon as i add any kind of fence information (in this attempt I edited the node
> fence info that add <method name="1"/> to the node section), send the config to the
> cluster and quit the GUI, I am not able to re-open the GUI anymore with the following
> error:
> 
> Traceback (most recent call last):
>   File "/usr/sbin/system-config-cluster", line 540, in ?
>     runFullGUI()
>   File "/usr/sbin/system-config-cluster", line 522, in runFullGUI
>     baseapp = basecluster(glade_xml, app)
>   File "/usr/sbin/system-config-cluster", line 135, in __init__
>     self.bad_xml_text.get_buffer().set_text(e.getMessage())
> TypeError: GtkTextBuffer.set_text() argument 1 must be string or read-only buffer, not file
> 

>From a bit more investigation it appears that the generated config file fails to pass
the check_xml in CommandHandler.py when executed in --relaxng mode. I suspect that the
config.ng file is somewhat outdate or there is a typo that i really can't find.

If i disable the --relaxng check (still executin xmllint on the config), the GUI loads again properly.

The configuration still looks valid and working.

Cheers
Fabio

-- 
no signature file found.


From cjk at techma.com  Fri Jul 15 10:03:28 2005
From: cjk at techma.com (Kovacs, Corey J.)
Date: Fri, 15 Jul 2005 06:03:28 -0400
Subject: [Linux-cluster] Redhat Cluster (RHEL3) and NFS
Message-ID: <EE32D921D7601547AD9CA5C87906C566095170@tmaemail.techma.com>

I am trying to get an NFS service locked down to a few specific ports
for operation thru a firewall. I set it up so that mountd, lockd, and statd
were all using there own ports and not asking portmap for ports. When
I do this on a non clustered machine, it works fine. When I tried this on
a cluster (RHEL3 and clumanager ) The NFS server never starts. 
Clumanager just sits there and hangs. Is there some dependancy on 
portmap for an NFS service to operate correctly?
 
 
Corey
 
 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050715/c995ac43/attachment.htm>

From ptr at poczta.fm  Fri Jul 15 11:04:19 2005
From: ptr at poczta.fm (ptr at poczta.fm)
Date: 15 Jul 2005 13:04:19 +0200
Subject: [Linux-cluster] NFS failover on GFS?
Message-ID: <20050715110419.03BF1A4175@poczta.interia.pl>


   Hello.

   As I need to incorporate failover on 2-nodes GFS cluster, I set the heartbeat communication between nodes exporting
GFS partition for external hosts. Although production tests 
brought somewhat unsuccessful results:

1. If I set an NFS as service run with heartbeat script (haresources), so it's being started and shut dwon togetheer with shared cluster interface, I'm getting problems with NFS sstartup on node taking resources. It works when juzt stopping heartbeat on one of the nodes, so another one takes resources over and starts services (NFS in this case). 
BUT: when I test ral-life scenario as by pulling the ethernet plug from one node, the shared interface is brought up on second one, but NFS won't start due to unknown reason.
The shared directory with NFS locks is /var/lib/nfs located on GFS partition, so I assume the moment of fencing dead node makes trouble for proper NFS startup on second node.

2. I also attempted to remove NFS script form services managed by heartbeat, but the same situation happens: although HA sends the gratituous arp when taking the shared interface over, machines mounting GFS partition won't see the change and will loose mounts.
   Any help on this issue would be greatly appreciated.
Regards,

Piotr

----------------------------------------------------------------------
Najnowsze wiadomosci!!! >>> http://link.interia.pl/f18a0


From toelen at gmail.com  Fri Jul 15 12:26:35 2005
From: toelen at gmail.com (Leen Toelen)
Date: Fri, 15 Jul 2005 14:26:35 +0200
Subject: [Linux-cluster] Windows
Message-ID: <a494a0bc0507150526381cb8c1@mail.gmail.com>

Hi all,

is gfs very closely tied to linux or not? It would be great to access
the same gfs from both windows and Linux.

Regards,
Leen Toelen


From lhh at redhat.com  Fri Jul 15 14:46:48 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 15 Jul 2005 10:46:48 -0400
Subject: [Linux-cluster] Windows
In-Reply-To: <a494a0bc0507150526381cb8c1@mail.gmail.com>
References: <a494a0bc0507150526381cb8c1@mail.gmail.com>
Message-ID: <1121438808.3287.9.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-15 at 14:26 +0200, Leen Toelen wrote:

> is gfs very closely tied to linux or not? 

Yes.

-- Lon


From lhh at redhat.com  Fri Jul 15 15:35:01 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 15 Jul 2005 11:35:01 -0400
Subject: [Linux-cluster] Redhat Cluster (RHEL3) and NFS
In-Reply-To: <EE32D921D7601547AD9CA5C87906C566095170@tmaemail.techma.com>
References: <EE32D921D7601547AD9CA5C87906C566095170@tmaemail.techma.com>
Message-ID: <1121441701.3287.15.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-15 at 06:03 -0400, Kovacs, Corey J. wrote:
> I am trying to get an NFS service locked down to a few specific ports
> for operation thru a firewall. I set it up so that mountd, lockd, and
> statd
> were all using there own ports and not asking portmap for ports. When
> I do this on a non clustered machine, it works fine. When I tried this
> on
> a cluster (RHEL3 and clumanager ) The NFS server never starts. 
> Clumanager just sits there and hangs. Is there some dependancy on 
> portmap for an NFS service to operate correctly?

There shouldn't be.  You should have the nfs servers running before
starting clumanager, but I don't think this would cause a problem.

-- Lon


From phung at cs.columbia.edu  Fri Jul 15 21:13:09 2005
From: phung at cs.columbia.edu (Dan B. Phung)
Date: Fri, 15 Jul 2005 17:13:09 -0400 (EDT)
Subject: [Linux-cluster] Makefile fix for 2.6.12.2
Message-ID: <Pine.LNX.4.44.0507151711390.17233-100000@algiers.clic.cs.columbia.edu>

building cluster for my 2.6.12.2 box had an error finding
cluster/cnxman-socket.h, and this fixed it:

cluster/cman/lib/Makefile:L32
-                echo '-I${incdir}/cluster'; fi)
+                echo '-I${incdir}'; fi)


-dan


From ocrete at max-t.com  Fri Jul 15 22:02:56 2005
From: ocrete at max-t.com (Olivier Crete)
Date: Fri, 15 Jul 2005 18:02:56 -0400
Subject: [Linux-cluster] Using cman,etc for a non-gfs app
In-Reply-To: <1119479792.24237.315.camel@ayanami.boston.redhat.com>
References: <1119467822.21167.34.camel@cocagne.max-t.internal>
	<1119479792.24237.315.camel@ayanami.boston.redhat.com>
Message-ID: <1121464976.13635.13.camel@cocagne.max-t.internal>

On Wed, 2005-22-06 at 18:36 -0400, Lon Hohberger wrote:
> * libmagma provides cp_fence() / clu_fence() which work on both CMAN and
> gulm.

I've looked implementing clu_fence() in libmagma. It seems to only call
an ioctl in the kernel cman which only seems to do cooperative fencing
and removing a node from the cluster... I dont see any call to fenced or
fence_node. Or have I missed something ?

-- 
Olivier Cr?te
ocrete at max-t.com
Maximum Throughput Inc.


From blaster at kofler.eu.org  Sun Jul 17 22:16:24 2005
From: blaster at kofler.eu.org (blaster at kofler.eu.org)
Date: Mon, 18 Jul 2005 00:16:24 +0200
Subject: [Linux-cluster] List of supported apc Masterswitches
Message-ID: <20050717221644.DPNR4584.viefep18-int.chello.at@TheCenter>

Hello,

we are interested to implement the GFS 6.1 with the fencing using an apc
Masterswitch.

But which models are supported - is there any list available?

We would be especially interested to use the APC Masterswitch AP9222 and the
Management Card AP9606.

Thanks a lot for any advice,
Thomas


From robert at deakin.edu.au  Sun Jul 17 23:28:34 2005
From: robert at deakin.edu.au (Robert Ruge)
Date: Mon, 18 Jul 2005 09:28:34 +1000
Subject: [Linux-cluster] Cluster design questions
In-Reply-To: <20050715010146.GA8086@redhat.com>
Message-ID: <200507172328.j6HNSY5g001614@deakin.edu.au>

I have two simple questions for the experts out there. My
configuration will be RHEL 4 and GFS6.1.

1. When setting up the cluster membership should I use a private
network address for the members even if it is 100mb or would the
public data interface which is gigabit provide beter performance? I
guess the question is really how much network traffic does the DLM
create?

2. If I have a GFS /export filesystem can I then mount other GFS
filesystems under that and on multiple nodes, i.e., can I have
/export/home1, /export/home2, /export/home3 all mounted under /export
and all being GFS filesystems and all mounted on multiple servers? Do
the mount points cause any confusion across the nodes.

Thanks.

Robert Ruge   School of Information Technology, Deakin University  


From teigland at redhat.com  Mon Jul 18 06:15:53 2005
From: teigland at redhat.com (David Teigland)
Date: Mon, 18 Jul 2005 14:15:53 +0800
Subject: [Linux-cluster] [RFC] nodemanager, ocfs2, dlm
Message-ID: <20050718061553.GA9568@redhat.com>

Some of the comments about the dlm concerned how it's configured (from
user space.)  In particular, there was interest in seeing the dlm and
ocfs2 use common methods for their configuration.

The first area I'm looking at is how we get addresses/ids of other nodes.
Currently, the dlm uses an ioctl on a misc device and ocfs2 uses a
separate kernel module called "ocfs2_nodemanager" that's based on
configfs.

I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
it (removing ocfs-specific stuff).  It still needs some work, but I'd like
to know if this appeals to the ocfs group and to others who were
interested in seeing some similarity in dlm/ocfs configuration.

Thanks,
Dave


diff -urN a/drivers/Kconfig b/drivers/Kconfig
--- a/drivers/Kconfig	2005-07-18 13:40:31.011368352 +0800
+++ b/drivers/Kconfig	2005-07-18 13:46:17.661669496 +0800
@@ -68,4 +68,6 @@
 
 source "drivers/dlm/Kconfig"
 
+source "drivers/nodemanager/Kconfig"
+
 endmenu
diff -urN a/drivers/Makefile b/drivers/Makefile
--- a/drivers/Makefile	2005-07-18 13:40:31.015367744 +0800
+++ b/drivers/Makefile	2005-07-18 13:46:06.846313680 +0800
@@ -70,3 +70,4 @@
 obj-y				+= firmware/
 obj-$(CONFIG_CRYPTO)		+= crypto/
 obj-$(CONFIG_DLM)		+= dlm/
+obj-$(CONFIG_NODEMANAGER)	+= nodemanager/
diff -urN a/drivers/nodemanager/Kconfig b/drivers/nodemanager/Kconfig
--- a/drivers/nodemanager/Kconfig	1970-01-01 07:30:00.000000000 +0730
+++ b/drivers/nodemanager/Kconfig	2005-07-18 13:52:16.449125512 +0800
@@ -0,0 +1,9 @@
+menu "Node Manager"
+
+config NODEMANAGER
+	tristate "Node Manager"
+	help
+	Node addresses and ID"s are provided from user space and made
+	available to kernel components from this module.
+
+endmenu
diff -urN a/drivers/nodemanager/Makefile b/drivers/nodemanager/Makefile
--- a/drivers/nodemanager/Makefile	1970-01-01 07:30:00.000000000 +0730
+++ b/drivers/nodemanager/Makefile	2005-07-18 13:45:52.620476336 +0800
@@ -0,0 +1,3 @@
+obj-$(CONFIG_NODEMANAGER) +=	nodemanager.o
+
+nodemanager-y :=		nodemanager.o
diff -urN a/drivers/nodemanager/nodemanager.c b/drivers/nodemanager/nodemanager.c
--- a/drivers/nodemanager/nodemanager.c	1970-01-01 07:30:00.000000000 +0730
+++ b/drivers/nodemanager/nodemanager.c	2005-07-18 13:55:17.043670968 +0800
@@ -0,0 +1,655 @@
+/*
+ * nodemanager.c
+ *
+ * Copyright (C) 2004, 2005 Oracle.  All rights reserved.
+ * Copyright (C) 2005 Red Hat, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ * 
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ */
+
+/* TODO:
+   - generic addresses (IPV4/6)
+   - multiple addresses per node
+   - more than 255 nodes (no static MAXNODE array)
+   - function to get a list of all nodes
+*/
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/idr.h>
+#include <linux/configfs.h>
+
+#include "nodemanager.h"
+
+enum {
+	NM_NODE_ATTR_NODEID = 0,
+	NM_NODE_ATTR_ADDRESS,
+	NM_NODE_ATTR_LOCAL,
+};
+
+struct clusters;
+struct cluster;
+struct nodes;
+struct node;
+
+static ssize_t node_nodeid_read(struct node *, char *);
+static ssize_t node_nodeid_write(struct node *, const char *, size_t);
+static ssize_t node_ipv4_address_read(struct node *, char *);
+static ssize_t node_ipv4_address_write(struct node *, const char *, size_t);
+static ssize_t node_local_read(struct node *, char *);
+static ssize_t node_local_write(struct node *, const char *, size_t);
+
+static struct config_item *make_node(struct config_group *, const char *);
+static void drop_node(struct config_group *, struct config_item *);
+static void release_node(struct config_item *);
+static struct config_group *make_cluster(struct config_group *, const char *);
+static void drop_cluster(struct config_group *, struct config_item *);
+static void release_cluster(struct config_item *);
+
+static ssize_t show_node(struct config_item *, struct configfs_attribute *,
+			 char *);
+static ssize_t store_node(struct config_item *, struct configfs_attribute *,
+			  const char *, size_t);
+
+
+struct node_attribute {
+	struct configfs_attribute attr;
+	ssize_t (*show)(struct node *, char *);
+	ssize_t (*store)(struct node *, const char *, size_t);
+};
+
+static struct node_attribute node_attr_nodeid = {
+	.attr	= { .ca_owner = THIS_MODULE,
+		    .ca_name = "nodeid",
+		    .ca_mode = S_IRUGO | S_IWUSR },
+	.show	= node_nodeid_read,
+	.store	= node_nodeid_write,
+};
+
+static struct node_attribute node_attr_ipv4_address = {
+	.attr	= { .ca_owner = THIS_MODULE,
+		    .ca_name = "ipv4_address",
+		    .ca_mode = S_IRUGO | S_IWUSR },
+	.show	= node_ipv4_address_read,
+	.store	= node_ipv4_address_write,
+};
+
+static struct node_attribute node_attr_local = {
+	.attr	= { .ca_owner = THIS_MODULE,
+		    .ca_name = "local",
+		    .ca_mode = S_IRUGO | S_IWUSR },
+	.show	= node_local_read,
+	.store	= node_local_write,
+};
+
+static struct configfs_attribute *node_attrs[] = {
+	[NM_NODE_ATTR_NODEID] = &node_attr_nodeid.attr,
+	[NM_NODE_ATTR_ADDRESS] = &node_attr_ipv4_address.attr,
+	[NM_NODE_ATTR_LOCAL] = &node_attr_local.attr,
+	NULL,
+};
+
+/* Hierarchy of four kinds of objects: clusters, cluster, nodes, node.
+
+   'clusters' and 'nodes' are intermediate objects used to group together
+   collections of multiple 'cluster' objects and 'node' objects respectively.
+
+   struct clusters
+   struct cluster
+   struct nodes
+   struct node
+
+   struct config_item_type clusters_type
+   struct config_item_type cluster_type
+   struct config_item_type nodes_type
+   struct config_item_type node_type
+
+   struct configfs_group_operations clusters_ops
+   struct configfs_item_operations cluster_ops
+   struct configfs_group_operations nodes_ops
+   struct configfs_item_operations node_ops
+
+   When loaded, the module initially sets up one global, root 'clusters'
+   object named 'clusters_root' under which all 'cluster' objects will
+   be created.
+
+   When a 'cluster' object is created, a new 'nodes' structure is
+   initialized as part of it, under which multiple 'node' structs
+   can then be created.
+*/
+
+static struct configfs_item_operations node_ops = {
+	.release		= release_node,
+	.show_attribute		= show_node,
+	.store_attribute	= store_node,
+};
+
+static struct configfs_group_operations nodes_ops = {
+	.make_item	= make_node,
+	.drop_item	= drop_node,
+};
+
+static struct configfs_item_operations cluster_ops = {
+	.release	= release_cluster,
+};
+
+static struct configfs_group_operations clusters_ops = {
+	.make_group	= make_cluster,
+	.drop_item	= drop_cluster,
+};
+
+static struct config_item_type node_type = {
+	.ct_group_ops	= NULL,
+	.ct_item_ops	= &node_ops,
+	.ct_attrs	= node_attrs,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct config_item_type nodes_type = {
+	.ct_group_ops	= &nodes_ops,
+	.ct_item_ops	= NULL,
+	.ct_attrs	= NULL,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct config_item_type cluster_type = {
+	.ct_group_ops	= NULL,
+	.ct_item_ops	= &cluster_ops,
+	.ct_attrs	= NULL,
+	.ct_owner	= THIS_MODULE,
+};
+
+static struct config_item_type clusters_type = {
+	.ct_group_ops	= &clusters_ops,
+	.ct_item_ops	= NULL,
+	.ct_attrs	= NULL,
+	.ct_owner	= THIS_MODULE,
+};
+
+struct node {
+	spinlock_t		nd_lock;
+	struct config_item	nd_item; 
+	char			nd_name[NODEMANAGER_MAX_NAME_LEN+1];
+	int			nd_nodeid;
+	u32			nd_ipv4_address;
+	struct rb_node		nd_ip_node;
+	int			nd_local;
+	unsigned long		nd_set_attributes;
+	struct idr		nd_status_idr;
+	struct list_head	nd_status_list;
+};
+
+struct nodes {
+	struct config_group ns_group;
+};
+
+struct cluster {
+	struct config_group	cl_group;
+	int			cl_has_local;
+	int			cl_local_node;
+	u32			cl_local_addr;
+	rwlock_t		cl_nodes_lock;
+	struct node		*cl_nodes[NODEMANAGER_MAX_NODES];
+	struct rb_root		cl_node_ip_tree;
+};
+
+struct clusters {
+	struct configfs_subsystem cs_subsys;
+};
+
+static struct clusters clusters_root = {
+	.cs_subsys = {
+		.su_group = {
+			.cg_item = {
+				.ci_namebuf = "cluster",
+				.ci_type = &clusters_type,
+			},
+		},
+	},
+};
+
+/* for now we operate under the assertion that there can be only one
+ * cluster active at a time.  Changing this will require trickling
+ * cluster references throughout where nodes are looked up */
+
+static struct cluster *single_cluster = NULL;
+
+
+static struct cluster *to_cluster(struct config_item *item)
+{
+	return item ?
+		container_of(to_config_group(item), struct cluster, cl_group)
+		: NULL;
+}
+
+static struct node *to_node(struct config_item *item)
+{
+	return item ? container_of(item, struct node, nd_item) : NULL;
+}
+
+static struct cluster *node_to_cluster(struct node *node)
+{
+	return to_cluster(node->nd_item.ci_parent->ci_parent);
+}
+
+static struct node *node_ip_tree_lookup(struct cluster *cluster,
+					u32 addr,
+					struct rb_node ***ret_p,
+					struct rb_node **ret_parent)
+{
+	struct rb_node **p = &cluster->cl_node_ip_tree.rb_node;
+	struct rb_node *parent = NULL;
+	struct node *node, *ret = NULL;
+
+	while (*p) {
+		parent = *p;
+		node = rb_entry(parent, struct node, nd_ip_node);
+
+		if (addr < node->nd_ipv4_address)
+			p = &(*p)->rb_left;
+		else if (addr > node->nd_ipv4_address)
+			p = &(*p)->rb_right;
+		else {
+			ret = node;
+			break;
+		}
+	}
+
+	if (ret_p != NULL)
+		*ret_p = p;
+	if (ret_parent != NULL)
+		*ret_parent = parent;
+
+	return ret;
+}
+
+u32 nodemanager_nodeid_to_addr(int nodeid)
+{
+	struct node *node = NULL;
+	u32 addr = 0;
+
+	if (nodeid >= NODEMANAGER_MAX_NODES || single_cluster == NULL)
+		goto out;
+
+	read_lock(&single_cluster->cl_nodes_lock);
+	node = single_cluster->cl_nodes[nodeid];
+	if (node)
+		addr = node->nd_ipv4_address;
+	read_unlock(&single_cluster->cl_nodes_lock);
+ out:
+	return addr;
+}
+EXPORT_SYMBOL_GPL(nodemanager_nodeid_to_addr);
+
+int nodemanager_addr_to_nodeid(u32 addr)
+{
+	struct node *node = NULL;
+	struct cluster *cluster = single_cluster;
+	int nodeid = NODEMANAGER_MAX_NODES;
+
+	if (cluster == NULL)
+		goto out;
+
+	read_lock(&cluster->cl_nodes_lock);
+	node = node_ip_tree_lookup(cluster, addr, NULL, NULL);
+	if (node)
+		nodeid = node->nd_nodeid;
+	read_unlock(&cluster->cl_nodes_lock);
+ out:
+	return nodeid;
+}
+EXPORT_SYMBOL_GPL(nodemanager_addr_to_nodeid);
+
+int nodemanager_our_nodeid(void)
+{
+	int nodeid = NODEMANAGER_MAX_NODES;
+
+	if (single_cluster && single_cluster->cl_has_local)
+		nodeid = single_cluster->cl_local_node;
+
+	return nodeid;
+}
+EXPORT_SYMBOL_GPL(nodemanager_our_nodeid);
+
+u32 nodemanager_our_addr(void)
+{
+	u32 addr = 0;
+
+	if (single_cluster && single_cluster->cl_has_local)
+		addr = single_cluster->cl_local_addr;
+
+	return addr;
+}
+EXPORT_SYMBOL_GPL(nodemanager_our_addr);
+
+
+static ssize_t node_nodeid_read(struct node *node, char *page)
+{
+	return sprintf(page, "%d\n", node->nd_nodeid);
+}
+
+static ssize_t node_nodeid_write(struct node *node, const char *page,
+				 size_t count)
+{
+	struct cluster *cluster = node_to_cluster(node);
+	unsigned long tmp;
+	char *p = (char *)page;
+
+	tmp = simple_strtoul(p, &p, 0);
+	if (!p || (*p && (*p != '\n')))
+		return -EINVAL;
+
+	if (tmp >= NODEMANAGER_MAX_NODES)
+		return -ERANGE;
+
+	write_lock(&cluster->cl_nodes_lock);
+	if (cluster->cl_nodes[tmp])
+		p = NULL;
+	else  {
+		cluster->cl_nodes[tmp] = node;
+		node->nd_nodeid = tmp;
+	}
+	write_unlock(&cluster->cl_nodes_lock);
+	if (p == NULL)
+		return -EEXIST;
+
+	return count;
+}
+
+static ssize_t node_ipv4_address_read(struct node *node, char *page)
+{
+	return sprintf(page, "%u.%u.%u.%u\n", NIPQUAD(node->nd_ipv4_address));
+}
+
+static ssize_t node_ipv4_address_write(struct node *node, const char *page,
+				       size_t count)
+{
+	struct cluster *cluster = node_to_cluster(node);
+	int ret, i;
+	struct rb_node **p, *parent;
+	unsigned int octets[4];
+	u32 ipv4_addr = 0; /* network order */
+
+	ret = sscanf(page, "%3u.%3u.%3u.%3u", &octets[3], &octets[2],
+		     &octets[1], &octets[0]);
+	if (ret != 4)
+		return -EINVAL;
+
+	for (i = 0; i < ARRAY_SIZE(octets); i++) {
+		if (octets[i] > 255)
+			return -ERANGE;
+		ipv4_addr |= octets[i] << (i * 8);
+	}
+	ipv4_addr = htonl(ipv4_addr);
+
+	ret = 0;
+	write_lock(&cluster->cl_nodes_lock);
+	if (node_ip_tree_lookup(cluster, ipv4_addr, &p, &parent))
+		ret = -EEXIST;
+	else {
+		rb_link_node(&node->nd_ip_node, parent, p);
+		rb_insert_color(&node->nd_ip_node, &cluster->cl_node_ip_tree);
+	}
+	write_unlock(&cluster->cl_nodes_lock);
+	if (ret)
+		return ret;
+
+	memcpy(&node->nd_ipv4_address, &ipv4_addr, sizeof(ipv4_addr));
+
+	return count;
+}
+
+static ssize_t node_local_read(struct node *node, char *page)
+{
+	return sprintf(page, "%d\n", node->nd_local);
+}
+
+static ssize_t node_local_write(struct node *node, const char *page,
+				size_t count)
+{
+	struct cluster *cluster = node_to_cluster(node);
+	unsigned long tmp;
+	char *p = (char *)page;
+
+	tmp = simple_strtoul(p, &p, 0);
+	if (!p || (*p && (*p != '\n')))
+		return -EINVAL;
+
+	tmp = !!tmp; /* boolean of whether this node wants to be local */
+
+	/* the only failure case is trying to set a new local node
+	 * when a different one is already set */
+
+	if (tmp && tmp == cluster->cl_has_local &&
+	    cluster->cl_local_node != node->nd_nodeid)
+		return -EBUSY;
+
+	if (!tmp && cluster->cl_has_local &&
+	    cluster->cl_local_node == node->nd_nodeid) {
+		cluster->cl_local_node = 0;
+	}
+
+	node->nd_local = tmp;
+	if (node->nd_local) {
+		cluster->cl_has_local = tmp;
+		cluster->cl_local_node = node->nd_nodeid;
+		cluster->cl_local_addr = node->nd_ipv4_address;
+	}
+
+	return count;
+}
+
+static int attr_index(struct configfs_attribute *attr)
+{
+	int i;
+	for (i = 0; i < ARRAY_SIZE(node_attrs); i++) {
+		if (attr == node_attrs[i])
+			return i;
+	}
+	BUG();
+	return 0;
+}
+
+static ssize_t show_node(struct config_item *item,
+			 struct configfs_attribute *attr, char *page)
+{
+	struct node *node = to_node(item);
+	struct node_attribute *node_attr =
+			container_of(attr, struct node_attribute, attr);
+	ssize_t ret = 0;
+
+	if (node_attr->show)
+		ret = node_attr->show(node, page);
+	return ret;
+}
+
+static ssize_t store_node(struct config_item *item,
+			  struct configfs_attribute *attr,
+			  const char *page, size_t count)
+{
+	struct node *node = to_node(item);
+	struct node_attribute *node_attr =
+		container_of(attr, struct node_attribute, attr);
+	ssize_t ret;
+	int index = attr_index(attr);
+
+	if (node_attr->store == NULL) {
+		ret = -EINVAL;
+		goto out;
+	}
+
+	if (test_bit(index, &node->nd_set_attributes))
+		return -EBUSY;
+
+	ret = node_attr->store(node, page, count);
+	if (ret < count)
+		goto out;
+
+	set_bit(index, &node->nd_set_attributes);
+ out:
+	return ret;
+}
+
+static struct config_item *make_node(struct config_group *group,
+				     const char *name)
+{
+	struct node *node = NULL;
+	struct config_item *ret = NULL;
+
+	if (strlen(name) > NODEMANAGER_MAX_NAME_LEN)
+		goto out;
+
+	node = kcalloc(1, sizeof(struct node), GFP_KERNEL);
+	if (node == NULL)
+		goto out;
+
+	strcpy(node->nd_name, name); /* use item.ci_namebuf instead? */
+	config_item_init_type_name(&node->nd_item, name, &node_type);
+	spin_lock_init(&node->nd_lock);
+	idr_init(&node->nd_status_idr);
+	INIT_LIST_HEAD(&node->nd_status_list);
+
+	ret = &node->nd_item;
+ out:
+	if (ret == NULL)
+		kfree(node);
+
+	return ret;
+}
+
+static void drop_node(struct config_group *group, struct config_item *item)
+{
+	struct node *node = to_node(item);
+	struct cluster *cluster = to_cluster(group->cg_item.ci_parent);
+
+	if (cluster->cl_has_local &&
+	    (cluster->cl_local_node == node->nd_nodeid)) {
+		cluster->cl_has_local = 0;
+		cluster->cl_local_node = NODEMANAGER_MAX_NODES;
+	}
+
+	write_lock(&cluster->cl_nodes_lock);
+
+	if (node->nd_ipv4_address)
+		rb_erase(&node->nd_ip_node, &cluster->cl_node_ip_tree);
+
+	/* nd_nodeid might be 0 if the node number hasn't been set.. */
+	if (cluster->cl_nodes[node->nd_nodeid] == node)
+		cluster->cl_nodes[node->nd_nodeid] = NULL;
+
+	write_unlock(&cluster->cl_nodes_lock);
+
+	config_item_put(item);
+}
+
+static void release_node(struct config_item *item)
+{
+	struct node *node = to_node(item);
+	kfree(node);
+}
+
+/* a cluster object is created under the one root clusters object */
+
+static struct config_group *make_cluster(struct config_group *group,
+					 const char *name)
+{
+	struct cluster *cluster = NULL;
+	struct nodes *nodes = NULL;
+	struct config_group *ret = NULL;
+	void *defs = NULL;
+
+	if (single_cluster)
+		goto out;
+
+	cluster = kcalloc(1, sizeof(struct cluster), GFP_KERNEL);
+	nodes = kcalloc(1, sizeof(struct nodes), GFP_KERNEL);
+	defs = kcalloc(2, sizeof(struct config_group *), GFP_KERNEL);
+
+	if (!cluster || !nodes || !defs)
+		goto out;
+
+	config_group_init_type_name(&cluster->cl_group, name, &cluster_type);
+
+	config_group_init_type_name(&nodes->ns_group, "nodes", &nodes_type);
+
+	cluster->cl_group.default_groups = defs;
+	cluster->cl_group.default_groups[0] = &nodes->ns_group;
+	cluster->cl_group.default_groups[1] = NULL;
+	rwlock_init(&cluster->cl_nodes_lock);
+	cluster->cl_node_ip_tree = RB_ROOT;
+
+	ret = &cluster->cl_group;
+	single_cluster = cluster;
+ out:
+	if (ret == NULL) {
+		kfree(cluster);
+		kfree(nodes);
+		kfree(defs);
+	}
+
+	return ret;
+}
+
+static void drop_cluster(struct config_group *group, struct config_item *item)
+{
+	struct cluster *cluster = to_cluster(item);
+	struct config_item *killme;
+	int i;
+
+	BUG_ON(single_cluster != cluster);
+	single_cluster = NULL;
+
+	for (i = 0; cluster->cl_group.default_groups[i]; i++) {
+		killme = &cluster->cl_group.default_groups[i]->cg_item;
+		cluster->cl_group.default_groups[i] = NULL;
+		config_item_put(killme);
+	}
+
+	config_item_put(item);
+}
+
+static void release_cluster(struct config_item *item)
+{
+	struct cluster *cluster = to_cluster(item);
+	kfree(cluster->cl_group.default_groups);
+	kfree(cluster);
+}
+
+static int __init init_nodemanager(void)
+{
+	int ret = -1;
+
+	config_group_init(&clusters_root.cs_subsys.su_group);
+	init_MUTEX(&clusters_root.cs_subsys.su_sem);
+
+	ret = configfs_register_subsystem(&clusters_root.cs_subsys);
+	if (ret)
+		printk(KERN_ERR "nodemanager: Registration returned %d\n", ret);
+
+	return ret;
+}
+
+static void __exit exit_nodemanager(void)
+{
+	configfs_unregister_subsystem(&clusters_root.cs_subsys);
+}
+
+MODULE_AUTHOR("Oracle");
+MODULE_LICENSE("GPL");
+
+module_init(init_nodemanager)
+module_exit(exit_nodemanager)
+
diff -urN a/drivers/nodemanager/nodemanager.h b/drivers/nodemanager/nodemanager.h
--- a/drivers/nodemanager/nodemanager.h	1970-01-01 07:30:00.000000000 +0730
+++ b/drivers/nodemanager/nodemanager.h	2005-07-18 13:41:35.377583200 +0800
@@ -0,0 +1,37 @@
+/*
+ * nodemanager.h
+ *
+ * Copyright (C) 2004 Oracle.  All rights reserved.
+ * Copyright (C) 2005 Red Hat, Inc.  All rights reserved.
+ *
+ * This program is free software; you can redistribute it and/or
+ * modify it under the terms of the GNU General Public
+ * License as published by the Free Software Foundation; either
+ * version 2 of the License, or (at your option) any later version.
+ * 
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+ * General Public License for more details.
+ * 
+ * You should have received a copy of the GNU General Public
+ * License along with this program; if not, write to the
+ * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
+ * Boston, MA 021110-1307, USA.
+ *
+ */
+
+#ifndef NODEMANAGER_H
+#define NODEMANAGER_H
+
+#define NODEMANAGER_MAX_NODES		255
+#define NODEMANAGER_INVALID_NODE_NUM	255
+#define NODEMANAGER_MAX_NAME_LEN	__NEW_UTS_LEN	/* 64 */
+
+u32 nodemanager_nodeid_to_addr(int nodeid);
+int nodemanager_addr_to_nodeid(u32 addr);
+int nodemanager_our_nodeid(void);
+u32 nodemanager_our_addr(void);
+
+#endif
+


From pcaulfie at redhat.com  Mon Jul 18 08:10:30 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Mon, 18 Jul 2005 09:10:30 +0100
Subject: [Linux-cluster] Where to go with cman ?
Message-ID: <42DB63F6.5070600@redhat.com>

As I see it there are two things we can do with userland cman that's current in
the head of CVS:

1. Leave it as it is - a port of the kernel one. This has some benefits: it's
easy (plus a few bug fixes that need to go in), it's protocol-compatible with
the kernel one. There are a small number of extra features that could go in
there (that would, annoyingly, break that compatibility) but nothing really
serious. It doesn't give us anything new, but what new is neeed ?

2. Migrate it to something much more sophisticated. I've mentioned Virtual
Synchrony a few times before and I've been looking into this in some detail
since. The benefits are largely internal but they do provide a reliable, robust
and well-performing messaging system that other cluster subsystems can use.
While the application programmers at the cluster summit maintained they had no
use for a cluster messaging system, I still believe that it is a useful thing to
have at a lower level - if only for our own programming needs. I know that Jon
looked into the existing cman messaging system before rejecting it as too slow
and unreliable for he needs of the cluster mirroring code.

There are two suboptions here.
  a) write it ourself. Quite a big job this. Bigger than I would like. To be
honest I did make a start at this and now realise just what a huge job it is to
get something that both performs well and is reliable. REALLY reliable. even
worse if the academics want something provably reliable.
   b) adopt something else. The obvious candidate here is the openAIS code[1].
This looks to be quite mature now and has all the features we need of a low
level messaging system. It's very nicely abstracted out so we can pick out just
the bits we need without having the whole (rather heavyweight) system on top of it.

The one problem with the openAIS code is that it doesn't support IPv6, and much
of the code is tied to IPv4. Having had a look at it and emailed Steven Dake
about this he reckons it's about 2 weeks work to add.[2]

The advantages of doing this are several.
- It saves time. We get something that is known to work, even though it needs
extra features added for our own use.
- we're not inventing something new that already exists in several other places.
- we get more people who know the code. Currently only I know the internals of
cman as it stands and it's quite scary code that people don't want to get
involved with (we've have several DLM patches in the past, but no CMAN ones).
This way we get at least 2 (Steven and me) as well as anyone else who is
following openAIS. Of course there will be CMAN-specific stuff on top of their
comms layer to make it quorum-based and capable of supporting GFS and DLM that
will be Red Hat specific but these are not going to be large.
- the APIs are all open (based on SAforum specifications) and already
implemented. Although adding saCLM to CMAN is pretty easy as I proved last week.

The disadvantages are
- Need to learn the internals of someone else's code.
- We don't have full control over the code. Although we can obviously fork it if
we feel the need it would, obviously be preferable not to.
- non-compatibility with "old" cman, making rolling upgrades har or even
impossible. I'm not sure what to do about this yet, but it's worth pointing out
that the DLM has a new line-protocol too.
- openAIS is BSD licensed, I don't think this is a problem but it probably needs
checking.

In short, I'm advocating adopting the openAIS core (libtotem basically) as
CMAN's communications/membership protocol. If we're going to do a "CMAN V2" that
has anything significant over V1 then re-inventing it is going to be a huge
amount of work that someone else has already done.

Comments?

Patrick


From afontes at raufer.com.br  Wed Jul 13 14:56:12 2005
From: afontes at raufer.com.br (Luis Alexandre Fontes)
Date: Wed, 13 Jul 2005 14:56:12 -0000
Subject: [Linux-cluster] Power switch
Message-ID: <200507131757.j6DHv38B026210@oberon.raufer.com.br>

An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050713/c50699fe/attachment.htm>

From afontes at raufer.com.br  Thu Jul 14 14:18:51 2005
From: afontes at raufer.com.br (Luis Alexandre Fontes)
Date: Thu, 14 Jul 2005 14:18:51 -0000
Subject: [Linux-cluster] Linux-cluster
Message-ID: <200507141719.j6EHJgRK013417@oberon.raufer.com.br>

An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050714/7d1765f0/attachment.htm>

From jharr at opsource.net  Mon Jul 18 14:49:48 2005
From: jharr at opsource.net (Jeff Harr)
Date: Mon, 18 Jul 2005 10:49:48 -0400
Subject: [Linux-cluster] Redhat Cluster 4 security
Message-ID: <38A48FA2F0103444906AD22E14F1B5A34B5EA4@mailxchg01.corp.opsource.net>

Hi all, I just installed Redhat Cluster 4 and it is finally up and
working, more or less problem free.  During the install and all
throughout the experience, I don't believe I was ever prompted for a
root password or other type of authentication when forming my cluster of
2 machines.  Is there a default password of some kind?  If so I'd like
to change it.  It seems that if I could form a cluster without using a
password then so can anyone, possibly adding a service with a script
like "rm -rf *" or something sinister.

 
Thanks in advance,

Jeff

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050718/5ca7b676/attachment.htm>

From jharr at opsource.net  Mon Jul 18 14:52:21 2005
From: jharr at opsource.net (Jeff Harr)
Date: Mon, 18 Jul 2005 10:52:21 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
Message-ID: <38A48FA2F0103444906AD22E14F1B5A34B5EA3@mailxchg01.corp.opsource.net>

Hello all, sorry for the back to back questions but I've just installed
Redhat Cluster 4 for the first time and have but a single problem with
the way the cluster works.  If I cause a failover (for example, by
bringing my bonded interface down), the fencing does its job by
attempting to power down the proper node.  However, the shutdown gets
halted by an endless loop of "kernel: CMANsendmsg failed: -101"
messages.

 
I've looked through google and noted that this can be caused by missing
"cman" shutdown scripts.  I've got ALL of my startup/shutdown scripts
for all the various run levels so that doesn't seem to be the problem.

 
Any help appreciated!

Jeff

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050718/fd1a330b/attachment.htm>

From pcaulfie at redhat.com  Mon Jul 18 15:12:58 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Mon, 18 Jul 2005 16:12:58 +0100
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <38A48FA2F0103444906AD22E14F1B5A34B5EA3@mailxchg01.corp.opsource.net>
References: <38A48FA2F0103444906AD22E14F1B5A34B5EA3@mailxchg01.corp.opsource.net>
Message-ID: <42DBC6FA.8080509@redhat.com>

Jeff Harr wrote:
> Hello all, sorry for the back to back questions but I?ve just installed
> Redhat Cluster 4 for the first time and have but a single problem with
> the way the cluster works.  If I cause a failover (for example, by
> bringing my bonded interface down), the fencing does its job by
> attempting to power down the proper node.  However, the shutdown gets
> halted by an endless loop of ?kernel: CMANsendmsg failed: -101? messages.
> 
>  
> 
> I?ve looked through google and noted that this can be caused by missing
> ?cman? shutdown scripts.  I?ve got ALL of my startup/shutdown scripts
> for all the various run levels so that doesn?t seem to be the problem.
> 

It is nearly always caused by cman not being shut down when the system is shut
down. Are there any errors displayed when the cman shutdown script is run ?

-- 

patrick


From jharr at opsource.net  Mon Jul 18 15:26:02 2005
From: jharr at opsource.net (Jeff Harr)
Date: Mon, 18 Jul 2005 11:26:02 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
Message-ID: <38A48FA2F0103444906AD22E14F1B5A35437CE@mailxchg01.corp.opsource.net>

First of all, thanks for the reply.  I've run "service cman stop" so
many times my fingers hurt and I've never had a problem.  I've never run
/etc/init.d/cman stop before and its going to be difficult for me to do
it atm.  I suppose if I run it and it outputs something dreadful then
I've found my answer :)  Thanks for the response...at least I have
something better than a google search to confirm the most likely
problem.

Jeff

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Patrick Caulfield
Sent: Monday, July 18, 2005 11:13 AM
To: linux clustering
Subject: Re: [Linux-cluster] kernel: CMANsendmsg failed: -101

Jeff Harr wrote:
> Hello all, sorry for the back to back questions but I've just
installed
> Redhat Cluster 4 for the first time and have but a single problem with
> the way the cluster works.  If I cause a failover (for example, by
> bringing my bonded interface down), the fencing does its job by
> attempting to power down the proper node.  However, the shutdown gets
> halted by an endless loop of "kernel: CMANsendmsg failed: -101"
messages.
> 
>  
> 
> I've looked through google and noted that this can be caused by
missing
> "cman" shutdown scripts.  I've got ALL of my startup/shutdown scripts
> for all the various run levels so that doesn't seem to be the problem.
> 

It is nearly always caused by cman not being shut down when the system
is shut
down. Are there any errors displayed when the cman shutdown script is
run ?

-- 

patrick

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster


From adingman at cook-inc.com  Mon Jul 18 16:11:08 2005
From: adingman at cook-inc.com (Andrew C. Dingman)
Date: Mon, 18 Jul 2005 11:11:08 -0500
Subject: [Linux-cluster] Power switch
In-Reply-To: <200507131757.j6DHv38B026210@oberon.raufer.com.br>
References: <200507131757.j6DHv38B026210@oberon.raufer.com.br>
Message-ID: <1121703068.21291.6.camel@adingman.cin.cook>

I've got a handful of AP7901 switches that I like. The fencing agent is
very brittle with regard to configuration changes on the 7901, though,
because the menus can change as a result of permission changes. I ended
up writing my own which I think is a bit more tolerant of different
switch configurations. I *know* it's more tolerant of the switch
configuration I use. I posted that fencing agent here ages ago, I think,
and never got a response, but I'll post it again if anyone is
interested.

I've also heard that WTI switches are good, but I have no experience
with them.

On Wed, 2005-07-13 at 14:56 +0000, Luis Alexandre Fontes wrote:
> We want to buy a power switch from APC for using with clumanager.
> What model do you recommend?
> AP9225? AP7900?
> Any suggestions?
> 
> Thanks in advance,
> 
> Alex.
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster


From blaster at kofler.eu.org  Mon Jul 18 16:43:22 2005
From: blaster at kofler.eu.org (blaster at kofler.eu.org)
Date: Mon, 18 Jul 2005 18:43:22 +0200
Subject: AW: [Linux-cluster] Power switch
In-Reply-To: <1121703068.21291.6.camel@adingman.cin.cook>
Message-ID: <20050718164347.GFIY4473.viefep19-int.chello.at@TheCenter>

Hi,

I would be really interested into the fencing agent. Could you post it again
?
So, every power switch should be usable, if the agent is adjusted probably ?

Thomas


-----Urspr?ngliche Nachricht-----
Von: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] Im Auftrag von Andrew C. Dingman
Gesendet: Montag, 18. Juli 2005 18:11
An: linux clustering; Luis Alexandre Fontes
Betreff: Re: [Linux-cluster] Power switch

I've got a handful of AP7901 switches that I like. The fencing agent is
very brittle with regard to configuration changes on the 7901, though,
because the menus can change as a result of permission changes. I ended
up writing my own which I think is a bit more tolerant of different
switch configurations. I *know* it's more tolerant of the switch
configuration I use. I posted that fencing agent here ages ago, I think,
and never got a response, but I'll post it again if anyone is
interested.

I've also heard that WTI switches are good, but I have no experience
with them.

On Wed, 2005-07-13 at 14:56 +0000, Luis Alexandre Fontes wrote:
> We want to buy a power switch from APC for using with clumanager.
> What model do you recommend?
> AP9225? AP7900?
> Any suggestions?
> 
> Thanks in advance,
> 
> Alex.
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster


From eric at bootseg.com  Mon Jul 18 17:51:22 2005
From: eric at bootseg.com (Eric Kerin)
Date: Mon, 18 Jul 2005 13:51:22 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <38A48FA2F0103444906AD22E14F1B5A35437CE@mailxchg01.corp.opsource.net>
References: <38A48FA2F0103444906AD22E14F1B5A35437CE@mailxchg01.corp.opsource.net>
Message-ID: <1121709082.4875.10.camel@auh5-0478>

On Mon, 2005-07-18 at 11:26 -0400, Jeff Harr wrote:
> First of all, thanks for the reply.  I've run "service cman stop" so
> many times my fingers hurt and I've never had a problem.  I've never run
> /etc/init.d/cman stop before and its going to be difficult for me to do
> it atm.  I suppose if I run it and it outputs something dreadful then
> I've found my answer :)  Thanks for the response...at least I have
> something better than a google search to confirm the most likely
> problem.
> 
> Jeff

One thing to check, make sure you have the Kill entries in the rc0 and
rc6 directories:  ls /etc/rc.d/rc0.d/K??cman

Also check that cman is set to stop before the network interfaces are
taken down.  (Mine is set to stop cman at 79 and network at 90.)

If you want to reset cman's startup/shutdown scripts back to their
default settings, use "chkconfig --del cman", and then "chkconfig --add
cman".

Thanks, 
Eric Kerin <eric at bootseg.com>


From vmoravek at atlas.cz  Mon Jul 18 18:01:19 2005
From: vmoravek at atlas.cz (vmoravek at atlas.cz)
Date: Mon, 18 Jul 2005 20:01:19 +0200
Subject: [Linux-cluster] gfs 6.1 RHEL4 - gulm problem
Message-ID: <bf4c3d7fe7e445f5bee6e785c914e24e@atlas.cz>

Hi All, 
for one our customer I configure gfs cluster on gulm locking base.

But i have problem with lock_gulmd starting script, did you seen this message before??

this is  from /var/log/messages after:

service ccsd start
service lock_gulmd start

My cluster.conf is below

Thaks for any help.

Vojtech
 
......
Jul 18 19:56:19 gfs-samba1 lock_gulmd_main[9444]: Forked lock_gulmd_core.
Jul 18 19:56:19 gfs-samba1 lock_gulmd_core[9452]: Starting lock_gulmd_core 1.0.0. (built Jun  7 2005 12:34:15) Copyright (C) 2004 Red Hat, Inc.  All rights reserved.
Jul 18 19:56:19 gfs-samba1 lock_gulmd_core[9452]: I am running in Standard mode.
Jul 18 19:56:19 gfs-samba1 lock_gulmd_core[9452]: I am (gfs-samba1) with ip (::ffff:192.168.20.1)
Jul 18 19:56:19 gfs-samba1 lock_gulmd_core[9452]: This is cluster jouza
Jul 18 19:56:19 gfs-samba1 lock_gulmd_core[9452]: In src/core_main.c:200 (1.0.0) death by: open listener failed. 97:Addressfamily not supported by protocol
Jul 18 19:56:20 gfs-samba1 lock_gulmd_main[9444]: Forked lock_gulmd_LT.
Jul 18 19:56:20 gfs-samba1 lock_gulmd_LT[9460]: Starting lock_gulmd_LT 1.0.0. (built Jun  7 2005 12:34:15) Copyright (C) 2004 Red Hat, Inc.  All rights reserved.
Jul 18 19:56:20 gfs-samba1 lock_gulmd_LT[9460]: I am running in Standard mode.
Jul 18 19:56:20 gfs-samba1 lock_gulmd_LT[9460]: I am (gfs-samba1) with ip (::ffff:192.168.20.1)
Jul 18 19:56:20 gfs-samba1 lock_gulmd_LT[9460]: This is cluster jouza
Jul 18 19:56:20 gfs-samba1 lock_gulmd_LT000[9460]: ERROR [src/lock_io.c:520] Failed to create socket. 97:Address family notsupported by protocol
Jul 18 19:56:21 gfs-samba1 lock_gulmd_main[9444]: Forked lock_gulmd_LTPX.
Jul 18 19:56:21 gfs-samba1 lock_gulmd_LTPX[9472]: Starting lock_gulmd_LTPX 1.0.0. (built Jun  7 2005 12:34:15) Copyright (C) 2004 Red Hat, Inc.  All rights reserved.
Jul 18 19:56:21 gfs-samba1 lock_gulmd_LTPX[9472]: I am running in Standard mode.
Jul 18 19:56:21 gfs-samba1 lock_gulmd_LTPX[9472]: I am (gfs-samba1) with ip (::ffff:192.168.20.1)
Jul 18 19:56:21 gfs-samba1 lock_gulmd_LTPX[9472]: This is cluster jouza
Jul 18 19:56:21 gfs-samba1 lock_gulmd_LTPX[9472]: ERROR [src/ltpx_io.c:431] Failed to create socket. 97:Address family not supported by protocol
...


cluster.conf:

<?xml version="1.0" ?>
<cluster config_version="6" name="jouza">
        <fence_daemon clean_start="0" post_fail_delay="0" post_join_delay="3"/>
        <clusternodes>
                <clusternode name="gfs-samba1">
                        <fence>
                                <method name="1">
                                        <device name="gfs_man"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="gfs-samba2">
                        <fence>
                                <method name="1">
                                        <device name="gfs_man"/>
                                </method>
                        </fence>
                </clusternode>
                <clusternode name="gfs-fake">
                        <fence>
                                <method name="1">
                                        <device name="gfs_man"/>
                                </method>
                        </fence>
                </clusternode>
        </clusternodes>
        <fencedevices>
                <fencedevice agent="fence_manual" name="gfs_man"/>
        </fencedevices>
        <rm>
                <failoverdomains/>
                <resources/>
        </rm>
        <gulm>
                <lockserver name="gfs-samba1"/>
        </gulm>
</cluster>


From jharr at opsource.net  Mon Jul 18 18:06:03 2005
From: jharr at opsource.net (Jeff Harr)
Date: Mon, 18 Jul 2005 14:06:03 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
Message-ID: <38A48FA2F0103444906AD22E14F1B5A3543855@mailxchg01.corp.opsource.net>

Thanks for the help, Eric.  Its interesting that you mentioned taking
down cman BEFORE taking the interfaces down, because when I was testing
my failover I did: ifdown bond0.  My thinking was that the heartbeat
would die and everything would work.  It didn't occur to me that it
would mess up cman - do you think that's what's doing it?  Should I
instead just pull the cables? (I'm asking because it's a long drive to
the site just for the test, but will if you think that's the problem).

Thanks again,
Jeff

-----Original Message-----
From: Eric Kerin [mailto:eric at bootseg.com] 
Sent: Monday, July 18, 2005 1:51 PM
To: Jeff Harr
Cc: linux clustering
Subject: RE: [Linux-cluster] kernel: CMANsendmsg failed: -101

On Mon, 2005-07-18 at 11:26 -0400, Jeff Harr wrote:
> First of all, thanks for the reply.  I've run "service cman stop" so
> many times my fingers hurt and I've never had a problem.  I've never
run
> /etc/init.d/cman stop before and its going to be difficult for me to
do
> it atm.  I suppose if I run it and it outputs something dreadful then
> I've found my answer :)  Thanks for the response...at least I have
> something better than a google search to confirm the most likely
> problem.
> 
> Jeff

One thing to check, make sure you have the Kill entries in the rc0 and
rc6 directories:  ls /etc/rc.d/rc0.d/K??cman

Also check that cman is set to stop before the network interfaces are
taken down.  (Mine is set to stop cman at 79 and network at 90.)

If you want to reset cman's startup/shutdown scripts back to their
default settings, use "chkconfig --del cman", and then "chkconfig --add
cman".

Thanks, 
Eric Kerin <eric at bootseg.com>


From eric at bootseg.com  Mon Jul 18 18:26:17 2005
From: eric at bootseg.com (Eric Kerin)
Date: Mon, 18 Jul 2005 14:26:17 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <38A48FA2F0103444906AD22E14F1B5A3543855@mailxchg01.corp.opsource.net>
References: <38A48FA2F0103444906AD22E14F1B5A3543855@mailxchg01.corp.opsource.net>
Message-ID: <1121711177.4875.19.camel@auh5-0478>

On Mon, 2005-07-18 at 14:06 -0400, Jeff Harr wrote:
> Thanks for the help, Eric.  Its interesting that you mentioned taking
> down cman BEFORE taking the interfaces down, because when I was testing
> my failover I did: ifdown bond0.  My thinking was that the heartbeat
> would die and everything would work.  It didn't occur to me that it
> would mess up cman - do you think that's what's doing it?  Should I
> instead just pull the cables? (I'm asking because it's a long drive to
> the site just for the test, but will if you think that's the problem).
> 
> Thanks again,
> Jeff
> 

Well normally when a machine is fenced, it is not shut down, just
powered off.  So that would definatly be the problem if the network
interface is still down when you are shutting down the system.  How are
you issuing commands to the servers after taking down the network
interface if you're not on site?

I take it you aren't using power controllers to fence the machines, just
using fence_manual, or an I/O fencing mechanism?

Thanks, 
Eric Kerin <eric at bootseg.com>


From mtilstra at redhat.com  Mon Jul 18 18:33:39 2005
From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra)
Date: Mon, 18 Jul 2005 13:33:39 -0500
Subject: [Linux-cluster] gfs 6.1 RHEL4 - gulm problem
In-Reply-To: <bf4c3d7fe7e445f5bee6e785c914e24e@atlas.cz>
References: <bf4c3d7fe7e445f5bee6e785c914e24e@atlas.cz>
Message-ID: <20050718183339.GA27958@redhat.com>

On Mon, Jul 18, 2005 at 08:01:19PM +0200, vmoravek at atlas.cz wrote:
> Hi All, 
> for one our customer I configure gfs cluster on gulm locking base.
> 
> But i have problem with lock_gulmd starting script, did you seen this
> message before??
> Jul 18 19:56:19 gfs-samba1 lock_gulmd_core[9452]: In src/core_main.c:200 (1.0.0) death by: open listener failed. 97:Addressfamily not supported by protocol

do a "modprobe ipv6"

-- 
Michael Conrad Tadpol Tilstra
To be, or not to be, those are the parameters.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050718/dacf7936/attachment.sig>

From jharr at opsource.net  Mon Jul 18 18:54:19 2005
From: jharr at opsource.net (Jeff Harr)
Date: Mon, 18 Jul 2005 14:54:19 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
Message-ID: <38A48FA2F0103444906AD22E14F1B5A3543881@mailxchg01.corp.opsource.net>

Well, its actually on a Proliant blade and I'm going in through a
virtual console (an ILO) to issue the ifdown commands.  I'm doing
power-fencing on the ILO itself (its one of the ways you can do it with
Redhat Cluster). 

-----Original Message-----
From: Eric Kerin [mailto:eric at bootseg.com] 
Sent: Monday, July 18, 2005 2:26 PM
To: Jeff Harr
Cc: linux clustering
Subject: RE: [Linux-cluster] kernel: CMANsendmsg failed: -101

On Mon, 2005-07-18 at 14:06 -0400, Jeff Harr wrote:
> Thanks for the help, Eric.  Its interesting that you mentioned taking
> down cman BEFORE taking the interfaces down, because when I was
testing
> my failover I did: ifdown bond0.  My thinking was that the heartbeat
> would die and everything would work.  It didn't occur to me that it
> would mess up cman - do you think that's what's doing it?  Should I
> instead just pull the cables? (I'm asking because it's a long drive to
> the site just for the test, but will if you think that's the problem).
> 
> Thanks again,
> Jeff
> 

Well normally when a machine is fenced, it is not shut down, just
powered off.  So that would definatly be the problem if the network
interface is still down when you are shutting down the system.  How are
you issuing commands to the servers after taking down the network
interface if you're not on site?

I take it you aren't using power controllers to fence the machines, just
using fence_manual, or an I/O fencing mechanism?

Thanks, 
Eric Kerin <eric at bootseg.com>


From clusterbuilder at gmail.com  Mon Jul 18 21:56:49 2005
From: clusterbuilder at gmail.com (Nick I)
Date: Mon, 18 Jul 2005 15:56:49 -0600
Subject: [Linux-cluster] Cluster links
Message-ID: <e073f9120507181456678f0953@mail.gmail.com>

Hi,

I work on a Web site at www.clusterbuilder.org.  The site highlights a
broad spectrum of high performance computing related technologies.  It
has links to reviews, comparisons, pricing and other information
related to the many HPC solutions available. The site also allows for
you to complete one request-for-quote form and have multiple hardware
vendors of your choice provide pricing quotes to you.

We need knowledgeable cluster users to submit sites that have proved a
valuable resources to them.  We want to make sure that the sections
are complete as possible. Please take a minute and submit additional
links associated with Cluster technologies or other areas that will
benefit the HPC community. (To submit content for the site, click on
the Submit Content section on www.clusterbuilder.org).

Thank you for your help.

Nick


From pcaulfie at redhat.com  Tue Jul 19 07:41:17 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Tue, 19 Jul 2005 08:41:17 +0100
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <1121711177.4875.19.camel@auh5-0478>
References: <38A48FA2F0103444906AD22E14F1B5A3543855@mailxchg01.corp.opsource.net>
	<1121711177.4875.19.camel@auh5-0478>
Message-ID: <42DCAE9D.10200@redhat.com>

I've raised a bugzilla report for this. it's low priority because there is
usually a workaround, but it's a pretty tatty message!

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=163587
-- 

patrick


From jharr at opsource.net  Tue Jul 19 14:21:28 2005
From: jharr at opsource.net (Jeff Harr)
Date: Tue, 19 Jul 2005 10:21:28 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
Message-ID: <38A48FA2F0103444906AD22E14F1B5A35439C1@mailxchg01.corp.opsource.net>

Interesting, Patrick.  I appreciate it - glad I'm not the only one.

Jeff

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Patrick Caulfield
Sent: Tuesday, July 19, 2005 3:41 AM
To: linux clustering
Subject: Re: [Linux-cluster] kernel: CMANsendmsg failed: -101

I've raised a bugzilla report for this. it's low priority because there
is
usually a workaround, but it's a pretty tatty message!

https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=163587
-- 

patrick

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster


From lmb at suse.de  Tue Jul 19 15:52:14 2005
From: lmb at suse.de (Lars Marowsky-Bree)
Date: Tue, 19 Jul 2005 17:52:14 +0200
Subject: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
In-Reply-To: <20050718061553.GA9568@redhat.com>
References: <20050718061553.GA9568@redhat.com>
Message-ID: <20050719155214.GG13246@marowsky-bree.de>

On 2005-07-18T14:15:53, David Teigland <teigland at redhat.com> wrote:

> Some of the comments about the dlm concerned how it's configured (from
> user space.)  In particular, there was interest in seeing the dlm and
> ocfs2 use common methods for their configuration.
> 
> The first area I'm looking at is how we get addresses/ids of other nodes.
> Currently, the dlm uses an ioctl on a misc device and ocfs2 uses a
> separate kernel module called "ocfs2_nodemanager" that's based on
> configfs.
> 
> I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
> it (removing ocfs-specific stuff).  It still needs some work, but I'd like
> to know if this appeals to the ocfs group and to others who were
> interested in seeing some similarity in dlm/ocfs configuration.

Hi Dave, I finally found time to read through this.

Yes, I most definetely like where this is going!

> +/* TODO:
> +   - generic addresses (IPV4/6)
> +   - multiple addresses per node

The nodeid, I thought, was relative to a given DLM namespace, no? This
concept seems to be missing here, or are you suggesting the nodeid to be
global across namespaces?

Also, eventually we obviously need to have state for the nodes - up/down
et cetera. I think the node manager also ought to track this.

How would kernel components use this and be notified about changes to
the configuration / membership state?


Sincerely,
    Lars Marowsky-Br?e <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business	 -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"


From phillips at redhat.com  Tue Jul 19 17:19:50 2005
From: phillips at redhat.com (Daniel Phillips)
Date: Wed, 20 Jul 2005 03:19:50 +1000
Subject: [Linux-cluster] [RFC] nodemanager, ocfs2, dlm
In-Reply-To: <20050718061553.GA9568@redhat.com>
References: <20050718061553.GA9568@redhat.com>
Message-ID: <200507200319.51000.phillips@redhat.com>

On Monday 18 July 2005 16:15, David Teigland wrote:
> I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
> it (removing ocfs-specific stuff).  It still needs some work, but I'd
> like to know if this appeals to the ocfs group and to others who were
> interested in seeing some similarity in dlm/ocfs configuration.

Let me get this straight.  The proposal is to expose cluster membership as a 
virtual filesystem and use that as the primary membership interface?  So 
that, e.g., a server on the cluster does a getdents to find out what nodes 
are in the cluster or uses inotify to learn about membership changes, 
instead of subscribing for and receiving membership events directly from 
the cluster membership manager?

Or what is this about, just providing a nice friendly view of the cluster to 
the administrator, not intended to be used by cluster infrastructure 
components?

Regards,

Daniel


From lhh at redhat.com  Tue Jul 19 20:22:33 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 19 Jul 2005 16:22:33 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <1121711177.4875.19.camel@auh5-0478>
References: <38A48FA2F0103444906AD22E14F1B5A3543855@mailxchg01.corp.opsource.net>
	<1121711177.4875.19.camel@auh5-0478>
Message-ID: <1121804553.14785.34.camel@ayanami.boston.redhat.com>

On Mon, 2005-07-18 at 14:26 -0400, Eric Kerin wrote:
> On Mon, 2005-07-18 at 14:06 -0400, Jeff Harr wrote:
> > Thanks for the help, Eric.  Its interesting that you mentioned taking
> > down cman BEFORE taking the interfaces down, because when I was testing
> > my failover I did: ifdown bond0.  My thinking was that the heartbeat
> > would die and everything would work.  It didn't occur to me that it
> > would mess up cman - do you think that's what's doing it?  Should I
> > instead just pull the cables? (I'm asking because it's a long drive to
> > the site just for the test, but will if you think that's the problem).
> > 
> > Thanks again,
> > Jeff
> > 
> 
> Well normally when a machine is fenced, it is not shut down, just
> powered off.  So that would definatly be the problem if the network
> interface is still down when you are shutting down the system.  How are
> you issuing commands to the servers after taking down the network
> interface if you're not on site?
> 
> I take it you aren't using power controllers to fence the machines, just
> using fence_manual, or an I/O fencing mechanism?

Jeff - 

Here's one way to avoid the problem.  Try this next time:

    reboot -fn

I wouldn't expect much of anything to work after the the network and SAN
paths just got turned off, so I'm not surprised it hangs during
shutdown.

(That said, CMAN probably shouldn't spin forever trying to call
sendmsg.)

-- Lon


From lhh at redhat.com  Tue Jul 19 20:25:54 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 19 Jul 2005 16:25:54 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <38A48FA2F0103444906AD22E14F1B5A3543881@mailxchg01.corp.opsource.net>
References: <38A48FA2F0103444906AD22E14F1B5A3543881@mailxchg01.corp.opsource.net>
Message-ID: <1121804754.14785.38.camel@ayanami.boston.redhat.com>

On Mon, 2005-07-18 at 14:54 -0400, Jeff Harr wrote:
> Well, its actually on a Proliant blade and I'm going in through a
> virtual console (an ILO) to issue the ifdown commands.  I'm doing
> power-fencing on the ILO itself (its one of the ways you can do it with
> Redhat Cluster). 

Eeeeek!

Ok, so, it's trying to be fenced, but it's hanging in shutdown.  The
fencing is causing the shutdown, right?

        chkconfig --del acpid
        service acpid stop

With ACPID running, the virtual power button press gets translated to
"shutdown -h now" instead of "power off NOW", which is what the cluster
wants.

-- Lon


From jharr at opsource.net  Tue Jul 19 20:27:22 2005
From: jharr at opsource.net (Jeff Harr)
Date: Tue, 19 Jul 2005 16:27:22 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
Message-ID: <38A48FA2F0103444906AD22E14F1B5A3543AC1@mailxchg01.corp.opsource.net>

Thanks for the reply, Lon.  I'm not sure why cman would spin forever
considering that ILO is one of the configurable power fences, and it is
supposed to turn the server off in case of a network disconnect.  The
SAN stays up, by the way.

Regarding shutdowns: I'm not actually performing a shutdown - ILO is.

Thanks again,
Jeff

-----Original Message-----
From: Lon Hohberger [mailto:lhh at redhat.com] 
Sent: Tuesday, July 19, 2005 4:23 PM
To: linux clustering
Cc: Jeff Harr
Subject: RE: [Linux-cluster] kernel: CMANsendmsg failed: -101

On Mon, 2005-07-18 at 14:26 -0400, Eric Kerin wrote:
> On Mon, 2005-07-18 at 14:06 -0400, Jeff Harr wrote:
> > Thanks for the help, Eric.  Its interesting that you mentioned
taking
> > down cman BEFORE taking the interfaces down, because when I was
testing
> > my failover I did: ifdown bond0.  My thinking was that the heartbeat
> > would die and everything would work.  It didn't occur to me that it
> > would mess up cman - do you think that's what's doing it?  Should I
> > instead just pull the cables? (I'm asking because it's a long drive
to
> > the site just for the test, but will if you think that's the
problem).
> > 
> > Thanks again,
> > Jeff
> > 
> 
> Well normally when a machine is fenced, it is not shut down, just
> powered off.  So that would definatly be the problem if the network
> interface is still down when you are shutting down the system.  How
are
> you issuing commands to the servers after taking down the network
> interface if you're not on site?
> 
> I take it you aren't using power controllers to fence the machines,
just
> using fence_manual, or an I/O fencing mechanism?

Jeff - 

Here's one way to avoid the problem.  Try this next time:

    reboot -fn

I wouldn't expect much of anything to work after the the network and SAN
paths just got turned off, so I'm not surprised it hangs during
shutdown.

(That said, CMAN probably shouldn't spin forever trying to call
sendmsg.)

-- Lon


From jharr at opsource.net  Tue Jul 19 20:28:01 2005
From: jharr at opsource.net (Jeff Harr)
Date: Tue, 19 Jul 2005 16:28:01 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
Message-ID: <38A48FA2F0103444906AD22E14F1B5A3543AC2@mailxchg01.corp.opsource.net>

Hey, there's a new suggestion :)  Thanks, I'll try that.

Jeff

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Lon Hohberger
Sent: Tuesday, July 19, 2005 4:26 PM
To: linux clustering
Subject: RE: [Linux-cluster] kernel: CMANsendmsg failed: -101

On Mon, 2005-07-18 at 14:54 -0400, Jeff Harr wrote:
> Well, its actually on a Proliant blade and I'm going in through a
> virtual console (an ILO) to issue the ifdown commands.  I'm doing
> power-fencing on the ILO itself (its one of the ways you can do it
with
> Redhat Cluster). 

Eeeeek!

Ok, so, it's trying to be fenced, but it's hanging in shutdown.  The
fencing is causing the shutdown, right?

        chkconfig --del acpid
        service acpid stop

With ACPID running, the virtual power button press gets translated to
"shutdown -h now" instead of "power off NOW", which is what the cluster
wants.

-- Lon

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster


From lhh at redhat.com  Tue Jul 19 20:28:34 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 19 Jul 2005 16:28:34 -0400
Subject: [Linux-cluster] List of supported apc Masterswitches
In-Reply-To: <20050717221644.DPNR4584.viefep18-int.chello.at@TheCenter>
References: <20050717221644.DPNR4584.viefep18-int.chello.at@TheCenter>
Message-ID: <1121804914.14785.42.camel@ayanami.boston.redhat.com>

On Mon, 2005-07-18 at 00:16 +0200, blaster at kofler.eu.org wrote:
> Hello,
> 
> we are interested to implement the GFS 6.1 with the fencing using an apc
> Masterswitch.
> 
> But which models are supported - is there any list available?
> 
> We would be especially interested to use the APC Masterswitch AP9222 and the
> Management Card AP9606.

We have not tested the AP9222, but I think we've tested the 9225 + 9606.
Please let us know if the AP9222 works for you.  If it doesn't, please
send a typescript of the following in both Outlet User and Administrator
modes:

- reboot a port
- power off a port
- power on a port

Please send a raw typescript, if possible (see "man script").  You can
change your IPs and passwords, but please leave everything else the
same.

-- Lon


From jharr at opsource.net  Tue Jul 19 21:19:49 2005
From: jharr at opsource.net (Jeff Harr)
Date: Tue, 19 Jul 2005 17:19:49 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
Message-ID: <38A48FA2F0103444906AD22E14F1B5A3543AF2@mailxchg01.corp.opsource.net>

Ok, that fixed it.  Actually, stopping acpid caused the system to never
receive any power off message.  But you got me on the right track of
thinking and I modified the config file to read "poweroff -f" and now it
turns off.  It turns back on too, but I think that's because of ILO's
autoboot setting, and beyond the control of redhat cluster :)

Lon: you are THE MAN.
Everyone: I appreciate all the help.


-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Lon Hohberger
Sent: Tuesday, July 19, 2005 4:26 PM
To: linux clustering
Subject: RE: [Linux-cluster] kernel: CMANsendmsg failed: -101

On Mon, 2005-07-18 at 14:54 -0400, Jeff Harr wrote:
> Well, its actually on a Proliant blade and I'm going in through a
> virtual console (an ILO) to issue the ifdown commands.  I'm doing
> power-fencing on the ILO itself (its one of the ways you can do it
with
> Redhat Cluster). 

Eeeeek!

Ok, so, it's trying to be fenced, but it's hanging in shutdown.  The
fencing is causing the shutdown, right?

        chkconfig --del acpid
        service acpid stop

With ACPID running, the virtual power button press gets translated to
"shutdown -h now" instead of "power off NOW", which is what the cluster
wants.

-- Lon

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster


From ocrete at max-t.com  Tue Jul 19 22:22:23 2005
From: ocrete at max-t.com (Olivier Crete)
Date: Tue, 19 Jul 2005 18:22:23 -0400
Subject: [Linux-cluster] cman ignores interface setting on ipv4
Message-ID: <1121811743.13635.26.camel@cocagne.max-t.internal>

Hi,

The current ipv4 code in the stable branch for cman completely ignores
the interface="" attribute for multicast. I've attached a minimal patch
that fixes that. 

I've only done minimal testing (ie it works here).. it will probably
break if there is no interface set, etc.. Have fun ;)

-- 
Olivier Cr?te
ocrete at max-t.com
Maximum Throughput Inc.
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cman-stable-ipv4-interface.patch
Type: text/x-patch
Size: 2176 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050719/60f0f52b/attachment.bin>

From mark.fasheh at oracle.com  Wed Jul 20 00:48:26 2005
From: mark.fasheh at oracle.com (Mark Fasheh)
Date: Tue, 19 Jul 2005 17:48:26 -0700
Subject: [Linux-cluster] [RFC] nodemanager, ocfs2, dlm
In-Reply-To: <20050718061553.GA9568@redhat.com>
References: <20050718061553.GA9568@redhat.com>
Message-ID: <20050720004826.GH14505@ca-server1.us.oracle.com>

Hi David,

On Mon, Jul 18, 2005 at 02:15:53PM +0800, David Teigland wrote:
> Some of the comments about the dlm concerned how it's configured (from
> user space.)  In particular, there was interest in seeing the dlm and
> ocfs2 use common methods for their configuration.
> 
> The first area I'm looking at is how we get addresses/ids of other nodes.
Right. So this doesn't take into account other parts of node management
(communication, heartbeat, etc). OCFS2 and dlm would still be handling that
stuff on their own for now. For OCFS2 that would mean that an
ocfs2_nodemanager would still exist, but as a much smaller module sitting on
top of 'nodemanager'.

> I've taken a stab at generalizing ocfs2_nodemanager so the dlm could use
> it (removing ocfs-specific stuff).  It still needs some work, but I'd like
> to know if this appeals to the ocfs group and to others who were
> interested in seeing some similarity in dlm/ocfs configuration.
While I agree that some things look like they still need a bit of work, I
like the direction this is taking - thanks for getting this ball rolling. My
questions and comments below:

> +enum {
> +	NM_NODE_ATTR_NODEID = 0,
> +	NM_NODE_ATTR_ADDRESS,
> +	NM_NODE_ATTR_LOCAL,
> +};
So no port attribute. The OCFS2 network code normally takes port from the
node manager in order to determine how to talk to a given node. We'll have
to figure out how to resolve that. The easiest would be to add 'port' back,
but I think that might be problematic if we have multiple cluster network
infrastructures as we do today.

Another way to handle this would be to have userspace symlink to the node
items as an attribute on an ocfs2_tcp item. We could store 'port' as a
second attribute. This would have the added benefit of pinning node
information while OCFS2 uses it.

> +struct node {
> +	spinlock_t		nd_lock;
> +	struct config_item	nd_item; 
> +	char			nd_name[NODEMANAGER_MAX_NAME_LEN+1];
An accessor function for this would be nice for pretty prints - maybe strcpy
into a passed string.

> +	int			nd_nodeid;
This definitely won't work with OCFS2... Nodeid (what used to be called
node_num) needs to be unsigned. Otherwise this will break all our nodemap
stuff which uses a bitmap to represent cluster state.

> +	u32			nd_ipv4_address;
> +	struct rb_node		nd_ip_node;
> +	int			nd_local;
> +	unsigned long		nd_set_attributes;
> +	struct idr		nd_status_idr;
> +	struct list_head	nd_status_list;
What are these two for? They don't seem to be referenced elsewhere...

> +static ssize_t node_local_write(struct node *node, const char *page,
> +				size_t count)
> +{
> +	struct cluster *cluster = node_to_cluster(node);
> +	unsigned long tmp;
> +	char *p = (char *)page;
> +
> +	tmp = simple_strtoul(p, &p, 0);
> +	if (!p || (*p && (*p != '\n')))
> +		return -EINVAL;
> +
> +	tmp = !!tmp; /* boolean of whether this node wants to be local */
> +
> +	/* the only failure case is trying to set a new local node
> +	 * when a different one is already set */
> +
> +	if (tmp && tmp == cluster->cl_has_local &&
> +	    cluster->cl_local_node != node->nd_nodeid)
> +		return -EBUSY;
> +
> +	if (!tmp && cluster->cl_has_local &&
> +	    cluster->cl_local_node == node->nd_nodeid) {
> +		cluster->cl_local_node = 0;
I think we might want to be setting cl_local_node to NODEMANAGER_MAX_NODES
here. It seems that ocfs2_nodemanager also does this so we might have just
caught a bug you inherited :)

> diff -urN a/drivers/nodemanager/nodemanager.h b/drivers/nodemanager/nodemanager.h
> --- a/drivers/nodemanager/nodemanager.h	1970-01-01 07:30:00.000000000 +0730
> +++ b/drivers/nodemanager/nodemanager.h	2005-07-18 13:41:35.377583200 +0800
> @@ -0,0 +1,37 @@
> +/*
> + * nodemanager.h
> + *
> + * Copyright (C) 2004 Oracle.  All rights reserved.
> + * Copyright (C) 2005 Red Hat, Inc.  All rights reserved.
> + *
> + * This program is free software; you can redistribute it and/or
> + * modify it under the terms of the GNU General Public
> + * License as published by the Free Software Foundation; either
> + * version 2 of the License, or (at your option) any later version.
> + * 
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
> + * General Public License for more details.
> + * 
> + * You should have received a copy of the GNU General Public
> + * License along with this program; if not, write to the
> + * Free Software Foundation, Inc., 59 Temple Place - Suite 330,
> + * Boston, MA 021110-1307, USA.
> + *
> + */
> +
> +#ifndef NODEMANAGER_H
> +#define NODEMANAGER_H
> +
> +#define NODEMANAGER_MAX_NODES		255
> +#define NODEMANAGER_INVALID_NODE_NUM	255
> +#define NODEMANAGER_MAX_NAME_LEN	__NEW_UTS_LEN	/* 64 */
> +
> +u32 nodemanager_nodeid_to_addr(int nodeid);
> +int nodemanager_addr_to_nodeid(u32 addr);
> +int nodemanager_our_nodeid(void);
> +u32 nodemanager_our_addr(void);
> +
> +#endif
You removed o2nm_configured_node_map but we need some sort of method for
enumerating over the set of configured nodes.

Also we need a method for querying the existence of a node.
The OCFS2 code usually uses o2nm_get_node_by_num(..) != NULL for this but a
simple boolean api call would be cleaner and would avoid exposing the node
structure.

That's it for my first pass - I might have additional comments after reading
through this some more :)
	--Mark

--
Mark Fasheh
Senior Software Developer, Oracle
mark.fasheh at oracle.com


From teigland at redhat.com  Wed Jul 20 03:35:46 2005
From: teigland at redhat.com (David Teigland)
Date: Wed, 20 Jul 2005 11:35:46 +0800
Subject: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
In-Reply-To: <20050719155214.GG13246@marowsky-bree.de>
References: <20050718061553.GA9568@redhat.com>
	<20050719155214.GG13246@marowsky-bree.de>
Message-ID: <20050720033546.GB9747@redhat.com>

On Tue, Jul 19, 2005 at 05:52:14PM +0200, Lars Marowsky-Bree wrote:

> The nodeid, I thought, was relative to a given DLM namespace, no? This
> concept seems to be missing here, or are you suggesting the nodeid to be
> global across namespaces?

I'm not sure I understand what you mean.  A node would have the same
nodeid across different dlm locking-domains, assuming, of course, those
dlm domains were in the context of the same cluster.  The dlm only uses
nodemanager to look up node addresses, though.

> Also, eventually we obviously need to have state for the nodes - up/down
> et cetera. I think the node manager also ought to track this.

We don't have a need for that information yet; I'm hoping we won't ever
need it in the kernel, but we'll see.

> How would kernel components use this and be notified about changes to
> the configuration / membership state?

"Nodemanager" is perhaps a poor name; at the moment its only substantial
purpose is to communicate node address/id associations in a way that's
independent of a specific driver or fs.

Changes to cluster configuration/membership happen in user space, of
course.  Those general events will have specific consequences to a given
component (fs, lock manager, etc).  These consequences vary quite widely
depending on the component you're looking at.

There are at least two ways to handle this:

1. Pass cluster events and data into the kernel (this sounds like what
you're talking about above), notify the effected kernel components, each
kernel component takes the cluster data and does whatever it needs to with
it (internal adjustments, recovery, etc).

2. Each kernel component "foo-kernel" has an associated user space
component "foo-user".  Cluster events (from userland clustering
infrastructure) are passed to foo-user -- not into the kernel.  foo-user
determines what the specific consequences are for foo-kernel.  foo-user
then manipulates foo-kernel accordingly, through user/kernel hooks (sysfs,
configfs, etc).  These control hooks would largely be specific to foo.

We're following option 2 with the dlm and gfs and have been for quite a
while, which means we don't need 1.  I think ocfs2 is moving that way,
too.  Someone could still try 1, of course, but it would be of no use or
interest to me.  I'm not aware of any actual projects pushing forward with
something like 1, so the persistent reference to it is somewhat baffling.

Dave


From teigland at redhat.com  Wed Jul 20 04:16:30 2005
From: teigland at redhat.com (David Teigland)
Date: Wed, 20 Jul 2005 12:16:30 +0800
Subject: [Linux-cluster] [RFC] nodemanager, ocfs2, dlm
In-Reply-To: <20050720004826.GH14505@ca-server1.us.oracle.com>
References: <20050718061553.GA9568@redhat.com>
	<20050720004826.GH14505@ca-server1.us.oracle.com>
Message-ID: <20050720041630.GC9747@redhat.com>

On Tue, Jul 19, 2005 at 05:48:26PM -0700, Mark Fasheh wrote:
> For OCFS2 that would mean that an ocfs2_nodemanager would still exist,
> but as a much smaller module sitting on top of 'nodemanager'.

Yep, factoring out the common bits.

> So no port attribute. The OCFS2 network code normally takes port from the
> node manager in order to determine how to talk to a given node. We'll have
> to figure out how to resolve that. The easiest would be to add 'port' back,
> but I think that might be problematic if we have multiple cluster network
> infrastructures as we do today.

The port is specific to the component using it (ocfs2, dlm, etc), so
defining port as a node property doesn't make sense if nodemanager is
providing node info to multiple components.

> Another way to handle this would be to have userspace symlink to the node
> items as an attribute on an ocfs2_tcp item. We could store 'port' as a
> second attribute. This would have the added benefit of pinning node
> information while OCFS2 uses it.

I expect each component will probably use another per-node configfs object
for component-specific attributes, using the common bits from the
nodemanager object.

> > +	char			nd_name[NODEMANAGER_MAX_NAME_LEN+1];
> An accessor function for this would be nice for pretty prints - maybe strcpy
> into a passed string.

ok

> > +	int			nd_nodeid;
> This definitely won't work with OCFS2... Nodeid (what used to be called
> node_num) needs to be unsigned. Otherwise this will break all our nodemap
> stuff which uses a bitmap to represent cluster state.

ok

> > +	struct list_head	nd_status_list;
> What are these two for? They don't seem to be referenced elsewhere...

Missed ripping them out with the other ocfs-specific stuff.

> > +	if (!tmp && cluster->cl_has_local &&
> > +	    cluster->cl_local_node == node->nd_nodeid) {
> > +		cluster->cl_local_node = 0;
> I think we might want to be setting cl_local_node to NODEMANAGER_MAX_NODES
> here. It seems that ocfs2_nodemanager also does this so we might have just
> caught a bug you inherited :)

yep

> You removed o2nm_configured_node_map but we need some sort of method for
> enumerating over the set of configured nodes.
> 
> Also we need a method for querying the existence of a node.
> The OCFS2 code usually uses o2nm_get_node_by_num(..) != NULL for this but a
> simple boolean api call would be cleaner and would avoid exposing the node
> structure.

Right, those should be on the TODO.

Thanks,
Dave


From soul916 at gmail.com  Wed Jul 20 06:31:55 2005
From: soul916 at gmail.com (soul)
Date: Wed, 20 Jul 2005 14:31:55 +0800
Subject: [Linux-cluster] GFS newbie need help!
Message-ID: <1b08097c0507192331db3932d@mail.gmail.com>

Hi,

I use RHEL4 update 1 with RedHat cluster suit and RedHat GFS
installed, I want to test GFS under gnbd, when i create LVM, I recive
error messages. This is my approach:

# dd if=/dev/zero of=/tmp/testgnbd bs=1M count=1023
1023+0 records in
1023+0 records out

# modprobe gnbd

# gnbd_serv
gnbd_serv: startup succeeded

# gnbd_export -d /tmp/testgnbd -e testgnbd -c
gnbd_export: created GNBD testgnbd serving file /tmp/testgnbd

# gnbd_import -i localhost
gnbd_import: created directory /dev/gnbd
gnbd_import: created gnbd device testgnbd
gnbd_recvd: gnbd_recvd started

# pvcreate /dev/gnbd/testgnbd
  Physical volume "/dev/gnbd/testgnbd" successfully created

# vgcreate testvg /dev/gnbd/testgnbd
  Volume group "testvg" successfully created

#  lvcreate -L 1000 -n testlv testvg
  Error locking on node ECGL2-a5770c: Internal lvm error, check syslog
  Failed to activate new LV.

#  tail /var/log/messages
lvm[2041]: Volume group for uuid not found:
66FkLTuXUlgOglMBKqLEBIBvB1Pj08K7OAxo9QRNT60mqgLVs4dmQVI4T4f58Xi7

# vgscan
  Reading all physical volumes.  This may take a while...
  Found volume group "testvg" using metadata type lvm2

#lvscan
   inactive          '/dev/testvg/testlv' [1000.00 MB] inherit


Please help me to resolve the problem, thanks!

--
soul


From haydar2906 at hotmail.com  Wed Jul 20 14:10:35 2005
From: haydar2906 at hotmail.com (haydar Ali)
Date: Wed, 20 Jul 2005 10:10:35 -0400
Subject: [Linux-cluster] Need help for Clustered NFS
Message-ID: <BAY104-F40D6614061D1056D9940D3C5D70@phx.gbl>

Hi,

I want to setup and configure clustered NFS.
I have created 2 quorum partitions /dev/sdd2 and /dev/sdd3  (100MB each) and 
formatted them

mkfs -t ext2 -b 4096 /dev/sdd2
mkfs -t ext2 -b 4096 /dev/sdd3

I created another huge partition /dev/sdd4 (over 600GB) and formatted it in 
ext3 filesystem.

I installed the cluster suite on the 1st node (RAC1) and I started the 
rawdevices on the two nodes RAC1 and RAC2 (it's OK).

This the hosts file /etc/host on the node1 (RAC1)

# Do not remove the following line, or various programs
# that require network functionality will fail.
127.0.0.1              localhost.localdomain localhost
#
# Private hostnames
#
192.168.253.3           rac1.domain.net     rac1
192.168.253.4           rac2.domain.net     rac2
192.168.253.10          rac1
#
# Hostnames used for Interconnect
#
1.1.1.1                 rac1i.domain.net    rac1i
1.1.1.2                 rac2i.domain.net    rac2i
#
-----------------------


I launched the command cluconfig and it generated /etc/cluster.conf, you can 
list its content:

-------------------------------
This file is automatically generated.  Do not manually edit!

[cluhbd]
  logLevel = 4

[clupowerd]
  logLevel = 4

[cluquorumd]
  logLevel = 4

[cluster]
  alias_ip = 192.168.253.10
  name = project
  timestamp = 1121804245

[clusvcmgrd]
  logLevel = 4

[database]
  version = 2.0

[members]
start member0
start chan0
  name = rac1
  type = net
end chan0
  id = 0
  name = rac1
  powerSwitchIPaddr = rac1
  powerSwitchPortName = unused
  quorumPartitionPrimary = /dev/raw/raw1
  quorumPartitionShadow = /dev/raw/raw2
end member0
start member1
start chan0
  name = rac2
  type = net
end chan0
  id = 1
  name = rac2
  powerSwitchIPaddr = rac2
  powerSwitchPortName = unused
  quorumPartitionPrimary = /dev/raw/raw1
  quorumPartitionShadow = /dev/raw/raw2
end member1

[powercontrollers]
start powercontroller0
  IPaddr = rac1
  login = unused
  passwd = unused
  type = null
end powercontroller0
start powercontroller1
  IPaddr = rac2
  login = unused
  passwd = unused
  type = null
end powercontroller1

[services]
start service0
  checkInterval = 30
start device0
start mount
start NFSexports
start directory0
start client0
  name = rac1
  options = rw
end client0
  name = /u04
end directory0
end NFSexports
  forceUnmount = yes
  fstype = ext3
  name = /u04
  options = rw,nosuid,sync
end mount
  name = /dev/sdd4
  sharename = None
end device0
  name = nfs_project
  preferredNode = rac2
  relocateOnPreferredNodeBoot = yes
end service0
------------------------------------

I created a NFS share on /u04 using the following command cluadmin

[root at rac1 root]# cluadmin
Wed Jul 20 10:02:20 EDT 2005

You can obtain help by entering help and one of the following commands:

cluster     service        clear
help        apropos        exit
version         quit
cluadmin> service show
  1) state
  2) config
  3) services
service show what? 2
  0) nfs_project
  c) cancel

Choose service: 0
name: nfs_project
preferred node: rac2
relocate: yes
monitor interval: 30
device 0: /dev/sdd4
  mount point, device 0: /u04
  mount fstype, device 0: ext3
  mount options, device 0: rw,nosuid,sync
  force unmount, device 0: yes
  samba share, device 0: None
NFS export 0: /u04
  Client 0: rac1, rw
cluadmin> service show state
=========================  S e r v i c e   S t a t u s  
========================

                                         Last             Monitor  Restart
  Service        Status   Owner          Transition       Interval Count
  -------------- -------- -------------- ---------------- -------- -------
  nfs_project       started  rac1        16:21:23 Jul 19  30       1
cluadmin>


And when I launched clustat, I expected this error message:

clustat
Cluster Status Monitor (Fileserver Test Cluster)
07:46:05
Cluster alias: rac1

===================== M e m b e r   S t a t u s ================
  Member         Status     Node Id    Power Switch
  -------------- ---------- ---------- ------------
  rac1           Up         0          Good
  rac2           Down    1          Unknown

=================== H e a r t b e a t   S t a t u s ===============
  Name                           Type       Status
  ------------------------------ ---------- ------------
  rac1         <--> rac2         network    OFFLINE

=================== S e r v i c e   S t a t u s ==================
                                  	     Last            Monitor      Restart
Service         Status   Owner          Transition    Interval  Count

  ------------- -------- ------------- ---------------- ------------
  nfs_project          started  rac1          16:07:42 Jul 19  30            
0


And when I launched this command on RAC2:
mount ?t nfs rac1:/u04 /u04
It list the following error message :
Mount: rac1:/u04 failed, reason given by server: Permission denied

Can someone help me to fix this problem in this configuration?

Thanks

Cheers!

Haydar


From pcaulfie at redhat.com  Wed Jul 20 14:43:48 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Wed, 20 Jul 2005 15:43:48 +0100
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <38A48FA2F0103444906AD22E14F1B5A3543AF2@mailxchg01.corp.opsource.net>
References: <38A48FA2F0103444906AD22E14F1B5A3543AF2@mailxchg01.corp.opsource.net>
Message-ID: <42DE6324.1040407@redhat.com>

I've checked in a fix to the STABLE branch. If you can, please let me know how
you get on with it.

--
Patrick


From jharr at opsource.net  Wed Jul 20 15:22:40 2005
From: jharr at opsource.net (Jeff Harr)
Date: Wed, 20 Jul 2005 11:22:40 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
Message-ID: <38A48FA2F0103444906AD22E14F1B5A3543C0E@mailxchg01.corp.opsource.net>

Well, putting "poweroff -f" in the acpid configuration file powered off
the system, which was what I wanted :)  ILO kept calling acpid, which
interprted 'poweroff' as 'shutdown -h'.  This should have been
documented in the pdf...wasted hours on this.

Thanks again,
Jeff

-----Original Message-----
From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Patrick Caulfield
Sent: Wednesday, July 20, 2005 10:44 AM
To: linux clustering
Subject: Re: [Linux-cluster] kernel: CMANsendmsg failed: -101

I've checked in a fix to the STABLE branch. If you can, please let me
know how
you get on with it.

--
Patrick

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster


From pcaulfie at redhat.com  Wed Jul 20 15:55:06 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Wed, 20 Jul 2005 16:55:06 +0100
Subject: [Linux-cluster] cman ignores interface setting on ipv4
In-Reply-To: <1121811743.13635.26.camel@cocagne.max-t.internal>
References: <1121811743.13635.26.camel@cocagne.max-t.internal>
Message-ID: <42DE73DA.6080408@redhat.com>

Olivier Crete wrote:
> Hi,
> 
> The current ipv4 code in the stable branch for cman completely ignores
> the interface="" attribute for multicast. I've attached a minimal patch
> that fixes that. 
> 
> I've only done minimal testing (ie it works here).. it will probably
> break if there is no interface set, etc.. Have fun ;)
> 

Thanks for the patch, I'll get it integrated.

Multicast should (well, does for me!) work anyway provided you only use one
interface and it matches the host name - which it always should all you'll run
into other problems.

Were you actually having problems with it without the patch ?
-- 

patrick


From ocrete at max-t.com  Wed Jul 20 16:29:40 2005
From: ocrete at max-t.com (Olivier Crete)
Date: Wed, 20 Jul 2005 12:29:40 -0400
Subject: [Linux-cluster] cman ignores interface setting on ipv4
In-Reply-To: <42DE73DA.6080408@redhat.com>
References: <1121811743.13635.26.camel@cocagne.max-t.internal>
	<42DE73DA.6080408@redhat.com>
Message-ID: <1121876980.13635.32.camel@cocagne.max-t.internal>

On Wed, 2005-20-07 at 16:55 +0100, Patrick Caulfield wrote:
> Olivier Crete wrote:
> > Hi,
> > 
> > The current ipv4 code in the stable branch for cman completely ignores
> > the interface="" attribute for multicast. I've attached a minimal patch
> > that fixes that. 
> > 
> > I've only done minimal testing (ie it works here).. it will probably
> > break if there is no interface set, etc.. Have fun ;)
> > 
> 
> Thanks for the patch, I'll get it integrated.

Make sure it doesnt break in the case where the interface isnt
specified.. I didn't test that case.

> Multicast should (well, does for me!) work anyway provided you only use one
> interface and it matches the host name - which it always should all you'll run
> into other problems.
> 
> Were you actually having problems with it without the patch ?

Yes, our nodes are linked using two different networks, a management
network (Ethernet) and a cluster interconnect (myrinet).. and I want the
heartbeat stuff to happen over myrinet (this is what our application
uses).. But the hostname of the nodes resolves to the ip of the Ethernet
card. 

-- 
Olivier Cr?te
ocrete at max-t.com
Maximum Throughput Inc.


From lmb at suse.de  Wed Jul 20 16:26:36 2005
From: lmb at suse.de (Lars Marowsky-Bree)
Date: Wed, 20 Jul 2005 18:26:36 +0200
Subject: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
In-Reply-To: <20050720033546.GB9747@redhat.com>
References: <20050718061553.GA9568@redhat.com>
	<20050719155214.GG13246@marowsky-bree.de>
	<20050720033546.GB9747@redhat.com>
Message-ID: <20050720162636.GL5416@marowsky-bree.de>

On 2005-07-20T11:35:46, David Teigland <teigland at redhat.com> wrote:

> > Also, eventually we obviously need to have state for the nodes - up/down
> > et cetera. I think the node manager also ought to track this.
> We don't have a need for that information yet; I'm hoping we won't ever
> need it in the kernel, but we'll see.

Hm, I'm thinking a service might have a good reason to want to know the
possible list of nodes as opposed to the currently active membership;
though the DLM as the service in question right now does not appear to
need such.

But, see below.

> There are at least two ways to handle this:
> 
> 1. Pass cluster events and data into the kernel (this sounds like what
> you're talking about above), notify the effected kernel components, each
> kernel component takes the cluster data and does whatever it needs to with
> it (internal adjustments, recovery, etc).
> 
> 2. Each kernel component "foo-kernel" has an associated user space
> component "foo-user".  Cluster events (from userland clustering
> infrastructure) are passed to foo-user -- not into the kernel.  foo-user
> determines what the specific consequences are for foo-kernel.  foo-user
> then manipulates foo-kernel accordingly, through user/kernel hooks (sysfs,
> configfs, etc).  These control hooks would largely be specific to foo.
> 
> We're following option 2 with the dlm and gfs and have been for quite a
> while, which means we don't need 1.  I think ocfs2 is moving that way,
> too.  Someone could still try 1, of course, but it would be of no use or
> interest to me.  I'm not aware of any actual projects pushing forward with
> something like 1, so the persistent reference to it is somewhat baffling.

Right. I thought that the node manager changes for generalizing it where
pushing into sort-of direction 1. Thanks for clearing this up.


Sincerely,
    Lars Marowsky-Br?e <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business	 -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"


From JACOB_LIBERMAN at Dell.com  Wed Jul 20 16:44:33 2005
From: JACOB_LIBERMAN at Dell.com (JACOB_LIBERMAN at Dell.com)
Date: Wed, 20 Jul 2005 11:44:33 -0500
Subject: [Linux-cluster] Need help for Clustered NFS
Message-ID: <BC430F453501174992B9D9E8AFB7519A0855FB@ausx3mps309.aus.amer.dell.com>

You need to start the clustering services on the 2nd node so it can join
the cluster. Otherwise it wont be able to access the disk protected by
the cluster services on node1. the cluster service controllers when and
whether the shared disk resources get mounted by a cluster host.

Thanks, jacob

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of haydar Ali
> Sent: Wednesday, July 20, 2005 9:11 AM
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] Need help for Clustered NFS
> 
> Hi,
> 
> I want to setup and configure clustered NFS.
> I have created 2 quorum partitions /dev/sdd2 and /dev/sdd3  
> (100MB each) and formatted them
> 
> mkfs -t ext2 -b 4096 /dev/sdd2
> mkfs -t ext2 -b 4096 /dev/sdd3
> 
> I created another huge partition /dev/sdd4 (over 600GB) and 
> formatted it in
> ext3 filesystem.
> 
> I installed the cluster suite on the 1st node (RAC1) and I 
> started the rawdevices on the two nodes RAC1 and RAC2 (it's OK).
> 
> This the hosts file /etc/host on the node1 (RAC1)
> 
> # Do not remove the following line, or various programs # 
> that require network functionality will fail.
> 127.0.0.1              localhost.localdomain localhost
> #
> # Private hostnames
> #
> 192.168.253.3           rac1.domain.net     rac1
> 192.168.253.4           rac2.domain.net     rac2
> 192.168.253.10          rac1
> #
> # Hostnames used for Interconnect
> #
> 1.1.1.1                 rac1i.domain.net    rac1i
> 1.1.1.2                 rac2i.domain.net    rac2i
> #
> -----------------------
> 
> 
> I launched the command cluconfig and it generated 
> /etc/cluster.conf, you can list its content:
> 
> -------------------------------
> This file is automatically generated.  Do not manually edit!
> 
> [cluhbd]
>   logLevel = 4
> 
> [clupowerd]
>   logLevel = 4
> 
> [cluquorumd]
>   logLevel = 4
> 
> [cluster]
>   alias_ip = 192.168.253.10
>   name = project
>   timestamp = 1121804245
> 
> [clusvcmgrd]
>   logLevel = 4
> 
> [database]
>   version = 2.0
> 
> [members]
> start member0
> start chan0
>   name = rac1
>   type = net
> end chan0
>   id = 0
>   name = rac1
>   powerSwitchIPaddr = rac1
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member0
> start member1
> start chan0
>   name = rac2
>   type = net
> end chan0
>   id = 1
>   name = rac2
>   powerSwitchIPaddr = rac2
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member1
> 
> [powercontrollers]
> start powercontroller0
>   IPaddr = rac1
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller0
> start powercontroller1
>   IPaddr = rac2
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller1
> 
> [services]
> start service0
>   checkInterval = 30
> start device0
> start mount
> start NFSexports
> start directory0
> start client0
>   name = rac1
>   options = rw
> end client0
>   name = /u04
> end directory0
> end NFSexports
>   forceUnmount = yes
>   fstype = ext3
>   name = /u04
>   options = rw,nosuid,sync
> end mount
>   name = /dev/sdd4
>   sharename = None
> end device0
>   name = nfs_project
>   preferredNode = rac2
>   relocateOnPreferredNodeBoot = yes
> end service0
> ------------------------------------
> 
> I created a NFS share on /u04 using the following command cluadmin
> 
> [root at rac1 root]# cluadmin
> Wed Jul 20 10:02:20 EDT 2005
> 
> You can obtain help by entering help and one of the following 
> commands:
> 
> cluster     service        clear
> help        apropos        exit
> version         quit
> cluadmin> service show
>   1) state
>   2) config
>   3) services
> service show what? 2
>   0) nfs_project
>   c) cancel
> 
> Choose service: 0
> name: nfs_project
> preferred node: rac2
> relocate: yes
> monitor interval: 30
> device 0: /dev/sdd4
>   mount point, device 0: /u04
>   mount fstype, device 0: ext3
>   mount options, device 0: rw,nosuid,sync
>   force unmount, device 0: yes
>   samba share, device 0: None
> NFS export 0: /u04
>   Client 0: rac1, rw
> cluadmin> service show state
> =========================  S e r v i c e   S t a t u s  
> ========================
> 
>                                          Last             
> Monitor  Restart
>   Service        Status   Owner          Transition       
> Interval Count
>   -------------- -------- -------------- ---------------- 
> -------- -------
>   nfs_project       started  rac1        16:21:23 Jul 19  30       1
> cluadmin>
> 
> 
> And when I launched clustat, I expected this error message:
> 
> clustat
> Cluster Status Monitor (Fileserver Test Cluster)
> 07:46:05
> Cluster alias: rac1
> 
> ===================== M e m b e r   S t a t u s ================
>   Member         Status     Node Id    Power Switch
>   -------------- ---------- ---------- ------------
>   rac1           Up         0          Good
>   rac2           Down    1          Unknown
> 
> =================== H e a r t b e a t   S t a t u s ===============
>   Name                           Type       Status
>   ------------------------------ ---------- ------------
>   rac1         <--> rac2         network    OFFLINE
> 
> =================== S e r v i c e   S t a t u s ==================
>                                   	     Last            
> Monitor      Restart
> Service         Status   Owner          Transition    Interval  Count
> 
>   ------------- -------- ------------- ---------------- ------------
>   nfs_project          started  rac1          16:07:42 Jul 19 
>  30            
> 0
> 
> 
> 
> And when I launched this command on RAC2:
> mount -t nfs rac1:/u04 /u04
> It list the following error message :
> Mount: rac1:/u04 failed, reason given by server: Permission denied
> 
> Can someone help me to fix this problem in this configuration?
> 
> Thanks
> 
> Cheers!
> 
> Haydar
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 


From haydar2906 at hotmail.com  Wed Jul 20 16:48:58 2005
From: haydar2906 at hotmail.com (haydar Ali)
Date: Wed, 20 Jul 2005 12:48:58 -0400
Subject: [Linux-cluster] Need help for Clustered NFS
Message-ID: <BAY104-F3575C4453794BA9DFA94EAC5D70@phx.gbl>

Hi Jacob,

Have I to create a /etc/cluster.conf on the 2nd node and I put the 1st node 
as preferred node?

Thanks a lot

Haydar

>From: <JACOB_LIBERMAN at Dell.com>
>Reply-To: linux clustering <linux-cluster at redhat.com>
>To: <linux-cluster at redhat.com>
>Subject: RE: [Linux-cluster] Need help for Clustered NFS
>Date: Wed, 20 Jul 2005 11:44:33 -0500
>
>You need to start the clustering services on the 2nd node so it can join
>the cluster. Otherwise it wont be able to access the disk protected by
>the cluster services on node1. the cluster service controllers when and
>whether the shared disk resources get mounted by a cluster host.
>
>Thanks, jacob
>
> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com
> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of haydar Ali
> > Sent: Wednesday, July 20, 2005 9:11 AM
> > To: linux-cluster at redhat.com
> > Subject: [Linux-cluster] Need help for Clustered NFS
> >
> > Hi,
> >
> > I want to setup and configure clustered NFS.
> > I have created 2 quorum partitions /dev/sdd2 and /dev/sdd3
> > (100MB each) and formatted them
> >
> > mkfs -t ext2 -b 4096 /dev/sdd2
> > mkfs -t ext2 -b 4096 /dev/sdd3
> >
> > I created another huge partition /dev/sdd4 (over 600GB) and
> > formatted it in
> > ext3 filesystem.
> >
> > I installed the cluster suite on the 1st node (RAC1) and I
> > started the rawdevices on the two nodes RAC1 and RAC2 (it's OK).
> >
> > This the hosts file /etc/host on the node1 (RAC1)
> >
> > # Do not remove the following line, or various programs #
> > that require network functionality will fail.
> > 127.0.0.1              localhost.localdomain localhost
> > #
> > # Private hostnames
> > #
> > 192.168.253.3           rac1.domain.net     rac1
> > 192.168.253.4           rac2.domain.net     rac2
> > 192.168.253.10          rac1
> > #
> > # Hostnames used for Interconnect
> > #
> > 1.1.1.1                 rac1i.domain.net    rac1i
> > 1.1.1.2                 rac2i.domain.net    rac2i
> > #
> > -----------------------
> >
> >
> > I launched the command cluconfig and it generated
> > /etc/cluster.conf, you can list its content:
> >
> > -------------------------------
> > This file is automatically generated.  Do not manually edit!
> >
> > [cluhbd]
> >   logLevel = 4
> >
> > [clupowerd]
> >   logLevel = 4
> >
> > [cluquorumd]
> >   logLevel = 4
> >
> > [cluster]
> >   alias_ip = 192.168.253.10
> >   name = project
> >   timestamp = 1121804245
> >
> > [clusvcmgrd]
> >   logLevel = 4
> >
> > [database]
> >   version = 2.0
> >
> > [members]
> > start member0
> > start chan0
> >   name = rac1
> >   type = net
> > end chan0
> >   id = 0
> >   name = rac1
> >   powerSwitchIPaddr = rac1
> >   powerSwitchPortName = unused
> >   quorumPartitionPrimary = /dev/raw/raw1
> >   quorumPartitionShadow = /dev/raw/raw2
> > end member0
> > start member1
> > start chan0
> >   name = rac2
> >   type = net
> > end chan0
> >   id = 1
> >   name = rac2
> >   powerSwitchIPaddr = rac2
> >   powerSwitchPortName = unused
> >   quorumPartitionPrimary = /dev/raw/raw1
> >   quorumPartitionShadow = /dev/raw/raw2
> > end member1
> >
> > [powercontrollers]
> > start powercontroller0
> >   IPaddr = rac1
> >   login = unused
> >   passwd = unused
> >   type = null
> > end powercontroller0
> > start powercontroller1
> >   IPaddr = rac2
> >   login = unused
> >   passwd = unused
> >   type = null
> > end powercontroller1
> >
> > [services]
> > start service0
> >   checkInterval = 30
> > start device0
> > start mount
> > start NFSexports
> > start directory0
> > start client0
> >   name = rac1
> >   options = rw
> > end client0
> >   name = /u04
> > end directory0
> > end NFSexports
> >   forceUnmount = yes
> >   fstype = ext3
> >   name = /u04
> >   options = rw,nosuid,sync
> > end mount
> >   name = /dev/sdd4
> >   sharename = None
> > end device0
> >   name = nfs_project
> >   preferredNode = rac2
> >   relocateOnPreferredNodeBoot = yes
> > end service0
> > ------------------------------------
> >
> > I created a NFS share on /u04 using the following command cluadmin
> >
> > [root at rac1 root]# cluadmin
> > Wed Jul 20 10:02:20 EDT 2005
> >
> > You can obtain help by entering help and one of the following
> > commands:
> >
> > cluster     service        clear
> > help        apropos        exit
> > version         quit
> > cluadmin> service show
> >   1) state
> >   2) config
> >   3) services
> > service show what? 2
> >   0) nfs_project
> >   c) cancel
> >
> > Choose service: 0
> > name: nfs_project
> > preferred node: rac2
> > relocate: yes
> > monitor interval: 30
> > device 0: /dev/sdd4
> >   mount point, device 0: /u04
> >   mount fstype, device 0: ext3
> >   mount options, device 0: rw,nosuid,sync
> >   force unmount, device 0: yes
> >   samba share, device 0: None
> > NFS export 0: /u04
> >   Client 0: rac1, rw
> > cluadmin> service show state
> > =========================  S e r v i c e   S t a t u s
> > ========================
> >
> >                                          Last
> > Monitor  Restart
> >   Service        Status   Owner          Transition
> > Interval Count
> >   -------------- -------- -------------- ----------------
> > -------- -------
> >   nfs_project       started  rac1        16:21:23 Jul 19  30       1
> > cluadmin>
> >
> >
> > And when I launched clustat, I expected this error message:
> >
> > clustat
> > Cluster Status Monitor (Fileserver Test Cluster)
> > 07:46:05
> > Cluster alias: rac1
> >
> > ===================== M e m b e r   S t a t u s ================
> >   Member         Status     Node Id    Power Switch
> >   -------------- ---------- ---------- ------------
> >   rac1           Up         0          Good
> >   rac2           Down    1          Unknown
> >
> > =================== H e a r t b e a t   S t a t u s ===============
> >   Name                           Type       Status
> >   ------------------------------ ---------- ------------
> >   rac1         <--> rac2         network    OFFLINE
> >
> > =================== S e r v i c e   S t a t u s ==================
> >                                   	     Last
> > Monitor      Restart
> > Service         Status   Owner          Transition    Interval  Count
> >
> >   ------------- -------- ------------- ---------------- ------------
> >   nfs_project          started  rac1          16:07:42 Jul 19
> >  30
> > 0
> >
> >
> >
> > And when I launched this command on RAC2:
> > mount -t nfs rac1:/u04 /u04
> > It list the following error message :
> > Mount: rac1:/u04 failed, reason given by server: Permission denied
> >
> > Can someone help me to fix this problem in this configuration?
> >
> > Thanks
> >
> > Cheers!
> >
> > Haydar
> >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > http://www.redhat.com/mailman/listinfo/linux-cluster
> >
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>http://www.redhat.com/mailman/listinfo/linux-cluster


From bruce.walker at hp.com  Wed Jul 20 16:55:31 2005
From: bruce.walker at hp.com (Walker, Bruce J (HP-Labs))
Date: Wed, 20 Jul 2005 09:55:31 -0700
Subject: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
Message-ID: <3689AF909D816446BA505D21F1461AE404167CFB@cacexc04.americas.cpqcorp.net>

Like Lars, I too was under the wrong impression about this configfs "nodemanager" kernel component.  Our discussions in the cluster meeting Monday and Tuesday were assuming it was a general service that other kernel components could/would utilize and possibly also something that could send uevents to non-kernel components wanting a std. way to see membership information/events.

As to kernel components without corresponding user-level "managers", look no farther than OpenSSI.  Our hope was that we could adapt to a user-land membership service and this interface thru configfs would drive all our kernel subsystems.

Bruce Walker
OpenSSI Cluster project
 

-----Original Message-----
From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Lars Marowsky-Bree
Sent: Wednesday, July 20, 2005 9:27 AM
To: David Teigland
Cc: linux-cluster at redhat.com; linux-kernel at vger.kernel.org; ocfs2-devel at oss.oracle.com
Subject: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm

On 2005-07-20T11:35:46, David Teigland <teigland at redhat.com> wrote:

> > Also, eventually we obviously need to have state for the nodes - 
> > up/down et cetera. I think the node manager also ought to track this.
> We don't have a need for that information yet; I'm hoping we won't 
> ever need it in the kernel, but we'll see.

Hm, I'm thinking a service might have a good reason to want to know the possible list of nodes as opposed to the currently active membership; though the DLM as the service in question right now does not appear to need such.

But, see below.

> There are at least two ways to handle this:
> 
> 1. Pass cluster events and data into the kernel (this sounds like what 
> you're talking about above), notify the effected kernel components, 
> each kernel component takes the cluster data and does whatever it 
> needs to with it (internal adjustments, recovery, etc).
> 
> 2. Each kernel component "foo-kernel" has an associated user space 
> component "foo-user".  Cluster events (from userland clustering
> infrastructure) are passed to foo-user -- not into the kernel.  
> foo-user determines what the specific consequences are for foo-kernel.  
> foo-user then manipulates foo-kernel accordingly, through user/kernel 
> hooks (sysfs, configfs, etc).  These control hooks would largely be specific to foo.
> 
> We're following option 2 with the dlm and gfs and have been for quite 
> a while, which means we don't need 1.  I think ocfs2 is moving that 
> way, too.  Someone could still try 1, of course, but it would be of no 
> use or interest to me.  I'm not aware of any actual projects pushing 
> forward with something like 1, so the persistent reference to it is somewhat baffling.

Right. I thought that the node manager changes for generalizing it where pushing into sort-of direction 1. Thanks for clearing this up.


Sincerely,
    Lars Marowsky-Br?e <lmb at suse.de>

--
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business	 -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster


From lhh at redhat.com  Wed Jul 20 17:02:27 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 20 Jul 2005 13:02:27 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <38A48FA2F0103444906AD22E14F1B5A3543AF2@mailxchg01.corp.opsource.net>
References: <38A48FA2F0103444906AD22E14F1B5A3543AF2@mailxchg01.corp.opsource.net>
Message-ID: <1121878947.3498.25.camel@ayanami.boston.redhat.com>

On Tue, 2005-07-19 at 17:19 -0400, Jeff Harr wrote:
> Ok, that fixed it.  Actually, stopping acpid caused the system to never
> receive any power off message.  But you got me on the right track of
> thinking and I modified the config file to read "poweroff -f" and now it
> turns off.  It turns back on too, but I think that's because of ILO's
> autoboot setting, and beyond the control of redhat cluster :)
> 
> Lon: you are THE MAN.
> Everyone: I appreciate all the help.

Eeeeek! :)

Try adding "acpi=off" to the kernel command line and rebooting.  I
wouldn't recommend relying on an evicted node to have ACPID operating
properly.  For now, it's probably fine, though... ;)

Basically, the power button should really be a power button -- not a
soft-power-button.  Some BIOSes allow you to toggle whether it's a "soft
power" or "hard power".

-- Lon


From lhh at redhat.com  Wed Jul 20 17:19:36 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 20 Jul 2005 13:19:36 -0400
Subject: [Linux-cluster] Need help for Clustered NFS
In-Reply-To: <BAY104-F40D6614061D1056D9940D3C5D70@phx.gbl>
References: <BAY104-F40D6614061D1056D9940D3C5D70@phx.gbl>
Message-ID: <1121879976.3498.41.camel@ayanami.boston.redhat.com>

On Wed, 2005-07-20 at 10:10 -0400, haydar Ali wrote:

> mkfs -t ext2 -b 4096 /dev/sdd2
> mkfs -t ext2 -b 4096 /dev/sdd3

The quorum partitions shouldn't have a file system.  Cluster manager
will destroy all the data on those file systems.


> I created another huge partition /dev/sdd4 (over 600GB) and formatted it in 
> ext3 filesystem.
> 
> I installed the cluster suite on the 1st node (RAC1) and I started the 
> rawdevices on the two nodes RAC1 and RAC2 (it's OK).

Looks like your configuration is from clumanager on RHEL 2.1 AS (there
was no RHCS for RHEL 2.1).


> This the hosts file /etc/host on the node1 (RAC1)
> 
> # Do not remove the following line, or various programs
> # that require network functionality will fail.
> 127.0.0.1              localhost.localdomain localhost
> #
> # Private hostnames
> #
> 192.168.253.3           rac1.domain.net     rac1
> 192.168.253.4           rac2.domain.net     rac2
> 192.168.253.10          rac1

> [cluster]
>   alias_ip = 192.168.253.10

The cluster alias IP is only for monitoring, not for services.  In any
case, you need to have different hostnames for the cluster members, the
alias IP, and any service IPs you intend to use.


> [powercontrollers]
> start powercontroller0
>   IPaddr = rac1
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller0
> start powercontroller1
>   IPaddr = rac2
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller1

Obligatory note: Please buy a power controller.


> [services]
> start service0
>   checkInterval = 30
> start device0
> start mount
> start NFSexports
> start directory0
> start client0
>   name = rac1
>   options = rw
> end client0
>   name = /u04
> end directory0
> end NFSexports
>   forceUnmount = yes
>   fstype = ext3
>   name = /u04
>   options = rw,nosuid,sync
> end mount
>   name = /dev/sdd4
>   sharename = None
> end device0
>   name = nfs_project
>   preferredNode = rac2
>   relocateOnPreferredNodeBoot = yes
> end service0

> And when I launched this command on RAC2:
> mount t nfs rac1:/u04 /u04
> It list the following error message :
> Mount: rac1:/u04 failed, reason given by server: Permission denied
> 
> Can someone help me to fix this problem in this configuration?

a. Your /etc/hosts needs to be fixed: rac1 will first resolve to
192.168.253.3, so that server would refuse the mount.  rac1 needs to be
one IP, and the .10 address needs to have its own hostname.

b. Your service needs its own IP address.  I'd scrap the cluster alias
IP and move it in to the service.

c. You don't have RAC2 as a listed client in the service.  So, if (a)
wasn't true, mounting from RAC2 will always fail because it's not in the
NFS ACL.

Good luck!

-- Lon


From lhh at redhat.com  Wed Jul 20 17:49:44 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 20 Jul 2005 13:49:44 -0400
Subject: [Linux-cluster] Using cman,etc for a non-gfs app
In-Reply-To: <1121464976.13635.13.camel@cocagne.max-t.internal>
References: <1119467822.21167.34.camel@cocagne.max-t.internal>
	<1119479792.24237.315.camel@ayanami.boston.redhat.com>
	<1121464976.13635.13.camel@cocagne.max-t.internal>
Message-ID: <1121881784.3498.67.camel@ayanami.boston.redhat.com>

On Fri, 2005-07-15 at 18:02 -0400, Olivier Crete wrote:
> On Wed, 2005-22-06 at 18:36 -0400, Lon Hohberger wrote:
> > * libmagma provides cp_fence() / clu_fence() which work on both CMAN and
> > gulm.
> 
> I've looked implementing clu_fence() in libmagma. It seems to only call
> an ioctl in the kernel cman which only seems to do cooperative fencing
> and removing a node from the cluster... I dont see any call to fenced or
> fence_node. Or have I missed something ?
> 

Correct, it removes the node via libgulm or asking CMAN to expire and
fence the node.

If you want to change it, I'd recommend calling fence_node from the
particular plugin you're using rather than altering the way libmagma
works (libmagma is dumb -- don't make it smart, that's not it's
job! ;) ).

-- Lon


From lmb at suse.de  Wed Jul 20 18:09:18 2005
From: lmb at suse.de (Lars Marowsky-Bree)
Date: Wed, 20 Jul 2005 20:09:18 +0200
Subject: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC]
	nodemanager, ocfs2, dlm
In-Reply-To: <3689AF909D816446BA505D21F1461AE404167CFB@cacexc04.americas.cpqcorp.net>
References: <3689AF909D816446BA505D21F1461AE404167CFB@cacexc04.americas.cpqcorp.net>
Message-ID: <20050720180918.GU5416@marowsky-bree.de>

On 2005-07-20T09:55:31, "Walker, Bruce J (HP-Labs)" <bruce.walker at hp.com> wrote:

> Like Lars, I too was under the wrong impression about this configfs
> "nodemanager" kernel component.  Our discussions in the cluster
> meeting Monday and Tuesday were assuming it was a general service that
> other kernel components could/would utilize and possibly also
> something that could send uevents to non-kernel components wanting a
> std. way to see membership information/events.

Let me clarify that this was something we briefly touched on in
Walldorf: The node manager would (re-)export the current data via sysfs
(which would result in uevents being sent, too), and not something we
dreamed up just Monday ;-)

> As to kernel components without corresponding user-level "managers",
> look no farther than OpenSSI.  Our hope was that we could adapt to a
> user-land membership service and this interface thru configfs would
> drive all our kernel subsystems.

Well, node manager still can provide you the input as to which nodes are
configured, which in a way translates to "membership". The thing it
doesn't seem to provide yet is the supsend/modify/resume cycle which for
example the RHAT DLM seems to require.


Sincerely,
    Lars Marowsky-Br?e <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business	 -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"


From ocrete at max-t.com  Wed Jul 20 18:31:49 2005
From: ocrete at max-t.com (Olivier Crete)
Date: Wed, 20 Jul 2005 14:31:49 -0400
Subject: [Linux-cluster] Using cman,etc for a non-gfs app
In-Reply-To: <1121881784.3498.67.camel@ayanami.boston.redhat.com>
References: <1119467822.21167.34.camel@cocagne.max-t.internal>
	<1119479792.24237.315.camel@ayanami.boston.redhat.com>
	<1121464976.13635.13.camel@cocagne.max-t.internal>
	<1121881784.3498.67.camel@ayanami.boston.redhat.com>
Message-ID: <1121884309.13635.34.camel@cocagne.max-t.internal>

On Wed, 2005-20-07 at 13:49 -0400, Lon Hohberger wrote:
> On Fri, 2005-07-15 at 18:02 -0400, Olivier Crete wrote:
> > On Wed, 2005-22-06 at 18:36 -0400, Lon Hohberger wrote:
> > > * libmagma provides cp_fence() / clu_fence() which work on both CMAN and
> > > gulm.
> > 
> > I've looked implementing clu_fence() in libmagma. It seems to only call
> > an ioctl in the kernel cman which only seems to do cooperative fencing
> > and removing a node from the cluster... I dont see any call to fenced or
> > fence_node. Or have I missed something ?
> > 
> 
> Correct, it removes the node via libgulm or asking CMAN to expire and
> fence the node.

But cman does not seem to really fence the node... just remove it from
its list..  So that's not enough for our app that has shared storage..
Ideally, I would need the same behavior as stonith and have the call
block and return only when we are 100% sure that the node have been
fenced.

> If you want to change it, I'd recommend calling fence_node from the
> particular plugin you're using rather than altering the way libmagma
> works (libmagma is dumb -- don't make it smart, that's not it's
> job! ;) ).

Right now I'm just calling fence_node directly from my app

-- 
Olivier Cr?te
ocrete at max-t.com
Maximum Throughput Inc.


From Joel.Becker at oracle.com  Wed Jul 20 18:39:38 2005
From: Joel.Becker at oracle.com (Joel Becker)
Date: Wed, 20 Jul 2005 11:39:38 -0700
Subject: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC]
	nodemanager, ocfs2, dlm
In-Reply-To: <20050720180918.GU5416@marowsky-bree.de>
References: <3689AF909D816446BA505D21F1461AE404167CFB@cacexc04.americas.cpqcorp.net>
	<20050720180918.GU5416@marowsky-bree.de>
Message-ID: <20050720183938.GM16618@ca-server1.us.oracle.com>

On Wed, Jul 20, 2005 at 08:09:18PM +0200, Lars Marowsky-Bree wrote:
> On 2005-07-20T09:55:31, "Walker, Bruce J (HP-Labs)" <bruce.walker at hp.com> wrote:
> 
> > Like Lars, I too was under the wrong impression about this configfs
> > "nodemanager" kernel component.  Our discussions in the cluster
> > meeting Monday and Tuesday were assuming it was a general service that
> > other kernel components could/would utilize and possibly also
> > something that could send uevents to non-kernel components wanting a
> > std. way to see membership information/events.
> 
> Let me clarify that this was something we briefly touched on in
> Walldorf: The node manager would (re-)export the current data via sysfs
> (which would result in uevents being sent, too), and not something we
> dreamed up just Monday ;-)

	In turn, let me clarify a little where configfs fits in to
things.  Configfs is merely a convenient and transparent method to
communicate configuration to kernel objects.  It's not a place for
uevents, for netlink sockets, or for fancy communication.  It allows
userspace to create an in-kernel object and set/get values on that
object.  It also allows userspace and kernelspace to share the same
representation of that object and its values.
	For more complex interaction, sysfs and procfs are often more
appropriate.  While you might "configure" all known nodes in configfs,
the node up/down state might live in sysfs.  A netlink socket for
up/down events might live in procfs.  And so on.

Joel

-- 

"But all my words come back to me
 In shades of mediocrity.
 Like emptiness in harmony
 I need someone to comfort me."

Joel Becker
Senior Member of Technical Staff
Oracle
E-mail: joel.becker at oracle.com
Phone: (650) 506-8127


From lhh at redhat.com  Wed Jul 20 20:30:17 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 20 Jul 2005 16:30:17 -0400
Subject: [Linux-cluster] Using cman,etc for a non-gfs app
In-Reply-To: <1121884309.13635.34.camel@cocagne.max-t.internal>
References: <1119467822.21167.34.camel@cocagne.max-t.internal>
	<1119479792.24237.315.camel@ayanami.boston.redhat.com>
	<1121464976.13635.13.camel@cocagne.max-t.internal>
	<1121881784.3498.67.camel@ayanami.boston.redhat.com>
	<1121884309.13635.34.camel@cocagne.max-t.internal>
Message-ID: <1121891418.3498.78.camel@ayanami.boston.redhat.com>

On Wed, 2005-07-20 at 14:31 -0400, Olivier Crete wrote:
> On Wed, 2005-20-07 at 13:49 -0400, Lon Hohberger wrote:
> > On Fri, 2005-07-15 at 18:02 -0400, Olivier Crete wrote:
> > > On Wed, 2005-22-06 at 18:36 -0400, Lon Hohberger wrote:
> > > > * libmagma provides cp_fence() / clu_fence() which work on both CMAN and
> > > > gulm.
> > > 
> > > I've looked implementing clu_fence() in libmagma. It seems to only call
> > > an ioctl in the kernel cman which only seems to do cooperative fencing
> > > and removing a node from the cluster... I dont see any call to fenced or
> > > fence_node. Or have I missed something ?
> > > 
> > 
> > Correct, it removes the node via libgulm or asking CMAN to expire and
> > fence the node.
> 
> But cman does not seem to really fence the node... just remove it from
> its list..  So that's not enough for our app that has shared storage..
> Ideally, I would need the same behavior as stonith and have the call
> block and return only when we are 100% sure that the node have been
> fenced.

It's supposed to fence it.  At least, it used to fence it...  I wonder
if something changed that I didn't know about?

Basically, the idea is that we tell CMAN to force-expire the node.  It
should then take recovery actions to restore the node.

fenced is running, right?

-- Lon


From Birger.Wathne at ift.uib.no  Wed Jul 20 20:55:09 2005
From: Birger.Wathne at ift.uib.no (Birger Wathne)
Date: Wed, 20 Jul 2005 22:55:09 +0200
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <1121878947.3498.25.camel@ayanami.boston.redhat.com>
References: <38A48FA2F0103444906AD22E14F1B5A3543AF2@mailxchg01.corp.opsource.net>
	<1121878947.3498.25.camel@ayanami.boston.redhat.com>
Message-ID: <42DEBA2D.80503@uib.no>

Lon Hohberger wrote:


> Basically, the power button should really be a power button -- not a
> soft-power-button.  Some BIOSes allow you to toggle whether it's a "soft
> power" or "hard power".

Won't most bioses do a hard power if you just press the power button long 
enough?

-- 
birger


From lhh at redhat.com  Wed Jul 20 22:06:13 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Wed, 20 Jul 2005 18:06:13 -0400
Subject: [Linux-cluster] kernel: CMANsendmsg failed: -101
In-Reply-To: <42DEBA2D.80503@uib.no>
References: <38A48FA2F0103444906AD22E14F1B5A3543AF2@mailxchg01.corp.opsource.net>
	<1121878947.3498.25.camel@ayanami.boston.redhat.com>
	<42DEBA2D.80503@uib.no>
Message-ID: <1121897173.3498.84.camel@ayanami.boston.redhat.com>

On Wed, 2005-07-20 at 22:55 +0200, Birger Wathne wrote:
> Lon Hohberger wrote:
> 
> 
> > Basically, the power button should really be a power button -- not a
> > soft-power-button.  Some BIOSes allow you to toggle whether it's a "soft
> > power" or "hard power".
> 
> Won't most bioses do a hard power if you just press the power button long 
> enough?

Yeah, but iLO doesn't do this with the versions we support.  It
simulates a "button press".

The 1.70+ firmware has "hold_power_button", but the 1.62 firmware only
has "set_power_state off" (or something like it) which translates to
"press_power_button" in the 1.70/1.75 firmware.

The newer firmware solves this problem with the press_power_button,
hold_power_button distinction, but unfortunately, we need to make the
iLO agent more intelligent as to how it controls power without breaking
compatibility with the 1.62 firmware.

Ugly :)

-- Lon


From rhcluster at prioritynetworks.net  Thu Jul 21 00:51:56 2005
From: rhcluster at prioritynetworks.net (High Mobley)
Date: Wed, 20 Jul 2005 17:51:56 -0700
Subject: [Linux-cluster] Compile errors
Message-ID: <42DEF1AC.8070804@prioritynetworks.net>

I'm in need of the CLVM software for a project that I'm working on. It
appears that I require the full RH Cluster software suite in order to
run CLVM, so I've been trying to follow the information that I found in
your usage.txt file, as well as the INSTALL file that came with the cvs
code's STABLE branch.

I was initially trying to compile against recent 2.6.x kernels, but then
I noticed that the kernel patches in the STABLE branch were in a
subdirectory called "2.6.9". So my latest attempts have been to compile
against linux-2.6.9 kernel sources. I have tried both the STABLE and
RHEL4 branches. Both have failed to compile.

I am running Debian Sarge, but I am using vanilla kernel.org sources.
Thank you in advance for any direction that is provided!

Following is the output from 'make install' on the STABLE branch:

strife:/usr/2.6.9-cluster/cluster# make install
cd cman-kernel && make install
sbindir=/usr/2.6.9-cluster/cluster/build/sbin
libdir=/usr/2.6.9-cluster/cluster/build/lib
mandir=/usr/2.6.9-cluster/cluster/build/man
incdir=/usr/2.6.9-cluster/cluster/build/incdir
module_dir=/usr/2.6.9-cluster/cluster/build/module
sharedir=/usr/2.6.9-cluster/cluster/build
slibdir=/usr/2.6.9-cluster/cluster/build/slib
DESTDIR=/usr/2.6.9-cluster/cluster/build
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/cman-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/cman-kernel/src'
rm -f cluster
ln -s . cluster
make -C /usr/2.6.9-cluster/linux-2.6.9
M=/usr/2.6.9-cluster/cluster/cman-kernel/src modules USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   CC [M]  /usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.o
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c: In function
`cl_alloc_sock':
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:926: warning:
passing arg 3 of `sk_alloc' makes integer from pointer without a cast
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:926: warning:
passing arg 4 of `sk_alloc' makes pointer from integer without a cast
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c: In function `cl_bind':
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:1067: error:
`SOCK_ZAPPED' undeclared (first use in this function)
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:1067: error: (Each
undeclared identifier is reported only once
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:1067: error: for
each function it appears in.)
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c: At top level:
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4175: error: unknown
field `owner' specified in initializer
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4175: warning:
missing braces around initializer
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4175: warning: (near
initialization for `cl_proto.stats')
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4175: warning:
initialization makes integer from pointer without a cast
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4176: error: unknown
field `obj_size' specified in initializer
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4176: warning:
excess elements in struct initializer
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4176: warning: (near
initialization for `cl_proto')
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c: In function
`cluster_init':
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4191: warning:
implicit declaration of function `proto_register'
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4198: warning:
implicit declaration of function `proto_unregister'
make[4]: *** [/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.o] Error 1
make[3]: *** [_module_/usr/2.6.9-cluster/cluster/cman-kernel/src] Error 2
make[3]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/cman-kernel/src'
make[1]: *** [install] Error 2
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/cman-kernel'
make: *** [all] Error 2


And here is the output from 'make install' with the RHEL4 branch:

cd cman-kernel && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/cman-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/cman-kernel/src'
rm -f cluster
ln -s . cluster
make -C /usr/2.6.9-cluster/linux-2.6.9
M=/usr/2.6.9-cluster/cluster/cman-kernel/src modules USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule"
[/usr/2.6.9-cluster/cluster/cman-kernel/src/cman.ko] undefined!
make[3]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //lib/modules/2.6.9/kernel/cluster
install cman.ko //lib/modules/2.6.9/kernel/cluster
install -d //usr/include/cluster
install cnxman.h cnxman-socket.h service.h //usr/include/cluster
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/cman-kernel/src'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/cman-kernel'
cd dlm-kernel && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/dlm-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/dlm-kernel/src'
if [ ! -e cluster ]; then ln -s . cluster; fi
if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
if [ ! -e cnxman-socket.h ]; then cp
//usr/include/cluster/cnxman-socket.h .; fi
make -C /usr/2.6.9-cluster/linux-2.6.9
M=/usr/2.6.9-cluster/cluster/dlm-kernel/src modules USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "kcl_addref_cluster"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_node_by_addr"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_node_addresses"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_releaseref_cluster"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_current_interface"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_node_by_nodeid"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_leave_service"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "preempt_schedule"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_remove_callback"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_global_service_id"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_unregister_service"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_join_service"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_start_done"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_add_callback"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_register_service"
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
make[3]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //lib/modules/2.6.9/kernel/cluster
install dlm.ko //lib/modules/2.6.9/kernel/cluster
install -d //usr/include/cluster
install dlm.h dlm_device.h //usr/include/cluster
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/dlm-kernel/src'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/dlm-kernel'
cd gfs-kernel && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src'
cd harness && make install
make[3]: Entering directory
`/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness'
rm -f linux
ln -s . linux
make -C /usr/2.6.9-cluster/linux-2.6.9
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness
symverfile=/usr/2.6.9-cluster/linux-2.6.9/../kernel/cluster/dlm.symvers
modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness/lock_harness.ko]
undefined!
   CC
/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness/lock_harness.mod.o
   LD [M]  /usr/2.6.9-cluster/cluster/gfs-kernel/src/harness/lock_harness.ko
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
/usr/2.6.9-cluster/linux-2.6.9/scripts/mod/modpost -m -i
/usr/2.6.9-cluster/linux-2.6.9/../kernel/cluster/dlm.symvers
../harness/lock_harness.o -o lock_harness.symvers
install -d //usr/include/linux
install lm_interface.h //usr/include/linux
install -d //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_harness
install lock_harness.ko
//lib/modules/2.6.9/kernel/fs/gfs_locking/lock_harness
make[3]: Leaving directory
`/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness'
cd nolock && make install
make[3]: Entering directory
`/usr/2.6.9-cluster/cluster/gfs-kernel/src/nolock'
echo

rm -f linux lm_interface.h
ln -s . linux
ln -s ../../src/harness/lm_interface.h .
make -C /usr/2.6.9-cluster/linux-2.6.9
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/nolock
symverfile=/usr/2.6.9-cluster/cluster/gfs-kernel/src/nolock/../harness/lock_harness.symvers 

modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_nolock
install lock_nolock.ko //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_nolock
make[3]: Leaving directory
`/usr/2.6.9-cluster/cluster/gfs-kernel/src/nolock'
cd dlm && make install
make[3]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm'
if [ ! -e linux ]; then ln -s . linux; fi
if [ ! -e cluster ]; then ln -s . cluster; fi
if [ ! -e lm_interface.h ]; then ln -s ../../src/harness/lm_interface.h
.; fi
if [ ! -e dlm.h ]; then cp //usr/include/cluster/dlm.h .; fi
if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
if [ ! -e cnxman-socket.h ]; then cp
//usr/include/cluster/cnxman-socket.h .; fi
if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
make -C /usr/2.6.9-cluster/linux-2.6.9
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm
symverfile=/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/../harness/lock_harness.symvers 

modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "kcl_addref_cluster"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_get_services"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_cluster_name"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_lock"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_releaseref_cluster"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_get_members"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_query"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_leave_service"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "preempt_schedule"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_unregister_service"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_join_service"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_release_lockspace"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_start_done"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_unlock"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_register_service"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_new_lockspace"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_debug_dump"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
   CC      /usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.mod.o
   LD [M]  /usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
/usr/2.6.9-cluster/linux-2.6.9/scripts/mod/modpost -m -i
/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/../harness/lock_harness.symvers 

../dlm/lock_dlm.o -o lock_dlm.symvers
install -d //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_dlm
install lock_dlm.ko //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_dlm
make[3]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm'
cd gulm && make install
make[3]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm'
rm -f linux lm_interface.h
ln -s . linux
ln -s ../../src/harness/lm_interface.h .
make -C /usr/2.6.9-cluster/linux-2.6.9
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm
symverfile=/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/../harness/lock_harness.symvers 

modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/lock_gulm.ko] undefined!
   CC      /usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/lock_gulm.mod.o
   LD [M]  /usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/lock_gulm.ko
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
/usr/2.6.9-cluster/linux-2.6.9/scripts/mod/modpost -m -i
/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/../harness/lock_harness.symvers 

../gulm/lock_gulm.o -o lock_gulm.symvers
install -d //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_gulm
install lock_gulm.ko //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_gulm
make[3]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm'
cd gfs && make install
make[3]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs'
rm -f linux lm_interface.h
ln -s . linux
ln -s ../../src/harness/lm_interface.h .
make -C /usr/2.6.9-cluster/linux-2.6.9
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs
symverfile=/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs/../gulm/lock_gulm.symvers 

modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule"
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs/gfs.ko] undefined!
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //lib/modules/2.6.9/kernel/fs/gfs
install gfs.ko //lib/modules/2.6.9/kernel/fs/gfs
install -d //usr/include/linux
install gfs_ondisk.h gfs_ioctl.h //usr/include/linux
make[3]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs'
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel'
cd gnbd-kernel && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/gnbd-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/gnbd-kernel/src'
rm -f linux
ln -s . linux
make -C /usr/2.6.9-cluster/linux-2.6.9
M=/usr/2.6.9-cluster/cluster/gnbd-kernel/src modules USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule"
[/usr/2.6.9-cluster/cluster/gnbd-kernel/src/gnbd.ko] undefined!
make[3]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //usr/include/linux
install gnbd.h //usr/include/linux
install -d //lib/modules/2.6.9/kernel/drivers/block/gnbd
install gnbd.ko //lib/modules/2.6.9/kernel/drivers/block/gnbd
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/gnbd-kernel/src'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/gnbd-kernel'
cd magma && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/magma'
cd lib && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/magma/lib'
install -d /usr/lib
install -d /usr/lib
#
# Install high level magma libraries
#
install -m 0644 libmagma.a /usr/lib
install -m 0644 libmagma.so.DEVEL.1121902825 /usr/lib
cd /usr/lib; ln -snf libmagma.so.DEVEL.1121902825 libmagma.so.DEVEL; cd -
cd /usr/lib; ln -snf libmagma.so.DEVEL.1121902825 libmagma.so; cd -
#
# Install low-overhead, non-pthread magma libraries
#
install -m 0644 libmagma_nt.a /usr/lib
install -m 0644 libmagma_nt.so.DEVEL.1121902825 /usr/lib
cd /usr/lib; ln -snf libmagma_nt.so.DEVEL.1121902825
libmagma_nt.so.DEVEL; cd -
cd /usr/lib; ln -snf libmagma_nt.so.DEVEL.1121902825 libmagma_nt.so; cd -
#
# Install high level TCP messaging libraries
#
install -m 0644 libmagmamsg.a /usr/lib
install -m 0644 libmagmamsg.so.DEVEL.1121902825 /usr/lib
cd /usr/lib; ln -snf libmagmamsg.so.DEVEL.1121902825
libmagmamsg.so.DEVEL; cd -
cd /usr/lib; ln -snf libmagmamsg.so.DEVEL.1121902825 libmagmamsg.so; cd -
#
# Install headers for magma, magma_nt, and magmamsg
#
install -d /usr/include
install -m 0644 magma.h /usr/include
install -m 0644 magmamsg.h /usr/include
install -m 0644 magma-build.h /usr/include
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/magma/lib'
cd man && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/magma/man'
install -d /usr/share/man/man3
install -d /usr/share/man/man8
install clu_connect.3 clu_disconnect.3 clu_get_event.3 /usr/share/man/man3
install magma_tool.8 /usr/share/man/man8
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/magma/man'
cd tests && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/magma/tests'
install -d /sbin
install -m 0755 magma_tool circleping cluster_cmd /sbin
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/magma/tests'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/magma'
cd ccs && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/ccs'
cd daemon && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/ccs/daemon'
gcc -Wall -I. -I../include -I../common -I../config -I//usr/include -Wall
-O2 -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE `xml2-config --cflags`
-DCCS_RELEASE_NAME=\"DEVEL.1121902825\" -I. -I../include -I../common
-I../config -I//usr/include -o ccsd ccsd.c cnx_mgr.c cluster_mgr.c
misc.c ../common/log.c globals.c -L//usr/lib `xml2-config --libs`
-lpthread -lmagma -lmagmamsg -ldl
/bin/sh: line 1: xml2-config: command not found
/bin/sh: line 1: xml2-config: command not found
ccsd.c:26:27: libxml/parser.h: No such file or directory
ccsd.c: In function `check_cluster_conf':
ccsd.c:502: error: `xmlDocPtr' undeclared (first use in this function)
ccsd.c:502: error: (Each undeclared identifier is reported only once
ccsd.c:502: error: for each function it appears in.)
ccsd.c:502: error: syntax error before "doc"
ccsd.c:505: error: `doc' undeclared (first use in this function)
ccsd.c:505: warning: implicit declaration of function `xmlParseFile'
ccsd.c:514: warning: implicit declaration of function `xmlFreeDoc'
cnx_mgr.c:25:27: libxml/parser.h: No such file or directory
cnx_mgr.c:26:25: libxml/tree.h: No such file or directory
cnx_mgr.c:27:26: libxml/xpath.h: No such file or directory
cnx_mgr.c:28:35: libxml/xpathInternals.h: No such file or directory
In file included from cnx_mgr.c:33:
misc.h:17: error: syntax error before "xmlDocPtr"
misc.h:17: warning: no semicolon at end of struct or union
misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
misc.h:18: warning: data definition has no type or storage class
misc.h:24: error: syntax error before '*' token
misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
misc.h:24: warning: data definition has no type or storage class
misc.h:26: error: syntax error before "ldoc"
misc.h:27: error: syntax error before "ldoc"
cnx_mgr.c:41: error: syntax error before "open_doc_t"
cnx_mgr.c:41: warning: no semicolon at end of struct or union
cnx_mgr.c:42: warning: type defaults to `int' in declaration of `oc_ctx'
cnx_mgr.c:42: warning: data definition has no type or storage class
cnx_mgr.c:43: warning: type defaults to `int' in declaration of
`open_connection_t'
cnx_mgr.c:43: warning: data definition has no type or storage class
cnx_mgr.c:48: error: syntax error before '*' token
cnx_mgr.c:48: warning: type defaults to `int' in declaration of `ocs'
cnx_mgr.c:48: warning: data definition has no type or storage class
cnx_mgr.c: In function `_update_config':
cnx_mgr.c:53: error: `tmp_odoc' undeclared (first use in this function)
cnx_mgr.c:53: error: (Each undeclared identifier is reported only once
cnx_mgr.c:53: error: for each function it appears in.)
cnx_mgr.c:54: error: `xmlDocPtr' undeclared (first use in this function)
cnx_mgr.c:54: error: syntax error before "tmp_doc"
cnx_mgr.c:58: error: `tmp_doc' undeclared (first use in this function)
cnx_mgr.c:58: warning: implicit declaration of function `xmlParseFile'
cnx_mgr.c:67: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:68: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:90: error: request for member `od_refs' in something not a
structure or union
cnx_mgr.c:92: warning: implicit declaration of function `xmlFreeDoc'
cnx_mgr.c:92: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:93: warning: implicit declaration of function `free'
cnx_mgr.c:104: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c: In function `broadcast_for_doc':
cnx_mgr.c:157: error: `xmlDocPtr' undeclared (first use in this function)
cnx_mgr.c:157: error: syntax error before "tmp_doc"
cnx_mgr.c:164: warning: implicit declaration of function `exit'
cnx_mgr.c:164: error: `EXIT_FAILURE' undeclared (first use in this function)
cnx_mgr.c:289: warning: implicit declaration of function `srandom'
cnx_mgr.c:293: warning: implicit declaration of function `random'
cnx_mgr.c:324: error: `tmp_doc' undeclared (first use in this function)
cnx_mgr.c:324: warning: implicit declaration of function `xmlParseMemory'
cnx_mgr.c:349: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:352: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:357: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:366: error: request for member `od_refs' in something not a
structure or union
cnx_mgr.c:367: error: `tmp_odoc' undeclared (first use in this function)
cnx_mgr.c:376: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:377: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:386: error: request for member `od_refs' in something not a
structure or union
cnx_mgr.c:396: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:397: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:450: warning: implicit declaration of function `xmlDocDump'
cnx_mgr.c:450: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c: In function `process_connect':
cnx_mgr.c:488: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:513: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:514: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:515: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:518: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:521: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:522: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:523: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:526: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:527: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:531: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:532: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:542: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:577: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:583: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:609: error: syntax error before ')' token
cnx_mgr.c:617: error: request for member `od_refs' in something not a
structure or union
cnx_mgr.c:618: error: request for member `oc_odoc' in something not a
structure or union
cnx_mgr.c:619: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:619: warning: implicit declaration of function
`xmlXPathNewContext'
cnx_mgr.c:619: error: request for member `oc_odoc' in something not a
structure or union
cnx_mgr.c:620: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:621: error: request for member `oc_odoc' in something not a
structure or union
cnx_mgr.c:631: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c: In function `process_disconnect':
cnx_mgr.c:658: error: `tmp_odoc' undeclared (first use in this function)
cnx_mgr.c:679: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:680: warning: implicit declaration of function
`xmlXPathFreeContext'
cnx_mgr.c:680: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:682: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:683: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:685: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:686: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:688: error: request for member `oc_odoc' in something not a
structure or union
cnx_mgr.c:692: error: `EXIT_FAILURE' undeclared (first use in this function)
cnx_mgr.c: In function `_process_get':
cnx_mgr.c:734: error: `xmlXPathObjectPtr' undeclared (first use in this
function)
cnx_mgr.c:734: error: syntax error before "obj"
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:758: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:764: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:765: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:766: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:768: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:774: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:784: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:789: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:792: error: `obj' undeclared (first use in this function)
cnx_mgr.c:792: warning: implicit declaration of function
`xmlXPathEvalExpression'
cnx_mgr.c:792: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:797: error: `xmlNodePtr' undeclared (first use in this function)
cnx_mgr.c:797: error: syntax error before "node"
cnx_mgr.c:801: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:802: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:807: error: `node' undeclared (first use in this function)
cnx_mgr.c:807: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:822: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
function)
cnx_mgr.c:823: error: `XML_ELEMENT_NODE' undeclared (first use in this
function)
cnx_mgr.c:868: warning: implicit declaration of function
`xmlXPathFreeObject'
cnx_mgr.c: In function `process_get_list':
cnx_mgr.c:897: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c: In function `process_get_state':
cnx_mgr.c:964: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:964: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:965: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:966: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:973: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:974: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:974: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:976: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:984: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:985: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:993: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c: In function `process_set_state':
cnx_mgr.c:1033: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:1034: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:1035: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:1038: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:1039: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:1040: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:1043: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c: In function `process_request':
cnx_mgr.c:1160: warning: implicit declaration of function `realloc'
cnx_mgr.c: In function `process_broadcast':
cnx_mgr.c:1247: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:1248: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:1266: warning: implicit declaration of function
`xmlDocDumpFormatMemory'
cnx_mgr.c:1266: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:1267: error: `xmlChar' undeclared (first use in this function)
cnx_mgr.c:1267: error: syntax error before ')' token
cnx_mgr.c:1301: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:1302: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:24:27: libxml/parser.h: No such file or directory
In file included from cluster_mgr.c:29:
misc.h:17: error: syntax error before "xmlDocPtr"
misc.h:17: warning: no semicolon at end of struct or union
misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
misc.h:18: warning: data definition has no type or storage class
misc.h:24: error: syntax error before '*' token
misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
misc.h:24: warning: data definition has no type or storage class
misc.h:26: error: syntax error before "ldoc"
misc.h:27: error: syntax error before "ldoc"
cluster_mgr.c:38: error: syntax error before "tmp_doc"
cluster_mgr.c: In function `check_update_doc':
cluster_mgr.c:45: error: `tmp_doc' undeclared (first use in this function)
cluster_mgr.c:45: error: (Each undeclared identifier is reported only once
cluster_mgr.c:45: error: for each function it appears in.)
cluster_mgr.c:51: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:52: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:64: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:65: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:67: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:67: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c: In function `handle_cluster_message':
cluster_mgr.c:93: error: `xmlDocPtr' undeclared (first use in this function)
cluster_mgr.c:93: error: syntax error before "tmp_doc"
cluster_mgr.c:146: error: `tmp_doc' undeclared (first use in this function)
cluster_mgr.c:146: warning: implicit declaration of function
`xmlParseMemory'
cluster_mgr.c:167: warning: implicit declaration of function `xmlDocDump'
cluster_mgr.c:198: warning: implicit declaration of function `xmlParseFile'
cluster_mgr.c:247: warning: implicit declaration of function `xmlFreeDoc'
misc.c:25:27: libxml/parser.h: No such file or directory
misc.c:27:27: libxml/parser.h: No such file or directory
misc.c:28:25: libxml/tree.h: No such file or directory
misc.c:29:26: libxml/xpath.h: No such file or directory
misc.c:30:35: libxml/xpathInternals.h: No such file or directory
In file included from misc.c:35:
misc.h:17: error: syntax error before "xmlDocPtr"
misc.h:17: warning: no semicolon at end of struct or union
misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
misc.h:18: warning: data definition has no type or storage class
misc.h:24: error: syntax error before '*' token
misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
misc.h:24: warning: data definition has no type or storage class
misc.h:26: error: syntax error before "ldoc"
misc.h:27: error: syntax error before "ldoc"
misc.c:42: error: syntax error before '*' token
misc.c:42: warning: type defaults to `int' in declaration of `master_doc'
misc.c:42: warning: data definition has no type or storage class
misc.c:44: error: syntax error before "ldoc"
misc.c: In function `get_doc_version':
misc.c:47: error: `xmlXPathObjectPtr' undeclared (first use in this
function)
misc.c:47: error: (Each undeclared identifier is reported only once
misc.c:47: error: for each function it appears in.)
misc.c:47: error: syntax error before "obj"
misc.c:48: error: `xmlXPathContextPtr' undeclared (first use in this
function)
misc.c:49: error: `xmlNodePtr' undeclared (first use in this function)
misc.c:53: error: `ctx' undeclared (first use in this function)
misc.c:53: warning: implicit declaration of function `xmlXPathNewContext'
misc.c:53: error: `ldoc' undeclared (first use in this function)
misc.c:60: error: `obj' undeclared (first use in this function)
misc.c:60: warning: implicit declaration of function
`xmlXPathEvalExpression'
misc.c:67: error: `node' undeclared (first use in this function)
misc.c:68: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
function)
misc.c:91: warning: implicit declaration of function `xmlXPathFreeContext'
misc.c:94: warning: implicit declaration of function `xmlXPathFreeObject'
misc.c: At top level:
misc.c:109: error: syntax error before "ldoc"
misc.c: In function `get_cluster_name':
misc.c:112: error: `xmlXPathObjectPtr' undeclared (first use in this
function)
misc.c:112: error: syntax error before "obj"
misc.c:113: error: `xmlXPathContextPtr' undeclared (first use in this
function)
misc.c:114: error: `xmlNodePtr' undeclared (first use in this function)
misc.c:118: error: `ctx' undeclared (first use in this function)
misc.c:118: error: `ldoc' undeclared (first use in this function)
misc.c:125: error: `obj' undeclared (first use in this function)
misc.c:132: error: `node' undeclared (first use in this function)
misc.c:133: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
function)
make[2]: *** [ccsd] Error 1
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/ccs/daemon'
make[1]: *** [install] Error 2
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/ccs'
make: *** [install] Error 2


High Mobley
Priority Networks


From teigland at redhat.com  Thu Jul 21 03:10:50 2005
From: teigland at redhat.com (David Teigland)
Date: Thu, 21 Jul 2005 11:10:50 +0800
Subject: [Linux-cluster] Compile errors
In-Reply-To: <42DEF1AC.8070804@prioritynetworks.net>
References: <42DEF1AC.8070804@prioritynetworks.net>
Message-ID: <20050721031050.GA10109@redhat.com>

On Wed, Jul 20, 2005 at 05:51:56PM -0700, High Mobley wrote:
> I'm in need of the CLVM software for a project that I'm working on. It
> appears that I require the full RH Cluster software suite in order to
> run CLVM, so I've been trying to follow the information that I found in
> your usage.txt file, as well as the INSTALL file that came with the cvs
> code's STABLE branch.
> 
> I was initially trying to compile against recent 2.6.x kernels, but then
> I noticed that the kernel patches in the STABLE branch were in a
> subdirectory called "2.6.9". So my latest attempts have been to compile
> against linux-2.6.9 kernel sources. I have tried both the STABLE and
> RHEL4 branches. Both have failed to compile.
> 
> I am running Debian Sarge, but I am using vanilla kernel.org sources.
> Thank you in advance for any direction that is provided!

Don't use cvs, use the release instead:
 ftp://sources.redhat.com/pub/cluster/releases/cluster-1.00.00.tar.gz

This compiles against vanilla linux-2.6.12 and doesn't use kernel patches.

Dave


From pcaulfie at redhat.com  Thu Jul 21 06:57:45 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Thu, 21 Jul 2005 07:57:45 +0100
Subject: [Linux-cluster] cman ignores interface setting on ipv4
In-Reply-To: <1121876980.13635.32.camel@cocagne.max-t.internal>
References: <1121811743.13635.26.camel@cocagne.max-t.internal>	<42DE73DA.6080408@redhat.com>
	<1121876980.13635.32.camel@cocagne.max-t.internal>
Message-ID: <42DF4769.2080300@redhat.com>

Olivier Crete wrote:
> 
> Make sure it doesnt break in the case where the interface isnt
> specified.. I didn't test that case.

The command-line and parsers should prevent that happening in the first place,
but it will be carefully tested :-)

> 
>>Multicast should (well, does for me!) work anyway provided you only use one
>>interface and it matches the host name - which it always should all you'll run
>>into other problems.
>>
>>Were you actually having problems with it without the patch ?
> 
> 
> Yes, our nodes are linked using two different networks, a management
> network (Ethernet) and a cluster interconnect (myrinet).. and I want the
> heartbeat stuff to happen over myrinet (this is what our application
> uses).. But the hostname of the nodes resolves to the ip of the Ethernet
> card.

OK that sounds expected. We always recommend that the hostname used for cman
is the one that resolves to the interface you are using - mainly to avoid
confusion because broadcast depends on it!

-- 

patrick


From eelizur.redhat.cluster at gmail.com  Thu Jul 21 11:37:35 2005
From: eelizur.redhat.cluster at gmail.com (eli elizur)
Date: Thu, 21 Jul 2005 14:37:35 +0300
Subject: [Linux-cluster] Cluter MIB & Trap
Message-ID: <35a448fb0507210437514fd07c@mail.gmail.com>

Hi list,

I have Linux AS 2.1 running on X86
I have configure the Redhat Cluster.

Dose the cluster has a mib I can get info via the snmp ?
Is there a trap ability to alert on error ?

10x,
Eli


From Simon.Gibbs at informa.com  Wed Jul 20 12:12:15 2005
From: Simon.Gibbs at informa.com (Gibbs, Simon)
Date: Wed, 20 Jul 2005 13:12:15 +0100
Subject: [Linux-cluster] CMAN/CCSD errors after fresh install of Cluster
	Suite
Message-ID: <BF03FE2F.BE12%Simon.Gibbs@informa.com>

Hi

I?ve installed RHCS on 2 nodes with fresh installs of RHEL4 but am .running
into problems when trying to configure the nodes using the GUI tool.
When opening the GUI I?m prompted to create/save a cluster.conf file (in
/etc/cluster) but there aren?t any options to add/configure any of the
nodes. Scanning through the Redhat docs it tells me that the ?management
tab? will not be available until the next time I start the GUI. So I start
up the GUI and I get an error telling me that none of the nodes are
configured so I won?t be able to manage them??!!

On further inspection I can see that ccsd is starting up, but cman isn?t ?
which I guess is causing the problem? Here?s the output from
/var/log/messages:

On starting ccsd:

Jul 20 13:08:51 UKFS02 ccsd[4415]: Starting ccsd 1.0.0:
Jul 20 13:08:51 UKFS02 ccsd[4415]:  Built: Jun  7 2005 12:19:51
Jul 20 13:08:51 UKFS02 ccsd[4415]:  Copyright (C) Red Hat, Inc.  2004  All
rights reserved. 
Jul 20 13:08:52 UKFS02 ccsd:  succeeded

Which looks OK.....

On starting cman:

Jul 20 13:09:29 UKFS02 kernel: CMAN 2.6.9-36.0 (built May 31 2005 12:15:02)
installed
Jul 20 13:09:29 UKFS02 kernel: NET: Registered protocol family 30
Jul 20 13:09:29 UKFS02 ccsd[4415]: cluster.conf (cluster name =
alpha_cluster, version = 3) found.
Jul 20 13:09:32 UKFS02 cman: cman_tool: local node name "UKFS02" not found
in cluster.conf failed
Jul 20 13:09:21 UKFS02 ccsd[4415]: Unable to connect to cluster
infrastructure after 30 seconds
Jul 20 13:09:21 UKFS02 ccsd[4415]: Unable to connect to cluster
infrastructure after 60 seconds
Jul 20 13:10:21 UKFS02 ccsd[4415]: Unable to connect to cluster
infrastructure after 90 seconds.
Jul 20 13:10:51 UKFS02 ccsd[4415]: Unable to connect to cluster
infrastructure after 120 seconds.
Jul 20 13:11:21 UKFS02 ccsd[4415]: Unable to connect to cluster
infrastructure after 150 seconds.
Jul 20 13:11:51 UKFS02 ccsd[4415]: Unable to connect to cluster
infrastructure after 180 seconds.

I?ve checked /etc/hosts, NIC config and shared storage and all appear to be
configured correctly.
Has anyone got ay ideas or can you point me in the right direction?

Thanks,

Simon


********************************************************************************
The information contained in this email message may be confidential. If you are not the intended recipient, any use, interference with, disclosure or copying of this material is unauthorised and prohibited. Although this message and any attachments are believed to be free of viruses, no responsibility is accepted by T&F Informa for any loss or damage arising in any way from receipt or use thereof.  Messages to and from the company are monitored for operational reasons and in accordance with lawful business practices. 
If you have received this message in error, please notify us by return and delete the message and any attachments.  Further enquiries/returns can be sent to postmaster at tfinforma.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050720/296e0e0f/attachment.htm>

From high.mobley at prioritynetworks.net  Thu Jul 21 00:39:23 2005
From: high.mobley at prioritynetworks.net (High Mobley)
Date: Wed, 20 Jul 2005 17:39:23 -0700
Subject: [Linux-cluster] Compile errors
Message-ID: <42DEEEBB.9000808@prioritynetworks.net>

I'm in need of the CLVM software for a project that I'm working on. It 
appears that I require the full RH Cluster software suite in order to 
run CLVM, so I've been trying to follow the information that I found in 
your usage.txt file, as well as the INSTALL file that came with the cvs 
code's STABLE branch.

I was initially trying to compile against recent 2.6.x kernels, but then 
I noticed that the kernel patches in the STABLE branch were in a 
subdirectory called "2.6.9". So my latest attempts have been to compile 
against linux-2.6.9 kernel sources. I have tried both the STABLE and 
RHEL4 branches. Both have failed to compile.

I am running Debian Sarge, but I am using vanilla kernel.org sources. 
Thank you in advance for any direction that is provided!

Following is the output from 'make install' on the STABLE branch:

strife:/usr/2.6.9-cluster/cluster# make install
cd cman-kernel && make install 
sbindir=/usr/2.6.9-cluster/cluster/build/sbin 
libdir=/usr/2.6.9-cluster/cluster/build/lib 
mandir=/usr/2.6.9-cluster/cluster/build/man 
incdir=/usr/2.6.9-cluster/cluster/build/incdir 
module_dir=/usr/2.6.9-cluster/cluster/build/module 
sharedir=/usr/2.6.9-cluster/cluster/build 
slibdir=/usr/2.6.9-cluster/cluster/build/slib 
DESTDIR=/usr/2.6.9-cluster/cluster/build
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/cman-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/cman-kernel/src'
rm -f cluster
ln -s . cluster
make -C /usr/2.6.9-cluster/linux-2.6.9 
M=/usr/2.6.9-cluster/cluster/cman-kernel/src modules USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   CC [M]  /usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.o
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c: In function 
`cl_alloc_sock':
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:926: warning: 
passing arg 3 of `sk_alloc' makes integer from pointer without a cast
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:926: warning: 
passing arg 4 of `sk_alloc' makes pointer from integer without a cast
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c: In function `cl_bind':
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:1067: error: 
`SOCK_ZAPPED' undeclared (first use in this function)
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:1067: error: (Each 
undeclared identifier is reported only once
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:1067: error: for 
each function it appears in.)
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c: At top level:
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4175: error: unknown 
field `owner' specified in initializer
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4175: warning: 
missing braces around initializer
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4175: warning: (near 
initialization for `cl_proto.stats')
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4175: warning: 
initialization makes integer from pointer without a cast
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4176: error: unknown 
field `obj_size' specified in initializer
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4176: warning: 
excess elements in struct initializer
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4176: warning: (near 
initialization for `cl_proto')
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c: In function 
`cluster_init':
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4191: warning: 
implicit declaration of function `proto_register'
/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.c:4198: warning: 
implicit declaration of function `proto_unregister'
make[4]: *** [/usr/2.6.9-cluster/cluster/cman-kernel/src/cnxman.o] Error 1
make[3]: *** [_module_/usr/2.6.9-cluster/cluster/cman-kernel/src] Error 2
make[3]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
make[2]: *** [all] Error 2
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/cman-kernel/src'
make[1]: *** [install] Error 2
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/cman-kernel'
make: *** [all] Error 2


And here is the output from 'make install' with the RHEL4 branch:

cd cman-kernel && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/cman-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/cman-kernel/src'
rm -f cluster
ln -s . cluster
make -C /usr/2.6.9-cluster/linux-2.6.9 
M=/usr/2.6.9-cluster/cluster/cman-kernel/src modules USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule" 
[/usr/2.6.9-cluster/cluster/cman-kernel/src/cman.ko] undefined!
make[3]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //lib/modules/2.6.9/kernel/cluster
install cman.ko //lib/modules/2.6.9/kernel/cluster
install -d //usr/include/cluster
install cnxman.h cnxman-socket.h service.h //usr/include/cluster
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/cman-kernel/src'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/cman-kernel'
cd dlm-kernel && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/dlm-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/dlm-kernel/src'
if [ ! -e cluster ]; then ln -s . cluster; fi
if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
if [ ! -e cnxman-socket.h ]; then cp 
//usr/include/cluster/cnxman-socket.h .; fi
make -C /usr/2.6.9-cluster/linux-2.6.9 
M=/usr/2.6.9-cluster/cluster/dlm-kernel/src modules USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "kcl_addref_cluster" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_node_by_addr" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_node_addresses" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_releaseref_cluster" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_current_interface" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_node_by_nodeid" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_leave_service" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "preempt_schedule" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_remove_callback" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_global_service_id" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_unregister_service" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_join_service" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_start_done" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_add_callback" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_register_service" 
[/usr/2.6.9-cluster/cluster/dlm-kernel/src/dlm.ko] undefined!
make[3]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //lib/modules/2.6.9/kernel/cluster
install dlm.ko //lib/modules/2.6.9/kernel/cluster
install -d //usr/include/cluster
install dlm.h dlm_device.h //usr/include/cluster
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/dlm-kernel/src'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/dlm-kernel'
cd gfs-kernel && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src'
cd harness && make install
make[3]: Entering directory 
`/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness'
rm -f linux
ln -s . linux
make -C /usr/2.6.9-cluster/linux-2.6.9 
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness 
symverfile=/usr/2.6.9-cluster/linux-2.6.9/../kernel/cluster/dlm.symvers 
modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness/lock_harness.ko] 
undefined!
   CC 
/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness/lock_harness.mod.o
   LD [M]  /usr/2.6.9-cluster/cluster/gfs-kernel/src/harness/lock_harness.ko
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
/usr/2.6.9-cluster/linux-2.6.9/scripts/mod/modpost -m -i 
/usr/2.6.9-cluster/linux-2.6.9/../kernel/cluster/dlm.symvers 
../harness/lock_harness.o -o lock_harness.symvers
install -d //usr/include/linux
install lm_interface.h //usr/include/linux
install -d //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_harness
install lock_harness.ko 
//lib/modules/2.6.9/kernel/fs/gfs_locking/lock_harness
make[3]: Leaving directory 
`/usr/2.6.9-cluster/cluster/gfs-kernel/src/harness'
cd nolock && make install
make[3]: Entering directory 
`/usr/2.6.9-cluster/cluster/gfs-kernel/src/nolock'
echo

rm -f linux lm_interface.h
ln -s . linux
ln -s ../../src/harness/lm_interface.h .
make -C /usr/2.6.9-cluster/linux-2.6.9 
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/nolock 
symverfile=/usr/2.6.9-cluster/cluster/gfs-kernel/src/nolock/../harness/lock_harness.symvers 
modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_nolock
install lock_nolock.ko //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_nolock
make[3]: Leaving directory 
`/usr/2.6.9-cluster/cluster/gfs-kernel/src/nolock'
cd dlm && make install
make[3]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm'
if [ ! -e linux ]; then ln -s . linux; fi
if [ ! -e cluster ]; then ln -s . cluster; fi
if [ ! -e lm_interface.h ]; then ln -s ../../src/harness/lm_interface.h 
.; fi
if [ ! -e dlm.h ]; then cp //usr/include/cluster/dlm.h .; fi
if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
if [ ! -e cnxman-socket.h ]; then cp 
//usr/include/cluster/cnxman-socket.h .; fi
if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
make -C /usr/2.6.9-cluster/linux-2.6.9 
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm 
symverfile=/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/../harness/lock_harness.symvers 
modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "kcl_addref_cluster" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_get_services" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_cluster_name" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_lock" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_releaseref_cluster" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_get_members" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_query" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_leave_service" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "preempt_schedule" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_unregister_service" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_join_service" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_release_lockspace" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_start_done" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_unlock" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "kcl_register_service" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_new_lockspace" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
*** Warning: "dlm_debug_dump" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko] undefined!
   CC      /usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.mod.o
   LD [M]  /usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/lock_dlm.ko
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
/usr/2.6.9-cluster/linux-2.6.9/scripts/mod/modpost -m -i 
/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm/../harness/lock_harness.symvers 
../dlm/lock_dlm.o -o lock_dlm.symvers
install -d //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_dlm
install lock_dlm.ko //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_dlm
make[3]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/dlm'
cd gulm && make install
make[3]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm'
rm -f linux lm_interface.h
ln -s . linux
ln -s ../../src/harness/lm_interface.h .
make -C /usr/2.6.9-cluster/linux-2.6.9 
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm 
symverfile=/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/../harness/lock_harness.symvers 
modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/lock_gulm.ko] undefined!
   CC      /usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/lock_gulm.mod.o
   LD [M]  /usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/lock_gulm.ko
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
/usr/2.6.9-cluster/linux-2.6.9/scripts/mod/modpost -m -i 
/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm/../harness/lock_harness.symvers 
../gulm/lock_gulm.o -o lock_gulm.symvers
install -d //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_gulm
install lock_gulm.ko //lib/modules/2.6.9/kernel/fs/gfs_locking/lock_gulm
make[3]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/gulm'
cd gfs && make install
make[3]: Entering directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs'
rm -f linux lm_interface.h
ln -s . linux
ln -s ../../src/harness/lm_interface.h .
make -C /usr/2.6.9-cluster/linux-2.6.9 
M=/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs 
symverfile=/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs/../gulm/lock_gulm.symvers 
modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule" 
[/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs/gfs.ko] undefined!
make[4]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //lib/modules/2.6.9/kernel/fs/gfs
install gfs.ko //lib/modules/2.6.9/kernel/fs/gfs
install -d //usr/include/linux
install gfs_ondisk.h gfs_ioctl.h //usr/include/linux
make[3]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src/gfs'
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel/src'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/gfs-kernel'
cd gnbd-kernel && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/gnbd-kernel'
cd src && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/gnbd-kernel/src'
rm -f linux
ln -s . linux
make -C /usr/2.6.9-cluster/linux-2.6.9 
M=/usr/2.6.9-cluster/cluster/gnbd-kernel/src modules USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.9-cluster/linux-2.6.9'
   Building modules, stage 2.
   MODPOST
*** Warning: "preempt_schedule" 
[/usr/2.6.9-cluster/cluster/gnbd-kernel/src/gnbd.ko] undefined!
make[3]: Leaving directory `/usr/2.6.9-cluster/linux-2.6.9'
install -d //usr/include/linux
install gnbd.h //usr/include/linux
install -d //lib/modules/2.6.9/kernel/drivers/block/gnbd
install gnbd.ko //lib/modules/2.6.9/kernel/drivers/block/gnbd
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/gnbd-kernel/src'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/gnbd-kernel'
cd magma && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/magma'
cd lib && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/magma/lib'
install -d /usr/lib
install -d /usr/lib
#
# Install high level magma libraries
#
install -m 0644 libmagma.a /usr/lib
install -m 0644 libmagma.so.DEVEL.1121902825 /usr/lib
cd /usr/lib; ln -snf libmagma.so.DEVEL.1121902825 libmagma.so.DEVEL; cd -
cd /usr/lib; ln -snf libmagma.so.DEVEL.1121902825 libmagma.so; cd -
#
# Install low-overhead, non-pthread magma libraries
#
install -m 0644 libmagma_nt.a /usr/lib
install -m 0644 libmagma_nt.so.DEVEL.1121902825 /usr/lib
cd /usr/lib; ln -snf libmagma_nt.so.DEVEL.1121902825 
libmagma_nt.so.DEVEL; cd -
cd /usr/lib; ln -snf libmagma_nt.so.DEVEL.1121902825 libmagma_nt.so; cd -
#
# Install high level TCP messaging libraries
#
install -m 0644 libmagmamsg.a /usr/lib
install -m 0644 libmagmamsg.so.DEVEL.1121902825 /usr/lib
cd /usr/lib; ln -snf libmagmamsg.so.DEVEL.1121902825 
libmagmamsg.so.DEVEL; cd -
cd /usr/lib; ln -snf libmagmamsg.so.DEVEL.1121902825 libmagmamsg.so; cd -
#
# Install headers for magma, magma_nt, and magmamsg
#
install -d /usr/include
install -m 0644 magma.h /usr/include
install -m 0644 magmamsg.h /usr/include
install -m 0644 magma-build.h /usr/include
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/magma/lib'
cd man && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/magma/man'
install -d /usr/share/man/man3
install -d /usr/share/man/man8
install clu_connect.3 clu_disconnect.3 clu_get_event.3 /usr/share/man/man3
install magma_tool.8 /usr/share/man/man8
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/magma/man'
cd tests && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/magma/tests'
install -d /sbin
install -m 0755 magma_tool circleping cluster_cmd /sbin
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/magma/tests'
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/magma'
cd ccs && make install
make[1]: Entering directory `/usr/2.6.9-cluster/cluster/ccs'
cd daemon && make install
make[2]: Entering directory `/usr/2.6.9-cluster/cluster/ccs/daemon'
gcc -Wall -I. -I../include -I../common -I../config -I//usr/include -Wall 
-O2 -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE `xml2-config --cflags` 
-DCCS_RELEASE_NAME=\"DEVEL.1121902825\" -I. -I../include -I../common 
-I../config -I//usr/include -o ccsd ccsd.c cnx_mgr.c cluster_mgr.c 
misc.c ../common/log.c globals.c -L//usr/lib `xml2-config --libs` 
-lpthread -lmagma -lmagmamsg -ldl
/bin/sh: line 1: xml2-config: command not found
/bin/sh: line 1: xml2-config: command not found
ccsd.c:26:27: libxml/parser.h: No such file or directory
ccsd.c: In function `check_cluster_conf':
ccsd.c:502: error: `xmlDocPtr' undeclared (first use in this function)
ccsd.c:502: error: (Each undeclared identifier is reported only once
ccsd.c:502: error: for each function it appears in.)
ccsd.c:502: error: syntax error before "doc"
ccsd.c:505: error: `doc' undeclared (first use in this function)
ccsd.c:505: warning: implicit declaration of function `xmlParseFile'
ccsd.c:514: warning: implicit declaration of function `xmlFreeDoc'
cnx_mgr.c:25:27: libxml/parser.h: No such file or directory
cnx_mgr.c:26:25: libxml/tree.h: No such file or directory
cnx_mgr.c:27:26: libxml/xpath.h: No such file or directory
cnx_mgr.c:28:35: libxml/xpathInternals.h: No such file or directory
In file included from cnx_mgr.c:33:
misc.h:17: error: syntax error before "xmlDocPtr"
misc.h:17: warning: no semicolon at end of struct or union
misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
misc.h:18: warning: data definition has no type or storage class
misc.h:24: error: syntax error before '*' token
misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
misc.h:24: warning: data definition has no type or storage class
misc.h:26: error: syntax error before "ldoc"
misc.h:27: error: syntax error before "ldoc"
cnx_mgr.c:41: error: syntax error before "open_doc_t"
cnx_mgr.c:41: warning: no semicolon at end of struct or union
cnx_mgr.c:42: warning: type defaults to `int' in declaration of `oc_ctx'
cnx_mgr.c:42: warning: data definition has no type or storage class
cnx_mgr.c:43: warning: type defaults to `int' in declaration of 
`open_connection_t'
cnx_mgr.c:43: warning: data definition has no type or storage class
cnx_mgr.c:48: error: syntax error before '*' token
cnx_mgr.c:48: warning: type defaults to `int' in declaration of `ocs'
cnx_mgr.c:48: warning: data definition has no type or storage class
cnx_mgr.c: In function `_update_config':
cnx_mgr.c:53: error: `tmp_odoc' undeclared (first use in this function)
cnx_mgr.c:53: error: (Each undeclared identifier is reported only once
cnx_mgr.c:53: error: for each function it appears in.)
cnx_mgr.c:54: error: `xmlDocPtr' undeclared (first use in this function)
cnx_mgr.c:54: error: syntax error before "tmp_doc"
cnx_mgr.c:58: error: `tmp_doc' undeclared (first use in this function)
cnx_mgr.c:58: warning: implicit declaration of function `xmlParseFile'
cnx_mgr.c:67: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:68: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:90: error: request for member `od_refs' in something not a 
structure or union
cnx_mgr.c:92: warning: implicit declaration of function `xmlFreeDoc'
cnx_mgr.c:92: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:93: warning: implicit declaration of function `free'
cnx_mgr.c:104: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c: In function `broadcast_for_doc':
cnx_mgr.c:157: error: `xmlDocPtr' undeclared (first use in this function)
cnx_mgr.c:157: error: syntax error before "tmp_doc"
cnx_mgr.c:164: warning: implicit declaration of function `exit'
cnx_mgr.c:164: error: `EXIT_FAILURE' undeclared (first use in this function)
cnx_mgr.c:289: warning: implicit declaration of function `srandom'
cnx_mgr.c:293: warning: implicit declaration of function `random'
cnx_mgr.c:324: error: `tmp_doc' undeclared (first use in this function)
cnx_mgr.c:324: warning: implicit declaration of function `xmlParseMemory'
cnx_mgr.c:349: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:352: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:357: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:366: error: request for member `od_refs' in something not a 
structure or union
cnx_mgr.c:367: error: `tmp_odoc' undeclared (first use in this function)
cnx_mgr.c:376: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:377: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:386: error: request for member `od_refs' in something not a 
structure or union
cnx_mgr.c:396: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:397: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:450: warning: implicit declaration of function `xmlDocDump'
cnx_mgr.c:450: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c: In function `process_connect':
cnx_mgr.c:488: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:513: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:514: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:515: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:518: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:521: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:522: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:523: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:526: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:527: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:531: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:532: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:542: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:577: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:583: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:609: error: syntax error before ')' token
cnx_mgr.c:617: error: request for member `od_refs' in something not a 
structure or union
cnx_mgr.c:618: error: request for member `oc_odoc' in something not a 
structure or union
cnx_mgr.c:619: error: request for member `oc_ctx' in something not a 
structure or union
cnx_mgr.c:619: warning: implicit declaration of function 
`xmlXPathNewContext'
cnx_mgr.c:619: error: request for member `oc_odoc' in something not a 
structure or union
cnx_mgr.c:620: error: request for member `oc_ctx' in something not a 
structure or union
cnx_mgr.c:621: error: request for member `oc_odoc' in something not a 
structure or union
cnx_mgr.c:631: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c: In function `process_disconnect':
cnx_mgr.c:658: error: `tmp_odoc' undeclared (first use in this function)
cnx_mgr.c:679: error: request for member `oc_ctx' in something not a 
structure or union
cnx_mgr.c:680: warning: implicit declaration of function 
`xmlXPathFreeContext'
cnx_mgr.c:680: error: request for member `oc_ctx' in something not a 
structure or union
cnx_mgr.c:682: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:683: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:685: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:686: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:688: error: request for member `oc_odoc' in something not a 
structure or union
cnx_mgr.c:692: error: `EXIT_FAILURE' undeclared (first use in this function)
cnx_mgr.c: In function `_process_get':
cnx_mgr.c:734: error: `xmlXPathObjectPtr' undeclared (first use in this 
function)
cnx_mgr.c:734: error: syntax error before "obj"
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:758: error: request for member `oc_index' in something not a 
structure or union
cnx_mgr.c:764: error: request for member `oc_index' in something not a 
structure or union
cnx_mgr.c:765: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:766: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:768: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:774: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:784: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:789: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:792: error: `obj' undeclared (first use in this function)
cnx_mgr.c:792: warning: implicit declaration of function 
`xmlXPathEvalExpression'
cnx_mgr.c:792: error: request for member `oc_ctx' in something not a 
structure or union
cnx_mgr.c:797: error: `xmlNodePtr' undeclared (first use in this function)
cnx_mgr.c:797: error: syntax error before "node"
cnx_mgr.c:801: error: request for member `oc_index' in something not a 
structure or union
cnx_mgr.c:802: error: request for member `oc_index' in something not a 
structure or union
cnx_mgr.c:807: error: `node' undeclared (first use in this function)
cnx_mgr.c:807: error: request for member `oc_index' in something not a 
structure or union
cnx_mgr.c:822: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this 
function)
cnx_mgr.c:823: error: `XML_ELEMENT_NODE' undeclared (first use in this 
function)
cnx_mgr.c:868: warning: implicit declaration of function 
`xmlXPathFreeObject'
cnx_mgr.c: In function `process_get_list':
cnx_mgr.c:897: error: request for member `oc_index' in something not a 
structure or union
cnx_mgr.c: In function `process_get_state':
cnx_mgr.c:964: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:964: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:965: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:966: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:973: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:974: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:974: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:976: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:984: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:985: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:993: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c: In function `process_set_state':
cnx_mgr.c:1033: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:1034: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:1035: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c:1038: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:1039: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:1040: error: request for member `oc_query' in something not a 
structure or union
cnx_mgr.c:1043: error: request for member `oc_cwp' in something not a 
structure or union
cnx_mgr.c: In function `process_request':
cnx_mgr.c:1160: warning: implicit declaration of function `realloc'
cnx_mgr.c: In function `process_broadcast':
cnx_mgr.c:1247: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:1248: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:1266: warning: implicit declaration of function 
`xmlDocDumpFormatMemory'
cnx_mgr.c:1266: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:1267: error: `xmlChar' undeclared (first use in this function)
cnx_mgr.c:1267: error: syntax error before ')' token
cnx_mgr.c:1301: error: request for member `od_doc' in something not a 
structure or union
cnx_mgr.c:1302: error: request for member `od_doc' in something not a 
structure or union
cluster_mgr.c:24:27: libxml/parser.h: No such file or directory
In file included from cluster_mgr.c:29:
misc.h:17: error: syntax error before "xmlDocPtr"
misc.h:17: warning: no semicolon at end of struct or union
misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
misc.h:18: warning: data definition has no type or storage class
misc.h:24: error: syntax error before '*' token
misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
misc.h:24: warning: data definition has no type or storage class
misc.h:26: error: syntax error before "ldoc"
misc.h:27: error: syntax error before "ldoc"
cluster_mgr.c:38: error: syntax error before "tmp_doc"
cluster_mgr.c: In function `check_update_doc':
cluster_mgr.c:45: error: `tmp_doc' undeclared (first use in this function)
cluster_mgr.c:45: error: (Each undeclared identifier is reported only once
cluster_mgr.c:45: error: for each function it appears in.)
cluster_mgr.c:51: error: request for member `od_doc' in something not a 
structure or union
cluster_mgr.c:52: error: request for member `od_doc' in something not a 
structure or union
cluster_mgr.c:64: error: request for member `od_doc' in something not a 
structure or union
cluster_mgr.c:65: error: request for member `od_doc' in something not a 
structure or union
cluster_mgr.c:67: error: request for member `od_doc' in something not a 
structure or union
cluster_mgr.c:67: error: request for member `od_doc' in something not a 
structure or union
cluster_mgr.c: In function `handle_cluster_message':
cluster_mgr.c:93: error: `xmlDocPtr' undeclared (first use in this function)
cluster_mgr.c:93: error: syntax error before "tmp_doc"
cluster_mgr.c:146: error: `tmp_doc' undeclared (first use in this function)
cluster_mgr.c:146: warning: implicit declaration of function 
`xmlParseMemory'
cluster_mgr.c:167: warning: implicit declaration of function `xmlDocDump'
cluster_mgr.c:198: warning: implicit declaration of function `xmlParseFile'
cluster_mgr.c:247: warning: implicit declaration of function `xmlFreeDoc'
misc.c:25:27: libxml/parser.h: No such file or directory
misc.c:27:27: libxml/parser.h: No such file or directory
misc.c:28:25: libxml/tree.h: No such file or directory
misc.c:29:26: libxml/xpath.h: No such file or directory
misc.c:30:35: libxml/xpathInternals.h: No such file or directory
In file included from misc.c:35:
misc.h:17: error: syntax error before "xmlDocPtr"
misc.h:17: warning: no semicolon at end of struct or union
misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
misc.h:18: warning: data definition has no type or storage class
misc.h:24: error: syntax error before '*' token
misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
misc.h:24: warning: data definition has no type or storage class
misc.h:26: error: syntax error before "ldoc"
misc.h:27: error: syntax error before "ldoc"
misc.c:42: error: syntax error before '*' token
misc.c:42: warning: type defaults to `int' in declaration of `master_doc'
misc.c:42: warning: data definition has no type or storage class
misc.c:44: error: syntax error before "ldoc"
misc.c: In function `get_doc_version':
misc.c:47: error: `xmlXPathObjectPtr' undeclared (first use in this 
function)
misc.c:47: error: (Each undeclared identifier is reported only once
misc.c:47: error: for each function it appears in.)
misc.c:47: error: syntax error before "obj"
misc.c:48: error: `xmlXPathContextPtr' undeclared (first use in this 
function)
misc.c:49: error: `xmlNodePtr' undeclared (first use in this function)
misc.c:53: error: `ctx' undeclared (first use in this function)
misc.c:53: warning: implicit declaration of function `xmlXPathNewContext'
misc.c:53: error: `ldoc' undeclared (first use in this function)
misc.c:60: error: `obj' undeclared (first use in this function)
misc.c:60: warning: implicit declaration of function 
`xmlXPathEvalExpression'
misc.c:67: error: `node' undeclared (first use in this function)
misc.c:68: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this 
function)
misc.c:91: warning: implicit declaration of function `xmlXPathFreeContext'
misc.c:94: warning: implicit declaration of function `xmlXPathFreeObject'
misc.c: At top level:
misc.c:109: error: syntax error before "ldoc"
misc.c: In function `get_cluster_name':
misc.c:112: error: `xmlXPathObjectPtr' undeclared (first use in this 
function)
misc.c:112: error: syntax error before "obj"
misc.c:113: error: `xmlXPathContextPtr' undeclared (first use in this 
function)
misc.c:114: error: `xmlNodePtr' undeclared (first use in this function)
misc.c:118: error: `ctx' undeclared (first use in this function)
misc.c:118: error: `ldoc' undeclared (first use in this function)
misc.c:125: error: `obj' undeclared (first use in this function)
misc.c:132: error: `node' undeclared (first use in this function)
misc.c:133: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this 
function)
make[2]: *** [ccsd] Error 1
make[2]: Leaving directory `/usr/2.6.9-cluster/cluster/ccs/daemon'
make[1]: *** [install] Error 2
make[1]: Leaving directory `/usr/2.6.9-cluster/cluster/ccs'
make: *** [install] Error 2


High Mobley
Priority Networks


From lhh at redhat.com  Thu Jul 21 14:02:47 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Thu, 21 Jul 2005 10:02:47 -0400
Subject: [Linux-cluster] Cluter MIB & Trap
In-Reply-To: <35a448fb0507210437514fd07c@mail.gmail.com>
References: <35a448fb0507210437514fd07c@mail.gmail.com>
Message-ID: <1121954567.27436.2.camel@ayanami.boston.redhat.com>

On Thu, 2005-07-21 at 14:37 +0300, eli elizur wrote:
> Hi list,
> 
> I have Linux AS 2.1 running on X86
> I have configure the Redhat Cluster.
> 
> Dose the cluster has a mib I can get info via the snmp ?
> Is there a trap ability to alert on error ?

Nope.

-- Lon


From pcaulfie at redhat.com  Thu Jul 21 14:06:45 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Thu, 21 Jul 2005 15:06:45 +0100
Subject: [Linux-cluster] CMAN/CCSD errors after fresh install of Cluster
	Suite
In-Reply-To: <BF03FE2F.BE12%Simon.Gibbs@informa.com>
References: <BF03FE2F.BE12%Simon.Gibbs@informa.com>
Message-ID: <42DFABF5.5050802@redhat.com>

Gibbs, Simon wrote:
> Hi
> 
> I?ve installed RHCS on 2 nodes with fresh installs of RHEL4 but am
> .running into problems when trying to configure the nodes using the GUI
> tool.
> When opening the GUI I?m prompted to create/save a cluster.conf file (in
> /etc/cluster) but there aren?t any options to add/configure any of the
> nodes. 

If you click on "Cluster Nodes" on the left, you'll see a button appear labelled
 "Add a cluster node" appear at the bottom right.

-- 

patrick


From pcaulfie at redhat.com  Thu Jul 21 14:10:30 2005
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Thu, 21 Jul 2005 15:10:30 +0100
Subject: [Linux-cluster] cman ignores interface setting on ipv4
In-Reply-To: <42DF4769.2080300@redhat.com>
References: <1121811743.13635.26.camel@cocagne.max-t.internal>	<42DE73DA.6080408@redhat.com>	<1121876980.13635.32.camel@cocagne.max-t.internal>
	<42DF4769.2080300@redhat.com>
Message-ID: <42DFACD6.4070206@redhat.com>

I've applied most of your patch - as it stood it broke broadcast configurations!

You'll probably find that it doesn't really fix your problem! The node name
passed to cman_tool must still match the hostname bound to the interface that
you are using for cluster communication.

I'll cook up a better solution later perhaps, but thanks for bringing it to my
attention.
-- 

patrick


From haydar2906 at hotmail.com  Thu Jul 21 18:54:52 2005
From: haydar2906 at hotmail.com (haydar Ali)
Date: Thu, 21 Jul 2005 14:54:52 -0400
Subject: [Linux-cluster] Cluster IP alias doesn't work
In-Reply-To: <BAY104-F40D6614061D1056D9940D3C5D70@phx.gbl>
Message-ID: <BAY104-F28F893F4F16D9113DE61B9C5D60@phx.gbl>

Hi,

I setuped and configured a clustered NFS.
I have created 2 quorum partitions /dev/sdd2 and /dev/sdd3  (100MB each).

I created another huge partition /dev/sdd4 (over 600GB) and formatted it in 
ext3 file system.

I installed the cluster suite on the 1st node (RAC1) and 2nd node RAC2 and I 
started the rawdevices on the two nodes RAC1 and RAC2 (it's OK).

This the hosts file /etc/host on the node1 (RAC1) and node2 RAC2

Do not remove the following line, or various programs
# that require network functionality will fail.
#127.0.0.1		rac1 localhost.localdomain localhost
127.0.0.1              localhost.localdomain localhost
#
# Private hostnames
#
192.168.253.3           rac1.project.net     rac1
192.168.253.4           rac2.project.net     rac2
192.168.253.10          raclu.project.net	raclu
192.168.253.20		raclu_nfs.project.net	raclu_nfs
#
# Hostnames used for Interconnect
#
1.1.1.1                 rac1i.project.net    rac1i
1.1.1.2                 rac2i.project.net    rac2i
#
192.168.253.5           infra.project.net       infra
192.168.253.7		ractest.project.net     ractest
#
I generated a /etc/cluster.conf on the 1st node RAC1 as following:

# This file is automatically generated.  Do not manually edit!

[cluhbd]
  logLevel = 4

[clupowerd]
  logLevel = 4

[cluquorumd]
  logLevel = 4

[cluster]
  alias_ip = raclu
  name = project
  timestamp = 1121957827

[clusvcmgrd]
  logLevel = 4

[database]
  version = 2.0

[members]
start member0
start chan0
  name = rac1
  type = net
end chan0
  id = 0
  name = rac1
  powerSwitchIPaddr = rac1
  powerSwitchPortName = unused
  quorumPartitionPrimary = /dev/raw/raw1
  quorumPartitionShadow = /dev/raw/raw2
end member0
start member1
start chan0
  name = rac2
  type = net
end chan0
  id = 1
  name = rac2
  powerSwitchIPaddr = rac2
  powerSwitchPortName = unused
  quorumPartitionPrimary = /dev/raw/raw1
  quorumPartitionShadow = /dev/raw/raw2
end member1

[powercontrollers]
start powercontroller0
  IPaddr = rac1
  login = unused
  passwd = unused
  type = null
end powercontroller0
start powercontroller1
  IPaddr = rac2
  login = unused
  passwd = unused
  type = null
end powercontroller1

[services]
start service0
  checkInterval = 0
start device0
start mount
start NFSexports
start directory0
start client0
  name = *
  options = rw
end client0
  name = /u04
end directory0
end NFSexports
  forceUnmount = yes
  fstype = ext3
  name = /u04
  options = rw,nosuid,sync
end mount
  name = /dev/sdd4
  sharename = None
end device0
  name = nfs_project
start network0
  ipAddress = 192.168.253.20
end network0
  preferredNode = rac2
  relocateOnPreferredNodeBoot = yes
end service0


I generated a /etc/cluster.conf on the 2nd node RAC2 as following:


# This file is automatically generated.  Do not manually edit!

[cluhbd]
  logLevel = 4

[clupowerd]
  logLevel = 4

[cluquorumd]
  logLevel = 4

[cluster]
  alias_ip = raclu
  name = project
  timestamp = 1121957827

[clusvcmgrd]
  logLevel = 4

[database]
  version = 2.0

[members]
start member0
start chan0
  name = rac1
  type = net
end chan0
  id = 0
  name = rac1
  powerSwitchIPaddr = rac1
  powerSwitchPortName = unused
  quorumPartitionPrimary = /dev/raw/raw1
  quorumPartitionShadow = /dev/raw/raw2
end member0
start member1
start chan0
  name = rac2
  type = net
end chan0
  id = 1
  name = rac2
  powerSwitchIPaddr = rac2
  powerSwitchPortName = unused
  quorumPartitionPrimary = /dev/raw/raw1
  quorumPartitionShadow = /dev/raw/raw2
end member1

[powercontrollers]
start powercontroller0
  IPaddr = rac1
  login = unused
  passwd = unused
  type = null
end powercontroller0
start powercontroller1
  IPaddr = rac2
  login = unused
  passwd = unused
  type = null
end powercontroller1

[services]
start service0
  checkInterval = 0
start device0
start mount
start NFSexports
start directory0
start client0
  name = *
  options = rw
end client0
  name = /u04
end directory0
end NFSexports
  forceUnmount = yes
  fstype = ext3
  name = /u04
  options = rw,nosuid,sync
end mount
  name = /dev/sdd4
  sharename = None
end device0
  name = nfs_project
start network0
  ipAddress = 192.168.253.20
end network0
  preferredNode = rac2
  relocateOnPreferredNodeBoot = yes
end service0

[session]
  lock = rac1-cluadmin-16970-root 1121957668

I created a NFS share on /u04 (mount on /dev/sdd4) using the following 
command cluadmin on RAC1
I launched on 2 nodes Rac1 and RAC2 the following command:
Service cluster start
I checked the result on the 2 nodes:
clustat


Cluster Status Monitor (project)   14:21:38

Cluster alias: raclu

==========  M e m b e r   S t a t u s  =============

  Member         Status     Node Id    Power Switch
  -------------- ---------- ---------- ------------
  rac1        Up         0          Good
  rac2        Up         1          Good

===========  H e a r t b e a t   S t a t u s  ======

  Name                           Type       Status
  ------------------------------ ---------- ------------
  rac1      <--> rac2      network    ONLINE

========== S e r v i c e   S t a t u s  =======

                                         Last             Monitor  Restart
  Service        Status   Owner          Transition       Interval Count
  -------------- -------- -------------- ---------------- --------  
nfs_project       started  rac2        13:01:36 Jul 21  0        0


And I launched on RAC1 and other servers the following command:
mount ?t nfs 192.168.253.20:/u04 /u04

And all are OK, I can list the /u04 content from any server (if I mount it).

But my only problem is:

When I want to try a test, I stop the network service on RAC2:
ifconfig eth0 down
Then when I try from another server as RAC1 to list the /u04 content, it 
doesn?t work and doesn?t respond, and when I ping the IP alias 
192.168.253.20 it doesn?t respond also.

Have you any idea to fix this problem?

Thanks for your replies and help


Haydar

>From: "haydar Ali" <haydar2906 at hotmail.com>
>Reply-To: linux clustering <linux-cluster at redhat.com>
>To: linux-cluster at redhat.com
>Subject: [Linux-cluster] Need help for Clustered NFS
>Date: Wed, 20 Jul 2005 10:10:35 -0400
>
>Hi,
>
>I want to setup and configure clustered NFS.
>I have created 2 quorum partitions /dev/sdd2 and /dev/sdd3  (100MB each) 
>and formatted them
>
>mkfs -t ext2 -b 4096 /dev/sdd2
>mkfs -t ext2 -b 4096 /dev/sdd3
>
>I created another huge partition /dev/sdd4 (over 600GB) and formatted it in 
>ext3 filesystem.
>
>I installed the cluster suite on the 1st node (RAC1) and I started the 
>rawdevices on the two nodes RAC1 and RAC2 (it's OK).
>
>This the hosts file /etc/host on the node1 (RAC1)
>
># Do not remove the following line, or various programs
># that require network functionality will fail.
>127.0.0.1              localhost.localdomain localhost
>#
># Private hostnames
>#
>192.168.253.3           rac1.domain.net     rac1
>192.168.253.4           rac2.domain.net     rac2
>192.168.253.10          rac1
>#
># Hostnames used for Interconnect
>#
>1.1.1.1                 rac1i.domain.net    rac1i
>1.1.1.2                 rac2i.domain.net    rac2i
>#
>-----------------------
>
>
>I launched the command cluconfig and it generated /etc/cluster.conf, you 
>can list its content:
>
>-------------------------------
>This file is automatically generated.  Do not manually edit!
>
>[cluhbd]
>  logLevel = 4
>
>[clupowerd]
>  logLevel = 4
>
>[cluquorumd]
>  logLevel = 4
>
>[cluster]
>  alias_ip = 192.168.253.10
>  name = project
>  timestamp = 1121804245
>
>[clusvcmgrd]
>  logLevel = 4
>
>[database]
>  version = 2.0
>
>[members]
>start member0
>start chan0
>  name = rac1
>  type = net
>end chan0
>  id = 0
>  name = rac1
>  powerSwitchIPaddr = rac1
>  powerSwitchPortName = unused
>  quorumPartitionPrimary = /dev/raw/raw1
>  quorumPartitionShadow = /dev/raw/raw2
>end member0
>start member1
>start chan0
>  name = rac2
>  type = net
>end chan0
>  id = 1
>  name = rac2
>  powerSwitchIPaddr = rac2
>  powerSwitchPortName = unused
>  quorumPartitionPrimary = /dev/raw/raw1
>  quorumPartitionShadow = /dev/raw/raw2
>end member1
>
>[powercontrollers]
>start powercontroller0
>  IPaddr = rac1
>  login = unused
>  passwd = unused
>  type = null
>end powercontroller0
>start powercontroller1
>  IPaddr = rac2
>  login = unused
>  passwd = unused
>  type = null
>end powercontroller1
>
>[services]
>start service0
>  checkInterval = 30
>start device0
>start mount
>start NFSexports
>start directory0
>start client0
>  name = rac1
>  options = rw
>end client0
>  name = /u04
>end directory0
>end NFSexports
>  forceUnmount = yes
>  fstype = ext3
>  name = /u04
>  options = rw,nosuid,sync
>end mount
>  name = /dev/sdd4
>  sharename = None
>end device0
>  name = nfs_project
>  preferredNode = rac2
>  relocateOnPreferredNodeBoot = yes
>end service0
>------------------------------------
>
>I created a NFS share on /u04 using the following command cluadmin
>
>[root at rac1 root]# cluadmin
>Wed Jul 20 10:02:20 EDT 2005
>
>You can obtain help by entering help and one of the following commands:
>
>cluster     service        clear
>help        apropos        exit
>version         quit
>cluadmin> service show
>  1) state
>  2) config
>  3) services
>service show what? 2
>  0) nfs_project
>  c) cancel
>
>Choose service: 0
>name: nfs_project
>preferred node: rac2
>relocate: yes
>monitor interval: 30
>device 0: /dev/sdd4
>  mount point, device 0: /u04
>  mount fstype, device 0: ext3
>  mount options, device 0: rw,nosuid,sync
>  force unmount, device 0: yes
>  samba share, device 0: None
>NFS export 0: /u04
>  Client 0: rac1, rw
>cluadmin> service show state
>=========================  S e r v i c e   S t a t u s  
>========================
>
>                                         Last             Monitor  Restart
>  Service        Status   Owner          Transition       Interval Count
>  -------------- -------- -------------- ---------------- -------- -------
>  nfs_project       started  rac1        16:21:23 Jul 19  30       1
>cluadmin>
>
>
>And when I launched clustat, I expected this error message:
>
>clustat
>Cluster Status Monitor (Fileserver Test Cluster)
>07:46:05
>Cluster alias: rac1
>
>===================== M e m b e r   S t a t u s ================
>  Member         Status     Node Id    Power Switch
>  -------------- ---------- ---------- ------------
>  rac1           Up         0          Good
>  rac2           Down    1          Unknown
>
>=================== H e a r t b e a t   S t a t u s ===============
>  Name                           Type       Status
>  ------------------------------ ---------- ------------
>  rac1         <--> rac2         network    OFFLINE
>
>=================== S e r v i c e   S t a t u s ==================
>                                  	     Last            Monitor      
>Restart
>Service         Status   Owner          Transition    Interval  Count
>
>  ------------- -------- ------------- ---------------- ------------
>  nfs_project          started  rac1          16:07:42 Jul 19  30           
>  0
>
>
>
>And when I launched this command on RAC2:
>mount ?t nfs rac1:/u04 /u04
>It list the following error message :
>Mount: rac1:/u04 failed, reason given by server: Permission denied
>
>Can someone help me to fix this problem in this configuration?
>
>Thanks
>
>Cheers!
>
>Haydar
>
>
>--
>Linux-cluster mailing list
>Linux-cluster at redhat.com
>http://www.redhat.com/mailman/listinfo/linux-cluster


From dawson at fnal.gov  Thu Jul 21 20:13:57 2005
From: dawson at fnal.gov (Troy Dawson)
Date: Thu, 21 Jul 2005 15:13:57 -0500
Subject: [Linux-cluster] RHEL4 srpm spec file bugs/problems
Message-ID: <42E00205.7050109@fnal.gov>

Howdy,
These problems might have already been taken care of (I haven't checked 
out the latest CVS), but just incase they haven't, here's some problems 
I encountered while rebuilding the RHEL4 src rpm's, and way's they could 
be fixed.

First - cman-kernel has the build dependancy on fake-build-provides, 
which isn't in the RHEL4 release.  The way around this is to build it 
with the --nodeps option.  But it really should be taken out of the 
RHEL4 spec files unless it is going to be provided at some time.
(this was already discussed on this mailling list, but I'll put it here 
for completeness)

Second - source rpm's that build kernel modules leave messy BUILD/smp 
and BUILD/hugemem directories when they are done.  These don't get 
cleaned up, and when you are building the next rpm that does kernel 
modules, there is conflicting files, so the next rpm doesn't build 
correctly.
Currently you have to remove those directories by hand after each rpm 
that builds a kernel module.  It would be nice if there was a couple of 
cleanup lines in the %clean section in the spec files.

Thanks
Troy Dawson
-- 
__________________________________________________
Troy Dawson  dawson at fnal.gov  (630)840-6468
Fermilab  ComputingDivision/CSS  CSI Group
__________________________________________________


From JACOB_LIBERMAN at Dell.com  Thu Jul 21 21:46:05 2005
From: JACOB_LIBERMAN at Dell.com (JACOB_LIBERMAN at Dell.com)
Date: Thu, 21 Jul 2005 16:46:05 -0500
Subject: [Linux-cluster] Cluster IP alias doesn't work
Message-ID: <BC430F453501174992B9D9E8AFB7519A08560C@ausx3mps309.aus.amer.dell.com>

It looks like your alias IP is not the same as your service IP, but that
is the problem I have seen in the past that causes this behavior. I
would specify the cluster alias via an IP address rather than a host
name.

The behavior I have seen is that you down the interface on the service
owner node, and the service starts on the other node, but the virtual IP
address does not bind to the second nodes interface. This can occur when
the alias IP is the same address and the service IP. This isnot a bug,
but working as desgined, because the alias IP will bind to any node
participating in the cluster, nt just the node that owns the service.
This feature is particularly useful in clusters with a large number of
nodes.

I hope this helps!

jacob 

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of haydar Ali
> Sent: Thursday, July 21, 2005 1:55 PM
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] Cluster IP alias doesn't work
> 
> Hi,
> 
> I setuped and configured a clustered NFS.
> I have created 2 quorum partitions /dev/sdd2 and /dev/sdd3  
> (100MB each).
> 
> I created another huge partition /dev/sdd4 (over 600GB) and 
> formatted it in
> ext3 file system.
> 
> I installed the cluster suite on the 1st node (RAC1) and 2nd 
> node RAC2 and I started the rawdevices on the two nodes RAC1 
> and RAC2 (it's OK).
> 
> This the hosts file /etc/host on the node1 (RAC1) and node2 RAC2
> 
> Do not remove the following line, or various programs # that 
> require network functionality will fail.
> #127.0.0.1		rac1 localhost.localdomain localhost
> 127.0.0.1              localhost.localdomain localhost
> #
> # Private hostnames
> #
> 192.168.253.3           rac1.project.net     rac1
> 192.168.253.4           rac2.project.net     rac2
> 192.168.253.10          raclu.project.net	raclu
> 192.168.253.20		raclu_nfs.project.net	raclu_nfs
> #
> # Hostnames used for Interconnect
> #
> 1.1.1.1                 rac1i.project.net    rac1i
> 1.1.1.2                 rac2i.project.net    rac2i
> #
> 192.168.253.5           infra.project.net       infra
> 192.168.253.7		ractest.project.net     ractest
> #
> I generated a /etc/cluster.conf on the 1st node RAC1 as following:
> 
> # This file is automatically generated.  Do not manually edit!
> 
> [cluhbd]
>   logLevel = 4
> 
> [clupowerd]
>   logLevel = 4
> 
> [cluquorumd]
>   logLevel = 4
> 
> [cluster]
>   alias_ip = raclu
>   name = project
>   timestamp = 1121957827
> 
> [clusvcmgrd]
>   logLevel = 4
> 
> [database]
>   version = 2.0
> 
> [members]
> start member0
> start chan0
>   name = rac1
>   type = net
> end chan0
>   id = 0
>   name = rac1
>   powerSwitchIPaddr = rac1
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member0
> start member1
> start chan0
>   name = rac2
>   type = net
> end chan0
>   id = 1
>   name = rac2
>   powerSwitchIPaddr = rac2
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member1
> 
> [powercontrollers]
> start powercontroller0
>   IPaddr = rac1
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller0
> start powercontroller1
>   IPaddr = rac2
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller1
> 
> [services]
> start service0
>   checkInterval = 0
> start device0
> start mount
> start NFSexports
> start directory0
> start client0
>   name = *
>   options = rw
> end client0
>   name = /u04
> end directory0
> end NFSexports
>   forceUnmount = yes
>   fstype = ext3
>   name = /u04
>   options = rw,nosuid,sync
> end mount
>   name = /dev/sdd4
>   sharename = None
> end device0
>   name = nfs_project
> start network0
>   ipAddress = 192.168.253.20
> end network0
>   preferredNode = rac2
>   relocateOnPreferredNodeBoot = yes
> end service0
> 
> 
> I generated a /etc/cluster.conf on the 2nd node RAC2 as following:
> 
> 
> # This file is automatically generated.  Do not manually edit!
> 
> [cluhbd]
>   logLevel = 4
> 
> [clupowerd]
>   logLevel = 4
> 
> [cluquorumd]
>   logLevel = 4
> 
> [cluster]
>   alias_ip = raclu
>   name = project
>   timestamp = 1121957827
> 
> [clusvcmgrd]
>   logLevel = 4
> 
> [database]
>   version = 2.0
> 
> [members]
> start member0
> start chan0
>   name = rac1
>   type = net
> end chan0
>   id = 0
>   name = rac1
>   powerSwitchIPaddr = rac1
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member0
> start member1
> start chan0
>   name = rac2
>   type = net
> end chan0
>   id = 1
>   name = rac2
>   powerSwitchIPaddr = rac2
>   powerSwitchPortName = unused
>   quorumPartitionPrimary = /dev/raw/raw1
>   quorumPartitionShadow = /dev/raw/raw2
> end member1
> 
> [powercontrollers]
> start powercontroller0
>   IPaddr = rac1
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller0
> start powercontroller1
>   IPaddr = rac2
>   login = unused
>   passwd = unused
>   type = null
> end powercontroller1
> 
> [services]
> start service0
>   checkInterval = 0
> start device0
> start mount
> start NFSexports
> start directory0
> start client0
>   name = *
>   options = rw
> end client0
>   name = /u04
> end directory0
> end NFSexports
>   forceUnmount = yes
>   fstype = ext3
>   name = /u04
>   options = rw,nosuid,sync
> end mount
>   name = /dev/sdd4
>   sharename = None
> end device0
>   name = nfs_project
> start network0
>   ipAddress = 192.168.253.20
> end network0
>   preferredNode = rac2
>   relocateOnPreferredNodeBoot = yes
> end service0
> 
> [session]
>   lock = rac1-cluadmin-16970-root 1121957668
> 
> I created a NFS share on /u04 (mount on /dev/sdd4) using the 
> following command cluadmin on RAC1 I launched on 2 nodes Rac1 
> and RAC2 the following command:
> Service cluster start
> I checked the result on the 2 nodes:
> clustat
> 
> 
> 
> Cluster Status Monitor (project)   14:21:38
> 
> Cluster alias: raclu
> 
> ==========  M e m b e r   S t a t u s  =============
> 
>   Member         Status     Node Id    Power Switch
>   -------------- ---------- ---------- ------------
>   rac1        Up         0          Good
>   rac2        Up         1          Good
> 
> ===========  H e a r t b e a t   S t a t u s  ======
> 
>   Name                           Type       Status
>   ------------------------------ ---------- ------------
>   rac1      <--> rac2      network    ONLINE
> 
> ========== S e r v i c e   S t a t u s  =======
> 
>                                          Last             
> Monitor  Restart
>   Service        Status   Owner          Transition       
> Interval Count
>   -------------- -------- -------------- ---------------- --------  
> nfs_project       started  rac2        13:01:36 Jul 21  0        0
> 
> 
> And I launched on RAC1 and other servers the following command:
> mount -t nfs 192.168.253.20:/u04 /u04
> 
> And all are OK, I can list the /u04 content from any server 
> (if I mount it).
> 
> But my only problem is:
> 
> When I want to try a test, I stop the network service on RAC2:
> ifconfig eth0 down
> Then when I try from another server as RAC1 to list the /u04 
> content, it 
> doesn't work and doesn't respond, and when I ping the IP alias 
> 192.168.253.20 it doesn't respond also.
> 
> Have you any idea to fix this problem?
> 
> Thanks for your replies and help
> 
> 
> Haydar
> 
> >From: "haydar Ali" <haydar2906 at hotmail.com>
> >Reply-To: linux clustering <linux-cluster at redhat.com>
> >To: linux-cluster at redhat.com
> >Subject: [Linux-cluster] Need help for Clustered NFS
> >Date: Wed, 20 Jul 2005 10:10:35 -0400
> >
> >Hi,
> >
> >I want to setup and configure clustered NFS.
> >I have created 2 quorum partitions /dev/sdd2 and /dev/sdd3  
> (100MB each) 
> >and formatted them
> >
> >mkfs -t ext2 -b 4096 /dev/sdd2
> >mkfs -t ext2 -b 4096 /dev/sdd3
> >
> >I created another huge partition /dev/sdd4 (over 600GB) and 
> formatted it in 
> >ext3 filesystem.
> >
> >I installed the cluster suite on the 1st node (RAC1) and I 
> started the 
> >rawdevices on the two nodes RAC1 and RAC2 (it's OK).
> >
> >This the hosts file /etc/host on the node1 (RAC1)
> >
> ># Do not remove the following line, or various programs
> ># that require network functionality will fail.
> >127.0.0.1              localhost.localdomain localhost
> >#
> ># Private hostnames
> >#
> >192.168.253.3           rac1.domain.net     rac1
> >192.168.253.4           rac2.domain.net     rac2
> >192.168.253.10          rac1
> >#
> ># Hostnames used for Interconnect
> >#
> >1.1.1.1                 rac1i.domain.net    rac1i
> >1.1.1.2                 rac2i.domain.net    rac2i
> >#
> >-----------------------
> >
> >
> >I launched the command cluconfig and it generated 
> /etc/cluster.conf, you 
> >can list its content:
> >
> >-------------------------------
> >This file is automatically generated.  Do not manually edit!
> >
> >[cluhbd]
> >  logLevel = 4
> >
> >[clupowerd]
> >  logLevel = 4
> >
> >[cluquorumd]
> >  logLevel = 4
> >
> >[cluster]
> >  alias_ip = 192.168.253.10
> >  name = project
> >  timestamp = 1121804245
> >
> >[clusvcmgrd]
> >  logLevel = 4
> >
> >[database]
> >  version = 2.0
> >
> >[members]
> >start member0
> >start chan0
> >  name = rac1
> >  type = net
> >end chan0
> >  id = 0
> >  name = rac1
> >  powerSwitchIPaddr = rac1
> >  powerSwitchPortName = unused
> >  quorumPartitionPrimary = /dev/raw/raw1
> >  quorumPartitionShadow = /dev/raw/raw2
> >end member0
> >start member1
> >start chan0
> >  name = rac2
> >  type = net
> >end chan0
> >  id = 1
> >  name = rac2
> >  powerSwitchIPaddr = rac2
> >  powerSwitchPortName = unused
> >  quorumPartitionPrimary = /dev/raw/raw1
> >  quorumPartitionShadow = /dev/raw/raw2
> >end member1
> >
> >[powercontrollers]
> >start powercontroller0
> >  IPaddr = rac1
> >  login = unused
> >  passwd = unused
> >  type = null
> >end powercontroller0
> >start powercontroller1
> >  IPaddr = rac2
> >  login = unused
> >  passwd = unused
> >  type = null
> >end powercontroller1
> >
> >[services]
> >start service0
> >  checkInterval = 30
> >start device0
> >start mount
> >start NFSexports
> >start directory0
> >start client0
> >  name = rac1
> >  options = rw
> >end client0
> >  name = /u04
> >end directory0
> >end NFSexports
> >  forceUnmount = yes
> >  fstype = ext3
> >  name = /u04
> >  options = rw,nosuid,sync
> >end mount
> >  name = /dev/sdd4
> >  sharename = None
> >end device0
> >  name = nfs_project
> >  preferredNode = rac2
> >  relocateOnPreferredNodeBoot = yes
> >end service0
> >------------------------------------
> >
> >I created a NFS share on /u04 using the following command cluadmin
> >
> >[root at rac1 root]# cluadmin
> >Wed Jul 20 10:02:20 EDT 2005
> >
> >You can obtain help by entering help and one of the 
> following commands:
> >
> >cluster     service        clear
> >help        apropos        exit
> >version         quit
> >cluadmin> service show
> >  1) state
> >  2) config
> >  3) services
> >service show what? 2
> >  0) nfs_project
> >  c) cancel
> >
> >Choose service: 0
> >name: nfs_project
> >preferred node: rac2
> >relocate: yes
> >monitor interval: 30
> >device 0: /dev/sdd4
> >  mount point, device 0: /u04
> >  mount fstype, device 0: ext3
> >  mount options, device 0: rw,nosuid,sync
> >  force unmount, device 0: yes
> >  samba share, device 0: None
> >NFS export 0: /u04
> >  Client 0: rac1, rw
> >cluadmin> service show state
> >=========================  S e r v i c e   S t a t u s  
> >========================
> >
> >                                         Last             
> Monitor  Restart
> >  Service        Status   Owner          Transition       
> Interval Count
> >  -------------- -------- -------------- ---------------- 
> -------- -------
> >  nfs_project       started  rac1        16:21:23 Jul 19  30       1
> >cluadmin>
> >
> >
> >And when I launched clustat, I expected this error message:
> >
> >clustat
> >Cluster Status Monitor (Fileserver Test Cluster)
> >07:46:05
> >Cluster alias: rac1
> >
> >===================== M e m b e r   S t a t u s ================
> >  Member         Status     Node Id    Power Switch
> >  -------------- ---------- ---------- ------------
> >  rac1           Up         0          Good
> >  rac2           Down    1          Unknown
> >
> >=================== H e a r t b e a t   S t a t u s ===============
> >  Name                           Type       Status
> >  ------------------------------ ---------- ------------
> >  rac1         <--> rac2         network    OFFLINE
> >
> >=================== S e r v i c e   S t a t u s ==================
> >                                  	     Last            
> Monitor      
> >Restart
> >Service         Status   Owner          Transition    Interval  Count
> >
> >  ------------- -------- ------------- ---------------- ------------
> >  nfs_project          started  rac1          16:07:42 Jul 
> 19  30           
> >  0
> >
> >
> >
> >And when I launched this command on RAC2:
> >mount -t nfs rac1:/u04 /u04
> >It list the following error message :
> >Mount: rac1:/u04 failed, reason given by server: Permission denied
> >
> >Can someone help me to fix this problem in this configuration?
> >
> >Thanks
> >
> >Cheers!
> >
> >Haydar
> >
> >
> >--
> >Linux-cluster mailing list
> >Linux-cluster at redhat.com
> >http://www.redhat.com/mailman/listinfo/linux-cluster
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 


From cfeist at redhat.com  Thu Jul 21 22:49:20 2005
From: cfeist at redhat.com (Chris Feist)
Date: Thu, 21 Jul 2005 17:49:20 -0500
Subject: [Linux-cluster] RHEL4 srpm spec file bugs/problems
In-Reply-To: <42E00205.7050109@fnal.gov>
References: <42E00205.7050109@fnal.gov>
Message-ID: <42E02670.5090400@redhat.com>

Troy,

Could you please a bug for the messy BUILD/smp issue?  (The 
fake-build-provides problem has just been fixed).

It will allow us to track this issue and make sure that it gets resolved.


Thanks,
Chris

Troy Dawson wrote:
> Howdy,
> These problems might have already been taken care of (I haven't checked 
> out the latest CVS), but just incase they haven't, here's some problems 
> I encountered while rebuilding the RHEL4 src rpm's, and way's they could 
> be fixed.
> 
> First - cman-kernel has the build dependancy on fake-build-provides, 
> which isn't in the RHEL4 release.  The way around this is to build it 
> with the --nodeps option.  But it really should be taken out of the 
> RHEL4 spec files unless it is going to be provided at some time.
> (this was already discussed on this mailling list, but I'll put it here 
> for completeness)
> 
> Second - source rpm's that build kernel modules leave messy BUILD/smp 
> and BUILD/hugemem directories when they are done.  These don't get 
> cleaned up, and when you are building the next rpm that does kernel 
> modules, there is conflicting files, so the next rpm doesn't build 
> correctly.
> Currently you have to remove those directories by hand after each rpm 
> that builds a kernel module.  It would be nice if there was a couple of 
> cleanup lines in the %clean section in the spec files.
> 
> Thanks
> Troy Dawson


From lmb at suse.de  Thu Jul 21 23:22:35 2005
From: lmb at suse.de (Lars Marowsky-Bree)
Date: Fri, 22 Jul 2005 01:22:35 +0200
Subject: [Clusters_sig] RE: [Linux-cluster] Re: [Ocfs2-devel] [RFC]
	nodemanager, ocfs2, dlm
In-Reply-To: <20050720183938.GM16618@ca-server1.us.oracle.com>
References: <3689AF909D816446BA505D21F1461AE404167CFB@cacexc04.americas.cpqcorp.net>
	<20050720180918.GU5416@marowsky-bree.de>
	<20050720183938.GM16618@ca-server1.us.oracle.com>
Message-ID: <20050721232235.GJ24464@marowsky-bree.de>

On 2005-07-20T11:39:38, Joel Becker <Joel.Becker at oracle.com> wrote:

> 	In turn, let me clarify a little where configfs fits in to
> things.  Configfs is merely a convenient and transparent method to
> communicate configuration to kernel objects.  It's not a place for
> uevents, for netlink sockets, or for fancy communication.  It allows
> userspace to create an in-kernel object and set/get values on that
> object.  It also allows userspace and kernelspace to share the same
> representation of that object and its values.
> 	For more complex interaction, sysfs and procfs are often more
> appropriate.  While you might "configure" all known nodes in configfs,
> the node up/down state might live in sysfs.  A netlink socket for
> up/down events might live in procfs.  And so on.

Right. Thanks for the clarification and elaboration, for I am sure
not entirely clear as to how all these mechanisms relate in detail and
what is appropriate just where, and when to use something more classic
like ioctl etc... ;-)

FWIW, we didn't mean to get uevents out via configfs of course.


Sincerely,
    Lars Marowsky-Br?e <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business	 -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"


From phillips at redhat.com  Fri Jul 22 03:22:15 2005
From: phillips at redhat.com (Daniel Phillips)
Date: Fri, 22 Jul 2005 13:22:15 +1000
Subject: [Linux-cluster] Re: [Ocfs2-devel] [RFC] nodemanager, ocfs2, dlm
In-Reply-To: <3689AF909D816446BA505D21F1461AE404167CFB@cacexc04.americas.cpqcorp.net>
References: <3689AF909D816446BA505D21F1461AE404167CFB@cacexc04.americas.cpqcorp.net>
Message-ID: <200507221322.16706.phillips@redhat.com>

On Thursday 21 July 2005 02:55, Walker, Bruce J (HP-Labs) wrote:
> Like Lars, I too was under the wrong impression about this configfs
> "nodemanager" kernel component.  Our discussions in the cluster meeting
> Monday and Tuesday were assuming it was a general service that other
> kernel components could/would utilize and possibly also something that
> could send uevents to non-kernel components wanting a std. way to see
> membership information/events.
>
> As to kernel components without corresponding user-level "managers", look
> no farther than OpenSSI.  Our hope was that we could adapt to a user-land
> membership service and this interface thru configfs would drive all our
> kernel subsystems.

Guys, it is absolutely stupid to rely on a virtual filesystem for 
userspace/kernel communication for any events that might have to be 
transmitted inside the block IO path.  This includes, among other things, 
memberhips events.  Inserting a virtual filesystem into this path does 
nothing but add long call chains and new, hard-to-characterize memory 
usage.

There are already tried-and-true interfaces that are designed to do this 
kind of job efficiently and with quantifiable resource requirements: 
sockets (UNIX domain or netlink) and ioctls.  If you want to layer a 
virtual filesystem on top as a user friendly way to present current cluster 
configuration or as a way to provide some administrator knobs, then fine, 
virtual filesystems are good for this kind of thing.  But please do not try 
to insinuate that bloat into the block IO path.

Regards,

Daniel


From phung at cs.columbia.edu  Fri Jul 22 03:51:21 2005
From: phung at cs.columbia.edu (Dan B. Phung)
Date: Thu, 21 Jul 2005 23:51:21 -0400 (EDT)
Subject: [Linux-cluster] how to recover from process_recovery_barrier
	status=-104
Message-ID: <Pine.LNX.4.44.0507212342550.634-100000@algiers.clic.cs.columbia.edu>

My cluster went down pretty hard, in that I had to hard reboot several
machines, and now the fence daemon won't come up.  I run:

$ ccsd && cman_tool join -w 
$ fence_tool join -w -j 15 -D
blade02:~ # fence_tool join -w -D -j 15
fence_tool: wait for quorum 1
fence_tool: get our node name
fence_tool: connect to ccs
fence_tool: start fenced
fenced: 1122003465 our name from cman "blade02"
fenced: 1122003465 delay post_join 15s post_fail 0s
fenced: 1122003465 added 14 nodes from ccs

and it hangs there forever until I hit ^C.  

On one of the surviving machines, I see (dmesg):

SM: 00000001 process_recovery_barrier status=-104
CMAN: node blade03 has been removed from the cluster : Missed too many 
heartbeats
SM: 00000001 process_recovery_barrier status=-104
CMAN: node blade06 has been removed from the cluster : Missed too many 
heartbeats
SM: 00000001 process_recovery_barrier status=-104
CMAN: node blade09 has been removed from the cluster : No response to 
messages
CMAN: bad generation number 371 in HELLO message from 1, expected 370
CMAN: removing node blade08 from the cluster : No response to messages
CMAN: removing node blade07 from the cluster : No response to messages
CMAN: quorum lost, blocking activity
SM: 00000001 process_recovery_barrier status=-104

Is there a way to recover (restart gfs) without having to reboot this
last machine?  

thanks,
dan

p.s.  here's some more info:

blade13:~ # cman_tool nodes
Node  Votes Exp Sts  Name
   1    1    1   M   blade01
   2    1    1   X   blade02
   3    1    1   X   blade03
   4    1    1   X   blade04
   6    1    1   X   blade06
   7    1    1   X   blade07
   8    1    1   X   blade08
   9    1    1   X   blade09
  10    1    1   X   blade10
  11    1    1   X   blade11
  12    1    1   X   blade12
  13    1    1   M   blade13
  14    1    1   X   blade14

blade13:~ # cman_tool status
Protocol version: 5.0.1
Config version: 1
Cluster name: blade_cluster
Cluster ID: 38068
Cluster Member: Yes
Membership state: Cluster-Member
Nodes: 2
Expected_votes: 1
Total_votes: 2
Quorum: 2   
Active subsystems: 6
Node name: blade13

blade13:~ # cman_tool services
Service          Name                              GID LID State     Code
Fence Domain:    "default"                           1   2 recover 2 -
[13]

DLM Lock Space:  "clvmd"                             2   3 recover 0 -
[13]

DLM Lock Space:  "lil_cheesy1_lv"                   11   4 run       -
[13]

GFS Mount Group: "lil_cheesy1_lv"                   12   5 run       -
[13]


From teigland at redhat.com  Fri Jul 22 04:37:04 2005
From: teigland at redhat.com (David Teigland)
Date: Fri, 22 Jul 2005 12:37:04 +0800
Subject: [Linux-cluster] how to recover from process_recovery_barrier
	status=-104
In-Reply-To: <Pine.LNX.4.44.0507212342550.634-100000@algiers.clic.cs.columbia.edu>
References: <Pine.LNX.4.44.0507212342550.634-100000@algiers.clic.cs.columbia.edu>
Message-ID: <20050722043704.GA8771@redhat.com>

On Thu, Jul 21, 2005 at 11:51:21PM -0400, Dan B. Phung wrote:
> My cluster went down pretty hard, in that I had to hard reboot several
> machines, and now the fence daemon won't come up.  I run:
> 
> $ ccsd && cman_tool join -w 
> $ fence_tool join -w -j 15 -D
> blade02:~ # fence_tool join -w -D -j 15
> fence_tool: wait for quorum 1
> fence_tool: get our node name
> fence_tool: connect to ccs
> fence_tool: start fenced
> fenced: 1122003465 our name from cman "blade02"

This is inconsistent with the data below which shows that blade1 is a
cluster member, not blade2.  Maybe you collected the other data before
blade2 joined the cluster...

> blade13:~ # cman_tool nodes
> Node  Votes Exp Sts  Name
>    1    1    1   M   blade01
>    2    1    1   X   blade02
>    3    1    1   X   blade03
>    4    1    1   X   blade04
>    6    1    1   X   blade06
>    7    1    1   X   blade07
>    8    1    1   X   blade08
>    9    1    1   X   blade09
>   10    1    1   X   blade10
>   11    1    1   X   blade11
>   12    1    1   X   blade12
>   13    1    1   M   blade13
>   14    1    1   X   blade14
> 
> blade13:~ # cman_tool status
> Protocol version: 5.0.1
> Config version: 1
> Cluster name: blade_cluster
> Cluster ID: 38068
> Cluster Member: Yes
> Membership state: Cluster-Member
> Nodes: 2
> Expected_votes: 1
> Total_votes: 2
> Quorum: 2   
> Active subsystems: 6
> Node name: blade13
> 
> blade13:~ # cman_tool services
> Service          Name                              GID LID State     Code
> Fence Domain:    "default"                           1   2 recover 2 -
> [13]

This looks like blade13 is trying to fence some node.  blade13 won't let
anyone else join the fence domain until it's completed the fencing; this
is probably why fenced on blade02 isn't getting anywhere.
/var/log/messages on blade13 should show where or if there's an incomplete
fencing operation.

Dave


From phung at cs.columbia.edu  Fri Jul 22 04:56:12 2005
From: phung at cs.columbia.edu (Dan B. Phung)
Date: Fri, 22 Jul 2005 00:56:12 -0400 (EDT)
Subject: [Linux-cluster] how to recover from process_recovery_barrier
	status=-104
In-Reply-To: <20050722043704.GA8771@redhat.com>
Message-ID: <Pine.LNX.4.44.0507220049350.11204-100000@algiers.clic.cs.columbia.edu>

On 22, Jul, 2005, David Teigland declared:

> On Thu, Jul 21, 2005 at 11:51:21PM -0400, Dan B. Phung wrote:
> > My cluster went down pretty hard, in that I had to hard reboot several
> > machines, and now the fence daemon won't come up.  I run:
> > 
> > $ ccsd && cman_tool join -w 
> > $ fence_tool join -w -j 15 -D
> > blade02:~ # fence_tool join -w -D -j 15
> > fence_tool: wait for quorum 1
> > fence_tool: get our node name
> > fence_tool: connect to ccs
> > fence_tool: start fenced
> > fenced: 1122003465 our name from cman "blade02"
> 
> This is inconsistent with the data below which shows that blade1 is a
> cluster member, not blade2.  Maybe you collected the other data before
> blade2 joined the cluster...

right, actually I exited from the fence operation and force blade02 to
leave the cluster.

> This looks like blade13 is trying to fence some node.  blade13 won't let
> anyone else join the fence domain until it's completed the fencing; this
> is probably why fenced on blade02 isn't getting anywhere.
> /var/log/messages on blade13 should show where or if there's an incomplete
> fencing operation.

here's some excerpts from /var/log/messages:

Jul 21 16:48:05 blade13 kernel: qla2300 0000:02:02.0: LOOP DOWN detected.
Jul 21 16:48:37 blade13 kernel: SCSI error : <0 0 1 1> return code = 
0x10000
Jul 21 16:48:37 blade13 kernel: end_request: I/O error, dev sdb, sector 
69569288
Jul 21 16:48:37 blade13 kernel: SCSI error : <0 0 1 1> return code = 
0x10000
Jul 21 16:48:37 blade13 kernel: end_request: I/O error, dev sdb, sector 
69569296
Jul 21 16:48:37 blade13 kernel: SCSI error : <0 0 1 1> return code = 
0x10000
Jul 21 16:48:37 blade13 kernel: end_request: I/O error, dev sdb, sector 
69569304
Jul 21 16:48:37 blade13 kernel: SCSI error : <0 0 1 1> return code = 
0x10000
Jul 21 16:48:37 blade13 kernel: end_request: I/O error, dev sdb, sector 
69569312
Jul 21 16:48:37 blade13 kernel: SCSI error : <0 0 1 1> return code = 
0x10000
Jul 21 16:48:37 blade13 kernel: end_request: I/O error, dev sdb, sector 
69569320
Jul 21 16:48:37 blade13 kernel: SCSI error : <0 0 1 1> return code = 
0x10000
Jul 21 16:48:37 blade13 kernel: end_request: I/O error, dev sdb, sector 
69569328
Jul 21 16:48:37 blade13 kernel: SCSI error : <0 0 1 1> return code = 
0x10000
Jul 21 16:48:37 blade13 kernel: end_request: I/O error, dev sdb, sector 
69569336
Jul 21 16:48:37 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0: 
fatal: I/O error
Jul 21 16:48:37 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0:   
block = 8696119
Jul 21 16:48:37 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0:   
function = gfs_logbh_wait
Jul 21 16:48:37 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0:   
file = /usr/local/src/cluster-2.6.8.1/gfs-kernel/src/gfs/dio.c, line = 923
Jul 21 16:48:37 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0:   
time = 1121978916
Jul 21 16:48:37 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0: 
about to withdraw from the cluster
Jul 21 16:48:37 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0: 
waiting for outstanding I/O
Jul 21 16:48:37 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0: 
telling LM to withdraw
Jul 21 16:48:37 blade13 kernel: lock_dlm: withdraw abandoned memory
Jul 21 16:48:37 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0: 
withdrawn
Jul 21 16:49:33 blade13 kernel: qla2300 0000:02:02.0: LOOP UP detected (2 
Gbps).
Jul 21 17:01:34 blade13 shutdown[7987]: shutting down for system reboot

--
snipped reboot messages
--

Jul 21 17:04:17 blade13 kernel: CMAN: Waiting to join or form a 
Linux-cluster
Jul 21 17:04:20 blade13 kernel: CMAN: sending membership request
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade12
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade04
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade09
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade03
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade02
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade06
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade07
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade08
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade11
Jul 21 17:04:21 blade13 kernel: CMAN: got node blade01
Jul 21 17:04:24 blade13 clvmd: Cluster LVM daemon started - connected to 
CMAN
Jul 21 17:04:24 blade13 kernel: CMAN: WARNING no listener for port 11 on 
node blade01
Jul 21 17:18:16 blade13 kernel: GFS: Trying to join cluster "lock_dlm", 
"blade_cluster:lil_cheesy1_lv"
Jul 21 17:18:18 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0: 
Joined cluster. Now mounting FS...
Jul 21 17:18:18 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0: 
jid=0: Trying to acquire journal lock...
Jul 21 17:18:18 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0: 
jid=0: Looking at journal...
Jul 21 17:18:18 blade13 kernel: GFS: fsid=blade_cluster:lil_cheesy1_lv.0: 
jid=0: Done
(last message repeated 13 times)

Jul 21 23:14:57 blade13 kernel: CMAN: node blade04 rejoining
Jul 21 23:16:52 blade13 kernel: CMAN: node blade12 rejoining
Jul 21 23:21:16 blade13 kernel: CMAN: node blade12 has been removed from 
the cluster : Shutdown
Jul 21 23:23:02 blade13 kernel: CMAN: node blade02 has been removed from 
the cluster : Missed too many heartbeats
Jul 21 23:23:03 blade13 kernel: SM: 00000001 process_recovery_barrier 
status=-104
Jul 21 23:23:27 blade13 kernel: CMAN: node blade03 has been removed from 
the cluster : Missed too many heartbeats
Jul 21 23:23:28 blade13 kernel: SM: 00000001 process_recovery_barrier 
status=-104
Jul 21 23:24:12 blade13 kernel: CMAN: node blade06 has been removed from 
the cluster : Missed too many heartbeats
Jul 21 23:24:13 blade13 kernel: SM: 00000001 process_recovery_barrier 
status=-104
Jul 21 23:24:33 blade13 kernel: CMAN: node blade09 has been removed from 
the cluster : No response to messages
Jul 21 23:24:43 blade13 kernel: CMAN: removing node blade08 from the 
cluster : No response to messages
Jul 21 23:24:43 blade13 kernel: CMAN: removing node blade07 from the 
cluster : No response to messages
Jul 21 23:24:53 blade13 kernel: SM: 00000001 process_recovery_barrier 
status=-104


> > blade13:~ # cman_tool nodes
> > Node  Votes Exp Sts  Name
> >    1    1    1   M   blade01
> >    2    1    1   X   blade02
> >    3    1    1   X   blade03
> >    4    1    1   X   blade04
> >    6    1    1   X   blade06
> >    7    1    1   X   blade07
> >    8    1    1   X   blade08
> >    9    1    1   X   blade09
> >   10    1    1   X   blade10
> >   11    1    1   X   blade11
> >   12    1    1   X   blade12
> >   13    1    1   M   blade13
> >   14    1    1   X   blade14
> > 
> > blade13:~ # cman_tool status
> > Protocol version: 5.0.1
> > Config version: 1
> > Cluster name: blade_cluster
> > Cluster ID: 38068
> > Cluster Member: Yes
> > Membership state: Cluster-Member
> > Nodes: 2
> > Expected_votes: 1
> > Total_votes: 2
> > Quorum: 2   
> > Active subsystems: 6
> > Node name: blade13
> > 
> > blade13:~ # cman_tool services
> > Service          Name                              GID LID State     Code
> > Fence Domain:    "default"                           1   2 recover 2 -
> > [13]
> 
> 


From dawson at fnal.gov  Fri Jul 22 13:58:24 2005
From: dawson at fnal.gov (Troy Dawson)
Date: Fri, 22 Jul 2005 08:58:24 -0500
Subject: [Linux-cluster] RHEL4 srpm spec file bugs/problems
In-Reply-To: <42E02670.5090400@redhat.com>
References: <42E00205.7050109@fnal.gov> <42E02670.5090400@redhat.com>
Message-ID: <42E0FB80.3060001@fnal.gov>

Submitted as bug 163963

Thanks
Troy

Chris Feist wrote:
> Troy,
> 
> Could you please a bug for the messy BUILD/smp issue?  (The 
> fake-build-provides problem has just been fixed).
> 
> It will allow us to track this issue and make sure that it gets resolved.
> 
> 
> Thanks,
> Chris
> 
> Troy Dawson wrote:
> 
>> Howdy,
>> These problems might have already been taken care of (I haven't 
>> checked out the latest CVS), but just incase they haven't, here's some 
>> problems I encountered while rebuilding the RHEL4 src rpm's, and way's 
>> they could be fixed.
>>
>> First - cman-kernel has the build dependancy on fake-build-provides, 
>> which isn't in the RHEL4 release.  The way around this is to build it 
>> with the --nodeps option.  But it really should be taken out of the 
>> RHEL4 spec files unless it is going to be provided at some time.
>> (this was already discussed on this mailling list, but I'll put it 
>> here for completeness)
>>
>> Second - source rpm's that build kernel modules leave messy BUILD/smp 
>> and BUILD/hugemem directories when they are done.  These don't get 
>> cleaned up, and when you are building the next rpm that does kernel 
>> modules, there is conflicting files, so the next rpm doesn't build 
>> correctly.
>> Currently you have to remove those directories by hand after each rpm 
>> that builds a kernel module.  It would be nice if there was a couple 
>> of cleanup lines in the %clean section in the spec files.
>>
>> Thanks
>> Troy Dawson
> 
> 


-- 
__________________________________________________
Troy Dawson  dawson at fnal.gov  (630)840-6468
Fermilab  ComputingDivision/CSS  CSI Group
__________________________________________________


From high.mobley at prioritynetworks.net  Thu Jul 21 21:55:25 2005
From: high.mobley at prioritynetworks.net (High Mobley)
Date: Thu, 21 Jul 2005 14:55:25 -0700
Subject: [Linux-cluster] Compile errors
In-Reply-To: <20050721031050.GA10109@redhat.com>
References: <42DEF1AC.8070804@prioritynetworks.net>
	<20050721031050.GA10109@redhat.com>
Message-ID: <42E019CD.9070604@prioritynetworks.net>

David Teigland wrote:
> On Wed, Jul 20, 2005 at 05:51:56PM -0700, High Mobley wrote:
> 
>>I'm in need of the CLVM software for a project that I'm working on. It
>>appears that I require the full RH Cluster software suite in order to
>>run CLVM, so I've been trying to follow the information that I found in
>>your usage.txt file, as well as the INSTALL file that came with the cvs
>>code's STABLE branch.
>>
>>I was initially trying to compile against recent 2.6.x kernels, but then
>>I noticed that the kernel patches in the STABLE branch were in a
>>subdirectory called "2.6.9". So my latest attempts have been to compile
>>against linux-2.6.9 kernel sources. I have tried both the STABLE and
>>RHEL4 branches. Both have failed to compile.
>>
>>I am running Debian Sarge, but I am using vanilla kernel.org sources.
>>Thank you in advance for any direction that is provided!
> 
> 
> Don't use cvs, use the release instead:
>  ftp://sources.redhat.com/pub/cluster/releases/cluster-1.00.00.tar.gz
> 
> This compiles against vanilla linux-2.6.12 and doesn't use kernel patches.
> 
> Dave
> 


I tried compiling against 2.6.12 and 2.6.12.3 and got errors with both.
Since my build machine is kind of slow, I am using a very minimal kernel
config. I've turned off support for things like SCSI, USB, parallel
port, etc. Is it possible that I'm missing some needed kernel option?
I've included the output from 'make install' on the cluster-1.00.00
source below. Thanks again for the help.


strife:/usr/2.6.12-cluster/cluster-1.00.00# make install
cd cman-kernel && make install
make[1]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel'
cd src && make install
make[2]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src'
rm -f cluster
ln -s . cluster
make -C /usr/2.6.12-cluster/linux-2.6.12
M=/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src modules
USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cnxman.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/config.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/membership.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/proc.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_barrier.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_control.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_daemon.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_joinleave.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_membership.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_message.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_misc.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_recover.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_services.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_user.o
   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cman.o
   Building modules, stage 2.
   MODPOST
   CC      /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cman.mod.o
   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cman.ko
make[3]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
install -d //lib/modules/2.6.12/kernel/cluster
install cman.ko //lib/modules/2.6.12/kernel/cluster
install -d //usr/include/cluster
install cnxman.h cnxman-socket.h service.h //usr/include/cluster
make[2]: Leaving directory
`/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src'
make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel'
cd dlm-kernel && make install
make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel'
cd src && make install
make[2]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src'
if [ ! -e cluster ]; then ln -s . cluster; fi
if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
if [ ! -e cnxman-socket.h ]; then cp
//usr/include/cluster/cnxman-socket.h .; fi
make -C /usr/2.6.12-cluster/linux-2.6.12
M=/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src modules
USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/ast.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/config.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/device.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dir.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lkb.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/locking.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lockqueue.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lockspace.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lowcomms.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/main.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/memory.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/midcomms.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/nodes.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/proc.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/queries.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/rebuild.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/reccomms.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/recover.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/recoverd.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/rsb.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/util.o
   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.o
   Building modules, stage 2.
   MODPOST
*** Warning: "kcl_addref_cluster"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_node_by_addr"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_node_addresses"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_releaseref_cluster"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_current_interface"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_get_node_by_nodeid"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_leave_service"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_remove_callback"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_global_service_id"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_unregister_service"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_join_service"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_start_done"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_add_callback"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
*** Warning: "kcl_register_service"
[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
   CC      /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.mod.o
   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko
make[3]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
install -d //lib/modules/2.6.12/kernel/cluster
install dlm.ko //lib/modules/2.6.12/kernel/cluster
install -d //usr/include/cluster
install dlm.h dlm_device.h //usr/include/cluster
make[2]: Leaving directory
`/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src'
make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel'
cd gfs-kernel && make install
make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel'
cd src && make install
make[2]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src'
cd harness && make install
make[3]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness'
rm -f linux
ln -s . linux
make -C /usr/2.6.12-cluster/linux-2.6.12
M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness
symverfile=/usr/2.6.12-cluster/linux-2.6.12/../kernel/cluster/dlm.symvers
modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/main.o
   LD [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/lock_harness.o
   Building modules, stage 2.
   MODPOST
   CC
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/lock_harness.mod.o
   LD [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/lock_harness.ko
make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
/usr/2.6.12-cluster/linux-2.6.12/scripts/mod/modpost -m -i
/usr/2.6.12-cluster/linux-2.6.12/../kernel/cluster/dlm.symvers
../harness/lock_harness.o -o lock_harness.symvers
install -d //usr/include/linux
install lm_interface.h //usr/include/linux
install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_harness
install lock_harness.ko
//lib/modules/2.6.12/kernel/fs/gfs_locking/lock_harness
make[3]: Leaving directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness'
cd nolock && make install
make[3]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock'
echo

rm -f linux lm_interface.h
ln -s . linux
ln -s ../../src/harness/lm_interface.h .
make -C /usr/2.6.12-cluster/linux-2.6.12
M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock
symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/../harness/lock_harness.symvers 

modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/main.o
   LD [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/lock_nolock.o
   Building modules, stage 2.
   MODPOST
   CC
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/lock_nolock.mod.o
   LD [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/lock_nolock.ko
make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_nolock
install lock_nolock.ko
//lib/modules/2.6.12/kernel/fs/gfs_locking/lock_nolock
make[3]: Leaving directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock'
cd dlm && make install
make[3]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm'
if [ ! -e linux ]; then ln -s . linux; fi
if [ ! -e cluster ]; then ln -s . cluster; fi
if [ ! -e lm_interface.h ]; then ln -s ../../src/harness/lm_interface.h
.; fi
if [ ! -e dlm.h ]; then cp //usr/include/cluster/dlm.h .; fi
if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
if [ ! -e cnxman-socket.h ]; then cp
//usr/include/cluster/cnxman-socket.h .; fi
if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
make -C /usr/2.6.12-cluster/linux-2.6.12
M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm
symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/../harness/lock_harness.symvers 

modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/group.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/main.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/mount.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/plock.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/thread.o
   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.o
   Building modules, stage 2.
   MODPOST
*** Warning: "kcl_addref_cluster"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "kcl_get_services"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "kcl_cluster_name"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "dlm_lock"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "kcl_releaseref_cluster"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "kcl_get_members"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "dlm_query"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "kcl_leave_service"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "kcl_unregister_service"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "kcl_join_service"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "dlm_release_lockspace"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "kcl_start_done"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "dlm_unlock"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "kcl_register_service"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "dlm_new_lockspace"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
*** Warning: "dlm_debug_dump"
[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
undefined!
   CC
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.mod.o
   LD [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko
make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
/usr/2.6.12-cluster/linux-2.6.12/scripts/mod/modpost -m -i
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/../harness/lock_harness.symvers 

../dlm/lock_dlm.o -o lock_dlm.symvers
install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_dlm
install lock_dlm.ko //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_dlm
make[3]: Leaving directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm'
cd gulm && make install
make[3]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm'
rm -f linux lm_interface.h
ln -s . linux
ln -s ../../src/harness/lm_interface.h .
make -C /usr/2.6.12-cluster/linux-2.6.12
M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm
symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/../harness/lock_harness.symvers 

modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_core.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_firstlock.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_fs.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_jid.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_lock_queue.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_lt.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_main.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_plock.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_recsig.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/handler.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lg_core.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lg_lock.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lg_main.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/utils_tostr.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/xdr_base.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/xdr_io.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/xdr_socket.o
   LD [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lock_gulm.o
   Building modules, stage 2.
   MODPOST
   CC
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lock_gulm.mod.o
   LD [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lock_gulm.ko
make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
/usr/2.6.12-cluster/linux-2.6.12/scripts/mod/modpost -m -i
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/../harness/lock_harness.symvers 

../gulm/lock_gulm.o -o lock_gulm.symvers
install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_gulm
install lock_gulm.ko //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_gulm
make[3]: Leaving directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm'
cd gfs && make install
make[3]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs'
rm -f linux lm_interface.h
ln -s . linux
ln -s ../../src/harness/lm_interface.h .
make -C /usr/2.6.12-cluster/linux-2.6.12
M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs
symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/../gulm/lock_gulm.symvers 

modules USING_KBUILD=yes
make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/acl.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/bits.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/bmap.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/daemon.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/diaper.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/dio.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/dir.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/eaops.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/eattr.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/file.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/glock.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/glops.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/inode.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ioctl.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/lm.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/log.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/lops.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/lvb.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/main.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/mount.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ondisk.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_address.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_dentry.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_export.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_file.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_fstype.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_inode.o
   CC [M]
/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_super.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_vm.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/page.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/proc.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/quota.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/recovery.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/rgrp.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/super.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/trans.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/unlinked.o
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/util.o
   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/gfs.o
   Building modules, stage 2.
   MODPOST
   CC      /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/gfs.mod.o
   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/gfs.ko
make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
install -d //lib/modules/2.6.12/kernel/fs/gfs
install gfs.ko //lib/modules/2.6.12/kernel/fs/gfs
install -d //usr/include/linux
install gfs_ondisk.h gfs_ioctl.h //usr/include/linux
make[3]: Leaving directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs'
make[2]: Leaving directory
`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src'
make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel'
cd gnbd-kernel && make install
make[1]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel'
cd src && make install
make[2]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src'
rm -f linux
ln -s . linux
make -C /usr/2.6.12-cluster/linux-2.6.12
M=/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src modules
USING_KBUILD=yes
make[3]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src/gnbd.o
   Building modules, stage 2.
   MODPOST
   CC      /usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src/gnbd.mod.o
   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src/gnbd.ko
make[3]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
install -d //usr/include/linux
install gnbd.h //usr/include/linux
install -d //lib/modules/2.6.12/kernel/drivers/block/gnbd
install gnbd.ko //lib/modules/2.6.12/kernel/drivers/block/gnbd
make[2]: Leaving directory
`/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src'
make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel'
cd magma && make install
make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/magma'
cd lib && make install
make[2]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/lib'
gcc -c -o global.o global.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
-Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
gcc -c -o plugin.o plugin.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
-Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
gcc -c -o localinfo.o localinfo.c -I. -Wall -I. -g -Werror
-Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
-D_GNU_SOURCE -D_CLUSTER_
gcc -c -o ip_lookup.o ip_lookup.c -I. -Wall -I. -g -Werror
-Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
-D_GNU_SOURCE -D_CLUSTER_
gcc -c -o memberlist.o memberlist.c -I. -Wall -I. -g -Werror
-Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
-D_GNU_SOURCE -D_CLUSTER_
gcc -c -o clist.o clist.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
-Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
ar cr libmagma.a global.o plugin.o localinfo.o ip_lookup.o memberlist.o
clist.o
ar cr libmagma_nt.a plugin.o localinfo.o ip_lookup.o memberlist.o
gcc -c -o message.o message.c -I. -Wall -I. -g -Werror
-Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
-D_GNU_SOURCE -D_CLUSTER_
gcc -c -o fdops.o fdops.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
-Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
ar cr libmagmamsg.a message.o fdops.o
ld -shared -soname libmagma.so.1 -o libmagma.so.1.00.00 global.o
plugin.o localinfo.o ip_lookup.o memberlist.o clist.o -lc
ln -snf libmagma.so.1.00.00 libmagma.so.1
ld -shared -soname libmagma_nt.so.1 -o libmagma_nt.so.1.00.00 plugin.o
localinfo.o ip_lookup.o memberlist.o -lc
ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so.1
ld -shared -soname libmagmamsg.so.1 -o libmagmamsg.so.1.00.00 message.o
fdops.o -lc
ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so.1
ln -snf libmagma.so.1.00.00 libmagma.so
ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so
ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so
install -d /usr/lib
install -d /usr/lib
#
# Install high level magma libraries
#
install -m 0644 libmagma.a /usr/lib
install -m 0644 libmagma.so.1.00.00 /usr/lib
cd /usr/lib; ln -snf libmagma.so.1.00.00 libmagma.so.1; cd -
cd /usr/lib; ln -snf libmagma.so.1.00.00 libmagma.so; cd -
#
# Install low-overhead, non-pthread magma libraries
#
install -m 0644 libmagma_nt.a /usr/lib
install -m 0644 libmagma_nt.so.1.00.00 /usr/lib
cd /usr/lib; ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so.1; cd -
cd /usr/lib; ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so; cd -
#
# Install high level TCP messaging libraries
#
install -m 0644 libmagmamsg.a /usr/lib
install -m 0644 libmagmamsg.so.1.00.00 /usr/lib
cd /usr/lib; ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so.1; cd -
cd /usr/lib; ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so; cd -
#
# Install headers for magma, magma_nt, and magmamsg
#
install -d /usr/include
install -m 0644 magma.h /usr/include
install -m 0644 magmamsg.h /usr/include
install -m 0644 magma-build.h /usr/include
make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/lib'
cd man && make install
make[2]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/man'
install -d /usr/share/man/man3
install -d /usr/share/man/man8
install clu_connect.3 clu_disconnect.3 clu_get_event.3 /usr/share/man/man3
install magma_tool.8 /usr/share/man/man8
make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/man'
cd tests && make install
make[2]: Entering directory
`/usr/2.6.12-cluster/cluster-1.00.00/magma/tests'
gcc -c -o magma_tool.o magma_tool.c  -Wall -I../lib -DDEBUG
-D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
gcc -o magma_tool magma_tool.o -L../lib -lmagma -lmagmamsg -ldl
-lpthread -ggdb
gcc -c -o circleping.o circleping.c  -Wall -I../lib -DDEBUG
-D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
gcc -o circleping circleping.o -L../lib -lmagma -lmagmamsg -ldl
-lpthread -ggdb
gcc -c -o cluster_cmd.o cluster_cmd.c  -Wall -I../lib -DDEBUG
-D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
gcc -o cluster_cmd cluster_cmd.o -L../lib -lmagma -lmagmamsg -ldl -lpthread
gcc -c -o cptester.o cptester.c  -Wall -I../lib -DDEBUG -D_GNU_SOURCE
-DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
gcc -o cptester cptester.o -L../lib -lmagma_nt -ldl
gcc -c -o thread_test.o thread_test.c  -Wall -I../lib -DDEBUG
-D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
gcc -o thread_test thread_test.o -L../lib -lmagma -lmagmamsg -ldl
-lpthread -ggdb
install -d /sbin
install -m 0755 magma_tool circleping cluster_cmd /sbin
make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/tests'
make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma'
cd ccs && make install
make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs'
cd daemon && make install
make[2]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs/daemon'
gcc -Wall -I. -I../include -I../common -I../config -I//usr/include -Wall
-O2 -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE `xml2-config --cflags`
-DCCS_RELEASE_NAME=\"1.00.00\" -I. -I../include -I../common -I../config
-I//usr/include -o ccsd ccsd.c cnx_mgr.c cluster_mgr.c misc.c
../common/log.c globals.c -L//usr/lib `xml2-config --libs` -lpthread
-lmagma -lmagmamsg -ldl
/bin/sh: line 1: xml2-config: command not found
/bin/sh: line 1: xml2-config: command not found
ccsd.c:25:27: libxml/parser.h: No such file or directory
ccsd.c: In function `check_cluster_conf':
ccsd.c:470: error: `xmlDocPtr' undeclared (first use in this function)
ccsd.c:470: error: (Each undeclared identifier is reported only once
ccsd.c:470: error: for each function it appears in.)
ccsd.c:470: error: syntax error before "doc"
ccsd.c:473: error: `doc' undeclared (first use in this function)
ccsd.c:473: warning: implicit declaration of function `xmlParseFile'
ccsd.c:482: warning: implicit declaration of function `xmlFreeDoc'
cnx_mgr.c:25:27: libxml/parser.h: No such file or directory
cnx_mgr.c:26:25: libxml/tree.h: No such file or directory
cnx_mgr.c:27:26: libxml/xpath.h: No such file or directory
cnx_mgr.c:28:35: libxml/xpathInternals.h: No such file or directory
In file included from cnx_mgr.c:33:
misc.h:17: error: syntax error before "xmlDocPtr"
misc.h:17: warning: no semicolon at end of struct or union
misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
misc.h:18: warning: data definition has no type or storage class
misc.h:24: error: syntax error before '*' token
misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
misc.h:24: warning: data definition has no type or storage class
misc.h:26: error: syntax error before "ldoc"
misc.h:27: error: syntax error before "ldoc"
cnx_mgr.c:41: error: syntax error before "open_doc_t"
cnx_mgr.c:41: warning: no semicolon at end of struct or union
cnx_mgr.c:42: warning: type defaults to `int' in declaration of `oc_ctx'
cnx_mgr.c:42: warning: data definition has no type or storage class
cnx_mgr.c:43: warning: type defaults to `int' in declaration of
`open_connection_t'
cnx_mgr.c:43: warning: data definition has no type or storage class
cnx_mgr.c:48: error: syntax error before '*' token
cnx_mgr.c:48: warning: type defaults to `int' in declaration of `ocs'
cnx_mgr.c:48: warning: data definition has no type or storage class
cnx_mgr.c: In function `_update_config':
cnx_mgr.c:53: error: `tmp_odoc' undeclared (first use in this function)
cnx_mgr.c:53: error: (Each undeclared identifier is reported only once
cnx_mgr.c:53: error: for each function it appears in.)
cnx_mgr.c:54: error: `xmlDocPtr' undeclared (first use in this function)
cnx_mgr.c:54: error: syntax error before "tmp_doc"
cnx_mgr.c:58: error: `tmp_doc' undeclared (first use in this function)
cnx_mgr.c:58: warning: implicit declaration of function `xmlParseFile'
cnx_mgr.c:67: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:68: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:90: error: request for member `od_refs' in something not a
structure or union
cnx_mgr.c:92: warning: implicit declaration of function `xmlFreeDoc'
cnx_mgr.c:92: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:93: warning: implicit declaration of function `free'
cnx_mgr.c:104: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c: In function `broadcast_for_doc':
cnx_mgr.c:157: error: `xmlDocPtr' undeclared (first use in this function)
cnx_mgr.c:157: error: syntax error before "tmp_doc"
cnx_mgr.c:164: warning: implicit declaration of function `exit'
cnx_mgr.c:164: error: `EXIT_FAILURE' undeclared (first use in this function)
cnx_mgr.c:289: warning: implicit declaration of function `srandom'
cnx_mgr.c:293: warning: implicit declaration of function `random'
cnx_mgr.c:324: error: `tmp_doc' undeclared (first use in this function)
cnx_mgr.c:324: warning: implicit declaration of function `xmlParseMemory'
cnx_mgr.c:349: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:352: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:357: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:366: error: request for member `od_refs' in something not a
structure or union
cnx_mgr.c:367: error: `tmp_odoc' undeclared (first use in this function)
cnx_mgr.c:376: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:377: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:386: error: request for member `od_refs' in something not a
structure or union
cnx_mgr.c:396: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:397: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:450: warning: implicit declaration of function `xmlDocDump'
cnx_mgr.c:450: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c: In function `process_connect':
cnx_mgr.c:488: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:493: error: syntax error before ')' token
cnx_mgr.c:513: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:514: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:515: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:518: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:521: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:522: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:523: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:526: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:527: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:531: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:532: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:542: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:577: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:583: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:609: error: syntax error before ')' token
cnx_mgr.c:617: error: request for member `od_refs' in something not a
structure or union
cnx_mgr.c:618: error: request for member `oc_odoc' in something not a
structure or union
cnx_mgr.c:619: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:619: warning: implicit declaration of function
`xmlXPathNewContext'
cnx_mgr.c:619: error: request for member `oc_odoc' in something not a
structure or union
cnx_mgr.c:620: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:621: error: request for member `oc_odoc' in something not a
structure or union
cnx_mgr.c:631: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c: In function `process_disconnect':
cnx_mgr.c:658: error: `tmp_odoc' undeclared (first use in this function)
cnx_mgr.c:679: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:680: warning: implicit declaration of function
`xmlXPathFreeContext'
cnx_mgr.c:680: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:682: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:683: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:685: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:686: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:688: error: request for member `oc_odoc' in something not a
structure or union
cnx_mgr.c:692: error: `EXIT_FAILURE' undeclared (first use in this function)
cnx_mgr.c: In function `_process_get':
cnx_mgr.c:734: error: `xmlXPathObjectPtr' undeclared (first use in this
function)
cnx_mgr.c:734: error: syntax error before "obj"
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:757: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:758: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:764: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:765: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:766: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:768: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:774: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:784: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:789: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:792: error: `obj' undeclared (first use in this function)
cnx_mgr.c:792: warning: implicit declaration of function
`xmlXPathEvalExpression'
cnx_mgr.c:792: error: request for member `oc_ctx' in something not a
structure or union
cnx_mgr.c:797: error: `xmlNodePtr' undeclared (first use in this function)
cnx_mgr.c:797: error: syntax error before "node"
cnx_mgr.c:801: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:802: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:807: error: `node' undeclared (first use in this function)
cnx_mgr.c:807: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c:822: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
function)
cnx_mgr.c:823: error: `XML_ELEMENT_NODE' undeclared (first use in this
function)
cnx_mgr.c:868: warning: implicit declaration of function
`xmlXPathFreeObject'
cnx_mgr.c: In function `process_get_list':
cnx_mgr.c:897: error: request for member `oc_index' in something not a
structure or union
cnx_mgr.c: In function `process_get_state':
cnx_mgr.c:964: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:964: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:965: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:966: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:973: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:974: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:974: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:976: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:984: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:985: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:993: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c: In function `process_set_state':
cnx_mgr.c:1033: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:1034: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:1035: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c:1038: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:1039: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:1040: error: request for member `oc_query' in something not a
structure or union
cnx_mgr.c:1043: error: request for member `oc_cwp' in something not a
structure or union
cnx_mgr.c: In function `process_request':
cnx_mgr.c:1160: warning: implicit declaration of function `realloc'
cnx_mgr.c: In function `process_broadcast':
cnx_mgr.c:1247: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:1248: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:1266: warning: implicit declaration of function
`xmlDocDumpFormatMemory'
cnx_mgr.c:1266: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:1267: error: `xmlChar' undeclared (first use in this function)
cnx_mgr.c:1267: error: syntax error before ')' token
cnx_mgr.c:1301: error: request for member `od_doc' in something not a
structure or union
cnx_mgr.c:1302: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:24:27: libxml/parser.h: No such file or directory
In file included from cluster_mgr.c:29:
misc.h:17: error: syntax error before "xmlDocPtr"
misc.h:17: warning: no semicolon at end of struct or union
misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
misc.h:18: warning: data definition has no type or storage class
misc.h:24: error: syntax error before '*' token
misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
misc.h:24: warning: data definition has no type or storage class
misc.h:26: error: syntax error before "ldoc"
misc.h:27: error: syntax error before "ldoc"
cluster_mgr.c:38: error: syntax error before "tmp_doc"
cluster_mgr.c: In function `check_update_doc':
cluster_mgr.c:45: error: `tmp_doc' undeclared (first use in this function)
cluster_mgr.c:45: error: (Each undeclared identifier is reported only once
cluster_mgr.c:45: error: for each function it appears in.)
cluster_mgr.c:51: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:52: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:64: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:65: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:67: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c:67: error: request for member `od_doc' in something not a
structure or union
cluster_mgr.c: In function `handle_cluster_message':
cluster_mgr.c:93: error: `xmlDocPtr' undeclared (first use in this function)
cluster_mgr.c:93: error: syntax error before "tmp_doc"
cluster_mgr.c:146: error: `tmp_doc' undeclared (first use in this function)
cluster_mgr.c:146: warning: implicit declaration of function
`xmlParseMemory'
cluster_mgr.c:167: warning: implicit declaration of function `xmlDocDump'
cluster_mgr.c:198: warning: implicit declaration of function `xmlParseFile'
cluster_mgr.c:247: warning: implicit declaration of function `xmlFreeDoc'
misc.c:25:27: libxml/parser.h: No such file or directory
misc.c:27:27: libxml/parser.h: No such file or directory
misc.c:28:25: libxml/tree.h: No such file or directory
misc.c:29:26: libxml/xpath.h: No such file or directory
misc.c:30:35: libxml/xpathInternals.h: No such file or directory
In file included from misc.c:35:
misc.h:17: error: syntax error before "xmlDocPtr"
misc.h:17: warning: no semicolon at end of struct or union
misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
misc.h:18: warning: data definition has no type or storage class
misc.h:24: error: syntax error before '*' token
misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
misc.h:24: warning: data definition has no type or storage class
misc.h:26: error: syntax error before "ldoc"
misc.h:27: error: syntax error before "ldoc"
misc.c:42: error: syntax error before '*' token
misc.c:42: warning: type defaults to `int' in declaration of `master_doc'
misc.c:42: warning: data definition has no type or storage class
misc.c:44: error: syntax error before "ldoc"
misc.c: In function `get_doc_version':
misc.c:47: error: `xmlXPathObjectPtr' undeclared (first use in this
function)
misc.c:47: error: (Each undeclared identifier is reported only once
misc.c:47: error: for each function it appears in.)
misc.c:47: error: syntax error before "obj"
misc.c:48: error: `xmlXPathContextPtr' undeclared (first use in this
function)
misc.c:49: error: `xmlNodePtr' undeclared (first use in this function)
misc.c:53: error: `ctx' undeclared (first use in this function)
misc.c:53: warning: implicit declaration of function `xmlXPathNewContext'
misc.c:53: error: `ldoc' undeclared (first use in this function)
misc.c:60: error: `obj' undeclared (first use in this function)
misc.c:60: warning: implicit declaration of function
`xmlXPathEvalExpression'
misc.c:67: error: `node' undeclared (first use in this function)
misc.c:68: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
function)
misc.c:91: warning: implicit declaration of function `xmlXPathFreeContext'
misc.c:94: warning: implicit declaration of function `xmlXPathFreeObject'
misc.c: At top level:
misc.c:109: error: syntax error before "ldoc"
misc.c: In function `get_cluster_name':
misc.c:112: error: `xmlXPathObjectPtr' undeclared (first use in this
function)
misc.c:112: error: syntax error before "obj"
misc.c:113: error: `xmlXPathContextPtr' undeclared (first use in this
function)
misc.c:114: error: `xmlNodePtr' undeclared (first use in this function)
misc.c:118: error: `ctx' undeclared (first use in this function)
misc.c:118: error: `ldoc' undeclared (first use in this function)
misc.c:125: error: `obj' undeclared (first use in this function)
misc.c:132: error: `node' undeclared (first use in this function)
misc.c:133: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
function)
make[2]: *** [ccsd] Error 1
make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs/daemon'
make[1]: *** [install] Error 2
make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs'
make: *** [install] Error 2


From lhh at redhat.com  Fri Jul 22 15:44:04 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 22 Jul 2005 11:44:04 -0400
Subject: [Linux-cluster] Cluster IP alias doesn't work
In-Reply-To: <BAY104-F28F893F4F16D9113DE61B9C5D60@phx.gbl>
References: <BAY104-F28F893F4F16D9113DE61B9C5D60@phx.gbl>
Message-ID: <1122047044.21448.60.camel@ayanami.boston.redhat.com>

On Thu, 2005-07-21 at 14:54 -0400, haydar Ali wrote:

> I created a NFS share on /u04 (mount on /dev/sdd4) using the following 
> command cluadmin on RAC1
> I launched on 2 nodes Rac1 and RAC2 the following command:
> Service cluster start
> I checked the result on the 2 nodes:
> clustat

Ok, that config looks good.

> And I launched on RAC1 and other servers the following command:
> mount t nfs 192.168.253.20:/u04 /u04
> 
> And all are OK, I can list the /u04 content from any server (if I mount it).

Good.

> But my only problem is:
> 
> When I want to try a test, I stop the network service on RAC2:
> ifconfig eth0 down
> Then when I try from another server as RAC1 to list the /u04 content, it 
> doesnt work and doesnt respond, and when I ping the IP alias 
> 192.168.253.20 it doesnt respond also.
> 
> Have you any idea to fix this problem?

That's expected behavior...

The IP addresses are all bound to eth0. (eth0:0, eth0:1, etc...)  If you
bring the entire interface down, all service IPs will also go down.

RHEL 2.1's clumanager does not perform link monitoring for services.
You'll have to use the "mon" utility to get it working if you want it.

-- Lon


From lmb at suse.de  Sun Jul 24 20:38:50 2005
From: lmb at suse.de (Lars Marowsky-Bree)
Date: Sun, 24 Jul 2005 22:38:50 +0200
Subject: [Linux-cluster] Where to go with cman ?
In-Reply-To: <42DB63F6.5070600@redhat.com>
References: <42DB63F6.5070600@redhat.com>
Message-ID: <20050724203850.GM8005@marowsky-bree.de>

On 2005-07-18T09:10:30, Patrick Caulfield <pcaulfie at redhat.com> wrote:

A late reply is better than no reply I guess.

FWIW, the Linux-HA project has been going through some of the same
thoughts. 

(Not wanting to make any statements about the future of NCS here, which
is a different group, but I know that Robert is keenly aware of
OpenAIS/libtotem etc too.)

Linux-HA is looking to integrate with OpenAIS. The exact design is yet
to be understood, but what is clear so far is that I'm convinced we'll
support running on top of OpenAIS as an additional messaging layer, and
figure out a way how to keep our Concensus Cluster Membership and
OpenAIS's one synchronized.

We're not quite ready to abandon our existing communication layer
completely in exchange with OpenAIS, for a variety of good reasons:

- We support some scenarios which OpenAIS doesn't; we can support raw
  serial communication links for example, which is very useful for 2
  node clusters (which make up 100% of our existing userbase and
  probably even with the ability to support N nodes, will continue to be
  in excess of 90% of all installations.)

- OpenAIS relies on the cluster network to be protected by IP/Sec (or
  other forms of network layer security, and be it just "don't you dare
  run this over anything but private links"). This is well for some
  scenarios, but makes setup more complex.
  
- A gradual shiftover is required; both in terms of managing risk for
  our own release cycle (we understand what we have, while we have yet
  to work out the quirks of OpenAIS), as well as allowing customers to
  do a rolling upgrade.

- It will allow other applications to interoperate with our comm layer
  just fine - you can simply add OpenAIS to an existing cluster.

Of course, there's some downside to this, namely that we cannot rely on
advanced OpenAIS features like extended virtual synchrony being
available everywhere, so we cannot ourselves take full advantage of the
simplifications this might offer for our join protocol - we can, but we
have to support both code paths for the transition period.

I believe this is a step in the right direction, and fully support the
reasons you mentioned.

> The disadvantages are
> - Need to learn the internals of someone else's code.

Again, I think this is offset by the increased number of people pounding
at the code.

> - We don't have full control over the code. Although we can obviously
> fork it if we feel the need it would, obviously be preferable not to.

I'd suggest to not consider this an option, it'll kill the
interoperability again. ;-)

> - non-compatibility with "old" cman, making rolling upgrades har or
> even impossible. I'm not sure what to do about this yet, but it's
> worth pointing out that the DLM has a new line-protocol too.

Maybe something like the strategy I described above and which we'll use
might work for you here too?

> - openAIS is BSD licensed, I don't think this is a problem but it
> probably needs checking.

Consuming a BSD licensed library in GPL'ed code should not be a problem.

> In short, I'm advocating adopting the openAIS core (libtotem
> basically) as CMAN's communications/membership protocol. If we're
> going to do a "CMAN V2" that has anything significant over V1 then
> re-inventing it is going to be a huge amount of work that someone else
> has already done.

Again, this would go quite a step towards resolving cluster
interoperability issues.


Sincerely,
    Lars Marowsky-Br?e <lmb at suse.de>

-- 
High Availability & Clustering
SUSE Labs, Research and Development
SUSE LINUX Products GmbH - A Novell Business	 -- Charles Darwin
"Ignorance more frequently begets confidence than does knowledge"

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 189 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050724/b2986121/attachment.sig>

From Simon.Gibbs at informa.com  Mon Jul 25 13:36:49 2005
From: Simon.Gibbs at informa.com (Gibbs, Simon)
Date: Mon, 25 Jul 2005 14:36:49 +0100
Subject: [Linux-cluster] CMAN/CCSD errors after fresh install of Cluster
	Suite
Message-ID: <BF0AA981.C014%Simon.Gibbs@informa.com>

Oops. Screen resolution was set quite low so Cluster management window was
dropping off the screen, which meant the add node button wasn?t visible.

Thanks.


********************************************************************************
The information contained in this email message may be confidential. If you are not the intended recipient, any use, interference with, disclosure or copying of this material is unauthorised and prohibited. Although this message and any attachments are believed to be free of viruses, no responsibility is accepted by T&F Informa for any loss or damage arising in any way from receipt or use thereof.  Messages to and from the company are monitored for operational reasons and in accordance with lawful business practices. 
If you have received this message in error, please notify us by return and delete the message and any attachments.  Further enquiries/returns can be sent to postmaster at tfinforma.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050725/8fa850ed/attachment.htm>

From Simon.Gibbs at informa.com  Mon Jul 25 13:49:29 2005
From: Simon.Gibbs at informa.com (Gibbs, Simon)
Date: Mon, 25 Jul 2005 14:49:29 +0100
Subject: [Linux-cluster] RHCS4 adding mount options to filesystem resource
Message-ID: <BF0AAC79.C016%Simon.Gibbs@informa.com>

Hi,

Can anyone tell me if it?s possible to add mount options such as acl to
filesystem resources configured in RHCS4?

Thanks,

Simon


********************************************************************************
The information contained in this email message may be confidential. If you are not the intended recipient, any use, interference with, disclosure or copying of this material is unauthorised and prohibited. Although this message and any attachments are believed to be free of viruses, no responsibility is accepted by T&F Informa for any loss or damage arising in any way from receipt or use thereof.  Messages to and from the company are monitored for operational reasons and in accordance with lawful business practices. 
If you have received this message in error, please notify us by return and delete the message and any attachments.  Further enquiries/returns can be sent to postmaster at tfinforma.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050725/c425447a/attachment.htm>

From adingman at cookgroup.com  Mon Jul 25 13:52:11 2005
From: adingman at cookgroup.com (Andrew C. Dingman)
Date: Mon, 25 Jul 2005 08:52:11 -0500
Subject: [Linux-cluster] RHCS4 adding mount options to filesystem resource
In-Reply-To: <BF0AAC79.C016%Simon.Gibbs@informa.com>
References: <BF0AAC79.C016%Simon.Gibbs@informa.com>
Message-ID: <1122299531.3377.4.camel@adingman.cin.cook>

I'm not sure about cluster suite itself, but I would think that if all
else fails you could probably use 'tune2fs' to change the default mount
options on an ext2/3 file system. I assume you're not talking about a
GFS mount if you're using cluster suite to mount and unmount it?

On Mon, 2005-07-25 at 14:49 +0100, Gibbs, Simon wrote:
> Hi,
> 
> Can anyone tell me if it?s possible to add mount options such as acl
> to filesystem resources configured in RHCS4?
> 
> Thanks,
> 
> Simon
> 
> 
> 
> ********************************************************************************
> The information contained in this email message may be confidential.
> If you are not the intended recipient, any use, interference with,
> disclosure or copying of this material is unauthorised and prohibited.
> Although this message and any attachments are believed to be free of
> viruses, no responsibility is accepted by T&F Informa for any loss or
> damage arising in any way from receipt or use thereof. Messages to and
> from the company are monitored for operational reasons and in
> accordance with lawful business practices. 
> If you have received this message in error, please notify us by return
> and delete the message and any attachments. Further enquiries/returns
> can be sent to postmaster at tfinforma.com
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster


From lhh at redhat.com  Mon Jul 25 14:07:24 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 25 Jul 2005 10:07:24 -0400
Subject: [Linux-cluster] RHCS4 adding mount options to filesystem resource
In-Reply-To: <BF0AAC79.C016%Simon.Gibbs@informa.com>
References: <BF0AAC79.C016%Simon.Gibbs@informa.com>
Message-ID: <1122300444.3506.21.camel@ayanami.boston.redhat.com>

On Mon, 2005-07-25 at 14:49 +0100, Gibbs, Simon wrote:
> Hi,
> 
> Can anyone tell me if it?s possible to add mount options such as acl
> to filesystem resources configured in RHCS4?

options="...."  in the <fs> / <netfs> / <clusterfs> tags, I think.

It should be in the UI.

-- Lon


From Simon.Gibbs at informa.com  Mon Jul 25 16:08:12 2005
From: Simon.Gibbs at informa.com (Gibbs, Simon)
Date: Mon, 25 Jul 2005 17:08:12 +0100
Subject: [Linux-cluster] RHCS4 adding mount options to filesystem resource
Message-ID: <BF0ACCFC.C031%Simon.Gibbs@informa.com>

They?re a couple of ext3 filesystems and tune2fs did the trick.

I?ll have a look at adding the options=?...? tag in the cluster.conf file in
the future but it?s running OK at the moment so don?t want to start hacking
about to much.
I?ve double checked the GUI and there isn?t an ?option? field for filesystem
resources, although there are for NFS and GFS ? surprising because I?m sure
this was in RHCS3.

Thanks,

Simon


********************************************************************************
The information contained in this email message may be confidential. If you are not the intended recipient, any use, interference with, disclosure or copying of this material is unauthorised and prohibited. Although this message and any attachments are believed to be free of viruses, no responsibility is accepted by T&F Informa for any loss or damage arising in any way from receipt or use thereof.  Messages to and from the company are monitored for operational reasons and in accordance with lawful business practices. 
If you have received this message in error, please notify us by return and delete the message and any attachments.  Further enquiries/returns can be sent to postmaster at tfinforma.com

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050725/7be57d44/attachment.htm>

From lhh at redhat.com  Mon Jul 25 19:15:05 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 25 Jul 2005 15:15:05 -0400
Subject: [Linux-cluster] RHCS4 adding mount options to filesystem resource
In-Reply-To: <BF0ACCFC.C031%Simon.Gibbs@informa.com>
References: <BF0ACCFC.C031%Simon.Gibbs@informa.com>
Message-ID: <1122318905.3506.69.camel@ayanami.boston.redhat.com>

On Mon, 2005-07-25 at 17:08 +0100, Gibbs, Simon wrote:
> They?re a couple of ext3 filesystems and tune2fs did the trick.
> 
> I?ll have a look at adding the options=?...? tag in the cluster.conf
> file in the future but it?s running OK at the moment so don?t want to
> start hacking about to much.
> I?ve double checked the GUI and there isn?t an ?option? field for
> filesystem resources, although there are for NFS and GFS ? surprising
> because I?m sure this was in RHCS3.

Can you file a bugzilla against system-config-cluster?

-- Lon


From phung at cs.columbia.edu  Fri Jul 22 15:10:33 2005
From: phung at cs.columbia.edu (Dan B. Phung)
Date: Fri, 22 Jul 2005 11:10:33 -0400 (EDT)
Subject: [Linux-cluster] Compile errors
In-Reply-To: <42E019CD.9070604@prioritynetworks.net>
Message-ID: <Pine.LNX.4.44.0507221108350.11204-100000@algiers.clic.cs.columbia.edu>

look like you need to install libxml:

  aptitude install libxml2-dev

here's the list of things I need to install on debian to ensure that it
works.

  aptitude install libxml2-dev libdm0-dev libselinux1 libselinux1-dev libnet-telnet-perl


On 21, Jul, 2005, High Mobley declared:

> David Teigland wrote:
> > On Wed, Jul 20, 2005 at 05:51:56PM -0700, High Mobley wrote:
> > 
> >>I'm in need of the CLVM software for a project that I'm working on. It
> >>appears that I require the full RH Cluster software suite in order to
> >>run CLVM, so I've been trying to follow the information that I found in
> >>your usage.txt file, as well as the INSTALL file that came with the cvs
> >>code's STABLE branch.
> >>
> >>I was initially trying to compile against recent 2.6.x kernels, but then
> >>I noticed that the kernel patches in the STABLE branch were in a
> >>subdirectory called "2.6.9". So my latest attempts have been to compile
> >>against linux-2.6.9 kernel sources. I have tried both the STABLE and
> >>RHEL4 branches. Both have failed to compile.
> >>
> >>I am running Debian Sarge, but I am using vanilla kernel.org sources.
> >>Thank you in advance for any direction that is provided!
> > 
> > 
> > Don't use cvs, use the release instead:
> >  ftp://sources.redhat.com/pub/cluster/releases/cluster-1.00.00.tar.gz
> > 
> > This compiles against vanilla linux-2.6.12 and doesn't use kernel patches.
> > 
> > Dave
> > 
> 
> 
> 
> I tried compiling against 2.6.12 and 2.6.12.3 and got errors with both.
> Since my build machine is kind of slow, I am using a very minimal kernel
> config. I've turned off support for things like SCSI, USB, parallel
> port, etc. Is it possible that I'm missing some needed kernel option?
> I've included the output from 'make install' on the cluster-1.00.00
> source below. Thanks again for the help.
> 
> 
> 
> strife:/usr/2.6.12-cluster/cluster-1.00.00# make install
> cd cman-kernel && make install
> make[1]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel'
> cd src && make install
> make[2]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src'
> rm -f cluster
> ln -s . cluster
> make -C /usr/2.6.12-cluster/linux-2.6.12
> M=/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src modules
> USING_KBUILD=yes
> make[3]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cnxman.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/config.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/membership.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/proc.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_barrier.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_control.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_daemon.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_joinleave.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_membership.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_message.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_misc.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_recover.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_services.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_user.o
>    LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cman.o
>    Building modules, stage 2.
>    MODPOST
>    CC      /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cman.mod.o
>    LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cman.ko
> make[3]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
> install -d //lib/modules/2.6.12/kernel/cluster
> install cman.ko //lib/modules/2.6.12/kernel/cluster
> install -d //usr/include/cluster
> install cnxman.h cnxman-socket.h service.h //usr/include/cluster
> make[2]: Leaving directory
> `/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src'
> make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel'
> cd dlm-kernel && make install
> make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel'
> cd src && make install
> make[2]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src'
> if [ ! -e cluster ]; then ln -s . cluster; fi
> if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
> if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
> if [ ! -e cnxman-socket.h ]; then cp
> //usr/include/cluster/cnxman-socket.h .; fi
> make -C /usr/2.6.12-cluster/linux-2.6.12
> M=/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src modules
> USING_KBUILD=yes
> make[3]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/ast.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/config.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/device.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dir.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lkb.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/locking.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lockqueue.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lockspace.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lowcomms.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/main.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/memory.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/midcomms.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/nodes.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/proc.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/queries.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/rebuild.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/reccomms.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/recover.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/recoverd.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/rsb.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/util.o
>    LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.o
>    Building modules, stage 2.
>    MODPOST
> *** Warning: "kcl_addref_cluster"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_get_node_by_addr"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_get_node_addresses"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_releaseref_cluster"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_get_current_interface"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_get_node_by_nodeid"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_leave_service"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_remove_callback"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_global_service_id"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_unregister_service"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_join_service"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_start_done"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_add_callback"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
> *** Warning: "kcl_register_service"
> [/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>    CC      /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.mod.o
>    LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko
> make[3]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
> install -d //lib/modules/2.6.12/kernel/cluster
> install dlm.ko //lib/modules/2.6.12/kernel/cluster
> install -d //usr/include/cluster
> install dlm.h dlm_device.h //usr/include/cluster
> make[2]: Leaving directory
> `/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src'
> make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel'
> cd gfs-kernel && make install
> make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel'
> cd src && make install
> make[2]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src'
> cd harness && make install
> make[3]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness'
> rm -f linux
> ln -s . linux
> make -C /usr/2.6.12-cluster/linux-2.6.12
> M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness
> symverfile=/usr/2.6.12-cluster/linux-2.6.12/../kernel/cluster/dlm.symvers
> modules USING_KBUILD=yes
> make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/main.o
>    LD [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/lock_harness.o
>    Building modules, stage 2.
>    MODPOST
>    CC
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/lock_harness.mod.o
>    LD [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/lock_harness.ko
> make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
> /usr/2.6.12-cluster/linux-2.6.12/scripts/mod/modpost -m -i
> /usr/2.6.12-cluster/linux-2.6.12/../kernel/cluster/dlm.symvers
> ../harness/lock_harness.o -o lock_harness.symvers
> install -d //usr/include/linux
> install lm_interface.h //usr/include/linux
> install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_harness
> install lock_harness.ko
> //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_harness
> make[3]: Leaving directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness'
> cd nolock && make install
> make[3]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock'
> echo
> 
> rm -f linux lm_interface.h
> ln -s . linux
> ln -s ../../src/harness/lm_interface.h .
> make -C /usr/2.6.12-cluster/linux-2.6.12
> M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock
> symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/../harness/lock_harness.symvers 
> 
> modules USING_KBUILD=yes
> make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/main.o
>    LD [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/lock_nolock.o
>    Building modules, stage 2.
>    MODPOST
>    CC
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/lock_nolock.mod.o
>    LD [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/lock_nolock.ko
> make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
> install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_nolock
> install lock_nolock.ko
> //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_nolock
> make[3]: Leaving directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock'
> cd dlm && make install
> make[3]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm'
> if [ ! -e linux ]; then ln -s . linux; fi
> if [ ! -e cluster ]; then ln -s . cluster; fi
> if [ ! -e lm_interface.h ]; then ln -s ../../src/harness/lm_interface.h
> .; fi
> if [ ! -e dlm.h ]; then cp //usr/include/cluster/dlm.h .; fi
> if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
> if [ ! -e cnxman-socket.h ]; then cp
> //usr/include/cluster/cnxman-socket.h .; fi
> if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
> make -C /usr/2.6.12-cluster/linux-2.6.12
> M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm
> symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/../harness/lock_harness.symvers 
> 
> modules USING_KBUILD=yes
> make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/group.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/main.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/mount.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/plock.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/thread.o
>    LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.o
>    Building modules, stage 2.
>    MODPOST
> *** Warning: "kcl_addref_cluster"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "kcl_get_services"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "kcl_cluster_name"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "dlm_lock"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "kcl_releaseref_cluster"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "kcl_get_members"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "dlm_query"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "kcl_leave_service"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "kcl_unregister_service"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "kcl_join_service"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "dlm_release_lockspace"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "kcl_start_done"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "dlm_unlock"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "kcl_register_service"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "dlm_new_lockspace"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
> *** Warning: "dlm_debug_dump"
> [/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
> undefined!
>    CC
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.mod.o
>    LD [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko
> make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
> /usr/2.6.12-cluster/linux-2.6.12/scripts/mod/modpost -m -i
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/../harness/lock_harness.symvers 
> 
> ../dlm/lock_dlm.o -o lock_dlm.symvers
> install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_dlm
> install lock_dlm.ko //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_dlm
> make[3]: Leaving directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm'
> cd gulm && make install
> make[3]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm'
> rm -f linux lm_interface.h
> ln -s . linux
> ln -s ../../src/harness/lm_interface.h .
> make -C /usr/2.6.12-cluster/linux-2.6.12
> M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm
> symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/../harness/lock_harness.symvers 
> 
> modules USING_KBUILD=yes
> make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_core.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_firstlock.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_fs.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_jid.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_lock_queue.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_lt.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_main.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_plock.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_recsig.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/handler.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lg_core.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lg_lock.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lg_main.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/utils_tostr.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/xdr_base.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/xdr_io.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/xdr_socket.o
>    LD [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lock_gulm.o
>    Building modules, stage 2.
>    MODPOST
>    CC
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lock_gulm.mod.o
>    LD [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lock_gulm.ko
> make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
> /usr/2.6.12-cluster/linux-2.6.12/scripts/mod/modpost -m -i
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/../harness/lock_harness.symvers 
> 
> ../gulm/lock_gulm.o -o lock_gulm.symvers
> install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_gulm
> install lock_gulm.ko //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_gulm
> make[3]: Leaving directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm'
> cd gfs && make install
> make[3]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs'
> rm -f linux lm_interface.h
> ln -s . linux
> ln -s ../../src/harness/lm_interface.h .
> make -C /usr/2.6.12-cluster/linux-2.6.12
> M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs
> symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/../gulm/lock_gulm.symvers 
> 
> modules USING_KBUILD=yes
> make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/acl.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/bits.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/bmap.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/daemon.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/diaper.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/dio.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/dir.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/eaops.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/eattr.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/file.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/glock.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/glops.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/inode.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ioctl.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/lm.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/log.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/lops.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/lvb.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/main.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/mount.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ondisk.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_address.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_dentry.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_export.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_file.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_fstype.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_inode.o
>    CC [M]
> /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_super.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_vm.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/page.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/proc.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/quota.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/recovery.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/rgrp.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/super.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/trans.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/unlinked.o
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/util.o
>    LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/gfs.o
>    Building modules, stage 2.
>    MODPOST
>    CC      /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/gfs.mod.o
>    LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/gfs.ko
> make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
> install -d //lib/modules/2.6.12/kernel/fs/gfs
> install gfs.ko //lib/modules/2.6.12/kernel/fs/gfs
> install -d //usr/include/linux
> install gfs_ondisk.h gfs_ioctl.h //usr/include/linux
> make[3]: Leaving directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs'
> make[2]: Leaving directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src'
> make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel'
> cd gnbd-kernel && make install
> make[1]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel'
> cd src && make install
> make[2]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src'
> rm -f linux
> ln -s . linux
> make -C /usr/2.6.12-cluster/linux-2.6.12
> M=/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src modules
> USING_KBUILD=yes
> make[3]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>    CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src/gnbd.o
>    Building modules, stage 2.
>    MODPOST
>    CC      /usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src/gnbd.mod.o
>    LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src/gnbd.ko
> make[3]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
> install -d //usr/include/linux
> install gnbd.h //usr/include/linux
> install -d //lib/modules/2.6.12/kernel/drivers/block/gnbd
> install gnbd.ko //lib/modules/2.6.12/kernel/drivers/block/gnbd
> make[2]: Leaving directory
> `/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src'
> make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel'
> cd magma && make install
> make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/magma'
> cd lib && make install
> make[2]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/lib'
> gcc -c -o global.o global.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
> -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
> gcc -c -o plugin.o plugin.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
> -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
> gcc -c -o localinfo.o localinfo.c -I. -Wall -I. -g -Werror
> -Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
> -D_GNU_SOURCE -D_CLUSTER_
> gcc -c -o ip_lookup.o ip_lookup.c -I. -Wall -I. -g -Werror
> -Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
> -D_GNU_SOURCE -D_CLUSTER_
> gcc -c -o memberlist.o memberlist.c -I. -Wall -I. -g -Werror
> -Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
> -D_GNU_SOURCE -D_CLUSTER_
> gcc -c -o clist.o clist.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
> -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
> ar cr libmagma.a global.o plugin.o localinfo.o ip_lookup.o memberlist.o
> clist.o
> ar cr libmagma_nt.a plugin.o localinfo.o ip_lookup.o memberlist.o
> gcc -c -o message.o message.c -I. -Wall -I. -g -Werror
> -Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
> -D_GNU_SOURCE -D_CLUSTER_
> gcc -c -o fdops.o fdops.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
> -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
> ar cr libmagmamsg.a message.o fdops.o
> ld -shared -soname libmagma.so.1 -o libmagma.so.1.00.00 global.o
> plugin.o localinfo.o ip_lookup.o memberlist.o clist.o -lc
> ln -snf libmagma.so.1.00.00 libmagma.so.1
> ld -shared -soname libmagma_nt.so.1 -o libmagma_nt.so.1.00.00 plugin.o
> localinfo.o ip_lookup.o memberlist.o -lc
> ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so.1
> ld -shared -soname libmagmamsg.so.1 -o libmagmamsg.so.1.00.00 message.o
> fdops.o -lc
> ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so.1
> ln -snf libmagma.so.1.00.00 libmagma.so
> ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so
> ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so
> install -d /usr/lib
> install -d /usr/lib
> #
> # Install high level magma libraries
> #
> install -m 0644 libmagma.a /usr/lib
> install -m 0644 libmagma.so.1.00.00 /usr/lib
> cd /usr/lib; ln -snf libmagma.so.1.00.00 libmagma.so.1; cd -
> cd /usr/lib; ln -snf libmagma.so.1.00.00 libmagma.so; cd -
> #
> # Install low-overhead, non-pthread magma libraries
> #
> install -m 0644 libmagma_nt.a /usr/lib
> install -m 0644 libmagma_nt.so.1.00.00 /usr/lib
> cd /usr/lib; ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so.1; cd -
> cd /usr/lib; ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so; cd -
> #
> # Install high level TCP messaging libraries
> #
> install -m 0644 libmagmamsg.a /usr/lib
> install -m 0644 libmagmamsg.so.1.00.00 /usr/lib
> cd /usr/lib; ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so.1; cd -
> cd /usr/lib; ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so; cd -
> #
> # Install headers for magma, magma_nt, and magmamsg
> #
> install -d /usr/include
> install -m 0644 magma.h /usr/include
> install -m 0644 magmamsg.h /usr/include
> install -m 0644 magma-build.h /usr/include
> make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/lib'
> cd man && make install
> make[2]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/man'
> install -d /usr/share/man/man3
> install -d /usr/share/man/man8
> install clu_connect.3 clu_disconnect.3 clu_get_event.3 /usr/share/man/man3
> install magma_tool.8 /usr/share/man/man8
> make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/man'
> cd tests && make install
> make[2]: Entering directory
> `/usr/2.6.12-cluster/cluster-1.00.00/magma/tests'
> gcc -c -o magma_tool.o magma_tool.c  -Wall -I../lib -DDEBUG
> -D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
> -DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
> gcc -o magma_tool magma_tool.o -L../lib -lmagma -lmagmamsg -ldl
> -lpthread -ggdb
> gcc -c -o circleping.o circleping.c  -Wall -I../lib -DDEBUG
> -D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
> -DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
> gcc -o circleping circleping.o -L../lib -lmagma -lmagmamsg -ldl
> -lpthread -ggdb
> gcc -c -o cluster_cmd.o cluster_cmd.c  -Wall -I../lib -DDEBUG
> -D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
> -DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
> gcc -o cluster_cmd cluster_cmd.o -L../lib -lmagma -lmagmamsg -ldl -lpthread
> gcc -c -o cptester.o cptester.c  -Wall -I../lib -DDEBUG -D_GNU_SOURCE
> -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
> -DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
> gcc -o cptester cptester.o -L../lib -lmagma_nt -ldl
> gcc -c -o thread_test.o thread_test.c  -Wall -I../lib -DDEBUG
> -D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
> -DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
> gcc -o thread_test thread_test.o -L../lib -lmagma -lmagmamsg -ldl
> -lpthread -ggdb
> install -d /sbin
> install -m 0755 magma_tool circleping cluster_cmd /sbin
> make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/tests'
> make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma'
> cd ccs && make install
> make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs'
> cd daemon && make install
> make[2]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs/daemon'
> gcc -Wall -I. -I../include -I../common -I../config -I//usr/include -Wall
> -O2 -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE `xml2-config --cflags`
> -DCCS_RELEASE_NAME=\"1.00.00\" -I. -I../include -I../common -I../config
> -I//usr/include -o ccsd ccsd.c cnx_mgr.c cluster_mgr.c misc.c
> ../common/log.c globals.c -L//usr/lib `xml2-config --libs` -lpthread
> -lmagma -lmagmamsg -ldl
> /bin/sh: line 1: xml2-config: command not found
> /bin/sh: line 1: xml2-config: command not found
> ccsd.c:25:27: libxml/parser.h: No such file or directory
> ccsd.c: In function `check_cluster_conf':
> ccsd.c:470: error: `xmlDocPtr' undeclared (first use in this function)
> ccsd.c:470: error: (Each undeclared identifier is reported only once
> ccsd.c:470: error: for each function it appears in.)
> ccsd.c:470: error: syntax error before "doc"
> ccsd.c:473: error: `doc' undeclared (first use in this function)
> ccsd.c:473: warning: implicit declaration of function `xmlParseFile'
> ccsd.c:482: warning: implicit declaration of function `xmlFreeDoc'
> cnx_mgr.c:25:27: libxml/parser.h: No such file or directory
> cnx_mgr.c:26:25: libxml/tree.h: No such file or directory
> cnx_mgr.c:27:26: libxml/xpath.h: No such file or directory
> cnx_mgr.c:28:35: libxml/xpathInternals.h: No such file or directory
> In file included from cnx_mgr.c:33:
> misc.h:17: error: syntax error before "xmlDocPtr"
> misc.h:17: warning: no semicolon at end of struct or union
> misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
> misc.h:18: warning: data definition has no type or storage class
> misc.h:24: error: syntax error before '*' token
> misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
> misc.h:24: warning: data definition has no type or storage class
> misc.h:26: error: syntax error before "ldoc"
> misc.h:27: error: syntax error before "ldoc"
> cnx_mgr.c:41: error: syntax error before "open_doc_t"
> cnx_mgr.c:41: warning: no semicolon at end of struct or union
> cnx_mgr.c:42: warning: type defaults to `int' in declaration of `oc_ctx'
> cnx_mgr.c:42: warning: data definition has no type or storage class
> cnx_mgr.c:43: warning: type defaults to `int' in declaration of
> `open_connection_t'
> cnx_mgr.c:43: warning: data definition has no type or storage class
> cnx_mgr.c:48: error: syntax error before '*' token
> cnx_mgr.c:48: warning: type defaults to `int' in declaration of `ocs'
> cnx_mgr.c:48: warning: data definition has no type or storage class
> cnx_mgr.c: In function `_update_config':
> cnx_mgr.c:53: error: `tmp_odoc' undeclared (first use in this function)
> cnx_mgr.c:53: error: (Each undeclared identifier is reported only once
> cnx_mgr.c:53: error: for each function it appears in.)
> cnx_mgr.c:54: error: `xmlDocPtr' undeclared (first use in this function)
> cnx_mgr.c:54: error: syntax error before "tmp_doc"
> cnx_mgr.c:58: error: `tmp_doc' undeclared (first use in this function)
> cnx_mgr.c:58: warning: implicit declaration of function `xmlParseFile'
> cnx_mgr.c:67: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:68: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:90: error: request for member `od_refs' in something not a
> structure or union
> cnx_mgr.c:92: warning: implicit declaration of function `xmlFreeDoc'
> cnx_mgr.c:92: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:93: warning: implicit declaration of function `free'
> cnx_mgr.c:104: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c: In function `broadcast_for_doc':
> cnx_mgr.c:157: error: `xmlDocPtr' undeclared (first use in this function)
> cnx_mgr.c:157: error: syntax error before "tmp_doc"
> cnx_mgr.c:164: warning: implicit declaration of function `exit'
> cnx_mgr.c:164: error: `EXIT_FAILURE' undeclared (first use in this function)
> cnx_mgr.c:289: warning: implicit declaration of function `srandom'
> cnx_mgr.c:293: warning: implicit declaration of function `random'
> cnx_mgr.c:324: error: `tmp_doc' undeclared (first use in this function)
> cnx_mgr.c:324: warning: implicit declaration of function `xmlParseMemory'
> cnx_mgr.c:349: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:352: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:357: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:366: error: request for member `od_refs' in something not a
> structure or union
> cnx_mgr.c:367: error: `tmp_odoc' undeclared (first use in this function)
> cnx_mgr.c:376: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:377: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:386: error: request for member `od_refs' in something not a
> structure or union
> cnx_mgr.c:396: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:397: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:450: warning: implicit declaration of function `xmlDocDump'
> cnx_mgr.c:450: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c: In function `process_connect':
> cnx_mgr.c:488: error: syntax error before ')' token
> cnx_mgr.c:493: error: syntax error before ')' token
> cnx_mgr.c:493: error: syntax error before ')' token
> cnx_mgr.c:493: error: syntax error before ')' token
> cnx_mgr.c:493: error: syntax error before ')' token
> cnx_mgr.c:493: error: syntax error before ')' token
> cnx_mgr.c:493: error: syntax error before ')' token
> cnx_mgr.c:513: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:514: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:515: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:518: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:521: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:522: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:523: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:526: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:527: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:531: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:532: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:542: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:577: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:583: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:609: error: syntax error before ')' token
> cnx_mgr.c:617: error: request for member `od_refs' in something not a
> structure or union
> cnx_mgr.c:618: error: request for member `oc_odoc' in something not a
> structure or union
> cnx_mgr.c:619: error: request for member `oc_ctx' in something not a
> structure or union
> cnx_mgr.c:619: warning: implicit declaration of function
> `xmlXPathNewContext'
> cnx_mgr.c:619: error: request for member `oc_odoc' in something not a
> structure or union
> cnx_mgr.c:620: error: request for member `oc_ctx' in something not a
> structure or union
> cnx_mgr.c:621: error: request for member `oc_odoc' in something not a
> structure or union
> cnx_mgr.c:631: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c: In function `process_disconnect':
> cnx_mgr.c:658: error: `tmp_odoc' undeclared (first use in this function)
> cnx_mgr.c:679: error: request for member `oc_ctx' in something not a
> structure or union
> cnx_mgr.c:680: warning: implicit declaration of function
> `xmlXPathFreeContext'
> cnx_mgr.c:680: error: request for member `oc_ctx' in something not a
> structure or union
> cnx_mgr.c:682: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:683: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:685: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:686: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:688: error: request for member `oc_odoc' in something not a
> structure or union
> cnx_mgr.c:692: error: `EXIT_FAILURE' undeclared (first use in this function)
> cnx_mgr.c: In function `_process_get':
> cnx_mgr.c:734: error: `xmlXPathObjectPtr' undeclared (first use in this
> function)
> cnx_mgr.c:734: error: syntax error before "obj"
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:757: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:758: error: request for member `oc_index' in something not a
> structure or union
> cnx_mgr.c:764: error: request for member `oc_index' in something not a
> structure or union
> cnx_mgr.c:765: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:766: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:768: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:774: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:784: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:789: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:792: error: `obj' undeclared (first use in this function)
> cnx_mgr.c:792: warning: implicit declaration of function
> `xmlXPathEvalExpression'
> cnx_mgr.c:792: error: request for member `oc_ctx' in something not a
> structure or union
> cnx_mgr.c:797: error: `xmlNodePtr' undeclared (first use in this function)
> cnx_mgr.c:797: error: syntax error before "node"
> cnx_mgr.c:801: error: request for member `oc_index' in something not a
> structure or union
> cnx_mgr.c:802: error: request for member `oc_index' in something not a
> structure or union
> cnx_mgr.c:807: error: `node' undeclared (first use in this function)
> cnx_mgr.c:807: error: request for member `oc_index' in something not a
> structure or union
> cnx_mgr.c:822: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
> function)
> cnx_mgr.c:823: error: `XML_ELEMENT_NODE' undeclared (first use in this
> function)
> cnx_mgr.c:868: warning: implicit declaration of function
> `xmlXPathFreeObject'
> cnx_mgr.c: In function `process_get_list':
> cnx_mgr.c:897: error: request for member `oc_index' in something not a
> structure or union
> cnx_mgr.c: In function `process_get_state':
> cnx_mgr.c:964: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:964: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:965: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:966: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:973: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:974: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:974: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:976: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:984: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:985: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:993: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c: In function `process_set_state':
> cnx_mgr.c:1033: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:1034: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:1035: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c:1038: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:1039: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:1040: error: request for member `oc_query' in something not a
> structure or union
> cnx_mgr.c:1043: error: request for member `oc_cwp' in something not a
> structure or union
> cnx_mgr.c: In function `process_request':
> cnx_mgr.c:1160: warning: implicit declaration of function `realloc'
> cnx_mgr.c: In function `process_broadcast':
> cnx_mgr.c:1247: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:1248: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:1266: warning: implicit declaration of function
> `xmlDocDumpFormatMemory'
> cnx_mgr.c:1266: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:1267: error: `xmlChar' undeclared (first use in this function)
> cnx_mgr.c:1267: error: syntax error before ')' token
> cnx_mgr.c:1301: error: request for member `od_doc' in something not a
> structure or union
> cnx_mgr.c:1302: error: request for member `od_doc' in something not a
> structure or union
> cluster_mgr.c:24:27: libxml/parser.h: No such file or directory
> In file included from cluster_mgr.c:29:
> misc.h:17: error: syntax error before "xmlDocPtr"
> misc.h:17: warning: no semicolon at end of struct or union
> misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
> misc.h:18: warning: data definition has no type or storage class
> misc.h:24: error: syntax error before '*' token
> misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
> misc.h:24: warning: data definition has no type or storage class
> misc.h:26: error: syntax error before "ldoc"
> misc.h:27: error: syntax error before "ldoc"
> cluster_mgr.c:38: error: syntax error before "tmp_doc"
> cluster_mgr.c: In function `check_update_doc':
> cluster_mgr.c:45: error: `tmp_doc' undeclared (first use in this function)
> cluster_mgr.c:45: error: (Each undeclared identifier is reported only once
> cluster_mgr.c:45: error: for each function it appears in.)
> cluster_mgr.c:51: error: request for member `od_doc' in something not a
> structure or union
> cluster_mgr.c:52: error: request for member `od_doc' in something not a
> structure or union
> cluster_mgr.c:64: error: request for member `od_doc' in something not a
> structure or union
> cluster_mgr.c:65: error: request for member `od_doc' in something not a
> structure or union
> cluster_mgr.c:67: error: request for member `od_doc' in something not a
> structure or union
> cluster_mgr.c:67: error: request for member `od_doc' in something not a
> structure or union
> cluster_mgr.c: In function `handle_cluster_message':
> cluster_mgr.c:93: error: `xmlDocPtr' undeclared (first use in this function)
> cluster_mgr.c:93: error: syntax error before "tmp_doc"
> cluster_mgr.c:146: error: `tmp_doc' undeclared (first use in this function)
> cluster_mgr.c:146: warning: implicit declaration of function
> `xmlParseMemory'
> cluster_mgr.c:167: warning: implicit declaration of function `xmlDocDump'
> cluster_mgr.c:198: warning: implicit declaration of function `xmlParseFile'
> cluster_mgr.c:247: warning: implicit declaration of function `xmlFreeDoc'
> misc.c:25:27: libxml/parser.h: No such file or directory
> misc.c:27:27: libxml/parser.h: No such file or directory
> misc.c:28:25: libxml/tree.h: No such file or directory
> misc.c:29:26: libxml/xpath.h: No such file or directory
> misc.c:30:35: libxml/xpathInternals.h: No such file or directory
> In file included from misc.c:35:
> misc.h:17: error: syntax error before "xmlDocPtr"
> misc.h:17: warning: no semicolon at end of struct or union
> misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
> misc.h:18: warning: data definition has no type or storage class
> misc.h:24: error: syntax error before '*' token
> misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
> misc.h:24: warning: data definition has no type or storage class
> misc.h:26: error: syntax error before "ldoc"
> misc.h:27: error: syntax error before "ldoc"
> misc.c:42: error: syntax error before '*' token
> misc.c:42: warning: type defaults to `int' in declaration of `master_doc'
> misc.c:42: warning: data definition has no type or storage class
> misc.c:44: error: syntax error before "ldoc"
> misc.c: In function `get_doc_version':
> misc.c:47: error: `xmlXPathObjectPtr' undeclared (first use in this
> function)
> misc.c:47: error: (Each undeclared identifier is reported only once
> misc.c:47: error: for each function it appears in.)
> misc.c:47: error: syntax error before "obj"
> misc.c:48: error: `xmlXPathContextPtr' undeclared (first use in this
> function)
> misc.c:49: error: `xmlNodePtr' undeclared (first use in this function)
> misc.c:53: error: `ctx' undeclared (first use in this function)
> misc.c:53: warning: implicit declaration of function `xmlXPathNewContext'
> misc.c:53: error: `ldoc' undeclared (first use in this function)
> misc.c:60: error: `obj' undeclared (first use in this function)
> misc.c:60: warning: implicit declaration of function
> `xmlXPathEvalExpression'
> misc.c:67: error: `node' undeclared (first use in this function)
> misc.c:68: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
> function)
> misc.c:91: warning: implicit declaration of function `xmlXPathFreeContext'
> misc.c:94: warning: implicit declaration of function `xmlXPathFreeObject'
> misc.c: At top level:
> misc.c:109: error: syntax error before "ldoc"
> misc.c: In function `get_cluster_name':
> misc.c:112: error: `xmlXPathObjectPtr' undeclared (first use in this
> function)
> misc.c:112: error: syntax error before "obj"
> misc.c:113: error: `xmlXPathContextPtr' undeclared (first use in this
> function)
> misc.c:114: error: `xmlNodePtr' undeclared (first use in this function)
> misc.c:118: error: `ctx' undeclared (first use in this function)
> misc.c:118: error: `ldoc' undeclared (first use in this function)
> misc.c:125: error: `obj' undeclared (first use in this function)
> misc.c:132: error: `node' undeclared (first use in this function)
> misc.c:133: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
> function)
> make[2]: *** [ccsd] Error 1
> make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs/daemon'
> make[1]: *** [install] Error 2
> make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs'
> make: *** [install] Error 2
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 


From sdake at mvista.com  Mon Jul 25 18:51:42 2005
From: sdake at mvista.com (Steven Dake)
Date: Mon, 25 Jul 2005 11:51:42 -0700
Subject: [Linux-cluster] Where to go with cman ?
In-Reply-To: <20050724203850.GM8005@marowsky-bree.de>
References: <42DB63F6.5070600@redhat.com>
	<20050724203850.GM8005@marowsky-bree.de>
Message-ID: <1122317502.12824.11.camel@localhost.localdomain>

On Sun, 2005-07-24 at 22:38 +0200, Lars Marowsky-Bree wrote:
> On 2005-07-18T09:10:30, Patrick Caulfield <pcaulfie at redhat.com> wrote:
> 
> A late reply is better than no reply I guess.
> 
> FWIW, the Linux-HA project has been going through some of the same
> thoughts. 
> 
> (Not wanting to make any statements about the future of NCS here, which
> is a different group, but I know that Robert is keenly aware of
> OpenAIS/libtotem etc too.)
> 
> Linux-HA is looking to integrate with OpenAIS. The exact design is yet
> to be understood, but what is clear so far is that I'm convinced we'll
> support running on top of OpenAIS as an additional messaging layer, and
> figure out a way how to keep our Concensus Cluster Membership and
> OpenAIS's one synchronized.
> 
> We're not quite ready to abandon our existing communication layer
> completely in exchange with OpenAIS, for a variety of good reasons:
> 
> - We support some scenarios which OpenAIS doesn't; we can support raw
>   serial communication links for example, which is very useful for 2
>   node clusters (which make up 100% of our existing userbase and
>   probably even with the ability to support N nodes, will continue to be
>   in excess of 90% of all installations.)
> 
> - OpenAIS relies on the cluster network to be protected by IP/Sec (or
>   other forms of network layer security, and be it just "don't you dare
>   run this over anything but private links"). This is well for some
>   scenarios, but makes setup more complex.

I agree with everything you say except this statement.  The openais
libtotem code has symmetric encryption (sober128) and authentication
(hmac/sha1) of messages.  It is not perfect.  It is open to replay
denail of service attacks of captured messages on the local network.  If
an attacker is sniffing the local network, they may already have the
private key in which case game over.  The DOS problem can be fixed but
hasn't yet.  Another problem is that the private key is never
dynamically generated which is another thing I'd like to work on in the
future.  The security available, however, is better then un-ecrypted
packets without authentication...  If you want to read more about the
threat model assumed and algorithms used, take a look at the file
"SECURITY" in the sources of openais.

Because of the DOS possibility, I'd recommend ensuring the ports used in
openais are protected as best as possible from external untrusted
access.

regards
-steve


From sdake at mvista.com  Mon Jul 25 19:14:30 2005
From: sdake at mvista.com (Steven Dake)
Date: Mon, 25 Jul 2005 12:14:30 -0700
Subject: [Linux-cluster] Where to go with cman ?
In-Reply-To: <42DB63F6.5070600@redhat.com>
References: <42DB63F6.5070600@redhat.com>
Message-ID: <1122318870.12824.29.camel@localhost.localdomain>

On Mon, 2005-07-18 at 09:10 +0100, Patrick Caulfield wrote:
> As I see it there are two things we can do with userland cman that's current in
> the head of CVS:
> 
> 1. Leave it as it is - a port of the kernel one. This has some benefits: it's
> easy (plus a few bug fixes that need to go in), it's protocol-compatible with
> the kernel one. There are a small number of extra features that could go in
> there (that would, annoyingly, break that compatibility) but nothing really
> serious. It doesn't give us anything new, but what new is neeed ?
> 
> 2. Migrate it to something much more sophisticated. I've mentioned Virtual
> Synchrony a few times before and I've been looking into this in some detail
> since. The benefits are largely internal but they do provide a reliable, robust
> and well-performing messaging system that other cluster subsystems can use.
> While the application programmers at the cluster summit maintained they had no
> use for a cluster messaging system, I still believe that it is a useful thing to
> have at a lower level - if only for our own programming needs. I know that Jon
> looked into the existing cman messaging system before rejecting it as too slow
> and unreliable for he needs of the cluster mirroring code.
> 
> There are two suboptions here.
>   a) write it ourself. Quite a big job this. Bigger than I would like. To be
> honest I did make a start at this and now realise just what a huge job it is to
> get something that both performs well and is reliable. REALLY reliable. even
> worse if the academics want something provably reliable.
>    b) adopt something else. The obvious candidate here is the openAIS code[1].
> This looks to be quite mature now and has all the features we need of a low
> level messaging system. It's very nicely abstracted out so we can pick out just
> the bits we need without having the whole (rather heavyweight) system on top of it.
> 
> The one problem with the openAIS code is that it doesn't support IPv6, and much
> of the code is tied to IPv4. Having had a look at it and emailed Steven Dake
> about this he reckons it's about 2 weeks work to add.[2]
> 
> The advantages of doing this are several.
> - It saves time. We get something that is known to work, even though it needs
> extra features added for our own use.
> - we're not inventing something new that already exists in several other places.
> - we get more people who know the code. Currently only I know the internals of
> cman as it stands and it's quite scary code that people don't want to get
> involved with (we've have several DLM patches in the past, but no CMAN ones).
> This way we get at least 2 (Steven and me) as well as anyone else who is
> following openAIS. Of course there will be CMAN-specific stuff on top of their
> comms layer to make it quorum-based and capable of supporting GFS and DLM that

sorry my response is so late I missed this mail while at OLS.

The quorum problem is commonly referred to in the literature as a
"virtual synchrony filter".  I'd love to have some implementations of
virtual synchrony filters that exist within libtotem itself..
Definately an area of interest for openais as we need some services to
operate only in one partition (like the amf).

> will be Red Hat specific but these are not going to be large.
> - the APIs are all open (based on SAforum specifications) and already
> implemented. Although adding saCLM to CMAN is pretty easy as I proved last week.
> 

> The disadvantages are
> - Need to learn the internals of someone else's code.

indeed this part is somewhat painful :(

> - We don't have full control over the code. Although we can obviously fork it if
> we feel the need it would, obviously be preferable not to.

My view is that open source influence is dictated by level of
contribution just like any kind of community.  ie: the more a person
contributes the more influence they can exert over a project or
direction.  Even as maintainer I don't have full control over the
openais code as the community really decides where we go and what work
we do.

My point here is that if you are willing to fork, then you probably have
some time to maintain the code..  which is better spent influencing the
current openais tree :)

> - non-compatibility with "old" cman, making rolling upgrades har or even
> impossible. I'm not sure what to do about this yet, but it's worth pointing out
> that the DLM has a new line-protocol too.

yes upgrades are a real pain.  We have not fully tackled this problem in
the openais project yet, because we havn't released a stable version.
Ideally we would like two versions (older, newer) to interoperate, even
if that means uglifying the implementation to coexist with two line
types.  We have some work in place to address this problem but before
our first production release I'm planning to really think through
interoperability with new implementations for features of the totem
protocol (like redundant ring, multi ring gateway (for local area
networks), group key generation, multi-ring-bridged (for wide area
networks), etc).

> - openAIS is BSD licensed, I don't think this is a problem but it probably needs
> checking.
> 

Originally I had planned to use spread for openais, but the license was
not compatible with the lawyers "approved list".  So we had to implement
a protocol completely from scratch because of the license issue which
took about 1.5 years of work (sigh).  I wanted to be sure other projects
could reuse the totem code so chose the most liberal license I could
find.

> In short, I'm advocating adopting the openAIS core (libtotem basically) as
> CMAN's communications/membership protocol. If we're going to do a "CMAN V2" that
> has anything significant over V1 then re-inventing it is going to be a huge
> amount of work that someone else has already done.
> 
> Comments?
> 

sounds good Patrick  if you need any help from us let us know

regards
-steve

> Patrick
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster


From fgp at phlo.org  Tue Jul 26 16:37:22 2005
From: fgp at phlo.org (Florian G. Pflug)
Date: Tue, 26 Jul 2005 18:37:22 +0200
Subject: [Linux-cluster] GFS freezing
In-Reply-To: <OF5F232017.E24E8DDB-ON85257035.00651FB3-85257035.0066283C@ma.lycos.com>
References: <OF5F232017.E24E8DDB-ON85257035.00651FB3-85257035.0066283C@ma.lycos.com>
Message-ID: <42E666C2.5050303@phlo.org>

Scott.Money at lycos-inc.com wrote:
> We are seeing a similar issue. We have a 3 node gfs system that uses a 
> gnbd server as storage. We originally ran into this problem quite 
> frequently, but hard-setting our NICs to 100Mbit  full duplex has limited 
> the system freezes to "large" data transfers. (e.g. copying 500mb files 
> via scp or creating 500mb Oracle tablespaces). The good news is that the 
> fencing works ;-)
I have had gfs-deadlocks too - I mounted the gfs on two nodes,
and started an instance of bonnie++ on both nodes. Within a few
seconds both nodes seem to have deadlocked - the bonnie++ is in
state "S", and trying to access the gfs (i.e. via "ls") results
in block the accessing app. The nodes where not fenced, so I believe
that the heartbeats where still sent and received.

I'm using lock_gulm, and not lock_dlm.

> Let me know if you get any information about this.
Me too, please ;-)

greetings, FLorian Pflug
-------------- next part --------------
A non-text attachment was scrubbed...
Name: smime.p7s
Type: application/x-pkcs7-signature
Size: 4116 bytes
Desc: S/MIME Cryptographic Signature
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050726/51d1b206/attachment.bin>

From fabbione at fabbione.net  Wed Jul 27 05:36:51 2005
From: fabbione at fabbione.net (Fabio Massimo Di Nitto)
Date: Wed, 27 Jul 2005 07:36:51 +0200
Subject: [Linux-cluster] Compile errors
In-Reply-To: <42DEF1AC.8070804@prioritynetworks.net>
References: <42DEF1AC.8070804@prioritynetworks.net>
Message-ID: <42E71D73.2080707@fabbione.net>

High Mobley wrote:
> I'm in need of the CLVM software for a project that I'm working on. It
> appears that I require the full RH Cluster software suite in order to
> run CLVM, so I've been trying to follow the information that I found in
> your usage.txt file, as well as the INSTALL file that came with the cvs
> code's STABLE branch.
> 
> I was initially trying to compile against recent 2.6.x kernels, but then
> I noticed that the kernel patches in the STABLE branch were in a
> subdirectory called "2.6.9". So my latest attempts have been to compile
> against linux-2.6.9 kernel sources. I have tried both the STABLE and
> RHEL4 branches. Both have failed to compile.
> 
> I am running Debian Sarge, but I am using vanilla kernel.org sources.
> Thank you in advance for any direction that is provided!
> 
> Following is the output from 'make install' on the STABLE branch:
> 

I haven't test it myself on Debian Sarge, but you can grab the full suite
already packaged from:

http://archive.ubuntu.com/ubuntu/pool/main/r/redhat-cluster-suite/

and the GUI:

http://archive.ubuntu.com/ubuntu/pool/main/s/system-config-cluster/

Due to Build-Deps and Depends of the packages I strongly suggest you
to build the debs from the orig.tar.gz/diff.gz/dsc.

Install redhat-cluster-suite-source_1.20050721-0ubuntu1_all.deb that will
use module-assistent to build a .deb for the kernel modules.

Install redhat-cluster-suite meta package that will resolve the depends
for you.

Due to a soname change in the libs, you will need ether to rebuild (no source
changes required) lvm2 or grab it clvm from the ubuntu archive.

Cheers
Fabio

-- 
no signature file found.


From high.mobley at prioritynetworks.net  Tue Jul 26 15:07:13 2005
From: high.mobley at prioritynetworks.net (High Mobley)
Date: Tue, 26 Jul 2005 08:07:13 -0700
Subject: [Linux-cluster] Compile errors
In-Reply-To: <Pine.LNX.4.44.0507221108350.11204-100000@algiers.clic.cs.columbia.edu>
References: <Pine.LNX.4.44.0507221108350.11204-100000@algiers.clic.cs.columbia.edu>
Message-ID: <42E651A1.9000501@prioritynetworks.net>

Yes. Thank you. That fixed it for me. I thought that I had already 
installed the libxml package about a week ago, but now I realize that I 
had installed libxml2, not libxml2-dev. I have now gotten a clean 
compile of RHCS on my Debian machine. Thanks again for your help.

High Mobley


Dan B. Phung wrote:
> look like you need to install libxml:
> 
>   aptitude install libxml2-dev
> 
> here's the list of things I need to install on debian to ensure that it
> works.
> 
>   aptitude install libxml2-dev libdm0-dev libselinux1 libselinux1-dev libnet-telnet-perl
> 
> 
> On 21, Jul, 2005, High Mobley declared:
> 
> 
>>David Teigland wrote:
>>
>>>On Wed, Jul 20, 2005 at 05:51:56PM -0700, High Mobley wrote:
>>>
>>>
>>>>I'm in need of the CLVM software for a project that I'm working on. It
>>>>appears that I require the full RH Cluster software suite in order to
>>>>run CLVM, so I've been trying to follow the information that I found in
>>>>your usage.txt file, as well as the INSTALL file that came with the cvs
>>>>code's STABLE branch.
>>>>
>>>>I was initially trying to compile against recent 2.6.x kernels, but then
>>>>I noticed that the kernel patches in the STABLE branch were in a
>>>>subdirectory called "2.6.9". So my latest attempts have been to compile
>>>>against linux-2.6.9 kernel sources. I have tried both the STABLE and
>>>>RHEL4 branches. Both have failed to compile.
>>>>
>>>>I am running Debian Sarge, but I am using vanilla kernel.org sources.
>>>>Thank you in advance for any direction that is provided!
>>>
>>>
>>>Don't use cvs, use the release instead:
>>> ftp://sources.redhat.com/pub/cluster/releases/cluster-1.00.00.tar.gz
>>>
>>>This compiles against vanilla linux-2.6.12 and doesn't use kernel patches.
>>>
>>>Dave
>>>
>>
>>
>>
>>I tried compiling against 2.6.12 and 2.6.12.3 and got errors with both.
>>Since my build machine is kind of slow, I am using a very minimal kernel
>>config. I've turned off support for things like SCSI, USB, parallel
>>port, etc. Is it possible that I'm missing some needed kernel option?
>>I've included the output from 'make install' on the cluster-1.00.00
>>source below. Thanks again for the help.
>>
>>
>>
>>strife:/usr/2.6.12-cluster/cluster-1.00.00# make install
>>cd cman-kernel && make install
>>make[1]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel'
>>cd src && make install
>>make[2]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src'
>>rm -f cluster
>>ln -s . cluster
>>make -C /usr/2.6.12-cluster/linux-2.6.12
>>M=/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src modules
>>USING_KBUILD=yes
>>make[3]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cnxman.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/config.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/membership.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/proc.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_barrier.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_control.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_daemon.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_joinleave.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_membership.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_message.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_misc.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_recover.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_services.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/sm_user.o
>>   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cman.o
>>   Building modules, stage 2.
>>   MODPOST
>>   CC      /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cman.mod.o
>>   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src/cman.ko
>>make[3]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
>>install -d //lib/modules/2.6.12/kernel/cluster
>>install cman.ko //lib/modules/2.6.12/kernel/cluster
>>install -d //usr/include/cluster
>>install cnxman.h cnxman-socket.h service.h //usr/include/cluster
>>make[2]: Leaving directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel/src'
>>make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/cman-kernel'
>>cd dlm-kernel && make install
>>make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel'
>>cd src && make install
>>make[2]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src'
>>if [ ! -e cluster ]; then ln -s . cluster; fi
>>if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
>>if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
>>if [ ! -e cnxman-socket.h ]; then cp
>>//usr/include/cluster/cnxman-socket.h .; fi
>>make -C /usr/2.6.12-cluster/linux-2.6.12
>>M=/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src modules
>>USING_KBUILD=yes
>>make[3]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/ast.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/config.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/device.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dir.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lkb.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/locking.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lockqueue.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lockspace.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/lowcomms.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/main.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/memory.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/midcomms.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/nodes.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/proc.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/queries.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/rebuild.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/reccomms.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/recover.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/recoverd.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/rsb.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/util.o
>>   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.o
>>   Building modules, stage 2.
>>   MODPOST
>>*** Warning: "kcl_addref_cluster"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_get_node_by_addr"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_get_node_addresses"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_releaseref_cluster"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_get_current_interface"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_get_node_by_nodeid"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_leave_service"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_remove_callback"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_global_service_id"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_unregister_service"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_join_service"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_start_done"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_add_callback"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>*** Warning: "kcl_register_service"
>>[/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko] undefined!
>>   CC      /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.mod.o
>>   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src/dlm.ko
>>make[3]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
>>install -d //lib/modules/2.6.12/kernel/cluster
>>install dlm.ko //lib/modules/2.6.12/kernel/cluster
>>install -d //usr/include/cluster
>>install dlm.h dlm_device.h //usr/include/cluster
>>make[2]: Leaving directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel/src'
>>make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/dlm-kernel'
>>cd gfs-kernel && make install
>>make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel'
>>cd src && make install
>>make[2]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src'
>>cd harness && make install
>>make[3]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness'
>>rm -f linux
>>ln -s . linux
>>make -C /usr/2.6.12-cluster/linux-2.6.12
>>M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness
>>symverfile=/usr/2.6.12-cluster/linux-2.6.12/../kernel/cluster/dlm.symvers
>>modules USING_KBUILD=yes
>>make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/main.o
>>   LD [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/lock_harness.o
>>   Building modules, stage 2.
>>   MODPOST
>>   CC
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/lock_harness.mod.o
>>   LD [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness/lock_harness.ko
>>make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
>>/usr/2.6.12-cluster/linux-2.6.12/scripts/mod/modpost -m -i
>>/usr/2.6.12-cluster/linux-2.6.12/../kernel/cluster/dlm.symvers
>>../harness/lock_harness.o -o lock_harness.symvers
>>install -d //usr/include/linux
>>install lm_interface.h //usr/include/linux
>>install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_harness
>>install lock_harness.ko
>>//lib/modules/2.6.12/kernel/fs/gfs_locking/lock_harness
>>make[3]: Leaving directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/harness'
>>cd nolock && make install
>>make[3]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock'
>>echo
>>
>>rm -f linux lm_interface.h
>>ln -s . linux
>>ln -s ../../src/harness/lm_interface.h .
>>make -C /usr/2.6.12-cluster/linux-2.6.12
>>M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock
>>symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/../harness/lock_harness.symvers 
>>
>>modules USING_KBUILD=yes
>>make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/main.o
>>   LD [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/lock_nolock.o
>>   Building modules, stage 2.
>>   MODPOST
>>   CC
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/lock_nolock.mod.o
>>   LD [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock/lock_nolock.ko
>>make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
>>install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_nolock
>>install lock_nolock.ko
>>//lib/modules/2.6.12/kernel/fs/gfs_locking/lock_nolock
>>make[3]: Leaving directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/nolock'
>>cd dlm && make install
>>make[3]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm'
>>if [ ! -e linux ]; then ln -s . linux; fi
>>if [ ! -e cluster ]; then ln -s . cluster; fi
>>if [ ! -e lm_interface.h ]; then ln -s ../../src/harness/lm_interface.h
>>.; fi
>>if [ ! -e dlm.h ]; then cp //usr/include/cluster/dlm.h .; fi
>>if [ ! -e cnxman.h ]; then cp //usr/include/cluster/cnxman.h .; fi
>>if [ ! -e cnxman-socket.h ]; then cp
>>//usr/include/cluster/cnxman-socket.h .; fi
>>if [ ! -e service.h ]; then cp //usr/include/cluster/service.h .; fi
>>make -C /usr/2.6.12-cluster/linux-2.6.12
>>M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm
>>symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/../harness/lock_harness.symvers 
>>
>>modules USING_KBUILD=yes
>>make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/group.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/main.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/mount.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/plock.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/thread.o
>>   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.o
>>   Building modules, stage 2.
>>   MODPOST
>>*** Warning: "kcl_addref_cluster"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "kcl_get_services"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "kcl_cluster_name"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "dlm_lock"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "kcl_releaseref_cluster"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "kcl_get_members"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "dlm_query"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "kcl_leave_service"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "kcl_unregister_service"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "kcl_join_service"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "dlm_release_lockspace"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "kcl_start_done"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "dlm_unlock"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "kcl_register_service"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "dlm_new_lockspace"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>*** Warning: "dlm_debug_dump"
>>[/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko]
>>undefined!
>>   CC
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.mod.o
>>   LD [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/lock_dlm.ko
>>make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
>>/usr/2.6.12-cluster/linux-2.6.12/scripts/mod/modpost -m -i
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm/../harness/lock_harness.symvers 
>>
>>../dlm/lock_dlm.o -o lock_dlm.symvers
>>install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_dlm
>>install lock_dlm.ko //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_dlm
>>make[3]: Leaving directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/dlm'
>>cd gulm && make install
>>make[3]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm'
>>rm -f linux lm_interface.h
>>ln -s . linux
>>ln -s ../../src/harness/lm_interface.h .
>>make -C /usr/2.6.12-cluster/linux-2.6.12
>>M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm
>>symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/../harness/lock_harness.symvers 
>>
>>modules USING_KBUILD=yes
>>make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_core.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_firstlock.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_fs.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_jid.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_lock_queue.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_lt.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_main.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_plock.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/gulm_recsig.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/handler.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lg_core.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lg_lock.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lg_main.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/utils_tostr.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/xdr_base.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/xdr_io.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/xdr_socket.o
>>   LD [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lock_gulm.o
>>   Building modules, stage 2.
>>   MODPOST
>>   CC
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lock_gulm.mod.o
>>   LD [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/lock_gulm.ko
>>make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
>>/usr/2.6.12-cluster/linux-2.6.12/scripts/mod/modpost -m -i
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm/../harness/lock_harness.symvers 
>>
>>../gulm/lock_gulm.o -o lock_gulm.symvers
>>install -d //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_gulm
>>install lock_gulm.ko //lib/modules/2.6.12/kernel/fs/gfs_locking/lock_gulm
>>make[3]: Leaving directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gulm'
>>cd gfs && make install
>>make[3]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs'
>>rm -f linux lm_interface.h
>>ln -s . linux
>>ln -s ../../src/harness/lm_interface.h .
>>make -C /usr/2.6.12-cluster/linux-2.6.12
>>M=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs
>>symverfile=/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/../gulm/lock_gulm.symvers 
>>
>>modules USING_KBUILD=yes
>>make[4]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/acl.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/bits.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/bmap.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/daemon.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/diaper.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/dio.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/dir.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/eaops.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/eattr.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/file.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/glock.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/glops.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/inode.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ioctl.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/lm.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/log.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/lops.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/lvb.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/main.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/mount.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ondisk.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_address.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_dentry.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_export.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_file.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_fstype.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_inode.o
>>   CC [M]
>>/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_super.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/ops_vm.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/page.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/proc.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/quota.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/recovery.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/rgrp.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/super.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/trans.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/unlinked.o
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/util.o
>>   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/gfs.o
>>   Building modules, stage 2.
>>   MODPOST
>>   CC      /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/gfs.mod.o
>>   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs/gfs.ko
>>make[4]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
>>install -d //lib/modules/2.6.12/kernel/fs/gfs
>>install gfs.ko //lib/modules/2.6.12/kernel/fs/gfs
>>install -d //usr/include/linux
>>install gfs_ondisk.h gfs_ioctl.h //usr/include/linux
>>make[3]: Leaving directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src/gfs'
>>make[2]: Leaving directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel/src'
>>make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/gfs-kernel'
>>cd gnbd-kernel && make install
>>make[1]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel'
>>cd src && make install
>>make[2]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src'
>>rm -f linux
>>ln -s . linux
>>make -C /usr/2.6.12-cluster/linux-2.6.12
>>M=/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src modules
>>USING_KBUILD=yes
>>make[3]: Entering directory `/usr/2.6.12-cluster/linux-2.6.12'
>>   CC [M]  /usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src/gnbd.o
>>   Building modules, stage 2.
>>   MODPOST
>>   CC      /usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src/gnbd.mod.o
>>   LD [M]  /usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src/gnbd.ko
>>make[3]: Leaving directory `/usr/2.6.12-cluster/linux-2.6.12'
>>install -d //usr/include/linux
>>install gnbd.h //usr/include/linux
>>install -d //lib/modules/2.6.12/kernel/drivers/block/gnbd
>>install gnbd.ko //lib/modules/2.6.12/kernel/drivers/block/gnbd
>>make[2]: Leaving directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel/src'
>>make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/gnbd-kernel'
>>cd magma && make install
>>make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/magma'
>>cd lib && make install
>>make[2]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/lib'
>>gcc -c -o global.o global.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
>>-Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
>>gcc -c -o plugin.o plugin.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
>>-Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
>>gcc -c -o localinfo.o localinfo.c -I. -Wall -I. -g -Werror
>>-Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
>>-D_GNU_SOURCE -D_CLUSTER_
>>gcc -c -o ip_lookup.o ip_lookup.c -I. -Wall -I. -g -Werror
>>-Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
>>-D_GNU_SOURCE -D_CLUSTER_
>>gcc -c -o memberlist.o memberlist.c -I. -Wall -I. -g -Werror
>>-Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
>>-D_GNU_SOURCE -D_CLUSTER_
>>gcc -c -o clist.o clist.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
>>-Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
>>ar cr libmagma.a global.o plugin.o localinfo.o ip_lookup.o memberlist.o
>>clist.o
>>ar cr libmagma_nt.a plugin.o localinfo.o ip_lookup.o memberlist.o
>>gcc -c -o message.o message.c -I. -Wall -I. -g -Werror
>>-Wstrict-prototypes -Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\"
>>-D_GNU_SOURCE -D_CLUSTER_
>>gcc -c -o fdops.o fdops.c -I. -Wall -I. -g -Werror -Wstrict-prototypes
>>-Wshadow -fPIC -DPLUGINDIR=\"/usr/lib/magma\" -D_GNU_SOURCE -D_CLUSTER_
>>ar cr libmagmamsg.a message.o fdops.o
>>ld -shared -soname libmagma.so.1 -o libmagma.so.1.00.00 global.o
>>plugin.o localinfo.o ip_lookup.o memberlist.o clist.o -lc
>>ln -snf libmagma.so.1.00.00 libmagma.so.1
>>ld -shared -soname libmagma_nt.so.1 -o libmagma_nt.so.1.00.00 plugin.o
>>localinfo.o ip_lookup.o memberlist.o -lc
>>ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so.1
>>ld -shared -soname libmagmamsg.so.1 -o libmagmamsg.so.1.00.00 message.o
>>fdops.o -lc
>>ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so.1
>>ln -snf libmagma.so.1.00.00 libmagma.so
>>ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so
>>ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so
>>install -d /usr/lib
>>install -d /usr/lib
>>#
>># Install high level magma libraries
>>#
>>install -m 0644 libmagma.a /usr/lib
>>install -m 0644 libmagma.so.1.00.00 /usr/lib
>>cd /usr/lib; ln -snf libmagma.so.1.00.00 libmagma.so.1; cd -
>>cd /usr/lib; ln -snf libmagma.so.1.00.00 libmagma.so; cd -
>>#
>># Install low-overhead, non-pthread magma libraries
>>#
>>install -m 0644 libmagma_nt.a /usr/lib
>>install -m 0644 libmagma_nt.so.1.00.00 /usr/lib
>>cd /usr/lib; ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so.1; cd -
>>cd /usr/lib; ln -snf libmagma_nt.so.1.00.00 libmagma_nt.so; cd -
>>#
>># Install high level TCP messaging libraries
>>#
>>install -m 0644 libmagmamsg.a /usr/lib
>>install -m 0644 libmagmamsg.so.1.00.00 /usr/lib
>>cd /usr/lib; ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so.1; cd -
>>cd /usr/lib; ln -snf libmagmamsg.so.1.00.00 libmagmamsg.so; cd -
>>#
>># Install headers for magma, magma_nt, and magmamsg
>>#
>>install -d /usr/include
>>install -m 0644 magma.h /usr/include
>>install -m 0644 magmamsg.h /usr/include
>>install -m 0644 magma-build.h /usr/include
>>make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/lib'
>>cd man && make install
>>make[2]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/man'
>>install -d /usr/share/man/man3
>>install -d /usr/share/man/man8
>>install clu_connect.3 clu_disconnect.3 clu_get_event.3 /usr/share/man/man3
>>install magma_tool.8 /usr/share/man/man8
>>make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/man'
>>cd tests && make install
>>make[2]: Entering directory
>>`/usr/2.6.12-cluster/cluster-1.00.00/magma/tests'
>>gcc -c -o magma_tool.o magma_tool.c  -Wall -I../lib -DDEBUG
>>-D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
>>-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
>>gcc -o magma_tool magma_tool.o -L../lib -lmagma -lmagmamsg -ldl
>>-lpthread -ggdb
>>gcc -c -o circleping.o circleping.c  -Wall -I../lib -DDEBUG
>>-D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
>>-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
>>gcc -o circleping circleping.o -L../lib -lmagma -lmagmamsg -ldl
>>-lpthread -ggdb
>>gcc -c -o cluster_cmd.o cluster_cmd.c  -Wall -I../lib -DDEBUG
>>-D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
>>-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
>>gcc -o cluster_cmd cluster_cmd.o -L../lib -lmagma -lmagmamsg -ldl -lpthread
>>gcc -c -o cptester.o cptester.c  -Wall -I../lib -DDEBUG -D_GNU_SOURCE
>>-DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
>>-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
>>gcc -o cptester cptester.o -L../lib -lmagma_nt -ldl
>>gcc -c -o thread_test.o thread_test.c  -Wall -I../lib -DDEBUG
>>-D_GNU_SOURCE -DPLUGINDIR=\"/usr/lib/magma\" -DINCDIR=\"/usr/include\"
>>-DLIBDIR=\"/usr/lib\" -DSLIBDIR=\"/usr/lib\" -ggdb
>>gcc -o thread_test thread_test.o -L../lib -lmagma -lmagmamsg -ldl
>>-lpthread -ggdb
>>install -d /sbin
>>install -m 0755 magma_tool circleping cluster_cmd /sbin
>>make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma/tests'
>>make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/magma'
>>cd ccs && make install
>>make[1]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs'
>>cd daemon && make install
>>make[2]: Entering directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs/daemon'
>>gcc -Wall -I. -I../include -I../common -I../config -I//usr/include -Wall
>>-O2 -D_FILE_OFFSET_BITS=64 -D_GNU_SOURCE `xml2-config --cflags`
>>-DCCS_RELEASE_NAME=\"1.00.00\" -I. -I../include -I../common -I../config
>>-I//usr/include -o ccsd ccsd.c cnx_mgr.c cluster_mgr.c misc.c
>>../common/log.c globals.c -L//usr/lib `xml2-config --libs` -lpthread
>>-lmagma -lmagmamsg -ldl
>>/bin/sh: line 1: xml2-config: command not found
>>/bin/sh: line 1: xml2-config: command not found
>>ccsd.c:25:27: libxml/parser.h: No such file or directory
>>ccsd.c: In function `check_cluster_conf':
>>ccsd.c:470: error: `xmlDocPtr' undeclared (first use in this function)
>>ccsd.c:470: error: (Each undeclared identifier is reported only once
>>ccsd.c:470: error: for each function it appears in.)
>>ccsd.c:470: error: syntax error before "doc"
>>ccsd.c:473: error: `doc' undeclared (first use in this function)
>>ccsd.c:473: warning: implicit declaration of function `xmlParseFile'
>>ccsd.c:482: warning: implicit declaration of function `xmlFreeDoc'
>>cnx_mgr.c:25:27: libxml/parser.h: No such file or directory
>>cnx_mgr.c:26:25: libxml/tree.h: No such file or directory
>>cnx_mgr.c:27:26: libxml/xpath.h: No such file or directory
>>cnx_mgr.c:28:35: libxml/xpathInternals.h: No such file or directory
>>In file included from cnx_mgr.c:33:
>>misc.h:17: error: syntax error before "xmlDocPtr"
>>misc.h:17: warning: no semicolon at end of struct or union
>>misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
>>misc.h:18: warning: data definition has no type or storage class
>>misc.h:24: error: syntax error before '*' token
>>misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
>>misc.h:24: warning: data definition has no type or storage class
>>misc.h:26: error: syntax error before "ldoc"
>>misc.h:27: error: syntax error before "ldoc"
>>cnx_mgr.c:41: error: syntax error before "open_doc_t"
>>cnx_mgr.c:41: warning: no semicolon at end of struct or union
>>cnx_mgr.c:42: warning: type defaults to `int' in declaration of `oc_ctx'
>>cnx_mgr.c:42: warning: data definition has no type or storage class
>>cnx_mgr.c:43: warning: type defaults to `int' in declaration of
>>`open_connection_t'
>>cnx_mgr.c:43: warning: data definition has no type or storage class
>>cnx_mgr.c:48: error: syntax error before '*' token
>>cnx_mgr.c:48: warning: type defaults to `int' in declaration of `ocs'
>>cnx_mgr.c:48: warning: data definition has no type or storage class
>>cnx_mgr.c: In function `_update_config':
>>cnx_mgr.c:53: error: `tmp_odoc' undeclared (first use in this function)
>>cnx_mgr.c:53: error: (Each undeclared identifier is reported only once
>>cnx_mgr.c:53: error: for each function it appears in.)
>>cnx_mgr.c:54: error: `xmlDocPtr' undeclared (first use in this function)
>>cnx_mgr.c:54: error: syntax error before "tmp_doc"
>>cnx_mgr.c:58: error: `tmp_doc' undeclared (first use in this function)
>>cnx_mgr.c:58: warning: implicit declaration of function `xmlParseFile'
>>cnx_mgr.c:67: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:68: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:90: error: request for member `od_refs' in something not a
>>structure or union
>>cnx_mgr.c:92: warning: implicit declaration of function `xmlFreeDoc'
>>cnx_mgr.c:92: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:93: warning: implicit declaration of function `free'
>>cnx_mgr.c:104: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c: In function `broadcast_for_doc':
>>cnx_mgr.c:157: error: `xmlDocPtr' undeclared (first use in this function)
>>cnx_mgr.c:157: error: syntax error before "tmp_doc"
>>cnx_mgr.c:164: warning: implicit declaration of function `exit'
>>cnx_mgr.c:164: error: `EXIT_FAILURE' undeclared (first use in this function)
>>cnx_mgr.c:289: warning: implicit declaration of function `srandom'
>>cnx_mgr.c:293: warning: implicit declaration of function `random'
>>cnx_mgr.c:324: error: `tmp_doc' undeclared (first use in this function)
>>cnx_mgr.c:324: warning: implicit declaration of function `xmlParseMemory'
>>cnx_mgr.c:349: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:352: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:357: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:366: error: request for member `od_refs' in something not a
>>structure or union
>>cnx_mgr.c:367: error: `tmp_odoc' undeclared (first use in this function)
>>cnx_mgr.c:376: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:377: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:386: error: request for member `od_refs' in something not a
>>structure or union
>>cnx_mgr.c:396: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:397: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:450: warning: implicit declaration of function `xmlDocDump'
>>cnx_mgr.c:450: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c: In function `process_connect':
>>cnx_mgr.c:488: error: syntax error before ')' token
>>cnx_mgr.c:493: error: syntax error before ')' token
>>cnx_mgr.c:493: error: syntax error before ')' token
>>cnx_mgr.c:493: error: syntax error before ')' token
>>cnx_mgr.c:493: error: syntax error before ')' token
>>cnx_mgr.c:493: error: syntax error before ')' token
>>cnx_mgr.c:493: error: syntax error before ')' token
>>cnx_mgr.c:513: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:514: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:515: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:518: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:521: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:522: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:523: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:526: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:527: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:531: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:532: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:542: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:577: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:583: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:609: error: syntax error before ')' token
>>cnx_mgr.c:617: error: request for member `od_refs' in something not a
>>structure or union
>>cnx_mgr.c:618: error: request for member `oc_odoc' in something not a
>>structure or union
>>cnx_mgr.c:619: error: request for member `oc_ctx' in something not a
>>structure or union
>>cnx_mgr.c:619: warning: implicit declaration of function
>>`xmlXPathNewContext'
>>cnx_mgr.c:619: error: request for member `oc_odoc' in something not a
>>structure or union
>>cnx_mgr.c:620: error: request for member `oc_ctx' in something not a
>>structure or union
>>cnx_mgr.c:621: error: request for member `oc_odoc' in something not a
>>structure or union
>>cnx_mgr.c:631: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c: In function `process_disconnect':
>>cnx_mgr.c:658: error: `tmp_odoc' undeclared (first use in this function)
>>cnx_mgr.c:679: error: request for member `oc_ctx' in something not a
>>structure or union
>>cnx_mgr.c:680: warning: implicit declaration of function
>>`xmlXPathFreeContext'
>>cnx_mgr.c:680: error: request for member `oc_ctx' in something not a
>>structure or union
>>cnx_mgr.c:682: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:683: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:685: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:686: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:688: error: request for member `oc_odoc' in something not a
>>structure or union
>>cnx_mgr.c:692: error: `EXIT_FAILURE' undeclared (first use in this function)
>>cnx_mgr.c: In function `_process_get':
>>cnx_mgr.c:734: error: `xmlXPathObjectPtr' undeclared (first use in this
>>function)
>>cnx_mgr.c:734: error: syntax error before "obj"
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:757: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:758: error: request for member `oc_index' in something not a
>>structure or union
>>cnx_mgr.c:764: error: request for member `oc_index' in something not a
>>structure or union
>>cnx_mgr.c:765: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:766: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:768: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:774: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:784: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:789: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:792: error: `obj' undeclared (first use in this function)
>>cnx_mgr.c:792: warning: implicit declaration of function
>>`xmlXPathEvalExpression'
>>cnx_mgr.c:792: error: request for member `oc_ctx' in something not a
>>structure or union
>>cnx_mgr.c:797: error: `xmlNodePtr' undeclared (first use in this function)
>>cnx_mgr.c:797: error: syntax error before "node"
>>cnx_mgr.c:801: error: request for member `oc_index' in something not a
>>structure or union
>>cnx_mgr.c:802: error: request for member `oc_index' in something not a
>>structure or union
>>cnx_mgr.c:807: error: `node' undeclared (first use in this function)
>>cnx_mgr.c:807: error: request for member `oc_index' in something not a
>>structure or union
>>cnx_mgr.c:822: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
>>function)
>>cnx_mgr.c:823: error: `XML_ELEMENT_NODE' undeclared (first use in this
>>function)
>>cnx_mgr.c:868: warning: implicit declaration of function
>>`xmlXPathFreeObject'
>>cnx_mgr.c: In function `process_get_list':
>>cnx_mgr.c:897: error: request for member `oc_index' in something not a
>>structure or union
>>cnx_mgr.c: In function `process_get_state':
>>cnx_mgr.c:964: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:964: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:965: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:966: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:973: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:974: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:974: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:976: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:978: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:984: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:985: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:993: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c: In function `process_set_state':
>>cnx_mgr.c:1033: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:1034: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:1035: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c:1038: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:1039: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:1040: error: request for member `oc_query' in something not a
>>structure or union
>>cnx_mgr.c:1043: error: request for member `oc_cwp' in something not a
>>structure or union
>>cnx_mgr.c: In function `process_request':
>>cnx_mgr.c:1160: warning: implicit declaration of function `realloc'
>>cnx_mgr.c: In function `process_broadcast':
>>cnx_mgr.c:1247: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:1248: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:1266: warning: implicit declaration of function
>>`xmlDocDumpFormatMemory'
>>cnx_mgr.c:1266: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:1267: error: `xmlChar' undeclared (first use in this function)
>>cnx_mgr.c:1267: error: syntax error before ')' token
>>cnx_mgr.c:1301: error: request for member `od_doc' in something not a
>>structure or union
>>cnx_mgr.c:1302: error: request for member `od_doc' in something not a
>>structure or union
>>cluster_mgr.c:24:27: libxml/parser.h: No such file or directory
>>In file included from cluster_mgr.c:29:
>>misc.h:17: error: syntax error before "xmlDocPtr"
>>misc.h:17: warning: no semicolon at end of struct or union
>>misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
>>misc.h:18: warning: data definition has no type or storage class
>>misc.h:24: error: syntax error before '*' token
>>misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
>>misc.h:24: warning: data definition has no type or storage class
>>misc.h:26: error: syntax error before "ldoc"
>>misc.h:27: error: syntax error before "ldoc"
>>cluster_mgr.c:38: error: syntax error before "tmp_doc"
>>cluster_mgr.c: In function `check_update_doc':
>>cluster_mgr.c:45: error: `tmp_doc' undeclared (first use in this function)
>>cluster_mgr.c:45: error: (Each undeclared identifier is reported only once
>>cluster_mgr.c:45: error: for each function it appears in.)
>>cluster_mgr.c:51: error: request for member `od_doc' in something not a
>>structure or union
>>cluster_mgr.c:52: error: request for member `od_doc' in something not a
>>structure or union
>>cluster_mgr.c:64: error: request for member `od_doc' in something not a
>>structure or union
>>cluster_mgr.c:65: error: request for member `od_doc' in something not a
>>structure or union
>>cluster_mgr.c:67: error: request for member `od_doc' in something not a
>>structure or union
>>cluster_mgr.c:67: error: request for member `od_doc' in something not a
>>structure or union
>>cluster_mgr.c: In function `handle_cluster_message':
>>cluster_mgr.c:93: error: `xmlDocPtr' undeclared (first use in this function)
>>cluster_mgr.c:93: error: syntax error before "tmp_doc"
>>cluster_mgr.c:146: error: `tmp_doc' undeclared (first use in this function)
>>cluster_mgr.c:146: warning: implicit declaration of function
>>`xmlParseMemory'
>>cluster_mgr.c:167: warning: implicit declaration of function `xmlDocDump'
>>cluster_mgr.c:198: warning: implicit declaration of function `xmlParseFile'
>>cluster_mgr.c:247: warning: implicit declaration of function `xmlFreeDoc'
>>misc.c:25:27: libxml/parser.h: No such file or directory
>>misc.c:27:27: libxml/parser.h: No such file or directory
>>misc.c:28:25: libxml/tree.h: No such file or directory
>>misc.c:29:26: libxml/xpath.h: No such file or directory
>>misc.c:30:35: libxml/xpathInternals.h: No such file or directory
>>In file included from misc.c:35:
>>misc.h:17: error: syntax error before "xmlDocPtr"
>>misc.h:17: warning: no semicolon at end of struct or union
>>misc.h:18: warning: type defaults to `int' in declaration of `open_doc_t'
>>misc.h:18: warning: data definition has no type or storage class
>>misc.h:24: error: syntax error before '*' token
>>misc.h:24: warning: type defaults to `int' in declaration of `master_doc'
>>misc.h:24: warning: data definition has no type or storage class
>>misc.h:26: error: syntax error before "ldoc"
>>misc.h:27: error: syntax error before "ldoc"
>>misc.c:42: error: syntax error before '*' token
>>misc.c:42: warning: type defaults to `int' in declaration of `master_doc'
>>misc.c:42: warning: data definition has no type or storage class
>>misc.c:44: error: syntax error before "ldoc"
>>misc.c: In function `get_doc_version':
>>misc.c:47: error: `xmlXPathObjectPtr' undeclared (first use in this
>>function)
>>misc.c:47: error: (Each undeclared identifier is reported only once
>>misc.c:47: error: for each function it appears in.)
>>misc.c:47: error: syntax error before "obj"
>>misc.c:48: error: `xmlXPathContextPtr' undeclared (first use in this
>>function)
>>misc.c:49: error: `xmlNodePtr' undeclared (first use in this function)
>>misc.c:53: error: `ctx' undeclared (first use in this function)
>>misc.c:53: warning: implicit declaration of function `xmlXPathNewContext'
>>misc.c:53: error: `ldoc' undeclared (first use in this function)
>>misc.c:60: error: `obj' undeclared (first use in this function)
>>misc.c:60: warning: implicit declaration of function
>>`xmlXPathEvalExpression'
>>misc.c:67: error: `node' undeclared (first use in this function)
>>misc.c:68: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
>>function)
>>misc.c:91: warning: implicit declaration of function `xmlXPathFreeContext'
>>misc.c:94: warning: implicit declaration of function `xmlXPathFreeObject'
>>misc.c: At top level:
>>misc.c:109: error: syntax error before "ldoc"
>>misc.c: In function `get_cluster_name':
>>misc.c:112: error: `xmlXPathObjectPtr' undeclared (first use in this
>>function)
>>misc.c:112: error: syntax error before "obj"
>>misc.c:113: error: `xmlXPathContextPtr' undeclared (first use in this
>>function)
>>misc.c:114: error: `xmlNodePtr' undeclared (first use in this function)
>>misc.c:118: error: `ctx' undeclared (first use in this function)
>>misc.c:118: error: `ldoc' undeclared (first use in this function)
>>misc.c:125: error: `obj' undeclared (first use in this function)
>>misc.c:132: error: `node' undeclared (first use in this function)
>>misc.c:133: error: `XML_ATTRIBUTE_NODE' undeclared (first use in this
>>function)
>>make[2]: *** [ccsd] Error 1
>>make[2]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs/daemon'
>>make[1]: *** [install] Error 2
>>make[1]: Leaving directory `/usr/2.6.12-cluster/cluster-1.00.00/ccs'
>>make: *** [install] Error 2
>>
>>--
>>Linux-cluster mailing list
>>Linux-cluster at redhat.com
>>http://www.redhat.com/mailman/listinfo/linux-cluster
>>
> 
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster


From dawson at fnal.gov  Wed Jul 27 15:45:34 2005
From: dawson at fnal.gov (Troy Dawson)
Date: Wed, 27 Jul 2005 10:45:34 -0500
Subject: [Linux-cluster] RHEL4 spec file - hugemem on x86_64
Message-ID: <42E7AC1E.5030802@fnal.gov>

Hello,

I just thought I'd ask if this has already been taken care of before I 
put up a bugzilla bug.

on the rpm's that build kernel modules they require kernel-hugemem and 
kernel-hugemem-devel to build.  But for x86_64 there isn't a 
kernel-hugemem and the kernel-hugemem-devel is an i686.

To fix this you just need to put a

%if %{buildhugemem}
BuildRequires: kernel-hugemem = %{kernel_version}
BuildRequires: kernel-hugemem-devel = %{kernel_version}
%endif

up in the BuildRequires section.
This affects
dlm-kernel-2.6.9-34.0.src.rpm
GFS-kernel-2.6.9-35.5.src.rpm
gnbd-kernel-2.6.9-8.27.src.rpm
cman-kernel-2.6.9-36.0.src.rpm

One solution you can do is build them with the  --nodeps option, but you 
have to be careful to make sure you have all the other dependancies in 
before you run it with that.

Troy
-- 
__________________________________________________
Troy Dawson  dawson at fnal.gov  (630)840-6468
Fermilab  ComputingDivision/CSS  CSI Group
__________________________________________________


From cfeist at redhat.com  Wed Jul 27 17:28:37 2005
From: cfeist at redhat.com (Chris Feist)
Date: Wed, 27 Jul 2005 12:28:37 -0500
Subject: [Linux-cluster] RHEL4 spec file - hugemem on x86_64
In-Reply-To: <42E7AC1E.5030802@fnal.gov>
References: <42E7AC1E.5030802@fnal.gov>
Message-ID: <42E7C445.4040505@redhat.com>

Unfortunately our build system ignores the "%if" so we have to use the 
BuildRequires: kernel-hugemem.  There's another package that we use which 
provides kernel-hugemem on the archs that don't have a hugemem kernel.  The 
best thing to do is either remove those BuildRequires for the arch you want or 
build with --nodeps.

Thanks,
Chris

Troy Dawson wrote:
> Hello,
> 
> I just thought I'd ask if this has already been taken care of before I 
> put up a bugzilla bug.
> 
> on the rpm's that build kernel modules they require kernel-hugemem and 
> kernel-hugemem-devel to build.  But for x86_64 there isn't a 
> kernel-hugemem and the kernel-hugemem-devel is an i686.
> 
> To fix this you just need to put a
> 
> %if %{buildhugemem}
> BuildRequires: kernel-hugemem = %{kernel_version}
> BuildRequires: kernel-hugemem-devel = %{kernel_version}
> %endif
> 
> up in the BuildRequires section.
> This affects
> dlm-kernel-2.6.9-34.0.src.rpm
> GFS-kernel-2.6.9-35.5.src.rpm
> gnbd-kernel-2.6.9-8.27.src.rpm
> cman-kernel-2.6.9-36.0.src.rpm
> 
> One solution you can do is build them with the  --nodeps option, but you 
> have to be careful to make sure you have all the other dependancies in 
> before you run it with that.
> 
> Troy


From jharr at opsource.net  Wed Jul 27 17:35:15 2005
From: jharr at opsource.net (Jeff Harr)
Date: Wed, 27 Jul 2005 13:35:15 -0400
Subject: [Linux-cluster] How do nodes in a cluster authenticate each other?
Message-ID: <38A48FA2F0103444906AD22E14F1B5A364F4BF@mailxchg01.corp.opsource.net>

I asked this under a different heading earlier but nobody answered :-)
Don't mean to spam the group but I can't figure it out.  This is on a
Redhat Cluster 4.

 
Basically, I don't remember ever having to enter a password to get my
cluster working.  It seems like one box is essentially hacking the other
box (ie., every time it does a "send to cluster" command).  Anybody know
how this is happening?  My worry is that someone could theoretically
engage my cluster by knowing the cluster-id.

 
Thanks,
Jeff

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050727/59f29ac1/attachment.htm>

From oldmoonster at gmail.com  Thu Jul 28 08:50:02 2005
From: oldmoonster at gmail.com (Q.L)
Date: Thu, 28 Jul 2005 16:50:02 +0800
Subject: [Linux-cluster] fence device
Message-ID: <359782e705072801507a37d68@mail.gmail.com>

Hi, all,

I would like to setup an  environment to study GFS, in which 2 x86
box, 2 FC HBA,1 FC Switch and 1 FC raid system will be involved in,
the 2 nodes communicate through ethernet of course. However, I am not
sure what kind of FC switch is good enough to me, can I select a 8
ports FC switch freely? Do I have to use it as fence devcie? if I
don't want to do manual fence configuration.

Thanks!


From JACOB_LIBERMAN at Dell.com  Thu Jul 28 16:36:08 2005
From: JACOB_LIBERMAN at Dell.com (JACOB_LIBERMAN at Dell.com)
Date: Thu, 28 Jul 2005 11:36:08 -0500
Subject: [Linux-cluster] fence device
Message-ID: <BC430F453501174992B9D9E8AFB7519A085648@ausx3mps309.aus.amer.dell.com>

There are fencing agents for mcdata and brocade switches. Any managed
switch accessible via telnet should work, but you may have to roll your
own fencing agent based on the example scripts.

Thanks, jacob 

> -----Original Message-----
> From: linux-cluster-bounces at redhat.com 
> [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Q.L
> Sent: Thursday, July 28, 2005 3:50 AM
> To: linux-cluster at redhat.com
> Subject: [Linux-cluster] fence device
> 
> Hi, all,
> 
> I would like to setup an  environment to study GFS, in which 
> 2 x86 box, 2 FC HBA,1 FC Switch and 1 FC raid system will be 
> involved in, the 2 nodes communicate through ethernet of 
> course. However, I am not sure what kind of FC switch is good 
> enough to me, can I select a 8 ports FC switch freely? Do I 
> have to use it as fence devcie? if I don't want to do manual 
> fence configuration.
> 
> Thanks!
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 


From oldmoonster at gmail.com  Fri Jul 29 02:18:06 2005
From: oldmoonster at gmail.com (Q.L)
Date: Fri, 29 Jul 2005 10:18:06 +0800
Subject: [Linux-cluster] fence device
In-Reply-To: <BC430F453501174992B9D9E8AFB7519A085648@ausx3mps309.aus.amer.dell.com>
References: <BC430F453501174992B9D9E8AFB7519A085648@ausx3mps309.aus.amer.dell.com>
Message-ID: <359782e705072819181b5b0b0c@mail.gmail.com>

So, is there any 8 ports brocade switch that support fencing agent?
can Qlogic switch work? If I use a switch without fencing agent, can I
use a manual fencing way to get system work? In fact, what I want to
study is the Symmetric lock principle, especially at the time
concurrent write/read to the storage pool from both nodes, without
considering one point failure.

Thanks!

On 7/29/05, JACOB_LIBERMAN at dell.com <JACOB_LIBERMAN at dell.com> wrote:
> There are fencing agents for mcdata and brocade switches. Any managed
> switch accessible via telnet should work, but you may have to roll your
> own fencing agent based on the example scripts.
> 
> Thanks, jacob
> 
> > -----Original Message-----
> > From: linux-cluster-bounces at redhat.com
> > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Q.L
> > Sent: Thursday, July 28, 2005 3:50 AM
> > To: linux-cluster at redhat.com
> > Subject: [Linux-cluster] fence device
> >
> > Hi, all,
> >
> > I would like to setup an  environment to study GFS, in which
> > 2 x86 box, 2 FC HBA,1 FC Switch and 1 FC raid system will be
> > involved in, the 2 nodes communicate through ethernet of
> > course. However, I am not sure what kind of FC switch is good
> > enough to me, can I select a 8 ports FC switch freely? Do I
> > have to use it as fence devcie? if I don't want to do manual
> > fence configuration.
> >
> > Thanks!
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > http://www.redhat.com/mailman/listinfo/linux-cluster
> >
>


From jbrassow at redhat.com  Fri Jul 29 02:44:27 2005
From: jbrassow at redhat.com (Jonathan E Brassow)
Date: Thu, 28 Jul 2005 21:44:27 -0500
Subject: [Linux-cluster] fence device
In-Reply-To: <359782e705072819181b5b0b0c@mail.gmail.com>
References: <BC430F453501174992B9D9E8AFB7519A085648@ausx3mps309.aus.amer.dell.com>
	<359782e705072819181b5b0b0c@mail.gmail.com>
Message-ID: <8c389a1c6b8619e7d7716105c5aa3d37@redhat.com>


On Jul 28, 2005, at 9:18 PM, Q.L wrote:

> So, is there any 8 ports brocade switch that support fencing agent?

Pretty much all brocade switches are supported by the (included) 
fence_brocade agent.

> can Qlogic switch work?

I think so.  We have a fence_sanbox2 agent... not sure about all the 
models it supports - perhaps all, perhaps one.  Help, anyone?

>  If I use a switch without fencing agent, can I
> use a manual fencing way to get system work?

Yes, but it will require manual intervention every time you have (or 
test) a failure.

>  In fact, what I want to
> study is the Symmetric lock principle, especially at the time
> concurrent write/read to the storage pool from both nodes, without
> considering one point failure.

If you had a third machine, you could use GNBD.  The GNBD server would 
be a SPOF, but if it's just for testing, it would be fine.  GNBD would 
turn your third box into a iSCSI-like device which has built in I/O 
fencing support.

  brassow


From oldmoonster at gmail.com  Fri Jul 29 04:00:53 2005
From: oldmoonster at gmail.com (Q.L)
Date: Fri, 29 Jul 2005 12:00:53 +0800
Subject: [Linux-cluster] fence device
In-Reply-To: <8c389a1c6b8619e7d7716105c5aa3d37@redhat.com>
References: <BC430F453501174992B9D9E8AFB7519A085648@ausx3mps309.aus.amer.dell.com>
	<359782e705072819181b5b0b0c@mail.gmail.com>
	<8c389a1c6b8619e7d7716105c5aa3d37@redhat.com>
Message-ID: <359782e705072821005f4e9339@mail.gmail.com>

Jonathan,

I know GNBD is a choice, but I will not consider it in my study environment.

Thank for you very much.

Michael

On 7/29/05, Jonathan E Brassow <jbrassow at redhat.com> wrote:
> 
> On Jul 28, 2005, at 9:18 PM, Q.L wrote:
> 
> > So, is there any 8 ports brocade switch that support fencing agent?
> 
> Pretty much all brocade switches are supported by the (included)
> fence_brocade agent.
> 
> > can Qlogic switch work?
> 
> I think so.  We have a fence_sanbox2 agent... not sure about all the
> models it supports - perhaps all, perhaps one.  Help, anyone?
> 
> >  If I use a switch without fencing agent, can I
> > use a manual fencing way to get system work?
> 
> Yes, but it will require manual intervention every time you have (or
> test) a failure.
> 
> >  In fact, what I want to
> > study is the Symmetric lock principle, especially at the time
> > concurrent write/read to the storage pool from both nodes, without
> > considering one point failure.
> 
> If you had a third machine, you could use GNBD.  The GNBD server would
> be a SPOF, but if it's just for testing, it would be fine.  GNBD would
> turn your third box into a iSCSI-like device which has built in I/O
> fencing support.
> 
>  brassow
> 
>


From lhh at redhat.com  Fri Jul 29 13:59:09 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 29 Jul 2005 09:59:09 -0400
Subject: [Linux-cluster] fence device
In-Reply-To: <359782e705072801507a37d68@mail.gmail.com>
References: <359782e705072801507a37d68@mail.gmail.com>
Message-ID: <1122645549.3972.20.camel@ayanami.boston.redhat.com>

On Thu, 2005-07-28 at 16:50 +0800, Q.L wrote:
> Do I have to use it as fence devcie? 

Yeah -- you should use it as a fence device.

> if I
> don't want to do manual fence configuration.

You don't.  Trust me.

-- Lon


From keith at clearpathit.com  Fri Jul 29 15:10:29 2005
From: keith at clearpathit.com (Keith Grammer)
Date: Fri, 29 Jul 2005 10:10:29 -0500
Subject: [Linux-cluster] job opening in Houston
Message-ID: <0MKz5u-1DyWVC3qLE-0007lF@mrelay.perfora.net>

I am looking for a Linux Cluster Specialist for a contract position. Please
respond with a resume in Word.

Thank You,

Keith

 
Keith Grammer

Partner

ClearPath IT LLC

713-344-0232

keith at clearpathit.com

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050729/09e8a2b2/attachment.htm>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: image001.gif
Type: image/gif
Size: 3720 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050729/09e8a2b2/attachment.gif>

From jacobl at ccbill.com  Fri Jul 29 20:18:04 2005
From: jacobl at ccbill.com (Jacob Liff)
Date: Fri, 29 Jul 2005 13:18:04 -0700
Subject: [Linux-cluster] Compile Issues
Message-ID: <889A47B16278164FB657E0FFB1CAB8C7CB0C3C@hq-exchange.ccbill-hq.local>

Howdy,

 
I have gone through the list before buging you guys but haven't found
the answer. A few days ago someone else was having issues and someone
recommended getting the source from:

 
ftp://sources.redhat.com/pub/cluster/releases/cluster-1.00.00.tar.gz

 
Instead of the CSV and to compile it against vanilla 2.6.12. I followed
those instructions and everything appeared to compile fine except when I
modprobe gfs I get this fun error:

 
ATAL: Error inserting gfs (/lib/modules/2.6.12/kernel/fs/gfs/gfs.ko):
Unknown symbol in module, or unknown parameter (see dmesg)

 
dmesg output: 

 
gfs: Unknown symbol posix_acl_from_xattr

gfs: Unknown symbol posix_acl_valid

gfs: Unknown symbol posix_acl_permission

gfs: Unknown symbol posix_acl_equiv_mode

gfs: Unknown symbol posix_acl_chmod_masq

gfs: Unknown symbol posix_acl_to_xattr

gfs: Unknown symbol posix_acl_create_masq

gfs: Unknown symbol posix_acl_clone

 
Looking at the source its linking against the correct headers for the
functions.. and the functions do exist in the headers.

 
Maybe I have not compiled someone into the kernel that I needed in order
to make this work? Could I be trying to use this against the wrong
kernel(Vanilla 2.6.12
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.3.tar.bz2 )? I have
been trying to either compile modules or patch against most of the newer
kernels for the last two days with no luck. Any help would be greatly
appreciated.

 
Jacob L.

 
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20050729/d35ff78f/attachment.htm>

From eric at bootseg.com  Fri Jul 29 20:42:30 2005
From: eric at bootseg.com (Eric Kerin)
Date: Fri, 29 Jul 2005 16:42:30 -0400
Subject: [Linux-cluster] Compile Issues
In-Reply-To: <889A47B16278164FB657E0FFB1CAB8C7CB0C3C@hq-exchange.ccbill-hq.local>
References: <889A47B16278164FB657E0FFB1CAB8C7CB0C3C@hq-exchange.ccbill-hq.local>
Message-ID: <1122669750.3392.8.camel@auh5-0479.corp.jabil.org>

On Fri, 2005-07-29 at 13:18 -0700, Jacob Liff wrote:
> Howdy,
> 
> ATAL: Error inserting gfs (/lib/modules/2.6.12/kernel/fs/gfs/gfs.ko):
> Unknown symbol in module, or unknown parameter (see dmesg)
> 
> dmesg output: 
> 
> gfs: Unknown symbol posix_acl_from_xattr
> gfs: Unknown symbol posix_acl_valid
> gfs: Unknown symbol posix_acl_permission
> gfs: Unknown symbol posix_acl_equiv_mode
> gfs: Unknown symbol posix_acl_chmod_masq
> gfs: Unknown symbol posix_acl_to_xattr
> gfs: Unknown symbol posix_acl_create_masq
> gfs: Unknown symbol posix_acl_clone
> 
> Looking at the source its linking against the correct headers for the
> functions.. and the functions do exist in the headers. 
> 
> Maybe I have not compiled someone into the kernel that I needed in
> order to make this work? Could I be trying to use this against the
> wrong kernel(Vanilla 2.6.12
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.3.tar.bz2 )? I
> have been trying to either compile modules or patch against most of
> the newer kernels for the last two days with no luck. Any help would
> be greatly appreciated.
> 
Most likely, you don't have CONFIG_FS_POSIX_ACL turned on in your
kernel.  It looks like you'll have to enable ACL support for ext2, ext3,
jfs, or reiser to be able to turn it on though.

Thanks,
Eric

> 


From jacobl at ccbill.com  Fri Jul 29 21:40:44 2005
From: jacobl at ccbill.com (Jacob Liff)
Date: Fri, 29 Jul 2005 14:40:44 -0700
Subject: [Linux-cluster] Compile Issues
Message-ID: <889A47B16278164FB657E0FFB1CAB8C7CB0C93@hq-exchange.ccbill-hq.local>

That's exactly what it was.. now that I feel like hiding under my desk..
I'll see what else I can break on here :)

Thanks very much for the help.

Jacob L.

-----Original Message-----
From: Eric Kerin [mailto:eric at bootseg.com] 
Sent: Friday, July 29, 2005 1:43 PM
To: linux clustering; Jacob Liff
Subject: Re: [Linux-cluster] Compile Issues

On Fri, 2005-07-29 at 13:18 -0700, Jacob Liff wrote:
> Howdy,
> 
> ATAL: Error inserting gfs (/lib/modules/2.6.12/kernel/fs/gfs/gfs.ko):
> Unknown symbol in module, or unknown parameter (see dmesg)
> 
> dmesg output: 
> 
> gfs: Unknown symbol posix_acl_from_xattr
> gfs: Unknown symbol posix_acl_valid
> gfs: Unknown symbol posix_acl_permission
> gfs: Unknown symbol posix_acl_equiv_mode
> gfs: Unknown symbol posix_acl_chmod_masq
> gfs: Unknown symbol posix_acl_to_xattr
> gfs: Unknown symbol posix_acl_create_masq
> gfs: Unknown symbol posix_acl_clone
> 
> Looking at the source its linking against the correct headers for the
> functions.. and the functions do exist in the headers. 
> 
> Maybe I have not compiled someone into the kernel that I needed in
> order to make this work? Could I be trying to use this against the
> wrong kernel(Vanilla 2.6.12
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.12.3.tar.bz2 )? I
> have been trying to either compile modules or patch against most of
> the newer kernels for the last two days with no luck. Any help would
> be greatly appreciated.
> 
Most likely, you don't have CONFIG_FS_POSIX_ACL turned on in your
kernel.  It looks like you'll have to enable ACL support for ext2, ext3,
jfs, or reiser to be able to turn it on though.

Thanks,
Eric

> 


From klrajasekar at gmail.com  Sat Jul 30 06:58:47 2005
From: klrajasekar at gmail.com (KL Raja Sekar)
Date: Sat, 30 Jul 2005 12:28:47 +0530
Subject: [Linux-cluster] Error when loading the lock_dlm module
Message-ID: <8293054050729235847b66f37@mail.gmail.com>

Hi,

I am trying to use GFS 6.1 with lock_dlm module.  But when i try to
load the lock_dlm module i am getting fatal error.  Herewith attached
the fatal error msg and the dmesg errors.

I am using HP Proliant DL360 server with XP 1024 storage.   please let
me know if anybody has the solution for the above.

regards
shekar

******************************************************************************
root at uranus1 ~]# modprobe lock_dlm
FATAL: Error inserting lock_dlm
(/lib/modules/2.6.9-11.ELsmp/kernel/fs/gfs_locking/lock_dlm/lock_dlm.ko):
Unknown symbol in module, or unknown parameter (see dmesg)
******************************************************************************
dmesg output -------------------------------
lock_dlm: Unknown symbol dlm_debug_dump
lock_dlm: Unknown symbol dlm_new_lockspace
lock_dlm: Unknown symbol kcl_register_service
lock_dlm: Unknown symbol dlm_unlock
lock_dlm: Unknown symbol kcl_start_done
lock_dlm: Unknown symbol dlm_release_lockspace
lock_dlm: Unknown symbol kcl_join_service
lock_dlm: Unknown symbol kcl_unregister_service
lock_dlm: Unknown symbol kcl_leave_service
lock_dlm: Unknown symbol dlm_query
lock_dlm: Unknown symbol kcl_get_members
lock_dlm: Unknown symbol kcl_releaseref_cluster
lock_dlm: Unknown symbol dlm_lock
lock_dlm: Unknown symbol kcl_cluster_name
lock_dlm: Unknown symbol kcl_get_services
lock_dlm: Unknown symbol kcl_addref_cluster
******************************************************************************


From yfttyfs at gmail.com  Sun Jul 31 04:15:53 2005
From: yfttyfs at gmail.com (y f)
Date: Sun, 31 Jul 2005 12:15:53 +0800
Subject: [Linux-cluster] job opening in Houston
In-Reply-To: <0MKz5u-1DyWVC3qLE-0007lF@mrelay.perfora.net>
References: <0MKz5u-1DyWVC3qLE-0007lF@mrelay.perfora.net>
Message-ID: <78fcc84a050730211531f4a197@mail.gmail.com>

Hi, Keith,

Can the work be done remotely ?

On 7/29/05, Keith Grammer <keith at clearpathit.com> wrote:
>  
>  
> 
> I am looking for a Linux Cluster Specialist for a contract position. Please
> respond with a resume in Word. 
> 
> Thank You, 
> 
> Keith 
> 
>   
> 
>  
> 
>   
> 
>   
> 
> Keith Grammer 
> 
> Partner 
> 
> ClearPath IT LLC 
> 
> 713-344-0232 
> 
> keith at clearpathit.com 
> 
>   
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 
>


From debug at MIT.EDU  Sun Jul 31 03:52:58 2005
From: debug at MIT.EDU (Cluster 2005)
Date: Sat, 30 Jul 2005 23:52:58 -0400
Subject: [Linux-cluster] IEEE Cluster 2005 Online Registration now open
Message-ID: <5.2.1.1.2.20050730234326.042a7a20@hesiod>


see <http://www.cluster2005.org/>http://www.cluster2005.org for info.