From teigland at redhat.com Mon Nov 1 05:10:25 2004 From: teigland at redhat.com (David Teigland) Date: Mon, 1 Nov 2004 13:10:25 +0800 Subject: [Linux-cluster] GFS: more simple performance numbers In-Reply-To: <1099086032.2180.44.camel@ibm-c.pdx.osdl.net> References: <200410191305.54330.danderso@redhat.com> <20041021120601.GC19478@redhat.com> <1099086032.2180.44.camel@ibm-c.pdx.osdl.net> Message-ID: <20041101051025.GA10637@redhat.com> On Fri, Oct 29, 2004 at 02:40:33PM -0700, Daniel McNeil wrote: > Can you explain what gfs does with the callback? Does it drop the locks > for all gfs inodes or just the ones that are not actively being used? > Does gfs still cache the inode without holding the dlm lock? It releases all its unused (cached) locks, which includes releasing the corresponding cached data. > How much memory does a dlm lock take? 10,000 seems very small > for machines today. > > Would it make sense to limit the number of gfs inodes which, > in turn, would limit the number of dlm_locks? > > It seems to me the number of inodes and number of dlm locks > should scale together. GFS is already quite good at limiting the caching it does without this drop-locks mechanism. The drop-locks callback is really an emergency button that shouldn't be used regularly, as it disrupts all the intelligent things gfs is doing. Since lock_dlm really has no idea when the dlm is setting up for a potential recovery memory shortage, it can be argued that by default we should ignore the callback -- that's what we did until recently. So, this callback, designed many years ago for very different (centralized) lock managers, doesn't work too well for the dlm. As I mentioned, there are other, better ways we can solve dlm recovery limitations. Until then, we'll hope to find some sensible defaults and guidelines on when/how to configure /proc/cluster/lock_dlm/drop_count. -- Dave Teigland From sunjw at onewaveinc.com Mon Nov 1 07:50:23 2004 From: sunjw at onewaveinc.com (Luckey Sun) Date: Mon, 1 Nov 2004 15:50:23 +0800 Subject: [Linux-cluster] Re: "gfs_mkfs -p LockProtoName" problem Message-ID: >> Hello, all. >> How can I use the LockProtoName values lock_gulm and lock_nolock? >> When I use lock_gulm to make a gfs filesystem, I cannot mount the filesystem. >> With lock_nolock, the gfs can be mounted, but on write the kernel panicked. >> Is it true that I can only use lock_dlm in GFS 6.1pre2 + kernel 2.6.8.1? > >What error messages are being printed to the syslog and console on mount? >(`dmesg` is your friend!) Is lock_gulmd running? Did you name the >filesystem with the correct -t option? Oh, yes. The module "lock_gulm" and the daemon "lock_gulmd" are both absent - that explains it. So which lock module works better on GFS 6.1 + kernel 2.6 in terms of performance and stability, lock_dlm or lock_gulm? > >When you are using lock_nolock to mount the filesystem, are you doing it >from more than one node? lock_nolock is ok for a single node, but it >doesn't do any locking, so if you mount it from a second node you will >corrupt your filesystem and see kernel panics. Best regards!
Luckey Sun From pcaulfie at redhat.com Mon Nov 1 08:49:56 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Mon, 1 Nov 2004 08:49:56 +0000 Subject: [Linux-cluster] cluster with mixed versions In-Reply-To: <1099065771.2180.30.camel@ibm-c.pdx.osdl.net> References: <1099008755.2180.23.camel@ibm-c.pdx.osdl.net> <20041029071904.GD29009@tykepenguin.com> <1099065771.2180.30.camel@ibm-c.pdx.osdl.net> Message-ID: <20041101084955.GD26418@tykepenguin.com> On Fri, Oct 29, 2004 at 09:02:51AM -0700, Daniel McNeil wrote: > Patrick, > > Thanks for the info. Once things stabilize, I assume supporting > rolling upgrades is a requirement, right? Yes, within minor version numbers of the cluster software. Provided the same clustering software loads into the kernel, the actual kernel version shouldn't matter. > What needed feature required changing the protocol? > Is the protocol documented anywhere? It was for users of the messaging feature of cman to be able to find out which "port" a message had been sent from (it previously assumed you could only send and receive to the same port number). > When changes like this are checked in, it would be good to > announce it on the mailing list. Yes, sorry, that should have happened. It was "announced" on the commit mailing list, but I appreciate that not everyone reads that. -- patrick From pcaulfie at redhat.com Mon Nov 1 08:52:28 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Mon, 1 Nov 2004 08:52:28 +0000 Subject: [Linux-cluster] clvmd question In-Reply-To: <5.1.0.14.2.20041029154002.03de8700@pop.ncsa.uiuc.edu> References: <5.1.0.14.2.20041029154002.03de8700@pop.ncsa.uiuc.edu> Message-ID: <20041101085228.GE26418@tykepenguin.com> On Fri, Oct 29, 2004 at 03:41:59PM -0500, Qian Liu wrote: > Hi, all > > When I tried to start clvmd with the command "clvmd" it returned the following > message: > clvmd could not connect to cluster > > Any clue about this? Thanks in advance! > None at all - AFAICT there is no such message in clvmd! If you got "Can't open cluster socket" then it means that the cman kernel module was not loaded and started. -- patrick From lhh at redhat.com Mon Nov 1 16:37:18 2004 From: lhh at redhat.com (Lon Hohberger) Date: Mon, 01 Nov 2004 11:37:18 -0500 Subject: [Linux-cluster] IMAP server clustering ... In-Reply-To: <41818859.6040208@utilitran.com> References: <41814530.6020605@utilitran.com> <1099001200.18913.64.camel@atlantis.boston.redhat.com> <41818859.6040208@utilitran.com> Message-ID: <1099327038.18913.94.camel@atlantis.boston.redhat.com> On Thu, 2004-10-28 at 18:01 -0600, Michael Gale wrote: > But would you even need the GFS file system then ? Could each box just > be accessing a reiserfs via the FC ? and let the application take care > of the "locking" ? No, because metadata creation would not be synchronized. You'd end up with a corrupt file system tree. -- Lon From linux-cluster at spam.dragonhold.org Mon Nov 1 16:16:49 2004 From: linux-cluster at spam.dragonhold.org (linux-cluster at spam.dragonhold.org) Date: Mon, 1 Nov 2004 16:16:49 +0000 Subject: [Linux-cluster] IMAP server clustering ...
In-Reply-To: <1099327038.18913.94.camel@atlantis.boston.redhat.com> References: <41814530.6020605@utilitran.com> <1099001200.18913.64.camel@atlantis.boston.redhat.com> <41818859.6040208@utilitran.com> <1099327038.18913.94.camel@atlantis.boston.redhat.com> Message-ID: <20041101161649.GE3873@dragonhold.org> On Mon, Nov 01, 2004 at 11:37:18AM -0500, Lon Hohberger wrote: > On Thu, 2004-10-28 at 18:01 -0600, Michael Gale wrote: > > But would you even need the GFS file system then ? Could each box just > > be accessing a reiserfs via the FC ? and let the application take care > > of the "locking" ? > > No, because metadata creation would not be synchronized. You'd end up > with a corrupt file system tree. It's not just the creation, it's the caching too, surely? The problem is that each system would end up with a different set of cached data (data & metadata), more than anything. If there was a way to get it to invalidate the cached data as required, and also serialise access so that 2 machines don't try to access the same bit of the filesystem at the same time, then it would work. And at that point, you've got GFS, AFAICT. Graham From michael.gale at utilitran.com Mon Nov 1 17:25:03 2004 From: michael.gale at utilitran.com (Michael Gale) Date: Mon, 01 Nov 2004 10:25:03 -0700 Subject: [Linux-cluster] IMAP server clustering ... In-Reply-To: <20041101161649.GE3873@dragonhold.org> References: <41814530.6020605@utilitran.com> <1099001200.18913.64.camel@atlantis.boston.redhat.com> <41818859.6040208@utilitran.com> <1099327038.18913.94.camel@atlantis.boston.redhat.com> <20041101161649.GE3873@dragonhold.org> Message-ID: <4186716F.2010308@utilitran.com> I was reading up on Courier IMAP which use dot-locking with support NFS mounted maildirs. So would that application not take care of the locking ? Michael. linux-cluster at spam.dragonhold.org wrote: > On Mon, Nov 01, 2004 at 11:37:18AM -0500, Lon Hohberger wrote: > >>On Thu, 2004-10-28 at 18:01 -0600, Michael Gale wrote: >> >>>But would you even need the GFS file system then ? Could each box just >>>be accessing a reiserfs via the FC ? and let the application take care >>>of the "locking" ? >> >>No, because metadata creation would not be synchronized. You'd end up >>with a corrupt file system tree. > > > It's not just the creation, it's the caching too, surely? The problem is that each system > would end up with a different set of cached data (data & metadata), more than anything. > > If there was a way to get it to invalidate the cached data as required, and also serialise > access so that 2 machines don't try to access the same bit of the filesystem at the same > time, then it would work. > > And at that point, you've got GFS, AFAICT. > > Graham > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- Michael Gale Lan Administrator Utilitran Corp. We Pledge Allegiance to the Penguin From linux-cluster at spam.dragonhold.org Mon Nov 1 16:55:21 2004 From: linux-cluster at spam.dragonhold.org (linux-cluster at spam.dragonhold.org) Date: Mon, 1 Nov 2004 16:55:21 +0000 Subject: [Linux-cluster] IMAP server clustering ... 
In-Reply-To: <4186716F.2010308@utilitran.com> References: <41814530.6020605@utilitran.com> <1099001200.18913.64.camel@atlantis.boston.redhat.com> <41818859.6040208@utilitran.com> <1099327038.18913.94.camel@atlantis.boston.redhat.com> <20041101161649.GE3873@dragonhold.org> <4186716F.2010308@utilitran.com> Message-ID: <20041101165521.GF3873@dragonhold.org> On Mon, Nov 01, 2004 at 10:25:03AM -0700, Michael Gale wrote: > > I was reading up on Courier IMAP which use dot-locking with support NFS > mounted maildirs. > > So would that application not take care of the locking ? No, it's operating at the wrong level. (starting from nothing cached) Think about it this way - you create a new file on the disk (say the lock file). The other machine then tries to access the directory. It scans down from the root of the partition (successfully, since nothing has changed), and gets to the directory. This finds the lockfile. So far so good. Now the 1st machine deletes the lockfile. However, the 2nd machine still has this cached as locked - and therefore doesn't notice. -- Other example. Both machines read the directory (and it's not locked). Next machine 1 locks it. Even if reiserfs writes this lock back to disk (which it will eventually), the 1st machine doesn't know, since it still has a cached version of the directory which shows that the file doesn't exist. Now both can lock (successfully as far as they are concerned). -- Final example. Machines 1&2 both have the lock. One deletes a file, and updates the disk. The 2nd adds a file, and then updates the directory with it's version (which still has the first file in it). This means you've got the file pointing to the blocks where it exists, but the blocks have been freed. If you do that with another directory (create a new IMAP folder) rather than file, and it gets even worse - the machine that didn't create it won't know that those inodes are a directory, so will happily then write a file over it. -- If this doesn't make sense (quite possible, I've not worked through the examples properly), just work it through on paper. Remember that the machines have no reason to doubt their cached copy of the data, and they will cache as much as possible. Go through what could happen from a starting point of the disk & caches agreeing, remembering that not only is read data cached, data is not written back out immediately. Graham From michael.gale at utilitran.com Mon Nov 1 18:26:07 2004 From: michael.gale at utilitran.com (Michael Gale) Date: Mon, 01 Nov 2004 11:26:07 -0700 Subject: [Linux-cluster] IMAP server clustering ... In-Reply-To: <20041101165521.GF3873@dragonhold.org> References: <41814530.6020605@utilitran.com> <1099001200.18913.64.camel@atlantis.boston.redhat.com> <41818859.6040208@utilitran.com> <1099327038.18913.94.camel@atlantis.boston.redhat.com> <20041101161649.GE3873@dragonhold.org> <4186716F.2010308@utilitran.com> <20041101165521.GF3873@dragonhold.org> Message-ID: <41867FBF.3060702@utilitran.com> That makes more sense ... thanks for the info and in helping me avoid corrupted data. Michael. linux-cluster at spam.dragonhold.org wrote: > On Mon, Nov 01, 2004 at 10:25:03AM -0700, Michael Gale wrote: > >>I was reading up on Courier IMAP which use dot-locking with support NFS >>mounted maildirs. >> >>So would that application not take care of the locking ? > > > No, it's operating at the wrong level. > > (starting from nothing cached) > > Think about it this way - you create a new file on the disk (say the lock file). 
The other > machine then tries to access the directory. It scans down from the root of the partition > (successfully, since nothing has changed), and gets to the directory. This finds the > lockfile. > > So far so good. > > Now the 1st machine deletes the lockfile. However, the 2nd machine still has this cached as > locked - and therefore doesn't notice. > > > -- > > Other example. > > Both machines read the directory (and it's not locked). Next machine 1 locks it. Even if > reiserfs writes this lock back to disk (which it will eventually), the 1st machine doesn't > know, since it still has a cached version of the directory which shows that the file doesn't > exist. Now both can lock (successfully as far as they are concerned). > > > -- > > Final example. > > Machines 1&2 both have the lock. One deletes a file, and updates the disk. The 2nd adds a > file, and then updates the directory with it's version (which still has the first file in > it). > > This means you've got the file pointing to the blocks where it exists, but the blocks have > been freed. > > If you do that with another directory (create a new IMAP folder) rather than file, and it > gets even worse - the machine that didn't create it won't know that those inodes are a > directory, so will happily then write a file over it. > > -- > > If this doesn't make sense (quite possible, I've not worked through the examples properly), > just work it through on paper. Remember that the machines have no reason to doubt their > cached copy of the data, and they will cache as much as possible. > > Go through what could happen from a starting point of the disk & caches agreeing, > remembering that not only is read data cached, data is not written back out immediately. > > > Graham > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- Michael Gale Lan Administrator Utilitran Corp. We Pledge Allegiance to the Penguin From linux-cluster at spam.dragonhold.org Mon Nov 1 18:59:41 2004 From: linux-cluster at spam.dragonhold.org (linux-cluster at spam.dragonhold.org) Date: Mon, 1 Nov 2004 18:59:41 +0000 Subject: [Linux-cluster] IMAP server clustering ... In-Reply-To: <41867FBF.3060702@utilitran.com> References: <41814530.6020605@utilitran.com> <1099001200.18913.64.camel@atlantis.boston.redhat.com> <41818859.6040208@utilitran.com> <1099327038.18913.94.camel@atlantis.boston.redhat.com> <20041101161649.GE3873@dragonhold.org> <4186716F.2010308@utilitran.com> <20041101165521.GF3873@dragonhold.org> <41867FBF.3060702@utilitran.com> Message-ID: <20041101185941.GG3873@dragonhold.org> On Mon, Nov 01, 2004 at 11:26:07AM -0700, Michael Gale wrote: > > That makes more sense ... thanks for the info and in helping me avoid > corrupted data. > > Michael. > No worries - I have the best possible reason for remembering, that of personal experience. I managed to miss that a partition was mounted on a solaris box (through veritas), and therefore mounted it again manually (raw partition) somewhere else - the behaviour was "interesting" until it got so bad I rebooted. I ended up having to restore from the backup, which is why I remember it so vividly - we had never tested the backups until then. You've got me thinking tho. Depending on how much of the caching is done in the filesystem layer, and how much at the block device layer, it /might/ be possible to create a cluster aware block device, and then use a normal FS on top of it. 
Anyone around know enough to tell me how possible/not that would be? If it was possible, it should be possible to implement things like mirroring & snapshots at that level more easily than trying to do them higher up the chain. However, that may just be total baloney, since I've not really thought it through - one of those "I wonder" ideas that I don't know enough about at the moment to investigate. Graham From david.zafman at hp.com Mon Nov 1 19:56:53 2004 From: david.zafman at hp.com (David B Zafman) Date: Mon, 1 Nov 2004 11:56:53 -0800 Subject: [Linux-cluster] IMAP server clustering ... In-Reply-To: <20041101185941.GG3873@dragonhold.org> References: <41814530.6020605@utilitran.com> <1099001200.18913.64.camel@atlantis.boston.redhat.com> <41818859.6040208@utilitran.com> <1099327038.18913.94.camel@atlantis.boston.redhat.com> <20041101161649.GE3873@dragonhold.org> <4186716F.2010308@utilitran.com> <20041101165521.GF3873@dragonhold.org> <41867FBF.3060702@utilitran.com> <20041101185941.GG3873@dragonhold.org> Message-ID: <2D541E3C-2C40-11D9-91C4-000393C9E706@hp.com> See www.drbd.org What is DRBD: DRBD is a block device which is designed to build high availability clusters. This is done by mirroring a whole block device via (a dedicated) network. You could see it as a network raid-1. On Nov 1, 2004, at 10:59 AM, linux-cluster at spam.dragonhold.org wrote: > On Mon, Nov 01, 2004 at 11:26:07AM -0700, Michael Gale wrote: >> >> That makes more sense ... thanks for the info and in helping me avoid >> corrupted data. >> >> Michael. >> > > No worries - I have the best possible reason for remembering, that of > personal experience. > > I managed to miss that a partition was mounted on a solaris box > (through veritas), and > therefore mounted it again manually (raw partition) somewhere else - > the behaviour was > "interesting" until it got so bad I rebooted. > > I ended up having to restore from the backup, which is why I remember > it so vividly - we had > never tested the backups until then. > > > You've got me thinking tho. Depending on how much of the caching is > done in the filesystem > layer, and how much at the block device layer, it /might/ be possible > to create a cluster > aware block device, and then use a normal FS on top of it. > > Anyone around know enough to tell me how possible/not that would be? > > If it was possible, it should be possible to implement things like > mirroring & snapshots at > that level more easily than trying to do them higher up the chain. > > However, that may just be total baloney, since I've not really thought > it through - one of > those "I wonder" ideas that I don't know enough about at the moment to > investigate. > > Graham > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > > David B. Zafman | Hewlett-Packard Company mailto:david.zafman at hp.com | http://www.hp.com "Computer Science" is no more about computers than astronomy is about telescopes - E. W.
Dijkstra From chekov at ucla.edu Mon Nov 1 20:30:47 2004 From: chekov at ucla.edu (Alan Wood) Date: Mon, 1 Nov 2004 12:30:47 -0800 (PST) Subject: [Linux-cluster] samba on top of GFS In-Reply-To: <20041101170055.464B07313C@hormel.redhat.com> References: <20041101170055.464B07313C@hormel.redhat.com> Message-ID: I am running a cluster with GFS-formatted file systems mounted on multiple nodes. What I was hoping to do was to set up one node running httpd to be my webserver and another node running samba to share the same data internally. What I am getting when running that is instability. The samba serving node keeps crashing. I have heartbeat set up so that failover happens to the webserver node, at which point the system apparently behaves well. After reading a few articles on the list it seemed to me that the problem might be samba using oplocks or some other caching mechanism that breaks synchronization. I tried turning oplocks=off in my smb.conf file, but that made the system unusably slow (over 3 minutes to right-click on a two-meg file). I am also not sure that is the extent of the problem, as I seem to be able to re-create the crash simply by accessing the same file on multiple clients just via samba (which locking should be able to handle). If the problem were merely that the remote node and the samba node were both accessing an oplocked file I could understand, but that doesn't always seem to be the case. has anyone had any success running the same type of setup? I am also serving nfs on the samba server, though with very little load there. below is the syslog output of a crash. I'm running 2.6.8-1.521smp with a GFS CVS dump from mid-september. -alan Code: 8b 03 0f 18 00 90 3b 5c 24 04 75 97 8b 04 24 5b 5e 5b 5e 5f <1>Unable to handle kernel paging request at virtual address 00100100 printing eip: f2ef1e8d *pde = 00003001 Oops: 0000 [#3] SMP Modules linked in: udf nfsd exportfs lock_dlm(U) dlm(U) cman(U) gfs(U) lock_harness(U) nfs lockd sunrpc tg3 floppy sg microcode joydev dm_mod ohci_hcd ext3 jbd aacraid megaraid sd_mod scsi_mod CPU: 0 EIP: 0060:[] Not tainted EFLAGS: 00010246 (2.6.8-1.521smp) EIP is at query_lkb_queue+0x85/0x9b [dlm] eax: ccf485d8 ebx: 00100100 ecx: 00000000 edx: 00000100 esi: 13012e48 edi: 00000000 ebp: 00000130 esp: 13012dc4 ds: 007b es: 007b ss: 0068 Process smbd (pid: 13049, threadinfo=13012000 task=7617b1f0) Stack: 00000000 4543aad0 00000130 950670d8 13012e48 3644d458 f2ef209e 13012e48 00000000 00000000 f2ef133d 34326633 68478400 950670d8 00000137 000000d0 ef239980 dea26800 13012e48 00000380 f2b79169 13012e48 f2b7905d be437380 Call Trace: [] query_locks+0x6f/0xad [dlm] [] dlm_query+0x155/0x238 [dlm] [] get_conflict_global+0x104/0x2ae [lock_dlm] [] query_ast+0x0/0x8 [lock_dlm] [<0227c989>] release_sock+0xa5/0xab [] lm_dlm_plock_get+0xcb/0x10f [lock_dlm] [] do_plock+0xc2/0x171 [gfs] [] gfs_lock+0x44/0x52 [gfs] [] gfs_lock+0x0/0x52 [gfs] [<02170571>] fcntl_getlk64+0x75/0x12e [<02170841>] fcntl_setlk64+0x217/0x221 [<0216c7e0>] sys_fcntl64+0x4d/0x7b From crh at ubiqx.mn.org Mon Nov 1 20:53:35 2004 From: crh at ubiqx.mn.org (Christopher R. Hertel) Date: Mon, 1 Nov 2004 14:53:35 -0600 Subject: [Linux-cluster] samba on top of GFS In-Reply-To: References: <20041101170055.464B07313C@hormel.redhat.com> Message-ID: <20041101205335.GE5409@Favog.ubiqx.mn.org> On Mon, Nov 01, 2004 at 12:30:47PM -0800, Alan Wood wrote: > I am running a cluster with GFS-formatted file systems mounted on multiple > nodes. 
What I was hoping to do was to set up one node running httpd to be > my webserver and another node running samba to share the same data > internally. > What I am getting when running that is instability. Yeah. This is a known problem. The reason is that Samba must maintain a great deal of metadata internally. This works well enough with multiple Samba processes running on a single machine dealing (more or less) directly with the filesystem. The problem is that Samba must keep track of translations between Posix and Windows metadata, locking semantics, file sharing mode semantics, etc. I had assumed that this would only be a problem if Samba was running on multiple machines all GFS-sharing the same back-end block storage. Your report suggests that there's more to the interaction between Samba and GFS than I had anticipated. Interesting... > The samba serving node > keeps crashing. I have heartbeat set up so that failover happens to the > webserver node, at which point the system apparently behaves well. Which kind of failover? Do you start Samba on the webserver node? It would be interesting to know if the two run well together on the same node, but fail on separate nodes. > After reading a few articles on the list it seemed to me that the problem > might be samba using oplocks or some other caching mechanism that breaks > synchronization. Yeah... that was my next question... > I tried turning oplocks=off in my smb.conf file, but that > made the system unusably slow (over 3 minutes to right-click on a two-meg > file). Curious. ...but did it fix the other problems? I'd really love to work with someone to figure all this out. (Hint hint.) :) > I am also not sure that is the extent of the problem, as I seem to be able > to re-create the crash simply by accessing the same file on multiple > clients just via samba (which locking should be able to handle). Should be... > If the > problem were merely that the remote node and the samba node were both > accessing an oplocked file I could understand, but that doesn't always seem > to be the case. There's more here than I can figure out just from the description. It'd take some digging along-side someone who knows GFS. > has anyone had any success running the same type of setup? I am also > serving nfs on the samba server, though with very little load there. Is there any overlap in the files they're serving? > below is the syslog output of a crash. I'm running 2.6.8-1.521smp with a > GFS CVS dump from mid-september. > -alan Wish I could be more help... Chris -)----- -- "Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X Samba Team -- http://www.samba.org/ -)----- Christopher R. Hertel jCIFS Team -- http://jcifs.samba.org/ -)----- ubiqx development, uninq. 
ubiqx Team -- http://www.ubiqx.org/ -)----- crh at ubiqx.mn.org OnLineBook -- http://ubiqx.org/cifs/ -)----- crh at ubiqx.org From teigland at redhat.com Tue Nov 2 03:46:45 2004 From: teigland at redhat.com (David Teigland) Date: Tue, 2 Nov 2004 11:46:45 +0800 Subject: [Linux-cluster] samba on top of GFS In-Reply-To: References: <20041101170055.464B07313C@hormel.redhat.com> Message-ID: <20041102034645.GA13255@redhat.com> On Mon, Nov 01, 2004 at 12:30:47PM -0800, Alan Wood wrote: > Call Trace: > [] query_locks+0x6f/0xad [dlm] > [] dlm_query+0x155/0x238 [dlm] > [] get_conflict_global+0x104/0x2ae [lock_dlm] > [] query_ast+0x0/0x8 [lock_dlm] > [<0227c989>] release_sock+0xa5/0xab > [] lm_dlm_plock_get+0xcb/0x10f [lock_dlm] > [] do_plock+0xc2/0x171 [gfs] > [] gfs_lock+0x44/0x52 [gfs] > [] gfs_lock+0x0/0x52 [gfs] > [<02170571>] fcntl_getlk64+0x75/0x12e > [<02170841>] fcntl_setlk64+0x217/0x221 > [<0216c7e0>] sys_fcntl64+0x4d/0x7b This is a kernel oops in the dlm which is always wrong regardless of what you try to do. There have been a lot of changes recently in the area of plocks (where this oopsed) so you should update from cvs, retry and let us know if there's still a problem. -- Dave Teigland From sunjw at onewaveinc.com Tue Nov 2 05:30:55 2004 From: sunjw at onewaveinc.com (Luckey Sun) Date: Tue, 2 Nov 2004 13:30:55 +0800 Subject: [Linux-cluster] directio problem Message-ID: Hi, all When GFS was at version 6.0, its documentation said that GFS supported directio. Now, with 6.1pre2 or 6.1pre3 on Linux kernel 2.6.8.1 or 2.6.9, is it still supported? I've tried some tests and the results were negative. My test program has the following lines:
#define _GNU_SOURCE 1 //for the tag O_DIRECT.
fd=open(argv[1], O_RDONLY|O_DIRECT);
if(fd<0) { perror("open"); return -1; }
readed=read(fd, buf, block_size);
if(readed<=0) { perror("read"); break; }
close(fd);
The result is: 1. fd=3, valid; 2. read returns "Invalid argument". I've run the same program on ext3 and gfs filesystems; the result is the same. So what's the problem - does kernel 2.6 not support it, or is it a kernel config problem? Another problem: Do I need to (or must) update GFS from 6.1pre2 with 2.6.8.1 to 6.1pre3 with 2.6.9? I see the cvs log "Lock_dlm and lock_gulm are broken ..." Thanks for any reply! Best regards! Luckey Sun From pcaulfie at redhat.com Tue Nov 2 11:15:12 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Tue, 2 Nov 2004 11:15:12 +0000 Subject: [Linux-cluster] samba on top of GFS In-Reply-To: References: <20041101170055.464B07313C@hormel.redhat.com> Message-ID: <20041102111512.GI26656@tykepenguin.com> There looked to have been some missing locking in the query routines that could cause that oops. I've put a fix in CVS you might like to try. -- patrick From anton at hq.310.ru Tue Nov 2 11:36:02 2004 From: anton at hq.310.ru (Anton Nekhoroshikh) Date: Tue, 2 Nov 2004 14:36:02 +0300 Subject: [Linux-cluster] compile error on linux 2.6.9 Message-ID: <1768131126.20041102143602@hq.310.ru> Hi All ! I got the latest version of GFS from cvs and built it. I don't understand what the problem is - lm_interface.h looks right, but it does not compile.
CC [M] fs/gfs/ops_file.o fs/gfs/ops_file.c: In function `gfs_open': fs/gfs/ops_file.c:1286: warning: use of cast expressions as lvalues is deprecated fs/gfs/ops_file.c:1322: warning: use of cast expressions as lvalues is deprecated fs/gfs/ops_file.c: In function `gfs_close': fs/gfs/ops_file.c:1344: warning: use of cast expressions as lvalues is deprecated fs/gfs/ops_file.c: In function `do_plock': fs/gfs/ops_file.c:1411: warning: passing arg 3 of pointer to function makes integer from pointer wit hout a cast fs/gfs/ops_file.c:1411: warning: passing arg 4 of pointer to function from incompatible pointer type fs/gfs/ops_file.c:1411: error: too few arguments to function fs/gfs/ops_file.c:1416: warning: passing arg 3 of pointer to function makes integer from pointer wit hout a cast fs/gfs/ops_file.c:1416: warning: passing arg 4 of pointer to function makes integer from pointer wit hout a cast fs/gfs/ops_file.c:1416: error: too few arguments to function fs/gfs/ops_file.c:1421: warning: passing arg 3 of pointer to function makes integer from pointer wit hout a cast fs/gfs/ops_file.c:1421: warning: passing arg 5 of pointer to function makes integer from pointer wit hout a cast fs/gfs/ops_file.c:1421: error: too few arguments to function -- Anton Nekhoroshikh e-mail: anton at hq.310.ru http://www.310.ru From anton at hq.310.ru Tue Nov 2 11:41:07 2004 From: anton at hq.310.ru (Anton Nekhoroshikh) Date: Tue, 2 Nov 2004 14:41:07 +0300 Subject: [Linux-cluster] compile error on linux 2.6.9 In-Reply-To: <1768131126.20041102143602@hq.310.ru> References: <1768131126.20041102143602@hq.310.ru> Message-ID: <426529420.20041102144107@hq.310.ru> Hi Anton, Tuesday, November 2, 2004, 2:36:02 PM, you wrote: Oh sorry, lm_interface.h has not been updated in include/linux -- Anton Nekhoroshikh e-mail: anton at hq.310.ru http://www.310.ru From linux-cluster at spam.dragonhold.org Tue Nov 2 13:07:36 2004 From: linux-cluster at spam.dragonhold.org (linux-cluster at spam.dragonhold.org) Date: Tue, 2 Nov 2004 13:07:36 +0000 Subject: [Linux-cluster] Compile problems with latest CVS Message-ID: <20041102130736.GB6326@dragonhold.org> I'm almost positive that I'm inept - since this is much more likely to be my fault than anything else, however I've gone through the process multiple times, and can't seem to see what I'm doing wrong. 1. Download & untar 2.6.9 kernel 2. login to cvs & checkout the cluster stuff 3. for i in ../cluster/*kernel/patches/2.6.9/*patch; do patch -p1 <$i; done Which means ../cluster/cman-kernel/patches/2.6.9/00001.patch ../cluster/cman-kernel/patches/2.6.9/cman.patch ../cluster/dlm-kernel/patches/2.6.9/00001.patch ../cluster/dlm-kernel/patches/2.6.9/dlm.patch ../cluster/gfs-kernel/patches/2.6.9/00001.patch ../cluster/gfs-kernel/patches/2.6.9/00002.patch ../cluster/gfs-kernel/patches/2.6.9/00003.patch ../cluster/gfs-kernel/patches/2.6.9/00004.patch ../cluster/gfs-kernel/patches/2.6.9/00005.patch ../cluster/gfs-kernel/patches/2.6.9/gfs.patch ../cluster/gfs-kernel/patches/2.6.9/lock_dlm.patch ../cluster/gfs-kernel/patches/2.6.9/lock_gulm.patch ../cluster/gfs-kernel/patches/2.6.9/lock_harness.patch ../cluster/gfs-kernel/patches/2.6.9/lock_nolock.patch ../cluster/gnbd-kernel/patches/2.6.9/00001.patch ../cluster/gnbd-kernel/patches/2.6.9/gnbd.patch 4. make menuconfig, and include all the kernel stuff as modules. 5. (use make-kpkg to make a debian package of it) 6. in cluster do "./configure --kernel_src=/usr/src/linux-2.6.9" 7. run "make all". 
And this is where it fails with: CC [M] /usr/src/cluster/cman-kernel/src/cnxman.o /usr/src/cluster/cman-kernel/src/cnxman.c: In function `do_ioctl_get_cluster': /usr/src/cluster/cman-kernel/src/cnxman.c:1331: error: dereferencing pointer to incomplete type /usr/src/cluster/cman-kernel/src/cnxman.c:1334: error: dereferencing pointer to incomplete type /usr/src/cluster/cman-kernel/src/cnxman.c: In function `cl_ioctl': /usr/src/cluster/cman-kernel/src/cnxman.c:1801: error: `SIOCCLUSTER_GETCLUSTER' undeclared (first use in this function) /usr/src/cluster/cman-kernel/src/cnxman.c:1801: error: (Each undeclared identifier is reported only once /usr/src/cluster/cman-kernel/src/cnxman.c:1801: error: for each function it appears in.) make[4]: *** [/usr/src/cluster/cman-kernel/src/cnxman.o] Error 1 I've googled a few times, but can't find anything relevant, and don't remember seeing it on this list either. Can someone put me out of my misery, and tell me what I'm doing wrong? Ta, Graham PS: Any update on mirroring? :) From anton at hq.310.ru Tue Nov 2 14:05:11 2004 From: anton at hq.310.ru (Anton Nekhoroshikh) Date: Tue, 2 Nov 2004 17:05:11 +0300 Subject: [Linux-cluster] differed source codes Message-ID: <1437816791.20041102170511@hq.310.ru> Hi all, Different in cluster/dlm-kernel/src and cluster/dlm-kernel/patches/2.6.9/dlm.patch, it would be necessary to correct the initial codes. -- e-mail: anton at hq.310.ru http://www.310.ru From gwood at dragonhold.org Mon Nov 1 21:11:04 2004 From: gwood at dragonhold.org (gwood at dragonhold.org) Date: Mon, 1 Nov 2004 21:11:04 +0000 Subject: [Linux-cluster] IMAP server clustering ... In-Reply-To: <2D541E3C-2C40-11D9-91C4-000393C9E706@hp.com> References: <41814530.6020605@utilitran.com> <1099001200.18913.64.camel@atlantis.boston.redhat.com> <41818859.6040208@utilitran.com> <1099327038.18913.94.camel@atlantis.boston.redhat.com> <20041101161649.GE3873@dragonhold.org> <4186716F.2010308@utilitran.com> <20041101165521.GF3873@dragonhold.org> <41867FBF.3060702@utilitran.com> <20041101185941.GG3873@dragonhold.org> <2D541E3C-2C40-11D9-91C4-000393C9E706@hp.com> Message-ID: <20041101211104.GA6326@dragonhold.org> On Mon, Nov 01, 2004 at 11:56:53AM -0800, David B Zafman wrote: > > See www.drbd.org It's not really what I was talking about. The 2nd node in that only has a copy of the data - it can't mount or access that volume in any way. From daniel at osdl.org Tue Nov 2 16:30:24 2004 From: daniel at osdl.org (Daniel McNeil) Date: Tue, 02 Nov 2004 08:30:24 -0800 Subject: [Linux-cluster] directio problem In-Reply-To: References: Message-ID: <1099413024.11420.22.camel@ibm-c.pdx.osdl.net> On Mon, 2004-11-01 at 21:30, ??? wrote: > Hi,all > When the GFS's version was 6.0?its documents said that GFS supported directio? > But now is 6.1pre2 or 6.1pre3 with Linux kernel 2.6.8.1 or 2.6.9, does it support too? > I've tried some tests, the result showed negative. My test program has the lines as follow: > > #define _GNU_SOURCE 1 //for the tag O_DIRECT. > fd=open(argv[1], O_RDONLY|O_DIRECT); > if(fd<0) { > perror("open"); > return -1; > } > readed=read(fd, buf, block_size); > if(readed<=0) { > perror("read"); > break; > } > close(fd); > > the result is "1. fd=3,valid. 2.read return: Invalid argument" > I've run the same program on ext3 and gfs filesystem, the result is the same. > > So what's the problem, the kernel 2.6 does not support, or some kernel config problem? > > Another problem: > Do I need to (or must) update GFS from 6.1pre2 with 2.6.8.1 to 6.1pre3 with 2.6.9? 
> I see the cvs log "Lock_dlm and lock_gulm are broken ..." > > Thanks for any reply! Best regards! > Direct IO requires the buffer address be aligned and the size of the i/o needs to be a multiple of 512. This worked for me on GFS (and ext3). #define _GNU_SOURCE 1 //for the tag O_DIRECT. #include #include main(int argc, char **argv) { int fd; int readed; char *buf; int block_size = 4096; buf = memalign(512, 4096); fd=open(argv[1], O_RDONLY|O_DIRECT); if(fd<0) { perror("open"); return -1; } readed=read(fd, buf, block_size); if(readed<=0) { perror("read"); } close(fd); } From mtilstra at redhat.com Tue Nov 2 16:56:33 2004 From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra) Date: Tue, 2 Nov 2004 10:56:33 -0600 Subject: [Linux-cluster] Compile problems with latest CVS In-Reply-To: <20041102130736.GB6326@dragonhold.org> References: <20041102130736.GB6326@dragonhold.org> Message-ID: <20041102165633.GA28515@redhat.com> On Tue, Nov 02, 2004 at 01:07:36PM +0000, linux-cluster at spam.dragonhold.org wrote: > I'm almost positive that I'm inept - since this is much more likely to be my fault than > anything else, however I've gone through the process multiple times, and can't seem to see > what I'm doing wrong. > > 1. Download & untar 2.6.9 kernel > 2. login to cvs & checkout the cluster stuff > 3. for i in ../cluster/*kernel/patches/2.6.9/*patch; do patch -p1 <$i; done > Which means you don't need to use the patches unless you want to try compiling static instead of modules. You are a lot better off compiling the modules from the *-kernel directories. (configure --kernel_src=...;make) I cannot speak for the cman patch, but I know I've been a bit lazy keeping the gulm patch uptodate with the code in the gfs-kernel/src/gulm (others might have been better.) So try building the module that way. If you still really want to use a kernel patch, peek inside the makefile. There are rules for making the patch if you put the kernel source in the right place. -- Michael Conrad Tadpol Tilstra COBOL is the anti-code -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From linux-cluster at spam.dragonhold.org Tue Nov 2 23:55:28 2004 From: linux-cluster at spam.dragonhold.org (Graham Wood) Date: Tue, 2 Nov 2004 23:55:28 +0000 Subject: [Linux-cluster] Compile problems with latest CVS In-Reply-To: <20041102165633.GA28515@redhat.com> References: <20041102130736.GB6326@dragonhold.org> <20041102165633.GA28515@redhat.com> Message-ID: <20041102235528.GA2923@dragonhold.org> On Tue, Nov 02, 2004 at 10:56:33AM -0600, Michael Conrad Tadpol Tilstra wrote: > you don't need to use the patches unless you want to try compiling > static instead of modules. You are a lot better off compiling the > modules from the *-kernel directories. (configure --kernel_src=...;make) Ahh, thanks. From earlier references to locking, and my somewhat hazy memory of a kernel patch for flock or something in 2.6.8.1, I thought I had to apply them anyway. I've got it all compiled up and indeed it does compile (although I've commented out the reference to mallocdbg.h in rgmanager, since I don't have that on my machine). However, some of the modules refuse to load. lock_dlm refuses to load because dlm refuses to load (and spouts a lot of dlm_ symbols as a result). dlm in turn has refused to load because "Unknown symbol rcom_log_clear". A grep of /proc/kallsyms shows that there's nothing with 'rcom' let alone 'rcom_log'. 
A google suggests that rcom_log_clear used to be part of reccomms.c (and it is in my patched 2.6.8.1 kernel tree), but although it's still in the reccomms.h from CVS, it's not in reccomms.c (I've just done an update to make sure). The make for dlm.ko whines about a load of other undefined symbols, kcl_, but they are provided by cman - so I assume there's just something missing in the MODPOST so that it doesn't find them. I /think/ that this isn't my fault/problem, for a change, but am not totally confident of my analysis. lock_gulm fails with "Cannot allocate memory" - but I think I used dlm for the gfs partition I created under 2.6.8.1, and haven't tried to use gulm until now. Hopefully you can point out what I've missed, and/or use this as useful feedback. Ta, Graham From markus.wiedmer at stud.fhbb.ch Wed Nov 3 09:42:04 2004 From: markus.wiedmer at stud.fhbb.ch (Markus Wiedmer - FHBB) Date: Wed, 3 Nov 2004 10:42:04 +0100 Subject: [Linux-cluster] high availability with GFS Message-ID: <1365124106.20041103104204@stud.fhbb.ch> hi all, We are students of the University of applied Sciences in Basel. As a project we are trying to realize a High-Availability Fileserver on Linux. We want to use GFS for our Storage but we are having problems in making it redundant. We are running 2 Samba-Servers that achieve failover through Heartbeat. Ideally, both servers should access the external storage through GFS. We thought we could use the pool_tool or clvm for this, but AFAIK both don't offer any redundancy, right? Is there any way to make GFS-Nodes (preferably through GNBD) redundant, so that a failure of a single node wouldn't affect the whole storage? Of course we could employ RAID 1 or 5 on the nodes themselves but that wouldn't save us in case the whole node fails. Does anyone have any experience with this. Thanks in advance -markus -- Mit freundlichen Gr?ssen Markus Wiedmer - FHBB From michael.gale at utilitran.com Wed Nov 3 15:00:02 2004 From: michael.gale at utilitran.com (Michael Gale) Date: Wed, 03 Nov 2004 08:00:02 -0700 Subject: [Linux-cluster] high availability with GFS In-Reply-To: <1365124106.20041103104204@stud.fhbb.ch> References: <1365124106.20041103104204@stud.fhbb.ch> Message-ID: <4188F272.8020909@utilitran.com> Hello, From your e-mail I am really not sure what your intended goal is ? Do you have a shared storage device you want to make accessible through what ever Samba server is master ? Check the list archives ... there is a issue with Samba and GFS, something about how Samba caches file metadata. Not sure if it affects you or not. Michael. Markus Wiedmer - FHBB wrote: > hi all, > > We are students of the University of applied Sciences in Basel. As a > project we are trying to realize a High-Availability Fileserver on > Linux. We want to use GFS for our Storage but we are having problems > in making it redundant. > > We are running 2 Samba-Servers that achieve failover through > Heartbeat. Ideally, both servers should access the external storage through > GFS. We thought we could use the pool_tool or clvm for this, but AFAIK > both don't offer any redundancy, right? > > Is there any way to make GFS-Nodes (preferably through GNBD) > redundant, so that a failure of a single node wouldn't affect the > whole storage? Of course we could employ RAID 1 or 5 on the nodes > themselves but that wouldn't save us in case the whole node fails. > > Does anyone have any experience with this. Thanks in advance > > -markus > -- Michael Gale Lan Administrator Utilitran Corp. 
We Pledge Allegiance to the Penguin From mtilstra at redhat.com Wed Nov 3 16:15:18 2004 From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra) Date: Wed, 3 Nov 2004 10:15:18 -0600 Subject: [Linux-cluster] Compile problems with latest CVS In-Reply-To: <20041102235528.GA2923@dragonhold.org> References: <20041102130736.GB6326@dragonhold.org> <20041102165633.GA28515@redhat.com> <20041102235528.GA2923@dragonhold.org> Message-ID: <20041103161518.GA32238@redhat.com> On Tue, Nov 02, 2004 at 11:55:28PM +0000, Graham Wood wrote: > > lock_gulm fails with "Cannot allocate memory" - but I think I used dlm > for the gfs partition I created under 2.6.8.1, and haven't tried to > use gulm until now. I should have just fixed that one yesterday. I was trying to stick too much into a kmalloc. try a cvs up and see if that fixes lock_gulm. -- Michael Conrad Tadpol Tilstra Today, I am the bug. From daniel at osdl.org Wed Nov 3 21:58:22 2004 From: daniel at osdl.org (Daniel McNeil) Date: Wed, 03 Nov 2004 13:58:22 -0800 Subject: [Linux-cluster] gfs on 2.6.9 : umount gives sleeping function called from invalid context Message-ID: <1099519102.11420.31.camel@ibm-c.pdx.osdl.net> My 3 node cluster is running on 2.6.9 and the GFS cvs from oct 27th. When I umount the gfs file system I get: dlm: closing connection to node 1 Debug: sleeping function called from invalid context at include/linux/rwsem.h:43 in_atomic():1, irqs_disabled():0 [] dump_stack+0x1e/0x30 [] __might_sleep+0xb7/0xf0 [] nodeid2con+0x25/0x1e0 [dlm] [] lowcomms_close+0x42/0x70 [dlm] [] put_node+0x2c/0x70 [dlm] [] release_csb+0x17/0x30 [dlm] [] nodes_clear+0x33/0x40 [dlm] [] ls_nodes_clear+0x17/0x30 [dlm] [] release_lockspace+0x1fd/0x2f0 [dlm] [] release_gdlm+0x1c/0x30 [lock_dlm] [] lm_dlm_unmount+0x24/0x50 [lock_dlm] [] lm_unmount+0x46/0xac [lock_harness] [] gfs_put_super+0x30f/0x3c0 [gfs] [] generic_shutdown_super+0x18a/0x1a0 [] kill_block_super+0x1d/0x40 [] deactivate_super+0x81/0xa0 [] sys_umount+0x3c/0xa0 [] sys_oldumount+0x19/0x20 [] sysenter_past_esp+0x52/0x71 Daniel From daniel at osdl.org Wed Nov 3 22:12:49 2004 From: daniel at osdl.org (Daniel McNeil) Date: Wed, 03 Nov 2004 14:12:49 -0800 Subject: [Linux-cluster] gfs on 2.6.9 : umount gives sleeping function called from invalid context In-Reply-To: <1099519102.11420.31.camel@ibm-c.pdx.osdl.net> References: <1099519102.11420.31.camel@ibm-c.pdx.osdl.net> Message-ID: <1099519969.11420.34.camel@ibm-c.pdx.osdl.net> On Wed, 2004-11-03 at 13:58, Daniel McNeil wrote: > My 3 node cluster is running on 2.6.9 and the GFS cvs from > oct 27th.
> > When I umount the gfs file system I get: > > dlm: closing connection to node 1 > Debug: sleeping function called from invalid context at include/linux/rwsem.h:43in_atomic():1, irqs_disabled():0 > [] dump_stack+0x1e/0x30 > [] __might_sleep+0xb7/0xf0 > [] nodeid2con+0x25/0x1e0 [dlm] > [] lowcomms_close+0x42/0x70 [dlm] > [] put_node+0x2c/0x70 [dlm] > [] release_csb+0x17/0x30 [dlm] > [] nodes_clear+0x33/0x40 [dlm] > [] ls_nodes_clear+0x17/0x30 [dlm] > [] release_lockspace+0x1fd/0x2f0 [dlm] > [] release_gdlm+0x1c/0x30 [lock_dlm] > [] lm_dlm_unmount+0x24/0x50 [lock_dlm] > [] lm_unmount+0x46/0xac [lock_harness] > [] gfs_put_super+0x30f/0x3c0 [gfs] > [] generic_shutdown_super+0x18a/0x1a0 > [] kill_block_super+0x1d/0x40 > [] deactivate_super+0x81/0xa0 > [] sys_umount+0x3c/0xa0 > [] sys_oldumount+0x19/0x20 > [] sysenter_past_esp+0x52/0x71 > > > Daniel There is a similar stack trace on the node that still has the GFS file system mounted: dlm: closing connection to node 1 Debug: sleeping function called from invalid context at include/linux/rwsem.h:43in_atomic():1, irqs_disabled():0 [] dump_stack+0x1e/0x30 [] __might_sleep+0xb7/0xf0 [] nodeid2con+0x25/0x1e0 [dlm] [] lowcomms_close+0x42/0x70 [dlm] [] put_node+0x2c/0x70 [dlm] [] release_csb+0x17/0x30 [dlm] [] clear_finished_nodes+0x54/0x60 [dlm] [] do_ls_recovery+0x25e/0x4a0 [dlm] [] dlm_recoverd+0x6c/0x100 [dlm] [] kthread+0xba/0xc0 [] kernel_thread_helper+0x5/0x10 dlm: stripefs: process held requests dlm: got connection from 1 dlm: stripefs: processed 0 requests dlm: stripefs: resend marked requests dlm: stripefs: resent 0 requests dlm: stripefs: recover event 19 finished dlm: connecting to 1 From markus.wiedmer at stud.fhbb.ch Thu Nov 4 07:07:24 2004 From: markus.wiedmer at stud.fhbb.ch (Markus Wiedmer - FHBB) Date: Thu, 4 Nov 2004 08:07:24 +0100 Subject: [Linux-cluster] high availability with GFS In-Reply-To: <4188F272.8020909@utilitran.com> References: <1365124106.20041103104204@stud.fhbb.ch> <4188F272.8020909@utilitran.com> Message-ID: <722743464.20041104080724@stud.fhbb.ch> oops, forgot to send this to the mailing-list, so here it is again... Hi Michael, Thanks for your reply. What you said is basically what we want to achieve. Our shared storage device consists atm of 5 nodes with one physical disk each. We want all of these 5 nodes to be visible as a single logical volume. We think we can achieve this with GNBD and LVM by exporting the disks as GNBD-Devices on each node. We'll then import them on the master-server and pool them together with LVM or pool_tool. that way the secondary-server should be able to access it as well. but as I said earlier without any redundancy. If one node fails, we are grounded. We need to get some redundancy in this or we have to look for another solution like DRBD, but there scalability is a real problem. I read the Samba/GFS-Thread but we're not that far yet. We'll see about this once (if ;) ) we get that far with GFS. Thanks again for your help Markus MG> Hello, MG> From your e-mail I am really not sure what your intended goal is ? Do MG> you have a shared storage device you want to make accessible through MG> what ever Samba server is master ? MG> Check the list archives ... there is a issue with Samba and GFS, MG> something about how Samba caches file metadata. Not sure if it affects MG> you or not. MG> Michael. MG> Markus Wiedmer - FHBB wrote: >> hi all, >> >> We are students of the University of applied Sciences in Basel. As a >> project we are trying to realize a High-Availability Fileserver on >> Linux. 
We want to use GFS for our Storage but we are having problems >> in making it redundant. >> >> We are running 2 Samba-Servers that achieve failover through >> Heartbeat. Ideally, both servers should access the external storage through >> GFS. We thought we could use the pool_tool or clvm for this, but AFAIK >> both don't offer any redundancy, right? >> >> Is there any way to make GFS-Nodes (preferably through GNBD) >> redundant, so that a failure of a single node wouldn't affect the >> whole storage? Of course we could employ RAID 1 or 5 on the nodes >> themselves but that wouldn't save us in case the whole node fails. >> >> Does anyone have any experience with this. Thanks in advance >> >> -markus >> -- Mit freundlichen Gr?ssen Markus Wiedmer - FHBB mailto:markus.wiedmer at stud.fhbb.ch From pcaulfie at redhat.com Thu Nov 4 09:50:01 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Thu, 4 Nov 2004 09:50:01 +0000 Subject: [Linux-cluster] gfs on 2.6.9 : umount gives sleeping function called from invalid context In-Reply-To: <1099519102.11420.31.camel@ibm-c.pdx.osdl.net> References: <1099519102.11420.31.camel@ibm-c.pdx.osdl.net> Message-ID: <20041104095001.GF26743@tykepenguin.com> On Wed, Nov 03, 2004 at 01:58:22PM -0800, Daniel McNeil wrote: > My 3 node cluster is running on 2.6.9 and the GFS cvs from > oct 27th. > > When I umount the gfs file system I get: > > dlm: closing connection to node 1 > Debug: sleeping function called from invalid context at include/linux/rwsem.h:43in_atomic():1, irqs_disabled():0 This patch should fix it: Index: dlm-kernel/src/lowcomms.c =================================================================== RCS file: /cvs/cluster/cluster/dlm-kernel/src/lowcomms.c,v retrieving revision 1.18 diff -u -r1.18 lowcomms.c --- dlm-kernel/src/lowcomms.c 25 Oct 2004 12:26:45 -0000 1.18 +++ dlm-kernel/src/lowcomms.c 4 Nov 2004 09:49:40 -0000 @@ -950,7 +950,7 @@ goto out; log_print("closing connection to node %d", nodeid); - con = nodeid2con(nodeid, 0); + con = connections[nodeid]; if (con) { close_connection(con, TRUE); clean_one_writequeue(con); -- patrick From agauthier at realmedia.com Thu Nov 4 13:34:02 2004 From: agauthier at realmedia.com (Arnaud Gauthier) Date: Thu, 4 Nov 2004 14:34:02 +0100 Subject: [Linux-cluster] Any news about broken broadcast on 2.6.9 ? Message-ID: <200411041434.02469.agauthier@realmedia.com> Hello, I have no Cisco switch available for implementing a Multicast router, so I would like staying with broadcast. But with pre3 version of GFS 6.1 broadcast looks... broken. Multicast looks good but I have no Cisco and Mrouted doesn't compile on my RH7.3 with kernel 2.6.9. Do you have news about broadcast or an idea about what I can use as an multicast router on 2.6.9 ? Regards, Arnaud -- Arnaud Gauthier Realmedia From jbrassow at redhat.com Thu Nov 4 15:57:15 2004 From: jbrassow at redhat.com (Jonathan E Brassow) Date: Thu, 4 Nov 2004 09:57:15 -0600 Subject: [Linux-cluster] Any news about broken broadcast on 2.6.9 ? In-Reply-To: <200411041434.02469.agauthier@realmedia.com> References: <200411041434.02469.agauthier@realmedia.com> Message-ID: <32F476A9-2E7A-11D9-9CA7-000A957BB1F6@redhat.com> at least with regards to ccs, you could 'ccsd -4', which will tell it to use IPv4 and broadcast. I'm not sure about cman... brassow On Nov 4, 2004, at 7:34 AM, Arnaud Gauthier wrote: > Hello, > > I have no Cisco switch available for implementing a Multicast router, > so I > would like staying with broadcast. 
But with pre3 version of GFS 6.1 > broadcast > looks... broken. Multicast looks good but I have no Cisco and Mrouted > doesn't > compile on my RH7.3 with kernel 2.6.9. > > Do you have news about broadcast or an idea about what I can use as an > multicast router on 2.6.9 ? > > Regards, > Arnaud > -- > Arnaud Gauthier > Realmedia > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > From daniel at osdl.org Fri Nov 5 01:40:59 2004 From: daniel at osdl.org (Daniel McNeil) Date: Thu, 04 Nov 2004 17:40:59 -0800 Subject: [Linux-cluster] gfs on 2.6.9 : umount gives sleeping function called from invalid context In-Reply-To: <20041104095001.GF26743@tykepenguin.com> References: <1099519102.11420.31.camel@ibm-c.pdx.osdl.net> <20041104095001.GF26743@tykepenguin.com> Message-ID: <1099618859.11420.38.camel@ibm-c.pdx.osdl.net> On Thu, 2004-11-04 at 01:50, Patrick Caulfield wrote: > On Wed, Nov 03, 2004 at 01:58:22PM -0800, Daniel McNeil wrote: > > My 3 node cluster is running on 2.6.9 and the GFS cvs from > > oct 27th. > > > > When I umount the gfs file system I get: > > > > dlm: closing connection to node 1 > > Debug: sleeping function called from invalid context at include/linux/rwsem.h:43in_atomic():1, irqs_disabled():0 > > This patch should fix it: > > Index: dlm-kernel/src/lowcomms.c > =================================================================== > RCS file: /cvs/cluster/cluster/dlm-kernel/src/lowcomms.c,v > retrieving revision 1.18 > diff -u -r1.18 lowcomms.c > --- dlm-kernel/src/lowcomms.c 25 Oct 2004 12:26:45 -0000 1.18 > +++ dlm-kernel/src/lowcomms.c 4 Nov 2004 09:49:40 -0000 > @@ -950,7 +950,7 @@ > goto out; > > log_print("closing connection to node %d", nodeid); > - con = nodeid2con(nodeid, 0); > + con = connections[nodeid]; > if (con) { > close_connection(con, TRUE); > clean_one_writequeue(con); > Patrick, Not quite. It fixed the might_sleep() in lowcomms_close(), but there is a down_write() is close_connection() causing another might_sleep(). Daniel From pcaulfie at redhat.com Fri Nov 5 08:22:15 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Fri, 5 Nov 2004 08:22:15 +0000 Subject: [Linux-cluster] gfs on 2.6.9 : umount gives sleeping function called from invalid context In-Reply-To: <1099618859.11420.38.camel@ibm-c.pdx.osdl.net> References: <1099519102.11420.31.camel@ibm-c.pdx.osdl.net> <20041104095001.GF26743@tykepenguin.com> <1099618859.11420.38.camel@ibm-c.pdx.osdl.net> Message-ID: <20041105082214.GA27851@tykepenguin.com> On Thu, Nov 04, 2004 at 05:40:59PM -0800, Daniel McNeil wrote: > > Not quite. It fixed the might_sleep() in lowcomms_close(), > but there is a down_write() is close_connection() causing > another might_sleep(). Hmm, well spotted. In that case the whole of that stack of code needs to be taken out of the atomic section. Dave? -- patrick From agauthier at realmedia.com Fri Nov 5 08:55:33 2004 From: agauthier at realmedia.com (Arnaud Gauthier) Date: Fri, 5 Nov 2004 09:55:33 +0100 Subject: [Linux-cluster] Any news about broken broadcast on 2.6.9 ? In-Reply-To: <32F476A9-2E7A-11D9-9CA7-000A957BB1F6@redhat.com> References: <200411041434.02469.agauthier@realmedia.com> <32F476A9-2E7A-11D9-9CA7-000A957BB1F6@redhat.com> Message-ID: <200411050955.33081.agauthier@realmedia.com> Le Thursday 04 November 2004 16:57, Jonathan E Brassow a ?crit?: > at least with regards to ccs, you could 'ccsd -4', which will tell it > to use IPv4 and broadcast. 
I tried it with still the same effect: cman can't join > I'm not sure about cman... No effect. cman join -d says it's alone and forms a new cluster :-(( Regards, Arnaud -- Arnaud Gauthier Realmedia From pcaulfie at redhat.com Fri Nov 5 09:02:57 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Fri, 5 Nov 2004 09:02:57 +0000 Subject: [Linux-cluster] Any news about broken broadcast on 2.6.9 ? In-Reply-To: <200411050955.33081.agauthier@realmedia.com> References: <200411041434.02469.agauthier@realmedia.com> <32F476A9-2E7A-11D9-9CA7-000A957BB1F6@redhat.com> <200411050955.33081.agauthier@realmedia.com> Message-ID: <20041105090257.GC27851@tykepenguin.com> On Fri, Nov 05, 2004 at 09:55:33AM +0100, Arnaud Gauthier wrote: > Le Thursday 04 November 2004 16:57, Jonathan E Brassow a ?crit?: > > at least with regards to ccs, you could 'ccsd -4', which will tell it > > to use IPv4 and broadcast. > > I tried it with still the same effect: cman can't join > > > I'm not sure about cman... > > No effect. cman join -d says it's alone and forms a new cluster :-(( > check that the other nodes are running the same version of the software. Also check that cman is using the correct interface (/proc/cluster/status will tell you both of these). It's also worth using tcpdump to make sure that the messages are emerging onto the wire. -- patrick From daniel at osdl.org Fri Nov 5 22:33:57 2004 From: daniel at osdl.org (Daniel McNeil) Date: Fri, 05 Nov 2004 14:33:57 -0800 Subject: [Linux-cluster] GFS strange behavior and mount hang on 2.6.9 - 3 nodes Message-ID: <1099694037.11420.63.camel@ibm-c.pdx.osdl.net> I been testing 3-node GFS file system on shared fibre channel storage, and run into a couple of strange things. The 3 nodes are cl030, cl031, and cl032. 1. After running tar tests on 3 nodes for about a day, I wanted to try out the patch to get rid of the might_sleep() warning. I umounted the GFS file system on cl031 and then tried to rmmod the lock_dlm module, but couldn't because of the use count on the modules: # umount /gfs_stripe5 dlm: connecting to 2 dlm: closing connection to node 1 Debug: sleeping function called from invalid context at include/linux/rwsem.h:43in_atomic():1, irqs_disabled():0 [] dump_stack+0x1e/0x30 [] __might_sleep+0xb7/0xf0 [] nodeid2con+0x25/0x1e0 [dlm] [] lowcomms_close+0x42/0x70 [dlm] [] put_node+0x2c/0x70 [dlm] [] release_csb+0x17/0x30 [dlm] [] nodes_clear+0x33/0x40 [dlm] [] ls_nodes_clear+0x17/0x30 [dlm] [] release_lockspace+0x1fd/0x2f0 [dlm] [] release_gdlm+0x1c/0x30 [lock_dlm] [] lm_dlm_unmount+0x24/0x50 [lock_dlm] [] lm_unmount+0x46/0xac [lock_harness] [] gfs_put_super+0x30f/0x3c0 [gfs] [] generic_shutdown_super+0x18a/0x1a0dlm: connecting to 1 [] kill_block_super+0x1d/0x40 [] deactivate_super+0x81/0xa0 [] sys_umount+0x3c/0xa0 dlm: closing connection to node 2 dlm: closing connection to node 3 dlm: got connection from 2 dlm: got connection from 1 # lsmod Module Size Used by lock_dlm 39408 2 dlm 128008 1 lock_dlm gfs 296780 0 lock_harness 3868 2 lock_dlm,gfs qla2200 86432 0 qla2xxx 112064 1 qla2200 cman 128480 8 lock_dlm,dlm dm_mod 53536 0 # rmmod lock_dlm ERROR: Module lock_dlm is in use ----> At this point, the lock_dlm module would not unload because it still had a use count of 2. The "got connection" messages after the umount look strange. What do those messages mean? 2. 
After rebooting, cl031, I got cl031 to rejoin the cluster, but when trying to mount the mount hung: # cat /proc/cluster/nodes Node Votes Exp Sts Name 1 1 3 M cl030a 2 1 3 M cl032a 3 1 3 M cl031a # mount -t gfs /dev/sdf1 /gfs_stripe5 GFS: Trying to join cluster "lock_dlm", "gfs_cluster:stripefs" dlm: stripefs: recover event 2 (first) dlm: stripefs: add nodes dlm: connecting to 1 ==> mount HUNG here cl031 root]# cat /proc/cluster/services Service Name GID LID State Code Fence Domain: "default" 1 2 run - [2 1 3] DLM Lock Space: "stripefs" 18 3 join S-6,20,3 [2 1 3] ================ cl032 root]# cat /proc/cluster/services Service Name GID LID State Code Fence Domain: "default" 1 4 run - [1 2 3] DLM Lock Space: "stripefs" 18 21 update U-4,1,3 [1 2 3] GFS Mount Group: "stripefs" 19 22 run - [1 2] ================ cl030 proc]# cat /proc/cluster/services Service Name GID LID State Code Fence Domain: "default" 1 2 run - [1 2 3] DLM Lock Space: "stripefs" 18 23 update U-4,1,3 [1 2 3] GFS Mount Group: "stripefs" 19 24 run - [1 2] It looks like some problem joining the DLM Lock Space. I have stack traces available from all 3 machines if that provides any info (http://developer.osdl.org/daniel/gfs_hang/) I reset cl031 and the other 2 nodes recovered ok: dlm: stripefs: total nodes 3 dlm: stripefs: nodes_reconfig failed -1 dlm: stripefs: recover event 76 error -1 cl032: CMAN: no HELLO from cl031a, removing from the cluster dlm: stripefs: total nodes 3 dlm: stripefs: nodes_reconfig failed 1 dlm: stripefs: recover event 69 error Anyone seen anything like this? Daniel From pcaulfie at redhat.com Mon Nov 8 14:06:19 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Mon, 8 Nov 2004 14:06:19 +0000 Subject: [Linux-cluster] WARNING: libdlm kernel interface change in CVS Message-ID: <20041108140619.GB23446@tykepenguin.com> I have just checked in a change that breaks backwards compatibility with libdlm and the dlm kernel. If you update the kernel from CVS you /must/ also update libdlm. And, conversely, don't ugrade libdlm on its own unless you are also changing the kernel. If you get odd errors from userland programs that use the DLM then check that both libdlm and the kernel. If applications are linked against the shared libdlm then it will be sufficient to upgrade the libdlm.so file only, the library ABI has not changed. Sorry for the inconvenience. -- patrick From pcaulfie at redhat.com Tue Nov 9 13:15:16 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Tue, 9 Nov 2004 13:15:16 +0000 Subject: [Linux-cluster] GFS strange behavior and mount hang on 2.6.9 - 3 nodes In-Reply-To: <1099694037.11420.63.camel@ibm-c.pdx.osdl.net> References: <1099694037.11420.63.camel@ibm-c.pdx.osdl.net> Message-ID: <20041109131515.GE18685@tykepenguin.com> On Fri, Nov 05, 2004 at 02:33:57PM -0800, Daniel McNeil wrote: > I been testing 3-node GFS file system on shared fibre channel > storage, and run into a couple of strange things. The 3 nodes > are cl030, cl031, and cl032. > > 1. After running tar tests on 3 nodes for about a day, > I wanted to try out the patch to get rid of the might_sleep() > warning. I umounted the GFS file system on cl031 and then > tried to rmmod the lock_dlm module, but couldn't because of > the use count on the modules: > > > # rmmod lock_dlm > ERROR: Module lock_dlm is in use > > ----> > At this point, the lock_dlm module would not unload because > it still had a use count of 2. That will be because of the previous oops I imagine. 
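For reference, the "use count" that rmmod is complaining about above is the module reference count; a minimal sketch (illustrative names only, not the actual lock_dlm source) of how an unbalanced reference keeps a module pinned:

    /* Illustrative only -- not the actual lock_dlm code.  Every
     * try_module_get() must be balanced by a module_put(); if an oops
     * or an error path skips the put, the count stays non-zero and
     * rmmod keeps reporting the module as in use. */
    #include <linux/module.h>
    #include <linux/errno.h>

    static int example_grab(void)
    {
            if (!try_module_get(THIS_MODULE))   /* pin the module */
                    return -ENODEV;
            return 0;
    }

    static void example_release(void)
    {
            module_put(THIS_MODULE);            /* balances the get above */
    }

The count lsmod prints next to the module size is this reference count, which is why lock_dlm still shows 2 after the failed unmount.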
> The "got connection" messages after the umount look strange. > What do those messages mean? They mean that a connection was made to the node that has just dismounted the filesystem. This is wrong and alomst certainly the cause of what happened below. I'm looking into it at the moment. -- patrick From pcaulfie at redhat.com Tue Nov 9 16:02:55 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Tue, 9 Nov 2004 16:02:55 +0000 Subject: [Linux-cluster] GFS strange behavior and mount hang on 2.6.9 - 3 nodes In-Reply-To: <20041109131515.GE18685@tykepenguin.com> References: <1099694037.11420.63.camel@ibm-c.pdx.osdl.net> <20041109131515.GE18685@tykepenguin.com> Message-ID: <20041109160255.GF18685@tykepenguin.com> On Tue, Nov 09, 2004 at 01:15:16PM +0000, Patrick Caulfield wrote: > > > > The "got connection" messages after the umount look strange. > > What do those messages mean? > > They mean that a connection was made to the node that has just dismounted the > filesystem. This is wrong and alomst certainly the cause of what happened below. > I'm looking into it at the moment. > Ok, should be fixed in latest CVS. As is a better fix for the "sleep in spinlock" bug. -- patrick From daniel at osdl.org Tue Nov 9 19:10:31 2004 From: daniel at osdl.org (Daniel McNeil) Date: Tue, 09 Nov 2004 11:10:31 -0800 Subject: [Linux-cluster] GFS strange behavior and mount hang on 2.6.9 - 3 nodes In-Reply-To: <20041109160255.GF18685@tykepenguin.com> References: <1099694037.11420.63.camel@ibm-c.pdx.osdl.net> <20041109131515.GE18685@tykepenguin.com> <20041109160255.GF18685@tykepenguin.com> Message-ID: <1100027431.11420.79.camel@ibm-c.pdx.osdl.net> On Tue, 2004-11-09 at 08:02, Patrick Caulfield wrote: > On Tue, Nov 09, 2004 at 01:15:16PM +0000, Patrick Caulfield wrote: > > > > > > > The "got connection" messages after the umount look strange. > > > What do those messages mean? > > > > They mean that a connection was made to the node that has just dismounted the > > filesystem. This is wrong and alomst certainly the cause of what happened below. > > I'm looking into it at the moment. > > > > Ok, should be fixed in latest CVS. As is a better fix for the "sleep in > spinlock" bug. I'm upgrading now and will give it a try. Thanks, Daniel From dmorgan at gmi-mr.com Tue Nov 9 19:15:35 2004 From: dmorgan at gmi-mr.com (Duncan Morgan) Date: Tue, 9 Nov 2004 11:15:35 -0800 Subject: [Linux-cluster] Load issues with GFS/Apache Message-ID: <20041109191536.767D13E0034@van91.gmi-mr.com> Hello, We have the following configuration - 14 GFS nodes each with dual 2.4 GHz Xeon and 2 GB RAM - RHEL 3 with 2.4.21-20.ELsmp kernel - GFS 6.0.0-15 - Apache 1.3.31 on each node The load on each server is consistently high (from 2 -->15) compared to Apache running on a standalone server. The processors are generally about 80% idle and lock_gulmd is always near the top of the 'top' output. Has anybody else experienced this? Is it normal? Thanks in advance. 
Duncan Morgan From mtilstra at redhat.com Tue Nov 9 19:38:30 2004 From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra) Date: Tue, 9 Nov 2004 13:38:30 -0600 Subject: [Linux-cluster] Load issues with GFS/Apache In-Reply-To: <20041109191536.767D13E0034@van91.gmi-mr.com> References: <20041109191536.767D13E0034@van91.gmi-mr.com> Message-ID: <20041109193830.GA4814@redhat.com> On Tue, Nov 09, 2004 at 11:15:35AM -0800, Duncan Morgan wrote: > Hello, > > We have the following configuration > > - 14 GFS nodes each with dual 2.4 GHz Xeon and 2 GB RAM > - RHEL 3 with 2.4.21-20.ELsmp kernel > - GFS 6.0.0-15 > - Apache 1.3.31 on each node > > The load on each server is consistently high (from 2 -->15) compared to > Apache running on a standalone server. The processors are generally about > 80% idle and lock_gulmd is always near the top of the 'top' output. > > Has anybody else experienced this? Is it normal? > > Thanks in advance. yes, its normal. it is an artifact of how load averages are calculated and that most of gfs and gulm in kernel space don't use interruptible waits. (puts processes into the D state.) -- Michael Conrad Tadpol Tilstra I can resist anything but temptation. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From qliu at ncsa.uiuc.edu Tue Nov 9 19:55:35 2004 From: qliu at ncsa.uiuc.edu (Qian Liu) Date: Tue, 09 Nov 2004 13:55:35 -0600 Subject: [Linux-cluster] clvmd problem Message-ID: <5.1.0.14.2.20041109135324.02c79ec0@pop.ncsa.uiuc.edu> Hi, all When I finished starting ccsd, cam_tool join, and fence_tool join I could not start clvmd. I got returned prompt: clvmd could not connect to cluster. Any suggestion or clue on this? Thanks in advance! -Qian From pcaulfie at redhat.com Wed Nov 10 08:21:57 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Wed, 10 Nov 2004 08:21:57 +0000 Subject: [Linux-cluster] clvmd problem In-Reply-To: <5.1.0.14.2.20041109135324.02c79ec0@pop.ncsa.uiuc.edu> References: <5.1.0.14.2.20041109135324.02c79ec0@pop.ncsa.uiuc.edu> Message-ID: <20041110082157.GC30705@tykepenguin.com> On Tue, Nov 09, 2004 at 01:55:35PM -0600, Qian Liu wrote: > Hi, all > > When I finished starting ccsd, cam_tool join, and fence_tool join I could > not start clvmd. I got returned prompt: clvmd could not connect to > cluster. Any suggestion or clue on this? Thanks in advance! > Check that all the software is in sync (ie compiled from the same CVS checkout). In particular that you don't have an old libdlm lying around (perhaps in /lib/) -- patrick From andrew.warfield at gmail.com Wed Nov 10 16:22:58 2004 From: andrew.warfield at gmail.com (Andrew Warfield) Date: Wed, 10 Nov 2004 16:22:58 +0000 Subject: [Linux-cluster] GNBD: interrupt bug. Message-ID: I seem to have found a bug in GNBD that I'm wondering if anyone else has noticed. On a Linux 2.6.9 host, mounting a remote GNBD partition, i am hitting the WARN_ON in linux's local_bh_enable() (kernel/softirq.c:141). This results in repeated stack dumps, effectively locking up the host. After a bit of looking, it would appear that do_gnbd_request() in gnbd.c expects to be called with interrupts disabled. Unfortunately, the entry through generic_unplug_device does not disable interrupts first, and so the gnbd code is disagreeing. The specific code in gnbd.c that is causing problems is: spin_unlock_irq(q->queue_lock); ... 
spin_lock_irq(q->queue_lock); I'm planning to look at this a bit more tonight, but thought I'd quickly check if anyone had initial insight. I'm not yet sure if the bug is on the Linux or the gnbd side here. cheers, a. From daniel at osdl.org Thu Nov 11 00:41:49 2004 From: daniel at osdl.org (Daniel McNeil) Date: Wed, 10 Nov 2004 16:41:49 -0800 Subject: [Linux-cluster] GFS strange behavior and mount hang on 2.6.9 - 3 nodes In-Reply-To: <20041109160255.GF18685@tykepenguin.com> References: <1099694037.11420.63.camel@ibm-c.pdx.osdl.net> <20041109131515.GE18685@tykepenguin.com> <20041109160255.GF18685@tykepenguin.com> Message-ID: <1100133709.25523.15.camel@ibm-c.pdx.osdl.net> On Tue, 2004-11-09 at 08:02, Patrick Caulfield wrote: > On Tue, Nov 09, 2004 at 01:15:16PM +0000, Patrick Caulfield wrote: > > > > > > > The "got connection" messages after the umount look strange. > > > What do those messages mean? > > > > They mean that a connection was made to the node that has just dismounted the > > filesystem. This is wrong and alomst certainly the cause of what happened below. > > I'm looking into it at the moment. > > > > Ok, should be fixed in latest CVS. As is a better fix for the "sleep in > spinlock" bug. I upgraded to the latest cvs and the might_sleep() is gone. FYI, I upgraded one node at a time. cman was ok with mixing with the last version, but the gfs mount would hang. Is this expected because of the dlm change? With all nodes on the latest bits, everything seems ok. I'll do more testing and yell if I hit anything. Thanks, Daniel From pcaulfie at redhat.com Thu Nov 11 08:27:54 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Thu, 11 Nov 2004 08:27:54 +0000 Subject: [Linux-cluster] GFS strange behavior and mount hang on 2.6.9 - 3 nodes In-Reply-To: <1100133709.25523.15.camel@ibm-c.pdx.osdl.net> References: <1099694037.11420.63.camel@ibm-c.pdx.osdl.net> <20041109131515.GE18685@tykepenguin.com> <20041109160255.GF18685@tykepenguin.com> <1100133709.25523.15.camel@ibm-c.pdx.osdl.net> Message-ID: <20041111082754.GA1267@tykepenguin.com> On Wed, Nov 10, 2004 at 04:41:49PM -0800, Daniel McNeil wrote: > > I upgraded to the latest cvs and the might_sleep() is gone. > > FYI, I upgraded one node at a time. cman was ok with > mixing with the last version, but the gfs mount would > hang. Is this expected because of the dlm change? Yes, there was an extra flag added fairly recently. We need to try to mitigate the incompatibility changes from now on I think - though be warned that there is one more to come that I know of... > With all nodes on the latest bits, everything seems ok. > I'll do more testing and yell if I hit anything. Great, thanks. -- patrick From dmorgan at gmi-mr.com Thu Nov 11 17:37:56 2004 From: dmorgan at gmi-mr.com (Duncan Morgan) Date: Thu, 11 Nov 2004 09:37:56 -0800 Subject: [Linux-cluster] Load issues with GFS/Apache In-Reply-To: <20041109193830.GA4814@redhat.com> Message-ID: <20041111173757.7897D3E0024@van91.gmi-mr.com> Thanks Michael.
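On the do_gnbd_request() interrupt report above: a conventional way to avoid that sort of mismatch is to save and restore the caller's interrupt state rather than unconditionally re-enabling it. A rough, generic sketch -- this is not gnbd.c and not necessarily the right fix for it:

    /* Generic illustration only -- not the gnbd.c code or its eventual
     * fix.  spin_unlock_irq()/spin_lock_irq() assume the function was
     * entered with interrupts disabled and that they may be re-enabled
     * across the unlocked region; the irqsave/irqrestore variants
     * preserve whatever state the caller actually had. */
    #include <linux/spinlock.h>

    static void example_request_fn(spinlock_t *queue_lock)
    {
            unsigned long flags;

            spin_lock_irqsave(queue_lock, flags);
            /* ... take a request off the queue ... */
            spin_unlock_irqrestore(queue_lock, flags);

            /* work that must not run with interrupts unexpectedly
             * disabled (e.g. network I/O) goes here */

            spin_lock_irqsave(queue_lock, flags);
            /* ... complete the request ... */
            spin_unlock_irqrestore(queue_lock, flags);
    }

Whether that pattern is appropriate for a block request function, which is normally entered with the queue lock already held, is exactly the sort of detail Andrew said he was planning to check.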
-----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Michael Conrad Tadpol Tilstra Sent: Tuesday, November 09, 2004 11:39 AM To: linux clistering Subject: Re: [Linux-cluster] Load issues with GFS/Apache On Tue, Nov 09, 2004 at 11:15:35AM -0800, Duncan Morgan wrote: > Hello, > > We have the following configuration > > - 14 GFS nodes each with dual 2.4 GHz Xeon and 2 GB RAM > - RHEL 3 with 2.4.21-20.ELsmp kernel > - GFS 6.0.0-15 > - Apache 1.3.31 on each node > > The load on each server is consistently high (from 2 -->15) compared to > Apache running on a standalone server. The processors are generally about > 80% idle and lock_gulmd is always near the top of the 'top' output. > > Has anybody else experienced this? Is it normal? > > Thanks in advance. yes, its normal. it is an artifact of how load averages are calculated and that most of gfs and gulm in kernel space don't use interruptible waits. (puts processes into the D state.) -- Michael Conrad Tadpol Tilstra I can resist anything but temptation. From owen at isrl.uiuc.edu Thu Nov 11 18:07:18 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Thu, 11 Nov 2004 12:07:18 -0600 Subject: [Linux-cluster] GFS hangs after several hours Message-ID: <20041111180717.GF26670@iwork57.lis.uiuc.edu> Hi all, My setup: 5 Athlon servers RedHat 9.0 (Yeah, I still haven't upgraded yet) kernel-2.6.9 from kernel.org, patched with gfs/ccs/dlm from the .tar.gz repository. using lock_dlm Using Apple XServe RAIDs with Apple FC cards (mptscsih driver). I thought I had everything running properly. I had two machines hammering a GFS partition at the same time. I pulled the power cord on one. fence_vixel kicked in, and the rest of the cluster continued. I could repeat this over and over. I set up two machines, each writing to a different GFS overnight. In the morning, there were no errors but one process was hung in a "D" state. The fence system did not show any activity. No errors were logged anywhere on the cluster. 'df' hung on any machine in the cluster when it came to one of the GFS partitions. I shut down the ethernet on one of the machines, but it didn't get fenced. It seems that something silently died, but I don't really know where to begin looking, as I don't see any errors written anywhere. Anyone got any ideas? The only other note is that CCSD appeared to be having some problems with determining if the cluster had quorum. -- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> From owen at isrl.uiuc.edu Fri Nov 12 16:06:49 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Fri, 12 Nov 2004 10:06:49 -0600 Subject: [Linux-cluster] GFS hangs after several hours In-Reply-To: <20041111180717.GF26670@iwork57.lis.uiuc.edu> References: <20041111180717.GF26670@iwork57.lis.uiuc.edu> Message-ID: <20041112160648.GD30143@iwork57.lis.uiuc.edu> More information. I may have had an old version of ccsd which allowed me to get the cluster running in the first place. I can't get that far now. I have IPv6 compiled in the kernel but no IPv6 interfaces defined. I've given ccsd the -4 flag. Checking logs after "ccs_test connect" shows that ccsd does not believe the cluster is quorate. /etc/cluster/status says that the cluster has reached quorum. The IP addresses are appropriate (I have dual-NIC hosts). 
I recompiled ccsd with "DEBUG=1" and found that the "quorate" variable was never set in ccsd. I further found that cluster_communicator() never received a valid fd from clu_connect and was therefore stuck in a loop. clu_connect appears to be a magma call. Any advice on how to proceed? On Thu, Nov 11, 2004 at 12:07:18PM -0600, Brynnen R Owen wrote: > Hi all, > > My setup: > > 5 Athlon servers > > RedHat 9.0 (Yeah, I still haven't upgraded yet) > > kernel-2.6.9 from kernel.org, patched with gfs/ccs/dlm from the > .tar.gz repository. > > using lock_dlm > > Using Apple XServe RAIDs with Apple FC cards (mptscsih driver). > > I thought I had everything running properly. I had two machines > hammering a GFS partition at the same time. I pulled the power cord > on one. fence_vixel kicked in, and the rest of the cluster > continued. I could repeat this over and over. > > I set up two machines, each writing to a different GFS overnight. > In the morning, there were no errors but one process was hung in a "D" > state. The fence system did not show any activity. No errors were > logged anywhere on the cluster. 'df' hung on any machine in the > cluster when it came to one of the GFS partitions. I shut down the > ethernet on one of the machines, but it didn't get fenced. It seems > that something silently died, but I don't really know where to begin > looking, as I don't see any errors written anywhere. Anyone got any > ideas? > > The only other note is that CCSD appeared to be having some problems > with determining if the cluster had quorum. > > -- > <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> > <> Brynnen Owen ( this space for rent )<> > <> owen at uiuc.edu ( )<> > <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster -- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> From bdemulder at gmail.com Fri Nov 12 16:19:00 2004 From: bdemulder at gmail.com (Benoit de Mulder) Date: Fri, 12 Nov 2004 11:19:00 -0500 Subject: [Linux-cluster] mysql and GFS Message-ID: Hello, The setup is the following : 4 mysql server, today using replication, around 600 con / per server. The issue is that the write is overloaded and It's not possible to change the app. I was looking at two approaches. 1) Multiple write servers and GFS I would like to know if this possible to use multiple write server on the same database, using external lock parameter on mysql to ensure data integrity. Does someone has setup a multiple write mysql server using GFS ? 2) MySQL clustering My only fear is that the solution is very young and storage engine does not support some characteristics of the db schema (ie full text index). Any advice will be appreciated. Thks Benoit From jbrassow at redhat.com Fri Nov 12 17:55:29 2004 From: jbrassow at redhat.com (Jonathan E Brassow) Date: Fri, 12 Nov 2004 11:55:29 -0600 Subject: [Linux-cluster] GFS hangs after several hours In-Reply-To: <20041112160648.GD30143@iwork57.lis.uiuc.edu> References: <20041111180717.GF26670@iwork57.lis.uiuc.edu> <20041112160648.GD30143@iwork57.lis.uiuc.edu> Message-ID: <0A9659B0-34D4-11D9-B241-000A957BB1F6@redhat.com> It sounds like the magma plugins can not be found - or are old. ls /usr/lib/magma/plugins ? 
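That check matters because magma locates its cluster interface at runtime from that plugin directory, and clu_connect() cannot hand back a usable descriptor if nothing loadable is found there. A rough sketch of the usual dlopen-style discovery loop (illustrative only -- not magma's actual source):

    /* Illustrative only.  If the plugin directory is empty or holds
     * stale builds, there is no cluster driver to load and the connect
     * call fails in exactly the way ccsd reports. */
    #include <dirent.h>
    #include <dlfcn.h>
    #include <stdio.h>
    #include <string.h>

    static void *load_first_plugin(const char *dir)
    {
            DIR *d = opendir(dir);
            struct dirent *de;
            void *handle = NULL;
            char path[512];

            if (!d)
                    return NULL;
            while ((de = readdir(d)) != NULL) {
                    if (!strstr(de->d_name, ".so"))
                            continue;
                    snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
                    handle = dlopen(path, RTLD_NOW);
                    if (handle)
                            break;      /* found a loadable driver */
                    fprintf(stderr, "skipping %s: %s\n", path, dlerror());
            }
            closedir(d);
            return handle;
    }

Calling something like load_first_plugin("/usr/lib/magma/plugins") and getting NULL back is the user-space analogue of the symptom described above.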
Additionally, you may wish to check that there are no older versions of magma hanging around: rm -rf /lib/*magma* recompile. brassow On Nov 12, 2004, at 10:06 AM, Brynnen R Owen wrote: > More information. > > I may have had an old version of ccsd which allowed me to get the > cluster running in the first place. I can't get that far now. > > I have IPv6 compiled in the kernel but no IPv6 interfaces defined. > I've given ccsd the -4 flag. > > Checking logs after "ccs_test connect" shows that ccsd does not > believe the cluster is quorate. > > /etc/cluster/status says that the cluster has reached quorum. The IP > addresses are appropriate (I have dual-NIC hosts). > > I recompiled ccsd with "DEBUG=1" and found that the "quorate" variable > was never set in ccsd. I further found that cluster_communicator() > never received a valid fd from clu_connect and was therefore stuck in > a loop. clu_connect appears to be a magma call. > > Any advice on how to proceed? From owen at isrl.uiuc.edu Fri Nov 12 18:02:05 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Fri, 12 Nov 2004 12:02:05 -0600 Subject: [Linux-cluster] GFS hangs after several hours In-Reply-To: <0A9659B0-34D4-11D9-B241-000A957BB1F6@redhat.com> References: <20041111180717.GF26670@iwork57.lis.uiuc.edu> <20041112160648.GD30143@iwork57.lis.uiuc.edu> <0A9659B0-34D4-11D9-B241-000A957BB1F6@redhat.com> Message-ID: <20041112180205.GI30143@iwork57.lis.uiuc.edu> I was just about to send email to the list and say I found an issue with magma-plugins. I have fixed this. I'll see if the cluster is stable now. Thanks. On Fri, Nov 12, 2004 at 11:55:29AM -0600, Jonathan E Brassow wrote: > It sounds like the magma plugins can not be found - or are old. > > ls /usr/lib/magma/plugins ? > > Additionally, you may wish to check that there are no older versions of > magma hanging around: > > rm -rf /lib/*magma* > > recompile. > brassow > > > On Nov 12, 2004, at 10:06 AM, Brynnen R Owen wrote: > > >More information. > > > >I may have had an old version of ccsd which allowed me to get the > >cluster running in the first place. I can't get that far now. > > > >I have IPv6 compiled in the kernel but no IPv6 interfaces defined. > >I've given ccsd the -4 flag. > > > >Checking logs after "ccs_test connect" shows that ccsd does not > >believe the cluster is quorate. > > > >/etc/cluster/status says that the cluster has reached quorum. The IP > >addresses are appropriate (I have dual-NIC hosts). > > > >I recompiled ccsd with "DEBUG=1" and found that the "quorate" variable > >was never set in ccsd. I further found that cluster_communicator() > >never received a valid fd from clu_connect and was therefore stuck in > >a loop. clu_connect appears to be a magma call. > > > >Any advice on how to proceed? > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster -- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> From dmorgan at gmi-mr.com Mon Nov 15 01:08:34 2004 From: dmorgan at gmi-mr.com (dmorgan at gmi-mr.com) Date: Mon, 15 Nov 2004 01:08:34 -0000 (GMT) Subject: [Linux-cluster] GFS Support Question Message-ID: <1395.24.85.216.19.1100480914.squirrel@24.85.216.19> Hello, this question is probably best suited for the Red Hat sales department but I'll throw it out anyways. If you subscribe to RHEL AS 3 does this include GFS? 
If so would it include support for GFS? Thanks in advance, Duncan Morgan From rf_tang at ncic.ac.cn Mon Nov 15 03:47:55 2004 From: rf_tang at ncic.ac.cn (Rongfeng Tang) Date: Mon, 15 Nov 2004 11:47:55 +0800 Subject: [Linux-cluster] Low parallel write performance for GFS Message-ID: <20041115035900.9F5EAFB044@gatekeeper.ncic.ac.cn> Hello everyone: I want to test GFS-6.0 myself, the configuration is as follows: 2 GNBD servers, each with a dedicated SCSI disk, configured into one pool, 1 gulm lock server 10 clients GE network Using iozone, I ran read and write testing on each client. The aggregate write performance I got is only about 20% of raw device performance (10MB/s), even much lower than what one client can get (20MB/s). What's the problem with it? Thanks a lot. trf -------------- next part -------------- A non-text attachment was scrubbed... Name: fox.gif Type: image/gif Size: 9519 bytes Desc: not available URL: From yfyoufeng at 263.net Mon Nov 15 04:12:23 2004 From: yfyoufeng at 263.net (yf-263) Date: Mon, 15 Nov 2004 12:12:23 +0800 Subject: [Linux-cluster] Low parallel write performance for GFS In-Reply-To: <20041115035900.9F5EAFB044@gatekeeper.ncic.ac.cn> References: <20041115035900.9F5EAFB044@gatekeeper.ncic.ac.cn> Message-ID: <41982CA7.4020906@263.net> Hi, Tang, I found our GFS on Darwin is not so slow, as it is also 10MB/s on an FC'ed SAN disk ;) Rongfeng Tang wrote: >Hello everyone: > I want to test GFS-6.0 myself, the configuration is as >follows: > 2 GNBD servers, each with a dedicated SCSI disk, configured into one pool, > 1 gulm lock server > 10 clients > GE network >Using iozone, I ran read and write testing on each client. >The aggregate write performance I got is only about 20% of raw device performance (10MB/s), >even much lower than what one client can get (20MB/s). What's the problem with it? > > Thanks a lot. > > trf > > > > > > ------------------------------------------------------------------------ > >------------------------------------------------------------------------ > >-- >Linux-cluster mailing list >Linux-cluster at redhat.com >http://www.redhat.com/mailman/listinfo/linux-cluster > From rstevens at vitalstream.com Mon Nov 15 17:36:27 2004 From: rstevens at vitalstream.com (Rick Stevens) Date: Mon, 15 Nov 2004 09:36:27 -0800 Subject: [Linux-cluster] GFS Support Question In-Reply-To: <1395.24.85.216.19.1100480914.squirrel@24.85.216.19> References: <1395.24.85.216.19.1100480914.squirrel@24.85.216.19> Message-ID: <4198E91B.3060902@vitalstream.com> dmorgan at gmi-mr.com wrote: > Hello, > > this question is probably best suited for the Red Hat sales department but > I'll throw it out anyways. > > If you subscribe to RHEL AS 3 does this include GFS? If so would it > include support for GFS? I don't think so. Red Hat offers a subscription for GFS separately. AS/ES 3 does not include it as far as I know. ---------------------------------------------------------------------- - Rick Stevens, Senior Systems Engineer rstevens at vitalstream.com - - VitalStream, Inc. http://www.vitalstream.com - - - - "The bogosity meter just pegged."
- ---------------------------------------------------------------------- From rstevens at vitalstream.com Mon Nov 15 17:48:19 2004 From: rstevens at vitalstream.com (Rick Stevens) Date: Mon, 15 Nov 2004 09:48:19 -0800 Subject: [Linux-cluster] Old 2.4, non-nptl kernel support Message-ID: <4198EBE3.1050601@vitalstream.com> This is probably a silly question, but I've got the CVS version of GFS running on an FC3 system with a JetStor dual-host SCSI RAID array. It runs very nicely, indeed. However, I do have an application where the vendor has not yet rebuilt on a POSIX-thread compliant kernel--much less 2.6.9. I can probably induce them to have a run at it, but they are in the throes of a release of a different product and can't spend a lot of time on it at this exact moment. I was wondering, is there code buried in the archives somewhere that would work with a 2.4, non-nptl kernel such as one would find on RH9? Tarballs would be fine, I'm not afraid of building stuff in the least. ---------------------------------------------------------------------- - Rick Stevens, Senior Systems Engineer rstevens at vitalstream.com - - VitalStream, Inc. http://www.vitalstream.com - - - - I.R.S.: We've got what it takes to take what you've got! - ---------------------------------------------------------------------- From schuan2 at gmail.com Mon Nov 15 20:33:03 2004 From: schuan2 at gmail.com (Shih-Che Huang) Date: Mon, 15 Nov 2004 15:33:03 -0500 Subject: [Linux-cluster] Problem for loading GFS6.0 modules! Message-ID: Hi, I am running CentOS 3.3 2.4.21-20.EL.c0 and want to load the modules for GFS 6.0. It came back followin message when I type "modprobe lock_guld" and "modprobe gfs". # modprobe lock_gulm /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: unresolved symbol sock_recvmsg_Racb84c2d /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: unresolved symbol sock_sendmsg_R808aeb7d /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: unresolved symbol sock_release_R95b5793b /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: unresolved symbol sock_create_Raed60fac /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: insmod /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o failed /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: insmod lock_gulm failed # modprobe gfs /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o: unresolved symbol irq_stat_R94d0d943 /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o: insmod /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o failed /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o: insmod gfs failed Any idea? Thanks for your help! Shih-Che Huang -- Shih-Che Huang From amanthei at redhat.com Mon Nov 15 20:44:30 2004 From: amanthei at redhat.com (Adam Manthei) Date: Mon, 15 Nov 2004 14:44:30 -0600 Subject: [Linux-cluster] Problem for loading GFS6.0 modules! In-Reply-To: References: Message-ID: <20041115204430.GB17305@redhat.com> On Mon, Nov 15, 2004 at 03:33:03PM -0500, Shih-Che Huang wrote: > Hi, > I am running CentOS 3.3 2.4.21-20.EL.c0 and want to load the modules > for GFS 6.0. > It came back followin message when I type "modprobe lock_guld" > and "modprobe gfs". 
> > # modprobe lock_gulm > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > unresolved symbol sock_recvmsg_Racb84c2d > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > unresolved symbol sock_sendmsg_R808aeb7d > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > unresolved symbol sock_release_R95b5793b > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > unresolved symbol sock_create_Raed60fac > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > insmod /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o > failed > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > insmod lock_gulm failed > > # modprobe gfs > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o: unresolved symbol > irq_stat_R94d0d943 > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o: insmod > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o failed > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o: insmod gfs failed > > Any idea? > > Thanks for your help! The modules you are loading were not built against the same source and/or .config as your kernel as evident by the symbol names are unresolved: unresolved symbol irq_stat_R94d0d943 unresolved symbol sock_recvmsg_Racb84c2d unresolved symbol sock_sendmsg_R808aeb7d unresolved symbol sock_release_R95b5793b unresolved symbol sock_create_Raed60fac -- Adam Manthei From lars.larsen at edb.com Mon Nov 15 22:06:18 2004 From: lars.larsen at edb.com (=?ISO-8859-1?Q?Larsen_Lars_Asbj=F8rn?=) Date: Mon, 15 Nov 2004 23:06:18 +0100 Subject: [Linux-cluster] GFS Support Question Message-ID: Right, GFS is not included in RHEL 3 AS og ES, but you need to run GFS on top of RHEL 3 AS or ES to get support. Vennlig hilsen/Best regards Lars Larsen Seniorkonsulent/Senior Consultant EDB IT Drift www.edb.com -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Rick Stevens Sent: 15. november 2004 18:36 To: linux clistering Subject: Re: [Linux-cluster] GFS Support Question dmorgan at gmi-mr.com wrote: > Hello, > > this question is probably best suited for the Red Hat sales department > but I'll throw it out anyways. > > If you subscribe to RHEL AS 3 does this include GFS? If so would it > include support for GFS? I don't think so. Red Hat offers a subscription for GFS separately. AS/ES 3 does not include it as far as I know. ---------------------------------------------------------------------- - Rick Stevens, Senior Systems Engineer rstevens at vitalstream.com - - VitalStream, Inc. http://www.vitalstream.com - - - - "The bogosity meter just pegged." - ---------------------------------------------------------------------- -- Linux-cluster mailing list Linux-cluster at redhat.com http://www.redhat.com/mailman/listinfo/linux-cluster From amanthei at redhat.com Mon Nov 15 22:42:07 2004 From: amanthei at redhat.com (Adam Manthei) Date: Mon, 15 Nov 2004 16:42:07 -0600 Subject: [Linux-cluster] Problem for loading GFS6.0 modules! In-Reply-To: References: <20041115204430.GB17305@redhat.com> Message-ID: <20041115224207.GD17305@redhat.com> On Mon, Nov 15, 2004 at 05:30:05PM -0500, Shih-Che Huang wrote: > Hi Adam, > Where can I find the moudles which match my kernel? > I am using CentOS3.3 2.4.21-20.EL.c0 . Good question. The GFS-6.0 rpms from Red Hat's site are for Red Hat kernels. 
If you are unable to use a Red Hat kernel that matches the GFS rpms, then you will either need to compile GFS from source, or bug your distribution's package maintainers. > On Mon, 15 Nov 2004 14:44:30 -0600, Adam Manthei wrote: > > On Mon, Nov 15, 2004 at 03:33:03PM -0500, Shih-Che Huang wrote: > > > > > > > Hi, > > > I am running CentOS 3.3 2.4.21-20.EL.c0 and want to load the modules > > > for GFS 6.0. > > > It came back followin message when I type "modprobe lock_guld" > > > and "modprobe gfs". > > > > > > # modprobe lock_gulm > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > > > unresolved symbol sock_recvmsg_Racb84c2d > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > > > unresolved symbol sock_sendmsg_R808aeb7d > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > > > unresolved symbol sock_release_R95b5793b > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > > > unresolved symbol sock_create_Raed60fac > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > > > insmod /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o > > > failed > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs_locking/lock_gulm/lock_gulm.o: > > > insmod lock_gulm failed > > > > > > # modprobe gfs > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o: unresolved symbol > > > irq_stat_R94d0d943 > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o: insmod > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o failed > > > /lib/modules/2.4.21-20.EL.c0/kernel/fs/gfs/gfs.o: insmod gfs failed > > > > > > Any idea? > > > > > > Thanks for your help! > > > > The modules you are loading were not built against the same source and/or > > .config as your kernel as evident by the symbol names are unresolved: > > > > unresolved symbol irq_stat_R94d0d943 > > unresolved symbol sock_recvmsg_Racb84c2d > > unresolved symbol sock_sendmsg_R808aeb7d > > unresolved symbol sock_release_R95b5793b > > unresolved symbol sock_create_Raed60fac > > > > -- > > Adam Manthei > > > > > -- > Shih-Che Huang -- Adam Manthei From dwaquilina at gmail.com Tue Nov 16 05:48:00 2004 From: dwaquilina at gmail.com (David Aquilina) Date: Tue, 16 Nov 2004 00:48:00 -0500 Subject: [Linux-cluster] GFS Support Question In-Reply-To: References: Message-ID: On Mon, 15 Nov 2004 23:06:18 +0100, Larsen Lars Asbj?rn wrote: > GFS is not included in RHEL 3 AS og ES, but you need to run GFS on top of > RHEL 3 AS or ES to get support. GFS is supported on WS as well. It inherits also level of support of your underlying RHEL subscription - if you have AS Premium, you also get Premium support on GFS. IIRC, GFS entitlements include cluster suite as well. -- David Aquilina, RHCE dwaquilina at gmail.com From chekov at ucla.edu Tue Nov 16 07:59:49 2004 From: chekov at ucla.edu (Alan Wood) Date: Mon, 15 Nov 2004 23:59:49 -0800 (PST) Subject: [Linux-cluster] samba on top of GFS In-Reply-To: <20041101205335.GE5409@Favog.ubiqx.mn.org> References: <20041101170055.464B07313C@hormel.redhat.com> <20041101205335.GE5409@Favog.ubiqx.mn.org> Message-ID: Thanks for your help guys. Sorry it has taken me a while to get back to you. I have been trying to come up with intelligent things to add but have been stymied by what appears to be an ever-changing target. details below Here's what I think I know: 1. 
Samba does indeed crash when running with "oplocks = no" AND on a single node of a two-node GFS cluster where the other node is not doing much of anything. So Christopher, the answer to your questions is that it seems to be _purely_ samba and gfs interacting that is setting up a crash. 2. Samba sometimes crashes in such a way that a "kill -9" will not eliminate the crashed processes. In this situation attempting to start samba again creates a [predictably] unstable situation in which kernel oopses are evident. 3. Often samba simply starts to fail without outright crashing. This renders both heartbeat monitoring and fencing rather useless. On the client end this is evidenced by the ability to access a share and browse, but severely degraded performance when accessing individual files and frequent crashes of windows explorer. On the server end, more and more smb processes start up but old ones don't die... Here are some observations. Some of these could be completely bogus because I am extrapolating on insufficient facts. 1. Crashes are not necessarily caused by having multiple accesses to the same file. In one situation having 7 computers read/write to the same directory (but never the same file) appears to have caused a complete crash of a server running samba on top of GFS. 2. Crashes could be load related. I seem to be exponentially more likely to see a crash with 50 concurrent users than with 5. Since having many users increases the chances of a situation like #1 above, I can see a possible correlation, but this would not explain all crashes. 3. Larger files and more full directories experience severe performance degradation in the samba/gfs scenario. Simply right-clicking on a networked file that is a few hundred megabytes can take minutes to pop up a menu (assuming it doesn't crash windows explorer first). 4. Crashes occur on quota-enforced and non-quota-enforced systems with no discernible difference. However access to files on quota-ed systems might be slower. I experimented with a lot of different settings in the smb.conf, and while the crashes were sometimes different I could not come up with a good mapping of configs<=>crash causes. I have been very frustrated because every time I think I've discovered something a new crash happens that appears to prove me wrong. I need to create a separate test environment where I can set up idealized crash conditions in order to give this list some more credible data because my current environment has a lot of simultaneous access from multiple users whose actions aren't easily monitorable or consistent. For instance, bob can't access a server drive on windows computer one, so he simply logs into computer two and then three hoping to get a different result. Server crashes, but did it crash because bob didn't wait long enough on the first access and exacerbated a fixable problem to a crash? hard to tell... hoping someone out there has seen something similar or can shed some light. anyway, my system details: I'm still running samba 3.0.7 on top of kernel 2.6.8-1.521 I tried updating to the newest CVS releases but ran into compile errors and haven't had time to try again. so the GFS build is still mid-September. I'd like to try the fixes that Patrick and David posted but think I am going to try to compile cleanly with a 2.6.9 kernel. error details: I tried turning the loglevel up to 3. I get the following fairly often: Nov 15 17:10:32 clu2 smbd[5624]: Error writing 5 bytes to client. -1.
(Connection reset by peer) Nov 15 17:10:32 clu2 smbd[5624]: [2004/11/15 17:10:32, 0] lib/util_sock.c:send_smb(647) Nov 15 17:10:32 clu2 smbd[5624]: write_socket: Error writing 5 bytes to socket 24: ERRNO = Connection reset by peer Nov 15 17:10:32 clu2 smbd[5624]: [2004/11/15 17:10:32, 0] lib/util_sock.c:write_socket(455) Nov 15 17:10:32 clu2 smbd[5624]: write_socket_data: write failure. Error = Connection reset by peer Nov 15 17:10:32 clu2 smbd[5624]: [2004/11/15 17:10:32, 0] lib/util_sock.c:write_socket_data(430) otherwise I just see runaway processes that won't die or I get a fence event with no apparent log entry leading to it. -alan On Mon, 1 Nov 2004, Christopher R. Hertel wrote: > On Mon, Nov 01, 2004 at 12:30:47PM -0800, Alan Wood wrote: >> I am running a cluster with GFS-formatted file systems mounted on multiple >> nodes. What I was hoping to do was to set up one node running httpd to be >> my webserver and another node running samba to share the same data >> internally. >> What I am getting when running that is instability. > > Yeah. This is a known problem. The reason is that Samba must maintain a > great deal of metadata internally. This works well enough with multiple > Samba processes running on a single machine dealing (more or less) > directly with the filesystem. > > The problem is that Samba must keep track of translations between Posix > and Windows metadata, locking semantics, file sharing mode semantics, etc. > > I had assumed that this would only be a problem if Samba was running on > multiple machines all GFS-sharing the same back-end block storage. Your > report suggests that there's more to the interaction between Samba and GFS > than I had anticipated. Interesting... > >> The samba serving node >> keeps crashing. I have heartbeat set up so that failover happens to the >> webserver node, at which point the system apparently behaves well. > > Which kind of failover? Do you start Samba on the webserver node? It > would be interesting to know if the two run well together on the same > node, but fail on separate nodes. > >> After reading a few articles on the list it seemed to me that the problem >> might be samba using oplocks or some other caching mechanism that breaks >> synchronization. > > Yeah... that was my next question... > >> I tried turning oplocks=off in my smb.conf file, but that >> made the system unusably slow (over 3 minutes to right-click on a two-meg >> file). > > Curious. > > ...but did it fix the other problems? > > I'd really love to work with someone to figure all this out. (Hint hint.) > :) > >> I am also not sure that is the extent of the problem, as I seem to be able >> to re-create the crash simply by accessing the same file on multiple >> clients just via samba (which locking should be able to handle). > > Should be... > >> If the >> problem were merely that the remote node and the samba node were both >> accessing an oplocked file I could understand, but that doesn't always seem >> to be the case. > > There's more here than I can figure out just from the description. It'd > take some digging along-side someone who knows GFS. > >> has anyone had any success running the same type of setup? I am also >> serving nfs on the samba server, though with very little load there. > > Is there any overlap in the files they're serving? > >> below is the syslog output of a crash. I'm running 2.6.8-1.521smp with a >> GFS CVS dump from mid-september. >> -alan > > Wish I could be more help... 
> > Chris -)----- > > From pcaulfie at redhat.com Tue Nov 16 13:56:40 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Tue, 16 Nov 2004 13:56:40 +0000 Subject: [Linux-cluster] Any news about broken broadcast on 2.6.9 ? In-Reply-To: <20041105090257.GC27851@tykepenguin.com> References: <200411041434.02469.agauthier@realmedia.com> <32F476A9-2E7A-11D9-9CA7-000A957BB1F6@redhat.com> <200411050955.33081.agauthier@realmedia.com> <20041105090257.GC27851@tykepenguin.com> Message-ID: <20041116135640.GB645@tykepenguin.com> There does seem to be a problem with broadcast on some kernels - I can confirm (yes, you already knew that but I've seen it for myself now!) If you try to create a lockspace (eg using one of the dlm/tests/usertest programs) and the thing hangs with SM: send_broadcast_message error -13 SM: send_broadcast_message error -13 SM: send_broadcast_message error -13 on the console, then I have what you're seeing. I'm working on it... -- patrick From crh at ubiqx.mn.org Tue Nov 16 19:06:29 2004 From: crh at ubiqx.mn.org (Christopher R. Hertel) Date: Tue, 16 Nov 2004 13:06:29 -0600 Subject: [Linux-cluster] samba on top of GFS In-Reply-To: References: <20041101170055.464B07313C@hormel.redhat.com> <20041101205335.GE5409@Favog.ubiqx.mn.org> Message-ID: <20041116190629.GB29115@Favog.ubiqx.mn.org> On Mon, Nov 15, 2004 at 11:59:49PM -0800, Alan Wood wrote: : > anyway, > my system details: > I'm still running samba 3.0.7 on top of kernel 2.6.8-1.521 Also go ahead and upgrade to Samba 3.0.8 for me. Just so we're on the same page... > I tried updating to the newest CVS releases but ran into compile errors and > haven't had time to try again. so the GFS build is still mid-september. > I'd like to try the fixes that Patrick and David posted but think I am > going to try to compile cleanly with a 2.6.9 kernel. > > error details: > I tried turning the loglevel up to 3. I get the following fairly often: (Reformatted to Samba's normal log style...) Error writing 5 bytes to client. -1. (Connection reset by peer) [2004/11/15 17:10:32, 0] lib/util_sock.c:send_smb(647) write_socket: Error writing 5 bytes to socket 24: ERRNO = Connection reset by peer [2004/11/15 17:10:32, 0] lib/util_sock.c:write_socket(455) write_socket_data: write failure. Error = Connection reset by peer [2004/11/15 17:10:32, 0] lib/util_sock.c:write_socket_data(430) Samba has it's own logfile format. Each error message starts with the timestamp followed by the file:function name(line number) at which the message was generated. The text following this header is the actual message. Yeah, it get's kind of messy when you log to syslog. I suggest you set "debug timestamp = no" in your smb.conf. The above errors (on the surface, at least) talk about Samba's smbd daemon being unable to send a message to the client. That confuses me. It seems as though the client has dropped the TCP connection for some reason. ...either that, or there's some other problem related to the network I/O. This is all very superficial. I think debugging this from the GFS side first makes the most sense. Chris -)----- -- "Implementing CIFS - the Common Internet FileSystem" ISBN: 013047116X Samba Team -- http://www.samba.org/ -)----- Christopher R. Hertel jCIFS Team -- http://jcifs.samba.org/ -)----- ubiqx development, uninq. 
ubiqx Team -- http://www.ubiqx.org/ -)----- crh at ubiqx.mn.org OnLineBook -- http://ubiqx.org/cifs/ -)----- crh at ubiqx.org From bmarzins at redhat.com Tue Nov 16 20:24:18 2004 From: bmarzins at redhat.com (Benjamin Marzinski) Date: Tue, 16 Nov 2004 14:24:18 -0600 Subject: [Linux-cluster] Re: patches. In-Reply-To: <200411161434.13636.phillips@redhat.com> References: <20041112225514.GI11111@phlogiston.msp.redhat.com> <200411160229.35386.phillips@redhat.com> <20041116162412.GC2554@phlogiston.msp.redhat.com> <200411161434.13636.phillips@redhat.com> Message-ID: <20041116202418.GD2554@phlogiston.msp.redhat.com> > > > > Once another client with the same id logs in, all > > > > the zombie's locks are dropped, and it is finally cleaned up. > > > > > > This reconnecting client is the owner of the locks, I hope you > > > mean, and it will soon upload a replacement, possibly smaller, set > > > of locks. > > > > When the new client uploaded its locks, it already got the locks it > > needed, so both the new client and the zombie client were on the > > holdlist. When the new client is finished logging in, the zombie > > client gets removed from the holdlist of all it's locks. That's what > > I meant. > > So just to confirm, are the new client and zombie client actually the > same client, just reconnecting? Whenever a client disconnects while holding locks, it becomes a zombie. When a new snapshot client completes logging in it will already have acquired all the locks it needs. Once it finishes logging in, it checks the zombie clients. If there is a zombie client with the same id, it removes the zombie client. So in the case were a client disconnects from the server while holding locks, and reconnects to the same server, there will be a zombie, and it will be from the client's previous connection. If a client looses connection with a server and reconnects to a new server, (because presumably, the old server went down), there will not be any zombie clients. > > > > Appropriate Zombies are also cleaned up when the server gets a > > > > REMOVE_CLIENT_IDS request. > > > > > > This would instead happen when the agent logs the client out. > > > > Um... I'm missing something here. Say the other machine crashes > > while one of it's clients is hold locks. Then the client becomes a > > zombie, but there is no agent to remove it. To solve this, the agent > > on the master server node would not just need to keep a list of > > client to wait for, but would have to always keep a list of all > > clients that are connected to the server. Then it could simply log > > the clients out for the defunct agent. This would also necessitate > > that the server contacts the its agent after each client is > > completely logged out. This method would work. And I can do this if > > that's what you want. But something needs to be done for this case. > > Good observation, the master agent has to keep its list of snapshot > clients around forever, all right. Also, if an agent has to log out a > disconnected client, it's cleaner to forward the event to the master > agent and have the master agent do it, otherwise there is a risk of > logouts arriving at the server from two different places (not actually > destructive, but untidy). The non-master agent has to forward the > client disconnection anyway, in order to implement recovery. Does this > make sense to you? Yes. > If it's done that way, then the server doesn't have to tell its agent > about logouts. This part I don't get. 
What you said is true, but it's got some odd implications. One, the list that the agent stores will contain every client that ever has connected to the server from each node. And Two, if another agent dies, the master's agent will send log offs for all of these clients, regardless of whether or not they are currently logged in. There is one annoying issue with this. It doesn't let us preallocate a fixed size client list. If the agent only keeps track of the clients that are actually logged into the master server, the agent just needs a list the same size as the server's client list, which is currently a static size. If the agent doesn't get logout messages, the list could very well be bigger than the maximum number of clients that could be logged into the master. That's my only issue. > > > A P.S. here, I just looked over the agent list rebuilding code, and > > > my race detector is beeping full on. I'll have a ponder before > > > offering specifics though. > > > > That's a question I have. The ast() function gets called by > > dlm_dispatch(). right? If so, I don't see the race. If not, there is > > one hell of dangerous race. If the agent_list is changing when agent > > is trying to contact the other agents, bad things will most likely > > happen. > > If the list changes and we don't know about the changes while waiting to > get answers back from other agents, we're dead in the water.
So the > > recovery algorithm must handle membership changes that happen in > > parallel. After much pondering, I think I've got a reasonably simple > > algorithm, I'll write it up now. > > Um... but since we wait for agent responses in the same poll loop that we wait > for membership change notifications, these two things already do happen in > parallel... well... mostly. The only issue I see is that we could get the event > from magma, and then block trying to get the member_list. But since > that's a local call, if that's hangs forever, then cman is in trouble, > and there isn't much we can do anyway. But there is no chance of not getting > a membership change because we are waiting on a agent response. Just to clarify. The issue that I had earlier mentioned is this: If the ast() code and the rebuild_agent_list() code executed at the same time, which I don't believe they can, they are both using the same data structures, and could muck each other up. > > Well, we have for sure gotten to the interesting part of this, how about > > we continue in linux-cluster? > > > > Sure. But I'm not sure if anyone else is interested in implementation details. > > > Regards, > > > > Daniel > > -Ben > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster From jn at it.swin.edu.au Tue Nov 16 22:37:47 2004 From: jn at it.swin.edu.au (John Newbigin) Date: Wed, 17 Nov 2004 09:37:47 +1100 Subject: [Linux-cluster] Grow pool without adding subpools Message-ID: <419A813B.5070906@it.swin.edu.au> I am using the RedHat GFS 6.0.0-15. I have a pool which is on a Compaq Smart Array 5i. I have added a new disk to the array and expanded the array. Is there a way I can tell the pool to grow to fill the new space? I though pool-tool -g would do it but it seems that can only add subpools. And on a related note, can I add more journals without increasing the size of the pool? Thanks John. From dwaquilina at gmail.com Tue Nov 16 22:50:38 2004 From: dwaquilina at gmail.com (David Aquilina) Date: Tue, 16 Nov 2004 17:50:38 -0500 Subject: [Linux-cluster] Grow pool without adding subpools In-Reply-To: <419A813B.5070906@it.swin.edu.au> References: <419A813B.5070906@it.swin.edu.au> Message-ID: On Wed, 17 Nov 2004 09:37:47 +1100, John Newbigin wrote: > I though pool-tool -g would do it but it seems that can only add subpools. Is there any particular reason you don't want to add a subpool? As far as I know, adding subpools is the only way to grow a pool, and won't have any other effect than a slightly longer pool configuration file... -- David Aquilina, RHCE dwaquilina at gmail.com From jbrassow at redhat.com Tue Nov 16 23:28:20 2004 From: jbrassow at redhat.com (Jonathan E Brassow) Date: Tue, 16 Nov 2004 17:28:20 -0600 Subject: [Linux-cluster] Grow pool without adding subpools In-Reply-To: References: <419A813B.5070906@it.swin.edu.au> Message-ID: <33D0D870-3827-11D9-B1C4-000A957BB1F6@redhat.com> Usually, ppl want to do this because their storage grows without adding an additional device to the system. I believe the simplest way to do this is to bring your cluster down. It is possible to do this one machine at a time - leaving the cluster as a whole running, but the instructions are more difficult. Assuming you are bringing down your cluster, do: 1. unmount gfs from all machines 2. unassemble the pool device from all machines 3. rewrite the pool labels for the pool in question from _one_ machine 4. reassemble the pool device on all machines 5. remount gfs 6. 
choose to add additional journals, or add space to the file system Basically, you are forcibly growing the size of the pool (offline). Then you are making your adjustments to gfs (online). Clearly, it is easier if the array simply makes another device available to the system - allowing an additional subpool to be added to the pool. This way, everything can be done online. brassow On Nov 16, 2004, at 4:50 PM, David Aquilina wrote: > On Wed, 17 Nov 2004 09:37:47 +1100, John Newbigin > wrote: >> I though pool-tool -g would do it but it seems that can only add >> subpools. > > Is there any particular reason you don't want to add a subpool? As far > as I know, adding subpools is the only way to grow a pool, and won't > have any other effect than a slightly longer pool configuration > file... > > -- > David Aquilina, RHCE > dwaquilina at gmail.com > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > From jn at it.swin.edu.au Tue Nov 16 23:42:50 2004 From: jn at it.swin.edu.au (John Newbigin) Date: Wed, 17 Nov 2004 10:42:50 +1100 Subject: [Linux-cluster] Grow pool without adding subpools In-Reply-To: References: <419A813B.5070906@it.swin.edu.au> Message-ID: <419A907A.20104@it.swin.edu.au> David Aquilina wrote: > On Wed, 17 Nov 2004 09:37:47 +1100, John Newbigin wrote: > >>I though pool-tool -g would do it but it seems that can only add subpools. > > > Is there any particular reason you don't want to add a subpool? As far > as I know, adding subpools is the only way to grow a pool, and won't > have any other effect than a slightly longer pool configuration > file... > I am using hardware raid. There is just one device which represents the array (/dev/cciss/c0d1). The size of this device has grown but I need to grow the pool to fill it. Perhaps there is a better way. I am still testing so I can recreate it differently if necessary. John. From pcaulfie at redhat.com Wed Nov 17 08:32:36 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Wed, 17 Nov 2004 08:32:36 +0000 Subject: [Linux-cluster] Any news about broken broadcast on 2.6.9 ? In-Reply-To: <20041116135640.GB645@tykepenguin.com> References: <200411041434.02469.agauthier@realmedia.com> <32F476A9-2E7A-11D9-9CA7-000A957BB1F6@redhat.com> <200411050955.33081.agauthier@realmedia.com> <20041105090257.GC27851@tykepenguin.com> <20041116135640.GB645@tykepenguin.com> Message-ID: <20041117083235.GA19299@tykepenguin.com> Now fixed, sorry it took so long. -- patrick From ben.m.cahill at intel.com Wed Nov 17 17:25:59 2004 From: ben.m.cahill at intel.com (Cahill, Ben M) Date: Wed, 17 Nov 2004 09:25:59 -0800 Subject: [Linux-cluster] [PATCH] Yet more comments Message-ID: <0604335B7764D141945E202153105960033E269F@orsmsx404.amr.corp.intel.com> Hi all, FYI, I'm sending a patch to Ken Preslan with more GFS comments. Not attaching here because it's a little large. -- Ben -- Opinions are mine, not Intel's From jn at it.swin.edu.au Thu Nov 18 05:23:30 2004 From: jn at it.swin.edu.au (John Newbigin) Date: Thu, 18 Nov 2004 16:23:30 +1100 Subject: [Linux-cluster] Grow pool without adding subpools In-Reply-To: <419A907A.20104@it.swin.edu.au> References: <419A813B.5070906@it.swin.edu.au> <419A907A.20104@it.swin.edu.au> Message-ID: <419C31D2.4090807@it.swin.edu.au> Is is possible/useful to create pools on top of LVM on top of hardware raid? Would that help in this situation? Does anyone use gfs on hardware raid? John. 
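For anyone retracing this thread later, the two growth paths described above boil down to roughly the following sketch. The device, config file and mount point names here are made up, and the exact pool_tool invocation for rewriting the labels should be checked against pool_tool(8) for your GFS release; this is an outline, not a tested recipe.

  # Online path: the array presents a *new* device, added as a subpool
  vi pool0.cfg                     # add a subpool entry for the new device
  pool_tool -g pool0.cfg           # grow the pool with the new subpool
  gfs_grow /mnt/gfs                # grow the mounted filesystem into the new space
  gfs_jadd -j 2 /mnt/gfs           # or add journals instead of data space

  # Offline path (a single device that simply got bigger, steps 1-6 above)
  umount /mnt/gfs                  # on every node
  pool_assemble -r                 # unassemble the pool on every node
  # ... rewrite the pool labels from one node only (step 3 above) ...
  pool_assemble -a                 # reassemble on every node
  mount -t gfs /dev/pool/pool0 /mnt/gfs
  gfs_grow /mnt/gfs                # and/or gfs_jadd once it is back online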
John Newbigin wrote: > David Aquilina wrote: > >> On Wed, 17 Nov 2004 09:37:47 +1100, John Newbigin >> wrote: >> >>> I though pool-tool -g would do it but it seems that can only add >>> subpools. >> >> >> >> Is there any particular reason you don't want to add a subpool? As far >> as I know, adding subpools is the only way to grow a pool, and won't >> have any other effect than a slightly longer pool configuration >> file... >> > I am using hardware raid. There is just one device which represents the > array (/dev/cciss/c0d1). The size of this device has grown but I need > to grow the pool to fill it. > > Perhaps there is a better way. I am still testing so I can recreate it > differently if necessary. > > John. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > > > -- John Newbigin Computer Systems Officer Faculty of Information and Communication Technologies Swinburne University of Technology Melbourne, Australia http://www.it.swin.edu.au/staff/jnewbigin From tom at regio.net Thu Nov 18 07:48:22 2004 From: tom at regio.net (tom at regio.net) Date: Thu, 18 Nov 2004 08:48:22 +0100 Subject: [Linux-cluster] Problems on ast.o Message-ID: Hi there, im having a little problem to compiling gfs..... i ran in to the following problem : Using the newest cvs version. On Kernel 2.6.9 with SuSE 9.0 (just for testing) make[2]: Entering directory `/tmp/gfs/cluster/dlm-kernel/src' if [ ! -e cluster ]; then ln -s . cluster; fi if [ ! -e service.h ]; then cp /tmp/gfs/cluster/build/incdir/cluster/service.h .; fi if [ ! -e cnxman.h ]; then cp /tmp/gfs/cluster/build/incdir/cluster/cnxman.h .; fi if [ ! -e cnxman-socket.h ]; then cp /tmp/gfs/cluster/build/incdir/cluster/cnxman-socket.h .; fi make -C /usr/src/linux-2.6 M=/tmp/gfs/cluster/dlm-kernel/src modules USING_KBUILD=yes make[3]: Entering directory `/usr/src/linux-2.6.9' CC [M] /tmp/gfs/cluster/dlm-kernel/src/ast.o /tmp/gfs/cluster/dlm-kernel/src/ast.c: In function `queue_ast': /tmp/gfs/cluster/dlm-kernel/src/ast.c:163: error: `DLM_SBF_VALNOTVALID' undeclared (first use in this function) /tmp/gfs/cluster/dlm-kernel/src/ast.c:163: error: (Each undeclared identifier is reported only once /tmp/gfs/cluster/dlm-kernel/src/ast.c:163: error: for each function it appears in.) make[4]: *** [/tmp/gfs/cluster/dlm-kernel/src/ast.o] Error 1 make[3]: *** [_module_/tmp/gfs/cluster/dlm-kernel/src] Error 2 make[3]: Leaving directory `/usr/src/linux-2.6.9' make[2]: *** [all] Error 2 make[2]: Leaving directory `/tmp/gfs/cluster/dlm-kernel/src' make[1]: *** [install] Error 2 make[1]: Leaving directory `/tmp/gfs/cluster/dlm-kernel' make: *** [all] Error 2 can someone give me a helping hand on that? ;) many thanx! 
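As the reply below notes, this usually means the build is picking up a stale dlm.h that predates DLM_SBF_VALNOTVALID rather than the one from CVS. A quick way to spot a stale copy; the paths are only the usual suspects, adjust them for your tree:

  # list candidate headers, then flag any that lack the new symbol
  find /usr/include/cluster /usr/src/linux-2.6.9/include -name dlm.h 2>/dev/null
  grep -L DLM_SBF_VALNOTVALID /usr/include/cluster/dlm.h \
      /usr/src/linux-2.6.9/include/cluster/dlm.h 2>/dev/null

Any file that grep -L prints is missing the symbol and should be removed or replaced before rebuilding.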
-tom From pcaulfie at redhat.com Thu Nov 18 08:52:34 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Thu, 18 Nov 2004 08:52:34 +0000 Subject: [Linux-cluster] Problems on ast.o In-Reply-To: References: Message-ID: <20041118085234.GA23843@tykepenguin.com> On Thu, Nov 18, 2004 at 08:48:22AM +0100, tom at regio.net wrote: > make[3]: Entering directory `/usr/src/linux-2.6.9' > CC [M] /tmp/gfs/cluster/dlm-kernel/src/ast.o > /tmp/gfs/cluster/dlm-kernel/src/ast.c: In function `queue_ast': > /tmp/gfs/cluster/dlm-kernel/src/ast.c:163: error: `DLM_SBF_VALNOTVALID' > undeclared (first use in this function) Looks like you might have an old version of dlm.h in either /usr/include/cluster or ${kernel_src}/include/linux/cluster -- patrick From agauthier at realmedia.com Thu Nov 18 09:26:19 2004 From: agauthier at realmedia.com (Arnaud Gauthier) Date: Thu, 18 Nov 2004 10:26:19 +0100 Subject: [Linux-cluster] Any news about broken broadcast on 2.6.9 ? In-Reply-To: <20041117083235.GA19299@tykepenguin.com> References: <200411041434.02469.agauthier@realmedia.com> <20041116135640.GB645@tykepenguin.com> <20041117083235.GA19299@tykepenguin.com> Message-ID: <200411181026.20196.agauthier@realmedia.com> Le mercredi, 17 Novembre 2004 09.32, Patrick Caulfield a ?crit?: > Now fixed, sorry it took so long. Thank's ! I found that it was working with Mandrake 10 or RHEL distributions, but not with RH7.3 ? Maybe a glib compatibility problem ? Regards, Arnaud -- Arnaud Gauthier 247 Real Media From pcaulfie at redhat.com Thu Nov 18 09:52:51 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Thu, 18 Nov 2004 09:52:51 +0000 Subject: [Linux-cluster] Any news about broken broadcast on 2.6.9 ? In-Reply-To: <200411181026.20196.agauthier@realmedia.com> References: <200411041434.02469.agauthier@realmedia.com> <20041116135640.GB645@tykepenguin.com> <20041117083235.GA19299@tykepenguin.com> <200411181026.20196.agauthier@realmedia.com> Message-ID: <20041118095251.GD23843@tykepenguin.com> On Thu, Nov 18, 2004 at 10:26:19AM +0100, Arnaud Gauthier wrote: > Le mercredi, 17 Novembre 2004 09.32, Patrick Caulfield a ?crit?: > > Now fixed, sorry it took so long. > > Thank's ! > > I found that it was working with Mandrake 10 or RHEL distributions, but not > with RH7.3 ? Maybe a glib compatibility problem ? > No, it was just a stupid uninitialised variable :( When it was randomly non-zero, things worked fine. 
-- patrick From tom at regio.net Thu Nov 18 10:04:08 2004 From: tom at regio.net (tom at regio.net) Date: Thu, 18 Nov 2004 11:04:08 +0100 Subject: [Linux-cluster] Problems on ast.o In-Reply-To: <20041118085234.GA23843@tykepenguin.com> Message-ID: Hi, i removed everything on /usr/src/linux-2.6/include/cluster now im getting stuck here ;(( gcc -o clurgmgrd rg_thread.o rg_locks.o main.o groups.o rg_state.o rg_queue.o members.o rg_forward.o reslist.o resrules.o restree.o fo_domain.o -I ../../include -DSHAREDIR=\"/usr/share/cluster\" -Wall -I ../../include -g -I/tmp/gfs/cluster/build/incdir -I/usr/include/libxml2 -L/tmp/gfs/cluster/build/lib -g -Werror -Wstrict-prototypes -Wshadow -fPIC -D_GNU_SOURCE -L ../clulib -lclulib -lxml2 -lmagmamsg -lmagma -lpthread -ldl -lccs ../clulib/libclulib.a(gettid.o)(.text+0x24): In function `gettid': /tmp/gfs/cluster/rgmanager/src/clulib/gettid.c:5: undefined reference to `errno' collect2: ld returned 1 exit status make[3]: *** [clurgmgrd] Error 1 make[3]: Leaving directory `/tmp/gfs/cluster/rgmanager/src/daemons' make[2]: *** [install] Error 2 make[2]: Leaving directory `/tmp/gfs/cluster/rgmanager/src' make[1]: *** [install] Error 2 make[1]: Leaving directory `/tmp/gfs/cluster/rgmanager' make: *** [all] Error 2 -tom Patrick Caulfield To Sent by: linux clistering linux-cluster-bou nces at redhat.com cc Subject 18.11.2004 09:52 Re: [Linux-cluster] Problems on ast.o Please respond to linux clistering On Thu, Nov 18, 2004 at 08:48:22AM +0100, tom at regio.net wrote: > make[3]: Entering directory `/usr/src/linux-2.6.9' > CC [M] /tmp/gfs/cluster/dlm-kernel/src/ast.o > /tmp/gfs/cluster/dlm-kernel/src/ast.c: In function `queue_ast': > /tmp/gfs/cluster/dlm-kernel/src/ast.c:163: error: `DLM_SBF_VALNOTVALID' > undeclared (first use in this function) Looks like you might have an old version of dlm.h in either /usr/include/cluster or ${kernel_src}/include/linux/cluster -- patrick -- Linux-cluster mailing list Linux-cluster at redhat.com http://www.redhat.com/mailman/listinfo/linux-cluster From pcaulfie at redhat.com Thu Nov 18 10:16:10 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Thu, 18 Nov 2004 10:16:10 +0000 Subject: [Linux-cluster] Problems on ast.o In-Reply-To: References: <20041118085234.GA23843@tykepenguin.com> Message-ID: <20041118101610.GE23843@tykepenguin.com> On Thu, Nov 18, 2004 at 11:04:08AM +0100, tom at regio.net wrote: > > > > > Hi, > > i removed everything on /usr/src/linux-2.6/include/cluster now im getting > stuck here ;(( > > gcc -o clurgmgrd rg_thread.o rg_locks.o main.o groups.o rg_state.o > rg_queue.o members.o rg_forward.o reslist.o resrules.o restree.o > fo_domain.o -I ../../include -DSHAREDIR=\"/usr/share/cluster\" -Wall -I > ../../include -g -I/tmp/gfs/cluster/build/incdir -I/usr/include/libxml2 > -L/tmp/gfs/cluster/build/lib -g -Werror -Wstrict-prototypes -Wshadow -fPIC > -D_GNU_SOURCE -L ../clulib -lclulib -lxml2 -lmagmamsg -lmagma -lpthread > -ldl -lccs > ../clulib/libclulib.a(gettid.o)(.text+0x24): In function `gettid': > /tmp/gfs/cluster/rgmanager/src/clulib/gettid.c:5: undefined reference to > `errno' I'm not sure about that, you might have to wait till the yanks wake up. You could just comment out rgmanager from the Makefile for the moment, if you're not using it (which most people aren't at the moment). 
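For what it's worth, a common cause of an undefined reference to errno at link time is a source file declaring errno itself (for example "extern int errno") instead of including <errno.h>; with recent glibc, errno is a per-thread macro rather than a plain int symbol. Whether that is actually what gettid.c does in this tree is an assumption, but it is quick to check:

  grep -n errno /tmp/gfs/cluster/rgmanager/src/clulib/gettid.c

If it does declare errno directly, replacing that declaration with #include <errno.h> is the portable fix.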
-- patrick From sdake at mvista.com Wed Nov 17 19:18:27 2004 From: sdake at mvista.com (Steven Dake) Date: Wed, 17 Nov 2004 12:18:27 -0700 Subject: [Linux-cluster] Old 2.4, non-nptl kernel support In-Reply-To: <4198EBE3.1050601@vitalstream.com> References: <4198EBE3.1050601@vitalstream.com> Message-ID: <1100719105.28551.7.camel@persist.az.mvista.com> redhat 9 has nptl support the rest I don't know. good luck -steve On Mon, 2004-11-15 at 10:48, Rick Stevens wrote: > This is probably a silly question, but I've got the CVS version of GFS > running on an FC3 system with a JetStor dual-host SCSI RAID array. It > runs very nicely, indeed. > > However, I do have an application where the vendor has not yet rebuilt > on a POSIX-thread compliant kernel--much less 2.6.9. I can probably > induce them to have a run at it, but they are in the throes of a release > of a different product and can't spend a lot of time on it at this > exact moment. > > I was wondering, is there code buried in the archives somewhere that > would work with a 2.4, non-nptl kernel such as one would find on RH9? > Tarballs would be fine, I'm not afraid of building stuff in the least. > ---------------------------------------------------------------------- > - Rick Stevens, Senior Systems Engineer rstevens at vitalstream.com - > - VitalStream, Inc. http://www.vitalstream.com - > - - > - I.R.S.: We've got what it takes to take what you've got! - > ---------------------------------------------------------------------- > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster From chicagoboy12001 at yahoo.com Thu Nov 18 21:15:40 2004 From: chicagoboy12001 at yahoo.com (Chicago Boy) Date: Thu, 18 Nov 2004 13:15:40 -0800 (PST) Subject: [Linux-cluster] unable to mount gfs partition Message-ID: <20041118211540.88664.qmail@web50404.mail.yahoo.com> Hello All, I am experiencing the same problem as described in this thread: https://www.redhat.com/archives/linux-cluster/2004-July/msg00145.html The cluster is setup with two nodes. One one of the nodes whichever runs lock_gulmd is able to mount GFS file systems. Initially I didn't run lock_gulmd on the second node which is not mater but later after reading this posts I started it and got the same error message: lock_gulmd[18399]: You are running in Standard mode. lock_gulmd[18399]: I am (clu2.abc.com) with ip (192.168.11.212) lock_gulmd[18399]: Forked core [18400]. lock_gulmd_core[18400]: ERROR [core_io.c:1029] Got error from reply: (clu1:192. 168.11.211) 1006:Not Allowed Has anyone solved this problem before. Please help!! Thanks, SD __________________________________ Do you Yahoo!? The all-new My Yahoo! - Get yours free! http://my.yahoo.com From mtilstra at redhat.com Thu Nov 18 21:21:51 2004 From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra) Date: Thu, 18 Nov 2004 15:21:51 -0600 Subject: [Linux-cluster] unable to mount gfs partition In-Reply-To: <20041118211540.88664.qmail@web50404.mail.yahoo.com> References: <20041118211540.88664.qmail@web50404.mail.yahoo.com> Message-ID: <20041118212151.GA1438@redhat.com> On Thu, Nov 18, 2004 at 01:15:40PM -0800, Chicago Boy wrote: > lock_gulmd[18399]: You are running in Standard mode. > lock_gulmd[18399]: I am (clu2.abc.com) with ip > (192.168.11.212) > lock_gulmd[18399]: Forked core [18400]. > lock_gulmd_core[18400]: ERROR [core_io.c:1029] Got > error from reply: (clu1:192. > 168.11.211) 1006:Not Allowed Typically means that the node is pending on a fencing action. 
(the node that is trying to login is expired, and cannot login until fencing completes.) run `gulm_tool nodelist localhost` on the first node to verify this. you'll then need to figure out why the fencing action didn't work. -- Michael Conrad Tadpol Tilstra Drive defensively. Buy a tank. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From owen at isrl.uiuc.edu Thu Nov 18 21:31:46 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Thu, 18 Nov 2004 15:31:46 -0600 Subject: [Linux-cluster] 'df' not accurate? Message-ID: <20041118213145.GI6491@iwork57.lis.uiuc.edu> Hi all, I have a cluster with 5 computers on board. I have a GFS partition that's about 1T. When I run 'df', it shows that I've used 81G. When I run a 'du -s' on a subtree of the filesystem, I get ~250G. I would expect that the total in-use storage size would be about 300G. The quotas seem to add up to about 300G as well. I manually created a 1M file, and the usage went up by 1028 blocks. I'm using lock_dlm, RedHat 9, 2.6.9 kernel, all patches from CVS on Nov 11. Anyone else seen anything like this? I don't have a few days/weeks to fsck the system. -- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> From kpreslan at redhat.com Thu Nov 18 21:51:58 2004 From: kpreslan at redhat.com (Ken Preslan) Date: Thu, 18 Nov 2004 15:51:58 -0600 Subject: [Linux-cluster] 'df' not accurate? In-Reply-To: <20041118213145.GI6491@iwork57.lis.uiuc.edu> References: <20041118213145.GI6491@iwork57.lis.uiuc.edu> Message-ID: <20041118215158.GA20992@potassium.msp.redhat.com> On Thu, Nov 18, 2004 at 03:31:46PM -0600, Brynnen R Owen wrote: > I have a cluster with 5 computers on board. I have a GFS partition > that's about 1T. When I run 'df', it shows that I've used 81G. When > I run a 'du -s' on a subtree of the filesystem, I get ~250G. I would > expect that the total in-use storage size would be about 300G. The > quotas seem to add up to about 300G as well. I manually created a 1M > file, and the usage went up by 1028 blocks. > > I'm using lock_dlm, RedHat 9, 2.6.9 kernel, all patches from CVS on > Nov 11. Anyone else seen anything like this? I don't have a few > days/weeks to fsck the system. If you do a "gfs_tool shrink /mountpoint" and then rerun df, does the value change? -- Ken Preslan From owen at isrl.uiuc.edu Thu Nov 18 22:02:49 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Thu, 18 Nov 2004 16:02:49 -0600 Subject: [Linux-cluster] 'df' not accurate? In-Reply-To: <20041118215158.GA20992@potassium.msp.redhat.com> References: <20041118213145.GI6491@iwork57.lis.uiuc.edu> <20041118215158.GA20992@potassium.msp.redhat.com> Message-ID: <20041118220249.GK6491@iwork57.lis.uiuc.edu> No change. All nodes in the cluster appear to agree on the wrong size as well. On Thu, Nov 18, 2004 at 03:51:58PM -0600, Ken Preslan wrote: > On Thu, Nov 18, 2004 at 03:31:46PM -0600, Brynnen R Owen wrote: > > I have a cluster with 5 computers on board. I have a GFS partition > > that's about 1T. When I run 'df', it shows that I've used 81G. When > > I run a 'du -s' on a subtree of the filesystem, I get ~250G. I would > > expect that the total in-use storage size would be about 300G. The > > quotas seem to add up to about 300G as well. 
I manually created a 1M > > file, and the usage went up by 1028 blocks. > > > > I'm using lock_dlm, RedHat 9, 2.6.9 kernel, all patches from CVS on > > Nov 11. Anyone else seen anything like this? I don't have a few > > days/weeks to fsck the system. > > If you do a "gfs_tool shrink /mountpoint" and then rerun df, does the > value change? > > -- > Ken Preslan > -- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> From chicagoboy12001 at yahoo.com Thu Nov 18 22:36:19 2004 From: chicagoboy12001 at yahoo.com (Chicago Boy) Date: Thu, 18 Nov 2004 14:36:19 -0800 (PST) Subject: [Linux-cluster] unable to mount gfs partition In-Reply-To: <20041118212151.GA1438@redhat.com> Message-ID: <20041118223619.91000.qmail@web50403.mail.yahoo.com> Thank you very much for replying to my post! The fencing method is set to manual. Is there a way to check what nodes are waiting for fencing to complete. I didn't notice any such messages in /var/log/message When I run `gulm_tool nodelist localhost` on the node lock server(node1) is running it printed node1 information. It didn't indicate that node2 is pending for fencing. Also, I tried fence_manual -s node2 earlier. The command just hung up without returning to prompt. I terminated with ctrl-c. Name: node2 ip = 127.0.0.1 state = Logged in mode = Master missed beats = 0 last beat = 1100816791202103 delay avg = 10014958 max delay = 10020006 I read in another post about loopback address problem: https://www.redhat.com/archives/linux-cluster/2004-August/msg00004.html It looks like node1 that is running lock_server picked up 127.0.0.1. Could this be the problem? Thanks again for your help! SD --- Michael Conrad Tadpol Tilstra wrote: > On Thu, Nov 18, 2004 at 01:15:40PM -0800, Chicago > Boy wrote: > > lock_gulmd[18399]: You are running in Standard > mode. > > lock_gulmd[18399]: I am (clu2.abc.com) with ip > > (192.168.11.212) > > lock_gulmd[18399]: Forked core [18400]. > > lock_gulmd_core[18400]: ERROR [core_io.c:1029] Got > > error from reply: (clu1:192. > > 168.11.211) 1006:Not Allowed > > Typically means that the node is pending on a > fencing action. (the node > that is trying to login is expired, and cannot login > until fencing > completes.) > > run `gulm_tool nodelist localhost` on the first node > to verify this. > > you'll then need to figure out why the fencing > action didn't work. > -- > Michael Conrad Tadpol Tilstra > Drive defensively. Buy a tank. > > ATTACHMENT part 1.2 application/pgp-signature > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster __________________________________________________ Do You Yahoo!? Tired of spam? Yahoo! Mail has the best spam protection around http://mail.yahoo.com From mtilstra at redhat.com Thu Nov 18 22:42:59 2004 From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra) Date: Thu, 18 Nov 2004 16:42:59 -0600 Subject: [Linux-cluster] unable to mount gfs partition In-Reply-To: <20041118223619.91000.qmail@web50403.mail.yahoo.com> References: <20041118212151.GA1438@redhat.com> <20041118223619.91000.qmail@web50403.mail.yahoo.com> Message-ID: <20041118224259.GA4849@redhat.com> On Thu, Nov 18, 2004 at 02:36:19PM -0800, Chicago Boy wrote: > It looks like node1 that is running lock_server picked > up 127.0.0.1. Could this be the problem? yeah, that would be exatly your problem. 
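For reference, the broken and fixed /etc/hosts patterns look roughly like this; the name and address are borrowed from earlier in the thread, so treat them as illustrative:

  # broken: the node's own name sits on the loopback line,
  # so the node resolves its own name to 127.0.0.1
  127.0.0.1   localhost localhost.localdomain clu1 clu1.abc.com

  # fixed: loopback stays generic, the node name maps to its real address
  127.0.0.1       localhost localhost.localdomain
  192.168.11.211  clu1.abc.com clu1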
Go fix /etc/hosts (remove the node's name from teh loopback line.) -- Michael Conrad Tadpol Tilstra God is the tangential point between zero and infinity. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From chicagoboy12001 at yahoo.com Thu Nov 18 23:14:56 2004 From: chicagoboy12001 at yahoo.com (Chicago Boy) Date: Thu, 18 Nov 2004 15:14:56 -0800 (PST) Subject: [Linux-cluster] unable to mount gfs partition In-Reply-To: <20041118224259.GA4849@redhat.com> Message-ID: <20041118231456.63154.qmail@web50409.mail.yahoo.com> Thanks much! I just tried and node2 can mount the GFS file system. But, only one node can mount the filesystem anytime. For instance if node1 already mounted the file system and mount issued on node2 waits until umount is done on node1. I read about a similar problem in the list archives: https://www.redhat.com/archives/linux-cluster/2004-July/msg00205.html The first 8 bytes of my nodes are the same: testnode1 and testnode2. Is this the problem? (I will appreciate if you tell me the document I should look into for these details.) Thanks again!! SD --- Michael Conrad Tadpol Tilstra wrote: > On Thu, Nov 18, 2004 at 02:36:19PM -0800, Chicago > Boy wrote: > > It looks like node1 that is running lock_server > picked > > up 127.0.0.1. Could this be the problem? > > yeah, that would be exatly your problem. Go fix > /etc/hosts (remove the > node's name from teh loopback line.) > > -- > Michael Conrad Tadpol Tilstra > God is the tangential point between zero and > infinity. > > ATTACHMENT part 1.2 application/pgp-signature > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster __________________________________ Do you Yahoo!? Meet the all-new My Yahoo! - Try it today! http://my.yahoo.com From kpreslan at redhat.com Thu Nov 18 23:24:42 2004 From: kpreslan at redhat.com (Ken Preslan) Date: Thu, 18 Nov 2004 17:24:42 -0600 Subject: [Linux-cluster] 'df' not accurate? In-Reply-To: <20041118225805.GL6491@iwork57.lis.uiuc.edu> References: <20041118213145.GI6491@iwork57.lis.uiuc.edu> <20041118215158.GA20992@potassium.msp.redhat.com> <20041118220249.GK6491@iwork57.lis.uiuc.edu> <20041118224612.GA21238@potassium.msp.redhat.com> <20041118225805.GL6491@iwork57.lis.uiuc.edu> Message-ID: <20041118232442.GA21845@potassium.msp.redhat.com> GFS caches the information used to perform a "df" in little chunks of cluster memory (called LVBs) associated with each bitmap lock. The data probably got corrupted somehow. The "gfs_tool reclaim" caused the data to be reinitialzed from disk. ("Gfs_tool shrink" should have done that too. I'm not sure why it didn't.) Maybe there's a bug in the lock_dlm LVB recovery code or a bug has crept into the filesystem code. On Thu, Nov 18, 2004 at 04:58:05PM -0600, Brynnen R Owen wrote: > Ok, that seems to fix the reporting. Can you hit me with a > clue-by-four as to what likely happened? We did perform a fence test > where we ripped the power cord out from a computer. Did that hose > metadata? > > On Thu, Nov 18, 2004 at 04:46:12PM -0600, Ken Preslan wrote: > > On Thu, Nov 18, 2004 at 04:02:49PM -0600, Brynnen R Owen wrote: > > > No change. All nodes in the cluster appear to agree on the wrong size > > > as well. > > > > What does a "gfs_tool reclaim /mountpoint" do? 
> > > > -- > > Ken Preslan > > > > -- > <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> > <> Brynnen Owen ( this space for rent )<> > <> owen at uiuc.edu ( )<> > <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> -- Ken Preslan From chicagoboy12001 at yahoo.com Fri Nov 19 00:20:13 2004 From: chicagoboy12001 at yahoo.com (Chicago Boy) Date: Thu, 18 Nov 2004 16:20:13 -0800 (PST) Subject: [Linux-cluster] unable to mount gfs partition In-Reply-To: <20041118231456.63154.qmail@web50409.mail.yahoo.com> Message-ID: <20041119002013.3692.qmail@web50401.mail.yahoo.com> Thank you, I solved the problem. I used hostdata parameter to set a different name mount -t gfs /dev/pool/pool0 /mnt/gfs1 -o hostdata=duplnode The mount now uses duplnode instead of hostname returned by uname -n This parameter is described in http://www.redhat.com/docs/manuals/csgfs/admin-guide/s1-manage-mountfs.html (but I am still wondering where is this 8 bytes limitation specified- I just want to know for reference). Thanks again for your help!! SD --- Chicago Boy wrote: > Thanks much! I just tried and node2 can mount the > GFS > file system. But, only one node can mount the > filesystem anytime. For instance if node1 already > mounted the file system and mount issued on node2 > waits until umount is done on node1. I read about a > similar problem in the list archives: > > https://www.redhat.com/archives/linux-cluster/2004-July/msg00205.html > > The first 8 bytes of my nodes are the same: > testnode1 > and testnode2. Is this the problem? (I will > appreciate > if you tell me the document I should look into for > these details.) > > Thanks again!! > SD > > --- Michael Conrad Tadpol Tilstra > wrote: > > > On Thu, Nov 18, 2004 at 02:36:19PM -0800, Chicago > > Boy wrote: > > > It looks like node1 that is running lock_server > > picked > > > up 127.0.0.1. Could this be the problem? > > > > yeah, that would be exatly your problem. Go fix > > /etc/hosts (remove the > > node's name from teh loopback line.) > > > > -- > > Michael Conrad Tadpol Tilstra > > God is the tangential point between zero and > > infinity. > > > > > ATTACHMENT part 1.2 application/pgp-signature > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > > http://www.redhat.com/mailman/listinfo/linux-cluster > > > > > __________________________________ > Do you Yahoo!? > Meet the all-new My Yahoo! - Try it today! > http://my.yahoo.com > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > __________________________________ Do you Yahoo!? The all-new My Yahoo! - Get yours free! http://my.yahoo.com From jn at it.swin.edu.au Fri Nov 19 01:03:15 2004 From: jn at it.swin.edu.au (John Newbigin) Date: Fri, 19 Nov 2004 12:03:15 +1100 Subject: [Linux-cluster] unable to mount gfs partition In-Reply-To: <20041119002013.3692.qmail@web50401.mail.yahoo.com> References: <20041119002013.3692.qmail@web50401.mail.yahoo.com> Message-ID: <419D4653.7020208@it.swin.edu.au> AFAIK, it is a bug. The Red Hat 6.0.0-15 RPMS solve that problem (I had it too). John. Chicago Boy wrote: > Thank you, I solved the problem. 
I used hostdata > parameter to set a different name > > mount -t gfs /dev/pool/pool0 /mnt/gfs1 -o > hostdata=duplnode > > The mount now uses duplnode instead of hostname > returned by uname -n > > This parameter is described in > http://www.redhat.com/docs/manuals/csgfs/admin-guide/s1-manage-mountfs.html > > (but I am still wondering where is this 8 bytes > limitation specified- I just want to know for > reference). > > Thanks again for your help!! > SD > --- Chicago Boy wrote: > > >>Thanks much! I just tried and node2 can mount the >>GFS >>file system. But, only one node can mount the >>filesystem anytime. For instance if node1 already >>mounted the file system and mount issued on node2 >>waits until umount is done on node1. I read about a >>similar problem in the list archives: >> >> > > https://www.redhat.com/archives/linux-cluster/2004-July/msg00205.html > >>The first 8 bytes of my nodes are the same: >>testnode1 >>and testnode2. Is this the problem? (I will >>appreciate >>if you tell me the document I should look into for >>these details.) >> >>Thanks again!! >>SD >> >>--- Michael Conrad Tadpol Tilstra >> wrote: >> >> >>>On Thu, Nov 18, 2004 at 02:36:19PM -0800, Chicago >>>Boy wrote: >>> >>>>It looks like node1 that is running lock_server >>> >>>picked >>> >>>>up 127.0.0.1. Could this be the problem? >>> >>>yeah, that would be exatly your problem. Go fix >>>/etc/hosts (remove the >>>node's name from teh loopback line.) >>> >>>-- >>>Michael Conrad Tadpol Tilstra >>>God is the tangential point between zero and >>>infinity. >>> >> >>>ATTACHMENT part 1.2 application/pgp-signature >>>-- >>>Linux-cluster mailing list >>>Linux-cluster at redhat.com >>> >> >>http://www.redhat.com/mailman/listinfo/linux-cluster >> >> >> >> >>__________________________________ >>Do you Yahoo!? >>Meet the all-new My Yahoo! - Try it today! >>http://my.yahoo.com >> >> >>-- >>Linux-cluster mailing list >>Linux-cluster at redhat.com >>http://www.redhat.com/mailman/listinfo/linux-cluster >> > > > > > > __________________________________ > Do you Yahoo!? > The all-new My Yahoo! - Get yours free! > http://my.yahoo.com > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > > > -- John Newbigin Computer Systems Officer Faculty of Information and Communication Technologies Swinburne University of Technology Melbourne, Australia http://www.it.swin.edu.au/staff/jnewbigin From owen at isrl.uiuc.edu Fri Nov 19 17:53:45 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Fri, 19 Nov 2004 11:53:45 -0600 Subject: [Linux-cluster] Processes locked in "D" state Message-ID: <20041119175344.GO6491@iwork57.lis.uiuc.edu> Hi all, While my initial problems with getting the locking/fencing seem to be solved with the proper magma modules, my initial problem is not solved. I have been running some test backups to a GFS partition which somehow has a bad directory on it. Here's what I mean. Any process that tries to open this "bad" directory gets hung forever in a "D" state. There are no errors/warnings/logs anywhere. I have tried 'ls ', 'find .' on a directory above this bad one in the path, '/gfs_tool stat ', and the original perl script which was descending into directories and copying stuff. I now have 4 hung processes. The machine still appears awake. 'df' still works (this is an improvement over the old failure method). Any suggestions? I'm using lock_dlm gfs from CVS on Nov 11. which I applied to a kernel.org 2.6.9 kernel. Using mptscsih fibre channel cards. 
Athlon processors with athlon extensions No extra high memory (1G limit) Non-SMP base system is RedHat 9. copy of /proc/cluster/status (fifth node was never active): Version: 3.0.1 Config version: 7 Cluster name: gslis-san1 Cluster ID: 43161 Membership state: Cluster-Member Nodes: 4 Expected_votes: 5 Total_votes: 4 Quorum: 3 Active subsystems: 8 Node addresses: 192.168.1.240 copy of /proc/cluster/services: Service Name GID LID State Code Fence Domain: "default" 1 2 run - [1 3 4 2] DLM Lock Space: "archive-content" 2 3 run - [1 3 4 2] DLM Lock Space: "archive-home" 4 5 run - [1 3 4 2] GFS Mount Group: "archive-content" 3 4 run - [1 3 4 2] GFS Mount Group: "archive-home" 5 6 run - [1 3 4 2] -- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> From mtilstra at redhat.com Fri Nov 19 17:58:27 2004 From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra) Date: Fri, 19 Nov 2004 11:58:27 -0600 Subject: [Linux-cluster] unable to mount gfs partition In-Reply-To: <419D4653.7020208@it.swin.edu.au> References: <20041119002013.3692.qmail@web50401.mail.yahoo.com> <419D4653.7020208@it.swin.edu.au> Message-ID: <20041119175827.GA14363@redhat.com> On Fri, Nov 19, 2004 at 12:03:15PM +1100, John Newbigin wrote: > AFAIK, it is a bug. The Red Hat 6.0.0-15 RPMS solve that problem (I had > it too). Yes, it is a bug, https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=127828 Upgrade your rpms, it has been fixed. -- Michael Conrad Tadpol Tilstra "It is not the evil ones we need to worry about. We can shoot evil on sight. It is the good pious zealots we should worry about. They do all the wrong things for all the 'right' reasons and lead us astray with aspirations of goodness which lead us into the Abyss." -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From owen at isrl.uiuc.edu Fri Nov 19 18:12:41 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Fri, 19 Nov 2004 12:12:41 -0600 Subject: [Linux-cluster] Processes locked in "D" state In-Reply-To: <20041119175344.GO6491@iwork57.lis.uiuc.edu> References: <20041119175344.GO6491@iwork57.lis.uiuc.edu> Message-ID: <20041119181241.GP6491@iwork57.lis.uiuc.edu> More info, hot off the presses. I just unmounted the GFS on another server, and two others with hung processes sprang back to life. So, it appears to be some kind of locking issue, but I have no idea what. On Fri, Nov 19, 2004 at 11:53:45AM -0600, Brynnen R Owen wrote: > Hi all, > > While my initial problems with getting the locking/fencing seem to > be solved with the proper magma modules, my initial problem is not > solved. I have been running some test backups to a GFS partition > which somehow has a bad directory on it. Here's what I mean. Any > process that tries to open this "bad" directory gets hung forever in a > "D" state. There are no errors/warnings/logs anywhere. I have tried > 'ls ', 'find .' on a directory above this bad one in the path, > '/gfs_tool stat ', and the original perl script which was > descending into directories and copying stuff. I now have 4 hung > processes. The machine still appears awake. 'df' still works (this > is an improvement over the old failure method). Any suggestions? > > I'm using lock_dlm > gfs from CVS on Nov 11. which I applied to a kernel.org 2.6.9 kernel. 
> Using mptscsih fibre channel cards. > Athlon processors with athlon extensions > No extra high memory (1G limit) > Non-SMP > base system is RedHat 9. > > copy of /proc/cluster/status (fifth node was never active): > Version: 3.0.1 > Config version: 7 > Cluster name: gslis-san1 > Cluster ID: 43161 > Membership state: Cluster-Member > Nodes: 4 > Expected_votes: 5 > Total_votes: 4 > Quorum: 3 > Active subsystems: 8 > Node addresses: 192.168.1.240 > > copy of /proc/cluster/services: > Service Name GID LID State > Code > Fence Domain: "default" 1 2 run - > [1 3 4 2] > > DLM Lock Space: "archive-content" 2 3 run - > [1 3 4 2] > > DLM Lock Space: "archive-home" 4 5 run - > [1 3 4 2] > > GFS Mount Group: "archive-content" 3 4 run - > [1 3 4 2] > > GFS Mount Group: "archive-home" 5 6 run - > [1 3 4 2] > > > -- > <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> > <> Brynnen Owen ( this space for rent )<> > <> owen at uiuc.edu ( )<> > <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster -- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> From chicagoboy12001 at yahoo.com Sat Nov 20 00:16:05 2004 From: chicagoboy12001 at yahoo.com (Chicago Boy) Date: Fri, 19 Nov 2004 16:16:05 -0800 (PST) Subject: [Linux-cluster] unable to mount gfs partition In-Reply-To: <20041119175827.GA14363@redhat.com> Message-ID: <20041120001605.31444.qmail@web50410.mail.yahoo.com> Are these RPM's freely available? Can someone post the link to GFS-6.0.0-15.src.rpm version? Another question: What is the best way to terminate ccsd? In the manual it just says "kill the ccs daemon" http://www.redhat.com/docs/manuals/csgfs/admin-guide/s1-manage-shutdown.html Does that mean "kill pid"? Thanks much for your support!! S --- Michael Conrad Tadpol Tilstra wrote: > On Fri, Nov 19, 2004 at 12:03:15PM +1100, John > Newbigin wrote: > > AFAIK, it is a bug. The Red Hat 6.0.0-15 RPMS > solve that problem (I had > > it too). > > Yes, it is a bug, > https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=127828 > > Upgrade your rpms, it has been fixed. > > -- > Michael Conrad Tadpol Tilstra > "It is not the evil ones we need to worry about. We > can shoot evil on > sight. It is the good pious zealots we should worry > about. They do all the > wrong things for all the 'right' reasons and lead us > astray with > aspirations of goodness which lead us into the > Abyss." > > ATTACHMENT part 1.2 application/pgp-signature > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster __________________________________ Do you Yahoo!? The all-new My Yahoo! - Get yours free! http://my.yahoo.com From ivan.ivanyi at isb-sib.ch Mon Nov 22 09:02:08 2004 From: ivan.ivanyi at isb-sib.ch (IVANYI Ivan) Date: Mon, 22 Nov 2004 10:02:08 +0100 Subject: [Linux-cluster] gpfs vs gpfs Message-ID: <41A1AB10.4070101@isb-sib.ch> Hi, Has anyone compared IBM GPFS to RedHat GFS? (performance/stability) And also is anyone running GFS under Fedora Core 2 or 3? Any problems I should know about? 
Thanks -- ************************************************************ Ivan Ivanyi Swiss Institute of Bioinformatics 1, rue Michel Servet CH-1211 Gen?ve 4 Switzerland Tel: (+41 22) 379 58 33 Fax: (+41 22) 379 58 58 E-mail: Ivan.Ivanyi at isb-sib.ch ************************************************************ PGP signature http://www.expasy.org/people/Ivan.Ivanyi.gpg From teigland at redhat.com Mon Nov 22 09:16:04 2004 From: teigland at redhat.com (David Teigland) Date: Mon, 22 Nov 2004 17:16:04 +0800 Subject: [Linux-cluster] 'df' not accurate? In-Reply-To: <20041118232442.GA21845@potassium.msp.redhat.com> References: <20041118213145.GI6491@iwork57.lis.uiuc.edu> <20041118215158.GA20992@potassium.msp.redhat.com> <20041118220249.GK6491@iwork57.lis.uiuc.edu> <20041118224612.GA21238@potassium.msp.redhat.com> <20041118225805.GL6491@iwork57.lis.uiuc.edu> <20041118232442.GA21845@potassium.msp.redhat.com> Message-ID: <20041122091604.GB16628@redhat.com> On Thu, Nov 18, 2004 at 05:24:42PM -0600, Ken Preslan wrote: > GFS caches the information used to perform a "df" in little chunks of > cluster memory (called LVBs) associated with each bitmap lock. The data > probably got corrupted somehow. The "gfs_tool reclaim" caused the > data to be reinitialzed from disk. ("Gfs_tool shrink" should have done > that too. I'm not sure why it didn't.) > > Maybe there's a bug in the lock_dlm LVB recovery code or a bug has crept > into the filesystem code. There was a bug updating lvb's in the dlm. The problem should now be fixed. -- Dave Teigland From owen at isrl.uiuc.edu Mon Nov 22 17:15:45 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Mon, 22 Nov 2004 11:15:45 -0600 Subject: [Linux-cluster] 'df' not accurate? In-Reply-To: <20041122091604.GB16628@redhat.com> References: <20041118213145.GI6491@iwork57.lis.uiuc.edu> <20041118215158.GA20992@potassium.msp.redhat.com> <20041118220249.GK6491@iwork57.lis.uiuc.edu> <20041118224612.GA21238@potassium.msp.redhat.com> <20041118225805.GL6491@iwork57.lis.uiuc.edu> <20041118232442.GA21845@potassium.msp.redhat.com> <20041122091604.GB16628@redhat.com> Message-ID: <20041122171545.GB4240@iwork57.lis.uiuc.edu> On Mon, Nov 22, 2004 at 05:16:04PM +0800, David Teigland wrote: > On Thu, Nov 18, 2004 at 05:24:42PM -0600, Ken Preslan wrote: > > GFS caches the information used to perform a "df" in little chunks of > > cluster memory (called LVBs) associated with each bitmap lock. The data > > probably got corrupted somehow. The "gfs_tool reclaim" caused the > > data to be reinitialzed from disk. ("Gfs_tool shrink" should have done > > that too. I'm not sure why it didn't.) > > > > Maybe there's a bug in the lock_dlm LVB recovery code or a bug has crept > > into the filesystem code. > > There was a bug updating lvb's in the dlm. The problem should now be > fixed. Great! I have discovered a second problem, although this may be out of date. If these problems have been addressed, or if the above bugfix addresses these, I'll try again with the latest snapshot. I had problems with the Nov 11 CVS snapshot version of 'ccsd' segfaulting and filesystem corruption. I set up a cluster with 5 computers. I created a brand new GFS filesystem and set two of the machines copying from ext3 on IDE filesystems to one GFS on Fibre filesystem. Both machines finished the task. One machine copied about 250G, the other copied about 30G. I then unmounted the filesystem and ran gfs_fsck. There were several errors, including datablock pointers out of range. 
I noticed that ccsd had died on several machines, and they weren't in the fence domain. I decided that was the issue, so I brought the cluster down and back up, made sure everything was in the same fence domain, recreated the GFS and reran the test. While ccsd was not dead this time, the GFS filesystem still showed several errors. I then moved to the latest CVS snapshot (Nov 21) and lock_gulm, and am waiting for the copy to finish so I can report on the results. -- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> From daniel at osdl.org Mon Nov 22 20:44:07 2004 From: daniel at osdl.org (Daniel McNeil) Date: Mon, 22 Nov 2004 12:44:07 -0800 Subject: [Linux-cluster] umount hang Message-ID: <1101156247.29523.27.camel@ibm-c.pdx.osdl.net> I left some automated tests running over the weekend and ran into a umount hang. A single GFS file system was mounted on 2 nodes of a 3 node cluster. The test had just removed 2 subdirectories - one from each node. The test was then unmounting the file system from one node when the umount hung. Here's a stack trace from the hung umount (on cl030): (node cl030) umount D 00000008 0 14345 14339 (NOTLB) db259e04 00000086 db259df4 00000008 00000001 00000000 00000008 db259dc8 eda96dc0 f15d0750 c044aac0 db259000 db259de4 c01196d1 f7cf0b90 450fa673 c170df60 00000000 00049d65 44bb3183 0002dfe0 f15d0750 f15d08b0 c170df60 Call Trace: [] wait_for_completion+0xa4/0xe0 [] kcl_leave_service+0xfe/0x180 [cman] [] release_lockspace+0x2d6/0x2f0 [dlm] [] release_gdlm+0x1c/0x30 [lock_dlm] [] lm_dlm_unmount+0x24/0x50 [lock_dlm] [] lm_unmount+0x46/0xac [lock_harness] [] gfs_put_super+0x30f/0x3c0 [gfs] [] generic_shutdown_super+0x18a/0x1a0 [] kill_block_super+0x1d/0x40 [] deactivate_super+0x81/0xa0 [] sys_umount+0x3c/0xa0 [] sys_oldumount+0x19/0x20 [] sysenter_past_esp+0x52/0x71 [root at cl030 proc]# cat /proc/cluster/services Service Name GID LID State Code Fence Domain: "default" 1 2 run - [3 1 2] DLM Lock Space: "stripefs" 222 275 run S-13,210,1 [1 3] Cat'ing /proc/cluster/services on the 2nd node (cl031) hangs. [root at cl031 root]# cat /proc/cluster/services >From the 2nd node (cl031). 
Here are some stack traces that might be interesting: cman_serviced D 00000008 0 3818 6 12593 665 (L-TLB) ebc23edc 00000046 ebc23ecc 00000008 00000001 00000010 00000008 00000002 f7726dc0 00000000 00000000 f5a4b230 00000000 00000010 00000010 ebc23f24 c170df60 00000000 000005a8 d42bcdab 0002e201 eb5119f0 eb511b50 ebc23f08 Call Trace: [] rwsem_down_write_failed+0x9c/0x18e [] .text.lock.lockspace+0x4e/0x63 [dlm] [] process_leave_stop+0x32/0x80 [cman] [] process_one_uevent+0xc2/0x100 [cman] [] process_membership+0xc8/0xca [cman] [] serviced+0x165/0x1d0 [cman] [] kthread+0xba/0xc0 [] kernel_thread_helper+0x5/0x10 cat /proc/cluster/services stack trace: cat D 00000008 0 22151 1 13435 (NOTLB) c1f7ae90 00000086 c1f7ae7c 00000008 00000002 000000d0 00000008 c1f7ae74 eb0acdc0 00000001 00000246 00000000 e20c4670 f474f1d0 00000000 c17168c0 c1715f60 00000001 00159c05 bad07454 0003aa83 e20c4670 e20c47d0 00000000 Call Trace: [] __down+0x93/0xf0 [] __down_failed+0xb/0x14 [] .text.lock.sm_misc+0x2d/0x41 [cman] [] sm_seq_next+0x34/0x50 [cman] [] seq_read+0x159/0x2b0 [] vfs_read+0xaf/0x120 [] sys_read+0x4b/0x80 [] sysenter_past_esp+0x52/0x71 The full stack traces are available here: http://developer.osdl.org/daniel/gfs_umount_hang/ I'm running on 2.6.9 and cvs code from Nov 9th. Any ideas? Daniel From Vincent.Aniello at PipelineTrading.com Mon Nov 22 20:58:29 2004 From: Vincent.Aniello at PipelineTrading.com (Vincent Aniello) Date: Mon, 22 Nov 2004 15:58:29 -0500 Subject: [Linux-cluster] Installing GFS on RedHat Linux AS 3.0 (2.4.21-20.ELsmp) Message-ID: <834F55E6F1BE3B488AD3AFC927A09700033AFB@EMAILSRV1.exad.net> I am trying to setup GFS on a pair of RedHat Linux AS 3.0 servers running kernel version 2.4.21-20.ELsmp. I downloaded the source RPMs from ftp://ftp.redhat.com/pub/redhat/linux/enterprise/3/en/RHGFS/i386/SRPMS, but when I try to build GFS from the source RPM I get the following error: error: Failed build dependencies: kernel-source = 2.4.21-15.EL is needed by GFS-6.0.0-1.2 How do I get GFS installed on my version of the kernel? Any help would be appreciated. Thanks. --Vincent This e-mail and/or its attachments may contain confidential and/or privileged information. If you are not the intended recipient(s) or have received this e-mail in error, please notify the sender immediately and delete this e-mail and its attachments from your computer and files. Any unauthorized copying, disclosure or distribution of the material contained herein is strictly forbidden. Pipeline Trading Systems, LLC - Member NASD & SIPC. From jbrassow at redhat.com Mon Nov 22 21:18:07 2004 From: jbrassow at redhat.com (Jonathan E Brassow) Date: Mon, 22 Nov 2004 15:18:07 -0600 Subject: [Linux-cluster] Installing GFS on RedHat Linux AS 3.0 (2.4.21-20.ELsmp) In-Reply-To: <834F55E6F1BE3B488AD3AFC927A09700033AFB@EMAILSRV1.exad.net> References: <834F55E6F1BE3B488AD3AFC927A09700033AFB@EMAILSRV1.exad.net> Message-ID: <0158FC83-3CCC-11D9-A107-000A957BB1F6@redhat.com> ftp://ftp.redhat.com/pub/redhat/linux/updates/enterprise/3AS/en/RHGFS/ SRPMS/ look in "updates" brassow On Nov 22, 2004, at 2:58 PM, Vincent Aniello wrote: > > I am trying to setup GFS on a pair of RedHat Linux AS 3.0 servers > running kernel version 2.4.21-20.ELsmp. 
> > I downloaded the source RPMs from > ftp://ftp.redhat.com/pub/redhat/linux/enterprise/3/en/RHGFS/i386/SRPMS, > but when I try to build GFS from the source RPM I get the following > error: > > error: Failed build dependencies: > kernel-source = 2.4.21-15.EL is needed by GFS-6.0.0-1.2 > > How do I get GFS installed on my version of the kernel? > > Any help would be appreciated. > > Thanks. > > --Vincent > > > This e-mail and/or its attachments may contain confidential and/or > privileged information. If you are not the intended recipient(s) or > have received this e-mail in error, please notify the sender > immediately and delete this e-mail and its attachments from your > computer and files. Any unauthorized copying, disclosure or > distribution of the material contained herein is strictly forbidden. > Pipeline Trading Systems, LLC - Member NASD & SIPC. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > From rajkum2002 at rediffmail.com Mon Nov 22 21:18:17 2004 From: rajkum2002 at rediffmail.com (Raj Kumar) Date: 22 Nov 2004 21:18:17 -0000 Subject: [Linux-cluster] GFS fence and lock servers- test setup Message-ID: <20041122211817.5528.qmail@webmail32.rediffmail.com> Hello All, ? I am just starting with GFS with a two node cluster setup. I am able to create and mount GFS filesystems on both the nodes. However, I did not understand completely how fencing and lock servers operate: In my setup node1 runs lock servers. Both node1 and node2 use the GFS filesystem. Assume node1 is shutdown while node2 is accessing the shared storage. Since only node1 runs lock servers the cluster will hung up soon. What would be state of the cluster and files on the storage? If node1 is brought online will cluster operates normally. Would any of the files on shared storage corrupted. If yes, how to identify such files? From manual: 8.2.3. Starting LOCK_GULM Servers If there are hung GFS nodes, reset them before starting lock_gulmd servers. Resetting the hung GFS nodes before starting lock_gulmd servers prevents file system corruption. I suppose this section applies to the scenario I described above. What exactly it means "If there are hung GFS nodes, reset them before starting lock_gulmd servers"-- does this mean restart node2 or just kill ccsd, lock, and disassemble pools? When a node is fenced, what exact sequence of operations is performed. Is the fenced node restarted? My GFS nodes also run very important services and restarting my cause adverse effects sometimes. What recovery operations are performed when a fenced node joins the cluster. Can someone tell what other issues a system administrator should be concerned with operating GFS? Thanks in advance for your help! Raj -------------- next part -------------- An HTML attachment was scrubbed... URL: From Vincent.Aniello at PipelineTrading.com Mon Nov 22 22:03:05 2004 From: Vincent.Aniello at PipelineTrading.com (Vincent Aniello) Date: Mon, 22 Nov 2004 17:03:05 -0500 Subject: [Linux-cluster] Installing GFS on RedHat Linux AS 3.0(2.4.21-20.ELsmp) Message-ID: <834F55E6F1BE3B488AD3AFC927A09700033B04@EMAILSRV1.exad.net> Thank you. When creating the RPMs for an SMP system is there anything special I need to do other than the 'rpmbuild -rebuild GFS-6.0.0-15.src.rpm' command? 
Also, the following RPMs were created: GFS-6.0.0-15.i386.rpm GFS-debuginfo-6.0.0-15.i386.rpm GFS-devel-6.0.0-15.i386.rpm GFS-modules-6.0.0-15.i386.rpm Do all of these need to be installed, in particular is "GFS-debuginfo-6.0.0-15.i386.rpm" needed? --Vincent This e-mail and/or its attachments may contain confidential and/or privileged information. If you are not the intended recipient(s) or have received this e-mail in error, please notify the sender immediately and delete this e-mail and its attachments from your computer and files. Any unauthorized copying, disclosure or distribution of the material contained herein is strictly forbidden. Pipeline Trading Systems, LLC - Member NASD & SIPC. -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jonathan E Brassow Sent: Monday, November 22, 2004 4:18 PM To: linux clistering Subject: Re: [Linux-cluster] Installing GFS on RedHat Linux AS 3.0(2.4.21-20.ELsmp) ftp://ftp.redhat.com/pub/redhat/linux/updates/enterprise/3AS/en/RHGFS/ SRPMS/ look in "updates" brassow On Nov 22, 2004, at 2:58 PM, Vincent Aniello wrote: > > I am trying to setup GFS on a pair of RedHat Linux AS 3.0 servers > running kernel version 2.4.21-20.ELsmp. > > I downloaded the source RPMs from > ftp://ftp.redhat.com/pub/redhat/linux/enterprise/3/en/RHGFS/i386/SRPMS, > but when I try to build GFS from the source RPM I get the following > error: > > error: Failed build dependencies: > kernel-source = 2.4.21-15.EL is needed by GFS-6.0.0-1.2 > > How do I get GFS installed on my version of the kernel? > > Any help would be appreciated. > > Thanks. > > --Vincent > > > This e-mail and/or its attachments may contain confidential and/or > privileged information. If you are not the intended recipient(s) or > have received this e-mail in error, please notify the sender > immediately and delete this e-mail and its attachments from your > computer and files. Any unauthorized copying, disclosure or > distribution of the material contained herein is strictly forbidden. > Pipeline Trading Systems, LLC - Member NASD & SIPC. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > -- Linux-cluster mailing list Linux-cluster at redhat.com http://www.redhat.com/mailman/listinfo/linux-cluster From jbrassow at redhat.com Mon Nov 22 22:36:48 2004 From: jbrassow at redhat.com (Jonathan E Brassow) Date: Mon, 22 Nov 2004 16:36:48 -0600 Subject: [Linux-cluster] Installing GFS on RedHat Linux AS 3.0(2.4.21-20.ELsmp) In-Reply-To: <834F55E6F1BE3B488AD3AFC927A09700033B04@EMAILSRV1.exad.net> References: <834F55E6F1BE3B488AD3AFC927A09700033B04@EMAILSRV1.exad.net> Message-ID: don't worry about devel and debuginfo try building with: rpmbuild -bb --target i686 this should give you the smp modules brassow On Nov 22, 2004, at 4:03 PM, Vincent Aniello wrote: > > Thank you. > > When creating the RPMs for an SMP system is there anything special I > need to do other than the 'rpmbuild -rebuild GFS-6.0.0-15.src.rpm' > command? > > Also, the following RPMs were created: > > GFS-6.0.0-15.i386.rpm > GFS-debuginfo-6.0.0-15.i386.rpm > GFS-devel-6.0.0-15.i386.rpm > GFS-modules-6.0.0-15.i386.rpm > > Do all of these need to be installed, in particular is > "GFS-debuginfo-6.0.0-15.i386.rpm" needed? > > --Vincent > > > > > This e-mail and/or its attachments may contain confidential and/or > privileged information. 
If you are not the intended recipient(s) or > have received this e-mail in error, please notify the sender > immediately and delete this e-mail and its attachments from your > computer and files. Any unauthorized copying, disclosure or > distribution of the material contained herein is strictly forbidden. > Pipeline Trading Systems, LLC - Member NASD & SIPC. > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Jonathan E > Brassow > Sent: Monday, November 22, 2004 4:18 PM > To: linux clistering > Subject: Re: [Linux-cluster] Installing GFS on RedHat Linux AS > 3.0(2.4.21-20.ELsmp) > > > ftp://ftp.redhat.com/pub/redhat/linux/updates/enterprise/3AS/en/RHGFS/ > SRPMS/ > > look in "updates" > > brassow > > On Nov 22, 2004, at 2:58 PM, Vincent Aniello wrote: > >> >> I am trying to setup GFS on a pair of RedHat Linux AS 3.0 servers >> running kernel version 2.4.21-20.ELsmp. >> >> I downloaded the source RPMs from >> > ftp://ftp.redhat.com/pub/redhat/linux/enterprise/3/en/RHGFS/i386/SRPMS, >> but when I try to build GFS from the source RPM I get the following >> error: >> >> error: Failed build dependencies: >> kernel-source = 2.4.21-15.EL is needed by GFS-6.0.0-1.2 >> >> How do I get GFS installed on my version of the kernel? >> >> Any help would be appreciated. >> >> Thanks. >> >> --Vincent >> >> >> This e-mail and/or its attachments may contain confidential and/or >> privileged information. If you are not the intended recipient(s) or >> have received this e-mail in error, please notify the sender >> immediately and delete this e-mail and its attachments from your >> computer and files. Any unauthorized copying, disclosure or >> distribution of the material contained herein is strictly forbidden. >> Pipeline Trading Systems, LLC - Member NASD & SIPC. >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> http://www.redhat.com/mailman/listinfo/linux-cluster >> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > From ben.m.cahill at intel.com Mon Nov 22 21:39:52 2004 From: ben.m.cahill at intel.com (Cahill, Ben M) Date: Mon, 22 Nov 2004 13:39:52 -0800 Subject: [Linux-cluster] [PATCH] Comments on atime and fs init Message-ID: <0604335B7764D141945E202153105960033E26B5@orsmsx404.amr.corp.intel.com> Hi all, Attached please find patch adding/editing comments, mostly in atime processing and mount-time processing. Thanks, Ken, for reviewing, editing, and checking in comments I sent last week. -- Ben -- Opinions are mine, not Intel's -------------- next part -------------- A non-text attachment was scrubbed... Name: comments_112204.patch Type: application/octet-stream Size: 14423 bytes Desc: comments_112204.patch URL: From teigland at redhat.com Tue Nov 23 03:50:23 2004 From: teigland at redhat.com (David Teigland) Date: Tue, 23 Nov 2004 11:50:23 +0800 Subject: [Linux-cluster] umount hang In-Reply-To: <1101156247.29523.27.camel@ibm-c.pdx.osdl.net> References: <1101156247.29523.27.camel@ibm-c.pdx.osdl.net> Message-ID: <20041123035023.GA11828@redhat.com> On Mon, Nov 22, 2004 at 12:44:07PM -0800, Daniel McNeil wrote: > The full stack traces are available here: > http://developer.osdl.org/daniel/gfs_umount_hang/ Thanks, it's evident that the dlm became "stuck" on the node that's not doing the umount. 
All the hung processes are blocked on the dlm's "in_recovery" lock. There are a load of crond/df processes all in this situation. I'm not sure where the real bug is at, but setting up the cron jobs to skip gfs is one way to avoid it. -- Dave Teigland From pcaulfie at redhat.com Tue Nov 23 11:14:43 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Tue, 23 Nov 2004 11:14:43 +0000 Subject: [Linux-cluster] umount hang In-Reply-To: <20041123035023.GA11828@redhat.com> References: <1101156247.29523.27.camel@ibm-c.pdx.osdl.net> <20041123035023.GA11828@redhat.com> Message-ID: <20041123111443.GB25328@tykepenguin.com> On Tue, Nov 23, 2004 at 11:50:23AM +0800, David Teigland wrote: > > On Mon, Nov 22, 2004 at 12:44:07PM -0800, Daniel McNeil wrote: > > > The full stack traces are available here: > > http://developer.osdl.org/daniel/gfs_umount_hang/ > > Thanks, it's evident that the dlm became "stuck" on the node that's not > doing the umount. All the hung processes are blocked on the dlm's > "in_recovery" lock. There also seems to be a GFS process with a failed "down_write" in dlm_unlock which might be a clue. It's not the in_recovery lock because that's only held for read during normal locking operations so it must be either the res_lock or the ls_unlock_sem. odd as those are normally only held for very short time periods. -- patrick From daniel at osdl.org Tue Nov 23 16:43:04 2004 From: daniel at osdl.org (Daniel McNeil) Date: Tue, 23 Nov 2004 08:43:04 -0800 Subject: [Linux-cluster] umount hang and assert failure In-Reply-To: <20041123111443.GB25328@tykepenguin.com> References: <1101156247.29523.27.camel@ibm-c.pdx.osdl.net> <20041123035023.GA11828@redhat.com> <20041123111443.GB25328@tykepenguin.com> Message-ID: <1101228184.31492.11.camel@ibm-c.pdx.osdl.net> On Tue, 2004-11-23 at 03:14, Patrick Caulfield wrote: > On Tue, Nov 23, 2004 at 11:50:23AM +0800, David Teigland wrote: > > > > On Mon, Nov 22, 2004 at 12:44:07PM -0800, Daniel McNeil wrote: > > > > > The full stack traces are available here: > > > http://developer.osdl.org/daniel/gfs_umount_hang/ > > > > Thanks, it's evident that the dlm became "stuck" on the node that's not > > doing the umount. All the hung processes are blocked on the dlm's > > "in_recovery" lock. > > There also seems to be a GFS process with a failed "down_write" in dlm_unlock > which might be a clue. It's not the in_recovery lock because that's only held > for read during normal locking operations so it must be either the res_lock or > the ls_unlock_sem. odd as those are normally only held for very short time > periods. More info. I rebooted the cl031 the node not doing the umount but hung doing the cat of /proc/cluster/services. The 1st node saw the node go away, but the umount was still hung. I was expecting the recovery from the death of this node to clean up any locking problem. I rebooted the 2nd node and started the tests over again last night. This morning one node (cl030) got this: cur_state = 2, new_state = 2 Kernel panic - not syncing: GFS: Assertion failed on line 69 of file /Views/redhat-cluster/cluster/gfs-kernel/src/gfs/bits.c GFS: assertion: "valid_change[new_state * 4 + cur_state]" GFS: time = 1101174691 GFS: fsid=gfs_cluster:stripefs.0: RG = 65530 I'll upgrade to latest cvs and start the tests over. Is there anything I can do to get more info when this kind of thing happens? 
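(For reference, the advice Ken Preslan gives later in this archive for a similar GFS hang boils down to capturing the lock dump, the process table, and a sysrq task trace from each node. A rough collection script along those lines is sketched below; it is only an illustration, and the mount point and output directory are placeholder names rather than anything from Daniel's setup.)

  #!/bin/sh
  # Sketch only: gather basic GFS hang/assertion diagnostics on one node.
  # MOUNTPOINT and OUTDIR are assumed names - adjust for your cluster.
  MOUNTPOINT=/mnt/stripefs
  OUTDIR=/var/tmp/gfs-debug-`hostname`-`date +%Y%m%d%H%M%S`
  mkdir -p $OUTDIR

  # GFS lock state for the (possibly stuck) mount
  gfs_tool lockdump $MOUNTPOINT > $OUTDIR/lockdump.txt 2>&1

  # Full process table, to spot tasks stuck in D state
  ps aux > $OUTDIR/ps.txt

  # Dump kernel stack traces of all tasks into the kernel log, then save it
  echo 1 > /proc/sys/kernel/sysrq
  echo t > /proc/sysrq-trigger
  dmesg > $OUTDIR/dmesg.txt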
Thanks, Daniel From agauthier at realmedia.com Tue Nov 23 16:49:52 2004 From: agauthier at realmedia.com (Arnaud Gauthier) Date: Tue, 23 Nov 2004 17:49:52 +0100 Subject: [Linux-cluster] Startup scripts Message-ID: <200411231749.53227.agauthier@realmedia.com> Hi, Well, looks like I have a question about the GFS startup scripts you are using around :-)) I have modified a GFS script from a RHEL3 like distribution (StartCom), for starting GFS 6.1 on my nodes. I called the script gfs2 and I am starting it on all my nodes like in the usage file (except the modules loaded on rc.modules). Is it a good solution, do you have more "stable & specific" scripts ? Regards, Arnaud -- Arnaud Gauthier 247 Real Media From amanthei at redhat.com Tue Nov 23 17:09:25 2004 From: amanthei at redhat.com (Adam Manthei) Date: Tue, 23 Nov 2004 11:09:25 -0600 Subject: [Linux-cluster] Startup scripts In-Reply-To: <200411231749.53227.agauthier@realmedia.com> References: <200411231749.53227.agauthier@realmedia.com> Message-ID: <20041123170925.GF26635@redhat.com> On Tue, Nov 23, 2004 at 05:49:52PM +0100, Arnaud Gauthier wrote: > Hi, > > Well, looks like I have a question about the GFS startup scripts you are using > around :-)) > > I have modified a GFS script from a RHEL3 like distribution (StartCom), for > starting GFS 6.1 on my nodes. I called the script gfs2 and I am starting it > on all my nodes like in the usage file (except the modules loaded on > rc.modules). > > Is it a good solution, do you have more "stable & specific" scripts ? There aren't any GFS-6.1 init.d scripts yet. It's on my list of things that I was suppose to have done yesterday ;) They'll be coming soon... real soon. -- Adam Manthei From ben.m.cahill at intel.com Tue Nov 23 22:34:04 2004 From: ben.m.cahill at intel.com (Cahill, Ben M) Date: Tue, 23 Nov 2004 14:34:04 -0800 Subject: [Linux-cluster] (was: clvmd without GFS?) Effect of atime updates Message-ID: <0604335B7764D141945E202153105960033E26B9@orsmsx404.amr.corp.intel.com> Hi Matt, Ken Preslan just checked in some code to help this situation, i.e. taking a *long* time when reading a large directory with atime updates enabled (i.e. GFS mounted normally, without -o noatime option). Instead of forcing a WAIT until write I/O completes for each atime update (writing the inode block for each file that gets a new atime ... that's a lot of block writes in your case), GFS now will WAIT *only* if another node or process needs to access the file (hopefully rare). This should allow Linux block I/O to write these blocks more efficiently, and not hold up the read process. I'm curious how much of a difference that might make in your situation. If you (or anyone else) can try it out, let us know the results. -- Ben -- Opinions are mine, not Intel's > -----Original Message----- > From: linux-cluster-bounces at redhat.com > [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Matt Mitchell > Sent: Thursday, October 28, 2004 5:39 PM > To: linux clistering > Subject: Re: [Linux-cluster] clvmd without GFS? > > Cahill, Ben M wrote: > > > > I'd be curious to know if it makes a difference if you > mount using -o > > noatime option (see man mount)? The default access-time update > > threshold for GFS is 3600 seconds (1 hour). This can cause > a bunch of > > write transactions to happen, even if you're just doing a read > > operation such as ls. Since your exercise is taking over an hour, > > this might be thrashing the atime updates, but I don't know > how much > > that might be adding. 
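(To make the noatime workaround in the quoted text concrete: it is just a mount option. The device and mount point below are borrowed from another post in this archive purely as an example.)

  # /etc/fstab entry for a GFS filesystem with atime updates disabled
  /dev/pool/pool0   /gfs1   gfs   noatime   0 0

  # or, mounting by hand
  mount -t gfs -o noatime /dev/pool/pool0 /gfs1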
> > Looks like that is the case for the second time slowness; it > also probably explains why the directory seemed relatively > snappy immediately after I finished populating it but not at > all the next morning. > > hudson:/mnt/xs_media# mount > [ snipped ] > /dev/mapper/xs_gfsvg-xs_media on /mnt/xs_media type gfs > (rw,noatime) hudson:/mnt/xs_media# time sh -c 'ls > 100032/mls/fmls_stills | wc -l' > 298407 > > real 74m15.781s > user 0m5.546s > sys 0m40.529s > hudson:/mnt/xs_media# time sh -c 'ls 100032/mls/fmls_stills | wc -l' > 298407 > > real 3m37.052s > user 0m5.502s > sys 0m12.643s > > For sake of comparison, here is the same thing after > unmounting both nodes and remounting only hudson, again with > noatime (so it is not touching any disk blocks): > hudson:/mnt/xs_media# time sh -c 'ls 100032/mls/fmls_stills | wc -l' > 298407 > > real 3m59.533s > user 0m5.501s > sys 0m51.741s > > (Now I am trying to unmount the partition again, and it's hanging.) > > So it is definitely dlm behaving badly...? No, DLM is not causing the delay ... it's just a lot of disk writes ... with most of them causing a WAIT until the disk is written. -- Ben -- Opinions are mine, not Intel's > > -m > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > From jn at it.swin.edu.au Tue Nov 23 22:47:18 2004 From: jn at it.swin.edu.au (John Newbigin) Date: Wed, 24 Nov 2004 09:47:18 +1100 Subject: [Linux-cluster] Grow pool without adding subpools In-Reply-To: <419C31D2.4090807@it.swin.edu.au> References: <419A813B.5070906@it.swin.edu.au> <419A907A.20104@it.swin.edu.au> <419C31D2.4090807@it.swin.edu.au> Message-ID: <41A3BDF6.7090809@it.swin.edu.au> I think I have solved my immediate problem. It is possible to partition the hardware raid disk and use the partitions as subpools. As the disk grows, add a new partition and then add it as a subpool. The problem with this approach is that a reboot is required before fdisk will recognise the increased the disk size (even though blockdev can see it) and the kernel will not load a new partition table if the disk is in use. This means that a drive expansion requires reboot into runlevel 1. I was hoping for an online solution. John. John Newbigin wrote: > Is is possible/useful to create pools on top of LVM on top of hardware > raid? Would that help in this situation? > > Does anyone use gfs on hardware raid? > > John. > > John Newbigin wrote: > >> David Aquilina wrote: >> >>> On Wed, 17 Nov 2004 09:37:47 +1100, John Newbigin >>> wrote: >>> >>>> I though pool-tool -g would do it but it seems that can only add >>>> subpools. >>> >>> >>> >>> >>> Is there any particular reason you don't want to add a subpool? As far >>> as I know, adding subpools is the only way to grow a pool, and won't >>> have any other effect than a slightly longer pool configuration >>> file... >>> >> I am using hardware raid. There is just one device which represents >> the array (/dev/cciss/c0d1). The size of this device has grown but I >> need to grow the pool to fill it. >> >> Perhaps there is a better way. I am still testing so I can recreate >> it differently if necessary. >> >> John. 
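(As a sketch of the partition-per-expansion approach discussed above, combined with the partprobe tip David Aquilina gives later in this archive: the device name is the one from John's posts, and the exact pool_tool invocation for adding the new partition as a subpool is left out since it depends on your pool configuration file.)

  # Check that the kernel sees the grown RAID device (size in 512-byte sectors)
  blockdev --getsize /dev/cciss/c0d1

  # Create an additional partition covering the new space (interactive)
  fdisk /dev/cciss/c0d1

  # Ask the kernel to re-read the partition table without a reboot;
  # this may still fail while the existing partitions are in use
  partprobe /dev/cciss/c0d1

  # Then add the new partition (e.g. /dev/cciss/c0d1p3) as a subpool,
  # as discussed in this thread.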
>> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> http://www.redhat.com/mailman/listinfo/linux-cluster >> >> >> > > -- John Newbigin Computer Systems Officer Faculty of Information and Communication Technologies Swinburne University of Technology Melbourne, Australia http://www.it.swin.edu.au/staff/jnewbigin From phillips at redhat.com Wed Nov 24 13:51:14 2004 From: phillips at redhat.com (Daniel Phillips) Date: Wed, 24 Nov 2004 08:51:14 -0500 Subject: [Linux-cluster] Cluster snapshot server failover details Message-ID: <200411240851.15089.phillips@redhat.com> Hi all, Ben has been working on the cluster infrastructure interface for cluster snapshot server failover for the last little whiles, and in that time, the two of us have covered a fair amount of conceptual ground, figuring out how to handle this reasonably elegantly and racelessly. It turned out to be a _lot_ harder than I first expected. It's just not very easy to handle all the races that come up. However, I think we've got something worked out that is fairly solid, and should work as a model for server failover in general, not just for cluster snapshot. We're also looking to prove that the cluster infrastructure interface we intend to propose to the community in general, does in fact provide the right hooks to tackle such failover problems, including internode messaging that works reliably, even when the membersip of the cluster is changing while the failover protocols are in progress. More than once I've wondered if we need a stronger formalism, such as virtual synchrony, to do the job properly, or if such a formalism would simplify the implementation. At this point, I have a tentative "no" for both questons. There appears to be no message ordering going on here that we can't take care of accurately with the simple cluster messaging facility available. The winning strategy just seems to be to enumerate all the possible events, make sure each is handled, and arrange things so that ordering does not matter where it cannot conveniently be controlled. Ben, the algorithm down below is based loosely on your code, but covers some details that came up while I was putting pen to paper. No doubt I've made a booboo or two somewhere, please shout. On to the problem. We are dealing with failover of a cluster snapshot server, where some clients have to upload state (read locks) to the new server. On server failover, we have to enforce the following rule: 1) Before the new server can begin to service requests, all snapshot clients that were connected to the previous server must have either reconnected (and uploaded their read locks) or left the cluster and for client connection failover: 2) If a snapshot client disconnects, the server can't throw away its locks until it knows the client has really terminated (it might just have suffered a temporary disconnection) Though these rules look simple, they blow up into a fairly complex interaction between clients, the server, agents and the cluster manager. Rule (1) is enforced by having the new server's agent carry out a roll call of all nodes, to see which of them had snapshot clients that may have been connected to the failed server. A complicating factor is, in the middle of carrying out the roll call, we have to handle client joins and parts, and node joins and parts. (What is an agent? 
Each node has a csnap agent running on it that takes care of connecting any number of device mapper csnap clients on the node to a csnap server somewhere on the cluster, controls activating a csnap server on the node if need be, and controls server failover) Most of the complexity that arises lands on the agent rather than the server, which is nice because the server is already sufficiently complex. The server's job is just to accept connections that agents make to it on behalf of their clients, and to notify its own agent of those connections so the agent can complete the roll call. The server's agent will also tell the server about any clients that have parted, as opposed to just being temporarily disconnected, in order to implement rule (2). One subtle point: in normal operation (that is, after failover is completed) when a client connects to its local agent and requests a server, the agent will forward a client join message to the server's agent. The agent responds to the join message, and only then the remote agent establishes the client's connection to the server. It has to be done this way because the remote agent can't allow a client to connect until it has received confirmation from the server agent that the client join message was received. Otherwise, if the remote agent's node parts there will be no way to know the client has parted. Messages There are a fair number of messages that fly around, collated here to help put the event descriptions in context: The server agent will send: - roll call request to remote agents - roll call acknowledgement to remote agent - client join acknowledgement to remote agent - activation message to server The server agent will receive: - initial node list from cluster manager - node part message from cluster manager - snapshot client connection message from server - roll call response from remote agent - client join message from remote agent - client part message from remote agent - connection from local client - disconnection from local client A remote agent will receive: - roll call request from server agent - roll call acknowledgement from server agent - connection from local client - disconnection from local client - client join acknowledgement from server agent A remote agent will send: - roll call response to server agent - client join message to server agent - client part message to server agent A server will receive: - client connection - activation message from server agent - client part from server agent A server will send: - client connection message to server agent Note: the server does not tell the agent about disconnections; instead, the remote agent tells the server agent about client part Note: each agent keeps a list of local client connections. It only actually has to forward clients that were connected to the failed server, but it's less bookkeeping and does no harm just to forward all. 
Agent Data - list of local snapshot client connections - roll call list of nodes - used only for server failover - client list of node:client pairs - used for server failover - used for snapshot client permanent disconnection - each client has state "failover" or "normal" - a failover count - how many snapshot client connections still expected Event handling Server Agent Acquires the server lock (failover): - get the current list of nodes from the cluster manager - Receive it any time before activating the new server - Some recently joined nodes may never have been connected to the old server and it doesn't matter - Don't add nodes that join after this point to the roll call list, they can't possibly have been connected to the old server - set failover count to zero - send a roll call request to the agent on every node in the list Receives a roll call response: - remove the responding node from the roll call list - if the node wasn't on the roll call list, what? - each response has a list of snapshot clients, add each to the client list - set state to failover - increase the failover count - send roll call response acknowledgement to remote agent - If roll call list empty and failover count zero, activate server Receives a node part event from cluster manager: - remove the node from the roll call list, if present - remove each matching client from the client list - if client state was normal send client part to server - if client state was failover, decrease the failover count - If roll call list empty and failover count zero, activate server Receives a snapshot client connection message from server: - if the client wasn't on the client list, what? - if the client state was failover - decrease the failover count - set client state to normal - If roll call list empty and failover count zero, activate server Receives a client join from a remote agent: - add the client to the client list in state normal - send client join acknowledgement to remote agent Receives a client part from a remote agent: - if client wasn't on the client list, what? - remove the client from the client list - if client state was normal send client part to server - if client state was failover, decrease the failover count - If roll call list empty and failover count zero, activate server Receives a snapshot client connection from device: - Send client join message to self Receives a snapshot client disconnection from device: - Send client part message to self Remote Agent Receives a snapshot client connection from device: - If roll call response has already been sent, send client join message to server agent - otherwise just add client to local snapshot client list Receives a snapshot client disconnection from device: - If roll call response has already been sent, send client part message to server agent - otherwise just remove client from local snapshot client list Receives a roll call request: - Send entire list of local snapshot clients to remote agent Receives acknowledgement of roll call response from server agent: - Connect entire list of local snapshot clients to server Receives client join acknowledgement from server agent: - Connect client to server Server Receives a client connection - Send client connection message to server agent Easy, huh? 
Daniel From mauelshagen at redhat.com Wed Nov 24 14:26:33 2004 From: mauelshagen at redhat.com (Heinz Mauelshagen) Date: Wed, 24 Nov 2004 15:26:33 +0100 Subject: [Linux-cluster] *** Announcement: dmraid 1.0.0-rc5f *** Message-ID: <20041124142633.GA16687@redhat.com> *** Announcement: dmraid 1.0.0-rc5f *** dmraid 1.0.0-rc5f is available at http://people.redhat.com:/heinzm/sw/dmraid/ in source, source rpm and i386 rpm. dmraid (Device-Mapper Raid tool) discovers, [de]activates and displays properties of software RAID sets (i.e. ATARAID) and contained DOS partitions using the device-mapper runtime of the 2.6 kernel. The following ATARAID types are supported on Linux 2.6: Highpoint HPT37X Highpoint HPT45X Intel Software RAID NVidia NForce *** NEW *** Promise FastTrack Silicon Image Medley This ATARAID type is only basically supported in this version (I need better metadata format specs; please help): LSI Logic MegaRAID Please provide insight to support those metadata formats completely. Thanks. See files README and CHANGELOG, which come with the source tarball for prerequisites to run this software, further instructions on installing and using dmraid! CHANGELOG is contained below for your convenience as well. Call for testers: ----------------- I need testers with the above ATARAID types, to check that the mapping created by this tool is correct (see options "-t -ay") and access to the ATARAID data is proper. In case you have a different ATARAID solution from those listed above, please feel free to contact me about supporting it in dmraid. You can activate your ATARAID sets without danger of overwriting your metadata, because dmraid accesses it read-only unless you use option -E together with -r in order to erase ATARAID metadata (see 'man dmraid')! This is a release candidate version so you want to have backups of your valuable data *and* you want to test accessing your data read-only first in order to make sure that the mapping is correct before you go for read-write access. Contacts: --------- The author is reachable at . For test results, mapping information, discussions, questions, patches, enhancement requests and the like, please subscribe and mail to . -- Regards, Heinz -- The LVM Guy -- CHANGELOG: --------- FIXES: ------ o make suffix in hpt45x set names numeric o HPT37x metadata format handler RAID10 grouping logic o HPT37x/HPT45x avoid devision by zero bug in case ->raid_disks is zero for spares o avoid devision by zero bug in case of stride = 0 o SIL device names / checksums o calc_total_sectors() on unsymmetric mirrors o Partition name suffix to make GRUB happy o perform() could return an error without releasing a lock FEATURES: --------- o added NVidia metadata format handler o quorate SIL metadata copies o sorting order of stacked subset enhanced (RAID10; hpt37x, hpt45x, lsi, nvidia and sil) o started event methods implementation in metadata format handlers o output linefeed to .offset files for better readability (-r -D) o use /sys/block/*/removable to avoid acessing removable devices o display of spare devices with -r -c{0,2} o enhanced spare device handling o '-h' option doesn't need to stand alone any more o -s displays top level sets only. "-s -s" shows subsets as well. 
o -f allows partial qualification of format names now (eg, "dmraid -f hpt -r" will search for hpt37x and hpt45x formats) MISCELANIOUS: ------------ o HPT37X shows subset name suffixes with -r o streamlined display.c o added lib_context* argument to alloc_disk_info() in order to be able to display an error message on failure o factored basic RAID set allocation code out of all metadata format handler into find_or_alloc_set() o factored RAID superset allocation code out of metadata format handlers into join_superset() o streamlined endianess code using CVT* macros o streamlined free_set() code o check option enum valid o introduced various metadata extraction macros to streamline related code (eg, RD(), RS()) o optimized format handler pre-registration checks o avoid format handler type() method altogether by introducing a RAID device type member o generalized list_add_sorted() which can be used to sort any "struct list_head*" which voided list_add_dev_sorted() o find_set() modified to avoid global searches for stacked sets o converted get_scsi_serial to fallback using SG_IO, SCSI_IOCTL_SEND_COMMAND and ATA identify o introduced p_fmt() for formated string pushs in order to streamline activate.c; value return code of p_fmt() o moved some paths + filenames to lib_context o introduced RAID set flag for metadata format handlers which decide to maximize capacity in unsymetric RAID0 sets o factored out device information allocation of scan.c into metadata.c o introduced RAID device list to library context in order to remove pointer from device info and be able to handle remaining RAID device structures better on library cleanup o streamlined commands.c o changed column output delimiter to ':' o introduced various enums replacing integers =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Heinz Mauelshagen Red Hat GmbH Consulting Development Engineer Am Sonnenhang 11 56242 Marienrachdorf Germany Mauelshagen at RedHat.com +49 2626 141200 FAX 924446 =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- From craig at wotever.net Wed Nov 24 15:15:31 2004 From: craig at wotever.net (Craig Ward) Date: Wed, 24 Nov 2004 15:15:31 -0000 Subject: [Linux-cluster] HA with GFS and GNBD Message-ID: <000901c4d238$73382f40$d08beb50@KRAG> Hi there, We're trying to achieve a high availability fail over for data store that's part of a load balanced web server farm. Is there a way of achieving a cheap SAN by using GFS on our webservers and exporting shares from two backend servers using GNBD so effectively having a network mirrored raid setup. The idea being if server #1 fails the web servers can still access the data from server #2. Apologies in advance if this is a 'newbie' question but I've spent a while reading all the Redhat docs and I'm still unsure if this is something that can be achieved with GFS and GNBD alone. The last solution at http://www.redhat.com/docs/manuals/csgfs/admin-guide/s1-ov-perform.html seems to be what we're looking for but it's shy on details. Many thanks, Craig. From notiggy at gmail.com Wed Nov 24 15:39:45 2004 From: notiggy at gmail.com (Brian Jackson) Date: Wed, 24 Nov 2004 09:39:45 -0600 Subject: [Linux-cluster] HA with GFS and GNBD In-Reply-To: <000901c4d238$73382f40$d08beb50@KRAG> References: <000901c4d238$73382f40$d08beb50@KRAG> Message-ID: You can't currently do this in a highly available way. You'd need cluster aware software raid for it to work, and that's not available yet. 
It shouldn't be too long before work on it gets underway since this seems to be a commonly requested item. --Brian Jackson On Wed, 24 Nov 2004 15:15:31 -0000, Craig Ward wrote: > Hi there, > > We're trying to achieve a high availability fail over for data store > that's part of a load balanced web server farm. > > Is there a way of achieving a cheap SAN by using GFS on our webservers > and exporting shares from two backend servers using GNBD so effectively > having a network mirrored raid setup. The idea being if server #1 fails > the web servers can still access the data from server #2. > > Apologies in advance if this is a 'newbie' question but I've spent a > while reading all the Redhat docs and I'm still unsure if this is > something that can be achieved with GFS and GNBD alone. > > The last solution at > http://www.redhat.com/docs/manuals/csgfs/admin-guide/s1-ov-perform.html > seems to be what we're looking for but it's shy on details. > > Many thanks, > Craig. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > From craig at wotever.net Wed Nov 24 16:20:31 2004 From: craig at wotever.net (Craig Ward) Date: Wed, 24 Nov 2004 16:20:31 -0000 Subject: [Linux-cluster] HA with GFS and GNBD In-Reply-To: Message-ID: <000001c4d241$85172410$d08beb50@KRAG> > You can't currently do this in a highly available way. You'd need > cluster aware software raid for it to work, and that's not available > yet. It shouldn't be too long before work on it gets underway since > this seems to be a commonly requested item. Thanks Brian, its good to get clear answer at last. Has anyone on the list come up against this problem and found an economic solution? I've heard external SCSI arrays mentioned? Any ideas are greatly appreciated. Cheers, Craig. From notiggy at gmail.com Wed Nov 24 16:34:48 2004 From: notiggy at gmail.com (Brian Jackson) Date: Wed, 24 Nov 2004 10:34:48 -0600 Subject: [Linux-cluster] HA with GFS and GNBD In-Reply-To: <000001c4d241$85172410$d08beb50@KRAG> References: <000001c4d241$85172410$d08beb50@KRAG> Message-ID: It's certainly something that's been mentioned often. So you aren't alone. I don't know of a good solution that involves GFS currently. You could look at some of the other fs'es out there like AFS, Lustre, etc., or you could just use something along the lines of an nfs exported drbd. --Brian Jackson On Wed, 24 Nov 2004 16:20:31 -0000, Craig Ward wrote: > > You can't currently do this in a highly available way. You'd need > > cluster aware software raid for it to work, and that's not available > > yet. It shouldn't be too long before work on it gets underway since > > this seems to be a commonly requested item. > > Thanks Brian, its good to get clear answer at last. > > Has anyone on the list come up against this problem and found an > economic solution? I've heard external SCSI arrays mentioned? > > Any ideas are greatly appreciated. > > Cheers, > > > Craig. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > From cjk at techma.com Wed Nov 24 19:33:04 2004 From: cjk at techma.com (Kovacs, Corey J.) Date: Wed, 24 Nov 2004 14:33:04 -0500 Subject: [Linux-cluster] GFS 6 hang during file copy.. Message-ID: I've got a 3 node cluster consisting of DL380's running RHAS3 update 3 with RHGFS 6.0.0-15. The locking is being done on the cluster nodes themselves and are not external. 
I set up nfs to export one of the gfs mount points and this was working fine. I then tried to populate the gfs mount point with about 24GB of files from another system using scp to do some testing. During that copy, gfs froze. That is to say I can still interact with the nodes unless I do anything that might query the gfs mount (e.g. ls -l /gfs1); this hangs the machine. After a reboot things were fine. I did a gfs_tool df /gfs1 and there were ~79000 inodes being used, and right around 20 GB (just under actually) used. The filesystem itself is 300+GB. My question, if it's the right one, is did I hit a limit in inodes? As I recall, GFS maps inodes from the host somehow. Is there a way to increase the number so that I can actually use the space I have or must I reduce the size of the filesystem? Of course I could be completely off base and if so, please let me know... Thanks Corey -------------- next part -------------- An HTML attachment was scrubbed... URL: From kpreslan at redhat.com Wed Nov 24 22:05:52 2004 From: kpreslan at redhat.com (Ken Preslan) Date: Wed, 24 Nov 2004 16:05:52 -0600 Subject: [Linux-cluster] GFS 6 hang during file copy.. In-Reply-To: References: Message-ID: <20041124220552.GA25192@potassium.msp.redhat.com> On Wed, Nov 24, 2004 at 02:33:04PM -0500, Kovacs, Corey J. wrote: > My question, if it's the right one, is did I hit a limit in inodes? As I > recall, GFS maps inodes from the > host somehow. Is there a way to increase the number so that I can actually use > the space I have or > must I reduce the size of the filesystem? > > Of course I could be completely off base and if so, please let me know... It definitely doesn't have anything to do with the number of inodes. If this happens again, collect the following for each node: 1) The output of "gfs_tool lockdump /mountpoint" 2) The output of "ps aux" 3) What gets written to the log when you do "echo 1 > /proc/sys/kernel/sysrq ; echo t > /proc/sysrq-trigger" -- Ken Preslan From schuan2 at gmail.com Wed Nov 24 23:05:49 2004 From: schuan2 at gmail.com (Shih-Che Huang) Date: Wed, 24 Nov 2004 18:05:49 -0500 Subject: [Linux-cluster] gnbd import Message-ID: Hi, I tried to import GNBD, but it gave me the following message. #gnbd_import -i 10.5.0.254 gnbd_import error: could not find gnbd registered in /proc/devices. This probably means that you have not loaded the ghbd module. However, I tried to load it: #modprobe gnbd It came back with --> GNBD v6.0.0 installed. Any ideas? Thanks! -- Shih-Che Huang From dmorgan at gmi-mr.com Wed Nov 24 23:11:41 2004 From: dmorgan at gmi-mr.com (dmorgan at gmi-mr.com) Date: Wed, 24 Nov 2004 23:11:41 -0000 (GMT) Subject: [Linux-cluster] adding/removing lock nodes while cluster running Message-ID: <1388.24.85.175.130.1101337901.squirrel@24.85.175.130> Hello, I understand that lock servers cannot be added or removed from the cluster on the fly. How is this done? It seems like it would be important to be able to do this in case of a failure (I know also about RLM). Is it a matter of taking the CCA device down and then bringing the cluster back online? Thanks in advance, Duncan From dmorgan at gmi-mr.com Thu Nov 25 00:54:10 2004 From: dmorgan at gmi-mr.com (dmorgan at gmi-mr.com) Date: Thu, 25 Nov 2004 00:54:10 -0000 (GMT) Subject: [Linux-cluster] how to determine master gulm server Message-ID: <1697.24.85.175.130.1101344050.squirrel@24.85.175.130> Is there a way to find out which lock_gulmd server in an RLM configuration is the master?
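(Michael Tilstra answers this further down with "gulm_tool getstats"; a quick way to poll each lock server might look like the loop below. The node names are placeholders, and the exact target syntax and the wording of the role line in the getstats output may differ between versions, so treat this as a sketch.)

  # Ask each gulm lock server for its stats and pull out the line that
  # reports whether it is currently master or slave.
  for node in locksv1 locksv2 locksv3; do
      echo "== $node =="
      gulm_tool getstats $node | grep -i -E 'master|slave'
  done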
Thanks, Duncan From cjk at techma.com Thu Nov 25 00:55:54 2004 From: cjk at techma.com (Kovacs, Corey J.) Date: Wed, 24 Nov 2004 19:55:54 -0500 Subject: [Linux-cluster] GFS 6 hang during file copy.. Message-ID: After reading a bit more I see that inode allocation is indeed not the problem. Thanks for the tip below, I'll collect the information and relay it here if it happens again. Corey -----Original Message----- From: linux-cluster-bounces at redhat.com on behalf of Ken Preslan Sent: Wed 11/24/2004 5:05 PM To: linux clistering Subject: Re: [Linux-cluster] GFS 6 hang during file copy.. On Wed, Nov 24, 2004 at 02:33:04PM -0500, Kovacs, Corey J. wrote: > My question, if it's the right one, is did I hit a limit in inodes? As I > recall, GFS maps inodes from the > host somehow. Is there a way to increas the numbe so that I can actually use > the space I have or > must I reduce the size of the filesystem? > > Of course I could be completely off base and if so, please let me know... It definitely doesn't have anything to do with the number of inodes. If this happens again, collect the following for each node: 1) The output of "gfs_tool lockdump /mountpoint" 2) The output of "ps aux" 3) What gets written to the log when you do "echo 1 > /proc/sys/kernel/sysrq ; echo t > /proc/sysrq-trigger" -- Ken Preslan -- Linux-cluster mailing list Linux-cluster at redhat.com http://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- A non-text attachment was scrubbed... Name: winmail.dat Type: application/ms-tnef Size: 3234 bytes Desc: not available URL: From naoki at valuecommerce.com Thu Nov 25 01:38:32 2004 From: naoki at valuecommerce.com (Naoki) Date: Thu, 25 Nov 2004 10:38:32 +0900 Subject: [Linux-cluster] What is the status of GFS? Message-ID: <1101346712.1048.51.camel@dragon.sys.intra> Hi all, I'm anxiously waiting for a good time to use GFS in some of my projects but I'm a little concerned that it's just not ready. I've been lurking on the list for a couple of months now and it seems everything from copying files to using samba will cause hangs or panics. Is this because people here are using fresh code from CVS or simply that it's not there yet? -n. From abe at blur.com Thu Nov 25 02:06:16 2004 From: abe at blur.com (Abe Shelton) Date: Wed, 24 Nov 2004 18:06:16 -0800 Subject: [Linux-cluster] What is the status of GFS? In-Reply-To: <1101346712.1048.51.camel@dragon.sys.intra> References: <1101346712.1048.51.camel@dragon.sys.intra> Message-ID: <41A53E18.4000508@blur.com> I've been lurking too and am also very interested hearing the official, up-to-date status of gfs on both RHEL3 and FC3. We've managed to get GFS-6.0.0-15 working on RHEL3U3 with directly attached SAN storage. (3 nodes, shared disk array via FC HBAs, each node running lock_gulmd) In my first round of benchmarks using bonnie++ on kernel 2.4.21-20.ELsmp, the 'many small files' tests didn't work out so well. We're retesting now with kernel 2.4.21-25.ELsmp. So far everything seems stable and FS performance compared to stock ext3 seems decent. - with bonnie++ running simultaneously on each machine, ext3 averages 77.10 MB/sec writing and 54.65 MB/sec reading. - same setup/methodology with gfs manages 63.01 MB/sec avg for writing and 54.87 MB/sec avg for reading. Abe Naoki wrote: > Hi all, > > I'm anxiously waiting for a good time to use GFS in some of my projects > but I'm a little concerned that it's just not ready. 
I've been lurking > on the list for a couple of months now and it seems everything from > copying files to using samba will cause hangs or panics. Is this because > people here are using fresh code from CVS or simply that it's not there > yet? > > -n. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > > From teigland at redhat.com Thu Nov 25 04:08:16 2004 From: teigland at redhat.com (David Teigland) Date: Thu, 25 Nov 2004 12:08:16 +0800 Subject: [Linux-cluster] What is the status of GFS? In-Reply-To: <1101346712.1048.51.camel@dragon.sys.intra> References: <1101346712.1048.51.camel@dragon.sys.intra> Message-ID: <20041125040816.GA11469@redhat.com> On Thu, Nov 25, 2004 at 10:38:32AM +0900, Naoki wrote: > Hi all, > > I'm anxiously waiting for a good time to use GFS in some of my projects > but I'm a little concerned that it's just not ready. I've been lurking > on the list for a couple of months now and it seems everything from > copying files to using samba will cause hangs or panics. Is this because > people here are using fresh code from CVS or simply that it's not there > yet? The stable version of GFS is certainly ready. There are pointers to the srpm's here: http://sources.redhat.com/cluster/gfs/ (CVS and most of the discussion is about the next version. That's nearing release time and the test reports here are very helpful.) -- Dave Teigland From schuan2 at gmail.com Thu Nov 25 04:15:46 2004 From: schuan2 at gmail.com (Shih-Che Huang) Date: Wed, 24 Nov 2004 23:15:46 -0500 Subject: [Linux-cluster] How to use GFS filesystem (GNBD)? Message-ID: Hi, I have three PCs for my nodes and have done the set up for GFS file system. I created the pool (pool0) and mounted the GFS file systems on each node by following commands #gfs_mkfs -p lock_gulm -t alpha:gfs1 -j /dev/pool/pool0 #mount -t gfs /dev/pool/pool0 /gfs1 How to test them to see if they are running well? How to use GFS file system? Could you give me some example? Thanks! -- Shih-Che Huang From Vincent.Aniello at PipelineTrading.com Thu Nov 25 13:57:27 2004 From: Vincent.Aniello at PipelineTrading.com (Vincent Aniello) Date: Thu, 25 Nov 2004 08:57:27 -0500 Subject: [Linux-cluster] What is the status of GFS? Message-ID: <834F55E6F1BE3B488AD3AFC927A09700033C0C@EMAILSRV1.exad.net> What is the current recommended stable version of the Linux kernel and GFS combination? I recently setup GFS 6.0.0-15 with kernel version 2.4.21-20.ELsmp. According to the message below, though, this will not give me the best performance and stability. Should I upgrade to a later kernel? Will the new version of GFS run with the 2.4 kernel or will it require 2.6? --Vincent This e-mail and/or its attachments may contain confidential and/or privileged information. If you are not the intended recipient(s) or have received this e-mail in error, please notify the sender immediately and delete this e-mail and its attachments from your computer and files. Any unauthorized copying, disclosure or distribution of the material contained herein is strictly forbidden. Pipeline Trading Systems, LLC - Member NASD & SIPC. -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Abe Shelton Sent: Wednesday, November 24, 2004 9:06 PM To: linux clistering Subject: Re: [Linux-cluster] What is the status of GFS? I've been lurking too and am also very interested hearing the official, up-to-date status of gfs on both RHEL3 and FC3. 
We've managed to get GFS-6.0.0-15 working on RHEL3U3 with directly attached SAN storage. (3 nodes, shared disk array via FC HBAs, each node running lock_gulmd) In my first round of benchmarks using bonnie++ on kernel 2.4.21-20.ELsmp, the 'many small files' tests didn't work out so well. We're retesting now with kernel 2.4.21-25.ELsmp. So far everything seems stable and FS performance compared to stock ext3 seems decent. - with bonnie++ running simultaneously on each machine, ext3 averages 77.10 MB/sec writing and 54.65 MB/sec reading. - same setup/methodology with gfs manages 63.01 MB/sec avg for writing and 54.87 MB/sec avg for reading. Abe Naoki wrote: > Hi all, > > I'm anxiously waiting for a good time to use GFS in some of my projects > but I'm a little concerned that it's just not ready. I've been lurking > on the list for a couple of months now and it seems everything from > copying files to using samba will cause hangs or panics. Is this because > people here are using fresh code from CVS or simply that it's not there > yet? > > -n. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > > -- Linux-cluster mailing list Linux-cluster at redhat.com http://www.redhat.com/mailman/listinfo/linux-cluster From dwaquilina at gmail.com Thu Nov 25 18:32:42 2004 From: dwaquilina at gmail.com (David Aquilina) Date: Thu, 25 Nov 2004 13:32:42 -0500 Subject: [Linux-cluster] Grow pool without adding subpools In-Reply-To: <41A3BDF6.7090809@it.swin.edu.au> References: <419A813B.5070906@it.swin.edu.au> <419A907A.20104@it.swin.edu.au> <419C31D2.4090807@it.swin.edu.au> <41A3BDF6.7090809@it.swin.edu.au> Message-ID: On Wed, 24 Nov 2004 09:47:18 +1100, John Newbigin wrote: > the kernel will not load a new partition table if the disk is in I've only used this for new partitions in existing space on a single disk, however the 'partprobe' command has worked for me in the past. -- David Aquilina, RHCE dwaquilina at gmail.com From shirai at sc-i.co.jp Fri Nov 26 09:40:33 2004 From: shirai at sc-i.co.jp (shirai at sc-i.co.jp) Date: Fri, 26 Nov 2004 18:40:33 +0900 Subject: [Linux-cluster] It fails in the start of lock_gulmd because of the redundant configuration of lockserver. References: <419A813B.5070906@it.swin.edu.au><419A907A.20104@it.swin.edu.au> <419C31D2.4090807@it.swin.edu.au><41A3BDF6.7090809@it.swin.edu.au> Message-ID: <006201c4d39b$f9c7d0b0$6500a8c0@viostar> Dear Sir, I am setting up GFS with three nodes (gfslocksv, gfsnodea, gfsnodeb). The kernel version of gfslocksv is 2.4.21-20EL, and GFS-6.0.0-15 is installed. The kernel version of gfsnodea and gfsnodeb is 2.4.21-20ELsmp, and GFS-6.0.0-15 is installed. gfslocksv is the master lock server, and gfsnodea and gfsnodeb are the slave lock servers; gfsnodea and gfsnodeb mount the filesystem. However, even when gfsnodea and gfsnodeb are booted after gfslocksv, the filesystem does not mount. The filesystem can then be mounted by restarting lock_gulmd. However, when gfsnodea and gfsnodeb are rebooted, the filesystem again fails to mount, and in addition fencing is executed. What should I do? Regards ------------------------------------------------------ Shirai Noriyuki Chief Engineer Technical Div.
System Create Inc Ishkawa 2nd Bldg 1-10-8 Kajicho Chiyodaku Tokyo 101-0044 Japan Tel81-3-5296-3775 Fax81-3-5296-3777 e-mail:shirai at sc-i.co.jp web:http://www.sc-i.co.jp ------------------------------------------------------ From clopmz at yahoo.com Fri Nov 26 11:35:47 2004 From: clopmz at yahoo.com (Carlos Lopez) Date: Fri, 26 Nov 2004 05:35:47 -0600 (CST) Subject: [Linux-cluster] I/O problems with RHEL Message-ID: <20041126113547.72706.qmail@web50302.mail.yahoo.com> Hi all, I have a very "heavy" problem with two nodes under Redhat Cluster Suite. On both nodes I have installed Oracle 9i (without RAC). When I try to create a tablespace larger than 256MB, both nodes reboot. I am using an ext3 filesystem on the shared storage. The problem seems to be that the I/O from creating the tablespace blocks the quorum partitions, and then both nodes reboot. Can GFS solve my problem? Where can I find information about this problem? Thank you very much for your help. _________________________________________________________ Do You Yahoo!? Información de Estados Unidos y América Latina, en Yahoo! Noticias. Visítanos en http://noticias.espanol.yahoo.com From gurbir_dhaliwal at indiainfo.com Fri Nov 26 14:13:54 2004 From: gurbir_dhaliwal at indiainfo.com (Gurbir Dhaliwal) Date: Fri, 26 Nov 2004 19:43:54 +0530 Subject: [Linux-cluster] Need info about Redhat CCS and CMAN modules Message-ID: <20041126141354.1FFB023CE7@ws5-3.us4.outblaze.com> Hi all, I am planning to have a high availability setup with custom services and I am looking for an open source stable cluster manager. I was wondering if someone could share their experiences of using the Redhat CCS and CMAN modules in a production environment. I would also be interested if anyone has used these modules with services other than Redhat GFS and OpenGFS. I would basically like to know how stable and usable are the version available from the CVS. Also can it be used with any commercial software with little work ? Any suggestions or links to other similar open source modules for clustering and HA would also be helpful. regards, Gurbir. -- ______________________________________________ IndiaInfo Mail - the free e-mail service with a difference! www.indiainfo.com Check out our value-added Premium features, such as an extra 20MB for mail storage, POP3, e-mail forwarding, and ads-free mailboxes! Powered by Outblaze From linux-cluster at spam.dragonhold.org Fri Nov 26 18:33:33 2004 From: linux-cluster at spam.dragonhold.org (linux-cluster at spam.dragonhold.org) Date: Fri, 26 Nov 2004 18:33:33 +0000 Subject: [Linux-cluster] Cluster of different Architectures Message-ID: <20041126183333.GA10285@dragonhold.org> I'm one of those people who like to find out if I can break things, and I want to find out if what I'm about to do has been tested at all.... Basically I'm about to shift from shared (sort of, long story but basically the hardware doesn't work) SCSI, to FCAL storage (hardware raid). All good (I think) so far. However, I've got my 2 PCs and a SunBlade 100. The Sunblade was originally intended to do something else, but is now free. Therefore I'm tempted to try to create a 3 node cluster using 2 PCs and an UltraSparc based computer. This means (I think) that I'm going to be trying to access the same data from not only a different CPU size (32 bit and 64 bit) but also, if I remember correctly, big endian and little endian. So, is anyone else out there sick enough to have tried this? Is the cluster framework going to work, and/or gfs?
Graham From pcaulfie at redhat.com Mon Nov 29 10:22:37 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Mon, 29 Nov 2004 10:22:37 +0000 Subject: [Linux-cluster] Cluster of different Architectures In-Reply-To: <20041126183333.GA10285@dragonhold.org> References: <20041126183333.GA10285@dragonhold.org> Message-ID: <20041129102237.GC16269@tykepenguin.com> On Fri, Nov 26, 2004 at 06:33:33PM +0000, linux-cluster at spam.dragonhold.org wrote: > I'm one of those people that like to find out if I can break things, and I want to find out if what I'm about to do has been tested at all.... > > Basically I'm about to shift from shared (sort of, long story but basically the hardware doesn't work) SCSI, to FCAL storage (hardware raid). > > All good (I think) so far. > > However, I've got my 2 PCs and a SunBlade 100. The Sunblade was originally intended to do something else, but is now free. Therefore I'm tempted to try to create a 3 node cluster using 2 PCs and an UltraSparc based computer > > This means (I think), that I'm going to be trying to access the same data from not only a different CPU size (32 bit and 64 bit) but also, if I remember correctly, big endian and little endian. > > So, is anyone else out there sick enough to have tried this? Is the cluster framework going to work, and/or gfs? > The cluster framework /should/ work with different architectures. I've tested it with intel,sparc & alpha boxes - but not much! I don't know about GFS itself though, I don't have shared storage at home :) -- patrick From cjk at techma.com Mon Nov 29 12:57:00 2004 From: cjk at techma.com (Kovacs, Corey J.) Date: Mon, 29 Nov 2004 07:57:00 -0500 Subject: [Linux-cluster] I/O problems with RHEL Message-ID: As far as I know, 9i RAC without a clustered filesystem (GFS, OCFS etc) cannot use any normal filesystem. The sharede storage in a RAC setup is either RAW or a true clustered filesystem. So, do you need GFS, not if you use raw disks. You just need to put the binaries on local node disks.... Corey -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Carlos Lopez Sent: Friday, November 26, 2004 6:36 AM To: linux-cluster at redhat.com Subject: [Linux-cluster] I/O problems with RHEL Hi all, I have a very "heavy" problem with two nodes under Redhat Cluster Suite. In both nodes, I have installed an Oracle 9i (without RAC). When I try to create a tablespace more than 256MB, two nodes reboots. I have using ext3 filesystem on shared storage. Problems seems to be with I/O when I create the tablespace that blocks quorum partitons and then both nodes reboots .... GFS can solve my problem ??? Where can I find information about this problem??? Thank you very much for your help. _________________________________________________________ Do You Yahoo!? Informaci?n de Estados Unidos y Am?rica Latina, en Yahoo! Noticias. 
Vis?tanos en http://noticias.espanol.yahoo.com -- Linux-cluster mailing list Linux-cluster at redhat.com http://www.redhat.com/mailman/listinfo/linux-cluster From mtilstra at redhat.com Mon Nov 29 15:50:13 2004 From: mtilstra at redhat.com (Michael Conrad Tadpol Tilstra) Date: Mon, 29 Nov 2004 09:50:13 -0600 Subject: [Linux-cluster] how to determine master gulm server In-Reply-To: <1697.24.85.175.130.1101344050.squirrel@24.85.175.130> References: <1697.24.85.175.130.1101344050.squirrel@24.85.175.130> Message-ID: <20041129155013.GA9193@redhat.com> On Thu, Nov 25, 2004 at 12:54:10AM -0000, dmorgan at gmi-mr.com wrote: > Is there a way to find out which lock_gulmd server in an RLM configuration > is the master? gulm_tool getstats -- Michael Conrad Tadpol Tilstra You humans are all alike. -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From a.pugachev at pcs-net.net Mon Nov 29 16:02:48 2004 From: a.pugachev at pcs-net.net (Anatoly Pugachev) Date: Mon, 29 Nov 2004 19:02:48 +0300 Subject: [Linux-cluster] GFS and vanilla kernel Message-ID: <20041129160248.GK2721@proxy-ttk.pcs-net.net> Hello! Any chance to install GFS over vanilla kernel ? Installation guide will be usefull too. Thanks. -- Anatoly P. Pugachev -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 189 bytes Desc: not available URL: From linux-cluster at spam.dragonhold.org Mon Nov 29 21:28:49 2004 From: linux-cluster at spam.dragonhold.org (Graham Wood) Date: Mon, 29 Nov 2004 21:28:49 +0000 Subject: [Linux-cluster] GFS and vanilla kernel In-Reply-To: <20041129160248.GK2721@proxy-ttk.pcs-net.net> References: <20041129160248.GK2721@proxy-ttk.pcs-net.net> Message-ID: <20041129212849.GA2609@dragonhold.org> On Mon, Nov 29, 2004 at 07:02:48PM +0300, Anatoly Pugachev wrote: > Hello! > > Any chance to install GFS over vanilla kernel ? > Installation guide will be usefull too. > > Thanks. > > -- > Anatoly P. Pugachev If you're looking at the CVS, the instructions in http://sources.redhat.com/cluster/doc/usage.txt are the steps you need to take. To summarise: use cvs to download latest source. run configure in the relevant directories - telling it where the kernel source is. Run make install Create /etc/cluster/cluster.conf startup cluster create filesystems mount filesystems. I rebuilt mine a few days ago - from start to finish (including debian sarge install) it was a couple of hours from start to finish. Graham From qliu at ncsa.uiuc.edu Mon Nov 29 21:53:51 2004 From: qliu at ncsa.uiuc.edu (Qian Liu) Date: Mon, 29 Nov 2004 15:53:51 -0600 Subject: [Linux-cluster] GFS and vanilla kernel In-Reply-To: <20041129212849.GA2609@dragonhold.org> References: <20041129160248.GK2721@proxy-ttk.pcs-net.net> <20041129160248.GK2721@proxy-ttk.pcs-net.net> Message-ID: <5.1.0.14.2.20041129155150.04678b80@pop.ncsa.uiuc.edu> Graham, Could you tell which Linux kernel and distribution you are using for your GFS? How many nodes? and what does your cluster.conf look like? Thanks! -Qian At 21:28 2004-11-29 +0000, you wrote: >On Mon, Nov 29, 2004 at 07:02:48PM +0300, Anatoly Pugachev wrote: > > Hello! > > > > Any chance to install GFS over vanilla kernel ? > > Installation guide will be usefull too. > > > > Thanks. > > > > -- > > Anatoly P. 
Pugachev > >If you're looking at the CVS, the instructions in > >http://sources.redhat.com/cluster/doc/usage.txt > >are the steps you need to take. > >To summarise: > >use cvs to download latest source. >run configure in the relevant directories - telling it where the kernel >source is. >Run make install >Create /etc/cluster/cluster.conf >startup cluster >create filesystems >mount filesystems. > >I rebuilt mine a few days ago - from start to finish (including debian >sarge install) it was a couple of hours from start to finish. > >Graham > >-- >Linux-cluster mailing list >Linux-cluster at redhat.com >http://www.redhat.com/mailman/listinfo/linux-cluster From linux-cluster at spam.dragonhold.org Mon Nov 29 22:03:37 2004 From: linux-cluster at spam.dragonhold.org (Graham Wood) Date: Mon, 29 Nov 2004 22:03:37 +0000 Subject: [Linux-cluster] GFS and vanilla kernel In-Reply-To: <5.1.0.14.2.20041129155150.04678b80@pop.ncsa.uiuc.edu> References: <20041129160248.GK2721@proxy-ttk.pcs-net.net> <20041129160248.GK2721@proxy-ttk.pcs-net.net> <5.1.0.14.2.20041129155150.04678b80@pop.ncsa.uiuc.edu> Message-ID: <20041129220337.GC2609@dragonhold.org> On Mon, Nov 29, 2004 at 03:53:51PM -0600, Qian Liu wrote: > Graham, > Could you tell which Linux kernel and distribution I'm using Debian (sarge, since stable is just too out of date, and I've never had a problem with sarge (touch wood)). I'm running a 2.6.9 kernel, but I also had it running off 2.6.8.1 I grab the source from kernel.org, and then compile it up myself - I pick a few options that other people may not (mainly relating to the hardware I've got, which isn't exactly standard) but other than that it's pretty "simple". I've used the make-kpkg stuff from debian to allow me to better manage the created kernels, but that's the only thing "non-vanilla" about it. > How many nodes? 2 at the moment - there's a couple of posts from me in the last few days about attempting to put a sparc node in as a 3rd node, but that's not really working out properly at the moment. If that doesn't work, I've got another x86 motherboard lying around that I'll use to stick a normal 3rd node in the cluster - I like the idea of 3, since I've only really used 2 node clusters (sun, hp, aix) before. > what does your cluster.conf look like? The 2 node config file is pretty much nicked from that usage.txt, since that's what I used to create it, although I don't have any fencing devices: ======================================= ======================================= From michael.krietemeyer at informatik.uni-rostock.de Thu Nov 25 07:08:41 2004 From: michael.krietemeyer at informatik.uni-rostock.de (Michael Krietemeyer) Date: Thu, 25 Nov 2004 08:08:41 +0100 Subject: [Linux-cluster] node names for gfs Message-ID: <1101366521.18681.4.camel@don> Hello, As I started playing with GFS 6 there was the requirement that the names of the nodes must differ in the first 8 characters. Is this "fixed"? Thanks, Michael Krietemeyer From gurbir_dhaliwal at indiainfo.com Fri Nov 26 10:03:51 2004 From: gurbir_dhaliwal at indiainfo.com (Gurbir Dhaliwal) Date: Fri, 26 Nov 2004 15:33:51 +0530 Subject: [Linux-cluster] Need info about Redhat CCS and CMAN modules Message-ID: <20041126100351.54F58416118@ws5-2.us4.outblaze.com> Hi all, I am planning to have a high availability setup with custom services and I am looking for an open source stable cluster manager. I was wondering if someone could share their experiences of using the Redhat CCS and CMAN modules in a production environment. 
I would also be interested if anyone has used these modules with services other than Redhat GFS and OpenGFS. I would basically like to know how stable and usable the versions available from the CVS are. Also can it be used with any commercial software with little work ? Any suggestions or links to other similar open source modules for clustering and HA would also be helpful. regards, Gurbir. From greg.freemyer at gmail.com Sat Nov 27 19:55:49 2004 From: greg.freemyer at gmail.com (Greg Freemyer) Date: Sat, 27 Nov 2004 14:55:49 -0500 Subject: [Linux-cluster] Need info about Redhat CCS and CMAN modules In-Reply-To: <20041126141354.1FFB023CE7@ws5-3.us4.outblaze.com> References: <20041126141354.1FFB023CE7@ws5-3.us4.outblaze.com> Message-ID: <87f94c37041127115542f5cf28@mail.gmail.com> On Fri, 26 Nov 2004 19:43:54 +0530, Gurbir Dhaliwal wrote: > Hi all, > > Any suggestions or links to other similar open source modules for clustering and HA would also be helpful. > > regards, > Gurbir. > -- If you only need HA clustering, you might also want to look at the linux-ha project. It is the clustering solution SUSE uses in their Server distros. It is based on heartbeat/mon/drbd. The official website is www.linux-ha.org, but they are getting close to releasing version 2 of the project. Version 2 is documented at http://linuxha.trick.ca/ Greg -- Greg Freemyer From linux-cluster at spam.dragonhold.org Mon Nov 29 21:49:46 2004 From: linux-cluster at spam.dragonhold.org (Graham Wood) Date: Mon, 29 Nov 2004 21:49:46 +0000 Subject: [Linux-cluster] Cluster of different Architectures In-Reply-To: <20041129102237.GC16269@tykepenguin.com> References: <20041126183333.GA10285@dragonhold.org> <20041129102237.GC16269@tykepenguin.com> Message-ID: <20041129214946.GB2609@dragonhold.org> > The cluster framework /should/ work with different architectures. I've tested it > with intel,sparc & alpha boxes - but not much! I've tried it today, and managed to kill the cluster quite spectacularly. The sun node joins, and gets counted as a vote, but in the "Quorum" line of /proc/cluster/status, there's a disagreement between the sun node and the x86 nodes. The x86 are running, with a vote count of 3, and the sun box is "frozen" or something, with a quorum count of 0. Attempting to leave the cluster (for any node) resulted in a failure code, and the load average on the first node skyrocketed (12 before it stopped being usable) - dlm was taking 100% CPU, and I got a kernel oops too. To compile it, I had to edit the source in one place (two casts), and had to play with debian's version of gcc to get it to compile 64 bit code that worked (since the majority of libraries are 32 bit only), I ended up compiling up libxml2 myself. If you've had it working in the past, I'll play some more - but at the moment the 2 x86 boxes are live, so I can't break them too often *grin* > I don't know about GFS itself though, I don't have shared storage at home :) GNBD means you don't need it! 
:) I've been trying to work out whether GNBD on top of DRBD could be used to make HA "virtual" shared storage - but I've not used GNBD (I have an unjustified dislike of "fake" shared storage) so I don't know how well it would work, and of course some sort of failover would be required too - again I don't know how well that would work with GNBD. And of course you've got the problem of bringing the systems back in line in the right order in case you have multiple failures causing a full collapse of the system. If I get stuck with this sparc attaching to the x86 pair, I'll see if I can get the ultra10 working again and look at GNBD then. I probably should write down what I'm doing - for my benefit if no-one else's, I've had to puzzle out how to setup/configure stuff multiple times when I've rebuilt already... Graham PS. In case anyone is interested I've attached the dump from the primary node that I got after trying to join the sparc(wirenth) to the existing cluster (ramoth & mnementh), and then get it to leave again. Apologies for the lack of imagination in the names, but I like them. :) -------------- next part -------------- A non-text attachment was scrubbed... Name: ramoth.log.gz Type: application/octet-stream Size: 39645 bytes Desc: not available URL: From danderso at redhat.com Tue Nov 30 14:54:20 2004 From: danderso at redhat.com (Derek Anderson) Date: Tue, 30 Nov 2004 08:54:20 -0600 Subject: [Linux-cluster] node names for gfs In-Reply-To: <1101366521.18681.4.camel@don> References: <1101366521.18681.4.camel@don> Message-ID: <200411300854.20460.danderso@redhat.com> On Thursday 25 November 2004 01:08, Michael Krietemeyer wrote: > Hello, > > As I started playing with GFS 6 there was the requirement that the names > of the nodes must differ in the first 8 characters. > Is this "fixed"? Yes, in versions >= 6.0.0-10. > > Thanks, Michael Krietemeyer > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster From pcaulfie at redhat.com Tue Nov 30 16:24:24 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Tue, 30 Nov 2004 16:24:24 +0000 Subject: [Linux-cluster] Cluster of different Architectures In-Reply-To: <20041129214946.GB2609@dragonhold.org> References: <20041126183333.GA10285@dragonhold.org> <20041129102237.GC16269@tykepenguin.com> <20041129214946.GB2609@dragonhold.org> Message-ID: <20041130162423.GB18719@tykepenguin.com> On Mon, Nov 29, 2004 at 09:49:46PM +0000, Graham Wood wrote: > > The cluster framework /should/ work with different architectures. I've tested it > > with intel,sparc & alpha boxes - but not much! > I've tried it today, and managed to kill the cluster quite spectacularly. The sun node joins, and gets counted as a vote, but in the "Quorum" line of /proc/cluster/status, there's a disagreement between the sun node and the x86 nodes. The x86 are running, with a vote count of 3, and the sun box is "frozen" or something, with a quorum count of 0. Attempting to leave the cluster (for any node) resulted in a failure code, and the load average on the first node skyrocketed (12 before it stopped being usable) - dlm was taking 100% CPU, and I got a kernel oops too. I've checked in a fix. There was a change made recently and I forgot the byte-swapping. 
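As an aside on Graham's GNBD-on-top-of-DRBD idea a little further up: in outline it would just mean exporting the DRBD device from whichever node is currently primary and importing it on the GFS nodes. The sketch below is only that -- the gnbd_serv/gnbd_export invocations and the /dev/gnbd/ path are from memory, the host and export names are made up, and it does nothing about the failover and ordering problems Graham points out.

    # on the storage node (current drbd primary): serve the device over gnbd
    gnbd_serv
    gnbd_export -d /dev/drbd0 -e shared0        # "shared0" is an arbitrary export name
    # on each GFS node: import it and treat it like any other shared block device
    gnbd_import -i storage-node
    mount -t gfs /dev/gnbd/shared0 /mnt/gfs1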
> To compile it, I had to edit the source in one place (two casts), and had to play with debian's version of gcc to get it to compile 64 bit code that worked (since the majority of libraries are 32 bit only), I ended up compiling up libxml2 myself > > If you've had it working in the past, I'll play some more - but at the moment the 2 x86 boxes are live, so I can't break them too often *grin* > > > I don't know about GFS itself though, I don't have shared storage at home :) > GNBD means you don't need it! :) Ah well, but that means I would have to test GNBD on mixed-architecture ;-) patrick From linux-cluster at spam.dragonhold.org Tue Nov 30 17:00:02 2004 From: linux-cluster at spam.dragonhold.org (Graham Wood) Date: Tue, 30 Nov 2004 17:00:02 +0000 Subject: [Linux-cluster] Cluster of different Architectures In-Reply-To: <20041130162423.GB18719@tykepenguin.com> References: <20041126183333.GA10285@dragonhold.org> <20041129102237.GC16269@tykepenguin.com> <20041129214946.GB2609@dragonhold.org> <20041130162423.GB18719@tykepenguin.com> Message-ID: <20041130170002.GC3539@dragonhold.org> On Tue, Nov 30, 2004 at 04:24:24PM +0000, Patrick Caulfield wrote: > I've checked in a fix. There was a change made recently and I forgot the > byte-swapping. If I manage to get the stuff for work done in a sane amount of time tonight, I'll try a CVS update on all three nodes and try it again. Is there any debug or anything that I can enable to get proper info if it doesn't work? Might as well get as much info out of it as we can if it doesn't work. > > > I don't know about GFS itself though, I don't have shared storage at home :) > > GNBD means you don't need it! :) > Ah well, but that means I would have to test GNBD on mixed-architecture ;-) Oooh - that would be interesting, especially if my idea with DRBD in the backend was workable - you could end up mirroring between the x86 and sparc, as well as failing over between them. Suddenly it gets very interesting. I think I'll take a backup of the data on the gfs partitions before I do this. :) From pcaulfie at redhat.com Tue Nov 30 17:26:40 2004 From: pcaulfie at redhat.com (Patrick Caulfield) Date: Tue, 30 Nov 2004 17:26:40 +0000 Subject: [Linux-cluster] Cluster of different Architectures In-Reply-To: <20041130170002.GC3539@dragonhold.org> References: <20041126183333.GA10285@dragonhold.org> <20041129102237.GC16269@tykepenguin.com> <20041129214946.GB2609@dragonhold.org> <20041130162423.GB18719@tykepenguin.com> <20041130170002.GC3539@dragonhold.org> Message-ID: <20041130172640.GA389@tykepenguin.com> On Tue, Nov 30, 2004 at 05:00:02PM +0000, Graham Wood wrote: > On Tue, Nov 30, 2004 at 04:24:24PM +0000, Patrick Caulfield wrote: > > I've checked in a fix. There was a change made recently and I forgot the > > byte-swapping. > If I manage to get the stuff for work done in a sane amount of time tonight, I'll try a CVS update on all three nodes and try it again. Is there any debug or anything that I can enable to get proper info if it doesn't work? Might as well get as much info out of it as we can if it doesn't work. For the clustering infrastructure (CMAN) you have to recompile the modules with the debugging options set on - look for DEBUG_COMMS near the bottom of cnxman-private.h. If you compile the DLM with debugging enabled in the Makefile then it will dump its info into a circular buffer available in /proc/cluster/dlm_debug. Not sure about the rest. 
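Patrick's debugging pointers above amount to roughly the following. The path to cnxman-private.h is a guess at the CVS tree layout, and exactly how DEBUG_COMMS and the DLM debug option get switched on should be taken from the headers and Makefiles themselves rather than from this sketch.

    # CMAN: find and enable the comms debugging define, then rebuild the module
    grep -n DEBUG_COMMS cman-kernel/src/cnxman-private.h
    # DLM: rebuild with debugging enabled in its Makefile; the circular buffer
    # of debug messages can then be read on a running node with:
    cat /proc/cluster/dlm_debug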
-- patrick From ben.m.cahill at intel.com Tue Nov 30 18:19:27 2004 From: ben.m.cahill at intel.com (Cahill, Ben M) Date: Tue, 30 Nov 2004 10:19:27 -0800 Subject: [Linux-cluster] Cluster-aware software RAID? Message-ID: <0604335B7764D141945E202153105960033E26CC@orsmsx404.amr.corp.intel.com> Hi all, Is there any cluster-aware SW RAID that can be used with GFS? -- Ben -- Opinions are mine, not Intel's From jbrassow at redhat.com Tue Nov 30 18:59:33 2004 From: jbrassow at redhat.com (Jonathan E Brassow) Date: Tue, 30 Nov 2004 12:59:33 -0600 Subject: [Linux-cluster] Cluster-aware software RAID? In-Reply-To: <0604335B7764D141945E202153105960033E26CC@orsmsx404.amr.corp.intel.com> References: <0604335B7764D141945E202153105960033E26CC@orsmsx404.amr.corp.intel.com> Message-ID: cluster mirroring is available in cluster/cmirror. Simply checking out the cluster directory from cvs and performing a build should leave you with a dm-log_cluster module that will enable cluster mirroring. Note that the name of the module will likely change at some point before it stabilizes. I have only ever used dmsetup to create/manage these mirrors. Additionally, there is still much work to be done (have a look at the TODO file). I would certainly not classify it as production ready. With other people starting to look at it, I can certainly start making the TODO and README files more verbose to reflect its level of readiness. brassow On Nov 30, 2004, at 12:19 PM, Cahill, Ben M wrote: > Hi all, > > Is there any cluster-aware SW RAID that can be used with GFS? > > -- Ben -- > > Opinions are mine, not Intel's > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster > From bmarzins at redhat.com Tue Nov 30 20:00:57 2004 From: bmarzins at redhat.com (Benjamin Marzinski) Date: Tue, 30 Nov 2004 14:00:57 -0600 Subject: [Linux-cluster] gnbd import In-Reply-To: References: Message-ID: <20041130200057.GC31852@phlogiston.msp.redhat.com> On Wed, Nov 24, 2004 at 06:05:49PM -0500, Shih-Che Huang wrote: > Hi, > I tried to import GNBD, but I came out following message. > #gnbd_import -i 10.5.0.254 > gnbd_import error: could not find gnbd registered in /proc/devices. > This probably means that you have not loaded the ghbd module. > > However, I tried to load it > #modprobe gnbd > It came out--> > GNBD v6.0.0 installed. > > Any ideas? Um.. That's strange. Have you manually checked /proc/devices? # cat /proc/devices you should see gnbd under the Block devices section. -Ben > Thanks! > > > -- > Shih-Che Huang > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster From owen at isrl.uiuc.edu Tue Nov 30 20:32:10 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Tue, 30 Nov 2004 14:32:10 -0600 Subject: [Linux-cluster] NFS + GFS = kernel panic Message-ID: <20041130203210.GC1163@iwork57.lis.uiuc.edu> Hi all, More from my testing. I have the CVS versions of GFS, ccs, and lock_gulm, all from Nov 21, with a stock 2.6.9 kernel from kernel.org. Things seem to run fine until NFS is used. NFS appears to work for a bit, but either after a few hours or when there is a known stale NFS handle, the kernel panics with (Sorry, only a fragment of the error) Failed assertion fs/gfs/glock.c line 1366. Any ideas on this one? 
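For anyone hitting the same gnbd_import error, Ben's check boils down to something like this (the server address is simply the one from the original report):

    modprobe gnbd
    grep gnbd /proc/devices      # should be listed under the "Block devices" section
    gnbd_import -i 10.5.0.254    # retry the import once the device is registered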
-- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> From jbrassow at redhat.com Tue Nov 30 20:36:58 2004 From: jbrassow at redhat.com (Jonathan E Brassow) Date: Tue, 30 Nov 2004 14:36:58 -0600 Subject: [Linux-cluster] PLEASE NOTE:: update in cluster.conf tags for CVS cluster products Message-ID: <94D9A4AA-430F-11D9-86A2-000A957BB1F6@redhat.com> GFS <= 6.0.x are not affected. Some of the tags in cluster.conf have changed to help make parsing the xml document easier and help facilitate a GUI creation. The affected tags are: nodes, node, fence_devices, device, and group They have become: clusternodes, clusternode, fencedevices, fencedevice, and resourcegroup The documents describing the cluster.conf file should be updated with the new information (cluster.conf.5, lock_gulmd.5, usage.txt, mini-gfs.txt, rgmanager/examples/cluster.conf). An example cluster.conf file might look like: From agauthier at realmedia.com Tue Nov 30 22:05:10 2004 From: agauthier at realmedia.com (Arnaud Gauthier) Date: Tue, 30 Nov 2004 23:05:10 +0100 Subject: [Linux-cluster] NFS + GFS = kernel panic In-Reply-To: <20041130203210.GC1163@iwork57.lis.uiuc.edu> References: <20041130203210.GC1163@iwork57.lis.uiuc.edu> Message-ID: <200411302305.10303.agauthier@realmedia.com> Le mardi 30 Novembre 2004 21:32, Brynnen R Owen a écrit : > kernel.org. Things seem to run fine until NFS is used. NFS appears > to work for a bit, but either after a few hours or when there is a > known stale NFS handle, the kernel panics with (Sorry, only a fragment > of the error) I used unfsd (NFS v3 in user space) without any trouble with GFS & latest kernel 2.6.9. But kernel level NFS was not stable. I found the same issue with ogfs or older CVS versions of GFS (with kernel 2.6.7). Hope this can help you as a workaround :-) Regards, Arnaud -- Arnaud Gauthier Realmedia From Vincent.Aniello at PipelineTrading.com Tue Nov 30 23:44:02 2004 From: Vincent.Aniello at PipelineTrading.com (Vincent Aniello) Date: Tue, 30 Nov 2004 18:44:02 -0500 Subject: [Linux-cluster] Device missing Message-ID: <834F55E6F1BE3B488AD3AFC927A0970007B29A@EMAILSRV1.exad.net> One of my GFS file systems is no longer mountable because the device associated with it has disappeared from my system. The file system is named /gfs02 and the device it points to is /dev/pool/pool_gfs02. [root at dvblkc01a gfs]# mount /gfs02 mount: special device /dev/pool/pool_gfs02 does not exist [root at dvblkc01a gfs]# The device /dev/pool/pool_gfs02 is missing from my /dev/pool directory. It used to be there. [root at dvblkc01a gfs]# ls -l /dev/pool total 0 brw------- 2 root root 254, 65 Nov 29 18:22 dvcluster_cca brw------- 2 root root 254, 66 Nov 29 18:22 pool_gfs01 brw------- 2 root root 254, 67 Nov 29 18:22 pool_gfs03 [root at dvblkc01a gfs]# The device /dev/sdc2 is associated with /dev/pool/pool_gfs02: poolname pool_gfs02 subpools 1 subpool 0 0 1 pooldevice 0 0 /dev/sdc2 The device /dev/sdc2 seems to be available to the system: Disk /dev/sdc: 71.7 GB, 71728889856 bytes 255 heads, 63 sectors/track, 8720 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Device Boot Start End Blocks Id System /dev/sdc1 1 2 16033+ 83 Linux /dev/sdc2 3 8720 70027335 83 Linux So, how do I get this device and file system back so I can mount it and what would make it disappear in the first place? 
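On Vincent's vanished /dev/pool/pool_gfs02 device: with the pool layer the usual first step is to re-scan the labelled devices and re-assemble the pools, which recreates the missing /dev/pool nodes. The commands below are recalled from the GFS 6.0 pool tools rather than checked against them, so confirm the flags with pool_tool(8)/pool_assemble(8) first; they also address only the "how do I get it back" half of the question, not why it disappeared.

    pool_tool -s         # scan for pool labels; /dev/sdc2 should be reported as pool_gfs02
    pool_assemble -a     # (re)activate all labelled pools, recreating the /dev/pool nodes
    ls -l /dev/pool      # pool_gfs02 should be back
    mount /gfs02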
--Vincent This e-mail and/or its attachments may contain confidential and/or privileged information. If you are not the intended recipient(s) or have received this e-mail in error, please notify the sender immediately and delete this e-mail and its attachments from your computer and files. Any unauthorized copying, disclosure or distribution of the material contained herein is strictly forbidden. Pipeline Trading Systems, LLC - Member NASD & SIPC. From owen at isrl.uiuc.edu Tue Nov 30 23:56:27 2004 From: owen at isrl.uiuc.edu (Brynnen R Owen) Date: Tue, 30 Nov 2004 17:56:27 -0600 Subject: [Linux-cluster] NFS + GFS = kernel panic In-Reply-To: <200411302305.10303.agauthier@realmedia.com> References: <20041130203210.GC1163@iwork57.lis.uiuc.edu> <200411302305.10303.agauthier@realmedia.com> Message-ID: <20041130235627.GA1050@iwork57.lis.uiuc.edu> Thanks, tried that, but we have Macs that insist on locking. The only way to get NFS locking is with the kernel NFS daemon. I do have more on the assertion: "(tmp_gh->gh_flags & GL_LOCAL_EXCL) || !(gh->gh_flags & GL_LOCAL_EXCL)" on line 1366 of fs/gfs/glock.c. Is this related to locks from NFS being sent through to the GFS locking code? On Tue, Nov 30, 2004 at 11:05:10PM +0100, Arnaud Gauthier wrote: > Le mardi 30 Novembre 2004 21:32, Brynnen R Owen a écrit : > > > kernel.org. Things seem to run fine until NFS is used. NFS appears > > to work for a bit, but either after a few hours or when there is a > > known stale NFS handle, the kernel panics with (Sorry, only a fragment > > of the error) > > I used unfsd (NFS v3 in user space) without any trouble with GFS & latest > kernel 2.6.9. But kernel level NFS was not stable. I found the same issue > with ogfs or older CVS versions of GFS (with kernel 2.6.7). > > Hope this can help you as a workaround :-) > > Regards, > Arnaud > -- > Arnaud Gauthier > Realmedia > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > http://www.redhat.com/mailman/listinfo/linux-cluster -- <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><> <> Brynnen Owen ( this space for rent )<> <> owen at uiuc.edu ( )<> <><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
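Finally, the example cluster.conf that should follow Jonathan's tag-rename announcement earlier in this batch did not survive the archive. Using the renamed tags, a minimal two-node file in the style of the old usage.txt example would look something like the sketch below; the cluster and node names, votes and the fence_manual agent are placeholders, and whether the per-node fence reference is still spelled <device> after the rename is exactly the sort of detail to take from the updated usage.txt and cluster.conf.5 rather than from here.

    cat > /etc/cluster/cluster.conf <<'EOF'
    <?xml version="1.0"?>
    <cluster name="alpha" config_version="1">
      <clusternodes>
        <clusternode name="node1" votes="1">
          <fence>
            <method name="single">
              <device name="human" nodename="node1"/>
            </method>
          </fence>
        </clusternode>
        <clusternode name="node2" votes="1">
          <fence>
            <method name="single">
              <device name="human" nodename="node2"/>
            </method>
          </fence>
        </clusternode>
      </clusternodes>
      <fencedevices>
        <fencedevice name="human" agent="fence_manual"/>
      </fencedevices>
    </cluster>
    EOF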