From agk at redhat.com  Fri Jun 25 09:26:40 2004
From: agk at redhat.com (Alasdair G Kergon)
Date: Fri, 25 Jun 2004 10:26:40 +0100
Subject: [Linux-cluster] Source code for Sistina projects
Message-ID: <20040625092640.GE6302@agk.surrey.redhat.com>

Welcome to the linux-cluster mailing list!

The GPL source code for former Sistina projects, including GFS, is now
available at http://sources.redhat.com/cluster/ .

Please use this mailing list to discuss the projects.

We're eagerly awaiting your patches :-)

Alasdair
-- 
agk at redhat.com

From hv at trust-mart.com  Fri Jun 25 10:07:07 2004
From: hv at trust-mart.com (hv)
Date: Fri, 25 Jun 2004 18:07:07 +0800
Subject: [Linux-cluster] segment fault
Message-ID: <000901c45a9c$2be76700$0d7e12ac@hv>

Hi, everyone:

I get a segmentation fault when I run acucobol on a GFS filesystem. My
system is Red Hat AS3 with kernel 2.6.7; LVM2 and GFS are both from CVS.
The host is a Dell 6650 attached to an EMC CX200.

When I run vutil or runcbl on the GFS filesystem, this oops appears:

<1>Unable to handle kernel paging request at virtual address 0100003c
 printing eip:
c02a9ddc
*pde = 256d0001
Oops: 0002 [#2] SMP
Modules linked in: qla2300 qla2xxx scsi_transport_fc
CPU:    6
EIP:    0060:[]    Not tainted
EFLAGS: 00010286   (2.6.7-bk7)
EIP is at find_lock_by_id+0x6/0x1d
eax: 0100003c   ebx: 01000004   ecx: 00000137   edx: 010d02ac
esi: f7dfcf6c   edi: ffffffea   ebp: f75b1400   esp: ecdfddf4
ds: 007b   es: 007b   ss: 0068
Process runcbl (pid: 6003, threadinfo=ecdfd000 task=f5793210)
Stack: ecdfde54 c02b1888 f5793210 c011500c 00100100 00000137 01000004 ecdfde50
       00000340 f7c24938 ecdfde64 00000000 c01fdda4 ecdfde54 c01fdc8f f7dfcf58
       f75b1400 f7c249d0 ecdfdef0 ecdfded8 f7c249d8 00000010 f7c24938 f7dfcf58
Call Trace:
 [] dlm_query+0x64/0x267
 [] default_wake_function+0x0/0x8
 [] get_conflict_global+0x10d/0x26c
 [] query_ast+0x0/0x8
 [] lm_dlm_plock_get+0xb6/0xc2
 [] gfs_lock+0x289/0x31a
 [] gfs_lock+0x0/0x31a
 [] fcntl_getlk+0x15f/0x181
 [] generic_file_fcntl+0xad/0x16b
 [] sys_fcntl64+0x76/0x83
 [] syscall_call+0x7/0xb
Code: f0 83 28 01 0f 88 df 00 00 00 89 d8 e8 07 fd ff ff f0 ff 43

Could anyone help me?
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From pcaulfie at redhat.com  Fri Jun 25 10:18:25 2004
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Fri, 25 Jun 2004 11:18:25 +0100
Subject: [Linux-cluster] segment fault
In-Reply-To: <000901c45a9c$2be76700$0d7e12ac@hv>
References: <000901c45a9c$2be76700$0d7e12ac@hv>
Message-ID: <20040625101825.GA21064@tykepenguin.com>

On Fri, Jun 25, 2004 at 06:07:07PM +0800, hv wrote:
> Hi, everyone:
> I get a segmentation fault when I run acucobol on a GFS filesystem. My
> system is Red Hat AS3 with kernel 2.6.7; LVM2 and GFS are both from CVS.
> The host is a Dell 6650 attached to an EMC CX200.
> When I run vutil or runcbl on the GFS filesystem, this oops appears:
> 
> 
> Could anyone help me?

Try this patch to the kernel:

===== cluster/dlm/queries.c 1.12 vs edited =====
--- 1.12/cluster/dlm/queries.c	Sat Jun 19 06:31:56 2004
+++ edited/cluster/dlm/queries.c	Fri Jun 25 11:14:22 2004
@@ -49,7 +49,7 @@
 	int status = -EINVAL;
 	gd_lkb_t *target_lkb;
 	gd_lkb_t *query_lkb = NULL;	/* Our temporary LKB */
-	gd_ls_t *ls = (gd_ls_t *) lockspace;
+	gd_ls_t *ls = (gd_ls_t *) find_lockspace_by_local_id(lockspace);
 
 	if (!qinfo)

-- 
patrick
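
The one-line fix above swaps a direct cast of the caller-supplied lockspace
handle for a lookup by id. A rough sketch of what such a lookup involves is
below; the names and data structures are purely illustrative, not the actual
DLM code, and locking is omitted. The point is that the handle is a small
opaque id, so treating it as a kernel pointer (as the old cast effectively
did) would explain an oops on an obviously-bogus address like the 0100003c
seen above.

/* Illustrative only -- not the real DLM lockspace code. */
#include <linux/list.h>

struct example_ls {
	int local_id;			/* handle given out to callers */
	struct list_head list;
	/* ... real lockspace state ... */
};

static LIST_HEAD(example_ls_list);

static struct example_ls *example_find_ls_by_id(int id)
{
	struct example_ls *ls;

	list_for_each_entry(ls, &example_ls_list, list)
		if (ls->local_id == id)
			return ls;
	return NULL;			/* caller must handle an unknown id */
}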

From arekm at pld-linux.org  Fri Jun 25 12:37:59 2004
From: arekm at pld-linux.org (Arkadiusz Miskiewicz)
Date: Fri, 25 Jun 2004 14:37:59 +0200
Subject: [Linux-cluster] fix DESTDIR in dlm
Message-ID: <200406251437.59857.arekm@pld-linux.org>

When using DESTDIR, $(libdir) contains it, so the symlinks that get created
point into the staging tree and are broken for DESTDIR's purpose. This patch
fixes it:

diff -urN dlm.org/lib/Makefile dlm/lib/Makefile
--- dlm.org/lib/Makefile	2004-06-25 12:40:01.615519264 +0200
+++ dlm/lib/Makefile	2004-06-25 12:40:15.421420448 +0200
@@ -56,8 +56,8 @@
 	install -d ${libdir}
 	install $(LIBNAME).a ${libdir}
 	install $(LIBNAME).so.$(RELEASE_MAJOR).$(RELEASE_MINOR) ${libdir}
-	ln -sf ${libdir}/$(LIBNAME).so.$(RELEASE_MAJOR).$(RELEASE_MINOR) ${libdir}/$(LIBNAME).so
-	ln -sf ${libdir}/$(LIBNAME).so.$(RELEASE_MAJOR).$(RELEASE_MINOR) ${libdir}/$(LIBNAME).so.$(RELEASE_MAJOR)
+	ln -sf $(LIBNAME).so.$(RELEASE_MAJOR).$(RELEASE_MINOR) ${libdir}/$(LIBNAME).so
+	ln -sf $(LIBNAME).so.$(RELEASE_MAJOR).$(RELEASE_MINOR) ${libdir}/$(LIBNAME).so.$(RELEASE_MAJOR)
 
 uninstall:
 	${UNINSTALL} libdlm.h ${incdir}

-- 
Arkadiusz Miśkiewicz    CS at FoE, Wroclaw University of Technology
arekm.pld-linux.org, 1024/3DB19BBD, JID: arekm.jabber.org, PLD/Linux

From hv at trust-mart.com  Sat Jun 26 01:15:00 2004
From: hv at trust-mart.com (hv)
Date: Sat, 26 Jun 2004 09:15:00 +0800
Subject: [Linux-cluster] segment fault
References: <000901c45a9c$2be76700$0d7e12ac@hv> <20040625101825.GA21064@tykepenguin.com>
Message-ID: <007901c45b1b$0064b0e0$0d7e12ac@hv>

This patch is great! All is OK now. I'll try it on more than three hosts
today. Thanks!

----- Original Message -----
From: Patrick Caulfield
To: Discussion of clustering software components including GFS
Sent: Friday, June 25, 2004 6:18 PM
Subject: Re: [Linux-cluster] segment fault

On Fri, Jun 25, 2004 at 06:07:07PM +0800, hv wrote:
> Hi, everyone:
> I get a segmentation fault when I run acucobol on a GFS filesystem. My
> system is Red Hat AS3 with kernel 2.6.7; LVM2 and GFS are both from CVS.
> The host is a Dell 6650 attached to an EMC CX200.
> When I run vutil or runcbl on the GFS filesystem, this oops appears:
> 
> 
> Could anyone help me?

Try this patch to the kernel:

===== cluster/dlm/queries.c 1.12 vs edited =====
--- 1.12/cluster/dlm/queries.c	Sat Jun 19 06:31:56 2004
+++ edited/cluster/dlm/queries.c	Fri Jun 25 11:14:22 2004
@@ -49,7 +49,7 @@
 	int status = -EINVAL;
 	gd_lkb_t *target_lkb;
 	gd_lkb_t *query_lkb = NULL;	/* Our temporary LKB */
-	gd_ls_t *ls = (gd_ls_t *) lockspace;
+	gd_ls_t *ls = (gd_ls_t *) find_lockspace_by_local_id(lockspace);
 
 	if (!qinfo)

-- 
patrick

--
Linux-cluster mailing list
Linux-cluster at redhat.com
http://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part --------------
An HTML attachment was scrubbed...
URL: 

From buytenh at wantstofly.org  Sat Jun 26 09:01:08 2004
From: buytenh at wantstofly.org (Lennert Buytenhek)
Date: Sat, 26 Jun 2004 11:01:08 +0200
Subject: [Linux-cluster] first look at the released GFS source code
Message-ID: <20040626090108.GY28090@xi.wantstofly.org>

http://sources.redhat.com/cluster/ has a link to
http://sources.redhat.com/ml/cluster-cvs/ which gives a 404. On the same
page, there is also a link to usage.txt from CVS, which references
ftp://sources.redhat.com/pub/cluster/, and that directory also doesn't
seem to exist.

It would be easier for people to test this stuff if there were pre-built
kernel RPMs for their favourite distribution. I can look into doing this
for Fedora Core 2 if no one else is doing that yet.

It is generally unclear to me how CLVM works, and what kind of shared
storage GFS needs. I found links in many places to the GFS HOWTO at
http://www.sistina.com/gfs/Pages/howto.html, but that just redirects me
to RH's GFS sales pitch.

There are various references to sistina still in the tree:

./ccs/daemon/ccsd.c:#define DEFAULT_CCSD_LOCKFILE "/var/run/sistina/ccsd.pid"
./ccs/daemon/ccsd.c:  if(!strncmp(lockfile, "/var/run/sistina/", 17)){
./ccs/daemon/ccsd.c:    if(stat("/var/run/sistina", &stat_buf)){
./ccs/daemon/ccsd.c:      if(mkdir("/var/run/sistina", S_IRWXU)){
./ccs/daemon/ccsd.c:        log_err("/var/run/sistina is not a directory.\n"
./cman/tests/qwait.c: (c) 2002 Sistina Software Inc.
./fence/agents/baytech/Makefile: ${top_srcdir}/scripts/define2var ${top_srcdir}/config/copyright.cf perl SISTINA_COPYRIGHT >> $(TARGET)
./fence/agents/baytech/fence_baytech.pl:$SISTINA_COPYRIGHT="";
./fence/agents/baytech/fence_baytech.pl: print "$SISTINA_COPYRIGHT\n" if ( $SISTINA_COPYRIGHT );
./gfs/man/gfs_grow.8:'\" Steven Whitehouse
./gfs/man/gfs_jadd.8:'\" Steven Whitehouse
./gulm/man/lock_gulmd.8:\fB/var/run/sistina/lock_gulmd_core.pid\fP
./gulm/man/lock_gulmd.8:\fB/var/run/sistina/lock_gulmd_LTPX.pid\fP
./gulm/man/lock_gulmd.8:\fB/var/run/sistina/lock_gulmd_LT000.pid\fP
./gulm/man/lock_gulmd.8:\fBlock_gulmd\fP does not create the \fIsistina\fR directory in the
./gulm/src/config_ccs.c: "/var/run/sistina") );
./gulm/src/config_main.c: gf->lock_file = strdup("/var/run/sistina");

I'm happy to see a generic cluster manager in your package (looking into
it now.) I really hope there will eventually be a standard cluster
framework out there that everybody will use.

From agk at redhat.com  Sat Jun 26 09:39:03 2004
From: agk at redhat.com (Alasdair G Kergon)
Date: Sat, 26 Jun 2004 10:39:03 +0100
Subject: [Linux-cluster] first look at the released GFS source code
In-Reply-To: <20040626090108.GY28090@xi.wantstofly.org>
References: <20040626090108.GY28090@xi.wantstofly.org>
Message-ID: <20040626093903.GI6302@agk.surrey.redhat.com>

On Sat, Jun 26, 2004 at 11:01:08AM +0200, Lennert Buytenhek wrote:
> http://sources.redhat.com/ml/cluster-cvs/ which gives a 404.

We're still getting the commit list archives set up, but that will be
the URL.

> page, there is also a link to usage.txt from CVS, which references
> ftp://sources.redhat.com/pub/cluster/, and that directory also doesn't
> seem to exist.

Well spotted: it should be http://sources.redhat.com/cluster/releases/
We'll start generating separate tarballs for each component early next
week.

Alasdair
-- 
agk at redhat.com

From lists at wikidev.net  Sat Jun 26 14:17:55 2004
From: lists at wikidev.net (Gabriel Wicke)
Date: Sat, 26 Jun 2004 16:17:55 +0200
Subject: [Linux-cluster] documentation wiki
Message-ID: <1088259476.1302.8.camel@venus>

Hi,

I've set up a documentation wiki at http://gfs.wikidev.net/, feel free to
update/change/add things there. If somebody has a nicer logo you can
upload it yourself after login (filename 'wiki.png', 135x147px).

I'm one of wikipedia.org's admins and am currently investigating
alternatives to NFS and possibly MySQL replication. Has anybody tried to
share MySQL db files with GFS so far?

-- 
Gabriel Wicke

From buytenh at wantstofly.org  Sat Jun 26 18:42:39 2004
From: buytenh at wantstofly.org (Lennert Buytenhek)
Date: Sat, 26 Jun 2004 20:42:39 +0200
Subject: [Linux-cluster] AF_ namespace conflict
Message-ID: <20040626184239.GA6481@xi.wantstofly.org>

Hi,

cman defines AF_CLUSTER to be 31, but AF_BLUETOOTH (in-tree) is also
defined as 31, so cman doesn't load if bluetooth support is already
loaded. 30 seems to be still free.. perhaps that one should be reserved
in mainline?

cheers,
Lennert

--- linux/include/linux/socket.h.orig	2004-06-26 20:40:47.876722136 +0200
+++ linux/include/linux/socket.h	2004-06-26 20:41:02.922710013 +0200
@@ -177,6 +177,7 @@
 #define AF_PPPOX	24	/* PPPoX sockets	*/
 #define AF_WANPIPE	25	/* Wanpipe API Sockets	*/
 #define AF_LLC		26	/* Linux LLC		*/
+#define AF_CLUSTER	30	/* GFS Cluster Manager	*/
 #define AF_BLUETOOTH	31	/* Bluetooth sockets	*/
 #define AF_MAX		32	/* For now.. */
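
To make the failure mode concrete: a protocol family module claims its slot
with sock_register(), and in 2.6-era kernels that call is refused when
another family already owns the number, so whichever of cman or bluetooth
loads second loses. The sketch below is illustrative only -- it is not the
actual cman registration code, and the function and variable names are made
up.

#include <linux/module.h>
#include <linux/init.h>
#include <linux/errno.h>
#include <linux/net.h>
#include <linux/socket.h>

#define AF_CLUSTER 30	/* the free slot proposed above */

/* minimal create hook; a real family would set sock->ops and
 * allocate a struct sock here */
static int cluster_create(struct socket *sock, int protocol)
{
	return -ESOCKTNOSUPPORT;
}

static struct net_proto_family cluster_family_ops = {
	.family = AF_CLUSTER,
	.create = cluster_create,
	.owner  = THIS_MODULE,
};

static int __init cluster_sock_init(void)
{
	/* fails if the family number is already registered, e.g. by
	 * bluetooth when both try to use 31 */
	return sock_register(&cluster_family_ops);
}
module_init(cluster_sock_init);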

From buytenh at wantstofly.org  Sat Jun 26 18:57:36 2004
From: buytenh at wantstofly.org (Lennert Buytenhek)
Date: Sat, 26 Jun 2004 20:57:36 +0200
Subject: [Linux-cluster] FC2 gfs kernel RPMs
Message-ID: <20040626185736.GA6597@xi.wantstofly.org>

Hi,

I've hacked GFS (cman, dlm, gfs) into the most recent Fedora Core 2 kernel;
the resulting RPMs can be found at:

	http://www2.wantstofly.org/gfs/

I'll be uploading FC2 RPM packages of other GFS tools here as I make them.

cheers,
Lennert

From buytenh at wantstofly.org  Sat Jun 26 20:03:47 2004
From: buytenh at wantstofly.org (Lennert Buytenhek)
Date: Sat, 26 Jun 2004 22:03:47 +0200
Subject: [Linux-cluster] FC2 gfs kernel RPMs
In-Reply-To: <20040626185736.GA6597@xi.wantstofly.org>
References: <20040626185736.GA6597@xi.wantstofly.org>
Message-ID: <20040626200347.GA6739@xi.wantstofly.org>

On Sat, Jun 26, 2004 at 08:57:36PM +0200, Lennert Buytenhek wrote:
> 	http://www2.wantstofly.org/gfs/
> 
> I'll be uploading FC2 RPM packages of other GFS tools here as I make them.

I have made FC2 packages for ccs, cman, dlm, perl-Net-Telnet (needed by
fence), fence, iddev, gfs-utils and lvm2 (with clvmd), and uploaded them
to the URL above.

The gfs-enabled kernel package boots fine on a Dual Xeon and the relevant
modules load cleanly. Everything else is untested! Please report bugs
as/if you find them.

cheers,
Lennert

From buytenh at wantstofly.org  Sat Jun 26 21:30:57 2004
From: buytenh at wantstofly.org (Lennert Buytenhek)
Date: Sat, 26 Jun 2004 23:30:57 +0200
Subject: [Linux-cluster] trouble trying to get ccs/cman working on one machine, not the other
Message-ID: <20040626213057.GA2572@xi.wantstofly.org>

Hi,

Sorry to bother you all once more. I'm seeing two problems when trying to
get ccs/cman working.

On my Celeron 2GHz, when I try to start ccsd and cman, all is well. I
start ccsd, then 'cman_tool join', and the machine begins periodically
broadcasting packets like these:

23:22:26.300381 IP 10.0.0.1.6809 > 10.0.0.255.6809: UDP, length 24
23:22:26.300491 IP 10.0.0.1.6809 > 10.0.0.255.6809: UDP, length 24

However, when I try the exact same thing on a Dual Xeon in the same
subnet, I get this:

23:19:51.095492 IP 10.0.0.3.32770 > 255.255.255.255.50007: UDP, length 20
23:19:51.344805 arp who-has 10.0.0.9 tell 10.0.0.3
23:19:52.344396 arp who-has 10.0.0.9 tell 10.0.0.3
23:19:53.344257 arp who-has 10.0.0.9 tell 10.0.0.3

The machine begins ARPing for 10.0.0.9 -- but that IP isn't even in use at
all! It doesn't broadcast like the other machine does, and after waiting
for a while, both machines decide to create a new cluster instead of
trying to talk to each other.

Furthermore, when I try to 'cman_tool leave' on the dual proc, I get:

Jun 26 22:51:43 phi kernel: CMAN: we are leaving the cluster
Jun 26 22:51:43 phi ccsd[9833]: Received bad communication type on cluster socket.
Jun 26 22:51:49 phi last message repeated 106830 times

syslogd then starts looping, until I kill ccsd. On the uniproc, I don't
get any such error at all when I issue a leave:

Jun 26 22:51:40 xi kernel: CMAN: we are leaving the cluster
Jun 26 22:51:40 xi ccsd[2181]: Unable to bind cluster socket: Transport endpoint is not connected
Jun 26 22:51:40 xi ccsd[2181]: Exiting...

I tried a UP kernel (the exact same one as on the uniproc) on the dual
proc, but got the same result.

Anyone any clues? Anything obvious I forgot? I've attached
/etc/cluster/cluster.xml -- it's identical on both machines, and they both
run the same kernel and the same binary packages (I hope.) Do I have to
provide more info?

cheers,
Lennert
-------------- next part --------------
A non-text attachment was scrubbed...
Name: cluster.xml
Type: text/xml
Size: 461 bytes
Desc: not available
URL: 

From buytenh at wantstofly.org  Sat Jun 26 22:07:31 2004
From: buytenh at wantstofly.org (Lennert Buytenhek)
Date: Sun, 27 Jun 2004 00:07:31 +0200
Subject: [Linux-cluster] trouble trying to get ccs/cman working on one machine, not the other
In-Reply-To: <20040626213057.GA2572@xi.wantstofly.org>
References: <20040626213057.GA2572@xi.wantstofly.org>
Message-ID: <20040626220731.GA2956@xi.wantstofly.org>

On Sat, Jun 26, 2004 at 11:30:57PM +0200, Lennert Buytenhek wrote:
> The machine begins ARPing for 10.0.0.9 -- but that IP isn't even in use at
> all! It doesn't broadcast like the other machine does, and after waiting
> for a while, both machines decide to create a new cluster instead of
> trying to talk to each other.

OK, found out why they didn't see each other. If your /etc/hosts has
something like this:

127.0.0.1	phi localhost.localdomain localhost

(which might be a remnant from an earlier Red Hat install on this box,
created by the installer if you install without initially configuring a
network adapter), the port 6809 broadcasts will happily be sent out over
the loopback interface towards 10.255.255.255, and no wonder the machines
never see each other.

Still not sure why it's trying to communicate with 10.0.0.8/10.0.0.9 -- it
seems to be sending a 'send me your cluster.xml' request to those
addresses right before it tries sending such a packet to 10.0.0.255.

IN= OUT=eth1 SRC=10.0.0.3 DST=10.0.0.8 LEN=509 TOS=0x00 PREC=0x00 TTL=64 ID=0 DF PROTO=UDP SPT=50007 DPT=32769 LEN=489

cheers,
Lennert
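
For reference, a corrected layout keeps the machine's real name off the
loopback line. Something like the following, with the addresses and
hostnames taken from the logs above (assumed; adjust to your own network):

127.0.0.1	localhost.localdomain localhost
10.0.0.1	xi
10.0.0.3	phi

With the node's own name resolving to its real interface address, the cman
broadcasts should go out on the 10.0.0.x network rather than over loopback.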

From buytenh at wantstofly.org  Sat Jun 26 23:00:30 2004
From: buytenh at wantstofly.org (Lennert Buytenhek)
Date: Sun, 27 Jun 2004 01:00:30 +0200
Subject: [Linux-cluster] fence_tool problem, direct user pointer dereference in cman kernel code
Message-ID: <20040626230030.GA3495@xi.wantstofly.org>

fence_tool gives me, on both of my test machines:

	fence_domain_add: service register failed

relevant syscalls seem to be:

(machine 1)
socket(PF_BLUETOOTH, SOCK_DGRAM, 3)     = 1
ioctl(1, 0x4001780e, 0x9c34050)         = -1 EINVAL (Invalid argument)

(machine 2)
socket(PF_BLUETOOTH, SOCK_DGRAM, 3)     = 1
ioctl(1, 0x4001780e, 0x9505050)         = -1 ENAMETOOLONG (File name too long)

Looking at linux/cluster/cman/sm_user.c:sm_ioctl, it casts 'arg' to a
(char *) and then passes it into user_register, which does a direct
strlen() on it... which is bad coding style in general, but definitely
ain't gonna produce anything remotely useful on a 4G/4G kernel, like the
one that ships with Fedora Core 2.

I suspect there are more such bugs out there, sometimes I get really
unexpected behaviour or things that plain don't seem to work at all.
cman+dlm+gfs kernel code is ~2MB, but cman alone is only 400kb, so if
anyone else feels like some auditing work, we could do a rough pass over
cman in a few days with a few people.. anyone volunteering?

From teigland at redhat.com  Sun Jun 27 11:14:07 2004
From: teigland at redhat.com (David Teigland)
Date: Sun, 27 Jun 2004 19:14:07 +0800
Subject: [Linux-cluster] fence_tool problem, direct user pointer dereference in cman kernel code
In-Reply-To: <20040626230030.GA3495@xi.wantstofly.org>
References: <20040626230030.GA3495@xi.wantstofly.org>
Message-ID: <20040627111407.GA6821@redhat.com>

On Sun, Jun 27, 2004 at 01:00:30AM +0200, Lennert Buytenhek wrote:
> Looking at linux/cluster/cman/sm_user.c:sm_ioctl, it casts 'arg' to a
> (char *) and then passes it into user_register, which does a direct
> strlen() on it... which is bad coding style in general, but definitely
> ain't gonna produce anything remotely useful on a 4G/4G kernel, like the
> one that ships with Fedora Core 2.
> 
> I suspect there are more such bugs out there, sometimes I get really
> unexpected behaviour or things that plain don't seem to work at all.
> cman+dlm+gfs kernel code is ~2MB, but cman alone is only 400kb, so if
> anyone else feels like some auditing work, we could do a rough pass over
> cman in a few days with a few people.. anyone volunteering?

Thanks for all the feedback and bug reports; we'll look at each one and
get fixes out as quickly as possible (watch CVS). We also have bug
reports coming at us from multiple Red Hat QA people who are doing some
great testing as well.

-- 
Dave Teigland
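
As a general illustration of the pattern involved -- a sketch only, not the
actual sm_user.c fix, and the helper name and length limit are made up -- an
ioctl argument that names a string lives in user space, so it has to be
copied into a kernel buffer (which also validates the pointer) before
anything like strlen() can be applied to it. copy_from_user() is the
equivalent for fixed-size structures.

#include <linux/errno.h>
#include <asm/uaccess.h>	/* <linux/uaccess.h> on later kernels */

#define EX_NAME_LEN 256		/* illustrative limit */

/* hypothetical helper: fetch a NUL-terminated name passed via an
 * ioctl arg; 'name' is a kernel buffer of EX_NAME_LEN bytes */
static int ex_get_service_name(unsigned long arg, char *name)
{
	long len;

	len = strncpy_from_user(name, (const char __user *) arg, EX_NAME_LEN);
	if (len < 0)
		return -EFAULT;		/* bad user pointer */
	if (len >= EX_NAME_LEN)
		return -ENAMETOOLONG;	/* no NUL within the limit */

	/* 'name' is now a valid kernel-space string */
	return 0;
}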

From buytenh at wantstofly.org  Sun Jun 27 12:10:12 2004
From: buytenh at wantstofly.org (Lennert Buytenhek)
Date: Sun, 27 Jun 2004 14:10:12 +0200
Subject: [Linux-cluster] internal lvm error while locking
Message-ID: <20040627121012.GA2798@xi.wantstofly.org>

When trying to 'vgchange -aly' on a shared volume, I get:
"Error locking on node xi: Internal lvm error, check syslog", but nothing
appears in syslog.

(The error is undoubtedly between keyboard and chair, as I'm testing quite
a bizarre setup at the moment, but the error message could be more helpful
about what exactly I'm doing that is not allowed.)

From buytenh at wantstofly.org  Sun Jun 27 13:06:09 2004
From: buytenh at wantstofly.org (Lennert Buytenhek)
Date: Sun, 27 Jun 2004 15:06:09 +0200
Subject: [Linux-cluster] fence_manual /tmp vulnerability
Message-ID: <20040627130609.GA4017@xi.wantstofly.org>

When '-p' is not specified, fence_manual uses /tmp/fence_manual.lock as
its lock file. This should be somewhere in /var/lock, no? I didn't check
whether other tools do the same.
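
The usual hardening for this sort of thing -- a sketch, not fence_manual's
actual code, and the path is only the obvious candidate -- is to keep the
lock under a root-owned directory such as /var/lock and to create it with
O_EXCL, so an existing file (possibly a symlink planted by another local
user in a world-writable directory) is never followed:

#include <fcntl.h>
#include <stdio.h>

#define LOCKFILE "/var/lock/fence_manual.lock"	/* assumed path */

int take_lock(void)
{
	/* O_CREAT|O_EXCL fails if the file already exists, symlink or not */
	int fd = open(LOCKFILE, O_WRONLY | O_CREAT | O_EXCL, 0644);

	if (fd < 0) {
		perror("open " LOCKFILE);	/* held already, or error */
		return -1;
	}
	return fd;	/* unlink(LOCKFILE) and close(fd) when done */
}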

From chrismcc at gmail.com  Sun Jun 27 17:42:01 2004
From: chrismcc at gmail.com (Christopher McCrory)
Date: Sun, 27 Jun 2004 10:42:01 -0700
Subject: [Linux-cluster] RFI quick HOWTO
Message-ID: <63261e070406271042e9e044a@mail.gmail.com>

Hello...

Last year I looked at several distributed filesystems: OpenAFS, GFS, and
NFS. This was for a specific project. I settled on NFS over LVM over
software RAID5. It works well except for a few NFS issues here and there.
I really wanted to go with GFS, but the lack of good documentation held me
back. What would be ideal is a specific, simple HOWTO on setting up a
small lab test setup -- something like:

replicated with failover
server1 fs /important_data on /dev/sdb1
server2 fs /important_data on /dev/sdb1

client1 mount /important_data
client2 mount /important_data

with no FC, no shared SCSI, no dual ethernet networks. Just the minimum
hardware possible. Most labs already have this stuff lying around.

yes? no? maybe?

-- 
Christopher McCrory
"The guy that keeps the servers running"

From notiggy at gmail.com  Sun Jun 27 18:01:00 2004
From: notiggy at gmail.com (Brian Jackson)
Date: Sun, 27 Jun 2004 13:01:00 -0500
Subject: [Linux-cluster] RFI quick HOWTO
In-Reply-To: <63261e070406271042e9e044a@mail.gmail.com>
References: <63261e070406271042e9e044a@mail.gmail.com>
Message-ID: 

On Sun, 27 Jun 2004 10:42:01 -0700, Christopher McCrory wrote:
> 
> Hello...
> 
> Last year I looked at several distributed filesystems: OpenAFS, GFS, and
> NFS. This was for a specific project. I settled on NFS over LVM over
> software RAID5. It works well except for a few NFS issues here and there.
> I really wanted to go with GFS, but the lack of good documentation held me
> back. What would be ideal is a specific, simple HOWTO on setting up a
> small lab test setup.

One thing I always wanted to do with OpenGFS was create a howto-matic. I
may look at resurrecting that idea (my TODO list already has quite a few
things on it). Getting some of the good bits of docs from ogfs is also on
my TODO list.

--Brian Jackson

> 
> something like:
> 
> replicated with failover
> server1 fs /important_data on /dev/sdb1
> server2 fs /important_data on /dev/sdb1
> 
> client1 mount /important_data
> client2 mount /important_data
> 
> with no FC, no shared SCSI, no dual ethernet networks. Just the minimum
> hardware possible. Most labs already have this stuff lying around.
> 
> yes? no? maybe?
> 
> -- 
> Christopher McCrory
> "The guy that keeps the servers running"
> 
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> http://www.redhat.com/mailman/listinfo/linux-cluster
> 

From john.hearns at clustervision.com  Sun Jun 27 18:31:13 2004
From: john.hearns at clustervision.com (John Hearns)
Date: Sun, 27 Jun 2004 19:31:13 +0100
Subject: [Linux-cluster] RFI quick HOWTO
In-Reply-To: <63261e070406271042e9e044a@mail.gmail.com>
References: <63261e070406271042e9e044a@mail.gmail.com>
Message-ID: <1088361073.2460.5.camel@vigor12>

On Sun, 2004-06-27 at 18:42, Christopher McCrory wrote:
> Hello...
> 
> Last year I looked at several distributed filesystems: OpenAFS, GFS, and
> NFS. This was for a specific project. I settled on NFS over LVM over
> software RAID5. It works well except for a few NFS issues here and there.
> I really wanted to go with GFS, but the lack of good documentation held me
> back. What would be ideal is a specific, simple HOWTO on setting up a
> small lab test setup.
> 
> something like:
> 
> replicated with failover
> server1 fs /important_data on /dev/sdb1
> server2 fs /important_data on /dev/sdb1
> 
> client1 mount /important_data
> client2 mount /important_data
> 
> with no FC, no shared SCSI, no dual ethernet networks. Just the minimum
> hardware possible. Most labs already have this stuff lying around.

For that sort of thing, IMHO you should be looking at Linux-HA
http://www.linux-ha.org and using DRBD as the shared storage.

I've used Linux-HA for a failover cluster, using shared SCSI and in
another case using DRBD.

If it really is failover you are after, look at the HOWTOs and the
journal articles referenced on the Linux-HA pages. The minimal setup you
will need is two servers, linked by a single Ethernet and a serial cable.
The heartbeat can run over both the Ethernet and the serial link, giving
you redundancy.
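
To give a flavour of the minimal two-server setup described above -- a
hedged sketch for heartbeat 1.x-style configuration with DRBD as the
backing store; the node names, devices, addresses and resource names are
made up, and the exact syntax may differ between versions:

# /etc/ha.d/ha.cf on both servers
bcast eth0              # heartbeat over the shared Ethernet
serial /dev/ttyS0       # and over the null-modem serial cable
keepalive 2
deadtime 30
node server1
node server2

# /etc/ha.d/haresources (identical on both servers):
# server1 normally owns the DRBD volume, its filesystem, the service IP
# and the NFS service; server2 takes them over on failure
server1 drbddisk::r0 Filesystem::/dev/drbd0::/important_data::ext3 IPaddr::192.168.0.100 nfs

Note that this gives failover of a single active copy of the data, not a
cluster filesystem: only one node mounts /important_data at a time.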

From chrismcc at gmail.com  Sun Jun 27 19:23:08 2004
From: chrismcc at gmail.com (Christopher McCrory)
Date: Sun, 27 Jun 2004 12:23:08 -0700
Subject: [Linux-cluster] RFI quick HOWTO
Message-ID: <63261e07040627122357a8c4e9@mail.gmail.com>

On Sun, 2004-06-27 at 11:31, John Hearns wrote:
> On Sun, 2004-06-27 at 18:42, Christopher McCrory wrote:
> > Hello...
> > 
> > Last year I looked at several distributed filesystems: OpenAFS, GFS, and
> > NFS. This was for a specific project. I settled on NFS over LVM over
> > software RAID5. It works well except for a few NFS issues here and there.
> > I really wanted to go with GFS, but the lack of good documentation held me
> > back. What would be ideal is a specific, simple HOWTO on setting up a
> > small lab test setup.
> > 
> > something like:
> > 
> > replicated with failover
> > server1 fs /important_data on /dev/sdb1
> > server2 fs /important_data on /dev/sdb1
> > 
> > client1 mount /important_data
> > client2 mount /important_data
> > 
> > with no FC, no shared SCSI, no dual ethernet networks. Just the minimum
> > hardware possible. Most labs already have this stuff lying around.
> 
> For that sort of thing, IMHO you should be looking at Linux-HA
> http://www.linux-ha.org and using DRBD as the shared storage.
> 
> I've used Linux-HA for a failover cluster, using shared SCSI and in
> another case using DRBD.
> 
> If it really is failover you are after, look at the HOWTOs and the
> journal articles referenced on the Linux-HA pages. The minimal setup you
> will need is two servers, linked by a single Ethernet and a serial cable.
> The heartbeat can run over both the Ethernet and the serial link, giving
> you redundancy.

Wouldn't that defeat the purpose of testing GFS?

With a lab setup you could test: does my app work well with this FS? Is it
faster? Slower? Does my backup software work? Can I restore!? (I ran into
this with amanda, XFS (SGI) and differing RH versions -- nice to know
_before_ you need it :) Any other gotchas?

Then you have some answers before shelling out for more hardware ($$$).

-- 
Christopher McCrory
"The guy that keeps the servers running"

From john.hearns at clustervision.com  Sun Jun 27 21:50:31 2004
From: john.hearns at clustervision.com (John Hearns)
Date: Sun, 27 Jun 2004 22:50:31 +0100
Subject: [Linux-cluster] RFI quick HOWTO
In-Reply-To: <63261e07040627122357a8c4e9@mail.gmail.com>
References: <63261e07040627122357a8c4e9@mail.gmail.com>
Message-ID: <1088373031.2968.2.camel@vigor12>

On Sun, 2004-06-27 at 20:23, Christopher McCrory wrote:
> > If it really is failover you are after, look at the HOWTOs and the
> > journal articles referenced on the Linux-HA pages. The minimal setup you
> > will need is two servers, linked by a single Ethernet and a serial cable.
> > The heartbeat can run over both the Ethernet and the serial link, giving
> > you redundancy.
> 
> Wouldn't that defeat the purpose of testing GFS?

You are absolutely correct. This is of course a GFS list!

I thought you were wanting to know how to put together a failover setup
with minimal hardware - and was giving some pointers on that, as I've done
that sort of work.

I agree that a lab-type setup for testing GFS on a small scale would be
very interesting.

From teigland at redhat.com  Mon Jun 28 02:28:45 2004
From: teigland at redhat.com (David Teigland)
Date: Mon, 28 Jun 2004 10:28:45 +0800
Subject: [Linux-cluster] RFI quick HOWTO
In-Reply-To: <63261e070406271042e9e044a@mail.gmail.com>
References: <63261e070406271042e9e044a@mail.gmail.com>
Message-ID: <20040628022845.GA7358@redhat.com>

On Sun, Jun 27, 2004 at 10:42:01AM -0700, Christopher McCrory wrote:
> with no FC, no shared SCSI, no dual ethernet networks. Just the minimum
> hardware possible. Most labs already have this stuff lying around.

A "Minimum-GFS" HOWTO would be a good idea. It would probably look
something like the following.

node1:
  is a gnbd server and exports any local ide/scsi disk over the network

node2 and node3:
  - are gnbd clients, both importing the network block device from node1
  - are gfs nodes, both sharing a gfs file system created on the imported
    gnbd

No FC, no shared SCSI, no network power switch, no lvm or clvm. Three
machines, a single ethernet network, and a spare disk on one machine.
(node1 is equivalent to an nfs server here, but it's serving blocks
instead of files.)

-- 
Dave Teigland
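
As a rough idea of what the commands for that three-node setup might look
like -- hedged, since the tool names and options below are recalled from
the GNBD/GFS utilities and may not match the CVS tree exactly, and the
cluster name, export name and device names are made up:

# node1: export the spare disk as a gnbd
gnbd_export -d /dev/sdb1 -e testgnbd

# node2 and node3: import the device and join the cluster
gnbd_import -i node1
ccsd
cman_tool join
fence_tool join

# on one node only: make the filesystem with one journal per gfs node
gfs_mkfs -p lock_dlm -t testcluster:testfs -j 2 /dev/gnbd/testgnbd

# node2 and node3: mount it
mount -t gfs /dev/gnbd/testgnbd /important_data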

From pcaulfie at redhat.com  Mon Jun 28 06:21:50 2004
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Mon, 28 Jun 2004 07:21:50 +0100
Subject: [Linux-cluster] internal lvm error while locking
In-Reply-To: <20040627121012.GA2798@xi.wantstofly.org>
References: <20040627121012.GA2798@xi.wantstofly.org>
Message-ID: <20040628062149.GD15996@tykepenguin.com>

On Sun, Jun 27, 2004 at 02:10:12PM +0200, Lennert Buytenhek wrote:
> When trying to 'vgchange -aly' on a shared volume, I get:
> "Error locking on node xi: Internal lvm error, check syslog", but nothing
> appears in syslog.

I'm not sure why it isn't logging to syslog - you could try increasing the
logging levels of lvm in /etc/lvm/lvm.conf.

Another way of getting debug information out of clvmd is to build the
daemon with debugging enabled and start it with "-d".

-- 
patrick

From pcaulfie at redhat.com  Mon Jun 28 10:35:04 2004
From: pcaulfie at redhat.com (Patrick Caulfield)
Date: Mon, 28 Jun 2004 11:35:04 +0100
Subject: [Linux-cluster] internal lvm error while locking
In-Reply-To: <20040628062149.GD15996@tykepenguin.com>
References: <20040627121012.GA2798@xi.wantstofly.org> <20040628062149.GD15996@tykepenguin.com>
Message-ID: <20040628103504.GG15996@tykepenguin.com>

On Mon, Jun 28, 2004 at 07:21:50AM +0100, Patrick Caulfield wrote:
> On Sun, Jun 27, 2004 at 02:10:12PM +0200, Lennert Buytenhek wrote:
> > When trying to 'vgchange -aly' on a shared volume, I get:
> > "Error locking on node xi: Internal lvm error, check syslog", but nothing
> > appears in syslog.

OK, I've fixed the syslog bug in LVM2 CVS. It seems the defaults were not
enough to make even errors appear in syslog.

You'll need to remember to look in the syslog of the node that the error
occurred on.

-- 
patrick

From lhh at redhat.com  Tue Jun 29 15:41:04 2004
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 29 Jun 2004 11:41:04 -0400
Subject: [Linux-cluster] Resource Structure (proposed, not complete)
Message-ID: <1088523664.13751.4.camel@atlantis.boston.redhat.com>

User Resource Manager Operational Specification, proposed

The User Resource Manager (formerly Service Manager) is the part of Red
Hat Cluster Suite that manages the resources and resource groups which
implement a user's clustered services.

The user resource manager only allows resource groups to operate when it
is running on a quorate member of the cluster. This means that all
resource groups are immediately stopped when a member is no longer
quorate. Typically, the member is also fenced (note: it may not stop all
resource groups prior to being fenced; it certainly tries to).

(Incomplete.)

Failover Domains, proposed

See http://people.redhat.com/lhh/fd.html for information on how
clumanager 1.2 failover domains operate. The configuration format will
have to change slightly, but the operational characteristics need not.

Additionally, we might want to add a "Relocate to most-preferred member"
option to prevent unwanted service transitions in ordered failover
domains. (Since failover domains handle multiple cluster members, this is
actually not the same as clumanager 1.0's "Relocate on preferred node
boot" option.)

User Resource Structure, proposed