From ccaulfie at redhat.com Mon Jan 4 07:58:11 2010 From: ccaulfie at redhat.com (Christine Caulfield) Date: Mon, 04 Jan 2010 07:58:11 +0000 Subject: [Linux-cluster] cannot add 3rd node to running cluster In-Reply-To: <8ee061010912310813g3f45bf6ekfc52c3d5420a5826@mail.gmail.com> References: <8ee061010912291130n68f0bad6l496f71df2cd703ac@mail.gmail.com> <74e9d01e0912291520l3bc36ac4yc7a17b1f96fa123d@mail.gmail.com> <8ee061010912300813i7fd29c70hd81bf691d574df0c@mail.gmail.com> <8ee061010912310813g3f45bf6ekfc52c3d5420a5826@mail.gmail.com> Message-ID: <4B419F93.5070704@redhat.com> On 31/12/09 16:13, Terry wrote: > On Wed, Dec 30, 2009 at 10:13 AM, Terry wrote: >> On Tue, Dec 29, 2009 at 5:20 PM, Jason W. wrote: >>> On Tue, Dec 29, 2009 at 2:30 PM, Terry wrote: >>>> Hello, >>>> >>>> I have a working 2 node cluster that I am trying to add a third node >>>> to. I am trying to use Red Hat's conga (luci) to add the node in but >>> >>> If you have two node cluster with two_node=1 in cluster.conf - such as >>> two nodes with no quorum device to break a tie - you'll need to bring >>> the cluster down, change two_node to 0 on both nodes (and rev the >>> cluster version at the top of cluster.conf), bring the cluster up and >>> then add the third node. >>> >>> For troubleshooting any cluster issue, take a look at syslog >>> (/var/log/messages by default). It can help to watch it on a >>> centralized syslog server that all of your nodes forward logs to. >>> >>> -- >>> HTH, YMMV, HANW :) >>> >>> Jason >>> >>> The path to enlightenment is /usr/bin/enlightenment. >> >> Thank you for the response. /var/log/messages doesn't have any >> errors. It says cman started then says can't connect to cluster >> infrastructure after a few seconds. My cluster does not have the >> two_node=1 config now. Conga took that out for me. That bit me last >> night because I needed to put that back in. >> > > CMAN still will not start and gives no debug information. Anyone know > why cman_tool -d join would not print any output at all? > Troubleshooting this is kind of a nightmare. I verified that two_node > is not in play. If cman_tool join -d doesn't produce any output then the most likely problem is a mismatch between the cman and openais versions. Because cman is a configuration module for openais it loads very early in the initialisation sequence. If you are sure the versions are right (ie they match those on the running nodes of the cluster) then do # strace -f cman_tool join -d and post the results here and I'll have a look for you. Chrisie From diamondiona at gmail.com Mon Jan 4 08:25:59 2010 From: diamondiona at gmail.com (Diamond Li) Date: Mon, 4 Jan 2010 16:25:59 +0800 Subject: [Linux-cluster] gfs2_grow does not work In-Reply-To: References: Message-ID: could someone kindly help me to get through? thanks in advance! On Thu, Dec 31, 2009 at 3:16 PM, Diamond Li wrote: > from system log, I can see the erorr message: > > Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not exist > > but I have mounted gfs2 file system under /gfs folder and I can do > operations such as mkdir, rm, successfully. > > > > On Thu, Dec 31, 2009 at 2:55 PM, Diamond Li wrote: >> Hello, >> >> I am trying to grow a gfs2 file system, unfortunately ?it does not work. >> >> anyone has similar issues or I always have bad luck? >> >> [root at wplccdlvm446 gfs]# mount >> >> /dev/mapper/vg100-lvol0 on /gfs type gfs2 (rw,hostdata=jid=0:id=131074:first=1) >> >> [root at wplccdlvm446 gfs]# gfs2_grow -v /gfs >> Initializing lists... 
>> gfs2_grow: Couldn't mount /tmp/.gfs2meta : Invalid argument >> >> [root at wplccdlvm446 gfs]# ls -a /tmp/.gfs2meta/ >> . ?.. >> >> >> [root at wplccdlvm446 gfs]# uname -r >> 2.6.18-164.el5 >> >> [root at wplccdlvm446 gfs]# cat /etc/redhat-release >> Red Hat Enterprise Linux Server release 5.4 (Tikanga) >> > From a.alawi at auckland.ac.nz Mon Jan 4 19:24:40 2010 From: a.alawi at auckland.ac.nz (Abraham Alawi) Date: Tue, 5 Jan 2010 08:24:40 +1300 Subject: [Linux-cluster] gfs2_grow does not work In-Reply-To: References: Message-ID: <96F1FFC8-83C1-48C9-8CF0-271CB9A998F2@auckland.ac.nz> I used gfs2_grow before but never experienced this error. Probably it's /tmp related issue, got the right permission (1777) + does it have enough space? strace could be of great help as well. Good luck On 4/01/2010, at 9:25 PM, Diamond Li wrote: > could someone kindly help me to get through? > > thanks in advance! > > On Thu, Dec 31, 2009 at 3:16 PM, Diamond Li wrote: >> from system log, I can see the erorr message: >> >> Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not exist >> >> but I have mounted gfs2 file system under /gfs folder and I can do >> operations such as mkdir, rm, successfully. >> >> >> >> On Thu, Dec 31, 2009 at 2:55 PM, Diamond Li wrote: >>> Hello, >>> >>> I am trying to grow a gfs2 file system, unfortunately it does not work. >>> >>> anyone has similar issues or I always have bad luck? >>> >>> [root at wplccdlvm446 gfs]# mount >>> >>> /dev/mapper/vg100-lvol0 on /gfs type gfs2 (rw,hostdata=jid=0:id=131074:first=1) >>> >>> [root at wplccdlvm446 gfs]# gfs2_grow -v /gfs >>> Initializing lists... >>> gfs2_grow: Couldn't mount /tmp/.gfs2meta : Invalid argument >>> >>> [root at wplccdlvm446 gfs]# ls -a /tmp/.gfs2meta/ >>> . .. >>> >>> >>> [root at wplccdlvm446 gfs]# uname -r >>> 2.6.18-164.el5 >>> >>> [root at wplccdlvm446 gfs]# cat /etc/redhat-release >>> Red Hat Enterprise Linux Server release 5.4 (Tikanga) >>> >> > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster '''''''''''''''''''''''''''''''''''''''''''''''''''''' Abraham Alawi Unix/Linux Systems Administrator Science IT University of Auckland e: a.alawi at auckland.ac.nz p: +64-9-373 7599, ext#: 87572 '''''''''''''''''''''''''''''''''''''''''''''''''''''' From a.alawi at auckland.ac.nz Mon Jan 4 19:34:00 2010 From: a.alawi at auckland.ac.nz (Abraham Alawi) Date: Tue, 5 Jan 2010 08:34:00 +1300 Subject: [Linux-cluster] cannot add 3rd node to running cluster In-Reply-To: <8ee061010912310813g3f45bf6ekfc52c3d5420a5826@mail.gmail.com> References: <8ee061010912291130n68f0bad6l496f71df2cd703ac@mail.gmail.com> <74e9d01e0912291520l3bc36ac4yc7a17b1f96fa123d@mail.gmail.com> <8ee061010912300813i7fd29c70hd81bf691d574df0c@mail.gmail.com> <8ee061010912310813g3f45bf6ekfc52c3d5420a5826@mail.gmail.com> Message-ID: <37B32C9E-A8C3-4BA4-BFB4-FAB6235985D5@auckland.ac.nz> On 1/01/2010, at 5:13 AM, Terry wrote: > On Wed, Dec 30, 2009 at 10:13 AM, Terry wrote: >> On Tue, Dec 29, 2009 at 5:20 PM, Jason W. wrote: >>> On Tue, Dec 29, 2009 at 2:30 PM, Terry wrote: >>>> Hello, >>>> >>>> I have a working 2 node cluster that I am trying to add a third node >>>> to. 
I am trying to use Red Hat's conga (luci) to add the node in but >>> >>> If you have two node cluster with two_node=1 in cluster.conf - such as >>> two nodes with no quorum device to break a tie - you'll need to bring >>> the cluster down, change two_node to 0 on both nodes (and rev the >>> cluster version at the top of cluster.conf), bring the cluster up and >>> then add the third node. >>> >>> For troubleshooting any cluster issue, take a look at syslog >>> (/var/log/messages by default). It can help to watch it on a >>> centralized syslog server that all of your nodes forward logs to. >>> >>> -- >>> HTH, YMMV, HANW :) >>> >>> Jason >>> >>> The path to enlightenment is /usr/bin/enlightenment. >> >> Thank you for the response. /var/log/messages doesn't have any >> errors. It says cman started then says can't connect to cluster >> infrastructure after a few seconds. My cluster does not have the >> two_node=1 config now. Conga took that out for me. That bit me last >> night because I needed to put that back in. >> > > CMAN still will not start and gives no debug information. Anyone know > why cman_tool -d join would not print any output at all? > Troubleshooting this is kind of a nightmare. I verified that two_node > is not in play. > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster Try this line in your cluster.conf file: Also, if you are sure your cluster.conf is correct then copy it manually to all the nodes and add clean_start="1" to the fence_daemon line in cluster.conf and run 'service cman start' simultaneously on all the nodes (probably a good idea to do that from runlevel 1 but make sure you have the network up first) Cheers, -- Abraham '''''''''''''''''''''''''''''''''''''''''''''''''''''' Abraham Alawi Unix/Linux Systems Administrator Science IT University of Auckland e: a.alawi at auckland.ac.nz p: +64-9-373 7599, ext#: 87572 '''''''''''''''''''''''''''''''''''''''''''''''''''''' From adas at redhat.com Mon Jan 4 22:27:15 2010 From: adas at redhat.com (Abhijith Das) Date: Mon, 4 Jan 2010 17:27:15 -0500 (EST) Subject: [Linux-cluster] gfs2_grow does not work In-Reply-To: <1186328466.2354981262643939629.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Message-ID: <330034033.2355001262644035840.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Hi, >From the following message, it looks like the gfs2meta mount routine is not able to locate the gfs2 mountpoint. "Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not exist" Can you confirm that /proc/mounts and /etc/mtab all agree on the mounted gfs2 at /gfs? Also, can you run gfs2_grow under strace so that we can see what arguments gfs2_grow passes to the mount() system call when it tries to mount the gfs2meta filesystem? Thanks! --Abhi ----- "Diamond Li" wrote: > From: "Diamond Li" > To: "linux clustering" > Sent: Monday, January 4, 2010 2:25:59 AM GMT -06:00 US/Canada Central > Subject: Re: [Linux-cluster] gfs2_grow does not work > > could someone kindly help me to get through? > > thanks in advance! > > On Thu, Dec 31, 2009 at 3:16 PM, Diamond Li > wrote: > > from system log, I can see the erorr message: > > > > Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not > exist > > > > but I have mounted gfs2 file system under /gfs folder and I can do > > operations such as mkdir, rm, successfully. 
> > > > > > > > On Thu, Dec 31, 2009 at 2:55 PM, Diamond Li > wrote: > >> Hello, > >> > >> I am trying to grow a gfs2 file system, unfortunately ?it does not > work. > >> > >> anyone has similar issues or I always have bad luck? > >> > >> [root at wplccdlvm446 gfs]# mount > >> > >> /dev/mapper/vg100-lvol0 on /gfs type gfs2 > (rw,hostdata=jid=0:id=131074:first=1) > >> > >> [root at wplccdlvm446 gfs]# gfs2_grow -v /gfs > >> Initializing lists... > >> gfs2_grow: Couldn't mount /tmp/.gfs2meta : Invalid argument > >> > >> [root at wplccdlvm446 gfs]# ls -a /tmp/.gfs2meta/ > >> . ?.. > >> > >> > >> [root at wplccdlvm446 gfs]# uname -r > >> 2.6.18-164.el5 > >> > >> [root at wplccdlvm446 gfs]# cat /etc/redhat-release > >> Red Hat Enterprise Linux Server release 5.4 (Tikanga) > >> > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From diamondiona at gmail.com Tue Jan 5 02:12:12 2010 From: diamondiona at gmail.com (Diamond Li) Date: Tue, 5 Jan 2010 10:12:12 +0800 Subject: [Linux-cluster] gfs2_grow does not work In-Reply-To: <330034033.2355001262644035840.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> References: <1186328466.2354981262643939629.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> <330034033.2355001262644035840.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Message-ID: [root at wplccdlvm445 proc]# df -k Filesystem 1K-blocks Used Available Use% Mounted on /dev/mapper/VolGroup00-LogVol00 28376956 9144384 17767844 34% / /dev/sda1 101086 12055 83812 13% /boot tmpfs 1037748 0 1037748 0% /dev/shm /dev/mapper/vg100-lvol0 819024 794264 24760 97% /gfs [root at wplccdlvm445 proc]# ls -ld /tmp drwxrwxrwt 8 root root 4096 Jan 5 04:02 /tmp [root at wplccdlvm445 proc]# ls -ld /tmp/.gfs2meta/ drwx------ 2 root root 4096 Dec 31 14:24 /tmp/.gfs2meta/ [root at wplccdlvm445 proc]# cat /proc/mounts rootfs / rootfs rw 0 0 /dev/root / ext3 rw,data=ordered 0 0 /dev /dev tmpfs rw 0 0 /proc /proc proc rw 0 0 /sys /sys sysfs rw 0 0 /proc/bus/usb /proc/bus/usb usbfs rw 0 0 devpts /dev/pts devpts rw 0 0 /dev/sda1 /boot ext3 rw,data=ordered 0 0 tmpfs /dev/shm tmpfs rw 0 0 none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0 sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0 /etc/auto.misc /misc autofs rw,fd=7,pgrp=2200,timeout=300,minproto=5,maxproto=5,indirect 0 0 -hosts /net autofs rw,fd=13,pgrp=2200,timeout=300,minproto=5,maxproto=5,indirect 0 0 none /sys/kernel/config configfs rw 0 0 /dev/mapper/vg100-lvol0 /gfs gfs2 rw,hostdata=jid=0:id=65537:first=1 0 0 [root at wplccdlvm445 proc]# cat /etc/mtab /dev/mapper/VolGroup00-LogVol00 / ext3 rw 0 0 proc /proc proc rw 0 0 sysfs /sys sysfs rw 0 0 devpts /dev/pts devpts rw,gid=5,mode=620 0 0 /dev/sda1 /boot ext3 rw 0 0 tmpfs /dev/shm tmpfs rw 0 0 none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0 sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0 none /sys/kernel/config configfs rw 0 0 /dev/mapper/vg100-lvol0 /gfs gfs2 rw,hostdata=jid=0:id=65537:first=1 0 0 [root at wplccdlvm445 proc]# strace gfs2_grow -v /gfs execve("/sbin/gfs2_grow", ["gfs2_grow", "-v", "/gfs"], [/* 29 vars */]) = 0 brk(0) = 0x942d000 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory) open("/etc/ld.so.cache", O_RDONLY) = 3 fstat64(3, {st_mode=S_IFREG|0644, st_size=90296, ...}) = 0 mmap2(NULL, 90296, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f96000 close(3) = 0 open("/lib/libvolume_id.so.0", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360 at k\0004\0\0\0"..., 512) = 512 
fstat64(3, {st_mode=S_IFREG|0755, st_size=32144, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f95000 mmap2(0x6b3000, 33540, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x6b3000 mmap2(0x6bb000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7) = 0x6bb000 close(3) = 0 open("/lib/libc.so.6", O_RDONLY) = 3 read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\17X\0004\0\0\0"..., 512) = 512 fstat64(3, {st_mode=S_IFREG|0755, st_size=1611564, ...}) = 0 mmap2(0x56b000, 1332676, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x56b000 mprotect(0x6aa000, 4096, PROT_NONE) = 0 mmap2(0x6ab000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13f) = 0x6ab000 mmap2(0x6ae000, 9668, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x6ae000 close(3) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7f94000 set_thread_area({entry_number:-1 -> 6, base_addr:0xb7f946c0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0 mprotect(0x6ab000, 8192, PROT_READ) = 0 mprotect(0x567000, 4096, PROT_READ) = 0 munmap(0xb7f96000, 90296) = 0 time(NULL) = 1262655667 getpid() = 18781 brk(0) = 0x942d000 brk(0x944e000) = 0x944e000 open("/gfs", O_RDONLY|O_LARGEFILE) = 3 open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 4 lstat64("/gfs", {st_mode=S_IFDIR|0755, st_size=3864, ...}) = 0 fstat64(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fac000 read(4, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 659 close(4) = 0 munmap(0xb7fac000, 4096) = 0 open("/dev/mapper/vg100-lvol0", O_RDWR|O_LARGEFILE) = 4 fstat64(4, {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 4), ...}) = 0 _llseek(4, 0, [1677721600], SEEK_END) = 0 fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fac000 write(1, "Initializing lists...\n", 22Initializing lists... ) = 22 _llseek(4, 65536, [65536], SEEK_SET) = 0 read(4, "\1\26\31p\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0d\0\0\0\0\0\0\7\t\0\0\7l"..., 4096) = 4096 _llseek(4, 0, [1677721600], SEEK_END) = 0 open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 5 fstat64(5, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb7fab000 read(5, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 659 read(5, "", 4096) = 0 close(5) = 0 munmap(0xb7fab000, 4096) = 0 open("/tmp/.gfs2meta", O_RDONLY|O_LARGEFILE) = 5 fstat64(5, {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0 close(5) = 0 mount("/dev/mapper/vg100-lvol0", "/tmp/.gfs2meta", "gfs2meta", 0, NULL) = -1 EINVAL (Invalid argument) write(2, "gfs2_grow: ", 11gfs2_grow: ) = 11 write(2, "Couldn't mount /tmp/.gfs2meta : "..., 49Couldn't mount /tmp/.gfs2meta : Invalid argument ) = 49 exit_group(1) = ? On Tue, Jan 5, 2010 at 6:27 AM, Abhijith Das wrote: > Hi, > > >From the following message, it looks like the gfs2meta mount routine is not able to locate the gfs2 mountpoint. > "Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not exist" > Can you confirm that /proc/mounts and /etc/mtab all agree on the mounted gfs2 at /gfs? > Also, can you run gfs2_grow under strace so that we can see what arguments gfs2_grow passes to the mount() system call when it tries to mount the gfs2meta filesystem? > > Thanks! 
> --Abhi > > ----- "Diamond Li" wrote: > >> From: "Diamond Li" >> To: "linux clustering" >> Sent: Monday, January 4, 2010 2:25:59 AM GMT -06:00 US/Canada Central >> Subject: Re: [Linux-cluster] gfs2_grow does not work >> >> could someone kindly help me to get through? >> >> thanks in advance! >> >> On Thu, Dec 31, 2009 at 3:16 PM, Diamond Li >> wrote: >> > from system log, I can see the erorr message: >> > >> > Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not >> exist >> > >> > but I have mounted gfs2 file system under /gfs folder and I can do >> > operations such as mkdir, rm, successfully. >> > >> > >> > >> > On Thu, Dec 31, 2009 at 2:55 PM, Diamond Li >> wrote: >> >> Hello, >> >> >> >> I am trying to grow a gfs2 file system, unfortunately ?it does not >> work. >> >> >> >> anyone has similar issues or I always have bad luck? >> >> >> >> [root at wplccdlvm446 gfs]# mount >> >> >> >> /dev/mapper/vg100-lvol0 on /gfs type gfs2 >> (rw,hostdata=jid=0:id=131074:first=1) >> >> >> >> [root at wplccdlvm446 gfs]# gfs2_grow -v /gfs >> >> Initializing lists... >> >> gfs2_grow: Couldn't mount /tmp/.gfs2meta : Invalid argument >> >> >> >> [root at wplccdlvm446 gfs]# ls -a /tmp/.gfs2meta/ >> >> . ?.. >> >> >> >> >> >> [root at wplccdlvm446 gfs]# uname -r >> >> 2.6.18-164.el5 >> >> >> >> [root at wplccdlvm446 gfs]# cat /etc/redhat-release >> >> Red Hat Enterprise Linux Server release 5.4 (Tikanga) >> >> >> > >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From adas at redhat.com Tue Jan 5 05:26:15 2010 From: adas at redhat.com (Abhijith Das) Date: Tue, 5 Jan 2010 00:26:15 -0500 (EST) Subject: [Linux-cluster] gfs2_grow does not work In-Reply-To: <678287746.2365421262668986749.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Message-ID: <1979230168.2365461262669175952.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Hi Diamond, Could I also have the kernel and gfs2-utils rpm versions you are using so I can try this on my setup? I just spotted something in your strace output that could be a problem if you have a newer kernel, but not a newer gfs2-utils package. The mount syscall in your strace output takes the device as the first arg to mount the metafs. A recent kernel patch from https://bugzilla.redhat.com/show_bug.cgi?id=457798 changed that to take the mountpoint as the first arg instead. There was a corresponding userland patch to gfs2-utils in https://bugzilla.redhat.com/show_bug.cgi?id=459630#c3 that fixed this mismatch. I'm not sure if you're seeing this. If so, an upgrade of these packages should fix what you're seeing. Cheers! 
--Abhi ----- "Diamond Li" wrote: > From: "Diamond Li" > To: "linux clustering" > Sent: Monday, January 4, 2010 8:12:12 PM GMT -06:00 US/Canada Central > Subject: Re: [Linux-cluster] gfs2_grow does not work > > [root at wplccdlvm445 proc]# df -k > Filesystem 1K-blocks Used Available Use% Mounted on > /dev/mapper/VolGroup00-LogVol00 > 28376956 9144384 17767844 34% / > /dev/sda1 101086 12055 83812 13% /boot > tmpfs 1037748 0 1037748 0% /dev/shm > /dev/mapper/vg100-lvol0 > 819024 794264 24760 97% /gfs > > [root at wplccdlvm445 proc]# ls -ld /tmp > drwxrwxrwt 8 root root 4096 Jan 5 04:02 /tmp > [root at wplccdlvm445 proc]# ls -ld /tmp/.gfs2meta/ > drwx------ 2 root root 4096 Dec 31 14:24 /tmp/.gfs2meta/ > > > [root at wplccdlvm445 proc]# cat /proc/mounts > rootfs / rootfs rw 0 0 > /dev/root / ext3 rw,data=ordered 0 0 > /dev /dev tmpfs rw 0 0 > /proc /proc proc rw 0 0 > /sys /sys sysfs rw 0 0 > /proc/bus/usb /proc/bus/usb usbfs rw 0 0 > devpts /dev/pts devpts rw 0 0 > /dev/sda1 /boot ext3 rw,data=ordered 0 0 > tmpfs /dev/shm tmpfs rw 0 0 > none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0 > sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0 > /etc/auto.misc /misc autofs > rw,fd=7,pgrp=2200,timeout=300,minproto=5,maxproto=5,indirect 0 0 > -hosts /net autofs > rw,fd=13,pgrp=2200,timeout=300,minproto=5,maxproto=5,indirect 0 0 > none /sys/kernel/config configfs rw 0 0 > /dev/mapper/vg100-lvol0 /gfs gfs2 rw,hostdata=jid=0:id=65537:first=1 0 > 0 > > [root at wplccdlvm445 proc]# cat /etc/mtab > /dev/mapper/VolGroup00-LogVol00 / ext3 rw 0 0 > proc /proc proc rw 0 0 > sysfs /sys sysfs rw 0 0 > devpts /dev/pts devpts rw,gid=5,mode=620 0 0 > /dev/sda1 /boot ext3 rw 0 0 > tmpfs /dev/shm tmpfs rw 0 0 > none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0 > sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0 > none /sys/kernel/config configfs rw 0 0 > /dev/mapper/vg100-lvol0 /gfs gfs2 rw,hostdata=jid=0:id=65537:first=1 0 > 0 > > > [root at wplccdlvm445 proc]# strace gfs2_grow -v /gfs > execve("/sbin/gfs2_grow", ["gfs2_grow", "-v", "/gfs"], [/* 29 vars > */]) = 0 > brk(0) = 0x942d000 > access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or > directory) > open("/etc/ld.so.cache", O_RDONLY) = 3 > fstat64(3, {st_mode=S_IFREG|0644, st_size=90296, ...}) = 0 > mmap2(NULL, 90296, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f96000 > close(3) = 0 > open("/lib/libvolume_id.so.0", O_RDONLY) = 3 > read(3, > "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360 at k\0004\0\0\0"..., > 512) = 512 > fstat64(3, {st_mode=S_IFREG|0755, st_size=32144, ...}) = 0 > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, > -1, > 0) = 0xb7f95000 > mmap2(0x6b3000, 33540, PROT_READ|PROT_EXEC, > MAP_PRIVATE|MAP_DENYWRITE, > 3, 0) = 0x6b3000 > mmap2(0x6bb000, 4096, PROT_READ|PROT_WRITE, > MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7) = 0x6bb000 > close(3) = 0 > open("/lib/libc.so.6", O_RDONLY) = 3 > read(3, > "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\17X\0004\0\0\0"..., > 512) = 512 > fstat64(3, {st_mode=S_IFREG|0755, st_size=1611564, ...}) = 0 > mmap2(0x56b000, 1332676, PROT_READ|PROT_EXEC, > MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x56b000 > mprotect(0x6aa000, 4096, PROT_NONE) = 0 > mmap2(0x6ab000, 12288, PROT_READ|PROT_WRITE, > MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13f) = 0x6ab000 > mmap2(0x6ae000, 9668, PROT_READ|PROT_WRITE, > MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x6ae000 > close(3) = 0 > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, > -1, > 0) = 0xb7f94000 > set_thread_area({entry_number:-1 -> 
6, base_addr:0xb7f946c0, > limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, > limit_in_pages:1, seg_not_present:0, useable:1}) = 0 > mprotect(0x6ab000, 8192, PROT_READ) = 0 > mprotect(0x567000, 4096, PROT_READ) = 0 > munmap(0xb7f96000, 90296) = 0 > time(NULL) = 1262655667 > getpid() = 18781 > brk(0) = 0x942d000 > brk(0x944e000) = 0x944e000 > open("/gfs", O_RDONLY|O_LARGEFILE) = 3 > open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 4 > lstat64("/gfs", {st_mode=S_IFDIR|0755, st_size=3864, ...}) = 0 > fstat64(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, > -1, > 0) = 0xb7fac000 > read(4, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 659 > close(4) = 0 > munmap(0xb7fac000, 4096) = 0 > open("/dev/mapper/vg100-lvol0", O_RDWR|O_LARGEFILE) = 4 > fstat64(4, {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 4), ...}) = 0 > _llseek(4, 0, [1677721600], SEEK_END) = 0 > fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0 > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, > -1, > 0) = 0xb7fac000 > write(1, "Initializing lists...\n", 22Initializing lists... > ) = 22 > _llseek(4, 65536, [65536], SEEK_SET) = 0 > read(4, > "\1\26\31p\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0d\0\0\0\0\0\0\7\t\0\0\7l"..., > 4096) = 4096 > _llseek(4, 0, [1677721600], SEEK_END) = 0 > open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 5 > fstat64(5, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 > mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, > -1, > 0) = 0xb7fab000 > read(5, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 659 > read(5, "", 4096) = 0 > close(5) = 0 > munmap(0xb7fab000, 4096) = 0 > open("/tmp/.gfs2meta", O_RDONLY|O_LARGEFILE) = 5 > fstat64(5, {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0 > close(5) = 0 > mount("/dev/mapper/vg100-lvol0", "/tmp/.gfs2meta", "gfs2meta", 0, > NULL) = -1 EINVAL (Invalid argument) > write(2, "gfs2_grow: ", 11gfs2_grow: ) = 11 > write(2, "Couldn't mount /tmp/.gfs2meta : "..., 49Couldn't mount > /tmp/.gfs2meta : Invalid argument > ) = 49 > exit_group(1) = ? > > > On Tue, Jan 5, 2010 at 6:27 AM, Abhijith Das wrote: > > Hi, > > > > >From the following message, it looks like the gfs2meta mount > routine is not able to locate the gfs2 mountpoint. > > "Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not > exist" > > Can you confirm that /proc/mounts and /etc/mtab all agree on the > mounted gfs2 at /gfs? > > Also, can you run gfs2_grow under strace so that we can see what > arguments gfs2_grow passes to the mount() system call when it tries to > mount the gfs2meta filesystem? > > > > Thanks! > > --Abhi > > > > ----- "Diamond Li" wrote: > > > >> From: "Diamond Li" > >> To: "linux clustering" > >> Sent: Monday, January 4, 2010 2:25:59 AM GMT -06:00 US/Canada > Central > >> Subject: Re: [Linux-cluster] gfs2_grow does not work > >> > >> could someone kindly help me to get through? > >> > >> thanks in advance! > >> > >> On Thu, Dec 31, 2009 at 3:16 PM, Diamond Li > > >> wrote: > >> > from system log, I can see the erorr message: > >> > > >> > Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not > >> exist > >> > > >> > but I have mounted gfs2 file system under /gfs folder and I can > do > >> > operations such as mkdir, rm, successfully. > >> > > >> > > >> > > >> > On Thu, Dec 31, 2009 at 2:55 PM, Diamond Li > > >> wrote: > >> >> Hello, > >> >> > >> >> I am trying to grow a gfs2 file system, unfortunately ?it does > not > >> work. 
> >> >> > >> >> anyone has similar issues or I always have bad luck? > >> >> > >> >> [root at wplccdlvm446 gfs]# mount > >> >> > >> >> /dev/mapper/vg100-lvol0 on /gfs type gfs2 > >> (rw,hostdata=jid=0:id=131074:first=1) > >> >> > >> >> [root at wplccdlvm446 gfs]# gfs2_grow -v /gfs > >> >> Initializing lists... > >> >> gfs2_grow: Couldn't mount /tmp/.gfs2meta : Invalid argument > >> >> > >> >> [root at wplccdlvm446 gfs]# ls -a /tmp/.gfs2meta/ > >> >> . ?.. > >> >> > >> >> > >> >> [root at wplccdlvm446 gfs]# uname -r > >> >> 2.6.18-164.el5 > >> >> > >> >> [root at wplccdlvm446 gfs]# cat /etc/redhat-release > >> >> Red Hat Enterprise Linux Server release 5.4 (Tikanga) > >> >> > >> > > >> > >> -- > >> Linux-cluster mailing list > >> Linux-cluster at redhat.com > >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > > -- > > Linux-cluster mailing list > > Linux-cluster at redhat.com > > https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From diamondiona at gmail.com Tue Jan 5 05:57:08 2010 From: diamondiona at gmail.com (Diamond Li) Date: Tue, 5 Jan 2010 13:57:08 +0800 Subject: [Linux-cluster] gfs2_grow does not work In-Reply-To: <1979230168.2365461262669175952.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> References: <678287746.2365421262668986749.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> <1979230168.2365461262669175952.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Message-ID: Appreciate for your reply! [root at wplccdlvm445 ~]# uname -r 2.6.18-164.el5 [root at wplccdlvm445 ~]# rpm -qa |grep gfs gfs2-utils-0.1.44-1.el5 gfs-utils-0.1.17-1.el5 I am using the packages shipped by Redhat, no any customization. Does it mean gfs2_grow does not work at all in 5.4 release(I wish I am wrong)? I did not find any patch for x86 32 bit CPU, it does have x86_64. On Tue, Jan 5, 2010 at 1:26 PM, Abhijith Das wrote: > Hi Diamond, > > Could I also have the kernel and gfs2-utils rpm versions you are using so I can try this on my setup? I just spotted something in your strace output that could be a problem if you have a newer kernel, but not a newer gfs2-utils package. > > The mount syscall in your strace output takes the device as the first arg to mount the metafs. A recent kernel patch from https://bugzilla.redhat.com/show_bug.cgi?id=457798 changed that to take the mountpoint as the first arg instead. There was a corresponding userland patch to gfs2-utils in https://bugzilla.redhat.com/show_bug.cgi?id=459630#c3 that fixed this mismatch. > I'm not sure if you're seeing this. If so, an upgrade of these packages should fix what you're seeing. > > Cheers! > --Abhi > > ----- "Diamond Li" wrote: > >> From: "Diamond Li" >> To: "linux clustering" >> Sent: Monday, January 4, 2010 8:12:12 PM GMT -06:00 US/Canada Central >> Subject: Re: [Linux-cluster] gfs2_grow does not work >> >> [root at wplccdlvm445 proc]# df -k >> Filesystem ? ? ? ? ? 1K-blocks ? ? ?Used Available Use% Mounted on >> /dev/mapper/VolGroup00-LogVol00 >> ? ? ? ? ? ? ? ? ? ? ? 28376956 ? 9144384 ?17767844 ?34% / >> /dev/sda1 ? ? ? ? ? ? ? 101086 ? ? 12055 ? ? 83812 ?13% /boot >> tmpfs ? ? ? ? ? ? ? ? ?1037748 ? ? ? ? 0 ? 1037748 ? 0% /dev/shm >> /dev/mapper/vg100-lvol0 >> ? ? ? ? ? ? ? ? ? ? ? ? 819024 ? ?794264 ? ? 
24760 ?97% /gfs >> >> [root at wplccdlvm445 proc]# ls -ld /tmp >> drwxrwxrwt 8 root root 4096 Jan ?5 04:02 /tmp >> [root at wplccdlvm445 proc]# ls -ld /tmp/.gfs2meta/ >> drwx------ 2 root root 4096 Dec 31 14:24 /tmp/.gfs2meta/ >> >> >> [root at wplccdlvm445 proc]# cat /proc/mounts >> rootfs / rootfs rw 0 0 >> /dev/root / ext3 rw,data=ordered 0 0 >> /dev /dev tmpfs rw 0 0 >> /proc /proc proc rw 0 0 >> /sys /sys sysfs rw 0 0 >> /proc/bus/usb /proc/bus/usb usbfs rw 0 0 >> devpts /dev/pts devpts rw 0 0 >> /dev/sda1 /boot ext3 rw,data=ordered 0 0 >> tmpfs /dev/shm tmpfs rw 0 0 >> none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0 >> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0 >> /etc/auto.misc /misc autofs >> rw,fd=7,pgrp=2200,timeout=300,minproto=5,maxproto=5,indirect 0 0 >> -hosts /net autofs >> rw,fd=13,pgrp=2200,timeout=300,minproto=5,maxproto=5,indirect 0 0 >> none /sys/kernel/config configfs rw 0 0 >> /dev/mapper/vg100-lvol0 /gfs gfs2 rw,hostdata=jid=0:id=65537:first=1 0 >> 0 >> >> [root at wplccdlvm445 proc]# cat /etc/mtab >> /dev/mapper/VolGroup00-LogVol00 / ext3 rw 0 0 >> proc /proc proc rw 0 0 >> sysfs /sys sysfs rw 0 0 >> devpts /dev/pts devpts rw,gid=5,mode=620 0 0 >> /dev/sda1 /boot ext3 rw 0 0 >> tmpfs /dev/shm tmpfs rw 0 0 >> none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0 >> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0 >> none /sys/kernel/config configfs rw 0 0 >> /dev/mapper/vg100-lvol0 /gfs gfs2 rw,hostdata=jid=0:id=65537:first=1 0 >> 0 >> >> >> [root at wplccdlvm445 proc]# strace gfs2_grow -v /gfs >> execve("/sbin/gfs2_grow", ["gfs2_grow", "-v", "/gfs"], [/* 29 vars >> */]) = 0 >> brk(0) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0x942d000 >> access("/etc/ld.so.preload", R_OK) ? ? ?= -1 ENOENT (No such file or >> directory) >> open("/etc/ld.so.cache", O_RDONLY) ? ? ?= 3 >> fstat64(3, {st_mode=S_IFREG|0644, st_size=90296, ...}) = 0 >> mmap2(NULL, 90296, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f96000 >> close(3) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >> open("/lib/libvolume_id.so.0", O_RDONLY) = 3 >> read(3, >> "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360 at k\0004\0\0\0"..., >> 512) = 512 >> fstat64(3, {st_mode=S_IFREG|0755, st_size=32144, ...}) = 0 >> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >> -1, >> 0) = 0xb7f95000 >> mmap2(0x6b3000, 33540, PROT_READ|PROT_EXEC, >> MAP_PRIVATE|MAP_DENYWRITE, >> 3, 0) = 0x6b3000 >> mmap2(0x6bb000, 4096, PROT_READ|PROT_WRITE, >> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7) = 0x6bb000 >> close(3) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >> open("/lib/libc.so.6", O_RDONLY) ? ? ? ?= 3 >> read(3, >> "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\17X\0004\0\0\0"..., >> 512) = 512 >> fstat64(3, {st_mode=S_IFREG|0755, st_size=1611564, ...}) = 0 >> mmap2(0x56b000, 1332676, PROT_READ|PROT_EXEC, >> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x56b000 >> mprotect(0x6aa000, 4096, PROT_NONE) ? ? = 0 >> mmap2(0x6ab000, 12288, PROT_READ|PROT_WRITE, >> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13f) = 0x6ab000 >> mmap2(0x6ae000, 9668, PROT_READ|PROT_WRITE, >> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x6ae000 >> close(3) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >> -1, >> 0) = 0xb7f94000 >> set_thread_area({entry_number:-1 -> 6, base_addr:0xb7f946c0, >> limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, >> limit_in_pages:1, seg_not_present:0, useable:1}) = 0 >> mprotect(0x6ab000, 8192, PROT_READ) ? ? = 0 >> mprotect(0x567000, 4096, PROT_READ) ? ? 
= 0 >> munmap(0xb7f96000, 90296) ? ? ? ? ? ? ? = 0 >> time(NULL) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 1262655667 >> getpid() ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 18781 >> brk(0) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0x942d000 >> brk(0x944e000) ? ? ? ? ? ? ? ? ? ? ? ? ?= 0x944e000 >> open("/gfs", O_RDONLY|O_LARGEFILE) ? ? ?= 3 >> open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 4 >> lstat64("/gfs", {st_mode=S_IFDIR|0755, st_size=3864, ...}) = 0 >> fstat64(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 >> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >> -1, >> 0) = 0xb7fac000 >> read(4, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 659 >> close(4) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >> munmap(0xb7fac000, 4096) ? ? ? ? ? ? ? ?= 0 >> open("/dev/mapper/vg100-lvol0", O_RDWR|O_LARGEFILE) = 4 >> fstat64(4, {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 4), ...}) = 0 >> _llseek(4, 0, [1677721600], SEEK_END) ? = 0 >> fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0 >> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >> -1, >> 0) = 0xb7fac000 >> write(1, "Initializing lists...\n", 22Initializing lists... >> ) = 22 >> _llseek(4, 65536, [65536], SEEK_SET) ? ?= 0 >> read(4, >> "\1\26\31p\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0d\0\0\0\0\0\0\7\t\0\0\7l"..., >> 4096) = 4096 >> _llseek(4, 0, [1677721600], SEEK_END) ? = 0 >> open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 5 >> fstat64(5, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 >> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >> -1, >> 0) = 0xb7fab000 >> read(5, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 659 >> read(5, "", 4096) ? ? ? ? ? ? ? ? ? ? ? = 0 >> close(5) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >> munmap(0xb7fab000, 4096) ? ? ? ? ? ? ? ?= 0 >> open("/tmp/.gfs2meta", O_RDONLY|O_LARGEFILE) = 5 >> fstat64(5, {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0 >> close(5) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >> mount("/dev/mapper/vg100-lvol0", "/tmp/.gfs2meta", "gfs2meta", 0, >> NULL) = -1 EINVAL (Invalid argument) >> write(2, "gfs2_grow: ", 11gfs2_grow: ) ? ? ? ? ? ? = 11 >> write(2, "Couldn't mount /tmp/.gfs2meta : "..., 49Couldn't mount >> /tmp/.gfs2meta : Invalid argument >> ) = 49 >> exit_group(1) ? ? ? ? ? ? ? ? ? ? ? ? ? = ? >> >> >> On Tue, Jan 5, 2010 at 6:27 AM, Abhijith Das wrote: >> > Hi, >> > >> > >From the following message, it looks like the gfs2meta mount >> routine is not able to locate the gfs2 mountpoint. >> > "Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not >> exist" >> > Can you confirm that /proc/mounts and /etc/mtab all agree on the >> mounted gfs2 at /gfs? >> > Also, can you run gfs2_grow under strace so that we can see what >> arguments gfs2_grow passes to the mount() system call when it tries to >> mount the gfs2meta filesystem? >> > >> > Thanks! >> > --Abhi >> > >> > ----- "Diamond Li" wrote: >> > >> >> From: "Diamond Li" >> >> To: "linux clustering" >> >> Sent: Monday, January 4, 2010 2:25:59 AM GMT -06:00 US/Canada >> Central >> >> Subject: Re: [Linux-cluster] gfs2_grow does not work >> >> >> >> could someone kindly help me to get through? >> >> >> >> thanks in advance! >> >> >> >> On Thu, Dec 31, 2009 at 3:16 PM, Diamond Li >> >> >> wrote: >> >> > from system log, I can see the erorr message: >> >> > >> >> > Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not >> >> exist >> >> > >> >> > but I have mounted gfs2 file system under /gfs folder and I can >> do >> >> > operations such as mkdir, rm, successfully. 
>> >> > >> >> > >> >> > >> >> > On Thu, Dec 31, 2009 at 2:55 PM, Diamond Li >> >> >> wrote: >> >> >> Hello, >> >> >> >> >> >> I am trying to grow a gfs2 file system, unfortunately ?it does >> not >> >> work. >> >> >> >> >> >> anyone has similar issues or I always have bad luck? >> >> >> >> >> >> [root at wplccdlvm446 gfs]# mount >> >> >> >> >> >> /dev/mapper/vg100-lvol0 on /gfs type gfs2 >> >> (rw,hostdata=jid=0:id=131074:first=1) >> >> >> >> >> >> [root at wplccdlvm446 gfs]# gfs2_grow -v /gfs >> >> >> Initializing lists... >> >> >> gfs2_grow: Couldn't mount /tmp/.gfs2meta : Invalid argument >> >> >> >> >> >> [root at wplccdlvm446 gfs]# ls -a /tmp/.gfs2meta/ >> >> >> . ?.. >> >> >> >> >> >> >> >> >> [root at wplccdlvm446 gfs]# uname -r >> >> >> 2.6.18-164.el5 >> >> >> >> >> >> [root at wplccdlvm446 gfs]# cat /etc/redhat-release >> >> >> Red Hat Enterprise Linux Server release 5.4 (Tikanga) >> >> >> >> >> > >> >> >> >> -- >> >> Linux-cluster mailing list >> >> Linux-cluster at redhat.com >> >> https://www.redhat.com/mailman/listinfo/linux-cluster >> > >> > -- >> > Linux-cluster mailing list >> > Linux-cluster at redhat.com >> > https://www.redhat.com/mailman/listinfo/linux-cluster >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From diamondiona at gmail.com Tue Jan 5 06:00:33 2010 From: diamondiona at gmail.com (Diamond Li) Date: Tue, 5 Jan 2010 14:00:33 +0800 Subject: [Linux-cluster] gfs2_grow does not work In-Reply-To: References: <678287746.2365421262668986749.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> <1979230168.2365461262669175952.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Message-ID: by the way, from the link, the version is 5.3. but my version is [root at wplccdlvm445 ~]# cat /etc/redhat-release Red Hat Enterprise Linux Server release 5.4 (Tikanga) On Tue, Jan 5, 2010 at 1:57 PM, Diamond Li wrote: > Appreciate for your reply! > > > [root at wplccdlvm445 ~]# uname -r > 2.6.18-164.el5 > > [root at wplccdlvm445 ~]# rpm -qa |grep gfs > gfs2-utils-0.1.44-1.el5 > gfs-utils-0.1.17-1.el5 > > I am using the packages shipped by Redhat, no any customization. Does > it mean gfs2_grow does not work at all in 5.4 release(I wish I am > wrong)? > > I did not find any patch for x86 32 bit CPU, ?it does have x86_64. > > > > On Tue, Jan 5, 2010 at 1:26 PM, Abhijith Das wrote: >> Hi Diamond, >> >> Could I also have the kernel and gfs2-utils rpm versions you are using so I can try this on my setup? I just spotted something in your strace output that could be a problem if you have a newer kernel, but not a newer gfs2-utils package. >> >> The mount syscall in your strace output takes the device as the first arg to mount the metafs. A recent kernel patch from https://bugzilla.redhat.com/show_bug.cgi?id=457798 changed that to take the mountpoint as the first arg instead. There was a corresponding userland patch to gfs2-utils in https://bugzilla.redhat.com/show_bug.cgi?id=459630#c3 that fixed this mismatch. >> I'm not sure if you're seeing this. If so, an upgrade of these packages should fix what you're seeing. >> >> Cheers! 
>> --Abhi >> >> ----- "Diamond Li" wrote: >> >>> From: "Diamond Li" >>> To: "linux clustering" >>> Sent: Monday, January 4, 2010 8:12:12 PM GMT -06:00 US/Canada Central >>> Subject: Re: [Linux-cluster] gfs2_grow does not work >>> >>> [root at wplccdlvm445 proc]# df -k >>> Filesystem ? ? ? ? ? 1K-blocks ? ? ?Used Available Use% Mounted on >>> /dev/mapper/VolGroup00-LogVol00 >>> ? ? ? ? ? ? ? ? ? ? ? 28376956 ? 9144384 ?17767844 ?34% / >>> /dev/sda1 ? ? ? ? ? ? ? 101086 ? ? 12055 ? ? 83812 ?13% /boot >>> tmpfs ? ? ? ? ? ? ? ? ?1037748 ? ? ? ? 0 ? 1037748 ? 0% /dev/shm >>> /dev/mapper/vg100-lvol0 >>> ? ? ? ? ? ? ? ? ? ? ? ? 819024 ? ?794264 ? ? 24760 ?97% /gfs >>> >>> [root at wplccdlvm445 proc]# ls -ld /tmp >>> drwxrwxrwt 8 root root 4096 Jan ?5 04:02 /tmp >>> [root at wplccdlvm445 proc]# ls -ld /tmp/.gfs2meta/ >>> drwx------ 2 root root 4096 Dec 31 14:24 /tmp/.gfs2meta/ >>> >>> >>> [root at wplccdlvm445 proc]# cat /proc/mounts >>> rootfs / rootfs rw 0 0 >>> /dev/root / ext3 rw,data=ordered 0 0 >>> /dev /dev tmpfs rw 0 0 >>> /proc /proc proc rw 0 0 >>> /sys /sys sysfs rw 0 0 >>> /proc/bus/usb /proc/bus/usb usbfs rw 0 0 >>> devpts /dev/pts devpts rw 0 0 >>> /dev/sda1 /boot ext3 rw,data=ordered 0 0 >>> tmpfs /dev/shm tmpfs rw 0 0 >>> none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0 >>> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0 >>> /etc/auto.misc /misc autofs >>> rw,fd=7,pgrp=2200,timeout=300,minproto=5,maxproto=5,indirect 0 0 >>> -hosts /net autofs >>> rw,fd=13,pgrp=2200,timeout=300,minproto=5,maxproto=5,indirect 0 0 >>> none /sys/kernel/config configfs rw 0 0 >>> /dev/mapper/vg100-lvol0 /gfs gfs2 rw,hostdata=jid=0:id=65537:first=1 0 >>> 0 >>> >>> [root at wplccdlvm445 proc]# cat /etc/mtab >>> /dev/mapper/VolGroup00-LogVol00 / ext3 rw 0 0 >>> proc /proc proc rw 0 0 >>> sysfs /sys sysfs rw 0 0 >>> devpts /dev/pts devpts rw,gid=5,mode=620 0 0 >>> /dev/sda1 /boot ext3 rw 0 0 >>> tmpfs /dev/shm tmpfs rw 0 0 >>> none /proc/sys/fs/binfmt_misc binfmt_misc rw 0 0 >>> sunrpc /var/lib/nfs/rpc_pipefs rpc_pipefs rw 0 0 >>> none /sys/kernel/config configfs rw 0 0 >>> /dev/mapper/vg100-lvol0 /gfs gfs2 rw,hostdata=jid=0:id=65537:first=1 0 >>> 0 >>> >>> >>> [root at wplccdlvm445 proc]# strace gfs2_grow -v /gfs >>> execve("/sbin/gfs2_grow", ["gfs2_grow", "-v", "/gfs"], [/* 29 vars >>> */]) = 0 >>> brk(0) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0x942d000 >>> access("/etc/ld.so.preload", R_OK) ? ? ?= -1 ENOENT (No such file or >>> directory) >>> open("/etc/ld.so.cache", O_RDONLY) ? ? ?= 3 >>> fstat64(3, {st_mode=S_IFREG|0644, st_size=90296, ...}) = 0 >>> mmap2(NULL, 90296, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb7f96000 >>> close(3) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >>> open("/lib/libvolume_id.so.0", O_RDONLY) = 3 >>> read(3, >>> "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360 at k\0004\0\0\0"..., >>> 512) = 512 >>> fstat64(3, {st_mode=S_IFREG|0755, st_size=32144, ...}) = 0 >>> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >>> -1, >>> 0) = 0xb7f95000 >>> mmap2(0x6b3000, 33540, PROT_READ|PROT_EXEC, >>> MAP_PRIVATE|MAP_DENYWRITE, >>> 3, 0) = 0x6b3000 >>> mmap2(0x6bb000, 4096, PROT_READ|PROT_WRITE, >>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x7) = 0x6bb000 >>> close(3) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >>> open("/lib/libc.so.6", O_RDONLY) ? ? ? 
?= 3 >>> read(3, >>> "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\340\17X\0004\0\0\0"..., >>> 512) = 512 >>> fstat64(3, {st_mode=S_IFREG|0755, st_size=1611564, ...}) = 0 >>> mmap2(0x56b000, 1332676, PROT_READ|PROT_EXEC, >>> MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x56b000 >>> mprotect(0x6aa000, 4096, PROT_NONE) ? ? = 0 >>> mmap2(0x6ab000, 12288, PROT_READ|PROT_WRITE, >>> MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x13f) = 0x6ab000 >>> mmap2(0x6ae000, 9668, PROT_READ|PROT_WRITE, >>> MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x6ae000 >>> close(3) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >>> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >>> -1, >>> 0) = 0xb7f94000 >>> set_thread_area({entry_number:-1 -> 6, base_addr:0xb7f946c0, >>> limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, >>> limit_in_pages:1, seg_not_present:0, useable:1}) = 0 >>> mprotect(0x6ab000, 8192, PROT_READ) ? ? = 0 >>> mprotect(0x567000, 4096, PROT_READ) ? ? = 0 >>> munmap(0xb7f96000, 90296) ? ? ? ? ? ? ? = 0 >>> time(NULL) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 1262655667 >>> getpid() ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 18781 >>> brk(0) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0x942d000 >>> brk(0x944e000) ? ? ? ? ? ? ? ? ? ? ? ? ?= 0x944e000 >>> open("/gfs", O_RDONLY|O_LARGEFILE) ? ? ?= 3 >>> open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 4 >>> lstat64("/gfs", {st_mode=S_IFDIR|0755, st_size=3864, ...}) = 0 >>> fstat64(4, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 >>> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >>> -1, >>> 0) = 0xb7fac000 >>> read(4, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 659 >>> close(4) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >>> munmap(0xb7fac000, 4096) ? ? ? ? ? ? ? ?= 0 >>> open("/dev/mapper/vg100-lvol0", O_RDWR|O_LARGEFILE) = 4 >>> fstat64(4, {st_mode=S_IFBLK|0660, st_rdev=makedev(253, 4), ...}) = 0 >>> _llseek(4, 0, [1677721600], SEEK_END) ? = 0 >>> fstat64(1, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 1), ...}) = 0 >>> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >>> -1, >>> 0) = 0xb7fac000 >>> write(1, "Initializing lists...\n", 22Initializing lists... >>> ) = 22 >>> _llseek(4, 65536, [65536], SEEK_SET) ? ?= 0 >>> read(4, >>> "\1\26\31p\0\0\0\1\0\0\0\0\0\0\0\0\0\0\0d\0\0\0\0\0\0\7\t\0\0\7l"..., >>> 4096) = 4096 >>> _llseek(4, 0, [1677721600], SEEK_END) ? = 0 >>> open("/proc/mounts", O_RDONLY|O_LARGEFILE) = 5 >>> fstat64(5, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0 >>> mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, >>> -1, >>> 0) = 0xb7fab000 >>> read(5, "rootfs / rootfs rw 0 0\n/dev/root"..., 4096) = 659 >>> read(5, "", 4096) ? ? ? ? ? ? ? ? ? ? ? = 0 >>> close(5) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >>> munmap(0xb7fab000, 4096) ? ? ? ? ? ? ? ?= 0 >>> open("/tmp/.gfs2meta", O_RDONLY|O_LARGEFILE) = 5 >>> fstat64(5, {st_mode=S_IFDIR|0700, st_size=4096, ...}) = 0 >>> close(5) ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ?= 0 >>> mount("/dev/mapper/vg100-lvol0", "/tmp/.gfs2meta", "gfs2meta", 0, >>> NULL) = -1 EINVAL (Invalid argument) >>> write(2, "gfs2_grow: ", 11gfs2_grow: ) ? ? ? ? ? ? = 11 >>> write(2, "Couldn't mount /tmp/.gfs2meta : "..., 49Couldn't mount >>> /tmp/.gfs2meta : Invalid argument >>> ) = 49 >>> exit_group(1) ? ? ? ? ? ? ? ? ? ? ? ? ? = ? >>> >>> >>> On Tue, Jan 5, 2010 at 6:27 AM, Abhijith Das wrote: >>> > Hi, >>> > >>> > >From the following message, it looks like the gfs2meta mount >>> routine is not able to locate the gfs2 mountpoint. 
>>> > "Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not >>> exist" >>> > Can you confirm that /proc/mounts and /etc/mtab all agree on the >>> mounted gfs2 at /gfs? >>> > Also, can you run gfs2_grow under strace so that we can see what >>> arguments gfs2_grow passes to the mount() system call when it tries to >>> mount the gfs2meta filesystem? >>> > >>> > Thanks! >>> > --Abhi >>> > >>> > ----- "Diamond Li" wrote: >>> > >>> >> From: "Diamond Li" >>> >> To: "linux clustering" >>> >> Sent: Monday, January 4, 2010 2:25:59 AM GMT -06:00 US/Canada >>> Central >>> >> Subject: Re: [Linux-cluster] gfs2_grow does not work >>> >> >>> >> could someone kindly help me to get through? >>> >> >>> >> thanks in advance! >>> >> >>> >> On Thu, Dec 31, 2009 at 3:16 PM, Diamond Li >>> >>> >> wrote: >>> >> > from system log, I can see the erorr message: >>> >> > >>> >> > Dec 31 15:04:56 wplccdlvm446 kernel: GFS2: gfs2 mount does not >>> >> exist >>> >> > >>> >> > but I have mounted gfs2 file system under /gfs folder and I can >>> do >>> >> > operations such as mkdir, rm, successfully. >>> >> > >>> >> > >>> >> > >>> >> > On Thu, Dec 31, 2009 at 2:55 PM, Diamond Li >>> >>> >> wrote: >>> >> >> Hello, >>> >> >> >>> >> >> I am trying to grow a gfs2 file system, unfortunately ?it does >>> not >>> >> work. >>> >> >> >>> >> >> anyone has similar issues or I always have bad luck? >>> >> >> >>> >> >> [root at wplccdlvm446 gfs]# mount >>> >> >> >>> >> >> /dev/mapper/vg100-lvol0 on /gfs type gfs2 >>> >> (rw,hostdata=jid=0:id=131074:first=1) >>> >> >> >>> >> >> [root at wplccdlvm446 gfs]# gfs2_grow -v /gfs >>> >> >> Initializing lists... >>> >> >> gfs2_grow: Couldn't mount /tmp/.gfs2meta : Invalid argument >>> >> >> >>> >> >> [root at wplccdlvm446 gfs]# ls -a /tmp/.gfs2meta/ >>> >> >> . ?.. >>> >> >> >>> >> >> >>> >> >> [root at wplccdlvm446 gfs]# uname -r >>> >> >> 2.6.18-164.el5 >>> >> >> >>> >> >> [root at wplccdlvm446 gfs]# cat /etc/redhat-release >>> >> >> Red Hat Enterprise Linux Server release 5.4 (Tikanga) >>> >> >> >>> >> > >>> >> >>> >> -- >>> >> Linux-cluster mailing list >>> >> Linux-cluster at redhat.com >>> >> https://www.redhat.com/mailman/listinfo/linux-cluster >>> > >>> > -- >>> > Linux-cluster mailing list >>> > Linux-cluster at redhat.com >>> > https://www.redhat.com/mailman/listinfo/linux-cluster >>> >>> -- >>> Linux-cluster mailing list >>> Linux-cluster at redhat.com >>> https://www.redhat.com/mailman/listinfo/linux-cluster >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > From adas at redhat.com Tue Jan 5 16:48:15 2010 From: adas at redhat.com (Abhijith Das) Date: Tue, 5 Jan 2010 11:48:15 -0500 (EST) Subject: [Linux-cluster] gfs2_grow does not work In-Reply-To: <1328423852.2405981262710057211.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Message-ID: <1673557480.2406041262710095635.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> ----- "Diamond Li" wrote: > From: "Diamond Li" > To: "linux clustering" > Sent: Monday, January 4, 2010 11:57:08 PM GMT -06:00 US/Canada Central > Subject: Re: [Linux-cluster] gfs2_grow does not work > > Appreciate for your reply! > > > [root at wplccdlvm445 ~]# uname -r > 2.6.18-164.el5 > > [root at wplccdlvm445 ~]# rpm -qa |grep gfs > gfs2-utils-0.1.44-1.el5 There's your problem :). This gfs2-utils package is pretty old (RHEL5.2 timeframe). The one that shipped with RHEL5.4 is gfs2-utils-0.1.62-1.el5. Please upgrade to this version and try again. Cheers! 
--Abhi From diamondiona at gmail.com Wed Jan 6 02:53:33 2010 From: diamondiona at gmail.com (Diamond Li) Date: Wed, 6 Jan 2010 10:53:33 +0800 Subject: [Linux-cluster] gfs2_grow does not work In-Reply-To: <1673557480.2406041262710095635.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> References: <1328423852.2405981262710057211.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> <1673557480.2406041262710095635.JavaMail.root@zmail05.collab.prod.int.phx2.redhat.com> Message-ID: Super! yes, I made mistake, installed this package from 5.2 yum repository. now, it is working perfectly! really appreciate for your help! On Wed, Jan 6, 2010 at 12:48 AM, Abhijith Das wrote: > > ----- "Diamond Li" wrote: > >> From: "Diamond Li" >> To: "linux clustering" >> Sent: Monday, January 4, 2010 11:57:08 PM GMT -06:00 US/Canada Central >> Subject: Re: [Linux-cluster] gfs2_grow does not work >> >> Appreciate for your reply! >> >> >> [root at wplccdlvm445 ~]# uname -r >> 2.6.18-164.el5 >> >> [root at wplccdlvm445 ~]# rpm -qa |grep gfs >> gfs2-utils-0.1.44-1.el5 > > There's your problem :). This gfs2-utils package is pretty old (RHEL5.2 timeframe). The one that shipped with RHEL5.4 is gfs2-utils-0.1.62-1.el5. Please upgrade to this version and try again. > > Cheers! > --Abhi > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > From ccook at pandora.com Wed Jan 6 18:03:05 2010 From: ccook at pandora.com (Christopher Strider Cook) Date: Wed, 06 Jan 2010 10:03:05 -0800 Subject: [Linux-cluster] Will a service fail cause a node to fence? Message-ID: <4B44D059.6030801@pandora.com> If a cluster looses communication with a node then fencing will take place, but if a service fails and fails to stop/exit cleanly so another node can take over, will a fencing operation take place? Cluster 3, corosync 1, rgmanager 3 Thanks, Chris From pradhanparas at gmail.com Wed Jan 6 23:47:08 2010 From: pradhanparas at gmail.com (Paras pradhan) Date: Wed, 6 Jan 2010 17:47:08 -0600 Subject: [Linux-cluster] GFS and performace Message-ID: <8b711df41001061547h757f57fdi23fed852b5ac536a@mail.gmail.com> I have a GFS based shared storage cluster that connects to SAN by fibre channel. This GFS shared storage hold several virtual machines. While running hdparam from the host to a GFS share, I get following results. -- hdparm -t /guest_vms1 /dev/mapper/test_vg1-prd_vg1_lv: Timing buffered disk reads: 262 MB in 3.00 seconds = 87.24 MB/sec --- Now from within the virtual machines, the I/O is low --- hdparm -t /dev/mapper/VolGroup00-LogVol00 /dev/mapper/VolGroup00-LogVol00: Timing buffered disk reads: 88 MB in 3.00 seconds = 29.31 MB/sec --- I am looking for possibilities if I can increase my I/O read write within my virtual machines. Tuning GFS does help in this case? Sorry if my question is not relevant to this list Thanks Paras. -------------- next part -------------- An HTML attachment was scrubbed... URL: From sdake at redhat.com Wed Jan 6 23:55:40 2010 From: sdake at redhat.com (Steven Dake) Date: Wed, 06 Jan 2010 16:55:40 -0700 Subject: [Linux-cluster] GFS and performace In-Reply-To: <8b711df41001061547h757f57fdi23fed852b5ac536a@mail.gmail.com> References: <8b711df41001061547h757f57fdi23fed852b5ac536a@mail.gmail.com> Message-ID: <1262822140.2588.3.camel@localhost.localdomain> Virtual machines use memory copies between the physical device and the guest OS. 
Clearly this is an area where more work is being done in the virtualization community but is outside the scope of the typical filesystem gfs or otherwise. You might ask about io performance tuning on the respective virtualization technology mailing list you use. Regards -steve On Wed, 2010-01-06 at 17:47 -0600, Paras pradhan wrote: > I have a GFS based shared storage cluster that connects to SAN by > fibre channel. This GFS shared storage hold several virtual machines. > While running hdparam from the host to a GFS share, I get following > results. > > > -- > hdparm -t /guest_vms1 > > > /dev/mapper/test_vg1-prd_vg1_lv: > Timing buffered disk reads: 262 MB in 3.00 seconds = 87.24 MB/sec > --- > > > > > Now from within the virtual machines, the I/O is low > > > --- > hdparm -t /dev/mapper/VolGroup00-LogVol00 > > > /dev/mapper/VolGroup00-LogVol00: > Timing buffered disk reads: 88 MB in 3.00 seconds = 29.31 MB/sec > --- > > > I am looking for possibilities if I can increase my I/O read write > within my virtual machines. Tuning GFS does help in this case? > > > Sorry if my question is not relevant to this list > > > > > Thanks > Paras. > > > > > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From gordan at bobich.net Thu Jan 7 00:13:50 2010 From: gordan at bobich.net (Gordan Bobic) Date: Thu, 07 Jan 2010 00:13:50 +0000 Subject: [Linux-cluster] GFS and performace In-Reply-To: <8b711df41001061547h757f57fdi23fed852b5ac536a@mail.gmail.com> References: <8b711df41001061547h757f57fdi23fed852b5ac536a@mail.gmail.com> Message-ID: <4B45273E.8050807@bobich.net> Paras pradhan wrote: > I have a GFS based shared storage cluster that connects to SAN by fibre > channel. This GFS shared storage hold several virtual machines. While > running hdparam from the host to a GFS share, I get following results. > > -- > hdparm -t /guest_vms1 > > /dev/mapper/test_vg1-prd_vg1_lv: > Timing buffered disk reads: 262 MB in 3.00 seconds = 87.24 MB/sec > --- > > > Now from within the virtual machines, the I/O is low > > --- > hdparm -t /dev/mapper/VolGroup00-LogVol00 > > /dev/mapper/VolGroup00-LogVol00: > Timing buffered disk reads: 88 MB in 3.00 seconds = 29.31 MB/sec > --- > > I am looking for possibilities if I can increase my I/O read write > within my virtual machines. Tuning GFS does help in this case? > > Sorry if my question is not relevant to this list I suspect you'll find that is pretty normal for virtualization-induced I/O penalty. Virtualization really, trully, utterly sucks when it comes to I/O performance. My I/O performance tests (done using kernel building) show that the bottleneck was always disk I/O (including when the entire kernel source tree is pre-cached, with a 2GB of RAM guest. The _least_ horribly performing virtualization solution was VMware (tested with latest player 3.0, but verified against the latest server, too). That managed to complete the task in "only" 140% of the time the bare metal machine did (the host machine had it's memory limited to 2GB with the mem= kernel option to make sure the test was fair). So, 40% slower than bare metal. Paravirtualized Xen was close behind, followed very closely by non-paravirtualized KVM (which was actually slower when paravirtualized drivers were used!). VirtualBox came so far behind it's not even worth mentioning. Nevertheless, it shows that the whole "performance being close to bare metal" premise is completely mythical and comes from very selective tests (e.g. 
only testing CPU intensive tasks). But then again we all knew that, right? Gordan From frank at si.ct.upc.edu Thu Jan 7 07:43:05 2010 From: frank at si.ct.upc.edu (frank) Date: Thu, 07 Jan 2010 08:43:05 +0100 Subject: [Linux-cluster] lock_dlm but local flocks = true? In-Reply-To: References: Message-ID: <4B459089.5030807@si.ct.upc.edu> Hi Steve, I have not answered before because I was on holidays. By the way, happy new year. I have looked /proc/mounts as you told me, and ... surprise: /dev/mapper/volCluster-lvol0 /mnt/gfs gfs rw,hostdata=jid=0:id=196610:first=1,localflocks 0 0 "localflocks" is there! I don't understand because I mount it using "/etc/init.d/gfs start" which looks at /etc/fstab, and there the line is: /dev/volCluster/lvol0 /mnt/gfs gfs defaults 0 0 I must admit that there is a particular thing in this system which I thought it didn't affect, but I am not so sure now, and that is it is a OpenVZ patched kernel. Can this have something to do with gfs mounts? Thanks for your help once more. Frank > Date: Wed, 23 Dec 2009 15:15:28 +0000 From: Steven Whitehouse > To: linux clustering > Subject: Re: [Linux-cluster] lock_dlm but local flocks = true? > Message-ID: <1261581328.14393.113.camel at localhost.localdomain> > Content-Type: text/plain Hi, On Wed, 2009-12-23 at 15:53 +0100, frank > wrote: >> > Hi Steve, thanks for your answer >> > but I have not put the "localflocks" mount parameter anywhere. Look at >> > "gfs_tool df" output: >> > >> > # gfs_tool df /mnt/gfs >> > /mnt/gfs: >> > SB lock proto = "lock_dlm" >> > SB lock table = "H-N:gfs01" >> > SB ondisk format = 1309 >> > SB multihost format = 1401 >> > Block size = 4096 >> > Journals = 2 >> > Resource Groups = 200 >> > Mounted lock proto = "lock_dlm" >> > Mounted lock table = "H-N:gfs01" >> > Mounted host data = "jid=0:id=196610:first=1" >> > Journal number = 0 >> > Lock module flags = 0 >> > Local flocks = TRUE >> > Local caching = FALSE >> > Oopses OK = FALSE >> > >> > it says 'Mounted lock proto = "lock_dlm" ' because that is what I did. >> > So why is it using "local flocks"? >> > >> > I don't know. What does it say in /proc/mounts? (or what was your mount > command line?) > > Steve. > -- Aquest missatge ha estat analitzat per MailScanner a la cerca de virus i d'altres continguts perillosos, i es considera que est? net. For all your IT requirements visit: http://www.transtec.co.uk From pasik at iki.fi Thu Jan 7 08:34:29 2010 From: pasik at iki.fi (Pasi =?iso-8859-1?Q?K=E4rkk=E4inen?=) Date: Thu, 7 Jan 2010 10:34:29 +0200 Subject: [Linux-cluster] GFS and performace In-Reply-To: <4B45273E.8050807@bobich.net> References: <8b711df41001061547h757f57fdi23fed852b5ac536a@mail.gmail.com> <4B45273E.8050807@bobich.net> Message-ID: <20100107083429.GO25902@reaktio.net> On Thu, Jan 07, 2010 at 12:13:50AM +0000, Gordan Bobic wrote: > Paras pradhan wrote: > >I have a GFS based shared storage cluster that connects to SAN by fibre > >channel. This GFS shared storage hold several virtual machines. While > >running hdparam from the host to a GFS share, I get following results. 
> > > >-- > >hdparm -t /guest_vms1 > > > >/dev/mapper/test_vg1-prd_vg1_lv: > >Timing buffered disk reads: 262 MB in 3.00 seconds = 87.24 MB/sec > >--- > > > > > >Now from within the virtual machines, the I/O is low > > > >--- > >hdparm -t /dev/mapper/VolGroup00-LogVol00 > > > >/dev/mapper/VolGroup00-LogVol00: > > Timing buffered disk reads: 88 MB in 3.00 seconds = 29.31 MB/sec > >--- > > > >I am looking for possibilities if I can increase my I/O read write > >within my virtual machines. Tuning GFS does help in this case? > > > >Sorry if my question is not relevant to this list > > I suspect you'll find that is pretty normal for virtualization-induced > I/O penalty. Virtualization really, trully, utterly sucks when it comes > to I/O performance. > > My I/O performance tests (done using kernel building) show that the > bottleneck was always disk I/O (including when the entire kernel source > tree is pre-cached, with a 2GB of RAM guest. The _least_ horribly > performing virtualization solution was VMware (tested with latest player > 3.0, but verified against the latest server, too). That managed to > complete the task in "only" 140% of the time the bare metal machine did > (the host machine had it's memory limited to 2GB with the mem= kernel > option to make sure the test was fair). So, 40% slower than bare metal. > > Paravirtualized Xen was close behind, followed very closely by > non-paravirtualized KVM (which was actually slower when paravirtualized > drivers were used!). VirtualBox came so far behind it's not even worth > mentioning. > What, you're saying VMware Server (and player) were faster than Xen PV? I have hard time believing that.. based on my own experiences. -- Pasi From gordan at bobich.net Thu Jan 7 09:24:22 2010 From: gordan at bobich.net (Gordan Bobic) Date: Thu, 07 Jan 2010 09:24:22 +0000 Subject: [Linux-cluster] GFS and performace In-Reply-To: <20100107083429.GO25902@reaktio.net> References: <8b711df41001061547h757f57fdi23fed852b5ac536a@mail.gmail.com> <4B45273E.8050807@bobich.net> <20100107083429.GO25902@reaktio.net> Message-ID: <4B45A846.1090909@bobich.net> Pasi K?rkk?inen wrote: > On Thu, Jan 07, 2010 at 12:13:50AM +0000, Gordan Bobic wrote: >> Paras pradhan wrote: >>> I have a GFS based shared storage cluster that connects to SAN by fibre >>> channel. This GFS shared storage hold several virtual machines. While >>> running hdparam from the host to a GFS share, I get following results. >>> >>> -- >>> hdparm -t /guest_vms1 >>> >>> /dev/mapper/test_vg1-prd_vg1_lv: >>> Timing buffered disk reads: 262 MB in 3.00 seconds = 87.24 MB/sec >>> --- >>> >>> >>> Now from within the virtual machines, the I/O is low >>> >>> --- >>> hdparm -t /dev/mapper/VolGroup00-LogVol00 >>> >>> /dev/mapper/VolGroup00-LogVol00: >>> Timing buffered disk reads: 88 MB in 3.00 seconds = 29.31 MB/sec >>> --- >>> >>> I am looking for possibilities if I can increase my I/O read write >>> within my virtual machines. Tuning GFS does help in this case? >>> >>> Sorry if my question is not relevant to this list >> I suspect you'll find that is pretty normal for virtualization-induced >> I/O penalty. Virtualization really, trully, utterly sucks when it comes >> to I/O performance. >> >> My I/O performance tests (done using kernel building) show that the >> bottleneck was always disk I/O (including when the entire kernel source >> tree is pre-cached, with a 2GB of RAM guest. 
The _least_ horribly
>> performing virtualization solution was VMware (tested with latest player
>> 3.0, but verified against the latest server, too). That managed to
>> complete the task in "only" 140% of the time the bare metal machine did
>> (the host machine had it's memory limited to 2GB with the mem= kernel
>> option to make sure the test was fair). So, 40% slower than bare metal.
>>
>> Paravirtualized Xen was close behind, followed very closely by
>> non-paravirtualized KVM (which was actually slower when paravirtualized
>> drivers were used!). VirtualBox came so far behind it's not even worth
>> mentioning.
>>
>
> What, you're saying VMware Server (and player) were faster than Xen PV?
>
> I have hard time believing that.. based on my own experiences.

Yes, that is exactly what I'm saying. But the best performing
virtualization solution (VMware) still had a 40% performance penalty in
disk I/O compared to bare metal. But regardless of which one is least
slow, they are all so slow that they are only worth considering if you are
doing nothing more demanding than consolidating idle machines. The VM may
feel faster in terms of boot times and such-like (the second time around,
when all the data is cached in the host's RAM), but it's all smoke and
mirrors and doesn't stand up to scrutiny.

The only virtualization solutions that deliver on the sort of performance
claims the big vendors are making are the likes of OpenVZ and VServers,
but those are mostly just chroots, more like FreeBSD jails or Solaris
zones with a bit of network interface virtualization thrown in than proper
virtualization.

If you don't believe me, try it yourself. Do a full kernel build with the
stock RH .config file with make -j 8 on a quad core box with a 2GB RAM VM,
and then on the bare metal box limited to 2GB with the mem= kernel boot
parameter, and see how long it takes. I make it 6.5 minutes on bare metal
on my 3.2GHz Core2 vs about 9.5 minutes in a VM on the same machine
(VMware, paravirtualized Xen and KVM come reasonably close together). Each
was tested multiple times, and the results were consistent.

Gordan

From brem.belguebli at gmail.com  Thu Jan  7 11:30:50 2010
From: brem.belguebli at gmail.com (brem belguebli)
Date: Thu, 7 Jan 2010 12:30:50 +0100
Subject: [Linux-cluster] qdisk max_error_cycles setting
In-Reply-To: <29ae894c0912300738g16c2a808u6aa1afe38270cce3@mail.gmail.com>
References: <29ae894c0912300738g16c2a808u6aa1afe38270cce3@mail.gmail.com>
Message-ID: <29ae894c1001070330g195ba815o360d3057e8b163d9@mail.gmail.com>

Hi All,

Any idea about that?

Regards

2009/12/30 brem belguebli :
> Hi,
>
> It looks like the quorumd max_error_cycles parameter is not taken into
> account.
>
> Here's the test I'm doing:
>
> A 3-node cluster (RHEL 5.4) with an iSCSI qdisk LUN from a RHEL 5.4
> target server.
>
> All 3 cluster nodes have the following qdisk configuration:
>
> <quorumd log_facility="local5" log_level="7" tko="10" votes="1"
> max_error_cycles="10">
>
> When I block access from the 3 nodes to the target server (an iptables
> rule that prevents all IP flows from the 3 nodes to the target
> server), I see the quorum disk go offline, but qdisk never gets stopped
> and keeps on retrying the qdisk device despite the fact that I
> instructed it to abort after 10 cycles (max_error_cycles=10).
>
> Am I misunderstanding the max_error_cycles definition in the qdisk man
> page?
>
> Regards
>
> PS: As a consequence of not being killed after max_error_cycles, qdisk
> keeps on growing (memory usage / virtual size), and if the situation
> lasts too long the OOM killer gets involved.....
>

From swhiteho at redhat.com  Thu Jan  7 11:54:35 2010
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Thu, 07 Jan 2010 11:54:35 +0000
Subject: [Linux-cluster] lock_dlm but local flocks = true?
In-Reply-To: <4B459089.5030807@si.ct.upc.edu>
References: <4B459089.5030807@si.ct.upc.edu>
Message-ID: <1262865275.2240.4.camel@localhost>

Hi,

On Thu, 2010-01-07 at 08:43 +0100, frank wrote:
> Hi Steve, I have not answered before because I was on holidays. By the
> way, happy new year.
>
> I have looked at /proc/mounts as you told me, and ... surprise:
>
> /dev/mapper/volCluster-lvol0 /mnt/gfs gfs
> rw,hostdata=jid=0:id=196610:first=1,localflocks 0 0
>
> "localflocks" is there! I don't understand, because I mount it using
> "/etc/init.d/gfs start", which looks at /etc/fstab, and there the line is:
>
> /dev/volCluster/lvol0 /mnt/gfs gfs defaults 0 0
>
> I must admit that there is one particular thing in this system which I
> thought didn't affect this, but I am not so sure now: it is an
> OpenVZ-patched kernel. Can this have something to do with gfs mounts?
>
> Thanks for your help once more.
>
That does seem strange. You could try stracing the mount command when it's
run and that might show you the source of the localflocks flag,

Steve.

From hicheerup at gmail.com  Fri Jan  8 05:53:10 2010
From: hicheerup at gmail.com (linux-crazy)
Date: Fri, 8 Jan 2010 11:23:10 +0530
Subject: [Linux-cluster] Will a service fail cause a node to fence?
In-Reply-To: <4B44D059.6030801@pandora.com>
References: <4B44D059.6030801@pandora.com>
Message-ID: <29e045b81001072153q7ecfb70aw51d17e8b9f470ad2@mail.gmail.com>

Hi,

AFAIK rgmanager tries to restart the failed service 6 times (I am not sure
of the exact number); if it fails to restart on that node, it will then
try to relocate the service to another node.

On Wed, Jan 6, 2010 at 11:33 PM, Christopher Strider Cook wrote:
> If a cluster loses communication with a node then fencing will take place,
> but if a service fails and fails to stop/exit cleanly so another node can
> take over, will a fencing operation take place?
>
> Cluster 3, corosync 1, rgmanager 3
>
>
> Thanks,
> Chris
>
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
>

From gianluca.cecchi at gmail.com  Fri Jan  8 12:03:38 2010
From: gianluca.cecchi at gmail.com (Gianluca Cecchi)
Date: Fri, 8 Jan 2010 13:03:38 +0100
Subject: [Linux-cluster] suggestion on freeze-on-node1 and unfreeze-on-node2 approach?
Message-ID: <561c252c1001080403r2052be58qec9c05cf9d8114a5@mail.gmail.com>

Hello,
I have a cluster with an Oracle service and RHEL 5.4 nodes.
Typically one sets "shutdown abort" of the DB as the default mechanism to
close the service, to prevent stalling and to speed up switchover of the
service in case of problems.
The same approach is indeed used by the RHCS-provided script, which I'm
using.
But sometimes we have to do maintenance on the DB and use the strategy of
freezing the service, manually stopping the DB, making the modifications,
manually starting the DB and unfreezing the service.
This is useful when all the work is done on the same node carrying the
service at that moment.
Sometimes we need activities where we want to relocate the service too.
And for the DBAs it is desirable to cleanly shut down the DB when there is
a planned activity in place.
With the same approach we do something like this: node1 with active service - freeze of the service: clusvcadm -Z SRV - maintenance activities with manual stop of service components (eg listener and Oracle instance) - shutdown of node1 shutdown -h now The shutdown takes about 2 minutes it is necessary to do a shutdown, because any command I tried, gave the error that the service was frozen and that I cannot run that command... - Wait on the survival node that: 1) it becomes master for the quorum disk otherwise it looses quorum Messagges in /var/log/qdiskd.log Jan 7 17:57:55 oracs1 qdiskd[7043]: Node 2 shutdown Jan 7 17:57:55 oracs1 qdiskd[7043]: Making bid for master Jan 7 17:58:30 oracs1 qdiskd[7043]: Assuming master role it takes about 1 minute, after shutdown of the other one 2) the cluster registers that the other node has gone Messages in /var/log/qdiskd.log Jan 7 18:00:35 oracs1 openais[7014]: [TOTEM] The token was lost in the OPERATIONAL state. Jan 7 18:00:35 oracs1 openais[7014]: [TOTEM] Receive multicast socket recv buffer size (320000 bytes). Jan 7 18:00:35 oracs1 openais[7014]: [TOTEM] Transmit multicast socket send buffer size (320000 bytes). Jan 7 18:00:35 oracs1 openais[7014]: [TOTEM] entering GATHER state from 2. Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] entering GATHER state from 0. Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] Creating commit token because I am the rep. Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] Saving state aru 24 high seq received 24 Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] Storing new sequence id for ring 4da34 Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] entering COMMIT state. Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] entering RECOVERY state. Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] position [0] member 192.168.16.1: Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] previous ring seq 318000 rep 192.168.16.1 Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] aru 24 high delivered 24 received flag 1 Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] Did not need to originate any messages in recovery. Jan 7 18:00:40 oracs1 openais[7014]: [TOTEM] Sending initial ORF token Jan 7 18:00:40 oracs1 openais[7014]: [CLM ] CLM CONFIGURATION CHANGE Jan 7 18:00:40 oracs1 openais[7014]: [CLM ] New Configuration: Jan 7 18:00:40 oracs1 openais[7014]: [CLM ] r(0) ip(192.168.16.1) Jan 7 18:00:40 oracs1 openais[7014]: [CLM ] Members Left: Jan 7 18:00:40 oracs1 openais[7014]: [CLM ] r(0) ip(192.168.16.8) Jan 7 18:00:40 oracs1 openais[7014]: [CLM ] Members Joined: Jan 7 18:00:40 oracs1 openais[7014]: [CLM ] CLM CONFIGURATION CHANGE Jan 7 18:00:40 oracs1 openais[7014]: [CLM ] New Configuration: Jan 7 18:00:40 oracs1 openais[7014]: [CLM ] r(0) ip(192.168.16.1) Jan 7 18:00:41 oracs1 openais[7014]: [CLM ] Members Left: Jan 7 18:00:41 oracs1 openais[7014]: [CLM ] Members Joined: Jan 7 18:00:41 oracs1 openais[7014]: [SYNC ] This node is within the primary component and will provide service. Jan 7 18:00:41 oracs1 openais[7014]: [TOTEM] entering OPERATIONAL state. Jan 7 18:00:41 oracs1 openais[7014]: [CLM ] got nodejoin message 192.168.16.1 Jan 7 18:00:41 oracs1 openais[7014]: [CPG ] got joinlist message from node 1 It takes about 2 minutes (also due to timeouts set up because of qdisk, cman and multipath interactions needs) Total of about 5 minutes. And after this we can work on node2: - unfreeze of the service clusvcadm -U SRV This is not enough to have service start automatically. clustat gives service as "started" on the other node and remains so. 
Even if theoretically the node knows that the other one has left the cluster...... sort of bug in my opinion.... - disable of the service clusvcadm -d SRV - enable of the service clusvcadm -e SRV At this time the service suddenly starts as there is only one node alive and it is not necessary to specify the "-m " switch After a few minutes we can restart the node1 that will join the cluster again without problems: Messages in /var/log/qdiskd.log of the node2 Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] entering GATHER state from 11. Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] Creating commit token because I am the rep. Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] Saving state aru 1c high seq received 1c Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] Storing new sequence id for ring 4da38 Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] entering COMMIT state. Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] entering RECOVERY state. Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] position [0] member 192.168.16.1: Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] previous ring seq 318004 rep 192.168.16.1 Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] aru 1c high delivered 1c received flag 1 Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] position [1] member 192.168.16.8: Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] previous ring seq 318004 rep 192.168.16.8 Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] aru a high delivered a received flag 1 Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] Did not need to originate any messages in recovery. Jan 7 18:12:50 oracs1 openais[7014]: [TOTEM] Sending initial ORF token Jan 7 18:12:50 oracs1 openais[7014]: [CLM ] CLM CONFIGURATION CHANGE Jan 7 18:12:50 oracs1 openais[7014]: [CLM ] New Configuration: Jan 7 18:12:50 oracs1 openais[7014]: [CLM ] r(0) ip(192.168.16.1) Jan 7 18:12:50 oracs1 openais[7014]: [CLM ] Members Left: Jan 7 18:12:50 oracs1 openais[7014]: [CLM ] Members Joined: Jan 7 18:12:50 oracs1 openais[7014]: [CLM ] CLM CONFIGURATION CHANGE Jan 7 18:12:51 oracs1 openais[7014]: [CLM ] New Configuration: Jan 7 18:12:51 oracs1 openais[7014]: [CLM ] r(0) ip(192.168.16.1) Jan 7 18:12:51 oracs1 openais[7014]: [CLM ] r(0) ip(192.168.16.8) Jan 7 18:12:51 oracs1 openais[7014]: [CLM ] Members Left: Jan 7 18:12:51 oracs1 openais[7014]: [CLM ] Members Joined: Jan 7 18:12:51 oracs1 openais[7014]: [CLM ] r(0) ip(192.168.16.8) Jan 7 18:12:51 oracs1 openais[7014]: [SYNC ] This node is within the primary component and will provide service. Jan 7 18:12:51 oracs1 openais[7014]: [TOTEM] entering OPERATIONAL state. Jan 7 18:12:51 oracs1 openais[7014]: [CLM ] got nodejoin message 192.168.16.1 Jan 7 18:12:51 oracs1 openais[7014]: [CLM ] got nodejoin message 192.168.16.8 Jan 7 18:12:51 oracs1 openais[7014]: [CPG ] got joinlist message from node 1 Jan 7 18:13:20 oracs1 qdiskd[7043]: Node 2 is UP So the steps above let us clean switch the db with this limits: 1) it takes about 10-15 minutes to have the whole cluster up again with both nodes active 2) we have to shutdown one node and in case of clusters with more than only one service this could be a blocker at all of the approach itself. Any hints? Thanks, Gianluca From lhh at redhat.com Fri Jan 8 14:06:57 2010 From: lhh at redhat.com (Lon Hohberger) Date: Fri, 08 Jan 2010 09:06:57 -0500 Subject: [Linux-cluster] suggestion on freeze-on-node1 and unfreeze-on-node2 approach? 
In-Reply-To: <561c252c1001080403r2052be58qec9c05cf9d8114a5@mail.gmail.com>
References: <561c252c1001080403r2052be58qec9c05cf9d8114a5@mail.gmail.com>
Message-ID: <1262959617.7436.30.camel@localhost.localdomain>

On Fri, 2010-01-08 at 13:03 +0100, Gianluca Cecchi wrote:
> But sometimes we have to do maintenance on DB and use the strategy to
> freeze the service, manually stop DB, make modifications, manually
> start DB and unfreeze the service.

You could set 'recovery="relocate"', freeze the service, stop the
database cleanly, then unfreeze the service. When rgmanager does a status
check on the service, it will fail, issue a stop (which will be a no-op
for the database), then start it on the other node.

It's kind of ... odd to do it that way, but it should work.

Alternatively, you could make oracledb.sh use a clean shutdown of the
database, wait some period of time, then do a hard shutdown. In a
failover case where the node fails, the recovery time will be no
different.

-- Lon

From cluster at xinet.it  Fri Jan  8 15:00:10 2010
From: cluster at xinet.it (cluster at xinet.it)
Date: Fri, 8 Jan 2010 16:00:10 +0100
Subject: [Linux-cluster] Rename scsi devices
Message-ID: <00d301ca9073$4609da10$d21d8e30$@it>

Hi all,

does someone know how I could rename an iSCSI device from /dev/sde to
/dev/sdg? I mean, I have an iSCSI device from a storage array with address
7:0:0:12, and I see it on one host with the name /dev/sde while on the
second host I see it with the name /dev/sdg. I need to rename it for
virtualization purposes.

Can someone help me?

Thanks all,

Francesco Gallo

From gianluca.cecchi at gmail.com  Fri Jan  8 15:12:21 2010
From: gianluca.cecchi at gmail.com (Gianluca Cecchi)
Date: Fri, 8 Jan 2010 16:12:21 +0100
Subject: [Linux-cluster] suggestion on freeze-on-node1 and unfreeze-on-node2 approach?
In-Reply-To: <561c252c1001080403r2052be58qec9c05cf9d8114a5@mail.gmail.com>
References: <561c252c1001080403r2052be58qec9c05cf9d8114a5@mail.gmail.com>
Message-ID: <561c252c1001080712w18f9c941hcfc58917301c4165@mail.gmail.com>

On Fri, 08 Jan 2010 09:06:57 -0500 Lon Hohberger wrote:
> You could set 'recovery="relocate"', freeze the service, stop the
> database cleanly, then unfreeze the service.

Ah, thanks, it should work.
The only "limit" would be that any recovery action will imply relocation,
correct?
(In theory this raises some Oracle licensing problems, because in a
two-node cluster they let you pay for only one license only if the total
time the DB runs on the second node stays below a small threshold....)

Re-reading the RHEL 5.4 Cluster Administration manual puts another doubt
in my mind....
Section D.4 Failure Recovery and Independent Subtrees:
"... if any of the scripts defined in this service fail, the normal course
of action is to restart (or relocate or disable, according to the service
recovery policy) the service..."

Does this mean that if my service definition is the one below:
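(schematically -- the resource names and file paths below are placeholders
invented for the example; the point is only the __independent_subtree flag
on the nested script resource)

<service autostart="1" name="SRV" recovery="relocate">
    <script name="oracledb" file="/etc/init.d/oracledb">
        <script name="listener" file="/etc/init.d/oralistener"
                __independent_subtree="1"/>
    </script>
</service>

then a failure of the listener script alone would restart only that
subtree, without triggering the relocate recovery policy for the whole
service?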