From gianpietro.sella at unipd.it Sun May 10 09:28:25 2015
From: gianpietro.sella at unipd.it (gianpietro.sella at unipd.it)
Date: Sun, 10 May 2015 11:28:25 +0200
Subject: [Linux-cluster] nfs cluster, problem with delete file in the failover case
Message-ID: <8c343e26ef810ff47bdc10ae3ece85ce.squirrel@webmail.unipd.it>

Hi, sorry for my bad English.
I am testing an active/passive NFS cluster (2 nodes).
I followed these instructions for NFS:

https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Administration/s1-resourcegroupcreatenfs-HAAA.html

The nodes run CentOS 7.1.
The two cluster nodes share the same iSCSI volume.
The NFS cluster works very well; I have only one problem.
I mount the folder exported by the NFS cluster on my client node (NFSv3 protocol).
I write a big data file (70GB) into the NFS folder:
dd if=/dev/zero bs=1M count=70000 > /Instances/output.dat
Before the write finishes I put the active node into standby, and the resources migrate to the other node.
When the dd write finishes, the file is fine.
I then delete the file output.dat.
Now output.dat is no longer present in the NFS folder; it has been correctly removed.
But the space on the NFS volume is not freed: if I run df on the client (and on the new active node) I still see 70GB of used space on the exported volume.
Now, if I put the new active node into standby (migrating the resources back to the first node, where the write started) so that the other node becomes the active one again, the space of the deleted output.dat file is finally freed.
It is very strange.

From bfields at fieldses.org Mon May 11 21:20:15 2015
From: bfields at fieldses.org (J. Bruce Fields)
Date: Mon, 11 May 2015 17:20:15 -0400
Subject: [Linux-cluster] nfs cluster, problem with delete file in the failover case
In-Reply-To: <8c343e26ef810ff47bdc10ae3ece85ce.squirrel@webmail.unipd.it>
References: <8c343e26ef810ff47bdc10ae3ece85ce.squirrel@webmail.unipd.it>
Message-ID: <20150511212015.GB23754@fieldses.org>

On Sun, May 10, 2015 at 11:28:25AM +0200, gianpietro.sella at unipd.it wrote:
> Hi, sorry for my bad english.
> I testing nfs cluster active/passsive (2 nodes).
> I use the next instruction for nfs:
>
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Administration/s1-resourcegroupcreatenfs-HAAA.html
>
> I use centos 7.1 on the nodes.
> The 2 node of the cluster share the same iscsi volume.
> The nfs cluster is very good.
> I have only one problem.
> I mount the nfs cluster exported folder on my client node (nfsv3 protocol).
> I write on the nfs folder an big data file (70GB):
> dd if=/dev/zero bs=1M count=70000 > /Instances/output.dat
> Before write is finished I put the active node in standby status.
> then the resource migrate in the other node.
> when the dd write finish the file is ok.
> I delete the file output.dat.

So, the dd and the later rm are both run on the client, and the rm after
the dd has completed and exited?  And the rm doesn't happen till after
the first migration is completely finished?  What version of NFS are you
using?

It sounds like a sillyrename problem, but I don't see the explanation.

--b.

> Now the file output.dat is not present in the nfs folder, it is correctly
> erased.
> but the space in the nfs volume is not free.
> If I execute an df command on the client (and on the new active node) I
> see 70GB on used space in the exported volume disk.
> Now if I put the new active node in standby status (migrate the resource > in the first node where start writing file), and the other node is now the > active node, the space of the deleted output.dat file is now free. > It is very strange. > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From gianpietro.sella at unipd.it Mon May 11 22:37:10 2015 From: gianpietro.sella at unipd.it (gianpietro.sella at unipd.it) Date: Tue, 12 May 2015 00:37:10 +0200 Subject: [Linux-cluster] nfs cluster, problem with delete file in the failover case In-Reply-To: <20150511212015.GB23754@fieldses.org> References: <8c343e26ef810ff47bdc10ae3ece85ce.squirrel@webmail.unipd.it> <20150511212015.GB23754@fieldses.org> Message-ID: > On Sun, May 10, 2015 at 11:28:25AM +0200, gianpietro.sella at unipd.it wrote: >> Hi, sorry for my bad english. >> I testing nfs cluster active/passsive (2 nodes). >> I use the next instruction for nfs: >> >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Administration/s1-resourcegroupcreatenfs-HAAA.html >> >> I use centos 7.1 on the nodes. >> The 2 node of the cluster share the same iscsi volume. >> The nfs cluster is very good. >> I have only one problem. >> I mount the nfs cluster exported folder on my client node (nfsv3 >> protocol). >> I write on the nfs folder an big data file (70GB): >> dd if=/dev/zero bs=1M count=70000 > /Instances/output.dat >> Before write is finished I put the active node in standby status. >> then the resource migrate in the other node. >> when the dd write finish the file is ok. >> I delete the file output.dat. > > So, the dd and the later rm are both run on the client, and the rm after > the dd has completed and exited? And the rm doesn't happen till after > the first migration is completely finished? What version of NFS are you > using? > > It sounds like a sillyrename problem, but I don't see the explanation. > > --b. Hi Bruce, thank for your answer. yes the dd command and the rm command (all on the client node) finish without error. I use nfsv3, but is the same with nfsv4 protocol. the s.o. is centos 7.1, the nfs package is nfs-utils-1.3.0-0.8.el7.x86_64. the pacemaker configuration is: pcs resource create nfsclusterlv LVM volgrpname=nfsclustervg exclusive=true --group nfsclusterha pcs resource create nfsclusterdata Filesystem device="/dev/nfsclustervg/nfsclusterlv" directory="/nfscluster" fstype="ext4" --group nfsclusterha pcs resource create nfsclusterserver nfsserver nfs_shared_infodir=/nfscluster/nfsinfo nfs_no_notify=true --group nfsclusterha pcs resource create nfsclusterroot exportfs clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfscluster/exports fsid=0 --group nfsclusterha pcs resource create nfsclusternova exportfs clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfscluster/exports/nova fsid=1 -- group nfsclusterha pcs resource create nfsclusterglance exportfs clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfscluster/exports/glance fsid= 2 --group nfsclusterha pcs resource create nfsclustervip IPaddr2 ip=192.168.61.180 cidr_netmask=24 --group nfsclusterha pcs resource create nfsclusternotify nfsnotify source_host=192.168.61.180 --group nfsclusterha now I have done the next test. nfs cluster with 2 node. the first node in standby state. the second node in active state. 
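(For reference, a node is moved in and out of standby with commands roughly like
the following; the node name nfsnode1 is only an example, and I am assuming the
pcs cluster standby/unstandby subcommands of the pcs version shipped with
CentOS 7.1:)

# put the currently active node into standby so the nfsclusterha group migrates away
pcs cluster standby nfsnode1
# later, bring the node back so it can host the group again
pcs cluster unstandby nfsnode1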
I mount the empty (not used space) exported volume in the client with nfsv3 protocol (with nfs4 protocol is the same). I write on the client an big file (70GB) in the mount directory with dd (but is the same with cp command). while the command write the file I disable nfsnotify, Iaddr2, exportfs and nfsserver resource in this order (pcs resource disable ...) and next I enable the resource (pcs resource enable ...) in the reverse order. when disable resource writing freeze, when enable resource writing restart without error. when the writing command is finished I delete the file. the mount directory is empty and the used space of exported volume is 0, this is ok. now i repead the test. but now I disable/enable even the Filesystem resource: disable nfsnotify, Iaddr2, exportfs, nfsserver and Filesystem resource (writing freeze) then enable in the reverse order (writing restart without error). when writing command is finished I delete the file. now the mounted directory is empty (not file) but the used space is not 0 but is 70GB. this is not ok. now I execute the next command on the active node of the cluster where the volume is exported with nfs: mount -o remount /dev/nfsclustervg/nfsclusterlv where /dev/nfsclustervg/nfsclusterlv is the exported volume (iscsi volume configured with lvm). after this command the used space in the mounted directory of the client is 0, this is ok. I think that the problem is the Filesystem resource on the active node of the cluster. but is very strange. > >> Now the file output.dat is not present in the nfs folder, it is >> correctly >> erased. >> but the space in the nfs volume is not free. >> If I execute an df command on the client (and on the new active node) I >> see 70GB on used space in the exported volume disk. >> Now if I put the new active node in standby status (migrate the resource >> in the first node where start writing file), and the other node is now >> the >> active node, the space of the deleted output.dat file is now free. >> It is very strange. >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > From bfields at fieldses.org Tue May 12 15:25:17 2015 From: bfields at fieldses.org (J. Bruce Fields) Date: Tue, 12 May 2015 11:25:17 -0400 Subject: [Linux-cluster] nfs cluster, problem with delete file in the failover case In-Reply-To: References: <8c343e26ef810ff47bdc10ae3ece85ce.squirrel@webmail.unipd.it> <20150511212015.GB23754@fieldses.org> Message-ID: <20150512152517.GB6370@fieldses.org> On Tue, May 12, 2015 at 12:37:10AM +0200, gianpietro.sella at unipd.it wrote: > > On Sun, May 10, 2015 at 11:28:25AM +0200, gianpietro.sella at unipd.it wrote: > >> Hi, sorry for my bad english. > >> I testing nfs cluster active/passsive (2 nodes). > >> I use the next instruction for nfs: > >> > >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/html/High_Availability_Add-On_Administration/s1-resourcegroupcreatenfs-HAAA.html > >> > >> I use centos 7.1 on the nodes. > >> The 2 node of the cluster share the same iscsi volume. > >> The nfs cluster is very good. > >> I have only one problem. > >> I mount the nfs cluster exported folder on my client node (nfsv3 > >> protocol). 
> >> I write on the nfs folder an big data file (70GB): > >> dd if=/dev/zero bs=1M count=70000 > /Instances/output.dat > >> Before write is finished I put the active node in standby status. > >> then the resource migrate in the other node. > >> when the dd write finish the file is ok. > >> I delete the file output.dat. > > > > So, the dd and the later rm are both run on the client, and the rm after > > the dd has completed and exited? And the rm doesn't happen till after > > the first migration is completely finished? What version of NFS are you > > using? > > > > It sounds like a sillyrename problem, but I don't see the explanation. > > > > --b. > > > Hi Bruce, thank for your answer. > yes the dd command and the rm command (all on the client node) finish > without error. > I use nfsv3, but is the same with nfsv4 protocol. > the s.o. is centos 7.1, the nfs package is nfs-utils-1.3.0-0.8.el7.x86_64. > the pacemaker configuration is: > > pcs resource create nfsclusterlv LVM volgrpname=nfsclustervg > exclusive=true --group nfsclusterha > > pcs resource create nfsclusterdata Filesystem > device="/dev/nfsclustervg/nfsclusterlv" directory="/nfscluster" > fstype="ext4" --group nfsclusterha > > pcs resource create nfsclusterserver nfsserver > nfs_shared_infodir=/nfscluster/nfsinfo nfs_no_notify=true --group > nfsclusterha > > pcs resource create nfsclusterroot exportfs > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > directory=/nfscluster/exports fsid=0 --group > nfsclusterha > > pcs resource create nfsclusternova exportfs > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > directory=/nfscluster/exports/nova fsid=1 -- > group nfsclusterha > > pcs resource create nfsclusterglance exportfs > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > directory=/nfscluster/exports/glance fsid= > 2 --group nfsclusterha > > pcs resource create nfsclustervip IPaddr2 ip=192.168.61.180 cidr_netmask=24 > --group nfsclusterha > > pcs resource create nfsclusternotify nfsnotify source_host=192.168.61.180 > --group nfsclusterha > > now I have done the next test. > nfs cluster with 2 node. > the first node in standby state. > the second node in active state. > I mount the empty (not used space) exported volume in the client with nfsv3 > protocol (with nfs4 protocol is the same). > I write on the client an big file (70GB) in the mount directory with dd (but > is the same with cp command). > while the command write the file I disable nfsnotify, Iaddr2, exportfs and > nfsserver resource in this order (pcs resource disable ...) and next I > enable the resource (pcs resource enable ...) in the reverse order. > when disable resource writing freeze, when enable resource writing restart > without error. > when the writing command is finished I delete the file. > the mount directory is empty and the used space of exported volume is 0, > this is ok. > now i repead the test. > but now I disable/enable even the Filesystem resource: > disable nfsnotify, Iaddr2, exportfs, nfsserver and Filesystem resource > (writing freeze) then enable in the reverse order (writing restart without > error). > when writing command is finished I delete the file. > now the mounted directory is empty (not file) but the used space is not 0 > but is 70GB. > this is not ok. 
> now I execute the next command on the active node of the cluster where the > volume is exported with nfs: > mount -o remount /dev/nfsclustervg/nfsclusterlv > where /dev/nfsclustervg/nfsclusterlv is the exported volume (iscsi volume > configured with lvm). > after this command the used space in the mounted directory of the client is > 0, this is ok. > I think that the problem is the Filesystem resource on the active node of > the cluster. > but is very strange. So, the only difference between the "good" and "bad" cases was the addition of the stop/start of the filesystem resource? I assume that's equivalent to an umount/mount. I guess the server's dentry for that file is hanging around for a little while for some reason. We've run across at least one problem of that sort before (see d891eedbc3b1 "fs/dcache: allow d_obtain_alias() to return unhashed dentries"). In both cases after the restart the first operation the server will get for that file is a write with a filehandle, and it will have to look up that filehandle to find the file. (Whereas without the restart the initial discovery of the file will be a lookup by name.) In the "good" case the server already has a dentry cached for that file, in the "bad" case the umount/mount means that we'll be doing a cold-cache lookup of that filehandle. I wonder if the test case can be narrowed down any further.... Is the large file necessary? If it's needed only to ensure the writes are actually sent to the server promptly then it might be enough to do the nfs mount with -osync. Instead of the cluster migration or restart, it might be possible to reproduce the bug just with a echo 2 >/proc/sys/vm/drop_caches run on the server side while the dd is in progress--I don't know if that will reliably drop the one dentry, though. Maybe do a few of those in a row. --b. From gianpietro.sella at unipd.it Wed May 13 09:51:13 2015 From: gianpietro.sella at unipd.it (sella gianpietro) Date: Wed, 13 May 2015 11:51:13 +0200 Subject: [Linux-cluster] R: nfs cluster, problem with delete file in the failover case In-Reply-To: <20150512152517.GB6370@fieldses.org> Message-ID: <20150513100053.25D711F3C@mydoom.unipd.it> J. Bruce Fields fieldses.org> writes: > > On Tue, May 12, 2015 at 12:37:10AM +0200, gianpietro.sella unipd.it wrote: > > > On Sun, May 10, 2015 at 11:28:25AM +0200, gianpietro.sella unipd.it wrote: > > >> Hi, sorry for my bad english. > > >> I testing nfs cluster active/passsive (2 nodes). > > >> I use the next instruction for nfs: > > >> > > >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/htm l/High_Availability_Add-On_Administration/s1-resourcegroupcreatenfs-HAAA.htm l > > >> > > >> I use centos 7.1 on the nodes. > > >> The 2 node of the cluster share the same iscsi volume. > > >> The nfs cluster is very good. > > >> I have only one problem. > > >> I mount the nfs cluster exported folder on my client node (nfsv3 > > >> protocol). > > >> I write on the nfs folder an big data file (70GB): > > >> dd if=/dev/zero bs=1M count=70000 > /Instances/output.dat > > >> Before write is finished I put the active node in standby status. > > >> then the resource migrate in the other node. > > >> when the dd write finish the file is ok. > > >> I delete the file output.dat. > > > > > > So, the dd and the later rm are both run on the client, and the rm after > > > the dd has completed and exited? And the rm doesn't happen till after > > > the first migration is completely finished? What version of NFS are you > > > using? 
> > > > > > It sounds like a sillyrename problem, but I don't see the explanation. > > > > > > --b. > > > > > > Hi Bruce, thank for your answer. > > yes the dd command and the rm command (all on the client node) finish > > without error. > > I use nfsv3, but is the same with nfsv4 protocol. > > the s.o. is centos 7.1, the nfs package is nfs-utils-1.3.0-0.8.el7.x86_64. > > the pacemaker configuration is: > > > > pcs resource create nfsclusterlv LVM volgrpname=nfsclustervg > > exclusive=true --group nfsclusterha > > > > pcs resource create nfsclusterdata Filesystem > > device="/dev/nfsclustervg/nfsclusterlv" directory="/nfscluster" > > fstype="ext4" --group nfsclusterha > > > > pcs resource create nfsclusterserver nfsserver > > nfs_shared_infodir=/nfscluster/nfsinfo nfs_no_notify=true --group > > nfsclusterha > > > > pcs resource create nfsclusterroot exportfs > > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > > directory=/nfscluster/exports fsid=0 --group > > nfsclusterha > > > > pcs resource create nfsclusternova exportfs > > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > > directory=/nfscluster/exports/nova fsid=1 -- > > group nfsclusterha > > > > pcs resource create nfsclusterglance exportfs > > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > > directory=/nfscluster/exports/glance fsid= > > 2 --group nfsclusterha > > > > pcs resource create nfsclustervip IPaddr2 ip=192.168.61.180 cidr_netmask=24 > > --group nfsclusterha > > > > pcs resource create nfsclusternotify nfsnotify source_host=192.168.61.180 > > --group nfsclusterha > > > > now I have done the next test. > > nfs cluster with 2 node. > > the first node in standby state. > > the second node in active state. > > I mount the empty (not used space) exported volume in the client with nfsv3 > > protocol (with nfs4 protocol is the same). > > I write on the client an big file (70GB) in the mount directory with dd (but > > is the same with cp command). > > while the command write the file I disable nfsnotify, Iaddr2, exportfs and > > nfsserver resource in this order (pcs resource disable ...) and next I > > enable the resource (pcs resource enable ...) in the reverse order. > > when disable resource writing freeze, when enable resource writing restart > > without error. > > when the writing command is finished I delete the file. > > the mount directory is empty and the used space of exported volume is 0, > > this is ok. > > now i repead the test. > > but now I disable/enable even the Filesystem resource: > > disable nfsnotify, Iaddr2, exportfs, nfsserver and Filesystem resource > > (writing freeze) then enable in the reverse order (writing restart without > > error). > > when writing command is finished I delete the file. > > now the mounted directory is empty (not file) but the used space is not 0 > > but is 70GB. > > this is not ok. > > now I execute the next command on the active node of the cluster where the > > volume is exported with nfs: > > mount -o remount /dev/nfsclustervg/nfsclusterlv > > where /dev/nfsclustervg/nfsclusterlv is the exported volume (iscsi volume > > configured with lvm). > > after this command the used space in the mounted directory of the client is > > 0, this is ok. > > I think that the problem is the Filesystem resource on the active node of > > the cluster. > > but is very strange. > > So, the only difference between the "good" and "bad" cases was the > addition of the stop/start of the filesystem resource? 
I assume that's > equivalent to an umount/mount. yes is correct > > I guess the server's dentry for that file is hanging around for a little > while for some reason. We've run across at least one problem of that > sort before (see d891eedbc3b1 "fs/dcache: allow d_obtain_alias() to > return unhashed dentries"). > > In both cases after the restart the first operation the server will get > for that file is a write with a filehandle, and it will have to look up > that filehandle to find the file. (Whereas without the restart the > initial discovery of the file will be a lookup by name.) > > In the "good" case the server already has a dentry cached for that file, > in the "bad" case the umount/mount means that we'll be doing a > cold-cache lookup of that filehandle. > > I wonder if the test case can be narrowed down any further.... Is the > large file necessary? If it's needed only to ensure the writes are > actually sent to the server promptly then it might be enough to do the > nfs mount with -osync. I use sync options, is the same problem > > Instead of the cluster migration or restart, it might be possible to > reproduce the bug just with a > > echo 2 >/proc/sys/vm/drop_caches > > run on the server side while the dd is in progress--I don't know if that > will reliably drop the one dentry, though. Maybe do a few of those in a > row. no, with echo command is not possible reproduce the problem > > --b. > From gianpietro.sella at unipd.it Wed May 13 11:06:17 2015 From: gianpietro.sella at unipd.it (sella gianpietro) Date: Wed, 13 May 2015 13:06:17 +0200 Subject: [Linux-cluster] R: nfs cluster, problem with delete file in the failover case In-Reply-To: <20150512152517.GB6370@fieldses.org> Message-ID: <20150513111531.3CFB11F31@mydoom.unipd.it> this is the inodes number in the exported folder of the volume in the server before write file in the client: [root at cld-blu-13 nova]# du --inodes 2 . this is the used block: [root at cld-blu-13 nova]# df -T Filesystem Type 1K-blocks Used Available Use% Mounted on /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 33000 1152845588 1% /nfscluster after write file in the client with umount/mount during writing: [root at cld-blu-13 nova]# du --inodes 3 . [root at cld-blu-13 nova]# df -T Filesystem Type 1K-blocks Used Available Use% Mounted on /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 1131874068 2% /nfscluster thi is correct. now delete file: [root at cld-blu-13 nova]# du --inodes 2 . the number of the inodes is correct (from 3 to 2). [root at cld-blu-13 nova]# df -T Filesystem Type 1K-blocks Used Available Use% Mounted on /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 1131874068 2% /nfscluster the number of used block is not correct. Do not return to initial value 33000 -----Messaggio originale----- Da: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] Per conto di J. Bruce Fields Inviato: marted? 12 maggio 2015 17.25 A: linux clustering Oggetto: Re: [Linux-cluster] nfs cluster, problem with delete file in the failover case On Tue, May 12, 2015 at 12:37:10AM +0200, gianpietro.sella at unipd.it wrote: > > On Sun, May 10, 2015 at 11:28:25AM +0200, gianpietro.sella at unipd.it wrote: > >> Hi, sorry for my bad english. > >> I testing nfs cluster active/passsive (2 nodes). 
> >> I use the next instruction for nfs: > >> > >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/htm l/High_Availability_Add-On_Administration/s1-resourcegroupcreatenfs-HAAA.htm l > >> > >> I use centos 7.1 on the nodes. > >> The 2 node of the cluster share the same iscsi volume. > >> The nfs cluster is very good. > >> I have only one problem. > >> I mount the nfs cluster exported folder on my client node (nfsv3 > >> protocol). > >> I write on the nfs folder an big data file (70GB): > >> dd if=/dev/zero bs=1M count=70000 > /Instances/output.dat > >> Before write is finished I put the active node in standby status. > >> then the resource migrate in the other node. > >> when the dd write finish the file is ok. > >> I delete the file output.dat. > > > > So, the dd and the later rm are both run on the client, and the rm after > > the dd has completed and exited? And the rm doesn't happen till after > > the first migration is completely finished? What version of NFS are you > > using? > > > > It sounds like a sillyrename problem, but I don't see the explanation. > > > > --b. > > > Hi Bruce, thank for your answer. > yes the dd command and the rm command (all on the client node) finish > without error. > I use nfsv3, but is the same with nfsv4 protocol. > the s.o. is centos 7.1, the nfs package is nfs-utils-1.3.0-0.8.el7.x86_64. > the pacemaker configuration is: > > pcs resource create nfsclusterlv LVM volgrpname=nfsclustervg > exclusive=true --group nfsclusterha > > pcs resource create nfsclusterdata Filesystem > device="/dev/nfsclustervg/nfsclusterlv" directory="/nfscluster" > fstype="ext4" --group nfsclusterha > > pcs resource create nfsclusterserver nfsserver > nfs_shared_infodir=/nfscluster/nfsinfo nfs_no_notify=true --group > nfsclusterha > > pcs resource create nfsclusterroot exportfs > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > directory=/nfscluster/exports fsid=0 --group > nfsclusterha > > pcs resource create nfsclusternova exportfs > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > directory=/nfscluster/exports/nova fsid=1 -- > group nfsclusterha > > pcs resource create nfsclusterglance exportfs > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > directory=/nfscluster/exports/glance fsid= > 2 --group nfsclusterha > > pcs resource create nfsclustervip IPaddr2 ip=192.168.61.180 cidr_netmask=24 > --group nfsclusterha > > pcs resource create nfsclusternotify nfsnotify source_host=192.168.61.180 > --group nfsclusterha > > now I have done the next test. > nfs cluster with 2 node. > the first node in standby state. > the second node in active state. > I mount the empty (not used space) exported volume in the client with nfsv3 > protocol (with nfs4 protocol is the same). > I write on the client an big file (70GB) in the mount directory with dd (but > is the same with cp command). > while the command write the file I disable nfsnotify, Iaddr2, exportfs and > nfsserver resource in this order (pcs resource disable ...) and next I > enable the resource (pcs resource enable ...) in the reverse order. > when disable resource writing freeze, when enable resource writing restart > without error. > when the writing command is finished I delete the file. > the mount directory is empty and the used space of exported volume is 0, > this is ok. > now i repead the test. 
> but now I disable/enable even the Filesystem resource:
> disable nfsnotify, Iaddr2, exportfs, nfsserver and Filesystem resource
> (writing freeze) then enable in the reverse order (writing restart without
> error).
> when writing command is finished I delete the file.
> now the mounted directory is empty (not file) but the used space is not 0
> but is 70GB.
> this is not ok.
> now I execute the next command on the active node of the cluster where the
> volume is exported with nfs:
> mount -o remount /dev/nfsclustervg/nfsclusterlv
> where /dev/nfsclustervg/nfsclusterlv is the exported volume (iscsi volume
> configured with lvm).
> after this command the used space in the mounted directory of the client is
> 0, this is ok.
> I think that the problem is the Filesystem resource on the active node of
> the cluster.
> but is very strange.

So, the only difference between the "good" and "bad" cases was the
addition of the stop/start of the filesystem resource?  I assume that's
equivalent to an umount/mount.

I guess the server's dentry for that file is hanging around for a little
while for some reason.  We've run across at least one problem of that
sort before (see d891eedbc3b1 "fs/dcache: allow d_obtain_alias() to
return unhashed dentries").

In both cases after the restart the first operation the server will get
for that file is a write with a filehandle, and it will have to look up
that filehandle to find the file.  (Whereas without the restart the
initial discovery of the file will be a lookup by name.)

In the "good" case the server already has a dentry cached for that file,
in the "bad" case the umount/mount means that we'll be doing a
cold-cache lookup of that filehandle.

I wonder if the test case can be narrowed down any further....  Is the
large file necessary?  If it's needed only to ensure the writes are
actually sent to the server promptly then it might be enough to do the
nfs mount with -osync.

Instead of the cluster migration or restart, it might be possible to
reproduce the bug just with a

echo 2 >/proc/sys/vm/drop_caches

run on the server side while the dd is in progress--I don't know if that
will reliably drop the one dentry, though.  Maybe do a few of those in a
row.

--b.

--
Linux-cluster mailing list
Linux-cluster at redhat.com
https://www.redhat.com/mailman/listinfo/linux-cluster

From vinh.cao at hp.com Wed May 13 11:38:51 2015
From: vinh.cao at hp.com (Cao, Vinh)
Date: Wed, 13 May 2015 11:38:51 +0000
Subject: [Linux-cluster] R: nfs cluster, problem with delete file in the failover case
In-Reply-To: <20150513111531.3CFB11F31@mydoom.unipd.it>
References: <20150512152517.GB6370@fieldses.org> <20150513111531.3CFB11F31@mydoom.unipd.it>
Message-ID:

It sounds like the process that created the file while you were moving it to
the other node still has it open; in other words, you are deleting the file
and doing the failover at the same time. This has nothing to do with your
cluster setup.

I believe you can run lsof on the system where the disk space has not been
reclaimed and grep for the "deleted" marker, for example with the commands
below. You should see the process that still holds the file; kill that
process and the stale open file handle will be cleaned up.

That is how I see your problem. I don't think it has anything to do with the
OS cluster.
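Something along these lines, for example (the mount point /nfscluster is taken
from the cluster configuration earlier in the thread; adjust it to your setup):

# on the active NFS server: giving lsof a mount point lists every open file on
# that filesystem, including unlinked ones, which are marked as deleted
lsof -nP /nfscluster | grep -i deleted

# or scan the whole system for open-but-deleted files
lsof -nP | grep '(deleted)'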
Vinh -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of sella gianpietro Sent: Wednesday, May 13, 2015 7:06 AM To: 'linux clustering' Subject: [Linux-cluster] R: nfs cluster, problem with delete file in the failover case this is the inodes number in the exported folder of the volume in the server before write file in the client: [root at cld-blu-13 nova]# du --inodes 2 . this is the used block: [root at cld-blu-13 nova]# df -T Filesystem Type 1K-blocks Used Available Use% Mounted on /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 33000 1152845588 1% /nfscluster after write file in the client with umount/mount during writing: [root at cld-blu-13 nova]# du --inodes 3 . [root at cld-blu-13 nova]# df -T Filesystem Type 1K-blocks Used Available Use% Mounted on /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 1131874068 2% /nfscluster thi is correct. now delete file: [root at cld-blu-13 nova]# du --inodes 2 . the number of the inodes is correct (from 3 to 2). [root at cld-blu-13 nova]# df -T Filesystem Type 1K-blocks Used Available Use% Mounted on /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 1131874068 2% /nfscluster the number of used block is not correct. Do not return to initial value 33000 -----Messaggio originale----- Da: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] Per conto di J. Bruce Fields Inviato: marted? 12 maggio 2015 17.25 A: linux clustering Oggetto: Re: [Linux-cluster] nfs cluster, problem with delete file in the failover case On Tue, May 12, 2015 at 12:37:10AM +0200, gianpietro.sella at unipd.it wrote: > > On Sun, May 10, 2015 at 11:28:25AM +0200, gianpietro.sella at unipd.it wrote: > >> Hi, sorry for my bad english. > >> I testing nfs cluster active/passsive (2 nodes). > >> I use the next instruction for nfs: > >> > >> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/7/htm l/High_Availability_Add-On_Administration/s1-resourcegroupcreatenfs-HAAA.htm l > >> > >> I use centos 7.1 on the nodes. > >> The 2 node of the cluster share the same iscsi volume. > >> The nfs cluster is very good. > >> I have only one problem. > >> I mount the nfs cluster exported folder on my client node (nfsv3 > >> protocol). > >> I write on the nfs folder an big data file (70GB): > >> dd if=/dev/zero bs=1M count=70000 > /Instances/output.dat Before > >> write is finished I put the active node in standby status. > >> then the resource migrate in the other node. > >> when the dd write finish the file is ok. > >> I delete the file output.dat. > > > > So, the dd and the later rm are both run on the client, and the rm > > after the dd has completed and exited? And the rm doesn't happen > > till after the first migration is completely finished? What version > > of NFS are you using? > > > > It sounds like a sillyrename problem, but I don't see the explanation. > > > > --b. > > > Hi Bruce, thank for your answer. > yes the dd command and the rm command (all on the client node) finish > without error. > I use nfsv3, but is the same with nfsv4 protocol. > the s.o. is centos 7.1, the nfs package is nfs-utils-1.3.0-0.8.el7.x86_64. 
> the pacemaker configuration is: > > pcs resource create nfsclusterlv LVM volgrpname=nfsclustervg > exclusive=true --group nfsclusterha > > pcs resource create nfsclusterdata Filesystem > device="/dev/nfsclustervg/nfsclusterlv" directory="/nfscluster" > fstype="ext4" --group nfsclusterha > > pcs resource create nfsclusterserver nfsserver > nfs_shared_infodir=/nfscluster/nfsinfo nfs_no_notify=true --group > nfsclusterha > > pcs resource create nfsclusterroot exportfs > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > directory=/nfscluster/exports fsid=0 --group nfsclusterha > > pcs resource create nfsclusternova exportfs > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > directory=/nfscluster/exports/nova fsid=1 -- group nfsclusterha > > pcs resource create nfsclusterglance exportfs > clientspec=192.168.61.0/255.255.255.0 options=rw,sync,no_root_squash > directory=/nfscluster/exports/glance fsid= > 2 --group nfsclusterha > > pcs resource create nfsclustervip IPaddr2 ip=192.168.61.180 cidr_netmask=24 > --group nfsclusterha > > pcs resource create nfsclusternotify nfsnotify > source_host=192.168.61.180 --group nfsclusterha > > now I have done the next test. > nfs cluster with 2 node. > the first node in standby state. > the second node in active state. > I mount the empty (not used space) exported volume in the client with nfsv3 > protocol (with nfs4 protocol is the same). > I write on the client an big file (70GB) in the mount directory with > dd (but > is the same with cp command). > while the command write the file I disable nfsnotify, Iaddr2, exportfs > and nfsserver resource in this order (pcs resource disable ...) and > next I enable the resource (pcs resource enable ...) in the reverse order. > when disable resource writing freeze, when enable resource writing > restart without error. > when the writing command is finished I delete the file. > the mount directory is empty and the used space of exported volume is > 0, this is ok. > now i repead the test. > but now I disable/enable even the Filesystem resource: > disable nfsnotify, Iaddr2, exportfs, nfsserver and Filesystem resource > (writing freeze) then enable in the reverse order (writing restart > without error). > when writing command is finished I delete the file. > now the mounted directory is empty (not file) but the used space is > not 0 but is 70GB. > this is not ok. > now I execute the next command on the active node of the cluster where > the volume is exported with nfs: > mount -o remount /dev/nfsclustervg/nfsclusterlv where > /dev/nfsclustervg/nfsclusterlv is the exported volume (iscsi volume > configured with lvm). > after this command the used space in the mounted directory of the > client is > 0, this is ok. > I think that the problem is the Filesystem resource on the active node > of the cluster. > but is very strange. So, the only difference between the "good" and "bad" cases was the addition of the stop/start of the filesystem resource? I assume that's equivalent to an umount/mount. I guess the server's dentry for that file is hanging around for a little while for some reason. We've run across at least one problem of that sort before (see d891eedbc3b1 "fs/dcache: allow d_obtain_alias() to return unhashed dentries"). In both cases after the restart the first operation the server will get for that file is a write with a filehandle, and it will have to look up that filehandle to find the file. 
(Whereas without the restart the initial discovery of the file will be a lookup by name.) In the "good" case the server already has a dentry cached for that file, in the "bad" case the umount/mount means that we'll be doing a cold-cache lookup of that filehandle. I wonder if the test case can be narrowed down any further.... Is the large file necessary? If it's needed only to ensure the writes are actually sent to the server promptly then it might be enough to do the nfs mount with -osync. Instead of the cluster migration or restart, it might be possible to reproduce the bug just with a echo 2 >/proc/sys/vm/drop_caches run on the server side while the dd is in progress--I don't know if that will reliably drop the one dentry, though. Maybe do a few of those in a row. --b. -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From bfields at fieldses.org Wed May 13 15:46:52 2015 From: bfields at fieldses.org (J. Bruce Fields) Date: Wed, 13 May 2015 11:46:52 -0400 Subject: [Linux-cluster] R: nfs cluster, problem with delete file in the failover case In-Reply-To: References: <20150512152517.GB6370@fieldses.org> <20150513111531.3CFB11F31@mydoom.unipd.it> Message-ID: <20150513154652.GB2070@fieldses.org> On Wed, May 13, 2015 at 11:38:51AM +0000, Cao, Vinh wrote: > Sounds like the process that has the file create while you are moving > it to another node still open. If I understand correctly, the filesystem is still unmountable. If a process held a file on the filesystem open, an unmount attempt would return -EBUSY. --b. > Meaning you are deleting the file and > doing failover at the same time. This has not things to do with your > cluster setup. > > I believed , you can run lsof command on the system that you're seeing > the disk size is still not clean up. Then grep for deteled arg. You > may see the process number that is still there. Then kill that process > and it will clean up the file handle process that is still open. > > That is how I see in your problem. I don't think it has any things to > do with OS cluster. From bfields at fieldses.org Wed May 13 15:45:19 2015 From: bfields at fieldses.org (J. Bruce Fields) Date: Wed, 13 May 2015 11:45:19 -0400 Subject: [Linux-cluster] R: nfs cluster, problem with delete file in the failover case In-Reply-To: <20150513111531.3CFB11F31@mydoom.unipd.it> References: <20150512152517.GB6370@fieldses.org> <20150513111531.3CFB11F31@mydoom.unipd.it> Message-ID: <20150513154519.GA2070@fieldses.org> On Wed, May 13, 2015 at 01:06:17PM +0200, sella gianpietro wrote: > this is the inodes number in the exported folder of the volume > in the server before write file in the client: > > [root at cld-blu-13 nova]# du --inodes > 2 . > > this is the used block: > > [root at cld-blu-13 nova]# df -T > Filesystem Type 1K-blocks Used Available > Use% Mounted on > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 33000 1152845588 > 1% /nfscluster > > after write file in the client with umount/mount during writing: > > [root at cld-blu-13 nova]# du --inodes > 3 . > > [root at cld-blu-13 nova]# df -T > Filesystem Type 1K-blocks Used > Available Use% Mounted on > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 > 1131874068 2% /nfscluster > > thi is correct. > now delete file: > > [root at cld-blu-13 nova]# du --inodes > 2 . > > the number of the inodes is correct (from 3 to 2). 
> > [root at cld-blu-13 nova]# df -T > Filesystem Type 1K-blocks Used > Available Use% Mounted on > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 > 1131874068 2% /nfscluster > > the number of used block is not correct. > Do not return to initial value 33000 If you try "df -i", you'll probably also find that it gives the "wrong" result. (So, probably 3 inodes, though "du --inodes" is still only finding 2). --b. From gianpietro.sella at unipd.it Wed May 13 19:38:03 2015 From: gianpietro.sella at unipd.it (gianpietro sella) Date: Wed, 13 May 2015 19:38:03 +0000 (UTC) Subject: [Linux-cluster] R: nfs cluster, problem with delete file in the failover case References: <20150512152517.GB6370@fieldses.org> <20150513111531.3CFB11F31@mydoom.unipd.it> <20150513154519.GA2070@fieldses.org> Message-ID: J. Bruce Fields fieldses.org> writes: > > On Wed, May 13, 2015 at 01:06:17PM +0200, sella gianpietro wrote: > > this is the inodes number in the exported folder of the volume > > in the server before write file in the client: > > > > [root cld-blu-13 nova]# du --inodes > > 2 . > > > > this is the used block: > > > > [root cld-blu-13 nova]# df -T > > Filesystem Type 1K-blocks Used Available > > Use% Mounted on > > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 33000 1152845588 > > 1% /nfscluster > > > > after write file in the client with umount/mount during writing: > > > > [root cld-blu-13 nova]# du --inodes > > 3 . > > > > [root cld-blu-13 nova]# df -T > > Filesystem Type 1K-blocks Used > > Available Use% Mounted on > > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 > > 1131874068 2% /nfscluster > > > > thi is correct. > > now delete file: > > > > [root cld-blu-13 nova]# du --inodes > > 2 . > > > > the number of the inodes is correct (from 3 to 2). > > > > [root cld-blu-13 nova]# df -T > > Filesystem Type 1K-blocks Used > > Available Use% Mounted on > > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 > > 1131874068 2% /nfscluster > > > > the number of used block is not correct. > > Do not return to initial value 33000 > > If you try "df -i", you'll probably also find that it gives the "wrong" > result. (So, probably 3 inodes, though "du --inodes" is still only > finding 2). > > --b. 
> the problem is that after delete file the inode go in the orphaned state: [root at cld-blu-13 nova]# tune2fs -l /dev/nfsclustervg/nfsclusterlv |grep -i inode Filesystem features: has_journal ext_attr resize_inode dir_index filetype needs_recovery sparse_super large_file Inode count: 72097792 Free inodes: 72097754 Inodes per group: 8192 Inode blocks per group: 512 First inode: 11 Inode size: 256 Journal inode: 8 First orphan inode: 53067783 Journal backup: inode blocks From unix.co at gmail.com Thu May 21 11:01:59 2015 From: unix.co at gmail.com (Umar Draz) Date: Thu, 21 May 2015 16:01:59 +0500 Subject: [Linux-cluster] iscsi-stonith-device stopped Message-ID: Hi I have created 2 node clvm cluster, everything apparently running file, but when I did *pcs status* it always display this Clone Set: dlm-clone [dlm] Started: [ clvm-1 clvm-2 ] Clone Set: clvmd-clone [clvmd] Started: [ clvm-1 clvm-2 ] * iscsi-stonith-device (stonith:fence_scsi): Stopped* Failed actions: iscsi-stonith-device_start_0 on clvm-1 'unknown error' (1): call=40, status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:23 2015', queued=0ms, exec=1154ms iscsi-stonith-device_start_0 on clvm-2 'unknown error' (1): call=38, status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:26 2015', queued=0ms, exec=1161ms PCSD Status: clvm-1: Online clvm-2: Online Daemon Status: corosync: active/enabled pacemaker: active/enabled pcsd: active/enabled Would you please help me why iscsi-stonith device stopped, and how I can solve this issue. Br. Umar -------------- next part -------------- An HTML attachment was scrubbed... URL: From gianpietro.sella at unipd.it Thu May 21 12:10:15 2015 From: gianpietro.sella at unipd.it (sella gianpietro) Date: Thu, 21 May 2015 14:10:15 +0200 Subject: [Linux-cluster] R: iscsi-stonith-device stopped In-Reply-To: Message-ID: <20150521122011.D7CCBF706@kletz.unipd.it> that operating system do you use? I used fence_scsi with centos 7.1 but do not start. https://access.redhat.com/solutions/1421063 _____ Da: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] Per conto di Umar Draz Inviato: gioved? 21 maggio 2015 13.02 A: linux-cluster at redhat.com Oggetto: [Linux-cluster] iscsi-stonith-device stopped Hi I have created 2 node clvm cluster, everything apparently running file, but when I did pcs status it always display this Clone Set: dlm-clone [dlm] Started: [ clvm-1 clvm-2 ] Clone Set: clvmd-clone [clvmd] Started: [ clvm-1 clvm-2 ] iscsi-stonith-device (stonith:fence_scsi): Stopped Failed actions: iscsi-stonith-device_start_0 on clvm-1 'unknown error' (1): call=40, status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:23 2015', queued=0ms, exec=1154ms iscsi-stonith-device_start_0 on clvm-2 'unknown error' (1): call=38, status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:26 2015', queued=0ms, exec=1161ms PCSD Status: clvm-1: Online clvm-2: Online Daemon Status: corosync: active/enabled pacemaker: active/enabled pcsd: active/enabled Would you please help me why iscsi-stonith device stopped, and how I can solve this issue. Br. Umar -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From unix.co at gmail.com Thu May 21 12:28:18 2015 From: unix.co at gmail.com (Umar Draz) Date: Thu, 21 May 2015 17:28:18 +0500 Subject: [Linux-cluster] R: iscsi-stonith-device stopped In-Reply-To: <20150521122011.D7CCBF706@kletz.unipd.it> References: <20150521122011.D7CCBF706@kletz.unipd.it> Message-ID: Hi, Yes I am using Centos 7, so it will not work on CentOS 7? Br. Umar On Thu, May 21, 2015 at 5:10 PM, sella gianpietro wrote: > that operating system do you use? > > I used fence_scsi with centos 7.1 but do not start. > > https://access.redhat.com/solutions/1421063 > > > > > > > > > ------------------------------ > > *Da:* linux-cluster-bounces at redhat.com [mailto: > linux-cluster-bounces at redhat.com] *Per conto di *Umar Draz > *Inviato:* gioved? 21 maggio 2015 13.02 > *A:* linux-cluster at redhat.com > *Oggetto:* [Linux-cluster] iscsi-stonith-device stopped > > > > Hi > > > > I have created 2 node clvm cluster, everything apparently running file, > but when I did > > > > *pcs** status* > > > > it always display this > > > > Clone Set: dlm-clone [dlm] > > Started: [ clvm-1 clvm-2 ] > > Clone Set: clvmd-clone [clvmd] > > Started: [ clvm-1 clvm-2 ] > > * iscsi-stonith-device (stonith:fence_scsi): Stopped* > > > > Failed actions: > > iscsi-stonith-device_start_0 on clvm-1 'unknown error' (1): call=40, > status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:23 > 2015', queued=0ms, exec=1154ms > > iscsi-stonith-device_start_0 on clvm-2 'unknown error' (1): call=38, > status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:26 > 2015', queued=0ms, exec=1161ms > > > > > > PCSD Status: > > clvm-1: Online > > clvm-2: Online > > > > Daemon Status: > > corosync: active/enabled > > pacemaker: active/enabled > > pcsd: active/enabled > > > > > > Would you please help me why iscsi-stonith device stopped, and how I can > solve this issue. > > > > Br. > > > > Umar > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- Umar Draz Network Architect -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Thu May 21 12:33:23 2015 From: emi2fast at gmail.com (emmanuel segura) Date: Thu, 21 May 2015 14:33:23 +0200 Subject: [Linux-cluster] R: iscsi-stonith-device stopped In-Reply-To: References: <20150521122011.D7CCBF706@kletz.unipd.it> Message-ID: I don't know if the fence_scsi is the same used in redhat cluster 5, but i think that works with scsi reservation, so your disk need to support scsi 3 2015-05-21 14:28 GMT+02:00 Umar Draz : > Hi, > > Yes I am using Centos 7, so it will not work on CentOS 7? > > Br. > > Umar > > On Thu, May 21, 2015 at 5:10 PM, sella gianpietro > wrote: >> >> that operating system do you use? >> >> I used fence_scsi with centos 7.1 but do not start. >> >> https://access.redhat.com/solutions/1421063 >> >> >> >> >> >> >> >> >> >> ________________________________ >> >> Da: linux-cluster-bounces at redhat.com >> [mailto:linux-cluster-bounces at redhat.com] Per conto di Umar Draz >> Inviato: gioved? 
21 maggio 2015 13.02 >> A: linux-cluster at redhat.com >> Oggetto: [Linux-cluster] iscsi-stonith-device stopped >> >> >> >> Hi >> >> >> >> I have created 2 node clvm cluster, everything apparently running file, >> but when I did >> >> >> >> pcs status >> >> >> >> it always display this >> >> >> >> Clone Set: dlm-clone [dlm] >> >> Started: [ clvm-1 clvm-2 ] >> >> Clone Set: clvmd-clone [clvmd] >> >> Started: [ clvm-1 clvm-2 ] >> >> iscsi-stonith-device (stonith:fence_scsi): Stopped >> >> >> >> Failed actions: >> >> iscsi-stonith-device_start_0 on clvm-1 'unknown error' (1): call=40, >> status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:23 2015', >> queued=0ms, exec=1154ms >> >> iscsi-stonith-device_start_0 on clvm-2 'unknown error' (1): call=38, >> status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:26 2015', >> queued=0ms, exec=1161ms >> >> >> >> >> >> PCSD Status: >> >> clvm-1: Online >> >> clvm-2: Online >> >> >> >> Daemon Status: >> >> corosync: active/enabled >> >> pacemaker: active/enabled >> >> pcsd: active/enabled >> >> >> >> >> >> Would you please help me why iscsi-stonith device stopped, and how I can >> solve this issue. >> >> >> >> Br. >> >> >> >> Umar >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > > > > -- > Umar Draz > Network Architect > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -- .~. /V\ // \\ /( )\ ^`~'^ From bfields at fieldses.org Thu May 21 18:01:42 2015 From: bfields at fieldses.org (J. Bruce Fields) Date: Thu, 21 May 2015 14:01:42 -0400 Subject: [Linux-cluster] R: nfs cluster, problem with delete file in the failover case In-Reply-To: References: <20150512152517.GB6370@fieldses.org> <20150513111531.3CFB11F31@mydoom.unipd.it> <20150513154519.GA2070@fieldses.org> Message-ID: <20150521180142.GA29163@fieldses.org> On Wed, May 13, 2015 at 07:38:03PM +0000, gianpietro sella wrote: > J. Bruce Fields fieldses.org> writes: > > > > > On Wed, May 13, 2015 at 01:06:17PM +0200, sella gianpietro wrote: > > > this is the inodes number in the exported folder of the volume > > > in the server before write file in the client: > > > > > > [root cld-blu-13 nova]# du --inodes > > > 2 . > > > > > > this is the used block: > > > > > > [root cld-blu-13 nova]# df -T > > > Filesystem Type 1K-blocks Used Available > > > Use% Mounted on > > > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 33000 1152845588 > > > 1% /nfscluster > > > > > > after write file in the client with umount/mount during writing: > > > > > > [root cld-blu-13 nova]# du --inodes > > > 3 . > > > > > > [root cld-blu-13 nova]# df -T > > > Filesystem Type 1K-blocks Used > > > Available Use% Mounted on > > > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 > > > 1131874068 2% /nfscluster > > > > > > thi is correct. > > > now delete file: > > > > > > [root cld-blu-13 nova]# du --inodes > > > 2 . > > > > > > the number of the inodes is correct (from 3 to 2). > > > > > > [root cld-blu-13 nova]# df -T > > > Filesystem Type 1K-blocks Used > > > Available Use% Mounted on > > > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 > > > 1131874068 2% /nfscluster > > > > > > the number of used block is not correct. > > > Do not return to initial value 33000 > > > > If you try "df -i", you'll probably also find that it gives the "wrong" > > result. 
(So, probably 3 inodes, though "du --inodes" is still only > > finding 2). > > > > --b. > > > > > the problem is that after delete file the inode go in the orphaned state: Yeah, that's consistent with everything else--we're not removing a dentry when we should for some reason, so the inode's staying referenced. --b. > > [root at cld-blu-13 nova]# tune2fs -l /dev/nfsclustervg/nfsclusterlv |grep -i inode > Filesystem features: has_journal ext_attr resize_inode dir_index > filetype needs_recovery sparse_super large_file > Inode count: 72097792 > Free inodes: 72097754 > Inodes per group: 8192 > Inode blocks per group: 512 > First inode: 11 > Inode size: 256 > Journal inode: 8 > First orphan inode: 53067783 > Journal backup: inode blocks > > > > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster From gianpietro.sella at unipd.it Thu May 21 19:05:36 2015 From: gianpietro.sella at unipd.it (gianpietro.sella at unipd.it) Date: Thu, 21 May 2015 21:05:36 +0200 Subject: [Linux-cluster] R: nfs cluster, problem with delete file in the failover case In-Reply-To: <20150521180142.GA29163@fieldses.org> References: <20150512152517.GB6370@fieldses.org> <20150513111531.3CFB11F31@mydoom.unipd.it> <20150513154519.GA2070@fieldses.org> <20150521180142.GA29163@fieldses.org> Message-ID: <8467be50c20c996d71b0412b6e8a9677.squirrel@webmail.unipd.it> > On Wed, May 13, 2015 at 07:38:03PM +0000, gianpietro sella wrote: >> J. Bruce Fields fieldses.org> writes: >> >> > >> > On Wed, May 13, 2015 at 01:06:17PM +0200, sella gianpietro wrote: >> > > this is the inodes number in the exported folder of the volume >> > > in the server before write file in the client: >> > > >> > > [root cld-blu-13 nova]# du --inodes >> > > 2 . >> > > >> > > this is the used block: >> > > >> > > [root cld-blu-13 nova]# df -T >> > > Filesystem Type 1K-blocks Used >> Available >> > > Use% Mounted on >> > > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 33000 >> 1152845588 >> > > 1% /nfscluster >> > > >> > > after write file in the client with umount/mount during writing: >> > > >> > > [root cld-blu-13 nova]# du --inodes >> > > 3 . >> > > >> > > [root cld-blu-13 nova]# df -T >> > > Filesystem Type 1K-blocks Used >> > > Available Use% Mounted on >> > > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 >> > > 1131874068 2% /nfscluster >> > > >> > > thi is correct. >> > > now delete file: >> > > >> > > [root cld-blu-13 nova]# du --inodes >> > > 2 . >> > > >> > > the number of the inodes is correct (from 3 to 2). >> > > >> > > [root cld-blu-13 nova]# df -T >> > > Filesystem Type 1K-blocks Used >> > > Available Use% Mounted on >> > > /dev/mapper/nfsclustervg-nfsclusterlv xfs 1152878588 21004520 >> > > 1131874068 2% /nfscluster >> > > >> > > the number of used block is not correct. >> > > Do not return to initial value 33000 >> > >> > If you try "df -i", you'll probably also find that it gives the >> "wrong" >> > result. (So, probably 3 inodes, though "du --inodes" is still only >> > finding 2). >> > >> > --b. >> > >> >> >> the problem is that after delete file the inode go in the orphaned >> state: > > Yeah, that's consistent with everything else--we're not removing a > dentry when we should for some reason, so the inode's staying > referenced. > > --b. > tanks Bruce. yes this is true. I use nfs cluster on 2 node for nova instances in openstack (the instamces are stored on nfs folder). 
the probability that I create an file before an failover and then I delete the file file after failover is very little. In this case I can execute an "mount -o remount" after the failover and delete command and the orpahned inode is deleted and the free disk space is ok. I do not understand who use the file after failover and delete command. After I delete the file I do not see process that use the deleted file. this is very strange. But my is just an curiosity. I think that the cause is the unmount operation on the failover node. >> >> [root at cld-blu-13 nova]# tune2fs -l /dev/nfsclustervg/nfsclusterlv |grep >> -i inode >> Filesystem features: has_journal ext_attr resize_inode dir_index >> filetype needs_recovery sparse_super large_file >> Inode count: 72097792 >> Free inodes: 72097754 >> Inodes per group: 8192 >> Inode blocks per group: 512 >> First inode: 11 >> Inode size: 256 >> Journal inode: 8 >> First orphan inode: 53067783 >> Journal backup: inode blocks >> >> >> >> >> -- >> Linux-cluster mailing list >> Linux-cluster at redhat.com >> https://www.redhat.com/mailman/listinfo/linux-cluster > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > > From unix.co at gmail.com Fri May 22 06:22:45 2015 From: unix.co at gmail.com (Umar Draz) Date: Fri, 22 May 2015 11:22:45 +0500 Subject: [Linux-cluster] R: iscsi-stonith-device stopped In-Reply-To: References: <20150521122011.D7CCBF706@kletz.unipd.it> Message-ID: Hi Thanks for your response, I will check about this scsi 3 supprot, Now I another question How i can remove the dead node from my cluster, I used this pcs cluster node remove leftnode but it wasn't working due to this error *(Error: pcsd is not running on leftnode)* Br. Umar On Thu, May 21, 2015 at 5:33 PM, emmanuel segura wrote: > I don't know if the fence_scsi is the same used in redhat cluster 5, > but i think that works with scsi reservation, so your disk need to > support scsi 3 > > 2015-05-21 14:28 GMT+02:00 Umar Draz : > > Hi, > > > > Yes I am using Centos 7, so it will not work on CentOS 7? > > > > Br. > > > > Umar > > > > On Thu, May 21, 2015 at 5:10 PM, sella gianpietro > > wrote: > >> > >> that operating system do you use? > >> > >> I used fence_scsi with centos 7.1 but do not start. > >> > >> https://access.redhat.com/solutions/1421063 > >> > >> > >> > >> > >> > >> > >> > >> > >> > >> ________________________________ > >> > >> Da: linux-cluster-bounces at redhat.com > >> [mailto:linux-cluster-bounces at redhat.com] Per conto di Umar Draz > >> Inviato: gioved? 
On Thu, May 21, 2015 at 5:33 PM, emmanuel segura wrote:

> I don't know if the fence_scsi is the same used in redhat cluster 5,
> but i think that works with scsi reservation, so your disk need to
> support scsi 3
>
> 2015-05-21 14:28 GMT+02:00 Umar Draz:
> > Hi,
> >
> > Yes I am using Centos 7, so it will not work on CentOS 7?
> >
> > Br.
> >
> > Umar
> >
> > On Thu, May 21, 2015 at 5:10 PM, sella gianpietro wrote:
> >>
> >> that operating system do you use?
> >>
> >> I used fence_scsi with centos 7.1 but do not start.
> >>
> >> https://access.redhat.com/solutions/1421063
> >>
> >> ________________________________
> >>
> >> From: linux-cluster-bounces at redhat.com
> >> [mailto:linux-cluster-bounces at redhat.com] On behalf of Umar Draz
> >> Sent: Thursday, 21 May 2015 13:02
> >> To: linux-cluster at redhat.com
> >> Subject: [Linux-cluster] iscsi-stonith-device stopped
> >>
> >> Hi
> >>
> >> I have created a 2 node clvm cluster, everything apparently running fine,
> >> but when I did
> >>
> >> pcs status
> >>
> >> it always display this
> >>
> >> Clone Set: dlm-clone [dlm]
> >>     Started: [ clvm-1 clvm-2 ]
> >> Clone Set: clvmd-clone [clvmd]
> >>     Started: [ clvm-1 clvm-2 ]
> >> iscsi-stonith-device  (stonith:fence_scsi):   Stopped
> >>
> >> Failed actions:
> >>     iscsi-stonith-device_start_0 on clvm-1 'unknown error' (1): call=40,
> >>     status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:23 2015',
> >>     queued=0ms, exec=1154ms
> >>     iscsi-stonith-device_start_0 on clvm-2 'unknown error' (1): call=38,
> >>     status=Error, exit-reason='none', last-rc-change='Thu May 21 05:52:26 2015',
> >>     queued=0ms, exec=1161ms
> >>
> >> PCSD Status:
> >>   clvm-1: Online
> >>   clvm-2: Online
> >>
> >> Daemon Status:
> >>   corosync: active/enabled
> >>   pacemaker: active/enabled
> >>   pcsd: active/enabled
> >>
> >> Would you please help me why the iscsi-stonith device stopped, and how I can
> >> solve this issue.
> >>
> >> Br.
> >>
> >> Umar
> >>
> >> --
> >> Linux-cluster mailing list
> >> Linux-cluster at redhat.com
> >> https://www.redhat.com/mailman/listinfo/linux-cluster
> >
> > --
> > Umar Draz
> > Network Architect
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
>
> --
> .~.
> /V\
> // \\
> /(   )\
> ^`~'^
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster

--
Umar Draz
Network Architect

From tojeline at redhat.com  Wed May 27 10:54:59 2015
From: tojeline at redhat.com (Tomas Jelinek)
Date: Wed, 27 May 2015 12:54:59 +0200
Subject: [Linux-cluster] R: iscsi-stonith-device stopped
In-Reply-To:
References: <20150521122011.D7CCBF706@kletz.unipd.it>
Message-ID: <5565A283.9030505@redhat.com>

On 22.5.2015 08:22, Umar Draz wrote:
> Hi,
>
> Thanks for your response, I will check this SCSI-3 support. Now I have
> another question: how can I remove the dead node from my cluster? I used
>
> pcs cluster node remove leftnode
>
> but it did not work, because of this error: *(Error: pcsd is not running
> on leftnode)*

Hi,

You can remove it like this:
1. run 'pcs cluster localnode remove leftnode' on all remaining nodes
2. run 'pcs cluster reload corosync' on one remaining node
3. run 'crm_node -R leftnode --force' on one remaining node

Tomas
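
For reference, the three steps above could be scripted roughly as follows; clvm-1/clvm-2 stand in for the surviving nodes from the earlier pcs status output, leftnode for the dead node, and ssh is just one possible way to run the commands on each survivor:

DEAD=leftnode
SURVIVORS="clvm-1 clvm-2"
for n in $SURVIVORS; do
    ssh "$n" "pcs cluster localnode remove $DEAD"   # step 1, on every remaining node
done
ssh clvm-1 "pcs cluster reload corosync"            # step 2, on one remaining node
ssh clvm-1 "crm_node -R $DEAD --force"              # step 3, on one remaining node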
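
Coming back to the stopped iscsi-stonith-device at the start of this thread: a fence_scsi stonith resource on a Pacemaker cluster is typically defined along the lines of the sketch below. This is not the poster's actual configuration; the device path is a placeholder for the shared iSCSI disk, and the agent will only start if that LUN really supports SCSI-3 persistent reservations, as discussed above.

pcs stonith create iscsi-stonith-device fence_scsi \
    pcmk_host_list="clvm-1 clvm-2" \
    devices="/dev/mapper/mpathX" \
    meta provides=unfencing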