From pradeepanan at gmail.com  Thu Jul 6 02:11:16 2017
From: pradeepanan at gmail.com (pradeep s)
Date: Wed, 5 Jul 2017 23:11:16 -0300
Subject: [Linux-cluster] Cluster - NFS Share Configuration
Message-ID:

I am working on configuring a cluster environment for an NFS share using
pacemaker. Below are the resources I have configured.

Quote:

 Group: nfsgroup
  Resource: my_lvm (class=ocf provider=heartbeat type=LVM)
   Attributes: volgrpname=my_vg exclusive=true
   Operations: start interval=0s timeout=30 (my_lvm-start-interval-0s)
               stop interval=0s timeout=30 (my_lvm-stop-interval-0s)
               monitor interval=10 timeout=30 (my_lvm-monitor-interval-10)
  Resource: nfsshare (class=ocf provider=heartbeat type=Filesystem)
   Attributes: device=/dev/my_vg/my_lv directory=/nfsshare fstype=ext4
   Operations: start interval=0s timeout=60 (nfsshare-start-interval-0s)
               stop interval=0s timeout=60 (nfsshare-stop-interval-0s)
               monitor interval=20 timeout=40 (nfsshare-monitor-interval-20)
  Resource: nfs-daemon (class=ocf provider=heartbeat type=nfsserver)
   Attributes: nfs_shared_infodir=/nfsshare/nfsinfo nfs_no_notify=true
   Operations: start interval=0s timeout=40 (nfs-daemon-start-interval-0s)
               stop interval=0s timeout=20s (nfs-daemon-stop-interval-0s)
               monitor interval=10 timeout=20s (nfs-daemon-monitor-interval-10)
  Resource: nfs-root (class=ocf provider=heartbeat type=exportfs)
   Attributes: clientspec=10.199.1.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfsshare/exports fsid=0
   Operations: start interval=0s timeout=40 (nfs-root-start-interval-0s)
               stop interval=0s timeout=120 (nfs-root-stop-interval-0s)
               monitor interval=10 timeout=20 (nfs-root-monitor-interval-10)
  Resource: nfs-export1 (class=ocf provider=heartbeat type=exportfs)
   Attributes: clientspec=10.199.1.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfsshare/exports/export1 fsid=1
   Operations: start interval=0s timeout=40 (nfs-export1-start-interval-0s)
               stop interval=0s timeout=120 (nfs-export1-stop-interval-0s)
               monitor interval=10 timeout=20 (nfs-export1-monitor-interval-10)
  Resource: nfs-export2 (class=ocf provider=heartbeat type=exportfs)
   Attributes: clientspec=10.199.1.0/255.255.255.0 options=rw,sync,no_root_squash directory=/nfsshare/exports/export2 fsid=2
   Operations: start interval=0s timeout=40 (nfs-export2-start-interval-0s)
               stop interval=0s timeout=120 (nfs-export2-stop-interval-0s)
               monitor interval=10 timeout=20 (nfs-export2-monitor-interval-10)
  Resource: nfs_ip (class=ocf provider=heartbeat type=IPaddr2)
   Attributes: ip=10.199.1.86 cidr_netmask=24
   Operations: start interval=0s timeout=20s (nfs_ip-start-interval-0s)
               stop interval=0s timeout=20s (nfs_ip-stop-interval-0s)
               monitor interval=10s timeout=20s (nfs_ip-monitor-interval-10s)
  Resource: nfs-notify (class=ocf provider=heartbeat type=nfsnotify)
   Attributes: source_host=10.199.1.86
   Operations: start interval=0s timeout=90 (nfs-notify-start-interval-0s)
               stop interval=0s timeout=90 (nfs-notify-stop-interval-0s)
               monitor interval=30 timeout=90 (nfs-notify-monitor-interval-30)

PCS Status

Quote:

 Cluster name: my_cluster
 Stack: corosync
 Current DC: node3.cluster.com (version 1.1.15-11.el7_3.5-e174ec8) - partition with quorum
 Last updated: Wed Jul 5 13:12:48 2017
 Last change: Wed Jul 5 13:11:52 2017 by root via crm_attribute on node3.cluster.com

 2 nodes and 10 resources configured

 Online: [ node3.cluster.com node4.cluster.com ]

 Full list of resources:

  fence-3       (stonith:fence_vmware_soap):    Started node4.cluster.com
  fence-4       (stonith:fence_vmware_soap):    Started node3.cluster.com
  Resource Group: nfsgroup
      my_lvm        (ocf::heartbeat:LVM):           Started node3.cluster.com
      nfsshare      (ocf::heartbeat:Filesystem):    Started node3.cluster.com
      nfs-daemon    (ocf::heartbeat:nfsserver):     Started node3.cluster.com
      nfs-root      (ocf::heartbeat:exportfs):      Started node3.cluster.com
      nfs-export1   (ocf::heartbeat:exportfs):      Started node3.cluster.com
      nfs-export2   (ocf::heartbeat:exportfs):      Started node3.cluster.com
      nfs_ip        (ocf::heartbeat:IPaddr2):       Started node3.cluster.com
      nfs-notify    (ocf::heartbeat:nfsnotify):     Started node3.cluster.com

I followed the Red Hat documentation link to configure this. Once configured,
I could mount the directory from an NFS client with no issues. However, when
I put the active node into standby, the resources do not start up on the
secondary node.

After putting the active node into standby:

Quote:

 [root at node3 ~]# pcs status
 Cluster name: my_cluster
 Stack: corosync
 Current DC: node3.cluster.com (version 1.1.15-11.el7_3.5-e174ec8) - partition with quorum
 Last updated: Wed Jul 5 13:16:05 2017
 Last change: Wed Jul 5 13:15:38 2017 by root via crm_attribute on node3.cluster.com

 2 nodes and 10 resources configured

 Node node3.cluster.com: standby
 Online: [ node4.cluster.com ]

 Full list of resources:

  fence-3       (stonith:fence_vmware_soap):    Started node4.cluster.com
  fence-4       (stonith:fence_vmware_soap):    Started node4.cluster.com
  Resource Group: nfsgroup
      my_lvm        (ocf::heartbeat:LVM):           Stopped
      nfsshare      (ocf::heartbeat:Filesystem):    Stopped
      nfs-daemon    (ocf::heartbeat:nfsserver):     Stopped
      nfs-root      (ocf::heartbeat:exportfs):      Stopped
      nfs-export1   (ocf::heartbeat:exportfs):      Stopped
      nfs-export2   (ocf::heartbeat:exportfs):      Stopped
      nfs_ip        (ocf::heartbeat:IPaddr2):       Stopped
      nfs-notify    (ocf::heartbeat:nfsnotify):     Stopped

 Failed Actions:
 * fence-3_monitor_60000 on node4.cluster.com 'unknown error' (1): call=50, status=Timed Out, exitreason='none',
     last-rc-change='Wed Jul 5 13:11:54 2017', queued=0ms, exec=20012ms
 * fence-4_monitor_60000 on node4.cluster.com 'unknown error' (1): call=47, status=Timed Out, exitreason='none',
     last-rc-change='Wed Jul 5 13:05:32 2017', queued=0ms, exec=20028ms
 * my_lvm_start_0 on node4.cluster.com 'unknown error' (1): call=49, status=complete, exitreason='Volume group [my_vg] does not exist or contains error! Volume group "my_vg" not found',
     last-rc-change='Wed Jul 5 13:05:39 2017', queued=0ms, exec=1447ms

 Daemon Status:
   corosync: active/enabled
   pacemaker: active/enabled
   pcsd: active/enabled

I am seeing this error:

Quote:

 ERROR: Volume group [my_vg] does not exist or contains error! Volume group "my_vg" not found
 Cannot process volume group my_vg

This resolves itself when I create the LVM manually on the secondary node,
but I expect the resources to do that job. Am I missing something in this
configuration?

--
Regards,
Pradeep Anandh
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
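
The my_lvm_start_0 failure above means node4 cannot see or activate my_vg.
The thread does not confirm the root cause, but the usual HA-LVM checklist
on RHEL 7 (the kind of setup the Red Hat guide describes) looks roughly like
the sketch below; the device, local VG and resource names are placeholders
taken from the post, not verified configuration:

    # 1. On BOTH nodes: check that the PV/VG backing my_vg is visible.
    #    If "vgs" on node4 shows nothing for my_vg, the VG was probably
    #    created on storage only node3 can reach, which matches the error.
    pvs
    vgs my_vg
    lsblk

    # 2. On BOTH nodes: hand activation of my_vg over to the cluster.
    #    volume_list in /etc/lvm/lvm.conf should list only local VGs
    #    (e.g. the root VG) and must NOT include my_vg.
    lvmconf --enable-halvm --services --startstopservices
    #   /etc/lvm/lvm.conf:  volume_list = [ "rhel_root" ]   # "rhel_root" is a placeholder

    # 3. Rebuild the initramfs so early boot uses the same LVM settings,
    #    then reboot the node.
    dracut -H -f /boot/initramfs-$(uname -r).img $(uname -r)

    # 4. Once fixed, clear the recorded failure and let the cluster retry.
    pcs resource cleanup my_lvm

If vgs on node4 does not show my_vg at all, shared-storage visibility is the
first thing to rule out; the lvm.conf/volume_list step only matters once both
nodes can see the volume group.
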
From admin at feldhost.cz  Tue Jul 18 23:25:47 2017
From: admin at feldhost.cz (Kristián Feldsam)
Date: Wed, 19 Jul 2017 01:25:47 +0200
Subject: [Linux-cluster] GFS2 Errors
Message-ID: <302FB14A-2B08-4427-A9FD-21F90375235A@feldhost.cz>

Hello, today I see GFS2 errors in the log and there is nothing about them on
the net, so I am writing to this mailing list.

node2 19.07.2017 01:11:55 kernel kern err vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-4549568322848002755
node2 19.07.2017 01:10:56 kernel kern err vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-8191295421473926116
node2 19.07.2017 01:10:48 kernel kern err vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-8225402411152149004
node2 19.07.2017 01:10:47 kernel kern err vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-8230186816585019317
node2 19.07.2017 01:10:45 kernel kern err vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-8242007238441787628
node2 19.07.2017 01:10:39 kernel kern err vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-8250926852732428536
node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-5150933278940354602
node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-64
node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab: gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-64

Would somebody explain these errors? The cluster looks like it is working
normally. I enabled vm.zone_reclaim_mode = 1 on the nodes...

Thank you!

Best regards,

Kristián Feldsam
Tel.: +420 773 303 353, +421 944 137 535
E-mail.: support at feldhost.cz

www.feldhost.cz - FeldHost™ – professional hosting and server services at
fair prices.

FELDSAM s.r.o.
V rohu 434/3
Praha 4 – Libuš, PSČ 142 00
IČ: 290 60 958, DIČ: CZ290 60 958
C 200350 registered with the Municipal Court in Prague

Bank: Fio banka a.s.
Account number: 2400330446/2010
BIC: FIOBCZPPXX
IBAN: CZ82 2010 0000 0024 0033 0446

-------------- next part --------------
An HTML attachment was scrubbed...
URL:
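
For reference, the vm.zone_reclaim_mode setting mentioned above is applied
like this (a minimal sketch; enabling it is the poster's choice here, not a
general recommendation, and the sysctl.d file name is arbitrary):

    # Apply immediately
    sysctl -w vm.zone_reclaim_mode=1

    # Persist across reboots
    echo 'vm.zone_reclaim_mode = 1' > /etc/sysctl.d/90-zone-reclaim.conf
    sysctl --system
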
From lists at alteeve.ca  Tue Jul 18 23:39:08 2017
From: lists at alteeve.ca (Digimer)
Date: Tue, 18 Jul 2017 19:39:08 -0400
Subject: [Linux-cluster] GFS2 Errors
In-Reply-To: <302FB14A-2B08-4427-A9FD-21F90375235A@feldhost.cz>
References: <302FB14A-2B08-4427-A9FD-21F90375235A@feldhost.cz>
Message-ID:

On 2017-07-18 07:25 PM, Kristián Feldsam wrote:
> Hello, today I see GFS2 errors in the log and there is nothing about them
> on the net, so I am writing to this mailing list.
>
> node2 19.07.2017 01:11:55 kernel kern err vmscan: shrink_slab:
> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
> nr=-4549568322848002755
> node2 19.07.2017 01:10:56 kernel kern err vmscan: shrink_slab:
> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
> nr=-8191295421473926116
> node2 19.07.2017 01:10:48 kernel kern err vmscan: shrink_slab:
> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
> nr=-8225402411152149004
> node2 19.07.2017 01:10:47 kernel kern err vmscan: shrink_slab:
> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
> nr=-8230186816585019317
> node2 19.07.2017 01:10:45 kernel kern err vmscan: shrink_slab:
> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
> nr=-8242007238441787628
> node2 19.07.2017 01:10:39 kernel kern err vmscan: shrink_slab:
> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
> nr=-8250926852732428536
> node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab:
> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
> nr=-5150933278940354602
> node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab:
> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-64
> node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab:
> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-64
>
> Would somebody explain these errors? The cluster looks like it is working
> normally. I enabled vm.zone_reclaim_mode = 1 on the nodes...
>
> Thank you!

Please post this to the Clusterlabs - Users list. This ML is deprecated.

--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein's brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould

From swhiteho at redhat.com  Wed Jul 19 09:09:08 2017
From: swhiteho at redhat.com (Steven Whitehouse)
Date: Wed, 19 Jul 2017 10:09:08 +0100
Subject: [Linux-cluster] GFS2 Errors
In-Reply-To:
References: <302FB14A-2B08-4427-A9FD-21F90375235A@feldhost.cz>
Message-ID: <77967257-df7d-3d86-14d6-8d9ae78a64d9@redhat.com>

Hi,


On 19/07/17 00:39, Digimer wrote:
> On 2017-07-18 07:25 PM, Kristián Feldsam wrote:
>> Hello, today I see GFS2 errors in the log and there is nothing about them
>> on the net, so I am writing to this mailing list.
>> >> node2 19.07.2017 01:11:55 kernel kern err vmscan: shrink_slab: >> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete >> nr=-4549568322848002755 >> node2 19.07.2017 01:10:56 kernel kern err vmscan: shrink_slab: >> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete >> nr=-8191295421473926116 >> node2 19.07.2017 01:10:48 kernel kern err vmscan: shrink_slab: >> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete >> nr=-8225402411152149004 >> node2 19.07.2017 01:10:47 kernel kern err vmscan: shrink_slab: >> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete >> nr=-8230186816585019317 >> node2 19.07.2017 01:10:45 kernel kern err vmscan: shrink_slab: >> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete >> nr=-8242007238441787628 >> node2 19.07.2017 01:10:39 kernel kern err vmscan: shrink_slab: >> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete >> nr=-8250926852732428536 >> node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab: >> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete >> nr=-5150933278940354602 >> node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab: >> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-64 >> node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab: >> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-64 >> >> Would somebody explain this errors? cluster is looks like working >> normally. I enabled vm.zone_reclaim_mode = 1 on nodes... >> >> Thank you! > Please post this to the Clusterlabs - Users list. This ML is deprecated. > > cluster-devel is the right list for GFS2 issues. This looks like a bug to me, since the object count should never be negative. The glock shrinker is not (yet) zone aware, although the quota shrinker is. Not sure if that is related, but perhaps.... certainly something we'd like to investigate further. That said the messages in themselves are harmless, but it will likely indicate a less than optimal use of memory. If there are any details that can be shared about the use case, and how to reproduce that will be very helpful for us to know. Also what kernel version was this? Steve. From admin at feldhost.cz Wed Jul 19 09:21:14 2017 From: admin at feldhost.cz (=?utf-8?Q?Kristi=C3=A1n_Feldsam?=) Date: Wed, 19 Jul 2017 11:21:14 +0200 Subject: [Linux-cluster] GFS2 Errors In-Reply-To: <77967257-df7d-3d86-14d6-8d9ae78a64d9@redhat.com> References: <302FB14A-2B08-4427-A9FD-21F90375235A@feldhost.cz> <77967257-df7d-3d86-14d6-8d9ae78a64d9@redhat.com> Message-ID: Hello, kernel actually running in nodes is 4.11.1-1.el7.elrepo.x86_64 use case: 3 compute nodes in cluster corosync/pacemaker resources dlm clvm gfs volume 1 gfs volume 2 volume journal size 256MB please tell me cmds for more informations Thank you! S pozdravem Kristi?n Feldsam Tel.: +420 773 303 353, +421 944 137 535 E-mail.: support at feldhost.cz www.feldhost.cz - FeldHost? ? profesion?ln? hostingov? a serverov? slu?by za adekv?tn? ceny. FELDSAM s.r.o. V rohu 434/3 Praha 4 ? Libu?, PS? 142 00 I?: 290 60 958, DI?: CZ290 60 958 C 200350 veden? u M?stsk?ho soudu v Praze Banka: Fio banka a.s. 
From admin at feldhost.cz  Wed Jul 19 09:21:14 2017
From: admin at feldhost.cz (Kristián Feldsam)
Date: Wed, 19 Jul 2017 11:21:14 +0200
Subject: [Linux-cluster] GFS2 Errors
In-Reply-To: <77967257-df7d-3d86-14d6-8d9ae78a64d9@redhat.com>
References: <302FB14A-2B08-4427-A9FD-21F90375235A@feldhost.cz> <77967257-df7d-3d86-14d6-8d9ae78a64d9@redhat.com>
Message-ID:

Hello, the kernel currently running on the nodes is 4.11.1-1.el7.elrepo.x86_64

Use case:

3 compute nodes in cluster
corosync/pacemaker
resources: dlm, clvm
gfs volume 1
gfs volume 2
volume journal size 256MB

Please tell me which commands to run for more information.

Thank you!

Best regards,

Kristián Feldsam
Tel.: +420 773 303 353, +421 944 137 535
E-mail.: support at feldhost.cz

www.feldhost.cz - FeldHost™ – professional hosting and server services at
fair prices.

FELDSAM s.r.o.
V rohu 434/3
Praha 4 – Libuš, PSČ 142 00
IČ: 290 60 958, DIČ: CZ290 60 958
C 200350 registered with the Municipal Court in Prague

Bank: Fio banka a.s.
Account number: 2400330446/2010
BIC: FIOBCZPPXX
IBAN: CZ82 2010 0000 0024 0033 0446

> On 19 Jul 2017, at 11:09, Steven Whitehouse wrote:
>
> Hi,
>
>
> On 19/07/17 00:39, Digimer wrote:
>> On 2017-07-18 07:25 PM, Kristián Feldsam wrote:
>>> Hello, today I see GFS2 errors in the log and there is nothing about them
>>> on the net, so I am writing to this mailing list.
>>>
>>> node2 19.07.2017 01:11:55 kernel kern err vmscan: shrink_slab:
>>> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
>>> nr=-4549568322848002755
>>> node2 19.07.2017 01:10:56 kernel kern err vmscan: shrink_slab:
>>> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
>>> nr=-8191295421473926116
>>> node2 19.07.2017 01:10:48 kernel kern err vmscan: shrink_slab:
>>> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
>>> nr=-8225402411152149004
>>> node2 19.07.2017 01:10:47 kernel kern err vmscan: shrink_slab:
>>> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
>>> nr=-8230186816585019317
>>> node2 19.07.2017 01:10:45 kernel kern err vmscan: shrink_slab:
>>> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
>>> nr=-8242007238441787628
>>> node2 19.07.2017 01:10:39 kernel kern err vmscan: shrink_slab:
>>> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
>>> nr=-8250926852732428536
>>> node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab:
>>> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete
>>> nr=-5150933278940354602
>>> node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab:
>>> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-64
>>> node3 19.07.2017 00:16:02 kernel kern err vmscan: shrink_slab:
>>> gfs2_glock_shrink_scan+0x0/0x2f0 [gfs2] negative objects to delete nr=-64
>>>
>>> Would somebody explain these errors? The cluster looks like it is working
>>> normally. I enabled vm.zone_reclaim_mode = 1 on the nodes...
>>>
>>> Thank you!
>> Please post this to the Clusterlabs - Users list. This ML is deprecated.
>>
>>
>
> cluster-devel is the right list for GFS2 issues. This looks like a bug to
> me, since the object count should never be negative. The glock shrinker is
> not (yet) zone aware, although the quota shrinker is. Not sure if that is
> related, but perhaps... certainly something we'd like to investigate
> further. That said, the messages in themselves are harmless, but they
> likely indicate a less than optimal use of memory. If there are any details
> that can be shared about the use case, and how to reproduce it, that will
> be very helpful for us to know. Also, what kernel version was this?
>
> Steve.
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
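
A few commands that could gather the details asked for above; this is only
a sketch — debugfs must be mounted, and "mycluster:gfs_vol1" is a placeholder
for whatever "<clustername>:<fsname>" actually appears under
/sys/kernel/debug/gfs2/ on these nodes:

    # Kernel and GFS2 module in use
    uname -r
    modinfo gfs2 | head -n 3

    # The shrink_slab messages land in the kernel log
    dmesg | grep -i 'shrink_slab'

    # Glock state per mounted GFS2 filesystem (via debugfs)
    mount -t debugfs none /sys/kernel/debug 2>/dev/null
    ls /sys/kernel/debug/gfs2/
    wc -l /sys/kernel/debug/gfs2/mycluster:gfs_vol1/glocks

    # Slab usage for GFS2 objects, and the reclaim setting from the thread
    grep gfs2 /proc/slabinfo
    sysctl vm.zone_reclaim_mode
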
From deepeshkumarpal at gmail.com  Sun Jul 30 18:03:44 2017
From: deepeshkumarpal at gmail.com (deepesh kumar)
Date: Sun, 30 Jul 2017 23:33:44 +0530
Subject: [Linux-cluster] Need advice Redhat Clusters
Message-ID:

I need to set up a 2-node HA active/passive Red Hat cluster on RHEL 6.9.

Should I start with rgmanager or pacemaker?

Do I need a quorum disk (is it mandatory?), and what fence method should I use?

Thanks to great friends..!!!

--
DEEPESH KUMAR
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From lists at alteeve.ca  Sun Jul 30 20:19:43 2017
From: lists at alteeve.ca (Digimer)
Date: Sun, 30 Jul 2017 16:19:43 -0400
Subject: [Linux-cluster] Need advice Redhat Clusters
In-Reply-To:
References:
Message-ID: <8c4ada12-ab3e-5d87-e60d-9dfb4767b1cc@alteeve.ca>

On 2017-07-30 02:03 PM, deepesh kumar wrote:
> I need to set up a 2-node HA active/passive Red Hat cluster on RHEL 6.9.
>
> Should I start with rgmanager or pacemaker?
>
> Do I need a quorum disk (is it mandatory?), and what fence method should I use?
>
> Thanks to great friends..!!!
>
> --
> DEEPESH KUMAR

Hi Deepesh,

Note that this channel is deprecated; please use the Clusterlabs - users
list (cc'ed here).

Use pacemaker, but it will need the cman plugin. Only existing projects
should use rgmanager. The fence method you use will depend on what your
nodes are; IPMI is available on most servers, so fence_ipmilan is quite
common. Switched PDUs from APC are also popular, and they use
fence_apc_snmp, etc.

--
Digimer
Papers and Projects: https://alteeve.com/w/
"I am, somehow, less interested in the weight and convolutions of
Einstein's brain than in the near certainty that people of equal talent
have lived and died in cotton fields and sweatshops." - Stephen Jay Gould
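
To make the advice above concrete, a two-node pacemaker-on-cman setup on
RHEL 6.9 looks roughly like the sketch below. It is only an outline — node
names, IPMI addresses and credentials are placeholders, and the fence
parameters must be adapted to the actual hardware:

    # On both nodes
    yum install -y pcs pacemaker cman fence-agents
    service pcsd start && chkconfig pcsd on
    echo 'StrongPassword' | passwd --stdin hacluster

    # From one node: authenticate and build the cluster
    # (on RHEL 6, pcs generates the cman/corosync configuration)
    pcs cluster auth node1.example.com node2.example.com -u hacluster
    pcs cluster setup --name ha_cluster node1.example.com node2.example.com
    pcs cluster start --all

    # A two-node cluster cannot form a majority on its own; rely on fencing
    # rather than a quorum disk (a qdisk is optional here, not mandatory)
    pcs property set no-quorum-policy=ignore
    pcs property set stonith-enabled=true

    # One IPMI fence device per node
    pcs stonith create fence_node1 fence_ipmilan ipaddr=10.0.0.11 \
        login=admin passwd=secret lanplus=1 pcmk_host_list=node1.example.com
    pcs stonith create fence_node2 fence_ipmilan ipaddr=10.0.0.12 \
        login=admin passwd=secret lanplus=1 pcmk_host_list=node2.example.com

After fencing works, the active/passive resources (VIP, filesystem, service)
can be added with pcs resource create and grouped, with rgmanager left out
entirely.
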