From Mark.Vallevand at UNISYS.com Thu Apr 3 20:58:30 2014 From: Mark.Vallevand at UNISYS.com (Vallevand, Mark K) Date: Thu, 3 Apr 2014 15:58:30 -0500 Subject: [Linux-cluster] Simple data replication in a cluster Message-ID: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> I'm looking for a simple way to replicate data within a cluster. It looks like my resources will be self-configuring and may need to push changes they see to all nodes in the cluster. The idea being that when a node crashes, the resource will have its configuration present on the node on which it is restarted. We're talking about a few kb of data, probably in one file, probably text. A typical cluster would have multiple resources (more than two), one resource per node and one extra node. Ideas? Could I use the CIB directly to replicate data? Use cibadmin to update something and sync? How big can a resource parameter be? Could a resource modify its parameters so that they are replicated throughout the cluster? Is there a simple file replication Resource Agent? Drdb seems like overkill. Regards. Mark K Vallevand Mark.Vallevand at Unisys.com May you live in interesting times, may you come to the attention of important people and may all your wishes come true. THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers. -------------- next part -------------- An HTML attachment was scrubbed... URL: From lists at alteeve.ca Thu Apr 3 21:17:40 2014 From: lists at alteeve.ca (Digimer) Date: Thu, 03 Apr 2014 17:17:40 -0400 Subject: [Linux-cluster] Simple data replication in a cluster In-Reply-To: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> References: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> Message-ID: <533DCFF4.2070404@alteeve.ca> On 03/04/14 04:58 PM, Vallevand, Mark K wrote: > I?m looking for a simple way to replicate data within a cluster. > > It looks like my resources will be self-configuring and may need to push > changes they see to all nodes in the cluster. The idea being that when > a node crashes, the resource will have its configuration present on the > node on which it is restarted. We?re talking about a few kb of data, > probably in one file, probably text. A typical cluster would have > multiple resources (more than two), one resource per node and one extra > node. > > Ideas? > > Could I use the CIB directly to replicate data? Use cibadmin to update > something and sync? > > How big can a resource parameter be? Could a resource modify its > parameters so that they are replicated throughout the cluster? > > Is there a simple file replication Resource Agent? > > Drdb seems like overkill. > > Regards. > Mark K Vallevand Mark.Vallevand at Unisys.com If you don't want to use DRBD + gfs2 (what I use), then you'll probably want to look at corosync directly for keeping the data in sync. Pacemaker itself is a cluster resource manager and I don't think the cib is well suited for general data sync'ing. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? 
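For the few-kilobyte case the original post asks about, a cluster-wide property is the usual way to stash data in the CIB, though as noted above the CIB is not really meant to be a data store. A minimal sketch, assuming Pacemaker's crm_attribute is available; the property name and file path are made up:

    # push the small text config into the CIB as a cluster property
    crm_attribute --type crm_config --name myapp-config \
        --update "$(base64 -w0 /etc/myapp/app.conf)"

    # on any node (e.g. where the resource gets restarted), pull it back out
    crm_attribute --type crm_config --name myapp-config --query --quiet \
        | base64 -d > /etc/myapp/app.conf

The property travels to every node with the rest of the CIB, but anything much larger than a few kilobytes will bloat the CIB and slow every configuration update, so treat this as a convenience rather than a replication mechanism.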
From morpheus.ibis at gmail.com Thu Apr 3 21:30:32 2014 From: morpheus.ibis at gmail.com (Pavel Herrmann) Date: Thu, 03 Apr 2014 23:30:32 +0200 Subject: [Linux-cluster] Simple data replication in a cluster In-Reply-To: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> References: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> Message-ID: <2878037.uhyBF5WAPy@bloomfield> Hi On Thursday 03 of April 2014 15:58:30 Vallevand, Mark K wrote: > I'm looking for a simple way to replicate data within a cluster. > > It looks like my resources will be self-configuring and may need to push > changes they see to all nodes in the cluster. The idea being that when a > node crashes, the resource will have its configuration present on the node > on which it is restarted. We're talking about a few kb of data, probably > in one file, probably text. A typical cluster would have multiple > resources (more than two), one resource per node and one extra node. I was facing a similar issue, but instead of going for full cluster stack (i didnt need it for the failover), I went for csync2 if you absolutely need to have the changes propagated instantly, you need to hook csync2 to inotify (google should give you options), or if you dont expect any changes right before crashes, running with cron every few minutes might suit your needs Regards Pavel Herrmann > > Ideas? > > Could I use the CIB directly to replicate data? Use cibadmin to update > something and sync? How big can a resource parameter be? Could a resource > modify its parameters so that they are replicated throughout the cluster? > Is there a simple file replication Resource Agent? > Drdb seems like overkill. > > Regards. > Mark K Vallevand > Mark.Vallevand at Unisys.com May you live in > interesting times, may you come to the attention of important people and > may all your wishes come true. THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL > AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the > intended recipient. If you received this in error, please contact the > sender and delete the e-mail and its attachments from all computers. From ricks at alldigital.com Thu Apr 3 21:51:33 2014 From: ricks at alldigital.com (Rick Stevens) Date: Thu, 3 Apr 2014 14:51:33 -0700 Subject: [Linux-cluster] Simple data replication in a cluster In-Reply-To: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> References: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> Message-ID: <533DD7E5.2020709@alldigital.com> On 04/03/2014 01:58 PM, Vallevand, Mark K issued this missive: > I?m looking for a simple way to replicate data within a cluster. > > It looks like my resources will be self-configuring and may need to push > changes they see to all nodes in the cluster. The idea being that when > a node crashes, the resource will have its configuration present on the > node on which it is restarted. We?re talking about a few kb of data, > probably in one file, probably text. A typical cluster would have > multiple resources (more than two), one resource per node and one extra > node. > > Ideas? > > Could I use the CIB directly to replicate data? Use cibadmin to update > something and sync? > > How big can a resource parameter be? Could a resource modify its > parameters so that they are replicated throughout the cluster? > > Is there a simple file replication Resource Agent? > > Drdb seems like overkill. 
If you're OK with it and it's a small group of files/directories, why not use something like inotifywait and have it run a script that rsyncs the altered files to the other nodes when the files change? I've done it before and it works pretty well. -- ---------------------------------------------------------------------- - Rick Stevens, Systems Engineer, AllDigital ricks at alldigital.com - - AIM/Skype: therps2 ICQ: 22643734 Yahoo: origrps2 - - - - Never eat anything larger than your head - ---------------------------------------------------------------------- From Mark.Vallevand at UNISYS.com Fri Apr 4 14:44:26 2014 From: Mark.Vallevand at UNISYS.com (Vallevand, Mark K) Date: Fri, 4 Apr 2014 09:44:26 -0500 Subject: [Linux-cluster] Simple data replication in a cluster In-Reply-To: <533DCFF4.2070404@alteeve.ca> References: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> <533DCFF4.2070404@alteeve.ca> Message-ID: <99C8B2929B39C24493377AC7A121E21FC5E48A8EE3@USEA-EXCH8.na.uis.unisys.com> Thanks. I'll check out corosync. Regards. Mark K Vallevand?? Mark.Vallevand at Unisys.com May you live in interesting times, may you come to the attention of important people and may all your wishes come true. THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers. -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Digimer Sent: Thursday, April 03, 2014 04:18 PM To: linux clustering Subject: Re: [Linux-cluster] Simple data replication in a cluster On 03/04/14 04:58 PM, Vallevand, Mark K wrote: > I'm looking for a simple way to replicate data within a cluster. > > It looks like my resources will be self-configuring and may need to push > changes they see to all nodes in the cluster. The idea being that when > a node crashes, the resource will have its configuration present on the > node on which it is restarted. We're talking about a few kb of data, > probably in one file, probably text. A typical cluster would have > multiple resources (more than two), one resource per node and one extra > node. > > Ideas? > > Could I use the CIB directly to replicate data? Use cibadmin to update > something and sync? > > How big can a resource parameter be? Could a resource modify its > parameters so that they are replicated throughout the cluster? > > Is there a simple file replication Resource Agent? > > Drdb seems like overkill. > > Regards. > Mark K Vallevand Mark.Vallevand at Unisys.com If you don't want to use DRBD + gfs2 (what I use), then you'll probably want to look at corosync directly for keeping the data in sync. Pacemaker itself is a cluster resource manager and I don't think the cib is well suited for general data sync'ing. -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? 
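A minimal sketch of the inotifywait-plus-rsync approach suggested earlier in the thread, assuming the inotify-tools and rsync packages are installed and root ssh keys are set up between the nodes; the watched directory and peer names are placeholders:

    #!/bin/bash
    # re-sync /etc/myapp to the peer nodes whenever anything under it changes
    WATCH_DIR=/etc/myapp
    PEERS="node2 node3"

    # without -m, inotifywait exits after the first event, so the loop re-arms it
    while inotifywait -r -q -e modify,create,delete,move "$WATCH_DIR"; do
        for peer in $PEERS; do
            rsync -a --delete "$WATCH_DIR/" "$peer:$WATCH_DIR/"
        done
    done

Running this from an init script (or as a cluster resource itself) on the node that currently owns the data keeps the copies close enough to current for a few kilobytes of configuration.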
-- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From Mark.Vallevand at UNISYS.com Fri Apr 4 14:48:31 2014 From: Mark.Vallevand at UNISYS.com (Vallevand, Mark K) Date: Fri, 4 Apr 2014 09:48:31 -0500 Subject: [Linux-cluster] Simple data replication in a cluster In-Reply-To: <2878037.uhyBF5WAPy@bloomfield> References: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> <2878037.uhyBF5WAPy@bloomfield> Message-ID: <99C8B2929B39C24493377AC7A121E21FC5E48A8EFD@USEA-EXCH8.na.uis.unisys.com> Thanks. A good idea. I've considered using csync2 (and similar things). I use unison for syncing in my current clusters. But that is done to simplify installation/configuration by the user. My scripts push the static application data around the cluster when the user is installing. Regards. Mark K Vallevand?? Mark.Vallevand at Unisys.com May you live in interesting times, may you come to the attention of important people and may all your wishes come true. THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers. -----Original Message----- From: Pavel Herrmann [mailto:morpheus.ibis at gmail.com] Sent: Thursday, April 03, 2014 04:31 PM To: linux-cluster at redhat.com Cc: Vallevand, Mark K Subject: Re: [Linux-cluster] Simple data replication in a cluster Hi On Thursday 03 of April 2014 15:58:30 Vallevand, Mark K wrote: > I'm looking for a simple way to replicate data within a cluster. > > It looks like my resources will be self-configuring and may need to push > changes they see to all nodes in the cluster. The idea being that when a > node crashes, the resource will have its configuration present on the node > on which it is restarted. We're talking about a few kb of data, probably > in one file, probably text. A typical cluster would have multiple > resources (more than two), one resource per node and one extra node. I was facing a similar issue, but instead of going for full cluster stack (i didnt need it for the failover), I went for csync2 if you absolutely need to have the changes propagated instantly, you need to hook csync2 to inotify (google should give you options), or if you dont expect any changes right before crashes, running with cron every few minutes might suit your needs Regards Pavel Herrmann > > Ideas? > > Could I use the CIB directly to replicate data? Use cibadmin to update > something and sync? How big can a resource parameter be? Could a resource > modify its parameters so that they are replicated throughout the cluster? > Is there a simple file replication Resource Agent? > Drdb seems like overkill. > > Regards. > Mark K Vallevand > Mark.Vallevand at Unisys.com May you live in > interesting times, may you come to the attention of important people and > may all your wishes come true. THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL > AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the > intended recipient. If you received this in error, please contact the > sender and delete the e-mail and its attachments from all computers. 
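A minimal sketch of the csync2 setup described above, assuming csync2 is installed on every node and a shared key has been generated with csync2 -k; the group name, host names and path are placeholders:

    # /etc/csync2.cfg, identical on every node
    group mycluster {
        host node1;
        host node2;
        host node3;
        key /etc/csync2.key;
        include /etc/myapp;
    }

    # periodic propagation, e.g. a /etc/cron.d entry run every two minutes
    */2 * * * * root /usr/sbin/csync2 -x

For near-instant propagation, the same csync2 -x call can instead be triggered from an inotify loop like the rsync example earlier in the thread.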
From Mark.Vallevand at UNISYS.com Fri Apr 4 14:48:58 2014 From: Mark.Vallevand at UNISYS.com (Vallevand, Mark K) Date: Fri, 4 Apr 2014 09:48:58 -0500 Subject: [Linux-cluster] Simple data replication in a cluster In-Reply-To: <533DD7E5.2020709@alldigital.com> References: <99C8B2929B39C24493377AC7A121E21FC5E47F3C5D@USEA-EXCH8.na.uis.unisys.com> <533DD7E5.2020709@alldigital.com> Message-ID: <99C8B2929B39C24493377AC7A121E21FC5E48A8F02@USEA-EXCH8.na.uis.unisys.com> Yup. I've considered similar. Thanks! Regards. Mark K Vallevand?? Mark.Vallevand at Unisys.com May you live in interesting times, may you come to the attention of important people and may all your wishes come true. THIS COMMUNICATION MAY CONTAIN CONFIDENTIAL AND/OR OTHERWISE PROPRIETARY MATERIAL and is thus for use only by the intended recipient. If you received this in error, please contact the sender and delete the e-mail and its attachments from all computers. -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Rick Stevens Sent: Thursday, April 03, 2014 04:52 PM To: linux clustering Subject: Re: [Linux-cluster] Simple data replication in a cluster On 04/03/2014 01:58 PM, Vallevand, Mark K issued this missive: > I'm looking for a simple way to replicate data within a cluster. > > It looks like my resources will be self-configuring and may need to push > changes they see to all nodes in the cluster. The idea being that when > a node crashes, the resource will have its configuration present on the > node on which it is restarted. We're talking about a few kb of data, > probably in one file, probably text. A typical cluster would have > multiple resources (more than two), one resource per node and one extra > node. > > Ideas? > > Could I use the CIB directly to replicate data? Use cibadmin to update > something and sync? > > How big can a resource parameter be? Could a resource modify its > parameters so that they are replicated throughout the cluster? > > Is there a simple file replication Resource Agent? > > Drdb seems like overkill. If you're OK with it and it's a small group of files/directories, why not use something like inotifywait and have it run a script that rsyncs the altered files to the other nodes when the files change? I've done it before and it works pretty well. -- ---------------------------------------------------------------------- - Rick Stevens, Systems Engineer, AllDigital ricks at alldigital.com - - AIM/Skype: therps2 ICQ: 22643734 Yahoo: origrps2 - - - - Never eat anything larger than your head - ---------------------------------------------------------------------- -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster From bjoern.teipel at internetbrands.com Mon Apr 7 07:26:43 2014 From: bjoern.teipel at internetbrands.com (Bjoern Teipel) Date: Mon, 7 Apr 2014 00:26:43 -0700 Subject: [Linux-cluster] DLM nodes disconnected issue Message-ID: H all, i did a dlm_tool leave clvmd on one node (node06) of a CMAN cluster with CLVMD Now I have the problem that clvmd is stuck and all nodes lost connections to DLM. For some reason dlm want's to fence member 8 I guess and that might stuck the whole dlm? All other stacks, cman, corosync look fine... 
Thanks, Bjoern Error: dlm: closing connection to node 2 dlm: closing connection to node 3 dlm: closing connection to node 4 dlm: closing connection to node 5 dlm: closing connection to node 6 dlm: closing connection to node 8 dlm: closing connection to node 9 dlm: closing connection to node 10 dlm: closing connection to node 2 dlm: closing connection to node 3 dlm: closing connection to node 4 dlm: closing connection to node 5 dlm: closing connection to node 6 dlm: closing connection to node 8 dlm: closing connection to node 9 dlm: closing connection to node 10 INFO: task dlm_tool:33699 blocked for more than 120 seconds. "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. dlm_tool D 0000000000000003 0 33699 33698 0x00000080 ffff88138905dcc0 0000000000000082 ffffffff81168043 ffff88138905dd18 ffff88138905dd08 ffff88305b30ccc0 ffff88304fa5c800 ffff883058e49900 ffff881857329058 ffff88138905dfd8 000000000000fb88 ffff881857329058 Call Trace: [] ? kmem_cache_alloc_trace+0x1a3/0x1b0 [] ? misc_open+0x1ca/0x320 [] rwsem_down_failed_common+0x95/0x1d0 [] ? chrdev_open+0x125/0x230 [] rwsem_down_read_failed+0x26/0x30 [] ? __dentry_open+0x23f/0x360 [] call_rwsem_down_read_failed+0x14/0x30 [] ? down_read+0x24/0x30 [] dlm_clear_proc_locks+0x3d/0x2a0 [dlm] [] ? generic_acl_chmod+0x46/0xd0 [] device_close+0x66/0xc0 [dlm] [] __fput+0xf5/0x210 [] fput+0x25/0x30 [] filp_close+0x5d/0x90 [] sys_close+0xa5/0x100 [] system_call_fastpath+0x16/0x1b Status: cman_tool nodes Node Sts Inc Joined Name 1 M 18908 2014-03-24 19:01:00 node01 2 M 18972 2014-04-06 22:47:57 node02 3 M 18972 2014-04-06 22:47:57 node03 4 M 18972 2014-04-06 22:47:57 node04 5 M 18972 2014-04-06 22:47:57 node05 6 X 18960 node06 7 X 18928 node07 8 M 18972 2014-04-06 22:47:57 node08 9 M 18972 2014-04-06 22:47:57 node09 10 M 18972 2014-04-06 22:47:57 node10 dlm lockspaces name clvmd id 0x4104eefa flags 0x00000004 kern_stop change member 8 joined 0 remove 1 failed 0 seq 11,11 members 1 2 3 4 5 8 9 10 new change member 8 joined 1 remove 0 failed 0 seq 12,41 new status wait_messages 0 wait_condition 1 fencing new members 1 2 3 4 5 8 9 10 DLM dump: 1396849677 cluster node 2 added seq 18972 1396849677 set_configfs_node 2 10.14.18.66 local 0 1396849677 cluster node 3 added seq 18972 1396849677 set_configfs_node 3 10.14.18.67 local 0 1396849677 cluster node 4 added seq 18972 1396849677 set_configfs_node 4 10.14.18.68 local 0 1396849677 cluster node 5 added seq 18972 1396849677 set_configfs_node 5 10.14.18.70 local 0 1396849677 cluster node 8 added seq 18972 1396849677 set_configfs_node 8 10.14.18.80 local 0 1396849677 cluster node 9 added seq 18972 1396849677 set_configfs_node 9 10.14.18.81 local 0 1396849677 cluster node 10 added seq 18972 1396849677 set_configfs_node 10 10.14.18.77 local 0 1396849677 dlm:ls:clvmd conf 2 1 0 memb 1 3 join 3 left 1396849677 clvmd add_change cg 35 joined nodeid 3 1396849677 clvmd add_change cg 35 counts member 2 joined 1 remove 0 failed 0 1396849677 dlm:ls:clvmd conf 3 1 0 memb 1 2 3 join 2 left 1396849677 clvmd add_change cg 36 joined nodeid 2 1396849677 clvmd add_change cg 36 counts member 3 joined 1 remove 0 failed 0 1396849677 dlm:ls:clvmd conf 4 1 0 memb 1 2 3 9 join 9 left 1396849677 clvmd add_change cg 37 joined nodeid 9 1396849677 clvmd add_change cg 37 counts member 4 joined 1 remove 0 failed 0 1396849677 dlm:ls:clvmd conf 5 1 0 memb 1 2 3 8 9 join 8 left 1396849677 clvmd add_change cg 38 joined nodeid 8 1396849677 clvmd add_change cg 38 counts member 5 joined 1 remove 0 failed 0 1396849677 
dlm:ls:clvmd conf 6 1 0 memb 1 2 3 8 9 10 join 10 left 1396849677 clvmd add_change cg 39 joined nodeid 10 1396849677 clvmd add_change cg 39 counts member 6 joined 1 remove 0 failed 0 1396849677 dlm:ls:clvmd conf 7 1 0 memb 1 2 3 5 8 9 10 join 5 left 1396849677 clvmd add_change cg 40 joined nodeid 5 1396849677 clvmd add_change cg 40 counts member 7 joined 1 remove 0 failed 0 1396849677 dlm:ls:clvmd conf 8 1 0 memb 1 2 3 4 5 8 9 10 join 4 left 1396849677 clvmd add_change cg 41 joined nodeid 4 1396849677 clvmd add_change cg 41 counts member 8 joined 1 remove 0 failed 0 1396849677 dlm:controld conf 2 1 0 memb 1 3 join 3 left 1396849677 dlm:controld conf 3 1 0 memb 1 2 3 join 2 left 1396849677 dlm:controld conf 4 1 0 memb 1 2 3 9 join 9 left 1396849677 dlm:controld conf 5 1 0 memb 1 2 3 8 9 join 8 left 1396849677 dlm:controld conf 6 1 0 memb 1 2 3 8 9 10 join 10 left 1396849677 dlm:controld conf 7 1 0 memb 1 2 3 5 8 9 10 join 5 left 1396849677 dlm:controld conf 8 1 0 memb 1 2 3 4 5 8 9 10 join 4 left From emi2fast at gmail.com Mon Apr 7 07:44:03 2014 From: emi2fast at gmail.com (emmanuel segura) Date: Mon, 7 Apr 2014 09:44:03 +0200 Subject: [Linux-cluster] DLM nodes disconnected issue In-Reply-To: References: Message-ID: your fencing is working ? because i see this from your dlm lockspace "new status wait_messages 0 wait_condition 1 fencing". 2014-04-07 9:26 GMT+02:00 Bjoern Teipel : > H all, > > i did a dlm_tool leave clvmd on one node (node06) of a CMAN cluster with > CLVMD > Now I have the problem that clvmd is stuck and all nodes lost > connections to DLM. > For some reason dlm want's to fence member 8 I guess and that might > stuck the whole dlm? > All other stacks, cman, corosync look fine... > > Thanks, > Bjoern > > Error: > > dlm: closing connection to node 2 > dlm: closing connection to node 3 > dlm: closing connection to node 4 > dlm: closing connection to node 5 > dlm: closing connection to node 6 > dlm: closing connection to node 8 > dlm: closing connection to node 9 > dlm: closing connection to node 10 > dlm: closing connection to node 2 > dlm: closing connection to node 3 > dlm: closing connection to node 4 > dlm: closing connection to node 5 > dlm: closing connection to node 6 > dlm: closing connection to node 8 > dlm: closing connection to node 9 > dlm: closing connection to node 10 > INFO: task dlm_tool:33699 blocked for more than 120 seconds. > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > dlm_tool D 0000000000000003 0 33699 33698 0x00000080 > ffff88138905dcc0 0000000000000082 ffffffff81168043 ffff88138905dd18 > ffff88138905dd08 ffff88305b30ccc0 ffff88304fa5c800 ffff883058e49900 > ffff881857329058 ffff88138905dfd8 000000000000fb88 ffff881857329058 > Call Trace: > [] ? kmem_cache_alloc_trace+0x1a3/0x1b0 > [] ? misc_open+0x1ca/0x320 > [] rwsem_down_failed_common+0x95/0x1d0 > [] ? chrdev_open+0x125/0x230 > [] rwsem_down_read_failed+0x26/0x30 > [] ? __dentry_open+0x23f/0x360 > [] call_rwsem_down_read_failed+0x14/0x30 > [] ? down_read+0x24/0x30 > [] dlm_clear_proc_locks+0x3d/0x2a0 [dlm] > [] ? 
generic_acl_chmod+0x46/0xd0 > [] device_close+0x66/0xc0 [dlm] > [] __fput+0xf5/0x210 > [] fput+0x25/0x30 > [] filp_close+0x5d/0x90 > [] sys_close+0xa5/0x100 > [] system_call_fastpath+0x16/0x1b > > > > Status: > > cman_tool nodes > Node Sts Inc Joined Name > 1 M 18908 2014-03-24 19:01:00 node01 > 2 M 18972 2014-04-06 22:47:57 node02 > 3 M 18972 2014-04-06 22:47:57 node03 > 4 M 18972 2014-04-06 22:47:57 node04 > 5 M 18972 2014-04-06 22:47:57 node05 > 6 X 18960 node06 > 7 X 18928 node07 > 8 M 18972 2014-04-06 22:47:57 node08 > 9 M 18972 2014-04-06 22:47:57 node09 > 10 M 18972 2014-04-06 22:47:57 node10 > > dlm lockspaces > name clvmd > id 0x4104eefa > flags 0x00000004 kern_stop > change member 8 joined 0 remove 1 failed 0 seq 11,11 > members 1 2 3 4 5 8 9 10 > new change member 8 joined 1 remove 0 failed 0 seq 12,41 > new status wait_messages 0 wait_condition 1 fencing > new members 1 2 3 4 5 8 9 10 > > > > DLM dump: > 1396849677 cluster node 2 added seq 18972 > 1396849677 set_configfs_node 2 10.14.18.66 local 0 > 1396849677 cluster node 3 added seq 18972 > 1396849677 set_configfs_node 3 10.14.18.67 local 0 > 1396849677 cluster node 4 added seq 18972 > 1396849677 set_configfs_node 4 10.14.18.68 local 0 > 1396849677 cluster node 5 added seq 18972 > 1396849677 set_configfs_node 5 10.14.18.70 local 0 > 1396849677 cluster node 8 added seq 18972 > 1396849677 set_configfs_node 8 10.14.18.80 local 0 > 1396849677 cluster node 9 added seq 18972 > 1396849677 set_configfs_node 9 10.14.18.81 local 0 > 1396849677 cluster node 10 added seq 18972 > 1396849677 set_configfs_node 10 10.14.18.77 local 0 > 1396849677 dlm:ls:clvmd conf 2 1 0 memb 1 3 join 3 left > 1396849677 clvmd add_change cg 35 joined nodeid 3 > 1396849677 clvmd add_change cg 35 counts member 2 joined 1 remove 0 failed > 0 > 1396849677 dlm:ls:clvmd conf 3 1 0 memb 1 2 3 join 2 left > 1396849677 clvmd add_change cg 36 joined nodeid 2 > 1396849677 clvmd add_change cg 36 counts member 3 joined 1 remove 0 failed > 0 > 1396849677 dlm:ls:clvmd conf 4 1 0 memb 1 2 3 9 join 9 left > 1396849677 clvmd add_change cg 37 joined nodeid 9 > 1396849677 clvmd add_change cg 37 counts member 4 joined 1 remove 0 failed > 0 > 1396849677 dlm:ls:clvmd conf 5 1 0 memb 1 2 3 8 9 join 8 left > 1396849677 clvmd add_change cg 38 joined nodeid 8 > 1396849677 clvmd add_change cg 38 counts member 5 joined 1 remove 0 failed > 0 > 1396849677 dlm:ls:clvmd conf 6 1 0 memb 1 2 3 8 9 10 join 10 left > 1396849677 clvmd add_change cg 39 joined nodeid 10 > 1396849677 clvmd add_change cg 39 counts member 6 joined 1 remove 0 failed > 0 > 1396849677 dlm:ls:clvmd conf 7 1 0 memb 1 2 3 5 8 9 10 join 5 left > 1396849677 clvmd add_change cg 40 joined nodeid 5 > 1396849677 clvmd add_change cg 40 counts member 7 joined 1 remove 0 failed > 0 > 1396849677 dlm:ls:clvmd conf 8 1 0 memb 1 2 3 4 5 8 9 10 join 4 left > 1396849677 clvmd add_change cg 41 joined nodeid 4 > 1396849677 clvmd add_change cg 41 counts member 8 joined 1 remove 0 failed > 0 > 1396849677 dlm:controld conf 2 1 0 memb 1 3 join 3 left > 1396849677 dlm:controld conf 3 1 0 memb 1 2 3 join 2 left > 1396849677 dlm:controld conf 4 1 0 memb 1 2 3 9 join 9 left > 1396849677 dlm:controld conf 5 1 0 memb 1 2 3 8 9 join 8 left > 1396849677 dlm:controld conf 6 1 0 memb 1 2 3 8 9 10 join 10 left > 1396849677 dlm:controld conf 7 1 0 memb 1 2 3 5 8 9 10 join 5 left > 1396849677 dlm:controld conf 8 1 0 memb 1 2 3 4 5 8 9 10 join 4 left > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > 
https://www.redhat.com/mailman/listinfo/linux-cluster > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... URL: From mgrac at redhat.com Mon Apr 7 12:39:22 2014 From: mgrac at redhat.com (Marek Grac) Date: Mon, 07 Apr 2014 14:39:22 +0200 Subject: [Linux-cluster] fence-agents-4.0.8 stable release Message-ID: <53429C7A.9050905@redhat.com> Welcome to the fence-agents 4.0.8 release. This release includes new fence agent for Raritan and several bugfixes: * fence_wti respects delay option in telnet connections * fixed problem when using identity file for login via ssh * correct values in manual pages for symlinks * allow SSl connection to fallback to SSL 3.0 (--notls) used for HP iLO2 The new source tarball can be downloaded here: https://fedorahosted.org/releases/f/e/fence-agents/fence-agents-4.0.8.tar.xz To report bugs or issues: https://bugzilla.redhat.com/ Would you like to meet the cluster team or members of its community? Join us on IRC (irc.freenode.net #linux-cluster) and share your experience with other sysadministrators or power users. Thanks/congratulations to all people that contributed to achieve this great milestone. m, From lists at alteeve.ca Tue Apr 15 21:03:55 2014 From: lists at alteeve.ca (Digimer) Date: Tue, 15 Apr 2014 17:03:55 -0400 Subject: [Linux-cluster] KVM Live migration when node's FS is read-only Message-ID: <534D9EBB.80200@alteeve.ca> Hi all, So I hit a weird issue last week... (EL6 + cman + rgamanager + drbd) For reasons unknown, a client thought they could start yanking and replacing hard drives on a running node. Obviously, that did not end well. The VMs that had been running on the node continues to operate fine and they just started using the peer's storage. The problem came when I tried to live-migrate the VMs over to the still-good node. Obviously, the old host couldn't write to logs, and the live-migration failed. Once failed, rgmanager also stopped working once the migration failed. In the end, I had to manually fence the node (corosync never failed, so it didn't get automatically fenced). This obviously caused the VMs running on the node to reboot, causing a ~40 second outage. It strikes me that the system *should* have been able to migrate, had it not tried to write to the logs. Is there a way, or can there be made a way, to migrate VMs off of a node whose underlying FS is read-only/corrupt/destroyed, so long as the programs in memory are still working? I am sure this is part a part rgmanager, part KVM/qemu question. Thanks for any feedback! -- Digimer Papers and Projects: https://alteeve.ca/w/ What if the cure for cancer is trapped in the mind of a person without access to education? From david.l.henley at hp.com Thu Apr 17 13:20:11 2014 From: david.l.henley at hp.com (Henley, David (Solutions Architect Chicago)) Date: Thu, 17 Apr 2014 13:20:11 +0000 Subject: [Linux-cluster] KVM availability groups Message-ID: <9195F18F518EC7428E0397625DF6E1AE1890FF01@G2W2431.americas.hpqcorp.net> I have 8 to 10 Rack mount Servers running Red Hat KVM. I need to create 2 availability zones and a backup zone. 1. What tools do you use to create these? Is it always scripted or is there an open source interface similar to say Vcenter. 2. Are there KVM tools that monitor the zones? Thanks Dave David Henley Solutions Architect Hewlett-Packard Company +1 815 341 2463 dhenley at hp.com -------------- next part -------------- An HTML attachment was scrubbed... 
URL: From morpheus.ibis at gmail.com Thu Apr 17 19:16:22 2014 From: morpheus.ibis at gmail.com (Pavel Herrmann) Date: Thu, 17 Apr 2014 21:16:22 +0200 Subject: [Linux-cluster] KVM availability groups In-Reply-To: <9195F18F518EC7428E0397625DF6E1AE1890FF01@G2W2431.americas.hpqcorp.net> References: <9195F18F518EC7428E0397625DF6E1AE1890FF01@G2W2431.americas.hpqcorp.net> Message-ID: <2179081.39Bc0pasea@bloomfield> Hi, I am not an expert in this, but as far as i understand it works like this On Thursday 17 of April 2014 13:20:11 Henley, David wrote: > I have 8 to 10 Rack mount Servers running Red Hat KVM. > I need to create 2 availability zones and a backup zone. > > > 1. What tools do you use to create these? Is it always scripted or is > there an open source interface similar to say Vcenter. There are vcenter-like interfaces, but I'm not sure how they handle HA, have a look at ganeti and/or openstack this list is rather more concerned about the low level workings of clustered systems, with tools such as cman or pacemaker (depending on your OS version, I think all current RHEL versions use cman) to monitor and manage availability of your services (a VM is a service in this context), and corosync to keep your cluster in a consistent state. if you are looking for a vsphere replacement, you might have better luck with openstack than tinkering with linux clustering directly, in my opinion. > 2. Are there KVM tools that monitor the zones? You would probably use libvirt interface to manipulate with your KVM instances regards, Pavel Herrmann From mgalan at ujaen.es Fri Apr 18 17:44:57 2014 From: mgalan at ujaen.es (=?ISO-8859-1?Q?Manuel_Gal=E1n_=28UJA=29?=) Date: Fri, 18 Apr 2014 19:44:57 +0200 Subject: [Linux-cluster] Linux-cluster Digest, Vol 120, Issue 5 Message-ID: Hello all,? ? ? What about ovirt? visit ovirt.org Good weekend... Enviado de Samsung Mobile -------- Mensaje original -------- De: linux-cluster-request at redhat.com Fecha: 18/04/2014 18:00 (GMT+01:00) Para: linux-cluster at redhat.com Asunto: Linux-cluster Digest, Vol 120, Issue 5 Send Linux-cluster mailing list submissions to linux-cluster at redhat.com To subscribe or unsubscribe via the World Wide Web, visit https://www.redhat.com/mailman/listinfo/linux-cluster or, via email, send a message with subject or body 'help' to linux-cluster-request at redhat.com You can reach the person managing the list at linux-cluster-owner at redhat.com When replying, please edit your Subject line so it is more specific than "Re: Contents of Linux-cluster digest..." Today's Topics: ?? 1. Re: KVM availability groups (Pavel Herrmann) ---------------------------------------------------------------------- Message: 1 Date: Thu, 17 Apr 2014 21:16:22 +0200 From: Pavel Herrmann To: linux-cluster at redhat.com Cc: "Henley, David \(Solutions Architect Chicago\)" Subject: Re: [Linux-cluster] KVM availability groups Message-ID: <2179081.39Bc0pasea at bloomfield> Content-Type: text/plain; charset="us-ascii" Hi, I am not an expert in this, but as far as i understand it works like this On Thursday 17 of April 2014 13:20:11 Henley, David wrote: > I have 8 to 10 Rack mount Servers running Red Hat KVM. > I need to create 2 availability zones and a backup zone. > > > 1.?????? What tools do you use to create these? Is it always scripted or is > there an open source interface similar to say Vcenter. 
There are vcenter-like interfaces, but I'm not sure how they handle HA, have a look at ganeti and/or openstack this list is rather more concerned about the low level workings of clustered systems, with tools such as cman or pacemaker (depending on your OS version, I think all current RHEL versions use cman) to monitor and manage availability of your services (a VM is a service in this context), and corosync to keep your cluster in a consistent state. if you are looking for a vsphere replacement, you might have better luck with openstack than tinkering with linux clustering directly, in my opinion. > 2.?????? Are there KVM tools that monitor the zones? You would probably use libvirt interface to manipulate with your KVM instances regards, Pavel Herrmann ------------------------------ -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster End of Linux-cluster Digest, Vol 120, Issue 5 ********************************************* -------------- next part -------------- An HTML attachment was scrubbed... URL: From swhiteho at redhat.com Fri Apr 25 11:42:59 2014 From: swhiteho at redhat.com (Steven Whitehouse) Date: Fri, 25 Apr 2014 12:42:59 +0100 Subject: [Linux-cluster] mixing OS versions? In-Reply-To: <53593BD9.4050602@ucl.ac.uk> References: <12440.1396024637@localhost> <5335CE1A.3060509@redhat.com> <5335F2B4.6080605@mssl.ucl.ac.uk> <1396179266.2659.30.camel@menhir> <53593BD9.4050602@ucl.ac.uk> Message-ID: <535A4A43.8000005@redhat.com> Hi, On 24/04/14 17:29, Alan Brown wrote: > On 30/03/14 12:34, Steven Whitehouse wrote: > >> Well that is not entirely true. We have done a great deal of >> investigation into this issue. We do test quotas (among many other >> things) on each release to ensure that they are working. Our tests have >> all passed correctly, and to date you have provided the only report of >> this particular issue via our support team. So it is certainly not >> something that lots of people are hitting. > > Someone else reported it on this list (on centos), so we're not an > isolated case. > >> We do now have a good idea of where the issue is. However it is clear >> that simply exceeding quotas is not enough to trigger it. Instead quotas >> need to be exceeded in a particular way. > > My suspicion is that it's some kind of interaction between quotas and > NFS, but it'd be good if you could provide a fuller explanation. > Yes, thats what we thought to start with... however that turned out to be a bit of a red herring. Or at least the issue has nothing specifically to do with NFS. The problem was related to when quota was exceeded, and specifically what operation was in progress. You could write to files as often as you wanted to, and exceeding quota would be handled correctly. The problem was a specific code path within the inode creation code, if it didn't result in quota being exceeded on that one specific code path, then everything would work as expected. Also, quite often when the problem did appear, it did not actually trigger a problem until later, making it difficult to track down. You are correct that someone else reported the issue on the list, however I'm not aware of any other reports beyond yours and theirs. Also, this was specific to certain versions of GFS2, and not something that relates to all versions. 
The upstream patch is here: http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/gfs2?id=059788039f1e6343f34f46d202f8d9f2158c2783 It should be available in RHEL shortly - please ping support via the ticket for updates, Steve. >> Returning to the original point however, it is certainly not recommended >> to have mixed RHEL or CentOS versions running in the same cluster. It is >> much better to keep everything the same, even though the GFS2 on-disk >> format has not changed between the versions. > > More specfically (for those who are curious): Whilst the on-disk > format has not changed between EL5 and EL6, the way that RH cluster > members communicate with each other has. > > I ran a quick test some time back and the 2 different OS cluster > versions didn't see each other for LAN heartbeating. > > > From Micah.Schaefer at jhuapl.edu Fri Apr 25 14:05:37 2014 From: Micah.Schaefer at jhuapl.edu (Schaefer, Micah) Date: Fri, 25 Apr 2014 10:05:37 -0400 Subject: [Linux-cluster] iSCSI GFS2 CMIRRORD Message-ID: Hello All, I have been successfully running a cluster for about a year. I have a question about best practice for my storage setup. Currently, I have 2 front end nodes and two back end nodes. The front end nodes are part of the cluster, run all the services, etc. The back end nodes are only exporting raw block devices via iSCSI and are not cluster aware. The front end import the raw block and use GFS2 with LVM for storage. At this time, I am only using the block devices from one of the back end nodes. I would like the LVMs to be mirrored across the two iSCSI devices, creating redundancy at the block level. The last time I tried this, when creating the LVM, it basically sat for 2 days making no progress. I now have 10GB network connections at my front end and back end nodes (was 1GB only before). Also, on topology, these 4 nodes are across 2 buildings, 1 front end and 1 back end in each building. There are switches in each building that have layer 2 connectivity (10GB) to each other. I also have 2 each 10GB connections per node, and multiple 1GB connections per node. I have come up with the following scenarios, and am looking for advise on which of these methods to use (or none). 1: * Connect all nodes to the 10GB switches. * Use 1 10GB for iSCSI only and 1 for other ip traffic 2: * Connect each back end node to each from end node via 10GB * Use 1GB for other ip traffic 3: * Connect the front end nodes to each other via 10GB * Connect front end and back end nodes to 10GB switch for Ip traffic I am also willing to use device mapper multi path if needed. Thanks in advance for any assistance. Regards, ------- Micah Schaefer JHU/ APL -------------- next part -------------- An HTML attachment was scrubbed... URL: From emi2fast at gmail.com Fri Apr 25 16:12:36 2014 From: emi2fast at gmail.com (emmanuel segura) Date: Fri, 25 Apr 2014 18:12:36 +0200 Subject: [Linux-cluster] iSCSI GFS2 CMIRRORD In-Reply-To: References: Message-ID: you can use multipath when the system see a lun from more than one path, but you your case are importing two differents devices from your backend servers in your frontend server, sou you can use lvm mirror with cmirror in your fronted cluster 2014-04-25 16:05 GMT+02:00 Schaefer, Micah : > Hello All, > I have been successfully running a cluster for about a year. I have a > question about best practice for my storage setup. > > Currently, I have 2 front end nodes and two back end nodes. 
The front end > nodes are part of the cluster, run all the services, etc. The back end > nodes are only exporting raw block devices via iSCSI and are not cluster > aware. The front end import the raw block and use GFS2 with LVM for > storage. At this time, I am only using the block devices from one of the > back end nodes. > > I would like the LVMs to be mirrored across the two iSCSI devices, > creating redundancy at the block level. The last time I tried this, when > creating the LVM, it basically sat for 2 days making no progress. I now > have 10GB network connections at my front end and back end nodes (was 1GB > only before). > > Also, on topology, these 4 nodes are across 2 buildings, 1 front end and 1 > back end in each building. There are switches in each building that have > layer 2 connectivity (10GB) to each other. I also have 2 each 10GB > connections per node, and multiple 1GB connections per node. > > I have come up with the following scenarios, and am looking for advise on > which of these methods to use (or none). > > 1: > > - Connect all nodes to the 10GB switches. > - Use 1 10GB for iSCSI only and 1 for other ip traffic > > 2: > > - Connect each back end node to each from end node via 10GB > - Use 1GB for other ip traffic > > 3: > > - Connect the front end nodes to each other via 10GB > - Connect front end and back end nodes to 10GB switch for Ip traffic > > I am also willing to use device mapper multi path if needed. > > Thanks in advance for any assistance. > > Regards, > ------- > Micah Schaefer > JHU/ APL > > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster > -- esta es mi vida e me la vivo hasta que dios quiera -------------- next part -------------- An HTML attachment was scrubbed... URL: From neale at sinenomine.net Fri Apr 25 19:13:33 2014 From: neale at sinenomine.net (Neale Ferguson) Date: Fri, 25 Apr 2014 19:13:33 +0000 Subject: [Linux-cluster] luci question Message-ID: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> Hi, One of the guys created a simple configuration and was attempting to use luci to administer the cluster. It comes up fine but the links "Admin ... Logout" at the top left of the window that usually appears is not appearing. Looking at the code in the header html I see the following:
[luci header template markup scrubbed by the list archive; the menu items shown were Admin, Preferences, Logout and Login]
  • What affects (or effects) the tg.auth_stack_enabled value? I assume its some browser setting but really have no clue. Neale From vinh.cao at hp.com Fri Apr 25 21:37:30 2014 From: vinh.cao at hp.com (Cao, Vinh) Date: Fri, 25 Apr 2014 21:37:30 +0000 Subject: [Linux-cluster] luci question In-Reply-To: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> References: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> Message-ID: What type of the browser are you using? I have the same issue with IE. But if I use Firefox. It's there for me. I'm hoping that is it what you are looking for. Vinh -----Original Message----- From: linux-cluster-bounces at redhat.com [mailto:linux-cluster-bounces at redhat.com] On Behalf Of Neale Ferguson Sent: Friday, April 25, 2014 3:14 PM To: linux clustering Subject: [Linux-cluster] luci question Hi, One of the guys created a simple configuration and was attempting to use luci to administer the cluster. It comes up fine but the links "Admin ... Logout" at the top left of the window that usually appears is not appearing. Looking at the code in the header html I see the following:
[quoted luci header template markup scrubbed by the list archive; the menu items were Admin, Preferences, Logout and Login]
  • What affects (or effects) the tg.auth_stack_enabled value? I assume its some browser setting but really have no clue. Neale -- Linux-cluster mailing list Linux-cluster at redhat.com https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- A non-text attachment was scrubbed... Name: smime.p7s Type: application/pkcs7-signature Size: 6168 bytes Desc: not available URL: From morpheus.ibis at gmail.com Sat Apr 26 00:06:51 2014 From: morpheus.ibis at gmail.com (Pavel Herrmann) Date: Sat, 26 Apr 2014 02:06:51 +0200 Subject: [Linux-cluster] mixing OS versions? In-Reply-To: <535A4A43.8000005@redhat.com> References: <12440.1396024637@localhost> <53593BD9.4050602@ucl.ac.uk> <535A4A43.8000005@redhat.com> Message-ID: <1787478.6mil9TdHqg@bloomfield> Hi, On Friday 25 of April 2014 12:42:59 Steven Whitehouse wrote: > Hi, > > On 24/04/14 17:29, Alan Brown wrote: > > On 30/03/14 12:34, Steven Whitehouse wrote: > >> Well that is not entirely true. We have done a great deal of > >> investigation into this issue. We do test quotas (among many other > >> things) on each release to ensure that they are working. Our tests have > >> all passed correctly, and to date you have provided the only report of > >> this particular issue via our support team. So it is certainly not > >> something that lots of people are hitting. > > > > Someone else reported it on this list (on centos), so we're not an > > isolated case. > > > >> We do now have a good idea of where the issue is. However it is clear > >> that simply exceeding quotas is not enough to trigger it. Instead quotas > >> need to be exceeded in a particular way. > > > > My suspicion is that it's some kind of interaction between quotas and > > NFS, but it'd be good if you could provide a fuller explanation. > > Yes, thats what we thought to start with... however that turned out to > be a bit of a red herring. Or at least the issue has nothing > specifically to do with NFS. The problem was related to when quota was > exceeded, and specifically what operation was in progress. You could > write to files as often as you wanted to, and exceeding quota would be > handled correctly. The problem was a specific code path within the inode > creation code, if it didn't result in quota being exceeded on that one > specific code path, then everything would work as expected. could you please provide a (somewhat reliable) test case to reproduce this bug? I have looked at the patch, and found nothing obviously related to quotas (it seems the patch only changes the fail-path of posix_acl_create() call, which doesn't appear to have nothing to do with quotas) I have been facing a possibly quota-related oops in GFS2 for some time, which I am unable to reproduce without switching my cluster to production use (which means potentialy facing the anger of my users, which I'd rather not do without at least a chance of the issue being fixed). sadly, I don't have RedHat support subscription (nor do I use RHEL or derivates), my kernel is mostly upstream. thanks Pavel Herrmann > > Also, quite often when the problem did appear, it did not actually > trigger a problem until later, making it difficult to track down. > > You are correct that someone else reported the issue on the list, > however I'm not aware of any other reports beyond yours and theirs. > Also, this was specific to certain versions of GFS2, and not something > that relates to all versions. 
> > The upstream patch is here: > http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/fs/gfs > 2?id=059788039f1e6343f34f46d202f8d9f2158c2783 > > It should be available in RHEL shortly - please ping support via the > ticket for updates, > > Steve. > > >> Returning to the original point however, it is certainly not recommended > >> to have mixed RHEL or CentOS versions running in the same cluster. It is > >> much better to keep everything the same, even though the GFS2 on-disk > >> format has not changed between the versions. > > > > More specfically (for those who are curious): Whilst the on-disk > > format has not changed between EL5 and EL6, the way that RH cluster > > members communicate with each other has. > > > > I ran a quick test some time back and the 2 different OS cluster > > versions didn't see each other for LAN heartbeating. From neale at sinenomine.net Tue Apr 29 14:01:18 2014 From: neale at sinenomine.net (Neale Ferguson) Date: Tue, 29 Apr 2014 14:01:18 +0000 Subject: [Linux-cluster] luci question In-Reply-To: References: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> Message-ID: <62696CB1-2791-4F09-95D5-03AF87F096AD@sinenomine.net> Thanks Vinh. He is using IE8 (company policy!!). I've tried it with IE8, IE10, Chrome, and Safari and all worked fine. He has cookies enabled so I'm at a loss as to how that auth_stack_enabled setting is set/updated/cleared. Neale On Apr 25, 2014, at 5:37 PM, Cao, Vinh wrote: > What type of the browser are you using? > I have the same issue with IE. But if I use Firefox. It's there for me. > I'm hoping that is it what you are looking for. > > Vinh > -----Original Message----- > Hi, > One of the guys created a simple configuration and was attempting to use > luci to administer the cluster. It comes up fine but the links "Admin ... > Logout" at the top left of the window that usually appears is not appearing. > Looking at the code in the header html I see the following: > > > >
  • class="${('', 'active')[defined('page') and > page==page=='admin']}">Admin
  • >
  • class="${('', 'active')[defined('page') and > page==page=='prefs']}">Preferences
  • >
  • href="${tg.url('/logout_handler')}">Logout
  • >
    >
  • href="${tg.url('/login')}">Login
  • >
    > > What affects (or effects) the tg.auth_stack_enabled value? I assume its some > browser setting but really have no clue. > > Neale From jpokorny at redhat.com Tue Apr 29 15:18:21 2014 From: jpokorny at redhat.com (Jan =?utf-8?Q?Pokorn=C3=BD?=) Date: Tue, 29 Apr 2014 17:18:21 +0200 Subject: [Linux-cluster] luci question In-Reply-To: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> References: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> Message-ID: <20140429151821.GA4367@redhat.com> Hello Neal, On 25/04/14 19:13 +0000, Neale Ferguson wrote: > One of the guys created a simple configuration and was attempting > to use luci to administer the cluster. It comes up fine but the > links "Admin ... Logout" at the top left of the window that usually > appears is not appearing. Looking at the code in the header html I > see the following: > > > >
> [quoted luci header template markup scrubbed by the list archive; the menu items were Admin, Preferences, Logout and Login]
    > > What affects (or effects) the tg.auth_stack_enabled value? I assume > its some browser setting but really have no clue. could you be more specific as to which versions of luci, TurboGears and repoze.who? In RHEL-like distros, the latter map to TurboGears2 and python-repoze-who packages. I don't recall any issue like what you described. -- Jan -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: not available URL: From neale at sinenomine.net Tue Apr 29 16:02:17 2014 From: neale at sinenomine.net (Neale Ferguson) Date: Tue, 29 Apr 2014 16:02:17 +0000 Subject: [Linux-cluster] luci question In-Reply-To: <20140429151821.GA4367@redhat.com> References: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> <20140429151821.GA4367@redhat.com> Message-ID: luci-0.26.0-48 (tried -13 as well) TurboGears2-2.0.3-4. kernel-2.6.32-358.2.1 python-repoze-who-1.0.18-1 (I believe - am verifying) On Apr 29, 2014, at 11:18 AM, Jan Pokorn? wrote: > could you be more specific as to which versions of luci, TurboGears > and repoze.who? In RHEL-like distros, the latter map to TurboGears2 > and python-repoze-who packages. > > I don't recall any issue like what you described. > > -- > Jan > -- > Linux-cluster mailing list > Linux-cluster at redhat.com > https://www.redhat.com/mailman/listinfo/linux-cluster -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 496 bytes Desc: Message signed with OpenPGP using GPGMail URL: From jpokorny at redhat.com Tue Apr 29 17:17:40 2014 From: jpokorny at redhat.com (Jan =?utf-8?Q?Pokorn=C3=BD?=) Date: Tue, 29 Apr 2014 19:17:40 +0200 Subject: [Linux-cluster] luci question In-Reply-To: References: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> <20140429151821.GA4367@redhat.com> Message-ID: <20140429171740.GB4367@redhat.com> On 29/04/14 16:02 +0000, Neale Ferguson wrote: > luci-0.26.0-48 (tried -13 as well) > TurboGears2-2.0.3-4. > kernel-2.6.32-358.2.1 > python-repoze-who-1.0.18-1 (I believe - am verifying) Thanks, this looks sane. Actually there used to be an issue with Genshi generating strict XML by default, notably baffling IE, but it should be sufficiently solved for a long time (https://bugzilla.redhat.com/663103), definitely in the version you used. Just to be sure could you provide also your python-genshi version? Now I am thinking about another thing: > One of the guys created a simple configuration seems to be pretty generic expression, and makes me confused. Does it mean mere luci installation/deployment (along the lines of what specfile does, or better yet, directly from package), or (also) some configuration files tweaking (having especially /var/lib/luci/etc/luci.ini in mind, modifying file analogous to /etc/sysconfig/luci should be fine)? Because if the latter, chances are that admittedly a bit fragile start up process involving hierarchical configuration (via two stated files) and intentional run-time substitutions of middleware initialization routines (cf. luci.initwrappers) could suffer from that. On the other hand, if logging in works as expected, even when reproduced with cookies previously cleared, the issue remains a mystery. [If it meant a _cluster_ configuration in luci, I don't think this has any relevance to the issue.] Also to this point: > the links "Admin ... Logout" at the top left of the window that > usually appears is not appearing. 
not even "Login" is shown at that very position, right? Further debugging pointers: - inspect source code of the generated page, best when static original is preserved (some code inspectors tend to rather work with live, dynamically modified DOM serialization), wget/curl output when pretending being logged in via cookies might also help - last and promise-less attempt: try enabling verbose logging in /var/lib/luci/etc/luci.ini or equivalent (substitute to fit): # sed -i.old ':0;/\[logger_root\]/b1;p;d;:1;n;s|\(level[ \t]*=[ \t]*\).*|\1DEBUG|;t0;b1' \ /var/lib/luci/etc/luci.ini followed by luci restart and accessing the page in question; there may be something suspicious in the log (usually /var/log/luci/luci.log) but expectedly amongst plenty of other/worthless messages :-/ > On Apr 29, 2014, at 11:18 AM, Jan Pokorn? wrote: > >> could you be more specific as to which versions of luci, TurboGears >> and repoze.who? In RHEL-like distros, the latter map to TurboGears2 >> and python-repoze-who packages. >> >> I don't recall any issue like what you described. -- Jan -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 819 bytes Desc: not available URL: From neale at sinenomine.net Tue Apr 29 17:32:58 2014 From: neale at sinenomine.net (Neale Ferguson) Date: Tue, 29 Apr 2014 17:32:58 +0000 Subject: [Linux-cluster] luci question In-Reply-To: <20140429171740.GB4367@redhat.com> References: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> <20140429151821.GA4367@redhat.com> <20140429171740.GB4367@redhat.com> Message-ID: <7A8D29DD-99C9-4A56-9193-8CE045EF5289@sinenomine.net> He installed luci and then pointed his browser at the host:8084. He gets the login panel, logs in as root, gets the homebase screen but it doesn't have those links at the top right. No changes to any of the files that luci installs. It was a clean RHEL install so I'm guessing that genshi is up to date but I'll ask. Neale On Apr 29, 2014, at 1:17 PM, Jan Pokorn? wrote: > On 29/04/14 16:02 +0000, Neale Ferguson wrote: >> luci-0.26.0-48 (tried -13 as well) >> TurboGears2-2.0.3-4. >> kernel-2.6.32-358.2.1 >> python-repoze-who-1.0.18-1 (I believe - am verifying) > > Thanks, this looks sane. > > Actually there used to be an issue with Genshi generating strict > XML by default, notably baffling IE, but it should be sufficiently > solved for a long time (https://bugzilla.redhat.com/663103), > definitely in the version you used. > > Just to be sure could you provide also your python-genshi version? > > > Now I am thinking about another thing: > >> One of the guys created a simple configuration > > seems to be pretty generic expression, and makes me confused. Does it > mean mere luci installation/deployment (along the lines of what > specfile does, or better yet, directly from package), or (also) some > configuration files tweaking (having especially > /var/lib/luci/etc/luci.ini in mind, modifying file analogous to > /etc/sysconfig/luci should be fine)? > > Because if the latter, chances are that admittedly a bit fragile > start up process involving hierarchical configuration (via two stated > files) and intentional run-time substitutions of middleware > initialization routines (cf. luci.initwrappers) could suffer from > that. On the other hand, if logging in works as expected, even when > reproduced with cookies previously cleared, the issue remains > a mystery. 
[If it meant a _cluster_ configuration in luci, I don't > think this has any relevance to the issue.] -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 496 bytes Desc: Message signed with OpenPGP using GPGMail URL: From neale at sinenomine.net Tue Apr 29 17:56:49 2014 From: neale at sinenomine.net (Neale Ferguson) Date: Tue, 29 Apr 2014 17:56:49 +0000 Subject: [Linux-cluster] luci question In-Reply-To: <20140429171740.GB4367@redhat.com> References: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> <20140429151821.GA4367@redhat.com> <20140429171740.GB4367@redhat.com> Message-ID: Name : python-genshi Arch : s390x Version : 0.5.1 Release : 7.1.el6 On Apr 29, 2014, at 1:17 PM, Jan Pokorn? wrote: > > Just to be sure could you provide also your python-genshi version? -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 496 bytes Desc: Message signed with OpenPGP using GPGMail URL: From jpokorny at redhat.com Tue Apr 29 18:27:48 2014 From: jpokorny at redhat.com (Jan =?utf-8?Q?Pokorn=C3=BD?=) Date: Tue, 29 Apr 2014 20:27:48 +0200 Subject: [Linux-cluster] luci question In-Reply-To: References: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> <20140429151821.GA4367@redhat.com> <20140429171740.GB4367@redhat.com> Message-ID: <20140429182748.GA9564@redhat.com> On 29/04/14 17:56 +0000, Neale Ferguson wrote: > Name : python-genshi > Arch : s390x > Version : 0.5.1 > Release : 7.1.el6 Thanks again, but I have to admit I am short of ideas. Please see my other post wrt. next possible pointers, notably inspecting a page dump (e.g., via save page as) because there could be also some weird styling issue despite the demanded content reached the browser. Probably last item to check that I recalled is checking that no interfering EPEL or other non-RH package is installed, perhaps by running: rpm -q --qf "%{NEVRA}; %{VENDOR}\n" -- luci TurboGears2 pyOpenSSL \ python python-babel python-beaker python-cheetah python-decorator \ python-decoratortools python-formencode python-genshi python-mako \ python-markdown python-markupsafe python-myghty python-nose \ python-paste python-paste-deploy python-paste-script \ python-peak-rules python-peak-util-addons python-peak-util-assembler\ python-peak-util-extremes python-peak-util-symbols \ python-prioritized-methods python-pygments python-pylons \ python-repoze-tm2 python-repoze-what python-repoze-what-pylons \ python-repoze-who python-repoze-who-friendlyform \ python-repoze-who-testutil python-routes python-setuptools \ python-simplejson python-sqlalchemy python-tempita \ python-toscawidgets python-transaction python-turbojson \ python-tw-forms python-weberror python-webflash python-webhelpers \ python-webob python-webtest python-zope-filesystem \ python-zope-interface python-zope-sqlalchemy | grep -v 'Red Hat' ("package python-tw-forms is not installed" in the output is OK, it's just a legacy thing) Sadly, having no direct access to IE8, cannot track this further on my own. -- Jan -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 819 bytes Desc: not available URL: From neale at sinenomine.net Tue Apr 29 18:38:38 2014 From: neale at sinenomine.net (Neale Ferguson) Date: Tue, 29 Apr 2014 18:38:38 +0000 Subject: [Linux-cluster] luci question In-Reply-To: <20140429182748.GA9564@redhat.com> References: <8C5B1F22-0A70-4705-9DC2-749230BEF1D0@sinenomine.net> <20140429151821.GA4367@redhat.com> <20140429171740.GB4367@redhat.com> <20140429182748.GA9564@redhat.com> Message-ID: <832CABBC-F493-4032-AA3B-50D9A3B80BFB@sinenomine.net> Thanks for the suggestions Jan. Your help is appreciated. Neale On Apr 29, 2014, at 2:27 PM, Jan Pokorn? wrote: > Sadly, having no direct access to IE8, cannot track this further on my > own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 496 bytes Desc: Message signed with OpenPGP using GPGMail URL: