From adrian.gibanel at btactic.com  Sun Jun  2 21:12:54 2013
From: adrian.gibanel at btactic.com (Adrian Gibanel)
Date: Sun, 2 Jun 2013 23:12:54 +0200 (CEST)
Subject: [Linux-cluster] fence_ovh - Fence agent for OVH
In-Reply-To: <2056213.43.1370206273474.JavaMail.adrian@adrianworktop>
Message-ID: <3826574.639.1370207568347.JavaMail.adrian@adrianworktop>

As requested by digimer in the linux-ha IRC channel, here is fence_ovh. Getting it included by default in the official distribution of the cluster software is not a priority, but if you guide me on how to polish it, I think I can improve it a lot more and test it on real machines (as long as my machines are still test machines and not production ones).

1) What is fence_ovh

fence_ovh is a fence agent written in Python for the big French datacentre provider OVH. You can get information about OVH at: http://www.ovh.co.uk/ . I also want to make clear that I'm not part of the official OVH staff.

2) Features

The script has two main functions:

* Reboot into rescue mode (action=off)
* Reboot into the hard disk (action=on;action=reboot)

3) Technical details

As you might deduce, the classical fence mechanism that turns off the other node is not actually done by turning off the machine but by rebooting it into rescue mode.

Another particular thing to mention is that the script checks whether the machine has rebooted OK into rescue mode, thanks to an OVH API call that reports the date when the server rebooted. By the way, the OVH API is also used for the main function, which consists of rebooting the machine into rescue mode.

4) How to use it

4.1) Make sure the python-soappy package is installed (Debian/Ubuntu).
4.2) Save fence_ovh in /usr/sbin
4.3) Run: ccs_update_schema so that the new metadata is put into cluster.rng
4.4) If needed, validate your configuration:
ccs_config_validate -v -f /etc/pve/cluster.conf.new
4.5) Here's an example of how to use it in cluster.conf:

Finally, I attach to this email the first version of the ovh_fence script.
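The action mapping from 2) and the reboot check from 3) could be sketched roughly as follows. This is only an illustration, not the attached script: the function names and constants are hypothetical, and comparing the reported reboot date against the time of the request is an assumption about how such a check might work. No OVH API call is made here.

```python
from datetime import datetime

# All names below are hypothetical; this is not the attached script.
RESCUE = "rescue"
HARD_DISK = "harddisk"


def boot_target_for(action):
    """Map a fence action to the boot mode listed under 2) Features."""
    if action == "off":
        return RESCUE
    if action in ("on", "reboot"):
        return HARD_DISK
    raise ValueError("unsupported action: %r" % (action,))


def reboot_confirmed(requested_at, reported_reboot_at):
    """Assume fencing worked if the reported reboot is not older than the request."""
    return reported_reboot_at >= requested_at


# Example values only; a real agent would get the second date from the provider.
requested = datetime(2013, 6, 2, 23, 0, 0)
reported = datetime(2013, 6, 2, 23, 4, 30)
print(boot_target_for("off"), reboot_confirmed(requested, reported))
```

A real agent would probably poll until the reported date changes or a timeout expires, instead of checking once.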
It can be improved a lot. I've just realised that I've left in the metadata some mentions of an .ini file that I had previously used to feed user / pass, while now they are gathered from the cluster.conf configuration directly, as with any fence agent.

The original thread on the Proxmox forum, from which I adapted the original secofor script:
http://forum.proxmox.com/threads/11066-Proxmox-HA-Cluster-at-OVH-Fencing?p=75152#post75152

P.S.: It was not easy to develop a fence agent because there's no documentation on it. I may raise this subject in another email to this same mailing list.

-- 
Adrián Gibanel
I.T. Manager
+34 675 683 301
www.btactic.com

Before printing this message, think of the environment. The environment is everyone's concern.

NOTICE: The content of this message and its attachments is confidential. If you are not the intended recipient, please be advised that using, disclosing and/or copying it without the corresponding authorisation is prohibited. If you have received this message in error, please notify the sender immediately and destroy the message.

-------------- next part --------------
A non-text attachment was scrubbed...
Name: ovh_fence
Type: text/x-python
Size: 5740 bytes
Desc: not available
URL:

From adrian.gibanel at btactic.com  Sun Jun  2 21:26:43 2013
From: adrian.gibanel at btactic.com (Adrian Gibanel)
Date: Sun, 2 Jun 2013 23:26:43 +0200 (CEST)
Subject: [Linux-cluster] Fence agent howto needed
In-Reply-To: <257418.648.1370208010160.JavaMail.adrian@adrianworktop>
Message-ID: <11568770.732.1370208401096.JavaMail.adrian@adrianworktop>

As I mentioned in a previous message, I think a howto is missing:

How to write your own fence agent howto
or
Fence agent howto

Having written one fence agent myself, I think I can help explain how to use the included Python fencing libraries (when you program in Python, of course), which make parsing the options from stdin a lot easier. I mean, it's easy, but it's far from obvious, even after reading the default Python fence_* scripts.

Thank you.
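To give an idea of what "parsing of options from stdin" involves, here is a minimal sketch. It deliberately does not use the bundled Python fencing library; it only assumes the usual fence agent convention of one name=value pair per line on stdin, with lines starting with # ignored. The function name parse_fence_stdin is hypothetical.

```python
import io


def parse_fence_stdin(stream):
    """Collect name=value lines into a dict (hypothetical helper, not the library API)."""
    options = {}
    for raw in stream:
        line = raw.rstrip("\r\n")
        # Skip blank lines and comment lines.
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        name, sep, value = line.partition("=")
        if not sep:
            continue  # not a name=value pair; skipped in this sketch
        options[name.strip()] = value
    return options


# Stand-in for sys.stdin, using values from the cluster.conf fragment in this thread.
sample = io.StringIO(
    "# options as a cluster daemon might pass them\n"
    "action=off\n"
    "ipaddr=ns123456\n"
    "login=ab12345-ovh\n"
    "passwd=MYSECRET\n"
)
options = parse_fence_stdin(sample)
print(options["action"], options["ipaddr"])
```

In a real agent the stream would be sys.stdin, and the shipped library would also validate the options against the agent's metadata, which this sketch does not attempt.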
From adrian.gibanel at btactic.com  Wed Jun  5 14:52:55 2013
From: adrian.gibanel at btactic.com (Adrian Gibanel)
Date: Wed, 5 Jun 2013 16:52:55 +0200 (CEST)
Subject: [Linux-cluster] fence_ovh - Fence agent for OVH
In-Reply-To: <51ACAD2C.1000902@redhat.com>
References: <3826574.639.1370207568347.JavaMail.adrian@adrianworktop>
 <51ACAD2C.1000902@redhat.com>
Message-ID: <19221023.357.1370443974551.JavaMail.adrian@adrianworktop>

----- Original message -----
> From: "Marek Grac"
> To: "Adrian Gibanel"
> Sent: Monday, 3 June 2013 16:50:20
> Subject: Re: [Linux-cluster] fence_ovh - Fence agent for OVH

> Hi Adrian,

> On 06/02/2013 11:12 PM, Adrian Gibanel wrote:
> > As requested by digimer in linux-ha irc channel here there is
> > fence_ovh. It's not a priority that it's included by default in
> > official distribution of cluster software but if you guide me on
> > how to polish it I think I can improve it a lot more and make
> > tests in real machines (as long as my machines are still test
> > machines and not production ones).

> Inclusion in upstream package should not be a serious issue but I
> have some comments.

Ok.

> > 4) How to use it
> >
> > 4.1) Make sure python-soappy package is installed (Debian/Ubuntu).

> Do you think it would be possible to use package 'python-suds' ? We
> already use it for fence_vmware_soap and I would rather use only one
> library for same thing. Do not bother with strange code in
> fence_vmware_soap, VMWare implementation was strange so some hacks
> were needed, it should not be needed for you and transformation to
> suds should be quite simple.

I think it's just a matter of trying it. I'm not sure when I will have the time.
> > 4.2) Save fence_ovh in /usr/sbin
> > 4.3) Run: ccs_update_schema so that new metadata is put into
> > cluster.rng
> > 4.4) If needed validate your configuration:
> > ccs_config_validate -v -f /etc/pve/cluster.conf.new
> > 4.5) Here's an example of how to use it in cluster.conf:
> >
> > [...] transport="udpu" two_node="1" expected_votes="1">
> > [...] email="admin at domain.com" ipaddr="ns123456" login="ab12345-ovh"
> > passwd="MYSECRET" />
> > [...] email="admin at domain.com" ipaddr="ns789012" login="ab12345-ovh"
> > passwd="MYSECRET" />
> > [...]

> I like the idea of 'email', it is possible to do it on cluster level
> but as fence agents are used in different places that could help more
> people.

Well, I didn't find a default field for setting it, so I added one of my own. I think you can use the OVH API without providing that email, but I prefer to be informed about fencing myself. So I think that, rather than writing it in a file somewhere in the filesystem, it is more useful to write it in cluster.conf.

I'm not sure I understood what you meant. Were you thinking that email should perhaps be a cluster-wide setting instead of a per-fence-agent setting, but then decided that the latter was the better idea?

> > P.S.: It was not easy to develop a fence agent because there's no
> > documentation on it. I maybe arise another email in this same
> > mailing list about this subject.
> >
> There is a Fence Agent API document (a bit outdated), info about
> fence timing and the major document that you need is in my to-do list
> (should be completed before summer, but we had some more urgent
> deadlines).

Ok. I'll contribute back to it if I can when it's public.

Thank you!

> m,

From rossnick-lists at cybercat.ca  Thu Jun  6 16:37:06 2013
From: rossnick-lists at cybercat.ca (Nicolas Ross)
Date: Thu, 06 Jun 2013 12:37:06 -0400
Subject: [Linux-cluster] Using LVM / ext4 in cluster.conf more than one time
Message-ID: <51B0BAB2.6060009@cybercat.ca>

Hi !

We have 2 clusters of 8 nodes each. Since we began using RHCS about 2.5 years ago, we have mostly used GFS on a shared storage array for data storage. In most cases there is no need to use GFS; ext4 would be enough, since the file system is only used within one service.

Production services are enabled on one cluster at a time. A service containing a webserver and data directories is running in cluster A. A service also exists in cluster B that only uses the data directories. Data is then synced from cluster A to cluster B manually, for recovery in case of disaster.

So in each cluster, we have a copy of 2 services. Both use the same data file system, but only one starts the webserver. I will upload a very simplified version of one cluster.conf to illustrate this and the problem.
We began migrating some GFS file systems to ext4, as it offers better performance while retaining the same HA features in the event of a node failure. While doing so, we discovered this problem.

While an ext3 file system using HA-LVM in a clustered environment cannot be mounted on 2 nodes (that's fine), we do need it to be defined 2 times in the cluster.conf file.

So, in my example cluster.conf, you will see 2 services. Both use the same lvm and fs, but one starts a script and the other doesn't. Only one service can be active at a time in one cluster, and that's fine for us, so only one will effectively mount the file system.

In my cluster.conf example, SandBox would be the "production" service, starting the script and listening on the proper IP. SandBoxRecovery would be the service I run in my other cluster to receive the synced data from the "production" service.

In this example, the SandBox service would start fine, but SandBoxRecovery would only start the IPs and not mount the FS. With rg_test, I was able to see this warning:

Warning: Max references exceeded for resource name (type lvm)

and Google led me to:

https://access.redhat.com/site/solutions/222453

which pretty much explains it. You can see the comment on this solution, which is exactly my question here:

So, how can I define an LVM and FS a second time in cluster.conf ?

------------ example cluster.conf ------------
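What follows is not the configuration from this message. It is a hypothetical, heavily trimmed sketch of the situation described above: two services that reference the same lvm and fs resources. The resource names, attributes and the limit of one reference per resource are all assumptions made for illustration. The small script only counts how often each shared resource is referenced, which is the condition the rg_test warning seems to be reporting.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical, trimmed configuration; NOT the poster's cluster.conf.
SAMPLE = """
<cluster name="example" config_version="1">
  <rm>
    <resources>
      <lvm name="sandbox-lvm" vg_name="vg_sandbox" lv_name="lv_sandbox"/>
      <fs name="sandbox-fs" device="/dev/vg_sandbox/lv_sandbox"
          mountpoint="/data/sandbox" fstype="ext4"/>
    </resources>
    <service name="SandBox">
      <lvm ref="sandbox-lvm"><fs ref="sandbox-fs"/></lvm>
    </service>
    <service name="SandBoxRecovery">
      <lvm ref="sandbox-lvm"><fs ref="sandbox-fs"/></lvm>
    </service>
  </rm>
</cluster>
"""


def count_references(xml_text):
    """Count (resource type, ref name) pairs used inside <service> blocks."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    for service in root.iter("service"):
        for node in service.iter():
            ref = node.get("ref")
            if ref is not None:
                counts[(node.tag, ref)] += 1
    return counts


def over_limit(counts, limit=1):
    """Return the resources referenced more often than the assumed limit."""
    return sorted(key for key, n in counts.items() if n > limit)


shared = over_limit(count_references(SAMPLE))
for tag, name in shared:
    print("referenced more than once: %s (type %s)" % (name, tag))
```

Running a check like this against a real cluster.conf would only point at candidates; whether a given resource type actually allows more than one reference depends on the resource agent.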