From lists at alteeve.ca  Mon Jan 6 16:34:14 2014
From: lists at alteeve.ca (Digimer)
Date: Mon, 06 Jan 2014 11:34:14 -0500
Subject: [rhelv6-list] Announcing a new HA KVM tutorial!
Message-ID: <52CADB06.4070809@alteeve.ca>

Almost exactly two years ago, I released the first tutorial for building an
HA platform for KVM VMs. In that time, I have learned a lot, created some
tools to simplify management and refined the design to handle corner cases
seen in the field. Today, the culmination of that learning is summed up in
the "2nd Edition" of that tutorial, now called "AN!Cluster Tutorial 2".

https://alteeve.ca/w/AN!Cluster_Tutorial_2

These HA KVM platforms have been in production for over two years now in
facilities all over the world: universities, municipal governments,
corporate DCs, manufacturing facilities, etc. I've gotten wonderful
feedback from users, and all of that real-world experience has been
integrated into this new tutorial. As always, everything is 100% open
source and free-as-in-beer!

The major changes are:

* SELinux and iptables are enabled and used.
* Numerous small changes to the OS and cluster stack configuration to
provide better corner-case fault handling.
* Architecture refinements:
** Redundant PSUs, UPSes and fence methods emphasized.
** Monitoring of multiple UPSes added via a modified apcupsd.
** Detailed monitoring of LSI-based RAID controllers and drives.
** Discussion of hardware considerations for VM performance based on
anticipated workloads.
* Naming convention changes to support the new AN!CDB dashboard[1].
** New alert system covered, with fault and notable-event alerting.
* A wider array of guest OSes is covered:
** Windows 7
** Windows 8
** Windows 2008 R2
** Windows 2012
** Solaris 11
** FreeBSD 9
** RHEL 6
** SLES 11

Beyond that, the formatting of the tutorial itself has been slightly
modified. I do think it is the easiest-to-follow tutorial I have yet been
able to produce. I am very proud of this one! :D

As always, feedback is very much appreciated. Everything from typos and
grammar mistakes to functional problems, or anything else, is very
valuable. I take all the feedback I get and use it to help make the
tutorials better.

Enjoy!

Digimer, who can now start the next tutorial in earnest!

1. https://alteeve.ca/w/AN!CDB

--
Digimer
Papers and Projects: https://alteeve.ca/w/
What if the cure for cancer is trapped in the mind of a person without
access to education?

From evilensky at gmail.com  Fri Jan 31 16:53:57 2014
From: evilensky at gmail.com (Eugene Vilensky)
Date: Fri, 31 Jan 2014 10:53:57 -0600
Subject: [rhelv6-list] Distributing SELinux policies
In-Reply-To: <15tvbypj6y5.fsf@tux.uio.no>
References: <15tvbypj6y5.fsf@tux.uio.no>
Message-ID:

On Mon, Dec 16, 2013 at 6:50 AM, Trond Hasle Amundsen wrote:
> I'll take a crack at answering this. We do exactly like you describe,
> i.e. distribute a set of SELinux policies in an RPM, as a module.

Thank you. This was very instructive.
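For reference, "an RPM, as a module" usually boils down to a small .te
policy source that is compiled into a binary .pp module and installed by
the package scriptlets. The sketch below is hypothetical; the module name,
types and rule are placeholders, not the actual policy from that thread:

    # mypolicy.te -- hypothetical local SELinux policy module
    policy_module(mypolicy, 1.0)

    require {
        type httpd_t;
        type var_log_t;
    }

    # illustrative rule only: let httpd read generic log files
    allow httpd_t var_log_t:file { read getattr open };

Build it with the selinux-policy-devel Makefile and install/remove it from
the RPM scriptlets:

    # compile mypolicy.te into mypolicy.pp
    make -f /usr/share/selinux/devel/Makefile mypolicy.pp

    # in the spec file, assuming the .pp ships under %{_datadir}/selinux:
    %post
    /usr/sbin/semodule -i %{_datadir}/selinux/packages/mypolicy.pp || :

    %postun
    if [ $1 -eq 0 ]; then
        /usr/sbin/semodule -r mypolicy || :
    fi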
From KCollins at chevron.com  Fri Jan 31 23:14:22 2014
From: KCollins at chevron.com (Collins, Kevin [Contractor Acquisition Program])
Date: Fri, 31 Jan 2014 23:14:22 +0000
Subject: [rhelv6-list] Highly available OpenLDAP
Message-ID: <6F56410FBED1FC41BCA804E16F594B0B33054720@chvpkw8xmbx05.chvpk.chevrontexaco.net>

Hi all,

I'm looking for a little input on what other folks are doing to solve a
problem we are trying to address. The scenario is as follows:

We were an NIS shop for many, many years. Our environment was (and still
is) heavily dependent on NIS, and netgroups in particular, to function
correctly.

About 5 or 6 years ago we migrated from NIS to LDAP (using RFC2307 to
provide NIS maps via LDAP). The environment at the time consisted of fewer
than 200 servers (150 in the primary site, the rest in a secondary site),
mostly HP-UX, with Linux playing the part of "utility" services (LDAP,
DNS, mysql, httpd, VNC).

We use LDAP only to provide the standard NIS "maps" (with a few small
custom maps, too).

We maintain our own LDAP servers with the RHEL-provided OpenLDAP, with a
single master in our primary site in conjunction with 2 replica servers in
our primary site and 2 replica servers in our secondary site. Replication
was using the slurpd mechanism (we started on RHEL3).

Life was good :)

Fast forward to the current environment, and a merger with a different
Unix team (and migrating that environment from NIS to LDAP as well). We
now have close to 1000 servers (a mix of physical and VM): roughly 400
each for our 2 primary sites and the rest scattered across another 3
sites. The mix is now much more heavily Linux (70%), with the remaining
30% split between HP-UX and Solaris.

We have increased the number of replicas, adding 2 more replicas in each
of the new sites.

We are still (mostly) using slurpd for replication, although with the
impending migration of our LDAP master from RHEL5 to RHEL6, we must change
to using syncrepl. No problem, as this is (IMO) a much better replication
method and relieves the worries and headaches that occur when a replica
for some reason becomes "broken" for some period of time. We have already
started this migration, and our master now handles both slurpd (to old
replicas) and syncrepl (from new replicas).

In our environment, each site is configured to point to LDAP services by
IP address: two IP addresses per site, which are "load-balanced" by
alternating which IP is first and second in the config files based on
whether the last octet of the client IP address is even or odd. This is
done as a very basic way to distribute the load.

Now comes the crux of the problem: what happens when an LDAP server
becomes unavailable for some reason?

If the client is HP-UX (ldapclientd), Solaris (ldap_cachemgr) or RHEL6
(nslcd), there is not much of an issue as long as 1 LDAP replica in each
site is functioning. The specific LDAP daemon for each platform will have
a small hiccup while it times out and fails over to the next LDAP
replica... a few seconds, not a big deal.

If, however, the client is RHEL4 (yes, still!) or RHEL5, then the problem
is much bigger! On these versions, each process that needs to use LDAP
must go through the exact same timeout process, so the systems become very
bogged down, or even unusable depending on the server load.

In one subset of our larger environment (about 40%), we run nscd, which
can help alleviate some of this issue but not all of it. We are planning
to enable nscd on the remainder very soon; the historical reasoning for
why those servers do not use nscd is unknown.

Last year, I started investigating and testing the use of LVS (Linux
Virtual Server) to provide a highly available (aka clustered),
load-balanced front end that would direct client requests for a single IP
address (per site) to the backend LDAP servers. Results were very good,
and I proposed this plan to our management.

DENIED!

It was deemed "too complex to manage" by our team, and redundant to the
BigIP F5 service offering within the company.
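An LVS front end of this kind is commonly driven by keepalived. As a
minimal sketch, with made-up addresses rather than anything from this
environment, the per-site virtual server could look like:

    # /etc/keepalived/keepalived.conf -- hypothetical sketch
    virtual_server 10.0.0.100 389 {
        delay_loop 10           # health-check interval, in seconds
        lb_algo    rr           # round-robin across the replicas
        lb_kind    DR           # direct routing
        protocol   TCP

        # a dead replica is pulled from rotation by the TCP check,
        # so clients never sit in per-process timeouts against it
        real_server 10.0.0.11 389 {
            TCP_CHECK {
                connect_timeout 3
                connect_port    389
            }
        }
        real_server 10.0.0.12 389 {
            TCP_CHECK {
                connect_timeout 3
                connect_port    389
            }
        }
    }

Clients then point at the single virtual IP per site. Note that DR mode
also requires the VIP to be configured, non-ARPing, on each replica, which
is arguably part of the complexity being objected to.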
I tend to favor self-management of infrastructure components which are
critical to maintaining system functionality, but what do I know? :)

So, we are now looking down the route of using F5 (managed by another
team) to front-end our LDAP.

But another option has been proposed: what if we make each Linux server an
LDAP replica that keeps itself up to date with syncrepl, and have each
server use only itself for LDAP services? The setup of this would be
fairly straightforward, and could be easily integrated into our build
process.

Since we don't make massive volumes of changes, I feel like the network
load for LDAP would probably drop significantly, and we wouldn't have to
worry about many of these other issues. I know that this solves the
problem only for Linux, but Solaris and HP-UX already handle the problem
case and are being phased out of our environment.

Anyway, thanks for reading this novel. I had not intended to write so
much, but wanted to set the foundation for my question.

What are you folks doing to solve this problem? Are you using F5? Do you
think the "every server a replica" approach makes sense?

I am posting to both the RHEL5 and RHEL6 lists, sorry if you see it twice.

Thanks in advance for your input.

Kevin

From b.j.smith at ieee.org  Fri Jan 31 23:32:26 2014
From: b.j.smith at ieee.org (Bryan J Smith)
Date: Fri, 31 Jan 2014 18:32:26 -0500
Subject: [rhelv6-list] Highly available OpenLDAP
In-Reply-To: <6F56410FBED1FC41BCA804E16F594B0B33054720@chvpkw8xmbx05.chvpk.chevrontexaco.net>
References: <6F56410FBED1FC41BCA804E16F594B0B33054720@chvpkw8xmbx05.chvpk.chevrontexaco.net>
Message-ID:

To start ... why nscd and nslcd? Haven't you considered moving to sssd on
your RHEL6 and RHEL5.8+ systems? So many benefits ... so, so many, client
and server ... especially caching, which is far more deterministic and far
more stable too!

It's just a matter of putting in that small amount of time to figure out
the details for sssd. E.g., a common one I run into is lax security that
the prior nss, pam, etc. modules allowed but sssd does not by default
(though that can be disabled).

I know that doesn't solve your non-RHEL/Fedora client load, but just FYI.
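For reference, a minimal sssd.conf against an RFC2307 OpenLDAP directory
might look like the sketch below; the server names and search base are
placeholders, not anyone's actual environment:

    # /etc/sssd/sssd.conf -- minimal sketch with placeholder values
    # (must be owned by root and mode 0600, or sssd refuses to start)
    [sssd]
    config_file_version = 2
    services = nss, pam
    domains = example

    [domain/example]
    id_provider = ldap
    auth_provider = ldap
    # the second URI gives client-side failover
    ldap_uri = ldap://ldap1.example.com, ldap://ldap2.example.com
    ldap_search_base = dc=example,dc=com
    ldap_schema = rfc2307
    # allow authentication from cache while the servers are unreachable
    cache_credentials = true
    enumerate = false

Because one daemon does the lookups and the caching, individual processes
no longer each eat the connection timeout when a server goes away, which
is exactly the RHEL4/RHEL5 pain point described earlier (sssd itself being
an option only on RHEL 5.8+ and RHEL6, per the above).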
-- bjs

P.S. I'm purposely not mentioning RHDS (389), but there's that age-old
argument, especially if this is already costing your organization money in
time. ;) Plus there are now the RHEL6 built-in IdM (IPA) services. I.e.,
if you're just serving what is, essentially, converted NIS maps ... even
"free" IdM might be all you need. It can service legacy LDAP clients as
well, although I don't know your environment and all of your schema.

--
Bryan J Smith - UCF '97 Engr - http://www.linkedin.com/in/bjsmith
-----------------------------------------------------------------
"In a way, Bortles is the personification of the UCF football program.
Each has many of the elements that everyone claims to want, and yet they
are nobody's first choice. Coming out of high school, Bortles had the size
and the arm to play at a more prestigious program. UCF likewise has the
market size and the talent base to play in a more prestigious conference
than the American Athletic. But timing and circumstances conspired to put
both where they are now." -- Andy Staples, CNN-Sports Illustrated

On Fri, Jan 31, 2014 at 6:14 PM, Collins, Kevin [Contractor Acquisition
Program] wrote:
> Hi all,
>
> I'm looking for a little input on what other folks are doing to solve a
> problem we are trying to address. The scenario is as follows:
>
> We were an NIS shop for many, many years. Our environment was (and still
> is) heavily dependent on NIS, and netgroups in particular, to function
> correctly.
>
> About 5 or 6 years ago we migrated from NIS to LDAP (using RFC2307 to
> provide NIS maps via LDAP). The environment at the time consisted of
> fewer than 200 servers (150 in the primary site, the rest in a secondary
> site), mostly HP-UX, with Linux playing the part of "utility" services
> (LDAP, DNS, mysql, httpd, VNC).
>
> We use LDAP only to provide the standard NIS "maps" (with a few small
> custom maps, too).
>
> We maintain our own LDAP servers with the RHEL-provided OpenLDAP, with a
> single master in our primary site in conjunction with 2 replica servers
> in our primary site and 2 replica servers in our secondary site.
> Replication was using the slurpd mechanism (we started on RHEL3).
>
> Life was good :)
>
> Fast forward to the current environment, and a merger with a different
> Unix team (and migrating that environment from NIS to LDAP as well). We
> now have close to 1000 servers (a mix of physical and VM): roughly 400
> each for our 2 primary sites and the rest scattered across another 3
> sites. The mix is now much more heavily Linux (70%), with the remaining
> 30% split between HP-UX and Solaris.
>
> We have increased the number of replicas, adding 2 more replicas in each
> of the new sites.
>
> We are still (mostly) using slurpd for replication, although with the
> impending migration of our LDAP master from RHEL5 to RHEL6, we must
> change to using syncrepl. No problem, as this is (IMO) a much better
> replication method and relieves the worries and headaches that occur
> when a replica for some reason becomes "broken" for some period of time.
> We have already started this migration, and our master now handles both
> slurpd (to old replicas) and syncrepl (from new replicas).
>
> In our environment, each site is configured to point to LDAP services by
> IP address: two IP addresses per site, which are "load-balanced" by
> alternating which IP is first and second in the config files based on
> whether the last octet of the client IP address is even or odd. This is
> done as a very basic way to distribute the load.
>
> Now comes the crux of the problem: what happens when an LDAP server
> becomes unavailable for some reason?
>
> If the client is HP-UX (ldapclientd), Solaris (ldap_cachemgr) or RHEL6
> (nslcd), there is not much of an issue as long as 1 LDAP replica in each
> site is functioning. The specific LDAP daemon for each platform will
> have a small hiccup while it times out and fails over to the next LDAP
> replica... a few seconds, not a big deal.
>
> If, however, the client is RHEL4 (yes, still!) or RHEL5, then the
> problem is much bigger! On these versions, each process that needs to
> use LDAP must go through the exact same timeout process, so the systems
> become very bogged down, or even unusable depending on the server load.
>
> In one subset of our larger environment (about 40%), we run nscd, which
> can help alleviate some of this issue but not all of it. We are planning
> to enable nscd on the remainder very soon; the historical reasoning for
> why those servers do not use nscd is unknown.
> Last year, I started investigating and testing the use of LVS (Linux
> Virtual Server) to provide a highly available (aka clustered),
> load-balanced front end that would direct client requests for a single
> IP address (per site) to the backend LDAP servers. Results were very
> good, and I proposed this plan to our management.
>
> DENIED!
>
> It was deemed "too complex to manage" by our team, and redundant to the
> BigIP F5 service offering within the company. I tend to favor
> self-management of infrastructure components which are critical to
> maintaining system functionality, but what do I know? :)
>
> So, we are now looking down the route of using F5 (managed by another
> team) to front-end our LDAP.
>
> But another option has been proposed: what if we make each Linux server
> an LDAP replica that keeps itself up to date with syncrepl, and have
> each server use only itself for LDAP services? The setup of this would
> be fairly straightforward, and could be easily integrated into our
> build process.
>
> Since we don't make massive volumes of changes, I feel like the network
> load for LDAP would probably drop significantly, and we wouldn't have
> to worry about many of these other issues. I know that this solves the
> problem only for Linux, but Solaris and HP-UX already handle the
> problem case and are being phased out of our environment.
>
> Anyway, thanks for reading this novel. I had not intended to write so
> much, but wanted to set the foundation for my question.
>
> What are you folks doing to solve this problem? Are you using F5? Do
> you think the "every server a replica" approach makes sense?
>
> I am posting to both the RHEL5 and RHEL6 lists, sorry if you see it
> twice.
>
> Thanks in advance for your input.
>
> Kevin
>
> _______________________________________________
> rhelv6-list mailing list
> rhelv6-list at redhat.com
> https://www.redhat.com/mailman/listinfo/rhelv6-list