[Linux-cluster] Diskless Shared-Root GFS/Cluster

Marc Grimme grimme at atix.de
Thu Feb 1 16:12:26 UTC 2007


On Thursday 01 February 2007 12:57, Jayson Vantuyl wrote:
> We are talking about application servers.
>
> One of the toughest things about clustering in general and GFS in
> particular is the failure scenarios.
>
> When you have any sort of cluster issue, if your root is on a shared
> GFS, that GFS freezes in various ways until fencing happens.  The
> problem with this is that certain binaries that are on the same GFS
> may need to be used to recover.  How do you execute fence_apc to
> fence a failed node when it is on a GFS that is hung waiting on that
> same fencing operation?
We move the fencing and ccsd functionality into a special chroot that is 
rebuilt every time you boot the server. The chroot is placed on a tmpfs if the 
path you specified for it is identified as GFS, and the path is left untouched 
if it is on a local FS.
Many customers are using local disks, but not for booting or for any valuable 
data, just for temporary files and swap. That way a server is only an 
independent, exchangeable box of metal.
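
The tmpfs-or-local decision described above could be sketched roughly as 
follows. This is only an illustration, not the actual open sharedroot code: 
the function names are hypothetical, and it assumes that mounts are listed in 
the /proc/mounts format and that GFS shows up there with the type "gfs" or 
"gfs2".

```python
import os

# Assumption: filesystem type names as they would appear in /proc/mounts.
GFS_TYPES = ("gfs", "gfs2")


def fs_type_of(path, mounts_text):
    """Return the filesystem type of the deepest mount point containing path.

    mounts_text is text in /proc/mounts format:
    "device mountpoint fstype options dump pass", one mount per line.
    """
    path = os.path.normpath(path)
    best_mount, best_type = "", None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fs_type = fields[1], fields[2]
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or (path + "/").startswith(prefix):
            # Prefer the longest match; on a tie the later mount wins,
            # because it shadows the earlier one.
            if len(mount_point) >= len(best_mount):
                best_mount, best_type = mount_point, fs_type
    return best_type


def chroot_backing(path, mounts_text):
    """Return "tmpfs" if the chroot path sits on GFS, otherwise "local"."""
    return "tmpfs" if fs_type_of(path, mounts_text) in GFS_TYPES else "local"


if __name__ == "__main__":
    # Hypothetical chroot path, used here for illustration only.
    with open("/proc/mounts") as mounts:
        print(chroot_backing("/var/lib/fence_chroot", mounts.read()))
```

A result of "tmpfs" would mean the chroot has to be rebuilt in RAM at boot, so 
that the fencing tools stay reachable while the GFS is frozen.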
>
> There are ways around this involving RAM disks and the like, but
> eventually we just settled on having a minimal flash disk that would
> get us onto our SAN (but not clustered).  Only after we were on a non-
> clustered-FS on our SAN would we then start up our clustered
> filesystem.  This gave us the ability to move our nodes around
> easily.  This is an often overlooked benefit of a shared root that
> putting your root FS on SAN gives you as well.  There's nothing like
> booting up a dead node on spare hardware.  This also gives you a
> solid way to debug a damaged root system.  With shared-root it's all
> or nothing.  It's not so with this configuration.  You also have
> separate syslog files and other things that are one more special case
> on a shared root.  It's also easy to set up nodes with slightly
> different configurations (shared-root makes this another special
> case).  As for the danger of drive failure, a read-only IDE flash
> disk (Google for Transcend) is simple, easy, and dead solid.
You can also boot nodes with different hw configurations. The initrd in the 
open sharedroot does the hw detection.
>
> After consolidating your shared configuration files into /etc/shared
> and placing appropriate symlinks into that directory, it is a simple
> matter of rsync / csync / tsync / cron+scp to keep them synchronized.
That's a question of architecture, not technology. Where do you want to have 
your complexity? In the FS or in userspace?
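
For comparison, the userspace variant quoted above (keeping /etc/shared in 
sync with rsync) might look roughly like the sketch below. The peer names and 
function names are hypothetical, and it assumes rsync is installed and can 
reach every peer over ssh without a password prompt.

```python
import subprocess


def rsync_command(peer, shared_dir="/etc/shared"):
    """Build the rsync call that mirrors shared_dir onto one peer."""
    # The trailing slash makes rsync copy the directory contents,
    # not the directory itself.
    src = shared_dir.rstrip("/") + "/"
    # -a preserves permissions, times and symlinks; --delete removes files
    # on the peer that no longer exist locally.
    return ["rsync", "-a", "--delete", src, f"{peer}:{src}"]


def push_shared(peers, run=subprocess.run):
    """Push the shared config to every peer; return the peers that failed."""
    failed = []
    for peer in peers:
        result = run(rsync_command(peer))
        if result.returncode != 0:
            failed.append(peer)
    return failed


if __name__ == "__main__":
    # Hypothetical node names, for illustration only.
    print(push_shared(["node2", "node3"]))
```

Something like this would be run from cron on the node that holds the master 
copy. The complexity then sits in the script and its scheduling instead of in 
the filesystem.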
>
> It is tempting to want to have a shared root to minimize management
> requirements.  It is tempting to want to play games with ramfs and
> the like to provide a support system that will function when that
> shared root is hung due to clustering issues.  It is tempting to
> think that having a shared GFS root is really useful.
>
> However, if you value reliability and practicality, it's much easier
> to script up an occasional Rsync than it is to do so many acrobatics
> for such little gain.  For a cluster (and its apps) to be reliable at
> all, it needs to be able to function, recover, and generally have a
> stable operating environment.  Putting GFS under the userspace that
> drives it is asking for trouble.
You should really take a deeper look into the sharedroot concepts. You'll like 
it!

Regards Marc.
>
> On Jan 31, 2007, at 1:34 PM, isplist at logicore.net wrote:
> > I'm thinking for application servers/cluster only, not workstation
> > users.
> >
> > On Wed, 31 Jan 2007 11:10:55 -0800, Tom Mornini wrote:
> >> We boot from flash drives, then pivot root to SAN storage.
> >>
> >> I agree with no drives in servers, but shared root is a
> >> whole different ball game if you mean everyone using a
> >> single filesystem for root.
> >>
> >> --
> >> -- Tom Mornini, CTO
> >> -- Engine Yard, Ruby on Rails Hosting
> >> -- Reliability, Ease of Use, Scalability
> >> -- (866) 518-YARD (9273)
>
> --
> Jayson Vantuyl
> Systems Architect
> Engine Yard
> jvantuyl at engineyard.com

-- 
Gruss / Regards,

** Visit us at CeBIT 2007 in Hannover/Germany **
** in Hall 5, Booth G48/2  (15.-21. of March) **

Marc Grimme
Phone: +49-89 452 3538-14
http://www.atix.de/               http://www.open-sharedroot.org/

**
ATIX - Ges. fuer Informationstechnologie und Consulting mbH
Einsteinstr. 10 - 85716 Unterschleissheim - Germany



