[Linux-cluster] Clustering Tutorial

dave first linux4dave at gmail.com
Sat Oct 22 02:18:36 UTC 2005


Thanks. I should have mentioned that we're doing high-performance
clustering, not HA. We have a Beowulf cluster (old and decrepit) and an
OSCAR cluster. None of our current clusters are RH, but that will probably
change once we get our next four-Opteron-per-box cluster...

Yeehaw!

And a big thanks to everyone who responded. I now have some good resources.
A lot of reading... yaaaawn! heh-heh.

dave

On 10/20/05, Tim Spaulding <tspauld98 at yahoo.com> wrote:
>
> Just a note of caution: there's a big difference between High Availability
> Clustering and High
> Performance Clustering. AFAIK, Beowulf is an HPC technology. RHCS (Red Hat
> Cluster Suite) and
> GFS (Global File System) are HAC technologies. Some of the underlying
> building blocks are used by
> both communities, but they are used for fundamentally different purposes.
>
> http://www.linux-ha.org is the home of another HAC, linux-based
> technology. They have more
> documentation on clustering and its concepts. Red Hat does a good job on
> the HOW-TOs of getting a
> cluster working but a terrible job of telling folks the WHY-TOs of
> clustering.
>
> I'm currently working on a comparison of linux-ha and RHCS, so if you have
> questions regarding HAC
> on Linux, fire away. If you have a Beowulf cluster, je ne comprends
> pas, sorry.
>
> --tims
>
> --- Michael Will <mwill at penguincomputing.com> wrote:
>
> >
> http://www.phy.duke.edu/resources/computing/brahma/Resources/beowulf_book.php
> > is a good start.
> > http://www.beowulf.org is another good place; it is also the home of the
> > original Beowulf mailing list.
> >
> > Generally I would recommend digging through recent mailing-list postings,
> > because
> > there are often very informed answers to questions.
> >
> > Lon just answered a fencing question a few days ago:
> >
> > "STONITH, STOMITH, etc. are indeed implementations of I/O fencing.
> >
> > Fencing is the act of forcefully preventing a node from being able to
> > access resources after that node has been evicted from the cluster in an
> > attempt to avoid corruption.
> >
> > The canonical example of when it is needed is the live-hang scenario, as
> > you described:
> >
> > 1. node A hangs with I/Os pending to a shared file system
> > 2. node B and node C decide that node A is dead and recover resources
> > allocated on node A (including the shared file system)
> > 3. node A resumes normal operation
> > 4. node A completes I/Os to shared file system
> >
> > At this point, the shared file system is probably corrupt. If you're
> > lucky, fsck will fix it -- if you're not, you'll need to restore from
> > backup. I/O fencing (STONITH, or whatever we want to call it) prevents
> > the last step (step 4) from happening.
> >
> > How fencing is done (power cycling via external switch, SCSI
> > reservations, FC zoning, integrated methods like IPMI, iLO, manual
> > intervention, etc.) is unimportant - so long as whatever method is used
> > can guarantee that step 4 can not complete."
> >
> > "GFS can use fabric-level fencing - that is, you can tell the iSCSI
> > server to cut a node off, or ask the fiber-channel switch to disable a
> > port. This is in addition to "power-cycle" fencing."
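The live-hang scenario Lon describes can be sketched as a toy simulation. This is illustrative pseudocode of the fencing idea, not RHCS or GFS code; the `SharedDisk` class and node names are invented for the example:

```python
# Toy model of I/O fencing: a fenced node's late writes are rejected,
# so step 4 (the hung node completing stale I/O) cannot corrupt the
# shared file system after nodes B and C have recovered it.

class SharedDisk:
    def __init__(self):
        self.blocks = {}       # block number -> data
        self.fenced = set()    # nodes cut off from the disk

    def fence(self, node):
        """Forcefully cut a node off (power cycle, FC zoning, etc.)."""
        self.fenced.add(node)

    def write(self, node, block, data):
        if node in self.fenced:
            return False       # late I/O from a fenced node is rejected
        self.blocks[block] = data
        return True

disk = SharedDisk()
disk.write("A", 0, "pending-io")   # step 1: node A hangs with I/O in flight
disk.fence("A")                    # B and C fence A *before* recovery
disk.write("B", 0, "recovered")    # step 2: B recovers the file system
late = disk.write("A", 0, "stale") # steps 3-4: A resumes, tries its write
print(late, disk.blocks[0])
```

Without the `fence("A")` call, the final write would succeed and overwrite the recovered data, which is exactly the corruption fencing exists to prevent.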
> >
> >
> > Michael
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster
> >
>
>
>
>
>
>