[Linux-cluster] Questions about GFS
Bowie Bailey
Bowie_Bailey at BUC.com
Wed Apr 12 15:45:19 UTC 2006
As someone else pointed out, it is possible to run diskless
workstations with their root on the GFS. I haven't tried this
configuration, so I don't know what issues there may be. The security
concern you raise is real, though: since they are all running from
the same disk, a compromise on one can corrupt the entire cluster.
On my systems, I just have a small hard drive to hold the OS and
applications and then mount the GFS as a data partition.
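To make that concrete, the data-partition setup looks roughly like
this on each node (a sketch only; the device path, mount point, and
fstab entry are made up, and this assumes the shared device has
already been imported on the node):

```shell
# Each node mounts the same shared GFS volume as a data partition,
# while the OS and applications live on the node's local disk.
mount -t gfs /dev/gnbd/shared_data /mnt/data

# Or persistently, via an /etc/fstab entry like:
# /dev/gnbd/shared_data  /mnt/data  gfs  defaults  0 0
```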
Bowie
Greg Perry wrote:
> Also, after reviewing the GFS architecture it seems there would be
> significant security issues to consider, ie if one client/member of
> the GFS volume were compromised, that would lead to a full compromise
> of the filesystem across all nodes (and the ability to create special
> devices and modify the filesystem on any other GFS node member). Are
> there any plans to include any form of discretionary or mandatory
> access controls for GFS in the upcoming v2 release?
>
> Greg
>
> Greg Perry wrote:
> > Thanks Bowie, I understand more now. So within this architecture,
> > it would make more sense to utilize a RAID-5/10 SAN, then add
> > diskless workstations as needed for performance...?
> >
> > For said diskless workstations, does it make sense to run Stateless
> > Linux to keep the images the same across all of the
> > workstations/client machines?
> >
> > Regards
> >
> > Greg
> >
> > Bowie Bailey wrote:
> > > Greg Perry wrote:
> > > > I have been researching GFS for a few days, and I have some
> > > > questions that hopefully some seasoned users of GFS may be able
> > > > to answer.
> > > >
> > > > I am working on the design of a Linux cluster that needs to
> > > > be scalable. It will primarily be an RDBMS-driven data
> > > > warehouse used for data mining and content indexing. In an
> > > > ideal world, we would be able to start with a small (say,
> > > > 4-node) cluster, then add machines (and storage) as the
> > > > various RDBMSs grow in size, and use virtual IPs for load
> > > > balancing across multiple lighttpd instances. All nodes need
> > > > to be able to talk to the same volume of information, and GFS
> > > > (in theory at least) would be used to aggregate the drives
> > > > from each machine into that huge shared logical volume. With
> > > > that being said, here are some questions:
> > > >
> > > > 1) What is the preference on the RDBMS? Will MySQL 5.x work,
> > > > and are there any locking issues to consider? What would the
> > > > best open source RDBMS be (MySQL vs. PostgreSQL, etc.)?
> > >
> > > Someone more qualified than me will have to answer that question.
> > >
> > > > 2) If there was a 10 machine cluster, each with a 300GB SATA
> > > > drive, can you use GFS to aggregate all 10 drives into one big
> > > > logical 3000GB volume? Would that scenario work similarly to a
> > > > RAID array? If one or two nodes fail, but the GFS quorum is
> > > > maintained, can those nodes be replaced and repopulated just
> > > > like a RAID-5 array? If this scenario is possible, how
> > > > difficult is it to "grow" the shared logical volume by adding
> > > > additional nodes (say I had two more machines each with a 300GB
> > > > SATA drive)?
> > >
> > > GFS doesn't work that way. GFS is just a fancy filesystem. It
> > > takes an already shared volume and allows all of the nodes to
> > > access it at the same time.
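(For what it's worth, growing the storage happens a layer below GFS:
you extend the shared logical volume, then grow the filesystem into
the new space. A rough sketch, assuming clustered LVM and made-up
volume and mount-point names:)

```shell
# Extend the shared logical volume by 300GB
# (on a cluster this requires CLVM so all nodes see the change)
lvextend -L +300G /dev/shared_vg/shared_lv

# Then grow the GFS filesystem into the new space;
# gfs_grow is run against a mounted filesystem, from one node
gfs_grow /mnt/data
```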
> > >
> > > > 3) How stable is GFS currently, and is it used in many
> > > > production environments?
> > >
> > > It seems to be stable for me, but we are still in testing mode at
> > > the moment.
> > >
> > > > 4) How stable is the FC5 version, and does it include all of the
> > > > configuration utilities in the RH Enterprise Cluster version?
> > > > (the idea would be to prove the point on FC5, then migrate to RH
> > > > Enterprise).
> > >
> > > Haven't used that one.
> > >
> > > > 5) Would CentOS be preferred over FC5 for the initial
> > > > proof of concept and early adoption?
> > >
> > > If your eventual platform is RHEL, then CentOS would make more
> > > sense for a testing platform since it is almost identical to
> > > RHEL. Fedora can be less stable and may introduce some issues
> > > that you wouldn't have with RHEL. On the other hand, RHEL may
> > > have some problems that don't appear on Fedora because of updated
> > > packages.
> > >
> > > If you want bleeding edge, use Fedora.
> > > If you want stability, use CentOS or RHEL.
> > >
> > > > 6) Are there any restrictions or performance advantages of
> > > > using all drives with the same geometry, or can you mix and
> > > > match different size drives and just add to the aggregate
> > > > volume size?
> > >
> > > As I said earlier, GFS does not do the aggregation.
> > >
> > > What you get with GFS is the ability to share an already networked
> > > storage volume. You can use iSCSI, AoE, GNBD, or others to
> > > connect the storage to all of the cluster nodes. Then you format
> > > the volume with GFS so that it can be used with all of the nodes.
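A rough sketch of that workflow (device names, the cluster name, and
the journal count are assumptions; GNBD is shown here, but iSCSI or
AoE would look similar):

```shell
# On the storage server: export the shared block device over GNBD
gnbd_export -d /dev/sdb1 -e shared_data

# On each cluster node: import it so all nodes see the same device
gnbd_import -i storage-server

# Once, from any one node: format the shared device with GFS,
# using DLM locking, naming the cluster:filesystem, and creating
# one journal per node (4 here)
gfs_mkfs -p lock_dlm -t mycluster:shared_data -j 4 /dev/gnbd/shared_data
```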
> > >
> > > I believe there is a project working on the kind of aggregating
> > > filesystem you are looking for, but as far as I know, it is
> > > still in beta.
> > >
> >
> > --
> > Linux-cluster mailing list
> > Linux-cluster at redhat.com
> > https://www.redhat.com/mailman/listinfo/linux-cluster