[Linux-cluster] help on configuring a shared gfs volume in a load balanced http cluster

Fri Jul 25 11:05:04 UTC 2008

On Fri, 25 Jul 2008, Alex wrote:

> On Thursday 24 July 2008 15:59, gordan at bobich.net wrote:
>> So, shd machines are actually SANs. You will need to use something like
>> DRBD if you want shd machines mirrored
>
> Hello Gordan,
>
> I am confused because I haven't done this job in the past and have no
> experience with this service. I would like to break this task into small
> steps, in order to understand what to do, so my questions come below:
>
> Actually, I just want to have hdb1 from shd1 and hdc1 from shd2 joined in one
> volume. No mirror for this volume at the moment. Is it possible? If yes, how?
> Using ATAoE?

Set up ATAoE on shd and use it to export a volume. Connect to this ATAoE 
share from the front end nodes. You can then use something like Cluster 
LVM (CLVM) to unify them into one volume.

Then create GFS on this volume.
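
As a rough sketch of those steps, here is a small Python script that only
prints the commands, without running anything. The shelf/slot numbers, the
eth0 interface and the webvg/weblv names are made up for illustration, and
it assumes vblade and aoetools are installed and that clvmd is already
running on the rs nodes.

```python
def export_cmd(shelf, slot, iface, device):
    """Command run on a storage box to export `device` over ATAoE with vblade."""
    return ["vblade", str(shelf), str(slot), iface, device]


def aoe_path(shelf, slot):
    """Device node the aoe driver creates on a client for a given shelf/slot."""
    return f"/dev/etherd/e{shelf}.{slot}"


def import_cmds():
    """Commands run on every rs node to see the exported devices."""
    return [["modprobe", "aoe"], ["aoe-discover"], ["aoe-stat"]]


def pool_cmds(vg, lv, devices):
    """Commands run once, from one rs node, to join the devices into one LV."""
    return [
        ["pvcreate", *devices],
        ["vgcreate", "-c", "y", vg, *devices],      # -c y marks the VG as clustered
        ["lvcreate", "-l", "100%FREE", "-n", lv, vg],
    ]


if __name__ == "__main__":
    # Hypothetical layout: shd1 exports hdb1 as e0.0, shd2 exports hdc1 as e1.0.
    print("shd1:", " ".join(export_cmd(0, 0, "eth0", "/dev/hdb1")))
    print("shd2:", " ".join(export_cmd(1, 0, "eth0", "/dev/hdc1")))
    for cmd in import_cmds():
        print("each rs:", " ".join(cmd))
    for cmd in pool_cmds("webvg", "weblv", [aoe_path(0, 0), aoe_path(1, 0)]):
        print("one rs:", " ".join(cmd))
```

Check the printed commands against the man pages on your boxes before
running any of them by hand.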

Note that if you lose either of the two shd machines you will likely lose 
all the data.
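
To put a rough number on that: if each shd box fails independently with
some probability over a given period, a volume spanning both is lost when
either one fails. The 5% figure below is purely a made-up example, not a
measurement.

```python
def loss_probability(p_fail, machines):
    """Chance that at least one of `machines` independent boxes fails."""
    return 1.0 - (1.0 - p_fail) ** machines


if __name__ == "__main__":
    p = 0.05  # hypothetical per-machine failure probability
    print("one box:  ", loss_probability(p, 1))
    print("two boxes:", loss_probability(p, 2))
```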

> After that, I would like to know how to install GFS on this volume and use it
> as the document root on our real web servers (rs1, rs2, rs3). Is it possible?
> If yes, how?

Yes. Once you have the logical volume consisting of shd1 and shd2, create 
the GFS on it as per the docs (mkfs.gfs), mount it where you want it, 
and point Apache at that path. There is nothing magical about it; it's just 
like any other file system once you have it mounted.
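
A minimal sketch of what that might look like, again as Python that only
prints the commands. The cluster name "webcluster", the file system name
"docroot", the device path and the mount point are all assumptions; the
cluster name has to match whatever is in your cluster.conf, and you need
one journal per node that will mount the file system.

```python
def mkfs_gfs_cmd(cluster, fs_name, journals, device):
    """Build a mkfs.gfs command line using DLM locking."""
    if journals < 1:
        raise ValueError("need at least one journal")
    return ["mkfs.gfs", "-p", "lock_dlm", "-t", f"{cluster}:{fs_name}",
            "-j", str(journals), device]


def mount_cmd(device, mount_point):
    """Build the mount command each rs node would run."""
    return ["mount", "-t", "gfs", device, mount_point]


if __name__ == "__main__":
    nodes = ["rs1", "rs2", "rs3"]
    device = "/dev/webvg/weblv"   # hypothetical logical volume
    docroot = "/var/www/html"     # hypothetical Apache DocumentRoot
    # Run once, from one node:
    print(" ".join(mkfs_gfs_cmd("webcluster", "docroot", len(nodes), device)))
    # Run on every node:
    for node in nodes:
        print(f"{node}:", " ".join(mount_cmd(device, docroot)))
```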

> I don't understand from your explanation how to group the machines: should
> shd1 and shd2 be in one cluster and rs1, rs2 and rs3 in another cluster

I don't see why you need the shd1 and shd2 machines in a cluster. They are 
just SANs. Unless they are mirroring each other or being each other's 
backup, there is no obvious reason from your example why they 
should be clustered together.

> or: should shd1 and shd2 be regular servers which are just exporting their 
> HDD

Yes. And with ATAoE or iSCSI you don't export the HDD per se - you export 
a "volume", which can be a partition or just a file on shd's file system 
that is effectively a disk image.

> using ATAoE and rs1, rs2 and rs3 to be grouped in one cluster which 
> are importing a GFS volume from somewhere?

rs machines would import the ATAoE volumes, establish a logical volume on 
top of them, and then start the GFS file system on top of that.

> If yes, from where? How can I configure a GFS volume on
> ATAoE disks and from where will it be accessible?

It will be accessible from any machine in the cluster the GFS volume is 
built for (in this case the rs set), once they connect to the ATAoE shares 
from the shds (or iSCSI shares, if that's what you use; there isn't THAT 
much difference between them).

> Do I need another machine
> which will act as an aggregator for the ATAoE disks, or will our real web
> servers (rs1, rs2, rs3) be responsible for importing these disks directly?

You don't need an aggregator; you can unify the volumes using CLVM into one 
big logical volume, and have GFS live on top of that.

>> and ATAoE or iSCSI to export the
>> volumes for the rs machines to mount.
>
> In our lab we are using regular hard disks, so iSCSI is excluded.

iSCSI is a network protocol, nothing to do with SCSI disks per se.
It's SCSI commands carried over TCP/IP. You can export any file on a 
machine as a volume using iSCSI. Whether the underlying disk is SCSI, ATA 
or something exotic is entirely irrelevant.

ATAoE and iSCSI are both applicable to your case. ATAoE has somewhat lower 
overheads (read: a little faster) but runs directly on the ethernet layer, 
so it is not routable. iSCSI is TCP based, so it is routable. iSCSI is also 
a little more mature.

> I read an article here (http://www.linuxjournal.com/article/8149) about ATAoE
> and i have some questions:
>
> - on our CentOS 5.2 boxes, we already have the aoe kernel module but we don't
> have the aoe-stat command. Is there a package I should install via yum to get
> this command (or another command to handle aoe disks), or is it required to
> download aoetools-26.tar.gz and compile from source
> (http://sourceforge.net/projects/aoetools/)?
>
> - in the above article they are talking about RAID10, LVM and JFS. They are not
> teaching me about GFS and clustering. They chose JFS and not GFS, saying that
> "JFS is a filesystem that can grow dynamically to large sizes, so he is going
> to put a JFS filesystem on a logical volume". I want that but using GFS; is it
> possible or not?

There are several concepts and technologies you need to go read up on 
before getting further with this:
ATAoE
iSCSI
LVM/CLVM for volume management

If you add additional volumes (e.g. exported via iSCSI or ATAoE) to your 
SAN boxes, you can add them into your CLVM volume you have GFS on top of, 
and the virtual "disk" (logical volume) will show as being bigger. You can 
then grow the GFS file system on this volume and have it extend onto the 
additional space.
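
Roughly, growing would go something like the sketch below, which again only
prints the commands. The new device /dev/etherd/e2.0 and the webvg/weblv
names and mount point are hypothetical, and it assumes the GFS is mounted
when gfs_grow is run.

```python
def grow_cmds(vg, lv, new_device, mount_point):
    """Commands to add a new exported device and extend GFS onto it."""
    return [
        ["pvcreate", new_device],
        ["vgextend", vg, new_device],
        ["lvextend", "-l", "+100%FREE", f"/dev/{vg}/{lv}"],
        ["gfs_grow", mount_point],  # run on one node, with the fs mounted
    ]


if __name__ == "__main__":
    for cmd in grow_cmds("webvg", "weblv", "/dev/etherd/e2.0", "/var/www/html"):
        print(" ".join(cmd))
```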

> They are saying that:
>
> "using a cluster filesystem such as GFS, it is possible for multiple hosts on
> the Ethernet network to access the same block storage using ATA over
> Ethernet. There's no need for anything like an NFS server"

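NFS and GFS are sort of equivalent, layer wise: both give several machines 
the same file system, but NFS goes through a server, while GFS nodes access 
the shared block device directly.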

> "But there's a snag. Any time you're using a lot of disks, you're increasing
> the chances that one of the disks will fail. Usually you use RAID to take
> care of this issue by introducing some redundancy. Unfortunately, Linux
> software RAID is not cluster-aware. That means each host on the network
> cannot do RAID 10 using mdadm and have things simply work out."

What they are saying is that you can't export two ATAoE/iSCSI shares, have 
mdadm RAID on top, and then have GFS on top, because the mdadm layer isn't 
cluster aware. But you aren't using RAID on that level.

RAID would be on the shd machines (hardware or mdadm RAID on the disks you 
use for storage), before any exporting via ATAoE or iSCSI happens.

If you want the servers mirrored (i.e. RAID1), that's what you would use 
DRBD for, as I mentioned earlier. But then you wouldn't mount a share from 
each machine; you'd mount just one of the two, and have the shds clustered 
for fail-over.

> So, finally, what should i do? Can you or anybody suggest me some howtos and
> what is the correct order to group machines and implement clustering?

See above. Have a Google around for the things I mentioned, and ask more 
specific questions. :)

Gordan