[Linux-cluster] GFS, iSCSI, multipaths and RAID
Wendy Cheng
s.wendy.cheng at gmail.com
Thu May 22 04:20:01 UTC 2008
Alex Kompel wrote:
> On Mon, May 19, 2008 at 2:15 PM, Michael O'Sullivan
> <michael.osullivan at auckland.ac.nz> wrote:
>
>> Thanks for your response Wendy. Please see a diagram of the system at
>> http://www.ndsg.net.nz/ndsg_cluster.jpg/view (or
>> http://www.ndsg.net.nz/ndsg_cluster.jpg/image_view_fullscreen for the
>> fullscreen view) that (I hope) explains the setup. We are not using FC as we
>> are building the SAN with commodity components (the total cost of the system
>> was less than NZ $9000). The SAN is designed to hold files for staff and
>> students in our department, I'm not sure exactly what applications will use
>> the GFS. We are using iscsi-target software although we may upgrade to using
>> firmware in the future. We have used CLVM on top of software RAID, I agree
>> there are many levels to this system, but I couldn't find the necessary
>> hardware/software to implement this in a simpler way. I am hoping the list
>> may be helpful here.
>>
>>
>
> So what do you want to get out of this configuration? iSCSI SAN, GFS
> cluster or both? I don't see any reason for 2 additional servers
> running GFS on top of iSCSI SAN.
>
There are advantages to the two additional storage servers, because
serving data traffic over an IP network has its own overhead: they
offload CPU and memory consumption from the GFS nodes. Done right, the
setup could emulate a high-end SAN box using commodity hardware to
provide a low-cost solution. The issue here is finding the right set of
software subcomponents to build this configuration. I personally have
never used the Linux iSCSI target or multipath md devices, so I can't
comment on their features and/or performance characteristics. I was
hoping folks well versed in these Linux modules (software RAID,
dm-multipath, CLVM RAID levels, etc.) could provide their comments.
Check out the linux-lvm and/or dm-devel mailing lists .. you may be able
to find good links and/or ideas there, or even start interesting
discussions from scratch.
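For anyone exploring that stack, a rough sketch of the pieces under discussion might look like the following. This is an untested illustration, not a recommendation: the device names, the IQN, and the iscsitarget (ietd) and dm-multipath config fragments are assumptions based on common setups, not anything from the poster's actual system.

```shell
# On each storage server: build a software RAID set from commodity
# disks (device names are placeholders).
mdadm --create /dev/md0 --level=5 --raid-devices=4 \
    /dev/sdb /dev/sdc /dev/sdd /dev/sde

# Export the md device as an iSCSI LUN with iscsitarget (ietd).
# Hypothetical /etc/ietd.conf entry:
#
#   Target iqn.2008-05.nz.ac.auckland:storage1.disk0
#       Lun 0 Path=/dev/md0,Type=fileio

# On each GFS node: discover and log in to the targets, then let
# dm-multipath coalesce the duplicate paths into a single device.
# Minimal /etc/multipath.conf sketch:
#
#   defaults {
#       path_grouping_policy    failover
#   }

multipath -ll   # verify the resulting multipath maps
```

CLVM would then sit on top of the multipath devices, with GFS on the clustered logical volumes, which is exactly the layering depth being questioned in this thread.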
So, if this configuration will be used as a research project, I'm
certainly interested in reading the final report. Let us know what works
and what doesn't.
If it is for a production system storing critical data, it would be
better to do more research into what is available on the market (to
replace the components grouped inside the "iscsi-raid" box in your
diagram - it is too complicated to isolate issues if problems pop up).
There should be plenty of options out there (e.g. NetApp offers iSCSI
SAN boxes with additional features such as failover, data
de-duplication, backup, and performance monitoring). At the same time,
it would be nice to have a support group to call if things go wrong.
From the GFS side, I learned from previous GFS-GNBD experience that
serving data over IP networks has its overhead and is not as cheap as
people might expect. The issue is further complicated by the newer
Red Hat cluster infrastructure, which also places a non-trivial amount
of load on the TCP/IP stack. So separating these IP traffic streams
(cluster HA, data, and application access to the GFS nodes) should be a
priority to make the whole setup work.
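One common way to do that separation (a hedged sketch - the interface names, subnets, and addresses below are made up for illustration) is to give each traffic class its own NIC and subnet, and make the cluster node names resolve to the dedicated heartbeat interface:

```shell
# Illustrative layout, one line per NIC per node:
#   eth0: application/client access   (e.g. 192.168.0.0/24)
#   eth1: iSCSI data traffic          (e.g. 10.0.1.0/24)
#   eth2: cluster HA / heartbeat      (e.g. 10.0.2.0/24)
#
# /etc/hosts on every node - the cluster node names resolve to the
# HA subnet, so cluster heartbeat traffic stays off the data path:
#
#   10.0.2.1   node1    # eth2, cluster heartbeat
#   10.0.2.2   node2
#
# iSCSI sessions are kept on eth1 by logging in to the targets via
# their 10.0.1.x addresses only, so storage I/O never competes with
# either heartbeat or client traffic.
```

The point, per the paragraph above, is simply that heartbeat, storage I/O, and application traffic should not share one saturated link.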
-- Wendy