[Linux-cluster] GFS, iSCSI, multipaths and RAID

Wendy Cheng s.wendy.cheng at gmail.com
Thu May 22 04:20:01 UTC 2008


Alex Kompel wrote:
> On Mon, May 19, 2008 at 2:15 PM, Michael O'Sullivan
> <michael.osullivan at auckland.ac.nz> wrote:
>   
>> Thanks for your response Wendy. Please see a diagram of the system at
>> http://www.ndsg.net.nz/ndsg_cluster.jpg/view (or
>> http://www.ndsg.net.nz/ndsg_cluster.jpg/image_view_fullscreen for the
>> fullscreen view) that (I hope) explains the setup. We are not using FC as we
>> are building the SAN with commodity components (the total cost of the system
>> was less than NZ $9000). The SAN is designed to hold files for staff and
>> students in our department, I'm not sure exactly what applications will use
>> the GFS. We are using iscsi-target software although we may upgrade to using
>> firmware in the future. We have used CLVM on top of software RAID. I agree
>> there are many levels to this system, but I couldn't find the necessary
>> hardware/software to implement this in a simpler way. I am hoping the list
>> may be helpful here.
>>
>>     
>
> So what do you want to get out of this configuration? iSCSI SAN, GFS
> cluster or both? I don't see any reason for 2 additional servers
> running GFS on top of iSCSI SAN.
>   
There are advantages to the two additional storage servers, since serving 
data traffic over an IP network carries its own overhead. They offload CPU 
and memory consumption from the GFS nodes. Done right, the setup could 
emulate a high-end SAN box with commodity hardware and provide a low-cost 
solution. The issue here is finding the right set of software components to 
build this configuration. I have personally never used the Linux iSCSI 
target or multipath md devices, so I can't comment on their features or 
performance characteristics. I was hoping folks well versed in these Linux 
modules (software RAID, dm-multipath, CLVM, etc.) could provide their 
comments. Check out the linux-lvm and/or dm-devel mailing lists - you may 
find good links and ideas there, or even start an interesting discussion 
from scratch.
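
For what it's worth, here is a rough sketch of how the layers might stack 
up, assuming the RAID lives on the storage servers and dm-multipath plus 
CLVM run on the GFS nodes (the device names, IQN, portal addresses, and 
cluster/filesystem names below are all made up - adjust to whatever your 
diagram actually uses):

  ## On each storage server: build the md RAID set and export it through
  ## the iSCSI target software (ietd.conf shown as one example; other
  ## target implementations are configured differently):
  mdadm --create /dev/md0 --level=5 --raid-devices=4 /dev/sd[b-e]
  #   /etc/ietd.conf:
  #     Target iqn.2008-05.nz.example:storage1.md0
  #         Lun 0 Path=/dev/md0,Type=blockio

  ## On each GFS node: log in to both storage servers with open-iscsi,
  ## let dm-multipath (or md multipath) tie the paths together, then put
  ## clustered LVM and GFS on top:
  iscsiadm -m discovery -t sendtargets -p 192.168.10.1
  iscsiadm -m discovery -t sendtargets -p 192.168.10.2
  iscsiadm -m node --login

  pvcreate /dev/mapper/mpath0                 # multipathed iSCSI device
  vgcreate -c y san_vg /dev/mapper/mpath0     # -c y marks the VG clustered
  lvcreate -L 500G -n gfs_lv san_vg

  ## GFS with DLM locking, one journal per GFS node (two here):
  gfs_mkfs -p lock_dlm -t ndsg_cluster:gfs00 -j 2 /dev/san_vg/gfs_lv

How each of these layers behaves when a storage server drops out or a path 
flaps is exactly the kind of thing the dm-devel and linux-lvm folks could 
speak to better than I can.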

So, if this configuration will be used as a research project, I'm 
certainly interested in reading the final report. Let us know what works 
and what sucks.

If it is for a production system storing critical data, it would be better 
to look at what is available on the market to replace the components 
grouped inside the "iscsi-raid" box in your diagram - that stack is too 
complicated to isolate issues in when problems pop up. There should be 
plenty of options out there (e.g. NetApp offers iSCSI SAN boxes with 
additional features such as failover, data de-duplication, backup, and 
performance monitoring). At the same time, it would be nice to have a 
support group to call if things go wrong.

From the GFS side, I learned from previous GFS-GNBD experience that 
serving data over an IP network has its own overhead and is not as cheap 
as people might expect. The issue is further complicated by the newer Red 
Hat cluster infrastructure, which also places a non-trivial amount of load 
on the TCP/IP stack. So separating these IP traffic types (cluster HA, 
data, and GFS node access by applications) should be a priority to make 
the whole setup work; one rough way to split them is sketched below.
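
For example, assuming three NICs per node (all names and addresses here 
are made up), the split could look something like this:

  # /etc/hosts fragment
  #
  # Cluster/heartbeat network - the cluster.conf <clusternode name="..."/>
  # entries should use these names, since cman/openais normally binds to
  # the interface the node name resolves to:
  10.0.0.1      node1-hb
  10.0.0.2      node2-hb
  #
  # iSCSI/data network - used as the portal addresses for iscsiadm, so
  # storage traffic stays off the heartbeat interface:
  192.168.10.1  storage1-san
  192.168.10.2  storage2-san
  #
  # Public network - what applications and users actually connect to:
  172.16.1.1    node1
  172.16.1.2    node2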

-- Wendy

More information about the Linux-cluster mailing list