[Linux-cluster] GFS limits?

Don MacAskill don at smugmug.com
Wed Jul 14 03:17:07 UTC 2004



Brian Jackson wrote:

> 
> The code that most people on this list are interested in currently is
> the code in cvs which is for 2.6 only. 2.6 has a config option to
> enable using devices larger than 2TB. I'm still reading through all
> the GFS code, but it's still architecturally the same as when it was
> closed source, so I'm pretty sure most of my knowledge from OpenGFS
> will still apply. GFS uses 64bit values internally, so you can have
> very large filesystems (larger than PBs).
> 

This is nice.  I was specifically thinking of 64-bit machines, in which 
case I'd expect it to be 9EB or something.
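
(For what it's worth, that figure is just the signed 64-bit byte count:

    2^63 bytes = 9,223,372,036,854,775,808 bytes, i.e. roughly 9.2 EB (8 EiB)

so "9EB or something" is about right for a 64-bit offset; the actual GFS 
ceiling presumably also depends on block size and exactly where it uses 
those 64-bit values.)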

> 
>>Our current (homegrown) solution will scale very well for quite some
>>time, but eventually we're going to get saturated with write requests to
>>individual head units.  Does GFS intelligently "spread the load" among
>>multiple storage entities for writing under high load?
> 
> 
> No, each node that mounts has direct access to the storage. It writes
> just like any other fs, when it can.
> 

So, if I have a dozen separate arrays in a given cluster, it will write 
data linearly to array #1, then array #2, then array #3?  If that's the 
case, GFS doesn't solve my biggest fear - write performance with a huge 
influx of data.  I'd hoped it might somehow "stripe" the data across 
individual units so that we can aggregate the combined interface 
bandwidth to some extent.
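
To make concrete what I mean by "stripe", here's a rough sketch (plain 
Python, hypothetical device paths, and *not* something I'm claiming GFS 
actually does): successive chunks of one large write get mapped 
round-robin across the arrays the way a simple striped volume would lay 
them out, so the write hits every array at once instead of filling 
array #1, then #2, then #3.

    CHUNK = 64 * 1024  # arbitrary 64 KiB stripe unit
    DEVICES = ["/dev/array1", "/dev/array2", "/dev/array3"]  # hypothetical paths

    def stripe_write(data, paths=DEVICES, chunk=CHUNK):
        """Spread one big sequential write across every device, chunk by
        chunk, the way a simple striped volume would."""
        n = len(paths)
        handles = [open(p, "r+b") for p in paths]
        try:
            for i in range(0, len(data), chunk):
                stripe_no = i // chunk
                dev = handles[stripe_no % n]        # which array gets this chunk
                dev.seek((stripe_no // n) * chunk)  # where on that array it lands
                dev.write(data[i:i + chunk])
        finally:
            for h in handles:
                h.close()

Whether GFS, or the pool/volume layer underneath it, can be set up to 
behave anything like this is exactly what I'm trying to figure out.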


> 
>>Does it always
>>write to any available storage units, or are there thresholds where it
>>expands the pool of units it writes to?  (I'm not sure I'm making much
>>sense, but we'll see if any of you grok it :)
> 
> 
> I think you may have a little misconception about just what GFS is.
> You should check the WHATIS_OpenGFS doc at
> http://opengfs.sourceforge.net/docs.php It says OpenGFS, but for the
> most part, the same stuff applies to GFS.
> 

I've read it, along with quite a few other documents and whitepapers on 
GFS, several times over, but perhaps you're right - I must be missing 
something.  More on this below...


>>I notice the pricing for GFS is $2200.  Is that per seat?  And if so,
>>what's a "seat"?  Each client?  Each server with storage participating
>>in the cluster?  Both?  Some other distinction?
> 
> 
> Now I definitely know you have some misconception. GFS doesn't have
> any concept of server and client. All nodes mount the fs directly
> since they are all directly connected to the storage.
> 

Hmm, yes, this is probably my sticking point.  It was my understanding 
(or maybe just my hope?) that servers could participate as "storage 
units" in the cluster by exporting their block devices, in addition to 
FC or iSCSI or whatever devices which aren't technically 'servers'.

In other words, I was thinking/hoping that the cluster consisted of 
block units aggregated into a filesystem, and that the filesystem could 
consist of FC RAID devices, iSCSI solutions, and "dumb servers" that 
just exported their local disks to the cluster FS.

Am I totally wrong?  I guess it's GNBD I don't totally understand, so 
I'd better go read up on it.

Thanks,

Don
