[rhelv6-list] Home-brew SAN for virtualization

Bryan J Smith b.j.smith at ieee.org
Wed Feb 26 04:10:39 UTC 2014


Chris Adams <linux at cmadams.net> wrote:
> Hmm, interesting to hear.  It would seem that iSCSI would be higher
> throughput than NFS, but maybe the many extra years of people pounding
> on Linux NFS (compared to the relatively young SCSI target code) has
> made it better.

The two are completely _non-comparable_.

iSCSI, like FC, is a block service, not directly usable by an OS.
NFS is a file service, with an optional lock manager, usable by an OS.

"Raw" block services are _not_ directly usable as a file system, and
have _no_ coherency (locking) capabilities from the OS's standpoint.

I.e., one must layer another set of OS services -- often on the end
device -- on top.  At a minimum, that means carving it up with volume
management (e.g., LVM), and then usually one of the following to make
it "usable" (a sketch follows this list) ...
 - File system (Ext4, XFS, etc...) + File service (NFS) for export
 - Cluster file system (GFS2) + Cluster Services (related) for mount
 - Etc...
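
As a very rough sketch of the first option -- the device name, VG/LV
names, size and NFS client below are all made up -- carving up a
"raw" LUN with LVM, putting XFS on it and exporting it over NFS looks
something like this, expressed as a small Python wrapper around the
usual tools:

    #!/usr/bin/env python3
    # Sketch only: /dev/sdb, the VG/LV names and "client1" are placeholders.
    import subprocess

    def run(*cmd):
        # Run a command as root; fail loudly on a non-zero exit.
        subprocess.run(cmd, check=True)

    run("pvcreate", "/dev/sdb")                    # the iSCSI/FC LUN as seen locally
    run("vgcreate", "vg_san", "/dev/sdb")          # volume management layer
    run("lvcreate", "-L", "100G", "-n", "lv_export", "vg_san")
    run("mkfs.xfs", "/dev/vg_san/lv_export")       # file system
    run("mount", "/dev/vg_san/lv_export", "/srv/export")
    run("exportfs", "-o", "rw,sync", "client1:/srv/export")  # file service (NFS)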

However ... since VMs _prefer_ blocks, and most VMs only execute on
one node at a time, LVM Volume Groups can be created atop "raw"
block storage.  The key is to ensure the "farm" never has more than
one hypervisor node activate any one Logical Volume at a time.
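
The mechanism itself is just LV activation; the hard part is the
farm-wide coordination, which is exactly what the management layer
has to provide.  A minimal sketch of the per-node side (the VG/LV
names and the qemu invocation are placeholders, and nothing here
arbitrates between nodes for you):

    import subprocess

    def run(*cmd):
        subprocess.run(cmd, check=True)

    # Activate the VM's disk LV on *this* hypervisor only, run the
    # guest, then deactivate so another node can take it over.
    run("lvchange", "-ay", "vg_vmstore/vm01_disk")
    try:
        run("qemu-kvm", "-drive", "file=/dev/vg_vmstore/vm01_disk,if=virtio")
    finally:
        run("lvchange", "-an", "vg_vmstore/vm01_disk")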

That's oVirt and agents (e.g., VDSM) right there.  ;)

GlusterFS is different from both approaches.  GlusterFS itself is a
shared file system service with many protocols built in -- not just
Native, but concurrent NFSv3 (pNFS 4.1 in the future), an API, etc...
The API (libgfapi) is how Samba, QEMU, etc... directly access a
Volume without going through a client-side mount.
In its base form, Gluster is a managed, local file system -- e.g., XFS
-- that stores files directly, but presents them either "replicated"
or "distributed" across bricks.  Instead of relying on the local file
system's locking and the OS services' locking, it has its own,
powerful locking xlator (translator), plus supporting xlators to
emulate everything else -- e.g., NLM4 for NFSv3 POSIX locking.  The
API exposes POSIX locking as well.
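
From a client's point of view, that means plain POSIX locks just
work, whether the Volume is mounted natively or over NFSv3, with the
locks xlator arbitrating them across bricks.  A tiny sketch (the
mount point and file name are made up):

    import fcntl

    # /mnt/gv0 is assumed to be an already-mounted Gluster volume.
    with open("/mnt/gv0/app.lock", "w") as lockfile:
        fcntl.lockf(lockfile, fcntl.LOCK_EX)   # exclusive POSIX lock
        # ... critical section: only one client holds this at a time ...
        fcntl.lockf(lockfile, fcntl.LOCK_UN)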

At the heart of Gluster is actually a single service ... glusterfsd,
one of which is spawned for every brick created.  That is where the
xlators not only live, but talk to each other -- across all Bricks in
a Volume -- to ensure locking coherency.  It works extremely well,
especially for load-balanced NFSv3 access (from multiple servers) to
a Gluster Volume, in addition to ensuring coherency with native
mounts.  The big difference between NFS and Native is that NFS goes
through an export server, whereas Native always talks directly to
wherever the Brick is.  In both cases, you don't even have to know
which server(s) physically hold the brick(s) in the Volume.
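
For illustration -- the hostnames, brick paths and volume name are
all placeholders -- creating a two-way replicated Volume and then
mounting it both ways might look like this:

    import subprocess

    def run(*cmd):
        subprocess.run(cmd, check=True)

    # On one of the Gluster servers: define and start a replicated volume.
    run("gluster", "volume", "create", "gv0", "replica", "2",
        "server1:/bricks/b1", "server2:/bricks/b1")
    run("gluster", "volume", "start", "gv0")

    # On a client: native (FUSE) mount -- talks directly to the bricks.
    run("mount", "-t", "glusterfs", "server1:/gv0", "/mnt/gv0-native")

    # On another client: NFSv3 mount -- goes through the export server.
    run("mount", "-t", "nfs", "-o", "vers=3,tcp", "server2:/gv0", "/mnt/gv0-nfs")

Either mount point behaves the same as far as namespace and locking
go; the glusterfsd processes behind the bricks keep them coherent.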

-- bjs



