[Linux-cluster] Cluster with shared storage on low budget

Gordan Bobic gordan at bobich.net
Tue Feb 15 10:23:36 UTC 2011


Fajar A. Nugraha wrote:
> On Tue, Feb 15, 2011 at 4:57 PM, Gordan Bobic <gordan at bobich.net> wrote:
>> Nikola Savic wrote:
>>>  If I understand you correctly, even before the sync is completely
>>> done DRBD will take care of reads and writes of dirty blocks on the
>>> problematic node that came back online? Let's say that node was down
>>> for a longer time and that synchronization takes a few minutes, maybe
>>> more. If all services start working before the sync is complete, a web
>>> application may try to write into or read from dirty block(s). Will
>>> DRBD take care of that? If not, is there a way to suspend the startup
>>> of services (web server and similar) until the sync is done?
>> DRBD and GFS will take care of that for you. DRBD directs reads to nodes
>> that are up to date until everything is in sync.
> 
> Really? Can you point to documentation that says so?
> IIRC the block device /dev/drbd* on a node will not be accessible for
> read/write until it's synced.

If you are running in primary/primary mode, the block device will most 
definitely be available in rw mode as soon as DRBD has connected to its 
peer and established where to get the most up-to-date copy from.
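
For what it's worth, the bits that put a resource into dual-primary mode 
look roughly like this (resource name, devices and addresses are just 
placeholders, adjust to taste):

resource r0 {
  protocol C;                  # dual-primary requires synchronous replication
  startup {
    become-primary-on both;    # promote both nodes when drbd starts
  }
  net {
    allow-two-primaries;       # this is what allows primary/primary
  }
  on node1 {
    device    /dev/drbd0;
    disk      /dev/sda3;
    address   192.168.0.1:7789;
    meta-disk internal;
  }
  on node2 {
    device    /dev/drbd0;
    disk      /dev/sda3;
    address   192.168.0.2:7789;
    meta-disk internal;
  }
}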

I haven't looked through the documentation recently, so I don't have a 
link handy, but I have several clusters deployed with this setup, so I'm 
reasonably confident I know what I'm talking about. :)

>> Make sure that in drbd.conf you put in a stonith parameter pointing at
>> your fencing agent with suitable parameters, and set the timeout to
>> slightly less than what you have it set to in cluster.conf. That will
>> ensure you are protected from the race condition where DRBD drops out
>> but the node starts heartbeating again between then and when the
>> fencing timeout occurs.
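
To be concrete, the drbd.conf side of that looks something like this on 
my setups (the handler path and the numbers are only illustrative; they 
depend on how your drbd packages were built and on the fence timings in 
your cluster.conf):

resource r0 {
  disk {
    fencing resource-and-stonith;   # freeze I/O and call the fence handler
  }
  handlers {
    # helper shipped with drbd that fences the peer via fence_node;
    # adjust the path to wherever your packages install it
    fence-peer "/usr/lib/drbd/obliterate-peer.sh";
  }
  net {
    timeout 50;   # 5s (units of 0.1s); keep this a little below the
                  # fencing/post_fail_delay timings in cluster.conf
  }
}
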
>>
>> Oh, and if you are going to use DRBD there is no reason to use LVM.
> 
> There are two ways to use DRBD with LVM in a cluster:
> (1) use drbd on a partition/disk, and use CLVM on top of that
> (2) create local LVM, and use drbd on top of the LVs
> 
> Personally I prefer (2), since this setup allows LVM snapshots and is
> faster to resync if I want to reinitialize a drbd device on one of the
> nodes (like when a split brain occurred, which happened often on my
> fencing-less test setup a while back).
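
Just to spell out the two stackings you are describing (device names 
purely illustrative):

  (1) /dev/sda3 -> /dev/drbd0 -> CLVM PV/VG/LVs -> GFS2
  (2) /dev/sda3 -> local LVM LV -> /dev/drbd0 -> GFS2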

I don't see what the purpose of (1) is. I can sort of see where you are 
coming from with snapshots in (2), but what you are describing doesn't 
sound like something you would ever want to use in production.

Just because you _can_ use LVM doesn't mean that you _should_ use it. 
Another bad thing about LVM when you use it on top of RAID or an SSD is 
that its metadata headers push the start of the data area to an odd 
offset and throw the FS completely out of alignment unless you 
pre-compensate for them.
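
If you do end up putting LVM on such a device anyway, the 
pre-compensation can be done when you create the PV, along these lines 
(the 1MiB figure is just an example; match it to your stripe width or 
erase block size, and on older lvm2 without --dataalignment you can get 
a similar effect with --metadatasize):

# start the data area on a 1MiB boundary so the FS on top stays aligned
pvcreate --dataalignment 1m /dev/md0

# check where the first physical extent actually starts
pvs -o +pe_start /dev/md0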

Gordan



