[Linux-cluster] Bad day in writesville - Followup #2

Rick Stevens rstevens at vitalstream.com
Thu Dec 23 22:18:07 UTC 2004


Rick Stevens wrote:
> Rick Stevens wrote:
> 
>> Ken Preslan wrote:
>>
>>> On Tue, Dec 21, 2004 at 05:19:30PM -0800, Rick Stevens wrote:
>>>
>>>> 2. Do I have to destroy the filesystem and reformat it using the "-p 
>>>> lock_gulm" option?
>>>
>>>
>>> To change the module permanently, you can unmount the filesystem on all
>>> nodes and run the commands:
>>>
>>> gfs_tool sb <device> proto <module>
>>> gfs_tool sb <device> table <table>
>>>
>>> If it's just a temporary thing, you can unmount the filesystem on all
>>> nodes and mount with options to override the defaults in the superblock:
>>>
>>> mount -o lockproto=<module>,locktable=<table> <device> <mountpoint>
>>>
>>>
>>> <module> is whatever you'd pass to gfs_mkfs' -p option and <table>
>>> is whatever you'd pass to gfs_mkfs' -t option.
>>>
>>>
>>> Note that you need to be careful when you do either of these things.
>>> Running a mixed cluster where some machines are locking a FS with one
>>> protocol/table and other machines are locking the same FS with a
>>> different protocol/table is bad.  It is bound to end in great sadness.
>>
>> Gotcha.  After looking at the bugzilla entries referred to in other
>> replies to my question, it appears that LVM works fairly well with DLM
>> but has major issues with GULM.  However, someone mentioned that one can
>> use DLM to lock LVM and use GULM to lock GFS.
>>
>> Since the LVM stuff is pretty important, I intend to try that method.
>> I rebuilt LVM to use DLM/CMAN.  I've modprobed both lock_dlm and
>> lock_gulm and started clvmd.  I've used "cman_tool join" to set up DLM
>> and "vgchange -aly"d the LVM volume and it appeared.  I've used gfs_tool
>> to change the locking protocol on the filesystem to use GULM (the table
>> remains the same) and it mounted fine.  I'm about to start the same
>> stuff on the second node.  I'll keep you informed.
> 
> 
> Followup:
> 
> Well, that seems to be the fix.  I used CMAN/DLM to manage the LVM
> volumes via clvmd and I changed the GFS filesystem to use GULM as its
> locking mechanism.  No hiccups so far, so it appears that heavy write
> activity freaks out DLM when it's also managing GFS.
> 
> I wish I had traces and such to show you what was going on with CMAN/DLM
> only, but the machine locked up so hard that only a reboot could bring it
> back.
> 
> I'm going to keep testing this so there will be an additional follow up.
> I'd also like to thank Ken Preslan, Patrick Caulfield and Derek Anderson
> for their help on this.  I may be much closer to a solution and you
> chaps helped a lot.  Thanks!

Followup #2:

This is cool!  Based on the 20 December 2004 CVS code, I'm using
CMAN/DLM to manage the VG and LVM locking and GULM to manage the GFS
locking.  It works like a charm--even having both nodes write to the
same file.
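
In case anyone wants to replicate this, the rough per-node sequence is
below.  Device, VG and mount point names are just placeholders for my
setup--adjust to taste--and this assumes the CCS configs and the GULM
lock server are already in place.  Note the gfs_tool step only needs to
be done once, with the filesystem unmounted on all nodes:

  modprobe lock_dlm                     # DLM, used by clvmd
  modprobe lock_gulm                    # GULM, used by GFS
  cman_tool join                        # join the CMAN cluster
  clvmd                                 # start the cluster LVM daemon
  vgchange -aly vg0                     # activate the clustered VG
  gfs_tool sb /dev/vg0/gfs0 proto lock_gulm   # switch GFS to GULM locking
  mount -t gfs /dev/vg0/gfs0 /mnt/gfs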

For the final test, I added a new LU off the SAN to the VG, extended
the LV by the size of the new LU and was able to grow the GFS filesystem
on top of that...all without having to shut anything down.
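
For the record, the grow boiled down to roughly this (again, device and
mount point names are placeholders for my setup), run from one node with
the filesystem still mounted:

  pvcreate /dev/sdX                     # initialize the new LU from the SAN
  vgextend vg0 /dev/sdX                 # add it to the volume group
  lvextend -L +<size of new LU> /dev/vg0/gfs0
  gfs_grow /mnt/gfs                     # grow GFS in place, while mounted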

So, success is near!  Next, we plan to deploy this cluster into our real
environment and let our clients flog it.  If it crashes, our load
balancer will block access to it and I'll do a post-mortem at that
point.

Well, that's enough for now.  It's 14:15 on 23 December here in
California right now, and I'm skedaddling until Monday (well, the laptop
is full of stuff to do on Sunday).

Thanks again for the help, gang!  Have a Happy Holiday.  Don't eat or
drink too much!
----------------------------------------------------------------------
- Rick Stevens, Senior Systems Engineer     rstevens at vitalstream.com -
- VitalStream, Inc.                       http://www.vitalstream.com -
-                                                                    -
-    First Law of Work:                                              -
-    If you can't get it done in the first 24 hours, work nights.    -
----------------------------------------------------------------------



