[Linux-cluster] Rhel 5.7 Cluster - gfs2 volume in "LEAVE_START_WAIT" status
Cedric Kimaru
rhel_cluster at ckimaru.com
Mon Jun 4 13:29:13 UTC 2012
Hi Emmanuel,
Yes, I'm running gfs2. I'm also trying this out on RHEL 6.2 with three
nodes to see if this happens upstream.
Looks like I may have to open a BZ to get more info on this.
root@bl13-node13:~# gfs2_tool list
253:15 cluster3:cluster3_disk6
253:16 cluster3:cluster3_disk3
253:18 cluster3:disk10
253:17 cluster3:cluster3_disk9
253:19 cluster3:cluster3_disk8
253:21 cluster3:cluster3_disk7
253:22 cluster3:cluster3_disk2
253:23 cluster3:cluster3_disk1
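
In case it helps with comparing state across the nodes, something like the
following (just a rough sketch: it assumes passwordless ssh between the nodes,
and the node names and grep pattern are only examples taken from clustat) will
dump the group state for the stuck filesystem from every member:

# rough sketch: adjust the node list and the group name to match your cluster
for n in bl01-node01 bl04-node04 bl05-node05; do
    echo "== $n =="
    ssh "$n" 'group_tool -v | grep cluster3_disk2'
done
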
thanks,
-Cedric
On Sun, Jun 3, 2012 at 1:17 PM, emmanuel segura <emi2fast at gmail.com> wrote:
> Hello Cedric
>
> Are you using gfs or gfs2? If you are using gfs, I recommend switching to gfs2.
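>
> (A quick way to check which one is actually mounted is something like
> "grep gfs /proc/mounts" on one of the nodes; the third column there is the
> filesystem type, gfs or gfs2.)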
>
> 2012/6/3 Cedric Kimaru <rhel_cluster at ckimaru.com>
>
>> Fellow Cluster Compatriots,
>> I'm looking for some guidance here. Whenever my RHEL 5.7 cluster gets
>> into "LEAVE_START_WAIT" on a given iSCSI volume, the following
>> occurs:
>>
>>    1. I can't do any read/write I/O to the volume.
>>    2. I can't unmount it from any node.
>>    3. In-flight/pending I/Os are impossible to identify or kill, since
>>    lsof on the mount fails. Basically all I/O operations stall or fail
>>    (see the note below the list).
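>>
>>    (On item 3: when lsof hangs, one generic thing that usually still works is
>>    listing tasks stuck in uninterruptible sleep, e.g. something like
>>    ps axo pid,stat,wchan:32,comm | awk '$2 ~ /^D/'
>>    That is just a rough sketch, not GFS2-specific.)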
>>
>> So my questions are:
>>
>>    1. What does the output from group_tool -v really indicate, e.g.
>>    "00030005 LEAVE_START_WAIT 12 c000b0002 1"? The group_tool man page
>>    doesn't list these fields.
>>    2. Does anyone have a list of what these fields represent?
>>    3. Corrective actions: how do I get out of this state without
>>    rebooting the entire cluster?
>>    4. Is it possible to determine the offending node?
>>
>> thanks,
>> -Cedric
>>
>>
>> //misc output
>>
>> root@bl13-node13:~# clustat
>> Cluster Status for cluster3 @ Sat Jun 2 20:47:08 2012
>> Member Status: Quorate
>>
>>  Member Name                    ID   Status
>>  ------ ----                    ---- ------
>>  bl01-node01                       1 Online, rgmanager
>>  bl04-node04                       4 Online, rgmanager
>>  bl05-node05                       5 Online, rgmanager
>>  bl06-node06                       6 Online, rgmanager
>>  bl07-node07                       7 Online, rgmanager
>>  bl08-node08                       8 Online, rgmanager
>>  bl09-node09                       9 Online, rgmanager
>>  bl10-node10                      10 Online, rgmanager
>>  bl11-node11                      11 Online, rgmanager
>>  bl12-node12                      12 Online, rgmanager
>>  bl13-node13                      13 Online, Local, rgmanager
>>  bl14-node14                      14 Online, rgmanager
>>  bl15-node15                      15 Online, rgmanager
>>
>>  Service Name                   Owner (Last)                   State
>>  ------- ----                   ----- ------                   -----
>>  service:httpd                  bl05-node05                    started
>>  service:nfs_disk2              bl08-node08                    started
>>
>>
>> root@bl13-node13:~# group_tool -v
>> type level name id state node id local_done
>> fence 0 default 0001000d none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm 1 clvmd 0001000c none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm 1 cluster3_disk1 00020005 none
>> [4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm 1 cluster3_disk2 00040005 none
>> [4 5 6 7 8 9 10 11 13 14 15]
>> dlm 1 cluster3_disk7 00060005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm 1 cluster3_disk8 00080005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm 1 cluster3_disk9 000a0005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm 1 disk10 000c0005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm 1 rgmanager 0001000a none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> dlm 1 cluster3_disk3 00020001 none
>> [1 5 6 7 8 9 10 11 12 13]
>> dlm 1 cluster3_disk6 00020008 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs 2 cluster3_disk1 00010005 none
>> [4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs 2 cluster3_disk2 00030005 LEAVE_START_WAIT 12 c000b0002 1
>> [4 5 6 7 8 9 10 11 13 14 15]
>> gfs 2 cluster3_disk7 00050005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs 2 cluster3_disk8 00070005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs 2 cluster3_disk9 00090005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs 2 disk10 000b0005 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>> gfs 2 cluster3_disk3 00010001 none
>> [1 5 6 7 8 9 10 11 12 13]
>> gfs 2 cluster3_disk6 00010008 none
>> [1 4 5 6 7 8 9 10 11 12 13 14 15]
>>
>> root@bl13-node13:~# gfs2_tool list
>> 253:15 cluster3:cluster3_disk6
>> 253:16 cluster3:cluster3_disk3
>> 253:18 cluster3:disk10
>> 253:17 cluster3:cluster3_disk9
>> 253:19 cluster3:cluster3_disk8
>> 253:21 cluster3:cluster3_disk7
>> 253:22 cluster3:cluster3_disk2
>> 253:23 cluster3:cluster3_disk1
>>
>> root@bl13-node13:~# lvs
>> Logging initialised at Sat Jun 2 20:50:03 2012
>> Set umask from 0022 to 0077
>> Finding all logical volumes
>>   LV                            VG                            Attr   LSize   Origin Snap%  Move Log Copy%  Convert
>>   lv_cluster3_Disk7             vg_Cluster3_Disk7             -wi-ao   3.00T
>>   lv_cluster3_Disk9             vg_Cluster3_Disk9             -wi-ao 200.01G
>>   lv_Cluster3_libvert           vg_Cluster3_libvert           -wi-a- 100.00G
>>   lv_cluster3_disk1             vg_cluster3_disk1             -wi-ao 100.00G
>>   lv_cluster3_disk10            vg_cluster3_disk10            -wi-ao  15.00T
>>   lv_cluster3_disk2             vg_cluster3_disk2             -wi-ao 220.00G
>>   lv_cluster3_disk3             vg_cluster3_disk3             -wi-ao 330.00G
>>   lv_cluster3_disk4_1T-kvm-thin vg_cluster3_disk4_1T-kvm-thin -wi-a-   1.00T
>>   lv_cluster3_disk5             vg_cluster3_disk5             -wi-a- 555.00G
>>   lv_cluster3_disk6             vg_cluster3_disk6             -wi-ao   2.00T
>>   lv_cluster3_disk8             vg_cluster3_disk8             -wi-ao   2.00T
>>
>>
>
>
>
> --
> this is my life and I live it as long as God wills
>