[Linux-cluster] RHEL5 CLVMD hang
Nuno Fernandes
npf-mlists at eurotux.com
Fri Feb 1 15:34:45 UTC 2008
Hi,
CLVM is hung again. This time the problem started when we restarted clvmd on
one node (xen1).
xen2 then started to report:
Feb 1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 103e7
Feb 1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 10264
Feb 1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 10008
Feb 1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 10047
Feb 1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 1020b
...
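To see how many of these messages each node logs, and with which error code, the syslog lines can be tallied with a small filter. This is only a sketch: it assumes the syslog layout shown in the excerpt above (host in field 4, error code in field 8), and the function name tally_recover_errors is illustrative, not part of any cluster tool.

```shell
# tally_recover_errors: read syslog lines on stdin and count
# "dlm: recover_master_copy" messages per host and error code.
# Assumes the layout: Mon Day Time Host kernel: dlm: recover_master_copy CODE ID
tally_recover_errors() {
    awk '/dlm: recover_master_copy/ { count[$4 " " $8]++ }
         END { for (k in count) print k, count[k] }' | sort
}
```

On a node it could be fed from the system log, for example `tally_recover_errors < /var/log/messages` (the log path is an assumption about the syslog configuration).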
I have attached the group_tool dump.
All cluster nodes have dlm_recoverd blocked:
..
10191 ? D< 0:00 \_ [dlm_recoverd]
..
but only xen2 is logging these messages.
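The blocked state can be checked on each node by filtering ps output for tasks in uninterruptible sleep (state D, as in the dlm_recoverd line above). A minimal sketch; the function name list_blocked is illustrative, and it expects "PID STAT COMMAND" columns on stdin:

```shell
# list_blocked: read "PID STAT COMMAND" lines on stdin and print the PID
# and command of every task whose state starts with D (uninterruptible
# sleep), e.g. "D<" as in the listing above. A "PID STAT COMMAND" header
# line is skipped naturally because "STAT" does not start with D.
list_blocked() {
    awk '$2 ~ /^D/ { print $1, $3 }'
}
```

A possible invocation is `ps -eo pid,stat,comm | list_blocked`, run on each node in turn.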
# cman_tool services
type             level name     id       state
fence            0     default  00010008 none
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
dlm              1     clvmd    00010001 none
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
# clustat
Member Status: Quorate
Member Name            ID   Status
------ ----            ---- ------
xen1.dc.eurotux.pt     1    Online
xen2.dc.eurotux.pt     2    Online, Local
xen3.dc.eurotux.pt     3    Online
xen4.dc.eurotux.pt     4    Online
xen5.dc.eurotux.pt     5    Online
xen6.dc.eurotux.pt     6    Online
xen7.dc.eurotux.pt     7    Online
xen8.dc.eurotux.pt     8    Online
xen9.dc.eurotux.pt     9    Online
xen10.dc.eurotux.pt    10   Online
xen11.dc.eurotux.pt    11   Online
xen12.dc.eurotux.pt    12   Online
xen13.dc.eurotux.pt    13   Online
xen14.dc.eurotux.pt    14   Online
xen17.dc.eurotux.pt    15   Online
xen18.dc.eurotux.pt    16   Online
xen19.dc.eurotux.pt    17   Online
xen20.dc.eurotux.pt    18   Online
xen21.dc.eurotux.pt    19   Online
xen22.dc.eurotux.pt    20   Online
Any info on this?
Thanks
Nuno Fernandes
On Tuesday 22 January 2008 17:07:51 Nuno Fernandes wrote:
> On Tuesday 22 January 2008 09:13:55 Patrick Caulfield wrote:
> > Nuno Fernandes wrote:
> > > On Monday 21 January 2008 15:58:38 Patrick Caulfield wrote:
> > >> echo 255 > /sys/kernel/config/dlm/cluster/log_debug
> > >
> > > echo 255 > /sys/kernel/config/dlm/cluster/log_debug
> > > -bash: /sys/kernel/config/dlm/cluster/log_debug: Permission denied
> > >
> > > ls -la /sys/kernel/config/dlm/cluster/
> > > total 0
> > > drwxr-xr-x 4 root root 0 May 27 2007 .
> > > drwxr-xr-x 3 root root 0 May 27 2007 ..
> > > drwxr-xr-x 19 root root 0 Jan 17 16:36 comms
> > > drwxr-xr-x 3 root root 0 Nov 27 14:48 spaces
> >
> > No debug options! You need to upgrade the kernel, I'm afraid. It might
> > even fix the bug ;-)
> >
> > Patrick
>
> Solved. Rebooted the whole cluster! :(
>
> Thanks
> Nuno Fernandes
>
> --
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
-------------- next part --------------
A non-text attachment was scrubbed...
Name: group.dump.txt.gz
Type: application/x-gzip
Size: 13974 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080201/3b17a7d9/attachment.bin>