[Linux-cluster] RHEL5 CLVMD hang

Nuno Fernandes npf-mlists at eurotux.com
Fri Feb 1 15:34:45 UTC 2008


Hi,

CLVM is hung again. This time, the problem started when we restarted clvmd on
one node (xen1).

xen2 then started to report:

Feb  1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 103e7
Feb  1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 10264
Feb  1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 10008
Feb  1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 10047
Feb  1 15:26:34 xen2 kernel: dlm: recover_master_copy -53 1020b
...

I have attached the group_tool dump.

All cluster nodes have dlm_recoverd blocked:

..
10191 ?        D<     0:00  \_ [dlm_recoverd]
..

but only xen2 is logging these messages.
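
For anyone wanting to run the same check on their own nodes, here is a minimal sketch of how blocked processes can be listed. It assumes procps `ps` and `awk` are available; the helper name `filter_d_state` is made up for illustration and is not part of any cluster tool.

```shell
# Print processes in uninterruptible sleep (state D), e.g. a blocked dlm_recoverd.
# Reads "PID STAT COMMAND" lines on stdin, so it works on live ps output
# or on a saved listing. Only the first word of the command name is kept.
filter_d_state() {
    awk '$2 ~ /^D/ { print $1, $2, $3 }'
}

# Typical use on a node (the trailing "=" suppresses the ps header line):
ps -eo pid=,stat=,comm= | filter_d_state
```

A process that stays in this list across several runs is most likely stuck rather than just waiting on I/O for a moment.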


# cman_tool services
type             level name     id       state
fence            0     default  00010008 none
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]
dlm              1     clvmd    00010001 none
[1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20]

# clustat
Member Status: Quorate

  Member Name                        ID   Status
  ------ ----                        ---- ------
  xen1.dc.eurotux.pt                      1 Online
  xen2.dc.eurotux.pt                      2 Online, Local
  xen3.dc.eurotux.pt                      3 Online
  xen4.dc.eurotux.pt                      4 Online
  xen5.dc.eurotux.pt                      5 Online
  xen6.dc.eurotux.pt                      6 Online
  xen7.dc.eurotux.pt                      7 Online
  xen8.dc.eurotux.pt                      8 Online
  xen9.dc.eurotux.pt                      9 Online
  xen10.dc.eurotux.pt                    10 Online
  xen11.dc.eurotux.pt                    11 Online
  xen12.dc.eurotux.pt                    12 Online
  xen13.dc.eurotux.pt                    13 Online
  xen14.dc.eurotux.pt                    14 Online
  xen17.dc.eurotux.pt                    15 Online
  xen18.dc.eurotux.pt                    16 Online
  xen19.dc.eurotux.pt                    17 Online
  xen20.dc.eurotux.pt                    18 Online
  xen21.dc.eurotux.pt                    19 Online
  xen22.dc.eurotux.pt                    20 Online

Any info on this?

Thanks
Nuno Fernandes

On Tuesday 22 January 2008 17:07:51 Nuno Fernandes wrote:
> On Tuesday 22 January 2008 09:13:55 Patrick Caulfield wrote:
> > Nuno Fernandes wrote:
> > > On Monday 21 January 2008 15:58:38 Patrick Caulfield wrote:
> > >> echo 255 > /sys/kernel/config/dlm/cluster/log_debug
> > >
> > > echo 255 > /sys/kernel/config/dlm/cluster/log_debug
> > > -bash: /sys/kernel/config/dlm/cluster/log_debug: Permission denied
> > >
> > > ls -la /sys/kernel/config/dlm/cluster/
> > > total 0
> > > drwxr-xr-x  4 root root 0 May 27  2007 .
> > > drwxr-xr-x  3 root root 0 May 27  2007 ..
> > > drwxr-xr-x 19 root root 0 Jan 17 16:36 comms
> > > drwxr-xr-x  3 root root 0 Nov 27 14:48 spaces
> >
> > No debug options! You need to upgrade the kernel, I'm afraid. It might
> > even fix the bug ;-)
> >
> > Patrick
>
> Solved. Rebooted the whole cluster! :(
>
> Thanks
> Nuno Fernandes
-------------- next part --------------
A non-text attachment was scrubbed...
Name: group.dump.txt.gz
Type: application/x-gzip
Size: 13974 bytes
Desc: not available
URL: <http://listman.redhat.com/archives/linux-cluster/attachments/20080201/3b17a7d9/attachment.bin>

