[Linux-cluster] Problem in clvmd/dlm_recoverd

Nuno Fernandes npf-mlists at eurotux.com
Fri Nov 14 11:02:36 UTC 2008


On Friday 14 November 2008 10:29:50 Christine Caulfield wrote:
> Nuno Fernandes wrote:
> > Hi,
> >
> > We have a cluster of 7 machines attached to a SAN. We use them to
> > provide virtual machines, so we are using clvmd.
> >
> > At some point we became unable to run any of the pv/lv/vg tools; they
> > all hang. From stracing them I've concluded that they
> > are waiting for clvmd.
>
> They could be waiting for fencing to complete.
>
> Have a look at the output from group_tool; that will tell you which
> services have recovered after a node has joined or left the cluster.

I don't think that is the reason. group_tool reports state "none" for both
the fence and dlm groups:


# group_tool
type             level name     id       state
fence            0     default  00010002 none
[1 2 3 4 5 7]
dlm              1     clvmd    00010004 none
[1 2 3 4 5 7]
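
For anyone who wants to check this kind of output mechanically, here is a minimal sketch that parses the group_tool listing above and reports, per group, whether the state is "none" and which node IDs are absent from the member list. It rests on two assumptions that are mine, not something confirmed in this thread: that "none" means no state change is pending, and that the seven nodes have IDs 1 through 7. The function names are made up for illustration.

```python
# Output pasted from the message above.
GROUP_TOOL_OUTPUT = """\
type             level name     id       state
fence            0     default  00010002 none
[1 2 3 4 5 7]
dlm              1     clvmd    00010004 none
[1 2 3 4 5 7]
"""


def parse_group_tool(text):
    """Return one dict per group, including its member node IDs."""
    groups = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("type"):
            continue  # skip blank lines and the header row
        if line.startswith("["):
            # A member list belongs to the most recently seen group.
            if groups:
                groups[-1]["members"] = [int(n) for n in line.strip("[]").split()]
            continue
        gtype, level, name, gid, state = line.split(None, 4)
        groups.append({"type": gtype, "level": int(level), "name": name,
                       "id": gid, "state": state, "members": []})
    return groups


def summarize(groups, expected_nodes):
    """Map group name -> whether state is 'none' and which expected nodes are missing."""
    report = {}
    for g in groups:
        missing = sorted(set(expected_nodes) - set(g["members"]))
        # Assumption: state "none" means nothing is pending for this group.
        report[g["name"]] = {"stable": g["state"] == "none", "missing": missing}
    return report


groups = parse_group_tool(GROUP_TOOL_OUTPUT)
# Assumption: the seven cluster nodes carry IDs 1..7.
report = summarize(groups, expected_nodes=range(1, 8))
for name, info in report.items():
    print(name, info)
```

With the output above, both groups come back with state "none", and node ID 6 is the only one absent from either member list.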

Any other ideas?
Best regards,
Nuno Fernandes

>
> Chrissie
>
> > Nuno Fernandes
> >
> > in host xen1:
> >
> > Linux blade01.dc.xpto.com 2.6.18-92.1.17.el5xen #1 SMP Tue Nov 4
> > 14:13:09 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
> >
> > lvm2-cluster-2.02.32-4.el5
> >
> > cman-2.0.84-2.el5_2.1
> >
> > PID TTY STAT TIME COMMAND
> >
> > 20874 ? D< 0:00 \_ [dlm_recoverd]
> >
> > 20854 pts/1 S+ 0:00 \_ /bin/sh /sbin/service clvmd start
> >
> > 20861 pts/1 S+ 0:00 \_ /bin/bash /etc/init.d/clvmd start
> >
> > 20931 pts/1 S+ 0:00 \_ /usr/sbin/vgscan -d
> >
> > 20869 ? Ssl 0:00 clvmd -T40
> >
> > ps ax -o pid,cmd,wchan
> >
> > 20874 [dlm_recoverd] -
> >
> > ------------------------------
> >
> > Connection to xen1 closed.
> >
> > in host xen2:
> >
> > Linux blade02.dc.xpto.com 2.6.18-8.1.14.el5xen #1 SMP Thu Oct 4 11:38:56
> > WEST 2007 x86_64 x86_64 x86_64 GNU/Linux
> >
> > lvm2-cluster-2.02.16-3.el5
> >
> > cman-2.0.64-1.0.1.el5
> >
> > PID TTY STAT TIME COMMAND
> >
> > 22662 ? D< 0:00 \_ [dlm_recoverd]
> >
> > 22613 ? Ssl 0:02 clvmd -T40
> >
> > ps ax -o pid,cmd,wchan
> >
> > 22662 [dlm_recoverd] -
> >
> > ------------------------------
> >
> > Connection to xen2 closed.
> >
> > in host xen3:
> >
> > Linux blade03.dc.xpto.com 2.6.18-8.1.14.el5xen #1 SMP Thu Oct 4 11:38:56
> > WEST 2007 x86_64 x86_64 x86_64 GNU/Linux
> >
> > lvm2-cluster-2.02.16-3.el5
> >
> > cman-2.0.64-1.0.1.el5
> >
> > PID TTY STAT TIME COMMAND
> >
> > 22236 ? D< 0:00 \_ [dlm_recoverd]
> >
> > 22231 ? Ssl 0:02 clvmd -T40
> >
> > ps ax -o pid,cmd,wchan
> >
> > 22236 [dlm_recoverd] dlm_wait_function
> >
> > ------------------------------
> >
> > Connection to xen3 closed.
> >
> > in host xen4:
> >
> > Linux blade04.dc.xpto.com 2.6.18-8.1.14.el5xen #1 SMP Thu Oct 4 11:38:56
> > WEST 2007 x86_64 x86_64 x86_64 GNU/Linux
> >
> > lvm2-cluster-2.02.16-3.el5
> >
> > cman-2.0.64-1.0.1.el5
> >
> > PID TTY STAT TIME COMMAND
> >
> > 25097 ? D< 0:00 \_ [dlm_recoverd]
> >
> > 25092 ? Ssl 0:02 clvmd -T40
> >
> > ps ax -o pid,cmd,wchan
> >
> > 25097 [dlm_recoverd] dlm_wait_function
> >
> > ------------------------------
> >
> > Connection to xen4 closed.
> >
> > in host xen5:
> >
> > Linux blade05.dc.xpto.com 2.6.18-92.1.17.el5xen #1 SMP Tue Nov 4
> > 14:13:09 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
> >
> > lvm2-cluster-2.02.32-4.el5
> >
> > cman-2.0.84-2.el5_2.1
> >
> > PID TTY STAT TIME COMMAND
> >
> > 22333 ? D< 0:00 \_ [dlm_recoverd]
> >
> > 22328 ? Ssl 0:02 clvmd -T40
> >
> > ps ax -o pid,cmd,wchan
> >
> > 22333 [dlm_recoverd] -
> >
> > ------------------------------
> >
> > Connection to xen5 closed.
> >
> > in host xen6:
> >
> > Linux blade06.dc.xpto.com 2.6.18-92.1.17.el5xen #1 SMP Tue Nov 4
> > 14:13:09 EST 2008 x86_64 x86_64 x86_64 GNU/Linux
> >
> > lvm2-cluster-2.02.32-4.el5
> >
> > cman-2.0.84-2.el5_2.1
> >
> > PID TTY STAT TIME COMMAND
> >
> > ps ax -o pid,cmd,wchan
> >
> > ------------------------------
> >
> > Connection to xen6 closed.
> >
> > in host xen7:
> >
> > Linux blade07.dc.xpto.com 2.6.18-92.1.13.el5xen #1 SMP Wed Sep 24
> > 20:01:15 EDT 2008 x86_64 x86_64 x86_64 GNU/Linux
> >
> > lvm2-cluster-2.02.32-4.el5
> >
> > cman-2.0.84-2.el5
> >
> > cman-2.0.84-2.el5_2.1
> >
> > PID TTY STAT TIME COMMAND
> >
> > 19793 ? D< 0:00 \_ [dlm_recoverd]
> >
> > 19788 ? Ssl 0:01 clvmd -T40
> >
> > ps ax -o pid,cmd,wchan
> >
> > 19793 [dlm_recoverd] -
> >
> > ------------------------------
> >
> > Connection to xen7 closed.
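
To make the per-host dumps above easier to compare, here is a small sketch that groups the hosts by lvm2-cluster version and by the wchan reported for dlm_recoverd. The values are transcribed by hand from the quoted output; the table layout and variable names are only illustrative, and nothing here claims that the differences it shows are the cause of the hang.

```python
from collections import defaultdict

# Transcribed from the quoted output above: (lvm2-cluster version, dlm_recoverd wchan).
# wchan is None where no dlm_recoverd process was listed for the host.
HOSTS = {
    "xen1": ("2.02.32-4.el5", "-"),
    "xen2": ("2.02.16-3.el5", "-"),
    "xen3": ("2.02.16-3.el5", "dlm_wait_function"),
    "xen4": ("2.02.16-3.el5", "dlm_wait_function"),
    "xen5": ("2.02.32-4.el5", "-"),
    "xen6": ("2.02.32-4.el5", None),
    "xen7": ("2.02.32-4.el5", "-"),
}

by_version = defaultdict(list)
by_wchan = defaultdict(list)
for host, (version, wchan) in sorted(HOSTS.items()):
    by_version[version].append(host)
    by_wchan[wchan].append(host)

print("lvm2-cluster versions:")
for version, hosts in sorted(by_version.items()):
    print(f"  {version}: {', '.join(hosts)}")

print("dlm_recoverd wchan:")
for wchan, hosts in by_wchan.items():
    label = wchan if wchan is not None else "(no dlm_recoverd listed)"
    print(f"  {label}: {', '.join(hosts)}")
```

Grouping this way shows two lvm2-cluster versions in use across the seven hosts, and that only xen3 and xen4 report dlm_wait_function as the wait channel.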
>

