[Linux-cluster] Problem in clvmd/dlm_recoverd

Nuno Fernandes npf-mlists at eurotux.com
Fri Nov 14 10:00:13 UTC 2008


Hi,

We have a cluster of 7 machines attached to a SAN. We use them to host
virtual machines, so we run clvmd.

At some point we became unable to use any of the pv/lv/vg tools. They all
hang. From stracing them, I've concluded that they are waiting for clvmd.
The per-host state is listed below.

Has anyone been in this situation?

Thanks for any help,
Nuno Fernandes

in host xen1:                                                                                                                
Linux blade01.dc.xpto.com 2.6.18-92.1.17.el5xen #1 SMP Tue Nov 4 14:13:09 EST 
2008 x86_64 x86_64 x86_64 GNU/Linux            
lvm2-cluster-2.02.32-4.el5                                                                                                   
cman-2.0.84-2.el5_2.1                                                                                                        
  PID TTY      STAT   TIME COMMAND                                                                                           
20874 ?        D<     0:00  \_ [dlm_recoverd]                                                                                
20854 pts/1    S+     0:00      \_ /bin/sh /sbin/service clvmd start                                                         
20861 pts/1    S+     0:00          \_ /bin/bash /etc/init.d/clvmd start                                                     
20931 pts/1    S+     0:00              \_ /usr/sbin/vgscan -d                                                               
20869 ?        Ssl    0:00 clvmd -T40                                                                                        
ps ax -o pid,cmd,wchan                                                                                                       
20874 [dlm_recoverd]              -                                                                                          
------------------------------                                                                                               
Connection to xen1 closed.                                                                                                   
in host xen2:                                                                                                                
Linux blade02.dc.xpto.com 2.6.18-8.1.14.el5xen #1 SMP Thu Oct 4 11:38:56 WEST 
2007 x86_64 x86_64 x86_64 GNU/Linux            
lvm2-cluster-2.02.16-3.el5                                                                                                   
cman-2.0.64-1.0.1.el5                                                                                                        
  PID TTY      STAT   TIME COMMAND                                                                                           
22662 ?        D<     0:00  \_ [dlm_recoverd]                                                                                
22613 ?        Ssl    0:02 clvmd -T40                                                                                        
ps ax -o pid,cmd,wchan                                                                                                       
22662 [dlm_recoverd]              -                                                                                          
------------------------------                                                                                               
Connection to xen2 closed.                                                                                                   
in host xen3:                                                                                                                
Linux blade03.dc.xpto.com 2.6.18-8.1.14.el5xen #1 SMP Thu Oct 4 11:38:56 WEST 
2007 x86_64 x86_64 x86_64 GNU/Linux            
lvm2-cluster-2.02.16-3.el5                                                                                                   
cman-2.0.64-1.0.1.el5                                                                                                        
  PID TTY      STAT   TIME COMMAND                                                                                           
22236 ?        D<     0:00  \_ [dlm_recoverd]                                                                                
22231 ?        Ssl    0:02 clvmd -T40                                                                                        
ps ax -o pid,cmd,wchan                                                                                                       
22236 [dlm_recoverd]              dlm_wait_function
------------------------------
Connection to xen3 closed.
in host xen4:                                                                                                                
Linux blade04.dc.xpto.com 2.6.18-8.1.14.el5xen #1 SMP Thu Oct 4 11:38:56 WEST 
2007 x86_64 x86_64 x86_64 GNU/Linux            
lvm2-cluster-2.02.16-3.el5                                                                                                   
cman-2.0.64-1.0.1.el5                                                                                                        
  PID TTY      STAT   TIME COMMAND                                                                                           
25097 ?        D<     0:00  \_ [dlm_recoverd]                                                                                
25092 ?        Ssl    0:02 clvmd -T40                                                                                        
ps ax -o pid,cmd,wchan                                                                                                       
25097 [dlm_recoverd]              dlm_wait_function                                                                          
------------------------------                                                                                               
Connection to xen4 closed.                                                                                                   
in host xen5:
Linux blade05.dc.xpto.com 2.6.18-92.1.17.el5xen #1 SMP Tue Nov 4 14:13:09 EST 
2008 x86_64 x86_64 x86_64 GNU/Linux
lvm2-cluster-2.02.32-4.el5
cman-2.0.84-2.el5_2.1
  PID TTY      STAT   TIME COMMAND
22333 ?        D<     0:00  \_ [dlm_recoverd]
22328 ?        Ssl    0:02 clvmd -T40
ps ax -o pid,cmd,wchan
22333 [dlm_recoverd]              -
------------------------------
Connection to xen5 closed.
in host xen6:
Linux blade06.dc.xpto.com 2.6.18-92.1.17.el5xen #1 SMP Tue Nov 4 14:13:09 EST 
2008 x86_64 x86_64 x86_64 GNU/Linux
lvm2-cluster-2.02.32-4.el5
cman-2.0.84-2.el5_2.1
  PID TTY      STAT   TIME COMMAND
ps ax -o pid,cmd,wchan
------------------------------
Connection to xen6 closed.
in host xen7:
Linux blade07.dc.xpto.com 2.6.18-92.1.13.el5xen #1 SMP Wed Sep 24 20:01:15 EDT 
2008 x86_64 x86_64 x86_64 GNU/Linux
lvm2-cluster-2.02.32-4.el5
cman-2.0.84-2.el5
cman-2.0.84-2.el5_2.1
  PID TTY      STAT   TIME COMMAND
19793 ?        D<     0:00  \_ [dlm_recoverd]
19788 ?        Ssl    0:01 clvmd -T40
ps ax -o pid,cmd,wchan
19793 [dlm_recoverd]              -
------------------------------
Connection to xen7 closed.
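
For anyone wanting to gather the same kind of listing on their own cluster, here is a minimal sketch. It is not the exact script used for the output above. The host names in HOSTS and the function names blocked_procs and collect are assumptions made for illustration, and the ps columns are reordered (cmd last) so that the STAT field can be matched reliably. blocked_procs keeps only processes in uninterruptible sleep (STAT starting with D), which is the state dlm_recoverd shows on the affected nodes.

```shell
#!/bin/sh
# Hypothetical node list; replace with the real cluster node names.
HOSTS="xen1 xen2 xen3 xen4 xen5 xen6 xen7"

# Read `ps ax -o pid,stat,wchan:32,cmd` output on stdin and print only
# the processes whose STAT field starts with D (uninterruptible sleep).
# The first line is assumed to be the ps header and is skipped.
blocked_procs() {
    awk 'NR > 1 && $2 ~ /^D/ { print }'
}

# Print kernel version, package versions and blocked processes per node.
# Assumes password-less ssh to every host in HOSTS.
collect() {
    for h in $HOSTS; do
        echo "in host $h:"
        ssh "$h" 'uname -a; rpm -q lvm2-cluster cman'
        ssh "$h" 'ps ax -o pid,stat,wchan:32,cmd' | blocked_procs
        echo "------------------------------"
    done
}
```

Running collect after sourcing the file prints one block per node. The wchan column shows where in the kernel a blocked process is waiting, if the kernel exposes it.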


