[Linux-cluster] Virtual service using CLVM not migrating

Tom Lanyon tom at netspot.com.au
Mon Aug 25 06:19:49 UTC 2008


Hi list,

(let me know if this should be on the xen list, but I think it's an  
issue with clvm locking a logical volume)

I have a three-node RHEL5 cluster running some virtual machines. Each
virtual machine uses an LVM LV as its root device, which is available
cluster-wide via clvmd.

Live migration between cluster nodes seems to work well when each node
runs only one VM, but fails when the source node is running more than
one virtual machine.

I can migrate my two VMs, "nodea" and "nodeb", onto the same physical  
node and they run fine:

# xm list
Name                                      ID Mem(MiB) VCPUs State   Time(s)
Domain-0                                   0     4120     4 r-----   3398.3
nodea                                      9     5999     1 -b----      0.3
nodeb                                      4     5999     1 -b----    265.9


However, when I try to migrate one of these VMs *away* from this
physical node to another cluster member (using clusvcadm -M), it
performs the state transfer, then I get a nasty error on the VM's
console and end up with a broken virtual machine on both physical
nodes:

WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!
netif_release_rx_bufs: 0 xfer, 62 noxfer, 194 unused
WARNING: g.e. still in use!
WARNING: leaking g.e. and page still in use!


Sorry for the large email, but I'll also include the xend log on the  
source physical server showing the failure. You can see that device  
51712 is 'still active' while trying to migrate and that device 51712  
is the LV block device; I assume this means it is having trouble  
removing a CLVM lock?


[2008-08-24 00:27:43 xend 5252] DEBUG (balloon:127) Balloon: 26652 KiB free; need 25600; done.
[2008-08-24 00:27:43 xend 5252] DEBUG (XendCheckpoint:89) [xc_save]: /usr/lib64/xen/bin/xc_save 22 9 0 0 1
[2008-08-24 00:27:43 xend 5252] INFO (XendCheckpoint:351) ERROR Internal error: Couldn't enable shadow mode
[2008-08-24 00:27:43 xend 5252] INFO (XendCheckpoint:351) Save exit rc=1
[2008-08-24 00:27:43 xend 5252] ERROR (XendCheckpoint:133) Save failed on domain nodea (9).
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 110, in save
    forkHelper(cmd, fd, saveInputHandler, False)
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendCheckpoint.py", line 339, in forkHelper
    raise XendError("%s failed" % string.join(cmd))
XendError: /usr/lib64/xen/bin/xc_save 22 9 0 0 1 failed
[2008-08-24 00:27:43 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:1601) XendDomainInfo.resumeDomain(9)
[2008-08-24 00:27:43 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:1614) XendDomainInfo.resumeDomain: devices released
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:791) Storing domain details: {'console/ring-ref': '2057005', 'console/port': '2', 'name': 'migrating-nodea', 'console/limit': '1048576', 'vm': '/vm/b845f914-33a3-e1cf-551e-01b6d346b92b', 'domid': '9', 'cpu/0/availability': 'online', 'memory/target': '6144000', 'store/ring-ref': '2049294', 'store/port': '1'}
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:110) DevController: writing {'backend-id': '0', 'mac': '00:16:3e:6c:ae:9f', 'handle': '0', 'state': '1', 'backend': '/local/domain/0/backend/vif/9/0'} to /local/domain/9/device/vif/0.
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:112) DevController: writing {'bridge': 'br102', 'domain': 'migrating-nodea', 'handle': '0', 'script': '/etc/xen/scripts/vif-bridge', 'state': '1', 'frontend': '/local/domain/9/device/vif/0', 'mac': '00:16:3e:6c:ae:9f', 'online': '1', 'frontend-id': '9'} to /local/domain/0/backend/vif/9/0.
[2008-08-24 00:27:44 xend 5252] DEBUG (blkif:24) exception looking up device number for xvda: [Errno 2] No such file or directory: '/dev/xvda'
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:110) DevController: writing {'backend-id': '0', 'virtual-device': '51712', 'device-type': 'disk', 'state': '1', 'backend': '/local/domain/0/backend/vbd/9/51712'} to /local/domain/9/device/vbd/51712.
[2008-08-24 00:27:44 xend 5252] DEBUG (DevController:112) DevController: writing {'domain': 'migrating-nodea', 'frontend': '/local/domain/9/device/vbd/51712', 'format': 'raw', 'dev': 'xvda', 'state': '1', 'params': '/dev/int_vg/os_nodea', 'mode': 'w', 'online': '1', 'frontend-id': '9', 'type': 'phy'} to /local/domain/0/backend/vbd/9/51712.
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] DEBUG (XendDomainInfo:1626) XendDomainInfo.resumeDomain: devices created
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] ERROR (XendDomainInfo:1631) XendDomainInfo.resume: xc.domain_resume failed on domain 9.
Traceback (most recent call last):
  File "/usr/lib64/python2.4/site-packages/xen/xend/XendDomainInfo.py", line 1628, in resumeDomain
    xc.domain_resume(self.domid, fast)
Error: (1, 'Internal error', "Couldn't map start_info")
[2008-08-24 00:27:44 xend 5252] DEBUG (XendCheckpoint:136) XendCheckpoint.save: resumeDomain
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:44 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
[2008-08-24 00:27:45 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1722) Dev 51712 still active, looping...
-------many repeats-------
[2008-08-24 00:28:14 xend.XendDomainInfo 5252] INFO (XendDomainInfo:1728) Dev still active but hit max loop timeout
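
For anyone wanting to double-check the device number in the log, here is a minimal sketch (not taken from xend) that decodes 51712. It assumes the classic Xen virtual-device encoding of (major << 8) | minor, with 202 as the Linux block major for xvd devices and 16 minors per disk. The function names decode_vbd and xvd_name are made up for illustration.

```python
# Decode a Xen virtual block device number, as printed in the xend log,
# into a Linux major/minor pair and an xvd device name.
# Assumption: classic 16-bit encoding (major << 8 | minor), minor < 256.

XVD_MAJOR = 202  # Linux block major used for Xen virtual disks (xvd*)


def decode_vbd(devnum):
    """Return (major, minor) for a classic Xen virtual-device number."""
    return devnum >> 8, devnum & 0xFF


def xvd_name(devnum):
    """Return a name such as 'xvda' or 'xvdb1' (illustrative helper)."""
    major, minor = decode_vbd(devnum)
    if major != XVD_MAJOR:
        raise ValueError("not an xvd device: major %d" % major)
    disk, part = divmod(minor, 16)  # 16 minors are reserved per xvd disk
    name = "xvd" + chr(ord("a") + disk)
    return name + (str(part) if part else "")


print(decode_vbd(51712))  # (202, 0)
print(xvd_name(51712))    # xvda
```

This agrees with the DevController lines in the log, where 'virtual-device': '51712' is written alongside 'dev': 'xvda' and 'params': '/dev/int_vg/os_nodea'.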




