[dm-devel] Can not remove device. No files open, no processes attached. Forced to reboot server.

Zdenek Kabelac zdenek.kabelac at gmail.com
Mon Feb 7 15:06:59 UTC 2022


On 06. 02. 22 at 16:16, Aidan Walton wrote:
> Hi,
> I've been chasing a problem now for a few weeks. I have a flaky SATA
> controller that fails unpredictably and upon doing so all drives
> attached are disconnected by the kernel. I have 2 discs on this
> controller which are the components of a RAID1 array. mdraid fails the
> disc (in its strange way) stating that one device is removed and the
> other is active. Apparently this is the default mdraid approach. Even
> though both devices are in fact failed. Regardless, the devmapper
> device which is supporting an LVM logical volume on top of this raid
> array, remains active. The logical volume is no longer listed by
> lvdisplay, but dmsetup -c info shows:
> Name                                Maj Min Stat Open Targ Event  UUID
> storage.mx.vg2-shared_sun_NAS.lv1   253   2 L--w    1    1      0
> LVM-Ud9pj6QE4hK1K3xiAFMVCnno3SrXaRyTXJLtTGDOPjBUppJgzr4t0jJowixEOtx7
> storage.mx.vg1-shared_sun_users.lv1 253   1 L--w    1    1      0
> LVM-ypcHlbNXu36FLRgU0EcUiXBSIvcOlHEP3MHkBKsBeHf6Q68TIuGA9hd5UfCpvOeo
> ubuntu_server--vg-ubuntu_server--lv 253   0 L--w    1    1      0
> LVM-eGBUJxP1vlW3MfNNeC2r5JfQUiKKWZ73t3U3Jji3lggXe8LPrUf0xRE0YyPzSorO
> 
> The device in question is 'storage.mx.vg2-shared_sun_NAS.lv1'
> 
> As can be seen, it displays 'open'
> 
> however lsof /dev/mapper/storage.mx.vg2-shared_sun_NAS.lv1
> <blank>
> 
> fuser -m /dev/storage.mx.vg1/shared_sun_users.lv1
> <blank>
> 
> dmsetup status storage.mx.vg2-shared_sun_NAS.lv1
> 0 976502784 error
> 
> dmsetup remove storage.mx.vg2-shared_sun_NAS.lv1
> device-mapper: remove ioctl on storage.mx.vg2-shared_sun_NAS.lv1
> failed: Device or resource busy
> 
> dmsetup wipe_table storage.mx.vg2-shared_sun_NAS.lv1
> device-mapper: resume ioctl on storage.mx.vg2-shared_sun_NAS.lv1
> failed: Invalid argument
> 

You can't remove a device whose open count is greater than 0.
You've already replaced the device's target type with 'error', so whoever keeps 
this device open gets an error on every read and write (and you would probably 
see that in the kernel log).

Your remaining problem is to figure out who holds the device open in the kernel.

fuser shows only user-land processes, not in-kernel users, so you should 
probably look at how you are using your devices: who is mounting or using them?
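
As a rough sketch of where to look for in-kernel users, the two helpers below 
read sysfs and the mount table for a given major:minor pair (253:2 for the 
device in question, per the dmsetup output above). The function names and the 
SYSFS_ROOT / MOUNTINFO variables are made up for this illustration; they are 
not part of dmsetup or LVM. This covers only stacked block devices and mounts 
visible in the current mount namespace, so an empty result does not prove the 
device is unused.

```shell
#!/bin/sh
# Hypothetical helpers: list candidate in-kernel users of a block device.
# SYSFS_ROOT and MOUNTINFO are overridable only so the sketch can be tried
# against test data; on a live system the defaults are the real locations.
SYSFS_ROOT="${SYSFS_ROOT:-/sys}"
MOUNTINFO="${MOUNTINFO:-/proc/self/mountinfo}"

# dm_holders MAJ:MIN
# Print the block devices stacked on top of MAJ:MIN (other dm devices, md, ...).
dm_holders() {
    dir="$SYSFS_ROOT/dev/block/$1/holders"
    [ -d "$dir" ] || return 0
    for h in "$dir"/*; do
        # Skip the unexpanded glob when the directory is empty.
        [ -e "$h" ] && basename "$h"
    done
    return 0
}

# dm_mounts MAJ:MIN
# Print mount points backed by MAJ:MIN. In mountinfo, field 3 is the
# major:minor of the mounted device and field 5 is the mount point.
dm_mounts() {
    [ -r "$MOUNTINFO" ] || return 0
    awk -v dev="$1" '$3 == dev { print $5 }' "$MOUNTINFO"
}

# Example (not executed here):
#   dm_holders 253:2
#   dm_mounts 253:2
```

If both come back empty, other things that may be worth checking are 
/proc/swaps, exports by an in-kernel server such as NFS, and mounts that live 
in another mount namespace (e.g. a container).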

If your kernel is working correctly, tools like 'lsblk' are typically quite 
good at showing your device tree.

But you would need to provide far more details to get more qualified advice.

Regards

Zdenek



