[dm-devel] How do you force-close a dm device after a disk failure?
Zdenek Kabelac
zkabelac at redhat.com
Mon Sep 14 10:04:25 UTC 2015
On 14.9.2015 at 11:45, Adam Nielsen wrote:
>> The whole dm table, with all its dependencies, needs to be known.
>
> $ dmsetup table
> backup: 0 11720531968 crypt aes-xts-plain64 0000000000000000000000000000000000000000000000000000000000000000 0 9:10 4096
>
> $ dmsetup status
> backup: 0 11720531968 crypt
>
> $ dmsetup ls --tree
> backup (253:0)
> └─ (9:10)
>
> $ dmsetup info -f
> Name: backup
> State: ACTIVE (DEFERRED REMOVE)
> Read Ahead: 4096
> Tables present: LIVE
> Open count: 1
> Event number: 0
> Major, minor: 253, 0
> Number of targets: 1
> UUID: CRYPT-LUKS1-d0b3d38e421545908537dc50f59fb217-backup
>
> All I'm using it for is to encrypt an mdadm-style RAID array composed
> of two external disks, connected temporarily via USB to do a full
> system backup with rsync.
>
>>> I'm not sure how to do this, could you please elaborate? I thought
>>> "dmsetup remove --force" would do this but as that doesn't work
>>
>> Really, the state of the whole table needs to be known.
>>
>>>> Also note - dmsetup remove supports --deferred removal (see man
>>>> page).
>>>
>>> Oh I didn't notice that. It doesn't seem to have much of an effect
>>> though:
>>
>> Sure, it will not fix your problem; it's like a lazy umount...
>
> So replacing the table with the 'error' target won't release the
> underlying device, even though that device is not used by the new
> target?
>
>> What is not clear to me is: what is your expectation here?
>> Obviously your system is far more broken, so placing an 'error' target
>> on your backup device will not fix it.
>>
>> You should probably also attach the relevant portion of 'dmesg'; it will
>> surely show what is going wrong with your system.
>
> What happened was in the middle of the backup, there was some USB
> interruption and the disks dropped out, so the writes started failing.
> The kernel logs were full of write errors to various sector numbers. I
> think you would have the same result if you set things up with a USB
> stick and then unplugged it during a data transfer.
>
> The devices are connected like this:
>
>   dm device "backup"
>    |
>    +-- mdadm device /dev/md10
>         |
>         +-- USB/SATA disk A (/dev/sdd)
>         |
>         +-- USB/SATA disk B (/dev/sde)
>
> The problem is that I can't just reconnect the disks and rerun the
> backup. mdadm refuses to stop the RAID array as it is in use by
> the dm device, and it thinks the array is active despite the disks being
> unplugged and in a drawer. If I reconnect the disks they appear as
> different devices (sdf and sdg) but I still can't start the "new" array
> from these new disk devices, as it tells me the disks are already part
> of an active array.
>
> So the only way I can have another go at running this backup is to
> close down /dev/md10, and it seems the only way I can do that is to
> tell dm to release that device. It doesn't matter if the dm device
> "backup" is unusable, I will just create "backup2" to use for the
> second attempt.
>
> But until I can figure out how to get dm to release the underlying
> device, I'm stuck!
>
>> i.e. you cannot expect 'remove --force' to work when your machine
>> starts to show kernel errors.
>
> There were no kernel crashes, just errors related to USB transfers. I
> would assume this is not much different to how a real failed disk might
> behave, so I figure it is a situation that should be encountered
> relatively often!
>
dmsetup reload backup --table "0 11720531968 error"
dmsetup suspend --noflush backup
dmsetup resume backup
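
The same three steps can be sketched as a small shell helper that reads the
device size from the live table instead of hard-coding it. This is only a
sketch: the function names (error_table, table_sectors, swap_to_error) are
made up for illustration, it assumes that `dmsetup table <name>` prints one
"start length target ..." line per target, and it has to run as root.

```shell
#!/bin/sh
# Sketch only: replace a dm device's live table with a single 'error'
# target so the table no longer references the underlying device.
# All function names here are hypothetical helpers, not dmsetup features.

# Print an 'error' table covering the given number of 512-byte sectors.
error_table() {
    printf '0 %s error\n' "$1"
}

# Read `dmsetup table <name>` output on stdin and print the total length,
# i.e. the sum of the second field (length in sectors) of every target line.
table_sectors() {
    awk '{ total += $2 } END { printf "%.0f\n", total }'
}

# Load the error table into the inactive slot, then suspend without
# flushing (queued I/O to the dead disks would never complete) and resume,
# which makes the new table live.
swap_to_error() {
    name=$1
    sectors=$(dmsetup table "$name" | table_sectors)
    # Refuse to continue if the size could not be determined.
    [ "${sectors:-0}" -gt 0 ] || return 1
    dmsetup reload "$name" --table "$(error_table "$sectors")" &&
        dmsetup suspend --noflush "$name" &&
        dmsetup resume "$name"
}
```

For the device in this thread that would be `swap_to_error backup`. If it
succeeds, `dmsetup deps backup` should no longer list the md device, and
stopping /dev/md10 with mdadm can be retried.
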
Does this work for you?
Zdenek