[lvm-devel] dmeventd doesn't handle failures during mirror resync.

Wed May 5 13:22:32 UTC 2010

On May 5, 2010, at 3:08 AM, Petr Rockai wrote:

> Neil Brown <neilb at suse.de> writes:
>
>> I was surprised to discover that while a normal write error is
>> handled properly - dmeventd runs 'lvconvert' to fix the array up,
>> this does not happen in response to a write error while syncing
>> the array.
>>
>> If I arrange for the new device to die, then
>>          lvconvert --repair --use-policies
>>
>> will fix it up as I would expect, but dmeventd never asks it to do
>> this.
>>
>> This seems to be a deliberate decision:  in _process_status_code
>> in dmeventd_mirror.c, a status of 'F' will cause lvconvert to be
>> run while 'S' and 'R' (sync and read errors) will not.
>>
>> Is there a reason for this?
> I think the rationale is that:
>
> For read errors, we should *not* strip the mirror leg, since we want to
> keep as much redundancy as possible in this scenario. The failure should
> be logged, but I think that's it.
>
> For sync, I am not sure. It may be that the reason for this is that sync
> is usually related to manual action and dmeventd intervention may be
> unexpected and unwanted in this case. But that case could be argued.
>
>> Can we change dmeventd to respond to sync (and read) errors in the same
>> way that it responds to write errors?
> I think it's a bad idea for read errors, unless maybe we could have a
> new feature for that -- one that'd upconvert the mirror first (if
> there's a hotspare) and only if that finishes OK, kill the bad leg. Just
> log the error if there are no hotspares.
>
> For sync errors, I am ambivalent. Any further opinions?

I think for sync errors, we should restart the sync.  This can be done
by a suspend/resume of the mirror device.  Effectively, we are
assuming a transient failure.  Perhaps if we have tried to clear the
fault a couple of times and it persists, then we could remove the
failed device.

Read errors I would definitely leave alone.  Drives can often relocate  
bad sectors, but that is done on writes.  If the relocation fails, we  
will know about it when the write fails.
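
To make the proposal concrete, here is a rough sketch of the per-status
policy I have in mind.  It is not the existing code in
dmeventd_mirror.c: the enum, the function name decide() and the retry
limit of 2 are all made up for illustration, and the actual
suspend/resume and lvconvert calls are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical action names; not taken from dmeventd_mirror.c. */
enum action { ACT_IGNORE, ACT_LOG_ONLY, ACT_RESTART_SYNC, ACT_REPAIR };

/* "A couple of times"; the exact limit is an assumption. */
#define MAX_SYNC_RETRIES 2

/*
 * Pick an action for one mirror status character:
 *   'F' = write failure, 'S' = sync failure, 'R' = read failure.
 * *sync_retries counts the sync restarts already attempted for this
 * mirror and is incremented whenever another restart is chosen.
 */
enum action decide(char status, int *sync_retries)
{
    assert(sync_retries != NULL);

    switch (status) {
    case 'F':
        /* Current behaviour: run lvconvert --repair --use-policies. */
        return ACT_REPAIR;
    case 'S':
        if (*sync_retries < MAX_SYNC_RETRIES) {
            /* Assume a transient fault: suspend/resume to restart sync. */
            (*sync_retries)++;
            return ACT_RESTART_SYNC;
        }
        /* The fault did not clear: remove the failed device. */
        return ACT_REPAIR;
    case 'R':
        /* Keep the leg; a failed relocation shows up as a write error. */
        return ACT_LOG_ONLY;
    default:
        return ACT_IGNORE;
    }
}
```

With a limit of 2, the first two 'S' events on a mirror restart the
sync and the third falls through to the same repair path as 'F'.  The
counter would have to be kept per mirror and reset once a sync
completes; that bookkeeping is not shown here.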

  brassow