[dm-devel] [PATCH 0 of 2] dm-raid: Bug fixes

NeilBrown neilb at suse.de
Tue Apr 17 04:26:58 UTC 2012


On Mon, 16 Apr 2012 18:45:17 -0500 Jonathan Brassow <jbrassow at redhat.com>
wrote:

> Neil,
> 
> I have 3 bugs that I've been working on.  Two I have fixed and one I
> have not, but have a question.
> 
> The first patch (dm-raid-set-recovery-flags-on-resume) addresses the
> fact that some recovery flags are altered during suspend, but not
> corrected upon resume.  I'm wondering if you think these flags would be
> better pushed into 'mddev_resume' rather than being altered in
> dm-raid.c?

I think setting MD_RECOVERY_NEEDED in mddev_resume makes perfect sense.
It is quite safe to set it at any time, and the one place where md.c calls
mddev_resume() it sets the flag immediately afterwards.  So moving that
setting into mddev_resume() makes sense.

MD_RECOVERY_FROZEN I'm less sure about.  If we clear it in mddev_resume(),
then as soon as you convert a RAID5 to a RAID6 it would start recovery of the
extra device, even if you had set sync_action to 'frozen' first.  That would
be wrong.

I guess we are over-loading 'MD_RECOVERY_FROZEN' a bit.  It means both
"user-space requested a freeze" and "resync temporarily disabled".

I wonder if md_stop_writes() only needs to set it temporarily, and to make
sure MD_RECOVERY_NEEDED isn't set when it completes.  That might be enough?

However maybe it is easiest to just clear it in raid_resume() like you did.
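
To make the overloading concern concrete, here is a minimal userspace model
in C.  It is a sketch, not kernel code: the MODEL_* bit values, the struct
and every function name are invented for illustration, and only the two
flag names come from the discussion above.  It compares a resume path that
clears the frozen bit with one that only sets the needed bit.

```c
#include <stdbool.h>

/* Hypothetical model, not kernel code: the names echo the flags discussed
 * above, but the bit values and the struct are made up for this sketch. */
enum {
    MODEL_RECOVERY_NEEDED = 1u << 0,
    MODEL_RECOVERY_FROZEN = 1u << 1,
};

struct model_mddev {
    unsigned int recovery;
};

/* User space asked for a freeze (e.g. wrote 'frozen' to sync_action). */
static void model_user_freeze(struct model_mddev *m)
{
    m->recovery |= MODEL_RECOVERY_FROZEN;
}

/* Suspend path: resync is temporarily disabled using the same bit. */
static void model_suspend(struct model_mddev *m)
{
    m->recovery |= MODEL_RECOVERY_FROZEN;
}

/* Resume variant A: clears FROZEN unconditionally, then sets NEEDED. */
static void model_resume_clearing_frozen(struct model_mddev *m)
{
    m->recovery &= ~MODEL_RECOVERY_FROZEN;
    m->recovery |= MODEL_RECOVERY_NEEDED;
}

/* Resume variant B: only sets NEEDED and leaves FROZEN alone. */
static void model_resume_needed_only(struct model_mddev *m)
{
    m->recovery |= MODEL_RECOVERY_NEEDED;
}

/* Recovery may start only when it is needed and not frozen. */
static bool model_recovery_may_start(const struct model_mddev *m)
{
    return (m->recovery & MODEL_RECOVERY_NEEDED) &&
           !(m->recovery & MODEL_RECOVERY_FROZEN);
}
```

In this model, a user freeze followed by suspend and variant A lets recovery
start, because the single bit cannot remember who set it.  Variant B keeps
the user's freeze, which is why setting only the needed bit on resume is the
safe part of the change.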


> 
> The second patch (dm-raid-record-and-handle-missing-devices) adds code
> to address the case where the user specifies particular array positions
> as missing.  I don't have any significant questions about this patch.

I do :-)

md already does all the proper accounting for ->degraded, dm-raid shouldn't
need to.

Incrementing md.degraded in dev_parms shouldn't be needed as md_run is
subsequently called, and it sets md.degraded correctly.

Incrementing it in read_disk_sb() and setting the Faulty flag is wrong.  I
think it should just call md_error().

The other changes in that patch look OK.
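
The accounting argument can be illustrated with a small userspace model in
C.  This is a sketch under stated assumptions, not the md implementation:
all struct and function names are hypothetical, and it assumes the run path
derives the degraded count from per-device state and that a single error
entry point owns the failure bookkeeping.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model, not kernel code: names and fields are invented. */
struct model_rdev {
    bool present;   /* false if the position was specified as missing */
    bool in_sync;
    bool faulty;
};

struct model_array {
    struct model_rdev *devs;
    size_t raid_disks;
    int degraded;
};

/* Stand-in for the run path: derive 'degraded' from device state,
 * overwriting whatever value was stored before. */
static void model_run(struct model_array *a)
{
    int degraded = 0;

    for (size_t i = 0; i < a->raid_disks; i++)
        if (!a->devs[i].present || !a->devs[i].in_sync)
            degraded++;
    a->degraded = degraded;
}

/* Stand-in for a single error entry point: marks the device faulty and
 * bumps 'degraded' only if the device was still counted as in sync. */
static void model_error(struct model_array *a, size_t idx)
{
    struct model_rdev *d = &a->devs[idx];

    if (d->present && d->in_sync) {
        d->in_sync = false;
        a->degraded++;
    }
    d->faulty = true;
}
```

In the model, a caller that bumps the count before model_run() has no
lasting effect, since the value is recomputed.  Routing failures through
model_error() keeps the count and the faulty marking in step, even if the
same device is reported twice.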


> 
> The 3rd issue I am seeing concerns how 'suspend' happens.  Suspend
> should flush all outstanding I/O and quiesce.  When I look at the code,
> I feel it should be doing this.  ('md_stop_writes' is called and
> followed-up by a call to 'mddev_suspend', which quiesces the
> personality.)  However, if I create a RAID1 device, suspend it, and then
> detach one of the legs, it does not show the changes written immediately
> before the suspend.  If I issue a 'sync', then the changes do show up.
> I am confused as to why the suspend process doesn't seem to be pushing out
> the writes that have been issued.  Any ideas?

That sounds like it is behaving exactly as I would expect.
You have written to the filesystem (and so to the pagecache) but the
filesystem hasn't written to the device yet.  That happens after a time, or
on a 'sync' or 'fsync'.

You might be able to get the block device to ask the filesystem to flush
things out using freeze_bdev(), but I'm not sure of the details there.
It might not flush things, it might just ensure metadata is consistent - or
something.
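
For a test that needs the data on the device before the suspend, the writer
can flush explicitly.  The following is a minimal sketch in C using the
POSIX calls write() and fsync(); the helper name write_and_flush and the
/tmp path are assumptions made for the example, and a real test would write
to a file on the filesystem that sits on the array.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write 'len' bytes to a fresh temporary file and
 * ask the kernel to push them to the underlying device.
 * Returns 0 on success, -1 on any failure. */
static int write_and_flush(const char *buf, size_t len)
{
    char path[] = "/tmp/flush-demo-XXXXXX";  /* assumed scratch location */
    int rc = -1;
    int fd = mkstemp(path);

    if (fd < 0)
        return -1;

    /* write() normally only copies the data into the page cache. */
    ssize_t n = write(fd, buf, len);

    /* fsync() blocks until the file's dirty data has been handed to the
     * storage device, rather than waiting for periodic writeback. */
    if (n == (ssize_t)len && fsync(fd) == 0)
        rc = 0;

    close(fd);
    unlink(path);
    return rc;
}
```

Calling write_and_flush("hello", 5) returns 0 when both the write and the
flush succeed.  Without the fsync() step the data may still sit in the page
cache when the array is suspended, which matches the behaviour described in
the question.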


NeilBrown