[dm-devel] 2.6.10-rc1-udm1: multipath work in progress

Tue Nov 2 20:19:28 UTC 2004

Miscellaneous points:

On Tue, Nov 02, 2004 at 08:46:04PM +0100, christophe varoqui wrote:
> Let the kernel fail them ... as soon as the primary PG paths are
> exhausted, it will switch to the secondary PG and an event will cause
> multipathd to reconfigure the table. The secondary will become primary,
> and failed paths will come back up, grouped in a low prio PG.

Which may require rapid intervention by userspace, or the queue_if_no_paths 
pause to give userspace time to sort things out.

[Consider the case where the primary pg_init_fn finds that its paths would
be OK but aren't current, so it fails them all so that the
currently-preferred secondary can be used.  But the secondary paths then
turn out to have genuinely failed, so you *do* want to use the primary
after all, but now you can't.  How do you tell
the primary to *forcibly* use the paths?  This method has effectively
transferred the pg_init_fn to userspace.  Or it requires giving the
pg_init_fn complete knowledge of the configuration so it checks both primary
and secondary PGs before deciding what to do - but then that has an
equivalent effect to what's already implemented in these patches using PG
enable/disable. Or you have a 3rd and 4th PG duplicating the 1st & 2nd ones
but with a new 'force' flag.]
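
To make the difference concrete, here is a toy model of the scenario in
Python.  It is only a sketch, not kernel code: Path, PriorityGroup,
fail_all, choose_pg and the 'bypassed' flag are all hypothetical names, and
the two-pass selection is an assumption about how a disabled PG could be
treated, not a description of the patches.

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    usable: bool = True  # False once the path has been failed


@dataclass
class PriorityGroup:
    name: str
    paths: list
    bypassed: bool = False  # hypothetical "disabled" flag; paths stay untouched

    def has_usable_path(self):
        return any(p.usable for p in self.paths)


def fail_all(pg):
    """Mark every path in the group as failed."""
    for p in pg.paths:
        p.usable = False


def choose_pg(pgs):
    """Return the first PG, in priority order, that has a usable path.

    Assumed policy: the first pass skips bypassed PGs, the second pass
    ignores the flag, so a disabled PG is still available as a last resort.
    """
    for honour_bypass in (True, False):
        for pg in pgs:
            if honour_bypass and pg.bypassed:
                continue
            if pg.has_usable_path():
                return pg
    return None


def scenario(disable_instead_of_fail):
    """Primary steps aside for the secondary, then the secondary really fails."""
    primary = PriorityGroup("primary", [Path("p0"), Path("p1")])
    secondary = PriorityGroup("secondary", [Path("s0"), Path("s1")])
    if disable_instead_of_fail:
        primary.bypassed = True   # paths remain usable
    else:
        fail_all(primary)         # paths are gone as far as the kernel knows
    first = choose_pg([primary, secondary])
    fail_all(secondary)           # genuine hardware failure
    second = choose_pg([primary, secondary])
    return first.name, second.name if second else None
```

With scenario(False) the second choice comes back empty and userspace has
to step in; with scenario(True) the model falls back to the primary on its
own.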

[I see queue_if_no_paths very much as a last resort: it's there
as an option for not-so-good hardware.  In any decent system you should
never be left with no paths short of catastrophic hardware failure.]

> We can failback already, with the current design.
> As I see it, all the "disable PG" feature brings is save some table
> reloads. Is it worth the added complexity ?

Performing table reloads is the complex option IMHO.
[Even ignoring the suspend/resume queueing issues that aren't
resolved yet.]
Table reloads wipe all knowledge of the existing state from the kernel and
start afresh, so pg_init_fn's have to be run again etc.  They also cannot
avoid allocating memory, which might not be available immediately.
You can't assume a table reload will succeed and must always have a
fallback plan in case it fails.  
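
As a rough illustration of those two points, here is a small Python
sketch.  Everything in it is hypothetical (TableState, load_table,
reload_with_fallback and the 'allocate' callback are made-up names, not
device-mapper interfaces); it only models "a reload starts from scratch"
and "a reload can fail, so keep something to fall back on".

```python
from dataclasses import dataclass, field


@dataclass
class TableState:
    table: str
    # PGs whose pg_init_fn has already been run against this table
    initialised_pgs: set = field(default_factory=set)


def load_table(table, allocate):
    """Build state for a new table; 'allocate' stands in for memory allocation."""
    allocate()                # may raise MemoryError
    return TableState(table)  # fresh state: nothing initialised yet


def reload_with_fallback(live, new_table, allocate):
    """Try to replace the live table; on failure keep the old one.

    Returns (state, reloaded).  The caller must cope with reloaded == False.
    """
    try:
        return load_table(new_table, allocate), True
    except MemoryError:
        return live, False
```

On success the returned state has an empty initialised_pgs set, so every
pg_init_fn would have to run again; on failure the caller is still holding
the old state and has to decide what to do next.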

Alasdair
-- 
agk at redhat.com