[dm-devel] 2.6.10-rc1-udm1: multipath work in progress

Alasdair G Kergon agk at redhat.com
Tue Nov 2 15:25:28 UTC 2004


On Tue, Nov 02, 2004 at 12:03:01AM +0100, christophe varoqui wrote:
> > But a table load flag to suppress the device size checks does sound OK.
> Or just move the test at the end of the PG switch-on procedure for all
> multipath tables ? It would keep the API less complex ...
But if the test fails, then what?  Drop the whole table?
I'd rather I/O to the missing part gets errored, i.e. the same result
as always suppressing the check.  (And it's necessary for dm to cope
with device resizing anyway.)
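
To make "I/O to the missing part gets errored" concrete, here is a toy
model, not dm code.  check_io() and its arguments are invented for
illustration; it assumes the table describes more sectors than the
device currently has and that each request is checked only when issued.

```python
def check_io(start_sector, num_sectors, device_sectors):
    """Return "ok" if the request fits on the device as it is now,
    "error" if any part of it falls in the missing region.

    Hypothetical helper: the size is checked per request rather than
    once at table load time.
    """
    if start_sector < 0 or num_sectors <= 0:
        raise ValueError("invalid request")
    if start_sector + num_sectors <= device_sectors:
        return "ok"
    return "error"


# Table was loaded for 1000 sectors, but only 800 are present right now.
print(check_io(0, 8, 800))     # inside the device
print(check_io(796, 8, 800))   # straddles the end of the device
```

Once the device grows back to its full size the same requests succeed,
with no table reload needed.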
 
> What problem do we try to solve here ? Planned outages, like controller
> restart or firmware upgrades.
i.e. all paths fail simultaneously, but recover quickly.
 
> If so, I guess we can go for queue_if_no_path for all and just ask
> userspace the time & queued ios threshold before failing.
Thought about a timer, but not persuaded it gains us anything:
In the current model, only userspace can reinstate a path, so userspace
is required to intervene to resolve the situation - it might as
well handle any timeouts it wants itself.
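
As a rough sketch of what "handle any timeouts itself" could look like
in a userspace daemon: the function below is purely hypothetical
(name, arguments and thresholds are all made up here) and only shows
the decision, using the time and queued-I/O thresholds suggested above.
How the daemon learns the queue depth, or tells the kernel to stop
queueing, is left out.

```python
def should_fail_queue(down_since, now, queued_ios, max_wait_s, max_queued):
    """Decide whether userspace should give up and fail the queued I/O.

    down_since  -- time (seconds) at which the last path failed,
                   or None if at least one path is still usable
    now         -- current time in the same units
    queued_ios  -- number of I/Os currently queued (assumed known)
    max_wait_s, max_queued -- policy thresholds chosen by the admin
    """
    if down_since is None:
        return False          # paths available, nothing to time out
    waited_too_long = (now - down_since) >= max_wait_s
    queue_too_deep = queued_ios >= max_queued
    return waited_too_long or queue_too_deep


# A controller restart: all paths went down at t=0.
print(should_fail_queue(0.0, 10.0, 5, max_wait_s=30, max_queued=1000))
print(should_fail_queue(0.0, 45.0, 5, max_wait_s=30, max_queued=1000))
```

Keeping this in userspace means the policy can change without touching
the kernel interface.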

Having it as 'feature args' means we could implement alternatives
later and add other things without breaking the API.
 
> I don't quite see the benefits of PG disabling feature.
> As far as I can see, all it brings is permitting kernel code to change
> the maps, which seems like enabling policies in the kernel : from
> userspace, we have the same effect by instantiating the PG at the tail
> of the params string.

Kernel multipath always chooses the first available path in PG order.
The disabling/enabling of PGs copes with the case when switching
PG incurs a serious performance penalty (maybe across a cluster).
i.e. you don't want to call pg_init_fn more than necessary.

  PG 1 - path A
  PG 2 - path B
  PG 3 - path C

Path A fails; it starts using path B.
Path A becomes available again.  If userspace reinstates it, then it
will immediately start being used and PG 1's initialisation function
will run.  But you'd prefer to continue using path B until it
fails, and only then switch back to path A.  [Or revert to the
preferred path at a time of your choosing when the system is not
busy.]  So you set PG 1 & PG 3 to 'disabled' before reinstating path A.
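
The scenario can be walked through with a small model.  This is an
illustration only, not the kernel code: the class and function names
are invented, and it assumes that disabled PGs are still tried as a
last resort once no enabled PG has a usable path (which is what
"only then switch back to path A" implies).

```python
from dataclasses import dataclass


@dataclass
class Path:
    name: str
    usable: bool = True       # False once failed; True again when reinstated


@dataclass
class PriorityGroup:
    name: str
    paths: list
    disabled: bool = False


def choose_path(pgs):
    """Pick the first usable path in PG order.

    Pass 1 looks at enabled PGs only.  Pass 2 (an assumption of this
    sketch) falls back to the disabled PGs, again in order.
    """
    for want_disabled in (False, True):
        for pg in pgs:
            if pg.disabled != want_disabled:
                continue
            for path in pg.paths:
                if path.usable:
                    return pg.name, path.name
    return None


a, b, c = Path("A"), Path("B"), Path("C")
pg1 = PriorityGroup("PG 1", [a])
pg2 = PriorityGroup("PG 2", [b])
pg3 = PriorityGroup("PG 3", [c])
pgs = [pg1, pg2, pg3]

print(choose_path(pgs))       # ('PG 1', 'A')
a.usable = False              # path A fails
print(choose_path(pgs))       # ('PG 2', 'B')
pg1.disabled = pg3.disabled = True
a.usable = True               # userspace reinstates path A
print(choose_path(pgs))       # ('PG 2', 'B'), no switch back yet
b.usable = False              # path B fails
print(choose_path(pgs))       # ('PG 1', 'A')
```

Without the two 'disabled' flags, the third call would have returned
path A and triggered PG 1's initialisation again.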

An alternative would have been to reload the table with PG 1 and PG 2
swapped over, but table reloading is also an expensive operation
(and doesn't deal with queued I/O properly yet).
Other interface options could have been to let you change the
order of PGs dynamically (needs lots more code) or to just have a
'sticky' flag so it doesn't change PG until it has no choice (less
flexible - maybe you want to switch to path C next rather than A).

Alasdair
-- 
agk at redhat.com



