[dm-devel] SAN failover question
Christophe Varoqui
christophe.varoqui at gmail.com
Sun May 1 21:40:07 UTC 2011
On Sun, 2011-05-01 at 23:01 +0200, urgrue wrote:
> I've tried all around to find a good solution for my conundrum, without
> much luck.
>
> The point is, multipath works nicely until a bigger disaster comes along,
> e.g. a SAN or datacenter failure. Of course, like most big environments, you
> have a synchronous replica of your SAN. But you have to "do stuff" to
> get Linux to take that new LUN and get back to work: a reboot, or SAN
> rescans, forcibly removing disks and so forth. It's not very pretty.
>
> So my question is, is there any way to get multipath to treat both the
> active LUN and its passive replica (usually in read-only or offline
> state) as one and the same disk? The goal being, if your SAN fails, you
> merely have to activate your DR replica, and multipath would pick it up
> and all's well (except for the 30 sec to a few mins of I/O hanging until
> the DR was online). In essence, you'd have four paths to a LUN - 2 to
> the active one, 2 to the passive one, which is a different LUN
> technically speaking (different serials, WWNs, etc), but an identical
> synchronous replica (identical data, identical state, identical PVID, etc).
>
You would have to:
1/ set up a customized getuid program instead of the default scsi_id
(maybe based on the pairing id, if there is such a thing in your context)
2/ set the 'group_by_prio' path grouping policy
3/ develop a prioritizer shared object to assign path priorities based
on the master/slave role of the logical unit. Paths to master get prio
2, paths to slave get prio 1.
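
To make the three steps concrete, here is a minimal sketch of the logic in
Python. It is only an illustration: a real prioritizer would be a C shared
object loaded by multipathd, and the getuid program would query the array
rather than a hard-coded table. The PAIRS table, the serial strings and the
function names get_uid, get_prio and group_by_prio are all made up for this
example.

```python
#!/usr/bin/env python3
"""Sketch of the getuid and prioritizer logic for a replicated LUN pair."""
import sys

# Hypothetical pairing table: LUN serial -> (shared pairing id, role).
# In a real setup this would come from the array's replication tooling.
PAIRS = {
    "SERIAL_A": ("pair-0001", "master"),
    "SERIAL_B": ("pair-0001", "slave"),
}

PRIO_MASTER = 2
PRIO_SLAVE = 1


def get_uid(serial):
    """Step 1: return the shared pairing id so that both LUNs of a pair
    end up in the same multipath map. Unpaired LUNs keep their serial."""
    pair = PAIRS.get(serial)
    return pair[0] if pair else serial


def get_prio(serial):
    """Step 3: paths to the master get prio 2, all other paths get prio 1."""
    pair = PAIRS.get(serial)
    if pair and pair[1] == "master":
        return PRIO_MASTER
    return PRIO_SLAVE


def group_by_prio(serials):
    """Step 2: mimic the group_by_prio policy, one path group per
    priority value, highest priority first."""
    groups = {}
    for serial in serials:
        groups.setdefault(get_prio(serial), []).append(serial)
    return [groups[prio] for prio in sorted(groups, reverse=True)]


# Command-line use (hypothetical): script.py uid|prio SERIAL
if __name__ == "__main__" and len(sys.argv) == 3:
    mode, serial = sys.argv[1], sys.argv[2]
    print(get_uid(serial) if mode == "uid" else get_prio(serial))
```

With four paths (two per LUN), group_by_prio puts the two master paths in the
first path group and the two replica paths in the second, which is the layout
described above. Swapping the roles in PAIRS after a failover would reverse
the order of the groups.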
Maybe someone else can comment on the specific ro->rw promotion issue
upon path_group switching. I can't tell if it needs a hw_handler these
days.
--
Christophe Varoqui
OpenSVC - Tools to scale
http://www.opensvc.com/