[Linux-cluster] How do you HA your storage?

Mon May 30 09:45:27 UTC 2011

On 30/04/2011, at 6:31 PM, urgrue wrote:
> 
> I'm struggling to find the best way to deal with SAN failover.
> By this I mean the common scenario where you have SAN-based mirroring.
> It's pretty easy with host-based mirroring (md, DRBD, LVM, etc) but how
> can you minimize the impact and manual effort to recover from losing a
> LUN, and needing to somehow get your system to realize the data is now
> on a different LUN (the now-active mirror)?

urgrue,

As others have mentioned, this may be a little off-topic for the list. However, I reply in support of hopefully providing an answer to your original question.

In my experience the destination array of storage-based (i.e. array-to-array) replication is able to present the replication target LUN with the same ID (e.g. WWN) as that of the source LUN on the source array.

In this scenario, you would present the replicated LUN on the destination array to your server(s), and your multipathing (i.e. device-mapper-multipath) software would essentially see it as another path to the same device. You obviously need to ensure that the priority of these paths are such that no I/O operations will traverse them unless the paths to the source array have failed.

In the case of a failure on the source array, it's paths will (hopefully!) be marked as failed, your multipath software will start queueing I/O, the destination array will detect the source array failure and switch its LUN presentation to read/write and your multipathing software will resume I/O on the new paths.

There's a lot to consider here. Such live failover can often be asking for trouble, and given the total failure rates of high-end storage equipment is quite minimal, I'd only implement if absolutely required.

The above assumes synchronous replication between the arrays.

Hope this helps somewhat.

Tom