[Linux-cluster] Quorum disk over RAID software device

Wed Dec 16 19:09:04 UTC 2009

Hi Brem

On Tue, 15-12-2009 at 21:15 +0100, brem belguebli wrote:
> Hi Rafael,
> 
> I can already predict what is going to happen during your test.
> 
> If one of your nodes loses only 1 leg of your mirrored qdisk (either
> with mdadm or lvm), the qdisk will still be active from the point of
> view of this particular node, so nothing will happen.
> 
> What you should consider is
> 
> 1) reducing the scsi timeout of the lun which is by default around 60
> seconds (see udev rules)
> 2) if your qdisk lun is configured for multipath, don't configure it
> with queue_if_no_path, or mdadm will never see that one of the legs has
> become unavailable.
> 
> Brem

I ran some tests today.

A) With MDADM mirrored LUNs:
I built the MD device on top of the multipath devices and used it as the
quorum disk. It seemed to work, but in one test, when I intentionally
failed a LUN on a single machine, that node could no longer access the
quorum device and was evicted by the rest of the nodes. I have to take a
closer look at this, because it did not happen in other attempts. I
think it is related to the device timeouts, retries and queues.
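
A minimal sketch of the kind of setup used in test A, for reference. The
device names, the MD device and the qdisk label are placeholders and do
not come from the test described here. By default the script only prints
the commands (DRY_RUN=1) instead of running them.

```shell
#!/bin/sh
# Sketch only: all names below are assumed placeholders.
LEG_A=${LEG_A:-/dev/mapper/mpath0}    # multipath device of the first LUN (assumed)
LEG_B=${LEG_B:-/dev/mapper/mpath1}    # multipath device of the second LUN (assumed)
MD_DEV=${MD_DEV:-/dev/md0}            # MD mirror to build (assumed)
QDISK_LABEL=${QDISK_LABEL:-qdisk_md}  # quorum disk label (assumed)
DRY_RUN=${DRY_RUN:-1}                 # 1 = print the commands, anything else = run them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

build_qdisk() {
    # RAID1 mirror across the two multipath devices
    run mdadm --create "$MD_DEV" --level=1 --raid-devices=2 "$LEG_A" "$LEG_B"
    # Initialise the mirror as a quorum disk with the given label
    run mkqdisk -c "$MD_DEV" -l "$QDISK_LABEL"
}

build_qdisk
```

Setting DRY_RUN=0 would run the two commands for real, which overwrites
whatever is on the devices.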

B) With non-clustered LVM-mirrored LUNs:
This seems to work too, but with some strange behaviour. When I
intentionally failed a LUN on a single machine, the LVM layer on that
node did not notice that one device was unreachable, although the
multipath daemon was marking the device as failed. In other attempts it
worked correctly.

Also, as you suggested, I have to check the values in the udev rules and
in the multipath.conf file:

device {
vendor			"HP"
product			"MSA VOLUME" 
path_grouping_policy	group_by_prio
getuid_callout		"/sbin/scsi_id -g -u -s /block/%n"
path_checker		tur
path_selector		"round_robin 0"
prio_callout		"/sbin/mpath_prio_alua /dev/%n"
rr_weight		uniform
failback		immediate
hardware_handler	"0"
no_path_retry		12
rr_min_io		100
}

Note: this is my testing scenario. The production environment is not
using MSA storage arrays.
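
For the SCSI timeout side of the check, a small sketch that lists the
current per-disk timeout from sysfs. The function name and the optional
sysfs root argument are made up for illustration; the only thing it
relies on is the usual /sys/block/sdX/device/timeout attribute.

```shell
#!/bin/sh
# Print "<disk> <timeout in seconds>" for every SCSI disk found under the
# given sysfs root (defaults to /sys). The function name is hypothetical.
show_scsi_timeouts() {
    sysfs_root=${1:-/sys}
    for f in "$sysfs_root"/block/sd*/device/timeout; do
        # Skip the unexpanded glob when no sd* disks exist
        [ -r "$f" ] || continue
        disk=${f#"$sysfs_root"/block/}   # e.g. sda/device/timeout
        disk=${disk%%/*}                 # e.g. sda
        echo "$disk $(cat "$f")"
    done
}

show_scsi_timeouts
```

A value can be changed at runtime by writing to the same file as root,
for example "echo 10 > /sys/block/sdX/device/timeout", but a change made
this way does not survive a reboot unless it is also set in a udev rule.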

I'm thinking of reducing "no_path_retry" to a smaller value or even to
"fail". With the current value (which, according to the RHEL docs,
behaves like "queue_if_no_path" limited to 12 retries) MDADM did see the
failure of the device, so this is more or less working.
I'm also interested in the "flush_on_last_del" parameter. Have you ever
tried it?
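
A rough way to estimate how long a given "no_path_retry" keeps I/O
queued before the failure reaches MD or LVM. This assumes the queueing
time is simply no_path_retry multiplied by polling_interval, and that
polling_interval is left at its usual default of 5 seconds. Both are
assumptions, not values measured in these tests.

```shell
#!/bin/sh
# Seconds that multipath keeps queueing I/O after the last path fails,
# assuming one retry per path checker interval.
# $1 = no_path_retry, $2 = polling_interval in seconds
queue_seconds() {
    echo $(( $1 * $2 ))
}

queue_seconds 12 5   # current setting: prints 60
queue_seconds 2 5    # a smaller value: prints 10
```

Under those assumptions the current setting queues for about a minute,
which is worth comparing with the time the cluster waits before evicting
a node that stops updating the quorum disk.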

Thanks in advance. Cheers,

Rafael

-- 
Rafael Micó Miranda