[Linux-cluster] Quorum disk over RAID software device

brem belguebli brem.belguebli at gmail.com
Wed Dec 16 19:41:26 UTC 2009


In my multipath setup I use the following:

polling_interval        3 (checks the storage every 3 seconds)
no_path_retry           5 (if a path fails, it is retried 5 times before
being failed, so the outage lasts the scsi timeout
(/sys/block/sdXX/device/timeout) plus 5 * 3 seconds)

path_grouping_policy    multibus (to load-balance across all paths;
group_by_prio may be more appropriate with the MSA if it is an
active/passive array?)
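
As a rough sketch, those settings map onto multipath.conf like this (the
vendor/product strings are just placeholders borrowed from your MSA
example below, adjust them to your array):

defaults {
        # check paths every 3 seconds
        polling_interval        3
}
devices {
        device {
                vendor                  "HP"
                product                 "MSA VOLUME"
                # spread I/O across all paths
                path_grouping_policy    multibus
                # retry for 5 polling intervals before failing the I/O
                no_path_retry           5
        }
}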

From my experience, when using a mirror (md or LVM), no_path_retry could
be set to fail instead of 5 (the value I use).

Concerning flush_on_last_del, it just controls, for a given LUN, what
happens when only one path remains and that last path fails: whether
queued I/O is flushed (failed) or kept queued.

Same consideration: if using a mirror, just fail.
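
So for a mirrored qdisk the device stanza would look more like this
(again only a sketch, same placeholder vendor/product):

device {
        vendor                  "HP"
        product                 "MSA VOLUME"
        path_grouping_policy    multibus
        # fail I/O immediately instead of queueing, and let md/LVM handle it
        no_path_retry           fail
        # fail any queued I/O when the last path to the LUN is removed
        flush_on_last_del       yes
}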

The thing to take into account is the interval at which your qdisk
process accesses the qdisk LUN. If it is configured to a high value
(let's imagine every 65 seconds), it will take (worst case) 60 seconds
of scsi timeout (the default) + 12 times the default polling interval
(30 seconds, if I'm not wrong) + 5 seconds = 425 seconds...
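
On the qdisk side that interval lives in cluster.conf; a minimal sketch
with hypothetical values (the point being that interval * tko, the time
before a node is declared dead, has to be weighed against the worst-case
I/O stall on the qdisk LUN):

<!-- hypothetical values: with interval="3" and tko="10" a node is
     evicted after roughly 30 seconds without a successful qdisk cycle -->
<quorumd interval="3" tko="10" votes="1" label="my_qdisk"/>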

Brem

2009/12/16 Rafael Micó Miranda <rmicmirregs at gmail.com>:
> Hi Brem
>
> On Tue, 15-12-2009 at 21:15 +0100, brem belguebli wrote:
>> Hi Rafael,
>>
>> I can already predict what is going to happen during your test
>>
>> If one of your nodes loses only one leg of its mirrored qdisk (either
>> with mdadm or LVM), the qdisk will still be active from the point of
>> view of that particular node, so nothing will happen.
>>
>> What you should consider is:
>>
>> 1) reducing the scsi timeout of the LUN, which is by default around 60
>> seconds (see the udev rules)
>> 2) if your qdisk LUN is configured through multipath, don't configure
>> it with queue_if_no_path or mdadm will never notice that one of the
>> legs has become unavailable.
>>
>> Brem
>
> I made some tests today.
>
> A) With MDADM-mirrored LUNs:
> I built the MD device on top of the multipathd devices and used it as a
> quorum disk. It seemed to work, but in one test, during the intentional
> failure of a LUN on a single machine, the node failed to access the
> quorum device and was evicted by the rest of the nodes. I have to take
> a closer look at this because it didn't happen in other attempts; I
> think it is related to the device timeouts, retries and queues.
>
> B) With non-clustered LVM-mirrored LUNs:
> This seems to work too, but there are some strange behaviours. During
> the intentional failure of a LUN on a single machine, the node did not
> see at the LVM layer that one device was unreachable, even though the
> multipath daemon was marking the device as failed. In other attempts it
> worked correctly.
>
> Also I have to check, as you commented, the values in the udev rules
> and in the multipath.conf file:
>
> device {
> vendor                  "HP"
> product                 "MSA VOLUME"
> path_grouping_policy    group_by_prio
> getuid_callout          "/sbin/scsi_id -g -u -s /block/%n"
> path_checker            tur
> path_selector           "round_robin 0"
> prio_callout            "/sbin/mpath_prio_alua /dev/%n"
> rr_weight               uniform
> failback                immediate
> hardware_handler        "0"
> no_path_retry           12
> rr_min_io               100
> }
>
> Note: this is my testing scenario. The production environment is not
> using MSA storage arrays.
>
> I'm thinking of reducing "no_path_retry" to a smaller value, or even to
> "fail". With the current value of 12 (which, according to the RHEL docs,
> is equivalent to "queue_if_no_path" for 12 retries), MDADM saw the
> failure of the device, so this is more or less working.
> I'm also interested in the "flush_on_last_del" parameter, have you ever
> tried it?
>
> Thanks in advance. Cheers,
>
> Rafael
>
> --
> Rafael Micó Miranda
>



