[dm-devel] Designing a new prio_callout

Ethan John ethan.john at gmail.com
Mon Aug 27 15:50:19 UTC 2007


Thanks again, Hannes. We really appreciate your time on this.

Stefan's suggestion will be a great option for round-robin failover for our
first release. We'll try to figure out a way to do that. It also sounds like
using the ALUA callout is going to be the best long-term solution, which
answers the original question that I posed.
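
As a rough illustration of what such a callout could compute, here is a minimal sketch. It is not Stefan's script, which is not reproduced in this message. It relies on the usual callout convention that the program prints an integer priority and multipath prefers the group with the higher value. The function name, the portal_index argument, and the idea of hashing the WWID are all assumptions made for illustration; how a real callout would obtain the WWID and portal for a path (for example from sysfs) is left out.

```python
import zlib

# WWIDs copied from the multipath output quoted below.
WWIDS = [
    "20002c9020020001a00151b6b46bb57b0",  # mpath45
    "20002c9020020001200151b6b46bb57ae",  # mpath44
]


def path_priority(wwid: str, portal_index: int, portal_count: int) -> int:
    """Return 2 for the one portal this LU prefers and 1 for all others.

    Hypothetical helper: portal_index is the position of the path's
    target portal in some fixed ordering (0 for the first IP, 1 for the
    second, and so on).
    """
    # crc32 gives the same value on every run and every host, unlike
    # Python's built-in hash() for strings.
    preferred = zlib.crc32(wwid.encode("ascii")) % portal_count
    return 2 if portal_index == preferred else 1


for wwid in WWIDS:
    # A real callout would print one integer for the single path it was asked about.
    print(wwid, [path_priority(wwid, i, 2) for i in range(2)])
```

A hash-based choice like this does not guarantee an even split when there are only a few LUs, which is the same limitation noted below for mpath_prio_random. An explicit table mapping each WWID to a preferred portal would be the deterministic alternative.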

As for putting failover paths on a different subnet, that will be up to the
user to manage. We won't prevent that sort of thing, and network
configurations are extremely flexible on our systems. The subnet also doesn't
necessarily affect the physical network path on our system.
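
For anyone following along, the subnet point can be checked with a short sketch like the one below. It compares the two portal addresses we use today with the addresses from Hannes's example quoted further down. The /24 prefix length is an assumption; the actual netmask is not stated anywhere in this thread.

```python
import ipaddress

# Assumed /24 netmask; the thread does not state the actual mask.
PREFIX = 24


def same_subnet(a: str, b: str, prefix: int = PREFIX) -> bool:
    """True if both addresses fall in the same network for the given prefix length."""
    net_a = ipaddress.ip_interface(f"{a}/{prefix}").network
    net_b = ipaddress.ip_interface(f"{b}/{prefix}").network
    return net_a == net_b


print(same_subnet("10.53.152.22", "10.53.152.23"))  # current portals -> True
print(same_subnet("10.53.152.22", "10.53.153.22"))  # Hannes's example -> False
```

This only tells you how the initiator's routing will treat the two portals. As noted above, it says nothing about which physical links the traffic crosses on our systems.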

Again, thanks so much for your help!

On 8/27/07, Hannes Reinecke <hare at suse.de> wrote:
>
> Ethan John wrote:
> > For the record, setting rr_min_io to something extremely large (we're
> > using 2 billion now, since I'm assuming it's a C integer) solves the
> > immediate problem that we're having (overhead in path switching causing
> > poor performance). Telling people to use mpath_prio_random is still less
> > than ideal for any small number of iSCSI targets, but it's a better
> > short-term solution for us than nothing.
> >
> In setting rr_min_io to something extremely large you effectively
> disable the round-robin scheduler in multipathing.
> That's okay for the failover scenario you have (as you only have
> one path per group), but whenever you have more than one path
> in a group that wouldn't work anymore.
>
> > On 8/10/07, Ethan John <ethan.john at gmail.com> wrote:
> >> Hannes, thanks again for your help with this.
> >>
> >> I haven't noticed that failback does the right thing, but I'll try it
> >> out again. Could be something we're doing wrong. In any case, there's
> >> very little documentation on all this, and I'm trying to develop some
> >> kind of strategy for our Linux customers to use until we get ALUA
> >> implemented.
> >>
> >> Being able to set path priorities manually would be ideal, but it seems
> >> like this is impossible, right?
> >>
> >> Here's the situation we have right now. I initiate two connections to
> >> one target, across two sessions with two different IPs, with two LUs.
> >> Multipath looks like this:
> >> mpath45 (20002c9020020001a00151b6b46bb57b0) dm-1 company,iSCSI target
> >> [size=15G][features=0][hwhandler=0]
> >> \_ round-robin 0 [prio=1][active]
> >>  \_ 22:0:0:1 sdc 8:32  [active][ready]
> >> \_ round-robin 0 [prio=1][enabled]
> >>  \_ 23:0:0:1 sde 8:64  [active][ready]
> >> mpath44 (20002c9020020001200151b6b46bb57ae) dm-0 company,iSCSI target
> >> [size=15G][features=0][hwhandler=0]
> >> \_ round-robin 0 [prio=1][enabled]
> >>  \_ 22:0:0:0 sdb 8:16  [active][ready]
> >> \_ round-robin 0 [prio=1][enabled]
> >>  \_ 23:0:0:0 sdd 8:48  [active][ready]
> >>
> >> Note that there are only two active sessions:
> >> # iscsiadm -m session
> >> tcp: [20] 10.53.152.22:3260,1 iqn.2001-07.com.company:qaiscsi2:blah1
> >> tcp: [21] 10.53.152.23:3260,2 iqn.2001-07.com.company:qaiscsi2:blah1
> >>
> >> So the result is that all activity is routed to the first session that
> >> was initiated. I want to change the priorities of the paths to allow
> >> for traffic to go to the first IP for mpath45 and the second IP for
> >> mpath44.
> >>
> That's a matter of the IP routing. Having both targets on the same
> (sub-)net doesn't work very well with multipathing. Please set up your
> system with each iSCSI target port in a different subnet, e.g.
>
> 10.53.152.22:3260,1 iqn.2001-07.com.company:qaiscsi2:blah1
> 10.53.153.22:3260,2 iqn.2001-07.com.company:qaiscsi2:blah1
>
> then you'll have one iSCSI target port per subnet and you can actually
> do failover etc.
>
> >> Obviously ALUA is the way to go for this in the future, but we won't
> >> have the resources to implement that, so I'm looking for an interim
> >> solution that will scale to thousands of clients. Right now, the only
> >> thing I can tell people is to manually initiate connections to certain
> >> targets through certain IP addresses -- basically, doing the load
> >> balancing themselves. Is there a better way?
> >>
> No, not really. But I'm not a network guru. You may want to ask on
> the open-iscsi mailing list.
>
> And you can get all information you need via sysfs, so it should
> be possible to create a script like Stefan Bader suggested.
>
> Cheers,
>
> Hannes
> --
> Dr. Hannes Reinecke                   zSeries & Storage
> hare at suse.de                          +49 911 74053 688
> SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
> GF: Markus Rex, HRB 16746 (AG Nürnberg)
>
> --
> dm-devel mailing list
> dm-devel at redhat.com
> https://www.redhat.com/mailman/listinfo/dm-devel
>



-- 
Ethan John
http://www.flickr.com/photos/thaen/
(206) 841.4157