[dm-devel] A step behind?
Nicola Ranaldo
ranaldo at unina.it
Thu Dec 16 16:46:15 UTC 2004
OK guys, please answer this question for me, or I'll waste all my
remaining time.
Since the multipath tools give me random problems, I *suppose* it could be
a problem in the kernel's dm multipath target.
I'm now testing with only the dmsetup tool, in order to check the multipath
target at a low level.
To recap my configuration:
hp pc server dl380
qlogic 2312 kernel builtin driver single path (qlport_down_retry 1)
hsg80 dual controller in multibus failover
the fabric connected to both controllers
slackware 10.0 kernel 2.6.10rc-2 udm-2
devmapper 1.00.19
I configured a unit on the hsg80, and it appears to the system as an active
path, /dev/sdb, and a ghost path, /dev/sda.
So this is my table:
disk1: 0 71114623 multipath 1 queue_if_no_path 0 2 2 round-robin 0 1 1 8:0 1000 round-robin 0 1 1 8:16 1000
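For readers less used to the raw table syntax, the line above can be split
into its fields with a short script. This is only a sketch. It assumes the
multipath target's parameters are laid out as num_features, the features,
num_hw_handler_args, the hardware handler arguments, num_groups and
initial_group, followed for each group by the selector name, its argument
count and arguments, num_paths, num_path_args, and then one device plus its
arguments per path. The function name parse_multipath_table and the
dictionary keys are made up for illustration.

```python
def parse_multipath_table(line):
    """Split a 'dmsetup table' line for a multipath target into fields.

    Assumed layout (not checked against this exact kernel version):
      name: start length multipath
        num_features [features...] num_hw_args [hw_args...]
        num_groups initial_group
        per group: selector num_sel_args [sel_args...]
                   num_paths num_path_args (device [path_args...])...
    """
    # Only the first colon separates the map name; "8:0" etc. stay intact.
    name, rest = line.split(":", 1)
    tok = rest.split()
    start, length, target = int(tok[0]), int(tok[1]), tok[2]
    if target != "multipath":
        raise ValueError("not a multipath table: " + target)
    pos = 3

    def take(n):
        """Consume and return the next n tokens."""
        nonlocal pos
        items = tok[pos:pos + n]
        if len(items) != n:
            raise ValueError("table line is truncated")
        pos += n
        return items

    features = take(int(take(1)[0]))
    hw_handler = take(int(take(1)[0]))
    num_groups, initial_group = (int(x) for x in take(2))

    groups = []
    for _ in range(num_groups):
        selector = take(1)[0]
        selector_args = take(int(take(1)[0]))
        num_paths, num_path_args = (int(x) for x in take(2))
        paths = []
        for _ in range(num_paths):
            device = take(1)[0]
            paths.append({"device": device, "args": take(num_path_args)})
        groups.append({"selector": selector,
                       "selector_args": selector_args,
                       "paths": paths})

    return {"name": name.strip(), "start": start, "length": length,
            "features": features, "hw_handler": hw_handler,
            "initial_group": initial_group, "groups": groups}


table = ("disk1: 0 71114623 multipath 1 queue_if_no_path 0 2 2 "
         "round-robin 0 1 1 8:0 1000 round-robin 0 1 1 8:16 1000")
parsed = parse_multipath_table(table)
```

Read this way, the table has two groups of one path each, 8:0 (sda) and
8:16 (sdb), with the second group as the initial one, which fits /dev/sdb
being the active path.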
I start some write operations on the disk (I have a task syncing every
second in order to stress the disk).
When I manually fail the active path (for example, by restarting the
controller that has the unit online), dmsetup status reports flag "F" for
every path.
I think this is normal, because the hsg80 is not fast enough at bringing
the unit online on the remaining controller.
So when the kernel tries the alternate path, it finds it down (and fails
it).
Because queue_if_no_path is set, the output is queued and my process is
not disrupted.
I can see the queue growing with dmsetup status disk1, but after some
seconds the sync/writing process goes into D state. Is that normal, or is
it simply a limit on the queuing?
I then start issuing dmsetup message disk1 0 reinstate_path 8:0, and 8:16,
alternately and at random (yes, I think I can also reinstate a failed path,
and I expect the target to retry it and fail it again; if that is not
correct, I don't think any multipath tool will be more useful than my
manual commands).
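The manual reinstate step can also be scripted. A minimal sketch, assuming
the map is named disk1 and the paths are 8:0 and 8:16 as in the table; the
helper names reinstate_command and reinstate_all are hypothetical, and the
real dmsetup call needs root.

```python
import subprocess


def reinstate_command(map_name, device, sector=0):
    """Build argv for: dmsetup message <map> <sector> reinstate_path <dev>."""
    return ["dmsetup", "message", map_name, str(sector),
            "reinstate_path", device]


def reinstate_all(map_name, devices, run=subprocess.call):
    """Send reinstate_path for each device and return the exit codes.

    'run' defaults to subprocess.call; passing another callable allows a
    dry run without touching the device-mapper.
    """
    return [run(reinstate_command(map_name, dev)) for dev in devices]
```

A dry run that only prints the commands would be
reinstate_all("disk1", ["8:0", "8:16"], run=print).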
After some seconds I can see the queue shrinking; when it reaches 0, the
sync/writing process wakes up and everything continues normally.
This, however, is NOT the normal behaviour, and after some testing (at
random, but never after more than 10/20 tests) the process gets disrupted.
I want to know whether I am correct (or am I mad?) in assuming that
queue_if_no_path will never disrupt a process!
If so (and I really think it is), it is useless for me to spend nights
stracing the multipath tools and reading a lot of source code!
In this scenario I'd also like a good interpretation of the kernel
messages:
*) scsi error
*) lost page error
*) incorrect number of segments
and so on
Please help me!
Regards
Nicola Ranaldo