[dm-devel] A step behind?

Nicola Ranaldo ranaldo at unina.it
Thu Dec 16 16:46:15 UTC 2004


OK guys, please help me clear up this doubt, or I'll waste all my
remaining time.
The multipath tools give me random problems, and I *suppose* it could be a
kernel problem in the dm multipath target.
I'm now doing some testing using only the dmsetup tool, in order to check
the multipath target at a low level.
To recall my configuration:

HP DL380 PC server
QLogic 2312, kernel built-in driver, single path (qlport_down_retry 1)
HSG80 dual controller in multibus failover
the fabric is connected to both controllers

Slackware 10.0, kernel 2.6.10rc-2 udm-2
devmapper 1.00.19

I configured a unit on the HSG80, and it appears to the system as an active
path /dev/sdb and a ghost path /dev/sda.

So this is my table:
disk1: 0 71114623 multipath 1 queue_if_no_path 0 2 2 round-robin 0 1 1 8:0 
1000 round-robin 0 1 1 8:16 1000
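
For reference, the fields of the table line above can be split out with a
short Python sketch. It assumes the field order documented for the dm
multipath target (feature count and features, hardware handler count,
number of priority groups, initial group, then for each group: selector,
selector argument count, path count, per-path argument count, and the
paths). The udm-2 patch set on this kernel may differ, and the function
name parse_multipath_table is made up for illustration.

```python
# Sketch: split a device-mapper "multipath" table line into named fields.
# ASSUMPTION: the field order below is the one documented for the dm
# multipath target; the udm-2 patches used in this thread may differ.

def parse_multipath_table(line):
    tokens = line.split()
    # Drop an optional "name:" prefix as printed by "dmsetup table".
    if tokens and tokens[0].endswith(":"):
        tokens = tokens[1:]
    if len(tokens) < 3 or tokens[2] != "multipath":
        raise ValueError("not a multipath table line")
    start, length = int(tokens[0]), int(tokens[1])
    pos = 3

    def take(n):
        # Consume the next n tokens, failing loudly on a truncated line.
        nonlocal pos
        chunk = tokens[pos:pos + n]
        if len(chunk) != n:
            raise ValueError("table line ended early")
        pos += n
        return chunk

    features = take(int(take(1)[0]))
    hw_handler = take(int(take(1)[0]))
    num_groups, initial_group = (int(t) for t in take(2))

    groups = []
    for _ in range(num_groups):
        selector = take(1)[0]
        selector_args = take(int(take(1)[0]))
        num_paths, num_path_args = (int(t) for t in take(2))
        paths = []
        for _ in range(num_paths):
            dev = take(1)[0]
            paths.append({"dev": dev, "args": take(num_path_args)})
        groups.append({"selector": selector,
                       "selector_args": selector_args,
                       "paths": paths})

    if pos != len(tokens):
        raise ValueError("unexpected trailing tokens")
    return {"start": start, "length": length, "features": features,
            "hw_handler": hw_handler, "initial_group": initial_group,
            "groups": groups}


table = ("disk1: 0 71114623 multipath 1 queue_if_no_path 0 2 2 "
         "round-robin 0 1 1 8:0 1000 round-robin 0 1 1 8:16 1000")
info = parse_multipath_table(table)
print(info["features"], info["initial_group"])
for group in info["groups"]:
    print(group["selector"], [p["dev"] for p in group["paths"]])
```

Under that reading, the table has two priority groups with one path each
(8:0 and 8:16, i.e. /dev/sda and /dev/sdb), and the initial group is the
second one, which holds the active path.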

I begin some write operations on the disk (I have a task calling sync every
second in order to stress the disk).
When I manually fail the active path (for example by restarting the
controller that has the unit online), dmsetup status reports flag "F" for
every path.
I think this is normal, because the HSG80 is not fast enough at bringing the
unit online on the remaining controller.
So when the kernel tries the alternate path, it finds it down (and fails it).
Because of queue_if_no_path, the output is queued and my process is not
disrupted.
I can see the queue growing with dmsetup status disk1, but after some
seconds the sync/writing process goes into D state. Is this normal, or is
it simply a limit on the queuing?
I then start issuing dmsetup message disk1 0 reinstate_path 8:0 and 8:16,
alternately and at random. (Yes, I think I can also reinstate a path that
is still failed; I expect the target to retry it and fail it again. If that
is not correct, I think no multipath tool will do better than my manual
commands.)
After some seconds I can see the queue shrinking; when it reaches 0, the
sync/writing process wakes up and everything continues normally.
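
The manual reinstate loop described above could be scripted roughly as
follows. This is only a sketch: the helper names, the attempts count and
the delay are assumptions for illustration, and the only dmsetup
invocation used is the message form quoted above. By default nothing is
executed; the commands are just built and returned.

```python
# Sketch: alternate "reinstate_path" messages between the two paths.
# ASSUMPTION: helper names, attempt count and delay are illustrative only.
import itertools
import subprocess
import time


def reinstate_argv(map_name, path):
    # Same command as typed by hand: dmsetup message <map> 0 reinstate_path <dev>
    return ["dmsetup", "message", map_name, "0", "reinstate_path", path]


def reinstate_alternately(map_name, paths, attempts, run=False, delay=1.0):
    issued = []
    # Cycle through the paths in order until the attempt budget is used up.
    for path in itertools.islice(itertools.cycle(paths), attempts):
        argv = reinstate_argv(map_name, path)
        if run:
            # A still-failed path is expected to be failed again by the
            # target, so a non-zero exit status is not treated as fatal.
            subprocess.run(argv, check=False)
            time.sleep(delay)
        issued.append(argv)
    return issued


# Dry run: only print the commands that would be issued.
for argv in reinstate_alternately("disk1", ["8:0", "8:16"], attempts=4):
    print(" ".join(argv))
```

Passing run=True (as root) would actually send the messages; watching
dmsetup status disk1 in another terminal then shows whether the queue
shrinks.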
This, however, is NOT the normal behaviour: after some testing (at random,
but never more than 10/20) the process gets disrupted.
I want to know whether I am correct (or am I mad?) in assuming that
queue_if_no_path will never disrupt a process!
If so (and I really think it is), it is useless for me to spend nights
stracing the multipath tools and reading a lot of source code!

In this scenario I'd also like to have a good interpretation of these kernel
messages:

*) scsi error
*) lost page error
*) incorrect number of segments

and so on

Please help me!

Regards

    Nicola Ranaldo 



