[dm-devel] Multipath and HSG80 phase 2
Nicola Ranaldo
ranaldo at unina.it
Mon Dec 13 10:24:43 UTC 2004
> Indeed,
> can you audit your fixes in
> http://christophe.varoqui.free.fr/multipath-tools/multipath-tools-0.4.0.tar.bz2
> before I release it ?
OK, the tools no longer segfault. The last thing I have to check is the
clone syscall: on my system (Slackware 10.0) I have to use fork to get the
multipathd daemon to run.
With clone, strace on multipathd gives:
brk(0) = 0x8051000
brk(0x8052000) = 0x8052000
brk(0) = 0x8052000
brk(0) = 0x8052000
brk(0x8056000) = 0x8056000
clone(child_stack=0x8055040, flags=CLONE_NEWNS) = 2443
exit_group(0) = ?
and the process dies...
Is the clone call necessary? Does the process run properly even if I use
fork?
> ... and report on general behaviour.
OK, some progress has been made :)))
Failover initiated by an "sg_start /dev/sgx 1" works properly! I can switch
many times between the active and ghost paths, with a 1/2 second delay
between switches, with no process disruption! Great :)
However, a failover initiated by a "restart other" on the HSG80 console
gives:
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 655327
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 656343
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 656351
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 657367
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 657375
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 658391
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 658399
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 658407
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 658903
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 658911
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 659927
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 659935
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 660951
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 660959
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 660967
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 661983
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 661991
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 663007
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 663015
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 664031
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 664039
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 664047
Dec 13 11:13:55 m3 kernel: SCSI error : <0 0 1 2> return code = 0x20000
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 664791
Dec 13 11:13:55 m3 kernel: end_request: I/O error, dev sdb, sector 664799
Dec 13 11:13:55 m3 kernel: Incorrect number of segments after building list
Dec 13 11:13:55 m3 kernel: counted 8, received 1
Dec 13 11:13:55 m3 kernel: req nr_sec 1024, cur_nr_sec 8
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81523
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81524
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81525
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81526
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81527
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81528
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81529
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81530
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81531
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Buffer I/O error on device dm-1, logical block 81532
Dec 13 11:13:55 m3 kernel: lost page write due to I/O error on dm-1
Dec 13 11:13:55 m3 kernel: Incorrect number of segments after building list
Dec 13 11:13:55 m3 kernel: counted 8, received 1
Dec 13 11:13:55 m3 kernel: req nr_sec 1024, cur_nr_sec 8
Dec 13 11:14:06 m3 kernel: SCSI error : <0 0 1 2> return code = 0x10000
Dec 13 11:14:06 m3 multipathd: 8:16 : tur checker reports path is down
Dec 13 11:14:06 m3 kernel: SCSI error : <0 0 1 2> return code = 0x10000
Dec 13 11:14:06 m3 last message repeated 4 times
Dec 13 11:14:06 m3 multipathd: event checker startup : disk1
Dec 13 11:14:16 m3 multipathd: 8:0 : tur checker reports path is up
Dec 13 11:14:18 m3 kernel: SCSI error : <0 0 1 2> return code = 0x10000
Dec 13 11:14:18 m3 last message repeated 2 times
Dec 13 11:14:42 m3 kernel: Incorrect number of segments after building list
Dec 13 11:14:42 m3 kernel: counted 8, received 1
Dec 13 11:14:42 m3 kernel: req nr_sec 1024, cur_nr_sec 8
Dec 13 11:14:42 m3 multipathd: devmap event on disk1
Dec 13 11:14:42 m3 kernel: Incorrect number of segments after building list
Dec 13 11:14:42 m3 kernel: counted 8, received 1
Dec 13 11:14:42 m3 kernel: req nr_sec 1024, cur_nr_sec 8
Dec 13 11:14:42 m3 kernel: Incorrect number of segments after building list
Dec 13 11:14:42 m3 kernel: counted 8, received 1
Dec 13 11:14:42 m3 kernel: req nr_sec 1024, cur_nr_sec 8
Dec 13 11:14:44 m3 kernel: SCSI error : <0 0 1 2> return code = 0x10000
Dec 13 11:14:44 m3 last message repeated 2 times
Dec 13 11:14:44 m3 multipathd: event checker startup : disk1
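To read the multipathd lines in the log above it helps to map the major:minor
pairs back to device names. A minimal sketch, assuming the usual Linux
numbering for SCSI disks (major 8, 16 minors per disk); the function names
sd_name and checker_path are made up for illustration and are not part of
multipath-tools:

```python
import re
import string

SD_MAJOR = 8          # first SCSI disk major number on Linux (assumption stated above)
MINORS_PER_DISK = 16  # each sd disk owns 16 minors: whole disk plus 15 partitions

def sd_name(major, minor):
    """Map a major:minor pair to an sd device name, e.g. 8:16 -> 'sdb'."""
    if major != SD_MAJOR or not 0 <= minor < 256:
        raise ValueError("this sketch only handles major 8 (sda..sdp)")
    disk = string.ascii_lowercase[minor // MINORS_PER_DISK]
    part = minor % MINORS_PER_DISK
    # Partition 0 means the whole disk, so no numeric suffix is appended.
    return f"sd{disk}{part or ''}"

# Matches lines such as "multipathd: 8:16 : tur checker reports path is down".
CHECKER_RE = re.compile(r"multipathd: (\d+):(\d+) : tur checker reports path is (\w+)")

def checker_path(line):
    """Return (device name, state) for a tur checker log line, or None."""
    m = CHECKER_RE.search(line)
    if m is None:
        return None
    return sd_name(int(m.group(1)), int(m.group(2))), m.group(3)
```

With this, the "8:16 ... path is down" line refers to sdb and the
"8:0 ... path is up" line refers to sda.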
After a long delay the random write operation (blocked by the failure)
restarts!
But in the log I have:
Dec 13 11:15:43 m3 kernel: SCSI error : <0 0 1 2> return code = 0x10000
Dec 13 11:16:16 m3 last message repeated 12 times
Dec 13 11:16:44 m3 last message repeated 17 times
and multipath -l -v3 gives:
0:0:1:2: sg_io failed status 0x0 0x1 0x0 0x0
0:0:1:2: Unable to get INQUIRY vpd 1 page 0x0.
disk1 (360001fe1001613800009205005470164)
[size=33 GB][features="0"][hwhandler="0"]
\_ round-robin 0 [active][first]
\_ 0:0:0:2 sda 8:0 [ready ][active]
The second path is lost!
To double-check, running sg_start on the lost path gives:
start_stop: Host_status=0x01 [DID_NO_CONNECT]
All of this happens without an oops.
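The "return code = 0x..." values in the kernel log can be decoded the same
way sg_start reports Host_status. A small sketch, assuming the usual Linux
SCSI result layout (status, message, host and driver bytes, from the low byte
up) and the DID_* host codes from the kernel's SCSI headers; decode_result is
a hypothetical helper name:

```python
# Host byte values as defined in the Linux SCSI headers.
HOST_BYTE = {
    0x00: "DID_OK",
    0x01: "DID_NO_CONNECT",
    0x02: "DID_BUS_BUSY",
    0x03: "DID_TIME_OUT",
    0x04: "DID_BAD_TARGET",
    0x05: "DID_ABORT",
    0x06: "DID_PARITY",
    0x07: "DID_ERROR",
    0x08: "DID_RESET",
    0x09: "DID_BAD_INTR",
}

def decode_result(code):
    """Split a 32-bit SCSI result into its four bytes and name the host byte."""
    host = (code >> 16) & 0xFF
    return {
        "status": code & 0xFF,          # SCSI status byte from the target
        "msg": (code >> 8) & 0xFF,      # message byte
        "host": host,                   # host adapter / transport status
        "driver": (code >> 24) & 0xFF,  # driver byte
        "host_name": HOST_BYTE.get(host, "unknown"),
    }

for code in (0x20000, 0x10000):
    print(hex(code), decode_result(code)["host_name"])
```

Read this way, 0x20000 during the controller restart is DID_BUS_BUSY, and
the later 0x10000 is DID_NO_CONNECT, which matches the sg_start output.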
thanks
Nicola Ranaldo