[dm-devel] dm-io async WRITE_SAME results in iSCSI NULL pointer [was: Re: Write same support]

Mike Snitzer snitzer at redhat.com
Thu Feb 16 21:35:02 UTC 2012


On Thu, Feb 16 2012 at  4:25pm -0500,
Mike Christie <michaelc at cs.wisc.edu> wrote:

> On 02/16/2012 03:03 PM, Mike Snitzer wrote:
> > On Thu, Feb 16 2012 at  3:02pm -0500,
> > Mike Snitzer <snitzer at redhat.com> wrote:
> > 
> >> FYI, I'll bounce a message detailing the iSCSI scatter-gather NULL
> >> pointer I _always_ hit with dm-io issuing async WRITE_SAME.
> > 
> > I developed a patch for dm-io so that the new dm-thinp target can
> > leverage your new WRITE SAME functionality for, hopefully, more
> > efficient zeroing of the disk (see: dm-io-WRITE_SAME.patch at the end of
> > the following patchset).
> > 
> > Here is the patchset I'm using on top of Linux 3.2:
> > http://people.redhat.com/msnitzer/patches/upstream/dm-io-WRITE_SAME/series.html
> > 
> > All works great on FC (tested against NetApp 3040 LUN)... I'm using the
> > thinp-test-suite to test dm-thinp's use of dm_kcopyd_zero().
> > 
> > But testing with iSCSI, I get a NULL pointer _every_ time in the iSCSI
> > scatter-gather code, see:
> > http://people.redhat.com/msnitzer/patches/upstream/dm-io-WRITE_SAME/async-WRITE_SAME-makes-iscsi-sg-die.txt
> > -- in the middle of that file you'll see my 'crash' analysis of the
> > issue -- but that is just the NULL pointer.. no idea what the smoking
> > gun is that caused the iscsi_segment to become NULL.
> > 
> > Anyway, taking a step back... WRITE SAME is all about transferring a
> > single logical block, backed by a single empty_zero_page in this test
> > case, so I'm wondering if for some reason iSCSI's sg code is getting
> > confused and thinking that more pages need to be transferred than were
> > in the original bio's payload (but iSCSI is way beneath the bio -> SCSI
> > command translation... grr)
> 
> Yeah, probably a request/scsi_cmnd/sg sector/length/offset value is off
> or iscsi is making a bad assumption.
> 
> Do:
> 
> echo 1 > /sys/module/libiscsi/parameters/debug_libiscsi_session
> echo 1 > /sys/module/libiscsi/parameters/debug_libiscsi_conn
> echo 1 > /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
> echo 1 > /sys/module/iscsi_tcp/parameters/debug_iscsi_tcp
> 
> then rerun your test.

OK, will retry with all 4.. but just this caused the system to crap
itself:

echo 1 > /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp

(I did this to turn on the ISCSI_DBG_TCP messages I noticed while
reviewing the code).
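
For anyone repeating this, here is a minimal sketch of a helper that flips
several debug parameters at once and skips any that are missing on the
running kernel. The function name enable_debug_params is hypothetical; the
parameter paths are whatever the caller passes in, and nothing here is
specific to the iSCSI modules.

```shell
# Hypothetical helper: write 1 to each parameter file passed as an argument.
# Files that are missing or not writable are reported and skipped.
# Returns non-zero if at least one parameter could not be set.
enable_debug_params() {
    rc=0
    for param in "$@"; do
        if [ -w "$param" ]; then
            echo 1 > "$param"
        else
            echo "skipping $param (missing or not writable)" >&2
            rc=1
        fi
    done
    return "$rc"
}

# Example call (run as root), using the parameter mentioned above:
#   enable_debug_params /sys/module/libiscsi_tcp/parameters/debug_libiscsi_tcp
```

Given how noisy the TCP debug output turned out to be, it is probably
safer to enable one parameter at a time and watch the console in between.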

I saw a bunch of opcode 0x25 (READ CAPACITY) but never did see 0x93
(WRITE_SAME_16) come through.
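
When scanning the debug output for opcodes, a small lookup like the one
below may help. It is only a sketch: the function name scsi_opcode_name is
hypothetical, the table covers just the opcodes relevant to this thread
(plus WRITE SAME(10) for completeness), and it assumes the opcode is given
as a lowercase hex string such as 0x93.

```shell
# Hypothetical helper: map a few SCSI opcodes to their command names.
# Deliberately partial; anything not listed is reported as unknown.
scsi_opcode_name() {
    case "$1" in
        0x25) echo "READ CAPACITY(10)" ;;
        0x41) echo "WRITE SAME(10)" ;;
        0x93) echo "WRITE SAME(16)" ;;
        *)    echo "unknown opcode $1" ;;
    esac
}

# Example:
#   scsi_opcode_name 0x93
```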



