[dm-devel] [PATCH 0/6] scsi_dh : Couple of fixes for scsi device handlers
Shyam Iyer
shyam_iyer at dell.com
Sat Jul 31 00:22:12 UTC 2010
On 07/30/2010 07:56 PM, Chandra Seetharaman wrote:
> On Fri, 2010-07-30 at 12:12 -0600, Moger, Babu wrote:
>
>
>>>> Yes, We can do that. Problem is I am hitting the issue with BUG_ON
>>>> in get_rdac_data which is there in the beginning of rdac_activate.
>>>>
>>> I do not understand the problem.
>>>
>>> If the BUG_ON on get_rdac_data() is being triggered, that means
>>> sdev->scsi_dh_data is NULL, if that is the case, how can you access
>>> sdev->scsi_dh_data->kref in scsi_dh_activate (in patch 2/6) ? Wouldn't
>>> it trip a oops ?
>>>
>> Test case is deleting both active and passive paths almost together during
>> the multipath testing. Looks like DM picked up the active path failure
>> first. Then failed the active path and started scheduling activate_path to
>> failover to passive path. Passive path is also about to go down pretty soon.
>> When the control was in scsi_dh_activate, Looks like scsi_dh_data was still
>> valid because I did not see panic here. But scsi_dh_data became NULL when
>> control went to rdac_activate. That is when I hit the bug on.
>>
>> kernel BUG at /usr/src/packages/BUILD/lsi-scsi_dh_rdac-01.00/obj/default/scsi_dh_rdac.c:232!
>> RIP: 0010: rdac_activate+0x257/0x387 [scsi_dh_rdac]
>>
>> My understanding is someone triggered scsi_dh->detach for passive path during
>> this small window. Only way I could see problem go away is holding reference
>> counts between these calls. Did i miss anything here? See the code snippet below..
>>
> So, basically, the new patch just reduces the window of race. IOW, if it
> just spins for few seconds (or preempted) just before calling the
> kref_get() in the new patch, it will generate an oops as scsi_dh_data
> would be NULL.
>
> Looks like you have to create a lock to protect scsi_dh_data and then
> call kref_get() with the protection of lock.
>
>
Second that.. Sounds like each kref_get/kref_put .. needs a lock protection not just this scenario.
More information about the dm-devel
mailing list