[dm-devel] [PATCH 0/8] Use block pr_ops in LIO

Mike Christie michael.christie at oracle.com
Sun Jun 5 16:55:35 UTC 2022


On 6/4/22 11:01 PM, Bart Van Assche wrote:
> On 6/2/22 23:55, Mike Christie wrote:
>> The following patches were built over Linus's tree. They allow us to use
>> the block pr_ops with LIO's target_core_iblock module to support cluster
>> applications in VMs.
>>
>> Currently, to use something like windows clustering in VMs with LIO and
>> vhost-scsi, you have to use tcmu or pscsi or use a cluster aware
>> FS/framework for the LIO pr file. Setting up a cluster FS/framework is a
>> pain and a waste when your real backend device is already a distributed
>> device, and pscsi and tcmu are nice for specific use cases, but iblock
>> gives you the best performance and allows you to use stacked devices
>> like dm-multipath. So these patches allow iblock to work like pscsi/tcmu
>> where they can pass a PR command to the backend module. And then iblock
>> will use the pr_ops to pass the PR command to the real devices similar
>> to what we do for unmap today.
>>
>> Note that this patchset does not attempt to support every PR SCSI
>> feature in iblock. It has the same limitations as tcmu and pscsi, where
>> you can have only a single I_T nexus per device, and it only supports
>> what is needed for windows clustering right now.
> 
> How has this patch series been tested? Does LIO pass the libiscsi persistent reservation tests with this patch series applied?
> 

libiscsi is not suitable for this type of setup. If libiscsi works correctly,
then its tests should fail against this patchset. It's probably the opposite
of what you are thinking about. We are not supporting a single instance of
LIO/qemu that handles multiple I_T nexuses, which is what libiscsi can test
well. It's more like multiple LIO/qemu instances, each on a different system,
where each has a single I_T nexus between the VM's initiator and LIO/qemu. So
it's more of a passthrough between the VM and the real device.

For example, right now to use a cluster app in VMs with a backend device that
is itself cluster aware/shared, you commonly do one of the following:

1. Use Qemu's userspace block layer, which can send SG_IO to your real
backend device to issue the PR request. Checks for conflicts are then done by
the backend device as well.

So here you have 2 systems. On system0, Qemu0 exports /dev/sdb to VM0. VM0
only has the one I_T nexus. System1 exports /dev/sdb to VM1. VM1 only has
the one I_T nexus as well.

2. Use Qemu vhost-scsi with pscsi or tcmu. These cases are similar to 1 in
that you have 2 different systems. How you pass the PRs to the real device
may differ for tcmu; pscsi just injects them into the scsi queue. We do not
use the LIO pr code at all (pgr_support=0).

3. This patchset allows you to use Qemu vhost-scsi with iblock. The setup is
similar to 1 and 2, but we use a different backend driver.

To test this type of setup you would want a cluster-aware libiscsi, where
you do a PR register and reserve in VM0, then in VM1 do a WRITE to check
that your pr_type is honored across that I_T nexus.

So we are going to run our internal QA type of tests, but we are hoping to
also implement some qemu clustered SCSI tests like this. We are still trying
to figure out the framework (looking into Luis's ansible based stuff, etc.),
because for general iscsi testing we want to be able to kick off multiple
VMs and bare metal systems and run both open-iscsi and lio tests.


