[Linux-cluster] scsi reservation issue

Christopher Barry Christopher.Barry at qlogic.com
Thu Nov 8 20:31:15 UTC 2007


On Fri, 2007-11-02 at 12:32 -0500, Ryan O'Hara wrote:
> Christopher Barry wrote:
> > On Wed, 2007-10-31 at 10:44 -0500, Ryan O'Hara wrote:
> >> Christopher Barry wrote:
> >>> Greetings all,
> >>>
> >>> I have 2 vmware esx servers, each hitting a NetApp over FC, and
> >>> each with 3 RHCS cluster nodes trying to mount a gfs volume.
> >>>
> >>> All of the nodes (1, 2, & 3) on esx-01 can mount the volume fine,
> >>> but none of the nodes on the second esx box can mount the gfs
> >>> volume at all, and I get the following errors in dmesg:
> >> Are you intentionally trying to use scsi reservations as a fence method? 
> > 
> > No. In fact I thought the scsi_reservation service may be *causing* the
> > issue, and disabled the service from starting on all nodes. Does this
> > have to be on?
> 
> No. You only need to run this service if you plan on using scsi
> reservations as a fence method. A scsi reservation restricts access
> to a device such that only registered nodes can use it. If a
> reservation exists and an unregistered node tries to access the
> device, you'll see exactly the errors you are seeing.
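> For example, you can query the current state directly with
> sg_persist (a sketch; it assumes /dev/sdc is your shared device):
>
>   # list the keys of all registered initiators
>   sg_persist --in --read-keys /dev/sdc
>
>   # show who holds the reservation, and the reservation type
>   sg_persist --in --read-reservation /dev/sdc
>
> A reservation does not block these read-only queries, so they should
> work even from a node that is getting conflicts on normal I/O.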
> 
> It may be that some reservations were created and never got cleaned
> up, which could cause the problem to persist even after the
> scsi_reserve script was disabled. You can manually run
> '/etc/init.d/scsi_reserve stop' to attempt to clean up any
> reservations. Note that I am assuming any reservations that might
> still exist on a device were created by the scsi_reserve script. If
> that is the case, you can see which devices a node is registered for
> with '/etc/init.d/scsi_reserve status'. Also note that the
> scsi_reserve script does *not* have to be started or enabled to do
> these things (ie. you can safely run 'status' or 'stop' without
> first running 'start').
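> For example, both of these are safe to run even if the service was
> never started on the node:
>
>   # show which devices this node is registered for
>   /etc/init.d/scsi_reserve status
>
>   # attempt to remove this node's registrations/reservations
>   /etc/init.d/scsi_reserve stop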
> 
> One caveat... 'scsi_reserve stop' will not unregister a node if it
> is the reservation holder and other nodes are still registered with
> the device. You can also use the sg_persist command directly to
> clear all registrations and reservations. Use the -C option. See the
> sg_persist man page for a fuller description.
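> For example, something like this should wipe everything (a sketch;
> the key must be one that shows up in --read-keys, and the command
> has to be issued from a node that is still registered):
>
>   # clear all registrations and any reservation on the device
>   sg_persist --out --clear --param-rk=0x1234 /dev/sdc
>
> Here 0x1234 is just a placeholder for whatever key your node
> actually registered with.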
> 

Okay. I had some other issues to deal with, but now I'm back to this.
Let me get you all up to speed on what I have done and what I still
do not understand about all of this.

status:
esx-01: contains nodes 1 thru 3
esx-02: contains nodes 4 thru 6

esx-01: all 3 cluster nodes can mount gfs.

esx-02: none can mount gfs.
esx-02: scsi reservation errors in dmesg
esx-02: mount fails w/ "can't read superblock" 

Oddly, with the gfs filesystem unmounted on all nodes, I can format the
gfs filesystem from the esx-02 box (from node4), and then mount it from
a node on esx-01, but cannot mount it on the node I just formatted it
from!

fdisk -l shows /dev/sdc1 on nodes 4 thru 6 just fine.

# sg_persist -C --out /dev/sdc1
fails to clear out the reservations
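(Reading the sg_persist man page, I think --clear wants the
registered key passed via --param-rk, and probably wants the whole
device rather than the partition. My guess at the syntax is below,
but if node4 was never registered in the first place I'm not sure it
can issue the clear at all:)

  # a guess: clear everything, using a key listed by --read-keys
  sg_persist --out --clear --param-rk=<key> /dev/sdc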

I do not understand these reservations, maybe someone can summarize?

I'm not at the box this sec (vpn-ing in will hork my evolution), but
I will provide any amount of data if you, Ryan, or anyone else has
stuff for me to try.

Thanks all,
-C




> >> It sounds like the nodes on esx-01 are creating reservations, but
> >> the nodes on the second esx box are not registering with the
> >> device and are therefore unable to mount the filesystem. Creation
> >> of reservations and registrations is handled by the scsi_reserve
> >> init script, which should be run at startup on all nodes in the
> >> cluster. You can check which devices a node is registered for
> >> before you mount the filesystem by running
> >> /etc/init.d/scsi_reserve status. If your nodes are not registered
> >> with the device and a reservation exists, then you won't be able
> >> to mount.
> >>
> >>> Lock_Harness 2.6.9-72.2 (built Apr 24 2007 12:45:38) installed
> >>> GFS 2.6.9-72.2 (built Apr 24 2007 12:45:54) installed
> >>> GFS: Trying to join cluster "lock_dlm", "kop-sds:gfs_home"
> >>> Lock_DLM (built Apr 24 2007 12:45:40) installed
> >>> GFS: fsid=kop-sds:gfs_home.2: Joined cluster. Now mounting FS...
> >>> GFS: fsid=kop-sds:gfs_home.2: jid=2: Trying to acquire journal lock...
> >>> GFS: fsid=kop-sds:gfs_home.2: jid=2: Looking at journal...
> >>> GFS: fsid=kop-sds:gfs_home.2: jid=2: Done
> >>> scsi2 (0,0,0) : reservation conflict
> >>> SCSI error : <2 0 0 0> return code = 0x18
> >>> end_request: I/O error, dev sdc, sector 523720263
> >>> scsi2 (0,0,0) : reservation conflict
> >>> SCSI error : <2 0 0 0> return code = 0x18
> >>> end_request: I/O error, dev sdc, sector 523720271
> >>> scsi2 (0,0,0) : reservation conflict
> >>> SCSI error : <2 0 0 0> return code = 0x18
> >>> end_request: I/O error, dev sdc, sector 523720279
> >>> GFS: fsid=kop-sds:gfs_home.2: fatal: I/O error
> >>> GFS: fsid=kop-sds:gfs_home.2:   block = 65464979
> >>> GFS: fsid=kop-sds:gfs_home.2:   function = gfs_logbh_wait
> >>> GFS: fsid=kop-sds:gfs_home.2:   file
> >>> = /builddir/build/BUILD/gfs-kernel-2.6.9-72/smp/src/gfs/dio.c, line =
> >>> 923
> >>> GFS: fsid=kop-sds:gfs_home.2:   time = 1193838678
> >>> GFS: fsid=kop-sds:gfs_home.2: about to withdraw from the cluster
> >>> GFS: fsid=kop-sds:gfs_home.2: waiting for outstanding I/O
> >>> GFS: fsid=kop-sds:gfs_home.2: telling LM to withdraw
> >>> lock_dlm: withdraw abandoned memory
> >>> GFS: fsid=kop-sds:gfs_home.2: withdrawn
> >>> GFS: fsid=kop-sds:gfs_home.2: can't get resource index inode: -5
> >>>
> >>>
> >>> Does anyone have a clue as to where I should start looking?
> >>>
> >>>
> >>> Thanks,
> >>> -C
> >>>
-- 
Regards,
-C

Christopher Barry
Systems Engineer, Principal
QLogic Corporation
780 Fifth Avenue, Suite 140
King of Prussia, PA   19406
o/f: 610-233-4870 / 4777
  m: 267-242-9306




