[Linux-cluster] SCSI reservation conflicts after update

Gary Romo garromo at us.ibm.com
Wed Apr 2 21:17:55 UTC 2008


We had a similar issue and we just removed sg3utils (or something like
that). That works if you're not going to use it.

Gary Romo
IBM Global Technology Services
303.458.4415
Email: garromo at us.ibm.com
Pager:1.877.552.9264
Text message: gromo at skytel.com



"Ryan O'Hara" <rohara at redhat.com>
Sent by: linux-cluster-bounces at redhat.com
04/02/2008 10:23 AM
Please respond to: linux clustering <linux-cluster at redhat.com>
To: ssingh at amnh.org, linux clustering <linux-cluster at redhat.com>
Subject: Re: [Linux-cluster] SCSI reservation conflicts after update

I went back and investigated why this might happen. I had seen it
before but could not recall what causes it.

For 4.6, the scsi_reserve script should only be run if you intend to use 
SCSI reservations as a fence mechanism, as you correctly pointed out at 
the end of your message. I believe in 4.6 scsi_reserve was incorrectly 
enabled by default.

The real problem is that the keys used for SCSI reservations are based
on node ID. For this reason, nodeid must be defined in the cluster.conf
file for every node. Without it, the node ID assigned to a given node can
change between cluster restarts, etc. The scsi_reserve and fence_scsi
scripts require consistent node IDs (i.e. IDs that do not change).
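
As a quick way to check the point above, here is a minimal sketch that lists any clusternode entries in cluster.conf that lack a nodeid attribute. The function name missing_nodeid is made up for illustration, and the sketch assumes each <clusternode ...> opening tag sits on one line and that the file is in the usual /etc/cluster/cluster.conf location. It is not part of the cluster tools.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the cluster suite).
# Prints every <clusternode ...> opening tag that has no nodeid="N" attribute.
# Assumption: each opening tag is written on a single line.
missing_nodeid() {
    conf=${1:-/etc/cluster/cluster.conf}
    # The trailing space in the pattern keeps <clusternodes> from matching.
    grep '<clusternode ' "$conf" | grep -v 'nodeid="[0-9][0-9]*"'
}

# Example use (commented out so that sourcing this file has no side effects):
#   missing_nodeid && echo "some nodes have no nodeid defined"
```

The function exits with status 0 when it finds at least one entry without a nodeid, and non-zero when every entry has one.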

So I think the problem we are seeing is that running 'scsi_reserve stop'
cannot work, because it attempts to remove that node's key from the
devices. If that key has changed (because the node ID changed), it will
not find a matching registration key on the device and will fail.

The best bet is to disable scsi_reserve and to clear all SCSI
reservations. As you mentioned, the sg_persist command with the -C
option should do the trick. I am guessing that it failed for you because
you must supply the device name AND the key being used for that I_T
nexus. You can use sg_persist to list the keys registered with a
particular device, but since node IDs may have changed you might have to
guess the key for a particular node (i.e. the node you run the
sg_persist -C command on). The good news is that once you identify the
correct key, it will clear all the keys.
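
A rough sketch of that cleanup, written as a dry run that only prints the commands so they can be reviewed before anyone runs them. The function name print_cleanup, the device /dev/sdb and the candidate keys are all placeholders, and using chkconfig to disable the init script is my assumption for a RHEL-style system. The sg_persist long options are the ones from the sg3_utils man page (--clear is the long form of -C, and --param-rk takes the key in hex); nothing here works out a key from a node ID.

```shell
#!/bin/sh
# Hypothetical dry-run helper: prints the commands, does not execute them.
# Usage: print_cleanup DEVICE KEY [KEY...]
# DEVICE and the candidate KEYs (hex) are supplied by the operator.
print_cleanup() {
    dev=$1
    shift
    # Keep the scsi_reserve init script from starting at boot (assumption:
    # the script is managed with chkconfig).
    echo "chkconfig scsi_reserve off"
    # List the keys currently registered on the device.
    echo "sg_persist --in --read-keys --device=$dev"
    # One clear attempt per candidate key; a key that matches this node's
    # registration clears all keys on the device.
    for key in "$@"; do
        echo "sg_persist --out --clear --param-rk=$key --device=$dev"
    done
}

# Example use (placeholders only):
#   print_cleanup /dev/sdb 1 2
```

Run the read-keys command first and take the candidate keys from its output, rather than guessing blindly.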

Ryan

Sajesh Singh wrote:
> After updating my GFS cluster to the latest packages (as of 3/28/08) on 
> an Enterprise Linux 4.6 cluster (kernel version 2.6.9-67.0.7.ELsmp)  I 
> am receiving scsi reservation errors whenever the nodes are rebooted. 
> The node is then subsequently rebooted at varying intervals without any 
> intervention. I have tried to disable the scsi_reserve script from 
> startup, but it does not seem to have any effect. I have also tried to 
> use the sg_persist command to clear all reservations with the -C option 
> to no avail. I first noticed something was wrong when the 2nd node of 
> the 2 node cluster was being updated. That was the first sign of the 
> scsi reservation errors on the console.
> 
>  From my understanding persistent SCSI reservations are only needed if I 
> am using the fence_scsi module.
> 
> I would appreciate any guidance.
> 
> Regards,
> 
> Sajesh Singh
> 
> -- 
> Linux-cluster mailing list
> Linux-cluster at redhat.com
> https://www.redhat.com/mailman/listinfo/linux-cluster
