[Linux-cluster] scsi reservation issue

Ryan O'Hara rohara at redhat.com
Thu Nov 8 21:32:42 UTC 2007


Christopher Barry wrote:
> 
> Okay. I had some other issues to deal with, but now I'm back to this,
> and let me get you all up to speed on what I have done, and what I do
> not understand about all of this.
> 
> status:
> esx-01: contains nodes 1 thru 3
> esx-02: contains nodes 4 thru 6
> 
> esx-01: all 3 cluster nodes can mount gfs.
> 
> esx-02: none can mount gfs.
> esx-02: scsi reservation errors in dmesg
> esx-02: mount fails w/ "can't read superblock" 

OK. So it looks like one of the nodes is still holding a reservation on 
the device. First, we need to determine which node has that reservation. 
From any node, you should be able to run the following commands:

sg_persist -i -k /dev/sdc
sg_persist -i -r /dev/sdc

The first will list all the keys registered with the device. The second 
will show you which key is holding the reservation. At this point, I 
would expect that you will only see 1 key registered and that key will 
also be the reservation holder, but that is just a guess.

The keys are unique to each node, so we can correlate a key to a 
node. The key is just the hex representation of the node's IP address. 
You can get this by running gethostip -x <hostname>. By doing this, you 
should be able to figure out which node is still holding a reservation.
Once you determine this key/node, try running /etc/init.d/scsi_reserve 
stop from that node. Once that runs, use the sg_persist commands listed 
above to see if the reservation is cleared.
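
As a rough sketch of the key/node correlation described above: the 
helper below turns a dotted-quad IPv4 address into its 8-digit hex 
form, which is how the key is derived. The names ip_to_key and 
normalize_key, the sample address, and the sample key are made up for 
illustration. On a real node, gethostip -x is still the tool to use. 
The exact formatting of keys in sg_persist output (0x prefix, letter 
case) is an assumption here, which is why both sides are normalized 
before comparing.

```shell
#!/bin/sh
# Hypothetical helper: dotted-quad IPv4 address -> 8 lowercase hex digits.
ip_to_key() {
    old_ifs=$IFS
    IFS=.
    set -- $1          # split the address into its four octets
    IFS=$old_ifs
    printf '%02x%02x%02x%02x\n' "$1" "$2" "$3" "$4"
}

# Hypothetical helper: strip an optional 0x prefix and lowercase the key,
# so keys copied from different tools can be compared as plain strings.
normalize_key() {
    printf '%s\n' "$1" | sed 's/^0[xX]//' | tr 'A-F' 'a-f'
}

# Example with a made-up address and a made-up key.
key_from_device=0xC0A80001
if [ "$(ip_to_key 192.168.0.1)" = "$(normalize_key "$key_from_device")" ]; then
    echo "key belongs to 192.168.0.1"
fi
```

Running the same comparison once per cluster node should point at the 
node whose key is still on the device.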

> Oddly, with the gfs filesystem unmounted on all nodes, I can format the
> gfs filesystem from the esx-02 box (from node4), and then mount it from
> a node on esx-01, but cannot mount it on the node I just formatted it
> from!
> 
> fdisk -l shows /dev/sdc1 on nodes 4 thru 6 just fine.

Hmm. I wonder if there is something goofy happening because the nodes 
are running within vmware. I have never tried this, so I have no idea. 
Either way, we should be able to clear up the problem.

> # sg_persist -C --out /dev/sdc1
> fails to clear out the reservations

Right. I believe this must be run from the node holding the 
reservation, or at the very least a node that is registered with the 
device. Also note that scsi reservations affect the entire LUN, so you 
can't issue registrations/reservations to a single partition (i.e. sdc1).
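
To make the partition-versus-LUN point concrete, here is a small 
dry-run sketch. It derives the whole-disk device from a partition name 
and prints (it does not run) a clear command aimed at that device. 
whole_disk and print_clear_cmd are hypothetical names, the suffix 
stripping only handles sdX-style device names, and the key is a 
placeholder. As far as I know, the clear sub-command also expects the 
caller's registered key via --param-rk, which the command quoted above 
does not supply.

```shell
#!/bin/sh
# Hypothetical helper: /dev/sdc1 -> /dev/sdc (sdX-style names only).
whole_disk() {
    printf '%s\n' "$1" | sed 's/[0-9]*$//'
}

# Hypothetical helper: print the clear command instead of executing it.
# $1 = registered key in hex (placeholder), $2 = device or partition.
print_clear_cmd() {
    printf 'sg_persist --out --clear --param-rk=%s %s\n' \
        "$1" "$(whole_disk "$2")"
}

print_clear_cmd c0a80001 /dev/sdc1
```

Printing first makes it easy to check the device and key before 
running anything against shared storage.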

> I do not understand these reservations, maybe someone can summarize?

I'll try to be brief. Each node in the cluster can register with a 
device, thus a device may have many registrations. Each node registers 
by using a unique key. Once registered, one of the nodes can issue a 
reservation. Only one node may hold the reservation, and the reservation 
is created using that node's key. For our purposes, we use a 
write-exclusive, registrants-only type of reservation. This means that 
only nodes that are registered with the device may write to it. As long 
as that reservation exists, that rule will be enforced.
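
The register-then-reserve sequence above maps onto sg_persist roughly 
as follows. This is a dry-run sketch that only prints the commands, and 
it is not necessarily what the scsi_reserve script runs internally. The 
function name, key, and device are placeholders. Reservation type 5 is 
the write-exclusive, registrants-only type.

```shell
#!/bin/sh
# Hypothetical helper: print the two steps for one node.
# $1 = this node's key in hex (placeholder), $2 = whole-disk device.
print_register_and_reserve() {
    key=$1
    dev=$2
    # Step 1: register this node's key with the device.
    printf 'sg_persist --out --register --param-sark=%s %s\n' "$key" "$dev"
    # Step 2: one registered node takes the reservation using its key.
    printf 'sg_persist --out --reserve --param-rk=%s --prout-type=5 %s\n' \
        "$key" "$dev"
}

print_register_and_reserve c0a80001 /dev/sdc
```

Every node would perform step 1 with its own key, while only one node 
performs step 2.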

When it comes to removing registrations, there is one caveat: the node 
that holds the reservation cannot unregister unless there are no other 
nodes registered with the device. This is due to the fact that the 
reservation holder must also be registered *and* if the reservation 
were to go away, the write-exclusive, registrants-only policy would no 
longer be in effect. So ... what may have happened is that you tried to 
clear the reservation while other nodes were still registered, which 
will fail. Once all the other nodes have "unregistered", you should be 
able to go back and clear the reservation.
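
The ordering constraint above can be sketched as a tiny helper: given 
the key that holds the reservation and the list of registered keys, it 
prints the keys in the order their nodes should be stopped, with the 
holder last. stop_order and the sample keys are hypothetical; the real 
inputs would come from the sg_persist -i -k and -i -r output.

```shell
#!/bin/sh
# Hypothetical helper.
# $1 = key holding the reservation; remaining args = all registered keys.
stop_order() {
    holder=$1
    shift
    for key in "$@"; do
        # Every registrant except the holder goes first.
        [ "$key" = "$holder" ] || printf '%s\n' "$key"
    done
    # The reservation holder unregisters last.
    printf '%s\n' "$holder"
}

# Example with made-up keys; c0a80002 is the reservation holder.
for key in $(stop_order c0a80002 c0a80001 c0a80002 c0a80003); do
    echo "run '/etc/init.d/scsi_reserve stop' on the node with key $key"
done
```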

Yes, this is a limitation in our product. There is a notion of moving a 
reservation (in the case where the reservation holder wants to 
unregister), but that is not yet implemented.

> I'm not at the box this sec (vpn-ing in will hork my evolution), but I
> will provide any amount of data if either you Ryan, or anyone else has
> stuff for me to try.

Please let me know if you have questions or need further assistance 
clearing that pesky reservation. :)

> Thanks all,
> -C
>



