[Linux-cluster] Re: gnbd_export stops working after reboot

Thai Duong thaidn at gmail.com
Sun Nov 27 09:42:49 UTC 2005


Never mind, I have fixed all the problems myself. The key requirements are:

- You must use caching GNBD ("-c") when exporting the cluster.cca. This is
because at the time you create the cluster.cca there is still no cluster
facility and no lock_gulm running, and GNBD exporting only works with "-c"
in that state.

- You must NOT use caching GNBD when exporting devices for GFS.
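
Concretely, the two cases look roughly like this (the cca device path is the
one from my setup; the second device and export name are just placeholders,
not from my configuration):

```shell
# Exporting the CCA device: caching mode (-c) is required, because at
# this point no cluster facility / lock_gulm is running yet.
gnbd_export -c -d /dev/cciss/c0d0p4 -e cluster.cca

# Exporting devices for GFS itself: do NOT use -c, so GFS gets an
# uncached export. (/dev/cciss/c0d0p5 and gfs_disk0 are placeholder
# names here.)
gnbd_export -d /dev/cciss/c0d0p5 -e gfs_disk0
```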

I think these requirements should be put into the GFS Administrator's Guide.

Regards,

--Thai Duong.

On 11/27/05, Thai Duong <thaidn at gmail.com> wrote:
>
> Hi list,
>
> I intend to set up an Oracle9i RAC cluster using GFS 6.0 as the CFS. Because
> the SAN is not available at the moment, I decided to use GNBD instead. I have
> three IA64 servers running RHAS 3 update 6 called node1, node2 and node3.
> Node1 and node2 are GNBD clients and GFS nodes. Node3 is the GNBD server. I
> also use all of them as lock servers.
>
> I followed the GFS 6.0 Administrator guide and encountered no problems
> until I tried to mount the GFS file system on node2. It took forever to run
> "mount -t gfs /dev/pool/pool0 /gfs -o acl". I killed the mount process and
> tried again on node1. This time it returned something like the error you get
> when you try to mount an unknown file system. I rmmod'ed the gfs module and
> modprobed it again, but still no luck. I checked against the startup
> procedure and found that although I had started lock_gulmd on all nodes,
> only node3 had a running instance. There was no sign of lock_gulmd on node1
> and node2. I tried to start lock_gulmd again, and after a few attempts it
> got running on node2 as well, but mounting GFS still didn't work.
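
(In hindsight, a quick way to see where lock_gulmd actually came up is just
to look for the process on each node. This is a generic sketch, not from the
original session:)

```shell
# check_daemon NAME: report whether a process with that exact name is
# running on this host. Run it on each node (e.g. over ssh) to see
# where lock_gulmd actually started.
check_daemon() {
    if pgrep -x "$1" >/dev/null 2>&1; then
        echo "$1: running"
    else
        echo "$1: NOT running"
        return 1
    fi
}

# e.g. on each of node1..node3:
check_daemon lock_gulmd || true
```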
>
> I didn't know what to do next, so I decided to start over. After
> chkconfig'ing off all the GFS-related daemons, I restarted the servers (a
> bad habit from my Windows days :( ). After all the servers were up again, I
> got a "gnbd_export error: create request failed : Connection refused" error
> when executing the following commands on node3 (in order to export the
> device as a GNBD):
>
> # modprobe gnbd_serv
>
> [root at db-svr-test-03 root]# lsmod
> Module                  Size  Used by    Not tainted
> gnbd_serv              74288   0  (unused)
> lock_gulm             149872   0  [gnbd_serv]
> lock_harness            7288   0  [lock_gulm]
> ....
>
> # gnbd_export -d /dev/cciss/c0d0p4 -e cluster.cca
> gnbd_export error: create request failed : Connection refused
>
> As you can see below, gnbd_serv was running and listening on the default
> port, 14243:
> [root at db-svr-test-03 root]# netstat -nat
> Active Internet connections (servers and established)
> Proto Recv-Q Send-Q Local Address           Foreign Address         State
> tcp        0      0 0.0.0.0:14243           0.0.0.0:*               LISTEN
> tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN
> .....
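
(Side note with hindsight: a LISTEN line in netstat only proves the socket
exists; a quick probe from the same host shows whether a TCP connection is
actually accepted at all, which helps separate a TCP-level refusal from the
application rejecting the request afterwards. A generic bash sketch, nothing
gnbd-specific about it:)

```shell
# probe_port HOST PORT: succeed if a TCP connection to HOST:PORT can be
# opened, using bash's built-in /dev/tcp pseudo-device. The connection
# is closed again as soon as the subshell exits.
probe_port() {
    if (exec 3<>"/dev/tcp/$1/$2") 2>/dev/null; then
        echo "port $2 on $1: accepting connections"
    else
        echo "port $2 on $1: connection refused"
        return 1
    fi
}

# e.g. probe the default gnbd_serv port on the local host:
probe_port 127.0.0.1 14243 || true
```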
>
> I also ran "tcpdump -vv -i lo port 14243" on node3 and saw that there was
> some traffic when I re-executed "gnbd_export -d /dev/cciss/c0d0p4 -e
> cluster.cca". The connection even completed the three-way handshake, but
> while the client side was pushing data the server suddenly sent a FIN packet.
>
> I even removed the GFS and GFS-modules RPMs and reinstalled them, but still
> no luck. What am I supposed to do now? Any help appreciated.
>
> Regards,
>
> --Thai Duong.
>

