[Linux-cluster] Problems with Samba server on CentOS 5.1 virtual Xen guest with iSCSI SAN

Wed Apr 2 20:10:23 UTC 2008

Paolo Marini wrote:
> I have implemented a cluster of a few Xen guests with a shared GFS
> filesystem residing on a SAN built with Openfiler to support iSCSI
> storage.
>
> The physical servers are 3 machines forming a physical cluster, each
> equipped with a quad Xeon and 4 GB of RAM. Networking on the physical
> hosts uses channel bonding with LACP, giving an aggregate of 2 Gbit/s
> Ethernet per host; the switch supports LACP and has been configured
> accordingly.
>
> The virtual servers are Xen guests running on top of the physical
> servers, with shared storage on iSCSI and GFS.
>
> Networking consists of a private cluster network (carrying cluster
> heartbeat, cluster communication, and iSCSI) and an Ethernet alias on
> the LAN to which the users are connected.
>
> One of the Xen cluster nodes runs a Samba PDC (plain Samba with no
> service failover; it is the only Samba server on the LAN) plus an
> LDAP server; Samba authenticates users against LDAP. Storage for the
> Samba server is on the SAN.
>
> I keep receiving complaints from my users that copying files
> sometimes generates errors, along with problems related to Office
> usage (we still use the old Office 97 on some machines). The Samba
> configuration is more or less the same as the one that worked
> correctly on the previous physical machine, where these problems
> did not occur.
>
> The problems generate these log entries in /var/log/samba/smbd:
>
> [2008/04/02 19:00:50, 0] lib/util_sock.c:get_peer_addr(1232)
>  getpeername failed. Error was Transport endpoint is not connected
> [2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
>  getpeername failed. Error was Transport endpoint is not connected
> [2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
>  getpeername failed. Error was Transport endpoint is not connected
>
> And in the log for the client machine, also under /var/log/samba:
>
> [2008/04/02 19:04:34, 0] lib/util_sock.c:read_data(534)
>  read_data: read failure for 4 bytes to client 192.168.13.240. Error = Connection reset by peer
> [2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
>  amhwq53p (192.168.13.240) closed connection to service tmp
> [2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
>  amhwq53p (192.168.13.240) closed connection to service stock
> [2008/04/02 19:04:34, 0] lib/util_sock.c:write_data(562)
>  write_data: write failure in writing to client 192.168.13.240. Error Broken pipe
> [2008/04/02 19:04:34, 0] lib/util_sock.c:send_smb(769)
>  Error writing 75 bytes to client. -1. (Broken pipe)
> [2008/04/02 19:04:34, 1] smbd/service.c:make_connection_snum(1033)
>
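
For anyone triaging entries like the ones quoted above, a small script can tally the level-0 messages by the function that reported them, which makes it easier to see which failures dominate. What follows is only a minimal Python sketch and nothing from Samba itself: the names parse_entries, count_by_function and SAMPLE are made up for illustration, and it assumes the log keeps the header-plus-indented-message layout seen in the excerpts (unwrapped and without the mail quoting).

```python
import re
from collections import Counter

# Header line of an smbd log entry, e.g.
# [2008/04/02 19:04:34, 0] lib/util_sock.c:read_data(534)
HEADER = re.compile(
    r"^\[(?P<ts>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}),\s*(?P<level>\d+)\]\s+"
    r"(?P<src>\S+):(?P<func>\w+)\((?P<line>\d+)\)\s*$"
)


def parse_entries(text):
    """Group a raw smbd log into entries: header fields plus message text."""
    entries = []
    for raw in text.splitlines():
        match = HEADER.match(raw)
        if match:
            entries.append({**match.groupdict(), "message": ""})
        elif entries and raw.strip():
            # Non-header lines belong to the most recent header.
            current = entries[-1]
            current["message"] = (current["message"] + " " + raw.strip()).strip()
    return entries


def count_by_function(entries, max_level=0):
    """Count entries at or below max_level, keyed by reporting function."""
    return Counter(e["func"] for e in entries if int(e["level"]) <= max_level)


# Sample taken from the log lines quoted in this thread.
SAMPLE = """\
[2008/04/02 19:04:34, 0] lib/util_sock.c:read_data(534)
  read_data: read failure for 4 bytes to client 192.168.13.240. Error = Connection reset by peer
[2008/04/02 19:04:34, 1] smbd/service.c:close_cnum(1230)
  amhwq53p (192.168.13.240) closed connection to service tmp
[2008/04/02 19:04:34, 0] lib/util_sock.c:write_data(562)
  write_data: write failure in writing to client 192.168.13.240. Error Broken pipe
[2008/04/02 19:05:32, 0] lib/util_sock.c:get_peer_addr(1232)
  getpeername failed. Error was Transport endpoint is not connected
"""

if __name__ == "__main__":
    for func, n in count_by_function(parse_entries(SAMPLE)).most_common():
        print(func, n)
```

Raising max_level to 1 also counts the informational close_cnum entries, which can help line up disconnects with the read and write failures.
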
> These look like symptoms of poor connectivity or some other network
> problem; however, they are new and never appeared before we switched
> to the clustered architecture. Also, no problems have been found so
> far on the other Xen nodes serving the same GFS filesystem (different
> directories!) for NFS or other services.
>
> Setting the option
>
> posix locking = no
>
> in smb.conf did not help either.
>
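
When a setting such as the one above appears to have no effect, it can be worth confirming that it sits where the server will read it. The following is a rough Python sketch under stated assumptions: the helper names (normalize, parse_smb_conf, get_bool) and the sample file, including the path /mnt/gfs/stock, are hypothetical, and the post does not say which section the option was placed in, so [global] is assumed here. It relies on the documented smb.conf rules that names ignore case and whitespace and that [global] values act as defaults for shares; it does not handle backslash line continuations or includes.

```python
def normalize(name):
    """smb.conf section and parameter names ignore case and whitespace."""
    return "".join(name.split()).lower()


def parse_smb_conf(text):
    """Parse smb.conf-style text into {section: {parameter: value}}."""
    sections = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line[0] in "#;":
            continue  # blank line or comment
        if line.startswith("[") and line.endswith("]"):
            current = sections.setdefault(normalize(line[1:-1]), {})
        elif "=" in line and current is not None:
            # Only the first equals sign separates name from value.
            key, value = line.split("=", 1)
            current[normalize(key)] = value.strip()
    return sections


def get_bool(sections, section, option, default):
    """Look up a boolean in the share, then in [global], then use default."""
    key = normalize(option)
    for name in (normalize(section), "global"):
        value = sections.get(name, {}).get(key)
        if value is not None:
            return value.lower() in ("yes", "true", "1")
    return default


# Hypothetical configuration; only "posix locking = no" comes from the post.
SAMPLE_CONF = """\
[global]
    security = user
    posix locking = no

[stock]
    path = /mnt/gfs/stock
"""

if __name__ == "__main__":
    conf = parse_smb_conf(SAMPLE_CONF)
    # Samba's documented default for this parameter is yes.
    print(get_bool(conf, "stock", "posix locking", default=True))
```

Samba's own testparm utility reports what the server actually parsed, so it remains the more authoritative check; the sketch is only meant to show the lookup order.
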
> Has anyone else faced the same problems? Any ideas?
>
> Thanks, Paolo
Those errors are explained in

     http://kbase.redhat.com/faq/FAQ_45_5274.shtm

John