[Linux-cluster] RE: Errors trying to login to LT000: ...1006:Not Allowed

Treece, Britt Britt.Treece at savvis.net
Wed Mar 7 15:17:17 UTC 2007


Does anyone have any idea why incorrect entries in /etc/hosts on the
lock servers would only intermittently cause "Errors trying to login to
LT000: ...1006:Not Allowed"?  I would expect a wrong entry to
*consistently* keep the client out of the lockspace.
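
For anyone who wants to check for this kind of mismatch, here is a rough
sketch (not something from the GFS tooling) that compares the text of an
/etc/hosts file against a known-good name-to-address list.  The function
names and the client1 addresses are made up for illustration; only
lock2 at 192.168.1.3 comes from the logs in this thread.

```python
import ipaddress


def parse_hosts(text):
    """Return {hostname: ip} from /etc/hosts-style text, ignoring comments."""
    mapping = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        fields = line.split()
        ip, names = fields[0], fields[1:]
        try:
            ipaddress.ip_address(ip)
        except ValueError:
            continue  # skip lines that do not start with a valid address
        for name in names:
            mapping.setdefault(name, ip)  # keep the first entry seen for a name
    return mapping


def find_mismatches(expected, hosts_text):
    """Return {name: (expected_ip, actual_ip_or_None)} for every disagreement."""
    actual = parse_hosts(hosts_text)
    return {
        name: (ip, actual.get(name))
        for name, ip in expected.items()
        if actual.get(name) != ip
    }


# Hypothetical hosts file as it might look on a lock server.
SAMPLE_HOSTS = """\
127.0.0.1    localhost
192.168.1.3  lock2
192.168.1.12 client1   # hypothetical address, off by one
"""

# Hypothetical known-good mapping (client1's address is an assumption).
EXPECTED = {"lock2": "192.168.1.3", "client1": "192.168.1.11"}

if __name__ == "__main__":
    for name, (want, got) in find_mismatches(EXPECTED, SAMPLE_HOSTS).items():
        print(f"{name}: expected {want}, hosts file has {got}")
```

Running the same check against the hosts file from each lock server
would show whether all of them disagree with the known-good list in the
same way.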
 
Additionally, can anyone explain the fundamentals of GFS 6.0 lock tables
and the locking process?  A couple of specific questions I have:
 
    What is the difference between LTPX and LT000?
 
    What is the advantage of having additional lock tables, and when
    would having more be a disadvantage?
 
    Is each lock propagated to every lock table, or is it held in only
    one table?
 
    Does the highwater mark apply to each lock table, or to the sum of
    locks across all lock tables?
 
 
Regards,

Britt Treece

________________________________

From: linux-cluster-bounces at redhat.com
[mailto:linux-cluster-bounces at redhat.com] On Behalf Of Britt Treece
Sent: Monday, March 05, 2007 10:51 PM
To: linux clustering
Subject: Re: [Linux-cluster] RE: Errors trying to login to LT000:
...1006:Not Allowed


Not sure why my first post didn't go through, but here it is...

---
I am running a 13-node GFS (6.0.2.33) cluster with 10 mounting clients
and 3 dedicated lock servers.  The master lock server was rebooted and
the next slave in the voting order took over.  At that point, 3 of the
client nodes started receiving login errors for the LTPX server:

Mar  4 00:05:52 lock1 lock_gulmd_core[3798]: Master Node Is Logging Out NOW!
...

Mar  4 00:05:52 lock2 lock_gulmd_core[24627]: Master Node has logged out.
Mar  4 00:05:54 lock2 lock_gulmd_core[24627]: I see no Masters, So I am Arbitrating until enough Slaves talk to me.
Mar  4 00:05:54 lock2 lock_gulmd_LTPX[24638]: New Master at lock2 :192.168.1.3
Mar  4 00:05:56 lock2 lock_gulmd_core[24627]: Now have Slave quorum, going full Master.
Mar  4 00:11:39 lock2 lock_gulmd_core[24627]: Master Node Is Logging Out NOW!
...

Mar  4 00:05:52 client1 kernel: lock_gulm: Checking for journals for node "lock1 "
Mar  4 00:05:52 client1 lock_gulmd_core[9383]: Master Node has logged out.
Mar  4 00:05:52 client1 kernel: lock_gulm: Checking for journals for node "lock1 "
Mar  4 00:05:56 client1 lock_gulmd_core[9383]: Found Master at lock2 , so I'm a Client.
Mar  4 00:05:56 client1 lock_gulmd_core[9383]: Failed to receive a timely heartbeat reply from Master. (t:1172988356370685 mb:1)

Mar  4 00:05:56 client1 lock_gulmd_LTPX[9390]: New Master at lock2 :192.168.1.3
Mar  4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed
Mar  4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed
Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT000: (lock2 :192.168.1.3) 1006:Not Allowed
Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT002: (lock2 :192.168.1.3) 1006:Not Allowed
Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT004: (lock2 :192.168.1.3) 1006:Not Allowed
Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to LT001: (lock2 :192.168.1.3) 1006:Not Allowed
---
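
To see which clients and lock tables are affected, the login errors in
a log like the one above can be tallied with a few lines of Python.
This is only a sketch: the regular expression is an assumption based on
the message format shown in this excerpt, and the function name is made
up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the log excerpt in this thread, not from gulm docs.
LOGIN_ERROR = re.compile(
    r"(\S+) lock_gulmd_LTPX\[\d+\]: Errors trying to login to "
    r"(LT\d+): \((\S+) ?:([\d.]+)\) (\d+):"
)


def count_login_errors(log_text):
    """Count login errors per (reporting node, lock table, error code)."""
    # Collapse all whitespace so entries wrapped across lines still match.
    flat = " ".join(log_text.split())
    counts = Counter()
    for node, table, _master, _ip, code in LOGIN_ERROR.findall(flat):
        counts[(node, table, code)] += 1
    return counts


# Three entries copied from the excerpt above, wrapped as mail clients do.
SAMPLE_LOG = """\
Mar  4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to
LT002: (lock2 :192.168.1.3) 1006:Not Allowed
Mar  4 00:06:01 client1 lock_gulmd_LTPX[9390]: Errors trying to login to
LT000: (lock2 :192.168.1.3) 1006:Not Allowed
Mar  4 00:06:02 client1 lock_gulmd_LTPX[9390]: Errors trying to login to
LT000: (lock2 :192.168.1.3) 1006:Not Allowed
"""

if __name__ == "__main__":
    for (node, table, code), n in sorted(count_login_errors(SAMPLE_LOG).items()):
        print(f"{node} {table} error {code}: {n}")
```

Feeding it the syslog from every client would show whether the errors
are limited to particular nodes or spread across all lock tables.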

Britt


On 3/5/07 10:30 PM, "Treece, Britt" <Britt.Treece at savvis.net> wrote:



	All, 
	
	After much further investigation I found that /etc/hosts is off by
	one for these 3 client nodes on all 3 lock servers.  Having fixed the
	typos, is it safe to assume that the root of the problem logging in
	to LTPX was that /etc/hosts on the lock servers was wrong for these
	nodes?  If yes, why were these 3 clients allowed into the cluster
	when it was originally started, given that they had incorrect
	entries in /etc/hosts?
	
	Regards, 
	
	Britt Treece 
	
	