[Linux-cluster] Error messages during Fence operation

Randy Brown randy.brown at noaa.gov
Thu Jan 3 13:17:00 UTC 2008


Thanks.  That makes sense, and I hadn't thought of it.  I don't see any 
other connections.  However, it appears to have properly fenced one of 
the nodes last night, and I don't believe I've changed anything in the 
config.  Maybe I did have another connection open and something I did 
cleared it without my realizing it.  As long as it's working. :)

I'm still pretty "green" when it comes to clustering and SANs and 
sincerely appreciate the quality responses and willingness to help on 
this list.

Randy

James Parsons wrote:
> Randy Brown wrote:
>
>> I forgot... I'm using CentOS 5 with the latest patches and kernel.
>>
>> Randy Brown wrote:
>>
>>> I am using an APC Masterswitch Plus as my fencing device.  I am 
>>> seeing this in my logs now when fencing occurs:
>>>
>>> Dec 31 11:36:26 nfs1-cluster fenced[3848]: agent "fence_apc" 
>>> reports: Traceback (most recent call last):   File 
>>> "/sbin/fence_apc", line 829, in ?     main()   File 
>>> "/sbin/fence_apc", line 289, in main     do_login(sock)   File 
>>> "/sbin/fence_apc", line 444, in do_login     i, mo, txt = 
>>> sock.expect(regex_list, TELNET_TIMEOUT)
>>> Dec 31 11:36:26 nfs1-cluster fenced[3848]: agent "fence_apc" 
>>> reports:   File "/usr/lib/python2.4/telnetlib.py", line 620, in 
>>> expect     text = self.read_very_lazy()   File 
>>> "/usr/lib/python2.4/telnetlib.py", line 400, in read_very_lazy     
>>> raise EOFError, 'telnet connection closed' EOFError: telnet 
>>> connection closed
>>> Dec 31 11:36:26 nfs1-cluster fenced[3848]: fence 
>>> "nfs2-cluster.nws.noaa.gov" failed
>>>
>>> This used to work just fine.  If I run `fence_apc -a 192.168.42.30 
>>> -l cluster -n 1:7 -o Reboot -p <my password>` from the command line, 
>>> fencing works as expected.  The relevant lines from my cluster.conf 
>>> file are below.  I will gladly provide more information as necessary.
>>
> Is it possible that you are already telnet'ed into the switch from a 
> terminal or somesuch when the fence attempt takes place? APC switches 
> allow only one login at a time. I should/will add a log comment that 
> mentions this as a possible reason.
>
> If this is not the issue, well, we can keep digging...
>
> -J