From f.difonzo at gmail.com  Fri Sep  7 14:55:14 2007
From: f.difonzo at gmail.com (Fabrizio Di Fonzo)
Date: Fri, 7 Sep 2007 14:55:14 +0000
Subject: Service monitor script help.
Message-ID: <7734d97e0709070755g9d44cd4v211453f3ab2eb3cf@mail.gmail.com>

Good morning everybody,

I've built a load-balancing system with two routers and two real servers. On each server I've configured two different services: tftp and http. For monitoring the http service I've used the default piranha parameters and everything works well, while for monitoring tftp I've written a custom script. My script prints the string "OK" if it is able to contact the service; otherwise it prints "FAIL".

If I start the "pulse" daemon after starting the real servers, everything works fine and I can see both real servers displayed on the piranha monitoring interface. But if I stop some of the tftp services, piranha doesn't detect this event (I still see all the services on the piranha monitoring user interface, even the dead ones) and continues sending tftp packets to the dead services (I can see this by sniffing packets on the servers).

Also, if I want to restart the pulse daemon I must change the heartbeat port in the lvs.cf file, as otherwise I get the message:

pulse: "cannot create heartbeat socket -- running as root?"

In short, it seems to me that, when using an external monitoring script, piranha is only able to detect when the script succeeds.

I'm using piranha version 0.7.12 on Red Hat AS 3.

Here's my monitoring script (it prints "OK" if it can contact the service, otherwise it prints "FAIL"):

#!/bin/bash

TFTP=/usr/bin/tftp
#HOST="localhost"
HOST=$1
PORT=69
RETRIES=10
TO=10
URL="/000000000000"
RC=0

TMP1=`mktemp /tmp/tftpmon.XXXXXXXXXX` || exit 1
TMP2=`mktemp /tmp/tftpmon.XXXXXXXXXX` || exit 1

$TFTP <<- EOF > $TMP1 2> /dev/null
$HOST
timeout $TO
rexmt $TO
get $URL $TMP2
EOF

if ! grep -q Received $TMP1; then
    RC=1
fi
if grep -q "File not found" $TMP1; then
    RC=0
fi
if [ "$RC" = "0" ]; then
    echo "OK";
else
    echo "FAIL";
fi
rm -f $TMP1 $TMP2

And my lvs.cf is the following:

serial_no = 117
primary = 172.42.1.190
service = lvs
backup_active = 1
backup = 172.42.1.191
heartbeat = 1
heartbeat_port = 539
keepalive = 6
deadtime = 18
network = nat
nat_router = 192.168.1.251 eth1:1
debug_level = 3
virtual TFTP {
     active = 1
     address = 172.42.1.195 eth0:1
     vip_nmask = 255.255.255.0
     port = 69
     persistent = 10
     expect = "OK"
     use_regex = 0
     send_program = "/opt/myscript.sh %h"
     load_monitor = none
     scheduler = rr
     protocol = udp
     timeout = 6
     reentry = 15
     quiesce_server = 0
     server pc170 {
         address = 192.168.1.170
         active = 1
         weight = 1
     }
     server pc171 {
         address = 192.168.1.171
         active = 1
         weight = 1
     }
}
virtual ZTC-WEB {
     active = 1
     address = 172.42.1.195 eth0:1
     vip_nmask = 255.255.255.0
     fwmark = 80
     port = 8080
     persistent = 10
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "HTTP"
     use_regex = 0
     load_monitor = none
     scheduler = rr
     protocol = tcp
     timeout = 6
     reentry = 15
     quiesce_server = 0
     server pc170 {
         address = 192.168.1.170
         active = 1
         weight = 1
     }
     server pc171 {
         address = 192.168.1.171
         active = 1
         weight = 1
     }
}

Is it a bug, or did I make a mistake? Can anyone help me?

Thanks in advance,
Fabrizio

From U.Criola at patrick.com.au  Fri Sep  7 15:01:17 2007
From: U.Criola at patrick.com.au (U.Criola at patrick.com.au)
Date: Sat, 8 Sep 2007 01:01:17 +1000
Subject: Criola, Urbano is out of the office.
Message-ID:

I will be Out of the Office
Start Date: 4/09/2007.
End Date: 17/09/2007.

I will be away from the office Tue 4/9/07 and will be back in on Monday 17/9/07.
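
Regarding the TFTP monitoring script in Fabrizio's message: one way to narrow the problem down is to check the script's OK/FAIL decision separately from piranha. The following is only a minimal sketch, assuming the same rule as the posted script (output containing "Received" or "File not found" counts as alive, anything else as dead). The function name classify_tftp_output is hypothetical, and the sample strings are made up for illustration rather than captured from a real tftp client.

```shell
#!/bin/bash

# Hypothetical helper: map captured tftp client output to the token
# that the "expect" setting in lvs.cf is compared against.
# Assumption: same rule as the posted script, i.e. "Received" or
# "File not found" in the output means the server answered.
classify_tftp_output() {
    local output=$1
    case $output in
        *Received*|*"File not found"*) echo "OK" ;;
        *) echo "FAIL" ;;
    esac
}

# Illustrative inputs only (not real client logs).
classify_tftp_output "Received 512 bytes in 0.1 seconds"
classify_tftp_output "Error code 1: File not found"
classify_tftp_output "Transfer timed out."
```

If the decision logic behaves as expected in isolation, running the real script by hand against a stopped server (for example /opt/myscript.sh 192.168.1.170) and confirming that it prints "FAIL" within the configured timeout would be a reasonable next check.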