Service monitor script help.

Fabrizio Di Fonzo f.difonzo at gmail.com
Fri Sep 7 14:55:14 UTC 2007


Good morning everybody,

I've built a load-balancing system with two routers and two real servers.
On each server I've configured two different services: tftp and http.
For monitoring the http service I've used the default piranha parameters and
everything works well, while for monitoring tftp I've written a custom script.

My script prints the string "OK" if it is able to contact the service;
otherwise it prints "FAIL". If I start the "pulse" daemon after starting the
real servers, everything works fine and I can see both real servers displayed
on the piranha monitoring interface. But if I stop one of the tftp services,
piranha doesn't detect this event (the piranha monitoring interface still
shows all the services, even the dead ones) and continues sending tftp packets
to the dead services (I can see this by sniffing packets on the servers).

Also, if I want to restart the pulse daemon I must change the heartbeat port
in the lvs.cf file, as otherwise I get the message:
pulse: "cannot create heartbeat socket -- running as root?"
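
A minimal sketch of how one might check whether something still holds the
heartbeat port before restarting pulse. It assumes the heartbeat socket is
UDP and that `netstat -anu` prints the local address in the fourth column;
the helper name udp_port_bound is made up for illustration and is not part
of piranha.

```shell
#!/bin/bash
# Hypothetical helper (not part of piranha): reads `netstat -anu` style
# lines on stdin and succeeds if a UDP socket is bound to the given port.
# Assumption: the local address is the 4th whitespace-separated column.
udp_port_bound() {
    local port=$1
    awk -v p="$port" '
        $1 ~ /^udp/ && $4 ~ (":" p "$") { found = 1 }
        END { exit !found }
    '
}
```

It could be used as `netstat -anu | udp_port_bound 539 && echo "port 539 is
still bound"`, where 539 is the heartbeat_port from my lvs.cf.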

In short, it seems to me that when using an external monitoring script,
piranha is only able to detect when the script succeeds.

I'm using piranha version 0.7.12 on Red Hat AS 3.

Here's my monitoring script (it prints "OK" if it can contact the service,
otherwise it prints "FAIL"):

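#!/bin/bash
# Health check for a tftp real server: prints "OK" or "FAIL".

TFTP=/usr/bin/tftp

#HOST="localhost"
HOST=$1        # real server address, passed in by piranha as %h
PORT=69        # not used below
RETRIES=10     # not used below
TO=10          # tftp timeout and retransmit interval, in seconds
URL="/000000000000"

RC=0
TMP1=$(mktemp /tmp/tftpmon.XXXXXXXXXX) || exit 1
TMP2=$(mktemp /tmp/tftpmon.XXXXXXXXXX) || exit 1

# Drive the tftp client and capture its output in $TMP1.
"$TFTP" <<- EOF > "$TMP1" 2> /dev/null
$HOST
timeout $TO
rexmt $TO
get $URL $TMP2
EOF

# No "Received" line means the transfer did not complete.
if ! grep -q Received "$TMP1"; then
    RC=1
fi
# "File not found" still means the server answered, so count it as alive.
if grep -q "File not found" "$TMP1"; then
    RC=0
fi

if [ "$RC" = "0" ]; then
    echo "OK"
else
    echo "FAIL"
fi

rm -f "$TMP1" "$TMP2"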
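
The decision logic of the script can be exercised on its own, without a tftp
server, by feeding it saved client output. The sketch below restates the same
two grep rules as a function; the name classify_transcript is hypothetical
and the sample transcript lines used to try it are assumptions about what the
tftp client prints.

```shell
#!/bin/bash
# Hypothetical helper: applies the same rules as the monitoring script to a
# file holding captured tftp client output, and prints "OK" or "FAIL".
classify_transcript() {
    local transcript=$1
    local rc=0
    # No "Received" line means the transfer did not complete.
    if ! grep -q "Received" "$transcript"; then
        rc=1
    fi
    # "File not found" still proves the server answered.
    if grep -q "File not found" "$transcript"; then
        rc=0
    fi
    if [ "$rc" -eq 0 ]; then
        echo "OK"
    else
        echo "FAIL"
    fi
}
```

Running it against a file that contains neither string should print "FAIL",
which is the case piranha does not seem to act on.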

And my lvs.cf is the following:

serial_no = 117
primary = 172.42.1.190
service = lvs
backup_active = 1
backup = 172.42.1.191
heartbeat = 1
heartbeat_port = 539
keepalive = 6
deadtime = 18
network = nat
nat_router = 192.168.1.251 eth1:1
debug_level = 3
virtual TFTP {
     active = 1
     address = 172.42.1.195 eth0:1
     vip_nmask = 255.255.255.0
     port = 69
     persistent = 10
     expect = "OK"
     use_regex = 0
     send_program = "/opt/myscript.sh %h"
     load_monitor = none
     scheduler = rr
     protocol = udp
     timeout = 6
     reentry = 15
     quiesce_server = 0
     server pc170 {
         address = 192.168.1.170
         active = 1
         weight = 1
     }
     server pc171 {
         address = 192.168.1.171
         active = 1
         weight = 1
     }
}
virtual ZTC-WEB {
     active = 1
     address = 172.42.1.195 eth0:1
     vip_nmask = 255.255.255.0
     fwmark = 80
     port = 8080
     persistent = 10
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "HTTP"
     use_regex = 0
     load_monitor = none
     scheduler = rr
     protocol = tcp
     timeout = 6
     reentry = 15
     quiesce_server = 0
     server pc170 {
         address = 192.168.1.170
         active = 1
         weight = 1
     }
     server pc171 {
         address = 192.168.1.171
         active = 1
         weight = 1
     }
}
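
To compare what the check produces with the expect string outside of
piranha, the send_program line can be replayed by hand. This is only a
sketch: the function name run_check is made up, and the exact string
comparison is an assumption about how the output is matched against expect,
not something taken from the piranha documentation.

```shell
#!/bin/bash
# Hypothetical helper: expands %h in a send_program template, runs the
# resulting command, and compares its output with the expect string.
# Assumption: an exact string match decides whether the server is up.
run_check() {
    local template=$1 host=$2 expect=$3
    local cmd out
    # Substitute the real server address for every %h.
    cmd=$(printf '%s\n' "$template" | sed "s/%h/$host/g")
    # Left unquoted on purpose so the command and its arguments are split
    # on whitespace.
    out=$($cmd 2>/dev/null)
    if [ "$out" = "$expect" ]; then
        echo "$host: up"
    else
        echo "$host: down (got '$out')"
    fi
}
```

With the values from the TFTP virtual server above this would be called as
`run_check "/opt/myscript.sh %h" 192.168.1.170 "OK"`.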

Is this a bug, or did I make some mistake?
Can anyone help me?

Thanks in advance,
Fabrizio