lvsd kills off all nannies!

Dan Yocum yocum at fnal.gov
Wed May 6 16:56:22 UTC 2009


Hi all,

Here's the situation we're running into. We set a real server to 
active = 0 and weight = 0 and reload pulse, perform some work on the 
RS, then set active = 1 and weight = 3 and reload pulse again. At that 
point lvsd first creates the monitor (nanny) for that real server, the 
nanny dies for some strange reason, and lvsd then proceeds to shut down 
*all* virtual services!!

Here's what I see in /var/log/messages:

lvs[2821]: rereading configuration file
lvs[2821]: create_monitor for squid:3128/fg3x3.fnal.gov running as pid 13633
lvs[2821]: nanny for child squid:3128/fg3x3.fnal.gov died! shutting down lvs
lvs[2821]: shutting down virtual service MYSQL:3306
lvs[2821]: shutting down virtual service SAZ:8888
lvs[2821]: shutting down virtual service SAZ:8881
lvs[2821]: shutting down virtual service SAZ:8882
lvs[2821]: shutting down virtual service voms:8443
lvs[2821]: shutting down virtual service voms-osg:8443
lvs[2821]: shutting down virtual service gums:8443
lvs[2821]: shutting down virtual service voms-auger:15007
nanny[2854]: Terminating due to signal 15
nanny[2858]: Terminating due to signal 15
nanny[2865]: Terminating due to signal 15
nanny[2867]: Terminating due to signal 15
nanny[2868]: Terminating due to signal 15
nanny[2878]: Terminating due to signal 15
nanny[2888]: Terminating due to signal 15
nanny[2906]: Terminating due to signal 15
nanny[2910]: Terminating due to signal 15
nanny[2921]: Terminating due to signal 15
nanny[16558]: Terminating due to signal 15
nanny[16561]: Terminating due to signal 15
nanny[16562]: Terminating due to signal 15
nanny[16563]: Terminating due to signal 15
nanny[16566]: Terminating due to signal 15
nanny[16588]: Terminating due to signal 15
nanny[16592]: Terminating due to signal 15
nanny[16593]: Terminating due to signal 15
lvs[2821]: shutting down virtual service voms-cdf:15020
lvs[2821]: shutting down virtual service voms-cms:15015
lvs[2821]: shutting down virtual service voms-des:15017
lvs[2821]: shutting down virtual service voms-dzero:15002
lvs[2821]: shutting down virtual service voms-fermilab:15001
lvs[2821]: shutting down virtual service voms-i2u2:15026
lvs[2821]: shutting down virtual service voms-ilc:15023
lvs[2821]: shutting down virtual service voms-lqcd:15024
lvs[2821]: shutting down virtual service voms-nanohub:15022
lvs[2821]: shutting down virtual service voms-jdem:15028
nanny[2924]: Terminating due to signal 15
nanny[2938]: Terminating due to signal 15
nanny[2941]: Terminating due to signal 15
nanny[2955]: Terminating due to signal 15
nanny[2958]: Terminating due to signal 15
nanny[2971]: Terminating due to signal 15
nanny[2974]: Terminating due to signal 15
nanny[2989]: Terminating due to signal 15
nanny[2992]: Terminating due to signal 15
nanny[3003]: Terminating due to signal 15
nanny[3005]: Terminating due to signal 15
nanny[16594]: Terminating due to signal 15
nanny[16595]: Terminating due to signal 15
nanny[16604]: Terminating due to signal 15
nanny[16605]: Terminating due to signal 15
nanny[16606]: Terminating due to signal 15
nanny[16607]: Terminating due to signal 15
nanny[16608]: Terminating due to signal 15
nanny[16618]: Terminating due to signal 15
nanny[16619]: Terminating due to signal 15
nanny[16620]: Terminating due to signal 15
nanny[13633]: starting LVS client monitor for 131.225.107.161:3128
nanny[13633]: making 131.225.107.144:3128 available
nanny[13633]: /sbin/ipvsadm command failed!
lvs[2821]: shutting down virtual service voms-osg:15027
lvs[2821]: shutting down virtual service squid:3128
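
Because the entries above are interleaved, the messages from the nanny that died (pid 13633) only show up near the end of the excerpt. A minimal Python sketch for pulling them out follows. It assumes log entries of the form `name[pid]: message`, as pasted above; the helper names `group_by_pid` and `monitor_pids` are made up for illustration and are not part of piranha.

```python
import re
from collections import defaultdict

# Matches entries such as "nanny[13633]: /sbin/ipvsadm command failed!".
# re.search is used so a leading timestamp/hostname does not prevent a match.
ENTRY = re.compile(r"(\w+)\[(\d+)\]: (.*)")
MONITOR = re.compile(r"create_monitor for \S+ running as pid (\d+)")


def group_by_pid(lines):
    """Return {pid: [(process_name, message), ...]}, preserving log order."""
    grouped = defaultdict(list)
    for line in lines:
        match = ENTRY.search(line.strip())
        if match:
            name, pid, message = match.groups()
            grouped[int(pid)].append((name, message))
    return dict(grouped)


def monitor_pids(lines):
    """Return the pids that lvs reports in its create_monitor entries."""
    pids = []
    for line in lines:
        match = MONITOR.search(line)
        if match:
            pids.append(int(match.group(1)))
    return pids


# A few lines copied from the excerpt above.
sample = [
    "lvs[2821]: create_monitor for squid:3128/fg3x3.fnal.gov running as pid 13633",
    "lvs[2821]: nanny for child squid:3128/fg3x3.fnal.gov died! shutting down lvs",
    "nanny[2854]: Terminating due to signal 15",
    "nanny[13633]: starting LVS client monitor for 131.225.107.161:3128",
    "nanny[13633]: making 131.225.107.144:3128 available",
    "nanny[13633]: /sbin/ipvsadm command failed!",
]

grouped = group_by_pid(sample)
# Print only the entries logged by the freshly created monitor(s).
for pid in monitor_pids(sample):
    for name, message in grouped.get(pid, []):
        print(f"{name}[{pid}]: {message}")
```

Run against the full excerpt, this isolates the three nanny[13633] lines, the last of which is the ipvsadm failure.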


Performing a 'service pulse restart' brings everything back online just 
fine.

What's going on here?

The OS is Scientific Linux 5.2 (i.e., RHELv5.2) on a Xen VM, kernel 
2.6.18-128.1.6.el5xen.

Thanks,
Dan


-- 
Dan Yocum
Fermilab  630.840.6509
yocum at fnal.gov, http://fermigrid.fnal.gov
Fermilab.  Just zeros and ones.



