From mishagreen at gmail.com Tue May 17 10:14:02 2005
From: mishagreen at gmail.com (Michael Green)
Date: Tue, 17 May 2005 13:14:02 +0300
Subject: LVS error: "bad load average"
Message-ID: <17e909a4050517031458b6846b@mail.gmail.com>

I'm getting the following errors in the log file from the LVS:

May 17 12:57:58 biocsm nanny[7310]: bad load average returned:
biocl1 up 13+02:00, 0 users, load 2.00, 1.98, 1.90
biocl2 up 13+02:13, 0 users, load 1.98, 1.96, 1.90
biocl3 up 13+02:13, 0 users, load 0.98, 0.96, 0.91
biocl4 up 13+02:12, 0 users, load 2.00, 2.00, 1.91
biocl5 up 13+02:13, 0 users, load 1.98, 1.96, 1.90
biocl6 up 13+02:13, 0 users, load 1.98, 1.96, 1.90
biocsm up 13+22:08, 0 users, load 0.00, 0.00, 0.00
May 17 12:57:58 biocsm nanny[7305]: bad load average returned:
biocl1 up 13+02:00, 0 users, load 2.00, 1.98, 1.90
biocl2 up 13+02:13, 0 users, load 1.98, 1.96, 1.90
biocl3 up 13+02:13, 0 users, load 0.98, 0.96, 0.91
biocl4 up 13+02:12, 0 users, load 2.00, 2.00, 1.91
biocl5 up 13+02:13, 0 users, load 1.98, 1.96, 1.90
biocl6 up 13+02:13, 0 users, load 1.98, 1.96, 1.90
biocsm up 13+22:08, 0 users, load 0.00, 0.00, 0.00

and so on... /var/log/messages is literally flooded with these.

I've googled and searched through the LVS & Piranha mailing lists and found that other people have had the same problem, but I haven't found any definitive answer.

Please help.
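The log lines above suggest that the whole ruptime table, one row per host, is being handed back where a single load figure was expected. That reading is an assumption, not something confirmed in this thread. As a minimal sketch of what picking one host's 1-minute load out of such a table could look like (the function name load_for_host is made up for illustration; this is not what nanny or any Piranha patch actually does):

```sh
#!/bin/sh
# Hypothetical helper (not part of Piranha): print the 1-minute load
# average for one host, given ruptime-style output on stdin.
load_for_host() {
    # $1 = host name as it appears in the first column
    awk -v host="$1" '
        $1 == host && $2 == "up" {
            # The field after the "load" keyword is the 1-minute average,
            # with a trailing comma (e.g. "2.00,") that is stripped here.
            for (i = 3; i < NF; i++) {
                if ($i == "load") {
                    sub(/,$/, "", $(i + 1))
                    print $(i + 1)
                    exit
                }
            }
        }
    '
}

# Sample rows copied from the log excerpt above.
sample='biocl1 up 13+02:00, 0 users, load 2.00, 1.98, 1.90
biocl3 up 13+02:13, 0 users, load 0.98, 0.96, 0.91
biocsm up 13+22:08, 0 users, load 0.00, 0.00, 0.00'

printf '%s\n' "$sample" | load_for_host biocl3   # prints 0.98
```

Hosts that ruptime reports as down, or that are not in the table at all, produce no output, so a caller can tell "no load available" apart from a load of 0.00.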
My lvs.cf is as follows:

[root at biocsm ha]# more lvs.cf
serial_no = 35
primary = 132.77.90.131
service = lvs
backup = 0.0.0.0
heartbeat = 1
heartbeat_port = 539
keepalive = 6
deadtime = 18
network = nat
nat_router = 192.168.1.30 eth1:1
debug_level = NONE
virtual HTTP {
     active = 1
     address = 132.77.90.131 eth0:1
     vip_nmask = 255.255.255.0
     port = 80
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "HTTP"
     use_regex = 0
     load_monitor = ruptime
     scheduler = wlc
     protocol = tcp
     timeout = 6
     reentry = 15
     quiesce_server = 1
     server biocl1 {
         address = 192.168.1.31
         active = 1
         weight = 1
     }
     server biocl2 {
         address = 192.168.1.32
         active = 1
         weight = 1
     }
     server biocl3 {
         address = 192.168.1.33
         active = 1
         weight = 1
     }
     server biocl4 {
         address = 192.168.1.34
         active = 1
         weight = 1
     }
     server biocl5 {
         address = 192.168.1.35
         active = 1
         weight = 1
     }
     server biocl6 {
         address = 192.168.1.36
         active = 1
         weight = 1
     }
}

-- 
Warm regards,
Michael Green

From sebastien.bonnet at experian.fr Tue May 17 10:14:14 2005
From: sebastien.bonnet at experian.fr (Sébastien BONNET)
Date: Tue, 17 May 2005 12:14:14 +0200
Subject: LVS error: "bad load average"
In-Reply-To: <17e909a4050517031458b6846b@mail.gmail.com>
References: <17e909a4050517031458b6846b@mail.gmail.com>
Message-ID: <4289C3F6.9080001@experian.fr>

Michael,

Dig in the list archives for a patch I've released to solve this problem and explanations on how to use ruptime successfully.
Regards,

-- 
Sébastien BONNET -- Systems Engineer
Tel: 04.42.25.15.40 GSM: 06.64.44.58.98

From mishagreen at gmail.com Tue May 17 12:52:41 2005
From: mishagreen at gmail.com (Michael Green)
Date: Tue, 17 May 2005 15:52:41 +0300
Subject: LVS error: "bad load average"
In-Reply-To: <4289C3F6.9080001@experian.fr>
References: <17e909a4050517031458b6846b@mail.gmail.com> <4289C3F6.9080001@experian.fr>
Message-ID: <17e909a405051705523daa1c45@mail.gmail.com>

On 5/17/05, Sébastien BONNET wrote:
> Michael,
>
> Dig in the list archives for a patch I've released to solve this problem
> and explanations on how to use ruptime successfully.

Sébastien hi,

I found a patch here and explanations here

Is that it? I'm a bit confused because 'ruptime' is not mentioned anywhere in that explanation.
-- 
Warm regards,
Michael Green

From ritesh.a at net4india.net Wed May 18 18:35:19 2005
From: ritesh.a at net4india.net (Ritesh Agrawal)
Date: Thu, 19 May 2005 00:05:19 +0530
Subject: Piranha hangs
Message-ID: <428B8AE7.3090702@net4india.net>

Hi,

I am facing a strange problem implementing a load balancer using piranha. I set up load balancer LB1 with the following configuration:

LB1:
private ip: 192.168.35.253
public ip: 192.168.24.126
floating VIP: 192.168.35.254
service ip: 192.168.24.60

spam2 is my real server, with ip 192.168.35.22, providing http service.

After starting pulse, when I send a request from the outside world to 192.168.24.60:80, it works fine, but after some time LB1 hangs, displaying no error and giving no clue. I am unable to find the reason for the hang.

My lvs.cf file:

serial_no = 59
primary = 192.168.24.126
primary_private = 192.168.35.253
service = lvs
backup_active = 0
backup = 192.168.24.57
backup_private = 192.168.35.57
heartbeat = 1
heartbeat_port = 539
keepalive = 6
deadtime = 18
network = nat
nat_router = 192.168.35.254 eth1:1
nat_nmask = 255.255.255.0
debug_level = NONE
virtual spam {
     active = 1
     address = 192.168.24.60 eth0:1
     vip_nmask = 255.255.255.0
     port = 80
     send = "GET / HTTP/1.0\r\n\r\n"
     expect = "HTTP"
     use_regex = 0
     load_monitor = none
     scheduler = lc
     protocol = tcp
     timeout = 6
     reentry = 15
     quiesce_server = 0
     server spam2 {
         address = 192.168.35.22
         active = 1
         weight = 1
     }
}

-- 
Regards
Ritesh Agrawal
Senior Engineer-Systems
Net 4 India Ltd,
B-4/47, Safdarjung Enclave,
New Delhi- 110 029, India
---------------------------------------------------
The more I learn, the less I know
---------------------------------------------------

From ren at teamware-gmbh.de Fri May 20 10:53:23 2005
From: ren at teamware-gmbh.de (René Enskat [Teamware GmbH])
Date: Fri, 20 May 2005 12:53:23 +0200
Subject: Problem with Direct Working Howto v0.2
Message-ID: <2b916236311d1c47b5809ce1ffb71465@teamware-gmbh.de>

Hello all,

I tried the Howto from:

https://www.redhat.com/archives/piranha-list/2005-April/msg00000.html

My problem now is what happens when I try this command:

[root at lb root]# arptables -A in -d 212.29.1.83 -j drop
arptables v0.0.6: Couldn't load target `drop':/lib/arptables/libarpt_drop.so: cannot open shared object file: No such file or directory
Try `arptables -h' or 'arptables --help' for more information.
...

All the other lines, the OUT command for example, work!

I installed the newest arptables: arptables_jf-0.0.8-2

What is the problem here?
It seems that's the reason why the LB is not working right: when I try to open a site, it goes directly to the real host and not through the LB.

Thx
Rene

From lhh at redhat.com Fri May 20 20:10:15 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 20 May 2005 16:10:15 -0400
Subject: Problem with Direct Working Howto v0.2
In-Reply-To: <2b916236311d1c47b5809ce1ffb71465@teamware-gmbh.de>
References: <2b916236311d1c47b5809ce1ffb71465@teamware-gmbh.de>
Message-ID: <1116619815.4621.20.camel@ayanami.boston.redhat.com>

On Fri, 2005-05-20 at 12:53 +0200, René Enskat [Teamware GmbH] wrote:
> Hello all,
>
> I tried the Howto from:
>
> https://www.redhat.com/archives/piranha-list/2005-April/msg00000.html
>
> My problem now is what happens when I try this command:
>
> [root at lb root]# arptables -A in -d 212.29.1.83 -j drop
> arptables v0.0.6: Couldn't load target
> `drop':/lib/arptables/libarpt_drop.so: cannot open shared object file:
> No such file or directory
> It seems that's the reason why the LB is not working right: when I try to
> open a site, it goes directly to the real host and not through the LB.

Try using the iptables method for now; I'm looking into the arptables problem.

-- Lon

From lhh at redhat.com Fri May 20 20:13:44 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 20 May 2005 16:13:44 -0400
Subject: Piranha hangs
In-Reply-To: <428B8AE7.3090702@net4india.net>
References: <428B8AE7.3090702@net4india.net>
Message-ID: <1116620024.4621.25.camel@ayanami.boston.redhat.com>

On Thu, 2005-05-19 at 00:05 +0530, Ritesh Agrawal wrote:
> Hi,
> I am facing a strange problem implementing a load balancer using
> piranha. I set up load balancer LB1 with the following configuration:
> LB1:
> private ip: 192.168.35.253
> public ip: 192.168.24.126
> floating VIP: 192.168.35.254
> service ip: 192.168.24.60
>
> spam2 is my real server, with ip 192.168.35.22, providing http service.
> After starting pulse, when I send a request from the outside world to
> 192.168.24.60:80, it works fine, but after some time LB1 hangs,
> displaying no error and giving no clue.
> I am unable to find the reason for the hang.

Off the top of my head, piranha hanging (or even crashing!) shouldn't affect the load balancing traffic. Piranha controls the routing assignments, but it doesn't do any routing itself - the routing is done in-kernel in the IPVS modules.

So, I suspect that lb1 (or rather, its kernel) might be hung as opposed to piranha...

Can you get a serial console attached to lb1 and see if it's panicking/hung?

-- Lon

From lhh at redhat.com Fri May 20 20:48:46 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Fri, 20 May 2005 16:48:46 -0400
Subject: Problem with Direct Working Howto v0.2
In-Reply-To: <2b916236311d1c47b5809ce1ffb71465@teamware-gmbh.de>
References: <2b916236311d1c47b5809ce1ffb71465@teamware-gmbh.de>
Message-ID: <1116622126.4621.40.camel@ayanami.boston.redhat.com>

On Fri, 2005-05-20 at 12:53 +0200, René Enskat [Teamware GmbH] wrote:
> Hello all,
>
> I tried the Howto from:
>
> https://www.redhat.com/archives/piranha-list/2005-April/msg00000.html
>
> My problem now is what happens when I try this command:
>
> [root at lb root]# arptables -A in -d 212.29.1.83 -j drop
> arptables v0.0.6: Couldn't load target

Thanks to the arptables_jf maintainer, we have an answer for you...

The commands are case sensitive... Try:

    arptables -A IN -d ...
                 ^^

-- Lon

From ritesh.a at net4india.net Sat May 21 14:17:32 2005
From: ritesh.a at net4india.net (Ritesh Agrawal)
Date: Sat, 21 May 2005 19:47:32 +0530
Subject: Piranha hangs
In-Reply-To: <1116620024.4621.25.camel@ayanami.boston.redhat.com>
References: <428B8AE7.3090702@net4india.net> <1116620024.4621.25.camel@ayanami.boston.redhat.com>
Message-ID: <428F42FC.6070407@net4india.net>

Lon Hohberger wrote:
> On Thu, 2005-05-19 at 00:05 +0530, Ritesh Agrawal wrote:
>
>> Hi,
>> I am facing a strange problem implementing a load balancer using
>> piranha. I set up load balancer LB1 with the following configuration:
>> LB1:
>> private ip: 192.168.35.253
>> public ip: 192.168.24.126
>> floating VIP: 192.168.35.254
>> service ip: 192.168.24.60
>>
>> spam2 is my real server, with ip 192.168.35.22, providing http service.
>> After starting pulse, when I send a request from the outside world to
>> 192.168.24.60:80, it works fine, but after some time LB1 hangs,
>> displaying no error and giving no clue.
>> I am unable to find the reason for the hang.
>
> Off the top of my head, piranha hanging (or even crashing!) shouldn't
> affect the load balancing traffic. Piranha controls the routing
> assignments, but it doesn't do any routing itself - the routing is done
> in-kernel in the IPVS modules.
>
> So, I suspect that lb1 (or rather, its kernel) might be hung as opposed
> to piranha...
>
> Can you get a serial console attached to lb1 and see if it's
> panicking/hung?
>
> -- Lon

Hi Lon,
Thanks for your help. Actually, after starting pulse everything goes well: it handles 10-20 web requests properly, but within minutes the machine hangs, and I don't know why. I tried to troubleshoot by attaching a console to the server; no kernel panic or other error message is displayed on the console.

I checked /var/log/messages for errors, but I couldn't find any clue.
If the problem were with the hardware or software configuration, then not even a single request would be handled, but it works properly for 10-20 requests and then hangs after some idle time.

Regards
Ritesh

From peterbaitz at yahoo.com Sat May 21 13:03:01 2005
From: peterbaitz at yahoo.com (pb)
Date: Sat, 21 May 2005 06:03:01 -0700 (PDT)
Subject: Piranha hangs
In-Reply-To: 6667
Message-ID: <20050521130301.66734.qmail@web60024.mail.yahoo.com>

The pulse daemon starts the lvs daemon, which manages the nanny daemons; these run the port test scripts and set up the routes per your lvs.cf. You might also check the lvs daemon and see whether it is going Z (zombie). If it is, it will stop managing the nanny daemons, though the nannies should keep functioning as long as they are running, even if they are orphaned. Check with:

ps vafx

From ren at teamware-gmbh.de Mon May 23 05:56:39 2005
From: ren at teamware-gmbh.de (René Enskat [Teamware GmbH])
Date: Mon, 23 May 2005 07:56:39 +0200
Subject: AW: Problem with Direct Working Howto v0.2
In-Reply-To: <1116622126.4621.40.camel@ayanami.boston.redhat.com>
Message-ID:

Same result as before:

[root at telemach3 ~]# arptables -A IN -d 212.29.1.83 -j drop
arptables v0.0.8: Couldn't load target `drop':/lib/arptables/libarpt_drop.so: cannot open shared object file: No such file or directory
Try `arptables -h' or 'arptables --help' for more information.

From ren at teamware-gmbh.de Mon May 23 06:05:19 2005
From: ren at teamware-gmbh.de (René Enskat [Teamware GmbH])
Date: Mon, 23 May 2005 08:05:19 +0200
Subject: AW: Problem with Direct Working Howto v0.2
In-Reply-To: <1116622126.4621.40.camel@ayanami.boston.redhat.com>
Message-ID: <0fe0989cf49a0645ae28b1da3b12e9de@teamware-gmbh.de>

Lon,

The same problem with iptables :)

[root at telemach3 ~]# iptables -A IN -d 212.29.1.83 -j drop
iptables v1.2.11: Couldn't load target `drop':/lib/iptables/libipt_drop.so: cannot open shared object file: No such file or directory
Try `iptables -h' or 'iptables --help' for more information.

From ren at teamware-gmbh.de Mon May 23 06:12:22 2005
From: ren at teamware-gmbh.de (René Enskat [Teamware GmbH])
Date: Mon, 23 May 2005 08:12:22 +0200
Subject: AW: Problem with Direct Working Howto v0.2
In-Reply-To: <1116622126.4621.40.camel@ayanami.boston.redhat.com>
Message-ID:

Ah well, found the problem: DROP is case sensitive too :) Sorry! :)

From ren at teamware-gmbh.de Mon May 23 06:52:52 2005
From: ren at teamware-gmbh.de (René Enskat [Teamware GmbH])
Date: Mon, 23 May 2005 08:52:52 +0200
Subject: Ruptime/rup? And Cookie problem when DR
Message-ID:

OK, my LB seems to work now, but I have a few small problems.
I tried ruptime and rup, but I always get this error in the log:

May 23 08:49:51 lb nanny[2472]: The following exited abnormally:
May 23 08:49:51 lb nanny[2472]: failed to read remote load
May 23 08:50:09 lb nanny[2472]: The following exited abnormally:
May 23 08:50:09 lb nanny[2472]: failed to read remote load
May 23 08:50:27 lb nanny[2472]: The following exited abnormally:
May 23 08:50:27 lb nanny[2472]: failed to read remote load

I made an ssh key, so I can connect directly from the LB to the real host as root via ssh.

So what is the problem there?

My other problem is that when I try to reach my load-balanced website, I get error messages from Internet Explorer saying that I haven't enabled cookies for this site, but I did.

When I bind the site to the vhost on the real IP of the virtual host, it works fine!

From ren at teamware-gmbh.de Mon May 23 07:09:28 2005
From: ren at teamware-gmbh.de (René Enskat [Teamware GmbH])
Date: Mon, 23 May 2005 09:09:28 +0200
Subject: AW: Ruptime/rup? And Cookie problem when DR
In-Reply-To:
Message-ID: <8811d543a6220448965903ae42dc2c80@teamware-gmbh.de>

OK, it seems rwhod was broken. It is started now, but the error log says this:

May 23 09:07:14 lb nanny[2566]: bad load average returned: lb up 0:53, 2 users, load 0.05, 0.02, 0.00

From lhh at redhat.com Mon May 23 13:39:19 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Mon, 23 May 2005 09:39:19 -0400
Subject: Piranha hangs
In-Reply-To: <428F42FC.6070407@net4india.net>
References: <428B8AE7.3090702@net4india.net> <1116620024.4621.25.camel@ayanami.boston.redhat.com> <428F42FC.6070407@net4india.net>
Message-ID: <1116855560.4621.55.camel@ayanami.boston.redhat.com>

On Sat, 2005-05-21 at 19:47 +0530, Ritesh Agrawal wrote:
> Hi Lon,
> Thanks for your help. Actually, after starting pulse everything goes
> well: it handles 10-20 web requests properly, but within minutes the
> machine hangs, and I don't know why. I tried to troubleshoot by
> attaching a console to the server; no kernel panic or other error
> message is displayed on the console.
> I checked /var/log/messages for errors, but I couldn't find any clue.
> If the problem were with the hardware or software configuration, then
> not even a single request would be handled, but it works properly for
> 10-20 requests and then hangs after some idle time.

Hi Ritesh,

You're absolutely right -- if it's a configuration issue, it is not likely to work at all.
If the whole machine is hanging, it's probably not a problem with piranha itself - it's likely a kernel issue (my guess would be in the ipvs code).

Personally, I haven't seen this problem before. I would call Red Hat support at this point.

Chances are good that Red Hat Support will want at least the following:

(a) kernel version
(b) piranha version
(c) Task stack traces (sysrq-t, or BRK-t over a serial terminal)
(d) kernel PC (sysrq-p, or BRK-p over a serial terminal)
(e) memory info (sysrq-m, or BRK-m over a serial terminal)
(f) Full sysreport

-- Lon

From ritesh.a at net4india.net Tue May 24 13:32:46 2005
From: ritesh.a at net4india.net (Ritesh Agrawal)
Date: Tue, 24 May 2005 19:02:46 +0530
Subject: Piranha hangs
In-Reply-To: <428F42FC.6070407@net4india.net>
References: <428B8AE7.3090702@net4india.net> <1116620024.4621.25.camel@ayanami.boston.redhat.com> <428F42FC.6070407@net4india.net>
Message-ID: <42932CFE.60905@net4india.net>

Hi,

Now my load balancer is working fine with the same hardware and the same configuration; I just booted the server with the single-processor kernel image instead of the SMP kernel image.

All my real servers are working fine. It seems the 'pulse' daemon does not work properly with the SMP kernel image, and I don't know the exact reason behind it. Please help me find out the proper reason.

Thanks for your help

Regards
Ritesh

From ren at teamware-gmbh.de Tue May 24 13:42:18 2005
From: ren at teamware-gmbh.de (René Enskat [Teamware GmbH])
Date: Tue, 24 May 2005 15:42:18 +0200
Subject: Another LB problem
Message-ID: <6956cd9564f764499e086e4d74e96bd1@teamware-gmbh.de>

Since yesterday I have a new problem.

When I start pulse etc., I only get the master process, but the routing to the real hosts isn't working:

[root at lb bin]# ipvsadm
IP Virtual Server version 1.0.8 (size=65536)
Prot LocalAddress:Port Scheduler Flags
  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  adressenlb.bfv.de:http wlc

But my settings in lvs.cf are the same :(

From lhh at redhat.com Tue May 24 15:21:37 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 24 May 2005 11:21:37 -0400
Subject: Piranha hangs
In-Reply-To: <42932CFE.60905@net4india.net>
References: <428B8AE7.3090702@net4india.net> <1116620024.4621.25.camel@ayanami.boston.redhat.com> <428F42FC.6070407@net4india.net> <42932CFE.60905@net4india.net>
Message-ID: <1116948097.18073.2.camel@ayanami.boston.redhat.com>

On Tue, 2005-05-24 at 19:02 +0530, Ritesh Agrawal wrote:
> Hi,
>
> Now my load balancer is working fine with the same hardware and the
> same configuration; I just booted the server with the single-processor
> kernel image instead of the SMP kernel image.
>
> All my real servers are working fine. It seems the 'pulse' daemon does
> not work properly with the SMP kernel image, and I don't know the exact
> reason behind it. Please help me find out the proper reason.

Sounds like one or more of the IPVS kernel modules isn't SMP safe...
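For anyone comparing the two kernels as described above, here is a small sketch for recording which flavour is running and which IPVS modules are loaded. It assumes a Linux box where `uname -v` includes the word SMP for SMP builds and where the IPVS module names start with ip_vs; the function name kernel_flavour is made up for illustration.

```sh
#!/bin/sh
# Hypothetical helper: classify a kernel version string (as printed by
# `uname -v`) as SMP or uniprocessor. Assumes SMP builds carry the word
# "SMP" in that string.
kernel_flavour() {
    case " $1 " in
        *" SMP "*) echo smp ;;
        *)         echo up ;;
    esac
}

echo "kernel release: $(uname -r)"
echo "kernel flavour: $(kernel_flavour "$(uname -v)")"

# List loaded IPVS modules, if /proc/modules is readable (Linux only).
if [ -r /proc/modules ]; then
    grep '^ip_vs' /proc/modules || echo "no ip_vs modules loaded"
fi
```

Running it once under each kernel gives two short reports that can be attached to a support request next to the kernel and piranha versions.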
-- Lon

From lhh at redhat.com Tue May 24 19:44:06 2005
From: lhh at redhat.com (Lon Hohberger)
Date: Tue, 24 May 2005 15:44:06 -0400
Subject: [Linux-cluster] LVS error: "bad load average"
In-Reply-To: <17e909a4050517031458b6846b@mail.gmail.com>
References: <17e909a4050517031458b6846b@mail.gmail.com>
Message-ID: <1116963846.18073.39.camel@ayanami.boston.redhat.com>

On Tue, 2005-05-17 at 13:14 +0300, Michael Green wrote:
> Getting the following errors in the log file from the LVS:

What version of piranha?

-- Lon

From ritesh.a at net4india.net Sat May 28 09:19:07 2005
From: ritesh.a at net4india.net (Ritesh Agrawal)
Date: Sat, 28 May 2005 14:49:07 +0530
Subject: Does Piranha work with SMP kernel?
Message-ID: <4298378B.20605@net4india.net>

Hi list,

Recently I configured a Piranha load balancer on a Dell PowerEdge 1650 server. When the machine is booted with the SMP kernel, it hangs after the pulse daemon is started and requests are sent. But when I boot the machine with the non-SMP kernel, with the same hardware and software configuration, it works well. Does that mean Piranha doesn't work with an SMP kernel, or is there some other reason?

Is Piranha suitable for use with an SMP kernel? A big question for me.

Thanks in advance.
-- 
Regards
Ritesh Agrawal

From ren at teamware-gmbh.de Mon May 30 07:13:39 2005
From: ren at teamware-gmbh.de (René Enskat [Teamware GmbH])
Date: Mon, 30 May 2005 09:13:39 +0200
Subject: Ipvsadm problems
Message-ID:

I still have problems with my load balancer.

I start the pulse and piranha daemons without errors, and then I have this:

  -> RemoteAddress:Port           Forward Weight ActiveConn InActConn
TCP  lbtest.teamware-gmbh.de:http rr
  -> telemach4.teamware-gmbh.de:h Route   18618  0          0
  -> telemach3.teamware-gmbh.de:h Route   20480  0          0
TCP  adressenlb.bfv.de:http wlc persistent 600
  -> telemach4.teamware-gmbh.de:h Route   18618  0          0
  -> telemach3.teamware-gmbh.de:h Route   20480  0          0

But I can't browse to the sites.

The strange thing is that when I look up the ARP entry for the address of adressenlb.bfv.de I get:
0030.4852.fb8c

But the LB has the MAC address:
0030.0525.16ac

I installed the arp filter on both real hosts and set it to substitute the main host IP in outgoing ARP requests.

Chain IN (policy ACCEPT)
target   source-ip   destination-ip      source-hw   destination-hw   hlen   op    hrd   pro
DROP     anywhere    adressenlb.bfv.de   anywhere    anywhere         any    any   any   any

Chain OUT (policy ACCEPT)
target   source-ip   destination-ip      source-hw   destination-hw   hlen   op    hrd   pro
mangle   anywhere    adressenlb.bfv.de   anywhere    anywhere         any    any   any   any --mangle-ip-s telemach3.teamware-gmbh.de

Chain FORWARD (policy ACCEPT)
target   source-ip   destination-ip      source-hw   destination-hw   hlen   op    hrd   pro

Can somebody say what the problem is there?
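
For comparison, a minimal sketch of the two real-server rules as I understand them from Red Hat's direct-routing documentation for arptables_jf. It only prints the commands rather than running them (they need root), the function name dr_arp_rules is made up for illustration, and the real-server address in the example is a placeholder. Note that the OUT rule in this form matches on source (-s), whereas the listing above shows the VIP in the destination-ip column; whether that difference is the cause here is not established in this thread.

```sh
#!/bin/sh
# Hypothetical helper: print (not run) the two arptables_jf rules for one
# real server in a direct-routing setup.
dr_arp_rules() {
    vip=$1   # virtual IP that only the load balancer should answer ARP for
    rip=$2   # this real server's own IP
    # Chain and target names are case sensitive: IN, OUT, DROP.
    echo "arptables -A IN -d $vip -j DROP"
    echo "arptables -A OUT -s $vip -j mangle --mangle-ip-s $rip"
}

# VIP taken from earlier in the thread; 192.0.2.21 is a placeholder
# real-server address, not one from the thread.
dr_arp_rules 212.29.1.83 192.0.2.21
```

Each real server would get its own pair, with its own address as the second argument; `arptables -L` afterwards should then show the VIP under source-ip in the OUT chain.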