From mingo at elte.hu  Sat May  1 03:41:49 2004
From: mingo at elte.hu (Ingo Molnar)
Date: Sat, 1 May 2004 05:41:49 +0200
Subject: Benchmarks
In-Reply-To: <1083342694.18079.45.camel@localhost.localdomain>
References: <40914BA6.4080700@pcextremist.com>
	<1083265271.1009.28.camel@mcdlp.pbi.daviesinc.com>
	<4091591E.20205@pcextremist.com>
	<1083268075.16885.37.camel@localhost.localdomain>
	<1083269442.1009.37.camel@mcdlp.pbi.daviesinc.com>
	<1083274784.16881.50.camel@localhost.localdomain>
	<1083339749.5390.22.camel@mcdlp.pbi.daviesinc.com>
	<1083342694.18079.45.camel@localhost.localdomain>
Message-ID: <20040501034149.GA19200@elte.hu>

* William Lovaton wrote:

> Load average is a good number to look at. The thing is that a 1.5 load
> means there are almost 2 processes in execution state. [...]

the load also includes processes in 'uninterruptible sleep' - i.e.
processes that are waiting for some sort of definitive, driver-related
IO event, such as disk IO or network IO.

Newer kernels (2.6, or vendor kernels with the 'iowait patch'
backported) also have the iowait stat:

  05:17:02 up 1 day, 21:43, 15 users, load average: 0.61, 0.25, 0.19
  74 processes: 72 sleeping, 2 running, 0 zombie, 0 stopped
  CPU states:  cpu   user   nice  system   irq  softirq  iowait   idle
              total   3.4%   0.0%    5.2%  0.0%     0.0%   10.0%  81.4%

('softirq' overhead is typically caused by networking, 'iowait' is idle
time while there is pending network/disk IO, and 'idle' is pure idle
time when nothing is happening in the system.)

> > > > Machine is pushing 25.34mb/sec
> > > How do you get this number?? (25.34mb/sec)
> > right off the switch port.
>
> Neat! I'll talk with the net guy here. ;-)

there are soft stats on the Linux side too:

  sar -n DEV 10 0

will display summary per-interface tx/rx statistics every 10 seconds.
Also, 'iptraf' is a pretty handy tool for simple traffic analysis.

	Ingo

From williama_lovaton at coomeva.com.co  Mon May  3 13:22:05 2004
From: williama_lovaton at coomeva.com.co (William Lovaton)
Date: 03 May 2004 08:22:05 -0500
Subject: Benchmarks
In-Reply-To: <1083352140.5390.71.camel@mcdlp.pbi.daviesinc.com>
References: <40914BA6.4080700@pcextremist.com>
	<1083265271.1009.28.camel@mcdlp.pbi.daviesinc.com>
	<4091591E.20205@pcextremist.com>
	<1083268075.16885.37.camel@localhost.localdomain>
	<1083269442.1009.37.camel@mcdlp.pbi.daviesinc.com>
	<1083274784.16881.50.camel@localhost.localdomain>
	<1083339749.5390.22.camel@mcdlp.pbi.daviesinc.com>
	<1083342694.18079.45.camel@localhost.localdomain>
	<1083352140.5390.71.camel@mcdlp.pbi.daviesinc.com>
Message-ID: <1083590525.18084.86.camel@localhost.localdomain>

Hi Chris,

On Fri, 2004-04-30 at 14:09, Chris Davies wrote:
> > > I don't really think load average is a very good number other than a
> > > rough generalization. I've been on machines where the average is at 1.5
> >
> > Load average is a good number to look at. The thing is that a 1.5 load
> > means there are almost 2 processes in execution state. It doesn't mean
>
> Right, but you can have a low load and be io bound and have a system
> that is slow, or be cpu bound and have a system that is relatively
> responsive under a high load.

Exactly... and once you are CPU bound with slow response under high
load, your bottleneck is the CPU. You will need more MHz.

> > then where is the bottleneck?
>
> Something is keeping processes in the run queue. IO? VM?

VM is not a problem for me: with 2GB of RAM and almost 2 months of
uptime it has never touched swap.
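(A quick way to double-check that swap really is untouched is to read
the SwapTotal:/SwapFree: fields out of /proc/meminfo - a minimal
sketch, assuming the field names used by 2.4/2.6 kernels:)

/* swapcheck.c - print swap usage from /proc/meminfo */
#include <stdio.h>
#include <string.h>

int main(void)
{
	FILE *f = fopen("/proc/meminfo", "r");
	char line[256];
	long swap_total = -1, swap_free = -1;

	if (!f) {
		perror("/proc/meminfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, "SwapTotal:", 10))
			sscanf(line + 10, "%ld", &swap_total);
		else if (!strncmp(line, "SwapFree:", 9))
			sscanf(line + 9, "%ld", &swap_free);
	}
	fclose(f);

	if (swap_total >= 0 && swap_free >= 0)
		printf("swap used: %ld kB of %ld kB\n",
		       swap_total - swap_free, swap_total);
	return 0;
}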
The problem seems to be especially network I/O between the web server
and the database server; there seems to be a little glitch in transfer
speed. A few months ago we had a serious slowness problem, and it
turned out that the NIC that connects to the DB had (for some reason)
switched to half duplex. We went crazy trying to find the problem.

But right now I guess the problem is starting to be CPU power. I'm
really looking forward to Fedora Core 2 to try the new kernel. I
already tried Mandrake 10 in a production environment with a spare
server, and it worked really well from the load average point of view:
while RH9 with 2.4.20 sat between 12 - 30, kernel 2.6.3 sat between
3 - 9, and that was without TUX (MDK doesn't ship TUX). The only
problem I had was that for some reason network transfer speed was
really, really slow, and I don't know why - it was 100Mbps full duplex
just like the other server. In the end it didn't work out and I had to
go back to the old server.

I was reading the docs in the new TUX version (FC2T2) and saw something
related to network cards. I posted this mail to the list but got no
answer:

https://www.redhat.com/archives/tux-list/2004-April/msg00012.html

I would be very grateful if you could give me some info about that
mail.

> I'm curious, have you gotten the php4-tux sapi to work? I've not gotten
> the CGI to work, but I think if I can get both CGI and PHP to work with
> tux, there won't be much that apache needs to do.

I haven't tried that. I guess that way TUX could be a replacement for
apache, but not for me: TUX is a threaded web server, and that doesn't
make a good combination with PHP if you are using database connections
or some other kind of shared resource. Besides, some PHP libraries and
modules are not thread safe.

-William

From williama_lovaton at coomeva.com.co  Mon May  3 14:00:18 2004
From: williama_lovaton at coomeva.com.co (William Lovaton)
Date: 03 May 2004 09:00:18 -0500
Subject: Benchmarks
In-Reply-To: <20040501034149.GA19200@elte.hu>
References: <40914BA6.4080700@pcextremist.com>
	<1083265271.1009.28.camel@mcdlp.pbi.daviesinc.com>
	<4091591E.20205@pcextremist.com>
	<1083268075.16885.37.camel@localhost.localdomain>
	<1083269442.1009.37.camel@mcdlp.pbi.daviesinc.com>
	<1083274784.16881.50.camel@localhost.localdomain>
	<1083339749.5390.22.camel@mcdlp.pbi.daviesinc.com>
	<1083342694.18079.45.camel@localhost.localdomain>
	<20040501034149.GA19200@elte.hu>
Message-ID: <1083592818.18080.104.camel@localhost.localdomain>

Hi Ingo,

Very useful explanation, thank you. Which packages provide sar and
iptraf? I can't manage to find them.

Right now I have this on my system... could you give us some analysis,
please? (RH9 2.4.20-28.9smp)

  08:37:44 up 7 days, 1:39, 1 user, load average: 30,75, 35,62, 34,06
 270 processes: 239 sleeping, 31 running, 0 zombie, 0 stopped
 CPU0 states: 90,0% user  9,0% system  0,0% nice  0,0% iowait  0,1% idle
 CPU1 states: 87,0% user 12,0% system  0,0% nice  0,0% iowait  0,1% idle
 CPU2 states: 85,1% user 14,0% system  0,0% nice  0,0% iowait  0,0% idle
 CPU3 states: 84,0% user 14,0% system  0,0% nice  0,0% iowait  1,0% idle
 Mem: 2322992k av, 1215588k used, 1107404k free, 0k shrd, 56524k buff
      486768k actv, 635060k in_d, 5040k in_c
 Swap: 681336k av, 0k used, 681336k free             697300k cached

  PID USER  PRI NI SIZE  RSS SHARE STAT %CPU %MEM   TIME CPU COMMAND
 7882 root   15  0 5372 5372  3704 S    99,9  0,2 349:44   1 httpd
 8301 root   16  0 1320 1320   868 R     4,4  0,0   0:39   2 top

-William
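(Those load figures can also be read straight from /proc/loadavg, whose
fourth field is the runnable/total task count - handy for scripting the
kind of check Ingo describes. A minimal sketch, assuming the standard
2.4/2.6 format "0.61 0.25 0.19 2/74 12345":)

/* loadavg.c - print load averages and runnable/total task counts */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/loadavg", "r");
	double l1, l5, l15;
	int running, total;

	if (!f) {
		perror("/proc/loadavg");
		return 1;
	}
	if (fscanf(f, "%lf %lf %lf %d/%d",
		   &l1, &l5, &l15, &running, &total) == 5)
		printf("load: %.2f %.2f %.2f, runnable: %d of %d tasks\n",
		       l1, l5, l15, running, total);
	fclose(f);
	return 0;
}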
On Fri, 2004-04-30 at 22:41, Ingo Molnar wrote:
> * William Lovaton wrote:
>
> > Load average is a good number to look at. The thing is that a 1.5 load
> > means there are almost 2 processes in execution state. [...]
>
> the load also includes processes in 'uninterruptible sleep' - i.e.
> processes that are waiting for some sort of definitive, driver-related
> IO event, such as disk IO or network IO.
>
> Newer kernels (2.6, or vendor kernels with the 'iowait patch'
> backported) also have the iowait stat:
>
>   05:17:02 up 1 day, 21:43, 15 users, load average: 0.61, 0.25, 0.19
>   74 processes: 72 sleeping, 2 running, 0 zombie, 0 stopped
>   CPU states:  cpu   user   nice  system   irq  softirq  iowait   idle
>               total   3.4%   0.0%    5.2%  0.0%     0.0%   10.0%  81.4%
>
> ('softirq' overhead is typically caused by networking, 'iowait' is idle
> time while there is pending network/disk IO, and 'idle' is pure idle
> time when nothing is happening in the system.)
>
> > > > > Machine is pushing 25.34mb/sec
> > > > How do you get this number?? (25.34mb/sec)
> > > right off the switch port.
> >
> > Neat! I'll talk with the net guy here. ;-)
>
> there are soft stats on the Linux side too:
>
>   sar -n DEV 10 0
>
> will display summary per-interface tx/rx statistics every 10 seconds.
> Also, 'iptraf' is a pretty handy tool for simple traffic analysis.
>
> 	Ingo

From mcd at daviesinc.com  Mon May  3 14:56:59 2004
From: mcd at daviesinc.com (Chris Davies)
Date: Mon, 03 May 2004 10:56:59 -0400
Subject: Benchmarks
In-Reply-To: <1083592818.18080.104.camel@localhost.localdomain>
References: <40914BA6.4080700@pcextremist.com>
	<1083265271.1009.28.camel@mcdlp.pbi.daviesinc.com>
	<4091591E.20205@pcextremist.com>
	<1083268075.16885.37.camel@localhost.localdomain>
	<1083269442.1009.37.camel@mcdlp.pbi.daviesinc.com>
	<1083274784.16881.50.camel@localhost.localdomain>
	<1083339749.5390.22.camel@mcdlp.pbi.daviesinc.com>
	<1083342694.18079.45.camel@localhost.localdomain>
	<20040501034149.GA19200@elte.hu>
	<1083592818.18080.104.camel@localhost.localdomain>
Message-ID: <1083596218.18014.220.camel@mcdlp.pbi.daviesinc.com>

try

  ps aux --forest

sar is probably in sysstat.

iptraf should probably be in its own package called iptraf.

On Mon, 2004-05-03 at 10:00, William Lovaton wrote:
> Hi Ingo,
>
> Very useful explanation, thank you. Which packages provide sar and
> iptraf? I can't manage to find them.
>
> Right now I have this on my system... could you give us some analysis,
> please? (RH9 2.4.20-28.9smp)
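(The per-interface counters that sar -n DEV and iptraf report come from
/proc/net/dev, so the raw numbers are available even before either
package is installed. A minimal sketch, assuming the 2.4/2.6 layout:
two header lines, then one line per interface with rx bytes as the
first counter and tx bytes as the ninth:)

/* netdev.c - print per-interface rx/tx byte counters */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/proc/net/dev", "r");
	char line[512], iface[32];
	unsigned long long rx, tx;
	unsigned long skip;
	int lineno = 0;

	if (!f) {
		perror("/proc/net/dev");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		if (++lineno <= 2)	/* skip the two header lines */
			continue;
		if (sscanf(line,
			   " %31[^:]: %llu %lu %lu %lu %lu %lu %lu %lu %llu",
			   iface, &rx, &skip, &skip, &skip, &skip,
			   &skip, &skip, &skip, &tx) == 10)
			printf("%-8s rx %llu bytes, tx %llu bytes\n",
			       iface, rx, tx);
	}
	fclose(f);
	return 0;
}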
From alex.kiernan at thus.net  Tue May  4 12:56:53 2004
From: alex.kiernan at thus.net (Alex Kiernan)
Date: 04 May 2004 13:56:53 +0100
Subject: [patch] tux3-2.6.5-A3
In-Reply-To: <20040422161948.GG22027@krispykreme>
References: <20040419070756.GA15513@elte.hu>
	<20040422161948.GG22027@krispykreme>
Message-ID: <72d65kwdt6.fsf@alexk.eng.demon.net>

Anton Blanchard writes:

> > the latest Tux patch merged to 2.6.5 is available at:
> >
> >   redhat.com/~mingo/TUX-patches/tux3-2.6.5-A3
> >
> > this patch also includes the x86-64 fixes from Alex Kiernan.
>
> I removed almost all of the in kernel syscalls on ppc64 and it broke
> tux. I think we can use syscalls.h now and call them directly.

Works for me on x86-64 - I added in this fragment to remove unused
stuff:

--- linux-2.6.5/include/asm-x86_64/unistd.h.orig	2004-04-26 09:07:51.502573104 +0000
+++ linux-2.6.5/include/asm-x86_64/unistd.h	2004-04-26 09:08:03.705717944 +0000
@@ -724,16 +724,6 @@
 	return sys_wait4(pid, wait_stat, flags, NULL);
 }
 
-static inline long chroot(const char *filename)
-{
-	return sys_chroot(filename);
-}
-
-static inline long chdir(const char *filename)
-{
-	return sys_chdir(filename);
-}
-
 extern long sys_mmap(unsigned long addr, unsigned long len,
 		unsigned long prot, unsigned long flags,
 		unsigned long fd, unsigned long off);

-- 
Alex Kiernan, Principal Engineer, Development, THUS plc
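(For readers outside the kernel: the inline wrappers being removed just
forward to the sys_* entry points, which <linux/syscalls.h> now
declares, hence "call them directly". The closest userspace analogue of
a direct call is the syscall(2) interface - a purely illustrative
sketch, not the in-kernel mechanism itself:)

/* rawcall.c - invoke a system call via syscall(2) instead of the
 * libc wrapper; analogous in spirit to calling sys_*() in-kernel */
#include <stdio.h>
#include <unistd.h>
#include <sys/syscall.h>

int main(void)
{
	long pid = syscall(SYS_getpid);	/* raw getpid */

	printf("pid via syscall(SYS_getpid): %ld\n", pid);
	return 0;
}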
From mcd at daviesinc.com  Wed May  5 16:02:41 2004
From: mcd at daviesinc.com (Chris Davies)
Date: Wed, 05 May 2004 12:02:41 -0400
Subject: Benchmarks
In-Reply-To: <1083590525.18084.86.camel@localhost.localdomain>
References: <40914BA6.4080700@pcextremist.com>
	<1083265271.1009.28.camel@mcdlp.pbi.daviesinc.com>
	<4091591E.20205@pcextremist.com>
	<1083268075.16885.37.camel@localhost.localdomain>
	<1083269442.1009.37.camel@mcdlp.pbi.daviesinc.com>
	<1083274784.16881.50.camel@localhost.localdomain>
	<1083339749.5390.22.camel@mcdlp.pbi.daviesinc.com>
	<1083342694.18079.45.camel@localhost.localdomain>
	<1083352140.5390.71.camel@mcdlp.pbi.daviesinc.com>
	<1083590525.18084.86.camel@localhost.localdomain>
Message-ID: <1083772960.18014.559.camel@mcdlp.pbi.daviesinc.com>

On Mon, 2004-05-03 at 09:22, William Lovaton wrote:
> The problem seems to be especially network I/O between the web server
> and the database server; there seems to be a little glitch in transfer
> speed. A few months ago we had a serious slowness problem, and it
> turned out that the NIC that connects to the DB had (for some reason)
> switched to half duplex. We went crazy trying to find the problem.

why not fix the ports at 100Mb full duplex, with no auto-negotiation?
You would need to do this on the switch and on the machine. I believe
mii-tool will allow you to force a connection to 100baseTx-FD.

> But right now I guess the problem is starting to be CPU power. I'm
> really looking forward to Fedora Core 2 to try the new kernel.

looking again at what you posted with top, cpu is not your problem.

From mcd at daviesinc.com  Wed May  5 16:09:18 2004
From: mcd at daviesinc.com (Chris Davies)
Date: Wed, 05 May 2004 12:09:18 -0400
Subject: Using Tux and always forcing particular IPs to the backend
Message-ID: <1083773358.18014.567.camel@mcdlp.pbi.daviesinc.com>

Since it appears no one has used Tux & CGI, I've dropped that for now.
Tux & PHP is proving to be just as troublesome.

The Listen commands for tux just do not seem to behave the way I would
expect. Tux binds to the right port, but doesn't seem to hand off to
the backend on a 404. Permissions don't seem to solve the problem I
need to handle, so the only possibility left is to let the other
traffic pass through tux and always 404, so that the backend will serve
it.

So, has anyone done real-life testing where every request on a
particular IP goes to the backend? I haven't noticed that tux adds much
latency for things that do get handed off, but before I go in that
direction, has anyone done this? On one machine, I expect that only
2mb/sec of traffic will get passed to the backend.

I just wish there was a better way to do it, but, poking through the
code, I don't see any reason it would act differently when not
listening on http://0.0.0.0:80

Any feedback before I do this?

From sitz at onastick.net  Wed May  5 19:53:41 2004
From: sitz at onastick.net (Noah)
Date: Wed, 5 May 2004 15:53:41 -0400
Subject: Using Tux and always forcing particular IPs to the backend
In-Reply-To: <1083773358.18014.567.camel@mcdlp.pbi.daviesinc.com>
References: <1083773358.18014.567.camel@mcdlp.pbi.daviesinc.com>
Message-ID: <20040505195339.GA19203@radu.onastick.net>

On Wed, May 05, 2004 at 12:09:18PM -0400, Chris Davies wrote:
> So, has anyone done real-life testing where every request on a
> particular IP goes to the backend? I haven't noticed that tux adds much
> latency for things that do get handed off, but before I go in that
> direction, has anyone done this?

FWIW, I had, at one point, hacked tux to hand off all requests for a
particular 'Host:' header to the back-end webserver. It was a fairly
small patch - just a few lines. I only needed to do it for one host
header, so I just hardcoded the host I wanted to hand off into tux.

Being able to maintain a list of Host: headers that tux will always
hand to the back-end server would be a really useful feature,
especially for those of us who use name-based vhosting and want to run
mailman (yes, you can alter mailman so the CGIs have extensions, but
that's just a little icky IMO =) ).

--n

-- 
DON'T BUY CELERY if you have pink underwear!
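(The hack Noah describes boils down to a case-insensitive compare on
the parsed Host: header. A purely illustrative userspace sketch of that
check - the helper name and hostnames are made up, and this is not the
actual TUX patch:)

/* hostmatch.c - illustrative sketch of a hardcoded Host: header check */
#include <stdio.h>
#include <strings.h>	/* strcasecmp */

/* hypothetical helper: should this request go to the back-end? */
static int should_hand_off(const char *host_header)
{
	/* hardcoded, as in Noah's hack; a configurable list of
	 * headers would be the feature he asks for */
	return strcasecmp(host_header, "lists.example.com") == 0;
}

int main(void)
{
	const char *hosts[] = { "lists.example.com", "www.example.com" };
	int i;

	for (i = 0; i < 2; i++)
		printf("Host: %-18s -> %s\n", hosts[i],
		       should_hand_off(hosts[i]) ? "hand off to back-end"
						 : "serve via tux");
	return 0;
}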