network stalls on Fedora Core 3

Keith Fetterman kfetterman at go2marine.com
Fri Apr 15 06:17:16 UTC 2005


Folks,

I continued my attempts to diagnose the source of my network stalls. 
Here is what I found out:

- The problem only occurs when downloading large files over a T1.  The 
problem does not occur when transferring files between two computers on 
the same 100BaseT switched subnet.  I successfully copied a 2.5GB file 
between the Fedora Core 3 (FC3) system and another Linux box.

We have a point-to-point T1 that connects our office to our co-location 
facility, where our production servers and our route to the Internet are 
located.  The problem occurs when downloading large files (20MB or more) 
over this T1.  It happens randomly, but I usually get the first 10MB 
before the network stalls.
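
For anyone who wants to pin down where a transfer stops advancing, here 
is a minimal Python sketch.  It assumes you already have a list of 
(seconds, bytes received) samples, for example from polling the size of 
the partially downloaded file.  The function name and the 30-second 
threshold are made up for illustration.

```python
def find_stall(samples, stall_seconds=30.0):
    """Return (time, bytes) of the last progress before a stall, or None.

    samples: list of (timestamp_seconds, bytes_received) in time order
             (hypothetical input, collected however is convenient).
    A stall is reported once the byte count has not increased for at
    least stall_seconds (assumed threshold).
    """
    if not samples:
        return None
    last_time, last_bytes = samples[0]
    for t, received in samples[1:]:
        if received > last_bytes:
            # Progress was made; remember when and how far we got.
            last_time, last_bytes = t, received
        elif t - last_time >= stall_seconds:
            return (last_time, last_bytes)
    return None


if __name__ == "__main__":
    # Made-up samples: progress stops at 4,000,000 bytes after 20 seconds.
    samples = [(0, 0), (10, 2_000_000), (20, 4_000_000),
               (30, 4_000_000), (40, 4_000_000), (50, 4_000_000)]
    print(find_stall(samples))
```

Running it against samples from several downloads would show whether 
the stall point really is random or clusters around a particular size.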

Using Ethereal, I discovered what I think is the FC3 system failing to 
recover after receiving a bad TCP packet.  When the FC3 system receives 
a bad TCP packet, it sends a response to the remote server requesting 
that it resend the packet.  The remote server does, but then the FC3 
system sends the request again for the same packet.  It does this 
several times and then gives up.  I don't think the FC3 OS is 
processing the resent packet properly, so it retries several times 
before eventually giving up on the download.  The lower-level OS call 
doesn't return a failure; it just stops responding, which is why at the 
scp level it looks like a stall.
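
To make that pattern easier to spot in a long capture, here is a minimal 
Python sketch that scans the ACK numbers sent by the receiving host and 
reports runs where the same ACK is repeated.  It assumes the ACK numbers 
have already been copied out of the capture into a plain list.  The 
function name, the sample values, and the threshold of three repeats are 
illustrative assumptions, not anything Ethereal provides.

```python
from itertools import groupby


def find_duplicate_ack_runs(ack_numbers, min_repeats=3):
    """Return (ack_number, count) for each run of identical consecutive ACKs.

    ack_numbers: ACK values sent by the receiver, in capture order
                 (hypothetical input, exported by hand from the capture).
    min_repeats: how many times in a row the same ACK must appear before
                 the run is reported (assumed threshold).
    """
    runs = []
    for ack, group in groupby(ack_numbers):
        count = sum(1 for _ in group)
        if count >= min_repeats:
            runs.append((ack, count))
    return runs


if __name__ == "__main__":
    # Made-up ACK numbers: the receiver keeps asking for byte 4345.
    sample = [1449, 2897, 4345, 4345, 4345, 4345, 5793]
    for ack, count in find_duplicate_ack_runs(sample):
        print(f"ACK {ack} repeated {count} times")
```

On a healthy connection a run like this ends once the retransmitted 
segment arrives and the ACK number moves forward.  In the stalled case 
described above, the capture would end with the same ACK still being 
repeated.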

Our RedHat Enterprise 3 WS systems are handling this problem correctly, 
i.e., when they receive a bad packet, they send the response requesting 
a resend.  They receive the retransmitted packet and then continue the 
download.

Tonight, on the computer that was having the problem, I replaced the 
Fedora Core 3 OS with RedHat Enterprise 3 (RHE3) OS.  The problem did 
not exist with RHE3.  I was able to download 10 - 40MB files over the T1 
without a problem.  So I think the problem is with FC3.

Does anyone have any idea why the FC3 OS might be having this problem, 
what I can do about it, or who I should report the problem to?

Thanks,
Keith





Shawn Iverson wrote:
> On Wednesday, April 06, 2005 6:25 PM Keith Fetterman wrote:
> 
>>Folks,
>>
>>I am encountering a random problem where network (100BaseT ethernet) 
>>connections stall when downloading files.  I first noticed a problem 
>>when updating my system using up2date.  I discovered the update process 
>>would hang.
>>
>>I am now seeing the network stalling when I am downloading a large file, 
>>roughly 41MB, using scp.  The point at which it stalls is random and it 
>>doesn't always stall (but it usually does.) When it stalls, it usually 
>>stalls after 20-30MB have been transferred.
>>
> 
> 
> Are you going through a SOHO router?  If you are, temporarily bypass
> your router and connect directly to your service provider (after turning
> on the firewall, of course) and then check for stalling.  This will tell
> you if your router is the source of the problem.  I have similar
> symptoms on the BEFSR11.  The only workaround I have found is to connect
> directly to the router with a patch cable and force the speed to 10Mbit
> with mii-tool, thereby forcing the Linksys also to 10Mbit.
> 
> --
> 
> Shawn 
> 



