[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]

TFTP and xinet -- was: Re: [K12OSN] Intolerable Boot Times



This piece is longish -- it is a work in progress in
describing troubleshooting tftp and xinetd issues with
two-part PXE booting.

On 27 Nov 2002, Paul Nelson wrote:

> One thing we noticed after updating to the RH8 based K12LTSP beta 3 is
> slower response with tftp. Before the update I could walk around the lab
> and turn on every machine at once (30) and they would all come up within
> 30 seconds or so. Now, the server bogs down when I do more than 4 at a
> time. As soon as the client gets the image and boots things are fine.
> 
> I'm still scratching my head on this one...

There are tuning issues with the RHL 8.0 xinetd -- under heavy
simultaneous load, it is not releasing terminated tftp-server
sessions cleanly as shipped;  this problem also surfaced with
RHL 7.3, with the xinetd security update.
   
-- the concluded tftp connection shows up with 
	netstat -a | grep tftp 
in FIN/WAIT state, or a 'respawning too fast' error appears in
/var/log/messages when things get 'out of whack'.

This is predictable (or at least not-unexpected behaviour
under the tftp RFC <smile>) because of the way a tftp PXE
client seeks out the correct image to load -- it works through
successive permutations of the IP address, spitting off these
log entries (This level of logging detail is visible with the
atftp tftp server mentioned infra.):

  File /tftpboot///lts/pxe/pxelinux.cfg/0AFA005E not found
  File /tftpboot///lts/pxe/pxelinux.cfg/0AFA005 not found
  File /tftpboot///lts/pxe/pxelinux.cfg/0AFA00 not found
  File /tftpboot///lts/pxe/pxelinux.cfg/0AFA0 not found
  File /tftpboot///lts/pxe/pxelinux.cfg/0AFA not found
  File /tftpboot///lts/pxe/pxelinux.cfg/0AF not found
  File /tftpboot///lts/pxe/pxelinux.cfg/0A not found
  File /tftpboot///lts/pxe/pxelinux.cfg/0 not found
  Serving /lts/pxe/pxelinux.cfg/default

This is not abnormal -- within the tftp protocol, it is how
the client figures out which bootstrap image to load, but
there are eight un-productive tftp sessions in a VERY short
time -- and THIS triggers the xinetd rate limiting to kick in 
when loading goes up.

The obvious solution is not NOT rate limit tftp connections --
but then the xinetd FIN/WAIT issue shows up, and the kernel
timeouts for terminating abandoned IP connections have to
expire before the xinetd concludes it is proper to offer new
connections again.  Reverting to an earlier xinetd without 
this issue is one approach.

I will be testing fixes on this a several score unit site next
week.  I run a couple levels of the xinetd, because of this
recent infirmity.

Alternatively, debugging tftp issues with tcpdump is easier
with this kind of sampling:

   tcpdump -i eth0 -nxX -s 0 port 69 | tee /tmp/replay-log.txt

So it can be matched against /var/log/messages with logging
turned way up. The stock RHL as-shipped tftp-server is not
very verbose -- The atftp variant allowed me to see where
things were falling apart.

There are at least four major open source tftpd's out there
which I have found -- I would ask that list members advise me
here or offlist of any other Open Source variants which exist:

1.  The one which ships with Red Hat is the ISC reference 
implementation, as I recall -- tftp

2.  The Allen Reese atftp -- best logging found

3.  The H Peter Arvin tftp-hpa -- Syslinux roots

-and- 4. The Uwe Ohse utftpd 

These vary on their handling of the tsize option, and so 
forth.  A mis-set tsize option was blocking booting at one 
network I debugged.

Each is GPL'd -- I am running three of the four at 
differing sites, with ThinkNic's and IntelEEPro-100 PXE 
clients

These are available in their latest incarnation, compiled
non-root on an RHL 8.0 stock install with all updates, at:

  ftp://ftp.owlriver.com//pub/local/ORC/XYZ/

where XYZ == atftp tftp tftp-hpa or utftpd

[herrold new ORC]$ ls -al | grep tftp
drwxr-xr-x    2 root     root         4096 Nov 21 17:20 atftp
drwxr-xr-x    2 root     root         4096 Nov 27 11:28 tftp
drwxr-xr-x    2 root     root         4096 Nov 12 14:26 tftp-hpa
drwxr-xr-x    2 root     root         4096 Nov 24 21:23 utftpd

Jim McQuillan's excellent outline, at section 6.3 is sparse on
tftp debugging hints.  This email is intended to be a resource
to extend that discussion, and offer some debugging tools to
the sysadmin.  

Also compounding matters, there are clear issues in the
various PXE clients out there, as mentioned in Arvin's
'gotcha's' webpage.  Sniffing the connection is about the only
way to debug it with the stock RH tftp-server.  The other 
tftp servers make it easier with more reasonable logging.

-- Russ Herrold

/var/log/messages session:

Nov 27 11:46:16 ftp dhcpd: DHCPDISCOVER from 00:02:b3:96:6c:d5 via eth0
Nov 27 11:46:17 ftp dhcpd: DHCPOFFER on 10.250.0.94 to 00:02:b3:96:6c:d5 via eth0
Nov 27 11:46:17 ftp dhcpd: DHCPREQUEST for 10.250.0.94 (10.250.0.253) from 00:02:b3:96:6c:d5 via eth0
Nov 27 11:46:17 ftp dhcpd: DHCPACK on 10.250.0.94 to 00:02:b3:96:6c:d5 via eth0
Nov 27 11:46:17 ftp tftpd[3456]: Trivial FTP server started (0.6)
Nov 27 11:46:17 ftp tftpd[3456]:   started by inetd
Nov 27 11:46:17 ftp tftpd[3456]:   logging level: 7
Nov 27 11:46:17 ftp tftpd[3456]:   directory: /tftpboot//
Nov 27 11:46:17 ftp tftpd[3456]:   user: nobody.nogroup
Nov 27 11:46:17 ftp tftpd[3456]:   log file: syslog
Nov 27 11:46:17 ftp tftpd[3456]:   server timeout: 300
Nov 27 11:46:17 ftp tftpd[3456]:   tftp retry timeout: 5
Nov 27 11:46:17 ftp tftpd[3456]:   maximum number of thread: 100
Nov 27 11:46:17 ftp tftpd[3456]:   option timeout:   enabled
Nov 27 11:46:17 ftp tftpd[3456]:   option tzise:     enabled
Nov 27 11:46:17 ftp tftpd[3456]:   option blksize:   disabled
Nov 27 11:46:18 ftp tftpd[3456]:   option multicast: enabled
Nov 27 11:46:18 ftp tftpd[3456]:      address range: 239.255.0.0-255
Nov 27 11:46:18 ftp tftpd[3456]:      port range:    1753
Nov 27 11:46:18 ftp tftpd[3458]: Serving /lts/pxe/pxelinux.bin 
	to 10.250.0.94:2070
Nov 27 11:46:18 ftp tftpd[3458]: tsize option -> 9364
Nov 27 11:46:19 ftp tftpd[3459]: Serving /lts/pxe/pxelinux.bin 
	to 10.250.0.94:2071
Nov 27 11:46:19 ftp tftpd[3458]: Server thread exiting
Nov 27 11:46:21 ftp tftpd[3459]: Server thread exiting
Nov 27 11:46:21 ftp tftpd[3463]: Serving 
	/lts/pxe/pxelinux.cfg/0AFA005E to 10.250.0.94:57217
Nov 27 11:46:21 ftp tftpd[3463]: File 
	/tftpboot///lts/pxe/pxelinux.cfg/0AFA005E not found
Nov 27 11:46:21 ftp tftpd[3463]: Server thread exiting
Nov 27 11:46:21 ftp tftpd[3464]: Serving 
	/lts/pxe/pxelinux.cfg/0AFA005 to 10.250.0.94:57090
Nov 27 11:46:21 ftp tftpd[3464]: File 
	/tftpboot///lts/pxe/pxelinux.cfg/0AFA005 not found
Nov 27 11:46:21 ftp tftpd[3464]: Server thread exiting
Nov 27 11:46:21 ftp tftpd[3465]: Serving 
	/lts/pxe/pxelinux.cfg/0AFA00 to 10.250.0.94:56963
Nov 27 11:46:21 ftp tftpd[3465]: File 
	/tftpboot///lts/pxe/pxelinux.cfg/0AFA00 not found
Nov 27 11:46:21 ftp tftpd[3466]: Serving 
	/lts/pxe/pxelinux.cfg/0AFA0 to 10.250.0.94:56836
Nov 27 11:46:21 ftp tftpd[3465]: Server thread exiting
Nov 27 11:46:21 ftp tftpd[3466]: File 
	/tftpboot///lts/pxe/pxelinux.cfg/0AFA0 not found
Nov 27 11:46:21 ftp tftpd[3467]: Serving 
	/lts/pxe/pxelinux.cfg/0AFA to 10.250.0.94:56709
Nov 27 11:46:22 ftp tftpd[3466]: Server thread exiting
Nov 27 11:46:22 ftp tftpd[3467]: File 
	/tftpboot///lts/pxe/pxelinux.cfg/0AFA not found
Nov 27 11:46:22 ftp tftpd[3468]: Serving 
	/lts/pxe/pxelinux.cfg/0AF to 10.250.0.94:56582
Nov 27 11:46:22 ftp tftpd[3467]: Server thread exiting
Nov 27 11:46:22 ftp tftpd[3468]: File 
	/tftpboot///lts/pxe/pxelinux.cfg/0AF not found
Nov 27 11:46:22 ftp tftpd[3469]: Serving 
	/lts/pxe/pxelinux.cfg/0A to 10.250.0.94:56455
Nov 27 11:46:22 ftp tftpd[3468]: Server thread exiting
Nov 27 11:46:23 ftp tftpd[3469]: File 
	/tftpboot///lts/pxe/pxelinux.cfg/0A not found
Nov 27 11:46:23 ftp tftpd[3470]: Serving 
	/lts/pxe/pxelinux.cfg/0 to 10.250.0.94:56328
Nov 27 11:46:23 ftp tftpd[3469]: Server thread exiting
Nov 27 11:46:23 ftp tftpd[3470]: File 
	/tftpboot///lts/pxe/pxelinux.cfg/0 not found
Nov 27 11:46:23 ftp tftpd[3471]: Serving 
	/lts/pxe/pxelinux.cfg/default to 10.250.0.94:56201
Nov 27 11:46:23 ftp tftpd[3470]: Server thread exiting
Nov 27 11:46:23 ftp tftpd[3471]: tsize option -> 169
Nov 27 11:46:23 ftp tftpd[3471]: Server thread exiting
Nov 27 11:46:23 ftp tftpd[3472]: Serving /lts/pxe/vmlinuz-2.4.18-ltsp-1.pxe to 10.250.0.94:56202
Nov 27 11:46:23 ftp tftpd[3472]: tsize option -> 665600
Nov 27 11:46:27 ftp tftpd[3473]: Serving /lts/pxe/initrd.lts to 10.250.0.94:56203
Nov 27 11:46:27 ftp tftpd[3473]: tsize option -> 733184
Nov 27 11:46:27 ftp tftpd[3472]: Server thread exiting
Nov 27 11:46:32 ftp tftpd[3473]: Server thread exiting
Nov 27 11:46:34 ftp dhcpd: DHCPDISCOVER from 00:02:b3:96:6c:d5 via eth0
Nov 27 11:46:34 ftp dhcpd: DHCPOFFER on 10.250.0.94 to 00:02:b3:96:6c:d5 via eth0
Nov 27 11:46:36 ftp dhcpd: DHCPREQUEST for 10.250.0.94 (10.250.0.253) from 00:02:b3:96:6c:d5 via eth0
Nov 27 11:46:36 ftp dhcpd: DHCPACK on 10.250.0.94 to 00:02:b3:96:6c:d5 via eth0
Nov 27 11:46:36 ftp rpc.mountd: authenticated mount request 
from dhcp-94.victim.lan.:688 for /usr/opt/ltsp/i386 (/usr/opt/ltsp/i386)






[Date Prev][Date Next]   [Thread Prev][Thread Next]   [Thread Index] [Date Index] [Author Index]