boot hang with 2.6.11-1.27_FC3 & DHCP (& ypbind)

Todd Allen todd.allen at attglobal.net
Tue May 24 00:21:04 UTC 2005


I installed the kernel-2.6.11-1.27_FC3 rpm today, and ran into trouble.  The
boot hung while trying to bring up the eth0 network device.  The device
actually came up fine, as I could ping the machine from elsewhere, but the
boot hung during some of the post-processing after bringing up the device.

The important thing is that this device was configured to use DHCP.

I was unable to reproduce the problem by running the individual
/etc/rc.d/init.d/* scripts in single-user mode, oddly enough.  So I had to
debug it by rebooting to runlevel 5 several dozen times, with "bash -x" and
"exec > /dev/console 2>&1" added to the relevant boot scripts.  Here are the
events leading up to the hang as I tracked it down:

   /etc/rc.d/init.d/network start   which runs:
   ifup eth0                        which runs:
   dhclient                         which runs:
   /sbin/dhclient-script            which runs:
   ypbind start                     which runs:
   rpcinfo -p | fgrep -q ypbind

Actually, the "ypbind start" tries the rpcinfo command 20 times and then
gives up.
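
To make the retry behavior concrete, here is a minimal sketch of a bounded
retry loop of the kind described above.  It is not the actual ypbind init
script: the function name wait_for_registration, the RETRY_DELAY variable,
and the one-second pause between attempts are all assumptions made for
illustration.  The point it shows is that the loop only gives up if every
attempt returns.

```shell
#!/bin/bash
# Sketch of a bounded retry loop, NOT the real ypbind init script.
# wait_for_registration and RETRY_DELAY are hypothetical names.

wait_for_registration() {
    local tries=$1
    shift
    local n=0
    while [ "$n" -lt "$tries" ]; do
        # Each attempt has to return for the loop to make progress.
        # A probe that blocks forever stalls the loop right here.
        if "$@"; then
            return 0
        fi
        n=$((n + 1))
        # Assumed pause between attempts (defaults to 1 second).
        sleep "${RETRY_DELAY:-1}"
    done
    return 1
}

# The probe from the chain above would be passed in like this:
# wait_for_registration 20 sh -c 'rpcinfo -p | fgrep -q ypbind'
```

With a probe that fails quickly, this returns non-zero after the given number
of attempts and the caller can carry on.  With a probe that never returns,
the attempt count is never reached.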

With the old 2.6.11-1.14_FC3, the rpcinfo command fails pretty quickly with:
   rpcinfo: can't contact portmapper: RPC: Remote system error - Connection refused
Presumably, networking isn't sufficiently "up" for this to work yet, and the
fact that it fails like this is crucial to the boot process.

But with the new 2.6.11-1.27_FC3, the rpcinfo just hangs.  And so the whole
boot process hangs.
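
One way a hanging probe could be turned back into a quick failure is to run
it under a time limit.  The following is only a sketch under stated
assumptions: it relies on the timeout(1) utility from GNU coreutils, which
may not be available on an FC3 system, and probe_with_limit is a
hypothetical helper name, not something taken from dhclient-script or the
ypbind init script.

```shell
#!/bin/bash
# Hypothetical guard: run one probe attempt under a time limit.
# Assumes GNU coreutils timeout(1) is installed.

probe_with_limit() {
    local limit=$1
    shift
    local status=0
    # timeout exits with status 124 when it had to kill the command.
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "probe timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}

# Possible use with the probe that hangs on the new kernel:
# probe_with_limit 5 sh -c 'rpcinfo -p | fgrep -q ypbind'
```

This only bounds how long the boot waits on rpcinfo.  It does not explain why
rpcinfo hangs on the new kernel instead of getting "Connection refused".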

-- 
Todd Allen



