boot hang with 2.6.11-1.27_FC3 & DHCP (& ypbind)
Todd Allen
todd.allen at attglobal.net
Tue May 24 00:21:04 UTC 2005
I installed the kernel-2.6.11-1.27_FC3 rpm today, and ran into trouble. The
boot hung while trying to bring up the eth0 network device. The device
actually came up fine, as I could ping the machine from elsewhere, but it
hung doing some of the post-processing after bringing up the device.
The important thing is that this device was configured to use DHCP.
Oddly enough, I was unable to reproduce the problem by running the individual
/etc/rc.d/init.d/* scripts in single-user mode. So I had to debug it by
rebooting to runlevel 5 several dozen times, with "bash -x" and
"exec > /dev/console 2>&1" added to the relevant boot scripts. Here are the
events leading up to the hang, as I tracked them down:
    /etc/rc.d/init.d/network start which runs:
      ifup eth0 which runs:
        dhclient which runs:
          /sbin/dhclient-script which runs:
            ypbind start which runs:
              rpcinfo -p | fgrep -q ypbind
Strictly speaking, "ypbind start" tries the rpcinfo command up to 20 times
and then gives up.
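
The retry behavior described above can be sketched roughly as follows. This
is a minimal illustration, not the actual ypbind init script: the function
names wait_for_probe and probe_ypbind are hypothetical, and the real script
may sleep between attempts or differ in other ways. Only the rpcinfo
pipeline itself is taken from the trace.

```shell
#!/bin/sh
# Hypothetical sketch of a bounded retry loop (not the real ypbind script).
# Usage: wait_for_probe MAX_TRIES COMMAND [ARGS...]
# Returns 0 as soon as COMMAND succeeds, 1 after MAX_TRIES failures.
wait_for_probe() {
    max_tries=$1
    shift
    try=1
    while [ "$try" -le "$max_tries" ]; do
        # Each attempt blocks until the command returns. The loop limits
        # the number of attempts, not the time spent in any one of them.
        if "$@"; then
            return 0
        fi
        try=$((try + 1))
    done
    return 1
}

# The check from the trace, wrapped in a (hypothetical) helper function.
probe_ypbind() {
    rpcinfo -p 2>/dev/null | fgrep -q ypbind
}
```

It would be called as "wait_for_probe 20 probe_ypbind". Note that a loop of
this shape only terminates if every attempt returns: 20 quick failures end
the loop, but a single attempt that never returns blocks it indefinitely.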
With the old 2.6.11-1.14_FC3, the rpcinfo command fails pretty quickly with:
rpcinfo: can't contact portmapper: RPC: Remote system error - Connection refused
Presumably, networking isn't sufficiently "up" for this to work yet, and the
fact that it fails like this is crucial to the boot process.
But with the new 2.6.11-1.27_FC3, the rpcinfo just hangs, and so the whole
boot process hangs.
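
To tell the two behaviors apart without hanging the shell, something like
the following might help. It is only a sketch under assumptions: the helper
name classify_probe is made up, and it relies on the GNU coreutils "timeout"
command, which may not be present on an FC3 system. It also cannot rescue a
process stuck in uninterruptible sleep, since such a process does not respond
to the signal that timeout sends.

```shell
#!/bin/sh
# Hypothetical diagnostic helper (not part of any Fedora boot script).
# Assumes GNU coreutils `timeout`, which exits with status 124 when the
# time limit expires.
# Usage: classify_probe SECONDS COMMAND [ARGS...]
# Prints "ok", "failed" (command returned non-zero), or "hung" (timed out).
classify_probe() {
    limit=$1
    shift
    # COMMAND must be an external program; timeout cannot run shell functions.
    timeout "$limit" "$@" >/dev/null 2>&1
    status=$?
    if [ "$status" -eq 0 ]; then
        echo ok
    elif [ "$status" -eq 124 ]; then
        echo hung
    else
        echo failed
    fi
}
```

For example, "classify_probe 5 rpcinfo -p" would be expected to print
"failed" in the quick "Connection refused" case and "hung" if rpcinfo does
not return within 5 seconds.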
--
Todd Allen