Why is my load ave so high now? [Now I know why!]
Rick Stevens
ricks at nerd.com
Tue Jul 28 18:00:58 UTC 2009
Kevin J. Cummings wrote:
> On 07/27/2009 02:26 PM, Rick Stevens wrote:
>> You see a bunch of NFS-related things in a "D" state and you wonder why
>> it's slow?
>
> Yes. Mostly because the machine accessing the NFS mounts has been
> re-booted a couple of times.
>
>> If you have processes in an I/O wait (a.k.a. "D") state, that'll bog
>> stuff down badly...especially if the NFS mounts are mounted "hard".
>
> Well, tonight I rebooted the server with NFS turned off. When it
> booted, I saw a load average between 1 and 2. That's all. When it
> re-booted, ivtv started back up, despite my blacklisting it and removing
> it from modprobe.conf. However, ivtvfb did not get installed.
> I also noticed that BOINC started right up again. With astropulse
> grabbing all the idle cpu time, my load average was still between 1 and 2.
>
> So, I decided that NFS was my problem, but I'm still not sure why.
>
> So, I tried a couple of things. My laptop references a few directories
> on my server via NFS and autofs.
>
> So, I started nfs again on the server (service nfs start)
>
> Load average remains between 1 and 2. So far so good.
>
> From the laptop, I did a "cd /net/kjc386". I can then do an ls and see
> all of the exported filesystems. Continues to look good.
>
> "ls home" lists the directories in the server's exported /home dir.
> nfs does the work, and disappears from the top -i that I have running.
> Great.
>
> Next I do a "ls c:" to look at the old WINDOWS partition on my server.
> HANG! I can't interrupt the ls with ^C nor ^Z. I have to kill it from
> another process. When I do, the hung nfs processes on the server stay
> hung. Once all 8 allowed nfs processes are hung, no further NFS
> requests to the server succeed, and the load average climbs roughly 1
> per nfs process (I watched the load average increase with each new nfs
> process that appeared).
>
> So, I guess my question is what's broken with NFS between my F11 laptop
> and the F10 server????

I could see where "ls c:" might be interpreted by the system as trying
to find an NFS machine called "c". An NFS mount command is:

    mount -t nfs server:/sharename /mountpoint

Perhaps F11 is trying to invoke an automount of an NFS share from server
"c" to satisfy your "ls" command. That'd be wild!
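
To make the guess above concrete, here is a rough Python sketch of how a
name such as "c:" would split under the "server:/sharename" convention.
The function split_nfs_spec is hypothetical and only illustrates the
syntax; it is not how autofs or the F11 NFS client actually parses names.

```python
def split_nfs_spec(spec):
    """Split 'host:/path' style text into (host, path).

    Hypothetical helper for illustration only. Returns None when the
    text has no colon or nothing before the colon.
    """
    host, sep, path = spec.partition(":")
    if not sep or not host:
        return None
    return host, path


# Compare a normal NFS spec, the troublesome "c:", and a plain name.
for name in ["server:/sharename", "c:", "home"]:
    print(name, "->", split_nfs_spec(name))
```

Under that reading, "c:" looks like host "c" with an empty path, while
"home" has no host part at all.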

I haven't tried this. Perhaps you've found a very subtle bug in F11's
NFS client implementation. Could you run wireshark or tcpdump and
watch for NFS traffic when you do that "ls c:" command? If you do see
traffic, then I'd file a bugzilla PDQ (pretty damned quick).
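
While the capture is running, it may also help to confirm which
processes on the server are stuck in the "D" state, since on Linux
processes in uninterruptible sleep count toward the load average. The
following is a minimal Python sketch, assuming a Linux /proc filesystem;
the function names parse_stat_line and d_state_processes are made up for
this example.

```python
import os


def parse_stat_line(line):
    """Return (pid, comm, state) from the text of a /proc/<pid>/stat file."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the first '(' and the last ')'.
    left = line.index("(")
    right = line.rindex(")")
    pid = int(line[:left].strip())
    comm = line[left + 1:right]
    state = line[right + 1:].split()[0]
    return pid, comm, state


def d_state_processes(proc_root="/proc"):
    """List (pid, comm) for processes currently in state 'D'."""
    found = []
    if not os.path.isdir(proc_root):
        return found
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state = parse_stat_line(f.read())
        except (OSError, ValueError):
            continue  # process exited or its stat file was unreadable
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in d_state_processes():
        print(pid, comm)
```

If the hung nfs processes show up in that list and the count tracks the
load average, that would match what you observed.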
----------------------------------------------------------------------
- Rick Stevens, Systems Engineer ricks at nerd.com -
- AIM/Skype: therps2 ICQ: 22643734 Yahoo: origrps2 -
- -
- "People tell me I look at the dark side. That's not true. I have -
- the heart of a small boy......in a jar right here on my desk." -
- -- Stephen King -
----------------------------------------------------------------------