[Crash-utility] crash CPU bound waiting for user response

D. Hugh Redelmeier hugh at mimosa.com
Wed Jul 4 04:17:39 UTC 2007


| From: Dave Anderson <anderson at redhat.com>

| D. Hugh Redelmeier wrote:

| > ==> Worse: while it is awaiting my RETURN, it is burning 100% of the CPU!
| > 
| > Here is what "ps laxgwf" says about the crash process and its child.
| > 
| > F   UID   PID  PPID PRI  NI    VSZ   RSS WCHAN  STAT TTY        TIME COMMAND
| > 4     0  4426  4406  25   0 416812 332764 -     R+   pts/5     80:36
| > |               |           \_ crash --readnow
| > /usr/lib/debug/lib/modules/2.6.21-1.3228.fc7/vmlinux
| > /var/crash/2007-07-02-13:42/vmcore
| > 0     0  4989  4426  18   0  73976   740 -      S+   pts/5      0:00
| > |               |               \_ /usr/bin/less -E -X -Ps -- MORE --
| > forward\: <SPACE>, <ENTER> or j  backward\: b or k  quit\: q
| > 
| > strace of the crash process shows an infinite sequence of:
| >     wait4(4989, 0x7fffcd9cae90, WNOHANG, NULL) = 0
| >     wait4(4989, 0x7fffcd9cae90, WNOHANG, NULL) = 0
| >     wait4(4989, 0x7fffcd9cae90, WNOHANG, NULL) = 0
| >     wait4(4989, 0x7fffcd9cae90, WNOHANG, NULL) = 0
| > 
| > This is very wasteful.
| > 
| > There are other ways to get into this state.  Other places less is
| > being used and is waiting.  Probably wherever less is used even if it
| > isn't waiting.
| > 
| > I just tested: this problem exists when using a normal xterm.
| 
| Yeah, I have seen this on occasions, but I have never been able
| to reproduce it on demand.  There was a patch suggestion a while ago,
| but I deferred it until I could reliably reproduce it for testing
| before taking it in.

I've put gdb on the case.  The CPU burning that I'm currently experiencing is
in cmdline.c:restore_sanity.  The actual code in question is:
    while (!waitpid(pc->stdpipe_pid, &waitstatus, WNOHANG))
        ;
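That sure looks like a busy-wait: with WNOHANG, waitpid returns 0 right
away while the child is still running, so the loop spins until less exits.
That matches the endless wait4(...) = 0 lines in the strace output.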

If you execute this code, you should get a busy-wait too.
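
To make the busy-wait concrete, here is a small standalone sketch.  It is
not crash code: spawn_sleeper and count_idle_polls are names I made up for
illustration, and the child just sleeps in place of less waiting for a key.
The loop has the same shape as the one in restore_sanity, except that it
counts how often waitpid comes back with 0.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork a child that sleeps for `ms` milliseconds
 * (keep ms below 1000 so the usleep argument stays in range) and then
 * exits with status 0.  Returns the child's pid, or -1 if fork failed. */
static pid_t spawn_sleeper(unsigned int ms)
{
    pid_t pid = fork();

    if (pid == 0) {
        usleep(ms * 1000);
        _exit(0);
    }
    return pid;
}

/* Same shape as the loop in restore_sanity: poll with WNOHANG until the
 * child has been reaped (or waitpid fails).  Returns how many polls
 * came back 0, i.e. "child still running". */
static long count_idle_polls(pid_t pid)
{
    int waitstatus;
    long polls = 0;

    while (!waitpid(pid, &waitstatus, WNOHANG))
        polls++;
    return polls;
}
```

Calling count_idle_polls(spawn_sleeper(200)) should return a count that
grows with however long the child lives; every one of those polls is a
system call that did nothing useful.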

If you replaced WNOHANG with 0, I think that the wait would have the
same result but would block instead of spinning.  You would then want
to loop in the case where waitpid returns -1 with errno == EINTR.

Here's what I'd try (UNTESTED!):
    do ; while (waitpid(pc->stdpipe_pid, &waitstatus, 0) == -1 && errno == EINTR);

All the uses of WNOHANG in that function look suspicious.
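
For anyone who wants to try the blocking form outside of crash, here is the
same idea as a standalone sketch.  wait_for_child is a hypothetical helper
name, not something in cmdline.c, and like the one-liner above it has not
been tested against crash itself.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper, not part of crash: block until `pid` exits,
 * retrying only when a signal interrupts the wait (EINTR).
 * Returns the pid on success, or -1 with errno set (e.g. ECHILD). */
static pid_t wait_for_child(pid_t pid, int *waitstatus)
{
    pid_t r;

    do {
        r = waitpid(pid, waitstatus, 0);    /* 0: sleep until the child changes state */
    } while (r == -1 && errno == EINTR);

    return r;
}
```

In restore_sanity the call would presumably look like
wait_for_child(pc->stdpipe_pid, &waitstatus), with the process asleep in
the kernel until less exits rather than polling.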



