
Re: events/0 thread in kernels 2.6.22*fc6 causing scheduling latency



On 27 Aug 2007, Mike Fleetwood wrote:
> On 27 Aug 2007, Mike Fleetwood wrote:
> > Hi,
> >
> > Since I upgraded my FC6 box from kernel-2.6.20-1.2962.fc6 to
> > kernel-2.6.22.1-32.fc6, and now 2.6.22.2-42.fc6, I am getting pauses
> > from the whole OS.  They last ~1 second and occur every few minutes.
> > Every application becomes unresponsive for the duration.  The 1 second
> > scheduler latency this causes is long enough for my music player to be
> > affected and the audio track to be interrupted.  This makes the fault
> > very easy to hear.  At the same time kernel thread events/0 seems to
> > use all the CPU time.  Here are the first few lines of top's output
> > when a pause happens:
> >  top - 21:39:41 up  1:17,  2 users,  load average: 1.05, 1.20, 1.24
> >  Tasks: 138 total,   4 running, 134 sleeping,   0 stopped,   0 zombie
> >  Cpu(s):  2.3%us, 29.3%sy, 68.4%ni,  0.0%id,  0.0%wa,  0.0%hi,  0.0%si,  0.0%st
> >  Mem:   2074880k total,  1355244k used,   719636k free,    62144k buffers
> >  Swap:  1004052k total,        0k used,  1004052k free,   882684k cached
> >
> >    PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> >   5525 mike      39  19  202m  81m 4016 R 68.4  4.0  13:07.46 hadcm3transum_5
> >      6 root      15  -5     0    0    0 S 28.5  0.0   0:53.40 events/0
> >   3619 mike      20   0 45620 8264 5856 S  0.9  0.4   0:18.58 xmms
> >   3318 root      20   0  328m  46m 8132 S  0.6  2.3   1:39.60 Xorg
> >   3524 mike      20   0 76728  23m 9640 S  0.6  1.2   1:13.15 bittorrent
> >   3557 mike      20   0  211m 115m  30m S  0.6  5.7   3:54.24 firefox-bin
> >   3489 mike      20   0 57240  23m  15m S  0.3  1.1   0:04.02 gnome-terminal
> >   5583 root      20   0  2204 1100  832 R  0.3  0.1   0:01.14 top
> > 28.5% CPU time used by events/0 of a single 3 second top refresh is
> > 0.85 seconds of CPU time.  Rebooting back to kernel 2.6.20 completely
> > fixes it.
> >
> > Has anyone else seen this issue?
> > Does anyone know what kernel thread events/0 does?
> > Could this be related to CFS newly introduced into Fedora's kernel 2.6.22?
> > Can anybody suggest how to fix this issue?
>
> Thanks for the replies so far.
>
> 1) Seems no one else has seen this issue.
> 2) The kernel has one events thread per CPU, numbered 0 upwards.  They are
> used to get kernel work done a little later; for example, a device driver
> interrupt handler might use one.  Ref:
> http://docs.blackfin.uclinux.org/doku.php?id=kernel_events
> 3) I have also just compiled Linus' kernel 2.6.22.2 and get the same
> OS wide 1 second pauses.  Therefore it's not a Fedora specific patch,
> such as CFS, causing the fault.
> 4) Now off to try a binary search of Linus' kernel releases between
> 2.6.20 and 2.6.22.2 to see when it got introduced.
>
> Still could be a kernel software fault, but as it seems no one else
> has this issue perhaps it is buggy hardware the kernel no longer
> handles as well.  (I don't think that I run any unusual software or
> uncommon hardware).  Could be a while before I finish binary searching
> kernel releases.
After lots of kernel compiling and testing I finally tracked down the
causes ...

1) Between Fedora kernels 2.6.20-1.2962.fc6 and 2.6.22.1-32.fc6 this
configuration change was the one which triggered the ~1 second OS pauses
to appear:
diff -y /boot/config-2.6.20-1.2962.fc6 /boot/config-2.6.22.1-22.fc6
...
CONFIG_RTC=y           | # CONFIG_RTC is not set
                       > CONFIG_GEN_RTC=y
                       > CONFIG_GEN_RTC_X=y
...

2) I also run chrony (http://chrony.sunsite.dk/) rather than ntp to
synchronise my machine's time.  (It monitors and adjusts the hardware RTC as
well as the kernel software clock, hence it accesses /dev/rtc.)  Running
chrony is probably quite rare, which would explain why no one else has seen
this issue.
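
The configuration comparison in point 1 can be narrowed down to just the
RTC options with a short script.  This is only a sketch: the function
names parse_kernel_config and changed_options are made up for
illustration, and it assumes the usual /boot/config-* layout of
"CONFIG_FOO=value" and "# CONFIG_FOO is not set" lines.

```python
import re


def parse_kernel_config(text):
    """Map each CONFIG_ option to its value; unset options map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.match(r"# (CONFIG_\w+) is not set$", line)
        if unset:
            options[unset.group(1)] = None
            continue
        assigned = re.match(r"(CONFIG_\w+)=(.*)$", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def changed_options(old_text, new_text, pattern="RTC"):
    """Return {option: (old, new)} for options containing `pattern` that differ.

    An option missing from a file is treated the same as "is not set",
    so both cases show up as None.
    """
    old = parse_kernel_config(old_text)
    new = parse_kernel_config(new_text)
    changes = {}
    for name in sorted(set(old) | set(new)):
        if pattern in name and old.get(name) != new.get(name):
            changes[name] = (old.get(name), new.get(name))
    return changes
```

To use it, read the two files named in the diff command above and pass
their contents to changed_options(); each entry in the result is one
option whose setting differs between the two kernels.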

I still want to understand what the above kernel configuration changes
actually do and which RTC related OS calls chrony is making to cause the
events thread to hog the CPU solidly for ~1 second.
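
For reference, this is roughly what a userspace access to /dev/rtc looks
like.  It is a sketch only: it issues the RTC_RD_TIME ioctl from
<linux/rtc.h>, which is just one of the RTC calls available, and nothing
here establishes which calls chrony actually makes.  The helper names
decode_rtc_time and read_rtc are made up for illustration, and the ioctl
number is computed with the _IOR() encoding used on x86.

```python
import fcntl
import struct
from datetime import datetime

# struct rtc_time in <linux/rtc.h> is nine C ints:
# sec, min, hour, mday, mon, year, wday, yday, isdst
RTC_TIME_FORMAT = "9i"


def _ior(type_char, nr, size):
    """Build a read-direction ioctl number (x86 _IOR() encoding assumed)."""
    return (2 << 30) | (size << 16) | (ord(type_char) << 8) | nr


RTC_RD_TIME = _ior("p", 0x09, struct.calcsize(RTC_TIME_FORMAT))


def decode_rtc_time(raw):
    """Convert a packed struct rtc_time into a naive datetime."""
    sec, minute, hour, mday, mon, year, _wday, _yday, _isdst = struct.unpack(
        RTC_TIME_FORMAT, raw
    )
    # tm-style fields: months count from 0, years count from 1900
    return datetime(year + 1900, mon + 1, mday, hour, minute, sec)


def read_rtc(path="/dev/rtc"):
    """Read the hardware clock once; normally needs root."""
    buf = bytearray(struct.calcsize(RTC_TIME_FORMAT))
    with open(path, "rb") as dev:
        fcntl.ioctl(dev, RTC_RD_TIME, buf)  # the kernel fills buf in place
    return decode_rtc_time(bytes(buf))
```

Calling read_rtc() as root returns the RTC's current setting.  The result
carries no timezone because the RTC may be kept in either UTC or local
time.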

Mike

