HZ value changed from 250 to 1000 in the latest updated kernel

Ralf Corsepius rc040203 at freenet.de
Tue Nov 21 06:24:24 UTC 2006


On Mon, 2006-11-20 at 17:11 -0400, John DeDourek wrote:
> I have dealt with this issue before.  I think that the effect is the
> result of rounding errors in the timing code in the kernel.  In any
> case, the effective frequency of the clock changes slightly when the
> Hz. is changed.  When starting ntpd, it uses the "remembered" effective
> frequency of the clock from the previous shutdown in the "drift file".
> (Actually, this remembers the frequency error from the "nominal" frequency.)
> In any case, when changing to a kernel with a different Hz., I had
> the best result when I deleted the drift file and let ntpd create
> a new one.  Otherwise, it appears, that it takes a while for ntpd
> to convince itself that the frequency of the clock has made a step
> change.  (Of course, if you just let it run a while, it should
> eventually convince itself of the new frequency; so if you're still
> having these problems after a while, then it is something different).

Let me describe what has happened:

I have one machine on which the original fc6-kernel doesn't boot, while
fc5-kernels run flawlessly. Therefore, after upgrading to fc6, I had
been running an fc5-kernel underneath fc6 on this machine. In this
setup, ntp had been working flawlessly, resulting in a driftfile
containing a value |drift| < 100.0.

At the time when the 1000Hz fc6-kernels had been released, I finally
managed to boot this machine with "noapic" for the first time.
After about 2 days of uptime, ntp had lost sync: "drift" had hit its
tolerance (>512).

I tried resetting drift to 0. 1-2 days later, "drift" hit the 512
tolerance again; ntp didn't converge. This was the situation last Friday.
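
To put these numbers into perspective, here is a rough sketch of the
arithmetic, assuming the drift value is a frequency offset in parts per
million (ppm). The function name is only illustrative; the three values
are the figures mentioned in this mail.

```python
SECONDS_PER_DAY = 86400


def ppm_to_seconds_per_day(ppm):
    """Clock error accumulated in one day by an uncorrected offset of `ppm`."""
    return ppm * SECONDS_PER_DAY / 1e6


# 80 and 100 are the "healthy" bounds reported in this mail,
# 512 is the tolerance figure reported in this mail.
for ppm in (80.0, 100.0, 512.0):
    print(f"{ppm:6.1f} ppm -> {ppm_to_seconds_per_day(ppm):.2f} s/day")
```

So a clock that is off by 100 ppm gains or loses roughly 8.6 seconds
per day if nothing corrects it, and one at 512 ppm roughly 44 seconds.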


Then I tried changing this machine's ntp setup to write the driftfile
every 10 minutes instead of every 60 minutes
(ntp.conf: driftfile /var/lib/ntp/drift 10).

This finally seems to have helped. At least, since introducing this
change, ntp seems to be able to sync again, and |drift| stays < 80.0.
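
For anybody wanting to watch the value on their own machine, a minimal
sketch of such a check follows. It assumes the drift file starts with a
single floating-point ppm value. The function names, the 0.9 warning
fraction and the status strings are made up for illustration; the
default limit of 512 is simply the figure I quoted above, not a value
taken from the ntp documentation.

```python
def parse_drift(text):
    """Return the frequency offset in ppm from the contents of a drift file."""
    # Assumption: the first whitespace-separated token is the ppm value.
    return float(text.split()[0])


def classify_drift(ppm, limit=512.0, warn_fraction=0.9):
    """Classify a drift value against a limit (hypothetical thresholds)."""
    magnitude = abs(ppm)
    if magnitude >= limit:
        return "at limit"
    if magnitude >= warn_fraction * limit:
        return "near limit"
    return "ok"
```

With the path from my ntp.conf, it could be used as
classify_drift(parse_drift(open("/var/lib/ntp/drift").read())).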

> If you want to convince yourself of the issue, delete (or rename)
> the drift file, and run the 250 Hz. kernel a while; then record
> the contents of the drift file.  Repeat with the 1000 Hz. kernel.
> I for one, would be interested in the result, since it has been
> a significant time since I ran these tests, and the kernel clock
> code has changed significantly since then.
That's very similar to what I did.
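
If somebody repeats the experiment, the two recorded values could be
compared along these lines. This is only a sketch under a strong
simplification: it assumes ntpd keeps applying the old drift value
unchanged, whereas in reality ntpd adjusts its frequency estimate
continuously. The function names are hypothetical.

```python
def frequency_step(old_ppm, new_ppm):
    """Difference in ppm between the drift values recorded under two kernels."""
    return new_ppm - old_ppm


def error_from_stale_drift(old_ppm, new_ppm, hours):
    """Offset in seconds accumulated over `hours` if the old drift value
    is still applied while the clock really runs at the new one.

    Simplification: ignores that ntpd keeps correcting its estimate.
    """
    return frequency_step(old_ppm, new_ppm) * 1e-6 * hours * 3600
```

The larger the step between the two kernels, the longer one would
expect ntpd to need before it settles on the new frequency.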

> Dave Jones wrote:
> 
> > On Mon, Nov 20, 2006 at 04:31:42PM +0100, Ralf Corsepius wrote:
> >  > On Sun, 2006-11-19 at 21:03 -0600, Callum Lerwick wrote:
> >  > > On Tue, 2006-11-14 at 18:44 +0300, Dmitry Butskoy wrote:
> >  > > > The latest updated kernel has another HZ value (1000 instead of 250), 
> >  > > > according to:
> >  > > > 
> >  > > > > * Thu Nov 9 2006 Dave Jones
> >  > > > > - Change HZ to 1000 for increased accuracy.
> >  > > > > (Except in Xen, where it stays at 250 for now).
> >  > > 
> >  > > Woohoo, Rosegarden (which I maintain in Extras) doesn't bitch anymore!
> >  > 
> >  > Could it be this change also had an impact on ntpd?
> >  > 
> >  > At least, since this change I am facing severe problems with my ntp
> >  > setup (ntp clients are drifting away and have problems to sync).
> >  > 
> >  > Or is this just a random coincidence?
> > 
> > A coincidence I hope.  I'm not sure how increased timing resolution could
> > cause the drifting effects you've observed.  I've also not noticed
> > any other similar reports (yet?)
Well, there have been some "vague/unclear" reports of ntp issues on
fedora-users@, which could be read as falling into the same class of
issue. Unfortunately, this is all a rather "soft issue" that is very
hard to grasp.

Maybe all this is a side effect of a different issue? I don't know[1].

Ralf

[1] NetworkManager also seems to be a likely candidate to me.





