Clocks broken on Mac mini
Hal Murray
hmurray at megapathdsl.net
Tue Jan 31 23:16:42 UTC 2017
gem at rellim.com said:
> I think we are heading the wrong way on this. It is not an Open Firmware
> problem and it is not a Mac problem. I have seen this before with NTP
> Classic. I have verified before that Chronyd does not have the same issue.
> There was a time I used to buy and provision 10 identical servers at a
> time. Of the 10, one often had this problem. The fix was chrony.
If you ever get another system where ntpd doesn't work but chrony does, I'd
like to investigate.
> Sometimes the clock needs to be pulled more than ntpd expects and ntpd gives
> up just before it succeeds.
> I think this is clearly an ntpd bug.
I think it's more complicated than that.
There are long-standing occasional reports of NTP Classic getting stuck with
a drift of 500 ppm and not being able to recover by itself. It does recover if
you delete the drift file and restart ntpd. I assume our code will do the
same thing.
I think your "gives up just before" is off by quite a bit.
ntpd has a limit of 500 ppm. The kernel may have a similar limit. I haven't
dived into that corner of the kernel source yet. A limit may be a good idea
to prevent the system getting into strange modes where it jumps around rather
than converging smoothly. I'm not enough of a PLL geek to explain that area
but I know that it's easy to get into oscillations.
I hacked our code to have a limit of 2500 ppm. It didn't work. Somebody claimed
it was running at 500 ppm. I don't know if that's the kernel or some corner
of our code that I didn't catch. (I think there are also a few lines of code
in libc.)
I think it's reasonable for ntpd to expect the kernel clock to be reasonably
close. We could have an interesting discussion about how close is
reasonable, but most systems get well within 500 ppm. If not, it's usually a bug
in the setup/calibration chain someplace. There are lots of opportunities to
get something wrong. It's easy to get close-enough so that you won't notice
unless you are interested in time and go looking for troubles.
In this case, I was half looking, but didn't notice the problem for quite a
while. Sure, it's easy to spot after you know what to look for, but it does
work well enough to hide any gross errors. I didn't get around to making
graphs where it would have been obvious.
> So what is your 'adjtimex -p' output?
mode: 0
offset: -6106235
frequency: 565962
maxerror: 52530
esterror: 1820
status: 8193
time_constant: 6
precision: 1
tolerance: 32768000
tick: 10029
raw time: 1485903664s 513432740us = 1485903664.513432740
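For anyone else who isn't familiar with these fields, here is a rough sketch
of how three of them convert to ppm. It assumes the units described in the
Linux adjtimex(2) man page (frequency and tolerance scaled so that 65536
equals 1 ppm, tick in microseconds) and a USER_HZ of 100, which is not stated
anywhere in this message. The function and constant names are made up for
illustration.

```python
# Sketch: convert selected 'adjtimex -p' fields to ppm.
# Assumptions (not stated in the message): Linux units per adjtimex(2),
# and USER_HZ = 100, so a nominal tick is 10000 microseconds.

FREQ_SCALE = 65536        # frequency/tolerance are in units of 2**-16 ppm
NOMINAL_TICK_US = 10000   # assumed: 1_000_000 us / USER_HZ (100)


def scaled_to_ppm(value):
    """Convert a 2**-16 ppm fixed-point value (frequency, tolerance) to ppm."""
    return value / FREQ_SCALE


def tick_to_ppm(tick_us, nominal_us=NOMINAL_TICK_US):
    """Rate change implied by a non-nominal tick length, in ppm."""
    return (tick_us - nominal_us) / nominal_us * 1e6


if __name__ == "__main__":
    freq_ppm = scaled_to_ppm(565962)     # frequency field
    tol_ppm = scaled_to_ppm(32768000)    # tolerance field
    tick_ppm = tick_to_ppm(10029)        # tick field
    print(f"frequency: {freq_ppm:.3f} ppm")
    print(f"tolerance: {tol_ppm:.1f} ppm")
    print(f"tick:      {tick_ppm:.1f} ppm")
    print(f"total:     {freq_ppm + tick_ppm:.1f} ppm")
```

Under those assumptions the tolerance field works out to 500 ppm, the
frequency field to about 8.6 ppm, and the tick field to about 2900 ppm, so
nearly all of the correction is carried by tick.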
I'm not familiar with adjtimex. I installed it a while ago. The
installation ran it.
Comparing clocks (this will take 70 sec)...done.
Adjusting system time by 251.512 sec/day to agree with CMOS clock...done.
That's 2911 ppm.
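The conversion is just the daily adjustment divided by the number of seconds
in a day. A quick sketch of the arithmetic, plus a comparison against the 500
and 2500 ppm limits mentioned earlier. Treating the limit as a simple clamp is
my simplification, not a description of what ntpd or the kernel actually does,
and the function names are only for illustration.

```python
SECONDS_PER_DAY = 86400


def sec_per_day_to_ppm(sec_per_day):
    """Convert a clock rate error in seconds/day to parts per million."""
    return sec_per_day / SECONDS_PER_DAY * 1e6


def residual_ppm(needed_ppm, limit_ppm):
    """Rate error left over if corrections are clamped to +/- limit_ppm.

    Simplified model: assumes a hard clamp and nothing else.
    """
    clamped = max(-limit_ppm, min(limit_ppm, needed_ppm))
    return needed_ppm - clamped


if __name__ == "__main__":
    needed = sec_per_day_to_ppm(251.512)
    print(f"needed: {needed:.0f} ppm")
    for limit in (500, 2500):
        left = residual_ppm(needed, limit)
        print(f"limit {limit}: {left:.0f} ppm uncorrected")
```

By this arithmetic neither a 500 ppm nor a 2500 ppm limit covers the full
error on its own.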
Looks like ntpd is happy now.
--
These are my opinions. I hate spam.