NTP Performance
Richard Laager
rlaager at wiktel.com
Tue Nov 26 07:34:46 UTC 2019
I have logs going back to 2019-10-26. These clock fuzz errors started on
ntp1 on 2019-11-02 and ntp2 on 2019-11-21.
On 2019-11-02, I upgraded to NTPsec 1.1.7 (from 1.1.3) and enabled NTS
(as both a client and server).
On 2019-11-08, I added the GPS to ntp2. Based on the dates, that seems
unrelated.
On 2019-11-21 on ntp2, I was performing debugging as discussed earlier
in the thread. This involved a reboot, which is probably when it moved
to Linux 4.15.0-70-generic (the Ubuntu package), likely from
4.15.0-45-generic. That also seems unrelated, though, as ntp1 is still
running 4.15.0-45-generic and has not been rebooted since 2019-09-28.
Trying again with NTPsec 1.1.3 seems like a useful next step. If that is
good, then I need to bisect the difference.
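
For reference, the bisection step amounts to a binary search for the first bad build. The sketch below is illustrative only: `first_bad` and the `is_bad` callback are hypothetical names, and `is_bad` stands in for "build and run this candidate, then check whether the clock fuzz errors appear". It assumes the first candidate is good, the last is bad, and the problem does not come and go in between.

```python
def first_bad(candidates, is_bad):
    """Return the index of the first bad candidate.

    Assumes candidates[0] is good, candidates[-1] is bad, and that every
    candidate after the first bad one is also bad.
    """
    good, bad = 0, len(candidates) - 1
    while bad - good > 1:
        mid = (good + bad) // 2
        if is_bad(candidates[mid]):
            bad = mid   # the regression is at mid or earlier
        else:
            good = mid  # the regression is after mid
    return bad
```

Each test halves the remaining range, so about log2(N) builds are needed for N candidates. `git bisect` performs the same search over commits.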
On 11/25/19 11:46 AM, Achim Gratz via devel wrote:
> Richard Laager via devel writes:
>> These both have the following CPU (which is older):
>> Intel(R) Xeon(R) CPU X5460 @ 3.16GHz
>
> These may not yet have consistent TSC between cores/sockets (or require
> BIOS tweaks for that).
/proc/cpuinfo says constant_tsc, but that's it (besides "tsc", of course).
That is, I do _not_ have nonstop_tsc, so therefore I presume I do not
have the "invariant TSC" CPU feature.
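
The flag check can be scripted. This is a minimal sketch, assuming the usual Linux /proc/cpuinfo layout with a "flags" line, and treating "invariant TSC" as constant_tsc plus nonstop_tsc (the same presumption as above). The function names are made up for illustration.

```python
def tsc_flags(cpuinfo_text):
    """Return the TSC-related flags found on the first 'flags' line."""
    wanted = {"tsc", "constant_tsc", "nonstop_tsc"}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "flags":
            return wanted & set(value.split())
    return set()


def has_invariant_tsc(cpuinfo_text):
    # Assumption: "invariant" here means both constant_tsc and nonstop_tsc.
    return {"constant_tsc", "nonstop_tsc"} <= tsc_flags(cpuinfo_text)
```

To use it, pass in the contents of /proc/cpuinfo, for example `has_invariant_tsc(open("/proc/cpuinfo").read())`. Only the first CPU's flags line is examined.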
Any thoughts on what to look for in the BIOS? I poked around, but there
didn't seem to be much that was related. There was a "Clock Spectrum
Feature", which I assume is something about spread spectrum, which is
disabled. The HPET is enabled. The Intel EIST setting is set to
disabled, which the help text says disables C-states.
Should I consider trying the HPET as the kernel clocksource?
$ cat /sys/devices/system/clocksource/clocksource0/available_clocksource
tsc hpet acpi_pm
$ cat /sys/devices/system/clocksource/clocksource0/current_clocksource
tsc
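
The same sysfs files can be read from a script before trying a switch. A minimal sketch follows; the helper names are hypothetical, and the directory is the one shown in the commands above.

```python
import os

SYSFS_DIR = "/sys/devices/system/clocksource/clocksource0"


def read_clocksources(base=SYSFS_DIR):
    """Return (available, current) clocksource names read from sysfs."""
    with open(os.path.join(base, "available_clocksource")) as f:
        available = f.read().split()
    with open(os.path.join(base, "current_clocksource")) as f:
        current = f.read().strip()
    return available, current


def can_switch_to(name, base=SYSFS_DIR):
    """True if `name` is offered by the kernel and is not already in use."""
    available, current = read_clocksources(base)
    return name in available and name != current
```

The switch itself is made by writing the name (for example hpet) to current_clocksource as root. It takes effect at once and does not survive a reboot unless clocksource= is also set on the kernel command line.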
I only see the clock fuzz errors on my NTP servers. I don't have an
exactly matching configuration that's not an NTP server, but similar
hardware running Debian 10 and ntpsec 1.1.3 does not show them, and
neither do two eras of newer hardware running Ubuntu 18.04 and the same
ntpsec.
I tried disabling the second CPU socket. I verified that I'm down from 8
cores to 4 cores. No change to the CLOCK_MONOTONIC_RAW performance:
rlaager@ntp2:~$ ./a.out
    res  avg      min  dups  CLOCK
      1   28       26        CLOCK_REALTIME
4000000    8  3999658    -1  CLOCK_REALTIME_COARSE
      1   28       26        CLOCK_MONOTONIC
      1  374      362        CLOCK_MONOTONIC_RAW
      1  383      371        CLOCK_BOOTTIME
Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
ns hits
26 50500
27 531901
28 49137
29 914
30 367497
33 2
34 1
36 5
39 1
60 1
41 samples were bigger than 60.
rlaager@ntp2:~$ ./a.out
    res  avg      min  dups  CLOCK
      1   29       26        CLOCK_REALTIME
4000000    8  3999852    -3  CLOCK_REALTIME_COARSE
      1   28       26        CLOCK_MONOTONIC
      1  375      362        CLOCK_MONOTONIC_RAW
      1  384      372        CLOCK_BOOTTIME
Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
ns hits
26 50139
27 531774
28 49516
29 397
30 367973
32 1
36 1
48 1
63 2
66 2
194 samples were bigger than 66.
rlaager@ntp2:~$ ./a.out
    res  avg      min  dups  CLOCK
      1   28       26        CLOCK_REALTIME
4000000    8  3999859    -3  CLOCK_REALTIME_COARSE
      1   28       26        CLOCK_MONOTONIC
      1  374      366        CLOCK_MONOTONIC_RAW
      1  385      374        CLOCK_BOOTTIME
Histogram: CLOCK_REALTIME, 1 ns per bucket, 1000000 samples.
ns hits
26 49328
27 523261
28 48763
29 360
30 378133
33 2
36 4
39 2
40 1
45 2
144 samples were bigger than 45.
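
For anyone who wants to repeat this kind of measurement without the test program, here is a rough Python sketch. It is not the a.out used above. It times back-to-back reads of each clock and reports an average delta, a minimum non-zero delta, and a count of zero deltas; those meanings for avg, min and dups are my reading of the column headers, not something confirmed here. Interpreter overhead is far larger than the ~28 ns figures above, so only the relative difference between clocks on the same machine is meaningful.

```python
import time

CLOCK_NAMES = [
    "CLOCK_REALTIME",
    "CLOCK_MONOTONIC",
    "CLOCK_MONOTONIC_RAW",
    "CLOCK_BOOTTIME",
]


def measure(clock_id, samples=100000):
    """Return (avg_ns, min_ns, dups) for back-to-back reads of one clock."""
    total = 0
    smallest = None
    dups = 0
    prev = time.clock_gettime_ns(clock_id)
    for _ in range(samples):
        now = time.clock_gettime_ns(clock_id)
        delta = now - prev
        prev = now
        if delta == 0:
            dups += 1  # two reads returned the same timestamp
            continue
        total += delta
        if smallest is None or delta < smallest:
            smallest = delta
    counted = samples - dups
    avg = total // counted if counted else 0
    return avg, smallest or 0, dups


def measure_all(samples=100000):
    """Measure every clock in CLOCK_NAMES that this platform provides."""
    results = {}
    for name in CLOCK_NAMES:
        clock_id = getattr(time, name, None)  # skip clocks not available here
        if clock_id is not None:
            results[name] = measure(clock_id, samples)
    return results
```

Calling `measure_all()` and printing the resulting dict gives one (avg, min, dups) tuple per clock. A much larger average for CLOCK_MONOTONIC_RAW than for CLOCK_MONOTONIC would be consistent with the numbers above.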
--
Richard