NTP Performance

Richard Laager rlaager at wiktel.com
Thu Nov 21 04:07:35 UTC 2019

On 11/20/19 4:33 PM, Paul Theodoropoulos via devel wrote:
> Just a shot in the dark, but it looks to me like an errant
> overly-greedy cron job or systemd timer - perhaps something else is
> polling the serial port on a schedule. If you set up hourly graphs you
> might be able to correlate it with something else firing on the server
> at those intervals.

On 11/20/19 6:33 PM, Gary E. Miller via devel wrote:
> Check your crontab, and other recurrent jobs.  Those are way too large
> for an idle server.  Could be a remote cronjob doing something like a 
> log pull or other test.

This is a physical server that is idle apart from ntpd. It _is_ serving
NTP to the public pool, so there is a fair amount of load from that. FWIW,
ntpd uses 25-30% CPU. Both systems are the same in these respects.

At 23:10 UTC, I stopped cron and ALL systemd timers. At 03:55 UTC, I
re-ran the ntpviz daily job. I ran it again at 04:03 just to get the
04:00 label on the graph. There was no change in the pattern from 22:00
to 04:00, so cron and systemd timers do not appear to be the cause.

>> Another idea might be to switch to the PPS refclock on ntp2 to see if
>> it behaves similarly with the ublox PPS.
> My bet is that it does.

I'll look into this, but it will take a bit longer to get useful results.

>> As a second question, if you look at the ntp2 weekly graphs, there is
>> a single huge transient. Any idea what that might be? I've seen these
>> about once or twice a week since I put this in a couple weeks ago:
>> https://ntp2.wiktel.com/week/
> Usually some sort of kernel stall.  Likely some big job, or a backup,
> happened during those events.  Less likely, but very possible, a disk
> drive or NFS mount hung the entire system for 50 ms.

No NFS on this system. In terms of disks, it's a pair of SATA SSDs
connected to the motherboard in an ext4-on-LVM-on-MD configuration.

> Also check the ntpd is locked to your PPS during the events.

Is there a way to check that retroactively?
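
One possible approach, assuming peerstats logging was enabled at the
time: each peerstats line carries the peer status word in hex, and bits
8-10 of that word are the selection code (6 = sys.peer, 7 = pps.peer).
The sketch below scans peerstats lines for a time window and reports the
selection state of one peer. The helper names and the peer label passed
in are hypothetical, and the field layout (MJD, seconds, peer, status,
offset, ...) is assumed from the usual peerstats format.

```python
from datetime import datetime, timedelta, timezone

# Modified Julian Date day 0, the epoch used by the first peerstats field.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)

# Selection codes from bits 8-10 of the peer status word.
SELECT_NAMES = {
    0: "reject", 1: "falsetick", 2: "excess", 3: "outlier",
    4: "candidate", 5: "backup", 6: "sys.peer", 7: "pps.peer",
}


def parse_peerstats_line(line):
    """Return (time, peer, selection name, offset) or None if malformed.

    Assumed layout: MJD, seconds past midnight, peer, status (hex),
    offset, delay, dispersion, jitter.
    """
    fields = line.split()
    if len(fields) < 8:
        return None
    try:
        when = MJD_EPOCH + timedelta(days=int(fields[0]),
                                     seconds=float(fields[1]))
        status = int(fields[3], 16)
        offset = float(fields[4])
    except ValueError:
        return None
    select = (status >> 8) & 0x7
    return when, fields[2], SELECT_NAMES[select], offset


def selection_during(lines, peer, start, end):
    """List (time, selection name) for `peer` between start and end."""
    result = []
    for line in lines:
        parsed = parse_peerstats_line(line)
        if parsed is None:
            continue
        when, name, select, _offset = parsed
        if name == peer and start <= when <= end:
            result.append((when, select))
    return result


def unlocked_samples(lines, peer, start, end):
    """Samples in the window where `peer` was not sys.peer or pps.peer."""
    return [(when, select)
            for when, select in selection_during(lines, peer, start, end)
            if select not in ("sys.peer", "pps.peer")]
```

Feeding it the peerstats file covering one of the transients, with the
refclock's label as it appears in the third column, would show whether
the selection dropped away from the PPS around that time. An empty
result from unlocked_samples() only means no logged sample showed a
different selection; it says nothing about what happened between
samples.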


