Using Go for NTPsec

Mon Jul 5 04:19:21 UTC 2021

Dan Drown via devel <devel at ntpsec.org>:
> Let's talk a bit about what time critical sections are currently in the
> code.  I think that will help drive the decision about the impact of garbage
> collection.
> 
> I haven't looked at ntpsec's codebase lately, so some of this might be out
> of date.  Please feel free to correct any mistakes or omissions.
> 
> Time critical code:
> 
> 1. packet tx happening right after tx timestamp for server response
> 2. serial NMEA data timestamps
> 
> Non time critical code:
> 
> 1. packet rx timestamp (assumption: SO_TIMESTAMPNS or alike is being used)
> 2. packet tx happening right after tx timestamp for client request
> (assumption: SO_TIMESTAMPNS or alike is being used)
> 3. receiving SHM data
> 4. receiving PPS data
> 5. calculating/updating local clock offset/frequency
> 
> The time critical code can tolerate some level of delay (~hundreds of
> microseconds), as things like packet tx can be delayed for a multitude of
> kernel and hardware reasons.  The good news is both of the time critical
> code paths are somewhat predictable and if we can manually schedule GC, we
> can avoid scheduling it during those times.
> 
> The non timing critical code can be delayed tens of milliseconds without an
> impact to timing quality.

This matches my analysis almost exactly.

My current plan is:

A) Mitigate window 1 by turning off GC before it and back on after.

B) Mitigate window 2 by taking timestamps before and after sample read, asking
the Go runtime if a GC has occurred in that interval, and throwing out
the sample if it has. This tactic might slow convergence times minutely
but should not affect overall sync accuracy.

In all other circumstances, treat GC-induced latency spikes as though
they're just another kind of network weather.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>