What's left to do on NTS.

Sun Mar 3 22:59:11 UTC 2019

On Sun, Mar 3, 2019 at 8:45 AM Kurt Roeckx via devel <devel at ntpsec.org> wrote:
> On Sun, Mar 03, 2019 at 05:23:31AM -0800, Hal Murray wrote:
> >
> > kurt at roeckx.be said:
> > > If this is something you're worried about, this can be solved with the
> > > interleave mode, which was removed.
> >
> > How well does it work?
>
> It works great, the errors are much smaller when it's enabled.

Interleaved mode in NTP Classic doesn't do what you think it does.

The concept behind interleaved mode is sound: get packet timestamps
from the NIC at the moment they cross the wire, thus eliminating the
contribution of local buffers to jitter. But ntpd doesn't do anything
of the kind!

Actually getting timestamps from the NIC is fairly involved. The NIC
has its own clock and its own oscillator, which has to be carefully
kept in sync with the system clock. Furthermore, all the APIs for
doing this are OS-specific. Check out the linux-ptp project to see
what it looks like when it's done right; it's a fair bit of code, and
nothing of the kind is present in ntpd.

One thing ntpd does do (both in NTP Classic and in NTPsec) is fetch
kernel timestamps on incoming packets using the SO_TIMESTAMP option.
This is different from hardware timestamps; they're not generated by
the NIC, they're generated by the kernel at the moment the NIC passes
the packet to it. No analogue to SO_TIMESTAMP exists for outgoing
packets.

For outgoing packets in interleaved mode, all NTP Classic does is call
clock_gettime(2) right after calling send(2), rather than right
before. You can see it here:
https://github.com/ntp-project/ntp/blob/stable/ntpd/ntp_proto.c#L3324-L3355
It presents the result of this call as a hardware timestamp, but
it's nothing of the sort. All send(2) does is copy the packet into the
kernel's IO buffers and then return. The return time has nothing to do
with when the packet *leaves* any buffer. The only circumstance under
which the post-send(2) timestamp will be less jittery than the
pre-send(2) timestamp is if the kernel's IO buffer is full, in which
case the call will block until there's room again for the packet. This
should only ever happen on an overloaded server.

Now, the post-send(2) timestamp is no less jittery than the
pre-send(2) timestamp, but it is different, because send(2) has to
switch into kernel space and back, which takes a non-zero amount of
time. So, you claim you're getting smaller errors when interleaved
mode is enabled. I'm not going to credit that claim without
substantiation: how are you measuring these errors? Do you have a GPS
reference clock to check them against? But if you convince me it's
true, then it's likely true because the difference in the point where
the timestamp is captured is offsetting an asymmetry somewhere else in
ntpd or on your network path.