Apparent protocol-machine bug, new top priority

Tue Sep 26 02:26:02 UTC 2017

On Mon, 25 Sep 2017, Gary E. Miller via devel wrote:
> On Mon, 25 Sep 2017 16:55:30 -0700 (PDT)
> Fred Wright via devel <devel at ntpsec.org> wrote:
>
> > PPS(2) is the counter-capture PPS source, and is the primary timing
> > reference.
>
> Can you explain a bit more about this source?  How does this differ
> from the KPPS or PPS(TIOCMIWAIT) sources?

It's a different kind of KPPS, using the pps-gmtimer driver (with some
experimental improvements of my own) rather than the usual pps-gpio
driver.

> > SHM(1) is the combined NMEA/PPS source from GPSD, which is
> > configured to use the interrupt-based PPS driver, and hence
> > illustrates the offset in the interrupt-based capture.
>
> Your SHM(1), on both your hosts, seem to not be very good.
>
> Are you using KPPS?  Or just TIOCMIWAIT?

It's KPPS via the usual pps-gpio.  It's not a "modem-control" PPS at all.

I don't do anything to reduce the variability since that's not the real
timing reference, anyway, and it's not the main issue here.

Some of that may be in the receiver itself.  The *cape* vendor only
claims +/-200ns, even though the *chipset* vendor claims +/-60ns IIRC.

> > Again, PPS(2) is the main timing reference, though it's listed as
> > 127.127.22.2 due to the lame partial translation table in ntpviz.
>
> What do you think ntpviz should do better?  Just convert 127.127.22.C
> to PPS(X) ?  That would be an easy patch.

Basically, yes, but note that ntpq has the same issue when pointed at
classic ntpd.  The mapping table should really be in one of the common
libraries so that ntpq and ntpviz can share it.  And it should have the
complete list so that it can cover everything that classic ntpd supports.
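As a rough illustration of the shared mapping described above, here is a minimal sketch in Python.  The names REFCLOCK_NAMES and refclock_name are hypothetical and are not taken from ntpq, ntpviz, or any NTPsec library.  The table is deliberately partial: it covers only the two driver types seen in this thread plus LOCAL, whereas a real shared table would need the complete list.

```python
import re

# Partial, illustrative table: refclock driver type -> display name.
# A real shared table would need every type that classic ntpd supports.
REFCLOCK_NAMES = {
    1: "LOCAL",
    22: "PPS",
    28: "SHM",
}

# Refclock pseudo-addresses have the form 127.127.<type>.<unit>.
_REFCLOCK_RE = re.compile(r"^127\.127\.(\d{1,3})\.(\d{1,3})$")


def refclock_name(address):
    """Return NAME(unit) for a known refclock address, else the input."""
    match = _REFCLOCK_RE.match(address)
    if match is None:
        return address  # not a refclock pseudo-address
    driver_type = int(match.group(1))
    unit = int(match.group(2))
    name = REFCLOCK_NAMES.get(driver_type)
    if name is None:
        return address  # unknown driver type: leave it untranslated
    return "%s(%d)" % (name, unit)
```

Unknown types and ordinary addresses fall through unchanged, so a partial table degrades to the current behavior instead of producing a wrong name.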

> > The actual time offsets are
> > visible in the loopstats graph and in the PPS(2) peer offset graph,
> > and are substantially smaller than the offsets in SHM(1).  QED.
>
> Well, your much poorer than normal SHM(1) Standard Deviation casts
> doubt on your QED.

Except that the SD is still way less than the mean.

Having a really tight grouping of shots five feet to the right of the
target doesn't make you a good marksman. :-)
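
The distinction between mean and SD can be sketched numerically.  The offsets below are made-up numbers chosen for illustration, not measurements from either host in this thread, and the summarize helper is hypothetical.

```python
import statistics

# Hypothetical offsets in microseconds (illustrative only, not measured).
tight_but_biased = [10.1, 10.0, 9.9, 10.2, 9.8]      # small SD, large mean
loose_but_centered = [-1.5, 2.0, -0.5, 1.0, -1.0]    # larger SD, mean near 0


def summarize(offsets):
    """Return (mean, sample standard deviation) for a list of offsets."""
    return statistics.mean(offsets), statistics.stdev(offsets)


for label, data in (("tight but biased", tight_but_biased),
                    ("loose but centered", loose_but_centered)):
    mean, sd = summarize(data)
    print("%-20s mean=%+.2f us  sd=%.2f us" % (label, mean, sd))
```

The first series has the smaller SD but is the worse clock in absolute terms, since SD measures only the grouping and says nothing about a constant offset.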

> > > Uh, that is not my experience.  And I have more control over my
> > > temperature than I have over my interrupt latency.
> >
> > See above.  And note that you can at least make the latency (as well
> > as the variation in latency) as small as possible by running the CPU
> > as fast as possible, rather than slowing it down for "thermal
> > stability".
>
> I always run my ntpd's with the performance governor.  So not an issue
> for me.

Though apparently Achim runs his slower, hence the comment.

> I get 'thermal stability' by controlling an external heater and use that
> to stabilize my test enclosure.  Then for fine temp control I vary the
> CPU workload to stabilize the CPU temp.
>
> Prolly several reasons why my SHM(1) seems to have 40x less jitter than
> yours.

Perhaps, but I'll bet you're still more than 10 microseconds off without
having any way to see it.

Fred Wright