nmea refclock not locking to the pps

Tony Hain tony at tndh.net
Sun Mar 5 05:26:55 UTC 2017


Ok, so part of the issue with gpsd is that ppsthread.c assumes that "non-linux" systems always have the pps on the primary serial port control pins. That might be a reasonable default assumption if /dev/ppsX isn't provided, but ...

A related issue, and maybe core to why this is failing, is that gpsd.c and others talk about ppsonly in #if defined(PPS_ENABLE) blocks, but I can't find where PPS_ENABLE gets defined in any of the files. So providing /dev/pps0 as a second source should resolve the first problem, but only when a ppsonly device is enabled. Given the content in the logs when debug is turned on, /dev/pps0 is tried, but when no serial stream comes in it is shut down. How would one go about forcing PPS_ENABLE so that the ppsonly /dev/pps0 device is not turned off? Wouldn't it make more sense to test a port whose serial stream has timed out for a pps signal before shutting it down?

I backed up to 3.14 in the FreeBSD ports tree, and by magic, gpsmon started working (ntpd still doesn't like the JSON data). I don't know what build option differences there might be, so that is something to look into Monday. In any case, getting the maintainer to move from 3.14 to something current needs to happen.

PS: I found the page that talked about 3.17+, it is the catb how-to page. 

Tony


> -----Original Message-----
> From: bugs [mailto:bugs-bounces at ntpsec.org] On Behalf Of Tony Hain
> Sent: Saturday, March 04, 2017 5:30 PM
> To: 'Gary E. Miller'; bugs at ntpsec.org
> Subject: RE: nmea refclock not locking to the pps
> 
> Gary E. Miller wrote:
> > Yo Tony!
> >
> > On Fri, 3 Mar 2017 17:20:43 -0800
> > "Tony Hain" <tony at tndh.net> wrote:
> >
> > > Gary E. Miller wrote:
> > > > Yo Tony!
> > > >
> > > > On Fri, 3 Mar 2017 01:00:23 -0800
> > > > "Tony Hain" <tony at tndh.net> wrote:
> > > >
> > > > > Not clear if this is an ntpsec issue, or comes from upstream,
> > > > > but the pps lock on the nmea stream never converges to a
> > > > > reasonable offset or apparently itself.
> > > >
> > > > > Fair warning: I always recommend people not use that refclock.
> > >
> > > Well I tried gpsd with no luck, so I switched back, and the nmea
> > > driver was working up through 4.2.4p5, (which I was using because I
> > > couldn't get gpsd to work with ntpd back then either). In both cases
> > > I can get gpsd to show it is getting the correct feed from the gps,
> > > I just can't make it play well with ntpd.
> >
> > Which sounds similar to the problem you now report on the nmea driver.
> > > Just with fewer tools to use.  And by gpsd, I hope you mean the SHM
> > driver, not the flakey gpsd-ng driver.
> 
> Flakey doesn't even begin to describe it:
> # gpsmon
> gpsmon:ERROR: SER: device open of tcp://localhost:2947 failed: No such file or directory - retrying read-only
> # netstat -an|grep LIST
> tcp4       0      0 127.0.0.1.2947         *.*                    LISTEN
> tcp4       0      0 *.21                   *.*                    LISTEN
> tcp6       0      0 *.21                   *.*                    LISTEN
> tcp4       0      0 *.22                   *.*                    LISTEN
> tcp6       0      0 *.22                   *.*                    LISTEN
> # ntpmon
>      remote           refid      st t when poll reach   delay   offset   jitter
> oPPS(0)          .PPS.            0 l    4    8  377   0.0000   0.0513   0.0008
> xGPSD(0)         .GPSD.           0 l   43   64  377   0.0000 16385488   0.0003
> *2603:3023:102:1 .PPS.            1 u 1050    8  376   1.4244   0.1479   0.4357
> 
> My first reaction to the above is that gpsmon is broken because it likely
> picked localhost as ::1 then didn't try IPv4, because ntpd is showing
> reachability, even if it doesn't like the values it is getting. My second reaction
> is that gpsd is broken in that it doesn't bind to ::1.  ;-)
> 
> So, yes the SHM driver is the one I have been testing and it is not playing well
> with ntpd.
> 
> >
> > > I don't know if the offset shown in 4.2.8p9 is due to changes in
> > > that driver, or something in the BBB dmtpps driver implementation.
> >
> > As I said, looks to me like the PPS got out voted.
> >
> > > I will
> > > be taking that up on the FreeBSD arm list, but I was less concerned
> > > about that than the fact that the ntp-sec nmea driver appears to
> > > behave very differently from 4.2.8p9 nmea driver on the same hardware.
> > > I assumed that driver would have come down without changes other
> > > than device identifier, but clearly something is different.
> >
> > I doubt it is a driver issue, my guess is that it is a dice toss thing.
> > With your setup NTP, of either flavor will lock onto the wrong source
> > now and again.
> 
> I would buy that if it weren't a consistent behavior where the ntp-sec nmea
> driver is off by > 10x the offset shown by the 4.2.8p9 nmea driver, because
> other than  the refclock xxx / server 127.127.x.x syntax issues, the config is
> identical.
> 
> >
> > > > > That should be a good version.
> > > >
> > > > > oPPS(0)          .PPS.            0 l    7    8  377   0.0000
> > > > > 0.0001 0.0001
> > > > > xNMEA(0)         .GPS.            0 l    6    8  377   0.0000
> > > > > -52.1062 1.4493
> > > > > *2001:470:e930:7 .GPS.            1 u   58   64  377   0.8173
> > > > > -0.9772 1.9614
> > > >
> > > > Clearly the PPS got outvoted.
> > >
> > > The behavior in that configuration is what I expected because the
> > > nmea start time jitter is high. It is the configurations where the
> > > pps flag is turned on for the nmea interface that are not working as
> > > expected.
> >
> > Your expectations do not match mine.  That looks bad to me and is fixable.
> 
> That is the flag1 0 configuration where the nmea stream has no reference to
> the top-of-second mark, so any variance in the start of transmission will show
> as an offset from ntp time. The sirf manual does not give a fixed time for
> start of transmission, and given its reactions probably varies based on
> position on the ellipsoid and the number of satellites in view.
> Experimentation shows that for this location, adding 370ms to the sentence
> time results in nmea offset +/- 50ms from the pps0 mark.
> 
> In the flag1 1 configuration, the driver has the gpspps0 marker to
> compensate. My original point was that the ntp-sec version of that driver is
> an order of magnitude sloppier than the 4.2.8p9 driver on the same
> hardware with effectively the same configuration.
> 
> >
> > > > > You neglected the most important part of a bug report: your ntp.conf.
> > >
> > > Well it appears from the copy I got back that the message format was
> > > garbled.
> >
> > Yeah, email does that.  Still very hard to read.
> >
> > > # trying different values to see how it shifts refclock pps stratum
> > > 0 refid PPS minpoll 3 maxpoll 6 prefer time1
> > > 0.000000337160
> >
> > Looks fairly good.  My experiments on RasPi 3 shows that
> > minpoll=maxpoll=2 will give best results.
> 
> I can't find anyplace in the code that actually specifies what the range is, but
> for 15+ years the documentation has said that the minimum is 4, but my
> experimentation has always shown that the minimum is 3. You can set less
> than that, but the result is always 8 sec polls, which equates to 3. I tried it
> again today, and minpoll = maxpoll = 2 still locks in at poll = 8.
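For reference, minpoll and maxpoll are exponents: the interval is 2^poll seconds, so 3 is 8 s and 6 is 64 s. A minimal Python sketch of the behaviour described above. The floor of 3 is only what was observed in this thread, not a documented limit, and the function name is made up for illustration:

```python
def poll_interval_seconds(exponent: int, floor_exponent: int = 3) -> int:
    """Return the polling interval in seconds for a poll exponent.

    floor_exponent=3 models the observation that requests below 3 still
    end up polling every 8 s; it is an assumption, not a documented limit.
    """
    effective = max(exponent, floor_exponent)
    return 2 ** effective


for requested in (2, 3, 4, 6):
    print(f"poll {requested} -> {poll_interval_seconds(requested)} s")
```

With this model, asking for minpoll = maxpoll = 2 gives the same 8 s interval as asking for 3, which matches what ntpmon showed here.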
> 
> >
> > > #refclock nmea refid GPS baud 4800 mode 8 minpoll 3 maxpoll 6 prefer
> > > flag1 0 flag4 1 time2 0.442916667 refclock nmea refid GPS baud 4800
> > > mode 8 minpoll 3 maxpoll 6 prefer flag1 1 flag4 1 time1
> > > 0.000000337160 time2 0.072916667
> >
> > > hard to tell, I think that is the line being used?  If so, that is
> > > part of your problem.  You either need noselect, or minpoll much
> > > greater than the minpoll of the PPS.  Otherwise, as you see, ntpd
> > > flips a coin and locks onto the wrong refclock.
> 
> That was garbled. That should be 2 lines where I can flip the first char to
> switch between flag1 settings. Flag1 0 doesn't need a time1, and needs the
> estimated 370ms additional offset, where flag1 1 is otherwise identical, and
> the offsets track measured/calculated corrections.
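The arithmetic behind the two time2 values can be checked with a short Python sketch. The two time2 numbers and the 370 ms figure come from the config lines and text above. Reading 0.072916667 as the time to send 35 characters at 4800 baud with 10 bits per character is only an assumption that happens to match the number, not something stated in the thread:

```python
from fractions import Fraction

BAUD = 4800
BITS_PER_CHAR = 10  # assumption: 8N1 framing (1 start + 8 data + 1 stop bit)


def transmit_time(chars: int, baud: int = BAUD) -> Fraction:
    """Seconds needed to clock `chars` characters out of the serial port."""
    return Fraction(chars * BITS_PER_CHAR, baud)


time2_flag1_on = Fraction("0.072916667")   # flag1 1 line
time2_flag1_off = Fraction("0.442916667")  # flag1 0 line

# Difference between the two fudge values: the estimated extra offset.
extra = time2_flag1_off - time2_flag1_on
print("extra offset for flag1 0:", float(extra), "s")

# Hypothetical reading of the flag1 1 value as a pure transmission delay.
print("35 chars at 4800 baud:", float(transmit_time(35)), "s")
```

The difference between the two lines is exactly the 370 ms estimate, so the flag1 0 line is the flag1 1 line plus that empirical correction.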
> 
> >
> > > # The following three servers will give you a random set of three #
> > > NTP servers geographically close to you.
> > > # See http://www.pool.ntp.org/ for details. Note, the pool
> > > encourages # users with a static IP and good upstream NTP servers to
> > > add a server # to the pool. See http://www.pool.ntp.org/join.html if
> > > you are interested. # # The option `iburst' is used for faster
> > > initial synchronisation.
> > > # The option `maxpoll 9' is used to prevent PLL/FLL flipping on
> > > FreeBSD. # server 0.freebsd.pool.ntp.org iburst maxpoll 8
> >
> > Best not to mix pool servers and specific servers.
> 
> I generally turn the pool servers off when calibrating because you never get
> the same one, and path symmetry can be anything. At least with specific
> servers path symmetry is generally consistent, even if it creates an offset. I
> turned one pool server on just to get another vote because the NIST servers
> were showing a persistent ~50ms offset lately.
> 
> >
> > > > I'm guessing you do not have prefer set on your PPS?  You'll also
> > > > > want to set the min- and max-poll on the PPS to much less than those
> > > > > for the nmea driver.
> > >
> > > Prefer is set on the pps & nmea, as well as the reference i386
> > > > system which is what it keeps locking to.
> >
> > > Well, that is part of the problem.  When ntpd starts it is as likely to
> > > choose the nmea as the pps.   You prefer your best source, not a
> > > flakey source.
> 
> I understand that. The pps driver says it is disabled unless there is a preferred
> server in the surviving set, or when another driver is tracking pps. When I
> make the nmea driver the only preferred option, the pps driver drops out
> because when the nmea driver is flag1 0, its offset is so large that it becomes
> a false ticker so there are no preferred survivors, and with flag1 1 the nmea
> driver takes over the pps tracking.
> 
> >
> > > So right now the relevant lines are:
> > > refclock pps stratum 0 refid PPS minpoll 3 maxpoll 6 prefer time1
> > > 0.000000337160 refclock nmea refid GPS baud 4800 mode 8 minpoll 3
> > > maxpoll 6 prefer flag1 1 flag4 1 time1 0.000000337160 time2
> > > 0.072916667 server ntp2.tndh.net  minpoll 4 prefer
> >
> > Pretty mushed together, did you actually prefer an internet source?
> 
> That should be 3 lines, parsed at refclock/server. The last line is the
> i386/FreeBSD-8 system the BBB is sitting on top of.
> 
> >
> > Don't do that!
> 
> It was there for the nmea flag1 0 case so the pps would not drop out, and I
> have tried without it altogether, as well as without the prefer flag.
> 
> >
> > > Is it possible that something in the interpretation of refclock nmea
> > > vs. peer 127.127.20.0 would account for the difference in handling
> > > the pps event?
> >
> > Nope.  The problem is you did not help ntpd select the best refclock.
> 
> I have cut the config down to 3 lines in the ntp-sec case, and the 5
> functionally equivalent lines in the 4.2.8p9 case, and it still acts the same. If
> the nmea driver is set to track the pps, it locks and the 4.2.8p9 offset is ~20us
> while the ntp-sec offset is ~500us. The only thing I do between those is
> change the sym links to point at the different daemons and config files and
> restart the service.
> 
> >
> >
> > > If it would be useful I can try switching back to gpsd to show what
> > > that is doing. It has been awhile so I don't recall exactly off the
> > > top of my head.
> >
> > > If you convert to SHM, but keep your 'prefer's the same way you will
> > > get similar results.  Not every time, as ntpd has to coin flip on
> > > start about which of the 3 prefers to believe.
> 
> I have changed the prefer to single on pps, single on nmea, flags on and off,
> and just about every other configuration I can think of. I am not convinced it
> is a predictability issue, because it does pretty much what I expect every
> time, except the ntp-sec nmea driver is an order of magnitude offset from
> the 4.2.8p9 version of the same configuration. The only other thing that was
> initially surprising was that the i386 system kept being preferred even when
> the nmea flag 1 was set to track the pps, but the persistent ~20us offset
> would explain that. Like I said earlier, that could be the gpio dmtpps
> implementation, so I will take that up on the FreeBSD side.
> 
> 
> Right now the configuration is SHM :::
> refclock shm unit 0 refid GPS time1 0.4429166667
> refclock shm unit 1 refid PPS prefer minpoll 3 maxpoll 3 time1 0.000000337160
> 
> # ppsapitest  /dev/pps0
> 1488676128 .949531317 451379 0 .000000000 0
> 1488676129 .949460094 451380 0 .000000000 0
> 1488676130 .949391909 451381 0 .000000000 0
> 1488676131 .949324748 451382 0 .000000000 0
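The ppsapitest lines above can be turned into pulse-to-pulse intervals with a few lines of Python. This sketch assumes the first two columns are the assert timestamp (seconds, then a nanosecond fraction) and the third is the assert sequence number; that column reading is an assumption based on the output shown, and the helper names are made up:

```python
SAMPLE = """\
1488676128 .949531317 451379 0 .000000000 0
1488676129 .949460094 451380 0 .000000000 0
1488676130 .949391909 451381 0 .000000000 0
1488676131 .949324748 451382 0 .000000000 0
"""

NS_PER_SEC = 1_000_000_000


def parse_assert_ns(line: str) -> int:
    """Return the assert timestamp of one ppsapitest line in nanoseconds."""
    fields = line.split()
    seconds = int(fields[0])
    nanos = int(fields[1].lstrip("."))
    return seconds * NS_PER_SEC + nanos


def intervals_ns(text: str) -> list:
    """Differences between consecutive assert timestamps, in nanoseconds."""
    stamps = [parse_assert_ns(l) for l in text.splitlines() if l.strip()]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


deltas = intervals_ns(SAMPLE)
for d in deltas:
    print(f"interval {d} ns, deviation from 1 s: {d - NS_PER_SEC} ns")
```

Integer nanoseconds are used instead of floats so the small deviations are not lost to rounding. On these four samples each interval comes out roughly 67 to 71 us short of one second as measured by the local clock.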
> 
> Starting gpsd -D 5 shows :::
> # grep pps /var/log/messages
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:INFO: KPPS:/dev/gps0 pps_caps 0x1133
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:INFO: stashing device /dev/pps0 at slot 1
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:PROG: PPS:/dev/pps0 chrony socket /var/run/chrony.pps0.sock doesn't exist
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:INFO: KPPS:/dev/pps0 RFC2783 path:/dev/pps0, fd is -2
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:INFO: KPPS:/dev/pps0 time_pps_create(-2) failed: Bad file descriptor
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:WARN: KPPS:/dev/pps0 kernel PPS unavailable, PPS accuracy will suffer
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:PROG: PPS:/dev/pps0 thread launched
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:PROG: KPPS:/dev/pps0 gps_fd:-2 not a tty, can not use TIOMCIWAIT
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:WARN: PPS:/dev/pps0 die: no TIOMCIWAIT, nor RFC2783 CANWAIT
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:PROG: PPS:/dev/pps0 gpsd_ppsmonitor exited.
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:INFO: PPS:/dev/pps0 ntpshm_link_activate: 0
> Mar  4 16:52:22 tic gpsd[31863]: gpsd:INFO: device /dev/pps0 activated
> Mar  5 00:52:49 tic gpsd[31863]: gpsd:WARN: PPS:/dev/gps0 unchanged state, ppsmonitor sleeps 10
> Mar  5 00:53:30 tic gpsd[31863]: gpsd:WARN: PPS:/dev/gps0 unchanged state, ppsmonitor sleeps 10
> 
> # ntpshmmon
> ntpshmmon version 1
> #      Name Seen@                Clock                Real                 L Prec
> sample NTP0 1488676083.453855318 1488676083.452996370 1488676083.000000000 0 -20
> sample NTP0 1488676084.453409041 1488676084.452360690 1488676084.000000000 0 -20
> sample NTP0 1488676085.359069375 1488676085.358358653 1488676085.000000000 0 -20
> sample NTP0 1488676086.453295445 1488676086.452264486 1488676086.000000000 0 -20
> sample NTP0 1488676087.450587130 1488676087.449534086 1488676087.000000000 0 -20
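To see how the NTP0 samples relate to the time1 value on the shm unit 0 line, the Clock and Real columns can be subtracted per sample. A rough Python sketch, using the column order from the ntpshmmon header (Name, Seen@, Clock, Real, L, Prec); the helper name is made up, and how ntpd itself combines these numbers with time1 is not shown here:

```python
from decimal import Decimal

SAMPLES = """\
sample NTP0 1488676083.453855318 1488676083.452996370 1488676083.000000000 0 -20
sample NTP0 1488676084.453409041 1488676084.452360690 1488676084.000000000 0 -20
sample NTP0 1488676085.359069375 1488676085.358358653 1488676085.000000000 0 -20
sample NTP0 1488676086.453295445 1488676086.452264486 1488676086.000000000 0 -20
sample NTP0 1488676087.450587130 1488676087.449534086 1488676087.000000000 0 -20
"""

TIME1 = Decimal("0.4429166667")  # time1 from the shm unit 0 config line


def clock_minus_real(line: str) -> Decimal:
    """Local clock time minus GPS-reported time for one sample line."""
    fields = line.split()
    clock, real = Decimal(fields[3]), Decimal(fields[4])
    return clock - real


lags = [clock_minus_real(l) for l in SAMPLES.splitlines()
        if l.startswith("sample")]
for lag in lags:
    # Second column: how far this sample's lag is from the configured time1.
    print(lag, lag - TIME1)
```

Decimal keeps all nine fractional digits, which a float cannot do for timestamps of this size. Four of the five samples lag by about 0.45 s, within roughly 10 ms of time1, while the third is near 0.358 s, which fits the start-of-sentence jitter discussed earlier.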
> 
> # ntpmon
>      remote           refid      st t when poll reach   delay   offset   jitter
> xSHM(0)          .GPS.            0 l   39   64  377   0.0000  31.7789   9.3181
>  SHM(1)          .PPS.            0 l    -    8    0   0.0000   0.0000   0.0000
> *2603:3023:102:1 .PPS.            1 u    5    8  377   1.4972  -0.1347   1.2510
> +2001:470:e930:7 .GPS.            1 u    -   16  377   1.2670  -0.1301   1.0264
> ntpd ntpsec-0.9.6+536 2017-02-22T20:26:50Z  Last update: 2017-03-04T17:28:31
> 
> 
> So part of the reason I have been having a problem with gpsd is that ntpd
> can't get a response from the PPS unit. The pps0 comments in the log are
> conflicting in that it claims to have problems,
>   gpsd:PROG: PPS:/dev/pps0 gpsd_ppsmonitor exited
> then shows it works,
>   gpsd:INFO: device /dev/pps0 activated
> 
> So which is it? The last message in the log says activated, but ntpd's response
> suggests otherwise. Ntpshmmon doesn't tell me anything other than the
> precision is -20 instead of the -30 that it should be if the pps is really active.
> Gpsmon crashes, so how do I have more tools this way rather than using the
> discrete nmea and pps device drivers?
> 
> >
> > > PS:   a web page for gpsd with ntpsec {can't find it right now} says
> > > to ensure 3.17+, but the gpsd download page only offers 3.16-.
> >
> > > git head.  We have been remiss getting 3.17 released.  Nothing
> > related to what you see.
> 
> I eventually found it, but it took a little while. Search kept leading to the
> downloads page which didn't have it.
> 
> >
> >
> > > PPS:   tried ntpviz this morning, and it failed due to variance in
> > > stats path assumption.
> >
> > Easy to fix, just tell ntpviz where your stats are.
> 
> Was just following directions at:
> https://blog.ntpsec.org/2016/12/19/ntpviz-intro.html
> which didn't indicate that telling ntpviz where the stats were
> was necessary.
> 
> >
> > > I see from the command line help that command line and a config file
> > > is an option, and that is good because I will likely post-process
> > > these somewhere else because the clock system doesn't have gnuplot
> > > installed (2nd failure) and my cross-build system ran out of disk
> > > (3rd failure, done for the day)
> >
> > So copy the files to a server with space and gnuplot.  Rsync, scp, NFS, etc.
> 
> Had planned to rsync the files like the current systems before seeing the
> ntpviz thing. That system doesn't currently have gnuplot either, but that is
> fixable.
> 
> >
> >
> > > > That said, I
> > > > had expected that ntpviz would read the ntp.conf file for the
> > > > location of the stats dir,
> >
> > Bad expectation.  Precisely because many people do rsync, scp, nfs the
> > stat files to unexpected places.
> 
> Just following directions.
> 
> >
> > > And when you change a default in one place you should expect to need to
> > > change defaults in other places.  For example, ntpviz has no way of even
> > > knowing which ntp.conf you are using.  There really is no standard place
> > for ntp.conf.
> 
> That is a problem. Until the recent changes, on FreeBSD I would have said
> you could look in rc.conf to find out which ntp.conf is being used, but
> someone decided to make that concise location as diversified and difficult as
> linux ...  ;(
> 
> >
> >
> >
> > > so maybe have it try its command line option, then its config file,
> > > then ntp.conf, then the existing default, would allow for
> > > post-processing while tracking with where ntpd has been told to put
> > > the stats, if it was told something specific.
> >
> > > Ugh.  Once you go off plan, ntpviz could never read your mind and get
> > back on plan.
> 
> I was simply thinking in terms of inserting ntp.conf in the search sequence,
> without otherwise impacting what is there. I understand about mind reading.
> The other way to look at it is that the web page assumes a specific location
> for the stats files without indicating that, or what options there are for
> resolving differences. Personally the command line is fine because I really
> don't want yet another conf file to maintain, and if a central system is
> processing for several servers it doesn't make sense to be editing a file or
> changing files for every run.
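The search sequence suggested above (command line, then the tool's own config, then ntp.conf, then the existing default) could look roughly like this in Python. This is a sketch of the idea only and not how ntpviz actually behaves; the function names and the default path are placeholders, and the only real directive relied on is statsdir in ntp.conf:

```python
import re
from typing import Optional

DEFAULT_STATSDIR = "/var/log/ntpstats"  # assumption: illustrative default only


def statsdir_from_ntp_conf(conf_text: str) -> Optional[str]:
    """Return the argument of the first statsdir directive, if any."""
    for raw in conf_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        match = re.match(r"statsdir\s+(\S+)", line)
        if match:
            return match.group(1).strip('"')
    return None


def resolve_statsdir(cli_value: Optional[str],
                     viz_conf_value: Optional[str],
                     ntp_conf_text: Optional[str]) -> str:
    """Pick the stats directory using the precedence proposed in the thread."""
    candidates = (
        cli_value,                                    # 1. command line
        viz_conf_value,                               # 2. tool's own config
        statsdir_from_ntp_conf(ntp_conf_text or ""),  # 3. ntp.conf
    )
    for candidate in candidates:
        if candidate:
            return candidate
    return DEFAULT_STATSDIR                           # 4. existing default


print(resolve_statsdir(None, None, "statsdir /var/db/ntp/stats/\n"))
print(resolve_statsdir("/tmp/stats", None, "statsdir /var/db/ntp/stats/\n"))
```

Because the command line is checked first, a central system processing stats for several servers would still override everything per run without editing any file.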
> 
> Tony
> 
> 
> >
> > RGDS
> > GARY
> > ----------------------------------------------------------------------
> > ----- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR
> > 97703
> > 	gem at rellim.com  Tel:+1 541 382 8588
> >
> > 	    Veritas liberabit vos. -- Quid est veritas?
> >     "If you can’t measure it, you can’t improve it." - Lord Kelvin
> 
> _______________________________________________
> bugs mailing list
> bugs at ntpsec.org
> http://lists.ntpsec.org/mailman/listinfo/bugs


