From hmurray at megapathdsl.net Sat Apr 1 00:14:46 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 31 Mar 2017 17:14:46 -0700
Subject: Wildcard-socket simplification hits a wall
In-Reply-To: Message from "Eric S. Raymond" of "Fri, 31 Mar 2017 19:30:05 EDT." <20170331233005.GB30079@thyrsus.com>
Message-ID: <20170401001446.B6644406060@ip-64-139-1-69.sjc.megapath.net>

esr at thyrsus.com said:
>> Have you looked at things like IP_PKTINFO?
> Nope. Thanks to the @!%^#~! crappy documentation, I didn't know that was an
> identifier of interest until you pointed it out.

It's not available on NetBSD or FreeBSD.

Google found info in man 7 ip

--
These are my opinions.  I hate spam.


From kurt at roeckx.be Sat Apr 1 09:33:04 2017
From: kurt at roeckx.be (Kurt Roeckx)
Date: Sat, 1 Apr 2017 11:33:04 +0200
Subject: Wildcard-socket simplification hits a wall
In-Reply-To: <20170331233005.GB30079@thyrsus.com>
References: <20170330160636.5EF1913A0197@snark.thyrsus.com> <20170331213918.tb3kmwrtkkh7pvbc@roeckx.be> <20170331233005.GB30079@thyrsus.com>
Message-ID: <20170401093304.gq6pkmmobxiqwpyr@roeckx.be>

On Fri, Mar 31, 2017 at 07:30:05PM -0400, Eric S. Raymond wrote:
> Kurt Roeckx:
> > > One might expect this to be available via a CMSG lookup into recvmsg's
> > > per-package auxiliary headers, analogously to the way we now get the
> > > packet-arrival timestamp (see ntpd/ntp_packetstamp.c).  It's the only
> > > place for the information to be that has the right locality.
> >
> > Have you looked at things like IP_PKTINFO?
>
> Nope. Thanks to the @!%^#~! crappy documentation, I didn't know that was an
> identifier of interest until you pointed it out.  Neither the CMSG(3) nor ip(7)
> manual pages hint that it exists.  Because heaven forfend that they should
> either list CMSG capabilities or point unambiguously at someplace that
> does, that would be too fscking much like being *helpful*.
>
> *grumble*
>
> Now that I've looked at it, it halfway solves the problem in a rather
> frustrating way...maybe.  Or maybe not at all.  Yeah, it means you can
> back out an "interface index" which designates a physical interface, or its
> local address.  But then you have the problem of interpreting that index or
> address.

I was under the impression that you can also use that index when
sending something.


Kurt


From esr at thyrsus.com Mon Apr 3 14:23:31 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 3 Apr 2017 10:23:31 -0400 (EDT)
Subject: Time for people interested in packaging NTPsec to get serious
Message-ID: <20170403142331.6AD2113A019A@snark.thyrsus.com>

We are rapidly converging on code we think we can ship as 1.0.  The
last feature on the pre-1.0 schedule landed this morning when I merged
Ian Bruene's units patch.  What we've got left to work on is
build-recipe tweaks and a bug in an experimental driver.

The one major item left is developing packaging metadata for major
Linux distributions.  Our target list is Debian, Ubuntu, Raspbian,
Red Hat, Gentoo, and SuSe.

We've had people occasionally surface on this list expressing interest
in being packagers.  Now's the time to get serious about this.  If you
are one of those people, please identify yourself and your distro
so we can discuss your particular requirements.

--
Eric S. Raymond

The kind of charity you can force out of people nourishes about as much as
the kind of love you can buy --- and spreads even nastier diseases.


From gem at rellim.com Mon Apr 3 21:52:19 2017
From: gem at rellim.com (Gary E. Miller)
Date: Mon, 3 Apr 2017 14:52:19 -0700
Subject: ✘interface/nic
Message-ID: <20170403145219.4059dab3@spidey.rellim.com>

Yo All!

I got a good response to my question on nic/interface.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
    gem at rellim.com  Tel:+1 541 382 8588

    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

Begin forwarded message:

Date: Mon, 3 Apr 2017 17:12:50 +0200
From: Marco Marongiu
To: questions at lists.ntp.org
Subject: Re: [ntp:questions] ✘interface/nic

On 31/03/17 22:39, Gary E. Miller wrote:
> Quick question, does anyone use either of these in ntp.conf?
>
> interface[listen | ignore | drop] [all | ipv4 | ipv6 | wildcard |
> name | address[/prefixlen]] nic[listen | ignore | drop] [all | ipv4 |
> ipv6 | wildcard | name | address[/prefixlen]]
>
> If so, how and why?  Is 'name' the name of the interface?

I used interface ignore and then bound ntpd to specific interfaces on
LVS servers.  This was because virtual interfaces were continuously
created and destroyed on those servers, ntpd had to continuously run
after the change and sometimes it got... "confused" and eventually
stopped serving time.  Since I wanted the service to be provided only on
specific addresses, I forced ntpd to ignore all interfaces but the
loopback and the ones where those addresses were bound to.

Hope this helps.  Why do you want to know exactly?

Ciao
-- bronto
_______________________________________________
questions mailing list
questions at lists.ntp.org
http://lists.ntp.org/listinfo/questions


From hmurray at megapathdsl.net Tue Apr 4 09:34:22 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Tue, 04 Apr 2017 02:34:22 -0700
Subject: Current HEAD is broken
Message-ID: <20170404093422.BFBA5406060@ip-64-139-1-69.sjc.megapath.net>

I just pushed a fix for a couple of "#if DEBUG" that should have been ifdef.
I think they have been there for ages, but a recent tweak I made to
test/option-tester.sh found them.  (compile refclocks with --disable-debug)

isc/net.h is in libisc/include/

[147/196] Compiling tests/common/tests_main.c
[148/196] Compiling tests/common/caltime.c
In file included from ../../include/ntp_stdlib.h:14:0,
                 from ../../tests/libparse/ieee754io.c:2:
../../include/ntp_net.h:13:21: fatal error: isc/net.h: No such file or directory
 #include <isc/net.h>
                     ^
compilation terminated.

In file included from ../../include/ntp_stdlib.h:14:0,
                 from ../../tests/common/caltime.c:5:
../../include/ntp_net.h:13:21: fatal error: isc/net.h: No such file or directory
 #include <isc/net.h>
                     ^
compilation terminated.

In file included from ../../include/ntp_stdlib.h:14:0,
                 from ../../tests/common/tests_main.h:9,
                 from ../../tests/common/tests_main.c:3:
../../include/ntp_net.h:13:21: fatal error: isc/net.h: No such file or directory
 #include <isc/net.h>
                     ^
compilation terminated.

----------------

>From early on, but it keeps going and stuff scrolls off my screen.
EEE.E.EE
======================================================================
ERROR: test_filtcooker (__main__.TestPylibUtilMethods)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/murray/ntpsec/raw/tests/pylib/test_util.py", line 82, in test_filtcooker
    self.assertEqual(ntp.util.filtcooker(
AttributeError: 'module' object has no attribute 'filtcooker'

======================================================================
ERROR: test_formatdigitsplit (__main__.TestPylibUtilMethods)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/murray/ntpsec/raw/tests/pylib/test_util.py", line 88, in test_formatdigitsplit
    self.assertEqual(ntp.util.formatdigitsplit(10.0, 5, 9),
AttributeError: 'module' object has no attribute 'formatdigitsplit'

and lots more

Ran 8 tests in 0.001s

FAILED (errors=6)


From gem at rellim.com Wed Apr 5 00:06:20 2017
From: gem at rellim.com (Gary E. Miller)
Date: Tue, 4 Apr 2017 17:06:20 -0700
Subject: Current HEAD is broken
In-Reply-To: <20170404093422.BFBA5406060@ip-64-139-1-69.sjc.megapath.net>
References: <20170404093422.BFBA5406060@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170404170620.3581a668@spidey.rellim.com>

Yo Hal!

On Tue, 04 Apr 2017 02:34:22 -0700
Hal Murray wrote:

> I just pushed a fix for a couple of "#if DEBUG" that should have been
> ifdef.  I think they have been there for ages, but a recent tweak I
> made to test/option-tester.sh
> found them.  (compile refclocks with --disable-debug)

Sorry.  Fixed.  The new refclock test does not work well with no refclocks.

I should not push late at night...

RGDS
GARY


From hmurray at megapathdsl.net Wed Apr 5 08:47:57 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 05 Apr 2017 01:47:57 -0700
Subject: Current HEAD is broken
Message-ID: <20170405084757.8224F40605C@ip-64-139-1-69.sjc.megapath.net>

> Sorry.  Fixed.  The new refclock test does not work well with no refclocks.

I fixed a few more sign-conversion warnings.  Please check.

On Fedora, all I see now are the 2 from the parser.

That was a 64 bit Fedora.  I get lots of them on a 32 bit system.  But not
the parser errors.  :)

> I should not push late at night...

Why didn't buildbot catch that one?

Is it reasonable to tweak buildbot each time something sneaks through?

Is there a web page that describes buildbot?  If not, I'll make it if you
feed me the info.


From ghane0 at gmail.com Wed Apr 5 09:52:19 2017
From: ghane0 at gmail.com (Sanjeev Gupta)
Date: Wed, 5 Apr 2017 17:52:19 +0800
Subject: Current HEAD is broken
In-Reply-To: <20170405084757.8224F40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170405084757.8224F40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID:

On Wed, Apr 5, 2017 at 4:47 PM, Hal Murray wrote:

>
> Is there a web page that describes buildbot?  If not, I'll make it if you
> feed me the info.

And shift buildbot web master to the new server :-)

--
Sanjeev Gupta
+65 98551208     http://www.linkedin.com/in/ghane


From gem at rellim.com Wed Apr 5 18:44:37 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 5 Apr 2017 11:44:37 -0700
Subject: Current HEAD is broken
In-Reply-To: <20170405084757.8224F40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170405084757.8224F40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170405114437.43993368@spidey.rellim.com>

Yo Hal!

On Wed, 05 Apr 2017 01:47:57 -0700
Hal Murray wrote:

> > Sorry.  Fixed.  The new refclock test does not work well with no
> > refclocks.
>
> I fixed a few more sign-conversion warnings.  Please check.
>
> On Fedora, all I see now are the 2 from the parser.
>
> That was a 64 bit Fedora.  I get lots of them on a 32 bit system.
> But not the parser errors.  :)

And Matt Selsky reports lots of warnings on Solaris.  Seems like Solaris
is not very POSIX...  I'm gonna remove the sign-conversion check, no way
to make it silent for all targets.

> > I should not push late at night...
>
> Why didn't buildbot catch that one?

Buildbot did, I didn't.

> Is it reasonable to tweak buildbot each time something sneaks through?

Lost me?

> Is there a web page that describes buildbot?  If not, I'll make it if
> you feed me the info.

http://buildbot.net/

Our implementation is on the NTPsec admin wiki.

RGDS
GARY


From gem at rellim.com Wed Apr 5 18:46:33 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 5 Apr 2017 11:46:33 -0700
Subject: Current HEAD is broken
In-Reply-To:
References: <20170405084757.8224F40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170405114633.3efd11b9@spidey.rellim.com>

Yo Sanjeev!
On Wed, 5 Apr 2017 17:52:19 +0800
Sanjeev Gupta wrote:

> On Wed, Apr 5, 2017 at 4:47 PM, Hal Murray
> wrote:
>
> >
> > Is there a web page that describes buildbot?  If not, I'll make it
> > if you feed me the info.
>
>
> And shift buildbot web master to the new server :-)

I'm still asking for volunteers to help on that.

RGDS
GARY


From hmurray at megapathdsl.net Wed Apr 5 20:57:59 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 05 Apr 2017 13:57:59 -0700
Subject: Current HEAD is broken
Message-ID: <20170405205759.9C8C640605C@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
> And Matt Selsky reports lots of warnings on Solaris.  Seems like Solaris is
> not very POSIX...  I'm gonna remove the sign-conversion check, no way to
> make it silent for all targets.

You could make it conditional on the systems where it works.

Or make it a configure option so we can easily work on fixing things.

gem at rellim.com said:
>> Why didn't buildbot catch that one?
> Buildbot did, I didn't.

I was expecting buildbot to reject commits if they didn't build cleanly.

>> Is it reasonable to tweak buildbot each time something sneaks through?
> Lost me?

If buildbot was supposed to reject non-clean commits but missed something, is
it practical to add something to buildbot to catch that case?


From gem at rellim.com Wed Apr 5 21:11:59 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 5 Apr 2017 14:11:59 -0700
Subject: Current HEAD is broken
In-Reply-To: <20170405205759.9C8C640605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170405205759.9C8C640605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170405141159.7c3b4d25@spidey.rellim.com>

Yo Hal!

On Wed, 05 Apr 2017 13:57:59 -0700
Hal Murray wrote:

> gem at rellim.com said:
> > And Matt Selsky reports lots of warnings on Solaris.  Seems like
> > Solaris is not very POSIX...  I'm gonna remove the sign-conversion
> > check, no way to make it silent for all targets.
>
> You could make it conditional on the systems where it works.

There used to be a lot of OS conditionals in the waf; most got ripped out
as host detection got better.

In the specific case of -Wsign-conversion, it illuminates a bug in
Bison, so prolly not a good idea to have on by default.  It will just
annoy and confuse people.

> Or make it a configure option so we can easily work on fixing things.

I'm thinking -Wsign-comparison would be good with 'waf --enable-debug',
but not for normal builds.  The debug part of waf needs work, in many
places.

> >> Is it reasonable to tweak buildbot each time something sneaks
> >> through?
> > Lost me?
>
> If buildbot was supposed to reject non-clean commits but missed
> something, is it practical to add something to buildbot to catch that
> case?

Define clean?  If something does not build, then buildbot fails.
Otherwise it passes.  There is the known bug that test failures do not
cause buildbot to fail.  I think someone already filed an issue on it.

RGDS
GARY


From gem at rellim.com Thu Apr 6 00:58:39 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 5 Apr 2017 17:58:39 -0700
Subject: ✘units mode
Message-ID: <20170405175839.7e789a65@spidey.rellim.com>

Yo Ian!

I gotta dig real deep to find unit issues now.

There are problems in ntpmon detail mode:

filtdelay  = 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 0.00000 ms
filtoffset = 0.00000 -0.00000 -0.00000 -0.00000 -0.00000 -0.00000 -0.00000 -0.0000 ms
filtdisp   = 0.000 0.000 0.000 0.000 0.000 0.001 0.001 0.001 us

A lot of zeros of ms, minus zero, etc.

RGDS
GARY


From hmurray at megapathdsl.net Thu Apr 6 02:29:49 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 05 Apr 2017 19:29:49 -0700
Subject: Current HEAD is broken
Message-ID: <20170406022949.92A8540605C@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
> I'm thinking -Wsign-comparison would be good with 'waf --enable-debug', but
> not for normal builds.

--enable-debug is the default.  I think that makes it normal.

I think another option to turn on -Wsign-comparison would be handy.  If
nothing else, how about --enable-sign-comparison?

gem at rellim.com said:
> Define clean?  If something does not build, then buildbot fails.
> Otherwise it passes.  There is the known bug that test failures do not
> cause buildbot to fail.  I think someone already filed an issue on it.
I'd be happy to have warnings included in "fail".  The idea is to reduce
clutter.


From gem at rellim.com Thu Apr 6 06:56:09 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 5 Apr 2017 23:56:09 -0700
Subject: Current HEAD is broken
In-Reply-To: <20170406022949.92A8540605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170406022949.92A8540605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170405235609.32e6e1c9@spidey.rellim.com>

Yo Hal!

On Wed, 05 Apr 2017 19:29:49 -0700
Hal Murray wrote:

> gem at rellim.com said:
> > I'm thinking -Wsign-comparison would be good with 'waf
> > --enable-debug', but not for normal builds.
>
> --enable-debug is the default.  I think that makes it normal.

And as recently discussed on IRC, that is wrong and will be fixed soon.

> I think another option to turn on -Wsign-comparison would be handy.
> If nothing else, how about --enable-sign-comparison?

Gack.

> gem at rellim.com said:
> > Define clean?  If something does not build, then buildbot fails.
> > Otherwise it passes.  There is the known bug that test failures do
> > not cause buildbot to fail.  I think someone already filed an
> > issue on it.
>
> I'd be happy to have warnings included in "fail".  The idea is to
> reduce clutter.

I don't see how that can work, warnings are by definition not failures.

RGDS
GARY


From hmurray at megapathdsl.net Thu Apr 6 07:14:48 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 06 Apr 2017 00:14:48 -0700
Subject: Current HEAD is broken
Message-ID: <20170406071448.A556240605C@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
>> I'd be happy to have warnings included in "fail".  The idea is to
>> reduce clutter.
> I don't see how that can work, warnings are by definition not failures.

Do you have a correct word?  I assume the goal is to have build work without
warnings.  Thus if it does generate any warnings, something needs fixing and
things will be cleaner if the attempted commit is rejected and fixed rather
than being accepted and followed by another commit to fix the warning.

My local scripts grep for "warning".

I think there is an issue open on the bug that waf check gets errors but
doesn't "fail" in the sense of returning a non-zero return code.

-------

>> --enable-debug is the default.  I think that makes it normal.
> And as recently discussed on IRC, that is wrong and will be fixed soon.

Would somebody please summarize.  Changing the default on building with
debugging seems important enough that it should be mentioned (and archived)
here rather than only on IRC where lots of people will miss it.


From gem at rellim.com Thu Apr 6 19:04:50 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 6 Apr 2017 12:04:50 -0700
Subject: Current HEAD is broken
In-Reply-To: <20170406071448.A556240605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170406071448.A556240605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170406120450.6402dc9d@spidey.rellim.com>

Yo Hal!

On Thu, 06 Apr 2017 00:14:48 -0700
Hal Murray wrote:

> gem at rellim.com said:
> >> I'd be happy to have warnings included in "fail".  The idea is to
> >> reduce clutter.
> > I don't see how that can work, warnings are by definition not
> > failures.
>
> Do you have a correct word?  I assume the goal is to have build work
> without warnings.

Given the number of C compilers we work on that is not possible given our
current staffing.  Just to start:

1. Get Bison to fix their upstream bugs.
2. Add checks, and matching code, for several dozen Solaris quirks.
3. Add checks, and matching code, for several dozen OpenBSD quirks.

> Thus if it does generate any warnings, something
> needs fixing and things will be cleaner if the attempted commit is
> rejected and fixed rather than being accepted and followed by another
> commit to fix the warning.

I'd be thrilled if you wanted to take on that project.

> My local scripts grep for "warning".

Easy to make it work on one OS, the trick is getting all the partial POSIX
OS to work.

> I think there is an issue open on the bug that waf check gets errors
> but doesn't "fail" in the sense of returning a non-zero return code.

Different bug.  The bug is that when the tests in the tests/ folder fail
that waf does not fail.  That is high on my list to fix.

> >> --enable-debug is the default.  I think that makes it normal.
> > And as recently discussed on IRC, that is wrong and will be fixed
> > soon.
>
> Would somebody please summarize.  Changing the default on building
> with debugging seems important enough that it should be mentioned
> (and archived) here rather than only on IRC where lots of people will
> miss it.

Already been discussed on IRC and in email.  To summarize, it is not
appropriate that a 1.0 version has debugging on by default.  Debugging
means not stripping the binaries, adding in debug flags, etc.  Pretty
much by definition, debugging code is code you do not want in a
production binary.

This is also high on my list to fix.  The vast majority of users will want
debugging off, the few that want it can turn it on.

RGDS
GARY


From hmurray at megapathdsl.net Sat Apr 8 19:59:53 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 08 Apr 2017 12:59:53 -0700
Subject: warnings, attic/sht.c
Message-ID: <20170408195953.6884B406063@ip-64-139-1-69.sjc.megapath.net>

I just pushed fixes for a couple of warnings.

If buildbot won't complain about warnings, I'd like to encourage people to at
least run tests/option_tester.sh on their local setup before pushing cleanup
type fixes that are likely to break cases you don't normally test.  That's
why I wrote it.

-------

attic/sht.c now complains:
../../attic/sht.c: In function ‘main’:
../../attic/sht.c:165:20: warning: implicit declaration of function ‘ntp_random’ [-Wimplicit-function-declaration]
    rcv_frc = (uint)ntp_random() % 1000000000u;
                    ^~~~~~~~~~
It used to include ntp_random.h

I don't know what the right fix is.  Is sht supposed to be stand-alone?  If
so, we should fix it to use some other randomness and not link with our
libraries.  If not, it should include ntp.h.


From gem at rellim.com Sat Apr 8 20:11:16 2017
From: gem at rellim.com (Gary E. Miller)
Date: Sat, 8 Apr 2017 13:11:16 -0700
Subject: warnings, attic/sht.c
In-Reply-To: <20170408195953.6884B406063@ip-64-139-1-69.sjc.megapath.net>
References: <20170408195953.6884B406063@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170408131116.61441efa@spidey.rellim.com>

Yo Hal!

On Sat, 08 Apr 2017 12:59:53 -0700
Hal Murray wrote:

> I just pushed fixes for a couple of warnings.

I do not understand this one:

Cool, thanks.
> If buildbot won't complain about warnings,

buildbot shows the warnings, it just does not fail on warning.  Prolly
need to make buildbot a bit more vocal, but failing on warnings would
be bad.

> I'd like to encourage
> people to at least run
> tests/option_tester.sh on their local setup before pushing cleanup
> type fixes that are likely to break cases you don't normally test.
> That's why I wrote it.

Only 800 more warning messages to fix...

> attic/sht.c now complains:
> ../../attic/sht.c: In function ‘main’:
> ../../attic/sht.c:165:20: warning: implicit declaration of function
> ‘ntp_random’ [-Wimplicit-function-declaration]
>     rcv_frc = (uint)ntp_random() % 1000000000u;
>                     ^~~~~~~~~~
> It used to include ntp_random.h

Correct, and ntp_random.h is now gone.  The prototype for ntp_random()
is in ntp.h now.

> I don't know what the right fix is.

So the fix is to include ntp.h.  I thought I had already done that.

> Is sht supposed to be
> stand-alone?

Yes.

> If so, we should fix it to use some other randomness
> and not link with our libraries.

Dunno...

RGDS
GARY


From hmurray at megapathdsl.net Sat Apr 8 20:20:48 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 08 Apr 2017 13:20:48 -0700
Subject: warnings, attic/sht.c
Message-ID: <20170408202048.0B644406063@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
>> I don't know what the right fix is.
> So the fix is to include ntp.h.  I thought I had already done that.
>> Is sht supposed to be stand-alone?
> Yes.
Maybe I should have said build-alone rather than stand-alone.

It currently doesn't include ntp.h and only gets one warning (on my setup).
So if we change it to use some other randomness, we can remove the linking
with our libraries.

>> I just pushed fixes for a couple of warnings.
> I do not understand this one:
> Cool, thanks.

I don't understand what you don't understand.

After a git pull, I ran tests/option-tester.sh on a Fedora system.  It found
3 warnings.  I fixed 2 of them.  The 3rd is discussed above.

> Only 800 more warning messages to fix...

Are you using an optional flag or unpushed change to get those?  I'm only
seeing one warning, the one from sht.


From trv-n at comcast.net Mon Apr 10 00:57:10 2017
From: trv-n at comcast.net (trv-n at comcast.net)
Date: Sun, 09 Apr 2017 20:57:10 -0400
Subject: float-equal warnings
Message-ID:

In ntp_refclock.c function refclock_control near line 920 a test for a
nonzero value is made with the time1 and time2 fudges, which raises
float-equal warnings.

One way to fix this is to compare against DBL_MIN, but float.h would
need to be included.

Comparing against LOGTOD(-31) will fail for any value less than ~0.2ns
(the minimum l_fp fraction).

Comparing against S_PER_NS will fail for values less than 1ns.

I was going to submit a patch with LOGTOD() since it doesn't seem like
a user would bother to put a fudge smaller than that, but I thought
I'd ask first.


From gem at rellim.com Mon Apr 10 02:06:53 2017
From: gem at rellim.com (Gary E. Miller)
Date: Sun, 9 Apr 2017 19:06:53 -0700
Subject: float-equal warnings
In-Reply-To:
References:
Message-ID: <20170409190653.3d3f3f6c@spidey.rellim.com>

Yo trv-n at comcast.net!

On Sun, 09 Apr 2017 20:57:10 -0400
trv-n at comcast.net wrote:

> In ntp_refclock.c function refclock_control near line 920 a test for a
> nonzero value is made with the time1 and time2 fudges, which raises
> float-equal warnings.

Yup.
> One way to fix this is to compare against DBL_MIN, but float.h would
> need to be included.

Yup.

> Comparing against LOGTOD(-31) will fail for any value less than ~0.2ns
> (the minimum l_fp fraction).

Yup.  Except the result is a constant, and having to compute a log
every time on a constant is a bit wasteful.

> Comparing against S_PER_NS will fail for values less than 1ns.

Yup.  Or maybe S_PER_NS/10 in case the value might be rounded.

> I was going to submit a patch with LOGTOD() since it doesn't seem like
> a user would bother to put a fudge smaller than that, but I thought
> I'd ask first.

I agree, for most cases.  There are a few where 0.09 may be enough.

Start hacking, send a merge request or a patch.  We can iterate to
a solution.

RGDS
GARY


From hmurray at megapathdsl.net Mon Apr 10 17:43:44 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Mon, 10 Apr 2017 10:43:44 -0700
Subject: DNS cleanup
Message-ID: <20170410174344.D1E31406060@ip-64-139-1-69.sjc.megapath.net>

The basic idea is to simplify things.

In the new world, there is only one DNS thread.  To do a lookup, the main
thread starts a worker thread with a pointer to a peer structure.  The worker
thread returns the return code and answer in global storage and self
destructs.  Maybe it raises a signal to wake up the main thread.  When it
notices that the answer is ready, the main thread processes it including
freeing the answer.

Do we want a configure option to build/run without DNS lookups?
Even if we don't, we should probably structure the code to support that.

The pool code already scans the peer list and does DNS lookups. We can piggy
back on that. So we need a dummy peer block for the server case. The peer
structure already has room for a name - the pool case needs it. We'll have
to add a few flags.

The pool case makes another peer slot for each address it adds. For the
server case, there is only one address. I think we can reuse the same peer
block. It needs a flag to indicate that the IP Address is valid.

My plan is to delete ntp_intres, ntp_worker, and work_thread,
then add a new module to fill in the gaps.

Eric:
  close_all_except is in libntp/ntp_worker.c
  where should it live? Or can we get rid of it?
  It's only called from ntpdmain.

--
These are my opinions. I hate spam.

From gem at rellim.com Mon Apr 10 20:22:27 2017
From: gem at rellim.com (Gary E. Miller)
Date: Mon, 10 Apr 2017 13:22:27 -0700
Subject: float-equal warnings
In-Reply-To: <20170409190653.3d3f3f6c@spidey.rellim.com>
References: <20170409190653.3d3f3f6c@spidey.rellim.com>
Message-ID: <20170410132227.1179ba00@spidey.rellim.com>

Yo All!

Mark asked me to dig deeper.

On Sun, 9 Apr 2017 19:06:53 -0700
"Gary E. Miller" wrote:

> > Comparing against LOGTOD(-31) will fail for any value less than
> > ~0.2ns (the minimum l_fp fraction).
>
> Yup. Except the result is a constant, and having to compute a log
> every time on a constant is a bit wasteful.

So I made a test:

    #include <math.h>

    #define LOGTOD(a) ldexp(1., (int)(a)) /* log2 to double */

    int main( int argc, char ** argv)
    {
        return (int)LOGTOD(1e-9);
    }

    gcc -g test.c -o test
    gdb test
    disassemble main:

    Dump of assembler code for function main:
       0x00000000004004b6 <+0>:     push   %rbp
       0x00000000004004b7 <+1>:     mov    %rsp,%rbp
       0x00000000004004ba <+4>:     mov    %edi,-0x4(%rbp)
       0x00000000004004bd <+7>:     mov    %rsi,-0x10(%rbp)
       0x00000000004004c1 <+11>:    mov    $0x1,%eax
       0x00000000004004c6 <+16>:    pop    %rbp
       0x00000000004004c7 <+17>:    retq
    End of assembler dump.
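A side note on the test above, offered as a sketch and not as the patch that was eventually submitted: the macro casts its argument to int, so LOGTOD(1e-9) is ldexp(1., 0), that is 1.0, which matches the $0x1 loaded into %eax in the dump. The threshold discussed in this thread is LOGTOD(-31), which is 2^-31 s, about 0.47 ns. The helper name fudge_is_set below is hypothetical and is not an NTPsec function; it only illustrates replacing a "!= 0.0" test on a double with a magnitude comparison.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Same shape as the LOGTOD macro in the test above: 2^a as a double. */
#define LOGTOD(a) ldexp(1., (int)(a))

/* Hypothetical helper, not part of NTPsec: treat a fudge value as "set"
 * only if its magnitude exceeds 2^-31 s (about 0.47 ns). */
static bool fudge_is_set(double fudge)
{
    /* A NaN would simply compare false below; reject it explicitly. */
    assert(!isnan(fudge));
    return fabs(fudge) > LOGTOD(-31);
}
```

With this shape there is no == or != applied to a double, so -Wfloat-equal has nothing to flag. The trade-off is the one already raised in the thread: any fudge smaller in magnitude than the threshold is treated as unset.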
So I'm shocked, gcc made LOGTOD(1e-9) a simple binary constant.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From esr at thyrsus.com Mon Apr 10 23:00:28 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 10 Apr 2017 19:00:28 -0400
Subject: DNS cleanup
In-Reply-To: <20170410174344.D1E31406060@ip-64-139-1-69.sjc.megapath.net>
References: <20170410174344.D1E31406060@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170410230028.GB18373@thyrsus.com>

Hal Murray :
> Do we want a configure option to build/run without DNS lookups? Even if we
> don't, we should probably structure the code to support that.

Agreed with the second. But we've got enough obscure seldom-used options
that I'm strongly against adding another exposed one without demonstrated
need.

> My plan is to delete ntp_intres, ntp_worker, and work_thread,
> then add a new module to fill in the gaps.

I like that plan.

> Eric:
> close_all_except is in libntp/ntp_worker.c
> where should it live? Or can we get rid of it?
> It's only called from ntpdmain.

Move it to ntpdmain.c for now, then. It's possible we can get rid of it,
but I'd rather not have that change be entangled with your refactor.

Be aware that at some point I would still be interested in replacing
our internal async-lookup code with calls to c_ares or equivalent, on
the general principle that it's better when specialty code unrelated
to our business logic is somebody else's maintenance problem.

However, *don't let that stop you*. Your change might alter the
tradeoffs, making plugging in c_ares unnecessary.
If not, I have no doubt your change will make it easier. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Mon Apr 10 23:15:08 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 10 Apr 2017 16:15:08 -0700 Subject: TEST fails on NetBSD Message-ID: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> Anybody recognize this? TEST(lfpfunc, FDF_RoundTrip)../../tests/libntp/lfpfunc.c:264:TEST(lfpfunc, FDF_R oundTrip):FAIL: Values Not Within Delta Inserting a printf: double d = lfptod(temp); printf("## %f %f\n", eps(op2), d); TEST_ASSERT_DOUBLE_WITHIN(eps(op2), 0.0, fabs(d)); Now I get: TEST(lfpfunc, FDF_RoundTrip)## 0.000000 0.000000 ## 0.000000 0.000000 ## 0.000000 0.000000 ## 0.000000 0.000000 ## 0.000000 2147482624.000000 ../../tests/libntp/lfpfunc.c:265:TEST(lfpfunc, FDF_RoundTrip):FAIL: Values Not W ithin Delta -- These are my opinions. I hate spam. From esr at thyrsus.com Mon Apr 10 23:51:06 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 10 Apr 2017 19:51:06 -0400 Subject: TEST fails on NetBSD In-Reply-To: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> References: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170410235106.GA20814@thyrsus.com> Hal Murray : > Anybody recognize this? 
> > TEST(lfpfunc, FDF_RoundTrip)../../tests/libntp/lfpfunc.c:264:TEST(lfpfunc, > FDF_R > oundTrip):FAIL: Values Not Within Delta > > Inserting a printf: > double d = lfptod(temp); > printf("## %f %f\n", eps(op2), d); > TEST_ASSERT_DOUBLE_WITHIN(eps(op2), 0.0, fabs(d)); > > Now I get: > TEST(lfpfunc, FDF_RoundTrip)## 0.000000 0.000000 > ## 0.000000 0.000000 > ## 0.000000 0.000000 > ## 0.000000 0.000000 > ## 0.000000 2147482624.000000 > ../../tests/libntp/lfpfunc.c:265:TEST(lfpfunc, FDF_RoundTrip):FAIL: Values > Not W > ithin Delta Good target for a bisect. I have a suspicion about what you'll find, but I'll shut up lest I mistakenly criticize the wrong dev. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 11 00:22:48 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 10 Apr 2017 17:22:48 -0700 Subject: TEST fails on NetBSD In-Reply-To: Message from "Eric S. Raymond" of "Mon, 10 Apr 2017 19:51:06 EDT." <20170410235106.GA20814@thyrsus.com> Message-ID: <20170411002248.80813406060@ip-64-139-1-69.sjc.megapath.net> -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Apr 11 00:24:00 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 10 Apr 2017 17:24:00 -0700 Subject: TEST fails on NetBSD In-Reply-To: Message from "Eric S. Raymond" of "Mon, 10 Apr 2017 19:51:06 EDT." <20170410235106.GA20814@thyrsus.com> Message-ID: <20170411002400.37204406060@ip-64-139-1-69.sjc.megapath.net> > Good target for a bisect. Anybody else have access to a NetBSD system? I'd rather work on DNS. -- These are my opinions. I hate spam. 
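For anyone picking up the bisect: judging by its name and the TEST_ASSERT_DOUBLE_WITHIN call quoted above, the failing test checks that a value survives a conversion round trip within a small tolerance. The sketch below shows that kind of check on a simplified signed 32.32 fixed-point type. The type and function names are made up for illustration and are not the l_fp, lfptod() or eps() code in the NTPsec tree, so it says nothing about the cause of the NetBSD failure.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for an l_fp-style value: a signed 64-bit count
 * of units of 2^-32 seconds. */
typedef int64_t fixed32_32;

#define FRAC_SCALE 4294967296.0 /* 2^32 */

static fixed32_32 double_to_fixed(double d)
{
    /* Only values whose integer part fits in 32 signed bits. */
    assert(d > -2147483648.0 && d < 2147483648.0);
    double scaled = d * FRAC_SCALE;
    /* Round to nearest by biasing before the truncating cast. */
    return (fixed32_32)(scaled < 0.0 ? scaled - 0.5 : scaled + 0.5);
}

static double fixed_to_double(fixed32_32 f)
{
    return (double)f / FRAC_SCALE;
}

/* Absolute error after converting to fixed point and back. */
static double round_trip_error(double d)
{
    double diff = fixed_to_double(double_to_fixed(d)) - d;
    return diff < 0.0 ? -diff : diff;
}
```

In this simplified model an in-range input should come back within about one unit of 2^-32 s, so a reported difference on the order of 2^31, as in the output above, would point at something other than ordinary rounding.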
From hmurray at megapathdsl.net Tue Apr 11 06:13:54 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 10 Apr 2017 23:13:54 -0700 Subject: DNS cleanup - broadcast Message-ID: <20170411061354.5D17A406060@ip-64-139-1-69.sjc.megapath.net> I've started serious work on the DNS cleanup. It will probably take several of days - maybe longer. --- What's the current story on broadcast? What do we support? Does anybody test it? I know we ripped out a lot of that stuff, but it's still a keyword for the parser and still documented. T_Broadcast is only used in get_correct_host_mode() which turns it into MODE_BROADCAST, and that is only referenced in i_require_authentication() -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 11 12:24:33 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 11 Apr 2017 08:24:33 -0400 Subject: DNS cleanup - broadcast In-Reply-To: <20170411061354.5D17A406060@ip-64-139-1-69.sjc.megapath.net> References: <20170411061354.5D17A406060@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170411122433.GA16035@thyrsus.com> Hal Murray : > I've started serious work on the DNS cleanup. It will probably take several > of days - maybe longer. /me bows in Hal's direction. Not an easy job. Good luck. > What's the current story on broadcast? What do we support? Does anybody > test it? > > I know we ripped out a lot of that stuff, but it's still a keyword for the > parser and still documented. > > T_Broadcast is only used in get_correct_host_mode() which turns it into > MODE_BROADCAST, and that is only referenced in i_require_authentication() This is what the feature-change list says: * Broadcast- and multicast client modes, which are impossible to secure, have been removed. Broadcast (but not multicast) service can still be enabled, though this is a deprecated and unsupported mode of operation and may be entirely removed in a future release. I don't know if the remaining broadcast support has been tested. -- Eric S. 
Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From jason at azze.org Tue Apr 11 12:31:43 2017 From: jason at azze.org (Jason Azze) Date: Tue, 11 Apr 2017 08:31:43 -0400 Subject: Test failing on CentOS 6 32-bit Message-ID: When Matt Selsky made the commit "Make sure to fail on test failures", my very next automated build on CentOS 6 32-bit failed thusly... tests that fail 1/3 stdout: Unity test run 1 of 1 TEST(ieee754io, test_zero32)../../tests/libparse/ieee754io.c:44:TEST(ieee754io, test_zero32):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_one32)../../tests/libparse/ieee754io.c:56:TEST(ieee754io, test_one32):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_negone32)../../tests/libparse/ieee754io.c:68:TEST(ieee754io, test_negone32):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_small32)../../tests/libparse/ieee754io.c:159:TEST(ieee754io, test_small32):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_nan32)../../tests/libparse/ieee754io.c:82:TEST(ieee754io, test_nan32):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_max32)../../tests/libparse/ieee754io.c:116:TEST(ieee754io, test_max32):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_order32)../../tests/libparse/ieee754io.c:139:TEST(ieee754io, test_order32):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_zero64)../../tests/libparse/ieee754io.c:172:TEST(ieee754io, test_zero64):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_one64)../../tests/libparse/ieee754io.c:184:TEST(ieee754io, test_one64):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_negone64)../../tests/libparse/ieee754io.c:196:TEST(ieee754io, test_negone64):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_small64)../../tests/libparse/ieee754io.c:308:TEST(ieee754io, test_small64):FAIL: Unity 64-bit Support 
Disabled TEST(ieee754io, test_nan64)../../tests/libparse/ieee754io.c:210:TEST(ieee754io, test_nan64):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_max64)../../tests/libparse/ieee754io.c:265:TEST(ieee754io, test_max64):FAIL: Unity 64-bit Support Disabled TEST(ieee754io, test_order64)../../tests/libparse/ieee754io.c:288:TEST(ieee754io, test_order64):FAIL: Unity 64-bit Support Disabled From ghane0 at gmail.com Tue Apr 11 12:43:35 2017 From: ghane0 at gmail.com (Sanjeev Gupta) Date: Tue, 11 Apr 2017 20:43:35 +0800 Subject: DNS cleanup - broadcast In-Reply-To: <20170411061354.5D17A406060@ip-64-139-1-69.sjc.megapath.net> References: <20170411061354.5D17A406060@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Tue, Apr 11, 2017 at 2:13 PM, Hal Murray wrote: > > What's the current story on broadcast? What do we support? Does anybody > test it? Hal, I asked about this (I was cleaning documentation) in Dec 2016. You mentioned it might be used by the pool directive. See: https://lists.ntpsec.org/pipermail/devel/2016-December/002877.html -- Sanjeev Gupta +65 98551208 http://www.linkedin.com/in/ghane -------------- next part -------------- An HTML attachment was scrubbed... URL: From gem at rellim.com Tue Apr 11 18:51:11 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 11 Apr 2017 11:51:11 -0700 Subject: TEST fails on NetBSD In-Reply-To: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> References: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170411115111.78ad7c40@spidey.rellim.com> Yo Hal! The last commit I can build is 1b47ed72aedcc8f2e39ae198a0ea26a565ca6a15 and that one works for me. Whoever broke head this morning, please fix it. On Mon, 10 Apr 2017 16:15:08 -0700 Hal Murray wrote: > Anybody recognize this? 
> > TEST(lfpfunc, > FDF_RoundTrip)../../tests/libntp/lfpfunc.c:264:TEST(lfpfunc, FDF_R > oundTrip):FAIL: Values Not Within Delta > > Inserting a printf: > double d = lfptod(temp); > printf("## %f %f\n", eps(op2), d); > TEST_ASSERT_DOUBLE_WITHIN(eps(op2), 0.0, fabs(d)); > > Now I get: > TEST(lfpfunc, FDF_RoundTrip)## 0.000000 0.000000 > ## 0.000000 0.000000 > ## 0.000000 0.000000 > ## 0.000000 0.000000 > ## 0.000000 2147482624.000000 > ../../tests/libntp/lfpfunc.c:265:TEST(lfpfunc, FDF_RoundTrip):FAIL: > Values Not W > ithin Delta > > > RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Tue Apr 11 18:55:28 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 11 Apr 2017 11:55:28 -0700 Subject: TEST fails on NetBSD In-Reply-To: <20170411115111.78ad7c40@spidey.rellim.com> References: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> <20170411115111.78ad7c40@spidey.rellim.com> Message-ID: <20170411115528.2cd9498d@spidey.rellim.com> Yo Hal! > The last commit I can build is > 1b47ed72aedcc8f2e39ae198a0ea26a565ca6a15 and that one works for me. Buildbot says this commit broke NetBSD: b298cca81dff03f7d4105a615ca14697cee95dc3 But actually, that just made some warnings into failures, so the bug has likely been around a long time. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." 
- Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Tue Apr 11 21:03:32 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 11 Apr 2017 17:03:32 -0400 Subject: TEST fails on NetBSD In-Reply-To: <20170411115111.78ad7c40@spidey.rellim.com> References: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> <20170411115111.78ad7c40@spidey.rellim.com> Message-ID: <20170411210332.GA4881@thyrsus.com> Gary E. Miller : > Yo Hal! > > The last commit I can build is 1b47ed72aedcc8f2e39ae198a0ea26a565ca6a15 > and that one works for me. > > Whoever broke head this morning, please fix it. There ain't much there there. commit 8b13059e4936e7424062f0b69737b3ecbd96317e Author: Eric S. Raymond Date: Tue Apr 11 15:41:32 2017 -0400 Revert "easier-to-use conditional warning flags" It broke the configure logic. commit 5e95e9186322284845ffa8c6c3c25bb079b046c0 Author: Ian Bruene Date: Tue Apr 11 09:32:52 2017 -0500 Added string based formatters and associated tests. commit 40c17222243fe5a039ada4bc488300e1151ae8f7 Author: Matt Selsky Date: Tue Apr 11 10:14:35 2017 -0400 Make sure NTP_API is defined before doing comparison tests Avoids these warnings: warning: "NTP_API" is not defined [-Wundef] commit fa68a88752bb8e782cf19fb6e1bd378d0b803f21 Author: Trevor N Date: Tue Apr 11 01:40:54 2017 -0400 easier-to-use conditional warning flags I noticed that more flags were added into the check list, but valid flags are added to CFLAGS in a second step far away from the list and the second step was skipped. Using a loop to add the flags will prevent this from happening. commit 1b47ed72aedcc8f2e39ae198a0ea26a565ca6a15 Author: Ian Bruene Date: Mon Apr 10 22:57:13 2017 -0500 Added raw return mode to mode 6 parser You report it working at Ian's commit and the one after that was reverted. Bisect? -- Eric S. 
Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From gem at rellim.com Tue Apr 11 21:05:49 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 11 Apr 2017 14:05:49 -0700 Subject: TEST fails on NetBSD In-Reply-To: <20170411210332.GA4881@thyrsus.com> References: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> <20170411115111.78ad7c40@spidey.rellim.com> <20170411210332.GA4881@thyrsus.com> Message-ID: <20170411140549.6367baba@spidey.rellim.com> Yo Eric! It was the Trevor one. Not sure why our commit id's differ?? On Tue, 11 Apr 2017 17:03:32 -0400 "Eric S. Raymond" wrote: > Gary E. Miller : > > Yo Hal! > > > > The last commit I can build is > > 1b47ed72aedcc8f2e39ae198a0ea26a565ca6a15 and that one works for me. > > > > Whoever broke head this morning, please fix it. > > There ain't much there there. > > commit 8b13059e4936e7424062f0b69737b3ecbd96317e > Author: Eric S. Raymond > Date: Tue Apr 11 15:41:32 2017 -0400 > > Revert "easier-to-use conditional warning flags" > > It broke the configure logic. > > commit 5e95e9186322284845ffa8c6c3c25bb079b046c0 > Author: Ian Bruene > Date: Tue Apr 11 09:32:52 2017 -0500 > > Added string based formatters and associated tests. 
> > commit 40c17222243fe5a039ada4bc488300e1151ae8f7 > Author: Matt Selsky > Date: Tue Apr 11 10:14:35 2017 -0400 > > Make sure NTP_API is defined before doing comparison tests > > Avoids these warnings: > warning: "NTP_API" is not defined [-Wundef] > > commit fa68a88752bb8e782cf19fb6e1bd378d0b803f21 > Author: Trevor N > Date: Tue Apr 11 01:40:54 2017 -0400 > > easier-to-use conditional warning flags > > I noticed that more flags were added into the check list, but > valid flags are added to CFLAGS in a second step far away from the > list and the second step was skipped. Using a loop to add the flags > will prevent this from happening. > > commit 1b47ed72aedcc8f2e39ae198a0ea26a565ca6a15 > Author: Ian Bruene > Date: Mon Apr 10 22:57:13 2017 -0500 > > Added raw return mode to mode 6 parser > > You report it working at Ian's commit and the one after that was > reverted. Bisect? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Wed Apr 12 00:04:52 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 11 Apr 2017 17:04:52 -0700 Subject: How it works... Message-ID: <20170412000452.4D2D740605C@ip-64-139-1-69.sjc.megapath.net> Some of this should have been obvious, but ... In the process of trying to understand some pool stuff, I came to some leftover multicast stuff. When the pool code gets an IP address, it doesn't setup a peer slot. It just sends a normal client request packet. 
When the answer returns, the receive routine decides it's a manycast packet and hands it off to handle_manycast() which makes the peer block. The key to that decision is a table lookup in findpeer(). It uses MATCH_ASSOC() to look in AM which is two dimensional. One is the type of packet. The other is the local state that I haven't sorted out yet. There is a hash table on remote IP address to find the peer block. Mumble. It's working now. We can clean things up when I get the rest of the DNS stuff over the hump. I think the right thing to do is make the peer block early on and throw them away if there isn't any response. -- These are my opinions. I hate spam. From gem at rellim.com Wed Apr 12 01:17:27 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 11 Apr 2017 18:17:27 -0700 Subject: Added "tai" variable to units display, fixed lurking bug in ntpmon. Message-ID: <20170411181727.638adbee@spidey.rellim.com> Yo Ian! > Subject: [Git][NTPsec/ntpsec][master] Added "tai" variable to units > display, fixed lurking bug in ntpmon. > > Note that tai has not been properly tested as I have never seen it show > up on my system in the first place. However, it uses the same system as > all the other variables that display units. I can not test this as ntpmon now crashes before i can get to the ntpmon screen with tai on it. Easy way to crash: # ntpmon -u (press 'd' key) (press [down arrow] key) Boom: # ntpmon -u Traceback (most recent call last): File "/usr/local/bin/ntpmon", line 309, in stdscr.addstr(strconvert.encode('UTF-8')) UnicodeDecodeError: 'ascii' codec can't decode byte 0xae in position 60: ordinal not in range(128) RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." 
- Lord Kelvin

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
An HTML attachment was scrubbed...
URL:
-------------- next part --------------
_______________________________________________
vc mailing list
vc at ntpsec.org
http://lists.ntpsec.org/mailman/listinfo/vc
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From gem at rellim.com Wed Apr 12 01:32:38 2017
From: gem at rellim.com (Gary E. Miller)
Date: Tue, 11 Apr 2017 18:32:38 -0700
Subject: ✘tests test_util.py
Message-ID: <20170411183238.5c8a5b5f@spidey.rellim.com>

Yo Ian!

I'm confused by some of the new tests. Like this one:

    self.assertEqual(f("1.234", nu.UNITS_SEC, nu.UNIT_MS), " 1.234ms")

Does that mean the input is 1.234s and the output should be 1.234ms? If
not, the function is not very obvious.

Why is there UNITS_xxx and UNIT_xxx??? Like here:

    u.UNITS_SEC, nu.UNIT_NS

And what is this: UNITS_PPX ??

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From esr at thyrsus.com Wed Apr 12 02:23:54 2017
From: esr at thyrsus.com (Eric S.
Raymond) Date: Tue, 11 Apr 2017 22:23:54 -0400 Subject: TEST fails on NetBSD In-Reply-To: <20170411140549.6367baba@spidey.rellim.com> References: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> <20170411115111.78ad7c40@spidey.rellim.com> <20170411210332.GA4881@thyrsus.com> <20170411140549.6367baba@spidey.rellim.com> Message-ID: <20170412022354.GA10908@thyrsus.com> Gary E. Miller : > It was the Trevor one. Which is now reverted. >Not sure why our commit id's differ?? That's not good. I'll re-clone; you should too. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From gem at rellim.com Wed Apr 12 04:24:14 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 11 Apr 2017 21:24:14 -0700 Subject: TEST fails on NetBSD In-Reply-To: <20170412022354.GA10908@thyrsus.com> References: <20170410231508.10506406060@ip-64-139-1-69.sjc.megapath.net> <20170411115111.78ad7c40@spidey.rellim.com> <20170411210332.GA4881@thyrsus.com> <20170411140549.6367baba@spidey.rellim.com> <20170412022354.GA10908@thyrsus.com> Message-ID: <20170411212414.1dc9f691@spidey.rellim.com> Yo Eric! On Tue, 11 Apr 2017 22:23:54 -0400 "Eric S. Raymond" wrote: > Gary E. Miller : > > It was the Trevor one. > > Which is now reverted. > > >Not sure why our commit id's differ?? > > That's not good. I'll re-clone; you should too. 1st thing I always do. And did before sending you the commit id. The commit ID I sent was from 'git log -p' while I was on git head. RGDS GARY --------------------------------------------------------------------------- Gary E. 
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Wed Apr 12 06:49:11 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 11 Apr 2017 23:49:11 -0700 Subject: DNS cleanup In-Reply-To: Message from "Eric S. Raymond" of "Mon, 10 Apr 2017 19:00:28 EDT." <20170410230028.GB18373@thyrsus.com> Message-ID: <20170412064911.75181406060@ip-64-139-1-69.sjc.megapath.net> The pool stuff is working. There are probably lots of quirks/bugs, but the general structure seems right. I did the pool stuff first because I found the code I needed while I was cleaning out the old stuff. Normal host mode shouldn't be too hard. esr at thyrsus.com said: > Be aware that at some point I would still be interested in replacing our > internal async-lookup code with calls to c_ares or equivalent, on the > general principle that it's better when specialty code unrelated to our > business logic is somebody else's maintainance problem. One possible reason to use something else is to get the TTL from the response. There is no way to get that with the normal API. Other than that, I'll be real surprised if you want to drag in another dependency. I think you said that c_ares uses a callback. To fit that into the current/new structure, all we have to do is have the called-back routine save or copy the data, then send a signal to the main thread. The main thread will then call in to do the real callbacks in the right context. -- These are my opinions. I hate spam. From esr at thyrsus.com Wed Apr 12 09:34:08 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 12 Apr 2017 05:34:08 -0400 Subject: DNS cleanup In-Reply-To: <20170412064911.75181406060@ip-64-139-1-69.sjc.megapath.net> References: <20170410230028.GB18373@thyrsus.com> <20170412064911.75181406060@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170412093408.GB30192@thyrsus.com> Hal Murray : > esr at thyrsus.com said: > > Be aware that at some point I would still be interested in replacing our > > internal async-lookup code with calls to c_ares or equivalent, on the > > general principle that it's better when specialty code unrelated to our > > business logic is somebody else's maintainance problem. > > One possible reason to use something else is to get the TTL from the > response. There is no way to get that with the normal API. Pardon my ignorance, but what would we use the TTL for? > Other than that, I'll be real surprised if you want to drag in another > dependency. Don't think of it as a dependency, think of it as making specialized code that we don't want to maintain into somebody else's problem. :-) > I think you said that c_ares uses a callback. To fit that into the > current/new structure, all we have to do is have the called-back routine save > or copy the data, then send a signal to the main thread. The main thread > will then call in to do the real callbacks in the right context. Noted. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Wed Apr 12 11:25:01 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 12 Apr 2017 04:25:01 -0700 Subject: DNS cleanup In-Reply-To: Message from "Eric S. Raymond" of "Wed, 12 Apr 2017 05:34:08 EDT." <20170412093408.GB30192@thyrsus.com> Message-ID: <20170412112502.0027E406060@ip-64-139-1-69.sjc.megapath.net> > Pardon my ignorance, but what would we use the TTL for? 
It tells you when to retry the pool DNS so you will get new data. >From dig: ;; ANSWER SECTION: 0.fedora.pool.ntp.org. 150 IN A 74.82.59.150 0.fedora.pool.ntp.org. 150 IN A 104.131.53.252 0.fedora.pool.ntp.org. 150 IN A 45.127.112.2 0.fedora.pool.ntp.org. 150 IN A 208.53.158.34 The 150 is the TTL. Of course, a caching DNS server might drop them sooner. Then an earlier reload might get a different answer, but if all goes well, that's a good and polite time to wait. -- These are my opinions. I hate spam. From trv-n at comcast.net Thu Apr 13 04:40:54 2017 From: trv-n at comcast.net (Trevor N.) Date: Thu, 13 Apr 2017 00:40:54 -0400 Subject: TEST fails on NetBSD Message-ID: <520uec92m8hv6cgdosiaf1c5fh6gq9a2pd@4ax.com> Sorry about that, I didn't test the change well enough. Without this patch the warning flag added to the list needs to also be added down where the other conditional warning flags are added to CFLAGS. >Gary E. Miller gem at rellim.com >Tue Apr 11 21:05:49 UTC 2017 > > Previous message (by thread): TEST fails on NetBSD > Next message (by thread): TEST fails on NetBSD > Messages sorted by: [ date ] [ thread ] [ subject ] [ author ] > >Yo Eric! > >It was the Trevor one. Not sure why our commit id's differ?? > >On Tue, 11 Apr 2017 17:03:32 -0400 >"Eric S. Raymond" wrote: > >> Gary E. Miller : >> > Yo Hal! >> > >> > The last commit I can build is >> > 1b47ed72aedcc8f2e39ae198a0ea26a565ca6a15 and that one works for me. >> > >> > Whoever broke head this morning, please fix it. >> >> There ain't much there there. >> >> commit 8b13059e4936e7424062f0b69737b3ecbd96317e >> Author: Eric S. Raymond >> Date: Tue Apr 11 15:41:32 2017 -0400 >> >> Revert "easier-to-use conditional warning flags" >> >> It broke the configure logic. 
>> From hmurray at megapathdsl.net Thu Apr 13 07:45:07 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 13 Apr 2017 00:45:07 -0700 Subject: Documentation request/opportunity Message-ID: <20170413074507.1EDDC40605C@ip-64-139-1-69.sjc.megapath.net> I think we need a chart/table showing the types of packets we send and expect to receive. And another table of the config commands and the packets they generate/process. Maybe there should be notes about the meaning of the old/classic types/commands that we no longer support. ---------- I found a key comment in the code somewhat explaining 2 key words: >From newpeer(): * If a peer is found, this would be a duplicate and we don't * allow that. This avoids duplicate ephemeral (broadcast/ * multicast) and preemptible (manycast and pool) client * associations. -- These are my opinions. I hate spam. From gem at rellim.com Thu Apr 13 15:48:53 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 13 Apr 2017 08:48:53 -0700 Subject: Documentation request/opportunity In-Reply-To: <20170413074507.1EDDC40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170413074507.1EDDC40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170413084853.4da91748@spidey.rellim.com> Yo Hal! On Thu, 13 Apr 2017 00:45:07 -0700 Hal Murray wrote: > I think we need a chart/table showing the types of packets we send > and expect to receive. How about the RFC? I would hate to duplicate that. https://www.ietf.org/rfc/rfc5905.txt > And another table of the config commands and the packets they > generate/process. The packets are in the RFC, the specific parameters you can set/request are not. > Maybe there should be notes about the meaning of the old/classic > types/commands that we no longer support. We do have that, but not all in one place. A delta to NTP Classic and also to the RFC would be nice. RGDS GARY --------------------------------------------------------------------------- Gary E. 
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com  Thu Apr 13 15:58:01 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 13 Apr 2017 08:58:01 -0700
Subject: TEST fails on NetBSD
In-Reply-To: <520uec92m8hv6cgdosiaf1c5fh6gq9a2pd@4ax.com>
References: <520uec92m8hv6cgdosiaf1c5fh6gq9a2pd@4ax.com>
Message-ID: <20170413085801.35e1e35e@spidey.rellim.com>

Yo Trevor!

On Thu, 13 Apr 2017 00:40:54 -0400
"Trevor N." wrote:

> Sorry about that, I didn't test the change well enough. Without this
> patch the warning flag added to the list needs to also be added down
> where the other conditional warning flags are added to CFLAGS.

Actually, they need to be added a good bit sooner. Some of the debug CFLAGS depend on other debug CFLAGS. So adding one must be done before the next one is tested.

For example: -Wformat-nonliteral depends on -Wformat

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From ianbruene at gmail.com  Thu Apr 13 16:06:37 2017
From: ianbruene at gmail.com (Ian Bruene)
Date: Thu, 13 Apr 2017 11:06:37 -0500
Subject: Unit bugs
Message-ID: 

oY Gene!
I pushed a patch for most of the bugs, but I haven't been able to replicate the unicode bug yet so I added some temporary logging to the relevant function.

-- 
In the end; what separates a Man, from a Slave? Money? Power?
No. A Man Chooses, a Slave Obeys. -- Andrew Ryan

From gem at rellim.com  Thu Apr 13 16:13:28 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 13 Apr 2017 09:13:28 -0700
Subject: Unit bugs
In-Reply-To: 
References: 
Message-ID: <20170413091328.18e82a95@spidey.rellim.com>

Yo Ian!

On Thu, 13 Apr 2017 11:06:37 -0500
Ian Bruene wrote:

> I pushed a patch for most of the bugs, but I haven't been able to
> replicate the unicode
> bug yet so I added some temporary logging to the relevant function.

I can't Merge a WIP, is it ready to push? Is it better than current git head?

Where does it log? Often a crash is better since it gives you a line number to look at.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From ianbruene at gmail.com  Thu Apr 13 16:18:50 2017
From: ianbruene at gmail.com (Ian Bruene)
Date: Thu, 13 Apr 2017 11:18:50 -0500
Subject: Unit bugs
In-Reply-To: <20170413091328.18e82a95@spidey.rellim.com>
References: <20170413091328.18e82a95@spidey.rellim.com>
Message-ID: 

Yes, it is mergeable. And it is better than head. I de-wipped and it is rebasing as I type.

Logging is just a try/except block catching the unicode error, then dumps with print(repr(foo)), then re-raises the error. So the traceback still happens.

On 04/13/2017 11:13 AM, Gary E. Miller wrote:
> Yo Ian!
>
>
> I can't Merge a WIP, is it ready to push?
Is it better than current
> git head?
>
> Where does it log? Often a crash is better since it gives you a line
> number to look at.
>
> RGDS
> GARY
> ---------------------------------------------------------------------------
> Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
> gem at rellim.com Tel:+1 541 382 8588
>
> Veritas liberabit vos. -- Quid est veritas?
> "If you can't measure it, you can't improve it." - Lord Kelvin

-- 
In the end; what separates a Man, from a Slave? Money? Power?
No. A Man Chooses, a Slave Obeys. -- Andrew Ryan

From hmurray at megapathdsl.net  Thu Apr 13 19:58:09 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 13 Apr 2017 12:58:09 -0700
Subject: Documentation request/opportunity
Message-ID: <20170413195809.8267940605C@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
>> I think we need a chart/table showing the types of packets we send
>> and expect to receive.
> How about the RFC? I would hate to duplicate that.

No, that's not what I'm looking for.

We only implement a subset of the full spec. For example, we don't implement the peer stuff. I think that's 2 packet types.

The RFC is pages and pages. I'm looking for the one page (or less) summary. A pointer to the right section in the RFC might be appropriate.

Context/background:

There is a two dimensional table used in the input packet processing. I think we can clean that up by eliminating the table. This is tangled up with broadcast and friends and we removed some of that but I'm not exactly sure what is or should be left so I'd like some documentation of what is currently supported.

The current code for the pool stuff sends request packets when it gets the DNS answer.
When the reply comes back, it sets up the peer slot. If the server doesn't respond, there is never any peer slot set up. That same path also sets up broadcast clients, but I think we don't support that any more.

I think that table goes away if the pool stuff sets up the peer slot before it sends the first request. That means response processing is as simple as looking for the peer and dropping anything that doesn't match.

-- 
These are my opinions.  I hate spam.

From gem at rellim.com  Thu Apr 13 20:10:29 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 13 Apr 2017 13:10:29 -0700
Subject: Documentation request/opportunity
In-Reply-To: <20170413195809.8267940605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170413195809.8267940605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170413131029.602f9f9b@spidey.rellim.com>

Yo Hal!

On Thu, 13 Apr 2017 12:58:09 -0700
Hal Murray wrote:

> gem at rellim.com said:
> >> I think we need a chart/table showing the types of packets we send
> >> and expect to receive.
> > How about the RFC? I would hate to duplicate that.
>
> No, that's not what I'm looking for.
>
> We only implement a subset of the full spec. For example, we don't
> implement the peer stuff. I think that's 2 packet types.
>
> The RFC is pages and pages. I'm looking for the one page (or less)
> summary. A pointer to the right section in the RFC might be
> appropriate.

OK.

> I think that table goes away if the pool stuff sets up the peer slot
> before it sends the first request. That means response processing is
> as simple as looking for the peer and dropping anything that doesn't match.

Wow, it does sound pretty Rube Goldberg the way you describe it.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it."
- Lord Kelvin

From hmurray at megapathdsl.net  Thu Apr 13 20:25:27 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 13 Apr 2017 13:25:27 -0700
Subject: Documentation request/opportunity
Message-ID: <20170413202527.E2A3740605C@ip-64-139-1-69.sjc.megapath.net>

> Wow, it does sound pretty Rube Goldberg the way you describe it.

I think that's why it took me so long to figure it out.

I forgot to mention that there are two uses for the document I'm looking for.

One is to compare what we have with ntp classic.
The other is to understand the code.

-- 
These are my opinions.  I hate spam.

From gem at rellim.com  Thu Apr 13 20:48:32 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 13 Apr 2017 13:48:32 -0700
Subject: Documentation request/opportunity
In-Reply-To: <20170413202527.E2A3740605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170413202527.E2A3740605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170413134832.53dc9d6f@spidey.rellim.com>

Yo Hal!

On Thu, 13 Apr 2017 13:25:27 -0700
Hal Murray wrote:

> I forgot to mention that there are two uses for the document I'm
> looking for.
>
> One is to compare what we have with ntp classic.
> The other is to understand the code.

Since you seem to understand the code the best, I guess you just elected yourself.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin
From hmurray at megapathdsl.net  Thu Apr 13 21:33:11 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 13 Apr 2017 14:33:11 -0700
Subject: Ready to push new DNS
Message-ID: <20170413213311.6548A40605C@ip-64-139-1-69.sjc.megapath.net>

My plan is to push when I get back in several hours unless somebody objects or my testing finds something.

I think it's working. I'll be doing more testing while I'm out. There is, of course, a chance I've broken something with a change this big.

I'm still seeing this on NetBSD. Does anybody understand this one? Is it a real error or a testing glitch?

TEST(lfpfunc, Absolute) PASS
TEST(lfpfunc, FDF_RoundTrip)../../tests/libntp/lfpfunc.c:268:TEST(lfpfunc, FDF_RoundTrip):FAIL: Values Not Within Delta . 2147483647.500000 diff 2147482624.000000 not within 2.384186e-07
TEST(lfpfunc, SignedRelOps) PASS

-- 
These are my opinions.  I hate spam.

From esr at thyrsus.com  Thu Apr 13 22:47:32 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 13 Apr 2017 18:47:32 -0400
Subject: Ready to push new DNS
In-Reply-To: <20170413213311.6548A40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170413213311.6548A40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170413224732.GB28473@thyrsus.com>

Hal Murray :
>
> My plan is to push when I get back in several hours unless somebody objects
> or my testing finds something.
>
> I think it's working. I'll be doing more testing while I'm out. There is,
> of course, a chance I've broken something with a change this big.

Once you've pushed it I'll update the test farm and do some serious burn-in.

> I'm still seeing this on NetBSD. Does anybody understand this one? Is it a
> real error or a testing glitch?
>
> TEST(lfpfunc, Absolute) PASS
> TEST(lfpfunc, FDF_RoundTrip)../../tests/libntp/lfpfunc.c:268:TEST(lfpfunc,
> FDF_R
> oundTrip):FAIL: Values Not Within Delta .
2147483647.500000 diff
> 2147482624.0000
> 00 not within 2.384186e-07
> TEST(lfpfunc, SignedRelOps) PASS

I think Gary is working on this.

-- 
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own.

From gem at rellim.com  Thu Apr 13 23:37:53 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 13 Apr 2017 16:37:53 -0700
Subject: Ready to push new DNS
In-Reply-To: <20170413213311.6548A40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170413213311.6548A40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170413163753.046a1344@spidey.rellim.com>

Yo Hal!

On Thu, 13 Apr 2017 14:33:11 -0700
Hal Murray wrote:

> I'm still seeing this on NetBSD. Does anybody understand this one?
> Is it a real error or a testing glitch?

A real bug, on just NetBSD. I'm still working on it.

For some reason it looks like NetBSD floating point is not as accurate as other floating point when subtracting doubles. If I can nail that down as a cause my only choice is to relax that one test for NetBSD.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com  Fri Apr 14 00:33:31 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 13 Apr 2017 17:33:31 -0700
Subject: ✘NetBSD failure.
Message-ID: <20170413173331.193f35d7@spidey.rellim.com>

Yo Hal!

More info on the NetBSD bug:

TEST(lfpfunc, FDF_RoundTrip)../../tests/libntp/lfpfunc.c:268::FAIL:
Expected 0.0
Was 2147482624.0.
2147483647.500000 diff 2147482624.000000 not within 2.384186e-07

2147483647 is hex: 7FFFFFFF
2147482624 is hex: 7FFFFC00

If you know your floating point you know that 7FFFFFFF fits in a double just fine. But if you put it in float and then back to hex, you get 7FFFFC00.

That's because a float is only 21 significant digits, the same significant digits as in 7FFFFC00

So somewhere netBSD is using a float instead of a double...

Or something else that is netBSD specific...

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net  Fri Apr 14 01:06:10 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 13 Apr 2017 18:06:10 -0700
Subject: Ready to push new DNS
In-Reply-To: Message from Hal Murray of "Thu, 13 Apr 2017 14:33:11 PDT." <20170413213311.6548A40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414010610.1C3DC40605C@ip-64-139-1-69.sjc.megapath.net>

I fatfingered something while trying to pull before I could push. Mumble. It will take me a while to sort things out.

Eric: Please add this as another vote for a how-to-use-git paper.

-- 
These are my opinions.  I hate spam.

From hmurray at megapathdsl.net  Fri Apr 14 05:08:21 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 13 Apr 2017 22:08:21 -0700
Subject: New DNS has been pushed
Message-ID: <20170414050822.0A23640605C@ip-64-139-1-69.sjc.megapath.net>

There is lots of extra/debugging logging

Bugs/quirks in DNS area:
  It needs good backoff
  I think we can simplify things if FLAG_DNS is set on the pool too
  findinterface needs work.
  I've seen it return an interface without an IPv6 address
  The callback API might get cleaner if it passes IP addresses multiple times rather than the list of addrinfo

Other bugs/quirks noticed while looking at the code:
  t21 and friends in struct peer are unused
  pool pokes a hole in restrict. Is that documented?
  We need something like RES_NOPOOL to avoid pool servers we don't want
  move newpeer from handle_manycast to pool_take_dns
  I think we can cleanup input processing by removing MATCH_ASSOC and friends
  MODE_PRIVATE and MODE_ACTIVE/PASSIVE aren't used/needed ?? is_vn_mode_acceptable
  FLAG_TSTAMP_PPS is 0x4cd000 (looks like a typo)
  MDF_BCLNT is never set
  MDF_BCAST is used in mon_entry ?? cleanup MDF_* and peer->cast_flags only broadcast server left
  INT_PPP and INT_PRIVACY aren't used. Probably others.

-- 
These are my opinions.  I hate spam.

From Stromeko at nexgo.de  Fri Apr 14 05:26:35 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Fri, 14 Apr 2017 07:26:35 +0200
Subject: Ready to push new DNS
References: <20170414010610.1C3DC40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <87h91r36gk.fsf@Rainer.invalid>

Hal Murray writes:
> I fatfingered something while trying to pull before I could push. Mumble.
> It will take me a while to sort things out.

Just check your reflog and go back to the state before the pull.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

From Stromeko at nexgo.de  Fri Apr 14 06:08:56 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Fri, 14 Apr 2017 08:08:56 +0200
Subject: ✘NetBSD failure.
References: <20170413173331.193f35d7@spidey.rellim.com>
Message-ID: <87d1cf34hz.fsf@Rainer.invalid>

Gary E.
Miller writes:
> That's because a float is only 21 significant digits, the same significant
> digits as in 7FFFFC00

Nope, a float has 24 binary digits for the mantissa (including the hidden bit), unless they changed the standard while I wasn't looking. So there really isn't an explanation of where the extra three missing bits went.

> So somewhere netBSD is using a float instead of a double...
> Or something else that is netBSD specific...

Not necessarily. The printf conversion specifier in that test is bunk, use %g or even better %a and specify the correct precision. Also, add the index into the table somewhere in the output so one can see what test was actually performed and output _all_ intermediates.

> TEST(lfpfunc, FDF_RoundTrip)../../tests/libntp/lfpfunc.c:268::FAIL:
> Expected 0.0
> Was 2147482624.0. 2147483647.500000 diff 2147482624.000000 not within 2.384186e-07

The operative number here is actually the third one (diff) and it looks to me like op3 was zero (the conversion back from double wasn't performed correctly). You'd likely gain more insight if you looked at the bit pattern from temp before converting it back to double.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Samples for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra

From hmurray at megapathdsl.net  Fri Apr 14 06:27:03 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 13 Apr 2017 23:27:03 -0700
Subject: Lots of errors
Message-ID: <20170414062703.BCADF40605C@ip-64-139-1-69.sjc.megapath.net>

I assume that is work in progress rather than something I did.
samples:

test-all/test.log:../../libisc/error.c:28:50: warning: initialization left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
test-all/test.log:../../libisc/error.c:29:45: warning: initialization left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
test-all/test.log:../../libisc/error.c:34:23: warning: assignment left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
test-all/test.log:../../libisc/error.c:42:18: warning: assignment left-hand side might be a candidate for a format attribute [-Wsuggest-attribute=format]
test-all/test.log:../../libntp/systime.c:253:9: warning: comparing floating point with == or != is unsafe [-Wfloat-equal]
test-all/test.log:../../libntp/pymodule.c:28:18: warning: initialization discards 'const' qualifier from pointer target type [-Wdiscarded-qualifiers]
test-all/test.log:../../libntp/systime.c:253:9: warning: comparing floating point with == or != is unsafe [-Wfloat-equal]

-- 
These are my opinions.  I hate spam.

From gem at rellim.com  Fri Apr 14 06:32:30 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 13 Apr 2017 23:32:30 -0700
Subject: Lots of errors
In-Reply-To: <20170414062703.BCADF40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414062703.BCADF40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170413233207.4a12fe8f@spidey.rellim.com>

Yo Hal!

On Thu, 13 Apr 2017 23:27:03 -0700
Hal Murray wrote:

> I assume that is work in progress rather than something I did.

Warnings, not errors. You should only get those if you have --enable-debug

I started out with over a thousand, now just over a hundred left to fix.

Feel free to fix any that annoy you.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net  Fri Apr 14 09:57:07 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 02:57:07 -0700
Subject: Lots of errors
Message-ID: <20170414095707.B702D40605C@ip-64-139-1-69.sjc.megapath.net>

> Warnings, not errors.

Weird. I thought of that several seconds after sending the message.

> You should only get those if you have --enable-debug

That's my default setup.

-- 
These are my opinions.  I hate spam.

From hmurray at megapathdsl.net  Fri Apr 14 10:27:19 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 03:27:19 -0700
Subject: DEBUG in ntpsec
Message-ID: <20170414102719.84FEC40605C@ip-64-139-1-69.sjc.megapath.net>

The default was --enable-debug. A while ago, that was changed to --disable-debug.

I think we should reconsider that and/or this whole area.

There are several things all lumped together under --enable-debug and/or --enable-debug-gdb

One is a bunch of optional compiler checking options - the stuff Gary is working on now.

Another is not stripping symbols and whatever is needed for using gdb.

Another is a bunch of run time sanity checks - things like crash if foo is NULL.

Another is a bunch of optional printing. This is useful for chasing obscure bugs. You can run ntpd from the command line with -n and -d or -D and you get lots of printout. This allows getting more info to chase some problems without rebuilding ntpd.

We should probably measure the size difference and/or run time differences. The latter will take something like a busy pool server.

-- 
These are my opinions.  I hate spam.
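
[Editorial sketch, not part of the original thread.] The two run-time pieces described in the message above (sanity checks that crash when a pointer is NULL, and optional printing that only appears at a higher debug level) might look roughly like the following. This is a minimal illustration under stated assumptions, not ntpd's actual code: ENABLE_DEBUG, debug_level, DBG_PRINT, SANITY_CHECK and count_chars are all hypothetical names invented for this example.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for whatever macro a debug build would define. */
#ifndef ENABLE_DEBUG
#define ENABLE_DEBUG 1
#endif

/* Hypothetical verbosity level; a daemon could raise it once per -d. */
static int debug_level = 0;

#if ENABLE_DEBUG
/* Optional printing: only emitted when the level is high enough. */
#define DBG_PRINT(lvl, ...) \
    do { if (debug_level >= (lvl)) fprintf(stderr, __VA_ARGS__); } while (0)
/* Run-time sanity check: assert() reports file/line and aborts. */
#define SANITY_CHECK(cond) assert(cond)
#else
/* In a non-debug build both facilities compile away to nothing. */
#define DBG_PRINT(lvl, ...) do { } while (0)
#define SANITY_CHECK(cond)  do { } while (0)
#endif

/* Hypothetical helper that uses both facilities. */
static size_t count_chars(const char *s)
{
    size_t n = 0;

    SANITY_CHECK(s != NULL);      /* "crash if foo is NULL" */
    while (s[n] != '\0')
        n++;
    DBG_PRINT(2, "count_chars(\"%s\") = %zu\n", s, n);
    return n;
}
```

With debug_level left at 0 the helper is silent; setting it to 2 or more makes each call log to stderr. Building with -DENABLE_DEBUG=0 removes both the check and the printing, which is the kind of size and run-time difference the message suggests measuring.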
From hmurray at megapathdsl.net  Fri Apr 14 11:18:16 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 04:18:16 -0700
Subject: HEAD broken if -u ntp:ntp
Message-ID: <20170414111816.3A5FF40605C@ip-64-139-1-69.sjc.megapath.net>

It crashes with
  Apr 14 04:00:30 fed ntpd[4726]: invalid op: 110

110 decimal is ascii u

This looks suspicious, but I'm sacking out.

commit d9ead3aa6abd718d4cf73bcf8023d312d510359e
Author: Gary E. Miller
Date: Thu Apr 13 18:36:24 2017 -0700

    ntpd; add missing default case, fix a bad exit() code.

...
             case 'Z':
                 if (ntp_optarg != NULL)
                     set_sys_var(ntp_optarg, strlen(ntp_optarg) + 1,
                                 (u_short) (RW | DEF));
-                 break;
+                break;
+            default:
+                msyslog(LOG_ERR, "invalid op: %d", op);
+                exit(1);
             }
         }

-------

There are two loops scanning the command line args.

-- 
These are my opinions.  I hate spam.

From ghane0 at gmail.com  Fri Apr 14 16:07:13 2017
From: ghane0 at gmail.com (Sanjeev Gupta)
Date: Sat, 15 Apr 2017 00:07:13 +0800
Subject: DEBUG in ntpsec
In-Reply-To: <20170414102719.84FEC40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414102719.84FEC40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: 

On Fri, Apr 14, 2017 at 6:27 PM, Hal Murray wrote:

>
> We should probably measure the size difference and/or run time differences.
> The latter will take something like a busy pool server.

I have one, ntpmon.dcs1.biz , in a very busy pool, IPv6, ntpsec git head, with gpsd git head. Debian testing head, too :-)

Let me know who needs access.

-- 
Sanjeev Gupta
+65 98551208 http://www.linkedin.com/in/ghane

From gem at rellim.com  Fri Apr 14 16:34:43 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 14 Apr 2017 09:34:43 -0700
Subject: ✘git head broken
Message-ID: <20170414093443.5f764ac8@spidey.rellim.com>

Yo Hal!
Build is broken this AM:

commit 1d40226e0f88fe91dd97f9568e07d4a0ed004dd7

In file included from ../../ntpd/ntp_proto.c:7:0:
../../ntpd/ntp_proto.c:2450:27: error: 'lcladr' undeclared (first use in this function)
     current_time, latoa(lcladr), socktoa(rmtadr)));
                         ^

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com  Fri Apr 14 16:52:18 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 14 Apr 2017 09:52:18 -0700
Subject: ✘git head broken
In-Reply-To: <20170414093443.5f764ac8@spidey.rellim.com>
References: <20170414093443.5f764ac8@spidey.rellim.com>
Message-ID: <20170414095218.52cde187@spidey.rellim.com>

Yo All!

> Build is broken this AM:
> commit 1d40226e0f88fe91dd97f9568e07d4a0ed004dd7

git head now working, I put in a partial fix until Hal can look at it:

commit bac8b43fe18a9fa09ada0e32fd2d1d31da1e1a62

To see the bug you had to use --enable-debug.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com  Fri Apr 14 17:31:04 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Fri, 14 Apr 2017 10:31:04 -0700
Subject: HEAD broken if -u ntp:ntp
In-Reply-To: <20170414111816.3A5FF40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414111816.3A5FF40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414103104.42a100bc@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 04:18:16 -0700
Hal Murray wrote:

> It crashes with
> Apr 14 04:00:30 fed ntpd[4726]: invalid op: 110

Great! Adding the missing default found a real bug! That would be the second bug found in that piece of code in a month.

> 110 decimal is ascii u

Unhandled option. Looks like it has been broken a really long time. I went way back and don't see it ever worked in NTPsec.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net  Fri Apr 14 18:07:14 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 11:07:14 -0700
Subject: HEAD broken if -u ntp:ntp
Message-ID: <20170414180714.633D440605C@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
> Unhandled option. Looks like it has been broken a really long time. I went
> way back and don't see it ever worked in NTPsec.

It works.

There are two passes for the command line stuff. -u is handled on the other one.

I don't know why there are two. There is a comment about initializing the library.

How did that get past your testing?

-- 
These are my opinions.  I hate spam.

From gem at rellim.com  Fri Apr 14 18:08:23 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Fri, 14 Apr 2017 11:08:23 -0700
Subject: DEBUG in ntpsec
In-Reply-To: <20170414102719.84FEC40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414102719.84FEC40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414110823.3298bd27@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 03:27:19 -0700
Hal Murray wrote:

> The default was --enable-debug. A while ago, that was changed to
> --disable-debug.

And then changed back to --enable-debug

Right now --enable-debug enables only things that developers need or want. It enables things that are scary to distributions, packagers and end users, but essential to developers. They will all run from anything that says "debug".

> I think we should reconsider that and/or this whole area.

We have reconsidered this a lot lately. Your email only asks questions, do you have any suggestions?

> There are several things all lumped together under --enable-debug
> and/or --enable-debug-gdb

Actually, those two are completely separate. All --enable-debug-gdb does is add -g to CFLAGS and not strip the binary.

> One is a bunch of optional compiler checking options - the stuff Gary
> is working on now.

Yup, and almost all gone now. Only 50 more warnings to go, starting out at over 1,000 on the last batch.

> Another is not stripping symbols and whatever is needed for using gdb.

See above. If you want gdb, just use --enable-debug-gdb

> Another is a bunch of run time sanity checks - things like crash if
> foo is NULL.

Yeah, I have not touched those, they are a mystery to me. Those at least need some documentation.

> Another is a bunch of optional printing. This is useful for chasing
> obscure bugs. You can run ntpd from the command line with -n and -d
> or -D and you get lots of printout. This allows getting more info to
> chase some problems without rebuilding ntpd.

ntpd already has a ton of that. What would you change?

> We should probably measure the size difference and/or run time
> differences.
The latter will take something like a busy pool server.

I doubt time/size has much to do with it. It is a philosophical difference. Classic minimalist versus maximalist argument that can never be settled.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com  Fri Apr 14 18:28:43 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 14 Apr 2017 11:28:43 -0700
Subject: HEAD broken if -u ntp:ntp
In-Reply-To: <20170414180714.633D440605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414180714.633D440605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414112843.197faf44@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 11:07:14 -0700
Hal Murray wrote:

> gem at rellim.com said:
> > Unhandled option. Looks like it has been broken a really long
> > time. I went way back and don't see it ever worked in NTPsec.
>
> It works.

How would you know? ntp:ntp is the default, right?

> There are two passes for the command line stuff. -u is handled on
> the other one.
>
> I don't know why there are two. There is a comment about
> initializing the library.

Gack, fixing that would be a good project for someone.

> How did that get past your testing?

It did not, you caught it just fine. :-)

I went over the option string: 46c:dD:f:gGhi:I:k:l:LmnNp:PqRs:t:u:U:Vw:xzZ

The switch statement we are talking about handles 'b' and 'r' which are not in the option string?? What did they use to do? I added the 'u'.

RGDS
GARY
---------------------------------------------------------------------------
Gary E.
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com Fri Apr 14 18:33:18 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 14 Apr 2017 11:33:18 -0700
Subject: Lots of errors
In-Reply-To: <20170414095707.B702D40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414095707.B702D40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414113318.3c9d956e@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 02:57:07 -0700
Hal Murray wrote:

> > Warnings, not errors.
>
> Weird.  I thought of that several seconds after sending the message.

A good reason not to email at 3am.  :-)

> > You should only get those if you have --enable-debug
>
> That's my default setup.

Then how did your last commit, which failed to build with
--enable-debug, slip through?

I test with both (usually) before pushing.

Only 50 more of those warnings to fix!  Some will be a bit nasty, some
can never be fixed, like the bugs in Bison and libisc.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
From hmurray at megapathdsl.net Fri Apr 14 18:35:28 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 11:35:28 -0700
Subject: HEAD broken if -u ntp:ntp
Message-ID: <20170414183528.DE88F40605C@ip-64-139-1-69.sjc.megapath.net>

> The switch statement we are talking about handles 'b' and 'r' which are not
> in the option string??  What did they used to do?  I added the 'u'.

The man page for ntp classic says they are:

  -b, --bcastsync
      Allow us to sync to broadcast servers.

  -r string, --propagationdelay=string
      Broadcast/propagation delay.

I assume they are leftover from before we ripped out the broadcast
client stuff.

--
These are my opinions.  I hate spam.

From gem at rellim.com Fri Apr 14 18:47:43 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 14 Apr 2017 11:47:43 -0700
Subject: HEAD broken if -u ntp:ntp
In-Reply-To: <20170414183528.DE88F40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414183528.DE88F40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414114743.73c1be19@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 11:35:28 -0700
Hal Murray wrote:

> > The switch statement we are talking about handles 'b' and 'r'
> > which are not in the option string??  What did they used to do?  I
> > added the 'u'.
>
> The man page for ntp classic says they are:
>
>   -b, --bcastsync
>       Allow us to sync to broadcast servers.
>
>   -r string, --propagationdelay=string
>       Broadcast/propagation delay.
>
>
> I assume they are leftover from before we ripped out the broadcast
> client stuff.

How about I rip that cruft out then?  Maybe print a message about
old/unused option?

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net Fri Apr 14 18:59:56 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 11:59:56 -0700
Subject: Lots of errors
Message-ID: <20170414185956.1AC5240605C@ip-64-139-1-69.sjc.megapath.net>

> Then how did your last commit, which failed to build with --enable-debug,
> slip through?

It used to be the default.  My setup hasn't yet recovered from that
switch.

Why didn't buildbot complain?

> I test with both (usually) before pushing.

Is that a run time test, or just a build?

We need more/better run time testing.

> Only 50 more of those warnings to fix!  Some will be a bit nasty, some can
> never be fixed, like the bugs in Bison and libisc.

I added tests/option-testing.sh to help find this sort of problem.  We
should teach buildbot to run it occasionally.  (but only if the output
will include warnings)

There are some grumbles about inlines from old CentOS and maybe others
(NetBSD?)

../../include/timespecops.h:331: warning: inlining failed in call to ???tspec_stamp_to_lfp???: call is unlikely and code size would grow
../../ntpd/refclock_shm.c:555: warning: called from here
../../include/timespecops.h:331: warning: inlining failed in call to ???tspec_stamp_to_lfp???: call is unlikely and code size would grow
../../ntpd/refclock_shm.c:556: warning: called from here
../../ntpd/ntp_config.c:2955: warning: implicit declaration of function ???yyparse???

I may have fixed (aka de-inlined) something similar in ntp_monitor
before I figured out that the warning was probably bogus.

--
These are my opinions.  I hate spam.
From hmurray at megapathdsl.net Fri Apr 14 19:03:22 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 12:03:22 -0700
Subject: HEAD broken if -u ntp:ntp
Message-ID: <20170414190322.BC40440605C@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
[Context is -b and -r on command line]
> How about I rip that cruft out then?  Maybe print a message about old/unused
> option?

Probably better to print a good feature-not-supported message and
crash.  If it doesn't crash, people are likely to not notice a skipping
type message and it's likely to not work.

--
These are my opinions.  I hate spam.

From gem at rellim.com Fri Apr 14 19:26:54 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 14 Apr 2017 12:26:54 -0700
Subject: Lots of errors
In-Reply-To: <20170414185956.1AC5240605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414185956.1AC5240605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414122654.153952b9@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 11:59:56 -0700
Hal Murray wrote:

> > Then how did your last commit, which failed to build with
> > --enable-debug, slip through?
>
> It used to be the default.  My setup hasn't yet recovered from that
> switch.

Yeah, it will take a while for everyone to catch up.

> Why didn't buildbot complain?

Because buildbot does not build with --enable-debug.

> > I test with both (usually) before pushing.
>
> Is that a run time test, or just a build?
>
> We need more/better run time testing.

Patches welcome.

> > Only 50 more of those warnings to fix!  Some will be a bit nasty,
> > some can never be fixed, like the bugs in Bison and libisc.
>
> I added tests/option-testing.sh to help find this sort of problem.
> We should teach buildbot to run it occasionally.  (but only if the
> output will include warnings)

Patches welcome.

> There are some grumbles about inlines from old CentOS and maybe
> others (NetBSD?)

Thanks.

> ../../include/timespecops.h:331: warning: inlining failed in call to
> ???tspec_stamp_to_lfp???: call is unlikely and code size would grow
> ../../ntpd/refclock_shm.c:555: warning: called from here
> ../../include/timespecops.h:331: warning: inlining failed in call to
> ???tspec_stamp_to_lfp???: call is unlikely and code size would grow
> ../../ntpd/refclock_shm.c:556: warning: called from here

Hmmm, not sure how to fix that one.  I'll ponder.

I see your email client is still not UTF-8.

> ../../ntpd/ntp_config.c:2955: warning: implicit declaration of
> function ???yyparse???

This file is autogenerated by Bison.  Upstream bug.  Good luck getting
that one fixed.

> I may have fixed (aka de-inlined) something similar in ntp_monitor
> before I figured out that the warning was probably bogus.

Not bogus at all.  It means we are telling the C compiler to do
something it refuses to do.  When the C compiler starts doing things
the programmer did not explicitly ask for that is worrisome.  Not
horrible, but needs to be fixed.  Which is why this warning is not
shown with normal builds.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net Fri Apr 14 19:35:24 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 12:35:24 -0700
Subject: DEBUG in ntpsec
Message-ID: <20170414193524.F3EEF40605C@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
>> The default was --enable-debug.  A while ago, that was changed to
>> --disable-debug.
> And then changed back to --enable-debug

When was it changed back?
I don't see that.

Maybe we are having a word mixup.  How about this way?

The default was DEBUG enabled.  A few weeks ago, that was changed to
DEBUG disabled and you had to add --enable-debug to get the old
behavior.

--
These are my opinions.  I hate spam.

From gem at rellim.com Fri Apr 14 19:49:15 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 14 Apr 2017 12:49:15 -0700
Subject: DEBUG in ntpsec
In-Reply-To: <20170414193524.F3EEF40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414193524.F3EEF40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414124915.072edb77@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 12:35:24 -0700
Hal Murray wrote:

> gem at rellim.com said:
> >> The default was --enable-debug.  A while ago, that was changed to
> >> --disable-debug.
> > And then changed back to --enable-debug
>
> When was it changed back?  I don't see that.

Covered in the git logs, on IM and here on devel@

Debug is default off, --enable-debug to enable debug.

> The default was DEBUG enabled.  A few weeks ago, that was changed to
> DEBUG disabled and you had to add --enable-debug to get the old
> behavior.

Yup, still that way.  Debug is default off, --enable-debug to enable
debug.

--enable-debug-gdb just adds -g to CFLAGS and does not strip the
binaries.  It does not enable DEBUG.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com Fri Apr 14 19:59:28 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Fri, 14 Apr 2017 12:59:28 -0700
Subject: HEAD broken if -u ntp:ntp
In-Reply-To: <20170414183528.DE88F40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414183528.DE88F40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414125928.5e19341b@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 11:35:28 -0700
Hal Murray wrote:

> > The switch statement we are talking about handles 'b' and 'r'
> > which are not in the option string??  What did they used to do?  I
> > added the 'u'.
>
> The man page for ntp classic says they are:

Thanks.

> I assume they are leftover from before we ripped out the broadcast
> client stuff.

Done.  Prints error message and exit(1).

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From esr at thyrsus.com Fri Apr 14 20:46:25 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 14 Apr 2017 16:46:25 -0400 (EDT)
Subject: C in codebase has shrunk to 25% of Classic
Message-ID: <20170414204625.2D8DD13A021A@snark.thyrsus.com>

Hal's async-DNS simplification finally did it.  We have removed three
out of every four lines of code present at fork time.

This is significant because a "factor of four" reduction is both
impressive and easy for people to understand.  Now we can say that.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.
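
[Editorial sketch] The outcome of the -b/-r discussion in the "HEAD broken if -u ntp:ntp" thread above (recognize the retired options, print an error, exit non-zero) can be illustrated roughly as follows. This is a hypothetical Python illustration, not the C code in ntpd: the function name parse_ntpd_args and the error wording are assumptions, and only the option string and the old meaning of -b and -r are taken from the thread.

```python
import getopt

# Option string as quoted in the thread (it already includes the added 'u:').
NTPD_OPTSTRING = "46c:dD:f:gGhi:I:k:l:LmnNp:PqRs:t:u:U:Vw:xzZ"

# Retired broadcast-client options, described per the ntp classic man page.
RETIRED = {
    "-b": "--bcastsync, sync to broadcast servers",
    "-r": "--propagationdelay, broadcast/propagation delay",
}


def parse_ntpd_args(argv):
    """Parse argv; fail loudly on retired options (hypothetical helper)."""
    # "br:" is appended so getopt recognizes the retired flags and we can
    # report them specifically instead of as a generic unknown option.
    opts, rest = getopt.getopt(argv, NTPD_OPTSTRING + "br:")
    for flag, _value in opts:
        if flag in RETIRED:
            # SystemExit with a string prints it to stderr and exits with 1.
            raise SystemExit(
                "ntpd: option %s (%s) is no longer supported" % (flag, RETIRED[flag])
            )
    return opts, rest
```

Exiting rather than skipping follows the reasoning given in the thread: a warning that does not stop startup is easy to miss.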
From hmurray at megapathdsl.net Fri Apr 14 21:01:20 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 14:01:20 -0700
Subject: C in codebase has shrunk to 25% of Classic
In-Reply-To: Message from esr@thyrsus.com (Eric S. Raymond) of "Fri, 14 Apr 2017 16:46:25 EDT." <20170414204625.2D8DD13A021A@snark.thyrsus.com>
Message-ID: <20170414210120.55C9140605C@ip-64-139-1-69.sjc.megapath.net>

How big is the python code?

> This is significant because a "factor of four" reduction is both impressive
> and easy for people to understand.  Now we can say that.

I'd call that misleading if it didn't mention the python code.

--
These are my opinions.  I hate spam.

From gem at rellim.com Fri Apr 14 21:11:43 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 14 Apr 2017 14:11:43 -0700
Subject: C in codebase has shrunk to 25% of Classic
In-Reply-To: <20170414210120.55C9140605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414204625.2D8DD13A021A@snark.thyrsus.com> <20170414210120.55C9140605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414141143.3368af0e@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 14:01:20 -0700
Hal Murray wrote:

> How big is the python code?
>
> > This is significant because a "factor of four" reduction is both
> > impressive and easy for people to understand.  Now we can say
> > that.
>
> I'd call that misleading if it didn't mention the python code.

Full disclosure:

# ./waf loccount
all          70492 (100.00%) in 287 files
c            58272 (82.66%) in 148 files
python        8355 (11.85%) in 46 files
shell         1599 (2.27%) in 8 files
yacc          1205 (1.71%) in 1 files
css            464 (0.66%) in 1 files
waf            459 (0.65%) in 11 files
javascript     138 (0.20%) in 1 files
'loccount' finished successfully (0.075s)

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From esr at thyrsus.com Fri Apr 14 21:14:45 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 14 Apr 2017 17:14:45 -0400
Subject: C in codebase has shrunk to 25% of Classic
In-Reply-To: <20170414210120.55C9140605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414204625.2D8DD13A021A@snark.thyrsus.com> <20170414210120.55C9140605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170414211445.GB22307@thyrsus.com>

Hal Murray :
> How big is the python code?

8 KLOC.  About equivalent to 3% of the fork-time volume.

> > This is significant because a "factor of four" reduction is both impressive
> > and easy for people to understand.  Now we can say that.
>
> I'd call that misleading if it didn't mention the python code.

I wouldn't.  If you're doing a vulnerability audit, Python code is not
an attack surface you really worry about.  Especially not in a scenario
like this where it's all client tools.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From hmurray at megapathdsl.net Fri Apr 14 21:59:02 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 14:59:02 -0700
Subject: C in codebase has shrunk to 25% of Classic
Message-ID: <20170414215902.D468140605C@ip-64-139-1-69.sjc.megapath.net>

Thanks.

> javascript 138 (0.20%) in 1 files

There is a docs/asciidoc.js
Does that mean that we also need javascript to build docs?

--
These are my opinions.  I hate spam.

From gem at rellim.com Sat Apr 15 00:52:02 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Fri, 14 Apr 2017 17:52:02 -0700
Subject: ✘HAL_FIXED
Message-ID: <20170414175202.3172a0c3@spidey.rellim.com>

Yo Hal!

Your commit 1d40226e0f88fe91dd97f9568e07d4a0ed004dd7 broke git head.

It is mitigated in commit 12af50ead9bbf98c027a52e5921f803bf2070b7e, but
still awaiting your fix.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From esr at thyrsus.com Sat Apr 15 04:36:11 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 15 Apr 2017 00:36:11 -0400
Subject: C in codebase has shrunk to 25% of Classic
In-Reply-To: <20170414215902.D468140605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170414215902.D468140605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170415043611.GA1077@thyrsus.com>

Hal Murray :
> Thanks.
>
> > javascript 138 (0.20%) in 1 files
>
> There is a docs/asciidoc.js
> Does that mean that we also need javascript to build docs?

It's used for footnotes in the Web rendering.  We don't have any
footnotes.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.
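
[Editorial sketch] The loccount figures quoted earlier in this thread can be cross-checked with a few lines of arithmetic. The counts below are copied from the './waf loccount' output; the helper names are hypothetical, and the fork-time estimate assumes the C code is exactly 25% of its fork-time size, which the thread states only approximately.

```python
# Line counts copied from the './waf loccount' output quoted in the thread.
LOC = {
    "c": 58272,
    "python": 8355,
    "shell": 1599,
    "yacc": 1205,
    "css": 464,
    "waf": 459,
    "javascript": 138,
}


def shares(counts):
    """Return each language's percentage of the total, rounded to 2 places."""
    total = sum(counts.values())
    return {lang: round(100.0 * n / total, 2) for lang, n in counts.items()}


def implied_fork_time_c(current_c, fraction=0.25):
    """Estimate the fork-time C line count from the current count.

    Assumption: current C is exactly `fraction` of the fork-time C volume.
    """
    return round(current_c / fraction)
```

Under that assumption the fork-time C volume comes out at roughly 233 KLOC, so whether the Python is counted matters little to the headline ratio, though it is about 12% of the current tree.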
From hmurray at megapathdsl.net Sat Apr 15 05:20:26 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 22:20:26 -0700
Subject: For your interface collection - from a NetBSD list
Message-ID: <20170415052026.0C53B40605C@ip-64-139-1-69.sjc.megapath.net>

Subject: Re: bind reacts badly to dhcpcd losing/regaining connectivity
From: Robert Elz
Date: Sat, 15 Apr 2017 07:59:12 +0700
To: Rhialto
Cc: netbsd-users at netbsd.org

    Date:        Sat, 15 Apr 2017 01:41:17 +0200
    From:        Rhialto
    Message-ID:  <20170414234117.GA18315 at falu.nl>

  | Why does named not succeed in using the interface when it gets an
  | address again? What to do about it? I noticed partly because my dns data
  | seemed to have dropped out of caching name servers elsewhere.

This will be a side-effect of the non-root version of named.

Named binds to port 53 on each address it can find, rather than just
port 53 (any address) as typical daemons do, as it is required to send
its replies (UDP replies) from the same address as they were sent to
(part of the DNS spec.)   [These days, I think there's an interface to
allow a UDP socket to be told which (local) addr a packet was sent to,
but when bind was created there was no such thing, so it does it the way
that works everywhere.]

Binding to port 53 requires root permissions - when named first starts
it binds to all addresses, and then drops privs.   Later, when an addr
goes away, it will close the socket bound to that addr - if the addr
comes back (or a new address appears) it (attempts to) bind to port 53
on that addr - but without root privs any more, it cannot (EPERM).

Solutions to this are just to always run as root, or to recode the
receive code to use the new way to receive the dest addr of incoming
packets, and to set the source addr of outgoing ones (so just one UDP
socket is needed), or perhaps to have named simply re-exec itself
whenever a new addr appears, if not running as root.

kre

--
These are my opinions.  I hate spam.
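
[Editorial sketch] The "new way to receive the dest addr of incoming packets" mentioned above can be illustrated with one wildcard UDP socket plus ancillary data. This is a minimal Python sketch assuming Linux and the IP_PKTINFO option described in ip(7); it is not ntpd or named code. The helper names are hypothetical, and the fallback value 8 for IP_PKTINFO is the Linux constant, used only if the socket module does not export it.

```python
import socket
import struct

# Linux-specific assumption: IP_PKTINFO is 8 in <linux/in.h>.
IP_PKTINFO = getattr(socket, "IP_PKTINFO", 8)

# struct in_pktinfo { int ipi_ifindex; struct in_addr ipi_spec_dst;
#                     struct in_addr ipi_addr; }  (see ip(7))
IN_PKTINFO = struct.Struct("@i4s4s")


def parse_in_pktinfo(data):
    """Decode an in_pktinfo blob into (ifindex, local_addr, dest_addr)."""
    ifindex, spec_dst, dst = IN_PKTINFO.unpack(data[:IN_PKTINFO.size])
    return ifindex, socket.inet_ntoa(spec_dst), socket.inet_ntoa(dst)


def open_wildcard_udp(port):
    """Bind one UDP socket to the wildcard address with IP_PKTINFO enabled."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, IP_PKTINFO, 1)
    sock.bind(("0.0.0.0", port))
    return sock


def recv_with_destination(sock, bufsize=1500):
    """Receive one datagram; return (payload, peer, dest_addr, ifindex)."""
    data, ancdata, _flags, peer = sock.recvmsg(
        bufsize, socket.CMSG_SPACE(IN_PKTINFO.size)
    )
    for level, ctype, cdata in ancdata:
        if level == socket.IPPROTO_IP and ctype == IP_PKTINFO:
            ifindex, _local, dst = parse_in_pktinfo(cdata)
            return data, peer, dst, ifindex
    # No ancillary data delivered (option unsupported or not enabled).
    return data, peer, None, None
```

Because the socket is bound once to the wildcard address, no rebind is needed when an address disappears and comes back. ip(7) also describes passing an in_pktinfo to sendmsg() to choose the source address of a reply; that side is not shown here.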
From hmurray at megapathdsl.net Sat Apr 15 05:38:41 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 14 Apr 2017 22:38:41 -0700
Subject: More Interface - IPv6
Message-ID: <20170415053841.AD5C440605C@ip-64-139-1-69.sjc.megapath.net>

    Date:        Sat, 15 Apr 2017 02:53:23 +0200
    From:        Rhialto
    Message-ID:  <20170415005323.GA17890 at falu.nl>

  | I also noticed the error seems to mention IPv4 only. I am not sure if it
  | managed to bind an IPv6 address on the same interface (and now it is too
  | late, unfortunately).

IPv6 does not have the same problem/issue - the IPv6 UDP API was
designed from the start with the requirement that the incoming dest
addr be available to the application, named knows that, and uses it, so
binds just to port 53 on any (all) local v6 addrs.

On munnari, for IPv4, we have (ignoring other sockets for current tcp
connections, etc) ...

tcp        0      0  127.0.0.1.53           *.*                    LISTEN
tcp        0      0  202.29.151.3.53        *.*                    LISTEN
tcp        0      0  172.30.0.22.53         *.*                    LISTEN
udp        0      0  127.0.0.1.53           *.*
udp        0      0  202.29.151.3.53        *.*
udp        0      0  172.30.0.22.53         *.*

whereas for IPv6 ...

tcp6       0      0  *.53                   *.*                    LISTEN
udp6       0      0  *.53                   *.*

The IPv4 sockets bound to all the IPv4 addresses for TCP are not really
needed, I assume it is just done that way for simplicity/consistency.

  | In case it makes a difference, I am running bind in the chroot as
  | provided by named_chrootdir="/var/chroot/named". And I have 2 views, an
  | internal and an external one.

The chroot itself makes no difference (it is changing from root to
_named or whatever user name it uses, that matters) and nor do views
(though personally I believe that the DNS tree should be one consistent
data set with the exact same answers available from everywhere).

--
These are my opinions.  I hate spam.

From hmurray at megapathdsl.net Sat Apr 15 07:22:11 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 15 Apr 2017 00:22:11 -0700
Subject: C in codebase has shrunk to 25% of Classic
In-Reply-To: Message from "Eric S.
Raymond" of "Sat, 15 Apr 2017 00:36:11 EDT." <20170415043611.GA1077@thyrsus.com>
Message-ID: <20170415072211.935BA40605C@ip-64-139-1-69.sjc.megapath.net>

> It's used for footnotes in the Web rendering.  We don't have any footnotes.

Thanks.  Then why do we need that file?

--
These are my opinions.  I hate spam.

From hmurray at megapathdsl.net Sat Apr 15 08:31:50 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 15 Apr 2017 01:31:50 -0700
Subject: HAL_FIXED
Message-ID: <20170415083150.E1BF440605C@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
> but still awaiting your fix.

Your fix solved the problem.  It seemed better to be calm and test
changes rather than contribute to the thrashing.

--
These are my opinions.  I hate spam.

From esr at thyrsus.com Sat Apr 15 09:41:40 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 15 Apr 2017 05:41:40 -0400
Subject: C in codebase has shrunk to 25% of Classic
In-Reply-To: <20170415072211.935BA40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170415043611.GA1077@thyrsus.com> <20170415072211.935BA40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170415094140.GB11043@thyrsus.com>

Hal Murray :
> > It's used for footnotes in the Web rendering.  We don't have any footnotes.
>
> Thanks.  Then why do we need that file?

We probably don't.  It's generated, dropped into place by the asciidoc
toolchain.  We don't ship it.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From gem at rellim.com Sat Apr 15 20:02:50 2017
From: gem at rellim.com (Gary E. Miller)
Date: Sat, 15 Apr 2017 13:02:50 -0700
Subject: HAL_FIXED
In-Reply-To: <20170415083150.E1BF440605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170415083150.E1BF440605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170415130250.2b903c80@spidey.rellim.com>

Yo Hal!

On Sat, 15 Apr 2017 01:31:50 -0700
Hal Murray wrote:

> gem at rellim.com said:
> > but still awaiting your fix.
>
> Your fix solved the problem.  It seemed better to be calm and test
> changes rather than contribute to the thrashing.

Not a fix, there is now data loss.  I'd like to see the missing debug
data restored.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com Sat Apr 15 20:10:52 2017
From: gem at rellim.com (Gary E. Miller)
Date: Sat, 15 Apr 2017 13:10:52 -0700
Subject: More Interface - IPv6
In-Reply-To: <20170415053841.AD5C440605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170415053841.AD5C440605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170415131052.43b2387f@spidey.rellim.com>

Yo Hal!

On Fri, 14 Apr 2017 22:38:41 -0700
Hal Murray wrote:

> The IPv4 sockets bound to all the IPv4 addresses for TCP are not
> really needed, I assume it is just done that way for
> simplicity/consistency.

Nope.  It is done that way so that certain interfaces and/or IPs can be
ignored.

As long as it has to be done that way sometimes, easiest to code it so
it is that way all the time.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com  Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
From hmurray at megapathdsl.net Sun Apr 16 01:05:44 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 15 Apr 2017 18:05:44 -0700
Subject: More Interface - IPv6
Message-ID: <20170416010544.7172C40605C@ip-64-139-1-69.sjc.megapath.net>

gem at rellim.com said:
> Nope.  It is done that way so that certain interfaces and/or IPs can be
> ignored.

The bottom line is that if Eric wants to clean up the interface area, I
think it may be possible.  Bind has similar requirements.  I think that
discussion from NetBSD provides the key ideas on how to do it.

I think that filter is only part of the problem.  I haven't looked
carefully at the code to find where it uses the local address on UDP
packets.  I'm pretty sure it will be needed for any serious crypto work.

The point of that message and the previous one is that there is an
option to get the local address (destination address in packet) on
receive.  It uses an option similar to the mechanism for getting the
time stamp.

From another message in that (NetBSD) thread:
> I assume that what jnemeth was really asking, was how to make it work the
> way bind requires, and I suspect the IP_RECVDSTADDR setsockopt() along with
> using recvmsg() is the answer really desired.   (See ip(4))

That option isn't available on Linux.  Looks like IP_PKTINFO will do
what we want.  Details in man 7 ip

I don't know how the send side works.  I assume there is a similar
mechanism.

--
These are my opinions.  I hate spam.

From hmurray at megapathdsl.net Sun Apr 16 06:59:12 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 15 Apr 2017 23:59:12 -0700
Subject: Interfaces: Link to discussion on netbsd-users
Message-ID: <20170416065912.529BD40605C@ip-64-139-1-69.sjc.megapath.net>

http://mail-index.netbsd.org/netbsd-users/2017/04/14/msg019489.html
Subject: bind reacts badly to dhcpcd losing/regaining connectivity

Much of it is NetBSD specific.
I think it will be worth scanning if/when you decide to revisit this
area.

--
These are my opinions.  I hate spam.

From hmurray at megapathdsl.net Sun Apr 16 09:15:21 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sun, 16 Apr 2017 02:15:21 -0700
Subject: Version Strings - doesn't show recent edits
Message-ID: <20170416091521.8C67E40605C@ip-64-139-1-69.sjc.megapath.net>

I consider the current setup to be close to useless.  If I go through
the typical edit, build, test cycle, the string doesn't get updated so I
can't tell if I'm running the version from git or something with local
modifications.

I gather there is a constraint from distros.  They want the binaries
they ship to be identical to what you would get if you pulled their
sources and built your own.  That means we can't use __DATE__ or
__TIME__.

Will the distros be happy if we have a configure option (default off) to
use the build time?

------

There is another worm in this can: issue #268
  Waf uses current date and time breaking repro builds

> That time stamp is only used by the NMEA driver. I assume it's trying to
> recover from the GPS 1024 week rollover. I'd be happy to throw all that code
> out and tell people to use GPSD if they have a device that old.

--
These are my opinions.  I hate spam.

From esr at thyrsus.com Sun Apr 16 09:18:28 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 16 Apr 2017 05:18:28 -0400
Subject: Interfaces: Link to discussion on netbsd-users
In-Reply-To: <20170416065912.529BD40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170416065912.529BD40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170416091828.GA22615@thyrsus.com>

Hal Murray :
>
> http://mail-index.netbsd.org/netbsd-users/2017/04/14/msg019489.html
> Subject: bind reacts badly to dhcpcd losing/regaining connectivity
>
> Much of it is NetBSD specific.  I think it will be worth scanning if/when
> you decide to revisit this area.

Thanks.
The essential point (bind reacquisition failing after droproot) is not
NetBSD specific and is worth keeping in mind.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From hmurray at megapathdsl.net Sun Apr 16 09:32:28 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sun, 16 Apr 2017 02:32:28 -0700
Subject: Interfaces: Link to discussion on netbsd-users
In-Reply-To: Message from "Eric S. Raymond" of "Sun, 16 Apr 2017 05:18:28 EDT." <20170416091828.GA22615@thyrsus.com>
Message-ID: <20170416093228.5427140605C@ip-64-139-1-69.sjc.megapath.net>

esr at thyrsus.com said:
> Thanks.  The essential point (bind reacquisition failing after droproot) is
> not NetBSD specific and is worth keeping in mind.

I think it works (on Fedora).  My test case is closing the lid on a
laptop, waiting a while, then opening it.

16 Apr 02:21:50 ntpd[636]: Deleting interface #4 wlp2s0b1, 192.168.1.100#123, interface stats: received=1189, sent=1259, dropped=0, active_time=10601 secs
...
16 Apr 02:21:50 ntpd[636]: Deleting interface #5 wlp2s0b1, fe80::864b:f5ff:fe39:244a%3#123, interface stats: received=0, sent=0, dropped=0, active_time=10601 secs
16 Apr 02:21:57 ntpd[636]: Listen normally on 6 wlp2s0b1 192.168.1.100:123
16 Apr 02:21:57 ntpd[636]: Listen normally on 7 wlp2s0b1 [fe80::864b:f5ff:fe39:244a%3]:123
16 Apr 02:21:57 ntpd[636]: new interface(s) found: waking up resolver
16 Apr 02:23:57 ntpd[636]: 74.120.8.2 132a 8a sys_peer
16 Apr 02:23:57 ntpd[636]: 0.0.0.0 0613 03 spike_detect -0.190987 s
16 Apr 02:24:14 ntpd[636]: 129.250.35.250 141a 8a sys_peer

I think stock Fedora restarts ntpd in that case.  I think I patched
something to let it keep running, but I can't find any notes on what I
did.

--
These are my opinions.  I hate spam.

From esr at thyrsus.com Sun Apr 16 10:26:45 2017
From: esr at thyrsus.com (Eric S.
Raymond) Date: Sun, 16 Apr 2017 06:26:45 -0400 Subject: Version Strings - doesn't show recent edits In-Reply-To: <20170416091521.8C67E40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170416091521.8C67E40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170416102645.GB22615@thyrsus.com> Hal Murray : > I consider the current setup to be close to useless. If I go through the > typical edit, build, test cycle, the string doesn't get updated so I can't > tell if I'm running the version from git or something with local > modifications. > > I gather there is a constraint from distros. They want the binaries they > ship to be identical to what you would get it you pulled their sources and > built your own. That means we can't use __DATE__ or __TIME__. > > Will the distros be happy if we have a configure option (default off) to use > the build time? > > ------ > > There is another worm in this can: issue #268 > Waf uses current date and time breaking repro builds > > > That time stamp is only used by the NMEA driver. I assume it's trying to > > recover from the GPS 1024 week rollover. I'd be happy to throw all that code > > out and tell people to use GPSD if they have a device that old. This whole area is a mess in which I have been unable to think up a policy that makes me entirely happy. We're going to have to choose one thing to prioritize at the expense of other annoyances. It would be possible, with some hackery, to bump the version string on every edit. However, this would mean that every single build would always recompile every file that included the generated version symbol. This would be irritating, especially given the existence of slow buildbots; qemu images for exotic architectures would be right out. That is why the usual policy these days is to *not do that* and say that if you want a version string that fine-grained you look at HEAD's git hash. Or run git-describe on it. There are two distinct uses of current date and time in the build. 
One is dispensable, maybe. The other is not. The indispensable one is the use in the NMEA driver. In theory we could bow to the distros' desire for reproducible builds, throw that code out, and pass the buck to GPSD. I don't want to do that because, as they say in politics, the optics would be bad. It's the same reason I fought against dropping filtering on named interfaces - we're on dangerous ground any time we throw away a functional feature that some grognard time admin somewhere might have been relying on for 15 years. I feel safe in doing that only if we have a very convincing security/reliability reason to give, and appeasing distros' policy guidelines *doesn't qualify*. The maybe-dispensable one is the use of build-time pivoting to find the nearest time corresponding to an l_fp across ntp-date calendar cycles. If we toss that we lose the ability for ntpd binaries built in different eras (but within a half-cycle of each other) to settle on a common timebase and interoperate. In theory we could replace that use with a timestamp that freezes at each point-release time (GPSD does this). So, if a sysadmin updates from our point source releases conscientiously, no problem. Do you want to bet on everybody doing that, nobody recompiling the same source release multiple times to throw it on multiple machines? I don't. The really security-conscious admins (our core demographic, as it were) don't trust anything they haven't built from cryptosigned sources with a toolchain they control and trust. Now, suppose we give the distros a no-__DATE__/__TIME__ option. Then that's what they're going to ship. Cross-era interoperability goes poof, and nobody who cares about it is going to be much appeased by the argument that their distro package could have been built to (at least partly) handle that case. This may seem like an academic issue now, with end of era 0 still 19 years off. I don't think it's going to be academic at all in the two or three years around the era rollover.
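The build-time pivoting described above can be sketched in a few lines. This is a minimal illustration in Python, not the code in libntp: the name `unfold_ntp_seconds` and its arguments are hypothetical, and it handles only the 32-bit seconds half of an l_fp. Given a pivot (for example the build time), it picks the full timestamp within half a cycle (2**31 seconds, about 68 years) of the pivot whose low 32 bits match the value seen on the wire.

```python
NTP_ERA_SECONDS = 2 ** 32      # one NTP era: 32 bits of seconds, about 136 years
HALF_ERA = 2 ** 31             # the "half-cycle" window on either side of the pivot
NTP_TO_UNIX = 2208988800       # seconds from 1900-01-01 to 1970-01-01


def unfold_ntp_seconds(ntp_secs32, pivot_unix):
    """Resolve a 32-bit NTP seconds field to a Unix time near pivot_unix.

    Hypothetical helper for illustration only. The result is the unique
    time in [pivot - 2**31, pivot + 2**31) whose NTP seconds, taken
    modulo 2**32, equal ntp_secs32.
    """
    pivot_ntp = pivot_unix + NTP_TO_UNIX  # pivot on an unbounded NTP scale
    # Signed distance from the pivot, folded into [-2**31, 2**31).
    delta = (ntp_secs32 - pivot_ntp + HALF_ERA) % NTP_ERA_SECONDS - HALF_ERA
    return pivot_ntp + delta - NTP_TO_UNIX
```

With a pivot in April 2017, a wire value of 100 resolves to 100 seconds after the era 0 rollover in February 2036 rather than to 1900, because 2036 is the candidate closest to the pivot. Two binaries whose pivots lie within a half-cycle of the true time agree on that answer, which is the interoperability property at stake in this thread.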
I think we have to tell any distro that brings this up that we can't comply because doing so would create a serious risk of breaking a core assumption of RFC5905. While I'm not looking forward to being in that argument, still less am I loving the blowback I think we'd get if we knocked that central column out of Dave Mills's cathedral. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Sun Apr 16 19:18:18 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 16 Apr 2017 12:18:18 -0700 Subject: Version Strings - doesn't show recent edits In-Reply-To: Message from "Eric S. Raymond" of "Sun, 16 Apr 2017 06:26:45 EDT." <20170416102645.GB22615@thyrsus.com> Message-ID: <20170416191818.E07EE40605C@ip-64-139-1-69.sjc.megapath.net> > It would be possible, with some hackery, to bump the version string on every > edit. However, this would mean that every single build would always > recompile every file that included the generated version symbol. This would > be irritating, especially given the existence of slow buildbots; qemu images > for exotic architectures would be right out. My proposal was for an option that defaulted to off. Buildbots wouldn't enable it. I'm willing to rebuild a few extra modules in order to get useful version strings. The when-to-update complexity doesn't have to be part of waf (but that would be great). I have a script that does the rebuilding. It could do whatever is appropriate. It already deletes ntpd/version.h, leftover from before we switched to autorevision. We already have to do the relink step. If recompiling ntpd is too slow, we could move the version string out to a separate module. > The maybe-dispensable one is the use of build-time pivoting to find the > nearest time corresponding to an l_fp across ntp-date calendar cycles.
If we > toss that we lose the ability for ntpd binaries built in different eras (but > within a half-cycle of each other) to settle on a common timebase and > interoperate. We already got rid of that usage as part of the time_t 32 vs 64 discussion. We now depend on the system time being "close enough". The interface across the 32-64 boundary is a delta-time rather than the correct time. I think that puts the pivot time at 1970. We could advance that by 40+ years by adding some code that would update the system time to the release date if it was before that. We could get a "better" time from the file system. That would be a disaster if the time ever got into the future long enough to write whatever part of the file system we get the time from. > The indispensable one is the use in the NMEA driver. In theory we could bow > to the distros' desire for reproducible builds, throw that code out, and > pass the buck to GPSD. I don't want to do that because, as they say in > politics, the optics would be bad. We could switch that usage to a time stamp that gets updated as part of the release process. The NMEA usage has a lifetime of 20 years after the reference time. The l_fp pivoting breaks after 60+ years. Is 60 years after the release date good enough for the l_fp pivoting? Is the difference between 20 years from the release date vs 20 years from build time significant? -- These are my opinions. I hate spam. From gem at rellim.com Sun Apr 16 19:41:00 2017 From: gem at rellim.com (Gary E. Miller) Date: Sun, 16 Apr 2017 12:41:00 -0700 Subject: Version Strings - doesn't show recent edits In-Reply-To: <20170416191818.E07EE40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170416102645.GB22615@thyrsus.com> <20170416191818.E07EE40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170416124100.3827a1be@spidey.rellim.com> Yo Hal! I see a strong synergy between this and bug #268. Hal wants builds that are identifiably touched by developer actions. 
Repro builds want the exact opposite, builds that validate zero developer changes. Flip sides of the same coin? On Sun, 16 Apr 2017 12:18:18 -0700 Hal Murray wrote: > > It would be possible, with some hackery, to bump the version string > > on every edit. However, this would mean that every single build > > would always recompile every file that included the generated > > version symbol. This would be irritating, especially given the > > existence of slow buildbots; qemu images for exotic architectures > > would be right out. > > My proposal was for an option that defaulted to off. Buildbots > wouldn't enable it. Before we get into how much extra work this adds to the build, what exactly is the extra info we would want? For repro builds we replace __DATE__ and __TIME__ with MKREPRO_TIME and MKREPRO_DATE. Could MKREPRO_xxx be derived easily from the last commit metadata? > I'm willing to rebuild a few extra modules in order to get useful > version strings. What would you define as 'useful' for a dev build? > > The maybe-dispensable one is the use of build-time pivoting to find > > the nearest time corresponding to an l_fp across ntp-date calendar > > cycles. > We already got rid of that usage as part of the time_t 32 vs 64 > discussion. +1. > We could get a "better" time from the file system. That would be a > disaster if the time ever got into the future long enough to write > whatever part of the file system we get the time from. Gentoo on RasPi sets the time with a special file that is updated now and again. ntpd could use the file time on the driftfile. Or many other possibilities. If we just assume last git time is valid then that gives us a 60+ year window where we know the pivot. > > The indispensable one is the use in the NMEA driver. > We could switch that usage to a time stamp that gets updated as part > of the release process. +1, or just use MKREPRO_xxx > Is 60 years after the release date good enough for the l_fp pivoting? Yes.
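For the NMEA side of this, the roughly 20-year lifetime after the reference time comes from the 10-bit GPS week counter, which wraps every 1024 weeks (about 19.6 years). A minimal sketch of unfolding a wrapped week number against a reference date follows. It is not the NMEA driver's actual logic, and the names `unfold_gps_week` and `week_start` are hypothetical; the reference date could be a build date, a release date, or a last-commit date, whichever the project settles on.

```python
import datetime

GPS_EPOCH = datetime.date(1980, 1, 6)  # first day of GPS week 0
WEEK_CYCLE = 1024                      # legacy receivers report the week modulo 1024


def unfold_gps_week(week10, pivot_date):
    """Return the first full GPS week at or after the pivot's week that
    is congruent to week10 modulo 1024 (hypothetical helper)."""
    pivot_week = (pivot_date - GPS_EPOCH).days // 7
    return pivot_week + (week10 - pivot_week) % WEEK_CYCLE


def week_start(full_week):
    """Calendar date on which the given full GPS week begins."""
    return GPS_EPOCH + datetime.timedelta(weeks=full_week)
```

With a pivot of 16 April 2017 (the start of GPS week 1945), a reported week of 0 maps to week 2048, which began on 7 April 2019. The mapping only looks forward from the pivot, so the answer is right for 1024 weeks after the pivot and wrong after that, which is why a later pivot buys a longer useful life.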
RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Sun Apr 16 20:26:14 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 16 Apr 2017 13:26:14 -0700 Subject: Does anybody regularly test the cxfreeze recipe? Message-ID: <20170416202614.6AA9740605C@ip-64-139-1-69.sjc.megapath.net> and test/use the output. as described in devel/packaging.txt -- These are my opinions. I hate spam. From esr at thyrsus.com Sun Apr 16 20:48:10 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 16 Apr 2017 16:48:10 -0400 Subject: Does anybody regularly test the cxfreeze recipe? In-Reply-To: <20170416202614.6AA9740605C@ip-64-139-1-69.sjc.megapath.net> References: <20170416202614.6AA9740605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170416204810.GA14900@thyrsus.com> Hal Murray : > and test/use the output. > > as described in devel/packaging.txt I tested it once when I implemented it. Haven't since. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From ianbruene at gmail.com Sun Apr 16 23:24:35 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Sun, 16 Apr 2017 18:24:35 -0500 Subject: Refclocks and formatting Message-ID: @ESR In the units project I discovered that the string formatting for refclocks is handled in a completely different manner from the rest of the code.
Specifically, ntpq/mon call ntp.ntpc.statustoa, which is in a C library, instead of calling hypothetical formatting functions in pylib/util.py like they do for everything else. As far as I can tell from a cursory examination of the code, the reason for this is so it can use the same bitmask #defines as the rest of the system. Is this correct? If so, does that need to remain the case? If not, then why is the complexity of a language bridge being maintained? If it has to stay this way the unit formatters *can* munch on the output of statustoa. Related @anyone: is there a way to make ntpd produce a fake refclock for testing purposes? I don't have the hardware for it to produce one naturally, and unit testing / careful examination of the logic can only go so far. Pushing the testing burden onto someone else's patience is suboptimal as well. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From hmurray at megapathdsl.net Sun Apr 16 23:39:45 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 16 Apr 2017 16:39:45 -0700 Subject: Refclocks and formatting In-Reply-To: Message from Ian Bruene of "Sun, 16 Apr 2017 18:24:35 CDT." Message-ID: <20170416233945.AB76640605C@ip-64-139-1-69.sjc.megapath.net> ianbruene at gmail.com said: > Related @anyone: is there a way to make ntpd produce a fake refclock for > testing purposes? I don't have the hardware for it to produce one Sure. Just set one up. NMEA, for example, needs a serial port to get past the open, but it should work if you don't have anything plugged into it. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Sun Apr 16 23:46:30 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 16 Apr 2017 16:46:30 -0700 Subject: Does anybody have a sample of a NMEA device with the 1024 week bug? Message-ID: <20170416234630.505B340605C@ip-64-139-1-69.sjc.megapath.net> I'd like to get one for testing. -- These are my opinions. I hate spam.
From gem at rellim.com Sun Apr 16 23:51:40 2017 From: gem at rellim.com (Gary E. Miller) Date: Sun, 16 Apr 2017 16:51:40 -0700 Subject: Refclocks and formatting In-Reply-To: References: Message-ID: <20170416165140.6098879b@spidey.rellim.com> Yo Ian! On Sun, 16 Apr 2017 18:24:35 -0500 Ian Bruene wrote: > As far as I can tell from a cursory examination of the code the > reason for this is so it can use the same bitmask #defines as the > rest of the system. Is this correct? If so does that need to remain > the case, if not then why is the complexity of a language bridge > being maintained? If it has to stay this way the unit formatters > *can* munch on the output of statustoa. Many people would love the pymodule.c to go away. Just for starters: It causes issues for dual Python2 and 3 systems. > Related @anyone: is there a way to make ntpd produce a fake refclock > for testing purposes? Spend a few $$ and get a cheap USB GPS? Mark has some nice ones that also send PPS. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Mon Apr 17 00:36:21 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 16 Apr 2017 17:36:21 -0700 Subject: Refclocks and formatting In-Reply-To: Message from Ian Bruene of "Sun, 16 Apr 2017 18:24:35 CDT." Message-ID: <20170417003621.C5E0940605C@ip-64-139-1-69.sjc.megapath.net> ianbruene at gmail.com said: > In the units project I discovered that the string formatting for refclocks > is handled in a completely different manner from the rest of the code.
I can't see any reason for refclocks to be different. There might be one. At least from ntpmon, it does the same for refclocks and non-refclocks.

    if ntp.util.PeerSummary.is_clock(retained):
        dtype = ntp.ntpc.TYPE_CLOCK
    else:
        dtype = ntp.ntpc.TYPE_PEER
    sw = ntp.ntpc.statustoa(dtype, ...

The ntpq code isn't so simple. I haven't backtracked through layers of procedure calls to see if it really is different for refclocks. There is another potential worm in this can. I don't know if it applies to this case, but it's worth keeping in mind. Some of the flags that get decoded come from the kernel sources rather than NTP sources. They are generally the same across OSes and kernel versions, but I don't know how to verify that. I think the clean solution would be to decode them on the server. (Or copy the versions from one kernel to our sources and have the build step crash if they are defined in the kernel and different.) -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Mon Apr 17 01:52:07 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 16 Apr 2017 18:52:07 -0700 Subject: Version Strings - doesn't show recent edits Message-ID: <20170417015207.34B4740605C@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: > Before we get into how much extra work this adds to the build, what exactly > is the extra info we would want? I'm not sure we need any "extra" info. The NMEA driver wants as late a time-stamp as it can get. It will work with the last commit date. It will work better with the actual build date - more-better as the time between commit and build increases. > Could MKREPRO_xxx be derived easily from the last commit metadata? I view that as an implementation detail. I assume the answer is yes. Currently, the version string comes from autorevision and the time-stamp that NMEA uses comes from __DATE__ or MKREPRO_DATE and friends. We could fix the NMEA problem by using the time-stamp from autorevision.
There is a scanf that pulls fields from __DATE__ and __TIME__. It should be a simple change to use the string from autorevision in that step. We could remove a lot of cruft from libntp/ntp_calendar.c if whatever time-stamp was used for NMEA was pre-processed by an external script rather than at run time. > What would you define as 'useful' for a dev build? I'd be happy with the build time. (as compared to the current last commit) If you want to do something fancy like the latest edit of any sources that would be OK too, but don't forget to include config.h > Gentoo on RasPi sets the time with a special file that is updated now and > again. ntpd could use the file time on the driftfile. Or many other > possibilities. The problem with schemes like that is not which file to use, it's how to recover if the time on that file ever gets set into the future. We could use something like the leap-seconds file which gets updated occasionally as long as the update mechanism preserves the last-edit date. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Mon Apr 17 05:07:44 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 16 Apr 2017 22:07:44 -0700 Subject: Does anybody regularly test the cxfreeze recipe? In-Reply-To: Message from "Eric S. Raymond" of "Sun, 16 Apr 2017 16:48:10 EDT." <20170416204810.GA14900@thyrsus.com> Message-ID: <20170417050744.8118040605C@ip-64-139-1-69.sjc.megapath.net> > I tested it one when I implemented. Haven't since. I added a step to devel/pre-release.txt -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Mon Apr 17 06:16:41 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 16 Apr 2017 23:16:41 -0700 Subject: ntpq vs new DNS Message-ID: <20170417061641.2B8F940605C@ip-64-139-1-69.sjc.megapath.net> For server slots specified by name, the new DNS returns both the local IP Address and the hostname. Servers specified by numerical address don't return a hostname. 
The current ntpq gives preference to the hostname slot. That works for pool slots where the address is useless. It's much more complicated for things like server 0.us.pool.ntp.org I think changing a few lines will revert to the previous behavior. Does anybody have any great ideas for how to take advantage of this extra info? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Mon Apr 17 07:30:06 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 17 Apr 2017 00:30:06 -0700 Subject: ntpq peers printout: what happened to the 4th digit? Message-ID: <20170417073006.74C5340605C@ip-64-139-1-69.sjc.megapath.net> A while ago, I squeezed things so that we could get a 4th digit on the last 3 columns in the normal case. I'm only seeing 3 digits now. Is that a feature or a glitch in all the work in this area that I haven't been paying much attention to? -- These are my opinions. I hate spam. From esr at thyrsus.com Mon Apr 17 17:14:25 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 17 Apr 2017 13:14:25 -0400 (EDT) Subject: Time to slow down and be more careful Message-ID: <20170417171425.33A2113A021A@snark.thyrsus.com> This morning, while investigating a recent code change that smelled bad to me, I discovered that an error cascade of small, wrong changes starting some weeks ago had destroyed the mechanism that would allow instances of ntpd to interoperate across the epoch 1 boundary in 2036. I fixed the problem, recovering the old code required to make this work again, but it's a blot on our record. Up to now all our mistakes have been very minor. This wasn't. If it remained undetected until actual symptoms were visible it would have been extremely difficult to fix. I fear we've gotten a little too used to success, a bit cavalier about checking our assumptions. So: It's time to slow down and be more careful. Check your premises twice before you hack. 
At this point in the game, if you think you've detected dead code, check with other devs; it may be a sign that something that should be calling it has been incorrectly removed. -- Eric S. Raymond The whole aim of practical politics is to keep the populace alarmed (and hence clamorous to be led to safety) by menacing it with an endless series of hobgoblins, all of them imaginary. -- H.L. Mencken From gem at rellim.com Mon Apr 17 20:52:04 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 17 Apr 2017 13:52:04 -0700 Subject: Version Strings - doesn't show recent edits In-Reply-To: <20170417015207.34B4740605C@ip-64-139-1-69.sjc.megapath.net> References: <20170417015207.34B4740605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170417135204.1b567bb7@spidey.rellim.com> Yo Hal! On Sun, 16 Apr 2017 18:52:07 -0700 Hal Murray wrote: > gem at rellim.com said: > > Before we get into how much extra work this adds to the build, what > > exactly is the extra info we would want? > > I'm not sure we need any "extra" info. The repro build does need different info. > The NMEA driver wants as late a time-stamp as it can get. It will > work with the last commit date. It will work better with the actual > build date - more-better as the time between commit and build > increases. Since both work, we just need a switch between commit date and build date for the NMEA driver. They will rarely differ by more than a small percentage of 10 years, and at runtime other reality checks are usually available. > > Could MKREPRO_xxx be derived easily from the last commit meta > > data? > > I view that as an implementation detail. I assume the answer is yes. > > Currently, the version string comes from autorevision and the > time-stamp that NMEA uses comes from __DATE__ or MKREPRO_DATE and > friends. So no changes needed to autorevision? > We could fix the NMEA problem by using the time-stamp from > autorevision. There is a scanf that pulls fields from __DATE__ and > __TIME__.
It should be a simple change to use the string from > autorevision in that step. Works for me. > We could remove a lot of cruft from libntp/ntp_calendar.c if whatever > time-stamp was used for NMEA was pre-processed by an external script > rather than at run time. I like that idea. > > What would you define as 'useful' for a dev build? > > I'd be happy with the build time. (as compared to the current last > commit) I'm not familiar with the problems that came up last time the version scheme changed. Here is one I have now: ntpd ntpsec-0.9.7+421 2017-04-15T19:46:45Z So, for your purposes that is good now? And for a repro build just replace with the last commit timestamp? > If you want to do something fancy like the latest edit of any sources > that would be OK too, but don't forget to include config.h I'm not up to doing anything fancy. Patches welcome. > > Gentoo on RasPi sets the time with a special file that is updated > > now and again. ntpd could use the file time on the driftfile. Or > > many other possibilities. > > The problem with schemes like that is not which file to use, it's how > to recover if the time on that file ever gets set into the future. Yes, I see that all the time on my RasPi's. Sort of. The problem is Linux thinks the time is in the future, but it really is not in the future because the system clock is not set properly yet. > We could use something like the leap-seconds file which gets updated > occasionally as long as the update mechanism preserves the last-edit > date. That is worse than Gentoo does on RasPi now and would create many more problems. Just being a few hours wrong creates havoc, and the longer the time spread the worse it gets. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it."
- Lord Kelvin From hmurray at megapathdsl.net Mon Apr 17 21:46:07 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 17 Apr 2017 14:46:07 -0700 Subject: Time to slow down and be more careful In-Reply-To: Message from esr@thyrsus.com (Eric S. Raymond) of "Mon, 17 Apr 2017 13:14:25 EDT." <20170417171425.33A2113A021A@snark.thyrsus.com> Message-ID: <20170417214607.2669240605C@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > This morning, while investigating a recent code change that smelled bad to > me, I discovered that an error cascade of small, wrong changes starting some > weeks ago had destroyed the mechanism that would allow instances of ntpd to > interoperate across the epoch 1 boundary in 2036. Could you please say more? If I screwed up, I'd like to learn something from it. Looking back, I should have written something about how that stuff works. It's in several messages but never made it to a file that got committed. I think the old code converted l_fp to full time. That needs a pivot. I changed things so that there is never a conversion from l_fp to full time. There is a subtract done on the l_fp side. The clock offset in l_fp is converted to an offset in seconds. I think it's a double. That eventually turns into a clock adjustment. There is no explicit pivot. There is an implicit pivot of the current time. That turns into a requirement for the time to be reasonably close before ntpd is started. Reasonably close is within 68 years. That will screw up in 2038 on systems like the Raspberry Pi that don't have a battery backed RTC. I see two ways to fix that. One would be to put a pivot-like time stamp into ntpd and, early in the startup sequence, bump the clock if the current time is earlier than the pivot.
The other would be to run some other program before starting ntpd. That program could use a compiled-in time stamp or look in the file system or ... This area is also tangled up with time_t being 32 or 64 bits. We decided to use time_t as much as possible, expecting the environments to fix that in time. -- These are my opinions. I hate spam. From esr at thyrsus.com Mon Apr 17 22:40:03 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 17 Apr 2017 18:40:03 -0400 Subject: Time to slow down and be more careful In-Reply-To: <20170417214607.2669240605C@ip-64-139-1-69.sjc.megapath.net> References: <20170417171425.33A2113A021A@snark.thyrsus.com> <20170417214607.2669240605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170417224003.GB10652@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > > This morning, while investigating a recent code change that smelled bad to > > me, I discovered that an error cascade of small, wrong changes starting some > > weeks ago had destroyed the mechanism that would allow instances of ntpd to > > interoperate across the epoch 1 boundary in 2036. > > Could you please say more. If I screwed up, I'd like to learn something from > it. You did, but only in a minor way. You removed a call from libntp/ntp_calendar.c that the pivot code needed to do cross-era resolution before it was wrongly deleted. This complicated the fix - I had to figure out which parts of your cleanup to revert - but it was no part of the original error. I wouldn't have been surprised if you had noticed that the code that exercised that entry point shouldn't have been removed, you're good at being that kind of careful, but it wasn't really your responsibility to notice; it was the tech lead's, i.e. mine. For *you*, I think the only lesson out of this one is to be more careful about dead-code removal. There was a lot of really useless stuff in the codebase at fork time, but I ripped most of that crap out last year.
Now, if you see what looks like dead code, you need to double- and triple-check whether it should have a call site that shouldn't have been dropped. This will probably involve sniffing around the Classic tree a bit. > Looking back, I should have written something about how that stuff works. > It's in several messages but never made it to a file that got committed. > > I think the old code converted l_fp to full time. That needs a pivot. > > I changed things so that there is never a conversion from l_fp to full time. > There is a subtract done on the l_fp side. The clock offset in l_fp is > converted to an offset in seconds. I think it's a double. That eventually > turns into a clock adjustment. There is no explicit pivot. There is an > implicit pivot of the current time. I'm actually not sure which code you're talking about here, and I think it's important that I should. > That turns into a requirement for the time to be reasonably close before ntpd > is started. Reasonably close is within 68 years. A half cycle, yes. That's the same constraint the original Mills code has. The underlying modular arithmetic is clear to me, even if its relationship to the implementation remains a bit murky. > That will screw up in 2038 on systems like the Raspberry Pi that don't have a > battery backed RTC. I see two ways to fix that. One would be to put a pivot > like time stamp into ntpd and early in the startup sequence, bump the clock > if the current time is earlier than the pivot. The other would be to run > some other program before starting ntpd. That program could use a compiled > in time stamp or look in the file system or ... This is going to take some very careful work, with planning and discussion beforehand. *After* 1.0. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own.
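The implicit pivot and the half-cycle constraint discussed in this exchange both fall out of wraparound subtraction. A minimal sketch, assuming an l_fp is treated as one 64-bit unsigned integer (32 bits of seconds in the high half, 32 bits of fraction in the low half); `lfp_offset_seconds` is a hypothetical name for illustration, not a function in the tree:

```python
LFP_MODULUS = 2 ** 64   # an l_fp viewed as one unsigned 64-bit integer
LFP_FRACTION = 2 ** 32  # the low 32 bits hold the fraction of a second


def lfp_offset_seconds(server_lfp, local_lfp):
    """Signed offset (server - local) in seconds, as a float.

    The subtraction wraps like unsigned 64-bit arithmetic and the result
    is then reinterpreted as signed, so the answer is correct whenever
    the true offset is under half a cycle (2**31 seconds, about 68
    years), even if the two stamps sit on opposite sides of an era
    boundary.
    """
    diff = (server_lfp - local_lfp) % LFP_MODULUS
    if diff >= LFP_MODULUS // 2:
        diff -= LFP_MODULUS  # reinterpret the top bit as a sign
    return diff / LFP_FRACTION
```

No pivot appears anywhere in the function. The local stamp plays that role, which is why the system clock has to be within a half cycle of the truth before the offset means anything.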
From hmurray at megapathdsl.net Tue Apr 18 01:41:01 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 17 Apr 2017 18:41:01 -0700 Subject: _DATE__, version string, and distros Message-ID: <20170418014101.5EFEC40605C@ip-64-139-1-69.sjc.megapath.net> I think we can kill two birds with one stone. The first step is to change the code that uses __DATE__ to use the time stamp from autorevision. That will solve the repro builds problem. But it means the pivot point for GPS is the last commit time rather than the build time. The GPS time scale is only 20 years, so somebody might want to rebuild ntpd to update that. So the second step is to add a way to override the time stamp from autorevision. That will also solve my problem of getting useful info from the version string. I don't have details of how to do that. My straw man was a script that would get invoked by a configure option. But then I noticed that it gets tangled up with the build-repro guarantee. If we don't want to break the build-repro guarantee, we have to put the new date into a file or update the last commit date. If we update the last commit date, then we don't need anything special for distros. But I can't update the last commit date to get a new version string, at least not with my current knowledge of git. So this opens up a new can of worms. So how do we envision that distros will use our code? Are they going to take a tarball, or clone our git repo? How are they going to handle local patches? Is there a slot in the version space for a distro to indicate that they are running with local patches? ... If they clone our git repo, then all they have to do to update the GPS pivot is edit a file, commit that change, and turn the crank. They can add a local file with a summary of local changes. If they use a tarball, and/or distribute tarballs, we have to put the new-date in a file (or file metadata) or patch autorevision.cache. 
I think that fits in with my script suggestion, again I haven't worked out any details. So I guess I'm proposing a script and configure option. That will enable distros to distribute tarballs with their patches. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Apr 18 02:42:04 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 17 Apr 2017 19:42:04 -0700 Subject: Time to slow down and be more careful In-Reply-To: Message from "Eric S. Raymond" of "Mon, 17 Apr 2017 18:40:03 EDT." <20170417224003.GB10652@thyrsus.com> Message-ID: <20170418024205.08FE940605C@ip-64-139-1-69.sjc.megapath.net> >> I changed things so that there is never a conversion from l_fp >> to full time. There is a subtract done on the l_fp side. The clock >> offset in l_fp is converted to an offset in seconds. I think it's a >> double. That eventually turns into a clock adjustment. There is >> no explicit pivot. There is an implicit pivot of the current time. > I'm actually not sure which code you're talking about here, and I think it's > important that I should. You added some pivot code to step_systime in libntp/systime.c I don't understand why. The argument is a time step as a double. That comes from packets exchanged with a server using l_fp. That's at most 31 bits plus sign, relative to the current system time. That's the biggest step you can take. You can step across epoch boundaries. You can't step over whole epochs. If you want to cross an epoch boundary, the system you are running on must support the new epoch. That will require more than 32 bit time_t. (It might work with 32 bit unsigned, but all sorts of code does subtracts.) Note that there is no pivot mentioned in the previous two paragraphs. The pivot point is "now". That turns into a requirement that the system time be close enough. You added l_fp fp_ofs, fp_sys; /* offset and target system time in FP */ The idea was to avoid l_fp in anything that talks to the OS. 
If you use timespec, and assume time_t is 64 bits, then you don't need to worry about epochs or pivots. Does that help? -- These are my opinions. I hate spam. From gem at rellim.com Tue Apr 18 03:10:50 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 17 Apr 2017 20:10:50 -0700 Subject: Time to slow down and be more careful In-Reply-To: <20170418024205.08FE940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170417224003.GB10652@thyrsus.com> <20170418024205.08FE940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170417201050.3ebebb13@spidey.rellim.com> Yo Hal! On Mon, 17 Apr 2017 19:42:04 -0700 Hal Murray wrote: > >> I changed things so that there is never a conversion from l_fp > >> to full time. There is a subtract done on the l_fp side. The clock > >> offset in l_fp is converted to an offset in seconds. I think it's > >> a double. That eventually turns into a clock adjustment. There is > >> no explicit pivot. There is an implicit pivot of the current > >> time. > > > I'm actually not sure which code you're talking about here, and I > > think it's important that I should. > > You added some pivot code to step_systime in libntp/systime.c > I don't understand why. +1 RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Tue Apr 18 03:26:52 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 17 Apr 2017 20:26:52 -0700 Subject: Big picture... 
Message-ID: <20170418032652.E927940605C@ip-64-139-1-69.sjc.megapath.net> Draw an upside down tree of ntpd modules with packet processing on the left and OS interface on the right and data processing in the middle. (I'm handwaving. I mean the way you would draw the picture if you were explaining things rather than the actual current module structure. I think we are reasonably close.) The packet code uses l_fp. The OS interfaces use timespec and time_t. The code in the middle works in time offsets using seconds in doubles and some time in time_t. I'm not sure that is totally correct. It seems like a good goal. I think we are close. That was the direction I was going when I got rid of a lot of time64_t a while ago and when I cleaned up the leap-second code. --------- I'd like to get rid of ntp_calendar. Nothing urgent, it just seems like we should be able to use POSIX date/time calls instead. I made some progress in the recent leap-second cleanup. One rough edge is that there is no UTC version of mktime. There is timegm, but it's not POSIX. The linux man pages says it's a GNU extension and is also available on BSDs. For the leap-second code, I used 28 days rather than 1 month. I could do that calculation with simple arithmetic. -- These are my opinions. I hate spam. From gem at rellim.com Tue Apr 18 03:33:35 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 17 Apr 2017 20:33:35 -0700 Subject: _DATE__, version string, and distros In-Reply-To: <20170418014101.5EFEC40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418014101.5EFEC40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170417203335.316eeeb2@spidey.rellim.com> Yo Hal! On Mon, 17 Apr 2017 18:41:01 -0700 Hal Murray wrote: > The first step is to change the code that uses __DATE__ to use the > time stamp from autorevision. I'd go with optional, but either way works for me. > So the second step is to add a way to override the time stamp from > autorevision. 
That will also solve my problem of getting useful info
> from the version string.

Debug mode only?

> If we don't want to break the build-repro guarantee, we have to put
> the new date into a file or update the last commit date.

Commit date is fine.

> So this opens up a new can of worms.

I'm not as worried as you.

> So how do we envision that distros will use our code? Are they going
> to take a tarball, or clone our git repo?

Prolly 50/50. Gentoo does both.

> How are they going to
> handle local patches?

Not our problem. They will be changing flags, default locations, etc., so a user can only reproduce that exact patch set, which is what they want.

> Is there a slot in the version space for a
> distro to indicate that they are running with local patches? ...

Not our problem. Every distro does that differently. Maybe we see a pattern after we get in a lot of distros. We can wait for their requests.

> If they clone our git repo, then all they have to do to update the
> GPS pivot is edit a file, commit that change, and turn the crank.
> They can add a local file with a summary of local changes.

That could work.

> If they use a tarball, and/or distribute tarballs, we have to put the
> new-date in a file (or file metadata) or patch autorevision.cache. I
> think that fits in with my script suggestion, again I haven't worked
> out any details.

Or just use the timestamp on the VERSION file.

> So I guess I'm proposing a script and configure option. That will
> enable distros to distribute tarballs with their patches.

Which is partly what we were asked for.

Maybe we have 4 cases, not all needed:

1. default to use current __DATE__ and __TIME__
2. repro build option to substitute VERSION time for __DATE__ and __TIME__
3. pull date/time from a special file. Good for testing, otherwise sucky.
4. the strangest one, a dev option to use the time of the newest C file? Which is what I think you want?

And none of this attacks the real problem about old binaries handling the GPS epoch.
The difference between compile time and repro time will be quite small: hours in the case of Gentoo, maybe a few years in the case of laggards. But the problem to be solved is when a user has a 20 year old ntpd binary. Sadly I know where several such running instances are now.

Maybe allow the user to put a GPS epoch in ntp.conf? No matter how well we guess we'll still get it wrong sometimes. Also useful for testing. Maybe a small script the user can run to verify current time and save in ntp.conf.d/?

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588
Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com Tue Apr 18 03:48:01 2017
From: gem at rellim.com (Gary E. Miller)
Date: Mon, 17 Apr 2017 20:48:01 -0700
Subject: Big picture...
In-Reply-To: <20170418032652.E927940605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170418032652.E927940605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170417204801.5fea3a38@spidey.rellim.com>

Yo Hal!

On Mon, 17 Apr 2017 20:26:52 -0700 Hal Murray wrote:

> I think we are reasonably close.)

+1

> The packet code uses l_fp. The OS interfaces use timespec and
> time_t.

Yes, pretty much as it is now.

> The code in the middle works in time offsets using seconds
> in doubles and some time in time_t.

Currently, sort of, yes.

But, I see no point not to do the offsets as timespec's too. Otherwise big time corrections need multiple jumps due to loss of precision in the doubles for large 'gate' times. And the time sve in doubles is lost in the converting back and forth.

> It seems like a good goal. I
> think we are close.
That was the direction I was going when I got
> rid of a lot of time64_t a while ago and when I cleaned up the
> leap-second code.

Yes, we had good momentum, but then I got stuck in the warning, which flushed out a lot of little junk and will be hard to do after 1.0.

Most of the isolated pointless usage of l_fp's can be converted to timespec(64) with no problems. The timespec(32) people are hosed no matter what we do in 2038.

> I'd like to get rid of ntp_calendar. Nothing urgent, it just seems
> like we should be able to use POSIX date/time calls instead. I made
> some progress in the recent leap-second cleanup.

May not be totally possible, but close. If all but the NTP external interface is timespec(64) then no need to carry a lot of oddball time arithmetic.

> One rough edge is that there is no UTC version of mktime.

Yeah, a PITA, but ugly solutions abound.

> For the leap-second code, I used 28 days rather than 1 month. I
> could do that calculation with simple arithmetic.

Yet another can of worms...

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588
Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net Tue Apr 18 04:25:42 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Mon, 17 Apr 2017 21:25:42 -0700
Subject: _DATE__, version string, and distros
Message-ID: <20170418042542.BED4F40605C@ip-64-139-1-69.sjc.megapath.net>

>> So the second step is to add a way to override the time stamp from
>> autorevision. That will also solve my problem of getting useful info
>> from the version string.

> Debug mode only?

There already is too much tangled up with debug.
I've been assuming it would be a new option. >> If we don't want to break the build-repro guarantee, we >> have to put the new date into a file or update the last >>commit date. > commit date is fine. Commit date alone doesn't solve my problem. I want an option that shows that I've made some edits that aren't committed. I could put an exit+commit for a dummy file into a script, but then I couldn't ever push anything without trashing the master repo. Maybe git can handle that case. If so, I need a lesson. -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 18 04:52:51 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 00:52:51 -0400 Subject: Time to slow down and be more careful In-Reply-To: <20170418024205.08FE940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170417224003.GB10652@thyrsus.com> <20170418024205.08FE940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418045251.GA15781@thyrsus.com> Hal Murray : > > >> I changed things so that there is never a conversion from l_fp > >> to full time. There is a subtract done on the l_fp side. The clock > >> offset in l_fp is converted to an offset in seconds. I think it's a > >> double. That eventually turns into a clock adjustment. There is > >> no explicit pivot. There is an implicit pivot of the current time. > > > I'm actually not sure which code you're talking about here, and I think it's > > important that I should. > > You added some pivot code to step_systime in libntp/systime.c > I don't understand why. I didn't "add" a damn thing. I restored the pivot computation that was there before it was mistakenly removed. You can compare the Classic version to check this; there are some superficial differences due to the l_fp and macro cleanups but the logic is the same. > The argument is a time step as a double. That comes from packets exchanged with a server using l_fp. That's at most 31 bits plus sign, relative to the current system time. 
That's the biggest step you can take. You can step across epoch boundaries. You can't step over whole epochs. Maximum step size isn't the problem. The problem, which is invisible now but won't be in the future, is that any given l_fp can represent a countable infinity of timestamps separated from each other by the 136-year cycle lengths. Which one it *actually* represents depends on the base epoch of the sending ntpd, which we don't know. Mills's solution was to map the timestamp to whichever of these aleph-0 possibilities is closest in time to a pivot. Then, if the time intended by the sender was within a half-cycle of the pivot, all is good. That's what this code is doing. Now, you might think the smart thing to do is pivot around now, but *what if your system clock isn't reliable?* Like, you're running on an RTC-less system and it comes up zero. Mills's workaround against this possibility was to pivot on the ntpd's build date. This is why, if the same distros that didn't insist on Classic having reproducible builds get a wild hair and pressure us to do it, we have to tell them no. If we take out the build-time pivot we create subtle risks to future systems that happen not to have reliable clocks; best case, you'd get good sources being rejected as falsetickers because a timestamp was pivoted to the wrong era. Weirder things could occur. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 18 04:58:42 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 17 Apr 2017 21:58:42 -0700 Subject: Big picture... Message-ID: <20170418045842.ECEFD40605C@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: > But, I see no point not to do the offsets as timespec's too. 
Otherwise big > time corrections need multiple jumps due to loss of precision in the doubles > for large 'gate' times. And the time sve in doubles is lost in the > converting back and forth. [What's a "time sve"? I can't find a typo that turns it into something useful.] I think the code using doubles would be much easier to read. If we convert from an offset in l_fp to double, we get sub ns resolution for small offsets. How big an offset can a double hold with ns precision? 53 bits of ns is many seconds. So anything but the first long jump will be OK. I'm happy with that. It might be in interesting experiment. Set the time so that it is off as far as possible and we get the worst precision through a double. Then start ntpd. Compare that with the time starting off by 1000 seconds. -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 18 05:06:03 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 01:06:03 -0400 Subject: Big picture... In-Reply-To: <20170418032652.E927940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418032652.E927940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418050603.GB15781@thyrsus.com> Hal Murray : > Draw an upside down tree of ntpd modules with packet processing on the left > and OS interface on the right and data processing in the middle. (I'm > handwaving. I mean the way you would draw the picture if you were explaining > things rather than the actual current module structure. I think we are > reasonably close.) > > The packet code uses l_fp. The OS interfaces use timespec and time_t. The > code in the middle works in time offsets using seconds in doubles and some > time in time_t. > > I'm not sure that is totally correct. It seems like a good goal. I think we > are close. That was the direction I was going when I got rid of a lot of > time64_t a while ago and when I cleaned up the leap-second code. It's a good direction. > I'd like to get rid of ntp_calendar. 
Nothing urgent, it just seems like we > should be able to use POSIX date/time calls instead. I made some progress in > the recent leap-second cleanup. The most basic requirement for ntp_calendar is that it needs to be able to translate a pivot date to a form that lfp_stamp_to_tspec() can use. > One rough edge is that there is no UTC version of mktime. There is timegm, > but it's not POSIX. The linux man pages says it's a GNU extension and is > also available on BSDs. Yeah, if we're going to stay inside the POSIX lines that is a blocker. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Tue Apr 18 05:07:09 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 01:07:09 -0400 Subject: _DATE__, version string, and distros In-Reply-To: <20170418042542.BED4F40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418042542.BED4F40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418050709.GC15781@thyrsus.com> Hal Murray : > >> So the second step is to add a way to override the time stamp from > >> autorevision. That will also solve my problem of getting useful info > >> from the version string. > > > Debug mode only? > > There already is too much tangled up with debug. > > I've been assuming it would be a new option. > > >> If we don't want to break the build-repro guarantee, we > >> have to put the new date into a file or update the last > >>commit date. > > commit date is fine. > > Commit date alone doesn't solve my problem. I want an option that shows that > I've made some edits that aren't committed. I could put an exit+commit for a > dummy file into a script, but then I couldn't ever push anything without > trashing the master repo. > > Maybe git can handle that case. If so, I need a lesson. Have you tried git describe? -- Eric S. 
Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 18 05:40:08 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 17 Apr 2017 22:40:08 -0700 Subject: _DATE__, version string, and distros In-Reply-To: Message from "Eric S. Raymond" of "Tue, 18 Apr 2017 01:07:09 EDT." <20170418050709.GC15781@thyrsus.com> Message-ID: <20170418054008.A0F9C40605C@ip-64-139-1-69.sjc.megapath.net> >> Maybe git can handle that case. If so, I need a lesson. > Have you tried git describe? I don't see how that does what I want. I'm looking for something that ignores a file and any commits to that file when I do a push. .gitignore doesn't do what I want. I need to commit something so I can update the last-commit time stamp, but I don't want that junk to get pushed back where anybody else will see it. -- These are my opinions. I hate spam. From gem at rellim.com Tue Apr 18 05:44:45 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 17 Apr 2017 22:44:45 -0700 Subject: Big picture... In-Reply-To: <20170418045842.ECEFD40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418045842.ECEFD40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170417224445.3207d0b0@spidey.rellim.com> Yo Hal! On Mon, 17 Apr 2017 21:58:42 -0700 Hal Murray wrote: > gem at rellim.com said: > > But, I see no point not to do the offsets as timespec's too. > > Otherwise big time corrections need multiple jumps due to loss of > > precision in the doubles for large 'gate' times. And the time sve > > in doubles is lost in the converting back and forth. > > [What's a "time sve"? I can't find a typo that turns it into > something useful.] s/sve/saved/ > I think the code using doubles would be much easier to read. Really? You see a big difference between these tow? 
    double a, b, c; c = a + b;

And:

    timespec a, b, c; c = timespec_add(a, b);

> If we convert from an offset in l_fp to double, we get sub ns
> resolution for small offsets.

How do you start with two things in ns, subtract and get them now in sub-ns?

> How big an offset can a double hold with ns precision? 53 bits of ns
> is many seconds. So anything but the first long jump will be OK.
> I'm happy with that.

Acceptable, not optimal. And it leads to this conversation on precision which wastes hours every month for all time. Just do it right and be done with it.

I would consider using long double if you feel the need for floating point.

> It might be in interesting experiment. Set the time so that it is
> off as far as possible and we get the worst precision through a
> double. Then start ntpd. Compare that with the time starting off by
> 1000 seconds.

And when Eric changes step_systime() so I can add tests for it that will be one of the first tests.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588
Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com Tue Apr 18 05:47:06 2017
From: gem at rellim.com (Gary E. Miller)
Date: Mon, 17 Apr 2017 22:47:06 -0700
Subject: _DATE__, version string, and distros
In-Reply-To: <20170418042542.BED4F40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170418042542.BED4F40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170417224706.2ff51012@spidey.rellim.com>

Yo Hal!

On Mon, 17 Apr 2017 21:25:42 -0700 Hal Murray wrote:

> >> So the second step is to add a way to override the time stamp from
> >> autorevision.
That will also solve my problem of getting useful > >> info from the version string. > > > Debug mode only? > > There already is too much tangled up with debug. > I've been assuming it would be a new option. Oh, yeah, me too, sorry for lack of clarity. > >> If we don't want to break the build-repro guarantee, we > >> have to put the new date into a file or update the last > >>commit date. > > commit date is fine. > > Commit date alone doesn't solve my problem. Correct, it solves the repro problem. > I want an option that > shows that I've made some edits that aren't committed. Just a flag? Maybe waf runs git status and makes a flag from the result? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Tue Apr 18 05:51:05 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 17 Apr 2017 22:51:05 -0700 Subject: Time to slow down and be more careful In-Reply-To: <20170418045251.GA15781@thyrsus.com> References: <20170417224003.GB10652@thyrsus.com> <20170418024205.08FE940605C@ip-64-139-1-69.sjc.megapath.net> <20170418045251.GA15781@thyrsus.com> Message-ID: <20170417225105.1c308b6b@spidey.rellim.com> Yo Eric! On Tue, 18 Apr 2017 00:52:51 -0400 "Eric S. Raymond" wrote: > Hal Murray : > > > > >> I changed things so that there is never a conversion from l_fp > > >> to full time. There is a subtract done on the l_fp side. The > > >> clock offset in l_fp is converted to an offset in seconds. I > > >> think it's a double. That eventually turns into a clock > > >> adjustment. There is no explicit pivot. 
There is an implicit > > >> pivot of the current time. > > > > > I'm actually not sure which code you're talking about here, and I > > > think it's important that I should. > > > > You added some pivot code to step_systime in libntp/systime.c > > I don't understand why. > > I didn't "add" a damn thing. I restored the pivot computation that > was there before it was mistakenly removed. You can compare the > Classic version to check this; there are some superficial differences > due to the l_fp and macro cleanups but the logic is the same. OK, my mistake, you re-added the broken pivot code. > > The argument is a time step as a double. That comes from packets > > exchanged with a server using l_fp. That's at most 31 bits plus > > sign, relative to the current system time. That's the biggest step > > you can take. You can step across epoch boundaries. You can't > > step over whole epochs. > > Maximum step size isn't the problem. Agreed, just one of them. > The problem, which is invisible now but won't be in the future, is > that any given l_fp can represent a countable infinity of timestamps > separated from each other by the 136-year cycle lengths. Which one it > *actually* represents depends on the base epoch of the sending ntpd, > which we don't know. I guess you have not read my comments, and Hal's comments, to the bug where we both show, in different ways, why the pivot is just a bug. Please read that, ponder, and return here. Or, if you prefer, make the changes I have suggested, also presented in the bug, so I can write the tests that prove things one or the other. Can we please get out of the bike shed loop and just prove something? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." 
- Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Tue Apr 18 06:21:49 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 17 Apr 2017 23:21:49 -0700 Subject: Time to slow down and be more careful In-Reply-To: Message from "Eric S. Raymond" of "Tue, 18 Apr 2017 00:52:51 EDT." <20170418045251.GA15781@thyrsus.com> Message-ID: <20170418062149.2651E40605C@ip-64-139-1-69.sjc.megapath.net> > I didn't "add" a damn thing. I restored the pivot computation that was > there before it was mistakenly removed. You can compare the Classic version > to check this; there are some superficial differences due to the l_fp and > macro cleanups but the logic is the same. > Mills's solution was to map the timestamp to whichever of these aleph-0 > possibilities is closest in time to a pivot. Then, if the time intended by > the sender was within a half-cycle of the pivot, all is good. That's what > this code is doing. OK, I think I'm catching on. That code you restored is crap. It takes an offset, converts it to a l_fp, gets the system time in l_fp, adds them together, then converts it back to a system time. Due to the small range of an l_fp, that convert back step needs a pivot time. That pivot will catch starting up in 2039 on systems without RTC. Before your recent restore, step_systime did a pivot around "now". That works correctly once it gets started. I've said several times that we depend on the current time to be reasonable. The no-RTC case is a good way to get an unreasonable time. I see two ways to handle the no-RTC case. One is to run some other program before starting ntpd. It could use the build date or file system or whatever to set the system time to a reasonable value. My straw man for a file would be the leap seconds file. 
That's assuming it won't get updated when the system has a bogus time and/or the update process will preserve the time stamps. (We need to handle the no-leap-second file case too.)

The other would be to put a sanity check early during startup of ntpd. If the system time is less than the build date, bump it up.

Either way could also check for time unreasonably far into the future.

-- These are my opinions. I hate spam.

From hmurray at megapathdsl.net Tue Apr 18 06:51:34 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Mon, 17 Apr 2017 23:51:34 -0700
Subject: _DATE__, version string, and distros
Message-ID: <20170418065134.559E540605C@ip-64-139-1-69.sjc.megapath.net>

>> I want an option that
>> shows that I've made some edits that aren't committed.

> Just a flag? Maybe waf runs git status and makes a flag from the result?

The option I want needs to run on systems without git.

I want more than just a flag. I want to know that the result of an edit, build, run is different from a following edit, build, run. A time stamp solves that. Anything else is overkill and we should probably stop bike-shedding it.

-- These are my opinions. I hate spam.

From hmurray at megapathdsl.net Tue Apr 18 07:03:38 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Tue, 18 Apr 2017 00:03:38 -0700
Subject: Big picture...
In-Reply-To: Message from "Eric S. Raymond" of "Tue, 18 Apr 2017 01:06:03 EDT." <20170418050603.GB15781@thyrsus.com>
Message-ID: <20170418070338.5AE5A40605C@ip-64-139-1-69.sjc.megapath.net>

esr at thyrsus.com said:
> The most basic requirement for ntp_calendar is that it needs to be able to
> translate a pivot date to a form that lfp_stamp_to_tspec() can use.

For that, we can use mktime. The time zone offset isn't a big deal. We would probably have to back up by the worst case zone offset so everybody can test right after building.

If we are using a build time, we can run an external script to generate the date in a convenient format.
date -u +%s looks good to me. -------- Another part for the big picture. Except for the l_fp time-stamps in packets, all of the full times are POSIX times. (or should be) -- These are my opinions. I hate spam. From gem at rellim.com Tue Apr 18 07:15:41 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 18 Apr 2017 00:15:41 -0700 Subject: _DATE__, version string, and distros In-Reply-To: <20170418065134.559E540605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418065134.559E540605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418001541.2143d6b1@spidey.rellim.com> Yo Hal! On Mon, 17 Apr 2017 23:51:34 -0700 Hal Murray wrote: > >> I want an option that > >> shows that I've made some edits that aren't committed. > > > Just a flag? Maybe waf runs git status and makes a flag from the > > result? > > The option I want needs to run on systems without git. You are editing files on a system without git and still want fine grain versioning? How does that work? How do you know if any files changed? > I want more than just a flag. I want to know that the result of an > edit, build, run is different from a following edit, build, run. A > time stamp solves that. No, you want more. If the timestamp is just the build time, that does not tell you if any files are changed from default. You also want to know if you have made un-committed edits. > Anything else if overkill and we should > probably stop bike-sheding it. Send patches then! Or wait for me to get to it. I think we have enough to get started and see how it flies. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Tue Apr 18 07:39:21 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 00:39:21 -0700 Subject: _DATE__, version string, and distros Message-ID: <20170418073921.63F1240605C@ip-64-139-1-69.sjc.megapath.net> > You are editing files on a system without git and still want fine grain > versioning? How does that work? How do you know if any files changed? Mostly, I edit and test on one system, then use scp to push the bits to other systems. Occasionally, I do a throwaway edit on one of the other systems for a quick test. Sometimes I do serious work if that's where I can reproduce the bug, then copy the edited files back. With the current setup, all I get is the last-commit from git. There is no way to tell if a system is running the unmodified code or some of my edits. I can live with a simple build time. If you want to get fancy, the last edit would be nice. Making a script do the work seems like the right start. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Apr 18 08:31:06 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 01:31:06 -0700 Subject: Time to slow down and be more careful Message-ID: <20170418083106.1354A40605C@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: > I guess you have not read my comments, and Hal's comments, to the bug where > we both show, in different ways, why the pivot is just a bug. I wouldn't call the pivot a bug, just unnecessary if the system time is "close enough". Without that pivot, things will screwup if the correct time is 2039 and the system time is close to 1970. That will happen in a just booted system without RTC if it doesn't have something else that gets a sane time. ---------- I just rebooted a Raspberry Pi. Somebody is setting the system time. I don't know who/where. 
From /var/log/messages:
Apr 18 01:11:57 rp10 rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="420" x-info="http://www.rsyslog.com"] exiting on signal 15.
Apr 18 01:12:00 rp10 rsyslogd: [origin software="rsyslogd" swVersion="8.4.2" x-pid="376" x-info="http://www.rsyslog.com"] start

From /var/log/syslog:
Apr 18 00:17:01 rp10 CRON[4095]: (root) CMD ( cd / && run-parts --report /etc/cron.hourly)
Apr 18 01:11:57 rp10 systemd[1]: Started Turns off Raspberry Pi display backlight on shutdown/reboot.
Apr 18 01:12:01 rp10 ntpd[433]: ntpd play-0.9.7+467 2017-04-15T07:29:36Z: Starting
Apr 18 01:12:01 rp10 ntpd[433]: Command line: /usr/local/sbin/ntpd -p /var/run/ntpd.pid -g -u 106:111
Apr 18 01:12:01 rp10 ntp[393]: Starting NTP server: ntpd.
Apr 18 01:12:01 rp10 ntpd[434]: proto: precision = 1.042 usec (-20)
Apr 18 01:12:01 rp10 ntpd[434]: successfully locked into RAM
Apr 18 01:12:01 rp10 ntpd[434]: switching logging to file /var/log/ntp/ntpd.log

--
These are my opinions. I hate spam.

From esr at thyrsus.com Tue Apr 18 13:02:02 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 18 Apr 2017 09:02:02 -0400
Subject: Time to slow down and be more careful
In-Reply-To: <20170417225105.1c308b6b@spidey.rellim.com>
References: <20170417224003.GB10652@thyrsus.com>
 <20170418024205.08FE940605C@ip-64-139-1-69.sjc.megapath.net>
 <20170418045251.GA15781@thyrsus.com>
 <20170417225105.1c308b6b@spidey.rellim.com>
Message-ID: <20170418130202.GA20726@thyrsus.com>

Gary E. Miller :
> I guess you have not read my comments, and Hal's comments, to the bug where
> we both show, in different ways, why the pivot is just a bug.
>
> Please read that, ponder, and return here.

Right. You have two things wrong. Mills's correction is not a bug, and
Hal has (at least as of this morning) figured out that it's not a bug.
He thinks the implementation is crap, and I begin to think he's right about
that - in which case the correct thing to do is fix it.
> Can we please get out of the bike shed loop and just prove something? We are not in a bike shed - the cycle correction is required for RFC5905 conformance and we will be *roasted* if we fuck it up. You continue to not understand the whole problem, or to think it can be dismissed for various reasons such as being too far in the future. Hal doesn't have it quite right either, but he's getting there. Once he has understood that we have to minimize our exposure to a bad system clock at startup, and that Mills's pivot algorithm was a weirdly clever way to attempt that even if the implementation isn't yet quite right, he'll be fully up to speed. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From esr at thyrsus.com Tue Apr 18 14:35:15 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 10:35:15 -0400 Subject: Time to slow down and be more careful In-Reply-To: <20170418062149.2651E40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418045251.GA15781@thyrsus.com> <20170418062149.2651E40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418143515.GB20726@thyrsus.com> Hal Murray : > > > I didn't "add" a damn thing. I restored the pivot computation that was > > there before it was mistakenly removed. You can compare the Classic version > > to check this; there are some superficial differences due to the l_fp and > > macro cleanups but the logic is the same. > > > Mills's solution was to map the timestamp to whichever of these aleph-0 > > possibilities is closest in time to a pivot. Then, if the time intended by > > the sender was within a half-cycle of the pivot, all is good. 
That's what > > this code is doing. > > OK, I think I'm catching on. > > That code you restored is crap. It takes an offset, converts it to a l_fp, > gets the system time in l_fp, adds them together, then converts it back to a > system time. Due to the small range of an l_fp, that convert back step needs > a pivot time. Hm. I need to stare at that code some more. I'm beginning to think the pivot is the right idea implemented in a slightly wrong place. Maybe it ought to be applied to in-packet timestamps as soon as they arrive? > That pivot will catch starting up in 2039 on systems without RTC. > > Before your recent restore, step_systime did a pivot around "now". That > works correctly once it gets started. > > I've said several times that we depend on the current time to be reasonable. Where, specifically? Because if so, that is a serious bug that needs to be fixed. Let's keep our eye on the ball, here. This is a time-synchronization daemon - we have to deal cleanly with the case where the system clock is garbage at startup. > I see two ways to handle the no-RTC case. > > One is to run some other program before starting ntpd. It could use the > build date or file system or whatever to set the system time to a reasonable > value. My straw man for a file would be the leap seconds file. That's > assuming it won't get updated when the system has a bogus time and/or the > update process will preserve the time stamps. (We need to handle the > no-leap-second file case too.) > > The other would be to but a sanity check early during startup of ntpd. If > the system time is less than the build date, bump it up. > > Either way could also check for time unreasonably far into the future. Simpler: The last-modification time of /etc. But that could foo up if /etc is ever modified at boot time before ntpd can sync the clock. Anyway, now I think we're moving in a constructive direction. -- Eric S. 
Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From esr at thyrsus.com Tue Apr 18 15:03:05 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 18 Apr 2017 11:03:05 -0400
Subject: Big picture...
In-Reply-To: <20170418070338.5AE5A40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170418050603.GB15781@thyrsus.com>
 <20170418070338.5AE5A40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170418150305.GC20726@thyrsus.com>

Hal Murray :
>
> esr at thyrsus.com said:
> > The most basic requirement for ntp_calendar is that it needs to be able to
> > translate a pivot date to a form that lfp_stamp_to_tspec() can use.
>
> For that, we can use mktime. The time zone offset isn't a big deal. We
> would probably have to back up by the worst case zone offset so everybody can
> test right after building.
>
> If we are using a build time, we can run an external script to generate the
> date in a convenient format. date -u +%s looks good to me.

Maybe. Almost. How are we going to do this under Windows? Are we giving
up on a Windows port? Because POSIX API conformance tells us we can get at
POSIX time from C, but we have no guarantee that date(1) will exist.

> Another part for the big picture.
>
> Except for the l_fp time-stamps in packets, all of the full times are POSIX
> times. (or should be)

Agreed. Specifically, struct timespecs.

--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From esr at thyrsus.com Tue Apr 18 15:08:42 2017
From: esr at thyrsus.com (Eric S.
Raymond) Date: Tue, 18 Apr 2017 11:08:42 -0400 Subject: Time to slow down and be more careful In-Reply-To: <20170418083106.1354A40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418083106.1354A40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418150842.GD20726@thyrsus.com> Hal Murray : > > gem at rellim.com said: > > I guess you have not read my comments, and Hal's comments, to the bug where > > we both show, in different ways, why the pivot is just a bug. > > I wouldn't call the pivot a bug, just unnecessary if the system time is > "close enough". > > Without that pivot, things will screwup if the correct time is 2039 and the > system time is close to 1970. That will happen in a just booted system > without RTC if it doesn't have something else that gets a sane time. Right. You now understand what Mills's compile-time pivot was intended to hedge against. Thank you for pointing out that the implementation may be wrong; that may need fixing. > I just rebooted a Raspberry Pi. Somebody is setting the system time. I > don't know who/where. I do. There's a special hack to save the time at shutdown and restore it at startup. The assumption is that that sort of reboot happens fast enough so that the restored time is not too horrible for syslogging. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 18 17:14:33 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 10:14:33 -0700 Subject: Startup time on Raspberry Pi In-Reply-To: Message from "Eric S. Raymond" of "Tue, 18 Apr 2017 11:08:42 EDT." <20170418150842.GD20726@thyrsus.com> Message-ID: <20170418171433.45CD940605C@ip-64-139-1-69.sjc.megapath.net> >> I just rebooted a Raspberry Pi. Somebody is setting the >> system time. I don't know who/where. > I do. 
There's a special hack to save the time at shutdown and restore it at
> startup. The assumption is that that sort of reboot happens fast enough so
> that the restored time is not too horrible for syslogging.

From the log file:
  18 Apr 01:12:08 ntpd[434]: 0.0.0.0 c41c 0c clock_step +6.249811 s

That seems consistent with your description.

I'm curious. I'd like to look at the code. Do you know where it is located?
Name of program or similar? ...

--
These are my opinions. I hate spam.

From gem at rellim.com Tue Apr 18 17:21:54 2017
From: gem at rellim.com (Gary E. Miller)
Date: Tue, 18 Apr 2017 10:21:54 -0700
Subject: Startup time on Raspberry Pi
In-Reply-To: <20170418171433.45CD940605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170418150842.GD20726@thyrsus.com>
 <20170418171433.45CD940605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170418102154.26ca8975@spidey.rellim.com>

Yo Hal!

On Tue, 18 Apr 2017 10:14:33 -0700
Hal Murray wrote:

> >> I just rebooted a Raspberry Pi. Somebody is setting the
> >> system time. I don't know who/where.

On Gentoo it is a program called swclock. Just look at what your
systemd runs early at boot time.

swclock runs on shutdown and touches the current time on the file:
/run/openrc/shutdowntime

On boot it reads the timestamp on that file and sets the system clock
to that time.

systemd probably does it in some similar way, using some strange
convoluted logic...

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com Tue Apr 18 17:32:04 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Tue, 18 Apr 2017 10:32:04 -0700
Subject: Time to slow down and be more careful
In-Reply-To: <20170418130202.GA20726@thyrsus.com>
References: <20170417224003.GB10652@thyrsus.com>
 <20170418024205.08FE940605C@ip-64-139-1-69.sjc.megapath.net>
 <20170418045251.GA15781@thyrsus.com>
 <20170417225105.1c308b6b@spidey.rellim.com>
 <20170418130202.GA20726@thyrsus.com>
Message-ID: <20170418103204.7ede4ae8@spidey.rellim.com>

Yo Eric!

On Tue, 18 Apr 2017 09:02:02 -0400
"Eric S. Raymond" wrote:

> Gary E. Miller :
> > I guess you have not read my comments, and Hal's comments, to the
> > bug where we both show, in different ways, why the pivot is just a
> > bug.
> >
> > Please read that, ponder, and return here.
>
> Right. You have two things wrong. Mills's correction is not a bug,

It is not a bug, but doing it there is a bug. Follow along while I
do the math:

sys_steptime() takes these inputs:

    sys_residual - double seconds, a time offset, range +/- 1us.
        no l_fp or 1JAN1900 in sight.

    step - double seconds, range usually +/- 1ms, 1us or 1ns.
        Never larger than 'gate', but for argument assume it
        could be as large as 1Jan2200 - 1Jan1970.
        no l_fp or 1JAN1900 in sight.

    system clock - timespec(64) (except on some 32 bit binaries.)
        Range 1Jan1970 to well past 1Jan2200. We can easily
        add or subtract 100s of years to the current timestamp with
        no problems, as long as we stay past 1Jan1970.
        no l_fp or 1JAN1900 in sight.

All sys_steptime() does is add all three up to get the desired system
time to step to. The time to step to must be a timespec(64) to send to
the syscall.
        no l_fp or 1JAN1900 in sight.

So, where do you see any 1Jan1900 term or pivot in there?
        no l_fp or 1JAN1900 in sight.

Please be specific, not invoking deities.

> He thinks the implementation is crap, and I begin to think he's
> right about that

We all agree there.

> - in which case the correct thing to do is fix it.

We all agree there.

>
> > Can we please get out of the bike shed loop and just prove
> > something?
> > We are not in a bike shed - the cycle correction is required for > RFC5905 conformance and we will be *roasted* if we fuck it up. Correct, we have to get it right, but this is the WRONG place. See above. > You > continue to not understand the whole problem, or to think it can be > dismissed for various reasons such as being too far in the future. I have provided my analysis above, including times far in the future. What step is wrong? > Hal doesn't have it quite right either, but he's getting there. Once > he has understood that we have to minimize our exposure to a bad > system clock at startup, and that Mills's pivot algorithm was a > weirdly clever way to attempt that even if the implementation isn't > yet quite right, he'll be fully up to speed. Clever right, wrong place. If you still disagree, show me which statement in my math above is incorrect. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Tue Apr 18 17:35:22 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 13:35:22 -0400 Subject: Startup time on Raspberry Pi In-Reply-To: <20170418171433.45CD940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418150842.GD20726@thyrsus.com> <20170418171433.45CD940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418173522.GA25612@thyrsus.com> Hal Murray : > > >> I just rebooted a Raspberry Pi. Somebody is setting the > >> system time. I don't know who/where. > > > I do. There's a special hack to save the time at shutdown and restore it at > > startup. 
The assumption is that that sort of reboot happens fast enough so
> > that the restored time is not too horrible for syslogging.
>
> From the log file:
>   18 Apr 01:12:08 ntpd[434]: 0.0.0.0 c41c 0c clock_step +6.249811 s
>
> That seems consistent with your description.
>
> I'm curious. I'd like to look at the code. Do you know where it is located?
> Name of program or similar? ...

I do not. Somebody told me how it works, but I have not examined the
mechanism myself.

--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From cbwierda at gmail.com Tue Apr 18 17:36:02 2017
From: cbwierda at gmail.com (Clark B. Wierda)
Date: Tue, 18 Apr 2017 13:36:02 -0400
Subject: Startup time on Raspberry Pi
In-Reply-To: <20170418171433.45CD940605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170418150842.GD20726@thyrsus.com>
 <20170418171433.45CD940605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID:

On Tue, Apr 18, 2017 at 1:14 PM, Hal Murray wrote:

>
> >> I just rebooted a Raspberry Pi. Somebody is setting the
> >> system time. I don't know who/where.
>
> > I do. There's a special hack to save the time at shutdown and restore it at
> > startup. The assumption is that that sort of reboot happens fast enough so
> > that the restored time is not too horrible for syslogging.
>
> From the log file:
>   18 Apr 01:12:08 ntpd[434]: 0.0.0.0 c41c 0c clock_step +6.249811 s
>
> That seems consistent with your description.
>
> I'm curious. I'd like to look at the code. Do you know where it is located?
> Name of program or similar? ...
>

I believe Raspberry Pi uses fake-hwclock to keep time advancing.

My understanding (not at home to check directly) is that there is a cron
entry to save time hourly and script to save on shutdown. There is another
script that reads the file on boot.
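
The save-and-restore scheme described above could be sketched roughly as
follows. This is not the fake-hwclock source. The file path, the function
names, and the rule that the clock is only ever moved forward are all
assumptions made for illustration.

```python
# Illustrative sketch of a save/restore "software clock"; not fake-hwclock itself.
# The path and the function names below are hypothetical.
import os
import tempfile
import time


def save_time(path: str, now: float) -> None:
    """Record the current time (Unix seconds) for use at the next boot."""
    with open(path, "w") as f:
        f.write(f"{int(now)}\n")


def restored_time(path: str, now: float) -> float:
    """Return the time the clock should be set to at boot.

    Assumption: the clock is only ever moved forward, so a saved time
    that is older than the current clock is ignored.
    """
    try:
        with open(path) as f:
            saved = float(f.read().strip())
    except (OSError, ValueError):
        return now  # no usable saved time; leave the clock alone
    return max(now, saved)


if __name__ == "__main__":
    demo = os.path.join(tempfile.mkdtemp(), "saved-time")
    save_time(demo, time.time())      # what a shutdown or cron hook would do
    print(restored_time(demo, 0.0))   # a clock that came up at 1970
```

A restore like this only gets the clock "close enough" for logging; ntpd
still has to step the remaining error, as the +6.2 s clock_step in the log
line above shows.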
Clark -------------- next part -------------- An HTML attachment was scrubbed... URL: From gem at rellim.com Tue Apr 18 17:37:56 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 18 Apr 2017 10:37:56 -0700 Subject: Time to slow down and be more careful In-Reply-To: <20170418143515.GB20726@thyrsus.com> References: <20170418045251.GA15781@thyrsus.com> <20170418062149.2651E40605C@ip-64-139-1-69.sjc.megapath.net> <20170418143515.GB20726@thyrsus.com> Message-ID: <20170418103756.3b6d8cce@spidey.rellim.com> Yo Eric! On Tue, 18 Apr 2017 10:35:15 -0400 "Eric S. Raymond" wrote: > Hm. I need to stare at that code some more. I'm beginning to think > the pivot is the right idea implemented in a slightly wrong place. > Maybe it ought to be applied to in-packet timestamps as soon as they > arrive? Finally!!!! > > Either way could also check for time unreasonably far into the > > future. > > Simpler: The last-modification time of /etc. But that could foo up if > /etc is ever modified at boot time before ntpd can sync the clock. But /etc itself is infrequently updated, may be set by any package install to something far in the past. The time on the driftfile, if present is usually to within an hour. Better if always set on ntpd shutdown. Backstop that time with ntpd build time or a time in ntp.conf. > Anyway, now I think we're moving in a constructive direction. Good. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Tue Apr 18 17:53:37 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 10:53:37 -0700 Subject: Startup time on Raspberry Pi Message-ID: <20170418175337.E016040605C@ip-64-139-1-69.sjc.megapath.net> > I believe Raspberry Pi uses fake-hwclock to keep time advancing. Thanks. There is even a man page with all the details. Time is stored in /etc/fake-hwclock.data -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Apr 18 18:08:11 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 11:08:11 -0700 Subject: Big picture... In-Reply-To: Message from "Eric S. Raymond" of "Tue, 18 Apr 2017 11:03:05 EDT." <20170418150305.GC20726@thyrsus.com> Message-ID: <20170418180811.7E98840605C@ip-64-139-1-69.sjc.megapath.net> >> If we are using a build time, we can run an external script >> to generate the date in a convenient format. >> date -u +%s looks good to me. > Maybe. Almost. How are we going to do this under Windows? Are we giving > up on a Windows port? Because POSIX API conformance tells us we can get at > POSIX time froom C, but we have no guarantee that date(1) will exist. Is that going to be the biggest problem with Windows? I know next to nothing about Windows. Does their POSIX support include only c code or will it also include shell stuff? How many programs outside of sh itself do we depend on? Will autorevision work? Does POSIX include a date command? ... -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Apr 18 18:38:30 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 11:38:30 -0700 Subject: Let's get rid of pivots In-Reply-To: Message from "Eric S. Raymond" of "Tue, 18 Apr 2017 10:35:15 EDT." 
<20170418143515.GB20726@thyrsus.com> Message-ID: <20170418183830.3E3B440605C@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > Hm. I need to stare at that code some more. I'm beginning to think the > pivot is the right idea implemented in a slightly wrong place. Maybe it > ought to be applied to in-packet timestamps as soon as they arrive? "as soon as they arrive" seems ugly to me, but maybe that's just because I've been thinking of using l_fp to compute an offset and using the offset to adjust the time with an effective pivot of "now". An alternative approach would be to get rid of pivots totally. If we removed all of the pivot code and logic from the current code, it will work until 2036. How long will it take to implement and deploy a new version of ntp protocol with enough bits? My straw man would be 64 bits of ns. That's 500+ years. (if I did the math right) -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Apr 18 18:53:37 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 11:53:37 -0700 Subject: Time to slow down and be more careful Message-ID: <20170418185337.B63EA40605C@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: > It is not a bug, but doing it there is a bug. Follow along while I do the > math: ... The case Eric was considering was the local clock is 1970 and the target time is post 2036. That requires the step adjustment to be more than 31 bits of seconds. That would work if we applied Eric's suggestion of doing the pivot way back when the l_fp format is extracted from the packets. Then the arithmetic with the 4 time stamps will get a big offset. [This may be too short to make sense. Sorry, I'm leaving now.] -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 18 19:04:41 2017 From: esr at thyrsus.com (Eric S. 
Raymond)
Date: Tue, 18 Apr 2017 15:04:41 -0400
Subject: Let's get rid of pivots
In-Reply-To: <20170418183830.3E3B440605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170418143515.GB20726@thyrsus.com>
 <20170418183830.3E3B440605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170418190441.GA26010@thyrsus.com>

Hal Murray :
> An alternative approach would be to get rid of pivots totally. If we removed
> all of the pivot code and logic from the current code, it will work until
> 2036. How long will it take to implement and deploy a new version of ntp
> protocol with enough bits?

There are technical issues with this idea, but they're not the ones that
worry me most.

The politics and optics of abandoning the cross-era compatibility
written into RFC5905 would be *awful*. Our project has enemies. We
must not give them a club that large to beat us with.

We not only have to do the conservative, RFC-conformant thing, we have to
be seen to do it.

--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From gem at rellim.com Tue Apr 18 19:06:44 2017
From: gem at rellim.com (Gary E. Miller)
Date: Tue, 18 Apr 2017 12:06:44 -0700
Subject: Let's get rid of pivots
In-Reply-To: <20170418183830.3E3B440605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170418143515.GB20726@thyrsus.com>
 <20170418183830.3E3B440605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170418120644.55e79fec@spidey.rellim.com>

Yo Hal!

On Tue, 18 Apr 2017 11:38:30 -0700
Hal Murray wrote:

> esr at thyrsus.com said:
> > Hm. I need to stare at that code some more. I'm beginning to
> > think the pivot is the right idea implemented in a slightly wrong
> > place. Maybe it ought to be applied to in-packet timestamps as
> > soon as they arrive?
>
> "as soon as they arrive" seems ugly to me, but maybe that's just
> because I've been thinking of using l_fp to compute an offset and
> using the offset to adjust the time with an effective pivot of "now".

By far I agree, all we need from the received l_fp is the offset from
the local clock, modulo 64 bits of l_fp.

But, there is one important case. Let's take our poor new RasPi that
wakes up thinking it is 1Jan1970. After being awake 5 seconds it sends
an l_fp of 1Jan1970 to a remote chimer, and gets back an l_fp of
2Jan1970. A delta of one day.

Somehow the RasPi needs to know if that delta of one day means the time
is really 2 Jan 1970, or 2 Jan 2138, or 2 Jan 2306?

Once that initial jump is made the following deltas will be small, much
less than 1 hour likely, but even decades would still work nicely with
2s complement deltas. Until the next reboot to 1Jan1970.

Nail that one big case and all the little ones go away.

> An alternative approach would be to get rid of pivots totally. If we
> removed all of the pivot code and logic from the current code, it
> will work until 2036. How long will it take to implement and deploy
> a new version of ntp protocol with enough bits?

Or pivot on 2015 and delay the problem until 2183?

> My straw man would be 64 bits of ns. That's 500+ years. (if I did
> the math right)

l_fp is almost that, it is 64 bits of about 233 ps. Which works out
to 168 years.

You mean 64 bits of seconds? Like timespec(64)?

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Tue Apr 18 19:06:26 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 15:06:26 -0400 Subject: Big picture... In-Reply-To: <20170418180811.7E98840605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418150305.GC20726@thyrsus.com> <20170418180811.7E98840605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418190626.GB26010@thyrsus.com> Hal Murray : > > >> If we are using a build time, we can run an external script > >> to generate the date in a convenient format. > >> date -u +%s looks good to me. > > Maybe. Almost. How are we going to do this under Windows? Are we giving > > up on a Windows port? Because POSIX API conformance tells us we can get at > > POSIX time froom C, but we have no guarantee that date(1) will exist. > > Is that going to be the biggest problem with Windows? > > I know next to nothing about Windows. Does their POSIX support include only > c code or will it also include shell stuff? How many programs outside of sh > itself do we depend on? Will autorevision work? Does POSIX include a date > command? ... Hell of a can of worms, innit? Which is in part my point... -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Tue Apr 18 19:08:11 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 15:08:11 -0400 Subject: Time to slow down and be more careful In-Reply-To: <20170418185337.B63EA40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418185337.B63EA40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418190811.GC26010@thyrsus.com> Hal Murray : > > gem at rellim.com said: > > It is not a bug, but doing it there is a bug. Follow along while I do the > > math: > ... 
> > The case Eric was considering was the local clock is 1970 and the target time > is post 2036. That requires the step adjustment to be more than 31 bits of > seconds. > > That would work if we applied Eric's suggestion of doing the pivot way back > when the l_fp format is extracted from the packets. Then the arithmetic with > the 4 time stamps will get a big offset. What would really be going on here is busting our time arithmetic out of the 32-bit box... -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From gem at rellim.com Tue Apr 18 19:19:53 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 18 Apr 2017 12:19:53 -0700 Subject: Time to slow down and be more careful In-Reply-To: <20170418190811.GC26010@thyrsus.com> References: <20170418185337.B63EA40605C@ip-64-139-1-69.sjc.megapath.net> <20170418190811.GC26010@thyrsus.com> Message-ID: <20170418121953.667bce3a@spidey.rellim.com> Yo Eric! On Tue, 18 Apr 2017 15:08:11 -0400 "Eric S. Raymond" wrote: > Hal Murray : > > > > gem at rellim.com said: > > > It is not a bug, but doing it there is a bug. Follow along while > > > I do the math: > > ... > > > > The case Eric was considering was the local clock is 1970 and the > > target time is post 2036. That requires the step adjustment to be > > more than 31 bits of seconds. > > > > That would work if we applied Eric's suggestion of doing the pivot > > way back when the l_fp format is extracted from the packets. Then > > the arithmetic with the 4 time stamps will get a big offset. > > What would really be going on here is busting our time arithmetic out > of the 32-bit box... Not really. ntpd still does int64_t arithmetic in a 32-bit binary. All the ntpd math is the same. The 32-bit problem is elsewhere. The 32-bit problem is that you have to deal with timespec(32) for system time. 
That breaks in 2038. When we read system time in timespec(32) we do not know if the current year is 1971 or 2039. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From jason at azze.org Tue Apr 18 19:37:37 2017 From: jason at azze.org (Jason Azze) Date: Tue, 18 Apr 2017 15:37:37 -0400 Subject: Time to slow down and be more careful In-Reply-To: <20170418103756.3b6d8cce@spidey.rellim.com> References: <20170418045251.GA15781@thyrsus.com> <20170418062149.2651E40605C@ip-64-139-1-69.sjc.megapath.net> <20170418143515.GB20726@thyrsus.com> <20170418103756.3b6d8cce@spidey.rellim.com> Message-ID: On Tue, Apr 18, 2017 at 1:37 PM, Gary E. Miller wrote: > But /etc itself is infrequently updated, may be set by any package > install to something far in the past. The time on the driftfile, if > present is usually to within an hour. Better if always set on ntpd > shutdown. Is the kernel build time as returned by uname -a or uname -v a reasonable guidepost? From gem at rellim.com Tue Apr 18 19:49:45 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 18 Apr 2017 12:49:45 -0700 Subject: Time to slow down and be more careful In-Reply-To: References: <20170418045251.GA15781@thyrsus.com> <20170418062149.2651E40605C@ip-64-139-1-69.sjc.megapath.net> <20170418143515.GB20726@thyrsus.com> <20170418103756.3b6d8cce@spidey.rellim.com> Message-ID: <20170418124945.6f9d8972@spidey.rellim.com> Yo Jason! On Tue, 18 Apr 2017 15:37:37 -0400 Jason Azze wrote: > On Tue, Apr 18, 2017 at 1:37 PM, Gary E.
Miller > wrote: > > > But /etc itself is infrequently updated, may be set by any package > > install to something far in the past. The time on the driftfile, if > > present is usually to within an hour. Better if always set on ntpd > > shutdown. > > Is the kernel build time as returned by uname -a or uname -v a > reasonable guidepost? If you assume the kernel is no more than 20 years old, that gets you within 148 years. But not portable to Windows. Hmmm, and looking at several distros close at hand, not even portable between Linux's. Darwin mini.rellim.com 16.4.0 Darwin Kernel Version 16.4.0: Thu Dec 22 22:53:21 PST 2016; root:xnu-3789.41.3~3/RELEASE_X86_64 x86_64 Linux blondie 4.10.8-gentoo #1 SMP Fri Apr 7 18:54:22 PDT 2017 x86_64 Intel(R) Core(TM) i7-6700HQ CPU @ 2.60GHz GenuineIntel GNU/Linux RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Tue Apr 18 20:38:00 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 16:38:00 -0400 Subject: Time to slow down and be more careful In-Reply-To: <20170418121953.667bce3a@spidey.rellim.com> References: <20170418185337.B63EA40605C@ip-64-139-1-69.sjc.megapath.net> <20170418190811.GC26010@thyrsus.com> <20170418121953.667bce3a@spidey.rellim.com> Message-ID: <20170418203800.GD26010@thyrsus.com> Gary E. Miller : > The 32-bit problem is that you have to deal with timespec(32) > for system time. That breaks in 2038. When we read system time > in timespec(32) we do not know if the current year is 1971 or 2039.
The integral part of timespec is time_t which has been 64-bit *even on 32-bit Linuxes* for, what is it, close to 20 years now? I am prepared to assume that by 2038 we will be able to put in a config test that barfs on 32-bit time_t without causing a ripple. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From gem at rellim.com Tue Apr 18 21:06:20 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 18 Apr 2017 14:06:20 -0700 Subject: Time to slow down and be more careful In-Reply-To: <20170418203800.GD26010@thyrsus.com> References: <20170418185337.B63EA40605C@ip-64-139-1-69.sjc.megapath.net> <20170418190811.GC26010@thyrsus.com> <20170418121953.667bce3a@spidey.rellim.com> <20170418203800.GD26010@thyrsus.com> Message-ID: <20170418140620.644cd19b@spidey.rellim.com> Yo Eric! On Tue, 18 Apr 2017 16:38:00 -0400 "Eric S. Raymond" wrote: > Gary E. Miller : > > The 32-bit problem is that you have to deal with timespec(32) > > for system time. That breaks in 2038. When we read system time > > in timespec(32) we do not know if the current year is 1971 or > > 2039. > > The integral part of timespec is time_t which has been 64-bit *even > on 32-bit Linuxes* for, what is it, close to 20 years now? > > I am prepared to assume that by 2038 we will be able to put in a > config test that barfs on 32-bit time_t without causing a ripple. Gentoo 32 bit uses timespec(64). But not all distros do, and POSIX does not mandate it. POSIX just says time_t is a signed integer. FreeBSD 8 defines time_t as int32_t. QNX 6 also has 32 bit time_t. I'm happy to EOL any OS that still uses 32 bit time_t. If we do so we just need to be explicit about it.
Then we have no remaining 32 bit only problems. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Tue Apr 18 21:34:55 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 14:34:55 -0700 Subject: 32 bit time_t Message-ID: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> > The integral part of timespec is time_t which has been 64-bit *even on > 32-bit Linuxes* for, what is it, close to 20 years now? Are you sure? It's only 32 bits on Raspberry Pi and Fedora i386 and Debian ppc It's 32 bits on FreeBSD i386 and powerpc It's 64 bits on NetBSD i386 > I am prepared to assume that by 2038 we will be able to put in a config test > that barfs on 32-bit time_t without causing a ripple. We need to put that test in well before 2038 or somebody will build it a few years before and expect it to work for another dozen years. Is there any reason not to put it in now as a warning? -- These are my opinions. I hate spam. From gem at rellim.com Tue Apr 18 21:40:07 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 18 Apr 2017 14:40:07 -0700 Subject: 32 bit time_t In-Reply-To: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418144007.42cda233@spidey.rellim.com> Yo Hal! On Tue, 18 Apr 2017 14:34:55 -0700 Hal Murray wrote: > Is there any reason not to put it in now as a warning?
Good idea: if (4 >= sizeof(time_t)) puts("WARNING your system will fail in 2038."); RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Tue Apr 18 21:49:43 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 17:49:43 -0400 Subject: Time to slow down and be more careful In-Reply-To: <20170418140620.644cd19b@spidey.rellim.com> References: <20170418185337.B63EA40605C@ip-64-139-1-69.sjc.megapath.net> <20170418190811.GC26010@thyrsus.com> <20170418121953.667bce3a@spidey.rellim.com> <20170418203800.GD26010@thyrsus.com> <20170418140620.644cd19b@spidey.rellim.com> Message-ID: <20170418214943.GF26010@thyrsus.com> Gary E. Miller : > Gentoo 32 bit uses timespec(64). But not all distros do, and POSIX does > not mandate it. POSIX just says time_t is a signed integer. I'm aware of this. > FreeBSD 8 defines time_t as int32_t. > > QNX 6 also has 32 bit time_t. > > I'm happy to EOL any OS that still uses 32 bit time_t. If we do so we > just need to be explicit about it. Then we have no remaining 32 bit > only problems. I don't think we need to pull that trigger yet. I think there won't be any problem pulling it within a few years of 2038. Everybody else can see that cliff coming, and nobody wants to ship an OS that goes pear-shaped on rollover day. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own.
-------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From hmurray at megapathdsl.net Tue Apr 18 21:54:07 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 14:54:07 -0700 Subject: 32 bit time_t Message-ID: <20170418215407.DB08840605C@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: > FreeBSD 8 defines time_t as int32_t. 8 is really old. Where did you find one of those? 11 is current. 10.3 is still supported. It's 64 bits on amd64 and Raspberry Pi and BBB but still 32 on i386 and powerpc -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 18 21:53:41 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 17:53:41 -0400 Subject: 32 bit time_t In-Reply-To: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418215341.GG26010@thyrsus.com> Hal Murray : > > > The integral part of timespec is time_t which has been 64-bit *even on > > 32-bit Linuxes* for, what is it, close to 20 years now? > > Are you sure? > > It's only 32 bits on Raspberry Pi and Fedora i386 and Debian ppc Well that's weird. I thought I remembered the Linux kernel devs going to 64-bit time_t back in the late 90s, well before the 64-bit transition. > It's 32 bits on FreeBSD i386 and powerpc > It's 64 bits on NetBSD i386 > > > I am prepared to assume that by 2038 we will be able to put in a config test > > that barfs on 32-bit time_t without causing a ripple. > > We need to put that test in well before 2038 or somebody will build it a few > years before and expect it to work for another dozen years. > > Is there any reason not to put it in now as a warning? No, in fact I was going to suggest that. -- Eric S. 
Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 18 22:00:14 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 15:00:14 -0700 Subject: 32 bit time_t Message-ID: <20170418220014.C249940605C@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: >> Is there any reason not to put it in now as a warning? > if (4 >= sizeof(time_t)) puts("WARNING your system will fail in 2038."); I was thinking of a build time warning. Your run time suggestion seems good too. I think we need a better text. Maybe a URL for a page with the long story. -- These are my opinions. I hate spam. From jdb at systemsartisans.com Tue Apr 18 22:22:31 2017 From: jdb at systemsartisans.com (John D. Bell) Date: Tue, 18 Apr 2017 18:22:31 -0400 Subject: Big picture... In-Reply-To: <20170418190626.GB26010@thyrsus.com> References: <20170418150305.GC20726@thyrsus.com> <20170418180811.7E98840605C@ip-64-139-1-69.sjc.megapath.net> <20170418190626.GB26010@thyrsus.com> Message-ID: <1f92eadf-35d7-f36a-eeb3-226c472f8a0e@systemsartisans.com> My $0.02 worth - Since you've already got a dependency on Python, write a one-liner that is the equivalent of Unix's "date -u +%s". Use that. Otherwise, a tiny C program would also do the trick (at the cost of increased complexity). I believe "POSIX support" under Windows is only the API corresponding to sections 3 and 2 of the Unix manual (library functions and (emulations of) system calls). The (still new and evolving) "Bash under Ubuntu" available under Windows 10 may give you more, but may also *not* be portable to other Windows variants (especially the Server ones which would be more in use in datacenters). - John D. Bell On 04/18/2017 03:06 PM, Eric S.
Raymond wrote: > Hal Murray : >>>> If we are using a build time, we can run an external script >>>> to generate the date in a convenient format. >>>> date -u +%s looks good to me. >>> Maybe. Almost. How are we going to do this under Windows? Are we giving >>> up on a Windows port? Because POSIX API conformance tells us we can get at >>> POSIX time from C, but we have no guarantee that date(1) will exist. >> Is that going to be the biggest problem with Windows? >> >> I know next to nothing about Windows. Does their POSIX support include only >> c code or will it also include shell stuff? How many programs outside of sh >> itself do we depend on? Will autorevision work? Does POSIX include a date >> command? ... > Hell of a can of worms, innit? > > Which is in part my point... From gem at rellim.com Tue Apr 18 22:33:56 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 18 Apr 2017 15:33:56 -0700 Subject: 32 bit time_t In-Reply-To: <20170418220014.C249940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418220014.C249940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418153356.093e9b5d@spidey.rellim.com> Yo Hal! On Tue, 18 Apr 2017 15:00:14 -0700 Hal Murray wrote: > gem at rellim.com said: > >> Is there any reason not to put it in now as a warning? > > if (4 >= sizeof(time_t)) puts("WARNING your system will fail in > > 2038."); > > I was thinking of a build time warning. Your run time suggestion > seems good too. Yeah, should be in waf. Trust, but verify. New issue: https://gitlab.com/NTPsec/ntpsec/issues/272 RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin
From esr at thyrsus.com Tue Apr 18 22:33:39 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 18:33:39 -0400 Subject: 32 bit time_t In-Reply-To: <20170418220014.C249940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418220014.C249940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418223339.GI26010@thyrsus.com> Hal Murray : > > gem at rellim.com said: > >> Is there any reason not to put it in now as a warning? > > if (4 >= sizeof(time_t)) puts("WARNING your system will fail in 2038."); > > I was thinking of a build time warning. Your run time suggestion seems good > too. > > I think we need a better text. Maybe a URL for a page with the long story. I'm writing a build-time check now. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From gem at rellim.com Tue Apr 18 22:36:04 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 18 Apr 2017 15:36:04 -0700 Subject: 32 bit time_t In-Reply-To: <20170418215407.DB08840605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418215407.DB08840605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418153604.0aaf4ce8@spidey.rellim.com> Yo Hal! On Tue, 18 Apr 2017 14:54:07 -0700 Hal Murray wrote: > gem at rellim.com said: > > FreeBSD 8 defines time_t as int32_t. > > 8 is really old. Where did you find one of those? I googled for 32 bit time_t. > 11 is current. 10.3 is still supported. There is a long time between unsupported, and no longer used. WinXP is 32bit time_t, and I still see that now and again. > It's 64 bits on amd64 and Raspberry Pi and BBB but still 32 on i386 > and powerpc Not on Gentoo RasPi. Those will be good tests for the upcoming waf test for 32-bit time_t.
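The check being discussed comes down to comparing sizeof(time_t) against 4 and naming the rollover moment. A minimal Python sketch of that logic follows, assuming the configure step has already measured the size some other way. The helper name `time_t_warning` and its message text are hypothetical, not the wording or code of the actual waf check.

```python
from datetime import datetime, timedelta, timezone

# Start of the POSIX time scale.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def time_t_warning(sizeof_time_t):
    """Return a warning string for a signed time_t of this byte width.

    Returns None when time_t is wider than 4 bytes. The function name and
    the message wording are illustrative assumptions only.
    """
    if sizeof_time_t > 4:
        return None
    # Largest value a signed integer of this width can hold.
    last_second = (1 << (8 * sizeof_time_t - 1)) - 1
    rollover = EPOCH + timedelta(seconds=last_second)
    return "WARNING: %d-bit time_t cannot represent times after %s" % (
        8 * sizeof_time_t,
        rollover.strftime("%Y-%m-%dT%H:%M:%SZ"),
    )


print(time_t_warning(4))  # names 2038-01-19T03:14:07Z as the last good second
print(time_t_warning(8))  # None: nothing to warn about
```

The same comparison works at run time, which is the other place the thread suggests putting the warning.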
RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Tue Apr 18 22:47:34 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 18 Apr 2017 15:47:34 -0700 Subject: 32 bit time_t In-Reply-To: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418154734.55a45428@spidey.rellim.com> Yo Eric! > The integral part of timespec is time_t which has been 64-bit *even > on 32-bit Linuxes* for, what is it, close to 20 years now? 13 May 2014, here is the patch set: https://lwn.net/Articles/598408/ "This patchset change default time_t and clock_t to 64 bit in include/uapi/asm-generic/posix_types.h. The existing 32 bit architectures override these define to 32 bit in arch posix_types.h." So not all arch's went to 64 bit at that time. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Tue Apr 18 23:03:26 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 16:03:26 -0700 Subject: Big picture... In-Reply-To: Message from "John D.
Bell" of "Tue, 18 Apr 2017 18:22:31 EDT." <1f92eadf-35d7-f36a-eeb3-226c472f8a0e@systemsartisans.com> Message-ID: <20170418230326.DBDA440605C@ip-64-139-1-69.sjc.megapath.net> jdb at systemsartisans.com said: > Since you've already got a dependency on Python, write a one-liner that is > the equivalent of Unix's "date -u +%s". Use that. Otherwise, a tiny C > program would also do the trick (at the cost of increased complexity). Thanks. I was expecting it would be that simple but didn't see that solution which seems obvious in hindsight. For future reference... c code is reasonable. waf already has the structure for building code that gets run on the build system as part of the build process vs code that runs on some other system. It's needed for the config file parser. -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 18 23:25:33 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 19:25:33 -0400 Subject: 32 bit time_t In-Reply-To: <20170418153356.093e9b5d@spidey.rellim.com> References: <20170418220014.C249940605C@ip-64-139-1-69.sjc.megapath.net> <20170418153356.093e9b5d@spidey.rellim.com> Message-ID: <20170418232533.GJ26010@thyrsus.com> Gary E. Miller : > Yeah, should be in waf. Trust, but verify. > > New issue: > > https://gitlab.com/NTPsec/ntpsec/issues/272 Working it now. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From kurt at roeckx.be Tue Apr 18 23:28:37 2017 From: kurt at roeckx.be (Kurt Roeckx) Date: Wed, 19 Apr 2017 01:28:37 +0200 Subject: _DATE__, version string, and distros In-Reply-To: <20170418014101.5EFEC40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418014101.5EFEC40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418232836.awvchyrame6tsq4y@roeckx.be> On Mon, Apr 17, 2017 at 06:41:01PM -0700, Hal Murray wrote: > I think we can kill two birds with one stone. > > The first step is to change the code that uses __DATE__ to use the time stamp > from autorevision. Have you seen https://reproducible-builds.org/specs/source-date-epoch/ ? Kurt From hmurray at megapathdsl.net Tue Apr 18 23:36:04 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 16:36:04 -0700 Subject: Let's get rid of pivots In-Reply-To: Message from Hal Murray of "Tue, 18 Apr 2017 11:38:30 PDT." <20170418183830.3E3B440605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170418233604.EBFDA40605C@ip-64-139-1-69.sjc.megapath.net> hmurray at megapathdsl.net said: > "as soon as they arrive" seems ugly to me, but maybe that's just because > I've been thinking of using l_fp to compute an offset and using the offset > to adjust the time with an effective pivot of "now". After thinking about it some more... I think that doing the pivot "as soon as they arrive" is a good idea. After a packet exchange, there are 4 time stamps; 2 local and 2 remote. We get the local times as timespecs before turning them into l_fp. We'll have to save those local timespecs in order to avoid a bogus pivot. The point is that there is no pivot anywhere near the main body of timekeeping. And no hidden pivot like requiring the system time to be close enough. Note that the pivot time isn't critical. Being off by a few years or even a few dozen is not a problem. 
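The early pivot described above can be sketched in a few lines. This is an illustration only, not NTPsec code: `pivot_ntp_seconds` is a hypothetical helper, and the window that starts at the pivot and runs forward 2^32 seconds is an assumption (the thread only says the pivot should be roughly the build or release date).

```python
# Seconds between the NTP epoch (1900-01-01) and the Unix epoch (1970-01-01).
NTP_UNIX_OFFSET = 2208988800
# The 32-bit NTP seconds field wraps every 2**32 seconds (about 136 years).
ERA = 1 << 32


def pivot_ntp_seconds(ntp32, pivot_unix):
    """Expand a 32-bit NTP seconds field to unambiguous Unix seconds.

    The result is the unique time in [pivot_unix, pivot_unix + 2**32) whose
    low 32 bits on the NTP scale equal ntp32. Hypothetical helper; the
    forward-only window is an assumption made for this sketch.
    """
    pivot_ntp = pivot_unix + NTP_UNIX_OFFSET  # pivot on the NTP scale, not wrapped
    delta = (ntp32 - pivot_ntp) % ERA         # distance past the pivot, 0 .. 2**32 - 1
    return pivot_unix + delta


# Pivot chosen for the example: 2017-04-18T00:00:00Z.
BUILD_PIVOT = 1492473600

# A field value of 5 is five seconds into NTP era 1 (February 2036),
# because that is the first match at or after the pivot.
print(pivot_ntp_seconds(5, BUILD_PIVOT))  # 2085978501
```

Because the result is a plain integer wider than 32 bits, later arithmetic on the four timestamps needs no further era logic, and shifting the pivot by a few years changes nothing unless the true time falls outside the window.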
It would be perfectly reasonable to use the release date rather than the build time. (We may want the build time for GPS pivot.) We might want to include a sanity check in the pivot code. The full range of 32 bits of seconds is 136 years. Most of that range doesn't make sense. Does it make sense to set the clock to 100 years after the build time? More likely you are talking to a confused server. Maybe that filter should be part of the BOGON filtering. ------- Note that there are actually 2 packet processing paths to consider. The above is the client side. I think the server side is trivial - no pivot involved. -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 18 23:39:04 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 18 Apr 2017 19:39:04 -0400 Subject: _DATE__, version string, and distros In-Reply-To: <20170418232836.awvchyrame6tsq4y@roeckx.be> References: <20170418014101.5EFEC40605C@ip-64-139-1-69.sjc.megapath.net> <20170418232836.awvchyrame6tsq4y@roeckx.be> Message-ID: <20170418233904.GA6790@thyrsus.com> Kurt Roeckx : > On Mon, Apr 17, 2017 at 06:41:01PM -0700, Hal Murray wrote: > > I think we can kill two birds with one stone. > > > > The first step is to change the code that uses __DATE__ to use the time stamp > > from autorevision. > > Have you seen > https://reproducible-builds.org/specs/source-date-epoch/ ? Interesting idea. But I don't see how it solves our problem unless we can count on every packager to set this variable. Which ones do? -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 18 23:50:00 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 18 Apr 2017 16:50:00 -0700 Subject: _DATE__, version string, and distros In-Reply-To: Message from "Eric S. Raymond" of "Tue, 18 Apr 2017 19:39:04 EDT." 
<20170418233904.GA6790@thyrsus.com> Message-ID: <20170418235000.0E9C140605C@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > Interesting idea. But I don't see how it solves our problem unless we can > count on every packager to set this variable. Which ones do? I think we can do that without help from the packaging environment. We can put our release date into a git controlled file and use that as a default. The one line python code would use that file or SOURCE_DATE_EPOCH if it is in the environment. I can get my updated build date by having a script update that file. I'll have to git checkout that file to undo those edits when I want to commit something. I can live with that. -- These are my opinions. I hate spam. From jdb at systemsartisans.com Wed Apr 19 13:30:21 2017 From: jdb at systemsartisans.com (John D. Bell) Date: Wed, 19 Apr 2017 09:30:21 -0400 Subject: Big picture... In-Reply-To: <20170418230326.DBDA440605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418230326.DBDA440605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On 04/18/2017 07:03 PM, Hal Murray wrote: > For future reference... c code is reasonable. .... > Yeahbut - it's more complicated to make sure that C code is portable. And since you're talking about an environment that is already *not* Unix-like (i.e., Windows), might as well push the portability/compatibility issues off onto the maintainers of the interpreter. - John D. Bell From gem at rellim.com Wed Apr 19 18:38:02 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 19 Apr 2017 11:38:02 -0700 Subject: 32 bit time_t In-Reply-To: <20170418215341.GG26010@thyrsus.com> References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> <20170418215341.GG26010@thyrsus.com> Message-ID: <20170419113802.50504b00@spidey.rellim.com> Yo All! Uh, oh. My boo-boo. I got this wrong. Gentoo stable on RasPi has 32 bit time_t: Checking sizeof long : 4 Checking sizeof time_t (time.h) : 4 WARNING: This system has a 32-bit time_t.
WARNING: Your ntpd will fail on 2038-01-19T03:14:07Z. # uname -a Linux pi3 4.9.17-v7+ #1 SMP Wed Mar 29 12:17:48 PDT 2017 armv7l ARMv7 Processor rev 4 (v7l) BCM2835 GNU/Linux I'm thinking the warning also needs to be in the binary, most users never compile from source. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Wed Apr 19 21:57:00 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 19 Apr 2017 14:57:00 -0700 Subject: 32 bit time_t warnings Message-ID: <20170419215700.A171A40605C@ip-64-139-1-69.sjc.megapath.net> We need to be sure they get to a place where people will see them. On my waf configure, the second line of the warning ends up at the top of the screen. It's hard to find even if you are looking for it. Can we move that printout closer to the end of each step? The run time test has similar problems. This may be the tip of an iceberg. We need to verify that all LOG_ERR really are important and that they also go to syslog when we have setup a logging file. For the waf warnings, it will probably help make them stand out if we put blank lines around them. I don't know how to do something similar with log files. -- These are my opinions. I hate spam. From gem at rellim.com Wed Apr 19 22:03:32 2017 From: gem at rellim.com (Gary E.
Miller) Date: Wed, 19 Apr 2017 15:03:32 -0700 Subject: 32 bit time_t warnings In-Reply-To: <20170419215700.A171A40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170419215700.A171A40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170419150332.35b0b0e1@spidey.rellim.com> Yo Hal! On Wed, 19 Apr 2017 14:57:00 -0700 Hal Murray wrote: > We need to be sure they get to a place where people will see them. Yes, this is just to see how bad the problem is. We know people just ignore all the warnings they see. I'm afraid if we warn too loudly, for a problem 20 years in the future, when NTP Classic does not warn, that they will just stick with Classic thinking Classic does not have the problem. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From Stromeko at Nexgo.DE Thu Apr 20 07:11:37 2017 From: Stromeko at Nexgo.DE (Achim Gratz) Date: Thu, 20 Apr 2017 09:11:37 +0200 Subject: 32 bit time_t In-Reply-To: <20170419113802.50504b00@spidey.rellim.com> References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> <20170418215341.GG26010@thyrsus.com> <20170419113802.50504b00@spidey.rellim.com> Message-ID: On 19.04.2017 at 20:38, Gary E. Miller wrote: > Gentoo stable on RasPi has 32 bit time_t: > > Checking sizeof long : 4 > Checking sizeof time_t (time.h) : 4 This is a 32bit glibc based system, so this means you didn't ask for a 64bit time_t, then. Here's that link again that tells you how glibc handles this: https://sourceware.org/glibc/wiki/Y2038ProofnessDesign The Linux kernel itself is not Y2038 clean yet, AFAIK.
But that's no excuse for applications to skimp there. -- Achim. (on the road :-) From esr at thyrsus.com Thu Apr 20 07:50:29 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 20 Apr 2017 03:50:29 -0400 Subject: 32 bit time_t In-Reply-To: References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> <20170418215341.GG26010@thyrsus.com> <20170419113802.50504b00@spidey.rellim.com> Message-ID: <20170420075028.GB5707@thyrsus.com> Achim Gratz : > Am 19.04.2017 um 20:38 schrieb Gary E. Miller: > >Gentoo stable on RasPi has 32 bit time_t: > > > >Checking sizeof long : 4 > >Checking sizeof time_t (time.h) : 4 > > This is a 32bit glibc based system, so this means you didn't ask for a 64bit > time_t, then. Here's that link again that tells you how glibc handles this: > > https://sourceware.org/glibc/wiki/Y2038ProofnessDesign > > The Linux kernel itself is not Y2038 clean yet, AFAIK. But that's no excuse > for applications to skimp there. We need to stick to POSIX entry points. Is there a directive we can give GCC that tells it to map to the 64-bit versions, if available? -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From Stromeko at Nexgo.DE Thu Apr 20 10:27:11 2017 From: Stromeko at Nexgo.DE (Achim Gratz) Date: Thu, 20 Apr 2017 12:27:11 +0200 Subject: 32 bit time_t In-Reply-To: <20170420075028.GB5707@thyrsus.com> References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> <20170418215341.GG26010@thyrsus.com> <20170419113802.50504b00@spidey.rellim.com> <20170420075028.GB5707@thyrsus.com> Message-ID: >> This is a 32bit glibc based system, so this means you didn't ask for a 64bit >> time_t, then. 
Here's that link again that tells you how glibc handles this: >> >> https://sourceware.org/glibc/wiki/Y2038ProofnessDesign >> >> The Linux kernel itself is not Y2038 clean yet, AFAIK. But that's no excuse >> for applications to skimp there. > > We need to stick to POSIX entry points. Is there a directive we can give GCC > that tells it to map to the 64-bit versions, if available? > Quote from the document linked previously: ========== The following is proposed: User code defines _TIME_BITS=64 to get 64-bit time support instead of the legacy 32-bit time. If glibc sees _TIME_BITS=64, then it defines __USE_TIME_BITS64 to indicate that time support is 64-bit rather than 32-bit. ========== Configury needs to check whether it's looking at a glibc-based system that implements that. I don't see any feature test macros discussed, so I guess you'd check the size of some suitable datatype with and without that define being present. -- Achim. (on the road :-) From hmurray at megapathdsl.net Thu Apr 20 10:39:16 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 20 Apr 2017 03:39:16 -0700 Subject: 32 bit time_t In-Reply-To: Message from Achim Gratz of "Thu, 20 Apr 2017 12:27:11 +0200." Message-ID: <20170420103916.8CF0040605C@ip-64-139-1-69.sjc.megapath.net> Stromeko at Nexgo.DE said: > The following is proposed: > User code defines _TIME_BITS=64 to get 64-bit time support instead of > the legacy 32-bit time. > If glibc sees _TIME_BITS=64, then it defines __USE_TIME_BITS64 to > indicate that time support is 64-bit rather than 32-bit. That says "proposed". Does anybody implement it? Is it implemented deep inside gcc or should a grep for TIME_BITS in /usr/include/ find something? (It doesn't on my Fedora system.) -- These are my opinions. I hate spam. From esr at thyrsus.com Thu Apr 20 16:16:29 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Thu, 20 Apr 2017 12:16:29 -0400 Subject: 32 bit time_t In-Reply-To: References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> <20170418215341.GG26010@thyrsus.com> <20170419113802.50504b00@spidey.rellim.com> <20170420075028.GB5707@thyrsus.com> Message-ID: <20170420161629.GA11115@thyrsus.com> Achim Gratz : > Quote from the document linked previously: > > ========== > The following is proposed: > > User code defines _TIME_BITS=64 to get 64-bit time support instead of > the legacy 32-bit time. > > If glibc sees _TIME_BITS=64, then it defines __USE_TIME_BITS64 to > indicate that time support is 64-bit rather than 32-bit. > ========== > > Configury needs to check whether it's looking at a glibc-based system that > implements that. I don't see any feature test macros discussed, so I guess > you'd check the size of some suitable datatype with and without that define > being present. I don't think I see any reason not to just define that unconditionally. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From gem at rellim.com Thu Apr 20 17:08:39 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 20 Apr 2017 10:08:39 -0700 Subject: 32 bit time_t In-Reply-To: References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> <20170418215341.GG26010@thyrsus.com> <20170419113802.50504b00@spidey.rellim.com> Message-ID: <20170420100839.78afcbb1@spidey.rellim.com> Yo Achim! On Thu, 20 Apr 2017 09:11:37 +0200 Achim Gratz wrote: > Am 19.04.2017 um 20:38 schrieb Gary E. Miller: > > Gentoo stable on RasPi has 32 bit time_t: > > > > Checking sizeof long : 4 > > Checking sizeof time_t (time.h) : > > 4 > > This is a 32bit glibc based system, so this means you didn't ask for > a 64bit time_t, then. Here's that link again that tells you how > glibc handles this: Yes, but... 
The real problem for ntpd on 32-bit is that clock_gettime() and clock_settime() use timespec which uses time_t. So while time64_t is available, it is not used on the path where we need it. > The Linux kernel itself is not Y2038 clean yet, AFAIK. But that's no > excuse for applications to skimp there. For our purposes, which are mostly clock_gettime() and clock_settime(), the Linux kernel has suited our purposes since May 2014. At that time, with 64 bit kernels, time_t became 8 bytes. So then timespec, using time_t, is really using time64_t, and ntpd works just fine. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Thu Apr 20 17:14:00 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 20 Apr 2017 10:14:00 -0700 Subject: 32 bit time_t In-Reply-To: <20170420075028.GB5707@thyrsus.com> References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> <20170418215341.GG26010@thyrsus.com> <20170419113802.50504b00@spidey.rellim.com> <20170420075028.GB5707@thyrsus.com> Message-ID: <20170420101400.653b3f88@spidey.rellim.com> Yo Eric! On Thu, 20 Apr 2017 03:50:29 -0400 "Eric S. Raymond" wrote: > Achim Gratz : > > Am 19.04.2017 um 20:38 schrieb Gary E. Miller: > > >Gentoo stable on RasPi has 32 bit time_t: > > > > > >Checking sizeof long : 4 > > >Checking sizeof time_t (time.h) : > > >4 > > > > This is a 32bit glibc based system, so this means you didn't ask > > for a 64bit time_t, then.
Here's that link again that tells you > > how glibc handles this: > > > > https://sourceware.org/glibc/wiki/Y2038ProofnessDesign > > > > The Linux kernel itself is not Y2038 clean yet, AFAIK. But that's > > no excuse for applications to skimp there. > > We need to stick to POSIX entry points. Is there a directive we can > give GCC that tells it to map to the 64-bit versions, if available? It is easy for us to use time64_t, but without a syscall that uses it to get/set the time it does us no good. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Thu Apr 20 17:15:16 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 20 Apr 2017 10:15:16 -0700 Subject: 32 bit time_t In-Reply-To: References: <20170418213455.5D35A40605C@ip-64-139-1-69.sjc.megapath.net> <20170418215341.GG26010@thyrsus.com> <20170419113802.50504b00@spidey.rellim.com> <20170420075028.GB5707@thyrsus.com> Message-ID: <20170420101516.277e27a0@spidey.rellim.com> Yo Achim! On Thu, 20 Apr 2017 12:27:11 +0200 Achim Gratz wrote: > Quote from the document linked previously: Interesting, but until Linus accepts it, of no practical use to ntpd. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin
From gem at rellim.com Thu Apr 20 21:05:10 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 20 Apr 2017 14:05:10 -0700 Subject: ✘EPOCH Message-ID: <20170420140510.26d61c6d@spidey.rellim.com> Yo All! It seems NTP Classic added (non-standard) support for reproducible builds in commit 24e5bcb672a0006c9854e04036793569d7a71c7f in Dec 2014. This worked until commit c2d30fff9f1697bbc041065dda2a72b37755afa1 in Jan 2017, which broke it. As of my recent commit 8cf918e318049767f452b4693de1cb45d7c6543a the support for reproducible builds should be working again. And it is compliant with the specification found here: https://reproducible-builds.org/specs/source-date-epoch/ Until now, the pivot date and the NMEA base date were derived at compile time from __DATE__ and __TIME__. Now they are derived from the EPOCH, which is set at configure time. So for most cases it will be within a minute of compile time. If you wait 10 years between configuring and compiling you might find some issues with your binary. For reproducible builds the EPOCH can be set to an arbitrary time by either of: 'waf --epoch=EPOCH' or setting an environment variable: 'export SOURCE_DATE_EPOCH=EPOCH' The standard recommends setting EPOCH by default to the date of the last source file modification. I'm not sure if that is desirable, should be optional, or just ignored. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin
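[Editor's sketch, not part of the original message.] The EPOCH selection described above could look roughly like the following in a Python build script. The function name resolve_epoch and the precedence (explicit option first, then SOURCE_DATE_EPOCH, then the current time) are assumptions made for illustration; the message does not say which source wins when both are set, and this is not the actual waf code.

```python
import os
import time


def resolve_epoch(cli_epoch=None, environ=None, now=None):
    """Pick a build EPOCH in seconds since 1970 (hypothetical helper).

    Assumed precedence: explicit option, then SOURCE_DATE_EPOCH,
    then the current time at configure.
    """
    environ = os.environ if environ is None else environ
    if cli_epoch is not None:
        return int(cli_epoch)
    env_value = environ.get("SOURCE_DATE_EPOCH")
    if env_value:
        # The specification defines this as an integer count of seconds.
        return int(env_value)
    return int(time.time() if now is None else now)
```

With neither the option nor the variable set, the result is simply the configure time, which matches the "within a minute of compile time" behaviour described in the message.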
From hmurray at megapathdsl.net Thu Apr 20 21:57:53 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 20 Apr 2017 14:57:53 -0700 Subject: Pivoting Message-ID: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> We need a web page that summarizes this area - not the internal details, but the general problem and how it impacts users. NTP overflows in 2036 32 bit signed time_t overflows in 2038 GPS overflows every 1024 weeks (~20 years starting from 1980) First rollover was 22 August 1999 They added 3 bits in ??? Some older devices used pivot logic to extend their lifetime past 2000 A description of pivoting and how it works. Maybe mention the distro reproducible problem and __DATE__ Good URLs for more info -------- I see 4 ways for ntpd to handle pivoting. 1) The current code does the pivoting at step_systime 2) We could do the pivoting "as soon as possible", when the received packet arrives. The NTP calculations use 4 time stamps: 2 local, 2 remote. The local time stamps need to save the timespec used to produce them. 3) We could assume that the system time is "close enough". That turns into pivoting around "now". That assumes that the system has a sane RTC or software that does something like get a sane time from the file system. "close enough" doesn't have to be very close. Within 50 years is good enough. 4) We could add a few lines of code to the initialization of ntpd that jumps bogus time to some compiled in value. The old way to get that value was __DATE__ but the last git commit date or something similar from the distro environment avoids the reproducible problem. This is pivoting around the build date. -------- We need to verify that ntpdig does something sane. As long as we are cleaning up this area... 32 bits of seconds spans 136 years. ntpd panics if it is about to step the clock more than 900 seconds.
There is a startup switch to bypass that and allow one long jump. We could supplement that sanity check with a pivot check. If we have a build date, we know the time should be after that build date and less than the life of the program past that. What's a reasonable life of a program like ntpd? 20 years seems like the right ballpark. After that, we have to check the GPS drivers. The Magnavox driver just checks for before. I haven't checked the NMEA driver. 50 would be more conservative for IoT type devices. (Many GPS devices lasted long enough to hit the week rollover bug.) Configure options, both build and run-time, could help but they would probably be ignored by the few people who should use them. -- These are my opinions. I hate spam. From gem at rellim.com Thu Apr 20 22:13:44 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 20 Apr 2017 15:13:44 -0700 Subject: Pivoting In-Reply-To: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170420151344.390c8fac@spidey.rellim.com> Yo Hal! On Thu, 20 Apr 2017 14:57:53 -0700 Hal Murray wrote: > We need a web page that summarizes this area - not the internal > details, but the general problem and how it impacts users. Actually, EPOCH doesn't impact users, not any differently than __DATE__ did. EPOCH only impacts repository managers. And yes, they could use guidance, packaging guidance is here: devel/packaging.txt Patches welcome. > NTP overflows in 2036 > 32 bit signed time_t overflows in 2038 > GPS overflows every 1024 weeks (~20 years starting from 1980) > First rollover was 22 August 1999 > They added 3 bits in ??? > Some older devices used pivot logic to extend their lifetime past > 2000 None of that changed with EPOCH. Eric said he is looking into the pivot problem, he'll tell us when he is ready.
> Maybe mention the distro reproducible problem and __DATE__ That is in devel/packaging.txt and https://reproducible-builds.org/ > Good URLs for more info I can add that URL to devel/packaging.txt > I see 4 ways for ntpd to handle pivoting. I'll leave that to you and Eric. Until Eric comes back with a plan, and seeks input. In any case, off topic for EPOCH. > We need to verify that ntpdig does something sane. Verify away. > As long as we are cleaning up this area... So many balls in the air, I'll wait for a few to get caught first. Feel free to send patches. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From ianbruene at gmail.com Fri Apr 21 00:03:20 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Thu, 20 Apr 2017 19:03:20 -0500 Subject: Adding a dependency / SNMP daemon Message-ID: In preparation for starting work on the SNMP daemon I've been looking at python SNMP libraries. PySNMP would appear to be stable / well maintained, based on the documentation that I've seen so far. However, adding a new dependency is a fairly major event and I do not have the experience to be certain that PySNMP is a good choice. Does anyone have a library that would be a better choice? or other related input? (oh, and it is also easily installable via pip) -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From esr at thyrsus.com Fri Apr 21 01:29:31 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Thu, 20 Apr 2017 21:29:31 -0400 Subject: Pivoting In-Reply-To: <20170420151344.390c8fac@spidey.rellim.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170420151344.390c8fac@spidey.rellim.com> Message-ID: <20170421012931.GA18916@thyrsus.com> Gary E. Miller : > So many balls in the air, I'll wait for a few to get caught first. I wish I could do that... I've been quiet for the last couple of days because Cathy and I have been struggling with the aftermath of that *vicious* flu. I no longer have overt symptoms, Cathy has only minor cold-like ones, but we're both tired a lot of the time and needing more sleep than usual. Minor exertion can knock us flat. I'd heard this can sometimes happen after Type A influenza but never encountered it before. I have good days when I think it's gone, then next day I'm chronically exhausted. Depressing not knowing how long it will last; we were vaguely warned that such effects can linger for weeks. However, this does not mean Penguicon is a scrub. Even if I'm not fully recovered by then, the modafinil I keep around for mitigating my CP spasticity is the exact drug normally prescribed for post-viral fatigue syndrome. I try to avoid taking the stuff oftener than I need to because I don't want to develop a lifestyle dependency, but being fully functional for our FTF qualifies as "need". What I'm doing when I can summon enough energy to work is ploughing my way through the huge mass of back mail about pivot and related topics trying to boil it down into a work plan that I can post here for discussion. I'm already pretty sure we're going to want to move the pivot logic to packet receipt time. But I'm *not* sure we should do this immediately; the bug won't become urgent until 2036, which gives us plenty of time to build a test jig and be sure we get the change right. Gary's implementation of a pivot environment variable Beats me why Mills didn't write it that way to begin with.
I don't catch him in architectural mistakes often, he was pretty good that way even by my elevated standards, but this sure seems to be one. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From gem at rellim.com Fri Apr 21 04:29:18 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 20 Apr 2017 21:29:18 -0700 Subject: Pivoting In-Reply-To: <20170421012931.GA18916@thyrsus.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170420151344.390c8fac@spidey.rellim.com> <20170421012931.GA18916@thyrsus.com> Message-ID: <20170420212918.392feba6@spidey.rellim.com> Yo Eric! On Thu, 20 Apr 2017 21:29:31 -0400 "Eric S. Raymond" wrote: > I've been quiet for the last couple of days because Cathy and I have > been struggling with the aftermath of that *vicious* flu. Family first. We'll be here after Cathy's better and you are more recovered. Spring is coming, that will energize us all. Tell Cathy we are wishing for her speedy recovery. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Fri Apr 21 04:32:42 2017 From: gem at rellim.com (Gary E.
Miller) Date: Thu, 20 Apr 2017 21:32:42 -0700 Subject: Adding a dependency / SNMP daemon In-Reply-To: References: Message-ID: <20170420213242.648de20d@spidey.rellim.com> Yo Ian! On Thu, 20 Apr 2017 19:03:20 -0500 Ian Bruene wrote: > In preparation for starting work on the SNMP daemon I've been looking > at python SNMP libraries. PySNMP would appear to be stable / well > maintained, based on the documentation that I've seen so far. > However, adding a new dependency is a fairly major event and I do not > have the experience to be certain that PySNMP is a good choice. Yes, a big decision, luckily the SNMP is entirely optional. Before you make any decisions, try a few of them out, see how easily they work with your concept. Also, look inside them, it may be you only need a few small bits and you can just cut/paste what you need. > Does anyone have a library that would be a better choice? or other > related input? > > (oh, and it is also easily installable via pip) Big win. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From ianbruene at gmail.com Fri Apr 21 04:47:58 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Thu, 20 Apr 2017 23:47:58 -0500 Subject: Pivoting In-Reply-To: <20170421012931.GA18916@thyrsus.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170420151344.390c8fac@spidey.rellim.com> <20170421012931.GA18916@thyrsus.com> Message-ID: <6da76a81-9c31-d7e1-5d17-f010a4679e1f@gmail.com> On 04/20/2017 08:29 PM, Eric S.
Raymond wrote: > I no longer have overt symptoms, Cathy has only minor cold-like ones, but we're > both tired a lot of the time and needing more sleep than usual. Minor > exertion can knock us flat. * hopes of a speedy genocide to the relevant invasive lifeforms -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From Stromeko at Nexgo.DE Fri Apr 21 07:49:02 2017 From: Stromeko at Nexgo.DE (Achim Gratz) Date: Fri, 21 Apr 2017 09:49:02 +0200 Subject: Re: ✘EPOCH In-Reply-To: <20170420140510.26d61c6d@spidey.rellim.com> References: <20170420140510.26d61c6d@spidey.rellim.com> Message-ID: > The standard recommends setting EPOCH by default to the date of the > last source file modification. I'm not sure if that is desirable, > should be optional, or just ignored. If you stick to that recommendation then someone has to go to great lengths to have the EPOCH going backwards and separate builds from the same source will have the same EPOCH. The latter consideration becomes important if you use buildbots working away concurrently, which are nowadays what will provide the packages for almost any GNU/Linux distribution. So I think it's a quite sensible recommendation that should be followed unless you have a specific reason not to. -- Achim. (on the road :-) From Stromeko at Nexgo.DE Fri Apr 21 08:26:36 2017 From: Stromeko at Nexgo.DE (Achim Gratz) Date: Fri, 21 Apr 2017 10:26:36 +0200 Subject: Pivoting In-Reply-To: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: > I see 4 ways for ntpd to handle pivoting. > > 1) The current code does the pivoting at step_systime > > 2) We could do the pivoting "as soon as possible", when the received packet > arrives. The NTP calculations use 4 time stamps: 2 local, 2 remote. The > local time stamps need to save the timespec used to produce them.
> > 3) We could assume that the system time is "close enough". That turns into > pivoting around "now". That assumes that the system has a sane RTC or > software that does something like get a sane time from the file system. > "close enough" doesn't have to be very close. Within 50 years is good enough. > > 4) We could add a few lines of code to the initialization of ntpd that jumps > bogus time to some compiled in value. The old way to get that value was > __DATE__ but the last git commit date or something similar from the distro > environment avoids the reproducible problem. This is pivoting around the > build date. I don't think ntpd needs to do any pivoting _except_ at startup time, where it is unavoidable, and it should not attempt to do any after it has started up. For setting the initial time you'll want to have as many independent bounds on the time as you can provide, since you potentially cannot trust _any_ of the possible sources. Short of an authoritative and trusted time that is within about 68 years of the true time, the only thing you can do is a maximum likelihood estimate and that always means that there is a non-zero chance to resolve to the wrong NTP era since any assumption that you make can turn out to be wrong for any number of reasons. > We could supplement that sanity check with a pivot check. If we have a build > date, we know the time should be after that build date and less than the life > of the program past that. One assumption delivering a single bound. The other assumption is that the system time is already close enough for ntpd to not need to pivot again. Getting time over HTTPS[*] could deliver a third venue to start doing a majority vote. [*] http://phk.freebsd.dk/time/20151129.html > What's a reasonable life of a program like ntpd? 20 years seems like the > right ballpark. After that, we have to check the GPS drivers. The Magnavox > driver just checks for before. I haven't checked the NMEA driver.
50 would > be more conservative for IoT type devices. (Many GPS devices lasted long > enough to hit the week roll over bug.) Configure options, both build and > run-time, could help but they would probably be ignored by the few people who > should use them. Given that there are still PDP8 and VAX systems running production plants (albeit increasingly as virtual machines), I'd say you vastly underestimate the longevity of something that is mostly out-of-sight and "just works". Even if you only consider physical hardware, based on the projected lifetime of automotive qualified systems (15 years or longer) you have to expect a much longer actual lifetime in the field. -- Achim. (on the road :-) From ianbruene at gmail.com Fri Apr 21 12:46:18 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Fri, 21 Apr 2017 07:46:18 -0500 Subject: Pivoting In-Reply-To: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On 04/20/2017 04:57 PM, Hal Murray wrote: > What's a reasonable life of a program like ntpd? [...] Allow me to reproduce a lesson I learned the other day: 19:14 One should always put a multiplier on one's estimate of how long their code will be in use: code expected to last for 5 years lasting 20 got us Y2K. Code expected to last for 50 years had better be happy running on a vital server in the Forgotten Closet of Doom for the next 150 years. Code randomly breaking for unforeseeable reasons is one thing; that is what continuing support and maintenance is for. But designing the software to break horribly after a couple decades for perfectly foreseeable reasons isn't going to cut it, not when dealing with core infrastructure. As Achim Gratz pointed out, there are still *PDP8s* running around. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. 
-- Andrew Ryan From gem at rellim.com Fri Apr 21 16:51:48 2017 From: gem at rellim.com (Gary E. Miller) Date: Fri, 21 Apr 2017 09:51:48 -0700 Subject: ✘EPOCH In-Reply-To: References: <20170420140510.26d61c6d@spidey.rellim.com> Message-ID: <20170421095148.73846798@spidey.rellim.com> Yo Achim! On Fri, 21 Apr 2017 09:49:02 +0200 Achim Gratz wrote: > > The standard recommends setting EPOCH by default to the date of the > > last source file modification. I'm not sure if that is desirable, > > should be optional, or just ignored. > > If you stick to that recommendation then someone has to go to great > lengths to have the EPOCH going backwards and separate builds from > the same source will have the same EPOCH. The latter consideration > becomes important if you use buildbots working away concurrently, > which are nowadays what will provide the packages for almost any > GNU/Linux distribution. So I think it's a quite sensible > recommendation that should be followed unless you have a specific > reason not to. I agree with the reasons to do it, I'm not sure how to do it. waf can't just look at the date stamps on the C files and pick the newest one. Some of the C files are auto generated. Patches welcome. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Sat Apr 22 05:09:18 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 21 Apr 2017 22:09:18 -0700 Subject: Pivoting In-Reply-To: Message from Hal Murray of "Thu, 20 Apr 2017 14:57:53 PDT."
<20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> More pivoting.... pylib calls ntp.ntpc.lfptofloat() in several places ntpc_lfptofloat() calls lfp_stamp_to_tspec() That's an inline which calls ntpcal_ntp_to_time() ntpcal_ntp_to_time takes an optional second argument, the pivot time. If it's NULL, it uses "now". ----------------- The code in step_systime() is really really ugly. (to my eye) It starts by computing the pivot. It gets the build time as a broken down struct, subtracts 10 years, converts back to a time_t. All that can be precomputed. It converts the step to a l_fp, gets the system time as a tspec, converts that to l_fp, adds them together, then converts back to a tspec. That convert back uses the pivot. Now that we have a nice simple EPOCH, I hope somebody cleans that up. Does the 10 year step back make any sense? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Sat Apr 22 06:22:33 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 21 Apr 2017 23:22:33 -0700 Subject: Pivoting In-Reply-To: Message from Achim Gratz of "Fri, 21 Apr 2017 10:26:36 +0200." Message-ID: <20170422062233.78E2A40605C@ip-64-139-1-69.sjc.megapath.net> Stromeko at Nexgo.DE said: > Even if you only consider physical hardware, based on the projected > lifetime of automotive qualified systems (15 years or longer) you have to > expect a much longer actual lifetime in the field. ianbruene at gmail.com said: > 19:14 One should always put a multiplier on one's > estimate of how long their code will be in use: code expected to > last for 5 years lasting 20 got us Y2K. ... Right. I think there are two issues tangled up here. One is that there is a tradeoff between building in a long lifetime and catching problems during a normal lifetime. I'm not sure which is more likely. 
Consider what happens if a server gets fired up with a broken clock and starts answering all requests with 1970. Do you want to reject that, or pivot it to 2036? The other is that there really is a 20 year rollover with old GPS units. (Newer units have 13 bits.) I think that turns into 3 choices: Don't support really old GPS units. Advertise the default lifetime. Allow the user to specify the pivot time and/or life time, either at build time or at run time or both. --------- How long have computers been used in cars? When did they start using software complicated enough to support a system time-of-day clock? What's their track record for long lifetime bugs? -- These are my opinions. I hate spam. From gem at rellim.com Sat Apr 22 20:35:03 2017 From: gem at rellim.com (Gary E. Miller) Date: Sat, 22 Apr 2017 13:35:03 -0700 Subject: Pivoting In-Reply-To: <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170422133503.49edac5b@spidey.rellim.com> Yo Hal! On Fri, 21 Apr 2017 22:09:18 -0700 Hal Murray wrote: > The code in step_systime() is really really ugly. (to my eye) Look at how it was, before Eric reverted my code. I think that is clean, but further suggestions welcome. Eric asked that no one touch that until he reviewed it again. > It starts by computing the pivot. It gets the build time as a broken > down struct, subtracts 10 years, converts back to a time_t. All that > can be precomputed. And unnecessary. > Does the 10 year step back make any sense? What ten year step back? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it."
- Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Sat Apr 22 20:39:29 2017 From: gem at rellim.com (Gary E. Miller) Date: Sat, 22 Apr 2017 13:39:29 -0700 Subject: Pivoting In-Reply-To: <20170422062233.78E2A40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170422062233.78E2A40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170422133929.2d40cf2b@spidey.rellim.com> Yo Hal! On Fri, 21 Apr 2017 23:22:33 -0700 Hal Murray wrote: > One is that there is a tradeoff between building in a long lifetime > and catching problems during a normal lifetime. I'm not sure which > is more likely. Consider what happens if a server gets fired up with > a broken clock and starts answering all requests with 1970. Do you > want to reject that, or pivot it to 2036? I'd have ntpd reject any time prior to EPOCH. > Allow the user to specify the pivot time and/or life time, either > at build time or at run time or both. EPOCH is used for NMEA, so that is covered at build time. I could see adding an option to specify the EPOCH at run time too. > How long have computers been used in cars? When did they start using > software complicated enough to support a system time-of-day clock? > What's their track record for long lifetime bugs? As mentioned earlier, PDP-8s still run. That is late 1960's. Call it 50 years. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Sat Apr 22 22:41:56 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 22 Apr 2017 18:41:56 -0400 Subject: Pivoting In-Reply-To: <20170422133503.49edac5b@spidey.rellim.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> <20170422133503.49edac5b@spidey.rellim.com> Message-ID: <20170422224156.GC15391@thyrsus.com> Gary E. Miller : > Yo Hal! > > On Fri, 21 Apr 2017 22:09:18 -0700 > Hal Murray wrote: > > > The code in step_systime() is really really ugly. (to my eye) > > Look at how it was, before Eric reverted my code. I think that > is clean, but further suggestions welcome. Eric asked no one touch > that until he reviewed it again. It is indeed extremely ugly. We inherited that, and as has been already demonstrated, attempting to fix it is risky. I plan to do some refactoring that will at least reduce the ugliness. > > It starts by computing the pivot. It gets the build time as a broken > > down struct, subtracts 10 years, converts back to a time_t. All that > > can be precomputed. > > And unnecessary. Probably. But we need to be slow and careful here. > > Does the 10 year step back make any sense? > > What ten year step back? I don't see that either. Have you noticed something we didn't? -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From hmurray at megapathdsl.net Sun Apr 23 07:45:33 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 23 Apr 2017 00:45:33 -0700 Subject: Pivoting Message-ID: <20170423074533.5F4BE40605C@ip-64-139-1-69.sjc.megapath.net> >>> Does the 10 year step back make any sense? >> What ten year step back? > I don't see that either. Have you noticed something we didn't? if (ntpcal_get_build_date(&jd)) { jd.year -= 10; pivot += ntpcal_date_to_time(&jd); -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Sun Apr 23 10:20:48 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 23 Apr 2017 03:20:48 -0700 Subject: Pivoting Message-ID: <20170423102048.13F19406063@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: > I'd have ntpd reject any time prior to EPOCH. How do you decide whether to reject it or pivot it into the future? >> Allow the user to specify the pivot time and/or life time, either >> at build time or at run time or both. > EPOCH is used for NMEA, so that is covered at build time. > I could see adding an option to specify the EPOCH at run time too. My build time comment was mostly for life time. I was assuming that EPOCH would be used for pivoting. I know about three pivots to consider. One is GPS 10 bits for weeks with a 20 year step size. Another is 2 digit year numbers with a 100 year step size. The third is 32 bits of seconds in NTP packets with a 136 year step size. Are there any others I've overlooked? If we want our software to last more than 20 years while talking to crappy GPS receivers, we need a way to update the pivot date at run time. (I'm using "last" to mean without rebuilding.) If we want our software to reject bogus time, we have to balance the tradeoff between long life and good filtering. Run time parameters will allow the user to choose. > As mentioned earlier, PDP-8s still run. That is late 1960's.
Call it 50 > years. It would be interesting to see what those setups are actually doing and if they have documentation for something as obscure as NTP. There is a lot of lab gear running embedded software. I wonder how much of it will be running at its 25th or 50th birthday. I have a pre-software scope that's over 35 years old. The combination of long life and crappy GPS seems obscure enough that I'm willing to document it as a limitation. It's the kind of code that Eric would love to rip out if he found it a year ago. The documentation issue gets interesting. A feature isn't any good if you can't figure out how to use it. I wonder if the web will solve that problem. Will NTPsec still be online 20 years from now? Will we maintain online versions of 20 year old releases? Would anybody notice a warning message from a program that's been running for 19 years? -- These are my opinions. I hate spam. From Stromeko at Nexgo.DE Sun Apr 23 12:18:58 2017 From: Stromeko at Nexgo.DE (Achim Gratz) Date: Sun, 23 Apr 2017 14:18:58 +0200 Subject: Pivoting In-Reply-To: <20170423102048.13F19406063@ip-64-139-1-69.sjc.megapath.net> References: <20170423102048.13F19406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: > How do you decide whether to reject it or pivot it into the future? That's the thing: you can't do it in the general case. > I know about three pivots to consider. One is GPS 10 bits for weeks with a > 20 year step size. Another is 2 digit year numbers with a 100 year step > size. The third is 32 bits of seconds in NTP packets with a 136 year step > size. Are there any others I've overlooked? Let's consider them separately. Limitations in primary clock sources are one thing, the more important case is cold startup. The two become quite entangled for a stratum-1 of course, but running a server rather than a client increases the chances of some actual person getting involved.
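For readers following the three pivots listed above, here is a minimal C sketch of the common idea, not NTPsec code: a truncated field (10-bit GPS week, two-digit year, 32-bit NTP seconds) is unfolded into the single full value that lies in the window starting at a chosen pivot. The helper name `unfold` and the pivot values used in the note below are assumptions made purely for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not an NTPsec function.
 *
 * Given a value that was truncated modulo `period`, return the unique
 * full value v such that
 *     v is congruent to `truncated` modulo `period`, and
 *     pivot <= v < pivot + period.
 * Anything older than the pivot is therefore pushed one period forward.
 */
static int64_t unfold(int64_t truncated, int64_t period, int64_t pivot)
{
    int64_t delta;

    assert(period > 0);

    delta = (truncated - pivot) % period;
    if (delta < 0)
        delta += period;    /* C's % may return a negative remainder */

    return pivot + delta;
}
```

With an assumed pivot year of 2007, `unfold(17, 100, 2007)` gives 2017; with an assumed pivot week of 1900, `unfold(925, 1024, 1900)` gives 1949. The same call with a period of 2^32 would cover the NTP seconds case. Note that this only picks a window; it cannot tell a genuinely stale timestamp from one that wrapped, which is the reject-or-pivot question raised in the thread.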
If you indeed know the overflow cycles, then you can make the reconciliation timespan much longer than each individual one if the offsets and periods are relatively prime. > If we want our software to last more than 20 years while talking to crappy > GPS receivers, we need a way to update the pivot date at run time. (I'm > using "last" to mean without rebuilding.) Look at what gets done for leap seconds. However (again), this is not a problem at all for a running NTP server with a tiny bit of care in interpreting the input (which should never be fully trusted anyway). > If we want our software to reject bogus time, we have to balance the tradeoff > between long life and good filtering. Run time parameters will allow the > user to choose. I don't think the user will be able to choose anything except tell whether the time is good or not _if_ he gets involved. > It would be interesting to see what those setups are actually doing and if > they have documentation for something as obscure as NTP. None of that is networked to the interwebs or anything like that. > There is a lot of lab gear running embedded software. I wonder how much of > it will be running at its 25th or 50th birthday. I have a pre-software scope > that's over 35 years old. We have measurement systems still in active use in our lab bought in 1993 that had a positively ancient embedded system software on them at that time already. It's also never seen an update since. I'm not sure I can tease out the build date for the software, but since it uses NFS I'd say it's some 28 years old. These _are_ networked, since otherwise we'd have to exchange data via 3 1/2" floppy disks. > The combination of long life and crappy GPS seems obscure enough that I'm > willing to document it as a limitation. It's the kind of code that Eric > would love to rip out if he found it a year ago. In any case resolving this really belongs in the drivers, not the main ntpd code.
I've long been convinced that the clock drivers should be able to tell ntpd how sure they are about the data they report and ntpd should be able to tell them what it thinks about the correctness of the data in some fashion as well. > The documentation issue gets interesting. A feature isn't any good if you > can't figure out how to use it. I wonder if the web will solve that problem. > Will NTPsec still be online 20 years from now? Will we maintain online > versions of 20 year old releases? I have saved plenty of links to documentation over time that are now dead. Some of them can be revived using the Wayback archive, but not all. So yes, this is a concern, however it may be out of scope for the project at least at the moment. > Would anybody notice a warning message from a program that's been running for > 19 years? No, but if it was a time server that systems still followed, it'd take your network down quite fast these days. Figuring out why would be a minor nightmare. -- Achim. (on the road :-) From esr at thyrsus.com Sun Apr 23 12:27:39 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 08:27:39 -0400 Subject: Refclocks and formatting In-Reply-To: References: Message-ID: <20170423122739.GA25461@thyrsus.com> Ian Bruene : > @ESR > > In the units project I discovered that the string formatting for refclocks > is handled in a completely different manner from the rest of the code. > Specifically ntpq/mon call ntp.ntpc.statustoa which is a C library, instead > of calling hypothetical formatting functions in pylib/util.py like they do > for everything else. > > As far as I can tell from a cursory examination of the code the reason for > this is so it can use the same bitmask #defines as the rest of the system. > Is this correct? If so does that need to remain the case, if not then why is > the complexity of a language bridge being maintained? If it has to stay this > way the unit formatters *can* munch on the output of statustoa. Well spotted. 
The truth is, this split is a historical hangover from the sequence in which I wrote the Python tools - there's nothing principled about it at all. I wrote the extension early; afterwards I figured out how to mechanically generate the required #defines into control.py and the need for ntp.ntpc.statustoa went away. If it were otherwise possible to get rid of the Python extension I'd have gotten rid of ntp.ntpc.statustoa already. As it is, every time I have remembered this minor wart I've either had something more urgent to work on or (the last couple of weeks) been so ill that I couldn't easily summon up the energy to deal with code issues below emergency priority. (I'm feeling better now. Slow recovery, though; stamina isn't 100% yet. Post-viral fatigue syndrome is a *bitch*.) If you want to clean this up, go right ahead. I think it would be a good way for you to get your fingers into the C code, with a simple excision. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From ianbruene at gmail.com Sun Apr 23 15:41:13 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Sun, 23 Apr 2017 10:41:13 -0500 Subject: Refclocks and formatting In-Reply-To: <20170423122739.GA25461@thyrsus.com> References: <20170423122739.GA25461@thyrsus.com> Message-ID: <0ee8cf0d-d1c3-3950-34e6-3bccae538c1d@gmail.com> I should have mentioned this in the devlist, but I keep forgetting that you aren't getting updates from bug #263. My question was based on a misunderstanding of the code: I thought that all of the formatting for refclocks was happening in statustoa, when it is actually just a single line. Incidentally, overfocusing on the first path that becomes apparent is a good first candidate for "You'll find [other failure modes], and have to train your own way out of them.". On 04/23/2017 07:27 AM, Eric S.
Raymond wrote: > Well spotted. The truth is, this split is a historical hangover from > the sequence in which I wrote the Python tools - there's nothing > principled about it at all. I wrote the extension early; afterwards > I figured out how to mechanically generate the required #defines into > control.py and the need for ntp.ntpc.statustoa went away. I suspected something of the sort; I knew that control.py was generated at build time and was going to suggest that this could be done to eliminate statustoa. > If you want to clean this up, go right ahead. I think it would be a > good way for you to get your fingers into the C code, with a simple > excision. I'll take a look at it, in /theory/ it should be an easy swap. This would also be a good propaganda case: "Look here! This is how efficient Python is; a noob can write code smaller, and by extension with fewer bug receptors, than the same code written in C by an expert". It would apply both in general, and also when trying to sell people on NTPsec. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From esr at thyrsus.com Sun Apr 23 17:41:21 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 13:41:21 -0400 Subject: Refclocks and formatting In-Reply-To: <0ee8cf0d-d1c3-3950-34e6-3bccae538c1d@gmail.com> References: <20170423122739.GA25461@thyrsus.com> <0ee8cf0d-d1c3-3950-34e6-3bccae538c1d@gmail.com> Message-ID: <20170423174121.GA28602@thyrsus.com> Ian Bruene : > >If you want to clean this up, go right ahead. I think it would be a > >good way for you to get your fingers into the C code, with a simple > >excision. > > I'll take a look at it, in /theory/ it should be an easy swap. I'm now investigating a complication, alas. Turns out that the C statustoa handles one set of #defines that isn't in control.py; the PLL* things for the system status word. This matters if the status type is TYPE_SYS, not TYPE_PEER or TYPE_CLOCK.
This doesn't matter in the ntpmon case, it only wants a peer or clock status. The printvars commands in ntpq are what use TYPE_SYS. Ugh. I didn't know this before because I hadn't gotten around to actually trying to ditch the C statustoa. It's still possible you might be able to get rid of it, but it is unlikely to be simple. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Sun Apr 23 18:17:00 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 14:17:00 -0400 Subject: Refclocks and formatting In-Reply-To: <20170417003621.C5E0940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170417003621.C5E0940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170423181700.GB25461@thyrsus.com> Hal Murray : > There is another potential worm in this can. I don't think if it applies to > this case, but it's worth keeping in mind. > > Some of the flags that get decoded come from the kernel sources rather than > NTP sources. They are generally the same across OSes and kernel versions, > but I don't know how to verify that. I think the clean solution would be to > decode them on the server. [Or copy the versions from one kernel to our > sources and have the build step crash if they are defined in the kernel and > different.) You are right to raise this issue. Either fix would be messy though. Mode 6 should really not be shipping those flag bits in hex at all; it should decode them into a list of mode strings. That would be the *right* fix, but of course it would break mode 6 backward compatibility. I'm tempted to do it anyway. I'm normally extremely wary of doing things that might break sysadmin scripts, but this is a rather improbable case - somebody would have to be decoding those bits with a thing that isn't ntpq or ntpmon itself to be affected. 
We could keep the old behavior under ENABLE_CLASSIC_MODE. Hm. Hal, what do you think our actual risk exposure is here? -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Sun Apr 23 18:21:50 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 14:21:50 -0400 Subject: ntpq vs new DNS In-Reply-To: <20170417061641.2B8F940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170417061641.2B8F940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170423182150.GD25461@thyrsus.com> Hal Murray : > > For server slots specified by name, the new DNS returns both the local IP > Address and the hostname. Servers specified by numerical address don't > return a hostname. > > The current ntpq gives preference to the hostname slot. That works for pool > slots where the address is useless. It's much more complicated for things > like > server 0.us.pool.ntp.org > > I think changing a few lines will revert to the previous behavior. > > Does anybody have any great ideas for how to take advantage of this extra > info? Sorry, I don't. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Sun Apr 23 18:26:03 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 14:26:03 -0400 Subject: Let's get rid of pivots In-Reply-To: <20170418233604.EBFDA40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170418183830.3E3B440605C@ip-64-139-1-69.sjc.megapath.net> <20170418233604.EBFDA40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170423182603.GE25461@thyrsus.com> Hal Murray : > We might want to include a sanity check in the pivot code.
The full range of > 32 bits of seconds is 136 years. Most of that range doesn't make sense. > Does it make sense to set the clock to 100 years after the build time? I would be very wary of making assumptions about the maximum service life of an installed binary. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Sun Apr 23 18:39:04 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 23 Apr 2017 11:39:04 -0700 Subject: Refclocks and formatting In-Reply-To: Message from "Eric S. Raymond" of "Sun, 23 Apr 2017 14:17:00 EDT." <20170423181700.GB25461@thyrsus.com> Message-ID: <20170423183904.83BAC406063@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > Mode 6 should really not be shipping those flag bits in hex at all; it > should decode them into a list of mode strings. That would be the *right* > fix, but of course it would break mode 6 backward compatibility. > I'm tempted to do it anyway. I'm normally extremely wary of doing things > that might break sysadmin scripts, but this is a rather improbable case - > somebody would have to be decoding those bits with a thing that isn't ntpq > or ntpmon itself to be affected. We could ship both hex and text. New code could use the text and ignore the hex. Old code would continue doing whatever it did. > We could keep the old behavior under ENABLE_CLASSIC_MODE. Hm. Hal, what do > you think our actual risk exposure is here? That doesn't feel right, but I can't come up with a simple example to demonstrate why. I think the exposure is very low. Any geek that has written a mode 6 client can probably fix it quickly. The real problem is that this is just the tip of an iceberg. I'm sure there will be other fixes we want to make that are not backward compatible. Is shipping both a good general policy?
Do we need to start tracking what we will support and/or develop a policy for when we drop support for a slot? -- These are my opinions. I hate spam. From gem at rellim.com Sun Apr 23 18:44:24 2017 From: gem at rellim.com (Gary E. Miller) Date: Sun, 23 Apr 2017 11:44:24 -0700 Subject: Refclocks and formatting In-Reply-To: <20170423174121.GA28602@thyrsus.com> References: <20170423122739.GA25461@thyrsus.com> <0ee8cf0d-d1c3-3950-34e6-3bccae538c1d@gmail.com> <20170423174121.GA28602@thyrsus.com> Message-ID: <20170423114424.1b70202d@spidey.rellim.com> Yo Eric! On Sun, 23 Apr 2017 13:41:21 -0400 "Eric S. Raymond" wrote: > This doesn't matter in the ntpmon case, it only wants a peer or clock > status. The printvars commands in ntpq are what use TYPE_SYS. It should be a TODO item to add TYPE_SYS things to ntpmon. I miss them. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From ianbruene at gmail.com Sun Apr 23 18:49:32 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Sun, 23 Apr 2017 13:49:32 -0500 Subject: Refclocks and formatting In-Reply-To: <20170423183904.83BAC406063@ip-64-139-1-69.sjc.megapath.net> References: <20170423183904.83BAC406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: On 04/23/2017 01:39 PM, Hal Murray wrote: > > The real problem is that this is just the tip of an iceberg. I'm sure there > will be other fixes we want to make that are not backward compatible. Is > shipping both a good general policy?
This sounds like something for post-1.0, in fact I had already decided to leave the statustoa problem until after SNMP is implemented. It works, it's just ugly. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From gem at rellim.com Sun Apr 23 18:56:45 2017 From: gem at rellim.com (Gary E. Miller) Date: Sun, 23 Apr 2017 11:56:45 -0700 Subject: Pivoting In-Reply-To: <20170423102048.13F19406063@ip-64-139-1-69.sjc.megapath.net> References: <20170423102048.13F19406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170423115645.1f97f9b4@spidey.rellim.com> Yo Hal! On Sun, 23 Apr 2017 03:20:48 -0700 Hal Murray wrote: > gem at rellim.com said: > > I'd have ntpd reject any time prior to EPOCH. > > How do you decide whether to reject it or pivot it into the future? If you know unambiguously the time is past then you can reject. Otherwise pivot. > >> Allow the user to specify the pivot time and/or life time, either > >> at build time or at run time or both. > > EPOCH is used for NMEA, so that is covered at build time. > > I could see adding an option to specify the EPOCH at run time too. > > My build time comment was mostly for life time. I was assuming that > EPOCH would be used for pivoting. Incomplete assumption, BUILD_EPOCH is also used to disambiguate century for 2 digit years in NMEA and some other drivers. As was __DATE__ previously. > I know about three pivots to consider. One is GPS 10 bits for weeks > with a 20 year step size. Another is 2 digit year numbers with a 100 > year step size. The third is 32 bits of seconds in NTP packets with > a 136 year step size. Are there any others I've overlooked? Some GPS use 13 bit weeks. For completeness, there is still the 2038 pivot for time32_t in use by some 32 bit OS. > If we want our software to last more than 20 years while talking to > crappy GPS receivers, we need a way to update the pivot date at run > time. (I'm using "last" to mean without rebuilding.)
Fair enough. I keep suggesting being able to override BUILD_EPOCH in ntp.conf. Then leave the problem to others how to get that set right. > If we want our software to reject bogus time, we have to balance the > tradeoff between long life and good filtering. Run time parameters > will allow the user to choose. Choose to shoot himself in the foot too. But gotta take that risk. > I have a pre-software scope that's over 35 years old. Ditto here. Just one? > The combination of long life and crappy GPS seems obscure enough that > I'm willing to document it as a limitation. It's the kind of code > that Eric would love to rip out if he found it a year ago. Doc is a good start. > The documentation issue gets interesting. A feature isn't any good > if you can't figure out how to use it. I wonder if the web will > solve that problem. Will NTPsec still be online 20 years from now? > Will we maintain online versions of 20 year old releases? As I've previously said, I have a friend running 20 year old binaries of NTP classic. > Would anybody notice a warning message from a program that's been > running for 19 years? Prolly not... RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Sun Apr 23 19:01:59 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 15:01:59 -0400 Subject: Refclocks and formatting In-Reply-To: References: <20170423183904.83BAC406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170423190159.GA30937@thyrsus.com> Ian Bruene : > >The real problem is that this is just the tip of an iceberg.
I'm sure there > >will be other fixes we want to make that are not backward compatible. Is > >shipping both a good general policy? > > This sounds like something for post-1.0, in fact I had already decided to > leave the statustoa problem until > after SNMP is implemented. It works, it's just ugly. Actually, I'd *rather* roll out incompatibilities in 1.0 if we're going to have them at all. Better to present people with one clean break and truthfully promise stability afterwards than to have an indefinitely continuing series of breakages. Of course, we need a strong justification for each one. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Sun Apr 23 19:03:34 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 15:03:34 -0400 Subject: Refclocks and formatting In-Reply-To: <20170423114424.1b70202d@spidey.rellim.com> References: <20170423122739.GA25461@thyrsus.com> <0ee8cf0d-d1c3-3950-34e6-3bccae538c1d@gmail.com> <20170423174121.GA28602@thyrsus.com> <20170423114424.1b70202d@spidey.rellim.com> Message-ID: <20170423190334.GB30937@thyrsus.com> Gary E. Miller : > Yo Eric! > > On Sun, 23 Apr 2017 13:41:21 -0400 > "Eric S. Raymond" wrote: > > > This doesn't matter in the ntpmon case, it only wants a peer or clock > > status. The printvars commands in ntpq are what use TYPE_SYS. > > It should be a TODO item to add TYPE_SYS things to ntpmon. I miss them. Where would they fit in the UI, though? I did think about that - couldn't come up with a design that didn't seem ugly. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From gem at rellim.com Sun Apr 23 19:09:46 2017 From: gem at rellim.com (Gary E. Miller) Date: Sun, 23 Apr 2017 12:09:46 -0700 Subject: Refclocks and formatting In-Reply-To: <20170423190334.GB30937@thyrsus.com> References: <20170423122739.GA25461@thyrsus.com> <0ee8cf0d-d1c3-3950-34e6-3bccae538c1d@gmail.com> <20170423174121.GA28602@thyrsus.com> <20170423114424.1b70202d@spidey.rellim.com> <20170423190334.GB30937@thyrsus.com> Message-ID: <20170423120946.07528e59@spidey.rellim.com> Yo Eric! On Sun, 23 Apr 2017 15:03:34 -0400 "Eric S. Raymond" wrote: > Gary E. Miller : > > Yo Eric! > > > > On Sun, 23 Apr 2017 13:41:21 -0400 > > "Eric S. Raymond" wrote: > > > > > This doesn't matter in the ntpmon case, it only wants a peer or > > > clock status. The printvars commands in ntpq are what use > > > TYPE_SYS. > > > > It should be a TODO item to add TYPE_SYS things to ntpmon. I miss > > them. > > Where would they fit in the UI, though? I did think about that - > couldn't come up with a design that didn't seem ugly. I'd like a key command, like 'S', that flips the whole screen to display, with updates, the NTP system variables. The things you can only see statically now from "ntpq -c". Then S again flips back to current ntpmon screen. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Sun Apr 23 20:40:57 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Sun, 23 Apr 2017 16:40:57 -0400 Subject: Refclocks and formatting In-Reply-To: <20170423183904.83BAC406063@ip-64-139-1-69.sjc.megapath.net> References: <20170423181700.GB25461@thyrsus.com> <20170423183904.83BAC406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170423204057.GC30937@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > > Mode 6 should really not be shipping those flag bits in hex at all; it > > should decode them into a list of mode strings. That would be the *right* > > fix, but of course it would break mode 6 backward compatibility. > > > I'm tempted to do it anyway. I'm normally extremely wary of doing things > > that might break sysadmin scripts, but this is a rather improbable case - > > somebody would have to be decoding those bits with a thing that isn't ntpq > > or ntpmon itself to be affected. > > We could ship both hex and text. New code could use the text and ignore the > hex. Old code would continue doing whatever it did. Here's the thing - I really don't think old code ever did anything but present this particular item for eyeballing. And the argument against shipping hex is that it's a bad idea to visibly ship bits the interpretation of which could change in the future for reasons beyond our control. That's what makes the PLL bits different from the other status-word bits. The others we control. As long as we don't mess with the definitions in code files, nothing else is going to break. > > We could keep the old behavior under ENABLE_CLASSIC_MODE. Hm. Hal, what do > > you think our actual risk exposure is here? > > That doesn't feel right, but I can't come up with a simple example to > demonstrate why. Not very helpful, there. > I think the exposure is very low. Any geek that has written a mode 6 client > can probably fix it quickly. That's what I think too. > The real problem is that this is just the tip of an iceberg. I'm sure there > will be other fixes we want to make that are not backward compatible.
Is > shipping both a good general policy? If you mean changes to Mode 6, I'm not sure enough that shipping both hex and text would fix anything for us to want to do it. How do we know old code wouldn't barf on the following text? Naive implementations in scripting languages would do that. > Do we need to start tracking what we will support and/or develop a policy for > when we drop support for a slot? I've been obsessive about documenting visible changes; see docs/ntpsec.txt for them all. There have not been many such things, and only one change in Mode 6. There's one context in which it ships a driver name rather than a number now. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Sun Apr 23 20:43:12 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 16:43:12 -0400 Subject: Refclocks and formatting In-Reply-To: <20170423120946.07528e59@spidey.rellim.com> References: <20170423122739.GA25461@thyrsus.com> <0ee8cf0d-d1c3-3950-34e6-3bccae538c1d@gmail.com> <20170423174121.GA28602@thyrsus.com> <20170423114424.1b70202d@spidey.rellim.com> <20170423190334.GB30937@thyrsus.com> <20170423120946.07528e59@spidey.rellim.com> Message-ID: <20170423204312.GD30937@thyrsus.com> Gary E. Miller : > I'd like a key command, like 'S', that flips the whole screen to display, > with updates, the NTP system variables. The things you can only see > statically now from "ntpq -c". But that you could do with a trivial shell script! -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From gem at rellim.com Sun Apr 23 20:59:19 2017 From: gem at rellim.com (Gary E. Miller) Date: Sun, 23 Apr 2017 13:59:19 -0700 Subject: Refclocks and formatting In-Reply-To: <20170423204312.GD30937@thyrsus.com> References: <20170423122739.GA25461@thyrsus.com> <0ee8cf0d-d1c3-3950-34e6-3bccae538c1d@gmail.com> <20170423174121.GA28602@thyrsus.com> <20170423114424.1b70202d@spidey.rellim.com> <20170423190334.GB30937@thyrsus.com> <20170423120946.07528e59@spidey.rellim.com> <20170423204312.GD30937@thyrsus.com> Message-ID: <20170423135919.5f86f7c3@spidey.rellim.com> Yo Eric! On Sun, 23 Apr 2017 16:43:12 -0400 "Eric S. Raymond" wrote: > Gary E. Miller : > > I'd like a key command, like 'S', that flips the whole screen to > > display, with updates, the NTP system variables. The things you > > can only see statically now from "ntpq -c". > > But that you could do with a trivial shell script! A trivial shell script to flip in and out of ntpmon and a screen of ntpq -c? How would you ever flip back out of ntpq to return to ntpmon? A timer, or go to a bash menu? Please nooooo.... No need to worry about it; when it bugs me enough I'll improve ntpmon, so I never need ntpq again. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Sun Apr 23 21:44:20 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Sun, 23 Apr 2017 17:44:20 -0400 Subject: Pivoting In-Reply-To: References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170423214420.GA1141@thyrsus.com> Achim Gratz : > I don't think ntpd needs to do any pivoting _except_ at startup time, where > it is unavoidable and it should attempt to do anything after it has started > up. Huh? Potentially you need to apply a pivot to every packet that comes in; different sources could even have different epochs. > For setting the initial time you'll want to have as many independent bounds > on the time as you can provide, since you potentially cannot trust _any_ of > the possible sources. Short of an authoritative and trusted time that is > within about 68 years of the true time, the only thing you can do is a > maximum likelihood estimate and that always means that there is a non-zero > chance to resolve to the wrong NTP era since any assumption that you make > can turn out to be wrong for any number of reasons. True. But it is not clear what we can do about this other than what we are already doing - that is, pull in lots of sources, discard outliers, make a best guess. > One assumption delivering a single bound. The other assumption is that the > system time is already close enough for ntpd to not need to pivot again. > Getting time over HTTPS[*] could deliver a third venue to start doing a > majority vote. > > [*] http://phk.freebsd.dk/time/20151129.html Instead of trying to use this for precise time, we could try using a Date just at startup to figure out what era we are in. > >What's a reasonable life of a program like ntpd? 20 years seems like the > >right ballpark. After that, we have to check the GPS drivers. The Magnavox > >driver just checks for before. I haven't checked the NMEA driver. 50 would > >be more conservative for IoT type devices. (Many GPS devices lasted long > >enough to hit the week roll over bug.)
Configure options, both build and > >run-time, could help but they would probably be ignored by the few people who > >should use them. > > Given that there are still PDP8 and VAX systems running production plants > (albeit increasingly as virtual machines), I'd say you vastly underestimate > the longevity of something that is mostly out-of-sight and "just works". > Even if you only consider physical hardware, based on the projected lifetime > of automotive qualified systems (15 years or longer) you have to expect a > much longer actual lifetime in the field. Agreed. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From ianbruene at gmail.com Sun Apr 23 21:59:52 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Sun, 23 Apr 2017 16:59:52 -0500 Subject: SNMP Libraries Message-ID: I have sifted through the selection of Python SNMP libraries and narrowed it down to three options. All options install easily through pip, and have Python 2 / 3 compatibility. The options are: pysnmp (4.3.5) Pure Python, implements all three versions of SNMP, appears to contain several kitchen sinks. The Old Established Firm of the field. netsnmp-py (0.3) Python bindings for the netsnmp library. Does *NOT* support SNMP v3 as of yet. From the README "Support for SET and TRAP, as well as SNMPv3 is planned", but the last commit was 9/19/16. easysnmp (0.2.4) [NOT MAINTAINED, Project asking for maintainer] Fork of netsnmp-py, reasons given: netsnmp-py not Pythonic, pysnmp even less Pythonic, pysnmp slow. It would appear that pysnmp is the best choice. Speed should not be an issue for NTPsec's use case, and the other two options are missing functionality or support. In particular, pysnmp is the only option with any clear indication that it isn't going to drop out of existence tomorrow.
-- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From esr at thyrsus.com Sun Apr 23 22:03:36 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 18:03:36 -0400 Subject: SNMP Libraries In-Reply-To: References: Message-ID: <20170423220336.GA1839@thyrsus.com> Ian Bruene : > It would appear that pysnmp is the best choice. Speed should not be > an issue for NTPsec's usecase, and the other two options are missing > functionality or support. In particular, pysnmp is the only option > with any clear indication that it isn't going to drop out of > existence tomorrow. Good reasoning, especially the last sentence. Carry on! -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From gem at rellim.com Sun Apr 23 23:16:40 2017 From: gem at rellim.com (Gary E. Miller) Date: Sun, 23 Apr 2017 16:16:40 -0700 Subject: Pivoting In-Reply-To: <20170423214420.GA1141@thyrsus.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170423214420.GA1141@thyrsus.com> Message-ID: <20170423161640.2267099f@spidey.rellim.com> Yo Eric! On Sun, 23 Apr 2017 17:44:20 -0400 "Eric S. Raymond" wrote: > Achim Gratz : > > I don't think ntpd needs to do any pivoting _except_ at startup > > time, where it is unavoidable and it should attempt to do anything > > after it has started up. > > Huh? Potentially you need to apply a pivot to every packet that > comes in; different sources could even have different epochs. Not really. 2s complement fixes it for packet up to many years apart. Remember, all ntpd uses, after the first time, is time deltas. I would be happy to write test cases that show this. 
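A minimal Python sketch of the two's-complement argument, for illustration only. It is not code from ntpd: ntp_delta and MASK32 are made-up names, and it models only the 32-bit seconds field of a timestamp, not the fraction part of an l_fp. The assumption is simply that both stamps are taken modulo 2**32 and that the true difference is well under 68 years.

```python
# Hypothetical helper; the names here are illustrative, not from ntpd.
MASK32 = 0xFFFFFFFF

def ntp_delta(remote_secs, local_secs):
    """Return remote - local for two 32-bit NTP seconds fields.

    The subtraction is done modulo 2**32 and the result is then
    reinterpreted as a signed two's-complement number, so an era
    rollover between the two stamps does not matter as long as the
    real difference is smaller than 2**31 seconds (about 68 years).
    """
    diff = (remote_secs - local_secs) & MASK32
    if diff >= 0x80000000:      # top bit set: negative in two's complement
        diff -= 0x100000000
    return diff

# Local clock sits exactly at the era rollover (seconds field wrapped to 0).
local = 0x00000000
print(ntp_delta(0xFFFFFC18, local))  # stamp from 1,000 s before the rollover
print(ntp_delta(0x000003E8, local))  # stamp from 1,000 s after the rollover
```

The two calls print -1000 and 1000: the stamp from the old era comes out as a small negative delta with no explicit pivot applied.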
> > For setting the initial time you'll want to have as many > > independent bounds on the time as you can provide, since you > > potentially cannot trust _any_ of the possible sources. > > True. But it is not clear what we can do about this other than what > we are already doing - that is, pull in lots of sources, discard > outliers, make a best guess. If we could mandate using the leap file, can we assume the leap file has been updated in the last 68 years? > > One assumption delivering a single bound. The other assumption is > > that the system time is already close enough for ntpd to not need > > to pivot again. Getting time over HTTPS[*] could deliver a third > > venue to start doing a majority vote. > > > > [*] http://phk.freebsd.dk/time/20151129.html > > Instead of trying to use this for precise time, we could try using a > Date just at startup to figure out what era we are in. Except for ntpd that is not net connected. A more common condition than I would have thought. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Mon Apr 24 00:00:33 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 23 Apr 2017 20:00:33 -0400 Subject: Pivoting In-Reply-To: <20170423161640.2267099f@spidey.rellim.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170423214420.GA1141@thyrsus.com> <20170423161640.2267099f@spidey.rellim.com> Message-ID: <20170424000033.GA2501@thyrsus.com> Gary E. Miller : > Yo Eric! > > On Sun, 23 Apr 2017 17:44:20 -0400 > "Eric S.
Raymond" wrote: > > > Achim Gratz : > > > I don't think ntpd needs to do any pivoting _except_ at startup > > > time, where it is unavoidable and it should attempt to do anything > > > after it has started up. > > > > Huh? Potentially you need to apply a pivot to every packet that > > comes in; different sources could even have different epochs. > > Not really. 2s complement fixes it for packet up to many years > apart. Remember, all ntpd uses, after the first time, is time deltas. I don't understand this remark. We're being shipped timestamps from upstream, not merely deltas. > > Instead of trying to use this for precise time, we could try using a > > Date just at startup to figure out what era we are in. > > Except for ntpd that is not net connected. A more common condition > than I would have thought. I want to make that case work better too. Doesn't mean this couldn't be useful to a normal client-only setup. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From gem at rellim.com Mon Apr 24 01:41:15 2017 From: gem at rellim.com (Gary E. Miller) Date: Sun, 23 Apr 2017 18:41:15 -0700 Subject: Pivoting In-Reply-To: <20170424000033.GA2501@thyrsus.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170423214420.GA1141@thyrsus.com> <20170423161640.2267099f@spidey.rellim.com> <20170424000033.GA2501@thyrsus.com> Message-ID: <20170423184115.7de8a704@spidey.rellim.com> Yo Eric! On Sun, 23 Apr 2017 20:00:33 -0400 "Eric S. Raymond" wrote: > Gary E. Miller : > > Yo Eric! > > > > On Sun, 23 Apr 2017 17:44:20 -0400 > > "Eric S. 
Raymond" wrote: > > > > > Achim Gratz : > > > > I don't think ntpd needs to do any pivoting _except_ at startup > > > > time, where it is unavoidable and it should attempt to do > > > > anything after it has started up. > > > > > > Huh? Potentially you need to apply a pivot to every packet that > > > comes in; different sources could even have different epochs. > > > > Not really. 2s complement fixes it for packet up to many years > > apart. Remember, all ntpd uses, after the first time, is time > > deltas. > > I don't understand this remark. We're being shipped timestamps from > upstream, not merely deltas. Yes, but when we subtract from our local time, truncated to an l_fp, and 2s complement, we end up with a delta on local time. Once we get running, any delta past 'gate' is thrown away. That gate, by default, is just 1,000 seconds. So, after the first big correction, we KNOW the delta is 1,000 seconds or less and 2s complement arithmetic over the epoch rollover is fine. I'd be happy to create some test cases showing the effect. But to really use them we'll need hooks into that area of ntpd. But for example on these dates:

Date          NTP Epoch   NTP Era   NTP Datestamp
1 Jan 1900    0           0         0
7 Feb 2036    0           1         0x100000000

So at the rollover, any time more than 1,000 seconds in the past or the future will be rejected. The only acceptable time stamps will be between:

                     NTP Epoch
EPOCH - 1,000 sec    0xfffffc18
EPOCH + 1,000 sec    0x3e8

Plug those into the hex calculator of your choice and you can see that subtracting those from zero very neatly gives you a plain and simple offset as an l_fp. Subtract 0x3e8 from zero and you get 0xfffffc18, which is conveniently the offset of -1,000 seconds. Feel free to compute the other permutations. > > > Instead of trying to use this for precise time, we could try > > > using a Date just at startup to figure out what era we are in. > > > > Except for ntpd that is not net connected. A more common condition > > than I would have thought.
> > I want to make that case work better too. Doesn't mean this couldn't > be useful to a normal client-only setup. Disconnected is the hard case, get that right and the others are already done. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From Stromeko at Nexgo.DE Mon Apr 24 06:24:44 2017 From: Stromeko at Nexgo.DE (Achim Gratz) Date: Mon, 24 Apr 2017 08:24:44 +0200 Subject: Pivoting In-Reply-To: <20170423214420.GA1141@thyrsus.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170423214420.GA1141@thyrsus.com> Message-ID: > Huh? Potentially you need to apply a pivot to every packet that comes in; > different sources could even have different epochs. Nope. All calculations are done in modulo arithmetic and end up as time deltas to local time. One of the assumptions of a working NTP is that the local times of all systems involved do converge to true time, so those differences are _much_ smaller than 68 years around local time. > True. But it is not clear what we can do about this other than what we are > already doing - that is, pull in lots of sources, discard outliers, make > a best guess. At the minimum we'd need to document the assumptions. As said before, the more independent estimates of time (or rather bounds on the time) you have, the less likely it becomes to guess wrong on the NTP era at startup. -- Achim. (on the road :-) From esr at thyrsus.com Mon Apr 24 13:43:29 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Mon, 24 Apr 2017 09:43:29 -0400 Subject: ntpq peers formatting needs floating point for time slots. In-Reply-To: <87vasolpfu.fsf@Rainer.invalid> References: <20170204224916.3DBF1406061@ip-64-139-1-69.sjc.megapath.net> <87efzdm2i3.fsf@Rainer.invalid> <20170205114616.GA9376@thyrsus.com> <87zii0ltom.fsf@Rainer.invalid> <20170205131606.GA10522@thyrsus.com> <87vasolpfu.fsf@Rainer.invalid> Message-ID: <20170424134329.GA17849@thyrsus.com> Achim Gratz : > > You don't have the asynchronous option either. The protocol is lockstep and > > reverse-lookup on an address can cause long client-side stalls. > > I can already say I just want the IP addresses. I can get those from > the (possibly remote) ntpd very fast and send off the reverse DNS lookup > asynchronously from ntpq. If I don't get the result fast enough, I just > leave the IP in the output. And yes, that means the actual querying of > ntpd, the processing of the answers and the combination of the output is > going to happen asynchronously (not necessarily in parallel, but not > sequential). If you still think this is desirable, ship Ian Bruene a detailed proposal. He pretty much owns the ntpq/ntpmon display code now. One possible technical blocker: I don't know if async DNS lookups are doable from Python. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Mon Apr 24 15:09:34 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 24 Apr 2017 11:09:34 -0400 Subject: DEBUG in ntpsec In-Reply-To: <20170414102719.84FEC40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170414102719.84FEC40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170424150934.GA19529@thyrsus.com> Hal Murray : > > The default was --enable-debug. A while ago, that was changed to > --disable-debug. 
> > I think we should reconsider that and/or this whole area. > > There are several things all lumped together under --enable-debug and/or > --enable-debug-gdb > > One is a bunch of optional compiler checking options - the stuff Gary is > working on now. > > Another is not stripping symbols and whatever is needed for using gdb. > > Another is a bunch of run time sanity checks - things like crash if foo is > NULL. > > Another is a bunch of optional printing. This is useful for chasing obscure > bugs. You can run ntpd from the command line with -n and -d or -D and you > get lots of printout. This allows getting more info to chase some problems > without rebuilding ntpd. > > We should probably measure the size difference and/or run time differences. > The latter will take something like a busy pool server. I think there are some sound ideas here, but either (a) you'll have to implement them yourself, or (b) if you want someone else to do it, you'll need to put a much more detailed specification in an RFE on the tracker. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Mon Apr 24 15:26:35 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 24 Apr 2017 11:26:35 -0400 Subject: What does --disable-kernel-PLL do? In-Reply-To: <20161202072923.CCCB4406061@ip-64-139-1-69.sjc.megapath.net> References: <20161202060513.GA25702@thyrsus.com> <20161202072923.CCCB4406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170424152635.GA19718@thyrsus.com> Responding to very old mail, from December... Hal Murray : > > > Here's the inside view from looking at the code: --disable-kernel-PLL turns > > off the use of ntp_adjtime() to slew time, leaving adjustments to be done at > > much coarser granularity by the old-style adjtime(2) call. > > How about we rename it to --disable-ntp_adjtime? 
I'm with you in spirit, but if we create an option name that mixes dashes and underscores we'll be cursed forever because nobody will be able to remember which way to type it for longer than three seconds after looking at the docs. If --disable-adjtime is acceptable to you, I'm good with that. You can either do it yourself or file an RFE assigned to the build recipe owner (which is Matt). > I assume we will get the same effect if ntp_adjtime is not defined in the > headers. Yes. > I stumbled into one case where it is actually interesting. NetBSD on > Raspberry Pi has ntp_adjtime in the headers but it's not implemented in the > kernel. So a default build of our code crashes. Bletch. Well, that answers the question of why we need to keep the option at all. If this is still a problem with current NetBSD, please add a note to INSTALL about it. > >> There is a tangle in this area that I don't understand. When ntpd > >> exits (or crashes), it leaves the previous state in the kernel so anybody > >> running ntptime will think things are fine. > > > What previous state? > > The state in the kernel when ntpd last called ntp_adjtime. The kernel > doesn't know that ntpd is steering time so it can't tell that a > stopped/crashed ntpd won't make another call in the future. > > > [Kernel bug - not returning STA_NANO when status is UNSYNC.] > > Yes, and it makes a fix difficult. Do you have any recommended action? > > Just add a comment to that area. It's probably worth nuking the first time > setup call just to simplify things. The first call will run in micro mode. > That will update the nano flag so the second call will work as expected. Please do that. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. 
From ianbruene at gmail.com Mon Apr 24 15:35:25 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Mon, 24 Apr 2017 10:35:25 -0500 Subject: ntpq peers formatting needs floating point for time slots. In-Reply-To: <20170424134329.GA17849@thyrsus.com> References: <20170204224916.3DBF1406061@ip-64-139-1-69.sjc.megapath.net> <87efzdm2i3.fsf@Rainer.invalid> <20170205114616.GA9376@thyrsus.com> <87zii0ltom.fsf@Rainer.invalid> <20170205131606.GA10522@thyrsus.com> <87vasolpfu.fsf@Rainer.invalid> <20170424134329.GA17849@thyrsus.com> Message-ID: On 04/24/2017 08:43 AM, Eric S. Raymond wrote: > One possible technical blocker: I don't know if async DNS lookups are doable > from Python. Doesn't look like async DNS is possible in stock Python. There is a library with python bindings to do it (*another* dependency), and there is always the option of brute forcing the issue using a helper script. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From esr at thyrsus.com Mon Apr 24 15:37:41 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 24 Apr 2017 11:37:41 -0400 Subject: ntpq update In-Reply-To: <20161223012904.BE6D6406063@ip-64-139-1-69.sjc.megapath.net> References: <20161222232927.GB20526@thyrsus.com> <20161223012904.BE6D6406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170424153741.GA19815@thyrsus.com> Old mail... Hal Murray : > > esr at thyrsus.com said: > >> The frags= and limit= on the mru command are only used > >> for the first batch. I'd like them to stick. > > > There's a computation of those for second and later span requests that I > > transcribed from the C, down around line 1287 in packet.py. I'm very > > reluctant to mess with it; it's at the far end of some logic that I don't > > understand that seems to be trying to adapt to network conditions or server > > errors or *something*. > > There is nothing magic there. 
We should add some debug printout so we can > see what is actually going on. > > But if the user specifies how big, either by packets or slot count, I'd like > to stick with that size, at least until we get a better idea. I think it > will make more sense after we collect and print some statistics for total > packets and retransmissions. Then we can do some experiments. Please file a bug and assign it to Ian Bruene. I'm handing off ntpq stuff to him now, and this is just the kind of grubby experience he can use to hone his problem-solving skills. Ian, the tricky part here is going to be reading code (not writing it) to understand what that adaptive logic is doing. It's pushing further in the direction of the Mode 6 work you've been doing. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Mon Apr 24 15:39:56 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 24 Apr 2017 11:39:56 -0400 Subject: ntpq peers formatting needs floating point for time slots. In-Reply-To: References: <20170204224916.3DBF1406061@ip-64-139-1-69.sjc.megapath.net> <87efzdm2i3.fsf@Rainer.invalid> <20170205114616.GA9376@thyrsus.com> <87zii0ltom.fsf@Rainer.invalid> <20170205131606.GA10522@thyrsus.com> <87vasolpfu.fsf@Rainer.invalid> <20170424134329.GA17849@thyrsus.com> Message-ID: <20170424153956.GA20018@thyrsus.com> Ian Bruene : > > > On 04/24/2017 08:43 AM, Eric S. Raymond wrote: > >One possible technical blocker: I don't know if async DNS lookups are doable > >from Python. > > Doesn't look like async DNS is possible in stock Python. There is a library > with python bindings to do it (*another* dependency), and there is always > the option of brute forcing the issue using a helper script. Too complicated for now. We have other things that need doing more. -- Eric S. 
Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Mon Apr 24 16:13:12 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 24 Apr 2017 12:13:12 -0400 Subject: mrulist direct mode, monitoring pool servers In-Reply-To: <20161222022807.EF60B406063@ip-64-139-1-69.sjc.megapath.net> References: <20161222010757.GA7376@thyrsus.com> <20161222022807.EF60B406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170424161312.GA20780@thyrsus.com> Very old mail... Hal Murray : > esr at thyrsus.com said: > >> If we can't go fast enough, we should be able to get some of the > >> data and/or some estimates of how much we are missing. > > > Some of the data, yes. As the Mode 6 protocol is designed I don't see how > > to get good estimates. > > > On the other hand, I can imagine an inexpensive protocol extension that > > would help a lot - adding a tag to the front of each span that reports the > > MRU-list size at the time of transmission. If your client sees this number > > rising rather than falling during a span sequence then you can at least be > > warned that you're probably in a losing race. > > The size of the list isn't what you want. If slots are getting recycled, the list size will be constant. > > I think we can get the numbers we need, not through the mru protocol but by through something like monstats. It doesn't have what we need, but we can fix that. > > [memoryusage] > > Can't easily see it being a big problem in the normal mode either, frankly. > > By definition the client memory usage has to be linearly related to the > > memory usage on the server, and even in Python I don't think the constant of > > proportionality can be very large. I'd guess around 2x-3x. > > The context is running on a cloud server where they charge for memory. The server has a lot of that. 
A few experiments showed that collecting needed more than was available. I'll try to get more data if you think it's important. > > > >> Any suggestions for a UI/CLI? > > Not before seeing the patch, no. > > Current UI is a "direct" command to ntpq that sets a flag which gets passed in to the worker code which then prints each batch out on the fly. I've lost the context of this discussion. Is there some related thing that still needs to be done to the MRUlist code? -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Mon Apr 24 16:36:17 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 24 Apr 2017 12:36:17 -0400 (EDT) Subject: Standard set of terms for precision, accuracy, related concepts. Message-ID: <20170424163617.AF72013A021B@snark.thyrsus.com> Having almost recovered from post-viral fatigue syndrome, I have enough energy to work now and am attempting to clear out my old project-related mail. I'd like to have my decks cleared for the face-to-face team meeting this coming Saturday - don't know if I'll actually manage that but I'm giving it a mighty effort. One of the unresolved threads that has been mouldering in my mailbox is Gary Miller and Achim Gratz arguing about what terms to use in the numerical-analysis parts of our comments and documentation. I agree that it would be desirable to have a uniform set of terms; that area is bewildering enough without additional terminological confusion. This has been sitting there unaddressed because I more or less own the documentation end of things and have been telling myself that I ought to digest that entire thread and write a glossary. I am now confronting the fact that this was staggeringly unrealistic of me considering everything else on my plate. So here's what I'm going to do... 
Achim, you and Gary *both* get to write glossaries covering terms like precision, accuracy, drift, and related stuff. Give it your best shot(s). If, after a reasonable period of time, I have a glossary only from one of you, that person wins and the glossary gets blessed and added to the official documentation. If you both step up, I will engage the two of you in Socratic dialogue until we have a merged version. Note however that I am neither willing nor able to allow this process to become an infinite time sink. If I think you two are arguing in circles, I will ruthlessly terminate the process and bless something you may not like. Gentlemen, start your engines... -- Eric S. Raymond Rapists just *love* unarmed women. And the politicians who disarm them. From kurt at roeckx.be Mon Apr 24 16:49:39 2017 From: kurt at roeckx.be (Kurt Roeckx) Date: Mon, 24 Apr 2017 18:49:39 +0200 Subject: Standard set of terms for precision, accuracy, related concepts. In-Reply-To: <20170424163617.AF72013A021B@snark.thyrsus.com> References: <20170424163617.AF72013A021B@snark.thyrsus.com> Message-ID: <20170424164939.rqbkxnob6ueia7uy@roeckx.be> On Mon, Apr 24, 2017 at 12:36:17PM -0400, Eric S. Raymond wrote: > Having almost recovered from post-viral fatigue syndrome, I have > enough energy to work now and am attempting to clear out my old > project-related mail. I'd like to have my decks cleared for the > face-to-face team meeting this coming Saturday - don't know if > I'll actually manage that but I'm giving it a mighty effort. > > One of the unresolved threads that has been mouldering in my > mailbox is Gary Miller and Achim Gratz arguing about what terms > to use in the numerical-analysis parts of our comments and > documentation. I agree that it would be desirable to have > a uniform set of terms; that area is bewildering enough without > additional terminological confusion.
> > This has been sitting there unaddressed because I more or less own > the documentation end of things and have been telling myself that I > ought to digest that entire thread and write a glossary. I am now > confronting the fact that this was staggeringly unrealistic of > me considering everything else on my plate. > > So here's what I'm going to do... > > Achim, you and Gary *both* get to write glossaries covering terms like > precision, accuracy, drift, and related stuff. Give it your best > shot(s). If, after a reasonable period of time, I have a glossary > only from one of you, that person wins and the glossary gets blessed > and added to the official documentation. > > If you both step up, I will engage the two of you in Socratic dialogue > until we have a merged version. Note however that I am neither > willing nor able to allow this process to become an infinite time > sink. If I think you two are arguing in circles, I will ruthlessly > terminate the process and bless something you may not like. I think one of the problems is that such terms are used in many different places. And you might need to be more specific about which one you mean. That is, there might be variables called precision, but not all of them mean exactly the same. A text that works for one might be wrong for the other. So I suggest you start by finding out where you all use such terms. Kurt From gem at rellim.com Mon Apr 24 16:56:00 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 24 Apr 2017 09:56:00 -0700 Subject: ntpq peers formatting needs floating point for time slots. In-Reply-To: References: <20170204224916.3DBF1406061@ip-64-139-1-69.sjc.megapath.net> <87efzdm2i3.fsf@Rainer.invalid> <20170205114616.GA9376@thyrsus.com> <87zii0ltom.fsf@Rainer.invalid> <20170205131606.GA10522@thyrsus.com> <87vasolpfu.fsf@Rainer.invalid> <20170424134329.GA17849@thyrsus.com> Message-ID: <20170424095600.3c897f37@spidey.rellim.com> Yo Ian!
On Mon, 24 Apr 2017 10:35:25 -0500 Ian Bruene wrote: > On 04/24/2017 08:43 AM, Eric S. Raymond wrote: > > One possible technical blocker: I don't know if async DNS lookups > > are doable from Python. > > Doesn't look like async DNS is possible in stock Python. There is a > library with Python bindings to do it (*another* dependency), and > there is always the option of brute forcing the issue using a helper > script. Python can do 'threading'. Look in the doc for the threading module. Just create a thread, have the thread do a DNS lookup with socket.gethostbyaddr(ip_address). When the thread gets the answer it saves it and sets a flag. The main loop just goes about its other tasks, checking for the flag now and then. There will be details like having the thread time out after a while, etc. But prolly 50 lines of code or less. I put 'threading' in quotes because it is not full blown concurrent threading, just cooperative multi-tasking on a single core. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Mon Apr 24 17:07:47 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 24 Apr 2017 10:07:47 -0700 Subject: Standard set of terms for precision, accuracy, related concepts. In-Reply-To: <20170424164939.rqbkxnob6ueia7uy@roeckx.be> References: <20170424163617.AF72013A021B@snark.thyrsus.com> <20170424164939.rqbkxnob6ueia7uy@roeckx.be> Message-ID: <20170424100747.5aaf730a@spidey.rellim.com> Yo Kurt!
On Mon, 24 Apr 2017 18:49:39 +0200 Kurt Roeckx wrote: > I think one of the problems is that such terms are used at many > different places. And you might need to be more specific at which > one you mean. That is, there might be variables called precision, > but not all of them mean exactly the same. A text that works for > one might be wrong for the other. Yup. 'precision' is used in many places in NTP for several different and incompatible concepts. And 'accuracy' is a useless and overbroad term for NTP. NTP talks instead of 'offset' and 'jitter' computed between what ntpd thinks the time is and each of its peers, servers, refclocks and local system clock. > So I suggest you start by finding out where you all use such > terms. And that task has been ongoing. Ian has spent a lot of time just documenting the units for many variables. Documenting what they actually do will be a very large job. So for my submission, I'll stand by the glossary at the bottom of the ntpviz HTML output. Included below for reference, but the HTML formatting is lost. No need to create yet another document. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin Glossary: frequency offset: The difference between the ntpd calculated frequency and the local system clock frequency (usually in parts per million, ppm) jitter, dispersion: The short term change in a value ms, millisecond: One thousandth of a second = 0.001s mu, mean: The arithmetic mean: the sum of all the values divided by the number of values. ns, nanosecond: One billionth of a second, also one thousandth of a microsecond, 0.000000001s. percentile: The value below which a given percentage of values fall. ppb, parts per billion: Ratio between two values.
The following are all the same: 1 ppb, one in one billion, 1/1,000,000,000, 0.000,000,001, and 0.000,000,1% ppm, parts per million: Ratio between two values. The following are all the same: 1 ppm, one in one million, 1/1,000,000, 0.000,001, and 0.000,1% ppt, parts per thousand: Ratio between two values. The following are all the same: 1 ppt, one in one thousand, 1/1,000, 0.001, and 0.1% refclock: Reference clock, a local GPS module or other local source of time. remote clock: Any clock reached over the network, LAN or WAN. Also called a peer or server. time offset: The difference between the ntpd calculated time and the local system clock's time. Also called phase offset. upstream clock: Any remote clock or reference clock used as a source of time. σ, sigma: Sigma denotes the standard deviation (SD) and is centered on the arithmetic mean of the data set. The SD is simply the square root of the variance of the data set. Two sigma is simply twice the standard deviation. Three sigma is three times sigma. Smaller is better. µs, us, microsecond: One millionth of a second, also one thousandth of a millisecond, 0.000,001s. From esr at thyrsus.com Mon Apr 24 17:15:38 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 24 Apr 2017 13:15:38 -0400 Subject: The state of NTPsec as I see it In-Reply-To: <20170123004759.29D40406061@ip-64-139-1-69.sjc.megapath.net> References: <20170122173409.GC17686@thyrsus.com> <20170123004759.29D40406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170424171538.GA21795@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > >> I think the ntpq retransmission logic is still broken. > > Urgh. Do you know any way to reproduce this? > > I've had the reverse problem. It always strikes when I'm trying to get real > work done.
> > My test case is just the typical network dropping packets. Try mrulist on a > remote system with a big list. > > I haven't looked at ntpq recently. If there is a single place where it reads > packets, you can hack that to randomly drop packets. That will let you test > things on a local LAN where the network doesn't drop packets. > > Or get some flaky WiFi gear. Maybe move far enough away. Small USB WiFi > chips are not expensive. You can add one to a Raspberry Pi. > > Plan B would be to hack the server side. I think the mode 6 responses all go > through a single place. > > The restrict stuff has a +flake+ option. It's in docs/includes/access-command > s.txt > That will drop request packets arriving at the server. (I think, not tested) > That might be enough. Or maybe we need the case where ntpq gets a partial > answer. > +flake+;; > Discard received NTP packets with probability 0.1; that is, on > average drop one packet in ten. This is for testing and amusement. > The name comes from Bob Braden's _flakeway_, which once did a > similar thing for early Internet testing. > > -------- > > https://tools.ietf.org/html/rfc1025 > Some test are made more interesting by the use of a "flakeway". A > flakeway is a purposely flakey gateway. It should have control > parameters that can be adjusted while it is running to specify a > percentage of datagrams to be dropped, a percentage of datagrams to > be corrupted and passed on, and a percentage of datagrams to be > reordered so that they arrive in a different order than sent. I fear we dropped this on the floor. If it's still an issue, please file a bug on the tracker. Be as specific as possible; it might be Ian rather than me working on it. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. 
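The "hack the read path to randomly drop packets" idea quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: nothing here is taken from the ntpq or ntpd sources, the name `flaky` and its parameters are hypothetical, and the 0.1 default merely mirrors the drop probability documented for the +flake+ restrict option.

```python
import random


def flaky(packets, drop_probability=0.1, rng=None):
    """Yield packets from an iterable, discarding each with the given probability.

    Hypothetical test helper: wrap whatever loop hands received packets to
    the client so that retransmission logic can be exercised on a clean LAN.
    """
    rng = rng or random.Random()
    for packet in packets:
        if rng.random() < drop_probability:
            continue  # simulate a packet lost in transit
        yield packet


if __name__ == "__main__":
    # A seeded generator makes a test run repeatable.
    received = [b"fragment-%d" % n for n in range(20)]
    survivors = list(flaky(received, drop_probability=0.1, rng=random.Random(42)))
    print(len(received) - len(survivors), "of", len(received), "packets dropped")
```

Passing a seeded `random.Random` keeps a failing run reproducible, which matters when the bug being chased only shows up under loss.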
From kurt at roeckx.be Mon Apr 24 17:30:03 2017 From: kurt at roeckx.be (Kurt Roeckx) Date: Mon, 24 Apr 2017 19:30:03 +0200 Subject: Standard set of terms for precision, accuracy, related concepts. In-Reply-To: <20170424100747.5aaf730a@spidey.rellim.com> References: <20170424163617.AF72013A021B@snark.thyrsus.com> <20170424164939.rqbkxnob6ueia7uy@roeckx.be> <20170424100747.5aaf730a@spidey.rellim.com> Message-ID: <20170424173002.zmg4o7kvcoajqr7o@roeckx.be> On Mon, Apr 24, 2017 at 10:07:47AM -0700, Gary E. Miller wrote: > frequency offset: > The difference between the ntpd calculated frequency and the local system clock frequency (usually in parts per million, ppm) > jitter, dispersion: > The short term change in a value For me jitter is about changes in the interval. Some examples which I think are relevant are: - For a PPS signal, the signal generated by the source is not exactly 1 second each time, but on average it's 1 second. - For the same PPS signal, there is a variable delay between the source generating the edge and the PC having seen it and measured the time. - A packet sent over a network doesn't always have the same delay. Dispersion, for me, is the standard deviation. Kurt From gem at rellim.com Mon Apr 24 19:40:21 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 24 Apr 2017 12:40:21 -0700 Subject: Standard set of terms for precision, accuracy, related concepts. In-Reply-To: <20170424173002.zmg4o7kvcoajqr7o@roeckx.be> References: <20170424163617.AF72013A021B@snark.thyrsus.com> <20170424164939.rqbkxnob6ueia7uy@roeckx.be> <20170424100747.5aaf730a@spidey.rellim.com> <20170424173002.zmg4o7kvcoajqr7o@roeckx.be> Message-ID: <20170424124021.676a97d8@spidey.rellim.com> Yo Kurt! On Mon, 24 Apr 2017 19:30:03 +0200 Kurt Roeckx wrote: > On Mon, Apr 24, 2017 at 10:07:47AM -0700, Gary E.
Miller wrote: > > frequency offset: > > The difference between the ntpd calculated frequency and the > > local system clock frequency (usually in parts per million, ppm) > > jitter, dispersion: The short term change in a value > > For me jitter is about changes in the interval. Some examples which > I think are relevant are: > - For a PPS signal, the signal generated by the source is not > exactly 1 second each time, but on average it's 1 second. > - For the same PPS signal, there is a variable delay between > the source generating the edge and the PC having seen it > and measured the time. > - A packet sent over a network doesn't always have the same delay. So put that in a nice two-sentence explanatory paragraph. Something that fits at the bottom of an ntpviz page, suitable for newbies. > dispersion for me is the standard deviation Like yes, but standard deviation is a very precise math formula.
σ = standard deviation
xi = each value of dataset
x̄ = the arithmetic mean of the data (This symbol will be indicated as mean from now)
N = the total number of data points
Σ (xi - mean)^2 = The sum of (xi - mean)^2 for all datapoints
Leading to: σ = √[ Σ(xi - mean)^2 / N ]
Very few sqrt() in ntpd, and none near dispersion. Conceptually they are similar, measures of the population variance, but σ = standard deviation is a very specific thing not used in ntpd. σ is used in ntpviz. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From Stromeko at nexgo.de Mon Apr 24 20:25:05 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Mon, 24 Apr 2017 22:25:05 +0200 Subject: Something is buggy with maxpoll... Message-ID: <87fugxlfj2.fsf@Rainer.invalid> Something is seriously wrong with the maxpoll handling, I'm not sure if that's a new issue or when it was introduced. I've set my local NTP servers to monitor each other at maxpoll=4 (16s). If one of those servers doesn't respond at the regular poll interval (for instance when I restart the server at that moment) it looks like the poll interval changes to at least 1024s and never recovers back to the actual maxpoll setting. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ DIY Stuff: http://Synth.Stromeko.net/DIY.html From hmurray at megapathdsl.net Mon Apr 24 20:27:18 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 24 Apr 2017 13:27:18 -0700 Subject: Refclocks and formatting In-Reply-To: Message from "Eric S. Raymond" of "Sun, 23 Apr 2017 16:40:57 EDT." <20170423204057.GC30937@thyrsus.com> Message-ID: <20170424202718.DFA12406063@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > If you mean changes to Mode 6, I'm not sure enough that shipping both hex > and text would fix anything to we want to do it. How do we know old code > wouldn't barf on the following text? Naive implementations in scripting > languages would do that. If you look at the old code up a level from the detailed chunk that sends each field, you find that there is logic to scramble the order and send bogus extra fields. The intention was to make sure client code didn't do the simple things that would break when fields are added. So I think that old code will keep working as long as we keep sending the old fields without changing them. Or at least any somewhat sane code. I don't think we have to worry about the rest. 
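The point about field order can be illustrated with a small client-side sketch: a client that looks fields up by name, rather than by position, is unaffected by reordered or extra fields. This is a simplified illustration only. It assumes a response body made of comma-separated name=value items with optional double-quoted values; the function name `parse_varlist` and the sample field names and values are hypothetical, not taken from the ntpq sources.

```python
import re

# One "name" or "name=value" item. A double-quoted value may contain commas.
ITEM = re.compile(r'\s*([A-Za-z_][\w.]*)\s*(?:=\s*("[^"]*"|[^,]*))?\s*(?:,|$)')


def parse_varlist(text):
    """Return a dict of name -> value string, independent of item order."""
    variables = {}
    for match in ITEM.finditer(text):
        name = match.group(1)
        value = match.group(2) or ""  # a bare name has no value
        variables[name] = value.strip().strip('"')
    return variables


if __name__ == "__main__":
    original = 'leap=0, stratum=2, srcadr="192.0.2.1", offset=0.125'
    # Same data, scrambled order, plus one field the client has never heard of.
    shuffled = 'bogus=17, offset=0.125, srcadr="192.0.2.1", stratum=2, leap=0'
    wanted = ("leap", "stratum", "offset")
    first, second = parse_varlist(original), parse_varlist(shuffled)
    print([first[name] for name in wanted] == [second[name] for name in wanted])
```

A client written this way keeps working when new fields are added, which is the property the scrambling logic was meant to enforce.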
esr at thyrsus.com said: >> Do we need to start tracking what we will support and/or develop >> a policy for when we drop support for a slot? > I've been obsessive about documenting visible changes; see docs/ntpsec.txt > for them all. There have not been many such things, and only one change in > Mode 6. There's one context in which it ships a driver name rather than a > number now. What I was trying to suggest was a list of the fields and which ones came from ntp classic (call it version 0.0) and which ones we added and which version we added them in and which fields they replace and/or have been replaced by and/or some policy about how long we support each version. I'm assuming that we can add things to test them without a lot of discussion, but maybe we should think twice before we ship an actual released version that has them since that commits us to support for a long time. -- These are my opinions. I hate spam. From gem at rellim.com Mon Apr 24 23:52:47 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 24 Apr 2017 16:52:47 -0700 Subject: Something is buggy with maxpoll... In-Reply-To: <87fugxlfj2.fsf@Rainer.invalid> References: <87fugxlfj2.fsf@Rainer.invalid> Message-ID: <20170424165247.4cb3bda6@spidey.rellim.com> Yo Achim! On Mon, 24 Apr 2017 22:25:05 +0200 Achim Gratz wrote: > Something is seriously wrong with the maxpoll handling, I'm not sure > if that's a new issue or when it was introduced. I've set my local > NTP servers to monitor each other at maxpoll=4 (16s). If one of those > servers doesn't respond at the regular poll interval (for instance > when I restart the server at that moment) it looks like the poll > interval changes to at least 1024s and never recovers back to the > actual maxpoll setting. Unable to duplicate. Here is what I did: 1. server has a peer, minpoll=maxpoll=3 2. server has today's git head. 2. peer has reach of 377. 3. turn off ntpd on the peer. 4. wait for reach to that peer to be 0. 5. wait some more. 6. 
turn on ntpd on peer. 7. reach returns to 377. 8. poll as shown in ntpmon and ntpq stayed always at 32. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Tue Apr 25 01:56:21 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 24 Apr 2017 21:56:21 -0400 Subject: Refclocks and formatting In-Reply-To: <20170424202718.DFA12406063@ip-64-139-1-69.sjc.megapath.net> References: <20170423204057.GC30937@thyrsus.com> <20170424202718.DFA12406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425015621.GD25799@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > > If you mean changes to Mode 6, I'm not sure enough that shipping both hex > > and text would fix anything for us to want to do it. How do we know old code > > wouldn't barf on the following text? Naive implementations in scripting > > languages would do that. > > If you look at the old code up a level from the detailed chunk that sends > each field, you find that there is logic to scramble the order and send bogus > extra fields. The intention was to make sure client code didn't do the > simple things that would break when fields are added. > > So I think that old code will keep working as long as we keep sending the old > fields without changing them. Or at least any somewhat sane code. I don't > think we have to worry about the rest. Ah, I think I misunderstood your proposal. You're suggesting adding *new* response components that have unpacked strings in them. That makes sense. I thought you were suggesting something else.
I'll add this to the list of potential work items I'm putting together. > esr at thyrsus.com said: > >> Do we need to start tracking what we will support and/or develop > >> a policy for when we drop support for a slot? > > > I've been obsessive about documenting visible changes; see docs/ntpsec.txt > > for them all. There have not been many such things, and only one change in > > Mode 6. There's one context in which it ships a driver name rather than a > > number now. > > What I was trying to suggest was a list of the fields and which ones came > from ntp classic (call it version 0.0) and which ones we added and which > version we added them in and which fields they replace and/or have been > replaced by and/or some policy about how long we support each version. > > I'm assuming that we can add things to test them without a lot of discussion, > but maybe we should think twice before we ship an actual released version > that has them since that commits us to support for a long time. See "Compatibility Notes" in docs/mode6.txt. It's a start, anyway - no policy there. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 25 05:41:03 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 24 Apr 2017 22:41:03 -0700 Subject: Errors and warnings from head Message-ID: <20170425054104.0752A406063@ip-64-139-1-69.sjc.megapath.net> NetBSD: TEST(lfpfunc, Absolute) PASS TEST(lfpfunc, FDF_RoundTrip)../../tests/libntp/lfpfunc.c:270::FAIL: Expected 0.0 Was 2147482624.0. \nop2: 2147483647.500000 op3: 1023.50000000 diff 2147482624.000000 not within 2.384186e-07 TEST(lfpfunc, SignedRelOps) PASS Debian wheezy: ../../ntpd/ntp_config.c:2955:2: warning: implicit declaration of function ‘yyparse’
[-Wimplicit-function-declaration] ../../ntpd/ntp_config.c:3020:2: error: ‘yydebug’ undeclared (first use in this function) CentOS release 6.9 (Final): ../../ntpd/ntp_monitor.c:156: warning: inlining failed in call to ‘mon_reclaim_entry’: call is unlikely and code size would grow ../../ntpd/ntp_monitor.c:454: warning: called from here ../../ntpd/ntp_monitor.c:156: warning: inlining failed in call to ‘mon_reclaim_entry’: call is unlikely and code size would grow ../../ntpd/ntp_monitor.c:467: warning: called from here ../../include/timespecops.h:334: warning: inlining failed in call to ‘tspec_stamp_to_lfp’: call is unlikely and code size would grow ../../ntpd/ntp_config.c:3020: error: ‘yydebug’ undeclared (first use in this function) ../../ntpd/ntp_config.c:3020: error: (Each undeclared identifier is reported only once ../../ntpd/ntp_config.c:3020: error: for each function it appears in.) NetBSD 7: ../../ntpfrob/tickadj.c:30:6: warning: function might be candidate for attribute 'noreturn' [-Wsuggest-attribute=noreturn] HAVE_ADJTIMEX is not defined so it does an unconditional exit(1) From Raspberry Pi and BeagleBone Black: ../../ntpd/ntp_control.c: In function ‘ctl_error’: ../../ntpd/ntp_control.c:714:35: warning: cast increases required alignment of target type [-Wcast-align] maclen = authencrypt(res_keyid, (uint32_t *)&rpkt, ^ ../../ntpd/ntp_control.c: In function ‘process_control’: ../../ntpd/ntp_control.c:830:35: warning: cast increases required alignment of target type [-Wcast-align] else if (authdecrypt(res_keyid, (uint32_t *)pkt, ^ and many more, including some in the test area -- These are my opinions. I hate spam. From Stromeko at nexgo.de Tue Apr 25 05:52:50 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 25 Apr 2017 07:52:50 +0200 Subject: Something is buggy with maxpoll... References: <87fugxlfj2.fsf@Rainer.invalid> <20170424165247.4cb3bda6@spidey.rellim.com> Message-ID: <87y3upghjh.fsf@Rainer.invalid> Gary E.
Miller writes: > Unable to duplicate. Here is what I did: > > 1. server has a peer, minpoll=maxpoll=3 I've not explicitly set minpoll. > 2. server has today's git head. I've just put the newest version on all four boxen. I should perhaps mention that the problem started occurring once I started monitoring all other machines on the network (plus a handful of stratum-1 from outside) from each of these. They are all connected to the same switch (I've tried to connect one of these directly to my router to see if it makes a difference and it did not). This behaviour makes it very difficult to restart the ntpd on any of these, since I will often have to restart the ntpd on another one and then go back and see if the others are still working correctly. > 8. poll as shown in ntpmon and ntpq stayed always at 32. The displayed poll value never changes. But the actual poll interval obviously gets much longer than what was set. I've seen over 50 minutes without a poll (and the reach flags stay at their previous value instead of shifting out every 16 seconds as they should). This smells like an uninitialised variable somewhere that then picks up a random value? Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf Q+, Q and microQ: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From gem at rellim.com Tue Apr 25 06:20:53 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 24 Apr 2017 23:20:53 -0700 Subject: Errors and warnings from head In-Reply-To: <20170425054104.0752A406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425054104.0752A406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170424232053.2212217f@spidey.rellim.com> Yo Hal! On Mon, 24 Apr 2017 22:41:03 -0700 Hal Murray wrote: > NetBSD: > TEST(lfpfunc, Absolute) PASS > TEST(lfpfunc, FDF_RoundTrip)../../tests/libntp/lfpfunc.c:270::FAIL: > Expected 0.0 > Was 2147482624.0.
\nop2: 2147483647.500000 op3: 1023.50000000 diff > 2147482624.000 > 000 not within 2.384186e-07 > TEST(lfpfunc, SignedRelOps) PASS Issue #264 > Debian wheezy: > ../../ntpd/ntp_config.c:2955:2: warning: implicit declaration of > function ‘yyparse’ [-Wimplicit-function-declaration] > ../../ntpd/ntp_config.c:3020:2: error: ‘yydebug’ undeclared > (first use in this function) Do you have --enable-warnings on? Known upstream Bison bug. > CentOS release 6.9 (Final): > ../../ntpd/ntp_monitor.c:156: warning: inlining failed in call to > ‘mon_reclaim_entry’: call is unlikely and code size would grow > ../../ntpd/ntp_monitor.c:454: warning: called from here > ../../ntpd/ntp_monitor.c:156: warning: inlining failed in call to > ‘mon_reclaim_entry’: call is unlikely and code size would grow > ../../ntpd/ntp_monitor.c:467: warning: called from here > ../../include/timespecops.h:334: warning: inlining failed in call to > ‘tspec_stamp_to_lfp’: call is unlikely and code size would grow > ../../ntpd/ntp_config.c:3020: error: ‘yydebug’ undeclared (first > use in this function) > ../../ntpd/ntp_config.c:3020: error: (Each undeclared identifier is > reported only once > ../../ntpd/ntp_config.c:3020: error: for each function it appears in.) Issue #274 > NetBSD 7: > ../../ntpfrob/tickadj.c:30:6: warning: function might be candidate > for attribute 'noreturn' [-Wsuggest-attribute=noreturn] > HAVE_ADJTIMEX is not defined so it does an unconditional exit(1) That is a new one. I guess related to your earlier no adjtimex() bug on NetBSD. Have you created an issue for the adjtimex() thing yet?
> From Raspberry Pi and BeagleBone Black: > ../../ntpd/ntp_control.c: In function ‘ctl_error’: > ../../ntpd/ntp_control.c:714:35: warning: cast increases required > alignment of target type [-Wcast-align] > maclen = authencrypt(res_keyid, (uint32_t *)&rpkt, > ^ > ../../ntpd/ntp_control.c: In function ‘process_control’: > ../../ntpd/ntp_control.c:830:35: warning: cast increases required > alignment of target type [-Wcast-align] > else if (authdecrypt(res_keyid, (uint32_t *)pkt, > ^ Known issue. I just created issue #277 for it. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Tue Apr 25 06:26:09 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 24 Apr 2017 23:26:09 -0700 Subject: Something is buggy with maxpoll... In-Reply-To: <87y3upghjh.fsf@Rainer.invalid> References: <87fugxlfj2.fsf@Rainer.invalid> <20170424165247.4cb3bda6@spidey.rellim.com> <87y3upghjh.fsf@Rainer.invalid> Message-ID: <20170424232609.0201b1fa@spidey.rellim.com> Yo Achim! On Tue, 25 Apr 2017 07:52:50 +0200 Achim Gratz wrote: > Gary E. Miller writes: > > Unable to duplicate. Here is what I did: > > > > 1. server has a peer, minpoll=maxpoll=3 > > I've not explicitly set minpoll. Probably unrelated. > > 2. server has today's git head. > > I've just put the newest version on all four boxen. I should perhaps > mention that the problem has started occurring once I've started > monitoring all other machines on the network (plus a handful of > stratum-1 from outside) from each of these.
They are all connected to > the same switch (I've tried to connect one of these directly to my > router to see if it makes a difference and it did not). This > behaviour makes it very difficult to restart the ntpd on any of > these, since I will often have to restart the ntpd on another one and > then go back and see if the others are still working correctly. Gonna be hard to debug unless you can narrow down to something we can replicate. > > 8. poll as shown in ntpmon and ntpq stayed always at 32. > > The displayed poll value never changes. But the actual poll interval > obviously gets much longer than what was set. How can you tell? I watched with ntpmon and the actual poll matched the poll shown. > I've seen over 50 > minutes without a poll (and the reach flags stay at their previous > value instead of shifting out every 16 seconds as they should). Odd, I have never seen this. And I tried several times today to see it. Do you think the issue is in ntpmon or in ntpd? > This > smells like an uninitialised variable somewhere that then picks up a > random value? Could be, but we have gcc, and pyflakes, set to detect uninitialized variables. So it can't be a trivial issue. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Tue Apr 25 06:45:20 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 24 Apr 2017 23:45:20 -0700 Subject: Something is buggy with maxpoll... In-Reply-To: Message from Achim Gratz of "Tue, 25 Apr 2017 07:52:50 +0200."
<87y3upghjh.fsf@Rainer.invalid> Message-ID: <20170425064520.B6433406063@ip-64-139-1-69.sjc.megapath.net> You can get a second opinion on the actual traffic by turning on rawstats. That will log a line for each received packet. That line includes a counter for lost packets which will be non-zero if the server is dropping packets for some reason. Or use tcpdump and avoid another layer of possible confusion. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Apr 25 07:24:49 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 00:24:49 -0700 Subject: Errors and warnings from head Message-ID: <20170425072449.EE0EB406063@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: >> Debian wheezy: >> ../../ntpd/ntp_config.c:2955:2: warning: implicit declaration of >> function ‘yyparse’ [-Wimplicit-function-declaration] >> ../../ntpd/ntp_config.c:3020:2: error: ‘yydebug’ undeclared >> (first use in this function) > Do you have --enable-warnings on? Known upstream Bison bug. I don't think so. I do have --enable-debug and --enable-debug-gdb. The yydebug error is conditional on DEBUG, but I've been using that for ages without problems. (It used to default to on, right?) I might have missed the yyparse warning but that seems unlikely. ----- >> NetBSD 7: >> ../../ntpfrob/tickadj.c:30:6: warning: function might be candidate >> for attribute 'noreturn' [-Wsuggest-attribute=noreturn] >> HAVE_ADJTIMEX is not defined so it does an unconditional exit(1) > That is a new one. I guess related to your earlier no adjtimex() bug on > NetBSD. Have you created an issue for the adjtimex() thing yet? I can't find my earlier adjtimex on NetBSD comments to sort out what I said. It's not on FreeBSD either. Is devel exposed to Google? It found stuff on GitLab, but no hits on a mail archive. It's the corner of ntpfrob that adjusts "tick".
-a tick Set the kernel variable tick to the value tick specifies. -A Display the kernel variable tick. At first, I was thinking that was the drift rate which we could set some other way. But the -A option says things like tick = 10000 If that's microseconds, that would be 100 Hz, a reasonable number but not very interesting on a NO_HZ system. I think we can fix it by preferring ntp_adjtimex and trying adjtimex if that's not available. We can copy the ideas from the mainline ntpd routines, maybe libntp/clockwork.c -- These are my opinions. I hate spam. From Stromeko at nexgo.de Tue Apr 25 09:27:17 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 25 Apr 2017 11:27:17 +0200 Subject: Something is buggy with maxpoll... References: <87fugxlfj2.fsf@Rainer.invalid> <20170424165247.4cb3bda6@spidey.rellim.com> <87y3upghjh.fsf@Rainer.invalid> <20170424232609.0201b1fa@spidey.rellim.com> Message-ID: <87tw5chm6i.fsf@Rainer.invalid> Gary E. Miller writes: >> The displayed poll value never changes. But the actual poll interval >> obviously gets much longer than what was set. > > How can you tell? I watched with ntpmon and the actual poll matched the > poll shown. None of the peerstats update at the poll interval, so I conclude that the poll doesn't happen. If it does poll, then it doesn't seem to recognise whether it gets an answer or not (since it also does not put a zero in the reachability) or any of the other things that get updated with the data from the poll. > Do you think the issue is in ntpmon or in ntpd? It must be in ntpd I think. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptation for Waldorf microQ V2.22R2: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From hmurray at megapathdsl.net Tue Apr 25 09:33:37 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 02:33:37 -0700 Subject: Something is buggy with maxpoll... 
In-Reply-To: Message from Achim Gratz of "Tue, 25 Apr 2017 11:27:17 +0200." <87tw5chm6i.fsf@Rainer.invalid> Message-ID: <20170425093337.2290F406063@ip-64-139-1-69.sjc.megapath.net> Are you using DNS to set things up? You might have stumbled into something I broke. -- These are my opinions. I hate spam. From Stromeko at nexgo.de Tue Apr 25 10:36:05 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 25 Apr 2017 12:36:05 +0200 Subject: Something is buggy with maxpoll... References: <20170425093337.2290F406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <87lgqohizu.fsf@Rainer.invalid> Hal Murray writes: > Are you using DNS to set things up? You might have stumbled into something I > broke. Yes, I use the DNS from my router to target the local servers (besides the external ones, obviously). Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From hmurray at megapathdsl.net Tue Apr 25 10:53:15 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 03:53:15 -0700 Subject: Something is buggy with maxpoll... In-Reply-To: Message from Achim Gratz of "Tue, 25 Apr 2017 12:36:05 +0200." <87lgqohizu.fsf@Rainer.invalid> Message-ID: <20170425105315.105D6406063@ip-64-139-1-69.sjc.megapath.net> Stromeko at nexgo.de said: >> Are you using DNS to set things up? You might have stumbled >> into something I broke. > Yes, I use the DNS from my router to target the local servers (besides the > external ones, obviously). Is the slot that is not working right being setup by DNS? As in server foo.example.com maxpoll 5 rather than server 1.2.3.4 maxpoll 6 If so, does it work correctly when setup by numerical IP Address? -- These are my opinions. I hate spam. 
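[Editorial aside for readers of this thread: the maxpoll value on a server line is a power-of-two exponent rather than a count of seconds, so "maxpoll 5" caps the poll interval at 32 s and "maxpoll 6" at 64 s. The reach value shown by ntpq and ntpmon is an 8-bit shift register that moves one place per poll, which is why a reach that never changes suggests no polls are happening. The Python sketch below only illustrates that arithmetic. The names poll_seconds and update_reach are made up for this note and are not functions in ntpd or pylib.]

```python
def poll_seconds(exponent):
    """Convert a poll exponent (the minpoll/maxpoll number) to seconds."""
    return 2 ** exponent


def update_reach(reach, got_reply):
    """Shift the 8-bit reachability register once, as happens per poll.

    The newest poll result enters as the low bit; the oldest falls off
    the top. ntpq and ntpmon display this register in octal.
    """
    return ((reach << 1) | (1 if got_reply else 0)) & 0xFF


if __name__ == "__main__":
    for exp in (4, 5, 6):
        print("poll exponent %d -> %d s" % (exp, poll_seconds(exp)))

    # A fully reachable peer (octal 377) that then misses three polls.
    reach = 0o377
    for _ in range(3):
        reach = update_reach(reach, False)
    print("reach after three missed polls: %o" % reach)
```

[If polls were really being sent and going unanswered, reach would decay toward zero as above; the report in this thread is that it does not move at all.]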
From Stromeko at nexgo.de Tue Apr 25 11:19:27 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 25 Apr 2017 13:19:27 +0200 Subject: Something is buggy with maxpoll... References: <20170425105315.105D6406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <87h91chgzk.fsf@Rainer.invalid> Hal Murray writes: > Is the slot that is not working right being setup by DNS? As in > server foo.example.com maxpoll 5 > rather than > server 1.2.3.4 maxpoll 6 Yes. > If so, does it work correctly when setup by numerical IP Address? I can't test that right now, not sure when I'll have time to get to that. -- Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From esr at thyrsus.com Tue Apr 25 12:54:19 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 08:54:19 -0400 Subject: Documentation request/opportunity In-Reply-To: <20170413195809.8267940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170413195809.8267940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425125419.GA2157@thyrsus.com> Hal Murray : > > gem at rellim.com said: > >> I think we need a chart/table showing the types of packets we send > >> and expect to receive. > > How about the RFC? I would hate to duplicate that. > > No, that's not what I'm looking for. > > We only implement a subset of the full spec. For example, we don't implement > the peer stuff. I think that's 2 packet types. > > The RFC is pages and pages. I'm looking for the one page (or less) summary. > A pointer to the right section in the RFC might be appropriate. > > Context/background: There is a two dimensional table used in the input > packet processing. I think we can clean that up by eliminating the table. 
> This is tangled up with broadcast and friends and we removed some of that but > I'm not exactly sure what is or should be left so I'd like some > documentation of what is currently supported. > > The current code for the pool stuff, sends request packets when it gets the > DNS answer. When the reply comes back, it sets up the peer slot. If the > server doesn't respond, there is never any peer slot setup. That same path > also sets up broadcast clients, but I think we don't support that any more. > > I think that table goes away if the pool stuff sets up the peer slot before > it sends the first request. That means response processing is as simple as > look for the peer and drop anything that doesn't match. Since the protocol-engine refactoring, the only person who knows for dead sure which parts of the RFC we support is Daniel. I could go digging in the code, but I might mistake remnant stubs for support; his answer should be faster and more reliable. So, Daniel, which packet types are still supported? IIRC we have removed (for security reasons) manycast, and broadcast client (but not broadcast server - I think it's still in there but untested since the refactor). Something I don't completely recall has been done to peer mode. Please make a definitive statement. I will update docs accordingly. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Tue Apr 25 13:58:10 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 09:58:10 -0400 Subject: Pivoting In-Reply-To: <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425135810.GA3024@thyrsus.com> Hal Murray : > More pivoting....
> > pylib calls ntp.ntpc.lfptofloat() in several places > > ntpc_lfptofloat() calls lfp_stamp_to_tspec() > That's an inline which calls ntpcal_ntp_to_time() > > ntpcal_ntp_to_time takes an optional second argument, the pivot time. > If it's NULL, it uses "now". That is correct. It is a known bug in the Python tools (I think we inherited this from the C versions) that they can easily fail when querying a server based in a different era if the local clock has not yet been synced properly. Thanks for reminding me of this; I will add bug warnings to the ntpq and ntpmon manual pages. > The code in step_systime() is really really ugly. (to my eye) Not just to yours... > It starts by computing the pivot. It gets the build time as a broken down > struct, subtracts 10 years, converts back to a time_t. All that can be > precomputed. > > It converts the step to a l_fp, gets the system time as a tspec, converts > that to l_fp, adds them together, then converts back to a tspec. That > convert back uses the pivot. > > Now that we have a nice simple EPOCH, I hope somebody cleans that up. This is a major work item, but it's not an urgent one - we have 14 years to get a fix working and deployed. Discussion of pivoting vs. sanity checking is still heavy; I want to see us arrive at some sort of mutually checking consensus before we cut code. > Does the 10 year step back make any sense? Not to me. I've added it to the list of work items as a mystery to be plumbed. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Tue Apr 25 15:24:46 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Tue, 25 Apr 2017 11:24:46 -0400 Subject: Pivoting In-Reply-To: <20170422062233.78E2A40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170422062233.78E2A40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425152446.GA3883@thyrsus.com> Mark, heads up - policy issue related to old GPS support. Hal Murray : > > Stromeko at Nexgo.DE said: > > Even if you only consider physical hardware, based on the projected > > lifetime of automotive qualified systems (15 years or longer) you have to > > expect a much longer actual lifetime in the field. > > ianbruene at gmail.com said: > > 19:14 One should always put a multiplier on one's > > estimate of how long their code will be in use: code expected to > > last for 5 years lasting 20 got us Y2K. ... > > Right. I think there are two issues tangled up here. > > One is that there is a tradeoff between building in a long lifetime and > catching problems during a normal lifetime. I'm not sure which is more > likely. Consider what happens if a server gets fired up with a broken clock > and starts answering all requests with 1970. Do you want to reject that, or > pivot it to 2036? Well put, and good on you for putting your finger squarely on the dilemma. I'm not sure which is more likely either. In the absence of such knowledge, my call is to (a) do the *simplest* possible thing - that is, incur the least possible code complexity - and carefully document our assumptions and the failure modes. I also think we should continue trying to have insights about this problem, but not bet on a breakthrough. And not try to solve it before 1.0; I rate the risk from code destabilization higher than the gain until we're much surer of our ground than we are now. > The other is that there really is a 20 year rollover with old GPS units. > (Newer units have 13 bits.) I think that turns into 3 choices: > Don't support really old GPS units. > Advertise the default lifetime. 
> Allow the user to specify the pivot time and/or life time, either at build > time or at run time or both. This is where I'd like Mark to check in. I think I'm changing my mind about this, but there's a piss-off-legacy-users issue not to be lightly dismissed. I used to think we needed to support all GPSes back to the beginning of time. But I think I was failing to separate expensive high-precision refclocks (about which people do get cheesed off when they fall out of support) from generic GPSes, which are now dirt-cheap and effectively disposable. Dirt-cheap and effectively disposable changes the tradeoffs, especially with 13-bit week counters that make the service life 157 years. I'm coming around to the view that it's reasonable *given this combination of circumstances* to disclaim support for old GPSes. But I'm open to counterargument on that. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From Stromeko at nexgo.de Tue Apr 25 16:47:57 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 25 Apr 2017 18:47:57 +0200 Subject: Standard set of terms for precision, accuracy, related concepts. References: <20170424163617.AF72013A021B@snark.thyrsus.com> Message-ID: <878tmoh1s2.fsf@Rainer.invalid> Eric S. Raymond writes: > Achim, you and Gary *both* get to write glossaries covering terms like > precision, accuracy, drift, and related stuff. Give it your best > shot(s). If, after a reasonable period of time, I have a glossary > only from one of you, that person wins and the glossary gets blessed > and added to the official documentation. That is a bit of a can of worms as you have already seen, both in the original thread and the answers here.
Just the two terms you mention are used in different ways for different things and so far we haven't even determined whether we can or want to unify on one set of definitions throughout all of NTP or maybe keep domain-specific meanings where appropriate. Another line along which these terms split is whether they are applied to continuous or discrete quantities, whether you are talking about a single number or a statistical moment and whether the thing you are talking about is a stochastic variable, a physical quantity or something you do a calculation with. For NTP, I'd think the only physical continuous quantity of interest is time, but it only ever gets processed as a quantized numerical value. NTP also cannot directly measure the time, instead it approximates it by various means, in particular in the form of time differences. That approximation requires both statistical inference and numerical calculations. So there are at least three domains that need a glossary: representation of absolute and delta time (both conceptual and in implementation), the statistical inference and the implementation of these operations as numerical algorithms. Many of the terms used for ntpd are actually referring to internal algorithmic variables of the control loop rather than an estimate of some measurable quantity. The frequency offset for instance does describe the deviation of the system clock from the ideal frequency only when both the derivative of the frequency offset and the time offset are both zero. In all other cases it's a mixture of the (not measured) offset of the clock frequency and an additional offset introduced by the FLL/PLL that tries to keep the time offset as close to zero as possible. Similarly the time offset isn't the actual offset, but a measurement corrupted by multiple noise sources. Regards, Achim.
-- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ DIY Stuff: http://Synth.Stromeko.net/DIY.html From esr at thyrsus.com Tue Apr 25 17:03:22 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 13:03:22 -0400 Subject: Pivoting In-Reply-To: <20170423102048.13F19406063@ip-64-139-1-69.sjc.megapath.net> References: <20170423102048.13F19406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425170322.GA5184@thyrsus.com> Hal Murray : > > gem at rellim.com said: > > I'd have ntpd reject any time prior to EPOCH. > > How do you decide whether to reject it or pivot it into the future? This is indeed a problem, and it's fundamental whenever you're trying to work with devices that have time counters that can roll over within their expected lifespan. It's foolish to pretend that *any* workaround will never have perverse consequences. > >> Allow the user to specify the pivot time and/or life time, either > >> at build time or at run time or both. > > EPOCH is used for NMEA, so that is covered at build time. > > I could see adding an option to specify the EPOCH at run time too. > > My build time comment was mostly for life time. I was assuming that EPOCH > would be used for pivoting. > > I know about three pivots to consider. One is GPS 10 bits for weeks with a > 20 year step size. Another is 2 digit year numbers with a 100 year step > size. The third is 32 bits of seconds in NTP packets with a 136 year step > size. Are there any others I've overlooked? > > If we want our software to last more than 20 years while talking to crappy > GPS receivers, we need a way to update the pivot date at run time. (I'm > using "last" to mean without rebuilding.) > > If we want our software to reject bogus time, we have to balance the tradeoff > between long life and good filtering. Run time parameters will allow the > user to choose. As I think more about this, I become more nervous about these prospective "run-time parameters".
They seem like asking for trouble - not easy to explain, easy to misconfigure. > > As mentioned earlier, PDP-8s still run. That is late 1960's. Call it 50 > > years. > > It would be interesting to see what those setups are actually doing and if > they have documentation for something as obscure as NTP. > > There is a lot of lab gear running embedded software. I wonder how much of > it will be running at its 25th or 50th birthday. I have a pre-software scope > that's over 35 years old. > > > The combination of long life and crappy GPS seems obscure enough that I'm > willing to document it as a limitation. It's the kind of code that Eric > would love to rip out if he found it a year ago. You're right enough about *that*. What kept my hands off it was defensive conservatism - I've managed to not fuck us up while removing almost 75% of the C code by being extremely careful about messing with things I didn't understand. As I remarked in my last mail, I'm coming around to the view that trying to work around old crappy GPSes is a swamp that we're best out of. There are no right answers, just differently wrong ones - what you get to choose is the distribution of your failure cases. > The documentation issue gets interesting. A feature isn't any good if you > can't figure out how to use it. I wonder if the web will solve that problem. > Will NTPsec still be online 20 years from now? Will we maintain online > versions of 20 year old releases? I think we have to plan for a longer than 20-year lifetime; after all the code is about 20 years old now. > Would anybody notice a warning message from a program that's been running for > 19 years? That is imponderable. But I think we have to emit whatever messages will give future admins the most useful information *assuming* they pay attention. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning.
Give generously - the civilization you save might be your own. From Stromeko at nexgo.de Tue Apr 25 17:08:21 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 25 Apr 2017 19:08:21 +0200 Subject: Standard set of terms for precision, accuracy, related concepts. References: <20170424163617.AF72013A021B@snark.thyrsus.com> <20170424164939.rqbkxnob6ueia7uy@roeckx.be> <20170424100747.5aaf730a@spidey.rellim.com> Message-ID: <871ssgh0u2.fsf@Rainer.invalid> Gary E. Miller writes: > Glossary: > > frequency offset: > The difference between the ntpd calculated frequency and the local system clock frequency (usually in parts per million, ppm) It's actually the current correction ntpd applies to the system clock (for ppm, the correct unit would be µs/s). It would converge to your definition for a stationary system if all noise sources were unbiased. > jitter, dispersion: > The short term change in a value That's too short for an explanation. Jitter is the deviation from an idealized periodical signal and can refer to both the deviation at some discrete time (clock event) or more commonly the resulting distribution. The dispersion is the width of that distribution and can be expressed in different ways depending on the type of distribution. > ppt, parts per thousand: > Ratio between two values. These following are all the same: 1 ppt, one in one thousand, 1/1,000, 0.001, and 0.1% When talking about precise frequency measurements I would rather expect ppt to refer to "parts per trillion". "Parts per thousand" is usually expressed as "per mille [‰]" > σ, sigma: > Sigma denotes the standard deviation (SD) and is centered on the arithmetic mean of the data set. The SD is simply the square root of the variance of the data set. Two sigma is simply twice the standard deviation. Three sigma is three times sigma. Smaller is better. Sigma is the measure of dispersion for normal distributions.
It isn't centered on anything and the rest of that explanation is either misleading or redundant. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Waldorf MIDI Implementation & additional documentation: http://Synth.Stromeko.net/Downloads.html#WaldorfDocs From esr at thyrsus.com Tue Apr 25 17:14:25 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 13:14:25 -0400 Subject: Standard set of terms for precision, accuracy, related concepts. In-Reply-To: <878tmoh1s2.fsf@Rainer.invalid> References: <20170424163617.AF72013A021B@snark.thyrsus.com> <878tmoh1s2.fsf@Rainer.invalid> Message-ID: <20170425171425.GB6363@thyrsus.com> Achim Gratz : > Eric S. Raymond writes: > > Achim, you and Gary *both* get to write glossaries covering terms like > > precision, accuracy, drift, and related stuff. Give it your best > > shot(s). If, after a reasonable period of time, I have a glossary > > only from one of you, that person wins and the glossary gets blessed > > and added to the official documentation. > > That is a bit of a can of worms as you have already seen, both in the > original thread and the answers here. Just the two terms you mention > are used in different ways for different things and so far we haven't > even determined whether we can or want to unify on one set of > definitions throughout all of NTP or maybe keep domain-specific meanings > where appropriate. If we're not working towards a unified vocabulary, what's the point of having a glossary at all? If all you can do is tell me how hard the problem is, you won't have any influence on the fix. I understand it's difficult; I'll take the best documentation solution I can get. It's your choice how much you want to try to pull in the directions you think are correct. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning.
Give generously - the civilization you save might be your own. From esr at thyrsus.com Tue Apr 25 17:16:35 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 13:16:35 -0400 Subject: Standard set of terms for precision, accuracy, related concepts. In-Reply-To: <871ssgh0u2.fsf@Rainer.invalid> References: <20170424163617.AF72013A021B@snark.thyrsus.com> <20170424164939.rqbkxnob6ueia7uy@roeckx.be> <20170424100747.5aaf730a@spidey.rellim.com> <871ssgh0u2.fsf@Rainer.invalid> Message-ID: <20170425171635.GC6363@thyrsus.com> Achim Gratz : > Gary E. Miller writes: > > Glossary: > > > > frequency offset: > > The difference between the ntpd calculated frequency and the local system clock frequency (usually in parts per million, ppm) > > It's actually the current correction ntpd applies to the system clock > (for ppm, the correct unit would be µs/s). It would converge to your > definition for a stationary system if all noise sources were unbiased. > > > jitter, dispersion: > > The short term change in a value > > That's too short for an explanation. Jitter is the deviation from an > idealized periodical signal and can refer to both the deviation at some > discrete time (clock event) or more commonly the resulting > distribution. The dispersion is the width of that distribution and can > be expressed in different ways depending on the type of distribution. > > ppt, parts per thousand: > > Ratio between two values. These following are all the same: 1 ppt, one in one thousand, 1/1,000, 0.001, and 0.1% > > When talking about precise frequency measurements I would rather expect > ppt to refer to "parts per trillion". > > "Parts per thousand" is usually expressed as "per mille [‰]" > > > σ, sigma: > > Sigma denotes the standard deviation (SD) and is centered on the arithmetic mean of the data set. The SD is simply the square root of the variance of the data set. Two sigma is simply twice the standard deviation. Three sigma is three times sigma. Smaller is better.
> > Sigma is the measure of dispersion for normal distributions. It isn't > centered on anything and the rest of that explanation is either > misleading or redundant. Comments on Gary's definitions, *without a proposed definition of your own*, just make more editing work for me without solving the problem. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 25 17:23:11 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 10:23:11 -0700 Subject: doc suggestion Message-ID: <20170425172311.F2495406063@ip-64-139-1-69.sjc.megapath.net> I often have troubles finding the description for a keyword in a config file. How about a page with a list, sorted alphabetically, of links to the details. Maybe a few words of context so you don't have to follow a link to discover that it isn't what you are looking for. A few of them may want 2 or 3 links. -- These are my opinions. I hate spam. From Stromeko at nexgo.de Tue Apr 25 17:34:52 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 25 Apr 2017 19:34:52 +0200 Subject: Standard set of terms for precision, accuracy, related concepts. References: <20170424163617.AF72013A021B@snark.thyrsus.com> <878tmoh1s2.fsf@Rainer.invalid> <20170425171425.GB6363@thyrsus.com> Message-ID: <87wpa8fl1f.fsf@Rainer.invalid> Eric S. Raymond writes: > If we're not working towards a unified vocabulary, what's the point > of having a glossary at all? Please define the scope of that glossary. Your request seemed overly broad to me, maybe I'm just reading it wrong. In any case it's not clear for me what you actually want. Regards, Achim. 
-- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptations for KORG EX-800 and Poly-800MkII V0.9: http://Synth.Stromeko.net/Downloads.html#KorgSDada From gem at rellim.com Tue Apr 25 17:42:35 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 10:42:35 -0700 Subject: doc suggestion In-Reply-To: <20170425172311.F2495406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425172311.F2495406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425104235.52fe2a5e@spidey.rellim.com> Yo Hal! On Tue, 25 Apr 2017 10:23:11 -0700 Hal Murray wrote: > I often have troubles finding the description for a keyword in a > config file. Aren't they all in the ntp.conf man page? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Tue Apr 25 17:48:06 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 10:48:06 -0700 Subject: Pivoting In-Reply-To: Message from "Eric S. Raymond" of "Tue, 25 Apr 2017 13:03:22 EDT." <20170425170322.GA5184@thyrsus.com> Message-ID: <20170425174806.A2656406063@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: >> The documentation issue gets interesting. A feature >> isn't any good if you can't figure out how to use it. I wonder >> if the web will solve that problem. >> Will NTPsec still be online 20 years from now? Will we >> maintain online versions of 20 year old releases? > I think we have to plan for a longer than 20-year lifetime; after all the > code is about 20 years old now.
You are confusing project lifetime with the life of a specific release. Can you find documentation for the 20 year old version of ntp classic? (as modified by the distro you are using) I picked 20 years for discussion because that is the GPS rollover time. Our current code will break after 20 years if used with a really old GPS receiver. -- These are my opinions. I hate spam. From ianbruene at gmail.com Tue Apr 25 18:06:14 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 25 Apr 2017 13:06:14 -0500 Subject: Standard set of terms for precision, accuracy, related concepts. In-Reply-To: <871ssgh0u2.fsf@Rainer.invalid> References: <20170424163617.AF72013A021B@snark.thyrsus.com> <20170424164939.rqbkxnob6ueia7uy@roeckx.be> <20170424100747.5aaf730a@spidey.rellim.com> <871ssgh0u2.fsf@Rainer.invalid> Message-ID: On 04/25/2017 12:08 PM, Achim Gratz wrote: > When talking about precise frequency measurements I would rather expect > ppt to refer to "parts per trillion". Useful note: the units display code shows parts-per-thousand as "ppk", parts-per-kilo in order to avoid this collision. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From hmurray at megapathdsl.net Tue Apr 25 18:13:07 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 11:13:07 -0700 Subject: doc suggestion Message-ID: <20170425181307.A5925406063@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: >> I often have troubles finding the description for a >> keyword in aconfig file. > Aren't they all in the ntp.conf man page? Depends on what you mean by "in". Here is a chunk: server For server addresses, this command mobilizes a persistent client mode association with the specified remote server or local radio clock. In this mode the local clock can synchronized to the remote server, but the remote server can never be synchronized to the local clock. Is that all you need? Is there a link to more? 
-- These are my opinions. I hate spam. From gem at rellim.com Tue Apr 25 18:15:14 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 11:15:14 -0700 Subject: Standard set of terms for precision, accuracy, related concepts. In-Reply-To: References: <20170424163617.AF72013A021B@snark.thyrsus.com> <20170424164939.rqbkxnob6ueia7uy@roeckx.be> <20170424100747.5aaf730a@spidey.rellim.com> <871ssgh0u2.fsf@Rainer.invalid> Message-ID: <20170425111514.77142fb6@spidey.rellim.com> Yo Ian! On Tue, 25 Apr 2017 13:06:14 -0500 Ian Bruene wrote: > On 04/25/2017 12:08 PM, Achim Gratz wrote: > > When talking about precise frequency measurements I would rather > > expect ppt to refer to "parts per trillion". > > Useful note: the units display code shows parts-per-thousand as > "ppk", parts-per-kilo in order to avoid this collision. Can someone find a citation for best practice on parts-per-thousand? Wikipedia says ppt is used in some disciplines for this. https://en.wikipedia.org/wiki/Parts-per_notation An unsatisfying compromise is the millage (‰) symbol RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Tue Apr 25 19:43:03 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 15:43:03 -0400 Subject: Standard set of terms for precision, accuracy, related concepts.
In-Reply-To: <87wpa8fl1f.fsf@Rainer.invalid> References: <20170424163617.AF72013A021B@snark.thyrsus.com> <878tmoh1s2.fsf@Rainer.invalid> <20170425171425.GB6363@thyrsus.com> <87wpa8fl1f.fsf@Rainer.invalid> Message-ID: <20170425194303.GB8200@thyrsus.com> Achim Gratz : > Eric S. Raymond writes: > > If we're not working towards a unified vocabulary, what's the point > > of having a glossary at all? > > Please define the scope of that glossary. Your request seemed overly > broad to me, maybe I'm just reading it wrong. In any case it's not > clear to me what you actually want. The list on the ntpviz pages would make a good start. Also, any term you have argued with Gary about. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Tue Apr 25 20:18:26 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 16:18:26 -0400 Subject: Pivoting In-Reply-To: <20170425174806.A2656406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425170322.GA5184@thyrsus.com> <20170425174806.A2656406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425201826.GC8200@thyrsus.com> Hal Murray : > You are confusing project lifetime with the life of a specific release. Fair point. My error. > Can you find documentation for the 20 year old version of ntp classic? (as > modified by the distro you are using) Let's see, 20 years old is 1997 at this point...I think I've seen NTP docs that old or nearly when Googling. It's hard to tell because there are so very many stale NTP documentation trees on the web, often without obvious indication of year or version of origin. On the other hand, those hits do not necessarily mean the installations they're describing are still running, either.
Many of them seem to be related to extinct-dinosaur big-iron Unixes and are probably still up only because nobody thought to remove them when a machine was decomissioned (I've seen a lot of these in edu). I don't think we can deduce much from that noise. > I picked 20 years for discussion because that is the GPS rollover time. Our > current code will break after 20 years if used with a really old GPS receiver. Yes. I don't see any way to fix that in the general case. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 25 21:05:41 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 14:05:41 -0700 Subject: Standard set of terms for precision, accuracy, related concepts. Message-ID: <20170425210541.1646E406063@ip-64-139-1-69.sjc.megapath.net> > Can someone find a citation for best practice on parts-per-thousand? > Wikipedia says ppt is used in some disciplines for this. I've seen PPM and PPB in frequent use. I don't remember seeing PPT or PPK. PPT I would probably figure out from the context right away. PPK would take me a bit longer. At that small a range I'd expect to see percentages. -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 25 21:05:25 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 17:05:25 -0400 Subject: Pivoting In-Reply-To: References: <20170423102048.13F19406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425210525.GA9918@thyrsus.com> Achim Gratz : > >Would anybody notice a warning message from a program that's been running for > >19 years? > > No, but if it was a time server that systems still followed, it'd take your > network down quite fast these days. Figuring out why would be a minor > nightmare. Huh? Why would it not be discarded as a falseticker? 
Isn't this one of the cases outlier filtering is for? -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Tue Apr 25 21:12:46 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 14:12:46 -0700 Subject: Pivoting In-Reply-To: Message from "Eric S. Raymond" of "Tue, 25 Apr 2017 17:05:25 EDT." <20170425210525.GA9918@thyrsus.com> Message-ID: <20170425211246.27D93406063@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > Huh? Why would it not be discarded as a falseticker? > Isn't this one of the cases outlier filtering is for? Because it's a 18 year old system and 3 out of 4 of the servers it was configured with have been retired. :) -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Apr 25 21:17:14 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 14:17:14 -0700 Subject: Pivoting In-Reply-To: Message from "Eric S. Raymond" of "Tue, 25 Apr 2017 16:18:26 EDT." <20170425201826.GC8200@thyrsus.com> Message-ID: <20170425211714.6E6D1406063@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > It's hard to tell because there are so very many stale NTP documentation > trees on the web, often without obvious indication of year or version of > origin. Is making sure that all our man and web pages have date and version on the release checklist? Where is the release checklist? "All" is misleading in the context of web pages. I'm referring to the documentation for the software that is mostly parallel to the man pages. -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 25 21:52:38 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Tue, 25 Apr 2017 17:52:38 -0400 Subject: Pivoting In-Reply-To: <20170425211246.27D93406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425210525.GA9918@thyrsus.com> <20170425211246.27D93406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425215238.GA10555@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > > Huh? Why would it not be discarded as a falseticker? > > Isn't this one of the cases outlier filtering is for? > > Because it's a 18 year old system and 3 out of 4 of the servers it was > configured with have been retired. :) We can't fix *that*, either. All we can do is add a warning to the man pages for NMEA-related drivers that if your GPS is older than 1024 weeks you may be cruising for a bruising. I think I'll go do that. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From gem at rellim.com Tue Apr 25 22:03:51 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 15:03:51 -0700 Subject: Pivoting In-Reply-To: <20170425135810.GA3024@thyrsus.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> <20170425135810.GA3024@thyrsus.com> Message-ID: <20170425150351.3031d80b@spidey.rellim.com> Yo Eric! On Tue, 25 Apr 2017 09:58:10 -0400 "Eric S. Raymond" wrote: > > The code in step_systime() is really really ugly. (to my eye) > > Not just to yours... So, can we now got back to my better version? The one that had no l_fp or pivot needs or dependencies? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." 
- Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From Stromeko at nexgo.de Tue Apr 25 10:34:52 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 25 Apr 2017 12:34:52 +0200 Subject: Temperature Controlled rasPi 3B Message-ID: <87pog0hj1v.fsf@Rainer.invalid> So, just before I left last week I managed to finally set up a temperature controller on my rasPi 3B using three sha512sum processes that get individually stopped and continued by the temperature control loop every 100ms to create a load between 0% and 300% and put it into another cardboard box and bubblewrap. I still have to add an integral term to the control loop in order to make the residual zero and add a few other things to enable changing the setpoint while it's running and improve the logging, but it already converges to within 0.2K of the target temperature and keeps the 5 minutes average temperature to within 20mK or so (unless something else loads the cores for more than just a few seconds, like compiling a new NTPsec version). Since I was away I have data for nine days of uninterrupted and undisturbed performance: -------------- next part -------------- A non-text attachment was scrubbed... Name: temprecord_rasPi3B.png Type: image/png Size: 82625 bytes Desc: not available URL: -------------- next part -------------- The average heating power is too low at the moment, so I will have to reduce the thermal isolation when the weather gets warmer. -------------- next part -------------- A non-text attachment was scrubbed... Name: rasPi3B_temp_regulated.png Type: image/png Size: 160113 bytes Desc: not available URL: -------------- next part -------------- The rasPi 2B that I had set up the same way unfortunately had only intermittent GPS lock. 
It appears these problems were due to RF interference, so I've removed the GPS module from the box now and put it outside. I will also have to check the GPS settings or swap in one of my spare modules the next time I take the system down as the other modules seem to work better in the same spot. It also had a jump in the crystal aging that was probably caused by moving the system and having it switched off for a few hours. Since I had to pry the GPS out anyway, I've (moderately) overheated the system for some time and it looks like I'm back to the original aging transient. Without these issues it appears that the performance should have been very close to the one of the rasPi 3B (not very surprisingly). At the level of performance I'm expecting now it takes about a week to gather meaningful data. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Samples for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra From esr at thyrsus.com Tue Apr 25 22:21:49 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 18:21:49 -0400 Subject: Pivoting In-Reply-To: <20170425150351.3031d80b@spidey.rellim.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> <20170425135810.GA3024@thyrsus.com> <20170425150351.3031d80b@spidey.rellim.com> Message-ID: <20170425222149.GA10948@thyrsus.com> Gary E. Miller : > Yo Eric! > > On Tue, 25 Apr 2017 09:58:10 -0400 > "Eric S. Raymond" wrote: > > > > The code in step_systime() is really really ugly. (to my eye) > > > > Not just to yours... > > So, can we now got back to my better version? The one that had no > l_fp or pivot needs or dependencies? Not yet. We know that code stinks, but there is still thinking to be done before replacing it. There's no rush; we have more than a decade before the issue becomes urgent. Much more important to get this right than do it fast. -- Eric S. 
Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From hmurray at megapathdsl.net Tue Apr 25 22:30:36 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 15:30:36 -0700 Subject: Pivoting In-Reply-To: Message from "Eric S. Raymond" of "Tue, 25 Apr 2017 17:52:38 EDT." <20170425215238.GA10555@thyrsus.com> Message-ID: <20170425223036.4E2D4406063@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > We can't fix *that*, either. All we can do is add a warning to the man > pages for NMEA-related drivers that if your GPS is older than 1024 weeks you > may be cruising for a bruising. I think I'll go do that. We need to find out when they added the 3 extra bits. -- These are my opinions. I hate spam. From gem at rellim.com Tue Apr 25 22:30:36 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 15:30:36 -0700 Subject: Temperature Controlled rasPi 3B In-Reply-To: <87pog0hj1v.fsf@Rainer.invalid> References: <87pog0hj1v.fsf@Rainer.invalid> Message-ID: <20170425153036.54674cb3@spidey.rellim.com> Yo Achim! On Tue, 25 Apr 2017 12:34:52 +0200 Achim Gratz wrote: > So, just before I left last week I managed to finally set up a > temperature controller on my rasPi 3B using three sha512sum processes > that get individually stopped and continued by the temperature control > loop every 100ms to create a load between 0% and 300% and put it into > another cardboard box and bubblewrap. Cool. Did you use ntpheat? > I still have to add an integral > term to the control loop in order to make the residual zero ntpheatusb already has a full PID controller. Did you look at that? 
> and add a > few other things to enable changing the setpoint while it's running > and improve the logging, If you do so, could you send us patches for ntpheat? > but it already converges to within 0.2K of > the target temperature and keeps the 5 minutes average temperature to > within 20mK or so (unless something else loads the cores for more > than just a few seconds, like compiling a new NTPsec version). I found that keeping the CPU chip temp stable was less important than keeping ambient stable. The XTAL is on the other side of the PCB from the CPU. I also found adding a fan to the box evened out the temps between PCB top and bottom better. > Since > I was away I have data for nine days of uninterrupted and undisturbed > performance: Very nice. What is the 'NTP Loop Offset'? The Time offset? How is the predicted frequency offset calculated? Could that be patched into ntpviz? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Tue Apr 25 22:35:19 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 15:35:19 -0700 Subject: Pivoting In-Reply-To: <20170425222149.GA10948@thyrsus.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> <20170425135810.GA3024@thyrsus.com> <20170425150351.3031d80b@spidey.rellim.com> <20170425222149.GA10948@thyrsus.com> Message-ID: <20170425153519.3a72ca00@spidey.rellim.com> Yo Eric! On Tue, 25 Apr 2017 18:21:49 -0400 "Eric S. Raymond" wrote: > Gary E. Miller : > > Yo Eric!
> > > > On Tue, 25 Apr 2017 09:58:10 -0400 > > "Eric S. Raymond" wrote: > > > > > > The code in step_systime() is really really ugly. (to my > > > > eye) > > > > > > Not just to yours... > > > > So, can we now go back to my better version? The one that had no > > l_fp or pivot needs or dependencies? > > Not yet. We know that code stinks, but there is still thinking to be > done before replacing it. Rush? Thinking was already done, Hal was happier with that code than with what is there now. What about that code have I not explained clearly yet? > There's no rush; we have more than a decade > before the issue becomes urgent. That code is causing loss of precision and jitter now. Reverting it to what I had will make things better now, and in the future. I want to kill as much l_fp as possible before 2036. Once you agree that my approach is better there are many similar pieces of code that can be ripped out. The more l_fp we can agree to remove the less l_fp we need to worry about for 2036. > Much more important to get this > right than do it fast. Yes, and everyone seems to think my code has it right, just waiting on you. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Tue Apr 25 22:50:07 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 18:50:07 -0400 (EDT) Subject: I think we can drop the Jupiter driver. Message-ID: <20170425225007.D867813A021A@snark.thyrsus.com> I think we can drop the Jupiter driver. I looked for rollover compensation and didn't find it.
Instead there's this at line 1019: instance->timecode = GPS_EPOCH + (instance->gweek * WEEKSECS) + sweek; I think that means this driver will have timewarped as of the GPS rollover in August 1999 and it's been busted for *18 years.* Jeez. Is there no end to the undetected lossage in this codebase? Other GPS-based refclocks will have to be checked for the same issue. The magnavox, oncore, trimble, truetime, and spectracom drivers might well be broken the same way, and I'd like to hear Hal weigh in on hpgps. Hal, Gary, would you please be additional eyeballs on this? -- Eric S. Raymond The whole of the Bill [of Rights] is a declaration of the right of the people at large or considered as individuals... It establishes some rights of the individual as unalienable and which consequently, no majority has a right to deprive them of. -- Albert Gallatin, Oct 7 1789 From esr at thyrsus.com Tue Apr 25 22:59:44 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 18:59:44 -0400 Subject: Pivoting In-Reply-To: <20170425223036.4E2D4406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425215238.GA10555@thyrsus.com> <20170425223036.4E2D4406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425225944.GA11554@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > > We can't fix *that*, either. All we can do is add a warning to the man > > pages for NMEA-related drivers that if your GPS is older than 1024 weeks you > > may be cruising for a bruising. I think I'll go do that. > > We need to find out when they added the 3 extra bits. That is messy. IIRC they came in with the Block III version of the sats, but (a) I think Block IIs are still flying, and (b) we don't know whether or when receiver firmware has been updated by anybody. Of course this is never documented. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. 
Give generously - the civilization you save might be your own. From esr at thyrsus.com Tue Apr 25 23:08:30 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 19:08:30 -0400 Subject: Pivoting In-Reply-To: <20170425153519.3a72ca00@spidey.rellim.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> <20170425135810.GA3024@thyrsus.com> <20170425150351.3031d80b@spidey.rellim.com> <20170425222149.GA10948@thyrsus.com> <20170425153519.3a72ca00@spidey.rellim.com> Message-ID: <20170425230830.GB11554@thyrsus.com> Gary E. Miller via devel : > > Not yet. We know that code stinks, but there is still thinking to be > > done before replacing it. > > Rush? Thinking was already done, Hal was happier with the code than > what it is now. What about that code have I not explained clearly yet? The thinking is not done until *I* am sure I fully understand the implications. Please don't argue with that policy; it has almost entirely prevented iatrogenic damage so far, and we want to keep it that way. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From gem at rellim.com Tue Apr 25 23:08:57 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 16:08:57 -0700 Subject: Pivoting In-Reply-To: <20170425225944.GA11554@thyrsus.com> References: <20170425215238.GA10555@thyrsus.com> <20170425223036.4E2D4406063@ip-64-139-1-69.sjc.megapath.net> <20170425225944.GA11554@thyrsus.com> Message-ID: <20170425160857.3e250c3d@spidey.rellim.com> Yo Eric! On Tue, 25 Apr 2017 18:59:44 -0400 "Eric S. 
Raymond via devel" wrote: > Hal Murray : > > > > esr at thyrsus.com said: > > > We can't fix *that*, either. All we can do is add a warning to > > > the man pages for NMEA-related drivers that if your GPS is older > > > than 1024 weeks you may be cruising for a bruising. I think I'll > > > go do that. > > > > We need to find out when they added the 3 extra bits. > > That is messy. IIRC they came in with the Block III version of the > sats, but (a) I think Block IIs are still flying, and (b) we don't > know whether or when receiver firmware has been updated by anybody. > Of course this is never documented. I think only some Block II output CNAV, and only then on the L2 and L5 bands which are not used in consumer GPS. Your GPS only needs to get a CNAV message now and again to know the proper GPS epoch, so you only need a few birds transmitting it. Until there are a lot of Block III's in service I do not expect to see any L1C band GPS. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Tue Apr 25 23:13:25 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 19:13:25 -0400 (EDT) Subject: Work item list extracted from my back mail Message-ID: <20170425231325.D248513A021A@snark.thyrsus.com> This is the list of work items and potential work items I have extracted from my back mail. Serious bug: We have a report from MAYER Hans that ifstats is broken, returning no output. It should be possible to pin down when this broke with a regression test.
Possible work item (Ian): See how much of the C statustoa() function we can get rid of in favor of the Python implementation in pylib/util.c.

Possible work item (me): Change Mode 6 so it no longer ships system status bits in hex, decoding them into a string token list instead, old behavior preserved under ENABLE_CLASSIC_MODE. This is because the PLL* definitions aren't guaranteed to be consistent across different system versions.

Work item (me): Move the pivot application to timestamps as they are received. The current locus in sys_adjtime() is wrong. This won't be urgent until closer to 2036, but it's important - the current pivoting mechanism probably doesn't work. (And what's with the 10-year stepback in there?) Must figure out a way to test it. Can be post-1.0.

Work item (me): Hal wrote: We need a web page that summarizes this area - not the internal details, but the general problem and how it impacts users.

  NTP overflows in 2036
  32 bit signed time_t overflows in 2038
  GPS overflows every 1024 weeks (~20 years starting from 1980)
  First rollover was 22 August 1999
  They added 3 bits in ???
  Some older devices used pivot logic to extend their lifetime past 2000
  A description of pivoting and how it works.
  Good URLs for more info

Yes, we do need this. Now we should add a description of SOURCE_DATE_EPOCH and link to https://reproducible-builds.org/specs/source-date-epoch/

Work item: Replace

  typedef uint64_t l_fp;

with

  typedef uint64_t l_fp_time;
  typedef int64_t l_fp_offset;

Then replace all the places that used to use l_fp with the correct one. Then iterate on making macros and removing casts in the main code.

Work item: Study this: https://developers.google.com/time/smear#standardsmear Are we implementing it?

Work item: New Mode 6 responses that unpack status bits into list of string tags for the status bits. Avoids potential problem with PLL bits (outside our control) changing due to kernel mods.
Work item: There are two date/time to text modules, prettydate.c and humandate.c. Hal notes that seems like one too many. This whole area feels ripe for more cleanup. Hal thinks one of the routines in prettydate is only used by Python. This feels like an area where if we start getting rid of things then we can get rid of more things that nobody else uses. Work item: The lfpfloat() function in the Python interface may be dispensable. Convert the hex literal to float and divide by 1<<32 to scale to seconds since the NTP epoch. -- Eric S. Raymond Alcohol still kills more people every year than all `illegal' drugs put together, and Prohibition only made it worse. Oppose the War On Some Drugs! From hmurray at megapathdsl.net Tue Apr 25 23:15:52 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 16:15:52 -0700 Subject: I think we can drop the Jupiter driver. In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 25 Apr 2017 18:50:07 EDT." <20170425225007.D867813A021A@snark.thyrsus.com> Message-ID: <20170425231552.4E424406063@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: > I think we can drop the Jupiter driver. I looked for rollover compensation > and didn't find it. Instead there's this at line 1019: You could use it as an example of how to do it right. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Apr 25 23:32:17 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 16:32:17 -0700 Subject: Pivoting In-Reply-To: Message from "Gary E. Miller" of "Tue, 25 Apr 2017 16:03:34 PDT." <20170425160334.42ed466c@spidey.rellim.com> Message-ID: <20170425233217.260A0406063@ip-64-139-1-69.sjc.megapath.net> gem at rellim.com said: >> We need to find out when they added the 3 extra bits. > Technically, this is the Transmission Week Number (WN). The 13 bit WN will > be in the new CNAV messages. > CNAV messages will not be transmitted until the Block IIIA satellites go > live.
Expected launch Spring 2018. So the answer is simple. Not yet. All GPS receivers still have roll over problems. All I have to do to find a test case is get one more than 20 years old. Thanks. -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Apr 25 23:35:36 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 19:35:36 -0400 Subject: I think we can drop the Jupiter driver. In-Reply-To: <20170425231552.4E424406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425225007.D867813A021A@snark.thyrsus.com> <20170425231552.4E424406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425233536.GA12113@thyrsus.com> Hal Murray : > > devel at ntpsec.org said: > > I think we can drop the Jupiter driver. I looked for rollover compensation > > and didn't find it. Instead there's this at line 1019: > > You could use it as an example of how to do it right. Er, *what*? -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From gem at rellim.com Tue Apr 25 23:37:37 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 16:37:37 -0700 Subject: Pivoting In-Reply-To: <20170425230830.GB11554@thyrsus.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> <20170425135810.GA3024@thyrsus.com> <20170425150351.3031d80b@spidey.rellim.com> <20170425222149.GA10948@thyrsus.com> <20170425153519.3a72ca00@spidey.rellim.com> <20170425230830.GB11554@thyrsus.com> Message-ID: <20170425163737.5a2f92e7@spidey.rellim.com> Yo Eric! On Tue, 25 Apr 2017 19:08:30 -0400 "Eric S. Raymond" wrote: > Gary E. Miller via devel : > > > Not yet. We know that code stinks, but there is still thinking > > > to be done before replacing it. > > > > Rush? 
Thinking was already done, Hal was happier with the code than > > what it is now. What about that code have I not explained clearly > > yet? > > The thinking is not done until *I* am sure I fully understand the > implications. Please don't argue with that policy; it has almost > entirely prevented iatrogenic damage so far, and we want to keep > it that way. I've always wanted you to understand, never argued with that. So get to it! I'm trying to get action items checked off. This was one that was checked off, validated, and working, until you reverted it. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Tue Apr 25 23:40:24 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 16:40:24 -0700 Subject: Pivoting In-Reply-To: <20170425233217.260A0406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425160334.42ed466c@spidey.rellim.com> <20170425233217.260A0406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425164024.3c948def@spidey.rellim.com> Yo Hal! On Tue, 25 Apr 2017 16:32:17 -0700 Hal Murray wrote: > So the answer is simple. Not yet. All GPS receivers still have roll > over problems. Not so simple. If your GPS picks up GLONASS, Beidou, Galileo, etc. then it will not have the gps week rollover problem. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." 
- Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Tue Apr 25 23:47:02 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 19:47:02 -0400 Subject: Pivoting In-Reply-To: <20170425163737.5a2f92e7@spidey.rellim.com> References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> <20170425135810.GA3024@thyrsus.com> <20170425150351.3031d80b@spidey.rellim.com> <20170425222149.GA10948@thyrsus.com> <20170425153519.3a72ca00@spidey.rellim.com> <20170425230830.GB11554@thyrsus.com> <20170425163737.5a2f92e7@spidey.rellim.com> Message-ID: <20170425234702.GA12297@thyrsus.com> Gary E. Miller via devel : > So get to it! I'm trying to get action items checked off. This was one > that was checked off, validated, and working, until you reverted it. Would have failed in 2036. Probably the existing code would have, too, but still. You're not going to push me into moving faster than I am confident is safe. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From esr at thyrsus.com Tue Apr 25 23:48:25 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 19:48:25 -0400 Subject: Pivoting In-Reply-To: <20170425164024.3c948def@spidey.rellim.com> References: <20170425160334.42ed466c@spidey.rellim.com> <20170425233217.260A0406063@ip-64-139-1-69.sjc.megapath.net> <20170425164024.3c948def@spidey.rellim.com> Message-ID: <20170425234825.GB12297@thyrsus.com> Gary E. 
Miller via devel :
> On Tue, 25 Apr 2017 16:32:17 -0700
> Hal Murray wrote:
>
> > So the answer is simple. Not yet. All GPS receivers still have roll
> > over problems.
>
> Not so simple. If your GPS picks up GLONASS, Beidou, Galileo, etc.
> then it will not have the gps week rollover problem.

Do they not use the same week/second representation? What's the
length of the era in those systems? Has to be finite...
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From gem at rellim.com Wed Apr 26 00:11:08 2017
From: gem at rellim.com (Gary E. Miller)
Date: Tue, 25 Apr 2017 17:11:08 -0700
Subject: Pivoting
In-Reply-To: <20170425234702.GA12297@thyrsus.com>
References: <20170420215753.948B640605C@ip-64-139-1-69.sjc.megapath.net> <20170422050918.36A5540605C@ip-64-139-1-69.sjc.megapath.net> <20170425135810.GA3024@thyrsus.com> <20170425150351.3031d80b@spidey.rellim.com> <20170425222149.GA10948@thyrsus.com> <20170425153519.3a72ca00@spidey.rellim.com> <20170425230830.GB11554@thyrsus.com> <20170425163737.5a2f92e7@spidey.rellim.com> <20170425234702.GA12297@thyrsus.com>
Message-ID: <20170425171108.732cf3d1@spidey.rellim.com>

Yo Eric!

On Tue, 25 Apr 2017 19:47:02 -0400
"Eric S. Raymond" wrote:

> Gary E. Miller via devel :
> > So get to it! I'm trying to get action items checked off. This
> > was one that was checked off, validated, and working, until you
> > reverted it.
>
> Would have failed in 2036.

Prove it. I have already sent you a detailed analysis, you keep saying
no, but refuse to defend your point.

I can accept an "I don't know yet", but if you keep saying it is wrong I
expect you to be able to prove your point.
"Delay is the cruelest form of denial". > Probably the existing code would have, > too, but still. You're not going to push me into moving faster than > I am confident is safe. Not pushing you to be faster, just pushing you to move this up a bit on your priority list. It is blocking things high up on my priority list. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Wed Apr 26 00:13:31 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 17:13:31 -0700 Subject: Pivoting In-Reply-To: <20170425234825.GB12297@thyrsus.com> References: <20170425160334.42ed466c@spidey.rellim.com> <20170425233217.260A0406063@ip-64-139-1-69.sjc.megapath.net> <20170425164024.3c948def@spidey.rellim.com> <20170425234825.GB12297@thyrsus.com> Message-ID: <20170425171331.30e54521@spidey.rellim.com> Yo Eric! On Tue, 25 Apr 2017 19:48:25 -0400 "Eric S. Raymond" wrote: > Gary E. Miller via devel : > > On Tue, 25 Apr 2017 16:32:17 -0700 > > Hal Murray wrote: > > > > > So the answer is simple. Not yet. All GPS receivers still have > > > roll over problems. > > > > Not so simple. If your GPS picks up GLONASS, Beidou, Galileo, etc. > > then it will not have the gps week rollover problem. > > Do they not use the same wek/second representation? No. GPS uses 10 bit weeks on the L1 C/A channel, which is what most consumer GPS use. The other systems use 13 bit weeks. > What's the > length of the era in those systems. Has to be finite... 13 bits of weeks. 8191 weeks. 157 years. 
RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Wed Apr 26 00:16:58 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 17:16:58 -0700 Subject: I think we can drop the Jupiter driver. In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 25 Apr 2017 19:35:36 EDT." <20170425233536.GA12113@thyrsus.com> Message-ID: <20170426001658.83347406063@ip-64-139-1-69.sjc.megapath.net> >> You could use it as an example of how to do it right. > Er, *what*? Fix the code. Do it right and cleanly. Add a comment pointing to the page with a full description of the whole mess. Have that page point back there as an example of how to fix the GPS week roll over problem. It's not as simple as if (week < xx) week += 1024 That only works for 20 years. You need something like: while (week < xx) week += 1024 -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Wed Apr 26 00:22:28 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 17:22:28 -0700 Subject: Pivoting Message-ID: <20170426002228.7FD46406063@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: > Prove it. I have already sent you a detailed analysis, you keep saying no, > but refuse to defend your point. > I can accept an "I don't know yet", but if you keep saying it is wrong I > expect you to be able to prove your point. I didn't follow your detailed analysis carefully, but I don't remember the key point. If there is no pivot in step_clock, where is it done? 
It all works if the system time is "close enough". Were you assuming that? I was. Eric isn't. -- These are my opinions. I hate spam. From gem at rellim.com Wed Apr 26 00:22:47 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 17:22:47 -0700 Subject: I think we can drop the Jupiter driver. In-Reply-To: <20170426001658.83347406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425233536.GA12113@thyrsus.com> <20170426001658.83347406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425172247.5c3709a4@spidey.rellim.com> Yo Hal! On Tue, 25 Apr 2017 17:16:58 -0700 Hal Murray via devel wrote: > >> You could use it as an example of how to do it right. > > Er, *what*? > > Fix the code. Do it right and cleanly. I'm all for that, but until we find a real Jupiter user we can't test, and are fixing it to no purpose. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Wed Apr 26 00:50:44 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 17:50:44 -0700 Subject: Pivoting In-Reply-To: <20170426002228.7FD46406063@ip-64-139-1-69.sjc.megapath.net> References: <20170426002228.7FD46406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425175027.6f11a0c6@spidey.rellim.com> Yo Hal! On Tue, 25 Apr 2017 17:22:28 -0700 Hal Murray wrote: > devel at ntpsec.org said: > > Prove it. I have already sent you a detailed analysis, you keep > > saying no, but refuse to defend your point. 
>
> > I can accept an "I don't know yet", but if you keep saying it is
> > wrong I expect you to be able to prove your point.
>
> I didn't follow your detailed analysis carefully, but I don't
> remember the key point.

Rather than repeating myself:

https://lists.ntpsec.org/pipermail/devel/2017-April/004179.html
https://lists.ntpsec.org/pipermail/devel/2017-April/004266.html

> If there is no pivot in step_clock, where is it done?

We are not talking about step_clock(), so that probably adds to the
confusion. We are talking about adj_systime(). That function never
changes the time by more than --panicgate, which is almost always
+/- 1,000 seconds. There is one odd patch, mode_ntpdate, that never calls
step_systime() for steps more than 128 ms.

> It all works if the system time is "close enough".

Yes, and adj_systime() is only called when system time is "close
enough". If the step is more than panicgate (usually 1,000 seconds),
then step_systime() or adj_systime() is called. Both those are also a
mess, but one (two?) messes at a time.

> Were you assuming that? I was.

I'm not assuming that, I know the code says that.

> Eric isn't.

I'll let Eric speak for Eric.

Given that the adjustment path is so tortured, I like the suggestion
someone made a while back to add a trace mode to the loopfilter code.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From fallenpegasus at gmail.com Wed Apr 26 00:51:40 2017
From: fallenpegasus at gmail.com (Mark Atwood)
Date: Wed, 26 Apr 2017 00:51:40 +0000
Subject: I think we can drop the Jupiter driver.
In-Reply-To: <20170425172247.5c3709a4@spidey.rellim.com> References: <20170425233536.GA12113@thyrsus.com> <20170426001658.83347406063@ip-64-139-1-69.sjc.megapath.net> <20170425172247.5c3709a4@spidey.rellim.com> Message-ID: I find it unlikely we're going to find a real Jupiter to test against, if nobody has raised this issue against NTP Classic. ..m On Tue, Apr 25, 2017 at 5:22 PM Gary E. Miller via devel wrote: > Yo Hal! > > On Tue, 25 Apr 2017 17:16:58 -0700 > Hal Murray via devel wrote: > > > >> You could use it as an example of how to do it right. > > > Er, *what*? > > > > Fix the code. Do it right and cleanly. > > I'm all for that, but until we find a real Jupiter user we can't test, > and are fixing it to no purpose. > > > > RGDS > GARY > --------------------------------------------------------------------------- > Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 > gem at rellim.com Tel:+1 541 382 8588 <(541)%20382-8588> > > Veritas liberabit vos. -- Quid est veritas? > "If you can?t measure it, you can?t improve it." - Lord Kelvin > _______________________________________________ > devel mailing list > devel at ntpsec.org > http://lists.ntpsec.org/mailman/listinfo/devel -- Mark Atwood http://about.me/markatwood +1-206-604-2198 SMS & Signal -------------- next part -------------- An HTML attachment was scrubbed... URL: From gem at rellim.com Wed Apr 26 00:54:56 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 17:54:56 -0700 Subject: Errors and warnings from head In-Reply-To: <20170425072449.EE0EB406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425072449.EE0EB406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425175456.730b80e7@spidey.rellim.com> Yo Hal! On Tue, 25 Apr 2017 00:24:49 -0700 Hal Murray wrote: > > Do you have --enable-warnings on? Known upstream Bison bug. > > I don't think so. 
I do have --enable-debug and --enable-debug-gdb > The yydebug error is conditional on DEBUG, but I've been using that > for ages without problems. (It used to default to on, right?) > > I might have missed the yyparse warning but that seems unlikely. The warning is new, and only happens on a few platforms. None of the main buildbot show that warning. I'm not sure of the proper work around yet. So many Bison bugs, so little time. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Wed Apr 26 00:57:00 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 17:57:00 -0700 Subject: Something is buggy with maxpoll... In-Reply-To: <20170425064520.B6433406063@ip-64-139-1-69.sjc.megapath.net> References: <87y3upghjh.fsf@Rainer.invalid> <20170425064520.B6433406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425175700.2d018bba@spidey.rellim.com> Yo Hal! On Mon, 24 Apr 2017 23:45:20 -0700 Hal Murray wrote: > Or use tcpdump and avoid another layer of possible confusion. I also looked at it with tcpdump, since it showed nothing unexpected I skipped mentioning it. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From fallenpegasus at gmail.com Wed Apr 26 00:57:09 2017 From: fallenpegasus at gmail.com (Mark Atwood) Date: Wed, 26 Apr 2017 00:57:09 +0000 Subject: ntpq vs new DNS In-Reply-To: <20170417061641.2B8F940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170417061641.2B8F940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Sun, Apr 16, 2017 at 11:16 PM Hal Murray wrote: > > Does anybody have any great ideas for how to take advantage of this extra > info? > I've been thinking that over for the past week, and I don't have a great idea on how to take advantage of that. ..m -- Mark Atwood http://about.me/markatwood +1-206-604-2198 SMS & Signal -------------- next part -------------- An HTML attachment was scrubbed... URL: From fallenpegasus at gmail.com Wed Apr 26 01:09:14 2017 From: fallenpegasus at gmail.com (Mark Atwood) Date: Wed, 26 Apr 2017 01:09:14 +0000 Subject: Does anybody have a sample of a NMEA device with the 1024 week bug? In-Reply-To: <20170416234630.505B340605C@ip-64-139-1-69.sjc.megapath.net> References: <20170416234630.505B340605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: How hard is it to write a fake NMEA device simulator in Python, and use Linux IPC control magic to have it appear at a fake serial device? ..m On Sun, Apr 16, 2017 at 4:46 PM Hal Murray wrote: > > I'd like to get one for testing. > > > -- > These are my opinions. I hate spam. > > > > _______________________________________________ > devel mailing list > devel at ntpsec.org > http://lists.ntpsec.org/mailman/listinfo/devel > -- Mark Atwood http://about.me/markatwood +1-206-604-2198 SMS & Signal -------------- next part -------------- An HTML attachment was scrubbed... URL: From gem at rellim.com Wed Apr 26 01:17:38 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 18:17:38 -0700 Subject: Does anybody have a sample of a NMEA device with the 1024 week bug? 
In-Reply-To: References: <20170416234630.505B340605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425181738.4cc9bcae@spidey.rellim.com> Yo Mark! On Wed, 26 Apr 2017 01:09:14 +0000 Mark Atwood via devel wrote: > How hard is it to write a fake NMEA device simulator in Python, and > use Linux IPC control magic to have it appear at a fake serial device? Not how hard, but who wants that at the top of their list. I think you have some free time now? All for an obsolete device that we do not think anyone has? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From trv-n at comcast.net Wed Apr 26 01:49:26 2017 From: trv-n at comcast.net (Trevor N.) Date: Tue, 25 Apr 2017 21:49:26 -0400 Subject: I think we can drop the Jupiter driver. Message-ID: I created a loop like that for the Trimble driver. The algorithm is pretty simple so I'm probably missing something; please check out the merge request I made a few days ago. I still need to test the changes with my simulator and with ntpd started with date offsets. >Hal Murray hmurray at megapathdsl.net >Wed Apr 26 00:16:58 UTC 2017 > >>> You could use it as an example of how to do it right. >> Er, *what*? > >Fix the code. Do it right and cleanly. Add a comment pointing to the page >with a full description of the whole mess. Have that page point back there >as an example of how to fix the GPS week roll over problem. > >It's not as simple as > if (week < xx) week += 1024 >That only works for 20 years. 
You need something like: > while (week < xx) week += 1024 From esr at thyrsus.com Wed Apr 26 03:28:54 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 23:28:54 -0400 Subject: Pivoting In-Reply-To: <20170425171331.30e54521@spidey.rellim.com> References: <20170425160334.42ed466c@spidey.rellim.com> <20170425233217.260A0406063@ip-64-139-1-69.sjc.megapath.net> <20170425164024.3c948def@spidey.rellim.com> <20170425234825.GB12297@thyrsus.com> <20170425171331.30e54521@spidey.rellim.com> Message-ID: <20170426032854.GA14901@thyrsus.com> Gary E. Miller via devel : > No. GPS uses 10 bit weeks on the L1 C/A channel, which is what > most consumer GPS use. The other systems use 13 bit weeks. Ah, that is the crucial bit on information I lacked. Thanks. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From esr at thyrsus.com Wed Apr 26 03:30:09 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 23:30:09 -0400 Subject: I think we can drop the Jupiter driver. In-Reply-To: References: <20170425233536.GA12113@thyrsus.com> <20170426001658.83347406063@ip-64-139-1-69.sjc.megapath.net> <20170425172247.5c3709a4@spidey.rellim.com> Message-ID: <20170426033009.GB14901@thyrsus.com> Mark Atwood via devel : > I find it unlikely we're going to find a real Jupiter to test against, if > nobody has raised this issue against NTP Classic. I do, too. Seventeen years! *plonk* -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. 
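A minimal sketch of the week-rollover fix Hal describes earlier in this thread, written in Python for brevity rather than in the driver's C. The names `unroll_gps_week` and `floor_week` are hypothetical; `floor_week` stands in for the "xx" in Hal's pseudocode, for example the GPS week of the build date. This is not the code from any NTPsec driver.

```python
GPS_WEEK_CYCLE = 1024  # the 10-bit week field wraps every 1024 weeks


def unroll_gps_week(week, floor_week):
    """Advance a truncated GPS week by whole cycles until it is >= floor_week.

    A single `if` only repairs one rollover; the loop keeps adding cycles,
    so it still works once more than one rollover has passed.
    """
    while week < floor_week:
        week += GPS_WEEK_CYCLE
    return week
```

With a floor of week 1945, a receiver reporting week 974 comes back as 1998 (one cycle added), and one reporting week 52 comes back as 2100 (two cycles added), which a single `if` would have left at 1076.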
From esr at thyrsus.com Wed Apr 26 03:44:44 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 25 Apr 2017 23:44:44 -0400 Subject: I think we can drop the Jupiter driver. In-Reply-To: <20170426001658.83347406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425233536.GA12113@thyrsus.com> <20170426001658.83347406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426034444.GA15005@thyrsus.com> Hal Murray : > > >> You could use it as an example of how to do it right. > > Er, *what*? > > Fix the code. Do it right and cleanly. Add a comment pointing to the page > with a full description of the whole mess. Have that page point back there > as an example of how to fix the GPS week roll over problem. > > It's not as simple as > if (week < xx) week += 1024 > That only works for 20 years. You need something like: > while (week < xx) week += 1024 Sure, we could do it. And the point would be? I'd rather put in that effort on a device that has been ... you know ... *used in this century*? :-) No, I'm not mocking you personally, Hal. I just feel suddenly like I've been dropped into the middle of an extremely nerdy comedy sketch. Seventeen years. Nobody noticed. #@!#!&&!!** 'E's not pinin'! 'E's passed on! This driver is no more! Deployment has ceased to be! 'E's expired and gone to meet 'is maker! 'E's a stiff! Bereft of users, 'e rests in peace! If NTF hadn't nailed 'im into the codebase 'e'd be pushing up daisies! 'Is metabolic processes are now 'istory! 'E's off the twig! 'E's kicked the bucket, 'e's shuffled off 'is mortal coil, run down the curtain and joined the bleedin' choir invisibule!! THIS IS AN EX-DRIVER!! -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From trv-n at comcast.net Wed Apr 26 05:52:46 2017 From: trv-n at comcast.net (Trevor N.) 
Date: Wed, 26 Apr 2017 01:52:46 -0400 Subject: Does anybody have a sample of a NMEA device with the 1024 week bug? Message-ID: I have a device that will rollover after week 1998 (in 2018) that I just tested with a GPS simulator set to 2 years in the future, attached to ntpd classic with a +2 year offset in get_ostime (and "disable ntp" in conf) and ntpcal_get_build_date() of 2016. The 512-week-around-receive-timestamp code in the driver fixed the timecode timestamp from 1999-09-10 to the proper date. >Gary E. Miller gem at rellim.com wrote: > >Yo Mark! > >On Wed, 26 Apr 2017 01:09:14 +0000 >Mark Atwood via devel wrote: > >> How hard is it to write a fake NMEA device simulator in Python, and >> use Linux IPC control magic to have it appear at a fake serial device? > >Not how hard, but who wants that at the top of their list. I think you >have some free time now? > >All for an obsolete device that we do not think anyone has? > > > >RGDS >GARY From hmurray at megapathdsl.net Wed Apr 26 06:12:33 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 23:12:33 -0700 Subject: Does anybody have a sample of a NMEA device with the 1024 week bug? In-Reply-To: Message from Mark Atwood of "Wed, 26 Apr 2017 01:09:14 -0000." Message-ID: <20170426061233.2C79E406063@ip-64-139-1-69.sjc.megapath.net> fallenpegasus at gmail.com said: > How hard is it to write a fake NMEA device simulator in Python, and use > Linux IPC control magic to have it appear at a fake serial device? I don't know anything about IPC, but I do know how to connect 2 PCs with a crossover cable. USB would be good enough so you don't actually need a second PC. Gary said: > All for an obsolete device that we do not think anyone has? No, to test some code, and check a box that Eric thinks is important. -- These are my opinions. I hate spam. 
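As a rough answer to Mark's question, here is a sketch of the pieces a fake NMEA source with the 1024-week bug might need, assuming a Unix pseudo-terminal is an acceptable stand-in for a serial port. All function names are hypothetical, the position in the sentence is made up, and whether ntpd's NMEA driver is happy reading from a pty is an assumption that has not been tested here.

```python
import datetime
import os
import pty


def nmea_sentence(payload):
    """Wrap payload as $payload*CS, where CS is the XOR of the payload bytes."""
    checksum = 0
    for ch in payload:
        checksum ^= ord(ch)
    return "$%s*%02X\r\n" % (payload, checksum)


def rolled_back(when, cycles=1):
    """Date and time a receiver with the 1024-week bug would report."""
    return when - datetime.timedelta(weeks=1024 * cycles)


def rmc_for(when):
    """Build a GPRMC sentence for a UTC datetime (fixed, invented position)."""
    payload = "GPRMC,%s,A,4404.000,N,12118.000,W,0.0,0.0,%s,," % (
        when.strftime("%H%M%S"), when.strftime("%d%m%y"))
    return nmea_sentence(payload)


def open_fake_serial():
    """Return (master_fd, slave_path); the consumer opens slave_path."""
    master_fd, slave_fd = pty.openpty()
    return master_fd, os.ttyname(slave_fd)
```

A driver loop would call `open_fake_serial()`, link the returned path to whatever device name the consumer expects, and once a second write `rmc_for(rolled_back(now)).encode("ascii")` to the master descriptor with `os.write()`. A pty carries no PPS signal, so this only exercises the date handling.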
From hmurray at megapathdsl.net Wed Apr 26 06:54:21 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 25 Apr 2017 23:54:21 -0700 Subject: Pivoting (or rather missing pivot) Message-ID: <20170426065421.7B13C406063@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: > Rather than repeating myself: > https://lists.ntpsec.org/pipermail/devel/2017-April/004179.html Thanks. Here is the key part: > step - double seconds, range usually +/1 1ms, 1us or 1ns. > never larger than 'gate', but for arguement assume > it could be as large as 1Jan2200 - 1Jan1970 It can be (and often is) larger than gate on the first adjustment when run with the -g switch. 2200-1970 is 230. That doesn't fit into a l_fp so you can't possibly get a step size that big. Consider what happens when the local time is 1970 and the remote server time is 2037. That's the case Eric is interested in. The 2037 gets truncated so the remote server time comes back as 1901. 1901-1970 is -69 rather than +67. 1901 is obviously earlier than the build EPOCH, so bumping step by 1<<32 adds 136 years and gets the right answer. If our time were "close enough", say 2000, then 1901-2000 is -99 which overflows/underflows and turns into +37. -- These are my opinions. I hate spam. From gem at rellim.com Wed Apr 26 06:58:21 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 25 Apr 2017 23:58:21 -0700 Subject: Pivoting (or rather missing pivot) In-Reply-To: <20170426065421.7B13C406063@ip-64-139-1-69.sjc.megapath.net> References: <20170426065421.7B13C406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170425235821.7645f71a@spidey.rellim.com> Yo Hal! On Tue, 25 Apr 2017 23:54:21 -0700 Hal Murray wrote: > devel at ntpsec.org said: > > Rather than repeating myself: > > https://lists.ntpsec.org/pipermail/devel/2017-April/004179.html > > Thanks. Here is the key part: > > step - double seconds, range usually +/1 1ms, 1us or 1ns. 
> > never larger than 'gate', but for arguement assume > > it could be as large as 1Jan2200 - 1Jan1970 > > It can be (and often is) larger than gate on the first adjustment > when run with the -g switch. > > 2200-1970 is 230. That doesn't fit into a l_fp so you can't possibly > get a step size that big. Yes, but that is a different code path. > The 2037 gets truncated so the remote server time comes back as > 1901. 1901-1970 is -69 rather than +67. Different code path. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Wed Apr 26 07:05:11 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 26 Apr 2017 00:05:11 -0700 Subject: Work item list: dumping In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 25 Apr 2017 19:13:25 EDT." <20170425231325.D248513A021A@snark.thyrsus.com> Message-ID: <20170426070511.848EE406063@ip-64-139-1-69.sjc.megapath.net> Eric said: > Possible work item (me): Change Mode 6 so it no longer ships system status > bits in hex, decoding them into a string token list instead, old behavior > preserved under ENABLE_CLASSIC_MODE. This is because the PLL* definitions > aren't guaranteed to be consistent across different system versions. > Work item: New Mode 6 responses that unpack status bits into list of string > tags for the status bits. Avoids potential problem with PLL bits (outside > our control) changing due to kernel mods. I think the first should be dropped in favor of the second. I don't see how to make ENABLE_CLASSIC_MODE help with this problem. 
There are 4 possible combinations of server and ntpq with and without
ENABLE_CLASSIC_MODE. We want all of them to work. I think the new
send-string case works if it is sent with a different tag in addition to
the old hex mode.

I agree that it is possible for different PLL implementations to assign
different bit values to flags, but none of them have done it yet and it
would cause all sorts of confusion. I'd be happy to add a warning at build
time (or require a configuration option to get past a fatal error).

--
These are my opinions. I hate spam.

From gem at rellim.com Wed Apr 26 07:06:44 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 26 Apr 2017 00:06:44 -0700
Subject: From: Hal Murray via devel
In-Reply-To: <20170426062923.C8C30406063@ip-64-139-1-69.sjc.megapath.net>
References: <20170426062923.C8C30406063@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170426000629.592e6529@spidey.rellim.com>

Yo Hal!

On Tue, 25 Apr 2017 23:29:23 -0700
Hal Murray wrote:

> What's the deal on this? It's new as of yesterday or today.

Yup.

> Is that a manual configuration step or something that mailman does
> automagically?

Manual config

> Please cc devel and/or other lists as appropriate.

Only applies to devel@.

This is a consequence of SPF. A lot of devel@ email is bounced as
undeliverable. You have seen this when your email was removed from devel@
due to too many bounces.

Many people have their email set up, for very good reasons, to only be
accepted when it comes from their own email servers. When mailman pretends
to be the original sender of email, which is configured to allow from
ntpsec.org, then the recipients reject it.

This has become a large problem for ntpsec, as it has for other mailing
lists. The only real solution is for email from devel@ to be seen as from
devel@, and not pretending to be from the original sender, when it is not.

Already the bounced email from devel@ has gone down. This is a good thing.
If you look around, you will find that most large mailman sites, like
nanog, have already been doing this for a while.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net Wed Apr 26 07:10:36 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 26 Apr 2017 00:10:36 -0700
Subject: Work item list: l_fp_time and l_fp_offset
In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 25 Apr 2017 19:13:25 EDT." <20170425231325.D248513A021A@snark.thyrsus.com>
Message-ID: <20170426071036.33042406063@ip-64-139-1-69.sjc.megapath.net>

devel at ntpsec.org said:
> Work item: Replace
>   typedef uint64_t l_fp;
> with
>   typedef uint64_t l_fp_time;
>   typedef int64_t l_fp_offset;

I think that should be rejected in favor of eliminating l_fp except at the
very edge and doing the pivot at the edge.

--
These are my opinions. I hate spam.

From hmurray at megapathdsl.net Wed Apr 26 07:13:03 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 26 Apr 2017 00:13:03 -0700
Subject: Work item list: lfpfloat to Python
In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 25 Apr 2017 19:13:25 EDT." <20170425231325.D248513A021A@snark.thyrsus.com>
Message-ID: <20170426071303.A68F5406063@ip-64-139-1-69.sjc.megapath.net>

Eric said:
> Work item: The lfpfloat() function in the Python interface may be
> dispensable. Convert the hex literal to float and divide by 1<<32 to scale
> to seconds since the NTP epoch.

It also needs to do a pivot and/or have an option not to pivot.

--
These are my opinions. I hate spam.
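To make the work item and Hal's caveat concrete, here is a hedged sketch of such a conversion with an optional pivot. It is not the existing lfpfloat() from the NTPsec Python interface: the name `lfp_to_float`, the `pivot` parameter, and the assumed input form (a hex string with eight fraction digits after an optional dot) are all illustrative assumptions.

```python
import math

NTP_ERA = 1 << 32  # seconds in one NTP era, roughly 136 years


def lfp_to_float(hexstr, pivot=None):
    """Convert a 64-bit l_fp hex literal to float seconds since the NTP epoch.

    Without a pivot the result is the raw era-0 value in [0, 2**32).
    With a pivot (also in seconds since the NTP epoch) the value is moved
    by whole eras so that it lands within half an era of the pivot.
    """
    raw = int(hexstr.replace(".", ""), 16)
    secs = (raw >> 32) + (raw & 0xFFFFFFFF) / float(1 << 32)
    if pivot is not None:
        eras = math.floor((pivot - secs) / NTP_ERA + 0.5)
        secs += eras * NTP_ERA
    return secs
```

Passing `pivot=None` gives the plain divide-by-1<<32 behavior from the work item; passing the current time (or a build date) as the pivot resolves the era ambiguity. The result is a double, so the lowest bits of the 32-bit fraction are lost.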
From hmurray at megapathdsl.net Wed Apr 26 07:40:52 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 26 Apr 2017 00:40:52 -0700
Subject: From: Hal Murray via devel
Message-ID: <20170426074052.3F138406063@ip-64-139-1-69.sjc.megapath.net>

Thanks.

> Manual config

Did you enable this for all senders or only selected domains?

> This is a consequence of SPF. A lot of devel@ email is bounced as
> undeliverable. You have seen this when your email was removed from devel@
> due to too many bounces.

I thought the normal approach was to do the From rewriting only on domains
that have SPF (and/or DMARC). Doing it when it isn't necessary screws up
reply formatting.

John Levine, blowing off (lots of) steam:
  Yahoo breaks every mailing list in the world including the IETF's
  https://www.ietf.org/mail-archive/web/ietf/current/msg87153.html
  Date: 7 Apr 2014 20:11:04 -0000

I don't think my mail bounced because of SPF. It bounced because Megapath
has a screwed up spam filter and/or I haven't found a way to whitelist by IP
Address. I'm currently getting mail direct to a second address that I
control and have this one set to no-mail so Megapath doesn't get a chance to
bounce any list traffic. Nobody has complained about bounces of non-list
traffic including copies of list traffic.

--
These are my opinions. I hate spam.

From hmurray at megapathdsl.net Wed Apr 26 07:52:27 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 26 Apr 2017 00:52:27 -0700
Subject: Pivoting (or rather missing pivot)
Message-ID: <20170426075227.72965406063@ip-64-139-1-69.sjc.megapath.net>

> Different code path.

OK, I'm missing something.

Eric restored the pivot code that we don't like to step_systime().

How does the first step happen without going through there? And/or how does
the first step get turned into the right size double without a pivot, and/or
where does that pivot happen?
From your second message: > Yes, but when we subtract from our local time, truncated to an l_fp, and 2s > complement, we end up with a delta on local time. Once we get running, > any delta past 'gate' is thrown away. That gate, by default is just 1,000 > seconds. > So, after the first big correction, we KNOW the delta is 1,000 seconds or > less and 2s complement arithmetic over the epoch rollover is fine. The question is how does the first big correction work? Your analysis covers the no-big-step case. I called that: system time is "close enough". -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Wed Apr 26 08:05:07 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 26 Apr 2017 01:05:07 -0700 Subject: Work item list: verify/fix ntpq retransmissions In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 25 Apr 2017 19:13:25 EDT." <20170425231325.D248513A021A@snark.thyrsus.com> Message-ID: <20170426080507.869D1406063@ip-64-139-1-69.sjc.megapath.net> I don't think they are right. I get occasional timeouts/crashes but haven't had time to investigate. Possibly the logic is correct and we need to bump the retry count. Has anybody else noticed troubles? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Wed Apr 26 08:23:47 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 26 Apr 2017 01:23:47 -0700 Subject: Work item list: finish DNS work In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 25 Apr 2017 19:13:25 EDT." <20170425231325.D248513A021A@snark.thyrsus.com> Message-ID: <20170426082347.706BA406063@ip-64-139-1-69.sjc.megapath.net> I need a day or two without distractions (like this) to finish things. Mostly, get good backoff on the DNS not working case. -- These are my opinions. I hate spam. From esr at thyrsus.com Wed Apr 26 09:12:58 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Wed, 26 Apr 2017 05:12:58 -0400 Subject: Work item list: l_fp_time and l_fp_offset In-Reply-To: <20170426071036.33042406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425231325.D248513A021A@snark.thyrsus.com> <20170426071036.33042406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426091258.GA17833@thyrsus.com> Hal Murray : > > devel at ntpsec.org said: > > Work item: Replace > > typedef uint64_t l_fp; > > with > > typedef uint64_t l_fp_time; > > typedef int64_t l_fp_offset; > > I think that should be rejected in favor of eliminating l_fp except at the > very edge and doing the pivot at the edge. OK, I can agree with that. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Wed Apr 26 09:14:21 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 26 Apr 2017 05:14:21 -0400 Subject: Work item list: verify/fix ntpq retransmissions In-Reply-To: <20170426080507.869D1406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425231325.D248513A021A@snark.thyrsus.com> <20170426080507.869D1406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426091421.GB17833@thyrsus.com> Hal Murray : > > I don't think they are right. I get occasional timeouts/crashes but haven't > had time to investigate. Possibly the logic is correct and we need to bump > the retry count. > > Has anybody else noticed troubles? I have never seen this problem. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Wed Apr 26 09:15:44 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 26 Apr 2017 05:15:44 -0400 Subject: Work item list: dumping In-Reply-To: <20170426070511.848EE406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425231325.D248513A021A@snark.thyrsus.com> <20170426070511.848EE406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426091544.GC17833@thyrsus.com> Hal Murray : > > Eric said: > > Possible work item (me): Change Mode 6 so it no longer ships system status > > bits in hex, decoding them into a string token list instead, old behavior > > preserved under ENABLE_CLASSIC_MODE. This is because the PLL* definitions > > aren't guaranteed to be consistent across different system versions. > > > Work item: New Mode 6 responses that unpack status bits into list of string > > tags for the status bits. Avoids potential problem with PLL bits (outside > > our control) changing due to kernel mods. > > I think the first should be dropped in favor of the second. > > I don't see how to make ENABLE_CLASSIC_MODE help with this problem. There > are 4 possible combinations of server and ntpq with and without > ENABLE_CLASSIC_MODE. We want all of them to work. > > I think the new send-string case works if it is sent with a different tag in > addition to the old hex mode. > > I agree that it is possible for different PLL implementations to assign > different bit values to flags, but none of them have done it yet and it would > cause all sorts of confusion. I'd be happy to add a warning at build time. > (or require a configuration option to get past a fatal error) Your amendments seem completely reasonable, OK. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From esr at thyrsus.com Wed Apr 26 09:25:06 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 26 Apr 2017 05:25:06 -0400 Subject: Does anybody have a sample of a NMEA device with the 1024 week bug? In-Reply-To: References: Message-ID: <20170426092506.GE17833@thyrsus.com> Trevor N. via devel : > I have a device that will rollover after week 1998 (in 2018) that I > just tested with a GPS simulator set to 2 years in the future, > attached to ntpd classic with a +2 year offset in get_ostime (and > "disable ntp" in conf) and ntpcal_get_build_date() of 2016. The > 512-week-around-receive-timestamp code in the driver fixed the > timecode timestamp from 1999-09-10 to the proper date. Oh, that's good. Please test with our code, if that's convenient. -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From hmurray at megapathdsl.net Wed Apr 26 10:33:14 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 26 Apr 2017 03:33:14 -0700 Subject: Documentation request/opportunity In-Reply-To: Message from "Eric S. Raymond" of "Tue, 25 Apr 2017 08:54:19 EDT." <20170425125419.GA2157@thyrsus.com> Message-ID: <20170426103314.2CDC8406063@ip-64-139-1-69.sjc.megapath.net> [What packet types do we support?] man ntp.conf says: peer NTP peer mode has been removed for security reasons. peer is now just an alias for the server keyword. See above. It's not clear what "has been removed" means. Just the server setup command or support for the packet types. I added a peer line to an instance of ntp classic pointing at a server running ntpsec. It gets a respnose. mrulist says it's sending mode 1 and receiving mode 2. -- These are my opinions. I hate spam. From esr at thyrsus.com Wed Apr 26 10:36:31 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 26 Apr 2017 06:36:31 -0400 Subject: Documentation request/opportunity In-Reply-To: <20170426103314.2CDC8406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425125419.GA2157@thyrsus.com> <20170426103314.2CDC8406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426103631.GA19172@thyrsus.com> Hal Murray : > [What packet types do we support?] > > man ntp.conf says: > peer > NTP peer mode has been removed for security reasons. peer is now > just an alias for the server keyword. See above. > > It's not clear what "has been removed" means. Just the server setup command > or support for the packet types. > > I added a peer line to an instance of ntp classic pointing at a server > running ntpsec. It gets a respnose. mrulist says it's sending mode 1 and > receiving mode 2. If I answer this I might garble it. Daniel? -- Eric S. Raymond Please consider contributing to my Patreon page at https://www.patreon.com/esr so I can keep the invisible wheels of the Internet turning. Give generously - the civilization you save might be your own. From fallenpegasus at gmail.com Wed Apr 26 17:42:59 2017 From: fallenpegasus at gmail.com (Mark Atwood) Date: Wed, 26 Apr 2017 17:42:59 +0000 Subject: blog/_drafts/gps-pivot.ad Message-ID: Hi! I just created blog/_drafts/gps-pivot.ad It's a lightly edited transcript of a conversation about GPS rollover and pivoting. I'm going to work it into a blog post. Anyone with a good grasp of the GPS rollover issues is invited to work on it as well. Thanks! ..m -- Mark Atwood http://about.me/markatwood +1-206-604-2198 SMS & Signal -------------- next part -------------- An HTML attachment was scrubbed... URL: From gem at rellim.com Wed Apr 26 18:22:10 2017 From: gem at rellim.com (Gary E. 
Miller) Date: Wed, 26 Apr 2017 11:22:10 -0700 Subject: Documentation request/opportunity In-Reply-To: <20170426103314.2CDC8406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425125419.GA2157@thyrsus.com> <20170426103314.2CDC8406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426112210.4bdbd2e2@spidey.rellim.com> Yo Hal! On Wed, 26 Apr 2017 03:33:14 -0700 Hal Murray via devel wrote: > [What packet types do we support?] > > man ntp.conf says: > peer > NTP peer mode has been removed for security reasons. peer > is now just an alias for the server keyword. See above. > > It's not clear what "has been removed" means. Just the server setup > command or support for the packet types. Confusing statement. "peer mode" has been removed. Any "peer" config in ntp.conf is now an alias for "server". Then ntpd contacts that chimer as a client, not a peer. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Wed Apr 26 18:26:42 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 26 Apr 2017 11:26:42 -0700 Subject: blog/_drafts/gps-pivot.ad Message-ID: <20170426182642.B91FF406063@ip-64-139-1-69.sjc.megapath.net> Mark Atwood said: > I'm going to work it into a blog post. Anyone with a good grasp of the GPS > rollover issues is invited to work on it as well. It's not just a GPS issue. Unix/POSIX has the same problem with time_t It rolls over into the sign bit in 2038 NTP packet format has the same problem too. It overflows an unsigned 32 bit field in 2036 -- These are my opinions. I hate spam.
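The rollover years mentioned in the message above follow directly from the field widths and epochs. As a rough sketch (the function names are made up for this example, and the GPS figures assume the classic 10-bit week counter starting at 1980-01-06 on the GPS time scale, which ignores leap seconds):

```python
from datetime import datetime, timedelta

UNIX_EPOCH = datetime(1970, 1, 1)  # UTC
NTP_EPOCH = datetime(1900, 1, 1)   # UTC
GPS_EPOCH = datetime(1980, 1, 6)   # GPS time scale, no leap seconds


def time_t_sign_rollover():
    """First instant a signed 32-bit time_t can no longer represent."""
    return UNIX_EPOCH + timedelta(seconds=2**31)


def ntp_era_rollover():
    """Instant the unsigned 32-bit NTP seconds field wraps to zero."""
    return NTP_EPOCH + timedelta(seconds=2**32)


def gps_week_rollover(n):
    """Start of the n-th 1024-week GPS era after the GPS epoch."""
    return GPS_EPOCH + timedelta(weeks=1024 * n)


if __name__ == "__main__":
    print("time_t (signed 32-bit):", time_t_sign_rollover())
    print("NTP seconds field:     ", ntp_era_rollover())
    print("GPS week counter:      ", gps_week_rollover(1), gps_week_rollover(2))
```

The same arithmetic applies to any fixed-width counter: the wrap point is the epoch plus the counter's range times its unit.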
From gem at rellim.com Wed Apr 26 18:38:50 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 26 Apr 2017 11:38:50 -0700 Subject: Pivoting (or rather missing pivot) In-Reply-To: <20170426075227.72965406063@ip-64-139-1-69.sjc.megapath.net> References: <20170426075227.72965406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426113850.2ac8128f@spidey.rellim.com> Yo Hal! On Wed, 26 Apr 2017 00:52:27 -0700 Hal Murray wrote: > > Different code path. > > OK, I'm missing something. Many people are. We're just not sure which ones. :-) > Eric restored the pivot code that we don't like to step_systime() Yes. > How does the first step happen without going through there? Wrong question, it does not matter even if it did. step_systime() takes the amount to step as a double. Any pivot info is long gone. step_systime() just adds step to the current sys clock time and then sets that time. But in a very long and obfuscated way. > and/or > how does the first step get turned into the right size double without > a pivot and/or where does that pivot happen? Unrelated to this issue. That happens in ntpd/ntp_loopfilter.c So a question for another thread. Let's not continue to get lost bouncing around. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Wed Apr 26 18:40:08 2017 From: gem at rellim.com (Gary E.
Miller) Date: Wed, 26 Apr 2017 11:40:08 -0700 Subject: blog/_drafts/gps-pivot.ad In-Reply-To: <20170426182642.B91FF406063@ip-64-139-1-69.sjc.megapath.net> References: <20170426182642.B91FF406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426114008.1ca180f8@spidey.rellim.com> Yo Hal! On Wed, 26 Apr 2017 11:26:42 -0700 Hal Murray via devel wrote: > Unix/POSIX has the same problem with time_t > It rolls over into the sign bit in 2038 Only for some. Not an issue with 64 bit Linux. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Wed Apr 26 18:50:56 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 26 Apr 2017 11:50:56 -0700 Subject: Pivoting (or rather missing pivot) Message-ID: <20170426185056.6AE29406063@ip-64-139-1-69.sjc.megapath.net> Gary said: >> and/or >> how does the first step get turned into the right size double without >> a pivot and/or where does that pivot happen? > Unrelated to this issue. That happens in ntpd/ntp_loopfilter.c So a question > for another thread. Let's not continue to get lost bouncing around. That's the whole point. ntp_loopfilter.c calls step_systime() Your previous message said: Different code path. Where is that path? Where does it do a pivot on the way to step_systime? -- These are my opinions. I hate spam. From gem at rellim.com Wed Apr 26 18:53:07 2017 From: gem at rellim.com (Gary E. 
Miller) Date: Wed, 26 Apr 2017 11:53:07 -0700 Subject: From: Hal Murray via devel In-Reply-To: <20170426074052.3F138406063@ip-64-139-1-69.sjc.megapath.net> References: <20170426074052.3F138406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426115307.09fc6d57@spidey.rellim.com> Yo Hal! On Wed, 26 Apr 2017 00:40:52 -0700 Hal Murray wrote: > > Manual config > > Did you enable this for all senders or only selected domains? ALL senders, not an option by sender. > > This is a consequence of SPF. A lot of devel@ email is bounced as > > undeliverable. You have seen this when your email was removed from > > devel@ due to too many bounces. > > I thought the normal approach was to do the From rewriting only on > domains that have SPF. (and/or DMARC) Mailman does not know that I do, or do not have spf on rellim.com. And it would be very weird if mailman keeps bouncing around between modes. That would make phishing easier. > Doing it when it isn't necessary screws up reply formatting. It IS necessary. Sometimes 50% of devel@ email is bouncing. Many users keep getting punted from the list due to excessive rejects. > John Levine, blowing off (lots of) steam: > Yahoo breaks every mailing list in the world including the IETF's > https://www.ietf.org/mail-archive/web/ietf/current/msg87153.html > Date: 7 Apr 2014 20:11:04 -0000 Yup. Yahoo and gmail, the big ones. Can't fight them. > I don't think my mail bounced because of SPF. It bounced because > Megapath has a screwed up spam filter and/or I haven't found a way to > whitelist by IP Address. So, is it better or worse now with your megapath? > I'm currently getting mail direct to a second address that I control > and have this one set to no-mail so Megapath doesn't get a chance to > bounce any list traffic. Nobody has complained about bounces of > non-list traffic including copies of list traffic. "nobody"? How about I add you as a list admin and you can see how many nobodies we have?
Fully 50% of my posts, with SPF, were getting bounced some days. I had DMARC reports to prove it. So maybe this particular setting is not optimal, but something had to change, the situation was bad. I'm willing to try other mailman settings. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Wed Apr 26 18:57:02 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 26 Apr 2017 11:57:02 -0700 Subject: Pivoting (or rather missing pivot) In-Reply-To: <20170426185056.6AE29406063@ip-64-139-1-69.sjc.megapath.net> References: <20170426185056.6AE29406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426115702.043b5799@spidey.rellim.com> Yo Hal! On Wed, 26 Apr 2017 11:50:56 -0700 Hal Murray wrote: > Gary said: > >> and/or > >> how does the first step get turned into the right size double > >> without a pivot and/or where does that pivot happen? > > Unrelated to this issue. That happens in ntpd/ntp_loopfilter.c So a > > question for another thread. Let's not continue to get lost > > bouncing around. > > That's the whole point. ntp_loopfilter.c calls step_systime() > > Your previous message said: > Different code path. > > Where is that path? > Where does it do a pivot on the way to step_systime? Stuck in a time warp again. Not relevant to the bad reversion. I'm not gonna go down that rabbit hole until we get this very narrow problem dealt with. We have a well bounded problem with 2 inputs and 1 output. No need for external info to know if this function does what it is supposed to do: step the clock by a double.
RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Wed Apr 26 19:02:39 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 26 Apr 2017 12:02:39 -0700 Subject: Work item list: lfpfloat to Python In-Reply-To: <20170426071303.A68F5406063@ip-64-139-1-69.sjc.megapath.net> References: <20170425231325.D248513A021A@snark.thyrsus.com> <20170426071303.A68F5406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426120239.77b4a0a4@spidey.rellim.com> Yo Hal! On Wed, 26 Apr 2017 00:13:03 -0700 Hal Murray via devel wrote: > Eric said: > > Work item: The lfpfloat() function in the Python interface may be > > dispensible. Convert the hex literal to float and divide by 1<<32 > > to scale to seconds since the NTP epoch. > > It also needs to do a pivot and/or have an option not to pivot. It needs to be pivoted earlier. When you put a timestamp l_fp into a double you lose too much precision. Only l_fp offsets should be converted to doubles. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From Stromeko at nexgo.de Wed Apr 26 19:19:04 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Wed, 26 Apr 2017 21:19:04 +0200 Subject: Pivoting References: <20170425210525.GA9918@thyrsus.com> <20170425211246.27D93406063@ip-64-139-1-69.sjc.megapath.net> <20170425215238.GA10555@thyrsus.com> Message-ID: <87h91bugd3.fsf@Rainer.invalid> Eric S. Raymond writes: > We can't fix *that*, either. All we can do is add a warning to the man pages > for NMEA-related drivers that if your GPS is older than 1024 weeks you may be > cruising for a bruising. I think I'll go do that. Well, the more I think about it, I'd say we should rip out any half-baked stopgaps from ntpd and just document that a) the system time at startup of ntpd needs to be reasonably close to true time (exact numbers for "reasonable" vary with whether time stepping is allowed or not) and b) any primary reference clock must not be lying about the time either. Unless the two conditions are satisfied ntpd can not be expected to converge to the correct time. Some of the stopgap measures might be pulled out into additional utilities that can be used to test or fulfill the two conditions. There may be further conditions past the two I mentioned. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ DIY Stuff: http://Synth.Stromeko.net/DIY.html From Stromeko at nexgo.de Wed Apr 26 19:43:03 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Wed, 26 Apr 2017 21:43:03 +0200 Subject: Temperature Controlled rasPi 3B References: <87pog0hj1v.fsf@Rainer.invalid> <20170425153036.54674cb3@spidey.rellim.com> Message-ID: <87d1bzuf94.fsf@Rainer.invalid> Gary E. Miller via devel writes: > Cool. Did you use ntpheat? No, I've simply extended the Perl script that logs all PPS timestamps (this also ensures that all the measurements are aligned with the loopstats).
>> I still have to add an integral >> term to the control loop in order to make the residual zero > > ntpheatusb already has a full PID controller. Did you look at that? No again, I avoid looking at Python as much as possible. > If you do so, could you send us patches for ntpheat? See above. It's really nothing spectacular, so I don't expect it will be difficult to translate, however. > I found that keeping the CPU chip temp stable was less important than > keeping ambient stable. The XTAL is on the other side of the PCB > from the CPU. I know. I have described the exact setup in excruciating detail to you before, so I'm not going to repeat why the CPU temperature can be used as a proxy for the ambient in that case. Remember that this is about getting the best possible performance out of the rasPi for exactly zero cost above the rasPi and GPS itself. That journey has come a lot further than I first hoped, and I will keep it going for a while. > I also found adding a fan to the box evened out the temps between > PCB top and bottom better. I have salvaged a few nice copper heatsinks from servers that were thrown out at work that I will use later on for a better version of the ovenized NTP server. I already got some temp/humidity/pressure sensors so I can check the actual temperatures and other environmental influences. It will most likely not make any difference for the clients, so it's really just a game to see how far I can push this. > Very nice. What is the 'NTP Loop Offset'? The Time offset? Yes. > How is the predicted frequency offset calculated? Could that be patched > into ntpviz? I believe I've detailed the aging equations to you before. Teasing out the five model variables from the data requires a bit of care, but the fit is easily done in gnuplot. Regards, Achim.
-- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Waldorf MIDI Implementation & additional documentation: http://Synth.Stromeko.net/Downloads.html#WaldorfDocs From hmurray at megapathdsl.net Wed Apr 26 19:43:20 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 26 Apr 2017 12:43:20 -0700 Subject: Work item list: lfpfloat to Python Message-ID: <20170426194320.0566D406063@ip-64-139-1-69.sjc.megapath.net> NB typo: It's lfptofloat rather than lfpfloat (I'm leaving the subject un-fixed to help mail archives) Gary said: > It needs to be pivoted earlier. When you put a timestamp l_fp into a double > you lose too much precision. Only l_fp offsets should be converted to > doubles. Looks like Gary is suggesting a new item for your list. The Python code calls lfptofloat in 8 places. Some are offsets. Some are times. We should look at the ones that are times and process the fractional part in a way that doesn't lose precision. Maybe we should have another item for the pivoting (in the Python code). I think we need to be able to see both the pivoted and un-pivoted versions. ntpq needs to be able to show the EPOCH a server is using. -- These are my opinions. I hate spam. From Stromeko at nexgo.de Wed Apr 26 19:52:50 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Wed, 26 Apr 2017 21:52:50 +0200 Subject: Does anybody have a sample of a NMEA device with the 1024 week bug? References: <20170426061233.2C79E406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <878tmnuest.fsf@Rainer.invalid> Hal Murray via devel writes: > fallenpegasus at gmail.com said: >> How hard is it to write a fake NMEA device simulator in Python, and use >> Linux IPC control magic to have it appear at a fake serial device? > > I don't know anything about IPC, but I do know how to connect 2 PCs with a > crossover cable. > > USB would be good enough so you don't actually need a second PC. Is the driver code actually looking whether it has an actual serial device? 
At least the generic driver should not care and you could just replace the device with a FIFO and feed the data in from your simulator. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptation for Waldorf microQ V2.22R2: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From gem at rellim.com Wed Apr 26 19:56:16 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 26 Apr 2017 12:56:16 -0700 Subject: Work item list: lfpfloat to Python In-Reply-To: <20170426194320.0566D406063@ip-64-139-1-69.sjc.megapath.net> References: <20170426194320.0566D406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170426125616.5a3c5108@spidey.rellim.com> Yo Hal! On Wed, 26 Apr 2017 12:43:20 -0700 Hal Murray via devel wrote: > ntpq needs to be able to show the EPOCH a server is using. BUILD_EPOCH, UNIX_EPOCH and NTP_EPOCH. Just more system variables to add to mode 6. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From Stromeko at nexgo.de Wed Apr 26 19:59:32 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Wed, 26 Apr 2017 21:59:32 +0200 Subject: Work item list: l_fp_time and l_fp_offset References: <20170426071036.33042406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <874lxbuehn.fsf@Rainer.invalid> Hal Murray via devel writes: > devel at ntpsec.org said: >> Work item: Replace >> typedef uint64_t l_fp; >> with >> typedef uint64_t l_fp_time; >> typedef int64_t l_fp_offset; > > I think that should be rejected in favor of eliminating l_fp except at the > very edge and doing the pivot at the edge.
What exactly do you suggest to replace it with? Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Wavetables for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables From gem at rellim.com Wed Apr 26 20:01:44 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 26 Apr 2017 13:01:44 -0700 Subject: Temperature Controlled rasPi 3B In-Reply-To: <87d1bzuf94.fsf@Rainer.invalid> References: <87pog0hj1v.fsf@Rainer.invalid> <20170425153036.54674cb3@spidey.rellim.com> <87d1bzuf94.fsf@Rainer.invalid> Message-ID: <20170426130144.76e692b1@spidey.rellim.com> Yo Achim! On Wed, 26 Apr 2017 21:43:03 +0200 Achim Gratz via devel wrote: > Gary E. Miller via devel writes: > > Cool. Did you use ntpheat? > > No, I've simply extended the Perl script that logs all PPS timestamps > (this also ensures that all the measurements are aligned with the > loopstats). Hmm, no perl in NTPsec anymore. Care to submit that for contrib/ ? > > ntpheatusb already has a full PID controller. Did you look at > > that? > > No again, I avoid looking at Python as much as possible. Well, the algorithm is still the same in Perl and Python. If you get it done, please submit that too for contrib/ > > If you do so, could you send us patches for ntpheat? > > See above. It's really nothing spectacular, so I don't expect it > will be difficult to translate, however. Since Python annoys you, I'm sure others would like to see your perl. > > I found that keeping the CPU chip temp stable was less important > > than keeping ambient stable. The XTAL is on the other side of > > the PCB from the CPU. > > I know. I have described the exact setup in excruciating detail to > you before, so I'm not going to repeat why the CPU temperature can be > used as a proxy for the ambient in that case. And I guess we'll have to disagree on how well that works.
> Remember that this is > about getting the best possible performance out of the rasPi for > exactly zero cost above the rasPi and GPS itself. Yup. And even better when you send us the tools to put in contrib/ so others can replicate. > > I also found adding a fan to the box evened out the temps between > > PCB top and bottom better. > > I have salvaged a few nice copper heatsinks from servers that were > thrown out at work that I will use later on for a better version of > the ovenized NTP server. I already got some temp/humidity/pressure > sensors so I can check the actual temperatures and other environmental > influences. It will most likely not make any difference for the > clients, so it's really just a game to see how far I can push this. Yeah, the frequency jitter keeps getting better, but not the time jitter. > > How is the predicted frequency offset calculated? Could that be > > patched into ntpviz? > > I believe I've detailed the aging equations to you before. Teasing > out the five model variables from the data requires a bit of care, > but the fit is easily done in gnuplot. Yes, I remember your descriptions, what I'm asking for is code. Reusable code. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Wed Apr 26 20:05:59 2017 From: gem at rellim.com (Gary E.
Miller) Date: Wed, 26 Apr 2017 13:05:59 -0700 Subject: Work item list: l_fp_time and l_fp_offset In-Reply-To: <874lxbuehn.fsf@Rainer.invalid> References: <20170426071036.33042406063@ip-64-139-1-69.sjc.megapath.net> <874lxbuehn.fsf@Rainer.invalid> Message-ID: <20170426130559.1853d337@spidey.rellim.com> Yo Achim! On Wed, 26 Apr 2017 21:59:32 +0200 Achim Gratz via devel wrote: > Hal Murray via devel writes: > > devel at ntpsec.org said: > >> Work item: Replace > >> typedef uint64_t l_fp; > >> with > >> typedef uint64_t l_fp_time; > >> typedef int64_t l_fp_offset; > > > > I think that should be rejected in favor of eliminating l_fp except > > at the very edge and doing the pivot at the edge. > > What exactly do you suggest to replace it with? Timespec with time64_t. As Linux 64-bit uses now for native time. Then the timestamp arithmetic does not lose precision and the first time64_t rollover is centuries from now. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From fallenpegasus at gmail.com Wed Apr 26 20:10:11 2017 From: fallenpegasus at gmail.com (Mark Atwood) Date: Wed, 26 Apr 2017 20:10:11 +0000 Subject: blog/_drafts/gps-pivot.ad In-Reply-To: <20170426114008.1ca180f8@spidey.rellim.com> References: <20170426182642.B91FF406063@ip-64-139-1-69.sjc.megapath.net> <20170426114008.1ca180f8@spidey.rellim.com> Message-ID: This blog post will be about GPS receiver firmware pivoting issues and about GPS system 1024 week era wrapping issues. 
There will be a separate blog post about NTP era wrapping and pivoting.

Analysis of the time_t wrapping is getting too far out of our remit, and
other people have already written deeply and well on it.

..m

On Wed, Apr 26, 2017 at 11:40 AM Gary E. Miller via devel wrote:

> Yo Hal!
>
> On Wed, 26 Apr 2017 11:26:42 -0700
> Hal Murray via devel wrote:
>
> > Unix/POSIX has the same problem with time_t
> > It rolls over into the sign bit in 2038
>
> Only for some. Not an issue with 64 bit Linux.
>
> RGDS
> GARY
> ---------------------------------------------------------------------------
> Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
> gem at rellim.com Tel:+1 541 382 8588
>
> Veritas liberabit vos. -- Quid est veritas?
> "If you can't measure it, you can't improve it." - Lord Kelvin
> _______________________________________________
> devel mailing list
> devel at ntpsec.org
> http://lists.ntpsec.org/mailman/listinfo/devel

--
Mark Atwood
http://about.me/markatwood
+1-206-604-2198 SMS & Signal
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From hmurray at megapathdsl.net Wed Apr 26 20:12:48 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 26 Apr 2017 13:12:48 -0700
Subject: Work item list: l_fp_time and l_fp_offset
In-Reply-To: Message from Achim Gratz via devel of "Wed, 26 Apr 2017 21:59:32 +0200." <874lxbuehn.fsf@Rainer.invalid>
Message-ID: <20170426201248.8CA3F406063@ip-64-139-1-69.sjc.megapath.net>

Achim said:
>> I think that should be rejected in favor of eliminating l_fp except at the
>> very edge and doing the pivot at the edge.
> What exactly do you suggest to replace it with?

There are two uses for l_fp: time and offset.

By pushing it to the edge, we eliminate uses as time.

I would be happy with a double for offsets.

Gary wants a timespec to preserve accuracy. That only matters if the offset
is huge. The only time that happens is during a giant initial step.
I'm willing to live with that.

--
These are my opinions. I hate spam.

From Stromeko at nexgo.de Wed Apr 26 20:20:17 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Wed, 26 Apr 2017 22:20:17 +0200
Subject: Work item list: l_fp_time and l_fp_offset
References: <20170426071036.33042406063@ip-64-139-1-69.sjc.megapath.net> <874lxbuehn.fsf@Rainer.invalid> <20170426130559.1853d337@spidey.rellim.com>
Message-ID: <87zif2udj2.fsf@Rainer.invalid>

Gary E. Miller via devel writes:
>> What exactly do you suggest to replace it with?
>
> Timespec with time64_t.

Is that POSIX yet?

> As Linux 64-bit uses now for native time. Then
> the timestamp arithmetic does not lose precision and the first time64_t
> rollover is centuries from now.

I'm not sure what you are trying to gain there, but you will certainly
not be able to stuff that into registers or the call stack, and will
instead have pointers to a structure. Again, as long as you can
reconcile the time at startup, there is absolutely no problem with any
rollover during the operation of ntpd that results from the use of l_fp.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

DIY Stuff:
http://Synth.Stromeko.net/DIY.html

From gem at rellim.com Wed Apr 26 20:22:00 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 26 Apr 2017 13:22:00 -0700
Subject: Work item list: l_fp_time and l_fp_offset
In-Reply-To: <20170426201248.8CA3F406063@ip-64-139-1-69.sjc.megapath.net>
References: <874lxbuehn.fsf@Rainer.invalid> <20170426201248.8CA3F406063@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170426132200.1454f5e3@spidey.rellim.com>

Yo Hal!

On Wed, 26 Apr 2017 13:12:48 -0700
Hal Murray via devel wrote:

> Achim said:
> >> I think that should be rejected in favor of eliminating l_fp
> >> except at the very edge and doing the pivot at the edge.
> > What exactly do you suggest to replace it with?
>
> There are two uses for l_fp: time and offset.
>
> By pushing it to the edge, we eliminate uses as time.
Confusing: the offset is also a time. Let's use timestamp and offset.

> I would be happy with a double for offsets.
>
> Gary wants a timespec to preserve accuracy.

Ugh, please, not what I meant. I see the confusion due to the sloppy
terminology.

Timestamps NEED to be l_fp (with ntp epoch number 0 or 1), or
timespec(64), to preserve full precision (232 ps or 1 ns).

But I'm OK with doubles for offsets. If the offset runs into many years,
the nanoseconds will take care of themselves later.

So you start with two l_fp, or two timespec(64), or one of each. Then
subtract to get an offset as a timespec(64) or a double.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From hmurray at megapathdsl.net Wed Apr 26 20:22:32 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 26 Apr 2017 13:22:32 -0700
Subject: Does anybody have a sample of a NMEA device with the 1024 week bug?
In-Reply-To: Message from Achim Gratz via devel of "Wed, 26 Apr 2017 21:52:50 +0200." <878tmnuest.fsf@Rainer.invalid>
Message-ID: <20170426202232.80047406063@ip-64-139-1-69.sjc.megapath.net>

Achim said:
>> USB would be good enough so you don't actually need a second PC.
> Is the driver code actually looking whether it has an actual serial device?
> At least the generic driver should not care and you could just replace the
> device with a FIFO and feed the data in from your simulator.

The driver uses tcgetattr and friends. I don't think it would work with a
non-TTY.

--
These are my opinions. I hate spam.
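
[Editorial sketch, not part of the original message.] The FIFO-versus-TTY
point above can be checked from user space. The Python below is a minimal
sketch that probes a FIFO and a pseudo-terminal with tcgetattr(), the same
call the message says the driver relies on. The helper names and the FIFO
file name are made up for illustration. Whether the NMEA driver would
actually run against a pty is an assumption that is not verified here; the
sketch only shows which kind of file descriptor passes the tcgetattr() test.

```python
import os
import tempfile
import termios


def accepts_tcgetattr(fd):
    """Return True if termios.tcgetattr() succeeds on fd, i.e. fd is a TTY."""
    try:
        termios.tcgetattr(fd)
        return True
    except termios.error:
        return False


def compare_fifo_and_pty():
    """Probe a FIFO and a pty slave the way a serial driver would."""
    results = {}

    # A FIFO is not a terminal, so tcgetattr() is expected to fail on it.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "nmea.fifo")  # hypothetical name
        os.mkfifo(path)
        # O_NONBLOCK lets the read side open without waiting for a writer.
        fifo_fd = os.open(path, os.O_RDONLY | os.O_NONBLOCK)
        try:
            results["fifo"] = accepts_tcgetattr(fifo_fd)
        finally:
            os.close(fifo_fd)

    # A pty pair: the slave end looks like a terminal device, and the
    # master end is where a simulator could write NMEA sentences.
    master_fd, slave_fd = os.openpty()
    try:
        results["pty"] = accepts_tcgetattr(slave_fd)
        results["pty_name"] = os.ttyname(slave_fd)
    finally:
        os.close(master_fd)
        os.close(slave_fd)
    return results


if __name__ == "__main__":
    print(compare_fifo_and_pty())
```

If the pty passes, its slave path (reported as pty_name) is the kind of
device name one could try handing to a driver that insists on a TTY.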
From gem at rellim.com Wed Apr 26 20:26:24 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 26 Apr 2017 13:26:24 -0700
Subject: Work item list: l_fp_time and l_fp_offset
In-Reply-To: <87zif2udj2.fsf@Rainer.invalid>
References: <20170426071036.33042406063@ip-64-139-1-69.sjc.megapath.net> <874lxbuehn.fsf@Rainer.invalid> <20170426130559.1853d337@spidey.rellim.com> <87zif2udj2.fsf@Rainer.invalid>
Message-ID: <20170426132624.463e1a94@spidey.rellim.com>

Yo Achim!

On Wed, 26 Apr 2017 22:20:17 +0200
Achim Gratz via devel wrote:

> Gary E. Miller via devel writes:
> >> What exactly do you suggest to replace it with?
> >
> > Timespec with time64_t.
>
> Is that POSIX yet?

Nope. There is no generic standardized solution yet. Until an OS gives
us a way to set time past 2038, there is nothing that NTP can do.

> > As Linux 64-bit uses now for native time. Then
> > the timestamp arithmetic does not lose precision and the first
> > time64_t rollover is centuries from now.
>
> I'm not sure what you are trying to gain there, but you will certainly
> not be able to stuff that into registers or the call stack, and will
> instead have pointers to a structure.

Well, that is what ntpd does now on 64 bit linux and glibc, so not a
change at all. It just works. I see no reason to try to fight Linux
or glibc. Go with their flow.

> Again, as long as you can reconcile the time
> at startup, there is absolutely no problem with any rollover during
> the operation of ntpd that results from the use of l_fp.

Yes, agreed. The startup is the problem. Once we have that, everything
is good.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From Stromeko at nexgo.de Wed Apr 26 20:45:22 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Wed, 26 Apr 2017 22:45:22 +0200 Subject: Temperature Controlled rasPi 3B References: <87pog0hj1v.fsf@Rainer.invalid> <20170425153036.54674cb3@spidey.rellim.com> <87d1bzuf94.fsf@Rainer.invalid> <20170426130144.76e692b1@spidey.rellim.com> Message-ID: <87vapqucd9.fsf@Rainer.invalid> Gary E. Miller via devel writes: > Yes, I remember your descriptions, what I'm asking for is code. reuseable > code. Have at it. The datafile (joined.txt) has the UTC timestamp, PPS offset, temperature (sampled each second right after the PPS pulse and averaged over 5min or 300 samples), NTP time offset and NTP frequency offset as the columns (from the loopstats), in that order. The variables going into the script from the command line are UTC timestamp for the begin of the data (start), last day to plot (sl), range of days to use for fit (sr to sq). The maximum time offset for samples to be considered valid is thr, ideally one would check that in conjunction with the derivative of the frequency offset (close to zero), but that's cumbersome to do in gnuplot and the resulting extra noise from not doing it is no big problem for convergence. The initial model variables could be moved to another datafile, which would also make them reusable between runs and speed up convergence a bit. 
--8<---------------cut here---------------start------------->8---
t(x)=x - 24*start
age(t)=pa*1000*log(t+paa)
p2a(t,T)=age(t)+pb*(T-T0)**2+pc
pa=750e-6
paa=2500
pb=-5.e-3
pc=-10.0
T0=62.5
thr=10e-9
fit [t=24*(sl-sr):24*(sl-sq)][T=50:70][ppm=*:*] (p2a(t,T)-ppm) "joined.txt" using (tp($1,$4)):($3):($5) via pa,paa
fit [t=24*(sl-sr):24*(sl-sq)][T=50:70][ppm=*:*] (p2a(t,T)-ppm) "joined.txt" using (tp($1,$4)):($3):($5) via T0,pb,pc
fit [t=24*(sl-sr):24*(sl-sq)][T=50:70][ppm=*:*] (p2a(t,T)-ppm) "joined.txt" using (tp($1,$4)):($3):($5) via T0,pa,paa,pb,pc
--8<---------------cut here---------------end--------------->8---

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf microQ V2.22R2:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

From gem at rellim.com Wed Apr 26 20:53:14 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 26 Apr 2017 13:53:14 -0700
Subject: Temperature Controlled rasPi 3B
In-Reply-To: <87vapqucd9.fsf@Rainer.invalid>
References: <87pog0hj1v.fsf@Rainer.invalid> <20170425153036.54674cb3@spidey.rellim.com> <87d1bzuf94.fsf@Rainer.invalid> <20170426130144.76e692b1@spidey.rellim.com> <87vapqucd9.fsf@Rainer.invalid>
Message-ID: <20170426135314.4c5c4fd6@spidey.rellim.com>

Yo Achim!

On Wed, 26 Apr 2017 22:45:22 +0200
Achim Gratz via devel wrote:

> Gary E. Miller via devel writes:
> > Yes, I remember your descriptions; what I'm asking for is code.
> > Reusable code.
>
> Have at it.

Thanks, but I'm more interested in the temp/freq logging program.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From Stromeko at nexgo.de Wed Apr 26 21:04:18 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Wed, 26 Apr 2017 23:04:18 +0200
Subject: Temperature Controlled rasPi 3B
References: <87pog0hj1v.fsf@Rainer.invalid> <20170425153036.54674cb3@spidey.rellim.com> <87d1bzuf94.fsf@Rainer.invalid> <20170426130144.76e692b1@spidey.rellim.com> <87vapqucd9.fsf@Rainer.invalid> <20170426135314.4c5c4fd6@spidey.rellim.com>
Message-ID: <87r30eubhp.fsf@Rainer.invalid>

Gary E. Miller via devel writes:
> Thanks, but I'm more interested in the temp/freq logging program.

When it's cleaned up?

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Terratec KOMPLEXER:
http://Synth.Stromeko.net/Downloads.html#KomplexerWaves

From hmurray at megapathdsl.net Wed Apr 26 22:43:00 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 26 Apr 2017 15:43:00 -0700
Subject: Work item list: l_fp_time and l_fp_offset
Message-ID: <20170426224300.A963D406063@ip-64-139-1-69.sjc.megapath.net>

> Let's use timestamp and offset.

Good suggestion. Thanks.

> But I'm OK with doubles for offsets. If the offset runs into many years,
> the nanoseconds will take care of themselves later.

Ahh. Good.

> So you start with two l_fp, or two timespec(64), or one of each. Then
> subtract to get an offset as a timespec(64) or a double.

My expectation is that timestamps would never leave the front end. The
subtracts and pivot would happen there, resulting in an offset.

There is a back door for l_fp and/or timestamps. That's ntpq. We may have
to convert offsets back to l_fp for backward compatibility with old ntpq.

There may be some timestamps saved that I don't know about. If they are
used for other than ntpq then they will need timespec.

--
These are my opinions. I hate spam.

From gem at rellim.com Wed Apr 26 23:12:34 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Wed, 26 Apr 2017 16:12:34 -0700
Subject: Work item list: l_fp_time and l_fp_offset
In-Reply-To: <20170426224300.A963D406063@ip-64-139-1-69.sjc.megapath.net>
References: <20170426224300.A963D406063@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170426161234.4b3eb37f@spidey.rellim.com>

Yo Hal!

On Wed, 26 Apr 2017 15:43:00 -0700
Hal Murray wrote:

> > So you start with two l_fp, or two timespec(64), or one of each.
> > Then subtract to get an offset as a timespec(64) or a double.
>
> My expectation is that timestamps would never leave the front end.
> The subtracts and pivot would happen there, resulting in an offset.

I'm not gonna guess on where that happens now. It happens before
local_clock(), which is called with an offset and in turn calls
step_systime(), adj_systime(), or ntp_adjtime_ns() with an offset
when it wants to touch the system clock.

So any pivot from local_clock() on down to step_systime(),
adj_systime(), or ntp_adjtime_ns() is pointless.

> There is a back door for l_fp and/or timestamps. That's ntpq.
> We may have to convert offsets back to l_fp for backward
> compatibility with old ntpq.

Either works for me, but back compatibility is good.

> There may be some timestamps saved that I don't know about. If they
> are used for other than ntpq then they will need timespec.

Just about every refclock has its own way of doing the timestamp.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Thu Apr 27 05:45:42 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 26 Apr 2017 22:45:42 -0700 Subject: Work item list: eliminate l_fp from refclocks Message-ID: <20170427054542.1BF95406063@ip-64-139-1-69.sjc.megapath.net> Gary pointed out: > Just about every refclock has their own way of doing the timestamp. The problem is that things like refclock_process_offset() take l_fp as arguments. We can add wrappers that use timespec, convert to l_fp, then call the old code or the reverse depending on the order we do things. I think this is worth splitting out from eliminating l_fp except at the edge. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Thu Apr 27 09:31:31 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 27 Apr 2017 02:31:31 -0700 Subject: Confusion on how pivot works Message-ID: <20170427093131.522BD406063@ip-64-139-1-69.sjc.megapath.net> >From devel/packaging.txt Two instances talking to each other have no way to know that they're based in the same era. To mitigate this problem, each instance has a pivot date and resolves incoming timestamps to the era that minimizes distance between now and the timestamp. This procedure is part of the core protocol specification. [Minimum distance could be negative and large enough to be before the build date.] >From blog/_drafts/gps-pivot.ad The GPS system has a cyclic clock with a 1024-week period. Each device has a cyclic clock with the same period, but it can only disambiguate over exactly a half cycle in either direction from its hidden pivot date. once a gps gets half of a 1024 week span past its pivot, it fails, emits wrong dates wrong by modulo 10 bits of weeks --------- The above is a possible solution, but it's not what I was expecting. 
The above description with "half" in it works if you put the pivot point in
the middle of the target range.

You can also put the pivot at the start. Then legal values are from the
pivot point until pivot + one cycle time.

It's signed vs unsigned values.

We are using the build date for the pivot point. That works naturally as a
start time rather than middle time.

(It's possible I've missed some code that bumps the build date by a
half-cycle. I wasn't expecting it so I wasn't looking for it. But that also
means that I would be surprised and pay attention if I did see something
like that.)

--
These are my opinions. I hate spam.

From hmurray at megapathdsl.net Thu Apr 27 09:53:37 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 27 Apr 2017 02:53:37 -0700
Subject: Pivoting (or rather missing pivot)
Message-ID: <20170427095337.617FC406063@ip-64-139-1-69.sjc.megapath.net>

>> Where does it do a pivot on the way to step_systime?

Gary said:
> Stuck in a time warp again. Not relevant to the bad reversion. I'm not
> gonna go down that rabbit hole until we get this very narrow problem dealt
> with.

Which problem do you think we are discussing?

> We have a well bounded problem with 2 inputs and 1 output. No need for
> external info to know if this function does what it is supposed to do: step
> the clock by double.

step_systime works as you describe it. The problem is that it is being fed
bad data.

The code Eric reverted does a pivot. Right? We agree that the code is ugly.
You claim it is not necessary. All you have to do to get me to tell Eric it
is not needed is show me where the pivot does take place.

All the pivots are based on BUILD_EPOCH. Right? It's only used by
ntpcal_get_build_date()

ntpcal_get_build_date() is called from ntpd/refclock_nmea,
ntpd/refclock_magnavox, and libntp/systime

So unless I'm missing and/or have missed something, there is no pivot that
covers l_fp data from NTP packets other than the one in step_systime()

--
These are my opinions.
I hate spam.

From gem at rellim.com Thu Apr 27 15:17:06 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 27 Apr 2017 08:17:06 -0700
Subject: Pivoting (or rather missing pivot)
In-Reply-To: <20170427095337.617FC406063@ip-64-139-1-69.sjc.megapath.net>
References: <20170427095337.617FC406063@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170427081706.5d5f18e0@spidey.rellim.com>

Yo Hal!

On Thu, 27 Apr 2017 02:53:37 -0700
Hal Murray wrote:

> >> Where does it do a pivot on the way to step_systime?
>
> Gary said:
> > Stuck in a time warp again. Not relevant to the bad reversion.
> > I'm not gonna go down that rabbit hole until we get this very
> > narrow problem dealt with.
>
> Which problem do you think we are discussing?

The bad reversion is what I am discussing.

> > We have a well bounded problem with 2 inputs and 1 output. No need
> > for external info to know if this function does what it is supposed
> > to do: step the clock by double.
>
> step_systime works as you describe it. The problem is that it is
> being fed bad data.

Really? Can you detail your assertion?

> The code Eric reverted does a pivot. Right?

No. It thinks it does a pivot, but you never pivot an offset, only a
timestamp. When you subtract one timestamp (with the implicit term
NTP_EPOCH) from another timestamp (with the implicit NTP_EPOCH) then you
have removed the term NTP_EPOCH. No need to deal with it downstream
any more.

> We agree that the code
> is ugly. You claim it is not necessary.

And I have shown my math to show that. Can you show any math to prove
me wrong?

> All you have to do to get me
> to tell Eric it is not needed is show me where the pivot does take
> place.

See above. No pivot needed. Subtracting one timestamp from another
removes any constant that is in both terms.

> All the pivots are based on BUILD_EPOCH. Right? It's only used by
> ntpcal_get_build_date()

No. As you just said. And out of scope for this one discussion that
keeps going in circles.
> So unless I'm missing and/or have missed something, there is no pivot
> that covers l_fp data from NTP packets other than the one in
> step_systime()

Nor is there any need for one in step_systime(). The 2s complement
subtraction works fine around the rollover. I have posted arithmetic
samples here that show that.

Please get a piece of paper and just calculate several specific math
examples, as I have already posted in this discussion. The numbers do
not lie.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From hmurray at megapathdsl.net Thu Apr 27 20:07:58 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 27 Apr 2017 13:07:58 -0700
Subject: Pivoting (or rather missing pivot)
Message-ID: <20170427200758.7359F406063@ip-64-139-1-69.sjc.megapath.net>

>> Which problem do you think we are discussing?
> The bad reversion is what I am discussing.

OK.

>> step_systime works as you describe it. The problem is that it is
>> being fed bad data.
> Really? Can you detail your assertion?

How about an example?

System time is 1970.
Server time is 2037, which gets truncated and turns into 1901.
Subtract gives an offset of -69 years.

> No. It thinks it does a pivot, but you never pivot an offset, only a
> timestamp.

Get the current system time. Add the offset. Now you have a timestamp for
the new system time. Pivot that.

Using the above example, the new time would be 1901, which is before the
build date, so add 136 years to get 2037, the server time that we want.
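
[Editorial sketch, not part of the original message.] The 1970/2037
example above can be worked through in a few lines of Python. This is a
minimal sketch of the arithmetic as described in the message, not the
ntpd code: the function names, the exact dates, and the build date
standing in for BUILD_EPOCH are all assumptions chosen for illustration.
The pivot is treated as the start of the legal range, so a value is
moved by whole eras into [pivot, pivot + one era).

```python
from datetime import datetime, timezone

ERA = 1 << 32  # seconds in one NTP era, roughly 136 years
NTP_EPOCH = datetime(1900, 1, 1, tzinfo=timezone.utc)


def ntp_seconds(when):
    """Full (untruncated) seconds since 1900-01-01 for a UTC datetime."""
    return int((when - NTP_EPOCH).total_seconds())


def truncate(seconds):
    """Keep only the low 32 bits, as a 32-bit seconds field would."""
    return seconds % ERA


def pivot(seconds, pivot_seconds):
    """Move seconds by whole eras into [pivot_seconds, pivot_seconds + ERA)."""
    return pivot_seconds + (seconds - pivot_seconds) % ERA


# Hypothetical values chosen to mirror the example in the thread.
system = ntp_seconds(datetime(1970, 1, 1, tzinfo=timezone.utc))
server = ntp_seconds(datetime(2037, 6, 1, tzinfo=timezone.utc))
build = ntp_seconds(datetime(2017, 4, 1, tzinfo=timezone.utc))  # stand-in for BUILD_EPOCH

wire = truncate(server)        # what a 32-bit field carries: a date in 1901
offset = wire - system         # plain wide subtraction: about -69 years
stepped = system + offset      # naive new system time, still in 1901
fixed = pivot(stepped, build)  # era resolved against the build date

if __name__ == "__main__":
    year = 365.25 * 86400
    print("offset in years:", round(offset / year, 1))
    print("pivot recovered server time:", fixed == server)
```

The sketch deliberately uses a wide subtraction with no wraparound, which
is the case the message describes; it takes no position on which
arithmetic ntpd actually performs.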
> When you subtract one timestamp (with the implicit term
> NTP_EPOCH) from another timestamp (with the implicit NTP_EPOCH) then you
> have removed the term NTP_EPOCH.

Right. But in the l_fp case, one or both of the timestamps may have been
truncated.

One truncation isn't fatal. Consider 2035 and 2037. After truncation, the
2037 is 1. When you do the subtract, there is an overflow which in this case
adds the truncation back.

The nasty cases are one truncation and no overflow or overflow with no
truncation.

That's assuming you do the subtract using 64 bit arithmetic. If you use
something bigger, you don't get the overflows. That fixes one case but
breaks one that used to work.

> See above. No pivot needed. Subtracting one timestamp from another removes
> any constant that is in both terms.

Right. But it doesn't magically fix unmatched truncations or overflows in
the subtract.

--
These are my opinions. I hate spam.

From gem at rellim.com Fri Apr 28 08:09:36 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 28 Apr 2017 01:09:36 -0700
Subject: Pivoting (or rather missing pivot)
In-Reply-To: <20170427200758.7359F406063@ip-64-139-1-69.sjc.megapath.net>
References: <20170427200758.7359F406063@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170428010936.63a74989@spidey.rellim.com>

Yo Hal!

On Thu, 27 Apr 2017 13:07:58 -0700
Hal Murray wrote:

> >> Which problem do you think we are discussing?
> > The bad reversion is what I am discussing.
> OK.

And yet your next comment is out of that scope.

> >> step_systime works as you describe it. The problem is that it is
> >> being fed bad data.
> > Really? Can you detail your assertion?
>
> How about an example?
>
> System time is 1970. Server time is 2037, which gets truncated and
> turns into 1901.
> Subtract gives an offset of -69 years.

Yes, but that happens way before step_systime() and local_loop(),
thus out of scope for this discussion: the bad reversion.

> > No.
It thinks it does a pivot, but you never pivot an offset, only
> > a timestamp.
>
> Get the current system time. Add the offset. Now you have a
> timestamp for the new system time. Pivot that.

That is what step_systime() does, minus the pivot. Best not to pivot
in step_systime() because it is passed an offset, not an l_fp, and
the offset may have been calculated from GPS that never saw an
NTP_EPOCH.

> Using the above example, the new time would be 1901, which is before
> the build date, so add 136 years to get 2037, the server time that we
> want.

Which is why the pivot should have happened long before local_loop()
got called.

> > When you subtract one timestamp (with the
> > implicit term NTP_EPOCH) from another timestamp (with the implicit
> > NTP_EPOCH) then you have removed the term NTP_EPOCH.
>
> Right. But in the l_fp case, one or both of the timestamps may have
> been truncated.

And since you don't know whether it is one, both, or none, that has to
be done where that knowledge still exists. Before local_loop().

> The nasty cases are one truncation and no overflow or overflow with
> no truncation.

Yes, nasty, so they need to be taken care of in the proper place.
Before local_loop().

> > See above. No pivot needed. Subtracting one timestamp from
> > another removes any constant that is in both terms.
>
> Right. But it doesn't magically fix unmatched truncations or
> overflows in the subtract.

Nor did I ever claim it did. I claim that should be handled before
local_loop(), which is before step_systime().

We'll have to table this for a week; I'm now at Penguicon...

I'm giving up on the idea that we can intellectualize this: people
can't stay focused on one issue, and we keep going around in circles.
We need to figure out a way to wrap local_loop() and what it calls, so
tests can be run.

RGDS
GARY
---------------------------------------------------------------------------
Gary E.
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From ianbruene at gmail.com Fri Apr 28 21:17:29 2017
From: ianbruene at gmail.com (Ian Bruene)
Date: Fri, 28 Apr 2017 16:17:29 -0500
Subject: More SNMP dependency questions, also assertions.
Message-ID: <50ba1c2a-90ab-fd76-3f66-84ead727b269@gmail.com>

assert: the daemon name should be ntpsnmpd

assert: both ipv4 and ipv6 should be implemented

assert: if both exist, they should be bound at the same time

assert: the port(s) should be choosable

=====================================================================

SNMP version / security

According to RFC-5907 it is Recommended to use SNMPv3 and its security
features, and use of previous versions of SNMP is Not Recommended. Since
NTPsec is about security and removing / not adding unnecessary
complexity, these Recommendations will be treated as Requirements, and
only SNMPv3 will be implemented.

assert: security should be optional

=====================================================================

Dependencies: pysnmp, pysmi, python-daemon

The need for pysnmp is obvious, and has already been discussed. pysmi is
by the same author, and is an MIB compiler for pysnmp.

python-daemon is an implementation of PEP 3143. According to the PEP and
other info, getting a daemon right is finicky; python-daemon exists to
handle that. It is currently on version 2.1.2, and listed as
"Production/Stable" dev status in the package index, so it has a low
probability of existence failure.
Given the description of Proper Daemon Behavior as "fiddly", my
inexperience, and NTPsec's interest in reducing complexity where
possible, I believe the addition of /another/ dependency is justified.

--
In the end; what separates a Man, from a Slave? Money? Power?
No. A Man Chooses, a Slave Obeys. -- Andrew Ryan
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From gem at rellim.com Sun Apr 30 05:16:57 2017
From: gem at rellim.com (Gary E. Miller)
Date: Sat, 29 Apr 2017 22:16:57 -0700
Subject: More SNMP dependency questions, also assertions.
In-Reply-To: <50ba1c2a-90ab-fd76-3f66-84ead727b269@gmail.com>
References: <50ba1c2a-90ab-fd76-3f66-84ead727b269@gmail.com>
Message-ID: <20170429221657.50c07dbb@spidey.rellim.com>

Yo Ian!

On Fri, 28 Apr 2017 16:17:29 -0500
Ian Bruene via devel wrote:

> assert: the daemon name should be ntpsnmpd

Sure.

> assert: both ipv4 and ipv6 should be implemented

Yes.

> assert: if both exist, they should be bound at the same time

If you bind to the IPv6, you get the IPv4 for free. No need to bind
both.

> assert: the port(s) should be choosable

Yes, but default to the snmp port in /etc/services.

> assert: security should be optional

I would want it on by default, but that can't work without a default
password, which is worse...

> The need for pysnmp is obvious, and has already been discussed. pysmi
> is by the same author, and is an MIB compiler for pysnmp.

To start with, yes. Once you get it running you may change your mind.

> python-daemon is an implementation of PEP 3143. According to the PEP
> and other info, getting a daemon right is finicky; python-daemon
> exists to handle that.

Daemonizing is at most a dozen lines of code. Play with it, but
you will likely decide that it is too trivial to add as a dependency.

RGDS
GARY
---------------------------------------------------------------------------
Gary E.
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 488 bytes
Desc: OpenPGP digital signature
URL:

From ianbruene at gmail.com Sun Apr 30 17:34:35 2017
From: ianbruene at gmail.com (Ian Bruene)
Date: Sun, 30 Apr 2017 12:34:35 -0500
Subject: More SNMP dependency questions, also assertions.
In-Reply-To: <20170429221657.50c07dbb@spidey.rellim.com>
References: <50ba1c2a-90ab-fd76-3f66-84ead727b269@gmail.com> <20170429221657.50c07dbb@spidey.rellim.com>
Message-ID:

>> assert: if both exist, they should be bound at the same time
> If you bind to the IPv6, you get the IPv4 for free. No need to bind both.

TIL

>> assert: the port(s) should be choosable
> Yes, but default to the snmp port in /etc/services.

Knew it had a specific default, TIL /etc/services

>> assert: security should be optional
> I would want it on by default, but that can't work without a default
> password, which is worse...

ACK

>> python-daemon is an implementation of PEP 3143. According to the PEP
>> and other info, getting a daemon right is finicky; python-daemon
>> exists to handle that.
> Daemonizing is at most a dozen lines of code. Play with it, but
> you will likely decide that it is too trivial to add as a dependency.

Ah, ok. The info I saw made it sound as if daemonization was something
even masters found tricky. Conveniently there is a library and PEP on
how to do it right..... :-)

--
In the end; what separates a Man, from a Slave? Money? Power?
No. A Man Chooses, a Slave Obeys. -- Andrew Ryan
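
[Editorial sketch, not part of the original message.] Two of the points
agreed above, a choosable port that defaults to the snmp entry in
/etc/services and a single IPv6 bind that also serves IPv4, can be
sketched with the Python standard library alone. This is a minimal
illustration under stated assumptions, not the ntpsnmpd design: the
function names and the fallback constant are hypothetical, and whether
one IPv6 socket really accepts IPv4 traffic depends on the operating
system (Linux allows it unless net.ipv6.bindv6only is set; some BSDs
default to IPv6-only).

```python
import socket

SNMP_FALLBACK_PORT = 161  # well-known SNMP port, used if the lookup fails


def resolve_port(port=None):
    """Return the explicit port, else the 'snmp' entry from the services database."""
    if port is not None:
        return port
    try:
        return socket.getservbyname("snmp", "udp")
    except OSError:
        # No services database or no snmp entry: fall back to the constant.
        return SNMP_FALLBACK_PORT


def open_dual_stack_udp(port):
    """Bind one UDP socket on '::' that also accepts IPv4 where the OS allows it."""
    sock = socket.socket(socket.AF_INET6, socket.SOCK_DGRAM)
    # 0 asks for IPv4-mapped addresses to be accepted on the same socket.
    sock.setsockopt(socket.IPPROTO_IPV6, socket.IPV6_V6ONLY, 0)
    sock.bind(("::", port))
    return sock


if __name__ == "__main__":
    # Binding a port below 1024 normally needs privileges, so the demo
    # uses an arbitrary high port instead of the SNMP default.
    print("default port:", resolve_port())
    demo = open_dual_stack_udp(resolve_port(10161))
    print("bound to:", demo.getsockname()[:2])
    demo.close()
```

IPv4 peers show up on such a socket with IPv4-mapped addresses of the
form ::ffff:a.b.c.d, which any access-control code would need to handle.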