From gem at rellim.com Tue Nov 1 22:00:22 2016
From: gem at rellim.com (Gary E. Miller)
Date: Tue, 1 Nov 2016 15:00:22 -0700
Subject: Fw: [ntpwg] [Editorial Errata Reported] RFC7822 (4848)
Message-ID: <20161101150022.4dcb03ae@spidey.rellim.com>

Yo All!

More tea leaves to read from ntpwg...

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Begin forwarded message:

Date: Tue, 1 Nov 2016 14:46:59 -0700
From: Harlan Stenn
To: RFC Errata System , talmi at marvell.com, mayer at ntp.org, suresh.krishnan at ericsson.com, terry.manderson at icann.org, odonoghue at isoc.org, dieter.sibold at ptb.de
Cc: ntpwg at lists.ntp.org
Subject: Re: [ntpwg] [Editorial Errata Reported] RFC7822 (4848)

I'd like to rescind the errata 4848 report.

H

On 10/31/16 3:10 PM, RFC Errata System wrote:
> The following errata report has been submitted for RFC7822,
> "Network Time Protocol Version 4 (NTPv4) Extension Fields".
>
> --------------------------------------
> You may review the report below and at:
> http://www.rfc-editor.org/errata_search.php?rfc=7822&eid=4848
>
> --------------------------------------
> Type: Editorial
> Reported by: Harlan Stenn
>
> Section: Author
>
> Original Text
> -------------
>
>
> Danny Mayer
> Network Time Foundation
> PO Box 918
> Talent, OR 97540
> United States
>
> Email: mayer at ntp.org
>
> Corrected Text
> --------------
> Danny Mayer
> PDM Consulting
>
> Email: mayer at pdmconsulting.net
>
> Notes
> -----
> This document DOES NOT represent the position of the NTP Project.
>
> I say this as the Project Manager of the NTP Project.
>
> I have submitted a revision to 7822 that DOES represent the position
> of the NTP Project.
>
> Instructions:
> -------------
> This erratum is currently posted as "Reported". If necessary, please
> use "Reply All" to discuss whether it should be verified or
> rejected.
When a decision is reached, the verifying party
> can log in to change the status and edit the report, if necessary.
>
> --------------------------------------
> RFC7822 (draft-ietf-ntp-extension-field-07)
> --------------------------------------
> Title : Network Time Protocol Version 4 (NTPv4)
> Extension Fields
> Publication Date : March 2016
> Author(s) : T. Mizrahi, D. Mayer
> Category : PROPOSED STANDARD
> Source : Network Time Protocol
> Area : Internet
> Stream : IETF
> Verifying Party : IESG
>

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 455 bytes
Desc: OpenPGP digital signature
URL:

From esr at thyrsus.com Tue Nov 1 22:04:10 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 1 Nov 2016 18:04:10 -0400
Subject: Fw: [ntpwg] [Editorial Errata Reported] RFC7822 (4848)
In-Reply-To: <20161101150022.4dcb03ae@spidey.rellim.com>
References: <20161101150022.4dcb03ae@spidey.rellim.com>
Message-ID: <20161101220410.GA29069@thyrsus.com>

Gary E. Miller :
> Yo All!
>
> More tea leaves to read from ntpwg...

I'm getting a there's-drama-going-on-we-can't-see vibe from this.
--
Eric S. Raymond

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 811 bytes
Desc: not available
URL:

From esr at thyrsus.com Thu Nov 3 17:47:44 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 3 Nov 2016 13:47:44 -0400
Subject: Progress, and a puzzle.
Message-ID: <20161103174744.GA12707@thyrsus.com>

The Python translation of ntpq is done, modulo one minor feature in mrulist that I expect to have up later today. All commands, including the authenticated ones, are working.
The translation has turned up a weird bug in ntpd, however. It turns out that in two mode 6 responses, reslist and ifstats, ntpd frequently sends bursts of binary garbage in the middle of what is otherwise good textual data. There is a perhaps related problem, much less frequent, with extra NULs being sent after value strings in MRU-list responses.

This was hard to notice in C ntpq because it did a whole bunch of validation and consistency checks on the responses and plain threw out any records that looked hinky. The Python port has a different philosophy; it tries to mess with the data that ntpd is shipping as little as possible and to fill in displays of corrupted data with '?'.

Before I pull the big switch and drop the C ntpq code I'm going to try to track down this ntpd problem.

For those of you interested, Mode 6 is now fully documented at docs/mode6.txt. It's a strange piece of design - very nearly excellent, but with a bunch of inconsistencies and odd glitches that spoil the effect. One wonders, for example, why exactly one response (readstats) has a binary payload, but all others use a common textual format. Mills, or whoever designed it, seems to have been groping towards something like JSON. But the protocol is not consistent about whether or how text payloads are terminated, and has the odd misfeature that instead of writing foo="" to set a response variable to the empty string, you just say foo (the bare name) and the following ="" is assumed.
--
Eric S. Raymond

From hmurray at megapathdsl.net Thu Nov 3 18:54:34 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 03 Nov 2016 11:54:34 -0700
Subject: Progress, and a puzzle.
In-Reply-To: Message from "Eric S. Raymond" of "Thu, 03 Nov 2016 13:47:44 EDT." <20161103174744.GA12707@thyrsus.com>
Message-ID: <20161103185434.523AA406061@ip-64-139-1-69.sjc.megapath.net>

> For those of you interested, Mode 6 is now fully documented at docs/
> mode6.txt.
It's a strange piece of design - very nearly excellent, but with
> a bunch of inconsistencies and odd glitches that spoil the effect.

Many thanks.

> One wonders, for example, why exactly one response (readstats) has a binary
> payload

My guess is history. It's probably left over from before the mode6/mode7 stuff got sorted out and Mills decided that mode6 should be all text.

You could fix that. I'd probably wait until there is a good reason to add another command.

> The translation has turned up a weird bug in ntpd, however. It turns out
> that in two mode 6 responses, reslist and ifstats, ntpd frequently sends
> bursts of binary garbage in the middle of what is otherwise good textual
> data.

That's probably a bug.

One of the complicated responses sends the slots within a line in random order and adds a garbage field name and value. The comment said it was to keep the client side from making assumptions that would turn into constraints on the server side. I think it's reslist.

-----------

How are you testing things?

I get this:

$ ./ntpq/pyntpq
Traceback (most recent call last):
  File "./ntpq/pyntpq", line 17, in
    from ntp.packet import *
  File "/home/murray/ntpsec/play/ntpq/ntp/packet.py", line 16, in
    from ntpc import lfptofloat
ImportError: /usr/local/lib/python2.7/site-packages/ntpc.so: undefined symbol: OBJ_sn2nid
$

I tried installing, but there is no pyntpq on my search path and I don't see it actually getting installed.

After installing, I get this:

$ ./ntpq/pyntpq
Traceback (most recent call last):
  File "./ntpq/pyntpq", line 18, in
    from ntp.util import *
  File "/home/murray/ntpsec/play/ntpq/ntp/util.py", line 9, in
    import ntp.ntpc
ImportError: No module named ntpc
$

--
These are my opinions. I hate spam.

From fallenpegasus at gmail.com Thu Nov 3 19:57:23 2016
From: fallenpegasus at gmail.com (Mark Atwood)
Date: Thu, 03 Nov 2016 19:57:23 +0000
Subject: Progress, and a puzzle.
In-Reply-To: <20161103185434.523AA406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161103174744.GA12707@thyrsus.com> <20161103185434.523AA406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID:

On Thu, Nov 3, 2016 at 11:54 AM Hal Murray wrote:
> > One wonders, for example, why exactly one response (readstats) has a
> binary
> > payload
>
> My guess is history. It's probably left over from before the mode6/mode7
> stuff got sorted out and Mills decided that mode6 should be all text.
>
> You could fix that. I'd probably wait until there is a good reason to add
> another command.
>

At least for the time being, we want the Python implementation of ntpq to interop with NTP Classic. And even if we manage to land a fixing patch on NTP Classic, we want to be able to interop with older versions of NTP Classic, at least until all major distros upgrade.

We're going to have to live with mode6 as it exists right now, for a while yet.

> One of the complicated responses sends the slots within a line in random
> order and adds a garbage field name and value. The comment said it was to
> keep the client side from making assumptions that would turn into
> constraints
> on the server side.

I've actually done that myself in code I've written in the past, pretty much for the same reason.

-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From jason at azze.org Thu Nov 3 20:04:08 2016
From: jason at azze.org (Jason Azze)
Date: Thu, 3 Nov 2016 16:04:08 -0400
Subject: Progress, and a puzzle.
In-Reply-To: <20161103185434.523AA406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161103174744.GA12707@thyrsus.com> <20161103185434.523AA406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID:

On Thu, Nov 3, 2016 at 2:54 PM, Hal Murray wrote:
> How are you testing things?
>
> I get this:
>
> $ ./ntpq/pyntpq
> Traceback (most recent call last):
>   File "./ntpq/pyntpq", line 17, in
>     from ntp.packet import *
>   File "/home/murray/ntpsec/play/ntpq/ntp/packet.py", line 16, in
>     from ntpc import lfptofloat
> ImportError: /usr/local/lib/python2.7/site-packages/ntpc.so: undefined
> symbol: OBJ_sn2nid
> $

Hal, this looks like it could be the conflict with something in the ssl development package (libssl-dev on Debian family and openssl-devel on Red Hat). If you configure with --enable-crypto with the ssl devel package present, you get a pyntpq that's broken in that way. If you remove the package and rebuild, you get a working pyntpq. Eric made a note in the INSTALL file about the conflict.

From esr at thyrsus.com Thu Nov 3 21:09:47 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 3 Nov 2016 17:09:47 -0400
Subject: Progress, and a puzzle.
In-Reply-To:
References: <20161103174744.GA12707@thyrsus.com> <20161103185434.523AA406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161103210947.GA16080@thyrsus.com>

Jason Azze :
> On Thu, Nov 3, 2016 at 2:54 PM, Hal Murray wrote:
>
> > How are you testing things?
> >
> > I get this:
> >
> > $ ./ntpq/pyntpq
> > Traceback (most recent call last):
> >   File "./ntpq/pyntpq", line 17, in
> >     from ntp.packet import *
> >   File "/home/murray/ntpsec/play/ntpq/ntp/packet.py", line 16, in
> >     from ntpc import lfptofloat
> > ImportError: /usr/local/lib/python2.7/site-packages/ntpc.so: undefined
> > symbol: OBJ_sn2nid
> > $
>
> Hal, this looks like it could be the conflict with something in the
> ssl development package (libssl-dev on Debian family and openssl-devel
> on Red Hat). If you configure with --enable-crypto with the ssl devel
> package present, you get a pyntpq that's broken in that way. If you
> remove the package and rebuild, you get a working pyntpq. Eric made a
> note in the INSTALL file about the conflict.

That conflict is gone. I moved the crypto code to pure Python to avoid it.
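
For readers following along, here is a minimal sketch of what a pure-Python authenticator computation can look like. It assumes the classic NTP symmetric-key scheme (a digest over the key followed by the packet, sent after a 32-bit key ID). The function names, the sample key, and the placeholder request bytes are hypothetical and are not taken from the NTPsec source.

```python
import hashlib
import hmac
import struct


def compute_mac(keyid, key, packet, algorithm="md5"):
    """Build an authenticator: 4-byte big-endian key ID followed by the digest."""
    # Assumption: the digest covers the key followed by the packet bytes.
    digest = hashlib.new(algorithm, key + packet).digest()
    return struct.pack("!I", keyid) + digest


def verify_mac(key, packet, mac, algorithm="md5"):
    """Recompute the digest and compare it with the one carried in the authenticator."""
    expected = hashlib.new(algorithm, key + packet).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(mac[4:], expected)


if __name__ == "__main__":
    # Placeholder 12-byte request header (hypothetical contents).
    request = b"\x16\x02" + b"\x00" * 10
    mac = compute_mac(1, b"secret", request)
    print(len(mac), verify_mac(b"secret", request, mac))
```

Only the standard library is involved, so nothing here links against OpenSSL through a C extension; that is the property the move to pure Python is after.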
Hal, you should try reinstalling. That version of the Python extension might be stale, which *could* trigger the OpenSSL conflict.
--
Eric S. Raymond

From esr at thyrsus.com Thu Nov 3 21:38:55 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 3 Nov 2016 17:38:55 -0400
Subject: Progress, and a puzzle.
In-Reply-To: <20161103185434.523AA406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161103174744.GA12707@thyrsus.com> <20161103185434.523AA406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161103213855.GB16080@thyrsus.com>

Hal Murray :
> > One wonders, for example, why exactly one response (readstats) has a binary
> > payload
>
> My guess is history. It's probably left over from before the mode6/mode7
> stuff got sorted out and Mills decided that mode6 should be all text.

That seems plausible.

> You could fix that. I'd probably wait until there is a good reason to add
> another command.

As Mark says, I think we're stuck with being interoperable with old versions. The problems with the design of Mode 6 are annoying in a minor way, but nowhere near bad enough to merit a flag day. I'll just seal it all off behind a Python interface class and suppress any urge to fix non-bugs.

> > The translation has turned up a weird bug in ntpd, however. It turns out
> > that in two mode 6 responses, reslist and ifstats, ntpd frequently sends
> > bursts of binary garbage in the middle of what is otherwise good textual
> > data.
>
> That's probably a bug.

Considering that the binary crap erupts in the middle of tag names and value literals, yes. You get stuff that looks like "fl!@$%*ags", only with even less friendly characters.

> One of the complicated responses sends the slots within a line in random
> order and adds a garbage field name and value. The comment said it was to
> keep the client side from making assumptions that would turn into constraints
> on the server side. I think it's reslist.

First thing I checked.
If you disable the random tags (which are generated with all-ASCII names) you still get binary eruptions.

ifstats also does frequent random-tag sends. And has binary tumors, though less often.

> How are you testing things?

Try running pyntpq in the ntpq directory. That's an ntp symlink to the source-tree location of the libraries it needs.

> I tried installing, but there is no pyntpq on my search path and I don't see
> it actually getting installed.

In the present state of things, the Python ntp library is part of the normal install to rootspace, but pyntpq is not. That will change when it replaces the C version, but because of a requirement from a potential funder I first need to verify that I can compile pyntpq to a binary that doesn't have a runtime Python dependency.

> After installing, I get this:
> $ ./ntpq/pyntpq
> Traceback (most recent call last):
>   File "./ntpq/pyntpq", line 18, in
>     from ntp.util import *
>   File "/home/murray/ntpsec/play/ntpq/ntp/util.py", line 9, in
>     import ntp.ntpc
> ImportError: No module named ntpc
> $

As I said, run it from ntpq/ for now. I'll clean all this up shortly.

Had you noticed that the peers display resizes horizontally to fit a wide terminal emulator? Leaves more room for hostnames.
--
Eric S. Raymond

From esr at thyrsus.com Thu Nov 3 21:42:07 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 3 Nov 2016 17:42:07 -0400
Subject: Progress, and a puzzle.
In-Reply-To:
References: <20161103174744.GA12707@thyrsus.com> <20161103185434.523AA406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161103214207.GC16080@thyrsus.com>

Mark Atwood :
> > You could fix that. I'd probably wait until there is a good reason to add
> > another command.
> >
>
> At least for the time being, we want the Python implementation of ntpq to
> interop with NTP Classic. And even if we manage to land a fixing patch on
> NTP Classic, we want to be able to interop with older versions of NTP
> Classic, at least until all major distros upgrade.
>
> We're going to have to live with mode6 as it exists right now, for a while
> yet.

Agreed. As I noted to Hal, the problems with the design are not serious enough to merit causing any compatibility issues.
--
Eric S. Raymond

From hmurray at megapathdsl.net Fri Nov 4 11:27:58 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 04 Nov 2016 04:27:58 -0700
Subject: Progress, and a puzzle.
In-Reply-To: Message from "Eric S. Raymond" of "Thu, 03 Nov 2016 17:09:47 EDT." <20161103210947.GA16080@thyrsus.com>
Message-ID: <20161104112758.3A8F3406061@ip-64-139-1-69.sjc.megapath.net>

> Hal, you should try reinstalling. That version of the Python extension might
> be stale, which *could* trigger the OpenSSL conflict.

I tried that and it still dies.

> Try running pyntpq in the ntpq directory. That's an ntp symlink to the
> source-tree location of the libraries it needs.

That doesn't work either.

> In the present state of things, the Python ntp library is part of the normal
> install to rootspace...

There is still a can of worms in this area. I want to be able to test new libraries without installing them. I'm happy to run a small script to encapsulate the recipe.

--
These are my opinions. I hate spam.

From esr at thyrsus.com Fri Nov 4 20:16:43 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 4 Nov 2016 16:16:43 -0400
Subject: Progress, and a puzzle.
In-Reply-To: <20161104112758.3A8F3406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161103210947.GA16080@thyrsus.com> <20161104112758.3A8F3406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161104201643.GA7320@thyrsus.com>

Hal Murray :
> > Hal, you should try reinstalling. That version of the Python extension might
> > be stale, which *could* trigger the OpenSSL conflict.
>
> I tried that and it still dies.

OK, that's weird. I certainly cannot reproduce it here.

> > Try running pyntpq in the ntpq directory. That's an ntp symlink to the
> > source-tree location of the libraries it needs.
>
> That doesn't work either.

What error message do you get?

> > In the present state of things, the Python ntp library is part of the normal
> > install to rootspace...
>
> There is still a can of worms in this area. I want to be able to test new
> libraries without installing them.

That's the reason for the ntp symlink in ntpq. If you run pyntpq from there, import ntp should find that link and all should be well. Is "." not on your Python import path?
--
Eric S. Raymond

From hmurray at megapathdsl.net Fri Nov 4 20:53:59 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 04 Nov 2016 13:53:59 -0700
Subject: Progress, and a puzzle.
In-Reply-To: Message from "Eric S. Raymond" of "Fri, 04 Nov 2016 16:16:43 EDT." <20161104201643.GA7320@thyrsus.com>
Message-ID: <20161104205359.F170F406061@ip-64-139-1-69.sjc.megapath.net>

> That's the reason for the ntp symlink in ntpq. If you run pyntpq from
> there, import ntp should find that link and all should be well. Is "." not
> on your Python import path?

That's the right question. Thanks.

Who is supposed to setup PYTHONPATH?

A raw clone has:
  pylib/ntpc.so -> ../build/main/libntp/ntpc.so

That looks like a bug. It won't work if you configure with --out=foo
(Guess what I normally use.)

--
These are my opinions. I hate spam.

From esr at thyrsus.com Sat Nov 5 02:27:30 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 4 Nov 2016 22:27:30 -0400
Subject: Progress, and a puzzle.
In-Reply-To: <20161104205359.F170F406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161104201643.GA7320@thyrsus.com> <20161104205359.F170F406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161105022730.GA16291@thyrsus.com>

Hal Murray :
>
> > That's the reason for the ntp symlink in ntpq. If you run pyntpq from
> > there, import ntp should find that link and all should be well. Is "." not
> > on your Python import path?
>
> That's the right question. Thanks.
>
> Who is supposed to setup PYTHONPATH?
The Python interpreter has a default search path for module files. It includes "." on my system, so I never have to set PYTHONPATH. If you do set PYTHONPATH, those directories are added to it.

> A raw clone has:
> pylib/ntpc.so -> ../build/main/libntp/ntpc.so
>
> That looks like a bug. It won't work if you configure with --out=foo
> (Guess what I normally use.)

I set up some symlinks. That location is symlinked to pylib/ntpc.so. Anything that reaches pylib through an ntp symlink will see that link and get the module.

At installation time, the ntpc.so build production in libntp/wscript has an install_path of ${PYTHONDIR}/ntp, so the copy will go there.

I don't know if this works with --out. It depends on whether --out changes the computation of ${PYTHONDIR}.
--
Eric S. Raymond

From esr at thyrsus.com Sat Nov 5 05:36:36 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 5 Nov 2016 01:36:36 -0400
Subject: Python ntpq lands - what to do next?
Message-ID: <20161105053636.GA30100@thyrsus.com>

Reassured by Gary's finding that we can make standalone binaries from Python using cxfreeze (which satisfies a want expressed by 2sigma) I landed the Python translation of ntpq this evening.

(Hal, I tested and it works both in-tree without installation and out of tree after waf install. If you still have an environment-specific problem we'll fix that.)

The good news: the peers spreadsheet now automatically resizes horizontally in wide terminal emulator windows.

The bad news: Gary says it's 6x slower than the C version, and this does not change when you cxfreeze it. That's OK, it only needs to operate at human speed. It seems unlikely anyone will ever care.

The better news: that's 7KLOC more gone. We're now down to 67KLOC, which is 29% of the original codebase size.
Now let me step back and remind everyone of the last strategic look forward ("Forward-planning towards release 1.0", Wed Oct 5 04:27:10 UTC 2016). Here's the to-do list from that post exactly a month ago:

* Fix symmetric-peer mode and MS-SNTP, definitely.
* Drop broadcast client mode, which Daniel rightly notes is fundamentally impossible to secure
* Fix broadcast server mode, because Mark says a lot of Windows client users rely on it, but add a strong warning that it's a bad idea and users should transition out ASAP.

My understanding is that symmetric-peer mode has been merged with server mode. Daniel, can you confirm? Is it documented where it needs to be? I suspect not - I grepped for "symmetric" and found references in assoc.txt, authentic.txt, disipline.txt, orphan.txt, and select.txt that pretty clearly need to be changed. (We probably can't have symmetric loop errors any more, which is a good thing.)

Has MS-SNTP authentication been restored?

What is the state of broadcast modes?

I then proposed three possible scenarios for working towards release. We weren't feeling time pressure, so we ended up in Case Blue. I note that my schedule projection for landing Python ntpq was accurate to the day. :-)

Daniel has been heard to say that with symmetric-peer mode out of the picture there may no longer be a need to write a new restriction language. That's just as well in my opinion as it would reduce the friction of adoption.

However, Daniel has also said the default restrictions are a security problem. Daniel, would you say more about that, please? What, if anything, do we need to change?

Mark has noted that we need to carry packaging help and possibly metadata for major distributions. I don't know that anyone has taken it on.

We have 5 bugs on the tracker. Three of them relate to the GPSD refclock, which I think may not be useful enough to be worth saving. Gary disagrees. I think that means Gary gets to fix it.
If it doesn't get fixed and becomes the last remaining 1.0 blocker, I'm reserving the option to yank it on grounds of too crappy to ship.

One bug is an issue Hal Murray seems to understand but I don't.

As for me, I think my next two tasks are (1) write the nice sizzly ntpmon tool we've been discussing, and (2) move ntpdig to Python. That'll drop 3 more KLOC and remove our libevent2 dependency.

Neither job will be difficult, because I've already built back-end machinery to speak Mode 6 packets in Python (including authenticated mode 6) that can be used for ntpmon and adapted to speak SNTP.

In summary, here is my list of pre-1.0 issues and the people who seem to own them.

* Documentation update for abolition of symmetric-peer mode. (Daniel.)
* MS-SNTP support (Daniel).
* Decision, implementation, and documentation on broadcast modes (Daniel).
* Do we really not need a new restrict language? What defaults need to change? (Daniel)
* Package metadata for major distributions (?).
* Tracker issues related to refclocks #20 and #46 (Gary).
* Tracker issue #44 (Hal).
* ntpmon and Python ntpdig (myself).
--
Eric S. Raymond

From hmurray at megapathdsl.net Sat Nov 5 19:06:21 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 05 Nov 2016 12:06:21 -0700
Subject: Python ntpq lands - what to do next?
In-Reply-To: Message from "Eric S. Raymond" of "Sat, 05 Nov 2016 01:36:36 EDT." <20161105053636.GA30100@thyrsus.com>
Message-ID: <20161105190621.935CF406061@ip-64-139-1-69.sjc.megapath.net>

esr at thyrsus.com said:
> (Hal, I tested and it works both in-tree without installation and out of
> tree after waf install. If you still have an environment-specific problem
> we'll fix that.)

It's broken for me.
[murray at glypnod ntpq]$ ./ntpq
Traceback (most recent call last):
  File "./ntpq", line 17, in
    from ntp.packet import *
  File "/home/murray/ntpsec/play/ntpq/ntp/packet.py", line 16, in
    from ntpc import lfptofloat
ImportError: No module named ntpc
[murray at glypnod ntpq]$

That's with no PYTHONPATH.

There is a link in pylib:
  ntpc.so -> ../build/main/libntp/ntpc.so
I don't have a build directory.

I assume waf configure should setup that link.

-----------

I haven't figured out what's going on here.

[110/137] Compiling VERSION
error: No repo or cache detected.

-> task in '/home/murray/ntpsec/play/pylib/version.py' failed with exit status 1 (run with -v to display more information)

-rw-rw-r-- 1 murray murray 0 Nov 5 11:42 version.py

What I'm doing is building on a system without git after rsyncing from my main system. The rsync step excludes as much as I can, mostly .waf and .git

How is this new version stuff going to work from a tarball?

After build, there are things like statfiles.pyc in pylib/
Is that the right place for them? Why are they not in build/main/pylib/ in parallel with the other compiler output?

--
These are my opinions. I hate spam.

From hmurray at megapathdsl.net Sat Nov 5 20:26:59 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 05 Nov 2016 13:26:59 -0700
Subject: Python ntpq lands - what to do next?
In-Reply-To: Message from "Eric S. Raymond" of "Sat, 05 Nov 2016 01:36:36 EDT." <20161105053636.GA30100@thyrsus.com>
Message-ID: <20161105202659.55BE9406061@ip-64-139-1-69.sjc.megapath.net>

> * Tracker issue #44 (Hal).

Don't be surprised if that doesn't get solved anytime soon.

Please don't blow it away because you can't reproduce and/or want to clean up the list. We need a way to keep track of rare quirks.

--
These are my opinions. I hate spam.

From gem at rellim.com Sat Nov 5 21:10:35 2016
From: gem at rellim.com (Gary E.
Miller)
Date: Sat, 5 Nov 2016 14:10:35 -0700
Subject: ✘0.250ppm/°C
Message-ID: <20161105141035.17b7523e@spidey.rellim.com>

Yo All!!

Ntpviz now has a plot for frequency vs temp. The results are interesting.

Attached is from a Pi3. It plots the Local Frequency Offset vs. the Pi CPU temp.

By eyeball looks like a frequency shift of 0.250 ppm/°C

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

-------------- next part --------------
A non-text attachment was scrubbed...
Name: local-freq-ttemps.png
Type: image/png
Size: 22297 bytes
Desc: not available
URL:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 455 bytes
Desc: OpenPGP digital signature
URL:

From esr at thyrsus.com Sat Nov 5 21:29:23 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 5 Nov 2016 17:29:23 -0400
Subject: Python ntpq lands - what to do next?
In-Reply-To: <20161105202659.55BE9406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161105053636.GA30100@thyrsus.com> <20161105202659.55BE9406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161105212923.GA11010@thyrsus.com>

Hal Murray :
> > * Tracker issue #44 (Hal).
>
> Don't be surprised if that doesn't get solved anytime soon.
>
> Please don't blow it away because you can't reproduce and/or want to clean up
> the list. We need a way to keep track of rare quirks.

I have no intention of doing that.
--
Eric S. Raymond

From esr at thyrsus.com Sat Nov 5 21:50:00 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 5 Nov 2016 17:50:00 -0400
Subject: Python ntpq lands - what to do next?
In-Reply-To: <20161105190621.935CF406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161105053636.GA30100@thyrsus.com> <20161105190621.935CF406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161105215000.GB11010@thyrsus.com>

Hal Murray :
>
> esr at thyrsus.com said:
> > (Hal, I tested and it works both in-tree without installation and out of
> > tree after waf install. If you still have an environment-specific problem
> > we'll fix that.)
>
> It's broken for me.
>
> [murray at glypnod ntpq]$ ./ntpq
> Traceback (most recent call last):
>   File "./ntpq", line 17, in
>     from ntp.packet import *
>   File "/home/murray/ntpsec/play/ntpq/ntp/packet.py", line 16, in
>     from ntpc import lfptofloat
> ImportError: No module named ntpc
> [murray at glypnod ntpq]$
>
> That's with no PYTHONPATH.
>
> There is a link in pylib:
> ntpc.so -> ../build/main/libntp/ntpc.so
> I don't have a build directory.

That's odd. Where is waf putting your binaries?

> I assume waf configure should setup that link.

No, I made it by hand based on the normal shape of the tree after a waf build. Clearly you have done something to break that assumption, but I don't yet know what.

> I haven't figured out what's going on here.
>
> [110/137] Compiling VERSION
> error: No repo or cache detected.
>
> -> task in '/home/murray/ntpsec/play/pylib/version.py' failed with exit
> status 1 (run with -v to display more information)
>
> -rw-rw-r-- 1 murray murray 0 Nov 5 11:42 version.py
>
> What I'm doing is building on a system without git after rsyncing from my
> main system. The rsync step excludes as much as I can, mostly .waf and .git
>
> How is this new version stuff going to work from a tarball?

What's going on: In fulfilment of a request from Gary to make package version information easily accessible from Python, I told waf to generate a file pylib/version.py using VERSION as input. Mine looks like this:

# Generated by autorevision - do not hand-hack!
TYPE = "git"
BASENAME = "ntpsec"
UUID = "5a60ff124237ba5d23101e4ca60016ebeae7cb50"
NUM = 11271
DATE = "2016-11-05T20:26:32Z"
BRANCH = "master"
TAG = "NTPsec_0_9_4"
TICK = 838
VERSION = "0.9.5"
ACTION_STAMP = "2016-11-05T20:26:32Z!esr at thyrsus.com"
FULL_HASH = "30b568a7e26aaf7c43089545d4c0384834046c3d"
SHORT_HASH = "30b568a"
WC_MODIFIED = True
# end

I believe that the production won't fire (and fail) if pylib/version.py exists and is newer than VERSION, which I expect to be the case in a tarball that preserves dates (because it will always be true after waf runs).

So the first question is, does your unpacked tarball have a pylib/version.py? And is it newer or older than VERSION?

> After build, there are things like statfiles.pyc in pylib/
> Is that the right place for them? Why are they not in build/main/pylib/ in
> parallel with the other compiler output?

It's a waf quirk I don't know the reason for. If you look in build/main/pylib/ you'll probably see this:

__init__.pyo ntp_magic.pyo statfiles.pyo version.pyo

Those are the optimized versions that "waf install" copies to rootspace.
--
Eric S. Raymond

From hmurray at megapathdsl.net Sat Nov 5 22:57:21 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 05 Nov 2016 15:57:21 -0700
Subject: Python ntpq lands - what to do next?
In-Reply-To: Message from "Eric S. Raymond" of "Sat, 05 Nov 2016 17:50:00 EDT." <20161105215000.GB11010@thyrsus.com>
Message-ID: <20161105225721.622D1406061@ip-64-139-1-69.sjc.megapath.net>

esr at thyrsus.com said:
>> I don't have a build directory.
> That's odd. Where is waf putting your binaries?

Off in some other directory where I told it to put them
  -o OUT, --out=OUT build dir for the project

--
These are my opinions. I hate spam.

From esr at thyrsus.com Sat Nov 5 23:46:27 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 5 Nov 2016 19:46:27 -0400
Subject: Python ntpq lands - what to do next?
In-Reply-To: <20161105225721.622D1406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161105215000.GB11010@thyrsus.com> <20161105225721.622D1406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161105234627.GA14028@thyrsus.com>

Hal Murray :
>
> esr at thyrsus.com said:
> >> I don't have a build directory.
> > That's odd. Where is waf putting your binaries?
>
> Off in some other directory where I told it to put them
>   -o OUT, --out=OUT   build dir for the project

OK, that link will not work, then. I need to figure out how to tell waf to generate it correctly for whatever the current build directory is. I have an idea about how to do that...

Do you get correct behavior after an install to rootspace?

--
Eric S. Raymond

From hmurray at megapathdsl.net Sat Nov 5 23:55:59 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 05 Nov 2016 16:55:59 -0700
Subject: Python ntpq lands - what to do next?
In-Reply-To: Message from "Eric S. Raymond" of "Sat, 05 Nov 2016 19:46:27 EDT." <20161105234627.GA14028@thyrsus.com>
Message-ID: <20161105235559.99A46406061@ip-64-139-1-69.sjc.megapath.net>

> Do you get correct behavior after an install to rootspace?

[murray at glypnod play]$ ntpq -p
Traceback (most recent call last):
  File "/usr/local/bin/ntpq", line 17, in <module>
    from ntp.packet import *
ImportError: No module named ntp.packet

From the install step:
+ install /usr/local/lib/pylib/packet.pyc (from pylib/packet.pyc)

no PYTHONPATH

--
These are my opinions. I hate spam.

From hmurray at megapathdsl.net Sun Nov 6 00:00:30 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 05 Nov 2016 17:00:30 -0700
Subject: libntp/lib_strbuf.c.1.o created more than once
Message-ID: <20161106000030.E33D5406061@ip-64-139-1-69.sjc.megapath.net>

from the bottom of waf build -v

* Node /home/murray/ntpsec/play/build/main/libntp/lib_strbuf.c.1.o is created more than once (full message on 'waf -v -v'). The task generators are:
  1. 'ntp' in /home/murray/ntpsec/play/libntp
  2.
'ntp' in /home/murray/ntpsec/play/libntp If you think that this is an error, set no_errcheck_out on the task instance -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Sun Nov 6 01:29:27 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 05 Nov 2016 18:29:27 -0700 Subject: colors from waf => crap in log files Message-ID: <20161106012927.32BD9406061@ip-64-139-1-69.sjc.megapath.net> waf likes fancy colors. >From build -v, it prints "runner" in red. You end up with things like this: 18:10:51 ^[[35mrunner^[[0m ['/usr/bin/bison', '-d', '--debug', '/home/murray/ntpsec/play/ntpd/ntp_parser.y', '-o', 'ntp_parser.tab.c'] But I don't get them in a log file made by a script. Does anybody understand that area? The script tee-s to a log file, but doing that by hand still got colors. -- These are my opinions. I hate spam. From esr at thyrsus.com Sun Nov 6 01:32:40 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 5 Nov 2016 21:32:40 -0400 Subject: libntp/lib_strbuf.c.1.o created more than once In-Reply-To: <20161106000030.E33D5406061@ip-64-139-1-69.sjc.megapath.net> References: <20161106000030.E33D5406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161106013240.GA14782@thyrsus.com> Hal Murray : > > from the bottom of waf build -v > > * Node /home/murray/ntpsec/play/build/main/libntp/lib_strbuf.c.1.o is created > more than once (full message on 'waf -v -v'). The task generators are: > 1. 'ntp' in /home/murray/ntpsec/play/libntp > 2. 'ntp' in /home/murray/ntpsec/play/libntp > If you think that this is an error, set no_errcheck_out on the task instance Thanks, that was trivial to fix. Done. -- Eric S. Raymond From esr at thyrsus.com Sun Nov 6 01:43:59 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 5 Nov 2016 21:43:59 -0400 Subject: Python ntpq lands - what to do next? 
In-Reply-To: <20161105235559.99A46406061@ip-64-139-1-69.sjc.megapath.net> References: <20161105234627.GA14028@thyrsus.com> <20161105235559.99A46406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161106014359.GB14782@thyrsus.com> Hal Murray : > > > Do you get correct behavior after an install to rootspace? > > [murray at glypnod play]$ ntpq -p > Traceback (most recent call last): > File "/usr/local/bin/ntpq", line 17, in > from ntp.packet import * > ImportError: No module named ntp.packet > > >From the install step: > + install /usr/local/lib/pylib/packet.pyc (from pylib/packet.pyc) > > no PYTHONPATH Uh oh. I think this may be a regression in waf 0.9.5. The build production for the Python in pylib says: ctx(features='py', source=ctx.path.ant_glob('*.py'), install_from='.', install_path='${PYTHONDIR}/ntp') But it seems to be ignoring that install_path argument. I don't think that was happening in 0.9.4. I'll go talk to the maintainer. -- Eric S. Raymond From esr at thyrsus.com Sun Nov 6 01:45:38 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 5 Nov 2016 21:45:38 -0400 Subject: colors from waf => crap in log files In-Reply-To: <20161106012927.32BD9406061@ip-64-139-1-69.sjc.megapath.net> References: <20161106012927.32BD9406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161106014538.GC14782@thyrsus.com> Hal Murray : > > waf likes fancy colors. > > >From build -v, it prints "runner" in red. You end up with things like this: > > 18:10:51 ^[[35mrunner^[[0m ['/usr/bin/bison', '-d', '--debug', > '/home/murray/ntpsec/play/ntpd/ntp_parser.y', '-o', 'ntp_parser.tab.c'] > > But I don't get them in a log file made by a script. Does anybody understand that area? > > The script tee-s to a log file, but doing that by hand still got colors. Options: --version show program's version number and exit -h, --help show this help message and exit -c COLORS, --color=COLORS You can turn them off. -- Eric S. 
Raymond From hmurray at megapathdsl.net Sun Nov 6 02:10:03 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 05 Nov 2016 19:10:03 -0700 Subject: Python ntpq lands - what to do next? In-Reply-To: Message from "Eric S. Raymond" of "Sat, 05 Nov 2016 17:50:00 EDT." <20161105215000.GB11010@thyrsus.com> Message-ID: <20161106021003.1F5FB406061@ip-64-139-1-69.sjc.megapath.net> > What's going on: In fulfilment of a request from Gary to make package > version information easily accessible from Python, I told waf to generate a > file pylib/version.py using VERSION as input. Mine looks like this: > ... > I believe that the production won't fire (and fail) if pylib/version.py > exists and is newer than VERSION, which I expect to be the case in a tarball > that prserves dates (because it will always be true after waf runs). I'm not actually making a tarball, just trying to do something similar. I have a script to do all the work, mostly so I don't forget a critical step, but also to make a log file without extra typing and such. It should be a line or two in the right place if/when I figure out what to do. > So the first question is, does your unpacked tarball have a pylib/ > version.py? And is it newer or older than VERSION? -rw-rw-r-- 1 murray murray 6 Oct 13 01:45 VERSION -rw-rw-r-- 1 murray murray 408 Nov 5 17:37 pylib/version.py [100/132] Compiling VERSION 18:11:27 runner ' VERSION=`cat ../VERSION` ../wafhelpers/autorevision.sh -t python >version.py ' error: No repo or cache detected. That's from my attempt at mimicking a tarball process. So I tried the real things. ./waf distcheck worked and left behind noname-1.0.tar.bz2 I scp-ed that to another machine that didn't have git, untared, and ... [ 98/132] Compiling VERSION error: No repo or cache detected. 
Waf: Leaving directory `/home/murray/foo/noname-1.0/build/main' Build failed -> task in '/home/murray/foo/noname-1.0/pylib/version.py' failed with exit status 1 (run with -v to display more information) ---------- This is turning into an interesting can of worms. waf clean doesn't get pylib/ntp_control.py or pylib/ntp_magic.py or pylib/version.py is that a bug or feature? The parallel non-python version stuff just uses the time if it can't get info from git. You may want to "fix" that to do something similar. It's only used by ntpd. (There used to be one for ntpq too.) There are actually two times of interest. One is the latest time a source file was modified. The other is the time the package was built. Python may not have the concept of build, but it might come back if you are translating from python to c. If you edit a file, waf doesn't automagically update version.c So my rebuild script deletes */main/ntpd/version.c so waf rebuilds it to get the updated build-time. That version stuff is currently half broken. ntpd --version --version ntpd 0.9.5-5feab12-glypnod Nov 5 2016 17:37:08 ntpd CFLAGS=Need-CFLAGS LDFLAGS=Need-LDFLAGS -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Sun Nov 6 02:37:51 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 05 Nov 2016 19:37:51 -0700 Subject: colors from waf => crap in log files In-Reply-To: Message from "Eric S. Raymond" of "Sat, 05 Nov 2016 21:45:38 EDT." <20161106014538.GC14782@thyrsus.com> Message-ID: <20161106023751.63CB5406061@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > -c COLORS, --color=COLORS > You can turn them off. waf: error: option -c: invalid choice: 'off' (choose from 'yes', 'no', 'auto') Thanks. I assume the default is auto. How do I find out what that really does? My quick attempt with google didn't find ianything. -- These are my opinions. I hate spam. From esr at thyrsus.com Sun Nov 6 03:23:39 2016 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Sat, 5 Nov 2016 23:23:39 -0400 Subject: Python ntpq lands - what to do next? In-Reply-To: <20161106021003.1F5FB406061@ip-64-139-1-69.sjc.megapath.net> References: <20161105215000.GB11010@thyrsus.com> <20161106021003.1F5FB406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161106032339.GA21859@thyrsus.com> Hal Murray : > > So the first question is, does your unpacked tarball have a pylib/ > > version.py? And is it newer or older than VERSION? > > -rw-rw-r-- 1 murray murray 6 Oct 13 01:45 VERSION > -rw-rw-r-- 1 murray murray 408 Nov 5 17:37 pylib/version.py > > [100/132] Compiling VERSION > 18:11:27 runner ' VERSION=`cat ../VERSION` ../wafhelpers/autorevision.sh -t > python >version.py ' > error: No repo or cache detected. > > That's from my attempt at mimicking a tarball process. Bletch. OK, I thought autorevision was smarter than it is. (I contributed to it, but I'm not the maintainer.) I'll go look at our copy... OK, I've just modified the autorevision command in pylib to make and use a cache file. The cache file location is declared as a target, which should mean it gets picked up by waf dist. > This is turning into an interesting can of worms. > > waf clean doesn't get pylib/ntp_control.py or pylib/ntp_magic.py or > pylib/version.py > is that a bug or feature? A minor bug. What's special about those files is that they're generated into the pylib *source* directory by the normal build; waf doesn't try to clean that directory. The underlying problem here is the same thing that's messing up your in-tree testing - the kludge I was using isn't good enough. At the moment the magic ntp symlinks that are supposed to let you run the Python stuff without installing to rootspace point to pylib/; a symlink there forwards references to ntpc.so (the Python extension) to the build directory. What I need to do is make those ntp/ links forward to the ${build}/pylib/ directory in such a way that it works even with --out. I guess that's my top task now. 
> The parallel non-python version stuff just uses the time if it can't get info > from git. > You may want to "fix" that to do something similar. > It's only used by ntpd. (There used to be one for ntpq too.) > > There are actually two times of interest. One is the latest time a source > file was modified. The other is the time the package was built. Python may > not have the concept of build, but it might come back if you are translating > from python to c. > > If you edit a file, waf doesn't automagically update version.c > So my rebuild script deletes > */main/ntpd/version.c > so waf rebuilds it to get the updated build-time. > > That version stuff is currently half broken. > ntpd --version --version > ntpd 0.9.5-5feab12-glypnod Nov 5 2016 17:37:08 > ntpd CFLAGS=Need-CFLAGS LDFLAGS=Need-LDFLAGS OK, this is exactly what autorevision is for. If I tell it to output in C format it will make a version.h that ntpd can use. I'll make that my next task once I've got your bugs fixed. -- Eric S. Raymond From esr at thyrsus.com Sun Nov 6 03:30:12 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 5 Nov 2016 23:30:12 -0400 Subject: colors from waf => crap in log files In-Reply-To: <20161106023751.63CB5406061@ip-64-139-1-69.sjc.megapath.net> References: <20161106014538.GC14782@thyrsus.com> <20161106023751.63CB5406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161106033012.GB21859@thyrsus.com> Hal Murray : > I assume the default is auto. I think so. > How do I find out what that really does? My quick attempt with google didn't > find ianything. That's because the waf documentation is horrible - it's the one real drawback waf has. It's the maintainer brain-dumping about *his* knowledge of the system in a way nearly inaccessible to anyone else. Thus, the user community is tiny and has not generated the cloud of googlable FAQs and hints and tips that you normally get around tools like this. I occasionally think about starting a waf FAQ. -- Eric S. 
Raymond From hmurray at megapathdsl.net Sun Nov 6 04:49:47 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 05 Nov 2016 21:49:47 -0700 Subject: Python ntpq lands - what to do next? In-Reply-To: Message from "Eric S. Raymond" of "Sat, 05 Nov 2016 23:23:39 EDT." <20161106032339.GA21859@thyrsus.com> Message-ID: <20161106044947.0D55A406061@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > OK, I've just modified the autorevision command in pylib to make and use a > cache file. The cache file location is declared as a target, which should > mean it gets picked up by waf dist. I think that's working with my hack and with a real tarball. Thanks. There is still a can of worms in this area. If I do a git pull and build to setup that cache, then do an edit but no commit, version.py doesn't show any difference between the before/after than edit. >> waf clean doesn't get pylib/ntp_control.py or pylib/ntp_magic.py or >> pylib/version.py > A minor bug. What's special about those files is that they're generated > into the pylib *source* directory by the normal build; waf doesn't try to > clean that directory. The c file from bison goes into xxx/host/ntpd/ntp_parser.tab.c so waf knows how to work with source files that aren't in the source tree. But I don't know if python can find them. -- These are my opinions. I hate spam. From esr at thyrsus.com Sun Nov 6 05:50:33 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 6 Nov 2016 01:50:33 -0400 Subject: Python ntpq lands - what to do next? In-Reply-To: <20161106044947.0D55A406061@ip-64-139-1-69.sjc.megapath.net> References: <20161106032339.GA21859@thyrsus.com> <20161106044947.0D55A406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161106055033.GA27269@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > > OK, I've just modified the autorevision command in pylib to make and use a > > cache file. The cache file location is declared as a target, which should > > mean it gets picked up by waf dist. 
> > I think that's working with my hack and with a real tarball. Thanks. Good, that's one down. > There is still a can of worms in this area. If I do a git pull and build to > setup that cache, then do an edit but no commit, version.py doesn't show any > difference between the before/after than edit. Right, and it should change at least the value of WC_MODIFIED. The only way to make sure that happens would be to force autorevision to be run on every build. I've left a question about this on #waf. The maintainer is pretty good about answering those. > >> waf clean doesn't get pylib/ntp_control.py or pylib/ntp_magic.py or > >> pylib/version.py > > A minor bug. What's special about those files is that they're generated > > into the pylib *source* directory by the normal build; waf doesn't try to > > clean that directory. > > The c file from bison goes into xxx/host/ntpd/ntp_parser.tab.c > so waf knows how to work with source files that aren't in the source tree. > But I don't know if python can find them. This is yet another face of the mess involving pylib and the build directory. There are Python libraries needed by ntpq and (soon) by ntpmon under pylib. One wrinkle is that they need to be installed not as 'pylib' under the Python dist-files directory but rather as 'ntp'. That means we need a custom installation production, which I think I don't have quite right yet. Some Python files are being installed twice, once in the right place under /usr/local/lib/python2.7/dist-packages/ and once under /usr/local/lib. Another is that we need (or at least I think we need) to set things up so that, e.g. ntpq can be tested from under ntpq/ without installing pylib to rootspace. 
This constraint is why these files exist:

lrwxrwxrwx 1 esr esr 8 Nov 6 00:14 ntpq/ntp -> ../pylib
lrwxrwxrwx 1 esr esr 9 Oct 20 23:47 ntpstats/ntp -> ../pylib/
lrwxrwxrwx 1 esr esr 9 Oct 20 23:47 ntpsweep/ntp -> ../pylib/
lrwxrwxrwx 1 esr esr 8 Oct 20 23:47 ntptrace/ntp -> ../pylib
lrwxrwxrwx 1 esr esr 8 Oct 20 23:47 ntpwait/ntp -> ../pylib

When you run Python in any of these directories and import (say) ntp.packet, the symlink forwards that import to pylib. That would be sufficient if everything you needed to load were one of the Python source files that is at home there.

Unfortunately, this isn't the case. There are two kinds of generated things in that directory. One is three files, version.py, ntp_control.py, and ntp_magic.py, that are generated during a build. As you now know, version.py is generated by autorevision.sh and it is not yet clear how to get it to run often enough. The ntp_control.py and ntp_magic.py files are script-generated translations of C header files that make a bunch of macro definitions visible to Python.

The other special case is ntpc.so. This is the Python extension module. It is generated into the build directory. In order to make it visible when running ntpq/ntpq before installation, there is this symlink.

lrwxrwxrwx 1 esr esr 28 Nov 5 17:00 pylib/ntpc.so -> ../build/main/libntp/ntpc.so

This is the part that is coming unstuck for you, because that's a static link that doesn't track where --out is putting the build directory.

Now I'm trying to figure out a way to swap the static "ntp" directory links for waf-generated links that forward to pylib/ under the *build* directory, whatever that is (and even if you're using --out). ntpq/ntpq would no longer run before a waf build, but I could simplify things by having the generated Python be put there rather than hanging out as special cases in the source directory.

I'm sure there is a way to do this, but the opacity of the waf documentation makes it unreasonably difficult to figure that out.
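The forwarding trick described above (a directory symlink named ntp that points at pylib) can be tried out in a scratch directory. This is only a sketch, assuming a POSIX filesystem with symlink support; the file contents and the function name demo_symlink_import are made up for illustration and are not part of the ntpsec tree.

```python
import importlib
import os
import shutil
import sys
import tempfile


def demo_symlink_import():
    """Create pylib/ plus ntpq/ntp -> ../pylib, then import ntp.packet via the link."""
    root = tempfile.mkdtemp()
    pylib = os.path.join(root, "pylib")
    tool_dir = os.path.join(root, "ntpq")
    os.mkdir(pylib)
    os.mkdir(tool_dir)
    # Stand-in package contents (hypothetical, not the real pylib files).
    with open(os.path.join(pylib, "__init__.py"), "w"):
        pass
    with open(os.path.join(pylib, "packet.py"), "w") as handle:
        handle.write("WHERE = 'pylib'\n")
    # Relative directory link, shaped like "ntpq/ntp -> ../pylib".
    os.symlink(os.path.join("..", "pylib"), os.path.join(tool_dir, "ntp"))
    # A script run from ntpq/ gets that directory first on sys.path.
    sys.path.insert(0, tool_dir)
    importlib.invalidate_caches()
    try:
        packet = importlib.import_module("ntp.packet")
        # realpath resolves the link, showing where the file really lives.
        return packet.WHERE, os.path.realpath(packet.__file__)
    finally:
        # Undo every side effect so the demo can be run repeatedly.
        sys.path.remove(tool_dir)
        sys.modules.pop("ntp.packet", None)
        sys.modules.pop("ntp", None)
        shutil.rmtree(root)
```

The import only finds what is physically reachable through the link target, which is why anything generated elsewhere (such as a compiled extension in a build directory) needs a link of its own or a different forwarding scheme.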
I may have to unbend and change pylib/ to ntp/. I dislike giving away that piece of the tree's namespace, but the name change confuses waf and requires ugly, only semi-working klugery. -- Eric S. Raymond From hmurray at megapathdsl.net Sun Nov 6 06:49:11 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 05 Nov 2016 23:49:11 -0700 Subject: Python ntpq lands - what to do next? In-Reply-To: Message from "Eric S. Raymond" of "Sun, 06 Nov 2016 01:50:33 EDT." <20161106055033.GA27269@thyrsus.com> Message-ID: <20161106064911.3567A406061@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > Right, and it should change at least the value of WC_MODIFIED. The only way > to make sure that happens would be to force autorevision to be run on every > build. I've left a question about this on #waf. The maintainer is pretty > good about answering those. I can hack that for my scripts. I'll just delete pylib/version.py > Some Python files are being installed twice, once in the right place under / > usr/local/lib/python2.7/dist-packages/ and once under /usr/local/lib. I think you have fixed the /usr/local/lib/pylib stuff (maybe post your msg. I just did a git pull to check) The current version string from ntpq doesn't have any date info. (and I don't care about the build directory) -- These are my opinions. I hate spam. From Stromeko at nexgo.de Sun Nov 6 07:33:49 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 06 Nov 2016 08:33:49 +0100 Subject: colors from waf => crap in log files References: <20161106023751.63CB5406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <87eg2pdpuq.fsf@Rainer.invalid> Hal Murray writes: > I assume the default is auto. > > How do I find out what that really does? The usual thing to do is check if stdout goes to a terminal and color that using escape sequences pulled from terminfo. If the output goes someplace else or the terminal doesn't support (enough) colors, don't emit escape sequences. 
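A minimal sketch of that conventional "auto" decision, in Python. This is not waf's actual code (which may well look at a different stream, or at other variables), and it hard-codes one ANSI sequence instead of consulting terminfo; want_color and paint are hypothetical names.

```python
import io
import os


def want_color(stream, mode="auto", environ=None):
    """Return True if ANSI color escapes should be written to stream.

    mode is 'yes', 'no' or 'auto'; 'auto' colors only an interactive
    terminal whose TERM is set to something other than 'dumb'.
    """
    environ = os.environ if environ is None else environ
    if mode == "yes":
        return True
    if mode == "no":
        return False
    isatty = getattr(stream, "isatty", None)
    if isatty is None or not isatty():
        # Pipes, files and tee all end up here.
        return False
    return environ.get("TERM", "dumb") != "dumb"


def paint(text, stream, mode="auto", environ=None):
    """Wrap text in a magenta escape sequence when color is wanted."""
    if want_color(stream, mode, environ):
        return "\x1b[35m" + text + "\x1b[0m"
    return text


if __name__ == "__main__":
    # An in-memory buffer is never a terminal, so no escapes appear.
    print(repr(paint("runner", io.StringIO())))
```

With logic like this, output piped through tee would come out uncolored unless the tool checks a different stream than the one being piped, or color was forced on.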
If you play around a bit with the TERM variable and piping / redirecting the output, you should see if waf is doing this. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Wavetables for the Terratec KOMPLEXER: http://Synth.Stromeko.net/Downloads.html#KomplexerWaves From hmurray at megapathdsl.net Sun Nov 6 08:02:32 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 06 Nov 2016 01:02:32 -0700 Subject: colors from waf => crap in log files In-Reply-To: Message from Achim Gratz of "Sun, 06 Nov 2016 08:33:49 BST." <87eg2pdpuq.fsf@Rainer.invalid> Message-ID: <20161106080232.A49E9406061@ip-64-139-1-69.sjc.megapath.net> Stromeko at nexgo.de said: >> I assume the default is auto. > >> How do I find out what that really does? > The usual thing to do is check if stdout goes to a terminal and color that > using escape sequences pulled from terminfo. If the output goes someplace > else or the terminal doesn't support (enough) colors, don't emit escape > sequences. If you play around a bit with the TERM variable and piping / > redirecting the output, you should see if waf is doing this. I was fishing for something like that, but my test cases are piping through tee so no terminal. -- These are my opinions. I hate spam. From esr at thyrsus.com Sun Nov 6 09:35:06 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 6 Nov 2016 04:35:06 -0500 Subject: Python ntpq lands - what to do next? In-Reply-To: <20161106064911.3567A406061@ip-64-139-1-69.sjc.megapath.net> References: <20161106055033.GA27269@thyrsus.com> <20161106064911.3567A406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161106093506.GB5353@thyrsus.com> Hal Murray : > I think you have fixed the /usr/local/lib/pylib stuff > (maybe post your msg. I just did a git pull to check) Yes, I was about to mail you to tell you this. > The current version string from ntpq doesn't have any date info. 
> (and I don't care about the build directory)

I guess I'll embed the date and short hash.

--
Eric S. Raymond

From Stromeko at nexgo.de Sun Nov 6 13:22:10 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 06 Nov 2016 14:22:10 +0100
Subject: ✘0.250ppm/°C
References: <20161105141035.17b7523e@spidey.rellim.com>
Message-ID: <877f8geoal.fsf@Rainer.invalid>

Gary E. Miller writes:
> Ntpviz now has a plot for frequency vs temp. The results are
> interesting.

…wasn't it you who said just two weeks ago that they wouldn't be?

> Attached is from a Pi3. It plots the Local Frequency Offset vs. the
> Pi CPU temp.

You really need to record the temperature aligned with the loopstat values if you want to correlate them. I actually read it out for each PPS timestamp, then average onto the next loopstat value (I'm running the loop at poll=4, so 16 seconds). The temperature measurement on the Pi is quite noisy, even though it appears they already do some sort of filtering on it.

> By eyeball looks like a frequency shift of 0.250 ppm/°C

The denominator is a temperature difference, which can't be reported in °C; so that should be ppm/K. The ballpark number is roughly what I'd determined earlier for my rasPi. I had initially assumed that they'd use AT cut crystals which would produce a third-order temperature dependency, but the characteristic is quite clearly parabolic. So it is probably a BT cut, but the parabolic constant extracted from the data is smaller than the theoretical value for this cut.

Over a period of time longer than a week you need to take crystal aging into account also. Aging is a logarithmic function of time, but assuming you've already aged out the initial transient and are looking at a short time period w.r.t. the total aging time, you can use linear aging instead. That makes the fit function

p2a(t,T)=pa*t+pb*(T-T0)**2+pc

I'm using these initial values for the fit

pa=100e-6 pb=-5-3 pc=-3.5 T0=64.
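For anyone following along outside gnuplot, the model above can be written out in Python as a rough sketch. The parameter values below are placeholders chosen only to exercise the functions, not the values from this message, and dp2a_dT is a hypothetical helper name; the units of t are whatever the data file uses.

```python
def p2a(t, T, pa, pb, pc, T0):
    """Frequency offset model: linear aging in t plus a parabola in temperature T."""
    return pa * t + pb * (T - T0) ** 2 + pc


def dp2a_dT(T, pb, T0):
    """Temperature coefficient of the model at T (ppm per kelvin if p2a is in ppm)."""
    return 2.0 * pb * (T - T0)


if __name__ == "__main__":
    # Placeholder parameters, for illustration only.
    pa, pb, pc, T0 = 1e-4, -0.004, -3.5, 64.0
    for temp in (44.0, 54.0, 64.0):
        print(temp, p2a(0.0, temp, pa, pb, pc, T0), dp2a_dT(temp, pb, T0))
```

The derivative makes the point of the parabolic shape explicit: the temperature coefficient is proportional to the distance from T0 and vanishes at T = T0, so the closer the oscillator runs to the vertex, the less a given temperature swing moves the frequency.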
then fit

fit [t=*:*][T=*:*][ppm=*:*] (p2a(t,T)-ppm) "joined.txt" using … via pa,pb,pc,T0

I'm only using values where the loop has converged to better than 1µs for the fit. Depending on how much data you have and what temperature range it spans you might need to fix some of those parameters to make the fit for the others converge.

In order to reduce the disturbances by fast ambient temperature transients now that the heating season has begun, I've bubblewrapped my rasPi (still in its case) + GPS module and put it into a corrugated cardboard box with some slits cut to feed the cables through. That has raised the thermal resistance enough that I can now operate the XO near the cusp of the TC curve. Currently I'm loading a single core with

sha512sum /dev/zero &

to raise the temperature a bit and there's a few hours of operation with two cores loaded this way in the data. I'll have to see how to get some finer control over that load (and maybe use a second core for "heating") so that I can operate exactly on the zero-TC point. But first I need to collect some more days worth of data.

[Attachment scrubbed: ppm.png (image/png, 98400 bytes)]

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Samples for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra

From gem at rellim.com Sun Nov 6 19:27:49 2016
From: gem at rellim.com (Gary E. Miller)
Date: Sun, 6 Nov 2016 11:27:49 -0800
Subject: ✘0.250ppm/°C
In-Reply-To: <877f8geoal.fsf@Rainer.invalid>
References: <20161105141035.17b7523e@spidey.rellim.com> <877f8geoal.fsf@Rainer.invalid>
Message-ID: <20161106112749.69c4e887@spidey.rellim.com>

Yo Achim!

On Sun, 06 Nov 2016 14:22:10 +0100 Achim Gratz wrote:

> Gary E. Miller writes:
> > Ntpviz now has a plot for frequency vs temp. The results are
> > interesting.
>
> …wasn't it you who said just two weeks ago that they wouldn't be?

Maybe I should have said illustrative? It makes visually obvious many details people have been arguing over in the abstract. I'm seeing at least 4 different things going on in that graph.

> > Attached is from a Pi3. It plots the Local Frequency Offset vs. the
> > Pi CPU temp.
>
> You really need to record the temperature aligned with the loopstat
> values if you want to correlate them. I actually read it out for each
> PPS timestamp, then average onto the next loopstat value (I'm running
> the loop at poll=4, so 16 seconds). The temperature measurement on
> the Pi is quite noisy, even though it appears they already do some
> sort of filtering on it.

Yeah, I'm starting to think we need to create a whole daemon devoted to logging, instead of the little throw away scripts in contrib.

> > By eyeball looks like a frequency shift of 0.250 ppm/°C
>
> The denominator is a temperature difference, which can't be reported
> in °C; so that should be ppm/K.

One °C is one °K for ratio purposes. The offset cancels out when you compute the delta.

> Over a period of time
> longer than a week you need to take crystal aging into account also.

I think my poor control of the test room temp vastly outweighs the aging. So I'm pondering some sort of chamber.

> I'm only using values where the loop has converged to better than 1µs
> for the fit.

But that's the thing. As long as the RasPi has GPS lock, any aging or temp error is corrected in the loop. Unless you want to provide good time with long GPS outages the aging is not important to NTP operation.

> In order to reduce the disturbances by fast ambient temperature
> transients now that the heating season has begun, I've bubblewrapped
> my rasPi (still in it's case) + GPS module and put it into a
> corrugated cardboard box with some slits cut to feed the cables
> through.

Ditto here on one of my RasPi.
That clearly buffered the RasPi a bit from room temp, but magnified the load factor effect. So on balance not a win.

> I'll have to see how to
> get some finer control over that load (and maybe use a second core
> for "heating") so that I can operate exactly on the zero-TC point.

I'm pondering a more direct thermal control. Fans, heaters, etc. But to me that is very much a low priority.

My concerns are different than yours. I could care less if the OSC frequency drifts, as long as NTP applies a good loop correction.

What I find worthy of fixing in the graphs is the frequency spikes, the temp related and especially the temp unrelated. What other forces are changing the frequency. Is there some way to change the loop control to improve time and frequency accuracy?

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

From Stromeko at nexgo.de Sun Nov 6 22:33:34 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 06 Nov 2016 23:33:34 +0100
Subject: ✘0.250ppm/°C
References: <20161105141035.17b7523e@spidey.rellim.com> <877f8geoal.fsf@Rainer.invalid> <20161106112749.69c4e887@spidey.rellim.com>
Message-ID: <8737j4dyrl.fsf@Rainer.invalid>

Gary E. Miller writes:
>> The denominator is a temperature difference, which can't be reported
>> in °C; so that should be ppm/K.
>
> One °C is one °K for ratio purposes. The offset cancels out when you
> compute the delta.

There isn't any unit °K, it's just K; and as I said, °C must not be used for temperature differences either. The two scale factors are the same.

>> Over a period of time
>> longer than a week you need to take crystal aging into account also.
>
> I think my poor control of the test room temp vastly outweighs the
> aging. So I'm pondering some sort of chamber.

The aging is indeed slow enough to get removed by the NTP loop, but it shows up when you look at a week or more worth of data. That will then make your fit worse or even prevent it from converging, which is why you explicitly need to take it into account there.

>> I'm only using values where the loop has converged to better than 1µs
>> for the fit.
>
> But that's the thing. As long as the RasPi has GPS lock, any aging or
> temp error is corrected in the loop. Unless you want to provide good
> time with long GPS outages the aging is not important to NTP
> operation.

Your mental model of how the NTP loop works seems to be missing something important: any change in the XO frequency shows up as an error in the measurements that NTP makes. Since that error is not unbiased when the frequency drifts along with temperature, it will take quite some time to get corrected and while it is getting corrected, there is a time offset that is proportional to the derivative of the frequency offset.

> Ditto here on one of my RasPi. That clearly buffered the RasPi a bit
> from room temp, but magnified the load factor effect. So on balance not
> a win.

That's why you will want to run it near the zero-TC point and with as uniform as possible power dissipation.

>> I'll have to see how to
>> get some finer control over that load (and maybe use a second core
>> for "heating") so that I can operate exactly on the zero-TC point.
>
> I'm pondering a more direct thermal control. Fans, heaters, etc. But
> to me that is very much a low priority.

That doesn't work the way you seem to imagine it, you just pile on another problem of keeping the temperature stable on a system with substantial and varying power dissipation. The time constants just don't match up and get more unfavorable by the added thermal mass.
If you want to throw hardware at it, just remove the XO and feed the rasPi 19.2MHz synthesized by a GPSDO (the navSpark timing module can do that for about $80) and run ntpd in external discipline mode (if that still works). > My concerns are different than yours. I could care less if the OSC > frequency drifts, as long as NTP applies a good loop correction. Again, if your yardstick changes between measurements, that correction is also off by definition. > What I find worthy of fixing in the graphs is the frequency spikes, > the temp related and especially the temp unrelated. What other forces > are changing the frequency. Ts there some way to change the loop > control to improve time and frequency accuracy? You need to first and foremost ensure that the the XO frequency is stable. The accuracy at timescales below 100s is limited by the short-term stability of the XO. You can shift that error between time and frequency within some reason, but it doesn't go away. If the frequency isn't stable, then its rate of change must be within the loop bandwidth; the slower any drifts, the better. Any other errors must be unbiased so they rapidly converge to zero by averaging. If you cannot ensure that either, then you need to have a nested loop that predicts the fast and/or systematic disturbances and incorporates the resulting correction into the control algorithm as a feed-forward component, so the feedback loop doesn't need to deal with that. Ensuring stability for that nested loop is left as an exercise for the reader. The latter part would require extensive characterisation of each system setup, so it is not really practical in the general case. Before I've changed the setup to the one I'm currently running, I've had the temperature vs. ppm curve predicted (again, that was with a non-causal filter, so the feed-forward part wouldn't have worked in reality) so that it would have reduced the swing on the NTP loop correction by a factor of 5 to 10, but some bias remained. 
More gain on the correction might get the offset down, but leans the loop towards oscillation or even chaotic behaviour. Running the XO near the zero-TC point does the same or better without adding a nested loop and fiddling with unknown system and control variables, so it's a much more practical approach, plus it doesn't cost anything but a bit of power. The differences between the actual loop correction and the off-line prediction are now indeed almost unbiased. The actual loop offset averages to zero (plus the MAD on the loop offset runs somewhere between 200...400ns, which is only about 4 to 8 cycles jitter of the 19.2MHz XO frequency). 99% of all PPS timestamps over the last day stay within ±10µs, 75% within ±1µs and slightly less than half within ±500ns. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From gem at rellim.com Sun Nov 6 22:52:56 2016 From: gem at rellim.com (Gary E. Miller) Date: Sun, 6 Nov 2016 14:52:56 -0800 Subject: =?UTF-8?B?4pyYMC4yNTBwcG0vwrBD?= In-Reply-To: <8737j4dyrl.fsf@Rainer.invalid> References: <20161105141035.17b7523e@spidey.rellim.com> <877f8geoal.fsf@Rainer.invalid> <20161106112749.69c4e887@spidey.rellim.com> <8737j4dyrl.fsf@Rainer.invalid> Message-ID: <20161106145256.0f0c8807@spidey.rellim.com> Yo Achim! On Sun, 06 Nov 2016 23:33:34 +0100 Achim Gratz wrote: > Gary E. Miller writes: > >> The denominator is a temperature difference, which can't be > >> reported in °C; so that should be ppm/K. > > > > One °C is one °K for ratio purposes. The offset cancels out when > > you compute the delta. > > There isn't any unit °K, it's just K; and as I said, °C must not be > used for temperature differences either. The two scale factors are > the same. picky, picky. Is this not the typesetting list?
> >> Over a period of time > >> longer than a week you need to take crystal aging into account > >> also. > > > > I think my poor control of the test room temp vastly outweighs the > > aging. So I'm pondering some sort of chamber. > > The aging is indeed slow enough to get removed by the NTP loop, but it > shows up when you look at a week or more worth of data. That will > then make your fit worse or even prevent it from converging, which is > why you explicitly need to take it into account there. I'm not seeing it. And if so then NTP is failing. https://pi4.rellim.com/week/#local_frequency/temp > >> I'm only using values where the loop has converged to better than > >> 1µs for the fit. > > > > But that's the thing. As long as the RasPi has GPS lock, any aging > > or temp error is corrected in the loop. Unless you want to > > provide good time with long GPS outages the aging is not important > > to NTP operation. > > Your mental model of how the NTP loop works seems to be missing > something important: any change in the XO frequency shows up as an > error in the measurements that NTP makes. Well, duh. > Since that error is not > unbiased when the frequency drifts along with temperature, it will > take quite some time to get corrected and while it is getting > corrected, there is a time offset that is proportional to the > derivative of the frequency offset. Yes, but I only see an obvious time offset error when the NTP frequency servo is too slow. Like after a sudden temp change. In normal practice the effect is too small to see on any of my 5 test cells. > > Ditto here on one of my RasPi. That clearly buffered the RasPi a > > bit from room temp, but magnified the load factor effect. So on > > balance not a win. > > That's why you will want to run it near the zero-TC point and with as > uniform as possible power dissipation. Hard to control the power dissipation on a PC serving the internet. I do agree that any environment is best at the zero-TC point.
But I do not agree that I can see that actually affect my NTP offsets, which is the real goal. > >> I'll have to see how to > >> get some finer control over that load (and maybe use a second core > >> for "heating") so that I can operate exactly on the zero-TC > >> point. > > > > I'm pondering a more direct thermal control. Fans, heaters, etc. > > But to me that is very much a low priority. > > That doesn't work the way you seem to imagine it, you just pile on > another problem of keeping the temperature stable on a system with > substantial and varying power dissipation. I've done it before, I'll do it again, just not high on my list. > If > you want to throw hardware at it, just remove the XO and feed the > rasPi 19.2MHz synthesized by a GPSDO (the navSpark timing module can > do that for about $80) and run ntpd in external discipline mode (if > that still works). WAY more expensive than what I am thinking of. Feel free to try it and send in the results. > > My concerns are different than yours. I could care less if the OSC > > frequency drifts, as long as NTP applies a good loop correction. > > Again, if your yardstick changes between measurements, that correction > is also off by definition. And again, if the change due to short term temp shift is small, NTP handles it. That is what I observe. > > What I find worthy of fixing in the graphs is the frequency spikes, > > the temp related and especially the temp unrelated. What other > > forces are changing the frequency? Is there some way to change the > > loop control to improve time and frequency accuracy? > > You need to first and foremost ensure that the XO frequency is > stable. The accuracy at timescales below 100s is limited by the > short-term stability of the XO. You can shift that error between time > and frequency within some reason, but it doesn't go away. I look forward to your experimental evidence that this effect is significant on the RasPi. > If the frequency isn't stable,[...]
Facts not in evidence. > 99% of all > PPS timestamps over the last day stay within ±10µs, 75% within ±1µs > and slightly less than half within ±500ns. Odd that I get much better numbers without all that work: Ranges...... 90% 98% StdDev Mean Units 0.976 2.515 0.449 -0.000 µs RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 From esr at thyrsus.com Mon Nov 7 13:09:26 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 7 Nov 2016 08:09:26 -0500 Subject: Prototype Python version of ntpdig lands. Message-ID: <20161107130926.GA5438@thyrsus.com> Well, *that* didn't take a lot of work. After reading a couple of articles on the Web about simple SNTP clients in Python, I decided to take a swing at the ntpdig replacement. Two days later it's mostly there; look in ntpdig/pyntpdig. The querying stuff works perfectly; you can't readily tell the old-style or JSON output from that of the C version. A few things are still missing; the big two are authentication support and actual time-stepping/slewing. The logging needs to be changed to use syslog, and there's one reply validation it should be doing that it doesn't yet. None of these things will be difficult or take a lot of code. I've added one new feature. Rather than simply taking the first sample it can get, pyntpdig has the ability to take multiple samples, apply bogon filtering, and choose the best. This is behavior similar to Classic's ntpdate - in fact I swiped the filtering code from ntpdate (then simplified it somewhat). Even with the new feature, the Python code is smaller than the C ntpdig code by a factor of 9.6, and not likely to get much larger.
That's an unusually high degree of compression - I generally see ratios of 2:1 to 5:1. And it's a big gain in maintainability down the road. In the process of working on this I learned that the C ntpdig code is a pretty extreme example of "because I can" overengineering. Asynchronous DNS lookup, when the typical query is one host, usually a pool server? *Really?* Please test this code and eyeball-review it. I expect to finish it over the next week. -- Eric S. Raymond From hmurray at megapathdsl.net Mon Nov 7 18:08:23 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 07 Nov 2016 10:08:23 -0800 Subject: ntpq needs a push after each line Message-ID: <20161107180823.E8EDA406061@ip-64-139-1-69.sjc.megapath.net> or you can't see anything if you use tee. I assume the python default is line buffered when sysout is a terminal but not if it's a pipe. -- These are my opinions. I hate spam. From Stromeko at nexgo.de Mon Nov 7 19:01:46 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Mon, 07 Nov 2016 20:01:46 +0100 Subject: =?utf-8?B?4pyYMC4yNTBwcG0vwrBD?= References: <20161105141035.17b7523e@spidey.rellim.com> <877f8geoal.fsf@Rainer.invalid> <20161106112749.69c4e887@spidey.rellim.com> <8737j4dyrl.fsf@Rainer.invalid> <20161106145256.0f0c8807@spidey.rellim.com> Message-ID: <87k2cfru5h.fsf@Rainer.invalid> Gary E. Miller writes: > Hard to control the power dissipation on a PC serving the internet. We weren't talking about a PC, or at least I was not. The only subject I'm discussing is a rasPi and more specifically the 2B, which has multiple cores to play with. >> 99% of all >> PPS timestamps over the last day stay within ?10?s, 75% within ?1?s >> and slightly less than half within ?500ns. > > Odd that I get much better numbers without all that work: Read again what I wrote, please. This data was for 86400 raw PPS timestamps (which you are seemingly not recording), you are talking the loopstats. 
Not surprisingly, the loopstat data as filtered through NTP and has a much tighter distribution. So in fact I do get better numbers than yours at least at the tail ends of the distribution. Over the last 48 hours (the distribution is slightly skewed with the median around -10ns and the average around 30ns): 1%?99% -600ns ? +850ns = 1350ns (1200ns no skew) 5%?95% -375ns ? +550ns = 925ns ( 750ns no skew) 10%?90% -290ns ? +400ns = 690ns ( 580ns no skew) 25%?75% -160ns ? +170ns = 330ns The skew seems to originate from the aging transient (especially the initial part, see below), assuming it gets fully removed as indicated by looking at the last few hours only it should converge to better than the numbers shown in parenthesis. Here's the last five days on that system, including some configuration changes: -------------- next part -------------- A non-text attachment was scrubbed... Name: lstats.png Type: image/png Size: 63220 bytes Desc: not available URL: -------------- next part -------------- The x axis is in hours, the ppm scale is 1.5ppm top to bottom on the (unlabeled) y2 axis. The first two days shown was an intermediate configuration (case closed, a single core loaded). Then comes a day of running with two cores loaded, then the system getting wrapped up and boxed and the load switched back to a single core. The long-term downward trend you see on the ppm is the aging transient of the crystal, the temperature (not shown in the plot) has been stable over the last two days. A week out this will look like a linear trend and a month out you'll not be able to see it on such a short timescale. It will start anew however if you do such a wild temperature swing on that crystal again. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Samples for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra From gem at rellim.com Mon Nov 7 19:16:10 2016 From: gem at rellim.com (Gary E. 
Miller) Date: Mon, 7 Nov 2016 11:16:10 -0800 Subject: =?UTF-8?B?4pyYMC4yNTBwcG0vwrBD?= In-Reply-To: <87k2cfru5h.fsf@Rainer.invalid> References: <20161105141035.17b7523e@spidey.rellim.com> <877f8geoal.fsf@Rainer.invalid> <20161106112749.69c4e887@spidey.rellim.com> <8737j4dyrl.fsf@Rainer.invalid> <20161106145256.0f0c8807@spidey.rellim.com> <87k2cfru5h.fsf@Rainer.invalid> Message-ID: <20161107111610.6957e82b@spidey.rellim.com> Yo Achim! On Mon, 07 Nov 2016 20:01:46 +0100 Achim Gratz wrote: > Gary E. Miller writes: > > Hard to control the power dissipation on a PC serving the > > internet. > > We weren't talking about a PC, or at least I was not. The only > subject I'm discussing is a rasPi and more specifically the 2B, which > has multiple cores to play with. I consider RasPis to be PCs, powerful little machines. I use my RasPis exactly as I use my other 'PCs'. > >> 99% of all > >> PPS timestamps over the last day stay within ±10µs, 75% within ±1µs > >> and slightly less than half within ±500ns. > > > > Odd that I get much better numbers without all that work: > > Read again what I wrote, please. Yeah, I still arrive at the same opinion. > This data was for 86400 raw PPS > timestamps (which you are seemingly not recording), you are talking > the loopstats. Correct. I care about the end result. But when I look at my raw data I actually see worse results in the unaveraged data. > So in fact I do get > better numbers than yours at least at the tail ends of the > distribution. Facts not in evidence. Please post ntpviz graphs and stats. I'm not gonna go round and round with vague assertions instead of data. Here are many of mine: https://pi.rellim.com https://pi2.rellim.com https://pi3.rellim.com https://pi4.rellim.com https://rellim.com > The skew seems to originate from the aging transient (especially the > initial part, see below), You think you are seeing xtal aging in 24 hours? I seriously doubt it and you'll need a lot longer data collection to prove it.
There may be 'initial' aging just after you cut a crystal, but not it sees a small transient, what you are seeing is loop convergence. Aging is by definition irreversible. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 455 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Mon Nov 7 19:39:56 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 7 Nov 2016 14:39:56 -0500 Subject: ntpq needs a push after each line In-Reply-To: <20161107180823.E8EDA406061@ip-64-139-1-69.sjc.megapath.net> References: <20161107180823.E8EDA406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161107193956.GC9007@thyrsus.com> Hal Murray : > or you can't see anything if you use tee. > > I assume the python default is line buffered when sysout is a terminal but > not if it's a pipe. Yes. It turns out to be remarkably difficult to change this: see http://stackoverflow.com/questions/107705/disable-output-buffering I can flush() each peer spreadsheet line; is that the main issue? -- Eric S. Raymond From Stromeko at nexgo.de Mon Nov 7 20:28:27 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Mon, 07 Nov 2016 21:28:27 +0100 Subject: =?utf-8?B?4pyYMC4yNTBwcG0vwrBD?= References: <20161105141035.17b7523e@spidey.rellim.com> <877f8geoal.fsf@Rainer.invalid> <20161106112749.69c4e887@spidey.rellim.com> <8737j4dyrl.fsf@Rainer.invalid> <20161106145256.0f0c8807@spidey.rellim.com> <87k2cfru5h.fsf@Rainer.invalid> <20161107111610.6957e82b@spidey.rellim.com> Message-ID: <87fun3rq50.fsf@Rainer.invalid> Gary E. Miller writes: > Yeah, I still arrive at tthe same opinion. You are free to keep it. 
>> This data was for 86400 raw PPS >> timestamps (which you are seemingly not recording), you are talking >> the loopstats. > > Correct. I care about the end result. But when I look attt my raw > data I actually see worse results in the unaveraged data. You still mistake the loop offset for that end result you so vaguely talk about. >> So in fact I do get better numbers than yours at least at the tail >> ends of the distribution. > > Facts not in evidence. You've conveniently snipped those numbers so you can make it look like none were provided. > Please post ntpviz graphs and stats. I gave you the statistics you've asked for and are now ignoring. I simply don't run ntpviz. > Yuu think you are seeing xtal aging in 24 hours? I seriously doubt it > and you'll need a lot longer data collection to prove it. You might want to read a whitepaper or book on how crystals age, there is a choice of more than a handful you can just download, from all the usual suspects. Or maybe you don't want and keep your opinion. Either way is fine with me. For your convenience, NIST defines "Aging is the systematic change in frequency with time due to internal changes in the oscillator." (that was proposed to become an IEEE standard in 1992). http://tf.boulder.nist.gov/general/pdf/979.pdf What you're seeing here is most likely caused by stress relaxation (thermal hysteresis), since the long-term aging actually goes the other way. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ DIY Stuff: http://Synth.Stromeko.net/DIY.html From gem at rellim.com Mon Nov 7 20:38:25 2016 From: gem at rellim.com (Gary E. 
Miller) Date: Mon, 7 Nov 2016 12:38:25 -0800 Subject: =?UTF-8?B?4pyYMC4yNTBwcG0vwrBD?= In-Reply-To: <87fun3rq50.fsf@Rainer.invalid> References: <20161105141035.17b7523e@spidey.rellim.com> <877f8geoal.fsf@Rainer.invalid> <20161106112749.69c4e887@spidey.rellim.com> <8737j4dyrl.fsf@Rainer.invalid> <20161106145256.0f0c8807@spidey.rellim.com> <87k2cfru5h.fsf@Rainer.invalid> <20161107111610.6957e82b@spidey.rellim.com> <87fun3rq50.fsf@Rainer.invalid> Message-ID: <20161107123825.127302a0@spidey.rellim.com> Yo Achim! On Mon, 07 Nov 2016 21:28:27 +0100 Achim Gratz wrote: > >> This data was for 86400 raw PPS > >> timestamps (which you are seemingly not recording), you are talking > >> the loopstats. > > > > Correct. I care about the end result. But when I look at my raw > > data I actually see worse results in the unaveraged data. > > You still mistake the loop offset for that end result you so vaguely > talk about. Vaguely? No, I'll just let my data speak for itself: https://pi.rellim.com https://pi2.rellim.com https://pi3.rellim.com https://pi4.rellim.com https://rellim.com When you have data that looks better, no matter how you get it, I'd like to try to replicate. > >> So in fact I do get better numbers than yours at least at the tail > >> ends of the distribution. > > > > Facts not in evidence. > > You've conveniently snipped those numbers so you can make it look like > none were provided. None useful. Seemed like cherry picking to me. Please provide ntpviz URLs so we can compare apples and apples. > > Please post ntpviz graphs and stats. > > I gave you the statistics you've asked for and are now ignoring. I > simply don't run ntpviz. Very few stats, with no backup. See above. > > You think you are seeing xtal aging in 24 hours? I seriously doubt > > it and you'll need a lot longer data collection to prove it.
> > You might want to read a whitepaper or book on how crystals age, there > is a choice of more than a handful you can just download, from all the > usual suspects. Or maybe you don't want and keep your opinion. > Either way is fine with me. Been there, done that. If you have one that shows 'reversible aging' please send the link. > What you're seeing here is most likely caused by stress relaxation > (thermal hysteresis), since the long-term aging actually goes the > other way. Then it is not aging, is it? I never said 'thermal hysteresis' was not an effect. But this is all arguing hwo many angles dance on the head of a pin, unless and until you can get the end results of NTP (loopstats) looking better. I have long thought the signal processing in ntpd was sub-optimal. I suspect there are changes in there that would help a lot, see if you can find one and share. Then we ccan try to replicatet. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 455 bytes Desc: OpenPGP digital signature URL: From Stromeko at nexgo.de Mon Nov 7 20:48:26 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Mon, 07 Nov 2016 21:48:26 +0100 Subject: ntpq needs a push after each line References: <20161107180823.E8EDA406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <87bmxrrp7p.fsf@Rainer.invalid> Hal Murray writes: > or you can't see anything if you use tee. > > I assume the python default is line buffered when sysout is a terminal but > not if it's a pipe. Yes, but running the command in question with stdbuf -oL ? should make it line-buffered anyway. Regards, Achim. 
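[Editorial sketch: the buffering behaviour discussed in this thread can be illustrated in a few lines of Python 3. When stdout is a pipe rather than a terminal, Python block-buffers it, so a reader such as tee sees nothing until the buffer fills or the program exits; flushing after every write avoids that without requiring stdbuf. The names `say` and `CountingStream` below are hypothetical and chosen for illustration only; this is not the ntpq source.]

```python
import io
import sys


def say(msg, stream=None):
    """Write msg and flush immediately, so a pipe reader sees it right away."""
    # Hypothetical helper: resolve sys.stdout at call time, not import time.
    stream = sys.stdout if stream is None else stream
    stream.write(msg)
    stream.flush()


class CountingStream(io.StringIO):
    """In-memory stream that counts flush() calls (demonstration only)."""

    def __init__(self):
        super().__init__()
        self.flushes = 0

    def flush(self):
        self.flushes += 1
        super().flush()


if __name__ == "__main__":
    demo = CountingStream()
    say("peer line 1\n", demo)
    say("peer line 2\n", demo)
    # One flush per line written through say().
    sys.stdout.write("%d flushes\n" % demo.flushes)
```

[Routing all output through one such function means the flush only has to be added in a single place.]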
-- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptation for Waldorf Blofeld V1.15B11: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From hmurray at megapathdsl.net Mon Nov 7 22:09:08 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 07 Nov 2016 14:09:08 -0800 Subject: ntpq needs a push after each line In-Reply-To: Message from "Eric S. Raymond" of "Mon, 07 Nov 2016 14:39:56 EST." <20161107193956.GC9007@thyrsus.com> Message-ID: <20161107220908.72F5D406061@ip-64-139-1-69.sjc.megapath.net> > I can flush() each peer spreadsheet line; is that the main issue? I think that would be fine. There are probably a zillion other places that also write lines. They may not matter unless it is slow to write the next line. I can think of two reasons for a line to be slow. One is collecting data from the target server. The other is doing a DNS lookup to print out a name rather than an IP Address. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Mon Nov 7 22:40:10 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 07 Nov 2016 14:40:10 -0800 Subject: We need a way to test the python stuff? Message-ID: <20161107224010.23406406061@ip-64-139-1-69.sjc.megapath.net> Or several ways. Step one is to run the local copy of ntpq (for example) using local libraries. The trick is to be sure that it is using the local libraries rather than a previously installed library. Step two is to test the installed code to be sure the libraries got installed where it can find them and that it is using the new libraries rather than a previously installed version. Mostly, this is testing the waf recipe, but we need most of this in order to be able to test new features in the pything code and/or new features in the libraries before they get installed. If nothing else, something like this should get added to the release checklist. 
(I think I'm assuming we would be willing to manually clean out the installed libraries to verify they aren't getting used. I don't want to do that very often, but I'd be willing to do it occasionally as part of getting ready for a release.) -- These are my opinions. I hate spam. From esr at thyrsus.com Mon Nov 7 22:52:42 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 7 Nov 2016 17:52:42 -0500 Subject: ntpq needs a push after each line In-Reply-To: <87bmxrrp7p.fsf@Rainer.invalid> References: <20161107180823.E8EDA406061@ip-64-139-1-69.sjc.megapath.net> <87bmxrrp7p.fsf@Rainer.invalid> Message-ID: <20161107225242.GA17272@thyrsus.com> Achim Gratz : > Hal Murray writes: > > or you can't see anything if you use tee. > > > > I assume the python default is line buffered when sysout is a terminal but > > not if it's a pipe. > > Yes, but running the command in question with > > stdbuf -oL ? > > should make it line-buffered anyway. I think that is an unreasonable hoop to expect people to jump through. -- Eric S. Raymond From esr at thyrsus.com Mon Nov 7 23:03:06 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 7 Nov 2016 18:03:06 -0500 Subject: ntpq needs a push after each line In-Reply-To: <20161107220908.72F5D406061@ip-64-139-1-69.sjc.megapath.net> References: <20161107193956.GC9007@thyrsus.com> <20161107220908.72F5D406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161107230306.GB17272@thyrsus.com> Hal Murray : > > I can flush() each peer spreadsheet line; is that the main issue? > > I think that would be fine. > > There are probably a zillion other places that also write lines. They may > not matter unless it is slow to write the next line. I can think of two > reasons for a line to be slow. One is collecting data from the target > server. The other is doing a DNS lookup to print out a name rather than an > IP Address. In ntpq it turns out there was an easy way to handle this. 
I wrote its code to do output through a method in the command-interpreter class named "say" which called sys.stdout.write. Adding sys.stdout.flush() after the write should have solved your problem. Score one for correct factoring. -- Eric S. Raymond From esr at thyrsus.com Mon Nov 7 23:46:30 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 7 Nov 2016 18:46:30 -0500 Subject: We need a way to test the python stuff? In-Reply-To: <20161107224010.23406406061@ip-64-139-1-69.sjc.megapath.net> References: <20161107224010.23406406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161107234630.GC17272@thyrsus.com> Hal Murray : > > Or several ways. > > Step one is to run the local copy of ntpq (for example) using local > libraries. The trick is to be sure that it is using the local libraries > rather than a previously installed library. We can force this with "waf uninstall". In order to avoid this problem I normally run without the libaries installed. > Step two is to test the installed code to be sure the libraries got installed > where it can find them and that it is using the new libraries rather than a > previously installed version. I'm pretty sure the only way using a previously installed version is possible is if you change the target directory for library installation to one that's later on your path than the stale one. It's not a case I worry about. > Mostly, this is testing the waf recipe, but we need most of this in order to > be able to test new features in the pything code and/or new features in the > libraries before they get installed. > > If nothing else, something like this should get added to the release > checklist. (I think I'm assuming we would be willing to manually clean out > the installed libraries to verify they aren't getting used. I don't want to > do that very often, but I'd be willing to do it occasionally as part of > getting ready for a release.) 
I think I know a better attack on the problem - that is, one less reliant on intervention by a human who might forget a step. It's contingent on the fact that new ntpq was deliberately written with as much of its functionality as possible pushed to the ntp Python library, in particular ntp.packet and ntp.util. If you look at ntpq/ntpq there's not much there besides a command interpreter wrapped around those library calls. I think the right answer is that the ntp Python library needs to have its own set of unit tests that are run as part of waf check. I haven't figured out the mechanics of this yet, but I'm sure looking closely at the way the existing tests are run will be enlightening. I guess I need to do this before I write ntpmon. Sigh. -- Eric S. Raymond From hmurray at megapathdsl.net Tue Nov 8 06:16:55 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 07 Nov 2016 22:16:55 -0800 Subject: (new) ntpq peers hangs In-Reply-To: Message from "Eric S. Raymond" of "Mon, 07 Nov 2016 14:42:35 EST." <20161107194235.GD9007@thyrsus.com> Message-ID: <20161108061655.CE92240605C@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: >> ntpq sometimes hangs. I assume it's due to a DNS problem. > Probably. I saw occasional hangs in the C version, too, and assumed that > was what caused them. Depends on what you mean by "hang". The DNS lookup can a take long time to give up. The problem I'm chasing takes longer than that. >> What do I type to python to use it as a debugger? > https://docs.python.org/2/library/pdb.html What I was expecting was that ^C would get me to the debugger so I could get a stack track. It doesn't seem to work that way. Or I missed a key step, or ... It gets me to the debugger, but the stack is garbage. If a python program is hung, how do I find out where? There is another bug/quirk in ntpq. In interactive mode, it needs a ^C catcher. -- These are my opinions. I hate spam. 
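[Editorial sketch: one common way to find out where a hung Python program is stuck is to install a signal handler that prints the current call stack, then send that signal from another shell with kill. The version below assumes a Unix system and uses only the standard `signal` and `traceback` modules; the function names are hypothetical and this is not part of ntpq. Note that Python runs signal handlers in the main thread between bytecodes, so the dump may be delayed until a blocking C-level call (a resolver lookup, for instance) returns.]

```python
import signal
import sys
import traceback


def format_current_stack(frame):
    """Return the call stack ending at frame as one printable string."""
    return "".join(traceback.format_stack(frame))


def dump_stack(signum, frame):
    """Signal handler: report where the main thread is executing right now."""
    sys.stderr.write("signal %d received, stack:\n" % signum)
    sys.stderr.write(format_current_stack(frame))
    sys.stderr.flush()


def install_stack_dumper(signum=None):
    """Install dump_stack for signum (default SIGUSR1; Unix only)."""
    if signum is None:
        signum = signal.SIGUSR1
    signal.signal(signum, dump_stack)


if __name__ == "__main__":
    install_stack_dumper()
    # From another shell: kill -USR1 <pid of this process>
```

[On Python 3.3 and later, the standard `faulthandler` module offers a similar facility through `faulthandler.register()`, which also works when the interpreter itself is blocked.]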
From gem at rellim.com Tue Nov 8 17:33:35 2016 From: gem at rellim.com (Gary E. Miller) Date: Tue, 8 Nov 2016 09:33:35 -0800 Subject: Fw: [Git][NTPsec/ntpsec][master] Retire ntpq's -O option. It's unclear how to do it right. Message-ID: <20161108093335.5624211c@spidey.rellim.com> Yo Eric! Just look at the old ntpq: dagwood ~ # ntpq -O -c "rv 0 frequency" spidey associd=0 status=0418 leap_none, sync_uhf_radio, 1 event, no_sys_peer, frequency=-0.611 dagwood ~ # ntpq -c "rv 0 frequency" spidey frequency=-0.588 dagwood ~ # RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Begin forwarded message: Date: Tue, 08 Nov 2016 13:20:47 +0000 From: "Eric S. Raymond" To: vc at ntpsec.org Subject: [Git][NTPsec/ntpsec][master] Retire ntpq's -O option. It's unclear how to do it right. Eric S. Raymond pushed to branch master at NTPsec / ntpsec Commits: 80a24b8c by Eric S. Raymond at 2016-11-08T08:19:51-05:00 Retire ntpq's -O option. It's unclear how to do it right. - - - - - 2 changed files: - docs/includes/ntpq-body.txt - ntpq/ntpq Changes: ===================================== docs/includes/ntpq-body.txt ===================================== --- a/docs/includes/ntpq-body.txt +++ b/docs/includes/ntpq-body.txt @@ -74,9 +74,6 @@ attempt to read interactive format commands from the standard input. +-n+, +--numeric+:: Output all host addresses in numeric format rather than converting to the canonical host names. -+-O+, +--old-rv+:: - Print an extra line when reading a single value with rv, - for example +ntpq -O -c "rv 0 frequency"+ +-p+, +--peers+:: Print a list of the peers known to the server as well as a summary of their state. This is equivalent to the +peers+ interactive command. @@ -563,4 +560,6 @@ In older versions, the 'type' variable associated with a reference clock was a numeric driver type index. 
It has been replaced by 'name', a shortname for the driver type. +The -O (--old-rv) option of legacy versions has been retired. + // end ===================================== ntpq/ntpq ===================================== --- a/ntpq/ntpq +++ b/ntpq/ntpq @@ -140,7 +140,6 @@ class Ntpq(cmd.Cmd): self.session = session self.prompt = "ntpq> " self.interactive = False # set to True when we should prompt - self.old_rv = False # use old readvars behavior? #self.auth_keyid = 0 # Keyid used for authentication. #self.auth_keytype = "NID_md5" # MD5 (FIXME: string value is a dummy) #self.auth_hashlen = 16 # MD5 @@ -444,12 +443,6 @@ usage: help [ command ] def __dolist(self, varlist, associd, op, type): "List variables associated with a specified peer." - # if we're asking for specific variables don't include the - # status header line in the output. - if self.old_rv: - quiet = False - else: - quiet = not (not varlist) # nonempty? try: variables = self.session.readvar(associd, varlist, op) except Mode6Exception as e: @@ -471,7 +464,7 @@ usage: help [ command ] return True if not quiet: self.say("associd=%d " % associd) - self.printvars(variables, type, quiet) + self.printvars(variables, type, not (not varlist)) return True # Unexposed helper tables and functions end here @@ -1490,7 +1483,6 @@ USAGE: ntpq [-46dphinOV] [-c str] [-D lvl] [ host ...] 
command peers -n no numeric numeric host addresses - -O no old-rv Always output status line with readvar -V opt version Output version information and exit -w no wide enable wide display of addresses ''' @@ -1501,11 +1493,11 @@ if __name__ == '__main__': try: (options, arguments) = getopt.getopt(sys.argv[1:], - "46c:dD:hinOpVw", + "46c:dD:hinpVw", ["ipv4","ipv6", "command", "debug", "set-debug-level", "help", "interactive", "numeric", - "old-rv", "peers", "version", + "peers", "version", "wide"]) except getopt.GetoptError as e: print(e) @@ -1534,8 +1526,6 @@ if __name__ == '__main__': interpreter.interactive = True elif switch in ("-n", "--numeric"): interpreter.showhostnames = False - elif switch in ("-O", "--old-rv"): - interpreter.old_rv = True elif switch in ("-p", "--peers"): interpreter.ccmds.append("peers") elif switch in ("-V", "--version"): View it on GitLab: https://gitlab.com/NTPsec/ntpsec/commit/80a24b8cbe20fd465a97b0916136905fcec0e436 RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 From esr at thyrsus.com Wed Nov 9 18:38:29 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 9 Nov 2016 13:38:29 -0500 (EST) Subject: Is there some good reason for ntpdig to be doing asynchronous DNS? Message-ID: <20161109183829.3997113A10AD@snark.thyrsus.com> Mark suggested I ask the devel list this. C ntpdig goes to vast efforts to do asynchronous DNS lookup, and I don't get why.
It's not like ntpd - ntpdig doesn't have anything else to do while it's waiting for replies, and being UDP datagrams the requests themselves don't block. Anyone have a good explanation for this? It looks like overengineering to me. -- Eric S. Raymond False is the idea of utility that sacrifices a thousand real advantages for one imaginary or trifling inconvenience; that would take fire from men because it burns, and water because one may drown in it; that has no remedy for evils except destruction. The laws that forbid the carrying of arms are laws of such a nature. They disarm only those who are neither inclined nor determined to commit crimes. -- Cesare Beccaria, as quoted by Thomas Jefferson's Commonplace book From hmurray at megapathdsl.net Wed Nov 9 18:52:54 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 09 Nov 2016 10:52:54 -0800 Subject: Is there some good reason for ntpdig to be doing asynchronous DNS? In-Reply-To: Message from esr@thyrsus.com (Eric S. Raymond) of "Wed, 09 Nov 2016 13:38:29 EST." <20161109183829.3997113A10AD@snark.thyrsus.com> Message-ID: <20161109185254.C4E5440605C@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > Anyone have a good explanation for this? It looks like overengineering to > me. My guess would be historical. It probably made sense when somebody copied some code ages ago. Another possibility is that there is an option to bail after N seconds. If you are using it to set the time during booting, you might not want to wait for DNS to timeout. DNS timeouts on this system are ballpark of 30 seconds. -- These are my opinions. I hate spam. From esr at thyrsus.com Wed Nov 9 19:03:43 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 9 Nov 2016 14:03:43 -0500 Subject: Is there some good reason for ntpdig to be doing asynchronous DNS? 
In-Reply-To: <20161109185254.C4E5440605C@ip-64-139-1-69.sjc.megapath.net> References: <20161109183829.3997113A10AD@snark.thyrsus.com> <20161109185254.C4E5440605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161109190343.GC16957@thyrsus.com> Hal Murray : > Another possibility is that there is an option to bail after N seconds. If > you are using it to set the time during booting, you might not want to wait > for DNS to timeout. DNS timeouts on this system are ballpark of 30 seconds. There's a timeout option, but it seems to apply to the wait for SNTP reply, not the wait for DNS. -- Eric S. Raymond From hmurray at megapathdsl.net Wed Nov 9 19:12:09 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 09 Nov 2016 11:12:09 -0800 Subject: Is there some good reason for ntpdig to be doing asynchronous DNS? In-Reply-To: Message from "Eric S. Raymond" of "Wed, 09 Nov 2016 14:03:43 EST." <20161109190343.GC16957@thyrsus.com> Message-ID: <20161109191209.C147140605C@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > There's a timeout option, but it seems to apply to the wait for SNTP reply, > not the wait for DNS. I was thinking of an overall timeout. Something like: If you can't set the time within N seconds, bail with an error return code and I'll try something else. Lots of people want really fast reboot. (Of course, some of them also expect accurate time.) -- These are my opinions. I hate spam. From esr at thyrsus.com Wed Nov 9 19:28:43 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 9 Nov 2016 14:28:43 -0500 Subject: Is there some good reason for ntpdig to be doing asynchronous DNS? 
In-Reply-To: <20161109191209.C147140605C@ip-64-139-1-69.sjc.megapath.net> References: <20161109190343.GC16957@thyrsus.com> <20161109191209.C147140605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161109192843.GA17430@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > > There's a timeout option, but it seems to apply to the wait for SNTP reply, > > not the wait for DNS. > > I was thinking of an overall timeout. Something like: > If you can't set the time within N seconds, bail with an error return code > and I'll try something else. Nothing like that is documented. -- Eric S. Raymond From hmurray at megapathdsl.net Mon Nov 14 06:40:30 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 13 Nov 2016 22:40:30 -0800 Subject: Ultra wide ntpq-peers printout Message-ID: <20161114064030.5E0F5406074@ip-64-139-1-69.sjc.megapath.net> I just tried it on a system that isn't running X. The screen is over 150 characters wide. The printout is 2 chunks. The name and status character is on the left. The rest of the stuff is way over on the right. It's next to impossible to line things up. I don't have any great suggestions. I think the -n mode should use the old format. -- These are my opinions. I hate spam. From esr at thyrsus.com Mon Nov 14 11:19:12 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 14 Nov 2016 06:19:12 -0500 Subject: Ultra wide ntpq-peers printout In-Reply-To: <20161114064030.5E0F5406074@ip-64-139-1-69.sjc.megapath.net> References: <20161114064030.5E0F5406074@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161114111912.GB22409@thyrsus.com> Hal Murray : > > I just tried it on a system that isn't running X. The screen is over 150 > characters wide. > > The printout is 2 chunks. The name and status character is on the left. The > rest of the stuff is way over on the right. It's next to impossible to line > things up. > > I don't have any great suggestions. I think the -n mode should use the old > format. 
..because -n mode can't use the extra width well, I get it. Done. -- Eric S. Raymond From esr at thyrsus.com Mon Nov 14 12:19:17 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 14 Nov 2016 07:19:17 -0500 (EST) Subject: Please exercise ntpq, I just refactored it Message-ID: <20161114121917.AC9EC13A10DD@snark.thyrsus.com> I just attempted to get rid of all instances of "from import *" from ntpq; these are not really good practice, it is better to use plain imports and fully qualify every imported symbol. If I dropped a stitch anywhere it will manifest as a crash bug, probably "global name undefined". Keep calm and carry on. The fix will be trivial; if you can figure it out MR it yourself, otherwise report the bug and I'll fix it immediately. -- Eric S. Raymond There's a tendency today to absolve individuals of moral responsibility and treat them as victims of social circumstance. You buy that, you pay with your soul. -Tom Robbins, Still Life with Woodpecker From hmurray at megapathdsl.net Mon Nov 14 12:48:42 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 14 Nov 2016 04:48:42 -0800 Subject: Please exercise ntpq, I just refactored it In-Reply-To: Message from esr@thyrsus.com (Eric S. Raymond) of "Mon, 14 Nov 2016 07:19:17 EST." <20161114121917.AC9EC13A10DD@snark.thyrsus.com> Message-ID: <20161114124842.4C287406074@ip-64-139-1-69.sjc.megapath.net> How about setting up a simple script that at least tries all of the commands. Maybe 2 of them. One for the non-privileged commands and another for the commands that need a password. -- These are my opinions. I hate spam. From esr at thyrsus.com Mon Nov 14 15:22:51 2016 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Mon, 14 Nov 2016 10:22:51 -0500 Subject: Please exercise ntpq, I just refactored it In-Reply-To: <20161114124842.4C287406074@ip-64-139-1-69.sjc.megapath.net> References: <20161114121917.AC9EC13A10DD@snark.thyrsus.com> <20161114124842.4C287406074@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161114152251.GA25903@thyrsus.com> Hal Murray : > How about setting up a simple script that at least tries all of the commands. > > Maybe 2 of them. One for the non-privileged commands and another for the > commands that need a password. Hm. All it could detect is crashes. Time-varying output pretty scotches any prospect of real regression testing. -- Eric S. Raymond From hmurray at megapathdsl.net Mon Nov 14 21:46:24 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 14 Nov 2016 13:46:24 -0800 Subject: Please exercise ntpq, I just refactored it In-Reply-To: Message from "Eric S. Raymond" of "Mon, 14 Nov 2016 10:22:51 EST." <20161114152251.GA25903@thyrsus.com> Message-ID: <20161114214624.ED797406074@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: >> How about setting up a simple script that at least tries all of the commands. > Hm. All it could detect is crashes. Time-varying output pretty scotches any > prospect of real regression testing. Yes, but it has a good chance of catching a large class of simple bugs. In hindsight, I'm a bit surprised something like that isn't on your checklist already. We should do it with all the programs we build and again after we install things. If nothing else, it will verify that the libraries are linked correctly. The simple case would print out the version. It might be useful to have (or have waf build) a script that does the post install tests - something an admin could run after a system upgrade. We should have a config file that tests all the options for ntpd. It would need a command line switch so that it doesn't actually try to run anything. How about --check? That might be handy anyway. 
You could use it to check your changes to ntp.conf before restarting ntpd. -- These are my opinions. I hate spam. From fallenpegasus at gmail.com Wed Nov 16 04:10:10 2016 From: fallenpegasus at gmail.com (Mark Atwood) Date: Wed, 16 Nov 2016 04:10:10 +0000 Subject: Of possible use for our Python code: "Hypothesis" Message-ID: https://hypothesis.readthedocs.io/en/latest/ -------------- next part -------------- An HTML attachment was scrubbed... URL: From dfoxfranke at gmail.com Fri Nov 18 22:55:22 2016 From: dfoxfranke at gmail.com (Daniel Franke) Date: Fri, 18 Nov 2016 17:55:22 -0500 Subject: Fw: [ntpwg] [Editorial Errata Reported] RFC7822 (4848) In-Reply-To: <20161101220410.GA29069@thyrsus.com> References: <20161101150022.4dcb03ae@spidey.rellim.com> <20161101220410.GA29069@thyrsus.com> Message-ID: On 11/1/16, Eric S. Raymond wrote: > Gary E. Miller : >> Yo All! >> >> More tea leaves to read from ntpwg... > > I'm getting a there's-drama-going-on-we-can't-see vibe from this. Note that Harlan rescinded this erratum the day after he reported it. No explanation was given. The erratum was illogical considering that no endorsement was implied in the first place. Author affiliations on RFCs are just that -- affiliations -- and nothing more. As a standards track document which cleared the IESG, RFC 7822 is the position *of the IETF*, and not necessarily that of the authors' institutions or even that of the authors. It ceased to be Danny and Tal's document as soon as the NTP WG adopted it. Anyway, that would be my reasoning for rejecting the erratum but I have no idea if it was Harlan's reasoning for rescinding it. Anyway, I don't think there's any hidden interpersonal drama here, just strong technical disagreement. We discussed https://tools.ietf.org/html/draft-stenn-ntp-extension-fields-00, which is Harlan's proposed replacement for RFC 7822, at Tuesday's session. 
I can't make head or tail of that document, and although I kept silent during that segment of the discussion others panned it pretty hard. When Harlan was asked to describe how the document differed from RFC 7822 and what substantive changes he was proposing, he couldn't answer. Also note that although three authors are listed, it seems to be entirely Harlan's work; listing Danny and Tal seems to be aspirational. Tal was the document's loudest critic during the WG session. So, given the kind of reception it got, I don't think this document is going anywhere, and RFC 7822 is still the law of the land. From hmurray at megapathdsl.net Sat Nov 19 07:55:24 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 18 Nov 2016 23:55:24 -0800 Subject: More Python quirks Message-ID: <20161119075524.A22EB406061@ip-64-139-1-69.sjc.megapath.net> Background for Devel... I mostly work with Fedora. Its default Python search path (sys.path) doesn't include /usr/local/lib/python2.7/site-packages where our libraries get installed. Things mostly work if you run export PYTHONPATH=/usr/local/lib/python2.7/site-packages I don't know how to set that up cleanly. If I put it in my .bashrc, it doesn't work for root. Maybe it should go in /etc/profile.d/ Mumble. But that's not the current problem. -------- I'm getting confused again (still?). It may be simple, but not until I know the answer. I was trying to fix the version string so that the one from ntpq would match the one from ntpd. It seemed to ignore my edits. I put a "barf" in the new code to make sure I was getting there. It didn't go off. I think the problem was that it finds the compiled stuff off in $build/main/pylib, but can't find the source to do a check, so it runs the old compiled versions. That was because I didn't run waf build 'cause I didn't think I needed it. I expected python to automagically rebuild the compiled versions. Will that get fixed if you add a symlink back to the source? After waf build, I got this.
[murray at glypnod ntpq]$ ./ntpq --version Traceback (most recent call last): File "./ntpq", line 34, in version = ntp.util.stdversion() File "/usr/local/lib/python2.7/site-packages/ntp/util.py", line 16, in stdversion def stdversion(): NameError: global name 'barf' is not defined [murray at glypnod ntpq]$ Note that the traceback printout is finding the installed source rather than the local version with the "barf" in it. -- These are my opinions. I hate spam. From ghane0 at gmail.com Sat Nov 19 16:49:32 2016 From: ghane0 at gmail.com (Sanjeev Gupta) Date: Sun, 20 Nov 2016 00:49:32 +0800 Subject: Finding abusive NTP clients In-Reply-To: <20160415110557.2342C406057@ip-64-139-1-69.sjc.megapath.net> References: <20160415110557.2342C406057@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Fri, Apr 15, 2016 at 7:05 PM, Hal Murray wrote: > > I just pushed a tweak to ntpq's mrulist command to provide more info if > the > average > interval between requests is tiny. Anybody running a pool server might > like > to try it out. > > It now looks like this: > > ntpq> hostnames no > ntpq> mru mincount=1000 sort=avgint > Ctrl-C will stop MRU retrieval and display partial results. > Retrieved 239 unique MRU entries and 0 updates. > lstint avgint rstr r m v count rport remote address Hal, the 'mru' command no longer works. Was this removed intentionally? -- Sanjeev Gupta +65 98551208 http://www.linkedin.com/in/ghane -------------- next part -------------- An HTML attachment was scrubbed... URL: From hmurray at megapathdsl.net Sat Nov 19 19:51:42 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 19 Nov 2016 11:51:42 -0800 Subject: Finding abusive NTP clients In-Reply-To: Message from Sanjeev Gupta of "Sun, 20 Nov 2016 00:49:32 +0800." Message-ID: <20161119195142.CE817406063@ip-64-139-1-69.sjc.megapath.net> > Hal, the 'mru' command no longer works. Was this removed intentionally? It's probably blocked by some restrictions (to avoid DDoS). 
Another possibility is that your fingers typed the old name for a similar command. I forget what it was called. The (new) mrulist command requires a cookie in the request packet so it doesn't work as a DDoS amplifier. -- These are my opinions. I hate spam. From gem at rellim.com Sun Nov 20 06:05:09 2016 From: gem at rellim.com (Gary E. Miller) Date: Sat, 19 Nov 2016 22:05:09 -0800 Subject: More Python quirks In-Reply-To: <20161119075524.A22EB406061@ip-64-139-1-69.sjc.megapath.net> References: <20161119075524.A22EB406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161119220509.7ebdd56f@spidey.rellim.com> Yo Hal! On Fri, 18 Nov 2016 23:55:24 -0800 Hal Murray wrote: > Things mostly work if you run > export PYTHONPATH=/usr/local/lib/python2.7/site-packages Ditto for Gentoo. > I don't know how to set that up cleanly. If I put it in my .bashrc, it > doesn't work for root. Maybe it should go in /etc/profile.d/ Since ntpviz needs it, and ntpviz is a daemon, then /etc/profile.d seems better. Gentoo should put it in /etc/env.d. > I think the problem was that it finds the compiled stuff off in > $build/main/pylib, but can't find the source to do a check, so it > runs the old compiled versions. That was because I didn't run waf > build 'cause I didn't think I needed it. Well, you do. There is your problem. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588
From hmurray at megapathdsl.net Mon Nov 21 11:18:07 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 21 Nov 2016 03:18:07 -0800 Subject: Everything in libntp_source_sharable is getting compiled twice Message-ID: <20161121111807.4AA7E406062@ip-64-139-1-69.sjc.megapath.net> [ 37/211] Compiling libntp/clockwork.c [ 38/211] Compiling libntp/emalloc.c [ 39/211] Compiling libntp/hextolfp.c [ 40/211] Compiling libntp/humandate.c [ 41/211] Compiling libntp/lib_strbuf.c [ 42/211] Compiling libntp/msyslog.c [ 43/211] Compiling libntp/ntp_calendar.c [ 44/211] Compiling libntp/prettydate.c [ 45/211] Compiling libntp/statestr.c [ 46/211] Compiling libntp/strl_obsd.c [ 47/211] Compiling libntp/systime.c [ 48/211] Compiling libntp/timetoa.c [ 50/211] Compiling libntp/clockwork.c [ 51/211] Compiling libntp/emalloc.c [ 52/211] Compiling libntp/hextolfp.c [ 53/211] Compiling libntp/humandate.c [ 54/211] Compiling libntp/lib_strbuf.c [ 55/211] Compiling libntp/msyslog.c [ 56/211] Compiling libntp/ntp_calendar.c [ 57/211] Compiling libntp/prettydate.c [ 58/211] Compiling libntp/statestr.c [ 59/211] Compiling libntp/strl_obsd.c [ 60/211] Compiling libntp/systime.c [ 61/211] Compiling libntp/timetoa.c -- These are my opinions. I hate spam. From ghane0 at gmail.com Mon Nov 21 12:07:23 2016 From: ghane0 at gmail.com (Sanjeev Gupta) Date: Mon, 21 Nov 2016 20:07:23 +0800 Subject: Merge requests Message-ID: Hi, I have three merge requests in the queue: 1. SUSE. Update INSTALL to document packages for SuSE Linux Enterprise Server 12, SP1. This has been reviewed by Matt S. 2. DOCS. Fix broken links (including internal links) 3. HTTPS.
For external links (and even to ntpsec.org), check if the target is available over https, and if so, change the docs Thanks -- Sanjeev Gupta +65 98551208 http://www.linkedin.com/in/ghane -------------- next part -------------- An HTML attachment was scrubbed... URL: From hmurray at megapathdsl.net Mon Nov 21 12:40:03 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 21 Nov 2016 04:40:03 -0800 Subject: What does --disable-kernel-PLL do? Message-ID: <20161121124003.C8DA7406063@ip-64-139-1-69.sjc.megapath.net> I was expecting it to be a no-op unless you turned on one of the magic options in a refclock that handed PPS processing off to the kernel. I think it's doing more than that. The symptom is that ntptime says: ntp_gettime() returns code 5 (ERROR) time dbdd51ae.aa8bc000 2016-11-21T02:57:50.666, (.666195), maximum error 16000000 us, estimated error 16000000 us, TAI offset 0 ntp_adjtime() returns code 5 (ERROR) modes 0x0 (), offset 0.000 us, frequency 0.000 ppm, interval 1 s, maximum error 16000000 us, estimated error 16000000 us, status 0x40 (UNSYNC), time constant 2, precision 1.000 us, tolerance 500 ppm, The long story is that I had a setup that included --disable-kernel-PLL. (I don't remember why I added that.) I noticed that my daemon that records things like CPU and disk temperature and NTP's drift was getting 0 for the drift because that's what ntp_adjtime was returning. I eventually tracked it back to the --disable-kernel-PLL configuration option. ------ There is a tangle in this area that I don't understand. When ntpd exits (or crashes), it leaves the previous state in the kernel so anybody running ntptime will think things are fine. I think there is a minor bug in libntp/clockwork:ntp_adjtime_ns The first time through when it tests for NANO/MICRO, it will get a wrong answer. See the sample above. 
Here is the printout from a debugging hack: 21 Nov 03:30:13 ntpd[580]: ntp_adjtime: rc=5, modes: 0, status: 40 21 Nov 03:41:10 ntpd[1119]: ntp_adjtime: rc=0, modes: 0, status: 2001 The first line is from a reboot. The second line is from restarting ntpd after it had been running. The code says: nanoseconds = (STA_NANO & ztx.status) != 0; so the first run will work in MICRO mode and divide the time correction by 1000. But there is more code following that recomputes nanoseconds after each call so it will get fixed before long. It's probably a bug in the kernel that it doesn't return the MICRO/NANO flag with the UNSYNC flag. -- These are my opinions. I hate spam. From esr at thyrsus.com Mon Nov 21 16:51:22 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Mon, 21 Nov 2016 11:51:22 -0500 Subject: Merge requests In-Reply-To: References: Message-ID: <20161121165122.GA12810@thyrsus.com> Sanjeev Gupta : > Hi, > > I have three merge requests in the queue: > > > 1. SUSE. Update INSTALL to document packages for SuSE Linux Enterprise > Server 12, SP1. This has been reviewed by Matt S. > 2. DOCS. Fix broken links (including internal links) > 3. HTTPS. For external links (and even to ntpsec.org), check if the > target is available over https, and if so, change the docs > > Thanks > > -- > Sanjeev Gupta > +65 98551208 http://www.linkedin.com/in/ghane I should get to those today. -- Eric S. Raymond From esr at thyrsus.com Mon Nov 21 21:58:04 2016 From: esr at thyrsus.com (Eric S.
Raymond) Date: Mon, 21 Nov 2016 16:58:04 -0500 Subject: Everything in libntp_source_sharable is getting compiled twice In-Reply-To: <20161121111807.4AA7E406062@ip-64-139-1-69.sjc.megapath.net> References: <20161121111807.4AA7E406062@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161121215804.GA23797@thyrsus.com> Hal Murray : > [ 37/211] Compiling libntp/clockwork.c > [ 38/211] Compiling libntp/emalloc.c > [ 39/211] Compiling libntp/hextolfp.c > [ 40/211] Compiling libntp/humandate.c > [ 41/211] Compiling libntp/lib_strbuf.c > [ 42/211] Compiling libntp/msyslog.c > [ 43/211] Compiling libntp/ntp_calendar.c > [ 44/211] Compiling libntp/prettydate.c > [ 45/211] Compiling libntp/statestr.c > [ 46/211] Compiling libntp/strl_obsd.c > [ 47/211] Compiling libntp/systime.c > [ 48/211] Compiling libntp/timetoa.c > > [ 50/211] Compiling libntp/clockwork.c > [ 51/211] Compiling libntp/emalloc.c > [ 52/211] Compiling libntp/hextolfp.c > [ 53/211] Compiling libntp/humandate.c > [ 54/211] Compiling libntp/lib_strbuf.c > [ 55/211] Compiling libntp/msyslog.c > [ 56/211] Compiling libntp/ntp_calendar.c > [ 57/211] Compiling libntp/prettydate.c > [ 58/211] Compiling libntp/statestr.c > [ 59/211] Compiling libntp/strl_obsd.c > [ 60/211] Compiling libntp/systime.c > [ 61/211] Compiling libntp/timetoa.c That's expected. It's a quirk of waf's; they're getting built for two different deliverables, and could have different compilation options. -- Eric S. Raymond From royce at tycho.org Mon Nov 21 23:11:12 2016 From: royce at tycho.org (Royce Williams) Date: Mon, 21 Nov 2016 14:11:12 -0900 Subject: fuzzing NTPsec with afl Message-ID: This can obviously wait until after the current CVE scramble dies down. Below is how Stubman modified ntpd to be afl-friendly. I'm not sure, but I think he modified ntpd to accept UDP "input" from stdin, and created valid initial NTP UDP "packets" as test-case data with which to "seed" afl.
Until lcamtuf brings the network-aware fork of afl into the main tree, something similar to this approach is probably the most forward-compatible one. The other alternative is to use Birdwell's network-aware fork[1], but it has fallen behind the main afl tree. Taking the stdin approach obviously won't exercise any skipped network-specific code paths. It may take some ingenuity to identify the minimum change that keeps the maximum amount of important code exercised. If those minimal changes are turned into a compile-time option, this would enable adding fuzzing to the rolling test suite, perhaps using some of Susan's resources. Such an option would also increase the number of people who could quickly start fuzzing ntpsec. This latter may be a bug or a feature, depending on your perspective. :) Royce (tychotithonus on IRC) 1. https://github.com/jdbirdwell/afl ---------- Forwarded message ---------- From: Magnus Stubman Date: Mon, Nov 21, 2016 at 11:32 AM Subject: [afl-users] CVE-2016-7434 found with AFL. To: afl-users at googlegroups.com Hi guys, I found CVE-2016-7434, a remote pre-auth DoS in the latest version of ntpd, using afl-fuzz by modifying ntpd to accept input from stdin, and then sending it to itself over UDP. Full writeup: http://dumpco.re/cve-2016-7434/ Relevant sample code of my instrumentation: http://dumpco.re/afl/#43 As can be seen in the asciinema below, I'm fuzzing at over 11k executions per second on a single core with ASAN. Therefore, I believe that rewriting targets to accept testcases from stdin is superior compared to using forks of afl which send packets over the network and employ timeouts to estimate if the target is done processing the testcase. https://asciinema.org/a/1npswngnfah6m4m0et246e0lr Michael, thanks for sharing your awesome tool. Magnus.
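The seed test cases mentioned above can be generated with a few lines of Python. This is a rough sketch under stated assumptions, not a description of Stubman's actual corpus: the function names, the directory, and the file name are all hypothetical. It writes a bare 48-byte NTP client request (leap indicator 0, version 4, mode 3, every other field zero); a stdin harness aimed at the control-message code paths would presumably want mode 6 seeds as well.

```python
import os
import struct

NTP_HEADER_LEN = 48   # size of an NTP packet with no extension fields or MAC


def make_client_request(version=4):
    """Build a minimal client (mode 3) request with every other field zeroed."""
    leap, mode = 0, 3
    # First octet: 2 bits leap indicator, 3 bits version, 3 bits mode.
    first_byte = (leap << 6) | (version << 3) | mode
    return struct.pack("!B", first_byte) + b"\x00" * (NTP_HEADER_LEN - 1)


def write_seed(directory, name, packet):
    """Write one seed file and return its path (names are placeholders)."""
    if not os.path.isdir(directory):
        os.makedirs(directory)
    path = os.path.join(directory, name)
    with open(path, "wb") as handle:
        handle.write(packet)
    return path
```

For example, write_seed("seeds", "client-mode3.bin", make_client_request()) would leave one file in a seeds directory, which afl-fuzz can then be pointed at through its -i option.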
From kurt at roeckx.be Mon Nov 21 23:18:00 2016 From: kurt at roeckx.be (Kurt Roeckx) Date: Tue, 22 Nov 2016 00:18:00 +0100 Subject: fuzzing NTPsec with afl In-Reply-To: References: Message-ID: <20161121231800.qdrjvuu75tlhy4nw@roeckx.be> On Mon, Nov 21, 2016 at 02:11:12PM -0900, Royce Williams wrote: > > If those minimal changes are turned into a compile-time option, this > would enable adding fuzzing to the rolling test suite, perhaps using > some of Susan's resources. Google also provides resources via oss-fuzz. If you can read from stdin, it should also be easy to fuzz with other fuzzers like libfuzzer. Kurt From royce at tycho.org Mon Nov 21 23:34:58 2016 From: royce at tycho.org (Royce Williams) Date: Mon, 21 Nov 2016 14:34:58 -0900 Subject: fuzzing NTPsec with afl In-Reply-To: <20161121231800.qdrjvuu75tlhy4nw@roeckx.be> References: <20161121231800.qdrjvuu75tlhy4nw@roeckx.be> Message-ID: On Mon, Nov 21, 2016 at 2:18 PM, Kurt Roeckx wrote: > On Mon, Nov 21, 2016 at 02:11:12PM -0900, Royce Williams wrote: >> >> If those minimal changes are turned into a compile-time option, this >> would enable adding fuzzing to the rolling test suite, perhaps using >> some of Susan's resources. > > Google also provides resources via oss-fuzz. If you can read from > stdin, it should also be easy to fuzz with other fuzzers like > libfuzzer. Indeed. And my understanding is that stdin is often much faster than equivalent network-level testing, which translates to a lot more coverage per wall-clock hour (which is important for this kind of fuzzing). Ideally, we could enable some kind of basic coverage for both methods -- stdin and network-based. This would more closely model the actual threat landscape and attackers' capabilities. But between the two, stdin would be the best bang for the buck. Royce From esr at thyrsus.com Mon Nov 21 23:55:55 2016 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Mon, 21 Nov 2016 18:55:55 -0500 (EST) Subject: Thinking forward Message-ID: <20161121235555.15DD913A10E5@snark.thyrsus.com> This is a lightly-edited version of a briefing I just gave Hal Murray off-list. Some of it's been discussed on the Signal channel. Everybody else should know what's going on, too. Mark and I have been thinking strategically about the medium and long-term future of this project. The era during which we could make major gains by code removal is pretty clearly drawing to a close. Going forward from 1.0 we are likely to have to proceed by adding code more than subtracting it. But we have about concluded that it's not really a good path forward to add a lot of complexity to the C for either performance or other reasons. Instead, we are now seriously entertaining the idea of stripping the C codebase down to the bare minimum that will still work, isolating the platform dependencies - and then moving the whole codebase to a language with better correctness guarantees and better concurrency support. Of course, the major point of the move would be to get to a place where buffer overruns and wild-pointer bugs are impossible. But another effect would be to get us the use of concurrency primitives that are much easier and safer to use. We could use these, in particular, to replace the rather alarming kludge that is the current asynch-DNS lookup code. This wouldn't have been practical starting from 227KLOC of grubby, #ifdef-encrusted C. But we're now down to 66KLOC of much cleaner C and likely to drop a few KLOC more (in particular, from moving ntpdig to Python). Moving to another language, even if we had to do it by hand-translation, is probably within the limit of practicality now. And we probably wouldn't have to do it by hand. The two candidate languages we're considering, Go and Rust, have mechanical C translators. The Rust one, called "corrode", is rumored to be production-quality.
The Go one was written to translate the Go compiler from C, and is advertised to only translate C written in a restricted style. Which seems to mean excluding unions and some kinds of gotos that are a bad idea anyway. Which ties directly into the reason I've been pretty silent for the last week. I've been learning Go - writing a replacement for David A. Wheeler's sloccount utility. It has been quite the experience, and has left me with a good feeling about the feasibility of moving our codebase to Go. Mark is encouraging this research, though he quite rightly wants me to evaluate Rust just as thoroughly before we make any major decisions. -- Eric S. Raymond You know why there's a Second Amendment? In case the government fails to follow the first one. -- Rush Limbaugh, in a moment of unaccustomed profundity 17 Aug 1993 From hmurray at megapathdsl.net Mon Nov 21 23:56:44 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 21 Nov 2016 15:56:44 -0800 Subject: ntpq crashes if stdout pipe closed Message-ID: <20161121235645.00B6C406063@ip-64-139-1-69.sjc.megapath.net> [murray at glypnod play]$ ntpq -p | head -3 remote refid st t when poll reach delay offset jitter ============================================================================== -shuksan .PPS. 1 u 15 64 1 0.106 0.017 0.002 Traceback (most recent call last): File "/usr/local/bin/ntpq", line 1590, in interpreter.onecmd(cmd) File "/usr/lib/python2.7/cmd.py", line 221, in onecmd return func(arg) File "/usr/local/bin/ntpq", line 1123, in do_peers self.__dopeers(showall=False, mode="peers") File "/usr/local/bin/ntpq", line 334, in __dopeers variables, peer.associd)) File "/usr/local/bin/ntpq", line 173, in say sys.stdout.flush() # In case we're piping the output IOError: [Errno 32] Broken pipe [murray at glypnod play]$ -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Nov 22 00:10:11 2016 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Mon, 21 Nov 2016 19:10:11 -0500 Subject: Please exercise ntpq, I just refactored it In-Reply-To: <20161114214624.ED797406074@ip-64-139-1-69.sjc.megapath.net> References: <20161114152251.GA25903@thyrsus.com> <20161114214624.ED797406074@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161122001011.GA26010@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > >> How about setting up a simple script that at least tries all of the > commands. > > Hm. All it could detect is crashes. Time-varying output pretty scotches any > > prospect of real regression testing. > > Yes, but it has a good chance of catching a large class of simple bugs. True. > In hindsight, I'm a bit surprised something like that isn't on your checklist > already. I've had a lot of other things to think about. Like this morning's CVE burst. > We should do it with all the programs we build and again after we > install things. If nothing else, it will verify that the libraries are > linked correctly. The simple case would print out the version. > > It might be useful to have (or have waf build) a script that does the post > install tests - something an admin could run after a system upgrade. > > We should have a config file that tests all the options for ntpd. It would > need a command line switch so that it doesn't actually try to run anything. > How about --check? That might be handy anyway. You could use it to check > your changes to ntp.conf before restarting ntpd. I'm not clear on how we'd tell good behavior from bad in that particular context. It's easy to tell when ntpq crashes, but how do we know when ntpd is processing an option wrongly? Is this something you'd be willing to work on? You seem to have a clearer vision of what the tests ought to be like, and I still have pre-1.0 work to do getting ntpdig moved and writing ntpmon. -- Eric S. 
Raymond From dtpoirot at gmail.com Tue Nov 22 02:55:04 2016 From: dtpoirot at gmail.com (Dan Poirot) Date: Mon, 21 Nov 2016 20:55:04 -0600 Subject: fuzzing NTPsec with afl In-Reply-To: References: <20161121231800.qdrjvuu75tlhy4nw@roeckx.be> Message-ID: <000701d2446b$d47c6fc0$7d754f40$@gmail.com> I have been using Synopsys Defensics with over 250,000 discrete 'generational' NTP Server and NTPv2, NTPv3, and NTPv4 Server Control tests on NTPsec for some time now. (NTPsec doesn't answer back as v0 or v1) Hacking the Network Time Protocol to take the 'network' out is clever! UDP, IP and Ethernet aren't changing in this domain. Fuzzing the Linux TCP/IP stack is a waste of time. - Dan Poirot, CISSP -----Original Message----- From: devel [mailto:devel-bounces at ntpsec.org] On Behalf Of Royce Williams Sent: Monday, November 21, 2016 5:35 PM To: Kurt Roeckx Cc: devel at ntpsec.org Subject: Re: fuzzing NTPsec with afl On Mon, Nov 21, 2016 at 2:18 PM, Kurt Roeckx wrote: > On Mon, Nov 21, 2016 at 02:11:12PM -0900, Royce Williams wrote: >> >> If those minimal changes are turned into a compile-time option, this >> would enable adding fuzzing to the rolling test suite, perhaps using >> some of Susan's resources. > > Google also provides resources via oss-fuzz. If you can read from > stdin, it should also be easy to fuzz with other fuzzers like > libfuzzer. Indeed. And my understanding is that stdin is often much faster than equivalent network-level testing, which translates to a lot more coverage per wall-clock hour (which is important for this kind of fuzzing). Ideally, we could enable some kind of basic coverage for both methods -- stdin and network-based. This would more closely model the actual threat landscape and attackers' capabilities. But between the two, stdin would be the best bang for the buck. 
Royce From dfoxfranke at gmail.com Tue Nov 22 23:22:04 2016 From: dfoxfranke at gmail.com (Daniel Franke) Date: Tue, 22 Nov 2016 18:22:04 -0500 Subject: Update on the latest batch of CVEs Message-ID: NTP Classic announced 10 new CVEs yesterday. Of them, six have no impact on NTPsec: CVE-2016-9311: Trap crash CVE-2016-9310: Mode 6 unauthenticated trap information disclosure and DDoS vector CVE-2016-7427: Broadcast Mode Replay Prevention DoS CVE-2016-7428: Broadcast Mode Poll Interval Enforcement DoS CVE-2016-9312: Windows: ntpd DoS by oversized UDP packet CVE-2016-7431: Regression: 010-origin: Zero Origin Timestamp Bypass One we independently found and fixed in 0.9.4 but it impacts 0.9.0 through 0.9.3: CVE-2016-7433: Reboot sync calculation problem Note that we didn't treat this one as a security issue at the time. In retrospect, we probably should have. Low severity, but a vulnerability nonetheless. One is bogus: CVE-2016-7426: Client rate limiting and server responses The behavior described in this advisory reflects rate-limiting working as designed, and the resulting potential for denial of service is a well-understood consequence that I've been harping about for years. I may add support for a configuration option to exempt mode 4 packets from rate-limiting, but I'm not going to treat this as an urgent security issue. Finally, two do impact NTPsec: CVE-2016-7434: Null pointer dereference in _IO_str_init_static_internal() CVE-2016-7429: Interface selection attack I've ported the patches for these issues from NTP Classic and pushed them to HEAD. Of these issues, only the first is worth worrying about: processing certain malformed mode 6 (i.e., ntpq) packets can trigger a null pointer dereference in ntpd, resulting in a crash.
Use of 'restrict noquery' directives is sufficient to prevent the vulnerable code from executing, so if your system is configured to only allow ntpq queries from localhost then this is not remotely exploitable. CVE-2016-7429 is another DoS vulnerability, but in order for it to be exploitable you have to have disabled RP filtering in your kernel. Furthermore, the attacker needs to be positioned on a network interface different from the one you use to access your time servers. So, e.g., if you're running ntpd on your home router and have RP filtering turned off, then an adversary on the internet can prevent you from syncing with time servers on your LAN, and an adversary on your LAN can prevent you from syncing with time servers on the internet. I'm not quite ready for us to tag a release yet. I still need to update the NEWS file, and more importantly I need to finish up some testing, cleanup, and documentation updates left over from my protocol refactor. I'll get this done ASAP, hopefully by tomorrow. From hmurray at megapathdsl.net Thu Nov 24 01:23:20 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 23 Nov 2016 17:23:20 -0800 Subject: ntpq broken on NetBSD 6.1.5 Message-ID: <20161124012320.31235406061@ip-64-139-1-69.sjc.megapath.net> It works on 7.0.2 NetBSD 6.1.5 (GENERIC) -bash-4.3$ ntpq -p ntpq: can't find Python NTP library -- check PYTHONPATH. Shared object "libpython2.7.so.1.0" not found -bash-4.3$ It's not a PYTHONPATH problem.
That's coming from ntpc.so -bash-4.3$ ldd /usr/local/lib/python2.7/site-packages/ntp/ntpc.so /usr/local/lib/python2.7/site-packages/ntp/ntpc.so: -lpython2.7.1.0 => not found -lutil.7 => /usr/lib/libutil.so.7 -lgcc_s.1 => /lib/libgcc_s.so.1 -lc.12 => /usr/lib/libc.so.12 -lm.0 => /usr/lib/libm.so.0 -lrt.1 => /usr/lib/librt.so.1 -lpthread.1 => /usr/lib/libpthread.so.1 -bash-4.3$ The pre-install version is the same and has the same problem: -bash-4.3$ cd ntpq -bash-4.3$ ntpq ntpq: can't find Python NTP library -- check PYTHONPATH. Shared object "libpython2.7.so.1.0" not found -bash-4.3$ No complaints from the linker: [ 62/162] Compiling libntp/timetoa.c [ 63/162] Linking bob/main/libntp/ntpc.so [ 64/162] Compiling libsodium/sodium/core.c I poked around a bit but didn't get far enough to figure out why that library is needed or find the waf code that puts in the -lpython2.7 -- These are my opinions. I hate spam. From esr at thyrsus.com Thu Nov 24 12:31:17 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 24 Nov 2016 07:31:17 -0500 (EST) Subject: Heads up - directory reorganization coming Message-ID: <20161124123117.6195613A10F3@snark.thyrsus.com> Just so nobody is surprised...once the Python port of ntpdig lands I'm planning to move ntpkeygen to Python, then change the shape of the source tree. It made sense when we had more multiple-file C subprojects lying around, but not so much now. The Pythonized tools (actually, anything that doesn't have its own wscript, so ntpleapfetch as well) will all move to a common directory, probably named "tools". The directories I expect to get merged are ntpdig, ntpkeygen, ntpleapfetch, ntpq, ntpstats, ntpsweep, ntpwait, and ntptrace. The goal here is to make a less branchy tree so the overall organization is more easily taken in at a glance. I think it's still good to avoid mixing C files from different subprojects in the same directory, and I'll stick to that for ntpd/ntptime/ntpfrob.
But I think less branchiness will now lower friction just slightly more than perfect one-directory-per-tool consistency - otherwise I'd take us the other direction and do ntpstats -> ntpviz. If anyone thinks they have a better idea, please speak up. -- Eric S. Raymond Everything that is really great and inspiring is created by the individual who can labor in freedom. -- Albert Einstein, in H. Eves Return to Mathematical Circles, Boston: Prindle, Weber and Schmidt, 1988. From frank at nicholasfamilycentral.com Thu Nov 24 13:54:44 2016 From: frank at nicholasfamilycentral.com (Frank Nicholas) Date: Thu, 24 Nov 2016 08:54:44 -0500 Subject: SUSE Linux Enterprise Server 64-bit for Raspberry Pi 3 Message-ID: <9565E87A-055C-4278-ADA6-333109C2C85C@nicholasfamilycentral.com> A supported 64-bit Linux has officially arrived on the Raspberry Pi 3. Because it is SUSE Linux Enterprise Server, I expect it to be more stable (fewer changes) than Raspbian (if SLES/SLED is any indication). SLES is offered at no cost with a free subscription for a year of updates. SUSE forum posts indicate after the year is up, that another free year of updates can be applied for, instead of paying for a subscription. SLES does include Systemd, and I'm not about to try and rip it out (as much as I dislike it). I'll be using SLES as distributed. I dedicated one of my Pi 3s to SLES, and am working on the toolchain to build GPSd & NTPsec. I have a less expensive ($24.95 USD vs. $39.95 USD) version of the GPS chip that is on the Adafruit Ultimate GPS (MT3339). The Microstack GPS module can be used just like the Adafruit Ultimate GPS breakout board, without the Microstack base board. I also found some more documentation for another package of the actual GPS chip (Quectel L80 - linked below). I'll post back after I get GPSd & NTPSec to build.
SLES on Raspberry Pi documentation links & announcements: https://www.raspberrypi.org/blog/suse-linux-enterprise-server-for-raspberry-pi/ https://www.suse.com/communities/blog/suse-linux-enterprise-server-raspberry-pi/ https://www.suse.com/documentation/suse-best-practices/sles-rpi-quick/data/sles-rpi-quick.html https://www.suse.com/documentation/sles-12/ Microstack GPS (same MT3339 chip as Adafruit Ultimate GPS, but less expensive - $24.99): http://www.mcmelectronics.com/product/MICROSTACK-2434228-/83-16406 http://www.microstack.org.uk/products/microstack-gps/ http://www.newark.com/microstack/microstack-gps/add-on-board-l80-gps-raspberry/dp/78X5851 http://www.microstack.org.uk/assets/gps/FormattedGPSgettingstarted.pdf http://www.quectel.com/UploadImage/Downlad/Quectel_L80_GPS_Presentation_V1.1.pdf From esr at thyrsus.com Thu Nov 24 15:04:28 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 24 Nov 2016 10:04:28 -0500 (EST) Subject: Request for code & logic review Message-ID: <20161124150428.9A70313A10F4@snark.thyrsus.com> The Python port of ntpdig is almost ready to land. But there is one last little bit of it I'm not sure I understand correctly. I'm requesting review of my code and assumptions. Presently the adjustment and synch distance are calculated this way: def delta(self): return self.root_delay def epsilon(self): return self.root_dispersion def synchd(self): "Synchronization distance, estimates worst-case error in seconds" # This is "lambda" in NTP-speak, but that's a Python keyword return abs(self.delta() - self.epsilon()) def adjust(self): "Adjustment implied by this packet." return self.received - self.transmit_timestamp where self.received is the receipt time of the response packet. I do it this way for no better reason than that some examples of Python SNTP clients I found on the web use this formula. But the C ntpdig code suggests this is wrong.
Here's the C for the offset/synch-distance calculation: -------------------------------------------------------------------------------- /* Convert timestamps from network to host byte order */ p_rdly = NTOHS_FP(rpkt->rootdelay); p_rdsp = NTOHS_FP(rpkt->rootdisp); NTOHL_FP(&rpkt->reftime, &p_ref); NTOHL_FP(&rpkt->org, &p_org); NTOHL_FP(&rpkt->rec, &p_rec); NTOHL_FP(&rpkt->xmt, &p_xmt); *precision = LOGTOD(rpkt->precision); TRACE(3, ("offset_calculation: LOGTOD(rpkt->precision): %f\n", *precision)); /* Compute offset etc. */ tmp = p_rec; L_SUB(&tmp, &p_org); LFPTOD(&tmp, t21); TVTOTS(tv_dst, &dst); dst.l_ui += JAN_1970; tmp = p_xmt; L_SUB(&tmp, &dst); LFPTOD(&tmp, t34); *offset = (t21 + t34) / 2.; delta = t21 - t34; // synch_distance is: // (peer->delay + peer->rootdelay) / 2 + peer->disp // + peer->rootdisp + clock_phi * (current_time - peer->update) // + peer->jitter; // // and peer->delay = fabs(peer->offset - p_offset) * 2; // and peer->offset needs history, so we're left with // p_offset = (t21 + t34) / 2.; // peer->disp = 0; (we have no history to augment this) // clock_phi = 15e-6; // peer->jitter = LOGTOD(sys_precision); (we have no history to augment this) // and ntp_proto.c:set_sys_tick_precision() should get us sys_precision. // // so our answer seems to be: // // (fabs(t21 + t34) + peer->rootdelay) / 3. // + 0 (peer->disp) // + peer->rootdisp // + 15e-6 (clock_phi) // + LOGTOD(sys_precision) INSIST( FPTOD(p_rdly) >= 0. ); #if 1 *synch_distance = (fabs(t21 + t34) + FPTOD(p_rdly)) / 3. + 0. + FPTOD(p_rdsp) + 15e-6 + 0. /* LOGTOD(sys_precision) when we can get it */ ; INSIST( *synch_distance >= 0. ); #else *synch_distance = (FPTOD(p_rdly) + FPTOD(p_rdsp))/2.0; #endif -------------------------------------------------------------------------------- That is rather nasty and in my opinion sufficient reason to shoot the C version through the head. Now I'm going to simplify it into pseudocode, ignoring issues about endianness and timestamp epochs. 
-------------------------------------------------------------------------------- precision = log2(rpkt->precision); /* Compute offset etc. */ t21 = rpkt->rec - rpkt->org t34 = now - rpkt->xmt offset = (t21 + t34) / 2.; #if 1 synch_distance = (fabs(t21 + t34) + rpkt->rdly) / 3. + 0. + rpkt->rdsp + 15e-6 + 0. /* LOGTOD(sys_precision) when we can get it */ ; #else synch_distance = (pkt->rdly + pkt->p_rdsp)/2.0; #endif -------------------------------------------------------------------------------- I've omitted the delta calculation because it's only used in a diagnostic; offset is what is applied to the local clock. Note that the synch distance formula that is conditioned out is the one I'm using. Somebody must have thought it was correct at one time. But if I'm to believe the C above, the Python should read thus: def synchd(self): "Synchronization distance, estimates worst-case error in seconds" (self.receive_timestamp - self.origin_timestamp) \ + (now - self.transmit_timestamp) + self.rootdelay) / 3 \ + self.root_dispersion + 15e-6 def adjust(self): "Adjustment implied by this packet." return ((self.receive_timestamp - self.origin_timestamp) + (now - self.transmit_timestamp)) / 2 My requests to Daniel and anyone else who might understand these formulas: (1) Check my logic and C translation. A bug here would be bad. (2) What is the authority for these formulas? What RFC chapter and verse can we cite? -- Eric S. Raymond "Among the many misdeeds of British rule in India, history will look upon the Act depriving a whole nation of arms as the blackest." -- Mohandas Gandhi, An Autobiography, pg 446 From esr at thyrsus.com Thu Nov 24 22:59:52 2016 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Thu, 24 Nov 2016 17:59:52 -0500 Subject: ntpq broken on NetBSD 6.1.5 In-Reply-To: <20161124012320.31235406061@ip-64-139-1-69.sjc.megapath.net> References: <20161124012320.31235406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161124225952.GB15746@thyrsus.com> Hal Murray : > It works on 7.0.2 > > > NetBSD 6.1.5 (GENERIC) > > -bash-4.3$ ntpq -p > ntpq: can't find Python NTP library -- check PYTHONPATH. > Shared object "libpython2.7.so.1.0" not found > -bash-4.3$ > > It's not a PYTHONPATH problem. That's coming from ntpc.so > > -bash-4.3$ ldd /usr/local/lib/python2.7/site-packages/ntp/ntpc.so > /usr/local/lib/python2.7/site-packages/ntp/ntpc.so: > -lpython2.7.1.0 => not found > -lutil.7 => /usr/lib/libutil.so.7 > -lgcc_s.1 => /lib/libgcc_s.so.1 > -lc.12 => /usr/lib/libc.so.12 > -lm.0 => /usr/lib/libm.so.0 > -lrt.1 => /usr/lib/librt.so.1 > -lpthread.1 => /usr/lib/libpthread.so.1 > -bash-4.3$ > > The pre-install version is the same and has the same problem: > > -bash-4.3$ cd ntpq > -bash-4.3$ ntpq > ntpq: can't find Python NTP library -- check PYTHONPATH. > Shared object "libpython2.7.so.1.0" not found > -bash-4.3$ > > No complaints from the linker: > [ 62/162] Compiling libntp/timetoa.c > [ 63/162] Linking bob/main/libntp/ntpc.so > [ 64/162] Compiling libsodium/sodium/core.c > > I poked around a bit but but didn't get far enough to figure out why that > library is needed or find the waf code the puts in the -lpython2.7 I think this just means the Python dev tools aren't installed on that system. -- Eric S. Raymond From hmurray at megapathdsl.net Fri Nov 25 00:17:42 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 24 Nov 2016 16:17:42 -0800 Subject: ntpq broken on NetBSD 6.1.5 In-Reply-To: Message from "Eric S. Raymond" of "Thu, 24 Nov 2016 17:59:52 EST." 
<20161124225952.GB15746@thyrsus.com> Message-ID: <20161125001742.317C5406061@ip-64-139-1-69.sjc.megapath.net> > I think this just means the Python dev tools aren't installed on that system. Maybe. But I don't think there is a separate dev package on NetBSD. There is a Python.h. It works on 7.0.2 and I didn't install anything extra there. I'm slightly surprised the linker didn't complain. Is there a search path to find things like that if they aren't resolved at link time? -- These are my opinions. I hate spam. From dfoxfranke at gmail.com Fri Nov 25 00:21:00 2016 From: dfoxfranke at gmail.com (Daniel Franke) Date: Thu, 24 Nov 2016 19:21:00 -0500 Subject: Request for code & logic review In-Reply-To: <20161124150428.9A70313A10F4@snark.thyrsus.com> References: <20161124150428.9A70313A10F4@snark.thyrsus.com> Message-ID: On 11/24/16, Eric S. Raymond wrote: > The Python port of ntpdig is almost ready to land. But there is one last > little bit of it I'm not sure I understand correctly. I'm requesting > review > of my code and assumptions. > > Presently the adjustment and synch distance are calculated this way: > > def delta(self): > return self.root_delay > def epsilon(self): > return self.root_dispersion > def synchd(self): > "Synchronization distance, estimates worst-case error in seconds" > # This is "lambda" in NTP-speak, but that's a Python keyword > return abs(self.delta() - self.epsilon()) > def adjust(self): > "Adjustment implied by this packet." > return self.received - self.transmit_timestamp > > where self.received is the receipt time of the response packet. I do > it this way for no better reason than that some examples of Python SNTP > clients I found on the web use this formula. But the C ntpdig code > suggests this is wrong. 
> > Here's the C for the offset/synch-distance calculation: > > -------------------------------------------------------------------------------- > /* Convert timestamps from network to host byte order */ > p_rdly = NTOHS_FP(rpkt->rootdelay); > p_rdsp = NTOHS_FP(rpkt->rootdisp); > NTOHL_FP(&rpkt->reftime, &p_ref); > NTOHL_FP(&rpkt->org, &p_org); > NTOHL_FP(&rpkt->rec, &p_rec); > NTOHL_FP(&rpkt->xmt, &p_xmt); > > *precision = LOGTOD(rpkt->precision); > > TRACE(3, ("offset_calculation: LOGTOD(rpkt->precision): %f\n", > *precision)); > > /* Compute offset etc. */ > tmp = p_rec; > L_SUB(&tmp, &p_org); > LFPTOD(&tmp, t21); > TVTOTS(tv_dst, &dst); > dst.l_ui += JAN_1970; > tmp = p_xmt; > L_SUB(&tmp, &dst); > LFPTOD(&tmp, t34); > *offset = (t21 + t34) / 2.; > delta = t21 - t34; > > // synch_distance is: > // (peer->delay + peer->rootdelay) / 2 + peer->disp > // + peer->rootdisp + clock_phi * (current_time - peer->update) > // + peer->jitter; > // > // and peer->delay = fabs(peer->offset - p_offset) * 2; > // and peer->offset needs history, so we're left with > // p_offset = (t21 + t34) / 2.; > // peer->disp = 0; (we have no history to augment this) > // clock_phi = 15e-6; > // peer->jitter = LOGTOD(sys_precision); (we have no history to augment > this) > // and ntp_proto.c:set_sys_tick_precision() should get us sys_precision. > // > // so our answer seems to be: > // > // (fabs(t21 + t34) + peer->rootdelay) / 3. > // + 0 (peer->disp) > // + peer->rootdisp > // + 15e-6 (clock_phi) > // + LOGTOD(sys_precision) > > INSIST( FPTOD(p_rdly) >= 0. ); > #if 1 > *synch_distance = (fabs(t21 + t34) + FPTOD(p_rdly)) / 3. > + 0. > + FPTOD(p_rdsp) > + 15e-6 > + 0. /* LOGTOD(sys_precision) when we can get it */ > ; > INSIST( *synch_distance >= 0. 
); > #else > *synch_distance = (FPTOD(p_rdly) + FPTOD(p_rdsp))/2.0; > #endif > -------------------------------------------------------------------------------- > > That is rather nasty and in my opinion sufficient reason to shoot the C > version through the head. Now I'm going to simplify it into pseudocode, > ignoring issues about endianness and timestamp epochs. > > -------------------------------------------------------------------------------- > precision = log2(rpkt->precision); > > /* Compute offset etc. */ > t21 = rpkt->rec - rpkt->org > t34 = now - rpkt->xmt > offset = (t21 + t34) / 2.; > > #if 1 > synch_distance = (fabs(t21 + t34) + rpkt->rdly) / 3. > + 0. > + rpkt->rdsp > + 15e-6 > + 0. /* LOGTOD(sys_precision) when we can get it */ > ; > #else > synch_distance = (pkt->rdly + pkt->p_rdsp)/2.0; > #endif > -------------------------------------------------------------------------------- > > I've omitted the delta calculation because it's only used in a diagnostic; > offset is what is applied to the local clock. > > Note that the synch distance formula that is conditioned out is the one > I'm using. Somebody must have thought it was correct at one time. But if > I'm to believe the C above, the Python should read thus: > > def synchd(self): > "Synchronization distance, estimates worst-case error in seconds" > (self.receive_timestamp - self.origin_timestamp) \ > + (now - self.transmit_timestamp) + self.rootdelay) / 3 \ > + self.root_dispersion + 15e-6 > def adjust(self): > "Adjustment implied by this packet." > return ((self.receive_timestamp - self.origin_timestamp) > + (now - self.transmit_timestamp)) / 2 > > My requests to Daniel and anyone else who might understand these formulas: > > (1) Check my logic and C translation. A bug here would be bad. > > (2) What is the authority for these formulas? What RFC chapter and > verse can we cite? Neither the old nor new version looks correct to me. 
If you want to see a correct and readable C implementation of this logic then see my refactored implementation in ntp_proto.c:592-605. The authority here is RFC5905, Section 10, but let me spare you from hacking through that and give you an explanation you'll follow and remember. For starters, we have our four timestamps: t_1, the origin timestamp, is the time according to the client at which the request was sent. t_2, the transmit timestamp, is the time according to the server at which the request was received. t_3, the receive timestamp, is the time according to the server at which the reply was sent. t_4, the destination timestamp, is the time according to the client at which the reply was received. Theta is the thing we want to estimate: the offset between the server clock and the client clock. The sign convention is that theta is positive iff the server is ahead of the client. Theta is estimated by [(t_2-t_1)+(t_3-t_4)]/2. The accuracy of this estimate is predicated upon network latency being symmetrical. I've yet to come up with a snappy description of why this formula is correct, but just try plugging in some numbers and you'll get the idea. Delta is the network round trip time, i.e. (t_4-t_1)-(t_3-t_2). This one is easier to explain: (t_4-t_1) is the total time that the request was in flight, and (t_3-t_2) is time that the server spent processing it; when you subtract that out you're left with just network delays. Lambda nominally represents the maximum amount by which theta could be off. It's computed as delta/2 + epsilon. The delta/2 term usually dominates and represents the maximum amount by which network asymmetry could be throwing off the calculation. Epsilon is the sum of three other sources of error: rho_r: the (im)precision field from response packet, representing the server's inherent error in clock measurement. rho_s: the client's own (im)precision. 
PHI*(t_4-t_1): The amount by which the client's clock may plausibly have drifted while the packet was in flight. PHI is taken to be a constant of 15ppm. rho_r and rho_s are estimated by making back-to-back calls to clock_gettime() (or similar) and taking their difference. They're encoded on the wire as an eight-bit two's complement integer representing, to the nearest integer, log_2 of the value in seconds. And that's pretty much all there is to know. Go try again and I'll review your next pass. From esr at thyrsus.com Fri Nov 25 00:53:24 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 24 Nov 2016 19:53:24 -0500 Subject: More Python quirks In-Reply-To: <20161119075524.A22EB406061@ip-64-139-1-69.sjc.megapath.net> References: <20161119075524.A22EB406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161125005324.GA17777@thyrsus.com> Hal Murray : > I think the problem was that it finds the compiled stuff off in > $build/main/pylib, but can't find the source do do a check, so it runs the > old compiled versions. That was because I didn't run waf build 'cause I > didn't think I needed it. You should, at this point, assume that you need to run "waf build" any time anything in the Python library changes. Not the client code, like ntpq, but anything in pylib. > I expected python to automagically rebuild the compiled versions. And it used to work that way. Until you clued me in about --out and I realized that the magic link needed to forward to the build directory rather than the pylib source directory. > Will that get fixed if you add a sym link back to the > source? It's possible. The rule is that Python builds a foo.pyc when, and *only* when, you import foo. -- Eric S. Raymond From esr at thyrsus.com Fri Nov 25 01:21:46 2016 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Thu, 24 Nov 2016 20:21:46 -0500 Subject: Request for code & logic review In-Reply-To: References: <20161124150428.9A70313A10F4@snark.thyrsus.com> Message-ID: <20161125012146.GA17982@thyrsus.com> Daniel Franke : > t_1, the origin timestamp, is the time according to the client at > which the request was sent. > t_2, the transmit timestamp, is the time according to the server at > which the request was received. > t_3, the receive timestamp, is the time according to the server at > which the reply was sent. > t_4, the destination timestamp, is the time according to the client at > which the reply was received. Er... > t_2, the transmit timestamp, is the time according to the server at > which the request was received. The "transmit timestamp" is a receipt time? Are you sure that's right? From the protocol diagram:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                    Reference Timestamp (64)                   +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                     Origin Timestamp (64)                     +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                     Receive Timestamp (64)                    +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               |
+                    Transmit Timestamp (64)                    +
|                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
Your t_1 looks like it must be the second field, "Origin Timestamp" Your t_2 looks like it must be the fourth field, "Transmit Timestamp" Your t_3 looks like it must be the third field, "Receive Timestamp" Is t_4 field 1 (the "Reference Timestamp") your "destination timestamp"? I'm confused about what "the time according to the client at which the reply was received" is doing in the packet at all. Surely the server can't have put it there, unless it can see into the future. > Lambda nominally represents the maximum amount by which theta could be > off. It's computed as delta/2 + epsilon.
The delta/2 term usually > dominates and represents the maximum amount by which network asymmetry > could be throwing off the calculation. Epsilon is the sum of three > other sources of error: > > rho_r: the (im)precision field from response packet, representing the > server's inherent error in clock measurement. > rho_s: the client's own (im)precision. > PHI*(t_4-t_1): The amount by which the client's clock may plausibly > have drifted while the packet was in flight. PHI is taken to be a > constant of 15ppm. The big formula in the C comment looks like a botched attempt to compute this, under the assumption that ntpdig doesn't have rho_s available and must set it to zero. -- Eric S. Raymond From dfoxfranke at gmail.com Fri Nov 25 01:57:38 2016 From: dfoxfranke at gmail.com (Daniel Franke) Date: Thu, 24 Nov 2016 20:57:38 -0500 Subject: Request for code & logic review In-Reply-To: <20161125012146.GA17982@thyrsus.com> References: <20161124150428.9A70313A10F4@snark.thyrsus.com> <20161125012146.GA17982@thyrsus.com> Message-ID: On 11/24/16, Eric S. Raymond wrote: > Daniel Franke : >> t_1, the origin timestamp, is the time according to the client at >> which the request was sent. >> t_2, the transmit timestamp, is the time according to the server at >> which the request was received. >> t_3, the receive timestamp, is the time according to the server at >> which the reply was sent. >> t_4, the destination timestamp, is the time according to the client at >> which the reply was received. > > Er... > >> t_2, the transmit timestamp, is the time according to the server at >> which the request was received. > > The "transmit timestamp" is a receipt time? Are you sure that's right It should be, t_1, the origin timestamp, is the time according to the client at which the request was sent. t_2, the receive timestamp, is the time according to the server at which the request was received. t_3, the transmit timestamp, is the time according to the server at which the reply was sent. 
t_4, the destination timestamp, is the time according to the client at which the reply was received. The reference timestamp isn't really used for anything, and the destination timestamp obviously can't be part of the packet. From hmurray at megapathdsl.net Fri Nov 25 04:24:19 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 24 Nov 2016 20:24:19 -0800 Subject: How NTP works In-Reply-To: Message from Daniel Franke of "Thu, 24 Nov 2016 19:21:00 EST." Message-ID: <20161125042419.918BF406061@ip-64-139-1-69.sjc.megapath.net> [was Re: Request for code & logic review] dfoxfranke at gmail.com said: > For starters, we have our four timestamps: > t_1, the origin timestamp, is the time according to the client at which the > request was sent. t_2, the transmit timestamp, is the time according to the > server at which the request was received. t_3, the receive timestamp, is the > time according to the server at which the reply was sent. t_4, the > destination timestamp, is the time according to the client at which the > reply was received. (and more) I think that transmit and receive in the above are swapped. In this context, receive and transmit refer to the server while origin and destination refer to the client. There really should be someplace where this is explained clearly so we can refer to it at times like this and put all our effort into making that place better rather than spreading good ideas all over the place where they are hard to find. Does anybody know of such a place? If not, we should make one. I'll volunteer to make a first pass. esr at thyrsus.com said: > I'm confused about what "the time according to the client at which the reply > was received" is doing in the packet at all. Surely the server can't have > put it there, unless it can see into the future. There is confusion because the calculations use 4 time stamps and there are 4 time stamps in the packet. Only the last 3 time stamps from the packet are used for the calculations.
The 4th is the destination (aka arrival back at the client) time stamp. dfoxfranke at gmail.com said: > The reference timestamp isn't really used for anything, and the destination > timestamp obviously can't be part of the packet. I think it would help people understand things if we included the reason for the extra time stamp in the packet. Mills wouldn't have wasted 8 whole bytes without a good one. Another chunk that should go in this area: If you look at the raw data, there are 3 unknowns: transit time client to server, transit time server to client, and clock offset; but there are only two equations, so you can't solve it. NTP gets a 3rd equation by assuming the transit times are equal. That lets it solve for the clock offset. If you assume that both clocks are accurate, which is reasonable if you have GPS at both ends, then you can easily solve for the transit times in each direction. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Nov 25 05:29:16 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 24 Nov 2016 21:29:16 -0800 Subject: More Python quirks In-Reply-To: Message from "Eric S. Raymond" of "Thu, 24 Nov 2016 19:53:24 EST." <20161125005324.GA17777@thyrsus.com> Message-ID: <20161125052916.B2F3E406061@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > You should, at this point, assume that you need to run "waf build" any time > anything in the Python library changes. Not the client code, like ntpq, but > anything in pylib. This could easily be a wild goose chase, so treat accordingly. I'm not particularly happy with the waf/Python interactions. I can't put my finger on anything that is obviously wrong, but it doesn't feel right. Why do I need to run ./waf build to fixup python stuff? That's not part of the normal python work flow. (aka if I screwed up, others probably will too so I think we should fix it if we can) Why is ntpq copied to build/main/ntpq/ ?
Why aren't the sources from pylib copied to build/main/pylib/ ? (or rather links back to the source so I don't have to rerun the copy) I think if the source was available, python would have recompiled things and avoided my confusion. For testing, why do we run ntpq by cd-ing to ntpq rather than cd-ing to build/main/ntpq? ------- There is some coverage of PYTHONPATH in devel/testing.txt The text is clean, but not satisfying. I think the problem is that I'm missing the big picture. What is python's overall search strategy? What happens if it finds binaries without sources and sources without binaries? I'm guessing the search strategy is dot, PYTHONPATH, then sys.path, and it uses binaries if it finds them without sources and if it finds sources without binaries, it puts the binaries back in that directory (permissions allowing). How would you find that text if you had a problem but didn't know much about python? You might get there if you followed everything step by step, but nobody does that. A line or two in INSTALL might help. Is there a section about switching to python in the differences between ntpsec and ntp classic? I think a warning about potential search path quirks would be appropriate. NEWS has the details. Do we need a less verbose summary of the changes? I think README and/or INSTALL need a pointer to the differences. Does --enable-classic-mode do anything to the python code? If so, how does that work? (If not, we should probably tweak the text to indicate that it only applies to ntpd.) -------- There is still some waf activity for the python stuff that happens at check vs build time and the other way around. I'll put that in another message. -- These are my opinions. I hate spam. From esr at thyrsus.com Fri Nov 25 13:53:30 2016 From: esr at thyrsus.com (Eric S.
Raymond) Date: Fri, 25 Nov 2016 08:53:30 -0500 Subject: How NTP works In-Reply-To: <20161125042419.918BF406061@ip-64-139-1-69.sjc.megapath.net> References: <20161125042419.918BF406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161125135330.GA30746@thyrsus.com> Hal Murray : > [was Re: Request for code & logic review] > > dfoxfranke at gmail.com said: > > For starters, we have our four timestamps: > > t_1, the origin timestamp, is the time according to the client at which the > > request was sent. t_2, the transmit timestamp, is the time according to the > > server at which the request was received. t_3, the receive timestamp, is the > > time according to the server at which the reply was sent. t_4, the > > destination timestamp, is the time according to the client at which the > > reply was received. > > (and more) > > I think that transmit and receive in the above are swapped. > > In this context, receive and transmit refer to the server while origin and > destination refer to the client. Daniel corrected that in his next reply. > There really should be someplace where this is explained clearly so we can > refer to it at times like this and put all our effort into making that place > better rather than spreading good ideas all over the place where they are > hard to find. Does anybody know of such a place? If not, we should make > one. I'll volunteer to make a first pass. For right now I need this explanation to live in the header comment of packet.py; it has to be there to explain the theta/delta/epsilon formulas in the SyncPacket class, which is crucial to implementing Python ntpdig, and will be the core of ntpshark as well. Once we get it all polished up it can move to docs/, but I want to keep the content issues separate from the organization/location issue while we get it right. > dfoxfranke at gmail.com said: > > The reference timestamp isn't really used for anything, and the destination > > timestamp obviously can't be part of the packet.
> > I think it would help people understand things if we included the reason for > the extra time stamp in the packet. Mills wouldn't have wasted 8 whole > bytes without a good one. Oh yes indeed. That confused the hell out of me. In fact it's still doing so. /me looks in the code. That's the 'reftime' field in struct pkt. Grepping, I find...that it really *doesn't* seem to be used. But verifying this is confusing because there are other variables with reftime in their names. You should both check me on it. > Another chunk that should go in this area: > > If you look at the raw data, there are 3 unknowns: > transit time client to server > transit time server to client > clock offset > but there are only two equations, so you can't solve it. > > NTP gets a 3rd equation by assuming the transit times are equal. That lets > it solve for the clock offset. > > If you assume that both clocks are accurate which is reasonable if you have > GPS at both ends, then you can easily solve for the transit times in each > direction. I'll just edit this into that comment header. OK, I've pushed the state of our knowledge. Hal, Daniel, please review it. The salient questions are: Is the reference timestamp used anywhere? What does ntpd put in it on transmission? -- Eric S. Raymond From hmurray at megapathdsl.net Fri Nov 25 14:23:30 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 25 Nov 2016 06:23:30 -0800 Subject: Bug in comment or code Message-ID: <20161125142330.E141E406061@ip-64-139-1-69.sjc.megapath.net> The comment on the tail of the call to record_raw_stats says: /* This will always be 0 by the time we get here */ peer->outcount); If that "always" is correct, something is broken. outcount is a hack I added to count lost packets. It's supposed to get bumped when a packet is sent and unbumped when a well formed packet is processed. If all goes well, it will be 0.
But if a packet gets lost, the unbump doesn't happen and it will be non-zero when a packet finally arrives. It should get cleared after being printed. -- These are my opinions. I hate spam. From esr at thyrsus.com Fri Nov 25 14:33:58 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 25 Nov 2016 09:33:58 -0500 Subject: Bug in comment or code In-Reply-To: <20161125142330.E141E406061@ip-64-139-1-69.sjc.megapath.net> References: <20161125142330.E141E406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161125143358.GA32041@thyrsus.com> Hal Murray : > > The comment on the tail of the call to record_raw_stats says: > /* This will always be 0 by the time we get here */ > peer->outcount); > > If that "always" is correct, something is broken. > > outcount is a hack I added to count lost packets. > > It's supposed to get bumped when a packet is sent and unbumped when a well > formed packet is processed. If all goes well, it will be 0. But if a packet > gets lost, the unbump doesn't happen and it will be non-zero when a packet > finally arrives. > > It should get cleared after being printed. Daniel, I think this one is yours. IIRC you just had to modify the outcount logic. -- Eric S. Raymond From esr at thyrsus.com Fri Nov 25 15:36:19 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 25 Nov 2016 10:36:19 -0500 (EST) Subject: An ntpmon proof-of-concept exists Message-ID: <20161125153619.2A16513A10F9@snark.thyrsus.com> As I expected, a first cut at ntpmon took only a couple of hours to write. This is where we start to collect some serious benefits from having a common back-end library for Python clients; these will continue when we start on ntpshark. A couple of problems stand in the way of landing ntpmon in the tree, however. One is purely a housekeeping issue; rather than create a ntpmon/ subdirectory that will go away in the upcoming directory reorganization, I'd rather land Python ntpdig first, do the reorg, and then commit ntpmon.
(Am currently thinking the new directory will be named "clients".) The other problem is a bit more serious. I can't seem to get my ntpd to report mrulist entries. This hampers testing of ntpmon; the lower window is supposed to be an mrulist display. Right now I can only demonstrate the upper half, the automatically-refreshing peers display. To answer the obvious questions: No, I do not have nomrulist configured, and my daemon thinks mru_enabled is 1 when I query it through ntpq. And I can (sometimes) see MRUlist entries from Hal Murray's instance at 178.62.68.79, so ntpd is generating them. Hal thinks some configuration switch is implicated, but doesn't know which. I don't either. Can anyone else shed any light on this? -- Eric S. Raymond I have never made but one prayer to God, a very short one: "O Lord, make my enemies ridiculous." And God granted it. --Voltaire From hmurray at megapathdsl.net Fri Nov 25 18:44:28 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 25 Nov 2016 10:44:28 -0800 Subject: An ntpmon proof-of-concept exists In-Reply-To: Message from esr@thyrsus.com (Eric S. Raymond) of "Fri, 25 Nov 2016 10:36:19 EST." <20161125153619.2A16513A10F9@snark.thyrsus.com> Message-ID: <20161125184428.53989406061@ip-64-139-1-69.sjc.megapath.net> > The other problem is a bit more serious. I can't seem to get my ntpd to > report mrulist entries. This hampers testing of ntpmon; the lower window is > supposed to be an mrulist display. Right now I can only demonstrate the > upper half, the automatically-refreshing peers display. It will probably be simple after we find it. Send me a copy of your ntp.conf What do you have on your waf configure line? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Nov 25 18:54:21 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 25 Nov 2016 10:54:21 -0800 Subject: How NTP works In-Reply-To: Message from "Eric S. Raymond" of "Fri, 25 Nov 2016 08:53:30 EST."
<20161125135330.GA30746@thyrsus.com> Message-ID: <20161125185421.D836B406061@ip-64-139-1-69.sjc.megapath.net> > That's the 'reftime' field in struct pkt. Grepping, I find...that it really > *doesn't* seem to be used. But verifying this is confusing because there > are other variables with reftime in their names. You should both check me > on it. We should scan ntp classic in case we botched something. I took a look at the RFC. The only use I saw was on page 83 as part of a sanity check filter. /* * Verify valid root distance. */ if (r->rootdelay / 2 + r->rootdisp >= MAXDISP || p->reftime > r->xmt) return; /* invalid header values */ The MAXDISP above turns into sys_maxdisp in the code. This looks like the corresponding code: if(scalbn((double)pkt->rootdelay/2.0 + (double)pkt->rootdisp, -16) >= sys_maxdisp) { peer->flash |= BOGON7; return; } I don't see the reftime check. -- These are my opinions. I hate spam. From Stromeko at nexgo.de Fri Nov 25 19:12:55 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Fri, 25 Nov 2016 20:12:55 +0100 Subject: Request for code & logic review References: <20161124150428.9A70313A10F4@snark.thyrsus.com> <20161125012146.GA17982@thyrsus.com> Message-ID: <8760nb4bm0.fsf@Rainer.invalid> Daniel Franke writes: > The reference timestamp isn't really used for anything The server is supposed to return this value unchanged, so one of the BSD implementations of the ntp client uses this field to send random data in order to weed out replay and fake packets. Checking for out-of-order replies may actually have been the original intention anyway. Regards, Achim. 
-- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Waldorf MIDI Implementation & additional documentation: http://Synth.Stromeko.net/Downloads.html#WaldorfDocs From dfoxfranke at gmail.com Fri Nov 25 20:01:06 2016 From: dfoxfranke at gmail.com (Daniel Franke) Date: Fri, 25 Nov 2016 15:01:06 -0500 Subject: Request for code & logic review In-Reply-To: <8760nb4bm0.fsf@Rainer.invalid> References: <20161124150428.9A70313A10F4@snark.thyrsus.com> <20161125012146.GA17982@thyrsus.com> <8760nb4bm0.fsf@Rainer.invalid> Message-ID: On 11/25/16, Achim Gratz wrote: > Daniel Franke writes: >> The reference timestamp isn't really used for anything > > The server is supposed to return this value unchanged, so one of the BSD > implementations of the ntp client uses this field to send random data in > order to weed out replay and fake packets. Checking for out-of-order > replies may actually have been the original intention anyway. You have a couple things mixed up here. The server copies the client's *transmit* timestamp unchanged into the *origin* timestamp in its response. All implementations, not just OpenNTPD, check for this match. However, OpenNTPD randomizes the entire field rather than just the low bits, a practice that I shortly plan to duplicate in NTPsec and advocate in https://www.ietf.org/id/draft-dfranke-ntp-data-minimization-01.txt. The reference timestamp is supposed to be copied unchanged from *upstream* in the hierarchy. So whatever reference a stratum 2 server's system peer is reporting to it, it will in turn report to its stratum 3 clients. Stratum 1 servers will set the reference timestamp to whatever time was most recently given to them by their reference clock. But despite this information being maintained, it isn't used for anything except perhaps diagnostics. This may not always have been the case; I suspect that at some point in the past the reference timestamp was important in server selection, but I haven't verified this.
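A minimal sketch of the arithmetic this thread has been circling, assuming the corrected timestamp labels (t_1 origin, t_2 receive, t_3 transmit, t_4 destination) and the standard NTP offset and delay formulas. The function name sync_estimates and the choice to pass rho_r and rho_s already converted to seconds are illustrative assumptions; this is not the contents of packet.py.

```python
PHI = 15e-6  # drift bound of 15 ppm, as quoted earlier in the thread


def sync_estimates(t1, t2, t3, t4, rho_r=0.0, rho_s=0.0):
    """Return (theta, delta, epsilon, lam), all in seconds.

    t1: client clock when the request was sent (origin)
    t2: server clock when the request was received (receive)
    t3: server clock when the reply was sent (transmit)
    t4: client clock when the reply was received (destination);
        measured locally, never carried in the packet
    rho_r, rho_s: server and client precision, assumed here to be
        already expressed in seconds (hypothetical convention)
    """
    # Offset of the server clock relative to the client clock,
    # exact only if the two transit times are equal.
    theta = ((t2 - t1) + (t3 - t4)) / 2.0
    # Round-trip delay with the server's processing time removed.
    delta = (t4 - t1) - (t3 - t2)
    # Error budget: both precisions plus drift while in flight.
    epsilon = rho_r + rho_s + PHI * (t4 - t1)
    # Nominal bound on how far off theta could be.
    lam = delta / 2.0 + epsilon
    return theta, delta, epsilon, lam


# Made-up exchange: server is 0.5 s ahead, each leg takes 0.1 s,
# and the server spends 0.1 s before replying.
theta, delta, epsilon, lam = sync_estimates(10.0, 10.6, 10.7, 10.3)
```

The equal-transit-time assumption is the "third equation" from the How NTP works thread; lam is what bounds the damage when that assumption is wrong.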
From esr at thyrsus.com Sat Nov 26 04:33:27 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 25 Nov 2016 23:33:27 -0500 Subject: An ntpmon proof-of-concept exists In-Reply-To: <20161125184428.53989406061@ip-64-139-1-69.sjc.megapath.net> References: <20161125153619.2A16513A10F9@snark.thyrsus.com> <20161125184428.53989406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161126043327.GA12697@thyrsus.com> Hal Murray : > > > The other problem is a bit more serious. I can't seem to get my ntpd to > > report mrulist entries. This hampers testing of ntpmon; the lower window is > > supposed to be an mrulist display. Right now I can only demonstrate the > > upper half, the automatically-refreshing peers display. > > It will probably be simple after we find it. > > Send me a copy of your ntp.conf # /etc/ntp.conf, configuration for ntpd; see ntp.conf(5) for help driftfile /var/lib/ntp/ntp.drift # Enable this if you want statistics to be logged. statsdir /var/log/ntpstats/ statistics loopstats peerstats clockstats rawstats filegen loopstats file loopstats type day enable filegen peerstats file peerstats type day enable filegen clockstats file clockstats type day enable #refclock shm unit 0 flag4 1 refid GPS #refclock shm unit 1 prefer flag4 1 refid PPS #refclock nmea baud 9600 path /dev/ttyUSB0 # Specify one or more NTP servers. #server [2001:470:e815::8] maxpoll 5 # spidey.rellim.com # Use servers from the NTP Pool Project. Approved by Ubuntu Technical Board # on 2011-02-08 (LP: #104525). See http://www.pool.ntp.org/join.html for # more information. #server 0.ubuntu.pool.ntp.org #server 1.ubuntu.pool.ntp.org #server 2.ubuntu.pool.ntp.org #server 3.ubuntu.pool.ntp.org pool us.pool.ntp.org # Use Ubuntu's ntp server as a fallback. #server ntp.ubuntu.com # Access control configuration; see /usr/share/doc/ntp-doc/html/accopt.html for # details. The web page # might also be helpful. 
# # Note that "restrict" applies to both servers and clients, so a configuration # that might be intended to block requests from certain clients could also end # up blocking replies from your own upstream servers. # By default, exchange time with everybody, but don't allow configuration. restrict -4 default notrap nomodify nopeer noquery restrict -6 default notrap nomodify nopeer noquery # Local users may interrogate the ntp server more closely. restrict 127.0.0.1 restrict ::1 # Clients from this (example!) subnet have unlimited access, but only if # cryptographically authenticated. #restrict 192.168.123.0 mask 255.255.255.0 notrust # If you want to provide time to your local subnet, change the next line. # (Again, the address is an example only.) #broadcast 192.168.123.255 # If you want to listen to time broadcasts on your local subnet, de-comment the # next lines. Please do this only if you trust everybody on the network! #disable auth #broadcastclient keys /usr/local/etc/ntp.keys trustedkey 10 controlkey 10 > What do you have on your waf configure line? ./waf configure --enable-crypto --refclock=all --enable-doc -- Eric S. Raymond From hmurray at megapathdsl.net Sat Nov 26 05:09:33 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 25 Nov 2016 21:09:33 -0800 Subject: An ntpmon proof-of-concept exists In-Reply-To: Message from "Eric S. Raymond" of "Fri, 25 Nov 2016 23:33:27 EST." <20161126043327.GA12697@thyrsus.com> Message-ID: <20161126050933.7B6FA406061@ip-64-139-1-69.sjc.megapath.net> > # By default, exchange time with everybody, but don't allow configuration. > restrict -4 default notrap nomodify nopeer noquery > restrict -6 default notrap nomodify nopeer noquery > restrict 127.0.0.1 > restrict ::1 >From docs/includes/access-commands.txt +noquery+;; Deny {ntpqman} queries. Time service is not affected. Are you running ntpq on that system or poking from another box? Are you using localhost or its host name? -- These are my opinions. I hate spam. 
From hmurray at megapathdsl.net Sat Nov 26 09:41:21 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 26 Nov 2016 01:41:21 -0800 Subject: ntpq quirk Message-ID: <20161126094121.AA292406061@ip-64-139-1-69.sjc.megapath.net> The old ntpq used to accept any unique prefix of a command. The new version doesn't. -- These are my opinions. I hate spam. From Stromeko at nexgo.de Sat Nov 26 11:34:25 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sat, 26 Nov 2016 12:34:25 +0100 Subject: Request for code & logic review References: <20161124150428.9A70313A10F4@snark.thyrsus.com> <20161125012146.GA17982@thyrsus.com> <8760nb4bm0.fsf@Rainer.invalid> Message-ID: <87a8cmzd8e.fsf@Rainer.invalid> Daniel Franke writes: > You have a couple things mixed up here. Thanks for the clarification. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From esr at thyrsus.com Sat Nov 26 11:49:32 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 26 Nov 2016 06:49:32 -0500 Subject: An ntpmon proof-of-concept exists In-Reply-To: <20161126050933.7B6FA406061@ip-64-139-1-69.sjc.megapath.net> References: <20161126043327.GA12697@thyrsus.com> <20161126050933.7B6FA406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161126114932.GA19600@thyrsus.com> Hal Murray : > > # By default, exchange time with everybody, but don't allow configuration. > > restrict -4 default notrap nomodify nopeer noquery > > restrict -6 default notrap nomodify nopeer noquery > > > restrict 127.0.0.1 > > restrict ::1 > > From docs/includes/access-commands.txt > +noquery+;; > Deny {ntpqman} queries. Time service is not affected. > > Are you running ntpq on that system or poking from another box? > Are you using localhost or its host name? I solved the problem. It wasn't the access bits. There is a separate nomrulist, now documented. -- Eric S.
Raymond From hmurray at megapathdsl.net Sat Nov 26 12:02:33 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 26 Nov 2016 04:02:33 -0800 Subject: Heads up - directory reorganization coming In-Reply-To: Message from esr@thyrsus.com (Eric S. Raymond) of "Thu, 24 Nov 2016 07:31:17 EST." <20161124123117.6195613A10F3@snark.thyrsus.com> Message-ID: <20161126120233.7545B406061@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > The goal here is to make a less branchy tree so the overall organization is > more easily taken in at a glance. I think it's still good to avoid mixing C > files from different subprojects in the same directory, and I'll stick to > that for ntpd/ntptime/ntpfrob. ... I think ntptime can be pythonized. (without a lot of work) Do you have sample code that calls libc and returns a big struct? or equivalent -- These are my opinions. I hate spam. From esr at thyrsus.com Sat Nov 26 12:03:50 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 26 Nov 2016 07:03:50 -0500 Subject: ntpq quirk In-Reply-To: <20161126094121.AA292406061@ip-64-139-1-69.sjc.megapath.net> References: <20161126094121.AA292406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161126120350.GB19600@thyrsus.com> Hal Murray : > The old ntpq used to accept any unique prefix of a command. The new version > doesn't. I was going to say tab completion is the closest we can come, then I thought of a kludgy way to fix this by hacking the precmd method. It might not be portable to Python 3, though. How important do you think this is? -- Eric S. Raymond From esr at thyrsus.com Sat Nov 26 12:06:52 2016 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Sat, 26 Nov 2016 07:06:52 -0500 Subject: Heads up - directory reorganization coming In-Reply-To: <20161126120233.7545B406061@ip-64-139-1-69.sjc.megapath.net> References: <20161124123117.6195613A10F3@snark.thyrsus.com> <20161126120233.7545B406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161126120652.GC19600@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > > The goal here is to make a less branchy tree so the overall organization is > > more easily taken in at a glance. I think it's still good to avoid mixing C > > files from different subprojects in the same directory, and I'll stick to > > that for ntpd/ntptime/ntpfrob. ... > > I think ntptime can be pythonized. (without a lot of work) > > Do you have sample code that calls libc and returns a big struct? or > equivalent No, but I know exactly how to do it. I'll put this on the to-do list at low priority. -- Eric S. Raymond From Stromeko at nexgo.de Sat Nov 26 13:00:03 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sat, 26 Nov 2016 14:00:03 +0100 Subject: Python ntpq Message-ID: <8760naz99o.fsf@Rainer.invalid> I've configured to use python3.4m, since that's what I have a matching python-config installed for. However, all the Python scripts still use /usr/bin/python (which points to python2.7 on my system) and hilarity ensues. The waf build or install needs to create the correct shebang lines when a different python binary is configured, I suppose. Why PYTHONPATH is necessary isn't really clear to me (but then I know next to nothing about Python). The parent directory is in sys.path, so it should know how to find the ntp dir without any configuration, as it does with the other packages that are already there. Or again waf should add the correct PYTHONPATH to the shebang line. 
Editing the shebang by hand then doesn't work anyway:
Traceback (most recent call last):
  File "/usr/bin/ntpq", line 1590, in
    interpreter.onecmd(cmd)
  File "/usr/lib/python3.4/cmd.py", line 217, in onecmd
    return func(arg)
  File "/usr/bin/ntpq", line 956, in do_rv
    self.do_readvar(line)
  File "/usr/bin/ntpq", line 946, in do_readvar
    self.__dolist(line.split()[1:], associd, ntp.packet.CTL_OP_READVAR, qtype, quiet=True)
  File "/usr/bin/ntpq", line 456, in __dolist
    variables = self.session.readvar(associd, varlist, op)
  File "/usr/lib/python3/dist-packages/ntp/packet.py", line 1104, in readvar
    self.doquery(opcode, associd=associd, qdata=qdata)
  File "/usr/lib/python3/dist-packages/ntp/packet.py", line 1040, in doquery
    res = self.getresponse(opcode, associd, not retry)
  File "/usr/lib/python3/dist-packages/ntp/packet.py", line 904, in getresponse
    rpkt.analyze(rawdata)
  File "/usr/lib/python3/dist-packages/ntp/packet.py", line 526, in analyze
    rawdata[:ControlPacket.HEADER_LEN])
TypeError: 'str' does not support the buffer interface
So I've installed python-dev to get python-config for 2.7 and ntpq finally works. The shortest unique abbreviation for commands isn't working anymore (Hal already noted that). Also, while it's good that the print can use a wide terminal now, it really shouldn't move the data to the right as far as possible (my text console is 320 characters wide and the usual graphics terminal 176 characters). Can the fields please be widened only so much that they show the data without truncation? Else I'd need some option to limit the width of the printout. The output from the cv command mangles the timecode string from the GPS:
timecode=""$GPZDA", 120543.000="", 26="", 11="", 2016="", 00="", 00*56"="",
This used to look like that (and really still should):
timecode="$GPZDA,120543.000,26,11,2016,00,00*56",
Regards, Achim.
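The TypeError in the traceback earlier in this message is the usual symptom of handing a Python 3 str to code that wants bytes. A minimal sketch of one polyglot workaround, assuming the failing call is a struct.unpack on the raw packet data (the traceback does not show the call itself). The helper polybytes and the 4-byte header layout are hypothetical, not the real ControlPacket format.

```python
import struct


def polybytes(s):
    """Return s as bytes under both Python 2 and Python 3.

    On Python 2, str is already bytes, so this is a no-op. On
    Python 3 a text string is encoded with latin-1, which maps
    code points 0-255 one-to-one onto byte values.
    """
    if isinstance(s, bytes):
        return s
    return s.encode("latin-1")


# Hypothetical 4-byte header: two 8-bit fields and one 16-bit
# field, all in network byte order.
raw = "\x16\x02\x00\x01"
header = struct.unpack("!BBH", polybytes(raw))
```

Keeping socket data as bytes from the moment it is received avoids the conversion entirely; a helper like this is only needed at the boundaries where text and binary data meet.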
-- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf rackAttack: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From Stromeko at nexgo.de Sat Nov 26 13:19:16 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sat, 26 Nov 2016 14:19:16 +0100 Subject: ntp.conf new refclock configuration syntax Message-ID: <871sxyz8dn.fsf@Rainer.invalid> I've switched to the new refclock configuration syntax, but it seems there is either a bug somewhere in the implementation or some mismatch to the documentation. I was trying to set up my NavSpark GPS to use $GPZDA and 115200 baud like this: refclock nmea unit 1 refid NavS mode 8 baud 115200 flag1 1 but that actually configured the serial for 0 baud and the GPS stopped sending data. Manually setting it back to the correct baud rate with stty recovered operation. Since the old syntax still seemed to work correctly I've tried to pull the baud configuration into the mode parameter again and that actually worked: refclock nmea unit 1 refid NavS mode 88 flag1 1 The documentation example at the end of the page also looks wrong: refclock nmea mode baud 19200 # All sentences from /dev/gps0 at 19200 baud It looks like the mode parameter is missing here (it might actually interpret a missing mode as zero, but that should be commented). It would also be good to show the unit syntax in one of the examples for each refclock I'd think. It would be a good thing if there was a way to read in any old configuration file and spit it out again in the new syntax. Any default values should be present as a comment so it's easy to find out what else might need changing. Regards, Achim.
-- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From Stromeko at nexgo.de Sat Nov 26 13:24:47 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sat, 26 Nov 2016 14:24:47 +0100 Subject: ntpd w/ --enable-seccomp Message-ID: <87wpfqxtk0.fsf@Rainer.invalid> I've tried to configure ntpd to use seccomp on Raspbian, but it shows an error on start: Nov 26 13:10:59 raspberrypi2 ntpd[18805]: sandbox: seccomp_init succeeded Nov 26 13:10:59 raspberrypi2 ntpd[18805]: sandbox: seccomp_load() failed: Invalid argument It then seems to work anyway, but I'm wondering if any configuration might be missing. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Samples for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra From Stromeko at nexgo.de Sat Nov 26 13:28:48 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sat, 26 Nov 2016 14:28:48 +0100 Subject: ntpd w/ --enable-early-droproot Message-ID: <87shqextdb.fsf@Rainer.invalid> Configuring ntpd to drop root early makes it fail to open the refclock devices (which are owned by root). I guess they should be readable by group ntp at least on Raspbian, which starts ntpd with that group? Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf rackAttack: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From Stromeko at nexgo.de Sat Nov 26 13:35:41 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sat, 26 Nov 2016 14:35:41 +0100 Subject: ntpq quirk References: <20161126094121.AA292406061@ip-64-139-1-69.sjc.megapath.net> <20161126120350.GB19600@thyrsus.com> Message-ID: <87oa12xt1u.fsf@Rainer.invalid> Eric S. Raymond writes: > Hal Murray : >> The old ntpq used to accept any unique prefix of a command. The new version >> doesn't.
>
> I was going to say tab completion is the closest we can come, then I thought
> of a kludgy way to fix this by hacking the precmd method.  It might not be
> portable to Python 3, though.  How important do you think this is?

I've had to change a few command lines in history and one script.  If
this was something I had inherited, I'd have had a much harder time
finding out why it suddenly stopped working.


Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables

From ghane0 at gmail.com Sat Nov 26 14:12:47 2016
From: ghane0 at gmail.com (Sanjeev Gupta)
Date: Sat, 26 Nov 2016 22:12:47 +0800
Subject: An ntpmon proof-of-concept exists
In-Reply-To: <20161125153619.2A16513A10F9@snark.thyrsus.com>
References: <20161125153619.2A16513A10F9@snark.thyrsus.com>
Message-ID:

On Fri, Nov 25, 2016 at 11:36 PM, Eric S. Raymond wrote:

> The other problem is a bit more serious.  I can't seem to get my ntpd
> to report mrulist entries.  This hampers testing of ntpmon; the lower
> window is supposed to be an mrulist display.  Right now I can only
> demonstrate the upper half, the automatically-refreshing peers display.
>

I reported something that might be the same issue, which I have now
reopened:
https://gitlab.com/NTPsec/ntpsec/issues/156

I will do a git pull again and test.

--
Sanjeev Gupta
+65 98551208 http://www.linkedin.com/in/ghane
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From ghane0 at gmail.com Sat Nov 26 14:30:07 2016
From: ghane0 at gmail.com (Sanjeev Gupta)
Date: Sat, 26 Nov 2016 22:30:07 +0800
Subject: An ntpmon proof-of-concept exists
In-Reply-To:
References: <20161125153619.2A16513A10F9@snark.thyrsus.com>
Message-ID:

On Sat, Nov 26, 2016 at 10:12 PM, Sanjeev Gupta wrote:

>
> I reported something that might be the same issue, which I have now
> reopened:
> https://gitlab.com/NTPsec/ntpsec/issues/156
>
> I will do a git pull again and test.
>

After a waf uninstall, configure, and install, mrulist is still broken.
See issue 156, please.

--
Sanjeev Gupta
+65 98551208 http://www.linkedin.com/in/ghane
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From esr at thyrsus.com Sat Nov 26 20:01:56 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 26 Nov 2016 15:01:56 -0500
Subject: Python ntpq
In-Reply-To: <8760naz99o.fsf@Rainer.invalid>
References: <8760naz99o.fsf@Rainer.invalid>
Message-ID: <20161126200156.GA4544@thyrsus.com>

Achim Gratz :
> The waf build or install needs to create the correct shebang
> lines when a different python binary is configured, I suppose.

Oh dear Goddess, no.  I tried that approach once during GPSD's
history.  It was horrible, a maintenance nightmare.

Better practice is to write your Python so it will run under either
2 or 3.  See:

http://www.catb.org/esr/faqs/practical-python-porting/

The GPSD project uses this approach successfully.  We've tried to
write the NTPsec Python this way, but the packet library is still
under active development on a Python 2 system and that sometimes
breaks Python 3 compatibility.  One of our guys, Matt Selsky, is
working on catching it up and I expect fixes will land soon.

> So I've installed python-dev to get python-conf for 2.7 and ntpq finally
> works.

Well, that's good.

> The shortest unique abbreviation for commands isn't working
> anymore (Hal already noted that).

Yes, on my list, but I can't give it high priority.
I have a couple of fires I need to fight first.

> Also, while it's good that the print
> can use a wide terminal now, it really shouldn't move the data to the
> right as far as possible (my text console is 320 characters wide and the
> usual graphics terminal 176 characters).  Can the fields please be
> widened only so much that they show the data without truncation?  Else
> I'd need some option to limit the width of the printout.

No can do.  You're going to have to live with using the --wide option.

If this problem could be solved with clever programming, I'd do it.
Unfortunately it's fundamental.  You can't know the right width to pad
to until you've reverse-DNSed all the addresses.  But if you do that,
you can't report even the first line until after a potentially
unbounded amount of DNS-lookup stalls.

The C ntpq code was carefully written not to require this (and had a
control flow that looked really odd unless you understood this
constraint).  The Python port preserves these properties.

Use -w.

> The output from the cv command mangles the timecode string from the GPS:
>
> timecode=""$GPZDA", 120543.000="", 26="", 11="", 2016="", 00="", 00*56"="",
>
> This used to look like that (and really still should):
>
> timecode="$GPZDA,120543.000,26,11,2016,00,00*56",

Oh, bletch.  I see the problem.  I'm going to have to write a real
parser rather than using a naive split(",") call.

...or maybe not.  Rather klugey fix pushed.  Complain if it needs
further work.
--
Eric S. Raymond

From esr at thyrsus.com Sat Nov 26 20:34:10 2016
From: esr at thyrsus.com (Eric S.
Raymond) Date: Sat, 26 Nov 2016 15:34:10 -0500 Subject: ntpd w/ --enable-seccomp In-Reply-To: <87wpfqxtk0.fsf@Rainer.invalid> References: <87wpfqxtk0.fsf@Rainer.invalid> Message-ID: <20161126203410.GA7353@thyrsus.com> Achim Gratz : > > I've tried to configure ntpd to use seccomp on Raspian, but it shows an > error on start: > > Nov 26 13:10:59 raspberrypi2 ntpd[18805]: sandbox: seccomp_init succeeded > Nov 26 13:10:59 raspberrypi2 ntpd[18805]: sandbox: seccomp_load() failed: Invalid argument > > It then seems to work anyway, but I'm wondering if any configuration > might be missing. Hal, I think you were the last person to have hands on this code. Do you have any idea what might be going on here? -- Eric S. Raymond From esr at thyrsus.com Sat Nov 26 20:36:07 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 26 Nov 2016 15:36:07 -0500 Subject: ntpd w/ --enable-early-droproot In-Reply-To: <87shqextdb.fsf@Rainer.invalid> References: <87shqextdb.fsf@Rainer.invalid> Message-ID: <20161126203607.GB7353@thyrsus.com> Achim Gratz : > Configuring ntpd to drop root early makes it fail to open the refclock > devices (which are owned by root). I guess they should be readyble by > group ntp at least on Raspbian, which starts ntpd with that group? Yes, they should be. Our philosophy in situations like this is to go for the high-security option even if it needs a little more one-time setup, like a chmod or a udev rule. -- Eric S. Raymond From hmurray at megapathdsl.net Sat Nov 26 21:24:24 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 26 Nov 2016 13:24:24 -0800 Subject: Heads up - directory reorganization coming In-Reply-To: Message from "Eric S. Raymond" of "Sat, 26 Nov 2016 07:06:52 EST." <20161126120652.GC19600@thyrsus.com> Message-ID: <20161126212424.3F7C2406061@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: >> Do you have sample code that calls libc and returns a big struct? > No, but I know exactly how to do it. 
I'll put this on the to-do list at low > priority. I have code that works using ctypes. It doesn't give me that warm feeling. libc = ctypes.CDLL("libc.so.6") class timex(ctypes.Structure): _fields_ = [ ("modes", ctypes.c_uint), ("offset", ctypes.c_long), ("freq", ctypes.c_long), ("maxerror", ctypes.c_long), ("esterror", ctypes.c_long), ... libc.ntp_adjtime(ctypes.byref(ntp)) On at least one system, I have to change that 6 to a 7. It probably works without the number, The other problem is that there is no check that the layout of the struct matches what is on the system. Is there a preprocessor that would automate building the struct? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Sat Nov 26 21:27:58 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 26 Nov 2016 13:27:58 -0800 Subject: ntpq quirk In-Reply-To: Message from "Eric S. Raymond" of "Sat, 26 Nov 2016 07:03:50 EST." <20161126120350.GB19600@thyrsus.com> Message-ID: <20161126212758.1852F406061@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: >> The old ntpq used to accept any unique prefix of a command. The new >> version doesn't. > I was going to say tab completion is the closest we can come, then I thought > of a kludgy way to fix this by hacking the precmd method. It might not be > portable to Python 3, though. How important do you think this is? I think it's reasonably important. I had a script with commands wired in. I often type things like: ntpq -c mru -- These are my opinions. I hate spam. From Stromeko at nexgo.de Sat Nov 26 21:31:15 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sat, 26 Nov 2016 22:31:15 +0100 Subject: Python ntpq References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> Message-ID: <87h96ux718.fsf@Rainer.invalid> Eric S. 
Raymond writes: > We've tried to > write the NTPsec Python this way, but the packet library is still > under active development on a Python 2 system and that sometimes > breaks Python 3 compatibility. One of our guys, Matt Selsky, is > working on catching it up and I expect fixes will land soon. OK, but if it's finally version-agnostic the ntp sub-directory should not be installed into a version-dependent tree. > No can do. You're going to have to live with using the --wide option. How about giving this option a number argument? Besides, the --help says: -w no wide enable wide display of addresses Well, I'm not giving it the -w / --wide option and it still uses a wide display for the peers. It doesn't use wide display for the rv and cv commands no matter the option. So I'd think it's actually ignoring the option at the moment. > ...or maybe not. Rather klugey fix pushed. Complain if it needs > futher work. Thanks, will check tomorrow. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Wavetables for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables From Stromeko at nexgo.de Sat Nov 26 21:33:38 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sat, 26 Nov 2016 22:33:38 +0100 Subject: ntpd w/ --enable-early-droproot References: <87shqextdb.fsf@Rainer.invalid> <20161126203607.GB7353@thyrsus.com> Message-ID: <87d1hix6x9.fsf@Rainer.invalid> Eric S. Raymond writes: > Achim Gratz : >> Configuring ntpd to drop root early makes it fail to open the refclock >> devices (which are owned by root). I guess they should be readyble by >> group ntp at least on Raspbian, which starts ntpd with that group? > > Yes, they should be. > > Our philosophy in situations like this is to go for the high-security option > even if it needs a little more one-time setup, like a chmod or a udev rule. I'll try that tomorrow as well. 
I have these devices set up by udev anyway, so I only need to figure out how to tell it to give them a different group. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptations for KORG EX-800 and Poly-800MkII V0.9: http://Synth.Stromeko.net/Downloads.html#KorgSDada From hmurray at megapathdsl.net Sat Nov 26 21:51:59 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 26 Nov 2016 13:51:59 -0800 Subject: ntpd w/ --enable-seccomp In-Reply-To: Message from Achim Gratz of "Sat, 26 Nov 2016 14:24:47 +0100." <87wpfqxtk0.fsf@Rainer.invalid> Message-ID: <20161126215159.20AE2406061@ip-64-139-1-69.sjc.megapath.net> Stromeko at nexgo.de said: > I've tried to configure ntpd to use seccomp on Raspian, but it shows an > error on start: > Nov 26 13:10:59 raspberrypi2 ntpd[18805]: sandbox: seccomp_init succeeded > Nov 26 13:10:59 raspberrypi2 ntpd[18805]: sandbox: seccomp_load() failed: > Invalid argument > It then seems to work anyway, but I'm wondering if any configuration might > be missing. That probably means that I'm trying to allow a syscall that isn't supported by the kernel. It's working without seccomp. Thanks for catching that. I'll see if I can track it down. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Sat Nov 26 21:53:05 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 26 Nov 2016 13:53:05 -0800 Subject: *****SPAM***** ntpd w/ --enable-early-droproot In-Reply-To: Message from Achim Gratz of "Sat, 26 Nov 2016 14:28:48 +0100." <87shqextdb.fsf@Rainer.invalid> Message-ID: <20161126215305.D3C6C406061@ip-64-139-1-69.sjc.megapath.net> Stromeko at nexgo.de said: > Configuring ntpd to drop root early makes it fail to open the refclock > devices (which are owned by root). I guess they should be readyble by group > ntp at least on Raspbian, which starts ntpd with that group? Yes. We should probably add a note to the documentation to mention that case. -- These are my opinions. 
I hate spam. From esr at thyrsus.com Sat Nov 26 21:54:30 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 26 Nov 2016 16:54:30 -0500 Subject: ntp.conf new refclock configuration syntax In-Reply-To: <871sxyz8dn.fsf@Rainer.invalid> References: <871sxyz8dn.fsf@Rainer.invalid> Message-ID: <20161126215430.GA10538@thyrsus.com> Achim Gratz : > > I've switched to the new refclock configuration syntax, but it seems > there is either a bug somewhere in the implementation or some mismatch > to the documentation. > > I was trying to set up my NavSpark GPS to use $GPZDA and 115200 baud > like this: > > refclock nmea unit 1 refid NavS mode 8 baud 115200 flag1 1 > > but that actually configured the serial for 0 baud and the GPS stopped > sending data. Manually setting it back to the correct baud rate with > stty recovered operation. Since the old syntax still seemed to work > correctly I've tried to pull the baud configuration into the mode > parameter again and that actually worked: > > refclock nmea unit 1 refid NavS mode 88 flag1 1 I tried reproducing your bug under gdb by adding the NavSpark config line as my only clock. Here's what I got: root at snark:/home/esr/software/ntp-rescue/ntpsec# gdb build/main/ntpd/ntpd [boilerplate skipped] Reading symbols from build/main/ntpd/ntpd...done. (gdb) break nmea_start Breakpoint 1 at 0x45ca50: file ../../ntpd/refclock_nmea.c, line 396. (gdb) r -g -n Starting program: /home/esr/software/ntp-rescue/ntpsec/build/main/ntpd/ntpd -g -n [Thread debugging using libthread_db enabled] Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1". 
11-26T16:22:10 ntpd[10502]: ntpd 0.9.6-9d95f6e Nov 26 2016 16:19:43: Starting 11-26T16:22:10 ntpd[10502]: Command line: /home/esr/software/ntp-rescue/ntpsec/build/main/ntpd/ntpd -g -n 11-26T16:22:10 ntpd[10502]: proto: precision = 0.838 usec (-20) 11-26T16:22:10 ntpd[10502]: successfully locked into RAM 11-26T16:22:10 ntpd[10502]: authreadkeys: reading /usr/local/etc/ntp.keys 11-26T16:22:10 ntpd[10502]: authreadkeys: added 10 keys 11-26T16:22:10 ntpd[10502]: restrict 0.0.0.0: notrap keyword is ignored. 11-26T16:22:10 ntpd[10502]: restrict ::: notrap keyword is ignored. 11-26T16:22:10 ntpd[10502]: Listen and drop on 0 v6wildcard [::]:123 11-26T16:22:10 ntpd[10502]: Listen and drop on 1 v4wildcard 0.0.0.0:123 11-26T16:22:10 ntpd[10502]: Listen normally on 2 lo 127.0.0.1:123 11-26T16:22:10 ntpd[10502]: Listen normally on 3 enp14s0 192.168.1.22:123 11-26T16:22:10 ntpd[10502]: Listen normally on 4 lo [::1]:123 11-26T16:22:10 ntpd[10502]: Listen normally on 5 enp14s0 [2001:470:e34c:2:56a0:50ff:febb:62d0]:123 11-26T16:22:10 ntpd[10502]: Listen normally on 6 enp14s0 [fe80::56a0:50ff:febb:62d0%2]:123 11-26T16:22:10 ntpd[10502]: Listening on routing socket on fd #23 for interface updates Breakpoint 1, nmea_start (unit=1, peer=0x6b2d00 ) at ../../ntpd/refclock_nmea.c:396 396 { (gdb) n 397 struct refclockproc * const pp = peer->procptr; (gdb) n 398 nmea_unit * const up = emalloc_zero(sizeof(*up)); (gdb) n 406 rate = (peer->ttl & NMEA_BAUDRATE_MASK) >> NMEA_BAUDRATE_SHIFT; (gdb) n 409 if (peer->baud) (gdb) p rate $1 = 0 (gdb) p peer->baud $2 = 115200 (gdb) p peer->ttl $3 = 8 (gdb) n 410 rate = peer->baud; (gdb) n 412 switch (rate) { (gdb) n 443 baudrate = B115200; (gdb) n 444 baudtext = "115200"; (gdb) n It looks like the baud rate and mode bits (which go into the otherwise unused ttl field, yes, I know that's ugly) are being passed in correctly and not stepping on each other. Then I see the baud rate being set correctly. 
I would expect pulling the baud rate into the mode field to work, because
here's the logic:

    /* Old style: get baudrate choice from mode byte bits 4/5/6 */
    rate = (peer->ttl & NMEA_BAUDRATE_MASK) >> NMEA_BAUDRATE_SHIFT;

    /* New style: get baudrate from baud option */
    if (peer->baud)
        rate = peer->baud;

Then the following switch is set up to accept either small-integer
values or larger ones in the range 300..115200 and do the right thing.

What's mysterious is how you are coming out of that logic with speed
set to zero.  Please try configuring with

./waf configure --enable-crypto --refclock=all --enable-debug-gdb

and tracing through this yourself.

> The documentation example at the end of the page also looks wrong:
>
> refclock nmea mode baud 19200 # All sentences from /dev/gps0 at 19200 baud
>
> It looks like the mode parameter is missing here (it might actually
> interpret a missing mode as zero, but that should be commented).  It
> would also be good to show the unit syntax in one of the examples for
> each refclock I'd think.

You're right.  I typoed that.  I'll fix it.

> It would be a good thing if there was a way to read in any old
> configuration file and spit it out again in the new syntax.  Any default
> values should be present as a comment so it's easy to find out what else
> might need changing.

That is probably doable.  Not in ntpd itself, I don't want to add the weight
there, but I will poke at a conversion script.
--
Eric S. Raymond

From hmurray at megapathdsl.net Sat Nov 26 22:33:01 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 26 Nov 2016 14:33:01 -0800
Subject: ttl and mode
Message-ID: <20161126223301.A6EE5406061@ip-64-139-1-69.sjc.megapath.net>


esr at thyrsus.com said:
> It looks like the baud rate and mode bits (which go into the otherwise
> unused ttl field, yes, I know that's ugly) ...

We could fix that.

I think it's leftover from a long time ago when memory usage was important.


--
These are my opinions.  I hate spam.
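
As a rough illustration of the baud-rate selection Eric quotes from
refclock_nmea.c in this thread, here is a minimal Python sketch.  It is not
the driver's code.  The mask and shift values (0x70 and 4), the index-to-rate
table, and the list of accepted literal rates are assumptions based on the
classic NMEA driver's documented mode bits 4-6; select_baud is a hypothetical
name used only here.  It shows why "mode 88" on its own and
"mode 8 baud 115200" would both be expected to end up at 115200.

```python
# Assumed values: mode bits 4-6 carry the legacy baud-rate index.
NMEA_BAUDRATE_MASK = 0x70
NMEA_BAUDRATE_SHIFT = 4

# Assumed legacy index -> line speed mapping (old-style "mode" encoding).
LEGACY_RATES = {0: 4800, 1: 9600, 2: 19200, 3: 38400, 4: 57600, 5: 115200}

# Assumed set of literal speeds accepted from the new-style "baud" option.
LITERAL_RATES = (300, 600, 1200, 2400, 4800, 9600,
                 19200, 38400, 57600, 115200)


def select_baud(mode, baud=0):
    """Return the line speed for a given mode value and optional baud option."""
    # Old style: baud-rate choice comes from mode bits 4/5/6.
    rate = (mode & NMEA_BAUDRATE_MASK) >> NMEA_BAUDRATE_SHIFT
    # New style: an explicit baud option overrides the mode bits.
    if baud:
        rate = baud
    # Accept either a small legacy index or a literal speed.
    if rate in LEGACY_RATES:
        return LEGACY_RATES[rate]
    if rate in LITERAL_RATES:
        return rate
    raise ValueError("unsupported baud rate selector: %r" % (rate,))


print(select_baud(88))          # old style: 88 = 0x58, bits 4-6 give index 5
print(select_baud(8, 115200))   # new style: baud option wins over mode bits
```

Under these assumptions neither configuration can yield a speed of zero, which
is consistent with Eric's gdb trace; the zero Achim saw would have to come
from somewhere outside this selection step.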
From hmurray at megapathdsl.net Sat Nov 26 22:43:39 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 26 Nov 2016 14:43:39 -0800 Subject: Python ntpq In-Reply-To: Message from "Eric S. Raymond" of "Sat, 26 Nov 2016 15:01:56 EST." <20161126200156.GA4544@thyrsus.com> Message-ID: <20161126224339.B8C70406061@ip-64-139-1-69.sjc.megapath.net> > No can do. You're going to have to live with using the --wide option. I think you should pay more attention to user feedback, especially when changing existing behavior. It's just as hard to understand stuff that's too wide as it is to figure out what's going on when the names get truncated. Most of my terminal windows are 80 characters wide so I don't see the problem very often. But sometimes I use ntpq on a system that isn't running X (yet). The width depends on the font size, but my typical font makes a line well over 100 characters. Your new format is useless. I think a sensible upper limit would be a reasonable approach. It's at least worth a try. Or a build time option to enable/disable the auto-wide mode. -- These are my opinions. I hate spam. From esr at thyrsus.com Sat Nov 26 22:52:29 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 26 Nov 2016 17:52:29 -0500 Subject: Heads up - directory reorganization coming In-Reply-To: <20161126212424.3F7C2406061@ip-64-139-1-69.sjc.megapath.net> References: <20161126120652.GC19600@thyrsus.com> <20161126212424.3F7C2406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161126225229.GA11261@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > >> Do you have sample code that calls libc and returns a big struct? > > No, but I know exactly how to do it. I'll put this on the to-do list at low > > priority. > > I have code that works using ctypes. It doesn't give me that warm feeling. 
>
> libc = ctypes.CDLL("libc.so.6")
> class timex(ctypes.Structure):
>     _fields_ = [ ("modes", ctypes.c_uint),
>         ("offset", ctypes.c_long),
>         ("freq", ctypes.c_long),
>         ("maxerror", ctypes.c_long),
>         ("esterror", ctypes.c_long),
> ...
>
> libc.ntp_adjtime(ctypes.byref(ntp))
>
> On at least one system, I have to change that 6 to a 7.  It probably works
> without the number,
>
> The other problem is that there is no check that the layout of the struct
> matches what is on the system.
>
> Is there a preprocessor that would automate building the struct?

No.  And this should *absolutely not* give you a warm and fuzzy
feeling; it's a brittle, chancy way to do things which I have
summarily rejected for that reason.

The right way to do it would be to add a wrapper to libnp/pymodule.c
implementing Python access to adjtimex(2).  If you look at that code,
it will be pretty obvious how this is done for simple cases that don't
involve passing a struct to the C call being wrapped.

Here's the missing magic: for structs, you take arguments from
Python-land and pass them to Python's struct.pack(), giving it a
format argument that tells it to mock the C layout of the struct in
the binary blob it hands back.  Then you pass that blob to the
underlying C call.

The trick works in the other direction with struct.unpack, so it's no
big deal to pass the elements of a C struct back out to Python-land.

In both cases, the way the format string is interpreted mimics the same
self-alignment rules for structure padding that NTP has been assuming
since the year zero in the protocol engine.

I use struct.unpack to analyze NTP packets in the back-end code for
the new Python tools.  As an example, here's the format that drives
the analysis of control (mode 6) packets.

    format = "!BBHHHHH"

That says: assume network byte order; peel off two bytes; peel off
five halfwords (16 bits each - this was designed in the 2-bit
era).  The result is returned as a tuple.
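
To make the control-packet format above concrete, here is a minimal,
self-contained sketch of unpacking a mode 6 header with struct.unpack.  It is
an illustration, not the NTPsec packet library.  The field names and the
bit-level split of the first two bytes are assumptions taken from the
published mode 6 header layout (LI/VN/mode byte, R/E/M/opcode byte, then
sequence, status, association ID, offset and count); parse_control_header is
a hypothetical helper name.

```python
import struct

# Network byte order, two single bytes, then five 16-bit halfwords.
CONTROL_FORMAT = "!BBHHHHH"


def parse_control_header(packet):
    """Unpack the fixed header at the front of a control (mode 6) packet."""
    size = struct.calcsize(CONTROL_FORMAT)   # 2 bytes + 5 halfwords
    fields = struct.unpack(CONTROL_FORMAT, packet[:size])
    li_vn_mode, r_e_m_op, sequence, status, associd, offset, count = fields
    return {
        "version": (li_vn_mode >> 3) & 0x7,  # assumed: VN in bits 3-5
        "mode": li_vn_mode & 0x7,            # assumed: mode in bits 0-2
        "opcode": r_e_m_op & 0x1F,           # assumed: opcode in low 5 bits
        "sequence": sequence,
        "status": status,
        "associd": associd,
        "offset": offset,
        "count": count,
    }


# Build a sample header with struct.pack so the example needs no network:
# version 2, mode 6, opcode 2, sequence number 1, everything else zero.
sample = struct.pack(CONTROL_FORMAT, 0x16, 2, 1, 0, 0, 0, 0)
header = parse_control_header(sample)
print(header["version"], header["mode"], header["opcode"], header["sequence"])
```

Because the format string starts with "!", struct inserts no padding, so
the seven fields occupy exactly the first twelve bytes of the packet.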
The corresponding format for sync packets is format = "!BBBBIIIQQQQ" where I requests a 32-bit word and Q a "quad" (64 bits). If you'd like to spread your wings a little, writing that adjtimex wrapper would be a good way for you to begin acquiring some Python. I could do it inside half an hour. I'll be surprised if it takes you more than three, even from a standing start and counting the time to read the documentation for the struct module and Python extensions. -- Eric S. Raymond From esr at thyrsus.com Sat Nov 26 22:54:57 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 26 Nov 2016 17:54:57 -0500 Subject: ttl and mode In-Reply-To: <20161126223301.A6EE5406061@ip-64-139-1-69.sjc.megapath.net> References: <20161126223301.A6EE5406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161126225457.GB11261@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > > It looks like the baud rate and mode bits (which go into the otherwise > > unused ttl field, yes, I know that's ugly) ... > > We could fix that. > > I think it's leftover from a long time ago when memory usage was important. Agreed on both counts. This is the kind of thing I clean up in background when I don't have anything pressing to do. -- Eric S. Raymond From esr at thyrsus.com Sat Nov 26 23:45:46 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 26 Nov 2016 18:45:46 -0500 Subject: ntpq quirk In-Reply-To: <20161126212758.1852F406061@ip-64-139-1-69.sjc.megapath.net> References: <20161126120350.GB19600@thyrsus.com> <20161126212758.1852F406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161126234546.GC11261@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > >> The old ntpq used to accept any unique prefix of a command. The new > >> version doesn't. > > > I was going to say tab completion is the closest we can come, then I thought > > of a kludgy way to fix this by hacking the precmd method. It might not be > > portable to Python 3, though. How important do you think this is? 
>
> I think it's reasonably important.
>
> I had a script with commands wired in.
>
> I often type things like:
> ntpq -c mru

OK, there are two ways we can handle this.

If the set of prefixes in actual use is relatively small and
stereotyped, a handful of explicit aliases will do.  It would be
completely reasonable to add "mru" as an alias for "mrulist", for
example.

If it's not, the problem can be solved in general with a precommand
hook that looks at the first token on the line and does its own
attempt at a unique-prefix match, filling in the right command if it
finds one.

There are two problems with this possibility.  The more serious one is
that it requires introspecting on the members of our cmd.Cmd
instance.  That is an unstable area of the Python API - it changed
significantly between 2 and 3, and might well change again in the
future.  It's the kind of thing where yes, you can do it, but you're
begging for a future maintenance problem if you do.

The less serious, more general problem is simply that it adds
complexity and potential points of failure, something I am reluctant
to do without better reason than I presently think we have here.

Can you audit your script usage to find out what abbreviations you
actually use?
--
Eric S. Raymond

From esr at thyrsus.com Sun Nov 27 00:44:11 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 26 Nov 2016 19:44:11 -0500
Subject: Python ntpq
In-Reply-To: <87h96ux718.fsf@Rainer.invalid>
References: <8760naz99o.fsf@Rainer.invalid>
 <20161126200156.GA4544@thyrsus.com>
 <87h96ux718.fsf@Rainer.invalid>
Message-ID: <20161127004411.GD11261@thyrsus.com>

Achim Gratz :
> Eric S. Raymond writes:
> > We've tried to
> > write the NTPsec Python this way, but the packet library is still
> > under active development on a Python 2 system and that sometimes
> > breaks Python 3 compatibility.  One of our guys, Matt Selsky, is
> > working on catching it up and I expect fixes will land soon.
>
> OK, but if it's finally version-agnostic the ntp sub-directory should
> not be installed into a version-dependent tree.

You have a point.  Unfortunately there is no version-independent place
for Python libraries that the Python interpreter recognizes.  And
there's a good reason for this - the bytecode used in .pyc/.pyo files
(the just-in-time compiled versions of .py files) differs across
versions.

> > No can do.  You're going to have to live with using the --wide option.
>
> How about giving this option a number argument?  Besides, the --help says:
>
> -w no wide enable wide display of addresses

The -w option doesn't actually use more screen width at all.  Instead...uh
oh, it's got a bug.  Fix pushed.

Where was I?  Oh yes:

esr@snark:~/software/ntp-rescue/ntpsec$ ntpq/ntpq -p
 remote             refid           st t when poll reach   delay   offset  jitter
=================================================================================
 us.pool.ntp.org    .POOL.          16 p    -   64    0    0.000    0.000   0.001
-104.131.53.252     209.51.161.238   2 u   51   64  377    8.115    0.195   1.176
-104.156.99.226     192.12.19.20     2 u    1   64  377   80.869    4.499   0.566
-mirror             216.93.242.12    3 u    -   64  377   28.447    0.181   0.691
-mis.wci.com        131.107.13.100   2 u   34   64  377   78.246    2.603   0.893
-cheezum.mattnordho 66.228.59.187    3 u   97  128  377   45.435    2.786   0.814
*level1e.cs.unc.edu .PPS.            1 u    6   64  377   30.781   -1.981   0.678
-66.241.101.63      169.254.0.2      2 u   68  128  377   60.657    2.227   1.192
+helium.constant.co 208.90.144.72    2 u   61   64  377   13.262   -3.405   0.736
-enigma.wiredgoats. 64.142.1.20      2 u   46   64  377   75.871    1.312   1.044
+mail.coldnorthadmi 132.246.11.231   2 u   51   64  377   10.741   -2.608   1.207
esr@snark:~/software/ntp-rescue/ntpsec$ ntpq/ntpq -p -w
 remote             refid           st t when poll reach   delay   offset  jitter
=================================================================================
 us.pool.ntp.org    .POOL.          16 p    -   64    0    0.000    0.000   0.001
-104.131.53.252     209.51.161.238   2 u   53   64  377    8.115    0.195   1.176
-104.156.99.226     192.12.19.20     2 u    3   64  377   80.869    4.499   0.566
-mirror             216.93.242.12    3 u    2   64  377   28.447    0.181   0.691
-mis.wci.com        131.107.13.100   2 u   36   64  377   78.246    2.603   0.893
-cheezum.mattnordhoff.net
                    66.228.59.187    3 u   99  128  377   45.435    2.786   0.814
*level1e.cs.unc.edu .PPS.            1 u    8   64  377   30.781   -1.981   0.678
-66.241.101.63      169.254.0.2      2 u   70  128  377   60.657    2.227   1.192
+helium.constant.com
                    208.90.144.72    2 u   63   64  377   13.262   -3.405   0.736
-enigma.wiredgoats.com
                    64.142.1.20      2 u   48   64  377   75.871    1.312   1.044
+mail.coldnorthadmin.com
                    132.246.11.231   2 u   53   64  377   10.741   -2.608   1.207
esr@snark:~/software/ntp-rescue/ntpsec$

See the difference?  A numeric argument to -w wouldn't mean anything.
Besides, it would be the kind of compatibility break that people yell about.

> Well, I'm not giving it the -w / --wide option and it still uses a wide
> display for the peers.  It doesn't use wide display for the rv and cv
> commands no matter the option.  So I'd think it's actually ignoring the
> option at the moment.

Nope, it's paying attention all right, it's just interpreting "wide" in a way
you're not expecting, that has nothing to do with the screen width.  (We
inherited this behavior and I haven't changed it.)

The only way I can think of to get near to what you want would be to set
a maximum width for the clockname field, implicitly bounding the amount
the display can extend to the right.
--
Eric S. Raymond

From esr at thyrsus.com Sun Nov 27 00:53:17 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 26 Nov 2016 19:53:17 -0500
Subject: Python ntpq
In-Reply-To: <20161126224339.B8C70406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161126200156.GA4544@thyrsus.com>
 <20161126224339.B8C70406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161127005317.GE11261@thyrsus.com>

Hal Murray :
> > No can do.  You're going to have to live with using the --wide option.
>
> I think you should pay more attention to user feedback, especially
> when changing existing behavior.

We have this problem exactly *because* I paid attention to user feedback.
From Phil Salkie.  He requested the stretchy display.

> I think a sensible upper limit would be a reasonable approach.  It's
> at least worth a try.  Or a build time option to enable/disable the
> auto-wide mode.

Ugh.  Not going to do the second, it would be a classic case of lazily
substituting an option for figuring out what the right thing is.

OTOH, I think "sensible upper limit" is reasonable to try.  What would
you propose?  Is there any way to gather statistics on the length
distribution of hostnames, so we can set a threshold that handles 95%
of cases?
--
Eric S. Raymond

From esr at thyrsus.com Sun Nov 27 01:00:51 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 26 Nov 2016 20:00:51 -0500
Subject: ntpd w/ --enable-early-droproot
In-Reply-To: <20161126215305.D3C6C406061@ip-64-139-1-69.sjc.megapath.net>
References: <87shqextdb.fsf@Rainer.invalid>
 <20161126215305.D3C6C406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161127010051.GA15102@thyrsus.com>

Hal Murray :
>
> Stromeko at nexgo.de said:
> > Configuring ntpd to drop root early makes it fail to open the refclock
> > devices (which are owned by root).  I guess they should be readable by group
> > ntp at least on Raspbian, which starts ntpd with that group?
>
> Yes.
>
> We should probably add a note to the documentation to mention that case.

Wherever you documented --early-droproot, I suppose.  I don't know where that
is.
--
Eric S. Raymond

From hmurray at megapathdsl.net Sun Nov 27 01:19:43 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 26 Nov 2016 17:19:43 -0800
Subject: ntpq quirk
In-Reply-To: Message from "Eric S. Raymond" of "Sat, 26 Nov 2016 18:45:46 EST."
<20161126234546.GC11261@thyrsus.com> Message-ID: <20161127011943.C8BDD406061@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: > Can you audit your script usage to find out what abbreviatons you actually > use? Not reasonably. > There are two problems with this possibility. The more serious one is that > it requires introspecting on the members of our cmd.Cmd instance. That is > an unstable area of the Python API - it changed sigificantly between 2 and > 3, and might well change again in the future. It's the kind of thing where > yes, you can do it, but you're begging for a future maintainence problem if > you do. Can the code determine at run time if it's running on python 2 or 3? If so, I think we should consider maintaining 2 versions of the lookup code. > The less serious, more general problem is simply that it adds complexity and > potential points of failure, something I am reluctant to do without better > reason than I presently think we have here. I'm a bit surprised that the python crowd hasn't implemented something like this. It seems handy, but yes, it breaks learned behavior when somebody adds a new command that collides. ------------ I don't think changing the UI is as big a deal as handling it correctly. We've already nuked lots of features for various reasons. Burying it in NEWS isn't good enough. I think we need a separate document that summarizes the changes with pointers to NEWS (or some other file) for the details. Besides, NEWS is chronological. That just adds a layer of obfuscation. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Sun Nov 27 01:25:57 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 26 Nov 2016 17:25:57 -0800 Subject: ntpd w/ --enable-early-droproot In-Reply-To: Message from "Eric S. Raymond" of "Sat, 26 Nov 2016 20:00:51 EST." 
<20161127010051.GA15102@thyrsus.com> Message-ID: <20161127012558.046DA406061@ip-64-139-1-69.sjc.megapath.net> esr at thyrsus.com said: >> We should probably add a note to the documentation to mention that case. > Wherever you documented --early-droproot, I suppose. I don't know where that > is. Interesting point. I don't think we have a place where configure time options are discussed. Where should that go? What other configure options that would benefit from a few words of explanation? seccomp is one. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Sun Nov 27 01:42:52 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sat, 26 Nov 2016 17:42:52 -0800 Subject: Python ntpq In-Reply-To: Message from "Eric S. Raymond" of "Sat, 26 Nov 2016 19:53:17 EST." <20161127005317.GE11261@thyrsus.com> Message-ID: <20161127014252.28750406061@ip-64-139-1-69.sjc.megapath.net> > OTOH, I think "sensible upper limit" is reasonable to try. What would you > propose? Is there any way to gather statistics on the length distribution > of hostmes, so we can set a threshold that handles 95% of cases? We could write a script that does DNS lookups on the pool servers, reverses those addresses, and then does whatever statistics you want. You will get different statistics if you do/don't eliminate duplicates. A quick poking around finds things like c-24-15-80-185.hsd1.il.comcast.net 4e.f4.36a9.ip4.static.sl-reverse.com I don't have any IPv6 examples handy. How about you set things up for 25 or 30 and we can see how that feels while I collect data? Another approach would be to make it big enough for the worst case IPv6 numerical printout. -- These are my opinions. I hate spam. From esr at thyrsus.com Sun Nov 27 02:46:26 2016 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Sat, 26 Nov 2016 21:46:26 -0500 Subject: ntpd w/ --enable-early-droproot In-Reply-To: <20161127012558.046DA406061@ip-64-139-1-69.sjc.megapath.net> References: <20161127010051.GA15102@thyrsus.com> <20161127012558.046DA406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161127024626.GA17008@thyrsus.com> Hal Murray : > > esr at thyrsus.com said: > >> We should probably add a note to the documentation to mention that case. > > Wherever you documented --early-droproot, I suppose. I don't know where that > > is. > > Interesting point. I don't think we have a place where configure time > options are discussed. Where should that go? Er, I guess in the INSTALL file for now. We'll probably want to spin it out to its own page sometime. > What other configure options that would benefit from a few words of > explanation? seccomp is one. --refclock, --enable-docs, and --enable-crypto leap to mind. -- Eric S. Raymond From esr at thyrsus.com Sun Nov 27 02:58:42 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 26 Nov 2016 21:58:42 -0500 Subject: Python ntpq In-Reply-To: <20161127014252.28750406061@ip-64-139-1-69.sjc.megapath.net> References: <20161127005317.GE11261@thyrsus.com> <20161127014252.28750406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161127025842.GB17008@thyrsus.com> Hal Murray : > > > OTOH, I think "sensible upper limit" is reasonable to try. What would you > > propose? Is there any way to gather statistics on the length distribution > > of hostmes, so we can set a threshold that handles 95% of cases? > > We could write a script that does DNS lookups on the pool servers, reverses > those addresses, and then does whatever statistics you want. You will get > different statistics if you do/don't eliminate duplicates. Yes. But no big deal to collect both. > A quick poking around finds things like > c-24-15-80-185.hsd1.il.comcast.net > 4e.f4.36a9.ip4.static.sl-reverse.com > > I don't have any IPv6 examples handy. 
>
> How about you set things up for 25 or 30 and we can see how that feels while
> I collect data?

That works for me, but...

> Another approach would be to make it big enough for the worst case IPv6
> numerical printout.

...I like that better, because we can document it as a principled limit
rather than an arbitrary magic number.

Looks to me like the number 39 would work. That's 8 groups of 4 hex digits
plus seven colon separators.

Your examples, which are on the long side but not atypical for ISP hostnames,
are 34 and 36 characters. I don't think I often see anything longer than
that. Which gives me a good feeling about this number; I bet we'll find it
is 2 STDs above mean.
--
Eric S. Raymond

From esr at thyrsus.com Sun Nov 27 03:17:21 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 26 Nov 2016 22:17:21 -0500
Subject: Python ntpq
In-Reply-To: <20161127025842.GB17008@thyrsus.com>
References: <20161127005317.GE11261@thyrsus.com> <20161127014252.28750406061@ip-64-139-1-69.sjc.megapath.net> <20161127025842.GB17008@thyrsus.com>
Message-ID: <20161127031721.GA17705@thyrsus.com>

Hal Murray :
> How about you set things up for 25 or 30 and we can see how that feels while
> I collect data?

The NTPsec project is about to have a blog. I think some visualizations
of the distribution (e.g. with and without dups) would make a neat
centerpiece for a post there.

Will you slap me with a mackerel if I title it "The 39 Steps"?
--
Eric S. Raymond

From hmurray at megapathdsl.net Sun Nov 27 04:00:50 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sat, 26 Nov 2016 20:00:50 -0800
Subject: Python ntpq
In-Reply-To: Message from "Eric S. Raymond" of "Sat, 26 Nov 2016 21:58:42 EST." <20161127025842.GB17008@thyrsus.com>
Message-ID: <20161127040050.2D3DC406061@ip-64-139-1-69.sjc.megapath.net>

esr at thyrsus.com said:
> Looks to me like the number 39 would work. That's 8 groups of 4 hex digits
> plus seven colon separators.

I'm collecting data: addresses and names (if available).
It will need post processing if you want statistics. I've seen a sample IPv6 numerical address that is 1 character short of worst case. -- These are my opinions. I hate spam. From esr at thyrsus.com Sun Nov 27 04:50:35 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 26 Nov 2016 23:50:35 -0500 Subject: Python ntpq In-Reply-To: <20161127040050.2D3DC406061@ip-64-139-1-69.sjc.megapath.net> References: <20161127025842.GB17008@thyrsus.com> <20161127040050.2D3DC406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161127045035.GA19174@thyrsus.com> Hal Murray : > I'm collecting data: addresses and names (if available). It will need post > processing if you want statistics. gnuplot is our friend for situations like this -- Eric S. Raymond From Stromeko at nexgo.de Sun Nov 27 08:14:11 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 27 Nov 2016 09:14:11 +0100 Subject: Python ntpq References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87h96ux718.fsf@Rainer.invalid> <20161127004411.GD11261@thyrsus.com> Message-ID: <87oa11nxv0.fsf@Rainer.invalid> Eric S. Raymond writes: > Achim Gratz : >> Eric S. Raymond writes: >> > We've tried to >> > write the NTPsec Python this way, but the packet library is still >> > under active development on a Python 2 system and that sometimes >> > breaks Python 3 compatibility. One of our guys, Matt Selsky, is >> > working on catching it up and I expect fixes will land soon. >> >> OK, but if it's finally version-agnostic the ntp sub-directory should >> not be installed into a version-dependent tree. > > You have a point. Unfortunately there is no version-independent place > for Python libraries that the Python interpreter recognizes. And > there's a good reason for this - the bytecode used in .pyc/.pyo files > (the just-in-time compiled versions of .py files) differs across > versions. 
Then it's not appropriate to make the choice of interpreter dependent on the vagaries of the PATH setting at the time of invocation as you do now by using env to find the binary. Especially in the case the user built ntp with a specific python binary it might not be in path at all for the user(s) invoking the scripts. > See the difference? A numeric argument to -w wouldn't mean anything. Besides, > it would be the kind of compatibility break that people yell about. Well, here I was hoping that by giving the -w option it would switch back to using a fixed width display since it's now again having a whole line for the variable length component to play with. > Nope, it's paying attention all right, it's just interpreting "wide" > in a way you're not expecting, that has nothing to do with the screen > width. (We inherited this behavior and I haven't changed it.) > > The only way I can think of to get near to what you want would be to set > a maximum width for the clockname field, implicitly bounding the amount the > display can extent to the right. Yes, and please provide an option to set that width instead of trying to figure out a number that works for everyone. Well, you can do that too, but I can guarantee that whatever number you come up with doesn't work for at least one person. There is actually a much better solution: move the variable length field to the end of the line. It also needs an option since it breaks compatibility (the wide display already does anyway since you can't parse the output using fixed columns anymore). Regards, Achim. 
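For concreteness, here is a minimal sketch of the bounded stretchy column under discussion. It is not taken from the ntpq source: clamp_name() and name_column_width() are hypothetical helpers, the floor of 15 assumes the classic fixed-width remote column, and the default cap of 39 is the worst-case IPv6 text length floated earlier in this thread. Passing a different limit corresponds to the user-settable width asked for above.

```python
# Hypothetical helpers for illustration only; not part of ntpq or ntp.util.

# Worst-case textual IPv6 address: eight groups of four hex digits
# separated by seven colons.
IPV6_MAX_TEXT = 8 * 4 + 7


def clamp_name(name, limit=IPV6_MAX_TEXT):
    "Return name unchanged if it fits within limit, else truncate it."
    if len(name) <= limit:
        return name
    return name[:limit]


def name_column_width(names, limit=IPV6_MAX_TEXT, floor=15):
    "Width of a stretchy name column: longest name, kept within [floor, limit]."
    longest = max([len(n) for n in names] or [0])
    return max(floor, min(longest, limit))


if __name__ == "__main__":
    hosts = ["c-24-15-80-185.hsd1.il.comcast.net",
             "4e.f4.36a9.ip4.static.sl-reverse.com"]
    width = name_column_width(hosts)
    for host in hosts:
        # Left-justify each (possibly truncated) name in the column.
        print("%-*s|" % (width, clamp_name(host)))
```

With the two sample hostnames from this thread the column stretches to 36 and nothing is truncated; only names longer than the cap get cut.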
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf rackAttack:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

From Stromeko at nexgo.de Sun Nov 27 08:50:01 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 27 Nov 2016 09:50:01 +0100
Subject: ntp.conf new refclock configuration syntax
References: <871sxyz8dn.fsf@Rainer.invalid> <20161126215430.GA10538@thyrsus.com>
Message-ID: <87inr9nw7a.fsf@Rainer.invalid>

Eric S. Raymond writes:
> I would expect pulling the baud rate into the mode field to work, because
> here's the logic:
>
>     /* Old style: get baudrate choice from mode byte bits 4/5/6 */
>     rate = (peer->ttl & NMEA_BAUDRATE_MASK) >> NMEA_BAUDRATE_SHIFT;
>
>     /* New style: get baudrate from baud option */
>     if (peer->baud)
>         rate = peer->baud;
>
> Then the following switch is set up to accept either small-integer values
> or larger ones in the range 300..115200 and do the right thing. What's
> mysterious is how you are coming out of that logic with speed set to zero.

The logic inside the switch is good, you fail while opening the device
25 lines down. The driver needs a code for the selected baudrate, not
the baudrate itself as an integer number. It only gets that code when
it's not using the baud parameter. Why the baudrate gets effectively
set to zero as a result I don't know, but I don't care much about it.

> Please try configuring with
>
> ./waf configure --enable-crypto --refclock=all --enable-debug-gdb
>
> and tracing through this yourself.

Luckily no need for that, this patch fixes it:

--- a/ntpd/refclock_nmea.c
+++ b/ntpd/refclock_nmea.c
@@ -475,7 +475,7 @@ nmea_start(
     /* Open serial port. Use CLK line discipline, if available. */
     snprintf(device, sizeof(device), DEVICE, unit);
     pp->io.fd = refclock_open(peer->path ? peer->path : device,
-                              peer->baud ? peer->baud : baudrate,
+                              baudrate,
                               LDISC_CLK);
     if (0 >= pp->io.fd) {
         pp->io.fd = nmead_open(device);

Regards,
Achim.
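To illustrate the distinction between a baud rate and the speed code a serial driver wants, here is a small Python sketch using the standard termios module (Unix only). It is an analogy, not the refclock code: baud_to_speed_code() is a hypothetical helper, and the assumption is simply that the platform names its speed constants B300, B9600, B115200 and so on. On Linux those constants are small codes rather than the rates themselves, which is the kind of mismatch the patch above avoids.

```python
import termios


def baud_to_speed_code(baud):
    """Translate an integer baud rate into the termios speed constant.

    Hypothetical helper for illustration. Returns None when the platform
    defines no constant for the requested rate.
    """
    # termios exposes its speed codes as attributes named B<rate>.
    return getattr(termios, "B%d" % baud, None)


if __name__ == "__main__":
    for rate in (4800, 9600, 115200, 12345):
        print(rate, "->", baud_to_speed_code(rate))
```

A caller would pass the returned code, not the integer rate, to whatever sets the line speed, and treat None as an unsupported rate.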
-- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf Q+, Q and microQ: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From Stromeko at nexgo.de Sun Nov 27 08:51:44 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 27 Nov 2016 09:51:44 +0100 Subject: ntpd w/ --enable-seccomp References: <20161126215159.20AE2406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <87eg1xnw4f.fsf@Rainer.invalid> Hal Murray writes: > Stromeko at nexgo.de said: >> I've tried to configure ntpd to use seccomp on Raspian, but it shows an >> error on start: > >> Nov 26 13:10:59 raspberrypi2 ntpd[18805]: sandbox: seccomp_init succeeded >> Nov 26 13:10:59 raspberrypi2 ntpd[18805]: sandbox: seccomp_load() failed: >> Invalid argument > >> It then seems to work anyway, but I'm wondering if any configuration might >> be missing. > > That probably means that I'm trying to allow a syscall that isn't supported > by the kernel. It's working without seccomp. > > Thanks for catching that. I'll see if I can track it down. Let me know if you can reproduce or not. I'm using: Linux raspberrypi2 4.4.32-v7+ #924 SMP Tue Nov 15 18:11:28 GMT 2016 armv7l GNU/Linux Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Wavetables for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables From Stromeko at nexgo.de Sun Nov 27 08:55:41 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 27 Nov 2016 09:55:41 +0100 Subject: ntpq quirk References: <20161126120350.GB19600@thyrsus.com> <20161126212758.1852F406061@ip-64-139-1-69.sjc.megapath.net> <20161126234546.GC11261@thyrsus.com> Message-ID: <87a8clnvxu.fsf@Rainer.invalid> Eric S. Raymond writes: > OK, there are two ways we can handle this. The third way is building a hash table that maps each possible unique prefix to the actual full-length command. Regards, Achim. 
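A minimal sketch of such a prefix table, assuming the full command names are available as a plain list of strings. build_prefix_table() is a hypothetical helper, not anything in ntpq or cmd.Cmd, and how the list would be obtained from the interpreter is deliberately left open.

```python
def build_prefix_table(commands):
    "Map every unambiguous prefix to its full command name."
    table = {}
    ambiguous = set()
    for cmd in commands:
        for end in range(1, len(cmd) + 1):
            prefix = cmd[:end]
            if prefix in ambiguous:
                continue
            if prefix in table and table[prefix] != cmd:
                # Two different commands share this prefix, so
                # neither of them may claim it.
                del table[prefix]
                ambiguous.add(prefix)
            else:
                table[prefix] = cmd
    # A full command name always resolves to itself, even when it is
    # also a prefix of some longer command.
    for cmd in commands:
        table[cmd] = cmd
    return table


if __name__ == "__main__":
    names = ["peers", "pstats", "passociations", "readvar", "rv", "mrulist"]
    table = build_prefix_table(names)
    for typed in ("pe", "mru", "rv", "p"):
        print(typed, "->", table.get(typed, "ambiguous or unknown"))
```

The table is built once at startup, so lookup is a single dictionary access. The cost is the one already raised in this thread: adding a new command can silently make a previously unique abbreviation ambiguous.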
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Samples for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra

From Stromeko at nexgo.de Sun Nov 27 09:37:37 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 27 Nov 2016 10:37:37 +0100
Subject: ntpd w/ --enable-early-droproot
References: <87shqextdb.fsf@Rainer.invalid> <20161126203607.GB7353@thyrsus.com> <87d1hix6x9.fsf@Rainer.invalid>
Message-ID: <8760n9ntzy.fsf@Rainer.invalid>

Achim Gratz writes:
>> Our philosophy in situations like this is to go for the high-security option
>> even if it needs a little more one-time setup, like a chmod or a udev rule.
>
> I'll try that tomorrow as well. I have these devices set up by udev
> anyway, so I only need to figure out how to tell it to give them a
> different group.

Adding 'GROUP="ntp"' to the udev rules setting up the device symlinks
correctly changes the actual device files' group to ntp and lets ntpd
use these devices while --enable-early-droproot is configured.

[what markup language is INSTALL in?]

--- a/INSTALL
+++ b/INSTALL
@@ -226,6 +226,15 @@ of options.
 refclocks are enabled with `--refclock= or --refclock=all'
 `waf configure --list' will print a list of available refclocks.
 
+=== --enable-early-droproot ===
+
+Drop root privileges as early as possible. This requires the refclock
+devices to be owned by the same owner or group that ntpd will be
+running under (most likely that group will be named "ntp") so that it
+can still open the devices. This can be accomplished by adding
+`GROUP="ntp"` or `OWNER="ntp"` to the udev rules that create the
+device symlinks for the refclocks.
+
 == Developer options ==
 
 --enable-debug-gdb::

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptations for KORG EX-800 and Poly-800MkII V0.9:
http://Synth.Stromeko.net/Downloads.html#KorgSDada

From esr at thyrsus.com Sun Nov 27 09:52:07 2016
From: esr at thyrsus.com (Eric S.
Raymond) Date: Sun, 27 Nov 2016 04:52:07 -0500 Subject: ntp.conf new refclock configuration syntax In-Reply-To: <87inr9nw7a.fsf@Rainer.invalid> References: <871sxyz8dn.fsf@Rainer.invalid> <20161126215430.GA10538@thyrsus.com> <87inr9nw7a.fsf@Rainer.invalid> Message-ID: <20161127095207.GA23805@thyrsus.com> Achim Gratz : > Luckily no need for that, this patch fixes it: Thanks! -- Eric S. Raymond From esr at thyrsus.com Sun Nov 27 09:53:33 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 27 Nov 2016 04:53:33 -0500 Subject: ntpq quirk In-Reply-To: <87a8clnvxu.fsf@Rainer.invalid> References: <20161126120350.GB19600@thyrsus.com> <20161126212758.1852F406061@ip-64-139-1-69.sjc.megapath.net> <20161126234546.GC11261@thyrsus.com> <87a8clnvxu.fsf@Rainer.invalid> Message-ID: <20161127095333.GB23805@thyrsus.com> Achim Gratz : > Eric S. Raymond writes: > > OK, there are two ways we can handle this. > > The third way is building a hash table that maps each possible unique > prefix to the actual full-length command. That seems like a detail in how the second approach would be implemented. -- Eric S. Raymond From esr at thyrsus.com Sun Nov 27 09:55:34 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 27 Nov 2016 04:55:34 -0500 Subject: ntpd w/ --enable-early-droproot In-Reply-To: <8760n9ntzy.fsf@Rainer.invalid> References: <87shqextdb.fsf@Rainer.invalid> <20161126203607.GB7353@thyrsus.com> <87d1hix6x9.fsf@Rainer.invalid> <8760n9ntzy.fsf@Rainer.invalid> Message-ID: <20161127095534.GC23805@thyrsus.com> Achim Gratz : > Adding 'GROUP="ntp"' to the udev rules setting up the device symlinks > correctly changes the actual device files' group to ntp and lets ntpd > use these devices while --enable-early-droproot is configured. Patch applied. > [what markup language is INSTALL in?] asciidoc. We use it for everything, both docs internal to the tree and generating the website. -- Eric S. 
Raymond From hmurray at megapathdsl.net Sun Nov 27 10:14:59 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 27 Nov 2016 02:14:59 -0800 Subject: ntpd w/ --enable-early-droproot In-Reply-To: Message from Achim Gratz of "Sun, 27 Nov 2016 10:37:37 +0100." <8760n9ntzy.fsf@Rainer.invalid> Message-ID: <20161127101459.481EF406061@ip-64-139-1-69.sjc.megapath.net> > [what markup language is INSTALL in?] asciidoc Thanks for the update. Do you have examples of a full udev rule? There is another worm in this area. The PPS stuff for serial ports in Linux needs some magic like: ldattach 18 /dev/ttyS6 Does anybody have a udev recipe to set that up? -- These are my opinions. I hate spam. From esr at thyrsus.com Sun Nov 27 10:17:53 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 27 Nov 2016 05:17:53 -0500 Subject: Python ntpq In-Reply-To: <87oa11nxv0.fsf@Rainer.invalid> References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87h96ux718.fsf@Rainer.invalid> <20161127004411.GD11261@thyrsus.com> <87oa11nxv0.fsf@Rainer.invalid> Message-ID: <20161127101753.GA24214@thyrsus.com> Achim Gratz : > > You have a point. Unfortunately there is no version-independent place > > for Python libraries that the Python interpreter recognizes. And > > there's a good reason for this - the bytecode used in .pyc/.pyo files > > (the just-in-time compiled versions of .py files) differs across > > versions. > > Then it's not appropriate to make the choice of interpreter dependent on > the vagaries of the PATH setting at the time of invocation as you do now > by using env to find the binary. Especially in the case the user built > ntp with a specific python binary it might not be in path at all for the > user(s) invoking the scripts. Your point has merit. But we don't know of a better option. The theoretical workarounds, like hacking hashbang lines at install time, turn out to be horribly brittle in practice. The Python devs are not stupid. 
If there were a painless solution to this problem it would be
deployed already...

> > The only way I can think of to get near to what you want would be to set
> > a maximum width for the clockname field, implicitly bounding the amount the
> > display can extend to the right.
>
> Yes, and please provide an option to set that width instead of trying to
> figure out a number that works for everyone. Well, you can do that too,
> but I can guarantee that whatever number you come up with doesn't work
> for at least one person.

"One person" I don't care about. I might if the --wide option didn't exist,
but since it does nobody is ever actually stuck with a truncating display.

I'll wait to see what the statistics on length distribution look like
before complicating things. I'm in general not a fan of adding options
as a substitute for figuring out what the right thing is. It increases
test and documentation complexity too much when you do that casually.

> There is actually a much better solution: move the variable length field
> to the end of the line.

In theory, yes. But not as a default - NTP users are conservative and
I don't want to deal with the grief I think I would get if I made that
large a change to the default layout.

> It also needs an option since it breaks
> compatibility (the wide display already does anyway since you can't
> parse the output using fixed columns anymore).

I'm not buying the second clause as a real problem. Nobody does that
kind of parsing with actual fixed fields as an assumption, not when
all modern scripting languages make parsing by whitespace-separation
trivial.
--
Eric S. Raymond

From Stromeko at nexgo.de Sun Nov 27 10:20:39 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 27 Nov 2016 11:20:39 +0100
Subject: ntpd w/ --enable-early-droproot
References: <20161127101459.481EF406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <871sxxns08.fsf@Rainer.invalid>

Hal Murray writes:

> There is another worm in this area.
The PPS stuff for serial ports in Linux > needs some magic like: > ldattach 18 /dev/ttyS6 > Does anybody have a udev recipe to set that up? --8<---------------cut here---------------start------------->8--- KERNEL=="pps0" SYMLINK+="gpspps1" GROUP="ntp" KERNEL=="ttyAMA0" RUN+="/bin/setserial /dev/%k low_latency" RUN+="/bin/stty -F /dev/%k 115200" SYMLINK+="navspark-%n" SYMLINK+="gps1" GROUP="ntp" --8<---------------cut here---------------end--------------->8--- You'd add another RUN+="" command to the rule setting up the serial device. It doesn't matter if you write that as multiple lines or in a single line as the example above. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From Stromeko at nexgo.de Sun Nov 27 10:22:19 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 27 Nov 2016 11:22:19 +0100 Subject: Python ntpq References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87h96ux718.fsf@Rainer.invalid> <20161127004411.GD11261@thyrsus.com> <87oa11nxv0.fsf@Rainer.invalid> Message-ID: <87wpfpmdd0.fsf@Rainer.invalid> Achim Gratz writes: > Yes, and please provide an option to set that width instead of trying to > figure out a number that works for everyone. Well, you can do that too, > but I can guarantee that whatever number you come up with doesn't work > for at least one person. Like this perhaps: --- a/ntpq/ntpq +++ b/ntpq/ntpq @@ -156,6 +156,7 @@ class Ntpq(cmd.Cmd): self.pktversion = ntp.packet.NTP_OLDVERSION + 1 self.uservars = collections.OrderedDict() self.ai_family = socket.AF_UNSPEC + self.termwidth = ntp.util.termsize()[1] def emptyline(self): "Called when an empty line is entered in response to the prompt." 
@@ -288,7 +289,7 @@ usage: help [ command ] if not self.__dogetassoc(): return if self.showhostnames: - termwidth = ntp.util.termsize()[1] + termwidth = interpreter.termwidth else: termwidth = None # Default width report = ntp.util.PeerSummary(mode, @@ -1501,7 +1502,10 @@ USAGE: ntpq [-46dphinOV] [-c str] [-D lvl] [ host ...] peers -n no numeric numeric host addresses -V opt version Output version information and exit - -w no wide enable wide display of addresses + -w no wide enable wide display of addresses / hosts + on a separate line + -W no width force output width to this value instead of + querying the terminal size ''' if __name__ == '__main__': @@ -1510,12 +1514,12 @@ if __name__ == '__main__': try: (options, arguments) = getopt.getopt(sys.argv[1:], - "46c:dD:hinpVw", + "46c:dD:hinpVwW:", ["ipv4","ipv6", "command=", "debug", "set-debug-level=", "help", "interactive", "numeric", "peers", "version", - "wide"]) + "wide", "width="]) except getopt.GetoptError as e: sys.stderr.write("%s\n" % e) sys.stderr.write(usage) @@ -1556,6 +1560,8 @@ if __name__ == '__main__': raise SystemExit(0) elif switch in ("-w", "--wide"): interpreter.wideremote = True + elif switch in ("-W", "--width"): + interpreter.termwidth = int(val) if interpreter.interactive and len(interpreter.ccmds) > 0: interpreter.warn("%s: invalid option combination.\n" % progname) Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptation for Waldorf Blofeld V1.15B11: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From Stromeko at nexgo.de Sun Nov 27 10:48:31 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 27 Nov 2016 11:48:31 +0100 Subject: Python ntpq References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87h96ux718.fsf@Rainer.invalid> <20161127004411.GD11261@thyrsus.com> <87oa11nxv0.fsf@Rainer.invalid> <20161127101753.GA24214@thyrsus.com> Message-ID: <87shqdmc5c.fsf@Rainer.invalid> Eric S. 
Raymond writes: > I'm not buying the second clause as a real problem. Nobody does that > kind of parsing with actual fixed fields as an assumption, not when > all modern scripting language make parsing by whitespace-separation > trivial. If I had a nickel for each time somebody said "Nobody does that?" and then it turned out to be wrong, I'd be the richest man on earth. I've said it often enough myself and I was wrong every time. If it can be done at all, somebody somewhere will have done it already. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptation for Waldorf rackAttack V1.04R1: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From Stromeko at nexgo.de Sun Nov 27 11:10:49 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 27 Nov 2016 12:10:49 +0100 Subject: Python ntpq References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87h96ux718.fsf@Rainer.invalid> Message-ID: <87oa11mb46.fsf@Rainer.invalid> Achim Gratz writes: > It doesn't use wide display for the rv and cv commands no matter the > option. That turned out to be caused by a hardcoded width: --- a/ntpq/ntpq +++ b/ntpq/ntpq @@ -443,7 +444,7 @@ usage: help [ command ] lastcount = 0 else: lastcount += 1 - if lastcount + len(item) > 78: + if lastcount + len(item) > self.termwidth: text = text[:-1] + "\n" text += item text = text[:-2] + "\n" Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Wavetables for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables From Stromeko at nexgo.de Sun Nov 27 11:46:57 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 27 Nov 2016 12:46:57 +0100 Subject: Python ntpq References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> Message-ID: <87k2bpm9fy.fsf@Rainer.invalid> Eric S. Raymond writes: > Oh, bletch. I see the problem. 
I'm going to have to write a real > parser rather than using a naive split(",") call. > > ...or maybe not. Rather klugey fix pushed. Complain if it needs > futher work. Current master is mysteriously broken: pi at raspberrypi2:~/ntpsec $ ntpq/ntpq -w -c rv -p status=0415 leap_none, sync_uhf_radio, 1 event, clock_sync, version="ntpd 0.9.6-6d3838e Nov 26 2016 12:56:44", processor="armv7l", system="Linux/4.4.32-v7+", leap=00, stratum=1, precision=-19, rootdelay=0.0, rootdisp=1.0, refid=NavS, reftime=dbe5442c.fb7432a8 2016-11-27T12:38:20.982, clock=dbe5442d.6dc8ea75 2016-11-27T12:38:21.428, peer=14813, tc=4, mintc=0, offset=0.000306, frequency=-3.143, sys_jitter=0.000713, clk_jitter=0.001, clk_wander=0.0 remote refid st t when poll reach delay offset jitter ====================================================================================================== Traceback (most recent call last): File "ntpq/ntpq", line 1590, in interpreter.onecmd(cmd) File "/usr/lib/python2.7/cmd.py", line 221, in onecmd return func(arg) File "ntpq/ntpq", line 1123, in do_peers self.__dopeers(showall=False, mode="peers") File "ntpq/ntpq", line 334, in __dopeers variables, peer.associd)) File "/usr/lib/python2.7/dist-packages/ntp/util.py", line 208, in summary if hmode == MODE_BCLIENT: NameError: global name 'MODE_BCLIENT' is not defined That constant should be in ntp_magic, which is imported for all I can tell. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From hmurray at megapathdsl.net Sun Nov 27 11:58:07 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Sun, 27 Nov 2016 03:58:07 -0800 Subject: ntpd w/ --enable-seccomp In-Reply-To: Message from Achim Gratz of "Sun, 27 Nov 2016 09:51:44 +0100." 
<87eg1xnw4f.fsf@Rainer.invalid> Message-ID: <20161127115807.50288406061@ip-64-139-1-69.sjc.megapath.net> Stromeko at nexgo.de said: > Let me know if you can reproduce or not. I'm using: Linux raspberrypi2 > 4.4.32-v7+ #924 SMP Tue Nov 15 18:11:28 GMT 2016 armv7l GNU/Linux I can easily reproduce it, but I can't find what I need. The way that seccomp works is that you build a big table of all the allowed syscalls and pass that to the kernel. If you miss one, you get an illegal instruction signal. Normally, I just run ntpd from gdb: run -n -u ntp:ntp when it crashes, a backtrace usually tells me the name of the system call. It may help to comment out the signal_no_reset line in ntpd/ntp_sandbox.c I need the __NR_xxx for the system call. I'm seeing: (gdb) bt #0 _armv7_tick () at armv4cpuid.S:17 #1 0x76da84b4 in OPENSSL_cpuid_setup () at armcap.c:75 #2 0x76fdeffc in call_init (l=, argc=4, argv=0x7efffc94, env=0x7efffca8) at dl-init.c:78 #3 0x76fdf0d8 in _dl_init (main_map=0x76fff958, argc=4, argv=0x7efffc94, env=0x7efffca8) at dl-init.c:126 #4 0x76fcfd84 in _dl_start_user () from /lib/ld-linux-armhf.so.3 Backtrace stopped: previous frame identical to this frame (corrupt stack?) I assume that _armv7_tick is the system call, but I can't translate that to what we need. This looks like initialization for openssl/libcrypto We may be able to work around this by calling some crypto routine to make the initialization happen earlier. -- These are my opinions. I hate spam. From Stromeko at nexgo.de Sun Nov 27 12:02:45 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 27 Nov 2016 13:02:45 +0100 Subject: Python ntpq References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87k2bpm9fy.fsf@Rainer.invalid> Message-ID: <87eg1xm8pm.fsf@Rainer.invalid> Achim Gratz writes: > Eric S. Raymond writes: >> Oh, bletch. I see the problem. I'm going to have to write a real >> parser rather than using a naive split(",") call. >> >> ...or maybe not. 
Rather klugey fix pushed. Complain if it needs >> futher work. > > Current master is mysteriously broken: Bisecting to: [82220634479a5356533b41e597b8525220e88c94] In pyntpdig, make repr() for packets work. Backing out this change makes it work again: --- a/pylib/packet.py +++ b/pylib/packet.py @@ -183,7 +183,7 @@ from __future__ import print_function, division import sys, socket, select, struct, collections, string import getpass, hashlib, time from ntp.ntpc import lfptofloat -import ntp.util + # General notes on Python 2/3 compatibility: # Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From esr at thyrsus.com Sun Nov 27 12:07:20 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Sun, 27 Nov 2016 07:07:20 -0500 Subject: Python ntpq In-Reply-To: <87oa11mb46.fsf@Rainer.invalid> References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87h96ux718.fsf@Rainer.invalid> <87oa11mb46.fsf@Rainer.invalid> Message-ID: <20161127120720.GA26435@thyrsus.com> Achim Gratz : > Achim Gratz writes: > > It doesn't use wide display for the rv and cv commands no matter the > > option. > > That turned out to be caused by a hardcoded width: > > --- a/ntpq/ntpq > +++ b/ntpq/ntpq > @@ -443,7 +444,7 @@ usage: help [ command ] > lastcount = 0 > else: > lastcount += 1 > - if lastcount + len(item) > 78: > + if lastcount + len(item) > self.termwidth: > text = text[:-1] + "\n" > text += item > text = text[:-2] + "\n" Actually, self.termwidth - 2. But yes. C ntpq had *all* its widths fixed like that. -- Eric S. Raymond From Stromeko at nexgo.de Sun Nov 27 12:24:05 2016 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 27 Nov 2016 13:24:05 +0100 Subject: Python ntpq References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> Message-ID: <87a8clm7q2.fsf@Rainer.invalid> Eric S. 
Raymond writes:
> ...or maybe not. Rather klugey fix pushed. Complain if it needs
> further work.

Breaks as follows:

pi@raspberrypi2:~/ntpsec $ ntpq/ntpq -p -c "cv &1" -c "cv &2"
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 NMEA(0)         .uBx8.           0 l   10   16  377    0.000   -4.070   2.224
oNMEA(1)         .NavS.           0 l    9   16  377    0.000   -0.001   0.003
 ptbtime1.ptb.de .PTB.            1 u    9   64  377   27.571    0.751   0.081
 ptbtime2.ptb.de .PTB.            1 u   18   64  377   27.396    0.783   0.098
 ptbtime3.ptb.de .PTB.            1 u    -   64  377   25.930    0.111   0.132
Traceback (most recent call last):
  File "ntpq/ntpq", line 1590, in
    interpreter.onecmd(cmd)
  File "/usr/lib/python2.7/cmd.py", line 221, in onecmd
    return func(arg)
  File "ntpq/ntpq", line 1082, in do_cv
    self.do_clockvar(line)
  File "ntpq/ntpq", line 1072, in do_clockvar
    self.__dolist(line.split()[1:], assoc, ntp.packet.CTL_OP_READCLOCK, ntp.ntpc.TYPE_CLOCK)
  File "ntpq/ntpq", line 456, in __dolist
    variables = self.session.readvar(associd, varlist, op)
  File "/usr/lib/python2.7/dist-packages/ntp/packet.py", line 1139, in readvar
    return collections.OrderedDict(items)
  File "/usr/lib/python2.7/dist-packages/ntp/packet.py", line 1110, in __parse_varlist
    response += c
TypeError: 'str' object does not support item assignment

Also, it seems that association indices can't be used if you do not
print any peer list:

pi@raspberrypi2:~/ntpsec $ ntpq/ntpq -c "cv &1" -c "cv &2"
No such index.
No such index.

I haven't tracked when it was broken, but I assume this is a missing
initialization and was always there.

Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Waldorf MIDI Implementation & additional documentation:
http://Synth.Stromeko.net/Downloads.html#WaldorfDocs

From esr at thyrsus.com Sun Nov 27 12:25:46 2016
From: esr at thyrsus.com (Eric S.
Raymond)
Date: Sun, 27 Nov 2016 07:25:46 -0500
Subject: Python ntpq
In-Reply-To: <87k2bpm9fy.fsf@Rainer.invalid>
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87k2bpm9fy.fsf@Rainer.invalid>
Message-ID: <20161127122546.GB26435@thyrsus.com>

Achim Gratz :
> Current master is mysteriously broken:

Fix pushed. Though I don't understand what odd corner case in the
import rules we tripped over.
-- 
Eric S. Raymond

From esr at thyrsus.com Sun Nov 27 12:31:57 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 27 Nov 2016 07:31:57 -0500
Subject: Python ntpq
In-Reply-To: <87a8clm7q2.fsf@Rainer.invalid>
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87a8clm7q2.fsf@Rainer.invalid>
Message-ID: <20161127123157.GA27299@thyrsus.com>

Achim Gratz :
> Eric S. Raymond writes:
> > ...or maybe not. Rather klugey fix pushed. Complain if it needs
> > further work.
>
> Breaks as follows:
>
> pi@raspberrypi2:~/ntpsec $ ntpq/ntpq -p -c "cv &1" -c "cv &2"
>      remote           refid      st t when poll reach   delay   offset  jitter
> ==============================================================================
>  NMEA(0)         .uBx8.           0 l   10   16  377    0.000   -4.070   2.224
> oNMEA(1)         .NavS.           0 l    9   16  377    0.000   -0.001   0.003
>  ptbtime1.ptb.de .PTB.            1 u    9   64  377   27.571    0.751   0.081
>  ptbtime2.ptb.de .PTB.            1 u   18   64  377   27.396    0.783   0.098
>  ptbtime3.ptb.de .PTB.
1 u - 64 377 25.930 0.111 0.132
> Traceback (most recent call last):
>   File "ntpq/ntpq", line 1590, in
>     interpreter.onecmd(cmd)
>   File "/usr/lib/python2.7/cmd.py", line 221, in onecmd
>     return func(arg)
>   File "ntpq/ntpq", line 1082, in do_cv
>     self.do_clockvar(line)
>   File "ntpq/ntpq", line 1072, in do_clockvar
>     self.__dolist(line.split()[1:], assoc, ntp.packet.CTL_OP_READCLOCK, ntp.ntpc.TYPE_CLOCK)
>   File "ntpq/ntpq", line 456, in __dolist
>     variables = self.session.readvar(associd, varlist, op)
>   File "/usr/lib/python2.7/dist-packages/ntp/packet.py", line 1139, in readvar
>     return collections.OrderedDict(items)
>   File "/usr/lib/python2.7/dist-packages/ntp/packet.py", line 1110, in __parse_varlist
>     response += c
> TypeError: 'str' object does not support item assignment

Not reproducing here. I get:

esr@snark:~/software/ntp-rescue/ntpsec$ ntpq/ntpq -p -c "cv &1" -c "cv &2"
       remote              refid      st t when poll reach   delay   offset  jitter
=================================================================================
 us.pool.ntp.org    .POOL.          16 p    -   64    0    0.000    0.000   0.001
-104.131.53.252     209.51.161.238   2 u   54 1024  377    8.349    2.995   2.124
-104.156.99.226     164.67.62.194    2 u  876 1024  377   80.653    7.029  19.653
-mirror             199.233.236.226  3 u 1049 1024  377   29.468    2.902   0.394
-mis.wci.com        131.107.13.100   2 u   37 1024  377   78.478    2.798   3.091
*level1e.cs.unc.edu .PPS.            1 u  189 1024  377   31.141    0.426   0.784
-66.241.101.63      169.254.0.2      2 u    - 1024  377   61.246    8.378   4.832
+helium.constant.co 128.59.0.245     2 u  954 1024  377   12.912   -1.199   1.149
-enigma.wiredgoats. 64.142.122.38    2 u   44 1024  377   78.486    3.660   0.893
+mail.coldnorthadmi 132.246.11.231   2 u   58 1024  377   10.804   -0.017   0.589
***Server error code BADASSOC
***Server error code BADASSOC

Possibly this is a Python-3-specific issue? Not that that means we
shouldn't fix it.

> Also, it seems that association indices can't be used if you do not
> print any peer list:
>
> pi@raspberrypi2:~/ntpsec $ ntpq/ntpq -c "cv &1" -c "cv &2"
> No such index.
> No such index.
>
> I haven't tracked when it was broken, but I assume this is a missing
> initialization and was always there.

That is correct. It was like that in C ntpq.
-- 
Eric S. Raymond

From Stromeko at nexgo.de Sun Nov 27 12:34:21 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 27 Nov 2016 13:34:21 +0100
Subject: Python ntpq
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87k2bpm9fy.fsf@Rainer.invalid> <20161127122546.GB26435@thyrsus.com>
Message-ID: <8760n9m78y.fsf@Rainer.invalid>

Eric S. Raymond writes:
> Achim Gratz :
>> Current master is mysteriously broken:
>
> Fix pushed. Though I don't understand what odd corner case in the
> import rules we tripped over.

That now doesn't work like this:

pi@raspberrypi2:~/ntpsec $ ntpq/ntpq -c "cv &1" -c "cv &2"
No such index.
No such index.
pi@raspberrypi2:~/ntpsec $ ntpq/ntpq -pc "cv &1" -c "cv &2"
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================================================================================================================
 NMEA(0)         .uBx8.           0 l    2   16  377    0.000   -1.405   1.327
oNMEA(1)         .NavS.           0 l    1   16  377    0.000    0.003   0.004
 ptbtime1.ptb.de .PTB.            1 u   15   64  377   27.486    0.728   0.097
 ptbtime2.ptb.de .PTB.            1 u   23   64  377   27.193    0.764   0.129
 ptbtime3.ptb.de .PTB.
1 u 67 64 376 25.892 -0.101 0.132
Traceback (most recent call last):
  File "ntpq/ntpq", line 1590, in
    interpreter.onecmd(cmd)
  File "/usr/lib/python2.7/cmd.py", line 221, in onecmd
    return func(arg)
  File "ntpq/ntpq", line 1082, in do_cv
    self.do_clockvar(line)
  File "ntpq/ntpq", line 1072, in do_clockvar
    self.__dolist(line.split()[1:], assoc, ntp.packet.CTL_OP_READCLOCK, ntp.ntpc.TYPE_CLOCK)
  File "ntpq/ntpq", line 456, in __dolist
    variables = self.session.readvar(associd, varlist, op)
  File "/usr/lib/python2.7/dist-packages/ntp/packet.py", line 1148, in readvar
    return self.__parse_varlist()
  File "/usr/lib/python2.7/dist-packages/ntp/packet.py", line 1119, in __parse_varlist
    response[i] = "\xae"
TypeError: 'str' object does not support item assignment

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables

From Stromeko at nexgo.de Sun Nov 27 12:38:48 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 27 Nov 2016 13:38:48 +0100
Subject: Python ntpq
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87a8clm7q2.fsf@Rainer.invalid> <20161127123157.GA27299@thyrsus.com>
Message-ID: <871sxxm71j.fsf@Rainer.invalid>

Eric S. Raymond writes:
> Possibly this is a Python-3-specific issue? Not that that means we
> shouldn't fix it.

I'm currently using /usr/bin/python, which is pointing to
/usr/bin/python2.7, so I don't think that's the reason.

>> Also, it seems that association indices can't be used if you do not
>> print any peer list:
>>
>> pi@raspberrypi2:~/ntpsec $ ntpq/ntpq -c "cv &1" -c "cv &2"
>> No such index.
>> No such index.
>>
>> I haven't tracked when it was broken, but I assume this is a missing
>> initialization and was always there.
>
> That is correct. It was like that in C ntpq.
No, in the C version it worked:

pi@raspberrypi:~ $ ntpq -c 'cv &1'
associd=46064 status=00f0 15 events, clk_unspec,
device="RAW DCF77 CODE (Conrad DCF77 receiver module)",
name="RAWDCF_CONRAD",
timecode="----#--###---#----M-S-24-12-p12--1-P124--21241---1-24-1---p",
poll=36084, noreply=0, badformat=29, baddata=0, fudgetime1=880.500,
fudgetime2=27.350, stratum=0, refid=PCFk, flags=0,
refclock_time="dbe551e4.00000000 2016-11-27T12:36:52.000Z",
refclock_status="TIME CODE; (LEAP INDICATION; CALLBIT)",
refclock_format="RAW DCF77 Timecode"

If you don't have a refclock then no associations exist that you might
be able to use, but I have two of them on each box.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Samples for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra

From esr at thyrsus.com Sun Nov 27 12:47:49 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 27 Nov 2016 07:47:49 -0500
Subject: Python ntpq
In-Reply-To: <8760n9m78y.fsf@Rainer.invalid>
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87k2bpm9fy.fsf@Rainer.invalid> <20161127122546.GB26435@thyrsus.com> <8760n9m78y.fsf@Rainer.invalid>
Message-ID: <20161127124749.GA27811@thyrsus.com>

Achim Gratz :
> Eric S. Raymond writes:
> > Achim Gratz :
> >> Current master is mysteriously broken:
> >
> > Fix pushed. Though I don't understand what odd corner case in the
> > import rules we tripped over.
>
> That now doesn't work like this:

Fix pushed. Try again?
-- 
Eric S. Raymond

From Stromeko at nexgo.de Sun Nov 27 12:52:04 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 27 Nov 2016 13:52:04 +0100
Subject: Python ntpq
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87k2bpm9fy.fsf@Rainer.invalid> <20161127122546.GB26435@thyrsus.com> <8760n9m78y.fsf@Rainer.invalid> <20161127124749.GA27811@thyrsus.com>
Message-ID: <87wpfpkruz.fsf@Rainer.invalid>

Eric S.
Raymond writes:
>> That now doesn't work like this:
>
> Fix pushed. Try again?

Almost there... where are the double double quotes coming from?
And can the header rule be truncated to the table width, please?

pi@raspberrypi2:~/ntpsec $ ntpq/ntpq -pc "cv &1" -c "cv &2"
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================================================================================================================
 "NMEA(0)"       .uBx8.           0 l   10   16  377    0.000    1.989   3.109
o"NMEA(1)"       .NavS.           0 l    9   16  377    0.000    0.000   0.001
 ptbtime1.ptb.de .PTB.            1 u   47   64  377   27.627    0.756   0.080
 ptbtime2.ptb.de .PTB.            1 u   56   64  377   27.095    0.767   0.108
 ptbtime3.ptb.de .PTB.            1 u   25   64  377   25.922   -0.122   0.195
associd=14812 status=0000 no events, clk_unspec,
device=""NMEA GPS Clock"", name=""NMEA"",
timecode=""$GNZDA,124958.00,27,11,2016,00,00*7B"", poll=379, noreply=0,
badformat=0, baddata=0, fudgetime2=39.6, stratum=0, refid=uBx8, flags=0
associd=14813 status=0000 no events, clk_unspec,
device=""NMEA GPS Clock"", name=""NMEA"",
timecode=""$GPZDA,124958.000,27,11,2016,00,00*55"", poll=379, noreply=0,
badformat=0, baddata=0, fudgetime1=-1.69, fudgetime2=150.0, stratum=0,
refid=NavS, flags=1

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

From esr at thyrsus.com Sun Nov 27 12:52:47 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 27 Nov 2016 07:52:47 -0500
Subject: Python ntpq
In-Reply-To: <871sxxm71j.fsf@Rainer.invalid>
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87a8clm7q2.fsf@Rainer.invalid> <20161127123157.GA27299@thyrsus.com> <871sxxm71j.fsf@Rainer.invalid>
Message-ID: <20161127125247.GA28126@thyrsus.com>

Achim Gratz :
> > That is correct. It was like that in C ntpq.
>
> No, in the C version it worked:
>
> pi@raspberrypi:~ $ ntpq -c 'cv &1'
> associd=46064 status=00f0 15 events, clk_unspec,
> device="RAW DCF77 CODE (Conrad DCF77 receiver module)",
> name="RAWDCF_CONRAD",
> timecode="----#--###---#----M-S-24-12-p12--1-P124--21241---1-24-1---p",
> poll=36084, noreply=0, badformat=29, baddata=0, fudgetime1=880.500,
> fudgetime2=27.350, stratum=0, refid=PCFk, flags=0,
> refclock_time="dbe551e4.00000000 2016-11-27T12:36:52.000Z",
> refclock_status="TIME CODE; (LEAP INDICATION; CALLBIT)",
> refclock_format="RAW DCF77 Timecode"
>
> If you don't have a refclock then no associations exist that you might
> be able to use, but I have two of them on each box.

Oh, that explains it. I could never get & references to work
before doing a peer listing, but that was on a client-only system
with no refclocks.

I guess I could trigger an association fetch when the command
parser sees "&".
-- 
Eric S. Raymond

From hmurray at megapathdsl.net Sun Nov 27 12:57:52 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sun, 27 Nov 2016 04:57:52 -0800
Subject: ntpd broken on ARM
Message-ID: <20161127125752.7778B406061@ip-64-139-1-69.sjc.megapath.net>

I spent a lot of time chasing what I thought was a missing syscall in the
seccomp list, but that was a wild goose chase.

For me, it currently breaks even without seccomp.

I'm running code that was built on the 24th. That was probably shortly after
a git pull, so the change that broke things is probably recent. Have we
changed anything in libntp or ntpd since then? ???

Hopefully, somebody will fix it before I wake up. 'nite.

-- 
These are my opinions. I hate spam.
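
Back on the Python ntpq thread: the underlying difficulty is that values such as timecode="$GNZDA,124958.00,..." contain commas, so a naive split(",") cuts them apart. A minimal sketch of a quote-aware split follows. It is an illustration only, not the code in pylib/packet.py; the name split_varlist is hypothetical, and it assumes the response is a comma-separated list of name=value items whose values may be wrapped in double quotes.

```python
import collections


def split_varlist(response):
    """Split 'a=1, b="x,y"' into an ordered name -> value mapping.

    Hypothetical helper: commas inside double quotes do not separate
    items, and the surrounding quotes are dropped from the value.
    """
    items = []
    current = []
    instring = False
    for c in response:
        if c == '"':
            instring = not instring      # toggle state, do not keep the quote
        elif c == ',' and not instring:
            items.append("".join(current))
            current = []
        else:
            current.append(c)
    items.append("".join(current))

    result = collections.OrderedDict()
    for item in items:
        item = item.strip()
        if not item:
            continue                     # tolerate a trailing comma
        name, sep, value = item.partition("=")
        result[name.strip()] = value.strip() if sep else None
    return result


if __name__ == "__main__":
    sample = 'device="NMEA GPS Clock", timecode="$GNZDA,124958.00,27,11,2016,00,00*7B", poll=379'
    for key, val in split_varlist(sample).items():
        print(key, "=", val)
```

Collecting characters in a list and joining them at the end sidesteps the "'str' object does not support item assignment" error seen in the tracebacks, since Python strings are immutable.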
From Stromeko at nexgo.de Sun Nov 27 13:11:01 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 27 Nov 2016 14:11:01 +0100
Subject: ntpd broken on ARM
References: <20161127125752.7778B406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <87shqdkqze.fsf@Rainer.invalid>

Hal Murray writes:
> I spent a lot of time chasing what I thought was a missing syscall in the
> seccomp list, but that was a wild goose chase.
>
> For me, it currently breaks even without seccomp.

Hmm. Works just fine for me...

Did for some reason one of the devices you use not close correctly when
restarting/debugging ntpd? That can produce some pretty strange results
IIRC.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Q+, Q and microQ:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

From Stromeko at nexgo.de Sun Nov 27 13:43:21 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 27 Nov 2016 14:43:21 +0100
Subject: Python ntpq
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87k2bpm9fy.fsf@Rainer.invalid> <20161127122546.GB26435@thyrsus.com> <8760n9m78y.fsf@Rainer.invalid> <20161127124749.GA27811@thyrsus.com> <87wpfpkruz.fsf@Rainer.invalid>
Message-ID: <87oa11kphi.fsf@Rainer.invalid>

Achim Gratz writes:
> Eric S. Raymond writes:
>>> That now doesn't work like this:
>>
>> Fix pushed. Try again?
>
> Almost there... where are the double double quotes coming from?

Fixed by:

--- a/pylib/packet.py
+++ b/pylib/packet.py
@@ -1108,7 +1108,6 @@ class ControlSession:
         response = ""
         for c in self.response:
             if c == '"':
-                response = response + c
                 instring = not instring
             if instring and c == ',':
                 response = response + "\xae"

> And can the header rule be truncated to the table width, please?

Fixed by:

--- a/pylib/util.py
+++ b/pylib/util.py
@@ -87,14 +87,14 @@ class PeerSummary:
         # By default, the peer spreadsheet layout is designed so lines just
         # fit in 80 characters.
This tells us how much extra horizontal space
         # we have available on a wider terminal emulator.
-        self.horizontal_slack = (termwidth or 80) - 80
+        self.horizontal_slack = min((termwidth or 80) - 80, 24)
         # Peer spreadsheet column widths. The reason we cap extra
         # width used at 24 is that on very wide displays, slamming the
         # non-hostname fields all the way to the right produces a huge
         # river that makes the entries difficult to read as wholes.
         # This choice caps the peername field width at that of the longest
         # possible IPV6 numeric address.
-        self.namewidth = 15 + min(self.horizontal_slack, 24)
+        self.namewidth = 15 + self.horizontal_slack
         self.refidwidth = 15
         # Compute peer spreadsheet headers
         self.__remote = " remote ".ljust(self.namewidth)

Although I still think the limiting should be done in ntpq (or any other
program using utils) rather than there so that one can override the
termwidth from the top-level caller without second-guessing by util.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Samples for the Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra

From Stromeko at nexgo.de Sun Nov 27 14:02:56 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 27 Nov 2016 15:02:56 +0100
Subject: Python ntpq
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87k2bpm9fy.fsf@Rainer.invalid> <20161127122546.GB26435@thyrsus.com> <8760n9m78y.fsf@Rainer.invalid> <20161127124749.GA27811@thyrsus.com> <87wpfpkruz.fsf@Rainer.invalid> <87oa11kphi.fsf@Rainer.invalid>
Message-ID: <87k2bpkokv.fsf@Rainer.invalid>

Achim Gratz writes:
> Although I still think the limiting should be done in ntpq (or any other
> program using utils) rather than there so that one can override the
> termwidth from the top-level caller without second-guessing by util.

The more general way to do that would be to have util.termsize take
optional arguments that specify the maximum termsize to return.
--- a/pylib/util.py
+++ b/pylib/util.py
@@ -61,7 +61,7 @@ def canonicalize_dns(hostname, family=socket.AF_UNSPEC):
         return canonicalized.lower() + portsuffix
     return name[0].lower() + portsuffix
 
-def termsize():
+def termsize(maxrows=0, maxcols=0):
     "Return the current terminal size."
     # Should work under Linux and Solaris at least.
     # Alternatives at http://stackoverflow.com/questions/566746/how-to-get-console-window-width-in-python
@@ -72,8 +72,10 @@ def termsize():
         return (24, 80)
     for pattern in ('rows\D+(\d+); columns\D+(\d+);', '\s+(\d+)\D+rows;\s+(\d+)\D+columns;'):
         m = re.search(pattern, output)
+        rows = int(m.group(1))
+        cols = int(m.group(2))
         if m:
-            return int(m.group(1)), int(m.group(2))
+            return ((min(rows,maxrows) or rows), (min(cols,maxcols) or cols))
     return (24, 80)

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Q+, Q and microQ:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

From esr at thyrsus.com Sun Nov 27 17:20:54 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 27 Nov 2016 12:20:54 -0500
Subject: ntpd broken on ARM
In-Reply-To: <20161127125752.7778B406061@ip-64-139-1-69.sjc.megapath.net>
References: <20161127125752.7778B406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20161127172054.GA31857@thyrsus.com>

Hal Murray :
>
> I spent a lot of time chasing what I thought was a missing syscall in the
> seccomp list, but that was a wild goose chase.
>
> For me, it currently breaks even without seccomp.
>
> I'm running code that was built on the 24th. That was probably shortly after
> a git pull, so the change that broke things is probably recent. Have we
> changed anything in libntp or ntpd since then? ???
>
> Hopefully, somebody will fix it before I wake up. 'nite.

Recent changes to libntp and ntpd, from git log:

commit 77c8a205820877caa901a08738c118fc5fa09d68
Author: Eric S.
Raymond
Date:   Sun Nov 27 04:50:17 2016 -0500

    Pass baudrate into the refclock_open() function correctly.

commit 6d3838ec2555e0294d90f3a5ef4487f6d1b00138
Author: Eric S. Raymond
Date:   Sat Nov 26 00:29:50 2016 -0500

    Repair mrulist reporting.

    The ntp_monitor() call got pushed down a code path that packets (at least,
    packets from pool servers) don't normally take. Result: most traffic
    was not getting recorded, and ntpq -c mrulist returned spuriously empty
    reports.

    Fix verified by actually seeing an mrulist come out afterwards.

commit 4911e2df0c54414a539685545780168757879434
Author: Daniel Fox Franke
Date:   Wed Nov 23 16:23:55 2016 -0500

    Fix authentication

    This corrects two evil bugs, one of which was introduced during the
    protocol refactor. That one would have been a vulnerability but
    fortunately 1. the other was masking it, and 2. this bug never made
    it in a release. Phew.

    First bug: missing exit after failed authentication, which would have
    allowed misauthenticated packets to be accepted. Yikes!

    Second bug: Even correctly authenticated packets were getting rejected
    by a different security check. handle_procpkt() checks that there's a
    request in flight before it's willing to process any response. But
    due to a bug that predates the fork from NTP Classic, authenticated
    requests never got their outcount incremented.

    This is why we test things...

commit 1fb1136aa8964218719ed561bd608d418150c9a3
Author: Daniel Fox Franke
Date:   Tue Nov 22 15:45:38 2016 -0500

    CVE-2016-7429: Avoid interface update on possibly-spoofed responses

    Adapted from the NTP Classic patch by Juergen Perlinger

commit bdbc4cb53696b27895730a5a61437256762761b6
Author: Daniel Fox Franke
Date:   Mon Nov 21 21:42:30 2016 -0500

    CVE-2016-7434: Crash from malformed mrulist request

    Patch ported from NTP Classic and mainly due to Juergen Perlinger

-- 
Eric S. Raymond

From esr at thyrsus.com Sun Nov 27 17:27:07 2016
From: esr at thyrsus.com (Eric S.
Raymond)
Date: Sun, 27 Nov 2016 12:27:07 -0500
Subject: Python ntpq
In-Reply-To: <87oa11kphi.fsf@Rainer.invalid>
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87k2bpm9fy.fsf@Rainer.invalid> <20161127122546.GB26435@thyrsus.com> <8760n9m78y.fsf@Rainer.invalid> <20161127124749.GA27811@thyrsus.com> <87wpfpkruz.fsf@Rainer.invalid> <87oa11kphi.fsf@Rainer.invalid>
Message-ID: <20161127172707.GB31857@thyrsus.com>

Achim Gratz :
> Although I still think the limiting should be done in ntpq (or any other
> program using utils) rather than there so that one can override the
> termwidth from the top-level caller without second-guessing by util.

Agreed, but let's get it working and stable before refactoring.
-- 
Eric S. Raymond

From hmurray at megapathdsl.net Mon Nov 28 07:41:31 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Sun, 27 Nov 2016 23:41:31 -0800
Subject: ntpd broken on ARM
In-Reply-To: Message from Achim Gratz of "Sun, 27 Nov 2016 14:11:01 +0100." <87shqdkqze.fsf@Rainer.invalid>
Message-ID: <20161128074131.23D42406063@ip-64-139-1-69.sjc.megapath.net>

> Hmm. Works just fine for me...

I think the problem is that gdb is broken on ARM. Does it work for you?

I get the same results on another system and when running code that is known
to work when not run from gdb. It crashes before it gets to a breakpoint at
main.

-- 
These are my opinions. I hate spam.

From Stromeko at nexgo.de Mon Nov 28 17:46:19 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Mon, 28 Nov 2016 18:46:19 +0100
Subject: ntpd broken on ARM
References: <20161128074131.23D42406063@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <87eg1vts44.fsf@Rainer.invalid>

Hal Murray writes:
>> Hmm. Works just fine for me...
>
> I think the problem is that gdb is broken on ARM. Does it work for you?

I avoid the use of debuggers whenever possible, so I can't comment
(that's a habit I got into when maintaining some large-scale numerical
simulation codes and it stuck...).
> I get the same results on another system and when running code that is known
> to work when not run from gdb. It crashes before it gets to a breakpoint at
> main.

Sounds strange. Any easily followed recipe to demonstrate what you are
seeing?

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf rackAttack:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

From Stromeko at nexgo.de Mon Nov 28 18:43:31 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Mon, 28 Nov 2016 19:43:31 +0100
Subject: Python ntpq
References: <8760naz99o.fsf@Rainer.invalid> <20161126200156.GA4544@thyrsus.com> <87k2bpm9fy.fsf@Rainer.invalid> <20161127122546.GB26435@thyrsus.com> <8760n9m78y.fsf@Rainer.invalid> <20161127124749.GA27811@thyrsus.com> <87wpfpkruz.fsf@Rainer.invalid> <87oa11kphi.fsf@Rainer.invalid> <20161127172707.GB31857@thyrsus.com>
Message-ID: <87a8cjtpgs.fsf@Rainer.invalid>

Eric S. Raymond writes:
> Achim Gratz :
>> Although I still think the limiting should be done in ntpq (or any other
>> program using utils) rather than there so that one can override the
>> termwidth from the top-level caller without second-guessing by util.
>
> Agreed, but let's get it working and stable before refactoring.

Well here's the nail in that coffin:

ntp/rpi2> ssh raspberrypi2 ntpq -p
/bin/stty: standard input: Inappropriate ioctl for device
     remote           refid      st t when poll reach   delay   offset  jitter
==============================================================================
 NMEA(0)         .uBx8.           0 l    9   16  377    0.000   -2.942   4.747
oNMEA(1)         .NavS.           0 l    8   16  377    0.000   -0.000   0.000
 ptbtime1.ptb.de .PTB.            1 u   50   64  377   27.493    0.705   0.128
 ptbtime2.ptb.de .PTB.            1 u   32   64  377   27.224    0.619   0.129
 ptbtime3.ptb.de .PTB.            1 u   49   64  377   25.872    0.071   0.077

It gets worse when I want to output rv and cv status. With my patch, I
can just set the width and don't need to query a non-existent tty.
ntp/rpi2> ssh raspberrypi2 ntpq -W 95 -p
     remote                          refid      st t when poll reach   delay   offset  jitter
=============================================================================================
 NMEA(0)                        .uBx8.           0 l    4   16  377    0.000    2.654   3.133
oNMEA(1)                        .NavS.           0 l    3   16  377    0.000    0.001   0.001
 ptbtime1.ptb.de                .PTB.            1 u   11   64  377   27.493    0.705   0.220
 ptbtime2.ptb.de                .PTB.            1 u   59   64  377   27.224    0.619   0.129
 ptbtime3.ptb.de                .PTB.            1 u   11   64  377   25.872    0.071   0.081

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Blofeld:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

From hmurray at megapathdsl.net Mon Nov 28 18:46:48 2016
From: hmurray at megapathdsl.net (Hal Murray)
Date: Mon, 28 Nov 2016 10:46:48 -0800
Subject: ntpd broken on ARM
In-Reply-To: Message from Achim Gratz of "Mon, 28 Nov 2016 18:46:19 +0100." <87eg1vts44.fsf@Rainer.invalid>
Message-ID: <20161128184648.7C3D6406063@ip-64-139-1-69.sjc.megapath.net>

Stromeko at nexgo.de said:
> I avoid the use of debuggers whenever possible, so I can't comment (that's a
> habit I got into when maintaining some large-scale numerical simulation
> codes and it stuck...).

It seems like the right tool for this problem. What I need is a stack trace
from a crash.

Is there a reasonable way to get a stack trace from a signal handler? All I
need is the number of the syscall that is getting trapped.

Or for a running program to take a core dump so I can look at it later on? I
haven't managed to get one out of systemd.

> Sounds strange. Any easily followed recipe to demonstrate what you are
> seeing?

gdb /usr/local/sbin/ntpd
break main
run -n -u ntp:ntp

For me, on ARM, that crashes before it gets to the breakpoint.

-- 
These are my opinions. I hate spam.
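
On the terminal-width question from the Python ntpq thread (stty failing with "Inappropriate ioctl for device" when ntpq runs over ssh without a tty), one way to read Achim's suggestion is to let an explicit width win and probe the terminal only as a fallback. The following is a minimal sketch under stated assumptions: the name pick_termsize and its parameters are hypothetical, and it uses shutil.get_terminal_size from the Python 3 standard library instead of the stty call in pylib/util.py, so it is not what the tree does.

```python
import shutil


def pick_termsize(width=None, maxcols=0, fallback=(80, 24)):
    """Return (rows, cols) for laying out output.

    width    -- explicit override (in the spirit of ntpq -W); skips probing
    maxcols  -- optional cap on the column count; 0 means no cap
    fallback -- (cols, rows) used when no terminal can be queried
    """
    if width:
        rows, cols = fallback[1], int(width)
    else:
        # get_terminal_size() consults COLUMNS/LINES, then the terminal,
        # and returns the fallback rather than failing when there is no tty.
        size = shutil.get_terminal_size(fallback)
        rows, cols = size.lines, size.columns
    if maxcols:
        cols = min(cols, maxcols)
    return rows, cols


if __name__ == "__main__":
    print(pick_termsize(width=95))                # caller-supplied width
    print(pick_termsize(width=200, maxcols=104))  # capped at 80 + 24 columns
    print(pick_termsize())                        # probe, or fall back
```

Keeping the cap as a parameter leaves the policy decision with the top-level caller, which is the division of labour argued for in the thread.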
From Stromeko at nexgo.de Mon Nov 28 19:09:05 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Mon, 28 Nov 2016 20:09:05 +0100
Subject: ntpd broken on ARM
References: <20161128184648.7C3D6406063@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <8760n7toa6.fsf@Rainer.invalid>

Hal Murray writes:
>> Sounds strange. Any easily followed recipe to demonstrate what you are
>> seeing?
>
> gdb /usr/local/sbin/ntpd
> break main
> run -n -u ntp:ntp
>
> For me, on ARM, that crashes before it gets to the breakpoint.

It's been a _really_ long time since I've tried anything like this, and
that was on Solaris, so I don't think I can help with GDB. I don't see any
option for ntpd to not fork into a daemon process, so you'll at least
have to set up gdb to follow the process into the child.

Regards,
Achim.
-- 
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

DIY Stuff:
http://Synth.Stromeko.net/DIY.html

From fallenpegasus at gmail.com Mon Nov 28 20:25:35 2016
From: fallenpegasus at gmail.com (Mark Atwood)
Date: Mon, 28 Nov 2016 20:25:35 +0000
Subject: ntpq quirk
In-Reply-To: <87a8clnvxu.fsf@Rainer.invalid>
References: <20161126120350.GB19600@thyrsus.com> <20161126212758.1852F406061@ip-64-139-1-69.sjc.megapath.net> <20161126234546.GC11261@thyrsus.com> <87a8clnvxu.fsf@Rainer.invalid>
Message-ID:

On Sun, Nov 27, 2016 at 1:00 AM Achim Gratz wrote:

> Eric S. Raymond writes:
> > OK, there are two ways we can handle this.
>
> The third way is building a hash table that maps each possible unique
> prefix to the actual full-length command.
>
>
I was on a project years ago that did that (minimal length matching of
command strings).
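
The hash table Achim describes, mapping every unambiguous prefix to its full-length command, is small enough to sketch. This is illustrative only: the function name build_prefix_table is made up for the example, the command list is just a sample, and nothing here is taken from the ntpq source.

```python
def build_prefix_table(commands):
    """Map each prefix that identifies exactly one command to that command."""
    table = {}
    ambiguous = set()
    for cmd in commands:
        for end in range(1, len(cmd) + 1):
            prefix = cmd[:end]
            if prefix in ambiguous:
                continue
            if prefix in table and table[prefix] != cmd:
                # Shared by two different commands: not a unique prefix.
                del table[prefix]
                ambiguous.add(prefix)
            else:
                table[prefix] = cmd
    # A full command name always resolves to itself, even when it is also
    # a prefix of a longer command.
    for cmd in commands:
        table[cmd] = cmd
    return table


if __name__ == "__main__":
    table = build_prefix_table(["peers", "pstats", "readvar", "clockvar", "cv"])
    for typed in ("pe", "ps", "r", "cl", "cv", "p"):
        print(typed, "->", table.get(typed, "ambiguous or unknown"))
```

The table is built once at startup, so each lookup is a single dictionary access; the cost is one entry per unique prefix, which is negligible for a command set of this size.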

Someone on the team dug up an amazing little utility that ingested a list of
strings, and emitted a human-unreadable table-driven maximally efficient C
function that implemented some computer science magic state machine
character parser that took as a parameter char* and returned an integer or
enum that designated one of the target strings that was minimal matched by
the input.

We then made running this code generator a Makefile production, and
compiled the generator itself as another dependent Makefile production.

I'm struggling now to remember its name...

...m
-------------- next part --------------
An HTML attachment was scrubbed...
URL:

From esr at thyrsus.com Mon Nov 28 20:48:46 2016
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 28 Nov 2016 15:48:46 -0500
Subject: ntpq quirk
In-Reply-To:
References: <20161126120350.GB19600@thyrsus.com> <20161126212758.1852F406061@ip-64-139-1-69.sjc.megapath.net> <20161126234546.GC11261@thyrsus.com> <87a8clnvxu.fsf@Rainer.invalid>
Message-ID: <20161128204846.GB18120@thyrsus.com>

Mark Atwood :
> On Sun, Nov 27, 2016 at 1:00 AM Achim Gratz wrote:
>
> > Eric S. Raymond writes:
> > > OK, there are two ways we can handle this.
> >
> > The third way is building a hash table that maps each possible unique
> > prefix to the actual full-length command.
> >
> >
> I was on a project years ago that did that (minimal length matching of
> command strings).
>
> Someone on the team dug up an amazing little utility that ingested a list of
> strings, and emitted a human-unreadable table-driven maximally efficient C
> function that implemented some computer science magic state machine
> character parser that took as a parameter char* and returned an integer or
> enum that designated one of the target strings that was minimal matched by
> the input.
>
> We then made running this code generator a Makefile production, and
> compiled the generator itself as another dependent Makefile production.
>
> I'm struggling now to remember its name...

Wow, does *that* take me back.

The technique is called "perfect hashing", and used to be part of the
standard bag of tricks for writing compilers. I remember reading an
article about this 40-odd years ago in one of my dad's Communications
of the ACM. This was a few years before I was an actual programmer.

I've tripped over it a few times since, but it's almost never used
anymore. It made sense in a world of v e r y   s l o w   p r o c e s s o r s
but the tradeoffs are wrong for modern systems - you lose more in code
weight than you gain in execution speed.

It's a cute trick, though.
-- 
Eric S. Raymond

From Stromeko at nexgo.de Mon Nov 28 20:54:42 2016
From: Stromeko at nexgo.de (Achim Gratz)
Date: Mon, 28 Nov 2016 21:54:42 +0100
Subject: ntpq quirk
References: <20161126120350.GB19600@thyrsus.com> <20161126212758.1852F406061@ip-64-139-1-69.sjc.megapath.net> <20161126234546.GC11261@thyrsus.com> <87a8clnvxu.fsf@Rainer.invalid>
Message-ID: <871sxvtje5.fsf@Rainer.invalid>

Mark Atwood writes:
> I was on a project years ago that did that (minimal length matching of
> command strings).
>
> Someone on the team dug up an amazing little utility that ingested a list of
> strings, and emitted a human-unreadable table-driven maximally efficient C
> function that implemented some computer science magic state machine
> character parser that took as a parameter char* and returned an integer or
> enum that designated one of the target strings that was minimal matched by
> the input.
>
> We then made running this code generator a Makefile production, and
> compiled the generator itself as another dependent Makefile production.
>
> I'm struggling now to remember its name...

I wouldn't be too surprised if it was gperf or gengetopt, although at
one point in time several such implementations made their rounds.
https://www.gnu.org/software/gperf/ https://www.gnu.org/software/gengetopt/gengetopt.html The general solution to this problem is to build a trie, but in this case pre-computing a hash table from all possible unique input strings is probably easier, all things considered. I'm pretty sure Python has a shortest unique prefix matcher already somewhere in the options parser, just like about anybody else. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf rackAttack: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From hmurray at megapathdsl.net Mon Nov 28 21:47:13 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 28 Nov 2016 13:47:13 -0800 Subject: ntpd broken on ARM In-Reply-To: Message from Achim Gratz of "Mon, 28 Nov 2016 20:09:05 +0100." <8760n7toa6.fsf@Rainer.invalid> Message-ID: <20161128214713.0BD4C406063@ip-64-139-1-69.sjc.megapath.net> Stromeko at nexgo.de said: > I don't see any option for ntpd to not fork into a daemon process, so you'll > at least have to setup gdb to follow the process into the child. -n on the command line avoids the fork. Works great for things like this. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Nov 29 09:56:18 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 29 Nov 2016 01:56:18 -0800 Subject: DNS name lengths for pool servers Message-ID: <20161129095618.A8973406063@ip-64-139-1-69.sjc.megapath.net> I collected 3724 samples each for IPv4 and IPv6. 3323 IPv4 samples had names. 2919 IPv6 samples had names. There were 300 unique IPv4 addresses and 275 unique IPv4 names. There were 275 unique IPv6 addresses and 119 unique IPv6 names. I didn't check to see if the forward DNS matched. The longest IPv4 name was 49 characters. The longest IPv6 name was 44 characters. IPv4 has 4% longer than 39, 6% if you only consider uniq names. IPv6 has 1.5% longer than 39, 3% if you only consider uniq names. 
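The two percentages quoted for each address family (all samples versus unique names only) can be reproduced with a calculation along these lines. This is a rough sketch, not the script used for the survey: the sample names below are hypothetical placeholders, and share_longer_than is a name invented for the example.

```python
def share_longer_than(names, limit=39):
    """Return the percentage of names longer than `limit` characters."""
    if not names:
        return 0.0
    return 100.0 * sum(len(n) > limit for n in names) / len(names)


# Hypothetical reverse-DNS results (assumption); the real samples came
# from pool servers and are not reproduced here.
samples = (
    ["ntp1.example.net"] * 3
    + ["time.example.org"] * 2
    + ["a-rather-long-reverse-dns-name.customer.example.com"]
)

all_pct = share_longer_than(samples)                # every sample counted
uniq_pct = share_longer_than(sorted(set(samples)))  # each name counted once
```

Counting each name once raises the share of long names whenever the short names are the ones that repeat most often, which matches the direction of the difference reported above.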
-------------- next part -------------- A non-text attachment was scrubbed... Name: DNS-hist2.png Type: image/png Size: 5613 bytes Desc: DNS-hist2.png URL: -------------- next part -------------- A non-text attachment was scrubbed... Name: DNS-hist1.png Type: image/png Size: 5628 bytes Desc: DNS-hist1.png URL: -------------- next part -------------- -- These are my opinions. I hate spam. From ghane0 at gmail.com Tue Nov 29 14:29:35 2016 From: ghane0 at gmail.com (Sanjeev Gupta) Date: Tue, 29 Nov 2016 22:29:35 +0800 Subject: Why do we have an "includes" directory in docs ? Message-ID: I am going through the docs, fixing internal broken links and anchors (I assume this is because we change the documentation, but not the TOC as often). I have fixed most of the misspellings, but there is a bigger issue I am facing now. As an example, please see: https://docs.ntpsec.org/latest/confopt.html This is generated from confopt.txt , which gets its "Related Links" from: include::includes/confopt.txt[] which has (an example): * link:confopt.html#server[server - configure client association] For reasons not clear to me at first sight, the anchor "confopt.html#server" does not link back to the right paragraph in confopt.html. It goes nowhere, in fact. I can reproduce this with Chrome, Firefox, and finally with the online W3C checker. Before I struggle with asciidoc, is there a reason for the files in "includes"? Most are sourced only once, they could be inlined. Is this to help with the man pages? -- Sanjeev Gupta +65 98551208 http://www.linkedin.com/in/ghane -------------- next part -------------- An HTML attachment was scrubbed... URL: From ghane0 at gmail.com Tue Nov 29 15:05:10 2016 From: ghane0 at gmail.com (Sanjeev Gupta) Date: Tue, 29 Nov 2016 23:05:10 +0800 Subject: Why do we have an "includes" directory in docs ? 
In-Reply-To: References: Message-ID: On Tue, Nov 29, 2016 at 10:29 PM, Sanjeev Gupta wrote: > > I have fixed most of the misspellings, but there is a bigger issue I am > facing now. As an example, please see: > > https://docs.ntpsec.org/latest/confopt.html > > This is generated from confopt.txt , which gets its "Related Links" from: > include::includes/confopt.txt[] > > which has (an example): > * link:confopt.html#server[server - configure client association] > > For reasons not clear to me at first sight, the anchor > "confopt.html#server" does not link back to the right paragraph in > confopt.html. It goes nowhere, in fact. > I think the answer is at: https://www.w3.org/TR/xhtml1/guidelines.html#C_8 The toolchain seems to be producing HTML without "id=", and in some cases, at the wrong place. I have been reading asciidoc documentation all day, and am quite confused now :-( See commit: https://gitlab.com/NTPsec/ntpsec/commit/df1b84a68058be5d855b106c91714ad9743e4b21 for an example of non-intuitive-ness. I am pushing some simple fixes now, but not touching the anchor issue any more (I might break the man pages, etc) Help? -- Sanjeev Gupta +65 98551208 http://www.linkedin.com/in/ghane -------------- next part -------------- An HTML attachment was scrubbed... URL: From esr at thyrsus.com Tue Nov 29 16:00:57 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 29 Nov 2016 11:00:57 -0500 Subject: Why do we have an "includes" directory in docs ? In-Reply-To: References: Message-ID: <20161129160057.GA13925@thyrsus.com> Sanjeev Gupta : > Before I struggle with asciidoc, is there a reason for the files in > "includes"? Most are sourced only once, they could be inlined. Is this to > help with the man pages? For many of them that is it. Anything with "body" in it is a manual-page body. These are included once in a small wrapper file with a "Name" section to generate the manual page.
They are included a second time in a more elaborate wrapper (typically including a Pogo cartoon) somewhere else in the docs/ directory, which is used to generate the website documentation. -- Eric S. Raymond From esr at thyrsus.com Tue Nov 29 21:01:52 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 29 Nov 2016 16:01:52 -0500 Subject: Why do we have an "includes" directory in docs ? In-Reply-To: References: Message-ID: <20161129210152.GB13925@thyrsus.com> Sanjeev Gupta : > On Tue, Nov 29, 2016 at 10:29 PM, Sanjeev Gupta wrote: > > > > > I have fixed most of the misspellings, but there is a bigger issue I am > > facing now. As an example, please see: > > > > https://docs.ntpsec.org/latest/confopt.html > > > > This is generated from confopt.txt , which gets its "Related Links" from: > > include::includes/confopt.txt[] > > > > which has (an example): > > * link:confopt.html#server[server - configure client association] > > > > For reasons not clear to me at first sight, the anchor > > "confopt.html#server" does not link back to the right paragraph in > > confopt.html. It goes nowhere, in fact. > > > > I think the answer is at: https://www.w3.org/TR/xhtml1/guidelines.html#C_8 > > The toolchain seems to be producing HTML without "id=", and in some cases, > at the wrong place. I have been reading asciidoc documentation all day, > and am quite confused now :-( > > See commit: > https://gitlab.com/NTPsec/ntpsec/commit/df1b84a68058be5d855b106c91714ad9743e4b21 > for an example of non-intuitive-ness. I am pushing some simple fixes now, > but not touching the anchor issue any more (I might break the man pages, > etc) > > Help? In AsciiDoc, writing an enclosure of the form [[foo]] is how you force an id anchor to be generated at that point; anchor:foo[] will also work. I think the reason confopt.html#server doesn't point anywhere is that there's no [[server]] anchor in the target file. I see this instead: [[option]] == Server Command Options == I've written some help for you.
If you do "waf linkcheck" in the root directory, you will get a report on unresolved links. If you run it with compile-command in an Emacs buffer, you will be able to step through them as you would compiler error messages. -- Eric S. Raymond From hmurray at megapathdsl.net Wed Nov 30 12:35:50 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 30 Nov 2016 04:35:50 -0800 Subject: ntpq quirks Message-ID: <20161130123550.DA8D4406063@ip-64-139-1-69.sjc.megapath.net> The mru list is getting printed out oldest first. I thought you fixed that. The print-what-you-have after ^C doesn't work if the output is a pipe. (Classic doesn't work either.) [murray at second ~]$ ntpq -nc mru | tee foo.log Ctrl-C will stop MRU retrieval and display partial results. ^Cclose failed in file object destructor: sys.excepthook is missing lost sys.stderr The date in the python version string is close to useless. [murray at hgm ~]$ ntpd --version ntpd 0.9.6-b9cadf5-hgm Nov 30 2016 03:44:12 [murray at hgm ~]$ ntpq --version ntpq 0.9.6-119-play 2016-11-06T00:05:45Z [murray at hgm ~]$ I don't know what the right answer is. The date is ages ago. The "play" is the directory I built it in. The "hgm" in the ntpd string is what I put there with waf --build-version-tag= I'm getting hangs with mrulist. I haven't figured out what's going on. Yes, it works OK in small local cases. I'm getting hangs in big local cases (pool server). If I wait a while then ^C, it prints out a bunch of stuff (in backwards order), then stops. It doesn't print out any of the 346 (0 updates) messages, but I don't know if the new version does that normally. ... 1231 0 90 . 3 4 1 123 92.24.199.193 1230 0 90 . 3 4 1 30704 223.24.43.95 1230 0 90 . 3 3 1 60456 78.145.21.250 1230 0.00 90 . 3 3 3 3688 86.135.151.203 1230 0 90 . 3 3 1 44231 175.151.247.31 [murray at second ~]$ It's a pool server getting lots of traffic. The 1200 is ballpark of how long since ntpd was restarted. 
So either local packets are getting dropped and the retransmission logic is broken or something else is broken. Any suggestions for how to debug this? How about adding a few counters to ntpq so we can at least get some hints? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Wed Nov 30 12:48:20 2016 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 30 Nov 2016 04:48:20 -0800 Subject: ntpq rv 0 is broken Message-ID: <20161130124820.55D63406063@ip-64-139-1-69.sjc.megapath.net> [murray at glypnod play]$ ntpq -c "rv 0" status=0115 leap_none, sync_pps, 1 event, clock_sync, /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an output style, modes may not be set /bin/stty: when specifying an 
output style, modes may not be set version="ntpd 0.9.6-glypnod Nov 27 2016 23:56:22", processor="i686", system="Linux/4.8.8-200.fc24.i686", leap=00, stratum=1, precision=-21, rootdelay=0.0, rootdisp=1.27, refid=PPS, reftime=dbe948d0.61a17e2b 2016-11-30T04:47:12.381, clock=dbe948e2.ff8cad7a 2016-11-30T04:47:30.998, peer=35371, tc=6, mintc=0, offset=0.006283, frequency=9.869, sys_jitter=0.002798, clk_jitter=0.001, clk_wander=0.0, tai=36, leapsec=201701010000L, expire=201706280000L [murray at glypnod play]$ -- These are my opinions. I hate spam. From gem at rellim.com Wed Nov 30 20:15:14 2016 From: gem at rellim.com (Gary E. Miller) Date: Wed, 30 Nov 2016 12:15:14 -0800 Subject: Google public NTP service Message-ID: <20161130121514.73765a04@spidey.rellim.com> Yo All! This just in from time-nuts. On Wed, 30 Nov 2016 14:21:39 +0000 Michael Rothwell wrote: > ... was just announced. > https://cloudplatform.googleblog.com/2016/11/making-every-leap-second-count-with-our-new-public-NTP-servers.html?m=1 I sort of see where they are coming from, but this will cause problems. The NTP packets Google sends out have no way to be marked as 'not UTC'. Given how promiscuously people share NTP chimers, these 'not UTC' chimers will get into the mix. When they diverge from the real UTC servers it will sorely confuse NTP clients. I would support an RFC to mark the type of time a chimer is serving. Not only smeared and UTC, but also TAI, UT, UT0, UT1, UT2, ET, TDT, TDB, TT, TCG, TCB, GPS, etc... RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 455 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Wed Nov 30 22:44:29 2016 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 30 Nov 2016 17:44:29 -0500 Subject: ntpq rv 0 is broken In-Reply-To: <20161130124820.55D63406063@ip-64-139-1-69.sjc.megapath.net> References: <20161130124820.55D63406063@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20161130224429.GA16027@thyrsus.com> Hal Murray : > [murray at glypnod play]$ ntpq -c "rv 0" > status=0115 leap_none, sync_pps, 1 event, clock_sync, > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when specifying an output style, modes may not be set > /bin/stty: when 
specifying an output style, modes may not be set > version="ntpd 0.9.6-glypnod Nov 27 2016 23:56:22", processor="i686", > system="Linux/4.8.8-200.fc24.i686", leap=00, stratum=1, precision=-21, > rootdelay=0.0, rootdisp=1.27, refid=PPS, > reftime=dbe948d0.61a17e2b 2016-11-30T04:47:12.381, > clock=dbe948e2.ff8cad7a 2016-11-30T04:47:30.998, peer=35371, tc=6, mintc=0, > offset=0.006283, frequency=9.869, sys_jitter=0.002798, clk_jitter=0.001, > clk_wander=0.0, tai=36, leapsec=201701010000L, expire=201706280000L > [murray at glypnod play]$ Ugh. I am not entirely surprised that the kluge in termsize() came unstuck. I've pushed a different, and hopefully better, implementation. -- Eric S. Raymond
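As an illustration of one way to avoid the /bin/stty noise shown above, here is a hedged sketch of a terminal-size helper that never spawns an external command. It is not the implementation that was pushed, whose details are not given in this thread. It assumes Python 3.3 or later (for shutil.get_terminal_size), which may not match the interpreters ntpq had to support at the time; the function name simply echoes the termsize() mentioned in the message.

```python
import shutil


def termsize(default=(80, 24)):
    """Return (columns, lines) for the terminal without running stty.

    shutil.get_terminal_size() consults the COLUMNS and LINES environment
    variables, then asks the OS about the terminal attached to stdout, and
    uses `default` when neither works (for example when output is a pipe).
    """
    size = shutil.get_terminal_size(fallback=default)
    return size.columns, size.lines


if __name__ == "__main__":
    cols, rows = termsize()
    print("terminal is %d columns by %d lines" % (cols, rows))
```

Because nothing is forked, there is no child process to write errors into the middle of the command output, and the fallback keeps the function usable when stdout is redirected.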