From esr at thyrsus.com Fri Sep 1 02:25:50 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 31 Aug 2017 22:25:50 -0400
Subject: ✘Build failure
In-Reply-To: <20170831165912.452c7eee@spidey.rellim.com>
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170831165912.452c7eee@spidey.rellim.com>
Message-ID: <20170901022550.GA8343@thyrsus.com>

Gary E. Miller via devel :
> > I just tried to build ntpsec for the first time in weeks. Not good.
>
> Here is how I build:
>
> ./waf configure --enable-debug --enable-debug-gdb --enable-warnings \
>     --refclock=all --enable-doc --enable-seccomp && \
>     ./waf build && \
>     ./waf install

Aha. With *that* I can reproduce these - don't get them with my normal
invocation.

Nothing here is difficult to fix. I'm on it.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From esr at thyrsus.com Fri Sep 1 02:41:04 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 31 Aug 2017 22:41:04 -0400
Subject: ✘Build failure
In-Reply-To: <20170831165435.2c697d5c@spidey.rellim.com>
References: <20170831165435.2c697d5c@spidey.rellim.com>
Message-ID: <20170901024104.GB8343@thyrsus.com>

Gary E. Miller via devel :
> I just tried to build ntpsec for the first time in weeks. Not good.

Were you formerly getting these?

ntp_parser.tab.c: In function ‘yytnamerr’:
ntp_parser.tab.c:1329:21: warning: conversion to ‘long unsigned int’ from ‘long int’ may change the sign of the result [-Wsign-conversion]
ntp_parser.tab.c:1391:10: note: in expansion of macro ‘yystpcpy’
ntp_parser.tab.c: In function ‘yysyntax_error’:
ntp_parser.tab.c:1478:3: warning: switch missing default case [-Wswitch-default]
ntp_parser.tab.c: In function ‘yyparse’:
ntp_parser.tab.c:1637:25: warning: conversion to ‘long unsigned int’ from ‘long int’ may change the sign of the result [-Wsign-conversion]

I've fixed all the others.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From trv-n at comcast.net Fri Sep 1 02:43:15 2017
From: trv-n at comcast.net (Trevor N.)
Date: Thu, 31 Aug 2017 22:43:15 -0400
Subject: Verified - ntpd ignores the year part of refclock timestamps
Message-ID: <65ehqc5mu79llljr4mv1u99tkn8gr8qqbf@4ax.com>

It's not necessary to use refclock_process() if the driver creates its
own l_fp timecode timestamp and uses refclock_process_offset().

I was considering removing refclock_process() when I added the rollover
workaround to the trimble driver, but then I read this:
https://bugs.ntp.org/show_bug.cgi?id=417
Where the reasoning behind clock_panic and not using the year in
refclocks is touched on.

From fw at fwright.net Fri Sep 1 02:57:55 2017
From: fw at fwright.net (Fred Wright)
Date: Thu, 31 Aug 2017 19:57:55 -0700 (PDT)
Subject: ✘Build failure
In-Reply-To: <20170901024104.GB8343@thyrsus.com>
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170901024104.GB8343@thyrsus.com>
Message-ID:

On Thu, 31 Aug 2017, Eric S. Raymond via devel wrote:
> Gary E. Miller via devel :
> > I just tried to build ntpsec for the first time in weeks. Not good.
>
> Were you formerly getting these?
>
> ntp_parser.tab.c: In function ‘yytnamerr’:
> ntp_parser.tab.c:1329:21: warning: conversion to ‘long unsigned int’ from ‘long int’ may change the sign of the result [-Wsign-conversion]
> ntp_parser.tab.c:1391:10: note: in expansion of macro ‘yystpcpy’
> ntp_parser.tab.c: In function ‘yysyntax_error’:
> ntp_parser.tab.c:1478:3: warning: switch missing default case [-Wswitch-default]
> ntp_parser.tab.c: In function ‘yyparse’:
> ntp_parser.tab.c:1637:25: warning: conversion to ‘long unsigned int’ from ‘long int’ may change the sign of the result [-Wsign-conversion]
>
> I've fixed all the others.

Here I don't see build failures (admittedly without all Gary's options),
but I do see a test failure (OSX):

TEST(clocktime, CurrentYearExplicit)../../tests/libntp/clocktime.c:59::FAIL: Expected 3486372600 Was 104913720

This bisects to commit 5489ed5a593c8e14138e47544a1598d4c6576dee.

Fred Wright

From esr at thyrsus.com Fri Sep 1 03:35:03 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 31 Aug 2017 23:35:03 -0400
Subject: ✘Build failure
In-Reply-To:
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170901024104.GB8343@thyrsus.com>
Message-ID: <20170901033503.GA22055@thyrsus.com>

Fred Wright via devel :
> Here I don't see build failures (admittedly without all Gary's options),
> but I do see a test failure (OSX):
>
> TEST(clocktime, CurrentYearExplicit)../../tests/libntp/clocktime.c:59::FAIL: Expected 3486372600 Was 104913720
>
> This bisects to commit 5489ed5a593c8e14138e47544a1598d4c6576dee.

That's interesting. Doesn't fail here or on Gitlab. (The Gitlab failures
are elsewhere and confined to FreeBSD.)

The relevant section of code is this in libntp/clocktime.c:

	/*
	 * Year > 1970 - from a 4-digit year stamp, must be greater
	 * than POSIX epoch. Means we're not dependent on the pivot
	 * value (derived from the packet receipt timestamp, and thus
	 * ultimately from the system clock) to be correct.
	 * CLOSETIME clipping to the receive time will *not* be applied
	 * in this case. These two lines thus make it possible to
	 * recover from a trashed or zeroed system clock.
	 *
	 * Warning: the hack in the NMEA driver that rectifies 4-digit
	 * years from 2-digit ones has an expiration date in 2399.
	 * After that this code will go badly wrong.
	 */
	if (year > 1970) {
		*yearstart = year_to_ntp(year);
		return (int32_t)*yearstart + tmp;
	}

Something about the expression "(int32_t)*yearstart + tmp" is computing
differently under MacOS than elsewhere. Anyone got any ideas?
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From hmurray at megapathdsl.net Fri Sep 1 04:07:51 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 31 Aug 2017 21:07:51 -0700
Subject: ✘Build failure
Mime-Version: 1.0
Message-ID: <20170901040751.5D84840605C@ip-64-139-1-69.sjc.megapath.net>

> Aha. With *that* I can reproduce these - don't get them with my normal
> invocation.

What was interesting about Gary's recipe that you didn't normally test?
Is there some option that we should add to tests/option-tester.sh? Does
anybody use it?

Is there a buildbot? Does it work? What OSes and options does it try?
I haven't seen mail from it in ages.

--
These are my opinions. I hate spam.

From gem at rellim.com Fri Sep 1 06:00:51 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 31 Aug 2017 23:00:51 -0700
Subject: ✘Build failure
Mime-Version: 1.0
In-Reply-To: <20170901040751.5D84840605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170901040751.5D84840605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170831230051.0c6c2bf5@spidey.rellim.com>

Yo Hal!

On Thu, 31 Aug 2017 21:07:51 -0700
Hal Murray via devel wrote:

> Is there a buildbot? Does it work? What OSes and options does it try?
> I haven't seen mail from it in ages.

It sends IRC messages to #ntpsec-dev.

You should also have a login to https://buildbot.ntpsec.org/

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can’t measure it, you can’t improve it." - Lord Kelvin

From gem at rellim.com Fri Sep 1 06:05:41 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 31 Aug 2017 23:05:41 -0700
Subject: ✘Build failure
In-Reply-To: <20170901024104.GB8343@thyrsus.com>
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170901024104.GB8343@thyrsus.com>
Message-ID: <20170831230541.62d11aaf@spidey.rellim.com>

Yo Eric!

On Thu, 31 Aug 2017 22:41:04 -0400
"Eric S. Raymond" wrote:

> Gary E. Miller via devel :
> > I just tried to build ntpsec for the first time in weeks. Not
> > good.
>
> Were you formerly getting these?

Yes, I see no way to fix ntp_parser.tab.c. That is why the sign
conversion warning is optional.

> I've fixed all the others.

Good.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can’t measure it, you can’t improve it." - Lord Kelvin

From gem at rellim.com Fri Sep 1 06:10:40 2017
From: gem at rellim.com (Gary E.
 Miller)
Date: Thu, 31 Aug 2017 23:10:40 -0700
Subject: ✘Build failure
In-Reply-To: <20170901033503.GA22055@thyrsus.com>
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170901024104.GB8343@thyrsus.com>
 <20170901033503.GA22055@thyrsus.com>
Message-ID: <20170831231040.7871bd27@spidey.rellim.com>

Yo Eric!

On Thu, 31 Aug 2017 23:35:03 -0400
"Eric S. Raymond via devel" wrote:

> > Expected 3486372600 Was 104913720

Sure looks like integer overflow.

> Something about the expression "(int32_t)*yearstart + tmp" is
> computing differently under MacOS than elsewhere. Anyone got any
> ideas?

yearstart is uint32_t, and our expected result will not fit in a
uint32_t (3486372600 == 0xCFCDD2F8).

So the compiler is not changing uint32_t to int32_t where we need it to.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can’t measure it, you can’t improve it." - Lord Kelvin

From esr at thyrsus.com Fri Sep 1 10:54:11 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 1 Sep 2017 06:54:11 -0400
Subject: ✘Build failure
In-Reply-To: <20170831231040.7871bd27@spidey.rellim.com>
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170901024104.GB8343@thyrsus.com>
 <20170901033503.GA22055@thyrsus.com>
 <20170831231040.7871bd27@spidey.rellim.com>
Message-ID: <20170901105411.GA27618@thyrsus.com>

Gary E. Miller via devel :
> Yo Eric!
>
> On Thu, 31 Aug 2017 23:35:03 -0400
> "Eric S. Raymond via devel" wrote:
>
> > > Expected 3486372600 Was 104913720
>
> Sure looks like integer overflow.
>
> > Something about the expression "(int32_t)*yearstart + tmp" is
> > computing differently under MacOS than elsewhere. Anyone got any
> > ideas?
>
> yearstart is uint32_t, and our expected result will not fit in a
> uint32_t (3486372600 == 0xCFCDD2F8).
>
> So the compiler is not changing uint32_t to int32_t where we need it to.

It's a good thing the Mac balked. There was a bug in the way I was
passing out the computation result. Embarrassing.

Fred, do you still get this error with xurrebt head?
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From hmurray at megapathdsl.net Fri Sep 1 20:11:38 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 01 Sep 2017 13:11:38 -0700
Subject: ✘Build failure
Mime-Version: 1.0
Message-ID: <20170901201138.6FC56406061@ip-64-139-1-69.sjc.megapath.net>

> It's a good thing the Mac balked. There was a bug in the way I was passing
> out the computation result. Embarrassing.

Then how did it work at all on non-Mac systems?

--
These are my opinions. I hate spam.

From gem at rellim.com Fri Sep 1 21:06:37 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 1 Sep 2017 14:06:37 -0700
Subject: ✘Build failure
Mime-Version: 1.0
In-Reply-To: <20170901201138.6FC56406061@ip-64-139-1-69.sjc.megapath.net>
References: <20170901201138.6FC56406061@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170901140637.71a9fb8f@spidey.rellim.com>

Yo Hal!

On Fri, 01 Sep 2017 13:11:38 -0700
Hal Murray via devel wrote:

> > It's a good thing the Mac balked. There was a bug in the way I was
> > passing out the computation result. Embarrassing.
>
> Then how did it work at all on non-Mac systems?

My guess is the cast (int32_t to uint32_t) took place at a different
point in the calculation. My solution would have been to force 32 bit
arithmetic.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can’t measure it, you can’t improve it." - Lord Kelvin

From fw at fwright.net Fri Sep 1 21:39:13 2017
From: fw at fwright.net (Fred Wright)
Date: Fri, 1 Sep 2017 14:39:13 -0700 (PDT)
Subject: ✘Build failure
In-Reply-To: <20170901105411.GA27618@thyrsus.com>
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170901024104.GB8343@thyrsus.com>
 <20170901033503.GA22055@thyrsus.com>
 <20170831231040.7871bd27@spidey.rellim.com>
 <20170901105411.GA27618@thyrsus.com>
Message-ID:

On Fri, 1 Sep 2017, Eric S. Raymond via devel wrote:
> Gary E. Miller via devel :
> > Yo Eric!
> >
> > On Thu, 31 Aug 2017 23:35:03 -0400
> > "Eric S. Raymond via devel" wrote:
> >
> > > > Expected 3486372600 Was 104913720
> >
> > Sure looks like integer overflow.
> >
> > > Something about the expression "(int32_t)*yearstart + tmp" is
> > > computing differently under MacOS than elsewhere. Anyone got any
> > > ideas?
> >
> > yearstart is uint32_t, and our expected result will not fit in a
> > uint32_t (3486372600 == 0xCFCDD2F8).
> >
> > So the compiler is not changing uint32_t to int32_t where we need it to.
>
> It's a good thing the Mac balked. There was a bug in the way I was
> passing out the computation result. Embarrassing.
>
> Fred, do you still get this error with xurrebt head?

The failure is gone at the current master (8e7e0677d).

I assume that "xurrebt" is a bottom-row-shifted version of "current".
:-)

Fred Wright

From Matthew.Selsky at twosigma.com Sun Sep 3 01:28:20 2017
From: Matthew.Selsky at twosigma.com (Matthew Selsky)
Date: Sat, 2 Sep 2017 21:28:20 -0400
Subject: ✘Build failure
In-Reply-To:
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170901024104.GB8343@thyrsus.com>
 <20170901033503.GA22055@thyrsus.com>
 <20170831231040.7871bd27@spidey.rellim.com>
 <20170901105411.GA27618@thyrsus.com>
Message-ID: <20170903012820.GD23760@twosigma.com>

Hey Eric/Gary:

Are there extra options we should pass to ./waf configure to catch these
types of errors in our gitlab bots? I'm happy to update our
.gitlab-ci.yml to do extra checking so we detect these before users do.

Thanks,
-Matt

From gem at rellim.com Sun Sep 3 01:47:46 2017
From: gem at rellim.com (Gary E. Miller)
Date: Sat, 2 Sep 2017 18:47:46 -0700
Subject: ✘Build failure
In-Reply-To: <20170903012820.GD23760@twosigma.com>
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170901024104.GB8343@thyrsus.com>
 <20170901033503.GA22055@thyrsus.com>
 <20170831231040.7871bd27@spidey.rellim.com>
 <20170901105411.GA27618@thyrsus.com>
 <20170903012820.GD23760@twosigma.com>
Message-ID: <20170902184746.5c052e32@spidey.rellim.com>

Yo Matthew!

On Sat, 2 Sep 2017 21:28:20 -0400
Matthew Selsky via devel wrote:

> Are there extra options we should pass to ./waf configure to catch
> these types of errors in our gitlab bots?

Here is my formula:

./waf configure --enable-debug --enable-debug-gdb --enable-warnings \
    --refclock=all --enable-doc --enable-seccomp && \
    ./waf build && \
    ./waf install

It enables several extra paths that normal users normally do not
exercise.

There are some warnings with that now:

asciidoc: WARNING: rollover.txt: line 210: missing section: [gps_pivots]
asciidoc: WARNING: rollover.txt: line 259: missing section: [ntp_pivots]
[...]
../../libntp/clocktime.c: In function 'clocktime':
../../libntp/clocktime.c:94:15: warning: conversion to 'uint32_t {aka unsigned int}' from 'int32_t {aka int}' may change the sign of the result [-Wsign-conversion]
  *ts_ui = (int32_t)*yearstart + tmp;
           ^

The --enable-warnings is a bit troublesome as it does generate some
warnings that are not really fixable.

ntp_parser.tab.c: In function 'yytnamerr':
ntp_parser.tab.c:1391:34: warning: conversion to 'long unsigned int' from 'long int' may change the sign of the result [-Wsign-conversion]
ntp_parser.tab.c: In function 'yysyntax_error':
ntp_parser.tab.c:1478:3: warning: switch missing default case [-Wswitch-default]
ntp_parser.tab.c: In function 'yyparse':
ntp_parser.tab.c:1637:25: warning: conversion to 'long unsigned int' from 'long int' may change the sign of the result [-Wsign-conversion]

AFAIK the only way to fix that one is to fix Bison.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can’t measure it, you can’t improve it." - Lord Kelvin

From Stromeko at nexgo.de Sun Sep 3 17:52:58 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 03 Sep 2017 19:52:58 +0200
Subject: Tinkerboard w/ TinkerOS 2.0.1
Message-ID: <87a82bis91.fsf@Rainer.invalid>

I've bumped my TinkerBoard up to the latest official distro (based on
Stretch now) and took some notes of what I've done along the way.

--8<---------------cut here---------------start------------->8---
** Asus TinkerBoard
*** TinkerOS 2.0.1
- change password for linaro & copy SSH key
- sudo hostname tinkerboard
- sudo vi /etc/hosts /etc/hostname # change hostname from linaro-alip to tinkerboard
- uname -a # Linux tinkerboard 4.4.71+ #1 SMP Thu Aug 17 00:28:01 CST 2017 armv7l GNU/Linux
- sudo apt-get update && sudo apt-get -us dist-upgrade && sudo apt-get -u dist-upgrade && sudo apt-get autoremove
- sudo apt-get purge anacron
- sudo apt-get install emacs cpufrequtils pps-tools util-linux setserial miniterm picocom python-serial rsync cron git
- sudo apt-mark hold ntp # install ntpsec over existing scaffolding
- sudo vi /etc/default/cpufrequtils # set governor userspace @ fixed frequency 1.2GHz
- sudo vi /etc/udev/rules.d/99-navspark.rules
- sudo vi /etc/ntp.conf /etc/ntp.keys
- sudo dpkg-reconfigure locales # activate a UTF8 locale
- echo LC_MESSAGES=POSIX | sudo tee -a /etc/default/locale # keep messages sane
- echo iface wlan0 inet manual | sudo tee -a /etc/network/interfaces.d/wlan0
- sudo systemctl mask wpa_supplicant
- sudo systemctl set-default multi-user.target
- git clone https://gitlab.com/NTPsec/ntpsec.git # hangs with git: transport
- sudo rm /var/run/ntp.conf.dhcp /etc/dhcp/dhclient-exit-hooks.d/ntp
- mkdir -p /var/log/ntpstat && chown ntp.ntp /var/log/ntpstat
- ntpsec
  + sudo ./buildprep
  + ./waf configure --refclock=nmea,pps,local,generic,shm --prefix=/usr --disable-debug
--8<---------------cut here---------------end--------------->8---

Still no sign of PPS via GPIO, I might try Armbian later to see if it's
available there since I don't need the GPU and WLAN anyway. I've
finally found a description of how to get the devicetree extracted from
the boot files and there are some examples from Armbian on how to do
devicetree overlays.
Not sure if that would help getting a pps_gpio device off the ground
since I don't know if the module is compiled in (it's not available in
the module directory, so either it's not configured at all or compiled
in).

Otherwise this was going a lot smoother than the first install (version
1.4) and even the Ethernet LEDs now work correctly.

I will need to upgrade the three RasPi to Stretch as well. At least the
RasPi 1B will not be an in-place upgrade since the SD card is too small
(maybe I'll copy it to a larger one and extend the fs). But I might
just install Raspian lite from scratch on all three of them.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf Q+, Q and microQ:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

From Matthew.Selsky at twosigma.com Mon Sep 4 20:38:51 2017
From: Matthew.Selsky at twosigma.com (Matthew Selsky)
Date: Mon, 4 Sep 2017 16:38:51 -0400
Subject: ✘Prevent potential buffer overruns in the mode 6 code.
In-Reply-To: <20170313121147.2429a0bb@spidey.rellim.com>
References: <20170313121147.2429a0bb@spidey.rellim.com>
Message-ID: <20170904203851.GE23760@twosigma.com>

On Mon, Mar 13, 2017 at 12:11:47PM -0700, Gary E. Miller wrote:
> Yo Eric!
>
>
> cp = buffer;
> cq = tag;
> - while (*cq != '\0')
> + while (*cq != '\0' && cp < buffer + sizeof(buffer) - 1)
>     *cp++ = *cq++;
>
>
> Why not just use strlcpy? NTPsec has its own copy if the OS does
> not provide it. This sort of bit-picky C code is where problems lurk.

Hey Eric,

Was there an off-list answer to this? Can we switch to strlcpy() for
the cases where we're copying null-terminated strings?

Thanks,
-Matt

From esr at thyrsus.com Tue Sep 5 01:53:28 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 4 Sep 2017 21:53:28 -0400
Subject: ✘Prevent potential buffer overruns in the mode 6 code.
In-Reply-To: <20170904203851.GE23760@twosigma.com>
References: <20170313121147.2429a0bb@spidey.rellim.com>
 <20170904203851.GE23760@twosigma.com>
Message-ID: <20170905015328.GA24703@thyrsus.com>

Matthew Selsky via devel :
> On Mon, Mar 13, 2017 at 12:11:47PM -0700, Gary E. Miller wrote:
> > Yo Eric!
> >
> >
> > cp = buffer;
> > cq = tag;
> > - while (*cq != '\0')
> > + while (*cq != '\0' && cp < buffer + sizeof(buffer) - 1)
> >     *cp++ = *cq++;
> >
> >
> > Why not just use strlcpy? NTPsec has its own copy if the OS does
> > not provide it. This sort of bit-picky C code is where problems lurk.
>
> Hey Eric,
>
> Was there an off-list answer to this? Can we switch to strlcpy() for
> the cases where we're copying null-terminated strings?

Sorry, was off for the weekend and missed this.

I agree with "This sort of bit-picky C code is where problems lurk" and
I'd have absolutely no objection to moving to strlcpy in cases like this.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From esr at thyrsus.com Tue Sep 5 02:01:41 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 4 Sep 2017 22:01:41 -0400
Subject: ✘Build failure
In-Reply-To:
References: <20170831165435.2c697d5c@spidey.rellim.com>
 <20170901024104.GB8343@thyrsus.com>
 <20170901033503.GA22055@thyrsus.com>
 <20170831231040.7871bd27@spidey.rellim.com>
 <20170901105411.GA27618@thyrsus.com>
Message-ID: <20170905020141.GC24703@thyrsus.com>

Fred Wright via devel :
> The failure is gone at the current master (8e7e0677d).

Good to know. I was a little worried about this while on the road this
weekend.

> I assume that "xurrebt" is a bottom-row-shifted version of "current". :-)

Right. I can spell, but my typing is terrible. :-)
-- Eric S.
Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From esr at thyrsus.com Tue Sep 5 18:56:16 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 5 Sep 2017 14:56:16 -0400 (EDT)
Subject: All hands alert - crash of unknown origin
Message-ID: <20170905185616.DDB9513A0206@snark.thyrsus.com>

Everyone should read this thread:

https://gitlab.com/NTPsec/ntpsec/issues/375

The only empirical clue we have is that it only seems to manifest under
the kind of high load characteristic of pool service. I have a
suspicion that something is causing memory usage to spike and the OOM
killer is reaping the process.

This is a serious bug and we need everyone with test facilities trying
to reproduce it. If there is any way you can set up and watch a pool
server, please do so.
--
Eric S. Raymond

The Bible is not my book, and Christianity is not my religion. I could
never give assent to the long, complicated statements of Christian dogma.
	-- Abraham Lincoln

From esr at thyrsus.com Tue Sep 5 22:25:38 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Tue, 5 Sep 2017 18:25:38 -0400 (EDT)
Subject: Dave Morgan's report on the mystery crash
Message-ID: <20170905222538.D618713A0206@snark.thyrsus.com>

Dave Morgan sent me a report on two instances of the mystery crash
that happened to him last week (he also said the installation had been
stable since). Alas, I somehow fat-fingered my copy of that mail.

Dave, please repost to the list so we can all stare at your logs and
config.
--
Eric S. Raymond

Non-cooperation with evil is as much a duty as cooperation with good.
	-- Mohandas Gandhi

From morgadave at googlemail.com Wed Sep 6 09:27:33 2017
From: morgadave at googlemail.com (Dave Morgan)
Date: Wed, 6 Sep 2017 10:27:33 +0100
Subject: Dave Morgan's report on the mystery crash
In-Reply-To: <20170905222538.D618713A0206@snark.thyrsus.com>
References: <20170905222538.D618713A0206@snark.thyrsus.com>
Message-ID:

All,
I am at work at moment. If logs still needed I will send in about 10
hours when back home.
Dave

On 05/09/2017, Eric S. Raymond via devel wrote:
> Dave Morgan sent me a report on two instances of the mystery crash
> that happened to him last week (he also said the installation had been
> stable since). Alas, I somehow fat-fingered my copy of that mail.
>
> Dave, please repost to the list so we can all stare at your logs and
> config.
> --
> Eric S. Raymond
>
> Non-cooperation with evil is as much a duty as cooperation with good.
> -- Mohandas Gandhi
> _______________________________________________
> devel mailing list
> devel at ntpsec.org
> http://lists.ntpsec.org/mailman/listinfo/devel
>

--
http://www.morgad.co.uk/index.html
DP: http://www.pgdp.net
NTP: http://www.pool.ntp.org
L&B: http://www.lynton-rail.co.uk

From esr at thyrsus.com Wed Sep 6 10:33:41 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 6 Sep 2017 06:33:41 -0400
Subject: Dave Morgan's report on the mystery crash
In-Reply-To:
References: <20170905222538.D618713A0206@snark.thyrsus.com>
Message-ID: <20170906103341.GA7257@thyrsus.com>

Dave Morgan :
> All,
> I am at work at moment. If logs still needed I will send in about 10
> hours when back home.

Thanks, we've found and fixed the problem.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.
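
A note on the mode 6 buffer thread above: the bounded loop in the quoted diff and a strlcpy()-style call express the same guard. What follows is only a minimal sketch of that idea, not NTPsec's actual code. `sketch_strlcpy` is a hypothetical stand-in written here so the example does not depend on the platform providing strlcpy(), and `copy_tag` with its parameters is likewise invented for illustration.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in with strlcpy()-like semantics: copy at most
 * dstsize-1 bytes, always NUL-terminate when dstsize > 0, and return
 * strlen(src) so the caller can detect truncation.
 */
static size_t sketch_strlcpy(char *dst, const char *src, size_t dstsize)
{
    size_t srclen = strlen(src);

    if (dstsize > 0) {
        size_t n = (srclen < dstsize - 1) ? srclen : dstsize - 1;
        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return srclen;
}

/*
 * Illustrative caller (name invented): returns 1 if tag fit in
 * buffer, 0 if it had to be truncated.
 */
static int copy_tag(char *buffer, size_t buflen, const char *tag)
{
    return sketch_strlcpy(buffer, tag, buflen) < buflen;
}
```

The point of the return value is that truncation becomes something the caller can test for, which the hand-written pointer loop does not offer without extra bookkeeping.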

From hmurray at megapathdsl.net Wed Sep 6 22:23:13 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 06 Sep 2017 15:23:13 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
Message-ID: <20170906222313.3B38840605C@ip-64-139-1-69.sjc.megapath.net>

[3/3] Compiling bob2/host/ntpd/ntp_parser.tab.c
In file included from ../../include/ntp.h:12:0,
                 from /home/murray/ntpsec/play/ntpd/ntp_parser.y:16:
../../include/ntp_fp.h: In function 'dtolfp':
../../include/ntp_fp.h:144:2: warning: implicit declaration of function 'ldexpl'
../../include/ntp_fp.h:144:25: warning: incompatible implicit declaration of built-in function 'ldexpl'
../../include/ntp_fp.h: In function 'lfptod':
../../include/ntp_fp.h:152:9: warning: incompatible implicit declaration of built-in function 'ldexpl'
...
Repeat for many modules.

-------

man ldexp says:

NAME
     ldexp, ldexpf -- multiply floating-point number by integral power of 2

LIBRARY
     Math Library (libm, -lm)

SYNOPSIS
     #include <math.h>

     double
     ldexp(double x, int exp);

     float
     ldexpf(float x, int exp);

DESCRIPTION
     The ldexp() family of functions compute

--
These are my opinions. I hate spam.

From gem at rellim.com Wed Sep 6 23:30:39 2017
From: gem at rellim.com (Gary E. Miller)
Date: Wed, 6 Sep 2017 16:30:39 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170906222313.3B38840605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170906222313.3B38840605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170906163039.3b08918f@spidey.rellim.com>

Yo Hal!

On Wed, 06 Sep 2017 15:23:13 -0700
Hal Murray via devel wrote:

> ../../include/ntp_fp.h:144:2: warning: implicit declaration of
> function 'ldexpl'

ldexpl() is POSIX 2008 and ISO/IEC 9899:1999 (a.k.a. C99).

Not supporting C99 is pretty lame. NTPsec specifically requires C99
support.

So clearly a NetBSD problem.

Got a workaround?

Maybe ldexpl() is in math.h, but behind an #ifdef?

RGDS
GARY
---------------------------------------------------------------------------
Gary E.
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588

	    Veritas liberabit vos. -- Quid est veritas?
    "If you can’t measure it, you can’t improve it." - Lord Kelvin

From hmurray at megapathdsl.net Thu Sep 7 08:37:30 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 07 Sep 2017 01:37:30 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
Message-ID: <20170907083730.C37E040605C@ip-64-139-1-69.sjc.megapath.net>

devel at ntpsec.org said:
> ldexpl() is POSIX 2008 and ISO/IEC 9899:1999 (a.k.a. C99).
> Not supporting C99 is pretty lame. NTPsec specifically requires C99
> support.
> So clearly a NetBSD problem.

Thanks.

Looks like they tried but didn't get everything. The man page says:
  The described functions conform to ISO/IEC 9899:1999 (``ISO C99'').
Of course, that doesn't say the whole system conforms to C99, just ldexp
and ldexpf, but I'll interpret it as meaning that they tried to cover the
whole C99.

> Maybe ldexpl() is in math.h, but behind an #ifdef?

Nope. grep doesn't find it.

> Got a workaround?

This seems to build and check:
#include <math.h> /* ldexpl() */
#ifndef ldexpl
/* Missing in NetBSD 6.1.5 */
#define ldexpl ldexp
#endif

Will that do the right conversions between double and long double?

Do we want to work with old but still supported NetBSD or be sticky about
requiring C99?

Eric: Do we have a list of OSes and hardware where ntpsec is known to
build and work?

----------

grep does find this in /usr/include/g++/cmath
  inline long double
  ldexp(long double __x, int __exp)
  { return __builtin_ldexpl(__x, __exp); }

I screwed around trying to make something like that work and didn't find
a recipe. It might be possible, but it requires some skills I don't
have. I'll itemize what I tried if anybody is interested.
--
These are my opinions. I hate spam.

From hmurray at megapathdsl.net Thu Sep 7 18:30:09 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 07 Sep 2017 11:30:09 -0700
Subject: Monitoring bad buys
Message-ID: <20170907183009.0A00A40605C@ip-64-139-1-69.sjc.megapath.net>

From issue simple mathematics (#381)
> The ntp codebase itself is not specific about what "query packets" are, but
> the processing for Mode 6 is right before that. I'd say odds are good that
> bad guys are shipping you Mode 6 packets hoping to crack something.

I've been thinking of tweaking the mru counters to have a separate counter for
everything other than simple time requests. And maybe a bit mask of packet
types that have been seen.

--
These are my opinions. I hate spam.

From gem at rellim.com Thu Sep 7 19:47:42 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 7 Sep 2017 12:47:42 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170907083730.C37E040605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170907083730.C37E040605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170907124742.591c2070@spidey.rellim.com>

Yo Hal!

On Thu, 07 Sep 2017 01:37:30 -0700
Hal Murray wrote:

> > Got a workaround?
>
> This seems to build and check:
> #include <math.h>   /* ldexpl() */
> #ifndef ldexpl
> /* Missing in NetBSD 6.1.5 */
> #define ldexpl ldexp
> #endif
>
> Will that do the right conversions between double and long double?

Serious loss of precision, but maybe the best we can do.

> Do we want to work with old but still supported NetBSD or be sticky
> about requiring C99?

You brought it up. If you don't care we can drop that version of NetBSD.

> Eric: Do we have a list of OSes and hardware where ntpsec is known
> to build and work?

buildbot.ntpsec.org.

> grep does find this in /usr/include/g++/cmath
> inline long double
> ldexp(long double __x, int __exp)
> { return __builtin_ldexpl(__x, __exp); }

Is __builtin_ldexpl() defined anywhere?

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From paul at anastrophe.com Thu Sep 7 19:51:07 2017
From: paul at anastrophe.com (Paul Theodoropoulos)
Date: Thu, 7 Sep 2017 12:51:07 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170907124742.591c2070@spidey.rellim.com>
References: <20170907083730.C37E040605C@ip-64-139-1-69.sjc.megapath.net> <20170907124742.591c2070@spidey.rellim.com>
Message-ID: <12ead933-de3b-cc92-9025-89ad3639be99@anastrophe.com>

On 9/7/17 12:47, Gary E. Miller via devel wrote:

>> Eric: Do we have a list of OSes and hardware where ntpsec is known
>> to build and work?
>
> buildbot.ntpsec.org.

Just fyi, that page is htpasswd protected.

--
Paul Theodoropoulos
www.anastrophe.com

From gem at rellim.com Thu Sep 7 19:55:08 2017
From: gem at rellim.com (Gary E. Miller)
Date: Thu, 7 Sep 2017 12:55:08 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <12ead933-de3b-cc92-9025-89ad3639be99@anastrophe.com>
References: <20170907083730.C37E040605C@ip-64-139-1-69.sjc.megapath.net> <20170907124742.591c2070@spidey.rellim.com> <12ead933-de3b-cc92-9025-89ad3639be99@anastrophe.com>
Message-ID: <20170907125508.630d7a62@spidey.rellim.com>

Yo Paul!

On Thu, 7 Sep 2017 12:51:07 -0700
Paul Theodoropoulos via devel wrote:

> On 9/7/17 12:47, Gary E. Miller via devel wrote:
>
> >> Eric: Do we have a list of OSes and hardware where ntpsec is known
> >> to build and work?
> >
> > buildbot.ntpsec.org.
>
> Just fyi, that page is htpasswd protected.

Yes, and it must be.
The load on the server from that page is large; more than a few users
overload the server.

But it is the only real-time and accurate way to know what is working.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From paul at anastrophe.com Thu Sep 7 20:02:03 2017
From: paul at anastrophe.com (Paul Theodoropoulos)
Date: Thu, 7 Sep 2017 13:02:03 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170907125508.630d7a62@spidey.rellim.com>
References: <20170907083730.C37E040605C@ip-64-139-1-69.sjc.megapath.net> <20170907124742.591c2070@spidey.rellim.com> <12ead933-de3b-cc92-9025-89ad3639be99@anastrophe.com> <20170907125508.630d7a62@spidey.rellim.com>
Message-ID:

On 9/7/17 12:55, Gary E. Miller via devel wrote:
> Yes, and it must be. The load on the server from that page is large;
> more than a few users overload the server.
>
> But it is the only real-time and accurate way to know what is working.

That's fine of course, it's simply that it was offered in answer to Hal's
question of whether there's a list available of known working OS/hardware
combinations, so out of curiosity I went there to look at the list, that's
all.

--
Paul Theodoropoulos
www.anastrophe.com

From hmurray at megapathdsl.net Thu Sep 7 23:23:04 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 07 Sep 2017 16:23:04 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
Message-ID: <20170907232305.02E5740605C@ip-64-139-1-69.sjc.megapath.net>

[Using ldexp when ldexpl isn't available.]
> Serious loss of precision, but maybe the best we can do.

Does anybody have any data on how serious that would be?

Does ntp classic do anything interesting in this area? (That's the sort of
thing that Dave Mills is likely to have thought about.)

Did anybody notice any problems before that change?

commit 3705a499961391748c2b2cf1383270924f2f9df9
Author: Eric S. Raymond
Date: Thu Aug 17 09:32:38 2017 -0400

    Partially address Gitlab issue #270: Loss of precision in step_systime()

--
These are my opinions. I hate spam.

From esr at thyrsus.com Fri Sep 8 02:39:55 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 7 Sep 2017 22:39:55 -0400
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170907232305.02E5740605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170907232305.02E5740605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170908023955.GA16773@thyrsus.com>

Hal Murray via devel :
>
> [Using ldexp when ldexpl isn't available.]
> > Serious loss of precision, but maybe the best we can do.
>
> Does anybody have any data on how serious that would be?
>
> Does ntp classic do anything interesting in this area? (That's the sort of
> thing that Dave Mills is likely to have thought about.)
>
> Did anybody notice any problems before that change?
>
> commit 3705a499961391748c2b2cf1383270924f2f9df9
> Author: Eric S. Raymond
> Date: Thu Aug 17 09:32:38 2017 -0400
>
>     Partially address Gitlab issue #270: Loss of precision in step_systime()

Gary originally raised the issue. He might have seen it in the wild.

No, NTP doesn't do anything interesting here. The code predates the era
when long double was standardized and generally available, a transition
that might have been as late as C99.
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.
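
On the question above of how serious the loss would be, here is a rough
sketch, assuming the usual IEEE 754 double with a 53-bit significand
(DBL_MANT_DIG == 53). An NTP l_fp timestamp is 64 bits of fixed point, 32
bits of seconds and 32 bits of fraction, so a full present-day timestamp
cannot pass through a double without dropping its low bits. The helper names
are made up for illustration and do not come from the NTPsec sources.

```c
#include <float.h>    /* DBL_MANT_DIG, LDBL_MANT_DIG */
#include <stdint.h>

/*
 * Low-order bits that a full 64-bit fixed-point value (top bit set)
 * loses in a floating type with a mant_dig-bit significand.
 */
static int bits_lost(int mant_dig)
{
    return mant_dig < 64 ? 64 - mant_dig : 0;
}

/* Nonzero if v survives a round trip through double unchanged. */
static int roundtrips_through_double(uint64_t v)
{
    volatile double d = (double)v;   /* volatile forces a real double */
    return (uint64_t)d == v;
}
```

Under that assumption bits_lost(DBL_MANT_DIG) is 11, so a full timestamp is
quantized to 2^11 / 2^32 s = 2^-21 s, roughly 477 ns, while
bits_lost(LDBL_MANT_DIG) is 0 where long double has a 64-bit significand
(x87 extended precision). Small offsets lose far less than full timestamps,
since their high-order bits are zero.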

From hmurray at megapathdsl.net Fri Sep 8 03:09:44 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Thu, 07 Sep 2017 20:09:44 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: Message from "Eric S. Raymond" of "Thu, 07 Sep 2017 22:39:55 EDT." <20170908023955.GA16773@thyrsus.com>
Message-ID: <20170908030944.BC3EC40605C@ip-64-139-1-69.sjc.megapath.net>

> No, NTP doesn't do anything interesting here. The code predates the era
> when long double was standardized and generally available, a transition
> that might have been as late as C99.

So it seems reasonable to assume that it's OK to run without the extra
precision. The case I was worried about was that in cleaning up that area
we accidentally lost the precision.

Do you (or Mark) have any opinions on whether I should push the fix? It pokes
a tiny hole in the we-need-C99 position.

--
These are my opinions. I hate spam.

From Matthew.Selsky at twosigma.com Fri Sep 8 13:11:00 2017
From: Matthew.Selsky at twosigma.com (Matthew Selsky)
Date: Fri, 8 Sep 2017 09:11:00 -0400
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170907124742.591c2070@spidey.rellim.com>
References: <20170907083730.C37E040605C@ip-64-139-1-69.sjc.megapath.net> <20170907124742.591c2070@spidey.rellim.com>
Message-ID: <20170908131100.GA1552@twosigma.com>

On Thu, Sep 07, 2017 at 12:47:42PM -0700, Gary E. Miller via devel wrote:
> Yo Hal!
>
> On Thu, 07 Sep 2017 01:37:30 -0700
> Hal Murray wrote:
>
> > > Got a workaround?
> >
> > This seems to build and check:
> > #include <math.h>   /* ldexpl() */
> > #ifndef ldexpl
> > /* Missing in NetBSD 6.1.5 */
> > #define ldexpl ldexp
> > #endif
> >
> > Will that do the right conversions between double and long double?
>
> Serious loss of precision, but maybe the best we can do.
>
> > Do we want to work with old but still supported NetBSD or be sticky
> > about requiring C99?
>
> You brought it up. If you don't care we can drop that version of NetBSD.

ldexpl() was added in NetBSD 7 (released Sept 2015). NetBSD 6 was released in
Oct 2012 and last had a point release in Sept 2014. NetBSD 6 is still
supported by upstream per https://www.netbsd.org/releases/formal.html#history

NetBSD 8 is in beta and no release date has been announced yet. I would expect
NetBSD 6 support to be dropped when NetBSD 8 comes out.

Do we need to support NetBSD 6 given its age and lack of complete C99 support?

Thanks,
-Matt

From esr at thyrsus.com Fri Sep 8 14:46:16 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 8 Sep 2017 10:46:16 -0400
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170908131100.GA1552@twosigma.com>
References: <20170907083730.C37E040605C@ip-64-139-1-69.sjc.megapath.net> <20170907124742.591c2070@spidey.rellim.com> <20170908131100.GA1552@twosigma.com>
Message-ID: <20170908144616.GA18001@thyrsus.com>

Matthew Selsky via devel :
> ldexpl() was added in NetBSD 7 (released Sept 2015). NetBSD 6 was released in
> Oct 2012 and last had a point release in Sept 2014. NetBSD 6 is still
> supported by upstream per https://www.netbsd.org/releases/formal.html#history
>
> NetBSD 8 is in beta and no release date has been announced yet. I would expect
> NetBSD 6 support to be dropped when NetBSD 8 comes out.
>
> Do we need to support NetBSD 6 given its age and lack of complete C99 support?

I'm going to pull the lever for "no".

The point of choosing C99/POSIX-2001.1 as our support baseline was to have a
cutoff point that is (a) useful for drastically cutting our complexity-related
burdens, and (b) easy to explain and justify - it is a 16-year-old pair of
standards, after all.

I recently threw pre-10.12 versions of Mac OS X off the island for botching
the time primitives. I can think of no good reason for NetBSD 6 to get an
indulgence.

I think being hard-line about that policy choice has been successful and we
should stick to it.

Technical note: Our OS baseline is not quite POSIX-2001.1 - for rather obvious
reasons we require the time primitives from the 2008 revision as well, and
adjtime(2).
--
Eric S. Raymond

Please consider contributing to my Patreon page at https://www.patreon.com/esr
so I can keep the invisible wheels of the Internet turning. Give generously -
the civilization you save might be your own.

From gem at rellim.com Fri Sep 8 20:14:52 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 8 Sep 2017 13:14:52 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170907232305.02E5740605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170907232305.02E5740605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170908131452.6a1937fe@spidey.rellim.com>

Yo Hal!

On Thu, 07 Sep 2017 16:23:04 -0700
Hal Murray wrote:

> [Using ldexp when ldexpl isn't available.]
> > Serious loss of precision, but maybe the best we can do.
>
> Does anybody have any data on how serious that would be?

Yes, a float is 56 bits of precision. A timespec is 64 bits.

> Does ntp classic do anything interesting in this area? (That's the
> sort of thing that Dave Mills is likely to have thought about.)

ntp classic uses integer math, ntpsec uses double.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From gem at rellim.com Fri Sep 8 20:16:57 2017
From: gem at rellim.com (Gary E. Miller)
Date: Fri, 8 Sep 2017 13:16:57 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170908030944.BC3EC40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170908023955.GA16773@thyrsus.com> <20170908030944.BC3EC40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170908131657.1e3000f1@spidey.rellim.com>

Yo Hal!

On Thu, 07 Sep 2017 20:09:44 -0700
Hal Murray wrote:

> > No, NTP doesn't do anything interesting here. The code predates
> > the era when long double was standardized and generally available,
> > a transition that might have been as late as C99.
>
> So it seems reasonable to assume that it's OK to run without the
> extra precision.

Not reasonable, it fixed a bug.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From esr at thyrsus.com Sat Sep 9 13:14:13 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sat, 9 Sep 2017 09:14:13 -0400 (EDT)
Subject: Feature Freeze Friday
Message-ID: <20170909131413.0909513A0206@snark.thyrsus.com>

It is 19 days until our targeted September 28th release date for NTPsec 1.0.

My intention is that on Friday the 15th of September we will enter feature
freeze, with the remaining two weeks being devoted to testing, bug fixes, and
writing better tests.

Work on non-RFE tracker issues (currently #44, #55, #62, #269, and #377) will
be exempted from the freeze. We may also let things get a bit slushier around
the Python code if that comes up, as bugs there don't threaten core function.
But the C code has got to be left to cool down - destabilizing changes just
before a release are stupid and embarrassing.

This means the window on proposals to change the design of RFE #204 (Support
/etc/ntp.d) will close when we freeze. Plan whatever time you can allocate to
think up a better design accordingly.

One special circumstance might change the schedule. Daniel has a family
emergency that might prevent him from landing two security features (info
minimization and AES-CMAC packet validation) before the freeze begins.

I judge potential code disruption from both to be low and their mission
relevance to be high; it will be a judgment call, if Daniel says he can land
them after the 15th, whether we unfreeze or negotiate a deadline extension
with ICEI and reset the freeze clock.

Everybody on the team has done spectacularly good work to get us to this
point. The rapid, effective response to the mystery crash earlier this week is
only the latest entry in a record to be very proud of.

Now we enter the home stretch.
--
Eric S. Raymond

The kind of charity you can force out of people nourishes about as much as
the kind of love you can buy --- and spreads even nastier diseases.

From Stromeko at nexgo.de Sat Sep 9 18:46:54 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sat, 09 Sep 2017 20:46:54 +0200
Subject: Tinkerboard w/ TinkerOS 2.0.1
References: <87a82bis91.fsf@Rainer.invalid>
Message-ID: <87fubvd80x.fsf@Rainer.invalid>

Achim Gratz via devel writes:
> Still no sign of PPS via GPIO, I might try Armbian later to see if it's
> available there since I don't need the GPU and WLAN anyway.

Some kind soul read this and sent me some hints on how to compile a kernel
that has PPS enabled. Just creating the node in the device tree (with an
overlay) is not enough since the kernel does not have the device compiled in.
Anyway, compiling the kernel on the Tinkerboard takes only about half an hour
(maybe less if you use active cooling so it doesn't throttle due to running
into the thermal thresholds).

I've switched the TinkerBoard to PPS and am starting to collect PPS
statistics.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

From Stromeko at nexgo.de Sun Sep 10 17:11:15 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Sun, 10 Sep 2017 19:11:15 +0200
Subject: Tinkerboard w/ TinkerOS 2.0.1
References: <87a82bis91.fsf@Rainer.invalid>
Message-ID: <87o9qiiimk.fsf@Rainer.invalid>

[resent and expanded, the original posting did not make it to Gmane NNTP]

Achim Gratz via devel writes:
> Still no sign of PPS via GPIO, I might try Armbian later to see if it's
> available there since I don't need the GPU and WLAN anyway.

Some kind soul read this and sent me some hints on how to compile a kernel
that has PPS enabled (thank you!). Just creating the node in the device tree
(with an overlay) is not enough since the kernel does not have the device
compiled in. Anyway, compiling the kernel on the Tinkerboard takes only about
half an hour (maybe less if you use active cooling so it doesn't throttle due
to running up to the thermal thresholds).

Here's what I did (it's mostly gleaned from
http://qiita.com/mt08/items/890a71d0b399ac1a9b49) and the two patches that
have been applied (based on the information posted in
https://tinkerboarding.co.uk/forum/thread-594.html).

Patches on top of the upstream kernel sources:

-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0001-Fix-indentation-warnings-due-to-whitespace-mixup.patch
Type: text/x-patch
Size: 5840 bytes
Desc: not available
URL:
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 0002-Enable-PPS-on-GPIO-pin-22-for-Asus-TinkerBoard.patch
Type: text/x-patch
Size: 1708 bytes
Desc: not available
URL:
-------------- next part --------------

--8<---------------cut here---------------start------------->8---
- git clone --depth 1 -b linux4.4-rk3288 https://github.com/TinkerBoard/debian_kernel.git
- cd debian_kernel
- git am patchfiles # fix kernel sources and change config to include PPS on GPIO Pin#22
- make ARCH=arm miniarm-rk3288_defconfig
- time make ARCH=arm -j6 zImage 2>&1 | tee zImage.log
- time make ARCH=arm -j6 modules 2>&1 | tee modules.log
- time make ARCH=arm -j6 dtbs 2>&1 | tee dtbs.log
- sudo make ARCH=arm modules_install
- sudo cp -v /boot/zImage{,.bak}
- sudo cp -v /boot/rk3288-miniarm.dtb{,.bak}
- sudo cp -v arch/arm/boot/{zImage,dts/rk3288-miniarm.dtb} /boot
- ls -al /boot /lib/modules
- sudo reboot
--8<---------------cut here---------------end--------------->8---

Also I finally found the reason for the unexpectedly high latency: it's
interrupt coalescing in the Gigabit interface. The impact on NTP is shown
here: https://blog.dan.drown.org/nic-interrupt-coalesce-impact-on-ntp. The
default coalesce interval on the TinkerBoard is over 0.5 ms and I've set it
to the minimum of 69 µs now. To make that setting permanent across reboots:

--8<---------------cut here---------------start------------->8---
- echo -e "iface eth0 inet dhcp\n hardware-irq-coalesce-rx-usecs 69" | sudo tee /etc/network/interfaces.d/eth0
--8<---------------cut here---------------end--------------->8---

Also, my script that regulates the temperature and load uses interval timers
with short intervals and that only really works if "nohz=off" gets added to
the kernel command line in /boot/extlinux/extlinux.conf.

I've switched the TinkerBoard to PPS and am starting to collect PPS
statistics.
Everything looks pretty good so far. I've also started ovenizing the XTAL, but
it will be some time before I get enough statistics to extract the parameters
for a proper control loop.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptations for Waldorf Q V3.00R3 and Q+ V3.54R2:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

From esr at thyrsus.com Sun Sep 10 19:09:42 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Sun, 10 Sep 2017 15:09:42 -0400 (EDT)
Subject: Release prep, some technical news, and thanks
Message-ID: <20170910190942.A021413A0206@snark.thyrsus.com>

We're down to 5 issues on the tracker, 4 not counting a completed RFE. That's
a good feeling this far in advance of our release date.

We now have an NTPsec instance providing pool service under heavy load, in
Austria. That's what smoked out the mystery crash; since then it's been
running stable and our beta tester is happy. This was an important check;
it's one thing to know that our ntpd works well on a low-traffic desktop
client setup, entirely another to be sure it can take hundreds of requests a
minute in stride.

That same tester has been sending us a steady stream of small bug reports,
mostly around ntpq, that have helped us polish and improve it quite a lot.
This has also turned up weaknesses in the documentation, which is stronger as
a result.

The new 'bias' config keyword will probably be the last new feature before we
ship. There's actually more to the machinery behind it than meets the eye;
I'm going to explain that here because it might matter past 1.0 when we think
about adding new config elements to support (say) NTS.

As I began looking at implementing 'bias' I became disturbed by what I found
in the data path from server config declarations down to the newpeer() call
that sets up a peer association.
The config parser naturally packs the declaration payload into a struct, but
the struct was then unpacked and the payload passed to newpeer() as multiple
argument variables.

This way of doing things made adding new config options a defect-prone
process. Basically, the argument signature of newpeer() had to change every
time; it featured an unreasonably large and increasing number of arguments.

No more. Now the config block is passed down to newpeer(), which copies it to
a permanent home in the new peer block (after sanity-checking some members).

Following this change, adding 'bias' was trivial: (1) add a 'bias' member in
the config block, (2) make the YACC grammar fill it in (a thing I could do in
my sleep), and (3) notice that the value is available in the new peer block
after config and do whatever with it.

The time and expected additional defect exposure associated with adding more
options is now really, really low. This doesn't mean we should go nuts adding
marginal ones, but it does give us a bit freer hand on the design level.

Finally, I think it is also significant what kinds of bug reports we have
*not* been getting. It's hard to notice an absence, so I want you all to take
a moment to notice how infrequent, minor, and superficial our problem reports
have been.

That's an achievement in code that has been shrunk by 75%, heavily refactored,
and partly rewritten in a different language. Once again, I think this is
performance for the whole team to be proud of.

Yes, I've been the lead surgeon for the heaviest C work, but the team's
support has been magnificent and maybe I have not appreciated that as loudly
as I should. Fred Brooks was right - surgical teams *work*, and they fly not
just on the capability of the lead but on the ability of senior assistants
and the junior associates learning by example.

I'm no longer even surprised when all of you perform exceptionally well.

Let us now hope the next 18 days are boring and we can fuck off a lot.
Because that predicts a really solid release.
--
Eric S. Raymond

"They that can give up essential liberty to obtain a little temporary safety
deserve neither liberty nor safety."
	-- Benjamin Franklin, Historical Review of Pennsylvania, 1759.

From fallenpegasus at gmail.com Wed Sep 13 18:41:12 2017
From: fallenpegasus at gmail.com (Mark Atwood)
Date: Wed, 13 Sep 2017 18:41:12 +0000
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170908131657.1e3000f1@spidey.rellim.com>
References: <20170908023955.GA16773@thyrsus.com> <20170908030944.BC3EC40605C@ip-64-139-1-69.sjc.megapath.net> <20170908131657.1e3000f1@spidey.rellim.com>
Message-ID:

I agree, drop NetBSD6 and document why.

Is NetBSD 6 still under development? If so, we can send them a bugreport.

On the other other hand, do we still have any other compatibility shims
anywhere else for any other OSes? Floating point ops like this are "merely"
some simple bit twiddles, as long as you know your arch and fp arch.

..

On Fri, Sep 8, 2017 at 1:17 PM Gary E. Miller via devel wrote:

> Yo Hal!
>
> On Thu, 07 Sep 2017 20:09:44 -0700
> Hal Murray wrote:
>
> > > No, NTP doesn't do anything interesting here. The code predates
> > > the era when long double was standardized and generally available,
> > > a transition that might have been as late as C99.
> >
> > So it seems reasonable to assume that it's OK to run without the
> > extra precision.
>
> Not reasonable, it fixed a bug.
>
> RGDS
> GARY
> ---------------------------------------------------------------------------
> Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
> gem at rellim.com Tel:+1 541 382 8588
>
> Veritas liberabit vos. -- Quid est veritas?
> "If you can't measure it, you can't improve it." - Lord Kelvin
> _______________________________________________
> devel mailing list
> devel at ntpsec.org
> http://lists.ntpsec.org/mailman/listinfo/devel

--
Mark Atwood
http://about.me/markatwood
+1-206-604-2198 Mobile & Signal

From hmurray at megapathdsl.net Wed Sep 13 19:16:41 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 13 Sep 2017 12:16:41 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
Message-ID: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net>

> Is NetBSD 6 still under development? If so, we can send them a bugreport.

Development has moved on to 7 and soon to be 8.

6 is still supported which means security fixes get backported.

--
These are my opinions. I hate spam.

From Stromeko at nexgo.de Wed Sep 13 19:21:03 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Wed, 13 Sep 2017 21:21:03 +0200
Subject: Tinkerboard w/ TinkerOS 2.0.1
References: <87a82bis91.fsf@Rainer.invalid> <87o9qiiimk.fsf@Rainer.invalid>
Message-ID: <87lgli9zhc.fsf@Rainer.invalid>

Achim Gratz via devel writes:
> [resent and expanded, the original posting did not make it to Gmane NNTP]

The missing postings have belatedly appeared on Gmane now, so it seems to have
been a hiccup somewhere between the listserver and Gmane or on Gmane itself.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf rackAttack:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

From fallenpegasus at gmail.com Wed Sep 13 19:22:55 2017
From: fallenpegasus at gmail.com (Mark Atwood)
Date: Wed, 13 Sep 2017 19:22:55 +0000
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID:

NetBSD6 still supported, so it's still running in the wild.

I know we've removed most compatibility shims, but are they all gone?
or do we still have a chunk of "if this OS, then define these missing
functions"?

..m

On Wed, Sep 13, 2017 at 12:16 PM Hal Murray wrote:

> > Is NetBSD 6 still under development? If so, we can send them a
> > bugreport.
>
> Development has moved on to 7 and soon to be 8.
>
> 6 is still supported which means security fixes get backported.
>
> --
> These are my opinions. I hate spam.

--
Mark Atwood
http://about.me/markatwood
+1-206-604-2198 Mobile & Signal

From hmurray at megapathdsl.net Wed Sep 13 19:37:56 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 13 Sep 2017 12:37:56 -0700
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: Message from Mark Atwood of "Wed, 13 Sep 2017 19:22:55 -0000."
Message-ID: <20170913193756.D693740605C@ip-64-139-1-69.sjc.megapath.net>

fallenpegasus at gmail.com said:
> I know we've removed most compatibility shims, but are they all gone?
> or do we still have a chunk of "if this OS, then define these missing
> functions"?

We have replacements for some non-POSIX string functions that are in many but
not all systems.

There are a bunch of OS specific ifdefs in ntp_sandbox.

I don't know of any work-arounds for stuff that should be in POSIX. There
could be some.

--
These are my opinions. I hate spam.

From esr at thyrsus.com Wed Sep 13 20:17:51 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 13 Sep 2017 16:17:51 -0400
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To:
References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170913201751.GA26408@thyrsus.com>

Mark Atwood via devel :
> NetBSD6 still supported, so it's still running in the wild.
>
> I know we've removed most compatibility shims, but are they all gone?
> or do we still have a chunk of "if this OS, then define these missing
> functions"?

I think the last OS-related shim where we define substitute code is gone.
It was the kludge for pre-10.12 Mac OS X, which turned out not to work because
(a) in some versions the system headers didn't match the docs, and (b) we got
a report that in some versions the set-time primitive doesn't work.

We still have some compatibility shims of a more superficial kind, supplying
things like strlcpy and friends if the native C library doesn't have them.

We also have some code that is conditionally disabled if the native OS's
features don't support it. This is true notably in the sandboxing code,
where privileges are dropped at startup.
--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

From fallenpegasus at gmail.com Wed Sep 13 22:31:31 2017
From: fallenpegasus at gmail.com (Mark Atwood)
Date: Wed, 13 Sep 2017 22:31:31 +0000
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170913201751.GA26408@thyrsus.com>
References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com>
Message-ID:

How much complexity would it add to add the missing fp functions in the same
way the strlcpy function is? It doesn't even have to be fully generic, just
if NetBSD6 etc.

If a BSD is still supported by its community, I'm not happy about dropping it.

..m

On Wed, Sep 13, 2017 at 1:17 PM Eric S. Raymond wrote:

> Mark Atwood via devel :
> > NetBSD6 still supported, so it's still running in the wild.
> >
> > I know we've removed most compatibility shims, but are they all gone?
> > or do we still have a chunk of "if this OS, then define these missing
> > functions"?
>
> I think the last OS-related shim where we define substitute code is gone.
> It was the kludge for pre-10.12 Mac OS X, which turned out not to work > because > (a) in some versions the system headers didn't match the docs, and (b) we > got > a report that in some versions the set-time primitive doesn't work. > > We still some compatibilty shims of a more suerficial kind, supplying > things > like strlcpy and friends if the native C library doesn't have them. > > We also have some code that is conditionally disabled if the native OS's > features don't support it. This is true notably in the sandboxing code, > where privileges are dropped at startup. > -- > Eric S. Raymond > > My work is funded by the Internet Civil Engineering Institute: > https://icei.org > Please visit their site and donate: the civilization you save might be > your own. > > > -- Mark Atwood http://about.me/markatwood +1-206-604-2198 Mobile & Signal -------------- next part -------------- An HTML attachment was scrubbed... URL: From gem at rellim.com Wed Sep 13 23:05:30 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 13 Sep 2017 16:05:30 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com> Message-ID: <20170913160530.34f86d26@spidey.rellim.com> Yo Mark! On Wed, 13 Sep 2017 22:31:31 +0000 Mark Atwood via devel wrote: > How much complexity would it add to add the missing fp functions in > the same way the strlcpy function is? I think all we need for NetBSD 6.1 is ldexpl(). 
Here is one way, a very slow way, to do it:

    long double ldexpl(long double value, int e)
    {
      if (value == 0 || value == INFINITY || value == -INFINITY || value != value)
      {
        // Return +0.0/-0.0, +INF/-INF and NaN as-is
      }
      else
      {
        while (e > 0)
          value = value * 2, e--;
        while (e < 0)
          value = value * 0.5f, e++; // won't round denormals correctly
      }
      return value;
    }

Ripped from: https://github.com/alexfru/SmallerC/blob/master/v0100/srclib/ldexp.c

NTPsec only uses 32 and -32 values for 'e', so some simplification possible. The INF tests should likely be replaced with isfinite(). RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From fallenpegasus at gmail.com Wed Sep 13 23:15:10 2017 From: fallenpegasus at gmail.com (Mark Atwood) Date: Wed, 13 Sep 2017 23:15:10 +0000 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170913160530.34f86d26@spidey.rellim.com> References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com> <20170913160530.34f86d26@spidey.rellim.com> Message-ID: We could just grab from NetBSD7. Or if we know it's an IEEE754 float, just do the direct bit ops. Or the direct fp cpu op. ..m On Wed, Sep 13, 2017 at 4:05 PM Gary E. Miller via devel wrote: > Yo Mark! > > On Wed, 13 Sep 2017 22:31:31 +0000 > Mark Atwood via devel wrote: > > > How much complexity would it add to add the missing fp functions in > > the same way the strlcpy function is? > > I think all we need for NetBSD 6.1 is ldexpl().
> > Here is one way, a very slow way, to do it: > > long double ldexpl(long double value, int e) > { > if (value == 0 || value == INFINITY || value == -INFINITY || value != > value) > { > // Return +0.0/-0.0, +INF/-INF and NaN as-is > } > else > { > while (e > 0) > value = value * 2, e--; > while (e < 0) > value = value * 0.5f, e++; // won't round denormals correctly > } > return value; > } > > Ripped from: > > https://github.com/alexfru/SmallerC/blob/master/v0100/srclib/ldexp.c > > NTPsec only uses 32 and -32 values for 'e', so some simplification > possible. > > The INF tests should likely be replaced with isfinite(). > > RGDS > GARY > --------------------------------------------------------------------------- > Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 > gem at rellim.com Tel:+1 541 382 8588 <(541)%20382-8588> > > Veritas liberabit vos. -- Quid est veritas? > "If you can?t measure it, you can?t improve it." - Lord Kelvin > _______________________________________________ > devel mailing list > devel at ntpsec.org > http://lists.ntpsec.org/mailman/listinfo/devel -- Mark Atwood http://about.me/markatwood +1-206-604-2198 Mobile & Signal -------------- next part -------------- An HTML attachment was scrubbed... URL: From gem at rellim.com Wed Sep 13 23:27:28 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 13 Sep 2017 16:27:28 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com> <20170913160530.34f86d26@spidey.rellim.com> Message-ID: <20170913162728.60ea0723@spidey.rellim.com> Yo Mark! On Wed, 13 Sep 2017 23:15:10 +0000 Mark Atwood wrote: > We could just grab from NetBSD7. Nope, that is very low level FPU assembly code. Very arch dependent. Usually buried deep in the C compiler as a builtin. Just look at gcc for all the arch options it has for floating point! 
> Or if we know it's an IEEE754 > float, just do the direct bit ops. Sort of a float, it is a long double. The IEEE754 does not specify how big in memory a long double is. It may be 80 bits, or 128 bits, or? IEEE754 also does not specify how the bits are arranged in memory. > Or the direct fp cpu op. Assuming you even have a long double FPU. Remember this has to support various, ARM, i386, amd64, MIPS, sparc, etc. you may not even have any FPU. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Thu Sep 14 00:36:47 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 13 Sep 2017 17:36:47 -0700 Subject: comments on adjusting the clock Message-ID: <20170914003647.E1DD340605C@ip-64-139-1-69.sjc.megapath.net> (I'll be out for the next day or so, so don't expect prompt answers. Sorry this isn't cleaner. Gotta run.) There may be bogus stuff in here. Measure twice, edit once. I think the whole doubletime_t was a wild goose chase. The claimed reason was precision. A double has 53 bits. We are interested in adjustments, not absolute values. If we are taking a huge adjustment (31 bits), that still leaves 20 bits of fractional second. That's microseconds. I think that's good enough to get started. Most adjustments are small fractions of a second. If we start with 53 bits, we will be throwing away almost half of them when we convert to l_fp. Is there any reason we are using doubles? Why not l_fp? (There may be some statistical math that is easier with floats.)
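As a rough check of the bit-counting above, here is a minimal sketch (not from the thread; ulp_of is a hypothetical helper name) that computes the spacing between adjacent doubles near a given magnitude. It assumes a binary floating-point double, as on IEEE 754 platforms, and uses only <float.h> so that it does not depend on libm.

```c
#include <float.h>

/*
 * Spacing between x and the next larger double (one "unit in the last
 * place"), for positive, normal, finite x.  Assumes a binary
 * floating-point double (FLT_RADIX == 2), as on IEEE 754 platforms.
 * Returns 0.0 for zero, negative, NaN or infinite arguments; results
 * for subnormal x are not meaningful.
 */
static double ulp_of(double x)
{
    double u = DBL_EPSILON;     /* spacing of doubles in [1.0, 2.0) */

    if (!(x > 0.0) || x > DBL_MAX)
        return 0.0;
    while (x >= 2.0) {          /* scale x down into [1.0, 2.0) ... */
        x /= 2.0;
        u *= 2.0;               /* ... doubling the spacing each step */
    }
    while (x < 1.0) {           /* or scale it up, halving the spacing */
        x *= 2.0;
        u /= 2.0;
    }
    return u;
}
```

With 53-bit doubles, ulp_of(2147483648.0) is 2^-21 seconds, a bit under half a microsecond, which matches the "that's microseconds" estimate for an adjustment of about 2^31 seconds. ulp_of(3600.0) is 2^-41 seconds, roughly 0.45 picoseconds.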
Maybe we could carry a parallel version in l_fp if we want to preserve precision. --------- There are two modules that both adjust the system clock. libntp/clockwork.c and libntp/systime.c There are two ways to adjust the clock: adj_systime is in ./libntp/systime.c and ntp_adjtime_ns is in ./libntp/clockwork.c I haven't figured out what's going on. It doesn't look good. The doubletime_t stuff is only used in one of them. All the calls to adj_systime pass in adjtime. Why? (I assume history that nobody has cleaned up yet.) --------- ./include/ntp_syscall.h checks STA_NANO and defines ntp_error_in_seconds The comment mentions maxerror and esterror It's only used in ntpd/ntp_control.c, but there it is used 5 times. It calls ntp_adjtime directly, under HAVE_KERNEL_PLL I think it could just call ntp_adjtime_ns ---------- There is some ugly stuff in start_kern_loop involving ntp_adjtime_error_handler I think it's trying to make a run time test to see if ntp_adjtime actually works. I'm a bit surprised there isn't a cleaner way. -------- ntp_adjtime ntp_error_in_seconds libntp/systime.c adj_systime libntp/clockwork.c ntp_adjtime_ns called by ntp_loopfilter ntp_adjtime_error_handler I think it's for in-kernel PLL pll_trap -- These are my opinions. I hate spam. From esr at thyrsus.com Thu Sep 14 00:52:35 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 13 Sep 2017 20:52:35 -0400 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170913162728.60ea0723@spidey.rellim.com> References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com> <20170913160530.34f86d26@spidey.rellim.com> <20170913162728.60ea0723@spidey.rellim.com> Message-ID: <20170914005235.GA4556@thyrsus.com> Gary E. Miller via devel : > Yo Mark! > > On Wed, 13 Sep 2017 23:15:10 +0000 > Mark Atwood wrote: > > > We could just grab from NetBSD7. > > Nope, that is very low level FPU assembly code. Very arch dependent. 
> Usually buried deep in the C compiler as a builtin. > > Just look at gcc for all the arch options it has for floating point! > > > Or if we know it's an IEEE754 > > float, just do the direct bit ops. > > Sort of a float, it is a long double. The IEEE754 does not specify how > big in memory a long double is. It may be 80 bits, or 128 bits, or? > > IEEE754 also does not specify how the bits are arranged in memory. > > > Or the direct fp cpu op. > > Assuming you even have a long double FPU. Remember this has to > support various, ARM, i386, amd64, MIPS, sparc, etc. you may not even > have any FPU. Mark: Rolling our own ldexpl is getting into the territory of "really bad ideas that raise the hair on my neck". The reason I'm getting that feeling is that ldexpl is unlike - say - strlcpy in an important way. Floating-point code is a *notorious* defect attractor. It's infamously difficult to even test it in a way that catches all its edge cases, especially cross-architecture. I can all too easily see us committing to this, then spending an amount of maintenance effort that diverges to infinity on code that is never quite right, constantly delivering a trickle of unpleasant low-level surprises. In principle, things could be different. If we had a dev with a strong background in FP code and numerical analysis (say, as strong in that domain as Daniel is in security/crypto), I might consider a homebrew emulation a risk worth taking for an OS version that is minor-platform and aging out rapidly. As it is, given the team we have, I'm going to strongly recommend not going there. We're good, but we're not good enough at *this*. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From gem at rellim.com Thu Sep 14 02:23:32 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 13 Sep 2017 19:23:32 -0700 Subject: =?UTF-8?B?4pyYTk1FQS9QUFM=?= driver #20 fix for issue #62 Message-ID: <20170913192332.0f462e4e@spidey.rellim.com> Yo All! I have a provisional fix for issue #62: https://gitlab.com/NTPsec/ntpsec/issues/62 In the NMEA refclock (#20), when PPS is lost, the peer jitter was left at the low value calculated for PPS. This jitter is then used for NMEA. So the NMEA time is wrongly thought to be PPS precise. This caused the selection algorithm to wrongly pick NMEA over other better sources. The fix is to reset the jitter for the refclock to default when PPS is lost. A try at a fix is in commit 7d49f80d If you are using the NMEA driver with PPS, please test. I'll also test. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From fallenpegasus at gmail.com Thu Sep 14 18:55:18 2017 From: fallenpegasus at gmail.com (Mark Atwood) Date: Thu, 14 Sep 2017 18:55:18 +0000 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170914005235.GA4556@thyrsus.com> References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com> <20170913160530.34f86d26@spidey.rellim.com> <20170913162728.60ea0723@spidey.rellim.com> <20170914005235.GA4556@thyrsus.com> Message-ID: Fair enough. We should still feed a bug report to NetBSD6, maybe one of their FP guys will patch it in. 
And we drop NetBSD6 now, because of that lack. ..m On Wed, Sep 13, 2017 at 5:52 PM Eric S. Raymond via devel wrote: > Gary E. Miller via devel : > > Yo Mark! > > > > On Wed, 13 Sep 2017 23:15:10 +0000 > > Mark Atwood wrote: > > > > > We could just grab from NetBSD7. > > > > Nope, that is very low level FPU assembly code. Very arch dependent. > > Usually buried deep in the C compiler as a builtin. > > > > Just look at gcc for all the arch options it has for floating point! > > > > > Or if we know it's an IEEE754 > > > float, just do the direct bit ops. > > > > Sort of a float, it is a long double. The IEEE754 does not specify how > > big in memory a long double is. It may be 80 bits, or 128 bits, or? > > > > IEEE754 also does not specify how the bits are arranged in memory. > > > > > Or the direct fp cpu op. > > > > Assuming you even have a long double FPU. Remember this has to > > support various, ARM, i386, amd64, MIPS, sparc, etc. you may not even > > have any FPU. > > Mark: Rolling our own lldexp is getting into the territory of "really > bad ideas that raise the hair on my neck". > > The reason I'm getting that feeling is that lldexp is unlike - say - > strlcpy > in an important way. Floating-point code is a *notorious* defect > attractor. > It's infamously difficult to even test it in a way that catches all its > edge > cases, especially cross-architecture. > > I can all too easily see us committing to this, then spending an > amount of maintainence effort that diverges to infinity on code that > is never quite right, constantly delivering a trickle of unpleasant > low-level surprises. > > In principle, things could be different. If we had a dev with a strong > background in FP code and numerical analysis (say, as strong in that > domain as Daniel is in security/crypto), I might consider a homebrew > emulation a risk worth taking for an OS version that is minor-platform > and aging out rapidly. 
> > As it is, give the team we have, I'm going to strongly recommend not going > there. We're good, but we're no good enough at *this*. > -- > Eric S. Raymond > > My work is funded by the Internet Civil Engineering Institute: > https://icei.org > Please visit their site and donate: the civilization you save might be > your own. > > > _______________________________________________ > devel mailing list > devel at ntpsec.org > http://lists.ntpsec.org/mailman/listinfo/devel -- Mark Atwood http://about.me/markatwood +1-206-604-2198 Mobile & Signal -------------- next part -------------- An HTML attachment was scrubbed... URL: From gem at rellim.com Thu Sep 14 19:04:04 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 14 Sep 2017 12:04:04 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com> <20170913160530.34f86d26@spidey.rellim.com> <20170913162728.60ea0723@spidey.rellim.com> <20170914005235.GA4556@thyrsus.com> Message-ID: <20170914120404.34384ad2@spidey.rellim.com> Yo Mark! On Thu, 14 Sep 2017 18:55:18 +0000 Mark Atwood wrote: > Fair enough. We should still feed a bug report to NetBSD6, maybe one > of their FP guys will patch it in. And we drop NetBSD6 now, because > of that lack. Done. Awaiting the issue number. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Thu Sep 14 19:25:46 2017 From: gem at rellim.com (Gary E. 
Miller) Date: Thu, 14 Sep 2017 12:25:46 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170914120404.34384ad2@spidey.rellim.com> References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com> <20170913160530.34f86d26@spidey.rellim.com> <20170913162728.60ea0723@spidey.rellim.com> <20170914005235.GA4556@thyrsus.com> <20170914120404.34384ad2@spidey.rellim.com> Message-ID: <20170914122546.2a467745@spidey.rellim.com> Yo All! > Done. Awaiting the issue number. Just in: Thank you very much for your problem report. It has the internal identification `lib/52541'. The individual assigned to look at your report is: lib-bug-people. >Category: lib >Responsible: lib-bug-people >Synopsis: ldexpl() missing from NetBSD 6 >Arrival-Date: Thu Sep 14 19:05:00 +0000 2017 RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Thu Sep 14 19:30:05 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 14 Sep 2017 15:30:05 -0400 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com> <20170913160530.34f86d26@spidey.rellim.com> <20170913162728.60ea0723@spidey.rellim.com> <20170914005235.GA4556@thyrsus.com> Message-ID: <20170914193005.GA21945@thyrsus.com> Mark Atwood : > Fair enough. We should still feed a bug report to NetBSD6, maybe one of > their FP guys will patch it in. And we drop NetBSD6 now, because of that > lack. 
Checking...Matt already did the NetBSD 6 drop. Any volunteer to file the NetBSD bug? I don't know their procedures. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Thu Sep 14 19:31:18 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 14 Sep 2017 15:31:18 -0400 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170914120404.34384ad2@spidey.rellim.com> References: <20170913191641.94A2040605C@ip-64-139-1-69.sjc.megapath.net> <20170913201751.GA26408@thyrsus.com> <20170913160530.34f86d26@spidey.rellim.com> <20170913162728.60ea0723@spidey.rellim.com> <20170914005235.GA4556@thyrsus.com> <20170914120404.34384ad2@spidey.rellim.com> Message-ID: <20170914193118.GB21945@thyrsus.com> Gary E. Miller via devel : > Yo Mark! > > On Thu, 14 Sep 2017 18:55:18 +0000 > Mark Atwood wrote: > > > Fair enough. We should still feed a bug report to NetBSD6, maybe one > > of their FP guys will patch it in. And we drop NetBSD6 now, because > > of that lack. > > Done. Awaiting the issue number. OK, that'll teach me to read *all* my mail before replying. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From fw at fwright.net Thu Sep 14 21:46:56 2017 From: fw at fwright.net (Fred Wright) Date: Thu, 14 Sep 2017 14:46:56 -0700 (PDT) Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h Message-ID: On Wed, 13 Sep 2017, Hal Murray via devel wrote: > I think the whole doubletime_t was a wild goose chase. > > The claimed reason was precision. A double has 53 bits. We are interrested > in adjustments, not absolute values. 
If we are taking a huge adjustment (31 > bits), that still leaves 20 bits of fractional second. That's microseconds. > I think that's good enough to get started. Most adjustments are small > fractions of a second. If we start with 53 bits, we will be throwing away > almost half of them when we convert to l_fp. Indeed. For time *deltas*, ordinary double precision is perfectly adequate. E.g., a one-hour delta has sub-picosecond precision, a one-week delta has sub-nanosecond precision, etc. If your clock is off by a month and you're worrying about nanoseconds, then you're misguided. :-) Ordinary doubles are indeed inadequate for "absolute" time values (i.e. values relative to any plausible epoch). But long doubles don't fix this, because the C language doesn't really take them seriously, and doesn't guarantee that they differ from standard doubles at all. If you *need* precision beyond a double, your only portable recourse is to avoid floating-point altogether. To put it another way, in order to use long doubles, you need to:
1) Accept that you only get extra precision opportunistically, as a function of the platform (possibly no extra precision at all).
2) Accept that when the added precision *is* available, all related operations will be significantly slower.
3) Either give up on regression testing, or make said testing explicitly platform-specific.
IMO, if a proper cost-benefit analysis of the use of long doubles in the NTP context were conducted, it would result in a resounding thumbs down. On Wed, 13 Sep 2017, Eric S. Raymond via devel wrote: > Mark: Rolling our own lldexp is getting into the territory of "really > bad ideas that raise the hair on my neck". Well, the underlying bad idea is using long doubles at all. :-) Once one doesn't do that, the issue is moot. On Wed, 13 Sep 2017, Mark Atwood via devel wrote: > It doesnt even have to be fully generic, just if NetBSD6 etc. If a BSD is > still supported by it's community, I'm not happy about dropping it.
All the fuss over long doubles has distracted folks from a more legitimate issue with NetBSD 6.1.5, which is that python-config returns a nonworking build setup for the C extension. But a workaround should be possible, and it's only in the build procedure, not the code. On Wed, 13 Sep 2017, Eric S. Raymond via devel wrote: > Mark Atwood via devel : > > NetBSD6 still supported, so it's still running in the wild. > > > > I know we've been removed most compatibility shims, but are they all gone? > > or do we still have a chunk of "if this OS, then define these missing > > functions"? > > I think the last OS-related shim where we define substitute code is gone. > It was the kludge for pre-10.12 Mac OS X, which turned out not to work because > (a) in some versions the system headers didn't match the docs, and (b) we got > a report that in some versions the set-time primitive doesn't work. Well, *that* fallback code was broken in multiple ways, anyway, as was the comparable code in GPSD before I fixed it. With *correct* fallback code, (a) and (b) are both inapplicable. Limiting support to a OS version that's not even a year old is rather heavy-handed, especially when there isn't a really good reason for it. And the same fallback that works for 10.11 works at least as far back as 10.5 (all of which are supported by classic ntpd, BTW). MR coming, now that I've tested the code on 10 different "machines". Now I can get back to fixing the multitude of Python library issues. Fred Wright From hmurray at megapathdsl.net Thu Sep 14 23:13:43 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 14 Sep 2017 16:13:43 -0700 Subject: python-config on NetBSD 6 Message-ID: <20170914231343.785ED406061@ip-64-139-1-69.sjc.megapath.net> > All the fuss over long doubles has distracted folks from a more legitimate > issue with NetBSD 6.1.5, which is that python-config returns a nonworking > build setup for the C extension. 
But a workaround should be possible, and > it's only in the build procedure, not the code. Could you please say more. What doesn't work and/or what is the nature of the fix? Is the problem specific to NetBSD (6 or otherwise)? My NetBSD setup notes include:
  ln -s /usr/pkg/bin/python2.7 /usr/pkg/bin/python
  ln -s /usr/pkg/bin/python2.7 /usr/pkg/bin/python2
  ln -s /usr/pkg/lib/libpython2.7.so.1.0 /usr/lib/libpython2.7.so.1.0
Would that cover what you are describing? -- These are my opinions. I hate spam. From gem at rellim.com Thu Sep 14 23:16:57 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 14 Sep 2017 16:16:57 -0700 Subject: python-config on NetBSD 6 In-Reply-To: <20170914231343.785ED406061@ip-64-139-1-69.sjc.megapath.net> References: <20170914231343.785ED406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170914161639.5231a5f2@spidey.rellim.com> Yo Hal! On Thu, 14 Sep 2017 16:13:43 -0700 Hal Murray via devel wrote: > > All the fuss over long doubles has distracted folks from a more > > legitimate issue with NetBSD 6.1.5, which is that python-config > > returns a nonworking build setup for the C extension. But a > > workaround should be possible, and it's only in the build > > procedure, not the code. > > Could you please say more. I wish I could say more, but you, not I, have a working NetBSD 6 system. So have at it. I just got the ball rolling, but I can't push it more. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From hmurray at megapathdsl.net Thu Sep 14 23:36:49 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 14 Sep 2017 16:36:49 -0700 Subject: Pivot cruft in step_systime Message-ID: <20170914233649.12062406061@ip-64-139-1-69.sjc.megapath.net> Can we get rid of it if very early in the startup sequence, we force the system time to be at least the build time? That is, if current time is less than build time, set the time to build time. (no network involved) -- These are my opinions. I hate spam. From gem at rellim.com Thu Sep 14 23:41:03 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 14 Sep 2017 16:41:03 -0700 Subject: Pivot cruft in step_systime In-Reply-To: <20170914233649.12062406061@ip-64-139-1-69.sjc.megapath.net> References: <20170914233649.12062406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170914164103.059595dc@spidey.rellim.com> Yo Hal! On Thu, 14 Sep 2017 16:36:49 -0700 Hal Murray via devel wrote: > Can we get rid of it if very early in the startup sequence, we force > the system time to be at least the build time? > > That is, if current time is less than build time, set the time to > build time. (no network involved) Nice idea, but prolly breaks some regression tests. Maybe for after 1.0? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Fri Sep 15 01:26:46 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Thu, 14 Sep 2017 21:26:46 -0400 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: References: Message-ID: <20170915012646.GA495@thyrsus.com> Fred Wright via devel : > IMO, if a proper cost-benefit analysis of the use of long doubles in the > NTP context were conducted, it would result in a resounding thumbs down. Thank you, Fred. I found your contribution measured and valuable even though I'm not certain I understood all of the issues you were raising. :-) I'm not wedded to using long doubles. It was a direct response to this issue report from gemiller: https://gitlab.com/NTPsec/ntpsec/issues/270 The new, reverted code, in step_systime() has a loss of precision:
    fp_sys = dtolfp(sys_residual);
    fp_ofs = dtolfp(step);
    fp_ofs += fp_sys;
sys_residual and step are double and only have 53 bits of precision. But the l_fp needs 64 bits of precision, arguably 65 bits after 2026. Initial steps may be large, such as when a host has no valid RTC and thinks the current time is 1Jan70. The C standard does not specify the precision of a double. C99 Annex F makes IEEE754 compliance optional and very few C compilers are fully IEEE754 compliant. C doubles may be 24 bits, 53 bits, or something else. Only rarely would a C double be able to hold a 65 bit number without loss of precision. Best to avoid any doubt about precision and perform all the computations as long double or better as timespec(64). The fix might be increasing the precision of sys_residual and step before calling step_systime(). timespec(64) is my notation for a timespec containing a time64_t tv_sec and long tv_nsec. The replaced code used timespec(64) on 64 bit binaries and thus worked well past 2200. When I posted the fix I wrote this: There are a couple of different issues tangled together here. Let's do proper separation of concerns before trying anything risky.
As a first step, I've addressed the concern about loss of precision in what I think is a simpler way than changing the argument signature of step_systime() away from using a float type (that might be a good idea but it's a separate discussion). Since the underlying problem seems to be that step and sys_residual have a float type that doesn't fully cover the range of l_fp, I've fixed that. There's now a doubletime_t typedef that is long double and thus a minimum of 80 bits (except under Microsoft C but who cares). This easily handles the full range of l_fp. I've tweaked all the appropriate type converters and tried to use doubletime_t everywhere that the full precision of an l_fp is required. Please review this change carefully, hunting for any places I might have missed where double variables need to become doubletime_t. My goal is for all the floating-point time operations requiring that full range to use this type. Pivoting is a separate concern - there be dragons at that edge of the map. We have a note about that in devel/TODO so it doesn't need to be tracked by this issue, which I want to get closed for 1.0. Please reopen this if you find any changes required for the doubletime_t cleanup. That was the last comment in the bug thread. I chose the recommendation to move to long double because I was (and still am) trying to narrow the footprint of the NTP homebrew types. There are several reasons I want to do this, all basically long-term ones involving gradual reduction of global complexity in several places where it's still pretty bad. That goal can be traded away, but I want to have a clearer idea of why the trade is necessary before I do it. Also feature freeze is supposed to be tomorrow and my reluctance to do changes that might be subtly destabilizing is going to rise dramatically.
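Whether long double really provides the 64 significand bits that a full l_fp value needs depends on the compiler and platform, so the assumption behind a long double doubletime_t can be checked rather than taken on faith. A minimal sketch using only <float.h>; the helper names and the LFP_BITS macro are hypothetical and not part of NTPsec:

```c
#include <float.h>

/* Bits in an l_fp: 32 bits of seconds plus 32 bits of fraction. */
#define LFP_BITS 64

/*
 * Return 1 if a floating-point type with mant_dig significand digits
 * can hold every LFP_BITS-bit value exactly, else 0.  The *_MANT_DIG
 * macros count base-FLT_RADIX digits, so they are only a bit count
 * when FLT_RADIX is 2; for any other radix this conservatively
 * returns 0.
 */
static int covers_lfp(int mant_dig)
{
    return FLT_RADIX == 2 && mant_dig >= LFP_BITS;
}

/* Plain double: 53 bits on IEEE 754 platforms, so normally 0. */
static int double_covers_lfp(void)
{
    return covers_lfp(DBL_MANT_DIG);
}

/*
 * long double: 64 bits with x87 extended precision, but only 53 on
 * platforms where long double is the same format as double.
 */
static int long_double_covers_lfp(void)
{
    return covers_lfp(LDBL_MANT_DIG);
}
```

A build could call long_double_covers_lfp() from a startup self-test or a configure probe. With a C11 compiler the same condition could instead be a _Static_assert on LDBL_MANT_DIG, though that goes beyond C99.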
I am not enough of a floating-point guru to really evaluate or critique Gary's arguments about the original loss of precision, nor to judge the efficacy of his fix, nor to understand Fred's assertion that the applied fix is somehow useless. I just followed Gary's instructions with my "is this an invariant-breaker?" sensors turned up to max gain. That seemed to have been sufficient; the code worked. Beyond that I admit to feeling pretty clueless about what's going on here. So, did I make an ignorant mistake? Can this fix be rescued? Is someone else better equipped than me for the rescue? (Translation: I'd really love to dump this mess on Fred or Gary.) > All the fuss over long doubles has distracted folks from a more legitimate > issue with NetBSD 6.1.5, which is that python-config returns a nonworking > build setup for the C extension. But a workaround should be possible, and > it's only in the build procedure, not the code. > Well, [the gettime(2)/settime(2)] fallback code was broken in > multiple ways, anyway, as was the comparable code in GPSD before I > fixed it. With *correct* fallback code, (a) and (b) are both > inapplicable. These are good things to know, but... > Limiting support to an OS version that's not even a year old is rather > heavy-handed, especially when there isn't a really good reason for it. > And the same fallback that works for 10.11 works at least as far back as > 10.5 (all of which are supported by classic ntpd, BTW). OK. Fred, our convention here is that Mark decides porting scope on considered advice from the senior devs. We treat him as the product strategist even though we're not working inside a corporate structure where that makes obvious sense, simply because he's good at the view from $30Kft and knows where a lot of the corporate bodies are buried. Final decision will be his.
That said, I'm going to push - not hard, not hill-to-die-on, just moderately - for remaining strict about our C99 conformance policy and culling old releases/minor platforms that can't meet it. A significant part of *my* job as architect is to defend us against complexity creep. Of course what I'm actually defending us against is an increase in expected defect rates, but everyone here understands that link. I want to ditch the NetBSD 6 and old MacOS shims because they're defect attractors. Not only that: by visibly compromising on our "C99 or GTFO" policy we're legitimizing future exceptions and "It's just one little shim. What harm can it do?" which can come around to bite us in the ass. These compromises have a way of accumulating that explains a lot of the sorry state this code was in when we forked it. Maybe Mark makes a product-strategy decision to eat that risk, but if we do it was still my responsibility to be the guy who pulls against it. One final point: I actually think we're in a better position than most projects to be "harsh", as you put it. Security people get it about reducing attack surface; when you're trying to justify snipping off these old warts even though someone is inconvenienced, that's the closest thing you'll ever find to a sovereign excuse. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Fri Sep 15 01:30:45 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 14 Sep 2017 21:30:45 -0400 Subject: Pivot cruft in step_systime In-Reply-To: <20170914233649.12062406061@ip-64-139-1-69.sjc.megapath.net> References: <20170914233649.12062406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170915013045.GB495@thyrsus.com> Hal Murray via devel : > > Can we get rid of it if very early in the startup sequence, we force the > system time to be at least the build time?
> > That is, if current time is less than build time, set the time to build time. > (no network involved) Ah, you're thinking of this as a way to get us into the right half of the next era after a wraparound? -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Fri Sep 15 01:31:35 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 14 Sep 2017 21:31:35 -0400 Subject: Pivot cruft in step_systime In-Reply-To: <20170914164103.059595dc@spidey.rellim.com> References: <20170914233649.12062406061@ip-64-139-1-69.sjc.megapath.net> <20170914164103.059595dc@spidey.rellim.com> Message-ID: <20170915013135.GC495@thyrsus.com> Gary E. Miller via devel : > Yo Hal! > > On Thu, 14 Sep 2017 16:36:49 -0700 > Hal Murray via devel wrote: > > > Can we get rid of it if very early in the startup sequence, we force > > the system time to be at least the build time? > > > > That is, if current time is less than build time, set the time to > > build time. (no network involved) > > Nice idea, but prolly breaks some regression tests. Maybe for after 1.0? Concur. Doesn't seem to me to be urgent. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From gem at rellim.com Fri Sep 15 01:32:26 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 14 Sep 2017 18:32:26 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170915012646.GA495@thyrsus.com> References: <20170915012646.GA495@thyrsus.com> Message-ID: <20170914183226.64ffedbc@spidey.rellim.com> Yo Eric! On Thu, 14 Sep 2017 21:26:46 -0400 "Eric S.
Raymond via devel" wrote: > Fred Wright via devel : > > IMO, if a proper cost-benefit analysis of the use of long doubles > > in the NTP context were conducted, it would result in a resounding > > thumbs down. > Best to avoid any doubt about precision and perform all the > computations as long double or better as timespec(64). I'm all in favor of timespec(64). Just not this month. > That said, I'm going to push - not hard, not hill-to-die-on, just > moderately - for remaining strict about our C99 conformance policy > and culling old releases/minor platforms that can't meet it. +1. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From fw at fwright.net Fri Sep 15 02:28:17 2017 From: fw at fwright.net (Fred Wright) Date: Thu, 14 Sep 2017 19:28:17 -0700 (PDT) Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170915012646.GA495@thyrsus.com> References: <20170915012646.GA495@thyrsus.com> Message-ID: On Thu, 14 Sep 2017, Eric S. Raymond wrote: > So, did I make an ignorant mistake? Can this fix be rescued? Is > someone else better equipped than me for the rescue? (Translation: > I'd really love to dump this mess on Fred or Gary.) The point I was trying to make is that C doesn't promise that long double is any better than double, so it should only be used in cases where the added precision is "nice to have" rather than mandatory, and where one doesn't mind getting platform-specific results from regression tests. > OK. Fred, our convention here is that Mark decides porting scope on > considered advice from the senior devs.
We treat him as the product > strategist even though we're not working inside a corporate structure > where that makes obvious sense, simply because he's good at the view > from $30Kft and knows where a lot of the corporate bodies are buried. > Final decision will be his. In an earlier post, Mark stated that he preferred to keep NetBSD6 support if possible because it's still recent enough to get security fixes. If the same criterion is applied to OSX, then 10.10 and 10.11 should be supported. > I want to ditch the NetBSD 6 and old MacOS shims because they're > defect attractors. Not only that: by visibly compromising on our "C99 > or GTFO" policy we're legitimizing future exceptions and "It's just > one little shim. What harm can it do?" which can come around to bite > us in the ass. Perhaps, but a piece of code inside a compile-time conditional which is false on all platforms considered "important" is extremely unlikely to bite a platform considered "important". It does impact readability, though that could be fixed by putting it in a separate source. And strictly speaking, this isn't a "C99" issue. :-) > One final point: I actually think we're in a better position than most > projects to be "harsh", as you put it. Security people get it about > reducing attack surface; when you're trying to justify snipping off these > old warts even though someone is inconvenienced, that's the closest > thing you'll ever find to a sovereign excuse. A point that often gets missed is that upgrades themselves have a cost in that they often break things (especially with Apple). Forcing someone else to upgrade their OS to run your software seems attractive because the hassle is Somebody Else's Problem, but that's what Bruce Schneier would call an "externalization".
Fred Wright From fw at fwright.net Fri Sep 15 02:44:18 2017 From: fw at fwright.net (Fred Wright) Date: Thu, 14 Sep 2017 19:44:18 -0700 (PDT) Subject: python-config on NetBSD 6 In-Reply-To: <20170914231343.785ED406061@ip-64-139-1-69.sjc.megapath.net> References: <20170914231343.785ED406061@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Thu, 14 Sep 2017, Hal Murray via devel wrote: > > All the fuss over long doubles has distracted folks from a more legitimate > > issue with NetBSD 6.1.5, which is that python-config returns a nonworking > > build setup for the C extension. But a workaround should be possible, and > > it's only in the build procedure, not the code. > > Could you please say more. What doesn't work and/or what is the nature of > the fix? Is the problem specific to NetBSD (6 or otherwise)? It seems to affect OpenBSD 5.6 as well. > My NetBSD setup notes include: > ln -s /usr/pkg/bin/python2.7 /usr/pkg/bin/python > ln -s /usr/pkg/bin/python2.7 /usr/pkg/bin/python2 > ln -s /usr/pkg/lib/libpython2.7.so.1.0 /usr/lib/libpython2.7.so.1.0 > Would that cover what you are describing? Something along the lines of the third symlink might fix it, though that exact link didn't seem to. More investigation is needed. The basic problem is that python2.7-config --ldflags includes "-lpython2.7" but no "-L" to say where to find it. On most platforms, a suitable "-L" is included. Fred Wright From esr at thyrsus.com Fri Sep 15 04:46:07 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 15 Sep 2017 00:46:07 -0400 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: References: <20170915012646.GA495@thyrsus.com> Message-ID: <20170915044607.GA11307@thyrsus.com> Fred Wright via devel : > > On Thu, 14 Sep 2017, Eric S. Raymond wrote: > > > So, did I make an ignorant mistake? Can this fix be rescued? Is > > someone else better equipped than me for the rescue? (Translation: > > I'd really love to dump this mess on Fred or Gary.)
> > The point I was trying to make is that C doesn't promise that long double > is any better than double, so it should only be used in cases where the > added precision is "nice to have" rather than mandatory, and where one > doesn't mind getting platform-specific results from regression tests. Ahh, OK. That's much clearer. Now I can think about tradeoffs. Based on my research, here's what seems to be the case: 1. We are no worse off than before anywhere. The worst case is it's just normal double precision. 2. On x86_64, the usual precision of long double is 80-bit. In particular GCC long double is in the 80-bit camp. So is Mac OS X's. 3. Some BSDs (FreeBSD, OpenBSD) do the bad thing and drop to double precision. 4. I haven't been able to find anything definitive about ARM32. > > OK. Fred, our convention here is that Mark decides porting scope on > > considered advice from the senior devs. We treat him as the product > > strategist even though we're not working inside a corporate structure > > where that makes obvious sense, simply because he's good at the view > > from $30Kft and knows where a lot of the corporate bodies are buried. > > Final decision will be his. > > In an earlier post, Mark stated that he preferred to keep NetBSD6 support > if possible because it's still recent enough to get security fixes. If > the same criterion is applied to OSX, then 10.10 and 10.11 should be > supported. I'm absolutely going to count you as a senior dev. So, you for retaining, me and Gary for dropping. Hal and Matt's positions not known. Debate still ongoing. Mark leaning towards "keep". Noted. > > I want to ditch the NetBSD 6 and old MacOS shims because they're > > defect attractors. Not only that: by visibly compromising on our "C99 > > or GTFO" policy we're legitimizing future exceptions and "It's just > > one little shim. What harm can it do?" which can come around to bite > > us in the ass.
> > Perhaps, but a piece of code inside a compile-time conditional which is > false on all platforms considered "important" is extremely unlikely to > bite a platform considered "important". It does impact readability, > though that could be fixed by putting it in a separate source. That comes under "How can just one little shim hurt?" Over time the resulting code bloat and dust in the cracks gets bad. But you know this. > And strictly speaking, this isn't a "C99" issue. :-) True. :-) > > One final point: I actually think we're in a better position than most > > projects to be "harsh", as you put it. Security people get it about > > reducing attack surface; when you're trying to justify snipping off these > > old warts even though someone is inconvenienced, that's the closest > > thing you'll ever find to a sovereign excuse. > > A point that often gets missed is that upgrades themselves have a cost in > that they often break things (especially with Apple). Forcing someone > else to upgrade their OS to run your software seems attractive because the > hassle is Somebody Else's Problem, but that's what Bruce Schneier would > call an "externalization". A fair point. But...on the other hand, a major platform. Not by our criterion, which is more or less "Are flocks of these going to be running in $J_RANDOM_HUMONGOUS_DATACENTER?" Part of our strategy is to optimize for the toughest, highest-end users on the newest hardware. Classic can keep the hobbyists, we're after the people who can and sometimes actually do write big donation checks. We want them to trust us and love us and give us money and lend us engineers. That's the world Matt Selsky is seconded to us from; he works for a high-frequency-trading outfit in New York. Your attribution to Bruce suggests that you don't know where he picked up the term, and you are obviously the kind of person who likes to know such things. 
So, "Externality" is a term of art in economics: https://en.wikipedia.org/wiki/Externality -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From hmurray at megapathdsl.net Fri Sep 15 05:49:50 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 14 Sep 2017 22:49:50 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: Message from "Eric S. Raymond via devel" of "Fri, 15 Sep 2017 00:46:07 EDT." <20170915044607.GA11307@thyrsus.com> Message-ID: <20170915054950.1C24D40605C@ip-64-139-1-69.sjc.megapath.net> >> In an earlier post, Mark stated that he preferred to keep NetBSD6 support >> if possible because it's still recent enough to get security fixes. If >> the same criterion is applied to OSX, then 10.10 and 10.11 should be >> supported. >I'm absolutely going to count you as a senior dev. So, you for >retaining, me and Gary for dropping. Hal and Matt's positions >not known. >Debate still ongoing. Mark leaning towards "keep". Noted. I think we should dump long doubles. (I'm amazed that you didn't outlaw them as soon as Fred reported how screwed up they are.) That fixes that NetBSD 6 issue. (There may be python issues. It works for me, but I'm running with a few hack links and a PYTHONPATH.) > sys_residual and step are double and only have 53 bits of > precision. But the l_fp needs 64 bits of precision, arguably 65 bits > after 2036. Initial steps may be large, such as when a host has no Yes, l_fp is 64 bits, but the places where doubles are used are only deltas. If you have a delta under 1 second, that's 32 interesting bits in the right half of the l_fp and 0s in the left half. There is loss of precision when converting from double, but it's because the l_fp doesn't have enough low bits, not because a double isn't big enough. If we have 32 bits of fraction, that leaves 21 bits for seconds. That's 24 days.
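The bit budget above can be checked mechanically. A minimal sketch, assuming a double with a 53-bit significand (IEEE 754 binary64; C does not guarantee this, so the code reads DBL_MANT_DIG rather than hard-coding 53). The function names are hypothetical and exist only for illustration:

```c
#include <float.h>
#include <stdint.h>

/* Significand bits left for whole seconds once fraction_bits are
 * reserved for the fractional second. */
static int seconds_bits_left(int fraction_bits)
{
    return DBL_MANT_DIG - fraction_bits;
}

/* Largest delta, in days, whose whole seconds fit in seconds_bits. */
static double max_delta_days(int seconds_bits)
{
    return (double)(UINT64_C(1) << seconds_bits) / 86400.0;
}

/* Size of the smallest representable step, in nanoseconds, when
 * fraction_bits are available for the fractional second. */
static double resolution_ns(int fraction_bits)
{
    return 1e9 / (double)(UINT64_C(1) << fraction_bits);
}
```

With DBL_MANT_DIG equal to 53, seconds_bits_left(32) is 21 and max_delta_days(21) is about 24.3, which matches the 24-day figure. The same helpers can be used to see how the resolution degrades as more bits are spent on whole seconds.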
So there is no loss of precision if the delta is less than 24 days. There is loss of precision if you have a huge delta. I think the worst case is something like setting the time when the system clock hasn't been initialized and defaults to 1970. 31 bits of seconds is 67 years. That leaves 22 bits of fraction. That's better than a microsecond. The next adjustment will have plenty of precision. The code has been doing that forever. Nobody has complained. (or probably even noticed) It's still better than a microsecond if you want 32 bits of seconds. We can get 67 years from the build date if we use my set-real-early suggestion in another thread. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Sep 15 06:07:52 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 14 Sep 2017 23:07:52 -0700 Subject: What is the expected lifetime of code we ship? Message-ID: <20170915060752.A5F4740605C@ip-64-139-1-69.sjc.megapath.net> Suppose we release some code. Assume it is bug free so users are happy. How long do we expect it to run correctly? Would we be happy with 67 years from the build date? Do we know the limitations? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Sep 15 06:15:15 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 14 Sep 2017 23:15:15 -0700 Subject: Pivot cruft in step_systime In-Reply-To: Message from "Eric S. Raymond via devel" of "Thu, 14 Sep 2017 21:30:45 EDT." <20170915013045.GB495@thyrsus.com> Message-ID: <20170915061515.ED6C840605C@ip-64-139-1-69.sjc.megapath.net> >> Can we get rid of it if very early in the startup sequence, we force the >> system time to be at least the build time? > Ah, you're thinking of this as a way to get us into the right half of the > next era after a wraparound? I think it's a double win. It gets rid of that ugly pivot code. I think you call it a defect attractor. Even if it is correct, it makes the rest of the code harder to read. 
I think it cleanly sets things up so that our first jump works-right time range is relative to build time. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Sep 15 06:51:23 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 14 Sep 2017 23:51:23 -0700 Subject: python-config on NetBSD 6 In-Reply-To: Message from Fred Wright via devel of "Thu, 14 Sep 2017 19:44:18 PDT." Message-ID: <20170915065123.D4BBF40605C@ip-64-139-1-69.sjc.megapath.net> > The basic problem is that python2.7-config --ldflags includes "-lpython2.7" > but no "-L" to say where to find it. On most platforms, a suitable "-L" is > included. I don't know anything about that area, but your "most platforms" seems optimistic. NetBSD 7.1 (GENERIC.201703111743Z) -bash-4.4$ python2.7-config --ldflags -lpython2.7 -lutil -lm -Wl,--export-dynamic -bash-4.4$ NetBSD 6.1.5 (GENERIC) -bash-4.4$ python2.7-config --ldflags -lpython2.7 -lutil -lm -Wl,--export-dynamic -bash-4.4$ Fedora release 26 (Twenty Six) [murray at hgm ~]$ python2.7-config --ldflags -lpython2.7 -lpthread -ldl -lutil -lm -Xlinker -export-dynamic [murray at hgm ~]$ OpenBSD 6.0 (GENERIC.MP) #2319: Tue Jul 26 13:00:43 MDT 2016 -bash-4.3$ python2.7-config --ldflags -lpython2.7 -lpthread -lutil -lm -Wl,--export-dynamic -bash-4.3$ FreeBSD 11.0-RELEASE-p1 (GENERIC) #0 r306420: Thu Sep 29 01:43:23 UTC 2016 [murray at ted3 ~]$ python2.7-config --ldflags -L/usr/local/lib -lpython2.7 -L/usr/local/lib -lintl -lutil -lm -Wl,--export-dynamic [murray at ted3 ~]$ PRETTY_NAME="Debian GNU/Linux 9 (stretch)" murray at deb2:~$ python2.7-config --ldflags -L/usr/lib/python2.7/config-x86_64-linux-gnu -L/usr/lib -lpython2.7 -lpthread -ldl -lutil -lm -Xlinker -export-dynamic -Wl,-O1 -Wl,-Bsymbolic-functions murray at deb2:~$ PRETTY_NAME="Raspbian GNU/Linux 8 (jessie)" murray at wp0:~$ python2.7-config --ldflags -L/usr/lib/python2.7/config-arm-linux-gnueabihf -L/usr/lib -lpython2.7 -lpthread -ldl -lutil -lm -Xlinker -export-dynamic -Wl,-O1 
-Wl,-Bsymbolic-functions murray at wp0:~$ > It seems to affect OpenBSD 5.6 as well. My OpenBSD setup notes say: pkg_add python => python-2.7.11 cd /usr/local/bin/ ln -s python2.7 python ln -s python2.7 python2 -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Sep 15 07:10:51 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 15 Sep 2017 00:10:51 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: Message from "Eric S. Raymond via devel" of "Thu, 14 Sep 2017 21:26:46 EDT." <20170915012646.GA495@thyrsus.com> Message-ID: <20170915071051.7105840605C@ip-64-139-1-69.sjc.megapath.net> [Subject says NetBSD, but context has drifted to MacOS] > That said, I'm going to push - not hard, not hill-to-die-on, just moderately > - for remaining strict about our C99 conformance policy and culling old > releases/minor platforms that can't meet it. We aren't discussing C99 but rather the non-standardized way of playing with the clock. I know of two ways to handle that sort of problem. One is to use ifdefs like the current code. Being hard-nosed about C99 helps keep that sort of code clean. The other way is to have a separate module for each OS and link in the right one. That only works if you have a clean API for what the module has to do. The downside is that you usually have a lot of duplicated code, and if you make a change/fix in one place, you have to go check the other places, and remember to do it. This might be an appropriate time/place to use the separate module approach. The API is simple and clean. Ignoring MacOS, I think the current code has 2 branches. One is with ntp_adjtime or similar. The other is without it. I think that's currently ifdef-ed with HAVE_KERNEL_PLL. If ifdef-ing the no-ntp_adjtime module to support MacOS gets too ugly, we could use a separate module. By the way, "HAVE_KERNEL_PLL" is a horrible name. There is no PLL involved. There are two features we want. One is to be able to slew the clock.
The other is to tweak the clock frequency, aka drift. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Sep 15 07:38:14 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 15 Sep 2017 00:38:14 -0700 Subject: Pivot cruft in step_systime Message-ID: <20170915073814.18BA140605C@ip-64-139-1-69.sjc.megapath.net> > Nice idea, but prolly breaks some regression tests. Maybe for after 1.0? Do we have any tests for that area? -- These are my opinions. I hate spam. From ianbruene at gmail.com Fri Sep 15 16:30:07 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Fri, 15 Sep 2017 11:30:07 -0500 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170915044607.GA11307@thyrsus.com> References: <20170915012646.GA495@thyrsus.com> <20170915044607.GA11307@thyrsus.com> Message-ID: <846b1ff3-5240-9bc7-030c-331e540990b2@gmail.com> On 09/14/2017 11:46 PM, Eric S. Raymond via devel wrote: > A fair point. But...on the other hand, a major platform. Not by our > criterion, which is more or less "Are flocks of these going to be > running in $J_RANDOM_HUMONGOUS_DATACENTER?" Part of our strategy is > to optimize for the toughest, highest-end users on the newest > hardware. Classic can keep the hobbyists, we're after the people who > can and sometimes actually do write big donation checks. We want them > to trust us and love us and give us money and lend us engineers. As I understood it part of the rationale for NTPsec was the yawning security chasms in NTPclassic. Shouldn't wide adoption therefore be highly desirable? Possible answer: Aunt Tillie is not going to hunt down NTPsec and install it, we can't get any real foothold in the diffuse installed base. But NTPsec *can* get into new OS releases and big server farms. > Your attribution to Bruce suggests that you don't know where he picked up the > term, and you are obviously the kind of person who likes to know such things.
> So, "Externality" is a term of art in economics: > > https://en.wikipedia.org/wiki/Externality @Fred Wright Along this line of thought you should look up Coase's Theorem; there is a tremendous amount of generative value in understanding it. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From esr at thyrsus.com Fri Sep 15 17:40:21 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 15 Sep 2017 13:40:21 -0400 Subject: Adoption strategy Message-ID: <20170915174021.GA10687@thyrsus.com> (Lifted from the thread on ldexpl) Ian Bruene via devel : > As I understood it part of the rationale for NTPsec was the yawning security > chasms in NTPclassic. Shouldn't wide adoption therefore be highly desirable? > > Possible answer: Aunt Tillie is not going to hunt down NTPsec and install > it, we can't get any real foothold in the diffuse installed base. But NTPsec > *can* get into new OS releases and big server farms. Yeah, that's it, basically. The run-up to the 1.0 release is I think a good time to discuss our adoption strategy. The fact that we now seem to be onboarding a new senior dev - Fred Wright, who some of us know as an extremely effective contributor to GPSD - adds value to the exercise. It's Mark's job to make the big decisions about this, but I believe my thinking runs quite parallel to his and I'm sure he'll correct me if I get his evaluations wrong. The project's goal is to fix time service, where "fix" centers on improving security and reliability. The slightly better timekeeping and the rather dramatically improved monitoring tools are the sizzle on that steak. (Slightly improved timekeeping would be a bigger deal if not for the limitations of our clock sources and the scale of network weather. NTP Classic was already nearly as good that way as is functional.) So, the next question is how we get NTPsec fielded to as many places where it's needed as possible, as rapidly as possible.
The key thing to notice is that "where it's needed" is not uniformly distributed across all users. The bad guys don't bother attacking the 99.99% of all NTP clients who are on desktops behind dynamic IPs. The fat targets for use as DDoS amplifiers are big data centers on static IPs. This happily coincides with the set of users we want to love us and give us money and send us engineers. So *everything* points us at an adoption push aimed at big data centers first. Either directly or by getting our stuff into the distro pipeline to their systems. This has a number of implications. The top one is that almost no platform but Linux actually matters. We're doing minor-platform stuff like *BSD and Mac more to signal competence and be good-guy citizens of a culture that considers platform breadth a virtue than because they actually matter to our strategy. Windows doesn't matter at all. In general, dropping non-Linux platforms to lower our expected defect rate is a good trade. We need to look for the knee in the curve of complexity reduction; the obvious one, which has been a project premise since week one, is full C99/POSIX conformance. We're currently arguing about how far back to support Mac OS X and NetBSD; it's not yet resolved, but it's a healthy argument proceeding from the right premises. Another implication is that ancient refclocks (anything EOLed) probably don't matter either. There may be a limited exception here for FedGov installations with really ancient hardware locked in by certification requirements, which is why we're still retaining some pretty crufty old stuff in the driver set. Yet a third implication is that support for 32-bit platforms is not very important either. We're doing that mainly because (a) good code hygiene and (b) ARM32 and similar hardware make nice microservers. > Along this line of thought you should look up Coase's Theorem; there is a > tremendous amount of generative value in understanding it. Oh hell yes.
The one-line version: Given sufficiently low transaction costs, any externality will be internalized. But there's a lot of unobvious freight in there - helps to study Coase's motivation for it. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Fri Sep 15 17:45:07 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 15 Sep 2017 13:45:07 -0400 Subject: Pivot cruft in step_systime In-Reply-To: <20170915061515.ED6C840605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915013045.GB495@thyrsus.com> <20170915061515.ED6C840605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170915174507.GB10687@thyrsus.com> Hal Murray : > It gets rid of that ugly pivot code. I think you call it a defect attractor. Damn straight I do! > Even if it is correct, it makes the rest of the code harder to read. > > I think it cleanly sets things up so that our first jump works-right time > range is relative to build time. These are good arguments, but we're about to code-freeze and that change is *absolutely* a potential destabilizer. Gary is right, this is post-1.0 work. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Fri Sep 15 17:56:06 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Fri, 15 Sep 2017 13:56:06 -0400 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170915071051.7105840605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915012646.GA495@thyrsus.com> <20170915071051.7105840605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170915175606.GC10687@thyrsus.com> Hal Murray : > > [Subject says NetBSD, but context has drifted to MacOS] > > That said, I'm going to push - not hard, not hill-to-die-on, just moderately > > - for remaining strict about our C99 conformance policy and culling old > > releases/minor platforms that can't meet it. > > We aren't discussing C99 but rather the non-standardized way of playing with > the clock. I thought NetBSD 6 was in the discussion too. But if you're only talking about OS X, you're only talking about OS X. > I know of two ways to handle that sort of problem. One is to use ifdefs like > the current code. Being hard-nosed about C99 helps keep that sort of code > clean. > > The other way is to have a separate module for each OS and link in the right > one. That only works if you have a clean API for what the module has to do. > The downside is that you usually have a lot of duplicated code, and if you > make a change/fix in one place, you have to go check the other places, and > remember to do it. The Mac OS X shim is not too bad that way. It can be confined to libntp/clockwork.c. > By the way, "HAVE_KERNEL_PLL" is a horrible name. There is no PLL involved. > There are two features we want. One is to be able to slew the clock. The > other is to tweak the clock frequency, aka drift. Yeah, you should fix that today before I declare code freeze. Which will probably be about 5PM Eastern. I'm sure you'll choose well. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own.
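The separate-module approach discussed above could look roughly like the following. This is a minimal sketch, assuming a two-function interface (slew the clock, adjust the frequency) as described in the thread; struct clock_ops, the stub backend, and every function name here are hypothetical and do not exist in NTPsec. The stub only records its arguments, so the sketch runs without touching the system clock:

```c
#include <stddef.h>

/* Hypothetical per-OS interface: one instance per platform backend. */
struct clock_ops {
    const char *name;
    int (*slew)(double seconds);   /* gradually apply an offset */
    int (*set_freq)(double ppm);   /* adjust the clock frequency (drift) */
};

/* Stub backend state, standing in for real kernel calls. */
static double stub_last_slew;
static double stub_last_freq;

static int stub_slew(double seconds)
{
    stub_last_slew = seconds;
    return 0;
}

static int stub_set_freq(double ppm)
{
    stub_last_freq = ppm;
    return 0;
}

static const struct clock_ops stub_clock_ops = {
    "stub", stub_slew, stub_set_freq
};

/* Caller code sees only the interface, never a platform #ifdef.
 * Returns 0 on success, -1 on a bad backend or a failed call. */
static int discipline_clock(const struct clock_ops *ops,
                            double offset, double drift_ppm)
{
    if (ops == NULL || ops->slew == NULL || ops->set_freq == NULL)
        return -1;
    if (ops->slew(offset) != 0)
        return -1;
    return ops->set_freq(drift_ppm);
}
```

In this arrangement a real backend would presumably wrap ntp_adjtime() or a similar call where the platform has one, and the build would link exactly one backend. The duplicated-code cost mentioned in the thread shows up as one such file per supported OS.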
From gem at rellim.com Fri Sep 15 18:10:41 2017 From: gem at rellim.com (Gary E. Miller) Date: Fri, 15 Sep 2017 11:10:41 -0700 Subject: Adoption strategy In-Reply-To: <20170915174021.GA10687@thyrsus.com> References: <20170915174021.GA10687@thyrsus.com> Message-ID: <20170915111041.7ce7c4b2@spidey.rellim.com> Yo Eric! On Fri, 15 Sep 2017 13:40:21 -0400 "Eric S. Raymond via devel" wrote: > Yet a third implication is that support for 32-bit platforms is not > very important either. We're doing that mainly because (a) good code > hygiene and (b) ARM32 and similar hardware make nice microservers. I pretty much agree with what you said, until here. There are a LOT of Raspberry Pis out there, and even more ARM 32-bit IoT devices. NTPsec currently supports that quickly growing market, and should continue to do so. As long as they support plain vanilla C99 and POSIX, we should support them. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Fri Sep 15 18:24:40 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 15 Sep 2017 14:24:40 -0400 Subject: Adoption strategy In-Reply-To: <20170915111041.7ce7c4b2@spidey.rellim.com> References: <20170915174021.GA10687@thyrsus.com> <20170915111041.7ce7c4b2@spidey.rellim.com> Message-ID: <20170915182440.GD10687@thyrsus.com> Gary E. Miller via devel : > Yo Eric! > > On Fri, 15 Sep 2017 13:40:21 -0400 > "Eric S. Raymond via devel" wrote: > > > Yet a third implication is that support for 32-bit platforms is not > > very important either.
We're doing that mainly because (a) good code > > hygiene and (b) ARM32 and similar hardware make nice microservers. > > I pretty much agree with what you said, until here. > > There are a LOT of Raspberry Pis out there, and even more ARM 32-bit > IoT devices. NTPsec currently supports that quickly growing market, and > should continue to do so. > > As long as they support plain vanilla C99 and POSIX, we should support > them. That was sort of supposed to be subsumed under "ARM32 and similar hardware make nice microservers." We're not actually in disagreement. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From gem at rellim.com Fri Sep 15 18:30:03 2017 From: gem at rellim.com (Gary E. Miller) Date: Fri, 15 Sep 2017 11:30:03 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170915071051.7105840605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915012646.GA495@thyrsus.com> <20170915071051.7105840605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170915113003.625ebe50@spidey.rellim.com> Yo Hal! On Fri, 15 Sep 2017 00:10:51 -0700 Hal Murray via devel wrote: > [Subject says NetBSD, but context has drifted to MacOS] > > That said, I'm going to push - not hard, not hill-to-die-on, just > > moderately > > - for remaining strict about our C99 conformance policy and culling > > old releases/minor platforms that can't meet it. > > We aren't discussing C99 but rather the non-standardized way of > playing with the clock. I think C99 was used here to include the whole NTPsec policy of also wanting POSIX conformance. POSIX very much defines how the clock is played with.
RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Fri Sep 15 18:46:40 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 15 Sep 2017 11:46:40 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h Message-ID: <20170915184640.671E340605C@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: > I think C99 was used here to include the whole NTPsec policy of also > wanting POSIX conformance. POSIX very much defines how the clock is played > with. POSIX defines ways to access the clock, but only the simple functions like reading and setting the clock. It doesn't cover how to slew the clock or tweak the clock speed (drift) - things like ntp_adjtime or adjtime(x). -- These are my opinions. I hate spam. From Stromeko at nexgo.de Fri Sep 15 18:50:19 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Fri, 15 Sep 2017 20:50:19 +0200 Subject: What is the expected lifetime of code we ship? References: <20170915060752.A5F4740605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <877ewz94pg.fsf@Rainer.invalid> Hal Murray via devel writes: > Suppose we release some code. Assume it is bug free so users are happy. You can drop that assumption without any change to the outcome. > How long do we expect it to run correctly? The question really is: What should we do when we know it stops running correctly and how does the program learn that? > Would we be happy with 67 years from the build date?
That doesn't really matter, except that larger n is a better assurance that you won't personally have to deal with any fallout. In any case, we need some cooperation from the environment we build and run on in order to gain some reference date that we can use since it's impossible to get a trustable absolute reference from within the code. > Do we know the limitations? I don't think so. I believe we've had a similar discussion before and at that time I said that shooting for something well over a century might be a good goal. Also, documenting the assumptions we make about the cooperation (or lack thereof) from the environment we run in is a must, because otherwise our claims are meaningless. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Wavetables for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldUserWavetables From hmurray at megapathdsl.net Fri Sep 15 19:16:02 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 15 Sep 2017 12:16:02 -0700 Subject: What is the expected lifetime of code we ship? In-Reply-To: Message from Achim Gratz via devel of "Fri, 15 Sep 2017 20:50:19 +0200." <877ewz94pg.fsf@Rainer.invalid> Message-ID: <20170915191602.1771240605C@ip-64-139-1-69.sjc.megapath.net> >> Suppose we release some code. Assume it is bug free so users are happy. > You can drop that assumption without any change to the outcome. I was thinking of roughly the following: Suppose the code is good for 20 years after the build date. That covers GPS rollover. If we have a security fix that requires rebuilding the code every 5 years, the code will keep working over GPS rollovers without any explicit action on our part. ------- My straw man is that we will support our current code in all versions of major OSes that are supported by the vendor. But I haven't figured out what "support" means. Does it include old versions? How old? What happens to conservative organizations that are still (happily?) 
running an OS version that is no longer supported because it works and they don't want to rock the boat? (or don't have the skills to upgrade) -- These are my opinions. I hate spam. From esr at thyrsus.com Fri Sep 15 19:21:21 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 15 Sep 2017 15:21:21 -0400 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170915113003.625ebe50@spidey.rellim.com> References: <20170915012646.GA495@thyrsus.com> <20170915071051.7105840605C@ip-64-139-1-69.sjc.megapath.net> <20170915113003.625ebe50@spidey.rellim.com> Message-ID: <20170915192121.GA16956@thyrsus.com> Gary E. Miller via devel : > Yo Hal! > > On Fri, 15 Sep 2017 00:10:51 -0700 > Hal Murray via devel wrote: > > > [Subject says NetBSD, but context has drifted to MacOS] > > > That said, I'm going to push - not hard, not hill-to-die-on, just > > > moderately > > > - for remaining strict about our C99 conformance policy and culling > > > old releases/minor platforms that can't meet it. > > > > We aren't discussing C99 but rather the non-standardized way of > > playing with the clock. > > I think C99 was used here to include the whole NTPsec policy of > also wanting POSIX conformance. POSIX very much defines how the > clock is played with. That is correct. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Fri Sep 15 19:23:09 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Fri, 15 Sep 2017 15:23:09 -0400 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: <20170915184640.671E340605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915184640.671E340605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170915192309.GB16956@thyrsus.com> Hal Murray via devel : > > devel at ntpsec.org said: > > I think C99 was used here to include the whole NTPsec policy of also > > wanting POSIX conformance. POSIX very much defines how the clock is played > > with. > > POSIX defines ways to access the clock, but only the simple functions like > reading and setting the clock. It doesn't cover how to slew the clock or > tweak the clock speed (drift) - things like ntp_adjtime or adjtime(x). That is correct, but not relevant to the discussion of whether to keep NetBSD 6 and old Mac OS X around. They have to have those primitives or we couldn't have had them in the discussion at all. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Fri Sep 15 19:38:48 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 15 Sep 2017 15:38:48 -0400 (EDT) Subject: Feature Freeze Friday Message-ID: <20170915193848.08B9E13A0206@snark.thyrsus.com> It's Feature Freeze Friday. The window closes at 5PM, so get your last-minute feature patches in now. As previously noted, work should continue on tracker issues and reactive bug fixes. We may accept two anticipated security-feature patches from Daniel Franke. There may be a bit more slack around the Python stuff, as that would not risk destabilizing core code. One small change: while deadline for design proposals on /etc/ntp.d has been met, I doubt I'm going to have time to do the resulting merges before 5PM while giving the task the care it merits.
Therefore I am, albeit reluctantly, declaring "RFE: make location of /etc/ntp.d configurable" (#385) and "RFE: Support /etc/ntp.d (#204)" in scope for working on during freeze. With luck we'll get all that resolved in the next couple of days. -- Eric S. Raymond The Bible is not my book, and Christianity is not my religion. I could never give assent to the long, complicated statements of Christian dogma. -- Abraham Lincoln From paul at anastrophe.com Fri Sep 15 19:43:08 2017 From: paul at anastrophe.com (Paul Theodoropoulos) Date: Fri, 15 Sep 2017 12:43:08 -0700 Subject: What is the expected lifetime of code we ship? In-Reply-To: <20170915191602.1771240605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915191602.1771240605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <5b2e6c8d-b251-779e-f53d-5ffd69a5ef35@anastrophe.com> On 9/15/17 12:16, Hal Murray via devel wrote: > My straw man is that we will support our current code in all versions of > major OSes that are supported by the vendor. But I haven't figured out what > "support" means. Does it include old versions? How old? > > What happens to conservative organizations that are still (happily?) running > an OS version that is no longer supported because it works and they don't > want to rock the boat? (or don't have the skills to upgrade) Speaking only from my perspective as only a sysadmin, for a core application like timekeeping, I wouldn't expect support to be valid for more than three years, at least in respect to feature upgrades. *Maybe* five years for critical security updates. This from an admin who only a year ago took a dust-riddled strictly internal server (that I inherited) and upgraded it from etch to lenny to squeeze to wheezy. -- Paul Theodoropoulos www.anastrophe.com From gem at rellim.com Fri Sep 15 19:44:52 2017 From: gem at rellim.com (Gary E. 
Miller) Date: Fri, 15 Sep 2017 12:44:52 -0700 Subject: Feature Freeze Friday In-Reply-To: <20170915193848.08B9E13A0206@snark.thyrsus.com> References: <20170915193848.08B9E13A0206@snark.thyrsus.com> Message-ID: <20170915124452.459a7817@spidey.rellim.com> Yo Eric! On Fri, 15 Sep 2017 15:38:48 -0400 (EDT) "Eric S. Raymond via devel" wrote: > It's Feature Freeze Friday. The window closes at 5PM, so get your > last-minute feature patches in now. I got none, except maybe the register thing (#388), if you think that could/should go in. > As previously noted, work should continue on tracker issues and > reactive bug fixes. I'm working on #62 and I hope to get to #55 soon. > One small change: while deadline for design proposals on /etc/ntp.d > has been met, How about you just merge my patch in #204. And ponder my last suggestion in #385. If you approve those would be almost trivial. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From Stromeko at nexgo.de Fri Sep 15 19:55:22 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Fri, 15 Sep 2017 21:55:22 +0200 Subject: What is the expected lifetime of code we ship? References: <20170915191602.1771240605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <87377n91p1.fsf@Rainer.invalid> Hal Murray via devel writes: > If we have a security fix that requires rebuilding the code every 5 years, > the code will keep working over GPS rollovers without any explicit action on > our part. That makes the assumption that the old program running gets actually replaced by the new build.
If you consider some IoT hidden somewhere not-obvious that may not happen for any number of reasons. > My straw man is that we will support our current code in all versions of > major OSes that are supported by the vendor. But I haven't figured out what > "support" means. Does it include old versions? How old? > > What happens to conservative organizations that are still (happily?) running > an OS version that is no longer supported because it works and they don't > want to rock the boat? (or don't have the skills to upgrade) In the above scenario let the company that made the IoT go out of business and their update server vanish. The only defense is to aggregate as many notions of "current time" as possible and then take it from there. NTP is most vulnerable to picking the wrong time at startup, so if you'd really want to build in some defenses against it coming up with the wrong pivot you'd need to have some sort of ratchet that keeps moving the lower limit on the time. But provided you have that you now have the problem that just one rogue or otherwise botched startup can beam you too far into the future and create a DOS. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptation for Waldorf Blofeld V1.15B11: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From hmurray at megapathdsl.net Fri Sep 15 19:58:32 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 15 Sep 2017 12:58:32 -0700 Subject: Feature Freeze Friday In-Reply-To: Message from "Eric S. Raymond via devel" of "Fri, 15 Sep 2017 15:38:48 EDT." <20170915193848.08B9E13A0206@snark.thyrsus.com> Message-ID: <20170915195832.3A22B40605C@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: > It's Feature Freeze Friday. The window closes at 5PM, so get your > last-minute feature patches in now. 
I think you should revert the long double change and wait until post-release to clean up that area - not just the precision part but the whole clock adjusting area. > As previously noted, work should continue on tracker issues and reactive bug > fixes. We may accept two anticipated security-feature patches from Daniel > Franke. I'll be glad to help with that. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Sep 15 20:04:16 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 15 Sep 2017 13:04:16 -0700 Subject: ntpd: Gross CPU usage Message-ID: <20170915200416.37B5640605C@ip-64-139-1-69.sjc.megapath.net> I've got a case where top shows ntpd is using 60-70% of the CPU. I noticed because the fan on the box is cycling on/off. Has anybody seen anything like that recently? -- These are my opinions. I hate spam. From gem at rellim.com Fri Sep 15 20:05:15 2017 From: gem at rellim.com (Gary E. Miller) Date: Fri, 15 Sep 2017 13:05:15 -0700 Subject: Feature Freeze Friday In-Reply-To: <20170915195832.3A22B40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915193848.08B9E13A0206@snark.thyrsus.com> <20170915195832.3A22B40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170915130515.3fd5c8f6@spidey.rellim.com> Yo Hal! On Fri, 15 Sep 2017 12:58:32 -0700 Hal Murray via devel wrote: > devel at ntpsec.org said: > > It's Feature Freeze Friday. The window closes at 5PM, so get your > > last-minute feature patches in now. > > I think you should revert the long double change and wait until > post-release to clean up that area - not just the precision part but > the whole clock adjusting area. Oh, please, no. It is stable, except for the NetBSD 6 thing. It fixed a bunch of issues and starting over could take weeks. Consensus seems to be to go for timespec(64) directly after 1.0. RGDS GARY --------------------------------------------------------------------------- Gary E.
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From fw at fwright.net Fri Sep 15 20:06:31 2017 From: fw at fwright.net (Fred Wright) Date: Fri, 15 Sep 2017 13:06:31 -0700 (PDT) Subject: python-config on NetBSD 6 In-Reply-To: <20170915065123.D4BBF40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915065123.D4BBF40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Thu, 14 Sep 2017, Hal Murray wrote: > > The basic problem is that python2.7-config --ldflags includes "-lpython2.7" > > but no "-L" to say where to find it. On most platforms, a suitable "-L" is > > included. > > I don't know anything about that area, but your "most platforms" seems > optimistic. I guess I should have said "if needed". If the libraries that it wants are in one of the compiler's default search paths, then there's no need for a "-L". In fact, the issue may arise due to some disconnect between the C toolchain and Python. I say "Python" rather than "python-config" because the latter obtains its info from the former, and just massages it a bit. Personally, I consider python-config to be conceptually broken, but waf seems rather wedded to it. Fred Wright From fw at fwright.net Fri Sep 15 20:11:15 2017 From: fw at fwright.net (Fred Wright) Date: Fri, 15 Sep 2017 13:11:15 -0700 (PDT) Subject: python-config on NetBSD 6 In-Reply-To: References: <20170915065123.D4BBF40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Fri, 15 Sep 2017, Fred Wright via devel wrote: > On Thu, 14 Sep 2017, Hal Murray wrote: > > > > The basic problem is that python2.7-config --ldflags includes "-lpython2.7" > > > but no "-L" to say where to find it.
On most platforms, a suitable "-L" is > > > included. > > > > I don't know anything about that area, but your "most platforms" seems > > optimistic. > > I guess I should have said "if needed". If the libraries that it wants > are in one of the compiler's default search paths, then there's no need > for a "-L". In fact, the issue may arise due to some disconnect between > the C toolchain and Python. I say "Python" rather than "python-config" > because the latter obtains its info from the former, and just massages it > a bit. Forgot to mention: The definitive criterion is whether "Testing pyext configuration" toward the end of configure is successful or not. Fred Wright From gem at rellim.com Fri Sep 15 20:17:23 2017 From: gem at rellim.com (Gary E. Miller) Date: Fri, 15 Sep 2017 13:17:23 -0700 Subject: ntpd: Gross CPU usage In-Reply-To: <20170915200416.37B5640605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915200416.37B5640605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170915131723.445d0816@spidey.rellim.com> Yo Hal! On Fri, 15 Sep 2017 13:04:16 -0700 Hal Murray via devel wrote: > I've got a case where top shows ntpd is using 60-70% of the CPU. I > noticed because the fan on the box is cycling on/off. > > Has anybody seen anything like that recently? Nope. Which driver? What is in your ntp.conf? Anything in your ntp logs? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Fri Sep 15 20:36:38 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Fri, 15 Sep 2017 16:36:38 -0400 Subject: Feature Freeze Friday In-Reply-To: <20170915130515.3fd5c8f6@spidey.rellim.com> References: <20170915193848.08B9E13A0206@snark.thyrsus.com> <20170915195832.3A22B40605C@ip-64-139-1-69.sjc.megapath.net> <20170915130515.3fd5c8f6@spidey.rellim.com> Message-ID: <20170915203638.GA19972@thyrsus.com> Gary E. Miller via devel : > Hal Murray via devel wrote: > > I think you should revert the long double change and wait until > > post-release to clean up that area - not just the precision part but > > the whole clock adjusting area. > > Oh, please, no. It is stable, except for the NetBSD 6 thing. It fixed > a bunch of issues and starting over could take weeks. > > Consensus seems to be to go for timespec(64) directly after 1.0. I concur. According to our analysis, long double is not better than double in the way I mistakenly believed, but it's no worse either. The change was not toxic, merely ineffective. Reverting it would be exactly the kind of poke-the-possible-hornet's-nest change that we should be avoiding. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Fri Sep 15 20:41:09 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 15 Sep 2017 16:41:09 -0400 Subject: ntpd: Gross CPU usage In-Reply-To: <20170915200416.37B5640605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915200416.37B5640605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170915204109.GB19972@thyrsus.com> Hal Murray via devel : > > I've got a case where top shows ntpd is using 60-70% of the CPU. I noticed > because the fan on the box is cycling on/off. > > Has anybody seen anything like that recently? No.
It's well down in the noise right now on snark. First thing to suspect on seeing that symptom, in ntpd or gpsd, is that somebody's tty layer is reacting badly to the unusual ways we use ioctls. I've seen this before multiple times. Is it replicating on your other machines? -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From hmurray at megapathdsl.net Fri Sep 15 21:24:41 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 15 Sep 2017 14:24:41 -0700 Subject: ntpd: Gross CPU usage In-Reply-To: Message from "Eric S. Raymond via devel" of "Fri, 15 Sep 2017 16:41:09 EDT." <20170915204109.GB19972@thyrsus.com> Message-ID: <20170915212441.E235940605C@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: > First thing to suspect on seeing that symptom, in ntpd or gpsd, is that > somebody's tty layer is reacting badly to the unusual ways we use ioctls. > I've seen this before multiple times. Thanks. That's the hint I needed. It's working normally after a reboot. I had downloaded a new kernel but not rebooted to run it because I had a long running job going that I didn't want to disturb. I assume some new library assumed it was running on the new kernel or such. (Now to figure out how to restart that job.) -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Sep 15 22:14:04 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 15 Sep 2017 15:14:04 -0700 Subject: Feature Freeze Friday Message-ID: <20170915221404.4B12040605C@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: > Reverting it would be exactly the kind of poke-the-possible-hornet's-nest > change that we should be avoiding. That's the part I don't understand. The new code didn't fix any real problem. It's only been running for a few months. We have many years on the old code.
The new code opens the door for the sort of problems you generally try to avoid. Things will be different on different environments so obscure bugs can linger until somebody trips over them. It might introduce significant sloth, again, depends on the environment. No sloth on one system doesn't mean it will be OK on another. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Fri Sep 15 22:21:32 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Fri, 15 Sep 2017 15:21:32 -0700 Subject: Feature Freeze Friday Message-ID: <20170915222132.4CA3540605C@ip-64-139-1-69.sjc.megapath.net> >> I think you should revert the long double change and wait until >> post-release to clean up that area - not just the precision part but >> the whole clock adjusting area. > It fixed a bunch of issues and starting over could take weeks. What problems did it fix? Do you think we need more than microsecond accuracy when making the first long jump? Are there other cases when loss of precision happens? -- These are my opinions. I hate spam. From gem at rellim.com Fri Sep 15 23:15:44 2017 From: gem at rellim.com (Gary E. Miller) Date: Fri, 15 Sep 2017 16:15:44 -0700 Subject: Feature Freeze Friday In-Reply-To: <20170915222132.4CA3540605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915222132.4CA3540605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170915161544.7925af29@spidey.rellim.com> Yo Hal! On Fri, 15 Sep 2017 15:21:32 -0700 Hal Murray wrote: > >> I think you should revert the long double change and wait until > >> post-release to clean up that area - not just the precision part > >> but the whole clock adjusting area. > > > It fixed a bunch of issues and starting over could take weeks. > > What problems did it fix? It fixed many problems with accuracy. I suggest looking at the change logs if you want details. > Do you think we need more than microsecond accuracy when making the > first long jump? 
ntpd NEVER had better than microsecond accuracy until the change. Common wisdom said it did, but the precision enhancements in ntpq and ntpmon showed otherwise. > Are there other cases when loss of precision happens? Several. timespec(64) will smoke a lot out. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Sat Sep 16 06:37:33 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Sat, 16 Sep 2017 02:37:33 -0400 Subject: ntpd: Gross CPU usage In-Reply-To: <20170915212441.E235940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170915204109.GB19972@thyrsus.com> <20170915212441.E235940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170916063733.GA5311@thyrsus.com> Hal Murray : > > devel at ntpsec.org said: > > First thing to suspect on seeing that symptom, in ntpd or gpsd, is that > > somebody's tty layer is reacting badly to the unusual ways we use ioctls. > > I've seen this before multiple times. > > Thanks. That's the hint I needed. One of the more unfortunate things I learned from working on GPSD is that kernel serial layers are often wretched hives of scum and villainy. This is true across several different OSes I've had to work with, so it's not just some Linux flakiness. They tend to be good at handling the common cases but to get weird when pushed into rarely-visited portions of their behavior space. Danger areas definitely include strange combinations of mode bits and ioctl calls. The consequences can be nasty.
You may not have been paying enough attention to notice, a few years back when some kind of screaming-TTY bug in a Samsung port of Android interacted badly with gpsd. The result was rapid battery drainage for the users. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From hmurray at megapathdsl.net Mon Sep 18 10:01:06 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 18 Sep 2017 03:01:06 -0700 Subject: Leftover cruft for OPENBSD Message-ID: <20170918100106.C862140605C@ip-64-139-1-69.sjc.megapath.net> wscript says: # XXX: needed for ntp_worker, for now if ctx.env.DEST_OS == "openbsd": ctx.define("PLATFORM_OPENBSD", "1", quote=False) Also mentioned in ./devel/ifdex-ignores It's not needed any more. Do you want to kill it now, or wait until post release? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Mon Sep 18 10:27:01 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 18 Sep 2017 03:27:01 -0700 Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h In-Reply-To: Message from "Eric S. Raymond via devel" of "Fri, 15 Sep 2017 15:23:09 EDT." <20170915192309.GB16956@thyrsus.com> Message-ID: <20170918102701.CCE1340605C@ip-64-139-1-69.sjc.megapath.net> >> POSIX defines ways to access the clock, but only the simple functions like >> reading and setting the clock. It doesn't cover how to slew the clock or >> tweak the clock speed (drift) - things like ntp_adjtime or adjtime(x). > That is correct, but not relevant to the discussion of whether to keep > NetBSD 6 and old Mac OS X around. They have to have those primitives or we > couldn't have had them in the discussion at all. I think there are two branches to this discussion: Does an OS have something like ntp_adjtime? Is it "standard" enough? I don't know what's in Mac OS. NetBSD doesn't have adjtimex. It does have ntp_adjtime. 
OpenBSD doesn't have either. (ntpsec runs there) It doesn't have clock_settime either.

HAVE_KERNEL_PLL is not defined. There is a disable_kernel_pll option. It should help test the "other" branch.

So I expect Mac OS to work if it has the standard non-slew ways to read/write the clock.

I think the current slew-mode code requires either ntp_adjtime or adjtime and is clean about using them.

include/ntp_syscall.h says:
  /* MUSL port shim */
  #if !defined(HAVE_NTP_ADJTIME) && defined(HAVE_ADJTIMEX)
  #define ntp_adjtime adjtimex
  #endif
That's the only reference in ntpd. (There are other uses in ntpfrob)

So if you are saying that if the kernel has slew mode it has to call it ntp_adjtime or adjtime, that's OK, I guess, but not part of POSIX.

Has anybody tested our code with MUSL?

--
These are my opinions. I hate spam.

From esr at thyrsus.com Mon Sep 18 19:04:37 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Mon, 18 Sep 2017 15:04:37 -0400
Subject: Leftover cruft for OPENBSD
In-Reply-To: <20170918100106.C862140605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170918100106.C862140605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170918190437.GA1201@thyrsus.com>

Hal Murray :
> wscript says:
>     # XXX: needed for ntp_worker, for now
>     if ctx.env.DEST_OS == "openbsd":
>         ctx.define("PLATFORM_OPENBSD", "1", quote=False)
>
> Also mentioned in ./devel/ifdex-ignores
>
> It's not needed any more. Do you want to kill it now, or wait until post
> release?

It's safe to remove now, since no code is conditional on it. I'll do it.
--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

From esr at thyrsus.com Mon Sep 18 19:13:16 2017
From: esr at thyrsus.com (Eric S.
Raymond)
Date: Mon, 18 Sep 2017 15:13:16 -0400
Subject: NetBSD 6.1.5 doesn't have ldexpl in math.h
In-Reply-To: <20170918102701.CCE1340605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170915192309.GB16956@thyrsus.com> <20170918102701.CCE1340605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170918191316.GB1201@thyrsus.com>

Hal Murray :
> So if you are saying that if the kernel has slew mode it has to call it
> ntp_adjtime or adjtime, that's OK, I guess, but not part of POSIX.

That is correct. I have researched this extensively, and while there are still some places my grasp of the code is incomplete this is not one of them.

>From devel/tour.txt:

== System call interface and the PLL ==

All of ntpd's clock management is done through four system calls: clock_gettime(2), clock_settime(2), and either ntp_adjtime(2) or the older BSD adjtime(2) call. For ntp_adjtime(), ntpd actually uses a thin wrapper that hides the difference between systems with nanosecond-precision and those with only microsecond precision; internally, ntpd does all its calculations with nanosecond precision.

The clock_gettime(2) and clock_settime(2) calls are standardized in POSIX; ntp_adjtime(2) is not, exhibiting some variability in behavior across platforms (in particular as to whether it supports nanosecond or microsecond precision).

Where adjtimex(2) exists (notably under Linux), both ntp_adjtime() and adjtime() are implemented as library wrappers around it. The need to implement adjtime() is why the Linux version of struct timex has a (non-portable) 'time' member;

There is some confusion abroad about this interface because it has left a trail of abandoned experiments behind it.

Older BSD systems read the clock using gettimeofday(2) (in POSIX but deprecated) and set it using settimeofday(2), which was never standardized. Neither of these calls are still used in NTPsec, though the equally ancient BSD adjtime(2) call is, on systems without kernel PLL support.
Also, glibc (and possibly other C libraries) implement two other related calls, ntp_gettime(3) and ntp_gettimex(3). These are not used by the NTP suite itself (except that the ntptime test program attempts to exercise ntp_gettime(3)), but rather are intended for time-using applications that also want an estimate of clock error and the leap-second offset. Neither has been standardized by POSIX, and they have not achieved wide use in applications.

Both ntp_gettime(3) and ntp_gettimex(3) can be implemented as wrappers around ntp_adjtime(2)/adjtimex(2). Thus, on a Linux system, the library ntp_gettime(3) call could conceivably go through two levels of indirection, being implemented in terms of ntp_adjtime(2) which is in turn implemented by adjtimex(2).

Unhelpfully, the non-POSIX calls in the above assortment are very poorly documented.

The roles of clock_gettime(2) and clock_settime(2) are simple. They're used for reading and setting ("stepping", in NTP jargon) the system clock. Stepping is avoided whenever possible because it introduces discontinuities that may confuse applications. Stepping is usually done only at ntpd startup (which is typically at boot time) and only when the skew between system and NTP time is relatively large.

The sync algorithm prefers slewing to stepping. Slewing speeds up or slows down the clock by a very small amount that will, after a relatively short time, sync the clock to NTP time. The advantage of this method is that it doesn't introduce discontinuities that applications might notice. The slewing variations in clock speed are so small that they're generally invisible even to soft-realtime applications.

The call ntp_adjtime(2) is for clock slewing; NTPsec never calls adjtimex(2) directly, but it may be used to implement ntp_adjtime(2). ntp_adjtime(2)/adjtimex(2) uses a kernel interface to do its work, using a control technique called a PLL/FLL (phase-locked loop/frequency-locked loop) to do it.
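
The step-versus-slew trade-off described above can be made concrete with a little arithmetic. What follows is only a rough sketch, not code from ntpd: the 128 ms step threshold and the 500 ppm maximum slew rate are assumed values chosen for illustration, the function names are hypothetical, and the real daemon weighs more than a single threshold when deciding to step.

```python
# Illustrative sketch only: both constants are assumptions, not values read
# from ntpd or from any particular kernel.
STEP_THRESHOLD_S = 0.128   # assumed step threshold, in seconds
MAX_SLEW_PPM = 500.0       # assumed maximum slew rate, in parts per million


def choose_correction(offset_s, step_threshold_s=STEP_THRESHOLD_S):
    """Return 'step' for offsets above the threshold, otherwise 'slew'."""
    return "step" if abs(offset_s) > step_threshold_s else "slew"


def slew_duration_s(offset_s, max_slew_ppm=MAX_SLEW_PPM):
    """Seconds of elapsed time needed to absorb offset_s at the maximum slew rate."""
    return abs(offset_s) / (max_slew_ppm * 1e-6)


if __name__ == "__main__":
    for offset in (0.005, 0.100, 2.0):
        if choose_correction(offset) == "slew":
            print("%.3f s offset: slew for about %.0f s"
                  % (offset, slew_duration_s(offset)))
        else:
            print("%.3f s offset: step" % offset)
```

Under these assumptions a 100 ms offset takes on the order of a few minutes to slew away, which is why large offsets at startup are stepped instead.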
The older BSD adjtime(2) can be used for slewing as well, but doesn't assume a kernel-level PLL is available. Some platforms, like OpenBSD and Mac OS X, use only this call because they lack ntp_adjtime(2). Without the PLL calls, convergence to good time is observably a lot slower and tracking will accordingly be less reliable.

> Has anybody tested our code with MUSL?

I have a dim memory of someone reporting good results with it.
--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

From hmurray at megapathdsl.net Mon Sep 18 21:34:49 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Mon, 18 Sep 2017 14:34:49 -0700
Subject: Tinkerboard
Message-ID: <20170918213449.20B1E40605C@ip-64-139-1-69.sjc.megapath.net>

> Why not build with all the refclocks? That is not a well tested way to
> configure NTPsec.

I regularly test/run with only the refclocks I plan to use.

I admit I haven't tested the SHM only case. If it doesn't work, we should fix it.

> Nothing in ntpd.log of interest WRT shm/pps/gps
>> Nothing of interest? Or Nothing? The former says the driver loaded, the
> latter says the driver did not load.

I don't see any startup messages from SHM.

grep -i SHM
 7 Sep 14:11:26 ntpd[3006]: MODE6: SHM(0) 8014 84 reachable
 7 Sep 14:11:27 ntpd[3006]: MODE6: SHM(1) 8014 84 reachable
 9 Sep 12:44:02 ntpd[3006]: MODE6: SHM(0) 9012 82 demobilize assoc 18208
 9 Sep 12:44:02 ntpd[3006]: MODE6: SHM(1) 9012 82 demobilize assoc 18209
 9 Sep 12:44:12 ntpd[9220]: MODE6: SHM(0) 8014 84 reachable
 9 Sep 12:44:13 ntpd[9220]: MODE6: SHM(1) 8014 84 reachable

--
These are my opinions. I hate spam.

From gem at rellim.com Mon Sep 18 21:48:37 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Mon, 18 Sep 2017 14:48:37 -0700
Subject: Tinkerboard
In-Reply-To: <20170918213449.20B1E40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170918213449.20B1E40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170918144837.3da99e3f@spidey.rellim.com>

Yo Hal!

On Mon, 18 Sep 2017 14:34:49 -0700 Hal Murray wrote:

> > Why not build with all the refclocks? That is not a well tested
> > way to configure NTPsec.
>
> I regularly test/run with only the refclocks I plan to use.

Sure, test/run as you wish, but why not build them all? NTPsec has way too many user configurable build options, too many to ever test.

> I admit I haven't tested the SHM only case. If it doesn't work, we
> should fix it.

As previously noted, this was user error. Nothing to fix at the git end of things.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net Mon Sep 18 22:24:25 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Mon, 18 Sep 2017 15:24:25 -0700
Subject: Tinkerboard
Message-ID: <20170918222425.A11B440605C@ip-64-139-1-69.sjc.megapath.net>

> Sure, test/run as you wish, but why not build them all?

No great reason. I got started that way ages ago.

It seems like a good idea for somebody to test the not-all case.

I think we should be able to build non-bloat systems, or at least minimal-bloat. This seems like a good step in that direction.

--
These are my opinions. I hate spam.

From gem at rellim.com Mon Sep 18 22:32:13 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Mon, 18 Sep 2017 15:32:13 -0700
Subject: Tinkerboard
In-Reply-To: <20170918222425.A11B440605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170918222425.A11B440605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170918153213.0178e75e@spidey.rellim.com>

Yo Hal!

On Mon, 18 Sep 2017 15:24:25 -0700 Hal Murray wrote:

> > Sure, test/run as you wish, but why not build them all?
>
> No great reason. I got started that way ages ago.

So why should we work to allow pointless things?

> It seems like a good idea for somebody to test the not-all case.

The all and not all case do get a lot of testing.

> I think we should be able to build non-bloat systems, or at least
> minimal-bloat. This seems like a good step in that direction.

Yes, but then why allow an infinite number of in between combinations? Especially since they are combinatorially unverifiable.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net Mon Sep 18 22:42:50 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Mon, 18 Sep 2017 15:42:50 -0700
Subject: Tinkerboard
Message-ID: <20170918224250.E471E40605C@ip-64-139-1-69.sjc.megapath.net>

>> It seems like a good idea for somebody to test the not-all case.
> The all and not all case do get a lot of testing.

By "not all", I meant some but not all rather than none.

--
These are my opinions. I hate spam.

From gem at rellim.com Mon Sep 18 22:49:23 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Mon, 18 Sep 2017 15:49:23 -0700
Subject: Tinkerboard
In-Reply-To: <20170918224250.E471E40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170918224250.E471E40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170918154923.69dff70c@spidey.rellim.com>

Yo Hal!

On Mon, 18 Sep 2017 15:42:50 -0700 Hal Murray wrote:

> >> It seems like a good idea for somebody to test the not-all case.
> > The all and not all case do get a lot of testing.
>
> By "not all", I meant some but not all rather than none.

Yes, I understood you, but all that does is lead to combinatorial excess.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net Mon Sep 18 23:10:42 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Mon, 18 Sep 2017 16:10:42 -0700
Subject: Tinkerboard
Message-ID: <20170918231042.A332540605C@ip-64-139-1-69.sjc.megapath.net>

>> By "not all", I meant some but not all rather than none.
> Yes, I understood you, but all that does is lead to combinatorial excess.

It also reduces bloat.

All refclocks is close to double the file size. (I don't know how that translates into actual memory usage after code gets loaded. strip didn't make it any smaller.)

--
These are my opinions. I hate spam.

From gem at rellim.com Mon Sep 18 23:16:19 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Mon, 18 Sep 2017 16:16:19 -0700
Subject: Tinkerboard
In-Reply-To: <20170918231042.A332540605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170918231042.A332540605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170918161619.73e0de67@spidey.rellim.com>

Yo Hal!

On Mon, 18 Sep 2017 16:10:42 -0700 Hal Murray wrote:

> >> By "not all", I meant some but not all rather than none.
> > Yes, I understood you, but all that does is lead to combinatorial
> > excess.
>
> It also reduces bloat.

Many other, easier, better, ways to reduce bloat.

But since we have no action items here, and release is coming, nothing more to discuss here. At least for now.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin

From hmurray at megapathdsl.net Tue Sep 19 07:35:17 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Tue, 19 Sep 2017 00:35:17 -0700
Subject: Gitlab rejecting mail?
Message-ID: <20170919073517.D588040605C@ip-64-139-1-69.sjc.megapath.net>

Has anybody seen this before and/or is it working for you?

I tried again and got the same rejection.

Subject: [Rejected] Re: ntpsec | Confusion with drift at the rail (#44)
From: GitLab
Date: Tue, 19 Sep 2017 05:38:17 +0000
To: hmurray at megapathdsl.net

Unfortunately, your email message to GitLab could not be processed.

The comment could not be created for the following reasons:

- Note can't be blank
- Project can't be blank
- Noteable type can't be blank
- Noteable can't be blank
- Author can't be blank

--
These are my opinions. I hate spam.

From esr at thyrsus.com Wed Sep 20 16:56:20 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 20 Sep 2017 12:56:20 -0400 (EDT)
Subject: Default config file behavior - request for comment
Message-ID: <20170920165620.D12A113A0206@snark.thyrsus.com>

I've been thinking about security and defaults.

Right now, if ntpd is brought up with no config file, it runs with no restrictions at all. Anyone can query it, anyone can configure it. This seems dubious from a security point of view.

To fix this, we're going to have to feed it a string of config defaults if no config file is present. This is easy to do, and easily tested.

There are three obvious ways to address this.

1. The infosec-focused way. Change the default restrictions to be "allow nothing." This way, if you bring it up with no config, there's no harm. It just spins inaccessibly.

2. User-friendly way. Bring it up with these permissions:

restrict default kod limited nomodify nopeer noquery
restrict -6 default kod limited nomodify nopeer noquery
restrict 127.0.0.1
restrict -6 ::1
pool pool.ntp.org iburst
driftfile /var/lib/ntp/ntp.drift

That is, the behavior 99.9% of all installations want.

3. Leave current behavior alone.

Please comment, everyone. Personally, I favor 2.

Mark, this edges into policy territory. I'd especially like to hear your opinion.
--
Eric S. Raymond

"Gun control" is a job-safety program for criminals.

From hmurray at megapathdsl.net Wed Sep 20 17:11:52 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 20 Sep 2017 10:11:52 -0700
Subject: Default config file behavior - request for comment
In-Reply-To: Message from "Eric S. Raymond via devel" of "Wed, 20 Sep 2017 12:56:20 EDT." <20170920165620.D12A113A0206@snark.thyrsus.com>
Message-ID: <20170920171152.861B540605C@ip-64-139-1-69.sjc.megapath.net>

> Right now, if ntpd is brought up with no config file, it runs with no
> restrictions at all. Anyone can query it, anyone can configure it.
This
> seems dubious from a security point of view.

Seems not-too-likely in the normal case since it won't keep good time. Also seems possible in, say, a recovery mode where the file system is busted, or during setup, so I agree that this is worth fixing.

> 2. User-friendly way. Bring it up with these permissions:
> restrict default kod limited nomodify nopeer noquery
> restrict -6 default kod limited nomodify nopeer noquery
> restrict 127.0.0.1
> restrict -6 ::1
> pool pool.ntp.org iburst
> driftfile /var/lib/ntp/ntp.drift

I think wiring in pool names is a bad idea.

There may already be a default drift file name.

There is already a default default restriction. Tweaking that would be simple.

What does nopeer mean these days?

--
These are my opinions. I hate spam.

From Stromeko at nexgo.de Wed Sep 20 17:34:58 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Wed, 20 Sep 2017 19:34:58 +0200
Subject: Default config file behavior - request for comment
References: <20170920165620.D12A113A0206@snark.thyrsus.com>
Message-ID: <87377h8e9p.fsf@Rainer.invalid>

Eric S. Raymond via devel writes:
> There are three obvious ways to address this.
>
> 1. The infosec-focused way. Change the default restrictions to be
> "allow nothing." This way, if you bring it up with no config, there's
> no harm. It just spins inaccessibly.

If it does that without complaining loudly enough some folks might think it's actually doing something and act surprised when it doesn't.

> 2. User-friendly way. Bring it up with these permissions:
>
> restrict default kod limited nomodify nopeer noquery
> restrict -6 default kod limited nomodify nopeer noquery
> restrict 127.0.0.1
> restrict -6 ::1

Stop it here. No pool (I think hardwiring pool names without consent of the pool administrators is a no-no). Also, no drift file. You might want to add "noserve notrust" to the last two statements.
> pool pool.ntp.org iburst
> driftfile /var/lib/ntp/ntp.drift
>
> That is, the behavior 99.9% of all installations want.
>
> 3. Leave current behavior alone.

The current behaviour was addressing a different target audience, so I see no reason to keep it when we are targeting a different population.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Factory and User Sound Singles for Waldorf rackAttack:
http://Synth.Stromeko.net/Downloads.html#WaldorfSounds

From fallenpegasus at gmail.com Wed Sep 20 18:15:30 2017
From: fallenpegasus at gmail.com (Mark Atwood)
Date: Wed, 20 Sep 2017 18:15:30 +0000
Subject: Default config file behavior - request for comment
In-Reply-To: <87377h8e9p.fsf@Rainer.invalid>
References: <20170920165620.D12A113A0206@snark.thyrsus.com> <87377h8e9p.fsf@Rainer.invalid>
Message-ID:

While I like choice #2 for friendliness, I have to agree re not to hardwire the pool name without external consent.

Code in choice #1, and if it's easy to do, with a big loud warning to stderr and logerr that it's doing nothing.

Supply a reference config file that implements #2

..m

On Wed, Sep 20, 2017 at 10:35 AM Achim Gratz via devel wrote:

> Eric S. Raymond via devel writes:
> > There are three obvious ways to address this.
> >
> > 1. The infosec-focused way. Change the default restrictions to be
> > "allow nothing." This way, if you bring it up with no config, there's
> > no harm. It just spins inaccessibly.
>
> If it does that without complaining loudly enough some folks might think
> it's actually doing something and act surprised when it doesn't.
>
> > 2. User-friendly way. Bring it up with these permissions:
> >
> > restrict default kod limited nomodify nopeer noquery
> > restrict -6 default kod limited nomodify nopeer noquery
> > restrict 127.0.0.1
> > restrict -6 ::1
>
> Stop it here. No pool (I think hardwiring pool names without consent of
> the pool administrators is a no-no). Also, no drift file.
You might
> want to add "noserve notrust" to the last two statements.
>
> > pool pool.ntp.org iburst
> > driftfile /var/lib/ntp/ntp.drift
> >
> > That is, the behavior 99.9% of all installations want.
> >
> > 3. Leave current behavior alone.
>
> The current behaviour was addressing a different target audience, so I
> see no reason to keep it when we are targeting a different population.
>
>
> Regards,
> Achim.
> --
> +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
>
> Factory and User Sound Singles for Waldorf rackAttack:
> http://Synth.Stromeko.net/Downloads.html#WaldorfSounds
>
> _______________________________________________
> devel mailing list
> devel at ntpsec.org
> http://lists.ntpsec.org/mailman/listinfo/devel
>
--

Mark Atwood
http://about.me/markatwood
+1-206-604-2198 Mobile & Signal

From esr at thyrsus.com Wed Sep 20 18:28:32 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Wed, 20 Sep 2017 14:28:32 -0400
Subject: Default config file behavior - request for comment
In-Reply-To:
References: <20170920165620.D12A113A0206@snark.thyrsus.com> <87377h8e9p.fsf@Rainer.invalid>
Message-ID: <20170920182832.GA12294@thyrsus.com>

Mark Atwood via devel :
> While I like choice #2 for friendliness, I have to agree re not to hardwire
> the pool name without external consent.

I think the global pool name is a special case here. Why are they advertising that service if not to be used in exactly this way?

> Code in choice #1, and if it's easy to do, with a big loud warning to stderr
> and logerr that it's doing nothing.

Can be done.
--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

From Stromeko at nexgo.de Wed Sep 20 19:46:58 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Wed, 20 Sep 2017 21:46:58 +0200
Subject: Default config file behavior - request for comment
References: <20170920165620.D12A113A0206@snark.thyrsus.com> <87377h8e9p.fsf@Rainer.invalid> <20170920182832.GA12294@thyrsus.com>
Message-ID: <87vakd6tl9.fsf@Rainer.invalid>

Eric S. Raymond via devel writes:
> I think the global pool name is a special case here. Why are they advertising
> that service if not to be used in exactly this way?

No, I think their rule (and no, you don't get to redefine it) is that if you wanted to have a hardcoded/default pool association you'd have to ask them for an ntpsec pool. That might be a possibility, but probably not an immediate one.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

From Stromeko at nexgo.de Wed Sep 20 19:51:53 2017
From: Stromeko at nexgo.de (Achim Gratz)
Date: Wed, 20 Sep 2017 21:51:53 +0200
Subject: Tinkerboard w/ TinkerOS 2.0.1
References: <87a82bis91.fsf@Rainer.invalid> <87o9qiiimk.fsf@Rainer.invalid>
Message-ID: <87tvzx6td2.fsf@Rainer.invalid>

Achim Gratz via devel writes:
> I've switched the TinkerBoard to PPS and starting to collect PPS
> statistics. Everything looks pretty good so far, I've also started
> ovenizing the XTAL, but it will be some time before I get enough
> statistics to extract the parameters from for a proper control loop.

After twelve days of data collection, the performance looks quite good. Not unexpectedly, the ovenization yields different behaviour than both rasPi models, which are very similar to the point that I needed about two months of data to start using different control parameters for them. The Tinkerboard is not very far off, but enough so I needed to immediately account for it.
The main problem at the moment is that the zero-TC temperature point is just beyond the first thermal threshold hardwired into the kernel for the TinkerBoard, so I can't keep the temperature high enough. The TinkerBoard is also still just in its (standard rasPi) case and not wrapped and boxed up, which makes it more susceptible to sudden changes in ambient temperature and may also mean it needs higher temperatures overall. Not sure when I'll get around to trying whether wrapping things is helpful or not, though.

Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

SD adaptation for Waldorf rackAttack V1.04R1:
http://Synth.Stromeko.net/Downloads.html#WaldorfSDada

From fallenpegasus at gmail.com Wed Sep 20 20:48:08 2017
From: fallenpegasus at gmail.com (Mark Atwood)
Date: Wed, 20 Sep 2017 20:48:08 +0000
Subject: Default config file behavior - request for comment
In-Reply-To: <87vakd6tl9.fsf@Rainer.invalid>
References: <20170920165620.D12A113A0206@snark.thyrsus.com> <87377h8e9p.fsf@Rainer.invalid> <20170920182832.GA12294@thyrsus.com> <87vakd6tl9.fsf@Rainer.invalid>
Message-ID:

Achim is right, the ToS and documentation for pool.ntp.org forbid vendors from using pool.ntp.org as a default or a hardcoded entry.

I have submitted an application for a vendor pool named ntpsec.pool.ntp.org.

In the meantime, our hardcoded default should be #1 "locked down do nothing", and our reference example configuration should refer to ntpsec.pool.ntp.org, and hopefully NTPsec emits sane error messages if it's started before that DNS entry is created.

Thank you for your feedback, Achim. You helped us avoid a misstep there.

..m

On Wed, Sep 20, 2017 at 12:47 PM Achim Gratz via devel wrote:

> Eric S. Raymond via devel writes:
> > I think the global pool name is a special case here. Why are they
> advertising
> > that service if not to be used in exactly this way?
>
> No, I think their rule (and no, you don't get to redefine it) is that if
> you wanted to have a hardcoded/default pool association you'd have to
> ask them for an ntpsec pool. That might be a possibility, but probably
> not an immediate one.
>
>
> Regards,
> Achim.
> --
> +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+
>
> SD adaptation for Waldorf rackAttack V1.04R1:
> http://Synth.Stromeko.net/Downloads.html#WaldorfSDada
>
> _______________________________________________
> devel mailing list
> devel at ntpsec.org
> http://lists.ntpsec.org/mailman/listinfo/devel
>
--

Mark Atwood
http://about.me/markatwood
+1-206-604-2198 Mobile & Signal

From hmurray at megapathdsl.net Thu Sep 21 00:40:53 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 20 Sep 2017 17:40:53 -0700
Subject: Default config file behavior - request for comment
In-Reply-To: Message from "Eric S. Raymond via devel" of "Wed, 20 Sep 2017 14:28:32 EDT." <20170920182832.GA12294@thyrsus.com>
Message-ID: <20170921004053.BD9BF40605C@ip-64-139-1-69.sjc.megapath.net>

>> Code in choice #1, and if it's easy to do, with a big loud warning to stderr
>> and logerr that it's doing nothing.

> Can be done.

How about exiting after the loud warnings? That solves the open-door problem if the admin is focused on some other problem.

--
These are my opinions. I hate spam.

From esr at thyrsus.com Thu Sep 21 04:54:37 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 21 Sep 2017 00:54:37 -0400
Subject: Default config file behavior - request for comment
In-Reply-To: <87vakd6tl9.fsf@Rainer.invalid>
References: <20170920165620.D12A113A0206@snark.thyrsus.com> <87377h8e9p.fsf@Rainer.invalid> <20170920182832.GA12294@thyrsus.com> <87vakd6tl9.fsf@Rainer.invalid>
Message-ID: <20170921045437.GA2404@thyrsus.com>

Achim Gratz via devel :
> Eric S.
Raymond via devel writes:
> > I think the global pool name is a special case here. Why are they advertising
> > that service if not to be used in exactly this way?
>
> No, I think their rule (and no, you don't get to redefine it) is that if
> you wanted to have a hardcoded/default pool association you'd have to
> ask them for an ntpsec pool. That might be a possibility, but probably
> not an immediate one.

It may be moot anyway. Thinking about it, I've become increasingly uncomfortable with making that large a functional change this close to release.

I think we'd better stick to bug fixes. And then maybe fast-cycle a 1.1, like 90 days out or so.
--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

From esr at thyrsus.com Thu Sep 21 04:57:52 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 21 Sep 2017 00:57:52 -0400
Subject: Default config file behavior - request for comment
In-Reply-To: <20170921004053.BD9BF40605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170920182832.GA12294@thyrsus.com> <20170921004053.BD9BF40605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170921045752.GB2404@thyrsus.com>

Hal Murray :
>
> >> Code in choice #1, and if it's easy to do, with a big loud warning to stderr
> >> and logerr that it's doing nothing.
>
> > Can be done.
>
> How about exiting after the loud warnings? That solves the open-door problem
> if the admin is focused on some other problem.

As I said to Achim, I'm getting cold feet about making functional changes this close to release.

Yes, we could keep tinkering with stuff like this. But I think it's time to squash whatever tracker issues we can, then stand away and *launch* the damn rocket.
--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

From hmurray at megapathdsl.net Thu Sep 21 05:16:23 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Wed, 20 Sep 2017 22:16:23 -0700
Subject: Default config file behavior - request for comment
In-Reply-To: Message from "Eric S. Raymond" of "Thu, 21 Sep 2017 00:57:52 EDT." <20170921045752.GB2404@thyrsus.com>
Message-ID: <20170921051623.9D7F940605C@ip-64-139-1-69.sjc.megapath.net>

esr at thyrsus.com said:
> Yes, we could keep tinkering with stuff like this. But I think it's time to
> squash whatever tracker issues we can, then stand away and *launch* the
> damn rocket.

You are the one who brought it up.

What would you have done if somebody we didn't know had filed a bug report with that content?

Actually, I think it's not critical. The missing "nomodify" looks evil, but all the modifications require a password.

--
These are my opinions. I hate spam.

From esr at thyrsus.com Thu Sep 21 05:36:22 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Thu, 21 Sep 2017 01:36:22 -0400
Subject: Default config file behavior - request for comment
In-Reply-To: <20170921051623.9D7F940605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170921045752.GB2404@thyrsus.com> <20170921051623.9D7F940605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170921053622.GA3515@thyrsus.com>

Hal Murray :
>
> esr at thyrsus.com said:
> > Yes, we could keep tinkering with stuff like this. But I think it's time to
> > squash whatever tracker issues we can, then stand away and *launch* the
> > damn rocket.
>
> You are the one who brought it up.

I am. Perfectionism on my part. I'm a bit jittery just now, seven days from release.

> What would you have done if somebody we didn't know had filed a bug report
> with that content?

Not sure, because I'm still not certain what the existing code is doing.
When I run ntpq reslist against a config that declares nothing but authentication passwords

-----------------------------------
keys /usr/local/etc/ntp.keys
trustedkey 10
controlkey 10
-----------------------------------

this is what I see:

   hits    addr/prefix or addr mask
           restrictions
==============================================================================
      0 192.168.1.22/32
           ntpport interface ignore
      0 127.0.0.1/32
           ntpport interface ignore
      1 0.0.0.0/0
      0 fe80::56a0:50ff:febb:62d0/128
           ntpport interface ignore
      0 2001:470:e34c:2:56a0:50ff:febb:62d0/128
           ntpport interface ignore
      0 ::1/128
           ntpport interface ignore
      0 ::/0

I'm having trouble mapping that to my mental model of how restriction blocks work, which makes me nervous about modifying nearby code. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own.

From hmurray at megapathdsl.net Thu Sep 21 06:12:43 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 20 Sep 2017 23:12:43 -0700 Subject: Default config file behavior - request for comment In-Reply-To: Message from "Eric S. Raymond via devel" of "Thu, 21 Sep 2017 01:36:22 EDT." <20170921053622.GA3515@thyrsus.com> Message-ID: <20170921061243.B776040605C@ip-64-139-1-69.sjc.megapath.net>

> 0 192.168.1.22/32
> ntpport interface ignore
(and friends)

> I'm having trouble mapping that to my mental model of how restriction blocks
> work, which makes me nervous about modifying nearby code.

There is code that adds rules like that to keep ntpd from getting time from itself.

from create_interface in ntpd/ntp_io.c

	/*
	 * Blacklist our own addresses, no use talking to ourself
	 */
	SET_HOSTMASK(&resmask, AF(&iface->sin));
	hack_restrict(RESTRICT_FLAGS, &iface->sin, &resmask,
		      RESM_NTPONLY | RESM_INTERFACE, RES_IGNORE, 0);

-- These are my opinions. I hate spam.
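
The self-address rules Hal describes can be pictured with a small model. The sketch below is not ntpd code. It transcribes the reslist entries quoted in this thread into a table and looks up a source address under two assumptions: that the most specific matching entry wins, and that the `ntpport` match flag limits an entry to packets whose source port is 123. The names `RESLIST`, `NTP_PORT`, and `lookup` are invented here for illustration.

```python
import ipaddress

NTP_PORT = 123  # assumption: "ntpport" means "source port is the NTP port"

# (prefix, match flags, restriction flags), transcribed from the reslist
# output quoted in the thread. Hypothetical table, not ntpd's data structure.
RESLIST = [
    ("192.168.1.22/32", {"ntpport", "interface"}, {"ignore"}),
    ("127.0.0.1/32", {"ntpport", "interface"}, {"ignore"}),
    ("0.0.0.0/0", set(), set()),
    ("fe80::56a0:50ff:febb:62d0/128", {"ntpport", "interface"}, {"ignore"}),
    ("2001:470:e34c:2:56a0:50ff:febb:62d0/128", {"ntpport", "interface"}, {"ignore"}),
    ("::1/128", {"ntpport", "interface"}, {"ignore"}),
    ("::/0", set(), set()),
]


def lookup(src_addr, src_port):
    """Return the restriction flags of the most specific matching entry."""
    addr = ipaddress.ip_address(src_addr)
    best_net, best_flags = None, set()
    for prefix, match_flags, restrictions in RESLIST:
        net = ipaddress.ip_network(prefix)
        if net.version != addr.version or addr not in net:
            continue  # wrong address family or outside the prefix
        if "ntpport" in match_flags and src_port != NTP_PORT:
            continue  # entry only applies to packets sent from port 123
        if best_net is None or net.prefixlen > best_net.prefixlen:
            best_net, best_flags = net, restrictions
    return best_flags


if __name__ == "__main__":
    # A packet from one of our own addresses, sent from port 123, is ignored.
    print(lookup("192.168.1.22", 123))
    # Any other host falls through to the empty default entry.
    print(lookup("192.168.1.50", 123))
```

Under these assumptions, the interface entries only shadow the default rule for traffic that looks like the daemon talking to itself, which is consistent with Hal's reading of the `hack_restrict` call.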
From hmurray at megapathdsl.net Thu Sep 21 17:51:22 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Thu, 21 Sep 2017 10:51:22 -0700 Subject: ntpq: Tiny vs 0 Message-ID: <20170921175122.9FAC140605C@ip-64-139-1-69.sjc.megapath.net> Would it be interesting to hack the combination of ntpd and ntpq to show 0 values as 0 and tiny values at 0.000 or with a command line switch 0.9E-xx? -- These are my opinions. I hate spam. From esr at thyrsus.com Thu Sep 21 18:04:28 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Thu, 21 Sep 2017 14:04:28 -0400 Subject: ntpq: Tiny vs 0 In-Reply-To: <20170921175122.9FAC140605C@ip-64-139-1-69.sjc.megapath.net> References: <20170921175122.9FAC140605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170921180428.GA20397@thyrsus.com> Hal Murray via devel : > Would it be interesting to hack the combination of ntpd and ntpq to show 0 > values as 0 and tiny values at 0.000 or with a command line switch 0.9E-xx? Must...have...semantic...context. What column and unit are we talking here? -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From fallenpegasus at gmail.com Thu Sep 21 18:09:33 2017 From: fallenpegasus at gmail.com (Mark Atwood) Date: Thu, 21 Sep 2017 18:09:33 +0000 Subject: ntpq: Tiny vs 0 In-Reply-To: <20170921175122.9FAC140605C@ip-64-139-1-69.sjc.megapath.net> References: <20170921175122.9FAC140605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Thu, Sep 21, 2017 at 10:51 AM Hal Murray via devel wrote: > > Would it be interesting to hack the combination of ntpd and ntpq to show 0 > values as 0 and tiny values at 0.000 or with a command line switch 0.9E-xx? > I've worked on systems that did things like that, but often they displayed "ZERO" or "0000". It's a neat idea. Not for 1.0 release, but maybe soon after.
..m -- Mark Atwood http://about.me/markatwood +1-206-604-2198 Mobile & Signal From gem at rellim.com Thu Sep 21 18:13:00 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 21 Sep 2017 11:13:00 -0700 Subject: ntpq: Tiny vs 0 In-Reply-To: <20170921175122.9FAC140605C@ip-64-139-1-69.sjc.megapath.net> References: <20170921175122.9FAC140605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170921111300.24d49bb8@spidey.rellim.com> Yo Hal! On Thu, 21 Sep 2017 10:51:22 -0700 Hal Murray via devel wrote: > Would it be interesting to hack the combination of ntpd and ntpq to > show 0 values as 0 and tiny values at 0.000 or with a command line > switch 0.9E-xx? ntpq and ntpmon can already sort of do that. Try 'ntpq -u' or just type 'u' while in ntpmon. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From Stromeko at nexgo.de Sun Sep 24 16:54:28 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Sun, 24 Sep 2017 18:54:28 +0200 Subject: Apparent protocol-machine bug, new top priority References: <20170827130206.88A9413A0209@snark.thyrsus.com> Message-ID: <878th46nqz.fsf@Rainer.invalid> Eric S. Raymond via devel writes: > Now that iburst has been fixed - and Achim reports seeing this problem > with iburst off - this pretty much has to be an issue deeper in the > protocol machine. (I guess we should count our blessings and > congratulate Daniel that there haven't more of these since the big > refactor.)
I have gathered more observations: it is almost always the rasPi 1B+ that kicks out its local clients and it seems to happen more often while it's busy with other stuff, so I think it somehow has to do with the speed of the NTP packets relative to the processing capabilities. Whether that happens because packages are retried (and then kicking off the rate-limiting) or packages getting the wrong source attached I don't know. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Samples for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra From ianbruene at gmail.com Sun Sep 24 17:15:51 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Sun, 24 Sep 2017 12:15:51 -0500 Subject: Post-1.0 refactoring and unit documentation; a request Message-ID: <011b2a73-9f01-e633-0e10-26adef6b3471@gmail.com> A few months ago when I added unit display to the python tools I created the devel/units file to document what the assorted Important Variables in NTPsec represent, and more importantly how that representation changes as the data moves through the programs. Because of my lack of knowledge of NTP and inexperience with C, this file is still very incomplete. It mostly contains random information that I could glean from obvious spots in the codebase. I am requesting that post-1.0 if you start refactoring or testing the codebase and come across an Important Variable, please add it and what unit it represents to devel/units. There is no need for complicated formatting as of yet, merely appending the information to the file will be sufficient, and I will massage it into a coherent picture. It also occurs to me that there may be other Important Invariants that need to be documented, but I do not know what those might be. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From esr at thyrsus.com Sun Sep 24 17:37:05 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Sun, 24 Sep 2017 13:37:05 -0400 Subject: Apparent protocol-machine bug, new top priority In-Reply-To: <878th46nqz.fsf@Rainer.invalid> References: <20170827130206.88A9413A0209@snark.thyrsus.com> <878th46nqz.fsf@Rainer.invalid> Message-ID: <20170924173705.GA14329@thyrsus.com> Achim Gratz via devel : > Eric S. Raymond via devel writes: > > Now that iburst has been fixed - and Achim reports seeing this problem > > with iburst off - this pretty much has to be an issue deeper in the > > protocol machine. (I guess we should count our blessings and > > congratulate Daniel that there haven't more of these since the big > > refactor.) > > I have gathered more observations: it is almost always the rasPi 1B+ > that kicks out its local clients and it seems to happen more often while > it's busy with other stuff, so I think it somehow has to do with the > speed of the NTP packets relative to the processing capabilities. > Whether that happens because packages are retried (and then kicking off > the rate-limiting) or packages getting the wrong source attached I don't > know. I'll try to reproduce on my RPis. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From fw at fwright.net Sun Sep 24 20:36:39 2017 From: fw at fwright.net (Fred Wright) Date: Sun, 24 Sep 2017 13:36:39 -0700 (PDT) Subject: Apparent protocol-machine bug, new top priority In-Reply-To: <20170924173705.GA14329@thyrsus.com> References: <20170827130206.88A9413A0209@snark.thyrsus.com> <878th46nqz.fsf@Rainer.invalid> <20170924173705.GA14329@thyrsus.com> Message-ID: On Sun, 24 Sep 2017, Eric S. Raymond via devel wrote: > Achim Gratz via devel : > > Eric S. Raymond via devel writes: > > > Now that iburst has been fixed - and Achim reports seeing this problem > > > with iburst off - this pretty much has to be an issue deeper in the > > > protocol machine. 
(I guess we should count our blessings and > > > congratulate Daniel that there haven't more of these since the big > > > refactor.) > > > > I have gathered more observations: it is almost always the rasPi 1B+ > > that kicks out its local clients and it seems to happen more often while > > it's busy with other stuff, so I think it somehow has to do with the > > speed of the NTP packets relative to the processing capabilities. > > Whether that happens because packages are retried (and then kicking off > > the rate-limiting) or packages getting the wrong source attached I don't > > know. I presume "packages" was meant to be "packets". > I'll try to reproduce on my RPis. Is there some kind of stress-test program that can be used to induce this kind of problem? Can the failure rate be increased by changing the governor settings to make the server slower? On the Pi that would significantly worsen the time accuracy, but for the purposes of this experiment that should be acceptable. Fred Wright From Stromeko at nexgo.de Mon Sep 25 05:47:05 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Mon, 25 Sep 2017 07:47:05 +0200 Subject: Apparent protocol-machine bug, new top priority References: <20170827130206.88A9413A0209@snark.thyrsus.com> <878th46nqz.fsf@Rainer.invalid> <20170924173705.GA14329@thyrsus.com> Message-ID: <87a81jxrc6.fsf@Rainer.invalid> Fred Wright via devel writes: > I presume "packages" was meant to be "packets". Yes, but since I was packaging a few hundred Perl modules for Cygwin at the time, my mind wandered off and that one slipped through. >> I'll try to reproduce on my RPis. > > Is there some kind of stress-test program that can be used to induce this > kind of problem? Not that I know of. I currently have four local servers that monitor each other at poll=4 (16s). As said before, it does happen with the external servers from time to time also, but it hasn't happened since the last update. 
> Can the failure rate be increased by changing the governor settings to > make the server slower? On the Pi that would significantly worsen the > time accuracy, but for the purposes of this experiment that should be > acceptable. You mean dropping the CPU frequency? That's maybe worth a shot, but I run the rasPi 3B at 600MHz currently for thermal stability and don't see any problems there. I still think it is rather contention at the NIC that triggers the behaviour, but the real problem is that there is no recovery happening. So maybe just testing the rate limiting branch directly with test packets would reveal what's going on, but I don't know if there's a facility to produce them. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ SD adaptation for Waldorf microQ V2.22R2: http://Synth.Stromeko.net/Downloads.html#WaldorfSDada From hmurray at megapathdsl.net Mon Sep 25 10:10:18 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Mon, 25 Sep 2017 03:10:18 -0700 Subject: Has anybody tried a cross compile? Message-ID: <20170925101018.DC55A40605C@ip-64-139-1-69.sjc.megapath.net> Are there any systems where ntpsec is known to work and require cross compiling? -- These are my opinions. I hate spam. From fw at fwright.net Mon Sep 25 19:41:08 2017 From: fw at fwright.net (Fred Wright) Date: Mon, 25 Sep 2017 12:41:08 -0700 (PDT) Subject: Has anybody tried a cross compile? In-Reply-To: <20170925101018.DC55A40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170925101018.DC55A40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Mon, 25 Sep 2017, Hal Murray via devel wrote: > Are there any systems where ntpsec is known to work and require cross > compiling? I don't think a full build would work as a cross-compile, because one has to jump through extra hoops to make cross-compiling Python extensions work. Cross-compiling just the daemon should be doable, though.
Fred Wright From fw at fwright.net Mon Sep 25 19:58:42 2017 From: fw at fwright.net (Fred Wright) Date: Mon, 25 Sep 2017 12:58:42 -0700 (PDT) Subject: Apparent protocol-machine bug, new top priority In-Reply-To: <87a81jxrc6.fsf@Rainer.invalid> References: <20170827130206.88A9413A0209@snark.thyrsus.com> <878th46nqz.fsf@Rainer.invalid> <20170924173705.GA14329@thyrsus.com> <87a81jxrc6.fsf@Rainer.invalid> Message-ID: On Mon, 25 Sep 2017, Achim Gratz via devel wrote: > Fred Wright via devel writes: > > > Can the failure rate be increased by changing the governor settings to > > make the server slower? On the Pi that would significantly worsen the > > time accuracy, but for the purposes of this experiment that should be > > acceptable. > > You mean dropping the CPU frequency? That's maybe worth a shot, but I > run the rasPi 3B at 600MHz currently for thermal stability and don't see > any problems there. I still think it is rather contention at the NIC > that triggers the behaviour, but the real problem is that there is no > recovery happening. So maybe just testing the rate limiting branch > directly with test packets would hreveal what's going on, but I don't > know if there's a facility to produce them. Perhaps, but it's a fairly easy experiment to try. I get a kick out of you guys fussing over "thermal stability" when the largest source of time error is the interrupt latency in timing the PPS signal. Just because you can't see the error in the graphs doesn't mean it isn't there. :-) On the Beaglebone, it's typically around 15us with the CPU running at 1GHz, going up to around 42us at 300MHz. It's directly measurable because the "real" PPS timing is via counter capture, with a total capture uncertainty (the equivalent of NTP RTT) of typically 583ns at 1GHz and 1083ns at 300MHz. 
Fred Wright From Stromeko at nexgo.de Mon Sep 25 21:14:35 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Mon, 25 Sep 2017 23:14:35 +0200 Subject: Apparent protocol-machine bug, new top priority References: <20170827130206.88A9413A0209@snark.thyrsus.com> <878th46nqz.fsf@Rainer.invalid> <20170924173705.GA14329@thyrsus.com> <87a81jxrc6.fsf@Rainer.invalid> Message-ID: <87shfalbus.fsf@Rainer.invalid> Fred Wright via devel writes: > I get a kick out of you guys fussing over "thermal stability" when the > largest source of time error is the interrupt latency in timing the PPS > signal. The median interrupt latency shows up as an additional offset on top of other such offsets. The variability on that latency gets filtered pretty nicely by ntpd, especially the long tail at large latencies. Now, interrupts never have been a particularly strong point for ARM, I give you that. > Just because you can't see the error in the graphs doesn't mean > it isn't there. :-) Again, that number isn't materially affecting the frequency stability, only the time offset. If you look at that, you will quickly find that your assertion of thermal effects getting dominated by the interrupt latency is wrong. > On the Beaglebone, it's typically around 15us with the > CPU running at 1GHz, going up to around 42us at 300MHz. It's directly > measurable because the "real" PPS timing is via counter capture, with a > total capture uncertainty (the equivalent of NTP RTT) of typically 583ns > at 1GHz and 1083ns at 300MHz. If you have histograms, I'd like to see them. But that seems to be in the right ballpark. Note that you could do something similar by running the PPS capture on the VC4 instead of ARM subsystem, but that part of the rasPi is woefully under-documented. In principle the hardware should allow capturing PPS at up to 250MHz and sending the timestamp via mailbox to the ARM is not timing-critical at all. 
If you wanted to eliminate that, you'd better use an FPGA or some other microcontroller that has capture units and hardware timestamps for the NIC. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From gem at rellim.com Mon Sep 25 21:16:21 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 25 Sep 2017 14:16:21 -0700 Subject: Apparent protocol-machine bug, new top priority In-Reply-To: References: <20170827130206.88A9413A0209@snark.thyrsus.com> <878th46nqz.fsf@Rainer.invalid> <20170924173705.GA14329@thyrsus.com> <87a81jxrc6.fsf@Rainer.invalid> Message-ID: <20170925141621.674f13c3@spidey.rellim.com> Yo Fred! On Mon, 25 Sep 2017 12:58:42 -0700 (PDT) Fred Wright via devel wrote: > I get a kick out of you guys fussing over "thermal stability" when the > largest source of time error is the interrupt latency in timing the > PPS signal. Uh, that is not my experience. And I have more control over my temperature than I have over my interrupt latency. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From fw at fwright.net Mon Sep 25 23:55:30 2017 From: fw at fwright.net (Fred Wright) Date: Mon, 25 Sep 2017 16:55:30 -0700 (PDT) Subject: Apparent protocol-machine bug, new top priority Message-ID: On Mon, 25 Sep 2017, Achim Gratz via devel wrote: > Fred Wright via devel writes: > > I get a kick out of you guys fussing over "thermal stability" when the > > largest source of time error is the interrupt latency in timing the PPS > > signal. > > The median interrupt latency shows up as an additional offset on top of > other such offsets. The variability on that latency gets filtered > pretty nicely by ntpd, especially the long tail at large latencies. > Now, interrupts never have been a particularly strong point for ARM, I > give you that. But it can't "filter" the overall offset - it has no way to know what it is. > > Just because you can't see the error in the graphs doesn't mean > > it isn't there. :-) > > Again, that number isn't materially affecting the frequency stability, > only the time offset. If you look at that, you will quickly find that > your assertion of thermal effects getting dominated by the interrupt > latency is wrong. Of course it has nothing to do with the frequency stability, but it directly affects the time offset. And my assertion is based on the actual data. See below. > > On the Beaglebone, it's typically around 15us with the > > CPU running at 1GHz, going up to around 42us at 300MHz. It's directly > > measurable because the "real" PPS timing is via counter capture, with a > > total capture uncertainty (the equivalent of NTP RTT) of typically 583ns > > at 1GHz and 1083ns at 300MHz. > > If you have histograms, I'd like to see them. But that seems to be in OK, A Tale of Two Servers - It was the best of times, it was the worst of times... 
This is a day's data from my experimental time server: http://sonic.net/~fw/private/NTP/BB2-2017-09-24/ The main timing reference is a rubidium oscillator, so frequency-related effects are essentially nonexistent. In fact, the error in the Linux kernel's limited-precision *representation* of the frequency is about an order of magnitude larger than the typical actual frequency error. PPS(2) is the counter-capture PPS source, and is the primary timing reference. SHM(1) is the combined NMEA/PPS source from GPSD, which is configured to use the interrupt-based PPS driver, and hence illustrates the offset in the interrupt-based capture. Between 2100Z and 2200Z I switched the governor to powersave (300MHz CPU clock instead of 1GHz), and you can see the effect on the latency (but negligible effect on the actual timing accuracy). SHM(0) is a noselect peer that's included just to track the TOFF of the GPS receiver. PPS(4) is the PPS from the undisciplined rubidium oscillator, whose drift represents the rubidium frequency error, and with a step change where I'd manually reset it to the correct phase. And here's a day's data from my primary time server: http://sonic.net/~fw/private/NTP/Time-2017-09-22/ This one is running classic ntpd, but that shouldn't really matter for this purpose. The timing reference is just the normal crystal, with no special thermal treatment. Again, PPS(2) is the main timing reference, though it's listed as 127.127.22.2 due to the lame partial translation table in ntpviz. Again, SHM(1) is the NMEA/PPS source with interrupt-based PPS capture, and the offset is a combination of the interrupt latency and the frequency-related time offsets. The actual time offsets are visible in the loopstats graph and in the PPS(2) peer offset graph, and are substantially smaller than the offsets in SHM(1). QED. SHM(0) is a noselect peer here as well. > the right ballpark. 
Note that you could do something similar by running > the PPS capture on the VC4 instead of ARM subsystem, but that part of > the rasPi is woefully under-documented. In principle the hardware > should allow capturing PPS at up to 250MHz and sending the timestamp via > mailbox to the ARM is not timing-critical at all. Lots of stuff on the Pi is woefully under-documented. :-) After all, this is from Broadcom, an industry leader in closed documentation. It must have *really* pained them even to provide the documentation that does exist. > If you wanted to eliminate that, you'd better use an FPGA or some other > microcontroller that has capture units and hardware timestamps for the > NIC. The Beaglebone chipset already has the hardware, it's reasonably well documented, and there's driver support for it in Linux. The biggest source of inaccuracy is that the generic timekeeping code in Linux provides no way to convert a *supplied* counter value to a timestamp. The best one can do purely within the driver involves reading the current counter value and the system timestamp as close together as possible, and using that correspondence to map the captured counter value to the corresponding timestamp. The delay in that sequence accounts for the majority of the capture uncertainty. My experimental version of the driver has some improvements in that area, but I've squeezed it about as much as is possible without touching timekeeping.c. BTW, the Beaglebone chipset also has hardware timestamping in the NIC, and I believe there's kernel support for it, but one can't take full advantage of that without solving the "send timestamp problem". On Mon, 25 Sep 2017, Gary E. Miller via devel wrote: > On Mon, 25 Sep 2017 12:58:42 -0700 (PDT) > Fred Wright via devel wrote: > > > I get a kick out of you guys fussing over "thermal stability" when the > > largest source of time error is the interrupt latency in timing the > > PPS signal. > > Uh, that is not my experience. 
And I have more control over my > temperature than I have over my interrupt latency. See above. And note that you can at least make the latency (as well as the variation in latency) as small as possible by running the CPU as fast as possible, rather than slowing it down for "thermal stability". Fred Wright From gem at rellim.com Tue Sep 26 00:26:03 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 25 Sep 2017 17:26:03 -0700 Subject: Apparent protocol-machine bug, new top priority In-Reply-To: References: Message-ID: <20170925172603.37f154b0@spidey.rellim.com> Yo Fred! On Mon, 25 Sep 2017 16:55:30 -0700 (PDT) Fred Wright via devel wrote: > PPS(2) is the counter-capture PPS source, and is the primary timing > reference. Can you explain a bit more about this source? How does this differ from the KPPS or PPS(TIOCMIWAIT) sources? > SHM(1) is the combined NMEA/PPS source from GPSD, which is > configured to use the interrupt-based PPS driver, and hence > illustrates the offset in the interrupt-based capture. Your SHM(1), on both your hosts, seems not to be very good. Are you using KPPS? Or just TIOCMIWAIT? > Again, PPS(2) is the main timing reference, though it's listed as > 127.127.22.2 due to the lame partial translation table in ntpviz. What do you think ntpviz should do better? Just convert 127.127.22.C to PPS(X)? That would be an easy patch. > The actual time offsets are > visible in the loopstats graph and in the PPS(2) peer offset graph, > and are substantially smaller than the offsets in SHM(1). QED. Well, your much poorer than normal SHM(1) Standard Deviation casts doubt on your QED. > > Uh, that is not my experience. And I have more control over my > > temperature than I have over my interrupt latency. > > See above. And note that you can at least make the latency (as well > as the variation in latency) as small as possible by running the CPU > as fast as possible, rather than slowing it down for "thermal > stability".
I always run my ntpd's with the performance governor. So not an issue for me. I get 'thermal stability' by controlling an external heater and use that to stabilize my test enclosure. Then for fine temp control I vary the CPU workload to stabilize the CPU temp. Prolly several reasons why my SHM(1) seems to have 40x less jitter than yours. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From fw at fwright.net Tue Sep 26 02:26:02 2017 From: fw at fwright.net (Fred Wright) Date: Mon, 25 Sep 2017 19:26:02 -0700 (PDT) Subject: Apparent protocol-machine bug, new top priority In-Reply-To: <20170925172603.37f154b0@spidey.rellim.com> References: <20170925172603.37f154b0@spidey.rellim.com> Message-ID: On Mon, 25 Sep 2017, Gary E. Miller via devel wrote: > On Mon, 25 Sep 2017 16:55:30 -0700 (PDT) > Fred Wright via devel wrote: > > > PPS(2) is the counter-capture PPS source, and is the primary timing > > reference. > > Can you explain a bit more about this source? How does this differ > from the KPPS or PPS(TIOCMIWAIT) sources? It's a different kind of KPPS, using the pps-gmtimer driver (with some experimental improvements of my own) rather than the usual pps-gpio driver. > > SHM(1) is the combined NMEA/PPS source from GPSD, which is > > configured to use the interrupt-based PPS driver, and hence > > illustrates the offset in the interrupt-based capture. > > Your SHM(1), on both your hosts, seem to not be very good. > > Are you using KPPS? Or just TIOCMIWAIT? It's KPPS via the usual pps-gpio. It's not a "modem-control" PPS at all.
I don't do anything to reduce the variability since that's not the real timing reference, anyway, and it's not the main issue here. Some of that may be in the receiver itself. The *cape* vendor only claims +/-200ns, even though the *chipset* vendor claims +/-60ns IIRC. > > Again, PPS(2) is the main timing reference, though it's listed as > > 127.127.22.2 due to the lame partial translation table in ntpviz. > > What do you think ntpviz should do better? Just convert 127.127.22.C > to PPS(X) ? That would be ann easy patch. Basically, yes, but note that ntpq has the same issue when pointed at classic ntpd. The mapping table should really be in one of the common libraries so that ntpq and ntpviz can share it. And it should have the complete list so that it can cover everything that classic ntpd supports. > > The actual time offsets are > > visible in the loopstats graph and in the PPS(2) peer offset graph, > > and are substantially smaller than the offsets in SHM(1). QED. > > Well, your much poorer than normal SHM(1) Standard Deviation casts > doubt on your QED. Except that the SD is still way less than the mean. Having a really tight grouping of shots five feet to the right of the target doesn't make you a good marksman. :-) > > > Uh, that is not my experience. And I have more control over my > > > temperature than I have over my interrupt latency. > > > > See above. And note that you can at least make the latency (as well > > as the variation in latency) as small as possible by running the CPU > > as fast as possible, rather than slowing it down for "thermal > > stability". > > I always run my ntpd's with the perfomance governor. So not an issue > for me. Though apparently Achim runs his slower, hence the comment. > I get 'thermal stability' by controlling an external heater and use that > to stabilize my test enclosure. Then for fine temp control I vary the > CPU workload to stabilize the CPU temp. 
> > Prolly several reasons why my SHM(1) seems to have 40x less jitter than > yours. Perhaps, but I'll bet you're still more than 10 microseconds off without having any way to see it. Fred Wright From gem at rellim.com Tue Sep 26 02:57:45 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 25 Sep 2017 19:57:45 -0700 Subject: Apparent protocol-machine bug, new top priority In-Reply-To: References: <20170925172603.37f154b0@spidey.rellim.com> Message-ID: <20170925195718.5219776a@spidey.rellim.com> Yo Fred! On Mon, 25 Sep 2017 19:26:02 -0700 (PDT) Fred Wright via devel wrote: > On Mon, 25 Sep 2017, Gary E. Miller via devel wrote: > > On Mon, 25 Sep 2017 16:55:30 -0700 (PDT) > > Fred Wright via devel wrote: > > > > > PPS(2) is the counter-capture PPS source, and is the primary > > > timing reference. > > > > Can you explain a bit more about this source? How does this differ > > from the KPPS or PPS(TIOCMIWAIT) sources? > > It's a different kind of KPPS, using the pps-gmtimer driver (with some > experimental improvements of my own) rather than the usual pps-gpio > driver. Cool. I hope you get it upstreamed. > > Your SHM(1), on both your hosts, seem to not be very good. > It's KPPS via the usual pps-gpio. It's not a "modem-control" PPS at > all. > I don't do anything to reduce the variability since that's not the > real timing reference, anyway, and it's not the main issue here. But it does call into question your other measurements. How can I trust the hard measurements when the easy ones look marginal? > Some of that may be in the receiver itself. The *cape* vendor only > claims +/-200ns, even though the *chipset* vendor claims +/-60ns IIRC. And you were seeing over 10 microseconds Standard Deviation... > > What do you think ntpviz should do better? Just convert > > 127.127.22.C to PPS(X) ? That would be an easy patch. > > Basically, yes, I just pushed a patch for ntpviz. > but note that ntpq has the same issue when pointed at > classic ntpd.
Nothing I can do about Classic NTP. > The mapping table should really be in one of the common > libraries so that ntpq and ntpviz can share it. ntpq does no mapping. It just uses the mapping that ntpd sent it over mode 6. > And it should have > the complete list so that it can cover everything that classic ntpd > supports. I added PPS and NMEA, but I have no idea what abbreviations to use for the others. > > Well, your much poorer than normal SHM(1) Standard Deviation casts > > doubt on your QED. > > Except that the SD is still way less than the mean. Yes, but when something is not right, you have to discount the parts that look correct. It would be interesting to add a fudge to the SHM(1). > Having a really tight grouping of shots five feet to the right of the > target doesn't make you a good marksman. :-) Really? I was taught to get your grouping right first, then work on moving the center of your cluster. > > > > Uh, that is not my experience. And I have more control over my > > > > temperature than I have over my interrupt latency. > > > > > > See above. And note that you can at least make the latency (as > > > well as the variation in latency) as small as possible by running > > > the CPU as fast as possible, rather than slowing it down for > > > "thermal stability". > > > > I always run my ntpd's with the perfomance governor. So not an > > issue for me. > > Though apparently Achim runs his slower, hence the comment. Now we are conflating two things. Thermal effects and latency. I do not find them cross-correlated. > > Prolly several reasons why my SHM(1) seems to have 40x less jitter > > than yours. > > Perhaps, but I'll bet you're still more than 10 microseconds off > without having any way to see it. I'd like to see the results of that bet. Get your driver upstreamed, and working, I'd like to try it. RGDS GARY --------------------------------------------------------------------------- Gary E. 
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From fw at fwright.net Tue Sep 26 03:26:44 2017 From: fw at fwright.net (Fred Wright) Date: Mon, 25 Sep 2017 20:26:44 -0700 (PDT) Subject: Python Library Cleanups Message-ID: I finally have the fixes I've been working on related to the Python library setup ready for publication. The important issues were: 1) Waf misuses get_python_lib() in a way that often gets the wrong result, with the effect of installing the libraries in a location where Python doesn't look for them. This led to various inappropriate recommendations to set PYTHONPATH. PYTHONPATH is a special developer kludge, analogous to LD_LIBRARY_PATH, which should never be set in normal operation. Code that depends on setting PYTHONPATH isn't production-ready. 2) The in-tree testing setup wasn't fully functional. In the process of fixing that, I noticed that it was set up to apply the hacks to the *source* tree, which is inconsistent with the project's organizational philosophy of keeping build products out of the source tree. I've reworked that, and it now works for all Python versions and handles the tests as well as the clients, but the switch to the build-tree orientation has operational impact (on developers, not users): 2.1) One now needs to use the build-tree path rather than the source-tree path to run any given "uninstalled" program. E.g., run build/main/ntpclients/ntpq instead of ntpclients/ntpq. To defend against doing this wrong by accident, the source-tree copies no longer have execute permissions. This should also avoid .pyc/.pyo pollution in the source tree.
2.2) Any personal scripts and/or symlinks pointing to these programs need to be updated. Aside from a couple of things that are unavoidably created in the source tree (including the build directory itself), there are now only two build products being created in the source tree: 1) The ntpd/version.h file. This is probably fairly easy to fix, but when my quickie attempt didn't work I just left it for later. This is the only remaining use of the 'clean' part of afterparty(). 2) The autorevision cache. That whole autorevision thing needs work, since right now it's not possible to build outside a git repo without taking extra steps to create and install the cache file. And contrary to what's implied by the documentation, it's not specifically a cross-building issue. Either autorevision needs some sort of non-git fallback (a la GPSD), or there needs to be some better way to populate a non-git directory for a working build. Anyway, I left that stuff alone for now. MR coming. Fred Wright From gem at rellim.com Tue Sep 26 03:56:56 2017 From: gem at rellim.com (Gary E. Miller) Date: Mon, 25 Sep 2017 20:56:56 -0700 Subject: Python Library Cleanups In-Reply-To: References: Message-ID: <20170925205656.0efe3608@spidey.rellim.com> Yo Fred! On Mon, 25 Sep 2017 20:26:44 -0700 (PDT) Fred Wright via devel wrote: > 1) Waf misuses get_python_lib() in a way that often gets the wrong > result, with the effect of installing the libraries in a location > where Python doesn't look for them. This led to various > inappropriate recommendations to set PYTHONPATH. PYTHONPATH is a > special developer kludge, analogous to LD_LIBRARY_PATH, which should > never be set in normal operation. Code that depends on setting > PYTHONPATH isn't production-ready. The default Gentoo PYTHONPATH only looks for system-installed Python libs. We don't want to install NTPsec Python files in the system reserved directories. Thus PYTHONPATH is the only solution.
> 2) The in-tree testing setup wasn't fully functional. In the process > of fixing that, I noticed that it was set up to apply the hacks to the > *source* tree, which is inconsistent with the project's organizational > philosophy of keeping build products out of the source tree. I've > reworked that, and it now works for all Python versions and handles > the tests as well as the clients, but the switch to the build-tree > orientation has operational impact (on developers, not users): What was not functional? I'm all for keeping the source tree clean and building in the build directories. > 1) The ntpd/version.h file. This is probably fairly easy to fix, but > when my quickie attempt didn't work I just left it for later. This > is the only remaining use of the 'clean' part of afterparty(). Can you file an issue for that? > 2) The autorevision cache. That whole autorevision thing needs work, > since right now it's not possible to build outside a git repo without > taking extra steps to create and install the cache file. And > contrary to what's implied by the documentation, it's not > specifically a cross-building issue. Either autorevision needs some > sort of non-git fallback (a la GPSD), or there needs to be some > better way to populate a non-git directory for a working build. > Anyway, I left that stuff alone for now. Can you make a new issue for that? > MR coming. Can you split apart your MR? It addresses different issues and at least #1 is probably not gonna happen. Plus these are most likely too late for 1.0. We are in the final testing phases now. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin
From daniele at grinta.net Tue Sep 26 04:24:39 2017 From: daniele at grinta.net (Daniele Nicolodi) Date: Mon, 25 Sep 2017 22:24:39 -0600 Subject: Python Library Cleanups In-Reply-To: <20170925205656.0efe3608@spidey.rellim.com> References: <20170925205656.0efe3608@spidey.rellim.com> Message-ID: <7c693ecb-8760-2147-4e26-265333bf0b66@grinta.net> On 25/09/17 21:56, Gary E. Miller via devel wrote: > On Mon, 25 Sep 2017 20:26:44 -0700 (PDT) > Fred Wright via devel wrote: > >> 1) Waf misuses get_python_lib() in a way that often gets the wrong >> result, with the effect of installing the libraries in a location >> where Python doesn't look for them. This led to various >> inappropriate recommendations to set PYTHONPATH. PYTHONPATH is a >> special developer kludge, analogous to LD_LIBRARY_PATH, which should >> never be set in normal operation. Code that depends on setting >> PYTHONPATH isn't production-ready. > > The default gentoo PYTHONPATH only looks for system installed > python libs. We don't want to install NTPsec python files in the > system reserved directories. Thus PYTHONPATH is the only solution. I sincerely hope that Gentoo is not that broken. Implementing this misfeature would require patching the Python interpreter. By default Python looks for modules in the "site-packages" directory, which has a self-descriptive name. Where else would you like to install the NTPsec Python modules? > Can you split apart your MR? It addresses different issues and > at least #1 is probably not gonna happen. Delivering a 1.0 release that requires fiddling with PYTHONPATH to use the application would make the project look like it is run by amateurs. I think that is not the impression you want to give.
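To make the search-path point concrete, here is a minimal sketch of how one might check whether a given install directory is actually searched by the running interpreter. The helper name on_search_path and the candidate directory are hypothetical, chosen only for illustration; sysconfig.get_path("purelib") is the standard-library way to ask where pure-Python third-party modules are expected, and is not necessarily what waf or the NTPsec build uses.

```python
import os
import sys
import sysconfig


def on_search_path(directory, search_path=None):
    """Return True if `directory` is one of the entries Python searches for modules."""
    if search_path is None:
        search_path = sys.path
    target = os.path.normcase(os.path.abspath(directory))
    return any(
        os.path.normcase(os.path.abspath(entry)) == target
        for entry in search_path
    )


if __name__ == "__main__":
    # Where this interpreter expects pure-Python third-party modules.
    print("site-packages:", sysconfig.get_path("purelib"))
    # Hypothetical install location, used only for illustration.
    candidate = "/usr/local/lib/python2.7/site-packages"
    print(candidate, "searched:", on_search_path(candidate))
```

If the directory the build installs into reports False here, the modules can only be found by extending the search path, which is the situation that leads to PYTHONPATH recommendations.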
Cheers, Daniele From Stromeko at nexgo.de Tue Sep 26 07:03:57 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 26 Sep 2017 09:03:57 +0200 Subject: Apparent protocol-machine bug, new top priority References: Message-ID: <87efqu2b6q.fsf@Rainer.invalid> Fred Wright via devel writes: > But it can't "filter" the overall offset - it has no way to know what it > is. I never claimed that it does. I lack an absolute time reference anyway and have yet to splurge for a good enough frequency normal, so at the moment all I care about is getting the three GPS disciplined ntpd to agree on the time and minimize the (apparent) offset to my legal time as served by the PTB. The current status is that the local servers are typically within a 50µs bracket (dominated by systematic offsets that I have not yet compensated for) and about a 200µs bracket for the external servers (dominated by a variability in the link most likely) as seen by a separate box on my local net. >> Again, that number isn't materially affecting the frequency stability, >> only the time offset. If you look at that, you will quickly find that >> your assertion of thermal effects getting dominated by the interrupt >> latency is wrong. > > Of course it has nothing to do with the frequency stability, but it > directly affects the time offset. And my assertion is based on the actual > data. See below. The drift of the time offset over a day caused by just the normal diurnal temperature cycle in the summer and the heating control in the winter is larger than that offset, out-of-the-box, on a rasPi or TinkerBoard. The BeagleBoard might be a little bit more stable, but I don't think by more than a factor of two. Making the system oscillator more stable improves that variability to below 1µs as long as the PPS from the GPS is stable (your Rb disciplined PPS is much better than that).
Only then I was able to consistently see the (expected) differences between the three stratum 1 boxes, but they are still a tiny bit larger than the interrupt latency. > This is a day's data from my experimental time server: > > http://sonic.net/~fw/private/NTP/BB2-2017-09-24/ > > The main timing reference is a rubidium oscillator, so frequency-related > effects are essentially nonexistent. In fact, the error in the Linux > kernel's limited-precision *representation* of the frequency is about an > order of magnitude larger than the typical actual frequency error. Thanks for the data. No wonder you belittle us poor souls trying to make do with just the board and GPS module w/ PPS. But getting the best possible performance under that constraint is what I am trying to do, at the moment at least. > PPS(2) is the counter-capture PPS source, and is the primary timing > reference. Modulo a constant offset, I'm about a factor of 1.5 away with my best box and not far behind with the other two. Considering the price difference, I'm pretty pleased with that. > SHM(1) is the combined NMEA/PPS source from GPSD, which is > configured to use the interrupt-based PPS driver, and hence illustrates > the offset in the interrupt-based capture. Between 2100Z and 2200Z I > switched the governor to powersave (300MHz CPU clock instead of 1GHz), and > you can see the effect on the latency (but negligible effect on the actual > timing accuracy). The jitter figures for the same time period tell a different story. If you'd actually lock to that source and monitor PPS(2) instead, you would likely get some extra time wander due to the not-quite-white jitter statistics. >> If you wanted to eliminate that, you'd better use an FPGA or some other >> microcontroller that has capture units and hardware timestamps for the >> NIC. > > The Beaglebone chipset already has the hardware, it's reasonably well > documented, and there's driver support for it in Linux.
When I was considering the BeagleBoard, I couldn't find anyone who was able to deliver it, so that opportunity passed. I have both some FPGA and µC hardware available that I could use for a better time server (and still below the BeagleBoard cost), but I'm constantly out of round tuits to have an actual go at that. > The biggest source of inaccuracy is that the generic timekeeping code > in Linux provides no way to convert a *supplied* counter value to a > timestamp. I'd have a long hard look at the RADclock patches from a few years ago. They will certainly need some porting forward to the modern kernel, but I still think the approach was sound and I would love to see that resurrected in the kernel. > BTW, the Beaglebone chipset also has hardware timestamping in the NIC, and > I believe there's kernel support for it, but one can't take full advantage > of that without solving the "send timestamp problem". Maybe, although it seems more likely that only the general timestamp support is compiled in, with the peripheral driver support missing. Try ethtool to see if the NIC driver actually recognizes that support. If it does, then PTP should also work. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Factory and User Sound Singles for Waldorf rackAttack: http://Synth.Stromeko.net/Downloads.html#WaldorfSounds From fw at fwright.net Tue Sep 26 07:15:04 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 00:15:04 -0700 (PDT) Subject: Python Library Cleanups In-Reply-To: <20170925205656.0efe3608@spidey.rellim.com> References: <20170925205656.0efe3608@spidey.rellim.com> Message-ID: On Mon, 25 Sep 2017, Gary E. Miller via devel wrote: > On Mon, 25 Sep 2017 20:26:44 -0700 (PDT) > Fred Wright via devel wrote: > > > 1) Waf misuses get_python_lib() in a way that often gets the wrong > > result, with the effect of installing the libraries in a location > > where Python doesn't look for them.
This led to various > > inappropriate recommendations to set PYTHONPATH. PYTHONPATH is a > > special developer kludge, analogous to LD_LIBRARY_PATH, which should > > never be set in normal operation. Code that depends on setting > > PYTHONPATH isn't production-ready. > > The default gentoo PYTHONPATH onlly looks for system installed > python libs. We don't want to install NTPsec python files in the > system reserved directories. Thus PYTHONPATH is the only solution. The location returned by get_python_lib() *without the prefix argument* is exactly what the GPSD install procedure uses, and why GPSD has never needed PYTHONPATH. In fact, the way I tracked down the problem is by asking why GPSD doesn't have the same problem. If the directory choice on gentoo is inappropriate, take that up with whomever packaged Python for it. > > 2) The in-tree testing setup wasn't fully functional. In the process > > of fixing that, I noticed that it was set up to apply the hacks to the > > *source* tree, which is inconsistent with the project's organizational > > philosophy of keeping build products out of the source tree. I've > > reworked that, and it now works for all Python versions and handles > > the tests as well as the clients, but the switch to the build-tree > > orientation has operational impact (on developers, not users): > > What was not functional? I'm all for keeping the source tree clean > and building in the build directories. It didn't work with Python 3, and it didn't work for the tests. > Can you split apart your MR? It addresses different issues and > at least #1 is probably not gonna happen. It's already split into multiple commits, and there seems to be some difference of opinion regarding the latter. > Plus these are most likely too late for 1.0. We are in the final > testing phases now. I think requiring users to set PYTHONPATH to run the tools should be considered a show stopper for a 1.0 release. 
These changes are almost entirely just in the build scripts, with the only actual code changes (Python only) being almost entirely in the error reporting for not finding the libraries. Fred Wright From esr at thyrsus.com Tue Sep 26 07:59:42 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 03:59:42 -0400 Subject: Python Library Cleanups In-Reply-To: <7c693ecb-8760-2147-4e26-265333bf0b66@grinta.net> References: <20170925205656.0efe3608@spidey.rellim.com> <7c693ecb-8760-2147-4e26-265333bf0b66@grinta.net> Message-ID: <20170926075942.GA12360@thyrsus.com> Daniele Nicolodi via devel : > Delivering a 1.0 release that requires fiddling with PYTHONPATH to use > the application would make the project look like it is run by amateurs. > I think that is not the impression you want to give. I'm sensitive to this issue. PYTHONPATH fiddling is never required on our main platforms. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Tue Sep 26 08:05:02 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 04:05:02 -0400 Subject: Python Library Cleanups In-Reply-To: References: <20170925205656.0efe3608@spidey.rellim.com> Message-ID: <20170926080502.GB12360@thyrsus.com> Fred Wright via devel : > > Plus these are most likely too late for 1.0. We are in the final > > testing phases now. > > I think requiring users to set PYTHONPATH to run the tools should be > considered a show stopper for a 1.0 release. And I would so regard it if this were ever required on Linux or FreeBSD. I have had no report of this. > These changes are almost entirely just in the build scripts, with the only > actual code changes (Python only) being almost entirely in the error > reporting for not finding the libraries. 
I will audit them, but dropping this in the day before release doesn't just seem like asking for trouble, it seems like screaming WE WANT TO EMBARRASS OURSELVES. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From hmurray at megapathdsl.net Tue Sep 26 08:30:45 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 01:30:45 -0700 Subject: Python Library Cleanups In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 26 Sep 2017 04:05:02 EDT." <20170926080502.GB12360@thyrsus.com> Message-ID: <20170926083045.9110240605C@ip-64-139-1-69.sjc.megapath.net> > I will audit them, but dropping this in the day before release doesn't just > seem like asking for trouble, it seems like screaming WE WANT TO EMBARRASS > OURSELVES. Then put it on the known glitch list and fix it in 1.x > And I would so regard it if this were ever required on Linux or FreeBSD. I > have had no report of this. It's required on Fedora. I set it up ages ago. I think that was the recommended way back then. It worked. Nobody raised the issue so I never removed it. It kept on working. (Or maybe the word did go out and I missed it.) Fedora 26 [murray at hgm ~]$ printenv PYTHONPATH /usr/local/lib/python2.7/site-packages [murray at hgm ~]$ ntpq --version ntpq ntpsec-0.9.7+1434 2017-09-24T02:08:48Z [murray at hgm ~]$ unset PYTHONPATH [murray at hgm ~]$ ntpq --version ntpq: can't find Python NTP library -- check PYTHONPATH. No module named ntp.control [murray at hgm ~]$ I get the same on FreeBSD. Should ntpq be able to print its version string without any libraries? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Tue Sep 26 08:39:47 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 01:39:47 -0700 Subject: When is the release? 
Message-ID: <20170926083947.BEDC040605C@ip-64-139-1-69.sjc.megapath.net> > but dropping this in the day before release I was planning to make a scan through the documentation. Is there anything else that you think would be a better use of my time? -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Sep 26 08:55:06 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 04:55:06 -0400 Subject: When is the release? In-Reply-To: <20170926083947.BEDC040605C@ip-64-139-1-69.sjc.megapath.net> References: <20170926083947.BEDC040605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170926085506.GA13669@thyrsus.com> Hal Murray : > > but dropping this in the day before release > > I was planning to make a scan through the documentation. > > Is there anything else that you think would be a better use of my time? Not offhand. That sounds like a good one. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Tue Sep 26 09:05:16 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 05:05:16 -0400 Subject: Python Library Cleanups In-Reply-To: <20170926083045.9110240605C@ip-64-139-1-69.sjc.megapath.net> References: <20170926080502.GB12360@thyrsus.com> <20170926083045.9110240605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170926090516.GB13669@thyrsus.com> Hal Murray : > > And I would so regard it if [setting PYTHONPATH] were ever required on > > Linux or FreeBSD. I have had no report of this. > > It's required on Fedora. I set it up ages ago. I think that was the > > I get the same on FreeBSD. Well, shit. That changes things. Now we have to fix this. The only question is how long to delay the release. > Should ntpq be able to print its version string without any libraries? 
That'd be nice, but one feature of the current organization is that there's a version-reporting function all the Python code can share. Patching around that problem by duplicating that function would treat a symptom. I'd rather fix the underlying problem. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Tue Sep 26 09:35:36 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 05:35:36 -0400 (EDT) Subject: All hands - we need to test Fred's build changes pronto Message-ID: <20170926093536.6FF6F13A0206@snark.thyrsus.com> I previously turned down merging Fred Wright's PYTHONPATH-elimination change as a potential destabilizer, but then heard from Hal that Fedora and FreeBSD actually *do* have the problem it addresses. (I've never seen it under Ubuntu, which is why I didn't know.) Given that, we're out of happy choices. I've merged Fred's patch set. It passes on the GitLab buildbots, which is good news. Fred is a careful and reliable person, which decreases my worries some. Still, this is a bad thing to have land on us two days before 1.0. Please, everybody get on the stick and test on every platform you can reach. We need to know that, *without* a PYTHONPATH set, 1. Build works. 2. ntpq and friends are able to see the Python libraries. 3. ntpd passes a smoke test And we need to know it fast. Release is on the 28th, in two days. Sigh. Everything was going so smoothly... Please check in on this list as you verify that your platform is OK. -- Eric S. Raymond After a shooting spree, they always want to take the guns away from the people who didn't do it. -- William S. Burroughs Conservatism is the blind and fear-filled worship of dead radicals. 
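For anyone scripting the second check in the call for testing (tools see the Python libraries *without* PYTHONPATH), a minimal sketch follows. The helper name run_without_pythonpath is hypothetical; the module name ntp.control is taken from the ntpq error message quoted earlier in this thread. It only demonstrates stripping the variable from a child process's environment, not a full smoke test.

```python
import os
import subprocess
import sys


def run_without_pythonpath(argv):
    """Run argv with PYTHONPATH removed from the environment; return (exit code, output)."""
    env = dict(os.environ)
    env.pop("PYTHONPATH", None)  # no-op if it was never set
    proc = subprocess.run(argv, env=env, stdout=subprocess.PIPE,
                          stderr=subprocess.STDOUT, universal_newlines=True)
    return proc.returncode, proc.stdout


if __name__ == "__main__":
    # Can a clean interpreter import the installed library?
    code, output = run_without_pythonpath([sys.executable, "-c", "import ntp.control"])
    print("ntp.control importable:", code == 0)
    print(output)
```

The same helper can be pointed at an installed or build-tree client, for example an argv of ["ntpq", "--version"], to confirm it starts in a clean environment.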
From hmurray at megapathdsl.net Tue Sep 26 10:03:53 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 03:03:53 -0700 Subject: All hands - we need to test Fred's build changes pronto In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 26 Sep 2017 05:35:36 EDT." <20170926093536.6FF6F13A0206@snark.thyrsus.com> Message-ID: <20170926100353.56E4740605C@ip-64-139-1-69.sjc.megapath.net> The permissions on the stuff in ntpclients had the execute bit removed so local testing doesn't work any more. I assume install fixes that since you reported that ntpq worked. In particular, tests/options-tester.sh says things like: VERSION: ntpd ntpsec-0.9.7+1444 2017-09-26T09:54:52Z VERSION: ./tests/option-tester.sh: line 40: ./ntpq: Permission denied VERSION: ./tests/option-tester.sh: line 42: ./ntpdig: Permission denied VERSION: ./tests/option-tester.sh: line 44: ./ntpmon: Permission denied -- These are my opinions. I hate spam. From esr at thyrsus.com Tue Sep 26 11:29:49 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 07:29:49 -0400 Subject: All hands - we need to test Fred's build changes pronto In-Reply-To: <20170926100353.56E4740605C@ip-64-139-1-69.sjc.megapath.net> References: <20170926093536.6FF6F13A0206@snark.thyrsus.com> <20170926100353.56E4740605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170926112949.GA18331@thyrsus.com> Hal Murray : > The permissions on the stuff in ntpclients had the execute bit removed so > local testing doesn't work any more. I assume install fixes that since you > reported that ntpq worked. 
> > In particular, tests/options-tester.sh says things like: > > VERSION: ntpd ntpsec-0.9.7+1444 2017-09-26T09:54:52Z > VERSION: ./tests/option-tester.sh: line 40: ./ntpq: Permission denied > VERSION: ./tests/option-tester.sh: line 42: ./ntpdig: Permission denied > VERSION: ./tests/option-tester.sh: line 44: ./ntpmon: Permission denied Following Fred's change, client tests now need to be run from build/main/ntpclients rather than the source directory. I have pushed a change to tests/options-tester.sh that does this; please verify. There's a small loss of convenience here that I regret, but updates to the Python service libraries already required a waf build. It's more consistent that changes to the front ends do too, and eliminates an error where you don't get updated libraries because that step was forgotten. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From jason at azze.org Tue Sep 26 13:09:05 2017 From: jason at azze.org (Jason Azze) Date: Tue, 26 Sep 2017 09:09:05 -0400 Subject: All hands - we need to test Fred's build changes pronto In-Reply-To: <20170926093536.6FF6F13A0206@snark.thyrsus.com> References: <20170926093536.6FF6F13A0206@snark.thyrsus.com> Message-ID: On Tue, Sep 26, 2017 at 5:35 AM, Eric S. Raymond via devel wrote: > Please, everybody get on the stick and test on every platform you can > reach. We need to know that, *without* a PYTHONPATH set, I tested CentOS 6.6. I used to have to export PYTHONPATH on this platform. Now all tools in build/main/ntpclients/ run without setting PYTHONPATH. OS: CentOS release 6.6 (Final) Python: Python 2.6.6 NTPsec: [root at cent66-pa18 ntpsec]# ./build/main/ntpd/ntpd --version ntpd ntpsec-0.9.7+1409 2017-09-26T07:36:31-0400 [root at cent66-pa18 ntpsec]# git show --oneline f7d063e Renove a magic link obsolesced by PYTHONPATH changes.
To test I did a fresh clone on a machine where I've never run NTPsec before. I ran the buildprep script, ./waf configure, ./waf build, ./waf install (all as root) then successfully ran the various tools in build/main/ntpclients/ (except for ntpleapfetch due to an unrelated problem with shasum. I'll open a tracker issue for it.) From ianbruene at gmail.com Tue Sep 26 16:48:16 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 26 Sep 2017 11:48:16 -0500 Subject: All hands - we need to test Fred's build changes pronto In-Reply-To: <20170926112949.GA18331@thyrsus.com> References: <20170926093536.6FF6F13A0206@snark.thyrsus.com> <20170926100353.56E4740605C@ip-64-139-1-69.sjc.megapath.net> <20170926112949.GA18331@thyrsus.com> Message-ID: Running tests on the new build changes: Build and run tests: ================= standard build (python 2.7): runs normally, uses the correct libraries from the build (this worked for me without any thingamijiggery anyway) python3 build: tests run and use the correct libraries without installation. No PYTHONPATH in use. python3.6 build: ditto Run utilities from build directory: ================= standard build (python 2.7): ntpq runs normally from build directory, uses installed libraries python3 build: fails, can't find ntp module, after installing p3 version it runs but crashes with a type conversion error (I'll get on this right away) python3.6 build: ditto, but also crashes with a "ModuleNotFoundError: No module named 'apt_pkg'" error. Run installed utilities: same results on running from the build directory -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From Stromeko at nexgo.de Tue Sep 26 17:04:24 2017 From: Stromeko at nexgo.de (Achim Gratz) Date: Tue, 26 Sep 2017 19:04:24 +0200 Subject: All hands - we need to test Fred's build changes pronto References: <20170926093536.6FF6F13A0206@snark.thyrsus.com> Message-ID: <87wp4l9ysn.fsf@Rainer.invalid> Eric S. 
Raymond via devel writes: > Please check in on this list as you verify that your platform is OK. It would be nice if waf said anything about the need to re-configure when appropriate. Regards, Achim. -- +<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+ Samples for the Waldorf Blofeld: http://Synth.Stromeko.net/Downloads.html#BlofeldSamplesExtra From ianbruene at gmail.com Tue Sep 26 17:25:46 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 26 Sep 2017 12:25:46 -0500 Subject: All hands - we need to test Fred's build changes pronto In-Reply-To: References: <20170926093536.6FF6F13A0206@snark.thyrsus.com> <20170926100353.56E4740605C@ip-64-139-1-69.sjc.megapath.net> <20170926112949.GA18331@thyrsus.com> Message-ID: <6a99b33c-dc90-97b8-3749-d44e767cf2f8@gmail.com> I pushed a fix for both of these. On 09/26/2017 11:48 AM, Ian Bruene wrote: > python3 build: fails, can't find ntp module, after installing p3 > version it runs but crashes with a type conversion error (I'll get on > this right away) > > python3.6 build: ditto, but also crashes with a "ModuleNotFoundError: > No module named 'apt_pkg'" error. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From ianbruene at gmail.com Tue Sep 26 17:40:52 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 26 Sep 2017 12:40:52 -0500 Subject: Python 3 and 1.0 Message-ID: The python 3 build appears to work. However it has a unicode bug in ntpq (but not ntpmon! Yay consistency!), and I can not say that I *trust* any of it. This is partially my fault, as I failed to test the software in Py3 as much as I should have. As an excuse I will note that I fixed several py3 bugs in the last few weeks, and part of the reason for the lack of testing has been the higher friction in getting the p3 build to work. Under these circumstances I strongly suggest that there be a note in the README to the effect that python 3 compatibility is not guaranteed. 
-- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From hmurray at megapathdsl.net Tue Sep 26 18:11:17 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 11:11:17 -0700 Subject: All hands - we need to test Fred's build changes pronto In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 26 Sep 2017 07:29:49 EDT." <20170926112949.GA18331@thyrsus.com> Message-ID: <20170926181117.8F7D240605C@ip-64-139-1-69.sjc.megapath.net> > Following Fred's change, client tests now need to be run from > build/main/ntpclients rather than the source directory. I have pushed a change to > tests/options-tester.sh that does this; please verify. Did you actually test your version? It didn't work for me. I just pushed a fix. Testing Python code is broken. It's using the system libraries rather than the new/local libraries. I hacked the installed library version of version.py to have a Q in the date rather than a Z. From tests/option-tester.sh: test-all/test.log:VERSION: ntpd ntpsec-0.9.7+1411 2017-09-26T17:45:27Z test-all/test.log:VERSION: ntpq ntpsec-0.9.7+1434 2017-09-24T02:08:48Q test-all/test.log:VERSION: ntpdig ntpsec-0.9.7+1434 2017-09-24T02:08:48Q test-all/test.log:VERSION: ntpmon ntpsec-0.9.7+1434 2017-09-24T02:08:48Q -- These are my opinions. I hate spam. From gem at rellim.com Tue Sep 26 18:15:06 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 26 Sep 2017 11:15:06 -0700 Subject: All hands - we need to test Fred's build changes pronto In-Reply-To: <20170926093536.6FF6F13A0206@snark.thyrsus.com> References: <20170926093536.6FF6F13A0206@snark.thyrsus.com> Message-ID: <20170926111506.3d1075c4@spidey.rellim.com> Yo Eric! On Tue, 26 Sep 2017 05:35:36 -0400 (EDT) "Eric S. Raymond via devel" wrote: > Please, everybody get on the stick and test on every platform you can > reach. We need to know that, *without* a PYTHONPATH set, > > 1. Build works. Sort of. > 2.
ntpq and friends are able to see the Python libraries. Yes, but the libs are in the wrong place. Putting them in /usr/lib/python2.7/site-packages violates the FHS. > 3. ntpd passes a smoke test Why? ntpd does not use PYTHONPATH? > And we need to know it fast. Release is on the 28th, in two days. I say revert now. It violates existing practice and the FHS. > Please check in on this list as you verify that your platform is OK. No FHS platform is OK with this. It plainly violates the standard. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From hmurray at megapathdsl.net Tue Sep 26 18:21:09 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 11:21:09 -0700 Subject: Name of path for installed libraries: ntp or ntpsec Message-ID: <20170926182109.B729940605C@ip-64-139-1-69.sjc.megapath.net> Our libraries currently get installed in places like: /usr/local/lib/python2.7/site-packages/ntp/version.py Should we be using ntpsec rather than ntp? -- These are my opinions. I hate spam. From ianbruene at gmail.com Tue Sep 26 18:22:55 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 26 Sep 2017 13:22:55 -0500 Subject: Name of path for installed libraries: ntp or ntpsec In-Reply-To: <20170926182109.B729940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170926182109.B729940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On 09/26/2017 01:21 PM, Hal Murray via devel wrote: > Should we be using ntpsec rather than ntp? If so, we need to know *now*, because every Python file will need its imports changed.
-- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From gem at rellim.com Tue Sep 26 18:34:40 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 26 Sep 2017 11:34:40 -0700 Subject: Name of path for installed libraries: ntp or ntpsec In-Reply-To: <20170926182109.B729940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170926182109.B729940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170926113440.6f3fc863@spidey.rellim.com> Yo Hal! On Tue, 26 Sep 2017 11:21:09 -0700 Hal Murray via devel wrote: > Our libraries currently get installed in places like: > /usr/local/lib/python2.7/site-packages/ntp/version.py > > Should we be using ntpsec rather than ntp? Sadly, the recent patch broke that. They now go here, which violates the FHS: /usr/lib/python2.7/site-packages/ntp/util.py But otherwise, I agree, I'd rather see site-packages/ntpsec RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Tue Sep 26 18:38:27 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 26 Sep 2017 11:38:27 -0700 Subject: Python Library Cleanups In-Reply-To: <7c693ecb-8760-2147-4e26-265333bf0b66@grinta.net> References: <20170925205656.0efe3608@spidey.rellim.com> <7c693ecb-8760-2147-4e26-265333bf0b66@grinta.net> Message-ID: <20170926113827.6fe98a54@spidey.rellim.com> Yo Daniele! On Mon, 25 Sep 2017 22:24:39 -0600 Daniele Nicolodi via devel wrote: > > The default gentoo PYTHONPATH only looks for system installed > > python libs.
We don't want to install NTPsec python files in the > > system reserved directories. Thus PYTHONPATH is the only > > solution. > > I sincerely hope that Gentoo is not that broken. Implementing this > misfeature would require patching the Python interpreter. Uh, not Gentoo, the FHS, which all Linux is supposed to obey. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Tue Sep 26 18:39:34 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 26 Sep 2017 11:39:34 -0700 Subject: Python Library Cleanups In-Reply-To: References: <20170925205656.0efe3608@spidey.rellim.com> Message-ID: <20170926113934.0b2403d5@spidey.rellim.com> Yo Fred! On Tue, 26 Sep 2017 00:15:04 -0700 (PDT) Fred Wright via devel wrote: > If the directory choice on gentoo is inappropriate, take that up with > whomever packaged Python for it. Not Gentoo, the FHS. And the package I am using is git head. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin
From hmurray at megapathdsl.net Tue Sep 26 19:26:37 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 12:26:37 -0700 Subject: All hands - we need to test Fred's build changes pronto In-Reply-To: Message from Hal Murray of "Tue, 26 Sep 2017 11:11:17 PDT." <20170926181117.8F7D240605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170926192637.40C7440605C@ip-64-139-1-69.sjc.megapath.net> hmurray at megapathdsl.net said: > Testing Python code is broken. It's using the system libraries rather than > the new/local libraries. Author: Eric S. Raymond Date: Tue Sep 26 13:59:10 2017 -0400 Restore accidentally removed creation of a magic ntp/ link. Fixed. Thanks. -- These are my opinions. I hate spam. From fw at fwright.net Tue Sep 26 19:58:22 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 12:58:22 -0700 (PDT) Subject: ntpsec | ntpq unable to find python libraries on macOS 10.13 (#396) In-Reply-To: References: Message-ID: On Tue, 26 Sep 2017, Eric S. Raymond wrote: > I've restored that magic link. > > It fell to an attempt to stop creating a now unneeded magic link in the > source part of the tree. I won't try to re-fix that before release. That code *used to* create magic links in the source tree, but I changed it to put them in the build tree instead. Thus, it's working as intended, and isn't a "fix later" issue. Since it doesn't create the source-tree links any more, it also doesn't remove them. If you saw them, they may have been leftovers from an earlier run. Remember, "git clean -dxf" is your friend. :-) That seemed like a sufficiently minor issue that it wasn't worth leaving the removal code in place for a while. Maybe that was the wrong choice. The details of how it makes the symlinks could be improved. Although waf has a feature for creating symlinks, it only works at install time.
It has no built-in mechanism to create them at build time. There's probably a way to make a custom builder that provides the missing capability, which could then be used instead of the post-build hook, but that's something for later. Fred Wright From esr at thyrsus.com Tue Sep 26 20:19:08 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 16:19:08 -0400 Subject: Python 3 and 1.0 In-Reply-To: References: Message-ID: <20170926201908.GA817@thyrsus.com> Ian Bruene via devel : > > The python 3 build appears to work. However it has a unicode bug in ntpq > (but not ntpmon! Yay consistency!), and I can not say that I *trust* any of > it. > > This is partially my fault, as I failed to test the software in Py3 as much > as I should have. As an excuse I will note that I fixed several py3 bugs in > the last few weeks, and part of the reason for the lack of testing has been > the higher friction in getting the p3 build to work. > > Under these circumstances I strongly suggest that there be a note in the > README to the effect that python 3 compatibility is not guaranteed. Can you get enough verification in a week? We may have to push back the release for other reasons. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From fw at fwright.net Tue Sep 26 20:25:30 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 13:25:30 -0700 (PDT) Subject: All hands - we need to test Fred's build changes pronto In-Reply-To: <20170926192637.40C7440605C@ip-64-139-1-69.sjc.megapath.net> References: <20170926192637.40C7440605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Tue, 26 Sep 2017, Eric S. Raymond via devel wrote: > Hal Murray : > > The permissions on the stuff in ntpclients had the execute bit removed so > > local testing doesn't work any more. I assume install fixes that since you > > reported that ntpq worked.
> > > > In particular, tests/options-tester.sh says things like: > > > > VERSION: ntpd ntpsec-0.9.7+1444 2017-09-26T09:54:52Z > > VERSION: ./tests/option-tester.sh: line 40: ./ntpq: Permission denied > > VERSION: ./tests/option-tester.sh: line 42: ./ntpdig: Permission denied > > VERSION: ./tests/option-tester.sh: line 44: ./ntpmon: Permission denied > > Following Fred's change, client tests now need to be run from > build/main/ntpcllients rather than the source directory. I have pushed > a change to tests/options-tester.sh that does this; please verify. Sorry, I should have caught that. I was using my own test script to run everything. My original post did mention the possibility of personal scripts possibly needed updates, but I failed to notice that there was one in the project files. > There's a small loss of convennience here that I regret, but updates to the > Python service libraries already required a waf build. It's more consistent > that changes to the front ends do too, and eliminates an error where you > don't get updated libraries because that step was forgotten. Running from the build directory would be a bit more completion-friendly if "buildprep" were renamed to something that doesn't start with "build". :-) On Tue, 26 Sep 2017, Ian Bruene via devel wrote: > python3 build: fails, can't find ntp module, after installing p3 version > it runs but crashes with a type conversion error (I'll get on this right > away) This is a bit short on detail, but you might just have been running afoul of version dependencies. For in-tree testing with pure Python libraries, one can usually switch Python versions with no issues (within whatever versions are supported). But compiled extensions are compiled and linked against a specific version of Python, and in general shouldn't be expected to work with other versions. I.e., compiled extensions can only be polyglot at the source level, not the binary level. 
Recent versions of Python have started putting the Python version in the extension filenames, so that they're simply not found when using the wrong Python version. But in other cases, it may load an extension that just doesn't work properly. I know of at least one case where it results in an app crash. For installed code, it's even worse, since the library location is almost always version-specific. So even pure Python libraries don't get found when using a different Python version than was used for the build/install. The only Python programs that don't suffer from this are ones that only use the standard Python libraries. On Tue, 26 Sep 2017, Hal Murray via devel wrote: > hmurray at megapathdsl.net said: > > Testing Python code is broken. It's using the system libraries rather than > > the new/local libraries. > > Author: Eric S. Raymond > Date: Tue Sep 26 13:59:10 2017 -0400 > > Restore accidentally removed creation of a magic ntp/ link. > > Fixed. Thanks. This illustrates the subtler problem with in-tree testing. The problem is often framed as programs not working at all when the code hasn't been installed to the system. But when an older version is installed, code relying on the system libraries may be using the older version, instead of what one is trying to test. This is the sort of scenario where using PYTHONPATH *could be* legitimate, but since it's fairly easy to set things up to work without it, that's usually a better choice. When running a Python program, Python always adds the program's own directory to sys.path, so as long as all local imports are findable relative to that directory, they'll shadow any installed versions that happen to be present.
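A minimal sketch of the search-order point above, not taken from the NTPsec tree. It builds two throwaway copies of a package called demo_pkg (a hypothetical name), one beside the script and one in a directory passed via PYTHONPATH, and reports which copy the script imports. The directory names "local" and "installed" are stand-ins for a build tree and a site-packages directory; Python 3 is assumed.

```python
import os
import subprocess
import sys
import tempfile
import textwrap


def run_shadow_demo():
    """Run a throwaway script and report which copy of demo_pkg it imported.

    Returns (script_dir_is_first, where): whether sys.path[0] was the
    script's own directory, and the tag of the copy that got imported.
    """
    with tempfile.TemporaryDirectory() as top:
        local = os.path.join(top, "local")          # stands in for the build tree
        installed = os.path.join(top, "installed")  # stands in for site-packages
        for root, tag in ((local, "local"), (installed, "installed")):
            pkg = os.path.join(root, "demo_pkg")    # hypothetical package name
            os.makedirs(pkg)
            with open(os.path.join(pkg, "__init__.py"), "w") as f:
                f.write("WHERE = %r\n" % tag)

        # The script lives next to the "local" copy of the package.
        script = os.path.join(local, "prog.py")
        with open(script, "w") as f:
            f.write(textwrap.dedent("""\
                import os, sys
                import demo_pkg
                print(os.path.realpath(sys.path[0]))
                print(demo_pkg.WHERE)
            """))

        # Point PYTHONPATH at the "installed" copy to show it still loses.
        env = dict(os.environ, PYTHONPATH=installed)
        # Newer Pythons can be told not to add the script directory at all;
        # make sure that option is off for this demonstration.
        env.pop("PYTHONSAFEPATH", None)

        out = subprocess.check_output([sys.executable, script], env=env)
        path0, where = out.decode().splitlines()
        return path0 == os.path.realpath(local), where
```

Because the script's directory comes first in sys.path, ahead of any PYTHONPATH entries, the copy next to the script should be the one imported even though PYTHONPATH points at the other.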
Fred Wright From ianbruene at gmail.com Tue Sep 26 20:27:26 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 26 Sep 2017 15:27:26 -0500 Subject: Python 3 and 1.0 In-Reply-To: <20170926201908.GA817@thyrsus.com> References: <20170926201908.GA817@thyrsus.com> Message-ID: <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> On 09/26/2017 03:19 PM, Eric S. Raymond wrote: > Can you get enough verification in a week? We may have to push back the release > for other reasons. I can hammer on it; if nothing serious shows up, it should be fine. Post-1.0 I'd like to take a systematic look at how data is flowing through the code and how it is being represented by the different versions of python. I can't tell for sure yet what is and isn't a kluge. I know for example that my fix earlier today was a band-aid due to time pressure and not fully understanding the data flow. I do *not* like not knowing how many of these there are. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From esr at thyrsus.com Tue Sep 26 20:31:49 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 16:31:49 -0400 Subject: Python 3 and 1.0 In-Reply-To: <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> References: <20170926201908.GA817@thyrsus.com> <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> Message-ID: <20170926203149.GA1447@thyrsus.com> Ian Bruene via devel : > Post-1.0 I'd like to take a systematic look at how data is flowing through > the code and how it is being represented by the different versions of > python. I can't tell for sure yet what is and isn't a kluge. I know for > example that my fix earlier today was a band-aid due to time pressure and > not fully understanding the data flow. I do *not* like not knowing how many > of these there are. That's a good thing to be wary about. -- Eric S.
Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From fw at fwright.net Tue Sep 26 21:08:56 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 14:08:56 -0700 (PDT) Subject: Python Library Cleanups Message-ID: Resending with the correct subject. On Tue, 26 Sep 2017, Eric S. Raymond wrote: > Fred Wright via devel : > > > Plus these are most likely too late for 1.0. We are in the final > > > testing phases now. > > > > I think requiring users to set PYTHONPATH to run the tools should be > > considered a show stopper for a 1.0 release. > > And I would so regard it if this were ever required on Linux or FreeBSD. > I have had no report of this. The documentation used to suggest putting a PYTHONPATH definition in one's .bashrc. That should have set off alarm bells. > > These changes are almost entirely just in the build scripts, with the only > > actual code changes (Python only) being almost entirely in the error > > reporting for not finding the libraries. > > I will audit them, but dropping this in the day before release doesn't just > seem like asking for trouble, it seems like screaming WE WANT TO EMBARRASS > OURSELVES. Sorry for this being so late, but I spent a lot of time testing the changes.
And my Ubuntu VM picked a rather inopportune time to start saying "you can't purge a package, your disk is full". :-) On Tue, 26 Sep 2017, Eric S. Raymond via devel wrote: > Hal Murray : > > > Should ntpq be able to print its version string without any libraries? > > That'd be nice, but one feature of the current organization is that there's > a version-reporting function all the Python code can share. Also, the current behavior makes for a nice simple test that the libraries can be loaded. There are a few clients that are lacking the -V option, so that doesn't work for them. This is probably worth fixing. And it might be worth fixing ntploggps to be able to report its version without needing the GPSD libraries. None of this is a 1.0 issue, of course. On Tue, 26 Sep 2017, Gary E. Miller via devel wrote: > On Tue, 26 Sep 2017 00:15:04 -0700 (PDT) > Fred Wright via devel wrote: > > > If the directory choice on gentoo is inappropriate, take that up with > > whomever packaged Python for it. > > Not Gentoo, the FHS. And the package I am using is git head. The point is that the portable way to determine where to install the libraries is to ask Python via the get_python_lib() function. Whatever it returns (as long as you don't supply the 'prefix' argument) is guaranteed to be in its initial sys.path. If that location is inappropriate, it's the fault of whomever configured and built that Python. Fred Wright From fw at fwright.net Tue Sep 26 21:20:34 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 14:20:34 -0700 (PDT) Subject: Python 3 and 1.0 In-Reply-To: <20170926203149.GA1447@thyrsus.com> References: <20170926201908.GA817@thyrsus.com> <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> <20170926203149.GA1447@thyrsus.com> Message-ID: On Tue, 26 Sep 2017, Eric S.
Raymond via devel wrote: > Ian Bruene via devel : > > > > The python 3 build appears to work. However it has a unicode bug in ntpq > > (but not ntpmon! Yay consistency!), and I can not say that I *trust* any of > > it. > > > > This is partially my fault, as I failed to test the software in Py3 as much > > as I should have. As an excuse I will note that I fixed several py3 bugs in > > the last few weeks, and part of the reason for the lack of testing has been > > the higher friction in getting the p3 build to work. I'd noticed the problem with "ntpq -p", and even made a brief attempt at fixing it, but decided I needed to focus on the more pressing issues. I also noticed that test_agentx.py doesn't work with Python 3, but my impression is that the agentx stuff is still a WIP, anyway. BTW, all the tests fail on FreeBSD, due to an undefined reference in jigs.py. On Tue, 26 Sep 2017, Eric S. Raymond via devel wrote: > Ian Bruene via devel : > > Post-1.0 I'd like to take a systematic look at how data is flowing through > > the code and how it is being represented by the different versions of > > python. I can't tell for sure yet what is and isn't a kluge. I know for > > example that my fix earlier today was a band-aid due to time pressure and > > not fully understanding the data flow. I do *not* like not knowing how many > > of these there are. > > That's a good thing to be wary about. Indeed. When I started looking at the ntpq bug, I noticed that there seemed to be some inconsistencies in whether 'response' was expected to be str or bytes. It doesn't matter in Python 2, of course. But tossing in enough polystr() and polybytes() calls to make the exceptions go away isn't necessarily the best approach to making reliable code. :-) Fred Wright From gem at rellim.com Tue Sep 26 21:24:51 2017 From: gem at rellim.com (Gary E.
Miller) Date: Tue, 26 Sep 2017 14:24:51 -0700 Subject: Python Library Cleanups In-Reply-To: References: Message-ID: <20170926142451.319fa446@spidey.rellim.com> Yo Fred! On Tue, 26 Sep 2017 14:08:56 -0700 (PDT) Fred Wright via devel wrote: > Resending with the correct subject. > > On Tue, 26 Sep 2017, Eric S. Raymond wrote: > > Fred Wright via devel : > > > > Plus these are most likely too late for 1.0. We are in the > > > > final testing phases now. > > > > > > I think requiring users to set PYTHONPATH to run the tools should > > > be considered a show stopper for a 1.0 release. > > > > And I would so regard it if this were ever required on Linux or > > FreeBSD. I have had no report of this. > > The documentation used to suggest putting a PYTHONPATH definition in > one's .bashrc. That should have set off alarm bells. That is how it is SUPPOSED to work for user installed source files. Locally installed user source is supposed to go in /usr/local/. System packages are supposed to go in /usr/. Not obeying the 30 year old convention of which to pick is bad. But the new patches put some in /usr and some in /usr/local/, which is just wrong. This is all per the FHS. There may be other ways around the PYTHONPATH issues, but breaking the FHS is a bad one. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Tue Sep 26 21:59:07 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Tue, 26 Sep 2017 17:59:07 -0400 Subject: Python Library Cleanups In-Reply-To: References: Message-ID: <20170926215907.GB3083@thyrsus.com> Fred Wright via devel : > > Not Gentoo, the FHS. And the package I am using is git head. > > The point is that the portable way to determine where to install the > libraries is to ask Python via the get_python_lib() function. Whatever it > returns (as long as you don't supply the 'prefix' argument) is guaranteed > to be in its initial sys.path. If that location is inappropriate, it's > the fault of whomever configured and built that Python. Does this mean the FHS nonconformance is an upstream bug? -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Tue Sep 26 22:02:33 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 18:02:33 -0400 Subject: Python 3 and 1.0 In-Reply-To: References: <20170926201908.GA817@thyrsus.com> <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> <20170926203149.GA1447@thyrsus.com> Message-ID: <20170926220233.GC3083@thyrsus.com> Fred Wright via devel : > BTW, all the tests fail on FreeBSD, due to an undefined reference in > jigs.py. Huh? If so, why has this not shown up in the results from the FreeBSD buildbot. > Indeed. When I started looking at the ntpq bug, I noticed that there > seemed to be some inconsistencies in whether 'response' was expected to be > str or bytes. It doesn't matter in Python 2, of course. But tossing in > enough polystr() and polybytes() calls to make the exceptions go away > isn't necessarily the best approach to making reliable code. :-) Unfortunately it's about the only recourse we have. See http://www.catb.org/esr/faqs/practical-python-porting/ for detailed discussion. -- Eric S.
Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Tue Sep 26 22:09:06 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 18:09:06 -0400 Subject: Python Library Cleanups In-Reply-To: <20170926142451.319fa446@spidey.rellim.com> References: <20170926142451.319fa446@spidey.rellim.com> Message-ID: <20170926220906.GD3083@thyrsus.com> Gary E. Miller via devel : > Locally installed user source is supposed to go in /usr/local/. > > System packages are supposed to go in /usr/. > > Not obeying the 30 year old convention of which to pick is bad. But > the new patches put some in /usr and some in /usr/local/, which is just > wrong. > > This is all per the FHS. > > There may be other ways around the PYTHONPATH issues, but breaking the > FHS is a bad one. Gary, nobody broke FHS conformance deliberately and everybody wants conformance fixed, so please don't be truculent about this. Based on having looked at the code, I'm at this point leaning towards the theory that *Python* is getting this wrong and Fred's patches unmasked an upstream bug rather than creating one in our build. The first thing we need to determine is whether that's true. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From fw at fwright.net Tue Sep 26 22:16:50 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 15:16:50 -0700 (PDT) Subject: Python Library Cleanups In-Reply-To: <20170926215907.GB3083@thyrsus.com> References: <20170926215907.GB3083@thyrsus.com> Message-ID: On Tue, 26 Sep 2017, Eric S.
Raymond wrote: > Fred Wright via devel : > > > Not Gentoo, the FHS. And the package I am using is git head. > > > > The point is that the portable way to determine where to install the > > libraries is to ask Python via the get_python_lib() function. Whatever it > > returns (as long as you don't supply the 'prefix' argument) is guaranteed > > to be in its initial sys.path. If that location is inappropriate, it's > > the fault of whomever configured and built that Python. > > Does this mean the FHS nonconformance is an upstream bug? Exactly. And one that nobody bothered to complain about for GPSD. BTW, I have a tool that lists a few things including the library paths for every version of Python it can find on the system. I can submit it to devel/ if you like. Fred Wright From hmurray at megapathdsl.net Tue Sep 26 22:18:00 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 15:18:00 -0700 Subject: Python 3 and 1.0 In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 26 Sep 2017 18:02:33 EDT." <20170926220233.GC3083@thyrsus.com> Message-ID: <20170926221800.D876E40605C@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: >> BTW, all the tests fail on FreeBSD, due to an undefined >> reference in jigs.py. > Huh? If so, why has this not shown up in the results from the FreeBSD > buildbot. It works on my FreeBSD setup. -- These are my opinions. I hate spam. From fw at fwright.net Tue Sep 26 22:28:12 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 15:28:12 -0700 (PDT) Subject: Python 3 and 1.0 In-Reply-To: <20170926220233.GC3083@thyrsus.com> References: <20170926201908.GA817@thyrsus.com> <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> <20170926203149.GA1447@thyrsus.com> <20170926220233.GC3083@thyrsus.com> Message-ID: On Tue, 26 Sep 2017, Eric S. Raymond wrote: > Fred Wright via devel : > > BTW, all the tests fail on FreeBSD, due to an undefined reference in > > jigs.py. > > Huh?
If so, why has this not shown up in the results from the FreeBSD buildbot. I don't know, but what I see here is this: $ cd ntpsec/ $ uname -a FreeBSD MacFree 10.3-RELEASE-p20 FreeBSD 10.3-RELEASE-p20 #0: Wed Jul 12 03:13:07 UTC 2017 root at amd64-builder.daemonology.net:/usr/obj/usr/src/sys/GENERIC amd64 $ python -V Python 2.7.13 $ ./build/main/tests/pylib/test_packet.py Traceback (most recent call last): File "./build/main/tests/pylib/test_packet.py", line 15, in <module> import jigs File "/usr/home/fw/ntpsec/build/main/tests/pylib/jigs.py", line 175, in <module> class SocketModuleJig: File "/usr/home/fw/ntpsec/build/main/tests/pylib/jigs.py", line 185, in SocketModuleJig EAI_NODATA = socket.EAI_NODATA AttributeError: 'module' object has no attribute 'EAI_NODATA' I have no idea whether this affects anything real or whether it's just a test artifact. If the latter, it's clearly not a release blocker. BTW, when I said "all the tests", that's not quite true. I exclude the agentx test due to its Python 3 issues, but it's not affected by this particular bug. I only see this on FreeBSD. The tests work on OpenBSD, and the build doesn't work on NetBSD due to the 6.1.5 issues. > > Indeed. When I started looking at the ntpq bug, I noticed that there > > seemed to be some inconsistencies in whether 'response' was expected to be > > str or bytes. It doesn't matter in Python 2, of course. But tossing in > > enough polystr() and polybytes() calls to make the exceptions go away > > isn't necessarily the best approach to making reliable code. :-) > > Unfortunately it's about the only recourse we have. See > > http://www.catb.org/esr/faqs/practical-python-porting/ > > for detailed discussion. Yeah, I'm familiar with that document. :-) The point is that one should try to understand what's going on and type things consistently, rather than just randomly converting all over the place. And since the Python ntpq is a *new* program, it has less excuse for getting this wrong.
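A minimal sketch of the "type things consistently" point, for illustration only: decode once where data arrives from the socket, work with str internally, and encode once on the way out. The helper names `to_text`, `to_bytes` and `parse_response` are hypothetical and are not NTPsec's polystr()/polybytes(); the Latin-1 encoding is an assumption, chosen here because it maps every byte value to a code point and back without loss.

```python
# Hypothetical helpers for illustration only; these are NOT the
# polystr()/polybytes() functions from the NTPsec tree.
ENCODING = "latin-1"  # assumption: lossless round-trip for arbitrary bytes


def to_text(data):
    """Return text, decoding only if we were handed bytes."""
    if isinstance(data, bytes):
        return data.decode(ENCODING)
    return data


def to_bytes(data):
    """Return bytes, encoding only if we were handed text."""
    if isinstance(data, bytes):
        return data
    return data.encode(ENCODING)


def parse_response(raw):
    """Convert once at the boundary, then work purely with text."""
    text = to_text(raw)
    fields = {}
    # Everything from here on can assume text, so no further
    # conversions are scattered through the parsing logic.
    for item in text.split(","):
        if "=" in item:
            name, value = item.split("=", 1)
            fields[name.strip()] = value.strip()
    return fields
```

With this shape the conversion calls live at the two edges of the program, so a str/bytes mismatch elsewhere points at a real logic error rather than a missing conversion.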
Fred Wright From fw at fwright.net Tue Sep 26 22:32:30 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 15:32:30 -0700 (PDT) Subject: Python 3 and 1.0 In-Reply-To: <20170926221800.D876E40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170926221800.D876E40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Tue, 26 Sep 2017, Hal Murray via devel wrote: > devel at ntpsec.org said: > >> BTW, all the tests fail on FreeBSD, due to an undefined > >> reference in jigs.py. > > Huh? If so, why has this not shown up in the results from the FreeBSD > > buildbot. > > It works on my FreeBSD setup. If that's also FreeBSD 10.3 and Python 2.7.13, then it must be something weird in my install, and nothing to worry about in general. Fred Wright From gem at rellim.com Tue Sep 26 22:36:50 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 26 Sep 2017 15:36:50 -0700 Subject: Python Library Cleanups In-Reply-To: References: <20170926215907.GB3083@thyrsus.com> Message-ID: <20170926153650.1a3aaf50@spidey.rellim.com> Yo Fred! On Tue, 26 Sep 2017 15:16:50 -0700 (PDT) Fred Wright via devel wrote: > On Tue, 26 Sep 2017, Eric S. Raymond wrote: > > Fred Wright via devel : > > > > Not Gentoo, the FHS. And the package I am using is git head. > > > > > > The point is that the portable way to determine where to install > > > the libraries is to ask Python via the get_python_lib() > > > function. Whatever it returns (as long as you don't supply the > > > 'prefix' argument) is guaranteed to be in its initial sys.path. > > > If that location is inappropriate, it's the fault of whomever > > > configured and built that Python. > > > > Does this mean the FHS nonconformance is an upstream bug? > > Exactly. And one that nobody bothered to complain about for GPSD. Good catch. gpsd also breaks the FHS. I'm glad you point that out. I'm looking for when that got broken right now. RGDS GARY --------------------------------------------------------------------------- Gary E.
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From ianbruene at gmail.com Tue Sep 26 22:37:31 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 26 Sep 2017 17:37:31 -0500 Subject: Python 3 and 1.0 In-Reply-To: References: <20170926201908.GA817@thyrsus.com> <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> <20170926203149.GA1447@thyrsus.com> Message-ID: <7955cc1b-1905-d07b-f184-5c3d55bcfb7e@gmail.com> On 09/26/2017 04:20 PM, Fred Wright via devel wrote: > I also noticed that test_agentx.py doesn't work with Python 3, but my > impression is that the agentx stuff is still a WIP, anyway. This is true. > Indeed. When I started looking at the ntpq bug, I noticed that there > seemed to be some inconsistencies in whether 'response' was expected to be > str or bytes. It doesn't matter in Python 2, of course. But tossing in > enough polystr() and polybytes() calls to make the exceptions go away > isn't necessarily the best approach to making reliable code. :-) Right. Hence why I need to figure out the best way to make this work /cleanly/. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From hmurray at megapathdsl.net Tue Sep 26 22:39:47 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 15:39:47 -0700 Subject: Python 3 and 1.0 In-Reply-To: Message from Fred Wright via devel of "Tue, 26 Sep 2017 15:32:30 PDT." Message-ID: <20170926223947.8AC7C40605C@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: > If that's also FreeBSD 10.3 and Python 2.7.13, then it must be something > weird in my install, and nothing to worry about in general.
I've tested 10.3 and 11.0 on 64 bit Intel, 11.0 on 32 bit Intel, and 11.0 on ARM (Pi, and BBB) Your earlier message said: FreeBSD MacFree 10.3-RELEASE-p20 FreeBSD I don't recognize "MacFree". I assume it's a glitch in that build. -- These are my opinions. I hate spam. From ianbruene at gmail.com Tue Sep 26 22:42:49 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 26 Sep 2017 17:42:49 -0500 Subject: Python 3 and 1.0 In-Reply-To: References: <20170926201908.GA817@thyrsus.com> <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> <20170926203149.GA1447@thyrsus.com> <20170926220233.GC3083@thyrsus.com> Message-ID: On 09/26/2017 05:28 PM, Fred Wright via devel wrote: > [error snipped] > I have no idea whether this affects anything real or whether it's just a > test artifact. If the latter, it's clearly not a release blocker. This error does not affect the program itself. It is part of a test jig which is spliced into the code when necessary. > Yeah, I'm familiar with that document. :-) The point is that one should > try to understand what's going on and type things consistently, rather > than just randomly converting all over the place. And since the Python > ntpq is a *new* program, it has less excuse for getting this wrong. Well there it gets tricky... ntpq is not strictly a new program: it was translated from C, and bears the scars of that. As the tests get better it will be easier to refactor. Also, these problems are concentrated not in ntpq, but in packet.py. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys.
-- Andrew Ryan From fw at fwright.net Tue Sep 26 22:43:56 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 15:43:56 -0700 (PDT) Subject: Python 3 and 1.0 In-Reply-To: <20170926223947.8AC7C40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170926223947.8AC7C40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Tue, 26 Sep 2017, Hal Murray wrote: > devel at ntpsec.org said: > > If that's also FreeBSD 10.3 and Python 2.7.13, then it must be something > > weird in my install, and nothing to worry about in general. > > I've tested 10.3 and 11.0 on 64 bit Intel, 11.0 on 32 bit Intel, and 11.0 on > ARM (Pi, and BBB) > > Your earlier message said: > FreeBSD MacFree 10.3-RELEASE-p20 FreeBSD > > I don't recognize "MacFree". I assume it's a glitch in that build. No, it's because "uname -a" includes the hostname. Fred Wright From ianbruene at gmail.com Tue Sep 26 22:44:38 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 26 Sep 2017 17:44:38 -0500 Subject: Python 3 and 1.0 In-Reply-To: <20170926220233.GC3083@thyrsus.com> References: <20170926201908.GA817@thyrsus.com> <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> <20170926203149.GA1447@thyrsus.com> <20170926220233.GC3083@thyrsus.com> Message-ID: <6efba396-140d-4f6f-ef0e-6483b33cf667@gmail.com> On 09/26/2017 05:02 PM, Eric S. Raymond via devel wrote: > Huh? If so, why has this not shown up in the results from the FreeBSD buildbot. Two reasons: 1. python tests still not run by the build script 2. subsequent reports are inconsistent on whether FreeBSD has a problem or Fred's system is wonky. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. 
-- Andrew Ryan From fw at fwright.net Tue Sep 26 22:50:50 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 15:50:50 -0700 (PDT) Subject: Python 3 and 1.0 In-Reply-To: <6efba396-140d-4f6f-ef0e-6483b33cf667@gmail.com> References: <20170926201908.GA817@thyrsus.com> <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> <20170926203149.GA1447@thyrsus.com> <20170926220233.GC3083@thyrsus.com> <6efba396-140d-4f6f-ef0e-6483b33cf667@gmail.com> Message-ID: On Tue, 26 Sep 2017, Ian Bruene via devel wrote: > On 09/26/2017 05:02 PM, Eric S. Raymond via devel wrote: > > Huh? If so, why has this not shown up in the results from the FreeBSD buildbot. > > Two reasons: > > 1. python tests still not run by the build script Ah, yes. Two of the tests were run by the build script at one time, but that was commented out when the tests didn't work in the in-tree environment. Although that's fixed now, I didn't bother to uncomment them, because I think it would be better to work them in with waf's test runner rather than just throw in a couple of shell commands. But there are some things that need to be figured out to make Python-based tests work in the waf test framework. > 2. subsequent reports are inconsistent on whether FreeBSD has a problem > or Fred's system is wonky. Yes. But not a release blocker either way. Fred Wright From gem at rellim.com Tue Sep 26 22:53:44 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 26 Sep 2017 15:53:44 -0700 Subject: Python 3 and 1.0 In-Reply-To: References: Message-ID: <20170926155344.1f261d2a@spidey.rellim.com> Yo Ian! On Tue, 26 Sep 2017 12:40:52 -0500 Ian Bruene via devel wrote: > Under these circumstances I strongly suggest that there be a note in > the README to the effect that python 3 compatibility is not > guaranteed. I have been occasionally testing on Python 3.4. Gentoo is still on Python 2.7, so mostly I stay on 2.7. I'm also waiting for the rest of Python 2.7 to get forward ported to Python 3.x.
:-) I just did some quick tests, and I see only one issue. Do you have any specific issues worth looking at? This does not work well with Python 3.4: ntpq -u -p I'll push a patch shortly. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Tue Sep 26 22:55:31 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 18:55:31 -0400 Subject: Python Library Cleanups In-Reply-To: References: <20170926215907.GB3083@thyrsus.com> Message-ID: <20170926225531.GA3712@thyrsus.com> Fred Wright via devel : > > On Tue, 26 Sep 2017, Eric S. Raymond wrote: > > Fred Wright via devel : > > > > Not Gentoo, the FHS. And the package I am using is git head. > > > > > > The point is that the portable way to determine where to install the > > > libraries is to ask Python via the get_python_lib() function. Whatever it > > > returns (as long as you don't supply the 'prefix' argument) is guaranteed > > > to be in its initial sys.path. If that location is inappropriate, it's > > > the fault of whomever configured and built that Python. > > > > Does this mean the FHS nonconformance is an upstream bug? > > Exactly. And one that nobody bothered to complain about for GPSD. Well, *that* changes things a lot. OK, here's the new plan. It goes into effect unless Gary replies with a good argument that the bug is ours and not Python's. 1. We keep Fred's recent MR. 2. Gary files a bug upstream to the Python devs detailing how get_python_lib() is implicated in FHS nonconformance. Gary, you willing? 3.
We ship with a warning that the default installation is not FHS-conformant due to an upstream bug, for which we give the issue-tracker reference that we got from step 2. 4. Release gets slipped by some days so we can have everything properly tested at ship time. > BTW, I have a tool that lists a few things including the library paths for > every version of Python it can find on the system. I can submit it to > devel/ if you like. Please do. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From gem at rellim.com Tue Sep 26 23:01:26 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 26 Sep 2017 16:01:26 -0700 Subject: Python Library Cleanups In-Reply-To: <20170926225531.GA3712@thyrsus.com> References: <20170926215907.GB3083@thyrsus.com> <20170926225531.GA3712@thyrsus.com> Message-ID: <20170926160126.6fada0fa@spidey.rellim.com> Yo Eric! On Tue, 26 Sep 2017 18:55:31 -0400 "Eric S. Raymond via devel" wrote: > > > Does this mean the FHS nonconformance is an upstream bug? > > > > Exactly. And one that nobody bothered to complain about for GPSD. > > Well, *that* changes things a lot. > > 2. Gary files a bug upstream to the Python devs detailing how > get_python_lib() is implicated in FHS nonconformance. Gary, you > willing? I'm feeling like a broken record. The current behavior is a feature, not a bug. You know I love to bash Python, but in this case they got it right. If the user is installing code from source, then it should not be executable by default. For the same reason /usr/local/bin is not in the standard PATH, the /usr/local/lib/pythonx.x/ is not in the PYTHONPATH. If you are gonna file a bug on PYTHONPATH, you gotta file one on PATH. Just don't put my name on it. And I still feel there is a middle ground here that might work, but it is not immediately obvious.
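The disagreement above turns on two separate questions: which directory Python reports as its default library install location, and whether that directory is on the interpreter's initial sys.path. A rough way to check both on a given system might look like the sketch below. It assumes nothing about NTPsec's build; `default_site_dir` and `report_install_dir` are names made up for this illustration, and the fallback to the `sysconfig` module is an assumption added because `distutils` is absent from newer Python releases.

```python
import os
import sys


def default_site_dir():
    """Directory Python reports for pure-Python library installs."""
    try:
        # The function named in this thread, called without 'prefix'.
        from distutils.sysconfig import get_python_lib
        return get_python_lib()
    except ImportError:
        # Assumption: distutils is unavailable, so ask sysconfig instead.
        import sysconfig
        return sysconfig.get_paths()["purelib"]


def report_install_dir():
    """Hypothetical helper: (directory, on_sys_path, under_usr_local)."""
    target = os.path.realpath(default_site_dir())
    # Skip empty sys.path entries, which stand for the current directory.
    search = [os.path.realpath(p) for p in sys.path if p]
    under_local = target.startswith("/usr/local/")
    return target, target in search, under_local


if __name__ == "__main__":
    print("%s on_sys_path=%s under_usr_local=%s" % report_install_dir())
```

Running this under each installed interpreter shows, per system, whether the reported directory is importable by default and whether it sits under /usr or /usr/local, which is the fact the thread is arguing about.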
RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Tue Sep 26 23:03:40 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 19:03:40 -0400 Subject: Python 3 and 1.0 In-Reply-To: <6efba396-140d-4f6f-ef0e-6483b33cf667@gmail.com> References: <20170926201908.GA817@thyrsus.com> <6fb60ab8-3169-adc9-0dd4-4c389f90ec99@gmail.com> <20170926203149.GA1447@thyrsus.com> <20170926220233.GC3083@thyrsus.com> <6efba396-140d-4f6f-ef0e-6483b33cf667@gmail.com> Message-ID: <20170926230340.GB3712@thyrsus.com> Ian Bruene via devel : > > > On 09/26/2017 05:02 PM, Eric S. Raymond via devel wrote: > >Huh? If so, why has this not shown up in the results from the FreeBSD buildbot. > > Two reasons: > > 1. python tests still not run by the build script > > 2. subsequent reports are inconsistent on whether FreeBSD has a problem or > Fred's system is wonky. Ouch. Those are good reasons. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From ianbruene at gmail.com Tue Sep 26 23:03:45 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Tue, 26 Sep 2017 18:03:45 -0500 Subject: Python 3 and 1.0 In-Reply-To: <20170926155344.1f261d2a@spidey.rellim.com> References: <20170926155344.1f261d2a@spidey.rellim.com> Message-ID: <3a461e9e-f322-ecc1-3ed5-d6e12667faef@gmail.com> On 09/26/2017 05:53 PM, Gary E. Miller via devel wrote: > Yo Ian!
> > On Tue, 26 Sep 2017 12:40:52 -0500 > Ian Bruene via devel wrote: > >> Under these circumstances I strongly suggest that there be a note in >> the README to the effect that python 3 compatibility is not >> guaranteed. > I have been occasionally testing on Python 3.4. Gentoo is still on Python > 2.7, so mostly I stay on 2.7. I'm also waiting for the rest of > Python 2.7 to get forward ported to Python 3.x. :-) Whoop! That is better news than I had hoped. The quirks with the module installation have always made me less than confident with anything on the Py3 side of things. And the lack of bug reports could have just as easily been no one using it as there not being a problem. Hindsight: should have asked if anyone was using it. > I just did some quick tests, and I see only one issue. Do you have any > specific issues worth looking at? ntpq run in Py3 or 3.6 does not like the unicode in the units display. ntpmon same version is just fine with it. Beyond that not specific, I just fixed a bunch of tests to work with Py3 and uncovered bugs in the process, then today found a bug the tests didn't catch. Simply less trust in the code than I'd like. > This does not work well with Python 3.4: > > ntpq -u -p > > I'll push a patch shortly. Ack. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From gem at rellim.com Tue Sep 26 23:15:06 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 26 Sep 2017 16:15:06 -0700 Subject: Python 3 and 1.0 In-Reply-To: <3a461e9e-f322-ecc1-3ed5-d6e12667faef@gmail.com> References: <20170926155344.1f261d2a@spidey.rellim.com> <3a461e9e-f322-ecc1-3ed5-d6e12667faef@gmail.com> Message-ID: <20170926161506.0fad8136@spidey.rellim.com> Yo Ian! On Tue, 26 Sep 2017 18:03:45 -0500 Ian Bruene via devel wrote: > The quirks with the > module installation have always made me less than confident with > anything on the Py3 side of things. What quirks? It was working fine for me. 
> Hindsight: should have asked if anyone was using it. You can only trust your own tests. :-) > > I just did some quick tests, and I see only one issue. Do you have > > any specific issues worth looking at? > > ntpq run in Py3 or 3.6 does not like the unicode in the units > display. ntpmon same version is just fine with it. Working on it. I'm sure I broke it. I also found ntpleapfetch hangs forever on Python 3.4. > Simply less trust in the code than I'd like. Yeah, always good to test stable code for a few weeks before release. Spend some time reading how Linus does his releases. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From fw at fwright.net Tue Sep 26 23:22:51 2017 From: fw at fwright.net (Fred Wright) Date: Tue, 26 Sep 2017 16:22:51 -0700 (PDT) Subject: Python Library Cleanups In-Reply-To: <20170926225531.GA3712@thyrsus.com> References: <20170926215907.GB3083@thyrsus.com> <20170926225531.GA3712@thyrsus.com> Message-ID: On Tue, 26 Sep 2017, Eric S. Raymond wrote: > Fred Wright via devel : > > > BTW, I have a tool that lists a few things including the library paths for > > every version of Python it can find on the system. I can submit it to > > devel/ if you like. > > Please do. MR submitted. Fred Wright From esr at thyrsus.com Tue Sep 26 23:30:02 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Tue, 26 Sep 2017 19:30:02 -0400 Subject: Python Library Cleanups In-Reply-To: <20170926160126.6fada0fa@spidey.rellim.com> References: <20170926215907.GB3083@thyrsus.com> <20170926225531.GA3712@thyrsus.com> <20170926160126.6fada0fa@spidey.rellim.com> Message-ID: <20170926233002.GD3712@thyrsus.com> Gary E. Miller via devel : > > 2. Gary files a bug upstream to the Python devs detailing how > > get_python_lib() is implicated in FHS nonconformance. Gary, you > > willing? > > I'm feeling like a broken record. The current behavior is a feature, > not a bug. You know I love to bash Python, but in this case they > got it right. > > If the user is installing code from source, then it should not > be executable by default. For the same reason /usr/local/bin is > not in the standard PATH, the /usr/local/lib/pythonx.x/ is not > in the PYTHONPATH. > > If you are gonna file a bug on PYTHONPATH, you gotta file one on > PATH. Just don't put my name on it. > > And I still feel there is a middle ground here that might work, but it > is not immediately obvious. Are you sure we're still talking about the same problem? At this point it looks very much as though: 1. Our code was accidentally FHS-correct, but not doing what it should, which is calling get_python_lib(). 2. Fred's patch changed it to do the right thing, call get_python_lib()... 3. ...which unmasked an upstream Python bug breaking FHS conformance. Does this match your understanding? -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From esr at thyrsus.com Tue Sep 26 23:31:34 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Tue, 26 Sep 2017 19:31:34 -0400 Subject: Python Library Cleanups In-Reply-To: References: <20170926215907.GB3083@thyrsus.com> <20170926225531.GA3712@thyrsus.com> Message-ID: <20170926233134.GE3712@thyrsus.com> Fred Wright via devel : > > On Tue, 26 Sep 2017, Eric S. Raymond wrote: > > Fred Wright via devel : > > > > > BTW, I have a tool that lists a few things including the library paths for > > > every version of Python it can find on the system. I can submit it to > > > devel/ if you like. > > > > Please do. > > MR submitted. I'll merge when the pipeline completes. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From hmurray at megapathdsl.net Wed Sep 27 01:12:43 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 18:12:43 -0700 Subject: Python 3 and 1.0 In-Reply-To: Message from Ian Bruene via devel of "Tue, 26 Sep 2017 17:44:38 CDT." <6efba396-140d-4f6f-ef0e-6483b33cf667@gmail.com> Message-ID: <20170927011244.0A84940605C@ip-64-139-1-69.sjc.megapath.net> > 1. python tests still not run by the build script What do I type to run them? I have a handy script which is a good place for things like that. -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Wed Sep 27 01:15:49 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 18:15:49 -0700 Subject: Releasing, testing Message-ID: <20170927011549.5800440605C@ip-64-139-1-69.sjc.megapath.net> > Spend some time reading how Linus does his releases. Got a suggested URL? -- These are my opinions. I hate spam. From gem at rellim.com Wed Sep 27 01:35:28 2017 From: gem at rellim.com (Gary E. 
Miller) Date: Tue, 26 Sep 2017 18:35:28 -0700 Subject: Releasing, testing In-Reply-To: <20170927011549.5800440605C@ip-64-139-1-69.sjc.megapath.net> References: <20170927011549.5800440605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170926183528.4fc9cb6b@spidey.rellim.com> Yo Hal! On Tue, 26 Sep 2017 18:15:49 -0700 Hal Murray wrote: > > Spend some time reading how Linus does his releases. > > Got a suggested URL? I used to follow LKML, but that got impossible for me. LWN.net does a fair job keeping up with the major twists and turns in the daily soap opera that is the Linux Kernel. The Wikipedia article gives a bit of an overview: https://en.wikipedia.org/wiki/Linux_kernel#Development_model But that has not been updated in a long time. In a horribly abbreviated form: Linus accepts enhancements for 2 weeks, every 3 months (the merge window). Then, mostly, does weekly patch releases. After the merge, it is almost exclusively fixes. Bigger and less important patches at first, then working down the size of the patches, and only taking the most critical patches, as the release nears. Maybe we should ask Linus to start a blog? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Wed Sep 27 02:04:53 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 22:04:53 -0400 (EDT) Subject: Our last-minute mess Message-ID: <20170927020453.5F75C13A0206@snark.thyrsus.com> Some clarity is emerging about the build problems addressed in Fred Wright's patch from last night.
He originally wrote: 1) Adds a workaround for waf's broken code that sets up PYTHONDIR and PYTHONARCHDIR. The code is conceptually broken on all platforms, though it may produce correct results by coincidence on some platforms. See the comments in wafhelpers/fix_python_config.py for more details. Once his patch was applied, however, it broke FHS conformance in the install paths. On further investigation, it appears that: (a) Fred's code is correct; a function get_python_lib() in Python's distutils that it relies on is broken, resulting in FHS violation. (b) This exact same bug, with the same result of FHS nonconformance, has affected GPSD's build since 2013 - and nobody noticed. Not Gary or Hal or me, and nobody in our userbase either. FHS summarized here: https://en.wikipedia.org/wiki/Filesystem_Hierarchy_Standard So, the first thing for everyone to know is that we have slack to fix this. ICEI has not made any commitments or representations about our release date, other than we absolutely have to ship by 28 Oct for fundraising reasons. I don't think we should need nearly that much time. I think we have three possible alternatives at this point: 1. Revert Fred's patch - live with the fact that choice of installation paths is broken, though coincidentally right on some platforms. 2. Keep Fred's patch. Ship 1.0 with FHS non-conformance as a known and documented bug. 3. Hold 1.0 until we can write a replacement get_python_lib() that works right (e.g. produces an FHS-conformant path set by default.) I'd like to hear from the senior devs (and anyone else with something intelligent to say!) on this. I have two relevant opinions: (1) I think we dodged a support-cost bullet by not shipping the code in its previous state. Random failures that have to be patched around by setting an environment variable...that has "support death by a thousand cuts" written all over it. I should have been paying closer attention to this.
(2) I am, frankly, considering FHS conformance much less urgent than I did 24 hours ago. GPSD has been nominally broken that way for four years (that's 7, possibly 8 releases) and *nobody noticed*. Still good to do, if only to set an example, but apparently not pressing. Floor is open for debate. But let's try not to wrangle too long about this; I really don't want to use up more than a week of our slack, and we need another cooldown period. -- Eric S. Raymond The United States is in no way founded upon the Christian religion -- George Washington & John Adams, in a diplomatic message to Malta. From gem at rellim.com Wed Sep 27 02:27:13 2017 From: gem at rellim.com (Gary E. Miller) Date: Tue, 26 Sep 2017 19:27:13 -0700 Subject: Our last-minute mess In-Reply-To: <20170927020453.5F75C13A0206@snark.thyrsus.com> References: <20170927020453.5F75C13A0206@snark.thyrsus.com> Message-ID: <20170926192713.6f3bc975@spidey.rellim.com> Yo Eric! On Tue, 26 Sep 2017 22:04:53 -0400 (EDT) "Eric S. Raymond via devel" wrote: > Once his patch was applied, however, it broke FHS conformance in the > install paths. On further investigation, it appears that: I'm not gonna agree, but that does not matter. Until we agree on the desired install locations we can not decide on how to get there. > I think we have three possible alternatives at this point: > 1. Revert Fred's patch Works for me. > 2. Keep Fred's patch. Ship 1.0 with FHS non-conformance as a known > and documented bug. Gack. Opening a tech support nightmare. We constantly have issues with conflicting system installed and user installed ntpd. It will be a lot of fun when the distro updates ntpd and breaks the user installed ntpd. That was not a problem before this patch. > 3. Hold 1.0 until we can write a replacement get_python_lib() that > works right (e.g. produces an FHS-conformant path set by default.) Before someone rewrites get_python_lib() we better agree on what it should do.
I suspect the solution will require much more than just changes to get_python_lib(). > (2) I am, frankly, considering FHS conformance much less urgent than I > did 24 hours ago. Gack. So, OK to break 30 year old UNIX guarantees? Do we really have to go over all the bad things that breaking FHS compliance brings? I would consider non-FHS compliance an instant flunk on a security audit. A lot of smart people have worked on that for a long time, with good reason. And before you break FHS, talk to some of our maintainers to see what THEY think of breaking FHS? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From Matthew.Selsky at twosigma.com Wed Sep 27 02:46:22 2017 From: Matthew.Selsky at twosigma.com (Matthew Selsky) Date: Tue, 26 Sep 2017 22:46:22 -0400 Subject: Python Library Cleanups In-Reply-To: <20170926233002.GD3712@thyrsus.com> References: <20170926215907.GB3083@thyrsus.com> <20170926225531.GA3712@thyrsus.com> <20170926160126.6fada0fa@spidey.rellim.com> <20170926233002.GD3712@thyrsus.com> Message-ID: <20170927024622.GC16340@twosigma.com> On Tue, Sep 26, 2017 at 07:30:02PM -0400, Eric S. Raymond via devel wrote: > Gary E. Miller via devel : > > > 2. Gary files a bug upstream to the Python devs detailing how > > > get_python_lib() is implicated in FHS nonconformance. Gary, you > > > willing? > > > > I'm feeling like a broken record. The current behavior is a feature, > > not a bug. You know I love to bash Python, but in this case they > > got it right.
> > > > If the user is installing code from source, then it should not > > be executable by default. For the same reason /usr/local/bin is > > not in the standard PATH, the /usr/local/lib/pythonx.x/ is not > > in the PYTHONPATH. > > > > If you are gonna file a bug on PYTHONPATH, you gotta file one on > > PATH. Just don't put my name on it. > > > > And I still feel there is a middle ground here that might work, but it > > is not immediately obvious. > > Are you sure we're still talking about the same problem? > > At this point it looks very much as though: > > 1. Our code was accidentally FHS-correct, but not doing what it should, > which is calling get_python_lib(). > > 2. Fred's patch changed it to do the right thing, call get_python_lib()... > > 3. ...which unmasked an upstream Python bug breaking FHS conformance. > > Does this match your understanding? Eric, didn't you file a ticket with waf (https://github.com/waf-project/waf/issues/1897) for this same issue in January 2017? Also, waf chose not to use a directory in sys.path in commit https://github.com/waf-project/waf/commit/588f809ffa4dd514ea90bdcd0341d9baf508784f See https://github.com/waf-project/waf/pull/1554 for more background. We could revert that commit in our local waf copy, or we maybe modify the internals in our wscript file... Cheers, -Matt From esr at thyrsus.com Wed Sep 27 02:59:32 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Tue, 26 Sep 2017 22:59:32 -0400 Subject: Python Library Cleanups In-Reply-To: <20170927024622.GC16340@twosigma.com> References: <20170926215907.GB3083@thyrsus.com> <20170926225531.GA3712@thyrsus.com> <20170926160126.6fada0fa@spidey.rellim.com> <20170926233002.GD3712@thyrsus.com> <20170927024622.GC16340@twosigma.com> Message-ID: <20170927025932.GA9201@thyrsus.com> Matthew Selsky via devel : > Eric, didn't you file a ticket with waf (https://github.com/waf-project/waf/issues/1897) for this same issue in January 2017? 
That's clearly the same functional issue, but at the time I did not yet have any idea that waf itself was (even partly) the victim of an upstream bug. As I told ita then, I didn't have enough knowledge to generate a fix patch. I still do not. > Also, waf chose not to use a directory in sys.path in commit https://github.com/waf-project/waf/commit/588f809ffa4dd514ea90bdcd0341d9baf508784f > > See https://github.com/waf-project/waf/pull/1554 for more background. > > We could revert that commit in our local waf copy, or we maybe modify the internals in our wscript file... And the rabbit-hole would get deeper. We might end up doing this, but the thought frightens me. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From hmurray at megapathdsl.net Wed Sep 27 05:59:48 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Tue, 26 Sep 2017 22:59:48 -0700 Subject: Our last-minute mess In-Reply-To: Message from "Eric S. Raymond via devel" of "Tue, 26 Sep 2017 22:04:53 EDT." <20170927020453.5F75C13A0206@snark.thyrsus.com> Message-ID: <20170927055949.0060640605C@ip-64-139-1-69.sjc.megapath.net> > I'd like to hear from the senior devs (and anyone else with something > intelligent to say!) on this. You need a steering committee to represent the customers on things like this. I didn't find enough info in the wiki page to enlighten me. I get the general idea, but I don't know the tag that describes our software. Is it real system software? What about devel mode? Distros aren't going to use our install script. They don't want to install stuff, they want to package it up in a .deb or .rpm or whatever. How do we get them the info they need in a format they can use? What are the plans for splitting out the python stuff? Do most distros include Python in their basic package? -- These are my opinions. I hate spam.
From esr at thyrsus.com Wed Sep 27 11:39:17 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Sep 2017 07:39:17 -0400 Subject: Our last-minute mess In-Reply-To: <20170927055949.0060640605C@ip-64-139-1-69.sjc.megapath.net> References: <20170927020453.5F75C13A0206@snark.thyrsus.com> <20170927055949.0060640605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170927113917.GA18175@thyrsus.com> Hal Murray : > > > I'd like to hear from the senior devs (and anyone else with something > > intelligent to say!) on this. > > You need a steering committee to represent the customers on things like this. Good idea. I'll keep that in mind as we get more customers. > I didn't find enough info in the wiki page to enlighten me. I get the > general idea, but I don't know the tag that describes our software. Is it > real system software? What about devel mode? It's what FHS considers "non-essential system software" - needs to run as root at boot but is not required for single-user recovery mode. I couldn't find a reference to "devel mode" in the FHS spec, so I can't answer that question. > Distros aren't going to use our install script. They don't want to install > stuff, they want to package it up in a .deb or .rpm or whatever. How do we > get them the info they need in a format they can use? That's what the packaging/ directory is for. It's supposed to contain both metadata examples and documentation that is guidance for packagers. > What are the plans for splitting out the python stuff? Do most distros > include Python in their basic package? Python is effectively universal at this point. The rational partitioning is probably (1) core daemon alone, (2) ntpq + ntpmon, (3) everything else. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Wed Sep 27 14:21:57 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Wed, 27 Sep 2017 10:21:57 -0400 (EDT) Subject: Fix for Python library path problem Message-ID: <20170927142157.5214613A0206@snark.thyrsus.com> I've pushed a fix for Fred Wright's FixConfig class that seems to solve the problem of incorrect Python library locations. I tested it with no --prefix option and with --prefix=/usr, using install --destdir=/tmp/ntp. Gary, please verify that this addresses your FHS concerns. Fred, please tell me if you think this is broken in some obscure way. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From hmurray at megapathdsl.net Wed Sep 27 17:28:43 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 27 Sep 2017 10:28:43 -0700 Subject: Duplicate issue-closed messages Message-ID: <20170927172843.27A4E40605C@ip-64-139-1-69.sjc.megapath.net> I'm getting duplicates of issue-closed messages from gitlab. I haven't checked to see if it's all of them, or just when you close something, or just enough of them to attract my attention. Are you doing anything interesting? Is anybody else getting them? The last pair had dates that were 1 second apart. Subject: Re: ntpsec | ntpleapfetch on CentOS 6.x - wants shasum but on CentOS it's sha1sum (#394) From: "Eric S. Raymond" Date: Wed, 27 Sep 2017 11:46:44 +0000 Subject: Re: ntpsec | ntpleapfetch on CentOS 6.x - wants shasum but on CentOS it's sha1sum (#394) From: "Eric S. Raymond" Date: Wed, 27 Sep 2017 11:46:44 +0000 -- These are my opinions. I hate spam. From gem at rellim.com Wed Sep 27 17:32:20 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 10:32:20 -0700 Subject: Fix for Python library path problem In-Reply-To: <20170927142157.5214613A0206@snark.thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> Message-ID: <20170927103220.7440ff64@spidey.rellim.com> Yo Eric! 
On Wed, 27 Sep 2017 10:21:57 -0400 (EDT) "Eric S. Raymond via devel" wrote: > Gary, please verify that this addresses your FHS concerns. Sort of. Looks like the python libs now installed in the right place, again: /usr/local/lib64/python2.7/site-packages/ntp/packet.py My first quick test shows the install leaving the old python libs installed. Yesterday, when the libs got moved, the old ones got deleted. So my subsequent tests were using the wrong library versions. When I deleted the old /usr/lib64/python2.7/site-packages/ntp/ then I get this: # ntpmon ntpmon: can't find Python NTP library. No module named ntp.packet So you forgot to restore the PYTHONPATH checks. So a bunch more commits to revert. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From fw at fwright.net Wed Sep 27 17:47:34 2017 From: fw at fwright.net (Fred Wright) Date: Wed, 27 Sep 2017 10:47:34 -0700 (PDT) Subject: Fix for Python library path problem In-Reply-To: <20170927142157.5214613A0206@snark.thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> Message-ID: On Wed, 27 Sep 2017, Eric S. Raymond via devel wrote: > I've pushed a fix for Fred Wright's FixConfig class that seems to > solve the problem of incorrect Python library locations. > > I tested it with no --prefix option and with --prefix=/usr, > using install --destdir=/tmp/ntp. > > Gary, please verify that this addresses your FHS concerns. > > Fred, please tell me if you think this is broken in some obscure way.
I'm not sure about "obscure", but if the result isn't in sys.path, then it's back to the same old problem. Looking at the waf change that introduced the trouble, it looks like it was mainly motivated by wanting to allow --prefix to influence the results (even though one can always supply --pythondir and --pythonarchdir), and they simply caused the no --prefix case to pass the default prefix instead of nothing, perhaps without realizing how this screws up the result. AFAICT, Python simply doesn't follow FHS on Linux. It may have the attitude that the fact that the paths have "pythonX.Y" in them makes them "owned" by Python, and hence exempt from the usual FHS rules. Whether one agrees with that philosophy or not, it's the way Python is set up (on Linux, anyway), and going against it can be expected to cause trouble. Take a look at the "non-FHS-compliant" Python library location on your system, and see how many *other* packages are being installed there. *Everyone* is going with Python rather than FHS on this issue, and if you want it fixed, you should convince the Python folks (or whoever configures the Linux Python installs) to fix it. Fred Wright From fw at fwright.net Wed Sep 27 17:59:53 2017 From: fw at fwright.net (Fred Wright) Date: Wed, 27 Sep 2017 10:59:53 -0700 (PDT) Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> Message-ID: On Wed, 27 Sep 2017, Fred Wright wrote: > On Wed, 27 Sep 2017, Eric S. Raymond via devel wrote: > > > I've pushed a fix for Fred Wright's FixConfig class that seems to > > solve the problem of incorrect Python library locations. > > > > I tested it with no --prefix option and with --prefix=/usr, > > using install --destdir=/tmp/ntp. > > > > Gary, please verify that this addresses your FHS concerns. > > > > Fred, please tell me if you think this is broken in some obscure way.
> > I'm not sure about "obscure", but if the result isn't in sys.path, then > it's back to the same old problem. FYI, I just took a look at sys.path on the three Linuces I have here (Ubuntu, CentOS, and Fedora), and none of them has a single entry with "local" as part of the path. > Looking at the waf change that introduced the trouble, it looks like it > was mainly motivated by wanting to allow --prefix to influence the results > (even though one can always supply --pythondir and --pythonarchdir), and > they simply caused the no --prefix case to pas the default prefix instead > of nothing, perhaps without realizing how this screws up the result. > > AFAICT, Python simply doesn't follow FHS on Linux. It may have the > attitude that the fact that the paths have "pythonX.Y" in them makes them > "owned" by Python, and hence exempt from the usual FHS rules. Whether one > agrees with that philosophy or not, it's the way Python is set up (on > Linux, anyway), and going against it can be expected to cause trouble. > > Take a look at the "non-FHS-compliant" Python library location on your > system, and see how many *other* packages are being installed there. > *Everyone* is going with Python rather than FHS on this issue, and if you > want it fixed, you should convince the Python folks (or whoever configures > the Linux Python installs) to fix it. > > Fred Wright > From gem at rellim.com Wed Sep 27 18:16:46 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 11:16:46 -0700 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> Message-ID: <20170927111646.6367fbbe@spidey.rellim.com> Yo Fred! On Wed, 27 Sep 2017 10:47:34 -0700 (PDT) Fred Wright via devel wrote: > AFAICT, Python simply doesn't follow FHS on Linux. Really? It does on Gentoo. Ditto debian, etc... RGDS GARY --------------------------------------------------------------------------- Gary E. 
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Wed Sep 27 18:18:58 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 11:18:58 -0700 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> Message-ID: <20170927111858.6fe30ef4@spidey.rellim.com> Yo Fred! On Wed, 27 Sep 2017 10:59:53 -0700 (PDT) Fred Wright via devel wrote: > > I'm not sure about "obscure", but if the result isn't in sys.path, > > then it's back to the same old problem. > > FYI, I just took a look at sys.path on the three Linuces I have here > (Ubuntu, CentOS, and Fedora), and none of them has a single entry with > "local" as part of the path. Yes, as it should be. Local is local, and sys is sys. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From esr at thyrsus.com Wed Sep 27 18:41:13 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Wed, 27 Sep 2017 14:41:13 -0400 Subject: Duplicate issue-closed messages In-Reply-To: <20170927172843.27A4E40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170927172843.27A4E40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170927184113.GA19136@thyrsus.com> Hal Murray : > Are you doing anything interesting? Is anybody else getting them? No. I've actually been asleep. :-) I'm not seeing duplicates. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Wed Sep 27 18:46:15 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Sep 2017 14:46:15 -0400 Subject: Fix for Python library path problem In-Reply-To: <20170927103220.7440ff64@spidey.rellim.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927103220.7440ff64@spidey.rellim.com> Message-ID: <20170927184615.GC19136@thyrsus.com> Gary E. Miller via devel : > Yo Eric! > > On Wed, 27 Sep 2017 10:21:57 -0400 (EDT) > "Eric S. Raymond via devel" wrote: > > > Gary, please verify that this addresses your FHS concerns. > > Sort of. Looks like the python libs now installed in the right place, > again: > > /usr/local/lib64/python2.7/site-packages/ntp/packet.py Good. > My first quick test shows the install leaving the old python libs installed. > > Yesterday, when the libs got moved, the old ones got deleted. Right, I haven't fixed that. I don't see an obvious way to address it. > So my subsequent tests were using the wrong library versions. > > When I deleted the old /usr/lib64/python2.7/site-packages/ntp/ then > I get this: > > # ntpmon > ntpmon: can't find Python NTP library. > No module named ntp.packet > > So you forgot to restore the PYTHONPATH checks. So a bunch more commits > to revert. One of the goals of this change is to abolish the need for PYTHONPATH to be set.
I'll explain the reasoning behind this change in my reply to Fred. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From ianbruene at gmail.com Wed Sep 27 18:51:15 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Wed, 27 Sep 2017 13:51:15 -0500 Subject: Duplicate issue-closed messages In-Reply-To: <20170927184113.GA19136@thyrsus.com> References: <20170927172843.27A4E40605C@ip-64-139-1-69.sjc.megapath.net> <20170927184113.GA19136@thyrsus.com> Message-ID: On 09/27/2017 01:41 PM, Eric S. Raymond via devel wrote: > Hal Murray : >> Are you doing anything interesting? Is anybody else getting them? > No. I've actually been asleep. :-) > > I'm not seeing duplicates. I have been, seeing duplicates that is. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys. -- Andrew Ryan From jason at azze.org Wed Sep 27 19:13:05 2017 From: jason at azze.org (Jason Azze) Date: Wed, 27 Sep 2017 15:13:05 -0400 Subject: Duplicate issue-closed messages In-Reply-To: <20170927172843.27A4E40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170927172843.27A4E40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: On Wed, Sep 27, 2017 at 1:28 PM, Hal Murray via devel wrote: > I'm getting duplicates of issue-closed messages from gitlab. > > Are you doing anything interesting? Is anybody else getting them? I also got a duplicate on the ntpleapfetch closure message. From esr at thyrsus.com Wed Sep 27 20:19:54 2017 From: esr at thyrsus.com (Eric S.
Raymond) Date: Wed, 27 Sep 2017 16:19:54 -0400 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> Message-ID: <20170927201954.GD19136@thyrsus.com> Fred Wright via devel : > > On Wed, 27 Sep 2017, Eric S. Raymond via devel wrote: > > > I've pushed a fix for Fred Wright's FixConfig class that seems to > > solve the problem of incorrect Python library locations. > > > > I tested it with no --prefix option and with --prefix=/usr, > > using install --destdir=/tmp/ntp. > > > > Gary, please verify that this addresses your FHS concerns. > > > > Fred, please tell me if you think this is broken in some obscure way. > > I'm not sure about "obscure", but if the result isn't in sys.path, then > it's back to the same old problem. That's right. What we can do, though, is win under the following assumption: if /usr/lib/X/Y/ is in sys.path, so is /usr/local/lib/X/Y/. Look at this from my system: >>> [x for x in sys.path if x.find('/usr/lib') != -1] ['/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client'] >>> [x for x in sys.path if x.find('/usr/lib') != -1 and x.replace('/usr/lib', '/usr/local/lib') == -1] [] >>> [x for x in sys.path if x.find('/usr/lib') != -1 and x.replace('/usr/lib', '/usr/local/lib') != -1] ['/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client'] Ubuntu follows that rule. 
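[Editorial sketch, not part of the original message.] One caution about the second query above: str.replace() returns a string, so comparing its result with -1 is never true, and that list comes back empty on any system regardless of what sys.path holds. Checking the stated rule seems to need a membership test instead. The following is a minimal sketch under that reading; the function name missing_local_twins is hypothetical, and the "/usr/lib" to "/usr/local/lib" mapping is simply the assumption stated in the message.

```python
import sys


def missing_local_twins(paths):
    """Return the /usr/lib* entries in `paths` whose /usr/local/lib* twin
    is NOT also present in `paths`.

    Hypothetical helper: it tests the rule "if /usr/lib/X/Y is in
    sys.path, so is /usr/local/lib/X/Y" by set membership.
    """
    present = set(paths)
    missing = []
    for entry in paths:
        # Matches /usr/lib and /usr/lib64 style entries, but not
        # /usr/local/lib, which does not start with "/usr/lib".
        if entry.startswith("/usr/lib"):
            twin = "/usr/local" + entry[len("/usr"):]
            if twin not in present:
                missing.append(entry)
    return missing


if __name__ == "__main__":
    # An empty result means the rule holds for this interpreter.
    for entry in missing_local_twins(sys.path):
        print(entry)
```

If this prints anything, the interpreter has system library directories with no /usr/local counterpart on its search path, which is the case the message says "could be a problem".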
I think it's a safe bet that everything else does too - in part because of FHS, in part because of ancient autoconf conventions. A slightly more sophisticated version would pop off the first path component and replace it with PREFIX. We might need to do that if we ever hit a platform that really wants to install NTP under /opt. I think this is unlikely, I've never heard of the /opt convention being used for system daemons. > Looking at the waf change that introduced the trouble, it looks like it > was mainly motivated by wanting to allow --prefix to influence the results > (even though one can always supply --pythondir and --pythonarchdir), and > they simply caused the no --prefix case to pas the default prefix instead > of nothing, perhaps without realizing how this screws up the result. Would you be willing to file an issue about this on the waf tracker? > AFAICT, Python simply doesn't follow FHS on Linux. Doesn't it? Look at my example again. It looks a lot like somebody, either Python or Ubuntu's Python packagers, has gone to the effort to ensure that FHS-compliant library directories under /usr/local/lib exist in parallel with every system library directory under /usr/lib. I just checked Raspbian on one of my Pis. General rule works there too. Gentoo and our other platforms almost certainly have the same regularity, otherwise it would be an almighty coincidence that what Gary considers the right (FHS-compliant) locations were good before your MR. I think we're done here. I'll add an explanatory comment to the massage() logic. Just to be sure, though, people with access to other platforms - like Red Hat and FreeBSD - should run these checks in Python >>> [x for x in sys.path if x.find('/usr/lib') != -1] >>> [x for x in sys.path if x.find('/usr/lib') != -1 and x.replace('/usr/lib', '/usr/local/lib') == -1] If the second one ever comes up non-empty we could have a problem. -- Eric S. 
Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Wed Sep 27 20:28:30 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Sep 2017 16:28:30 -0400 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> Message-ID: <20170927202830.GF19136@thyrsus.com> Fred Wright via devel : > FYI, I just took a look at sys.path on the three Linuces I have here > (Ubuntu, CentOS, and Fedora), and none of them has a single entry with > "local" as part of the path. I see this under Ubuntu: >>> [x for x in sys.path if x.find('local') != -1] ['/usr/local/lib/python2.7/dist-packages/lbrynet-0.2.0-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/appdirs-1.4.0-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/jsonrpc-1.2-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/lbryum-2.6-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/leveldb-0.193-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/unqlite-0.2.0-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/txJSON_RPC-0.3.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/python_bitcoinrpc-0.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/seccure-0.3.1.3-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/Yapsy-1.11.223-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/miniupnpc-1.9-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/Twisted-16.0.0-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/jsonrpclib-0.1.7-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/dnspython-1.12.0-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/protobuf-3.0.0b2-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/qrcode-5.2.2-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/pbkdf2-1.3-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/ecdsa-0.13-py2.7.egg', 
'/usr/local/lib/python2.7/dist-packages/slowaes-0.1a1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/gmpy-1.17-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/temperusb-1.5.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/pyusb-1.0.0-py2.7.egg', '/home/esr/.local/lib/python2.7/site-packages', '/usr/local/lib/python2.7/dist-packages'] -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From gem at rellim.com Wed Sep 27 20:29:32 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 13:29:32 -0700 Subject: Fix for Python library path problem In-Reply-To: <20170927201954.GD19136@thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> Message-ID: <20170927132932.43bcdfdd@spidey.rellim.com> Yo Eric! On Wed, 27 Sep 2017 16:19:54 -0400 "Eric S. Raymond via devel" wrote: > I think we're done here. I'll add an explanatory comment to the > massage() logic. Except for your upcoming solution to the PYTHONPATH issue. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Wed Sep 27 20:43:54 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 27 Sep 2017 16:43:54 -0400 Subject: Fix for Python library path problem In-Reply-To: <20170927132932.43bcdfdd@spidey.rellim.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> Message-ID: <20170927204354.GA22161@thyrsus.com> Gary E. Miller via devel : > Except for your upcoming solution to the PYTHONPATH issue. Explain "the PYTHONPATH issue", please. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From fw at fwright.net Wed Sep 27 20:45:30 2017 From: fw at fwright.net (Fred Wright) Date: Wed, 27 Sep 2017 13:45:30 -0700 (PDT) Subject: Fix for Python library path problem In-Reply-To: <20170927201954.GD19136@thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> Message-ID: On Wed, 27 Sep 2017, Eric S. Raymond wrote: > Fred Wright via devel : > > > > On Wed, 27 Sep 2017, Eric S. Raymond via devel wrote: > > > > > I've pushed a fix for Fred Wright's FixConfig class that seems to > > > solve the problem of incorrect Python library locations. > > > > > > I tested it with no --prefix option and with --prefix=/usr, > > > using install --destdir=/tmp/ntp. > > > > > > Gary, please verify that this addresses your FHS concerns. > > > > > > Fred, please tell me if you think this is broken in some obscure way. > > > > I'm not sure about "obscure", but if the result isn't in sys.path, then > > it's back to the same old problem. > > That's right. What we can do, though, is win under the following assumption: > if /usr/lib/X/Y/ is in sys.path, so is /usr/local/lib/X/Y/. 
Look at > this from my system: > > >>> [x for x in sys.path if x.find('/usr/lib') != -1] > ['/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client'] > >>> [x for x in sys.path if x.find('/usr/lib') != -1 and x.replace('/usr/lib', '/usr/local/lib') == -1] > [] > >>> [x for x in sys.path if x.find('/usr/lib') != -1 and x.replace('/usr/lib', '/usr/local/lib') != -1] > ['/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client'] > > Ubuntu follows that rule. I think it's a safe bet that everything else does > too - in part because of FHS, in part because of ancient autoconf conventions. > > A slightly more sophisticated version would pop off the first path component > and replace it with PREFIX. We might need to do that if we ever hit a platform > that really wants to install NTP under /opt. I think this is unlikely, I've > never heard of the /opt convention being used for system daemons. > > > Looking at the waf change that introduced the trouble, it looks like it > > was mainly motivated by wanting to allow --prefix to influence the results > > (even though one can always supply --pythondir and --pythonarchdir), and > > they simply caused the no --prefix case to pas the default prefix instead > > of nothing, perhaps without realizing how this screws up the result. > > Would you be willing to file an issue about this on the waf tracker? > > > AFAICT, Python simply doesn't follow FHS on Linux. > > Doesn't it? 
Look at my example again. It looks a lot like somebody, either > Python or Ubuntu's Python packagers, has gone to the effort to ensure that > FHS-compliant library directories under /usr/local/lib exist in parallel with > every system library directory under /usr/lib. Whether the directories exist isn't the point. No directories under /usr/local/lib are in the default sys.path. Hence directories of that form don't work for imports without special help. E.g.: fw at ubuntu:~$ python -c 'import sys; print(sys.path)' ['', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client'] > I just checked Raspbian on one of my Pis. General rule works there too. > > Gentoo and our other platforms almost certainly have the same regularity, > otherwise it would be an almighty coincidence that what Gary > considers the right (FHS-compliant) locations were good before your > MR. Well, as long as "good" doesn't have to mean "working without PYTHONPATH kludgery", then yes.
:-) As far as using the non-FHS location being "evil" goes, note: fw at ubuntu:~$ python -c 'from distutils import sysconfig; print(sysconfig.get_python_lib())' /usr/lib/python2.7/dist-packages fw at ubuntu:~$ ls /usr/lib/python2.7/dist-packages adium_theme_ubuntu-0.3.4.egg-info hpmudext.so python_apt-0.9.3.5ubuntu2.egg-info ANSI.py httplib2 python_debian-0.1.21_nmu2ubuntu2.egg-info ANSI.pyc httplib2-0.8.egg-info pyxdg-0.25.egg-info apt ibus README aptdaemon indicator_keyboard reportlab apt_inst.so ldb.so reportlab-3.0.egg-info apt_pkg.so _ldb_text.py requests aptsources _ldb_text.pyc requests-2.2.1.egg-info apt_xapian_index-0.45.egg-info libxml2mod.so samba axi libxml2.py scanext.la cairo libxml2.pyc scanext.so chardet lockfile-0.8.egg-info screen.py chardet-2.0.1.egg-info lockfile.py screen.pyc CommandNotFound lockfile.pyc serial command_not_found-0.3.egg-info lsb_release.py sessioninstaller Crypto lsb_release.pyc sessioninstaller-0.0.0.egg-info cupsext.la lxml sipconfig_nd.py cupsext.so lxml-3.3.3.egg-info sipconfig_nd.pyc cupshelpers ntdb.so sipconfig.py cups.so oauthlib sipconfig.pyc curl oauthlib-0.6.1.egg-info sip.so dblatex-0.3.4_3.egg-info oneconf six-1.5.2.egg-info dbtexmf oneconf-0.3.7.14.04.1.egg-info six.py dbus OpenSSL six.pyc _dbus_bindings.so PAM-0.4.2.egg-info smbc _dbus_glib_bindings.so PAM.x86_64-linux-gnu.so _smbc.so deb822.py pcardext.la softwarecenter_aptd_plugins deb822.pyc pcardext.so software_center_aptd_plugins-0.0.0.egg-info debconf.py pexpect ssh_import_id-3.21.egg-info debconf.pyc pexpect-3.1.egg-info system_service-0.1.6.egg-info debian PIL talloc.so debian_bundle PILcompat tdb.so debtagshw PILcompat.pth _tdb_text.py debtagshw-0.1.egg-info Pillow-2.3.0.egg-info _tdb_text.pyc defer piston_mini_client twisted defer-1.0.6.egg-info piston_mini_client-0.7.5.egg-info Twisted_Core-13.2.0.egg-info dirspec pkg_resources.py Twisted_Web-13.2.0.egg-info dirspec-13.10.egg-info pkg_resources.pyc ubuntu-sso-client drv_libxml2.py pxssh.py 
ubuntu-sso-client.pth drv_libxml2.pyc pxssh.pyc UbuntuSystemService duplicity pycrypto-2.6.1.egg-info unity_lens_photos-1.0.egg-info duplicity-0.6.23.egg-info pycups-1.9.66.egg-info urllib3 fdpexpect.py pycurl-7.19.3.egg-info urllib3-1.7.1.egg-info fdpexpect.pyc pycurl.so xapian FSM.py pygobject-3.12.0.egg-info xdg FSM.pyc pygtkcompat xdiagnose gi pygtk.pth xdiagnose-3.6.3build2.egg-info glib pygtk.py zeitgeist gobject pygtk.pyc zope gps-3.9.egg-info pyOpenSSL-0.13.egg-info zope.interface-4.0.5.egg-info gtk-2.0 PyQt4 zope.interface-4.0.5-nspkg.pth gtk-2.0-pysupport-compat.pth pyserial-2.6.egg-info hpmudext.la pysmbc-1.0.14.1.egg-info So most of the world elects to follow Python, not FHS. Fred Wright From gem at rellim.com Wed Sep 27 20:49:33 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 13:49:33 -0700 Subject: Fix for Python library path problem In-Reply-To: <20170927204354.GA22161@thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> Message-ID: <20170927134933.19070c11@spidey.rellim.com> Yo Eric! On Wed, 27 Sep 2017 16:43:54 -0400 "Eric S. Raymond" wrote: > Gary E. Miller via devel : > > Except for your upcoming solution to the PYTHONPATH issue. > > Explain "the PYTHONPATH issue", please. I just installed git head. No PYTHONPATH: spidey ntpsec # ntpq -up ntpq: can't find Python NTP library. No module named 'ntp' When I add PYTHONPATH it works again: spidey ntpsec # export PYTHONPATH=/usr/local/lib64/python3.4/site-packages spidey ntpsec # ntpq -up [...] When I unset PYTHONPATH, it is broken again: spidey ntpsec # unset PYTHONPATH spidey ntpsec # ntpq -up ntpq: can't find Python NTP library. No module named 'ntp' This is why all the commits that ripped out PYTHONPATH stuff need to be reverted. This is how this whole mess got started. 
RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Wed Sep 27 20:54:13 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 13:54:13 -0700 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> Message-ID: <20170927135413.37d0a3a5@spidey.rellim.com> Yo Fred! On Wed, 27 Sep 2017 13:45:30 -0700 (PDT) Fred Wright via devel wrote: > So most of the world elects to follow Python, not FHS. Uh, you misunderstood the FHS. None of those were source code tarballs you installed. Those are the system packages, right where they should be. The system packages are never installed in /usr/local/. Local is local, and sys is sys. Separate, not quite equal, well defined by FHS. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From jason at azze.org Wed Sep 27 20:54:56 2017 From: jason at azze.org (Jason Azze) Date: Wed, 27 Sep 2017 16:54:56 -0400 Subject: Fix for Python library path problem In-Reply-To: <20170927201954.GD19136@thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> Message-ID: On Wed, Sep 27, 2017 at 4:19 PM, Eric S. Raymond via devel wrote: > Just to be sure, though, people with access to other platforms - like Red Hat > and FreeBSD - should run these checks in Python > >>>> [x for x in sys.path if x.find('/usr/lib') != -1] > >>>> [x for x in sys.path if x.find('/usr/lib') != -1 and x.replace('/usr/lib', '/usr/local/lib') == -1] > > If the second one ever comes up non-empty we could have a problem. I checked CentOS 6.9 and CentOS 7.3 and, after I figured out I had to import sys, I can confirm that the second expression comes back empty. From fw at fwright.net Wed Sep 27 20:56:49 2017 From: fw at fwright.net (Fred Wright) Date: Wed, 27 Sep 2017 13:56:49 -0700 (PDT) Subject: Fix for Python library path problem In-Reply-To: <20170927202830.GF19136@thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927202830.GF19136@thyrsus.com> Message-ID: On Wed, 27 Sep 2017, Eric S. Raymond wrote: > Fred Wright via devel : > > FYI, I just took a look at sys.path on the three Linuces I have here > > (Ubuntu, CentOS, and Fedora), and none of them has a single entry with > > "local" as part of the path. 
> > I see this under Ubuntu: > > >>> [x for x in sys.path if x.find('local') != -1] > ['/usr/local/lib/python2.7/dist-packages/lbrynet-0.2.0-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/appdirs-1.4.0-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/jsonrpc-1.2-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/lbryum-2.6-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/leveldb-0.193-py2.7-linux-x86_64.egg', > '/usr/local/lib/python2.7/dist-packages/unqlite-0.2.0-py2.7-linux-x86_64.egg', > '/usr/local/lib/python2.7/dist-packages/txJSON_RPC-0.3.1-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/python_bitcoinrpc-0.1-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/seccure-0.3.1.3-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/Yapsy-1.11.223-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/miniupnpc-1.9-py2.7-linux-x86_64.egg', > '/usr/local/lib/python2.7/dist-packages/Twisted-16.0.0-py2.7-linux-x86_64.egg', > '/usr/local/lib/python2.7/dist-packages/jsonrpclib-0.1.7-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/dnspython-1.12.0-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/protobuf-3.0.0b2-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/qrcode-5.2.2-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/pbkdf2-1.3-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/ecdsa-0.13-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/slowaes-0.1a1-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/gmpy-1.17-py2.7-linux-x86_64.egg', > '/usr/local/lib/python2.7/dist-packages/temperusb-1.5.1-py2.7.egg', > '/usr/local/lib/python2.7/dist-packages/pyusb-1.0.0-py2.7.egg', > '/home/esr/.local/lib/python2.7/site-packages', > '/usr/local/lib/python2.7/dist-packages'] So *something* is adding additional entries to sys.path in your Ubuntu Python (but not mine). If there's a way to make that happen, it could be another solution. 
I *don't* see any of that here (ubuntu 14.04, Python 2.7.6), even though there are multiple packages with egg files. What does get_python_lib() show in this Python? It did occur to me that finding an FHS-compliant directory that's already in sys.path might be a solution, but when I discovered that, in all three Linuces I have here, the intersection between the set of sys.path directories and the set of FHS-compliant directories is empty, it seemed pointless to suggest it. Fred Wright From gem at rellim.com Wed Sep 27 21:00:41 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 14:00:41 -0700 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> Message-ID: <20170927140041.7f38b9b6@spidey.rellim.com> Yo Jason! On Wed, 27 Sep 2017 16:54:56 -0400 Jason Azze via devel wrote: > On Wed, Sep 27, 2017 at 4:19 PM, Eric S. Raymond via devel > wrote: > > > Just to be sure, though, people with access to other platforms - > > like Red Hat and FreeBSD - should run these checks in Python > > > >>>> [x for x in sys.path if x.find('/usr/lib') != -1] > > > >>>> [x for x in sys.path if x.find('/usr/lib') != -1 and > >>>> x.replace('/usr/lib', '/usr/local/lib') == -1] > > > > If the second one ever comes up non-empty we could have a problem. > > I checked CentOS 6.9 and CentOS 7.3 and, after I figured out I had to > import sys, I can confirm that the second expression comes back empty. Ditto for a barebones Gentoo. Which is what I expect. I have no .eggs installed. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Wed Sep 27 21:07:05 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 14:07:05 -0700 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927202830.GF19136@thyrsus.com> Message-ID: <20170927140705.74f7311e@spidey.rellim.com> Yo Fred! On Wed, 27 Sep 2017 13:56:49 -0700 (PDT) Fred Wright via devel wrote: > So *something* is adding additional entries to sys.path in your Ubuntu > Python (but not mine). pip adds to the sys.path. Other package managers prolly do as well. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Wed Sep 27 21:08:08 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Sep 2017 17:08:08 -0400 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> Message-ID: <20170927210808.GA22484@thyrsus.com> Fred Wright via devel : > > Doesn't it? Look at my example again. It looks a lot like somebody, either > > Python or Ubuntu's Python packagers, has gone to the effort to ensure that > > FHS-compliant library directories under /usr/local/lib exist in parallel with > > every system library directory under /usr/lib. > > Whether the directories exists isn't the point. No directories under > /usr/local/lib are in the default sys.path. 
Hence directories of that > form don't work for imports without special help. E.g.: > > fw at ubuntu:~$ python -c 'import sys; print(sys.path)' > ['', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', > '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', > '/usr/lib/python2.7/lib-dynload', '/usr/lib/python2.7/dist-packages', > '/usr/lib/python2.7/dist-packages/PILcompat', > '/usr/lib/python2.7/dist-packages/gtk-2.0', > '/usr/lib/python2.7/dist-packages/ubuntu-sso-client'] I see this: esr at snark:~/software/ntp-rescue/ntpsec$ python -c 'import sys; print(sys.path)' ['', '/usr/local/lib/python2.7/dist-packages/lbrynet-0.2.0-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/appdirs-1.4.0-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/jsonrpc-1.2-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/lbryum-2.6-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/leveldb-0.193-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/unqlite-0.2.0-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/txJSON_RPC-0.3.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/python_bitcoinrpc-0.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/seccure-0.3.1.3-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/Yapsy-1.11.223-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/miniupnpc-1.9-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/Twisted-16.0.0-py2.7-linux-x86_64.egg', '/usr/local/lib/python2.7/dist-packages/jsonrpclib-0.1.7-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/dnspython-1.12.0-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/protobuf-3.0.0b2-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/qrcode-5.2.2-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/pbkdf2-1.3-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/ecdsa-0.13-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/slowaes-0.1a1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/gmpy-1.17-py2.7-linux-x86_64.egg', 
'/usr/local/lib/python2.7/dist-packages/temperusb-1.5.1-py2.7.egg', '/usr/local/lib/python2.7/dist-packages/pyusb-1.0.0-py2.7.egg', '/usr/lib/python2.7', '/usr/lib/python2.7/plat-x86_64-linux-gnu', '/usr/lib/python2.7/lib-tk', '/usr/lib/python2.7/lib-old', '/usr/lib/python2.7/lib-dynload', '/home/esr/.local/lib/python2.7/site-packages', '/usr/local/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages', '/usr/lib/python2.7/dist-packages/PILcompat', '/usr/lib/python2.7/dist-packages/gtk-2.0', '/usr/lib/python2.7/dist-packages/ubuntu-sso-client'] I think the first thing we need to understand is why you and I, both using Ubuntu systems, are seeing different sys.path values. > As far as using the non-FHS location being "evil" goes, note: > So most of the world elects to follow Python, not FHS. Have that argument with Gary, not with me. If you can persuade him that FHS nonconformance is not a big deal, we can just revert my massage() patch and go. Having discovered that nobody complained about it over 7 or possibly 8 GPSD releases, I no longer care much. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Wed Sep 27 21:09:31 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Sep 2017 17:09:31 -0400 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> Message-ID: <20170927210931.GB22484@thyrsus.com> Jason Azze via devel : > I checked CentOS 6.9 and CentOS 7.3 and, after I figured out I had to > import sys, I can confirm that the second expression comes back empty. Good to know, thanks. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. 
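A note on the second check quoted in this thread: as written, `x.replace('/usr/lib', '/usr/local/lib')` returns a string, so comparing it with -1 is always false and the list comes back empty on every system, whatever sys.path holds. An empty result therefore confirms nothing. Below is a minimal sketch of what the check appears to intend, namely listing each /usr/lib entry whose /usr/local/lib counterpart is missing from sys.path. That reading is an assumption, the function name `missing_local_twins` is hypothetical, and the sketch uses a prefix match where the original used `find()`.

```python
import sys


def missing_local_twins(paths):
    """Return the /usr/lib entries whose /usr/local/lib twin is not in paths.

    Hypothetical helper, not part of NTPsec or the Python standard library.
    """
    present = set(paths)
    return [p for p in paths
            if p.startswith('/usr/lib')
            # Rewrite only the leading prefix, then test membership.
            and p.replace('/usr/lib', '/usr/local/lib', 1) not in present]


if __name__ == '__main__':
    # A non-empty list means the "parallel directories" rule does not hold here.
    print(missing_local_twins(sys.path))
```

Run against the two sys.path listings posted in this thread, a check of this shape would tell the two Ubuntu systems apart, which the original expression cannot do.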
From esr at thyrsus.com Wed Sep 27 21:21:23 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Sep 2017 17:21:23 -0400 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927202830.GF19136@thyrsus.com> Message-ID: <20170927212123.GC22484@thyrsus.com> Fred Wright via devel : > So *something* is adding additional entries to sys.path in your Ubuntu > Python (but not mine). If there's a way to make that happen, it could be > another solution. I *don't* see any of that here (ubuntu 14.04, Python > 2.7.6), even though there are multiple packages with egg files. You seem to be the odd person out. We now have confirmation of my rule (/usr/lib/X/Y in sys.path implies /usr/local/lib/X/Y in sys.path) from Gentoo and two versions of CentOS, as well as my Ubuntu 16 and Raspbian systems. > What does get_python_lib() show in this Python? >>> import distutils.sysconfig >>> distutils.sysconfig.get_python_lib() '/usr/lib/python2.7/dist-packages' -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From gem at rellim.com Wed Sep 27 21:28:19 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 14:28:19 -0700 Subject: Fix for Python library path problem In-Reply-To: <20170927212123.GC22484@thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927202830.GF19136@thyrsus.com> <20170927212123.GC22484@thyrsus.com> Message-ID: <20170927142819.76f18cad@spidey.rellim.com> Yo Eric! On Wed, 27 Sep 2017 17:21:23 -0400 "Eric S. Raymond via devel" wrote: > Fred Wright via devel : > > So *something* is adding additional entries to sys.path in your > > Ubuntu Python (but not mine). If there's a way to make that > > happen, it could be another solution. 
I *don't* see any of that > > here (ubuntu 14.04, Python 2.7.6), even though there are multiple > > packages with egg files. > > You seem to be the odd person out. We now have confirmation of my > rule (/usr/lib/X/Y in sys.path implies /usr/local/lib/X/Y in > sys.path) from Gentoo and two versions of CentOS, as well as my > Ubuntu 16 and Raspbian systems. Uh, no. Not on my Gentoo, and not on CentOS. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Wed Sep 27 21:36:56 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Sep 2017 17:36:56 -0400 Subject: Fix for Python library path problem In-Reply-To: <20170927134933.19070c11@spidey.rellim.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> Message-ID: <20170927213656.GD22484@thyrsus.com> Gary E. Miller via devel : > Yo Eric! > > On Wed, 27 Sep 2017 16:43:54 -0400 > "Eric S. Raymond" wrote: > > > Gary E. Miller via devel : > > > Except for your upcoming solution to the PYTHONPATH issue. > > > > Explain "the PYTHONPATH issue", please. > > I just installed git head. No PYTHONPATH: > > spidey ntpsec # ntpq -up > ntpq: can't find Python NTP library. > No module named 'ntp' What is on your sys.path? Looks like waf is installing to the wrong place. You should do an install with --destdir=/tmp/ntp to see what installation path it's generating. 
> When I add PYTHONPATH it works again: > > spidey ntpsec # export PYTHONPATH=/usr/local/lib64/python3.4/site-packages > spidey ntpsec # ntpq -up > [...] > > When I unset PYTHONPATH, it is broken again: > > spidey ntpsec # unset PYTHONPATH > spidey ntpsec # ntpq -up > ntpq: can't find Python NTP library. > No module named 'ntp' > > This is why all the commits that ripped out PYTHONPATH stuff need to be > reverted. Please go have that argument with Fred, not me. We'll do whatever fixes you two can agree on. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From esr at thyrsus.com Wed Sep 27 21:40:56 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Sep 2017 17:40:56 -0400 Subject: Fix for Python library path problem In-Reply-To: <20170927142819.76f18cad@spidey.rellim.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927202830.GF19136@thyrsus.com> <20170927212123.GC22484@thyrsus.com> <20170927142819.76f18cad@spidey.rellim.com> Message-ID: <20170927214055.GE22484@thyrsus.com> Gary E. Miller via devel : > Yo Eric! > > On Wed, 27 Sep 2017 17:21:23 -0400 > "Eric S. Raymond via devel" wrote: > > > Fred Wright via devel : > > > So *something* is adding additional entries to sys.path in your > > > Ubuntu Python (but not mine). If there's a way to make that > > > happen, it could be another solution. I *don't* see any of that > > > here (ubuntu 14.04, Python 2.7.6), even though there are multiple > > > packages with egg files. > > > > You seem to be the odd person out. 
We now have confirmation of my > > rule (/usr/lib/X/Y in sys.path implies /usr/local/lib/X/Y in > > sys.path) from Gentoo and two versions of CentOS, as well as my > > Ubuntu 16 and Raspbian systems. > > Uh, no. Not on my Gentoo, and not on CentOS. I thought you just told me the opposite, and I *know* Jason did. Jason Azze via devel : > I checked CentOS 6.9 and CentOS 7.3 and, after I figured out I had to > import sys, I can confirm that the second expression comes back empty. If Gentoo doesn't obey that rule, how did your FHS-compliant installations ever work before Fred's patch? -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From gem at rellim.com Wed Sep 27 22:16:10 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 15:16:10 -0700 Subject: Fix for Python library path problem In-Reply-To: <20170927213656.GD22484@thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> <20170927213656.GD22484@thyrsus.com> Message-ID: <20170927151610.5c920b57@spidey.rellim.com> Yo Eric! On Wed, 27 Sep 2017 17:36:56 -0400 "Eric S. Raymond" wrote: > Gary E. Miller via devel : > > Yo Eric! > > > > On Wed, 27 Sep 2017 16:43:54 -0400 > > "Eric S. Raymond" wrote: > > > > > Gary E. Miller via devel : > > > > Except for your upcoming solution to the PYTHONPATH issue. > > > > > > Explain "the PYTHONPATH issue", please. > > > > I just installed git head. No PYTHONPATH: > > > > spidey ntpsec # ntpq -up > > ntpq: can't find Python NTP library. > > No module named 'ntp' > > What is on your sys.path? 
On a simple RasPi gentoo 3.4: pi3 etc # python Python 3.4.5 (default, May 3 2017, 05:22:30) [GCC 5.4.0] on linux Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> sys.path ['', '/usr/local/lib/python3.4/site-packages/ntp', '/usr/lib/python34.zip', '/usr/lib/python3.4', '/usr/lib/python3.4/plat-linux', '/usr/lib/python3.4/lib-dynload', '/usr/lib/python3.4/site-packages'] > Looks like waf is installing to the wrong > place. It installed in the right place: pi3 etc # ls /usr/local/lib/python3.4/site-packages/ ntp > You should do an install with --destdir=/tmp/ntp to see what installation path it's generating. Works perfectly: pi3 ntpsec # ls /tmp/ntp/usr/local/lib/python3.4/site-packages/ntp/ agentx.py __init__.py ntpc.cpython-34m.so __pycache__ util.py control.py magic.py packet.py statfiles.py version.py > > This is why all the commits that ripped out PYTHONPATH stuff need > > to be reverted. > > Please go have that argument with Fred, not me. We'll do whatever > fixes you two can agree on. We all already agreed we preferred PYTHONPATH to go away, but the replacement needs to actually work. I mainly brought up PYTHONPATH as it shows what remains to be fixed, and how we used to handle the issue. We can always go back to the old solution that worked, until we find a better way. Right now, the NTPsec install is broken, and needs a fix. What we see pip do is edit sys.path to include the location where an egg is installed. That looks to me like a method to go forward with. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From gem at rellim.com Wed Sep 27 22:27:35 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 15:27:35 -0700 Subject: Fix for Python library path problem In-Reply-To: <20170927214055.GE22484@thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927202830.GF19136@thyrsus.com> <20170927212123.GC22484@thyrsus.com> <20170927142819.76f18cad@spidey.rellim.com> <20170927214055.GE22484@thyrsus.com> Message-ID: <20170927152735.7114132d@spidey.rellim.com> Yo Eric! On Wed, 27 Sep 2017 17:40:56 -0400 "Eric S. Raymond" wrote: > Gary E. Miller via devel : > > Yo Eric! > > > > On Wed, 27 Sep 2017 17:21:23 -0400 > > "Eric S. Raymond via devel" wrote: > > > > > Fred Wright via devel : > > > > So *something* is adding additional entries to sys.path in your > > > > Ubuntu Python (but not mine). If there's a way to make that > > > > happen, it could be another solution. I *don't* see any of that > > > > here (ubuntu 14.04, Python 2.7.6), even though there are > > > > multiple packages with egg files. > > > > > > You seem to be the odd person out. We now have confirmation of my > > > rule (/usr/lib/X/Y in sys.path implies /usr/local/lib/X/Y in > > > sys.path) from Gentoo and two versions of CentOS, as well as my > > > Ubuntu 16 and Raspbian systems. > > > > Uh, no. Not on my Gentoo, and not on CentOS. > > I thought you just told me the opposite, Sorry if I was not clear the first time. > and I *know* Jason did. I suggest you reconfirm with him. Re-reading the emails I see a bunch of double negatives going around. > Jason Azze via devel : > > I checked CentOS 6.9 and CentOS 7.3 and, after I figured out I had > > to import sys, I can confirm that the second expression comes back > > empty. > > If Gentoo doesn't obey that rule, Uh, what rule? > how did your FHS-compliant > installations ever work before Fred's patch?
Well, I sorta hate to bring it up, but since you asked: PYTHONPATH. Until yesterday, the PYTHONPATH stuff was well documented and had matching error messages to guide users in fixing their systems. Once again, I'd like to go beyond PYTHONPATH, but you asked. Once again, I think pip has the answer: edit the sys.path. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From ianbruene at gmail.com Wed Sep 27 22:39:49 2017 From: ianbruene at gmail.com (Ian Bruene) Date: Wed, 27 Sep 2017 17:39:49 -0500 Subject: Python 3 and 1.0 In-Reply-To: <20170927011244.0A84940605C@ip-64-139-1-69.sjc.megapath.net> References: <20170927011244.0A84940605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <93128ca6-98fb-e320-c1e7-eb585f88890f@gmail.com> Since my initial complaint about Py3 compatibility some bugs have been fixed, agentx tests work, and I've poked at it with a stick. Panic-mode rescinded. -- In the end; what separates a Man, from a Slave? Money? Power? No. A Man Chooses, a Slave Obeys.
-- Andrew Ryan From fw at fwright.net Wed Sep 27 22:51:25 2017 From: fw at fwright.net (Fred Wright) Date: Wed, 27 Sep 2017 15:51:25 -0700 (PDT) Subject: Fix for Python library path problem In-Reply-To: <20170927151610.5c920b57@spidey.rellim.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> <20170927213656.GD22484@thyrsus.com> <20170927151610.5c920b57@spidey.rellim.com> Message-ID: On Wed, 27 Sep 2017, Gary E. Miller via devel wrote: > > What we see that pip does, is edit the sys.path to include the > location an egg is installed. That looks to me like a method > to go forward with. That sounds plausible, but we need to figure out how it does that. It's not just about "editing" sys.path. Although the latter is just a Python list which can be modified in the usual ways at runtime, pip seems to be setting up something that gets processed by Python at its startup time to effectively augment sys.path *persistently*, without any action on the part of the program being run. I'll try to figure this out, but the fact that I don't see any "egg stuff" in any Linux sys.path here makes it harder. It only matters for Linux, since get_python_lib() returns FHS-compliant results on *BSD, and on OSX the paths are so completely different that FHS doesn't apply. Fred Wright From gem at rellim.com Wed Sep 27 23:03:06 2017 From: gem at rellim.com (Gary E. 
Miller) Date: Wed, 27 Sep 2017 16:03:06 -0700 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> <20170927213656.GD22484@thyrsus.com> <20170927151610.5c920b57@spidey.rellim.com> Message-ID: <20170927160306.5d0b9f75@spidey.rellim.com> Yo Fred! On Wed, 27 Sep 2017 15:51:25 -0700 (PDT) Fred Wright via devel wrote: > On Wed, 27 Sep 2017, Gary E. Miller via devel wrote: > > > > What we see that pip does, is edit the sys.path to include the > > location an egg is installed. That looks to me like a method > > to go forward with. > > That sounds plausible, but we need to figure out how it does that. > It's not just about "editing" sys.path. Although the latter is just > a Python list which can be modified in the usual ways at runtime, pip > seems to be setting up something that gets processed by Python at its > startup time to effectively augment sys.path *persistently*, without > any action on the part of the program being run. Yup, I oversummarized the issue; like you, I'm still learning this arcane area. > I'll try to figure this out, but the fact that I don't see any "egg > stuff" in any Linux sys.path here makes it harder. Neither did I, until I used pip to install some eggs. I'm guessing there are many other ways. > It only matters > for Linux, since get_python_lib() returns FHS-compliant results on > *BSD, and on OSX the paths are so completely different that FHS > doesn't apply. Uh, lost me. macOS is very much FHS compliant. When I run NTPsec on macOS I export PYTHONPATH. And AFAIK, NTPsec is now, once again, installing in the proper location on all OS. I no longer see an issue here. RGDS GARY --------------------------------------------------------------------------- Gary E.
Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From fw at fwright.net Wed Sep 27 23:31:12 2017 From: fw at fwright.net (Fred Wright) Date: Wed, 27 Sep 2017 16:31:12 -0700 (PDT) Subject: Fix for Python library path problem In-Reply-To: <20170927160306.5d0b9f75@spidey.rellim.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> <20170927213656.GD22484@thyrsus.com> <20170927151610.5c920b57@spidey.rellim.com> <20170927160306.5d0b9f75@spidey.rellim.com> Message-ID: On Wed, 27 Sep 2017, Gary E. Miller via devel wrote: > Fred Wright via devel wrote: > It only matters > > for Linux, since get_python_lib() returns FHS-compliant results on > > *BSD, and on OSX the paths are so completely different that FHS > > doesn't apply. > > Uh, lost me. macOS is very much FHS compliant. MacPro:~ fw$ /usr/bin/python -c 'from distutils import sysconfig; print(sysconfig.get_python_lib())' /Library/Python/2.7/site-packages Completely different scheme. > When I run NTPsec on macOS I export PYTHONPATH. Should no longer be necessary, since Eric's patch doesn't affect my fix in the OSX case. It does, however, screw up BSD. E.g., on FreeBSD: >>> sysconfig.get_python_lib() '/usr/local/lib/python2.7/site-packages' >>> sysconfig.get_python_lib().replace('/usr', '/usr/local') '/usr/local/local/lib/python2.7/site-packages' It needs a bit more restraint in patching the path. Maybe it should only apply to paths where the first element is 'usr' and the second element is not 'local'. 
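The restraint suggested just above can be sketched as follows. This is a sketch under stated assumptions, not the actual massage() patch in the NTPsec tree: the function name `to_local` is hypothetical, and it rewrites a path only when its first component is 'usr' and its second is not 'local'.

```python
def to_local(path):
    """Map /usr/X to /usr/local/X and leave every other path untouched.

    Hypothetical helper for illustration only.
    """
    parts = path.split('/')
    # An absolute path splits as ['', 'usr', 'lib', ...].
    if (len(parts) > 2 and parts[0] == ''
            and parts[1] == 'usr' and parts[2] != 'local'):
        return '/'.join(parts[:2] + ['local'] + parts[2:])
    return path


# Paths taken from the get_python_lib() results reported in this thread.
for p in ('/usr/lib/python2.7/dist-packages',        # Ubuntu
          '/usr/local/lib/python2.7/site-packages',  # FreeBSD
          '/Library/Python/2.7/site-packages'):      # OSX
    print(p, '->', to_local(p))
```

Working on path components rather than a substring replace is what keeps the FreeBSD result from turning into /usr/local/local/..., and it leaves the OSX path alone because that path does not start with 'usr'.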
> And AFAIK, NTPsec is now, once again, installing in the proper location > on all OS. I no longer see an issue here. Well, *I* see an issue until setting PYTHONPATH is rendered unnecessary on all platforms, which is not yet the case. Fred Wright From gem at rellim.com Thu Sep 28 00:00:43 2017 From: gem at rellim.com (Gary E. Miller) Date: Wed, 27 Sep 2017 17:00:43 -0700 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> <20170927213656.GD22484@thyrsus.com> <20170927151610.5c920b57@spidey.rellim.com> <20170927160306.5d0b9f75@spidey.rellim.com> Message-ID: <20170927170043.0f6dcb45@spidey.rellim.com> Yo Fred! On Wed, 27 Sep 2017 16:31:12 -0700 (PDT) Fred Wright via devel wrote: > On Wed, 27 Sep 2017, Gary E. Miller via devel wrote: > > Fred Wright via devel wrote: > > > It only matters > > > for Linux, since get_python_lib() returns FHS-compliant results on > > > *BSD, and on OSX the paths are so completely different that FHS > > > doesn't apply. > > > > Uh, lost me. macOS is very much FHS compliant. > > MacPro:~ fw$ /usr/bin/python -c 'from distutils import sysconfig; > print(sysconfig.get_python_lib())' /Library/Python/2.7/site-packages Nothing in FHS says an OS can't use directories not in the FHS. What you neglected to show was where NTPsec installed itself? By default, my NTPsec installed here: -rw-r--r-- 1 root wheel 71992 Sep 27 16:01 /opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/ntp/packet.py Which is a little odd, but no way in conflict with the FHS. > Completely different scheme. Check out your root directory. It still has /bin, /dev, /etc, /opt, /sbin, /usr, /var, and most of the other things you would expect. And they are used in the way the FHS would approve of. 
So, I still fail to see an issue here. > > When I run NTPsec on macOS I export PYTHONPATH. > > Should no longer be necessary, since Eric's patch doesn't affect my > fix in the OSX case. You really think I did not test it before my last reply? I got NTPsec pieces spread all over now on macOS since the last few changes. > It does, however, screw up BSD. E.g., on FreeBSD: > > >>> sysconfig.get_python_lib() > '/usr/local/lib/python2.7/site-packages' > >>> sysconfig.get_python_lib().replace('/usr', '/usr/local') > '/usr/local/local/lib/python2.7/site-packages' I fail to see the problem there? Or what you are trying to show? Or why I should care? > It needs a bit more restraint in patching the path. Why would you even do that? You totally lost me. > > And AFAIK, NTPsec is now, once again, installing in the proper > > location on all OS. I no longer see an issue here. > > Well, *I* see an issue until setting PYTHONPATH is rendered > unnecessary on all platforms, which is not yet the case. Sigh, I hate to sound like a broken record, but we have all agreed several times today that we would like to do without PYTHONPATH. No need to say it is a point of difference, it is not. How about we work on that instead? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Thu Sep 28 00:51:22 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 27 Sep 2017 20:51:22 -0400 Subject: Fix for Python library path problem In-Reply-To: <20170927152735.7114132d@spidey.rellim.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927202830.GF19136@thyrsus.com> <20170927212123.GC22484@thyrsus.com> <20170927142819.76f18cad@spidey.rellim.com> <20170927214055.GE22484@thyrsus.com> <20170927152735.7114132d@spidey.rellim.com> Message-ID: <20170928005122.GA25427@thyrsus.com> Gary E. Miller via devel : > > I thought you just told me the opposite, > > Sorry if I was not clear the first time. > > > and I *know* Jason did. > > I sugggest you reconfirm with him. > > Re-reading the emails I see a bunch of double negatives going around. Great. Now I don't think I know *anything.* New proposal coming. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. -------------- next part -------------- A non-text attachment was scrubbed... Name: signature.asc Type: application/pgp-signature Size: 811 bytes Desc: not available URL: From esr at thyrsus.com Thu Sep 28 00:54:44 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Wed, 27 Sep 2017 20:54:44 -0400 Subject: Fix for Python library path problem In-Reply-To: <20170927151610.5c920b57@spidey.rellim.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> <20170927213656.GD22484@thyrsus.com> <20170927151610.5c920b57@spidey.rellim.com> Message-ID: <20170928005444.GB25427@thyrsus.com> Gary E. Miller via devel : > What we see that pip does, is edit the sys.path to include the > location an egg is installed. That looks to me like a method > to go forward with. 
It looks to me like a fscking disaster, introducing yet another complication that will cause us endless headaches. What, now whether an NTPsec install works is going to depend on whether pip has been run on some random package with an egg? No, no, and no. That's horrible. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From fw at fwright.net Thu Sep 28 02:02:23 2017 From: fw at fwright.net (Fred Wright) Date: Wed, 27 Sep 2017 19:02:23 -0700 (PDT) Subject: Fix for Python library path problem In-Reply-To: <20170928005444.GB25427@thyrsus.com> References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> <20170927213656.GD22484@thyrsus.com> <20170927151610.5c920b57@spidey.rellim.com> <20170928005444.GB25427@thyrsus.com> Message-ID: On Wed, 27 Sep 2017, Eric S. Raymond via devel wrote: > Gary E. Miller via devel : > > What we see that pip does, is edit the sys.path to include the > > location an egg is installed. That looks to me like a method > > to go forward with. > > It looks to me like a fscking disaster, introducing yet another > complication that will cause us endless headaches. > > What, now whether an NTPsec install works is going to depend on whether > pip has been run on some random package with an egg? No, no, and no. > That's horrible. Let's step back and look at what the actual requirements are. First of all, in order for an import to work in Python, the directory containing the module needs to be in sys.path at the time of the import.
This requirement can be met in one of two ways: 1) Placing the module in a directory which is already in sys.path by default. 2) Arranging to augment sys.path by some means prior to the import. One of the ways to do #1 is to use the path returned by get_python_lib() without the prefix option. This is what GPSD has done for years. There may be other possibilities in this area, but it doesn't look promising in the general case, if the goal is to come up with a "local" path. For approach #2, there are a few possibilities: 2.1) Require users to set up PYTHONPATH appropriately. There seems to be general agreement that this is a bad idea. 2.2) Make use of some sort of hook to augment sys.path, which is what pip appears to do. So far, this approach doesn't look promising, partly because the only such mechanisms seem to require that the hook itself be in a "non-local" location in order to be seen by Python before being applied. A chicken-and-egg problem. 2.3) Add code to the programs to augment sys.path prior to the import. This would need to be in all the programs; common code in a library would suffer from a chicken-and-egg problem. Since the target location is site-specific, the programs would need to be patched at build time with the correct path. This actually shouldn't be too hard, since they're already being copied by the substituter, as long as there are no gotchas with the substituter. It would work, but be ugly and nonstandard, though it could be limited to the cases where it's actually needed. Other suggestions? Fred Wright From esr at thyrsus.com Thu Sep 28 03:49:38 2017 From: esr at thyrsus.com (Eric S. 
Raymond) Date: Wed, 27 Sep 2017 23:49:38 -0400 Subject: Fix for Python library path problem In-Reply-To: References: <20170927202830.GF19136@thyrsus.com> <20170927212123.GC22484@thyrsus.com> <20170927142819.76f18cad@spidey.rellim.com> <20170927214055.GE22484@thyrsus.com> <20170927152735.7114132d@spidey.rellim.com> <20170928005122.GA25427@thyrsus.com> Message-ID: <20170928034938.GA26848@thyrsus.com> Jason Azze : > On Sep 27, 2017 8:51 PM, "Eric S. Raymond via devel" > wrote: > > Gary E. Miller via devel : > > > I thought you just told me the opposite, > > > > Sorry if I was not clear the first time. > > > > > and I *know* Jason did. > > > > I sugggest you reconfirm with him. > > > > Re-reading the emails I see a bunch of double negatives going around. > > Great. Now I don't think I know *anything.* > > > ESR provided two test lines. The first returned some path stuff for me. The > second returned an empty pair of square brackets. > > My reading of ESR's call for testing was that that was the hoped for result > and that if the second expression had returned anything, that would have > been bad. That is correct. Those lines were intended to test this rule: "If /usr/lib/X/Y is in sys.path, then /usr/local/lib/X/Y is in sys.path" Jason's test confirmed the rule. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From hmurray at megapathdsl.net Thu Sep 28 06:20:36 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 27 Sep 2017 23:20:36 -0700 Subject: Fix for Python library path problem In-Reply-To: Message from "Eric S. Raymond via devel" of "Wed, 27 Sep 2017 16:19:54 EDT." <20170927201954.GD19136@thyrsus.com> Message-ID: <20170928062036.BDD5040605C@ip-64-139-1-69.sjc.megapath.net> devel at ntpsec.org said: > That's right. 
What we can do, though, is win under the following > assumption: if /usr/lib/X/Y/ is in sys.path, so is /usr/local/lib/X/Y/. > Look at this from my system: Bad assumption, at least on Fedora: Python 2.7.13 (default, Sep 5 2017, 08:53:59) [GCC 7.1.1 20170622 (Red Hat 7.1.1-3)] on linux2 Type "help", "copyright", "credits" or "license" for more information. >>> import sys >>> sys.path ['', '/usr/lib/python27.zip', '/usr/lib64/python2.7', '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk', '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload', '/usr/lib64/python2.7/site-packages', '/usr/lib/python2.7/site-packages'] -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Thu Sep 28 06:35:23 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 27 Sep 2017 23:35:23 -0700 Subject: Fix for Python library path problem Message-ID: <20170928063523.88F9A40605C@ip-64-139-1-69.sjc.megapath.net> > Once again, I think pip has the answer: edit the sys.path. Why is editing sys.path better than using PYTHONPATH? -- These are my opinions. I hate spam. From hmurray at megapathdsl.net Thu Sep 28 06:50:51 2017 From: hmurray at megapathdsl.net (Hal Murray) Date: Wed, 27 Sep 2017 23:50:51 -0700 Subject: Fix for Python library path problem In-Reply-To: Message from Fred Wright via devel of "Wed, 27 Sep 2017 19:02:23 PDT." Message-ID: <20170928065051.B803340605C@ip-64-139-1-69.sjc.megapath.net> > 2.3) Add code to the programs to augment sys.path prior to the import. This > would need to be in all the programs; common code in a library would suffer > from a chicken-and-egg problem. The step that copies python code over to $build/main/... could do minor edits. There would be only one copy of the code that does that edit. Is the install location fixed at configure time, or can that be specified at install time? -- These are my opinions. I hate spam. 
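The rule tested earlier in this thread ("if /usr/lib/X/Y is in sys.path, then /usr/local/lib/X/Y is in sys.path") can be checked mechanically. What follows is a minimal sketch, not the test lines that were actually posted; the function name `missing_local_twins` is hypothetical, and matching /usr/lib64 as well as /usr/lib is an assumption that goes slightly beyond the rule as stated.

```python
import sys


def missing_local_twins(paths):
    """Return the /usr/lib... entries whose /usr/local/lib... twin is absent.

    Hypothetical helper, for illustration only. An empty result means the
    rule holds for the given path list.
    """
    present = set(paths)
    missing = []
    for p in paths:
        # Assumption: startswith("/usr/lib") also matches /usr/lib64.
        if p.startswith("/usr/lib"):
            twin = "/usr/local" + p[len("/usr"):]
            if twin not in present:
                missing.append(p)
    return missing


if __name__ == "__main__":
    # Check the interpreter that is running this script.
    print(missing_local_twins(sys.path))
```

On a path list like the Fedora one quoted above, which contains no /usr/local entries at all, every /usr/lib and /usr/lib64 entry would be reported as missing its twin.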
From fallenpegasus at gmail.com Thu Sep 28 18:47:43 2017 From: fallenpegasus at gmail.com (Mark Atwood) Date: Thu, 28 Sep 2017 18:47:43 +0000 Subject: Our last-minute mess In-Reply-To: <20170927113917.GA18175@thyrsus.com> References: <20170927020453.5F75C13A0206@snark.thyrsus.com> <20170927055949.0060640605C@ip-64-139-1-69.sjc.megapath.net> <20170927113917.GA18175@thyrsus.com> Message-ID: My inclination is to keep his patch, document the lack of FHS compliance, and roadmap a fix to get_python_lib, possibly by nudging the WAF or python communities to write it. And we again specifically thank Fred for his patch. ..m On Wed, Sep 27, 2017 at 4:39 AM Eric S. Raymond via devel wrote: > Hal Murray : > > > > > I'd like to hear from the senior devs (and anyone else with something > > > intelligent to say!) on this. > > > > You need a steering committee to represent the customers on things like > this. > > Good idea. I'll keep that in mind as we get more customers. > > > I didn't find enough info in the wiki page to enlighten me. I get the > > general idea, but I don't know the tag that describes out software. Is > it > > real system software? What about devel mode? > > It's what FHS consider "non-essential system software" - needs to run as > root > at boot but is not required for single-user recovery mode. I couldn't > find a > reference to "devel mode" in the FHS spec, so I can't answer that question. > > > Distros aren't going to use our install script. They don't want to > install > > stuff, they want to package it up in a .deb or .rpm or whatever. How do > we > > get them the info they need in a format they can use? > > That's what the packaging/ directory is for. It's supposed to contain both > meta data examples and documentation that is guidance for packagers. > > > What are the plans for splitting out the python stuff? Do most distros > > include Python in their basic package? > > Python is effectively universal at this point. 
> > The rational partitioning is probaly (1) core daemon alone, (2) ntpq + > ntpmon, > (3) everything else. > -- > Eric S. Raymond > > My work is funded by the Internet Civil Engineering Institute: > https://icei.org > Please visit their site and donate: the civilization you save might be > your own. > > > _______________________________________________ > devel mailing list > devel at ntpsec.org > http://lists.ntpsec.org/mailman/listinfo/devel > -- Mark Atwood http://about.me/markatwood +1-206-604-2198 Mobile & Signal -------------- next part -------------- An HTML attachment was scrubbed... URL: From gem at rellim.com Thu Sep 28 19:01:39 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 28 Sep 2017 12:01:39 -0700 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> <20170927213656.GD22484@thyrsus.com> <20170927151610.5c920b57@spidey.rellim.com> <20170928005444.GB25427@thyrsus.com> Message-ID: <20170928120139.791794ef@spidey.rellim.com> Yo Fred! On Wed, 27 Sep 2017 19:02:23 -0700 (PDT) Fred Wright via devel wrote: > First of all, in order for an import to work in Python, the directory > containing the module needs to be in sys.path at the time of the > import. This requirement can be met in one of two ways: Yup. > One of the ways to do #1 is to use the path returned by > get_python_lib() without the prefix option. This is what GPSD has > done for years. And caused subtl problems for years. > 2.2) Make use of some sort of hook to augment sys.path, which is what > pip appears to do. So far, this approach doesn't look promising, > partly because the only such mechanisms seem to require that the hook > itself be in a "non-local" location in order to be seen by Python > before being applied. A chicken-and-egg problem. Huh? 
Looks simple to me. Looks like the preferred solution. Since pip, and many other programs do it, that is a lot of good precedent. > 2.3) Add code to the programs to augment sys.path prior to the import. Gack. Breaks in a large number of ways. It is very common for packagers to build to one path, package the results, then install elsewhere. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Thu Sep 28 19:15:31 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 28 Sep 2017 12:15:31 -0700 Subject: Fix for Python library path problem In-Reply-To: <20170928063523.88F9A40605C@ip-64-139-1-69.sjc.megapath.net> References: <20170928063523.88F9A40605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170928121531.4ed7362c@spidey.rellim.com> Yo Hal! On Wed, 27 Sep 2017 23:35:23 -0700 Hal Murray wrote: > > Once again, I think pip has the answer: edit the sys.path. > > Why is editing sys.path better than using PYTHONPATH? Editing the config file that stores sys.path is persistent, and used by all python that uses that path. PYTHONPATH must be in the current environment, thus not available, by default, to cron jobs. And when you change from python2 to python3 the PYTHONPATH will need to be changed. In contrast, when you change from python2 to python3, the correct config file for the current version is read so the right ntp is used.
For example, I have NTPsec installed for Python 2.7 and Python 3.5: /usr/local/lib64/python2.7/site-packages/ntp/ /usr/local/lib64/python3.4/site-packages/ntp/ To run from python 2, using PYTHONPATH, I need to do: export PYTHONPATH=/usr/local/lib64/python2.7/site-packages/ntp/ python2 ntpq To run Python3: export PYTHONPATH=/usr/local/lib64/python3.4/site-packages/ntp/ python3 ntpq By contrast, if the config file that is read by the current python has the correct paths, like pip does, then I only need to do: python2 ntpq python3 ntpq This works for the dozens of pip packages I have installed. Are the pip coders smarter than us? We even have the pip code to steal from. For further info, you'll find much interesting reading here: /usr/lib64/python3.4/site.py /usr/lib64/python2.7/site.py /usr/lib64/python3.5/site.py As a teaser, here is the top line: """Append module search paths for third-party packages to sys.path. Seems apropos? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can't measure it, you can't improve it." - Lord Kelvin From gem at rellim.com Thu Sep 28 19:16:42 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 28 Sep 2017 12:16:42 -0700 Subject: Fix for Python library path problem In-Reply-To: <20170928065051.B803340605C@ip-64-139-1-69.sjc.megapath.net> References: <20170928065051.B803340605C@ip-64-139-1-69.sjc.megapath.net> Message-ID: <20170928121642.2aef1083@spidey.rellim.com> Yo Hal! On Wed, 27 Sep 2017 23:50:51 -0700 Hal Murray via devel wrote: > Is the install location fixed at configure time, or can that be > specified at install time? None of the above.
Packagers do it after the install to a temp location. RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From fw at fwright.net Fri Sep 29 02:25:06 2017 From: fw at fwright.net (Fred Wright) Date: Thu, 28 Sep 2017 19:25:06 -0700 (PDT) Subject: Fix for Python library path problem Message-ID: On Tue, 26 Sep 2017, Gary E. Miller via devel wrote (different thread): > On Tue, 26 Sep 2017 22:04:53 -0400 (EDT) > "Eric S. Raymond via devel" wrote: > > > 2. Keep Fred's patch. Ship 1.0 with FHS non-conformance as a known > > and documented bug. > > Gack. Opening a tech support nightmare. > > We constanlty have issues with conflicting system installed and user > installed ntpd. it will be a lot of fun when the distro updates > ntpd and breaks the user installed ntpd. That was not a problem > before this patch. Interestingly enough, this last paragraph is both completely irrelevant to the issue at hand, and simultaneously gets to the crux of the matter. :-) The whole issue being discussed here is the install location for *Python libraries*. Nothing else. Since the classic NTP package doesn't use Python at all, a conflict in this area is impossible. There might, of course, be conflicts in *programs*, *config files*, or whatever, but that has absolutely nothing to do with the Python library path. *However*, once distros start including ntpsec instead of classic ntpd, this sort of conflict will become possible, and FHS compliance will matter. 
But if we can assume that no distro is including a pre-1.0 ntpsec, then the conflict doesn't exist for 1.0 by definition. If we can punt on FHS compliance for 1.0, then that removes the time pressure for figuring out the right way to get Linux Python to play nicely with FHS. > > 3. Hold 1.0 until we can write a replacement get_python_lib() that > > works right (e.g. produces an FHS-conformant path set by default.) > > Before someone rewrites get_python_lib() we better agree on what it > should do. I suspect the solution will require much more than just > changes to get_python_lib(). Agreed. On Thu, 28 Sep 2017, Gary E. Miller via devel wrote: > On Wed, 27 Sep 2017 19:02:23 -0700 (PDT) > Fred Wright via devel wrote: > > > One of the ways to do #1 is to use the path returned by > > get_python_lib() without the prefix option. This is what GPSD has > > done for years. > > And caused subtl problems for years. Can you name any actual problems, aside from offending the FHS gods? If it caused so many problems, why were you apparently unaware of it until I pointed it out? :-) > > 2.2) Make use of some sort of hook to augment sys.path, which is what > > pip appears to do. So far, this approach doesn't look promising, > > partly because the only such mechanisms seem to require that the hook > > itself be in a "non-local" location in order to be seen by Python > > before being applied. A chicken-and-egg problem. > > Huh? Looks simple to me. Looks like the preferred solution. > > Since pip, and many other programs do it, that is a lot of good precendent. Precedent, yes. Full understanding of the mechanism and constraints, no. > > 2.3) Add code to the programs to augment sys.path prior to the import. > > Gack. Breaks in a large number of ways. It is very common for > packaers to build to one path, package the results, then install > elsewhere. It would naturally be based on the *final* install location, not the intermediate. 
I believe packaging systems normally know the final location at package build time, but I could be wrong about that. If the final install location isn't known at the time the package is built, then *any* kind of sys.path adjustment would have to be handled by the packaging system, since the ntpsec build scripts would have no way to know what to do. This applies whether one is patching code, generating special hook files, or whatever. I'm not saying that I'm especially fond of this solution, but it is something that would work. Another possibility that occurred to me is placing the library directory in the *program* directory, e.g., /usr/local/bin/ntp/, since module directories parallel to a Python program are always recognized automatically by Python. This would work, but it's rather nonstandard. If ntpsec assumes that it "owns" all paths of the form /bin/ntp*, then this wouldn't create any new conflicts, and AFAIK the NTP suite has never had a program just called "ntp", anyway. Again, a not very flavorful, but functional solution. I attempted to see if there's any precedent for this, but the systems I have here don't seem to be terribly useful examples. My Ubuntu install has exactly one program in /usr/local/bin/. My CentOS and Fedora installs don't even have /usr/local/bin/. My Debian installs have lots of things in /usr/local/bin/, but the only Python code is either from ntpsec or GPSD, so it's not really an "outside opinion". One of the things I've discovered in looking at this stuff is that having code that relies on looking at sys.path would be fragile, for at least two reasons: 1) Some directories get added by installs of other packages, but relying on any such directories would break if the relevant packages were removed. 2) Directories that don't exist don't get added. But the install process creates directories as needed.
So if one came up with a directory that Python knows about but doesn't currently exist, it would appear from looking at sys.path at configure time that Python doesn't know about it, even though it might work perfectly well after the install. Note that a workaround for #1 is to launch Python with the -S option to suppress processing site.py, but that also may exclude some directories that *aren't* package-specific. BTW, it's worth pointing out that this entire argument is over the *default* value of PYTHONDIR (and technically also PYTHONARCHDIR, but that's currently unused as noted in the comment in pylib/wscript). Any user or packaging system is free to supply a different value, either via environment or via option. At that point, it becomes the responsibility of the user or packaging system to ensure that the specified directory is either valid by default, or will be made valid when it's needed. In fact, I suspect packaging systems would tend to specify a lot of such things explicitly in their own scripts, anyway, so this is mostly about getting reasonable default behavior for a "bare" configure/build/install. Fred Wright From gem at rellim.com Fri Sep 29 02:33:27 2017 From: gem at rellim.com (Gary E. Miller) Date: Thu, 28 Sep 2017 19:33:27 -0700 Subject: Fix for Python library path problem In-Reply-To: References: Message-ID: <20170928193327.34924870@spidey.rellim.com> Yo Fred! On Thu, 28 Sep 2017 19:25:06 -0700 (PDT) Fred Wright via devel wrote: > Interestingly enough, this last paragraph is both completely > irrelevant to the issue at hand, and simultaneously gets to the crux > of the matter. :-) Always my intention . > The whole issue being discussed here is the install location for > *Python libraries*. Nothing else. Clearly we are in parallel universes. I was pretty sure that everyone agreed that today's installation locations were agreeble to everyone. If not, then please be specific on how today's default differs from your desired location. 
And /usr/lib is NOT an option if ntpd is in /usr/local/bin/ > Since the classic NTP package > doesn't use Python at all, a conflict in this area is impossible. And totally unrelated to anything I've said. So why do you bring it up? > *However*, once distros start including ntpsec instead of classic > ntpd, this sort of conflict will become possible, and FHS compliance > will matter. Already happened. So, we agree, FHS compliance, to keep system and local copies apart is a valid goal? > > Before someone rewrites get_python_lib() we better agree on what it > > should do. I suspect the solution will require much more than just > > changes to get_python_lib(). > > Agreed. Good. Next? RGDS GARY --------------------------------------------------------------------------- Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703 gem at rellim.com Tel:+1 541 382 8588 Veritas liberabit vos. -- Quid est veritas? "If you can?t measure it, you can?t improve it." - Lord Kelvin -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 488 bytes Desc: OpenPGP digital signature URL: From esr at thyrsus.com Fri Sep 29 08:28:23 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 29 Sep 2017 04:28:23 -0400 Subject: Fix for Python library path problem In-Reply-To: References: Message-ID: <20170929082823.GA28823@thyrsus.com> Fred Wright via devel : > > We constanlty have issues with conflicting system installed and user > > installed ntpd. it will be a lot of fun when the distro updates > > ntpd and breaks the user installed ntpd. That was not a problem > > before this patch. > > Interestingly enough, this last paragraph is both completely irrelevant to > the issue at hand, and simultaneously gets to the crux of the matter. :-) > > The whole issue being discussed here is the install location for *Python > libraries*. Nothing else. 
Since the classic NTP package doesn't use > Python at all, a conflict in this area is impossible. There might, of > course, be conflicts in *programs*, *config files*, or whatever, but that > has absolutely nothing to do with the Python library path. Fred, this logic had not escaped me. > *However*, once distros start including ntpsec instead of classic ntpd, > this sort of conflict will become possible, and FHS compliance will > matter. But if we can assume that no distro is including a pre-1.0 > ntpsec, then the conflict doesn't exist for 1.0 by definition. If we can > punt on FHS compliance for 1.0, then that removes the time pressure for > figuring out the right way to get Linux Python to play nicely with FHS. I've been semi-quiet because I've been researching one possible alternative, but I think that wasn't viable and there's only one possibility left. More later today. -- Eric S. Raymond My work is funded by the Internet Civil Engineering Institute: https://icei.org Please visit their site and donate: the civilization you save might be your own. From esr at thyrsus.com Fri Sep 29 18:47:27 2017 From: esr at thyrsus.com (Eric S. Raymond) Date: Fri, 29 Sep 2017 14:47:27 -0400 Subject: Fix for Python library path problem In-Reply-To: References: <20170927142157.5214613A0206@snark.thyrsus.com> <20170927201954.GD19136@thyrsus.com> <20170927132932.43bcdfdd@spidey.rellim.com> <20170927204354.GA22161@thyrsus.com> <20170927134933.19070c11@spidey.rellim.com> <20170927213656.GD22484@thyrsus.com> <20170927151610.5c920b57@spidey.rellim.com> <20170928005444.GB25427@thyrsus.com> Message-ID: <20170929184727.GA5677@thyrsus.com> Fred Wright via devel : > Let's step back and look at what the actual requirements are. > > First of all, in order for an import to work in Python, the directory > containing the module needs to be in sys.path at the time of the import.
> This requirement can be met in one of two ways:
>
> 1) Placing the module in a directory which is already in sys.path by
> default.
>
> 2) Arranging to augment sys.path by some means prior to the import.
>
> One of the ways to do #1 is to use the path returned by get_python_lib()
> without the prefix option. This is what GPSD has done for years. There
> may be other possibilities in this area, but it doesn't look promising in
> the general case, if the goal is to come up with a "local" path.
>
> For approach #2, there are a few possibilities:
>
> 2.1) Require users to set up PYTHONPATH appropriately. There seems to be
> general agreement that this is a bad idea.
>
> 2.2) Make use of some sort of hook to augment sys.path, which is what pip
> appears to do. So far, this approach doesn't look promising, partly
> because the only such mechanisms seem to require that the hook itself be
> in a "non-local" location in order to be seen by Python before being
> applied. A chicken-and-egg problem.
>
> 2.3) Add code to the programs to augment sys.path prior to the import.
> This would need to be in all the programs; common code in a library would
> suffer from a chicken-and-egg problem. Since the target location is
> site-specific, the programs would need to be patched at build time with
> the correct path. This actually shouldn't be too hard, since they're
> already being copied by the substituter, as long as there are no gotchas
> with the substituter. It would work, but be ugly and nonstandard, though
> it could be limited to the cases where it's actually needed.
>
> Other suggestions?

After researching and trying these out for the last couple of days, I have
concluded that approach #2, attempting to adjust sys.path, is doomed.

site.py won't do it. It adjusts the load path at Python startup time, but
only in accordance with sys.prefix at the time Python was built. That's
going to be /usr for Python installed from the package system.
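As a sketch of the sys.prefix point above (not code from the ntpsec tree),
the snippet below compares the library directory the running interpreter was
configured with against the one a different prefix such as /usr/local would
give, and reports whether each is actually on sys.path. The helper name
lib_dir_on_path is made up for illustration, and sysconfig.get_path() stands
in for the distutils get_python_lib() discussed in this thread, since
distutils is no longer shipped with Python 3.12 and later.

```python
import sys
import sysconfig


def lib_dir_on_path(prefix=None):
    """Return (directory, on_sys_path) for the pure-Python library directory.

    With prefix=None the interpreter's own configuration is used. Otherwise
    the directory is recomputed as if the install prefix were `prefix`.
    lib_dir_on_path is a hypothetical helper, not part of any library.
    """
    if prefix is None:
        directory = sysconfig.get_path("purelib")
    else:
        # Override the base variables the install scheme is expanded from.
        directory = sysconfig.get_path(
            "purelib", vars={"base": prefix, "platbase": prefix}
        )
    return directory, directory in sys.path


if __name__ == "__main__":
    print("sys.prefix:", sys.prefix)
    print("configured:", lib_dir_on_path())
    print("/usr/local:", lib_dir_on_path("/usr/local"))
```

Whether the /usr/local directory shows up on sys.path varies by distro and by
how Python was built, so the output is best read as a diagnostic for one
machine rather than a general rule.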
We only get help from this if the Python instance was built from source
with a /usr/local prefix.

The 2.x alternatives are all complexity traps. I rejected 2.2 immediately
and experimented with ways to implement 2.3. Every present and future
client would have to be templated - you can't do it in the support library
for reasons Fred has already observed.
--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

From esr at thyrsus.com Fri Sep 29 18:48:09 2017
From: esr at thyrsus.com (Eric S. Raymond)
Date: Fri, 29 Sep 2017 14:48:09 -0400
Subject: Fix for Python library path problem
In-Reply-To: <20170928062036.BDD5040605C@ip-64-139-1-69.sjc.megapath.net>
References: <20170927201954.GD19136@thyrsus.com>
 <20170928062036.BDD5040605C@ip-64-139-1-69.sjc.megapath.net>
Message-ID: <20170929184809.GB5677@thyrsus.com>

Hal Murray :
>
> devel at ntpsec.org said:
> > That's right. What we can do, though, is win under the following
> > assumption: if /usr/lib/X/Y/ is in sys.path, so is /usr/local/lib/X/Y/.
> > Look at this from my system:
>
> Bad assumption, at least on Fedora:
>
> Python 2.7.13 (default, Sep 5 2017, 08:53:59)
> [GCC 7.1.1 20170622 (Red Hat 7.1.1-3)] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> import sys
> >>> sys.path
> ['', '/usr/lib/python27.zip', '/usr/lib64/python2.7',
> '/usr/lib64/python2.7/plat-linux2', '/usr/lib64/python2.7/lib-tk',
> '/usr/lib64/python2.7/lib-old', '/usr/lib64/python2.7/lib-dynload',
> '/usr/lib64/python2.7/site-packages', '/usr/lib/python2.7/site-packages']

Yes. The approach in my last commit won't generalize.
--
Eric S. Raymond

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.

From gem at rellim.com Fri Sep 29 19:55:13 2017
From: gem at rellim.com (Gary E.
Miller)
Date: Fri, 29 Sep 2017 12:55:13 -0700
Subject: Fix for Python library path problem
In-Reply-To: <20170929184727.GA5677@thyrsus.com>
References: <20170927142157.5214613A0206@snark.thyrsus.com>
 <20170927201954.GD19136@thyrsus.com>
 <20170927132932.43bcdfdd@spidey.rellim.com>
 <20170927204354.GA22161@thyrsus.com>
 <20170927134933.19070c11@spidey.rellim.com>
 <20170927213656.GD22484@thyrsus.com>
 <20170927151610.5c920b57@spidey.rellim.com>
 <20170928005444.GB25427@thyrsus.com>
 <20170929184727.GA5677@thyrsus.com>
Message-ID: <20170929125513.29131478@spidey.rellim.com>

Yo Eric!

On Fri, 29 Sep 2017 14:47:27 -0400
"Eric S. Raymond via devel" wrote:

> site.py won't do it.

I didn't suggest that site.py was the answer. My suggestion was that the
comments in site.py lead to the answer.

> It adjusts the load path at Python startup
> time, but only in accordance with sys.prefix at the time Python was
> built. That's going to be /usr for Python installed from the package
> system. We only get help from this if the Python instance was built
> from source with a /usr/local prefix.

I admit to not understanding completely how this works, but others make
this work. Certainly pip makes it work. pip gives me many options on how
to install python libs. The cool part is that I can use pip to install X
in python2.7, then again in python3.4, and they both work!

I'll dig into this idea. Others do not seem to have this problem.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
gem at rellim.com Tel:+1 541 382 8588

Veritas liberabit vos. -- Quid est veritas?
"If you can't measure it, you can't improve it." - Lord Kelvin
-------------- next part --------------
A non-text attachment was scrubbed...
From hmurray at megapathdsl.net Sat Sep 30 05:52:10 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 29 Sep 2017 22:52:10 -0700
Subject: Our last-minute mess
In-Reply-To: Message from "Eric S. Raymond via devel" of "Wed, 27 Sep 2017 07:39:17 EDT." <20170927113917.GA18175@thyrsus.com>
Message-ID: <20170930055210.6ED0F40605C@ip-64-139-1-69.sjc.megapath.net>

>> What are the plans for splitting out the python stuff? Do most distros
>> include Python in their basic package?

> Python is effectively universal at this point.

My notes for setting up NetBSD and FreeBSD and OpenBSD include installing
python. (and setting up some links which may or may not be necessary now but
were important enough that I saved the recipe)

--
These are my opinions. I hate spam.

From hmurray at megapathdsl.net Sat Sep 30 06:18:47 2017
From: hmurray at megapathdsl.net (Hal Murray)
Date: Fri, 29 Sep 2017 23:18:47 -0700
Subject: Testing
Message-ID: <20170930061847.424FD40605C@ip-64-139-1-69.sjc.megapath.net>

Should we test buildprep? I think that means setting up a clean minimal
system for each distro we claim to support. Maybe we should document what we
set up - the ISO used to make the CD/Thumb-drive and any options selected if
there was a choice.

Can we build the binaries and install them on a bare system to see if they
really work and/or to build a dependencies list?

Should we build a matrix of distro and refclock? Some drivers have options
to support various devices that are similar but different enough to be worth
testing.

--
These are my opinions. I hate spam.
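One way to picture the distro-by-refclock matrix raised above is as a cross
product of two lists. This is only a sketch under stated assumptions: the
system names are ones mentioned in this thread (not a statement of what is
supported), the refclock names and the build_matrix helper are hypothetical
placeholders, and passing a single driver name to --refclock is an
assumption, since the only value shown earlier in the thread is "all".

```python
from itertools import product

# Systems named in this thread; not a claim about official support.
DISTROS = ["Fedora", "NetBSD", "FreeBSD", "OpenBSD"]

# Hypothetical placeholder names, to be replaced with real driver names.
REFCLOCKS = ["refclock-a", "refclock-b"]


def build_matrix(distros, refclocks):
    """Return one test case per (distro, refclock) pair.

    build_matrix is an illustrative helper, not part of the ntpsec tree.
    """
    return [
        {
            "distro": distro,
            "refclock": refclock,
            # Assumed form: one driver per configure run.
            "configure": "./waf configure --refclock=%s" % refclock,
        }
        for distro, refclock in product(distros, refclocks)
    ]


if __name__ == "__main__":
    for case in build_matrix(DISTROS, REFCLOCKS):
        print(case["distro"], case["refclock"], case["configure"])
```

Each entry could also record the ISO and install options used for that
system, which would cover the documentation question in the same table.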