My pre-1.0 wishlist

Sat Jun 4 04:38:51 UTC 2016

Daniel Franke <dfoxfranke at gmail.com>:
> There's been some chatter on this list lately about whether we're
> ready for a 1.0 release. My opinion is "not yet". Here is my wishlist
> for things that I want us to get done before we go to 1.0, in rough
> order of decreasing importance.

Thank you, Daniel, this is an excellent discussion to be having now
and I'm very glad you launched it.  Unfortunately, I don't think
you're going to like my answers much.

> 1. End-to-end test automation
> 
> While many of us have been regularly exercising NTPsec and
> witnessing excellent stability, I suspect that there are a great many
> obscure features that nobody is testing and I worry that some of these
> have bitrotted. When was the last time anybody tried out symmetric
> interleaved mode? Orphan mode? Pool mode? The huff'n'puff filter? We
> need automation that covers this stuff and keeps us from shipping
> broken releases. And if some feature is too obscure or useless to put
> the energy into automating it, we should rip it out.

You cannot *possibly* be wishing for this outcome harder than I was
for most of the last year.  Sadly, I can't even estimate when we'll
get there.  I had a plan, it failed twice, and I'm now scrambling
for ways to recover.

We have three major possible pathways to this kind of test coverage.  One
is Mozilla rr, another is some kind of serious simplification attack
on the hairball in the network code, and the third is somebody managing to
grok the hairball in its full, present, hideous glory.

False modesty would serve our planning needs poorly so I'm just going
to say straight out that if *I* can't get my head all the way around
the fscking hairball (and so far I clearly can't) the list of people
who could plausibly manage it is pretty damned short.  One of our rays
of hope is Steve Summit if we can get him fully on board - he might be
one of them.

I'm working the simplification attack right now, in a relatively indirect
way, and plan to start learning rr as soon as I ship version 1.0 of the
Microserver HOWTO.  

But there is *no* plausible scenario in which any of these things
happens quickly. I don't think we can hold 1.0 on that.  Which sucks
and I hate it, but reality is reality.

> 2. Better process and communication around vulnerabilities inherited
> from NTP Classic
> 
> We were slow in getting vuln fixes from ntp-4.2.8p7 into NTPsec
> because we got blindsided by so many of them. This time around with
> 4.2.8p8 we did far better, but I want us to mature to the point where
> our release announcements can be simultaneous. Part of this will have
> to involve improving our internal communication and release
> engineering and the other part will have to involve better
> collaboration with NTP.org. I know exactly how problematic the
> politics are around the latter, but we have got to find a way to make
> it happen. When we start pushing for broader NTPsec adoption, the rest
> of the world isn't going to care about our political squabbles.
> They're going to care that NTP Classic fixed a critical vulnerability
> N hours ago, and that there is no patch yet for NTPsec.

I think better internal process would be a good plan, though not likely
to be sufficient unless we have someone other than many-tasked me
willing to accept responsibility for forward-porting fixes.

Better collaboration with NTP.org...well, that would sure be nice and
I'd be willing, but I fear pigs will achieve escape velocity before we
get any actual cooperation from them.  I won't be more specific on a
list with publicly-accessible archives.

Do you really think it's realistic to block 1.0 on that?  I don't.

> 3. All roadmapped featurectomies
> 
> The more users we have, the more loudly people are going to complain
> about any features we remove. At our F2F we committed to eventually
> removing saveconfig and client-side broadcast mode. We shouldn't
> release 1.0 before we've completed that work, and furthermore we
> should look around and decide what *else* we don't want to commit to
> supporting for years to come.

I have a little list, I have a little list...

Unfortunately you're entering another political minefield here.  At
least two features I want to remove would cause a political shitstorm
if word got out that we were thinking of diking them out *before* I
have solid white papers to show they're nugatory.

This is a major motivation for the test farm - I need to have
*evidence* that certain things are not contributing above the noise
floor.  Which is why I've let myself spend over a month on the HOWTO
and test farm - if it was only for the HOWTO I'd have run to the end of
justified investment maybe two weeks back.

Again, this is not going to happen quickly.

> 4. Any planned backward-incompatible changes to the configuration language
> 
> We've been talking for a long time about replacing the 'restrict'
> directive and all the language around what keys have what
> authorizations with something sane. We should get this done before
> 1.0.

Fair enough. Send an RFC to the list.  This one is a straight-up coding
and documentation job and can be done in bounded time.

> 5. Formal processes around code review
> 
> Last year we looked at Phabricator and decided that for the time being
> it was too much of an obstacle for our development pace. I think that
> was the right call at the time, but eventually I want to revisit this.
> We don't have to choose that tool in particular, but if we're still
> moving too quickly for a merge-based workflow with code-reviewer
> sign-off before every master commit to be practical, then we're also
> not yet stable enough for 1.0.

I agree with the intent to revisit using something Phabricator-like.
I disagree that that should be coupled with 1.0.  Stability ought
to be measured by, er, *stability*, not by whether we've gone through
that process.

> 6. An NTP Classic -> NTPsec migration guide
> 
> Documenting all incompatible changes since the fork.

Reasonable.  It won't be a long guide.

Basically, I agree with holding 1.0 on the things you've cited that
can be realistically time-bounded - 4 and 6 in particular.  I do not
agree that we should hold 1.0 on changes for which we can't even
plausibly estimate completion time right now.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>