My pre-1.0 wishlist

Eric S. Raymond esr at thyrsus.com
Sun Jun 5 13:39:31 UTC 2016


Daniel Franke <dfoxfranke at gmail.com>:
> On 6/4/16, Eric S. Raymond <esr at thyrsus.com> wrote:
> 
> > We have three major possible pathways to this kind of test coverage.  One
> > is Mozilla rr, the other is some kind of serious simplification attack
> > on the hairball in the network code, and the third is somebody managing to
> > grok the hairball in its full, present, hideous glory.
> 
> We don't need anything nearly as clever as Mozilla rr or TESTFRAME to
> achieve the kind of automation I'm asking for. You want to test things
> at the granularity of system calls (rr) or some abstraction that goes
> deeper into the code (TESTFRAME). I want to test things at the
> *user-visible* level. Automate the process of supplying configuration
> files that exercise a variety of functionality, running them on real
> hardware and real networks, and monitoring the results with ntpq. That
> is, take the sort of testing that we're already doing by hand and make
> it systematic and automatic.

I don't believe you've thought the problem all of the way through.  All plans
to test in a normal environment founder on one simple problem: you can't
beat behavioral replication out of *this* software without spoofing the
clock it sees.

Unless you set up behavioral replicability (that is, an environment in
which a known sequence of clock readings, I/O events, and other
syscalls leads to another known sequence, or at least to correct
recognition features of same, like ntpq -p showing what you expect)
you don't have testing, because you don't know what output features
discriminate between success and failure of the test.
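
To be concrete about "recognition features": the harness has to reduce
"what you expect" to something machine-checkable.  Here's a minimal
sketch in C of what such a check might look like (the peer address and
the pass criterion are invented for illustration; a real harness would
also check tally codes, reach, and offset bounds):

    /* Run ntpq -p against the daemon under test and check for one
     * expected recognition feature: that the daemon has selected a
     * particular (hypothetical) peer as its sync source.  A '*' in
     * column one of the peer billboard marks the selected peer. */
    #define _POSIX_C_SOURCE 200809L
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        FILE *fp = popen("ntpq -p 127.0.0.1", "r");
        if (fp == NULL) {
            perror("popen");
            return 2;
        }

        char line[512];
        int found = 0;
        while (fgets(line, sizeof(line), fp) != NULL) {
            if (line[0] == '*' && strstr(line, "10.0.0.1") != NULL)
                found = 1;
        }
        pclose(fp);

        puts(found ? "PASS: expected sync peer selected"
                   : "FAIL: expected sync peer not selected");
        return found ? 0 : 1;
    }

Without replicability, whether that check passes depends on what the
live network and the live clock happened to do, not on the code under
test.  That's exactly the problem.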

Follow out the implications of that and I think you'll find there
isn't any solution less "clever" than TESTFRAME or
rr-with-clock-spoofing.  In fact I'm somewhat doubtful that rr will be
strong enough.  Hoping to be wrong about that.

Believe me, I'd love to get away with something much simpler, like
gpsfake in GPSD, which spoofs the daemon's I/O environment and feeds it
data streams from the outside.  But that won't work here, because you
have to spoof the clock, too.  And socket opens: your test environment
needs to be able to pretend to be the entire net so that TCP/IP traffic
with the daemon under test can be controlled.
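
For the clock half of that, the standard trick is an LD_PRELOAD shim
that interposes on clock_gettime() and hands the daemon a scripted
sequence of readings.  A sketch (the replay epoch and step size are
arbitrary; a real harness would replay a recorded trace, and would
have to interpose on the socket calls the same way):

    /* fakeclock.c: feed the daemon a deterministic clock.
     * Build: gcc -shared -fPIC -o fakeclock.so fakeclock.c -ldl
     * Run:   LD_PRELOAD=./fakeclock.so ntpd -n ... */
    #define _GNU_SOURCE
    #include <dlfcn.h>
    #include <time.h>

    int clock_gettime(clockid_t clk_id, struct timespec *tp)
    {
        static int (*real_clock_gettime)(clockid_t, struct timespec *);
        static long long calls;

        if (real_clock_gettime == NULL)
            real_clock_gettime = dlsym(RTLD_NEXT, "clock_gettime");

        int ret = real_clock_gettime(clk_id, tp);
        if (ret == 0 && clk_id == CLOCK_REALTIME) {
            /* Replace the real time with a scripted sequence that
             * advances 100ms per call, so every run of the daemon
             * sees the identical series of clock readings. */
            tp->tv_sec  = 1464000000 + calls / 10;
            tp->tv_nsec = (calls % 10) * 100000000L;
            calls++;
        }
        return ret;
    }

And even that only covers one call; ntpd also touches the clock
through adjtimex() and friends, which is part of why I'm doubtful
anything much simpler than TESTFRAME or rr-with-spoofing will be
strong enough.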

> > Better collaboration with NTP.org...well, that would sure be nice and
> > I'd be willing, but I fear pigs will achieve escape velocity before we
> > get any actual cooperation from them.  I won't be more specific on a
> > list with publicly-accessible archives.
> >
> > Do you really think it's realistic to block 1.0 on that?  I don't.
> 
> If it can't be done then it can't be done, but I don't think we've
> exhausted our options. I'll follow up off-list.

Please do.

> > I have a little list, I have a little list...
> >
> > Unfortunately you're entering another political minefield here.  At
> > least two features I want to remove would cause a political shitstorm
> > if word got out that we were thinking of diking them out *before* I
> > have solid white papers to show they're nugatory.
> 
> Okay, if some of the featurectomies are going to be controversial no
> matter what, then we don't have to block 1.0 on those. But let's do
> the rest of them so that we're not (justifiably) accused of
> gratuitously breaking people's setups post-1.0.

I think those accusations are going to happen anyway, and our strategy
(including release numbering) cannot depend on the assumption of
avoiding them.  So we have to learn to justify each backward-incompatible
change on its own merits, brace, and cope.

One category where we can't avoid this is political-shitstorm
features.  Another is changes that might be less functionally
controversial but that we can't yet design ahead for, like the
refclockd split.  That is certain to require significant changes in
config-file processing and logging.

Yet a third is incompatible changes that will turn out to be necessary
for security reasons, but that we can't predict will be needed at 1.0
time because the sploit hasn't happened yet.

If you're not willing to bet that that third category will be empty
forever, it doesn't make any sense to tie our hands up front.

> > I agree with the intent to revisit using something Phabricator-like.
> > I disagree that that should be coupled with 1.0.  Stability ought
> > to be measured by er, *stability*, not by whether we've gone through
> > that process.
> 
> We're using two different senses of "stable". You're using it to mean
> "free of crashes". I'm using it way Debian uses it: a version is
> stable when it ceases to require frequent patching, and if a
> particular, serious issue comes up (usually but not always
> security-related) then the maintainers will provide an update that
> fixes that issue and leaves everything else untouched. A corollary to
> something being stable in this sense is that we can afford high
> overhead for making changes, because we don't have to make very many
> of them.

Ah, OK, I misunderstood you.

I agree that it would be good to get the stable branch to a patch
velocity low enough that Phabricator overhead is tolerable.  Good
luck with that; me, I don't think *I* can realistically expect it
for 18 months or so.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

