Bug 119, ntpdig doesn't do IPv6, fixed
Eric S. Raymond
esr at thyrsus.com
Mon Oct 3 14:13:06 UTC 2016
Hal Murray <hmurray at megapathdsl.net>:
> Thanks for tracking that down.
> > I suspect the reason Hal didn't catch this and fix it instantly is that he
> > is, like the rest of us, really focused on ntpd. And thus didn't think to
> > test ntpdig when he modified it.
> Can we add some tests that would have caught this?
> Do we need another category of tests? I don't have a good word. I'm
> thinking of a script that gets run nightly/weekly and requires human review
> to decide if a problem is due to a recent change in the code or a quirk in
> the environment.
It could be done with an expect/send framework running smoke tests that
point ntpdig and ntpq at a known-good IP address; ntpsec.org's would do.
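A minimal sketch of what such a nightly smoke run might look like. It uses plain subprocess calls rather than a real expect/send framework, and the names smoke(), run_checks() and CHECKS are made up for illustration. The command lines are assumptions as well (the -4/-6 switches for ntpdig and -p for ntpq should be checked against the installed man pages). A FAIL line still needs a human to decide whether the code or the environment is at fault.

```python
import subprocess
import sys

def smoke(argv, timeout=30):
    """Run one command; return (ok, detail) rather than raising."""
    try:
        proc = subprocess.run(argv, capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        return False, "timed out after %ds" % timeout
    except OSError as err:
        # Covers a missing or non-executable binary.
        return False, "could not run: %s" % err
    if proc.returncode != 0:
        return False, "exit %d: %s" % (proc.returncode, proc.stderr.strip())
    if not proc.stdout.strip():
        return False, "no output"
    return True, proc.stdout.strip()

# ASSUMPTION: illustrative command lines, not a vetted test plan.
CHECKS = [
    ["ntpdig", "-4", "ntpsec.org"],
    ["ntpdig", "-6", "ntpsec.org"],   # the case bug 119 would have tripped
    ["ntpq", "-p", "ntpsec.org"],
]

def run_checks(checks=CHECKS):
    """Print one PASS/FAIL line per check; return the number of failures."""
    failures = 0
    for argv in checks:
        ok, detail = smoke(argv)
        print("%s  %s  %s" % ("PASS" if ok else "FAIL", " ".join(argv), detail))
        if not ok:
            failures += 1
    return failures
```

Run from cron, something like sys.exit(1 if run_checks() else 0) would let the job's exit status flag a night that needs review.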
> > Take a lesson, everybody. It's the tests you don't run that'll hurt you.
> I've worked on at least one project where part of the culture was to collect
> test cases along with bug fixes, and merge them into the standard test
> collection. It's embarrassing how often bugs get reinvented. (That may be
> an indication of poor architecture or just a messy area.)
I'm pretty religious about this practice on two of my other projects, GPSD and
reposurgeon. GPSD has about 125 tests; reposurgeon about 145. On both,
reinvented bugs have been quite rare, though not entirely nonexistent.
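As a rough illustration of the practice (not how GPSD or reposurgeon actually organize their suites), each fixed bug can leave behind a small file holding the triggering input and the corrected output, and the whole directory gets replayed on every run. The JSON layout, load_cases(), replay() and the classify_address() function under test are all hypothetical.

```python
import ipaddress
import json
import pathlib

def load_cases(directory):
    """Load captured bug cases; each *.json file is assumed to hold
    {"bug": ..., "input": ..., "expected": ...}."""
    cases = []
    for path in sorted(pathlib.Path(directory).glob("*.json")):
        with open(path) as fp:
            cases.append(json.load(fp))
    return cases

def replay(cases, func):
    """Feed each captured input to func; return (bug, expected, actual)
    for every case whose output no longer matches."""
    regressions = []
    for case in cases:
        actual = func(case["input"])
        if actual != case["expected"]:
            regressions.append((case["bug"], case["expected"], actual))
    return regressions

def classify_address(text):
    """Hypothetical function under test: classify a host argument."""
    try:
        return "ipv%d" % ipaddress.ip_address(text).version
    except ValueError:
        return "name"   # not an address literal; treat as a hostname
```

An empty list from replay(load_cases("regressions"), classify_address) means no captured bug has come back; the directory name here is also just a placeholder.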
There is a noticeable difference in effectiveness; GPSD has a very low
defect rate, reposurgeon a somewhat higher one (though still pretty
good compared to what I see on other projects' bugtrackers).
I think one difference is degree of novelty. Bug replay is extremely
effective mitigation when your codebase is relatively stable, doing
much the same things it did last year; less so when you routinely try
to add capability. GPSD is stable that way, reposurgeon is not.
I think NTP might be more like GPSD.
Another difference is algorithmic density. GPSD is high on that scale
(how many programs include both a pattern-recognition FSM derived from
compiler technology and nontrivial matrix algebra?) but reposurgeon is
stratospheric (several large FSMs, heavy use of exotic graph-traversal
and graph-surgery algorithms, a copy-on-write cache tuned for its
internal data structures).
I think bug replay decreases in effectiveness when your problem space
is so tricky that not being sure what the right thing is looms larger
than implementation mistakes. The actual time-sync algorithms in NTP are
like that; most of the rest of it (the network plumbing, in particular)
is not.
TESTFRAME was intended in large part as a way to collect bug cases and
rerun them. The concept stemmed directly from the way GPSD's gpsfake
works. I put huge effort into it because I know how effective gpsfake
has been. It is very sad that TESTFRAME turned out to be unworkable;
I don't have a plan B for bug replay yet, but it is something I am
Eric S. Raymond <http://www.catb.org/~esr/>