prep for cutting a release, target 2019-01-13

Eric S. Raymond esr at thyrsus.com
Sat Jan 12 05:23:40 UTC 2019


Achim Gratz via devel <devel at ntpsec.org>:
> Hal Murray via devel writes:
> > devel at ntpsec.org said:
> >> It doesn't recover at all (well, not over the course of two weeks, that was
> >> the longest time I watched it without restarting) and there is no code I
> >> could find that would allow it to recover.  I'm almost certain that it falls
> >> into the KOD trap at some point (which should be non-survivable, probably),
> >> what I don't understand is how it gets there.
> >
> > I just tried it.  My test case recovered.
> 
> I'm certainly missing some context as I know neither what you tried nor
> what test case you're talking about (presumably something in conjunction
> with #437).
> 
> Anyway, recovery in the case I'm talking about would be if the
> originally configured poll interval was re-established (I run all
> internal servers with minpoll=maxpoll=4).  That never happens for me
> once hpoll moves up to 10.

It's all coming back to me now.  We couldn't reproduce this when you
first reported it, either.  I tried auditing the interval-change rules in the
code, but couldn't find any bad smell to investigate further.

I don't doubt you're seeing *something*.  But it seems to depend on some
fact pattern the rest of us have never replicated.  I'm at a loss what to
do about this.

Usually in these situations I just keep adding instrumentation until
something jumps out and says "boo".  How reliably can you reproduce
this?
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.



