prep for cutting a release, target 2019-01-13
Eric S. Raymond
esr at thyrsus.com
Sat Jan 12 05:23:40 UTC 2019
Achim Gratz via devel <devel at ntpsec.org>:
> Hal Murray via devel writes:
> > devel at ntpsec.org said:
> >> It doesn't recover at all (well, not over the course of two weeks, that was
> >> the longest time I watched it without restarting) and there is no code I
> >> could find that would allow it to recover. I'm almost certain that it falls
> >> into the KOD trap at some point (which should be non-survivable, probably),
> >> what I don't understand is how it gets there.
> >
> > I just tried it. My test case recovered.
>
> I'm certainly missing some context as I know neither what you tried nor
> what test case you're talking about (presumably something in conjunction
> with #437).
>
> Anyway, recovery in the case I'm talking about would be if the
> originally configured poll interval was re-established (I run all
> internal servers with minpoll=maxpoll=4). That never happens for me
> once hpoll moves up to 10.

It's all coming back to me now. We couldn't reproduce this when you
first reported it, either. I tried auditing the interval-change rules in the
code, but couldn't find any bad smell to investigate further.

I don't doubt you're seeing *something*. But it seems to depend on some
fact pattern the rest of us have never replicated. I'm at a loss what to
do about this.

Usually in these situations I just keep adding instrumentation until
something jumps out and says "boo". How reliably can you reproduce
this?
--
Eric S. Raymond <http://www.catb.org/~esr/>
My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.