Apparent protocol-machine bug, new top priority

Sun Aug 27 13:02:06 UTC 2017

Heads up, Daniel!

Achim Gratz via devel writes:
>> I still think there must be some bug somewhere that either makes the
>> client send too many packets or the server sending that KOD too early.
>
>This is still happening with ntpsec-0.9.7+1104, albeit much less often
>now (but I've removed iburst from the configuration files).  I just had
>it happen while I was updating the rasPis.  Again, the symptom is that
>there is a "rate_exceeded" event and the hpoll gets set to some high
>value and never recovers from there even though the poll interval should
>be fixed:
>
>assoc=53276: conf, reach, sel_reject, 1 event, rate_exceeded
>unreach=0 hmode=3 pmode=4 hpoll=10 ppoll=4 headway=7908 flash=4096 keyid=0
>
>This was and is not happening with NTP classic.

Now that iburst has been fixed - and Achim reports seeing this problem
with iburst off - this pretty much has to be an issue deeper in the
protocol machine.  (I guess we should count our blessings and
congratulate Daniel that there haven't been more of these since the big
refactor.)

If this is happening with iburst *off*, it becomes more difficult to
understand how the rate limit is being triggered.  I think maybe we
should start by focusing on something else: why is hpoll not
recovering after a KOD?
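
To put numbers on Achim's report: hpoll and ppoll are log2 of the
interval in seconds, so hpoll=10 means we are polling every 1024s
while the peer advertises 16s.  Here is a quick Python sketch that
decodes the quoted variable line.  The helper names are mine, purely
for illustration, and are not anything in the ntpsec tree.

```python
import re


def parse_assoc_vars(line):
    """Collect name=integer pairs from an ntpq-style variable line."""
    return {name: int(value)
            for name, value in re.findall(r"(\w+)=(-?\d+)", line)}


def poll_seconds(exponent):
    """Convert an NTP poll exponent (log2 seconds) to seconds."""
    return 2 ** exponent


# The variable line quoted in Achim's report.
line = ("unreach=0 hmode=3 pmode=4 hpoll=10 ppoll=4 "
        "headway=7908 flash=4096 keyid=0")
assoc = parse_assoc_vars(line)

print("host poll interval:", poll_seconds(assoc["hpoll"]), "s")
print("peer poll interval:", poll_seconds(assoc["ppoll"]), "s")
print("flash word:", hex(assoc["flash"]))
```

That is a factor of 64 between what we are doing and what the peer
expects, which is why a stuck hpoll is so visible.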

I'm thinking this sounds like some KOD-recovery logic got lost during
the refactor.

I also judge this is our new most serious bug. Daniel, would you give
it a hard look, please?  You too, Hal - I'm thinking you have better
odds of diagnosing this one than I do.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

Rifles, muskets, long-bows and hand-grenades are inherently democratic
weapons.  A complex weapon makes the strong stronger, while a simple
weapon -- so long as there is no answer to it -- gives claws to the
weak.
        -- George Orwell, "You and the Atom Bomb", 1945