ntpq mru hang

Mon Dec 19 05:18:20 UTC 2016

Hal Murray <hmurray at megapathdsl.net>:
> 
> I've found an interesting problem.
> 
> A successful batch returns a new nonce.  ntpq asks for more using the old 
> one.  Nothing comes back.

Don't do that, then!

> That seems broken-by-design.  If the response is lost, the server will have 
> switched to the new nonce but the client only knows the old one.  It might 
> work if the server remembers the old nonce until it gets a packet with the 
> new one.

As-needed nonce generation is baked into Session.mrulist().  I am
resistant to adding complexity in ntpd to handle the edge case where a
response gets lost, especially since I fear it might open us up to a
replay attack.

If it's relatively easy to do, I'd support returning a BADVALUE
server error in this case.
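
To make the failure mode concrete, here is a toy model of the exchange
described above. It is not ntpd or ntpq code: ToyServer, fetch_batch(),
fetch_all() and the dictionary reply format are all hypothetical names
invented for this sketch, and the nonce is just a random token. It assumes
the server replaces its nonce on every successful batch and answers a stale
nonce with a BADVALUE status instead of staying silent, so the client can
recover by asking for a fresh nonce.

```python
import secrets


class ToyServer:
    """Toy model of the behavior discussed in this thread; not ntpd's code."""

    def __init__(self, records, batch_size=2):
        self.records = list(records)
        self.batch_size = batch_size
        self.nonce = None

    def request_nonce(self):
        # Hypothetical stand-in for a nonce request: issue a new token.
        self.nonce = secrets.token_hex(8)
        return self.nonce

    def fetch_batch(self, nonce, start):
        # A stale or missing nonce gets an explicit error, not silence.
        if nonce is None or nonce != self.nonce:
            return {"status": "BADVALUE"}
        # A successful batch rotates the nonce, as described in the thread.
        self.nonce = secrets.token_hex(8)
        batch = self.records[start:start + self.batch_size]
        return {"status": "OK", "records": batch, "nonce": self.nonce}


def fetch_all(server, drop_response_at=None):
    """Collect every record, re-requesting a nonce whenever one is rejected."""
    collected = []
    nonce = server.request_nonce()
    calls = 0
    while len(collected) < len(server.records):
        reply = server.fetch_batch(nonce, len(collected))
        calls += 1
        if calls == drop_response_at:
            # Simulated lost response: the server has already moved on to a
            # new nonce, but the client still holds the old one.
            continue
        if reply["status"] == "BADVALUE":
            nonce = server.request_nonce()
            continue
        collected.extend(reply["records"])
        nonce = reply["nonce"]
    return collected
```

With the explicit error the client notices the stale nonce on its next
request and resynchronizes; without it, the same request would simply go
unanswered, which is the hang reported at the top of this thread.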

> I haven't looked at the server code, but I don't see any mention of that in 
> the mode6 doc.

You're right, it's not covered.

> mode6 doc says:
>   === CTL_OP_REQ_NONCE ===
>   This request is used to initialize an MRU-list conversation.
> Does ntpd do anything other than return the nonce?

No, that's it.

> Thinking about the big picture of monitoring a server...
> 
> We need a way to get some info when the MRU list is getting updated faster 
> than ntpq can retrieve it.  It never converges.
> 
> One possibility is to add an option to log a slot when it gets recycled.  
> That would expose a DoS attack.
> 
> I think we need something like ntpmon that uses all the screen space for the 
> MRU output.

That is relatively easily arranged.  I could add a command to flip it to an
all-MRU display.

> I think we need an option to ntpq/mru to tell it how many slots to ask for,

We have that. I added it a few days ago.  ntpmon uses it to avoid hanging
on servers with long MRU lists. See "recent" on the Mode 6 page.

> and/or to have an option that says give me all that fits in a batch, but 
> start from the newest rather than oldest.  With something like that, we could 
> script some useful collection.  It wouldn't get everything, but I think it 
> would get a helpful sample.

Trouble is, "all that fits in a batch" can differ depending on the length
of the data literals in records...
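
A small illustration of that point, as a sketch only: records_that_fit()
and the record layout below are hypothetical, the 100-byte budget is
arbitrary, and the two bytes of per-record separator overhead are an
assumption rather than the real Mode 6 encoding. It just counts how many
name=value records fit in a fixed byte budget.

```python
def records_that_fit(records, budget):
    """Count the leading records whose encoded text fits within budget bytes."""
    used = 0
    count = 0
    for rec in records:
        text = ", ".join(f"{name}={value}" for name, value in rec.items())
        # Assumed overhead: two bytes of separator after each record.
        size = len(text.encode("ascii")) + 2
        if used + size > budget:
            break
        used += size
        count += 1
    return count


# Same field names, same budget, different literal lengths.
short_records = [{"addr": "10.0.0.1:123", "ct": "5"}] * 10
long_records = [{"addr": "[2001:db8::1234:5678]:123", "ct": "123456"}] * 10

print(records_that_fit(short_records, 100))  # 4
print(records_that_fit(long_records, 100))   # 2
```

So a request for "one batch, newest first" would return a different number
of entries depending on what the entries look like.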
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>