Progress on mrulist

Hal Murray hmurray at megapathdsl.net
Tue Dec 20 07:03:46 UTC 2016


Lots of work still to do, but we are off the ground.

The second batch never worked.  In hindsight, it's actually easy to test the 
multiple-batch case: just set frags=1.  That gets about 4 slots, so a server 
with a handful of clients is plenty.

I fixed it to take frags=xx and limit=xx on the command line.

It now processes the returned new nonce.

The client sent the where-to-start times back in decimal rather than hex, so 
nothing matched when ntpd was trying to figure out where to start sending more.
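
To make the decimal-versus-hex mismatch concrete, here is a minimal Python 
sketch.  The function names and the 0x%08x.%08x text form are assumptions 
for illustration, not copied from the client or ntpd source.

```python
def lfp_to_hex(seconds, fraction):
    """Render a 64-bit NTP timestamp as hex text (assumed request form)."""
    return "0x%08x.%08x" % (seconds & 0xFFFFFFFF, fraction & 0xFFFFFFFF)


def lfp_to_decimal(seconds, fraction):
    """The buggy variant: the same value rendered in decimal."""
    return "%d.%d" % (seconds, fraction)


# Hypothetical timestamp: seconds part and fraction part, both 32 bits.
stamp = (3691206226, 2147483648)
hex_text = lfp_to_hex(*stamp)
dec_text = lfp_to_decimal(*stamp)

# A server comparing text (or parsing the value as hex) will not find a
# match for the decimal form, even though both encode the same timestamp.
print(hex_text, dec_text, hex_text == dec_text)
```

Both strings describe the same instant, but only one is in the form the 
other side is expecting, so the lookup for the resume point fails.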

With those fixed, I got data back for the second batch.

The code building the client copy of the MRU list used the index from the 
batch.  Those indexes are local to the batch rather than global, so the 
second batch overwrote the slots from the previous batch.

My fix will make duplicates if a slot gets moved up because a new packet 
arrived.  That should be easy to fix.  We want duplicates if a slot gets 
recycled and then recreated.  We want to replace the slot if it got moved up.
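
One possible way to tell the two cases apart is sketched below.  It is not 
the actual fix.  It assumes each entry carries an address plus first-seen 
and last-seen times, and that an unchanged first-seen time means the same 
slot was moved up, while a different first-seen time means the slot was 
recycled and recreated.  The dict keys and the function name are 
hypothetical.

```python
def merge_batch(mru, batch):
    """Merge one batch into the client copy of the MRU list.

    mru   -- list of entry dicts collected so far, in arrival order
    batch -- list of entry dicts from the latest response
    Entries are appended rather than stored at their batch-local index,
    so a later batch cannot overwrite slots from an earlier one.
    """
    for entry in batch:
        for i, old in enumerate(mru):
            same_slot = (old["addr"] == entry["addr"]
                         and old["first"] == entry["first"])
            if same_slot:
                # Moved up by a new packet: drop the stale copy.
                del mru[i]
                break
        # A recycled-then-recreated slot has a new 'first' time, so it
        # falls through here and is kept alongside the older entry.
        mru.append(entry)
    return mru


mru = []
merge_batch(mru, [{"addr": "a", "first": 1, "last": 5},
                  {"addr": "b", "first": 2, "last": 6}])
merge_batch(mru, [{"addr": "a", "first": 1, "last": 9},    # moved up
                  {"addr": "b", "first": 20, "last": 21}])  # recreated
```

After the second call the list holds one entry for "a" (the updated one) 
and two for "b".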


Stuff that looks fishy:
  If you specify recent=xxx, it gets sent back in the request for a second 
  batch.
    With the default batch size, ntpmon gets a screenful in one batch.
  If you specify frags=xxx, it gets recalculated.

The nonce stuff isn't working right on retransmissions.  (I haven't 
investigated.)

If a packet gets lost, we should process any complete slots in the data that 
we did get.
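
A rough sketch of salvaging complete slots from a partial response follows.  
It assumes the data arrives as name/value pairs tagged "tag.N" with N the 
slot index, and that a slot is complete once every tag in REQUIRED has been 
seen.  The tag list and the function name are assumptions for illustration 
and should be checked against the real response format.

```python
# Assumed per-slot tags; verify against the actual response format.
REQUIRED = ("addr", "last", "first", "ct", "mv", "rs")


def complete_slots(variables):
    """Group 'tag.N' variables by N and keep only fully populated slots."""
    slots = {}
    for name, value in variables:
        tag, sep, index = name.partition(".")
        if not sep or not index.isdigit():
            continue  # not a per-slot variable (for example, a nonce)
        slots.setdefault(int(index), {})[tag] = value
    return [slots[i] for i in sorted(slots)
            if all(tag in slots[i] for tag in REQUIRED)]


# Slot 0 is complete; slot 1 lost its tail when a packet went missing.
received = [("nonce", "abc"),
            ("addr.0", "192.0.2.1"), ("last.0", "0x1.0"),
            ("first.0", "0x0.0"), ("ct.0", "3"), ("mv.0", "36"),
            ("rs.0", "0x0"),
            ("addr.1", "192.0.2.2"), ("last.1", "0x2.0")]
usable = complete_slots(received)
```

Here only slot 0 survives; the truncated slot 1 is discarded rather than 
being reported with missing fields.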

Does ^C during collecting jump out or set a flag to unwind?  (I saw a divide 
by 0 that would be explained by jumping out.)

I think we are going to run out of memory trying to grab everything on my 
test case pool server.  We may need a write-on-the-fly mode.  Yes, it works 
well enough to discover that.

There is another interesting doesn't-converge case that I hadn't considered.  
Assume that you are near to the end.  It can fail to converge because new 
slots get created faster than they are being retrieved.  There is another 
case at startup.  When you ask for more starting at IP x, that slot could 
have been recycled already.

We need to fix all the error trap cases to print what they have already 
collected.


-- 
These are my opinions.  I hate spam.




