The Epic of MRUlist
hmurray at megapathdsl.net
Tue Apr 28 10:58:40 UTC 2020
> After a great deal of refactoring, digging, confusion, and generalized
> wrestling with the surprising number of tentacles that comprise the mrulist
> system I can now make a report of sorts:
Did you fix anything in the process? Are you describing the current code or
your refactored version? Did your refactoring clean things up? Are you going
to push it?
> * If there is no overlap between the requested entries and what the server
> has available the client is supposed to back up until it can find an
> overlap. In the worst case this can mean completely restarting the request
The client sends a batch of IP-Address/time-stamp pairs. I think it limits
the batch-size to what fits in a packet.
The server starts with the first pair, looks up the IP Address and compares
the time-stamp. If they match, great, start processing with the next on the
list. If not, try the next pair.
If it runs out of next-pairs, I think the server goes back to the first pair
and does a linear search for a slot with a newer time-stamp. We should
add some logging to the ntpd/server side for that case.
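Roughly, the matching and fallback described above looks like this in
Python; the Slot type, list layout, and names are my own assumptions for
the sketch, not the real ntpd data structures:

    # Illustrative sketch only; Slot and the list layout are assumed.
    from collections import namedtuple

    Slot = namedtuple("Slot", "addr last")   # last = most recent arrival time

    def find_resume_point(mru_list, pairs):
        """mru_list: the server's MRU slots.
        pairs: the client's (addr, last) checkpoints from the last reply."""
        # Try each checkpoint in turn; the first exact match tells the
        # server where the previous reply left off.
        for addr, last in pairs:
            for i, slot in enumerate(mru_list):
                if slot.addr == addr and slot.last == last:
                    return i + 1          # resume just past the match
        # Ran out of pairs: go back to the first pair and do a linear
        # search for a slot with a newer timestamp (the case worth
        # logging on the ntpd side).
        if pairs:
            _, first_last = pairs[0]
            for i, slot in enumerate(mru_list):
                if slot.last > first_last:
                    return i
        return 0                          # worst case: start over from the top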
If the client/ntpq is doing retries at that level, we should add some printout
and/or bail instead.
> * Duplicate entries may be created as data is updated during the request
Mostly. One of the tentacles is the direct mode. The idea is to be able to
collect everything without needing a lot of memory if the server has a lot of
slots. It just writes them out as they come in. You can drop duplicates in
post-processing.
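That post-processing step could be as simple as keeping the newest record
per address. A sketch, assuming a made-up record layout rather than the
real direct-mode output format:

    # Keep only the newest record for each address; layout is assumed.
    def drop_duplicates(records):
        """records: iterable of dicts with 'addr' and 'last' keys, in the
        order they were written out."""
        newest = {}
        for rec in records:
            addr = rec["addr"]
            if addr not in newest or rec["last"] > newest[addr]["last"]:
                newest[addr] = rec
        return list(newest.values())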
> * Packet errors should result in packet size adjustment.
> This is implemented.
Are you sure? When I looked at it, it wasn't working.
The MRU stuff sends one request packet that contains the number of packets or
number of slots for the reply. There is a routine that collects the answers
and extracts info. The answer packets have a sequence number. The collect
routine bailed, returning nothing, if it found a missing packet.
It should at least return as much as it can, up to the missing packet. It
should also return an indication if there was a missing packet or timeout or
some other error.
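Something along these lines, where the collector keeps whatever contiguous
run of sequence numbers arrived and reports the gap; the names are
illustrative, not the actual ntpq internals:

    # Return the partial result plus an indication of what went wrong,
    # instead of returning nothing.
    def collect(replies, expected_count):
        """replies: dict mapping sequence number -> list of decoded slots."""
        entries = []
        for seq in range(expected_count):
            if seq not in replies:
                # Return what we have, plus an indication of the problem.
                return entries, "missing packet at sequence %d" % seq
            entries.extend(replies[seq])
        return entries, None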
> * What happens when a packet in the middle of the sequence is dropped? Who
> knows! If it is seen as a timeout then the client will adjust packet size
> and try again... forever. Or maybe it silently doesn't notice?
The case I was looking at retried forever.
> * If the server data changes enough while the request sequence is running
> the system can just fail for no good reason because the error handling for
> that doesn't exist. Arguably that is when things are going well; I can
> imagine some subtle and wacky hijinks when dumb luck causes it to not fail
The nasty case that I know about is where the server is getting new requests
faster than ntpq can read the list. The slots for those packets get stuffed
on the end of the list where ntpq will read them on the next try.
I think we can fix that back at ntpq, but I haven't worked out the details.
It would be something like only send 1000 more requests after you get one with
a time stamp within a few seconds of "now". I haven't actually run into that
problem yet so I haven't been motivated to fix it.
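The stopping rule might look roughly like this; the constants and names are
placeholders, not a worked-out design:

    # Sketch of the stopping rule; the constants are placeholders.
    import time

    NEAR_NOW_SECS = 5      # "within a few seconds of now"
    EXTRA_BUDGET = 1000    # further requests allowed once caught up

    class StopRule:
        def __init__(self):
            self.caught_up = False
            self.remaining = EXTRA_BUDGET

        def keep_going(self, newest_last):
            """newest_last: 'last' timestamp of the newest slot in a reply."""
            if not self.caught_up and time.time() - newest_last <= NEAR_NOW_SECS:
                self.caught_up = True
            if self.caught_up:
                self.remaining -= 1
            return self.remaining > 0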
What I really want to do is capture the data for everything a server does in a
day. We could add a new type of stats file and log a slot when it gets
recycled.
That wouldn't ever get slots for active IP Addresses.
There are 2 timestamps in each slot: first and last. The first is the age of
the slot. The list is sorted by last, the time of most recent arrival. We
could scan the list, say hourly, looking for old slots and write out the data
with a still-live flag. Post-processing could toss duplicates. We could
either clear the counters (so live data would lie) or add another time stamp
for when it was written out.
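A sketch of that hourly scan, with invented field and file names just to
show the shape of it:

    # Field and file names are invented for illustration.
    import time
    from collections import namedtuple

    Slot = namedtuple("Slot", "addr first last count")

    OLD_AFTER = 3600       # seconds with no traffic before a slot is logged

    def scan_and_log(mru_list, logfile):
        """Append old-but-still-live slots to a stats-style file."""
        now = time.time()
        with open(logfile, "a") as f:
            for slot in mru_list:
                if now - slot.last < OLD_AFTER:
                    continue              # recently active, skip this pass
                # live=1 marks a snapshot of a slot still in the MRU list,
                # so post-processing can toss duplicates later.
                f.write("%s first=%f last=%f count=%d live=1\n"
                        % (slot.addr, slot.first, slot.last, slot.count))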
ntpq uses a packet size limit of 4xx bytes. That matches the old minimum MTU.
We should look into making that bigger.
These are my opinions. I hate spam.