Monitoring busy/pool servers

Hal Murray hmurray at megapathdsl.net
Wed Dec 14 19:12:53 UTC 2016


Was: Subject: Re: ntpmon now has some keystroke commands

ghane0 at gmail.com said:
> Unfortunately, I am unable to play with this new toy, as all my servers that
> run NTPsec are in the pool, and ntpmon just goes to sleep on them (see issue
> #206). 

There are at least two problems in this area that I know about.  There may 
be more.

The first is that the interesting data doesn't change often enough for a 
display that updates frequently to be helpful.  For my eye/brain, updating 
the when or lstint column is a distraction since my peripheral vision grabs 
my attention when I'm trying to look at something that hasn't changed.

YMMV

The other problem is that a busy server has way more clients than will fit on 
a screen, so a mrulist printout is useless.  A busy pool server will collect 
roughly a million clients over 24 hours.  It takes roughly 10 minutes to 
collect that data on localhost.  Allow more if you have network delays.
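
To put those ballpark numbers in perspective, here is a quick 
back-of-the-envelope calculation.  The client count and collection time are 
the rough figures above; the 50 rows per screen is purely an assumed terminal 
height for illustration.

```python
# Rough figures from the text above (ballpark, not measurements).
clients = 1_000_000          # clients collected over 24 hours
collect_seconds = 10 * 60    # about 10 minutes to fetch the list on localhost

# Assumption for illustration only: a terminal showing 50 rows at a time.
rows_per_screen = 50

entries_per_second = clients / collect_seconds
screens = clients // rows_per_screen

print(f"about {entries_per_second:.0f} entries per second while collecting")
print(f"about {screens} screens to page through the full list")
```

So even on localhost the fetch has to sustain well over a thousand entries 
per second, and nobody is going to read the result a screen at a time.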

On top of that, the mru code for ntpq (and I assume ntpmon) is broken.  It 
seems to hang when the server has lots of clients.  Eric dropped that ball 
and I haven't had time to pick it up (yet?).  If anybody likes debugging 
Python code, please poke me off list.

The mru sort order is also reversed for some of the sort options.

---------

If/when we get the mru stuff fixed, you can get a reasonable sample of your 
abusive clients with something like:
  ntpq -nc "mru mincount=1000 sort=avgint"
Adjust the mincount to fit your setup and/or how long your server has been up.
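
As a sketch of what that selection amounts to, the Python below keeps only 
clients with at least mincount packets and orders them by average interval, 
shortest first.  This is not ntpq's code: the Client record, the 
busiest_clients helper, and the sample entries (documentation addresses, 
made-up counts) are all hypothetical, and ascending order is my choice here.

```python
from typing import NamedTuple


class Client(NamedTuple):
    """Hypothetical, already-parsed MRU entry (not ntpq's own data structure)."""
    addr: str
    count: int      # packets seen from this address
    avgint: float   # average seconds between packets


def busiest_clients(clients, mincount=1000):
    """Keep clients with at least mincount packets, shortest avgint first."""
    busy = [c for c in clients if c.count >= mincount]
    return sorted(busy, key=lambda c: c.avgint)


# Made-up sample entries for illustration only.
sample = [
    Client("192.0.2.10", 50_000, 1.7),
    Client("192.0.2.20", 12, 3600.0),
    Client("192.0.2.30", 2_500, 34.0),
    Client("192.0.2.40", 999, 80.0),
]

for c in busiest_clients(sample):
    print(f"{c.addr:15} count={c.count:6} avgint={c.avgint}")
```

With the default mincount of 1000, only the first and third sample entries 
survive, and the one polling every couple of seconds comes out on top.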

I have a pool server with
  mru initmem 25000 maxmem 150000 maxage 100000
The maxmem fits a low-cost cloud server.  The maxage tosses slots that are a 
bit over a day old (100000 seconds is about 27.8 hours).
I have a cron job set up to capture a mrulist each midnight.  That should 
record everything if I want to analyze it.
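
For anybody who wants to do something similar, here is a minimal sketch of 
such a capture script in Python.  It is not the job I actually run: the 
script name, the output directory, and the one-hour timeout are all 
assumptions.  The timeout is there because of the hang mentioned above.

```python
#!/usr/bin/env python3
"""Save one dated mrulist snapshot; intended to be run from cron."""
import datetime
import pathlib
import subprocess
import sys


def snapshot_path(directory, when):
    """Return a dated file name such as mrulist-2016-12-14.txt."""
    return pathlib.Path(directory) / f"mrulist-{when:%Y-%m-%d}.txt"


def capture(directory, command=("ntpq", "-nc", "mrulist"), when=None,
            timeout=3600):
    """Run the command and write its standard output to a dated file."""
    when = when or datetime.datetime.now()
    path = snapshot_path(directory, when)
    path.parent.mkdir(parents=True, exist_ok=True)
    with open(path, "w") as out:
        # check=True raises if the command fails; timeout guards against hangs.
        subprocess.run(list(command), stdout=out, check=True, timeout=timeout)
    return path


if __name__ == "__main__" and len(sys.argv) == 2:
    # Usage: mru_snapshot.py OUTPUT_DIRECTORY
    print(capture(sys.argv[1]))
```

A crontab line along the lines of
  0 0 * * * /usr/local/bin/mru_snapshot.py /var/log/ntp-mru
would run it each midnight.  Both paths are placeholders.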




-- 
These are my opinions.  I hate spam.




