Logfile visualization tools - request for comment
Dan Drown
dan-ntp at drown.org
Tue Jul 19 03:24:32 UTC 2016
Quoting "Eric S. Raymond" <esr at thyrsus.com>:
> I've spent the last week reading code and preparing for a serious
> effort to write logfile visualization tools for NTPsec.
>
> There are at least two good reasons to do this, one retrospective and
> one prospective. The retrospective one is that the stats and
> data-reduction tools now in the distribution are a huge mess. They're
> archaic, often embodying assumptions that have long since passed their
> sell-by date (one pair of tools relies, for example, on mode 7, which
> we've eliminated). They're poorly documented or not documented at
> all. They're written in Perl, which is a serious maintainability
> problem. The whole area cries to be cleaned up - or better yet, nuked
> and replaced with better code.
>
> The prospective reason is that I need a way to make sense out of my test
> farm data. I want to be able to answer a bunch of questions, beginning with
> "How important are check servers to a machine with a GPS?"
One of my NTP modules has the annoying habit of drifting by multiple
milliseconds while still producing a PPS (which is odd, since it was
claiming unlocked status for multiple hours). I think it was
physically damaged in shipment. I don't use that module anymore. But
it was useful to have other servers configured to verify what was
going on.
LAN stratum 1 sources can measure offset wander down in the tens of
microseconds under the right conditions. For example:
https://dan.drown.org/rpi/pi2.html
> The path forward that I'm considering is a Python translation of the
> NTP branch of David Drown's chrony-graph software. It makes beautiful and
> interesting visualizations, embodying a lot of domain knowledge about
> which statistics and relationships are interesting. And of course, that last
> part is where my own knowledge is weakest. Co-opting his work will let me
> concentrate on the software-engineering aspect of the problem.
My first name is Daniel. David is my dad's name by random chance. So
unless he wrote NTP visualization software... :)
> I'm thinking Python translation for two reasons. One is our general
> Python-and-sh policy for scripting, to reduce maintenance complexity
> down the road.
>
> Another is that, as Gary Miller has pointed out, ddrown's collection of
> shellscripts and Perl has terrible locality. Gary says he can see in
> his graphs artifacts from chrony-graph's disk overhead, and I have no
> reason to disbelieve that. Gary suggests that a symbiont daemon, keeping
> intermediate data in memory until the final graphs need to be produced,
> would produce less noise.
I wouldn't be surprised if it was from processor activity (instead of
disk activity), actually.
On the Intel machine I generate all my graphs on, the time spent
breaks down like this:
1. bin/run (excluding bin/plot), log filtering/processing = ~2 seconds
2. bin/plot = ~9 seconds
2a. bin/plot - just the calls to bin/percentile and bin/histogram (perl) = ~2 seconds
2b. bin/plot - just the calls to gnuplot = ~7 seconds
3. bin/copy-to-website, copying html/png to the remote system = ~1 second
total script time:
real 0m12.166s
user 0m8.503s
sys 0m2.039s
Disk activity during this time:
0 read operations (everything came from cache)
144 write operations totaling 16MB taking 136ms
These numbers will be much higher on a Raspberry Pi, but an hourly
run still shouldn't have a drastic impact on the system.
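The per-stage breakdown above is easy to collect with a small wall-clock
timing harness. A minimal sketch (the stage names and workloads here are
placeholders, not the actual bin/run and bin/plot steps):

```python
import time

def profile_stages(stages):
    """Run each (name, callable) pair and record wall-clock seconds."""
    timings = {}
    for name, func in stages:
        start = time.perf_counter()
        func()
        timings[name] = time.perf_counter() - start
    return timings

# Hypothetical stand-ins for the real pipeline steps
# (log filtering, plotting, copying to the website).
timings = profile_stages([
    ("filter-logs", lambda: sum(range(100000))),
    ("plot", lambda: sorted(range(50000), reverse=True)),
])
for name, secs in timings.items():
    print(f"{name}: {secs:.3f}s")
```

Wrapping the real subprocess calls (bin/run, bin/plot, gnuplot) in the
same way would give per-stage numbers without changing the scripts.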
I experimented a bit to see if I could speed this up any. The biggest
win was limiting the output of the bin/histogram program. After I do
that, gnuplot is much faster (and the temporary data file
loopstats.history is much smaller):
2. bin/plot = ~3 seconds
2a. bin/plot - just the calls to bin/percentile and bin/histogram (perl) = ~1 second
2b. bin/plot - just the calls to gnuplot = ~2 seconds
total script time:
real 0m5.984s
user 0m3.797s
sys 0m0.715s
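The histogram-limiting idea is just capping the number of output rows:
instead of emitting one line per distinct value, bucket the samples into
a fixed number of bins so gnuplot has far fewer points to read. A
minimal sketch (the bin count and output layout are assumptions, not
chrony-graph's actual bin/histogram format):

```python
from collections import Counter

def capped_histogram(samples, max_bins=100):
    """Bucket samples into at most max_bins equal-width bins,
    so the plotting step reads far fewer data points."""
    lo, hi = min(samples), max(samples)
    width = (hi - lo) / max_bins or 1.0  # avoid zero width for constant input
    counts = Counter()
    for s in samples:
        idx = min(int((s - lo) / width), max_bins - 1)
        counts[idx] += 1
    # Emit (bin center, count) pairs, one per occupied bin.
    return [(lo + (i + 0.5) * width, n) for i, n in sorted(counts.items())]

hist = capped_histogram([i * 0.001 for i in range(10000)], max_bins=50)
print(len(hist))  # → 50
```

Ten thousand loopstats samples collapse to at most 50 rows in the
temporary data file, which is where the gnuplot speedup comes from.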
> So, translate chrony-graph to Python. But this would leave us with
> a coordination problem. It means either ddrown has to be prepared to
> let the Python version be his new mainline, or we have to cross-port
> all his improvements after the fork.
>
> David (*Daniel), do you have any suggestions for making this less painful?
I don't see a compelling reason to switch to python. I guess I don't
see the pain points.