REFCLOCK rises again

Eric S. Raymond esr at thyrsus.com
Sat Mar 2 15:42:46 UTC 2019


This is redirected from a bug thread on issue #563.

Hal Murray <gitlab at mg.gitlab.com>:
> 
> Eric said:
> > Sadly, I think the great refclock cleanup is mostly done.  There are issues
> > about the GPSD and SHM drivers but the days when I could dike out an obsolete
> > driver a week are gone - we seem to be down to devices people are still
> > actually using.
> 
> Ahh...  I've used "great refclock cleanup" to mean something bigger.  I've
> been thinking of moving all the refclocks out of ntpd and having something
> like SHM that is good enough to handle everything.  I don't know how to do it.
>  I haven't spent a lot of time thinking about it.

This was my original REFCLOCKD plan back in 2015 - I *have* spent a lot
of time thinking about it. Now I'll tell you why I've more or less
abandoned the idea.  I don't mind revisiting the issue, though; other
than the usual tendency towards premature optimization I find your
architectural judgment generally pretty sound; perhaps you will see
a different attack and persuade me that I should have persisted.

(Ian, pay attention.  This is real-world sys-arch stuff.  Hal, because I'm
filling in background for my apprentice I'm going to restate some points that
may be obvious to you.)

My original goal for REFCLOCKD was formulated when there were over 40
refclocks, the full set we inherited from Classic.  That was a lot of
code, more than 30% of the whole suite and actually close to 60% if
you don't count the *huge* in-tree copy of libevent2.

I was then, as I still am, very focused on drastically reducing the
global complexity and attack surface of the suite as a central
strategy for security-hardening the code. The main difference between
then and now is that then I only had an expertise-informed prediction
in my brain that that would be enough to bulletproof us against a lot
of future CVEs.  Now, in 2019, we know that actually worked. (*Phew!*)

That prediction is why my early moves included ripping out Mode 7 and
Autokey.  But those put together were less than half the bulk of the
refclocks.  The really simple, obvious, effective move would be to saw
the code in half at the joint between the network code and the refclocks.
And reconnect them via SHM.

(Ian: Effective why?  Well...suddenly half the code wouldn't be exposed to
the network anymore. We could make the refclockd to ntpd interface read-only.
Other benefits, too - we could ship a pure network daemon, much skinnier
than full ntpd.)
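
(Ian again: to make "reconnect them via SHM" concrete, here is roughly
the shape such a segment could take.  This is an illustrative sketch
only - not the actual layout of the existing SHM driver's segment, and
the field names are made up.  The point is that refclockd would be the
sole writer and ntpd strictly a reader, with a counter guarding
against torn reads.)

    /* Hypothetical one-way sample segment, refclockd -> ntpd. */
    #include <time.h>

    struct refclock_sample {
        volatile int count;     /* writer bumps before and after each update */
        time_t clock_sec;       /* reference-clock timestamp */
        long   clock_nsec;
        time_t receive_sec;     /* local timestamp when the sample was taken */
        long   receive_nsec;
        int    leap;            /* leap-second indicator */
        int    precision;       /* log2 of expected clock precision */
        volatile int valid;     /* writer sets, reader clears */
    };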

That was in the original plan I wrote just after accepting the tech
lead role.  This was some weeks or maybe a few months before I
recruited you (er, Hal).  Like most battle plans, it looked beautiful
before the enemy got involved.

When I started thinking about how to do it, two problems intervened.
One is that the actual interface from the refclocks to the main logic
is a nasty hairball.  I think I more or less understand it now, but
even now I'm sure there are tricky details to trip over. And that's
after more than three years of spelunking the codebase; back then,
well, it is a relevant fact that I'm particularly good at this kind
of dissection-and-refactoring job, but even so...ye gods and little
fishes!  I was just getting started exploring then; the attempt would
have been near hopeless at that time, and possibly crowded out getting
anything more useful accomplished.

(Ian: That last part is important.  Always factor the risk of failure
into your surgical plans...)

Because the hairball is so nasty, the only practical way to go about it
would be to clone ntpd under the name refclockd.  Then start whacking
chunks off of its source files until the network-facing stuff is all
gone.  Then go back to ntpd and remove all the refclock stuff -
admittedly easier as the #ifdef REFCLOCK conditionals tell you what to
excise.  It would have been like the old joke about sculpting an
elephant from a block of marble. "It's easy!  You just remove
everything that's not elephant!"  Twice over.

Um, until you notice that you now have two different, divergent offspring
of the previously unified protocol machine to maintain.  Uh oh...

Another problem would have been configuration. The piece of ntpd
that would stay ntpd and the part that would become refclockd are now
configured from the same file.  That means that if we break the programs
apart, there are only two choices.

1. Break the config into an ntpd.conf and a refclockd.conf. Architecturally
the right thing, but I judged it a political nonstarter.  We were the
new kids on the block then, with no accomplishments to exhibit; we're
going to *start* by telling every Stratum 1 site their existing configs
won't work?  Yowza...

That would have been exactly the opposite of my (other) plan - remain
scrupulously backward-compatible except for where we outright drop
features that we can strongly relate to a security issue.  Be the
low-effort choice for people who have operational reasons to value
increased security but don't want or plain can't afford the cognitive
load of learning new software with new rules.

2. Have refclockd share the same config as ntpd. Seems closer to
possible: in theory we could have a common config parser exporting one
data structure, with each daemon consuming its own parts and ignoring
the other's.
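
A purely illustrative sketch of what that could look like - none of
these names exist in the real parser, which is a good deal hairier:

    /* Hypothetical shared parse result for a common ntp.conf.
     * One parser fills the whole structure; ntpd would consume only
     * .net, refclockd only .clocks, each ignoring the other's half.
     */
    struct parsed_config {
        struct {
            char **servers;        /* server/pool/peer directives */
            char **restrictions;   /* restrict lines */
        } net;                     /* ntpd's half */
        struct {
            char **refclocks;      /* refclock directives */
            char  *segment_path;   /* where the shared sample segment lives */
        } clocks;                  /* refclockd's half */
    };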

OK, a partial solution, even if it somewhat compromises the goal of
strict separation of concerns.  Unfortunately, there's more depth
to this rabbit hole.  Like: How does ntpq work now?  Looks like it's
going to have to exchange packets with two different daemons just to light
up a peers display.

Scuba divers have a concept of "incident pits".  I've written about the
application to software engineering here:

http://esr.ibiblio.org/?p=7111

"It describes a cascade that depends with a small thing going
wrong. You try to fix the small thing, but the fix has an unexpected
effect that lands you in more trouble. You try to fix that thing,
don’t get it quite right, and are in bigger trouble. Now you’re under
stress and maybe not thinking clearly. The next mistake is larger..."

(Ian, when you have enough experience at this sort of thing you will start
to be able to recognize when you are at the edge of a software
incident pit.  When I realized that I was not only going to have to
deploy two different instances of the config parser but give ntpq a
split-brain operation, I started getting that feeling. You've heard me
call it "spider-sense" on IRC.)

This is about when I backed slowly away from REFCLOCKD.  "Bugger this
for a lark!" I remember thinking (yes, I learned to swear while living
in England): "I'm going after easier gains first."

A large part of the easier gains turned out to be identifying
individual obsolete refclocks and other features that could be sliced
off while still executing our intended..."product strategy" is a
business term that fits here.  Be the low-effort choice, etc.

And a funny thing happened on the way to shrinking the codebase by
4:1.  Sawing off refclockd started to look much less attractive simply
because the overall global complexity of the unified ntpd was so
much lower.  These tradeoffs look a lot different at 55KLOC of
relatively clean code than they do at 231KLOC of jungly mess.  The
entire suite is now about half the LOC each of the individual
pieces would have been if I'd done REFCLOCKD straight off.

The way I think about these things, the benefit of the REFCLOCKD
dissection would have been proportional to the amount of global
complexity eliminated, which is proportional to the number of possible
call paths pruned. Now, I believe from experience that in
well-modularized software the call-path graph, exclusive of loops,
tends to look statistically like a roughly scale-free network, with a
consequent approximately n log n relationship between complexity and
LOC - sub-quadratic (quadratic is what you push towards with spaghetti
code) but super-linear.

(Ian: Homework assignment.  Find some Go program of nontrivial
size. Look at one of the nifty call graphs generated by go tool
pprof's web command.
Learn what a "scale-free network" is well enough to criticize the claim
in my last sentence. Why is n log n the appropriate cost function?
Where do you see the largest deviations from scale invariance?)

Supposing that's true, when you drop your LOC by a factor of 4 you
collect a superlinear drop in complexity: (4n log 4n)/(n log n) = 4 *
(log 4n / log n), and the second factor is always greater than 1.
That means that the payoff from REFCLOCKD today would be less than
1/4 of the payoff in 2015.
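
(If you want to check that arithmetic against the real numbers rather
than the idealized 4:1 ratio, here is a back-of-the-envelope version
using the KLOC figures above.  Treat it as a sanity check of the
model, nothing more:)

    /* Rough check of the n log n model with the LOC figures above:
     * ~231 KLOC inherited from Classic vs. ~55 KLOC now. */
    #include <math.h>
    #include <stdio.h>

    static double complexity(double loc) { return loc * log(loc); }

    int main(void)
    {
        double then = 231e3, now = 55e3;
        printf("%.2f\n", complexity(then) / complexity(now));
        /* Prints about 4.75 - by this measure the payoff from a
         * REFCLOCKD split today is a bit over a fifth of what it
         * would have been in 2015. */
        return 0;
    }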

Though its internal complexity remains high, ntpd is really quite small
now.  A binary with all refclocks in is 1.4MB.  Which might sound like
a lot, but vi is 2.7MB, Emacs is 16MB, and those are both skinny compared
to more modern programs with heavy GUIs.  

On the other hand, the fixed cost of the REFCLOCKD dissection has
changed much less, because the interface between clock-land and
network-land hasn't changed at all.  It's still just as gnarly as on
day one.

REFCLOCKD benefits way down, cost almost unchanged. Every time I model
this in my head the same answer comes out: bad idea.  I think we have
better complexity-reduction attacks available, like translating the
whole thing to Go to get rid of a lot of the resource-management hair
in the C code. But maybe you can change my mind?
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.



