on the NTP security issues and fixes

Gary E. Miller gem at rellim.com
Fri May 6 18:29:29 UTC 2016


Yo Eric!

On Fri, 6 May 2016 08:39:55 -0400
"Eric S. Raymond" <esr at thyrsus.com> wrote:

> Gary E. Miller <gem at rellim.com>:
> > Yo All!
> > 
> > I pulled git head, running it now on a server in place of chronyd.
> > 
> > Seems to work OK.  I'll keep an eye on it.  
> 
> That's valuable feedback, seeing as I just performed the major surgery
> of removing Autokey.

Yeah, and two new bugs in your queue...

> > I really like the chronyd socket interface over the SHM one.  The
> > user is not playing with magic numbers.  
> 
> I'll take some convincing on this.  Yes, the SHM interface is cryptic
> but the last thing we need is jitter added by buffer management in the
> socket layer.  We don't need the additional complexity or attack
> surface, either; adding yet another interface would be going in the
> exact opposite direction from where I'm trying to take us.

Jitter is not relevant on this interface, and it reduces the attack surface.

Plus the socket is extensible, whereas the SHM is currently very twitchy:
make any change and binaries no longer work together.

But, more important things to do.  Old telco rule of thumb: fix bugs
first, then add features.  Maybe fix #48.

> 
> > 'ntpq -p' is user hostile compared to 'chronyc sources'.  chronyc
> > adds units to the display, so you do not have to keep referring to
> > the manual, and it makes it easy to deal with jitter and delay that
> > varies by orders of magnitude.  
> 
> I think you have a really strong point here.  I'm adding your quote to
> devel/TODO.  It might not happen until we rewrite ntpq in
> Python, though.

No rush, a good step forward.
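
A rough sketch of the unit-scaling idea, in Python since that is where
ntpq may end up.  This is not chronyc's or ntpq's actual code; the
function name and the four-significant-digit format are assumptions
made up for illustration.

```python
def format_interval(seconds):
    """Render a time interval (given in seconds) with an auto-scaled unit.

    Hypothetical helper: picks s, ms, us, or ns so the printed number
    stays readable even when values differ by orders of magnitude.
    """
    magnitude = abs(seconds)
    # (smallest magnitude that uses this unit, multiplier, suffix)
    units = (
        (1.0, 1.0, "s"),
        (1e-3, 1e3, "ms"),
        (1e-6, 1e6, "us"),
    )
    for threshold, multiplier, suffix in units:
        if magnitude >= threshold:
            return f"{seconds * multiplier:.4g}{suffix}"
    # Anything smaller than a microsecond, including zero, is shown in ns.
    return f"{seconds * 1e9:.4g}ns"
```

With this, format_interval(0.0123) gives "12.3ms" and
format_interval(2.5e-6) gives "2.5us", so nobody has to look up the
column's unit in the manual.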

> > And last, but not least, ntpd takes way, way, way longer to converge
> > than chronyd.  Which is why on the fly reconfiguration in ntpd is SO
> > important.  Last thing you ever want to do is restart ntpd.
> > 
> > Right now, after 10 mins, ntpd has 2,000 times the jitter that
> > chronyd had when I turned it off.  
> 
> Mark and Daniel and I would prefer strongly to get rid of on-the-fly
> reconfiguration entirely.  It's a complicated hack leading to chronic
> security issues.  Mark has expressed a preference (with which I
> strongly agree) for the only reconfig method to be "you SIGHUP the
> daemon and it rereads its config".

Yes, I agree with all that.  It is pretty much standard in UNIX to SIGHUP
a daemon to make it reread its config while it stays running.  I'm
surprised ntpd did not do that.

All the other reconfig should go.
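
For reference, the usual SIGHUP pattern looks roughly like the sketch
below.  It is Python for brevity, not ntpd code; the class name, the
one-directive-per-line config format, and the poll() method are all
assumptions for illustration.  The handler only sets a flag, and the
main loop does the actual reread, so no file I/O happens inside the
signal handler.

```python
import signal


class ReloadableDaemon:
    """Hypothetical daemon skeleton that rereads its config on SIGHUP."""

    def __init__(self, config_path):
        self.config_path = config_path
        self.reload_requested = False
        self.config = self.load_config()

    def load_config(self):
        # Assumed format: one directive per line, '#' starts a comment line.
        with open(self.config_path) as f:
            lines = (line.strip() for line in f)
            return [ln for ln in lines if ln and not ln.startswith("#")]

    def on_sighup(self, signum, frame):
        # Keep the handler trivial: just record that a reload was requested.
        self.reload_requested = True

    def poll(self):
        """Call once per main-loop pass; returns True if config was reread."""
        if not self.reload_requested:
            return False
        self.reload_requested = False
        self.config = self.load_config()
        return True


def install_sighup_handler(daemon):
    # SIGHUP is available on UNIX-like systems only.
    signal.signal(signal.SIGHUP, daemon.on_sighup)
```

The daemon keeps its state and its process across the reread, which is
the whole point: no restart, so no waiting to converge again.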

> Do we have to live with long convergence times?  Do you have any
> theory about what causes this and how it can be fixed?

I have not gone deep into the PLL, grouping, and selection layers.
Daniel Franke's talk at Penguicon leads me to believe he is starting to
see these issues.

I'm trying not to fall in that rabbit hole.  I suspect that so much of
that stuff is in RFCs that changing it will be politically difficult at
first.  The end result, obviously better performance, will help
market acceptance.

I do know that chronyd performs better than ntpd in certain configurations.
A compare and contrast needs to be done; maybe grab some chronyd modes
and add a switch so people can compare.  I would ask you not to do so until
the howto is done and stable.

RGDS
GARY
---------------------------------------------------------------------------
Gary E. Miller Rellim 109 NW Wilmington Ave., Suite E, Bend, OR 97703
	gem at rellim.com  Tel:+1 541 382 8588