Removing interleaved mode
Eric S. Raymond
esr at thyrsus.com
Tue Aug 30 10:12:05 UTC 2016
Hal Murray <hmurray at megapathdsl.net>:
> esr at thyrsus.com said:
> > I'm not clear how we would tell if anything were broken. Or are you just
> > suggesting looking for a timing difference?
> Mostly I was suggesting that we needed to test it to make sure that it seemed
> to work rather than worrying about how well it worked.
> After that, it would be nice to see how well it works. I think that would
> take 2 systems with PPS on both. That's 2 per test. I think we could test
> old and new with 4 systems: 2 new, 2 old, and exchange packets in all 12
> combinations.
> My suggestion of using Pis is probably bogus. Their Ethernet-over-USB will
> add enough noise to confuse things.
Given the nature of the bugs Daniel found, I don't think this effort can really
be justified. Here is his comment in full.
Remove interleaved mode
Interleaved mode was an invention intended to improve timekeeping
precision in symmetric and broadcast modes. The problem it was meant to
solve is that transmit timestamps have to be written before the packet
is sent, but right *after* the packet is sent, better information
becomes available because you know exactly when the packet made it
through the kernel and out onto the wire. So, the basic idea of
interleaved mode was to dump that better value into the *next* packet,
and have the peer follow along with that, always one packet behind.
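To make the one-packet-behind idea concrete, here is a toy sketch (not NTPsec code; the field names are illustrative, not the actual NTP packet fields): packet n carries its own rough transmit timestamp, plus the refined timestamp for packet n-1, which only became known after n-1 was sent.

```python
def make_packets(rough_and_refined):
    """Simulate the interleaved-mode sender.

    rough_and_refined: list of (rough_ts, refined_ts) pairs, one per
    packet. The rough timestamp is written before sending; the refined
    one only becomes available after the send completes.
    """
    packets = []
    prev_refined = None  # nothing better to report in the first packet
    for rough, refined in rough_and_refined:
        # Each packet carries its own rough transmit timestamp plus the
        # refined timestamp of the *previous* packet.
        packets.append({"xmt": rough, "prev_refined_xmt": prev_refined})
        prev_refined = refined
    return packets

pkts = make_packets([(1.000, 1.002), (2.000, 2.001), (3.000, 3.003)])
# pkts[1]["prev_refined_xmt"] == 1.002, the refined value for packet 0
```

The peer, to use the refined value, must hold each measurement open until the following packet arrives, which is exactly the extra state-machine complexity the commit message complains about.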
This is a problem that PTP is clearly better suited to solving, but
interleaved mode still seems at least reasonable in theory. However,
there are two big problems.
First, interleaved mode adds a great deal of complexity to NTP's state
machine. This led to at least one terrible vulnerability (CVE-2016-1548)
which took two tries to fix (CVE-2016-4956), and probably indirectly
led to a few others.
Second, the implementation was flawed. "Drivestamps" were collected
simply by calling get_systime() immediately after sendpkt() returned.
However, on modern kernels, send() returns immediately unless the
network buffer is full. So the timestamp that NTP was collecting had
nothing to do with the time the packet actually went out, and was not
any more accurate than the transmit timestamp obtained in basic mode.
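The flaw is easy to see in a small experiment (a demonstration of the general point, not NTPsec code): a timestamp taken immediately after a UDP send measures only the copy into the kernel's socket buffer, typically microseconds, and says nothing about when the packet reached the wire.

```python
import socket
import time

# Send a 48-byte datagram (the size of a basic NTP packet) to the local
# discard port and timestamp on either side of the send() call.
sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
t_before = time.monotonic()
sock.sendto(b"\x00" * 48, ("127.0.0.1", 9))
t_after = time.monotonic()
sock.close()

queue_delay = t_after - t_before
# On a modern kernel this delta reflects only the buffer copy, so a
# "drivestamp" collected this way is no better than the timestamp
# written before the send.
print(f"apparent 'drivestamp' delay: {queue_delay * 1e6:.1f} us")
```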
If interleaved mode ever actually improved timekeeping, there are two
possible explanations. One possibility is that the Solaris
boxen that Dave Mills tested it on had a simpler kernel networking
stack, so the timestamp he was collecting was something closer to a
true drivestamp. Another possibility is the presence of a simple bug:
before the recent refactor of receive(), in every mode except
interleaved mode, NTP was storing a transmit timestamp where a receive
timestamp belonged. Interleaved mode may have been improving
performance just by dodging this buggy code.
<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>