Requesting code review on possible fix for nopeer/pool conflict

Tue Jul 5 13:59:19 UTC 2016

I think I may be on the road to a fix for GitLab issue #79: "pool
command conflicts with restrict nopeer",  However, the relevant piece
of code is so horribly snarled, and the fix likely enough to produce subtle
bugs if we get it wrong, that I want some eyes on my assumptions before I
commit anything.

Hal's bug report reads like this:

    restrict nopeer kills using the pool command. (Try it.) The symptom is
    that no slots ever show up in ntpq -p

    The nopeer restriction is intended to prevent attackers from
    pretending to be a peer and then screwing up the local clock. The pool
    command is using the same peer mechanism to setup a temporary slot. We
    should be able to bypass that part of the restrict filter. We know
    what IP address to expect. The server slots already do it.

I believe the relevant pieces of code are ntp_proto.c. First, the
AUTH macro:

/*
 * This macro defines the authentication state. If x is 1 authentication
 * is required; othewise it is optional.
 */
#define	AUTH(x, y)	((x) ? (y) == AUTH_OK : (y) == AUTH_OK || \
			    (y) == AUTH_NONE)

#define	AUTH_NONE	0	/* authentication not required */
#define	AUTH_OK		1	/* authentication OK */
#define	AUTH_ERROR	2	/* authentication error */
#define	AUTH_CRYPTO	3	/* crypto_NAK */

Then this:

	/*
	 * This is a server mode packet returned in response to a client
	 * mode packet sent to a multicast group address (for
	 * manycastclient) or to a unicast address (for pool). The
	 * origin timestamp is a good nonce to reliably associate the
	 * reply with what was sent. If there is no match, that's
	 * curious and could be an intruder attempting to clog, so we
	 * just ignore it.
	 *
	 * If the packet is authentic and the manycastclient or pool
	 * association is found, we mobilize a client association and
	 * copy pertinent variables from the manycastclient or pool
	 * association to the new client association. If not, just
	 * ignore the packet.
	 *
	 * There is an implosion hazard at the manycast client, since
	 * the manycast servers send the server packet immediately. If
	 * the guy is already here, don't fire up a duplicate.
	 */
	case AM_MANYCAST:

		if ((peer2 = findmanycastpeer(rbufp)) == NULL) {
			sys_restricted++;
			return;			/* not enabled */
		}
		if (!AUTH((!(peer2->cast_flags & MDF_POOL) &&
			   (sys_authenticate?1:0)) | (restrict_mask & (RES_NOPEER |
		    RES_DONTTRUST)), is_authentic)) {
			sys_restricted++;
			return;			/* access denied */
		}

		/*
		 * Do not respond if unsynchronized or stratum is below
		 * the floor or at or above the ceiling.
		 */
		if (hisleap == LEAP_NOTINSYNC || hisstratum <
		    sys_floor || hisstratum >= sys_ceiling) {
			sys_declined++;
			return;			/* no help */
		}
		peer = newpeer(&rbufp->recv_srcadr, NULL, rbufp->dstadr,
			       MODE_CLIENT, hisversion, peer2->minpoll,
			       peer2->maxpoll, FLAG_PREEMPT |
			       (FLAG_IBURST & peer2->flags), MDF_UCAST |
			       MDF_UCLNT, 0, skeyid);
		if (NULL == peer) {
			sys_declined++;
			return;			/* ignore duplicate  */
		}

Here are some of my deductions.  I'm requesting checks on them.

1. This is where a response from a pool dispatch server arrives.  That's
   what the first three lines of the comment seem to imply, anyway.

2. peer2 gets initialized to point at peer structure of the server (a
   pool dispatcher or manycaster) that send the response.

3. The source address in the packet is that of the pool server that the
   pool dispatcher is inviting us to connect to.

Given these assumptions, I think our bug is in this statement here:

    if (!AUTH((!(peer2->cast_flags & MDF_POOL) &&
	       (sys_authenticate?1:0)) | (restrict_mask & (RES_NOPEER |
	RES_DONTTRUST)), is_authentic)) {
	    sys_restricted++;
	    return;			/* access denied */
    }

That is, the guard expression is mis-coded in a way that makes it
evaluate to true (causing responses from pool servers to get
bit-bucketed) when peer2 is a pool dispatcher and the RES_NOPEER
restriction is on.

Our next problem is that this guard is *seriously* nasty.  As in, with
30+ years of C experience I still need to be damned careful modifying
it nasty.

So here's how I attempt to refactor it into something tractable:

    int pool_dispatcher = (peer2->cast_flags & MDF_POOL);
    int need_auth = (!pool_dispatcher && (sys_authenticate?1:0));
    int restricted = (restrict_mask & (RES_NOPEER | RES_DONTTRUST));
    if (!AUTH(need_auth | restricted, is_authentic))
    {
	    sys_restricted++;
	    return;			/* access denied */
    }

Right away something looks fishy here.  The && makes the need_auth part
of the guard bool-valued, but it's composed via | with a mask in which the
on bits are

#define	RES_DONTTRUST		0x0004	/* authentication required */
#define	RES_NOPEER		0x0010	/* new association denied */

As I read the logic, the guard comes out to requiring authentication if
either one of two things is true:

1. peer2 is not a pool dispatcher and sys_authenticate is on, or

2. either the notrust or nopeer restrictions are enabled.

...and there's our bug.  The logic is not right in the pool case.

Here is the logic I think we actually want:

* If sys_authenticate is off, never require authentication.

* If peer2 is a pool dispatcher, require auth only on notrust

* If peer2 is not a pool dispatcher, require auth on notrust or nopeer.

OK, now please eyeball-check this guard carefully:

    int pool_dispatcher = (peer2->cast_flags & MDF_POOL);
    int keepout = pool_dispatcher ? RES_NOTRUST : (RES_NOTRUST | RES_NOPEER);
    if (!AUTH(sys_authenticate, (restrict_mask | keepout), is_authentic)
    {
	    sys_restricted++;
	    return;			/* access denied */
    }

Am I missing anything here?
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>