sys_fuzz

Fred Wright fw at fwright.net
Tue Jan 24 23:22:20 UTC 2017


On Tue, 24 Jan 2017, Gary E. Miller wrote:

> Last week we had a discussion on sys_fuzz and the value of adding
> random noise to some measurements.  The code defines sys_fuzz as:
>
>     "* The sys_fuzz variable measures the minimum time to read the system
>      * clock, regardless of its precision."
>
> Randomness of half the sys_fuzz is then added to some values, like this:
>
>     fuzz = ntp_random() * 2. / FRAC * sys_fuzz
>
> Makes no sense to me.  Adding randomness helps when you have hysteresis,
> stiction, friction, lash and some other things, but none of those apply
> to NTP.

Basing it on the time to *read* the clock definitely makes no sense,
although I suspect one would have to dig back fairly far in the history to
determine the source of that confusion.

If one is dithering, the amount of dither should be based on the clock's
actual resolution, *not* the time required to read it.  In a sampled
system, one would add dither equal to the quantization interval, in order
to produce results statistically similar to sampling with infinite
resolution.  For time values, one would add dither equal to the clock's
counting period, to produce results statistically similar to a clock
running at infinite frequency.
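
For concreteness, roughly something like this is what I mean (purely
illustrative; the 100ns period and the names are made up, and a real
implementation would discover the actual counting period at runtime):

    #include <stdint.h>
    #include <stdlib.h>

    /* Assumed counting period, for illustration only. */
    #define CLOCK_STEP_NS 100.0

    /* Add uniform noise spanning one quantization step to a raw
     * (truncated) reading, so the dithered samples look statistically
     * like samples from a clock of infinite resolution. */
    static double dither_reading(uint64_t raw_ns)
    {
        double noise = ((double)rand() / ((double)RAND_MAX + 1.0))
                       * CLOCK_STEP_NS;
        return (double)raw_ns + noise;
    }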

On Tue, 24 Jan 2017, Hal Murray wrote:

> The NTP case is roughly stiction.  Remember the age of this code.  It was
> working long before CPUs had instructions to read a cycle counter.  Back
> then, the system clock was updated on the scheduler interrupt.  There was no
> interpolation between ticks.

Indeed.  The interrupt was often derived from the power line, making the
clock resolution 16.7ms or 20ms.  With such crummy resolution, applying
some "whitening" looks attractive.

> Mark/Eric: Can you guarantee that we will never run on a system with a crappy
> clock?  In this context, crappy means one that takes big steps.

There are two different time intervals involved - the interval between
successive time values, and the time required to read the clock.  I'd use
the term "coarse" to describe a clock where the former is larger than the
latter, such that it's possible to read the same value more than once.

If you mean "big steps" in the absolute sense, then for some meaning of
"big", the term "crappy" is warranted. :-) But note that a clock can be
"coarse" without being "crappy".  For example, a clock running at 10MHz
isn't particularly "crappy", but if it can be read in 50ns, then it's
still "coarse".
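
A quick-and-dirty way to see both numbers (just a sketch, using
CLOCK_MONOTONIC) is to time a batch of back-to-back reads and watch for
repeated values and the smallest observed increment:

    #include <stdio.h>
    #include <stdint.h>
    #include <time.h>

    static uint64_t read_ns(void)
    {
        struct timespec ts;
        clock_gettime(CLOCK_MONOTONIC, &ts);
        return (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
    }

    int main(void)
    {
        enum { N = 100000 };
        uint64_t start = read_ns(), last = start, step = 0;
        int repeats = 0;
        for (int i = 0; i < N; i++) {
            uint64_t now = read_ns();
            if (now == last)
                repeats++;           /* same value twice: a "coarse" clock */
            else if (step == 0 || now - last < step)
                step = now - last;   /* smallest observed increment */
            last = now;
        }
        /* Rough per-read cost vs. smallest step between distinct values. */
        printf("avg read cost: %.1f ns, smallest step: %llu ns, repeats: %d\n",
               (double)(last - start) / N, (unsigned long long)step, repeats);
        return 0;
    }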

> There is an additional worm in this can.  Some OSes with crappy clocks bumped
> the clock by a tiny bit each time you read it so that all clock-reads
> returned different results and you could use it for making unique IDs.

That's not uncommon, but it's a really bad idea.  Demanding that a clock
always return unique values is an unwarranted extension of the job
description of a clock.  The proper way to derive unique values from a
clock is to wrap it with something that fudges *its* values as needed,
without inflicting lies on the clock itself.  Any clock classified as
"coarse" by the above definition is corrupted by a uniqueness requirement,
whether "crappy" or not.
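
A wrapper along these lines (a single-threaded sketch, locking omitted,
names invented) is all a uniqueness consumer should ever need:

    #include <stdint.h>

    /* Guarantee unique values without lying to the clock itself: if the
     * underlying (possibly coarse) clock repeats a value we have already
     * handed out, bump only the returned copy.  The clock's own readings
     * are never altered. */
    static uint64_t unique_time_ns(uint64_t (*read_clock)(void))
    {
        static uint64_t last_issued = 0;
        uint64_t now = read_clock();
        if (now <= last_issued)
            now = last_issued + 1;
        last_issued = now;
        return now;
    }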

Also note that in some contexts it's reasonable to extend the resolution
of a "coarse" clock (without breaking "fine" clocks) by reading the clock
in a loop until the value changes.  This approach is completely neutered
by a uniqueness kludge.
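
Something like this sketch is what I mean; note that it depends on two
reads within the same tick being able to return the same value:

    #include <stdint.h>

    /* Extend a coarse clock's effective resolution: spin until the
     * reported value changes, so the caller knows the tick boundary to
     * within the (much smaller) read time.  A uniqueness kludge that
     * bumps every reading defeats this, because the value "changes" on
     * every read. */
    static uint64_t read_at_tick(uint64_t (*read_clock)(void))
    {
        uint64_t first = read_clock();
        uint64_t now;
        while ((now = read_clock()) == first)
            ;   /* wait for the next counting-period boundary */
        return now;
    }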


Getting back to the original issue, if dithering is warranted, then there
are a couple of pitfalls:

1) If it's a "coarse" clock, then dithering destroys monotonicity.  In
*some* (mainly statistical) contexts, non-monotonic time values may be
perfectly OK, but in any context involving intervals they can be
disastrous.  So one would probably need to keep both dithered and
undithered time values.

2) Determining the proper amount of dither isn't necessarily easy.  The
clock_getres() function is supposed to report the actual clock resolution,
which is what should determine the amount of dither, but in practice it's
rarely correctly implemented.  E.g., in the Linux cases I've tested, it
ignores the hardware properties and just retuns 1ns.

I'm not convinced that sub-microsecond dithering is worthwhile, anyway.
If the dithering code is retained at all, it might make sense to have a
configure test that reads clock_getres(), and only enables dithering
support if the result is more than a microsecond.  That test would be
unaffected by the aforementioned lies in clock_getres().  Though there'd
need to be a way to force dithering on for testing, since it's unlikely
that any test platforms would use it naturally.  And those sorts of
configure tests are problematic for cross-building.
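
The probe itself would be tiny; something along these lines (a sketch,
not actual configure machinery):

    #include <time.h>

    /* Exit 0 (enable dithering support) only when the reported clock
     * resolution is coarser than one microsecond.  A kernel that falsely
     * reports 1ns simply fails the test, which errs toward leaving
     * dithering disabled.  The build would still need an explicit switch
     * to force dithering on for testing, and a run-time probe like this
     * can't be used when cross-building. */
    int main(void)
    {
        struct timespec res;
        if (clock_getres(CLOCK_REALTIME, &res) != 0)
            return 1;
        if (res.tv_sec > 0 || res.tv_nsec > 1000)
            return 0;   /* coarser than 1us: enable dithering support */
        return 1;       /* fine (or lying) clock: leave it disabled */
    }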

BTW, if the only use for randomness is for computational dithering, and
not for security, then there's no need for crypto-quality randomness.  In
that case, why not just read /dev/urandom directly and dispense with the
whole libsodium mess?
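
E.g. something as simple as this (a sketch with minimal error handling)
would cover the dithering case:

    #include <stdio.h>

    /* Pull non-cryptographic dither bytes straight from /dev/urandom,
     * with no external library involved. */
    static int urandom_bytes(void *buf, size_t len)
    {
        FILE *f = fopen("/dev/urandom", "rb");
        if (f == NULL)
            return -1;
        size_t got = fread(buf, 1, len, f);
        fclose(f);
        return got == len ? 0 : -1;
    }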

Fred Wright

