worth reading "Write opinionated workarounds"

Sat Apr 16 20:44:54 UTC 2016

http://www.daemonology.net/blog/2016-04-11-write-opinionated-workarounds.html

Write opinionated workarounds
by Colin Percival

A few years ago, I decided that I should aim for my code to be as portable
as possible. This generally meant targeting POSIX; in some cases I required
slightly more, e.g., "POSIX with OpenSSL installed and cryptographic
entropy available from /dev/urandom". This dedication made me rather
unusual among software developers; grepping the source code for the
software I have installed on my laptop, I cannot find any other examples of
code with strictly POSIX-compliant Makefiles, for example. (I did find one
other Makefile which claimed to be POSIX-compatible; but in actual fact it
used a GNU extension.) As far as I was concerned, strict POSIX compliance
meant never having to say you're sorry for portability problems; if someone
ran into problems with my standard-compliant code, well, they could fix
their broken operating system.

And some people did. Unfortunately, despite the promise of open source,
many users were unable to make such fixes themselves, and for a rather
large number of operating systems the principle of standards compliance
seems to be more aspirational than actual. Given the limits which would
otherwise be imposed on the user base of my software, I eventually decided
that it was necessary to add workarounds for some of the more common bugs.
That said, I decided upon two policies:

1. Workarounds should be disabled by default, and only enabled upon
   detecting an afflicted system.
2. Users should be warned that a workaround is being applied.

The first policy is essential for preventing a scenario often found in
older software: A workaround is added for one system, but then that
workaround introduces a problem on a second system and so a workaround is
added for the workaround, and then a problem is found with that second
workaround... and ten years later there's a stack of workarounds to
workarounds which nobody dares to remove, even though the original problem
which was being worked around has long since been corrected. If a
workaround is disabled by default, it's less likely to provoke such a stack
of workarounds — and it's going to be much easier to remove them once
they're no longer needed.

The second policy is important as a matter of education: Users deserve to
know that they're running a broken operating system. And running broken
operating systems they are doing. Here are some of the warnings people will
see, along with explanations (more for the benefit of people who arrive
here via Google than for my regular readership):

WARNING: POSIX violation: make's CC doesn't understand -lxnet
WARNING: POSIX violation: make's CC doesn't understand -lrt
    The POSIX C compiler
    <http://pubs.opengroup.org/onlinepubs/9699919799/utilities/c99.html>
    is required to accept the options -lxnet and -lrt even if those
    libraries do not exist. On many systems the functionality implied by
    those options is included in libc and is thus always available, but
    those options are not properly ignored.

WARNING: POSIX violation: <time.h> not defining CLOCK_REALTIME
    Up to POSIX.1-2004, CLOCK_REALTIME was part of the optional "Timers"
    component; but it is now a mandatory part of the standard
    <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/time.h.html>,
    although the (arguably far more useful) CLOCK_MONOTONIC clock remains
    optional.
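
As an illustration of the two policies applied to this case, here is a
minimal sketch of a workaround that is compiled in only when <time.h> lacks
CLOCK_REALTIME, and that says so when it is. It is not taken from the
article: the helper name wallclock_now() and the macro
USE_GETTIMEOFDAY_FALLBACK are hypothetical, gettimeofday() is an assumed
fallback, and the use of #warning (rather than a WARNING line printed by
the build system) is a substitution made for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/time.h>
#include <time.h>

/* The workaround is off by default: it exists only if the header is deficient. */
#ifndef CLOCK_REALTIME
#warning POSIX violation: <time.h> not defining CLOCK_REALTIME
#define USE_GETTIMEOFDAY_FALLBACK 1
#else
#define USE_GETTIMEOFDAY_FALLBACK 0
#endif

/*
 * Hypothetical helper: store the current wall-clock time in *ts.
 * Returns 0 on success, -1 on failure.
 */
int wallclock_now(struct timespec *ts)
{
    assert(ts != NULL);
#if USE_GETTIMEOFDAY_FALLBACK
    /* Assumed fallback: microsecond resolution, converted to nanoseconds. */
    struct timeval tv;
    if (gettimeofday(&tv, NULL) != 0)
        return -1;
    ts->tv_sec = tv.tv_sec;
    ts->tv_nsec = (long)tv.tv_usec * 1000L;
    return 0;
#else
    /* Compliant system: no workaround code is compiled at all. */
    return clock_gettime(CLOCK_REALTIME, ts);
#endif
}
```

On a compliant system the fallback branch disappears entirely, so removing
the workaround later means deleting one #if block.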

WARNING: POSIX violation: <sys/socket.h> not defining MSG_NOSIGNAL
    Another historical portability problem, MSG_NOSIGNAL became mandatory
    starting in POSIX.1-2008
    <http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_socket.h.html>.
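
A similar sketch for MSG_NOSIGNAL, again an assumption rather than the
author's code: if the flag is missing, it is defined as 0 and SIGPIPE is
ignored process-wide instead, which is a much coarser tool. The names
send_nosigpipe(), demo_send_to_closed_peer() and NEED_SIGPIPE_WORKAROUND
are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Enabled only on afflicted systems, and never silently. */
#ifndef MSG_NOSIGNAL
#warning POSIX violation: <sys/socket.h> not defining MSG_NOSIGNAL
#define MSG_NOSIGNAL 0
#define NEED_SIGPIPE_WORKAROUND 1
#else
#define NEED_SIGPIPE_WORKAROUND 0
#endif

/* Hypothetical helper: send() that reports EPIPE instead of raising SIGPIPE. */
ssize_t send_nosigpipe(int fd, const void *buf, size_t len)
{
    assert(buf != NULL);
#if NEED_SIGPIPE_WORKAROUND
    /* Assumed workaround: ignore SIGPIPE for the whole process. */
    signal(SIGPIPE, SIG_IGN);
#endif
    return send(fd, buf, len, MSG_NOSIGNAL);
}

/*
 * Usage sketch: write to a socket whose peer has been closed.
 * Returns the errno reported by send(), 0 if send() succeeded,
 * or -1 if the socket pair could not be created.
 */
int demo_send_to_closed_peer(void)
{
    int sv[2];
    ssize_t r;
    int err;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;
    close(sv[1]);                       /* peer goes away */
    r = send_nosigpipe(sv[0], "x", 1);  /* must fail without killing us */
    err = (r == -1) ? errno : 0;
    close(sv[0]);
    return err;
}
```

The point of the wrapper is that callers see the same behaviour either way:
a failed send() with errno set to EPIPE rather than a terminated process.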

#warning Working around bug in LLVM optimizer
#warning For more details see https://llvm.org/bugs/show_bug.cgi?id=27190
    LLVM is known to miscompile code paths containing longjmp or siglongjmp
    calls. I'm actually rather shocked that this wasn't noticed and fixed a
    long time ago; longjmp doesn't get used very often, but the places
    where it does get used tend to be places where having miscompiled code
    is even scarier than normal.

WARNING: Applying workaround for Docker signal-handling bug
    Unlike the others, this warning appears at run-time; it refers to a
    problem where SIGTERM and SIGINT are disabled for a process running as
    init in a Docker container.
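
A run-time workaround can follow the same two policies. The sketch below is
an assumption about how such a check might look, not the author's
implementation: it treats "running as init" as "process ID is 1", prints
the warning, and installs explicit handlers for SIGTERM and SIGINT. The
names apply_init_signal_workaround() and exit_on_signal() are hypothetical,
and the PID is passed as a parameter only so that both branches can be
exercised.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Async-signal-safe handler: exit with the conventional 128+signo status. */
static void exit_on_signal(int signo)
{
    _exit(128 + signo);
}

/*
 * Hypothetical helper. Returns 1 if the workaround was applied, 0 if it
 * was not needed, and -1 if installing a handler failed.
 */
int apply_init_signal_workaround(pid_t pid)
{
    struct sigaction sa;

    assert(pid > 0);

    /* Policy 1: disabled by default; only an afflicted process enables it. */
    if (pid != 1)
        return 0;

    /* Policy 2: tell the user that a workaround is being applied. */
    fprintf(stderr,
        "WARNING: Applying workaround for Docker signal-handling bug\n");

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = exit_on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTERM, &sa, NULL) != 0 ||
        sigaction(SIGINT, &sa, NULL) != 0)
        return -1;
    return 1;
}
```

A program would call apply_init_signal_workaround(getpid()) early in
main(); a real daemon would probably want its handler to trigger a clean
shutdown rather than exit immediately.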

But as passionate as I am about user education, there's a far more
important reason for that second policy: Getting things fixed. All of these
are problems we could have worked around silently; indeed, with the
exception of the LLVM bug (which I don't think anyone else has noticed) all
of them have been worked around silently. But while silent workarounds solve
the immediate problem for one piece of software, they do nothing to help
the next developer who trips over those bugs. Warnings, on the other hand,
can help to get bugs fixed: Indeed, a few months ago I fixed a bug in
FreeBSD <https://svnweb.freebsd.org/base?view=revision&revision=292723> for
the sole reason that I was getting annoyed by one of my own warning
messages! Even if the vast majority of people who see those warnings
disregard them, any chance that the right developer will get the message
and fix a bug is better than none.

My regular readers will know that I care deeply about producing correct
code, offering bounties <http://www.tarsnap.com/bugbounty.html> for issues
as trivial as misplaced punctuation in comments. But it isn't just my own
code I care about; I'm affected by bugs in all of the code I run, and even
by bugs in code I don't run if I rely on someone else who does. So please,
if you find a bug, don't just work around it; shout it from the rooftops in
the hope that the right people will hear.

Because if we all stop accepting broken code, we might eventually end up
with less broken code.

END