Dump bison

Eric S. Raymond esr at thyrsus.com
Tue Jun 19 04:45:58 UTC 2018


Hal Murray via devel <devel at ntpsec.org>:
> 
> Gary said:
> > In a perfect world someone rewrites ntp_parser.tab.c in a modern language. 
> 
> What are the options in that area?

Not so great.  Nothing but Bison really makes strategic sense.

(Hey, Ian!  Pay attention.  Lore of some significance about to be spoken.)

Sorry, Gary, but rewriting ntp_parser.tab.c (the generated parser
code) in a modern language is a *terrible* idea.  It would lead to one
of two outcomes:

(1) I rewrite the C in a strict LL(1) recursive-descent style, that being
the only practical way to hand-roll a parser of any size.  The source
becomes relatively easy to understand and modify, but the quality of the
messages we get on parse errors degrades significantly.  This is why
LALR(1) parsers like the ones Yacc/Bison generates are popular; they
are better than recursive descent at error diagnostics and recovery.

Also, verifying the equivalence of our new hand-rolled LL(1) grammar
with the old Bison-generated one is actually quite difficult. As Jamie
Zawinski's famous snark about a parallel situation almost put it,
"Now you have *two* problems!"

(2) We stick with the generated LALR(1) parser, freeze the state
tables in place, and clean up the C.  OK, we keep good recovery
behavior but this makes further modifications of the parser much, *much*
more difficult than they need to be.  Future maintainers will want to
hurt us very badly if we do this.
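
To make outcome (1) concrete, here is a minimal sketch of the strict
LL(1) recursive-descent style: one function per grammar rule, one token
of lookahead.  Everything in it is an assumption made for illustration.
The grammar is a made-up toy, not the real ntp.conf grammar, and the
names (server_line, cursor, parse_server_line) are hypothetical and do
not come from the NTPsec source.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy grammar (NOT the real ntp.conf grammar):
 *   line   := "server" WORD option*
 *   option := "iburst" | "minpoll" NUMBER
 */
struct server_line {            /* parse result */
    char host[64];
    bool iburst;
    int  minpoll;               /* -1 if not given */
};

struct cursor {                 /* all parser state lives here */
    const char *p;              /* next unread character */
    char tok[64];               /* one-token lookahead, "" at end of input */
};

/* Load the next whitespace-delimited token (over-long tokens are truncated). */
static void next_token(struct cursor *c)
{
    size_t n = 0;
    while (isspace((unsigned char)*c->p))
        c->p++;
    while (*c->p != '\0' && !isspace((unsigned char)*c->p)) {
        if (n < sizeof(c->tok) - 1)
            c->tok[n++] = *c->p;
        c->p++;
    }
    c->tok[n] = '\0';
}

/* option := "iburst" | "minpoll" NUMBER */
static bool parse_option(struct cursor *c, struct server_line *out)
{
    if (strcmp(c->tok, "iburst") == 0) {
        out->iburst = true;
        next_token(c);
        return true;
    }
    if (strcmp(c->tok, "minpoll") == 0) {
        char *end;
        long v;
        next_token(c);
        if (c->tok[0] == '\0')
            return false;       /* "minpoll" with no argument */
        v = strtol(c->tok, &end, 10);
        if (*end != '\0')
            return false;       /* argument is not a plain integer */
        out->minpoll = (int)v;
        next_token(c);
        return true;
    }
    return false;               /* unknown option keyword */
}

/* line := "server" WORD option* */
bool parse_server_line(const char *text, struct server_line *out)
{
    struct cursor c;
    assert(text != NULL && out != NULL);
    c.p = text;
    next_token(&c);
    if (strcmp(c.tok, "server") != 0)
        return false;
    next_token(&c);
    if (c.tok[0] == '\0')
        return false;           /* missing host name */
    strcpy(out->host, c.tok);   /* safe: tok and host are the same size */
    out->iburst = false;
    out->minpoll = -1;
    next_token(&c);
    while (c.tok[0] != '\0')
        if (!parse_option(&c, out))
            return false;
    return true;
}
```

A caller would fill a struct server_line by passing a line such as
"server pool.example.org iburst minpoll 6" to parse_server_line().  Note
that this sketch reports only success or failure; any diagnostics would
have to be written by hand at each point where it returns false.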

I think Hal is implying that we could rewrite the grammar for a parser
generator that grinds out less crappy, more modern C. This is
possible.  I could stand to learn MARPA or ANTLR and would in fact
rather enjoy doing so, because, as Gary noted ("Parsers are ESR's
thing."), I'm queer that way.  Just about any parser generator designed
after 19-fscking-73 (the year of Yacc's debut) would pretty much have
to generate C that is less riddled with weird archaisms like those
#@!%@$ yy-prefix global names that are such a shit-awful substitute
for a simple context structure.
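
The contrast between global names and a context structure can be
sketched in a few lines of C.  This is an illustration under stated
assumptions, not Bison's actual interface: g_input, g_value, lex_ctx,
lex_init and lex_next are hypothetical names, and the "lexer" only
picks decimal numbers out of a string.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Style 1: Yacc-like interface, state shared through file-scope globals.
 * g_input and g_value are hypothetical stand-ins for yy-prefixed names. */
const char *g_input = "";
int g_value;

/* Read the next decimal number from g_input into g_value.
 * Returns 1 on success, 0 at end of input.  No overflow checking. */
int global_lex(void)
{
    while (*g_input != '\0' && !isdigit((unsigned char)*g_input))
        g_input++;
    if (*g_input == '\0')
        return 0;
    g_value = 0;
    while (isdigit((unsigned char)*g_input))
        g_value = g_value * 10 + (*g_input++ - '0');
    return 1;
}

/* Style 2: the same lexer with its state in a caller-owned context. */
struct lex_ctx {
    const char *input;          /* next unread character */
    int value;                  /* value of the last number read */
};

void lex_init(struct lex_ctx *ctx, const char *text)
{
    assert(ctx != NULL && text != NULL);
    ctx->input = text;
    ctx->value = 0;
}

/* Same contract as global_lex(), but touches only *ctx. */
int lex_next(struct lex_ctx *ctx)
{
    assert(ctx != NULL);
    while (*ctx->input != '\0' && !isdigit((unsigned char)*ctx->input))
        ctx->input++;
    if (*ctx->input == '\0')
        return 0;
    ctx->value = 0;
    while (isdigit((unsigned char)*ctx->input))
        ctx->value = ctx->value * 10 + (*ctx->input++ - '0');
    return 1;
}
```

With the context version, two lex_ctx instances can be advanced in
interleaved order without disturbing each other.  With the global
version there is exactly one lexer per program, and every caller has to
know the magic names.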

(Ian, part of the reason I sent up a pay-attention flare is that
otherwise you would have tripped over Yacc someday, noticed that this
time-hallowed tool of the primal Unix gods seems to have a hideously
bad interface design, and maybe wondered whether that meant there's
something wrong with *your* judgment.  There isn't.  Yacc is
*immensely* useful and I love it almost like I love my 1911, but its
interface is one of the most regrettable legacies of early Unix, up
there with significant tabs in Makefiles.)

The problem with the new-parser-generator theory is that as much fun
as I'd have doing it, the net effect on stability and maintainability
would probably be negative.  There's that
how-do-you-know-you-specced-the-same-grammar problem again.  Also,
while aspects of Yacc are ancient and botched, it's kind of a
Schelling point - anybody who knows
how to drive exactly one parser generator knows Yacc, and if you
switch to something else you reduce the number of people pre-qualified
to maintain your code pretty sharply.

This also affects our vague speculations about Go. If we ever move on
those, I know we'll have a Go equivalent of Yacc that will make that
part of the transition relatively easy. I have no such confidence about
any other parser generator.

So, in sum, I think living with a warning or two is the least bad option.
The second least bad is that I slightly customize the Bison parser
skeleton to make the problem go away; of course then we'd have to
maintain that through Bison upgrades.
-- 
		<a href="http://www.catb.org/~esr/">Eric S. Raymond</a>

My work is funded by the Internet Civil Engineering Institute: https://icei.org
Please visit their site and donate: the civilization you save might be your own.



