[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Possible extension to SableCC. WAS: Re: reduce conflict on TKAs

A Schneider at XCC wrote:
> After each token, the parser could be asked to how to proceede, e.g.
> shift or reduce.
> ...

YACC (& Bison) let you do this.  But this has proved to be a very
dangerous thing to do, since it changes the "language" accepted by the
parser.  You can ask any yacc/bison old timer about playing with
conflict resolution in parse tables.

A theoretical result says that any language that can be recognized by a
deterministic push down automaton (PDA) has an LR(1) grammar. [In the
literature, this set of language is called: the deterministic
context-free languages (DCFL)].

In other words, tweaking the parsing tables using "constant" decisions
on reduce/reduce and/or shift/reduce conflicts does not add any power to
the parser generator. There exist an equivalent grammar which is LR(1).

The advantage of a pure LALR(1) grammar over its equivalent "hacked"
version is that you know how it will behave on all inputs. Using
Bison/Yacc conflict resolution mechanisms creates a gap between the
written grammar and the actual grammar of the language being parsed.
This is a cause of much trouble in compiler development and maintenance.

There are however a few well known and useful conflict resolution hacks
that can be use relatively safely, for they have been tested extensively
over the last couple of decades. These hacks are (1) the dangling-else
ambiguity and, (2) operator precedence in exp = exp op1 exp | exp op2
exp | ... | exp opn exp;

I could keep writing about this subject for long.  Let's make this short
but just saying that I will not provide any dangerous conflict
resolution mechanisms, UNLESS I can figure out a way of making sure that
enough information is provided to a user to make a clear and sound
decision, and that the process is explicit enough about its consequences
as to discourage light-headed use;-)

I do know about the idealistic "I simply want an AST for my grammar". 
But there are many problems with grammars.  For one, a grammar can be
ambiguous... This is no good for building an AST, because there are many
valid ASTs.   Secondly, the theory tells us that finding if a given
context-free grammar is ambiguous or not is an undecidable problem!!! 
This means that there will never be any program to answer this question.

On the other hand, you can go over the mailing list archive.  I have
gathered many requests from users and I am working on some improvements
to SableCC that I have already outlined in some message.  But just keep
in mind that this is an open source project.  There is no time schedule
yet for any release of these extensions.  If you have any urgent need,
see the "readme.html" file...

For your other messages, you might want to look in the mailing list for
"ignored alternatives" and customized parsers/lexers.

Keep in mind that a lexer is always ahead of the parser. So it can be
quite tricky to establish feedback from parser to lexer.  But a few
people succeeded in doing this in the past using yacc/bison.  So it
should be feasible using SableCC.  Just be warned that it is a tricky
thing to do!

Have fun!


Etienne Gagnon, M.Sc.                   e-mail: gagnon@sable.mcgill.ca
Author of SableCC:                 http://www.sable.mcgill.ca/sablecc/