[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Re: LaTeX from SableCC & auto-resolution of shift/reduce
Hi Pat,
Patrick LAM wrote:
I'm writing a toy language (for a POPL paper). To avoid embarassing
errors, it would be nice if my toy language could actually have a parser
which recognizes the language. So far, so good; I can write a SableCC
grammar for the language.
1. It would be great if I could generate a LaTeX grammar from the SableCC
grammar (design conformance for compilers!). It doesn't seem like too
much work and I guess I could do this in a few hours.
I have an M.Sc. student working on a back-end scripting engine for SableCC.
Once it is in place, it will be fairly easy to write a LaTeX target.
Meanwhile, you can probably get away with relatively small modifications to
Gen*.java classes and a related latex.txt template file.
2. Unfortunately, my language has shift-reduce conflicts. I can manually
resolve them, but then if I would automatically generate LaTeX on the
resolved grammars, I'd get pretty nonintuitive grammars as a result.
Now, it appears to me that Etienne's inlining algorithm for resolving
shift-reduce conflicts could be automatically applied to a source SableCC
file to get a shift-reduce free SableCC file. Does it seem to be
reasonable to implement a SableCC-to-SableCC compiler that could
automatically inline offending productions, or am I missing something
here?
This is effectively a quite neat idea. "Automati Conflict Resolution" for
simple conflicts sounds cool! I have already put a student to work on this.
When you think about it, you can see that automatic inlining can *sometimes*
be even more powerful than using K>1 lookahead [LR(K)]:
Grammar with conflict:
======================
prod =
{one} a T.id list T.one |
{two} b T.id list T.two;
a = T.id;
b = T.id;
list =
{single} T.id |
{many} list T.id;
In this case, there is obviously a reduce/reduce conflict on a/b after seing
a T.id. No finite amount K of lookahead is sufficient to resolve this conflict
as "list" can be arbitrarily long, and the parser needs to know whether the token
following the list is a T.one or T.two to resolve the conflict.
On the other hand, inlining does resolve the conflict:
Iniling
=======
prod =
{one} T.id T.id list T.one |
{two} T.id T.id list T.two;
list =
{single} T.id |
{many} list T.id;
In tnis case, the conflict is eliminated because the LR parser won't have to
make a decision on whether a prod.{one} or a prod.{two{ is being parsed before
seeing T.one or T.two. :-)
Kevin's CST->AST transformations can be extended to make such internal inlining
in case of conflict while hiding this fact from the programmer.
Thanks a lot, Patrick, for this neat idea.
Etienne
--
Etienne M. Gagnon, Ph.D. http://www.info.uqam.ca/~egagnon/
SableVM: http://www.sablevm.org/
SableCC: http://www.sablecc.org/