[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: ANNOUNCE: SableCC version 2.1

To: "Chris F Clark" <cfc@world.std.com>, <gagnon@cs.mcgill.ca>
Subject: Re: ANNOUNCE: SableCC version 2.1
From: "Etienne Gagnon" <egagnon@sprynet.com>
Date: Wed, 25 Mar 1998 18:16:42 -0500
Cc: <cfc@world.std.com>, <sreeni@metamata.com>, <sriram.sankar@metamata.com>, <parrt@magelang.com>, <nkramer@jprc.com>, <javacc-interest@metamata.com>, <sablecc-list@sable.mcgill.ca>

>Another way of specifying conflict resolution appeared in a recent
>(say last 2 years) TOPLAS article.  If I remember correctly, it was
>coached in terms of which rule was to be preferred, which might work
>well for you since you have names for every rule.
>
>Of course, you also have (the option of) names for almost every token,
>and so you could also express conflict resolution in terms of those
>names.  Something like the following would be quite explicit....
>
> exp = {plus_alternative} exp plus exp
>     | {star_alternative} exp star exp
>
>PREFER REDUCE plus_alternative OVER SHIFT plus
>// makes a + b + c parse as (a + b) + c
>PREFER SHIFT star OVER REDUCE plus_alternative
>// makes a + b * c parse as a + (b * c)


Interesting approach. I'll investigate it and probably implement it in
future version of SableCC.

>
>Etienne:
>> They key to hacking a SableCC generated compiler is customized
>> Lexers/Parsers.
>>...
>This should suffice.  Of course, some grammars will only work with the
>correct filters applied (e.g. C with its infamous typedef problem),
>but that is a separate concern.  You seem to be very good at following
>your maxim:
>> SableCC is also different in its approach to compiler construction. It
>> keeps, at all points, a very clear separation of generated/human written
>> code. (It uses OO design patterns to achieve this.)
>


One way of dealing with difficult grammars in SableCC is to "relax" the
parsing rules (to the extreme, if needed), and to modify the AST later. For
example, in the SIMPLE C grammar, I dealt with the typedef problem by
removing the following alternative:

type_specifier = ... |
  ({typedef} identifier);  // <== IGNORED

and then adding:

variable_declaration =
  type_specifier declarator semicolon | // was already there.
  {identifier} identifier declarator semicolon; // <== ADDED

(In reality, you do this transformation at several points of the grammar).

Later, once the AST is fully constructed, I do one pass over it to change
back the AST to match the original grammar. This is quite simple. Here's the
code to remove the added alternative:

class ModifyAST extends DepthFirstAdapter
{
 public void outAIdentifierVariableDeclaration(
                AIdentifierVariableDeclaration node)
 {
  node.replaceBy(new AVariableDeclaration(
   new ATypedefTypeSpecifier(node.getIdentifier()),
   node.getDeclarator(),
   node.getSemicolon()));
 }
}

If this is not enough, then all the old tricks are allowed;-) You could let
a customized parser exchange information with a customized lexer, and play
all the possible dirty tricks. But I do not want to encourage people to use
these tricks and loose the clean + readable + simple coding principles of
maintainable software.

>Hope this helps,


Thanks for the references on the new sr/rr conflict resolution techniques.

Etienne

Prev by Date: Re: ANNOUNCE: SableCC version 2.1
Next by Date: More Visitors
Prev by thread: Re: ANNOUNCE: SableCC version 2.1
Next by thread: Re: ANNOUNCE: SableCC version 2.1
Index(es):
- Date
- Thread