Re: grammar questions

Ah! Yes...

There's a weird solution that would parse the original PHP grammar... 
With LALR parsers, weeding (i.e. post-parsing cleanup) is a very
powerful tool.

So, you could relax your grammar rules and define your grammar as
attached.  This:
(1) will accept invalid programs like:
  if (a) x = 4; else v = 3; else y = 4;
(2) won't group "if", "elseif" and "else" statements together.
(3) will group things incorrectly, e.g.:

if (a)
  if (b)
    x = 1;
    x = 2;
  x = 3;

will be grouped as

if (a)
  if (b)
    x = 1;
  x = 2;
  x = 3;

But, as it accepts a superset of your language, all you need to do is
traverse the AST and match related things together, and weed out invalid
constructs:-)  All this in linear time (no need for infinite lookahead
with potentially exponential costs)...

It's a little tricky, as you have to regroup things correctly.
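
To give an idea of what such a weeding pass might look like, here is a
rough sketch. It is in Python rather than the Java that SableCC generates,
and the Stmt, IfChain and WeedError classes are hypothetical stand-ins for
the generated AST nodes, not SableCC's API. It assumes the flat shape the
relaxed grammar produces (a list of statements, each "if", "elseif" or
"else" carrying one nested statement) and attaches each "elseif"/"else" to
the innermost preceding "if" that is still open:

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


class WeedError(Exception):
    """Raised for programs the relaxed grammar accepts but the language forbids."""


@dataclass
class Stmt:
    # Hypothetical node as produced by the relaxed grammar.
    kind: str                      # "empty", "if", "elseif" or "else"
    cond: Optional[str] = None     # condition text for "if" / "elseif"
    body: Optional["Stmt"] = None  # nested statement for if / elseif / else


@dataclass
class IfChain:
    # Regrouped node: an if, its elseif branches, and an optional else.
    branches: List[Tuple[str, object]]
    else_body: Optional[object] = None


def regroup_one(s):
    """Regroup a single statement that must not itself be an elseif/else."""
    if s.kind in ("elseif", "else"):
        raise WeedError(f"'{s.kind}' without matching 'if'")
    if s.kind == "if":
        return IfChain([(s.cond, regroup_one(s.body))])
    return s


def attach(node, clause):
    """Attach an elseif/else clause to the innermost open if inside node."""
    if not isinstance(node, IfChain):
        return False
    # Try the deepest candidate first: the body of the last branch.
    last = node.else_body if node.else_body is not None else node.branches[-1][1]
    if attach(last, clause):
        return True
    if node.else_body is not None:
        return False               # this chain is already closed by an else
    body = regroup_one(clause.body)
    if clause.kind == "elseif":
        node.branches.append((clause.cond, body))
    else:
        node.else_body = body
    return True


def weed(stmts):
    """Regroup a flat statement list; raise WeedError on invalid programs."""
    out = []
    for s in stmts:
        if s.kind in ("elseif", "else"):
            if not out or not attach(out[-1], s):
                raise WeedError(f"'{s.kind}' without matching 'if'")
        else:
            out.append(regroup_one(s))
    return out
```

With this sketch, the statement list for "if (a) ; else ; else ;" raises
WeedError on the second else, and for "if (a) if (b) ; else ;" the else
ends up on the inner if. Walking down the last branch on every attach is
the simple version; keeping a stack of still-open ifs would avoid the
repeated walk.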

(As you see, there's always a solution).

Have fun!

Etienne M. Gagnon, M.Sc.                     e-mail: egagnon@j-meg.com
Author of SableCC:                             http://www.sablecc.org/
and SableVM:                                   http://www.sablevm.org/

Tokens

  blank = (' ' | 10 | 13)*;

  semicolon = ';';
  if = 'if';
  exp = 'exp';
  colon = ':';
  else = 'else';
  elseif = 'elseif';
  endif = 'endif';

Ignored Tokens

  blank;

Productions

program = stmt*;

stmt =
  {empty} semicolon |
  {if} if exp stmt |
  {elseif} elseif exp stmt |
  {else} else stmt |
  {ifcolon} if exp colon stmt elseifcolonstmt* elsecolonstmt? endif;

elseifcolonstmt =
  elseif exp colon stmt;

elsecolonstmt =
  else colon stmt;