
Re: use of states



Hi Kyle.

The grammar you sent does not compile, e.g.:
[25,4] TEXT undefined.

I have made a few obvious modifications.  It should now do what you
intended.

You'll see that the ideal solution involves both "states" and
"lookahead".  As lookahead is not yet implemented, you can achieve the
same result using a custom lexer.
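
The push-back idea can be sketched in plain Java, independent of the
classes SableCC generates.  This is only an illustration under stated
assumptions: the class MacroLookahead, its methods, and the
MAX_LOOKAHEAD bound are hypothetical names, not SableCC API.  In a real
custom lexer the same check would be wired into a subclass of the
generated lexer, which is not shown here.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

// Hypothetical helper, not part of SableCC. It emulates the lookahead
//   '$' / '{' id_component ('.' id_component)* '}'
// by reading ahead and then pushing every character back.
final class MacroLookahead {
    // Assumed upper bound on lookahead; also the push-back buffer size.
    static final int MAX_LOOKAHEAD = 256;

    private static final int EXPECT_BRACE = 0;
    private static final int EXPECT_ID_START = 1;
    private static final int IN_ID = 2;

    // Mirrors the id_prefix helper of the grammar.
    static boolean isIdPrefix(int c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    // Mirrors the id_char helper of the grammar.
    static boolean isIdChar(int c) {
        return isIdPrefix(c) || (c >= '0' && c <= '9');
    }

    // Call right after a '$' has been consumed. Returns true if a complete
    // "{id.id...}" follows. All characters read are pushed back, so the
    // reader is left positioned just after the '$'.
    static boolean macroFollows(PushbackReader in) throws IOException {
        StringBuilder seen = new StringBuilder();
        boolean matched = false;
        int state = EXPECT_BRACE;
        while (seen.length() < MAX_LOOKAHEAD) {
            int c = in.read();
            if (c == -1) {
                break; // end of input before '}' was found
            }
            seen.append((char) c);
            if (state == EXPECT_BRACE) {
                if (c != '{') break;
                state = EXPECT_ID_START;
            } else if (state == EXPECT_ID_START) {
                if (!isIdPrefix(c)) break;
                state = IN_ID;
            } else { // IN_ID
                if (c == '}') {
                    matched = true;
                    break;
                }
                if (c == '.') {
                    state = EXPECT_ID_START;
                } else if (!isIdChar(c)) {
                    break;
                }
            }
        }
        // Unread in reverse order so the next read() sees the original text.
        for (int i = seen.length() - 1; i >= 0; i--) {
            in.unread(seen.charAt(i));
        }
        return matched;
    }

    // Convenience for experiments: 'rest' is the text following a '$'.
    // Returns "<matched>:<everything still readable after the check>".
    static String probe(String rest) {
        try (PushbackReader in =
                 new PushbackReader(new StringReader(rest), MAX_LOOKAHEAD)) {
            boolean macro = macroFollows(in);
            StringBuilder remaining = new StringBuilder();
            for (int c = in.read(); c != -1; c = in.read()) {
                remaining.append((char) c);
            }
            return macro + ":" + remaining;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(probe("{user.name} and more"));
        System.out.println(probe("$ I am writing $$ signs..."));
    }
}
```

The lexer would switch to the "var" state only when macroFollows
returns true; otherwise the '$' would be treated as ordinary text.  The
push-back buffer must be at least as large as the longest lookahead,
which is why the reader is created with MAX_LOOKAHEAD as its size.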

Please try the grammar, and tell us if it works fine.  I have not tested
it!

Etienne
-- 
----------------------------------------------------------------------
Etienne M. Gagnon, M.Sc.                     e-mail: egagnon@j-meg.com
Author of SableCC:                             http://www.sablecc.org/
and SableVM:                                   http://www.sablevm.org/

Helpers
  ascii_character     = [0..0xff];
  ascii_small         = ['a'..'z'];
  ascii_caps          = ['A'..'Z'];
  unicode_character   = [0..0xffff];

  digit               = ['0'..'9'];
  id_prefix           = ascii_small | ascii_caps | '_';
  id_char             = id_prefix | digit;
  id_component        = id_prefix id_char*;

  lf  = 0x0a;
  cr  = 0x0d;

  line_terminator     = lf | cr | cr lf;
  input_character     = [[ascii_character - [cr + lf]] - '$'];

States
  normal,
  var;

Tokens
  {normal->var} var_prefix = '$';

/************
  If lookahead were implemented, it would look like:

  {normal->var} var_prefix = '$' / '{' id_component ('.' id_component)* '}';

  For now, if you want similar functionality, you would write:

  {normal->var} var_prefix = '$' '{' id_component ('.' id_component)* '}';

  This would match the whole macro, but you would use a
  custom lexer to push back all the characters but the
  leading '$' into the input PushbackReader.

  Why all this?  Because you probably don't want to switch to the "var"
  state if you see something like "$$ I am writing $$ signs...".
************/

  {normal} string_literal = input_character+;

  {var} start_id = '{';
  {var} identifier = id_component ('.' id_component)*;
  {var->normal} end_id = '}';

Productions
  file = fragment*;

  fragment =
    {text} string_literal |
    {macro} macro;

  macro = 
    var_prefix start_id identifier end_id;