
Re: SableCC Thoughts



Dan Sandberg wrote:
> 3.  How can we allow modification to the parser stack (and lexer state)
> without betraying so much information about the internals that code
> would have to be changed if the parser was LL instead of LALR?

The consequence of providing a very low-level error-recovery mechanism
is that compiler programmers are back to hacking on the parse stack
and digging into the parse tables.

I dislike this idea.  It has many drawbacks, ranging from making it
difficult to improve SableCC's parsing engine without breaking backward
compatibility, to losing SableCC's nice "high-level" approach to
compiler specification, with its clean separation of the parse engine
(machine generated, no programmer involvement) and the AST (that's where
the compiler programmer works, in a clean environment).

> BTW, it seems like Sable should not be using a PushBackReader, because
> you have to specify in the constructor what the maximum number of
> characters that can be pushed back is.  It's really easy to make our own
> PushBackReader without this constraint.  It can just use a StringBuffer
> instead of the PushBackReader's char[].

Might be a good idea.  Have you tested the relative speed of the two
approaches?
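
For reference, here is a minimal sketch of what such an unbounded
pushback reader could look like.  The class name UnboundedPushbackReader
and its unread() methods are hypothetical and not part of SableCC or the
JDK.  It uses a StringBuilder (the unsynchronized sibling of the
StringBuffer suggested above) as a stack of pushed-back characters, and
it has not been benchmarked against java.io.PushbackReader.

```java
import java.io.IOException;
import java.io.Reader;

/** Hypothetical reader with no fixed pushback limit (illustration only). */
class UnboundedPushbackReader extends Reader {
    private final Reader in;
    // Pushed-back characters, used as a stack: the next char to be
    // returned is always the LAST one in the builder.
    private final StringBuilder pushed = new StringBuilder();

    UnboundedPushbackReader(Reader in) {
        this.in = in;
    }

    @Override
    public int read() throws IOException {
        int n = pushed.length();
        if (n > 0) {
            char c = pushed.charAt(n - 1);
            pushed.setLength(n - 1);   // pop
            return c;
        }
        return in.read();              // nothing pushed back: delegate
    }

    @Override
    public int read(char[] buf, int off, int len) throws IOException {
        if (len == 0) {
            return 0;
        }
        int count = 0;
        // Serve pushed-back characters first.
        while (count < len && pushed.length() > 0) {
            buf[off + count++] = (char) read();
        }
        // A short read is allowed by the Reader contract.
        return count > 0 ? count : in.read(buf, off, len);
    }

    /** Pushes back one character; the buffer grows as needed. */
    public void unread(int c) {
        pushed.append((char) c);
    }

    /** Pushes back a whole string so that it is re-read in order. */
    public void unread(String s) {
        for (int i = s.length() - 1; i >= 0; i--) {
            pushed.append(s.charAt(i));
        }
    }

    @Override
    public void close() throws IOException {
        in.close();
    }
}
```

A lexer built on this could call unread(text) with text of any length
after a failed match, without choosing a maximum in the constructor.
Whether the growable buffer costs anything measurable compared with the
fixed char[] is exactly what a speed test would have to show.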

> There would still be only one type of token.  It's just that floating
> tokens would be auto-declared.   Maybe I'm the only one who is bothered
> by this, but I am parsing a configuration format that has dozens and
> dozens of keywords, so it's getting on my nerves.  Actually, when I used
> JTB / JavaCC on a previous project, I also found the lack of this
> feature annoying.  I've spent a lot of time doing database stuff, so I
> guess I am trying to 'normalize' the grammar file :)  Anyone else have
> an opinion on this?  This feature could be off by default, and turned on
> by placing a special floating token marker in the Tokens section, like
> this:
> 
> Tokens
>   eol = cr | lf;
>   {bol} hello = 'hello';
>   {bol->normal, fubar} AUTO_DECLARE_TOKENS;
>   number = digit+;
>    ...

Hmmm... Maybe...  Let's say: how about we solve the other problems first,
and after that, if it still itches, we can revisit it? ;-)))

Etienne
-- 
----------------------------------------------------------------------
Etienne M. Gagnon, M.Sc.                     e-mail: egagnon@j-meg.com
Author of SableCC:                             http://www.sablecc.org/
and SableVM:                                   http://www.sablevm.org/