
Re:



Hi Chris,

>I've been playing with SableCC and I'm very impressed.  It seems to be
>an excellent framework for building AST-deriving parsers, and
>associated tools.  My only caveats concern performance of the parser
>construction stage. Especially irritating is the length of time one
>has to wait before grammatical errors are reported.  But aside from
>that it's an excellent tool.


The lexer and parser construction stages are being completely rewritten to
speed up lexer/parser table construction (see below).

>One mechanism I particularly like is the macro mechanism for generating
>the basic classes.  Indeed I've already extended this to add a new
>analysis class appropriate to the work I'm doing.  However, the
>extension process is not simple, requiring the modification of the
>*.txt and Gen* files directly.


Normally, you shouldn't modify the AST structure. The data you want to
associate with a node should reside in a hashtable in the analysis class
(e.g. one that extends DepthFirstAdapter) that uses this data.

This being said, I agree that a more generic method of generating classes
could be used. The question is whether this would be really useful for
SableCC users. It might encourage users to simply modify the AST instead of
doing as suggested previously, which would be the wrong way to use the
framework.

Why keep the data in a hashtable? For maintenance, memory usage, and
performance reasons. If you use a hashtable, and at some point you want to
get rid of the data, you simply release the reference to the hashtable, and
the Java garbage collector will do the rest. If, on the other hand, you have
custom nodes with fields for this data, you will have to visit all nodes to
free the data. Furthermore, each node will waste at least 4 bytes on a
reference that is now null (previously a reference to the data). In a big
AST, this can add up to a lot of memory.
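
Here is a minimal sketch of the hashtable approach described above. The
Node and DepthAnalysis classes are hypothetical stand-ins written for this
example only; they are not SableCC's generated classes. In real use, the
analysis class would extend DepthFirstAdapter and the nodes would come from
your grammar.

```java
import java.util.Hashtable;
import java.util.Map;

// Hypothetical stand-in for a generated AST node. Note that it has no
// field for analysis data.
class Node {
    final String text;
    final Node[] children;

    Node(String text, Node... children) {
        this.text = text;
        this.children = children;
    }
}

// Hypothetical analysis: records the depth of every node in a side table
// owned by the analysis, instead of adding a "depth" field to Node.
class DepthAnalysis {
    // Node does not override equals/hashCode, so keys compare by identity.
    private Map<Node, Integer> depths = new Hashtable<>();

    void visit(Node node, int depth) {
        depths.put(node, depth);
        for (Node child : node.children) {
            visit(child, depth + 1);
        }
    }

    // Returns null if the node was never visited or the data was released.
    Integer depthOf(Node node) {
        return depths == null ? null : depths.get(node);
    }

    // Dropping this single reference makes all of the analysis data
    // eligible for garbage collection; no tree walk is needed.
    void release() {
        depths = null;
    }
}

class HashtableAnalysisDemo {
    public static void main(String[] args) {
        Node leaf = new Node("x");
        Node root = new Node("+", leaf, new Node("y"));

        DepthAnalysis analysis = new DepthAnalysis();
        analysis.visit(root, 0);
        System.out.println(analysis.depthOf(leaf));

        analysis.release();
        System.out.println(analysis.depthOf(leaf));
    }
}
```

The tree itself is untouched by the analysis, so several analyses can each
keep their own table over the same nodes and discard it independently.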

>I've implemented an alternative mechanism for overriding the macro files
>and defining the output set.  I would like to feed this back into SableCC,
>and so would like to ask if there is an accepted system for proposing
>and submitting changes to SableCC.


For now, the only existing process is using the mailing list to contribute
ideas, not code. Accepting code contributions is more of a problem because
of all the legal hassle. For instance, if you are working for some company,
we must get a written disclaimer from the company stating that it disclaims
all copyright interest in the contributed code, blah, blah, blah. The main
problem is that we (i.e., the Sable Research Group and I) have neither the
time nor available lawyers to take care of these minor but important
details.

>One other thing I would like to ask about is regeneration of the parser.
>I note that section 7.1 of Etienne's thesis discusses building SableCC 2,
>using SableCC 1.  Is the grammar for this available and are there any plans
>to upgrade it to version 2?


There is little chance you will ever get a look at version 1 ;-) On the
other hand, I am rewriting SableCC using version 2.

I do not yet want to disclose too much about my current development and set
expectations too high. Here are the general directions. I'm attacking on
three fronts:

1- Lexer/Parser generation time (and a few implementation details).
2- Grammar expressiveness. [go beyond LALR(1)].
3- Automatic simplified AST construction. [e.g. CST -> AST at parse time].

Definition
  CST = Concrete Syntax Tree. (SableCC 2 ASTs match their CSTs.)

This is a lot of work, so don't ask me for a time frame;-)

Etienne




>
>Chris Watson
>
>----------------------------------------------------------------------
>Dr. Christopher Watson              Tel: +44 (0)1223 460408
>Quintic Ltd,                        Fax: +44 (0)1223 460409
>Cambridge,                          E-mail: chris@quintic.co.uk
>United Kingdom.                     Website: http://www.quintic.co.uk
>