
Re: Performance of SableCC v ANTLR



erik poupaert wrote:
* apparently, I cannot use regular expressions to define lexer
token types. I'm currently doing stuff like: identifier= ["_" + [ '.' + [
... and so on. I can use ranges, but no regexes proper?

I don't understand what you are saying. Tokens can be described using regular expressions.

* The generation of parser tables takes quite a bit of time ...

Yes.


* The litmus test of the tool should be the fact that it generates its
own grammar... Version 2.16.2 doesn't seem to do that. My generators
generate (part of) their own sources. I feel this is important to
establish confidence in the tool (eat your own ...)

The latest stable version, 2.18.0, is generated using itself. The problem in 2.16.x was not related to parse tables; it was that the internal visitors were based on the 1.x generated framework (which was not as nice).

Kevin Agbakpem and I did the work to make 2.18.0 self-generating.

My impression is that SableCC has never been truly optimized.

This is true. The motto is: first make it work, then optimize.


Now, the new CST->AST stuff should help a lot.  Some profiling done
a while ago by a Sable student showed that most of the time was spent
creating the huge CST (AST).  With the new parse-time CST->AST
code, the created AST should be much smaller and should thus significantly
reduce the pressure on the garbage collector.
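
To give a rough idea of why a smaller tree means fewer allocations, here is a minimal Java sketch. It is not SableCC code: the Node class, the simplify method and the tiny expression grammar in the comments are all hypothetical, made up for illustration. It also collapses the tree after the fact, whereas the parse-time CST->AST code would avoid building the big tree in the first place. The sketch only compares node counts for the input "1 + 2".

```java
import java.util.ArrayList;
import java.util.List;

public class CstToAstSketch {

    /** Generic tree node (hypothetical); a generated framework would use one class per alternative. */
    static final class Node {
        final String label;
        final List<Node> children = new ArrayList<>();

        Node(String label, Node... kids) {
            this.label = label;
            for (Node k : kids) children.add(k);
        }
    }

    /** Counts all nodes in the tree rooted at n. */
    static int count(Node n) {
        int total = 1;
        for (Node c : n.children) total += count(c);
        return total;
    }

    /** Builds a smaller copy: any node with exactly one child is replaced by that child. */
    static Node simplify(Node n) {
        if (n.children.size() == 1) return simplify(n.children.get(0));
        Node copy = new Node(n.label);
        for (Node c : n.children) copy.children.add(simplify(c));
        return copy;
    }

    /**
     * CST for "1 + 2" under an assumed grammar:
     *   expr = expr plus term | term;  term = factor;  factor = number;
     */
    static Node sampleCst() {
        Node left = new Node("expr", new Node("term", new Node("factor", new Node("number:1"))));
        Node right = new Node("term", new Node("factor", new Node("number:2")));
        return new Node("expr", left, new Node("plus"), right);
    }

    public static void main(String[] args) {
        Node cst = sampleCst();
        Node ast = simplify(cst);
        System.out.println("CST nodes: " + count(cst));
        System.out.println("AST nodes: " + count(ast));
    }
}
```

For this input the chains expr -> term -> factor -> number disappear, so the simplified tree has fewer than half the nodes of the original. A hand-written AST could go further and drop the plus token as well.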

I'm a bit worried that the maintainers are looking at adding features
(scripting languages and so on) that can perfectly well be coded outside the tool,
while not paying attention to down-to-earth issues like tool performance
and performance of the sources generated.

Many planned features will help performance. Also, do not forget the above motto...



My observations so far are that the ANTLR front-end runs more than five times faster than the SableCC equivalent. SableCC produces 445 AST node types
for the grammar; the ANTLR solution just 77.

CST->AST code should help here.


Etienne

--
Etienne M. Gagnon, Ph.D.             http://www.info.uqam.ca/~egagnon/
SableVM:                                       http://www.sablevm.org/
SableCC:                                       http://www.sablecc.org/