
Re: Performance of SableCC v ANTLR



> Compared to JavaCC, Sable is like a breath of fresh air. Easy source syntax,
> visitor pattern design, strong typing on the AST, good error reporting.
> Within a few weeks I had my front-end running and was starting to code a
> back-end.

I am using SableCC to compile model definition files into Java sources.
I profoundly appreciate the design and usage of SableCC. I've got the
following concerns though:

* Apparently, I cannot use regular expressions to define lexer
token types. I'm currently doing stuff like: identifier= ["_" + [ '.' + [
... and so on. I can use ranges, but no proper regexes?
* The generation of parser tables takes quite a bit of time ...
* The litmus test of the tool should be that it can generate itself
from its own grammar... Version 2.16.2 doesn't seem to do that. My
generators generate (part of) their own sources. I feel this is
important to establish confidence in the tool (eat your own ...)
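
To make the first point concrete, here is a rough sketch of the kind of
one-line regex definition I mean, written with java.util.regex rather
than in grammar syntax. The identifier rule itself (a letter or
underscore first, then letters, digits, underscores or dots) and the
class and method names are assumptions for illustration only; they are
not the actual token definition quoted above.

```java
import java.util.regex.Pattern;

public class IdentifierRegex {
    // Hypothetical identifier rule, assumed for illustration:
    // first character is a letter or '_', the rest may also be digits or '.'.
    static final Pattern IDENTIFIER =
        Pattern.compile("[A-Za-z_][A-Za-z0-9_.]*");

    // Returns true only if the whole string matches the identifier rule.
    static boolean isIdentifier(String s) {
        return IDENTIFIER.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isIdentifier("model.name_1"));
        System.out.println(isIdentifier("1abc"));
    }
}
```

The character classes in the pattern play the same role as the set
unions and ranges in the token definition; the regex form is just more
compact.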

> It was only when I started to compile larger programs (>250 lines) that I
> noticed that the compilation was taking several seconds on a 2.1 GHz PC. This
> worried me greatly.

My impression is that SableCC has never been truly optimized.

I'm a bit worried that the maintainers are looking at adding features
(scripting languages and so on) that can perfectly well be coded outside
the tool, while not paying attention to down-to-earth issues like the
performance of the tool and of the sources it generates.

> My observations so far are that the ANTLR front-end runs more than five times
> faster than the SableCC equivalent. SableCC produces 445 AST node types
> for the grammar, the ANTLR solution just 77.

I wouldn't be surprised. ANTLR is ugly, but makes up for it with
performance. There is no reason why SableCC couldn't be optimized.

> Can I ask: is SableCC intrinsically slower than ANTLR?
> Has anyone else had similar experiences?


> Does anyone know of any SableCC based "full" compiler solutions, not a
> language sub-set, that I can look at for comparison? Preferably for one of
> the standard languages.

It is true that there is a lack of well-designed Java-oriented tools
for generating ELF or COFF/PE. You can work around this problem,
however, by targeting slightly higher-level formats: generate C for a
semi-multiplatform C compiler such as TinyCC, or generate assembler
for nasm or gas -- if you feel like scrutinizing the various assembler
syntaxes and models for different platforms.
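
The C route can be sketched roughly as follows: the back-end walks the
tree and writes C text, and a C compiler takes it from there. This is
only a minimal sketch. The node types (Num, Add) and the CEmitter class
are made up for illustration and are not classes generated by SableCC.

```java
public class CEmitter {
    // Hypothetical minimal AST: integer literals and binary additions.
    interface Node {
        String toC();
    }

    static final class Num implements Node {
        final int value;
        Num(int value) { this.value = value; }
        public String toC() { return Integer.toString(value); }
    }

    static final class Add implements Node {
        final Node left;
        final Node right;
        Add(Node left, Node right) { this.left = left; this.right = right; }
        // Parenthesize so the C expression keeps the tree's grouping.
        public String toC() { return "(" + left.toC() + " + " + right.toC() + ")"; }
    }

    // Wraps the expression in a small C program that prints its value.
    static String emitProgram(Node expr) {
        return "#include <stdio.h>\n"
             + "int main(void) {\n"
             + "    printf(\"%d\\n\", " + expr.toC() + ");\n"
             + "    return 0;\n"
             + "}\n";
    }

    public static void main(String[] args) {
        Node expr = new Add(new Num(1), new Add(new Num(2), new Num(3)));
        System.out.print(emitProgram(expr));
    }
}
```

The emitted text can be saved to a file and handed to whatever C
compiler is available on the target platform, which is what keeps the
back-end out of the business of writing object file formats.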

My target is gcj.