Re: Non-textual semantic values for Tokens?

Othman Alaoui wrote:
> I've looked in particular at the Basic interpreter example provided and have
> noticed that it extracts those values only in later passes for the purpose
> of interpretation. For now, I am planning on doing something similar: not
> extract those semantic values until I need them, which is at code generation
> in my case. I plan on using a hashTable indexed by AST node (like the in and
> out provided by SableCC's analyses) to store them if I need them more than
> once (which I probably won't...). Is there a better way to handle this?

This IS the way to do it.  Now, you might want to save some space by reducing
the AST size; see recent posts in the mailing-list archive for a full
discussion of the subject.  So, ideally, you extract the semantic info using
"Integer.parseInt()" instead of "atoi()", and you store the result in a
hashtable indexed by AST nodes.
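
As a rough illustration of that approach, here is a minimal sketch.  It
assumes a hypothetical "NumberToken" class standing in for a generated token
node (it is not SableCC's real generated code), and "SemanticValues", "record"
and "valueOf" are names made up for the example.

```java
import java.util.Hashtable;

// Sketch only: a side table mapping AST nodes to their decoded values.
final class SemanticValues {
    // Hypothetical stand-in for a generated token class (an assumption,
    // not SableCC's actual code). It only carries the matched text.
    static final class NumberToken {
        private final String text;
        NumberToken(String text) { this.text = text; }
        String getText() { return text; }
    }

    // Keys are the nodes themselves. NumberToken does not override
    // equals/hashCode, so lookups are by object identity.
    private final Hashtable<Object, Integer> values = new Hashtable<>();

    // Decode the token text once and remember the result outside the node.
    void record(NumberToken token) {
        values.put(token, Integer.parseInt(token.getText()));
    }

    // Returns the stored value, or null if nothing was recorded for the node.
    Integer valueOf(Object node) {
        return values.get(node);
    }

    public static void main(String[] args) {
        NumberToken token = new NumberToken("42");
        SemanticValues table = new SemanticValues();
        table.record(token);
        System.out.println(table.valueOf(token)); // prints 42
    }
}
```

The node class gains no extra field; everything the later pass needs lives in
the table.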

Why not in the AST itself?  I have already answered this many times (in the
mailing-list archive, and in my thesis, if I remember correctly).  Because of
code maintenance.  The idea is, when you want to get rid of the semantic
information, you simply drop the reference to the hashtable.  Java's garbage
collector will take care of reclaiming the memory for you.
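
To make the "drop the reference" point concrete, here is a small sketch under
assumed names ("CodeGenPhase", "Node", "generate" and the "push" output
format are all hypothetical, not part of SableCC).  The table is a local
variable of the phase that needs it, so it becomes unreachable as soon as the
phase returns.

```java
import java.util.Hashtable;

// Sketch only: a phase that owns its semantic values for as long as it runs.
final class CodeGenPhase {
    // Hypothetical stand-in for an AST node holding literal text.
    static final class Node {
        final String text;
        Node(String text) { this.text = text; }
    }

    // Builds the table, uses it, and returns. Once this method returns,
    // nothing refers to the table any more, so the garbage collector is
    // free to reclaim it. The nodes themselves are left unchanged.
    static String generate(Node[] literals) {
        Hashtable<Node, Integer> values = new Hashtable<>();
        for (Node n : literals) {
            values.put(n, Integer.parseInt(n.text));
        }
        StringBuilder out = new StringBuilder();
        for (Node n : literals) {
            out.append("push ").append(values.get(n)).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        Node[] literals = { new Node("7"), new Node("35") };
        System.out.print(generate(literals));
    }
}
```

If the values were fields of "Node" instead, they would stay alive as long as
the AST does, and removing them would mean editing the node classes.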

> My project supervisor finds it hard to believe that, in SableCC, you can't
> do this sort of "semantic value extraction" while keeping the information
> extracted in the AST itself. I may be missing something then...

No, you are not missing anything.  You shouldn't store any information in the
AST itself.  This is enforced in SableCC: all node types are explicitly
declared "final".  This is by design.

This design is the result of lessons learned by the (now defunct) ACAPS Group
of McGill University, building the McCAT optimizing C compiler.  The final
version of the compiler was a very powerful research tool (with more than 10
years of development and many generations of M.Sc./Ph.D. students), but it
lacked modularity.  Part of the SableCC design deals with the software
engineering issues raised in developing large compiler projects that may span
many generations of programmers.

For a compiler to be modular, it is essential that adding and/or removing a
compiler phase does not require many changes to existing code.  In fact,
with a SableCC compiler framework, dependencies between phases can be minimal,
if not eliminated in some cases.  This cannot be achieved if analysis-specific
information is kept in the AST itself.  Personally, I think that the few
additional CPU cycles it takes to fetch the data from the hashtable instead
of the AST are really worth it in the long term, from a software engineering
point of view.

Etienne M. Gagnon, M.Sc.                     e-mail: egagnon@j-meg.com
Author of SableCC:                 http://www.sable.mcgill.ca/sablecc/