Implementing identification on SableCC ASTs


I'm working on writing a compiler (for a student project), but have run
into some difficulties regarding the implementation of identification. I
hope somebody on this list has a minute or two to spare to give us a
hint! :-)

The situation is that we have implemented a parser for the language in
SableCC. The language offers variables that are not declared in advance
of their use. We want to implement identification so that we find the
"first occurrence" of a variable and use that as a kind of declaration,
and then find each applied occurrence of the identifier and link it to
the "first use" of that variable.
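
To make the scheme concrete, here is a minimal sketch of it in Java. It
assumes nested block scopes and uses a hypothetical IdentNode class as a
stand-in for the real identifier nodes; none of these names come from
SableCC, and the real visitor plumbing is left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an AST identifier node; not a SableCC class.
final class IdentNode {
    final String name;
    IdentNode(String name) { this.name = name; }
}

// Scoped symbol table in which the first occurrence of a name that is
// visible from the current block acts as its declaration.
final class ScopedSymbolTable {
    private final Deque<Map<String, IdentNode>> scopes = new ArrayDeque<>();

    ScopedSymbolTable() { enterBlock(); }   // outermost scope

    void enterBlock() { scopes.push(new HashMap<>()); }
    void leaveBlock() { scopes.pop(); }

    // Returns the declaring node for this occurrence: an earlier node if
    // the name is already visible, otherwise the occurrence itself, which
    // is then recorded in the innermost scope.
    IdentNode resolve(IdentNode occurrence) {
        for (Map<String, IdentNode> scope : scopes) {   // innermost first
            IdentNode first = scope.get(occurrence.name);
            if (first != null) {
                return first;
            }
        }
        scopes.peek().put(occurrence.name, occurrence);
        return occurrence;
    }
}
```

If resolve() returns the node it was given, that node is a first
occurrence; otherwise it is an applied occurrence and the return value
is the node to link it to.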

Finding the first occurrence is simple. We then put this into a standard
SymbolTable (the language has nested block structures). Now when we find
an applied occurrence it is easy to find the first occurrence from the

However, the problem I'm facing is how exactly to store this information
for later retrieval?

To separate concerns, we want identification to happen separately from
the code generation. Therefore we cannot simply "mix" identification
and code generation and simply use the SymbolTable to look up

My next thought was to modify the AST, but all nodes in the tree are
declared final (and that has a purpose, as far as I can see in the
documentation). Therefore I cannot simply extend one of the nodes.

I could, however, make a new class derived from Node that contains the
wanted attributes, and then cut-n-paste the code for the methods from the
original node. The identification process could then simply use
replaceBy() to replace each "applied occurrence" node with our own
applied-occurrence node containing a link to the "first occurrence".
However, this is not really elegant at all.

I note that the AnalysisAdapter offers setIn/getIn and setOut/getOut
methods for storing information about each node. This seems ideal.
However, this information is tied to a specific AnalysisAdapter - so the
information from Identification wouldn't be immediately available in the
Codegenerator (both derived from DepthFirstAdapter).

We could pass the Identification object to the Codegenerator, and then
let the Codegenerator use the getIn/getOut methods on that object to
retrieve information about the nodes in the AST from the Identification
object. However that seems to be a bit cumbersome or at least confusing,
and it would be possible to have several "descriptions" or "added
attributes" for each node.
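
A variation on this idea, sketched below in Java, is to keep the links in
a small table object that belongs to neither pass and hand the same
instance to both. This is only an illustration under assumptions: AstNode,
DeclarationTable, IdentificationPass and CodeGenPass are hypothetical
names, the two passes are plain classes here rather than subclasses of
DepthFirstAdapter, and the visit/emit methods stand in for whatever case
methods the real grammar produces.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a generated AST node class; not SableCC's Node.
final class AstNode {
    final String text;
    AstNode(String text) { this.text = text; }
}

// Side table owned by neither pass: applied occurrence -> first occurrence.
final class DeclarationTable {
    // Keyed on object identity, so two distinct nodes with the same text
    // never collide and the node classes need no equals/hashCode.
    private final Map<AstNode, AstNode> links = new IdentityHashMap<>();

    void link(AstNode applied, AstNode first) { links.put(applied, first); }

    // Returns null if the node was never linked.
    AstNode declarationOf(AstNode applied) { return links.get(applied); }
}

// First pass: records the links while walking the tree.
final class IdentificationPass {
    private final DeclarationTable table;
    IdentificationPass(DeclarationTable table) { this.table = table; }

    void visitApplied(AstNode applied, AstNode first) {
        table.link(applied, first);
    }
}

// Second pass: only reads the table and never sees the first pass itself.
final class CodeGenPass {
    private final DeclarationTable table;
    CodeGenPass(DeclarationTable table) { this.table = table; }

    String emitLoad(AstNode applied) {
        AstNode decl = table.declarationOf(applied);
        if (decl == null) {
            throw new IllegalStateException("unresolved identifier: " + applied.text);
        }
        return "load " + decl.text;
    }
}
```

With this arrangement the AST stays untouched, the code generator depends
only on the table rather than on the Identification class, and further
tables can be added later for other attributes.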

So I guess what I'm asking is, what is the common way to do this in a
clean OO way? - Have I overlooked something important?

Is there a way to "transfer" the HashMap from one AnalysisAdapter to
another? (without having to edit the code generated by SableCC, of course)

Thanks in advance for any advice you can give!

Jens Kristian Søgaard, jks@cs.auc.dk