Re: Non-textual semantic values for Tokens?

Hi Laurie,

I have to be careful here... I'm talking to my supervisor;-)

"Prof. Laurie HENDREN" wrote:
> Yes, Etienne, I agree that for storing data flow information and other
> semantic information like types,  the hashtable idea is not bad.   However,
> I think that for the value of tokens,  there is a valid reason to want to
> actually store a value that matches the kind of token.   For example, if
> you are parsing an integer token,  doesn't it make sense for the token's
> value to be integer,  and the value in the parse tree to be integer?

The answer is yes and no.  Given a grammar, SableCC generates a single
framework that can be used for multiple ends.  One possible end is writing a
compiler.  In this context, it would make sense to reserve some place in
numerical tokens for the semantic (or primitive type) value.  But another end
is writing a pretty-printer.  In that case, there is no point in reserving
additional space in each numerical token to store information that we won't
even compute.

Another important point is that the AST nodes are generated by SableCC.  Their
(source) code might change from one SableCC version to the next.  If you were
to add methods/fields, you could be creating forward compatibility problems
with future SableCC versions.  You do not want people to hand-code anything in
these classes.

The danger of providing a means in SableCC to directly store data in the AST
is that beginners (and others) will be tempted to store everything in the
AST.  Unfortunately, this is the intuitive way to store semantic information.

It is fairly simple to write wrapper methods that hide the separation of
semantic data from the AST.  For example:

// instead of writing
... = node.getValue();

// you write
... = getValue(node);

where you define a local method:

private int getValue(Node node) {
  // look up the value stored for this node in the hashtable
  return ((Integer) hash.get(node)).intValue();
}
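
To make the idea concrete, here is a minimal, self-contained sketch.  It is
only an illustration under stated assumptions: TNumber is a hypothetical
stand-in for a generated token class (not actual SableCC output), and
ValueTable and SideTableDemo are names invented for this example.  It uses an
identity-based map so that values are keyed by the node object itself.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a generated token class. It only carries text;
// nothing is added to it by hand.
class TNumber {
    private final String text;

    TNumber(String text) { this.text = text; }

    String getText() { return text; }
}

// Side table that keeps integer values outside the AST (illustrative name).
class ValueTable {
    // Keyed by object identity, so two distinct tokens with the same text
    // get separate entries.
    private final Map<TNumber, Integer> values = new IdentityHashMap<>();

    void setValue(TNumber node, int value) { values.put(node, value); }

    boolean hasValue(TNumber node) { return values.containsKey(node); }

    int getValue(TNumber node) {
        Integer v = values.get(node);
        if (v == null) {
            throw new IllegalStateException("no value recorded for this node");
        }
        return v;
    }
}

class SideTableDemo {
    // Compiler-style pass: computes the integer value and records it
    // in the side table, not in the token.
    static void computeValue(TNumber token, ValueTable table) {
        table.setValue(token, Integer.parseInt(token.getText()));
    }

    // Pretty-printer-style pass: needs only the text, so it never
    // touches (or pays for) the integer value.
    static String prettyPrint(TNumber token) {
        return token.getText();
    }

    public static void main(String[] args) {
        TNumber token = new TNumber("42");
        ValueTable table = new ValueTable();

        computeValue(token, table);
        System.out.println(table.getValue(token) + 1);
        System.out.println(prettyPrint(token));
    }
}
```

The same token class serves both passes: the compiler-style pass pays for
the integer value, the pretty-printer does not, and the token class itself
stays untouched.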

So, given this, I hardly see a real need for AST-stored "mild" semantic
values.  The cons of doing so seem to outweigh the pros.

But you are right, one would expect to be able to store obviously related
semantic values in the AST itself.  I just see no simple and elegant solution
that won't become a gun for beginners (and others) to shoot themselves in the
foot in the longer term.

Etienne M. Gagnon, M.Sc.                     e-mail: egagnon@j-meg.com
Author of SableCC:                 http://www.sable.mcgill.ca/sablecc/