[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
Questions on grammar and inheritance
Hi,
I am using SableCC to learn (at home - this email has nothing
to do with my employers) more about computing (as a software
engineer with no formal training I have an inferiority complex
driven desire to learn :-). So please forgive the simple level
of these questions...
First, in writing an interpreter (for the anonymous language
in Dijkstra's "A Discipline of Programming"), I have a grammar
where the following structure is common:
expression_list =
{head} expression |
{tail} expression comma expression_list ;
i.e. I have syntactic lists which have internal separators,
but no terminal symbol. Another example, as code, could be:
begin
statement1 ;
statement2
end
(note statement2 has no terminating ";").
The grammar pattern above is fine if I am traversing the tree
"in the normal way", but sometimes I need to run through the
list of nodes in the list directly. This produces ugly code
like:
PExpressionList list = node.getExpressionList( ) ;
while ( ! ( list instanceof AHeadExpressionList ) ) {
name = ((ATailExpressionList)list).getExpression( ).getText( ) ;
list = ((ATailExpressionList)list).getExpressionList( ) ;
{ do something with name }
}
name = ((AHeadExpressionList)list).getExpression( ).getText( ) ;
{ do the same something with name }
because both head and tail contain an identifier. This is
clearly poor code. But how do I remove that final ugly
assignment to name? Using "+" in the grammar would produces a
nice linked list, but seems to force a final terminator.
Second (I can see this is a long email - but I'm in no hurry
so if anyone would like to reply when they have more time I
would still be grateful) it would be elegant to store
intemediate results, which are associated with particular nodes,
in the nodes themselves.
Compare this approach with that used in the minibasic example
where a separate hash array stores intermediate results. In
minibasic results are stored by calling setIntValue rather
that something like node.setTempValue( ... ). This makes it
difficult to split the job of interpretation between two separate
tree walkers, for example.
However, without multiple inheritance, it's hard to see how
to do this without altering sablecc-generated code. Any ideas?
Third, Dijkstra, in his wisdom, decided that the following would
be acceptable:
x, y, z = 1, 2, 3 ;
x, y = y, x ;
Where print x, y, z would, after executing both lines, give 2, 1, 3.
As he points out in his book, you can easily generate a parser
that checks the two lists balance (see chunk below), but it
associates, in this example, x with 3, y with 2 and z with 1.
In other words, it gets the order reversed.
My simplest solution is to have the parser accept unequal lists
and then have a run-time error. Alternatively, I could use a
grammar like
assign =
var := arg |
var , assign , arg ;
And chop the tree around before interpretation to fix the problem
described above. But tree surgery is a bit intimidating - so is
there a simpler solution?
OK, that's all. Any comments appreciated, or pointers to good
discussions (I've read the dragon book, but these questions are
more about style and efficiency than "doing it in C").
Finally, thanks for such a nicely designed piece of software.
While looking for info on the internet I came across an old
discussion of SableCC on a usenet group - I was amazed to see
what seemed to be blatant disregard for decent (object) design.
While I don't know much about compiler compilers, I can see that
someone has tried hard to produce a safe and reliable tool.
Excellent!
Cheers,
Andrew
P.S. Wouldn't it be nice to have this list archived somewhere?