
Re: Handling an INCLUDE directive.



Will Hartung wrote:
> I sort of hacked in the support by extending the generated Lexer class. I
> override both getToken and filter (though in reality, I probably don't need
> to override filter).

You might have to override the filter method to catch the end-of-file of
an included file.

> This has a couple of problems as is, though. First, it doesn't capture the
> file information (though it does capture the line information). To capture
> this information I need to change SableCC directly, as the LexerException
> and ParseException don't have slots for something like filename. I was
> debating on the best path for this.

Right, we must also modify the exception classes.

> Another thought is that the system could pass the actual token to the
> exception constructor...

This looks fine.  The token would be made available to the user.

> Add to this an "extra info" slot in Node that is left null unless
> the user sticks something into it. 

This one: I am not convinced.  It's too tempting to use this field to
store all sorts of things, instead of using hashtables to associate data
with tokens.  I have argued against this approach both in my thesis and
on the mailing list.  My primary goal is good software engineering.  
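
To illustrate the hashtable approach, here is a minimal sketch of a side table
that records a file name per token without adding any field to the token
class.  SketchToken and TokenFileTable are hypothetical names invented for
this example (they are not SableCC classes), and the use of an identity-based
map is an assumption, chosen so that two tokens with equal text stay distinct.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical stand-in for a generated token class; not a SableCC class.
class SketchToken {
    final String text;

    SketchToken(String text) {
        this.text = text;
    }
}

// Side table that associates a file name with each token object.
// The token class itself is left untouched.
class TokenFileTable {
    // Keyed on object identity, so equal-looking tokens do not collide.
    private final Map<SketchToken, String> fileOf = new IdentityHashMap<>();

    void setFile(SketchToken token, String file) {
        fileOf.put(token, file);
    }

    // Returns null when nothing was recorded for this token.
    String getFile(SketchToken token) {
        return fileOf.get(token);
    }
}
```

Each kind of extra information gets its own table, so unrelated compiler
phases do not end up sharing a single catch-all slot.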

Now, arguably, the "file" information could be included in all SableCC
tokens, as it would be of interest to most SableCC generated compilers. 
So, I would agree to have an additional "file" field in Token.java.  I
would also provide hooks in the Lexer class to retrieve and set the
lexer internal "file,line,column" information.
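
As a rough sketch of what this could look like, the following shows a token
that carries "file" next to its line and column, and an exception that keeps
the offending token so the user can read the position from it.  FileToken and
TokenException are hypothetical names for illustration only; they are not the
actual SableCC Token, LexerException, or ParserException classes, and the
message format is an assumption.

```java
// Hypothetical token with a "file" field next to line and column.
class FileToken {
    private final String text;
    private final String file;
    private final int line;
    private final int pos;

    FileToken(String text, String file, int line, int pos) {
        this.text = text;
        this.file = file;
        this.line = line;
        this.pos = pos;
    }

    String getText() { return text; }
    String getFile() { return file; }
    int getLine() { return line; }
    int getPos() { return pos; }
}

// Hypothetical exception that keeps the token it was raised for.
class TokenException extends Exception {
    private final FileToken token;

    TokenException(FileToken token, String message) {
        // Prefix the message with file, line, and column taken from the token.
        super("[" + token.getFile() + ":" + token.getLine() + ","
                + token.getPos() + "] " + message);
        this.token = token;
    }

    // Makes the offending token available to the user.
    FileToken getToken() { return token; }
}
```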

> Another problem with my current system is that the lexer must "parse" the
> token stream looking for a properly constructed INCLUDE statement. Hardly
> overwhelming, but it seems kind of nutty to have "a parser in the lexer".

Did you really need to get down to the parser to detect the INCLUDE
construct?  It's OK to cheat a little by recognizing the "INCLUDE" token
in the lexer, then doing some manual processing to get the file name that
follows.  This way, the parser (and grammar) does not need to be aware
of the inclusion.  In C, for example, you can detect "#include" then
look for something of the form "..." or <...>.
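
The idea can be sketched as a wrapper that keeps a stack of lexers: on
"INCLUDE" it reads the file name by hand and pushes a lexer for that file, and
at the end of an included file it pops back to the includer.  Everything here
is a simplified assumption: SimpleLexer is a whitespace-splitting stand-in for
a generated lexer, IncToken and IncludingLexer are made-up names, sources are
held in memory instead of read from disk, and there is no check for files
that include each other in a cycle.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;

// Hypothetical token that remembers the file and line it came from.
class IncToken {
    final String text;
    final String file;
    final int line;

    IncToken(String text, String file, int line) {
        this.text = text;
        this.file = file;
        this.line = line;
    }
}

// Stand-in for a generated lexer: splits one in-memory source on whitespace.
class SimpleLexer {
    private final String file;
    private final String[] lines;
    private int line = 0; // 1-based number of the line last read
    private final Deque<String> pending = new ArrayDeque<>();

    SimpleLexer(String file, String source) {
        this.file = file;
        this.lines = source.split("\n");
    }

    // Returns the next token, or null at end of file.
    IncToken next() {
        while (pending.isEmpty()) {
            if (line >= lines.length) return null;
            String trimmed = lines[line++].trim();
            if (!trimmed.isEmpty()) {
                for (String word : trimmed.split("\\s+")) pending.add(word);
            }
        }
        return new IncToken(pending.remove(), file, line);
    }
}

// Handles INCLUDE "name" in the lexer, so the parser never sees it.
class IncludingLexer {
    private final Map<String, String> sources; // file name -> contents
    private final Deque<SimpleLexer> stack = new ArrayDeque<>();

    IncludingLexer(String mainFile, Map<String, String> sources) {
        this.sources = sources;
        stack.push(open(mainFile));
    }

    private SimpleLexer open(String file) {
        String text = sources.get(file);
        if (text == null) throw new IllegalArgumentException("cannot open " + file);
        return new SimpleLexer(file, text);
    }

    // Returns the next token for the parser, or null when all input is used up.
    IncToken next() {
        while (!stack.isEmpty()) {
            IncToken t = stack.peek().next();
            if (t == null) { // end of this file: resume the including file
                stack.pop();
                continue;
            }
            if (!t.text.equals("INCLUDE")) return t;
            // Manual processing: the next token must be a quoted file name.
            IncToken name = stack.peek().next();
            if (name == null || name.text.length() < 3
                    || !name.text.startsWith("\"") || !name.text.endsWith("\"")) {
                throw new IllegalStateException(
                        t.file + ":" + t.line + ": INCLUDE needs a quoted file name");
            }
            stack.push(open(name.text.substring(1, name.text.length() - 1)));
        }
        return null;
    }
}
```

Because each token is stamped with the file of the lexer that produced it,
the file information survives the switch between the including and the
included file.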

> ... I gravitated towards
> SableCC because I liked the separation of the grammar from the code itself.
> When I had problems with Sable, I started to look at others (Antlr, Cup,
> etc), and just threw my hands up in disgust because (at a glance) there was
> never a boundary between the grammar and the code, making it difficult to
> follow. Now, that model may work well for those who have cut their teeth on
> LEX/YACC, but for a complete novice, it's very confusing.

I'm happy you find SableCC easier to use.  That's the goal :-)

> I'll look over at SourceForge and try and get the latest version to work
> with.

If you register as a SourceForge user, I can add you as a SableCC
developer, so that you can help implement support for file
inclusion.  (Don't forget to send me your username and ID).

Etienne
-- 
----------------------------------------------------------------------
Etienne M. Gagnon, M.Sc.                     e-mail: egagnon@j-meg.com
Author of SableCC:                             http://www.sablecc.org/
and SableVM:                                   http://www.sablevm.org/