Re: ? Ignore tokens blank;
John,
please see my comments below.
>I can't ignore blanks as I'm expecting to. I'd think that these 2
>strings would parse the same with the grammar given below, but
>the first gives: formfile.parser.ParserException: [1,8] TString expected
>
> (WINDOW "System" 699 180 475 457 0 NIL
> (WINDOW"System"699 180 475 457 0 NIL
>
>
>Here's the grammar (or rather, the relevant fragment. The rest is
>from minibasic.txt)
> ...
I suspect that you have a token that is defined to take ->WINDOW"...<- as a
single token. Remember that the longest possible token is the one passed to
the parser.
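To make the longest-match rule concrete, here is a small standalone sketch in plain Java. It does not use SableCC at all, and the token names and patterns (including the over-broad "word" token) are invented for illustration; they are not taken from your grammar. At each position it tries every pattern and keeps the longest match, with ties going to the token declared first.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatchDemo {
    // Hypothetical token definitions, kept in declaration order.
    static final Map<String, Pattern> TOKENS = new LinkedHashMap<>();
    static {
        TOKENS.put("l_par", Pattern.compile("\\("));
        TOKENS.put("window", Pattern.compile("WINDOW"));
        TOKENS.put("string", Pattern.compile("\"[^\"]*\""));
        TOKENS.put("number", Pattern.compile("[0-9]+"));
        TOKENS.put("blank", Pattern.compile(" +"));
        // Over-broad on purpose: accepts anything except blanks and parentheses.
        TOKENS.put("word", Pattern.compile("[^ ()]+"));
    }

    // Splits the input into "name:text" entries using the longest-match rule.
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> e : TOKENS.entrySet()) {
                Matcher m = e.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly longer wins, so on a tie the earlier token is kept.
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = e.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("Unknown token at " + pos);
            }
            out.add(bestName + ":" + input.substring(pos, bestEnd));
            pos = bestEnd;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("(WINDOW \"System\" 699"));
        System.out.println(tokenize("(WINDOW\"System\"699"));
    }
}
```

With the blanks present, the input splits into l_par, window, blank, string, blank, number. Without them, the over-broad "word" pattern matches WINDOW"System"699 in one piece, so the string and the number never reach the parser as separate tokens.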
Here is a customized lexer that will help you debug your lexer.
import wig.node.*;
import wig.lexer.*;
import java.io.*;

class PrintLexer extends Lexer
{
    PrintLexer(PushbackReader reader)
    {
        super(reader);
    }

    // Called for each token the lexer recognizes; prints the token's
    // class, the current lexer state and the matched text.
    protected void filter()
    {
        System.out.println(token.getClass() +
            ", state : " + state.id() +
            ", text : [" + token.getText() + "]");
    }
}
>Where can I learn more about the grammar files?
>
Read the documentation on http://www.sable.mcgill.ca/sablecc/
There are two documents. The "thesis" is a very readable document; it is NOT
written in a cryptic technical language and it should be accessible to all.
Chapters 3, 4, 5 and 6 are the most important ones. There you will find a
complete description.
For a sample grammar, you can look at the ones provided with the examples.
>
>2 other questions:
>1) why does this break w/ "blank undefined"?
>
> Tokens
> new_line = cr | lf | cr lf;
> blank = ' '*;
> whitespace = (blank | new_line)*;
>
Tokens can only reuse "helpers" in their regular expressions. A token
definition cannot reuse another token definition ;-) Here, blank and
new_line are tokens, so whitespace cannot refer to them.
>2) Can I put comment lines in a grammar?
>
Yes. C- and C++-style comments (if I remember correctly).
Etienne