
Re: ? Ignore tokens blank;


Please see my comments below.

>I can't ignore blanks as I'm expecting to.  I'd think that these 2
>strings would parse the same with the grammar given below, but the
>first gives: formfile.parser.ParserException: [1,8] TString expected
> (WINDOW "System" 699 180 475 457 0 NIL
> (WINDOW"System"699 180 475 457 0 NIL
>Here's the grammar (or rather, the relevant fragment. The rest is
> ...

I suspect that you have a token that is defined to take ->WINDOW"...<- as a
single token. Remember that the longest possible token is the one passed to
the parser.
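To make the longest-match rule concrete, here is a minimal sketch in plain Java. It does not use SableCC at all; it imitates the rule with java.util.regex. The token names (l_par, window, string, blank, word) and the overly broad "word" pattern are hypothetical, invented for illustration; they are not taken from your grammar. Ties on length go to the token declared first, which is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LongestMatchDemo {
    // Hypothetical token set, in declaration order. "word" is deliberately too broad.
    static final Map<String, Pattern> TOKENS = new LinkedHashMap<>();
    static {
        TOKENS.put("l_par", Pattern.compile("\\("));
        TOKENS.put("window", Pattern.compile("WINDOW"));
        TOKENS.put("string", Pattern.compile("\"[^\"]*\""));
        TOKENS.put("blank", Pattern.compile(" +"));
        TOKENS.put("word", Pattern.compile("[^ ()]+"));
    }

    // At each position, try every token and keep the longest match.
    // On equal length, the token declared first wins (strict '>' below).
    static List<String> tokenize(String input) {
        List<String> out = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestEnd = pos;
            for (Map.Entry<String, Pattern> e : TOKENS.entrySet()) {
                Matcher m = e.getValue().matcher(input);
                m.region(pos, input.length());
                if (m.lookingAt() && m.end() > bestEnd) {
                    bestName = e.getKey();
                    bestEnd = m.end();
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no token at position " + pos);
            }
            out.add(bestName + "[" + input.substring(pos, bestEnd) + "]");
            pos = bestEnd;
        }
        return out;
    }

    public static void main(String[] args) {
        // With blanks, "WINDOW" is cut off at the space and the keyword wins the tie.
        System.out.println(tokenize("(WINDOW \"System\" 699"));
        // Without blanks, "word" swallows WINDOW"System"699 as one token.
        System.out.println(tokenize("(WINDOW\"System\"699"));
    }
}
```

With this token set the two inputs do not lex the same way: the version without blanks yields a single "word" token after the parenthesis, which is the kind of effect I suspect in your lexer.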

Here is a customized lexer that will help you debug your lexer.

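import wig.node.*;
import wig.lexer.*;
import java.io.*;

// Prints every token the lexer produces, before the parser sees it.
class PrintLexer extends Lexer
{
  PrintLexer(PushbackReader reader)
  {
    super(reader);
  }

  // Called by the generated lexer after each token is recognized.
  protected void filter()
  {
    System.out.println(token.getClass() +
      ", state : " + state.id() +
      ", text : [" + token.getText() + "]");
  }
}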

>Where can I learn more about the grammar files?


Read the documentation at http://www.sable.mcgill.ca/sablecc/
There are two documents. The "thesis" is a very readable document; it is NOT
written in cryptic technical language and it should be accessible to all.
Chapters 3, 4, 5 and 6 are the most important ones. You will find a complete
description of the grammar files there.

For a sample grammar, you can look at the ones provided with the examples.

>2 other questions:
>1) why does this break w/ "blank undefined"?
> Tokens
>   new_line = cr | lf | cr lf;
>   blank = ' '*;
>   whitespace = (blank | new_line)*;

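Tokens can only reuse "helpers" in their regular expressions. A token
definition cannot reuse another token definition;-) Here, whitespace refers
to blank and new_line, which are tokens, so blank is reported as undefined.
Declare the pieces you want to share in the Helpers section and build the
tokens from them.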

>2) Can I put comment lines in a grammar?

Yes. Both C-style (/* ... */) and C++-style (// ...) comments are accepted
(if I remember correctly).