
Re: Parsing problem



Hi Damon,

> I hope someone can give me a hand, since this has me completely stumped.
> I've reduced the grammar file down to its essentials, I think.
> 
> Enclosed is the grammar file and the errors I get on simple inputs.
> 
> ========================================================
> ---------    grammar file   -----------------------
> ========================================================
> 
> Package com.redwood.reportParserFramework;
> 
> Helpers
> 
>     sp  = ' ';
> 
>     letter = ['a'..'z'] | ['A'..'Z'];
>     digit = ['0'..'9'];
>     letter_or_digit = letter | digit;
>     allowed_expresion_char = letter | digit |
>         '(' | ')' |
>         '<' | '>' | '=' | '.' | ':';
> 
> /*******************************************************************
>  * Tokens                                                          *
>  *******************************************************************/
> Tokens
> 
>     lparen = '(';
>     rparen = ')';
> 
>     white_space = sp*;
> 
>     report_keywd = '<report>';
> 
>     identifier = letter letter_or_digit*;
>     text = (allowed_expresion_char)*;


The input "<report>()" fits the text token perfectly, so it is tokenized
as a single text token, not as report_keywd followed by lparen and rparen.
Remember that the tokenizer always tries to build the longest possible
token from the input stream. That is why the parser receives a text token
first, and why it throws the
com.redwood.reportParserFramework.parser.ParserException: [1,1]
"TReportKeywd expected." exception.

The input "()" fits the text token as well. Because "()" as a text token
is longer than "(" as lparen, parsing "<report>  () " fails too: the
parser sees report_keywd followed by text, and the grammar has no
alternative report_def = report_keywd text. Put yourself in the
tokenizer's position (;o) and you will be able to fix the problems with
this simple grammar. For the time being, remove the text token and all
should be fine (it is not used in any production at the moment anyway).
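
To see the longest-match rule in action, here is a rough Java sketch that imitates it with java.util.regex. It is not the lexer SableCC generates; the class name LongestMatchDemo, the tokenize method and its withText flag are made up for illustration. It also assumes that when two tokens match the same length, the one listed first in the grammar wins, which is consistent with "<report>" on its own being accepted.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class: simulates longest-match tokenizing with regexes.
// It only mimics the behaviour; it is not SableCC's generated lexer.
class LongestMatchDemo {
    // Token rules in the same order as the Tokens section of the grammar.
    private static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("lparen", Pattern.compile("\\("));
        RULES.put("rparen", Pattern.compile("\\)"));
        RULES.put("white_space", Pattern.compile(" *"));
        RULES.put("report_keywd", Pattern.compile("<report>"));
        RULES.put("identifier", Pattern.compile("[a-zA-Z][a-zA-Z0-9]*"));
        RULES.put("text", Pattern.compile("[a-zA-Z0-9()<>=.:]*"));
    }

    // Returns the token names for the input; withText=false drops the text rule.
    static List<String> tokenize(String input, boolean withText) {
        List<String> tokens = new ArrayList<>();
        int pos = 0;
        while (pos < input.length()) {
            String bestName = null;
            int bestLen = 0;
            for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
                if (!withText && rule.getKey().equals("text")) {
                    continue;
                }
                Matcher m = rule.getValue().matcher(input);
                m.region(pos, input.length());
                // Strictly longer wins, so ties go to the rule listed first
                // (assumed tie-break). Empty matches are never accepted.
                if (m.lookingAt() && m.end() - pos > bestLen) {
                    bestName = rule.getKey();
                    bestLen = m.end() - pos;
                }
            }
            if (bestName == null) {
                throw new IllegalArgumentException("no token at offset " + pos);
            }
            tokens.add(bestName);
            pos += bestLen;
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("<report>", true));      // [report_keywd]
        System.out.println(tokenize("<report>()", true));    // [text]
        System.out.println(tokenize("<report>  () ", true)); // [report_keywd, white_space, text, white_space]
        System.out.println(tokenize("<report>()", false));   // [report_keywd, lparen, rparen]
    }
}
```

With the text rule present, "<report>()" comes out as one text token, and in "<report>  () " the text token starts at column 11, which is where the "EOF expected." error is reported. Without the text rule, both inputs produce report_keywd, lparen, rparen (plus ignored white_space), which is what the {empty} alternative expects.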


Hth
Mariusz





 
> /*******************************************************************
>  * Ignored Tokens                                                  *
>  *******************************************************************/
> Ignored Tokens
> 
>     white_space;
> 
> /*******************************************************************
>  * Productions                                                     *
>  *******************************************************************/
> Productions
> 
> goal = report_def;
> 
> report_def =
>   {keyword}
>     report_keywd |
>   {empty}
>     report_keywd lparen rparen;
> 
> 
> 
> ===============================================================
> ========           Output              ========================
> ===============================================================
> 
> "<report>"  works fine
> "<report>  " works now, but only now that the file has been shrunk down
> 
> "<report>  () " does not work - the error reads
> com.redwood.reportParserFramework.parser.ParserException: [1,11]
> "EOF expected.",
> 
> 
> "<report>()"  does not work - the error reads:
> com.redwood.reportParserFramework.parser.ParserException: [1,1]
> "TReportKeywd expected.",
> which makes me think that it only recognizes the <report> token when there
> is a whitespace after it.