
Head required (LONG)



Dear parser generator grandmasters,

Well, a funny subject. It's not my head that is required; I've got that in the right place :)

I have a serious problem with this:

Imagine this simple parser generator grammar:

Package cz.zapletal.jpg.sablecc;

Helpers
   
    all = [0 .. 0xFFFF];  
    lowercase = ['a' .. 'z'];
    uppercase = ['A' .. 'Z'];
    digit = ['0' .. '9'];
    hex_digit = [digit + [['a' .. 'f'] + ['A' .. 'F']]];

    tab = 9;
    cr = 13;
    lf = 10;
    eol = cr lf | cr | lf; // works on both Linux and Windows

    not_cr_lf = [all - [cr + lf]];
    not_star = [all - '*'];
    not_star_slash = [not_star - '/'];

    blank = (' ' | tab | eol)+;

    short_comment = '//' not_cr_lf* eol;
    long_comment = '/*' not_star* '*'+ (not_star_slash not_star* '*'+)* '/';
    comment = short_comment | long_comment;

    letter = lowercase | uppercase | '_' | '$'; 
    id_part = lowercase (lowercase | digit)*;

States
    main;

Tokens

{main}
    blank = blank;
    comment = comment;

    headers = '%header%';
    tokens = '%tokens%';
    productions = '%productions%';

    equal = '=';
    semicolon = ';';
    bar = '|';

    char = ''' not_cr_lf ''';
    id = id_part ('_' id_part)*;
    string = '"' [not_cr_lf - '"']+ '"';
    token = uppercase (uppercase | digit | '_')*;
    head = uppercase (uppercase | digit | '_')*;

Ignored Tokens 

    blank,
    comment;

Productions

    grammar =
        P.headers? P.tokens? P.productions?;

    headers =
    	T.headers [header_lists]:P.header+;
    
    header =
    	T.head equal string;

    tokens =
    	T.tokens [token_lists]:P.token+;
    
    token =
    	T.token equal string;

    productions =
        T.productions [prod_lists]:prod+;

    prod =
        id equal alts semicolon;

    alts =
        alt [alts]:alts_tail*;

    alts_tail =
        bar alt;

    alt =
        {parsed} [elems]:elem*;
        
    elem =
        id;

-----------------------
Now the source code:

package cz.zapletal.jpg.tests;

import java.io.FileReader;
import java.io.PushbackReader;

import cz.zapletal.jpg.sablecc.lexer.Lexer;
import cz.zapletal.jpg.sablecc.node.Start;
import cz.zapletal.jpg.sablecc.parser.Parser;

public class ParserTest
{
	public static void main(String[] arguments)
	{
		try
		{
			Parser p = new Parser(new Lexer(
					new PushbackReader(
							new FileReader("arithmetic.grammar"), 1024)));
			System.out.println("Parsing...");
			Start tree = p.parse();
			System.out.println("Done.");
		}
		catch(Exception e)
		{
			System.out.println(e.getMessage());
		}
	}
}


--------------
And THE (my) grammar (INPUT):

%header%

GRAMMARTYPE = "LL"
WHITESPACE = "[ \t\n\r]+"

%tokens%

ADD			= "+"
SUB			= "-"
MUL			= "*"
DIV			= "/"
LPAREN		= "("
RPAREN		= ")"
NUMBER		= "[0-9]+"

%productions%

expression = term expression_rest
	| term ;

expression_rest = ADD expression
               | SUB expression ;

term = factor termrest
	| factor ;
     
termrest = MUL term
         | DIV term ;

factor = atom
       | LEFT_PAREN expression RIGHT_PAREN ;

atom = NUMBER
     | IDENTIFIER ; 

------
When I try to run the test it ends with:

(1,3): Head expected.

But there IS a head token there (GRAMMARTYPE) that suits the rule! What's
wrong here?
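
One thing that may be worth checking: in the grammar above, `token` and `head` are defined by exactly the same pattern. The sketch below is a toy model in plain Java, not SableCC code. It assumes a lexer that takes the longest match and, on a tie, prefers the rule declared first; that tie-breaking policy is an assumption about SableCC and should be verified against its documentation. The class name `TieBreakDemo` and the method `classify` are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TieBreakDemo {
    // Rules in declaration order. Both patterns mirror
    // uppercase (uppercase | digit | '_')* from the grammar above.
    static final Map<String, Pattern> RULES = new LinkedHashMap<>();
    static {
        RULES.put("token", Pattern.compile("[A-Z][A-Z0-9_]*"));
        RULES.put("head", Pattern.compile("[A-Z][A-Z0-9_]*"));
    }

    // Returns the name of the rule with the longest match at the start of
    // the input, or null if no rule matches. On a tie the rule declared
    // earlier wins (assumed policy, see the note above).
    static String classify(String input) {
        String best = null;
        int bestLen = 0;
        for (Map.Entry<String, Pattern> rule : RULES.entrySet()) {
            Matcher m = rule.getValue().matcher(input);
            // Strict '>' keeps the earlier rule when lengths are equal.
            if (m.lookingAt() && m.end() > bestLen) {
                best = rule.getKey();
                bestLen = m.end();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // First line of the %header% section in arithmetic.grammar.
        System.out.println(classify("GRAMMARTYPE = \"LL\"")); // prints: token
    }
}
```

Under that assumption, `GRAMMARTYPE` is always classified as `token`, so a parser waiting for `head` would never receive one. Whether this is what actually happens in the generated lexer would still need to be confirmed, for example by printing the tokens it returns.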

Thanks for your help.


Best regards

Lukas Zapletal
editor-in-chief
Linux+ magazine

Software Media s.r.o.
U Krbu 2395/17
100 00 Praha 10

icq: 17569735
