Re: xml like grammar used in sableCC



Hi,

Your grammar given below is quite poor:

1. It does not follow SableCC's naming rules for alternatives in a production.
2. It allows invalid XML syntax such as: <           /          r   o    A   d    >
3. It contains a shift/reduce conflict.
4. Lexer states define a DFA on top of the token definitions, and each state is associated with a set of tokens. When the lexer is in a state, only the tokens associated with this state are recognized. The state that is listed first is the initial state of the lexer (in your case, the "global" state). Because most of your token definitions are not prefixed with a state, they will not be recognized by the lexer and therefore will not be returned to the parser.
5. According to your desired AST, "map_contents" can have more than one "road" and more than one "crossing", but the corresponding production (map_contents = crossing | road;) does not allow this.
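
To make the state mechanism in point 4 concrete, here is a toy model in Java of how the three state-prefixed tokens in your grammar behave. It is only a sketch, not the lexer SableCC generates: the class StateLexerSketch and the method lex are made up for illustration, and it handles only t_quote, t_identifier and t_string.

```java
import java.util.ArrayList;
import java.util.List;

public class StateLexerSketch {
    enum State { GLOBAL, VALUE }

    static boolean isLetter(char c) { return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'); }
    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }
    static boolean isSymbol(char c) { return "*,.-_|".indexOf(c) >= 0; }

    // Hypothetical helper: returns token names, with the matched text for
    // t_identifier and t_string.
    static List<String> lex(String input) {
        List<String> tokens = new ArrayList<>();
        State state = State.GLOBAL; // the state listed first is the initial state
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int start = i;
            if (c == '"') {
                // {value->global, global->value} t_quote: matched in both states,
                // and each match switches to the other state
                tokens.add("t_quote");
                state = (state == State.GLOBAL) ? State.VALUE : State.GLOBAL;
                i++;
            } else if (state == State.GLOBAL && isLetter(c)) {
                // {global} t_identifier = t_letter (t_letter | t_digit)*
                while (i < input.length()
                        && (isLetter(input.charAt(i)) || isDigit(input.charAt(i)))) {
                    i++;
                }
                tokens.add("t_identifier(" + input.substring(start, i) + ")");
            } else if (state == State.VALUE && (isLetter(c) || isDigit(c) || isSymbol(c))) {
                // {value} t_string = (t_letter | t_digit | t_symbol)*
                while (i < input.length() && (isLetter(input.charAt(i))
                        || isDigit(input.charAt(i)) || isSymbol(input.charAt(i)))) {
                    i++;
                }
                tokens.add("t_string(" + input.substring(start, i) + ")");
            } else {
                throw new IllegalArgumentException(
                        "no token matches '" + c + "' in state " + state);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        // The same text is a different token depending on the current state.
        System.out.println(lex("abc\"abc\""));
        // prints [t_identifier(abc), t_quote, t_string(abc), t_quote]
    }
}
```

The point of the sketch is that the text abc is returned as t_identifier outside the quotes and as t_string inside them, purely because t_quote changed the lexer state in between.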

So, at least start with a correct, working SableCC 2 version of your grammar, then add the abstract syntax and the corresponding transformation rules. To do this, you can use http://www.mare.ee/indrek/sablecc/test.sablecc3.txt as a guideline, and use the AST-printer walker class http://people.ninthave.net/~roger/sablecc/ASTPrinter.java to see the AST that was built while parsing some sample input.

You could also consider using DOM, since your input is XML.
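
As a rough sketch of the DOM route, the standard Java DOM API (javax.xml.parsers and org.w3c.dom) can read the map directly. This assumes the input is well-formed XML with lowercase element names, that each road has a name attribute and that each coord has x and y attributes. Those attribute names, the sample input and the class RoadMapDomSketch are my assumptions, not something taken from your grammar.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

public class RoadMapDomSketch {

    // Hypothetical helper: one line of text per <road>, listing its <coord> children.
    static List<String> describeRoads(String xml) {
        Document doc;
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(
                    new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("could not parse the map XML", e);
        }

        List<String> result = new ArrayList<>();
        NodeList roads = doc.getDocumentElement().getElementsByTagName("road");
        for (int i = 0; i < roads.getLength(); i++) {
            Element road = (Element) roads.item(i);
            StringBuilder line = new StringBuilder(road.getAttribute("name")).append(":");
            NodeList coords = road.getElementsByTagName("coord");
            for (int j = 0; j < coords.getLength(); j++) {
                Element coord = (Element) coords.item(j);
                line.append(" (").append(coord.getAttribute("x"))
                    .append(",").append(coord.getAttribute("y")).append(")");
            }
            result.add(line.toString());
        }
        return result;
    }

    public static void main(String[] args) {
        String xml = "<map>"
                + "<road name=\"A1\"><coord x=\"1\" y=\"2\"/><coord x=\"3\" y=\"4\"/></road>"
                + "<road name=\"B7\"><coord x=\"5\" y=\"6\"/></road>"
                + "<crossing/>"
                + "</map>";
        System.out.println(describeRoads(xml));
        // prints [A1: (1,2) (3,4), B7: (5,6)]
    }
}
```

Note that XML element names are case-sensitive, so unlike your token definitions a DOM parser will not treat roAd and road as the same tag.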

Pieter.


On Thu, 2004-03-18 at 10:20, Kim Schulz wrote:
hi
As a university project, I am currently working on a compiler
that takes XML-like input with road information and compiles it
into a roadmap.
I have a working SableCC 2.x grammar for this input language, but I need
the AST part as well and am therefore trying to convert it into a
version 3 grammar. An early version 2.x grammar is listed below (sorry,
the final 2.0 version was overwritten by the v3 grammar).

When trying to complete the v3 grammar we hit a problem saying:
[41,51] t_road must be one of the elements on the left side of the arrow
 or is already refered to in this alternative.

line 41 is:
map_contents  =  road {-> New map_contents.road(t_road)};


The documentation (on the net at least) is very limited on this point,
and it is very hard to convert our grammar based on the few
examples there are on the net.

Is there any documentation out there somewhere besides the thesis and the
links on sablecc.org?

Could anyone give a small intro or something on how to do the
transformation so I could get an AST like this:


                             start
                              |
                          map contents
                              |
     --------------------------------------------------------
     |       |       |        |             |               |
   road    road    road       road       crossing       crossing
                     |                      |
                -----------------      (like the other)
                |    |    |     |
              coord ... coord  attr
                |               |
              ----           -------
              |  |           |     | 
              x  y          attr1 attr2

 -------------------------version 2 grammar--------------------------
Package RIS;

Helpers
    t_digit = ['0' .. '9'];
    t_letter = (['a' .. 'z'] | ['A' .. 'Z']);
    t_symbol = ('*' | ',' | '.' | '-' | '_' | '|');
    t_dot = '.';

    tab = 9;
    cr = 13;
    lf = 10;
    space = (' ');

States
			global,
			value;

Tokens
    t_opentag = '<';
    t_closetag = '>';
    t_equals = '=';
    t_slash = '/';
    {value->global, global->value} t_quote = '"';
    t_coord = ('c' | 'C') ('o' | 'O') ('o' | 'O') ('r' | 'R') ('d' | 'D');
    t_map = ('m' | 'M') ('a' | 'A') ('p' | 'P');
    t_road = ('r' | 'R') ('o' | 'O') ('a' | 'A') ('d' | 'D');
    t_crossing = ('c' | 'C') ('r' | 'R') ('o' | 'O') ('s' | 'S')
                 ('s' | 'S') ('i' | 'I') ('n' | 'N') ('g' | 'G');
    t_connection = ('c' | 'C') ('o' | 'O') ('n' | 'N') ('n' | 'N') ('e' | 'E')
                   ('c' | 'C') ('t' | 'T') ('i' | 'I') ('o' | 'O') ('n' | 'N');
    t_subroad = ('s' | 'S') ('u' | 'U') ('b' | 'B') ('r' | 'R')
                ('o' | 'O') ('a' | 'A') ('d' | 'D');

    {global} t_identifier = t_letter (t_letter | t_digit)*;
    {value} t_string = (t_letter | t_digit | t_symbol)*;
    t_blank = (cr lf | cr | lf | space)+;

Ignored Tokens
	t_blank;

Productions
    start = [leftopentag]:t_opentag [leftmap]:t_map attribute*
            [leftclosetag]:t_closetag map_contents
            [rightopentag]:t_opentag t_slash [rightmap]:t_map
            [rightclosetag]:t_closetag;

    map_contents = crossing |
                   road;

    crossing = t_crossing;

    road = [l1]:t_opentag [l2]:t_road attribute* [l3]:t_closetag
           road_contents [r1]:t_opentag t_slash [r2]:t_road
           [r3]:t_closetag;

    road_contents = coords*;

    coords = t_opentag t_coord attribute* t_slash t_closetag;

    attribute = t_identifier t_equals [l1]:t_quote t_string
                [r1]:t_quote;