
Re: epsilon tokens

To: VEROK Istvan <vi@inf.bme.hu>
Subject: Re: epsilon tokens
From: "Etienne M. Gagnon" <etienne.gagnon@uqam.ca>
Date: Tue, 03 Sep 2002 10:52:58 -0400
Cc: sablecc-list@sable.mcgill.ca
References: <Pine.GSO.4.00.10209031509200.2365-100000@kempelen.iit.bme.hu>
Sender: owner-sablecc-list@sable.mcgill.ca
User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.0) Gecko/20020615 Debian/1.0.0-3

Hi Istvan (that's your first name, right?),

VEROK Istvan wrote:

I'm a novice SableCC user and have run into the following tidbit:
I need to occasionally recognize epsilons (zero-length tokens)
in a grammar (yes, there IS a reason for that).  The problem
can be extracted to and illustrated by the following toy grammar:

==== eps.sablecc ====

Package eps;

States
 initial, rest;

Tokens
 {initial -> rest} eps = ;
 {rest} char = [0x0000 .. 0xffff];

Productions
 whole = eps char*;

==== ends here ====

Zero-length tokens were never meant to be recognized. This is because the lexer operates independently of the parser. Think about it: a zero-length match consumes no input, so if zero-length tokens were allowed, nothing would stop the lexer from returning the same zero-length token indefinitely (infinite loop).
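
To make the infinite-loop argument concrete, here is a minimal sketch in Java. It is a hypothetical toy loop, not the code SableCC generates: the names EmptyTokenLoop, matchEps and lex are made up for illustration, and the maxSteps cap exists only so the demonstration terminates.

```java
// Hypothetical toy lexer loop; NOT SableCC's generated lexer.
class EmptyTokenLoop {
    // Assumed matcher for an "eps" token: it always matches, with length zero.
    static int matchEps(String input, int pos) {
        return 0;
    }

    // Runs the lexer loop for at most maxSteps iterations and returns the
    // final input position. Without the cap, the loop would never end.
    static int lex(String input, int maxSteps) {
        int pos = 0;
        for (int step = 0; step < maxSteps && pos < input.length(); step++) {
            pos += matchEps(input, pos); // adds 0, so pos never advances
        }
        return pos;
    }

    public static void main(String[] args) {
        // After 1000 iterations the lexer is still at the start of the input.
        System.out.println(lex("abc", 1000)); // prints 0
    }
}
```

However large maxSteps is made, the position stays at 0, which is why a lexer that runs independently of the parser cannot be allowed to return zero-length tokens.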

Now, SableCC allows a grammar designer to specify zero-length tokens. These tokens are NEVER instantiated by the lexer, but might be very useful for "customized" lexers.

So, what is happening in your case is this:
In state "initial", the lexer recognizes NO tokens, because no token of length one or more was specified for that state. So, when it sees the "e" character, the lexer complains that it doesn't match anything.
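
The failure can be sketched as follows, again in Java. This is a simplified model under assumptions, not SableCC's generated lexer: the class StateLexerSketch, its methods and the error text are hypothetical, and the input "eps" is only an example string starting with "e".

```java
// Hypothetical model of a state-based lexer for the toy grammar above.
class StateLexerSketch {
    // Length of the longest token of length >= 1 matching at pos in the
    // given state, or 0 if there is none.
    static int longestMatch(String state, String input, int pos) {
        if (state.equals("rest") && pos < input.length()) {
            return 1; // {rest} char = [0x0000 .. 0xffff];
        }
        return 0; // "initial" declares only eps, which is never instantiated
    }

    // Describes the first token found in the given state, or the error.
    static String firstToken(String state, String input) {
        int len = longestMatch(state, input, 0);
        if (len == 0) {
            return "error: unknown token at position 0";
        }
        return "char(" + input.substring(0, len) + ")";
    }

    public static void main(String[] args) {
        System.out.println(firstToken("initial", "eps")); // prints error: unknown token at position 0
        System.out.println(firstToken("rest", "eps"));    // prints char(e)
    }
}
```

The same input is accepted in state "rest" and rejected in state "initial", because the only token declared for "initial" has length zero and so is never a candidate.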

A solution:
===========
Package eps;

Tokens
 char = [0x0000 .. 0xffff];

Productions
 whole = eps char*;
 eps = ;

Explanation: Zero-length tokens are not recognized, but zero-length productions ARE recognized. :-))
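
Why an empty production is harmless can be illustrated with a small Java sketch. It is written as a hand-made recursive-descent recognizer purely for readability; SableCC itself generates table-driven LALR(1) parsers, so this is an analogy and not the generated code. The class and method names are hypothetical, and each character of the string stands for one char token.

```java
// Hypothetical recognizer for:  whole = eps char*;  eps = ;
class EpsProductionSketch {
    // eps = ;  matches the empty sequence: consumes nothing, always succeeds.
    static int parseEps(String tokens, int pos) {
        return pos;
    }

    // whole = eps char*;  returns the number of char tokens consumed.
    static int parseWhole(String tokens) {
        int pos = parseEps(tokens, 0); // applied exactly once, by the grammar
        int start = pos;
        while (pos < tokens.length()) {
            pos++; // consume one char token
        }
        return pos - start;
    }

    public static void main(String[] args) {
        System.out.println(parseWhole("eps")); // prints 3
        System.out.println(parseWhole(""));    // prints 0
    }
}
```

Unlike the lexer, the parser applies eps only where the grammar asks for it (once, at the start of whole), so matching nothing cannot repeat indefinitely.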

Have fun!

Etienne
--
Etienne M. Gagnon http://www.info.uqam.ca/~egagnon/
SableVM: http://www.sablevm.org/
SableCC: http://www.sablecc.org/
