[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: help with lexer

To: gagnon@cs.mcgill.ca (Etienne GAGNON)
Subject: Re: help with lexer
From: Archie Cobbs <archie@whistle.com>
Date: Mon, 20 Jul 1998 12:06:23 -0700 (PDT)
Cc: sablecc-list@sable.mcgill.ca
In-Reply-To: <35B38B2D.6B32B814@sable.mcgill.ca> from Etienne GAGNON at "Jul 20, 98 02:23:41 pm"

Etienne GAGNON writes:
> Archie Cobbs wrote:
> > However, one thing has always bothered me though about all parsers:
> > why is EOF considered a special token?!? Why can't it just be
> > another token like any other? Why does a parser always assume
> > that the very last token will be EOF?
> 
> It has to do with the way a parser makes decisions, be it LL or LR.
> 
> If you had a grammar like this:
> 
> start = right_recursive eof; 
> right_recursive = id right_recursive | id;
> 
> If you remove the 'eof', how does the parser know if it should parse the
> following into two ASTs or one AST? (or, in the general case, if it
> should return an error?)
> 
> input: id id
> 
> There is the possibility of saying: parse the longest string.

I have two possible responses.. :-)

 #1 Right.. you would need "parse the longest string" as an implicit
    rule. Note: this rule is the norm for lexical analysis.

    Your example (without the eof) is equivalent to:

      start = right_recursive; 
      right_recursive = id+;

    From a lexical point of view, everybody agrees what should happen...
    you shift & reduce id's until there aren't any more. One could claim that
    from a parsing point of view, the same logic and reasoning applies.
    That is, the same logic that says "always shift as much as you can
    before reducing".

 #2 On the other hand, it would make more sense to generate a shift/reduce
    conflict error upon compiling the grammar, because without the eof
    it's ambiguous. But that's the language designer's problem, not the
    compiler tool's problem!

    In my example, a newline always ends a command so the language is
    not ambiguous. I just want to use "eol" instead of "eof", but the
    tool is not giving me that choice.

I think I believe in #2 rather than #1.

-Archie

___________________________________________________________________________
Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com

References:
- Re: help with lexer
  - From: Etienne GAGNON <gagnon@sable.mcgill.ca>

Prev by Date: Re: help with lexer
Next by Date: Pasrser and EOF
Prev by thread: Re: help with lexer
Next by thread: Pasrser and EOF
Index(es):
- Date
- Thread