
Re: Parser and EOF

Etienne Gagnon writes:
> I really do like your idea. See my comments below.

Me too! :-)

> Hendrik Boom <hendrik@topoi.cam.org> wrote:
> >You require the reduce transition for a start production
> >to need no lookahead; in other words, an LR state that can reduce
> >the start production can have no other read or reduce transitions.
> Let's try to be precise. The start production itself must be LR(0). In addition,
> no other production in the grammar may require a lookahead past the end of
> the start production.

This is assuming you don't want to read past the last token in the
(valid) input, which is the best possible scenario.

It seems to me like either (a) the language inherently *requires* the
parser to look beyond the last token, or else (b) the language does
not require this -- and a normal parser won't do it.

Example of (b):

Start = 
  foo bar* foo;

Input "foo bar bar foo blah".

The parser would read the last foo symbol, do a reduction,
move to the accept state, and stop, never reading the "blah" token.
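To make case (b) concrete, here is a minimal Python sketch. It is a hand-written recognizer, not an LR parser and not SableCC output, and the names CountingTokens and parse_foo_bars_foo are made up for this illustration. It only shows that recognizing this grammar never needs a token after the closing foo.

```python
class CountingTokens:
    """Hypothetical token source that records how many tokens were pulled."""

    def __init__(self, text):
        self._tokens = text.split()
        self.consumed = 0

    def next(self):
        # Return None once the input is exhausted (a stand-in for EOF).
        if self.consumed >= len(self._tokens):
            return None
        tok = self._tokens[self.consumed]
        self.consumed += 1
        return tok


def parse_foo_bars_foo(tokens):
    """Recognize Start = foo bar* foo, pulling no more tokens than needed."""
    if tokens.next() != "foo":
        return False
    while True:
        tok = tokens.next()
        if tok == "foo":
            return True  # closing foo seen: accept without another read
        if tok != "bar":
            return False  # anything else (including EOF) is an error


stream = CountingTokens("foo bar bar foo blah")
accepted = parse_foo_bars_foo(stream)
print(accepted, stream.consumed)
```

On the input above the recognizer accepts after pulling four tokens, so "blah" is still unread in the stream.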

As an example of (a):

Start =
  foo bar*;

Input "foo bar bar foo".

In this case, the language itself requires that the last "foo"
is looked at, no matter how fancy the parser is: without seeing it,
the parser cannot know that the run of bars has ended.
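A small Python sketch of case (a), again a hand-written recognizer rather than a generated LR parser, with the hypothetical name parse_foo_bars. It counts how many tokens are pulled and how many actually belong to Start, to show that the first number exceeds the second whenever something follows the bars.

```python
def parse_foo_bars(tokens):
    """Recognize a prefix matching Start = foo bar*.

    Returns (ok, matched, pulled): whether a Start was found, how many
    tokens belong to it, and how many tokens had to be pulled to decide.
    """
    it = iter(tokens)
    pulled = 0
    first = next(it, None)
    if first is not None:
        pulled += 1
    if first != "foo":
        return False, 0, pulled
    matched = 1
    for tok in it:
        pulled += 1
        if tok != "bar":
            break  # not part of Start, but we had to look at it to know
        matched += 1
    return True, matched, pulled


result = parse_foo_bars("foo bar bar foo".split())
print(result)
```

Here Start covers three tokens, yet four are pulled: the trailing "foo" is the lookahead the language forces on any parser.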

> I was already planning (for some future SableCC development) to isolate
> LR(0) states, and not to look ahead in those states. This seems pretty
> simple to achieve, if I implement a heterogeneous state machine in the
> parser. [See Terence Parr's Ph.D. thesis for details;-)]

It seems to me like the language dictates what must happen. Can
you give an example of when a normal parser would look ahead past
the last token, even though the language didn't inherently require it?

If so, then the heterogeneous state machine idea would be needed.


Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com