[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Pasrser and EOF

Archie Cobbs wrote:
> > Let's try to be precise. The start production itself be LR(0). In addition,
> > no other production in the grammar may require a lookahead past the end of
> > the start production.
> This is assuming you don't want to read past the last token in the
> (valid) input.. which is the best possible scenario.

The idea is to reject any language that would require backtracking, and
accept all others. A language that has a start production ending with
eof does not require backtracking. But other languages fall into that
category. We want to add these languages to the set of languages parsed
by SableCC. But SableCC will reject any other language. This seems
intuitive enough, because a language that requires backtracking has,
like you called it, an ambiguous end of input; which we want to avoid.

The problem, now, is to give meaningful error messages. But that's my

> It seems to me like either (a) the language inherently *requires* the
> parser to look beyond the last token, or else (b) the language does
> not require this -- and a normal parser won't do it.

Yes. We want to reject (a) and accept (b) languages. I DO NOT WANT TO
IMPLEMENT BACKTRACKING! It is expesive in memory space requirement
and/or computation time.

I want SableCC to remain deterministic! Decisions should be made in O(1)
and parsing should be O(n) where (n = AST size).

> ... Can
> you give an example of when a normal parser would look ahead past
> the last token, even though the language didn't inherently require it?

I don't know off hands. But it may happen with ugly LALR kind of
conflicts that would not be present in a canonical LR state machine...