
Re: Parser and EOF



Hi Hendrik,

I really do like your idea. See my comments below.

Hendrik Boom <hendrik@topoi.cam.org> wrote:
>You require the reduce transition for a start production
>to need no lookahead; in other words, an LR state that can reduce
>the start production can have no other read or reduce transitions.


Let's try to be precise. The start production itself must be LR(0). In addition,
no other production in the grammar may require a lookahead past the end of
the start production.

e.g.
start = hello you;
you = ...

"you" should be LR(0) too. Am I right?

>There is a slight complication; you do need to avoid reading each
>token until it's actually needed.  Otherwise you risk reading past
>the end for an unneeded lookahead.  Presumably you would have to
>maintain a flag that says whether the lookahead symbol has been read
>yet, and to read it only when required.


I was already planning (for some future SableCC development) to isolate
LR(0) states, and not to look ahead in those states. This seems pretty
simple to achieve, if I implement a heterogeneous state machine in the
parser. [See Terence Parr's Ph.D. thesis for details ;-)]
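
To make the "read it only when required" flag concrete, here is a minimal
sketch in Java. The class name LazyLookahead and its methods are hypothetical
and are not part of SableCC; it assumes the lexer can be viewed as an Iterator
of tokens and uses null to stand for end of input. States that need a
lookahead would call peek(); LR(0) states would reduce without calling it, so
nothing past the end of the start production is read.

```java
import java.util.Iterator;

// Hypothetical helper (not SableCC's API): wraps a token source and
// pulls the lookahead token from it only when the parser asks for one.
final class LazyLookahead<T> {
    private final Iterator<T> source;
    private T lookahead;          // meaningful only while hasLookahead is true
    private boolean hasLookahead; // the "has the lookahead been read yet" flag
    private int reads;            // how many times the source was consulted

    LazyLookahead(Iterator<T> source) {
        this.source = source;
    }

    // For states that need a lookahead: reads from the source at most once.
    T peek() {
        if (!hasLookahead) {
            // Assumption: null represents end of input in this sketch.
            lookahead = source.hasNext() ? source.next() : null;
            hasLookahead = true;
            reads++;
        }
        return lookahead;
    }

    // Shift: hand over the current token, reading it first if necessary.
    T next() {
        T token = peek();
        hasLookahead = false;
        lookahead = null;
        return token;
    }

    // Exposed only so the laziness can be observed.
    int reads() {
        return reads;
    }
}
```

With the input "hello you extra", a parser that shifts two tokens and then
reduces in an LR(0) state never calls peek() again, so "extra" is left unread
in the underlying source.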

>Then there's no longer any need for a user-specified reserved end-of-input
>token either.
>
>If the grammar-writer chooses to have an end-of-input token, that is
>strictly a matter of how he chooses to use the tokens provided by the
>lexer. He just includes it at the end of each start production.


Are you suggesting that there be multiple start productions? Wouldn't a single
production (the first one) with many alternatives be OK?

>Would this satisfy everybody's need?


It looks nice to me.

Etienne