Fw: Parser and EOF



From: Hendrik Boom <hendrik@topoi.cam.org>
To: gagnon@CS.McGill.CA <gagnon@CS.McGill.CA>
Cc: Hendrik Boom <hendrik@topoi.cam.org>
Date: Tuesday, July 21, 1998 9:56 PM
Subject: Re: Parser and EOF


Please put this in the sable mailing list:
>
> Hi.
>
> It seems that implementing a parser without some kind of end-of-input
> character is not really feasible without changing the "expected"
> behavior of the parser. This is because some backtracking would be
> needed, and I am not sure I want to deal with this kind of headache.

> ...
> ...
> ...

No backtracking is needed, and no special end-of-input token is needed
either, if you perform the right kind of test during parser generation.

You require the reduce transition for a start production
to need no lookahead; in other words, an LR state that can reduce
the start production can have no other read or reduce transitions.
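
A minimal sketch of that generation-time test, in Python. The State
class and the function name are hypothetical, made up for illustration;
they are not taken from any existing parser generator. It assumes the
LR automaton has already been built and that each state lists its read
(shift) and reduce transitions.

```python
from dataclasses import dataclass, field


@dataclass
class State:
    """Hypothetical LR state: what it can read and what it can reduce."""
    shifts: set = field(default_factory=set)      # terminals with a read transition
    reductions: set = field(default_factory=set)  # names of productions reducible here


def offending_states(states, start_productions):
    """Return the indices of states that violate the rule.

    A state violates it if it can reduce a start production and also has
    any other read or reduce transition, since choosing between them
    would require a lookahead token.
    """
    bad = []
    for index, state in enumerate(states):
        if not (state.reductions & start_productions):
            continue  # this state never reduces a start production
        if state.shifts or len(state.reductions) > 1:
            bad.append(index)
    return bad


# Toy automaton: state 1 is fine, state 2 would need lookahead.
states = [
    State(shifts={"a"}),
    State(reductions={"S -> a"}),
    State(shifts={"b"}, reductions={"S -> a"}),
]
print(offending_states(states, {"S -> a"}))
```

A generator would reject the grammar (or fall back to an explicit
end-of-input token) whenever this list is non-empty.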

There is a slight complication; you do need to avoid reading each
token until it's actually needed.  Otherwise you risk reading past
the end for an unneeded lookahead.  Presumably you would have to
maintain a flag that says whether the lookahead symbol has been read
yet, and to read it only when required.
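
The flag could look something like the sketch below (Python; the class
and method names are invented for illustration, and None stands in for
whatever the lexer reports at physical end of input). The reads counter
is only there to show that nothing is fetched before it is asked for.

```python
class LazyLookahead:
    """Wraps a token source and fetches the lookahead only on demand."""

    def __init__(self, tokens):
        self._source = iter(tokens)
        self._have_lookahead = False  # the flag: has the lookahead been read yet?
        self._lookahead = None
        self.reads = 0                # number of tokens actually pulled from the source

    def peek(self):
        """Return the lookahead token, reading it from the source if needed."""
        if not self._have_lookahead:
            self._lookahead = next(self._source, None)
            self._have_lookahead = True
            self.reads += 1
        return self._lookahead

    def consume(self):
        """Return the current token and clear the flag without reading ahead."""
        token = self.peek()
        self._have_lookahead = False
        return token


la = LazyLookahead(["a", "b"])
la.consume()     # reads "a" only; "b" has not been touched
print(la.reads)
```

A state whose only action is a reduce never calls peek(), so the parser
can finish on the start production without asking the lexer for
anything more.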

> On the other hand, the "end-of-input" token could be user specified.

Then there's no longer any need for a user-specified reserved end-of-input
token either.

If the grammar-writer chooses to have an end-of-input token, that is
strictly a matter of how he chooses to use the tokens provided by the lexer.
He just includes it at the end of each start production.
A start production is one that rewrites the start nonterminal.

What identifies end-of-input can even depend on context!

The lexer can still provide a physical end-of-input token, of course,
but it need not be the designated way of recognising the end of the
grammatical input.

> There is a catch, though. The "End Of Stream Token" cannot be used in
> the grammar.

And the end-of-stream token can even be used within the grammar.

>
> Would that satisfy everybody's need?

Would this satisfy everybody's need?

>
> Etienne
>

hendrik.