Re: help with lexer
Hi Archie.
I have read your e-mail. Unless I am wrong, there might be an easy (and
relatively elegant) solution to your problem.
I assume that every command is independent. This means that every
command can (and will, if you use my approach) result in a "stand alone"
AST.
Here is how to do it:
1- You remove the newline token from your Tokens section.
2- You define the following customized PushbackReader which takes care
of the newline and continuation problem.
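import java.io.*;

public class CustomPushbackReader extends PushbackReader
{
    // True when the last character consumed by read() was a newline.
    boolean lastWasNewLine;

    CustomPushbackReader(Reader reader, int bufsize)
    {
        super(reader, bufsize);
    }

    // Discards the rest of the current command, up to and including
    // its terminating newline. Does nothing if that newline was
    // already consumed.
    public void purge() throws IOException
    {
        if(lastWasNewLine)
        {
            return;
        }

        // read() reports the terminating newline (and the real end of
        // the stream) as -1.
        while(read() != -1)
        {
        }
    }

    // Returns -1 (EOF) at each newline, so that the lexer ends the
    // current command there, and skips backslash-newline pairs.
    public int read() throws IOException
    {
        int c = super.read();

        switch(c)
        {
        case '\n':
            lastWasNewLine = true;
            return -1;

        case '\\':
            {
                lastWasNewLine = false;
                int next = super.read();

                if(next == '\n')
                {
                    // Continuation: go through read() again so that a
                    // newline or another continuation right after it
                    // is still handled.
                    return read();
                }

                if(next != -1)
                {
                    // Never push back the end-of-stream marker.
                    unread(next);
                }
                return c;
            }

        default:
            lastWasNewLine = false;
            return c;
        }
    }
}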
3- You create and keep a reference to one instance of class
CustomPushbackReader.
4- You create a Lexer instance passing the CustomPushbackReader instance
to the constructor.
5- You create a new Parser instance for each command, passing the same
"unique" Lexer instance to the constructor.
6- You get the AST of a command with the parse() method.
7- On errors, you call the CustomPushbackReader.purge() method.
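The steps above can be sketched as a small driver loop. This is only an illustration under stated assumptions, not SableCC code: LineEofReader is a condensed, hypothetical variant of the reader, the streamEnded flag is an addition that lets the loop tell a newline's -1 from the real end of the stream, and parseOne() stands in for "new Parser(lexer).parse()", treating a '?' character as a made-up syntax error.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, condensed variant of the reader described in step 2.
class LineEofReader extends PushbackReader {
    boolean lastWasNewLine;
    boolean streamEnded; // assumption: not part of the original class

    LineEofReader(Reader in) {
        super(in, 16);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        lastWasNewLine = (c == '\n');
        if (c == '\n') {
            return -1;              // end of command, not end of stream
        }
        if (c == -1) {
            streamEnded = true;     // the underlying stream is exhausted
            return -1;
        }
        if (c == '\\') {
            int next = super.read();
            if (next == '\n') {
                return read();      // continuation: resume on the next line
            }
            if (next != -1) {
                unread(next);
            }
        }
        return c;
    }

    // Step 7: skip the rest of the current command after an error.
    void purge() throws IOException {
        if (lastWasNewLine) {
            return;
        }
        while (read() != -1) {
            // discard
        }
    }
}

class CommandLoopSketch {
    // Stand-in for the parser: collects one command, and fails on '?'
    // to imitate a lexical or parse error in the middle of a line.
    static String parseOne(LineEofReader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (c == '?') {
                throw new IllegalArgumentException("unexpected '?'");
            }
            sb.append((char) c);
        }
        return sb.toString();
    }

    // Steps 3 to 7: one reader instance, one "parse" per command.
    static List<String> readCommands(Reader source) {
        LineEofReader in = new LineEofReader(source);
        List<String> commands = new ArrayList<>();
        try {
            while (!in.streamEnded) {
                try {
                    String command = parseOne(in);
                    if (!(in.streamEnded && command.isEmpty())) {
                        commands.add(command);
                    }
                } catch (IllegalArgumentException e) {
                    in.purge();
                    commands.add("<error>");
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return commands;
    }

    public static void main(String[] args) {
        String input = "get a\nbad ? rest\nput \\\nb\n";
        // prints [get a, <error>, put b]
        System.out.println(readCommands(new StringReader(input)));
    }
}
```

In a real setup, parseOne() would be replaced by a fresh Parser built on the single shared Lexer, as in steps 4 to 6, and the reader would wrap the socket's input stream instead of a StringReader.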
I think that this would fulfill all your requirements.
Etienne
Archie Cobbs wrote:
>
> Hmm.. I'm not sure how to do what I want to do with SableCC...
>
> I have a line-based protocol running over TCP and want to create
> a parser that the server will use. There is exactly one command
> per line, where a line is terminated with \n.
>
> There is one twist: backslash continuations are allowed... that is,
> if a newline immediately follows a backslash, both characters are
> ignored and the command continues on the next line.
>
> I want the server to be able to parse the client's input, with
> these things being true:
>
> o Interactivity - the server should not read any more characters
> than absolutely necessary to determine the correct token (ie,
> it should not read past the terminating newline character).
>
> o Error recovery - if there's any lexical or parse error, the server
> should read and discard characters up to & including the next newline
> character (with flex, you'd define an "error command" that matches
> the error token).
>
> o Repeatability - the server should be able to repeatedly call the
> parsing engine to parse out consecutive ASTs (ie, commands) from
> the input stream.
>
> I'm having trouble doing this with SableCC, mainly the last point,
> because it always expects to parse the first production followed by
> an EOF token. In bison, you could use YYACCEPT after parsing a
> "command" production to make the parser accept & stop. With SableCC,
> I'm trying to insert an EOF token after reading the newline, but
> it seems impossible to insert an EOF token in the filter() routine
> without violating the interactivity requirement.
>
> Any ideas appreciated! Mainly, what would be the most correct/elegant
> approach with SableCC to a design problem like this in terms of
> satisfying those three requirements.
>
> Thanks,
> -Archie
>
> ___________________________________________________________________________
> Archie Cobbs * Whistle Communications, Inc. * http://www.whistle.com