
Re: help with lexer



Hi Archie.

I have read your e-mail. Unless I am wrong, there might be an easy (and
relatively elegant) solution to your problem.

I assume that every command is independent. This means that every
command can (and will, if you use my approach) result in a "stand alone"
AST.


OK. Here it goes.

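To do this:
1- You remove the newline token from your Tokens section.
2- You define the following customized PushbackReader, which takes care
of the newline and continuation problem. It reports each newline to the
lexer as end of input (-1) and silently skips a backslash followed by a
newline.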

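import java.io.*;
public class CustomPushbackReader extends PushbackReader
{
    // True when the last character consumed by read() was the newline
    // that ends a command.
    private boolean lastWasNewLine;

    public CustomPushbackReader(Reader reader, int bufsize)
    {
        super(reader, bufsize);
    }

    // Discards the rest of the current command, up to and including its
    // terminating newline. Does nothing if that newline was already read.
    public void purge() throws IOException
    {
        if(lastWasNewLine)
        {
            return;
        }

        // read() reports the terminating newline (and real EOF) as -1.
        while(read() != -1)
        {
        }
    }

    // Reports a newline as end of input (-1) and skips a backslash
    // immediately followed by a newline.
    public int read() throws IOException
    {
        int c = super.read();
        switch(c)
        {
            case '\n':
            {
                lastWasNewLine = true;
                return -1;
            }
            case '\\':
            {
                lastWasNewLine = false;
                int next = super.read();
                if(next == '\n')
                {
                    // Go through read() again, so that a newline or another
                    // continuation right after this one is handled too.
                    return read();
                }
                if(next != -1)
                {
                    // Never push back the EOF marker.
                    unread(next);
                }
                return c;
            }
            default:
            {
                lastWasNewLine = false;
                return c;
            }
        }
    }
}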

3- You create and keep a reference to one instance of class
CustomPushbackReader.
4- You create a Lexer instance passing the CustomPushbackReader instance
to the constructor.
5- You create a new Parser instance for each command, passing the same
"unique" Lexer instance to the constructor.
6- You get the AST of a command with the parse() method.
7- On a lexer or parser error, you call purge() on your
CustomPushbackReader instance to skip the rest of the offending line.
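
As a rough sketch of steps 3 to 7 (not tested against SableCC itself),
the loop could look something like the code below. To keep it runnable on
its own, the generated Lexer and Parser are replaced by a stand-in
parseCommand() method, and LineBoundedReader is a condensed copy of the
reader. The names LineBoundedReader, atEndOfStream(), parseCommand() and
run() are made up for this illustration. atEndOfStream() in particular is
an assumption: something like it is needed to tell a real end of stream
from the -1 reported at each newline.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

// Condensed stand-in for the reader: a newline is reported as end of
// input (-1) and a backslash-newline pair is skipped as a continuation.
class LineBoundedReader extends PushbackReader {
    private boolean lastWasNewLine;

    LineBoundedReader(Reader in, int bufsize) {
        super(in, bufsize);
    }

    @Override
    public int read() throws IOException {
        int c = super.read();
        lastWasNewLine = (c == '\n');
        if (c == '\n') {
            return -1;                 // end of one command
        }
        if (c == '\\') {
            int next = super.read();
            if (next == '\n') {
                return read();         // continuation: go on with the next line
            }
            if (next != -1) {
                unread(next);
            }
        }
        return c;
    }

    // Skips the rest of the current command unless its newline was already read.
    void purge() throws IOException {
        if (!lastWasNewLine) {
            while (read() != -1) { }
        }
    }

    // Hypothetical helper: peeks one character to tell a real end of stream
    // apart from the -1 that read() reports for a newline.
    boolean atEndOfStream() throws IOException {
        int c = super.read();
        if (c == -1) {
            return true;
        }
        unread(c);
        return false;
    }
}

class CommandLoopDemo {
    // Stand-in for "new Parser(lexer).parse()": accepts letters and spaces
    // and fails on anything else, leaving the rest of the line unread.
    static String parseCommand(LineBoundedReader in) throws IOException {
        StringBuilder sb = new StringBuilder();
        int c;
        while ((c = in.read()) != -1) {
            if (!Character.isLetter(c) && c != ' ') {
                throw new IllegalArgumentException("unexpected '" + (char) c + "'");
            }
            sb.append((char) c);
        }
        return sb.toString();
    }

    // One reader for the whole stream, one parse attempt per command,
    // purge() after every failed attempt.
    static List<String> run(String input) {
        List<String> results = new ArrayList<>();
        try {
            LineBoundedReader in = new LineBoundedReader(new StringReader(input), 1024);
            while (!in.atEndOfStream()) {
                try {
                    results.add("ok: " + parseCommand(in));
                } catch (IllegalArgumentException e) {
                    in.purge();
                    results.add("error: " + e.getMessage());
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return results;
    }

    public static void main(String[] args) {
        // Three commands; the second is invalid, the third uses a continuation.
        for (String line : run("get a\nput 1 2\nlist \\\nall\n")) {
            System.out.println(line);
        }
    }
}
```

With that input, main prints "ok: get a", then "error: unexpected '1'",
then "ok: list all". The bad second line is discarded up to its newline,
and the continuation in the third command is joined before the stand-in
parser sees it. In a real server, parseCommand(in) would be replaced by
the Parser.parse() call of step 6, and the catch clause would catch the
lexer and parser exceptions instead.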

I think that this would fulfill all your requirements.

Etienne 


Archie Cobbs wrote:
> 
> Hmm.. I'm not sure how to do what I want to do with SableCC...
> 
> I have a line-based protocol running over TCP and want to create
> a parser that the server will use. There is exactly one command
> per line, where a line is terminated with \n.
> 
> There is one twist: backslash continuations are allowed... that is,
> if a newline immediately follows a backslash, both characters are
> ignored and the command continues on the next line.
> 
> I want the server to be able to parse the client's input, with
> these things being true:
> 
>  o Interactivity - the server should not read any more characters
>    than absolutely necessary to determine the correct token (ie,
>    it should not read past the terminating newline character).
> 
>  o Error recovery - if there's any lexical or parse error, the server
>    should read and discard characters up to & including the next newline
>    character (with flex, you'd define an "error command" that matches
>    the error token).
> 
>  o Repeatability - the server should be able to repeatedly call the
>    parsing engine to parse out consecutive AST's (ie, commands) from
>    the input stream.
> 
> I'm having trouble doing this with SableCC, mainly the last point,
> because it always expects to parse the first production followed by
> an EOF token. In bison, you could use YYACCEPT after parsing a
> "command" production to make the parser accept & stop. With SableCC,
> I'm trying to insert an EOF token after reading the newline, but
> it seems impossible to insert an EOF token in the filter() routine
> without violating the interactivity requirement.
> 
> Any ideas appreciated! Mainly, what would be the most correct/elegant
> approach with SableCC to a design problem like this in terms of
> satisfying those three requirements.
> 
> Thanks,
> -Archie
> 
> ___________________________________________________________________________
> Archie Cobbs   *   Whistle Communications, Inc.  *   http://www.whistle.com