
sablecc generating grammars for other languages




I'm pondering how simple it would be to make SableCC generate
parsers for other languages. Foremost I have C++ in mind. C++ is
rather close to Java, so it shouldn't be very difficult.

I'm not yet thinking about what the structure or object model of the
generated C++ parser should be, but rather about how easy it would be
to generate it. As far as I can tell from looking into SableCC, the
code generation is hardcoded into the Gen* files.

There is a class named ResolveIds that contains a lot of information
about the structure of the parser, but it does not have everything
needed to generate a parser.

So either it needs improvement, or a completely new data model for
parser data has to be created: a data model that generators could use
for generating parsers. Which is preferable?

What I have in mind is something like:

    interface ParserGenerator {
        void generate(ResolveIds parserData);
    }
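
To make the second option (a separate data model) a bit more concrete,
here is a minimal sketch of a variation on the interface above. Every
name in it (ParserModel, CppSketchGenerator, the tokenNames and
productionNames fields) is hypothetical and not part of SableCC; the
T and P prefixes in the output are chosen only for illustration. It
assumes the generator returns the generated source as a string rather
than writing files.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, language-neutral description of a parser.
// This is an assumption for illustration, NOT an existing SableCC class.
final class ParserModel {
    final List<String> tokenNames = new ArrayList<>();
    final List<String> productionNames = new ArrayList<>();
}

// Hypothetical back-end interface: one implementation per target language.
interface ParserGenerator {
    String generate(ParserModel model);
}

// Toy C++ back end: emits one forward declaration per token and production.
final class CppSketchGenerator implements ParserGenerator {
    @Override
    public String generate(ParserModel model) {
        StringBuilder out = new StringBuilder();
        for (String token : model.tokenNames) {
            out.append("class T").append(token).append(";\n");
        }
        for (String production : model.productionNames) {
            out.append("class P").append(production).append(";\n");
        }
        return out.toString();
    }
}
```

With this split, the existing Java output would become just one more
implementation of the same interface, and the model would have to carry
whatever ResolveIds is currently missing.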

Anyway, I'm not afraid to spend a weekend or two making it happen, if
you think my efforts are not pointless and give me your blessing :)

Regards,
Indrek

PS: I found the following oddity in the generated lexer code of the
SableCC jdk2 distribution:

    private String getText(int acceptLength)
    {
        StringBuffer s = new StringBuffer(acceptLength);
        for(int i = 0; i < acceptLength; i++)
        {
            s.append(text.charAt(i));
        }

        return s.toString();
    }

Here text is a StringBuffer. Wouldn't this be cleaner:

    private String getText(int acceptLength)
    {
        return text.substring(0, acceptLength);
    }

I think this function is heavily used, so the new approach should
also benefit performance.
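
A small standalone check along the following lines could confirm that
the two versions return the same string. The class GetTextDemo and the
method names viaLoop and viaSubstring are made up for this sketch, and
text is passed as a parameter here instead of being a field of the
lexer.

```java
// Hypothetical test harness; not part of the generated lexer.
final class GetTextDemo {
    // Original form: copies one character at a time.
    static String viaLoop(StringBuffer text, int acceptLength) {
        StringBuffer s = new StringBuffer(acceptLength);
        for (int i = 0; i < acceptLength; i++) {
            s.append(text.charAt(i));
        }
        return s.toString();
    }

    // Proposed form: a single bulk copy of the first acceptLength characters.
    static String viaSubstring(StringBuffer text, int acceptLength) {
        return text.substring(0, acceptLength);
    }
}
```

Both forms throw an exception when acceptLength is larger than
text.length(), so the error behaviour is comparable as well.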