Hi,
I get an "IOException: Pushback buffer overflow" with the SableCC grammar appended below.
Test 1: "[+-]" Parsing this string leads to a pushback buffer overflow. The problem seems to arise from the competition between T.range and T.class_char.
Test 2: "[\a]" When I use the commented-out alternative to T.class_char (i.e. "{ charclass } class_char = class_char ;"), I get a pushback buffer overflow exception on any backslash-escaped character occurring within brackets. I guess this is a SableCC bug, since I would expect a LexerException here rather than an IOException.
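For reference, the message text is the one java.io.PushbackReader produces when more characters are unread than its buffer can hold. Below is a minimal sketch that reproduces it without SableCC. It assumes the generated lexer reads from a PushbackReader created with the default one-character buffer and has to push back two lookahead characters (for example '-' and ']' after a failed attempt at T.range). Both points are assumptions, not something verified against the generated code. The class and method names are made up for illustration.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;

class PushbackDemo {

    /**
     * Reads up to `lookahead` characters from `input`, then pushes all of them
     * back, roughly what a backtracking lexer does after a longer match fails.
     * Returns the IOException message, or null if the push-back succeeded.
     * (Hypothetical helper, not part of SableCC.)
     */
    static String readThenUnread(String input, int lookahead, int bufferSize) {
        try (PushbackReader in = new PushbackReader(new StringReader(input), bufferSize)) {
            char[] seen = new char[lookahead];
            int n = in.read(seen, 0, lookahead);
            // Unread in reverse order so the next read() sees the original order.
            for (int i = n - 1; i >= 0; i--) {
                in.unread(seen[i]);
            }
            return null;
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Buffer of 1 (the PushbackReader default): two unreads do not fit.
        System.out.println(readThenUnread("-]", 2, 1));
        // Larger buffer (size chosen arbitrarily): the same push-back succeeds.
        System.out.println(readThenUnread("-]", 2, 1024));
    }
}
```

If this is what happens, constructing the reader as new PushbackReader(reader, 1024) in the test might avoid the overflow, though that would not explain whether the grammar itself is at fault.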
Any help is greatly appreciated!
Franz
PS: I have a JUnit test for the above cases. I'll be happy to send a zip.
PPS: Can anyone point me to a SableCC grammar for regular expressions?
/********** GRAMMAR **********/
Package quoted_char;
Helpers
all = [ 0 .. 0xffff ] ;
// metacharacters
esc = '\' ; //'
hyphen = '-' ;
l_bracket = '[' ;
r_bracket = ']' ;
// general
plain_ch = all ;
quoted_ch = esc plain_ch ;
class_char = [ plain_ch - r_bracket ];
States
// The first state is the initial state
normal , charclass ;
Tokens
{ normal->charclass } l_bracket = l_bracket ;
{ charclass->normal } r_bracket = r_bracket ;
{ charclass } range = (class_char|quoted_ch) hyphen (class_char|quoted_ch);
// { charclass } class_char = class_char ;
{ charclass } class_char = (class_char|quoted_ch) ;
{ normal } plain_ch = plain_ch ;
{ normal } quoted_ch = quoted_ch ;
Productions
char = {plain} T.plain_ch | {quoted} T.quoted_ch | {class} ch_class ;
// character class
ch_class = T.l_bracket ch_class_unit+ T.r_bracket ;
ch_class_unit = {single} T.class_char | {range} T.range ;