Re: Bug in SableCC 2.17.2?
I also wanted to add a couple of minor comments regarding character sets and
lists. I had problems with tokens that combined the two. It's been at least a
month and I can't quite remember the scenario, but given something simple like:
digit = ['0'..'9'] ;
vowel = 'a' | 'e' | 'i' | 'o' | 'u' ;
message = digit | vowel ;
I had problems with message tokens. At the time it struck me that sets and
lists were not interchangeable which I found rather strange. Although they
are expressed differently, there should be no fundamental difference (that I
can see from superficial thinking.) I wrote vowel as a list because (at the
time) I couldn't figure out how to write it as a set. Of course, one may do
the following (and other variations):
vowel = ['a' + ['e' + ['i' + ['o' + 'u']]]] ;
It seems silly that a set operator can only combine two operands per pair of
brackets. Why not allow: ['a' + 'e' + 'i' + 'o' + 'u'] ? This would make
disjoint sets much easier to define and debug.
As for combining sets and lists in a token, I imagine that this must work
and that I was just confused at the time. I have been working with
regular expression type grammars and they have a lot of issues with
ambiguous tokens (which is another, long e-mail) so it could simply have
been that the tokens I was defining were ambiguous and I didn't realize it
at the time.
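For reference, a minimal sketch of how the combination might be laid out,
assuming the usual Helpers/Tokens sections of a SableCC 2.x grammar. The names
are the ones from the example above, placing both the set and the list in
Helpers is an assumption, and the fragment is untested:

```
Helpers
  digit = ['0' .. '9'];                  // a set, written as a range
  vowel = 'a' | 'e' | 'i' | 'o' | 'u';   // a list of alternatives

Tokens
  message = digit | vowel;               // a token combining the two
```

If sets and lists really are interchangeable, swapping vowel for the nested
set form should accept exactly the same input.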
----- Original Message -----
From: "Mariusz Nowostawski" <firstname.lastname@example.org>
To: "Xuan Baldauf" <email@example.com>
Sent: Monday, August 20, 2001 10:36 PM
Subject: Re: Bug in SableCC 2.17.2?
> Hi Xuân,
> You are absolutely right, the current syntax for sets is cumbersome and
> not nice. I believe Etienne is planning a new (major) release of SableCC
> to address it and provide a much simpler and more intuitive syntax.
> The functional difference between a list of alternatives and a set of
> "characters" is really the ability to "subtract": one can create
> a bigger set and then declare smaller ones by subtracting
> another set from the bigger set.
> In your case it would be better to use sets, since you want to declare
> tokenchars as those characters that are not separators. But as you have
> pointed out, the current syntax is not nice at all for that.
> As to the second part of your question, I do not really understand the
> problem (apart, again, from the poor error handling and unhelpful error
> message you pointed out). Brackets () in a SableCC grammar file are
> used to group lists of productions/tokens. However, grouping something
> without a reason is treated as an error ;o) So, if something does not
> need to be grouped, do not group it ;o)
> I hope you will see what I mean by reading the code below. If you do not
> know how to express something, tell me exactly what you want the tail to
> represent and I may be able to help you with that.
> best regards
> tokenchar = [[0x0000..0xFFFF]-','];
> token = tokenchar+;
> comma = ',';
> token_list = token token_list_tail*;
> //token_list_tail = token; // works
> //token_list_tail = (token)+; // works
> //token_list_tail = comma token; // works
> //token_list_tail = (comma token)+; // works