Re: Case sensitivity & character sets



Mariusz Nowostawski wrote:
> Is there a way to change case-sensitive/case-insensitive parser
> generation? The examples seem to be case insensitive, and also tokens in
> my grammar seem to be case insensitive, so e.g.
> having the definition for the token

The examples are case sensitive, and so are SableCC's strings. If you
want a case insensitive string, you can define one helper per letter
that accepts both cases, e.g.:

Helpers
  a = ['a' + 'A'];
  b = ['b' + 'B'];
...

Tokens
  keyword = k e y w o r d; // case insensitive!
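
Spelled out in full, the token above needs one helper for each distinct
letter it uses. A minimal sketch, using only the helper and token syntax
already shown (the helper names are simply the letters themselves):

```sablecc
Helpers
  // one helper per letter, each a two-character set: lower + upper case
  k = ['k' + 'K'];
  e = ['e' + 'E'];
  y = ['y' + 'Y'];
  w = ['w' + 'W'];
  o = ['o' + 'O'];
  r = ['r' + 'R'];
  d = ['d' + 'D'];

Tokens
  // a concatenation of helpers: matches keyword, KEYWORD, KeyWord, etc.
  keyword = k e y w o r d;
```

A token written as the plain string 'keyword' would match only the
lowercase spelling, since strings are case sensitive.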

> Is there nice way to declare character sets? In my grammar I am about to
> declare alpha helper, which should be a set of characters like:
> ! $ % & * + - . / < > _ ~
> 
> so I tried:
> 
> Helper
>  alpha = ['!' + '$' + '%' + '&' ] ;
> 
> but it seems that + can accept only two arguments, so I ended up in
> something awful like:
> 
> Helper
>  alpha = [[[['!' + '$'] + ['%' + '&']] +
>            [['*' + '+'] + ['-' + '.']]] +
>            [[['/' + '<'] + '>'] + ['_' + '~']]];
> 
> it really does the job, but it is not the nicer construction, and
> writing/checking/modifying it is a little bit painful.

Do you really need a set? If a regular expression helper is enough, it
would be simpler:

Helpers
  alpha = '!' | '$' | '%' | ...
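
Written out for the thirteen characters listed in the question, the
alternation might look like the following sketch. It assumes only the
'|' operator shown above; the line breaks are for readability:

```sablecc
Helpers
  // flat alternation: no nesting needed, easy to add or remove a character
  alpha = '!' | '$' | '%' | '&' | '*' | '+' | '-' |
          '.' | '/' | '<' | '>' | '_' | '~';
```

Unlike the nested set version, each character appears once at the same
level, so checking or modifying the list is a one-token change.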

The advantage of sets is the "-" (difference) operator, which is not
available on regular expressions, e.g.:

  not_cr_lf = [all - [10 + 13]];

(I agree that future versions should support the intuitive ['!' + '$' +
'%' + '&' ] notation.)
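
For context, a fuller sketch of the set difference example. It assumes
that "all" is an ordinary helper defined in the grammar rather than
something built in, and the range [0 .. 127] is only an illustrative
choice; the names cr and lf are likewise hypothetical:

```sablecc
Helpers
  all = [0 .. 127];              // assumed range; widen it to suit your input
  cr  = 13;                      // carriage return, by character code
  lf  = 10;                      // line feed, by character code
  not_cr_lf = [all - [cr + lf]]; // every character except CR and LF
```

The inner set [cr + lf] is built first and then subtracted from all,
following the same two-operand rule as the "+" operator.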

> 
> Thanks very much in advance for the answers.

Hope this helps.

Etienne

-- 

----------------------------------------------------------------------
Etienne Gagnon, M.Sc.                   e-mail: gagnon@sable.mcgill.ca
Author of SableCC:                 http://www.sable.mcgill.ca/sablecc/
----------------------------------------------------------------------