
Re: Tokens and free text



>So how do these states work? Is a token like 'to' recognised only in
>the bol state, and when it is encountered, does a shift to the 'normal'
>state occur?
>
>If you remember my other post about case sensitivity you might notice a
>slight complication. If there wasn't a separator (':' in my case)
>between header and the rest, would states work? I still vote for "to" :)
>
>Is this documented anywhere? I still think I must be missing some
>documentation.


The reference document (or the "Bible", if you like) is my thesis. Everything
you need to know is written there: lexer states are explained in detail, as
are customized lexers/parsers, etc. If you want to know every implementation
detail, you may want to look at the source code, where you will discover that
I used the "Dragon Book" algorithms (with a few caching optimizations) to
generate the lexer/parser tables.
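
To make the state question above concrete, here is a minimal Python sketch of
a state-driven lexer. It is an illustration only, not SableCC's grammar syntax
or its generated code: the rule table, the tokenize function, the 'bol' and
'normal' state names and the first-match-wins policy are all assumptions made
for this example. The idea it shows is that each rule is active only in one
state, and matching a token may move the lexer to another state.

```python
import re

# Hypothetical rule table: (active state, token name, pattern, next state).
# Rules are tried in order and the first match wins (an assumption of this
# sketch; a generated lexer may well use a different policy).
RULES = [
    ("bol", "to", re.compile(r"to\b"), "normal"),      # keyword only at line start
    ("bol", "eol", re.compile(r"\n"), "bol"),
    ("bol", "text", re.compile(r"[^\n]+"), "normal"),
    ("normal", "eol", re.compile(r"\n"), "bol"),       # newline returns to bol
    ("normal", "text", re.compile(r"[^\n]+"), "normal"),
]


def tokenize(source):
    """Split source into (name, lexeme) pairs, tracking the lexer state."""
    state, pos, tokens = "bol", 0, []
    while pos < len(source):
        for rule_state, name, pattern, next_state in RULES:
            if rule_state != state:
                continue  # rule is not active in the current state
            match = pattern.match(source, pos)
            if match:
                tokens.append((name, match.group()))
                pos = match.end()
                state = next_state  # the state transition happens here
                break
        else:
            raise ValueError(f"no rule matches at offset {pos} in state {state}")
    return tokens


if __name__ == "__main__":
    for token in tokenize("to: alice\nback to work\n"):
        print(token)
```

In this sketch the "to" at the start of the first line becomes a 'to' token
and switches the lexer to 'normal', while the "to" in the middle of the second
line is just part of a 'text' token, because no 'to' rule is active in the
'normal' state.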

A look at the examples might help you to discover a few tricks.

And you have this list.

Etienne