Hi,
This is more of a regexp question than a TextMate one, so it is a bit OT, but I'll try anyway since I know that there are some clever regexp heads on this list...
I'm trying to figure out a way to parse a string with an expression, so that I can filter out subexpressions to build up a structure of expression objects.
An example expression: ((print == MIB) >> (color == Black));
Our expression structure is built up so that everything is an expression with a recursive structure (Expression = expression operator expression):
((print == MIB) >> (color == Black));
(print == MIB) (color == Black)
print MIB color Black
Additionally, from this list, each variable is also an expression (print, MIB, ...)
I guess that some regexp magic would be a great tool here, but I can't figure out the right approach, so I need some help getting started.
Any ideas on how to approach this? The programming language shouldn't matter, since it would be some recursive routine digging down through the structure anyway, but for reference I'm using C#.
On 26/5/2006, at 11:19, Geir-Tore Lindsve wrote:
[...] I'm trying to figure out a way to parse a string with an expression, so that I can filter out subexpressions to build up a structure of expression objects.
If I understand correctly, this is not a task for regular expressions.
Regular expressions are for parsing “flat” syntax, i.e. w/o recursive constructs. As soon as recursion enters the picture, you need a real parser [1].
Such a parser is fairly simple to write by hand, but I personally have grown rather attached to ANTLR for generating parsers. In TextMate, snippets, scope selectors, my plist derivative (in the bundle editor), and format strings (for regexp replacements) are all parsed using an ANTLR generated parser -- I did write some of them by hand initially, but if there is the slightest chance the grammar needs to be extended in the future, it’s not worth it -- also, once you have learned to use a parser tool, creating a new parser is fairly trivial.
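To give an idea of the hand-written route (not the ANTLR one), here is a minimal recursive-descent sketch in Python; the same shape carries over to C#. Everything in it is an assumption based on the single example in the question: the token set (parentheses, `==`, `>>`, names, a trailing `;`), the rule that every binary expression is fully parenthesized, and the hypothetical names `Name`, `Binary`, `tokenize`, `Parser` and `parse`. Note that a regexp still has a role, but only for the flat part of the job (splitting the string into tokens); the nesting is handled by a function that calls itself.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Union

# Assumed token set, inferred from the one example expression only.
TOKEN_RE = re.compile(r"\s*(\(|\)|==|>>|;|[A-Za-z_]\w*)")
OPERATORS = ("==", ">>")


@dataclass
class Name:
    """Leaf expression: a variable or value such as print or MIB."""
    text: str


@dataclass
class Binary:
    """Expression = expression operator expression."""
    left: "Expr"
    op: str
    right: "Expr"


Expr = Union[Name, Binary]


def tokenize(source: str) -> List[str]:
    """Flat step: a regexp is enough to split the input into tokens."""
    source = source.rstrip()
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


class Parser:
    def __init__(self, tokens: List[str]) -> None:
        self.tokens = tokens
        self.pos = 0

    def peek(self) -> Optional[str]:
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected: Optional[str] = None) -> str:
        token = self.peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        if expected is not None and token != expected:
            raise SyntaxError(f"expected {expected!r}, got {token!r}")
        self.pos += 1
        return token

    def expression(self) -> Expr:
        # expression := NAME | "(" expression OPERATOR expression ")"
        if self.peek() == "(":
            self.take("(")
            left = self.expression()   # recursion handles the nesting
            op = self.take()
            if op not in OPERATORS:
                raise SyntaxError(f"expected an operator, got {op!r}")
            right = self.expression()
            self.take(")")
            return Binary(left, op, right)
        token = self.take()
        if not (token[0].isalpha() or token[0] == "_"):
            raise SyntaxError(f"expected a name, got {token!r}")
        return Name(token)


def parse(source: str) -> Expr:
    parser = Parser(tokenize(source))
    tree = parser.expression()
    if parser.peek() == ";":           # the trailing semicolon is optional here
        parser.take(";")
    if parser.peek() is not None:
        raise SyntaxError(f"trailing input: {parser.peek()!r}")
    return tree


if __name__ == "__main__":
    print(parse("((print == MIB) >> (color == Black));"))
```

The result is a tree of `Binary` nodes with `Name` leaves, which matches the "everything is an expression" structure described in the question. Unparenthesized input such as `a == b >> c` is rejected by this sketch; supporting it would mean adding precedence rules to the grammar, which is the kind of extension where a parser generator starts to pay off.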