On 26/5/2006, at 11:19, Geir-Tore Lindsve wrote:
[...] I'm trying to figure out a way to parse a string with an expression, so that I can filter out subexpressions to build up a structure of expression objects.
If I understand correctly, this is not a task for regular expressions.
Regular expressions are for parsing “flat” syntax, i.e. syntax without recursive constructs. As soon as recursion enters the picture, you need a real parser [1].
Such a parser is fairly simple to write by hand, but I have personally grown rather attached to ANTLR for generating parsers. In TextMate, snippets, scope selectors, my plist derivative (in the bundle editor), and format strings (for regexp replacements) are all parsed using an ANTLR-generated parser. I did write some of them by hand initially, but if there is the slightest chance the grammar will need to be extended in the future, it’s not worth it. Also, once you have learned to use a parser tool, creating a new parser is fairly trivial.
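
To give an idea of what a hand-written parser for a recursive syntax can look like, here is a minimal recursive-descent sketch in Python. It is only an illustration: the toy grammar (numbers, + - * / and parentheses) and the names Num, BinOp, Parser and parse are assumptions made up for this example, not anything from TextMate or ANTLR. The point is that factor() calls back into expr() for a parenthesised subexpression, which is the recursion a single regular expression cannot express.

```python
import re
from dataclasses import dataclass
from typing import Union


@dataclass
class Num:
    value: float


@dataclass
class BinOp:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Num, BinOp]

# A regular expression is still fine for the flat part: splitting into tokens.
TOKEN = re.compile(r"\s*(\d+(?:\.\d+)?|[-+*/()])")


def tokenize(text):
    tokens, pos = [], 0
    text = text.rstrip()
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character at offset {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        self.pos += 1
        return tok

    def expr(self):
        # expr := term (('+' | '-') term)*
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = BinOp(op, node, self.term())
        return node

    def term(self):
        # term := factor (('*' | '/') factor)*
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = BinOp(op, node, self.factor())
        return node

    def factor(self):
        # factor := NUMBER | '(' expr ')'
        tok = self.take()
        if tok == "(":
            node = self.expr()  # the recursive step
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if tok is None or not tok[0].isdigit():
            raise SyntaxError(f"unexpected token {tok!r}")
        return Num(float(tok))


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return tree
```

Calling parse("1 + 2 * (3 - 4)") returns a tree of nested BinOp and Num objects, with the parenthesised part as its own subtree. Extending the grammar means touching these methods by hand, which is where a generator such as ANTLR starts to pay off.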