[SVN] Regular Expression Language Grammar
Gerd Knops
gerti-textmate at bitart.com
Wed Jun 14 21:08:23 UTC 2006
Allan et al.,
to make it easier to 'parse' complex regular expressions, I am in the
process of designing a Regular Expression Language Grammar. That
seems to have a lot of potential! But I have a few questions before I
release a first experimental version:
# Included language missing in scope #
The Regular Expression Language would be included with something like
...
include = 'source.regexp';
...
That works, but when I look at the scope inside a regular expression
(Ctrl-Shift-P), 'source.regexp' does not appear. That seems to be the
case for all included languages: they do not appear in the scope
themselves; only names defined in those languages appear. Is that an
oversight?
# Conditional pattern matches #
Since most programming languages use very similar regular
expressions, this language would be a candidate for inclusion in a
number of languages. However, most languages add their own quirks to
regular expressions (e.g. variables). These would have to be listed at
strategic locations inside the regular expression grammar; otherwise
we end up copying the entire RegExp grammar for each of these languages
and adding the exceptions.
So it would be great if there were conditional pattern matches,
something along the lines of
scope_contains = ( 'source.perl', 'source.ruby' );
or the inverse
scope_contains_not = ('source.perl');
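To make the semantics I have in mind concrete, here is a rough Python
sketch. It is only an illustration: 'scope_contains' and
'scope_contains_not' are the hypothetical keys from above, the rule
dictionaries and the rule_applies helper are made up for this example,
and reading 'scope_contains' as "at least one of the listed scopes" is
my assumption. None of this reflects how TextMate actually evaluates
grammars.

```python
def rule_applies(rule, scope_stack):
    """Return True if a (hypothetical) conditional rule is active.

    rule        -- dict that may carry 'scope_contains' and/or
                   'scope_contains_not' (both hypothetical keys)
    scope_stack -- list of scope names currently in effect
    """
    required = rule.get("scope_contains", ())
    forbidden = rule.get("scope_contains_not", ())
    # Assumption: with 'scope_contains', at least one listed scope
    # must be present in the current scope stack.
    if required and not any(s in scope_stack for s in required):
        return False
    # With 'scope_contains_not', none of the listed scopes may be present.
    if any(s in scope_stack for s in forbidden):
        return False
    return True


# Example rules mirroring the two snippets above.
variable_rule = {"scope_contains": ("source.perl", "source.ruby")}
non_perl_rule = {"scope_contains_not": ("source.perl",)}

# Example scope stacks: a regexp embedded in Perl and in Python.
perl_stack = ["source.perl", "string.regexp"]
python_stack = ["source.python", "string.regexp"]
```

With these definitions, variable_rule would be active inside a Perl
regexp but not inside a Python one, and non_perl_rule the other way
around.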
# Names and coloring #
Currently I have defined a 'private' namespace for the regular
expressions, with names like
string.regexp.escaped_char.newline
string.regexp.posix_bracket.alnum
string.regexp.quantifier.greedy.0_up_to_n
The downside is that lots of new colors would have to be defined in the
themes to make use of this. So I wonder if I should be using things like
string.newline
string.octal
keyword.operator
On the other hand, while going through a more complex regex it is
great to press Ctrl-Shift-P and see
'string.regexp.quantifier.reluctant.1_or_more' or some such to explain
what is happening at that point in the regex.
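As far as I understand, a theme selector applies to any scope name it
is a dot-separated prefix of, so the detailed names would still pick up
whatever a theme defines for plain 'string'; only the finer distinctions
would need new colors. A minimal Python sketch of that assumed matching
rule (selector_matches is a made-up helper, and real scope selectors
support more than this simple prefix test):

```python
def selector_matches(selector, scope):
    """True if `selector` equals `scope` or is a dot-separated prefix of it."""
    return scope == selector or scope.startswith(selector + ".")


# One of the detailed names from the 'private' namespace above.
detailed = "string.regexp.quantifier.greedy.0_up_to_n"

# A generic theme rule for 'string' would cover the detailed name,
# while an unrelated rule such as 'keyword.operator' would not.
covered_by_string = selector_matches("string", detailed)
covered_by_keyword = selector_matches("keyword.operator", detailed)
```

Under this rule the prefix has to end on a dot boundary, so 'string.reg'
would not match 'string.regexp'.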
Any suggestions?
# Include and Match/Captures #
Sometimes there are constructs where a match would be much better
suited than begin and end, but I want to include something. A (not
quite correct but you get the idea) example:
(red|green)
I wish I could write a pattern as follows:
match = '\((.+)\|(.+)\)';
name = 'string.regexp.alternation';
captures = {
    1 = { include = '$self'; };
    2 = { include = '$self'; };
};
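To show what the two captures would hold, here is a small Python check
of the same expression. Python's re module is only a stand-in here
(TextMate uses Oniguruma, but this simple pattern should behave the same
way), and the alternatives helper is made up for the example.

```python
import re

# Python spelling of the match from the pattern above.
ALTERNATION = re.compile(r'\((.+)\|(.+)\)')


def alternatives(text):
    """Return the two captured alternatives, or None if there is no match."""
    m = ALTERNATION.search(text)
    return m.groups() if m else None


print(alternatives('(red|green)'))       # ('red', 'green')
print(alternatives('(red|green|blue)'))  # ('red|green', 'blue')
```

The second result is the "not quite correct" part: the greedy first
group swallows 'red|green', which is exactly why each capture would need
to be run through the grammar again.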
Is there any workaround for patterns like these?
Thanks
Gerd