On Oct 22, 2012, at 9:11 PM, Corey Johnson <cj(a)github.com> wrote:
I'm trying to understand how the single line
comment rule works […]
begin = '(^[ \t]+)?(?=//)';
end = '(?!\G)';
[…] Since TextMate grammars are line based, I'm not sure how it's possible for
the '\n' pattern and the '(?!\G)' pattern to work together.
The regexp is matched against a single line, true, but that line will include its trailing
newline (unless it is the last line in the document).
[…] the end pattern '(?!\G)' will match the
'a' in 'var'.
Stuff wrapped in (?=…), (?!…), (?<=…), and (?<!…) are “look around” assertions. They
will peek at the characters after or before the current matching position, but they will
not consume them.
So a rule like: { match = 'ba(?=r)'; } will match ‘ba’ when followed by ‘r’, but
leave the ‘r’ to be potentially matched by a new rule.
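
To make the look-ahead behavior concrete, here is a small sketch in Python. Python's re module is only a stand-in here (TextMate uses a different regexp engine), but the (?=…) semantics shown are the same; the variable names are just for illustration.

```python
import re

# 'ba' followed by 'r': the look-ahead checks for the 'r' without consuming it.
m = re.search(r'ba(?=r)', 'bar')
lookahead_span = m.span()          # the match covers only 'ba'

# Because the 'r' was not consumed, another pattern can still match it,
# starting exactly where the first match stopped.
rest = re.compile(r'r').match('bar', m.end())
rest_span = rest.span()

# With no 'r' following, the assertion fails and nothing matches.
no_match = re.search(r'ba(?=r)', 'baz')
```

The first match stops before the 'r', which is what lets a later rule pick it up.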
[…] This fails as I described in TextMate 1 but works
in TextMate 2. Is there something fundamentally different about using the \G anchor in
TextMate 2?
Yes, in 1.x \G was undefined, in 2.0 it has a defined behavior, which brings us to Nathan
Sobo’s question:
How is \G defined in TextMate grammars?
It should be the end of the previous match (as you suggest).
For a begin/end rule, the end rule’s ‘\G’ will match where the begin rule ended. In the
line comment rule we use a negative look-ahead assertion (?!\G) requiring that the end
rule is *not* where the begin rule stopped.
This is because the begin pattern can match zero characters (it matches optional leading
whitespace and does a look-ahead on the two forward slashes, but does not match them), so
in the case of no leading whitespace, it will not match any characters. In this case, we
do not want the end rule to match immediately, as that would also match zero characters, and we
would thus end up with an infinite loop.
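
The zero-width problem can be illustrated with a toy model in Python. This is only a sketch under stated assumptions: Python's re module has no \G, so the anchor is emulated by comparing positions, and the hypothetical helper first_end_position ignores nested patterns, which in the real grammar would consume the comment before the end rule gets its chance. It is not how TextMate's parser is implemented.

```python
import re

# The begin pattern from the line comment rule: optional leading whitespace,
# then a look-ahead on '//' that does not consume the slashes.
BEGIN = re.compile(r'(^[ \t]+)?(?=//)')

def first_end_position(line, guard=True):
    # Hypothetical helper: returns (anchor, end_pos) for one line, where
    # anchor is the position at which the begin match stopped (what \G
    # would refer to) and end_pos is the first position at which a
    # zero-width end pattern is allowed to match.
    m = BEGIN.search(line)
    if m is None:
        return None
    anchor = m.end()
    for pos in range(anchor, len(line) + 1):
        # guard=True emulates (?!\G): refuse to match at the anchor itself.
        if not guard or pos != anchor:
            return anchor, pos
    return None

with_guard = first_end_position('// foo\n')                  # end is past the anchor
without_guard = first_end_position('// foo\n', guard=False)  # end is at the anchor
```

Without the guard, begin and end both match at the same position without consuming anything, so the rule could open and close there repeatedly without advancing. With the guard, the end cannot match until the position has moved on.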