[TxMt] RegExp n00b
Allan Odgaard
allan at macromates.com
Tue Sep 13 12:52:42 UTC 2005
On 13/09/2005, at 14.27, Andreas Wahlin wrote:
> I've been trying some time now with the javascript bundle, I get
> almost everything after your little help there Allan :)
Good to hear!
> foldingStartMarker = "^\\s*([A-Za-z0-9.]+s*=\\s*)?(function)\\b";
>
> what does the = sign mean?
That's a literal match, so no special meaning.
> Does \\b mean ending bracket?
No, it's a word boundary. It asserts that one side of the position is
a word character and the other side is not (or is the start or end of
the text). Here the previous character is a word character, so the
next character must not be one.
This is required because if e.g. we want to match the start of a bold
tag, we'd do:
<b
but that would also match <body or anything else starting with b, so
instead we do:
<b\\b
The \\b isn't matching any characters per se, but is an assertion.
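To see the assertion in action outside TextMate, here is a minimal sketch using Python's re module. It assumes the doubled backslash in the grammar is only string escaping, so the engine sees <b\b. The sample tags are made up for illustration.

```python
import re

# In the grammar the pattern is written "<b\\b" inside a quoted string;
# the regex engine itself sees <b\b (assumption: the doubling is escaping).
plain = re.compile(r"<b")
bounded = re.compile(r"<b\b")

# Illustrative inputs, not taken from any bundle.
samples = ["<b>", "<b class='x'>", "<body>", "<br>"]

plain_hits = [s for s in samples if plain.match(s)]
bounded_hits = [s for s in samples if bounded.match(s)]

print(plain_hits)    # every sample starts with "<b"
print(bounded_hits)  # only the real bold tags survive the \b check

# \b consumes nothing: the match ends right after the "b".
boundary_match = bounded.match("<b>")
print(boundary_match.end())
```

The match on "<b>" ends at offset 2, which shows that \b only checks the position and does not swallow the ">".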
> And why isn't the s there escaped, or should it match the letter s
> how many times you want (considering the * after it)?
It's definitely a bug; the s should have been escaped (\\s*). As
written, s* matches zero or more literal s characters :)
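A quick way to see what the bug costs, sketched in Python's re module rather than TextMate's own engine (an assumption, though the constructs used here behave the same in both). The "fixed" pattern is simply the quoted one with the s escaped, and the sample lines are invented.

```python
import re

# Pattern as quoted above (unescaped s) and with the s escaped.
buggy = re.compile(r"^\s*([A-Za-z0-9.]+s*=\s*)?(function)\b")
fixed = re.compile(r"^\s*([A-Za-z0-9.]+\s*=\s*)?(function)\b")

# Illustrative JavaScript lines.
lines = [
    "function foo() {",
    "foo= function() {",
    "foo = function() {",
    "obj.bar = function() {",
]

buggy_hits = [line for line in lines if buggy.search(line)]
fixed_hits = [line for line in lines if fixed.search(line)]

print(buggy_hits)  # misses assignments with a space before "="
print(fixed_hits)  # folds all four
```

With the unescaped s, an assignment only starts a fold when there is no whitespace between the name and the = sign.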
> match = "^\\s*(function)\\s*([a-zA-Z_]\\w*)\\s*\\(([^)]*)\\)";
>
> This one I get almost completely, except the ([^)]*) part. My only
> guess is that it means something like how many )'s you want at the
> end of the string or something, but that hardly seems necessary.
The brackets form a character class, which can list single characters
as well as ranges.
[)] will match ), so [^)] will match anything but ). I.e. [^)]*
matches everything up to the first ). Since we match the actual )
right after it, we could also have used a non-greedy match:
match = "^\\s*(function)\\s*([a-zA-Z_]\\w*)\\s*\\((.*?)\\)";
So given: “function foo (...)” it matches the ... part, and the ...
part is not allowed to contain any )'s.
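The two variants can be compared side by side. This is a sketch in Python's re module, assuming single-line input as in a grammar rule. The greedy (.*) version is not from the bundle; it is added only as a contrast to show why the match has to stop at the first ).

```python
import re

# The two patterns from above, with the string escaping removed.
negated = re.compile(r"^\s*(function)\s*([a-zA-Z_]\w*)\s*\(([^)]*)\)")
lazy = re.compile(r"^\s*(function)\s*([a-zA-Z_]\w*)\s*\((.*?)\)")
# Contrast case (not in the bundle): a greedy .* runs to the LAST ")".
greedy = re.compile(r"^\s*(function)\s*([a-zA-Z_]\w*)\s*\((.*)\)")

line = "function foo (a, b) { return bar(a); }"

negated_groups = negated.match(line).groups()
lazy_groups = lazy.match(line).groups()
greedy_params = greedy.match(line).group(3)

print(negated_groups)  # ('function', 'foo', 'a, b')
print(lazy_groups)     # same three captures
print(greedy_params)   # overshoots into the function body
```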
> Also, is it the matching of meta.function.js that dictates matches
> in command+shift+t (go to symbol)?
Partially, yes.
If you look at the rule, you'll notice it has:
captures = {
1 = { name = "storage.type.function.js"; };
2 = { name = "entity.name.function.js"; };
3 = { name = "variable.parameter.function.js"; };
};
These are assigning names to the 3 captures in the regexp (i.e. the
function keyword, the actual name of the function, and the ... part
in parentheses).
If you place the caret on each of these parts (in a JavaScript
source) and press ctrl-shift-P, you'll be able to verify this.
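As a rough model of what the captures table does, the sketch below pairs each capture group with its name. It uses Python's re module, and the helper name_captures is hypothetical; TextMate's actual parser is of course more involved than this.

```python
import re

# The match rule from above, with the string escaping removed.
rule = re.compile(r"^\s*(function)\s*([a-zA-Z_]\w*)\s*\(([^)]*)\)")

# Same numbering and names as the captures table in the rule.
captures = {
    1: "storage.type.function.js",
    2: "entity.name.function.js",
    3: "variable.parameter.function.js",
}

def name_captures(line):
    """Hypothetical helper: return (scope name, matched text) pairs."""
    m = rule.match(line)
    if m is None:
        return []
    return [(captures[i], m.group(i)) for i in sorted(captures)]

named = name_captures("function foo (a, b) {")
for scope, text in named:
    print(scope, "->", repr(text))
```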
Now if you go to the Source bundle (in the Bundle Editor) and look at
the Symbol List preferences item, the actual preference is:
{ showInSymbolList = "1"; }
And the scope selector of that item is:
entity.name.function, meta.toc-list
This means that every scope matched by that scope selector gets
showInSymbolList enabled. This is what causes anything marked up as
entity.name.function in JavaScript to appear in the popup list.
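The selection step can be approximated as a prefix test on dotted scope names. The sketch below is a deliberately simplified model: real scope selectors are matched against the whole scope of the caret and support more syntax than a comma-separated list. The function selector_matches and the meta.toc-list.id.html scope name are hypothetical.

```python
def selector_matches(selector, scope):
    """Simplified model: a comma separates alternatives, and each
    alternative matches a scope name it equals or is a dotted prefix of."""
    for alternative in selector.split(","):
        prefix = alternative.strip()
        if scope == prefix or scope.startswith(prefix + "."):
            return True
    return False

# The scope selector of the Symbol List preferences item.
SYMBOL_LIST_SELECTOR = "entity.name.function, meta.toc-list"

scopes = [
    "storage.type.function.js",
    "entity.name.function.js",
    "variable.parameter.function.js",
    "meta.toc-list.id.html",  # hypothetical scope name for illustration
]

in_symbol_list = [s for s in scopes if selector_matches(SYMBOL_LIST_SELECTOR, s)]
print(in_symbol_list)
```

Only the function name and the toc-list entry pass, which is why the keyword and the parameters stay out of the popup.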
If you look in the CSS bundle or HTML bundle, you'll see that there
are additional preference items to place CSS selectors and HTML id
attributes in the symbol list as well (since these are not matched by
the scope selector above). So the entity.name.function name is only a
convention; anything can go in the popup :)