[SVN] Improving the Ruby Syntax
Allan Odgaard
allan at macromates.com
Mon Mar 7 20:37:34 UTC 2005
On Mar 7, 2005, at 20:55, Chris Thomas wrote:
> I say go for it. It's always better to classify things specifically
> where possible. Whether or not the default style sheet colors all of
> the keyword elements the same is a different question, and it probably
> doesn't matter, because the stylesheets will allow full per-user
> customization.
Just so you guys know, to name captures one would do e.g.:
    name = "keywords.functions.method-with-arguments.ruby";
    match = "^\\s*(def\\>)\\s*([.a-zA-Z_?!]+)\\s*\\((.*)\\)";
    captures = {
        2 = { name = "function-name.ruby"; };
        3 = { name = "function-arguments.ruby"; };
    };
For begin/end rules there are beginCaptures and endCaptures, which name
captures in only the begin or only the end match. In the path, a
capture's name comes after the name of the entire match. E.g. the
function name will have the full path:
    source.ruby keywords.functions.method-with-arguments.ruby function-name.ruby
If captures are nested, like:
    name = "test";
    match = "(foo(bar))";
    captures = {
        1 = { name = "foobar"; };
        2 = { name = "bar"; };
    };
Then the bar part will have this path:
test foobar bar
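
As a rough illustration of how such a path could be assembled (this is only a sketch in Python, not how TextMate does it internally; the function name scope_path and its arguments are made up for the example): every named capture that contains a given character contributes its name, outermost first, after the name of the whole match.

```python
import re

def scope_path(rule_name, pattern, captures, text, index):
    """Return the space-separated scope path for the character text[index].

    Hypothetical helper: `captures` maps a group number to its name, like
    the captures dictionary in the rule above.
    """
    m = re.search(pattern, text)
    if m is None or not (m.start() <= index < m.end()):
        return ""
    path = [rule_name]
    # Groups are numbered by their opening parenthesis, so an enclosing
    # group always has a lower number than the groups nested inside it.
    for number in sorted(captures):
        start, end = m.span(number)  # (-1, -1) if the group did not match
        if start <= index < end:
            path.append(captures[number])
    return " ".join(path)

captures = {1: "foobar", 2: "bar"}
print(scope_path("test", r"(foo(bar))", captures, "foobar", 4))
print(scope_path("test", r"(foo(bar))", captures, "foobar", 1))
```

The first call asks about a character inside "bar", the second about a character inside "foo" only, so the second path lacks the innermost name.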
I made each value in the captures dictionary a dictionary with a name
key, mainly to make it easier for me to handle (so I don't need special
code for captures). It also allows adding more info to captures in
the future if it should ever be needed, but I guess it is a little
redundant...
And Eric, I actually have heredocs working in my current version :)
Though the rule I had to make to match heredocs is a little special:
    name = "comments.heredoc.shell";
    begin = "(?=<<(\\w+))"; end = "^\\1";
    patterns = (
        { begin = "^<<\\w+"; end = "$";
          patterns = ( { include = "source.shell"; } );
        }
    );
What it does is make the begin pattern only a look-ahead assertion
on the delimiter. That way, the delimiter is not eaten when arriving at
the sub-patterns. I then made one sub-pattern that also matches the
delimiter, with end set to end-of-line ($), and this rule has the entire
shell syntax as sub-patterns. So basically, after the actual
<<DELIMITER there will be normal shell highlighting till end-of-line.
In practice this isn't perfect, and it still doesn't handle nested
heredocs (actually it does, but in the reverse order), but I think this
will cover 99% of the situations arising in code.
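
The look-ahead trick can be tried outside the editor. Below is a minimal sketch using Python's re module rather than the grammar engine, so details may differ; parse_heredoc and the dictionary keys are names invented for this example, and the sub-pattern is simulated by slicing up to end-of-line instead of including the shell grammar.

```python
import re

def parse_heredoc(source):
    """Split the first heredoc in `source` the way the rule above would."""
    # Begin pattern from the rule: a pure look-ahead, so the match is
    # zero-width and "<<DELIMITER" is still there for the sub-pattern.
    begin = re.search(r"(?=<<(\w+))", source)
    if begin is None:
        return None
    delimiter = begin.group(1)

    # Sub-pattern: starts at "<<DELIMITER" and ends at end-of-line ($).
    line_end = source.find("\n", begin.start())
    if line_end == -1:
        line_end = len(source)
    shell_part = source[begin.start():line_end]

    # End pattern "^\1": the captured delimiter at the start of a later line.
    end_re = re.compile("^" + re.escape(delimiter), re.MULTILINE)
    end = end_re.search(source, line_end)
    if end is None:
        return None
    return {
        "zero_width": begin.start() == begin.end(),
        "delimiter": delimiter,
        "shell_part": shell_part,
        "body": source[line_end + 1:end.start()],
    }

result = parse_heredoc("cat <<EOF | wc -l\nhello\nworld\nEOF\necho done\n")
print(result)
```

Here shell_part is the text that would get normal shell highlighting, and body is what remains inside the heredoc scope until the delimiter line.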
I made the end pattern able to refer to captures in the begin pattern,
but this means that the end pattern itself cannot refer to its own
captures. The reason for this choice is both technical and practical.
E.g. naming captures in the end pattern would need to take the number
of captures in the begin pattern into account. Also, one currently
cannot use captures in conditions, which might actually be useful, e.g.
to conditionally match the dash in front of the delimiter in the begin
pattern, and allow leading tabs in the end pattern only if the dash was
matched (so for now we'd need two rules, one with and one without the
dash).
My example also doesn't allow the delimiter to be quoted (as it's
just an example).
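
To make the dash case concrete, here is a small sketch in plain Python (not grammar syntax, and not something the current rule format can express) of the conditional behaviour described above. The names BEGIN and end_pattern are hypothetical.

```python
import re

# Begin pattern with an optional dash captured in group 1.
BEGIN = re.compile(r"<<(-?)(\w+)")

def end_pattern(begin_line):
    """Build the end pattern for the heredoc opened on `begin_line`."""
    m = BEGIN.search(begin_line)
    if m is None:
        return None
    dash, delimiter = m.groups()
    # Leading tabs are allowed before the delimiter only for "<<-".
    indent = r"\t*" if dash else ""
    return re.compile("^" + indent + re.escape(delimiter), re.MULTILINE)

print(end_pattern("cat <<-EOF").pattern)
print(end_pattern("cat <<EOF").pattern)
```

Without conditions in the grammar, each of these two end patterns needs its own rule.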
Oh, and Chris, as you can probably see from the patterns above, there
are no longer problems with zero-width matches, so it's not a problem
to match multi-line preprocessor instructions in C.
I currently also run the patterns on the entire source, rather than one
line at a time, but I'm not 100% sure I'll continue to do this. The
problem mostly has to do with having to resume parsing in the middle of
a source: if multi-line matches are allowed, I can't be sure that any
given line is a safe starting point, since a change in line n may
affect a match starting at line n-i (for n, i >= 0).
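
A sketch of the bookkeeping this implies, in Python and under a simplifying assumption: the parser remembers the line spans of the multi-line matches from the previous parse. The function restart_line and the span list are hypothetical, and this does not cover an edit that makes a new multi-line match possible starting at an earlier line, which is exactly the hard part.

```python
def restart_line(multiline_spans, edited_line):
    """Return the earliest line from which parsing must resume.

    `multiline_spans` holds (start_line, end_line) pairs, inclusive,
    for matches found by the previous parse.
    """
    # Any known match covering the edited line has to be re-run from its start.
    starts = [start for start, end in multiline_spans
              if start <= edited_line <= end]
    return min(starts, default=edited_line)

spans = [(2, 5), (8, 8), (10, 14)]
print(restart_line(spans, 4))
print(restart_line(spans, 7))
print(restart_line(spans, 12))
```

An edit inside a known span pushes the restart point back to that span's first line; an edit outside every span can restart at the edited line itself.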
> And the Rails stuff in particular definitely needs to be in a separate
> Rails syntax.
Amen! :)