[TxMt] Re: Symbol list & Language grammar

Martin Kühl martin.kuehl at gmail.com
Wed Nov 30 13:34:08 UTC 2011


On Wed, Nov 30, 2011 at 13:44, Sean T Allen <sean at monkeysnatchbanana.com> wrote:
> I'm trying to work an issue out...
>
> It's a smalltalk grammar for redline smalltalk.
>
> method start with either + or - to indicate class or instance
>
> then all of the following are examples of valid smalltalk methods
>
> string
> at:
> at:put:
>
> textually those would be something like
>
> - string
>
> - at: anIndex
>
> - at: anIndex put: aValue
>
> The current rule I have to match that is:
>
> method_definition = {
> begin = '^-|\+\s';
> end = '$';
> patterns = (
> { match = '((\w+:?)(\s*\w+)?)';
> captures = {
> 2 = { name = 'entity.name.function.instance.smalltalk'; };
> 3 = { name = 'variable.parameter.method.smalltalk'; };
> };
> },
> { include = '#language_elements'; },
> );
> };
>
> The problem really are ones like
>
> at:put:
>
> Keyword methods with more than a single keyword.
>
> Textmate sees it not as
>
> at:put:
>
> But at: & put:
>
> Highlighting works great but the naming isn't correct.
>
> Can I mark
>
> at: anIndex put: aValue
>
> as the function
> then further say anIndex and aValue are method parameters?
>
> then transforming into just at:put:
>
> If that is a reasonable approach. What is the best way to match the variable
> number of keywords as a single run?
> I can get the one at a time in the above approach but I can't figure out a
> way to match all at once and still mark the individual method parameters.

The way to do this is by having a scope that matches the whole method
"head" (`at: anIndex put: aValue`, say) and add that to the symbol list
with a transformation removing the parameters.
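
To make the "transformation removing the parameters" concrete, here is a rough sketch of the substitutions involved, written in Python only so it can be tried outside the editor. In TextMate itself they would be expressed as regex substitutions in the symbol list preferences (`symbolTransformation`). The function name `symbol_name` and the exact patterns are illustrative assumptions, not something taken from an existing bundle.

```python
import re

def symbol_name(head: str) -> str:
    """Reduce a method head such as 'at: anIndex put: aValue' to 'at:put:'."""
    # Drop each parameter name: a word that follows a keyword's colon
    # and is not itself a keyword (i.e. is not followed by a colon).
    without_params = re.sub(r"(?<=:)\s*\w+\b(?!:)", "", head)
    # Join the remaining keyword parts by removing all whitespace.
    return re.sub(r"\s+", "", without_params)

for head in ("string", "at: anIndex", "at: anIndex put: aValue"):
    print(head, "->", symbol_name(head))
```

Unary selectors like `string` pass through unchanged because nothing in them follows a colon.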

This is not generally a trivial task, but it can be done if you can
either find a simpler matcher (usually a match instead of a begin/end
pair) or a clean separator between a method's head and body (the "hard"
way), or if you're fine without having separate scopes for method head,
body and (complete) definition.

In your example, if you can do without the `#language_elements`
inclusion you could try a matcher like
    ^([-+])\s+((\w+:?)(\s*\w+)?\s*)+
to match the whole line at once and assign some scope to capture 0.
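
A pattern of this shape can be sanity-checked outside TextMate with Python's `re` module, which behaves like Oniguruma for these simple constructs. The sketch below assumes the repeated group also consumes trailing whitespace (`\s*`), so that successive keyword parts separated by spaces are all picked up. The name `HEAD` and the sample lines are made up for illustration.

```python
import re

# One marker (- or +), whitespace, then one or more selector parts,
# each optionally followed by a parameter name and trailing whitespace.
HEAD = re.compile(r"^([-+])\s+((\w+:?)(\s*\w+)?\s*)+")

for line in ("- string", "- at: anIndex", "+ at: anIndex put: aValue"):
    m = HEAD.match(line)
    # group(0) is the whole method head, group(1) the class/instance marker.
    print(repr(m.group(0)), "marker:", m.group(1))
```

Note that a repeated capture group only remembers its last iteration, so the individual keywords and parameters cannot each be scoped from this single match.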

I'll have to document the "hard" way properly some day, but the gist is
that you can match a construct of the form `X stuff Y more Z` with a pair
of begin/end matchers for X and Z, and then include nested begin/end
matchers for X to Y and Y to Z using lots of lookaheads and -behinds.
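
The role of the lookaheads and lookbehinds can be shown with a toy example in Python. This is only a sketch of the zero-width idea on the literal string `X stuff Y more Z`. It is not how the grammar rules themselves are written, and the variable names are invented for the illustration.

```python
import re

text = "X stuff Y more Z"

# Outer construct: begins at X and ends at Z, consuming both delimiters.
outer = re.search(r"X.*Z", text)

# Nested parts: lookbehind/lookahead assert the delimiters without
# consuming them, so each delimiter stays available to its neighbour.
x_to_y = re.search(r"(?<=X).*?(?=Y)", text)
y_to_z = re.search(r"(?<=Y).*?(?=Z)", text)

print(repr(outer.group(0)))
print(repr(x_to_y.group(0)))
print(repr(y_to_z.group(0)))
```

Because the inner patterns never consume X, Y or Z, the same delimiter can end one nested region and start the next, which is what lets the nested matchers share boundaries inside the outer one.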

An example is included in my Maude grammar[1] as the `#operator` rule
with a matcher for the whole operator construct and nested matchers for
its range, domain and definition.

Hope that helps anyway?

Cheers,
Martin

[1]: https://github.com/mkhl/maude.tmbundle/blob/master/Syntaxes/Maude.tmLanguage

