Composable grammars

Jacob Carlborg

6 Apr 2016

10:14 p.m.

I've been working for quite a while on rewriting the grammar for the D bundle to match the official grammar more closely. The grammar for D is quite complex, and the TextMate grammar syntax doesn't offer any good way to reuse or compose rules, which makes such a grammar very difficult to describe. I know it's possible to reuse rules with the repository, but that seems to be mostly useful when matching with "begin" and "end".

For example, this is the grammar for a function declaration from the official D grammar:

FuncDeclaration:
    StorageClasses(opt) BasicType FuncDeclarator FunctionBody
    AutoFuncDeclaration

AutoFuncDeclaration:
    StorageClasses Identifier FuncDeclaratorSuffix FunctionBody

FuncDeclarator:
    BasicType2(opt) Identifier FuncDeclaratorSuffix

FuncDeclaratorSuffix:
    Parameters MemberFunctionAttributes(opt)
    TemplateParameters Parameters MemberFunctionAttributes(opt) Constraint(opt)

Each of these parts/rules of the grammar consists of several other rules, many levels deep.

It would be really nice if the TextMate grammar syntax allowed, somehow, to define rules, or parts of a rule, which the other rules can be composed of, similar to above.

Or is there already a way to do something similar with the current syntax?

-- /Jacob Carlborg

fukurokujo

6 Apr

10:31 p.m.

The whole grammar system should really be changed to some BNF, EBNF kinda parser system. Regex should never be used for grammars.

textmate mailing list textmate@lists.macromates.com http://lists.macromates.com/listinfo/textmate

Allan Odgaard

10 Apr

4:37 p.m.

On 7 Apr 2016, at 3:31, fukurokujo wrote:

> The whole grammar system should really be changed to some BNF, EBNF kinda parser system. Regex should never be used for grammars.

So how would you define a token, if not by using a regex?
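To make the question concrete: parser generators commonly split the work in two, with regular expressions describing individual tokens and BNF-style rules describing how tokens combine. The following Python sketch shows only the token half. The token classes and names are hypothetical, chosen for illustration, and do not come from TextMate or from the D grammar.

```python
import re

# Hypothetical token classes for a tiny C-like fragment (illustrative only).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("PUNCT", r"[(),;{}]"),
    ("SKIP", r"\s+"),
]

# One alternation with a named group per token class.
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)


def tokenize(text):
    """Return (kind, lexeme) pairs for text; whitespace is dropped."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at {pos}")
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `tokenize("int foo(int x);")` yields a flat list of identifier and punctuation tokens; a separate set of grammar rules would still be needed to recognise that list as a function prototype.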

Martin Kühl

7 Apr

11:27 a.m.

I tried to approximate something like this with my Maude grammar[1]. It's been a while, but I believe the basic idea was to split scopes into parts using lookaheads and -behinds and to include specific rules there.

Take #equation as an example: An equation begins with `eq` and ends with `.`, but in between I can differentiate the left hand side from the right hand side using (entirely too many) lookarounds.

Of course this only works because I can use `==` as a kind of anchor in between. I'm assuming most rules of the grammar you quoted above don't include any meaningful symbols but are mostly one (or, worse, possibly several) words. So this approach probably won't help you that much. It's just as far as I got.
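The lookaround idea described above can be sketched in Python. This is a minimal illustration, not the actual Maude grammar: it assumes a toy single-line form `eq <lhs> == <rhs> .`, uses Python's `re` module rather than TextMate's regex engine, and the names `LHS`, `RHS` and `split_equation` are made up for the example.

```python
import re

# Assumed toy syntax: "eq <lhs> == <rhs> ." on a single line.
# The left-hand side sits between "eq " and " == ".
LHS = re.compile(r"(?<=eq\s)(.+?)(?=\s==\s)")
# The right-hand side sits between " == " and the closing " .".
RHS = re.compile(r"(?<=\s==\s)(.+?)(?=\s\.$)")


def split_equation(line):
    """Return the two sides of a toy equation, or None if no anchor is found."""
    lhs = LHS.search(line)
    rhs = RHS.search(line)
    if lhs is None or rhs is None:
        return None
    return {"lhs": lhs.group(1), "rhs": rhs.group(1)}
```

Both patterns depend on the literal `==` between the two sides. For input without such a symbol, for example `eq f(X) g(X) .`, `split_equation` returns `None`, which is the limitation described above.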

Cheers, Martin

[1]: https://github.com/mkhl/maude.tmbundle/blob/master/Syntaxes/Maude.tmLanguage

Jacob Carlborg

1:48 p.m.

If I understand you correctly, that will not work. I don't have anything to use as an anchor in many places.

-- /Jacob Carlborg

Martin Kühl

4:47 p.m.

Unfortunately, I believe you understand correctly.

Allan Odgaard

10 Apr

4:43 p.m.

On 7 Apr 2016, at 3:14, Jacob Carlborg wrote:

> It would be really nice if the TextMate grammar syntax allowed, somehow, to define rules, or parts of a rule, which the other rules can be composed of, similar to above.

Just to be sure I understand, you would like to define somewhere in the grammar how e.g. an identifier looks (regex) and then in other patterns, say a function prototype, be able to insert this definition (in the full regex for the function prototype)?

If so, have a look at this proposal: https://github.com/textmate/textmate/pull/1276#issuecomment-63450941 and let me know if that fits your needs.
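As a rough illustration of the idea in the question above (define a fragment once, then insert it into larger patterns), here is a Python sketch. The `{{name}}` placeholder syntax, the `VARIABLES` table and the `expand` helper are assumptions made up for this example; they are not taken from the linked proposal and are not TextMate features.

```python
import re

# Hypothetical fragment definitions; fragments may refer to other fragments.
VARIABLES = {
    "identifier": r"[A-Za-z_]\w*",
    "type": r"{{identifier}}(?:\s*\*)*",
    "parameter": r"{{type}}\s+{{identifier}}",
}

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def expand(pattern, variables, depth=0):
    """Recursively replace {{name}} placeholders with their definitions."""
    if depth > 20:
        raise ValueError("placeholders nested too deeply (possible cycle)")

    def substitute(match):
        body = expand(variables[match.group(1)], variables, depth + 1)
        # Wrap in a non-capturing group so quantifiers apply to the whole fragment.
        return f"(?:{body})"

    return PLACEHOLDER.sub(substitute, pattern)


# A simplified function prototype composed from the fragments above.
prototype = re.compile(
    expand(r"{{type}}\s+({{identifier}})\s*\(({{parameter}})?\)", VARIABLES)
)
```

With these definitions, `prototype.fullmatch("int foo(int x)")` captures `foo` as the function name and `int x` as the parameter. Every use of a fragment copies its full text into the final regex, so the expanded pattern grows with each level of nesting.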

Jacob Carlborg

6:38 p.m.

On 2016-04-10 16:43, Allan Odgaard wrote:

> Just to be sure I understand, you would like to define somewhere in the grammar how e.g. an identifier looks (regex) and then in other patterns, say a function prototype, be able to insert this definition (in the full regex for the function prototype)?

Kind of.

> If so, have a look at this proposal: https://github.com/textmate/textmate/pull/1276#issuecomment-63450941 and let me know if that fits your needs.

I've looked at that before. It's difficult to tell if it fits my needs without trying it.

It's a question of how well it scales and how it's implemented. I would prefer to define rules in the TextMate grammar that exactly match the rules in the official grammar of a language. That would require many variables, nested many levels deep.

It will not work by just providing some syntax that allows interpolating regular expressions. I tried that. I've even created a full Ruby DSL [1] to describe TextMate grammars. It worked great until TextMate choked on the generated regular expression being too long. I gave up when it took 15 seconds to load the language.

Here's an example using that Ruby DSL to describe an integer literal [2]. That might give you an idea of what I'm after.

[1] https://github.com/jacob-carlborg/tm_grammar/tree/dev
[2] https://github.com/jacob-carlborg/d.tmbundle/blob/reboot_grammar/Syntaxes/ru...
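The length problem with plain interpolation can be illustrated with a small Python sketch. It uses synthetic rules, not the D grammar or the Ruby DSL above; the `{{name}}` placeholder syntax and the helper names are assumptions for illustration, and the numbers say nothing about TextMate's actual limits.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def expand(pattern, variables):
    """Inline every {{name}} placeholder, recursively, as a non-capturing group."""
    return PLACEHOLDER.sub(
        lambda m: "(?:" + expand(variables[m.group(1)], variables) + ")",
        pattern,
    )


def synthetic_rules(levels):
    """Build a chain of rules in which each rule uses the previous one twice."""
    rules = {"rule0": r"\w+"}
    for n in range(1, levels + 1):
        previous = "{{rule%d}}" % (n - 1)
        rules["rule%d" % n] = previous + r"\s*" + previous
    return rules


rules = synthetic_rules(12)
# Length of the fully expanded regex for each nesting level.
lengths = [len(expand(rules["rule%d" % n], rules)) for n in range(13)]
```

Because each rule here inlines the previous rule twice, the expanded length more than doubles with every level. Real grammar rules reference a varying number of sub-rules, but the same multiplication happens whenever fragments are copied in rather than referenced.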

-- /Jacob Carlborg
