Hi,
I am working on the language grammar of my bundle and I am running into problems for the following case:
Given:
=====
Heading
=====
I want to be able to assign a style to the over and underlines and another style to the Heading part.
- "Heading" can contain any characters (but is always on one line).
- The overline is not required. It is usually used to indicate the highest level of a heading, but it can be left out completely.
Characters for the over/underlines can be any non-alphabetic character, but usually one of the following is used:
.,-=+~*"'´|
The goal is to at least be able to style the "Heading" text and to include the "Heading" in the symbol list. Also this is used at the top scope definition level so that I can chop up the text into heading sections and text body sections. I would therefore include repository rules for heading sections and other repository rules for the text body sections.
Because TextMate can't handle multiline regexes I found this problem to be unusually hard.
I am also unclear on questions of order within the top level scope definition. If two grammar patterns apply to one text section equally which one is used?
e.g. if I have a patterns array in the top level:
patterns = ( rule1 = { name = 'scope1', match = ... }; rule2 = { name = 'scope2', match = ... }; )
and both apply equally to some text, is rule1 used and then overwritten by rule2, so that the text ends up under scope 'scope2'?
The help seems to discuss ambiguities like this for theme scope selectors but not for language grammars.
I am also unclear how you can make a regex backreference from an end pattern to a capture group in begin pattern or vice versa. (below I assume it is backslash-1 but I'm not sure).
Also is it possible to define rules for one scope, rules for another and then say everything else is some third scope without needing to define match rules?
Here's what I have so far:
{   scopeName = 'source.sphinx.doc';
    fileTypes = ( 'rst', 'rest', 'txt' );
    patterns = (
        {   contentName = 'meta.doctitle.sphinx';
            begin = '^((?:=|-|_|~|`|#|"|\^|\+|\*){3,})$';
            end = '^(\1)$';
            beginCaptures = { 0 = { name = 'markup.doctitle.overline.sphinx'; }; };
            endCaptures = { 0 = { name = 'markup.doctitle.underline.sphinx'; }; };
        },
        {   name = 'meta.heading.sphinx';
            begin = '^([A-Za-z][^\n]+)$';
            end = '^((?:=|-|_|~|`|#|"|\^|\+|\*){3,})$';
            beginCaptures = { 0 = { name = 'markup.heading.sphinx'; }; };
            endCaptures = { 0 = { name = 'markup.heading.underline.sphinx'; }; };
        },
        {   contentName = 'meta.paragraph.restructuredtext';
            begin = '^(?!=|-|~|`|#|"|\^|\+|\*)([ \t]*)(?=\S)';
            end = '^(?!\1(?=\S))';
        },
    );
};
meta.heading.sphinx is used for
Heading
---------
and meta.doctitle.sphinx should be used for the over/under line Heading case.
A meta.paragraph should then capture everything that isn't meta.heading or meta.doctitle but this has proved problematic to say the least.
I would appreciate it if anyone could share a trick to make this work, or tell me whether it is even possible at the moment.
Thank you very much for reading!
Andre
On 19 Dec 2010, at 06:41, AndreBerg wrote:
[…] Because TextMate can't handle multiline regexes I found this problem to be unusually hard.
Indeed — it is not possible to handle a line when how it should be parsed is determined on a later line in that file.
I am also unclear on questions of order within the top level scope definition. If two grammar patterns apply to one text section equally which one is used?
The first one listed in the patterns array.
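A minimal sketch of that ordering, assuming two made-up rules (the scope names and regexes are only for illustration and come from no existing bundle). On a line such as `=====` both rules can match at the same position, and the one listed first is the one applied; the second rule only gets the lines the first one does not match.

```
patterns = (
    {   comment = 'listed first: wins when both rules match at the same position';
        name = 'markup.heading.underline.example';
        match = '^={3,}$';
    },
    {   comment = 'listed second: only applies where the rule above does not match';
        name = 'meta.line.example';
        match = '^.+$';
    },
);
```

So the text does not end up under both scopes, and the second rule does not overwrite the first.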
I am also unclear how you can make a regex backreference from an end pattern to a capture group in begin pattern or vice versa. (below I assume it is backslash-1 but I'm not sure).
Are you asking about: { begin = '<(.*?)>'; end = '</\1>'; } style rules?
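Applied to an overline/underline pair, that style of rule could look roughly like the sketch below. The scope names are hypothetical and the rule only handles `=` lines; the point is just that `\1` in the end pattern refers to capture group 1 of the begin pattern, so the underline has to repeat the overline exactly.

```
{   comment = 'hypothetical example: the end pattern reuses capture 1 from begin';
    name = 'meta.doctitle.example';
    begin = '^(={3,})$';
    end = '^\1$';
    beginCaptures = { 1 = { name = 'markup.doctitle.overline.example'; }; };
    endCaptures = { 0 = { name = 'markup.doctitle.underline.example'; }; };
}
```

Keep in mind that a begin/end rule stays open until its end pattern matches, so a missing underline would extend this scope to the end of the document.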
Also is it possible to define rules for one scope, rules for another and then say everything else is some third scope without needing to define match rules?
No, you would need to make a fallback/complement rule.
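One way such a complement rule might be written, as a sketch only: the scope names are invented, and the character class covers just three underline characters to keep the example short. The negative lookahead makes the second rule match every non-empty line that the first rule would not.

```
patterns = (
    {   comment = 'hypothetical rule for underline-style lines';
        name = 'markup.heading.underline.example';
        match = '^[=\-~]{3,}$';
    },
    {   comment = 'complement: any non-empty line that is not an underline';
        name = 'meta.paragraph.example';
        match = '^(?![=\-~]{3,}$).+$';
    },
);
```

The more specific rules you have, the longer the lookahead in the complement rule gets, since it has to exclude each of them explicitly.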
On 21 Dec 2010, at 10:46, AllanOdgaard-4 wrote:
Indeed — it is not possible to handle a line when how it should be parsed is determined on a later line in that file.
Thank you for clarifying that further.
That picture had already emerged from my many tests (and the help), but it's good to have additional confirmation.
Are you asking about: { begin = '<(.*?)>'; end = '</\1>'; } style rules?
Exactly. I assume this is from the HTML bundle. I hadn't thought of looking there (though I did consult other bundles of course). This already answers my question.
No, you would need to make a fallback/complement rule.
I see. My initial goal was to split everything up at the top level into meta sections and then apply inline rules to everything that isn't meta.heading or meta.doctitle, but you pretty much confirmed above that this is impossible at the moment. The reStructuredText syntax is just too ambiguous w.r.t. headings and so on if you look at single lines only.
I'll have to assume everything is body text then, and style other, more unambiguous constructs.
Maybe I can make a macro that uses a multiline regex in the find dialog to jump between headings or back up all the way to the doctitle. This would serve as a crude replacement for a missing symbol list of headings.
No need to reply if you don't want to. I'll simply try it.
In any case, thank you for not being one of the tl;dr people :)
André