On Aug 18, 2012, at 7:17 PM, Gerd Knops wrote:
- Right now it seems injection appends to the patterns. Would it be possible to also have 'early injection' where the new patterns are prepended to the patterns? I am experimenting with special comment sections like /*H: */, and can't seem to do this with injection. (Another example application: Headerdoc/Docbook comments /** */ etc).
I am not following why this would work for appending but not prepending… is this because you want to replace the _entire_ comment rule, as opposed to injecting the grammar into the comment?
Right. Here is a simple example: I like to use line comments '//' that begin at the start of the line as code separators, so I draw the line background in a different color, so basically
match = "^//.*\n?";
This doesn't work with injections, because the 'regular' grammar already gobbles up any '//', so the injected grammar never sees it.
Similar for docbook comments: an injected grammar never sees the '/**' comments.
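The ordering problem described above can be sketched in a few lines of Python. This is only a simplified model, not TextMate's actual parser: it assumes the rule whose match starts earliest wins and that ties go to whichever rule is listed first. The `first_scope` helper and the `comment.line.separator` scope name are hypothetical, made up for illustration.

```python
import re

def first_scope(line, rules):
    """Return the scope of the rule that wins on `line`.

    Simplified model (an assumption): the earliest match wins, and on a
    tie the rule that appears first in the list wins.
    """
    best = None
    for index, (scope, pattern) in enumerate(rules):
        m = re.search(pattern, line)
        if m is None:
            continue
        key = (m.start(), index)
        if best is None or key < best[0]:
            best = (key, scope)
    return best[1] if best else None

# The language's own line comment rule.
base_grammar = [("comment.line.double-slash", r"//.*\n?")]
# The injected rule from the example above (scope name is hypothetical).
injected = [("comment.line.separator", r"^//.*\n?")]

appended = base_grammar + injected    # injection appended to the patterns
prepended = injected + base_grammar   # 'early injection'

line = "// ---- section ----\n"
print(first_scope(line, appended))    # comment.line.double-slash
print(first_scope(line, prepended))   # comment.line.separator
```

Both rules match at column 0, so in this model the list order alone decides which one claims the line.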
As for DocBook, you can inject into ‘comment.block’ and make the grammar:
{ patterns = ( { begin = '\G\*'; end = '(?=\*/)'; name = 'text.docbook'; patterns = ( … ); }, ); }
Here \G anchors to the end of the parent rule’s ‘begin’ match, i.e. directly after ‘/*’, so the rule only applies to comments that open with ‘/**’.
I don’t disagree on maybe changing the order of append/prepend, but there is a good case to be made for keeping as much parsing as possible in the original grammar.
By doing it that way, my rule above works with all comment types and can safely be injected into ‘comment.block’ rather than having to specify a list of source scopes that use ‘/* … */’ comments (i.e. if injected into ‘source’ you’d suddenly have rules for matching such comments in languages that do not support them).
As for the line comment rule, I am not sure whether there is a case for changing how our “canonical” line comment rule is written, perhaps matching ‘//’ with a look-ahead so that the injected rule can re-parse it (and test it with ^), or whether injected rules should instead run on the _entire_ scope rather than starting where the ‘begin’ match stopped.
My goal is to keep this generalized, so your injected rule to add a scope to line comments without leading whitespace would not just add a rule for double-slash comments (as not all languages may support this, and having to provide a list of which do, in the injection scope selector, fails to exploit the semantic value of scoping).
Another example is leading whitespace: I like to have leading tabs highlighted with alternating light backgrounds, like so:
<PastedGraphic-2.png>
I would LOVE to use injection for that, but unfortunately many definitions eat up leading space. So my only recourse is to write custom language grammars for ALL languages I want to have this feature in (e.g. ALL languages I use): start with my rules (so they get first dibs), then include the actual language.
I fear though that if you were to inject the whitespace rule you would break these definitions that also parse the whitespace, as many do things like ‘^\s*«keyword»’. Either way, I think we need to “fix” these grammars to (generally) not parse the leading whitespace. I’m actually not sure why so many rules do this, and in some languages it’s *clearly* a bug: for example, PHP’s ‘return’ keyword has the leading whitespace included not only in the match but also in the scope, meaning word selection/movement is wrong (as that leading whitespace is assigned the ‘keyword’ character class).
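To make the leading-whitespace issue concrete, here is a small Python sketch using plain regular expressions rather than a real grammar. It compares a pattern in the ‘^\s*«keyword»’ style with a variant that puts only the keyword in a capture group. The sample line and variable names are made up, and the capture-group variant is just one possible way to narrow what gets scoped, not necessarily how the bundled grammars should be fixed.

```python
import re

line = "\t\treturn $value;"

# Pattern in the style described above: the leading whitespace is part
# of the match, so it would receive the keyword scope too.
greedy = re.match(r"^\s*return\b", line)

# Variant: still anchored at the line start, but only the capture group
# (the keyword itself) would be scoped.
captured = re.match(r"^\s*(return)\b", line)

print(repr(greedy.group(0)))  # '\t\treturn'
print(captured.span(1))       # (2, 8)
```

In the first case the two tabs are part of the matched text, which is the situation where word selection ends up including the indentation.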