Re: [TxMt] RegExp n00b

14 Sep 2005


      On 13/09/2005, at 21.35, Andreas Wahlin wrote:
...
I tried to add this, after the normal tag declaration (that is,  
almost first in the language)
Not sure which version of TM you're using. A recent revision (i.e.
nightly build) changed rule selection from “longest match” to simply
the first rule which matches. Thus your specific rules should be
_above_ the generic catch-all rules.
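
To illustrate why order matters once the first matching rule wins, here is a small Python sketch. It is not how TextMate is implemented: the rule lists, the first_matching_rule helper and the generic scope name are hypothetical, and Python's re module stands in for TextMate's regexp engine.

```python
import re

# Hypothetical, simplified rule lists: (scope name, pattern) pairs in grammar order.
GENERIC_FIRST = [
    ("entity.name.tag.html", r"<(\w+)"),        # generic catch-all tag rule
    ("entity.name.img.tag.html", r"<(img)\b"),  # specific img rule
]
SPECIFIC_FIRST = list(reversed(GENERIC_FIRST))


def first_matching_rule(rules, text, pos=0):
    """Return the scope of the first rule (in list order) that matches at pos."""
    for scope, pattern in rules:
        if re.compile(pattern).match(text, pos):
            return scope
    return None


print(first_matching_rule(GENERIC_FIRST, '<img src="a.png">'))   # entity.name.tag.html
print(first_matching_rule(SPECIFIC_FIRST, '<img src="a.png">'))  # entity.name.img.tag.html
```

With the generic rule listed first, the img rule is never reached, which matches the symptom described.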
...
[...] but I tried some other stuff and I always got stuck with  
either having img tags or attributes, not both :(
Well, to catch the attributes, there need to be actual rules to
match them.
...
I also do not really get the initial (?i:( part of the original tag  
catching code, shouldn't ?'s be used after something to mean 1 or 0  
occurrences of that very thing? (I also suppose : is some xml  
thingie) Just noted that (?:subexp) means "not captured group", but  
that wouldn't apply here since there's an "i" in between?
These are options, noted in the doc as:
   (?imx-imx:subexp)  option on/off for subexp
The (?i:...) means that ... should be matched case insensitive.
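
A quick way to see the effect of the option group is to try it in a regexp engine. The sketch below uses Python's re module (3.6 or later) as a stand-in, on the assumption that it treats (?i:...) the same way for this simple case; the sample strings are made up.

```python
import re

# (?i:...) turns on case-insensitive matching for the enclosed subexpression only.
scoped = re.compile(r"<(?i:(img))\b")
print(scoped.match("<IMG src='a.png'>").group(1))  # IMG

# Text outside the (?i:...) group is still matched case-sensitively.
mixed = re.compile(r"(?i:abc)def")
print(bool(mixed.fullmatch("ABCdef")))  # True
print(bool(mixed.fullmatch("ABCDEF")))  # False
```

Note that the (?i: part does not capture anything itself; the inner parentheses are what produce capture 1.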
...
DID SOME MORE RESEARCH and found out that perhaps I should rather  
have edited further down and popped in something like
        {   name = "declaration.tag.html";
            begin = "<(img)";
            end = ">";
            captures = { 1 = { name = "entity.name.img.tag.html"; }; };
            patterns = ( { include = "#tag-stuff"; } );
        },
which makes sense given the repository and the tag-stuff, but this  
doesn't seem to do anything either
That rule would mark up an img tag with attributes (you may want to
add \b after img). If it doesn't do so, try moving it to the top of
the grammar (since it may just be that the generic tag rule is used
instead).
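
The reason for the \b is that <(img) also matches the start of any longer tag name beginning with img. A minimal check in Python, used here only as a convenient regexp engine (the <imgmap> tag is a made-up example):

```python
import re

loose = re.compile(r"<(img)")     # begin pattern as written in the rule
strict = re.compile(r"<(img)\b")  # with the suggested word boundary

for tag in ("<img src='a.png'>", "<img>", "<imgmap>"):
    print(tag, bool(loose.match(tag)), bool(strict.match(tag)))
# <img src='a.png'> True True
# <img> True True
# <imgmap> True False
```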
...
[...] what does the
        {   name = "declaration.tag.html";
            match = "<(?i:(head|table|thead|tbody|tfoot|tr|td|div|fieldset|style|script|ul|ol|li|form|dl))\b[^>]*(>)</(\1)>";
            [...]
part do?
This is a special rule to match empty tag pairs like <div></div> with
only one purpose: to give a scope to the position between the start
and end tag, so that return can be overloaded for that position. There
is a snippet in the HTML bundle which inserts two newlines and extra
indent; it is bound to return and to the scope which marks the
position between the two tags.
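
To see what that pattern does and does not match, it can be tried outside TextMate. The sketch below runs the same pattern through Python's re module, which is an assumption of convenience rather than the engine TextMate uses; the gap_between_tags helper is hypothetical and just reports where capture 2 (the closing > of the start tag) ends.

```python
import re

# The pattern from the grammar rule quoted above, on one logical line.
EMPTY_PAIR = re.compile(
    r"<(?i:(head|table|thead|tbody|tfoot|tr|td|div|fieldset|style|script"
    r"|ul|ol|li|form|dl))\b[^>]*(>)</(\1)>"
)


def gap_between_tags(text):
    """Return the index between the start and end tag, or None if no match."""
    m = EMPTY_PAIR.match(text)
    return m.end(2) if m else None


print(gap_between_tags("<div></div>"))      # 5
print(gap_between_tags("<div>text</div>"))  # None: the pair is not empty
print(gap_between_tags("<p></p>"))          # None: p is not in the tag list
```

The \1 backreference is what ties the end tag to the same name as the start tag.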
...
hehe, as a final, light note, I entered the word bar in a file and  
then tried to do a regexp find for both bar\w bar\w (bar)\w (bar)\w  
but neither found it, what was wrong with that?
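Probably you wanted bar\b. \w matches a word character, so bar\w only
matches when another word character follows bar, whereas \b matches
the word boundary itself.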
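
The difference is easy to check in any regexp engine with the usual \w and \b semantics. A small sketch in Python (the variable names are only for illustration):

```python
import re

no_match = re.search(r"bar\w", "bar")    # nothing follows "bar", so \w cannot match
boundary = re.search(r"bar\b", "bar")    # \b matches at the end of the word
followed = re.search(r"bar\w", "bars")   # here "s" satisfies \w

print(no_match)          # None
print(boundary.group())  # bar
print(followed.group())  # bars
```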
