[TxMt] [ANN] Select Balanced HTML Tag!!!1!

Hans-Jörg Bibiko bibiko at eva.mpg.de
Fri Nov 16 13:00:31 UTC 2007


On 16.11.2007, at 13:29, Thomas Aylott - subtleGradient wrote:
> This runs into the problem I'd been having for 3 years.
> How do you get it to work when you have a tag nested inside the  
> same kind of tag?
> Keeping it from matching the first close tag it finds, or the very  
> last one.
> <div>
> 	<div>
> 		<div>
> 			TEXT
> 		</div>
> 	</div>
> </div>

Of course, you're right. That is THE problem! And I don't have a
regexp-based solution for it either.

One approach I have in mind is to write a character-by-character
parser. Once the tag name is known (e.g. 'p'), one can step from the
caret's position to the right, looking for '</p>'. Each '<p...>' found
on the way increments a counter; each '</p>' decrements it. As soon as
the counter drops below 0, I have found my closing tag (that is, its
index). Then do the same from the caret's position to the left, with
the roles of '<p...>' and '</p>' swapped. If this is written in
perl/ruby/... and the entire text is stored as a character array, I can
slice the array between the two indices and end up with the desired
string. With that string I can run a normal findNext and findPrevious
macro.
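
The counter idea above could be sketched roughly as follows. This is
only an illustration in Python, not a TextMate command: the function
name select_balanced and its parameters are made up for the example,
a small regexp is used only to recognise a single tag token (the
nesting itself is handled by the counter), and comments, CDATA and '>'
inside attribute values are ignored.

```python
import re

def select_balanced(text, caret, tag):
    """Return the innermost <tag>...</tag> substring around caret, or None."""
    # Recognises one opening or closing tag of the given name (hypothetical helper).
    token = re.compile(r"<(/?)%s\b[^>]*>" % re.escape(tag), re.IGNORECASE)

    # Step right from the caret: opening tags add 1, closing tags subtract 1.
    depth, pos, end = 0, caret, None
    while end is None:
        i = text.find("<", pos)
        if i == -1:
            return None                      # no balancing close tag
        pos = i + 1
        m = token.match(text, i)
        if m is None or m.group(0).endswith("/>"):
            continue                         # other tag, or self-closing
        depth += -1 if m.group(1) else 1
        if depth < 0:
            end = m.end()                    # index just past our close tag

    # Step left from the caret with the roles swapped.
    depth, pos = 0, caret
    while True:
        i = text.rfind("<", 0, pos)
        if i == -1:
            return None                      # no balancing open tag
        pos = i
        m = token.match(text, i)
        if m is None or m.end() > caret or m.group(0).endswith("/>"):
            continue                         # skip tags that straddle the caret
        depth += 1 if m.group(1) else -1
        if depth < 0:
            return text[i:end]
```

Only the text between the two matching tags is ever touched, so the
cost depends on the size of the enclosing element rather than on the
size of the document.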

I don't know whether it works but ...
Maybe I'll find some time to try it out. The advantage would be that I
don't have to parse the entire document.
Or one could write it in Objective-C as a plug-in, or Allan has a nice
idea for it ;)

On the other hand, I thought about using an external HTML parser.
This works, but the parser is also very slow on a large HTML file. One
could think about restricting the parsed area to 100 lines above and
below the current line, but this is also tricky.
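
For comparison, the whole-document parser variant might look something
like the sketch below. Python's standard html.parser stands in here for
whatever external parser one would actually use; the class
EnclosingElementFinder and the function enclosing_element are
hypothetical names for this example. It always parses the full text,
which is exactly the cost mentioned above.

```python
from html.parser import HTMLParser

class EnclosingElementFinder(HTMLParser):
    """Records the innermost element whose source span contains the caret."""

    def __init__(self, text, caret):
        super().__init__()
        self.text = text
        self.caret = caret
        self.stack = []     # currently open elements as (tag, start_offset)
        self.found = None   # (start, end) of the innermost enclosing element
        # Offset at which each line starts, to convert getpos() results.
        self.line_starts = [0]
        for line in text.split("\n")[:-1]:
            self.line_starts.append(self.line_starts[-1] + len(line) + 1)

    def _offset(self):
        line, column = self.getpos()  # line is 1-based, column is 0-based
        return self.line_starts[line - 1] + column

    def handle_starttag(self, tag, attrs):
        self.stack.append((tag, self._offset()))

    def handle_endtag(self, tag):
        if all(open_tag != tag for open_tag, _ in self.stack):
            return                    # stray close tag, nothing to balance
        end = self.text.find(">", self._offset()) + 1
        # Pop back to the matching open tag; unclosed tags in between are dropped.
        while self.stack:
            open_tag, start = self.stack.pop()
            if open_tag == tag:
                if self.found is None and start <= self.caret <= end:
                    self.found = (start, end)
                break

def enclosing_element(text, caret):
    finder = EnclosingElementFinder(text, caret)
    finder.feed(text)
    finder.close()
    if finder.found is None:
        return None
    return text[finder.found[0]:finder.found[1]]
```

Because elements are closed innermost first, the first closed element
that contains the caret is the one to select. Restricting the input to
a window of lines would feed the parser unbalanced fragments, which is
presumably why that shortcut is tricky.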


Cheers,

--Hans







