[ANN] Select Balanced HTML Tag!!!1!


Thomas Aylott - subtleGradient

15 Nov 2007

6:37 a.m.

YES! You heard it here first folks.

I just added a Select Balanced HTML/XML Tag macro to the Experimental bundle!

I've been trying different ways of making this work since I first started using TextMate back in, what like 2004 or something?

I've tried regex, I've tried Ruby, I've tried everything I could think of... But finally I've come up with something very simple.

The only real downside is that it's EXTREMELY slow. Like 1 second from running the macro until it's done. But that's WAY faster than using the mouse to manually select that same code, or trying to use the keyboard to manually select it.

This is frankly the #1 biggest missing feature of TextMate that directly impacts my life.

If this breaks anything, just remember that it's undoable. Command-Z is your friend. Please also report any and all problems to me with an example of the code that it chokes on.

If you only want to DOWNLOAD this macro alone, just visit this link and double-click the file. http://macromates.com/svn/Bundles/trunk/Bundles/Experimental.tmbundle/Macros...

—Thomas Aylott – subtleGradient—


Saul Rosenbaum

15 Nov

1:11 p.m.

Thomas - very large beer headed your way... I don't really care about the speed, this works a charm..

-saul - visualchutzpah

On Nov 15, 2007, at 1:37 AM, Thomas Aylott - subtleGradient wrote:

...

If you only want to DOWNLOAD this macro alone, just visit this link and double-click the file. http://macromates.com/svn/Bundles/trunk/Bundles/Experimental.tmbundle/Macros/Select%20Balanced%20HTML:XML%20Tag.tmMacro

Kai Janson

1:48 p.m.

Fantastic! Make that two beers -= Very Very Large =- heading your way ... :) Thank you!


For new threads USE THIS: textmate@lists.macromates.com (threading gets destroyed and the universe will collapse if you don't) http://lists.macromates.com/mailman/listinfo/textmate

Ciarán Walsh

4:01 p.m.

On 15 Nov 2007, at 00:37, Thomas Aylott - subtleGradient wrote:

...

I just added a Select Balanced HTML/XML Tag macro to the Experimental bundle!

This is pretty awesome – unfortunately I found one problem. Take the following example:

<table>
  <tr>
    <td>foo</td>
  </tr>
  <tr>
    <td><?php echo $foo->bar ?></td>
  </tr>
</table>

the -> structure trips it up :(

Great job though!

Thomas Aylott - subtleGradient

4:41 p.m.

On Nov 15, 2007, at 11:01 AM, Ciarán Walsh wrote:

...

On 15 Nov 2007, at 00:37, Thomas Aylott - subtleGradient wrote:

...
I just added a Select Balanced HTML/XML Tag macro to the Experimental bundle!

This is pretty awesome – unfortunately I found one problem, take the following example:

<table>
  <tr>
    <td>foo</td>
  </tr>
  <tr>
    <td><?php echo $foo->bar ?></td>
  </tr>
</table>

the -> structure trips it up :(

Great job though!

Good catch. I plan to make it work with mixed source pages soon. Just wanted to get it out there as-is first.

It sure is a pain to edit macros after you've recorded them. (Hint hint Allan)

—Thomas Aylott – subtleGradient—

Saul Rosenbaum

5:04 p.m.

Thomas -

Can't help but notice your file syntax is "HTML Bracketed". What is that? I've never seen that before..

saul


Thomas Aylott - subtleGradient

6:26 p.m.

It's my own little toy wrapper for HTML. It just adds some leading space scopes so that my themes can color it nicely.

—Thomas Aylott – subtleGradient—


Saul Rosenbaum

6:50 p.m.

rock on! -- that's great... much appreciated..


Saul Rosenbaum Visual Chutzpah --------------------------------- Strengthening Brands Through Illustrated Images And Interactive Design

VISIT US ON THE WEB web: http://www.visualchutzpah.com email: lists@visualchutzpah.com illos: http://www.saulrosenbaum.com blog: http://www.saulrosenbaum.com/blog ---------------------------------

Hans-Jörg Bibiko

16 Nov

10:41 a.m.

On 15.11.2007, at 19:50, Saul Rosenbaum wrote:

...

rock on! -- that's great...

First of all: yes, I agree, it's an awesome macro ;)

But I encounter further tiny problems.

1) I'd prefer to not use the local clipboard for that macro

2) The macro doesn't work for HTML like this:

<html>
<head>
  <title>title</title>
  <style type="text/css" media="screen">
    p { color: red; }
  </style>
</head>
<body>
  <p>Test <i>bla</i>sak{j}d(sahdskjdh)sadasl sakd</p>
</body></html>

If you delete <title>..</title> it works. If you leave <title> and delete <style> instead, it also works. BTW: I don't know why.

3) If you place the caret inside '()' or '{}', it matches these brackets first. No problem, just run it again.

4) Using the undo function to go back is awesome too, but it could be a bit confusing if one really wants to undo something. One has to do it at least twice.

5) Sometimes the entire window is scrolled to the right after selecting. This may be caused by the length of some lines if one has switched off 'soft wrap'.

6) Instead of using CARET_WAS_HERE and ≤, ≥ I'd prefer to use unused Unicode code points like \UFFFF0 -\UFFF9

I also tried it myself with TMTOOLS. In principle it works, and it's also faster, but there are other tiny issues.

On the other hand, TM has that selectBlock: function built in. I don't know whether it would be possible for Allan to modify that function in order to get rid of it, because this feature is actually a very essential one, in my opinion.

Nevertheless, it's a very cool approach ;)

Cheers,

--Hans

Thomas Aylott - subtleGradient

11:23 a.m.

Thanks!

Yeah, I've already come across most of this stuff too. I just finally found something that half-worked and wanted to get it out in the world before I got bored or too busy to finish it.

Some sort of native functionality is what I would prefer. It's almost shameful that TextMate doesn't offer this natively.

For the updated version, I'll probly have it escape all brackets in the doc first, and then do another pass to put back only those brackets that are most likely to give the best balancing results. And then do all the business with selections and whatever. That should make it much more robust and capable of handling rhtml and embedded source and stuff.

The current version is only really useful on REALLY clean pages, and then only sometimes ;)

I'd also like to build up some nice testcases to check against anytime I make any changes. I bet I could use TMTOOLS to automate testing since it can move the caret around and run macros.

—Thomas Aylott – subtleGradient—


Hans-Jörg Bibiko

11:38 a.m.

On 16.11.2007, at 12:23, Thomas Aylott - subtleGradient wrote:

...

I'd also like to build up some nice testcases to check against anytime I make any changes. I bet I could use TMTOOLS to automate testing since it can move the caret around and run macros.

I attached a very FIRST approach using TMTOOLS for that (Select Balanced XHTML Tag FIRST APPROACH.tmCommand).

Mainly I took the ruby script 'Insert close Tag' := $TAG [It has some tiny hooks but at the first glance it works quite well]

Then I created a SIMPLE regexp: REGEXP="(?m)<$TAG\b.*?>\/$TAG>" [One has to enhance it later!!]

Then I used that regexp in two macros:
- findNext regexp
- findPrevious regexp

Maybe you find some hints in it ;)

Of course, that command has no 'go backwards' function.

Cheers,

--Hans

Thomas Aylott - subtleGradient

12:29 p.m.

This runs into the problem I'd been having for 3 years. How do you get it to work when you have a tag nested inside the same kind of tag? Keeping it from matching the first close tag it finds, or the very last one.
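The nesting problem described above is easy to reproduce outside TextMate. What follows is only a minimal sketch in Python, not part of the macro or of TMTOOLS: Python's `re` module stands in for Oniguruma, and the two sample strings are made up for illustration. It shows both failure modes: a lazy match stops at the first closing tag, and a greedy match runs on to the very last one.

```python
import re

# Hypothetical sample inputs, not taken from the macro's test data.
nested = "<div> <div> TEXT </div> </div>"
siblings = "<p>one</p> <p>two</p>"

# Lazy quantifier: the match ends at the FIRST closing tag, so the
# outer <div> is paired with the inner </div> and the result is unbalanced.
lazy = re.search(r"<div\b.*?</div>", nested, re.S).group(0)

# Greedy quantifier: the match ends at the LAST closing tag, so two
# separate <p> elements are swallowed as if they were one.
greedy = re.search(r"<p\b.*</p>", siblings, re.S).group(0)

print(lazy)
print(greedy)
```

Neither quantifier can count how many tags are still open, which is why the later suggestions in this thread all add some form of counter.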

I've tried some rather complex regexes, but it just turns out that Oniguruma isn't capable of doing if-tests in a regex. That might be a solution to the problem, but I forget the specifics of how that might work now.

This is one of the paths I'd tried a while ago:

(?xm)
<(?<tag>\w+)   # match the initial opening parenthesis

# Now make a named group 'balanced' which matches
# a balanced substring.
(?<balanced>

  # Look for a single character that isn't the start of an html tag
  [^<]*(?=<)
  |
  # Then look for everything until something that could be an ending tag or nested tag
  .*?(?=</?\k<tag>)
  |
  # or is a full this tag with opening and end:
  <\k<tag>        # A parenthesised string begins with an opening this tag
  \g<balanced>*   # followed by a sequence of balanced substrings
  </\k<tag>>      # and ends with a closing this tag

)*   # Look for a sequence of balanced substrings

</\k<tag>>   # Finally, the outer closing parenthesis

I think the problem with this was that it was crashing TextMate. I don't remember now.

—Thomas Aylott – subtleGradient—


Hans-Jörg Bibiko

1 p.m.

On 16.11.2007, at 13:29, Thomas Aylott - subtleGradient wrote:

...

This runs into the problem I'd been having for 3 years. How do you get it to work when you have a tag nested inside the same kind of tag? Keeping it from matching the first close tag it finds, or the very last one.

<div>
  <div>
    <div>
      TEXT
    </div>
  </div>
</div>

Of course, you're right. That is THE problem! And I also have no solution for it by using regexp.

One way I have in mind is to write a character-by-character parser. Once one has found the tag to be closed (e.g. 'p'), it should be possible to go from the caret's position step by step to the right, looking for '</p>'. If one finds '<p...>' while doing this, a counter is incremented (counter+1); if one finds '</p>', the counter is decremented (counter-1); then if counter < 0, I have found my closing tag (meaning its index). Next, the same is done from the caret's position to the left. If one writes this in Perl/Ruby/... and the entire text is stored as a character array, I can slice the array and finally I have the desired string. With that string I can execute a normal findNext and findPrevious macro.
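The counter idea can be sketched roughly as follows. This is an illustration in Python rather than the Perl/Ruby mentioned in the thread, and it is not the script that was later written. The function name `select_balanced` is hypothetical, the tag name is passed in instead of being detected, and the sketch assumes the caret sits in text content (not inside a tag) and that the document has no self-closing tags of that name, comments, or embedded source containing tag-like text.

```python
import re

def select_balanced(text, caret, tag):
    """Return (start, end) of the innermost <tag>...</tag> around caret, or None."""
    # Matches both opening and closing tags; group 1 is "/" for a closing tag.
    token = re.compile(r"<(/?)%s\b[^>]*>" % re.escape(tag))

    # Scan right from the caret: opening tags raise the counter, closing
    # tags lower it; the first close that drives it below zero is ours.
    depth, end = 0, None
    for m in token.finditer(text, caret):
        depth += -1 if m.group(1) else 1
        if depth < 0:
            end = m.end()
            break
    if end is None:
        return None

    # Same idea to the left, walking the tags before the caret in reverse:
    # here closing tags raise the counter and opening tags lower it.
    depth = 0
    for m in reversed(list(token.finditer(text, 0, caret))):
        depth += 1 if m.group(1) else -1
        if depth < 0:
            return m.start(), end
    return None

text = "<div> <div> <div> TEXT </div> </div> </div>"
start, end = select_balanced(text, text.index("TEXT"), "div")
print(text[start:end])
```

Only the tags between the caret and the enclosing pair are examined, so in the common case the whole document does not have to be parsed.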

I don't know whether it works but ... Maybe I'll find some time to try it out. The advantage would be that I don't have to parse the entire document. Or one could write it in Objective-C as a plug-in, or Allan has a nice idea for it ;)

On the other hand, I thought about using an external HTML parser. This works, but the parser is also very slow if one has a large HTML file. One could think about restricting the area to be parsed, say 100 lines above and below the current line, but this is also tricky.

Cheers,

--Hans

Peter Haza

1:10 p.m.

Wouldn't it be possible to remove the currently selected text (and its tag), find the next matching tags (in the remaining text), and insert the text again when you've found them?


Thomas Aylott - subtleGradient

4:02 p.m.

On Nov 16, 2007, at 8:00 AM, Hans-Jörg Bibiko wrote:

...


One idea is to remove the problem of all the nested identical tags by using 1 pass to make all tagnames unique. Something like what you said with a counter that goes up and down as it hits a duplicate tagname:

<div1> <div2> <div3> TEXT </div3> </div2> </div1>

Then you could do a simpler regex to find the balance of the tags.
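A rough sketch of that renaming pass, in Python for illustration only (the macro itself does not work this way, and `number_tags` is a hypothetical helper name). It appends the current nesting depth to every tag name, after which a lazy match with a backreference finds a balanced pair. It assumes well-formed markup with no comments or embedded source; self-closing tags are numbered like opening tags and never matched to a close.

```python
import re

def number_tags(html):
    """Append the nesting depth to every tag name: <div><div> -> <div1><div2>."""
    depth = {}  # per-tag-name count of currently open tags

    def rename(m):
        slash, name, rest = m.groups()
        if slash:
            # Closing tag: label it with the current depth, then step out.
            n = depth.get(name, 0)
            depth[name] = n - 1
        else:
            # Opening tag: step in, then label it with the new depth.
            n = depth.get(name, 0) + 1
            depth[name] = n
        return "<%s%s%d%s>" % (slash, name, n, rest)

    return re.sub(r"<(/?)(\w+)([^>]*)>", rename, html)

numbered = number_tags("<div> <div> <div> TEXT </div> </div> </div>")
print(numbered)

# With depth-numbered names, a lazy match plus a backreference is enough,
# because a tag can no longer contain another tag with the same name.
inner = re.search(r"<(div2)\b[^>]*>.*?</\1>", numbered, re.S).group(0)
print(inner)
```

Siblings at the same depth share a number, which is harmless here: two tags at the same depth can never be nested inside each other.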

Then it's just a matter of wrapping the selection with something unique... Fixing the document again... And then finding your selection again... And then removing that unique wrapper.

We'd have to come up with a nice way to limit the scope initially so you don't have to parse the whole document every time.

I'm sure there's a simple way to do it that we're just not seeing.

—Thomas Aylott – subtleGradient—

Alex Ross

5:30 p.m.

On Nov 16, 2007, at 8:02 AM, Thomas Aylott - subtleGradient wrote:

...


One idea is to remove the problem of all the nested identical tags by using 1 pass to make all tagnames unique. Something like what you said with a counter that goes up and down as it hits a duplicate tagname:

<div1> <div2> <div3> TEXT </div3> </div2> </div1>

Then you could do a simpler regex to find the balance of the tags.

Then it's just a matter of wrapping the selection with something unique... Fixing the document again... And then finding your selection again... And then removing that unique wrapper.

We'd have to come up with a nice way to limit the scope initially so you don't have to parse the whole document every time.

I'm sure there's a simple way to do it that we're just not seeing.

—Thomas Aylott – subtleGradient—

I don't mean to get all nerdy on you guys, but the problem that you're running into here is that HTML is context-free. Regular expressions can't match context-free languages in general. There are mathematical proofs of this... so what you're trying to do is really, truly impossible.

What you need is some way to remember some state, which is the counter that Hans mentioned or a transformation of the tags so that each has a unique (ordered) identifier. Also, you can match it if you have the recursive subexpression from Oniguruma. But you don't. My uh... suggestion is to give up using plain-old regular expressions :). It can never work.

–Alex

Thomas Aylott - subtleGradient

5:51 p.m.

On Nov 16, 2007, at 12:30 PM, Alex Ross wrote:

...


Hey, thanks. But duh ;) that's why the macro I just released actually works. I gave up on regex finally and mangled HTML around a bit to work with the normal select brackets internal TextMate command.

I was fiddling around with recursive regex, but it seems to completely crash TextMate if you don't have it just right, so it's extremely hard to play with.

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

20 Nov

6:06 p.m.

On 16.11.2007, at 18:51, Thomas Aylott - subtleGradient wrote:

...

I gave up on regex finally and mangled HTML around a bit to work with the normal select brackets internal TextMate command.

Hi,

I wrote a Perl script with a nested tag killer as a tmCommand. Fine. It works and I believe I caught 99% of all exceptions within HTML/XML ;) And it is fast even for large documents with 10000 lines (ca. 100ms).

If my document is larger than ca. 540kB the tmCommand doesn't work. I don't know why, but I guess there is a maximum size for passing parameters, or whatever.

OK. This script outputs the offset of the start tag, the length of the found text chunk, and I can output the entire text chunk.

Now my question: Is there a way to highlight this text by using a macro/command WITHOUT TMTOOLS? I can copy this text into the clipboard but then ?? Any ideas?

With TMTOOLS it wouldn't be a problem. Also the text size doesn't matter.

Cheers,

--Hans

Gerd Knops

7:34 p.m.

On Nov 20, 2007, at 12:06 PM, Hans-Jörg Bibiko wrote:

...


Now my question: Is there a way to highlight this text by using a macro/command WITHOUT TMTOOLS? I can copy this text into the clipboard but then ?? Any ideas?

Output as snippet?

Gerd

Thomas Aylott - subtleGradient

10:45 p.m.

On Nov 20, 2007, at 1:06 PM, Hans-Jörg Bibiko wrote:

...


Well, don't hold out on me here! Give me the TMTOOLS version now and we can figure out a better version later.

Oh, and have I mentioned how much we all love you?

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

21 Nov

12:46 a.m.

On 20.11.2007, at 23:45, Thomas Aylott - subtleGradient wrote:

...

On Nov 20, 2007, at 1:06 PM, Hans-Jörg Bibiko wrote:

...
I wrote a perl script with a nested tag killer as tmCommand. Fine. It works and I believe I caught 99% of all exceptions within HTML/ XML ;) And it is fast even for large document with 10000 lines (ca. 100ms).

If my document is larger than ca. 540kB the tmCommand doesn't work. I don't know why, but I guess there is a maximum size for passing parameter, or what ever.

Give me the TMTOOLS version now and we can figure out a better version later.

Well, I thought about avoiding TMTOOLS but I couldn't find a solution. I also tried to manipulate the *.tmmacro file directly and call reloadBundle via osascript. It works, but it takes time and it's a rocky road.

I will write a TMTOOLS command. Actually I did it, but I figured out that if the document is too large Perl crashes with

libperl.dylib 0x977aebd4 S_regmatch + 13756

AFAIK this is a memory exception. Well, TM is stable, but maybe I will try Ruby. I also tried to write such a parser in pure Obj-C. It works brilliantly but it's slow, thus I guess one has to write it in low-level C. It would be VERY nice if one had the chance to use TM's Oniguruma regexp engine within a plug-in, but up to now I haven't managed it. Maybe Allan is able to give a hint :) This would simplify thousands of tiny things :D

With Perl I have to fix the encoding issue, i.e. byte offset versus character offset; and finally there's a tiny problem with 'Soft Tabs' which I have to solve.
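To make the byte-versus-character mismatch concrete, here is a small Python illustration. It is not the Perl fix referred to above; the helper name `byte_to_char_offset` and the sample string are made up, and UTF-8 is assumed as the document encoding. Any non-ASCII character before the match shifts the byte offset away from the character offset an editor would need for a selection.

```python
def byte_to_char_offset(text, byte_offset, encoding="utf-8"):
    """Convert an offset counted in encoded bytes into one counted in characters."""
    data = text.encode(encoding)
    # Decoding only the prefix counts how many characters those bytes make up.
    # This raises UnicodeDecodeError if byte_offset falls inside a character.
    return len(data[:byte_offset].decode(encoding))

# Hypothetical sample: "á" takes two bytes in UTF-8 but is one character.
text = "Ciarán <p>x</p>"
byte_pos = text.encode("utf-8").index(b"<p>")
char_pos = byte_to_char_offset(text, byte_pos)
print(byte_pos, char_pos)
```

A script that searches raw bytes would need a conversion of this kind before handing an offset and length to anything that selects by character position.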

In other words I need a bit time yet ;)

Regards, --Hans

Thomas Aylott - subtleGradient

2:21 a.m.

Bah. Excuses excuses... ;)

Thomas Aylott [SubtleGradient] from iPhone


Hans-Joerg Bibiko

22 Nov 22 Nov

1:50 p.m.

Hi,

Please try this TENTATIVE macro for selecting balanced HTML/PLIST or XML tags (according to TM_MODE).

Up to now I'm using a Perl script which crashes (WITHOUT affecting TM or the current text!) on very large documents (more than 12,000 lines). I will try to fix this in Perl, or use Ruby instead.

Please check the macro against:
- any kind of embedded php/jsp/etc. stuff
- any kind of nested tags
- any kind of comments!
- any kind of unfinished XHTML code, meaning invalid code etc.

The only thing I didn't check is the case where a '<' appears within quotes, like <input value=" < a >"...>. I can do this, but then I'm not able to select e.g. the entire tag <input name="bla"> if my caret is within "bla". The clean way is to use entities.

In short, try it out and give me any feedback.

Have fun (hopefully)

--Hans

Hans-Joerg Bibiko

3:16 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!2!

Hi,

here comes the next version of the TENTATIVE macro for selecting balanced HTML/PLIST or XML tags (according to TM_MODE).

I fixed some tiny things.

But there is one big issue.

If I breach the logical structure of an HTML document, e.g. there is a single "</b>" without an open tag, this macro cannot match the correct tags. Almost all browsers will ignore such things, but the macro can't do that up to now.

What is the best practice here? Should the macro only select a correct match? Meaning, if the opening tag differs from the closing tag, it won't highlight anything?

Best,

--Hans

Hans-Joerg Bibiko

3:43 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!2!

Oops, I posted the wrong version. Sorry!

--Hans

Thomas Aylott - subtleGradient

7:53 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!2!

On Nov 22, 2007, at 10:43 AM, Hans-Joerg Bibiko wrote:

...

Oops, I posted the wrong version. Sorry!

<Select Balanced XHTML Tags.tmMacro>

--Hans

Works better than mine so far. Except when between any kind of quotes.

I recommend looking into existing html parsing solutions. This is not the first time people have needed to parse all kinds of crazy html.

I'd help out with it but I'm a bit swamped for now.

I'm EXTREMELY thrilled that we now have someone with your high level of awesomeness working on this problem. Where were you in 2005?! ;)

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

8:21 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!2!

On 22.11.2007, at 20:53, Thomas Aylott - subtleGradient wrote:

...

Works better than mine so far. Except when between any kind of quotes.

What do you mean? I found it quite convenient to do this [ '|' := cursor]:

invoke the macro and it selects:

do it again and it selects:

This is also quite useful if one is in XML mode.

...

I recommend looking into existing html parsing solutions. This is not the first time people have needed to parse all kinds of crazy html.

Yes. I will look for it. For now I'm glad that I found a method to select something by using a macro ;)

--Hans

Thomas Aylott - subtleGradient

9:06 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!2!

On Nov 22, 2007, at 3:21 PM, Hans-Jörg Bibiko wrote:

...

On 22.11.2007, at 20:53, Thomas Aylott - subtleGradient wrote:

...
Works better than mine so far. Except when between any kind of quotes.

What do you mean?

...

--Hans

1) Make a new document with the text: <p>"<b>test</b>"</p>

2) put your cursor in the word "test"

3) run your macro. it selects '<p>"<b>test</b>'

5) cry

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

10:31 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!2!

On 22.11.2007, at 22:06, Thomas Aylott - subtleGradient wrote:

...

<p>"<b>test</b>"</p>

OK. I cried ;)

This is the tricky thing with HTML. Actually one should write this as

but (un)fortunately many roads lead to Rome ;)

I'll see ;)

--Hans

Michael Sheets

23 Nov 23 Nov

1:33 a.m.

New subject: [ANN] Select Balanced HTML Tag!!!2!

On Nov 22, 2007, at 4:31 PM, Hans-Jörg Bibiko wrote:

...

On 22.11.2007, at 22:06, Thomas Aylott - subtleGradient wrote:

...
<p>"<b>test</b>"</p>

OK. I cried ;)

This is the tricky thing with HTML. Actually one should write this as

<p>"<b>test</b>"</p>

but (un)fortunately many roads lead to Rome ;)

One should what? ;) The only two chars that are illegal in HTML are < and &; everything else is fair game.

Hans-Jörg Bibiko

10:11 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

Hi,

this macro should work. I use a Perl script as an event-based parser. This approach should be quite robust against (X|H)TML syntax. The disadvantage is speed. To increase the speed even for very large documents I implemented a two-step parser. The first step assumes that the sought open tag is close to the caret (meaning within 6000 bytes). If nothing is found, the second step parses the entire document, which can take a while (for more than 10,000 lines). With this macro I also got rid of the Perl regexp engine crash on large documents. I tried it out with an HTML file of 45,000 lines and it worked fine for tags which are close to the caret. If the tags are not close to the caret it takes a while, but one can stop the macro by pressing APPLE+.
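The two-step idea can be sketched in Ruby roughly as follows. This is not the Perl script from the macro: the method name, the symmetric window on both sides of the caret, and the toy finder are assumptions made for illustration, and the window is counted in characters here rather than bytes.

```ruby
WINDOW = 6000  # window size taken from the post; the unit here is characters

# Runs the given finder on a window around the caret first and falls back to
# the whole document only if the window yields nothing. The finder block gets
# (text, caret) and returns [start, stop] or nil.
def two_step_search(text, caret, window: WINDOW)
  from = [caret - window, 0].max
  to   = [caret + window, text.length].min
  local = yield(text[from...to], caret - from)
  return local.map { |offset| offset + from } if local  # shift back to document offsets
  yield(text, caret)
end

# Toy finder (hypothetical): nearest <b>...</b> pair around the caret.
finder = lambda do |chunk, pos|
  start = chunk.rindex("<b>", pos)
  stop  = chunk.index("</b>", pos)
  start && stop ? [start, stop + 4] : nil
end

doc = ("x" * 50) + "<b>hi</b>" + ("y" * 50)
p two_step_search(doc, 54, window: 10, &finder)  # => [50, 59]
```

The fast path touches only a small slice of the text, so large documents pay the full parsing cost only when the enclosing tag is far from the caret.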

If the macro doesn't find anything, or the start tag name differs from the end tag name (which is the case if one works with incomplete (X|H)TML), the macro will play an alert sound.

Of course, the macro cannot handle every HTML exception. E.g. one can write: <tr><td><p>foo</td></tr> . The <p> tag will be closed by </td>.

Another issue is the <script> tag. Up to now the macro allows selecting <script>..</script>, but e.g. in JavaScript one can write: document.write("<SCR" + "IPT='2' SRC='/footers/sites/"+ site_name +".js' TYPE='text/javascript'></SCR" + "IPT>"); "<SCR" won't be ignored!

QUESTION: Should the macro ignore any <script>..</script> tags?

I rebound the macro to SHIFT+OPT+APPLE+B because I find it sometimes quite useful to have the chance to invoke 'Enclosing Brackets' in (X|H)TML, but this is up to the user.

Please try it out and give feedback ;)

Cheers,

--Hans

Thomas Aylott - subtleGradient

11:15 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On Nov 23, 2007, at 5:11 PM, Hans-Jörg Bibiko wrote:

...

Please try it out and give a feedback ;)

Try it out inside the XML of the tmMacro file itself. Odd behavior. Still way better than mine though. I went ahead and replaced mine with this version so that all the people downloading it from BundleForge and wherever will be getting yours instead.

FEATURE REQUEST: Any way to make it select the insides of the tag instead? Like BBEdit. Maybe it would make sense to make a different Macro for that so that they can have different triggers

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

11:38 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On 24.11.2007, at 00:15, Thomas Aylott - subtleGradient wrote:

...

On Nov 23, 2007, at 5:11 PM, Hans-Jörg Bibiko wrote:

...
Please try it out and give a feedback ;)

Try it out inside the XML of the tmMacro file itself. Odd behavior.

Do you mean that <string>\xFFF3</string> will be selected?

Delete this line and it'll work, because the first command in that macro searches for \xFFF3 to set the search parameters for findNext.

Maybe I will change this to a real unused Unicode character.

...

FEATURE REQUEST: Any way to make it select the insides of the tag instead? Like BBEdit.

Do you mean: ['|' := caret]

<img src="fo|o">

to select the <img src="foo">

...

Maybe it would make sense to make a different Macro for that so that they can have different triggers

Yes, I agree because that macro is called "Select balanced tags"

I will see ;)

Regards, --Hans

Thomas Aylott - subtleGradient

24 Nov 24 Nov

5:06 a.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On Nov 23, 2007, at 6:38 PM, Hans-Jörg Bibiko wrote:

...

...
Any way to make it select the insides of the tag instead? Like BBEdit.

Do you mean: ['|' := caret]

<img src="fo|o">

to select the <img src="foo">

No, I mean: (|=caret)

<div>some | text</div>

selects:

some text

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

12:51 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!4! Only the content

On 24.11.2007, at 06:06, Thomas Aylott - subtleGradient wrote:

...

No, I mean: (|=caret)

<div>some | text</div>

selects:

some text

Please try this one for selecting the content of balanced tags.

Cheers,

--Hans

Thomas Aylott - subtleGradient

3:57 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!4! Only the content

On Nov 24, 2007, at 7:51 AM, Hans-Jörg Bibiko wrote:

...

On 24.11.2007, at 06:06, Thomas Aylott - subtleGradient wrote:

...
No, I mean: (|=caret)

<div>some | text</div>

selects:

some text

Please try this one for selecting the content of balanced tags.

Cheers,

--Hans

<Select Balanced XHTML Tags Content.tmMacro>

PERFECT!

—Thomas Aylott – subtleGradient—

Alexey Blinov

26 Nov 26 Nov

11:33 a.m.

New subject: [ANN] Select Balanced HTML Tag!!!4! Only the content

Maybe it should go to BundleForge too?


Thomas Aylott - subtleGradient

24 Nov 24 Nov

3:57 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

FEATURE REQUEST: Expand selection to whole lines for multi-line blocks.

When the TextMate Select Enclosing Brackets command runs... (|=caret)

{ something | }

it selects:

" { something } "

Can you expand your selection when the selection is to the very edges of a multi-line block: like this:

<p> something | </p>

Currently selects:

"<p> something </p>"

Should select:

" <p> something </p> "

I hope that made sense.

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

25 Nov 25 Nov

5:21 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On 24.11.2007, at 16:57, Thomas Aylott - subtleGradient wrote:

...

FEATURE REQUEST: Expand selection to whole lines for multi-line blocks.

Should select:

" <p> something

</p> "

Well, it's possible but I would destroy my indentation. Or am I wrong?

--Hans

Thomas Aylott - subtleGradient

6:36 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On Nov 25, 2007, at 12:21 PM, Hans-Jörg Bibiko wrote:

...

On 24.11.2007, at 16:57, Thomas Aylott - subtleGradient wrote:

...
FEATURE REQUEST: Expand selection to whole lines for multi-line blocks.

Should select:

" <p> something

</p> "

Well, it's possible but I would destroy my indention. Or am I wrong?

--Hans

It depends on what you do once you have your selection. When I select a large block like that, I want to have the complete lines selected. When I select a tag that's all on one line, I want to only have the tag and its contents selected.

Look at how the Select Enclosing Brackets command works. It does exactly what I want.
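The requested behaviour could be sketched in Ruby as below. This is only an illustration of the rule described above, not how TextMate or the macro implements it; the method name and the offset convention (start inclusive, end exclusive, in characters) are assumptions.

```ruby
# Expands a selection to whole lines, but only when it spans several lines
# and nothing except whitespace surrounds it on its first and last line.
def expand_to_whole_lines(text, sel_start, sel_end)
  return [sel_start, sel_end] unless text[sel_start...sel_end].include?("\n")

  prev_newline = sel_start.zero? ? nil : text.rindex("\n", sel_start - 1)
  line_start = prev_newline ? prev_newline + 1 : 0
  next_newline = text.index("\n", sel_end)
  line_end = next_newline ? next_newline + 1 : text.length

  only_whitespace_around =
    text[line_start...sel_start].strip.empty? &&
    text[sel_end...line_end].strip.empty?
  only_whitespace_around ? [line_start, line_end] : [sel_start, sel_end]
end

doc = "<div>\n  <p>\n    something\n  </p>\n</div>\n"
from = doc.index("<p>")
to   = doc.index("</p>") + 4
first, last = expand_to_whole_lines(doc, from, to)
puts doc[first...last]  # the three lines of the <p> block, indentation included
```

A tag that sits entirely on one line is returned unchanged, which matches the single-line case described above.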

—Thomas Aylott – subtleGradient—

Hans-Joerg Bibiko

26 Nov 26 Nov

5:10 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On 24.11.2007, at 16:57, Thomas Aylott - subtleGradient wrote:

...

FEATURE REQUEST: Expand selection to whole lines for multi-line blocks.

Should select:

" <p> something

</p> "

OK. Try this one. This should do the trick. It was a bit tricky. One could do this with a regexp but it slows the macro down, so I went another way. Furthermore I fixed some very tiny things. There is an issue with something like <TABLE>foo</table> . I got rid of it but the macro will beep and it isn't fully Unicode safe.

I also will fix the same tiny things for Select Balanced HTML Tags Content. It comes soon.

Cheers,

--Hans

Hans-Jörg Bibiko

27 Nov 27 Nov

1:45 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On 24.11.2007, at 16:57, Thomas Aylott - subtleGradient wrote:

...

FEATURE REQUEST: Expand selection to whole lines for multi-line blocks.

Should select:

" <p> something

</p> "

Well, here comes the final version, hopefully. Back to the Perl roots. Now I use the Perl package HTML::Parser because it's written in C and is VERY fast and robust, even for very large documents. This command can handle XML/Property List/XSLT and HTML syntax according to TM_MODE.

I also fixed some tiny issues which Thomas mentioned above.

I believe Perl's HTML::Parser belongs to the default packages. If not one can install it by using the Terminal via:

sudo -H cpan -i HTML::Parser

More details: [ http://sial.org/howto/perl/life-with-cpan/macosx/ ]

Please try it out and give any feedback in the case I forgot something ;)

Cheers,

--Hans

Andy Armstrong

1:50 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On 27 Nov 2007, at 13:45, Hans-Jörg Bibiko wrote:

...

I believe Perl's HTML::Parser belongs to the default packages. If not one can install it by using the Terminal via:

sudo -H cpan -i HTML::Parser

I'm pretty sure HTML::Parser is non-core - and because it's an XS module you'll need a C compiler installed too.

-- Andy Armstrong, Hexten

Chris Adams

28 Nov 28 Nov

1:39 a.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On Nov 27, 2007, at 5:50 AM, Andy Armstrong wrote:

...

I'm pretty sure HTML::Parser is non-core - and because it's an XS module you'll need a C compiler installed too.

It appears to be included in the default install on 10.5 and 10.4.

Chris

Andy Armstrong

1:41 a.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On 28 Nov 2007, at 01:39, Chris Adams wrote:

...

On Nov 27, 2007, at 5:50 AM, Andy Armstrong wrote:

...
I'm pretty sure HTML::Parser is non-core - and because it's an XS module you'll need a C compiler installed too.

It appears to be included in the default install on 10.5 and 10.4.

Ah - apologies. I guess Apple provide some non-core modules.

-- Andy Armstrong, Hexten

Thomas Aylott - subtleGradient

27 Nov 27 Nov

5:52 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On Nov 27, 2007, at 8:45 AM, Hans-Jörg Bibiko wrote:

...

Well, here comes the final version, hopefully.

...

Please try it out and give any feedback in the case I forgot something ;)

Cheers,

--Hans

<Select Balanced XHTML Tags.tmMacro>

BUG:

(| = cursor)

<ul> <li><a|>Text</a></li> </ul>

First selects:

"<li><a>Text</a>"

Then Selects:

"<ul> <li><a>Text</a></li>"

(This is just a simplified example of the problem)

ANOTHER BUG:

<ul> <li><a>Te|xt</a></li> </ul>

First, properly selects:

"<a>Text</a>"

Then activating the macro again causes the selection to stay the same, but the selection hilite disappears. No matter how many times you activate it at this point nothing changes.

Very odd.

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

28 Nov 28 Nov

9:49 a.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On 27.11.2007, at 18:52, Thomas Aylott - subtleGradient wrote:

...

On Nov 27, 2007, at 8:45 AM, Hans-Jörg Bibiko wrote:

...
BUG:

(| = cursor)

<ul> <li><a|>Text</a></li> </ul>

Is fixed. But the behaviour depends on whether you are inside of a start tag or inside of an end tag, because the parser looks leftwards. Meaning

selects

" <li><a|>Text</a></li>"

and

selects

"<a>Text</a>"

Is this OK?

...

ANOTHER BUG:

<ul> <li><a>Te|xt</a></li> </ul>

First, properly selects:

"<a>Text</a>"

Then activating the macro again causes the selection to stay the same, but the selection hilite disappears. No matter how many times you activate it at this point nothing changes.

Very odd.

This is really strange but I found the problem. My first command inside of that macro is to destroy a possible selection. I use this command:

[[ -n "$TM_SELECTED_TEXT" ]] && open "txmt://open/?line=$TM_INPUT_START_LINE&column=$TM_INPUT_START_COLUMN"

OK, that works, but not always?? If there is at least one \t in front of the selection this command sets the caret to another position! Is this command wrong? I also used TM_INPUT_START_LINE_INDEX but without success. Anyway, it turns out that the macro also works if I leave the selection as it is ;)

On 28 Nov 2007, at 01:39, Chris Adams wrote:

...

On Nov 27, 2007, at 5:50 AM, Andy Armstrong wrote:

...
I'm pretty sure HTML::Parser is non-core - and because it's an XS module you'll need a C compiler installed too.

It appears to be included in the default install on 10.5 and 10.4.

Yes, to install that module one has to install the Developer Tools beforehand. Well, I would be glad if Mac Tiger/Leopard had that module by default, but I looked for an official site with info about Perl's core modules on Mac and didn't find one.

--Hans

Thomas Aylott - subtleGradient

2:47 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On Nov 28, 2007, at 4:49 AM, Hans-Jörg Bibiko wrote:

...


Wow! This version seems perfect so far!

Yes, I've been wanting a proper way to get the REAL cursor location forever. Just check out what I'm doing inside the Support/lib/codecompletion.rb to get the actual cursor location no matter if you have a selection or if the selection was started from the left or the right, etc...

Thanks much! I already replaced the version in Experimental with this version and added the contents only version too. Is that one good enough or will you be updating the contents version too?

Thanks! —Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

8:47 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On 28.11.2007, at 15:47, Thomas Aylott - subtleGradient wrote:

...

Yes, I've been wanting a proper way to get the REAL cursor location forever. Just check out what I'm doing inside the Support/lib/codecompletion.rb to get the actual cursor location no matter if you have a selection or if the selection was started from the left or the right, etc...

I did it that way:

get the caret position, or if a selection is done take the start point: [TM_INPUT_START_LINE is only set for a selection]

if defined "TM_INPUT_START_LINE"
  line = "TM_INPUT_START_LINE" - 1
  line_index = "TM_INPUT_START_LINE_INDEX"
else
  line = "TM_LINE_NUMBER" - 1
  line_index = "TM_LINE_INDEX"
end

get the real caret position (in bytes but depends on the regexp engine):

1) via regexp:
  THE_ENTIRE_TEXT =~ /((.*\n){line})(.{line_index})/
  TEXT_BEFORE_CARET = "$1$3";
  CARET_POSITION = length (TEXT_BEFORE_CARET)

[Perl crashes here for a very large text chunk; Ruby is fine]

2) count \n + line_index
  CARET_POSITION = 0
  count = line;
  while(count--){
    CARET_POSITION = index(THE_ENTIRE_TEXT,"\n", CARET_POSITION)
    CARET_POSITION++
  }
  CARET_POSITION = CARET_POSITION + line_index

or others
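A runnable Ruby take on variant 2) might look like the sketch below. The TM_* variable names come from the pseudocode above; the method names are invented, the environment is passed in as a plain hash so the example runs outside TextMate, and offsets are counted in characters (converting to bytes is a separate step).

```ruby
# Picks the 0-based line and the index within that line, preferring the start
# of a selection when TM_INPUT_START_LINE is set.
def line_and_index(env)
  if env["TM_INPUT_START_LINE"]
    [env["TM_INPUT_START_LINE"].to_i - 1, env["TM_INPUT_START_LINE_INDEX"].to_i]
  else
    [env["TM_LINE_NUMBER"].to_i - 1, env["TM_LINE_INDEX"].to_i]
  end
end

# Walks over `line` newlines and then adds the index within the line.
# Returns nil if the text has fewer lines than expected.
def caret_offset(text, line, line_index)
  pos = 0
  line.times do
    newline = text.index("\n", pos)
    return nil if newline.nil?
    pos = newline + 1
  end
  pos + line_index
end

doc = "<p>\n  <b>test</b>\n</p>\n"
line, index = line_and_index("TM_LINE_NUMBER" => "2", "TM_LINE_INDEX" => "5")
puts caret_offset(doc, line, index)  # => 9
```

Inside a real command one would presumably pass ENV instead of the literal hash. Because this variant only scans for newlines, it avoids building one large regexp match over the whole document.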

...

I already replaced the version in Experimental with this version and added the contents only version too. Is that one good enough or will you be updating the contents version too?

I attached a slightly modified version of "Select Balanced (HT|X)ML Tags". I cleaned the code, improved the error handling (e.g. if HTML::Parser is not installed); Tool Tip if nothing is found or there is a mismatch.

Based on that "Select Balanced (HT|X)ML Tags" macro I will post a new (and more robust) version of "Select Content of Balanced (HT|X)ML Tags" (hopefully tomorrow). Maybe I will have some time to write a macro for "Select Balanced (HT|X)ML Tags backwards" like yours using UNDO.

Best,

--Hans

Thomas Aylott - subtleGradient

29 Nov 29 Nov

3:57 a.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On Nov 28, 2007, at 3:47 PM, Hans-Jörg Bibiko wrote:

...

Based on that "Select Balanced (HT|X)ML Tags" macro I will post a new (and more robust) version of "Select Content of Balanced (HT|X)ML Tags" (hopefully tomorrow). Maybe I will have some time to write a macro for "Select Balanced (HT|X)ML Tags backwards" like yours using UNDO.

Best,

--Hans

oooo, sweet!

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

10:26 a.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

Hi,

here come two macros:

1) Select Balanced (HT|X)ML Tags

Because of the speed of the parser I changed the behaviour for the case where the caret is inside of a start tag. I pre-parse the entire text for setting the caret. Now it will select this tag.

example: <p>fo|o</p> <p>foo</|p> <|p>foo</p>

now all will select

"<p>foo</p>"

I believe this is more logical. It only takes a few ticks more.

2) Select Content of Balanced (HT|X)ML Tags

This macro will only select the content between the found tags. Here the behaviour differs from macro 1) if the caret is inside of a start tag: it will select the tag pair which is one level up. Otherwise one wouldn't be able to run the macro again once it has already found something.

I also tested both macros on very large documents and the speed is OK.

Please test these macros in case I missed something.

Cheers,

--Hans

Hans-Jörg Bibiko

12:09 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

Hi,

I found an "error". If one writes e.g. <hr/> instead of <hr> or <hr /> the parser doesn't ignore <hr/>. It gives a warning. Should I leave it as it is or should I get rid of it by using a pre-parsing step?

In my opinion a warning would be fine to keep your code clean.

Any comments?

--Hans

Thomas Aylott - subtleGradient

4:24 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On Nov 29, 2007, at 7:09 AM, Hans-Jörg Bibiko wrote:

...

Hi,

I found an "error". If one writes e.g. <hr/> instead of <hr> or <hr /> the parser doesn't ignore <hr/>. It gives a warning. Should I leave it as it is or should I get rid of it by using a pre-parsing?

By my opinion a warning would be fine to keep your code clean.

Any comments?

--Hans

Well, <hr/> is perfectly valid xhtml, you only have to make it <hr /> if you want it backwards compatible with html 4.

Ideally it would treat <hr/> and <hr /> and <hr> exactly the same.
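One way to treat the three spellings alike is sketched below in Ruby. It is not how the macro or HTML::Parser handles them; the method name is invented and the list of void elements is a deliberately partial assumption.

```ruby
# Partial list of HTML elements that never take an end tag (assumption:
# extend as needed for real documents).
VOID_ELEMENTS = %w[br hr img input meta link].freeze

# True if the tag stands alone: either written XML-style with a trailing
# slash, or a known void element written HTML-style.
def standalone_tag?(tag)
  m = /\A<([A-Za-z][A-Za-z0-9]*)\b[^>]*?(\/?)>\z/.match(tag)
  return false unless m
  m[2] == "/" || VOID_ELEMENTS.include?(m[1].downcase)
end

%w[<hr/> <hr> <p>].each { |t| puts "#{t} #{standalone_tag?(t)}" }
puts standalone_tag?("<hr />")
```

A balanced-tag matcher can skip every tag for which this returns true, so <hr/>, <hr /> and <hr> never enter the open-tag bookkeeping.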

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

8:46 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On 29.11.2007, at 17:24, Thomas Aylott - subtleGradient wrote:

...

Well, <hr/> is perfectly valid xhtml, you only have to make it <hr /> if you want it backwards compatible with html 4.

Ideally it would treat <hr/> and <hr /> and <hr> exactly the same.

OK ;)

Here both improved versions:

Cheers,

--Hans

PS @Thomas: Regarding the reverse functionality (your UNDO): up to now I don't see any easy and non-destructive way.

Thomas Aylott - subtleGradient

30 Nov 30 Nov

7:24 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

On Nov 29, 2007, at 3:46 PM, Hans-Jörg Bibiko wrote:

...

On 29.11.2007, at 17:24, Thomas Aylott - subtleGradient wrote:

...
Well, <hr/> is perfectly valid xhtml, you only have to make it <hr /> if you want it backwards compatible with html 4.

Ideally it would treat <hr/> and <hr /> and <hr> exactly the same.

OK ;)

Here both improved versions:

<Select Content of Balanced (HT|X)ML Tags.tmMacro><Select Balanced (HT|X)ML Tags.tmMacro>

Cheers,

--Hans

PS @Thomas: Regarding to the reverse functionality (your UNDO), well up to now I don't see any easy and non-destructive way

Sweet!

I've done UNDO with Macros by doing a replace for ^ (beginning of line) with nothing. Seems to not change anything in the document, but it does trigger the undo.

Thanks!

—Thomas Aylott – subtleGradient—

Hans-Jörg Bibiko

28 Nov 28 Nov

8:53 p.m.

New subject: [ANN] Select Balanced HTML Tag!!!3!

Oops, I forgot to ask something.

In HTML this is valid code:

This macro cannot match such things. One can run Tidy beforehand to clean it up. But my question is whether some of you know a good resource that lists which start tags automatically close an unclosed tag?

--Hans

Thomas Aylott - subtleGradient

20 Nov 20 Nov

10:48 p.m.

On Nov 20, 2007, at 1:06 PM, Hans-Jörg Bibiko wrote:

...

Is there a way to highlight this text by using a macro/command WITHOUT TMTOOLS? I can copy this text into the clipboard but then ?? Any ideas?

You might be able to do it with 3 separate things. 1) A Command to do the actual work, which inserts extremely unique start and end strings. 2) A Macro that searches for those unique start and end strings and anything in between, and also removes those strings, bringing the document back to where it started except with a nice selection now. 3) A third macro that gets the keystroke and runs the command and then the macro in succession.

One advantage/sideeffect is that it would be an undoable operation.
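The marker round trip can be sketched in plain Ruby, outside TextMate. The marker characters (two private-use code points) and the method names are assumptions chosen for illustration; the real command and macro would use TextMate's own find and replace.

```ruby
# Hypothetical markers: two private-use code points unlikely to appear in a document.
START_MARK = "\u{E000}"
END_MARK   = "\u{E001}"

# Step 1 (the "command"): wrap the range to be selected in unique markers.
def insert_markers(text, sel_start, sel_end)
  text[0...sel_start] + START_MARK + text[sel_start...sel_end] + END_MARK + text[sel_end..-1]
end

# Step 2 (the "macro"): find the marked span, strip the markers and report
# the selection as [clean_text, start, stop]. Returns nil without markers.
def select_and_strip(marked)
  m = /#{START_MARK}(.*?)#{END_MARK}/m.match(marked)
  return nil unless m
  clean = m.pre_match + m[1] + m.post_match
  [clean, m.begin(0), m.begin(0) + m[1].length]
end

doc = "<p><b>test</b></p>"
clean, start, stop = select_and_strip(insert_markers(doc, 3, 14))
puts clean[start...stop]  # => <b>test</b>
```

After the second step the text is identical to the original, which is why the whole operation leaves the document where it started.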

—Thomas Aylott – subtleGradient—

Alex Ross

10:48 p.m.

...

...
I don't mean to get all nerdy on you guys but the problem that you're running into here is that HTML is context-free. Regular expressions can't match context-free languages. There are mathematical proofs of this... so what you're trying to do is really truly impossible.

What you need is some way to remember some state, which is the counter that Hans mentioned or a transformation of the tags so that each has a unique (ordered) identifier. Also, you can match it if you have the recursive subexpression from Oniguruma. But you don't. My uh... suggestion is to give up using plain-old regular expressions :). It can never work.

–Alex
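The counter idea can be shown with a short Ruby sketch. It is not Hans's script: the method name is invented, and the sketch assumes well-formed input with no self-closing tags, comments, or CDATA sections of the same name.

```ruby
# Finds where the element that starts at open_pos ends (exclusive offset),
# by counting nested start and end tags with the same name.
def matching_close_end(text, open_pos, name)
  tag = /<(\/?)#{Regexp.escape(name)}\b[^>]*>/i
  depth = 0
  pos = open_pos
  while (m = tag.match(text, pos))
    depth += m[1].empty? ? 1 : -1   # start tag: +1, end tag: -1
    return m.end(0) if depth.zero?  # counter back at zero: balanced
    pos = m.end(0)
  end
  nil                               # ran out of tags: unbalanced
end

html = "<div>a<div>b</div>c</div>d"
puts html[0...matching_close_end(html, 0, "div")]  # => <div>a<div>b</div>c</div>
puts html[6...matching_close_end(html, 6, "div")]  # => <div>b</div>
```

The regexp here only recognises single tags; the nesting itself is tracked by the depth counter, which is the state a plain regular expression lacks.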

Hey, thanks. But duh ;) that's why the macro I just released actually works. I gave up on regex finally and mangled HTML around a bit to work with the normal select brackets internal TextMate command.

oh.... uhm. sorry. didn't mean to insult.

...

I was fiddling around with recursive regex, but it seems to completely crash TextMate if you don't have it just right, so it's extremely hard to play with.

Thomas Aylott - subtleGradient

11:30 p.m.

On Nov 20, 2007, at 5:48 PM, Alex Ross wrote:

...

...
Hey, thanks. But duh ;) that's why the macro I just released actually works. I gave up on regex finally and mangled HTML around a bit to work with the normal select brackets internal TextMate command.

oh.... uhm. sorry. didn't mean to insult.

Ditto. I shouldn't be so snippy. The smiley was meant to convey a more jocular tone.

If only smilies covered all transgressions. Then I'd totally send in my 1040 just full of :):):) :D:D:D ;););) And all would be forgiven :D

—Thomas Aylott – subtleGradient—

Thomas Aylott - subtleGradient

15 Nov 15 Nov

4:35 p.m.

New subject: [SCREENCAST] Select Balanced HTML Tag

Enjoy the screencasty goodness: http://subtlegradient.com/articles/2007/11/15/select-balanced-html-tag

—Thomas Aylott – subtleGradient—
