From bibiko@eva.mpg.de Fri Nov 23 22:11:45 2007
From: =?utf-8?q?Hans-J=C3=B6rg?= Bibiko
To: textmate@lists.macromates.com
Subject: Re: [TxMt] [ANN] Select Balanced HTML Tag!!!3!
Date: Fri, 23 Nov 2007 23:11:29 +0100
Message-ID: <337BC726-E94C-4A5E-BBD6-C4B263FC0569@eva.mpg.de>
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============5507697114227513764=="

--===============5507697114227513764==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Hi,

this macro should work. I use a Perl script as an event-based parser. This approach should be quite robust against (X|H)TML syntax. The disadvantage is speed. To increase the speed even for very large documents I implemented a two-step parser. The first step assumes that the sought opening tag is close to the caret (meaning within 6000 bytes). If nothing is found, the second step parses the entire document, which can take a while (for documents of more than 10,000 lines). With this macro I also got rid of the Perl regexp engine crash on large documents.

I tried it out. I took an HTML file with 45,000 lines and it worked fine for tags which are close to the caret. If the tags are not close to the caret it takes a while, but one can stop the macro by pressing APPLE+. If the macro doesn't find anything, or the start tag name differs from the end tag name - which is the case if one works with incomplete (X|H)TML - the macro will play an alert sound.

Of course, the macro cannot handle every HTML exception. E.g. one can write:

foo . The

tag will be closed by . Another issue is the but e.g. in JavaScript one can write: document.write("<\/SCR" + "IPT>"); ".. tags?

I rebound the macro to SHIFT+OPT+APPLE+B because I sometimes find it quite useful to have the chance to invoke 'Enclosing Brackets' in (X|H)TML, but this is up to the user.

Please try it out and give feedback ;)

Cheers,

--Hans

--===============5507697114227513764==
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64
Content-Disposition: attachment; filename="SelectBalancedXHTMLTags.tmMacro"
MIME-Version: 1.0

PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPCFET0NUWVBFIHBsaXN0IFBV QkxJQyAiLS8vQXBwbGUgQ29tcHV0ZXIvL0RURCBQTElTVCAxLjAvL0VOIiAiaHR0cDovL3d3dy5h cHBsZS5jb20vRFREcy9Qcm9wZXJ0eUxpc3QtMS4wLmR0ZCI+CjxwbGlzdCB2ZXJzaW9uPSIxLjAi Pgo8ZGljdD4KCTxrZXk+YnVuZGxlVVVJRDwva2V5PgoJPHN0cmluZz5CNUEyMjk2Ny0yRTZBLTRC NjYtOUM3MS1EQTg3NjJDMDFFMjc8L3N0cmluZz4KCTxrZXk+Y29tbWFuZHM8L2tleT4KCTxhcnJh eT4KCQk8ZGljdD4KCQkJPGtleT5hcmd1bWVudDwva2V5PgoJCQk8ZGljdD4KCQkJCTxrZXk+YWN0 aW9uPC9rZXk+CgkJCQk8c3RyaW5nPmZpbmROZXh0PC9zdHJpbmc+CgkJCQk8a2V5PmZpbmRJblBy b2plY3RJZ25vcmVDYXNlPC9rZXk+CgkJCQk8dHJ1ZS8+CgkJCQk8a2V5PmZpbmRJblByb2plY3RS ZWd1bGFyRXhwcmVzc2lvbjwva2V5PgoJCQkJPGZhbHNlLz4KCQkJCTxrZXk+ZmluZFN0cmluZzwv a2V5PgoJCQkJPHN0cmluZz5ceEZGRjM8L3N0cmluZz4KCQkJCTxrZXk+aWdub3JlQ2FzZTwva2V5 PgoJCQkJPGZhbHNlLz4KCQkJCTxrZXk+cmVndWxhckV4cHJlc3Npb248L2tleT4KCQkJCTxmYWxz ZS8+CgkJCQk8a2V5PnJlcGxhY2VBbGxTY29wZTwva2V5PgoJCQkJPHN0cmluZz5kb2N1bWVudDwv c3RyaW5nPgoJCQkJPGtleT5yZXBsYWNlU3RyaW5nPC9rZXk+CgkJCQk8c3RyaW5nPiQxPC9zdHJp bmc+CgkJCQk8a2V5PndyYXBBcm91bmQ8L2tleT4KCQkJCTx0cnVlLz4KCQkJPC9kaWN0PgoJCQk8 a2V5PmNvbW1hbmQ8L2tleT4KCQkJPHN0cmluZz5maW5kV2l0aE9wdGlvbnM6PC9zdHJpbmc+CgkJ PC9kaWN0PgoJCTxkaWN0PgoJCQk8a2V5PmFyZ3VtZW50PC9rZXk+CgkJCTxkaWN0PgoJCQkJPGtl eT5iZWZvcmVSdW5uaW5nQ29tbWFuZDwva2V5PgoJCQkJPHN0cmluZz5ub3A8L3N0cmluZz4KCQkJ CTxrZXk+Y29tbWFuZDwva2V5PgoJCQkJPHN0cmluZz5leHBvcnQgUkVTPSQocGVybCAtZSAnCiMh
L3Vzci9iaW4vcGVybCAtWApubyB3YXJuaW5nczsKdW5kZWYgJC87CgokdXRmOGEgPSBxewoJIAlb XHgwMC1ceDdGXQoJfAlbXHhDMi1ceERGXVtceDgwLVx4QkZdCgl8CVx4RTBbXHhBMC1ceEJGXVtc eDgwLVx4QkZdCgl8CVtceEUxLVx4RUZdW1x4ODAtXHhCRl1bXHg4MC1ceEJGXQoJfAlceEYwW1x4 OTAtXHhCRl1bXHg4MC1ceEJGXVtceDgwLVx4QkZdCgl8CVtceEYxLVx4RjddW1x4ODAtXHhCRl1b XHg4MC1ceEJGXVtceDgwLVx4QkZdCgl8CVx4RjhbXHg4OC1ceEJGXVtceDgwLVx4QkZdW1x4ODAt XHhCRl1bXHg4MC1ceEJGXQoJfAlbXHhGOS1ceEZCXVtceDg4LVx4QkZdW1x4ODgtXHhCRl1bXHg4 OC1ceEJGXVtceDg4LVx4QkZdCgl8CVx4RkNbXHg4NC1ceEJGXVtceDg4LVx4QkZdW1x4ODgtXHhC Rl1bXHg4OC1ceEJGXVtceDg4LVx4QkZdCgl8CVx4RkRbXHg4OC1ceEJGXVtceDg4LVx4QkZdW1x4 ODgtXHhCRl1bXHg4OC1ceEJGXVtceDg4LVx4QkZdCgkgICAgIH07Cgokb3JnID0gJmx0OyZndDs7 CiRsaW5lID0gJEVOVnsiVE1fTElORV9OVU1CRVIifS0xOwokbGluZV9pbmRleCA9ICRFTlZ7IlRN X0xJTkVfSU5ERVgifSswOwoKI3NlYXJjaCBsZWZ0IGZyb20gY2FyZXQgYmFja3dhcmRzIGluIGJs b2NrcwokY291bnQgPSAkbGluZTsKJHBvcyA9IDA7CndoaWxlKCRjb3VudC0tKXsKCSRwb3MgPSBp bmRleCgkb3JnLCJcbiIsJHBvcyk7CgkkcG9zKys7Cn0KJHBvcyArPSAkbGluZV9pbmRleDsKJHRl eHQgPSBzdWJzdHIoJG9yZywwLCRwb3MpOwoKI3BlcmwgY3Jhc2hlcyB3aXRoIHRoYXQgcmVnZXhw IG9uIGxhcmdlIGRvY3VtZW50cwojJG9yZyA9fiBtLygoLipcbil7JGxpbmV9KSgueyRsaW5lX2lu ZGV4fSkvOwojJHRleHQgPSAiJDEkMyI7CgokY3VyaW5kZXggPSBsZW5ndGgoJHRleHQpOwoKaWYo JEVOVnsiVE1fTU9ERSJ9ID1+IC9IVE1MfFByb3BlcnR5IExpc3QvICkgewoJJHRleHQgPX4gcyEm bHQ7KC8/KGFyZWF8YmFzZXxiYXNlZm9udHxicnxjb2x8ZnJhbWV8aHJ8aW1nfGlucHV0fGlzaW5k ZXh8bGlua3xtZXRhfHBhcmFtKS4qPy8/KSZndDshcmVwbDEoJCZhbXA7KSFlaXNnOwp9CiR0ZXh0 PX5zLyZsdDsoXC8pezB9W14mZ3Q7XSo/XC8mZ3Q7L3JlcGwxKCQmYW1wOykvZXNnOwoKI2ZpcnN0 IGxvb3AgZmFzdCBzZWFyY2ggbmVhciBjYXJldAokZmlyc3QgPSAyOwokcG9zID0gJGN1cmluZGV4 LTYwMDA7CndoaWxlKCRmaXJzdC0tKXsKCSRmaXJzdCsrOwoJJHBvcyA9IDAgaWYgKCRwb3MmbHQ7 MCk7Cglwb3MoJHRleHQpID0gJHBvczsKCXdoaWxlKCR0ZXh0ID1+IG0vJmx0OyhcLyk/KFx3W1x3 Oi1dKilbXiZndDtdKiZndDsvc29nKXsKCQlpZiAoIWRlZmluZWQoJDEpKSB7CgkJCXVuc2hpZnQo QHN0YWNrLCBbJDIsKHBvcygkdGV4dCktbGVuZ3RoKCQmYW1wOykpXSk7CgkJfSBlbHNlIHsKCQkJ 
d2hpbGUoQHN0YWNrKSB7IGxhc3QgaWYgKHNoaWZ0KEBzdGFjayktJmd0O1swXSBlcSAkMikgfTsK CQl9Cgl9Cgkkc3RhcnQgPSBzaGlmdChAc3RhY2spOwoJbGFzdCBpZiAoZGVmaW5lZCgkc3RhcnQt Jmd0O1swXSkgfHwgJHBvcyA9PSAwKTsKCSRwb3MgPSAwOwp9CgojY2FsY3VsYXRlICRjb2x1bW4g Zm9yIHNldHRpbmcgdGhlIGNhcmV0CiRvZmZzZXQgPSByaW5kZXggc3Vic3RyKCR0ZXh0LDAsJHN0 YXJ0LSZndDtbMV0pLCAiXG4iOwokb2ZmdGV4dCA9IHN1YnN0cigkdGV4dCwgJG9mZnNldCsxLCAk c3RhcnQtJmd0O1sxXS0kb2Zmc2V0KTsKQGxlbiA9ICRvZmZ0ZXh0ID1+IC8kdXRmOGEvZ294Owok Y29sdW1uID0gJCNsZW4rMTsKCiNzZWFyY2ggcmlnaHQgZnJvbSBjYXJldApAc3RhY2sgPSAoKTsK JHRleHQgPSBzdWJzdHIoJG9yZywkY3VyaW5kZXgsbGVuZ3RoKCRvcmcpLSRjdXJpbmRleCk7Cmlm KCRFTlZ7IlRNX01PREUifSA9fiAvSFRNTHxQcm9wZXJ0eSBMaXN0LyApIHsKCSR0ZXh0ID1+IHMh Jmx0OygvPyhhcmVhfGJhc2V8YmFzZWZvbnR8YnJ8Y29sfGZyYW1lfGhyfGltZ3xpbnB1dHxpc2lu ZGV4fGxpbmt8bWV0YXxwYXJhbSkuKj8vPykmZ3Q7IXJlcGwxKCQmYW1wOykhZWlzZzsKfQokdGV4 dD1+cy8mbHQ7KFwvKXswfVteJmd0O10qP1wvJmd0Oy9yZXBsMSgkJmFtcDspL2VzZzsKd2hpbGUo JHRleHQgPX4gbS8mbHQ7KFwvKT8oXHdbXHc6LV0qKVteJmd0O10qJmd0Oy9zb2cpewoJaWYgKCFk ZWZpbmVkKCQxKSkgewoJCXVuc2hpZnQoQHN0YWNrLCBbJDIsKHBvcygkdGV4dCkpXSk7Cgl9IGVs c2UgewoJCWlmICghQHN0YWNrKSB7CgkJCSRlbmQgPSBbJDIsKHBvcygkdGV4dCkpXTsKCQkJbGFz dCBpZiAoJGVuZC0mZ3Q7WzBdIGVxICRzdGFydC0mZ3Q7WzBdKTsKCQl9IGVsc2UgewoJCQl3aGls ZShAc3RhY2spIHsgbGFzdCBpZiAoc2hpZnQoQHN0YWNrKS0mZ3Q7WzBdIGVxICQyKSB9OwoJCX0K CX0KfQoKI3NldCBjYXJldAppZighbGVuZ3RoKCRzdGFydC0mZ3Q7WzBdKSl7CglzeXN0ZW0oIm9z YXNjcmlwdCAtZSBiZWVwICZhbXA7Jmd0Oy9kZXYvbnVsbCAmYW1wOyIpOwoJZXhpdDsKfQppZigk c3RhcnQtJmd0O1swXSBuZSAkZW5kLSZndDtbMF0pewoJc3lzdGVtKCJvc2FzY3JpcHQgLWUgYmVl cCAmYW1wOyZndDsvZGV2L251bGwgJmFtcDsiKTsKfQpAY25sID0gc3Vic3RyKCRvcmcsJHN0YXJ0 LSZndDtbMV0sJGN1cmluZGV4LSRzdGFydC0mZ3Q7WzFdKSA9fiBtLyhcbikvZzsKJHN0YXJ0X2xp bmUgPSAkbGluZSAtICQjY25sOwpzeXN0ZW0oIm9wZW4gXCJ0eG10Oi8vb3Blbi8/bGluZT0kc3Rh cnRfbGluZSZhbXA7Y29sdW1uPSRjb2x1bW5cIiIpOwpwcmludCBzdWJzdHIoJG9yZywkc3RhcnQt Jmd0O1sxXSwkZW5kLSZndDtbMV0rJGN1cmluZGV4LSRzdGFydC0mZ3Q7WzFdKTsKCnN1YiByZXBs 
MXsgJF8gPSBzaGlmdCgpOyBzL1teXHRcbl0vIC9zZzsgcmV0dXJuICRfOyB9CicpCgovYmluL2Jh c2ggLWMgJ2V4cG9ydCBfX0NGX1VTRVJfVEVYVF9FTkNPRElORz0weDFGNToweDgwMDAxMDA6MHg4 MDAwMTAwO2VjaG8gLW4gIiRSRVMiIHwgcGJjb3B5IC1wYm9hcmQgZmluZCcKCjwvc3RyaW5nPgoJ CQkJPGtleT5pbnB1dDwva2V5PgoJCQkJPHN0cmluZz5kb2N1bWVudDwvc3RyaW5nPgoJCQkJPGtl eT5rZXlFcXVpdmFsZW50PC9rZXk+CgkJCQk8c3RyaW5nPl4xPC9zdHJpbmc+CgkJCQk8a2V5Pm5h bWU8L2tleT4KCQkJCTxzdHJpbmc+c3RhcnQ8L3N0cmluZz4KCQkJCTxrZXk+b3V0cHV0PC9rZXk+ CgkJCQk8c3RyaW5nPnNob3dBc1Rvb2x0aXA8L3N0cmluZz4KCQkJCTxrZXk+dXVpZDwva2V5PgoJ CQkJPHN0cmluZz40NDg3NUQwQy0wODZCLTRCMjUtOUY2Ri03OUI5QkFBMUMwM0E8L3N0cmluZz4K CQkJPC9kaWN0PgoJCQk8a2V5PmNvbW1hbmQ8L2tleT4KCQkJPHN0cmluZz5leGVjdXRlQ29tbWFu ZFdpdGhPcHRpb25zOjwvc3RyaW5nPgoJCTwvZGljdD4KCQk8ZGljdD4KCQkJPGtleT5jb21tYW5k PC9rZXk+CgkJCTxzdHJpbmc+ZmluZE5leHQ6PC9zdHJpbmc+CgkJPC9kaWN0PgoJPC9hcnJheT4K CTxrZXk+a2V5RXF1aXZhbGVudDwva2V5PgoJPHN0cmluZz5+QEI8L3N0cmluZz4KCTxrZXk+bmFt ZTwva2V5PgoJPHN0cmluZz5TZWxlY3QgQmFsYW5jZWQgWEhUTUwgVGFnczwvc3RyaW5nPgoJPGtl eT5zY29wZTwva2V5PgoJPHN0cmluZz50ZXh0Lmh0bWwgLShjb21tZW50LmJsb2NrLmh0bWwpLCB0 ZXh0LnhtbCAtKGNvbW1lbnQuYmxvY2sueG1sKTwvc3RyaW5nPgoJPGtleT51dWlkPC9rZXk+Cgk8 c3RyaW5nPjE1NEVEMDUzLTk4NjAtNDI0OS1CMTU5LUU3QThENzQ4MkFFNTwvc3RyaW5nPgo8L2Rp Y3Q+CjwvcGxpc3Q+Cg== --===============5507697114227513764==--
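
The two-step search described in the message (look near the caret first, fall back to the whole document only if nothing is found) can be sketched roughly as follows. This is an illustrative Python sketch, not the Perl code inside the attached macro: the function names, the WINDOW constant, and the simplified tag patterns are assumptions made for the example, and it counts characters rather than bytes. It only covers the backward half of the job, i.e. finding the innermost unclosed opening tag to the left of the caret, and it ignores HTML void elements such as a bare br tag.

```python
import re

# Simplified tag patterns (assumptions for this sketch, not the macro's exact regexps).
TAG = re.compile(r"<(/)?(\w[\w:-]*)[^>]*>")   # opening or closing tag
SELF_CLOSING = re.compile(r"<[^>]*/>")        # e.g. <br/> or <img src="x" />
WINDOW = 6000  # size of the "near the caret" window mentioned in the message


def _blank(match):
    # Replace a self-closing tag with spaces so that offsets stay unchanged.
    return " " * len(match.group(0))


def innermost_open_tag(text_before_caret, start):
    """Scan tags from `start`; return (name, offset) of the innermost
    opening tag that is still unclosed at the end of the text, or None."""
    stack = []
    for m in TAG.finditer(text_before_caret, start):
        is_closing, name = m.group(1), m.group(2)
        if is_closing is None:
            stack.append((name, m.start()))
        else:
            # Pop until the matching opening tag has been removed.
            while stack:
                if stack.pop()[0] == name:
                    break
    return stack[-1] if stack else None


def find_enclosing_open_tag(document, caret):
    """Step 1: search only the last WINDOW characters before the caret.
    Step 2: if that finds nothing, rescan from the start of the document."""
    before = SELF_CLOSING.sub(_blank, document[:caret])
    near = max(0, caret - WINDOW)
    found = innermost_open_tag(before, near)
    if found is None and near > 0:
        found = innermost_open_tag(before, 0)
    return found
```

For a caret placed inside a paragraph, find_enclosing_open_tag(document, caret) returns the tag name and the offset where that opening tag starts. The fast first pass is what keeps the common case cheap; the full rescan is only paid for when the enclosing tag lies further back than the window.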