[TxMt] Re: Search & replace regex question
Hans-Jörg Bibiko
bibiko at eva.mpg.de
Wed Sep 3 20:04:02 UTC 2008
On 03.09.2008, at 18:17, Allan Odgaard wrote:
> On 3 Sep 2008, at 12:53, Hans-Jörg Bibiko wrote:
>> Is there also a plan to support within the replacement format string
>> Oniguruma's named groups and the entire back reference functionality
>> (including back reference with nested levels)?
>
> The named captures will be available as $variables.
This will be awesome ;)
> Not sure what the other thing you refer to is. Can you give an
> example?
OK. Maybe you remember that Thomas Aylott and I fiddled around to
implement a command that can select/find balanced HTML/XML tags.
We eventually found a solution, but it takes a great many lines of
source code (the command should be in the TM trunk, under experimental).
A while ago I read Oniguruma's RE.txt carefully. This kind of
match is supported natively by Oniguruma. It is called 'back
reference with nest level'.
Example 1:
I have a string: "<foo>f<foo>b<bar>123<bar>456</bar></bar>bb</foo>f</foo>"
and this regexp (please don't be frightened ;):
(?<element>\g<stag>\g<content>*\g<etag>){0}(?<stag><\g<name>\s*>){0}(?<name>[a-zA-Z_:]+){0}(?<content>[^<&]+(\g<element>|[^<&]+)*){0}(?<etag></\k<name+1>>){0}\g<element>
If I run this through Oniguruma I get these named groups:
[syntax: group-name (group number): (string indices start-stop) content]
stag (2): (20-25) <bar>
content (4): (5-49) f<foo>b<bar>123<bar>456</bar></bar>bb</foo>f
element (1): (0-55) <foo>f<foo>b<bar>123<bar>456</bar></bar>bb</foo>f</foo>
etag (5): (49-55) </foo>
name (3): (21-24) bar
Example 2:
string: "o>b<bar>123<bar>456</bar></bar>bb</foo>f</foo>"
stag (2): (11-16) <bar>
content (4): (8-25) 123<bar>456</bar>
element (1): (3-31) <bar>123<bar>456</bar></bar>
etag (5): (25-31) </bar>
name (3): (12-15) bar
In other words, it should be possible to use Oniguruma's power to
find/select the next balanced HTML/XML tag, depending on the position
of the caret, with just one more or less easy regular expression.
As far as I know Ruby 1.9 supports this (?). I myself am using
a C program linked against the onig lib to match this nested stuff.
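For what it's worth, here is a minimal Ruby sketch of the idea, assuming
Ruby 1.9 or later accepts the pattern unchanged (its regex engine derives
from Oniguruma). The helper name next_balanced_element and the caret
offsets are made up for illustration; this is not the command from the
TM trunk.

```ruby
# The pattern from Example 1, written in extended mode so that whitespace
# inside it is ignored and each named group gets its own line.
BALANCED_ELEMENT = Regexp.new(<<'PATTERN', Regexp::EXTENDED)
(?<element> \g<stag> \g<content>* \g<etag> ){0}
(?<stag>    < \g<name> \s* > ){0}
(?<name>    [a-zA-Z_:]+ ){0}
(?<content> [^<&]+ (\g<element> | [^<&]+)* ){0}
(?<etag>    </ \k<name+1> > ){0}
\g<element>
PATTERN

# Hypothetical helper: find the first balanced element that starts at or
# after the caret offset. Returns [start, stop, matched text], or nil.
def next_balanced_element(text, caret = 0)
  m = BALANCED_ELEMENT.match(text, caret)
  m && [m.begin(0), m.end(0), m[0]]
end

text = "<foo>f<foo>b<bar>123<bar>456</bar></bar>bb</foo>f</foo>"
p next_balanced_element(text, 0)  # caret at the start, as in Example 1
p next_balanced_element(text, 9)  # caret inside the second <foo>, as in Example 2
```

With the caret at 9 the search starts in the middle of the second <foo>
start tag, so the first complete element it can find is the outer <bar>
element. The offsets it reports are relative to the full string, not to
the truncated string of Example 2.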
Furthermore this issue leads to a question: Would it be possible to
use TM's Oniguruma engine from outside, meaning an API?
Best,
--Hans