[TxMt] Re: The dreaded Regexp question
Scott Haneda
talklists at newgeo.com
Fri Jul 3 19:40:55 UTC 2009
On Jul 3, 2009, at 10:00 AM, Michael Newton wrote:
> Sorry, I know this isn't particularly on-topic (aside from the fact
> that I'm using Textmate!) but I'm not having luck with the search
> engines.
>
> I have a bunch of HTML that needs to be converted to XHTML, notably
> <input type="text"> needs to be <input type="text"/> which is easy
> enough. Problem is, it's PHP so there are things like <input
> type="<?php echo $type?>"> which I'm having troubles with. So how can
> I create a regular expression that captures the guts of the HTML
> brackets, while ignoring any PHP brackets it might come across inside
> the HTML?
I used this web tool to help me:
http://www.gskinner.com/RegExr/
I did my best to include single quotes, double quotes, etc. in the test input:
<input type="<?php echo $type?>"> some type and then another input
<input type="<?php echo $type?>" name='value' class="foo">
<input type="some_value">
<input type="$some_$value">
<hr>
<br>
My regex pattern was:
(</?\w+)((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)/?(>)
My replace pattern was:
$1$2$3/>
* You could use fewer capture groups; I added them as I was working
through it.
Result was:
<input type="<?php echo $type?>" type="<?php echo $type?>"/> some type
and then another input <input type="<?php echo $type?>" name='value'
class="foo" class="foo"/>
<input type="some_value" type="some_value"/>
<input type="$some_$value" type="$some_$value"/>
<hr/>
<br/>
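The doubled attributes in the result above (type="..." type="...") look like they come from $3 in the replace pattern: group 3 sits inside a repeated group, so it holds only the last attribute matched, which group 2 already contains. Here is a minimal sketch of that, using Python's re module as a stand-in for TextMate's find and replace (an assumption on my part; Python writes backreferences as \1 rather than $1, and the convert helper is a made-up name):

```python
import re

# Find pattern copied verbatim from the post.
PATTERN = re.compile(
    r"""(</?\w+)((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)/?(>)"""
)


def convert(html, replacement):
    """Apply the find pattern with the given replacement template."""
    return PATTERN.sub(replacement, html)


sample = '<input type="<?php echo $type?>">'

# Equivalent of the post's "$1$2$3/>": the last attribute appears twice.
print(convert(sample, r"\1\2\3/>"))
# <input type="<?php echo $type?>" type="<?php echo $type?>"/>

# Dropping the third backreference keeps each attribute once.
print(convert(sample, r"\1\2/>"))
# <input type="<?php echo $type?>"/>
```

So a replace pattern of $1$2/> may be all that is needed, though I have only checked this against Python's engine, not TextMate's.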
The one issue is that it also alters plain closing tags, so </a> becomes
</a/>, and I could not work that out. Either you can solve that in the
regex by ignoring any tag that already contains a "/", or I may be
inclined to cheat. With TextMate's macro recording, I would try
something like:
find "/>"
replace "#tmp#"
find (</?\w+)((\s+\w+(\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)+\s*|\s*)/?(>)
replace $1$2$3/>
find "#tmp#"
replace "/>"
It should happen pretty quick.
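For the other option, solving it in the regex, one possibility is to stop the first group from accepting "</" at all, so closing tags never match. The sketch below again uses Python's re module as a stand-in for TextMate (an assumption). The VOID_TAGS list and the self_close helper are made-up names, and limiting the match to a list of empty elements is my own addition, so that tags like <a href="..."> are left open:

```python
import re

# Hypothetical list of empty elements to self-close; adjust to taste.
VOID_TAGS = "input|br|hr|img|meta|link"

# Attribute sub-pattern taken from the post's regex (non-capturing here).
ATTRS = r"""(?:\s+\w+(?:\s*=\s*(?:".*?"|'.*?'|[^'">\s]+))?)*"""

# Group 1 must start with "<" plus a listed tag name, so "</a>" never matches.
OPEN_TAG = re.compile(r"(<(?:" + VOID_TAGS + r")\b" + ATTRS + r")\s*/?>")


def self_close(html):
    """Rewrite the listed empty elements as self-closing tags."""
    return OPEN_TAG.sub(r"\1/>", html)


print(self_close('<input type="<?php echo $type?>"> <a href="x">go</a> <br/>'))
# <input type="<?php echo $type?>"/> <a href="x">go</a> <br/>
```

The optional "/" before ">" is consumed by the pattern, so tags that are already self-closed come out unchanged and no placeholder pass is needed. PHP blocks that contain the same quote character as the surrounding attribute would still trip this up.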
--
Scott * If you contact me off list replace talklists@ with scott@ *