From miken32@gmail.com Sun Jul 5 03:57:59 2009
From: Michael Newton
To: textmate@lists.macromates.com
Subject: [TxMt] Re: textmate Digest, Vol 14, Issue 2
Date: Sat, 04 Jul 2009 21:57:57 -0600
Message-ID: <5aefd0c80907042057s6ffd44aexf7c206aa00a88a85@mail.gmail.com>
In-Reply-To:
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="===============5087422617951024552=="

--===============5087422617951024552==
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: quoted-printable

On Sat, Jul 4, 2009 at 6:00 AM, wrote:
> ---------- Forwarded message ----------
> From: Scott Haneda
> To: TextMate users
> Date: Fri, 3 Jul 2009 12:40:55 -0700
> Subject: [TxMt] Re: The dreaded Regexp question
> On Jul 3, 2009, at 10:00 AM, Michael Newton wrote:
>
>> Sorry, I know this isn't particularly on-topic (aside from the fact
>> that I'm using Textmate!) but I'm not having luck with the search
>> engines.
>>
>> I have a bunch of HTML that needs to be converted to XHTML, notably
>> needs to be which is easy
>> enough. Problem is, it's PHP so there are things like
> type=""> which I'm having troubles with. So how can
>> I create a regular expression that captures the guts of the HTML
>> brackets, while ignoring any PHP brackets it might come across inside
>> the HTML?
>
>
> I used this web tool to help me:
> http://www.gskinner.com/RegExr/
>
> I did my best to put in single tics, quote marks etc:
> "> some type and then another input " name='value' class="foo">
>
>
>
>
>
> My regex pattern was:
> (\s]+))?)+\s*|\s*)/?(>)
>
> My replace pattern was:
> $1$2$3/>
> * You could do less pattern grouping, I did so as I was working through it.
>
> Result was:
> " type=""/> some type and then another input " name='value' class="foo" class="foo"/>
>
>
>
>
>
> The one issue is it will alter plain closing tags, like will become
> and I could not work that out. Either you can solve that in the regex
> by ignoring anything with a "/" in it already, or, I may be inclined
> to cheat. With the recording ability of textmate, I would try
> something like:
> find "/>"
> replace "#tmp#"
> find (\s]+))?)+\s*|\s*)/?(>)
> replace $1$2$3/>
> find "#tmp#"
> replace "/>"
>
> It should happen pretty quick.
> --
> Scott * If you contact me off list replace talklists@ with scott@ *
>

Thanks, I actually figured it out just now as I was composing a reply.
Negative lookbehind assertion only matches ">" if it's not preceded by
"?" or "/":

find: <((?:input|img|link|meta|hr|br|area).*?)(?
replace: <$1/>

turns this:
> "> /> "/> baz

into this:
/> "/> /> "/> baz

Just need to see if it works in TM when I get back to my office (and my Mac!)
Definitely bookmarking that site though, and will look more into this
"recording ability."

--
Michael Newton
http://mike.eire.ca/

--===============5087422617951024552==--
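
For anyone who wants to try the lookbehind idea outside TextMate, here is a rough Python sketch. The find pattern in the message stops at "(?", so the sketch assumes the missing piece is `(?<![?/])>`, inferred only from the sentence describing the lookbehind. The function name `self_close` and the sample markup are hypothetical, made up for illustration, and Python's `re` module stands in for TextMate's regex engine.

```python
import re

# ASSUMPTION: the tail of the pattern, "(?<![?/])>", is reconstructed from the
# prose ('only matches ">" if it is not preceded by "?" or "/"'); the pattern
# in the message itself is cut off after "(?".
VOID_TAG = re.compile(r"<((?:input|img|link|meta|hr|br|area).*?)(?<![?/])>")


def self_close(html: str) -> str:
    """Turn '<tag ...>' into '<tag .../>' for the listed tags.

    A '>' that follows '?' (end of a PHP block) or '/' (tag already closed)
    is skipped by the negative lookbehind.
    """
    return VOID_TAG.sub(r"<\1/>", html)


# Hypothetical sample, one tag per line; not the markup from the message.
sample = "\n".join([
    '<input name="<?=$name?>" type="<?=$type?>">',
    '<br />',
    '<img src="a.png"/>',
    '<hr>',
    '<b>baz</b>',
])

print(self_close(sample))
```

With one tag per line, only the `<input ...>` and `<hr>` lines gain a `/`; the lines that already end in `/>` and the `<b>` element are left alone. One caveat with this form of the pattern: `.` does not cross newlines, but if an already closed tag shares a line with a later tag, the lazy `.*?` can run past the `/>` and end the match at the later tag's `>`, so it is worth checking the result on lines that hold several tags.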