[TxMt] invisibles bundle update; stupefy quotes/hyphens and zap non-ASCII added
Allan Odgaard
allan at macromates.com
Thu Jan 20 18:32:59 UTC 2005
On Jan 20, 2005, at 18:57, Eric Hsu wrote:
> On the other hand, extended ASCII does depend on encoding, and I'm not
> sure how standard x80-xFF are.
Since TM uses UTF-8 to talk with external commands, you don't have to
worry about encodings. The non-printable high-bit characters are
0x80-0x9F, but in UTF-8 those are encoded as two bytes matching this pattern:
“\xC2[\x80-\x9F]” (obtained using: “printf '\x80\x9F'|iconv -f iso-8859-1
-t utf-8|xxd”; the quotes keep the shell from eating the backslashes).
So my candidate for a UTF-8-friendly zap gremlins becomes:
perl -pe 's/[^\t\n\x20-\xFF]|\xC2[\x80-\x9F]//g'
Does anyone actually have a document with 'gremlins' to test this
stuff? ;)
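
Lacking a real document, a synthetic one is probably good enough for a first check. The sketch below is only an illustration: the function names zap_gremlins and sample are made up here, the sample bytes are an assumption about what a 'gremlin' looks like, and it assumes perl is reading raw bytes (no -C switch or PERL_UNICODE set). It feeds one C0 control, one C1 control and one legitimate accented character through the one-liner and dumps the result as hex.

```shell
#!/bin/sh
# Hypothetical wrapper around the one-liner from this message.
zap_gremlins() {
    perl -pe 's/[^\t\n\x20-\xFF]|\xC2[\x80-\x9F]//g'
}

# Synthetic UTF-8 sample, written with octal escapes so that any
# POSIX printf understands it:
#   \001      U+0001, a C0 control character (should be removed)
#   \302\205  U+0085, a C1 control encoded as C2 85 (should be removed)
#   \303\251  U+00E9, e with acute accent (should be kept)
#   \t and \n tab and newline (should be kept)
sample() {
    printf 'a\001b\302\205c\303\251\tz\n'
}

# Hex dump before and after, to compare by eye.
# od is used instead of xxd only because it is part of POSIX.
echo "before:"
sample | od -An -tx1
echo "after:"
sample | zap_gremlins | od -An -tx1
```

If the pattern does what it is meant to, the "after" dump should contain only the bytes 61 62 63 c3 a9 09 7a 0a, that is, the 01 and the c2 85 pair are gone while the accented character, the tab and the newline survive. Note that 0x7F (DEL) is inside the \x20-\xFF range, so this pattern leaves it alone.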