[TxMt] invisibles bundle update; stupefy quotes/hyphens and zap non-ASCII added

Allan Odgaard allan at macromates.com
Thu Jan 20 18:32:59 UTC 2005


On Jan 20, 2005, at 18:57, Eric Hsu wrote:

> On the other hand, extended ASCII does depend on encoding, and I'm not 
> sure how standard x80-xFF are.

Since TM uses UTF-8 to talk with external commands, you don't have to 
worry about encodings. The non-printable high-bit characters are 
0x80-0x9F, and in UTF-8 each of them becomes a two-byte sequence 
matching this pattern: “\xC2[\x80-\x9F]” (obtained using: 
“printf '\x80\x9F'|iconv -f iso-8859-1 -t utf-8|xxd”; the quotes keep 
the shell from eating the backslashes).
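
For anyone who wants to reproduce that mapping, here is a rough sketch. It uses od instead of xxd and octal escapes instead of \x, on the assumption that not every system has xxd or a printf that understands \x. The function name c1_as_utf8 is made up for illustration.

```shell
# Encode the two endpoints of the C1 control range as UTF-8 and dump
# the resulting bytes as hex. \200 is 0x80 and \237 is 0x9F in octal.
c1_as_utf8() {
    printf '\200\237' \
        | iconv -f ISO-8859-1 -t UTF-8 \
        | od -An -tx1 \
        | tr -d ' \n'
    echo
}

c1_as_utf8   # prints c280c29f
```

Both endpoints come out with a C2 lead byte and the original byte as the second byte, which is where the \xC2[\x80-\x9F] pattern comes from.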

So my candidate for a UTF-8-friendly zap gremlins becomes:
    perl -pe 's/[^\t\n\x20-\xFF]|\xC2[\x80-\x9F]//g'

Does anyone actually have a document with 'gremlins' to test this 
stuff? ;)



