[TxMt] Textencoding problems renders scripts useless

Allan Odgaard throw-away-1 at macromates.com
Mon Mar 12 08:02:30 UTC 2007


On 11. Mar 2007, at 22:54, Stefan wrote:

> Technically speaking, I would suppose, TextMate should read the
> clipboard using the current clipboard's encoding [likely to be UTF*],
> then convert it to the encoding of the current document and paste it
> in the current document.

It does -- and the current document is Unicode.

> Since TextMate obviously does a transformation, I wonder, why it
> convert this way - which breaks the current document somehow.

No. What happens is that while you type, you stay within the
characters that Latin-1 defines, so the document can still be saved
as Latin-1.

When you paste from Word, you get (presumably) curly quotes inserted
into your document. These characters do not exist in Latin-1, so the
next time you save your document, TextMate will “upgrade” it to UTF-8.
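
To illustrate the effect, here is a minimal sketch in Python. It is not
TextMate's actual code; the names fits_latin1 and choose_encoding are
made up for the example, and the rule "fall back to UTF-8 when Latin-1
cannot hold the text" is only my description of the behaviour above.

```python
def fits_latin1(text: str) -> bool:
    """Return True if every character in text can be encoded as Latin-1."""
    try:
        text.encode("latin-1")
        return True
    except UnicodeEncodeError:
        return False


def choose_encoding(text: str) -> str:
    """Hypothetical save rule: keep Latin-1 if possible, else use UTF-8."""
    return "latin-1" if fits_latin1(text) else "utf-8"


# Typed text: straight quotes and an accented letter, all within Latin-1.
typed = 'caf\u00e9 "quoted"'

# Pasted text: curly quotes U+201C and U+201D, which Latin-1 lacks.
pasted = "\u201cquoted\u201d"

print(choose_encoding(typed))
print(choose_encoding(pasted))
```

The typed string still fits in Latin-1, while the pasted one forces the
switch, because the curly quotes have no Latin-1 code point at all.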

My advice, whenever people bring up non-UTF-8 encodings, is to stop
resisting and go with UTF-8 :) UTF-8 can represent all the
characters you can type or paste into your document, it is identified
with 99.9999% certainty when the file is loaded from disk, it is an
ASCII superset and thus compatible with basically all programs that
just expect ASCII (compilers, script interpreters), it is compact, it
was recommended in 1998 by the IETF for all future internet protocols,
etc.
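
Two of these properties are easy to check yourself. The following is a
rough sketch in Python, again not what TextMate does internally; the
helper name looks_like_utf8 is invented for the example, and a strict
decode is only one simple way to test for valid UTF-8.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# ASCII superset: pure ASCII bytes decode identically either way.
ascii_source = b"print('hello')\n"
same_text = ascii_source.decode("ascii") == ascii_source.decode("utf-8")

# Detection: the same word saved in two encodings.
as_latin1 = "caf\u00e9".encode("latin-1")  # e-acute is the single byte 0xE9
as_utf8 = "caf\u00e9".encode("utf-8")      # e-acute is the two bytes 0xC3 0xA9

print(same_text)
print(looks_like_utf8(as_latin1))
print(looks_like_utf8(as_utf8))
```

Text in a legacy 8-bit encoding rarely happens to form valid UTF-8
multi-byte sequences, which is why the detection is so reliable.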



