On 11. Mar 2007, at 22:54, Stefan wrote:
Technically speaking, I would suppose, TextMate should read the clipboard using the current clipboard's encoding [likely to be UTF*], then convert it to the encoding of the current document and paste it in the current document.
It does -- and the current document is unicode.
Since TextMate obviously does a transformation, I wonder why it converts this way - which breaks the current document somehow.
No. What happens is that, while you type, you refrain from using characters outside those defined in Latin-1.
When you paste from Word, you get (presumably) curly quotes inserted into your document. These characters do not exist in Latin-1, and so, the next time you save your document, TextMate will “upgrade” it to UTF-8.
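To illustrate the "upgrade" described above, here is a minimal Python sketch. It is not TextMate's actual code; the function name pick_save_encoding is made up for this example, and it simply assumes an editor that keeps the file's encoding when the text fits and falls back to UTF-8 when it does not.

```python
def pick_save_encoding(text, preferred="latin-1"):
    """Hypothetical helper: keep the preferred encoding if every
    character fits in it, otherwise fall back to UTF-8."""
    try:
        text.encode(preferred)
        return preferred
    except UnicodeEncodeError:
        # At least one character has no representation in `preferred`.
        return "utf-8"


# Typed text that stays within Latin-1 (straight quotes, e-acute).
typed = 'He said "caf\u00e9"'

# Pasted text with curly quotes U+201C and U+201D, which Latin-1 lacks.
pasted = "He said \u201ccaf\u00e9\u201d"

print(pick_save_encoding(typed))
print(pick_save_encoding(pasted))
```

The first call reports latin-1 and the second utf-8, because the curly quotes are the only characters that Latin-1 cannot encode.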
My advice, whenever people bring up non-UTF-8 encodings, is to stop resisting, and go with UTF-8 :) UTF-8 can represent all the characters you can type or paste into your document, it is identified with 99.9999% certainty when the file is loaded from disk, it is an ASCII superset and thus compatible with basically all programs that just expect ASCII (compilers, script interpreters), it is compact, it was recommended in 1998 by the IETF for all future internet protocols, etc.
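Two of the properties above can be checked with a short Python sketch: that pure ASCII bytes are already valid UTF-8, and that bytes in another 8-bit encoding usually fail strict UTF-8 validation, which is what makes detection on load reliable. The helper name looks_like_utf8 is an assumption for this example, and strict decoding is only one plausible way to detect the encoding, not necessarily what TextMate does.

```python
def looks_like_utf8(data: bytes) -> bool:
    """Hypothetical detector: True if the bytes decode as strict UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# ASCII superset: ASCII-only bytes decode identically either way.
ascii_source = b"print('hi')\n"
same_text = ascii_source.decode("ascii") == ascii_source.decode("utf-8")

# The same word saved in two encodings.
word = "caf\u00e9"
latin1_bytes = word.encode("latin-1")  # e-acute is the single byte 0xE9
utf8_bytes = word.encode("utf-8")      # e-acute is the two bytes 0xC3 0xA9

print(same_text)
print(looks_like_utf8(utf8_bytes))
print(looks_like_utf8(latin1_bytes))
```

The Latin-1 bytes are rejected because 0xE9 announces a multi-byte UTF-8 sequence that never arrives; non-ASCII text in a legacy encoding rarely happens to form valid UTF-8 sequences by accident.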