[TxMt] encoding detection

Sune Foldager cryo at cyanite.org
Mon Mar 27 09:05:14 UTC 2006


Yvon Thoraval wrote:

>
> I thought there are char codes in Mac Roman not used in cp1252 and else
> ...

cp-1252 uses close to the entire 8-bit range (except the lower control
area, which none of these charsets uses for printable characters), so
it's really impossible to single it out. I don't know about Mac Roman,
but latin-1 and latin-9 (ISO 8859-15) are nearly identical: they differ
in only eight code points.
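
For anyone who wants to check the overlap rather than take my word for it, here is a rough sketch using Python's bundled codecs. The codec names (cp1252, latin_1, iso8859_15, mac_roman) are Python's and the helper names are made up for illustration; I'm assuming those codec tables match the charsets discussed here.

```python
def decode_byte(value, encoding):
    """Decode a single byte value, or return None if the codec leaves it undefined."""
    try:
        return bytes([value]).decode(encoding)
    except UnicodeDecodeError:
        return None


def undefined_bytes(encoding):
    """Byte values (0-255) that the codec refuses to decode."""
    return [v for v in range(256) if decode_byte(v, encoding) is None]


def differing_bytes(enc_a, enc_b):
    """Byte values that map to different characters in the two codecs."""
    return [v for v in range(256)
            if decode_byte(v, enc_a) != decode_byte(v, enc_b)]


if __name__ == "__main__":
    # Bytes cp1252 leaves unassigned (the only gaps in its 8-bit range).
    print([hex(v) for v in undefined_bytes("cp1252")])
    # Positions where latin-1 and latin-9 disagree.
    print([hex(v) for v in differing_bytes("latin_1", "iso8859_15")])
    # How many positions cp1252 and Mac Roman disagree on.
    print(len(differing_bytes("cp1252", "mac_roman")))
```

Note that a byte being defined in one table and undefined in another only helps detection when that byte actually shows up in the file, which is why the gaps in cp1252 are of little practical use.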

So I guess you can say: if any illegal latin-1 char is used and the text
is not UTF-8 (which can be detected reliably), then it's one of Mac Roman,
latin-9 or cp-1252 :p. Not exactly useful.
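
The rule above could be sketched roughly like this in Python. The function names are hypothetical, and I'm assuming "illegal latin-1 char" means a byte in the 0x80-0x9F range (C1 controls in ISO 8859-1, not printable text). Python's own latin_1 codec decodes every byte, so that range has to be tested by hand.

```python
def is_utf8(data):
    """True if the bytes form valid UTF-8 (pure ASCII counts as valid)."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def candidate_encodings(data):
    """Narrow down the possible encodings using the rule from this post."""
    if is_utf8(data):
        return ["utf-8"]
    # Assumption: "illegal latin-1" means any byte in 0x80-0x9F.
    if any(0x80 <= b <= 0x9F for b in data):
        # Candidate list taken as-is from the post.
        return ["mac_roman", "iso8859_15", "cp1252"]
    # Nothing rules latin-1 out, so all four remain possible.
    return ["latin_1", "mac_roman", "iso8859_15", "cp1252"]


if __name__ == "__main__":
    print(candidate_encodings("naïve".encode("utf-8")))
    print(candidate_encodings(b"\x93quoted\x94"))
    print(candidate_encodings(b"caf\xe9"))
```

Either way you end up with a list of candidates rather than an answer, which is the point: the byte values alone don't settle it.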

-- Sune.




