On 13. Oct 2004, at 12:50, Dominique PERETTI wrote:
- If you reopen it in TextMate, it still appears as UTF-8... BUT if you
reopen it in BBEdit, it opens as a MacRoman file (with the accents wrong).
I don't have BBEdit, but I think BBEdit _always_ opens as MacRoman. And you have to manually tell it to open it with the proper encoding (at least that was my experience when I tried it >2 years ago, both my iso-8859-1 and utf-8 files were all opened as MacRoman by BBEdit)!?!
My guess is that TextMate encodes the characters correctly but forgets to write the BOM. I think so because I get the same wrong results in FlashMX if I save the file as utf8-(no BOM) from BBEdit.
The BOM is optional in utf-8, and generally you _don't_ want it, since a) it makes no sense as there is no byte-order ambiguity, b) it ruins the nice property of utf-8 being compatible with ascii, c) utf-8 encoding is really easy to recognize even without a BOM.
But if several programs only treat utf-8 properly when it has a BOM, I will consider adding an option -- though I would recommend against using it of course! ;)
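To make points b) and c) concrete, here is a minimal Python sketch. It is only an illustration, not how TextMate or BBEdit actually work; the helper names `has_utf8_bom` and `looks_like_utf8` and the sample word are made up for this example.

```python
import codecs

def has_utf8_bom(data: bytes) -> bool:
    """True if the data starts with the three-byte UTF-8 BOM (EF BB BF)."""
    return data.startswith(codecs.BOM_UTF8)

def looks_like_utf8(data: bytes) -> bool:
    """True if the bytes decode as strict UTF-8 (no BOM required)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return True

sample = "café"
as_utf8 = sample.encode("utf-8")          # no BOM is written
as_utf8_bom = sample.encode("utf-8-sig")  # same bytes, prefixed with a BOM
as_macroman = sample.encode("mac_roman")
as_latin1 = sample.encode("iso-8859-1")

# c) accented text in a legacy 8-bit encoding is almost never valid UTF-8,
#    so UTF-8 can be recognized without a BOM.
print(looks_like_utf8(as_utf8))      # valid UTF-8
print(looks_like_utf8(as_macroman))  # not valid UTF-8
print(looks_like_utf8(as_latin1))    # not valid UTF-8

# b) pure-ASCII text is byte-identical in UTF-8, but not once a BOM is added.
plain = "hello"
print(plain.encode("utf-8") == plain.encode("ascii"))
print(plain.encode("utf-8-sig") == plain.encode("ascii"))

# What "accents wrong" means: UTF-8 bytes misread as MacRoman.
print(as_utf8.decode("mac_roman"))
```

The validity check is a heuristic: pure-ASCII files pass it too, which is harmless since ASCII reads the same either way.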
Kind regards,
Allan
On Wed, 13 Oct 2004 13:07:02 +0200, Allan Odgaard <allan@macromates.com> wrote:
I don't have BBEdit, but I think BBEdit _always_ opens as MacRoman. And you have to manually tell it to open it with the proper encoding (at least that was my experience when I tried it >2 years ago, both my iso-8859-1 and utf-8 files were all opened as MacRoman by BBEdit)!?!
According to the BBEdit manual (http://ftp.barebones.com/pub/manual/BBEdit_8_User_Manual.pdf) this is the procedure it uses to determine the encoding for a file:
1 If the file is well-formed HTML or XML, BBEdit looks for an "encoding=" or <meta charset=> directive.
2 If the file contains a BBEdit state resource, BBEdit uses the encoding stored in the state resource.
3 If the file contains a UTF-8 or UTF-16 (Unicode) byte-order mark, BBEdit opens it as that type of Unicode file.
4 If the file has a resource that contains font information (such as a 'styl' resource) and that resource specifies a multi-byte font, BBEdit opens the file as a Unicode file.
5 If you are opening the file with the Open command, BBEdit uses the encoding specified in the Read As pop-up menu on the Open dialog.
6 Finally, it uses the encoding specified by the "If the file's encoding can't be guessed, use" pop-up menu on the Text Files: Opening panel of the Preferences window.
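The precedence in that list can be sketched roughly in Python. This is only a model of the order of checks, not BBEdit's code: the function and parameter names are hypothetical, step 1 is reduced to a crude regex with no well-formedness check, the resource-fork steps (2 and 4) are passed in as plain arguments, "utf-16" for step 4 is an assumption (the manual only says "Unicode"), and the "mac_roman" fallback is an assumed preference setting.

```python
import codecs
import re
from typing import Optional

BOMS = [
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
]

# Crude stand-in for step 1: encoding="..." or charset=... near the top.
DECLARATION = re.compile(rb'''(?:encoding|charset)\s*=\s*["']?([A-Za-z0-9._-]+)''')

def declared_encoding(data: bytes) -> Optional[str]:
    """Step 1 (simplified): markup that declares its own encoding."""
    if not data.lstrip().startswith(b"<"):
        return None
    match = DECLARATION.search(data[:1024])
    return match.group(1).decode("ascii").lower() if match else None

def bom_encoding(data: bytes) -> Optional[str]:
    """Step 3: a UTF-8 or UTF-16 byte-order mark at the start of the file."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None

def guess_encoding(
    data: bytes,
    saved_state: Optional[str] = None,  # step 2: encoding from a state resource
    multibyte_font: bool = False,       # step 4: font resource is multi-byte
    read_as: Optional[str] = None,      # step 5: choice made in the Open dialog
    fallback: str = "mac_roman",        # step 6: assumed preference setting
) -> str:
    """Return the first encoding found, checking in the documented order."""
    return (
        declared_encoding(data)
        or saved_state
        or bom_encoding(data)
        or ("utf-16" if multibyte_font else None)  # assumption, see lead-in
        or read_as
        or fallback
    )

# A BOM-less UTF-8 text file with no other hints falls through to step 6.
print(guess_encoding("café".encode("utf-8")))
print(guess_encoding(codecs.BOM_UTF8 + "café".encode("utf-8")))
```

Under these assumptions, a plain UTF-8 file saved without a BOM reaches step 6 and is opened with the fallback encoding, which matches the behaviour reported earlier in the thread.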
~peter