[TxMt] Re: Broken support for Unicode characters above U+FFFF

Mojca Miklavec mojca.miklavec.lists at gmail.com
Sun May 8 22:21:32 UTC 2011


On Sun, May 8, 2011 at 23:57, Molyneux, Phil wrote:
> The first character of your example is Unicode MATHEMATICAL ITALIC SMALL A
> (Unicode hex U+1D44E, UTF-8 hex 0xF09D918E, UTF-16 hex 0xD835DC4E, UTF-32
> hex 0x0001D44E), while what is displayed in TextMate is Unicode HANGUL
> SYLLABLE POELP (Unicode hex U+D44E, UTF-8 hex 0xED918E, UTF-16 hex 0xD44E,
> UTF-32 hex 0x0000D44E). Your other two mangled characters are Unicode
> U+1D44F and U+1D450, which are similarly coerced to U+D44F and U+D450.

I didn't analyse the hex codes before, but now that you have posted
the numbers, I find it even weirder. Apparently the editor is not
simply ignoring the extra bytes; it is reducing the code points
"mod 2^16" (0x1D44E mod 0x10000 = 0xD44E). That sounds like
"unsigned short int" being used in place of "unsigned int" in at
least one place in the source code.
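
To make the arithmetic concrete, here is a minimal sketch in Python. It
assumes the damage is a plain reduction mod 2^16 of the code point; I have
not seen TextMate's source, so this only reproduces the symptom, and the
function names are made up for illustration.

```python
import unicodedata


def truncate_to_16_bits(code_point: int) -> int:
    """Keep only the low 16 bits, as a 16-bit unsigned integer would."""
    return code_point % 2**16  # identical to code_point & 0xFFFF


def hex_bytes(code_point: int, encoding: str) -> str:
    """Hex dump of a single code point in the given encoding."""
    return chr(code_point).encode(encoding).hex().upper()


if __name__ == "__main__":
    # The three code points mentioned in this thread.
    for cp in (0x1D44E, 0x1D44F, 0x1D450):
        low = truncate_to_16_bits(cp)
        print(f"U+{cp:04X} ({unicodedata.name(chr(cp))}) -> "
              f"U+{low:04X} ({unicodedata.name(chr(low))})")
        print(f"  UTF-8: {hex_bytes(cp, 'utf-8')} -> {hex_bytes(low, 'utf-8')}")
```

Running it prints each original character next to the one it collapses
to, together with the UTF-8 bytes of both, which can be compared with the
values Phil listed.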

> I only have one font that contains glyphs for your characters (DejaVu Serif Italic)

The glyphs can also be found in Cambria Math (commercial), XITS Math,
Asana Math, (Neo) Euler Math and a few others.

Mojca

