[TxMt] Re: Soft-wrapping text containing multi-byte characters

Allan Odgaard mailinglist at textmate.org
Tue Jun 30 12:31:05 UTC 2015

On 27 Jun 2015, at 6:33, Yoichiro Hasebe wrote:

> I suspect TM2 treats a sequence of multi-byte characters as if it was
> a single word. If that is the case, with text in a language like
> Japanese, where word boundaries are not indicated by spaces, a whole
> sentence or even a paragraph will be processed as just one huge word.

Correct. To wrap properly, TextMate will need to learn about word
boundaries for languages that do not separate words with spaces.

I see CFString has had a hyphenation API since 10.7, so this might be
usable, but I will need to investigate it a bit further. Wrapping is
also not the only place where word boundaries come up, so the “fix”
would need to go beyond just wrapping (e.g. word movement and selection
should also use linguistic word boundary definitions).
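
To illustrate just the wrapping half of the problem, here is a minimal
sketch in Python. It is not TextMate's code and does not use the CFString
API. It assumes that every East Asian Wide or Fullwidth character is its
own break opportunity, which is a rough approximation and not real
linguistic word segmentation (it ignores rules such as punctuation that
must not start a line). The names is_wide, break_units and soft_wrap are
hypothetical, and width is counted in code points, not display columns.

```python
import unicodedata


def is_wide(ch):
    """True for East Asian Wide/Fullwidth characters (kanji, kana, ...)."""
    return unicodedata.east_asian_width(ch) in ("W", "F")


def break_units(text):
    """Split text into chunks that the wrapper will not break inside."""
    units = []
    current = ""
    for ch in text:
        if is_wide(ch):
            # Flush any pending run, then emit the wide character alone,
            # so a line break is allowed before and after it.
            if current:
                units.append(current)
                current = ""
            units.append(ch)
        elif ch.isspace():
            # Whitespace ends the current run and stays attached to it.
            units.append(current + ch)
            current = ""
        else:
            current += ch
    if current:
        units.append(current)
    return units


def soft_wrap(text, width):
    """Greedily pack break units into lines of at most `width` code points."""
    lines = []
    line = ""
    for unit in break_units(text):
        # Trailing whitespace does not count against the width.
        if line and len(line) + len(unit.rstrip()) > width:
            lines.append(line.rstrip())
            line = ""
        if not line and unit.isspace():
            continue  # do not start a line with whitespace
        line += unit
    if line:
        lines.append(line.rstrip())
    return lines
```

With only the whitespace rule, a Japanese sentence comes out of
break_units as a single unit and can never be wrapped, which is the
behaviour described above. With the wide-character rule,
soft_wrap("これは日本語の文章です", 4) can break between any two characters.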
