[TxMt] Word Count

Jonas Steverud jtvrud at bredband.net
Wed May 28 16:55:57 UTC 2008


On 27 May 2008, at 23:24, Mark Eli Kalderon wrote:
>
> On May 27, 2008, at 10:11 PM, Jonas Steverud wrote:
>
>>
>> On 20 May 2008, at 20:07, Patrick McElhaney wrote:
>>> CTRL+SHIFT+N. It's in the "Text" bundle.
>>
>> One should note, though, that C-S-N doesn't return the number  
>> of characters, but the number of bytes. This is only an issue if  
>> you use multi-byte characters, but that is common enough to make the  
>> C-S-N command a bit broken IMHO.
>>
>> I would be very grateful if anyone could point to a function that  
>> does the equivalent of C-S-N but returns the proper number of  
>> characters and not bytes (the ideal would be "full" statistics;  
>> words, characters and bytes). I made a quick hack but realised  
>> that I did not know how to tell Perl which character encoding was  
>> in use, i.e. whether it was UTF-8 or Latin-1.
>
> The command in the Text bundle does report the full  
> statistics: lines, words, bytes. Perhaps you are using a modified  
> word count command that uses the same key binding.

Yes, but I am not interested in the number of bytes; I would like to  
know the number of characters, which is not the same thing.  
Räksmörgås is ten characters but is reported as 13 bytes, since å, ä  
and ö are each stored as two bytes in UTF-8. I use the Statistics for  
Document / Selection (Word Count) command from the Text bundle, and  
the Ruby script behind it uses wc -l for its statistics, which is not  
Unicode-aware AFAIK.
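
To make the byte/character difference concrete, here is a rough Ruby sketch of the kind of "full" statistics I have in mind. It is not the Text bundle's actual script: text_stats is a made-up helper name, the encoding is something the caller has to supply (UTF-8 is only my assumed default), and it relies on String#force_encoding, so it needs Ruby 1.9 or later.

```ruby
# Hypothetical helper, not the Text bundle's command.
# Assumes the caller knows the encoding of the raw bytes (default: UTF-8).
def text_stats(raw, encoding = "UTF-8")
  # Relabel the bytes with the assumed encoding; nothing is transcoded.
  text = raw.dup.force_encoding(encoding)
  {
    :lines => text.count("\n"),  # newline characters, like wc -l
    :words => text.split.size,   # whitespace-separated words
    :chars => text.length,       # characters (code points), not bytes
    :bytes => text.bytesize      # raw byte count, like wc -c
  }
end

# "Räksmörgås", written with escapes so the bytes are unambiguous.
utf8   = text_stats("R\u00E4ksm\u00F6rg\u00E5s")
latin1 = text_stats("R\xE4ksm\xF6rg\xE5s".b, "ISO-8859-1")

puts "UTF-8:   #{utf8[:chars]} chars, #{utf8[:bytes]} bytes"      # 10 chars, 13 bytes
puts "Latin-1: #{latin1[:chars]} chars, #{latin1[:bytes]} bytes"  # 10 chars, 10 bytes
```

The encoding has to come from somewhere outside the text itself; if the bytes are not valid in the encoding given, split may raise an ArgumentError, so a real command would need to handle that.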

/Jonas

