[TxMt] New "Unicode" bundle in the Review trunk
Walter Dörwald
walter at livinglogic.de
Sun Jun 1 22:04:14 UTC 2008
Hans-Jörg Bibiko wrote:
> On 30.05.2008, at 17:32, Walter Dörwald wrote:
>
>> Hans-Joerg Bibiko wrote:
>>> Dear all,
>>> there's a new bundle called "Unicode" in the review trunk. It is
>>> meant to be a place where we can gather any kind of scripts,
>>> commands, etc. which are related to general Unicode issues, meaning
>>> non-ASCII. This should also be a place ...
>>
>> One small note:
>>
>> In the character name script you should probably call
>> unicodedata.name() with a second argument in case the character has no
>> name, i.e. replace
>>
>> res = a + " : " + unicodedata.name(a)
>>
>> with
>>
>> res = a + " : " + unicodedata.name(a, "U+%04X" % ord(a))
> Thanks for the hint! These are more or less the first scripts I have
> written in Python ;)
> I chose it because Python comes with some Unicode data installed by
> default.
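As a minimal sketch of the two-argument call suggested above: the helper name `describe` is made up for this illustration, and the snippet is written for Python 3 rather than the Python 2 used by the bundle command. Without the second argument, unicodedata.name() raises ValueError for a character that has no name; with it, the fallback string is returned instead.

```python
import unicodedata


def describe(ch):
    """Return 'char : NAME', or 'char : U+XXXX' if the character is unnamed.

    `describe` is a hypothetical helper, not part of the bundle.
    """
    # The second argument is the default returned when no name exists.
    return "%s : %s" % (ch, unicodedata.name(ch, "U+%04X" % ord(ch)))


print(describe("A"))             # a character that has a name
print(repr(describe("\x07")))    # a control character without a name
```

The control character is printed through repr() only so that the result stays visible on a terminal.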
Here's another patch (against the current version). It shows both the
codepoint and the name.
BTW, you don't have to use a regular expression to split a string into
characters, simply iterating through it does the trick:
Index: Commands/Show Unicode Names.tmCommand
===================================================================
--- Commands/Show Unicode Names.tmCommand (revision 9813)
+++ Commands/Show Unicode Names.tmCommand (working copy)
@@ -8,11 +8,13 @@
<string>#!/usr/bin/python
import unicodedata
import sys
-import re
-for a in re.compile("(?um)(.)").split(unicode(sys.stdin.read(), "UTF-8")):
-    if (len(a)==1) and (a != '\n'):
-        res = a + " : " + unicodedata.name(a, "U+%04X" % ord(a))
+for a in unicode(sys.stdin.read(), "UTF-8"):
+    if a != '\n':
+        res = u"%s : U+%04X" % (a, ord(a))
+        name = unicodedata.name(a, None)
+        if name:
+            res += u" : %s" % name
         print res.encode("UTF-8")</string>
<key>fallbackInput</key>
<string>character</string>
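For anyone who wants to try the patched loop outside TextMate, here is a rough Python 3 rendering of it. The function name `unicode_lines` is an assumption made for this sketch, and it yields the lines instead of printing them; the logic otherwise follows the patch: iterate over the string directly, skip newlines, and append the name only when one exists.

```python
import unicodedata


def unicode_lines(text):
    """Yield 'char : U+XXXX' or 'char : U+XXXX : NAME' per character.

    `unicode_lines` is a hypothetical name; newlines are skipped,
    as in the patched command.
    """
    for ch in text:  # iterating a str yields its characters one by one
        if ch == "\n":
            continue
        line = "%s : U+%04X" % (ch, ord(ch))
        name = unicodedata.name(ch, None)  # None when the character is unnamed
        if name:
            line += " : %s" % name
        yield line
```

Calling list(unicode_lines("\u00e9\n\u20ac")) should produce two lines, one for U+00E9 and one for U+20AC, with no entry for the newline in between.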
>> Furthermore it would be great if this script could display all
>> information there is in the Python Unicode database, i.e. stuff like
>>
>> unicodedata.category()
>> unicodedata.bidirectional()
>> unicodedata.decimal()
> Yes. I have such a script in Perl which also shows up info about Unicode
> code points etc.
OK, now I see that the script displays information about every character
in the selection. Adding more info might be a space problem.
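One possible way to keep the extra properties compact is to put them all on a single line per character. This is only a sketch in Python 3: the names `char_info` and `format_info` and the output layout are assumptions for illustration, not what the bundle does. Note that unicodedata.decimal() needs a default as well, since it raises ValueError for characters without a decimal value.

```python
import unicodedata


def char_info(ch):
    """Collect a few unicodedata properties for one character.

    `char_info` is a hypothetical helper for this illustration.
    """
    return {
        "codepoint": "U+%04X" % ord(ch),
        "name": unicodedata.name(ch, None),          # None if unnamed
        "category": unicodedata.category(ch),        # e.g. 'Lu', 'Nd'
        "bidirectional": unicodedata.bidirectional(ch),
        "decimal": unicodedata.decimal(ch, None),    # None if not a decimal digit
    }


def format_info(ch):
    """Render the properties on one line, leaving out missing values."""
    info = char_info(ch)
    fields = [info["codepoint"], info["category"], info["bidirectional"]]
    if info["decimal"] is not None:
        fields.append("decimal=%d" % info["decimal"])
    if info["name"]:
        fields.append(info["name"])
    return " : ".join(fields)


print(format_info("7"))
print(format_info("A"))
```

For a long selection this still produces one line per character, so the space concern remains; limiting the extra fields to a single selected character might be the simpler option.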
Another problem: Using Ctrl-Shift-U as the shortcut hides the "Convert
To Lowercase" command.
Servus,
Walter