[TxMt] New "Unicode" bundle in the Review trunk

Walter Dörwald walter at livinglogic.de
Sun Jun 1 22:04:14 UTC 2008


Hans-Jörg Bibiko wrote:
 > On 30.05.2008, at 17:32, Walter Dörwald wrote:
 >
 >> Hans-Joerg Bibiko wrote:
 >>> Dear all,
 >>> there's a new bundle called "Unicode" in the review trunk. It is
 >>> meant to be a place where we can gather any kind of scripts,
 >>> commands, etc. which are related to general Unicode issues, i.e.
 >>> non-ASCII text. This should also be a place ...
 >>
 >> One small note:
 >>
 >> In the character name script you should probably call
 >> unicodedata.name() with a second argument in case the character has no
 >> name, i.e. replace
 >>
 >>     res = a + " : " + unicodedata.name(a)
 >>
 >> with
 >>
 >>     res = a + " : " + unicodedata.name(a, "U+%04X" % ord(a))
 > Thanks for the hint! These are more or less the first scripts I have
 > written in Python ;)
 > I chose it because Python ships with some Unicode data installed by
 > default.

Here's another patch (against the current version). It shows both the 
codepoint and the name.

BTW, you don't have to use a regular expression to split a string into 
characters, simply iterating through it does the trick:

Index: Commands/Show Unicode Names.tmCommand
===================================================================
--- Commands/Show Unicode Names.tmCommand	(revision 9813)
+++ Commands/Show Unicode Names.tmCommand	(working copy)
@@ -8,11 +8,13 @@
  	<string>#!/usr/bin/python
  import unicodedata
  import sys
-import re

-for a in re.compile("(?um)(.)").split(unicode(sys.stdin.read(), "UTF-8")):
-     if (len(a)==1) and (a != '\n'):
-          res = a + " : " + unicodedata.name(a, "U+%04X" % ord(a))
+for a in unicode(sys.stdin.read(), "UTF-8"):
+     if a != '\n':
+          res = u"%s : U+%04X" % (a, ord(a))
+          name = unicodedata.name(a, None)
+          if name:
+              res += u" : %s" % name
            print res.encode("UTF-8")</string>
  	<key>fallbackInput</key>
  	<string>character</string>
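
As a rough sketch of the same idea outside the patch: the snippet below is
written for Python 3 (the command above is Python 2), and the function name
describe() is made up for illustration. It walks the string directly instead
of splitting it with a regular expression, and it passes None as the default
to unicodedata.name() so that unnamed characters don't raise ValueError.

```python
import unicodedata


def describe(text):
    """Return one 'char : U+XXXX[ : NAME]' line per character, skipping newlines.

    Hypothetical helper, not part of the bundle.
    """
    lines = []
    for ch in text:  # iterating over a str yields one code point at a time
        if ch == "\n":
            continue
        line = "%s : U+%04X" % (ch, ord(ch))
        # The second argument is returned instead of raising ValueError
        # when the character has no name (e.g. control characters).
        name = unicodedata.name(ch, None)
        if name:
            line += " : %s" % name
        lines.append(line)
    return lines
```

Calling describe() on the selection and printing each returned line should
give roughly the same output as the patched command.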


 >> Furthermore it would be great if this script could display all
 >> information there is in the Python Unicode database, i.e. stuff like
 >>
 >>    unicodedata.category()
 >>    unicodedata.bidirectional()
 >>    unicodedata.decimal()
 > Yes. I have such a script in Perl which also shows info about Unicode
 > code points etc.

OK, now I see that the script displays information about every character 
in the selection. Adding more info might be a space problem.
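
To give an idea of how much there would be to show per character, here is a
minimal sketch (Python 3 again; the function name properties() and the dict
layout are only assumptions for illustration, not something the bundle
contains). It collects the values mentioned above for a single character:

```python
import unicodedata


def properties(ch):
    """Collect a few Unicode database entries for one character.

    Hypothetical helper; the keys are chosen for illustration only.
    """
    return {
        "codepoint": "U+%04X" % ord(ch),
        # name() and decimal() accept a default that is returned
        # instead of raising ValueError when no value is defined.
        "name": unicodedata.name(ch, None),
        "category": unicodedata.category(ch),
        "bidirectional": unicodedata.bidirectional(ch),
        "decimal": unicodedata.decimal(ch, None),
    }
```

With five fields per character, a one-line-per-character display would get
crowded quickly, so something like this probably only makes sense for a
single character at a time.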

Another problem: Using Ctrl-Shift-U as the shortcut hides the "Convert 
To Lowercase" command.

Servus,
    Walter


