Re: [TxMt] New "Unicode" bundle in the Review trunk

3 Jun 2008


Walter Dörwald wrote:
...
Hans-Jörg Bibiko wrote:
...
On 02.06.2008, at 00:04, Walter Dörwald wrote:
...
Here's another patch (against the current version). It shows both the 
codepoint and the name.
[...]
Here's another suggestion on the current Bundle version:
To get the UTF-8 bytes of a character, you're doing the following:
print "  UTF-8         : " + " ".join(repr(char.encode("UTF-8")).split('\\x')).lstrip("' ").rstrip("'").upper()
This only works for characters with a codepoint >= 128, because repr()
leaves printable ASCII characters unescaped, so there is no \x to split
on. The following code should work better:
print "  UTF-8         : %s" % " ".join(hex(ord(c))[2:].upper() for c in char.encode("UTF-8"))
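For reference, here is a minimal sketch of the per-byte approach, written so that it runs on Python 3 as well as Python 2 (the snippets in this thread are Python 2). The helper name utf8_hex is made up for illustration and is not part of the bundle. It encodes to UTF-8 explicitly and zero-pads each byte to two hex digits, which goes slightly beyond the one-liner above.

```python
# -*- coding: utf-8 -*-
from __future__ import print_function


def utf8_hex(char):
    """Return the UTF-8 bytes of a text string as space-separated hex pairs.

    Hypothetical helper for illustration; not taken from the bundle.
    """
    # bytearray yields integers on both Python 2 and Python 3,
    # so no ord() call is needed.
    return " ".join("%02X" % byte for byte in bytearray(char.encode("utf-8")))


print("  UTF-8         : %s" % utf8_hex(u"a"))       # ASCII: 61
print("  UTF-8         : %s" % utf8_hex(u"\u00e4"))  # a-umlaut: C3 A4
print("  UTF-8         : %s" % utf8_hex(u"\u20ac"))  # euro sign: E2 82 AC
```

Because every byte goes through the same formatting step, ASCII and non-ASCII characters are handled alike.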
Furthermore the code:
decomp = unicodedata.decomposition(char).lstrip(' ').rstrip(' ')
can be simplified to:
decomp = unicodedata.decomposition(char).strip()
(strip() strips from both ends and stripping all whitespace is the 
default when no argument is given.)
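A small sketch of the difference, in case it is useful. It assumes nothing beyond the standard unicodedata module; the wrapper name decomposition_of is hypothetical and only there to keep the example short.

```python
import unicodedata


def decomposition_of(char):
    """Return the decomposition mapping of a character, without padding.

    Hypothetical wrapper for illustration; not taken from the bundle.
    """
    return unicodedata.decomposition(char).strip()


# For space padding, both spellings give the same result.
padded = "  0061 0308 "
assert padded.lstrip(" ").rstrip(" ") == padded.strip() == "0061 0308"

# With no argument, strip() also removes tabs and newlines,
# which lstrip(' ').rstrip(' ') would leave in place.
assert "\t0061 0308\n".strip() == "0061 0308"
assert "\t0061 0308\n".lstrip(" ").rstrip(" ") == "\t0061 0308\n"

print(decomposition_of(u"\u00e4"))   # 0061 0308 (a + combining diaeresis)
print(repr(decomposition_of(u"a")))  # '' (no decomposition)
```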
Hope that helps.
Servus,
    Walter
