[TxMt] Python unicode error (was: r8839 (Python)) [reposting]

Hans-Joerg Bibiko bibiko at eva.mpg.de
Mon Jun 2 14:53:41 UTC 2008


>
> On 2 Jun 2008, at 15:40, Allan Odgaard wrote:
> To work with UTF-8 strings written to stdout in Python you need to:
>
> 1. Declare the source code to be UTF-8 (done with the encoding comment).
> 2. Declare the string itself to be a unicode string (done with the u-prefix).
> 3. Set the output stream to be UTF-8 (done by wrapping stdout in a
>    codec-aware writer).
>
> If step 3 is omitted, the encoding of stdout is taken from the
> environment, so it will often still work.
>
> The final script ends up being:
>
>    #!/usr/bin/env python
>    # -*- coding: utf-8 -*-
>
>    import sys
>    import codecs
>
>    a = u"æble"
>    sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
>    print a
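
To illustrate why step 3 matters, here is a minimal sketch of what happens when the output encoding cannot represent the string. It is not the quoted script: the in-memory buffers are hypothetical stand-ins for stdout, `io.BytesIO` assumes Python 2.6 or later, and the ASCII writer simulates an environment whose encoding resolved to ASCII.

```python
# -*- coding: utf-8 -*-
import codecs
import io

text = u"æble"

# Hypothetical stand-in for a stdout whose encoding resolved to ASCII.
ascii_out = codecs.getwriter('ascii')(io.BytesIO())
try:
    ascii_out.write(text)
    ascii_failed = False
except UnicodeEncodeError:
    # 'æ' has no ASCII representation, so the strict encoder refuses it.
    ascii_failed = True

# The same write through a UTF-8 writer succeeds and yields UTF-8 bytes.
utf8_buffer = io.BytesIO()
utf8_out = codecs.getwriter('utf-8')(utf8_buffer)
utf8_out.write(text)
utf8_bytes = utf8_buffer.getvalue()
```

Wrapping stdout explicitly removes the dependence on the environment, which is presumably why the quoted script does it even though it often works without.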

Only for clarification:
If I write a new Python script, should its header look like this:

#!/usr/bin/env python
# -*- coding: utf-8 -*-

import sys
import codecs

sys.stdout = codecs.getwriter('utf-8')(sys.stdout)
sys.stdin  = codecs.getreader('utf-8')(sys.stdin)
....

and then I no longer need unicode(foo, 'UTF-8') and foo.encode('UTF-8')?
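
A minimal sketch of what the two wrappers do, to make the question concrete. The names `fake_stdin` and `fake_stdout` are hypothetical in-memory stand-ins for the real streams, chosen so the sketch runs without a terminal, and `io.BytesIO` assumes Python 2.6 or later.

```python
# -*- coding: utf-8 -*-
import codecs
import io

# In-memory byte buffers stand in for the real stdin and stdout
# (an assumption made only for this sketch).
fake_stdin = io.BytesIO(u"æble\n".encode('utf-8'))
fake_stdout = io.BytesIO()

reader = codecs.getreader('utf-8')(fake_stdin)
writer = codecs.getwriter('utf-8')(fake_stdout)

# The reader decodes on the way in: 'line' is a unicode string, not bytes.
line = reader.readline()

# The writer encodes on the way out, so no explicit .encode() call is needed.
writer.write(line.upper())

written = fake_stdout.getvalue()
```

In this sketch, text passing through the wrapped streams needs no explicit unicode() or encode() call. Byte strings that arrive by other routes, for example sys.argv or files opened without a codec under Python 2, would presumably still have to be decoded by hand.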

Thanks,
--Hans




