On 28/7/2006, at 22:46, Charilaos Skiadas wrote:
> [...] since Ruby's unicode support is not sterling by default. I'll see what I can do in this particular case.
What does the code do? Normally no special unicode support should be necessary to work with strings.
Though if the strings contain non-ASCII code points, these will be stored as two or more bytes, so never do str[0] and expect to extract one code point unless you already know that it’s an ASCII character.
Set $KCODE = 'U' or give -KU as argument (e.g. in shebang) to make regular expressions multi-byte aware. This means having ‘.’ match multi-byte characters (as one code point) etc.
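A minimal sketch of what "multi-byte aware" means for ‘.’. Note the assumptions: the sample string is made up, and instead of relying on $KCODE the pattern carries the /u flag, which marks a single regular expression as UTF-8 and so behaves the same whether or not $KCODE is set.

```ruby
# -*- coding: utf-8 -*-
# Hypothetical sample string: 5 code points, 6 bytes ("ï" takes two bytes in UTF-8).
str = "naïve"

# With the /u flag, '.' consumes a whole UTF-8 sequence per match,
# so we get one match per code point rather than one per byte.
matches = str.scan(/./u)

puts matches.length
```

Without UTF-8 awareness the same pattern would match byte by byte, splitting the ‘ï’ in two.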
Should you need to explode a string into code points, either do str.unpack('U*') or, if you have set $KCODE, str.split(//). The former splits into Fixnums, the latter into Strings.
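Roughly, the two approaches look like this. It is only a sketch with a made-up sample string; the version guard around $KCODE is my own addition, there because later Ruby versions (1.9 and up) track string encodings themselves and ignore that variable.

```ruby
# -*- coding: utf-8 -*-
# Only set $KCODE where it has an effect (Ruby 1.8); the guard is an
# assumption added so the same script also runs on newer interpreters.
$KCODE = 'U' if RUBY_VERSION < '1.9'

str = "naïve"  # hypothetical sample: 5 code points stored as 6 bytes

# Number of bytes in the string, counted in a version-independent way.
byte_count = str.unpack('C*').length

# Explode into integer code points.
codepoints = str.unpack('U*')   # => [110, 97, 239, 118, 101]

# Explode into one-character strings (needs $KCODE = 'U' on Ruby 1.8).
chars = str.split(//)

puts byte_count
puts codepoints.inspect
puts chars.length
```

unpack('U*') is the safer of the two if you cannot be sure how the script is invoked, since it does not depend on $KCODE.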