Writing a Ruby command for TextMate to reformat author names in a list of papers, I run into the obvious but sad fact that /[A-z]/ =~ "ü" does not match anything. Is there a simple workaround? I mean, simpler than a very long and inelegant list of Unicode ranges such as the one here http://forums.mozillazine.org/viewtopic.php?f=25&t=834075
Thanks, Piero
On 24.09.2008, at 11:08, Piero D'Ancona wrote:
Writing a Ruby command for TextMate to reformat author names in a list of papers, I run into the obvious but sad fact that /[A-z]/ =~ "ü" does not match anything. Is there a simple workaround? I mean, simpler than a very long and inelegant list of Unicode ranges such as the one here http://forums.mozillazine.org/viewtopic.php?f=25&t=834075
This is a tricky point.
If you are using Ruby 1.9, then Oniguruma's class /[[:alpha:]]/u should work. As far as I remember, one could also install Oniguruma's regexp engine for Ruby 1.8.
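A minimal sketch of that suggestion, assuming Ruby 1.9 or later and UTF-8 strings. The constant names are made up for illustration, and \p{L} is an alternative spelling that was not mentioned above. The ü is written as an escape so the result does not depend on how the file is saved.

```ruby
# Hypothetical names, for illustration only.
U_UMLAUT       = "\u00fc"        # "ü"

ASCII_RANGE    = /[A-z]/         # the pattern from the question
POSIX_ALPHA    = /[[:alpha:]]/   # POSIX bracket, Unicode-aware on UTF-8 strings
UNICODE_LETTER = /\p{L}/         # Unicode general category "Letter"

p ASCII_RANGE    =~ U_UMLAUT     # nil: ü lies outside the ASCII range
p POSIX_ALPHA    =~ U_UMLAUT     # 0: matched at index 0
p UNICODE_LETTER =~ U_UMLAUT     # 0: matched at index 0

# Side note: A-z spans ASCII 0x41..0x7A, so it also accepts the
# punctuation between "Z" and "a", for example the underscore.
p ASCII_RANGE    =~ "_"          # 0
```

If only ASCII letters are wanted, [A-Za-z] is the safer spelling of the original class.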
For my part, I tried rewriting my regexp in a negated form, i.e. instead of looking for \w I wrote e.g. [^\d\s -_]. Another way would be to look for a significant string after the author. With /./u, Ruby 1.8 also matches the ü as a single character.
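A rough sketch of the negated-class idea, assuming Ruby 1.9 or later and UTF-8 strings. One caveat: inside a character class, " -_" reads as a range from space to underscore, which also covers the uppercase ASCII letters, so the hyphen is escaped here on the assumption that a literal hyphen was meant. The constant names and the sample string are invented.

```ruby
# Hypothetical names and sample data, for illustration only.
NAME_CHARS = /[^\d\s\-_]+/       # anything except digits, whitespace, "-" and "_"
SAMPLE     = "J\u00f6rg 2008"    # "Jörg 2008"

p SAMPLE[NAME_CHARS]             # "Jörg": the ö is kept, the match stops at the space

# The unescaped hyphen forms a range (0x20..0x5F) that swallows "A".."Z":
SPACE_TO_UNDERSCORE = /[ -_]/
p SPACE_TO_UNDERSCORE =~ "A"     # 0
p SPACE_TO_UNDERSCORE =~ "a"     # nil

# /./u treats ü as one character rather than two bytes.
p "\u00fc".scan(/./u).size       # 1
```

The negated class keeps everything it does not explicitly exclude, so apostrophes and commas also count as name characters here.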
If this doesn't work for you, then you will have to use the Unicode ranges, but you can shorten the list if you are only dealing with names written in Latin script.
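As an illustration of how short such a list can get, here is a sketch that assumes Ruby 1.9 or later (for \u escapes in regexps) and that the names only use ASCII letters, the Latin-1 letters, and Latin Extended-A. Whether that covers a given bibliography is an assumption to check; the constant name and sample strings are invented.

```ruby
# Hypothetical name, for illustration only.
# Ranges: ASCII letters, U+00C0..U+00D6, U+00D8..U+00F6 and U+00F8..U+017F.
# U+00D7 (multiplication sign) and U+00F7 (division sign) are left out
# on purpose because they are not letters.
LATIN_LETTERS = /[A-Za-z\u00C0-\u00D6\u00D8-\u00F6\u00F8-\u017F]+/

p "M\u00fcller, H."[LATIN_LETTERS]    # "Müller"
p "Erd\u0151s, P."[LATIN_LETTERS]     # "Erdős" (ő is in Latin Extended-A)
p "2\u00d73"[LATIN_LETTERS]           # nil: digits and the times sign are excluded
```

Names with letters beyond U+017F would need further ranges, at which point [[:alpha:]] is probably the simpler choice.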
--Hans
On Sep 24, 2008, at 4:08 AM, Piero D'Ancona wrote:
Writing a ruby command for TextMate to reformat author names in a list of papers I run into the obvious but sad fact that /[A-z]/ =~ "ü" does not match anything. Is there a simple workaround?
If you set $KCODE = "U" in Ruby 1.8 (or pass the equivalent -KU switch) character classes like \w change to the Unicode definition of "word" characters:
$ ruby -KU -ve 'p "Résumé"[/\w+/]'
ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-darwin9.4.0]
"Résumé"
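For readers on Ruby 1.9 or later: as far as I know, $KCODE no longer has this effect there, and \w is ASCII-only by default. A rough equivalent, assuming UTF-8 strings, is to use the Unicode-aware classes instead. The constant name is made up, and the é is written as an escape so the result does not depend on the file's encoding.

```ruby
# Hypothetical name, for illustration only.
RESUME = "R\u00e9sum\u00e9"      # "Résumé"

p RESUME[/\w+/]                  # "R": plain \w stops at the first é
p RESUME[/[[:word:]]+/]          # "Résumé": POSIX bracket, Unicode-aware
p RESUME[/\p{Word}+/]            # "Résumé": Unicode property form
```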
Hope that helps.
James Edward Gray II
On 24.09.2008, at 15:01, James Gray wrote:
On Sep 24, 2008, at 4:08 AM, Piero D'Ancona wrote:
Writing a ruby command for TextMate to reformat author names in a list of papers I run into the obvious but sad fact that /[A-z]/ =~ "ü" does not match anything. Is there a simple workaround?
If you set $KCODE = "U" in Ruby 1.8 (or pass the equivalent -KU switch) character classes like \w change to the Unicode definition of "word" characters:
$ ruby -KU -ve 'p "Résumé"[/\w+/]'
ruby 1.8.6 (2008-08-11 patchlevel 287) [i686-darwin9.4.0]
"Résumé"
Hope that helps
Haha. Please believe me, I did exactly the same. And it didn't work. But after your mail it works. And to be honest, I do not know why ;)
Thanks,
--Hans
On 24.09.2008, at 15:01, James Gray wrote:
"Résumé" Hope that helps
It does. Thanks!
Hans-Jörg Bibiko <bibiko@...> writes:
Haha. Please believe me, I did exactly the same. And it didn't work. But after your mail it works. And to be honest, I do not know why ;)
Hans, after all these years you still believe computers are deterministic? I have strong evidence against this.
Thank you, Piero