[TxMt] Re: how to match unicode?
Hans-Jörg Bibiko
bibiko at eva.mpg.de
Wed Sep 24 10:00:15 UTC 2008
On 24.09.2008, at 11:08, Piero D'Ancona wrote:
> Writing a ruby command for TextMate to reformat author names
> in a list of papers I run into the obvious but sad fact that
> /[A-z]/ =~ "ü"
> does not match anything. Is there a simple workaround?
> I mean, simpler than a very long and unelegant list of Unicode
> ranges such as the one here
> http://forums.mozillazine.org/viewtopic.php?f=25&t=834075
This is a tricky point.
If you are using Ruby 1.9, then Oniguruma's POSIX bracket class
/[[:alpha:]]/u should work.
As far as I remember, Oniguruma's regexp engine can also be installed
for Ruby 1.8.
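
To illustrate the [[:alpha:]] suggestion, here is a minimal sketch. It assumes Ruby 1.9 or later and UTF-8 strings; the helper name letter_runs and the sample string are made up for the example.

```ruby
# encoding: utf-8
# Sketch only: assumes Ruby 1.9 or later and UTF-8 strings.

# POSIX bracket class; with UTF-8 input it also covers non-ASCII letters.
ALPHA = /[[:alpha:]]/u

# Hypothetical helper: pull runs of letters out of a string.
def letter_runs(text)
  text.scan(/[[:alpha:]]+/u)
end

p("ü" =~ ALPHA)                   # index of the first match, or nil
p letter_runs("Müller, J. 2008")  # letter runs only, digits and punctuation dropped
```

Unlike /[A-z]/, the bracket class does not depend on the ASCII ordering of the letters, so accented characters need no special treatment.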
I myself tried rewriting my regexp in a negated form, i.e. instead
of looking for \w I wrote something like [^\d\s_-].
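
A rough sketch of the negated approach, assuming Ruby 1.9 or later and UTF-8 input. The constant and helper names are invented for the example. Note that a hyphen between two characters inside a class forms a range, so the sketch escapes it to keep it literal.

```ruby
# encoding: utf-8
# Sketch only: "anything that is not a digit, whitespace, underscore or hyphen".
# The hyphen is escaped so that it is not read as a range operator.
NAME_CHARS = /[^\d\s_\-]+/u

# Hypothetical helper: split a string into candidate name parts.
def name_parts(text)
  text.scan(NAME_CHARS)
end

p name_parts("Müller-Lüdenscheidt 2008")
```

The caveat is that a negated class accepts everything you did not list, so commas, full stops and other punctuation also count as name characters unless they are added to the exclusions.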
Another way would be to look for a significant string after the
author. In Ruby 1.8, /./u also matches the ü as a single character.
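
As a sketch of the delimiter idea: the entry format below ("author, title" with the first comma ending the author) is purely an assumption for illustration, as is the helper name author_of. Adapt the delimiter to whatever reliably follows the author in your list.

```ruby
# encoding: utf-8
# Sketch only: assumes each entry looks like "<author>, <title>".

# Hypothetical helper: take everything up to the first comma as the author.
def author_of(entry)
  m = /\A(.+?),/u.match(entry)
  m && m[1]
end

p author_of("Müller, On regular expressions")

# With /u the dot consumes a whole UTF-8 character, not a single byte.
p "ü".scan(/./u).length
```

Since the pattern never has to say what a letter is, it sidesteps the character-class problem entirely.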
If this doesn't work for you, then you will have to use the Unicode
ranges, but you can shorten the list if you are only dealing with names
written in Latin script.
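
For Latin script the range list can stay fairly short. The selection below is my own assumption of what is "enough" (Basic Latin, the letters of Latin-1 Supplement, Latin Extended-A and Extended-B) and is not exhaustive; it assumes a UTF-8 source file and Ruby 1.9 or later.

```ruby
# encoding: utf-8
# Sketch only: an assumed, non-exhaustive set of Latin-script letter ranges.
#   A-Z a-z   Basic Latin
#   À-Ö       U+00C0..U+00D6  (skips the multiplication sign U+00D7)
#   Ø-ö       U+00D8..U+00F6  (skips the division sign U+00F7)
#   ø-ÿ       U+00F8..U+00FF
#   Ā-ɏ       U+0100..U+024F  (Latin Extended-A and Extended-B)
LATIN_WORD = /[A-Za-zÀ-ÖØ-öø-ÿĀ-ɏ]+/u

# Hypothetical helper: extract runs of Latin-script letters.
def latin_words(text)
  text.scan(LATIN_WORD)
end

p latin_words("Łukasiewicz Erdős 1935")
```

Names with apostrophes or hyphens would still be split at those characters, so they may need to be added to the class by hand.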
--Hans