[TxMt] pbs with LaTeX labels
Hans-Joerg Bibiko
bibiko at eva.mpg.de
Thu May 31 15:24:49 UTC 2007
On 31 May 2007, at 17:07, Xavier Cambar wrote:
>
> Is there a way to substitute an accented character by its non-
> accented equivalent with a regular expression?
>
As far as I know, that would be very tricky with a regular expression alone.
I use Perl for this myself.
Create a command with these settings:
Input: Selected Text or Document
Output: Replace Selected Text
Command:
perl -e'
use utf8;
use Unicode::Normalize;
no warnings;
binmode(STDIN,  ":utf8");
binmode(STDOUT, ":utf8");
while (<>) {
    $_ = NFKD($_);                     # decompose into base character + combining marks
    s/[\x{0300}-\x{0362}]//g;          # combining diacritics
    s/[\x{3099}\x{309B}\x{FF9E}]//g;   # Japanese voiced mark
    s/[\x{309A}\x{309C}\x{FF9F}]//g;   # Japanese semi-voiced mark
    print;
}
'
You can delete the two lines for the Japanese marks if you do not need them.
The function NFKD decomposes every character that carries a diacritic
into its base character plus the diacritics in combining form, according
to the Unicode specification.
The next step simply deletes all combining diacritics.
Please note that this deletes ALL diacritics, i.e. cedilla, diaeresis,
acute, grave, macron, hook, ogonek, etc.!
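To see what the decomposition step does outside of TextMate, here is a minimal Python sketch of the same idea. Python instead of Perl is only a choice for illustration, and the name strip_diacritics is made up. It uses the standard unicodedata module and the same U+0300 to U+0362 range as the command above, without the Japanese marks.

```python
import re
import unicodedata

# Same range of combining diacritics as in the Perl command (assumption:
# the Japanese voiced/semi-voiced marks are not needed here).
COMBINING = re.compile("[\u0300-\u0362]")


def strip_diacritics(text):
    """Hypothetical helper: decompose with NFKD, then drop combining marks."""
    # NFKD turns e.g. U+00E9 into "e" followed by U+0301 (combining acute).
    decomposed = unicodedata.normalize("NFKD", text)
    # Removing the combining marks leaves only the base characters.
    return COMBINING.sub("", decomposed)


if __name__ == "__main__":
    print(strip_diacritics("café Ångström façade"))  # prints: cafe Angstrom facade
```

As with the Perl version, every diacritic in that range is removed, so the cedilla in the last word disappears too.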
I guess you have to install the Perl library Unicode::Normalize
beforehand via CPAN, but I am not sure about that.
I don't know how to apply this to the LaTeX sectioning snippets,
but maybe this hint helps.
Best,
Hans