On Jan 20, 2005, at 19:32, Allan Odgaard wrote:
> Does anyone actually have a document with 'gremlins' to test this stuff? ;)
Problem fixed:
for (( i = 0; i < 256; i++ )); do printf \\x$(printf "obase=16\n$i\n"|bc); done|iconv -f iso-8859-1 -t utf-8|perl -pe 's/[^\t\n\x20-\xFF]|\xC2[\x80-\x9F]//g'|iconv -f utf-8 -t iso-8859-1|xxd
Outputs:
0000000: 090a 2021 2223 2425 2627 2829 2a2b 2c2d  .. !"#$%&'()*+,-
0000010: 2e2f 3031 3233 3435 3637 3839 3a3b 3c3d  ./0123456789:;<=
0000020: 3e3f 4041 4243 4445 4647 4849 4a4b 4c4d  >?@ABCDEFGHIJKLM
0000030: 4e4f 5051 5253 5455 5657 5859 5a5b 5c5d  NOPQRSTUVWXYZ[\]
0000040: 5e5f 6061 6263 6465 6667 6869 6a6b 6c6d  ^_`abcdefghijklm
0000050: 6e6f 7071 7273 7475 7677 7879 7a7b 7c7d  nopqrstuvwxyz{|}
0000060: 7e7f a0a1 a2a3 a4a5 a6a7 a8a9 aaab acad  ~...............
0000070: aeaf b0b1 b2b3 b4b5 b6b7 b8b9 babb bcbd  ................
0000080: bebf c0c1 c2c3 c4c5 c6c7 c8c9 cacb cccd  ................
0000090: cecf d0d1 d2d3 d4d5 d6d7 d8d9 dadb dcdd  ................
00000a0: dedf e0e1 e2e3 e4e5 e6e7 e8e9 eaeb eced  ................
00000b0: eeef f0f1 f2f3 f4f5 f6f7 f8f9 fafb fcfd  ................
00000c0: feff                                     ..
0x7F (DEL) should probably be stripped as well. I also haven't checked whether everything in 0xA0-0xFF should actually be preserved; I'll check the UCD later...