[TxMt] converting and merging files through TM

Jamal Shahin jshahin at gmail.com
Wed Oct 3 16:51:17 UTC 2007


Hello,

I'm wondering if anyone here can help with two problems I've
encountered that are only loosely related to TextMate. I've been given
a couple of hundred documents in different formats (.doc, .pdf, .rtf)
that need to be put into one big document (about 200 pages long).

I used 'textutil' to convert most of the documents to txt (ps2ascii
for the pdfs), and then gave them all a filename with a number. Using
TM, I created a macro that would add the filename at the top of each
individual file, and then converted this into "##number ##". I then
used 'textutil -cat' to merge them into one file. So now each
'section' has a filename, title, and block of text. I will then turn
this into a file format which will allow formatting (LaTeX or maybe
RTF, all I need is one big PDF at the end).

I have come across two issues:

1. There are some strange characters (most likely from different
character sets?) that have appeared in the text. I am assuming that
these are accented characters, smart quotes, and so forth, and was
wondering: *How can I automatically convert these characters?* I
started doing this using find and replace for the ones I know, but I
was wondering if there is an easier way to do it.

Some examples follow:
actor –meaning
Prüm
Integration :
« new regionalism » i
EU’s energy
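
The samples above look like UTF-8 text that was decoded as
Windows-1252 somewhere along the way (for instance "ü" where "ü"
was probably intended). That is only a guess from the samples, not
something confirmed for every file. Under that assumption, a minimal
Python sketch of an automatic repair could look like this; the function
name fix_mojibake is made up for illustration:

```python
def fix_mojibake(text: str) -> str:
    """Try to undo UTF-8 text that was wrongly decoded as Windows-1252.

    ASSUMPTION: the damage is exactly "UTF-8 bytes read as cp1252".
    Lines that do not fit that assumption are returned unchanged.
    """
    fixed_lines = []
    for line in text.split("\n"):
        try:
            # Re-create the original bytes, then decode them as UTF-8.
            fixed_lines.append(line.encode("cp1252").decode("utf-8"))
        except (UnicodeEncodeError, UnicodeDecodeError):
            # Not valid under the assumption (e.g. already correct text).
            fixed_lines.append(line)
    return "\n".join(fixed_lines)
```

Working line by line means that a single line which does not match the
assumption (one that is already correct, say) is left alone rather than
making the whole file fail. Files damaged through a different pair of
encodings would need a different codec name in place of "cp1252".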


2. The numbering system I used for the files was based on a primary
key from our database, but I've been asked to renumber all the files
starting from 1 (for a silly reason, the database started with a
primary key of 67, and manually entered records start at 400). *Is
there any way that I can use TM to convert "##number ##" into "##
incremental number ##"?* For example, in find and replace, use:

find: ##(\d*)\w? ##
replace: ##(x+1) ##

(The \w? is there because a couple of documents are labelled like
137a.)

It's the (x+1) that's bothering me.
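
A plain regex replacement string has no running counter, so one option
is a short script run over the merged file. The sketch below is only an
illustration under some assumptions: the sections are already in the
desired order, every marker looks like "##67 ##" or "##137a ##", and
the names MARKER and renumber are invented here. It replaces each marker
with the next number in sequence, starting from 1:

```python
import re
from itertools import count

# ASSUMPTION: a marker is "##", digits, an optional lowercase letter
# (as in 137a), one space, then "##".
MARKER = re.compile(r"##\d+[a-z]? ##")


def renumber(text: str, start: int = 1) -> str:
    """Replace each marker with a sequential number, in order of appearance."""
    counter = count(start)
    # The replacement function is called once per match, left to right.
    return MARKER.sub(lambda match: f"##{next(counter)} ##", text)
```

For example, renumber("##67 ##\nfirst\n##137a ##\nsecond") would give
"##1 ##\nfirst\n##2 ##\nsecond". Note that the old numbers are discarded,
so it may be worth keeping a copy of the original file.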

Sorry if this is outside the scope of this list (any recommendations
for a better place to ask?), but any pointers would be really
appreciated!
Many thanks,

Jamal
-- 
Vrije Universiteit Brussel

