[TxMt] Word Count ?

Charilaos Skiadas skiadas at hanover.edu
Thu Nov 30 06:38:59 UTC 2006


On Nov 30, 2006, at 1:23 AM, Paul McCann wrote:

> Indeed: but all that work has already been done, as the command is  
> operating on the pdf file. So the real question becomes: why is  
> ps2ascii (aka ghostscript) so slow? (Just checked on my work  
> machine, a 2GHz intel iMac with 2G of memory, and it's still about  
> 30 seconds on a 250 page, 1.2MB pdf file.)
>
I guess the question is how you are going to get the words out of the  
pdf or ps file. If you look at the pdf/ps source file, it is filled  
with special commands and things. I suppose if you could export the  
pdf file to a txt file, then you could count the words there with  
ease. Otherwise, we are talking about parsing what seems to me to be  
code even more complicated than LaTeX. You are better off with the  
small error from counting words in the LaTeX source instead.

Unless I am much mistaken.
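
To make the "count words in the LaTeX source" idea concrete, here is a
minimal Python sketch. The function name count_words_latex and the
regular expressions are illustrative assumptions, not anything that
TextMate or ps2ascii provides. It is a heuristic rather than a parser:
it skips inline $...$ math only, and it still counts words inside
arguments such as \label{...}, which is where the small error comes
from.

```python
import re

def count_words_latex(source: str) -> int:
    """Rough word count for LaTeX source; a heuristic, not a parser."""
    # Drop comments: an unescaped % through the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", source)
    # Drop inline math $...$ (display math environments are not handled).
    text = re.sub(r"\$[^$]*\$", " ", text)
    # Drop \begin{...} and \end{...} together with the environment name.
    text = re.sub(r"\\(begin|end)\{[^}]*\}", " ", text)
    # Drop command names such as \section or \emph, keeping their arguments.
    text = re.sub(r"\\[A-Za-z]+\*?", " ", text)
    # Drop escapes like \% or \& and any leftover braces and brackets.
    text = re.sub(r"\\.|[{}\[\]]", " ", text)
    # Count runs of letters/digits, allowing inner apostrophes and hyphens.
    return len(re.findall(r"[A-Za-z0-9]+(?:['-][A-Za-z0-9]+)*", text))

if __name__ == "__main__":
    sample = r"\section{Intro} Hello \emph{world}, this is $x^2$ a test. % note"
    print(count_words_latex(sample))
```

For a file that is already plain text (say, the output of a pdf-to-txt
export), len(text.split()) would do the same job with no stripping.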

> I guess it's just an irreducibly difficult procedure... Moral of  
> this story? Don't count words very often!

Or ever, I would say. Why is it important how many words you have? I  
guess some things have word limits, but surely this is not a check  
you would have to do too often.

> Cheers,
> Paul

Haris




