[TxMt] Updated Tidy Command for MS Word documents

John Tsombakos tsom467 at gmail.com
Fri Apr 28 20:45:05 UTC 2006


Just to give back a little, I made a change to the Tidy command in the
HTML bundle (actually, I made a new bundle, copied the code into it,
and changed it there).

When you run the stock Tidy HTML command on an HTML document that was
created by saving a Microsoft Word document as HTML, it completely
deletes the contents of the document! In my duplicate of the Tidy
command, I added an option that tells Tidy it's dealing with a Word
document.

I added:  --word-2000 yes  to the tidy command in the bundle. The
first few lines now are:

"${TM_TIDY:-tidy}" -f /dev/null -iq -utf8 -asxhtml -wrap 0 \
    --tab-size $TM_TAB_SIZE --word-2000 yes --indent-spaces $TM_TAB_SIZE \
    ${TM_SELECTED_TEXT:+--show-body-only yes}|\

Now it cleans out all of the extra MS junk that Word adds to a document.
There's still a little cleanup left to do (it doesn't delete the
useless <o:p></o:p> tags, but that's easy to do).
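
A minimal sketch of that last cleanup step, assuming the leftover tags
are plain <o:p> and </o:p> with no attributes. The function name
strip_op_tags is made up for illustration and is not part of the bundle:

```shell
# Hypothetical helper (not in the HTML bundle): reads HTML on stdin and
# writes it to stdout with every <o:p> and </o:p> tag removed.
# Assumes the tags carry no attributes; text between them is kept.
strip_op_tags() {
    sed -e 's#</\{0,1\}o:p>##g'
}
```

Since it is a plain stdin-to-stdout filter, it could probably be added
as one more stage in the command's pipeline, or run by hand, e.g.
strip_op_tags < word.html > clean.html (file names are placeholders).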

Hope this helps someone!

jt


