Just to give back a little: I made a change to the Tidy command in the HTML bundle (actually, I made a new bundle, copied the code over, and changed it there).
When you run the stock Tidy HTML command on an HTML document that was created by saving a Microsoft Word document as HTML, it completely deletes the contents of the document! In my duplicate of the Tidy command, I added an option that tells Tidy it's dealing with a Word document.
I added --word-2000 yes to the tidy command in the bundle. The command now begins:
"${TM_TIDY:-tidy}" -f /dev/null -iq -utf8 -asxhtml -wrap 0 --tab-size $TM_TAB_SIZE --word-2000 yes --indent-spaces $TM_TAB_SIZE ${TM_SELECTED_TEXT:+--show-body-only yes}|\
Now it cleans out all of the extra MS junk that Word adds to a document. There's still a little cleanup to do (it doesn't delete the useless <o:p></o:p> tags), but that's easy to do.
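For the leftover <o:p></o:p> tags, a small sed filter along these lines might be enough. This is only a sketch: strip_op_tags is a made-up name, not something in the HTML bundle, and it assumes each empty pair sits on a single line (which should usually hold with -wrap 0).

```shell
# Hypothetical helper, not part of the HTML bundle.
# Reads HTML on stdin and writes it to stdout with empty
# <o:p></o:p> pairs removed. Pairs containing only whitespace
# are removed too; pairs with real content are left alone.
strip_op_tags() {
    sed -e 's|<o:p>[[:space:]]*</o:p>||g'
}

# Example: filter a fragment by hand.
printf '<p>Hello<o:p></o:p></p>\n' | strip_op_tags
```

In the bundle command it could be added as one more stage in the existing pipeline, after tidy has run.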
Hope this helps someone!
jt