The topic of unit testing grammars has come up often, and I have finally written a CLI tool that runs a file through the TM parser.
The tool is here: http://updates.textmate.org/gtm.bz2
It reads the text to be parsed from stdin and takes as argument paths to tmGrammar files which should be loaded. If the -g/--grammar option is not given, the first grammar specified will be used to parse the input.
Output is the parsed document in the pseudo-XML format that TextMate commands can receive as input.
Presently the -t and -d options are not implemented.
Examples:
gtm < test.c C.tmbundle/Syntaxes/C.plist
gtm < test.cc -g source.c++ C.tmbundle/Syntaxes/{C,C++}.plist
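For anyone who wants to drive the tool from a test script, here is a rough Python sketch of the same two invocations. The function names `gtm_command` and `run_gtm` are made up for illustration, and the sketch assumes `gtm` is on your PATH. It also assumes that `-g` can be placed before the grammar paths, as in the second example above.

```python
import subprocess


def gtm_command(grammar_paths, grammar=None, executable="gtm"):
    """Build the argument list for a gtm run (hypothetical helper).

    grammar_paths: grammar files to load, in order.
    grammar: optional scope name passed via -g; when omitted, gtm
             uses the first grammar given.
    """
    command = [executable]
    if grammar is not None:
        command += ["-g", grammar]
    command += list(grammar_paths)
    return command


def run_gtm(text, grammar_paths, grammar=None, executable="gtm"):
    """Feed text to gtm on stdin and return whatever it prints to stdout."""
    result = subprocess.run(
        gtm_command(grammar_paths, grammar, executable),
        input=text,
        capture_output=True,
        text=True,
        check=True,  # raise if gtm exits with a non-zero status
    )
    return result.stdout
```

Something like `run_gtm(open("test.c").read(), ["C.tmbundle/Syntaxes/C.plist"])` would then correspond to the first example. Note that the shell expands `{C,C++}.plist` in the second example, so from Python both paths have to be listed explicitly.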
I plan to also make this a profiling tool so that it can list how much time is spent in each rule, but this is secondary to the current agenda of providing the basis for grammar unit tests.
I am announcing this to get some input on how we can build a good unit testing system. My concern is that we’ll either write really simple tests that never break (it’s generally complex interplays of rules that cause problems), or we’ll have fully parsed complex documents as the “expected output” and won’t be able to change the grammars without pretty much rewriting all the tests.
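To make the trade-off concrete, here is a small Python sketch of one possible in-between: a test lists only a few fragments it cares about and checks that each one occurs somewhere in the parser output, instead of comparing the whole document. This is only a starting point for discussion, not a proposal for the final design. The helper name `missing_fragments` is hypothetical, and the sample markup is invented for illustration, not captured from a real gtm run.

```python
def missing_fragments(parsed_output, expected_fragments):
    """Return the expected fragments that do not occur in the parser output."""
    return [
        fragment
        for fragment in expected_fragments
        if fragment not in parsed_output
    ]


# Invented sample standing in for real parser output (assumption:
# scope names appear as tags around the text they cover).
sample_output = "<source.c><storage.type.c>int</storage.type.c> x;</source.c>"

# The test pins down only the scope it cares about; the rest of the
# document can change without breaking it.
expected = ["<storage.type.c>int</storage.type.c>"]

failures = missing_fragments(sample_output, expected)
if failures:
    raise SystemExit("missing fragments: " + ", ".join(failures))
```

A test like this survives unrelated grammar changes, but it would also miss a rule that starts matching too much elsewhere in the file, which is exactly the kind of interplay I am worried about.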