[SVN] RegExp List Compression
throw-away-2 at macromates.com
Fri May 16 19:32:16 UTC 2008
On 16 May 2008, at 21:22, Simon Gregory wrote:
> In a few of the language definitions there are lists of matches
> which look like this:
> (G(erd Knops|a(vin Kistner|rrett J. Woodworth)|ra(nt Hollingworth|
> eme Rocher))|R(yan McCuaig|ich Barton|o(ss Harmes|ger Braunstein|
> b(ert Rainthorpe| (Rix|Bevan))))
> I'm imagining that it's painful to achieve these by hand, so can
> anyone point me to a script that does it? If they're not machine
> generated, how big a benefit does the conciser match bring?
This is the script used: http://macromates.com/svn/Bundles/trunk/Bundles/Objective-C.tmbundle/Support/list_to_regexp.rb
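For anyone who wants the gist without opening the script, here is a minimal sketch of the idea in Python. It is not the linked Ruby script and makes no claim to match its output: the names build_trie, trie_to_regexp and compact are made up for illustration, and it emits non-capturing (?:...) groups where the grammars use plain (...) groups. The approach assumed here is to put all symbols into a prefix tree and then write each node out as an alternation of its children.

```python
import re

def build_trie(words):
    """Build a nested-dict trie; the key "" marks the end of a word."""
    trie = {}
    for word in words:
        node = trie
        for ch in word:
            node = node.setdefault(ch, {})
        node[""] = {}
    return trie

def trie_to_regexp(node):
    """Turn a trie node into a regexp fragment matching all of its suffixes."""
    optional = "" in node  # a word ends here, so anything that follows is optional
    branches = [
        re.escape(ch) + trie_to_regexp(node[ch])
        for ch in sorted(k for k in node if k != "")
    ]
    if not branches:
        return ""
    if len(branches) == 1 and not optional:
        return branches[0]  # a single continuation needs no group
    body = "(?:" + "|".join(branches) + ")"
    return body + "?" if optional else body

def compact(words):
    """Hypothetical entry point: list of literal strings in, one regexp out."""
    return trie_to_regexp(build_trie(words))

print(compact(["NSArray", "NSString", "NSSet"]))  # NS(?:Array|S(?:et|tring))
```

Shared prefixes such as NS are written only once, which is where the size reduction comes from.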
We presently (still) have no script to expand them again. I think
Michael has been wanting that on more than one occasion.
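Going the other way is not hard for the restricted patterns the compaction produces. The following is only a sketch under stated assumptions, not an existing tool: the function name expand is hypothetical, and it handles nothing beyond literal characters, backslash escapes, (...) or (?:...) groups, | and a ? directly after a group. Any other metacharacter (such as the unescaped . in the quoted example) is treated as a literal character.

```python
def expand(pattern):
    """Expand a compacted literal-only regexp back into its list of strings."""
    pos = 0

    def parse_alternation():
        nonlocal pos
        results = parse_sequence()
        while pos < len(pattern) and pattern[pos] == "|":
            pos += 1
            results += parse_sequence()
        return results

    def parse_sequence():
        nonlocal pos
        prefixes = [""]
        while pos < len(pattern) and pattern[pos] not in "|)":
            if pattern[pos] == "(":
                pos += 1
                if pattern.startswith("?:", pos):  # skip the non-capturing marker
                    pos += 2
                options = parse_alternation()
                if pos >= len(pattern) or pattern[pos] != ")":
                    raise ValueError("unbalanced parenthesis")
                pos += 1
                if pos < len(pattern) and pattern[pos] == "?":
                    pos += 1
                    options = [""] + options  # the whole group may be absent
            else:
                if pattern[pos] == "\\":  # escaped character: take the next one literally
                    pos += 1
                    if pos >= len(pattern):
                        raise ValueError("dangling backslash")
                options = [pattern[pos]]
                pos += 1
            # every string so far can continue with every option
            prefixes = [p + o for p in prefixes for o in options]
        return prefixes

    words = parse_alternation()
    if pos != len(pattern):
        raise ValueError("unexpected ')'")
    return words

print(expand("Rob(ert Rainthorpe| (Rix|Bevan))"))
# ['Robert Rainthorpe', 'Rob Rix', 'Rob Bevan']
```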
As for the speed-up, for all practical purposes, I doubt you can
subjectively tell the difference between a compacted and non-compacted
regexp when it comes to responsiveness in TextMate ;)
I believe I wrote this in combination with the Objective-C symbol
scraper -- here we had hundreds if not thousands of symbols all
starting with NS, so a) it did actually give an (at least) measurable
speed-up, and b) it keeps the size of the grammar down (which is
anyway auto-generated, so no real need for it to be readable).
So I’d say, when you have more than 20 symbols and/or the symbols are
extracted automatically, it would be good to use the compaction tool.
But otherwise don’t.