[SVN] RegExp List Compression

Allan Odgaard throw-away-2 at macromates.com
Fri May 16 19:32:16 UTC 2008


On 16 May 2008, at 21:22, Simon Gregory wrote:

> In a few of the language definitions there are lists of matches  
> which look like this:
>
> (G(erd Knops|a(vin Kistner|rrett J. Woodworth)|ra(nt Hollingworth| 
> eme Rocher))|R(yan McCuaig|ich Barton|o(ss Harmes|ger Braunstein| 
> b(ert Rainthorpe| (Rix|Bevan))))
>
> I'm imagining that it's painful to achieve these by hand, so can  
> anyone point me to a script that does it? If they're not machine  
> generated, how big a benefit does the conciser match bring?

This is the script used: http://macromates.com/svn/Bundles/trunk/Bundles/Objective-C.tmbundle/Support/list_to_regexp.rb
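
The linked script is not reproduced here. As a rough illustration of how this kind of compaction can work, here is a minimal sketch that puts the names in a prefix trie and emits one alternation per branching point. The names build_trie, trie_to_regexp and compress are hypothetical and are not taken from list_to_regexp.rb. The sketch also emits non-capturing groups, whereas the grammars quoted above use plain parentheses.

```ruby
# Minimal sketch only; assumes a prefix-trie approach, which may differ
# from what list_to_regexp.rb actually does. All names are hypothetical.

# Build a character trie: each node is a Hash from character to child node.
# The key :end marks that one of the input words terminates at that node.
def build_trie(words)
  root = {}
  words.each do |word|
    node = root
    word.each_char { |ch| node = (node[ch] ||= {}) }
    node[:end] = true
  end
  root
end

# Turn a trie node into a regexp fragment covering every suffix below it.
def trie_to_regexp(node)
  branches = node.reject { |key, _| key == :end }.map do |ch, child|
    Regexp.escape(ch) + trie_to_regexp(child)
  end
  return "" if branches.empty?

  if node[:end]
    # A word ends here but longer words continue, so the tail is optional.
    "(?:#{branches.join('|')})?"
  elsif branches.size == 1
    # No branching: keep the shared prefix as a plain literal run.
    branches.first
  else
    "(?:#{branches.join('|')})"
  end
end

def compress(words)
  trie_to_regexp(build_trie(words))
end
```

With this sketch, compress(["NSArray", "NSString"]) returns "NS(?:Array|String)". Note that Regexp.escape also escapes spaces and dots, so names such as "Garrett J. Woodworth" come out with backslashes in them.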

We presently (still) have no script to expand them again. I think  
Michael has been wanting that on more than one occasion.
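
For the reverse direction, a small recursive-descent expander is one possible approach. The sketch below is hypothetical (expand_regexp is not an existing TextMate script) and only handles the restricted shape seen in these lists: literal characters, backslash escapes, groups, "|" and an optional "?" directly after a group. Every other character, including an unescaped ".", is treated as a literal, which is an assumption that suits name lists but not general regexps.

```ruby
# Hypothetical expander for compacted word lists; a sketch, not a general
# regexp parser. Supports literals, "\x" escapes, "(...)", "(?:...)",
# "|" and a "?" that immediately follows a group.
def expand_regexp(pattern)
  chars = pattern.chars
  pos = 0
  parse_alternation = nil

  # sequence := (group | literal)*  -- returns every string it can produce
  parse_sequence = lambda do
    results = [""]
    while pos < chars.size && chars[pos] != "|" && chars[pos] != ")"
      if chars[pos] == "("
        pos += 1
        pos += 2 if chars[pos] == "?" && chars[pos + 1] == ":" # skip "?:"
        options = parse_alternation.call
        pos += 1 # consume ")"
        if chars[pos] == "?"
          options += [""] # optional group: the empty string is also valid
          pos += 1
        end
      else
        pos += 1 if chars[pos] == "\\" # escaped char: take the next one as-is
        options = [chars[pos]]
        pos += 1
      end
      # Append each option to each prefix collected so far.
      results = results.product(options).map(&:join)
    end
    results
  end

  # alternation := sequence ("|" sequence)*
  parse_alternation = lambda do
    results = parse_sequence.call
    while chars[pos] == "|"
      pos += 1
      results += parse_sequence.call
    end
    results
  end

  parse_alternation.call
end
```

For example, expand_regexp("G(erd Knops|a(vin Kistner|rrett J. Woodworth))") returns ["Gerd Knops", "Gavin Kistner", "Garrett J. Woodworth"].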

As for the speed-up, for all practical purposes, I doubt you can  
subjectively tell the difference between a compacted and non-compacted  
regexp when it comes to responsiveness in TextMate ;)

I believe I wrote this in combination with the Objective-C symbol  
scraper -- here we had hundreds if not thousands of symbols all  
starting with NS, so a) it did actually give an (at least) measurable  
speed-up, and b) it keeps the size of the grammar down (which is  
anyway auto-generated, so no real need for it to be readable).

So I’d say, when you have more than 20 symbols and/or the symbols are  
extracted automatically, it would be good to use the compaction tool.  
But otherwise don’t.



