[TxMt] RegExp n00b

Gavin Kistner gavin at refinery.com
Tue Sep 13 13:00:09 UTC 2005


On Sep 13, 2005, at 6:27 AM, Andreas Wahlin wrote:
> foldingStartMarker = "^\\s*([A-Za-z0-9.]+s*=\\s*)?(function)\\b";

Changing the \\ to \ (for standard regexp clarity):
     ^\s*([A-Za-z0-9.]+s*=\s*)?(function)\b

The above regular expression says:
     Starting at the start of the line (^)
     0) find zero or more whitespace characters  (\s*)
     1) followed by one or more alphanumeric-or-period characters
        (the [...]+)
     2) followed by zero or more 's' characters  (s*)
     3) followed by an equals sign (=)
     4) followed by zero or more whitespace characters  (\s*)
     5) except you may skip all of 1-4 if you want,  (the ?)
          (but save 'em if you find 'em)
     6) but absolutely find the word 'function' (and save it as well)
     7) followed by a word boundary (the \b)

So, that would match:
     foo.bar= function
     1111111=function
     ......sssss=         function
     function

but would not match:
     foo.bar = function
     functional
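
As a quick sanity check of the two lists above, here is a minimal sketch
in Python. It assumes Python's re module treats \s and \b the same way
TextMate's regexp engine does for this pattern; the names FOLDING_START
and starts_fold are made up for illustration.

```python
import re

# The pattern from the message, with the doubled backslashes reduced to single ones.
FOLDING_START = re.compile(r"^\s*([A-Za-z0-9.]+s*=\s*)?(function)\b")

should_match = [
    "foo.bar= function",
    "1111111=function",
    "......sssss=         function",
    "function",
]

should_not_match = [
    "foo.bar = function",  # the space before '=' is not allowed by the literal s*
    "functional",          # no word boundary right after 'function'
]


def starts_fold(line):
    """Return True if the line matches the folding start pattern (hypothetical helper)."""
    return FOLDING_START.match(line) is not None


for line in should_match + should_not_match:
    print(repr(line), starts_fold(line))
```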

The lack of escaping of the 's' certainly seems like a mistake. It was  
intended, I suspect, to be \s*, allowing whitespace on both sides of  
the equals sign (so that 'foo.bar = function' would match as well).
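
If the suspicion above is right, the fix is to escape the 's'. The
sketch below is only a guess at the intended pattern, not something
confirmed by the bundle's author; FOLDING_START_FIXED is a made-up name,
and Python's re module stands in for TextMate's regexp engine.

```python
import re

# Assumed intent: \s* (whitespace) instead of s* (literal 's' characters).
FOLDING_START_FIXED = re.compile(r"^\s*([A-Za-z0-9.]+\s*=\s*)?(function)\b")

# Lines with whitespace on either side of '=' are now accepted.
now_matching = [
    "foo.bar = function",
    "foo.bar= function",
    "foo.bar =function",
    "function",
]

# 'functional' is still rejected because of the trailing \b.
still_rejected = ["functional"]

for line in now_matching + still_rejected:
    print(repr(line), FOLDING_START_FIXED.match(line) is not None)
```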



> match = "^\\s*(function)\\s*([a-zA-Z_]\\w*)\\s*\\(([^)]*)\\)";

Changing the \\ to \ (again, for standard regexp clarity):
     ^\s*(function)\s*([a-zA-Z_]\w*)\s*\(([^)]*)\)

The above regular expression says:
     Starting at the start of the line (^)
     1) find zero or more whitespace characters  (\s*)
     2) followed by the word 'function' (and save it)
     3) followed by zero or more whitespace characters  (\s*)
     4) followed by a single identifier, saved ([a-zA-Z_]\w*)
     5) followed by zero or more whitespace characters  (\s*)
     6) followed by a literal left parenthesis   \(
     7) save the characters up until the next right parenthesis  [^)]*
     8) followed by a literal right parenthesis

So, that would match:
     function foo11111 ( @#$%Q!#$%@T@$%!@#$ )
     functionz()

but would not match:
     function ()
     function()
     foo = function()
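
The same kind of check for this pattern, again as a rough sketch in
Python (assuming its re module behaves like TextMate's engine here;
FUNCTION_MATCH is a made-up name). It also prints the three saved
sub-expressions for each line that matches.

```python
import re

# The pattern from the message, with the doubled backslashes reduced to single ones.
FUNCTION_MATCH = re.compile(r"^\s*(function)\s*([a-zA-Z_]\w*)\s*\(([^)]*)\)")

matching = [
    "function foo11111 ( @#$%Q!#$%@T@$%!@#$ )",
    "functionz()",
]

not_matching = [
    "function ()",       # no identifier before the parenthesis
    "function()",        # same, without the space
    "foo = function()",  # 'function' is not at the start of the line
]

for line in matching + not_matching:
    m = FUNCTION_MATCH.match(line)
    # groups() holds the keyword, the identifier, and the text between the parentheses
    print(repr(line), m.groups() if m else None)
```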

Because JavaScript can have anonymous functions, depending on the  
purpose of that match, you may want to make the identifier between  
'function' and the opening parenthesis optional.
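
One way to do that, sketched in Python, is to put a ? after the
identifier group. This is an illustration rather than a tested change
to the bundle; FUNCTION_MATCH_OPTIONAL is a made-up name, and the
unmatched group shows up as None in Python.

```python
import re

# Same pattern as before, with the identifier group made optional via '?'.
FUNCTION_MATCH_OPTIONAL = re.compile(
    r"^\s*(function)\s*([a-zA-Z_]\w*)?\s*\(([^)]*)\)"
)

samples = [
    "function foo(a, b)",  # named function
    "function ()",         # anonymous, with a space
    "function()",          # anonymous, without a space
    "foo = function()",    # not matched: the pattern is anchored with ^
]

for line in samples:
    m = FUNCTION_MATCH_OPTIONAL.match(line)
    print(repr(line), m.groups() if m else None)
```

Note that the assignment form (foo = function()) is still rejected,
since the ^ anchor requires 'function' to be the first non-whitespace
text on the line.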


My own question - what is the importance of saving sub-expressions in  
both of the above cases?



