Hi - I was looking over the Python language definition, and I came upon a regex that I don't understand - or, it might be a bug. The regex is in the FoldingStartMarker:
<key>foldingStartMarker</key> <string>^\s*(def|class)\s+([.a-zA-Z0-9_ b]+)\s*(((.*)))?\s*:|{\s* $|(\s*$|[\s*$|^\s*"""(?=.)(?!.*""")</string>
The key mysterious part is right in the first character class, which I believe is designed to pick out the name of the class/def in question:
[.a-zA-Z0-9_ b]+
Okay, I get it up until the ' b' part. Essentially, the name of a class or def is a series of characters that are in a-z, A-Z, 0-9, or are '_' or '.' . What's the point of the ' b' part? Was this meant to be '\b', signifying a word boundary? Or is there some deeper meaning to space b that I don't understand? Please forgive me if this is not an appropriate place for this question; I am rather new (well, completely and totally new) to the use of mailing lists.
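To make the question concrete, here is a minimal sketch of what the character class matches as written. It uses Python's re module as a stand-in rather than the editor's own regex library, on the assumption that literal characters inside a character class behave the same way in both. The sample line and the variable names are made up for illustration.

```python
import re

# The name sub-pattern as it appears in the grammar. Inside a character
# class, " b" is simply a literal space and a literal "b".
name_as_written = re.compile(r"[.a-zA-Z0-9_ b]+")

# Hypothetical variant with the " b" removed, for comparison only.
name_without_space = re.compile(r"[.a-zA-Z0-9_]+")

# Made-up input: the text that would follow "def " or "class ".
line = "foo bar(baz):"

print(name_as_written.match(line).group())     # foo bar
print(name_without_space.match(line).group())  # foo
```

Since "b" is already covered by a-z, the only practical effect of the ' b' here is that a space is accepted as part of the name.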
Brilliant program, btw - it takes time to appreciate it.
Nick
On 18. Mar 2007, at 00:38, Fabry Nicholas F. wrote:
> [...] The key mysterious part is right in the first character class, which I believe is designed to pick out the name of the class/def in question:
> [.a-zA-Z0-9_ b]+
> Okay, I get it up until the ' b' part. Essentially, the name of a class or def is a series of characters that are in a-z, A-Z, 0-9, or are '_' or '.' . What's the point of the ' b' part? Was this meant to be '\b', signifying a word boundary? Or is there some deeper meaning to space b that I don't understand?
As curious as you, I decided to track down the erroneous regexp. The error was introduced in r976, with this change log entry:
- Converted all word boundaries in the Python Syntax (Allan probably missed these, because Tiger conveniently translated them all to &(l|g)t;)
Looking at the diff, the b was previously a <, so the intent was to convert < to \b (because of a change in the regexp library), but accidentally all <’s were converted to b’s.
I have now reverted this part of r976, better late than never :)