On Aug 14, 2007, at 08:00, textmate-request@lists.macromates.com wrote:
From: Steve Lianoglou lists@arachnedesign.net
Date: August 13, 2007 14:03:56 EDT
To: TextMate users textmate@lists.macromates.com
Subject: Re: [TxMt] Re: Python bundle syntax highlighting bug
Reply-To: TextMate users textmate@lists.macromates.com
I think you are probably right. Does anyone else have an opinion on this?
I agree this doesn't seem right, and can produce some confusing results. On the other hand, it's very nice to have regex highlighting.
I prefer to trade-off false-positives for the nicety of regex highlighting.
Wouldn't it be possible to have the first string located within a re.XXXXX('') pattern be highlighted as regex? Am I wrong, or are those the only places where regexes appear?
s = r"...."
pattern = re.compile(s)
is always possible, if contrived. I think re.XXXXX is an interesting suggestion though.
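A small sketch of the contrived case above, to show why keying on re.XXXXX call sites can miss a pattern: the raw string is defined on one line and compiled on another. The sample pattern and the searched text are illustrative assumptions, not taken from the thread.

```python
import re

# The raw string is assigned on its own line, so a grammar that only looks
# inside re.XXXXX(...) calls would see an ordinary raw string here.
s = r"\d{4}-\d{2}-\d{2}"

# The call site contains only a name, not a string literal.
pattern = re.compile(s)

match = pattern.search("Sent on 2007-08-13")
```

Nothing on the `re.compile(s)` line tells a line-oriented grammar that `s` holds a regex, which is the false negative being described.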
I'm in the same boat with Jay.
I think it's kind of nice to have special highlighting for regexes. I'd settle for it only being switched on when we have re.*(r"regex" ...) if the "always a regex w/ r" rule really isn't suitable for others.
-steve
From: "Alexander Ross" alex.j.ross@gmail.com
Date: August 13, 2007 14:35:16 EDT
To: "TextMate users" textmate@lists.macromates.com
Subject: Re: [TxMt] Re: Python bundle syntax highlighting bug
Reply-To: TextMate users textmate@lists.macromates.com
I agree that it's very nice to have highlighted re's if possible.
What if we did something like matching r"(?#) … " as a regular expression string? That would give us something explicit to match, but it would also mean you'd have to add it to any of your preexisting re's.
That sounds like a very good idea. It would be pretty easy to find regexes and sub in a '(?#)' comment marker (I had to look up what that one was!) so one would have regex syntax highlighting in existing code, while keeping the 'raw strings are just raw strings' folks happy. For those who complain that r'(?# might just be the beginning of some particular, non-regex raw string, well.... too bad! ;)
On a side note, it may be nice to keep a documented list of such 'extra' features (above and beyond the straight language definition) in the Python bundle somewhere; for instance, the appearance of folding markers when a triple quote is followed by text, followed by a <return>, followed by another triple quote. This regex highlighting feature would also qualify - otherwise, it would not be known to a new Python + TextMate user unless they stumbled across it by accident or investigated the grammar in detail...
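For readers who, like the poster, have to look up (?#): in Python's re module, (?#...) is a comment group whose contents are ignored, so an empty (?#) at the start of a pattern should not change what it matches. A minimal sketch, assuming a made-up pattern and sample strings:

```python
import re

# The same hypothetical pattern, without and with the empty comment marker.
plain = re.compile(r"\d+\s*px")
marked = re.compile(r"(?#)\d+\s*px")

samples = ["width: 12 px", "no match here", "3px 4px"]

# Compare the matches found by both versions on every sample string.
same = all(plain.findall(text) == marked.findall(text) for text in samples)
print(same)
```

The marker costs four characters per pattern and has no effect at run time; whether that is acceptable in shared code is a matter of taste.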
Hi,
To be honest, I don't really like this idea too much (in lieu of having to use re.*(r" ... ") in order to get the regex highlight, let's say).
While adding a (?#) is "harmless" in terms of the parsing of the actual code, changing your code to fit your text editor just doesn't "feel" right. More tangibly, I think it's probably "the wrong thing to do" if you're working on a project w/ other people (especially if they're not fellow TextMate users). The code won't break, but I could imagine that the extra line noise could tick people off and add some bit-rot to your version control ...
-steve
I agree that prefixing all re's is not ideal.
So, we have five options:
1. Match all raw strings unambiguously as regular expressions. We will sometimes have false-positives.
2. Match raw strings that are arguments to methods from the re module. We will sometimes not match raw strings that are regular expressions, but can be pretty well guaranteed to never have a false-positive.
3. Require some prefix to a raw string to "turn on" regular expression matching. This has an extremely high probability of removing false-positives and false-negatives, but at the cost of additional CRUFT.
4. A combination of 2. and 3. Match raw strings that are arguments to re.compile and raw strings prefixed with (?#) as regular expressions, but no others.
5. Don't match re's at all.
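To make the trade-offs between the options concrete, here is a rough sketch that approximates each rule with a line-based Python regex and reports whether a given source line would get regex highlighting. It is not the bundle's grammar: the pattern names, the `highlighted_as_regex` helper, and the sample lines are all assumptions, and option 4 is taken here as "option 2 or option 3".

```python
import re

# Line-based approximations only; a real TextMate grammar uses begin/end rules.
RAW_STRING = r"""\b[rR](?:"[^"]*"|'[^']*')"""
OPTION_1 = re.compile(RAW_STRING)                      # any raw string
OPTION_2 = re.compile(r"\bre\.\w+\(\s*" + RAW_STRING)  # first argument of re.xxx(
OPTION_3 = re.compile(r"""\b[rR]["']\(\?#\)""")        # raw string starting with (?#)


def highlighted_as_regex(line, option):
    """Return True if `line` holds a raw string that `option` would treat as a regex."""
    if option == 1:
        return bool(OPTION_1.search(line))
    if option == 2:
        return bool(OPTION_2.search(line))
    if option == 3:
        return bool(OPTION_3.search(line))
    if option == 4:
        return bool(OPTION_2.search(line) or OPTION_3.search(line))
    return False  # option 5: never highlight


samples = [
    r'path = r"C:\temp\new"',         # raw string that is not a regex
    r'm = re.search(r"\d+", text)',   # regex passed directly to re
    r's = r"(?#)\d+"',                # regex stored first, marked with (?#)
]
for line in samples:
    print(line, [highlighted_as_regex(line, n) for n in (1, 2, 3, 4, 5)])
```

Under these approximations, option 1 flags the Windows path (a false positive), option 2 misses the stored pattern (a false negative), and option 4 catches both real regexes while leaving the path alone.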
It would seem there is no perfect option. I propose that we put it to a vote, and perhaps appeal to our BDFL Allan.
–Alex
On Aug 14, 2007, at 9:09 PM, Alex Ross wrote:
It would seem there is no perfect option. I propose that we put it to a vote, and perhaps appeal to our BDFL Allan.
I think option number 4 comes pretty darn close to perfect.
Then again, I don't write Python, so my vote probably doesn't count much ;). But option 4 guarantees that, for most uses, things will be matched as expected, and for the other uses, there is a way to document what needs to be done. It also avoids to a good extent the bad case of having non-regexp raw strings colored as regexps.
–Alex
Haris Skiadas Department of Mathematics and Computer Science Hanover College
I think option number 4 comes pretty darn close to perfect.
I'm voting for number 4, as well. Just out of curiosity, who's maintaining the Python bundle these days?
E
On Aug 14, 2007, at 10:11 PM, Eric Abrahamsen wrote:
Just out of curiosity, who's maintaining the Python bundle these days?
All bundles now carry a contactName entry in their info.plist, for the main maintainer of the bundle. In Python's case, that would be Alex Ross.
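As a sketch of how one might look up that entry, the snippet below parses a trimmed-down, hypothetical info.plist with Python's standard plistlib module and reads its contactName key. The sample plist is an assumption made for illustration; a real bundle's info.plist carries more keys and lives inside the bundle directory.

```python
import plistlib

# Hypothetical, trimmed-down info.plist; only contactName comes from the thread.
SAMPLE_INFO_PLIST = b"""<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
    <key>name</key>
    <string>Python</string>
    <key>contactName</key>
    <string>Alex Ross</string>
</dict>
</plist>
"""

# Parse the XML property list into a plain dict.
info = plistlib.loads(SAMPLE_INFO_PLIST)

# Fall back to a placeholder when a bundle has no maintainer entry.
maintainer = info.get("contactName", "(no contactName entry)")
print(maintainer)
```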
E
Haris Skiadas Department of Mathematics and Computer Science Hanover College
I think option number 4 comes pretty darn close to perfect.
Then again, I don't write Python, so my vote probably doesn't count much ;). But option 4 guarantees that, for most uses, things will be matched as expected, and for the other uses, there is a way to document what needs to be done. It also avoids to a good extent the bad case of having non-regexp raw strings colored as regexps.
Haris, your vote counts for two of anyone else's.
On Aug 14, 2007, at 9:18 PM, Alex Ross wrote:
So, we have five options:
1. Match all raw strings unambiguously as regular expressions. We will sometimes have false-positives.
2. Match raw strings that are arguments to methods from the re module. We will sometimes not match raw strings that are regular expressions, but can be pretty well guaranteed to never have a false-positive.
3. Require some prefix to a raw string to "turn on" regular expression matching. This has an extremely high probability of removing false-positives and false-negatives, but at the cost of additional CRUFT.
4. A combination of 2. and 3. Match raw strings that are arguments to re.compile and raw strings prefixed with (?#) as regular expressions, but no others.
5. Don't match re's at all.
My vote would be for 4, but I'll add two more options:
6. Parse r' and r''' but not r" and r""" (or vice-versa) as regexes.
7. Parse the "r" prefix, but not the "R" prefix, as regexes.
The last option is probably the simplest. I don't think I've ever seen the "R" prefix in use and didn't even know it was an option until I just read the spec moments ago.
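For context on options 6 and 7: Python itself treats the r and R prefixes, and the different quote styles, as equivalent ways of writing the same raw string, so either option would be purely an editor-side convention. A short check, using a made-up pattern:

```python
# Four spellings of one hypothetical raw string.
lower = r"\bfoo\d+"
upper = R"\bfoo\d+"
single = r'\bfoo\d+'
triple = r"""\bfoo\d+"""

# All four literals produce the same str value at run time.
all_equal = lower == upper == single == triple
print(all_equal)
```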
j.
I've never seen 'R' in use either, but I'm sure somebody, somewhere is doing it. I think 6 and 7 are going to be too confusing.
On Aug 15, 2007, at 12:25 AM, Jay Soffian wrote:
My vote would be for 4, but I'll add two more options:
I'll go for 4 as well ...
Thanks, -steve
I'm voting for option 2. I've never seen a syntax highlighter for a rich language that works perfectly--except maybe Lisp highlighting in Emacs--and when the highlighting fails I'd rather it fail to the plainest highlighting possible.
And adding a no-op "signal" to raw strings that will later be used as regexes just to turn on some coloring seems very unPythonic in that:
- It is ugly.
- It is implicit.
- It adds complexity.
- It detracts from readability.
- It is not the obvious way to do it.
- Match raw strings that are arguments to methods from the re module. We will sometimes not match raw strings that are regular expressions, but can be pretty well guaranteed to never have a false-positive.